AWS Data Lake to Power InsightsCI

Executive Summary

Caja, a leading UK business transformation consultancy, has modernised its flagship analytics product, InsightsCI, delivering faster, more reliable insights to clients across Healthcare, Higher Education, Local Government, and the Private Sector.
Previously, InsightsCI relied on manual data preparation within Tableau Cloud, creating bottlenecks for both internal reporting and client delivery. By migrating to a centralised, event-driven Data Lake on Amazon Web Services (AWS), Caja automated its ETL pipelines, reduced manual effort, and established a scalable foundation for its analytics offering. This transformation enables Caja to deliver data-driven insights at scale, enhancing decision-making for clients and creating new commercial opportunities.

The Challenge: Scaling Beyond Manual ETL

InsightsCI was designed to provide dashboards for internal strategic planning and client-facing insights. However, as demand grew, the platform’s manual Tableau-based workflows became a bottleneck:
  • Manual Data Preparation: ETL was performed within Tableau, requiring repeated human effort for every client onboarding and data refresh.
  • Data Silos: Without a centralised repository, data was scattered across individual dashboards, limiting cross-analysis.
  • Scalability Limits: The manual workflow constrained Caja’s ability to expand InsightsCI as a commercial analytics product.
Caja needed a modern, centralised cloud platform to automate data workflows, reduce bottlenecks, and transform InsightsCI into a scalable Analytics-as-a-Service offering.

Why AWS?

AWS was chosen to professionalise Caja’s data operations due to its:
  • Serverless Architecture: Automates ETL and data processing workflows without infrastructure overhead.
  • Centralisation: Provides a single, secure home in Amazon S3 for all data assets, moving the “source of truth” out of individual files.
  • Security & Compliance: AWS’s ISO27001-aligned infrastructure supports Caja’s handling of sensitive client data, and the migration contributed directly to Caja achieving Cyber Essentials Plus certification, strengthening protection across its technology stack.
  • Future-Proofing: Access to services like Amazon Bedrock enables Generative AI capabilities for enhanced analytics.

The Solution: A Centralised, Event-Driven Data Platform

Caja built a hub-and-spoke AWS platform for InsightsCI, replacing manual workflows with a streamlined, automated architecture:
  • Centralised Data Lake & Automated Ingestion: All raw data is uploaded securely to the “Raw Zone” in Amazon S3, which serves as an immutable system of record. Each file upload triggers an AWS Lambda function, which starts a fully automated ETL workflow.
  • Intelligent Processing with AWS Glue: AWS Glue jobs clean, standardise, and transform raw data into the columnar Parquet format. Glue Crawlers then update the AWS Glue Data Catalog, making datasets immediately discoverable without human intervention.
  • Powering InsightsCI with Athena & Bedrock: Amazon Athena provides serverless SQL queries for fast, lightweight dashboards. Tableau now connects directly to Athena, allowing consultants to focus on insight generation instead of data wrangling. Amazon Bedrock adds a Generative AI layer that enables natural language interrogation of data, making analytics accessible even to non-technical users.
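
The ingestion step described above (an S3 upload triggering Lambda, which starts the ETL workflow) could look roughly like the following Python sketch. It is illustrative only and not Caja's actual implementation: the Glue job name, the curated bucket prefix, and the job argument names are hypothetical, and the optional glue_client parameter is an assumption added here so the handler can be exercised without AWS credentials. The only AWS call used is the Glue client's start_job_run.

```python
import urllib.parse

# Hypothetical names, used only for illustration; not taken from Caja's setup.
GLUE_JOB_NAME = "insightsci-raw-to-parquet"
CURATED_PREFIX = "s3://example-insightsci-curated/"


def extract_s3_objects(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    objects = []
    for record in event.get("Records", []):
        s3_info = record.get("s3", {})
        bucket = s3_info.get("bucket", {}).get("name")
        key = s3_info.get("object", {}).get("key")
        if bucket and key:
            # Object keys arrive URL-encoded in S3 notifications (spaces become "+").
            objects.append((bucket, urllib.parse.unquote_plus(key)))
    return objects


def handler(event, context, glue_client=None):
    """Lambda entry point: start one Glue job run per uploaded raw file."""
    if glue_client is None:
        import boto3  # provided by the AWS Lambda Python runtime

        glue_client = boto3.client("glue")

    run_ids = []
    for bucket, key in extract_s3_objects(event):
        response = glue_client.start_job_run(
            JobName=GLUE_JOB_NAME,
            Arguments={
                # Hypothetical job parameters read by the Glue script.
                "--source_path": f"s3://{bucket}/{key}",
                "--target_prefix": CURATED_PREFIX,
            },
        )
        run_ids.append(response["JobRunId"])
    return {"started_job_runs": run_ids}
```

In this sketch, each uploaded file starts its own Glue job run, which keeps the handler stateless. A production setup would more likely add error handling and batching, or hand orchestration to a dedicated workflow service.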

Results and Benefits

The migration to AWS has transformed InsightsCI into a fully scalable analytics product:
  • Eliminated Manual ETL: Automated workflows save hours of manual effort, removing operational bottlenecks.
  • Scalable Commercial Product: InsightsCI can now be sold and deployed to multiple clients simultaneously.
  • Unified Data Governance: A centralised data asset provides a single source of truth, enabling cross-sector insights.
  • Enhanced Client Value: Clients receive faster, more reliable insights to support evidence-based decision-making.
  • Security & Compliance Confidence: ISO27001-aligned AWS infrastructure, combined with Caja’s Cyber Essentials Plus certification, ensures sensitive client data is protected and regulatory requirements are met.

Testimonial

With AWS, we have liberated InsightsCI from manual processes. We now have a secure, centralised, automated engine that allows us to scale our analytics offering and deliver true value to our clients across the public and private sectors.

Caroline Brown

Managing Director, CAJA

