Letting Data Speak, AI Act!

Case Study

AI-Powered Utility Bill Processing

About the Client

A utility bill management organization that processes high volumes of bills across diverse providers, relying on accurate bill data to drive downstream financial workflows for its users.


Challenge

Utility bill PDFs arrived from dozens of providers, each with different layouts, fonts, and field structures. The organization had no automated system to extract, validate, or reconcile bill data, leaving staff to process every document by hand. Three core problems demanded a solution:

Manual Data Extraction
No Data Reconciliation
No Error Handling or Escalation Path

Manual Data Extraction

No automated system existed to extract key bill fields — provider name, account number, due date, and amount due — from uploaded PDFs. Staff processed each document by hand, creating backlogs and data quality issues as volumes grew.

No Data Reconciliation

For accounts where bill data already existed from API integrations, there was no mechanism to compare newly extracted PDF data against existing records. Conflicting values went undetected and missing fields were never auto-filled.

No Error Handling or Escalation Path

When extraction quality was uncertain — due to corrupted files, low-resolution scans, or ambiguous content — there was no structured escalation path. Failed documents were simply lost with no archival, logging, or retry logic in place.


Key Results

Extraction Accuracy & Speed

Achieved 95%+ data extraction accuracy through a tiered confidence scoring system (1.0 for critical fields, 0.8 for secondary fields).
Reduced bill document processing time to under 60 seconds per PDF, replacing the fully manual data entry workflow and eliminating processing backlogs.
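
One way to read the tiered scoring above is as a per-tier minimum confidence: a critical field passes only at 1.0, a secondary field at 0.8 or higher. The sketch below assumes that reading. The field-to-tier mapping and the function name are hypothetical, since the case study does not say which fields were treated as critical.

```python
# Assumed tiering: which fields count as "critical" is not stated in the case study.
FIELD_TIERS = {
    "account_number": "critical",
    "amount_due": "critical",
    "provider_name": "secondary",
    "due_date": "secondary",
}

# Minimum confidence per tier, taken from the figures quoted above.
TIER_THRESHOLDS = {"critical": 1.0, "secondary": 0.8}


def fields_needing_review(confidences: dict[str, float]) -> list[str]:
    """Return the fields whose confidence falls below their tier's threshold.

    A field missing from the model output is treated as confidence 0.0.
    """
    flagged = []
    for field, tier in FIELD_TIERS.items():
        score = confidences.get(field, 0.0)
        if score < TIER_THRESHOLDS[tier]:
            flagged.append(field)
    return flagged
```

Under these assumptions, a bill with an empty result goes straight to the database, and any non-empty result would open a review ticket listing the flagged fields.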

Reliability & Cost

Eliminated data loss through six categorised error types with automatic S3 archival, error metadata logging, and exponential-backoff retry logic.
Deployed a cost-effective serverless architecture processing documents at approximately $0.01 per PDF, with auto-scaling that absorbed volume spikes without infrastructure changes.
Delivered a fully integrated human-in-the-loop HubSpot review workflow — corrections written back to the database automatically upon ticket closure via webhook, requiring no additional manual steps.
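
The exponential-backoff retry mentioned above can be sketched as follows. This is a minimal illustration, not the deployed code: the two exception classes stand in for the six error categories, which the case study does not name, and the attempt count and delays are assumed values.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical category: worth retrying (e.g. throttling, timeouts)."""


class PermanentError(Exception):
    """Hypothetical category: retrying will not help (e.g. corrupted PDF)."""


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, doubling the wait after each transient failure.

    PermanentError (and any other exception) propagates immediately so the
    caller can archive the file with its error metadata instead of retrying.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the function testable without real waiting. Production code would often add random jitter to the delay so that many failing invocations do not retry in lockstep.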

Solution

The JashDS team designed and deployed a fully serverless, event-driven utility bill processing pipeline on AWS. The system automated the complete lifecycle — from PDF ingestion through AI extraction, intelligent data reconciliation, and human-in-the-loop review — with no manual intervention required for standard-quality documents.

Key Components:

Ingestion Layer: Dual-path S3 event-driven ingestion — manual uploads write directly to the database; batch uploads route through reconciliation before any writes
AI Extraction Layer: AWS Bedrock with Claude Sonnet 4 returning per-field confidence scores (0.0–1.0); Amazon Nova Pro as automatic fallback model
Reconciliation Layer: AWS Lambda with type-aware field comparison — decimal precision for monetary amounts, date normalisation, case-insensitive string matching
Review Layer: HubSpot-integrated ticketing with webhook handler that writes reviewer corrections back to Aurora PostgreSQL on ticket closure
Observability Layer: Amazon QuickSight dashboard connected via private VPC to Aurora PostgreSQL, surfacing active bills, overdue amounts, biller performance, and reviewer workload in real time
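
To make the Reconciliation Layer's type-aware comparison concrete, here is a minimal sketch under stated assumptions. The field names, the accepted date formats, and the `reconcile` function are illustrative and not taken from the production Lambda. It compares amounts as two-decimal `Decimal` values, normalises dates before comparing, matches strings case-insensitively, and auto-fills fields that the existing record lacks.

```python
from datetime import date, datetime
from decimal import Decimal

# Assumed input date formats; the real pipeline may accept others.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def normalize(field: str, value):
    """Convert a raw value to a comparable form based on the field's type."""
    if value is None or (isinstance(value, str) and not value.strip()):
        return None
    if field == "amount_due":
        cleaned = str(value).replace("$", "").replace(",", "").strip()
        return Decimal(cleaned).quantize(Decimal("0.01"))
    if field == "due_date":
        if isinstance(value, datetime):
            return value.date()
        if isinstance(value, date):
            return value
        for fmt in DATE_FORMATS:
            try:
                return datetime.strptime(value.strip(), fmt).date()
            except ValueError:
                continue
        raise ValueError(f"unrecognised date: {value!r}")
    return str(value).strip().casefold()


def reconcile(existing: dict, extracted: dict) -> tuple[dict, list[str]]:
    """Return (merged record, conflicting field names)."""
    merged = dict(existing)
    conflicts = []
    for field, new_raw in extracted.items():
        old = normalize(field, existing.get(field))
        new = normalize(field, new_raw)
        if new is None:
            continue  # nothing extracted for this field
        if old is None:
            merged[field] = new_raw  # auto-fill a missing field
        elif old != new:
            conflicts.append(field)  # keep the existing value, flag for review
    return merged, conflicts
```

In this sketch a conflict never overwrites the stored value; the conflicting field names are returned so a review ticket can be raised instead.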

Technologies Used

AWS Bedrock (Claude Sonnet 4) — Primary AI model for PDF field extraction with per-field confidence scoring
Amazon Nova Pro — Automatic fallback model ensuring continuous availability
AWS Lambda (Python 3.12) — Serverless compute for extraction, reconciliation, review, and webhook handling
Amazon S3 & Amazon API Gateway — Event-driven document ingestion and archival of failed files with error metadata
Aurora PostgreSQL — Production database for extracted and reconciled bill records
HubSpot CRM — Human-in-the-loop review ticketing with webhook-driven database write-back on closure
Amazon QuickSight & Amazon CloudWatch — Real-time analytics dashboard and Lambda execution monitoring
Terraform & Python — Infrastructure as code and Lambda runtime
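
The primary and fallback model pairing listed above follows a common pattern: try each model in order and return the first successful result. The sketch below shows only that control flow. The extractor callables are hypothetical stand-ins for the actual Bedrock invocations, which are deliberately not reproduced here.

```python
from typing import Callable, Sequence

# An extractor takes raw PDF bytes and returns a dict of extracted fields.
Extractor = Callable[[bytes], dict]


def extract_with_fallback(
    pdf_bytes: bytes,
    extractors: Sequence[tuple[str, Extractor]],
) -> tuple[str, dict]:
    """Try each (name, extractor) pair in order; return the first success.

    Returns the name of the model that succeeded alongside its output, so
    the caller can log which model produced the record.
    """
    errors = []
    for name, extractor in extractors:
        try:
            return name, extractor(pdf_bytes)
        except Exception as exc:  # any failure moves on to the next model
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))
```

Keeping the model calls behind plain callables means the fallback order is configuration rather than code, and the function can be exercised in tests without AWS credentials.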

Other Case Study Items

Revolutionizing Personal Loans with AI-Driven Underwriting

A leading Indian personal loan provider revolutionized its underwriting process by using AI and machine learning to automate 80% of loan decisions. By integrating social and financial data into a sophisticated predictive algorithm, the company reduced decision times to seconds, expanded access to underserved segments, and achieved lower default rates than human underwriters.

Artificial Intelligence-Powered Tyre Dimension Extraction System

JashDS developed an AI-powered computer vision system for a leading automotive e-commerce platform, enabling accurate extraction of tire dimensions from images. The solution increased conversion rates by 25% and reduced customer support inquiries by 80%. It used YOLOv8 for instance segmentation, together with custom-designed augmentation techniques, to simplify the online tire purchasing process.

Enhanced Jira Data Analysis for Strategic Insights

JashDS developed a flexible framework for analyzing Jira project data that is capable of handling varying export structures and custom fields. The solution leveraged GenAI and LLM technologies to provide actionable insights, identify productivity trends, and uncover potential risks across diverse software projects, resulting in a ___% improvement in team efficiency and a ___% increase in successful project outcomes.
