Letting Data Speak, AI Act!

Case Study

AI Driven Course Curriculum Map Generation and Searching System

About the Client

The client is a fully accredited, nonprofit medical school that trains physicians for practice in the United States and Canada. The institution positions itself as an exceptional alternative for qualified students who face limited medical school seats in North America. The client set out to use generative AI technologies on Amazon Web Services (AWS) to build an AI-driven system that ingests IMSCC (IMS Common Cartridge) files and automatically generates a course curriculum map aligned to the AAMC framework.

Challenge

The client faced a time-intensive and inconsistent manual process for mapping course content to the AAMC (Association of American Medical Colleges) competency framework. With 22 courses and IMSCC files containing heterogeneous content — HTML pages, PDFs, PowerPoint slides, QTI assessments, and images — manual mapping was both error-prone and unscalable. Some IMSCC files exceeded 1.3 GB in size with 200+ resources, requiring processing times far beyond what Lambda's 15-minute execution limit could accommodate. The platform also lacked a searchable, structured store for generated curriculum maps and had no mechanism for aligning content across courses to surface prerequisite dependencies.

Key Results

  • Delivered a fully automated, end-to-end curriculum mapping pipeline processing 22+ courses from raw IMSCC files to structured, AAMC-aligned curriculum maps

  • Achieved intelligent AAMC competency alignment across all four domains: Interpersonal, Intrapersonal, Thinking and Reasoning, and Science Competencies

  • Solved the large-file processing challenge: files up to 1.3 GB are split into dynamically sized batches so that each batch completes within Lambda's 15-minute timeout, with Step Functions orchestrating the batches

  • Implemented prerequisite-aware module sequencing via two-pass dependency analysis, ensuring correct course ordering in final curriculum maps

  • Deployed hybrid semantic and keyword search API over all processed curriculum content, accessible via API Gateway

  • Built automatic model fallback from Claude Sonnet 4.5 to Claude Sonnet 4 when daily Bedrock token quotas are reached, ensuring pipeline continuity

  • Delivered full infrastructure-as-code using Terraform across 50+ AWS resources for repeatable, multi-environment deployment
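
The model fallback listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the deployed code: `QuotaExceeded`, `invoke_with_fallback`, the `invoke` callable, and the placeholder model names are all hypothetical. In a real Lambda the quota signal would come from the Bedrock client's error response, whose exact error code the case study does not state.

```python
class QuotaExceeded(Exception):
    """Hypothetical error raised when a model's daily token quota is spent."""


def invoke_with_fallback(prompt, invoke, models=("primary-model", "fallback-model")):
    """Try each model in order, moving on only when its quota is exhausted.

    `invoke(model_id, prompt)` is an injected callable (an assumption) that
    wraps the actual model call and raises QuotaExceeded on a quota error.
    """
    last_error = None
    for model_id in models:
        try:
            return model_id, invoke(model_id, prompt)
        except QuotaExceeded as exc:
            last_error = exc  # quota reached: try the next model in the list
    raise RuntimeError("all configured models are over quota") from last_error
```

Returning the model identifier alongside the response lets downstream steps record which model produced each output.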

Solution

Event-Driven Batch Processing Pipeline

  • Built a fully serverless, event-driven pipeline on AWS that processes an entire cohort of courses in a single batch run to generate structured curriculum maps; it is not a chatbot or RAG system

  • IMSCC files uploaded to S3 automatically trigger the pipeline after a 10-minute debounce window, ensuring all cohort files are present before processing begins
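
The debounce check can be sketched as below. The case study does not say how the window is implemented, so this assumes that each upload records a timestamp and that the window restarts whenever a new file arrives; `ready_to_process` and `DEBOUNCE` are hypothetical names.

```python
from datetime import datetime, timedelta

DEBOUNCE = timedelta(minutes=10)  # window length taken from the case study


def ready_to_process(last_upload_at: datetime, now: datetime,
                     window: timedelta = DEBOUNCE) -> bool:
    """Return True once no new file has arrived for a full debounce window."""
    return now - last_upload_at >= window
```

A scheduled check would call this with the most recent upload time for the cohort and start the pipeline only when it returns True.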

Content Extraction and Hierarchical Chunking

  • Unpacked and parsed IMSCC files across all content types: HTML, PDF, PPTX, DOCX, QTI/XML, and images (via Claude vision OCR)

  • Applied hierarchical chunking at 800 tokens with 100-token overlap across three levels: Course → Module/Unit → Resource/Item

  • Large PDFs split into 25-page parts; images compressed to reduce Bedrock token usage by 60–70%

  • Generated Titan Embed Text v2 embeddings (1536 dimensions) per chunk, stored in Aurora PostgreSQL with pgvector
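
The 800-token window with a 100-token overlap can be sketched as follows. This is an illustration rather than the production code: whitespace-separated words stand in for model tokens, and the `Chunk` type and `chunk_tokens` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    path: tuple   # (course, module, resource) hierarchy labels
    index: int    # position of the chunk within its resource
    text: str


def chunk_tokens(text, path, size=800, overlap=100):
    """Split text into windows of `size` tokens that share `overlap` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    tokens = text.split()  # assumption: whitespace words approximate tokens
    step = size - overlap
    chunks = []
    for i, start in enumerate(range(0, len(tokens), step)):
        window = tokens[start:start + size]
        chunks.append(Chunk(path, i, " ".join(window)))
        if start + size >= len(tokens):
            break  # this window already reached the end of the text
    return chunks
```

Carrying the hierarchy path on every chunk keeps the Course → Module/Unit → Resource/Item context available when a chunk is embedded or retrieved on its own.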


AAMC Competency Mapping and Course Summarization

  • Claude extracted learning objectives per resource and mapped them to the four AAMC competency domains during chunking

  • Generated structured JSON course summaries capturing key topics, prerequisite topics, AAMC domain coverage percentages, and module breakdowns
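
One way the AAMC domain coverage percentages in a course summary could be derived is sketched below. It assumes coverage is the share of extracted learning objectives mapped to each domain, which the case study does not confirm; `domain_coverage` is a hypothetical helper.

```python
from collections import Counter

AAMC_DOMAINS = (
    "Interpersonal",
    "Intrapersonal",
    "Thinking and Reasoning",
    "Science Competencies",
)


def domain_coverage(objective_domains):
    """Return the percentage of objectives mapped to each AAMC domain."""
    counts = Counter(objective_domains)
    unknown = set(counts) - set(AAMC_DOMAINS)
    if unknown:
        raise ValueError(f"unrecognized domains: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        return {domain: 0.0 for domain in AAMC_DOMAINS}
    return {d: round(100 * counts[d] / total, 1) for d in AAMC_DOMAINS}
```

The resulting dictionary serializes directly to JSON, so it could sit next to the key topics, prerequisite topics, and module breakdown in the summary document.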

Two-Pass Curriculum Map Generation

  • Pass 1: Claude built a prerequisite dependency graph by cross-referencing each course's topics against others

  • Pass 2: Dependency map injected as ordering constraints to generate a correctly sequenced module blueprint

  • Final generation: two full curriculum maps are produced from the blueprint, Core (required competencies) and Supplementary (additional learning), and written to S3 as Excel and JSON
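
Once a prerequisite dependency map exists, turning it into an ordering constraint amounts to a topological sort. The sketch below uses Python's standard `graphlib` module; the `order_courses` function and the course names in the usage note are hypothetical, and the case study does not state that the pipeline sorts the graph this way.

```python
from graphlib import CycleError, TopologicalSorter


def order_courses(prerequisites):
    """Return courses so that every prerequisite precedes its dependents.

    `prerequisites` maps each course to the set of courses it depends on.
    """
    sorter = TopologicalSorter(prerequisites)
    try:
        return list(sorter.static_order())
    except CycleError as exc:
        # The second element of args holds the detected cycle.
        raise ValueError(f"circular prerequisite chain: {exc.args[1]}") from exc
```

For example, `order_courses({"Pathology": {"Anatomy", "Physiology"}, "Physiology": {"Anatomy"}, "Anatomy": set()})` places Anatomy first, then Physiology, then Pathology. Rejecting cycles explicitly matters here because a model-generated dependency graph may contain contradictory prerequisite claims.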

Hybrid Semantic Search API

  • Hybrid search combines Titan v2 vector similarity with PostgreSQL ILIKE keyword matching; results that appear in both are ranked highest

  • Claude generates a natural-language narrative over the top results; the search is exposed through API Gateway as GET /search with cohort-level filtering
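
The ranking rule, where results found by both searches come first, can be sketched as a merge step over the two result lists. Only the "both first" rule comes from the case study; ordering the remainder by vector similarity, and the `merge_hybrid` name, are assumptions.

```python
def merge_hybrid(vector_hits, keyword_hits):
    """Combine two result lists, ranking chunks found by both searches first.

    vector_hits: list of (chunk_id, similarity) pairs, higher is more similar.
    keyword_hits: list of chunk_ids matched by the keyword (ILIKE) query.
    """
    similarity = dict(vector_hits)
    keyword_ids = set(keyword_hits)
    # Deduplicate while keeping first-seen order.
    all_ids = list(dict.fromkeys([cid for cid, _ in vector_hits] + list(keyword_hits)))

    def sort_key(chunk_id):
        in_both = chunk_id in similarity and chunk_id in keyword_ids
        # Sorted descending: chunks in both lists first, then by similarity.
        return (in_both, similarity.get(chunk_id, 0.0))

    return sorted(all_ids, key=sort_key, reverse=True)
```

Keyword-only matches have no similarity score, so this sketch places them after the vector-only matches.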

Step Functions Orchestration for Scale

  • Step Functions orchestrated the pipeline with parallel Map states — up to 5 courses and 3 batches per course concurrently

  • ThreadPoolExecutor (3–5 workers) within each Lambda maximized Bedrock API throughput within the 15-minute timeout ceiling

  • Built-in retries, checkpoint recovery, and idempotent reprocessing ensured resilience across long-running cohort runs
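
The in-Lambda worker pool can be sketched with the standard library's `concurrent.futures`. Here `process_batch` and the injected `call_model` callable are hypothetical stand-ins for the real Bedrock calls, and items are assumed to be hashable identifiers such as chunk IDs.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_batch(items, call_model, max_workers=4):
    """Run call_model on each item concurrently; return (results, failures)."""
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(call_model, item): item for item in items}
        for future in as_completed(futures):
            item = futures[future]
            try:
                results[item] = future.result()
            except Exception as exc:
                # Record the failure and keep going so one bad item
                # does not sink the whole batch.
                failures[item] = exc
    return results, failures
```

Threads suit this workload because each call spends most of its time waiting on the network. Returning failures separately lets a retry or checkpoint step reprocess only the items that failed.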

Technologies Used

  • Amazon Bedrock — Claude Sonnet 4.5 (primary), Claude Sonnet 4 (fallback)

  • Amazon Aurora PostgreSQL Serverless v2 with pgvector

  • AWS Step Functions

  • AWS Lambda (Python 3.12)

  • Amazon S3

  • Amazon DynamoDB

  • Amazon EventBridge and EventBridge Scheduler

  • Amazon API Gateway (HTTP API)

  • Amazon Titan Embed Text v2

  • AWS Secrets Manager

Other Case Study Items

Revolutionizing Personal Loans with AI-Driven Underwriting

A leading Indian personal loan provider revolutionized their underwriting process by leveraging AI and machine learning to automate 80% of loan decisions. By integrating social and financial data into a sophisticated predictive algorithm, the company drastically reduced decision times to seconds, expanded access to underserved segments, and achieved lower default rates compared to human underwriters.

Artificial Intelligence - Powered Tyre Dimension Extraction System

JashDS developed an AI-powered computer vision system for a leading automotive e-commerce platform, enabling accurate extraction of tire dimensions from images. The solution, which increased conversion rates by 25% and reduced customer support inquiries by 80%, used YOLOv8 for instance segmentation and custom-designed augmentation techniques to simplify the online tire purchasing process.

Enhanced Jira Data Analysis for Strategic Insights

JashDS developed a flexible framework for analyzing Jira project data that is capable of handling varying export structures and custom fields. The solution leveraged GenAI and LLM technologies to provide actionable insights, identify productivity trends, and uncover potential risks across diverse software projects, resulting in a ___% improvement in team efficiency and a ___% increase in successful project outcomes.