
Letting Data Speak, AI Act!
Case Study
AI-Powered Web-RTC Meeting System

About the Client
An enterprise-grade AI-powered video training platform provider serving organizations in sales enablement, partner training, and customer success domains.

Challenge
The client needed to evaluate and implement LiveKit, an open-source WebRTC-based real-time communication platform, on a production-ready infrastructure to support their AI-powered video training solution. The existing infrastructure lacked:
Scalable WebRTC Infrastructure: The platform required a highly scalable, enterprise-grade WebRTC infrastructure capable of handling multiple concurrent real-time video sessions for training and assessment scenarios.
Multi-Agent Architecture Support: The solution needed to seamlessly run multiple AI agents including Text-to-Speech (TTS), Speech-to-Text (STT), Large Language Models (LLMs), and the client's proprietary training assessment agents, all working in coordination.
Regulatory Compliance Through Data Locality: Enterprise customers demanded strict data residency requirements, necessitating infrastructure that could maintain data within specific AWS regions to meet regulatory compliance standards.
Auto-Scaling Capabilities: The training platform experienced variable load patterns based on organizational training schedules, requiring intelligent auto-scaling for both compute resources and application pods to optimize costs while maintaining performance.
Private Network Architecture: Security requirements mandated a private node group for Redis with no public endpoints, while still maintaining operational efficiency through secure access patterns and AWS service integrations.
Without a properly architected solution, the platform would face scalability bottlenecks during peak training periods, potential regulatory compliance violations, excessive infrastructure costs, and inability to deliver reliable real-time AI-powered training experiences.

Key Results
Successfully deployed a production-ready Proof of Concept (POC) of the LiveKit platform on a multi-AZ Amazon EKS cluster spanning three availability zones, with comprehensive auto-scaling capabilities
Implemented Infrastructure as Code using Terraform, enabling repeatable deployments and reducing infrastructure provisioning time by 85%
Configured Horizontal Pod Autoscaler (HPA) and Cluster Autoscaler to dynamically scale LiveKit servers and AI agent pods based on CPU utilization, memory consumption, and active session metrics
Solution
The solution involved designing and implementing a comprehensive, enterprise-grade private EKS cluster architecture on AWS to host the LiveKit WebRTC platform and multiple AI agents for the video training platform:
High-Level Architecture

Infrastructure Architecture and Design
Designed a multi-availability zone EKS cluster architecture spanning three availability zones (us-east-1a, us-east-1b, us-east-1c) to ensure high availability and fault tolerance for real-time video communications.
Configured VPC networking with separate private subnets for EKS worker nodes (10.0.0.0/20, 10.0.16.0/20, 10.0.32.0/20) and LiveKit/Agent workloads (10.0.48.0/20, 10.0.64.0/20, 10.0.80.0/20), implementing network segmentation for enhanced security.
Implemented NAT Gateways in each availability zone to enable outbound internet connectivity for private subnets while maintaining inbound isolation.
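The Terraform sketch below illustrates this network layout. The private subnet CIDRs and availability zones are taken from the design above; the overall VPC CIDR, the small public subnets that host the NAT gateways, and all resource names are illustrative assumptions, and route tables are omitted for brevity.

```hcl
locals {
  azs           = ["us-east-1a", "us-east-1b", "us-east-1c"]
  node_cidrs    = ["10.0.0.0/20", "10.0.16.0/20", "10.0.32.0/20"]  # EKS worker nodes
  livekit_cidrs = ["10.0.48.0/20", "10.0.64.0/20", "10.0.80.0/20"] # LiveKit / agent workloads
  public_cidrs  = ["10.0.96.0/24", "10.0.97.0/24", "10.0.98.0/24"] # assumed, NAT gateways only
}

resource "aws_vpc" "this" {
  cidr_block           = "10.0.0.0/16" # assumed supernet
  enable_dns_support   = true
  enable_dns_hostnames = true
}

# Private subnets for EKS worker nodes, one per AZ.
resource "aws_subnet" "eks_nodes" {
  count             = 3
  vpc_id            = aws_vpc.this.id
  availability_zone = local.azs[count.index]
  cidr_block        = local.node_cidrs[count.index]
}

# Private subnets for LiveKit and agent workloads, one per AZ.
resource "aws_subnet" "livekit" {
  count             = 3
  vpc_id            = aws_vpc.this.id
  availability_zone = local.azs[count.index]
  cidr_block        = local.livekit_cidrs[count.index]
}

# Public subnets exist only to host the per-AZ NAT gateways.
resource "aws_subnet" "public" {
  count             = 3
  vpc_id            = aws_vpc.this.id
  availability_zone = local.azs[count.index]
  cidr_block        = local.public_cidrs[count.index]
}

resource "aws_internet_gateway" "this" {
  vpc_id = aws_vpc.this.id
}

# One NAT gateway per AZ gives private subnets outbound-only internet access.
resource "aws_eip" "nat" {
  count  = 3
  domain = "vpc"
}

resource "aws_nat_gateway" "this" {
  count         = 3
  allocation_id = aws_eip.nat[count.index].id
  subnet_id     = aws_subnet.public[count.index].id
  depends_on    = [aws_internet_gateway.this]
}
```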
EKS Cluster Deployment with Terraform IaC
Provisioned the entire EKS cluster infrastructure using Terraform Infrastructure as Code scripts, enabling version-controlled, repeatable deployments and facilitating future multi-region expansions.
Implemented comprehensive IAM roles and Kubernetes Role-Based Access Control (RBAC) policies to enforce least-privilege access principles across the infrastructure.
Set up AWS Secrets Manager integration for secure management of LiveKit API keys and agent configuration secrets.
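Building on the network sketch above, the EKS control plane and the Secrets Manager entry for the LiveKit API credentials could be declared roughly as follows. The cluster name, IAM role name, and secret name are assumptions; node IAM roles, RBAC manifests, and add-ons are omitted.

```hcl
resource "aws_iam_role" "eks_cluster" {
  name = "livekit-poc-eks-cluster-role" # assumed name
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "eks.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

resource "aws_iam_role_policy_attachment" "eks_cluster" {
  role       = aws_iam_role.eks_cluster.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
}

resource "aws_eks_cluster" "this" {
  name     = "livekit-poc" # assumed name
  role_arn = aws_iam_role.eks_cluster.arn

  vpc_config {
    subnet_ids              = aws_subnet.eks_nodes[*].id
    endpoint_private_access = true
    endpoint_public_access  = false # private cluster endpoint
  }
}

# LiveKit API key/secret are kept out of source control and injected at deploy time.
resource "aws_secretsmanager_secret" "livekit_keys" {
  name = "livekit/api-keys" # assumed name
}

resource "aws_secretsmanager_secret_version" "livekit_keys" {
  secret_id = aws_secretsmanager_secret.livekit_keys.id
  secret_string = jsonencode({
    api_key    = var.livekit_api_key
    api_secret = var.livekit_api_secret
  })
}

variable "livekit_api_key" {
  type      = string
  sensitive = true
}

variable "livekit_api_secret" {
  type      = string
  sensitive = true
}
```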
LiveKit Platform Implementation
Deployed the LiveKit components using Helm charts.
Deployed LiveKit server components on dedicated node groups with node taints (workload=livekit) to ensure workload isolation and optimal resource allocation.
Configured an internal Network Load Balancer (NLB) for LiveKit services, exposing ports 7880 (HTTP/WebSocket) and 7881 (TCP/WebRTC).
Implemented session routing and load balancing strategies to distribute WebRTC connections across multiple LiveKit server pods for optimal performance.
Optimized WebRTC configuration parameters for low-latency video streaming and efficient bandwidth utilization during training sessions.
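The sketch below outlines the Helm-based deployment and the internal NLB. The chart repository and name follow LiveKit's public Helm repo, but the values keys, namespace, and selector labels are assumptions to be verified against the chart version in use; in practice the Service can also be created through the chart's own values rather than declared separately.

```hcl
resource "helm_release" "livekit" {
  name             = "livekit"
  repository       = "https://helm.livekit.io"
  chart            = "livekit-server"
  namespace        = "livekit"
  create_namespace = true

  values = [yamlencode({
    replicaCount = 2

    # Pin LiveKit server pods to the tainted node group (workload=livekit).
    nodeSelector = { workload = "livekit" }
    tolerations = [{
      key      = "workload"
      operator = "Equal"
      value    = "livekit"
      effect   = "NoSchedule"
    }]

    livekit = {
      # Redis address shown inline for illustration only; API keys are
      # injected from AWS Secrets Manager in the real setup.
      redis = { address = "redis.livekit.svc.cluster.local:6379" }
    }
  })]
}

# Internal NLB exposing the signaling and WebRTC/TCP ports.
resource "kubernetes_service_v1" "livekit_internal" {
  metadata {
    name      = "livekit-internal"
    namespace = "livekit"
    annotations = {
      "service.beta.kubernetes.io/aws-load-balancer-type"     = "nlb"
      "service.beta.kubernetes.io/aws-load-balancer-internal" = "true"
    }
  }
  spec {
    type     = "LoadBalancer"
    selector = { "app.kubernetes.io/name" = "livekit-server" } # assumed label
    port {
      name        = "http"
      port        = 7880
      target_port = 7880
      protocol    = "TCP"
    }
    port {
      name        = "webrtc-tcp"
      port        = 7881
      target_port = 7881
      protocol    = "TCP"
    }
  }
}
```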
Multi-Agent Deployment Architecture
Deployed LiveKit TTS (Text-to-Speech) and STT (Speech-to-Text) agents.
Configured pod-to-pod communication between LiveKit servers and AI agents through Kubernetes services and network policies for secure, low-latency data exchange.
Set initial replica counts of 2 pods per agent type across multiple availability zones for high availability.
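Each agent type follows the same deployment pattern; below is an illustrative Deployment for one agent with two replicas spread across availability zones. The image URI, labels, LiveKit endpoint, and resource requests are assumptions.

```hcl
resource "kubernetes_deployment_v1" "stt_agent" {
  metadata {
    name      = "stt-agent"
    namespace = "livekit"
    labels    = { app = "stt-agent", role = "agent" }
  }
  spec {
    replicas = 2 # initial replica count per agent type
    selector {
      match_labels = { app = "stt-agent" }
    }
    template {
      metadata {
        labels = { app = "stt-agent", role = "agent" }
      }
      spec {
        # Spread the replicas across availability zones for high availability.
        topology_spread_constraint {
          max_skew           = 1
          topology_key       = "topology.kubernetes.io/zone"
          when_unsatisfiable = "ScheduleAnyway"
          label_selector {
            match_labels = { app = "stt-agent" }
          }
        }
        container {
          name  = "stt-agent"
          image = "ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/stt-agent:latest" # placeholder
          env {
            name  = "LIVEKIT_URL"
            value = "ws://livekit-internal.livekit.svc.cluster.local:7880" # assumed in-cluster endpoint
          }
          resources {
            requests = {
              cpu    = "500m"
              memory = "1Gi"
            }
          }
        }
      }
    }
  }
}
```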
Comprehensive Auto-Scaling Configuration
Implemented Kubernetes Horizontal Pod Autoscaler (HPA) for the AI agent pods with custom scaling triggers: CPU utilization >70%, memory utilization >75%, and custom metrics based on active training sessions.
Deployed Cluster Autoscaler to automatically provision additional EC2 instances when pod scheduling fails due to insufficient cluster capacity, with two node groups: Node Group 1 (t3.xlarge; min 2, desired 3, max 5 instances) for general workloads and Node Group 2 (c5.2xlarge; min 1, desired 2, max 4 instances) for LiveKit workloads.
Established scaling policies that balance cost optimization during low-usage periods with performance requirements during peak training schedules.
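The two scaling layers can be expressed roughly as follows, reusing names from the earlier sketches. Only the CPU and memory targets are shown, since the custom active-session metric depends on the client's metrics pipeline; the HPA ceiling and the node IAM role variable are assumptions.

```hcl
resource "kubernetes_horizontal_pod_autoscaler_v2" "stt_agent" {
  metadata {
    name      = "stt-agent"
    namespace = "livekit"
  }
  spec {
    min_replicas = 2
    max_replicas = 10 # assumed ceiling
    scale_target_ref {
      api_version = "apps/v1"
      kind        = "Deployment"
      name        = "stt-agent"
    }
    metric {
      type = "Resource"
      resource {
        name = "cpu"
        target {
          type                = "Utilization"
          average_utilization = 70
        }
      }
    }
    metric {
      type = "Resource"
      resource {
        name = "memory"
        target {
          type                = "Utilization"
          average_utilization = 75
        }
      }
    }
  }
}

# General-purpose node group; the LiveKit group follows the same pattern with
# c5.2xlarge instances and min 1 / desired 2 / max 4. The Cluster Autoscaler
# discovers the underlying Auto Scaling groups via its usual tags (not shown).
resource "aws_eks_node_group" "general" {
  cluster_name    = aws_eks_cluster.this.name
  node_group_name = "general"
  node_role_arn   = var.eks_node_role_arn
  subnet_ids      = aws_subnet.eks_nodes[*].id
  instance_types  = ["t3.xlarge"]

  scaling_config {
    min_size     = 2
    desired_size = 3
    max_size     = 5
  }
}

variable "eks_node_role_arn" {
  type        = string
  description = "IAM role for worker nodes (managed outside this sketch)"
}
```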
Security Implementation and Hardening
Configured Kubernetes Network Policies to restrict pod-to-pod communication to only authorized services, implementing micro-segmentation within the cluster.
Enabled TLS/SSL encryption for all inter-service communications using AWS Certificate Manager and Kubernetes ingress controllers.
Implemented security groups with strict inbound/outbound rules: LiveKit/Agent security group restricting access to authorized CIDR ranges and necessary ports only.
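A minimal example of this micro-segmentation, assuming the agent pods carry a role=agent label and listen on port 8080 (both assumptions), would allow ingress to the agents only from LiveKit server pods:

```hcl
resource "kubernetes_network_policy_v1" "agents_from_livekit_only" {
  metadata {
    name      = "agents-from-livekit-only"
    namespace = "livekit"
  }
  spec {
    # Applies to all agent pods in the namespace.
    pod_selector {
      match_labels = { role = "agent" }
    }
    policy_types = ["Ingress"]
    ingress {
      # Only LiveKit server pods may connect, and only on the agent port.
      from {
        pod_selector {
          match_labels = { "app.kubernetes.io/name" = "livekit-server" }
        }
      }
      ports {
        port     = "8080"
        protocol = "TCP"
      }
    }
  }
}
```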

Technologies Used
Amazon Elastic Kubernetes Service (EKS) - Managed Kubernetes control plane for orchestrating containerized workloads
Terraform - Infrastructure as Code for provisioning and managing AWS resources
LiveKit - Open-source WebRTC platform for real-time audio and video communications
Kubernetes Autoscaling - Horizontal Pod Autoscaler (HPA) and Cluster Autoscaler for dynamic resource management
AWS VPC & Networking - NAT Gateways, VPC endpoints, Network Load Balancer for secure network architecture
AWS Secrets Manager - Secure secrets management and rotation for sensitive configuration data
Amazon ECR - Container registry for storing and managing Docker images with vulnerability scanning
Docker/Containers - Containerization of LiveKit servers and AI agent applications
Redis - In-memory cache used by LiveKit to store state for ongoing meeting sessions


