Work & Case Studies — Navneet Singh

Project Genius — Enterprise GenAI Platform

ML Advisor / Principal GenAI Architect

The Challenge: Cigna's healthcare operations involve millions of complex documents — insurance claims, medical records, provider contracts — with intricate table structures that traditional OCR and rule-based extraction cannot handle reliably. Manual processing was expensive, error-prone, and slow.

The Approach: Designed an end-to-end GenAI pipeline combining prompt engineering with few-shot retrieval. Built custom fine-tuned models using embedding + CNN architectures specifically for complex table extraction. The system uses FAISS for vector search to find similar historical extractions, feeds them as few-shot examples to the LLM, and validates output through multi-stage evaluation gates.

Key Decisions: Implemented LLM guardrails for safety, PII/data controls for HIPAA compliance, and established cost/latency SLOs. Chose LoRA fine-tuning over full fine-tuning for cost efficiency. Deployed with vLLM for high-throughput serving. Security-by-design was non-negotiable in healthcare.

Impact: Established as the reference architecture for all GenAI initiatives across Cigna's AI CoE. Serving as the ML Advisor setting standards for the organization's GenAI roadmap.

legalAI — Document Intelligence Platform

NLP Architect

The Challenge: Volkswagen's legal department needed to process thousands of complex legal documents — contracts, regulatory filings, compliance reports — with intricate layouts, nested tables, and multi-column formatting. Existing PDF extraction tools produced garbled output on these documents, requiring extensive manual correction.

The Approach: Developed a novel pipeline combining computer vision-based layout analysis with semantic merging and entity logic. The CV component identified structural elements (headers, paragraphs, tables, lists) using layout detection models. The semantic merging layer then reconstructed logical reading order across multi-column layouts and split/merged cells.

Key Innovation: The entity logic layer used named entity recognition to validate extracted fields against known legal entity types, catching extraction errors that pure layout analysis would miss. This hybrid approach — CV for structure, NLP for semantics — doubled throughput and improved extraction accuracy by 37%.

QAM — Hardware Knowledge Question Answering

NLP Architect

The Challenge: VW's engineering teams needed instant access to technical knowledge spread across thousands of hardware specification documents. Engineers spent hours searching through manuals, datasheets, and internal wikis for specific technical details.

The Approach: Built a RAG-based Question Answering Machine using FLAN-T5 as the generation backbone with VectorDB for semantic retrieval. Ingested and chunked the hardware knowledge base, generated embeddings for semantic search, and built a retrieval pipeline that surfaces the most relevant document passages for any engineering query.

Key Design: Implemented a two-stage retrieval — first coarse retrieval via vector similarity, then re-ranking using cross-encoder scoring for precision. The system handles ambiguous queries through query expansion and provides source attribution for every answer, enabling engineers to verify and drill deeper.

TIVA — Real-Time Video Analytics at Scale

Practice Lead — CV & IoT

The Challenge: Surveillance and industrial monitoring systems generated thousands of video streams but lacked intelligent analysis. False alarm rates were astronomically high (often 90%+), making the systems unusable for real-time decision making. Edge devices had severe compute constraints, and cloud-only processing introduced unacceptable latency.

The Approach: Built TIVA — Trigyn Intelligent Video Analytics — an end-to-end platform covering object detection, intrusion detection, people counting, ANPR (Automatic Number Plate Recognition), and anomaly detection. Designed a hybrid edge-cloud architecture where lightweight models run on Jetson Nano/TX2 for real-time inference, while heavier analytics run in the cloud.

Key Innovations: Applied model pruning and quantization strategies to fit production-grade detection models within 4GB edge device memory. Developed a custom post-processing pipeline with temporal filtering and zone-based logic that slashed false alarms by 75%. Introduced model versioning and A/B testing to continuously improve accuracy in the field.

Leadership: Managed a cross-functional team of 12 — embedded engineers, backend developers, frontend designers, and data scientists — establishing code review practices, sprint workflows, and model evaluation benchmarks.

75%

Fewer false alarms

1000s

Video streams

17-20%

Cost reduction

35%

FP improvement

Tech Stack

PyTorchTensorRTONNXJetson Nano/TX2CUDAOpenCVGStreamerDockerFlask

Retail Computer Vision Suite

Data Science Lead

The Challenge: Retail clients needed intelligent customer analytics — understanding who enters their stores, how many people visit, and demographic patterns — without requiring expensive dedicated hardware or manual counting.

The Approach: Built a suite of computer vision models: face recognition for customer identification using ArcFace embeddings with triplet loss training, crowd counting algorithms using density estimation for footfall analysis, and age/gender classification for demographic profiling.

Key Technical Decisions: Chose transfer learning with pretrained ResNet backbones over training from scratch, dramatically reducing data requirements. Implemented real-time inference optimization to run on commodity hardware (standard IP cameras + consumer GPUs), making the solution accessible to retail clients without expensive infrastructure upgrades.

SERT — Merchant Engagement & Relevance

Module Lead — Big Data

The Challenge: American Express needed to understand and score open-card merchant engagement across millions of transactions. Which merchants are most relevant to which cardholders? How do you surface the right offers at the right time? The data was massive, heterogeneous, and arriving in near-real-time.

The Approach: Architected SERT — Speed, Engagement, and Relevance Tool — using Hadoop for batch processing of historical transaction patterns, HBase for low-latency merchant profile lookups, and Elasticsearch for full-text search across millions of merchant descriptions and categories. Built collaborative filtering-based recommendation engines on the processed data.

Architecture Decisions: Designed the ETL pipeline using Oozie for workflow orchestration, Hive for SQL-accessible data warehousing, and custom MapReduce jobs for the heavy-lifting transformations. The system needed to handle data quality issues at scale — missing fields, duplicate records, format inconsistencies across partner data feeds — so built a robust data validation and reconciliation layer.

Work That Matters

Project Genius — Enterprise GenAI Platform

Tech Stack

legalAI — Document Intelligence Platform

Tech Stack

QAM — Hardware Knowledge Question Answering

Tech Stack

TIVA — Real-Time Video Analytics at Scale

Tech Stack

Retail Computer Vision Suite

Tech Stack

SERT — Merchant Engagement & Relevance

Tech Stack

Want the Full Story?