work · 2023 — now

Three roles. One thread.

Each job taught me a different layer of the same problem: how to take something messy — user behaviour, an LLM's worst day, a vague product brief — and turn it into something a team can act on. Here's the long version.

oct 2024 — present Bengaluru

EMA

AI Evaluation Analyst · Agentic AI for the enterprise

EMA builds AI agents for the enterprise — sales, support, HR, the long tail of knowledge work. My job is to make sure those agents are trustworthy enough to ship. When I joined, there was no structured evaluation process for the 50+ agents in the pipeline. I built it.

  • CSM Frontier App: Built an internal full-stack app in Claude Code to fix the fact that nobody had a single view of live deals. Designed the backend workflows for AI-powered data extraction and document upload, demoed to leadership, and shipped it. Now used by 20+ folks across GTM and Solution Architecture as the daily ground truth for deal tracking.
  • 0 → 1 AI Launch: Designed and owned the end-to-end QA framework for the 50+ AI agents — defined evaluation criteria, set up the test harnesses, ran the regression sweeps. Directly unblocked the first public launch and supported closure of a key enterprise deal.
  • LLM Accuracy: Diagnosed consistent failure patterns in LLM outputs that were causing customer escalations. Worked with engineering on targeted fixes — prompt tweaks, retrieval changes, agent routing logic. Improved accuracy by 20–30% and measurably reduced support load.
  • Integrations: Led rollout and validation of 50+ third-party integrations where production reliability had been inconsistent. Cut the issue rate sharply and improved end-user reliability at scale.
  • Product Decisions: Surfaced evaluation data as structured input to PM prioritisation. Directly influenced feature tradeoff calls across multiple release cycles — including a few "no, don't ship that yet" calls.
  • User Issues: Identified recurring AI output failures before they became widespread complaints. Prioritised fixes that reduced user-facing errors and improved retention.
LLM Evaluation · AI Agents · Claude Code · Prompt Engineering · Full-stack tools · Enterprise AI
● present
nov 2023 — oct 2024 Hyderabad

Phenom

Product Analyst · Talent experience platform

Phenom is an enterprise talent platform serving Fortune 500s. I sat at the intersection of CS and Product — close enough to user pain to see the early signals, close enough to engineering to actually do something about them.

  • Churn Prevention: Identified behavioural churn signals nobody was actively monitoring. Built an alert system for the CS team covering three of them: drops in seat usage, declining feature engagement, and support ticket velocity. Prevented an estimated $200K in revenue loss.
  • Adoption Gap: Analysed drop-off patterns across user journeys, identified friction points, drove targeted fixes with engineering. Result: a 5% lift in feature adoption on the touched flows.
  • Metrics Layer: Defined and tracked 35+ metrics across retention and engagement where no unified KPI framework existed. Built leadership a consistent dashboard for data-backed decisions instead of gut-led ones.
  • User Insights: Translated raw behavioural data into weekly executive reports that directly shaped roadmap priorities — and, more importantly, surfaced issues early enough to fix them before they hit the next QBR.
Churn Analytics · KPI Frameworks · Mixpanel · SQL · Executive Reporting
archived

Where the thinking got shaped.

2022 — 2023 Mohali

Plaksha University

Tech Leaders Fellowship · PG Programme in AI & Leadership

Selected as 1 of 60 young leaders from India on a $6,000 merit scholarship — a 6% selection rate. The fellowship is a multidisciplinary tech leadership development programme run in collaboration with the University of California, Berkeley and Purdue University. This is where AI stopped being abstract for me.

$6K scholarship · UC Berkeley collab · Purdue collab · 6% selection
complete
2018 — 2021 Delhi

Ambedkar Institute of Technology

Bachelor of Computer Applications

The foundation — algorithms, web dev, databases, and the slow realisation that the most interesting questions were never about the code. They were about why the user clicked the wrong button.

BCA · Foundation
complete