a little about me

I work where LLMs meet product, and refuse to pick a side.

I'm Vanshu — currently in Bengaluru, currently at EMA. I spend most of my days reading LLM outputs and asking why they went wrong, then helping the team turn those answers into shipped fixes.

How I got here

I started in computer applications at Ambedkar Institute of Technology in Delhi. Halfway through, I realised I was less interested in writing software and more interested in why people use it (and stop using it). That curiosity got me into the Plaksha Tech Leaders Fellowship — 60 students, scholarship, a year of intense exposure to AI, design, and product alongside UC Berkeley and Purdue.

From there it was the NextLeap PM Fellowship (top 4% of 500+), an internship at Blink X where I owned a stock-education app end to end, and then Phenom — where I learned what it actually feels like to defend a churn metric to a CS team that needs the answer yesterday.

Now I'm at EMA, building the evaluation layer for a fleet of AI agents that real enterprises are putting real money behind. There's no playbook for QA-ing a 50-agent fleet that hallucinates differently each day. We're writing it.

What I'm good at

Looking at messy data — LLM outputs, user funnels, behavioural signals — and finding the one pattern that explains most of the noise. Building tools nobody asked for that quietly become the thing the team can't live without (CSM Frontier is the latest example). Writing things down clearly enough that an engineer, a sales lead, and a VP can all agree on what we're doing.

What I'm still figuring out

How to balance speed and rigour when the LLM ecosystem moves faster than the evaluation literature. How to design eval frameworks that survive a model upgrade. How to explain to non-technical stakeholders that "the AI is wrong sometimes" isn't a bug; it's the entire product surface.

Outside of work

I read a lot about how products and people fail — startup post-mortems, behavioural psychology, the occasional Karpathy lecture. I built Freese in a weekend because a friend's PCOS conversation wouldn't leave my head. I will, given any opportunity, talk about why product teams underweight evaluation. Then I'll talk about it some more.

Reach out

If you've got a hard AI product problem, or just want to argue about the right way to evaluate an agent, drop me a line at vanshu.bu@gmail.com, or find me on LinkedIn.