Does GenAI "belong" to data scientists? — Phil Hetzel, Braintrust
TL;DR
Agents are not just another predictive model — Hetzel argues that OpenAI, Anthropic, and Mistral already did the core model-training work, so building agents is less about classic ML pipelines and more about shaping prompts, context, systems, and feedback loops.
Traditional enterprises often assign GenAI to the wrong team by default — He sees CEOs and CIOs push “we need agents,” then delegate the work to existing ML or data science teams simply because generative AI sounds adjacent to their remit.
Context engineering changes who can contribute — Unlike traditional ML, where value often comes from feature engineering and retraining, agent behavior can often be improved by changing prompts and context, which opens the door to product managers and domain experts.
Data scientists still matter most around rigor and guardrails — Hetzel says ML-minded teams bring a healthy skepticism about how LLMs work, stronger testing discipline, and the ability to evaluate LLM-as-judge systems with labeled datasets and metrics like precision, recall, and F1.
The biggest failure mode is optimizing the wrong metrics — Teams trained on traditional ML can overfocus on precision, recall, and F1, while agent evaluation needs a broader view of functional performance across a much wider surface area.
Great agent teams are intentionally cross-functional — His ideal setup combines product, application, and systems engineers with non-technical subject matter experts doing prompt design and human annotation, plus data scientists building eval and observability pipelines.
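To make the point about evaluating LLM-as-judge systems concrete, here is a minimal sketch of scoring a judge's pass/fail verdicts against human-annotated labels using precision, recall, and F1. The function name `score_judge`, the `JudgeMetrics` container, and the sample labels are hypothetical illustrations, not something taken from Hetzel's talk or from Braintrust's tooling.

```python
from dataclasses import dataclass


@dataclass
class JudgeMetrics:
    precision: float
    recall: float
    f1: float


def score_judge(human_labels: list[bool], judge_labels: list[bool]) -> JudgeMetrics:
    """Compare an LLM judge's verdicts with human annotations.

    True means "this agent output is acceptable". The human labels are
    treated as ground truth (an assumption of this sketch).
    """
    if len(human_labels) != len(judge_labels):
        raise ValueError("label lists must have the same length")

    pairs = list(zip(human_labels, judge_labels))
    tp = sum(1 for h, j in pairs if h and j)        # judge agrees on a pass
    fp = sum(1 for h, j in pairs if not h and j)    # judge passes a bad output
    fn = sum(1 for h, j in pairs if h and not j)    # judge fails a good output

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return JudgeMetrics(precision, recall, f1)


# Hypothetical labels for five agent outputs.
human = [True, True, False, False, True]
judge = [True, False, False, True, True]
metrics = score_judge(human, judge)
print(metrics)
```

These numbers describe how trustworthy the judge is, not how good the agent is. That distinction matches the talk's warning: the classic metrics fit the judge well, while the agent itself needs a broader functional evaluation.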
The Breakdown
“The model’s already built” is Phil Hetzel’s blunt case for why agentic AI shouldn’t be handed automatically to data scientists just because it has “AI” in the name. His answer lands in the middle: the best agent teams mix data scientists, product and systems engineers, and domain experts who actually understand the problem.