
Alex Lubyansky says AI crossed a real scientific threshold — he went from thinking AI was “useful for email” to calling GPT-5 “the most important discovery in my lifetime” after it reproduced one of his hardest black-hole calculations in under 30 minutes.
GPT helped crack a gluon problem that top physicists had been stuck on for a year — in the paper “single minus gluon tree amplitudes are non-zero,” Lubyansky, Andrew Strominger, Alfredo Guevara, and David Skinner showed that amplitudes long assumed to vanish can survive in a special collinear regime, and GPT-5.2 Pro guessed the simple general formula.
The key technical jump was from factorial mess to linear structure — humans had a horrible Feynman-diagram expansion where the number of terms blew up super-exponentially with particle count, while GPT proposed a Parke-Taylor-like formula whose complexity grows only linearly with n.
An internal OpenAI model didn’t just guess the gluon answer, it rederived and proved it: after a 12-hour run from a clean problem statement, it independently rediscovered the same formula and generated the proof skeleton that became most of the paper.
The follow-up graviton paper moved even faster and used public ChatGPT Pro — three weeks later, the team extended the result from gluons to gravitons, with GPT-5.2 Pro reading the first paper, suggesting next steps over a 110-page chat, and drafting text close to the final arXiv submission.
Lubyansky thinks the new bottleneck is no longer calculation but taste and verification — AI can already produce paper-grade derivations quickly, but deciding which questions matter, training students in a world where “rites of passage” get crushed by models, and verifying outputs may define the next phase of AI-for-science.
Alex Lubyansky opens with a blunt claim: in some directions, AI is already superhuman, and theoretical physics is one of the places where that became impossible for him to ignore. He says o3 was the first model that could do real research math for him; then GPT-5 arrived and reproduced one of his best papers in about 30 minutes. That was the moment he started telling colleagues, “pay attention,” and he eventually joined OpenAI during a sabbatical.
He gives the high-level physics setup: quantum field theory is the 20th-century miracle that reconciles relativity with quantum fuzziness, and scattering amplitudes are the core objects that encode probabilities for particle collisions. In gluon physics, the simplest helicity configurations were long thought to vanish: all-plus amplitudes are zero, and textbooks said single-minus amplitudes were zero too.
Andrew Strominger, Alfredo Guevara, and David Skinner had realized a year earlier that the textbook argument had a loophole: if particles are exactly aligned, or collinear, single-minus amplitudes need not vanish. But getting the actual answer by hand produced a nightmare expansion — by six points, the formula exploded into page-filling sums, with complexity growing factorially in the number of particles.
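To get a feel for the gap between factorial and linear growth described here, the following is a minimal sketch. It assumes nothing about the actual term counts in the hand calculation, which the summary does not give; it simply uses n! and n as stand-ins for "factorial-scale" and "linear-scale" complexity, and the function name `growth_table` is hypothetical.

```python
from math import factorial


def growth_table(n_min=4, n_max=10):
    """Return rows (n, n!, n) contrasting factorial and linear growth.

    Illustrative stand-ins only: these are not the real term counts
    from the gluon calculation, which the source does not report.
    """
    return [(n, factorial(n), n) for n in range(n_min, n_max + 1)]


if __name__ == "__main__":
    for n, factorial_scale, linear_scale in growth_table():
        # Show how quickly the factorial column outruns the linear one.
        print(f"n={n:2d}  factorial-scale={factorial_scale:>9,d}  linear-scale={linear_scale}")
```

Even at ten particles the factorial column is already in the millions while the linear column is still ten, which is the qualitative difference the talk is pointing at.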
Lubyansky invited Strominger to OpenAI to try attacking the problem with AI, half-expecting a useful failure. Instead, while Strominger was literally still in transit, ChatGPT simplified the five-point case, then the six-point case, and finally guessed the general n-particle formula in a special region of phase space; Lubyansky describes it as the Parke-Taylor-style simplification they’d been hunting for all year.
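For readers unfamiliar with the reference, the classic Parke-Taylor formula can be written down as a point of comparison. This is the well-known two-minus (MHV) result in spinor-helicity notation, shown up to coupling and normalization factors; it is not the new single-minus formula from the paper, which this summary does not reproduce. The vanishing statements are the textbook ones for generic, non-collinear momenta.

```latex
% Textbook tree-level statements for generic (non-collinear) momenta:
A_n^{\text{tree}}(1^+, 2^+, \ldots, n^+) = 0, \qquad
A_n^{\text{tree}}(1^-, 2^+, \ldots, n^+) = 0 .

% Classic Parke-Taylor (two-minus, MHV) color-ordered amplitude,
% up to coupling and normalization:
A_n^{\text{tree}}(1^+, \ldots, i^-, \ldots, j^-, \ldots, n^+)
  \;\propto\;
  \frac{\langle i\,j \rangle^{4}}
       {\langle 1\,2 \rangle \langle 2\,3 \rangle \cdots \langle n\,1 \rangle} .
```

The denominator has exactly n factors, so the size of the expression grows linearly with the number of particles. That is the kind of compactness "Parke-Taylor-style" refers to.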
The public model could conjecture the formula but not fully prove it, so they handed the sharpened problem to a stronger internal physics model. After thinking for 12 hours, it independently rediscovered the same formula and produced the proof strategy that became the backbone of the gluon paper — which the team intentionally presented as a physics result first, with only a brief note about AI in the text.
Then came the sequel, “single minus graviton tree amplitudes are non-zero,” extending the story from the strong force to gravity. Lubyansky says the wild part is that this one was done with public GPT-5.2 Pro: they gave it the gluon paper, a few key modifications, and a “good luck, you’re a brilliant theoretical physicist” prompt, and over a 110-page exchange it worked through checks, suggested next steps, used the directed matrix tree theorem, and drafted text close to the final paper.
For Lubyansky, the practical gains are concrete: he spends much less time confused, and can now launch many parallel “scouts” into the unknown by spinning up multiple chats to test different approaches. But he’s equally clear that the hardest skill in physics is still taste — not doing the calculation, but knowing what question is worth asking — and today’s models look more like insanely capable graduate students than autonomous scientific visionaries.
He worries academia has no good answer yet for how to train young physicists when the old rite-of-passage problems are now crushable by models. He also thinks science is entering an era where AI can churn out paper-grade results fast enough to flood arXiv, forcing researchers to raise the bar and focus on deeper questions, while verification — not derivation — becomes the limiting step. His parting message is simple: this is already happening in quantum gravity and field theory, and if the trajectory continues for another 6 to 12 months, research is going to look very different.