AIExplained

5 Takeaways from Sutskever Breaking Silence + Opus 4.5

Added 2025-11-27 18:45:05 +0000 UTC

Interview Transcript: https://www.dwarkesh.com/p/ilya-sutskever-2

The Information Exclusive: https://www.theinformation.com/articles/openai-ceo-braces-possible-economic-headwinds-catching-resurgent-google?rc=sy0ihq

Opus 4.5: https://www.anthropic.com/news/claude-opus-4-5

Error Bars: https://arxiv.org/pdf/2411.00640

OpenAI NYT: https://www.nytimes.com/2025/11/23/technology/openai-chatgpt-users-risks.html#selection-1759.88-1759.333

GPQA Organic Chemistry Gemini 3 Pro: https://x.com/EpochAIResearch/status/1993363375108333616

Active Parameters: https://epochai.substack.com/p/notes-on-gpt-5-training-compute

GPT 5.1 Codex Max Metr: https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/

Superalignment: https://openai.com/index/introducing-superalignment/

Comments

From my experience, the biggest "step change" in reliability has been Gemini 2.5 Deep Think. Most of the prompts I have used in the past didn't cause errors and fixed existing ones, and that was after about 60 iterations. That being said, it was a basic, cross-the-road style game like Crossy Road. All other models, even 3.0 Pro and Opus 4.5, haven't been comparable for me. My guess is that 3-5 major breakthroughs like the Transformer are needed to get to some sort of AGI: thinking memory (maybe Titans or Nested Learning?), some physical form factor (humanoid robot?), self-reflection, self-learning, and probably some others. I think the major question that remains is: can the current state of AI help us discover those breakthroughs faster? If yes, I think that due to compounding, exponential effects, we're probably closer than 2055. But if no, then it may take longer than the "2-5 years" people like Elon and Demis are saying.

Luke Litowitz

2025-12-01 12:47:03 +0000 UTC

Philip, as always, you have very penetrating, insightful observations. I agree that Ilya, in this interview, was like a different person. It was almost shocking. Only about 18 months separate his "straight shot" enthusiasm from an apparent 180-degree reversal today. One might expect such reversals from an uninformed commenter. But Ilya had more experience than most, and likely early awareness of looming test-time compute plans. His SSI co-founder, Daniel Gross, was quite specific in interviews around 2024. Gross stated that the company planned to "spend a couple of years doing R&D on our product before bringing it to market." Since their "one product" is safe superintelligence, this could imply (then) a very aggressive anticipated timeline, possibly expecting a minimally deployable version within the late 2020s. Ilya himself said SSI's strategy involved trading commercial revenue for speed and focus. That might suggest they then believed the technology could be built fast enough that they wouldn't need intermediate revenue to survive. How their viewpoints seemed to have changed in only 18 months! Given the lack of forecasting precision by one of the most informed people, it makes one wonder about future projections. I personally think AI Explained offers better, more rational insight than Sutskever himself demonstrates.

Joe Marler

2025-11-30 16:20:13 +0000 UTC

To answer your question: yes, I think you should do a video on Opus 4.5. It's definitely more than an iterative improvement.

Erik

2025-11-29 22:19:21 +0000 UTC

Yes! I used it to resolve just such an issue!

Philip

2025-11-29 10:23:43 +0000 UTC

The video had 'Relentless Learning' in the title, I think. It's maybe 3 weeks old.

Philip

2025-11-29 10:23:33 +0000 UTC

Great clip! Quick question: at 2:10 you refer to your video on "nested learning". I can't seem to find it. Which video is this?

Frederick Batzler

2025-11-29 05:10:47 +0000 UTC

Enjoyed it as usual, Philip. As for the model debate capability, can that be used to solve an issue? (E.g. "here is a coding problem I have, what is the best approach?")

Daniel A Barbatti

2025-11-29 01:08:44 +0000 UTC

If he’s right, then there should be a period of stagnation that will crash a lot of the AI stocks that were counting on further rapid advancements (Tesla FSD and robotics jump to mind). But companies making do with what exists now will still find value.

Bibity bop

2025-11-28 19:23:00 +0000 UTC

Should have mentioned the 62% on SimpleBench too.

Philip

2025-11-28 11:45:01 +0000 UTC

Ah you got there first! Yes lmcouncil.ai.

Philip

2025-11-28 11:44:31 +0000 UTC

llmcouncil.ai? The site just has a "contact us" box. Am I missing something? Update: turns out it's lmcouncil.ai, not llmcouncil.ai.

Niall Riddell

2025-11-28 10:53:05 +0000 UTC

I don't think Ilya was saying that he doesn't feel the AGI. He was saying that incremental deployment is helpful because it's the only way to make the public feel the AGI. But yes, I am really surprised by the 5-20 years forecast. For context, 20 years is almost 1/3 of the time between the Dartmouth Conference (1956) and now. Given current SOTA, plus researchers and capital flocking to the field, I cannot believe it will take that long. Unless an exogenous factor causes a new AI winter.

Vlad Gheorghe

2025-11-28 10:30:30 +0000 UTC

But didn’t you just “Feel the AGI”? Maybe you don’t have the “it”? /sarcasm

Pavol Vaskovic

2025-11-28 07:14:52 +0000 UTC

They have an 80% version, but I agree 99% would be more interesting (and also more annoying to benchmark).

Markov

2025-11-27 20:54:30 +0000 UTC

On your closing point about the METR evals: going from ~2 hours to ~2 days is a big deal, but remember that the main benchmark is based on a 50% success rate for a given time horizon. What can they do with 99% or 99.9% success? As long as we need to check the actual code produced, not just higher-level outputs, longer time horizons just give us more code to review and edit. I'm excited for the self-chat feature. I've been thinking of this kind of thing for a while.

Chris Prosser

2025-11-27 20:47:58 +0000 UTC

Some analogies you cannot give without sounding at least a bit ridiculous, but he could have been a better wordsmith in this interview. But if you believe AGI will come in 5 years, maybe continent-sized datacenters are not that ridiculous in 20. Any speculation about events well past AGI is going to seem absurd. AGI is already inconceivable to most people, i.e. it's very hard to "feel the AGI".

Markov

2025-11-27 19:21:38 +0000 UTC

I actually watched this interview already and was not impressed. I went in with a positive bias for him. But when I watched it, it felt like he just hasn't thought this through. Dwarkesh put some pretty obvious logic to him about the downsides of a straight shot to SSI and he didn't seem prepared for it. He also seemed to rely on vague statements, with lots of pauses, to try and add gravitas. At one point he said something like "I think it would have value if we could somehow cap super intelligence, but I'm not sure how to do that". Well, no shit, Sherlock, but I had hoped for more from the founder of Safe Superintelligence. To steelman my own point, I think it's possible he has a lot more to say but can't say it because of intellectual property concerns. I don't rest easy relying on these folks to guide us to a good outcome, though.

2025-11-27 19:02:05 +0000 UTC

Been waiting for your take on Opus 4.5. I'll watch this on my run. I've been really impressed with its coding capabilities and how it reasons through problems. It even told me my idea wouldn't work when I was trying to optimize some code.

Shawn Rosofsky

2025-11-27 18:49:27 +0000 UTC
