AIExplained

10,000x Scaling Deep Dive, and a 5-year LLM Roadmap

Added 2024-09-01 21:04:42 +0000 UTC

A new 20,000-word report on AI scaling, and yes, I read it all to bring you the highlights. What are the biggest unanswered questions for whether we will scale models 10,000x, and is there a deeper question that underlies them all? Plus new clips from the Anthropic CEO, Simple update, Eric Schmidt and more…

Link for Download and Off-line Watching: https://drive.google.com/file/d/1EObJY3kVgXRxH4N0mWxd3fqYWOEPNckW/view?usp=sharing

Epoch Report: https://epochai.org/blog/can-ai-scaling-continue-through-2030

Stargate Report: https://www.theinformation.com/articles/microsoft-and-openai-plot-100-billion-stargate-ai-supercomputer?rc=sy0ihq

My StarGate Video: https://www.youtube.com/watch?v=KXG2f-So9oo

Eric Schmidt Leak: https://www.cnbc.com/2024/08/15/eric-schmidt-on-nvidia-you-know-what-to-do-in-the-stock-market.html

Amodei Interview: https://www.youtube.com/watch?v=7xij6SoCClI

Market Caps: https://companiesmarketcap.com/
Cerebras: https://inference.cerebras.ai/

Noam Brown Tweet: https://x.com/polynoamial/status/1803844406480638284

RT-2-X: https://deepmind.google/discover/blog/scaling-up-learning-across-many-different-robot-types/

Exponential Data Might Be Needed: https://arxiv.org/pdf/2404.04125

Comments

@Barnaby it only feels like that because you're living in linear time. On the exponential scale, they're a straight line.

Gregory Klopper

2024-10-11 16:02:08 +0000 UTC

It feels like the time-scales are getting shorter.

Barnaby Golden

2024-09-19 21:18:25 +0000 UTC

Yeah what happens if scaling does continue for 10,000x AND it continues with o1 -> o3 (another 10,000x). ASI?

Dane Holmberg

2024-09-19 20:59:29 +0000 UTC

Has the release of o1-preview changed your conclusions from this video at all?

Barnaby Golden

2024-09-19 16:35:35 +0000 UTC

No problem, and yeah “Beff Jezos” is interesting to say the least. Here is a video Extropic released that does go into a bit of detail (and didn’t get many views): https://youtu.be/mxLuoifgJdU?si=2rP-A-XFmwKgzMe7

James Patton

2024-09-05 16:29:59 +0000 UTC

Thanks so much Juanjo! I see you retweeting me often, am grateful for your support on multiple platforms!

Philip

2024-09-05 15:58:01 +0000 UTC

So kind James, I will be honest in saying I don't know much about Extropic, other than the controversies around its leadership, so will watch them more carefully after what you said.

Philip

2024-09-05 15:57:39 +0000 UTC

I once did a video on here vaguely related to your Stack Overflow comments, it was about a patent OpenAI did. But you raise so many interesting points, makes me wonder what you do for work Blake!

Philip

2024-09-05 15:56:49 +0000 UTC

Me too Vlad, me too

Philip

2024-09-05 15:55:37 +0000 UTC

Thanks Mark, and yes, that term has become so confusing, and polarising, that I steer clear of it more than I used to. Even the demo-AGI term I tried to popularize does not fit the bill, so Andrew is right in that sense that it very much depends on our definitions.

Philip

2024-09-05 15:55:30 +0000 UTC

Yeah I toyed with including it, and DiLoCo, but had to weigh up the heaviness of the video for a general audience!

Philip

2024-09-05 15:54:16 +0000 UTC

That is so kind Joe, thank you

Philip

2024-09-05 15:53:45 +0000 UTC

Philip: you have a truly penetrating insight. Amidst all the distracting noise in this area, your incisiveness cuts straight to the core issues. Your commentary and analysis are a breath of fresh air.

Joe Marler

2024-09-05 12:09:14 +0000 UTC

Great video as always! Btw on the topic of power and distributed training, Nous Research lately talked about DisTrO to make this easier: https://github.com/NousResearch/DisTrO Right now just preliminary info, but a full paper is coming if their readme is to be believed.

Shawn Fumo

2024-09-03 13:01:15 +0000 UTC

Yeah, it feels like the next 6 months or so may tell us a lot. And whether or not OpenAI themselves wow us, between Anthropic, DeepMind, xAI, Meta, now Magic coming out of stealth, etc., it feels like someone is going to have something impressive. If none of those manage anything other than incremental improvements, the “AI has slowed” people may have a point. Then again, all it takes is one breakthrough to possibly turn that all on its head. I keep coming back in my mind to that recent paper where researchers trained a good-quality image generator from scratch for less than $2000 and 37M images, versus SD 1.5 costing $300k and using almost 4B images. We may not see something quite that dramatic on the LLM side, but then again, who knows. It still feels so early in all of this.

Shawn Fumo

2024-09-03 12:57:21 +0000 UTC

I’m not sure how much time awareness something like Gemini Pro has, but I know Covariant has stuff baked into its robot arm AI (mostly for picking operations) to do things like visualize what will happen given various physical actions. Interestingly, their co-founders were just hired by Amazon to incorporate their tech into Amazon’s warehouses, etc.

Shawn Fumo

2024-09-03 12:50:53 +0000 UTC

Truly one of your most insightful and thought-provoking videos ever, Philip. Unless I missed it, I think you barely used the term AGI. Perhaps because it’s such a poorly defined concept that it’s kind of become a distraction. But I wonder if you heard Andrew Ng recently say that he believes AGI is at least decades away, and how you would square that with what seem to be more ambitious projections from the Epoch paper.

Mark Levine

2024-09-03 12:11:22 +0000 UTC

Sometimes I myself feel like "a blind reinforcement learning model, just fumbling amid the infinities of the universe". Thank you!

Vlad Gheorghe

2024-09-02 20:38:48 +0000 UTC

I agree strongly with the notion of a data labeling/synthetic data revolution. I think that it extends past just labeling because I think there is low-hanging fruit that hybrid-synthetic data generation would manifest exponentially faster. I'm making the assumption that any leaps in purely synthetic data generation for informing world modeling will only be possible after innovations are conceptually demonstrated in a supervised process. That cottage industry you mention in your video is real, and kind of scary. Although, please know that I'm still very new to the field, and most of my knowledge is informed by an informal education and practical experience, so please correct me on any points. Right now, there is an upswell of data annotation for large language models that goes beyond A/B testing. I'd argue that there is some A/B testing going on that is a facade for aggregating domain expert reasoning. And because that sentence sounds almost conspiratorial, I'd appreciate any correction in my interpretation of the processes I've observed. If it's reasonable to use coding tasks as a microcosm of the data that could be scaled synthetically with a direct impact on performance (memorization or otherwise), then this process is kind of freaky:
- First, an A/B testing environment is set up for evaluating comparative performance on coding tasks.
- Second, human annotators are given rating criteria with three essential parts: 1) rating scales for qualitative aspects of the response, 2) a section for reviewers to explain the reasoning behind their rating, with specifications for how the reviewer should write and the details to include, 3) proof of code execution.
- Finally, the human annotators are given the option to edit a model response such that the solution can be executed.
The process is interesting because of the requirement to explain one's reasoning and provide proof of code execution, but it's freaky because of the detailed instructions and guidance on how an annotator must explain why the provided solution to the coding task is wrong and how it is corrected. From my naive understanding, this is more than high-quality, expert-informed data on a model's performance on coding tasks. If I think about Stack Overflow as a baseline, this process seems like a very simple way to take a step past A/B testing and systematically produce proper language modeling of the conditions for a broad range of coding tasks and their correct completion, much like Stack Overflow, but without tangential comments and competing explanations. I don't think this process alone will scale, but if LLMs can be trained to perform the same process in tandem, maybe synthetic data that mimics the Stack Overflow style of post would be something LLMs could extrapolate into its most generalized form. I can't tell if this kind of hybrid-synthetic data would contain the patterns and artifacts needed to model any of the real-world systems, like the synchronicity of process execution or the memory garbage collection of virtual runtime environments, small world-models that I would argue are necessary for reasoning through coding tasks. However, I can see how this kind of approach could be used in other problem spaces to create large datasets of tasks and written-down reasoning about how they are performed correctly.

Blake Chambers

2024-09-02 15:50:37 +0000 UTC

Current multimodal models are trained on relatively low-resolution image patches that have been tokenized, and I think this hinders an LLM from seeing the images as clearly as it would need to in order to reason like it does with text. Additionally, as far as I know, no multimodal LLM yet truly has time awareness, so even if they train on frames of a video I don't think we yet have visual cause and effect as a concept in LLMs (but possibly in diffusion models, especially if you look at the recent DOOM engine, which could repurpose a Stable Diffusion model to simulate the game). We will probably see a lot more development in multimodal networks still, because it seems like the current models are just barely scraping by. Heck, even the way tokens work is too low resolution in some regards (hence the "how many 'r's in 'strawberry'" meme). And I think this could dramatically change the amount of intelligence you can squeeze out of even the current amount of compute. And it can be shown that architectural improvements do have this effect, e.g. compare GPT-3 with Llama 3.1 8B. Hopefully architectures like LeCun's JEPA, which can reason more in latent space, including with video, will eventually unlock a lot more trainability on multimodal data. If it works, I don't think data labeling will be too important (just as it isn't for humans), though maybe data consistency will be (so for the example of watching cartoons, maybe they should run through a 3D renderer pass that shows the cartoon on a screen "in the real world" so that all data the LLM receives is based in reality, just like it is for humans).

Blixt

2024-09-02 07:25:34 +0000 UTC

One of your best videos. So many insights from it

Leo G

2024-09-02 05:59:56 +0000 UTC

Amazing video as always Philip! Regarding the power constraints - I'd be curious to hear your views on "thermodynamic" computing (Extropic) and "liquid" neural networks (Liquid.ai) as a part of the solution - as they claim 100-1000x efficiency gains. However, there may simply not be enough information released from these guys to put together a video/analysis on it.

James Patton

2024-09-02 01:37:18 +0000 UTC

Very interesting stuff. At the very least the next generation of LLMs will help researchers digest the tons of papers coming out each year, even if they only end up being great at regurgitation.

Mike D

2024-09-01 23:54:44 +0000 UTC

This is what I think every time I hear "diminishing returns" or that OpenAI is losing its lead. I don't think anyone can answer that until we've seen GPT-5 in action.

Mike D

2024-09-01 23:34:38 +0000 UTC

As always, awesome video Philip!

Juanjo do Olmo

2024-09-01 23:09:14 +0000 UTC

It seems like so many decisions hinge upon how much GPT-5 wows people. This is a "make or break" moment for the trajectory of AI.

David Shapiro

2024-09-01 21:50:49 +0000 UTC

It's always an exciting email to get, when one of these videos is ready!

Simon Sturmer

2024-09-01 21:08:05 +0000 UTC
