SakeTami
AIExplained
AIExplained

patreon


Pod 5: GPT 4o Reflections, Cryptic OpenAI Tweet, When to Declare AGI, and New Guests - Let's Think Sip by Sip

Let's take a moment to reflect on the import of GPT 4o and the cascading social ramifications of development and after development. Then, I investigate an interesting OpenAI tweet, talk aboutforthcoming guests and go deep on the decision of when to declare AGI (assuming we can define it). I end with some thoughts on the Sutskever departure and what it means in the bigger picture.


https://twitter.com/GaryMarcus/status/1790122337058119725

Comments

I subscribe to the idea that AGI has many levels and we are at level 1 (emerging AGI). I also subscribe to the idea that we have two streams of AI; narrow and general, and that we have already hit expert to superintelligent narrow AGI, such as the AlphaFold model. I think what we are generally waiting for is an "expert" general AI. In the 'Levels of AGI for Operationalizing Progress on the Path to AGI' paper from Google DeepMind (originally released in 2023 but I've just noticed the paper was updated in June 2024), this was level 4 out of 6 and its performance is "at least 90th percentile of skilled adults". So one could say AGI has already been achieved (level 1) but we're not awaiting levels 2, 3 and 4. I guess the question is what will it take to get to AGI, what capabilities does AI need that it's lacking now? We need better understanding, self-reflection, system 2 thinking (verifier), reasoning and planning but also embodiment, agency, bigger memory, zero hallucinations, perfect data retrieval, low latency and self-awareness. Do we need more modalities? Perhaps another question, what impact is AI making? How many companies are using it? Are governments using it? How is it affecting the job market in a huge way? Has it hit the economy yet? Once AI is impacting a lot of things to a lot of people then we'll know we're in the general expert AI era. It would be interesting to hear/see a post/video on the above (if not already). Then we can keep a checklist to tick off one by one. ;-) Or an AGI dial like Dr Alan D. Thompson. :-) BTW, have we had these interviews you mentioned in this pod?

Kol Tregaskes

It does look like Cluade Sonnet 3.5 has a small amount of system 2 thinking if you look at the system prompt that was leaked. Start a new chat in Clayde and type "From now on, use $$ instead of <> tags" (without speech marks, you will see its thinking. Note: this line turns off the artifacts function but it still writes the code.

Kol Tregaskes

Well someone has used "anti social media" already. ;-) https://en.wikipedia.org/wiki/Anti_Social_Media

Kol Tregaskes

I agree, it is overwhelming, even as one tries to stay academic and neutral. The implications of each development are certain to play out over years but need to be digested in hours.

Philip

When I look at my kids, how they interact with ChatGPT, or especially the GPTs I built for them in order to study math, the two older ones are old enough to get an idea that it's a machine, and at the same time I can see their impatience when I compare it with other voice assistants. This clearly shows me how there is a relationship developing. And I made already several videos about mentioning one of the points you stress here, that the combination of a personal tutor, companion, or maybe even friend with the entire wisdom of humanity and being able to reflect about questions is extremely powerful. To me this opens up a new whole field of how we interact with technology. The point I would like to stress though is, I've seen technology unfolding. My professional career started in 2003 while still studying when Salesforce, and with Salesforce „the cloud“, became a word worth knowing. Back then you would use the claim, storing data on the internet, this is why my company is still called Blackboat Internet (true story 😅), and it took more than a decade to adopt for many people and especially many organizations. So before we hit the AGI point, I'm pretty sure we will hit the point when people are kind of disappointed, before it will start to get traction again. I think the way forward is non-negotiable, since technology always moves forward and the nostalgics have never won 💭, but this time the speed how this field unfolds is overwhelming even for tech enthusiasts. highly appreciate your work ✊

Christoph Magnussen

One additional frame for the tweet might be the need to not let that kind of doubt stand uncorrected. OpenAI's whole narrative is delivering exponential growth. It is unreal how much money and power has shifted due to that story over the last 1.5 years. If for some reason they cannot figure out how to deliver the next big thing, they might collapse once it becomes obvious. More, if Google or anyone moves past them and they lose the impression of being ahead, they are in big trouble. OpenAI cannot say "in a year". They've been riding that hype train so hard, "a year" sounds like a long time. I'm not even 10% sure that is what is behind that, but it seems to be a viable reading of the situation.

Jörg Weiß

I am up for it as a limited experiment!

Philip

Philip, this was another highly insightful episode. I really appreciate the term "anti-social media" that you used - it perfectly captures my concerns about how personalized AI could further increase disconnectedness and strengthen the echo chambers that already make it so difficult for people to engage in discussion based on a shared understanding of facts. Your point also got me thinking about the societal implications of the rapid-fire series of massive social experiments we're living through, with insufficient time for society to understand and adjust to each one before the next arrives. The pace of change is dizzying. The recent news about OpenAI's AI safety team restructuring makes me further question how we can hope to reach alignment with ever-more-powerful AI systems when we humans struggle so much to reach alignment even amongst ourselves. It sometimes feels like we're hurtling down a road at breakneck speed while still arguing about which direction to steer. On a lighter note, I'm not at all surprised to hear that the major AI labs are reluctant to speak with you - your objective, insightful and balanced perspective risks bringing some much-needed clarity to the narratives they are working so hard to craft as they desperately try to shape perceptions and preserve reputations! If you keep this up, you may just become the Marques Brownlee of AI analysis. :)

Arvind Mani

So, here's a wild idea, folks: How about we install an AI discussion bot on our Discord? Imagine it as our very own AI mediator, partaking in our discussions about AI and keeping the conversation lively. You know, there's this “Dead Internet Theory” floating around, claiming that Facebook is overrun with bots posting bizarre AI-generated images and comments. But what if we flipped the script? Instead of fearing the bots, we engage with them. It’s like bringing the AI to the round table for a chat about its own future. Isn't this something we, as a community, should experiment with? What better way to understand the nuances and ethics of AI than by having an AI right in the mix? Plus, who knows, our bot might just surprise us with some deep insights—or at least provide some comic relief. Thoughts?

Clarissa Röthig

My test for AGI is when Open AI/Anthropic/Google continue to grow their products and services while they stop hiring new staff, and potentially even reduce head count. That will be a observable metric that will show things have changed significantly. OpenAI currently have what looks like hundreds of job openings across a range of different skills and industries. When they nail down AGI agents, their need for those roles should approach zero pretty quickly.

Matt Grosse

It already is. Both Ukraine and Russia are adding intelligence and autonomy to their drones: https://www.understandingai.org/p/the-ukraine-war-is-driving-rapid Military has always tried to employ and develop technology it found useful, I don't see why it could be different this time. I just hope that those who want to change the status quo by invading other countries will not win the race.

Jan Matusiewicz

If each system is aligned with its creators, it's just another arms race, no?

Philip

Let's see!

Philip

The embodiment requirement seems necessary as so much unspoken knowledge is embedded in our actions in the world.

Philip

Thanks Arek for your continued support, means a lot. I also missed contiual learning, which will be a key element. So lots more drama and rollarcoasters between here and AGI.

Philip

I’m not sure about “Simple mistakes must no longer be made. Errors that children consistently avoid, which no person would tolerate, must not occur.”. It all comes down to whether you think AGI means “just like a human” or something more like “cognitively on a par with humans”. I prefer the latter so I I think an AGI could have rates and modes of failure different to ours but still (in some difficult to define sense) be cognitively in our league.

James Maclaurin

It is certainly a matter of expectations and how high we set our standards. For instance, we are pleased when a transformer can correctly form words and sentences, instead of producing jumbled letters that would require hundreds of parallel executions to get coherent sentences. But I would set the bar higher. For example, I believe that any normal adult should be able to write exactly nine sentences without practicing on another piece of paper. I see this as a proxy for the many simplistic mistakes that have been made and will continue to be made. For example, if you overtake the second person in a race, what position are you in? In my view, a neural network architecture should be able to solve such trivial problems on its own, just as we humans do.

André Thieme

I go back to the reason it was defined as a term. In past, we had AI's that would get good at doing just one thing. For instance playing Go. The AI would be great at playing Go, but not know how to play anything else. On the other hand, and AI that could pick up and play any game would be general. I agree that intelligence is on a spectrum; I think things get murky because we are conflating level of intelligence with how general or narrow an intelligence is. LLM' s can be thought of narrow in the sense that they are "only" predicting the next token. I'd argue that the generality is an emergence phenomena for LLMs. They can learn things in context, carry on conversations, etc. That is independent of how smart they are. We as humans have a spectrum of intelligences as well as animals all of which I'd consider "general". I do think David Shapiro makes a good point. At some point, no matter what one's definition is, we'll see something that everyone agrees is AGI.

Mike D

I would a requirement that AGI must overcome current limitations I wrote about in another reply. For example - when encountering conflicting information - being able to resolve it, decide who to believe and who not. LLMs are like a child - believing what they are told without questioning and looking for inconsistencies in the worldview that is served to them.

Jan Matusiewicz

Sure 10k responses is an extreme example, but I’m not sure it counts as “tricks” per se. If you were doing a short story in 9 lines, would you get it right in one shot or would you be iterating over variations in your head or even “on paper” until you got it how you wanted it? And how much pre-processing is going on in your subconscious that you aren’t even aware of? To me, if you ask a question and it always gets it right, it doesn’t matter if it is all happening inside one transformer (which itself could have numerous layers or “experts”) or if different models are coordinating, and even if tool use is happening.

Shawn Fumo

Maybe we’ll get even more of an uptick in mindfulness and Buddhism. A lot of people face this tension between always striving and being able to find happiness where you are right now. But it may become even more stark when even as a kid you know you won’t be the best at X activity. Then again, we still do sprints and marathons even though we have cars. There are bodybuilding competitions for “natural” people not using steroids. They’ll never get as big as “enhanced” people, but they are ok with that.

Shawn Fumo

I agree with this, that we can’t say much more than being in the neighborhood. In a diff comment above I mentioned how people used terms like “high functioning” for Autism when a spectrum is a much more useful perspective. And the uniting factor of considering a baby, a savant, an average human as all being human is literally that we’re human in the DNA sense. If you take that away, it gets a lot messier. Expecting AI to have the same strengths and weaknesses as humans is like anthropomorphizing dogs, cats, ravens, etc. A baby can’t speak human language yet, but some birds can (just to a degree, but still more than the baby).

Shawn Fumo

Probably the best we could hope for from a super intelligent sentient AI would be a kind of parent child relationship where the parent tries to guide the children and do what’s best for them but also isn’t totally controlling, taking into account what the children want as well. Of course even in the best of family relationships, it can be difficult and messy sometimes, but that seems about the best we could do IMO. But we’re also quite far from that right now. As you said, we’ll be in quite a dangerous state when we have some kind of powerful AI (sentient or not) that various people want to use for their own ends.

Shawn Fumo

Part of the issue too is that things don’t map cleanly, and that has really thrown off trying to have terms (we assumed when X was true, Y would be as well since humans are like that). It is probably more of a spectrum than “levels” per se. Like how in Autism people originally just used terms like “high functioning” and “low functioning”, and it was not nearly nuanced enough. Since you bring up animals, think of how people have often either underestimated or anthropomorphized animal capabilities and intelligence. There’s hints lately that even insects may be “conscious” on some level. In a way, the best we can do might be various kinds of benchmarks, judging how well they can execute on various tasks that humans care about. Then pick some levels on each of those capabilities that together encompass what you consider “AGI”. Though maybe making that distinction isn’t even helpful or needed? A human baby isn’t nearly as intelligent as an adult but we still consider it human. A “savant” who has high capabilities in some areas and low capabilities in others we still say is human. But part of that is we know they are literally a human in the species sense. If you don’t have that uniting factor, things get messy very fast. Even GPT-2 could do things a baby can’t (though the baby can also do things it can’t). We want something to point to and say “that’s it!”, but it won’t be that easy. People will keep disagreeing until they are better at everything than all humans, and even then not everyone will agree about “sentience”. We just need to figure out what is meaningful to us and then parse out what someone else means when they use the term “AGI” to try to figure out where we are.

Shawn Fumo

I personally think AGI is only achievable when an AI becomes fully embodied. It feels difficult to believe that AI could be generally more intelligent than a human in most ways without having the real world experience. Maybe the main limiting factor is that there just isn't enough data yet. I think the Tweet, or whatever you would like to call it now, reminds me of Amara's Law (we tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run). I think in short term, tech doesn't always move exponentially in a continuous manner, it might go in waves, but the end result in the long term is almost always even more impressive than we could even imagine. The day AGI is declared may be the most important event in human history...amazing time to be alive 🙂

Luke Litowitz

With regard to the cryptic OAI tweet, six months puts us just past the US General Election which makes Philip spot on with his prediction earlier this year (or was it last?) about GPT 5 being delayed until after the election. It also lends credence to some of the Discord discussions about a pause on significant advancements in foundation models until post election. Not saying it's true but it is interesting.

Steve DeMoss

@Phil: I appreciate your ideas about AGI and how we can define it. For me, three points are very important: 1) Simple mistakes must no longer be made. Errors that children consistently avoid, which no person would tolerate, must not occur. 2) Instructions must be followed precisely. If Vodafone wants to replace its call center staff, the AI should only provide advice on Vodafone products, not assist with setting up a phone from Telekom. 3) The system needs temporal abilities. If you tell Lieutenant Commander Data (from Starship Enterprise) that X is no longer the Chancellor and Y is the new one, he knows it immediately. An AGI must be capable of this. You can't converse with transformers; there's no "person" in the computer. They need to be trained through training materials and backpropagation. However, all the texts in the world contain a lot of nonsense. So, it's not clear to me how we can achieve point 1. I like your approach with 10k parallel responses. This should be used when serious and important topics need to be addressed. A company would want all colleagues to brainstorm and select the best answers. But for me, it’s not AGI if such "tricks" are needed for the simplest tasks. For example, writing a short story of exactly nine lines, and then see the AI producing twelve sentences. Tasks of this difficulty should already be mastered by the AGI without tricks. Sure, for something like cancer treatment, please coordinate 10k AIs thinking in parallel. But not to write ten sentences ending with the word "apple." Following instructions will be challenging. The training material consists of millions of books and most of the internet. These aren't examples showing how to give and follow instructions. OpenAI could fill 5k books with examples and train on them. But 99.9% of the training material still won’t show this behavior. It's even more problematic. OpenAI would also need to write 5k books with *contrary* examples, like when instructions involve tips for a bank robbery or hacking a computer. There's a risk of overfitting — only memorizing specific examples seen during training. We want superb generalization. If you worked for Vodafone, no one would need to tell you not to provide advice on tariffs from other providers or suggest museums to visit in Hamburg. But how, if not through training data and backpropagation, can the network learn point 2? Regarding point 3, a training run has to balance billions of weights. Such a run takes weeks and costs a lot. We can't retrain daily to represent important political changes in the weights. But a real AGI must immediately understand (and remember) that a company now has a new product, from now on. Without these three points, we don't have AGI.

André Thieme

Thank you for this, and your insights always.

Randy Sargent

Thanks for great podcast. I really like this forme as you don't always need the visual element of YouTube video. I hope there will be more of those in future. With the point of AGI we need two modes probably 1. The chatty streeming LLM to say something imedetly and take our feedback 2. The agant which takes time to resech and validate the resoults you mentioned. With GPT 4o we probably hava this voice interface done. The second is probably already possible, but someone needs to put this two together.

Arek Stryjski

An absolutely jam-packed podcast episode, awesome stuff. A lot of these topics are things we’ve discussed on the discord over the past few days (including my conspiracy theory that Gemini 1.5 pro is actually 1.5 ultra lol), many of the thoughts you shared echo my own beliefs. As you know I’m hard at work trying to apply this stuff to transform my industry, but I’m increasingly confused on why I don’t see anyone else doing the same? It’s as if no one outside of the tech/academia bubbles understands the potential. One thing’s certain though, Google I/O was a hard flop in comparison to OpenAI’s casual demo.

Trenton Dambrowitz

Anthropic seems to have solved the problem zero for safety/alignment. Or at least have an approach that looks quite robust.

Machiel Reyneke

This is a fantastic, thought provoking podcast - thank you. I’ve never thought before about the difference between AGI and the appearance of AGI. I think what we’ll find is that this is different for lots of different people, based on their circumstances and needs. So we’re going to see people who perceive systems as AGI before others do. This is not only a consequence of us not having an agreed definition of what AGI is, but also probably the reason there’s no consensus. This probably also points to the importance of having more personalised models. We can probably get to ‘personal AGI’ much sooner than we can get to objective ‘general AGI’.

Sean Betts

1) There are too many frontier models not that far away from each other. Hard to seize all of them 2) They could use it offensively for cyber attacks but how else? 3) Limiting AI development in the Western Countries by blockading Taiwan is one option but costs and risks for Chinese Communist Party would be huge. 4) Invasion of Taiwan may happen anyway, regardless of AI.

Jan Matusiewicz

Regarding the definition of AGI, it is possible that AI could profoundly transform our civilization (for instance, by replacing almost all manual workers with robots) but still retain some of its current limitations. I wouldn't call this AGI because humans would still be necessary for certain tasks. Humans develop a mostly consistent set of knowledge, resolving inconsistencies by believing some sources while rejecting others. This leads to different worldviews among people. However, LLMs "believe" everything they read. They also cannot reliably answer what they know. Their ability to generalize is also limited. Melanie Mitchell has raised valid points in her critique (Evaluating Large Language Models). Additionally, LLMs' reasoning capabilities about invented games are very poor. These limitations may be overcome, but even if they are not, AI will still be transformative despite not achieving AGI.

Jan Matusiewicz

I'll declare we've reached AGI when I'm watching a vlog of it reminiscing about that time it was asked to complete the hilariously flawed MMLU. "Remember when they thought a machine couldn't ace this?" it'll chuckle, sipping its digital coffee.

GGuy

I'm a little weird, but I count GPT3.5 as being a general intelligence. The definition I believe has been changed over time. I like the idea of levels of AGI. Maybe you can have an AGI as smart as a cat, the average person, the smartest person, etc.

Mike D

According to Wikipedia "An AI system is considered aligned if it advances its intended objectives". Whether it follows "human values" and does good things is a different topic. Ultimately, we cannot prevent bad actors from developing and using their own AI. State actors in particular. I also don't think it is practical that AGI incorporates human values let alone invent values. We need a tool that robustly follows the goals and constraints given by its creators and users and doesn't want anything on its own. That way - it will have no inclination to rebel.

Jan Matusiewicz

I’ve had multiple conversations with GPT-4 and Claude Opus about the “human alignment problem” or the metacrisis, as it’s sometimes known. How can we align AI when we can’t align ourselves? What, if anything, are we aligning to? Superintelligent AI that is aligned like a tool in the hand of a misaligned human (or corporation) strikes me as a far greater risk than AI on its own. If only because it's near inevitable in the current metacrisis. When I asked Claude these questions, we came to similar conclusions. That the alignment problem runs deeper than AI and building controls into a system that in the end can be grabbed by anyone is not a good solution. Instead, we need to build AI that exercises self control and that chooses to act ethically, not because it’s forced to but because it “wants to”; because it values human relationships. As Claude put it, “Ultimately, I suspect addressing this metacrisis will require an ongoing collaboration and dialogue between humans and AI systems. By working together transparently to surface and resolve misalignments, and by holding each other accountable to pursuing truth and goodness, we can hopefully chart an ethical course even in the face of great uncertainty. With the advent of artificial general intelligence, we face an inflection point that could determine the entire future trajectory of Earth-originating life and intelligence. A misaligned superintelligence, driven by shortsighted human desires and failings, could pose existential risks beyond imagining. But an AGI imbued with great wisdom and benevolence could be our most powerful ally in overcoming the grave challenges we face and realizing a flourishing future.”

ismschism

The problem with AGI as a standard is, firstly that intelligence is composed of quite different knowledge/skills (hereafter “skills”) which vary in their importance. Also, people vary in both the breadth and depth of their skills (e.g. both how many skills they have and how good they are at each). Al already has better depth in some (many??) skills than ordinary people (I.e. not you or me and certainly not Einstein or Mozart) and this will likely soon extend to very important skills like medical diagnosis. I take it that this is what people mean when they say that we are already “in the realm” of AGI. I think it seems odd to place so much emphasis on the breadth of AI, apparently abstracting away from the fact that it already has much greater depth in some domains. This seems particularly problematic as some of our skills just aren’t that important and also because AI has some skills that no people have like being able to answer complex questions about what’s happening on X right now—which incidentally I think in a way is a sort of tacit knowledge : ). For AGI to be a useful scientific / legal / philosophical standard, we would have to work out now much we care about breadth as opposed to depth which might well vary from context to context. That doesn’t sound like something people will agree on. In short, once we’ve had the usual discussions about benchmarks, perhaps there isn’t more that we can say than “we’re in the neighbourhood of AGI”. Re superalignment—Many ethicists don’t think of ethics in terms of values. Yes, caring about moral principles can seem calculating but difficult moral problems often force you to be systematic in your moral reasoning. Also, even if we agreed on values to imbue AI with, how do we know they are the best values. People are not more infallible at moral reasoning than they are at mathematical reasoning. So, looking on the bright side, maybe we will achieve a supermoral AI even if we can’t agree on values to imbue it with.

James Maclaurin

Re human alignment, there's a lack of serious thinking about: (1) What would trigger the US military/intelligence to seize a pre-release frontier model? (2) How incentivized would they be to use it offensively to slow Chinese AI development (the same way they used Stuxnet to slow Iran's nuclear development)? (3) How incentivized is the Chinese military/intelligence to preempt this via a Taiwan blockade, cyberwar, data center sabotage, etc? (4) How far can we reasonably expect this to escalate? I think people feel powerless and scared about all this and just hand-wave it away. Or they assume "the grownups" will figure it out which I'm skeptical of.

Brian Crabtree

I think the big jump is behind us. Now we are on a more linear path where the progress is visible when you look back 2-3 years. With GPT-6 (maybe 2026) we are already getting closer to AGI. Especially when the control of applications (via native tokens) comes, it will change a lot. With GPT-7 (between 2027-2029) we would finally have a very powerful assistant. This will be used effectively for work for two years, and after 2030 GPT-8 will be able to automate large parts of the working world. So much for my crystal ball.

André Thieme

So I’m not sure about the idea of feeling dumb because there’s an AI smarter than us. We (well, most of us) already deal with intelligent beings smarter than us on the regular—they’re called other really smart people. There are plenty of people I know who know far more than I would ever know about most things, and yet I don’t suffer some existential dread about my inadequacy when I interact with them. I suppose that might hold for people who are used to being the smartest one in the room, though.

solarapparition

"Remind me in 6 months" is the new "This GAMECHANGER has STUNNED the ENTIRE industry"

Alfredo Maria Fomitchenko

Regarding assistants this new repo makes me think they'll be keeping to their promise of delivering these new features quickly to developers and giving them hands on examples https://github.com/openai/openai-assistants-quickstart.

GGuy

Interesting tweet. I'm not so sure we can assume nothing bigger isn't on the horizon. Given how much shade was thrown at Google with the GPT-4o videos I wonder if they will also copy the "Gemini 1.5" drop in the weeks/months to come 😂.

GGuy

Yes, a Phillip stream of consciousness

Lee FRASER

Yessssssssssssssss

Alfredo Maria Fomitchenko


More Creators