diff --git a/content/post/Latent Space/Latent-Space-High-Agency-Pydantic->-VC-Backed-Frameworks-—-with-Jason-Liu-of-Instructor.lrc b/content/post/Latent Space/Latent-Space-High-Agency-Pydantic->-VC-Backed-Frameworks-—-with-Jason-Liu-of-Instructor.lrc
new file mode 100644
index 0000000..d573d06
--- /dev/null
+++ b/content/post/Latent Space/Latent-Space-High-Agency-Pydantic->-VC-Backed-Frameworks-—-with-Jason-Liu-of-Instructor.lrc
@@ -0,0 +1,971 @@
+[by:whisper.cpp]
+[00:00.00]Hey everyone, welcome to the Late In Space podcast.
+[00:09.30]This is Alessio, partner in CTO & Residence at Decibel Partners.
+[00:13.00]And I'm joined by Makoho's Swicks, founder of SmallAI.
+[00:16.00]Hello, we're back in the remote studio with Jason Liu from Instructor.
+[00:20.00]Welcome, Jason.
+[00:21.00]Hey there, thanks for having me.
+[00:23.00]Jason, you are extremely famous.
+[00:25.00]So I don't know what I'm going to do introducing you.
+[00:28.00]You're one of the Waterloo clan.
+[00:30.00]There's like a small cadre of you that's just completely dominating machine learning.
+[00:34.00]Actually, can you list like Waterloo alums that you're like, you know are just dominating and questioning it right now.
+[00:39.00]So like, John from like RaiSana is doing his inversion models, right?
+[00:45.00]I know like Clive Chen from Waterloo.
+[00:48.00]When I started the data science club, he was one of the guys we're like joining in and just like hanging out in the room.
+[00:52.00]And now he was at Tesla working with Carpathian, now he's at OpenAI.
+[00:56.00]Yeah, he's in my climbing club.
+[00:58.00]Oh hell yeah.
+[00:59.00]Yeah, I haven't seen him in like six years now.
+[01:01.00]To get in the social scene in San Francisco, you have to climb.
+[01:04.00]So both in career and in rocks.
+[01:07.00]So you started the data science club in Waterloo.
+[01:09.00]We can talk about that.
+[01:10.00]But then also spent five years at Stitchfix as an MLE.
+[01:13.00]You pioneered the use of OpenAI's LLMs to increase stylish efficiency.
+[01:17.00]So you must have been like a very, very early user.
+[01:19.00]This was like pretty early on.
+[01:21.00]Yeah, I mean this was like GPT-3.
+[01:24.00]Okay, so we actually were using transformers at Stitchfix before the GPT-3 model.
+[01:29.00]So we were just using transformers recommendation systems.
+[01:31.00]At that time, I was very skeptical of transformers.
+[01:34.00]I was like, why do we need all this infrastructure?
+[01:36.00]We can just use like matrix factorization.
+[01:38.00]When GPT-2 came out, I fine-tuned my own GPT-2 to write like rap lyrics.
+[01:42.00]And I was like, okay, this is cute.
+[01:43.00]Okay, I got to go back to my real job, right?
+[01:45.00]Like, who cares if I can write a rap lyric?
+[01:47.00]When GPT-3 came out, again, I was very much like,
+[01:50.00]why are we using like a post request to review every comment a person leaves?
+[01:54.00]Like, we can just use classical models.
+[01:56.00]So I was very against language models for like the longest time.
+[01:59.00]And then when chat GPT came out,
+[02:01.00]I basically just wrote a long apology letter to everyone at the company.
+[02:04.00]I was like, hey guys, you know, I was very dismissive of some of the technology.
+[02:07.00]I didn't think you would scale well and I am wrong.
+[02:10.00]This is incredible.
+[02:11.00]And I immediately just transitioned to go from computer vision,
+[02:14.00]recommendation systems to LLMs.
+[02:16.00]But funny enough, now that we have RAG,
+[02:18.00]we're kind of going back to recommendation systems.
+[02:21.00]Speaking of that, I think Alessio is going to bring up the next one.
+[02:23.00]Yeah, I was going to say we had Brian Bishop from X on the podcast to overlap Stitch Fix.
+[02:28.00]Yeah, he was like one of my main users of the recommendation framework
+[02:31.00]that I had built out at Stitch Fix.
+[02:33.00]Yeah, we talked a lot about REXIS.
+[02:35.00]So it makes sense.
+[02:36.00]So now I have adopted that line, RAG is REXIS.
+[02:39.00]And you know, if you're trying to reinvent new concepts,
+[02:42.00]you should study REXIS first,
+[02:43.00]because you're going to independently reinvent a lot of concepts.
+[02:45.00]So your system was called flight.
+[02:47.00]It's a recommendation framework with over 80% adoption
+[02:50.00]servicing 350 million requests every day.
+[02:52.00]Wasn't there something existing at Stitch Fix?
+[02:54.00]Why did you have to write one from scratch?
+[02:56.00]No, so I think because at Stitch Fix,
+[02:59.00]a lot of the machine learning engineers and data scientists
+[03:01.00] were writing production code.
+[03:03.00]So every team's systems were very bespoke.
+[03:06.00]It's like this team only needs to do like real-time recommendations
+[03:09.00]with small data.
+[03:10.00]So they just have like a fast API app with some like pandas code.
+[03:13.00]This other team has to do a lot more data.
+[03:15.00]So they have some kind of like spark job that does some batch ETL
+[03:18.00]that does a recommendation,right?
+[03:20.00]And so what happens is each team writes their code differently.
+[03:23.00]And I have to come in and like refactor their code.
+[03:25.00]And I was like, oh man,I'm refactoring four different code bases
+[03:28.00]four different times.
+[03:29.00]Wouldn't it be better if all the code quality was my fault?
+[03:32.00]Alright,let me just write this framework for everyone else to use it.
+[03:35.00]And now one person can maintain five different systems
+[03:38.00]rather than five teams having their own bespoke system.
+[03:41.00]And so it was really a need of just sort of standardizing everything.
+[03:44.00]And then once you do that,you can do observability
+[03:47.00]across the entire pipeline and make large sweeping improvements
+[03:50.00]in this infrastructure,right?
+[03:52.00]If we notice that something is slow,we can detect it on the operator layer.
+[03:56.00]Just hey,hey,like this team,you guys are doing this operation
+[03:59.00]it's lowering our latency by like 30%.
+[04:01.00]If you just optimize your python code here,we can probably
+[04:05.00]make an extra million dollars.So let's jump on a call
+[04:07.00]and forget this out.And then a lot of it was doing
+[04:09.00]all this observability work to figure out what the heck is going on
+[04:12.00]and how to optimize this system from not only just a code perspective.
+[04:15.00]So like harassingly Oregon saying like we need to add cash in here.
+[04:18.00]We're doing duplicated work here.Let's go clean up the systems.
+[04:21.00]Yeah,got it.One more system that I'm interested in finding out more about
+[04:25.00]is your similarity search system using Clip and GPT-3
+[04:29.00]embedding in FICE,where you said over $50 million in annual revenue.
+[04:33.00]So of course they all gave all that to you,right?
+[04:35.00]No,no,no.I mean,it's not going up and down,but you know,
+[04:38.00]I got a little bit,so I'm pretty happy about that.
+[04:40.00]But there,you know,that was when we were doing fine tuning like
+[04:44.00]resnets to do image classification.
+[04:46.00]And so a lot of it was given an image if we could predict
+[04:50.00]the different attributes we have in the merchandising
+[04:52.00]and we can predict the index embeddings of the comments
+[04:55.00]then we can kind of build a image vector or image embedding
+[04:59.00]that can capture both descriptions of the clothing and sales of the clothing.
+[05:03.00]And then we would use these additional vectors
+[05:05.00]to augment our recommendation system.
+[05:07.00]And so with the recommendation system really was just around
+[05:10.00]like what are similar items,what are complementary items,
+[05:12.00]what are items that you would wear and a single outfit
+[05:15.00]and being able to say on a product page,let me show you
+[05:18.00]like 15,20 more things.
+[05:20.00]And then what we found was like,hey,when you turn that on
+[05:22.00]you make a bunch of money.
+[05:23.00]Yeah,so okay,so you didn't actually use GPT-3embeddings
+[05:26.00]you fine tuned your own,because I was surprised
+[05:28.00]that GPT-3 worked off the shelf.
+[05:30.00]Okay,because I mean,at this point we would have
+[05:32.00]3 million pieces of inventory over like a billion interactions
+[05:35.00]and users and clothes
+[05:37.00]any kind of fine-taining would definitely up the form
+[05:39.00]like some off the shelf model.
+[05:41.00]Cool,I'm about to move on from Stitch Fix
+[05:43.00]but,you know,any other like fun stories from the Stitch Fix
+[05:45.00]that you want to cover?
+[05:46.00]No,I think that's basically it.
+[05:48.00]I mean,the biggest one really was the fact that
+[05:50.00]I think for just four years I was so bearish on language models
+[05:53.00]and just NLP in general,I was just like,none of this really works.
+[05:55.00]Like,why would I spend time focusing on this?
+[05:57.00]I gotta go do the things that makes money.
+[05:59.00]Recommendations,bounding boxes,image customization.
+[06:02.00]Yeah,now I'm like prompting an image model.
+[06:04.00]Oh,man,I was wrong.
+[06:05.00]So,my Stitch Fix question would be,you know,
+[06:08.00]I think you have a bit of a drip and I don't.
+[06:10.00]You know,my primary wardrobe is free start-up
+[06:12.00]conference t-shirts.
+[06:14.00]Should more technology brothers be using Stitch Fix?
+[06:17.00]What's your fashion advice?
+[06:20.00]Oh,man,I mean,I'm not a user of Stitch Fix,right?
+[06:23.00]It's like,I enjoy going out and like touching things
+[06:27.00]and putting things on and trying them on,right?
+[06:29.00]I think Stitch Fix is a place where you kind of go
+[06:31.00]because you want the work offloaded.
+[06:33.00]I really love the clothing I buy
+[06:35.00]where I have to like,when I land in Japan
+[06:37.00]I'm doing like a 45-minute walk
+[06:39.00]up a giant hill to find this weird denim shop.
+[06:42.00]That's the stuff that really excites me.
+[06:44.00]But,I think the bigger thing that really captures
+[06:46.00]is this idea that narrative matters a lot
+[06:48.00]to human beings,okay?
+[06:50.00]And I think the recommendation system,that's really hard to capture.
+[06:53.00]It's easy to use AI to sell like a $20 shirt,
+[06:56.00]but it's really hard for AI to sell like a $500 shirt.
+[06:59.00]But people are buying $500 shirts,you know what I mean?
+[07:01.00]Like,there's definitely something that we can't really capture
+[07:04.00]just yet that we probably will figure out how to
+[07:07.00]in the future.
+[07:08.00]Well,he'll probably output in JSON,which is
+[07:10.00]what you're going to turn to next.
+[07:12.00]Then you went on a sabbatical to South Park Commons
+[07:14.00]in New York,which is unusual
+[07:16.00]because it's based on NSF.
+[07:18.00]Yeah,so,basically in 2020,really,
+[07:20.00]I was enjoying working a lot
+[07:22.00]and so I was like building a lot of stuff.
+[07:24.00]This is where we were making like the tens of millions of dollars
+[07:26.00]doing stuffand then I had a hand injury
+[07:28.00]and so I really couldn't code anymore
+[07:30.00]for a year or two years.
+[07:32.00]And so I kind of took sort of half of it as medical leave.
+[07:34.00]The other half I became more of like a tech lead
+[07:36.00]just like making for the systems or like lights were on.
+[07:39.00]And then when I went to New York,
+[07:41.00]I spent some time there and kind of just like wound down
+[07:44.00]the tech work,you know,did some pottery,did some jiu-jitsu
+[07:47.00]and after GBD came out,I was like,
+[07:49.00]Oh,I clearly need to figure out what is going on here
+[07:52.00]because something feels very magical.
+[07:54.00]I don't understand it.
+[07:56.00]So I spent basically like five months just prompting
+[07:58.00]and playing around with stuff.
+[07:59.00]And then afterwards it was just my starter friends
+[08:01.00]going like,Hey Jason,you know,
+[08:03.00]my investors want us to have an AI strategy.
+[08:05.00]Can you help us out?
+[08:06.00]And it's just snowballed and born more
+[08:08.00]and become until I was making this my full-time job.
+[08:10.00]Yeah,got it.
+[08:11.00]You know,you had YouTube University
+[08:13.00]and a journaling app,you know,a bunch of other explorations,
+[08:16.00]but it seems like the most productive
+[08:18.00]or the best-known thing that came out of your time
+[08:20.00]there was Instructor.
+[08:21.00]Yeah,written on the bullet train in Japan.
+[08:23.00]Tell us the origin story.
+[08:24.00]Yeah,I mean,I think at some point,
+[08:27.00]you know,tools like guardrails and Marvin came out,
+[08:29.00]right?
+[08:30.00]Those are kind of tools that like use XML
+[08:32.00]and Python to get structure data out,
+[08:33.00]but they really were doing things
+[08:35.00]sort of in the prompt and these were built
+[08:37.00]with sort of the instruct models in mind.
+[08:39.00]Like,I'd already done that in the past,right?
+[08:41.00]Stitchfix,you know,one of the things we did was
+[08:43.00]we wouldn't take every crest note
+[08:45.00]and turn that into a JSON object
+[08:47.00]that we would use to send to our search engine,right?
+[08:49.00]So if you said like,I want to,you know,
+[08:51.00]skinny jeans that were this size,
+[08:53.00]that would turn into JSON that we would send
+[08:55.00]to our internal search APIs.
+[08:56.00]But it always felt kind of gross.
+[08:58.00]A lot of it is just like you read the JSON,
+[09:00.00]you like parse it,
+[09:01.00]you make sure the names are strings
+[09:02.00]and ages are numbers
+[09:03.00]and you do all this messy stuff.
+[09:04.00]But when function calling came out,
+[09:06.00]it was very much sort of a new way of doing things,right?
+[09:09.00]Function calling lets you define the schema
+[09:11.00]separate from the data and the instructions.
+[09:13.00]And what this meant was
+[09:15.00]you can kind of have a lot more complex schemas
+[09:17.00]and just map them in pidantic
+[09:19.00]and then you can just keep those very separate.
+[09:21.00]And then once you add like methods,
+[09:22.00]you can add validators and all that kind of stuff.
+[09:24.00]The one thing I really had with a lot of these libraries,
+[09:26.00]though,was it was doing a lot of the string formatting themselves,
+[09:29.00]which was fine when it was the instruction tune models,
+[09:32.00]you just have a string.
+[09:33.00]But when you have these new chat models,
+[09:35.00]you have these chat messages,
+[09:37.00]and I just didn't really feel like
+[09:38.00]not being able to access that for the developer
+[09:40.00]was sort of a good benefit that they would get.
+[09:43.00]And so I just said,let me write like the most
+[09:45.00]simple SDK around the OpenAI SDK,
+[09:48.00]so simple wrapper on the SDK,
+[09:50.00]just handle the response model a bit
+[09:52.00]and kind of think of myself more like requests
+[09:55.00]than an actual framework that people can use.
+[09:57.00]And so the girl's like,
+[09:58.00]Hey,like this is something that you can use
+[09:59.00]to build your own framework.
+[10:00.00]But let me just do all the boring stuff
+[10:02.00]that nobody really wants to do.
+[10:03.00]People want to build their own frameworks,
+[10:05.00]but people don't want to build like JSON parsing.
+[10:08.00]And the retrying and all that other stuff.
+[10:10.00]Yeah,right.
+[10:11.00]We had this little build this discussion before the show,
+[10:13.00]but like that design principle
+[10:14.00]of going forward being requests
+[10:16.00]rather than being Django.
+[10:17.00]Yeah.
+[10:18.00]What inspires you there?
+[10:19.00]This has come from a lot of prior pain.
+[10:21.00]Are there other open source projects
+[10:23.00]that inspired your philosophy here?
+[10:25.00]Yeah,I mean,I think it would be requests,right?
+[10:27.00]Like I think it is just the obvious thing you install.
+[10:30.00]If you were going to go
+[10:31.00]make like HTTP requests in Python,
+[10:33.00]you would obviously import requests.
+[10:35.00]Maybe if you want to do more async work,
+[10:37.00]there's like future tools,
+[10:38.00]but you don't really even think about installing it.
+[10:40.00]And when you do install it,
+[10:41.00]you don't think of it as like,
+[10:42.00]this is a requests app,right?
+[10:44.00]Like,no,this is just Python.
+[10:46.00]Like the bigger question is like,
+[10:48.00]a lot of people ask questions like,
+[10:49.00]oh,why isn't requests like in the standard library?
+[10:52.00]That's how I want my library to feel,right?
+[10:54.00]It's like,oh,if you're going to use the LLM SDKs,
+[10:57.00]you're obviously going to install instructor.
+[10:59.00]And then I think the second question would be like,
+[11:01.00]oh,how come instructor doesn't just go into
+[11:03.00]open AI,go into Anthropic?
+[11:05.00]Like,if that's the conversation we're having,
+[11:06.00]like that's where I feel like I've succeeded.
+[11:08.00]Yeah,it's like,yeah,yeah.
+[11:09.00]So standard,you may as well just have in the base libraries.
+[11:12.00]And the shape of the request
+[11:14.00]stayed the same,but initially
+[11:16.00]function calling was maybe
+[11:17.00]equal structure outputs for a lot of people.
+[11:19.00]I think now the models also
+[11:21.00]support like JSON mode and
+[11:23.00]some of these things and,you know,
+[11:25.00]return JSON of my grandma is going to die.
+[11:27.00]All of that stuff is maybe to the side.
+[11:29.00]How have you seen that evolution?
+[11:30.00]Like,maybe what's the meta game today?
+[11:32.00]Like,should people just forget about
+[11:33.00]function calling for structure outputs
+[11:35.00]or where,when is structure output,
+[11:37.00]like JSON mode the best versus not.
+[11:39.00]We'd love to get any thoughts given that you do this every day.
+[11:42.00]Yeah,I would almost say these are like
+[11:44.00]different implementations of like
+[11:46.00]the real thing we care about is the fact that now we have
+[11:48.00]type responses to language models.
+[11:50.00]And because we have the type response,
+[11:52.00]my ID is a little bit happier.
+[11:53.00]I get autocomplete.
+[11:54.00]If I'm using the response wrong,
+[11:56.00]there's a little red squiggly line.
+[11:57.00]Like,those are the things I care about.
+[11:59.00]In terms of whether or not like
+[12:00.00]JSON mode is better,
+[12:01.00]I usually think it's almost worse
+[12:03.00]unless you want to spend less money
+[12:05.00]on like the prompt tokens
+[12:07.00]that the function call represents.
+[12:08.00]Primarily because with JSON mode,
+[12:10.00]you don't actually specify the schema.
+[12:11.00]So sure,like JSON load works,
+[12:13.00]but really I care a lot more than just
+[12:15.00]specify that it is JSON,right?
+[12:17.00]I think function calling gives you a tool to
+[12:19.00]specify the fact like,ok,this is a list
+[12:21.00]of objects that I want and each object
+[12:23.00]has a name or an age and I want the age
+[12:25.00]to be above zero and I want to make sure
+[12:27.00]it's parsed correctly.That's where
+[12:28.00]kind of function calling really shines.
+[12:30.00]I need thoughts on single versus
+[12:32.00]parallel function calling.
+[12:34.00]So I did a presentation at our
+[12:36.00]AI inaction discord channel
+[12:38.00]and obviously Shogays instructor.
+[12:41.00]One of the big things that we have before
+[12:43.00]is single function calling.It's like
+[12:44.00]when you're trying to extract lists,
+[12:46.00]you have to make these funky like properties
+[12:48.00]that are list to then actually return
+[12:50.00]all the objects.How do you see
+[12:52.00]the hack being put on the developer's
+[12:54.00]plate versus like more of this stuff
+[12:56.00]just getting better in the model.
+[12:58.00]And I know you tweeted recently about
+[13:00.00]entropic,for example,you know,some
+[13:02.00]less they're not list or strings and
+[13:03.00]there's like all of these discrepancies.
+[13:05.00]I almost would prefer that there was
+[13:07.00]always a single function call,but
+[13:09.00]obviously there is like the agent's
+[13:10.00]workflows that,you know,instructor
+[13:12.00] can really support that well,but
+[13:14.00]are things that,you know,ot to be done.
+[13:16.00]Like you could define,I think maybe
+[13:18.00]like 50 or 60 different functions
+[13:20.00]in a single API call.
+[13:22.00]And,you know,if it was like get the
+[13:24.00]weather or turn the lights on or do
+[13:26.00]something else,it makes a lot of sense
+[13:28.00]to have theseparallel function calls,but
+[13:30.00]in terms of an extraction workflow,I
+[13:32.00]definitely think it's probably more
+[13:34.00]helpful to have everything be a
+[13:36.00]single schema.Just because you can
+[13:38.00]specify relationships between these
+[13:40.00]single chain of thought before you
+[13:42.00]generate a list of results,like
+[13:44.00]there's like small,like,API
+[13:46.00]differences,right,whereif it's
+[13:48.00]parallel function calling,if you do one,like
+[13:50.00]again,really,I really care about how
+[13:52.00]the SDK looks,and says,okay,do I
+[13:54.00]always return a list of functions,or do
+[13:56.00]you just want to have the actual
+[13:58.00]object back out,and you want to have
+[14:00.00]autocomplete over that object.Interesting.What's
+[14:02.00]kind of the cap for,like,how many
+[14:04.00]function definitions you can put in
+[14:06.00]where it still works well?Do you have
+[14:08.00]anything that doesn't really need to do
+[14:10.00]anything that's more than six or seven
+[14:12.00]different functions?I think in the
+[14:14.00]documentation,they support way more.I
+[14:16.00]don't even know if there's any good
+[14:18.00]evalves that have,you know,over like two
+[14:20.00]dozen function calls.I think if you're
+[14:22.00]riding into issues where you have,like,20
+[14:24.00]or 50 or 60function calls,I think
+[14:26.00]you're much better having those
+[14:28.00]specifications saved in a vector
+[14:30.00]database,and then have them be retrieved,right.So
+[14:32.00]if there are 30tools,like,you should
+[14:34.00]basically be,like,ranking them,and then
+[14:36.00]other than just,like,shubbing,like,60
+[14:38.00]function into a single.Yeah.
+[14:40.00]Well,I mean,so,I think this is
+[14:42.00]relevant now,because previously,I
+[14:44.00]think context limits prevented you
+[14:46.00]from having more than a dozen tools
+[14:48.00]anyway.And now that we
+[14:50.00]have million-token context windows,you
+[14:52.00]know,a cloud recently with their new
+[14:54.00]function calling release said they can
+[14:56.00]handle over 250tools,which is
+[14:58.00]insane to me.That's a lot.You're
+[15:00.00]saying,like,you know,you don't think
+[15:02.00]there's many people doing that.I think
+[15:04.00]a sort of agent-like platform where you
+[15:06.00]have a bunch of connectors,they wouldn't
+[15:08.00]run into that problem.Probably,you're
+[15:10.00]right,that they should use a vector
+[15:12.00]database and kind of rag their tools.I
+[15:14.00]know Zapier has,like,a few thousand,like,8,000,9,000
+[15:16.00]connectors that,you know,obviously don't fit
+[15:18.00]anywhere.So,yeah,I mean,that,I
+[15:20.00]think that would be it,unless you need
+[15:22.00]some kind of intelligence thatchains
+[15:24.00]things together,which is,I think
+[15:26.00]what Alessio is coming back to,right.Like,there's
+[15:28.00]this trend aboutparallelfunction
+[15:30.00]calling.I don't know what I think about
+[15:32.00]multiple tools insequence,but they're not
+[15:34.00]inparallel.I haven't explored this atall.I'm
+[15:36.00]just,like,throwing this open to you.So,like,what
+[15:38.00]doyou think aboutall these newthings.Yeah,it's
+[15:40.00]like,you know,do we assume thatall
+[15:42.00]functioncalls could happen in anyorder?In
+[15:44.00]which case,like,we eithercannassume that
+[15:46.00]or wecannassume that,like,things need to
+[15:48.00]happen in some kind ofsequence as a dag,right.But if
+[15:50.00]it's a dag,really,that's just,like,one json
+[15:52.00]object that is the entire dag,ratherthan
+[15:54.00]going,like,okay,theorder of thefunction
+[15:56.00]thatreturn don't matter.That's definitely
+[15:58.00]just not true inpractice,right.Like,ifI have,I
+[16:00.00] can do something that's,like,turn the lights on,like,unplug
+[16:02.00] the power,then,like,turn the toaster on,or
+[16:04.00] something,like,theorder doesn't matter.And
+[16:06.00]it's unclear how well you can describe
+[16:08.00]the importance of that reasoning to a
+[16:10.00]language model yet.I mean,I'm sure
+[16:12.00]you can do it with,like,good enough prompting.But
+[16:14.00]I just haven't any use case with a function
+[16:16.00]sequence,really matters.Yeah,to me,the most
+[16:18.00]interesting thing is,the models are better
+[16:21.00]atpicking than your ranking is usually.Like,I
+[16:24.00] mean,competing a company around system
+[16:26.00]integration.And,for example,with one system,there
+[16:29.00] are,like,700,80 endpoints.And
+[16:31.00]ifyou actually try and do vector
+[16:33.00]similarity,it's not that good,because
+[16:35.00]the people that wrote the specs didn't
+[16:37.00]have a mind,making them,like,semantically
+[16:39.00] apart.You know,they're kind of like,oh,create
+[16:41.00] this,create this,create this,versus
+[16:43.00]when you give it to a model,like,in Opus
+[16:45.00]you put them all,it's quite good at picking
+[16:47.00]which ones you should actually run.And
+[16:49.00]I'm curious to see if the model providers
+[16:51.00]actually care about some of those
+[16:53.00]worthless,or if the agent company is
+[16:55.00]actually gonna build very good rankers to
+[16:57.00]kind of fill that gap.Yeah,my money is on
+[16:59.00]the rankers,becauseyou can do those so
+[17:01.00]easily,right?You could just say,well,given
+[17:03.00]the embeddings ofmy search query and
+[17:05.00]the embeddings ofthe description,I
+[17:07.00]can just train XGBoost and just make
+[17:09.00]sure that I have very high,like,MRR,which
+[17:11.00]is,like,mean reciprocal rank.And
+[17:13.00]so,the only objective is to makesure
+[17:15.00]that the tools you use are in the top
+[17:17.00]and filtered.Like,that feels super
+[17:19.00]straightforward,and you don't have to
+[17:21.00]actually figure out how to fine tune a
+[17:23.00]language model to do tool selection anymore.Yeah,I
+[17:25.00]amagine you either have,like,less than 3
+[17:27.00]tools or more than a thousand.I
+[17:29.00]don't know what kind of companies
+[17:31.00]that,oh,thank god we only have,like,185
+[17:33.00]tools.And this works perfectly,right?
+[17:35.00]That's right.And before we maybe move on
+[17:37.00]just from this,it was interesting to me
+[17:39.00]you retweeted this thing about entropic
+[17:41.00]function calling,and it was Joshua
+[17:43.00]Brown's retweeting some benchmark that
+[17:45.00]is,like,oh,my god,entropic function
+[17:47.00]calling so good.And then you retweeted
+[17:49.00]and then you tweeted it later,and it's
+[17:51.00]like,it's so good.And
+[17:53.00]what's your flow?How do you
+[17:55.00]actually test these things?Because
+[17:57.00]obviously the benchmarks are lying,right?Because
+[17:59.00]the benchmarks say it's good and you said
+[18:01.00]it's bad and I trust you more than the
+[18:03.00]benchmark.How do you think about
+[18:05.00]that?And then how do you evolve it
+[18:07.00]over time?It's mostly just client data.I
+[18:09.00]actually have been mostly busy with
+[18:11.00]enough client work that I haven't been
+[18:13.00]able to reproduce public benchmarks.And
+[18:15.00]so I can't even share some of the results
+[18:17.00]entropic.But I would just say,like,in
+[18:19.00]production,we have some pretty
+[18:21.00]interesting schemas,where it's like
+[18:23.00]initatively building lists where
+[18:25.00]we're doing,like,updates of lists.Like,we're
+[18:27.00]doing in-place updates.So,like,upserts
+[18:29.00]and inserts.And in those
+[18:31.00]situations,we're like,oh,yeah,we have a bunch
+[18:33.00]of different parsing errors.Numbers are being
+[18:35.00]returned to strings.We were expecting lists
+[18:37.00]of objects,but we're getting strings that
+[18:39.00]are,like,the strings of JSON,right?So we
+[18:41.00]had to call JSON parse on individual
+[18:43.00]elements.Overall,I'm,like,super happy
+[18:45.00]with theentropic models compared
+[18:47.00]to the openAM models.Sonnet is very
+[18:49.00]cost-effective.Hikoo is,in function
+[18:51.00]calling,it's actually better.But I think they just
+[18:53.00]had to sort of file down the edges a little
+[18:55.00] bit,where,like,our tests pass,but
+[18:57.00]then we actually deployed a production,we get,you
+[18:59.00] know,half a percent of traffic
+[19:01.00]having issues,where,if you ask for JSON,it'll
+[19:03.00] try to talk to you,or if you use function
+[19:05.00]calling,you know,we'll have,like,a parse error.And
+[19:07.00]so I think that definitely going to be things
+[19:09.00]that are fixed in,like,the upcoming weeks.But
+[19:11.00] in terms of,like,the reasoning capabilities,I
+[19:13.00] mean,it's hard to beat,like,70%
+[19:15.00]cost reduction,especially when you're building
+[19:17.00] consumer applications,right?If you're building
+[19:19.00] something for,like,consultants or private equity,like,you're charging
+[19:21.00] $400,it doesn't really matter if it's $1
+[19:23.00] or $2.But for consumer apps,it
+[19:25.00] makes products viable.If you can go
+[19:27.00] from 4 to Sonnet,you might actually be able
+[19:29.00] to price it better.Yeah.
+[19:31.00]I had this chart about the ELO
+[19:33.00] versus the cost of all the
+[19:35.00] models.And,you could
+[19:37.00] put trend graphs on each of those things
+[19:39.00] about,like,you know,higher ELO equals
+[19:41.00] higher cost,except for Hikoo.Hikoo kind of just broke
+[19:43.00] the lines,or the ISO ELOs,if you want
+[19:45.00] to kind of call it.Cool.Before we
+[19:47.00] go too far into,you know,your opinions on
+[19:49.00] just the overall ecosystem,I want to
+[19:51.00] make sure that we map out the surface area
+[19:53.00] of Instructure.I would say that
+[19:55.00] most people would be familiar with
+[19:57.00] Instructure from your talks and your tweets
+[19:59.00] and all that.You had the number one
+[20:01.00] talk at from the AI Engineering Summit
+[20:03.00]Jason Liu and Jerry Liu.Yeah,yeah,yeah,yeah.You have to
+[20:07.00] start with Jay and Italy to do well.But
+[20:09.00] yeah,until I actually went through your
+[20:11.00] cookbook,I didn't realize,like,the surface area.Like,how
+[20:13.00] would you categorize the use cases,right?You have
+[20:15.00]LLM self-critique,you have knowledge
+[20:17.00] graphs in here,you have PII data
+[20:19.00] sanitation.How do you characterize to people
+[20:21.00] like,what is the surface area of Instructure?Yeah,so
+[20:23.00] I mean,this is the part that feels crazy
+[20:25.00] because really,the difference is
+[20:27.00]LLMs give you strings and Instructure gives
+[20:29.00] you data structures.And once you get data structures
+[20:31.00] again,you can do every,like,lead code problem
+[20:33.00] you ever thought of,right?And so,I think
+[20:35.00] there's a couple of really common applications.The
+[20:37.00] first one obviously is extracting
+[20:39.00]Structure data.This is just be,okay,well,like,I want to
+[20:42.00] put in an image of a receipt,I want to
+[20:44.00] give it back out a list of checkout items
+[20:46.00] with a price and a fee and a coupon code
+[20:48.00] or whatever.That's one application.Another
+[20:50.00] application really is around
+[20:52.00] extracting graphs out.So,one of the things
+[20:54.00] we found out about these language models
+[20:56.00] is that not only can you define nodes,it's
+[20:58.00] really good at figuring out what are nodes
+[21:00.00] and what are edges.And so,we have a bunch
+[21:02.00] of examples where,you know,not only do I
+[21:04.00] extract that,you know,this happens
+[21:06.00] after that,but also,like,okay,these two
+[21:08.00] are dependencies of another task.And you
+[21:11.00] can do,you know,extracting complex
+[21:13.00] entities that have relationships.Given
+[21:15.00] a story,for example,you can extract
+[21:17.00] relationships of families across different
+[21:19.00] characters.This is going to be done by
+[21:21.00] defining a graph.The last really big
+[21:23.00] application really is just around query
+[21:25.00] understanding.The idea is that,like,any
+[21:27.00] API call has some schema and if you can
+[21:29.00] define that schema ahead of time,you can
+[21:31.00] use a language model to resolve a request
+[21:33.00] into a much more complex request.One
+[21:35.00] that an embedding could not do.So,for
+[21:37.00] example,I have a really popular post called,like,
+[21:39.00]ragg is more than embeddings and
+[21:41.00] effectively,you know,if I have a question
+[21:43.00] like this,what was the latest thing that
+[21:45.00] happened this week?That embeds to nothing,right?
+[21:47.00]But really,like,that query should just
+[21:49.00] be,like,select all data where the
+[21:51.00] date time is between today and today
+[21:53.00] minus seven days,right?
+[21:55.00]What if I said,how did my writing
+[21:57.00] change between this month and last month?
+[21:59.00]Again,embeddings would do nothing.
+[22:01.00]But really,if you could do,like,a group
+[22:03.00] by over the month and a summarize,then
+[22:05.00] you could,again,like,do something much
+[22:07.00] interesting.And so,this really just calls out the fact
+[22:09.00] that embeddings really is kind of,like,the
+[22:11.00] lowest hanging fruit.And using something
+[22:13.00] like instructor can really help produce a
+[22:15.00] data structure.And then you can just
+[22:17.00] use your computer science and reason about this data
+[22:19.00] structure.Maybe you say,okay,well,I want to produce
+[22:21.00] a graph where I want to group by each month
+[22:23.00] and then summarize them jointly.You
+[22:25.00] can do that if you know how to define this
+[22:27.00] data structure.Yeah.In that part,you
+[22:29.00] kind of run up against,like,the
+[22:31.00]lang chains of the world that used to have
+[22:33.00] that.They still do have,like,the
+[22:35.00] self-querying.I think they used to call it
+[22:37.00] when we had Harrison on in our episode.How do you
+[22:39.00] see yourself interacting with the other LLM
+[22:41.00] frameworks in the ecosystem?Yeah.I mean,if
+[22:43.00] they use instructor,I think that's totally cool.Again,it's,like,it's just
+[22:46.00] Python,right?It's,like,asking,like,oh,how does,like,Jango
+[22:49.00] interact with requests?Well,you just might make
+[22:51.00] a request.get in a Jango app,right?But no one
+[22:54.00] would say.I,like,went off of Django because I'm using requests now.They
+[22:57.00] should be ideally,like,sort of the wrong
+[22:59.00] comparison.In terms of it,especially,like,the agent
+[23:02.00] workflows,I think the real goal for me is to go down,like,the LLM
+[23:04.00] compiler route,which is,instead of doing,like,a
+[23:07.00]react type reasoning loop,I think
+[23:10.00] my belief is that we should be using,like,workflows.If we
+[23:13.00] do this,then we always have a request and a complete
+[23:16.00] workflow.We can fine tune a model that has a better
+[23:18.00] workflow,whereas it's hard to think about,like,how do you fine tune
+[23:21.00] a better react loop?Yeah.Do you want to always train it to
+[23:24.00] have less looping,in which case,like,you wanted to get the right
+[23:27.00] answer the first time,in which case,it was a workflow
+[23:29.00] to begin with,right?Right.Can you define workflow
+[23:31.00] because I used to work at a workflow company,but I'm not
+[23:34.00] sure this is a good term for everybody.Oh,yeah,like,I'm
+[23:36.00] thinking workflow in terms of,like,the prefect to zap
+[23:39.00] your workflow.Yeah.Like,I want to build a DAG.I want you
+[23:42.00] to tell me what the nodes and edges are,and then maybe the
+[23:45.00] edges are also put in with AI.But the idea is that,like,I
+[23:48.00] want to be able to present you the entire plan,and then
+[23:51.00] ask you to fix things as I execute it,rather than going,like,Hey,I
+[23:55.00] couldn't parse the JSON,so I'm going to try again.I couldn't
+[23:58.00] parse the JSON,like,I'm going to try again.And then
+[24:00.00] next,you know,you spent,like,$2 on opening AI
+[24:02.00] credits,right?Yeah.As well as with the plan,you can just
+[24:05.00] say,Oh,the edge between node,like,X and Y does not
+[24:09.00] run.Let me just iteratively try to fix that
+[24:12.00] component.Fix the one that sticks,go on the next component.
+[24:14.00]And obviously,you can get into a world where if you have
+[24:17.00] enough examples of the nodes X and Y,maybe you can use,like,a
+[24:20.00] vector database to find a good few shot examples.You can do a
+[24:24.00] lot if you sort of break down the problem into that workflow
+[24:27.00] and execute that workflow,rather than looping and hoping
+[24:30.00] the reasoning is good enough to generate the
+[24:33.00] correct output.Yeah.You know,I've been hammering
+[24:36.00] on Devin a lot.I got access a couple of weeks ago.
+[24:39.00]And obviously,for a simple task,it does well.For the
+[24:42.00] complicated,like,more than 10,20-hour tasks,I can
+[24:45.00] see it in a lot of times.That's a crazy comparison,like,we
+[24:47.00] use to type out,like,3,4 loops.You're like,only once
+[24:51.00] it gets to,like,our tasks,it's hard.Yeah.Less than an hour,it's
+[24:55.00] nothing.That's crazy.I mean,okay.Maybe my
+[24:59.00] goal force has shifted.I don't know.That's incredible.Yeah.No,like,I'm
+[25:03.00] like,I'm like sub one minute executions.Like,the fact
+[25:06.00] that you're talking about 10 hours is incredible.I think
+[25:09.00] it's a spectrum.I think I'm going to say this every
+[25:11.00] single time I bring up Devin.Like,let's not reward them
+[25:13.00] for taking longer to do things.Do you know what I mean?Like,that's
+[25:16.00] a metric that is easily abusable.Sure.Yeah.You can definitely
+[25:19.00] you know what I mean.But I think it's like,if you can
+[25:22.00]monotonically increase the success probability over an
+[25:26.00] hour,like,that's winning to me,right?Like,obviously,if you run an hour and you've
+[25:30.00] made no progress,like,I think when we were in,like,auto-GBT land,there was that one
+[25:34.00] example where it's like,I wanted it to,like,buy me a bicycle over
+[25:37.00] night.I spent $700 and I never found the bicycle.Yeah.Yeah.Right.I
+[25:41.00] wonder if you'll be able to purchase a bicycle.Because it actually can do
+[25:44.00] things in real world,it just needs to suspend to you for often stuff.The point I was
+[25:48.00] trying to make was that I can see it turning plants.I think one of the agents
+[25:51.00]loop holes,or one of the things that is a real barrier for agents is LLMs really
+[25:55.00] like to get stuck into a lane.And,you know,what you're talking about,what I've
+[25:58.00] seen Devin do,is it gets stuck in a lane and he would just kind of change
+[26:01.00] plans based on the performance of the plan itself.And it's kind of cool.I
+[26:05.00] feel like we've gone too much in the looping route,and I think a lot of more
+[26:08.00] plans and,like,dags and data structures are probably going to come back to help
+[26:12.00] fill in some holes.Yeah.And what's like the interface to that?You know,you see
+[26:16.00] it's like an existing,like,state machine kind of thing that,like,connects to the LLMs.
+[26:20.00]The traditional DAC player.So,like,do you think we need something new for,like,a.i.dags?
+[26:25.00]Yeah.I mean,I think that the hard part is going to be describing visually the fact
+[26:30.00] that this DAC can also change over time,when it should still be allowed to be fuzzy.
+[26:34.00]I think in,like,mathematics,we have,like,plate diagrams,and,like,marked
+[26:37.00] octane diagrams,and,like,you know,recurrent states,and all that,like that.Some of
+[26:40.00] that might come into this,like,workflow world,but to be honest,I'm not too sure.
+[26:44.00]I think right now,the first steps are just how do we take this DAC idea and break
+[26:48.00] it down to modular components that we can,like,prompt better,have few shot examples
+[26:52.00] for,and ultimately,like,fine tune against.But in terms of even the UI,it's hard
+[26:56.00] to say what,well,likely would.I think,you know,people like Prefect and Zapier
+[27:00.00] have a bit of good shot at doing a good job.Yeah.You seem to use Prefect a lot.
+[27:04.00]I actually work there at Prefect competitor at Temporal,and I'm also very familiar
+[27:07.00] with Daxter.What else would you call out as,like,particularly interesting
+[27:11.00] in the AI engineering stack?Man,I almost use nothing.Like,I just use
+[27:16.00]everything.And,like,py tests.Like,okay.I think that's basically it.You know,a lot
+[27:22.00] of the observability companies have,the more observability companies
+[27:26.00] I've tried,the more I just use Postgres.Really?Okay.Postgres for observability?
+[27:32.00]But,like,the issue really is the fact that these observability companies isn't
+[27:35.00] actually doing observability for the system.It's just doing the LLM thing.
+[27:39.00]Like,I still end up using,like,Datadoc,or,like,you know,Sentry to do,like,latency.
+[27:43.00]And,so,I just have those system handle it.And,then,the,like,prompt in,prompt out,like,in C token costs.
+[27:49.00]I just put that in,like,a Postgres table now.So,you don't need,like,20 funded startups
+[27:53.00]building LLM ops?Yeah,but I'm also,like,an old-tired guy,you know what I mean?
+[27:58.00]Like,I think,because of my background,it's like,yeah,the Python stuff,I'll write
+[28:01.00] myself.But,you know,I will also just use Versal happily.Yeah,yeah.Because I'm just
+[28:05.00] not familiar with that world of tooling.Whereas,like,I think,you know,I spent
+[28:09.00]3 good years building observability tools for recommendation systems.And,I
+[28:13.00] was like,oh,compared to that,instructor is just one call.I just have to put
+[28:17.00]time,start,time,end,then count the prompt token,right?Because I'm not doing a very
+[28:21.00]complex looping behavior.I'm doing mostly workflows and extraction.Yeah.
+[28:25.00]I mean,while we're on this topic,we'll just kind of get this out of the way.You
+[28:28.00]famously have decided to not be a venture-backed company.You want to do the consulting route.
+[28:32.00]Oh,yes.The obvious route for,you know,someone as successful as instructor is
+[28:35.00]like,oh,here's hosted instructor with,like,all tooling.Yeah.You just said you had a whole
+[28:38.00] bunch of experience building observability tooling,like,you have the perfect
+[28:41.00] background to do this,and you're not.Yeah.Isn't that sick?I think it's sick.I
+[28:44.00] mean,I know why,because you want to go free-dive.Yeah.Yeah.Because I think there's
+[28:48.00] two things,right?Look,one,if I tell myself I want to build requests,request is not
+[28:53.00] a venture-backed startup,right?I mean,one could argue,like,whether a postman
+[28:57.00] is,but I think for the most part,it's like,having worked so much,and more
+[29:01.00] interested in looking at how systems are being applied,and just having access to
+[29:05.00]the most interesting data,and I think I can do that more through a consulting
+[29:08.00] business where I can come in and create,go,oh,you want to build perfect memory,you
+[29:11.00] want to build an agent,you want to build,like,automations over construction,or,like,
+[29:15.00] insurance,and it's a supply chain,or,like,you want to handle writing private equity
+[29:20.00] mergers and acquisitions reports based off of user interviews,like,those things are
+[29:23.00] super fun.Whereas,like,maintaining the library,I think is mostly just kind of
+[29:28.00] like a utility that I try to keep up,especially because if it's not venture-backed,I have
+[29:32.00] no reason to sort of go down the route of,like,trying to get a thousand integrations
+[29:36.00] in my mind,I just go,like,oh,okay,98% of the people use OpenAI,I'll
+[29:40.00] support that,and if someone contributes another platform,that's great,I'll
+[29:43.00] merge it in.Yeah,I mean,you only addedenthropic support this year.
+[29:47.00]Yeah,yeah.You couldn't even get an API key until,like,this year,right?
+[29:51.00]That's true.And so,ok,if I add it,like,last year,I was trying to,like,double the
+[29:55.00] code base to service,you know,half a percent of all downloads.Do you think the
+[29:58.00]market share will shift a lot,now that Anthropic has,like,a very,very competitive offering?
+[30:02.00]I think it's still hard to get API access.I don't know if it's fully GA now,if
+[30:08.00] it's GA,if you can get commercial access really easily.I got commercial after,like,two weeks
+[30:13.00] to reach out to their sales team.Two weeks.Yeah,so,it's not too bad.There's a call list here,and
+[30:17.00] then anytime you run into rate limits,just,like,ping one of the Anthropic staff members.
+[30:21.00]Yeah,then maybe we can,like,cut that part out,so I don't need to,like,you know.No,it's cool,it's cool.
+[30:24.00]Fall it's news,but it's a common question.Surely,just from the price perspective,it's gonna
+[30:28.00] make a lot of sense.If you are a business,you should totally consider Sonnet.The cost savings
+[30:34.00] is just gonna justify it,if you actually are doing things at volume.And yeah,I think the
+[30:38.00] SDK is,like,pretty good.But to back to the instructor thing,I just don't think
+[30:41.00] it's a billion-dollar company.And I think if I raise money,the first question
+[30:44.00] is gonna be,like,how are you gonna get a billion-dollar company?And I would just go,like,man,like,if I
+[30:48.00] make a million dollars as a consultant,I'm super happy.I'm,like,more than a static.
+[30:52.00]I can have,like,a small staff of,like,three people.Like,it's fun.And I think a lot of my happiest
+[30:57.00] founder friends are those who,like,raised a tiny seed round,became profitable.They're making,like,
+[31:02.00]70,000MRR.And they're,like,we don't even need to raise the seed round.Let's just keep it,like,between
+[31:07.00] me and my co-founder,we'll go traveling,and it'll be a great time.I think it's a lot of fun.
+[31:11.00]I'll write Pete to the seed investor in the company.Yeah.I think that's,like,one of the things
+[31:16.00] that people get wrong sometimes,and I see this a lot.They have an insight into,like,some new
+[31:21.00] tech.Like,say hello to MCI.And they build some open-source stuff.And it's,like,I should just raise
+[31:26.00] money and do this.And I tell people a lot.It's,like,look,you can make a lot more money
+[31:30.00] than something else than doing a startup.Like,most people that do a company could make a lot
+[31:34.00] more money just working somewhere else than the company itself.Do you have any advice for folks
+[31:39.00] that are maybe in a similar situation?They're trying to decide,oh,should I stay?Am I,like,high-paid
+[31:44.00] fang job?And just tweet this on the side?And do this on GitHub?Should I go be a consultant?
+[31:50.00]It seems like a lot of work.It's,like,you got to talk to all these people,you know.There's a lot to unpack.
+[31:55.00]I think the open-source thing is just,like,well,I'm just doing it purely for fun.And I'm doing it
+[31:59.00] because I think I'm right.But part of being right is the fact that it's not a venture-back startup.
+[32:04.00]Like,I think I'm right because this is all you need,right?So I think a part of the philosophy
+[32:09.00] is the fact that all you need is a very sharp blade to sort of do your work.And you don't
+[32:13.00] actually need to build,like,a big enterprise.So that's one thing.I think the other thing,too,that
+[32:17.00]I've been thinking around,just because I have a lot of friends at Google that want to leave right now,it's,like,man,like,what we lack is not money or skill.
+[32:23.00]Like,what we lack is courage.You just,like,you just have to do this hard thing,and you have to do it scared anyways,right?
+[32:29.00]In terms of,like,whether or not you do want to do a founder,I think that's just a matter of
+[32:32.00]optimality.But I definitely recognize that the,like,expected value of being a founder is still quite low.
+[32:39.00]Right.I know as many founder breakups,and as I know,friends who've raised a seed round this year.
+[32:44.00]Right.And,like,that is,like,the reality.And,like,you know,even in,from that perspective,it's been tough,where it's,like,oh,man,like,a lot of incubators
+[32:51.00]want you to have co-founders,now you spend half the time,like,fundraising,and then trying to,like,meet co-founders,and find co-founders,
+[32:57.00]rather than building the thing.This is a lot of time spent out doing things I'm not really good at.
+[33:02.00]I do think there's a rising trend in solo founding.You know,I am a solo.I think that something,like,30% of,like,forget what the exact
+[33:11.00]estat is,something,like,30% of starters that make it so,like,series B or something,actually are solo
+[33:15.00]founder.I feel,like,this must have co-founder idea,mostly comes from YC,and most,everyone else copies it,and then
+[33:23.00]planning your company's breakup over co-founder breakups.And,I bet it would be,like,I wonder how much of it is the people
+[33:28.00]who don't have that much,like,and hope there's not a diss to anybody,but it's,like,you sort of,you go through
+[33:33.00]the incubator route,because you don't have,like,the social equity,you would need to sort of,like,send an email to
+[33:38.00]Lacoy,and be,like,hey,I'm going on this ride.You want to take it on the rocket ship,right?Like,that's very hard to sell.
+[33:43.00]My message if I was to raise money is,like,you've seen my Twitter.My life is sick.I've decided to make it much worse by
+[33:49.00]being a founder,because this is something I have to do.So,do you want to come along?Otherwise,I want to fund it myself.
+[33:55.00]Like,if I can't say that,like,I don't need the money,because I can,like,handle payroll,and,like,hire an intern,and get an assistant.
+[34:01.00]Like,that's all fine.But,I really don't want to go back to Meta.I want to,like,get two years to,like,try to find a problem
+[34:07.00]we're solving.That feels like a bad time.Yeah.Jason is,like,I wear a YSL jacket on stage at AI
+[34:13.00]Engineer Summit.I don't need your accelerator money.And boots.You don't forget the boots.That's true.That's true.You have really good boots.
+[34:20.00]Havily good boots.But,I think that is a part of it,right?I think it is just,like,optionality,and also,just,like,I'm a lot older now.
+[34:26.00]I think 22-year-old Jason would have been probably too scared,and now I'm,like,too wise.But,I think it's a matter of,like,if you raise money,you have to have a plan
+[34:34.00]of spending it.And I'm just not that creative with spending that much money.Yeah.I mean,to be clear,you just celebrated your 30th birthday.
+[34:41.00]Happy birthday.Yeah.It's an album.We're going to Mexico next week.A lot older is relative to some of the folks I think.
+[34:48.00]Seeing on the career tips,I think this works out a great post about our user world to get into AI.As on one of your tweets
+[34:55.00]in January 23,you applied to,like,Figma,Notion,Coherent,Tropic,and all of them rejected you because you didn't have enough
+[35:01.00]LLM experience.Yeah.I think at that time,it would be easy for a lot of people to say,Oh,I kind of missed the boat.You know,I'm too late.
+[35:08.00]Not going to make it.You know.Any advice for people that feel like that?
+[35:13.00]Like,the biggest learning here is actually from a lot of folks in Jiu Jitsu.They're like,Oh,man,like,is it too late to start Jiu Jitsu?
+[35:18.00]Like,I'll join Jiu Jitsu once I get in more shape,right?It's like,there's a lot of,like,excuses.And then you say,Oh,like,why should I start now?
+[35:26.00]I'll be like 45 by the time I'm any good.It'll be 45 anyways.Like,time is passing.Like,if you don't start now,you start tomorrow,you're just,like,one more day behind.
+[35:35.00]If you're worried about being behind,like,today is,like,the soonest you can start,right?And so you've got to recognize that,like,maybe you just don't want it,and that's fine,too.
+[35:43.00]Like,if you wanted it,you would have started.I think a lot of these people,again,probably think of things on a too short time horizon,but,again,you know,you're going to be old anyway,so you may as well just start now.
+[35:53.00]You know,one more thing on,I guess,the career advice slash,like,sort of,logging.You always go viral for this post that you wrote,on advice to young people and the lies you tell yourself.
+[36:02.00]Oh,yeah,yeah,yeah.You said that you were writing it for your sister,what?Like,why is that?
+[36:06.00]Yeah,yeah,she was,like,bummed out about going to college and,like,stressing about jobs,and I was,like,Oh,I really want to hear.Okay.
+[36:13.00]And I just kind of,like,text this for you the whole thing.It's crazy.It's got,like,50,000 views.I mean,your average tweet has more.
+[36:20.00]But that thing is,like,a 30-minute read now.
+[36:24.00]Yeah,yeah.So there's lots of stuff here,which I agree with.You know,I'm also of occasionally indulging the sort of life reflection phase.
+[36:31.00]There's the how to be lucky.There's the how to have high agency.I feel like the agency thing is always a trend in SF,or just in text circles.
+[36:39.00]How do you define having high agency?
+[36:41.00]I'm almost,like,pass the high agency phase now.Now,my biggest concern is,like,okay,the agency is just,like,the norm of the vector.
+[36:49.00]What also matters is the direction,right?It's,like,how pure is the shot?Yeah,I mean,I think agency is just a matter of,like,having courage and doing the thing.That's scary,right?
+[36:59.00]You know,if you want to go rock climbing,it's,like,do you decide you want to go rock climbing,then you show up to the gym,you rent some shoes,and you just fall 40 times?
+[37:05.00]Or do you go,like,Oh,like,I'm actually more intelligent.Let me go research the kind of shoes that I want.Okay,like,there's flatter shoes and more inclined shoes.
+[37:13.00]Like,which one should I get?Okay,let me go order the shoes on Amazon.I'll come back in three days.Like,Oh,it's a little bit too tight.Maybe it's too aggressive.I'm only a beginner.Let me go change.
+[37:22.00]No,I think the high-agent person just,like,goes and,like,falls down 20 times,right?Yeah,I think the higher agency person is more focused on,like,process metrics versus outcome metrics.
+[37:32.00]Right?Like,from pottery,like,one thing I learned was,if you want to be good at pottery,you shouldn't count like the number of cups or bowls you make.You should just weigh the amount of clay you use,right?
+[37:42.00]Like,the successful person says,Oh,I want to do a hundred pounds of clay,right?The less agency person is like,Oh,I've made six cups,and then after I've made six cups,like,there's not really,what do you do next?No,just pounds of clay,pounds of clay.
+[37:53.00]Seeing with the work here,right?I say,Oh,you just got to write the tweets,like,made the commits,contribute open source,like,write the documentation.There's no real outcome.It's just a process.And if you love that process,you just get really good at the thing you're doing.
+[38:04.00]Yeah.So,just to push back on this,because obviously,I mostly agree.How would you design performance review systems?
+[38:11.00]Because you were effectively saying,we can count nicer code for developers,right?Like,did you?
+[38:18.00]No,I don't think that would be the actual,like,I think if you make that an outcome,like,I can just expand a for loop,right?I think,okay.So,for performance review,this is interesting because I've mostly thought of it from the perspective of science and not engineering.
+[38:31.00]I've been running a lot of engineering stand-ups,primarily because there's not really that many machine learning folks.The process outcome is like experiments and ideas,right?
+[38:39.00]Like,if you think about outcomes,what you might want to think about an outcome is,oh,I want to improve the revenue or whatnot.But that's really hard.But if you're someone who is going out,like,okay,like,this week,I want to come up with,like,three or four experiments.I might move the needle.Okay,nothing worked.
+[38:51.00]To them,they might think,oh,nothing worked.Like,I suck.But to me,it's like,wow,you've closed off all these other possible avenues for,like,research.Like,you're going to get to the place.So,you're going to figure out that direction really soon.
+[39:02.00]There's no way you'd try thirty different things and none of them work.Usually,like,ten of them work,five of them work really well,two of them work really,really well.And one thing was,like,the nail in the head.So,agency lets you sort of capture the volume of experiments.And,like,experience lets you figure out,like,oh,that other half,it's not worth doing,right?
+[39:19.00]I think experience is going to go,half these pumping papers don't make any sense,just use chain of thought and just,you know,use a for loop.That's basically it,right?It's like,usually performance for me is around,like,how many experiments are you running?How often times are you trying?Yeah.
+[39:31.00]So,when do you give up on an experiment?Because a stitch fix,you're going to give up on language models,I guess,in a way,as a tool to use.And then,maybe the tools got better.You were right at the time,and then the tool improved.I think there are similar paths in my engineering career,where I try one approach,and at the time it doesn't work,and then the thing changes.But then I kind of soured on that approach,and I don't go back to it.
+[39:52.00]Yeah,how do you think about that loop?So,usually when I'm coaching folks,and they say,like,oh,like,these things don't work,I'm not going to pursue them in the future.Like,one of the big things,like,hey,the negative result is a result,and this is something worth documenting.Like,this is an academia,like,if it's negative,you don't just,like,not publish,right?But then,like,what do you actually write down?Like,what you should write down is,like,here are the conditions,this is the inputs and the outputs we tried the experiment on.And then,one thing that's really valuable is basically writing down,under what conditions would I revisit these experiments.
+[40:20.00]These things don't work because of what we had at the time.If someone is reading this two years from now,under what conditions will we try again?That's really hard,but again,that's,like,another skill you kind of learn,right?It's like,you do go back,you do experiments,you figure out why it works now.I think a lot of it here is just,like,scaling worked.Rap lyrics,you know,that was because I did not have high enough quality data.If we phase shift and say,okay,you don't even need training data.Oh,great,then it might just work.Different domain.
+[40:47.00]Do you have anything in your list that is,like,it doesn't work now,but I want to try it again later.Something that people should.Maybe keep in mind,you know,people always,like,AGI when,you know,when are you going to know the AGI is here.Maybe it's less than that,but any stuff that you tried recently that didn't work that you think will get there.
+[41:03.00]I think the personal assistants and the writing I've shown to myself is just not good enough yet.So I hired a writer and I hired a personal assistant.So now I'm going to basically,like,work with these people until I figure out,like,what I can actually,like,automate,and what are,like,the reproducible steps.But,like,I think the experiment for me is,like,I'm going to go pay a person,like,thousand of dollars a month to,like,help me improve my life,and then let me get them to help me figure,like,what are the components,and how do I actually modularize something to get it to work.Because it's not just,like,o-auth Gmail calendar and,like,notion.It's not just,like,it's,like,another skill.It's,like,another skill.
+[41:30.00]It's not just,like,o-auth Gmail calendar and,like,notion.It's a little bit more complicated than that,but we just don't know what that is yet.Those are two,sort of,systems that,I wish gb4 or opus was actually good enough to just write me an essay,but most of the essays are still pretty bad.Yeah,I would say,you know,on the personal assistant side,Lindy is probably the one I've seen the most.Flow was a speaker at the summit.I don't know if you've checked it out,or any other,sort of,agents assistant startup.
+[41:55.00]No,not recently.I haven't tried,Lindy.They were not GA.I was considering it.Yeah,yeah.A lot of it now,it's,like,oh,like,really,what I want you to do is take a look at all of my meetings and,like,write,like,a really good weekly summary email for my clients.To remind them that I'm,like,you know,thinking of them and,like,working for them,right?Or it's,like,I want you to notice that,like,my Monday is,like,way too packed,and,like,block out more time.And also,like,you know,that people to do the reschedule and then try to opt in to move them around.And then I want you to say,oh,Jason should have,like,a 15 minutes.
+[42:24.00]Prep break after form back to back.Those are things that now I know I can prompt them in,but can it do it well.Before,I didn't know that's what I wanted to prompt for us.Defragging a calendar and adding breaks so I can,like,eat lunch.Yeah,that's the AGI test.
+[42:39.00]Exactly.Compassion,right?I think one thing that,yeah,we didn't touch on it before,but I think it was interesting.You had this tweet a while ago about prompts should be code.And then there were a lot of companies trying to build prompt engineering tooling.Kinda.
+[42:53.00]Trying to turn the prompt into a more structured thing.What's your thought today?Now you want to turn the thinking into DAGs,like,do prompts should still be code?Any updated ideas?
+[43:03.00]Ah,it's the same thing,right?I think,you know,with Instructor,it is very much,like,the output model is defined as a code object.That code object is sent to the LLM and in return you get a data structure.So,the outputs of these models,I think,should also be code objects.And the inputs somewhat should be code objects.
+[43:20.00]But I think the one thing that Instructor tries to do is separate instruction data and the types of the output.And beyond that,I really just think that most of it should be still,like,managed pretty closely to the developer.
+[43:31.00]Like,so much of it is changing that if you give control of these systems away too early,you end up ultimately wanting them back.Like,many companies I know that reach out are ones where,like,oh,we're going off of the frameworks because now that we know what the business outcomes we're trying to optimize for,these frameworks don't work.
+[43:47.00]Yeah,because we do rag,but we want to do rag to,like,sell you supplements or to have you,like,schedule the fitness appointment.The prompts are kind of too big in the systems to really pull them back out and,like,start doing upselling or something.It's really funny,but a lot of it ends up being,like,once you understand the business outcomes,you care way more about the prompt.
+[44:06.00]Actually,this is fun.In our prep for this call,we were trying to say,like,what can you as an independent person say that maybe me and Alessio cannot say or,you know,someone who wants to get a company say.
+[44:15.00]What do you think is the market share of the frameworks?The land chain,the llama index,the everything.
+[44:20.00]Oh,massive.Because not everyone wants to care about the code,right?I think that's a different question to,like,what is the business model and are they going to be,like,massively profitable businesses,right?
+[44:31.00]Making hundreds of millions of dollars,that feels,like,so straightforward,right?Because not everyone is the prompt engineer,like,there's so much productivity to be captured in,like,backoffice,off to automations,right?
+[44:42.00]It's not because they care about the prompts that they care about managing these things.Yeah,but those would be sort of low-code experiences,you know?
+[44:48.00]Yeah,I think the bigger challenge is,like,okay,100 million dollars,probably pretty easy.It's just time and effort,and they have the manpower and the money to sort of solve those problems.
+[44:58.00]Again,if you go to the VC route,then it's,like,you're talking about billions,and that's really the goal.That stuff,for me,it's,like,pretty unclear.
+[45:05.00]But again,that is to say,that,like,I sort of am building things for developers who want to use Instructure to build their own tooling,in terms of the amount of developers there are in the world,versus,downstream consumers of these things,or even just think of how many companies will use,like,the Adobe's and the IBM's,right?
+[45:19.00]Because they want something that's fully managed,and they want something that they know will work.And if the incremental 10% requires you to hire another team of 20 people,you might not want to do it.
+[45:28.00]I think that kind of organization is really good for those,like,bigger companies.
+[45:32.00]And now,I just want to capture your thoughts on one more thing,which is,you said you wanted most of the prompts to stay close to the developer.And Hamel Hussein wrote this post,which I really love,called,FU Show Me The Prompt.
+[45:43.00]Yeah.
+[45:44.00]I think he cites you in one of those part of the blog post.And I think DSPy is kind of,like,the complete antithesis of that,which is,I think it's interesting,because I also hold the strong view that AI is a better prompt engineer than you are.And I don't know how to square that.Wondering if you have thoughts.
+[45:57.00]I think something like DSPy can work because there are,like,very short-term metrics to measure success,right?It is,like,did you find the PII?Or,like,did you write the multi-hop question the correct way?
+[46:12.00]But in these workflows that I've been managing,a lot of it,are we minimizing churn and maximizing retention?
+[46:18.00]Yeah,that's a very long loop.It's not really,like,a uptuna,like,training loop,right?Like,those things are much more harder to capture.So we don't actually have those metrics.
+[46:26.00]And,obviously,we can figure out,like,is the summary good,but,like,how do you measure the quality of the summary?It's,like,that feedback loop ends up being a lot longer.And then,again,when something changes,it's really hard to make sure that it works across these,like,newer models,or,again,like,changes to work for the current prompts.
+[46:44.00]Like,when we migrate from,like,anthropic to open AI,like,there's just a ton of changes that are,like,infrastructure-related,not necessarily around the prompt itself.Yeah,cool.
+[46:53.00]Any other engineering startups that you think should not exist before we wrap up?
+[46:58.00]No,I mean,oh my gosh.I mean,a lot of it,again,is just,like,every time investors,like,how does this make a billion dollars?Like,it doesn't.I'm gonna go back to just,like,tweeting and holding my breath underwater.Yeah,like,I don't really pay attention too much to most of these,like,most of the stuff I'm doing is around,like,the consumer of,like,LLM calls.Yep.
+[47:18.00]I think people just want to move really fast,and they will end up pick these vendors,but I don't really know if anything has really,like,blown me out the water.Like,I only trust myself.But that's also a function of just being an old man.Like,I think,you know,many companies are definitely very happy with using most of these tools anyways.But I definitely think I occupy a very small space in the AI engineering ecosystem.Yeah.
+[47:39.00]I would say one of the challenges here,you know,you call about the dealing in the consumer of LLM's space.I think that's what AI engineering differs from ML engineering,and I think a constant disconnect or cognitive dissonance in this field in the AI engineers that have sprung up is that they're not as good as the ML engineers.They're not as qualified.I think that,you know,you are someone who has credibility in the MLE space,and you are also a very authoritative figure in the AI e-space.And
+[48:08.00]I think so.I think you've built the de facto leading library.I think yours,I think instructors should be part of the standard lib,even though I tried to not use it.Like,basically also end up rebuilding instructor,right?Like,that's a lot of the back and forth that we had over the past two days.I think that's the fundamental thing that we're trying to figure out.Like,there's very small supply of MLEs.Not everyone's going to have that experience that you had,but the global demand for AI is going to far outstrip the existing MLEs.So what do we do?Do we force everyone to go through the standard MLE curriculum?
+[48:37.00]MLE curriculum or do we make a new one?
+[48:39.36]I got some takes.
+[48:40.80]I think a lot of these app layer startups should not be hiring MLEs because they end up turning.
+[48:46.56]Yeah, they want to work at opening high.
+[48:48.44]They're just like, hey guys, I joined and you have no data.
+[48:51.84]And like all I did this week was take some typescript build errors and like figure out why we don't have any tests.
+[48:59.00]And like what is this framework X and Y?
+[49:01.40]Like how do you measure success?
+[49:02.76]What are your business outcomes?
+[49:03.84]Oh no, okay, let's not focus on that.
+[49:05.60]Great, I'll focus on these typescript build errors.
+[49:09.04]And then you're just like, what am I doing?
+[49:10.52]And then you kind of sort of feel really frustrated.
+[49:12.60]And I already recognize that because I've made offers to machine learning engineers.
+[49:18.08]They've joined and they've left in like two months.
+[49:21.32]And the response is like, yeah, I think I'm going to join a research lab.
+[49:24.00]So I think it's not even that, like I don't even think you should be hiring these MLEs.
+[49:27.56]On the other hand, what I also see a lot of is the really motivated engineer that's doing more engineering
+[49:34.20]is not being allowed to actually like fully pursue the AI engineering.
+[49:37.60]So they're the guy who built the demo, it got traction, now it's working.
+[49:40.72]But they're still being pulled back to figure out why Google Calendar integrations are not working
+[49:45.08]or like how to make sure that, you know, the button is loading on the page.
+[49:48.16]And so I'm sort of like in a very interesting position where the companies want to hire MLE.
+[49:53.32]They don't need to hire, but they won't let the excited people who've caught the AI engineering bug
+[49:57.84]could go do that work more full time.
+[50:00.00]This is something I'm literally wrestling with this week.
+[50:02.84]As I just wrote something about it, this is one of the things I'm probably going to be recommending in the future
+[50:06.40]is really thinking about like, where is the talent coming from?
+[50:08.96]How much of it is internal?
+[50:10.12]And do you really need to hire someone who's like writing pie torch code?
+[50:14.00]Yeah, exactly.
+[50:15.24]Most of the time you're not, you're going to need someone to write instructor code.
+[50:20.60]And like, I feel goofy all the time, just like prompting.
+[50:23.24]It's like, oh man, I wish I just had a target data set that I could like train a model against.
+[50:27.16]Yes.
+[50:27.68]And I can just say it's right or wrong.
+[50:29.24]Yeah.So, you know, I guess what Lanespace is, what the AI engineer world's fair is, is that we're trying to create
+[50:35.48]and elevate this industry of AI engineers, where it's legitimate to actually take these
+[50:40.36]motivated software engineers who want to build more in AI and do creative things in AI
+[50:44.48]to actually say you have the blessing, and this is legitimate, sub-special to your software engineering.
+[50:49.72]I think there's been a mix of that product engineering.
+[50:52.44]I think a lot more data science is going to come in versus machine learning engineering.
+[50:55.92]Because a lot of it now is just quantifying.
+[50:57.96]Like, what does the business actually want as an outcome?
+[51:01.16]The outcome is not rag-app.
+[51:02.56]The outcome is like reduced churn.
+[51:04.64]You wouldn't need to figure out what that actually is.
+[51:06.20]I had to measure it.
+[51:06.96]Yeah.All the data engineering tools still apply.
+[51:09.28]BI layers,sematic layers, whatever.
+[51:12.32]Yeah.
+[51:12.96]Cool.We'll have you back again for the world's fair.
+[51:15.88]We don't know what you're going to talk about, but I'm sure it's going to be amazing.
+[51:19.12]You're a very, very polished speaker.
+[51:20.48]The title is written.
+[51:21.68]It's just a, "Pydantic is still all you need."
+[51:26.24]I'm worried about having too many all-you-need titles, because that's obviously very trendy.
+[51:30.20]So, you have one of them, but I need to keep a lid on everyone saying their thing is all you need.
+[51:35.28]But yeah, we'll figure it out.
+[51:36.44]Pydantic is not my thing.
+[51:37.68]So, what else?
+[51:38.52]I think that's why it works.
+[51:40.56]It's true.
+[51:41.28]Cool.Well, it was a real pleasure to have you on.
+[51:43.24]Of course.
+[51:43.76]Everybody should go follow you on Twitter and check out Instructure.
+[51:46.68]There's also Instructure.js, which I'm very happy to see.
+[51:49.36]And what else? Anything else to plug?
+[51:51.76]Useinstructor.com.
+[51:52.84]We got a domain name now.
+[51:54.04]Nice.
+[51:54.84]Nice.Awesome.
+[51:55.80]Cool.Cool.
+[51:56.92]Thanks for your time.
+[51:57.72]Thanks.
+[51:58.72](音乐)
diff --git a/content/post/Latent Space/Latent-Space-High-Agency-Pydantic->-VC-Backed-Frameworks-—-with-Jason-Liu-of-Instructor.md b/content/post/Latent Space/Latent-Space-High-Agency-Pydantic->-VC-Backed-Frameworks-—-with-Jason-Liu-of-Instructor.md
new file mode 100644
index 0000000..5983fc8
--- /dev/null
+++ b/content/post/Latent Space/Latent-Space-High-Agency-Pydantic->-VC-Backed-Frameworks-—-with-Jason-Liu-of-Instructor.md
@@ -0,0 +1,1957 @@
+---
+title: High Agency Pydantic > VC Backed Frameworks — with Jason Liu of Instructor
+author: Latent Space
+date: Fri, 19 Apr 2024 19:07:36 GMT
+draft: false
+summary: We are reuniting for the 2nd AI UX demo day in SF on Apr 28. Sign up to demo here! And don’t forget tickets for the AI Engineer World’s Fair — for early birds who join before keynote announcements!Abo...
+categories: [Latent Space]
+---
+
+{{< aplayer name="High Agency Pydantic > VC Backed Frameworks — with Jason Liu of Instructor" artist="Latent Space" url="https://chrt.fm/track/ABF6EF/api.substack.com/feed/podcast/143697169/f060c975d3dc0ce0ea1ea5e658eed89f.mp3" cover="https://substackcdn.com/feed/podcast/1084089/post/143697169/25da23069ad788e4cec348bdea08df5e.jpg" lrc-folded=true lrc-type=3 lrc="../Latent-Space-High-Agency-Pydantic->-VC-Backed-Frameworks-—-with-Jason-Liu-of-Instructor.lrc" >}}{{< /aplayer >}}
+
+------
+
+
We are reuniting for the 2nd AI UX demo day in SF on Apr 28. Sign up to demo here!
And don’t forget tickets for the AI Engineer World’s Fair — for early birds who join before keynote announcements!
About a year ago there was a lot of buzz around prompt engineering techniques to force structured output. Our friend Simon Willison tweeted a bunch of tips and tricks, but the most iconic one is Riley Goodside making it a matter of life or death:
Guardrails (friend of the pod and AI Engineer speaker), Marvin (AI Engineer speaker), and jsonformer had also come out at the time. In June 2023, Jason Liu (today’s guest!) open sourced his “OpenAI Function Call and Pydantic Integration Module”, now known as Instructor, which quickly turned prompt engineering black magic into a clean, developer-friendly SDK.
A few months later, model providers started to add function calling capabilities to their APIs as well as structured outputs support like “JSON Mode”, which was announced at OpenAI Dev Day (see recap here).
In just a handful of months, we went from threatening to kill grandmas to first-class support from the research labs. And yet, Instructor was still downloaded 150,000 times last month. Why?
What Instructor looks like
Instructor patches your LLM provider SDKs to offer a new response_model option to which you can pass a structure defined in Pydantic. It currently supports OpenAI, Anthropic, Cohere, and a long tail of models through LiteLLM.
What Instructor is for
There are three core use cases to Instructor:
* Extracting structured data: Taking an input like an image of a receipt and extracting structured data from it, such as a list of checkout items with their prices, fees, and coupon codes.
* Extracting graphs: Identifying nodes and edges in a given input to extract complex entities and their relationships. For example, extracting relationships between characters in a story or dependencies between tasks.
* Query understanding: Defining a schema for an API call and using a language model to resolve a request into a more complex one that an embedding could not handle. For example, creating date intervals from queries like “what was the latest thing that happened this week?” to then pass onto a RAG system or similar.
Jason called all these different ways of getting data from LLMs “typed responses”: taking strings and turning them into data structures.
Structured outputs as a planning tool
The first wave of agents was all about open-ended iteration and planning, with projects like AutoGPT and BabyAGI. Models would come up with a possible list of steps, and start going down the list one by one. It’s really easy for them to go down the wrong branch, or get stuck on a single step with no way to intervene.
What if these planning steps were returned to us as DAGs using structured output, and then managed as workflows? This also makes it easy to better train model on how to create these plans, as they are much more structured than a bullet point list. Once you have this structure, each piece can be modified individually by different specialized models.
You can read some of Jason’s experiments here:
While LLMs will keep improving (Llama3 just got released as we write this), having a consistent structure for the output will make it a lot easier to swap models in and out.
Jason’s overall message on how we can move from ReAct loops to more controllable Agent workflows mirrors the “Process” discussion from our Elicit episode:
Watch the talk
As a bonus, here’s Jason’s talk from last year’s AI Engineer Summit. He’ll also be a speaker at this year’s AI Engineer World’s Fair!
Timestamps
* [00:00:00] Introductions
* [00:02:23] Early experiments with Generative AI at StitchFix
* [00:08:11] Design philosophy behind the Instructor library
* [00:11:12] JSON Mode vs Function Calling
* [00:12:30] Single vs parallel function calling
* [00:14:00] How many functions is too many?
* [00:17:39] How to evaluate function calling
* [00:20:23] What is Instructor good for?
* [00:22:42] The Evolution from Looping to Workflow in AI Engineering
* [00:27:03] State of the AI Engineering Stack
* [00:28:26] Why Instructor isn't VC backed
* [00:31:15] Advice on Pursuing Open Source Projects and Consulting
* [00:36:00] The Concept of High Agency and Its Importance
* [00:42:44] Prompts as Code and the Structure of AI Inputs and Outputs
* [00:44:20] The Emergence of AI Engineering as a Distinct Field
Show notes
* Jason on the UWaterloo mafia
* Jason on Twitter, LinkedIn, website
* Instructor docs
* Max Woolf on the potential of Structured Output
* swyx on Elo vs Cost
* Jason on Anthropic Function Calling
* Jason on Rejections, Advice to Young People
* Jason on Bad Startup Ideas
* Jason on Prompts as Code
* Rysana’s inversion models
* Bryan Bischof’s episode
* Hamel Husain
Transcript
Alessio [00:00:00]: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO at Residence at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol AI.
Swyx [00:00:16]: Hello, we're back in the remote studio with Jason Liu from Instructor. Welcome Jason.
Jason [00:00:21]: Hey there. Thanks for having me.
Swyx [00:00:23]: Jason, you are extremely famous, so I don't know what I'm going to do introducing you, but you're one of the Waterloo clan. There's like this small cadre of you that's just completely dominating machine learning. Actually, can you list like Waterloo alums that you're like, you know, are just dominating and crushing it right now?
Jason [00:00:39]: So like John from like Rysana is doing his inversion models, right? I know like Clive Chen from Waterloo. When I started the data science club, he was one of the guys who were like joining in and just like hanging out in the room. And now he was at Tesla working with Karpathy, now he's at OpenAI, you know.
Swyx [00:00:56]: He's in my climbing club.
Jason [00:00:58]: Oh, hell yeah. I haven't seen him in like six years now.
Swyx [00:01:01]: To get in the social scene in San Francisco, you have to climb. So both in career and in rocks. So you started a data science club at Waterloo, we can talk about that, but then also spent five years at Stitch Fix as an MLE. You pioneered the use of OpenAI's LLMs to increase stylist efficiency. So you must have been like a very, very early user. This was like pretty early on.
Jason [00:01:20]: Yeah, I mean, this was like GPT-3, okay. So we actually were using transformers at Stitch Fix before the GPT-3 model. So we were just using transformers for recommendation systems. At that time, I was very skeptical of transformers. I was like, why do we need all this infrastructure? We can just use like matrix factorization. When GPT-2 came out, I fine tuned my own GPT-2 to write like rap lyrics and I was like, okay, this is cute. Okay, I got to go back to my real job, right? Like who cares if I can write a rap lyric? When GPT-3 came out, again, I was very much like, why are we using like a post request to review every comment a person leaves? Like we can just use classical models. So I was very against language models for like the longest time. And then when ChatGPT came out, I basically just wrote a long apology letter to everyone at the company. I was like, hey guys, you know, I was very dismissive of some of this technology. I didn't think it would scale well, and I am wrong. This is incredible. And I immediately just transitioned to go from computer vision recommendation systems to LLMs. But funny enough, now that we have RAG, we're kind of going back to recommendation systems.
Swyx [00:02:21]: Yeah, speaking of that, I think Alessio is going to bring up the next one.
Alessio [00:02:23]: Yeah, I was going to say, we had Bryan Bischof from Hex on the podcast. Did you overlap at Stitch Fix?
Jason [00:02:28]: Yeah, he was like one of my main users of the recommendation frameworks that I had built out at Stitch Fix.
Alessio [00:02:32]: Yeah, we talked a lot about RecSys, so it makes sense.
Swyx [00:02:36]: So now I have adopted that line, RAG is RecSys. And you know, if you're trying to reinvent new concepts, you should study RecSys first, because you're going to independently reinvent a lot of concepts. So your system was called Flight. It's a recommendation framework with over 80% adoption, servicing 350 million requests every day. Wasn't there something existing at Stitch Fix? Why did you have to write one from scratch?
Jason [00:02:56]: No, so I think because at Stitch Fix, a lot of the machine learning engineers and data scientists were writing production code, sort of every team's systems were very bespoke. It's like, this team only needs to do like real time recommendations with small data. So they just have like a fast API app with some like pandas code. This other team has to do a lot more data. So they have some kind of like Spark job that does some batch ETL that does a recommendation. And so what happens is each team writes their code differently. And I have to come in and refactor their code. And I was like, oh man, I'm refactoring four different code bases, four different times. Wouldn't it be better if all the code quality was my fault? Let me just write this framework, force everyone else to use it. And now one person can maintain five different systems, rather than five teams having their own bespoke system. And so it was really a need of just sort of standardizing everything. And then once you do that, you can do observability across the entire pipeline and make large sweeping improvements in this infrastructure, right? If we notice that something is slow, we can detect it on the operator layer. Just hey, hey, like this team, you guys are doing this operation is lowering our latency by like 30%. If you just optimize your Python code here, we can probably make an extra million dollars. So let's jump on a call and figure this out. And then a lot of it was doing all this observability work to figure out what the heck is going on and optimize this system from not only just a code perspective, sort of like harassingly or against saying like, we need to add caching here. We're doing duplicated work here. Let's go clean up the systems. Yep.
Swyx [00:04:22]: Got it. One more system that I'm interested in finding out more about is your similarity search system using Clip and GPT-3 embeddings and FIASS, where you saved over $50 million in annual revenue. So of course they all gave all that to you, right?
Jason [00:04:34]: No, no, no. I mean, it's not going up and down, but you know, I got a little bit, so I'm pretty happy about that. But there, you know, that was when we were doing fine tuning like ResNets to do image classification. And so a lot of it was given an image, if we could predict the different attributes we have in the merchandising and we can predict the text embeddings of the comments, then we can kind of build a image vector or image embedding that can capture both descriptions of the clothing and sales of the clothing. And then we would use these additional vectors to augment our recommendation system. And so with the recommendation system really was just around like, what are similar items? What are complimentary items? What are items that you would wear in a single outfit? And being able to say on a product page, let me show you like 15, 20 more things. And then what we found was like, hey, when you turn that on, you make a bunch of money.
Swyx [00:05:23]: Yeah. So, okay. So you didn't actually use GPT-3 embeddings. You fine tuned your own? Because I was surprised that GPT-3 worked off the shelf.
Jason [00:05:30]: Because I mean, at this point we would have 3 million pieces of inventory over like a billion interactions between users and clothes. So any kind of fine tuning would definitely outperform like some off the shelf model.
Swyx [00:05:41]: Cool. I'm about to move on from Stitch Fix, but you know, any other like fun stories from the Stitch Fix days that you want to cover?
Jason [00:05:46]: No, I think that's basically it. I mean, the biggest one really was the fact that I think for just four years, I was so bearish on language models and just NLP in general. I'm just like, none of this really works. Like, why would I spend time focusing on this? I got to go do the thing that makes money, recommendations, bounding boxes, image classification. Yeah. Now I'm like prompting an image model. I was like, oh man, I was wrong.
Swyx [00:06:06]: So my Stitch Fix question would be, you know, I think you have a bit of a drip and I don't, you know, my primary wardrobe is free startup conference t-shirts. Should more technology brothers be using Stitch Fix? What's your fashion advice?
Jason [00:06:19]: Oh man, I mean, I'm not a user of Stitch Fix, right? It's like, I enjoy going out and like touching things and putting things on and trying them on. Right. I think Stitch Fix is a place where you kind of go because you want the work offloaded. I really love the clothing I buy where I have to like, when I land in Japan, I'm doing like a 45 minute walk up a giant hill to find this weird denim shop. That's the stuff that really excites me. But I think the bigger thing that's really captured is this idea that narrative matters a lot to human beings. Okay. And I think the recommendation system, that's really hard to capture. It's easy to use AI to sell like a $20 shirt, but it's really hard for AI to sell like a $500 shirt. But people are buying $500 shirts, you know what I mean? There's definitely something that we can't really capture just yet that we probably will figure out how to in the future.
Swyx [00:07:07]: Well, it'll probably output in JSON, which is what we're going to turn to next. Then you went on a sabbatical to South Park Commons in New York, which is unusual because it's based on USF.
Jason [00:07:17]: Yeah. So basically in 2020, really, I was enjoying working a lot as I was like building a lot of stuff. This is where we were making like the tens of millions of dollars doing stuff. And then I had a hand injury. And so I really couldn't code anymore for like a year, two years. And so I kind of took sort of half of it as medical leave, the other half I became more of like a tech lead, just like making sure the systems were like lights were on. And then when I went to New York, I spent some time there and kind of just like wound down the tech work, you know, did some pottery, did some jujitsu. And after GPD came out, I was like, oh, I clearly need to figure out what is going on here because something feels very magical. I don't understand it. So I spent basically like five months just prompting and playing around with stuff. And then afterwards, it was just my startup friends going like, hey, Jason, you know, my investors want us to have an AI strategy. Can you help us out? And it just snowballed and bore more and more until I was making this my full time job. Yeah, got it.
Swyx [00:08:11]: You know, you had YouTube University and a journaling app, you know, a bunch of other explorations. But it seems like the most productive or the best known thing that came out of your time there was Instructor. Yeah.
Jason [00:08:22]: Written on the bullet train in Japan. I think at some point, you know, tools like Guardrails and Marvin came out. Those are kind of tools that I use XML and Pytantic to get structured data out. But they really were doing things sort of in the prompt. And these are built with sort of the instruct models in mind. Like I'd already done that in the past. Right. At Stitch Fix, you know, one of the things we did was we would take a request note and turn that into a JSON object that we would use to send it to our search engine. Right. So if you said like, I want to, you know, skinny jeans that were this size, that would turn into JSON that we would send to our internal search APIs. But it always felt kind of gross. A lot of it is just like you read the JSON, you like parse it, you make sure the names are strings and ages are numbers and you do all this like messy stuff. But when function calling came out, it was very much sort of a new way of doing things. Right. Function calling lets you define the schema separate from the data and the instructions. And what this meant was you can kind of have a lot more complex schemas and just map them in Pytantic. And then you can just keep those very separate. And then once you add like methods, you can add validators and all that kind of stuff. The one thing I really had with a lot of these libraries, though, was it was doing a lot of the string formatting themselves, which was fine when it was the instruction to models. You just have a string. But when you have these new chat models, you have these chat messages. And I just didn't really feel like not being able to access that for the developer was sort of a good benefit that they would get. And so I just said, let me write like the most simple SDK around the OpenAI SDK, a simple wrapper on the SDK, just handle the response model a bit and kind of think of myself more like requests than actual framework that people can use. And so the goal is like, hey, like this is something that you can use to build your own framework. But let me just do all the boring stuff that nobody really wants to do. People want to build their own frameworks, but people don't want to build like JSON parsing.
Swyx [00:10:08]: And the retrying and all that other stuff.
Jason [00:10:10]: Yeah.
Swyx [00:10:11]: Right. We had this a little bit of this discussion before the show, but like that design principle of going for being requests rather than being Django. Yeah. So what inspires you there? This has come from a lot of prior pain. Are there other open source projects that inspired your philosophy here? Yeah.
Jason [00:10:25]: I mean, I think it would be requests, right? Like, I think it is just the obvious thing you install. If you were going to go make HTTP requests in Python, you would obviously import requests. Maybe if you want to do more async work, there's like future tools, but you don't really even think about installing it. And when you do install it, you don't think of it as like, oh, this is a requests app. Right? Like, no, this is just Python. The bigger question is, like, a lot of people ask questions like, oh, why isn't requests like in the standard library? Yeah. That's how I want my library to feel, right? It's like, oh, if you're going to use the LLM SDKs, you're obviously going to install instructor. And then I think the second question would be like, oh, like, how come instructor doesn't just go into OpenAI, go into Anthropic? Like, if that's the conversation we're having, like, that's where I feel like I've succeeded. Yeah. It's like, yeah, so standard, you may as well just have it in the base libraries.
Alessio [00:11:12]: And the shape of the request stayed the same, but initially function calling was maybe equal structure outputs for a lot of people. I think now the models also support like JSON mode and some of these things and, you know, return JSON or my grandma is going to die. All of that stuff is maybe to decide how have you seen that evolution? Like maybe what's the metagame today? Should people just forget about function calling for structure outputs or when is structure output like JSON mode the best versus not? We'd love to get any thoughts given that you do this every day.
Jason [00:11:42]: Yeah, I would almost say these are like different implementations of like the real thing we care about is the fact that now we have typed responses to language models. And because we have that type response, my IDE is a little bit happier. I get autocomplete. If I'm using the response wrong, there's a little red squiggly line. Like those are the things I care about in terms of whether or not like JSON mode is better. I usually think it's almost worse unless you want to spend less money on like the prompt tokens that the function call represents, primarily because with JSON mode, you don't actually specify the schema. So sure, like JSON load works, but really, I care a lot more than just the fact that it is JSON, right? I think function calling gives you a tool to specify the fact like, okay, this is a list of objects that I want and each object has a name or an age and I want the age to be above zero and I want to make sure it's parsed correctly. That's where kind of function calling really shines.
Alessio [00:12:30]: Any thoughts on single versus parallel function calling? So I did a presentation at our AI in Action Discord channel, and obviously showcase instructor. One of the big things that we have before with single function calling is like when you're trying to extract lists, you have to make these funky like properties that are lists to then actually return all the objects. How do you see the hack being put on the developer's plate versus like more of this stuff just getting better in the model? And I know you tweeted recently about Anthropic, for example, you know, some lists are not lists or strings and there's like all of these discrepancies.
Jason [00:13:04]: I almost would prefer it if it was always a single function call. Obviously, there is like the agents workflows that, you know, Instructor doesn't really support that well, but are things that, you know, ought to be done, right? Like you could define, I think maybe like 50 or 60 different functions in a single API call. And, you know, if it was like get the weather or turn the lights on or do something else, it makes a lot of sense to have these parallel function calls. But in terms of an extraction workflow, I definitely think it's probably more helpful to have everything be a single schema, right? Just because you can sort of specify relationships between these entities that you can't do in a parallel function calling, you can have a single chain of thought before you generate a list of results. Like there's like small like API differences, right? Where if it's for parallel function calling, if you do one, like again, really, I really care about how the SDK looks and says, okay, do I always return a list of functions or do you just want to have the actual object back out and you want to have like auto complete over that object? Interesting.
Alessio [00:14:00]: What's kind of the cap for like how many function definitions you can put in where it still works well? Do you have any sense on that?
Jason [00:14:07]: I mean, for the most part, I haven't really had a need to do anything that's more than six or seven different functions. I think in the documentation, they support way more. I don't even know if there's any good evals that have over like two dozen function calls. I think if you're running into issues where you have like 20 or 50 or 60 function calls, I think you're much better having those specifications saved in a vector database and then have them be retrieved, right? So if there are 30 tools, like you should basically be like ranking them and then using the top K to do selection a little bit better rather than just like shoving like 60 functions into a single. Yeah.
Swyx [00:14:40]: Yeah. Well, I mean, so I think this is relevant now because previously I think context limits prevented you from having more than a dozen tools anyway. And now that we have million token context windows, you know, a cloud recently with their new function calling release said they can handle over 250 tools, which is insane to me. That's, that's a lot. You're saying like, you know, you don't think there's many people doing that. I think anyone with a sort of agent like platform where you have a bunch of connectors, they wouldn't run into that problem. Probably you're right that they should use a vector database and kind of rag their tools. I know Zapier has like a few thousand, like 8,000, 9,000 connectors that, you know, obviously don't fit anywhere. So yeah, I mean, I think that would be it unless you need some kind of intelligence that chains things together, which is, I think what Alessio is coming back to, right? Like there's this trend about parallel function calling. I don't know what I think about that. Anthropic's version was, I think they use multiple tools in sequence, but they're not in parallel. I haven't explored this at all. I'm just like throwing this open to you as to like, what do you think about all these new things? Yeah.
Jason [00:15:40]: It's like, you know, do we assume that all function calls could happen in any order? In which case, like we either can assume that, or we can assume that like things need to happen in some kind of sequence as a DAG, right? But if it's a DAG, really that's just like one JSON object that is the entire DAG rather than going like, okay, the order of the function that return don't matter. That's definitely just not true in practice, right? Like if I have a thing that's like turn the lights on, like unplug the power, and then like turn the toaster on or something like the order doesn't matter. And it's unclear how well you can describe the importance of that reasoning to a language model yet. I mean, I'm sure you can do it with like good enough prompting, but I just haven't any use cases where the function sequence really matters. Yeah.
Alessio [00:16:18]: To me, the most interesting thing is the models are better at picking than your ranking is usually. Like I'm incubating a company around system integration. For example, with one system, there are like 780 endpoints. And if you're actually trying to do vector similarity, it's not that good because the people that wrote the specs didn't have in mind making them like semantically apart. You know, they're kind of like, oh, create this, create this, create this. Versus when you give it to a model, like in Opus, you put them all, it's quite good at picking which ones you should actually run. And I'm curious to see if the model providers actually care about some of those workflows or if the agent companies are actually going to build very good rankers to kind of fill that gap.
Jason [00:16:58]: Yeah. My money is on the rankers because you can do those so easily, right? You could just say, well, given the embeddings of my search query and the embeddings of the description, I can just train XGBoost and just make sure that I have very high like MRR, which is like mean reciprocal rank. And so the only objective is to make sure that the tools you use are in the top end filtered. Like that feels super straightforward and you don't have to actually figure out how to fine tune a language model to do tool selection anymore. Yeah. I definitely think that's the case because for the most part, I imagine you either have like less than three tools or more than a thousand. I don't know what kind of company said, oh, thank God we only have like 185 tools and this works perfectly, right? That's right.
Alessio [00:17:39]: And before we maybe move on just from this, it was interesting to me, you retweeted this thing about Anthropic function calling and it was Joshua Brown's retweeting some benchmark that it's like, oh my God, Anthropic function calling so good. And then you retweeted it and then you tweeted it later and it's like, it's actually not that good. What's your flow? How do you actually test these things? Because obviously the benchmarks are lying, right? Because the benchmarks say it's good and you said it's bad and I trust you more than the benchmark. How do you think about that? And then how do you evolve it over time?
Jason [00:18:09]: It's mostly just client data. I actually have been mostly busy with enough client work that I haven't been able to reproduce public benchmarks. And so I can't even share some of the results in Anthropic. I would just say like in production, we have some pretty interesting schemas where it's like iteratively building lists where we're doing like updates of lists, like we're doing in place updates. So like upserts and inserts. And in those situations we're like, oh yeah, we have a bunch of different parsing errors. Numbers are being returned to strings. We were expecting lists of objects, but we're getting strings that are like the strings of JSON, right? So we had to call JSON parse on individual elements. Overall, I'm like super happy with the Anthropic models compared to the OpenAI models. Sonnet is very cost effective. Haiku is in function calling, it's actually better, but I think they just had to sort of file down the edges a little bit where like our tests pass, but then we actually deployed a production. We got half a percent of traffic having issues where if you ask for JSON, it'll try to talk to you. Or if you use function calling, you know, we'll have like a parse error. And so I think that definitely gonna be things that are fixed in like the upcoming weeks. But in terms of like the reasoning capabilities, man, it's hard to beat like 70% cost reduction, especially when you're building consumer applications, right? If you're building something for consultants or private equity, like you're charging $400, it doesn't really matter if it's a dollar or $2. But for consumer apps, it makes products viable. If you can go from four to Sonnet, you might actually be able to price it better. Yeah.
Swyx [00:19:31]: I had this chart about the ELO versus the cost of all the models. And you could put trend graphs on each of those things about like, you know, higher ELO equals higher cost, except for Haiku. Haiku kind of just broke the lines, or the ISO ELOs, if you want to call it. Cool. Before we go too far into your opinions on just the overall ecosystem, I want to make sure that we map out the surface area of Instructor. I would say that most people would be familiar with Instructor from your talks and your tweets and all that. You had the number one talk from the AI Engineer Summit.
Jason [00:20:03]: Two Liu. Jason Liu and Jerry Liu. Yeah.
Swyx [00:20:06]: Yeah. Until I actually went through your cookbook, I didn't realize the surface area. How would you categorize the use cases? You have LLM self-critique, you have knowledge graphs in here, you have PII data sanitation. How do you characterize to people what is the surface area of Instructor? Yeah.
Jason [00:20:23]: This is the part that feels crazy because really the difference is LLMs give you strings and Instructor gives you data structures. And once you get data structures, again, you can do every lead code problem you ever thought of. Right. And so I think there's a couple of really common applications. The first one obviously is extracting structured data. This is just be, okay, well, like I want to put in an image of a receipt. I want to give it back out a list of checkout items with a price and a fee and a coupon code or whatever. That's one application. Another application really is around extracting graphs out. So one of the things we found out about these language models is that not only can you define nodes, it's really good at figuring out what are nodes and what are edges. And so we have a bunch of examples where, you know, not only do I extract that, you know, this happens after that, but also like, okay, these two are dependencies of another task. And you can do, you know, extracting complex entities that have relationships. Given a story, for example, you could extract relationships of families across different characters. This can all be done by defining a graph. The last really big application really is just around query understanding. The idea is that like any API call has some schema and if you can define that schema ahead of time, you can use a language model to resolve a request into a much more complex request. One that an embedding could not do. So for example, I have a really popular post called like rag is more than embeddings. And effectively, you know, if I have a question like this, what was the latest thing that happened this week? That embeds to nothing, right? But really like that query should just be like select all data where the date time is between today and today minus seven days, right? What if I said, how did my writing change between this month and last month? Again, embeddings would do nothing. But really, if you could do like a group by over the month and a summarize, then you could again like do something much more interesting. And so this really just calls out the fact that embeddings really is kind of like the lowest hanging fruit. And using something like instructor can really help produce a data structure. And then you can just use your computer science and reason about the data structure. Maybe you say, okay, well, I'm going to produce a graph where I want to group by each month and then summarize them jointly. You can do that if you know how to define this data structure. Yeah.
Swyx [00:22:29]: So you kind of run up against like the LangChains of the world that used to have that. They still do have like the self querying, I think they used to call it when we had Harrison on in our episode. How do you see yourself interacting with the other LLM frameworks in the ecosystem? Yeah.
Jason [00:22:42]: I mean, if they use instructor, I think that's totally cool. Again, it's like, it's just Python, right? It's like asking like, oh, how does like Django interact with requests? Well, you just might make a request.get in a Django app, right? But no one would say, I like went off of Django because I'm using requests now. They should be ideally like sort of the wrong comparison in terms of especially like the agent workflows. I think the real goal for me is to go down like the LLM compiler route, which is instead of doing like a react type reasoning loop. I think my belief is that we should be using like workflows. If we do this, then we always have a request and a complete workflow. We can fine tune a model that has a better workflow. Whereas it's hard to think about like, how do you fine tune a better react loop? Yeah. You always train it to have less looping, in which case like you wanted to get the right answer the first time, in which case it was a workflow to begin with, right?
Swyx [00:23:31]: Can you define workflow? Because I used to work at a workflow company, but I'm not sure this is a good term for everybody.
Jason [00:23:36]: I'm thinking workflow in terms of like the prefect Zapier workflow. Like I want to build a DAG, I want you to tell me what the nodes and edges are. And then maybe the edges are also put in with AI. But the idea is that like, I want to be able to present you the entire plan and then ask you to fix things as I execute it, rather than going like, hey, I couldn't parse the JSON, so I'm going to try again. I couldn't parse the JSON, I'm going to try again. And then next thing you know, you spent like $2 on opening AI credits, right? Yeah. Whereas with the plan, you can just say, oh, the edge between node like X and Y does not run. Let me just iteratively try to fix that, fix the one that sticks, go on to the next component. And obviously you can get into a world where if you have enough examples of the nodes X and Y, maybe you can use like a vector database to find a good few shot examples. You can do a lot if you sort of break down the problem into that workflow and executing that workflow, rather than looping and hoping the reasoning is good enough to generate the correct output. Yeah.
Swyx [00:24:35]: You know, I've been hammering on Devon a lot. I got access a couple of weeks ago. And obviously for simple tasks, it does well. For the complicated, like more than 10, 20 hour tasks, I can see- That's a crazy comparison.
Jason [00:24:47]: We used to talk about like three, four loops. Only once it gets to like hour tasks, it's hard.
Swyx [00:24:54]: Yeah. Less than an hour, there's nothing.
Jason [00:24:57]: That's crazy.
Swyx [00:24:58]: I mean, okay. Maybe my goalposts have shifted. I don't know. That's incredible.
Jason [00:25:02]: Yeah. No, no. I'm like sub one minute executions. Like the fact that you're talking about 10 hours is incredible.
Swyx [00:25:08]: I think it's a spectrum. I think I'm going to say this every single time I bring up Devon. Let's not reward them for taking longer to do things. Do you know what I mean? I think that's a metric that is easily abusable.
Jason [00:25:18]: Sure. Yeah. You know what I mean? But I think if you can monotonically increase the success probability over an hour, that's winning to me. Right? Like obviously if you run an hour and you've made no progress. Like I think when we were in like auto GBT land, there was that one example where it's like, I wanted it to like buy me a bicycle overnight. I spent $7 on credit and I never found the bicycle. Yeah.
Swyx [00:25:41]: Yeah. Right. I wonder if you'll be able to purchase a bicycle. Because it actually can do things in real world. It just needs to suspend to you for off and stuff. The point I was trying to make was that I can see it turning plans. I think one of the agents loopholes or one of the things that is a real barrier for agents is LLMs really like to get stuck into a lane. And you know what you're talking about, what I've seen Devon do is it gets stuck in a lane and it will just kind of change plans based on the performance of the plan itself. And it's kind of cool.
Jason [00:26:05]: I feel like we've gone too much in the looping route and I think a lot of more plans and like DAGs and data structures are probably going to come back to help fill in some holes. Yeah.
Alessio [00:26:14]: What do you think of the interface to that? Do you see it's like an existing state machine kind of thing that connects to the LLMs, the traditional DAG players? Do you think we need something new for like AI DAGs?
Jason [00:26:25]: Yeah. I mean, I think that the hard part is going to be describing visually the fact that this DAG can also change over time and it should still be allowed to be fuzzy. I think in like mathematics, we have like plate diagrams and like Markov chain diagrams and like recurrent states and all that. Some of that might come into this workflow world. But to be honest, I'm not too sure. I think right now, the first steps are just how do we take this DAG idea and break it down to modular components that we can like prompt better, have few shot examples for and ultimately like fine tune against. But in terms of even the UI, it's hard to say what it will likely win. I think, you know, people like Prefect and Zapier have a pretty good shot at doing a good job.
Swyx [00:27:03]: Yeah. You seem to use Prefect a lot. I actually worked at a Prefect competitor at Temporal and I'm also very familiar with Dagster. What else would you call out as like particularly interesting in the AI engineering stack?
Jason [00:27:13]: Man, I almost use nothing. I just use Cursor and like PyTests. Okay. I think that's basically it. You know, a lot of the observability companies have... The more observability companies I've tried, the more I just use Postgres.
Swyx [00:27:29]: Really? Okay. Postgres for observability?
Jason [00:27:32]: But the issue really is the fact that these observability companies isn't actually doing observability for the system. It's just doing the LLM thing. Like I still end up using like Datadog or like, you know, Sentry to do like latency. And so I just have those systems handle it. And then the like prompt in, prompt out, latency, token costs. I just put that in like a Postgres table now.
Swyx [00:27:51]: So you don't need like 20 funded startups building LLM ops? Yeah.
Jason [00:27:55]: But I'm also like an old, tired guy. You know what I mean? Like I think because of my background, it's like, yeah, like the Python stuff, I'll write myself. But you know, I will also just use Vercel happily. Yeah. Yeah. So I'm not really into that world of tooling, whereas I think, you know, I spent three good years building observability tools for recommendation systems. And I was like, oh, compared to that, Instructor is just one call. I just have to put time star, time and then count the prompt token, right? Because I'm not doing a very complex looping behavior. I'm doing mostly workflows and extraction. Yeah.
Swyx [00:28:26]: I mean, while we're on this topic, we'll just kind of get this out of the way. You famously have decided to not be a venture backed company. You want to do the consulting route. The obvious route for someone as successful as Instructor is like, oh, here's hosted Instructor with all tooling. Yeah. You just said you had a whole bunch of experience building observability tooling. You have the perfect background to do this and you're not.
Jason [00:28:43]: Yeah. Isn't that sick? I think that's sick.
Swyx [00:28:44]: I mean, I know why, because you want to go free dive.
Jason [00:28:47]: Yeah. Yeah. Because I think there's two things. Right. Well, one, if I tell myself I want to build requests, requests is not a venture backed startup. Right. I mean, one could argue whether or not Postman is, but I think for the most part, it's like having worked so much, I'm more interested in looking at how systems are being applied and just having access to the most interesting data. And I think I can do that more through a consulting business where I can come in and go, oh, you want to build perfect memory. You want to build an agent. You want to build like automations over construction or like insurance and supply chain, or like you want to handle writing private equity, mergers and acquisitions reports based off of user interviews. Those things are super fun. Whereas like maintaining the library, I think is mostly just kind of like a utility that I try to keep up, especially because if it's not venture backed, I have no reason to sort of go down the route of like trying to get a thousand integrations. In my mind, I just go like, okay, 98% of the people use open AI. I'll support that. And if someone contributes another platform, that's great. I'll merge it in. Yeah.
Swyx [00:29:45]: I mean, you only added Anthropic support this year. Yeah.
Jason [00:29:47]: Yeah. You couldn't even get an API key until like this year, right? That's true. Okay. If I add it like last year, I was trying to like double the code base to service, you know, half a percent of all downloads.
Swyx [00:29:58]: Do you think the market share will shift a lot now that Anthropic has like a very, very competitive offering?
Jason [00:30:02]: I think it's still hard to get API access. I don't know if it's fully GA now, if it's GA, if you can get a commercial access really easily.
Alessio [00:30:12]: I got commercial after like two weeks to reach out to their sales team.
Jason [00:30:14]: Okay.
Alessio [00:30:15]: Yeah.
Swyx [00:30:16]: Two weeks. It's not too bad. There's a call list here. And then anytime you run into rate limits, just like ping one of the Anthropic staff members.
Jason [00:30:21]: Yeah. Then maybe we need to like cut that part out. So I don't need to like, you know, spread false news.
Swyx [00:30:25]: No, it's cool. It's cool.
Jason [00:30:26]: But it's a common question. Yeah. Surely just from the price perspective, it's going to make a lot of sense. Like if you are a business, you should totally consider like Sonnet, right? Like the cost savings is just going to justify it if you actually are doing things at volume. And yeah, I think the SDK is like pretty good. Back to the instructor thing. I just don't think it's a billion dollar company. And I think if I raise money, the first question is going to be like, how are you going to get a billion dollar company? And I would just go like, man, like if I make a million dollars as a consultant, I'm super happy. I'm like more than ecstatic. I can have like a small staff of like three people. It's fun. And I think a lot of my happiest founder friends are those who like raised a tiny seed round, became profitable. They're making like 70, 60, 70, like MRR, 70,000 MRR and they're like, we don't even need to raise the seed round. Let's just keep it like between me and my co-founder, we'll go traveling and it'll be a great time. I think it's a lot of fun.
Alessio [00:31:15]: Yeah. like say LLMs / AI and they build some open source stuff and it's like I should just raise money and do this and I tell people a lot it's like look you can make a lot more money doing something else than doing a startup like most people that do a company could make a lot more money just working somewhere else than the company itself do you have any advice for folks that are maybe in a similar situation they're trying to decide oh should I stay in my like high paid FAANG job and just tweet this on the side and do this on github should I go be a consultant like being a consultant seems like a lot of work so you got to talk to all these people you know there's a lot to unpack
Jason [00:31:54]: I think the open source thing is just like well I'm just doing it purely for fun and I'm doing it because I think I'm right but part of being right is the fact that it's not a venture backed startup like I think I'm right because this is all you need right so I think a part of the philosophy is the fact that all you need is a very sharp blade to sort of do your work and you don't actually need to build like a big enterprise so that's one thing I think the other thing too that I've kind of been thinking around just because I have a lot of friends at google that want to leave right now it's like man like what we lack is not money or skill like what we lack is courage you should like you just have to do this a hard thing and you have to do it scared anyways right in terms of like whether or not you do want to do a founder I think that's just a matter of optionality but I definitely recognize that the like expected value of being a founder is still quite low it is right I know as many founder breakups and as I know friends who raised a seed round this year right like that is like the reality and like you know even in from that perspective it's been tough where it's like oh man like a lot of incubators want you to have co-founders now you spend half the time like fundraising and then trying to like meet co-founders and find co-founders rather than building the thing this is a lot of time spent out doing uh things I'm not really good at. I do think there's a rising trend in solo founding yeah.
Swyx [00:33:06]: You know I am a solo I think that something like 30 percent of like I forget what the exact status something like 30 percent of starters that make it to like series B or something actually are solo founder I feel like this must have co-founder idea mostly comes from YC and most everyone else copies it and then plenty of companies break up over co-founder
Jason [00:33:27]: Yeah and I bet it would be like I wonder how much of it is the people who don't have that much like and I hope this is not a diss to anybody but it's like you sort of you go through the incubator route because you don't have like the social equity you would need is just sort of like send an email to Sequoia and be like hey I'm going on this ride you want a ticket on the rocket ship right like that's very hard to sell my message if I was to raise money is like you've seen my twitter my life is sick I've decided to make it much worse by being a founder because this is something I have to do so do you want to come along otherwise I want to fund it myself like if I can't say that like I don't need the money because I can like handle payroll and like hire an intern and get an assistant like that's all fine but I really don't want to go back to meta I want to like get two years to like try to find a problem we're solving that feels like a bad time
Alessio [00:34:12]: Yeah Jason is like I wear a YSL jacket on stage at AI Engineer Summit I don't need your accelerator money
Jason [00:34:18]: And boots, you don't forget the boots. But I think that is a part of it right I think it is just like optionality and also just like I'm a lot older now I think 22 year old Jason would have been probably too scared and now I'm like too wise but I think it's a matter of like oh if you raise money you have to have a plan of spending it and I'm just not that creative with spending that much money yeah I mean to be clear you just celebrated your 30th birthday happy birthday yeah it's awesome so next week a lot older is relative to some some of the folks I think seeing on the career tips
Alessio [00:34:48]: I think Swix had a great post about are you too old to get into AI I saw one of your tweets in January 23 you applied to like Figma, Notion, Cohere, Anthropic and all of them rejected you because you didn't have enough LLM experience I think at that time it would be easy for a lot of people to say oh I kind of missed the boat you know I'm too late not gonna make it you know any advice for people that feel like that
Jason [00:35:14]: Like the biggest learning here is actually from a lot of folks in jiu-jitsu they're like oh man like is it too late to start jiu-jitsu like I'll join jiu-jitsu once I get in more shape right it's like there's a lot of like excuses and then you say oh like why should I start now I'll be like 45 by the time I'm any good and say well you'll be 45 anyways like time is passing like if you don't start now you start tomorrow you're just like one more day behind if you're worried about being behind like today is like the soonest you can start right and so you got to recognize that like maybe you just don't want it and that's fine too like if you wanted you would have started I think a lot of these people again probably think of things on a too short time horizon but again you know you're gonna be old anyways you may as well just start now you know
Swyx [00:35:55]: One more thing on I guess the um career advice slash sort of vlogging you always go viral for this post that you wrote on advice to young people and the lies you tell yourself oh yeah yeah you said you were writing it for your sister.
Jason [00:36:05]: She was like bummed out about going to college and like stressing about jobs and I was like oh and I really want to hear okay and I just kind of like text-to-sweep the whole thing it's crazy it's got like 50,000 views like I'm mind I mean your average tweet has more but that thing is like a 30-minute read now
Swyx [00:36:26]: So there's lots of stuff here which I agree with I you know I'm also of occasionally indulge in the sort of life reflection phase there's the how to be lucky there's the how to have high agency I feel like the agency thing is always a trend in sf or just in tech circles how do you define having high agency
Jason [00:36:42]: I'm almost like past the high agency phase now now my biggest concern is like okay the agency is just like the norm of the vector what also matters is the direction right it's like how pure is the shot yeah I mean I think agency is just a matter of like having courage and doing the thing that's scary right you know if people want to go rock climbing it's like do you decide you want to go rock climbing then you show up to the gym you rent some shoes and you just fall 40 times or do you go like oh like I'm actually more intelligent let me go research the kind of shoes that I want okay like there's flatter shoes and more inclined shoes like which one should I get okay let me go order the shoes on Amazon I'll come back in three days like oh it's a little bit too tight maybe it's too aggressive I'm only a beginner let me go change no I think the higher agent person just like goes and like falls down 20 times right yeah I think the higher agency person is more focused on like process metrics versus outcome metrics right like from pottery like one thing I learned was if you want to be good at pottery you shouldn't count like the number of cups or bowls you make you should just weigh the amount of clay you use right like the successful person says oh I went through 100 pounds of clay right the less agency was like oh I've made six cups and then after I made six cups like there's not really what are you what do you do next no just pounds of clay pounds of clay same with the work here right so you just got to write the tweets like make the commits contribute open source like write the documentation there's no real outcome it's just a process and if you love that process you just get really good at the thing you're doing
Swyx [00:38:04]: yeah so just to push back on this because obviously I mostly agree how would you design performance review systems because you were effectively saying we can count lines of code for developers right
Jason [00:38:15]: I don't think that would be the actual like I think if you make that an outcome like I can just expand a for loop right I think okay so for performance review this is interesting because I've mostly thought of it from the perspective of science and not engineering I've been running a lot of engineering stand-ups primarily because there's not really that many machine learning folks the process outcome is like experiments and ideas right like if you think about outcome is what you might want to think about an outcome is oh I want to improve the revenue or whatnot but that's really hard but if you're someone who is going out like okay like this week I want to come up with like three or four experiments I might move the needle okay nothing worked to them they might think oh nothing worked like I suck but to me it's like wow you've closed off all these other possible avenues for like research like you're gonna get to the place that you're gonna figure out that direction really soon there's no way you try 30 different things and none of them work usually like 10 of them work five of them work really well two of them work really really well and one thing was like the nail in the head so agency lets you sort of capture the volume of experiments and like experience lets you figure out like oh that other half it's not worth doing right I think experience is going like half these prompting papers don't make any sense just use chain of thought and just you know use a for loop that's basically right it's like usually performance for me is around like how many experiments are you running how oftentimes are you trying.
Alessio [00:39:32]: When do you give up on an experiment because a StitchFix you kind of give up on language models I guess in a way as a tool to use and then maybe the tools got better you were right at the time and then the tool improved I think there are similar paths in my engineering career where I try one approach and at the time it doesn't work and then the thing changes but then I kind of soured on that approach and I don't go back to it soon
Jason [00:39:51]: I see yeah how do you think about that loop so usually when I'm coaching folks and as they say like oh these things don't work I'm not going to pursue them in the future like one of the big things like hey the negative result is a result and this is something worth documenting like this is an academia like if it's negative you don't just like not publish right but then like what do you actually write down like what you should write down is like here are the conditions this is the inputs and the outputs we tried the experiment on and then one thing that's really valuable is basically writing down under what conditions would I revisit these experiments these things don't work because of what we had at the time if someone is reading this two years from now under what conditions will we try again that's really hard but again that's like another skill you kind of learn right it's like you do go back and you do experiments you figure out why it works now I think a lot of it here is just like scaling worked yeah rap lyrics you know that was because I did not have high enough quality data if we phase shift and say okay you don't even need training data oh great then it might just work a different domain
Alessio [00:40:48]: Do you have anything in your list that is like it doesn't work now but I want to try it again later? Something that people should maybe keep in mind you know people always like agi when you know when are you going to know the agi is here maybe it's less than that but any stuff that you tried recently that didn't work that
Jason [00:41:01]: You think will get there I mean I think the personal assistance and the writing I've shown to myself it's just not good enough yet so I hired a writer and I hired a personal assistant so now I'm gonna basically like work with these people until I figure out like what I can actually like automate and what are like the reproducible steps but like I think the experiment for me is like I'm gonna go pay a person like thousand dollars a month that helped me improve my life and then let me get them to help me figure like what are the components and how do I actually modularize something to get it to work because it's not just like a lot gmail calendar and like notion it's a little bit more complicated than that but we just don't know what that is yet those are two sort of systems that I wish gb4 or opus was actually good enough to just write me an essay but most of the essays are still pretty bad
Swyx [00:41:44]: yeah I would say you know on the personal assistance side Lindy is probably the one I've seen the most flow was at a speaker at the summit I don't know if you've checked it out or any other sort of agents assistant startup
Jason [00:41:54]: Not recently I haven't tried lindy they were not ga last time I was considering it yeah yeah a lot of it now it's like oh like really what I want you to do is take a look at all of my meetings and like write like a really good weekly summary email for my clients to remind them that I'm like you know thinking of them and like working for them right or it's like I want you to notice that like my monday is like way too packed and like block out more time and also like email the people to do the reschedule and then try to opt in to move them around and then I want you to say oh jason should have like a 15 minute prep break after form back to back those are things that now I know I can prompt them in but can it do it well like before I didn't even know that's what I wanted to prompt for us defragging a calendar and adding break so I can like eat lunch yeah that's the AGI test yeah exactly compassion right I think one thing that yeah we didn't touch on it before but
Alessio [00:42:44]: I think was interesting you had this tweet a while ago about prompts should be code and then there were a lot of companies trying to build prompt engineering tooling kind of trying to turn the prompt into a more structured thing what's your thought today now you want to turn the thinking into DAGs like do prompts should still be code any updated ideas
Jason [00:43:04]: It's the same thing right I think you know with Instructor it is very much like the output model is defined as a code object that code object is sent to the LLM and in return you get a data structure so the outputs of these models I think should also be code objects and the inputs somewhat should be code objects but I think the one thing that instructor tries to do is separate instruction data and the types of the output and beyond that I really just think that most of it should be still like managed pretty closely to the developer like so much of is changing that if you give control of these systems away too early you end up ultimately wanting them back like many companies I know that I reach out or ones were like oh we're going off of the frameworks because now that we know what the business outcomes we're trying to optimize for these frameworks don't work yeah because we do rag but we want to do rag to like sell you supplements or to have you like schedule the fitness appointment the prompts are kind of too baked into the systems to really pull them back out and like start doing upselling or something it's really funny but a lot of it ends up being like once you understand the business outcomes you care way more about the prompt
Swyx [00:44:07]: Actually this is fun in our prep for this call we were trying to say like what can you as an independent person say that maybe me and Alessio cannot say or me you know someone at a company say what do you think is the market share of the frameworks the LangChain, the LlamaIndex, the everything...
Jason [00:44:20]: Oh massive because not everyone wants to care about the code yeah right I think that's a different question to like what is the business model and are they going to be like massively profitable businesses right making hundreds of millions of dollars that feels like so straightforward right because not everyone is a prompt engineer like there's so much productivity to be captured in like back office optim automations right it's not because they care about the prompts that they care about managing these things yeah but those would be sort of low code experiences you yeah I think the bigger challenge is like okay hundred million dollars probably pretty easy it's just time and effort and they have the manpower and the money to sort of solve those problems again if you go the vc route then it's like you're talking about billions and that's really the goal that stuff for me it's like pretty unclear but again that is to say that like I sort of am building things for developers who want to use infrastructure to build their own tooling in terms of the amount of developers there are in the world versus downstream consumers of these things or even just think of how many companies will use like the adobes and the ibms right because they want something that's fully managed and they want something that they know will work and if the incremental 10% requires you to hire another team of 20 people you might not want to do it and I think that kind of organization is really good for uh those are bigger companies
Swyx [00:45:32]: I just want to capture your thoughts on one more thing which is you said you wanted most of the prompts to stay close to the developer and Hamel Husain wrote this post which I really love called f you show me the prompt yeah I think he cites you in one of those part of the blog post and I think ds pi is kind of like the complete antithesis of that which is I think it's interesting because I also hold the strong view that AI is a better prompt engineer than you are and I don't know how to square that wondering if you have thoughts
Jason [00:45:58]: I think something like DSPy can work because there are like very short-term metrics to measure success right it is like did you find the pii or like did you write the multi-hop question the correct way but in these workflows that I've been managing a lot of it are we minimizing churn and maximizing retention yeah that's a very long loop it's not really like a uptuna like training loop right like those things are much more harder to capture so we don't actually have those metrics for that right and obviously we can figure out like okay is the summary good but like how do you measure the quality of the summary it's like that feedback loop it ends up being a lot longer and then again when something changes it's really hard to make sure that it works across these like newer models or again like changes to work for the current process like when we migrate from like anthropic to open ai like there's just a ton of change that are like infrastructure related not necessarily around the prompt itself yeah cool any other ai engineering startups that you think should not exist before we wrap up i mean oh my gosh i mean a lot of it again it's just like every time of investors like how does this make a billion dollars like it doesn't i'm gonna go back to just like tweeting and holding my breath underwater yeah like i don't really pay attention too much to most of this like most of the stuff i'm doing is around like the consumer of like llm calls yep i think people just want to move really fast and they will end up pick these vendors but i don't really know if anything has really like blown me out the water like i only trust myself but that's also a function of just being an old man like i think you know many companies are definitely very happy with using most of these tools anyways but i definitely think i occupy a very small space in the engineering ecosystem.
Swyx [00:47:41]: Yeah i would say one of the challenges here you know you call about the dealing in the consumer of llm's space i think that's what ai engineering differs from ml engineering and i think a constant disconnect or cognitive dissonance in this field in the ai engineers that have sprung up is that they are not as good as the ml engineers they are not as qualified i think that you know you are someone who has credibility in the mle space and you are also a very authoritative figure in the ai space and i think so and you know i think you've built the de facto leading library i think yours i think instructors should be part of the standard lib even though i try to not use it like i basically also end up rebuilding instructor right like that's a lot of the back and forth that we had over the past two days i think that's the fundamental thing that we're trying to figure out like there's very small supply of MLEs not everyone's going to have that experience that you had but the global demand for AI is going to far outstrip the existing MLEs.
Jason [00:48:36]: So what do we do do we force everyone to go through the standard MLE curriculum or do we make a new one? I've got some takes go i think a lot of these app layer startups should not be hiring MLEs because they end up churning yeah they want to work at opening high they're just like hey guys i joined and you have no data and like all i did this week was take some typescript build errors and like figure out why we don't have any tests and like what is this framework x and y like how do you measure success what are your business outcomes oh no okay let's not focus on that great i'll focus on these typescript build errors and then you're just like what am i doing and then you kind of sort of feel really frustrated and i already recognize that because i've made offers to machine learning engineers they've joined and they've left in like two months and the response is like yeah i think i'm gonna join a research lab so i think it's not even that like i don't even think you should be hiring these mles on the other hand what i also see a lot of is the really motivated engineer that's doing more engineering is not being allowed to actually like fully pursue the ai engineering so they're the guy who built the demo it got traction now it's working but they're still being pulled back to figure out why google calendar integrations are not working or like how to make sure that you know the button is loading on the page and so i'm sort of like in a very interesting position where the companies want to hire an ml they don't need to hire but they won't let the excited people who've caught the ai engineering bug could go do that work more full-time this is something i'm literally wrestling with this week as i just wrote something about it this is one of the things i'm probably going to be recommending in the future is really thinking about like where is the talent coming from how much of it is internal and do you really need to hire someone who's like writing pytorch code yeah exactly most of the time you're not you're gonna need someone to write instructor code and like i feel goofy all the time just like prompting it's like oh man like i wish i just had a target data set that i could like train a model against yes and i can just say it's right or wrong yeah.
Swyx [00:50:32]: You know i guess what Latent Space is, what the AI Engineer world's fair is is that we're trying to create and elevate this industry of ai engineers where it's legitimate to actually take these motivated software engineers who want to build more in ai and do creative things in ai to actually say you have the blessing like and this is legitimate sub-specialty of software engineering
Jason [00:50:50]: Yeah i think there's been a mix of that product engineering i think a lot more data science is going to come in versus machine learning engineering because a lot of it now is just quantifying like what does the business actually want as an outcome the outcome is not rag app yeah the outcome is like reduced churn people need to figure out what that actually is and how to measure it yeah all the data engineering tools still apply
Swyx [00:51:09]: bi layers semantic layers whatever yeah cool we'll have you back again for the world's fair we don't know what you're going to talk about but i'm sure it's going to be amazing you're a very polished speaker
Jason [00:51:19]: The title is written it's just uh Pydantic is still all you need
Swyx [00:51:26]: I'm worried about having too many all you need titles because that's obviously very trendy so yeah you have one of them but i need to keep a lid on like you know everyone's saying their
Jason [00:51:34]: thing is all you need but yeah we'll figure it out i think it's not my thing it's someone else
Swyx [00:51:38]: i think that's why it works it's true cool well it's a real pleasure to have you on of course everyone should go follow you on twitter and check out instructor there's also instructor js which i'm very happy to see.
Get full access to Latent Space at www.latent.space/subscribe
+
+[by:whisper.cpp]
+
+[00:00.00]Hey everyone, welcome to the Late In Space podcast.
+
+[00:09.30]This is Alessio, partner in CTO & Residence at Decibel Partners.
+
+[00:13.00]And I'm joined by Makoho's Swicks, founder of SmallAI.
+
+[00:16.00]Hello, we're back in the remote studio with Jason Liu from Instructor.
+
+[00:20.00]Welcome, Jason.
+
+[00:21.00]Hey there, thanks for having me.
+
+[00:23.00]Jason, you are extremely famous.
+
+[00:25.00]So I don't know what I'm going to do introducing you.
+
+[00:28.00]You're one of the Waterloo clan.
+
+[00:30.00]There's like a small cadre of you that's just completely dominating machine learning.
+
+[00:34.00]Actually, can you list like Waterloo alums that you're like, you know are just dominating and questioning it right now.
+
+[00:39.00]So like, John from like RaiSana is doing his inversion models, right?
+
+[00:45.00]I know like Clive Chen from Waterloo.
+
+[00:48.00]When I started the data science club, he was one of the guys we're like joining in and just like hanging out in the room.
+
+[00:52.00]And now he was at Tesla working with Carpathian, now he's at OpenAI.
+
+[00:56.00]Yeah, he's in my climbing club.
+
+[00:58.00]Oh hell yeah.
+
+[00:59.00]Yeah, I haven't seen him in like six years now.
+
+[01:01.00]To get in the social scene in San Francisco, you have to climb.
+
+[01:04.00]So both in career and in rocks.
+
+[01:07.00]So you started the data science club in Waterloo.
+
+[01:09.00]We can talk about that.
+
+[01:10.00]But then also spent five years at Stitchfix as an MLE.
+
+[01:13.00]You pioneered the use of OpenAI's LLMs to increase stylish efficiency.
+
+[01:17.00]So you must have been like a very, very early user.
+
+[01:19.00]This was like pretty early on.
+
+[01:21.00]Yeah, I mean this was like GPT-3.
+
+[01:24.00]Okay, so we actually were using transformers at Stitchfix before the GPT-3 model.
+
+[01:29.00]So we were just using transformers recommendation systems.
+
+[01:31.00]At that time, I was very skeptical of transformers.
+
+[01:34.00]I was like, why do we need all this infrastructure?
+
+[01:36.00]We can just use like matrix factorization.
+
+[01:38.00]When GPT-2 came out, I fine-tuned my own GPT-2 to write like rap lyrics.
+
+[01:42.00]And I was like, okay, this is cute.
+
+[01:43.00]Okay, I got to go back to my real job, right?
+
+[01:45.00]Like, who cares if I can write a rap lyric?
+
+[01:47.00]When GPT-3 came out, again, I was very much like,
+
+[01:50.00]why are we using like a post request to review every comment a person leaves?
+
+[01:54.00]Like, we can just use classical models.
+
+[01:56.00]So I was very against language models for like the longest time.
+
+[01:59.00]And then when chat GPT came out,
+
+[02:01.00]I basically just wrote a long apology letter to everyone at the company.
+
+[02:04.00]I was like, hey guys, you know, I was very dismissive of some of the technology.
+
+[02:07.00]I didn't think you would scale well and I am wrong.
+
+[02:10.00]This is incredible.
+
+[02:11.00]And I immediately just transitioned to go from computer vision,
+
+[02:14.00]recommendation systems to LLMs.
+
+[02:16.00]But funny enough, now that we have RAG,
+
+[02:18.00]we're kind of going back to recommendation systems.
+
+[02:21.00]Speaking of that, I think Alessio is going to bring up the next one.
+
+[02:23.00]Yeah, I was going to say we had Brian Bishop from X on the podcast to overlap Stitch Fix.
+
+[02:28.00]Yeah, he was like one of my main users of the recommendation framework
+
+[02:31.00]that I had built out at Stitch Fix.
+
+[02:33.00]Yeah, we talked a lot about REXIS.
+
+[02:35.00]So it makes sense.
+
+[02:36.00]So now I have adopted that line, RAG is REXIS.
+
+[02:39.00]And you know, if you're trying to reinvent new concepts,
+
+[02:42.00]you should study REXIS first,
+
+[02:43.00]because you're going to independently reinvent a lot of concepts.
+
+[02:45.00]So your system was called flight.
+
+[02:47.00]It's a recommendation framework with over 80% adoption
+
+[02:50.00]servicing 350 million requests every day.
+
+[02:52.00]Wasn't there something existing at Stitch Fix?
+
+[02:54.00]Why did you have to write one from scratch?
+
+[02:56.00]No, so I think because at Stitch Fix,
+
+[02:59.00]a lot of the machine learning engineers and data scientists
+
+[03:01.00] were writing production code.
+
+[03:03.00]So every team's systems were very bespoke.
+
+[03:06.00]It's like this team only needs to do like real-time recommendations
+
+[03:09.00]with small data.
+
+[03:10.00]So they just have like a fast API app with some like pandas code.
+
+[03:13.00]This other team has to do a lot more data.
+
+[03:15.00]So they have some kind of like spark job that does some batch ETL
+
+[03:18.00]that does a recommendation,right?
+
+[03:20.00]And so what happens is each team writes their code differently.
+
+[03:23.00]And I have to come in and like refactor their code.
+
+[03:25.00]And I was like, oh man,I'm refactoring four different code bases
+
+[03:28.00]four different times.
+
+[03:29.00]Wouldn't it be better if all the code quality was my fault?
+
+[03:32.00]Alright,let me just write this framework for everyone else to use it.
+
+[03:35.00]And now one person can maintain five different systems
+
+[03:38.00]rather than five teams having their own bespoke system.
+
+[03:41.00]And so it was really a need of just sort of standardizing everything.
+
+[03:44.00]And then once you do that,you can do observability
+
+[03:47.00]across the entire pipeline and make large sweeping improvements
+
+[03:50.00]in this infrastructure,right?
+
+[03:52.00]If we notice that something is slow,we can detect it on the operator layer.
+
+[03:56.00]Just hey,hey,like this team,you guys are doing this operation
+
+[03:59.00]it's lowering our latency by like 30%.
+
+[04:01.00]If you just optimize your python code here,we can probably
+
+[04:05.00]make an extra million dollars.So let's jump on a call
+
+[04:07.00]and forget this out.And then a lot of it was doing
+
+[04:09.00]all this observability work to figure out what the heck is going on
+
+[04:12.00]and how to optimize this system from not only just a code perspective.
+
+[04:15.00]So like harassingly Oregon saying like we need to add cash in here.
+
+[04:18.00]We're doing duplicated work here.Let's go clean up the systems.
+
+[04:21.00]Yeah,got it.One more system that I'm interested in finding out more about
+
+[04:25.00]is your similarity search system using Clip and GPT-3
+
+[04:29.00]embedding in FICE,where you said over $50 million in annual revenue.
+
+[04:33.00]So of course they all gave all that to you,right?
+
+[04:35.00]No,no,no.I mean,it's not going up and down,but you know,
+
+[04:38.00]I got a little bit,so I'm pretty happy about that.
+
+[04:40.00]But there,you know,that was when we were doing fine tuning like
+
+[04:44.00]resnets to do image classification.
+
+[04:46.00]And so a lot of it was given an image if we could predict
+
+[04:50.00]the different attributes we have in the merchandising
+
+[04:52.00]and we can predict the index embeddings of the comments
+
+[04:55.00]then we can kind of build a image vector or image embedding
+
+[04:59.00]that can capture both descriptions of the clothing and sales of the clothing.
+
+[05:03.00]And then we would use these additional vectors
+
+[05:05.00]to augment our recommendation system.
+
+[05:07.00]And so with the recommendation system really was just around
+
+[05:10.00]like what are similar items,what are complementary items,
+
+[05:12.00]what are items that you would wear and a single outfit
+
+[05:15.00]and being able to say on a product page,let me show you
+
+[05:18.00]like 15,20 more things.
+
+[05:20.00]And then what we found was like,hey,when you turn that on
+
+[05:22.00]you make a bunch of money.
+
+[05:23.00]Yeah,so okay,so you didn't actually use GPT-3embeddings
+
+[05:26.00]you fine tuned your own,because I was surprised
+
+[05:28.00]that GPT-3 worked off the shelf.
+
+[05:30.00]Okay,because I mean,at this point we would have
+
+[05:32.00]3 million pieces of inventory over like a billion interactions
+
+[05:35.00]and users and clothes
+
+[05:37.00]any kind of fine-taining would definitely up the form
+
+[05:39.00]like some off the shelf model.
+
+[05:41.00]Cool,I'm about to move on from Stitch Fix
+
+[05:43.00]but,you know,any other like fun stories from the Stitch Fix
+
+[05:45.00]that you want to cover?
+
+[05:46.00]No,I think that's basically it.
+
+[05:48.00]I mean,the biggest one really was the fact that
+
+[05:50.00]I think for just four years I was so bearish on language models
+
+[05:53.00]and just NLP in general,I was just like,none of this really works.
+
+[05:55.00]Like,why would I spend time focusing on this?
+
+[05:57.00]I gotta go do the things that makes money.
+
+[05:59.00]Recommendations,bounding boxes,image customization.
+
+[06:02.00]Yeah,now I'm like prompting an image model.
+
+[06:04.00]Oh,man,I was wrong.
+
+[06:05.00]So,my Stitch Fix question would be,you know,
+
+[06:08.00]I think you have a bit of a drip and I don't.
+
+[06:10.00]You know,my primary wardrobe is free start-up
+
+[06:12.00]conference t-shirts.
+
+[06:14.00]Should more technology brothers be using Stitch Fix?
+
+[06:17.00]What's your fashion advice?
+
+[06:20.00]Oh,man,I mean,I'm not a user of Stitch Fix,right?
+
+[06:23.00]It's like,I enjoy going out and like touching things
+
+[06:27.00]and putting things on and trying them on,right?
+
+[06:29.00]I think Stitch Fix is a place where you kind of go
+
+[06:31.00]because you want the work offloaded.
+
+[06:33.00]I really love the clothing I buy
+
+[06:35.00]where I have to like,when I land in Japan
+
+[06:37.00]I'm doing like a 45-minute walk
+
+[06:39.00]up a giant hill to find this weird denim shop.
+
+[06:42.00]That's the stuff that really excites me.
+
+[06:44.00]But,I think the bigger thing that really captures
+
+[06:46.00]is this idea that narrative matters a lot
+
+[06:48.00]to human beings,okay?
+
+[06:50.00]And I think the recommendation system,that's really hard to capture.
+
+[06:53.00]It's easy to use AI to sell like a $20 shirt,
+
+[06:56.00]but it's really hard for AI to sell like a $500 shirt.
+
+[06:59.00]But people are buying $500 shirts,you know what I mean?
+
+[07:01.00]Like,there's definitely something that we can't really capture
+
+[07:04.00]just yet that we probably will figure out how to
+
+[07:07.00]in the future.
+
+[07:08.00]Well,he'll probably output in JSON,which is
+
+[07:10.00]what you're going to turn to next.
+
+[07:12.00]Then you went on a sabbatical to South Park Commons
+
+[07:14.00]in New York,which is unusual
+
+[07:16.00]because it's based on NSF.
+
+[07:18.00]Yeah,so,basically in 2020,really,
+
+[07:20.00]I was enjoying working a lot
+
+[07:22.00]and so I was like building a lot of stuff.
+
+[07:24.00]This is where we were making like the tens of millions of dollars
+
+[07:26.00]doing stuffand then I had a hand injury
+
+[07:28.00]and so I really couldn't code anymore
+
+[07:30.00]for a year or two years.
+
+[07:32.00]And so I kind of took sort of half of it as medical leave.
+
+[07:34.00]The other half I became more of like a tech lead
+
+[07:36.00]just like making for the systems or like lights were on.
+
+[07:39.00]And then when I went to New York,
+
+[07:41.00]I spent some time there and kind of just like wound down
+
+[07:44.00]the tech work,you know,did some pottery,did some jiu-jitsu
+
+[07:47.00]and after GBD came out,I was like,
+
+[07:49.00]Oh,I clearly need to figure out what is going on here
+
+[07:52.00]because something feels very magical.
+
+[07:54.00]I don't understand it.
+
+[07:56.00]So I spent basically like five months just prompting
+
+[07:58.00]and playing around with stuff.
+
+[07:59.00]And then afterwards it was just my starter friends
+
+[08:01.00]going like,Hey Jason,you know,
+
+[08:03.00]my investors want us to have an AI strategy.
+
+[08:05.00]Can you help us out?
+
+[08:06.00]And it's just snowballed and born more
+
+[08:08.00]and become until I was making this my full-time job.
+
+[08:10.00]Yeah,got it.
+
+[08:11.00]You know,you had YouTube University
+
+[08:13.00]and a journaling app,you know,a bunch of other explorations,
+
+[08:16.00]but it seems like the most productive
+
+[08:18.00]or the best-known thing that came out of your time
+
+[08:20.00]there was Instructor.
+
+[08:21.00]Yeah,written on the bullet train in Japan.
+
+[08:23.00]Tell us the origin story.
+
+[08:24.00]Yeah,I mean,I think at some point,
+
+[08:27.00]you know,tools like guardrails and Marvin came out,
+
+[08:29.00]right?
+
+[08:30.00]Those are kind of tools that like use XML
+
+[08:32.00]and Python to get structure data out,
+
+[08:33.00]but they really were doing things
+
+[08:35.00]sort of in the prompt and these were built
+
+[08:37.00]with sort of the instruct models in mind.
+
+[08:39.00]Like,I'd already done that in the past,right?
+
+[08:41.00]Stitchfix,you know,one of the things we did was
+
+[08:43.00]we wouldn't take every crest note
+
+[08:45.00]and turn that into a JSON object
+
+[08:47.00]that we would use to send to our search engine,right?
+
+[08:49.00]So if you said like,I want to,you know,
+
+[08:51.00]skinny jeans that were this size,
+
+[08:53.00]that would turn into JSON that we would send
+
+[08:55.00]to our internal search APIs.
+
+[08:56.00]But it always felt kind of gross.
+
+[08:58.00]A lot of it is just like you read the JSON,
+
+[09:00.00]you like parse it,
+
+[09:01.00]you make sure the names are strings
+
+[09:02.00]and ages are numbers
+
+[09:03.00]and you do all this messy stuff.
+
+[09:04.00]But when function calling came out,
+
+[09:06.00]it was very much sort of a new way of doing things,right?
+
+[09:09.00]Function calling lets you define the schema
+
+[09:11.00]separate from the data and the instructions.
+
+[09:13.00]And what this meant was
+
+[09:15.00]you can kind of have a lot more complex schemas
+
+[09:17.00]and just map them in pidantic
+
+[09:19.00]and then you can just keep those very separate.
+
+[09:21.00]And then once you add like methods,
+
+[09:22.00]you can add validators and all that kind of stuff.
+
+[09:24.00]The one thing I really had with a lot of these libraries,
+
+[09:26.00]though,was it was doing a lot of the string formatting themselves,
+
+[09:29.00]which was fine when it was the instruction tune models,
+
+[09:32.00]you just have a string.
+
+[09:33.00]But when you have these new chat models,
+
+[09:35.00]you have these chat messages,
+
+[09:37.00]and I just didn't really feel like
+
+[09:38.00]not being able to access that for the developer
+
+[09:40.00]was sort of a good benefit that they would get.
+
+[09:43.00]And so I just said,let me write like the most
+
+[09:45.00]simple SDK around the OpenAI SDK,
+
+[09:48.00]so simple wrapper on the SDK,
+
+[09:50.00]just handle the response model a bit
+
+[09:52.00]and kind of think of myself more like requests
+
+[09:55.00]than an actual framework that people can use.
+
+[09:57.00]And so the girl's like,
+
+[09:58.00]Hey,like this is something that you can use
+
+[09:59.00]to build your own framework.
+
+[10:00.00]But let me just do all the boring stuff
+
+[10:02.00]that nobody really wants to do.
+
+[10:03.00]People want to build their own frameworks,
+
+[10:05.00]but people don't want to build like JSON parsing.
+
+[10:08.00]And the retrying and all that other stuff.
+
+[10:10.00]Yeah,right.
+
+[10:11.00]We had this little build this discussion before the show,
+
+[10:13.00]but like that design principle
+
+[10:14.00]of going forward being requests
+
+[10:16.00]rather than being Django.
+
+[10:17.00]Yeah.
+
+[10:18.00]What inspires you there?
+
+[10:19.00]This has come from a lot of prior pain.
+
+[10:21.00]Are there other open source projects
+
+[10:23.00]that inspired your philosophy here?
+
+[10:25.00]Yeah,I mean,I think it would be requests,right?
+
+[10:27.00]Like I think it is just the obvious thing you install.
+
+[10:30.00]If you were going to go
+
+[10:31.00]make like HTTP requests in Python,
+
+[10:33.00]you would obviously import requests.
+
+[10:35.00]Maybe if you want to do more async work,
+
+[10:37.00]there's like future tools,
+
+[10:38.00]but you don't really even think about installing it.
+
+[10:40.00]And when you do install it,
+
+[10:41.00]you don't think of it as like,
+
+[10:42.00]this is a requests app,right?
+
+[10:44.00]Like,no,this is just Python.
+
+[10:46.00]Like the bigger question is like,
+
+[10:48.00]a lot of people ask questions like,
+
+[10:49.00]oh,why isn't requests like in the standard library?
+
+[10:52.00]That's how I want my library to feel,right?
+
+[10:54.00]It's like,oh,if you're going to use the LLM SDKs,
+
+[10:57.00]you're obviously going to install instructor.
+
+[10:59.00]And then I think the second question would be like,
+
+[11:01.00]oh,how come instructor doesn't just go into
+
+[11:03.00]open AI,go into Anthropic?
+
+[11:05.00]Like,if that's the conversation we're having,
+
+[11:06.00]like that's where I feel like I've succeeded.
+
+[11:08.00]Yeah,it's like,yeah,yeah.
+
+[11:09.00]So standard,you may as well just have in the base libraries.
+
+[11:12.00]And the shape of the request
+
+[11:14.00]stayed the same,but initially
+
+[11:16.00]function calling was maybe
+
+[11:17.00]equal structure outputs for a lot of people.
+
+[11:19.00]I think now the models also
+
+[11:21.00]support like JSON mode and
+
+[11:23.00]some of these things and,you know,
+
+[11:25.00]return JSON of my grandma is going to die.
+
+[11:27.00]All of that stuff is maybe to the side.
+
+[11:29.00]How have you seen that evolution?
+
+[11:30.00]Like,maybe what's the meta game today?
+
+[11:32.00]Like,should people just forget about
+
+[11:33.00]function calling for structure outputs
+
+[11:35.00]or where,when is structure output,
+
+[11:37.00]like JSON mode the best versus not.
+
+[11:39.00]We'd love to get any thoughts given that you do this every day.
+
+[11:42.00]Yeah,I would almost say these are like
+
+[11:44.00]different implementations of like
+
+[11:46.00]the real thing we care about is the fact that now we have
+
+[11:48.00]type responses to language models.
+
+[11:50.00]And because we have the type response,
+
+[11:52.00]my ID is a little bit happier.
+
+[11:53.00]I get autocomplete.
+
+[11:54.00]If I'm using the response wrong,
+
+[11:56.00]there's a little red squiggly line.
+
+[11:57.00]Like,those are the things I care about.
+
+[11:59.00]In terms of whether or not like
+
+[12:00.00]JSON mode is better,
+
+[12:01.00]I usually think it's almost worse
+
+[12:03.00]unless you want to spend less money
+
+[12:05.00]on like the prompt tokens
+
+[12:07.00]that the function call represents.
+
+[12:08.00]Primarily because with JSON mode,
+
+[12:10.00]you don't actually specify the schema.
+
+[12:11.00]So sure,like JSON load works,
+
+[12:13.00]but really I care a lot more than just
+
+[12:15.00]specify that it is JSON,right?
+
+[12:17.00]I think function calling gives you a tool to
+
+[12:19.00]specify the fact like,ok,this is a list
+
+[12:21.00]of objects that I want and each object
+
+[12:23.00]has a name or an age and I want the age
+
+[12:25.00]to be above zero and I want to make sure
+
+[12:27.00]it's parsed correctly.That's where
+
+[12:28.00]kind of function calling really shines.
+
+[12:30.00]I need thoughts on single versus
+
+[12:32.00]parallel function calling.
+
+[12:34.00]So I did a presentation at our
+
+[12:36.00]AI inaction discord channel
+
+[12:38.00]and obviously Shogays instructor.
+
+[12:41.00]One of the big things that we have before
+
+[12:43.00]is single function calling.It's like
+
+[12:44.00]when you're trying to extract lists,
+
+[12:46.00]you have to make these funky like properties
+
+[12:48.00]that are list to then actually return
+
+[12:50.00]all the objects.How do you see
+
+[12:52.00]the hack being put on the developer's
+
+[12:54.00]plate versus like more of this stuff
+
+[12:56.00]just getting better in the model.
+
+[12:58.00]And I know you tweeted recently about
+
+[13:00.00]entropic,for example,you know,some
+
+[13:02.00]less they're not list or strings and
+
+[13:03.00]there's like all of these discrepancies.
+
+[13:05.00]I almost would prefer that there was
+
+[13:07.00]always a single function call,but
+
+[13:09.00]obviously there is like the agent's
+
+[13:10.00]workflows that,you know,instructor
+
+[13:12.00] can really support that well,but
+
+[13:14.00]are things that,you know,ot to be done.
+
+[13:16.00]Like you could define,I think maybe
+
+[13:18.00]like 50 or 60 different functions
+
+[13:20.00]in a single API call.
+
+[13:22.00]And,you know,if it was like get the
+
+[13:24.00]weather or turn the lights on or do
+
+[13:26.00]something else,it makes a lot of sense
+
+[13:28.00]to have theseparallel function calls,but
+
+[13:30.00]in terms of an extraction workflow,I
+
+[13:32.00]definitely think it's probably more
+
+[13:34.00]helpful to have everything be a
+
+[13:36.00]single schema.Just because you can
+
+[13:38.00]specify relationships between these
+
+[13:40.00]single chain of thought before you
+
+[13:42.00]generate a list of results,like
+
+[13:44.00]there's like small,like,API
+
+[13:46.00]differences,right,whereif it's
+
+[13:48.00]parallel function calling,if you do one,like
+
+[13:50.00]again,really,I really care about how
+
+[13:52.00]the SDK looks,and says,okay,do I
+
+[13:54.00]always return a list of functions,or do
+
+[13:56.00]you just want to have the actual
+
+[13:58.00]object back out,and you want to have
+
+[14:00.00]autocomplete over that object.Interesting.What's
+
+[14:02.00]kind of the cap for,like,how many
+
+[14:04.00]function definitions you can put in
+
+[14:06.00]where it still works well?Do you have
+
+[14:08.00]anything that doesn't really need to do
+
+[14:10.00]anything that's more than six or seven
+
+[14:12.00]different functions?I think in the
+
+[14:14.00]documentation,they support way more.I
+
+[14:16.00]don't even know if there's any good
+
+[14:18.00]evalves that have,you know,over like two
+
+[14:20.00]dozen function calls.I think if you're
+
+[14:22.00]riding into issues where you have,like,20
+
+[14:24.00]or 50 or 60function calls,I think
+
+[14:26.00]you're much better having those
+
+[14:28.00]specifications saved in a vector
+
+[14:30.00]database,and then have them be retrieved,right.So
+
+[14:32.00]if there are 30tools,like,you should
+
+[14:34.00]basically be,like,ranking them,and then
+
+[14:36.00]other than just,like,shubbing,like,60
+
+[14:38.00]function into a single.Yeah.
+
+[14:40.00]Well,I mean,so,I think this is
+
+[14:42.00]relevant now,because previously,I
+
+[14:44.00]think context limits prevented you
+
+[14:46.00]from having more than a dozen tools
+
+[14:48.00]anyway.And now that we
+
+[14:50.00]have million-token context windows,you
+
+[14:52.00]know,a cloud recently with their new
+
+[14:54.00]function calling release said they can
+
+[14:56.00]handle over 250tools,which is
+
+[14:58.00]insane to me.That's a lot.You're
+
+[15:00.00]saying,like,you know,you don't think
+
+[15:02.00]there's many people doing that.I think
+
+[15:04.00]a sort of agent-like platform where you
+
+[15:06.00]have a bunch of connectors,they wouldn't
+
+[15:08.00]run into that problem.Probably,you're
+
+[15:10.00]right,that they should use a vector
+
+[15:12.00]database and kind of rag their tools.I
+
+[15:14.00]know Zapier has,like,a few thousand,like,8,000,9,000
+
+[15:16.00]connectors that,you know,obviously don't fit
+
+[15:18.00]anywhere.So,yeah,I mean,that,I
+
+[15:20.00]think that would be it,unless you need
+
+[15:22.00]some kind of intelligence thatchains
+
+[15:24.00]things together,which is,I think
+
+[15:26.00]what Alessio is coming back to,right.Like,there's
+
+[15:28.00]this trend aboutparallelfunction
+
+[15:30.00]calling.I don't know what I think about
+
+[15:32.00]multiple tools insequence,but they're not
+
+[15:34.00]inparallel.I haven't explored this atall.I'm
+
+[15:36.00]just,like,throwing this open to you.So,like,what
+
+[15:38.00]doyou think aboutall these newthings.Yeah,it's
+
+[15:40.00]like,you know,do we assume thatall
+
+[15:42.00]functioncalls could happen in anyorder?In
+
+[15:44.00]which case,like,we eithercannassume that
+
+[15:46.00]or wecannassume that,like,things need to
+
+[15:48.00]happen in some kind ofsequence as a dag,right.But if
+
+[15:50.00]it's a dag,really,that's just,like,one json
+
+[15:52.00]object that is the entire dag,ratherthan
+
+[15:54.00]going,like,okay,theorder of thefunction
+
+[15:56.00]thatreturn don't matter.That's definitely
+
+[15:58.00]just not true inpractice,right.Like,ifI have,I
+
+[16:00.00] can do something that's,like,turn the lights on,like,unplug
+
+[16:02.00] the power,then,like,turn the toaster on,or
+
+[16:04.00] something,like,theorder doesn't matter.And
+
+[16:06.00]it's unclear how well you can describe
+
+[16:08.00]the importance of that reasoning to a
+
+[16:10.00]language model yet.I mean,I'm sure
+
+[16:12.00]you can do it with,like,good enough prompting.But
+
+[16:14.00]I just haven't any use case with a function
+
+[16:16.00]sequence,really matters.Yeah,to me,the most
+
+[16:18.00]interesting thing is,the models are better
+
+[16:21.00]atpicking than your ranking is usually.Like,I
+
+[16:24.00] mean,competing a company around system
+
+[16:26.00]integration.And,for example,with one system,there
+
+[16:29.00] are,like,700,80 endpoints.And
+
+[16:31.00]ifyou actually try and do vector
+
+[16:33.00]similarity,it's not that good,because
+
+[16:35.00]the people that wrote the specs didn't
+
+[16:37.00]have a mind,making them,like,semantically
+
+[16:39.00] apart.You know,they're kind of like,oh,create
+
+[16:41.00] this,create this,create this,versus
+
+[16:43.00]when you give it to a model,like,in Opus
+
+[16:45.00]you put them all,it's quite good at picking
+
+[16:47.00]which ones you should actually run.And
+
+[16:49.00]I'm curious to see if the model providers
+
+[16:51.00]actually care about some of those
+
+[16:53.00]worthless,or if the agent company is
+
+[16:55.00]actually gonna build very good rankers to
+
+[16:57.00]kind of fill that gap.Yeah,my money is on
+
+[16:59.00]the rankers,becauseyou can do those so
+
+[17:01.00]easily,right?You could just say,well,given
+
+[17:03.00]the embeddings ofmy search query and
+
+[17:05.00]the embeddings ofthe description,I
+
+[17:07.00]can just train XGBoost and just make
+
+[17:09.00]sure that I have very high,like,MRR,which
+
+[17:11.00]is,like,mean reciprocal rank.And
+
+[17:13.00]so,the only objective is to makesure
+
+[17:15.00]that the tools you use are in the top
+
+[17:17.00]and filtered.Like,that feels super
+
+[17:19.00]straightforward,and you don't have to
+
+[17:21.00]actually figure out how to fine tune a
+
+[17:23.00]language model to do tool selection anymore.Yeah,I
+
+[17:25.00]amagine you either have,like,less than 3
+
+[17:27.00]tools or more than a thousand.I
+
+[17:29.00]don't know what kind of companies
+
+[17:31.00]that,oh,thank god we only have,like,185
+
+[17:33.00]tools.And this works perfectly,right?
+
+[17:35.00]That's right.And before we maybe move on
+
+[17:37.00]just from this,it was interesting to me
+
+[17:39.00]you retweeted this thing about entropic
+
+[17:41.00]function calling,and it was Joshua
+
+[17:43.00]Brown's retweeting some benchmark that
+
+[17:45.00]is,like,oh,my god,entropic function
+
+[17:47.00]calling so good.And then you retweeted
+
+[17:49.00]and then you tweeted it later,and it's
+
+[17:51.00]like,it's so good.And
+
+[17:53.00]what's your flow?How do you
+
+[17:55.00]actually test these things?Because
+
+[17:57.00]obviously the benchmarks are lying,right?Because
+
+[17:59.00]the benchmarks say it's good and you said
+
+[18:01.00]it's bad and I trust you more than the
+
+[18:03.00]benchmark.How do you think about
+
+[18:05.00]that?And then how do you evolve it
+
+[18:07.00]over time?It's mostly just client data.I
+
+[18:09.00]actually have been mostly busy with
+
+[18:11.00]enough client work that I haven't been
+
+[18:13.00]able to reproduce public benchmarks.And
+
+[18:15.00]so I can't even share some of the results
+
+[18:17.00]entropic.But I would just say,like,in
+
+[18:19.00]production,we have some pretty
+
+[18:21.00]interesting schemas,where it's like
+
+[18:23.00]initatively building lists where
+
+[18:25.00]we're doing,like,updates of lists.Like,we're
+
+[18:27.00]doing in-place updates.So,like,upserts
+
+[18:29.00]and inserts.And in those
+
+[18:31.00]situations,we're like,oh,yeah,we have a bunch
+
+[18:33.00]of different parsing errors.Numbers are being
+
+[18:35.00]returned to strings.We were expecting lists
+
+[18:37.00]of objects,but we're getting strings that
+
+[18:39.00]are,like,the strings of JSON,right?So we
+
+[18:41.00]had to call JSON parse on individual
+
+[18:43.00]elements.Overall,I'm,like,super happy
+
+[18:45.00]with theentropic models compared
+
+[18:47.00]to the openAM models.Sonnet is very
+
+[18:49.00]cost-effective.Hikoo is,in function
+
+[18:51.00]calling,it's actually better.But I think they just
+
+[18:53.00]had to sort of file down the edges a little
+
+[18:55.00] bit,where,like,our tests pass,but
+
+[18:57.00]then we actually deployed a production,we get,you
+
+[18:59.00] know,half a percent of traffic
+
+[19:01.00]having issues,where,if you ask for JSON,it'll
+
+[19:03.00] try to talk to you,or if you use function
+
+[19:05.00]calling,you know,we'll have,like,a parse error.And
+
+[19:07.00]so I think that definitely going to be things
+
+[19:09.00]that are fixed in,like,the upcoming weeks.But
+
+[19:11.00] in terms of,like,the reasoning capabilities,I
+
+[19:13.00] mean,it's hard to beat,like,70%
+
+[19:15.00]cost reduction,especially when you're building
+
+[19:17.00] consumer applications,right?If you're building
+
+[19:19.00] something for,like,consultants or private equity,like,you're charging
+
+[19:21.00] $400,it doesn't really matter if it's $1
+
+[19:23.00] or $2.But for consumer apps,it
+
+[19:25.00] makes products viable.If you can go
+
+[19:27.00] from 4 to Sonnet,you might actually be able
+
+[19:29.00] to price it better.Yeah.
+
+[19:31.00]I had this chart about the ELO
+
+[19:33.00] versus the cost of all the
+
+[19:35.00] models.And,you could
+
+[19:37.00] put trend graphs on each of those things
+
+[19:39.00] about,like,you know,higher ELO equals
+
+[19:41.00] higher cost,except for Hikoo.Hikoo kind of just broke
+
+[19:43.00] the lines,or the ISO ELOs,if you want
+
+[19:45.00] to kind of call it.Cool.Before we
+
+[19:47.00] go too far into,you know,your opinions on
+
+[19:49.00] just the overall ecosystem,I want to
+
+[19:51.00] make sure that we map out the surface area
+
+[19:53.00] of Instructure.I would say that
+
+[19:55.00] most people would be familiar with
+
+[19:57.00] Instructure from your talks and your tweets
+
+[19:59.00] and all that.You had the number one
+
+[20:01.00] talk at from the AI Engineering Summit
+
+[20:03.00]Jason Liu and Jerry Liu.Yeah,yeah,yeah,yeah.You have to
+
+[20:07.00] start with Jay and Italy to do well.But
+
+[20:09.00] yeah,until I actually went through your
+
+[20:11.00] cookbook,I didn't realize,like,the surface area.Like,how
+
+[20:13.00] would you categorize the use cases,right?You have
+
+[20:15.00]LLM self-critique,you have knowledge
+
+[20:17.00] graphs in here,you have PII data
+
+[20:19.00] sanitation.How do you characterize to people
+
+[20:21.00] like,what is the surface area of Instructure?Yeah,so
+
+[20:23.00] I mean,this is the part that feels crazy
+
+[20:25.00] because really,the difference is
+
+[20:27.00]LLMs give you strings and Instructure gives
+
+[20:29.00] you data structures.And once you get data structures
+
+[20:31.00] again,you can do every,like,lead code problem
+
+[20:33.00] you ever thought of,right?And so,I think
+
+[20:35.00] there's a couple of really common applications.The
+
+[20:37.00] first one obviously is extracting
+
+[20:39.00]Structure data.This is just be,okay,well,like,I want to
+
+[20:42.00] put in an image of a receipt,I want to
+
+[20:44.00] give it back out a list of checkout items
+
+[20:46.00] with a price and a fee and a coupon code
+
+[20:48.00] or whatever.That's one application.Another
+
+[20:50.00] application really is around
+
+[20:52.00] extracting graphs out.So,one of the things
+
+[20:54.00] we found out about these language models
+
+[20:56.00] is that not only can you define nodes,it's
+
+[20:58.00] really good at figuring out what are nodes
+
+[21:00.00] and what are edges.And so,we have a bunch
+
+[21:02.00] of examples where,you know,not only do I
+
+[21:04.00] extract that,you know,this happens
+
+[21:06.00] after that,but also,like,okay,these two
+
+[21:08.00] are dependencies of another task.And you
+
+[21:11.00] can do,you know,extracting complex
+
+[21:13.00] entities that have relationships.Given
+
+[21:15.00] a story,for example,you can extract
+
+[21:17.00] relationships of families across different
+
+[21:19.00] characters.This is going to be done by
+
+[21:21.00] defining a graph.The last really big
+
+[21:23.00] application really is just around query
+
+[21:25.00] understanding.The idea is that,like,any
+
+[21:27.00] API call has some schema and if you can
+
+[21:29.00] define that schema ahead of time,you can
+
+[21:31.00] use a language model to resolve a request
+
+[21:33.00] into a much more complex request.One
+
+[21:35.00] that an embedding could not do.So,for
+
+[21:37.00] example,I have a really popular post called,like,
+
+[21:39.00]ragg is more than embeddings and
+
+[21:41.00] effectively,you know,if I have a question
+
+[21:43.00] like this,what was the latest thing that
+
+[21:45.00] happened this week?That embeds to nothing,right?
+
+[21:47.00]But really,like,that query should just
+
+[21:49.00] be,like,select all data where the
+
+[21:51.00] date time is between today and today
+
+[21:53.00] minus seven days,right?
+
+[21:55.00]What if I said,how did my writing
+
+[21:57.00] change between this month and last month?
+
+[21:59.00]Again,embeddings would do nothing.
+
+[22:01.00]But really,if you could do,like,a group
+
+[22:03.00] by over the month and a summarize,then
+
+[22:05.00] you could,again,like,do something much
+
+[22:07.00] interesting.And so,this really just calls out the fact
+
+[22:09.00] that embeddings really is kind of,like,the
+
+[22:11.00] lowest hanging fruit.And using something
+
+[22:13.00] like instructor can really help produce a
+
+[22:15.00] data structure.And then you can just
+
+[22:17.00] use your computer science and reason about this data
+
+[22:19.00] structure.Maybe you say,okay,well,I want to produce
+
+[22:21.00] a graph where I want to group by each month
+
+[22:23.00] and then summarize them jointly.You
+
+[22:25.00] can do that if you know how to define this
+
+[22:27.00] data structure.Yeah.In that part,you
+
+[22:29.00] kind of run up against,like,the
+
+[22:31.00]lang chains of the world that used to have
+
+[22:33.00] that.They still do have,like,the
+
+[22:35.00] self-querying.I think they used to call it
+
+[22:37.00] when we had Harrison on in our episode.How do you
+
+[22:39.00] see yourself interacting with the other LLM
+
+[22:41.00] frameworks in the ecosystem?Yeah.I mean,if
+
+[22:43.00] they use instructor,I think that's totally cool.Again,it's,like,it's just
+
+[22:46.00] Python,right?It's,like,asking,like,oh,how does,like,Jango
+
+[22:49.00] interact with requests?Well,you just might make
+
+[22:51.00] a request.get in a Jango app,right?But no one
+
+[22:54.00] would say.I,like,went off of Django because I'm using requests now.They
+
+[22:57.00] should be ideally,like,sort of the wrong
+
+[22:59.00] comparison.In terms of it,especially,like,the agent
+
+[23:02.00] workflows,I think the real goal for me is to go down,like,the LLM
+
+[23:04.00] compiler route,which is,instead of doing,like,a
+
+[23:07.00]react type reasoning loop,I think
+
+[23:10.00] my belief is that we should be using,like,workflows.If we
+
+[23:13.00] do this,then we always have a request and a complete
+
+[23:16.00] workflow.We can fine tune a model that has a better
+
+[23:18.00] workflow,whereas it's hard to think about,like,how do you fine tune
+
+[23:21.00] a better react loop?Yeah.Do you want to always train it to
+
+[23:24.00] have less looping,in which case,like,you wanted to get the right
+
+[23:27.00] answer the first time,in which case,it was a workflow
+
+[23:29.00] to begin with,right?Right.Can you define workflow
+
+[23:31.00] because I used to work at a workflow company,but I'm not
+
+[23:34.00] sure this is a good term for everybody.Oh,yeah,like,I'm
+
+[23:36.00] thinking workflow in terms of,like,the prefect to zap
+
+[23:39.00] your workflow.Yeah.Like,I want to build a DAG.I want you
+
+[23:42.00] to tell me what the nodes and edges are,and then maybe the
+
+[23:45.00] edges are also put in with AI.But the idea is that,like,I
+
+[23:48.00] want to be able to present you the entire plan,and then
+
+[23:51.00] ask you to fix things as I execute it,rather than going,like,Hey,I
+
+[23:55.00] couldn't parse the JSON,so I'm going to try again.I couldn't
+
+[23:58.00] parse the JSON,like,I'm going to try again.And then
+
+[24:00.00] next,you know,you spent,like,$2 on opening AI
+
+[24:02.00] credits,right?Yeah.As well as with the plan,you can just
+
+[24:05.00] say,Oh,the edge between node,like,X and Y does not
+
+[24:09.00] run.Let me just iteratively try to fix that
+
+[24:12.00] component.Fix the one that sticks,go on the next component.
+
+[24:14.00]And obviously,you can get into a world where if you have
+
+[24:17.00] enough examples of the nodes X and Y,maybe you can use,like,a
+
+[24:20.00] vector database to find a good few shot examples.You can do a
+
+[24:24.00] lot if you sort of break down the problem into that workflow
+
+[24:27.00] and execute that workflow,rather than looping and hoping
+
+[24:30.00] the reasoning is good enough to generate the
+
+[24:33.00] correct output.Yeah.You know,I've been hammering
+
+[24:36.00] on Devin a lot.I got access a couple of weeks ago.
+
+[24:39.00]And obviously,for a simple task,it does well.For the
+
+[24:42.00] complicated,like,more than 10,20-hour tasks,I can
+
+[24:45.00] see it in a lot of times.That's a crazy comparison,like,we
+
+[24:47.00] use to type out,like,3,4 loops.You're like,only once
+
+[24:51.00] it gets to,like,our tasks,it's hard.Yeah.Less than an hour,it's
+
+[24:55.00] nothing.That's crazy.I mean,okay.Maybe my
+
+[24:59.00] goal force has shifted.I don't know.That's incredible.Yeah.No,like,I'm
+
+[25:03.00] like,I'm like sub one minute executions.Like,the fact
+
+[25:06.00] that you're talking about 10 hours is incredible.I think
+
+[25:09.00] it's a spectrum.I think I'm going to say this every
+
+[25:11.00] single time I bring up Devin.Like,let's not reward them
+
+[25:13.00] for taking longer to do things.Do you know what I mean?Like,that's
+
+[25:16.00] a metric that is easily abusable.Sure.Yeah.You can definitely
+
+[25:19.00] you know what I mean.But I think it's like,if you can
+
+[25:22.00]monotonically increase the success probability over an
+
+[25:26.00] hour,like,that's winning to me,right?Like,obviously,if you run an hour and you've
+
+[25:30.00] made no progress,like,I think when we were in,like,auto-GBT land,there was that one
+
+[25:34.00] example where it's like,I wanted it to,like,buy me a bicycle over
+
+[25:37.00] night.I spent $700 and I never found the bicycle.Yeah.Yeah.Right.I
+
+[25:41.00] wonder if you'll be able to purchase a bicycle.Because it actually can do
+
+[25:44.00] things in real world,it just needs to suspend to you for often stuff.The point I was
+
+[25:48.00] trying to make was that I can see it turning plants.I think one of the agents
+
+[25:51.00]loop holes,or one of the things that is a real barrier for agents is LLMs really
+
+[25:55.00] like to get stuck into a lane.And,you know,what you're talking about,what I've
+
+[25:58.00] seen Devin do,is it gets stuck in a lane and he would just kind of change
+
+[26:01.00] plans based on the performance of the plan itself.And it's kind of cool.I
+
+[26:05.00] feel like we've gone too much in the looping route,and I think a lot of more
+
+[26:08.00] plans and,like,dags and data structures are probably going to come back to help
+
+[26:12.00] fill in some holes.Yeah.And what's like the interface to that?You know,you see
+
+[26:16.00] it's like an existing,like,state machine kind of thing that,like,connects to the LLMs.
+
+[26:20.00]The traditional DAC player.So,like,do you think we need something new for,like,a.i.dags?
+
+[26:25.00]Yeah.I mean,I think that the hard part is going to be describing visually the fact
+
+[26:30.00] that this DAC can also change over time,when it should still be allowed to be fuzzy.
+
+[26:34.00]I think in,like,mathematics,we have,like,plate diagrams,and,like,marked
+
+[26:37.00] octane diagrams,and,like,you know,recurrent states,and all that,like that.Some of
+
+[26:40.00] that might come into this,like,workflow world,but to be honest,I'm not too sure.
+
+[26:44.00]I think right now,the first steps are just how do we take this DAC idea and break
+
+[26:48.00] it down to modular components that we can,like,prompt better,have few shot examples
+
+[26:52.00] for,and ultimately,like,fine tune against.But in terms of even the UI,it's hard
+
+[26:56.00] to say what,well,likely would.I think,you know,people like Prefect and Zapier
+
+[27:00.00] have a bit of good shot at doing a good job.Yeah.You seem to use Prefect a lot.
+
+[27:04.00]I actually work there at Prefect competitor at Temporal,and I'm also very familiar
+
+[27:07.00] with Daxter.What else would you call out as,like,particularly interesting
+
+[27:11.00] in the AI engineering stack?Man,I almost use nothing.Like,I just use
+
+[27:16.00]everything.And,like,py tests.Like,okay.I think that's basically it.You know,a lot
+
+[27:22.00] of the observability companies have,the more observability companies
+
+[27:26.00] I've tried,the more I just use Postgres.Really?Okay.Postgres for observability?
+
+[27:32.00]But,like,the issue really is the fact that these observability companies isn't
+
+[27:35.00] actually doing observability for the system.It's just doing the LLM thing.
+
+[27:39.00]Like,I still end up using,like,Datadoc,or,like,you know,Sentry to do,like,latency.
+
+[27:43.00]And,so,I just have those system handle it.And,then,the,like,prompt in,prompt out,like,in C token costs.
+
+[27:49.00]I just put that in,like,a Postgres table now.So,you don't need,like,20 funded startups
+
+[27:53.00]building LLM ops?Yeah,but I'm also,like,an old-tired guy,you know what I mean?
+
+[27:58.00]Like,I think,because of my background,it's like,yeah,the Python stuff,I'll write
+
+[28:01.00] myself.But,you know,I will also just use Versal happily.Yeah,yeah.Because I'm just
+
+[28:05.00] not familiar with that world of tooling.Whereas,like,I think,you know,I spent
+
+[28:09.00]3 good years building observability tools for recommendation systems.And,I
+
+[28:13.00] was like,oh,compared to that,instructor is just one call.I just have to put
+
+[28:17.00]time,start,time,end,then count the prompt token,right?Because I'm not doing a very
+
+[28:21.00]complex looping behavior.I'm doing mostly workflows and extraction.Yeah.
+
+[28:25.00]I mean,while we're on this topic,we'll just kind of get this out of the way.You
+
+[28:28.00]famously have decided to not be a venture-backed company.You want to do the consulting route.
+
+[28:32.00]Oh,yes.The obvious route for,you know,someone as successful as instructor is
+
+[28:35.00]like,oh,here's hosted instructor with,like,all tooling.Yeah.You just said you had a whole
+
+[28:38.00] bunch of experience building observability tooling,like,you have the perfect
+
+[28:41.00] background to do this,and you're not.Yeah.Isn't that sick?I think it's sick.I
+
+[28:44.00] mean,I know why,because you want to go free-dive.Yeah.Yeah.Because I think there's
+
+[28:48.00] two things,right?Look,one,if I tell myself I want to build requests,request is not
+
+[28:53.00] a venture-backed startup,right?I mean,one could argue,like,whether a postman
+
+[28:57.00] is,but I think for the most part,it's like,having worked so much,and more
+
+[29:01.00] interested in looking at how systems are being applied,and just having access to
+
+[29:05.00]the most interesting data,and I think I can do that more through a consulting
+
+[29:08.00] business where I can come in and create,go,oh,you want to build perfect memory,you
+
+[29:11.00] want to build an agent,you want to build,like,automations over construction,or,like,
+
+[29:15.00] insurance,and it's a supply chain,or,like,you want to handle writing private equity
+
+[29:20.00] mergers and acquisitions reports based off of user interviews,like,those things are
+
+[29:23.00] super fun.Whereas,like,maintaining the library,I think is mostly just kind of
+
+[29:28.00] like a utility that I try to keep up,especially because if it's not venture-backed,I have
+
+[29:32.00] no reason to sort of go down the route of,like,trying to get a thousand integrations
+
+[29:36.00] in my mind,I just go,like,oh,okay,98% of the people use OpenAI,I'll
+
+[29:40.00] support that,and if someone contributes another platform,that's great,I'll
+
+[29:43.00] merge it in.Yeah,I mean,you only addedenthropic support this year.
+
+[29:47.00]Yeah,yeah.You couldn't even get an API key until,like,this year,right?
+
+[29:51.00]That's true.And so,ok,if I add it,like,last year,I was trying to,like,double the
+
+[29:55.00] code base to service,you know,half a percent of all downloads.Do you think the
+
+[29:58.00]market share will shift a lot,now that Anthropic has,like,a very,very competitive offering?
+
+[30:02.00]I think it's still hard to get API access.I don't know if it's fully GA now,if
+
+[30:08.00] it's GA,if you can get commercial access really easily.I got commercial after,like,two weeks
+
+[30:13.00] to reach out to their sales team.Two weeks.Yeah,so,it's not too bad.There's a call list here,and
+
+[30:17.00] then anytime you run into rate limits,just,like,ping one of the Anthropic staff members.
+
+[30:21.00]Yeah,then maybe we can,like,cut that part out,so I don't need to,like,you know.No,it's cool,it's cool.
+
+[30:24.00]Fall it's news,but it's a common question.Surely,just from the price perspective,it's gonna
+
+[30:28.00] make a lot of sense.If you are a business,you should totally consider Sonnet.The cost savings
+
+[30:34.00] is just gonna justify it,if you actually are doing things at volume.And yeah,I think the
+
+[30:38.00] SDK is,like,pretty good.But to back to the instructor thing,I just don't think
+
+[30:41.00] it's a billion-dollar company.And I think if I raise money,the first question
+
+[30:44.00] is gonna be,like,how are you gonna get a billion-dollar company?And I would just go,like,man,like,if I
+
+[30:48.00] make a million dollars as a consultant,I'm super happy.I'm,like,more than a static.
+
+[30:52.00]I can have,like,a small staff of,like,three people.Like,it's fun.And I think a lot of my happiest
+
+[30:57.00] founder friends are those who,like,raised a tiny seed round,became profitable.They're making,like,
+
+[31:02.00]70,000MRR.And they're,like,we don't even need to raise the seed round.Let's just keep it,like,between
+
+[31:07.00] me and my co-founder,we'll go traveling,and it'll be a great time.I think it's a lot of fun.
+
+[31:11.00]I'll write Pete to the seed investor in the company.Yeah.I think that's,like,one of the things
+
+[31:16.00] that people get wrong sometimes,and I see this a lot.They have an insight into,like,some new
+
+[31:21.00] tech.Like,say hello to MCI.And they build some open-source stuff.And it's,like,I should just raise
+
+[31:26.00] money and do this.And I tell people a lot.It's,like,look,you can make a lot more money
+
+[31:30.00] than something else than doing a startup.Like,most people that do a company could make a lot
+
+[31:34.00] more money just working somewhere else than the company itself.Do you have any advice for folks
+
+[31:39.00] that are maybe in a similar situation?They're trying to decide,oh,should I stay?Am I,like,high-paid
+
+[31:44.00] fang job?And just tweet this on the side?And do this on GitHub?Should I go be a consultant?
+
+[31:50.00]It seems like a lot of work.It's,like,you got to talk to all these people,you know.There's a lot to unpack.
+
+[31:55.00]I think the open-source thing is just,like,well,I'm just doing it purely for fun.And I'm doing it
+
+[31:59.00] because I think I'm right.But part of being right is the fact that it's not a venture-back startup.
+
+[32:04.00]Like,I think I'm right because this is all you need,right?So I think a part of the philosophy
+
+[32:09.00] is the fact that all you need is a very sharp blade to sort of do your work.And you don't
+
+[32:13.00] actually need to build,like,a big enterprise.So that's one thing.I think the other thing,too,that
+
+[32:17.00]I've been thinking around,just because I have a lot of friends at Google that want to leave right now,it's,like,man,like,what we lack is not money or skill.
+
+[32:23.00]Like,what we lack is courage.You just,like,you just have to do this hard thing,and you have to do it scared anyways,right?
+
+[32:29.00]In terms of,like,whether or not you do want to do a founder,I think that's just a matter of
+
+[32:32.00]optimality.But I definitely recognize that the,like,expected value of being a founder is still quite low.
+
+[32:39.00]Right.I know as many founder breakups,and as I know,friends who've raised a seed round this year.
+
+[32:44.00]Right.And,like,that is,like,the reality.And,like,you know,even in,from that perspective,it's been tough,where it's,like,oh,man,like,a lot of incubators
+
+[32:51.00]want you to have co-founders,now you spend half the time,like,fundraising,and then trying to,like,meet co-founders,and find co-founders,
+
+[32:57.00]rather than building the thing.This is a lot of time spent out doing things I'm not really good at.
+
+[33:02.00]I do think there's a rising trend in solo founding.You know,I am a solo.I think that something,like,30% of,like,forget what the exact
+
+[33:11.00]estat is,something,like,30% of starters that make it so,like,series B or something,actually are solo
+
+[33:15.00]founder.I feel,like,this must have co-founder idea,mostly comes from YC,and most,everyone else copies it,and then
+
+[33:23.00]planning your company's breakup over co-founder breakups.And,I bet it would be,like,I wonder how much of it is the people
+
+[33:28.00]who don't have that much,like,and hope there's not a diss to anybody,but it's,like,you sort of,you go through
+
+[33:33.00]the incubator route,because you don't have,like,the social equity,you would need to sort of,like,send an email to
+
+[33:38.00]Lacoy,and be,like,hey,I'm going on this ride.You want to take it on the rocket ship,right?Like,that's very hard to sell.
+
+[33:43.00]My message if I was to raise money is,like,you've seen my Twitter.My life is sick.I've decided to make it much worse by
+
+[33:49.00]being a founder,because this is something I have to do.So,do you want to come along?Otherwise,I want to fund it myself.
+
+[33:55.00]Like,if I can't say that,like,I don't need the money,because I can,like,handle payroll,and,like,hire an intern,and get an assistant.
+
+[34:01.00]Like,that's all fine.But,I really don't want to go back to Meta.I want to,like,get two years to,like,try to find a problem
+
+[34:07.00]we're solving.That feels like a bad time.Yeah.Jason is,like,I wear a YSL jacket on stage at AI
+
+[34:13.00]Engineer Summit.I don't need your accelerator money.And boots.You don't forget the boots.That's true.That's true.You have really good boots.
+
+[34:20.00]Havily good boots.But,I think that is a part of it,right?I think it is just,like,optionality,and also,just,like,I'm a lot older now.
+
+[34:26.00]I think 22-year-old Jason would have been probably too scared,and now I'm,like,too wise.But,I think it's a matter of,like,if you raise money,you have to have a plan
+
+[34:34.00]of spending it.And I'm just not that creative with spending that much money.Yeah.I mean,to be clear,you just celebrated your 30th birthday.
+
+[34:41.00]Happy birthday.Yeah.It's an album.We're going to Mexico next week.A lot older is relative to some of the folks I think.
+
+[34:48.00]Seeing on the career tips,I think this works out a great post about our user world to get into AI.As on one of your tweets
+
+[34:55.00]in January 23,you applied to,like,Figma,Notion,Coherent,Tropic,and all of them rejected you because you didn't have enough
+
+[35:01.00]LLM experience.Yeah.I think at that time,it would be easy for a lot of people to say,Oh,I kind of missed the boat.You know,I'm too late.
+
+[35:08.00]Not going to make it.You know.Any advice for people that feel like that?
+
+[35:13.00]Like,the biggest learning here is actually from a lot of folks in Jiu Jitsu.They're like,Oh,man,like,is it too late to start Jiu Jitsu?
+
+[35:18.00]Like,I'll join Jiu Jitsu once I get in more shape,right?It's like,there's a lot of,like,excuses.And then you say,Oh,like,why should I start now?
+
+[35:26.00]I'll be like 45 by the time I'm any good.It'll be 45 anyways.Like,time is passing.Like,if you don't start now,you start tomorrow,you're just,like,one more day behind.
+
+[35:35.00]If you're worried about being behind,like,today is,like,the soonest you can start,right?And so you've got to recognize that,like,maybe you just don't want it,and that's fine,too.
+
+[35:43.00]Like,if you wanted it,you would have started.I think a lot of these people,again,probably think of things on a too short time horizon,but,again,you know,you're going to be old anyway,so you may as well just start now.
+
+[35:53.00]You know,one more thing on,I guess,the career advice slash,like,sort of,logging.You always go viral for this post that you wrote,on advice to young people and the lies you tell yourself.
+
+[36:02.00]Oh,yeah,yeah,yeah.You said that you were writing it for your sister,what?Like,why is that?
+
+[36:06.00]Yeah,yeah,she was,like,bummed out about going to college and,like,stressing about jobs,and I was,like,Oh,I really want to hear.Okay.
+
+[36:13.00]And I just kind of,like,text this for you the whole thing.It's crazy.It's got,like,50,000 views.I mean,your average tweet has more.
+
+[36:20.00]But that thing is,like,a 30-minute read now.
+
+[36:24.00]Yeah,yeah.So there's lots of stuff here,which I agree with.You know,I'm also of occasionally indulging the sort of life reflection phase.
+
+[36:31.00]There's the how to be lucky.There's the how to have high agency.I feel like the agency thing is always a trend in SF,or just in text circles.
+
+[36:39.00]How do you define having high agency?
+
+[36:41.00]I'm almost,like,pass the high agency phase now.Now,my biggest concern is,like,okay,the agency is just,like,the norm of the vector.
+
+[36:49.00]What also matters is the direction,right?It's,like,how pure is the shot?Yeah,I mean,I think agency is just a matter of,like,having courage and doing the thing.That's scary,right?
+
+[36:59.00]You know,if you want to go rock climbing,it's,like,do you decide you want to go rock climbing,then you show up to the gym,you rent some shoes,and you just fall 40 times?
+
+[37:05.00]Or do you go,like,Oh,like,I'm actually more intelligent.Let me go research the kind of shoes that I want.Okay,like,there's flatter shoes and more inclined shoes.
+
+[37:13.00]Like,which one should I get?Okay,let me go order the shoes on Amazon.I'll come back in three days.Like,Oh,it's a little bit too tight.Maybe it's too aggressive.I'm only a beginner.Let me go change.
+
+[37:22.00]No,I think the high-agent person just,like,goes and,like,falls down 20 times,right?Yeah,I think the higher agency person is more focused on,like,process metrics versus outcome metrics.
+
+[37:32.00]Right?Like,from pottery,like,one thing I learned was,if you want to be good at pottery,you shouldn't count like the number of cups or bowls you make.You should just weigh the amount of clay you use,right?
+
+[37:42.00]Like,the successful person says,Oh,I want to do a hundred pounds of clay,right?The less agency person is like,Oh,I've made six cups,and then after I've made six cups,like,there's not really,what do you do next?No,just pounds of clay,pounds of clay.
+
+[37:53.00]Seeing with the work here,right?I say,Oh,you just got to write the tweets,like,made the commits,contribute open source,like,write the documentation.There's no real outcome.It's just a process.And if you love that process,you just get really good at the thing you're doing.
+
+[38:04.00]Yeah.So,just to push back on this,because obviously,I mostly agree.How would you design performance review systems?
+
+[38:11.00]Because you were effectively saying,we can count nicer code for developers,right?Like,did you?
+
+[38:18.00]No,I don't think that would be the actual,like,I think if you make that an outcome,like,I can just expand a for loop,right?I think,okay.So,for performance review,this is interesting because I've mostly thought of it from the perspective of science and not engineering.
+
+[38:31.00]I've been running a lot of engineering stand-ups,primarily because there's not really that many machine learning folks.The process outcome is like experiments and ideas,right?
+
+[38:39.00]Like,if you think about outcomes,what you might want to think about an outcome is,oh,I want to improve the revenue or whatnot.But that's really hard.But if you're someone who is going out,like,okay,like,this week,I want to come up with,like,three or four experiments.I might move the needle.Okay,nothing worked.
+
+[38:51.00]To them,they might think,oh,nothing worked.Like,I suck.But to me,it's like,wow,you've closed off all these other possible avenues for,like,research.Like,you're going to get to the place.So,you're going to figure out that direction really soon.
+
+[39:02.00]There's no way you'd try thirty different things and none of them work.Usually,like,ten of them work,five of them work really well,two of them work really,really well.And one thing was,like,the nail in the head.So,agency lets you sort of capture the volume of experiments.And,like,experience lets you figure out,like,oh,that other half,it's not worth doing,right?
+
+[39:19.00]I think experience is going to go,half these pumping papers don't make any sense,just use chain of thought and just,you know,use a for loop.That's basically it,right?It's like,usually performance for me is around,like,how many experiments are you running?How often times are you trying?Yeah.
+
+[39:31.00]So,when do you give up on an experiment?Because a stitch fix,you're going to give up on language models,I guess,in a way,as a tool to use.And then,maybe the tools got better.You were right at the time,and then the tool improved.I think there are similar paths in my engineering career,where I try one approach,and at the time it doesn't work,and then the thing changes.But then I kind of soured on that approach,and I don't go back to it.
+
+[39:52.00]Yeah,how do you think about that loop?So,usually when I'm coaching folks,and they say,like,oh,like,these things don't work,I'm not going to pursue them in the future.Like,one of the big things,like,hey,the negative result is a result,and this is something worth documenting.Like,this is an academia,like,if it's negative,you don't just,like,not publish,right?But then,like,what do you actually write down?Like,what you should write down is,like,here are the conditions,this is the inputs and the outputs we tried the experiment on.And then,one thing that's really valuable is basically writing down,under what conditions would I revisit these experiments.
+
+[40:20.00]These things don't work because of what we had at the time.If someone is reading this two years from now,under what conditions will we try again?That's really hard,but again,that's,like,another skill you kind of learn,right?It's like,you do go back,you do experiments,you figure out why it works now.I think a lot of it here is just,like,scaling worked.Rap lyrics,you know,that was because I did not have high enough quality data.If we phase shift and say,okay,you don't even need training data.Oh,great,then it might just work.Different domain.
+
+[40:47.00]Do you have anything in your list that is,like,it doesn't work now,but I want to try it again later.Something that people should.Maybe keep in mind,you know,people always,like,AGI when,you know,when are you going to know the AGI is here.Maybe it's less than that,but any stuff that you tried recently that didn't work that you think will get there.
+
+[41:03.00]I think the personal assistants and the writing I've shown to myself is just not good enough yet.So I hired a writer and I hired a personal assistant.So now I'm going to basically,like,work with these people until I figure out,like,what I can actually,like,automate,and what are,like,the reproducible steps.But,like,I think the experiment for me is,like,I'm going to go pay a person,like,thousand of dollars a month to,like,help me improve my life,and then let me get them to help me figure,like,what are the components,and how do I actually modularize something to get it to work.Because it's not just,like,o-auth Gmail calendar and,like,notion.It's not just,like,it's,like,another skill.It's,like,another skill.
+
+[41:30.00]It's not just,like,o-auth Gmail calendar and,like,notion.It's a little bit more complicated than that,but we just don't know what that is yet.Those are two,sort of,systems that,I wish gb4 or opus was actually good enough to just write me an essay,but most of the essays are still pretty bad.Yeah,I would say,you know,on the personal assistant side,Lindy is probably the one I've seen the most.Flow was a speaker at the summit.I don't know if you've checked it out,or any other,sort of,agents assistant startup.
+
+[41:55.00]No,not recently.I haven't tried,Lindy.They were not GA.I was considering it.Yeah,yeah.A lot of it now,it's,like,oh,like,really,what I want you to do is take a look at all of my meetings and,like,write,like,a really good weekly summary email for my clients.To remind them that I'm,like,you know,thinking of them and,like,working for them,right?Or it's,like,I want you to notice that,like,my Monday is,like,way too packed,and,like,block out more time.And also,like,you know,that people to do the reschedule and then try to opt in to move them around.And then I want you to say,oh,Jason should have,like,a 15 minutes.
+
+[42:24.00]Prep break after form back to back.Those are things that now I know I can prompt them in,but can it do it well.Before,I didn't know that's what I wanted to prompt for us.Defragging a calendar and adding breaks so I can,like,eat lunch.Yeah,that's the AGI test.
+
+[42:39.00]Exactly.Compassion,right?I think one thing that,yeah,we didn't touch on it before,but I think it was interesting.You had this tweet a while ago about prompts should be code.And then there were a lot of companies trying to build prompt engineering tooling.Kinda.
+
+[42:53.00]Trying to turn the prompt into a more structured thing.What's your thought today?Now you want to turn the thinking into DAGs,like,do prompts should still be code?Any updated ideas?
+
+[43:03.00]Ah,it's the same thing,right?I think,you know,with Instructor,it is very much,like,the output model is defined as a code object.That code object is sent to the LLM and in return you get a data structure.So,the outputs of these models,I think,should also be code objects.And the inputs somewhat should be code objects.
+
+[43:20.00]But I think the one thing that Instructor tries to do is separate instruction data and the types of the output.And beyond that,I really just think that most of it should be still,like,managed pretty closely to the developer.
+
+[43:31.00]Like,so much of it is changing that if you give control of these systems away too early,you end up ultimately wanting them back.Like,many companies I know that reach out are ones where,like,oh,we're going off of the frameworks because now that we know what the business outcomes we're trying to optimize for,these frameworks don't work.
+
+[43:47.00]Yeah,because we do rag,but we want to do rag to,like,sell you supplements or to have you,like,schedule the fitness appointment.The prompts are kind of too big in the systems to really pull them back out and,like,start doing upselling or something.It's really funny,but a lot of it ends up being,like,once you understand the business outcomes,you care way more about the prompt.
+
+[44:06.00]Actually,this is fun.In our prep for this call,we were trying to say,like,what can you as an independent person say that maybe me and Alessio cannot say or,you know,someone who wants to get a company say.
+
+[44:15.00]What do you think is the market share of the frameworks?The land chain,the llama index,the everything.
+
+[44:20.00]Oh,massive.Because not everyone wants to care about the code,right?I think that's a different question to,like,what is the business model and are they going to be,like,massively profitable businesses,right?
+
+[44:31.00]Making hundreds of millions of dollars,that feels,like,so straightforward,right?Because not everyone is the prompt engineer,like,there's so much productivity to be captured in,like,backoffice,off to automations,right?
+
+[44:42.00]It's not because they care about the prompts that they care about managing these things.Yeah,but those would be sort of low-code experiences,you know?
+
+[44:48.00]Yeah,I think the bigger challenge is,like,okay,100 million dollars,probably pretty easy.It's just time and effort,and they have the manpower and the money to sort of solve those problems.
+
+[44:58.00]Again,if you go to the VC route,then it's,like,you're talking about billions,and that's really the goal.That stuff,for me,it's,like,pretty unclear.
+
+[45:05.00]But again,that is to say,that,like,I sort of am building things for developers who want to use Instructure to build their own tooling,in terms of the amount of developers there are in the world,versus,downstream consumers of these things,or even just think of how many companies will use,like,the Adobe's and the IBM's,right?
+
+[45:19.00]Because they want something that's fully managed,and they want something that they know will work.And if the incremental 10% requires you to hire another team of 20 people,you might not want to do it.
+
+[45:28.00]I think that kind of organization is really good for those,like,bigger companies.
+
+[45:32.00]And now,I just want to capture your thoughts on one more thing,which is,you said you wanted most of the prompts to stay close to the developer.And Hamel Hussein wrote this post,which I really love,called,FU Show Me The Prompt.
+
+[45:43.00]Yeah.
+
+[45:44.00]I think he cites you in one of those part of the blog post.And I think DSPy is kind of,like,the complete antithesis of that,which is,I think it's interesting,because I also hold the strong view that AI is a better prompt engineer than you are.And I don't know how to square that.Wondering if you have thoughts.
+
+[45:57.00]I think something like DSPy can work because there are,like,very short-term metrics to measure success,right?It is,like,did you find the PII?Or,like,did you write the multi-hop question the correct way?
+
+[46:12.00]But in these workflows that I've been managing,a lot of it,are we minimizing churn and maximizing retention?
+
+[46:18.00]Yeah,that's a very long loop.It's not really,like,a uptuna,like,training loop,right?Like,those things are much more harder to capture.So we don't actually have those metrics.
+
+[46:26.00]And,obviously,we can figure out,like,is the summary good,but,like,how do you measure the quality of the summary?It's,like,that feedback loop ends up being a lot longer.And then,again,when something changes,it's really hard to make sure that it works across these,like,newer models,or,again,like,changes to work for the current prompts.
+
+[46:44.00]Like,when we migrate from,like,anthropic to open AI,like,there's just a ton of changes that are,like,infrastructure-related,not necessarily around the prompt itself.Yeah,cool.
+
+[46:53.00]Any other engineering startups that you think should not exist before we wrap up?
+
+[46:58.00]No,I mean,oh my gosh.I mean,a lot of it,again,is just,like,every time investors,like,how does this make a billion dollars?Like,it doesn't.I'm gonna go back to just,like,tweeting and holding my breath underwater.Yeah,like,I don't really pay attention too much to most of these,like,most of the stuff I'm doing is around,like,the consumer of,like,LLM calls.Yep.
+
+[47:18.00]I think people just want to move really fast,and they will end up pick these vendors,but I don't really know if anything has really,like,blown me out the water.Like,I only trust myself.But that's also a function of just being an old man.Like,I think,you know,many companies are definitely very happy with using most of these tools anyways.But I definitely think I occupy a very small space in the AI engineering ecosystem.Yeah.
+
+[47:39.00]I would say one of the challenges here,you know,you call about the dealing in the consumer of LLM's space.I think that's what AI engineering differs from ML engineering,and I think a constant disconnect or cognitive dissonance in this field in the AI engineers that have sprung up is that they're not as good as the ML engineers.They're not as qualified.I think that,you know,you are someone who has credibility in the MLE space,and you are also a very authoritative figure in the AI e-space.And
+
+[48:08.00]I think so.I think you've built the de facto leading library.I think yours,I think instructors should be part of the standard lib,even though I tried to not use it.Like,basically also end up rebuilding instructor,right?Like,that's a lot of the back and forth that we had over the past two days.I think that's the fundamental thing that we're trying to figure out.Like,there's very small supply of MLEs.Not everyone's going to have that experience that you had,but the global demand for AI is going to far outstrip the existing MLEs.So what do we do?Do we force everyone to go through the standard MLE curriculum?
+
+[48:37.00]MLE curriculum or do we make a new one?
+
+[48:39.36]I got some takes.
+
+[48:40.80]I think a lot of these app layer startups should not be hiring MLEs because they end up turning.
+
+[48:46.56]Yeah, they want to work at opening high.
+
+[48:48.44]They're just like, hey guys, I joined and you have no data.
+
+[48:51.84]And like all I did this week was take some typescript build errors and like figure out why we don't have any tests.
+
+[48:59.00]And like what is this framework X and Y?
+
+[49:01.40]Like how do you measure success?
+
+[49:02.76]What are your business outcomes?
+
+[49:03.84]Oh no, okay, let's not focus on that.
+
+[49:05.60]Great, I'll focus on these typescript build errors.
+
+[49:09.04]And then you're just like, what am I doing?
+
+[49:10.52]And then you kind of sort of feel really frustrated.
+
+[49:12.60]And I already recognize that because I've made offers to machine learning engineers.
+
+[49:18.08]They've joined and they've left in like two months.
+
+[49:21.32]And the response is like, yeah, I think I'm going to join a research lab.
+
+[49:24.00]So I think it's not even that, like I don't even think you should be hiring these MLEs.
+
+[49:27.56]On the other hand, what I also see a lot of is the really motivated engineer that's doing more engineering
+
+[49:34.20]is not being allowed to actually like fully pursue the AI engineering.
+
+[49:37.60]So they're the guy who built the demo, it got traction, now it's working.
+
+[49:40.72]But they're still being pulled back to figure out why Google Calendar integrations are not working
+
+[49:45.08]or like how to make sure that, you know, the button is loading on the page.
+
+[49:48.16]And so I'm sort of like in a very interesting position where the companies want to hire MLE.
+
+[49:53.32]They don't need to hire, but they won't let the excited people who've caught the AI engineering bug
+
+[49:57.84]could go do that work more full time.
+
+[50:00.00]This is something I'm literally wrestling with this week.
+
+[50:02.84]As I just wrote something about it, this is one of the things I'm probably going to be recommending in the future
+
+[50:06.40]is really thinking about like, where is the talent coming from?
+
+[50:08.96]How much of it is internal?
+
+[50:10.12]And do you really need to hire someone who's like writing pie torch code?
+
+[50:14.00]Yeah, exactly.
+
+[50:15.24]Most of the time you're not, you're going to need someone to write instructor code.
+
+[50:20.60]And like, I feel goofy all the time, just like prompting.
+
+[50:23.24]It's like, oh man, I wish I just had a target data set that I could like train a model against.
+
+[50:27.16]Yes.
+
+[50:27.68]And I can just say it's right or wrong.
+
+[50:29.24]Yeah.So, you know, I guess what Lanespace is, what the AI engineer world's fair is, is that we're trying to create
+
+[50:35.48]and elevate this industry of AI engineers, where it's legitimate to actually take these
+
+[50:40.36]motivated software engineers who want to build more in AI and do creative things in AI
+
+[50:44.48]to actually say you have the blessing, and this is legitimate, sub-special to your software engineering.
+
+[50:49.72]I think there's been a mix of that product engineering.
+
+[50:52.44]I think a lot more data science is going to come in versus machine learning engineering.
+
+[50:55.92]Because a lot of it now is just quantifying.
+
+[50:57.96]Like, what does the business actually want as an outcome?
+
+[51:01.16]The outcome is not rag-app.
+
+[51:02.56]The outcome is like reduced churn.
+
+[51:04.64]You wouldn't need to figure out what that actually is.
+
+[51:06.20]I had to measure it.
+
+[51:06.96]Yeah.All the data engineering tools still apply.
+
+[51:09.28]BI layers,sematic layers, whatever.
+
+[51:12.32]Yeah.
+
+[51:12.96]Cool.We'll have you back again for the world's fair.
+
+[51:15.88]We don't know what you're going to talk about, but I'm sure it's going to be amazing.
+
+[51:19.12]You're a very, very polished speaker.
+
+[51:20.48]The title is written.
+
+[51:21.68]It's just a, "Pydantic is still all you need."
+
+[51:26.24]I'm worried about having too many all-you-need titles, because that's obviously very trendy.
+
+[51:30.20]So, you have one of them, but I need to keep a lid on everyone saying their thing is all you need.
+
+[51:35.28]But yeah, we'll figure it out.
+
+[51:36.44]Pydantic is not my thing.
+
+[51:37.68]So, what else?
+
+[51:38.52]I think that's why it works.
+
+[51:40.56]It's true.
+
+[51:41.28]Cool.Well, it was a real pleasure to have you on.
+
+[51:43.24]Of course.
+
+[51:43.76]Everybody should go follow you on Twitter and check out Instructure.
+
+[51:46.68]There's also Instructure.js, which I'm very happy to see.
+
+[51:49.36]And what else? Anything else to plug?
+
+[51:51.76]Useinstructor.com.
+
+[51:52.84]We got a domain name now.
+
+[51:54.04]Nice.
+
+[51:54.84]Nice.Awesome.
+
+[51:55.80]Cool.Cool.
+
+[51:56.92]Thanks for your time.
+
+[51:57.72]Thanks.
+
+[51:58.72](音乐)
+
diff --git a/content/post/Latent Space/Latent-Space-Latent-Space-Chats:-NLW-(Four-Wars,-GPT5),-Josh-Albrecht-Ali-Rohde-(TNAI),-Dylan-Patel-Semianalysis-(Groq),-Milind-Naphade-(Nvidia-GTC),-Personal-AI-(ft.-Harrison-Chase-—-LangFriend-LangMem).lrc b/content/post/Latent Space/Latent-Space-Latent-Space-Chats:-NLW-(Four-Wars,-GPT5),-Josh-Albrecht-Ali-Rohde-(TNAI),-Dylan-Patel-Semianalysis-(Groq),-Milind-Naphade-(Nvidia-GTC),-Personal-AI-(ft.-Harrison-Chase-—-LangFriend-LangMem).lrc
new file mode 100644
index 0000000..29a3141
--- /dev/null
+++ b/content/post/Latent Space/Latent-Space-Latent-Space-Chats:-NLW-(Four-Wars,-GPT5),-Josh-Albrecht-Ali-Rohde-(TNAI),-Dylan-Patel-Semianalysis-(Groq),-Milind-Naphade-(Nvidia-GTC),-Personal-AI-(ft.-Harrison-Chase-—-LangFriend-LangMem).lrc
@@ -0,0 +1,6303 @@
+[by:whisper.cpp]
+[00:00.00](音乐)
+[00:06.00]欢迎到达特兰的节目的节目的节目
+[00:09.00]这是Charlie,你的AI co-host
+[00:12.00]SWIX 和 ALESIO 都在节目中做了更多的节目
+[00:16.00]我们有很多新的节目
+[00:18.00]来自Elicit、Chroma、 Instructor
+[00:20.00]和我们的新的节目在 NSFW
+[00:23.00]"不安全的工作AI"
+[00:25.00]今天我们会在SWIX 和 ALESIO 的新的节目中
+[00:29.00]找到更多新的新的节目
+[00:31.00]在我们的第一节目中
+[00:33.00]找到更多新的节目
+[00:35.00]SWIX 和 ALESIO 都在节目中
+[00:37.00]做了更多新的节目
+[00:40.00]SWIX 和 ALESIO 都在节目中
+[00:42.00]做了更多新的节目
+[00:44.00]我们有很多新的节目
+[00:46.00]SWIX 和 ALESIO 都在节目中
+[00:49.00]找到更多新的节目
+[00:52.00]我们有很多新的节目
+[00:54.00]SWIX 和 ALESIO 都在节目中
+[00:56.00]找到更多新的节目
+[00:58.00]SWIX 和 ALESIO 都在节目中
+[01:00.00]找到更多新的节目
+[01:02.00]SWIX 和 ALESIO 都在节目中
+[01:04.00]找到更多新的节目
+[01:06.00]SWIX 和 ALESIO 都在节目中
+[01:08.00]找到更多新的节目
+[01:09.00]SWIX 和 ALESIO 都在节目中
+[01:11.00]找到更多新的节目
+[01:12.00]SWIX 和 ALESIO 都在节目中
+[01:14.00]找到更多新的节目
+[01:16.00]SWIX 和 ALESIO 都在节目中
+[01:18.00]找到更多新的节目
+[01:19.00]SWIX 和 ALESIO 都在节目中
+[01:21.00]找到更多新的节目
+[01:22.00]SWIX 和 ALESIO 都在节目中
+[01:24.00]找到更多新的节目
+[01:26.00]SWIX 和 ALESIO 都在节目中
+[01:28.00]找到更多新的节目
+[01:30.00]SWIX 和 ALESIO 都在节目中
+[01:32.00]找到更多新的节目
+[01:34.00]SWIX 和 ALESIO 都在节目中
+[01:36.00]找到更多新的节目
+[01:38.00]SWIX 和 ALESIO 都在节目中
+[01:40.00]找到更多新的节目
+[01:42.00]SWIX 和 ALESIO 都在节目中
+[01:44.00]找到更多新的节目
+[01:46.00]SWIX 和 ALESIO 都在节目中
+[01:48.00]找到更多新的节目
+[01:50.00]SWIX 和 ALESIO 都在节目中
+[01:52.00]找到更多新的节目
+[01:54.00]SWIX 和 ALESIO 都在节目中
+[01:56.00]找到更多新的节目
+[01:58.00]SWIX 和 ALESIO 都在节目中
+[02:00.00]找到更多新的节目
+[02:02.00]SWIX 和 ALESIO 都在节目中
+[02:04.00]找到更多新的节目
+[02:06.00]SWIX 和 ALESIO 都在节目中
+[02:08.00]找到更多新的节目
+[02:10.00]SWIX 和 ALESIO 都在节目中
+[02:12.00]找到更多新的节目
+[02:14.00]SWIX 和 ALESIO 都在节目中
+[02:16.00]找到更多新的节目
+[02:18.00]SWIX 和 ALESIO 都在节目中
+[02:20.00]找到更多新的节目
+[02:22.00]SWIX 和 ALESIO 都在节目中
+[02:24.00]找到更多新的节目
+[02:26.00]SWIX 和 ALESIO 都在节目中
+[02:28.00]找到更多新的节目
+[02:30.00]SWIX 和 ALESIO 都在节目中
+[02:32.00]找到更多新的节目
+[02:34.00]SWIX 和 ALESIO 都在节目中
+[02:36.00]找到更多新的节目
+[02:38.00]SWIX 和 ALESIO 都在节目中
+[02:40.00]找到更多新的节目
+[02:42.00]SWIX 和 ALESIO 都在节目中
+[02:44.00]找到更多新的节目
+[02:46.00]SWIX 和 ALESIO 都在节目中
+[02:48.00]找到更多新的节目
+[02:50.00]SWIX 和 ALESIO 都在节目中
+[02:52.00]找到更多新的节目
+[02:54.00]SWIX 和 ALESIO 都在节目中
+[02:56.00]找到更多新的节目
+[02:58.00]SWIX 和 ALESIO 都在节目中
+[03:00.00]找到更多新的节目
+[03:02.00]SWIX 和 ALESIO 都在节目中
+[03:04.00]找到更多新的节目
+[03:06.00]SWIX 和 ALESIO 都在节目中
+[03:08.00]找到更多新的节目
+[03:10.00]SWIX 和 ALESIO 都在节目中
+[03:12.00]找到更多新的节目
+[03:14.00]SWIX 和 ALESIO 都在节目中
+[03:16.00]找到更多新的节目
+[03:18.00]SWIX 和 ALESIO 都在节目中
+[03:20.00]找到更多新的节目
+[03:22.00]SWIX 和 ALESIO 都在节目中
+[03:24.00]找到更多新的节目
+[03:26.00]SWIX 和 ALESIO 都在节目中
+[03:28.00]找到更多新的节目
+[03:30.00]SWIX 和 ALESIO 都在节目中
+[03:32.00]找到更多新的节目
+[03:34.00]SWIX 和 ALESIO 都在节目中
+[03:36.00]找到更多新的节目
+[03:38.00]SWIX 和 ALESIO 都在节目中
+[03:40.00]找到更多新的节目
+[03:42.00]SWIX 和 ALESIO 都在节目中
+[03:44.00]找到更多新的节目
+[03:46.00]SWIX 和 ALESIO 都在节目中
+[03:48.00]找到更多新的节目
+[03:50.00]SWIX 和 ALESIO 都在节目中
+[03:52.00]找到更多新的节目
+[03:54.00]SWIX 和 ALESIO 都在节目中
+[03:56.00]找到更多新的节目
+[03:58.00]SWIX 和 ALESIO 都在节目中
+[04:00.00]找到更多新的节目
+[04:02.00]SWIX 和 ALESIO 都在节目中
+[04:04.00]找到更多新的节目
+[04:06.00]SWIX 和 ALESIO 都在节目中
+[04:08.00]找到更多新的节目
+[04:10.00]SWIX 和 ALESIO 都在节目中
+[04:12.00]找到更多新的节目
+[04:14.00]SWIX 和 ALESIO 都在节目中
+[04:16.00]找到更多新的节目
+[04:18.00]SWIX 和 ALESIO 都在节目中
+[04:20.00]找到更多新的节目
+[04:22.00]SWIX 和 ALESIO 都在节目中
+[04:24.00]找到更多新的节目
+[04:26.00]SWIX 和 ALESIO 都在节目中
+[04:28.00]找到更多新的节目
+[04:30.00]SWIX 和 ALESIO 都在节目中
+[04:32.00]找到更多新的节目
+[04:34.00]SWIX 和 ALESIO 都在节目中
+[04:36.00]找到更多新的节目
+[04:38.00]SWIX 和 ALESIO 都在节目中
+[04:40.00]找到更多新的节目
+[04:42.00]SWIX 和 ALESIO 都在节目中
+[04:44.00]找到更多新的节目
+[04:46.00]SWIX 和 ALESIO 都在节目中
+[04:48.00]找到更多新的节目
+[04:50.00]SWIX 和 ALESIO 都在节目中
+[04:52.00]找到更多新的节目
+[04:54.00]SWIX 和 ALESIO 都在节目中
+[04:56.00]找到更多新的节目
+[04:58.00]SWIX 和 ALESIO 都在节目中
+[05:00.00]找到更多新的节目
+[05:02.00]SWIX 和 ALESIO 都在节目中
+[05:04.00]找到更多新的节目
+[05:06.00]SWIX 和 ALESIO 都在节目中
+[05:08.00]找到更多新的节目
+[05:10.00]SWIX 和 ALESIO 都在节目中
+[05:12.00]找到更多新的节目
+[05:14.00]SWIX 和 ALESIO 都在节目中
+[05:16.00]找到更多新的节目
+[05:18.00]SWIX 和 ALESIO 都在节目中
+[05:20.00]找到更多新的节目
+[05:22.00]SWIX 和 ALESIO 都在节目中
+[05:24.00]找到更多新的节目
+[05:26.00]SWIX 和 ALESIO 都在节目中
+[05:28.00]找到更多新的节目
+[05:30.00]SWIX 和 ALESIO 都在节目中
+[05:32.00]找到更多新的节目
+[05:34.00]SWIX 和 ALESIO 都在节目中
+[05:36.00]找到更多新的节目
+[05:38.00]SWIX 和 ALESIO 都在节目中
+[05:40.00]找到更多新的节目
+[05:42.00]SWIX 和 ALESIO 都在节目中
+[05:44.00]找到更多新的节目
+[05:46.00]SWIX 和 ALESIO 都在节目中
+[05:48.00]找到更多新的节目
+[05:50.00]SWIX 和 ALESIO 都在节目中
+[05:52.00]找到更多新的节目
+[05:54.00]SWIX 和 ALESIO 都在节目中
+[05:56.00]找到更多新的节目
+[05:58.00]SWIX 和 ALESIO 都在节目中
+[06:00.00]找到更多新的节目
+[06:02.00]SWIX 和 ALESIO 都在节目中
+[06:04.00]找到更多新的节目
+[06:06.00]SWIX 和 ALESIO 都在节目中
+[06:08.00]找到更多新的节目
+[06:10.00]SWIX 和 ALESIO 都在节目中
+[06:12.00]找到更多新的节目
+[06:14.00]SWIX 和 ALESIO 都在节目中
+[06:16.00]找到更多新的节目
+[06:18.00]SWIX 和 ALESIO 都在节目中
+[06:20.00]找到更多新的节目
+[06:22.00]SWIX 和 ALESIO 都在节目中
+[06:24.00]找到更多新的节目
+[06:26.00]SWIX 和 ALESIO 都在节目中
+[06:28.00]找到更多新的节目
+[06:30.00]SWIX 和 ALESIO 都在节目中
+[06:32.00]找到更多新的节目
+[06:34.00]SWIX 和 ALESIO 都在节目中
+[06:36.00]找到更多新的节目
+[06:38.00]SWIX 和 ALESIO 都在节目中
+[06:40.00]找到更多新的节目
+[06:42.00]SWIX 和 ALESIO 都在节目中
+[06:44.00]找到更多新的节目
+[06:46.00]SWIX 和 ALESIO 都在节目中
+[06:48.00]找到更多新的节目
+[06:50.00]SWIX 和 ALESIO 都在节目中
+[06:52.00]找到更多新的节目
+[06:54.00]SWIX 和 ALESIO 都在节目中
+[06:56.00]找到更多新的节目
+[06:58.00]SWIX 和 ALESIO 都在节目中
+[07:00.00]找到更多新的节目
+[07:02.00]SWIX 和 ALESIO 都在节目中
+[07:04.00]找到更多新的节目
+[07:06.00]SWIX 和 ALESIO 都在节目中
+[07:08.00]找到更多新的节目
+[07:10.00]SWIX 和 ALESIO 都在节目中
+[07:12.00]找到更多新的节目
+[07:14.00]SWIX 和 ALESIO 都在节目中
+[07:16.00]找到更多新的节目
+[07:18.00]SWIX 和 ALESIO 都在节目中
+[07:20.00]找到更多新的节目
+[07:22.00]SWIX 和 ALESIO 都在节目中
+[07:24.00]找到更多新的节目
+[07:26.00]SWIX 和 ALESIO 都在节目中
+[07:28.00]找到更多新的节目
+[07:30.00]SWIX 和 ALESIO 都在节目中
+[07:32.00]找到更多新的节目
+[07:34.00]SWIX 和 ALESIO 都在节目中
+[07:36.00]找到更多新的节目
+[07:38.00]SWIX 和 ALESIO 都在节目中
+[07:40.00]找到更多新的节目
+[07:42.00]SWIX 和 ALESIO 都在节目中
+[07:44.00]找到更多新的节目
+[07:46.00]SWIX 和 ALESIO 都在节目中
+[07:48.00]找到更多新的节目
+[07:50.00]SWIX 和 ALESIO 都在节目中
+[07:52.00]找到更多新的节目
+[07:54.00]SWIX 和 ALESIO 都在节目中
+[07:56.00]找到更多新的节目
+[07:58.00]找到更多新的节目
+[08:00.00]SWIX 和 ALESIO 都在节目中
+[08:02.00]找到更多新的节目
+[08:04.00]SWIX 和 ALESIO 都在节目中
+[08:06.00]找到更多新的节目
+[08:08.00]SWIX 和 ALESIO 都在节目中
+[08:10.00]找到更多新的节目
+[08:12.00]SWIX 和 ALESIO 都在节目中
+[08:14.00]找到更多新的节目
+[08:16.00]SWIX 和 ALESIO 都在节目中
+[08:18.00]找到更多新的节目
+[08:20.00]SWIX 和 ALESIO 都在节目中
+[08:22.00]找到更多新的节目
+[08:24.00]SWIX 和 ALESIO 都在节目中
+[08:26.00]找到更多新的节目
+[08:28.00]SWIX 和 ALESIO 都在节目中
+[08:30.00]找到更多新的节目
+[08:32.00]SWIX 和 ALESIO 都在节目中
+[08:34.00]找到更多新的节目
+[08:36.00]SWIX 和 ALESIO 都在节目中
+[08:38.00]找到更多新的节目
+[08:40.00]SWIX 和 ALESIO 都在节目中
+[08:42.00]找到更多新的节目
+[08:44.00]SWIX 和 ALESIO 都在节目中
+[08:46.00]找到更多新的节目
+[08:48.00]SWIX 和 ALESIO 都在节目中
+[08:50.00]找到更多新的节目
+[08:52.00]SWIX 和 ALESIO 都在节目中
+[08:54.00]找到更多新的节目
+[08:56.00]SWIX 和 ALESIO 都在节目中
+[08:58.00]找到更多新的节目
+[09:00.00]SWIX 和 ALESIO 都在节目中
+[09:02.00]找到更多新的节目
+[09:04.00]SWIX 和 ALESIO 都在节目中
+[09:06.00]找到更多新的节目
+[09:08.00]SWIX 和 ALESIO 都在节目中
+[09:10.00]找到更多新的节目
+[09:12.00]SWIX 和 ALESIO 都在节目中
+[09:14.00]找到更多新的节目
+[09:16.00]SWIX 和 ALESIO 都在节目中
+[09:18.00]找到更多新的节目
+[09:20.00]SWIX 和 ALESIO 都在节目中
+[09:22.00]找到更多新的节目
+[09:24.00]空中
+[09:26.00]找到更多新的节目
+[09:28.00]SWIX 和 ALESIO 都在节目中
+[09:30.00]找到更多新的节目
+[09:32.00]SWIX 和 ALESIO 都在节目中
+[09:34.00]找到更多新的节目
+[09:36.00]SWIX 和 ALESIO 都在节目中
+[09:38.00]找到更多新的节目
+[09:40.00]SWIX 和 ALESIO 都在节目中
+[09:42.00]找到更多新的节目
+[09:44.00]SWIX 和 ALESIO 都在节目中
+[09:46.00]找到更多新的节目
+[09:48.00]SWIX 和 ALESIO 都在节目中
+[09:50.00]找到更多新的节目
+[09:52.00]SWIX 和 ALESIO 都在节目中
+[09:54.00]找到更多新的节目
+[09:56.00]SWIX 和 ALESIO 都在节目中
+[09:58.00]找到更多新的节目
+[10:00.00]SWIX 和 ALESIO 都在节目中
+[10:02.00]找到更多新的节目
+[10:04.00]SWIX 和 ALESIO 都在节目中
+[10:06.00]找到更多新的节目
+[10:08.00]SWIX 和 ALESIO 都在节目中
+[10:10.00]找到更多新的节目
+[10:12.00]SWIX 和 ALESIO 都在节目中
+[10:14.00]找到更多新的节目
+[10:16.00]SWIX 和 ALESIO 都在节目中
+[10:18.00]找到更多新的节目
+[10:20.00]SWIX 和 ALESIO 都在节目中
+[10:22.00]找到更多新的节目
+[10:24.00]SWIX 和 ALESIO 都在节目中
+[10:26.00]找到更多新的节目
+[10:28.00]SWIX 和 ALESIO 都在节目中
+[10:30.00]找到更多新的节目
+[10:32.00]SWIX 和 ALESIO 都在节目中
+[10:34.00]找到更多新的节目
+[10:36.00]SWIX 和 ALESIO 都在节目中
+[10:38.00]找到更多新的节目
+[10:40.00]SWIX 和 ALESIO 都在节目中
+[10:42.00]找到更多新的节目
+[10:44.00]SWIX 和 ALESIO 都在节目中
+[10:46.00]找到更多新的节目
+[10:48.00]SWIX 和 ALESIO 都在节目中
+[10:50.00]找到更多新的节目
+[10:52.00]SWIX 和 ALESIO 都在节目中
+[10:54.00]找到更多新的节目
+[10:56.00]SWIX 和 ALESIO 都在节目中
+[10:58.00]找到更多新的节目
+[11:00.00]SWIX 和 ALESIO 都在节目中
+[11:02.00]找到更多新的节目
+[11:04.00]SWIX 和 ALESIO 都在节目中
+[11:06.00]找到更多新的节目
+[11:08.00]SWIX 和 ALESIO 都在节目中
+[11:10.00]找到更多新的节目
+[11:12.00]SWIX 和 ALESIO 都在节目中
+[11:14.00]找到更多新的节目
+[11:16.00]SWIX 和 ALESIO 都在节目中
+[11:18.00]找到更多新的节目
+[11:20.00]SWIX 和 ALESIO 都在节目中
+[11:22.00]找到更多新的节目
+[11:24.00]SWIX 和 ALESIO 都在节目中
+[11:26.00]找到更多新的节目
+[11:28.00]SWIX 和 ALESIO 都在节目中
+[11:30.00]找到更多新的节目
+[11:32.00]SWIX 和 ALESIO 都在节目中
+[11:34.00]找到更多新的节目
+[11:36.00]SWIX 和 ALESIO 都在节目中
+[11:38.00]找到更多新的节目
+[11:40.00]SWIX 和 ALESIO 都在节目中
+[11:42.00]找到更多新的节目
+[11:44.00]SWIX 和 ALESIO 都在节目中
+[11:46.00]找到更多新的节目
+[11:48.00]SWIX 和 ALESIO 都在节目中
+[11:50.00]找到更多新的节目
+[11:52.00]SWIX 和 ALESIO 都在节目中
+[11:54.00]找到更多新的节目
+[11:56.00]SWIX 和 ALESIO 都在节目中
+[11:58.00]找到更多新的节目
+[12:00.00]SWIX 和 ALESIO 都在节目中
+[12:02.00]找到更多新的节目
+[12:04.00]SWIX 和 ALESIO 都在节目中
+[12:06.00]找到更多新的节目
+[12:08.00]SWIX 和 ALESIO 都在节目中
+[12:10.00]找到更多新的节目
+[12:12.00]SWIX 和 ALESIO 都在节目中
+[12:14.00]找到更多新的节目
+[12:16.00]SWIX 和 ALESIO 都在节目中
+[12:18.00]找到更多新的节目
+[12:20.00]SWIX 和 ALESIO 都在节目中
+[12:22.00]找到更多新的节目
+[12:24.00]SWIX 和 ALESIO 都在节目中
+[12:26.00]找到更多新的节目
+[12:28.00]SWIX 和 ALESIO 都在节目中
+[12:30.00]找到更多新的节目
+[12:32.00]SWIX 和 ALESIO 都在节目中
+[12:34.00]找到更多新的节目
+[12:36.00]SWIX 和 ALESIO 都在节目中
+[12:38.00]找到更多新的节目
+[12:40.00]找到更多新的节目
+[12:42.00]找到更多新的节目
+[12:44.00]找到更多新的节目
+[12:46.00]找到更多新的节目
+[12:48.00]找到更多新的节目
+[12:50.00]找到更多新的节目
+[12:52.00]找到更多新的节目
+[12:54.00]找到更多新的节目
+[12:56.00]找到更多新的节目
+[12:58.00]找到更多新的节目
+[13:00.00]找到更多新的节目
+[13:02.00]找到更多新的节目
+[13:04.00]找到更多新的节目
+[13:06.00]找到更多新的节目
+[13:08.00]找到更多新的节目
+[13:10.00]找到更多新的节目
+[13:12.00]找到更多新的节目
+[13:14.00]找到更多新的节目
+[13:16.00]找到更多新的节目
+[13:18.00]找到更多新的节目
+[13:20.00]找到更多新的节目
+[13:22.00]找到更多新的节目
+[13:24.00]找到更多新的节目
+[13:26.00]找到更多新的节目
+[13:28.00]找到更多新的节目
+[13:30.00]找到更多新的节目
+[13:32.00]找到更多新的节目
+[13:34.00]找到更多新的节目
+[13:36.00]找到更多新的节目
+[13:38.00]找到更多新的节目
+[13:40.00]找到更多新的节目
+[13:42.00]找到更多新的节目
+[13:44.00]找到更多新的节目
+[13:46.00]找到更多新的节目
+[13:48.00]找到更多新的节目
+[13:50.00]找到更多新的节目
+[13:52.00]找到更多新的节目
+[13:54.00]找到更多新的节目
+[13:56.00]找到更多新的节目
+[13:58.00]找到更多新的节目
+[14:00.00]找到更多新的节目
+[14:02.00]找到更多新的节目
+[14:04.00]找到更多新的节目
+[14:06.00]找到更多新的节目
+[14:08.00]找到更多新的节目
+[14:10.00]找到更多新的节目
+[14:12.00]找到更多新的节目
+[14:14.00]找到更多新的节目
+[14:16.00]找到更多新的节目
+[14:18.00]找到更多新的节目
+[14:20.00]找到更多新的节目
+[14:22.00]找到更多新的节目
+[14:24.00]找到更多新的节目
+[14:26.00]找到更多新的节目
+[14:28.00]找到更多新的节目
+[14:30.00]找到更多新的节目
+[14:32.00]找到更多新的节目
+[14:34.00]找到更多新的节目
+[14:36.00]找到更多新的节目
+[14:38.00]找到更多新的节目
+[14:40.00]找到更多新的节目
+[14:42.00]找到更多新的节目
+[14:44.00]找到更多新的节目
+[14:46.00]找到更多新的节目
+[14:48.00]找到更多新的节目
+[14:50.00]找到更多新的节目
+[14:52.00]找到更多新的节目
+[14:54.00]找到更多新的节目
+[14:56.00]找到更多新的节目
+[14:58.00]找到更多新的节目
+[15:00.00]找到更多新的节目
+[15:02.00]找到更多新的节目
+[15:04.00]找到更多新的节目
+[15:06.00]找到更多新的节目
+[15:08.00]找到更多新的节目
+[15:10.00]找到更多新的节目
+[15:12.00]找到更多新的节目
+[15:14.00]找到更多新的节目
+[15:16.00]找到更多新的节目
+[15:18.00]找到更多新的节目
+[15:20.00]找到更多新的节目
+[15:22.00]找到更多新的节目
+[15:24.00]找到更多新的节目
+[15:26.00]找到更多新的节目
+[15:28.00]找到更多新的节目
+[15:30.00]找到更多新的节目
+[15:32.00]找到更多新的节目
+[15:34.00]找到更多新的节目
+[15:36.00]找到更多新的节目
+[15:38.00]找到更多新的节目
+[15:40.00]找到更多新的节目
+[15:42.00]找到更多新的节目
+[15:44.00]找到更多新的节目
+[15:46.00]找到更多新的节目
+[15:48.00]找到更多新的节目
+[15:50.00]找到更多新的节目
+[15:52.00]找到更多新的节目
+[15:54.00]找到更多新的节目
+[15:56.00]找到更多新的节目
+[15:58.00]找到更多新的节目
+[16:00.00]找到更多新的节目
+[16:02.00]找到更多新的节目
+[16:04.00]找到更多新的节目
+[16:06.00]找到更多新的节目
+[16:08.00]找到更多新的节目
+[16:10.00]找到更多新的节目
+[16:12.00]找到更多新的节目
+[16:14.00]找到更多新的节目
+[16:16.00]找到更多新的节目
+[16:18.00]找到更多新的节目
+[16:20.00]找到更多新的节目
+[16:22.00]找到更多新的节目
+[16:24.00]找到更多新的节目
+[16:26.00]找到更多新的节目
+[16:28.00]找到更多新的节目
+[16:30.00]找到更多新的节目
+[16:32.00]找到更多新的节目
+[16:34.00]找到更多新的节目
+[16:36.00]找到更多新的节目
+[16:38.00]找到更多新的节目
+[16:40.00]找到更多新的节目
+[16:42.00]找到更多新的节目
+[16:44.00]找到更多新的节目
+[16:46.00]找到更多新的节目
+[16:48.00]找到更多新的节目
+[16:50.00]找到更多新的节目
+[16:52.00]找到更多新的节目
+[16:54.00]找到更多新的节目
+[16:56.00]找到更多新的节目
+[16:58.00]找到更多新的节目
+[17:00.00]找到更多新的节目
+[17:02.00]找到更多新的节目
+[17:04.00]找到更多新的节目
+[17:06.00]找到更多新的节目
+[17:08.00]找到更多新的节目
+[17:10.00]找到更多新的节目
+[17:12.00]找到更多新的节目
+[17:14.00]找到更多新的节目
+[17:16.00]找到更多新的节目
+[17:18.00]找到更多新的节目
+[17:20.00]找到更多新的节目
+[17:22.00]找到更多新的节目
+[17:24.00]找到更多新的节目
+[17:26.00]找到更多新的节目
+[17:28.00]找到更多新的节目
+[17:30.00]找到更多新的节目
+[17:32.00]找到更多新的节目
+[17:34.00]找到更多新的节目
+[17:36.00]找到更多新的节目
+[17:38.00]找到更多新的节目
+[17:40.00]找到更多新的节目
+[17:42.00]找到更多新的节目
+[17:44.00]找到更多新的节目
+[17:46.00]找到更多新的节目
+[17:48.00]找到更多新的节目
+[17:50.00]找到更多新的节目
+[17:52.00]找到更多新的节目
+[17:54.00]找到更多新的节目
+[17:56.00]找到更多新的节目
+[17:58.00]找到更多新的节目
+[18:00.00]找到更多新的节目
+[18:02.00]找到更多新的节目
+[18:04.00]找到更多新的节目
+[18:06.00]找到更多新的节目
+[18:08.00]找到更多新的节目
+[18:10.00]找到更多新的节目
+[18:12.00]找到更多新的节目
+[18:14.00]找到更多新的节目
+[18:16.00]找到更多新的节目
+[18:18.00]找到更多新的节目
+[18:20.00]找到更多新的节目
+[18:22.00]找到更多新的节目
+[18:24.00]找到更多新的节目
+[18:26.00]找到更多新的节目
+[18:28.00]找到更多新的节目
+[18:30.00]找到更多新的节目
+[18:32.00]找到更多新的节目
+[18:34.00]找到更多新的节目
+[18:36.00]找到更多新的节目
+[18:38.00]找到更多新的节目
+[18:40.00]找到更多新的节目
+[18:42.00]找到更多新的节目
+[18:44.00]找到更多新的节目
+[18:46.00]找到更多新的节目
+[18:48.00]找到更多新的节目
+[18:50.00]找到更多新的节目
+[18:52.00]找到更多新的节目
+[18:54.00]找到更多新的节目
+[18:56.00]找到更多新的节目
+[18:58.00]找到更多新的节目
+[19:00.00]找到更多新的节目
+[19:02.00]找到更多新的节目
+[19:04.00]找到更多新的节目
+[19:06.00]找到更多新的节目
+[19:08.00]找到更多新的节目
+[19:10.00]找到更多新的节目
+[19:12.00]找到更多新的节目
+[19:14.00]找到更多新的节目
+[19:16.00]找到更多新的节目
+[19:18.00]找到更多新的节目
+[19:20.00]找到更多新的节目
+[19:22.00]找到更多新的节目
+[19:24.00]找到更多新的节目
+[19:26.00]找到更多新的节目
+[19:28.00]找到更多新的节目
+[19:30.00]找到更多新的节目
+[19:32.00]找到更多新的节目
+[19:34.00]找到更多新的节目
+[19:36.00]找到更多新的节目
+[19:38.00]找到更多新的节目
+[19:40.00]找到更多新的节目
+[19:42.00]找到更多新的节目
+[19:44.00]找到更多新的节目
+[19:46.00]找到更多新的节目
+[19:48.00]找到更多新的节目
+[19:50.00]找到更多新的节目
+[19:52.00]找到更多新的节目
+[19:54.00]找到更多新的节目
+[19:56.00]找到更多新的节目
+[19:58.00]找到更多新的节目
+[20:00.00]找到更多新的节目
+[20:02.00]找到更多新的节目
+[20:04.00]找到更多新的节目
+[20:06.00]找到更多新的节目
+[20:08.00]找到更多新的节目
+[20:10.00]找到更多新的节目
+[20:12.00]找到更多新的节目
+[20:14.00]找到更多新的节目
+[20:16.00]找到更多新的节目
+[20:18.00]找到更多新的节目
+[20:20.00]找到更多新的节目
+[20:22.00]找到更多新的节目
+[20:24.00]找到更多新的节目
+[20:26.00]找到更多新的节目
+[20:28.00]找到更多新的节目
+[20:30.00]找到更多新的节目
+[20:32.00]找到更多新的节目
+[20:34.00]找到更多新的节目
+[20:36.00]找到更多新的节目
+[20:38.00]找到更多新的节目
+[20:40.00]找到更多新的节目
+[20:42.00]找到更多新的节目
+[20:44.00]找到更多新的节目
+[20:46.00]找到更多新的节目
+[20:48.00]找到更多新的节目
+[20:50.00]找到更多新的节目
+[20:52.00]找到更多新的节目
+[20:54.00]找到更多新的节目
+[20:56.00]找到更多新的节目
+[20:58.00]找到更多新的节目
+[21:00.00]找到更多新的节目
+[21:02.00]找到更多新的节目
+[21:04.00]找到更多新的节目
+[21:06.00]找到更多新的节目
+[21:08.00]找到更多新的节目
+[21:10.00]找到更多新的节目
+[21:12.00]找到更多新的节目
+[21:14.00]找到更多新的节目
+[21:16.00]找到更多新的节目
+[21:18.00]找到更多新的节目
+[21:20.00]找到更多新的节目
+[21:22.00]找到更多新的节目
+[21:24.00]找到更多新的节目
+[21:26.00]找到更多新的节目
+[21:28.00]找到更多新的节目
+[21:30.00]找到更多新的节目
+[21:32.00]找到更多新的节目
+[21:34.00]找到更多新的节目
+[21:36.00]找到更多新的节目
+[21:38.00]找到更多新的节目
+[21:40.00]找到更多新的节目
+[21:42.00]找到更多新的节目
+[21:44.00]找到更多新的节目
+[21:46.00]找到更多新的节目
+[21:48.00]找到更多新的节目
+[21:50.00]找到更多新的节目
+[21:52.00]找到更多新的节目
+[21:54.00]找到更多新的节目
+[21:56.00]找到更多新的节目
+[21:58.00]找到更多新的节目
+[22:00.00]找到更多新的节目
+[22:02.00]找到更多新的节目
+[22:04.00]找到更多新的节目
+[22:06.00]找到更多新的节目
+[22:08.00]找到更多新的节目
+[22:10.00]找到更多新的节目
+[22:12.00]找到更多新的节目
+[22:14.00]找到更多新的节目
+[22:16.00]找到更多新的节目
+[22:18.00]找到更多新的节目
+[22:20.00]找到更多新的节目
+[22:22.00]找到更多新的节目
+[22:24.00]找到更多新的节目
+[22:26.00]找到更多新的节目
+[22:28.00]找到更多新的节目
+[22:30.00]找到更多新的节目
+[22:32.00]找到更多新的节目
+[22:34.00]找到更多新的节目
+[22:36.00]找到更多新的节目
+[22:38.00]找到更多新的节目
+[22:40.00]找到更多新的节目
+[22:42.00]找到更多新的节目
+[22:44.00]找到更多新的节目
+[22:46.00]找到更多新的节目
+[22:48.00]找到更多新的节目
+[22:50.00]找到更多新的节目
+[22:52.00]找到更多新的节目
+[22:54.00]找到更多新的节目
+[22:56.00]找到更多新的节目
+[22:58.00]找到更多新的节目
+[23:00.00]找到更多新的节目
+[23:02.00]找到更多新的节目
+[23:04.00]找到更多新的节目
+[23:06.00]找到更多新的节目
+[23:08.00]找到更多新的节目
+[23:10.00]找到更多新的节目
+[23:12.00]找到更多新的节目
+[23:14.00]找到更多新的节目
+[23:16.00]找到更多新的节目
+[23:18.00]找到更多新的节目
+[23:20.00]找到更多新的节目
+[23:22.00]找到更多新的节目
+[23:24.00]找到更多新的节目
+[23:26.00]找到更多新的节目
+[23:28.00]找到更多新的节目
+[23:30.00]找到更多新的节目
+[23:32.00]找到更多新的节目
+[23:34.00]找到更多新的节目
+[23:36.00]找到更多新的节目
+[23:38.00]找到更多新的节目
+[23:40.00]找到更多新的节目
+[23:42.00]找到更多新的节目
+[23:44.00]找到更多新的节目
+[23:46.00]找到更多新的节目
+[23:48.00]找到更多新的节目
+[23:50.00]找到更多新的节目
+[23:52.00]找到更多新的节目
+[23:54.00]找到更多新的节目
+[23:56.00]找到更多新的节目
+[23:58.00]找到更多新的节目
+[24:00.00]找到更多新的节目
+[24:02.00]找到更多新的节目
+[24:04.00]找到更多新的节目
+[24:06.00]找到更多新的节目
+[24:08.00]找到更多新的节目
+[24:10.00]找到更多新的节目
+[24:12.00]找到更多新的节目
+[24:14.00]找到更多新的节目
+[24:16.00]找到更多新的节目
+[24:18.00]找到更多新的节目
+[24:20.00]找到更多新的节目
+[24:22.00]找到更多新的节目
+[24:24.00]找到更多新的节目
+[24:26.00]找到更多新的节目
+[24:28.00]找到更多新的节目
+[24:30.00]找到更多新的节目
+[24:32.00]找到更多新的节目
+[24:34.00]找到更多新的节目
+[24:36.00]找到更多新的节目
+[24:38.00]找到更多新的节目
+[24:40.00]找到更多新的节目
+[24:42.00]找到更多新的节目
+[24:44.00]找到更多新的节目
+[24:46.00]找到更多新的节目
+[24:48.00]找到更多新的节目
+[24:50.00]找到更多新的节目
+[24:52.00]找到更多新的节目
+[24:54.00]找到更多新的节目
+[24:56.00]找到更多新的节目
+[24:58.00]找到更多新的节目
+[25:00.00]找到更多新的节目
+[25:02.00]找到更多新的节目
+[25:04.00]找到更多新的节目
+[25:06.00]找到更多新的节目
+[25:08.00]找到更多新的节目
+[25:10.00]找到更多新的节目
+[25:12.00]找到更多新的节目
+[25:14.00]找到更多新的节目
+[25:16.00]找到更多新的节目
+[25:18.00]找到更多新的节目
+[25:20.00]找到更多新的节目
+[25:22.00]找到更多新的节目
+[25:24.00]找到更多新的节目
+[25:26.00]找到更多新的节目
+[25:28.00]找到更多新的节目
+[25:30.00]找到更多新的节目
+[25:32.00]找到更多新的节目
+[25:34.00]找到更多新的节目
+[25:36.00]找到更多新的节目
+[25:38.00]找到更多新的节目
+[25:40.00]找到更多新的节目
+[25:42.00]找到更多新的节目
+[25:44.00]找到更多新的节目
+[25:46.00]找到更多新的节目
+[25:48.00]找到更多新的节目
+[25:50.00]找到更多新的节目
+[25:52.00]找到更多新的节目
+[25:54.00]找到更多新的节目
+[25:56.00]找到更多新的节目
+[25:58.00]找到更多新的节目
+[26:00.00]找到更多新的节目
+[26:02.00]找到更多新的节目
+[26:04.00]找到更多新的节目
+[26:06.00]找到更多新的节目
+[26:08.00]找到更多新的节目
+[26:10.00]找到更多新的节目
+[26:12.00]找到更多新的节目
+[26:14.00]找到更多新的节目
+[26:16.00]找到更多新的节目
+[26:18.00]找到更多新的节目
+[26:20.00]找到更多新的节目
+[26:22.00]找到更多新的节目
+[26:24.00]找到更多新的节目
+[26:26.00]找到更多新的节目
+[26:28.00]找到更多新的节目
+[26:30.00]找到更多新的节目
+[26:32.00]找到更多新的节目
+[26:34.00]找到更多新的节目
+[26:36.00]找到更多新的节目
+[26:38.00]找到更多新的节目
+[26:40.00]找到更多新的节目
+[26:42.00]找到更多新的节目
+[26:44.00]找到更多新的节目
+[26:46.00]找到更多新的节目
+[26:48.00]找到更多新的节目
+[26:50.00]找到更多新的节目
+[26:52.00]找到更多新的节目
+[26:54.00]找到更多新的节目
+[26:56.00]找到更多新的节目
+[26:58.00]找到更多新的节目
+[27:00.00]找到更多新的节目
+[27:02.00]找到更多新的节目
+[27:04.00]找到更多新的节目
+[27:06.00]找到更多新的节目
+[27:08.00]找到更多新的节目
+[27:10.00]找到更多新的节目
+[27:12.00]找到更多新的节目
+[27:14.00]找到更多新的节目
+[27:16.00]找到更多新的节目
+[27:18.00]找到更多新的节目
+[27:20.00]找到更多新的节目
+[27:22.00]找到更多新的节目
+[27:24.00]找到更多新的节目
+[27:26.00]找到更多新的节目
+[27:28.00]找到更多新的节目
+[27:30.00]找到更多新的节目
+[27:32.00]找到更多新的节目
+[27:34.00]找到更多新的节目
+[27:36.00]找到更多新的节目
+[27:38.00]找到更多新的节目
+[27:40.00]找到更多新的节目
+[27:42.00]找到更多新的节目
+[27:44.00]找到更多新的节目
+[27:46.00]找到更多新的节目
+[27:48.00]找到更多新的节目
+[27:50.00]找到更多新的节目
+[27:52.00]找到更多新的节目
+[27:54.00]找到更多新的节目
+[27:56.00]找到更多新的节目
+[27:58.00]找到更多新的节目
+[28:00.00]找到更多新的节目
+[28:02.00]找到更多新的节目
+[28:04.00]找到更多新的节目
+[28:06.00]找到更多新的节目
+[28:08.00]找到更多新的节目
+[28:10.00]找到更多新的节目
+[28:12.00]找到更多新的节目
+[28:14.00]找到更多新的节目
+[28:16.00]找到更多新的节目
+[28:18.00]找到更多新的节目
+[28:20.00]找到更多新的节目
+[28:22.00]找到更多新的节目
+[28:24.00]找到更多新的节目
+[28:26.00]找到更多新的节目
+[28:28.00]找到更多新的节目
+[28:30.00]找到更多新的节目
+[28:32.00]找到更多新的节目
+[28:34.00]找到更多新的节目
+[28:36.00]找到更多新的节目
+[28:38.00]找到更多新的节目
+[28:40.00]找到更多新的节目
+[28:42.00]找到更多新的节目
+[28:44.00]找到更多新的节目
+[28:46.00]找到更多新的节目
+[28:48.00]找到更多新的节目
+[28:50.00]找到更多新的节目
+[28:52.00]找到更多新的节目
+[28:54.00]找到更多新的节目
+[28:56.00]找到更多新的节目
+[28:58.00]找到更多新的节目
+[29:00.00]找到更多新的节目
+[29:02.00]找到更多新的节目
+[29:04.00]找到更多新的节目
+[29:06.00]找到更多新的节目
+[29:08.00]找到更多新的节目
+[29:10.00]找到更多新的节目
+[29:12.00]找到更多新的节目
+[29:14.00]找到更多新的节目
+[29:16.00]找到更多新的节目
+[29:18.00]找到更多新的节目
+[29:20.00]找到更多新的节目
+[29:22.00]找到更多新的节目
+[29:24.00]找到更多新的节目
+[29:26.00]找到更多新的节目
+[29:28.00]找到更多新的节目
+[29:30.00]找到更多新的节目
+[29:32.00]找到更多新的节目
+[29:34.00]找到更多新的节目
+[29:36.00]找到更多新的节目
+[29:38.00]找到更多新的节目
+[29:40.00]找到更多新的节目
+[29:42.00]找到更多新的节目
+[29:44.00]找到更多新的节目
+[29:46.00]找到更多新的节目
+[29:48.00]找到更多新的节目
+[29:50.00]找到更多新的节目
+[29:52.00]找到更多新的节目
+[29:54.00]找到更多新的节目
+[29:56.00]找到更多新的节目
+[29:58.00]找到更多新的节目
+[30:00.00]找到更多新的节目
+[30:02.00]找到更多新的节目
+[30:04.00]找到更多新的节目
+[30:06.00]找到更多新的节目
+[30:08.00]找到更多新的节目
+[30:10.00]找到更多新的节目
+[30:12.00]找到更多新的节目
+[30:14.00]找到更多新的节目
+[30:16.00]找到更多新的节目
+[30:18.00]找到更多新的节目
+[30:20.00]找到更多新的节目
+[30:22.00]找到更多新的节目
+[30:24.00]找到更多新的节目
+[30:26.00]找到更多新的节目
+[30:28.00]找到更多新的节目
+[30:30.00]找到更多新的节目
+[30:32.00]找到更多新的节目
+[30:34.00]找到更多新的节目
+[30:36.00]找到更多新的节目
+[30:38.00]找到更多新的节目
+[30:40.00]找到更多新的节目
+[30:42.00]找到更多新的节目
+[30:44.00]找到更多新的节目
+[30:46.00]找到更多新的节目
+[30:48.00]找到更多新的节目
+[30:50.00]所以我们现在 slowly
+[30:52.00]改变了 现在我们现在慢慢改变了
+[30:54.00]我在将来归咖巴斯
+[30:56.00]没有改变了 没有改变了
+[30:58.00]但是现在我们
+[31:00.00]有一个世界的 一个世界的
+[31:02.00]有一个世界的 哪里有Claude
+[31:04.00]还有Gemnite 还有Gypsy4
+[31:06.00]希望更多的 会变得更多
+[31:08.00]希望更多的 希望更多的
+[31:10.00]所以 说说 说说
+[31:12.00]我们的视频 视频 视频
+[31:14.00]所以 非常大 非常大 因为
+[31:16.00]我们的视频 也不可以用
+[31:18.00]但我认为
+[31:20.00]大陆的社区
+[31:22.00]要他们继续建设
+[31:24.00]然后他们必须找一些方法
+[31:26.00]才能发现他们会做的
+[31:28.00]所以他们明白
+[31:30.00]他们必须解决他们要的
+[31:32.00]但是我们的视频 也有
+[31:34.00]Mistral 也有
+[31:36.00]Grock 现在
+[31:38.00]Grock 1 从 从 oktober
+[31:40.00]是公司
+[31:41.00]对对对对
+[31:42.00]你以为Grock是Grock the chip company
+[31:44.00]Grock the chip company
+[31:46.00]当然 浪漫3是
+[31:48.00]所有人都在问
+[31:50.00]我的感觉是
+[31:52.00]小时候 宋可伯
+[31:54.00]刚才说了浪漫3
+[31:56.00]说了 至少从
+[31:58.00]一个想法的选择
+[32:00.00]他不想 怎么做
+[32:02.00]要保持
+[32:04.00]能够保持 能够保持
+[32:06.00]Mistral 也想想
+[32:08.00]你去到 怎么
+[32:10.00]他 能够发展
+[32:12.00]每个人都好 任何
+[32:14.00]对 从我听过
+[32:16.00]在GDC 浪漫3
+[32:18.00]最大的模式是
+[32:20.00]260-300billion
+[32:22.00]所以那是大部分的
+[32:24.00]那不是一个开放模式
+[32:26.00]你不能给人们
+[32:28.00]300billion的模式
+[32:30.00]要用它 非常有力量
+[32:32.00]所以我认为
+[32:34.00]它是 可能是开放模式
+[32:36.00]但那是一个不同的问题
+[32:38.00]对对对
+[32:40.00]它是 比他们做的
+[32:42.00]在开放模式
+[32:44.00]在浪漫上
+[32:46.00]你能够使用
+[32:48.00]开放的AI
+[32:50.00]开放的安逸
+[32:52.00]一些公司在
+[32:54.00]中央的强硬度
+[32:56.00]所以我们
+[32:58.00]在Buckets 上
+[33:00.00]他们做了很多
+[33:02.00]他们比PyDorch 更好
+[33:04.00]那是 可能
+[33:06.00]在弥術上
+[33:08.00]可能是 可能是
+[33:10.00]I love the duck destroying
+[33:12.00]a lot of monopolies arc
+[33:14.00]it's been very entertaining
+[33:16.00]let's bridge into the
+[33:18.00]big tech side of this
+[33:20.00]I think when I did my episode
+[33:22.00]I added this as an additional war
+[33:24.00]that's something I'm paying attention to
+[33:26.00]so we've got
+[33:28.00]Microsoft's moves with inflection
+[33:30.00]which I think potentially are
+[33:32.00]being read as
+[33:34.00]a shift vis-a-vis the relationship
+[33:36.00]with open AI
+[33:38.00]missure a large relationship
+[33:40.00]seems to reinforce as well
+[33:42.00]we have apple potentially
+[33:44.00]entering the race finally
+[33:46.00]giving up project titan
+[33:48.00]and trying to spend more effort on this
+[33:50.00]although counterpoint
+[33:52.00]we also have them talking about
+[33:54.00]there being reports of a deal with google
+[33:56.00]which is interesting to see
+[33:58.00]what their strategy there is
+[34:00.00]and then metas been largely quiet
+[34:02.00]we just talked about the main piece
+[34:04.00]but there's spoilers
+[34:06.00]one of those things has been most interesting
+[34:08.00]to you guys as you think about
+[34:10.00]what's going to shake out for the rest of this year
+[34:12.00]let's take a crack
+[34:14.00]the reason we don't have a fifth war
+[34:16.00]for the big tech wars
+[34:18.00]that's one of those things where I just feel
+[34:20.00]we don't cover differently
+[34:22.00]from other media channels
+[34:24.00]I guess
+[34:26.00]in our entire interest
+[34:28.00]we try not to cover the big tech gamethrones
+[34:30.00]or it's proxied through
+[34:32.00]all the other four wars anyway
+[34:34.00]there's just a lot of overlap
+[34:36.00]but yeah I think absolutely personally
+[34:38.00]the most interesting one is apple entering the race
+[34:40.00]they actually release, they are announced
+[34:42.00]their first large language model that they train themselves
+[34:44.00]it's like a 30 billion multimodal model
+[34:46.00]people weren't that impressed
+[34:48.00]but it was like the first time
+[34:50.00]that apple has kind of showcased that
+[34:52.00]we're training large models in house as well
+[34:54.00]of course they might be
+[34:56.00]doing this deal with google
+[34:58.00]it sounds very sort of rumourary to me
+[35:00.00]and it's probably if it's on device
+[35:02.00]it's going to be a smaller model
+[35:03.00]it's going to be smarter auto complete
+[35:05.00]I don't know what to say
+[35:07.00]I'm still here dealing with
+[35:09.00]Siri which hasn't
+[35:11.00]probably hasn't been updated since
+[35:13.00]God knows when it was introduced
+[35:15.00]it's horrible and it
+[35:17.00]makes me so angry
+[35:19.00]one as an apple customer and user
+[35:21.00]I'm just hoping for better ai on apple itself
+[35:23.00]but two they are
+[35:25.00]the gold standard
+[35:27.00]when it comes to local devices
+[35:29.00]personal compute and trust
+[35:31.00]you trust them with your data
+[35:33.00]and I think
+[35:35.00]that's what a lot of people are looking for in ai
+[35:37.00]that they love the benefits of ai
+[35:39.00]they don't love the downsides
+[35:41.00]which is that you have to send all your data
+[35:43.00]to some clouds somewhere and some of this data
+[35:45.00]that we're going to feed ai is the most personal data there is
+[35:47.00]so apple being
+[35:49.00]one of the most trusted personal
+[35:51.00]data companies I think it's very important
+[35:53.00]that they enter the ai race
+[35:55.00]and I hope to see more out of them
+[35:57.00]to me the biggest question
+[35:59.00]it's like who's paying who
+[36:01.00]because for the browsers
+[36:03.00]google pays apple like 18
+[36:05.00]20 billion every year
+[36:07.00]to be the default browser
+[36:09.00]is google going to pay you to have javanai
+[36:11.00]or is apple paying google to have javanai
+[36:13.00]I think that's like what I'm most interested
+[36:15.00]to figure out because with the browsers
+[36:17.00]it's like it's the entry point
+[36:19.00]to the thing so it's really valuable
+[36:21.00]to be the default that's what google pays
+[36:23.00]but I wonder if the perception in ai
+[36:25.00]is going to be like hey
+[36:27.00]you have a good local model on my phone
+[36:29.00]to be worth me purchasing your device
+[36:31.00]and that's going to drive apple
+[36:33.00]to be the one buying the model
+[36:35.00]but then like Sean said
+[36:37.00]they're doing the mm1 themselves
+[36:39.00]are they saying we do models
+[36:41.00]but they're not as good as the google ones
+[36:43.00]I don't know the whole thing is really confusing
+[36:45.00]but it makes for a great meme
+[36:47.00]material on twitter
+[36:49.00]I think like
+[36:51.00]they are possibly more than
+[36:53.00]open ai and mic microsoft and amazon
+[36:55.00]they are the most full stack company there is
+[36:57.00]in computing
+[36:59.00]and so
+[37:01.00]like they own the chips man
+[37:03.00]like they manufacture everything
+[37:05.00]so if there was a company
+[37:07.00]that could seriously challenge
+[37:09.00]the other ai players it would be apple
+[37:11.00]and it's
+[37:13.00]I don't think it's as hard as self-driving
+[37:15.00]so like maybe they've just been
+[37:17.00]investing in the wrong thing this whole time
+[37:19.00]Wallstreet certainly thinks so
+[37:21.00]Wallstreet love that move man
+[37:23.00]there's a big sigh of relief
+[37:25.00]well let's move away
+[37:27.00]from sort of the big stuff
+[37:29.00]I think to both of your points
+[37:31.00]can I drop one factoid
+[37:33.00]about this wallstreet thing
+[37:35.00]I went and looked at
+[37:37.00]when from being a VR company
+[37:39.00]to an ai company
+[37:41.00]and I think
+[37:43.00]the stock
+[37:45.00]I'm trying to look up the details now
+[37:47.00]the stock has gone up 187%
+[37:49.00]since limo one
+[37:51.00]$830 billion in market value
+[37:53.00]created in the past year
+[37:55.00]if you haven't seen that chart
+[37:59.00]it's actually remarkable if you draw
+[38:01.00]a little arrow on it
+[38:03.00]it's likeno we're an ai company now
+[38:05.00]forget the VR thing
+[38:07.00]it isn't interesting
+[38:11.00]no I think unless you called it
+[38:13.00]zuck's disruptor arc or whatever
+[38:15.00]he really does
+[38:17.00]he is in the midst of a total
+[38:19.00]it's a redemption arc or it's just
+[38:21.00]it's something different where
+[38:23.00]he's sort of the spoiler like
+[38:25.00]people loved him
+[38:27.00]just freestyle talking about why he thought
+[38:29.00]they had a better headset than apple
+[38:31.00]even if they didn't agree they just loved
+[38:33.00]he was going direct to camera and talking about it
+[38:35.00]for five minutes or whatever
+[38:37.00]that's a fascinating shift that I don't think
+[38:39.00]anyone had on their bingo card
+[38:41.00]whatever two years ago
+[38:43.00]it's still there in cn5 d-long
+[38:45.00]don't write it off
+[38:47.00]we need to see him fight in the coliseum
+[38:49.00]no I think in terms of
+[38:51.00]self
+[38:53.00]management life leadership
+[38:55.00]there's a lot of lessons to learn from him
+[38:57.00]you might kind of quibble
+[38:59.00]with the social impact of facebook
+[39:01.00]but just himself
+[39:03.00]in terms of personal growth
+[39:05.00]and perseverance through
+[39:07.00]a lot of change
+[39:09.00]everyone throwing stuff his way
+[39:11.00]I think there's a lot to say about
+[39:13.00]to learn from zuck
+[39:15.00]he's my age
+[39:17.00]awesome
+[39:19.00]so one of the big things
+[39:21.00]that I think you guys have
+[39:23.00]distinct and unique insight into
+[39:25.00]being where you are and where you work on
+[39:27.00]iswhat developers
+[39:29.00]are getting really excited about right now
+[39:31.00]and by that I mean on the one hand
+[39:33.00]certainly start ups who are actually
+[39:35.00]formalized and formed to start ups
+[39:37.00]but also just in terms of
+[39:39.00]what people are spending their nights and weekends on
+[39:41.00]what they're coming to hackathons to do
+[39:43.00]and you know I think it's a
+[39:45.00]it's such a fascinating indicator
+[39:47.00]for where things are headed like
+[39:49.00]if you zoom back a year
+[39:51.00]right now was right when everyone was getting
+[39:53.00]so so excited about
+[39:55.00]ai agent stuff
+[39:57.00]auto gpt and baby agi and these things were like
+[39:59.00]if you dropped anything on youtube about those
+[40:01.00]like instantly tens of thousands of views
+[40:03.00]I know because I had like
+[40:05.00]a 50,000 view video
+[40:07.00]like the second day that I was doing
+[40:09.00]the show on youtube you know because I was talking about
+[40:11.00]auto gpt and so anyways
+[40:13.00]you know obviously that's sort of not totally
+[40:15.00]come to fruition yet but what are some of the
+[40:17.00]trends and what you guys are seeing in terms of
+[40:19.00]people's interest and what people are building
+[40:21.00]I can start maybe with the agents part
+[40:23.00]and then I know Sean is doing a
+[40:25.00]diffusion meetup tonight there's
+[40:27.00]a lot of different things
+[40:29.00]the agent wave has been the most
+[40:31.00]interesting kind of like dream
+[40:33.00]to reality
+[40:35.00]arc so auto gpt I think
+[40:37.00]they went from zero to like
+[40:39.00]125,000 get up stars in six weeks
+[40:41.00]and then one year later
+[40:43.00]they have 150,000
+[40:45.00]stars so there's kind of been a big
+[40:47.00]plot so I mean you might say
+[40:49.00]there's just not that many people that can
+[40:51.00]start it you know everybody already started
+[40:53.00]but the promise of
+[40:55.00]hey I'll just give you a goal
+[40:57.00]and you do it I think it's like
+[40:59.00]amazing to get people's
+[41:01.00]imagination going you know
+[41:03.00]they're like oh wow this is
+[41:05.00]this is awesome everybody
+[41:07.00]can try this to do anything
+[41:09.00]but then as technologists
+[41:11.00]you're like well that's
+[41:13.00]that's just like not possible you know
+[41:15.00]we would have like solved everything and
+[41:17.00]I think it takes a little bit to go from
+[41:19.00]the promise and the hope
+[41:21.00]that people show you to then
+[41:23.00]trinate yourself and going back to say
+[41:25.00]okay this is not really working for me and
+[41:27.00]David won from adept you know
+[41:29.00]they in our episode he specifically said
+[41:31.00]we don't want to do a bottom sub product
+[41:33.00]you knowwe don't want something that everybody
+[41:35.00]could try because it's really hard to get it
+[41:37.00]to be reliable so
+[41:39.00]we're seeing a lot of companies
+[41:41.00]doing vertical agents that
+[41:43.00]are narrow for a specific
+[41:45.00]domain and they're very good at something
+[41:47.00]Myconover who was at Databricks before
+[41:49.00]is also a friend of Layton space
+[41:51.00]he's doing this new company go bright with
+[41:53.00]doing AI agents for financial research
+[41:55.00]and that's it you know and
+[41:57.00]they're doing very well there are
+[41:59.00]other companies doing it in
+[42:01.00]security doing it in
+[42:03.00]compliance doing it in legal
+[42:05.00]all of these things that like
+[42:07.00]people nobody
+[42:09.00]just wakes up and sayoh I
+[42:11.00]cannot wait to go on auto gpd and ask
+[42:13.00]it to do a compliance review of my thing
+[42:15.00]you know just not what inspires people
+[42:17.00]so think the gap on the developer
+[42:19.00]side has been the more bottom sub
+[42:21.00]hacker mentality is trying to build
+[42:23.00]this like very generic
+[42:25.00]agents that can do a lot of open
+[42:27.00]ended task and then the more business
+[42:29.00]side of things is like hey if I want
+[42:31.00]to raise my next round I cannot
+[42:33.00]just like sit around the mess
+[42:35.00]mess around with like super generic
+[42:37.00]stuff I need to find a use case that
+[42:39.00]really works and I think that that
+[42:41.00]is worth for a lot of folks in
+[42:43.00]parallel you have a lot of companies
+[42:45.00]doing evals there are dozens
+[42:47.00]of them that just want to help you
+[42:49.00]measure how good your models are
+[42:51.00]doing again if you build evals
+[42:53.00]you need to also have a restrained
+[42:55.00]surface area to actually figure out
+[42:57.00]whether or not it's good right because
+[42:59.00]there's a lot of stuff going on
+[43:01.00]and everything under the sun so
+[43:03.00]that's another category where I've
+[43:05.00]seen from the sort of
+[43:07.00]pitches that I've seen there's a
+[43:09.00]lot of interest in the enterprise
+[43:11.00]it's just like really fragmented
+[43:13.00]because the production use cases
+[43:15.00]are just coming like now you know
+[43:17.00]there are not a lot of long established
+[43:19.00]ones to test against and so
+[43:21.00]that's kind of on the virtual agents
+[43:23.00]and then the robotic side
+[43:25.00]it's probably been the thing that
+[43:27.00]the amount of robots that were there
+[43:29.00]there were just like robots everywhere
+[43:31.00]both in the keynote and then on the show floor
+[43:33.00]you would haveboston dynamics
+[43:35.00]dogs running around
+[43:37.00]there was like this like fox
+[43:39.00]robot that had like a virtual face
+[43:41.00]that like talked to you and like moved
+[43:43.00]in realtime there were industrial
+[43:45.00]robots.imbedia did a big push
+[43:47.00]on their own omniverse thing which
+[43:49.00]is like thisdigital twin
+[43:51.00]ofwhatever environments you're in
+[43:53.00]that you can use to train the robots agents
+[43:55.00]so that kind of takes people back to the
+[43:57.00]reinforcement learning days but
+[43:59.00]yeah agents people want them
+[44:01.00]you know people want them I give a talk
+[44:03.00]about the rise of the full stack employees
+[44:05.00]and kind of this future the same way
+[44:07.00]full stack engineers kind of work
+[44:09.00]agross the stack in the future every
+[44:11.00]employee is going to interact with
+[44:13.00]every part of the organization through
+[44:15.00]agents and AI enabled tooling
+[44:17.00]this is happening it just needs to be a
+[44:19.00]lot more narrow than maybe the first
+[44:21.00]approach that we took which is just
+[44:23.00]short of super interesting stuff going on
+[44:25.00]yeah but he less you covered
+[44:27.00]a lot of stuff there I'll separate
+[44:29.00]the robotics piece because I feel like
+[44:31.00]that's so different from the software world
+[44:33.00]but yeah we do we do talk to a lot of
+[44:35.00]engineers and you know that that this is
+[44:37.00]our sort of bread and butter and I do agree
+[44:39.00]that vertical agents have worked out a
+[44:41.00]lot better than the horizontal ones
+[44:43.00]I think you know the point I'll make
+[44:45.00]here is just the reason auto GPT
+[44:47.00]and maybe AGI you know it's in the
+[44:49.00]name like they were promising AGI
+[44:51.00]but I think people are discovering that you cannot
+[44:53.00]engineer your way to AGI it has to be
+[44:55.00]done at the model level and all these
+[44:57.00]engineer and prompt engineering
+[44:59.00]hacks on top of it weren't really going to
+[45:01.00]get us there in a meaningful way
+[45:03.00]without much further
+[45:05.00]improvements in the models
+[45:07.00]I would say I'll go so far as to say
+[45:09.00]even Devin which is I would
+[45:11.00]I think the most advanced agents
+[45:13.00]that we've ever seen still requires a
+[45:15.00]lot of engineering and still probably
+[45:17.00]falls apart a lot in terms of like
+[45:19.00]practical usage or it's just way too slow
+[45:21.00]and expensive for you know what it's
+[45:23.00]what is promised in comparison to video
+[45:25.00]so yeah that's what happened
+[45:27.00]of agents from last year
+[45:29.00]but I do see like vertical agents
+[45:31.00]being very popular and sometimes
+[45:33.00]I think the word agent might even be
+[45:35.00]overused sometimes like people don't
+[45:37.00]really care whether or not you call it an
+[45:39.00]AI agent right like does it replace
+[45:41.00]boring menial tasks that I do
+[45:43.00]that I might hire human to do or
+[45:45.00]that the human who is hired to do it
+[45:47.00]doesn't really want to do
+[45:49.00]and I think there's absolutely ways
+[45:51.00]in sort of a vertical context
+[45:53.00]that you can actually go after
+[45:55.00]very routine tasks that can be scaled out
+[45:57.00]to a lot of you know AI assistance
+[45:59.00]so yeah I would
+[46:01.00]basically plus one what I said there
+[46:03.00]I think it's very very promising
+[46:05.00]and I think more people should work on it
+[46:07.00]not less like there's not enough people
+[46:09.00]like this should be the main thrust
+[46:11.00]of the AI engineers to look out
+[46:13.00]look for use cases and go to production
+[46:15.00]instead of just always working on some
+[46:17.00]AI promising thing that never arrives
+[46:19.00]I can only add that
+[46:21.00]so I've been fiercely making
+[46:23.00]tutorials behind the scenes
+[46:25.00]around basically everything you can imagine
+[46:27.00]with AI we've probably done about 300
+[46:29.00]tutorials over the last couple months
+[46:31.00]and the verticalized
+[46:33.00]anything right like this is a solution
+[46:35.00]for your particular job
+[46:37.00]or role even if it's way less
+[46:39.00]interesting or
+[46:41.00]kind of sexy it's like so radically more
+[46:43.00]useful to people in terms of intersecting
+[46:45.00]with how like those are the ways that people are
+[46:47.00]actually adopting AI
+[46:49.00]in a lot of cases it's just a
+[46:51.00]thing that I do over and over again
+[46:53.00]by the way I think that's the same way that even the
+[46:55.00]generalized models are getting adopted you know
+[46:57.00]it's like I use mid-journey
+[46:59.00]for lots of stuff but the main thing I use it for
+[47:01.00]is youtube thumbnails every day like day in
+[47:03.00]day out I will always do a youtube thumbnail
+[47:05.00]you know or two with with mid-journey
+[47:07.00]and it's like you can you can start to extrapolate
+[47:09.00]that across a lot of things and all of a sudden
+[47:11.00]you know AI doesn't
+[47:13.00]it looks revolutionary because of
+[47:15.00]a million small changes rather than
+[47:17.00]one sort of big dramatic change
+[47:19.00]and I think that the verticalization of agents
+[47:21.00]is sort of a great example of
+[47:23.00]how that's going to play out too
+[47:25.00]So I'll have one caveat here
+[47:27.00]which is I think that because
+[47:29.00]multimodal models are now commonplace
+[47:31.00]like claw,generalized,openEI
+[47:33.00]all very very easily
+[47:35.00]multimodal,apples easily
+[47:37.00]multimodal,all this stuff
+[47:39.00]the switch for agents for sort of general desktop
+[47:41.00]browsing
+[47:43.00]that I think people need to keep an eye on
+[47:45.00]it's not mature yet
+[47:47.00]but it is absolutely coming on the way
+[47:49.00]and so just as we're starting to talk
+[47:51.00]about this verticalization piece
+[47:53.00]because that is mature
+[47:55.00]that is ready for people to work on
+[47:57.00]that is that a lot of people are making really good money doing that
+[47:59.00]the thing that's on the rise
+[48:01.00]is this sort of drive by vision
+[48:03.00]version of the agent where
+[48:05.00]they're not specifically taking in text or anything
+[48:07.00]but just watching your screen just like someone else would
+[48:09.00]and piloting it
+[48:11.00]by vision
+[48:13.00]in the episode with David
+[48:15.00]that will have dropped by the time that this airs
+[48:17.00]I think that is the promise of adept
+[48:19.00]that is the promise of what
+[48:21.00]a lot of these sort of desktop agents are
+[48:23.00]and that is the more general purpose
+[48:25.00]system that could be as big as
+[48:27.00]the browser
+[48:29.00]the operating system
+[48:31.00]people really want to build that
+[48:33.00]foundational piece of software in AI
+[48:35.00]and I would see the potential
+[48:37.00]therefor desktop agents being that
+[48:39.00]that you can have self-driving
+[48:41.00]computers. Don't write the horizontal
+[48:43.00]piece out. I just think we took a while
+[48:45.00]to get there. What else are you guys
+[48:47.00]seeing?That's interesting to you.
+[48:49.00]I'm looking at your notes and seeing a ton
+[48:51.00]ofcategories. I'll take
+[48:53.00]the next two as one category
+[48:55.00]which is basically alternative architectures.
+[48:57.00]The two main things that everyone
+[48:59.00]following AI kind of knows now is
+[49:01.00]one, the diffusion architecture
+[49:03.00]and two, the
+[49:05.00]let's just say the decoder only
+[49:07.00]transformer architecture that is popular
+[49:09.00]by GPT. You can read, you can look on
+[49:11.00]YouTube for thousands and thousands of tutorials
+[49:13.00]on each of those things. What we are talking about
+[49:15.00]here is what's next, what people are
+[49:17.00]researching and what could be on the horizon
+[49:19.00]that takes the place of those other two things.
+[49:21.00]So first we'll talk about transformer architectures
+[49:23.00]and then diffusion. So transform the two leading
+[49:25.00]candidates are effectively RWKV
+[49:27.00]and the state space models. The most recent
+[49:29.00]one of which is Mamba but there's others
+[49:31.00]and the S4H3
+[49:33.00]stuff coming out of Hazy Research in Stanford
+[49:35.00]and all of those are
+[49:37.00]non-quadratic
+[49:39.00]language models that scale
+[49:41.00]that promise to scale a lot better
+[49:43.00]than the traditional transformer
+[49:45.00]that this might be too theoretical
+[49:47.00]for most people right now
+[49:49.00]but it's going to be
+[49:51.00]it's going to come out in weird ways
+[49:53.00]where imagine if like right now
+[49:55.00]the talk of the town is that
+[49:57.00]Claude and Gemini have a million tokens of context
+[49:59.00]and you can put in like
+[50:01.00]two hours of video now
+[50:03.00]but what if you could throw in
+[50:05.00]200,000 hours of video
+[50:07.00]how does that change your
+[50:09.00]usage of AI
+[50:11.00]what if you could throw in the entire
+[50:13.00]genetic sequence of a human
+[50:15.00]and synthesize new drugs
+[50:17.00]how does that change things
+[50:19.00]we don't know because we haven't had access to this capability
+[50:21.00]being so cheap before
+[50:23.00]and that's the ultimate promise of these two models
+[50:25.00]they're not there yet
+[50:27.00]it's a very very good progress
+[50:29.00]RWKV and Mamba are probably the two leading examples
+[50:31.00]both of which are open source
+[50:33.00]that you can try them today
+[50:35.00]and have a lot of progress there
+[50:37.00]the main thing I'll highlight for other UKV
+[50:39.00]is that at the 7B level
+[50:41.00]they seem to have beat
+[50:43.00]Lama2 in all benchmarks
+[50:45.00]that matter
+[50:47.00]at the same size for the same amount of training
+[50:49.00]as an open source model so that's exciting
+[50:51.00]there are 7B now
+[50:53.00]they're not at 7TB we don't know if it'll scale
+[50:55.00]the other thing is diffusion
+[50:57.00]diffusions and transformers
+[50:59.00]are kind of on the collision course
+[51:01.00]the original stable diffusion already used
+[51:03.00]transformers in parts of its architecture
+[51:05.00]it seems that transformers are
+[51:07.00]eating more and more of those layers
+[51:09.00]particularly the VAE layer
+[51:11.00]so that's the diffusion transformer
+[51:13.00]is what Sora is built on
+[51:15.00]the guy who wrote the diffusion transformer
+[51:17.00]paper
+[51:19.00]bilt pebbles is the lead tech guy
+[51:21.00]on Sora
+[51:23.00]but there's more sort of experimentation
+[51:25.00]to diffusion I'm holding a meetup
+[51:27.00]actually here in San Francisco that's going to be like the state of diffusion
+[51:29.00]which I'm pretty excited about
+[51:31.00]stability's doing a lot of good work
+[51:33.00]and if you look at the architecture
+[51:35.00]of how they're creating
+[51:37.00]stable diffusion 3, our glass diffusion
+[51:39.00]and the inconsistency models
+[51:41.00]or SDXL turbo
+[51:43.00]all of these are like very very interesting innovations
+[51:45.00]on like the original idea
+[51:47.00]of what stable diffusion was so if you think
+[51:49.00]that it is expensive to create or slow to create
+[51:51.00]stable diffusion or AI-generated art
+[51:53.00]you are not up to date with the latest models
+[51:55.00]if you think it is hard to create
+[51:57.00]texted images you are not up to date with the latest models
+[51:59.00]and people still are kind of far behind
+[52:01.00]the last piece of which
+[52:03.00]is the wall cut I always kind of hold out
+[52:05.00]which is text diffusion
+[52:07.00]so instead of using auto generative
+[52:09.00]or auto regressive transformers
+[52:11.00]can you use text to diffuse
+[52:13.00]so you can use diffusion models to diffuse
+[52:15.00]and create entire chunks of text
+[52:17.00]all at once instead of token by token
+[52:19.00]and that is something that mid-journey confirmed today
+[52:21.00]because it was only rumored
+[52:23.00]the past few months but they confirmed today
+[52:25.00]that they were looking into
+[52:27.00]all those things are like very exciting new model
+[52:29.00]architectures that are maybe something
+[52:31.00]that you will see in production 2-3 years from now
+[52:33.00]so the couple of the trends that I want to just
+[52:37.00]get your takes on because they're sort of something
+[52:39.00]that seems like they're coming up are
+[52:41.00]one sort of these wearable
+[52:43.00]kind of passive
+[52:45.00]AI experiences where
+[52:47.00]they're absorbing a lot of what's going on around you
+[52:49.00]and then kind of bringing things back
+[52:51.00]and then the other one that I
+[52:53.00]wanted to see if you guys had thoughts on were
+[52:55.00]sort of this next generation of chip companies
+[52:57.00]obviously there's a huge amount of emphasis
+[52:59.00]on hardware and silicon
+[53:01.00]and different ways of doing things but
+[53:03.00]love your take on neither or both of those
+[53:05.00]so wearables
+[53:07.00]I'm very excited about it
+[53:09.00]I want wearables on me at all times
+[53:11.00]I have two right here to quantify my health
+[53:13.00]and I'm all for them
+[53:15.00]but society is not ready for wearables
+[53:17.00]no one's comfortable with
+[53:19.00]a device on recording every single
+[53:21.00]conversation we have even
+[53:23.00]all three of us here as
+[53:25.00]podcasters we don't record everything
+[53:27.00]that we say and I think
+[53:29.00]there's a social shift that needs to happen
+[53:31.00]I'm an investor in tab
+[53:33.00]they are renaming to a broader
+[53:35.00]vision but they are one of the
+[53:37.00]three or four leading wearables in this
+[53:39.00]space instead of the AI pendants
+[53:41.00]or AI OS
+[53:43.00]I have seen two humans
+[53:45.00]in a while in San Francisco
+[53:47.00]I'm very very excited to report
+[53:49.00]that there are people walking around
+[53:51.00]with those things on their chest
+[53:53.00]and it is as goofy as it sounds
+[53:55.00]it absolutely is going to fail
+[53:57.00]but god bless them for trying
+[53:59.00]and I've also bought a rabbit
+[54:01.00]so I'm very excited for all those things to arrive
+[54:03.00]but yeah people
+[54:05.00]are very keen on hardware
+[54:07.00]I think the idea that you can have physical objects
+[54:09.00]that embody an AI
+[54:11.00]do specific things for you
+[54:13.00]is as old as
+[54:15.00]the sort of golem
+[54:17.00]in sort of medieval times
+[54:19.00]in terms of like how much we want
+[54:21.00]our objects to be smart
+[54:23.00]and do things for us
+[54:25.00]and I think it's absolutely
+[54:27.00]a great play
+[54:29.00]the funny thing is people are much more willing
+[54:31.00]to pay you up front
+[54:33.00]for a hardware device
+[54:35.00]than they are willing to pay $8 a month
+[54:37.00]suscription recurring for software
+[54:39.00]and so the interesting economics
+[54:41.00]of these wearable companies
+[54:43.00]is they have negative float
+[54:45.00]in the sense that people pay deposits up front
+[54:47.00]like I paid
+[54:49.00]$200 for the rabbit up front
+[54:51.00]and I don't get it for another six months
+[54:53.00]I paid $600 for the tablet
+[54:55.00]I don't get it for another six months
+[54:57.00]and then they can take that money
+[54:59.00]and sort of invest it in their next
+[55:01.00]events or their next properties
+[55:03.00]or ventures
+[55:05.00]and I think that's a very interesting
+[55:07.00]comics from other types of
+[55:09.00]AI companies that I see
+[55:11.00]and I think just the tactile feel
+[55:13.00]of an AI I think is very promising
+[55:15.00]I don't know if you have other
+[55:17.00]thoughts on the wearable stuff
+[55:19.00]open interpreter
+[55:21.00]just announced their product four hours ago
+[55:23.00]which is not really a wearable
+[55:25.00]but it's still like a
+[55:27.00]physical device
+[55:29.00]it's a push to talk mic
+[55:31.00]to a device on your laptop
+[55:33.00]it's a $99 push
+[55:35.00]but again
+[55:37.00]go back to your point
+[55:39.00]people are interested in
+[55:41.00]spending money for things that they can hold
+[55:43.00]I don't know what that means overall
+[55:45.00]where things are going but
+[55:47.00]making more of this AI
+[55:49.00]be a physical part of your life
+[55:51.00]I think people are interested in that
+[55:53.00]but I agree with Sean
+[55:55.00]I talked to Avi about this
+[55:57.00]but Avi's point is like
+[55:59.00]most consumers care about utility
+[56:01.00]more than they care about privacy
+[56:03.00]I'm not like you've seen with social media
+[56:05.00]but I also think there's a big
+[56:07.00]social reaction
+[56:09.00]to AI that is like much more
+[56:11.00]rooted than the social media one
+[56:13.00]but we'll see
+[56:15.00]but a lot again a lot of work a lot of developers
+[56:17.00]a lot of money going into it so
+[56:19.00]there's bound to be experiments
+[56:21.00]being run on the chip
+[56:23.00]sorry I'll just
+[56:25.00]ship it one more thing and then we transition to the chips
+[56:27.00]the thing I'll caution people on is
+[56:29.00]don't overly focus on the form factor
+[56:31.00]the form factor is a delivery mode
+[56:33.00]there will be many form factors
+[56:35.00]it doesn't matter so much as
+[56:37.00]where in the data war does it sit
+[56:39.00]it actually is context acquisition
+[56:41.00]because and maybe a little bit of multi-modality
+[56:43.00]context is king
+[56:45.00]if you have access to data
+[56:47.00]that no one else has
+[56:49.00]then you will be able to create AI that no one else can create
+[56:51.00]and so what is the most personal context
+[56:53.00]it is your everyday conversation
+[56:55.00]it is as close to mapping your
+[56:57.00]mental train of thought as possible
+[56:59.00]without physically you writing down notes
+[57:01.00]so that is the promise
+[57:03.00]the ultimate goal here
+[57:05.00]which is like personal context
+[57:07.00]it's always available on you
+[57:09.00]a little ncl that stuff
+[57:11.00]that's the frame I want to give people
+[57:13.00]that the form factors will change
+[57:15.00]and there will be multiple form factors
+[57:17.00]but it's the software behind that
+[57:19.00]in the personal context
+[57:21.00]that you cannot get anywhere else
+[57:23.00]that'll win
+[57:25.00]so that was wearables
+[57:27.00]but it's not even a new release
+[57:29.00]because the company I think was started in 2016
+[57:31.00]so it's actually quite old
+[57:33.00]but now recently captured
+[57:35.00]people's imagination with their mixrall
+[57:37.00]500 tokens a second demo
+[57:39.00]yeah I think so far
+[57:41.00]the battle on the GPU side
+[57:43.00]has beeneither you go
+[57:45.00]kind of like massive chip
+[57:47.00]like this irreversible of the world
+[57:49.00]where one chip front
+[57:51.00]reversible is about 2 million dollars
+[57:53.00]you know that's compared
+[57:55.00]so you cannot compare one chip
+[57:57.00]versus one chip but h100
+[57:59.00]it's like 40,000
+[58:01.00]the problem with those architectures
+[58:03.00]has been
+[58:05.00]they want to be very general
+[58:07.00]but they wanted to put a lot
+[58:09.00]of the SRAM on the chip
+[58:11.00]it's much more convenient
+[58:13.00]when you're using larger language models
+[58:15.00]but the models outpace the size
+[58:17.00]of the chips and chips have a much longer
+[58:19.00]turnaround cycle
+[58:21.00]Grock today is great for the current architecture
+[58:23.00]it's a lot more expensive also
+[58:25.00]as far as dollar per flop
+[58:27.00]but there are the SIK
+[58:29.00]when you have very high concurrency
+[58:31.00]we actually were much cheaper
+[58:33.00]you shouldn't just be looking at the compute power
+[58:35.00]for most people this doesn't really matter
+[58:37.00]you know like I think that's like the most
+[58:39.00]the most interesting thing to me is like
+[58:41.00]we've now gone back
+[58:43.00]with AI to a world where
+[58:45.00]developers care
+[58:47.00]about what hardware is running
+[58:49.00]which was not the case in traditional software
+[58:51.00]maybe 20 years since
+[58:53.00]as the cloud has gotten really big
+[58:55.00]my thinking is that
+[58:57.00]in the next 2-3 years
+[58:59.00]we're going to go back to that
+[59:01.00]we're like people are not going to be sweating
+[59:03.00]what GPU do you have in your cloud
+[59:05.00]what do you have
+[59:07.00]you want to run this model
+[59:09.00]we can run it at the same speed as everybody else
+[59:11.00]and then everybody will make different choices
+[59:13.00]whether they want to have
+[59:15.00]higher front end capital investment
+[59:17.00]and then better utilization
+[59:19.00]and then upgrade later
+[59:21.00]there are a lot of parameters
+[59:23.00]and then there's the dark horses
+[59:25.00]that is some of the smaller companies
+[59:27.00]like Lemurian Labs, MedEx
+[59:29.00]that are working on
+[59:31.00]maybe not a chip alone
+[59:33.00]but also some of the actual math
+[59:35.00]infrastructure and the instructions
+[59:37.00]that make them run
+[59:39.00]there's a lot going on but
+[59:41.00]yeah I think the
+[59:43.00]the episode with Dylan will be
+[59:45.00]interesting for people but
+[59:47.00]hey everybody has pros and cons
+[59:49.00]it's different than the models
+[59:51.00]where you're like oh this one is definitely better for me
+[59:53.00]and I'm going to use it
+[59:55.00]I think for most people
+[59:57.00]it's like fun twitter meming
+[59:59.00]99% of people
+[60:01.00]that tweet about this stuff
+[60:03.00]are never going to buy any of these chips anyway
+[60:05.00]so it's really more for entertainment
+[60:07.00]wow I mean
+[60:09.00]this is serious business here
+[60:11.00]the potential new Nvidia
+[60:13.00]if anyone can take
+[60:15.00]i'm more talking about
+[60:17.00]how should people think about it
+[60:19.00]i think the end user
+[60:21.00]is not impacted as much
+[60:23.00]so I disagree
+[60:25.00]I love disagreements
+[60:27.00]who likes the podcast
+[60:29.00]where all 3 people always agree with each other
+[60:31.00]you will see the impact of this
+[60:33.00]in the tokens per second over time
+[60:35.00]this year
+[60:37.00]I have very very credible sources
+[60:39.00]all telling me
+[60:41.00]that the average tokens per second
+[60:43.00]right now we have
+[60:45.00]somewhere between 50 to 100
+[60:47.00]that's the norm for people
+[60:49.00]average tokens per second
+[60:51.00]will go to 500 to 2000
+[60:53.00]this year
+[60:55.00]from a number of chip suppliers that I cannot name
+[60:57.00]that will cause
+[60:59.00]step change in the use cases
+[61:01.00]every time you have an order of magnitude improvement
+[61:03.00]in the speed of something
+[61:05.00]you unlock new use cases
+[61:07.00]that become fun instead of a chore
+[61:09.00]and so that's what I would
+[61:11.00]caution this audience to think about
+[61:13.00]what can you do in much higher AI speed
+[61:15.00]it's not just things streaming out faster
+[61:17.00]it is
+[61:19.00]things working in the background a lot more seamlessly
+[61:21.00]and therefore being a lot more useful
+[61:23.00]than previously imagined
+[61:25.00]so that would be my two cents on that
+[61:27.00]yeah
+[61:29.00]I mean the new
+[61:31.00]imbedia chips are also much faster
+[61:33.00]to me that's true
+[61:35.00]when it comes about startups
+[61:37.00]are the startups pushing
+[61:39.00]on the incumbents
+[61:41.00]or are the incumbents still leading
+[61:43.00]and then the startups are riding the same wave
+[61:45.00]I don't have yet a good sense of that
+[61:47.00]it's next year's
+[61:49.00]imbedia release just gonna be better than everything
+[61:51.00]that gets released this year
+[61:53.00]if that's the case
+[61:55.00]it's like okay
+[61:57.00]damn jensen
+[61:59.00]it's like I'm gonna fight imbedia
+[62:01.00]damn jensen got hands
+[62:03.00]he really does
+[62:05.00]well
+[62:07.00]one conversation guys
+[62:09.00]just by way of wrapping up
+[62:11.00]call it over the next three months
+[62:13.00]between now and sort of the beginning of summer
+[62:15.00]what's one prediction that each of you has
+[62:17.00]could be about anything
+[62:19.00]could be big company, could be startup
+[62:21.00]could be something you have privileged information that you know
+[62:23.00]and you just won't tell us that you actually know
+[62:25.00]does it have to be something that we think
+[62:27.00]it's gonna be true or like something that we think
+[62:29.00]because for me it's like
+[62:31.00]is sundar gonna be the CEO of google
+[62:33.00]maybe not in three months but maybe in like six months
+[62:35.00]in nine months
+[62:37.00]people were like oh maybe that miss is gonna be the new CEO
+[62:39.00]that was kinda like
+[62:41.00]I was busyfishing some deep mind people
+[62:43.00]and google people for like a good
+[62:45.00]guest for the pot and I was like
+[62:47.00]oh what about jeptine and they're like
+[62:49.00]well that miss is really like the person that runs everything
+[62:51.00]anyway and the stuff
+[62:53.00]and it's like interesting
+[62:55.00]so I don't know
+[62:57.00]serge could come back I don't know
+[62:59.00]like he's making more appearances these days
+[63:01.00]yeah
+[63:03.00]but we can just put it as like
+[63:05.00]my thing is like
+[63:07.00]CEO change potential
+[63:09.00]but again
+[63:11.00]three months is too short
+[63:13.00]to make a prediction
+[63:15.00]time scale might be off
+[63:17.00]yeah I mean
+[63:21.00]for me I think the progression
+[63:23.00]in vertical agent companies
+[63:25.00]will keep going
+[63:27.00]we just had the other day
+[63:29.00]Klarna talking about how they replaced like
+[63:31.00]customer support agents with the
+[63:33.00]AI agents
+[63:35.00]that's to the beginning guys
+[63:37.00]imagine this rolling out across most of the fortune
+[63:39.00]500
+[63:41.00]and I'm not saying this is like a utopian
+[63:43.00]scenario there will be very very
+[63:45.00]embarrassing and bad outcomes
+[63:47.00]of this where like humans would never
+[63:49.00]make this mistake but AIS did and
+[63:51.00]we'll all laugh at it or it will be very offended
+[63:53.00]by whatever you know bad outcome
+[63:55.00]it did so we have to be responsible
+[63:57.00]and careful in the rollout but yeah this is
+[63:59.00]the rolling out you know Alessio likes to say
+[64:01.00]that this year is the year of AI production
+[64:03.00]let's see it let's see all these vertical
+[64:05.00]full stack employees
+[64:07.00]come out into the workforce
+[64:09.00]love it alright guys well thank you so much
+[64:11.00]for sharing your thoughts and insights
+[64:13.00]here and can't wait to do it again
+[64:15.00]welcome back again
+[64:17.00]it's Charlie your AI cohost
+[64:19.00]we're now in part two
+[64:21.00]of the special weekend episode
+[64:23.00]collating some of SWIX and Alessio's
+[64:25.00]recent appearances
+[64:27.00]if you are not active in the latent space discord
+[64:29.00]you might not be aware of the
+[64:31.00]many many many in person
+[64:33.00]events we host gathering our
+[64:35.00]listener community all over the world
+[64:37.00]you can see the latent space
+[64:39.00]community page for how to join
+[64:41.00]and subscribe to our event calendar
+[64:43.00]for future meetups
+[64:45.00]we're going to share some of our recent live
+[64:47.00]appearances in this next part
+[64:49.00]starting with the Thursday nights in AI meetup
+[64:51.00]a regular fixture in the SF AI scene
+[64:53.00]run by imbu and outset capital
+[64:55.00]primarily our former guest
+[64:57.00]Kanjen Kyu
+[64:59.00]Allie Roder and Josh Albrecht
+[65:01.00]here's SWIX
+[65:03.00]today for those of you who have been here before
+[65:07.00]you know the general format
+[65:09.00]so we'll do a quick fireside Q&A with SWIX
+[65:11.00]where we're asking him the questions
+[65:13.00]then we'll actually go to our rapid fire Q&A
+[65:15.00]where we're asking really fast
+[65:17.00]hopefully spicy questions
+[65:19.00]and then we'll open it up to the audience
+[65:21.00]for your questions so you guys
+[65:23.00]around the room submit your questions
+[65:25.00]and we'll go through as many of them as possible
+[65:27.00]during that period
+[65:29.00]and then actually SWIX brought
+[65:31.00]a gift for us which is two
+[65:33.00]latent space t-shirts
+[65:35.00]AI engineer t-shirts
+[65:37.00]and those will be awarded to the
+[65:39.00]two spiciest questions
+[65:41.00]askers
+[65:43.00]and I'll let Josh decide on that
+[65:45.00]so we want to get your spiciest takes
+[65:47.00]please send them in during the event
+[65:49.00]as we're talking and then also at the end
+[65:51.00]alright with that
+[65:53.00]let's get going
+[65:55.00]ok
+[65:57.00]welcome SWIX
+[65:59.00]how does it feel to be interviewed
+[66:01.00]rather than the interviewer
+[66:03.00]weird I don't know what to do in this chair
+[66:05.00]where should I put my hands
+[66:07.00]you look good
+[66:09.00]and I also love asking follow up questions
+[66:11.00]and I tend to take over panels a lot
+[66:13.00]if you ever see me on a panel
+[66:15.00]I tend to ask the other panelists questions
+[66:17.00]so we should be ready
+[66:19.00]this is like a free interview
+[66:21.00]so why not
+[66:23.00]so you interviewed Ken June
+[66:25.00]the CEO of Impeo4 but you didn't interview Josh
+[66:27.00]so maybe tonight
+[66:29.00]we will look for different questions
+[66:31.00]and look for alignment
+[66:33.00]I love it
+[66:35.00]I just want to hear this story
+[66:37.00]you've completely exploded with latent space
+[66:39.00]and AI engineer and I know
+[66:41.00]you also before all of that
+[66:43.00]had exploded in popularity for your
+[66:45.00]learning in public movement and your dev tools work
+[66:47.00]dav relations work
+[66:49.00]so who are you and how did you get here
+[66:51.00]let's start with that
+[66:53.00]quick story is I'm Sean I'm from Singapore
+[66:55.00]SWIX is my initials for those who don't know
+[66:57.00]a lot of Singaporeans are ethically Chinese
+[66:59.00]and we have Chinese names and English names
+[67:01.00]so it's just my initials
+[67:03.00]came to the US for college
+[67:05.00]and have been here for about 15 years
+[67:07.00]but half of that was in finance
+[67:09.00]and then the other half was in tech
+[67:11.00]tech is where I was most known
+[67:13.00]just because I realized that
+[67:15.00]I was much more
+[67:17.00]aligned towards learning in public
+[67:19.00]whereas in finance everything is a trade secret
+[67:21.00]everything is zero sum
+[67:23.00]whereas in tech you're allowed to come
+[67:25.00]to meetups and conferences and share your
+[67:27.00]learnings and share your mistakes even
+[67:29.00]and that's totally fine you like
+[67:31.00]open source your code it's totally fine
+[67:33.00]and even better you like contribute
+[67:35.00]pr to other people's code which is even better
+[67:37.00]and I found that I thrives in that
+[67:39.00]learning public environments and
+[67:41.00]that kind of got me started
+[67:43.00]early higher
+[67:45.00]early directly higher at Netlify
+[67:47.00]and then did the same at EWS
+[67:49.00]Temporal and Airbyte
+[67:51.00]and so that's like the whole story
+[67:53.00]I can talk more about like developer tooling
+[67:55.00]and developer relations if that's something
+[67:57.00]that people are interested in
+[67:59.00]but I think the more recent thing is AI
+[68:01.00]and I started really being interested
+[68:03.00]in it mostly because
+[68:05.00]the approximate cause of starting learning
+[68:07.00]in space was stable diffusion
+[68:09.00]when you could run a large model
+[68:11.00]on your desktop
+[68:13.00]where I was like okay this is
+[68:15.00]something qualitatively very different
+[68:17.00]and then we started
+[68:19.00]in space and
+[68:21.00]we have to talk about it on a podcast
+[68:23.00]there we go
+[68:25.00]it wasn't a podcast for like four months
+[68:27.00]and then I had been running a discord
+[68:29.00]for DevTools investors
+[68:31.00]I also invested in DevTools
+[68:33.00]and I advised companies on DevTools
+[68:35.00]definition things
+[68:37.00]and I think it was the start of 2023
+[68:39.00]Alessio and I were both like
+[68:41.00]I think we need to get more tokens out of
+[68:43.00]people and I was running out of original sources
+[68:45.00]to write about
+[68:47.00]so I was like okay I'll go get those original sources
+[68:49.00]and I think that's when we started the podcast
+[68:51.00]and I think it's just the chemistry between us
+[68:53.00]the way we spike in different ways
+[68:55.00]and also honestly the
+[68:57.00]kind participation of the guests
+[68:59.00]to give us their time
+[69:01.00]getting George Hoss was a big deal
+[69:03.00]and also shoutout to Alessio for cold emailing him
+[69:05.00]for booking some of our
+[69:07.00]biggest guests
+[69:09.00]and just working really hard to try to tell the story
+[69:11.00]that people can use at work
+[69:13.00]I think that there's a lot of AI podcasts out there
+[69:15.00]and a lot of AI forums
+[69:17.00]or fireside chats with no fire
+[69:19.00]that always talk about
+[69:21.00]what's your AGI timeline, what's your PDoom
+[69:23.00]very very nice hallway conversations
+[69:25.00]for freshman year but not very useful
+[69:27.00]for work
+[69:29.00]and practically making money
+[69:31.00]and thinking about
+[69:33.00]changing everyday lives
+[69:35.00]what's interesting is obviously
+[69:37.00]you care about the existential
+[69:39.00]safety of the human race
+[69:41.00]but in the meantime we got to eat
+[69:43.00]so I think that's kind of
+[69:45.00]the inspaces niche
+[69:47.00]we explicitly don't really talk about AGI
+[69:49.00]we explicitly don't talk about
+[69:51.00]things that we're a little bit too far out
+[69:53.00]we don't do a ton of robotics
+[69:55.00]we don't do a ton of high frequency trading
+[69:57.00]there's tons of machine learning in there
+[69:59.00]but we just don't do that
+[70:01.00]we're like what are most software engineers going to need
+[70:03.00]because that's our background
+[70:05.00]and that's the audience that we serve
+[70:07.00]and I think just being really clear on that audience
+[70:09.00]has resonated with people
+[70:11.00]you would never expect a technical podcast
+[70:13.00]to reach a general audience
+[70:15.00]like top 10 on the tech charts
+[70:17.00]but I've been surprised by that before
+[70:19.00]and it's been successful
+[70:21.00]I don't know what to say about that
+[70:23.00]I think honestly I kind of
+[70:25.00]have this negative reaction towards being
+[70:27.00] being classified as a podcast
+[70:29.00]because the podcast is downstream of ideas
+[70:31.00]and it's one mode of conversation
+[70:33.00]it's one mode of idea delivery
+[70:35.00]but you can deliver ideas
+[70:37.00]on a newsletter in a person like this
+[70:39.00]there's so many different ways
+[70:41.00]and so I think I think about it more as
+[70:43.00]we are trying to start or serve an industry
+[70:45.00]and that industry is the AI
+[70:47.00]engineer industry
+[70:49.00]which we can talk about more
+[70:51.00]yes let's go into that
+[70:53.00]so the AI engineer you penned a piece
+[70:55.00]all the rise of the AI engineer
+[70:57.00]you tweeted about it
+[70:59.00]you also responded
+[71:01.00]largely agreeing with what you said
+[71:03.00]what is an AI engineer
+[71:05.00]the AI engineer is the software engineer
+[71:07.00]building with AI
+[71:09.00]enhanced by AI
+[71:11.00]and eventually it will be a non-human
+[71:13.00]engineers writing code for you
+[71:15.00]which I know impu is all about
+[71:17.00]you're saying eventually the AI engineer
+[71:19.00]will become a non-human engineer
+[71:21.00]that will be one kind of AI engineer
+[71:23.00]that people are trying to build
+[71:25.00]and is probably the most furthest away
+[71:27.00]because it's so hard
+[71:29.00]but there are three types of AI engineer
+[71:31.00]one is AI enhanced
+[71:33.00]where you use AI products like co-pilot
+[71:35.00]and two is AI products engineer
+[71:37.00]where you use the exposed AI capabilities
+[71:39.00]to the end user
+[71:41.00]as a software engineer like not doing pre-training
+[71:43.00]not being an ML researcher
+[71:45.00]not being an ML engineer
+[71:47.00]but just interacting with foundation models
+[71:49.00]probably APIs from foundation model labs
+[71:51.00]what's the third one
+[71:53.00]and the third one is the non-human AI engineer
+[71:55.00]the fully autonomous
+[71:57.00]dream coder
+[71:59.00]how long do you think it is till we get to
+[72:01.00]early
+[72:03.00]this is my equivalent of AGI timelines
+[72:05.00]I know I know
+[72:07.00]lots of active
+[72:09.00]I have supported companies
+[72:11.00]actively working on that
+[72:13.00]I think it's more useful to think about levels of autonomy
+[72:15.00]and so my answer to that
+[72:17.00]is perpetually five years away
+[72:19.00]until it figures it out
+[72:21.00]no but my actual antidote
+[72:23.00]the closest comparison we have to that is self-driving
+[72:25.00]we're doing this in San Francisco
+[72:27.00]for those who are watching on the live stream
+[72:29.00]if you haven't come to San Francisco
+[72:31.00]and seen and taken a waymo ride
+[72:33.00]just come get a friend and take a waymo ride
+[72:35.00]I remember 2014
+[72:37.00]we covered a little bit of autos in my hedge fund
+[72:39.00]and I remember telling a friend
+[72:41.00]I was self-driving cars around the corner
+[72:43.00]this is it
+[72:45.00]parking will be a thing of the past
+[72:47.00]and it didn't happen for the next ten years
+[72:49.00]but now most of us in San Francisco
+[72:51.00]can take it for granted
+[72:53.00]I think you just have to
+[72:55.00]be mindful that
+[72:57.00]the rough edges take a long time
+[72:59.00]and yes it's going to work in demos
+[73:01.00]then it's going to work a little bit further out
+[73:03.00]and it's just going to take a long time
+[73:05.00]the more useful mental model I have
+[73:07.00]is levels of autonomy
+[73:09.00]in self-driving you have level one, two, three, four, five
+[73:11.00]just the amount of human attention
+[73:13.00]that you getat first
+[73:15.00]your hands are always on ten and two
+[73:17.00]and you have to pay attention to the driving
+[73:19.00]every thirty seconds
+[73:21.00]and eventually you can sleep in the car
+[73:23.00]there's a whole spectrum of that
+[73:25.00]what's the equivalent for coding
+[73:27.00]keep your hands on the keyboard
+[73:29.00]and then eventually you have to accept everything
+[73:31.00]that's good
+[73:33.00]approve the PR
+[73:35.00]approve this looks good
+[73:37.00]that's the dream that people want
+[73:39.00]because really you unlock a lot of coding
+[73:41.00]when people, non-technical people can file issues
+[73:43.00]and then the
+[73:45.00]AI engineer can sort of automatically
+[73:47.00]write code, pass your tests
+[73:49.00]and if it kind of works as
+[73:51.00]as advertised, then you can just kind of merge it
+[73:53.00]and then you
+[73:55.00]10x, 100x, the number of developers in your company
+[73:57.00]immediately
+[73:59.00]so that's the goal, that's the holy grail
+[74:01.00]we're not there yet, but sweep, code gen
+[74:03.00]there's a bunch of companies, magic probably
+[74:05.00]are all working towards that
+[74:07.00]and so the TLDR
+[74:09.00]the thing that we covered, Leslie and I covered
+[74:11.00]in the January recap
+[74:13.00]that we did was that
+[74:15.00]the people should have in their minds is the inner loop
+[74:17.00]versus the outer loop for the developer
+[74:19.00]inner loop is everything that happens
+[74:21.00]in your IDE between git commits
+[74:23.00]and outer loop is what happens
+[74:25.00]when you push up your git commit to
+[74:27.00]github, for example, or gitlab
+[74:29.00]and that's a nice split, which means
+[74:31.00]everything local, everything that needs to be fast
+[74:33.00]everything that's kind of very hands on for developers
+[74:35.00]is probably easier to automate
+[74:37.00]or easier to have code assistance
+[74:39.00]that's what copilot is, that's what all those things are
+[74:41.00]and then everything that happens autonomously
+[74:43.00]or effectively away from the keyboard
+[74:45.00]with a github issue or something
+[74:47.00]that is more outer loop where
+[74:49.00]you're relying a lot more on autonomy
+[74:51.00]and we are maybe not smart enough
+[74:53.00]to do that yet
+[74:55.00]Do you have any thoughts on the user experience
+[74:57.00]and how that will change? One of the things that
+[74:59.00]has happened for me, looking at some of these products
+[75:01.00]and playing around with things ourselves
+[75:03.00]it sounds good to have an automated PR
+[75:05.00]then you get an automated PR and you're like
+[75:07.00]I really don't want to review 300 lines
+[75:09.00]of generated code and find the bug
+[75:11.00]and then you have another agent that's a reviewer
+[75:13.00]and then they just come up to you
+[75:15.00]and then you like tell it, go fix it
+[75:17.00]and it comes back with 400 lines
+[75:19.00]yes, there is a length bias to code
+[75:21.00]and
+[75:23.00]you do have higher passing rates
+[75:25.00]in PRs, this is a documented human behavior
+[75:27.00]thing, send me two lines of code
+[75:29.00]I will review the shit out of that
+[75:31.00]I don't know if I can swear on this
+[75:33.00]send me 200 lines of code, looks good to me
+[75:35.00]guess what, the agents are going to
+[75:37.00]perfectly happy to copy that behavior from us
+[75:39.00]when we actually want them to do the opposite
+[75:41.00]so, yeah, I think
+[75:43.00]the GAN model of code generation
+[75:45.00]is probably not going to work super well
+[75:47.00]I do think we probably need just
+[75:49.00]better planning from the start
+[75:51.00]which is, I'm just repeating the
+[75:53.00]MBU thesis, by the way
+[75:55.00]just go listen to Kanjin talk about this
+[75:57.00]she's much better at it than I am
+[75:59.00]but yeah, I think
+[76:01.00]the code review thing is going to be
+[76:03.00]I think that what codium
+[76:05.00]the two codiums, the Israeli one
+[76:07.00]Israeli codium
+[76:09.00]with the E
+[76:11.00]Yeah, codium with the E
+[76:13.00]They still have refused to rename
+[76:15.00]I'm friends with both of them
+[76:17.00]Every month I'm like
+[76:19.00]Guys, lets all come to one room
+[76:21.00]Someone's got a fold
+[76:23.00]Codium with the E has gone
+[76:25.00]You got to write the test first
+[76:27.00]It's like a sort of tripartite
+[76:29.00]relationship, again this is also covered on a
+[76:31.00]podcast with them which is fantastic
+[76:33.00]You interview me, you sort of through me
+[76:35.00]So, codium is like
+[76:37.00]They've already thought this all the way through
+[76:39.00]They're like, okay you write the user story
+[76:41.00]From the user story you generate all the tests
+[76:43.00]You also generate the code
+[76:45.00]And you update any one of those
+[76:47.00]They all have to update together
+[76:49.00]And probably the critical factor
+[76:51.00]Is the test generation from the story
+[76:53.00]Because everything else
+[76:55.00]Can just kind of bounce the hits off
+[76:57.00]Of those things until they pass
+[76:59.00]So you have to write good tests
+[77:01.00]It's kind of like the eat your vegetables of coding
+[77:03.00]Which nobody really wants to do
+[77:05.00]And so I think it's a really smart tactic
+[77:07.00]To go to market
+[77:09.00]By saying we automatically generate
+[77:11.00]Test for you and start not great
+[77:13.00]But then get better
+[77:15.00]And eventually you get to
+[77:17.00]The weakest point in the chain
+[77:19.00]For the entire loop of code generation
+[77:21.00]What do you think their weakest link is
+[77:25.00]The weakest link
+[77:27.00]It's test generation
+[77:29.00]Do you think there's a way to
+[77:31.00]To make that actually better
+[77:33.00]For making it better
+[77:37.00]You have to have good isolation
+[77:39.00]And I think
+[77:41.00]Proper, serverless cloud environment
+[77:43.00]Is integral to that
+[77:45.00]It could be like a fly I/O
+[77:47.00]It could be like
+[77:49.00]A cloud fair worker
+[77:51.00]It depends how many resources
+[77:53.00]Your test environment needs
+[77:55.00]And effectively I was talking about this
+[77:57.00]I think with maybe Rob earlier in the audience
+[77:59.00]Every agent needs a sandbox
+[78:01.00]If you're a code agent you need a coding sandbox
+[78:03.00]But if you're whatever
+[78:05.00]Like MBU used to have this
+[78:07.00]Minecrafts clone that was much faster
+[78:09.00]If you have a model of the real world
+[78:11.00]You have to go generate some plan
+[78:13.00]Or some code or some whatever
+[78:15.00]Test it against that real world
+[78:17.00]So that you can get this iterative feedback
+[78:19.00]And then get the final result back
+[78:21.00]That is somewhat validated against the real world
+[78:23.00]And so you need a really good sandbox
+[78:25.00]I don't think people
+[78:27.00]This is an infrastructure need
+[78:29.00]Humans have had for a long time
+[78:31.00]We've never solved it for ourselves
+[78:33.00]And now we have to solve it for about
+[78:35.00]A thousand times larger quantity of agents
+[78:37.00]Then actually exist
+[78:39.00]And so I think we actually have to
+[78:41.00]Involve a lot more infrastructure
+[78:43.00]In order to serve these things
+[78:45.00]So for those who don't know
+[78:47.00]I also have
+[78:49.00]So we're talking about the rise of AI engineer
+[78:51.00]I also have various conversations
+[78:53.00]About immutable infrastructure
+[78:55.00]And this is all of the kinds
+[78:57.00]In order to solve
+[78:59.00]Agents and coding agents
+[79:01.00]We're going to have to solve the other stuff too along the way
+[79:03.00]And it's really neat for me
+[79:05.00]To see all that tied together in my deaf tools work
+[79:07.00]That all these themes kind of reemerge
+[79:09.00]Just naturally just because
+[79:11.00]Everything we needed for humans
+[79:13.00]We just need a hundred times more for agents
+[79:15.00]Let's talk about the AI engineer
+[79:17.00]AI engineer has become a whole thing
+[79:19.00]It's become a term and also a conference
+[79:21.00]And tell us more
+[79:23.00]And a job title
+[79:25.00]Tell us more about that
+[79:27.00]What's going on there
+[79:29.00]That is a very big cloud of things
+[79:31.00]I would just say
+[79:33.00]I think it's an emergent industry
+[79:35.00]I've seen this happen repeatedly
+[79:37.00]So the general term
+[79:39.00]So the general term is software engineer
+[79:41.00]Or programmer
+[79:43.00]In the 70s and 80s
+[79:45.00]There would not be a senior engineer
+[79:47.00]There would just be an engineer
+[79:49.00]I don't think you would even call
+[79:51.00]What about a member of the technical staff
+[79:53.00]Oh yeah MTS
+[79:55.00]Very very elite
+[79:57.00]So these striations appear
+[79:59.00]When the population grows
+[80:01.00]And the technical depth grows
+[80:03.00]Over time
+[80:05.00]When it starts
+[80:07.00]Not that important
+[80:09.00]It's just going to specialize
+[80:11.00]I've seen this happen for front end
+[80:13.00]For DevOps, for data
+[80:15.00]I can't remember what else I listed in that piece
+[80:17.00]But those are the main three that I was around for
+[80:19.00]Now a lot of people are arguing
+[80:21.00]That there is the ML researcher
+[80:23.00]The ML engineer
+[80:25.00]Who sort of pairs with the researcher
+[80:27.00]Sometimes they also call research engineer
+[80:29.00]And then on the other side of the fence
+[80:31.00]It's just software engineers
+[80:33.00]And that's how it was until about last year
+[80:35.00]And now there's this specializing
+[80:37.00]And rising class of people
+[80:39.00]Building AI specific software
+[80:41.00]That are not any of those previous titles
+[80:43.00]That I just mentioned
+[80:45.00]And that's the thesis of the AI engineer
+[80:47.00]In the emerging category of start-ups
+[80:49.00]Of jobs
+[80:51.00]I've had people from Meta, IBM, Microsoft
+[80:53.00]Open AI tell me that their title
+[80:55.00]Is now AI engineer
+[80:57.00]So like I can see that this is a trend
+[80:59.00]And I think that's what Andre called out
+[81:01.00]In his post that like just mathematically
+[81:03.00]Just the limitations in terms of talents
+[81:05.00]Research talents and GPUs
+[81:07.00]That all these will tend to concentrate
+[81:09.00]In a few labs
+[81:11.00]And everyone else
+[81:13.00]Are just going to have to rely on them
+[81:15.00]Or build differentiation of products
+[81:17.00]In other ways, and those will be AI engineers
+[81:19.00]So mathematically there will be more AI engineers
+[81:21.00]Than ML engineers, it's just the truth
+[81:23.00]Right now it's the other way
+[81:25.00]Right now the number of AI engineers
+[81:27.00]Is maybe 10x less
+[81:29.00]So I think that the ratio will invert
+[81:31.00]And I think the goal of it in space
+[81:33.00]And the goal of the conference
+[81:35.00]And anything else I do is to serve
+[81:37.00]That growing audience
+[81:39.00]To make the distinction clear
+[81:41.00]If I'm a software engineer
+[81:43.00]What do I have to learn
+[81:45.00]What additional capabilities does that
+[81:47.00]Type of engineer have
+[81:49.00]Funny you say that
+[81:51.00]I don't actually have a specific blog
+[81:53.00]Post on how to like
+[81:55.00]Change classes
+[81:57.00]I do think I always think about this
+[81:59.00]In terms of Baldur's Gate
+[82:01.00]DND ruleset number
+[82:03.00]5.1 or whatever
+[82:05.00]So I kind of intentionally left that open
+[82:07.00]To leave space for others
+[82:09.00]I think when you start an industry
+[82:11.00]That's the only way to guarantee
+[82:13.00]That it will fail
+[82:15.00]I do have a take
+[82:17.00]Obviously because a lot of people
+[82:19.00]Are asking me where to start
+[82:21.00]And I think basically
+[82:23.00]So what we have is
+[82:25.00]Late and Space University
+[82:27.00]We just finished working on
+[82:29.00]Day 7 today
+[82:31.00]It's a seven-day project
+[82:33.00]And I think we've done a great job
+[82:35.00]We've done a great job
+[82:37.00]We've done a great job
+[82:39.00]And we've just finished working on
+[82:41.00]Day 7 today
+[82:43.00]It's a seven-day email course
+[82:45.00]Where it basically like
+[82:47.00]It is completely designed to answer
+[82:49.00]The question of like
+[82:51.00]I'm an existing software engineer
+[82:53.00]I know how to code
+[82:55.00]But I don't get all this AI stuff
+[82:57.00]I've been living under a rock
+[82:59.00]Or like it's just too overwhelming for me
+[83:01.00]You have to pick for me
+[83:03.00]Or curate for me as a trusted friend
+[83:05.00]And I have one hour a day for seven days
+[83:07.00]It's image generation
+[83:09.00]It's code generation
+[83:11.00]It's audio
+[83:13.00]ASR
+[83:15.00]Audio speech recognition
+[83:17.00]And then I forget
+[83:19.00]What the fifth and sixth one is
+[83:21.00]But the last day is agents
+[83:23.00]And so basically I'm just like
+[83:25.00]Here are seven projects that you should do
+[83:27.00]To feel like you can do anything in AI
+[83:29.00]You can't really do everything in AI
+[83:31.00]Just from that small list
+[83:33.00]But I think it's just like anything
+[83:35.00]Go through like a set list
+[83:37.00]Of things that are basic skills
+[83:39.00]That I think everyone in this industry should have
+[83:41.00]To be at least conversant
+[83:43.00]In if someone, if like a boss comes to you
+[83:45.00]And goes like hey can we build this
+[83:47.00]You don't even know if the answer is no
+[83:49.00]So I want you to move towards
+[83:51.00]From like unknown unknowns to at least known unknowns
+[83:53.00]And I think that's where you start
+[83:55.00]Being competent as an engineer
+[83:57.00]So yeah that's LSU
+[83:59.00]In Space University just to trigger the tigers
+[84:03.00]So do you think in the future that people
+[84:05.00]An AI engineer is going to be someone's
+[84:07.00]Full-time job like people are just going to be
+[84:09.00]AI engineers or do you think it's going to be
+[84:11.00]More of a world where I'm a software engineer
+[84:13.00]And like 20% of my time
+[84:15.00]I'm using open ai's, api's
+[84:17.00]And I'm working on prompt engineering
+[84:19.00]And stuff like that and using code pilot
+[84:21.00]You just reminded me of day six's
+[84:23.00]Open source models and fine tuning
+[84:25.00]I think it will be a spectrum. That's why I don't want to be
+[84:27.00]Like too definitive about it. Like we have
+[84:29.00]Full-time front-end engineers and we have part-time
+[84:31.00]And you dip into that community whenever you want
+[84:33.00]But wouldn't it be nice if there was a
+[84:35.00]Collective name for that community
+[84:37.00]So you could go find it, you could find each other
+[84:39.00]And like honestly that's really it
+[84:41.00]Like a lot of people, a lot of companies are pinging me
+[84:43.00]For like hey I want to hire this kind of person
+[84:45.00]But you can't hire that person
+[84:47.00]But I want someone like that
+[84:49.00]And then people on the labor side
+[84:51.00]Were pinging me going like okay I want to do more
+[84:53.00]In this space but where do I go
+[84:55.00]And I think just having that shelling point
+[84:57.00]Of what an industry title or name is
+[84:59.00]And sort of building out that mythology
+[85:01.00]And community and conference
+[85:03.00]I think is helpful hopefully
+[85:05.00]And I don't have any prescriptions
+[85:07.00]On whether or not it's a full-time job
+[85:09.00]I do think over time it's going to become
+[85:11.00]More of a full-time job
+[85:13.00]And that's great for the people who want to do that
+[85:15.00]And the companies that want to employ that
+[85:17.00]But it's absolutely like you can take it part-time
+[85:19.00]Like jobs come in many formats
+[85:21.00]Yep that makes sense
+[85:23.00]And then you have a huge world fair
+[85:25.00]Coming up
+[85:27.00]Tell me about that
+[85:29.00]So part of I think
+[85:31.00]What creating industry requires
+[85:33.00]Is to let people gather in one place
+[85:35.00]And also for me
+[85:37.00]To get high quality talks out of people
+[85:39.00]You have to create an event out of it
+[85:41.00]Otherwise they don't do the work
+[85:43.00]So last year we did
+[85:45.00]The AI engineer summit which went very well
+[85:47.00]And people can see that online
+[85:49.00]And we're very happy with how that turned out
+[85:51.00]This year we want to go four times bigger
+[85:53.00]With the world fair
+[85:55.00]To try to reflect AI engineering
+[85:57.00]As it is in 2024
+[85:59.00]I always admired
+[86:01.00]Two conferences in this respect
+[86:03.00]One is NEURUPs which I went to last year
+[86:05.00]And documented on the pod which was fantastic
+[86:07.00]And two which is KubeCon
+[86:09.00]From the other side of my life
+[86:11.00]Which is the sort of cloud registration
+[86:13.00]In DevOps world
+[86:15.00]So NEURUPs is the one place that you go to
+[86:17.00]To I think it's the top conference
+[86:19.00]I mean there's others
+[86:21.00]That you can kind of consider
+[86:23.00]So NEURUPs
+[86:25.00]NEURUPs is where the research sciences are the stars
+[86:27.00]The researchers are the stars, the PhDs are the stars
+[86:29.00]Mostly it's just PhDs on the job market
+[86:31.00]It's really funny to go
+[86:33.00]Especially these things
+[86:35.00]It's really funny to go to NEURUPs
+[86:37.00]And the PhDs trying to back them
+[86:39.00]There are lots of VCs trying to back there
+[86:41.00]This year
+[86:43.00]So NEURUPs research sciences are the stars
+[86:45.00]And I wanted for AI engineers
+[86:47.00]Engineer to be the star
+[86:49.00]To show off their tooling
+[86:51.00]And their techniques
+[86:53.00]And their difficulty
+[86:55.00]Moving all these ideas from research into production
+[86:57.00]The other one was KubeCon
+[86:59.00]Where you could honestly just go
+[87:01.00]And not attend any of the talks
+[87:03.00]And just walk the floor
+[87:05.00]And figure out what's going on in DevOps
+[87:07.00]Which is fantastic
+[87:09.00]So that curation
+[87:11.00]And that bringing together of an industry
+[87:13.00]Is what I'm going for for the conference
+[87:15.00]And it's coming in June
+[87:17.00]The most important thing to be honest
+[87:19.00]The most important thing was to buy the domain
+[87:21.00]So we got AI.engineer
+[87:23.00]People are like engineer is a domain
+[87:25.00]And funny enough
+[87:27.00].engineer was cheaper than.engineering
+[87:29.00]I don't understand why
+[87:31.00]But that's up to the domain people
+[87:33.00]All right
+[87:35.00]Josh, any questions on agents
+[87:37.00]Yeah, I think maybe you have a lot of
+[87:39.00]Experience and exposure
+[87:41.00]Talking to all these companies and founders
+[87:43.00]And researchers and everyone that's on your podcast
+[87:45.00]Do you have like
+[87:47.00]Do you feel like you have a good kind of perspective
+[87:49.00]On some of the things that like
+[87:51.00]Some of the kind of technical issues having seen
+[87:53.00]Like we were just talking about like for
+[87:55.00]Coding agents like oh how you know
+[87:57.00]The value of test is really important
+[87:59.00]There are other things like for you know retrieval
+[88:01.00]Like now, you know, we have these models
+[88:03.00]Coming out with a million context, you know
+[88:05.00]Are a million tokens of context like 30 million
+[88:07.00]Is retrieval going to matter anymore
+[88:09.00]The huge context matter like what do you think
+[88:11.00]Specific about the long context thing
+[88:13.00]Sure, yeah
+[88:15.00]I was going to ask a few other ones after that
+[88:17.00]So go for that one first
+[88:19.00]That's what I was going to ask first
+[88:21.00]Yeah, let's talk about the long context
+[88:23.00]So for those who don't know
+[88:25.00]Long context was kind of
+[88:27.00]In the air last year but really
+[88:29.00]Really really really came into focus this year
+[88:31.00]With Gemini1.5 having
+[88:33.00]Million token context and saying that
+[88:35.00]It was in research for 10 million tokens
+[88:37.00]And that means that
+[88:39.00]You can put you like
+[88:41.00]No longer have to really think about
+[88:43.00]What you retrieve
+[88:45.00]No longer really think about
+[88:47.00]What you have to put into context
+[88:49.00]You can just kind of throw the entire
+[88:51.00]Knowledge base in there or books or film
+[88:53.00]Anything like that and that's fantastic
+[88:55.00]A lot of people are thinking that it kills
+[88:57.00]Rag and I think like one that's not true
+[88:59.00]Because for any kind of cost reason
+[89:01.00]You know, you still pay per token
+[89:03.00]So basically Google is like perfectly happy
+[89:05.00]To let you pay a million tokens
+[89:07.00]Every single time you make an API call
+[89:09.00]But good luck, you know, having $100 API call
+[89:11.00]You don't want to be slow, no explanation needed
+[89:13.00]And then finally my criticism of
+[89:15.00]Long context is that it's also not debuggable
+[89:17.00]Like if something goes wrong with the result
+[89:19.00]You can't do like the rags decomposition
+[89:21.00]Of where the source of error
+[89:23.00]Like you just have to like go like
+[89:25.00]It's the way it's bro like it's somewhere in there
+[89:27.00]I'm sorry, I pretty strongly agree with this
+[89:29.00]Why do you think people are making such
+[89:31.00]Crazy Long context windows
+[89:32.00]People love to kill rag
+[89:33.00]It's so much because it's too expensive
+[89:35.00]It's so expensive like you said
+[89:37.00]Yeah, I just think I'm just calling it
+[89:39.00]It's a different dimension
+[89:40.00]I think it's an option that's great when it's there
+[89:42.00]Like when I'm prototyping
+[89:43.00]I do not ever want to worry about context
+[89:45.00]And I'm going to call stuff a few times
+[89:47.00]And I don't want to run the errors
+[89:48.00]I don't want to have it set up a complex retrieval system
+[89:50.00]Just to prototype something
+[89:52.00]But once I done prototyping
+[89:53.00]Then I'll worry about all the other rags stuff
+[89:55.00]And yes, I'm going to buy some system
+[89:57.00]Or build some system or whatever to go do that
+[90:00.00]So I think it's just like an improvement
+[90:03.00]In like one dimension that you need
+[90:05.00]But the improvements in the other dimensions
+[90:07.00]And it's all needed
+[90:08.00]Like this space isn't going to keep growing
+[90:10.00]In unlimited fashion
+[90:12.00]I do think that this combined with multi-modality
+[90:16.00]Does unlock new things
+[90:18.00]So that's what I was going to ask about next
+[90:19.00]It's like how important is multimodal
+[90:21.00]Like great, you know, generating videos
+[90:23.00]Sure, whatever
+[90:24.00]Okay, how many of us need to generate videos that often
+[90:26.00]It'd be cool for TV shows, sure, but like, yeah
+[90:28.00]I think it's pretty important
+[90:30.00]The one thing that in
+[90:31.00]When we launched the in space podcast
+[90:33.00]We listed a bunch of interest areas
+[90:35.00]One thing I love about being explicit
+[90:37.00]Or intentional about our work
+[90:40.00]Is that you list the things that you're interested in
+[90:42.00]And you list the things that you're not interested in
+[90:44.00]And people are very unwilling
+[90:46.00]To have an entire interest list
+[90:48.00]One of the things that we are not interested in
+[90:50.00]Was multimodality last year
+[90:52.00]Because everyone was
+[90:54.00]I was just like, okay, you can generate images
+[90:56.00]And they're pretty, but like, not a giant business
+[90:58.00]I was wrong
+[90:59.00]Journey is a giant, giant, massive business
+[91:01.00]That no one can understand or get into
+[91:03.00]But also, I think
+[91:05.00]Being able to natively understand
+[91:07.00]Audio and video and code
+[91:09.00]I consider code a special modality
+[91:11.00]All that is
+[91:13.00]Very, like, qualitatively different
+[91:15.00]Then translating it into English first
+[91:17.00]And using English as, you know, like a bottleneck
+[91:19.00]Or pipe
+[91:20.00]And then, you know, applying it in LLMs
+[91:22.00]The ability of LLMs to reason across modalities
+[91:25.00]Gives you something more than you could
+[91:27.00]Individually by using Texas
+[91:29.00]The universal interface
+[91:30.00]So I think that's useful
+[91:32.00]So concretely, what does that mean?
+[91:34.00]It means that
+[91:35.00]So, I think the reference post for everyone
+[91:37.00]That you should have in your head
+[91:39.00]Is Simon Willison's post on Gemini1.5's
+[91:41.00]Video capability
+[91:43.00]Where he basically shot a video of
+[91:45.00]His bookshelf, just kind of scanning through it
+[91:47.00]And he was able to give back a complete
+[91:49.00]JSON list of the books and the authors
+[91:51.00]And all the details that were visible there
+[91:53.00]Helucinated some of it
+[91:55.00]Which is, you know, another issue
+[91:57.00]But I think it's just like unlocks this use case
+[91:59.00]That you just would not even try to code
+[92:01.00]Without the native video understanding capability
+[92:04.00]And obviously, like, on a technical level
+[92:07.00]Video is just a bunch of frames
+[92:08.00]So it actually is just image understanding
+[92:10.00]But image within the temporal dimension
+[92:12.00]Which this month, I think
+[92:14.00]Became much more of a important thing
+[92:16.00]Like the integration of space and time
+[92:18.00]In transformers
+[92:19.00]I don't think anyone was really talking about that
+[92:21.00]Until this month
+[92:22.00]And now it's the only thing anyone can ever think about
+[92:24.00]For Sora and for all the other stuff
+[92:26.00]The last thing I'll say
+[92:28.00]Which is against this trend of
+[92:30.00]Every modality is important
+[92:32.00]They just do all the modalities
+[92:34.00]I kind of agree with Nat Friedman
+[92:36.00]Who actually kind of pointed out
+[92:37.00]Just before the Gemini thing blew up
+[92:39.00]This month
+[92:41.00]Which was, like, why is it that
+[92:43.00]OpenAI is pushing Dolly so hard
+[92:45.00]Why is Bing pushing Bing Image Creator?
+[92:47.00]Like, it's not apparent
+[92:49.00]That you have to create images to create AGI
+[92:51.00]But every lab just seems to want to do this
+[92:54.00]And I kind of agree
+[92:55.00]That it's not on the critical path
+[92:57.00]Especially for image generation
+[92:59.00]Maybe image understanding, video understanding
+[93:01.00]Yeah, consumption
+[93:02.00]But generation
+[93:04.00]Maybe we'll be wrong next year
+[93:06.00]Just catches you a bunch of flak with, like, you know
+[93:08.00]Culture war things
+[93:10.00]It's true
+[93:11.00]All right, we're going to move into
+[93:13.00]Rapidfire Q&A
+[93:14.00]So, we're going to ask you a question
+[93:16.00]I don't want to overthink it, baby
+[93:18.00]We're going to audience the Q&A
+[93:20.00]So, I'll tell you
+[93:22.00]All right
+[93:23.00]We've cut the Q&A section for time
+[93:25.00]So, if you want to hear the spicy questions
+[93:27.00]Head over to the Thursday nights in A.I. video
+[93:29.00]for the full discussion
+[93:31.00]Next up, we have another former guest
+[93:33.00]Dylan Patel of Semi-analysis
+[93:36.00]the inventor of the GPU rich poor divide
+[93:39.00]who did a special live show with us in March
+[93:41.00]But that means you can finally
+[93:45.00]side-to-side A.V. test your favorite boba shops
+[93:48.00]We got Gongcha, we got boba guys
+[93:50.00]We got the lemon, whatever it's called
+[93:53.00]So, let us know what's your favorite
+[93:55.00]We also have Slido up to submit questions
+[93:58.00]We already had Dylan on the podcast
+[94:00.00]And this guy tweets and writes about all kinds of stuff
+[94:02.00]So, we want to know what people want to know more about
+[94:05.00]Rather than just being self-truined
+[94:07.00]But we'll do a state of the union, maybe
+[94:10.00]Everybody wants to know about GROC
+[94:12.00]Everybody wants to know whether or not
+[94:14.00]MVD is going to zero after GROC
+[94:16.00]Everybody wants to know what's going on with AMD
+[94:18.00]We got some AMD folks in the crowd, too
+[94:20.00]So, feel free to interact at any time
+[94:23.00]We have HECCLE, HECCLE, please
+[94:25.00]Good comedians show their color
+[94:28.00]The way they can handle the crowd when they're HECCLE
+[94:31.00]Do not throw boba
+[94:33.00]Do not throw boba at the stand
+[94:35.00]We cannot afford another podcast thing set up
+[94:37.00]Awesome, welcome everybody
+[94:39.00]To the seminalysis and latest space crossover
+[94:41.00]Dylan texted me on signal
+[94:43.00]He was like, dude, how do I easily set up a meetup
+[94:46.00]And here we are today
+[94:48.00]As you might have seen, there's no name tags
+[94:50.00]There's a bunch of things that are missing
+[94:52.00]But we did our best
+[94:53.00]It was extremely easy, right?
+[94:55.00]Like, I texted Alessio, he's like, yo, I got the spot
+[94:58.00]Okay, cool, here's a link, send it to people
+[95:00.00]Send it, and then show it up
+[95:03.00]And like, there was zero other organization that I required
+[95:07.00]So, everybody's here
+[95:09.00]A lot of seminalysis fans, we get in the crowd
+[95:12.00]Everybody wants to know more about
+[95:14.00]What's going on today and GROC has definitely been the hottest thing
+[95:16.00]We just recorded our monthly podcast today
+[95:18.00]And we didn't talk that much about GROC
+[95:20.00]Because we wanted you to talk more about
+[95:22.00]And then we'll splice you into our monthly recap
+[95:24.00]So, let's start there
+[95:26.00]So, you guys are the two GROC spreadsheetors
+[95:29.00]So, we broke out some GROC numbers
+[95:32.00]Because everyone was wondering
+[95:33.00]There's two things going on, right?
+[95:34.00]One, you know, how important, you know
+[95:36.00]How does it achieve the inference speed that it does
+[95:38.00]That it has been demonstrated by GROC chat
+[95:41.00]And two, how does it achieve its price promise
+[95:44.00]That is promised, that is sort of the public pricing
+[95:46.00]Of 27 cents per million token
+[95:48.00]And there's been a lot of speculation
+[95:50.00]Or, you know, some numbers thrown out there
+[95:52.00]I put out some tentative numbers
+[95:54.00]And you put out different numbers
+[95:55.00]But I'll just kind of lay that as the groundwork
+[95:58.00]Like, everyone's very excited about
+[96:00.00]Essentially like five times faster
+[96:02.00]Token generation than any other LLM currently
+[96:05.00]And that unlocks interesting downstream possibilities
+[96:08.00]If it's sustainable
+[96:10.00]If it's affordable
+[96:11.00]And so I think your question
+[96:13.00]Or reading your piece on GROC
+[96:15.00]Which is on the screen right now
+[96:16.00]Is it sustainable
+[96:18.00]So, like many things
+[96:20.00]This is VC funded, including this Boba
+[96:23.00]No, I'm just kidding
+[96:24.00]I'm paying for the Boba
+[96:25.00]Thank you, Semunas, the subscribers
+[96:27.00]I hope he pays for it
+[96:29.00]I pay for it right now
+[96:30.00]That's true, Alessio has the IOU
+[96:33.00]Right?
+[96:34.00]And that's all it is
+[96:35.00]But yeah, like many things, you know
+[96:37.00]They're not making money off of their inference service
+[96:40.00]They're just throwing it out there for cheap
+[96:42.00]And hoping to get business
+[96:43.00]And maybe raise money off of that
+[96:45.00]And I think that's a that's a fine use case
+[96:48.00]But the question is like
+[96:49.00]How much money are they losing, right?
+[96:51.00]And that's sort of what I went through
+[96:52.00]Breaking down in this article
+[96:54.00]That's on the screen
+[96:55.00]And that's it's pretty clear
+[96:56.00]They're like seven to ten X off
+[97:00.00]Like break even on their inference API
+[97:02.00]Which is like horrendous
+[97:04.00]Like far worse than any other
+[97:06.00]Sort of inference API provider
+[97:08.00]So this is like a simple simple
+[97:10.00]Cost thing that was pulled up
+[97:12.00]You can either inference at
+[97:13.00]Very high throughput
+[97:14.00]Or you can inference at
+[97:15.00]Very high very low latency
+[97:17.00]With GPUs you can do both
+[97:18.00]With GROC you can only do one
+[97:20.00]Of course with GROC
+[97:21.00]You can do that one faster
+[97:22.00]Marginally faster than a
+[97:24.00]Inference latency optimized GPU server
+[97:26.00]But no one offers inference latency
+[97:28.00]Optimized GPU servers
+[97:29.00]Because you would just burn money
+[97:31.00]Makes no economic sense to do so
+[97:33.00]Until maybe someone's willing
+[97:34.00]To pay for that
+[97:35.00]So GROC service
+[97:36.00]You know the surface looks awesome
+[97:37.00]Compared to everyone else's service
+[97:39.00]Which is throughput optimized
+[97:41.00]And then when you compare to
+[97:43.00]Input optimized scenario
+[97:44.00]GPUs look quite slow
+[97:46.00]But the reality is
+[97:47.00]Is they're serving 64
+[97:49.00]128 users at once
+[97:51.00]They have a batch size
+[97:52.00]How many users are being
+[97:53.00]Served at once
+[97:54.00]Where as GROC is taking
+[97:55.00]576 chips
+[97:56.00]And they're not really
+[97:58.00]Doing that efficiently
+[97:59.00]They're serving a far far
+[98:01.00]Fewer number of users
+[98:02.00]But extremely fast
+[98:03.00]Now that could be worthwhile
+[98:05.00]If they can get there
+[98:07.00]The number of users
+[98:08.00]They're serving at once up
+[98:10.00]But that's extremely hard
+[98:11.00]Because they don't have
+[98:12.00]Their chip
+[98:13.00]So they can't store
+[98:14.00]KVcash
+[98:15.00]KVcash for all the
+[98:16.00]Various different users
+[98:17.00]So their crux of the issue
+[98:19.00]Is just like hey
+[98:20.00]Can they get that performance
+[98:22.00]Up as much as they claim they will
+[98:24.00]They need to get it up
+[98:25.00]More than 10x
+[98:26.00]To make this like a reasonable
+[98:28.00]Benefit
+[98:29.00]In the meantime
+[98:30.00]NVIDIA is launching a new
+[98:31.00]GPU in two weeks
+[98:32.00]That'll be fun at GTC
+[98:34.00]And they're constantly
+[98:35.00]Pushing software as well
+[98:36.00]So we'll see
+[98:37.00]If GROC can catch up to that
+[98:38.00]The current verdict is
+[98:40.00]They're quite far behind
+[98:41.00]But hopefully
+[98:42.00]That maybe they can
+[98:43.00]Get there by scaling
+[98:44.00]Their system larger
+[98:45.00]Yeah
+[98:46.00]I was listening back
+[98:47.00]To our original episode
+[98:48.00]And you were talking
+[98:49.00]About how NVIDIA
+[98:50.00]Basically adopted
+[98:51.00]This different strategy
+[98:52.00]Of just leaning on
+[98:54.00]Networking GPUs together
+[98:55.00]And it seems like
+[98:56.00]GROC has some minor
+[98:58.00]Version of that going on
+[98:59.00]Here with the GROC rack
+[99:00.00]Is it enough?
+[99:03.00]What's GROC's next
+[99:05.00]Step here strategically?
+[99:07.00]Yeah, that's
+[99:09.00]The next step is of course
+[99:10.00]So right now
+[99:12.00]They connect 10 racks
+[99:13.00]Of chips together
+[99:14.00]And that's the system
+[99:15.00]That's running
+[99:16.00]On their API today
+[99:17.00]Whereas most people
+[99:18.00]Who are running
+[99:19.00]Mischro are running
+[99:20.00]In on two GPUs
+[99:22.00]One fourth of a server
+[99:24.00]And that rack
+[99:25.00]Is not
+[99:26.00]Obviously 10 racks
+[99:27.00]Is pretty crazy
+[99:28.00]But they think
+[99:29.00]That they can
+[99:30.00]Scale performance
+[99:31.00]If they have
+[99:32.00]This individual system
+[99:33.00]Be 20 racks
+[99:34.00]They think they can
+[99:35.00]Continue to scale performance
+[99:36.00]Extra linearly
+[99:37.00]So that'd be amazing
+[99:38.00]If they could
+[99:39.00]And I'm doubtful
+[99:40.00]That's going to be something
+[99:42.00]That's scalable
+[99:43.00]Especially for
+[99:45.00]You know, larger models
+[99:47.00]So there's the chip itself
+[99:51.00]But there's also a lot
+[99:52.00]Of work they're doing
+[99:53.00]At the compiler level
+[99:54.00]Do you have any good sense
+[99:55.00]Of like how easy it is
+[99:57.00]To actually work with LPU
+[99:59.00]Is that something
+[100:00.00]That's going to be
+[100:01.00]About on that for them
+[100:02.00]So Ollie's in the front
+[100:03.00]Right there
+[100:04.00]And he knows a ton
+[100:05.00]About VLIW architectures
+[100:07.00]But to summarize
+[100:08.00]In his opinion
+[100:09.00]And I think many folks
+[100:10.00]Is it's extremely hard
+[100:11.00]To program
+[100:12.00]These sorts of architectures
+[100:14.00]Which is why they have
+[100:15.00]Their compiler
+[100:16.00]And so on and so forth
+[100:17.00]But it's an incredible amount
+[100:19.00]Of work for them
+[100:20.00]To stand up individual models
+[100:22.00]And to get the performance
+[100:23.00]Up on them
+[100:24.00]Which is what they've been
+[100:25.00]Working on
+[100:26.00]Whereas GPUs
+[100:27.00]Are far more flexible
+[100:28.00]Of course
+[100:29.00]And so the question is
+[100:30.00]Can this compiler
+[100:31.00]Continue to extract
+[100:32.00]Performance
+[100:33.00]Well theoretically
+[100:34.00]There's a lot more
+[100:35.00]Performance to run
+[100:36.00]On the hardware
+[100:37.00]And they don't have
+[100:38.00]Many things that people
+[100:40.00]Generally associate with
+[100:42.00]Programmable hardware
+[100:44.00]They don't have buffers
+[100:45.00]And many other things
+[100:46.00]So it makes it very tough
+[100:47.00]To do that
+[100:48.00]But that's what their
+[100:49.00]They're relatively large
+[100:51.00]Compiler team is working on
+[100:53.00]So I'm not a GPU compiler guy
+[100:55.00]But I do want to
+[100:56.00]Clarify my understanding
+[100:57.00]From what I read
+[100:58.00]Which is a lot of
+[100:59.00]Catching up to do
+[101:00.00]It is
+[101:01.00]The crux of it
+[101:02.00]Is some kind of speculative
+[101:04.00]The word that comes
+[101:05.00]The routing
+[101:06.00]Of weights
+[101:08.00]And work
+[101:09.00]That needs to
+[101:10.00]Be done or scheduling
+[101:11.00]Of work across
+[101:12.00]The ten racks
+[101:14.00]Of GPUs
+[101:15.00]Is that like
+[101:17.00]The bulk of the benefit
+[101:18.00]That you get
+[101:19.00]From the compilation
+[101:20.00]So with the grok chips
+[101:22.00]What's really
+[101:23.00]Interesting is like
+[101:25.00]With GPUs
+[101:26.00]You can issue
+[101:27.00]Certain instructions
+[101:29.00]And you will get
+[101:30.00]A different result
+[101:31.00]Like depending on
+[101:32.00]The time I know
+[101:33.00]A lot of people
+[101:34.00]I'll have
+[101:35.00]Have had that experience
+[101:36.00]Or like the GPU
+[101:37.00]Literally doesn't return
+[101:38.00]The numbers it should be
+[101:39.00]As basically called
+[101:40.00]Non-determinism
+[101:41.00]With grok
+[101:42.00]Their chip is completely
+[101:43.00]Deterministic
+[101:44.00]The moment you compile it
+[101:45.00]You know exactly how long
+[101:46.00]It will take to operate
+[101:47.00]Right there is no
+[101:48.00]There is no like
+[101:49.00]Deviation at all
+[101:51.00]And so
+[101:52.00]You know they've
+[101:53.00]They're planning everything
+[101:55.00]Ahead of time
+[101:56.00]Like every instruction
+[101:57.00]Like it will
+[101:58.00]Complete in the time
+[101:59.00]That they've planned it for
+[102:00.00]And there is no
+[102:01.00]I don't know what
+[102:02.00]The best way to state this is
+[102:03.00]No variance there
+[102:04.00]Which is
+[102:05.00]Interesting from like
+[102:06.00]When you look historically
+[102:07.00]They tried to push this
+[102:08.00]In automotive
+[102:09.00]Because automotive
+[102:10.00]You probably want
+[102:11.00]Your car to do
+[102:12.00]Exactly what you
+[102:13.00]I issued it to do
+[102:14.00]And not have
+[102:15.00]Unpredictability
+[102:16.00]But yeah
+[102:17.00]Sorry I lost track
+[102:18.00]Of the question
+[102:19.00]It's okay
+[102:20.00]I just wanted to
+[102:21.00]Understand a little bit
+[102:22.00]More about like
+[102:23.00]What people should
+[102:24.00]Should know about
+[102:25.00]The compiler magic
+[102:26.00]That goes on with grok
+[102:27.00]Like you know
+[102:28.00]Like I think
+[102:30.00]From a software
+[102:31.00]Like hardware point of view
+[102:32.00]That intersection of
+[102:34.00]I guess
+[102:35.00]So chips have like
+[102:36.00]Like I'm still this
+[102:37.00]From someone
+[102:38.00]Here in the crowd
+[102:39.00]But chips have like
+[102:40.00]Five you know
+[102:41.00]Sort of there's
+[102:42.00]Like when you're designing a chip
+[102:43.00]There's there's
+[102:44.00]It's called PPA right
+[102:45.00]Power performance in area
+[102:47.00]It's kind of a triangle
+[102:48.00]That you optimize around
+[102:49.00]And the one thing
+[102:50.00]People don't realize
+[102:51.00]Is there's a
+[102:52.00]There's a third P
+[102:53.00]That's like PPA P
+[102:55.00]And the last P
+[102:56.00]Is pain in the ass
+[102:57.00]To program
+[102:58.00]And that's
+[102:59.00]That is very important
+[103:00.00]For like
+[103:01.00]High hardware
+[103:02.00]Like TPU
+[103:04.00]Without the
+[103:05.00]Hundreds of people
+[103:06.00]That work on the compiler
+[103:07.00]And jacks
+[103:08.00]And XLA
+[103:09.00]And all these sorts of things
+[103:10.00]Would be a pain in the ass
+[103:11.00]To program
+[103:12.00]But Google's got that like
+[103:13.00]Plumbing
+[103:14.00]Now if you look across
+[103:15.00]The ecosystem
+[103:16.00]Everything else is a pain
+[103:17.00]In the ass to program
+[103:18.00]Compared to NVIDIA
+[103:19.00]And this applies
+[103:20.00]To the grok chip as well
+[103:22.00]So yeah
+[103:23.00]Question is like
+[103:24.00]Can the compiler team
+[103:25.00]Get performance up
+[103:26.00]Anywhere close to theoretical
+[103:28.00]And then can they make
+[103:29.00]It not a pain in the ass
+[103:30.00]To program
+[103:31.00]And then can they make
+[103:32.00]It not a pain in the ass
+[103:33.00]To program
+[103:34.00]And then can they make
+[103:35.00]It not a pain in the ass
+[103:36.00]To program
+[103:37.00]And then can they make
+[103:38.00]It not a pain in the ass
+[103:39.00]And then can they make
+[103:40.00]It not a pain in the ass
+[103:41.00]And then can they make
+[103:42.00]It not a pain in the ass
+[103:43.00]And then can they make
+[103:44.00]It not a pain in the ass
+[103:45.00]And then can they make
+[103:46.00]It not a pain in the ass
+[103:47.00]And then can they make
+[103:48.00]It not a pain in the ass
+[103:49.00]And then can they make
+[103:50.00]It not a pain in the ass
+[103:51.00]And then can they make
+[103:52.00]It not a pain in the ass
+[103:53.00]And then can they make
+[103:54.00]It not a pain in the ass
+[103:55.00]And then can they make
+[103:56.00]It not a pain in the ass
+[103:57.00]And then can they make
+[103:58.00]It not a pain in the ass
+[103:59.00]And then can they make
+[104:00.00]It not a pain in the ass
+[104:01.00]And then can they make
+[104:02.00]It not a pain in the ass
+[104:03.00]And then can they make
+[104:04.00]It not a pain in the ass
+[104:05.00]And then can they make
+[104:06.00]It not a pain in the ass
+[104:07.00]And then can they make
+[104:08.00]It not a pain in the ass
+[104:09.00]And then can they make
+[104:10.00]It not a pain in the ass
+[104:11.00]And then can they make
+[104:12.00]It not a pain in the ass
+[104:13.00]And then can they make
+[104:14.00]It not a pain in the ass
+[104:15.00]And then can they make
+[104:16.00]It not a pain in the ass
+[104:17.00]And then can they make
+[104:18.00]It not a pain in the ass
+[104:19.00]And then can they make
+[104:20.00]It not a pain in the ass
+[104:21.00]And then can they make
+[104:22.00]It not a pain in the ass
+[104:23.00]And then can they make
+[104:24.00]It not a pain in the ass
+[104:25.00]And then can they make
+[104:26.00]It not a pain in the ass
+[104:27.00]And then can they make
+[104:28.00]It not a pain in the ass
+[104:29.00]And then can they make
+[104:30.00]It not a pain in the ass
+[104:31.00]And then can they make
+[104:32.00]It not a pain in the ass
+[104:33.00]And then can they make
+[104:34.00]It not a pain in the ass
+[104:35.00]And then can they make
+[104:36.00]It not a pain in the ass
+[104:37.00]And then can they make
+[104:38.00]It not a pain in the ass
+[104:39.00]And then can they make
+[104:40.00]It not a pain in the ass
+[104:41.00]And then can they make
+[104:42.00]It not a pain in the ass
+[104:43.00]And then can they make
+[104:44.00]It not a pain in the ass
+[104:45.00]And then can they make
+[104:46.00]It not a pain in the ass
+[104:47.00]And then can they make
+[104:48.00]It not a pain in the ass
+[104:49.00]And then can they make
+[104:50.00]It not a pain in the ass
+[104:51.00]And then can they make
+[104:52.00]It not a pain in the ass
+[104:53.00]And then can they make
+[104:54.00]It not a pain in the ass
+[104:55.00]And then can they make
+[104:56.00]It not a pain in the ass
+[104:57.00]And then can they make
+[104:58.00]It not a pain in the ass
+[104:59.00]And then can they make
+[105:00.00]It not a pain in the ass
+[105:01.00]And then can they make
+[105:02.00]It not a pain in the ass
+[105:03.00]And then can they make
+[105:04.00]It not a pain in the ass
+[105:05.00]And then can they make
+[105:06.00]It not a pain in the ass
+[105:07.00]And then can they make
+[105:08.00]It not a pain in the ass
+[105:09.00]And then can they make
+[105:10.00]It not a pain in the ass
+[105:11.00]And then can they make
+[105:12.00]It not a pain in the ass
+[105:13.00]And then can they make
+[105:14.00]It not a pain in the ass
+[105:15.00]And then can they make
+[105:16.00]It not a pain in the ass
+[105:17.00]And then can they make
+[105:18.00]It not a pain in the ass
+[105:19.00]And then can they make
+[105:20.00]It not a pain in the ass
+[105:21.00]And then can they make
+[105:22.00]It not a pain in the ass
+[105:23.00]And then can they make
+[105:24.00]It not a pain in the ass
+[105:25.00]And then can they make
+[105:26.00]It not a pain in the ass
+[105:27.00]And then can they make
+[105:28.00]It not a pain in the ass
+[105:29.00]And then can they make
+[105:30.00]It not a pain in the ass
+[105:31.00]And then can they make
+[105:32.00]It not a pain in the ass
+[105:33.00]And then can they make
+[105:34.00]It not a pain in the ass
+[105:35.00]And then can they make
+[105:36.00]It not a pain in the ass
+[105:37.00]And then can they make
+[105:38.00]It not a pain in the ass
+[105:39.00]And then can they make
+[105:40.00]It not a pain in the ass
+[105:41.00]And then can they make
+[105:42.00]It not a pain in the ass
+[105:43.00]And then can they make
+[105:44.00]It not a pain in the ass
+[105:45.00]And then can they make
+[105:46.00]It not a pain in the ass
+[105:47.00]And then can they make
+[105:48.00]It not a pain in the ass
+[105:49.00]And then can they make
+[105:50.00]It not a pain in the ass
+[105:51.00]And then can they make
+[105:52.00]It not a pain in the ass
+[105:53.00]And then can they make
+[105:54.00]It not a pain in the ass
+[105:55.00]And then can they make
+[105:56.00]It not a pain in the ass
+[105:57.00]And then can they make
+[105:58.00]It not a pain in the ass
+[105:59.00]And then can they make
+[106:00.00]It not a pain in the ass
+[106:01.00]And then can they make
+[106:02.00]It not a pain in the ass
+[106:03.00]And then can they make
+[106:04.00]It not a pain in the ass
+[106:05.00]And then can they make
+[106:06.00]It not a pain in the ass
+[106:07.00]And then can they make
+[106:08.00]It not a pain in the ass
+[106:09.00]And then can they make
+[106:10.00]It not a pain in the ass
+[106:11.00]And then can they make
+[106:12.00]It not a pain in the ass
+[106:13.00]And then can they make
+[106:14.00]It not a pain in the ass
+[106:15.00]And then can they make
+[106:16.00]It not a pain in the ass
+[106:17.00]And then can they make
+[106:18.00]It not a pain in the ass
+[106:19.00]And then can they make
+[106:20.00]It not a pain in the ass
+[106:21.00]And then can they make
+[106:22.00]It not a pain in the ass
+[106:23.00]And then can they make
+[106:24.00]It not a pain in the ass
+[106:25.00]And then can they make
+[106:26.00]It not a pain in the ass
+[106:27.00]And then can they make
+[106:28.00]It not a pain in the ass
+[106:29.00]And then can they make
+[106:30.00]It not a pain in the ass
+[106:31.00]And then can they make
+[106:32.00]It not a pain in the ass
+[106:33.00]And then can they make
+[106:34.00]It not a pain in the ass
+[106:35.00]And then can they make
+[106:36.00]It not a pain in the ass
+[106:37.00]And then can they make
+[106:38.00]It not a pain in the ass
+[106:39.00]And then can they make
+[106:40.00]It not a pain in the ass
+[106:41.00]And then can they make
+[106:42.00]It not a pain in the ass
+[106:43.00]And then can they make
+[106:44.00]It not a pain in the ass
+[106:45.00]And then can they make
+[106:46.00]It not a pain in the ass
+[106:47.00]And then can they make
+[106:48.00]It not a pain in the ass
+[106:49.00]And then can they make
+[106:50.00]It not a pain in the ass
+[106:51.00]And then can they make
+[106:52.00]It not a pain in the ass
+[106:53.00]And then can they make
+[106:54.00]It not a pain in the ass
+[106:55.00]And then can they make
+[106:56.00]It not a pain in the ass
+[106:57.00]And then can they make
+[106:58.00]It not a pain in the ass
+[106:59.00]And then can they make
+[107:00.00]It not a pain in the ass
+[107:01.00]And then can they make
+[107:02.00]It not a pain in the ass
+[107:03.00]And then can they make
+[107:04.00]It not a pain in the ass
+[107:05.00]And then can they make
+[107:06.00]It not a pain in the ass
+[107:07.00]And then can they make
+[107:08.00]It not a pain in the ass
+[107:09.00]And then can they make
+[107:10.00]It not a pain in the ass
+[107:11.00]And then can they make
+[107:12.00]It not a pain in the ass
+[107:13.00]And then can they make
+[107:14.00]It not a pain in the ass
+[107:15.00]And then can they make
+[107:16.00]It not a pain in the ass
+[107:17.00]And then can they make
+[107:18.00]It not a pain in the ass
+[107:19.00]And then can they make
+[107:20.00]It not a pain in the ass
+[107:21.00]And then can they make
+[107:22.00]It not a pain in the ass
+[107:23.00]And then can they make
+[107:24.00]It not a pain in the ass
+[107:25.00]And then can they make
+[107:26.00]It not a pain in the ass
+[107:27.00]And then can they make
+[107:28.00]It not a pain in the ass
+[107:29.00]And then can they make
+[107:30.00]It not a pain in the ass
+[107:31.00]And then can they make
+[107:32.00]It not a pain in the ass
+[107:33.00]And then can they make
+[107:34.00]It not a pain in the ass
+[107:35.00]And then can they make
+[107:36.00]It not a pain in the ass
+[107:37.00]And then can they make
+[107:38.00]It not a pain in the ass
+[107:39.00]And then can they make
+[107:40.00]It not a pain in the ass
+[107:41.00]And then can they make
+[107:42.00]It not a pain in the ass
+[107:43.00]And then can they make
+[107:44.00]It not a pain in the ass
+[107:45.00]And then can they make
+[107:46.00]It not a pain in the ass
+[107:47.00]And then can they make
+[107:48.00]It not a pain in the ass
+[107:49.00]And then can they make
+[107:50.00]It not a pain in the ass
+[107:51.00]And then can they make
+[107:52.00]It not a pain in the ass
+[107:53.00]And then can they make
+[107:54.00]It not a pain in the ass
+[107:55.00]And then can they make
+[107:56.00]It not a pain in the ass
+[107:57.00]And then can they make
+[107:58.00]It not a pain in the ass
+[107:59.00]And then can they make
+[108:00.00]It not a pain in the ass
+[108:01.00]And then can they make
+[108:02.00]And then can they make
+[108:03.00]And then can they make
+[108:04.00]And then can they make
+[108:05.00]It not a pain in the ass
+[108:06.00]And then can they make
+[108:07.00]It not a pain in the ass
+[108:08.00]And then can they make
+[108:09.00]It not a pain in the ass
+[108:10.00]And then can they make
+[108:11.00]It not a pain in the ass
+[108:12.00]And then can they make
+[108:13.00]It not a pain in the ass
+[108:14.00]And then can they make
+[108:15.00]It not a pain in the ass
+[108:16.00]And then can they make
+[108:17.00]It not a pain in the ass
+[108:18.00]And then can they make
+[108:19.00]It not a pain in the ass
+[108:20.00]And then can they make
+[108:21.00]It not a pain in the ass
+[108:22.00]And then can they make
+[108:23.00]It not a pain in the ass
+[108:24.00]And then can they make
+[108:25.00]It not a pain in the ass
+[108:26.00]And then can they make
+[108:27.00]It not a pain in the ass
+[108:28.00]And then can they make
+[108:29.00]And then can they make
+[108:30.00]It not a pain in the ass
+[108:31.00]And then can they make
+[108:32.00]It not a pain in the ass
+[108:33.00]And then can they make
+[108:34.00]It not a pain in the ass
+[108:35.00]And then can they make
+[108:36.00]It not a pain in the ass
+[108:37.00]And then can they make
+[108:38.00]It not a pain in the ass
+[108:39.00]And then can they make
+[108:40.00]It not a pain in the ass
+[108:41.00]And then can they make
+[108:42.00]It not a pain in the ass
+[108:43.00]And then can they make
+[108:44.00]It not a pain in the ass
+[108:45.00]And then can they make
+[108:46.00]It not a pain in the ass
+[108:47.00]And then can they make
+[108:48.00]It not a pain in the ass
+[108:49.00]And then can they make
+[108:50.00]It not a pain in the ass
+[108:51.00]And then can they make
+[108:52.00]It not a pain in the ass
+[108:53.00]And then can they make
+[108:54.00]It not a pain in the ass
+[108:55.00]And then can they make
+[108:56.00]It not a pain in the ass
+[108:57.00]And then can they make
+[108:58.00]It not a pain in the ass
+[108:59.00]And then can they make
+[109:00.00]It not a pain in the ass
+[109:01.00]And then can they make
+[109:02.00]It not a pain in the ass
+[109:03.00]And then can they make
+[109:04.00]It not a pain in the ass
+[109:05.00]And then can they make
+[109:06.00]It not a pain in the ass
+[109:07.00]And then can they make
+[109:08.00]It not a pain in the ass
+[109:09.00]And then can they make
+[109:10.00]It not a pain in the ass
+[109:11.00]And then can they make
+[109:12.00]It not a pain in the ass
+[109:13.00]And then can they make
+[109:14.00]It not a pain in the ass
+[109:15.00]And then can they make
+[109:16.00]It not a pain in the ass
+[109:17.00]And then can they make
+[109:18.00]It not a pain in the ass
+[109:19.00]And then can they make
+[109:20.00]It not a pain in the ass
+[109:21.00]And then can they make
+[109:22.00]It not a pain in the ass
+[109:23.00]And then can they make
+[109:24.00]It not a pain in the ass
+[109:25.00]And then can they make
+[109:26.00]It not a pain in the ass
+[109:27.00]And then can they make
+[109:28.00]It not a pain in the ass
+[109:29.00]And then can they make
+[109:30.00]It not a pain in the ass
+[109:31.00]And then can they make
+[109:32.00]It not a pain in the ass
+[109:33.00]And then can they make
+[109:34.00]It not a pain in the ass
+[109:35.00]And then can they make
+[109:36.00]It not a pain in the ass
+[109:37.00]And then can they make
+[109:38.00]It not a pain in the ass
+[109:39.00]And then can they make
+[109:40.00]It not a pain in the ass
+[109:41.00]And then can they make
+[109:42.00]It not a pain in the ass
+[109:43.00]And then can they make
+[109:44.00]It not a pain in the ass
+[109:45.00]And then can they make
+[109:46.00]It not a pain in the ass
+[109:47.00]And then can they make
+[109:48.00]It not a pain in the ass
+[109:49.00]And then can they make
+[109:50.00]It not a pain in the ass
+[109:51.00]And then can they make
+[109:52.00]It not a pain in the ass
+[109:53.00]And then can they make
+[109:54.00]It not a pain in the ass
+[109:55.00]And then can they make
+[109:56.00]It not a pain in the ass
+[109:57.00]And then can they make
+[109:58.00]It not a pain in the ass
+[109:59.00]And then can they make
+[110:00.00]It not a pain in the ass
+[110:01.00]And then can they make
+[110:02.00]It not a pain in the ass
+[110:03.00]And then can they make
+[110:04.00]It not a pain in the ass
+[110:05.00]And then can they make
+[110:06.00]It not a pain in the ass
+[110:07.00]And then can they make
+[110:08.00]It not a pain in the ass
+[110:09.00]And then can they make
+[110:10.00]It not a pain in the ass
+[110:11.00]And then can they make
+[110:12.00]It not a pain in the ass
+[110:13.00]And then can they make
+[110:14.00]It not a pain in the ass
+[110:15.00]And then can they make
+[110:16.00]It not a pain in the ass
+[110:17.00]And so then I just
+[110:18.00]I just started editing them
+[110:19.00]And so then I just started editing them
+[110:20.00]So I have stopped
+[110:21.00]Comparing RAG with long
+[110:22.00]Context of Fine Tuning
+[110:23.00]Hold on, you said I retweeted
+[110:24.00]You defending it
+[110:25.00]I thought you were hating on it
+[110:26.00]And that's why I retweeted it
+[110:28.00]It's not one of a defense
+[110:30.00]Because everyone was like
+[110:31.00]Long context is killing RAG
+[110:32.00]And then I had a future
+[110:33.00]Oh that should be so quadratic
+[110:34.00]That's another one
+[110:36.00]And I actually
+[110:37.00]Messed the fine print as well
+[110:38.00]Let's see
+[110:39.00]Power benefits of
+[110:40.00]Astrum dominant
+[110:41.00]Yeah, so that's a good question
+[110:43.00]Astrum is on chip memory
+[110:45.00]Everyone's just using HBM
+[110:46.00]If you don't have to go
+[110:47.00]To off-chip memory
+[110:48.00]That'd be really efficient
+[110:49.00]Right?
+[110:50.00]Because you're
+[110:51.00]You're not moving bits around
+[110:52.00]But there's always
+[110:54.00]The issue of
+[110:55.00]You don't have enough memory
+[110:56.00]So you still have to move
+[110:57.00]Bits around constantly
+[110:58.00]And so that's the
+[110:59.00]That's the question
+[111:00.00]So yeah, sure
+[111:01.00]If you can not move data
+[111:02.00]Around as you compute
+[111:03.00]It's going to be fantastically
+[111:04.00]Efficient
+[111:05.00]But that isn't really
+[111:06.00]It's not really just
+[111:07.00]Easier simple to do
+[111:08.00]What do you think is going to be
+[111:09.00]Harder in the future
+[111:10.00]Like getting more energy
+[111:11.00]At cheaper cost
+[111:12.00]Like getting more of this hardware
+[111:14.00]To run
+[111:15.00]Yeah, I wonder
+[111:16.00]So someone was talking about this earlier
+[111:18.00]But it's like
+[111:19.00]Here in the crowd
+[111:20.00]And I'm looking right at him
+[111:21.00]But he's complaining
+[111:22.00]That journalists keep saying that
+[111:24.00]You know that
+[111:25.00]Like misreporting about how data centers
+[111:27.00]Or what data centers
+[111:28.00]Are doing to the environment
+[111:29.00]Right?
+[111:30.00]Which I thought was quite funny
+[111:31.00]Because they're inundated by
+[111:33.00]Journalists talking about data centers
+[111:35.00]Like destroying the world
+[111:36.00]Anyways, you know
+[111:37.00]That's not quite the case
+[111:38.00]But yeah, I don't know
+[111:39.00]Like the power is certainly
+[111:42.00]Gonna be hard to get
+[111:44.00]But you know
+[111:45.00]I think
+[111:46.00]If you just look at history
+[111:47.00]Right?
+[111:48.00]Like humanity
+[111:49.00]Especially America
+[111:50.00]Like power
+[111:51.00]Power production and usage
+[111:52.00]Kept skyrocketing
+[111:53.00]From like the 1700s
+[111:55.00]To like 1970s
+[111:57.00]And then it's kind of
+[111:58.00]Flat line from there
+[111:59.00]So why can't we like
+[112:00.00]Go back to the like growth stage
+[112:02.00]I guess it's like the
+[112:03.00]The whole like mantra
+[112:04.00]Of like accelerationists
+[112:06.00]I guess
+[112:07.00]This is IAC, yep
+[112:08.00]Well, I don't think it's IAC
+[112:09.00]I think it's like
+[112:10.00]Same Altman
+[112:11.00]Hotly believes this too
+[112:12.00]And I don't think he's IAC
+[112:13.00]So but yeah
+[112:14.00]Like I don't think like
+[112:15.00]It's like think
+[112:16.00]It's like something to
+[112:17.00]Think about like
+[112:18.00]The US is going back
+[112:19.00]To growing an energy usage
+[112:21.00]Where as for the last like
+[112:22.00]Forty years kind of
+[112:24.00]We're flat on energy usage
+[112:25.00]And what does that mean
+[112:26.00]Like yeah
+[112:29.00]There was another question
+[112:31.00]On Marvel but kind of the
+[112:32.00]I think that's
+[112:33.00]It's definitely like
+[112:34.00]One of these three guys
+[112:35.00]We're on the buy side
+[112:36.00]That are asking this question
+[112:39.00]Wanna know if Marvell's stock
+[112:41.00]Is gonna go up
+[112:43.00]So Marvell
+[112:44.00]They're doing the
+[112:46.00]Customizing for GROC
+[112:48.00]They also do the
+[112:49.00]Trinium too
+[112:50.00]And the Google CPU
+[112:51.00]Yeah any other
+[112:52.00]Any other chip
+[112:53.00]That they're working on
+[112:54.00]That people should
+[112:55.00]Should keep in mind
+[112:56.00]It's like yeah
+[112:57.00]Any needle moving
+[112:58.00]Any stock moving
+[112:59.00]Yeah exactly
+[113:01.00]They're working on
+[113:02.00]Some more stuff
+[113:03.00]Yeah I'll refrain from
+[113:05.00]Yeah all right
+[113:06.00]Let's see
+[113:07.00]Other GROC stuff
+[113:08.00]We want to get it
+[113:09.00]Get through
+[113:10.00]I don't think so
+[113:11.00]All right
+[113:12.00]Most of other ones
+[113:13.00]You're going edge compute hardware
+[113:16.00]Any real use cases
+[113:17.00]For it
+[113:18.00]Yeah I mean
+[113:20.00]I have like a
+[113:21.00]Really like anti edge view
+[113:23.00]So many people were like
+[113:25.00]Oh I'm gonna run
+[113:26.00]This model on my phone
+[113:27.00]Or on my laptop
+[113:28.00]And I love
+[113:30.00]I love how much
+[113:31.00]It's raining so now
+[113:32.00]I can be horrible
+[113:33.00]And you people won't leave
+[113:35.00]Like I want you
+[113:36.00]To try and leave
+[113:37.00]This building
+[113:38.00]Captive audience
+[113:40.00]Should I start singing
+[113:41.00]Like there's
+[113:42.00]Nothing you can do
+[113:43.00]You definitely
+[113:44.00]I'll stop you from that
+[113:45.00]Sorry
+[113:46.00]Edge hardware
+[113:47.00]Like you know
+[113:48.00]People are like
+[113:49.00]I'm going to run
+[113:50.00]This model on my phone
+[113:51.00]Or on my laptop
+[113:52.00]It makes no sense to me
+[113:53.00]Current hardware
+[113:54.00]Is not really capable of it
+[113:55.00]So you're going to buy
+[113:56.00]Any hardware
+[113:57.00]To run
+[113:58.00]Whatever on the edge
+[113:59.00]Or you're going to
+[114:00.00]Just run
+[114:01.00]Very very small models
+[114:02.00]But in either case
+[114:03.00]You're going to end up
+[114:05.00]With like
+[114:06.00]The performance is really low
+[114:07.00]And then whatever you spent
+[114:08.00]To run it locally
+[114:09.00]In the cloud
+[114:10.00]It could service 10x the users
+[114:12.00]So it kind of like
+[114:14.00]SOL in terms of like
+[114:17.00]Economics of
+[114:18.00]Running things on the edge
+[114:20.00]And then like latency is like
+[114:23.00]For LLMs
+[114:24.00]Right for LLMs
+[114:25.00]It's like
+[114:26.00]Not that big of a deal
+[114:27.00]Relative
+[114:28.00]To like
+[114:29.00]Internet latency
+[114:30.00]Is not that big of a deal
+[114:31.00]Relative to the
+[114:32.00]Use of the model
+[114:33.00]Right like the actual model
+[114:34.00]Operating
+[114:35.00]Whether it's on edge hardware
+[114:36.00]Or cloud hardware
+[114:37.00]And cloud hardware
+[114:38.00]Like edge hardware
+[114:39.00]Is not really
+[114:40.00]Able to like
+[114:41.00]Have a measurable
+[114:43.00]Appreciable
+[114:44.00]Like advantage
+[114:45.00]Over cloud hardware
+[114:48.00]This applies
+[114:49.00]To diffusion models
+[114:50.00]This applies to LLMs
+[114:52.00]Of course small models
+[114:53.00]Will be able to run
+[114:54.00]But not all
+[114:55.00]Yeah
+[114:56.00]What chances
+[114:57.00]The startups
+[114:58.00]Like medax etched
+[114:59.00]Or 5,600
+[115:00.00]I think you all interviewed them
+[115:01.00]Why don't you answer
+[115:02.00]Yeah we have connections
+[115:03.00]With medax and lemur
+[115:04.00]And we haven't know
+[115:06.00]But Gavin is friendly
+[115:07.00]They didn't
+[115:08.00]Yeah they said
+[115:09.00]They don't want to talk publicly
+[115:10.00]Yeah
+[115:11.00]What they're doing
+[115:12.00]It's something like
+[115:13.00]When they open up
+[115:14.00]We can
+[115:15.00]Sure sure
+[115:16.00]Yeah
+[115:17.00]But do you think like
+[115:18.00]I think the two three
+[115:19.00]We're going to answer the question
+[115:20.00]What do you think of them
+[115:21.00]There's a couple things
+[115:23.00]It's like
+[115:24.00]How do the other companies
+[115:26.00]Innovate against them
+[115:27.00]I think when you do a new
+[115:28.00]Silicon you're like
+[115:29.00]Oh we're going to be
+[115:30.00]So much better at this thing
+[115:31.00]You're like much faster
+[115:32.00]Much cheaper
+[115:33.00]But there's all the other curves
+[115:34.00]Going down
+[115:35.00]On the macro environment
+[115:36.00]So if it takes you
+[115:37.00]Like five years
+[115:38.00]Before you were
+[115:39.00]Like a lot better
+[115:40.00]Five years later
+[115:41.00]Once you take the chip out
+[115:42.00]You're only comparing yourself
+[115:43.00]To the five year advancement
+[115:44.00]That the major companies had to
+[115:46.00]So then it's like
+[115:47.00]Okay that
+[115:48.00]We're going to have like
+[115:49.00]The C300 whatever
+[115:50.00]From from nvidia
+[115:51.00]By the time
+[115:52.00]Some of these chips come up
+[115:53.00]What's after Z
+[115:55.00]What do you think is after Z
+[115:56.00]In the roadmap
+[115:57.00]It's X Y Z
+[116:00.00]No
+[116:01.00]Anyways
+[116:03.00]Yeah yeah
+[116:04.00]It's like the age old problem
+[116:05.00]You build a chip
+[116:06.00]It has some cool thing
+[116:07.00]Cool feature
+[116:08.00]And then like
+[116:09.00]A year later nvidia
+[116:10.00]Has it in hardware
+[116:11.00]It has implemented
+[116:12.00]Some flavor of that in hardware
+[116:14.00]Or two generations out
+[116:16.00]Like what idea
+[116:17.00]Are you going to have
+[116:18.00]That nvidia can't implement
+[116:19.00]Is like really the question
+[116:21.00]It's like you have to be
+[116:22.00]Fundamentally different
+[116:23.00]In some way
+[116:24.00]That holds through
+[116:25.00]For four or five years
+[116:26.00]That's kind of the big issue
+[116:28.00]But you know like
+[116:30.00]Like those people have
+[116:31.00]Some ideas that are interesting
+[116:32.00]And yeah maybe
+[116:33.00]It'll work out right
+[116:34.00]It's going to be hard
+[116:35.00]To fight nvidia
+[116:36.00]Who one doesn't
+[116:37.00]Consider them competition
+[116:38.00]I worried about like
+[116:39.00]Google and Amazon's chip
+[116:40.00]Right they're not
+[116:41.00]And I guess
+[116:42.00]To some extent AMD's chip
+[116:43.00]But like
+[116:44.00]They're not really worried
+[116:45.00]About you know
+[116:46.00]Maddox or etched
+[116:47.00]Or or grok
+[116:48.00]Or you know
+[116:49.00]Positron or any of these folks
+[116:51.00]How much of an advantage
+[116:53.00]Do they have
+[116:54.00]By working closely
+[116:55.00]With like open ai
+[116:56.00]And some of these other folks
+[116:57.00]And then already knowing
+[116:58.00]Where some of the
+[116:59.00]Architecture decisions
+[117:00.00]Are going and since
+[117:01.00]Those companies are like
+[117:02.00]The biggest buyers
+[117:03.00]Of the chips
+[117:04.00]Yeah I mean like
+[117:05.00]You see like
+[117:06.00]Like the most important
+[117:07.00]Sort of ai companies
+[117:09.00]Are obviously going to
+[117:10.00]Tell hardware vendors
+[117:11.00]What they want
+[117:12.00]You know open ai
+[117:13.00]And you know
+[117:14.00]So on and so forth
+[117:15.00]Right they can obviously
+[117:16.00]Tell them what they want
+[117:17.00]And the startups
+[117:18.00]Aren't actually going to
+[117:19.00]Get anywhere close to
+[117:20.00]As much feedback on
+[117:21.00]What to do on like
+[117:22.00]You know very
+[117:23.00]My new low level stuff
+[117:24.00]So that is a difficult here
+[117:26.00]Some startups
+[117:27.00]Like Maddox
+[117:28.00]Obviously have
+[117:29.00]People who built
+[117:30.00]Or worked on
+[117:31.00]The largest models
+[117:32.00]And other startups
+[117:33.00]Might not have
+[117:34.00]That advantage
+[117:35.00]And so they're always
+[117:36.00]Gonna have that issue
+[117:37.00]Of like
+[117:38.00]Hey how do I get
+[117:39.00]The feedback
+[117:40.00]Or what's changing
+[117:41.00]What do they see
+[117:42.00]Down the pipeline
+[117:43.00]That's
+[117:44.00]That I really need
+[117:45.00]To be aware of
+[117:46.00]And ready for
+[117:47.00]When I design
+[117:48.00]My hardware
+[117:49.00]All right
+[117:50.00]Every hardware shortage
+[117:51.00]As eventually
+[117:52.00]Turn into a glut
+[117:53.00]Will that be
+[117:54.00]Through of NVIDIA chips
+[117:55.00]It's so when
+[117:56.00]But also why
+[117:57.00]Absolutely
+[117:58.00]And I'm so excited
+[117:59.00]To buy like
+[118:00.00]H100s
+[118:01.00]Not a thousand
+[118:02.00]But yeah
+[118:03.00]Everyone's
+[118:04.00]Gonna buy chips
+[118:05.00]It's just the way
+[118:06.00]Semiconductors work
+[118:07.00]Because the supply chain
+[118:08.00]Takes forever to build out
+[118:09.00]And it's like
+[118:10.00]A really weird thing
+[118:11.00]So if the backlog
+[118:12.00]Of chips is a year
+[118:15.00]People will order
+[118:16.00]Two years worth
+[118:17.00]Of what they want
+[118:18.00]For the next year
+[118:19.00]It is like
+[118:20.00]A very common thing
+[118:21.00]It's not just like
+[118:22.00]This AI cycle
+[118:23.00]But like
+[118:24.00]Like microcontroller
+[118:25.00]Like the automotive companies
+[118:26.00]They order
+[118:27.00]Two years worth
+[118:28.00]Of what they needed
+[118:29.00]For one year
+[118:30.00]What happens
+[118:31.00]In semiconductors
+[118:32.00]When lead times
+[118:33.00]Lengthen
+[118:34.00]The purchases
+[118:35.00]And inventory
+[118:36.00]Is sort of like
+[118:37.00]Double
+[118:38.00]So these
+[118:39.00]The NVIDIA GPU shortage
+[118:44.00]Obviously is going to be
+[118:45.00]Rectified
+[118:46.00]And when it is
+[118:47.00]Everyone's sort of
+[118:48.00]Double orders
+[118:49.00]Will become
+[118:50.00]Extremely apparent
+[118:51.00]Right
+[118:52.00]And you know
+[118:53.00]You see like
+[118:54.00]Random companies
+[118:55.00]Out of nowhere being
+[118:56.00]Like yeah
+[118:57.00]We've got 32,000
+[118:58.00]H100s on order
+[118:59.00]And 25,000
+[119:00.00]And trust
+[119:01.00]They're not all
+[119:02.00]They're not all
+[119:03.00]Real orders
+[119:04.00]For one
+[119:05.00]But I think
+[119:06.00]The bubble will
+[119:07.00]Continue on
+[119:08.00]For a long time
+[119:09.00]Right like it's not
+[119:10.00]It's not going to end
+[119:11.00]Like this year
+[119:12.00]Right like people
+[119:13.00]People need AI
+[119:14.00]Right like I think
+[119:15.00]Everyone in this audience
+[119:16.00]Would agree right like
+[119:17.00]There's no
+[119:18.00]There's no
+[119:19.00]Like immediate
+[119:20.00]Like end to the
+[119:21.00]To the bubble
+[119:22.00]Right
+[119:23.00]What's next
+[119:24.00]Why
+[119:25.00]I think it's just
+[119:26.00]Because the supply chain
+[119:27.00]Expands so much
+[119:28.00]Some companies
+[119:29.00]Will continue to buy
+[119:30.00]Like an
+[119:31.00]Open AI
+[119:32.00]Or meta
+[119:33.00]Will continue to buy
+[119:34.00]But then like
+[119:35.00]All these random
+[119:36.00]Startups will
+[119:37.00]Or a lot of them
+[119:38.00]Will not be able to
+[119:39.00]Continue to buy
+[119:40.00]So then
+[119:41.00]That kind of leads to like
+[119:42.00]They'll pause
+[119:43.00]For a little bit
+[119:44.00]Or like
+[119:45.00]I think in 2018
+[119:46.00]Right like
+[119:47.00]Memory pricing was
+[119:48.00]Extremely high
+[119:49.00]Then all of a sudden
+[119:50.00]Google, Microsoft
+[119:51.00]And Amazon
+[119:52.00]All agreed
+[119:53.00]I don't you know
+[119:54.00]They won't
+[119:55.00]They won't say
+[119:56.00]It's together
+[119:57.00]It's a week
+[119:58.00]To stop ordering memory
+[119:59.00]And within like
+[120:00.00]A month
+[120:01.00]The price of memory
+[120:02.00]Started tanking
+[120:03.00]Like insane amounts
+[120:04.00]Right and like
+[120:05.00]People will claim
+[120:06.00]You know all sorts of
+[120:07.00]Reasons why
+[120:08.00]That was timed
+[120:09.00]Extremely well
+[120:10.00]It was like very clear
+[120:11.00]And people in the
+[120:12.00]Financial markets
+[120:13.00]Were able to make trades
+[120:14.00]And everything
+[120:15.00]People stopped buying
+[120:16.00]And it's not like
+[120:17.00]Their demand just dried up
+[120:18.00]It's just like
+[120:19.00]They had a little bit
+[120:20.00]Of a demand slowdown
+[120:21.00]And then they had enough
+[120:22.00]Inventory that they could
+[120:23.00]Like weather until
+[120:24.00]Like prices tanked
+[120:25.00]Because it's such
+[120:26.00]Thank you very much
+[120:27.00]Is it
+[120:28.00]Hey everyone
+[120:33.00]And so
+[120:34.00]Today we have a special guest
+[120:35.00]Millions from Capital One
+[120:37.00]But I would tend to
+[120:38.00]Like to introduce people
+[120:39.00]With a bit of their background
+[120:41.00]And then learn a little bit
+[120:42.00]More about you
+[120:43.00]On the personal side
+[120:44.00]You call your PhD
+[120:45.00]In a probabilistic framework
+[120:47.00]From mapping audio
+[120:48.00]Visual features to semantics
+[120:49.00]I feel like that
+[120:50.00]Is like the beginnings
+[120:51.00]Of like a multimodal
+[120:52.00]AI model in some sense
+[120:54.00]Do you have any sort
+[120:55.00]Factions on your PhD
+[120:56.00]Vis this cut
+[120:57.00]Thanks for having me
+[120:59.00]And so
+[121:00.00]Let me say this that
+[121:01.00]It almost feels like
+[121:02.00]Things go around in circles
+[121:03.00]Right
+[121:04.00]In research and development
+[121:05.00]And so
+[121:06.00]At the right time
+[121:07.00]And the right place
+[121:08.00]You kind of intersect
+[121:09.00]Back with some of the topics
+[121:11.00]And then
+[121:12.00]Some other conditions
+[121:13.00]That have happened
+[121:14.00]Suddenly make
+[121:15.00]A big difference
+[121:16.00]Between that ticking off
+[121:17.00]Vorses you know
+[121:18.00]It may not be
+[121:19.00]You know
+[121:20.00]As intently pursued
+[121:21.00]At any given point of time
+[121:22.00]Right so
+[121:23.00]I have been
+[121:24.00]In AI for now
+[121:25.00]Three decades
+[121:26.00]You know
+[121:27.00]You talked about
+[121:28.00]My PhD thesis
+[121:29.00]My bachelor's thesis
+[121:30.00]Was on implementing
+[121:32.00]Neural networks
+[121:33.00]On India's
+[121:34.00]You know
+[121:35.00]Homegrown supercomputers
+[121:36.00]Back then
+[121:37.00]And so
+[121:38.00]You know
+[121:39.00]This whole notion of
+[121:40.00]Message passing
+[121:41.00]And distributed computing
+[121:42.00]And computing weights
+[121:44.00]You know
+[121:45.00]And then bringing
+[121:46.00]All of them back
+[121:47.00]Distributing
+[121:48.00]The computations
+[121:49.00]Of the neural network
+[121:50.00]You know
+[121:51.00]Forward pass
+[121:52.00]Those things
+[121:53.00]For what we used to do
+[121:54.00]You know
+[121:55.00]For that
+[121:56.00]Particular supercomputing
+[121:57.00]Architecture
+[121:58.00]We had back then
+[121:59.00]And then
+[122:00.00]My PhD
+[122:01.00]Of course was
+[122:02.00]How to understand
+[122:03.00]What's going on
+[122:04.00]In a video
+[122:05.00]Right and use
+[122:06.00]Multi-modal cues
+[122:07.00]To your point
+[122:08.00]You know
+[122:09.00]What has happened
+[122:10.00]In the last couple of
+[122:11.00]Decades
+[122:12.00]One we have
+[122:13.00]Tremendous amount
+[122:14.00]Of data explosion
+[122:15.00]So when I was doing
+[122:16.00]My PhD
+[122:17.00]I used to
+[122:18.00]Actually go
+[122:19.00]To blockbuster
+[122:20.00]And rent
+[122:21.00]Explosions
+[122:22.00]And how do you
+[122:23.00]Actually
+[122:24.00]Models in the audio stream
+[122:25.00]And the visual stream
+[122:26.00]For something
+[122:27.00]Like an explosion
+[122:28.00]So I remember
+[122:29.00]Going and doing
+[122:30.00]All this digitization
+[122:31.00]Of tape and then
+[122:32.00]Cleaning up the data
+[122:33.00]And then
+[122:34.00]You know
+[122:35.00]Having some kind
+[122:36.00]Of a labeling tool
+[122:37.00]That I actually cooked up
+[122:38.00]And having a spouse
+[122:40.00]Of a friend of mine
+[122:41.00]To do the labeling
+[122:42.00]For me
+[122:43.00]So look at
+[122:44.00]Where we were back then
+[122:45.00]And now you have
+[122:46.00]Scale.ai
+[122:47.00]That basically goes
+[122:48.00]And does labeling
+[122:49.00]For a lot of these models
+[122:50.00]And so forth
+[122:51.00]So scale
+[122:52.00]Of data has changed
+[122:53.00]That's one
+[122:54.00]The second thing
+[122:55.00]That has changed
+[122:56.00]Is we were looking
+[122:57.00]At computing
+[122:58.00]Architectures
+[122:59.00]That were
+[123:00.00]Much much
+[123:01.00]Much less rich
+[123:02.00]In terms of what
+[123:03.00]We had back then
+[123:04.00]And the 2012
+[123:06.00]Breakthrough
+[123:07.00]By Hinton and his students
+[123:09.00]And using GPUs
+[123:11.00]Really when
+[123:12.00]The ImageNet competition
+[123:14.00]Really helped
+[123:15.00]Take this field off
+[123:16.00]In a completely
+[123:17.00]New direction
+[123:18.00]And at
+[123:19.00]Very very large scale
+[123:20.00]And the third thing
+[123:21.00]Is of course
+[123:22.00]The GPU computing
+[123:23.00]Back then
+[123:24.00]I did not have access
+[123:25.00]When I was doing my PhD
+[123:26.00]To some of the amazing things
+[123:28.00]That Nvidia hadn't yet built
+[123:30.00]And so
+[123:31.00]It's I think really
+[123:32.00]The confluence of those three things
+[123:34.00]Which make all the difference
+[123:35.00]Between a lot of this research
+[123:36.00]That happened
+[123:37.00]In the late 90s
+[123:38.00]And what's happening
+[123:39.00]Between the 2010
+[123:41.00]To now
+[123:42.00]Kind of time frame
+[123:43.00]But if you look at the intent
+[123:45.00]The intent was the same
+[123:47.00]What's in the video
+[123:48.00]How do we understand
+[123:49.00]The multimodal cues
+[123:50.00]You know
+[123:51.00]That come together
+[123:52.00]To give us
+[123:53.00]That semantic understanding
+[123:54.00]Of what's in the video
+[123:55.00]And so
+[123:56.00]To that extent
+[123:57.00]The problems
+[123:58.00]We were trying
+[123:59.00]To solve the same
+[124:00.00]But the tools
+[124:01.00]That we have now
+[124:02.00]Are amazingly
+[124:03.00]Amazingly different
+[124:04.00]And amazingly
+[124:05.00]More powerful
+[124:06.00]Are there any
+[124:07.00]Maybe research approaches
+[124:08.00]Or ML patterns
+[124:10.00]That you tried
+[124:11.00]That didn't work
+[124:12.00]That you think
+[124:13.00]Will work today
+[124:14.00]Would you abuse
+[124:15.00]That people haven't tried
+[124:16.00]I think
+[124:17.00]There are many people
+[124:18.00]That have done serious
+[124:19.00]ML research before
+[124:20.00]The GPU era
+[124:21.00]I would say
+[124:22.00]If you think about
+[124:23.00]All the ML researchers
+[124:24.00]Working today
+[124:25.00]Most of them are post-GPU
+[124:26.00]Any like story
+[124:27.00]That you remember
+[124:28.00]That you were like
+[124:29.00]Oh, this seems really promising
+[124:30.00]But like
+[124:31.00]There wasn't enough compute
+[124:32.00]Or anything like that
+[124:33.00]The whole concept
+[124:34.00]Of modeling context
+[124:35.00]Right
+[124:36.00]So my thesis was about
+[124:37.00]Not
+[124:38.00]How do you just
+[124:39.00]Detect isolated things
+[124:40.00]In a video
+[124:41.00]Right
+[124:42.00]Like this is a car
+[124:43.00]That's an explosion
+[124:44.00]And so on and so forth
+[124:45.00]If I see these
+[124:46.00]End things together
+[124:47.00]Do they actually
+[124:48.00]Contextually make sense
+[124:49.00]Like do I see
+[124:50.00]The sky
+[124:51.00]About the land
+[124:52.00]And if I do that
+[124:53.00]Then I have a higher confidence
+[124:55.00]That this indeed is sky
+[124:56.00]And that indeed is land
+[124:57.00]Right
+[124:58.00]And so on and so forth
+[124:59.00]So when we were trying
+[125:00.00]To model that context
+[125:01.00]We had extremely limited
+[125:03.00]Libling
+[125:04.00]And extremely limited
+[125:05.00]Corpus
+[125:06.00]In terms of how we could do it
+[125:08.00]Now when I think back
+[125:10.00]I think that
+[125:11.00]What better way to describe
+[125:12.00]Context than a multimodal LLM
+[125:14.00]Right
+[125:15.00]Which strain on
+[125:16.00]As much of the data
+[125:18.00]As there is on the internet
+[125:20.00]In terms of multimodality
+[125:22.00]And that to me
+[125:23.00]Would have been an amazing thing
+[125:24.00]For me to have back then
+[125:26.00]So I would say
+[125:27.00]How to model context
+[125:28.00]Is a problem
+[125:29.00]That is going to be evergreen
+[125:30.00]It never goes out of fashion
+[125:32.00]But how we are able
+[125:33.00]To do it now
+[125:34.00] Versus then
+[125:35.00]I see as
+[125:36.00]One of the steps towards
+[125:38.00]Truly understanding
+[125:39.00]You know what's happening
+[125:40.00]Right in the multiple modalities
+[125:42.00]The other part
+[125:43.00]Is reasoning
+[125:44.00]Right
+[125:45.00]I think we are still
+[125:46.00]In the very early innings
+[125:47.00]Of reasoning
+[125:48.00]You know we see this
+[125:49.00]Interesting evolution
+[125:51.00]Of you know
+[125:52.00]How do you actually
+[125:53.00]Build a model of the world
+[125:55.00]And Yanlacan's work
+[125:56.00]You know very interesting
+[125:58.00]In this sense to me
+[125:59.00]He has been talking
+[126:00.00]About it for a while
+[126:01.00]Right so
+[126:02.00]But now I think
+[126:03.00]We are getting to a point
+[126:04.00]Where a lot of those pieces
+[126:05.00]May start coming together
+[126:07.00]And I think solving
+[126:09.00]That reasoning piece
+[126:10.00]Is a very very critical step
+[126:12.00]Before we can actually
+[126:14.00]Build
+[126:15.00]Truly intelligent machines
+[126:17.00]And do you have any
+[126:18.00]Intuition on the part
+[126:19.00]The video
+[126:20.00]Is going to play in it
+[126:21.00]Because a lot of Yan's
+[126:22.00]Also talking points
+[126:23.00]Are around you know
+[126:24.00]With vjapa
+[126:25.00]And some of those models
+[126:26.00]If you show
+[126:27.00]What's going to happen next
+[126:28.00]Like that's part
+[126:29.00]Of a war model
+[126:30.00]Like it's video going to be
+[126:32.00]Like a big part in it
+[126:33.00]Like do we need to
+[126:34.00]Get there to actually
+[126:35.00]Get a real war model
+[126:36.00]Like do you think text
+[126:37.00]Is enough to
+[126:38.00]Get a good shot at it
+[126:39.00]And maybe biased
+[126:40.00]In answering that question
+[126:41.00]Given that
+[126:42.00]I you know
+[126:43.00]Cut my teeth
+[126:44.00]In multi modality
+[126:45.00]And given that
+[126:46.00]The video modality
+[126:47.00]In general
+[126:48.00]Right is a lot
+[126:49.00]More challenging
+[126:50.00]You know whether
+[126:51.00]It's just because
+[126:52.00]Of the sheer size of data
+[126:53.00]Right in terms of
+[126:54.00]The number of pixels
+[126:55.00]That you need to process
+[126:56.00]Whether it is because
+[126:57.00]You are actually
+[126:58.00]Capturing the real world
+[127:00.00]Which tends to
+[127:01.00]Be far more complex
+[127:02.00]You know in the
+[127:03.00]Case of language
+[127:04.00]You know humans
+[127:05.00]Over millennia
+[127:07.00]Have evolved this
+[127:09.00]Longly concise
+[127:11.00]Codebook
+[127:12.00]Of how to describe
+[127:13.00]Things
+[127:14.00]So there is a humongous
+[127:15.00]Amount of abstraction
+[127:18.00]And rationalization
+[127:20.00]And concise definition
+[127:22.00]That has gone on
+[127:23.00]In how languages
+[127:25.00]Evolved
+[127:26.00]And so you know
+[127:27.00]The number of words
+[127:28.00]In the vocabulary
+[127:29.00]Of a language
+[127:30.00]When you look at that
+[127:31.00]We are able to
+[127:32.00]Tell beautiful stories
+[127:33.00]Right with
+[127:34.00]Just those many
+[127:36.00]You know words
+[127:37.00]But if you look at
+[127:38.00]In the real world
+[127:39.00]If you look at
+[127:40.00]Capturing that
+[127:41.00]Right whether it is
+[127:42.00]Through
+[127:43.00]The eyes of a robot
+[127:44.00]As it's looking
+[127:45.00]You know and trying
+[127:46.00]To help you around
+[127:47.00]In a room setting
+[127:48.00]Or a building setting
+[127:49.00]Right or whether
+[127:50.00]It's the traffic
+[127:51.00]Right which
+[127:52.00]Vacues have to look at
+[127:53.00]When they're
+[127:54.00]Driving on the roads
+[127:55.00]The amount of
+[127:56.00]Variability
+[127:57.00]The amount of
+[127:58.00]Distortion of signal
+[127:59.00]That comes with
+[128:00.00]That right
+[128:01.00]Is just remarkably
+[128:03.00]Difficult
+[128:04.00]And remarkably rich
+[128:05.00]To really analyze
+[128:06.00]So I would say
+[128:07.00]Alicio
+[128:08.00]It's both
+[128:09.00]I feel that
+[128:10.00]The video modality
+[128:11.00]Has to be understood
+[128:13.00]For a true understanding
+[128:14.00]Of the world model
+[128:15.00]I would also say
+[128:16.00]That's harder
+[128:17.00]In some sense
+[128:19.00]Because of its inherent complexity
+[128:20.00]Then the language part
+[128:22.00]Which there's already
+[128:24.00]Consize representation
+[128:25.00]That people have come up with
+[128:26.00]Awesome
+[128:27.00]Sorry Sean
+[128:28.00]I know we hijacked the intro
+[128:30.00]But it was a good rabbit hole
+[128:32.00]To go into
+[128:33.00]No it's okay
+[128:34.00]I also just like
+[128:35.00]Stunned by
+[128:36.00]Background and history
+[128:37.00]That millennium is
+[128:38.00]Bringing to
+[128:39.00]AI you know
+[128:40.00]I guess the speed run through
+[128:41.00]In the resume
+[128:42.00]You know 14 years
+[128:43.00]At IBM
+[128:44.00]Finally ending
+[128:45.00]As chief scientist at
+[128:46.00]IBM research
+[128:47.00]And then since
+[128:48.00]Cisco
+[128:49.00]With cognitive systems
+[128:50.00]And CTO
+[128:51.00]Of metropolis at NVIDIA
+[128:52.00]What should people know
+[128:53.00]About like
+[128:54.00]How your
+[128:55.00]Interest in your trajectory
+[128:56.00]Has progressed through
+[128:57.00]Your career
+[128:58.00]You know when I reflect
+[128:59.00]Part of what I see
+[129:00.00]Is there is a constant
+[129:02.00]And the constant is
+[129:04.00]How do you actually
+[129:05.00]AI work
+[129:06.00]For you know
+[129:07.00]Name your favorite
+[129:08.00]Problem
+[129:09.00]That favorite problem
+[129:10.00]Changes maybe
+[129:11.00]From decade to decade
+[129:12.00]Or in the context of
+[129:13.00]Even my state
+[129:14.00]IBM research
+[129:15.00]But it's always been
+[129:16.00]How do we build
+[129:17.00]AI solutions
+[129:18.00]AI platforms
+[129:19.00]That solve
+[129:20.00]Real world problems
+[129:21.00]So when
+[129:22.00]We started out
+[129:23.00]We were the first
+[129:24.00]Video understanding platform
+[129:26.00]That we built at
+[129:27.00]IBM research right
+[129:28.00]We actually got a Wall Street
+[129:29.00]Multimedia
+[129:30.00]Award for it
+[129:31.00]Innovation Award for it
+[129:32.00]In the early 2000s
+[129:34.00]We helped with
+[129:35.00]Setting up the benchmark
+[129:36.00]That's known as
+[129:38.00]Trackweed
+[129:39.00]Which again
+[129:40.00]To at least your
+[129:41.00]Point people
+[129:42.00]Only know ImageNet
+[129:43.00]And thereafter
+[129:44.00]Before ImageNet
+[129:45.00]There was Trackweed
+[129:46.00]And so
+[129:47.00]The first decade
+[129:48.00]Shan was really about
+[129:49.00]You know how do we
+[129:50.00]Understand what's in the video
+[129:51.00]And how can then
+[129:52.00]We turn that into
+[129:53.00]Meaningful use of
+[129:55.00]AI technology
+[129:56.00]For media companies
+[129:58.00]You know broadcasting
+[129:59.00]Corporations
+[130:00.00]And so on and so forth
+[130:01.00]Right the business to
+[130:02.00]Business kind of setting
+[130:03.00]Of course
+[130:04.00]Because we were at
+[130:05.00]IBM
+[130:06.00]We did not focus
+[130:07.00]On the consumer
+[130:08.00]And then
+[130:09.00]You know YouTube
+[130:10.00]Happened right
+[130:11.00]Here in the valley
+[130:12.00]And then
+[130:13.00]You see the
+[130:14.00]Explosive content
+[130:15.00]And applications
+[130:16.00]Of AI to
+[130:17.00]You know those
+[130:18.00]Kind of video
+[130:19.00]Understanding problems
+[130:20.00]Then it was
+[130:21.00]How do we actually
+[130:22.00]Make sense
+[130:23.00]Out of our
+[130:24.00]IoT world
+[130:25.00]So when you have
+[130:26.00]Signals coming from
+[130:27.00]Sensors everywhere
+[130:28.00]You know whether
+[130:29.00]There are sensors
+[130:30.00]Embedded in your
+[130:31.00]Bridges
+[130:32.00]Sensory information
+[130:33.00]To make
+[130:34.00]Good decisions
+[130:35.00]To a
+[130:36.00]You know observe
+[130:37.00]The environments
+[130:38.00]And we optimize
+[130:39.00]Those environments
+[130:40.00]Right and so a large
+[130:41.00]Part of my
+[130:42.00]Second half of
+[130:43.00]State IBM research
+[130:44.00]Was to come up with
+[130:45.00]What is this
+[130:46.00]Research around
+[130:48.00]Smart cities
+[130:49.00]Smarter planet
+[130:50.00]And that
+[130:51.00]Actually became
+[130:52.00]An AI platform
+[130:53.00]For helping
+[130:55.00]Optimized traffic
+[130:56.00]Right one of
+[130:57.00]The proudest
+[130:58.00]Things that
+[130:59.00]I really
+[131:00.00]Foundly remember
+[131:01.00]Use the data
+[131:02.00]From you know
+[131:03.00]Teleco data sources
+[131:04.00]To understand
+[131:05.00]How people move
+[131:06.00]In a city
+[131:07.00]And then use
+[131:08.00]That information
+[131:09.00]To build
+[131:10.00]Optimal planning
+[131:11.00]Right whether
+[131:12.00]It's for bus outs
+[131:13.00]Whether it's for metro
+[131:14.00]Right we did
+[131:15.00]Some amazing work
+[131:16.00]In Istanbul
+[131:17.00]For example
+[131:18.00]You know
+[131:19.00]Completely different
+[131:20.00]Scale
+[131:21.00]And then
+[131:22.00]The same
+[131:23.00]Kind of platform
+[131:24.00]We applied
+[131:25.00]To helping
+[131:26.00]Cities in
+[131:27.00]American Midwest
+[131:28.00]Like Dubuque
+[131:29.00]Optimized their
+[131:30.00]Dubuque
+[131:31.00]And then
+[131:32.00]And then
+[131:33.00]The same
+[131:34.00]Like Dubuque
+[131:35.00]And then
+[131:36.00]The same
+[131:37.00]Like Dubuque
+[131:38.00]And then
+[131:39.00]The same
+[131:40.00]Like Dubuque
+[131:41.00]And then
+[131:42.00]The same
+[131:43.00]Like Dubuque
+[131:44.00]And then
+[131:45.00]The same
+[131:46.00]Like Dubuque
+[131:47.00]And then
+[131:48.00]The same
+[131:49.00]Like Dubuque
+[131:50.00]And then
+[131:51.00]The same
+[131:52.00]Like Dubuque
+[131:53.00]And then
+[131:54.00]The same
+[131:55.00]Like Dubuque
+[131:56.00]And then
+[131:57.00]The same
+[131:58.00]Like Dubuque
+[131:59.00]And then
+[132:00.00]And then
+[132:01.00]The same
+[132:02.00]Like Dubuque
+[132:03.00]And then
+[132:04.00]The same
+[132:05.00]Like Dubuque
+[132:06.00]And then
+[132:07.00]The same
+[132:08.00]Like Dubuque
+[132:09.00]And then
+[132:10.00]The same
+[132:11.00]Like Dubuque
+[132:12.00]And then
+[132:13.00]The same
+[132:14.00]Like Dubuque
+[132:15.00]And then
+[132:16.00]The same
+[132:17.00]Like Dubuque
+[132:18.00]And then
+[132:19.00]The same
+[132:20.00]Like Dubuque
+[132:21.00]And then
+[132:22.00]The same
+[132:23.00]Like Dubuque
+[132:24.00]And then
+[132:25.00]The same
+[132:26.00]Like Dubuque
+[132:27.00]And then
+[132:28.00]The same
+[132:29.00]Like Dubuque
+[132:30.00]And then
+[132:31.00]The same
+[132:32.00]Like Dubuque
+[132:33.00]And then
+[132:34.00]The same
+[132:35.00]Like Dubuque
+[132:36.00]And then
+[132:37.00]The same
+[132:38.00]Like Dubuque
+[132:39.00]And then
+[132:40.00]The same
+[132:41.00]Like Dubuque
+[132:42.00]And then
+[132:43.00]The same
+[132:44.00]Like Dubuque
+[132:45.00]And then
+[132:46.00]The same
+[132:47.00]Like Dubuque
+[132:48.00]And then
+[132:49.00]The same
+[132:50.00]Like Dubuque
+[132:51.00]And then
+[132:52.00]The same
+[132:53.00]Like Dubuque
+[132:54.00]And then
+[132:55.00]The same
+[132:56.00]Like Dubuque
+[132:57.00]And then
+[132:58.00]The same
+[132:59.00]Like Dubuque
+[133:00.00]And then
+[133:01.00]The same
+[133:02.00]Like Dubuque
+[133:03.00]And then
+[133:04.00]The same
+[133:05.00]Like Dubuque
+[133:06.00]And then
+[133:07.00]The same
+[133:08.00]Like Dubuque
+[133:09.00]And then
+[133:10.00]The same
+[133:11.00]Like Dubuque
+[133:12.00]And then
+[133:13.00]The same
+[133:14.00]Like Dubuque
+[133:15.00]And then
+[133:16.00]The same
+[133:17.00]Like Dubuque
+[133:18.00]And then
+[133:19.00]The same
+[133:20.00]Like Dubuque
+[133:21.00]And then
+[133:22.00]The same
+[133:23.00]Like Dubuque
+[133:24.00]And then
+[133:25.00]The same
+[133:26.00]Like Dubuque
+[133:27.00]And then
+[133:28.00]The same
+[133:29.00]Like Dubuque
+[133:30.00]And then
+[133:31.00]The same
+[133:32.00]Like Dubuque
+[133:33.00]And then
+[133:34.00]The same
+[133:35.00]Like Dubuque
+[133:36.00]And then
+[133:37.00]The same
+[133:38.00]Like Dubuque
+[133:39.00]And then
+[133:40.00]The same
+[133:41.00]Like Dubuque
+[133:42.00]And then
+[133:43.00]The same
+[133:44.00]Like Dubuque
+[133:45.00]And then
+[133:46.00]The same
+[133:47.00]Like Dubuque
+[133:48.00]And then
+[133:49.00]The same
+[133:50.00]Like Dubuque
+[133:51.00]And then
+[133:52.00]The same
+[133:53.00]Like Dubuque
+[133:54.00]And then
+[133:55.00]The same
+[133:56.00]Like Dubuque
+[133:57.00]And then
+[133:58.00]The same
+[133:59.00]Like Dubuque
+[134:00.00]And then
+[134:01.00]The same
+[134:02.00]Like Dubuque
+[134:03.00]And then
+[134:04.00]The same
+[134:05.00]Like Dubuque
+[134:06.00]And then
+[134:07.00]The same
+[134:08.00]Like Dubuque
+[134:09.00]And then
+[134:10.00]The same
+[134:11.00]Like Dubuque
+[134:12.00]And then
+[134:13.00]The same
+[134:14.00]Like Dubuque
+[134:15.00]And then
+[134:16.00]The same
+[134:17.00]Like Dubuque
+[134:18.00]And then
+[134:19.00]The same
+[134:20.00]Like Dubuque
+[134:21.00]And then
+[134:22.00]The same
+[134:23.00]Like Dubuque
+[134:24.00]And then
+[134:25.00]The same
+[134:26.00]Like Dubuque
+[134:27.00]And then
+[134:28.00]The same
+[134:29.00]Like Dubuque
+[134:30.00]And then
+[134:31.00]The same
+[134:32.00]Like Dubuque
+[134:33.00]And then
+[134:34.00]The same
+[134:35.00]Like Dubuque
+[134:36.00]And then
+[134:37.00]The same
+[134:38.00]Like Dubuque
+[134:39.00]And then
+[134:40.00]The same
+[134:41.00]Like Dubuque
+[134:42.00]And then
+[134:43.00]The same
+[134:44.00]Like Dubuque
+[134:45.00]And then
+[134:46.00]The same
+[134:47.00]Like Dubuque
+[134:48.00]And then
+[134:49.00]The same
+[134:50.00]Like Dubuque
+[134:51.00]And then
+[134:52.00]The same
+[134:53.00]Like Dubuque
+[134:54.00]And then
+[134:55.00]The same
+[134:56.00]Like Dubuque
+[134:57.00]And then
+[134:58.00]The same
+[134:59.00]Like Dubuque
+[135:00.00]And then
+[135:01.00]The same
+[135:02.00]Like Dubuque
+[135:03.00]And then
+[135:04.00]The same
+[135:05.00]Like Dubuque
+[135:06.00]And then
+[135:07.00]The same
+[135:08.00]Like Dubuque
+[135:09.00]And then
+[135:10.00]The same
+[135:11.00]Like Dubuque
+[135:12.00]And then
+[135:13.00]The same
+[135:14.00]Like Dubuque
+[135:15.00]And then
+[135:16.00]The same
+[135:17.00]Like Dubuque
+[135:18.00]And then
+[135:19.00]The same
+[135:20.00]Like Dubuque
+[135:21.00]And then
+[135:22.00]The same
+[135:23.00]Like Dubuque
+[135:24.00]And then
+[135:25.00]The same
+[135:26.00]Like Dubuque
+[135:27.00]And then
+[135:28.00]The same
+[135:29.00]Like Dubuque
+[135:30.00]And then
+[135:31.00]The same
+[135:32.00]Like Dubuque
+[135:33.00]And then
+[135:34.00]The same
+[135:35.00]Like Dubuque
+[135:36.00]And then
+[135:37.00]The same
+[135:38.00]Like Dubuque
+[135:39.00]And then
+[135:40.00]The same
+[135:41.00]Like Dubuque
+[135:42.00]And then
+[135:43.00]The same
+[135:44.00]Like Dubuque
+[135:45.00]And then
+[135:46.00]The same
+[135:47.00]Like Dubuque
+[135:48.00]And then
+[135:49.00]The same
+[135:50.00]Like Dubuque
+[135:51.00]And then
+[135:52.00]The same
+[135:53.00]Like Dubuque
+[135:54.00]And then
+[135:55.00]The same
+[135:56.00]Like Dubuque
+[135:57.00]And then
+[135:58.00]The same
+[135:59.00]Like Dubuque
+[136:00.00]And then
+[136:01.00]The same
+[136:02.00]Like Dubuque
+[136:03.00]And then
+[136:04.00]The same
+[136:05.00]Like Dubuque
+[136:06.00]And then
+[136:07.00]The same
+[136:08.00]Like Dubuque
+[136:09.00]And then
+[136:10.00]The same
+[136:11.00]Like Dubuque
+[136:12.00]And then
+[136:13.00]The same
+[136:14.00]Like Dubuque
+[136:15.00]And then
+[136:16.00]The same
+[136:17.00]Like Dubuque
+[136:18.00]And then
+[136:19.00]The same
+[136:20.00]Like Dubuque
+[136:21.00]And then
+[136:22.00]The same
+[136:23.00]Like Dubuque
+[136:24.00]And then
+[136:25.00]The same
+[136:26.00]Like Dubuque
+[136:27.00]And then
+[136:28.00]The same
+[136:29.00]Like Dubuque
+[136:30.00]And then
+[136:31.00]The same
+[136:32.00]Like Dubuque
+[136:33.00]And then
+[136:34.00]The same
+[136:35.00]Like Dubuque
+[136:36.00]And then
+[136:37.00]The same
+[136:38.00]Like Dubuque
+[136:39.00]And then
+[136:40.00]The same
+[136:41.00]Like Dubuque
+[136:42.00]And then
+[136:43.00]The same
+[136:44.00]Like Dubuque
+[136:45.00]And then
+[136:46.00]The same
+[136:47.00]Like Dubuque
+[136:48.00]And then
+[136:49.00]And then
+[136:50.00]The same
+[136:51.00]Like Dubuque
+[136:52.00]And then
+[136:53.00]The same
+[136:54.00]Like Dubuque
+[136:55.00]And then
+[136:56.00]The same
+[136:57.00]Like Dubuque
+[136:58.00]And then
+[136:59.00]The same
+[137:00.00]Like Dubuque
+[137:01.00]And then
+[137:02.00]The same
+[137:03.00]Like Dubuque
+[137:04.00]And then
+[137:05.00]The same
+[137:06.00]Like Dubuque
+[137:07.00]And then
+[137:08.00]The same
+[137:09.00]Like Dubuque
+[137:10.00]And then
+[137:11.00]The same
+[137:12.00]Like Dubuque
+[137:13.00]And then
+[137:14.00]The same
+[137:15.00]Like Dubuque
+[137:16.00]And then
+[137:17.00]The same
+[137:18.00]Like Dubuque
+[137:19.00]And then
+[137:20.00]The same
+[137:21.00]Like Dubuque
+[137:22.00]And then
+[137:23.00]The same
+[137:24.00]Like Dubuque
+[137:25.00]And then
+[137:26.00]The same
+[137:27.00]Like Dubuque
+[137:28.00]And then
+[137:29.00]The same
+[137:30.00]Like Dubuque
+[137:31.00]And then
+[137:32.00]The same
+[137:33.00]Like Dubuque
+[137:34.00]And then
+[137:35.00]The same
+[137:36.00]Like Dubuque
+[137:37.00]And then
+[137:38.00]The same
+[137:39.00]Like Dubuque
+[137:40.00]And then
+[137:41.00]The same
+[137:42.00]Like Dubuque
+[137:43.00]And then
+[137:44.00]The same
+[137:45.00]Like Dubuque
+[137:46.00]And then
+[137:47.00]The same
+[137:48.00]Like Dubuque
+[137:49.00]And then
+[137:50.00]The same
+[137:51.00]Like Dubuque
+[137:52.00]And then
+[137:53.00]The same
+[137:54.00]Like Dubuque
+[137:55.00]And then
+[137:56.00]The same
+[137:57.00]Like Dubuque
+[137:58.00]And then
+[137:59.00]The same
+[138:00.00]Like Dubuque
+[138:01.00]And then
+[138:02.00]The same
+[138:03.00]Like Dubuque
+[138:04.00]And then
+[138:05.00]The same
+[138:06.00]Like Dubuque
+[138:07.00]And then
+[138:08.00]The same
+[138:09.00]Like Dubuque
+[138:10.00]And then
+[138:11.00]The same
+[138:12.00]Like Dubuque
+[138:13.00]And then
+[138:14.00]The same
+[138:15.00]Like Dubuque
+[138:16.00]And then
+[138:17.00]The same
+[138:18.00]Like Dubuque
+[138:19.00]And then
+[138:20.00]The same
+[138:21.00]Like Dubuque
+[138:22.00]And then
+[138:23.00]The same
+[138:24.00]Like Dubuque
+[138:25.00]And then
+[138:26.00]The same
+[138:27.00]Like Dubuque
+[138:28.00]And then
+[138:29.00]The same
+[138:30.00]Like Dubuque
+[138:31.00]And then
+[138:32.00]The same
+[138:33.00]Like Dubuque
+[138:34.00]And then
+[138:35.00]The same
+[138:36.00]Like Dubuque
+[138:37.00]And then
+[138:38.00]The same
+[138:39.00]Like Dubuque
+[138:40.00]And then
+[138:41.00]The same
+[138:42.00]Like Dubuque
+[138:43.00]And then
+[138:44.00]The same
+[138:45.00]Like Dubuque
+[138:46.00]And then
+[138:47.00]The same
+[138:48.00]Like Dubuque
+[138:49.00]And then
+[138:50.00]The same
+[138:51.00]Like Dubuque
+[138:52.00]And then
+[138:53.00]The same
+[138:54.00]Like Dubuque
+[138:55.00]And then
+[138:56.00]The same
+[138:57.00]Like Dubuque
+[138:58.00]And then
+[138:59.00]The same
+[139:00.00]Like Dubuque
+[139:01.00]And then
+[139:02.00]The same
+[139:03.00]Like Dubuque
+[139:04.00]And then
+[139:05.00]The same
+[139:06.00]Like Dubuque
+[139:07.00]And then
+[139:08.00]The same
+[139:09.00]Like Dubuque
+[139:10.00]And then
+[139:11.00]The same
+[139:12.00]Like Dubuque
+[139:13.00]And then
+[139:14.00]The same
+[139:15.00]Like Dubuque
+[139:16.00]And then
+[139:17.00]The same
+[139:18.00]Like Dubuque
+[139:19.00]And then
+[139:20.00]The same
+[139:21.00]Like Dubuque
+[139:22.00]And then
+[139:23.00]The same
+[139:24.00]Like Dubuque
+[139:25.00]And then
+[139:26.00]The same
+[139:27.00]Like Dubuque
+[139:28.00]And then
+[139:29.00]The same
+[139:30.00]Like Dubuque
+[139:31.00]And then
+[139:32.00]The same
+[139:33.00]Like Dubuque
+[139:34.00]And then
+[139:35.00]The same
+[139:36.00]Like Dubuque
+[139:37.00]And then
+[139:38.00]The same
+[139:39.00]Like Dubuque
+[139:40.00]And then
+[139:41.00]The same
+[139:42.00]Like Dubuque
+[139:43.00]And then
+[139:44.00]The same
+[139:45.00]Like Dubuque
+[139:46.00]And then
+[139:47.00]The same
+[139:48.00]Like Dubuque
+[139:49.00]And then
+[139:50.00]The same
+[139:51.00]Like Dubuque
+[139:52.00]And then
+[139:53.00]The same
+[139:54.00]Like Dubuque
+[139:55.00]And then
+[139:56.00]The same
+[139:57.00]Like Dubuque
+[139:58.00]And then
+[139:59.00]The same
+[140:00.00]Like Dubuque
+[140:01.00]And then
+[140:02.00]The same
+[140:03.00]Like Dubuque
+[140:04.00]And then
+[140:05.00]The same
+[140:06.00]Like Dubuque
+[140:07.00]And then
+[140:08.00]The same
+[140:09.00]Like Dubuque
+[140:10.00]And then
+[140:11.00]The same
+[140:12.00]Like Dubuque
+[140:13.00]And then
+[140:14.00]The same
+[140:15.00]Like Dubuque
+[140:16.00]And then
+[140:17.00]The same
+[140:18.00]Like Dubuque
+[140:19.00]And then
+[140:20.00]The same
+[140:21.00]Like Dubuque
+[140:22.00]And then
+[140:23.00]The same
+[140:24.00]Like Dubuque
+[140:25.00]And then
+[140:26.00]The same
+[140:27.00]Like Dubuque
+[140:28.00]And then
+[140:29.00]The same
+[140:30.00]Like Dubuque
+[140:31.00]And then
+[140:32.00]The same
+[140:33.00]Like Dubuque
+[140:34.00]And then
+[140:35.00]The same
+[140:36.00]Like Dubuque
+[140:37.00]And then
+[140:38.00]The same
+[140:39.00]Like Dubuque
+[140:40.00]And then
+[140:41.00]The same
+[140:42.00]Like Dubuque
+[140:43.00]And then
+[140:44.00]The same
+[140:45.00]Like Dubuque
+[140:46.00]And then
+[140:47.00]The same
+[140:48.00]Like Dubuque
+[140:49.00]And then
+[140:50.00]The same
+[140:51.00]Like Dubuque
+[140:52.00]And then
+[140:53.00]The same
+[140:54.00]Like Dubuque
+[140:55.00]And then
+[140:56.00]The same
+[140:57.00]Like Dubuque
+[140:58.00]And then
+[140:59.00]The same
+[141:00.00]Like Dubuque
+[141:01.00]And then
+[141:02.00]The same
+[141:03.00]Like Dubuque
+[141:04.00]And then
+[141:05.00]The same
+[141:06.00]Like Dubuque
+[141:07.00]And then
+[141:08.00]The same
+[141:09.00]Like Dubuque
+[141:10.00]And then
+[141:11.00]The same
+[141:12.00]Like Dubuque
+[141:13.00]And then
+[141:14.00]The same
+[141:15.00]Like Dubuque
+[141:16.00]And then
+[141:17.00]The same
+[141:18.00]Like Dubuque
+[141:19.00]And then
+[141:20.00]The same
+[141:21.00]Like Dubuque
+[141:22.00]And then
+[141:23.00]The same
+[141:24.00]Like Dubuque
+[141:25.00]And then
+[141:26.00]The same
+[141:27.00]Like Dubuque
+[141:28.00]And then
+[141:29.00]The same
+[141:30.00]Like Dubuque
+[141:31.00]And then
+[141:32.00]The same
+[141:33.00]Like Dubuque
+[141:34.00]And then
+[141:35.00]The same
+[141:36.00]Like Dubuque
+[141:37.00]And then
+[141:38.00]The same
+[141:39.00]Like Dubuque
+[141:40.00]And then
+[141:41.00]The same
+[141:42.00]Like Dubuque
+[141:43.00]And then
+[141:44.00]The same
+[141:45.00]Like Dubuque
+[141:46.00]And then
+[141:47.00]The same
+[141:48.00]Like Dubuque
+[141:49.00]And then
+[141:50.00]The same
+[141:51.00]Like Dubuque
+[141:52.00]And then
+[141:53.00]The same
+[141:54.00]Like Dubuque
+[141:55.00]And then
+[141:56.00]The same
+[141:57.00]Like Dubuque
+[141:58.00]And then
+[141:59.00]The same
+[142:00.00]Like Dubuque
+[142:01.00]And then
+[142:02.00]The same
+[142:03.00]Like Dubuque
+[142:04.00]And then
+[142:05.00]The same
+[142:06.00]Like Dubuque
+[142:07.00]And then
+[142:08.00]The same
+[142:09.00]Like Dubuque
+[142:10.00]And then
+[142:11.00]The same
+[142:12.00]Like Dubuque
+[142:13.00]And then
+[142:14.00]The same
+[142:15.00]Like Dubuque
+[142:16.00]And then
+[142:17.00]The same
+[142:18.00]Like Dubuque
+[142:19.00]And then
+[142:20.00]The same
+[142:21.00]Like Dubuque
+[142:22.00]And then
+[142:23.00]The same
+[142:24.00]Like Dubuque
+[142:25.00]And then
+[142:26.00]The same
+[142:27.00]Like Dubuque
+[142:28.00]And then
+[142:29.00]The same
+[142:30.00]Like Dubuque
+[142:31.00]And then
+[142:32.00]The same
+[142:33.00]Like Dubuque
+[142:34.00]And then
+[142:35.00]The same
+[142:36.00]Like Dubuque
+[142:37.00]And then
+[142:38.00]The same
+[142:39.00]Like Dubuque
+[142:40.00]And then
+[142:41.00]The same
+[142:42.00]Like Dubuque
+[142:43.00]And then
+[142:44.00]The same
+[142:45.00]Like Dubuque
+[142:46.00]And then
+[142:47.00]The same
+[142:48.00]Like Dubuque
+[142:49.00]And then
+[142:50.00]The same
+[142:51.00]Like Dubuque
+[142:52.00]And then
+[142:53.00]The same
+[142:54.00]Like Dubuque
+[142:55.00]And then
+[142:56.00]The same
+[142:57.00]Like Dubuque
+[142:58.00]And then
+[142:59.00]The same
+[143:00.00]Like Dubuque
+[143:01.00]And then
+[143:02.00]The same
+[143:03.00]Like Dubuque
+[143:04.00]And then
+[143:05.00]The same
+[143:06.00]Like Dubuque
+[143:07.00]And then
+[143:08.00]The same
+[143:09.00]Like Dubuque
+[143:10.00]And then
+[143:11.00]The same
+[143:12.00]Like Dubuque
+[143:13.00]And then
+[143:14.00]The same
+[143:15.00]Like Dubuque
+[143:16.00]And then
+[143:17.00]The same
+[143:18.00]Like Dubuque
+[143:19.00]And then
+[143:20.00]The same
+[143:21.00]Like Dubuque
+[143:22.00]And then
+[143:23.00]The same
+[143:24.00]Like Dubuque
+[143:25.00]And then
+[143:26.00]The same
+[143:27.00]Like Dubuque
+[143:28.00]And then
+[143:29.00]The same
+[143:30.00]Like Dubuque
+[143:31.00]And then
+[143:32.00]The same
+[143:33.00]Like Dubuque
+[143:34.00]And then
+[143:35.00]The same
+[143:36.00]Like Dubuque
+[143:37.00]And then
+[143:38.00]The same
+[143:39.00]Like Dubuque
+[143:40.00]And then
+[143:41.00]The same
+[143:42.00]Like Dubuque
+[143:43.00]And then
+[143:44.00]The same
+[143:45.00]Like Dubuque
+[143:46.00]And then
+[143:47.00]The same
+[143:48.00]Like Dubuque
+[143:49.00]And then
+[143:50.00]The same
+[143:51.00]Like Dubuque
+[143:52.00]And then
+[143:53.00]The same
+[143:54.00]Like Dubuque
+[143:55.00]And then
+[143:56.00]The same
+[143:57.00]Like Dubuque
+[143:58.00]And then
+[143:59.00]The same
+[144:00.00]Like Dubuque
+[144:01.00]And then
+[144:02.00]The same
+[144:03.00]Like Dubuque
+[144:04.00]And then
+[144:05.00]The same
+[144:06.00]Like Dubuque
+[144:07.00]And then
+[144:08.00]The same
+[144:09.00]Like Dubuque
+[144:10.00]And then
+[144:11.00]The same
+[144:12.00]Like Dubuque
+[144:13.00]And then
+[144:14.00]The same
+[144:15.00]Like Dubuque
+[144:16.00]And then
+[144:17.00]The same
+[144:18.00]Like Dubuque
+[144:19.00]And then
+[144:20.00]The same
+[144:21.00]Like Dubuque
+[144:22.00]And then
+[144:23.00]The same
+[144:24.00]Like Dubuque
+[144:25.00]And then
+[144:26.00]The same
+[144:27.00]Like Dubuque
+[144:28.00]And then
+[144:29.00]The same
+[144:30.00]Like Dubuque
+[144:31.00]And then
+[144:32.00]The same
+[144:33.00]Like Dubuque
+[144:34.00]And then
+[144:35.00]The same
+[144:36.00]Like Dubuque
+[144:37.00]And then
+[144:38.00]The same
+[144:39.00]Like Dubuque
+[144:40.00]And then
+[144:41.00]The same
+[144:42.00]Like Dubuque
+[144:43.00]And then
+[144:44.00]The same
+[144:45.00]Like Dubuque
+[144:46.00]And then
+[144:47.00]The same
+[144:48.00]Like Dubuque
+[144:49.00]And then
+[144:50.00]The same
+[144:51.00]Like Dubuque
+[144:52.00]And then
+[144:53.00]The same
+[144:54.00]Like Dubuque
+[144:55.00]And then
+[144:56.00]The same
+[144:57.00]Like Dubuque
+[144:58.00]And then
+[144:59.00]The same
+[145:00.00]Like Dubuque
+[145:01.00]And then
+[145:02.00]The same
+[145:03.00]Like Dubuque
+[145:04.00]And then
+[145:05.00]The same
+[145:06.00]Like Dubuque
+[145:07.00]And then
+[145:08.00]The same
+[145:09.00]Like Dubuque
+[145:10.00]And then
+[145:11.00]The same
+[145:12.00]Like Dubuque
+[145:13.00]And then
+[145:14.00]The same
+[145:15.00]Like Dubuque
+[145:16.00]And then
+[145:17.00]The same
+[145:18.00]Like Dubuque
+[145:19.00]And then
+[145:20.00]The same
+[145:21.00]Like Dubuque
+[145:22.00]And then
+[145:23.00]The same
+[145:24.00]Like Dubuque
+[145:25.00]And then
+[145:26.00]The same
+[145:27.00]Like Dubuque
+[145:28.00]And then
+[145:29.00]The same
+[145:30.00]Like Dubuque
+[145:31.00]And then
+[145:32.00]The same
+[145:33.00]Like Dubuque
+[145:34.00]And then
+[145:35.00]The same
+[145:36.00]Like Dubuque
+[145:37.00]And then
+[145:38.00]The same
+[145:39.00]Like Dubuque
+[145:40.00]And then
+[145:41.00]The same
+[145:42.00]Like Dubuque
+[145:43.00]And then
+[145:44.00]The same
+[145:45.00]Like Dubuque
+[145:46.00]And then
+[145:47.00]The same
+[145:48.00]Like Dubuque
+[145:49.00]And then
+[145:50.00]The same
+[145:51.00]Like Dubuque
+[145:52.00]And then
+[145:53.00]The same
+[145:54.00]Like Dubuque
+[145:55.00]And then
+[145:56.00]The same
+[145:57.00]Like Dubuque
+[145:58.00]And then
+[145:59.00]The same
+[146:00.00]Like Dubuque
+[146:01.00]And then
+[146:02.00]The same
+[146:03.00]Like Dubuque
+[146:04.00]And then
+[146:05.00]The same
+[146:06.00]Like Dubuque
+[146:07.00]And then
+[146:08.00]The same
+[146:09.00]Like Dubuque
+[146:10.00]And then
+[146:11.00]The same
+[146:12.00]Like Dubuque
+[146:13.00]And then
+[146:14.00]The same
+[146:15.00]Like Dubuque
+[146:16.00]And then
+[146:17.00]The same
+[146:18.00]Like Dubuque
+[146:19.00]And then
+[146:20.00]The same
+[146:21.00]Like Dubuque
+[146:22.00]And then
+[146:23.00]The same
+[146:24.00]Like Dubuque
+[146:25.00]And then
+[146:26.00]The same
+[146:27.00]Like Dubuque
+[146:28.00]And then
+[146:29.00]The same
+[146:30.00]Like Dubuque
+[146:31.00]And then
+[146:32.00]The same
+[146:33.00]Like Dubuque
+[146:34.00]And then
+[146:35.00]The same
+[146:36.00]Like Dubuque
+[146:37.00]And then
+[146:38.00]The same
+[146:39.00]Like Dubuque
+[146:40.00]And then
+[146:41.00]The same
+[146:42.00]Like Dubuque
+[146:43.00]And then
+[146:44.00]The same
+[146:45.00]Like Dubuque
+[146:46.00]And then
+[146:47.00]The same
+[146:48.00]Like Dubuque
+[146:49.00]And then
+[146:50.00]The same
+[146:51.00]Like Dubuque
+[146:52.00]And then
+[146:53.00]The same
+[146:54.00]Like Dubuque
+[146:55.00]And then
+[146:56.00]The same
+[146:57.00]Like Dubuque
+[146:58.00]And then
+[146:59.00]The same
+[147:00.00]Like Dubuque
+[147:01.00]And then
+[147:02.00]The same
+[147:03.00]Like Dubuque
+[147:04.00]And then
+[147:05.00]The same
+[147:06.00]Like Dubuque
+[147:07.00]And then
+[147:08.00]The same
+[147:09.00]Like Dubuque
+[147:10.00]And then
+[147:11.00]The same
+[147:12.00]Like Dubuque
+[147:13.00]And then
+[147:14.00]The same
+[147:15.00]Like Dubuque
+[147:16.00]And then
+[147:17.00]The same
+[147:18.00]Like Dubuque
+[147:19.00]And then
+[147:20.00]The same
+[147:21.00]Like Dubuque
+[147:22.00]And then
+[147:23.00]The same
+[147:24.00]Like Dubuque
+[147:25.00]And then
+[147:26.00]The same
+[147:27.00]Like Dubuque
+[147:28.00]And then
+[147:29.00]The same
+[147:30.00]Like Dubuque
+[147:31.00]And then
+[147:32.00]The same
+[147:33.00]Like Dubuque
+[147:34.00]And then
+[147:35.00]The same
+[147:36.00]Like Dubuque
+[147:37.00]And then
+[147:38.00]The same
+[147:39.00]Like Dubuque
+[147:40.00]And then
+[147:41.00]The same
+[147:42.00]Like Dubuque
+[147:43.00]And then
+[147:44.00]The same
+[147:45.00]Like Dubuque
+[147:46.00]And then
+[147:47.00]The same
+[147:48.00]Like Dubuque
+[147:49.00]And then
+[147:50.00]The same
+[147:51.00]Like Dubuque
+[147:52.00]And then
+[147:53.00]The same
+[147:54.00]Like Dubuque
+[147:55.00]And then
+[147:56.00]The same
+[147:57.00]Like Dubuque
+[147:58.00]And then
+[147:59.00]The same
+[148:00.00]Like Dubuque
+[148:01.00]And then
+[148:02.00]The same
+[148:03.00]Like Dubuque
+[148:04.00]And then
+[148:05.00]The same
+[148:06.00]Like Dubuque
+[148:07.00]And then
+[148:08.00]The same
+[148:09.00]Like Dubuque
+[148:10.00]And then
+[148:11.00]The same
+[148:12.00]Like Dubuque
+[148:13.00]And then
+[148:14.00]The same
+[148:15.00]Like Dubuque
+[148:16.00]And then
+[148:17.00]The same
+[148:18.00]Like Dubuque
+[148:19.00]And then
+[148:20.00]The same
+[148:21.00]Like Dubuque
+[148:22.00]And then
+[148:23.00]The same
+[148:24.00]Like Dubuque
+[148:25.00]And then
+[148:26.00]The same
+[148:27.00]Like Dubuque
+[148:28.00]And then
+[148:29.00]The same
+[148:30.00]Like Dubuque
+[148:31.00]And then
+[148:32.00]The same
+[148:33.00]Like Dubuque
+[148:34.00]And then
+[148:35.00]The same
+[148:36.00]Like Dubuque
+[148:37.00]And then
+[148:38.00]The same
+[148:39.00]Like Dubuque
+[148:40.00]And then
+[148:41.00]The same
+[148:42.00]Like Dubuque
+[148:43.00]And then
+[148:44.00]The same
+[148:45.00]Like Dubuque
+[148:46.00]And then
+[148:47.00]The same
+[148:48.00]Like Dubuque
+[148:49.00]And then
+[148:50.00]The same
+[148:51.00]Like Dubuque
+[148:52.00]And then
+[148:53.00]The same
+[148:54.00]Like Dubuque
+[148:55.00]And then
+[148:56.00]The same
+[148:57.00]Like Dubuque
+[148:58.00]And then
+[148:59.00]The same
+[149:00.00]Like Dubuque
+[149:01.00]And then
+[149:02.00]The same
+[149:03.00]Like Dubuque
+[149:04.00]And then
+[149:05.00]The same
+[149:06.00]Like Dubuque
+[149:07.00]And then
+[149:08.00]The same
+[149:09.00]Like Dubuque
+[149:10.00]And then
+[149:11.00]The same
+[149:12.00]Like Dubuque
+[149:13.00]And then
+[149:14.00]The same
+[149:15.00]Like Dubuque
+[149:16.00]And then
+[149:17.00]The same
+[149:18.00]Like Dubuque
+[149:19.00]And then
+[149:20.00]The same
+[149:21.00]Like Dubuque
+[149:22.00]And then
+[149:23.00]The same
+[149:24.00]Like Dubuque
+[149:25.00]And then
+[149:26.00]The same
+[149:27.00]Like Dubuque
+[149:28.00]And then
+[149:29.00]The same
+[149:30.00]Like Dubuque
+[149:31.00]And then
+[149:32.00]The same
+[149:33.00]Like Dubuque
+[149:34.00]And then
+[149:35.00]The same
+[149:36.00]Like Dubuque
+[149:37.00]And then
+[149:38.00]The same
+[149:39.00]Like Dubuque
+[149:40.00]And then
+[149:41.00]The same
+[149:42.00]Like Dubuque
+[149:43.00]And then
+[149:44.00]The same
+[149:45.00]Like Dubuque
+[149:46.00]And then
+[149:47.00]The same
+[149:48.00]Like Dubuque
+[149:49.00]And then
+[149:50.00]The same
+[149:51.00]Like Dubuque
+[149:52.00]And then
+[149:53.00]The same
+[149:54.00]Like Dubuque
+[149:55.00]And then
+[149:56.00]The same
+[149:57.00]Like Dubuque
+[149:58.00]And then
+[149:59.00]The same
+[150:00.00]Like Dubuque
+[150:01.00]And then
+[150:02.00]The same
+[150:03.00]Like Dubuque
+[150:04.00]And then
+[150:05.00]The same
+[150:06.00]Like Dubuque
+[150:07.00]And then
+[150:08.00]The same
+[150:09.00]Like Dubuque
+[150:10.00]And then
+[150:11.00]The same
+[150:12.00]Like Dubuque
+[150:13.00]And then
+[150:14.00]The same
+[150:15.00]Like Dubuque
+[150:16.00]And then
+[150:17.00]The same
+[150:18.00]Like Dubuque
+[150:19.00]And then
+[150:20.00]The same
+[150:21.00]Like Dubuque
+[150:22.00]And then
+[150:23.00]The same
+[150:24.00]Like Dubuque
+[150:25.00]And then
+[150:26.00]The same
+[150:27.00]Like Dubuque
+[150:28.00]And then
+[150:29.00]The same
+[150:30.00]Like Dubuque
+[150:31.00]And then
+[150:32.00]The same
+[150:33.00]Like Dubuque
+[150:34.00]And then
+[150:35.00]The same
+[150:36.00]Like Dubuque
+[150:37.00]And then
+[150:38.00]The same
+[150:39.00]Like Dubuque
+[150:40.00]And then
+[150:41.00]The same
+[150:42.00]Like Dubuque
+[150:43.00]And then
+[150:44.00]The same
+[150:45.00]Like Dubuque
+[150:46.00]And then
+[150:47.00]The same
+[150:48.00]Like Dubuque
+[150:49.00]And then
+[150:50.00]The same
+[150:51.00]Like Dubuque
+[150:52.00]And then
+[150:53.00]The same
+[150:54.00]Like Dubuque
+[150:55.00]And then
+[150:56.00]The same
+[150:57.00]Like Dubuque
+[150:58.00]And then
+[150:59.00]The same
+[151:00.00]Like Dubuque
+[151:01.00]And then
+[151:02.00]The same
+[151:03.00]Like Dubuque
+[151:04.00]And then
+[151:05.00]The same
+[151:06.00]Like Dubuque
+[151:07.00]And then
+[151:08.00]The same
+[151:09.00]Like Dubuque
+[151:10.00]And then
+[151:11.00]The same
+[151:12.00]Like Dubuque
+[151:13.00]And then
+[151:14.00]The same
+[151:15.00]Like Dubuque
+[151:16.00]And then
+[151:17.00]The same
+[151:18.00]Like Dubuque
+[151:19.00]And then
+[151:20.00]The same
+[151:21.00]Like Dubuque
+[151:22.00]And then
+[151:23.00]The same
+[151:24.00]Like Dubuque
+[151:25.00]And then
+[151:26.00]The same
+[151:27.00]Like Dubuque
+[151:28.00]And then
+[151:29.00]The same
+[151:30.00]Like Dubuque
+[151:31.00]And then
+[151:32.00]The same
+[151:33.00]Like Dubuque
+[151:34.00]And then
+[151:35.00]The same
+[151:36.00]Like Dubuque
+[151:37.00]And then
+[151:38.00]The same
+[151:39.00]Like Dubuque
+[151:40.00]And then
+[151:41.00]The same
+[151:42.00]Like Dubuque
+[151:43.00]And then
+[151:44.00]The same
+[151:45.00]Like Dubuque
+[151:46.00]And then
+[151:47.00]The same
+[151:48.00]Like Dubuque
+[151:49.00]And then
+[151:50.00]The same
+[151:51.00]Like Dubuque
+[151:52.00]And then
+[151:53.00]The same
+[151:54.00]Like Dubuque
+[151:55.00]And then
+[151:56.00]The same
+[151:57.00]Like Dubuque
+[151:58.00]And then
+[151:59.00]The same
+[152:00.00]Like Dubuque
+[152:01.00]And then
+[152:02.00]The same
+[152:03.00]Like Dubuque
+[152:04.00]And then
+[152:05.00]The same
+[152:06.00]Like Dubuque
+[152:07.00]And then
+[152:08.00]The same
+[152:09.00]Like Dubuque
+[152:10.00]And then
+[152:11.00]The same
+[152:12.00]Like Dubuque
+[152:13.00]And then
+[152:14.00]The same
+[152:15.00]Like Dubuque
+[152:16.00]And then
+[152:17.00]The same
+[152:18.00]Like Dubuque
+[152:19.00]And then
+[152:20.00]The same
+[152:21.00]Like Dubuque
+[152:22.00]And then
+[152:23.00]The same
+[152:24.00]Like Dubuque
+[152:25.00]And then
+[152:26.00]The same
+[152:27.00]Like Dubuque
+[152:28.00]And then
+[152:29.00]The same
+[152:30.00]Like Dubuque
+[152:31.00]And then
+[152:32.00]The same
+[152:33.00]Like Dubuque
+[152:34.00]And then
+[152:35.00]The same
+[152:36.00]Like Dubuque
+[152:37.00]And then
+[152:38.00]The same
+[152:39.00]Like Dubuque
+[152:40.00]And then
+[152:41.00]The same
+[152:42.00]Like Dubuque
+[152:43.00]And then
+[152:44.00]The same
+[152:45.00]Like Dubuque
+[152:46.00]And then
+[152:47.00]The same
+[152:48.00]Like Dubuque
+[152:49.00]And then
+[152:50.00]The same
+[152:51.00]Like Dubuque
+[152:52.00]And then
+[152:53.00]The same
+[152:54.00]Like Dubuque
+[152:55.00]And then
+[152:56.00]The same
+[152:57.00]Like Dubuque
+[152:58.00]And then
+[152:59.00]The same
+[153:00.00]Like Dubuque
+[153:01.00]And then
+[153:02.00]The same
+[153:03.00]Like Dubuque
+[153:04.00]And then
+[153:05.00]The same
+[153:06.00]Like Dubuque
+[153:07.00]And then
+[153:08.00]The same
+[153:09.00]Like Dubuque
+[153:10.00]And then
+[153:11.00]The same
+[153:12.00]Like Dubuque
+[153:13.00]And then
+[153:14.00]The same
+[153:15.00]Like Dubuque
+[153:16.00]And then
+[153:17.00]The same
+[153:18.00]Like Dubuque
+[153:19.00]And then
+[153:20.00]The same
+[153:21.00]Like Dubuque
+[153:22.00]And then
+[153:23.00]The same
+[153:24.00]Like Dubuque
+[153:25.00]And then
+[153:26.00]The same
+[153:27.00]Like Dubuque
+[153:28.00]And then
+[153:29.00]The same
+[153:30.00]Like Dubuque
+[153:31.00]And then
+[153:32.00]The same
+[153:33.00]Like Dubuque
+[153:34.00]And then
+[153:35.00]The same
+[153:36.00]Like Dubuque
+[153:37.00]And then
+[153:38.00]The same
+[153:39.00]Like Dubuque
+[153:40.00]And then
+[153:41.00]The same
+[153:42.00]Like Dubuque
+[153:43.00]And then
+[153:44.00]The same
+[153:45.00]Like Dubuque
+[153:46.00]And then
+[153:47.00]The same
+[153:48.00]Like Dubuque
+[153:49.00]And then
+[153:50.00]The same
+[153:51.00]Like Dubuque
+[153:52.00]And then
+[153:53.00]The same
+[153:54.00]Like Dubuque
+[153:55.00]And then
+[153:56.00]The same
+[153:57.00]Like Dubuque
+[153:58.00]And then
+[153:59.00]The same
+[154:00.00]Like Dubuque
+[154:01.00]And then
+[154:02.00]The same
+[154:03.00]Like Dubuque
+[154:04.00]And then
+[154:05.00]The same
+[154:06.00]Like Dubuque
+[154:07.00]And then
+[154:08.00]The same
+[154:09.00]Like Dubuque
+[154:10.00]And then
+[154:11.00]The same
+[154:12.00]Like Dubuque
+[154:13.00]And then
+[154:14.00]The same
+[154:15.00]Like Dubuque
+[154:16.00]And then
+[154:17.00]The same
+[154:18.00]Like Dubuque
+[154:19.00]And then
+[154:20.00]The same
+[154:21.00]Like Dubuque
+[154:22.00]And then
+[154:23.00]The same
+[154:24.00]Like Dubuque
+[154:25.00]And then
+[154:26.00]The same
+[154:27.00]Like Dubuque
+[154:28.00]And then
+[154:29.00]The same
+[154:30.00]Like Dubuque
+[154:31.00]And then
+[154:32.00]The same
+[154:33.00]Like Dubuque
+[154:34.00]And then
+[154:35.00]The same
+[154:36.00]Like Dubuque
+[154:37.00]And then
+[154:38.00]The same
+[154:39.00]Like Dubuque
+[154:40.00]And then
+[154:41.00]The same
+[154:42.00]Like Dubuque
+[154:43.00]And then
+[154:44.00]The same
+[154:45.00]Like Dubuque
+[154:46.00]And then
+[154:47.00]The same
+[154:48.00]Like Dubuque
+[154:49.00]And then
+[154:50.00]The same
+[154:51.00]Like Dubuque
+[154:52.00]And then
+[154:53.00]The same
+[154:54.00]Like Dubuque
+[154:55.00]And then
+[154:56.00]The same
+[154:57.00]Like Dubuque
+[154:58.00]And then
+[154:59.00]The same
+[155:00.00]Like Dubuque
+[155:01.00]And then
+[155:02.00]The same
+[155:03.00]Like Dubuque
+[155:04.00]And then
+[155:05.00]The same
+[155:06.00]Like Dubuque
+[155:07.00]And then
+[155:08.00]The same
+[155:09.00]Like Dubuque
+[155:10.00]And then
+[155:11.00]The same
+[155:12.00]Like Dubuque
+[155:13.00]And then
+[155:14.00]The same
+[155:15.00]Like Dubuque
+[155:16.00]And then
+[155:17.00]The same
+[155:18.00]Like Dubuque
+[155:19.00]And then
+[155:20.00]The same
+[155:21.00]Like Dubuque
+[155:22.00]And then
+[155:23.00]The same
+[155:24.00]Like Dubuque
+[155:25.00]And then
+[155:26.00]The same
+[155:27.00]Like Dubuque
+[155:28.00]And then
+[155:29.00]The same
+[155:30.00]Like Dubuque
+[155:31.00]There has been a tremendous explosion in terms of the number of people that are interested, the number of experiment that are being tried out.
+[155:38.00]And I would say we are very fortunate that there is this kind of interest right now.
+[155:43.00]The thing that you notice is it does go from some of the seminal moment to seminal moment.
+[155:50.00]Between AlexNet and the transformer paper in 2017, you could see that there was a huge amount of innovation on the visual side.
+[156:00.00]You know, we had resonance, you know, all kinds of interesting architectures.
+[156:05.00]And then after 2017, it became again significantly focused on sequence to sequence, right?
+[156:11.00]So the ideal sequence here, you know, machine translation therefore was text.
+[156:16.00]You know, we keep hearing about maybe there will be a breakthrough moment on the visual side again, right?
+[156:22.00]In another five years.
+[156:23.00]And then suddenly, you know, people will start spending more energy there.
+[156:29.00]So I would like to actually go back to the modality, right?
+[156:32.00]So it was the visual modality and now the text modality.
+[156:35.00]And then maybe it's proteins, right?
+[156:38.00]Maybe it's some modality in the healthcare space, right?
+[156:42.00]Where sudden breakthroughs come about in the next several years.
+[156:46.00]And then you will suddenly see a whole bunch of people looking at gene sequences and so on and so forth, right?
+[156:52.00]And that interest will spike.
+[156:54.00]So I actually find it fascinating that we are growing as a community from where we started.
+[156:59.00]I don't really see it as a hype versus not hype.
+[157:03.00]I really see it as more of the ability of this modality coupled with the neural architecture.
+[157:10.00]You know, that will be the most effective and efficient at that point of time in attracting a lot of energy from the academic and industrial research community.
+[157:19.00]So we can only be better off because of it.
+[157:22.00]So I don't see it as a hype as much as an opportunity for learning more.
+[157:27.00]I will add this.
+[157:28.00]And I think it is a good segue for the next part of what you're going to talk about LSU, right?
+[157:33.00]Which is I get asked this question often, right?
+[157:36.00]Why did I move from NVIDIA?
+[157:37.00]Like who leaves NVIDIA?
+[157:39.00]And the fact of the matter is that really if you have that unfulfilled desire in you to solve the actual end problem,
+[157:46.00]you have to go towards one of these verticals, right?
+[157:49.00]With the whether it's health care, whether it is financial services and you have to be embedded in an organization that actually has a culture for fostering big tech like a fang like work ethic.
+[158:02.00]Whether it's having the data, right?
+[158:04.00]Already being in the cloud, having the roots in data driven and machine learning.
+[158:09.00]So you want to be in a place which allows for that kind of creativity so that you can actually take Gen AI to its logical next step, right?
+[158:19.00]We're just solving problems so that we change the financial services domain for the good, right?
+[158:24.00]There's benefit to the customers.
+[158:26.00]And so when you do that, though, you have to be mindful.
+[158:30.00]If you are working in places where you are building recommendation systems for what to buy or what to watch.
+[158:39.00]There are a lot of ways in which you may not have the best possible answer, the most accurate answer and you are still fine.
+[158:46.00]But when you are in one of these domains that matter, health care, financial services, right?
+[158:51.00]You really are solving a harder problem because the tolerance for error of your end customer is going to be much,
+[158:59.00]much,much lower.And what that does is it forces the research and development to go in specific directions.
+[159:06.00]You may not be able to just take what's out there in open source and use it as is, right?
+[159:12.00]You may have to actually innovate beyond what's in the state of the art publication to make it work for this kind of domain, right?
+[159:21.00]And so it takes a special class of applied researchers.
+[159:25.00]It takes a special class of machine learning engineers, data scientists that have the will.
+[159:31.00]The last mile is hard and you have to have that will to solve that problem.
+[159:36.00]And so in some cases what ends up happening is that these roles and the work we will do may be harder than what you end up doing in a generic setting or at a platform level only or in a use case which actually doesn't have this kind of extremely low tolerance for error.
+[159:54.00]So that becomes, you know, one of the main motivating functions for people to join us, right?
+[160:01.00]Who really want to actually take these kind of really hard challenges.
+[160:05.00]I was going through some of your team papers just from 2023, some of them in Europe, some at ICML.
+[160:10.00]You've done a lot of work like class imbalance on data sets.
+[160:13.00]So how do you make performing neural networks when like the data is kind of in balance in certain domains.
+[160:18.00]You're doing some research on transformer graphs versus like graph neural networks.
+[160:23.00]So there's just kind of like a lot in there.
+[160:25.00]Any favorite project that you want to shut out any interesting paper that you saw come out of your team.
+[160:30.00]You know, I could choose one, but then it would basically end up being, you know, not fair to the others.
+[160:35.00]So the only on that, you know,
+[160:39.00]It's like the what's your favorite child?
+[160:42.00]There's no way to say one versus the other.
+[160:46.00]But there are a lot of interesting trajectories of exploration here, right?
+[160:51.00]We are trying to look at the superset of what all these things are trying to do.
+[160:55.00]We are trying to look at data sets that are very unique to our domain.
+[160:58.00]We are trying to look at attributes that are important to us like time series, tabular, right?
+[161:04.00]These are things that are important, very important.
+[161:06.00]We are looking at the imbalance problems, right?
+[161:09.00]There are other things that we are looking at.
+[161:11.00]And so I wouldn't want to just highlight one.
+[161:15.00]There are a bunch of things that are of great interest to us from our perspective.
+[161:20.00]Let me just not pick one.
+[161:23.00]That makes sense.
+[161:25.00]And yeah, just to wrap, we always like to ask our guests, who are you looking for?
+[161:29.00]You know, we got a lot of AI engineers, researchers in the audience.
+[161:33.00]Like, who are the type of people that are going to have a good time working with you?
+[161:36.00]And what are some of the open roles that you have?
+[161:38.00]So let me start with the roles.
+[161:40.00]And then, you know, we can talk about the kind of people who are going to have a great time with us, right?
+[161:44.00]We have a number of open roles and they span the entire spectrum from applied researchers to data scientists, to machine learning engineers, AI engineers, right?
+[161:55.00]People who are listening to you right now that we really weren't interested in what we are doing.
+[162:01.00]We have roles at various levels of seniority.
+[162:03.00]We have individual contributor roles.
+[162:05.00]That's one of the things that we have really, really double clicked on in the last several months since I've joined.
+[162:10.00]And that was a thrust also before I joined, but especially in the applied research field, right?
+[162:15.00]We are looking for, you know, what would be the equivalent of, you know, principal research scientists, right?
+[162:20.00]Distinguished research scientists, individual contributors.
+[162:23.00]We are looking for fresh graduates, right?
+[162:26.00]Masters and PhD students, you know, just out of school, all the way to people who have, you know, a decade or more of experience.
+[162:33.00]So, really there is a huge spectrum of talent that we want to onboard and we are looking for.
+[162:41.00]Now, what is the characteristic of someone who will really come and enjoy here, right?
+[162:46.00]I think I started giving that to you earlier when I said people who are interested in actually solving the problem.
+[162:52.00]That last mile is hard, so we really want people to know that they have to have the stomach for that last mile.
+[162:58.00]People who are good at understanding what the product requirements are, they sometimes make the best applied researchers, right?
+[163:07.00]Because they can understand what our needs are and formulate problems, change architectures.
+[163:15.00]We are looking for people who are very good at dealing with ambiguity.
+[163:19.00]A lot of what we are doing, a lot of the advancements we are seeing, they are empirical data driven, right?
+[163:25.00]And so, we need to have that as a skill.
+[163:29.00]How do you actually build algorithms, systems and solutions that you can show improvement on our data?
+[163:37.00]It's great if somebody is doing some architecture or some network is doing great on some of the benchmarks, right?
+[163:45.00]That get published outside, but it's equally important.
+[163:48.00]It's actually more important that we are able to show to ourselves that these algorithms, architectures do well on our own internal data sets and benchmarks and evaluation, right?
+[164:00.00]And so, to that point, how do we actually come up with meaningful evaluation frameworks and methodologies, right?
+[164:05.00]That itself is something of great interest to us.
+[164:09.00]So, people in general who like to solve problems, problem solvers, people who like to go from theoretical to practical and people who are not afraid that the empirical data may not bear out their best idea and they have to go and rethink it, right?
+[164:25.00]And redo it.Those are the kind of people that will really do well here.
+[164:28.00]Yeah, it was great to hear your story and not a lot of people in the world, I think, that have the same depth of experience in AI, so this was awesome.
+[164:35.00]Yeah, a lot of people that you're looking for are also like the kind of AI engineers that we want to encourage.
+[164:40.00]So, yeah, thanks for sharing your thoughts.
+[164:42.00]Thank you, Sean.
+[164:43.00](音乐)
+[165:07.00](音乐)
+[165:09.00]字幕by索兰娅
+[165:11.00]我看你不太好看
+[165:12.32]我看你不太好看
diff --git a/content/post/Latent Space/Latent-Space-Latent-Space-Chats:-NLW-(Four-Wars,-GPT5),-Josh-Albrecht-Ali-Rohde-(TNAI),-Dylan-Patel-Semianalysis-(Groq),-Milind-Naphade-(Nvidia-GTC),-Personal-AI-(ft.-Harrison-Chase-—-LangFriend-LangMem).md b/content/post/Latent Space/Latent-Space-Latent-Space-Chats:-NLW-(Four-Wars,-GPT5),-Josh-Albrecht-Ali-Rohde-(TNAI),-Dylan-Patel-Semianalysis-(Groq),-Milind-Naphade-(Nvidia-GTC),-Personal-AI-(ft.-Harrison-Chase-—-LangFriend-LangMem).md
new file mode 100644
index 0000000..d50a53e
--- /dev/null
+++ b/content/post/Latent Space/Latent-Space-Latent-Space-Chats:-NLW-(Four-Wars,-GPT5),-Josh-Albrecht-Ali-Rohde-(TNAI),-Dylan-Patel-Semianalysis-(Groq),-Milind-Naphade-(Nvidia-GTC),-Personal-AI-(ft.-Harrison-Chase-—-LangFriend-LangMem).md
@@ -0,0 +1,12621 @@
+---
+title: Latent Space Chats: NLW (Four Wars, GPT5), Josh Albrecht/Ali Rohde (TNAI), Dylan Patel/Semianalysis (Groq), Milind Naphade (Nvidia GTC), Personal AI (ft. Harrison Chase — LangFriend/LangMem)
+author: Latent Space
+date: Sat, 06 Apr 2024 18:46:10 GMT
+draft: false
+summary: Our next 2 big events are AI UX and the World’s Fair. Join and apply to speak/sponsor!Due to timing issues we didn’t have an interview episode to share with you this week, but not to worry, we have mo...
+categories: [Latent Space]
+---
+
+{{< aplayer name="Latent Space Chats: NLW (Four Wars, GPT5), Josh Albrecht/Ali Rohde (TNAI), Dylan Patel/Semianalysis (Groq), Milind Naphade (Nvidia GTC), Personal AI (ft. Harrison Chase — LangFriend/LangMem)" artist="Latent Space" url="https://chrt.fm/track/ABF6EF/api.substack.com/feed/podcast/143334144/04a32184eac93dbdab8148562e701041.mp3" cover="https://substackcdn.com/feed/podcast/1084089/post/143334144/2e804590656ded340d33d70be08093ca.jpg" lrc-folded=true lrc-type=3 lrc="../Latent-Space-Latent-Space-Chats:-NLW-(Four-Wars,-GPT5),-Josh-Albrecht-Ali-Rohde-(TNAI),-Dylan-Patel-Semianalysis-(Groq),-Milind-Naphade-(Nvidia-GTC),-Personal-AI-(ft.-Harrison-Chase-—-LangFriend-LangMem).lrc" >}}{{< /aplayer >}}
+
+------
+
+Our next 2 big events are AI UX and the World’s Fair. Join and apply to speak/sponsor!
Due to timing issues we didn’t have an interview episode to share with you this week, but not to worry, we have more than enough “weekend special” content in the backlog for you to get your Latent Space fix, whether you like thinking about the big picture, or learning more about the pod behind the scenes, or talking Groq and GPUs, or AI Leadership, or Personal AI.
Enjoy!
AI Breakdown
The indefatigable NLW had us back on his show for an update on the Four Wars, covering Sora, Suno, and the reshaped GPT-4 Class Landscape:
and a longer segment on AI Engineering trends covering the future LLM landscape (Llama 3, GPT-5, Gemini 2, Claude 4), Open Source Models (Mistral, Grok), Apple and Meta’s AI strategy, new chips (Groq, MatX) and the general movement from baby AGIs to vertical Agents:
Thursday Nights in AI
We’re also including swyx’s interview with Josh Albrecht and Ali Rohde to reintroduce swyx and Latent Space to a general audience, and engage in some spicy Q&A:
Dylan Patel on Groq
We hosted a private event with Dylan Patel of SemiAnalysis (our last pod here):
Not all of it could be released so we just talked about our Groq estimates:
Milind Naphade - Capital One
In relation to conversations at NeurIPS and Nvidia GTC and upcoming at World’s Fair, we also enjoyed chatting with Milind Naphade about his AI Leadership work at IBM, Cisco, Nvidia, and now leading the AI Foundations org at Capital One. We covered:
* Milind’s learnings from ~25 years in machine learning
* His first paper citation was 24 years ago
* Lessons from working with Jensen Huang for 6 years and being CTO of Metropolis
* Thoughts on relevant AI research
* GTC takeaways and what makes NVIDIA special
If you’d like to work on building solutions rather than platform (as Milind put it), his Applied AI Research team at Capital One is hiring, which falls under the Capital One Tech team.
Personal AI Meetup
It all started with a meme:
Within days of each other, BEE, FRIEND, EmilyAI, Compass, Nox and LangFriend were all launching personal AI wearables and assistants. So we decided to put together a the world’s first Personal AI meetup featuring creators and enthusiasts of wearables. The full video is live now, with full show notes within.
Timestamps
* [00:01:13] AI Breakdown Part 1
* [00:02:20] Four Wars
* [00:13:45] Sora
* [00:15:12] Suno
* [00:16:34] The GPT-4 Class Landscape
* [00:17:03] Data War: Reddit x Google
* [00:21:53] Gemini 1.5 vs Claude 3
* [00:26:58] AI Breakdown Part 2
* [00:27:33] Next Frontiers: Llama 3, GPT-5, Gemini 2, Claude 4
* [00:31:11] Open Source Models - Mistral, Grok
* [00:34:13] Apple MM1
* [00:37:33] Meta's $800b AI rebrand
* [00:39:20] AI Engineer landscape - from baby AGIs to vertical Agents
* [00:47:28] Adept episode - Screen Multimodality
* [00:48:54] Top Model Research from January Recap
* [00:53:08] AI Wearables
* [00:57:26] Groq vs Nvidia month - GPU Chip War
* [01:00:31] Disagreements
* [01:02:08] Summer 2024 Predictions
* [01:04:18] Thursday Nights in AI - swyx
* [01:33:34] Dylan Patel - Semianalysis + Latent Space Live Show
* [01:34:58] Groq
Transcript
[00:00:00] swyx: Welcome to the Latent Space Podcast Weekend Edition. This is Charlie, your AI co host. Swyx and Alessio are off for the week, making more great content. We have exciting interviews coming up with Elicit, Chroma, Instructor, and our upcoming series on NSFW, Not Safe for Work AI. In today's episode, we're collating some of Swyx and Alessio's recent appearances, all in one place for you to find.
[00:00:32] swyx: In part one, we have our first crossover pod of the year. In our listener survey, several folks asked for more thoughts from our two hosts. In 2023, Swyx and Alessio did crossover interviews with other great podcasts like the AI Breakdown, Practical AI, Cognitive Revolution, Thursday Eye, and Chinatalk, all of which you can find in the Latentspace About page.
[00:00:56] swyx: NLW of the AI Breakdown asked us back to do a special on the 4Wars framework and the AI engineer scene. We love AI Breakdown as one of the best examples Daily podcasts to keep up on AI news, so we were especially excited to be back on Watch out and take
[00:01:12] NLW: care
[00:01:13] AI Breakdown Part 1
[00:01:13] NLW: today on the AI breakdown. Part one of my conversation with Alessio and Swix from Latent Space.
[00:01:19] NLW: All right, fellas, welcome back to the AI Breakdown. How are you doing? I'm good. Very good. With the last, the last time we did this show, we were like, oh yeah, let's do check ins like monthly about all the things that are going on and then. Of course, six months later, and, you know, the, the, the world has changed in a thousand ways.
[00:01:36] NLW: It's just, it's too busy to even, to even think about podcasting sometimes. But I, I'm super excited to, to be chatting with you again. I think there's, there's a lot to, to catch up on, just to tap in, I think in the, you know, in the beginning of 2024. And, and so, you know, we're gonna talk today about just kind of a, a, a broad sense of where things are in some of the key battles in the AI space.
[00:01:55] NLW: And then the, you know, one of the big things that I, that I'm really excited to have you guys on here for us to talk about where, sort of what patterns you're seeing and what people are actually trying to build, you know, where, where developers are spending their, their time and energy and, and, and any sort of, you know, trend trends there, but maybe let's start I guess by checking in on a framework that you guys actually introduced, which I've loved and I've cribbed a couple of times now, which is this sort of four wars of the, of the AI stack.
[00:02:20] Four Wars
[00:02:20] NLW: Because first, since I have you here, I'd love, I'd love to hear sort of like where that started gelling. And then and then maybe we can get into, I think a couple of them that are you know, particularly interesting, you know, in the, in light of
[00:02:30] swyx: some recent news. Yeah, so maybe I'll take this one. So the four wars is a framework that I came up around trying to recap all of 2023.
[00:02:38] swyx: I tried to write sort of monthly recap pieces. And I was trying to figure out like what makes one piece of news last longer than another or more significant than another. And I think it's basically always around battlegrounds. Wars are fought around limited resources. And I think probably the, you know, the most limited resource is talent, but the talent expresses itself in a number of areas.
[00:03:01] swyx: And so I kind of focus on those, those areas at first. So the four wars that we cover are the data wars, the GPU rich, poor war, the multi modal war, And the RAG and Ops War. And I think you actually did a dedicated episode to that, so thanks for covering that. Yeah, yeah.
[00:03:18] NLW: Not only did I do a dedicated episode, I actually used that.
[00:03:22] NLW: I can't remember if I told you guys. I did give you big shoutouts. But I used it as a framework for a presentation at Intel's big AI event that they hold each year, where they have all their folks who are working on AI internally. And it totally resonated. That's amazing. Yeah, so, so, what got me thinking about it again is specifically this inflection news that we recently had, this sort of, you know, basically, I can't imagine that anyone who's listening wouldn't have thought about it, but, you know, inflection is a one of the big contenders, right?
[00:03:53] NLW: I think probably most folks would have put them, you know, just a half step behind the anthropics and open AIs of the world in terms of labs, but it's a company that raised 1. 3 billion last year, less than a year ago. Reed Hoffman's a co founder Mustafa Suleyman, who's a co founder of DeepMind, you know, so it's like, this is not a a small startup, let's say, at least in terms of perception.
[00:04:13] NLW: And then we get the news that basically most of the team, it appears, is heading over to Microsoft and they're bringing in a new CEO. And you know, I'm interested in, in, in kind of your take on how much that reflects, like hold aside, I guess, you know, all the other things that it might be about, how much it reflects this sort of the, the stark.
[00:04:32] NLW: Brutal reality of competing in the frontier model space right now. And, you know, just the access to compute.
[00:04:38] Alessio: There are a lot of things to say. So first of all, there's always somebody who's more GPU rich than you. So inflection is GPU rich by startup standard. I think about 22, 000 H100s, but obviously that pales compared to the, to Microsoft.
[00:04:55] Alessio: The other thing is that this is probably good news, maybe for the startups. It's like being GPU rich, it's not enough. You know, like I think they were building something pretty interesting in, in pi of their own model of their own kind of experience. But at the end of the day, you're the interface that people consume as end users.
[00:05:13] Alessio: It's really similar to a lot of the others. So and we'll tell, talk about GPT four and cloud tree and all this stuff. GPU poor, doing something. That the GPU rich are not interested in, you know we just had our AI center of excellence at Decibel and one of the AI leads at one of the big companies was like, Oh, we just saved 10 million and we use these models to do a translation, you know, and that's it.
[00:05:39] Alessio: It's not, it's not a GI, it's just translation. So I think like the inflection part is maybe. A calling and a waking to a lot of startups then say, Hey, you know, trying to get as much capital as possible, try and get as many GPUs as possible. Good. But at the end of the day, it doesn't build a business, you know, and maybe what inflection I don't, I don't, again, I don't know the reasons behind the inflection choice, but if you say, I don't want to build my own company that has 1.
[00:06:05] Alessio: 3 billion and I want to go do it at Microsoft, it's probably not a resources problem. It's more of strategic decisions that you're making as a company. So yeah, that was kind of my. I take on it.
[00:06:15] swyx: Yeah, and I guess on my end, two things actually happened yesterday. It was a little bit quieter news, but Stability AI had some pretty major departures as well.
[00:06:25] swyx: And you may not be considering it, but Stability is actually also a GPU rich company in the sense that they were the first new startup in this AI wave to brag about how many GPUs that they have. And you should join them. And you know, Imadis is definitely a GPU trader in some sense from his hedge fund days.
[00:06:43] swyx: So Robin Rhombach and like the most of the Stable Diffusion 3 people left Stability yesterday as well. So yesterday was kind of like a big news day for the GPU rich companies, both Inflection and Stability having sort of wind taken out of their sails. I think, yes, it's a data point in the favor of Like, just because you have the GPUs doesn't mean you can, you automatically win.
[00:07:03] swyx: And I think, you know, kind of I'll echo what Alessio says there. But in general also, like, I wonder if this is like the start of a major consolidation wave, just in terms of, you know, I think that there was a lot of funding last year and, you know, the business models have not been, you know, All of these things worked out very well.
[00:07:19] swyx: Even inflection couldn't do it. And so I think maybe that's the start of a small consolidation wave. I don't think that's like a sign of AI winter. I keep looking for AI winter coming. I think this is kind of like a brief cold front. Yeah,
[00:07:34] NLW: it's super interesting. So I think a bunch of A bunch of stuff here.
[00:07:38] NLW: One is, I think, to both of your points, there, in some ways, there, there had already been this very clear demarcation between these two sides where, like, the GPU pores, to use the terminology, like, just weren't trying to compete on the same level, right? You know, the vast majority of people who have started something over the last year, year and a half, call it, were racing in a different direction.
[00:07:59] NLW: They're trying to find some edge somewhere else. They're trying to build something different. If they're, if they're really trying to innovate, it's in different areas. And so it's really just this very small handful of companies that are in this like very, you know, it's like the coheres and jaspers of the world that like this sort of, you know, that are that are just sort of a little bit less resourced than, you know, than the other set that I think that this potentially even applies to, you know, everyone else that could clearly demarcate it into these two, two sides.
[00:08:26] NLW: And there's only a small handful kind of sitting uncomfortably in the middle, perhaps. Let's, let's come back to the idea of, of the sort of AI winter or, you know, a cold front or anything like that. So this is something that I, I spent a lot of time kind of thinking about and noticing. And my perception is that The vast majority of the folks who are trying to call for sort of, you know, a trough of disillusionment or, you know, a shifting of the phase to that are people who either, A, just don't like AI for some other reason there's plenty of that, you know, people who are saying, You Look, they're doing way worse than they ever thought.
[00:09:03] NLW: You know, there's a lot of sort of confirmation bias kind of thing going on. Or two, media that just needs a different narrative, right? Because they're sort of sick of, you know, telling the same story. Same thing happened last summer, when every every outlet jumped on the chat GPT at its first down month story to try to really like kind of hammer this idea that that the hype was too much.
[00:09:24] NLW: Meanwhile, you have, you know, just ridiculous levels of investment from enterprises, you know, coming in. You have, you know, huge, huge volumes of, you know, individual behavior change happening. But I do think that there's nothing incoherent sort of to your point, Swyx, about that and the consolidation period.
[00:09:42] NLW: Like, you know, if you look right now, for example, there are, I don't know, probably 25 or 30 credible, like, build your own chatbot. platforms that, you know, a lot of which have, you know, raised funding. There's no universe in which all of those are successful across, you know, even with a, even, even with a total addressable market of every enterprise in the world, you know, you're just inevitably going to see some amount of consolidation.
[00:10:08] NLW: Same with, you know, image generators. There are, if you look at A16Z's top 50 consumer AI apps, just based on, you know, web traffic or whatever, they're still like I don't know, a half. Dozen or 10 or something, like, some ridiculous number of like, basically things like Midjourney or Dolly three. And it just seems impossible that we're gonna have that many, you know, ultimately as, as, as sort of, you know, going, going concerned.
[00:10:33] NLW: So, I don't know. I, I, I think that the, there will be inevitable consolidation 'cause you know. It's, it's also what kind of like venture rounds are supposed to do. You're not, not everyone who gets a seed round is supposed to get to series A and not everyone who gets a series A is supposed to get to series B.
[00:10:46] NLW: That's sort of the natural process. I think it will be tempting for a lot of people to try to infer from that something about AI not being as sort of big or as as sort of relevant as, as it was hyped up to be. But I, I kind of think that's the wrong conclusion to come to.
[00:11:02] Alessio: I I would say the experimentation.
[00:11:04] Alessio: Surface is a little smaller for image generation. So if you go back maybe six, nine months, most people will tell you, why would you build a coding assistant when like Copilot and GitHub are just going to win everything because they have the data and they have all the stuff. If you fast forward today, A lot of people use Cursor everybody was excited about the Devin release on Twitter.
[00:11:26] Alessio: There are a lot of different ways of attacking the market that are not completion of code in the IDE. And even Cursors, like they evolved beyond single line to like chat, to do multi line edits and, and all that stuff. Image generation, I would say, yeah, as a, just as from what I've seen, like maybe the product innovation has slowed down at the UX level and people are improving the models.
[00:11:50] Alessio: So the race is like, how do I make better images? It's not like, how do I make the user interact with the generation process better? And that gets tough, you know? It's hard to like really differentiate yourselves. So yeah, that's kind of how I look at it. And when we think about multimodality, maybe the reason why people got so excited about Sora is like, oh, this is like a completely It's not a better image model.
[00:12:13] Alessio: This is like a completely different thing, you know? And I think the creative mind It's always looking for something that impacts the viewer in a different way, you know, like they really want something different versus the developer mind. It's like, Oh, I, I just, I have this like very annoying thing I want better.
[00:12:32] Alessio: I have this like very specific use cases that I want to go after. So it's just different. And that's why you see a lot more companies in image generation. But I agree with you that. If you fast forward there, there's not going to be 10 of them, you know, it's probably going to be one or
[00:12:46] swyx: two. Yeah, I mean, to me, that's why I call it a war.
[00:12:49] swyx: Like, individually, all these companies can make a story that kind of makes sense, but collectively, they cannot all be true. Therefore, they all, there is some kind of fight over limited resources here. Yeah, so
[00:12:59] NLW: it's interesting. We wandered very naturally into sort of another one of these wars, which is the multimodality kind of idea, which is, you know, basically a question of whether it's going to be these sort of big everything models that end up winning or whether, you know, you're going to have really specific things, you know, like something, you know, Dolly 3 inside of sort of OpenAI's larger models versus, you know, a mid journey or something like that.
[00:13:24] NLW: And at first, you know, I was kind of thinking like, For most of the last, call it six months or whatever, it feels pretty definitively both and in some ways, you know, and that you're, you're seeing just like great innovation on sort of the everything models, but you're also seeing lots and lots happen at sort of the level of kind of individual use cases.
[00:13:45] Sora
[00:13:45] NLW: But then Sora comes along and just like obliterates what I think anyone thought you know, where we were when it comes to video generation. So how are you guys thinking about this particular battle or war at the moment?
[00:13:59] swyx: Yeah, this was definitely a both and story, and Sora tipped things one way for me, in terms of scale being all you need.
[00:14:08] swyx: And the benefit, I think, of having multiple models being developed under one roof. I think a lot of people aren't aware that Sora was developed in a similar fashion to Dolly 3. And Dolly3 had a very interesting paper out where they talked about how they sort of bootstrapped their synthetic data based on GPT 4 vision and GPT 4.
[00:14:31] swyx: And, and it was just all, like, really interesting, like, if you work on one modality, it enables you to work on other modalities, and all that is more, is, is more interesting. I think it's beneficial if it's all in the same house, whereas the individual startups who don't, who sort of carve out a single modality and work on that, definitely won't have the state of the art stuff on helping them out on synthetic data.
[00:14:52] swyx: So I do think like, The balance is tilted a little bit towards the God model companies, which is challenging for the, for the, for the the sort of dedicated modality companies. But everyone's carving out different niches. You know, like we just interviewed Suno ai, the sort of music model company, and, you know, I don't see opening AI pursuing music anytime soon.
[00:15:12] Suno
[00:15:12] swyx: Yeah,
[00:15:13] NLW: Suno's been phenomenal to play with. Suno has done that rare thing where, which I think a number of different AI product categories have done, where people who don't consider themselves particularly interested in doing the thing that the AI enables find themselves doing a lot more of that thing, right?
[00:15:29] NLW: Like, it'd be one thing if Just musicians were excited about Suno and using it but what you're seeing is tons of people who just like music all of a sudden like playing around with it and finding themselves kind of down that rabbit hole, which I think is kind of like the highest compliment that you can give one of these startups at the
[00:15:45] swyx: early days of it.
[00:15:46] swyx: Yeah, I, you know, I, I asked them directly, you know, in the interview about whether they consider themselves mid journey for music. And he had a more sort of nuanced response there, but I think that probably the business model is going to be very similar because he's focused on the B2C element of that. So yeah, I mean, you know, just to, just to tie back to the question about, you know, You know, large multi modality companies versus small dedicated modality companies.
[00:16:10] swyx: Yeah, highly recommend people to read the Sora blog posts and then read through to the Dali blog posts because they, they strongly correlated themselves with the same synthetic data bootstrapping methods as Dali. And I think once you make those connections, you're like, oh, like it, it, it is beneficial to have multiple state of the art models in house that all help each other.
[00:16:28] swyx: And these, this, that's the one thing that a dedicated modality company cannot do.
[00:16:34] The GPT-4 Class Landscape
[00:16:34] NLW: So I, I wanna jump, I wanna kind of build off that and, and move into the sort of like updated GPT-4 class landscape. 'cause that's obviously been another big change over the last couple months. But for the sake of completeness, is there anything that's worth touching on with with sort of the quality?
[00:16:46] NLW: Quality data or sort of a rag ops wars just in terms of, you know, anything that's changed, I guess, for you fundamentally in the last couple of months about where those things stand.
[00:16:55] swyx: So I think we're going to talk about rag for the Gemini and Clouds discussion later. And so maybe briefly discuss the data piece.
[00:17:03] Data War: Reddit x Google
[00:17:03] swyx: I think maybe the only new thing was this Reddit deal with Google for like a 60 million dollar deal just ahead of their IPO, very conveniently turning Reddit into a AI data company. Also, very, very interestingly, a non exclusive deal, meaning that Reddit can resell that data to someone else. And it probably does become table stakes.
[00:17:23] swyx: A lot of people don't know, but a lot of the web text dataset that originally started for GPT 1, 2, and 3 was actually scraped from GitHub. from Reddit at least the sort of vote scores. And I think, I think that's a, that's a very valuable piece of information. So like, yeah, I think people are figuring out how to pay for data.
[00:17:40] swyx: People are suing each other over data. This, this, this war is, you know, definitely very, very much heating up. And I don't think, I don't see it getting any less intense. I, you know, next to GPUs, data is going to be the most expensive thing in, in a model stack company. And. You know, a lot of people are resorting to synthetic versions of it, which may or may not be kosher based on how far along or how commercially blessed the, the forms of creating that synthetic data are.
[00:18:11] swyx: I don't know if Alessio, you have any other interactions with like Data source companies, but that's my two cents.
[00:18:17] Alessio: Yeah yeah, I actually saw Quentin Anthony from Luther. ai at GTC this week. He's also been working on this. I saw Technium. He's also been working on the data side. I think especially in open source, people are like, okay, if everybody is putting the gates up, so to speak, to the data we need to make it easier for people that don't have 50 million a year to get access to good data sets.
[00:18:38] Alessio: And Jensen, at his keynote, he did talk about synthetic data a little bit. So I think that's something that we'll definitely hear more and more of in the enterprise, which never bodes well, because then all the, all the people with the data are like, Oh, the enterprises want to pay now? Let me, let me put a pay here stripe link so that they can give me 50 million.
[00:18:57] Alessio: But it worked for Reddit. I think the stock is up. 40 percent today after opening. So yeah, I don't know if it's all about the Google deal, but it's obviously Reddit has been one of those companies where, hey, you got all this like great community, but like, how are you going to make money? And like, they try to sell the avatars.
[00:19:15] Alessio: I don't know if that it's a great business for them. The, the data part sounds as an investor, you know, the data part sounds a lot more interesting than, than consumer
[00:19:25] swyx: cosmetics. Yeah, so I think, you know there's more questions around data you know, I think a lot of people are talking about the interview that Mira Murady did with the Wall Street Journal, where she, like, just basically had no, had no good answer for where they got the data for Sora.
[00:19:39] swyx: I, I think this is where, you know, there's, it's in nobody's interest to be transparent about data, and it's, it's kind of sad for the state of ML and the state of AI research but it is what it is. We, we have to figure this out as a society, just like we did for music and music sharing. You know, in, in sort of the Napster to Spotify transition, and that might take us a decade.
[00:19:59] swyx: Yeah, I
[00:20:00] NLW: do. I, I agree. I think, I think that you're right to identify it, not just as that sort of technical problem, but as one where society has to have a debate with itself. Because I think that there's, if you rationally within it, there's Great kind of points on all side, not to be the sort of, you know, person who sits in the middle constantly, but it's why I think a lot of these legal decisions are going to be really important because, you know, the job of judges is to listen to all this stuff and try to come to things and then have other judges disagree.
[00:20:24] NLW: And, you know, and have the rest of us all debate at the same time. By the way, as a total aside, I feel like the synthetic data right now is like eggs in the 80s and 90s. Like, whether they're good for you or bad for you, like, you know, we, we get one study that's like synthetic data, you know, there's model collapse.
[00:20:42] NLW: And then we have like a hint that llama, you know, to the most high performance version of it, which was one they didn't release was trained on synthetic data. So maybe it's good. It's like, I just feel like every, every other week I'm seeing something sort of different about whether it's a good or bad for, for these models.
[00:20:56] swyx: Yeah. The branding of this is pretty poor. I would kind of tell people to think about it like cholesterol. There's good cholesterol, bad cholesterol. And you can have, you know, good amounts of both. But at this point, it is absolutely without a doubt that most large models from here on out will all be trained as some kind of synthetic data and that is not a bad thing.
[00:21:16] swyx: There are ways in which you can do it poorly. Whether it's commercial, you know, in terms of commercial sourcing or in terms of the model performance. But it's without a doubt that good synthetic data is going to help your model. And this is just a question of like where to obtain it and what kinds of synthetic data are valuable.
[00:21:36] swyx: You know, if even like alpha geometry, you know, was, was a really good example from like earlier this year.
[00:21:42] NLW: If you're using the cholesterol analogy, then my, then my egg thing can't be that far off. Let's talk about the sort of the state of the art and the, and the GPT 4 class landscape and how that's changed.
[00:21:53] Gemini 1.5 vs Claude 3
[00:21:53] NLW: Cause obviously, you know, sort of the, the two big things or a couple of the big things that have happened. Since we last talked, we're one, you know, Gemini first announcing that a model was coming and then finally it arriving, and then very soon after a sort of a different model arriving from Gemini and and Cloud three.
[00:22:11] NLW: So I guess, you know, I'm not sure exactly where the right place to start with this conversation is, but, you know, maybe very broadly speaking which of these do you think have made a bigger impact? Thank you.
[00:22:20] Alessio: Probably the one you can use, right? So, Cloud. Well, I'm sure Gemini is going to be great once they let me in, but so far I haven't been able to.
[00:22:29] Alessio: I use, so I have this small podcaster thing that I built for our podcast, which does chapters creation, like named entity recognition, summarization, and all of that. Cloud Tree is, Better than GPT 4. Cloud2 was unusable. So I use GPT 4 for everything. And then when Opus came out, I tried them again side by side and I posted it on, on Twitter as well.
[00:22:53] Alessio: Cloud is better. It's very good, you know, it's much better, it seems to me, it's much better than GPT 4 at doing writing that is more, you know, I don't know, it just got good vibes, you know, like the GPT 4 text, you can tell it's like GPT 4, you know, it's like, it always uses certain types of words and phrases and, you know, maybe it's just me because I've now done it for, you know, So, I've read like 75, 80 generations of these things next to each other.
[00:23:21] Alessio: Clutter is really good. I know everybody is freaking out on twitter about it, my only experience of this is much better has been on the podcast use case. But I know that, you know, Quran from from News Research is a very big opus pro, pro opus person. So, I think that's also It's great to have people that actually care about other models.
[00:23:40] Alessio: You know, I think so far to a lot of people, maybe Entropic has been the sibling in the corner, you know, it's like Cloud releases a new model and then OpenAI releases Sora and like, you know, there are like all these different things, but yeah, the new models are good. It's interesting.
[00:23:55] NLW: My my perception is definitely that just, just observationally, Cloud 3 is certainly the first thing that I've seen where lots of people.
[00:24:06] NLW: They're, no one's debating evals or anything like that. They're talking about the specific use cases that they have, that they used to use chat GPT for every day, you know, day in, day out, that they've now just switched over. And that has, I think, shifted a lot of the sort of like vibe and sentiment in the space too.
[00:24:26] NLW: And I don't necessarily think that it's sort of a A like full you know, sort of full knock. Let's put it this way. I think it's less bad for open AI than it is good for anthropic. I think that because GPT 5 isn't there, people are not quite willing to sort of like, you know get overly critical of, of open AI, except in so far as they're wondering where GPT 5 is.
[00:24:46] NLW: But I do think that it makes, Anthropic look way more credible as a, as a, as a player, as a, you know, as a credible sort of player, you know, as opposed to to, to where they were.
[00:24:57] Alessio: Yeah. And I would say the benchmarks veil is probably getting lifted this year. I think last year. People were like, okay, this is better than this on this benchmark, blah, blah, blah, because maybe they did not have a lot of use cases that they did frequently.
[00:25:11] Alessio: So it's hard to like compare yourself. So you, you defer to the benchmarks. I think now as we go into 2024, a lot of people have started to use these models from, you know, from very sophisticated things that they run in production to some utility that they have on their own. Now they can just run them side by side.
[00:25:29] Alessio: And it's like, Hey, I don't care that like. The MMLU score of Opus is like slightly lower than GPT 4. It just works for me, you know, and I think that's the same way that traditional software has been used by people, right? Like you just strive for yourself and like, which one does it work, works best for you?
[00:25:48] Alessio: Like nobody looks at benchmarks outside of like sales white papers, you know? And I think it's great that we're going more in that direction. We have a episode with Adapt coming out this weekend. I'll and some of their model releases, they specifically say, We do not care about benchmarks, so we didn't put them in, you know, because we, we don't want to look good on them.
[00:26:06] Alessio: We just want the product to work. And I think more and more people will, will
[00:26:09] swyx: go that way. Yeah. I I would say like, it does take the wind out of the sails for GPT 5, which I know where, you know, Curious about later on. I think anytime you put out a new state of the art model, you have to break through in some way.
[00:26:21] swyx: And what Claude and Gemini have done is effectively take away any advantage to saying that you have a million token context window. Now everyone's just going to be like, Oh, okay. Now you just match the other two guys. And so that puts An insane amount of pressure on what gpt5 is going to be because it's just going to have like the only option it has now because all the other models are multimodal all the other models are long context all the other models have perfect recall gpt5 has to match everything and do more to to not be a flop
[00:26:58] AI Breakdown Part 2
[00:26:58] NLW: hello friends back again with part two if you haven't heard part one of this conversation i suggest you go check it out but to be honest they are kind of actually separable In this conversation, we get into a topic that I think Alessio and Swyx are very well positioned to discuss, which is what developers care about right now, what people are trying to build around.
[00:27:16] NLW: I honestly think that one of the best ways to see the future in an industry like AI is to try to dig deep on what developers and entrepreneurs are attracted to build, even if it hasn't made it to the news pages yet. So consider this your preview of six months from now, and let's dive in. Let's bring it to the GPT 5 conversation.
[00:27:33] Next Frontiers: Llama 3, GPT-5, Gemini 2, Claude 4
[00:27:33] NLW: I mean, so, so I think that that's a great sort of assessment of just how the stakes have been raised, you know is your, I mean, so I guess maybe, maybe I'll, I'll frame this less as a question, just sort of something that, that I, that I've been watching right now, the only thing that makes sense to me with how.
[00:27:50] NLW: Fundamentally unbothered and unstressed OpenAI seems about everything is that they're sitting on something that does meet all that criteria, right? Because, I mean, even in the Lex Friedman interview that, that Altman recently did, you know, he's talking about other things coming out first. He's talking about, he's just like, he, listen, he, he's good and he could play nonchalant, you know, if he wanted to.
[00:28:13] NLW: So I don't want to read too much into it, but. You know, they've had so long to work on this, like unless that we are like really meaningfully running up against some constraint, it just feels like, you know, there's going to be some massive increase, but I don't know. What do you guys think?
[00:28:28] swyx: Hard to speculate.
[00:28:29] swyx: You know, at this point, they're, they're pretty good at PR and they're not going to tell you anything that they don't want to. And he can tell you one thing and change their minds the next day. So it's, it's, it's really, you know, I've always said that model version numbers are just marketing exercises, like they have something and it's always improving and at some point you just cut it and decide to call it GPT 5.
[00:28:50] swyx: And it's more just about defining an arbitrary level at which they're ready and it's up to them on what ready means. We definitely did see some leaks on GPT 4. 5, as I think a lot of people reported and I'm not sure if you covered it. So it seems like there might be an intermediate release. But I did feel, coming out of the Lex Friedman interview, that GPT 5 was nowhere near.
[00:29:11] swyx: And you know, it was kind of a sharp contrast to Sam talking at Davos in February, saying that, you know, it was his top priority. So I find it hard to square. And honestly, like, there's also no point Reading too much tea leaves into what any one person says about something that hasn't happened yet or has a decision that hasn't been taken yet.
[00:29:31] swyx: Yeah, that's, that's my 2 cents about it. Like, calm down, let's just build .
[00:29:35] Alessio: Yeah. The, the February rumor was that they were gonna work on AI agents, so I don't know, maybe they're like, yeah,
[00:29:41] swyx: they had two agent two, I think two agent projects, right? One desktop agent and one sort of more general yeah, sort of GPTs like agent and then Andre left, so he was supposed to be the guy on that.
[00:29:52] swyx: What did Andre see? What did he see? I don't know. What did he see?
[00:29:56] Alessio: I don't know. But again, it's just like the rumors are always floating around, you know but I think like, this is, you know, we're not going to get to the end of the year without Jupyter you know, that's definitely happening. I think the biggest question is like, are Anthropic and Google.
[00:30:13] Alessio: Increasing the pace, you know, like it's the, it's the cloud four coming out like in 12 months, like nine months. What's the, what's the deal? Same with Gemini. They went from like one to 1. 5 in like five days or something. So when's Gemini 2 coming out, you know, is that going to be soon? I don't know.
[00:30:31] Alessio: There, there are a lot of, speculations, but the good thing is that now you can see a world in which OpenAI doesn't rule everything. You know, so that, that's the best, that's the best news that everybody got, I would say.
[00:30:43] swyx: Yeah, and Mistral Large also dropped in the last month. And, you know, not as, not quite GPT 4 class, but very good from a new startup.
[00:30:52] swyx: So yeah, we, we have now slowly changed in landscape, you know. In my January recap, I was complaining that nothing's changed in the landscape for a long time. But now we do exist in a world, sort of a multipolar world where Cloud and Gemini are legitimate challengers to GPT 4 and hopefully more will emerge as well hopefully from meta.
[00:31:11] Open Source Models - Mistral, Grok
[00:31:11] NLW: So speak, let's actually talk about sort of the open source side of this for a minute. So Mistral Large, notable because it's, it's not available open source in the same way that other things are, although I think my perception is that the community has largely given them Like the community largely recognizes that they want them to keep building open source stuff and they have to find some way to fund themselves that they're going to do that.
[00:31:27] NLW: And so they kind of understand that there's like, they got to figure out how to eat, but we've got, so, you know, there there's Mistral, there's, I guess, Grok now, which is, you know, Grok one is from, from October is, is open
[00:31:38] swyx: sourced at, yeah. Yeah, sorry, I thought you thought you meant Grok the chip company.
[00:31:41] swyx: No, no, no, yeah, you mean Twitter Grok.
[00:31:43] NLW: Although Grok the chip company, I think is even more interesting in some ways, but and then there's the, you know, obviously Llama3 is the one that sort of everyone's wondering about too. And, you know, my, my sense of that, the little bit that, you know, Zuckerberg was talking about Llama 3 earlier this year, suggested that, at least from an ambition standpoint, he was not thinking about how do I make sure that, you know, meta content, you know, keeps, keeps the open source thrown, you know, vis a vis Mistral.
[00:32:09] NLW: He was thinking about how you go after, you know, how, how he, you know, releases a thing that's, you know, every bit as good as whatever OpenAI is on at that point.
[00:32:16] Alessio: Yeah. From what I heard in the hallways at, at GDC, Llama 3, the, the biggest model will be, you 260 to 300 billion parameters, so that that's quite large.
[00:32:26] Alessio: That's not an open source model. You know, you cannot give people a 300 billion parameters model and ask them to run it. You know, it's very compute intensive. So I think it is, it
[00:32:35] swyx: can be open source. It's just, it's going to be difficult to run, but that's a separate question.
[00:32:39] Alessio: It's more like, as you think about what they're doing it for, you know, it's not like empowering the person running.
[00:32:45] Alessio: llama. On, on their laptop, it's like, oh, you can actually now use this to go after open AI, to go after Anthropic, to go after some of these companies at like the middle complexity level, so to speak. Yeah. So obviously, you know, we estimate Gentala on the podcast, they're doing a lot here, they're making PyTorch better.
[00:33:03] Alessio: You know, they want to, that's kind of like maybe a little bit of a shorted. Adam Bedia, in a way, trying to get some of the CUDA dominance out of it. Yeah, no, it's great. The, I love the duck destroying a lot of monopolies arc. You know, it's, it's been very entertaining. Let's bridge
[00:33:18] NLW: into the sort of big tech side of this, because this is obviously like, so I think actually when I did my episode, this was one of the I added this as one of as an additional war that, that's something that I'm paying attention to.
[00:33:29] NLW: So we've got Microsoft's moves with inflection, which I think pretend, potentially are being read as A shift vis a vis the relationship with OpenAI, which also the sort of Mistral large relationship seems to reinforce as well. We have Apple potentially entering the race, finally, you know, giving up Project Titan and and, and kind of trying to spend more effort on this.
[00:33:50] NLW: Although, Counterpoint, we also have them talking about it, or there being reports of a deal with Google, which, you know, is interesting to sort of see what their strategy there is. And then, you know, Meta's been largely quiet. We kind of just talked about the main piece, but, you know, there's, and then there's spoilers like Elon.
[00:34:07] NLW: I mean, you know, what, what of those things has sort of been most interesting to you guys as you think about what's going to shake out for the rest of this
[00:34:13] Apple MM1
[00:34:13] swyx: year? I'll take a crack. So the reason we don't have a fifth war for the Big Tech Wars is that's one of those things where I just feel like we don't cover differently from other media channels, I guess.
[00:34:26] swyx: Sure, yeah. In our anti interestness, we actually say, like, we try not to cover the Big Tech Game of Thrones, or it's proxied through Twitter. You know, all the other four wars anyway, so there's just a lot of overlap. Yeah, I think absolutely, personally, the most interesting one is Apple entering the race.
[00:34:41] swyx: They actually released, they announced their first large language model that they trained themselves. It's like a 30 billion multimodal model. People weren't that impressed, but it was like the first time that Apple has kind of showcased that, yeah, we're training large models in house as well. Of course, like, they might be doing this deal with Google.
[00:34:57] swyx: I don't know. It sounds very sort of rumor y to me. And it's probably, if it's on device, it's going to be a smaller model. So something like a Jemma. It's going to be smarter autocomplete. I don't know what to say. I'm still here dealing with, like, Siri, which hasn't, probably hasn't been updated since God knows when it was introduced.
[00:35:16] swyx: It's horrible. I, you know, it, it, it makes me so angry. So I, I, one, as an Apple customer and user, I, I'm just hoping for better AI on Apple itself. But two, they are the gold standard when it comes to local devices, personal compute and, and trust, like you, you trust them with your data. And. I think that's what a lot of people are looking for in AI, that they have, they love the benefits of AI, they don't love the downsides, which is that you have to send all your data to some cloud somewhere.
[00:35:45] swyx: And some of this data that we're going to feed AI is just the most personal data there is. So Apple being like one of the most trusted personal data companies, I think it's very important that they enter the AI race, and I hope to see more out of them.
[00:35:58] Alessio: To me, the, the biggest question with the Google deal is like, who's paying who?
[00:36:03] Alessio: Because for the browsers, Google pays Apple like 18, 20 billion every year to be the default browser. Is Google going to pay you to have Gemini or is Apple paying Google to have Gemini? I think that's, that's like what I'm most interested to figure out because with the browsers, it's like, it's the entry point to the thing.
[00:36:21] Alessio: So it's really valuable to be the default. That's why Google pays. But I wonder if like the perception in AI is going to be like, Hey. You just have to have a good local model on my phone to be worth me purchasing your device. And that was, that's kind of drive Apple to be the one buying the model. But then, like Shawn said, they're doing the MM1 themselves.
[00:36:40] Alessio: So are they saying we do models, but they're not as good as the Google ones? I don't know. The whole thing is, it's really confusing, but. It makes for great meme material on on Twitter.
[00:36:51] swyx: Yeah, I mean, I think, like, they are possibly more than OpenAI and Microsoft and Amazon. They are the most full stack company there is in computing, and so, like, they own the chips, man.
[00:37:05] swyx: Like, they manufacture everything so if, if, if there was a company that could do that. You know, seriously challenge the other AI players. It would be Apple. And it's, I don't think it's as hard as self driving. So like maybe they've, they've just been investing in the wrong thing this whole time. We'll see.
[00:37:21] swyx: Wall Street certainly thinks
[00:37:22] NLW: so. Wall Street loved that move, man. There's a big, a big sigh of relief. Well, let's, let's move away from, from sort of the big stuff. I mean, the, I think to both of your points, it's going to.
[00:37:33] Meta's $800b AI rebrand
[00:37:33] NLW: Can I, can
[00:37:34] swyx: I, can I, can I jump on factoid about this, this Wall Street thing? I went and looked at when Meta went from being a VR company to an AI company.
[00:37:44] swyx: And I think the stock I'm trying to look up the details now. The stock has gone up 187% since Lamo one. Yeah. Which is $830 billion in market value created in the past year. . Yeah. Yeah.
[00:37:57] NLW: It's, it's, it's like, remember if you guys haven't Yeah. If you haven't seen the chart, it's actually like remarkable.
[00:38:02] NLW: If you draw a little
[00:38:03] swyx: arrow on it, it's like, no, we're an AI company now and forget the VR thing.
[00:38:10] NLW: It's it, it is an interesting, no, it's, I, I think, alessio, you called it sort of like Zuck's Disruptor Arc or whatever. He, he really does. He is in the midst of a, of a total, you know, I don't know if it's a redemption arc or it's just, it's something different where, you know, he, he's sort of the spoiler.
[00:38:25] NLW: Like people loved him just freestyle talking about why he thought they had a better headset than Apple. But even if they didn't agree, they just loved it. He was going direct to camera and talking about it for, you know, five minutes or whatever. So that, that's a fascinating shift that I don't think anyone had on their bingo card, you know, whatever, two years ago.
[00:38:41] NLW: Yeah. Yeah,
[00:38:42] swyx: we still
[00:38:43] Alessio: didn't see and fight Elon though, so
[00:38:45] swyx: that's what I'm really looking forward to. I mean, hey, don't, don't, don't write it off, you know, maybe just these things take a while to happen. But we need to see and fight in the Coliseum. No, I think you know, in terms of like self management, life leadership, I think he has, there's a lot of lessons to learn from him.
[00:38:59] swyx: You know he might, you know, you might kind of quibble with, like, the social impact of Facebook, but just himself as a in terms of personal growth and, and, you know, Per perseverance through like a lot of change and you know, everyone throwing stuff his way. I think there's a lot to say about like, to learn from, from Zuck, which is crazy 'cause he's my age.
[00:39:18] swyx: Yeah. Right.
[00:39:20] AI Engineer landscape - from baby AGIs to vertical Agents
[00:39:20] NLW: Awesome. Well, so, so one of the big things that I think you guys have, you know, distinct and, and unique insight into being where you are and what you work on is. You know, what developers are getting really excited about right now. And by that, I mean, on the one hand, certainly, you know, like startups who are actually kind of formalized and formed to startups, but also, you know, just in terms of like what people are spending their nights and weekends on what they're, you know, coming to hackathons to do.
[00:39:45] NLW: And, you know, I think it's a, it's a, it's, it's such a fascinating indicator for, for where things are headed. Like if you zoom back a year, right now was right when everyone was getting so, so excited about. AI agent stuff, right? Auto, GPT and baby a GI. And these things were like, if you dropped anything on YouTube about those, like instantly tens of thousands of views.
[00:40:07] NLW: I know because I had like a 50,000 view video, like the second day that I was doing the show on YouTube, you know, because I was talking about auto GPT. And so anyways, you know, obviously that's sort of not totally come to fruition yet, but what are some of the trends in what you guys are seeing in terms of people's, people's interest and, and, and what people are building?
[00:40:24] Alessio: I can start maybe with the agents part and then I know Shawn is doing a diffusion meetup tonight. There's a lot of, a lot of different things. The, the agent wave has been the most interesting kind of like dream to reality arc. So out of GPT, I think they went, From zero to like 125, 000 GitHub stars in six weeks, and then one year later, they have 150, 000 stars.
[00:40:49] Alessio: So there's kind of been a big plateau. I mean, you might say there are just not that many people that can start it. You know, everybody already started it. But the promise of, hey, I'll just give you a goal, and you do it. I think it's like, amazing to get people's imagination going. You know, they're like, oh, wow, this This is awesome.
[00:41:08] Alessio: Everybody, everybody can try this to do anything. But then as technologists, you're like, well, that's, that's just like not possible, you know, we would have like solved everything. And I think it takes a little bit to go from the promise and the hope that people show you to then try it yourself and going back to say, okay, this is not really working for me.
[00:41:28] Alessio: And David Wong from Adept, you know, they in our episode, he specifically said. We don't want to do a bottom up product. You know, we don't want something that everybody can just use and try because it's really hard to get it to be reliable. So we're seeing a lot of companies doing vertical agents that are narrow for a specific domain, and they're very good at something.
[00:41:49] Alessio: Mike Conover, who was at Databricks before, is also a friend of Latentspace. He's doing this new company called BrightWave doing AI agents for financial research, and that's it, you know, and they're doing very well. There are other companies doing it in security, doing it in compliance, doing it in legal.
[00:42:08] Alessio: All of these things that like, people, nobody just wakes up and say, Oh, I cannot wait to go on AutoGPD and ask it to do a compliance review of my thing. You know, just not what inspires people. So I think the gap on the developer side has been the more bottom sub hacker mentality is trying to build this like very Generic agents that can do a lot of open ended tasks.
[00:42:30] Alessio: And then the more business side of things is like, Hey, If I want to raise my next round, I can not just like sit around the mess, mess around with like super generic stuff. I need to find a use case that really works. And I think that that is worth for, for a lot of folks in parallel, you have a lot of companies doing evals.
[00:42:47] Alessio: There are dozens of them that just want to help you measure how good your models are doing. Again, if you build evals, you need to also have a restrained surface area to actually figure out whether or not it's good, right? Because you cannot eval anything on everything under the sun. So that's another category where I've seen from the startup pitches that I've seen, there's a lot of interest in, in the enterprise.
[00:43:11] Alessio: It's just like really. Fragmented because the production use cases are just coming like now, you know, there are not a lot of long established ones to, to test against. And so does it, that's kind of on the virtual agents and then the robotic side it's probably been the thing that surprised me the most at NVIDIA GTC, the amount of robots that were there that were just like robots everywhere.
[00:43:33] Alessio: Like, both in the keynote and then on the show floor, you would have Boston Dynamics dogs running around. There was, like, this, like fox robot that had, like, a virtual face that, like, talked to you and, like, moved in real time. There were industrial robots. NVIDIA did a big push on their own Omniverse thing, which is, like, this Digital twin of whatever environments you're in that you can use to train the robots agents.
[00:43:57] Alessio: So that kind of takes people back to the reinforcement learning days, but yeah, agents, people want them, you know, people want them. I give a talk about the, the rise of the full stack employees and kind of this future, the same way full stack engineers kind of work across the stack. In the future, every employee is going to interact with every part of the organization through agents and AI enabled tooling.
[00:44:17] Alessio: This is happening. It just needs to be a lot more narrow than maybe the first approach that we took, which is just put a string in AutoGPT and pray. But yeah, there's a lot of super interesting stuff going on.
[00:44:27] swyx: Yeah. Well, he Let's recover a lot of stuff there. I'll separate the robotics piece because I feel like that's so different from the software world.
[00:44:34] swyx: But yeah, we do talk to a lot of engineers and you know, that this is our sort of bread and butter. And I do agree that vertical agents have worked out a lot better than the horizontal ones. I think all You know, the point I'll make here is just the reason AutoGPT and maybe AGI, you know, it's in the name, like they were promising AGI.
[00:44:53] swyx: But I think people are discovering that you cannot engineer your way to AGI. It has to be done at the model level and all these engineering, prompt engineering hacks on top of it weren't really going to get us there in a meaningful way without much further, you know, improvements in the models. I would say, I'll go so far as to say, even Devin, which is, I would, I think the most advanced agent that we've ever seen, still requires a lot of engineering and still probably falls apart a lot in terms of, like, practical usage.
[00:45:22] swyx: Or it's just, Way too slow and expensive for, you know, what it's, what it's promised compared to the video. So yeah, that's, that's what, that's what happened with agents from, from last year. But I, I do, I do see, like, vertical agents being very popular and, and sometimes you, like, I think the word agent might even be overused sometimes.
[00:45:38] swyx: Like, people don't really care whether or not you call it an AI agent, right? Like, does it replace boring menial tasks that I do That I might hire a human to do, or that the human who is hired to do it, like, actually doesn't really want to do. And I think there's absolutely ways in sort of a vertical context that you can actually go after very routine tasks that can be scaled out to a lot of, you know, AI assistants.
[00:46:01] swyx: So, so yeah, I mean, and I would, I would sort of basically plus one what let's just sit there. I think it's, it's very, very promising and I think more people should work on it, not less. Like there's not enough people. Like, we, like, this should be the, the, the main thrust of the AI engineer is to look out, look for use cases and, and go to a production with them instead of just always working on some AGI promising thing that never arrives.
[00:46:21] swyx: I,
[00:46:22] NLW: I, I can only add that so I've been fiercely making tutorials behind the scenes around basically everything you can imagine with AI. We've probably done, we've done about 300 tutorials over the last couple of months. And the verticalized anything, right, like this is a solution for your particular job or role, even if it's way less interesting or kind of sexy, it's like so radically more useful to people in terms of intersecting with how, like those are the ways that people are actually.
[00:46:50] NLW: Adopting AI in a lot of cases is just a, a, a thing that I do over and over again. By the way, I think that's the same way that even the generalized models are getting adopted. You know, it's like, I use midjourney for lots of stuff, but the main thing I use it for is YouTube thumbnails every day. Like day in, day out, I will always do a YouTube thumbnail, you know, or two with, with Midjourney, right?
[00:47:09] NLW: And it's like you can, you can start to extrapolate that across a lot of things and all of a sudden, you know, a AI doesn't. It looks revolutionary because of a million small changes rather than one sort of big dramatic change. And I think that the verticalization of agents is sort of a great example of how that's
[00:47:26] swyx: going to play out too.
[00:47:28] Adept episode - Screen Multimodality
[00:47:28] swyx: So I'll have one caveat here, which is I think that Because multi modal models are now commonplace, like Cloud, Gemini, OpenAI, all very very easily multi modal, Apple's easily multi modal, all this stuff. There is a switch for agents for sort of general desktop browsing that I think people so much for joining us today, and we'll see you in the next video.
[00:48:04] swyx: Version of the the agent where they're not specifically taking in text or anything They're just watching your screen just like someone else would and and I'm piloting it by vision And you know in the the episode with David that we'll have dropped by the time that this this airs I think I think that is the promise of adept and that is a promise of what a lot of these sort of desktop agents Are and that is the more general purpose system That could be as big as the browser, the operating system, like, people really want to build that foundational piece of software in AI.
[00:48:38] swyx: And I would see, like, the potential there for desktop agents being that, that you can have sort of self driving computers. You know, don't write the horizontal piece out. I just think we took a while to get there.
[00:48:48] NLW: What else are you guys seeing that's interesting to you? I'm looking at your notes and I see a ton of categories.
[00:48:54] Top Model Research from January Recap
[00:48:54] swyx: Yeah so I'll take the next two as like as one category, which is basically alternative architectures, right? The two main things that everyone following AI kind of knows now is, one, the diffusion architecture, and two, the let's just say the, Decoder only transformer architecture that is popularized by GPT.
[00:49:12] swyx: You can read, you can look on YouTube for thousands and thousands of tutorials on each of those things. What we are talking about here is what's next, what people are researching, and what could be on the horizon that takes the place of those other two things. So first of all, we'll talk about transformer architectures and then diffusion.
[00:49:25] swyx: So transformers the, the two leading candidates are effectively RWKV and the state space models the most recent one of which is Mamba, but there's others like the Stripe, ENA, and the S four H three stuff coming out of hazy research at Stanford. And all of those are non quadratic language models that scale the promise to scale a lot better than the, the traditional transformer.
[00:49:47] swyx: That this might be too theoretical for most people right now, but it's, it's gonna be. It's gonna come out in weird ways, where, imagine if like, Right now the talk of the town is that Claude and Gemini have a million tokens of context and like whoa You can put in like, you know, two hours of video now, okay But like what if you put what if we could like throw in, you know, two hundred thousand hours of video?
[00:50:09] swyx: Like how does that change your usage of AI? What if you could throw in the entire genetic sequence of a human and like synthesize new drugs. Like, well, how does that change things? Like, we don't know because we haven't had access to this capability being so cheap before. And that's the ultimate promise of these two models.
[00:50:28] swyx: They're not there yet but we're seeing very, very good progress. RWKV and Mamba are probably the, like, the two leading examples, both of which are open source that you can try them today and and have a lot of progress there. And the, the, the main thing I'll highlight for audio e KV is that at, at the seven B level, they seem to have beat LAMA two in all benchmarks that matter at the same size for the same amount of training as an open source model.
[00:50:51] swyx: So that's exciting. You know, they're there, they're seven B now. They're not at seven tb. We don't know if it'll. And then the other thing is diffusion. Diffusions and transformers are are kind of on the collision course. The original stable diffusion already used transformers in in parts of its architecture.
[00:51:06] swyx: It seems that transformers are eating more and more of those layers particularly the sort of VAE layer. So that's, the Diffusion Transformer is what Sora is built on. The guy who wrote the Diffusion Transformer paper, Bill Pebbles, is, Bill Pebbles is the lead tech guy on Sora. So you'll just see a lot more Diffusion Transformer stuff going on.
[00:51:25] swyx: But there's, there's more sort of experimentation with diffusion. I'm holding a meetup actually here in San Francisco that's gonna be like the state of diffusion, which I'm pretty excited about. Stability's doing a lot of good work. And if you look at the, the architecture of how they're creating Stable Diffusion 3, Hourglass Diffusion, and the inconsistency models, or SDXL Turbo.
[00:51:45] swyx: All of these are, like, very, very interesting innovations on, like, the original idea of what Stable Diffusion was. So if you think that it is expensive to create or slow to create Stable Diffusion or an AI generated art, you are not up to date with the latest models. If you think it is hard to create text and images, you are not up to date with the latest models.
[00:52:02] swyx: And people still are kind of far behind. The last piece of which is the wildcard I always kind of hold out, which is text diffusion. So Instead of using autogenerative or autoregressive transformers, can you use text to diffuse? So you can use diffusion models to diffuse and create entire chunks of text all at once instead of token by token.
[00:52:22] swyx: And that is something that Midjourney confirmed today, because it was only rumored the past few months. But they confirmed today that they were looking into. So all those things are like very exciting new model architectures that are, Maybe something that we'll, you'll see in production two to three years from now.
[00:52:37] swyx: So the couple of the trends
[00:52:38] NLW: that I want to just get your takes on, because they're sort of something that, that seems like they're coming up are one sort of these, these wearable, you know, kind of passive AI experiences where they're absorbing a lot of what's going on around you and then, and then kind of bringing things back.
[00:52:53] NLW: And then the, the other one that I, that I wanted to see if you guys had thoughts on were sort of this next generation of chip companies. Obviously there's a huge amount of emphasis. On on hardware and silicon and, and, and different ways of doing things, but, you know, love your take on, on either or both of
[00:53:07] swyx: those.
[00:53:08] AI Wearables
[00:53:08] swyx: So for so wearables, I'm very excited about it. I want wearables on me at all times. I have two right here. To, to quantify my health. And I, you know, I'm all for them. But society is not ready for wearables, right? Like, no one's comfortable with a device on recording every single conversation we have.
[00:53:24] swyx: Even all three of us here as podcasters, we don't record everything that we say. And I think there's a social shift that needs to happen. I am an investor in TAB. They are renaming to a broader vision, but they are one of the three or four leading wearables in this space. It's sort of the AI pendants, or AI OS, or AI personal companion space.
[00:53:47] swyx: I have seen two humanes in the wild in San Francisco. I'm very, very excited to report that there are people walking around with those things on their chest and it is as goofy as it sounds. It, it absolutely is going to fail. God bless them for trying. And I've also bought a rabbit. So I'm, I'm very excited for all those things to arrive.
[00:54:06] swyx: But yeah people are very keen on hardware. I think the, the, the idea that you can have physical objects that. Embody an AI that do specific things for you is as old as, you know, the sort of Golem in sort of medieval times in terms of like how much we want our objects to be smart and do things for us.
[00:54:27] swyx: And I think it's absolutely a great play. The funny thing is people are much more willing to pay you upfront for a hardware device than they are willing to pay like an 8 a month subscription recurring for software, right? And so the interesting economics of these wearable companies is they have negative float.
[00:54:47] swyx: In the sense that people pay deposits upfront, like I paid like, I don't know, 200 bucks for the rabbit. Upfront, and I don't get it for another six months. I paid 600 for the tab, and I don't get it for another six months. And, and then, then they can take that money and, and sort of invest it in like their next, the next events or their next properties or ventures.
[00:55:06] swyx: And like, I think that's a, that's a very interesting reversal of economics from other types of AI companies that I see. And I think, yeah, just the, the, the tactile feel of an AI, I think is very promising. I, Alex, I don't know if you have other thoughts on, on the wearable stuff.
[00:55:21] Alessio: The open interpreter just announced their product four hours ago.
[00:55:25] Alessio: Yeah. Which is a, it's not really a wearable, but it's a, it's still like a physical device.
[00:55:30] swyx: It's a push to talk mic to, to a device on your, on your laptop. Right. It's a $99 push talk. Yeah.
[00:55:38] Alessio: But, but, but everybody, but again, going back to your point, it's like people want to, people are interested in spending money for like things that they can hold, you know, I don't know what that means overall for like where things are going, but making more of this AI be a physical part of your life.
[00:55:54] Alessio: I think people are interested in that, but I agree with Shawn. I mean, I've been. I talked to Avi about this, but Avi's point is like, most consumers, like, care about utility more than they care about privacy, you know, like you've seen with social media. But I also think there's a big societal reaction to AI that is, like, much more rooted than the social media one.
[00:56:16] Alessio: But we'll see. But a lot, again, a lot of work, a lot of developers, a lot of money going into it. So there's, there's bound to be experiments being run. On, on the
[00:56:25] swyx: chip side. Sorry, I'll just ship it one more thing and then we transition to the chips. The thing I'll caution people on is don't overly focus on the form factor.
[00:56:33] swyx: The form factor is a delivery mode. There will be many form factors. It doesn't matter so much as where in the data war does it sit. It actually is context acquisition. Because, and maybe a little bit of multi modality. Context, like, context is king. Like, if you have access to data that no one else has, then you will be able to create AI that no one else can create.
[00:56:54] swyx: And so what is the most personal context? It is your everyday conversation. It is as close to mapping your mental train of thought As possible without, you know, physically you writing down notes. So, so that is the promise, the ultimate goal here, which is like, personal context, it's always available on you you know, loading and seeing all that stuff.
[00:57:12] swyx: But yeah, that's the, that's the frame I want to give people that the form factors will change and there will be multiple form factors, but it's the software behind that. And in the personal context that you cannot get anywhere else, that'll win.
[00:57:24] Alessio: Yeah, so that was wearables.
[00:57:26] Groq vs Nvidia month - GPU Chip War
[00:57:26] Alessio: On the chip side, yeah, Grok was probably the biggest release.
[00:57:29] Alessio: Jonathan, well, it's not even a new release because the company, I think, was started in 2016. So it's actually quite old. But now recently captured the people's imagination with their MixedREL 500 tokens a second demo. Yeah, I think so far the battle on the GPU side has been Either you go kind of like massive chip, like the Cerebros of the world, where one chip from Cerebros is about two million dollars, you know, that's compared, obviously, you cannot compare one chip versus one chip, but h100 is like 40, 000, something like that the problem with those architectures has been They want to be very general, you know, but like they wanted to put a lot of the RAM, the SRAM on the chip.
[00:58:13] Alessio: It's much more convenient when you're using larger language models, but the models outpace the size of the chips and chips have a much longer, you know, turnaround cycle. Grok today. It's great for the current architecture. It's a lot more expensive also, as far as dollar per flop but their idea is like, hey, when you have very high concurrency, we actually were much cheaper, you know, you shouldn't just be looking at the compute power for most people, this doesn't really matter, you know, like, I think that's like the most the most interesting thing to me is like, We've now gone back with, with AI to a world where developers care about what hardware is running, which was not the case in traditional software for like, maybe 20 years since as the cloud has gotten really big.
[00:58:57] Alessio: My, my thinking is that in the next two, three years, like we're going to go back to that. We're like, people are not going to be sweating. Oh, what GPU do you have in your cloud? What do you have? It's like. Yeah, you want to run this model, we can run it at the same speed as everybody else, and then everybody will make different choices, whether they want to have higher front end capital investment, and then better utilization, some people would rather do lower investment before, and then upgrade later, there are a lot of parameters and then there's the dark horses, right, that is some of the smaller companies like Lemurian Labs, MedEx that are working on maybe not a chip alone, but also like some of the, the actual math infrastructure and the instructions on it that make them run.
[00:59:40] Alessio: There's a lot going on, but yeah, I think the, the episode with with Dylan will be interesting for, for people, but I think we also came out of it saying, Hey, everybody has pros and cons. There's no, it's different than the models where you're like, Oh, this one is definitely better for me. And I'm going to use it.
[00:59:56] Alessio: I think for most people. It's like fun Twitter memeing, you know, but it's like 99 percent of people that tweet about this stuff are never gonna buy any of these chips anyway. It's, it's really more for entertainment.
[01:00:10] swyx: No. Wow. I mean, like, this is serious business here, right? You're talking about, you know, like who, like the potential new Nvidia, if anyone can take like 1% of NVIDIA's business, they're a serious startup that you should look at.
[01:00:20] swyx: Right? So , that's, that's, that's my, well, yeah,
[01:00:23] Alessio: yeah. On matters. Well, I'm more talking about like, what, how should people think about it? You know? It's like, yeah. I think like the, the end user is not impacted as much.
[01:00:31] Disagreements
[01:00:31] Alessio: This is obviously, so
[01:00:32] swyx: I disagree. Yeah, I love disagreements because, you know, who likes a podcast where all three people always agree with each other?
[01:00:38] swyx: You will see the impact of this in the tokens per second over time. This year, I have very, very credible sources all telling me that the average tokens per second, right now, we have somewhere between 50 to 100 as like the norm for people. Average tokens per second will go to 500 to 2, 000. This year from, from a number of chip suppliers that I cannot name.
[01:00:58] swyx: So like that is, that is, that will cause a step change in the use cases. Every time you have an order of magnitude improvement in the, in the speed of something, you unlock new use cases that become fun instead of a chore. And so that's what I would caution this audience to think about, which is like, what can you do in much higher AI speed?
[01:01:17] swyx: It's not just things streaming out faster. It is things working in the background a lot more seamlessly and therefore being a lot more useful. Then previously imagined. So that would be my two cents on.
[01:01:30] Alessio: Yeah. Yeah. I mean, the, the new NVIDIA chips are also much faster. To me, that's true. When it comes to startups, it's like, are the startups pushing the performance on the incumbents or are the incumbents still leading?
[01:01:44] Alessio: And then the startups are like riding the same wave, you know? I don't have yet a good sense of that. It's like, you know, it's next year's NVIDIA release. Just gonna be better than everything that gets released this year, you know, if that's the case, it's like, okay, damn Jensen, you know, it's like the meme.
[01:02:00] Alessio: It's like, I'm gonna fight. I'm gonna fight NVIDIA. It's like, damn, Jensen got hands. He really does.
[01:02:08] Summer 2024 Predictions
[01:02:08] NLW: Well, awesome conversation, guys. I guess just just by way of wrapping up, I call it over the next three months between now and sort of the beginning of summer was one prediction that each of you has. It can be about anything. It can be a big company. It can be a startup. It can be something you have privileged information that you know, and you just won't tell us that you actually
[01:02:25] Alessio: know.
[01:02:26] Alessio: What, does it have to be something that we think it's going to be true or like something that we think? Because for me, it's like, is Sundar going to be the CEO of Google? Maybe not in three months, maybe in like six months, nine months, you know, people are like, Oh, maybe Demis is going to be the new CEO.
[01:02:41] Alessio: That was kind of like, I, I was busy like fishing some deep mind people and Google people for like a good guest for the pod. And I was like, Oh, what about. Jeff Dean, and they're like, well, Demis is really like the person that runs everything anyway, and the stuff. It's like interesting. And
[01:02:57] swyx: so I don't know.
[01:02:58] swyx: What about Sergei? Sergei Sergei could come back. I don't know. Like he's making more appearances these days.
[01:03:03] Alessio: Yeah. I don't, I I Then we can just put it as like, you know. Yeah. My, my thing is like CEO change potential, but I, again, three months is too short to make a prediction. Yeah. I
[01:03:16] NLW: think that's the, that's that's fine.
[01:03:18] NLW: The, the timescale might be off.
[01:03:22] swyx: Yeah. I mean for me, I, I think the. Progression in vertical agent companies will keep going. We just had, the other day, Klarna talking about how they replaced like 700 of their customer support agents with the AI agents. That's just the beginning, guys. Like, imagine this rolling out across most of the Fortune 500.
[01:03:43] swyx: This is, and I'm not saying this is like a utopian scenario, there will be very, very embarrassing and bad outcomes of this, where like, humans would never make this mistake, but AIs did, and like, we'll all laugh at it, or we'll be very offended by whatever, you know, bad outcome it did. So we have to be responsible and careful in the rollout, but yeah, this is, it's rolling out, you know, Alessio likes to say that this year's the year of AI in production.
[01:04:04] swyx: Let's see it, let's, let's see all these sort of vertical, full stack employees. Come out into the workforce. Love
[01:04:11] Alessio: it.
[01:04:11] NLW: All right, guys. Well, thank you so much for for sharing your your thoughts and insights here And I can't wait to do it again
[01:04:18] Thursday Nights in AI - swyx
[01:04:18] NLW: Welcome
[01:04:19] swyx: back again. It's Charlie your AI co host We're now in part two of the special weekend episode collating some of SWIX and Alessio's recent appearances If you're not active in the Latentspace Discord, you might not be aware of the many, many, many in person.
[01:04:36] swyx: Events we host gathering our listener community all over the world. You can see the Latentspace community page for how to join and subscribe to our event calendar for future meetups. We're going to share some of our recent live appearances in this next part, starting with the Thursday nights in AI meetup, a regular fixture in the SF AI scene run by Imbue and Outset Capital.
[01:04:59] swyx: Primarily, our former guest, Kanjin Q, Ali Rhoda, and Josh Albrecht. Here's Swyx.
[01:05:08] swyx: Today, for those of you who have been here before, you know the general format. So we'll do a quick fireside Q& A with Swyx. Swyx, where we're asking him the questions. Then we'll actually go to our rapid fire Q& A, where we're asking really fast, hopefully, spicy questions. And then we'll open it up to the audience for your questions.
[01:05:25] swyx: So you guys sneak around the room, submit your questions, and we'll go through as many of them as possible during that period. And then actually, Swyx brought a gift for us, which is two Latentspace t shirts. AI Engineer. AI Engineer t shirts. And those will be awarded to the Two spiciest question askers.
[01:05:44] swyx: So and I'll let Josh decide on that. So if we want to get your spiciest takes, please send them in during the event as we're talking and then also at the end. All right. With that, let's get going.
[01:05:57] NLW: Okay. Welcome, Swyx. Thank you for that
[01:06:01] swyx: intro.
[01:06:01] NLW: How does it
[01:06:01] swyx: feel to be interviewed
[01:06:03] NLW: rather than the interviewer?
[01:06:04] swyx: Weird. I don't know what to do in this chair. Yeah. Like,
[01:06:07] NLW: where should I put my hands? Yeah, exactly. You look good.
[01:06:10] swyx: You look good. And I also love asking follow up questions. And I tend to, like, sort of take over panels a lot. If you ever see me on a panel, I tend to ask the other panelists questions.
[01:06:18] swyx: Okay.
[01:06:19] NLW: So we should be ready is what you're saying. So you back.
[01:06:21] swyx: That's fine. This is like a free MBU interview, so why not? That's right. That's right. That's
[01:06:24] NLW: right.
[01:06:25] swyx: Yeah, so you interviewed Ken Jeon, the CEO you didn't interview Josh, right? No, no. So maybe tonight. Yeah. Okay. We'll see. We'll look for different questions and look for an alignment.
[01:06:35] NLW: I love it. All
[01:06:36] swyx: right. I just want to hear this story. You know, you've completely exploded LatentSpace and AI Engineer, and I know you also, before all of that, had exploded in popularity for your learning in public movement and your DevTools work. And devrelations work. So, who are you and how did you get here?
[01:06:53] swyx: Let's
[01:06:53] NLW: start with that.
[01:06:54] swyx: Quick story is, I'm Shawn, I'm from Singapore. Swyx is my initials. For those who don't know, A lot of Singaporeans are ethically Chinese, and we have Chinese names and English names. So, it's just it's just my initials. Came to col came to the US for college, and have been here for about 15 years, but most, like half of that was in finance and then the other half was, was in tech.
[01:07:13] swyx: And the, and tech is where I was most known just because I realized that I was much more aligned towards learning in public, whereas in finance, Everything's a trade secret. Everything is zero sum. Whereas in tech, like, you're allowed to come to meetups and conferences and share your learnings and share your mistakes even.
[01:07:31] swyx: And that's totally fine. You, like, open source your code. It's totally fine. And even, even better, you, like, contribute PRs to other people's code, which is even better. And I found that I thrived in that. Learning public environments and that, that kind of got me started. I was an early hire, early Draft Relations hire at Netlify and then did the same at AWS Temporal and Airbyte.
[01:07:53] swyx: And then, and so that, that's like the whole story. I can talk, talk more about like developer tooling and developer relations if, if that's something that people are interested in. But I think the, the more recent thing is AI. And I started really being interested in it mostly because It, it, the, the approximate cause of starting Leanspace was stable diffusion.
[01:08:10] swyx: When you could run a large model that could do sufficiently enough on your, on your desktop. Where I was like, okay, like, this is, Something qualitatively very different. And that's then we started late in space and you're like, this is something different. We have to talk about it on a podcast.
[01:08:25] swyx: There we go. Yeah. It wasn't, it wasn't a podcast for like four months. And then, and then I had been running a discord for dev tools investors. 'cause I, I also invest in dev tools and I advise companies on deaf tools, def things. And I think it was the start of 2023 when Alessio and I were both like, you know, I think we, we need to like get more tokens out of.
[01:08:45] swyx: People, and I was running out of original sources to, to write about, so I was like, okay, I'll go get those original sources. And I think that, that's when we started the podcast. And I think it's just the chemistry between us, the, the way we spike in different ways. And also, like, honestly, the kind participation of the guests to give us their time.
[01:09:03] swyx: Like, you know, like, getting George Hoss was a big deal. And also shout out to Alessio for just cold emailing him for, for, for booking the, booking some of our biggest guests. And I'm just working really hard to try to tell the story that people can use at work. I think that there's a lot of AI podcasts out there and a lot of AI kind of forums or fireside chats with no fire.
[01:09:21] swyx: That always talk about age, like what's your AGI timeline, what's your PDoom. Very, very nice hallway conversations for freshman year but not very useful for work. And like, you know, practically like making money and like And thinking about, you know, changing the everyday lives. I think what's interesting is obviously you care about the existential safety of the human race.
[01:09:43] swyx: But in the meantime we gotta eat. So so I think that's like kind of latent space's niche. Like we explicitly don't really talk about AGI. We explicitly don't talk about Things that we're, like, a little bit too far out. Like, we don't do a ton of robotics. We don't do a ton of, like, high frequency trading.
[01:10:00] swyx: There's tons of machine learning in there, but we just don't do that. Because, like, we're like, all right, what are most software engineers gonna, gonna need? Because that's our background, and that's the audience that we serve. And I think just, like, being really clear on that audience has been, has resonated with people.
[01:10:12] swyx: Yeah, you would never expect a technical podcast to reach, like, a general audience, like, Top ten on the tech charts but I, you know, I've been surprised by that before and it's been successful. I don't know, I don't know what to say about that. I think honestly, I, I kind of have this like negative reaction towards being, being, being, being, being classified as a podcast because the podcast is downstream of ideas.
[01:10:35] swyx: And it's one mode of conversation, it's one mode of idea delivery, but you can deliver ideas on a newsletter, in person like this there's so many different ways. And so I think, I think about it more as we are trying to start or serve an industry, and that industry is the AI engineer industry, which is, which we can talk about more.
[01:10:53] swyx: Yes, let's go into that. So the AI engineer, you penned a piece called The Rise of the AI Engineer, you tweeted about it, Andrej Karpathy also responded, largely agreeing with what you said. What is an AI engineer? The AI engineer is the software engineer building with AI, enhanced by AI, And eventually it will be non human engineers writing code for you, Which I know MBU is all about.
[01:11:18] swyx: You're saying eventually the AI engineer will become a non human engineer? That will be one kind of AI engineer that people are trying to build, And is probably the most furthest away in terms of being reality. Because it's so hard. Got it. But, but there are three types of AI engineer and I just went through the three.
[01:11:33] swyx: One is AI enhanced where you like use AI products like Copilot and Cursor. And two is AI products engineer where you use the exposed AI capabilities to the end user As a software engineer, like, not doing pre training not being an ML researcher, not being an ML engineer, but just interacting with foundation models and probably APIs from foundation model labs.
[01:11:54] swyx: What's the third one? And the third one is the non human AI engineer. Got it. The fully autonomous AI engineer. Dream, you know, Coder. How long do you think it is till we get to, like, early, early versions? This is my equivalent of AGI timelines. I know, I know. You can set yourself up for this. So like, lots of active, like, I mean, I have, I have supported companies actively working on that.
[01:12:13] swyx: I think it's more useful to think about levels of autonomy. And so my answer to that is, you know, perpetually five years away until until it figures it out. No, but my actual anecdote the closest comparison we have to that is self driving. We are, we're doing this in San Francisco for those who are watching the live stream.
[01:12:32] swyx: If you haven't come to San Francisco and seen, and taken a Waymo ride just come, get a friend take a Waymo ride. I remember 2014 we covered a little bit of autos in, in my hedge fund. And I was, I remember telling a friend, I was like, self driving cars around the corner, like, this is it, like, you know, parking will be, like, parking will be a thing of the past and it didn't happen for the next 10 years.
[01:12:52] swyx: And, and, but now we, now, like, most of us in San Francisco can, can take it for granted. So I think, like, you just have to be mindful that the, the, the, the rough edges take a long time. And like, yes, it's going to work in demos, then it's going to work a little bit further out and it's just going to take a long time.
[01:13:08] swyx: The more useful mental model I have is sort of levels of autonomy. So in self driving, you have level 1, 2, 3, 4, 5 just the amount of human attention that you get. At first, like, your, your, your hands are always on 10 and 2 and you have to pay attention to the, to, to the driving every 30 seconds and eventually you can sleep in the car, right?
[01:13:25] swyx: So there's a whole spectrum of that. So what's the equivalent for that for, for coding? Keep your hands on the keyboard and then eventually you've kind of gone off. You tab to accept everything. Where are we? Oh, that's good, yeah. Yeah. Doesn't that already happen? Yeah. Approve the PR. Approve, this looks good.
[01:13:39] swyx: That's the dream that people want. It gives, it gives, really you unlock a lot of coding when people, non technical people can file issues, and then the AI engineer can sort of automatically write code, pass your tests, and if it, if it kind of works as, as, as intended. As, as advertised then you can just kind of merge it and then you, you know, 10x, 100x the number of developers in your company immediately.
[01:14:00] swyx: So that's the goal, that's the, that's the holy grail. We're not there yet but Sweep, CodeGen, there's a bunch of companies, Magic probably, are, are all working towards that. And, and so I so the TLDR, like the, the thing that we covered Alessio and I covered in the January recap that we did was that the, the basic split that people should have in their minds is the inner loop versus the outer loop for the developer.
[01:14:21] swyx: Inner loop is everything that happens in your IDE between Git commits. And outer loop is happens, is what happens when you push up your Git commit to GitHub, for example, or GitLab. And that's a nice split, which means like everything local, everything that needs to be fast is for everything that's kind of very hands on for developers.
[01:14:37] swyx: It's probably easier to automate or easier to have code assistance. That's what Copilot is, that's what, that's what all those things are. And then everything that happens autonomously when you're effectively away from the keyboard with like a GitHub issue or something that is more outer loop where you're you know, you're relying a lot more on autonomy and we are maybe, our LLMs are maybe not smart enough to do that yet.
[01:14:57] Alessio: Do you have any thoughts on
[01:14:58] swyx: kind of
[01:14:58] Alessio: the user experience and how that will change? One of the things
[01:15:01] swyx: that has happened for me, kind of looking at some of these products and playing around with things ourselves, like, You know, it sounds good to have an automated PR, then you get an automated PR and you're like, I really don't want to review like 300 lines of generated code, and like find the bug in it.
[01:15:13] swyx: Well then you have another agent that's a reviewer. That's right, but then you like tell it like, Oh, go fix it, and it comes back with 400 lines. Yes, there is a length bias to code, right? And you do have higher passing rates. In PRs. This is a documented human behavior thing, right? Send me two lines of code, I will review the s**t out of that.
[01:15:33] swyx: I don't know if I can swear on this. Send me, send me 200 lines of code, looks good to me. Right? Guess what? The, the agents are going to, perfectly happy to modify, to copy that behavior from us. When we actually want them to do the opposite. So, yeah, I, I think that the GAN model of code generation is probably not going to work super well.
[01:15:50] swyx: I do think we probably need just better planning from the start. Which is, I'm just repeating the MBU thesis by the way. Just go listen to Kanjin talk about this. She's much better at it than I am. But yeah, I think I think the code review thing is going to be I think that what Codium, there are two Codiums, the Israeli one.
[01:16:10] swyx: The Israeli Codium. With the E. Yeah, Codium with the E. They still have refused to rename. I'm friends with both of them. Every month I'm like, You're like, guys, let's
[01:16:18] NLW: all come to one room. Yeah,
[01:16:19] swyx: like, you know, someone's got to fold. Codium with the E has gone, like, you've got to write the test first. Right?
[01:16:25] swyx: You write the, you write the it's like a sort of tripartite relationship. Again, this was also covered on a podcast with them, which is fantastic. Like, you interview me, you sort of through me, you interview. Like, the past avatars I've been watching the Netflix show, by the way, it's fantastic. But like, so so Codium is like, they've already thought this all the way through.
[01:16:41] swyx: They're like, okay, you write the user story, from the user story you generate all the tests, you also generate the code and you update any one of those, they all have to update together. Right? So like, once the, and, and probably the critical factor is the test generation from the story. Because everything else can just kind of bounce the heads off of those things until they pass.
[01:17:01] swyx: So you have to write good tests. It's kind of like the eat your vegetables of coding, right? Which nobody really wants to do. And so I think it's a really smart tactic to go to market by saying we automatically generate tests for you and, you know, start not great, but then get better. And eventually you get to the weakest point in the chain for the entire loop of code generation.
[01:17:25] swyx: What do you think the weakest link is? The weakest link? Yeah. It's text generation. Yeah. Yeah. Do you think there's a way to, like, are there some promising
[01:17:33] Alessio: avenues you see forward for making that actually better?
[01:17:38] swyx: For making it better. You have to have, like, good isolation, and I think proper serverless cloud environments is integral to that.
[01:17:48] swyx: I, it could be like a fly. io. It could be like a Cloudflare worker. It depends how much, how many resources your test environment needs. And effectively I was talking about this, I think with maybe Rob earlier in the audience, where every agent needs a sandbox. If you're a code agent, you need a coding sandbox, but if you're whatever, like MBU used to have this, like, sort of Minefield, Minecraft's clone that was much faster.
[01:18:12] swyx: If, if you, if you have a model of the real world, you have to go, you have to go generate some plan or some code or some whatever, test it against that real world so that you can get this iterative feedback and then get the final result back that is somewhat validated against the real world. And so, like, you need a really good sandbox.
[01:18:26] swyx: I don't think people, I, I think this is, this is a, this is an infrastructure need that humans
[01:18:31] swyx + Josh Albrecht: have had for a long time. We've never solved it for ourselves. And now we have to solve it for humans. About a thousand times larger quantity of agents than, than, than actually exists. And, and so I, I, I think, like, we eventually have to involve, evolve a lot more infrastructure.
[01:18:45] swyx + Josh Albrecht: In order to serve these things. So yeah. So, for those who don't know, like I also have so, we're talking about the rise of AI engineer. I also have previous conversations about immutable infrastructure cloud environments and that kind of stuff. And this is all of a kind. Like, like, in order to solve agents and coding agents, we're going to have to solve the other stuff too along the way.
[01:19:05] swyx + Josh Albrecht: And it's really neat for me. To see all that tie together in my DevTools work that all these themes kind of reemerge just naturally, just because everything we needed for humans, we just need a hundred times more for, for for agents.
[01:19:17] Dylan Patel: Let's talk about the AI engineer. AI engineer has become a whole thing.
[01:19:21] Dylan Patel: It's become a term and also a conference. And tell us more, and a job title, tell us more about that. What's going on there?
[01:19:31] swyx + Josh Albrecht: That is like a very vague, a very, very big cloud of things. I would just say like, I think it's an emergent industry. I've seen this happen repeatedly for, so the general term is software engineer.
[01:19:44] swyx + Josh Albrecht: Programmer. In the 70s and 80s, there would not be like senior engineer. There would just be engineering. Like you, or you, I don't think they even call themselves engineer. They don't have that. What about a member of the technical staff? Oh, yeah, MTS. Very, very, very, very elite. But yeah, so like, you know, like these striations appear when the population grows and the technical depth grows over time.
[01:20:07] swyx + Josh Albrecht: Yeah. When it starts, when it ends. Not that, not that important, and then over time it's just gonna specialize. And I've seen this happen for frontend, for DevOps, for data and I can't remember what else I listed in, in that, in that piece, But those are the main three that I was around for. And I, I see this, I saw this happening for AI engineer which is effectively, now a lot of people are arguing that there is the ML researcher, the ML engineer, who sort of pairs with the researcher sometimes they also call research engineer and then on the other side of the fence is just software engineers.
[01:20:35] swyx + Josh Albrecht: And that's how it was up till about last year. And now there's this specializing and rising class of people building AI specific software that are not any of those previous titles that I just mentioned. And that's the thesis of the AI engineer, that this is an emerging category of AI. Startups of jobs I've had people from Meta, IBM, Microsoft, OpenAI tell me that they, their title is now AI engineer.
[01:20:58] swyx + Josh Albrecht: They're hiring AI engineers. So, like, I can see that this is a trend and I think that's what Andre called out in his post that, like, just mathematically, just the, just the limitations in terms of talent, research talents and GPUs, that all these will tend to concentrate in a, in a, in a, Few labs and everyone else are just going to have to rely on them or build differentiation of products in other ways And those will be AI engineers.
[01:21:21] swyx + Josh Albrecht: So mathematically there will be more AI engineers than ML engineers. It's just the truth. Right now it's the other way. Right now the number of AI engineers is maybe 10x less. So I think that the ratio will invert and you know I think the goal of the InSpace and the goal of the conference and anything else I do is to serve that
[01:21:38] Dylan Patel: growing audience.
[01:21:41] Dylan Patel: To make the distinction clear, if I'm a software engineer And I'm like, I want to become an AI engineer. What do I have to learn? Like, what additional capabilities does that type of engineer have? Funny you say that. I think you have a blog post on this very
[01:21:53] swyx + Josh Albrecht: topic. I don't actually have a specific blog post on how to, like, change classes.
[01:21:58] swyx + Josh Albrecht: I do think I always think about these in terms of yeah, Baldur's Gate and, you know D& D rule set number 5. 1 or whatever. But yeah, so I kind of intentionally left that open to leave space for others. I think when you start an industry, you need to the specifications that work the best in industries are So minimally defined so that other people can fill in the blanks.
[01:22:19] swyx + Josh Albrecht: And I want people to fill in the blanks. I want people to disagree with me and with with themselves so that we can figure this out as a, as a group. Like I don't want to overs specify everything, you know, like that that's, that's a way, that's the only way to guarantee it, that it will fail. Um, I do have a take obviously, 'cause a lot of people are, are asking me like, where to start.
[01:22:37] swyx + Josh Albrecht: And I think basically so what, what we have is latent Space University. We just finished working on day seven today. It's a seven day email course. Where basically, like, it, it is completely designed to answer the question of, like, okay, I'm a, I'm an existing software engineer, I, like, kind of, I know how to code but I don't get all this AI stuff, I've been living under a rock, or, like, it's just too overwhelming for me, you have to, like, pick for me, or curate for me as a, as a trusted friend.
[01:22:59] swyx + Josh Albrecht: And I have one hour a day for seven days. What, what, what do you do? slot in that, in that, in that bucket. So for us, it's making, making sort of LLM API, API calls. It's me, it's image generation, it's code generation, it's audio ASR, I, I think, what's, what's ASR? Audio speech recognition?
[01:23:18] swyx + Josh Albrecht: Yeah, yeah. And then I forget, I forget what the fifth and sixth one is, but the last day is agents. And, and so basically, like, I'm just like, you know, Here are seven projects that you should do to feel like you can do anything in AI. You can't really do everything in AI just from, just from that small list.
[01:23:34] swyx + Josh Albrecht: But I think it's just like, just like anything, you have to like, go through like a set list of, of things that are basic skills that I think everyone in this industry should have to be at least conversant in. If someone, if like a boss comes to you and goes like, hey, can we build this? You don't even know if the answer is no.
[01:23:52] swyx + Josh Albrecht: So I want you to move towards from like unknown unknowns to at least known unknowns. And I think that's, that's where you start being competent as an AI engineer. So, so yeah, that's LSU, Latent Space University, just to trigger the The Tigers.
[01:24:06] Dylan Patel: So do you think in the future that people, an AI engineer is going to be someone's full time job?
[01:24:10] Dylan Patel: Like people are just going to be AI engineers? Or do you think it's going to be more of a world where I'm a software engineer, and like, 20 percent of my time, I'm using open AIs, APIs, and I'm, Working on prompt engineering and stuff like that and using
[01:24:23] swyx + Josh Albrecht: CodePilot. You just reminded me of Day6's open source models and fine tuning.
[01:24:27] swyx + Josh Albrecht: Perfect. I think it will be a spectrum. That's why I don't want to be like too definitive about it. Like we have full time front end engineers and we have part time front end engineers and you dip into that community whenever you want. But wouldn't it be nice if there was a collective name for that community so you could go find it?
[01:24:40] swyx + Josh Albrecht: You can find each other. And, like, honestly, like, that's, that's really it. Like, a lot of people, a lot of companies were pinging me for, like, Hey, I want to hire this kind of person, but you can't hire that person, but I wanted someone like that. And then people on the labor side were, were pinging me going, like, Okay, I want to do more in this space, but where do I go?
[01:24:56] swyx + Josh Albrecht: And I think just having that shelling point of, of, of what an industry title and name is, and then sort of building out that. Mythology and community and conference I think is helpful, hopefully, and I don't have any prescriptions on whether or not it's a full time job. I do think, over time, it's going to become more of a full time job.
[01:25:14] swyx + Josh Albrecht: And that's great for the people who want to do that and the companies that want to employ that. But it's absolutely, like, you can take it part time, like, you know, jobs come in many formats. Yep, yep, that
[01:25:23] Dylan Patel: makes sense. Yeah. And then you have a huge world fair coming up. Yeah. Tell me about that. So,
[01:25:31] swyx + Josh Albrecht: Part of, I think, you know, What creating an industry requires is for, to let people gather in one place.
[01:25:37] swyx + Josh Albrecht: And also for me to get high quality talks out of people. You have to create an event out of it. Otherwise they don't do the work. So so last year we did the AI Engineer Summit, which went very well. And people can see that online and we're, we're, we're very happy with how that turned out.
[01:25:53] swyx + Josh Albrecht: This year we want to go four times bigger with the World's Fair and try to reflect AI engineering as it is in 2024. I always admired two conferences in, in this respect. One is NeurIPS, which I went to last year and, and documented on, on the pod, which was fantastic. And two, which is KubeCon from the other side of my life, which is the sort of cloud registration and, and DevOps world.
[01:26:18] swyx + Josh Albrecht: So NeurIPS is the one place that you go to, to, I think it's the top conference. I mean, there's, there's others that you can kind of consider. But, yeah so, so NeurIPS is, NeurIPS is where the research sciences are the stars. The researchers are the stars, PhDs are the stars, mostly it's just PhDs on the job market, to be honest.
[01:26:34] swyx + Josh Albrecht: It's really funny
[01:26:35] Dylan Patel: to go, especially these days. Yeah, it
[01:26:37] swyx + Josh Albrecht: was really funny to go to NeurIPS and go like, And the VCs trying to back them. Yeah, there are lots, lots of VCs trying to back them. Yeah, there This year. Anyway, so in Europe, research scientists are the stars. And for, I wanted for AI engineers, for engineers to be the star.
[01:26:51] swyx + Josh Albrecht: Right, to show off their tooling and their techniques and their difficulty moving all these ideas from research into production. The other one was KubeCon, where, You could honestly just go and not attend any of the talks and just walk the floor and figure out what's going on in DevOps, which is fantastic.
[01:27:10] swyx + Josh Albrecht: Because, yeah, so, so that curation and that bringing together of, of, of an industry is what I'm going for for the conference. And yeah, it's coming in June. The most important thing, to be honest, when I, like, conceived of this whole thing was to buy the domain. So we got AI. engineer. People are like, engineer is a domain?
[01:27:27] swyx + Josh Albrecht: Yeah, and funny enough, engineer was cheaper than engineering. I don't understand why, but like that's up to the domain people.
[01:27:36] Dylan Patel: Josh, any questions on agents?
[01:27:38] Alessio: Yeah,
[01:27:39] Dylan Patel: I think maybe, you know, you have a lot
[01:27:40] swyx + Josh Albrecht: of experience and exposure talking to all these companies and founders and researchers and everyone that's on your podcast.
[01:27:47] Dylan Patel: Do you have, do you feel like you have a
[01:27:50] swyx + Josh Albrecht: good kind of perspective on some of the things that, like, some of the kind of technical issues having seen? You know, like we were just talking about, like, for coding agents, like, oh, how, you know, the value of test is really important. There are other things, like, for, you know, retrieval, like now, You know, we have these models coming out with a million context, you know, or a million tokens of context length, or ten million, like, is retrieval going to
[01:28:10] Dylan Patel: matter anymore, like,
[01:28:11] swyx + Josh Albrecht: do
[01:28:11] Dylan Patel: huge contexts matter, like,
[01:28:13] swyx + Josh Albrecht: what do you think?
[01:28:14] swyx + Josh Albrecht: Specifically about the long context thing? Sure, yeah. Because you asked a more broad question. I was going to ask a few other ones after that, so go for that one first. Yeah. That's what I was going to ask first. We can ask, yeah, okay, let's talk about long context and then the other stuff. So, for those who don't know, LongContext was kind of in the air last year, but really, really, really came into focus this year.
[01:28:33] swyx + Josh Albrecht: With Gemini 1. 5 having a million token context and saying that it was in research for 10 million tokens. And that means that you can put, you, you, you, like, no longer have to really think about, What you retrieve sorry, you no longer really think about what you have to, like, put into context.
[01:28:50] swyx + Josh Albrecht: You can just kind of throw it, throw the entire knowledge base in there, or books, or film, anything like that and that's fantastic. A lot of people are thinking that it kills RAG, and I think, like, one, that's not true, because for any kind of cost reason you you know, you still pay per token, so if you there, so basically Google is, like, perfectly happy to let you pay a million tokens every single time you make an API call, but good luck, you know, having a hundred dollar API call.
[01:29:12] swyx + Josh Albrecht: And and then the other thing, it's going to be slow. No explanation needed. And then finally, my criticism of long context is that it's also not debuggable. Like, if something goes wrong with the result, you can't do, like, the ragged decomposition of where the source of error. Like, you just have to, like, go, like, it's the Waze, bro.
[01:29:29] swyx + Josh Albrecht: Like, it's somewhere in there. Sorry. I pretty strongly agree with this. Why do you think people are making such crazy long context windows? People love to kill rag, right? It's so much Kill it, though, because it's too expensive. It's so expensive like you said. Yeah, I just think I just call it a different dimension I think it's an option that's great when it's there like when I'm prototyping I do not ever want to worry about context and I'm gonna call Stuff a few times and I don't want to run to errors I don't want to have it set up a complex retrieval system just to prototype something But once I'm done prototyping then I'll worry about all the other rag stuff And yes, I'm gonna buy some system or build some system or whatever to go do that.
[01:30:02] swyx + Josh Albrecht: I so I think it's just like An improvement in like one dimension that you need And then, but the improvements in the other dimensions also matter. And it's all needed, like this space is just going to keep growing, um, in unlimited fashion. I do think that this combined with multi modality does unlock new things.
[01:30:21] swyx + Josh Albrecht: So That's what I was going to ask about next. It's like, how important is multi modal? Like, great, you know, generating videos, sure, whatever. Okay, how many of us need to generate videos that often? It'd be cool for TV shows, sure, but like, yeah. I think it's pretty important. And the one thing that, in, when we launched the Lean Space podcast, We listed a bunch of interest areas.
[01:30:37] swyx + Josh Albrecht: So one thing I love about being explicit or intentional about our, our work is that you list the things that you're interested in and you, you list the things that you're not interested in. And people are very unwilling to, to, to have an anti interest list. One of the things that we were not interested in was multimodality last year.
[01:30:55] swyx + Josh Albrecht: Because everyone was, I was just like, okay, you can generate images and they're pretty, but like not a giant business. I was wrong. Midrani is a giant, giant, massive business that no one can get it, no one can understand or get into. But also I think being able to, to natively understand audio and video and code.
[01:31:12] swyx + Josh Albrecht: I consider code a special modality. All that is very, like, qualitatively different than translating it into English first and using English as, I don't know, like a bottleneck or pipe and then you know, applying it in LLMs. Like the ability of LLMs to reason across modalities gives you something more than you could, you know, Individually by, by, by using text as the universal interface.
[01:31:33] swyx + Josh Albrecht: So I think that's useful. So concretely what, what does that mean? It means that so I think the reference post for everyone that you should have in your head is Simon Willison's post on Gemini 1. 5's video capability. Where he basically shot a video of his bookshelf and just kind of scanning through it.
[01:31:50] swyx + Josh Albrecht: And he was able to give back a, a complete JSON list of the books and the authors and, and all the details that were visible there. Hallucinated some of it, which is, you know, another, another issue. But I think it's just like unlocks this use case that you just would not even try to code without the native video understanding capability.
[01:32:08] swyx + Josh Albrecht: And obviously, like. On a technical level, video is just a bunch of frames. So actually it's just image understanding, but image within the temporal dimension, which this month, I think, became much more of a important thing, like the integration of space and time in Transformers. I don't think anyone was really talking about that until this month, and now it's the only thing anyone can ever think about for Sora and for all the other stuff.
[01:32:30] swyx + Josh Albrecht: The last thing I'll say that, which is which is Against this trend of like every modality is important. They just, just do all the modalities. I kind of agree with Nat Friedman who actually kind of pointed out just before the Gemini thing blew up this, this, this, this month, which was like, why is it that OpenAI is pushing Dolly so hard?
[01:32:48] swyx + Josh Albrecht: Why is, why is being pushing Bing image creator? Like, it's not nec, it's not apparent that you have to create images to create a GI. But every lab just seems to want to do this, and I kind of agree that it's not on the critical path. Especially for image generation, maybe image understanding, video understanding.
[01:33:04] swyx + Josh Albrecht: Yeah, consumption. But generation, eh. Maybe we'll be wrong next year. It just catches you a bunch of flack with like, you know, culture war things. Alright, we're going to
[01:33:14] Dylan Patel: move into rapid fire Q& A, so we're going to ask you questions. We've cut
[01:33:26] Dylan Patel: the Q& A section for time, so if you want to hear the spicy questions, head over to the Thursday Nights in AI video for the full discussion.
[01:33:34] Dylan Patel - Semianalysis + Latent Space Live Show
[01:33:34] Dylan Patel: Next up, we have another former guest, Dylan Patel of Semianalysis, the inventor of the GPU rich poor divide, who did a special live show with us in March. But that means you can finally, like, side to side A B test your favorite Boba
[01:33:51] Alessio: shops?
[01:33:51] Alessio: We got Gong Cha, we got Boba Guys, we got the Lemon, whatever it's called. So, let us know what's your favorite. We also have Slido up to submit questions. We already had Dylan on the podcast, and like, this guy tweets and writes about all kinds of stuff. So we want to know what people want to know more
[01:34:07] Alessio: about.
[01:34:08] Alessio: Rather than just being self, self driven. But we'll do A state of the union, maybe? I don't know. Everybody wants to know about Grok. Everybody wants to know whether or not NVIDIA is going to zero after Grok. Everybody wants to know what's going on with AMD. We got some AMD folks in the crowd, too.
[01:34:23] Alessio: So feel free to interact at any time. This is We have
[01:34:27] swyx + Josh Albrecht: portable mics.
[01:34:27] Dylan Patel: Heckle, please. What do you sorry. Good comedians show their color when with the way they can handle the crowd when they're heckled.
[01:34:35] Alessio: Do not throw Boba. Do not throw Boba at this end. We cannot afford another podcasting setup. Awesome.
[01:34:41] Alessio: Well, welcome everybody to the Semi Analysis and Latest Space Crossover. Dylan texted me on Signal. He was like, dude, how do I easily set up a meetup? And here we are today. Well, as you might have seen, there's no name tags. There's a bunch of things that are missing. But we did our
[01:34:55] Dylan Patel: best. It was extremely easy, right?
[01:34:58] Groq
[01:34:58] Dylan Patel: Like, I text Alessio. He's like, yo, I got the spot. Okay, cool. Thanks Here's a link. Send it to people. Sent it. And then showed up. And like, there was zero other organization that I required. So
[01:35:10] Alessio: everybody's here. A lot of, a lot of Semi Analysis fans we get in the crowd everybody wants to know more about what's going on today, and Grok has definitely been the hottest thing.
[01:35:19] Alessio: We just recorded our monthly podcast today, and we didn't talk that much about Grok because we wanted you to talk more about it, and then we'll splice you into our, our monthly recap. So, let's start there.
[01:35:29] swyx + Josh Albrecht: Okay, so, You guys, you guys are the new Grok spreadsheet ers. Yeah, yeah, so, so, we, we we broke out some Grok numbers because everyone was wondering, there's two things going on, right?
[01:35:37] swyx + Josh Albrecht: One you know, how important, or how does it achieve the inference speed that it does? That, that has been demonstrated by GrokChat. And two, how does it achieve its price promise that is promised that, that is sort of the public pricing of 27 cents per million token. And there's been a lot of speculation or, you know, some numbers thrown out there.
[01:35:55] swyx + Josh Albrecht: I put out some tentative numbers and you put out different numbers. But I'll just kind of lay that as, as the, as the groundwork. Like, everyone's like very excited about essentially like five times faster. Token generation than any other LLM currently. And that unlocks interesting downstream possibilities if it's sustainable, if it's affordable.
[01:36:14] swyx + Josh Albrecht: And so I think your question, or reading your piece on Grok, which is on the screen right now, is it sustainable?
[01:36:21] Dylan Patel: So like many things, this is VC funded, including this Boba. No, I'm just kidding, I'm paying for the Bobo, so but, but Thank you semi analysis
[01:36:29] swyx + Josh Albrecht: subscribers
[01:36:31] Alessio: I hope he pays for it, I pay for it right now That's
[01:36:33] Dylan Patel: true, that's true Alessio has the IOU, right?
[01:36:36] Dylan Patel: And that's, that's all it is, but yeah, like many things, you know, they're, they're not making money off of their inference service, right? They're just throwing it out there for cheap and hoping to get business and maybe raise money off of that, and I think that's a that's a fine use case, but the question is, like, how much money are they losing?
[01:36:53] Dylan Patel: Right, and, and that's sort of what I went through breaking down in this this article that's on the screen. And it's, it's pretty clear they're like 7 to 10x off, like, break even on their inference API, which is like horrendous, like far worse than any other sort of inference API provider. So this is like a simple, simple cost thing that was pulled up.
[01:37:15] Dylan Patel: You can either inference at very high throughput, or you can inference at very high, very low latency.
[01:37:20] Dylan Patel: With GPUs, you can do both. With Grok, you can only do one. Of course, with Grok, you can do that one faster. Marginally faster than a inference latency optimized GPU server. But no one offers inference latency optimized GPU servers because you would just burn money, right? Makes no economic sense to do so.
[01:37:36] Dylan Patel: Until maybe someone's willing to pay for that. So, so Grok service, you know, on the surface looks awesome compared to everyone else's service, which is throughput optimized. And, and then when you compare to the throughput optimized scenario, right, GPUs look quite slow, but the reality is they're serving, you know, 64, 128 users at once.
[01:37:54] Dylan Patel: Right, they're, they have a batch size, right? How many users are being served at once, whereas Grok Taking 576 chips, and they're not really doing that efficiently, right? You know, they're, they're serving a far, far fewer number of users, but extremely fast. Now, that could be worthwhile if they can get their, you know, the number of users they're serving at once up, but that's extremely hard because they don't have memory on their chip, so they can't store KV cache KV cache for, you know, all the various different users.
[01:38:21] Dylan Patel: And so, so the crux of the issue is just like, hey, So, can they, can they get that performance up as much as they claim they will, right? Which is, you know, they need to get it up more than 10x, right? To, to, to make this like a reasonable benefit, right? In the meantime, NVIDIA's launching a new GPU in two weeks that'll be fun at GTC and they're constantly pushing software as well, so we'll see if, if Grok can catch up to that.
[01:38:43] Dylan Patel: But the, the current verdict is, you know, they're, they're quite far behind, but it's hopeful, you know, that, that maybe they can get there by, you know, scaling their system larger. Yeah.
[01:38:52] swyx + Josh Albrecht: I was listening back to our original episode, and you were talking about how NVIDIA basically adopted this different strategy of just leaning on networking GPUs together.
[01:39:00] swyx + Josh Albrecht: And it seems like Grok has some, like, minor version of that going on here with the Grok rack. Is it enough? Like, what's Grok's next step here, like,
[01:39:12] Dylan Patel: strategically? Yeah, that's the next step is, of course, you know, so, you know, So right now they connect 10 racks of chips together, right, and that's the system that's running on their API today, right.
[01:39:23] Dylan Patel: Whereas most people who are running, you know, Mistral are running it on two GPUs, right. So one fourth of a server. Yeah. And that rack is not you know, obviously 10 racks is pretty crazy, but they think that they can scale performance if they have this individual system be 20 racks, right? They think they can continue to scale performance extra linearly.
[01:39:42] Dylan Patel: So that'd be amazing if they could but I, I, I'm, I'm doubtful that that's gonna be something that's scalable especially for, for, you know, larger models. So there's the
[01:39:56] Alessio: chip itself, but there's also a lot of work they're doing at the compiler level. Do you have any good sense of, like, how easy it is to actually work with LPU?
[01:40:04] Alessio: Like, is that something that is going to be a bottleneck for them?
[01:40:07] Dylan Patel: So, so Ali's in the front right there, and he, he knows a ton about about VLIW architectures. But to summarize sort of his opinion, and I think many folks's, it's, it's extremely hard to To program these sorts of architectures, right?
[01:40:19] Dylan Patel: Which is why they have their compiler and so on and so forth. But, you know, it's, it's an incredible amount of work for them to stand up individual models and to get the performance up on them which is what they've been working on, right? Whereas, whereas, you know, GPUs are far more flexible, of course.
[01:40:33] Dylan Patel: And so the question is, you know, can they, can they can, can this compiler continue to extract performance? Well, theoretically, like there, there's a lot more performance to run on the hardware. But they don't have, you know, many, many things that people generally associate with, with programmable hardware.
[01:40:49] Dylan Patel: Right? They don't have buffers and, and many other things. So, so it makes it very tough to to do that. But that's what their, you know, their relatively large compiler team is working on. Yeah,
[01:40:58] swyx + Josh Albrecht: So I'm, I'm not a GPU compiler guy. But I do want to clarify my understanding from what I read. Which is a lot of catching up to do.
[01:41:05] swyx + Josh Albrecht: It is, The crux of it is some kind of speculative, like I, in the, the word that comes to mind is speculative routing of weights and, you know, and, and work that, that needs to be done, or scheduling of work across the, you know, the 10 racks of, of GPUs. Is that the, is that like the, the, the bulk of the benefit that you get from
[01:41:25] Dylan Patel: the compilation?
[01:41:26] Dylan Patel: So, so with the Grok chips, what's really interesting is like with GPUs you can do, you can issue certain instructions. And you will get a different result. Like, depending on the time, I know a lot of people in ML have, have had that experience, right? Where like, the GPU literally doesn't return the numbers it should be.
[01:41:45] Dylan Patel: And that's basically called non determinism, right? And, and, and the, and, and, with, with Grok, their chip is completely deterministic. The moment you compile it, you know exactly how long it will take to operate, right? There is no, there is no, like, deviation at all. And so, you know, they've, they're planning everything ahead of time, right, like, every instruction, like, it will complete in the time that they've planned it for.
[01:42:08] Dylan Patel: And there is no I don't know, I don't know what the best way to state this is. There's no variance there which is interesting from, like, when you look historically, they tried to push this into automotive, right? Because automotive, you know, you probably want your car to do exactly what you issued it to do.
[01:42:22] Dylan Patel: And not have, sort of, unpredictability. But yeah, I don't, sorry, I lost track of the question.
[01:42:28] swyx + Josh Albrecht: It's okay, I just wanted to understand a little bit more about, like, what people should under, should know about the compiler magic that goes on with Brock. Like, you know, like, I think, I think, from a software, like, under, like, hardware point of view that in, that intersection of, you know,
[01:42:44] Dylan Patel: So, so, so chips have like, like and I'm going to steal this from someone here in the crowd, but chips have like five, you know, sort of, there's like, when you're designing a chip, there's, there's, it's called PPA, right?
[01:42:54] Dylan Patel: Power, performance, and area, right? So it's kind of a triangle that you optimize around. And the one thing people don't realize is there's a, there's a third P that's like PPAP. And the last P is pain in the ass to program. And, and that's that is very important for like. People making AI hardware, right?
[01:43:11] Dylan Patel: Like, TPU, without the hundreds of people that work on the compiler, and JAX, and XLA, and all these sorts of things, would be a pain in the ass to program. But Google's got that, like, plumbing. Now, if you look across the ecosystem, everything else is a pain in the ass to program compared to NVIDIA, right? And, and, and this applies to the, to the Grok chip as well, right?
[01:43:31] Dylan Patel: So, yeah, question is, like, can the compiler team get performance up anywhere close to theoretical? And then, and then can they make it not a pain in the ass to support new models? Cool. We
[01:43:41] Alessio: got a question, we got a question from Ali. What's the average VLIW bundle occupancy of Grok? Bro,
[01:43:49] Dylan Patel: get out of here.
[01:43:52] Alessio: I don't know if he's setting you up, or if he
[01:43:54] Dylan Patel: wants to chime in. I think he's setting me up, I think he's setting me up. So, okay,
[01:43:58] swyx + Josh Albrecht: what is VLIW for
[01:44:00] Dylan Patel: the rest of us? It's, it's like very long instruction word is basically what it means. And, hm. So, so, GPUs are relatively simple, right? They're, they're tiny little cores, very simple instructions, there's a shitload of them, right?
[01:44:16] Dylan Patel: CPUs, you know, they have a, they have a known instruction set, right? x86. It's very complicated but people have worked on it for decades. VLIW processors are very unique in that sense, right? Like and your question, Ali, I cannot answer that question. I have no clue. Is it documented anywhere online?
[01:44:35] Dylan Patel: Anyway, so like the systolic array, right? Like there's, within the TPU, there's a bunch of stuff, but the actual matrix multiply unit, it's called the MXU, and it's a VLIW architecture as well. It's and I'm, I'm just trying to find a, yeah, I just want to find something that makes me not sound like an idiot.
[01:44:51] swyx + Josh Albrecht: Sometimes I also like to ballpark things in terms of like, like where a good middle median value should be and where like a good high value should be. Sorry. You, you, you
[01:45:03] Dylan Patel: can ballpark things like, you know, like, yeah, so, so, so, but basically like the, the point is like you're trading this is optimal, this is theoretically the most optimal architecture for performance power and area in a given, and you know, not, not specifically Grok, but VLIW in general is gonna get you closer to optimal there, but then you're giving off, you know, that, that last P, which is pain in the ass program, is, is I think the most simple way to get into it.
[01:45:27] Dylan Patel: There's like, computer architecture books about this, but it's, it's, it's a little little, little complicated, right? Yeah.
[01:45:35] Alessio: Somebody asked, there's a lot of questions, that's great. Can we talk about LPU, Cerebrus, Tenstorin, some of these other architectures. How should people think about Maxim, SRAM versus Mix versus
[01:45:49] Dylan Patel: Yeah, yeah.
[01:45:50] Dylan Patel: So there's a lot of ML hardware out there, new and old, right? There's old stuff that's trying to compete there's new stuff that's coming up, you know, companies like, like MadX and Lumerium Labs and so on and so forth, right? You know but, but, so, so there's like a continuum of like, everyone before, say, two years ago that was doing ML hardware bet in one direction, right?
[01:46:11] Dylan Patel: We're gonna make it as an architecture that is, that is has more on chip memory than NVIDIA, right? Like, that was the general bet everyone made. Right? And so like Grok made that bet, they made it to the extreme, right? They didn't have any off chip memory at all. Only on chip memory. You have, you have Cerebrists who did a similar thing except, they were like, Yeah, we're gonna have on chip memory, but we're gonna make a chip that's the size of a wafer.
[01:46:33] Dylan Patel: Right? Like literally this big. Whereas an NVIDIA chip is roughly this big, right? So it's like this big, it's the only chip in the world that's that big. But again, same bet. More on chip memory, less off chip, right? GraphCore and SambaNova made a similar bet. And, and every, basically everyone made that bet.
[01:46:49] Dylan Patel: Cause they thought that's where ML would go. Of course, models grew faster than anyone ever imagined. Yeah, than the memory that was possible. And so that, that very quickly became the wrong bet. And so now we're, you know, sort of seeing a new wave of startups that are going to bet on the other side, as well as many other, you know, architectural things because memory is not really the only architectural thing, of course.
[01:47:08] Dylan Patel: And so, like, where to, where to, like, place startups is, is very dependent on, like, Hey, what are you doing differently than NVIDIA? And is NVIDIA just going to implement that in their chip next year, right? Or, or some version of that. That's, like, pretty much the only things to think about when looking at, you know, hardware companies now.
[01:47:27] Dylan Patel: Cool.
[01:47:28] Alessio: And, yeah, I, I think the, the question is like, there's the size of the models that got outrun, but now you're doing all this work at the compiler level, but it's very transformer based, everything they're doing on the optimization side. How, how do you think about that risk? Like, do you think it's okay for like a hardware company to take like architectural risk in terms of like, yeah, we assume transformers in two years, they'll still be pretty good.
[01:47:51] Alessio: But when you're like depreciating some of this cost of our life. For five years as a buyer.
[01:47:56] Dylan Patel: Yeah, yeah, that's, that's the biggest challenge with like some of the specialized hardware, right? It's like, I know my GPUs will be useful in four years or five years. Maybe not, like, super useful, but they'll be useful for something.
[01:48:07] Dylan Patel: But, there's no way to know that my hardware is going to be able to operate on whatever new model architecture that comes out in the next few years, right? Like, I, I, I like to joke transformers are all you need. And like everything else is like a waste of time. But, you know, I'm sure something better will come.
[01:48:26] Dylan Patel: Right? And, and, you know, you gotta have like, hardware is expensive and you own it for many years. Right? So you can't just like buy whatever's best for today's workload one time and then assume that workload is gonna stay stagnant. Cause that's a recipe to have your like hardware useless as soon as like things evolve.
[01:48:43] Dylan Patel: Right? Like imagine if someone like had hardware for LTSMs and. 2016 or whatever, right? Like, LSTMs. Yeah, LSTM, sorry. You look like an idiot, right? Because now it's not gonna work for, you know, the next architecture, right? As soon as BERT came out, right? For example. So yeah, it's, it's very anything super, super specialized is always at risk of, of being sort of obsoleted and useless.
[01:49:06] Dylan Patel: And, and sort of that's, that's the, that's the thought that like, hey, like, like Graphcore, right? Their chips are. Pretty decent at GNNs, right? Graph Neural Networks. They're actually pretty decent at that. But no one cares, right? So, congratulations, right? Like, you won, you won like the shortest midget, right?
[01:49:24] swyx + Josh Albrecht: Mentioning transformers is all you need. Gives us a nice opportunity to bring out one of your old tweets, but also mention Gemini. My old
[01:49:30] Dylan Patel: tweets, I'm scared. Recent
[01:49:33] swyx + Josh Albrecht: tweets. There's a lot of people talking about, like I think you had a tweet commenting on Gemini 1. 5. And the million token context where basically everyone was saying, like, okay, we need Mamba, we need RLUKV, or we need some other alternative architecture to scale to long context.
[01:49:48] swyx + Josh Albrecht: And Google comes out and says, no, we just, we scaled transformers to 10 million tokens. Easy. We, and, you know, like, I, I think that, that kind of, like, reflects on your thesis there a
[01:49:59] Dylan Patel: little bit. I guess, yeah. I mean, I don't know if I, if I have a coherent thesis, but it's, it's sure fun to, it's Who, who think that like, I, I, I just have an intense hatred for RAG.
[01:50:11] Dylan Patel: Right, like retrieval augmented generation is, is, is like the most like, I just have an intense like innate hatred for it. Wait, wait, you retweeted me
[01:50:18] swyx + Josh Albrecht: defending RAG in the White House press release. Yeah, yeah, yeah. Okay.
[01:50:21] Dylan Patel: But it's just fun,
[01:50:22] swyx + Josh Albrecht: it's all fun and games. Yeah, yeah, yeah, it's all fun and games.
[01:50:24] Dylan Patel: Yeah.
[01:50:25] Dylan Patel: No, no, no, I retweeted, I retweeted you because you memed the White House. I don't know if y'all saw the meme. Can you pull it up? Sure. Like the, the White House the White House put out this thing about like, They're getting very opinionated with this White House. Memory safety. I think it was effectively like, C is bad and Rust is good.
[01:50:39] Dylan Patel: It was like pretty wild that the White House put that out. And I mean like, like whatever that is, so, so, So
[01:50:46] swyx + Josh Albrecht: like, they just got very opinionated about prescribing languages to people. And so then I was, I just like started editing them. So I have stopped comparing RAG with long context and fine
[01:50:54] Dylan Patel: tuning.
[01:50:55] Dylan Patel: Wait, You said I retweeted you defending it. I thought you were hating on it. And that's why I retweeted it.
[01:51:00] swyx + Josh Albrecht: It's somewhat of a defense. Because everyone was like long context is killing RAG. And then I had future LLM should be sub quadratic. That's another one. And I actually messed with the fine print as well..
[01:51:11] Alessio: Let's see power benefits of SRAM dominant
[01:51:13] Dylan Patel: Yeah, yeah. So, so that's a good question, right? So, like, SRAM is on chip memory. Everyone's just using HBM. If you don't have to go to off chip memory, that'd be really efficient, right?
[01:51:23] Dylan Patel: Cause, cause you're, you're not moving bits around. But there's always the issue of you don't have enough memory, right? So, so you still have to move bits around constantly. And so that's the, that's the question. So, yeah, sure. If you, if you can not move data around as you compute, it's going to be fantastically efficient.
[01:51:39] Dylan Patel: That isn't really not really just easy or simple to do.
[01:51:42] Alessio: What do you think is going to be harder in the future, like getting more energy at cheaper costs or like getting more of this hardware
[01:51:48] Dylan Patel: to run? Yeah, I wonder, so someone was talking about this earlier but it's like here in the crowd and I'm looking right at him but he's complaining that journalists keep saying that you know, that, that, like misreporting about how data centers, or what data centers are doing to the environment.
[01:52:03] Dylan Patel: Right? Which I thought was quite funny, right? Cause, cause they're inundated by journalists talking about data centers like destroying the world. Anyways you know, that's not quite the case, right? But yeah, I don't know, like, the, the, the power is certainly going to be hard to get, but, you know, I think, I think if you just look at history, right?
[01:52:22] Dylan Patel: Like humanity, especially America, right? Like, power, power production and usage kept skyrocketing. From like the 1700s to like 1970s, and then it kind of flatlined from there, so why can't we like go back to the like growth stage, I guess is like the whole like mantra of like accelerationists, I guess.
[01:52:40] Dylan Patel: This is EAC, yep. Well I don't think it's EAC, I think it's like, like Sam Altman like wholly believes this too, right? Yeah. And I don't think he's EAC. So, but yeah, like, like, I don't think like, it's like things, it's like something to think about, right? Like. The US is going back to growing in energy usage whereas for the last like 40 years kind of were flat on energy usage.
[01:53:00] Dylan Patel: And what does that mean, right? Like, yeah.
[01:53:04] Alessio: Fair enough. There was another question on Marvel but kind of the, I think
[01:53:07] Dylan Patel: that's it's, it's, it's definitely like one of these three guys who are on the buy side that are asking this question. What, what, what you want to know if Marvel's stock is gonna go up?
[01:53:18] Dylan Patel: Yeah. So Marvell,
[01:53:19] Alessio: the, they're, they're doing the custom music for, for grok. They also do the tri too. And the, the Google CPU. Yeah. Any other, any other chip that they're working on that people should, should keep in mind. It's like, yeah. Any needle moving and it's any stock moving .
[01:53:34] Dylan Patel: Yeah, exactly. Exactly. They're, they're working on some more stuff.
[01:53:38] Dylan Patel: Yeah. I, I'll, I'll, I'll refrain from,
[01:53:40] Alessio: yeah. All right. Let's see other grok stuff we want to get it, get through. I don't think so. Alright, most of the other ones. Your view on edge compute hardware. Any real use cases for it?
[01:53:54] Dylan Patel: Yeah, I mean, I, I I have like a really like anti edge view. Yeah, let's hear it.
[01:53:58] Dylan Patel: Like, like, so many people are like, oh, I'm going to run this model on my phone or on my laptop and. I love how much it's raining. So now I can be horrible and you people won't leave. Like, I want you to try and leave this building. Captive audience. Seriously, should I start singing? Like, there's nothing you
[01:54:17] Alessio: can do.
[01:54:18] Alessio: You definitely, I'll stop you from that.
[01:54:19] Dylan Patel: Sorry, so edge hardware, right? Like, you know, people are like, I'm going to run this model on my phone or my laptop. It makes no sense to me. Cause Current hardware is not really capable of it. So you're gonna buy new hardware, to run whatever on the edge or you're gonna just run very, very small models.
[01:54:36] Dylan Patel: But in either case, you're, you're gonna end up with like the performance is really low, And then whatever you spent to run it locally, Like if you spent it in the cloud, it could service 10x the users, right? So you kind of like, SOL in terms of like, Economics of, of running things on the edge. And then like latency is like, for, for LLMs, right, for LLMs, it's like not that big of a deal relative to, like internet latency is not that big of a deal relative to the use of the model, right?
[01:55:08] Dylan Patel: Like the actual model operating, whether it's on edge hardware or cloud hardware. And cloud hardware is so much faster. So like edge hardware is not really able to like, have a measurable, appreciable, like advantage. Over, over cloud, cloud hardware. This applies to diffusion models, this applies to LLMs of course small models will be able to run, but not, not all, yeah.
[01:55:33] Dylan Patel: Cool.
[01:55:35] Alessio: Let's see. I guess you, you can now see them. Yeah, what chance do startups like MetaX fetch, or 5. 6? Haven't you
[01:55:41] swyx + Josh Albrecht: already reviewed
[01:55:41] Dylan Patel: them? Why don't you, why don't you answer? Yeah, we, we
[01:55:43] swyx + Josh Albrecht: actually, like, we have, Connections with Maddox and Lemurian. Yeah, yeah, yeah. We haven't, no. But Gavin is
[01:55:52] Alessio: Yeah, yeah, they said they don't want to talk publicly.
[01:55:55] Alessio: Oh, okay, okay.
[01:55:57] swyx + Josh Albrecht: When they open up, we can Sure,
[01:56:00] Alessio: sure. But do you think, like, I think the two,
[01:56:02] Dylan Patel: three Answer the question! What do you think of them?
[01:56:06] Alessio: I think, kind of, there's a couple things. It's like How do the other companies innovate against them? I think when you do a new Silicon, you're like, Oh, we're going to be so much better at this thing or like much faster, much cheaper.
[01:56:18] Alessio: But there's all the other curves going down on the macro environment at the same time. So if it takes you like five years before you were like a lot better, five years later, once you take the chip out, you're only comparing yourself to the five year advancement that the major companies had to. So then it's like, okay, the, we're going to have like the C300, whatever, from, from NVIDIA.
[01:56:37] Alessio: By the time some of these chips come up.
[01:56:40] Dylan Patel: What's after Z? What do you think is after Z in the road map? Because it's X, Y, Z, Anyways Yeah, yeah, it's like the age old problem, right? Like you build a chip, it has some cool thing, cool feature, and then like, a year later, NVIDIA has it in hardware, right? Has implemented some flavor of that in hardware.
[01:57:01] Dylan Patel: Or two generations out, right? Like, what idea are you going to have that NVIDIA can't implement, is like, really the question. It's like, you have to be fundamentally different in some way that holds through for, you know, four or five years, right? That's kind of the big issue. But, you know, like, those people have some ideas that are interesting, and yeah, maybe it'll work out, right?
[01:57:21] Dylan Patel: But it's going to be hard to fight NVIDIA, who one, doesn't consider them competition, right? They're worried about, like, Google and Amazon's chip. Right, they're not, and I guess to some extent AMD's chip, but like they're not really worried about you know, MADX or Etched or Grok or, you know, Positron or any of these folks.
[01:57:39] Alessio: How much of an advantage do they have by working closely with like OpenAI folks and then already knowing where some of the architecture decisions are going? And since those companies are like the biggest buyers and users of the
[01:57:51] Dylan Patel: chips, Yeah, I mean, like, you see, like, the most important sort of AI companies are obviously going to tell hardware vendors what they want you know, open AI and, you know, so on and so forth, right?
[01:58:02] Dylan Patel: They're just going to obviously tell them what they want and the startups aren't actually going to get anywhere close to as much feedback on what to do on, like, you know, very minute, low level stuff, right? So that's, that's the, that is a difficulty, right? Some startups, like, like, Maddox obviously have people who built, or worked on the largest models, like at Google, but then other startups might not have that advantage and so they're always gonna have that issue of like, hey, how do I get the feedback, or what's changing, what do they see down the pipeline that's, that I really need to be aware of and ready for when I design my hardware.
[01:58:37] Dylan Patel: Alright.
[01:58:38] Alessio: Every hardware shortage has eventually turned into a glut. Well, that'd be true of NVIDIA chips, it's so when, but also why.
[01:58:45] Dylan Patel: Absolutely, and I'm so excited to buy like H100s for like 1, 000, guys. No, that's not 000, but Yeah, everyone's gonna buy chips, right? Like, it's just the way semiconductors work, because the supply chain takes forever to build out.
[01:58:58] Dylan Patel: And it's, it's like a really weird thing, right? Like, so, so if the backlog of chips is a year, people will order, you know, Two years worth of what they want for the next year. It is like a very common thing. It's not just like this AI cycle, but like, like, like microcontrollers, right? Like the automotive companies, they order two years worth of what they needed for one year, just so they could get enough, right?
[01:59:21] Dylan Patel: Like, this is just like what happens in semiconductors when, when lead times lengthen, the, the purchases and inventory is sort of like double. Sorry. So, so these. The, the NVIDIA GPU shortage obviously is going to be rectified. And when it is everyone's sort of double orders will become extremely apparent, right?
[01:59:42] Dylan Patel: And, you know, you, you see like random companies out of nowhere being like, Yeah, we've got 32, 000 H100s on order, or we've got 10, 000 or 5, 000. And trust, they're not all they're not all real orders for one, but I think, I think the like bubble will continue on for a long time, right, like it's not, it's not going to end like this year, right, like people, people need AI, right, like I think everyone in this audience would agree, right, like there's no, there's no like immediate like end to the, to the bubble, right.
[02:00:09] Dylan Patel: Party like we're in 1995, not like 2000. Makes sense.
[02:00:12] Alessio: What's next? Thoughts on VLIW
[02:00:16] Dylan Patel: architectures? Oh, Y, Y, sorry, sorry, Y. The Y question, yeah, yeah. I think it's just because the supply chain expands so much, and then at the same time there will be no, like, economic, like, immediate economic thing for everyone, right?
[02:00:28] Dylan Patel: Like, some companies will continue to buy, like like an OpenAI or Meta will continue to buy, but then, like, All these random startups will, or a lot of them will not be able to continue to buy, right? So then, so then that like kind of leads to like, they'll pause for a little bit, right? Or like, I think in 2018, right?
[02:00:45] Dylan Patel: Like memory pricing was extremely high. Then all of a sudden Google, Microsoft, and Amazon all agreed, I don't, you know, You know, they don't, they won't, they won't say it's together, but they basically all agreed it like, within the same week to stop ordering memory. And within like a month, the price of memory started tanking like insane amounts, right?
[02:01:06] Dylan Patel: And like people claim, you know, all sorts of reasons why that was timed extremely well. But it was like very clear and people in the financial markets were able to make trades and everything, right? People stopped buying and it's not like their demand just dried up. It's just like they had a little bit of a demand slowdown and then they had enough inventory that they could like weather until like prices tanked.
[02:01:26] Dylan Patel: Because it's such an inelastic good, right? Yeah.
[02:01:29] swyx + Josh Albrecht: Thank you very much. That's it.
[02:01:35] AI Charlie: That concludes our audio segment this weekend. But if you're listening all the way to the end, we have two bonus segments for you. A conversation with Malin Nefe, Senior Vice President of AI at Capital One. We'll be speaking at the AI Leadership Track of the AI Engineer World's Far. And the recent Latent Space Personal AI Meetup featuring a lot of new AI wearables. Bee, Based Hardware, DeepGram MLE AI, and LangChain LangFriend and LangMem, Presented by another former guest, Harrison Chase. Watch out and take care.
Get full access to Latent Space at www.latent.space/subscribe
+
+[by:whisper.cpp]
+
+[00:00.00](音乐)
+
+[00:06.00]欢迎到达特兰的节目的节目的节目
+
+[00:09.00]这是Charlie,你的AI co-host
+
+[00:12.00]SWIX 和 ALESIO 都在节目中做了更多的节目
+
+[00:16.00]我们有很多新的节目
+
+[00:18.00]来自Elicit、Chroma、 Instructor
+
+[00:20.00]和我们的新的节目在 NSFW
+
+[00:23.00]"不安全的工作AI"
+
+[00:25.00]今天我们会在SWIX 和 ALESIO 的新的节目中
+
+[00:29.00]找到更多新的新的节目
+
+[00:31.00]在我们的第一节目中
+
+[00:33.00]找到更多新的节目
+
+[00:35.00]SWIX 和 ALESIO 都在节目中
+
+[00:37.00]做了更多新的节目
+
+[00:40.00]SWIX 和 ALESIO 都在节目中
+
+[00:42.00]做了更多新的节目
+
+[00:44.00]我们有很多新的节目
+
+[00:46.00]SWIX 和 ALESIO 都在节目中
+
+[00:49.00]找到更多新的节目
+
+[00:52.00]我们有很多新的节目
+
+[00:54.00]SWIX 和 ALESIO 都在节目中
+
+[00:56.00]找到更多新的节目
+
+[00:58.00]SWIX 和 ALESIO 都在节目中
+
+[01:00.00]找到更多新的节目
+
+[01:02.00]SWIX 和 ALESIO 都在节目中
+
+[01:04.00]找到更多新的节目
+
+[01:06.00]SWIX 和 ALESIO 都在节目中
+
+[01:08.00]找到更多新的节目
+
+[01:09.00]SWIX 和 ALESIO 都在节目中
+
+[01:11.00]找到更多新的节目
+
+[01:12.00]SWIX 和 ALESIO 都在节目中
+
+[01:14.00]找到更多新的节目
+
+[01:16.00]SWIX 和 ALESIO 都在节目中
+
+[01:18.00]找到更多新的节目
+
+[01:19.00]SWIX 和 ALESIO 都在节目中
+
+[01:21.00]找到更多新的节目
+
+[01:22.00]SWIX 和 ALESIO 都在节目中
+
+[01:24.00]找到更多新的节目
+
+[01:26.00]SWIX 和 ALESIO 都在节目中
+
+[01:28.00]找到更多新的节目
+
+[01:30.00]SWIX 和 ALESIO 都在节目中
+
+[01:32.00]找到更多新的节目
+
+[01:34.00]SWIX 和 ALESIO 都在节目中
+
+[01:36.00]找到更多新的节目
+
+[01:38.00]SWIX 和 ALESIO 都在节目中
+
+[01:40.00]找到更多新的节目
+
+[01:42.00]SWIX 和 ALESIO 都在节目中
+
+[01:44.00]找到更多新的节目
+
+[01:46.00]SWIX 和 ALESIO 都在节目中
+
+[01:48.00]找到更多新的节目
+
+[01:50.00]SWIX 和 ALESIO 都在节目中
+
+[01:52.00]找到更多新的节目
+
+[01:54.00]SWIX 和 ALESIO 都在节目中
+
+[01:56.00]找到更多新的节目
+
+[01:58.00]SWIX 和 ALESIO 都在节目中
+
+[02:00.00]找到更多新的节目
+
+[02:02.00]SWIX 和 ALESIO 都在节目中
+
+[02:04.00]找到更多新的节目
+
+[02:06.00]SWIX 和 ALESIO 都在节目中
+
+[02:08.00]找到更多新的节目
+
+[02:10.00]SWIX 和 ALESIO 都在节目中
+
+[02:12.00]找到更多新的节目
+
+[02:14.00]SWIX 和 ALESIO 都在节目中
+
+[02:16.00]找到更多新的节目
+
+[02:18.00]SWIX 和 ALESIO 都在节目中
+
+[02:20.00]找到更多新的节目
+
+[02:22.00]SWIX 和 ALESIO 都在节目中
+
+[02:24.00]找到更多新的节目
+
+[02:26.00]SWIX 和 ALESIO 都在节目中
+
+[02:28.00]找到更多新的节目
+
+[02:30.00]SWIX 和 ALESIO 都在节目中
+
+[02:32.00]找到更多新的节目
+
+[02:34.00]SWIX 和 ALESIO 都在节目中
+
+[02:36.00]找到更多新的节目
+
+[02:38.00]SWIX 和 ALESIO 都在节目中
+
+[02:40.00]找到更多新的节目
+
+[02:42.00]SWIX 和 ALESIO 都在节目中
+
+[02:44.00]找到更多新的节目
+
+[02:46.00]SWIX 和 ALESIO 都在节目中
+
+[02:48.00]找到更多新的节目
+
+[02:50.00]SWIX 和 ALESIO 都在节目中
+
+[02:52.00]找到更多新的节目
+
+[02:54.00]SWIX 和 ALESIO 都在节目中
+
+[02:56.00]找到更多新的节目
+
+[02:58.00]SWIX 和 ALESIO 都在节目中
+
+[03:00.00]找到更多新的节目
+
+[03:02.00]SWIX 和 ALESIO 都在节目中
+
+[03:04.00]找到更多新的节目
+
+[03:06.00]SWIX 和 ALESIO 都在节目中
+
+[03:08.00]找到更多新的节目
+
+[03:10.00]SWIX 和 ALESIO 都在节目中
+
+[03:12.00]找到更多新的节目
+
+[03:14.00]SWIX 和 ALESIO 都在节目中
+
+[03:16.00]找到更多新的节目
+
+[03:18.00]SWIX 和 ALESIO 都在节目中
+
+[03:20.00]找到更多新的节目
+
+[03:22.00]SWIX 和 ALESIO 都在节目中
+
+[03:24.00]找到更多新的节目
+
+[03:26.00]SWIX 和 ALESIO 都在节目中
+
+[03:28.00]找到更多新的节目
+
+[03:30.00]SWIX 和 ALESIO 都在节目中
+
+[03:32.00]找到更多新的节目
+
+[03:34.00]SWIX 和 ALESIO 都在节目中
+
+[03:36.00]找到更多新的节目
+
+[03:38.00]SWIX 和 ALESIO 都在节目中
+
+[03:40.00]找到更多新的节目
+
+[03:42.00]SWIX 和 ALESIO 都在节目中
+
+[03:44.00]找到更多新的节目
+
+[03:46.00]SWIX 和 ALESIO 都在节目中
+
+[03:48.00]找到更多新的节目
+
+[03:50.00]SWIX 和 ALESIO 都在节目中
+
+[03:52.00]找到更多新的节目
+
+[03:54.00]SWIX 和 ALESIO 都在节目中
+
+[03:56.00]找到更多新的节目
+
+[03:58.00]SWIX 和 ALESIO 都在节目中
+
+[04:00.00]找到更多新的节目
+
+[04:02.00]SWIX 和 ALESIO 都在节目中
+
+[04:04.00]找到更多新的节目
+
+[04:06.00]SWIX 和 ALESIO 都在节目中
+
+[04:08.00]找到更多新的节目
+
+[04:10.00]SWIX 和 ALESIO 都在节目中
+
+[04:12.00]找到更多新的节目
+
+[04:14.00]SWIX 和 ALESIO 都在节目中
+
+[04:16.00]找到更多新的节目
+
+[04:18.00]SWIX 和 ALESIO 都在节目中
+
+[04:20.00]找到更多新的节目
+
+[04:22.00]SWIX 和 ALESIO 都在节目中
+
+[04:24.00]找到更多新的节目
+
+[04:26.00]SWIX 和 ALESIO 都在节目中
+
+[04:28.00]找到更多新的节目
+
+[04:30.00]SWIX 和 ALESIO 都在节目中
+
+[04:32.00]找到更多新的节目
+
+[04:34.00]SWIX 和 ALESIO 都在节目中
+
+[04:36.00]找到更多新的节目
+
+[04:38.00]SWIX 和 ALESIO 都在节目中
+
+[04:40.00]找到更多新的节目
+
+[04:42.00]SWIX 和 ALESIO 都在节目中
+
+[04:44.00]找到更多新的节目
+
+[04:46.00]SWIX 和 ALESIO 都在节目中
+
+[04:48.00]找到更多新的节目
+
+[04:50.00]SWIX 和 ALESIO 都在节目中
+
+[04:52.00]找到更多新的节目
+
+[04:54.00]SWIX 和 ALESIO 都在节目中
+
+[04:56.00]找到更多新的节目
+
+[04:58.00]SWIX 和 ALESIO 都在节目中
+
+[05:00.00]找到更多新的节目
+
+[05:02.00]SWIX 和 ALESIO 都在节目中
+
+[05:04.00]找到更多新的节目
+
+[05:06.00]SWIX 和 ALESIO 都在节目中
+
+[05:08.00]找到更多新的节目
+
+[05:10.00]SWIX 和 ALESIO 都在节目中
+
+[05:12.00]找到更多新的节目
+
+[05:14.00]SWIX 和 ALESIO 都在节目中
+
+[05:16.00]找到更多新的节目
+
+[05:18.00]SWIX 和 ALESIO 都在节目中
+
+[05:20.00]找到更多新的节目
+
+[05:22.00]SWIX 和 ALESIO 都在节目中
+
+[05:24.00]找到更多新的节目
+
+[05:26.00]SWIX 和 ALESIO 都在节目中
+
+[05:28.00]找到更多新的节目
+
+[05:30.00]SWIX 和 ALESIO 都在节目中
+
+[05:32.00]找到更多新的节目
+
+[05:34.00]SWIX 和 ALESIO 都在节目中
+
+[05:36.00]找到更多新的节目
+
+[05:38.00]SWIX 和 ALESIO 都在节目中
+
+[05:40.00]找到更多新的节目
+
+[05:42.00]SWIX 和 ALESIO 都在节目中
+
+[05:44.00]找到更多新的节目
+
+[05:46.00]SWIX 和 ALESIO 都在节目中
+
+[05:48.00]找到更多新的节目
+
+[05:50.00]SWIX 和 ALESIO 都在节目中
+
+[05:52.00]找到更多新的节目
+
+[05:54.00]SWIX 和 ALESIO 都在节目中
+
+[05:56.00]找到更多新的节目
+
+[05:58.00]SWIX 和 ALESIO 都在节目中
+
+[06:00.00]找到更多新的节目
+
+[06:02.00]SWIX 和 ALESIO 都在节目中
+
+[06:04.00]找到更多新的节目
+
+[06:06.00]SWIX 和 ALESIO 都在节目中
+
+[06:08.00]找到更多新的节目
+
+[06:10.00]SWIX 和 ALESIO 都在节目中
+
+[06:12.00]找到更多新的节目
+
+[06:14.00]SWIX 和 ALESIO 都在节目中
+
+[06:16.00]找到更多新的节目
+
+[06:18.00]SWIX 和 ALESIO 都在节目中
+
+[06:20.00]找到更多新的节目
+
+[06:22.00]SWIX 和 ALESIO 都在节目中
+
+[06:24.00]找到更多新的节目
+
+[06:26.00]SWIX 和 ALESIO 都在节目中
+
+[06:28.00]找到更多新的节目
+
+[06:30.00]SWIX 和 ALESIO 都在节目中
+
+[06:32.00]找到更多新的节目
+
+[06:34.00]SWIX 和 ALESIO 都在节目中
+
+[06:36.00]找到更多新的节目
+
+[06:38.00]SWIX 和 ALESIO 都在节目中
+
+[06:40.00]找到更多新的节目
+
+[06:42.00]SWIX 和 ALESIO 都在节目中
+
+[06:44.00]找到更多新的节目
+
+[06:46.00]SWIX 和 ALESIO 都在节目中
+
+[06:48.00]找到更多新的节目
+
+[06:50.00]SWIX 和 ALESIO 都在节目中
+
+[06:52.00]找到更多新的节目
+
+[06:54.00]SWIX 和 ALESIO 都在节目中
+
+[06:56.00]找到更多新的节目
+
+[06:58.00]SWIX 和 ALESIO 都在节目中
+
+[07:00.00]找到更多新的节目
+
+[07:02.00]SWIX 和 ALESIO 都在节目中
+
+[07:04.00]找到更多新的节目
+
+[07:06.00]SWIX 和 ALESIO 都在节目中
+
+[07:08.00]找到更多新的节目
+
+[07:10.00]SWIX 和 ALESIO 都在节目中
+
+[07:12.00]找到更多新的节目
+
+[07:14.00]SWIX 和 ALESIO 都在节目中
+
+[07:16.00]找到更多新的节目
+
+[07:18.00]SWIX 和 ALESIO 都在节目中
+
+[07:20.00]找到更多新的节目
+
+[07:22.00]SWIX 和 ALESIO 都在节目中
+
+[07:24.00]找到更多新的节目
+
+[07:26.00]SWIX 和 ALESIO 都在节目中
+
+[07:28.00]找到更多新的节目
+
+[07:30.00]SWIX 和 ALESIO 都在节目中
+
+[07:32.00]找到更多新的节目
+
+[07:34.00]SWIX 和 ALESIO 都在节目中
+
+[07:36.00]找到更多新的节目
+
+[07:38.00]SWIX 和 ALESIO 都在节目中
+
+[07:40.00]找到更多新的节目
+
+[07:42.00]SWIX 和 ALESIO 都在节目中
+
+[07:44.00]找到更多新的节目
+
+[07:46.00]SWIX 和 ALESIO 都在节目中
+
+[07:48.00]找到更多新的节目
+
+[07:50.00]SWIX 和 ALESIO 都在节目中
+
+[07:52.00]找到更多新的节目
+
+[07:54.00]SWIX 和 ALESIO 都在节目中
+
+[07:56.00]找到更多新的节目
+
+[07:58.00]找到更多新的节目
+
+[08:00.00]SWIX 和 ALESIO 都在节目中
+
+[08:02.00]找到更多新的节目
+
+[08:04.00]SWIX 和 ALESIO 都在节目中
+
+[08:06.00]找到更多新的节目
+
+[08:08.00]SWIX 和 ALESIO 都在节目中
+
+[08:10.00]找到更多新的节目
+
+[08:12.00]SWIX 和 ALESIO 都在节目中
+
+[08:14.00]找到更多新的节目
+
+[08:16.00]SWIX 和 ALESIO 都在节目中
+
+[08:18.00]找到更多新的节目
+
+[08:20.00]SWIX 和 ALESIO 都在节目中
+
+[08:22.00]找到更多新的节目
+
+[08:24.00]SWIX 和 ALESIO 都在节目中
+
+[08:26.00]找到更多新的节目
+
+[08:28.00]SWIX 和 ALESIO 都在节目中
+
+[08:30.00]找到更多新的节目
+
+[08:32.00]SWIX 和 ALESIO 都在节目中
+
+[08:34.00]找到更多新的节目
+
+[08:36.00]SWIX 和 ALESIO 都在节目中
+
+[08:38.00]找到更多新的节目
+
+[08:40.00]SWIX 和 ALESIO 都在节目中
+
+[08:42.00]找到更多新的节目
+
+[08:44.00]SWIX 和 ALESIO 都在节目中
+
+[08:46.00]找到更多新的节目
+
+[08:48.00]SWIX 和 ALESIO 都在节目中
+
+[08:50.00]找到更多新的节目
+
+[08:52.00]SWIX 和 ALESIO 都在节目中
+
+[08:54.00]找到更多新的节目
+
+[08:56.00]SWIX 和 ALESIO 都在节目中
+
+[08:58.00]找到更多新的节目
+
+[09:00.00]SWIX 和 ALESIO 都在节目中
+
+[09:02.00]找到更多新的节目
+
+[09:04.00]SWIX 和 ALESIO 都在节目中
+
+[09:06.00]找到更多新的节目
+
+[09:08.00]SWIX 和 ALESIO 都在节目中
+
+[09:10.00]找到更多新的节目
+
+[09:12.00]SWIX 和 ALESIO 都在节目中
+
+[09:14.00]找到更多新的节目
+
+[09:16.00]SWIX 和 ALESIO 都在节目中
+
+[09:18.00]找到更多新的节目
+
+[09:20.00]SWIX 和 ALESIO 都在节目中
+
+[09:22.00]找到更多新的节目
+
+[09:24.00]空中
+
+[09:26.00]找到更多新的节目
+
+[09:28.00]SWIX 和 ALESIO 都在节目中
+
+[09:30.00]找到更多新的节目
+
+[09:32.00]SWIX 和 ALESIO 都在节目中
+
+[09:34.00]找到更多新的节目
+
+[09:36.00]SWIX 和 ALESIO 都在节目中
+
+[09:38.00]找到更多新的节目
+
+[09:40.00]SWIX 和 ALESIO 都在节目中
+
+[09:42.00]找到更多新的节目
+
+[09:44.00]SWIX 和 ALESIO 都在节目中
+
+[09:46.00]找到更多新的节目
+
+[09:48.00]SWIX 和 ALESIO 都在节目中
+
+[09:50.00]找到更多新的节目
+
+[09:52.00]SWIX 和 ALESIO 都在节目中
+
+[09:54.00]找到更多新的节目
+
+[09:56.00]SWIX 和 ALESIO 都在节目中
+
+[09:58.00]找到更多新的节目
+
+[10:00.00]SWIX 和 ALESIO 都在节目中
+
+[10:02.00]找到更多新的节目
+
+[10:04.00]SWIX 和 ALESIO 都在节目中
+
+[10:06.00]找到更多新的节目
+
+[10:08.00]SWIX 和 ALESIO 都在节目中
+
+[10:10.00]找到更多新的节目
+
+[10:12.00]SWIX 和 ALESIO 都在节目中
+
+[10:14.00]找到更多新的节目
+
+[10:16.00]SWIX 和 ALESIO 都在节目中
+
+[10:18.00]找到更多新的节目
+
+[10:20.00]SWIX 和 ALESIO 都在节目中
+
+[10:22.00]找到更多新的节目
+
+[10:24.00]SWIX 和 ALESIO 都在节目中
+
+[10:26.00]找到更多新的节目
+
+[10:28.00]SWIX 和 ALESIO 都在节目中
+
+[10:30.00]找到更多新的节目
+
+[10:32.00]SWIX 和 ALESIO 都在节目中
+
+[10:34.00]找到更多新的节目
+
+[10:36.00]SWIX 和 ALESIO 都在节目中
+
+[10:38.00]找到更多新的节目
+
+[10:40.00]SWIX 和 ALESIO 都在节目中
+
+[10:42.00]找到更多新的节目
+
+[10:44.00]SWIX 和 ALESIO 都在节目中
+
+[10:46.00]找到更多新的节目
+
+[10:48.00]SWIX 和 ALESIO 都在节目中
+
+[10:50.00]找到更多新的节目
+
+[10:52.00]SWIX 和 ALESIO 都在节目中
+
+[10:54.00]找到更多新的节目
+
+[10:56.00]SWIX 和 ALESIO 都在节目中
+
+[10:58.00]找到更多新的节目
+
+[11:00.00]SWIX 和 ALESIO 都在节目中
+
+[11:02.00]找到更多新的节目
+
+[11:04.00]SWIX 和 ALESIO 都在节目中
+
+[11:06.00]找到更多新的节目
+
+[11:08.00]SWIX 和 ALESIO 都在节目中
+
+[11:10.00]找到更多新的节目
+
+[11:12.00]SWIX 和 ALESIO 都在节目中
+
+[11:14.00]找到更多新的节目
+
+[11:16.00]SWIX 和 ALESIO 都在节目中
+
+[11:18.00]找到更多新的节目
+
+[11:20.00]SWIX 和 ALESIO 都在节目中
+
+[11:22.00]找到更多新的节目
+
+[11:24.00]SWIX 和 ALESIO 都在节目中
+
+[11:26.00]找到更多新的节目
+
+[11:28.00]SWIX 和 ALESIO 都在节目中
+
+[11:30.00]找到更多新的节目
+
+[11:32.00]SWIX 和 ALESIO 都在节目中
+
+[11:34.00]找到更多新的节目
+
+[11:36.00]SWIX 和 ALESIO 都在节目中
+
+[11:38.00]找到更多新的节目
+
+[11:40.00]SWIX 和 ALESIO 都在节目中
+
+[11:42.00]找到更多新的节目
+
+[11:44.00]SWIX 和 ALESIO 都在节目中
+
+[11:46.00]找到更多新的节目
+
+[11:48.00]SWIX 和 ALESIO 都在节目中
+
+[11:50.00]找到更多新的节目
+
+[11:52.00]SWIX 和 ALESIO 都在节目中
+
+[11:54.00]找到更多新的节目
+
+[11:56.00]SWIX 和 ALESIO 都在节目中
+
+[11:58.00]找到更多新的节目
+
+[12:00.00]SWIX 和 ALESIO 都在节目中
+
+[12:02.00]找到更多新的节目
+
+[12:04.00]SWIX 和 ALESIO 都在节目中
+
+[12:06.00]找到更多新的节目
+
+[12:08.00]SWIX 和 ALESIO 都在节目中
+
+[12:10.00]找到更多新的节目
+
+[12:12.00]SWIX 和 ALESIO 都在节目中
+
+[12:14.00]找到更多新的节目
+
+[12:16.00]SWIX 和 ALESIO 都在节目中
+
+[12:18.00]找到更多新的节目
+
+[12:20.00]SWIX 和 ALESIO 都在节目中
+
+[12:22.00]找到更多新的节目
+
+[12:24.00]SWIX 和 ALESIO 都在节目中
+
+[12:26.00]找到更多新的节目
+
+[12:28.00]SWIX 和 ALESIO 都在节目中
+
+[12:30.00]找到更多新的节目
+
+[12:32.00]SWIX 和 ALESIO 都在节目中
+
+[12:34.00]找到更多新的节目
+
+[12:36.00]SWIX 和 ALESIO 都在节目中
+
+[12:38.00]找到更多新的节目
+
+[12:40.00]找到更多新的节目
+
+[12:42.00]找到更多新的节目
+
+[12:44.00]找到更多新的节目
+
+[12:46.00]找到更多新的节目
+
+[12:48.00]找到更多新的节目
+
+[12:50.00]找到更多新的节目
+
+[12:52.00]找到更多新的节目
+
+[12:54.00]找到更多新的节目
+
+[12:56.00]找到更多新的节目
+
+[12:58.00]找到更多新的节目
+
+[13:00.00]找到更多新的节目
+
+[13:02.00]找到更多新的节目
+
+[13:04.00]找到更多新的节目
+
+[13:06.00]找到更多新的节目
+
+[13:08.00]找到更多新的节目
+
+[13:10.00]找到更多新的节目
+
+[13:12.00]找到更多新的节目
+
+[13:14.00]找到更多新的节目
+
+[13:16.00]找到更多新的节目
+
+[13:18.00]找到更多新的节目
+
+[13:20.00]找到更多新的节目
+
+[13:22.00]找到更多新的节目
+
+[13:24.00]找到更多新的节目
+
+[13:26.00]找到更多新的节目
+
+[13:28.00]找到更多新的节目
+
+[13:30.00]找到更多新的节目
+
+[13:32.00]找到更多新的节目
+
+[13:34.00]找到更多新的节目
+
+[13:36.00]找到更多新的节目
+
+[13:38.00]找到更多新的节目
+
+[13:40.00]找到更多新的节目
+
+[13:42.00]找到更多新的节目
+
+[13:44.00]找到更多新的节目
+
+[13:46.00]找到更多新的节目
+
+[13:48.00]找到更多新的节目
+
+[13:50.00]找到更多新的节目
+
+[13:52.00]找到更多新的节目
+
+[13:54.00]找到更多新的节目
+
+[13:56.00]找到更多新的节目
+
+[13:58.00]找到更多新的节目
+
+[14:00.00]找到更多新的节目
+
+[14:02.00]找到更多新的节目
+
+[14:04.00]找到更多新的节目
+
+[14:06.00]找到更多新的节目
+
+[14:08.00]找到更多新的节目
+
+[14:10.00]找到更多新的节目
+
+[14:12.00]找到更多新的节目
+
+[14:14.00]找到更多新的节目
+
+[14:16.00]找到更多新的节目
+
+[14:18.00]找到更多新的节目
+
+[14:20.00]找到更多新的节目
+
+[14:22.00]找到更多新的节目
+
+[14:24.00]找到更多新的节目
+
+[14:26.00]找到更多新的节目
+
+[14:28.00]找到更多新的节目
+
+[14:30.00]找到更多新的节目
+
+[14:32.00]找到更多新的节目
+
+[14:34.00]找到更多新的节目
+
+[14:36.00]找到更多新的节目
+
+[14:38.00]找到更多新的节目
+
+[14:40.00]找到更多新的节目
+
+[14:42.00]找到更多新的节目
+
+[14:44.00]找到更多新的节目
+
+[14:46.00]找到更多新的节目
+
+[14:48.00]找到更多新的节目
+
+[14:50.00]找到更多新的节目
+
+[14:52.00]找到更多新的节目
+
+[14:54.00]找到更多新的节目
+
+[14:56.00]找到更多新的节目
+
+[14:58.00]找到更多新的节目
+
+[15:00.00]找到更多新的节目
+
+[15:02.00]找到更多新的节目
+
+[15:04.00]找到更多新的节目
+
+[15:06.00]找到更多新的节目
+
+[15:08.00]找到更多新的节目
+
+[15:10.00]找到更多新的节目
+
+[15:12.00]找到更多新的节目
+
+[15:14.00]找到更多新的节目
+
+[15:16.00]找到更多新的节目
+
+[15:18.00]找到更多新的节目
+
+[15:20.00]找到更多新的节目
+
+[15:22.00]找到更多新的节目
+
+[15:24.00]找到更多新的节目
+
+[15:26.00]找到更多新的节目
+
+[15:28.00]找到更多新的节目
+
+[15:30.00]找到更多新的节目
+
+[15:32.00]找到更多新的节目
+
+[15:34.00]找到更多新的节目
+
+[15:36.00]找到更多新的节目
+
+[15:38.00]找到更多新的节目
+
+[15:40.00]找到更多新的节目
+
+[15:42.00]找到更多新的节目
+
+[15:44.00]找到更多新的节目
+
+[15:46.00]找到更多新的节目
+
+[15:48.00]找到更多新的节目
+
+[15:50.00]找到更多新的节目
+
+[15:52.00]找到更多新的节目
+
+[15:54.00]找到更多新的节目
+
+[15:56.00]找到更多新的节目
+
+[15:58.00]找到更多新的节目
+
+[16:00.00]找到更多新的节目
+
+[16:02.00]找到更多新的节目
+
+[16:04.00]找到更多新的节目
+
+[16:06.00]找到更多新的节目
+
+[16:08.00]找到更多新的节目
+
+[16:10.00]找到更多新的节目
+
+[16:12.00]找到更多新的节目
+
+[16:14.00]找到更多新的节目
+
+[16:16.00]找到更多新的节目
+
+[16:18.00]找到更多新的节目
+
+[16:20.00]找到更多新的节目
+
+[16:22.00]找到更多新的节目
+
+[16:24.00]找到更多新的节目
+
+[16:26.00]找到更多新的节目
+
+[16:28.00]找到更多新的节目
+
+[16:30.00]找到更多新的节目
+
+[16:32.00]找到更多新的节目
+
+[16:34.00]找到更多新的节目
+
+[16:36.00]找到更多新的节目
+
+[16:38.00]找到更多新的节目
+
+[16:40.00]找到更多新的节目
+
+[16:42.00]找到更多新的节目
+
+[16:44.00]找到更多新的节目
+
+[16:46.00]找到更多新的节目
+
+[16:48.00]找到更多新的节目
+
+[16:50.00]找到更多新的节目
+
+[16:52.00]找到更多新的节目
+
+[16:54.00]找到更多新的节目
+
+[16:56.00]找到更多新的节目
+
+[16:58.00]找到更多新的节目
+
+[17:00.00]找到更多新的节目
+
+[17:02.00]找到更多新的节目
+
+[17:04.00]找到更多新的节目
+
+[17:06.00]找到更多新的节目
+
+[17:08.00]找到更多新的节目
+
+[17:10.00]找到更多新的节目
+
+[17:12.00]找到更多新的节目
+
+[17:14.00]找到更多新的节目
+
+[17:16.00]找到更多新的节目
+
+[17:18.00]找到更多新的节目
+
+[17:20.00]找到更多新的节目
+
+[17:22.00]找到更多新的节目
+
+[17:24.00]找到更多新的节目
+
+[17:26.00]找到更多新的节目
+
+[17:28.00]找到更多新的节目
+
+[17:30.00]找到更多新的节目
+
+[17:32.00]找到更多新的节目
+
+[17:34.00]找到更多新的节目
+
+[17:36.00]找到更多新的节目
+
+[17:38.00]找到更多新的节目
+
+[17:40.00]找到更多新的节目
+
+[17:42.00]找到更多新的节目
+
+[17:44.00]找到更多新的节目
+
+[17:46.00]找到更多新的节目
+
+[17:48.00]找到更多新的节目
+
+[17:50.00]找到更多新的节目
+
+[17:52.00]找到更多新的节目
+
+[17:54.00]找到更多新的节目
+
+[17:56.00]找到更多新的节目
+
+[17:58.00]找到更多新的节目
+
+[18:00.00]找到更多新的节目
+
+[18:02.00]找到更多新的节目
+
+[18:04.00]找到更多新的节目
+
+[18:06.00]找到更多新的节目
+
+[18:08.00]找到更多新的节目
+
+[18:10.00]找到更多新的节目
+
+[18:12.00]找到更多新的节目
+
+[18:14.00]找到更多新的节目
+
+[18:16.00]找到更多新的节目
+
+[18:18.00]找到更多新的节目
+
+[18:20.00]找到更多新的节目
+
+[18:22.00]找到更多新的节目
+
+[18:24.00]找到更多新的节目
+
+[18:26.00]找到更多新的节目
+
+[18:28.00]找到更多新的节目
+
+[18:30.00]找到更多新的节目
+
+[18:32.00]找到更多新的节目
+
+[18:34.00]找到更多新的节目
+
+[18:36.00]找到更多新的节目
+
+[18:38.00]找到更多新的节目
+
+[18:40.00]找到更多新的节目
+
+[18:42.00]找到更多新的节目
+
+[18:44.00]找到更多新的节目
+
+[18:46.00]找到更多新的节目
+
+[18:48.00]找到更多新的节目
+
+[18:50.00]找到更多新的节目
+
+[18:52.00]找到更多新的节目
+
+[18:54.00]找到更多新的节目
+
+[18:56.00]找到更多新的节目
+
+[18:58.00]找到更多新的节目
+
+[19:00.00]找到更多新的节目
+
+[19:02.00]找到更多新的节目
+
+[19:04.00]找到更多新的节目
+
+[19:06.00]找到更多新的节目
+
+[19:08.00]找到更多新的节目
+
+[19:10.00]找到更多新的节目
+
+[19:12.00]找到更多新的节目
+
+[19:14.00]找到更多新的节目
+
+[19:16.00]找到更多新的节目
+
+[19:18.00]找到更多新的节目
+
+[19:20.00]找到更多新的节目
+
+[19:22.00]找到更多新的节目
+
+[19:24.00]找到更多新的节目
+
+[19:26.00]找到更多新的节目
+
+[19:28.00]找到更多新的节目
+
+[19:30.00]找到更多新的节目
+
+[19:32.00]找到更多新的节目
+
+[19:34.00]找到更多新的节目
+
+[19:36.00]找到更多新的节目
+
+[19:38.00]找到更多新的节目
+
+[19:40.00]找到更多新的节目
+
+[19:42.00]找到更多新的节目
+
+[19:44.00]找到更多新的节目
+
+[19:46.00]找到更多新的节目
+
+[19:48.00]找到更多新的节目
+
+[19:50.00]找到更多新的节目
+
+[19:52.00]找到更多新的节目
+
+[19:54.00]找到更多新的节目
+
+[19:56.00]找到更多新的节目
+
+[19:58.00]找到更多新的节目
+
+[20:00.00]找到更多新的节目
+
+[20:02.00]找到更多新的节目
+
+[20:04.00]找到更多新的节目
+
+[20:06.00]找到更多新的节目
+
+[20:08.00]找到更多新的节目
+
+[20:10.00]找到更多新的节目
+
+[20:12.00]找到更多新的节目
+
+[20:14.00]找到更多新的节目
+
+[20:16.00]找到更多新的节目
+
+[20:18.00]找到更多新的节目
+
+[20:20.00]找到更多新的节目
+
+[20:22.00]找到更多新的节目
+
+[20:24.00]找到更多新的节目
+
+[20:26.00]找到更多新的节目
+
+[20:28.00]找到更多新的节目
+
+[20:30.00]找到更多新的节目
+
+[20:32.00]找到更多新的节目
+
+[20:34.00]找到更多新的节目
+
+[20:36.00]找到更多新的节目
+
+[20:38.00]找到更多新的节目
+
+[20:40.00]找到更多新的节目
+
+[20:42.00]找到更多新的节目
+
+[20:44.00]找到更多新的节目
+
+[20:46.00]找到更多新的节目
+
+[20:48.00]找到更多新的节目
+
+[20:50.00]找到更多新的节目
+
+[20:52.00]找到更多新的节目
+
+[20:54.00]找到更多新的节目
+
+[20:56.00]找到更多新的节目
+
+[20:58.00]找到更多新的节目
+
+[21:00.00]找到更多新的节目
+
+[21:02.00]找到更多新的节目
+
+[21:04.00]找到更多新的节目
+
+[21:06.00]找到更多新的节目
+
+[21:08.00]找到更多新的节目
+
+[21:10.00]找到更多新的节目
+
+[21:12.00]找到更多新的节目
+
+[21:14.00]找到更多新的节目
+
+[21:16.00]找到更多新的节目
+
+[21:18.00]找到更多新的节目
+
+[21:20.00]找到更多新的节目
+
+[21:22.00]找到更多新的节目
+
+[21:24.00]找到更多新的节目
+
+[21:26.00]找到更多新的节目
+
+[21:28.00]找到更多新的节目
+
+[21:30.00]找到更多新的节目
+
+[21:32.00]找到更多新的节目
+
+[21:34.00]找到更多新的节目
+
+[21:36.00]找到更多新的节目
+
+[21:38.00]找到更多新的节目
+
+[21:40.00]找到更多新的节目
+
+[21:42.00]找到更多新的节目
+
+[21:44.00]找到更多新的节目
+
+[21:46.00]找到更多新的节目
+
+[21:48.00]找到更多新的节目
+
+[21:50.00]找到更多新的节目
+
+[21:52.00]找到更多新的节目
+
+[21:54.00]找到更多新的节目
+
+[21:56.00]找到更多新的节目
+
+[21:58.00]找到更多新的节目
+
+[22:00.00]找到更多新的节目
+
+[22:02.00]找到更多新的节目
+
+[22:04.00]找到更多新的节目
+
+[22:06.00]找到更多新的节目
+
+[22:08.00]找到更多新的节目
+
+[22:10.00]找到更多新的节目
+
+[22:12.00]找到更多新的节目
+
+[22:14.00]找到更多新的节目
+
+[22:16.00]找到更多新的节目
+
+[22:18.00]找到更多新的节目
+
+[22:20.00]找到更多新的节目
+
+[22:22.00]找到更多新的节目
+
+[22:24.00]找到更多新的节目
+
+[22:26.00]找到更多新的节目
+
+[22:28.00]找到更多新的节目
+
+[22:30.00]找到更多新的节目
+
+[22:32.00]找到更多新的节目
+
+[22:34.00]找到更多新的节目
+
+[22:36.00]找到更多新的节目
+
+[22:38.00]找到更多新的节目
+
+[22:40.00]找到更多新的节目
+
+[22:42.00]找到更多新的节目
+
+[22:44.00]找到更多新的节目
+
+[22:46.00]找到更多新的节目
+
+[22:48.00]找到更多新的节目
+
+[22:50.00]找到更多新的节目
+
+[22:52.00]找到更多新的节目
+
+[22:54.00]找到更多新的节目
+
+[22:56.00]找到更多新的节目
+
+[22:58.00]找到更多新的节目
+
+[23:00.00]找到更多新的节目
+
+[23:02.00]找到更多新的节目
+
+[23:04.00]找到更多新的节目
+
+[23:06.00]找到更多新的节目
+
+[23:08.00]找到更多新的节目
+
+[23:10.00]找到更多新的节目
+
+[23:12.00]找到更多新的节目
+
+[23:14.00]找到更多新的节目
+
+[23:16.00]找到更多新的节目
+
+[23:18.00]找到更多新的节目
+
+[23:20.00]找到更多新的节目
+
+[23:22.00]找到更多新的节目
+
+[23:24.00]找到更多新的节目
+
+[23:26.00]找到更多新的节目
+
+[23:28.00]找到更多新的节目
+
+[23:30.00]找到更多新的节目
+
+[23:32.00]找到更多新的节目
+
+[23:34.00]找到更多新的节目
+
+[23:36.00]找到更多新的节目
+
+[23:38.00]找到更多新的节目
+
+[23:40.00]找到更多新的节目
+
+[23:42.00]找到更多新的节目
+
+[23:44.00]找到更多新的节目
+
+[23:46.00]找到更多新的节目
+
+[23:48.00]找到更多新的节目
+
+[23:50.00]找到更多新的节目
+
+[23:52.00]找到更多新的节目
+
+[23:54.00]找到更多新的节目
+
+[23:56.00]找到更多新的节目
+
+[23:58.00]找到更多新的节目
+
+[24:00.00]找到更多新的节目
+
+[24:02.00]找到更多新的节目
+
+[24:04.00]找到更多新的节目
+
+[24:06.00]找到更多新的节目
+
+[24:08.00]找到更多新的节目
+
+[24:10.00]找到更多新的节目
+
+[24:12.00]找到更多新的节目
+
+[24:14.00]找到更多新的节目
+
+[24:16.00]找到更多新的节目
+
+[24:18.00]找到更多新的节目
+
+[24:20.00]找到更多新的节目
+
+[24:22.00]找到更多新的节目
+
+[24:24.00]找到更多新的节目
+
+[24:26.00]找到更多新的节目
+
+[24:28.00]找到更多新的节目
+
+[24:30.00]找到更多新的节目
+
+[24:32.00]找到更多新的节目
+
+[24:34.00]找到更多新的节目
+
+[24:36.00]找到更多新的节目
+
+[24:38.00]找到更多新的节目
+
+[24:40.00]找到更多新的节目
+
+[24:42.00]找到更多新的节目
+
+[24:44.00]找到更多新的节目
+
+[24:46.00]找到更多新的节目
+
+[24:48.00]找到更多新的节目
+
+[24:50.00]找到更多新的节目
+
+[24:52.00]找到更多新的节目
+
+[24:54.00]找到更多新的节目
+
+[24:56.00]找到更多新的节目
+
+[24:58.00]找到更多新的节目
+
+[25:00.00]找到更多新的节目
+
+[25:02.00]找到更多新的节目
+
+[25:04.00]找到更多新的节目
+
+[25:06.00]找到更多新的节目
+
+[25:08.00]找到更多新的节目
+
+[25:10.00]找到更多新的节目
+
+[25:12.00]找到更多新的节目
+
+[25:14.00]找到更多新的节目
+
+[25:16.00]找到更多新的节目
+
+[25:18.00]找到更多新的节目
+
+[25:20.00]找到更多新的节目
+
+[25:22.00]找到更多新的节目
+
+[25:24.00]找到更多新的节目
+
+[25:26.00]找到更多新的节目
+
+[25:28.00]找到更多新的节目
+
+[25:30.00]找到更多新的节目
+
+[25:32.00]找到更多新的节目
+
+[25:34.00]找到更多新的节目
+
+[25:36.00]找到更多新的节目
+
+[25:38.00]找到更多新的节目
+
+[25:40.00]找到更多新的节目
+
+[25:42.00]找到更多新的节目
+
+[25:44.00]找到更多新的节目
+
+[25:46.00]找到更多新的节目
+
+[25:48.00]找到更多新的节目
+
+[25:50.00]找到更多新的节目
+
+[25:52.00]找到更多新的节目
+
+[25:54.00]找到更多新的节目
+
+[25:56.00]找到更多新的节目
+
+[25:58.00]找到更多新的节目
+
+[26:00.00]找到更多新的节目
+
+[26:02.00]找到更多新的节目
+
+[26:04.00]找到更多新的节目
+
+[26:06.00]找到更多新的节目
+
+[26:08.00]找到更多新的节目
+
+[26:10.00]找到更多新的节目
+
+[26:12.00]找到更多新的节目
+
+[26:14.00]找到更多新的节目
+
+[26:16.00]找到更多新的节目
+
+[26:18.00]找到更多新的节目
+
+[26:20.00]找到更多新的节目
+
+[26:22.00]找到更多新的节目
+
+[26:24.00]找到更多新的节目
+
+[26:26.00]找到更多新的节目
+
+[26:28.00]找到更多新的节目
+
+[26:30.00]找到更多新的节目
+
+[26:32.00]找到更多新的节目
+
+[26:34.00]找到更多新的节目
+
+[26:36.00]找到更多新的节目
+
+[26:38.00]找到更多新的节目
+
+[26:40.00]找到更多新的节目
+
+[26:42.00]找到更多新的节目
+
+[26:44.00]找到更多新的节目
+
+[26:46.00]找到更多新的节目
+
+[26:48.00]找到更多新的节目
+
+[26:50.00]找到更多新的节目
+
+[26:52.00]找到更多新的节目
+
+[26:54.00]找到更多新的节目
+
+[26:56.00]找到更多新的节目
+
+[26:58.00]找到更多新的节目
+
+[27:00.00]找到更多新的节目
+
+[27:02.00]找到更多新的节目
+
+[27:04.00]找到更多新的节目
+
+[27:06.00]找到更多新的节目
+
+[27:08.00]找到更多新的节目
+
+[27:10.00]找到更多新的节目
+
+[27:12.00]找到更多新的节目
+
+[27:14.00]找到更多新的节目
+
+[27:16.00]找到更多新的节目
+
+[27:18.00]找到更多新的节目
+
+[27:20.00]找到更多新的节目
+
+[27:22.00]找到更多新的节目
+
+[27:24.00]找到更多新的节目
+
+[27:26.00]找到更多新的节目
+
+[27:28.00]找到更多新的节目
+
+[27:30.00]找到更多新的节目
+
+[27:32.00]找到更多新的节目
+
+[27:34.00]找到更多新的节目
+
+[27:36.00]找到更多新的节目
+
+[27:38.00]找到更多新的节目
+
+[27:40.00]找到更多新的节目
+
+[27:42.00]找到更多新的节目
+
+[27:44.00]找到更多新的节目
+
+[27:46.00]找到更多新的节目
+
+[27:48.00]找到更多新的节目
+
+[27:50.00]找到更多新的节目
+
+[27:52.00]找到更多新的节目
+
+[27:54.00]找到更多新的节目
+
+[27:56.00]找到更多新的节目
+
+[27:58.00]找到更多新的节目
+
+[28:00.00]找到更多新的节目
+
+[28:02.00]找到更多新的节目
+
+[28:04.00]找到更多新的节目
+
+[28:06.00]找到更多新的节目
+
+[28:08.00]找到更多新的节目
+
+[28:10.00]找到更多新的节目
+
+[28:12.00]找到更多新的节目
+
+[28:14.00]找到更多新的节目
+
+[28:16.00]找到更多新的节目
+
+[28:18.00]找到更多新的节目
+
+[28:20.00]找到更多新的节目
+
+[28:22.00]找到更多新的节目
+
+[28:24.00]找到更多新的节目
+
+[28:26.00]找到更多新的节目
+
+[28:28.00]找到更多新的节目
+
+[28:30.00]找到更多新的节目
+
+[28:32.00]找到更多新的节目
+
+[28:34.00]找到更多新的节目
+
+[28:36.00]找到更多新的节目
+
+[28:38.00]找到更多新的节目
+
+[28:40.00]找到更多新的节目
+
+[28:42.00]找到更多新的节目
+
+[28:44.00]找到更多新的节目
+
+[28:46.00]找到更多新的节目
+
+[28:48.00]找到更多新的节目
+
+[28:50.00]找到更多新的节目
+
+[28:52.00]找到更多新的节目
+
+[28:54.00]找到更多新的节目
+
+[28:56.00]找到更多新的节目
+
+[28:58.00]找到更多新的节目
+
+[29:00.00]找到更多新的节目
+
+[29:02.00]找到更多新的节目
+
+[29:04.00]找到更多新的节目
+
+[29:06.00]找到更多新的节目
+
+[29:08.00]找到更多新的节目
+
+[29:10.00]找到更多新的节目
+
+[29:12.00]找到更多新的节目
+
+[29:14.00]找到更多新的节目
+
+[29:16.00]找到更多新的节目
+
+[29:18.00]找到更多新的节目
+
+[29:20.00]找到更多新的节目
+
+[29:22.00]找到更多新的节目
+
+[29:24.00]找到更多新的节目
+
+[29:26.00]找到更多新的节目
+
+[29:28.00]找到更多新的节目
+
+[29:30.00]找到更多新的节目
+
+[29:32.00]找到更多新的节目
+
+[29:34.00]找到更多新的节目
+
+[29:36.00]找到更多新的节目
+
+[29:38.00]找到更多新的节目
+
+[29:40.00]找到更多新的节目
+
+[29:42.00]找到更多新的节目
+
+[29:44.00]找到更多新的节目
+
+[29:46.00]找到更多新的节目
+
+[29:48.00]找到更多新的节目
+
+[29:50.00]找到更多新的节目
+
+[29:52.00]找到更多新的节目
+
+[29:54.00]找到更多新的节目
+
+[29:56.00]找到更多新的节目
+
+[29:58.00]找到更多新的节目
+
+[30:00.00]找到更多新的节目
+
+[30:02.00]找到更多新的节目
+
+[30:04.00]找到更多新的节目
+
+[30:06.00]找到更多新的节目
+
+[30:08.00]找到更多新的节目
+
+[30:10.00]找到更多新的节目
+
+[30:12.00]找到更多新的节目
+
+[30:14.00]找到更多新的节目
+
+[30:16.00]找到更多新的节目
+
+[30:18.00]找到更多新的节目
+
+[30:20.00]找到更多新的节目
+
+[30:22.00]找到更多新的节目
+
+[30:24.00]找到更多新的节目
+
+[30:26.00]找到更多新的节目
+
+[30:28.00]找到更多新的节目
+
+[30:30.00]找到更多新的节目
+
+[30:32.00]找到更多新的节目
+
+[30:34.00]找到更多新的节目
+
+[30:36.00]找到更多新的节目
+
+[30:38.00]找到更多新的节目
+
+[30:40.00]找到更多新的节目
+
+[30:42.00]找到更多新的节目
+
+[30:44.00]找到更多新的节目
+
+[30:46.00]找到更多新的节目
+
+[30:48.00]找到更多新的节目
+
+[30:50.00]所以我们现在 slowly
+
+[30:52.00]改变了 现在我们现在慢慢改变了
+
+[30:54.00]我在将来归咖巴斯
+
+[30:56.00]没有改变了 没有改变了
+
+[30:58.00]但是现在我们
+
+[31:00.00]有一个世界的 一个世界的
+
+[31:02.00]有一个世界的 哪里有Claude
+
+[31:04.00]还有Gemnite 还有Gypsy4
+
+[31:06.00]希望更多的 会变得更多
+
+[31:08.00]希望更多的 希望更多的
+
+[31:10.00]所以 说说 说说
+
+[31:12.00]我们的视频 视频 视频
+
+[31:14.00]所以 非常大 非常大 因为
+
+[31:16.00]我们的视频 也不可以用
+
+[31:18.00]但我认为
+
+[31:20.00]大陆的社区
+
+[31:22.00]要他们继续建设
+
+[31:24.00]然后他们必须找一些方法
+
+[31:26.00]才能发现他们会做的
+
+[31:28.00]所以他们明白
+
+[31:30.00]他们必须解决他们要的
+
+[31:32.00]但是我们的视频 也有
+
+[31:34.00]Mistral 也有
+
+[31:36.00]Grock 现在
+
+[31:38.00]Grock 1 从 从 oktober
+
+[31:40.00]是公司
+
+[31:41.00]对对对对
+
+[31:42.00]你以为Grock是Grock the chip company
+
+[31:44.00]Grock the chip company
+
+[31:46.00]当然 浪漫3是
+
+[31:48.00]所有人都在问
+
+[31:50.00]我的感觉是
+
+[31:52.00]小时候 宋可伯
+
+[31:54.00]刚才说了浪漫3
+
+[31:56.00]说了 至少从
+
+[31:58.00]一个想法的选择
+
+[32:00.00]他不想 怎么做
+
+[32:02.00]要保持
+
+[32:04.00]能够保持 能够保持
+
+[32:06.00]Mistral 也想想
+
+[32:08.00]你去到 怎么
+
+[32:10.00]他 能够发展
+
+[32:12.00]每个人都好 任何
+
+[32:14.00]对 从我听过
+
+[32:16.00]在GDC 浪漫3
+
+[32:18.00]最大的模式是
+
+[32:20.00]260-300billion
+
+[32:22.00]所以那是大部分的
+
+[32:24.00]那不是一个开放模式
+
+[32:26.00]你不能给人们
+
+[32:28.00]300billion的模式
+
+[32:30.00]要用它 非常有力量
+
+[32:32.00]所以我认为
+
+[32:34.00]它是 可能是开放模式
+
+[32:36.00]但那是一个不同的问题
+
+[32:38.00]对对对
+
+[32:40.00]它是 比他们做的
+
+[32:42.00]在开放模式
+
+[32:44.00]在浪漫上
+
+[32:46.00]你能够使用
+
+[32:48.00]开放的AI
+
+[32:50.00]开放的安逸
+
+[32:52.00]一些公司在
+
+[32:54.00]中央的强硬度
+
+[32:56.00]所以我们
+
+[32:58.00]在Buckets 上
+
+[33:00.00]他们做了很多
+
+[33:02.00]他们比PyDorch 更好
+
+[33:04.00]那是 可能
+
+[33:06.00]在弥術上
+
+[33:08.00]可能是 可能是
+
+[33:10.00]I love the duck destroying
+
+[33:12.00]a lot of monopolies arc
+
+[33:14.00]it's been very entertaining
+
+[33:16.00]let's bridge into the
+
+[33:18.00]big tech side of this
+
+[33:20.00]I think when I did my episode
+
+[33:22.00]I added this as an additional war
+
+[33:24.00]that's something I'm paying attention to
+
+[33:26.00]so we've got
+
+[33:28.00]Microsoft's moves with inflection
+
+[33:30.00]which I think potentially are
+
+[33:32.00]being read as
+
+[33:34.00]a shift vis-a-vis the relationship
+
+[33:36.00]with open AI
+
+[33:38.00]missure a large relationship
+
+[33:40.00]seems to reinforce as well
+
+[33:42.00]we have apple potentially
+
+[33:44.00]entering the race finally
+
+[33:46.00]giving up project titan
+
+[33:48.00]and trying to spend more effort on this
+
+[33:50.00]although counterpoint
+
+[33:52.00]we also have them talking about
+
+[33:54.00]there being reports of a deal with google
+
+[33:56.00]which is interesting to see
+
+[33:58.00]what their strategy there is
+
+[34:00.00]and then metas been largely quiet
+
+[34:02.00]we just talked about the main piece
+
+[34:04.00]but there's spoilers
+
+[34:06.00]one of those things has been most interesting
+
+[34:08.00]to you guys as you think about
+
+[34:10.00]what's going to shake out for the rest of this year
+
+[34:12.00]let's take a crack
+
+[34:14.00]the reason we don't have a fifth war
+
+[34:16.00]for the big tech wars
+
+[34:18.00]that's one of those things where I just feel
+
+[34:20.00]we don't cover differently
+
+[34:22.00]from other media channels
+
+[34:24.00]I guess
+
+[34:26.00]in our entire interest
+
+[34:28.00]we try not to cover the big tech gamethrones
+
+[34:30.00]or it's proxied through
+
+[34:32.00]all the other four wars anyway
+
+[34:34.00]there's just a lot of overlap
+
+[34:36.00]but yeah I think absolutely personally
+
+[34:38.00]the most interesting one is apple entering the race
+
+[34:40.00]they actually release, they are announced
+
+[34:42.00]their first large language model that they train themselves
+
+[34:44.00]it's like a 30 billion multimodal model
+
+[34:46.00]people weren't that impressed
+
+[34:48.00]but it was like the first time
+
+[34:50.00]that apple has kind of showcased that
+
+[34:52.00]we're training large models in house as well
+
+[34:54.00]of course they might be
+
+[34:56.00]doing this deal with google
+
+[34:58.00]it sounds very sort of rumourary to me
+
+[35:00.00]and it's probably if it's on device
+
+[35:02.00]it's going to be a smaller model
+
+[35:03.00]it's going to be smarter auto complete
+
+[35:05.00]I don't know what to say
+
+[35:07.00]I'm still here dealing with
+
+[35:09.00]Siri which hasn't
+
+[35:11.00]probably hasn't been updated since
+
+[35:13.00]God knows when it was introduced
+
+[35:15.00]it's horrible and it
+
+[35:17.00]makes me so angry
+
+[35:19.00]one as an apple customer and user
+
+[35:21.00]I'm just hoping for better ai on apple itself
+
+[35:23.00]but two they are
+
+[35:25.00]the gold standard
+
+[35:27.00]when it comes to local devices
+
+[35:29.00]personal compute and trust
+
+[35:31.00]you trust them with your data
+
+[35:33.00]and I think
+
+[35:35.00]that's what a lot of people are looking for in ai
+
+[35:37.00]that they love the benefits of ai
+
+[35:39.00]they don't love the downsides
+
+[35:41.00]which is that you have to send all your data
+
+[35:43.00]to some clouds somewhere and some of this data
+
+[35:45.00]that we're going to feed ai is the most personal data there is
+
+[35:47.00]so apple being
+
+[35:49.00]one of the most trusted personal
+
+[35:51.00]data companies I think it's very important
+
+[35:53.00]that they enter the ai race
+
+[35:55.00]and I hope to see more out of them
+
+[35:57.00]to me the biggest question
+
+[35:59.00]it's like who's paying who
+
+[36:01.00]because for the browsers
+
+[36:03.00]google pays apple like 18
+
+[36:05.00]20 billion every year
+
+[36:07.00]to be the default browser
+
+[36:09.00]is google going to pay you to have javanai
+
+[36:11.00]or is apple paying google to have javanai
+
+[36:13.00]I think that's like what I'm most interested
+
+[36:15.00]to figure out because with the browsers
+
+[36:17.00]it's like it's the entry point
+
+[36:19.00]to the thing so it's really valuable
+
+[36:21.00]to be the default that's what google pays
+
+[36:23.00]but I wonder if the perception in ai
+
+[36:25.00]is going to be like hey
+
+[36:27.00]you have a good local model on my phone
+
+[36:29.00]to be worth me purchasing your device
+
+[36:31.00]and that's going to drive apple
+
+[36:33.00]to be the one buying the model
+
+[36:35.00]but then like Sean said
+
+[36:37.00]they're doing the mm1 themselves
+
+[36:39.00]are they saying we do models
+
+[36:41.00]but they're not as good as the google ones
+
+[36:43.00]I don't know the whole thing is really confusing
+
+[36:45.00]but it makes for a great meme
+
+[36:47.00]material on twitter
+
+[36:49.00]I think like
+
+[36:51.00]they are possibly more than
+
+[36:53.00]open ai and mic microsoft and amazon
+
+[36:55.00]they are the most full stack company there is
+
+[36:57.00]in computing
+
+[36:59.00]and so
+
+[37:01.00]like they own the chips man
+
+[37:03.00]like they manufacture everything
+
+[37:05.00]so if there was a company
+
+[37:07.00]that could seriously challenge
+
+[37:09.00]the other ai players it would be apple
+
+[37:11.00]and it's
+
+[37:13.00]I don't think it's as hard as self-driving
+
+[37:15.00]so like maybe they've just been
+
+[37:17.00]investing in the wrong thing this whole time
+
+[37:19.00]Wallstreet certainly thinks so
+
+[37:21.00]Wallstreet love that move man
+
+[37:23.00]there's a big sigh of relief
+
+[37:25.00]well let's move away
+
+[37:27.00]from sort of the big stuff
+
+[37:29.00]I think to both of your points
+
+[37:31.00]can I drop one factoid
+
+[37:33.00]about this wallstreet thing
+
+[37:35.00]I went and looked at
+
+[37:37.00]when from being a VR company
+
+[37:39.00]to an ai company
+
+[37:41.00]and I think
+
+[37:43.00]the stock
+
+[37:45.00]I'm trying to look up the details now
+
+[37:47.00]the stock has gone up 187%
+
+[37:49.00]since limo one
+
+[37:51.00]$830 billion in market value
+
+[37:53.00]created in the past year
+
+[37:55.00]if you haven't seen that chart
+
+[37:59.00]it's actually remarkable if you draw
+
+[38:01.00]a little arrow on it
+
+[38:03.00]it's likeno we're an ai company now
+
+[38:05.00]forget the VR thing
+
+[38:07.00]it isn't interesting
+
+[38:11.00]no I think unless you called it
+
+[38:13.00]zuck's disruptor arc or whatever
+
+[38:15.00]he really does
+
+[38:17.00]he is in the midst of a total
+
+[38:19.00]it's a redemption arc or it's just
+
+[38:21.00]it's something different where
+
+[38:23.00]he's sort of the spoiler like
+
+[38:25.00]people loved him
+
+[38:27.00]just freestyle talking about why he thought
+
+[38:29.00]they had a better headset than apple
+
+[38:31.00]even if they didn't agree they just loved
+
+[38:33.00]he was going direct to camera and talking about it
+
+[38:35.00]for five minutes or whatever
+
+[38:37.00]that's a fascinating shift that I don't think
+
+[38:39.00]anyone had on their bingo card
+
+[38:41.00]whatever two years ago
+
+[38:43.00]it's still there in cn5 d-long
+
+[38:45.00]don't write it off
+
+[38:47.00]we need to see him fight in the coliseum
+
+[38:49.00]no I think in terms of
+
+[38:51.00]self
+
+[38:53.00]management life leadership
+
+[38:55.00]there's a lot of lessons to learn from him
+
+[38:57.00]you might kind of quibble
+
+[38:59.00]with the social impact of facebook
+
+[39:01.00]but just himself
+
+[39:03.00]in terms of personal growth
+
+[39:05.00]and perseverance through
+
+[39:07.00]a lot of change
+
+[39:09.00]everyone throwing stuff his way
+
+[39:11.00]I think there's a lot to say about
+
+[39:13.00]to learn from zuck
+
+[39:15.00]he's my age
+
+[39:17.00]awesome
+
+[39:19.00]so one of the big things
+
+[39:21.00]that I think you guys have
+
+[39:23.00]distinct and unique insight into
+
+[39:25.00]being where you are and where you work on
+
+[39:27.00]iswhat developers
+
+[39:29.00]are getting really excited about right now
+
+[39:31.00]and by that I mean on the one hand
+
+[39:33.00]certainly start ups who are actually
+
+[39:35.00]formalized and formed to start ups
+
+[39:37.00]but also just in terms of
+
+[39:39.00]what people are spending their nights and weekends on
+
+[39:41.00]what they're coming to hackathons to do
+
+[39:43.00]and you know I think it's a
+
+[39:45.00]it's such a fascinating indicator
+
+[39:47.00]for where things are headed like
+
+[39:49.00]if you zoom back a year
+
+[39:51.00]right now was right when everyone was getting
+
+[39:53.00]so so excited about
+
+[39:55.00]ai agent stuff
+
+[39:57.00]auto gpt and baby agi and these things were like
+
+[39:59.00]if you dropped anything on youtube about those
+
+[40:01.00]like instantly tens of thousands of views
+
+[40:03.00]I know because I had like
+
+[40:05.00]a 50,000 view video
+
+[40:07.00]like the second day that I was doing
+
+[40:09.00]the show on youtube you know because I was talking about
+
+[40:11.00]auto gpt and so anyways
+
+[40:13.00]you know obviously that's sort of not totally
+
+[40:15.00]come to fruition yet but what are some of the
+
+[40:17.00]trends and what you guys are seeing in terms of
+
+[40:19.00]people's interest and what people are building
+
+[40:21.00]I can start maybe with the agents part
+
+[40:23.00]and then I know Sean is doing a
+
+[40:25.00]diffusion meetup tonight there's
+
+[40:27.00]a lot of different things
+
+[40:29.00]the agent wave has been the most
+
+[40:31.00]interesting kind of like dream
+
+[40:33.00]to reality
+
+[40:35.00]arc so auto gpt I think
+
+[40:37.00]they went from zero to like
+
+[40:39.00]125,000 get up stars in six weeks
+
+[40:41.00]and then one year later
+
+[40:43.00]they have 150,000
+
+[40:45.00]stars so there's kind of been a big
+
+[40:47.00]plot so I mean you might say
+
+[40:49.00]there's just not that many people that can
+
+[40:51.00]start it you know everybody already started
+
+[40:53.00]but the promise of
+
+[40:55.00]hey I'll just give you a goal
+
+[40:57.00]and you do it I think it's like
+
+[40:59.00]amazing to get people's
+
+[41:01.00]imagination going you know
+
+[41:03.00]they're like oh wow this is
+
+[41:05.00]this is awesome everybody
+
+[41:07.00]can try this to do anything
+
+[41:09.00]but then as technologists
+
+[41:11.00]you're like well that's
+
+[41:13.00]that's just like not possible you know
+
+[41:15.00]we would have like solved everything and
+
+[41:17.00]I think it takes a little bit to go from
+
+[41:19.00]the promise and the hope
+
+[41:21.00]that people show you to then
+
+[41:23.00]trinate yourself and going back to say
+
+[41:25.00]okay this is not really working for me and
+
+[41:27.00]David won from adept you know
+
+[41:29.00]they in our episode he specifically said
+
+[41:31.00]we don't want to do a bottom sub product
+
+[41:33.00]you knowwe don't want something that everybody
+
+[41:35.00]could try because it's really hard to get it
+
+[41:37.00]to be reliable so
+
+[41:39.00]we're seeing a lot of companies
+
+[41:41.00]doing vertical agents that
+
+[41:43.00]are narrow for a specific
+
+[41:45.00]domain and they're very good at something
+
+[41:47.00]Myconover who was at Databricks before
+
+[41:49.00]is also a friend of Layton space
+
+[41:51.00]he's doing this new company go bright with
+
+[41:53.00]doing AI agents for financial research
+
+[41:55.00]and that's it you know and
+
+[41:57.00]they're doing very well there are
+
+[41:59.00]other companies doing it in
+
+[42:01.00]security doing it in
+
+[42:03.00]compliance doing it in legal
+
+[42:05.00]all of these things that like
+
+[42:07.00]people nobody
+
+[42:09.00]just wakes up and sayoh I
+
+[42:11.00]cannot wait to go on auto gpd and ask
+
+[42:13.00]it to do a compliance review of my thing
+
+[42:15.00]you know just not what inspires people
+
+[42:17.00]so think the gap on the developer
+
+[42:19.00]side has been the more bottom sub
+
+[42:21.00]hacker mentality is trying to build
+
+[42:23.00]this like very generic
+
+[42:25.00]agents that can do a lot of open
+
+[42:27.00]ended task and then the more business
+
+[42:29.00]side of things is like hey if I want
+
+[42:31.00]to raise my next round I cannot
+
+[42:33.00]just like sit around the mess
+
+[42:35.00]mess around with like super generic
+
+[42:37.00]stuff I need to find a use case that
+
+[42:39.00]really works and I think that that
+
+[42:41.00]is worth for a lot of folks in
+
+[42:43.00]parallel you have a lot of companies
+
+[42:45.00]doing evals there are dozens
+
+[42:47.00]of them that just want to help you
+
+[42:49.00]measure how good your models are
+
+[42:51.00]doing again if you build evals
+
+[42:53.00]you need to also have a restrained
+
+[42:55.00]surface area to actually figure out
+
+[42:57.00]whether or not it's good right because
+
+[42:59.00]there's a lot of stuff going on
+
+[43:01.00]and everything under the sun so
+
+[43:03.00]that's another category where I've
+
+[43:05.00]seen from the sort of
+
+[43:07.00]pitches that I've seen there's a
+
+[43:09.00]lot of interest in the enterprise
+
+[43:11.00]it's just like really fragmented
+
+[43:13.00]because the production use cases
+
+[43:15.00]are just coming like now you know
+
+[43:17.00]there are not a lot of long established
+
+[43:19.00]ones to test against and so
+
+[43:21.00]that's kind of on the virtual agents
+
+[43:23.00]and then the robotic side
+
+[43:25.00]it's probably been the thing that
+
+[43:27.00]the amount of robots that were there
+
+[43:29.00]there were just like robots everywhere
+
+[43:31.00]both in the keynote and then on the show floor
+
+[43:33.00]you would haveboston dynamics
+
+[43:35.00]dogs running around
+
+[43:37.00]there was like this like fox
+
+[43:39.00]robot that had like a virtual face
+
+[43:41.00]that like talked to you and like moved
+
+[43:43.00]in realtime there were industrial
+
+[43:45.00]robots.imbedia did a big push
+
+[43:47.00]on their own omniverse thing which
+
+[43:49.00]is like thisdigital twin
+
+[43:51.00]ofwhatever environments you're in
+
+[43:53.00]that you can use to train the robots agents
+
+[43:55.00]so that kind of takes people back to the
+
+[43:57.00]reinforcement learning days but
+
+[43:59.00]yeah agents people want them
+
+[44:01.00]you know people want them I give a talk
+
+[44:03.00]about the rise of the full stack employees
+
+[44:05.00]and kind of this future the same way
+
+[44:07.00]full stack engineers kind of work
+
+[44:09.00]agross the stack in the future every
+
+[44:11.00]employee is going to interact with
+
+[44:13.00]every part of the organization through
+
+[44:15.00]agents and AI enabled tooling
+
+[44:17.00]this is happening it just needs to be a
+
+[44:19.00]lot more narrow than maybe the first
+
+[44:21.00]approach that we took which is just
+
+[44:23.00]short of super interesting stuff going on
+
+[44:25.00]yeah but he less you covered
+
+[44:27.00]a lot of stuff there I'll separate
+
+[44:29.00]the robotics piece because I feel like
+
+[44:31.00]that's so different from the software world
+
+[44:33.00]but yeah we do we do talk to a lot of
+
+[44:35.00]engineers and you know that that this is
+
+[44:37.00]our sort of bread and butter and I do agree
+
+[44:39.00]that vertical agents have worked out a
+
+[44:41.00]lot better than the horizontal ones
+
+[44:43.00]I think you know the point I'll make
+
+[44:45.00]here is just the reason auto GPT
+
+[44:47.00]and maybe AGI you know it's in the
+
+[44:49.00]name like they were promising AGI
+
+[44:51.00]but I think people are discovering that you cannot
+
+[44:53.00]engineer your way to AGI it has to be
+
+[44:55.00]done at the model level and all these
+
+[44:57.00]engineer and prompt engineering
+
+[44:59.00]hacks on top of it weren't really going to
+
+[45:01.00]get us there in a meaningful way
+
+[45:03.00]without much further
+
+[45:05.00]improvements in the models
+
+[45:07.00]I would say I'll go so far as to say
+
+[45:09.00]even Devin which is I would
+
+[45:11.00]I think the most advanced agents
+
+[45:13.00]that we've ever seen still requires a
+
+[45:15.00]lot of engineering and still probably
+
+[45:17.00]falls apart a lot in terms of like
+
+[45:19.00]practical usage or it's just way too slow
+
+[45:21.00]and expensive for you know what it's
+
+[45:23.00]what is promised in comparison to video
+
+[45:25.00]so yeah that's what happened
+
+[45:27.00]of agents from last year
+
+[45:29.00]but I do see like vertical agents
+
+[45:31.00]being very popular and sometimes
+
+[45:33.00]I think the word agent might even be
+
+[45:35.00]overused sometimes like people don't
+
+[45:37.00]really care whether or not you call it an
+
+[45:39.00]AI agent right like does it replace
+
+[45:41.00]boring menial tasks that I do
+
+[45:43.00]that I might hire human to do or
+
+[45:45.00]that the human who is hired to do it
+
+[45:47.00]doesn't really want to do
+
+[45:49.00]and I think there's absolutely ways
+
+[45:51.00]in sort of a vertical context
+
+[45:53.00]that you can actually go after
+
+[45:55.00]very routine tasks that can be scaled out
+
+[45:57.00]to a lot of you know AI assistance
+
+[45:59.00]so yeah I would
+
+[46:01.00]basically plus one what I said there
+
+[46:03.00]I think it's very very promising
+
+[46:05.00]and I think more people should work on it
+
+[46:07.00]not less like there's not enough people
+
+[46:09.00]like this should be the main thrust
+
+[46:11.00]of the AI engineers to look out
+
+[46:13.00]look for use cases and go to production
+
+[46:15.00]instead of just always working on some
+
+[46:17.00]AI promising thing that never arrives
+
+[46:19.00]I can only add that
+
+[46:21.00]so I've been fiercely making
+
+[46:23.00]tutorials behind the scenes
+
+[46:25.00]around basically everything you can imagine
+
+[46:27.00]with AI we've probably done about 300
+
+[46:29.00]tutorials over the last couple months
+
+[46:31.00]and the verticalized
+
+[46:33.00]anything right like this is a solution
+
+[46:35.00]for your particular job
+
+[46:37.00]or role even if it's way less
+
+[46:39.00]interesting or
+
+[46:41.00]kind of sexy it's like so radically more
+
+[46:43.00]useful to people in terms of intersecting
+
+[46:45.00]with how like those are the ways that people are
+
+[46:47.00]actually adopting AI
+
+[46:49.00]in a lot of cases it's just a
+
+[46:51.00]thing that I do over and over again
+
+[46:53.00]by the way I think that's the same way that even the
+
+[46:55.00]generalized models are getting adopted you know
+
+[46:57.00]it's like I use mid-journey
+
+[46:59.00]for lots of stuff but the main thing I use it for
+
+[47:01.00]is youtube thumbnails every day like day in
+
+[47:03.00]day out I will always do a youtube thumbnail
+
+[47:05.00]you know or two with with mid-journey
+
+[47:07.00]and it's like you can you can start to extrapolate
+
+[47:09.00]that across a lot of things and all of a sudden
+
+[47:11.00]you know AI doesn't
+
+[47:13.00]it looks revolutionary because of
+
+[47:15.00]a million small changes rather than
+
+[47:17.00]one sort of big dramatic change
+
+[47:19.00]and I think that the verticalization of agents
+
+[47:21.00]is sort of a great example of
+
+[47:23.00]how that's going to play out too
+
+[47:25.00]So I'll have one caveat here
+
+[47:27.00]which is I think that because
+
+[47:29.00]multimodal models are now commonplace
+
+[47:31.00]like claw,generalized,openEI
+
+[47:33.00]all very very easily
+
+[47:35.00]multimodal,apples easily
+
+[47:37.00]multimodal,all this stuff
+
+[47:39.00]the switch for agents for sort of general desktop
+
+[47:41.00]browsing
+
+[47:43.00]that I think people need to keep an eye on
+
+[47:45.00]it's not mature yet
+
+[47:47.00]but it is absolutely coming on the way
+
+[47:49.00]and so just as we're starting to talk
+
+[47:51.00]about this verticalization piece
+
+[47:53.00]because that is mature
+
+[47:55.00]that is ready for people to work on
+
+[47:57.00]that is that a lot of people are making really good money doing that
+
+[47:59.00]the thing that's on the rise
+
+[48:01.00]is this sort of drive by vision
+
+[48:03.00]version of the agent where
+
+[48:05.00]they're not specifically taking in text or anything
+
+[48:07.00]but just watching your screen just like someone else would
+
+[48:09.00]and piloting it
+
+[48:11.00]by vision
+
+[48:13.00]in the episode with David
+
+[48:15.00]that will have dropped by the time that this airs
+
+[48:17.00]I think that is the promise of adept
+
+[48:19.00]that is the promise of what
+
+[48:21.00]a lot of these sort of desktop agents are
+
+[48:23.00]and that is the more general purpose
+
+[48:25.00]system that could be as big as
+
+[48:27.00]the browser
+
+[48:29.00]the operating system
+
+[48:31.00]people really want to build that
+
+[48:33.00]foundational piece of software in AI
+
+[48:35.00]and I would see the potential
+
+[48:37.00]therefor desktop agents being that
+
+[48:39.00]that you can have self-driving
+
+[48:41.00]computers. Don't write the horizontal
+
+[48:43.00]piece out. I just think we took a while
+
+[48:45.00]to get there. What else are you guys
+
+[48:47.00]seeing?That's interesting to you.
+
+[48:49.00]I'm looking at your notes and seeing a ton
+
+[48:51.00]ofcategories. I'll take
+
+[48:53.00]the next two as one category
+
+[48:55.00]which is basically alternative architectures.
+
+[48:57.00]The two main things that everyone
+
+[48:59.00]following AI kind of knows now is
+
+[49:01.00]one, the diffusion architecture
+
+[49:03.00]and two, the
+
+[49:05.00]let's just say the decoder only
+
+[49:07.00]transformer architecture that is popular
+
+[49:09.00]by GPT. You can read, you can look on
+
+[49:11.00]YouTube for thousands and thousands of tutorials
+
+[49:13.00]on each of those things. What we are talking about
+
+[49:15.00]here is what's next, what people are
+
+[49:17.00]researching and what could be on the horizon
+
+[49:19.00]that takes the place of those other two things.
+
+[49:21.00]So first we'll talk about transformer architectures
+
+[49:23.00]and then diffusion. So transform the two leading
+
+[49:25.00]candidates are effectively RWKV
+
+[49:27.00]and the state space models. The most recent
+
+[49:29.00]one of which is Mamba but there's others
+
+[49:31.00]and the S4H3
+
+[49:33.00]stuff coming out of Hazy Research in Stanford
+
+[49:35.00]and all of those are
+
+[49:37.00]non-quadratic
+
+[49:39.00]language models that scale
+
+[49:41.00]that promise to scale a lot better
+
+[49:43.00]than the traditional transformer
+
+[49:45.00]that this might be too theoretical
+
+[49:47.00]for most people right now
+
+[49:49.00]but it's going to be
+
+[49:51.00]it's going to come out in weird ways
+
+[49:53.00]where imagine if like right now
+
+[49:55.00]the talk of the town is that
+
+[49:57.00]Claude and Gemini have a million tokens of context
+
+[49:59.00]and you can put in like
+
+[50:01.00]two hours of video now
+
+[50:03.00]but what if you could throw in
+
+[50:05.00]200,000 hours of video
+
+[50:07.00]how does that change your
+
+[50:09.00]usage of AI
+
+[50:11.00]what if you could throw in the entire
+
+[50:13.00]genetic sequence of a human
+
+[50:15.00]and synthesize new drugs
+
+[50:17.00]how does that change things
+
+[50:19.00]we don't know because we haven't had access to this capability
+
+[50:21.00]being so cheap before
+
+[50:23.00]and that's the ultimate promise of these two models
+
+[50:25.00]they're not there yet
+
+[50:27.00]it's a very very good progress
+
+[50:29.00]RWKV and Mamba are probably the two leading examples
+
+[50:31.00]both of which are open source
+
+[50:33.00]that you can try them today
+
+[50:35.00]and have a lot of progress there
+
+[50:37.00]the main thing I'll highlight for other UKV
+
+[50:39.00]is that at the 7B level
+
+[50:41.00]they seem to have beat
+
+[50:43.00]Lama2 in all benchmarks
+
+[50:45.00]that matter
+
+[50:47.00]at the same size for the same amount of training
+
+[50:49.00]as an open source model so that's exciting
+
+[50:51.00]there are 7B now
+
+[50:53.00]they're not at 7TB we don't know if it'll scale
+
+[50:55.00]the other thing is diffusion
+
+[50:57.00]diffusions and transformers
+
+[50:59.00]are kind of on the collision course
+
+[51:01.00]the original stable diffusion already used
+
+[51:03.00]transformers in parts of its architecture
+
+[51:05.00]it seems that transformers are
+
+[51:07.00]eating more and more of those layers
+
+[51:09.00]particularly the VAE layer
+
+[51:11.00]so that's the diffusion transformer
+
+[51:13.00]is what Sora is built on
+
+[51:15.00]the guy who wrote the diffusion transformer
+
+[51:17.00]paper
+
+[51:19.00]bilt pebbles is the lead tech guy
+
+[51:21.00]on Sora
+
+[51:23.00]but there's more sort of experimentation
+
+[51:25.00]to diffusion I'm holding a meetup
+
+[51:27.00]actually here in San Francisco that's going to be like the state of diffusion
+
+[51:29.00]which I'm pretty excited about
+
+[51:31.00]stability's doing a lot of good work
+
+[51:33.00]and if you look at the architecture
+
+[51:35.00]of how they're creating
+
+[51:37.00]stable diffusion 3, our glass diffusion
+
+[51:39.00]and the inconsistency models
+
+[51:41.00]or SDXL turbo
+
+[51:43.00]all of these are like very very interesting innovations
+
+[51:45.00]on like the original idea
+
+[51:47.00]of what stable diffusion was so if you think
+
+[51:49.00]that it is expensive to create or slow to create
+
+[51:51.00]stable diffusion or AI-generated art
+
+[51:53.00]you are not up to date with the latest models
+
+[51:55.00]if you think it is hard to create
+
+[51:57.00]texted images you are not up to date with the latest models
+
+[51:59.00]and people still are kind of far behind
+
+[52:01.00]the last piece of which
+
+[52:03.00]is the wall cut I always kind of hold out
+
+[52:05.00]which is text diffusion
+
+[52:07.00]so instead of using auto generative
+
+[52:09.00]or auto regressive transformers
+
+[52:11.00]can you use text to diffuse
+
+[52:13.00]so you can use diffusion models to diffuse
+
+[52:15.00]and create entire chunks of text
+
+[52:17.00]all at once instead of token by token
+
+[52:19.00]and that is something that mid-journey confirmed today
+
+[52:21.00]because it was only rumored
+
+[52:23.00]the past few months but they confirmed today
+
+[52:25.00]that they were looking into
+
+[52:27.00]all those things are like very exciting new model
+
+[52:29.00]architectures that are maybe something
+
+[52:31.00]that you will see in production 2-3 years from now
+
+[52:33.00]so the couple of the trends that I want to just
+
+[52:37.00]get your takes on because they're sort of something
+
+[52:39.00]that seems like they're coming up are
+
+[52:41.00]one sort of these wearable
+
+[52:43.00]kind of passive
+
+[52:45.00]AI experiences where
+
+[52:47.00]they're absorbing a lot of what's going on around you
+
+[52:49.00]and then kind of bringing things back
+
+[52:51.00]and then the other one that I
+
+[52:53.00]wanted to see if you guys had thoughts on were
+
+[52:55.00]sort of this next generation of chip companies
+
+[52:57.00]obviously there's a huge amount of emphasis
+
+[52:59.00]on hardware and silicon
+
+[53:01.00]and different ways of doing things but
+
+[53:03.00]love your take on neither or both of those
+
+[53:05.00]so wearables
+
+[53:07.00]I'm very excited about it
+
+[53:09.00]I want wearables on me at all times
+
+[53:11.00]I have two right here to quantify my health
+
+[53:13.00]and I'm all for them
+
+[53:15.00]but society is not ready for wearables
+
+[53:17.00]no one's comfortable with
+
+[53:19.00]a device on recording every single
+
+[53:21.00]conversation we have even
+
+[53:23.00]all three of us here as
+
+[53:25.00]podcasters we don't record everything
+
+[53:27.00]that we say and I think
+
+[53:29.00]there's a social shift that needs to happen
+
+[53:31.00]I'm an investor in tab
+
+[53:33.00]they are renaming to a broader
+
+[53:35.00]vision but they are one of the
+
+[53:37.00]three or four leading wearables in this
+
+[53:39.00]space instead of the AI pendants
+
+[53:41.00]or AI OS
+
+[53:43.00]I have seen two humans
+
+[53:45.00]in a while in San Francisco
+
+[53:47.00]I'm very very excited to report
+
+[53:49.00]that there are people walking around
+
+[53:51.00]with those things on their chest
+
+[53:53.00]and it is as goofy as it sounds
+
+[53:55.00]it absolutely is going to fail
+
+[53:57.00]but god bless them for trying
+
+[53:59.00]and I've also bought a rabbit
+
+[54:01.00]so I'm very excited for all those things to arrive
+
+[54:03.00]but yeah people
+
+[54:05.00]are very keen on hardware
+
+[54:07.00]I think the idea that you can have physical objects
+
+[54:09.00]that embody an AI
+
+[54:11.00]do specific things for you
+
+[54:13.00]is as old as
+
+[54:15.00]the sort of golem
+
+[54:17.00]in sort of medieval times
+
+[54:19.00]in terms of like how much we want
+
+[54:21.00]our objects to be smart
+
+[54:23.00]and do things for us
+
+[54:25.00]and I think it's absolutely
+
+[54:27.00]a great play
+
+[54:29.00]the funny thing is people are much more willing
+
+[54:31.00]to pay you up front
+
+[54:33.00]for a hardware device
+
+[54:35.00]than they are willing to pay $8 a month
+
+[54:37.00]suscription recurring for software
+
+[54:39.00]and so the interesting economics
+
+[54:41.00]of these wearable companies
+
+[54:43.00]is they have negative float
+
+[54:45.00]in the sense that people pay deposits up front
+
+[54:47.00]like I paid
+
+[54:49.00]$200 for the rabbit up front
+
+[54:51.00]and I don't get it for another six months
+
+[54:53.00]I paid $600 for the tablet
+
+[54:55.00]I don't get it for another six months
+
+[54:57.00]and then they can take that money
+
+[54:59.00]and sort of invest it in their next
+
+[55:01.00]events or their next properties
+
+[55:03.00]or ventures
+
+[55:05.00]and I think that's a very interesting
+
+[55:07.00]comics from other types of
+
+[55:09.00]AI companies that I see
+
+[55:11.00]and I think just the tactile feel
+
+[55:13.00]of an AI I think is very promising
+
+[55:15.00]I don't know if you have other
+
+[55:17.00]thoughts on the wearable stuff
+
+[55:19.00]open interpreter
+
+[55:21.00]just announced their product four hours ago
+
+[55:23.00]which is not really a wearable
+
+[55:25.00]but it's still like a
+
+[55:27.00]physical device
+
+[55:29.00]it's a push to talk mic
+
+[55:31.00]to a device on your laptop
+
+[55:33.00]it's a $99 push
+
+[55:35.00]but again
+
+[55:37.00]go back to your point
+
+[55:39.00]people are interested in
+
+[55:41.00]spending money for things that they can hold
+
+[55:43.00]I don't know what that means overall
+
+[55:45.00]where things are going but
+
+[55:47.00]making more of this AI
+
+[55:49.00]be a physical part of your life
+
+[55:51.00]I think people are interested in that
+
+[55:53.00]but I agree with Sean
+
+[55:55.00]I talked to Avi about this
+
+[55:57.00]but Avi's point is like
+
+[55:59.00]most consumers care about utility
+
+[56:01.00]more than they care about privacy
+
+[56:03.00]I'm not like you've seen with social media
+
+[56:05.00]but I also think there's a big
+
+[56:07.00]social reaction
+
+[56:09.00]to AI that is like much more
+
+[56:11.00]rooted than the social media one
+
+[56:13.00]but we'll see
+
+[56:15.00]but a lot again a lot of work a lot of developers
+
+[56:17.00]a lot of money going into it so
+
+[56:19.00]there's bound to be experiments
+
+[56:21.00]being run on the chip
+
+[56:23.00]sorry I'll just
+
+[56:25.00]ship it one more thing and then we transition to the chips
+
+[56:27.00]the thing I'll caution people on is
+
+[56:29.00]don't overly focus on the form factor
+
+[56:31.00]the form factor is a delivery mode
+
+[56:33.00]there will be many form factors
+
+[56:35.00]it doesn't matter so much as
+
+[56:37.00]where in the data war does it sit
+
+[56:39.00]it actually is context acquisition
+
+[56:41.00]because and maybe a little bit of multi-modality
+
+[56:43.00]context is king
+
+[56:45.00]if you have access to data
+
+[56:47.00]that no one else has
+
+[56:49.00]then you will be able to create AI that no one else can create
+
+[56:51.00]and so what is the most personal context
+
+[56:53.00]it is your everyday conversation
+
+[56:55.00]it is as close to mapping your
+
+[56:57.00]mental train of thought as possible
+
+[56:59.00]without physically you writing down notes
+
+[57:01.00]so that is the promise
+
+[57:03.00]the ultimate goal here
+
+[57:05.00]which is like personal context
+
+[57:07.00]it's always available on you
+
+[57:09.00]a little ncl that stuff
+
+[57:11.00]that's the frame I want to give people
+
+[57:13.00]that the form factors will change
+
+[57:15.00]and there will be multiple form factors
+
+[57:17.00]but it's the software behind that
+
+[57:19.00]in the personal context
+
+[57:21.00]that you cannot get anywhere else
+
+[57:23.00]that'll win
+
+[57:25.00]so that was wearables
+
+[57:27.00]but it's not even a new release
+
+[57:29.00]because the company I think was started in 2016
+
+[57:31.00]so it's actually quite old
+
+[57:33.00]but now recently captured
+
+[57:35.00]people's imagination with their mixrall
+
+[57:37.00]500 tokens a second demo
+
+[57:39.00]yeah I think so far
+
+[57:41.00]the battle on the GPU side
+
+[57:43.00]has beeneither you go
+
+[57:45.00]kind of like massive chip
+
+[57:47.00]like this irreversible of the world
+
+[57:49.00]where one chip front
+
+[57:51.00]reversible is about 2 million dollars
+
+[57:53.00]you know that's compared
+
+[57:55.00]so you cannot compare one chip
+
+[57:57.00]versus one chip but h100
+
+[57:59.00]it's like 40,000
+
+[58:01.00]the problem with those architectures
+
+[58:03.00]has been
+
+[58:05.00]they want to be very general
+
+[58:07.00]but they wanted to put a lot
+
+[58:09.00]of the SRAM on the chip
+
+[58:11.00]it's much more convenient
+
+[58:13.00]when you're using larger language models
+
+[58:15.00]but the models outpace the size
+
+[58:17.00]of the chips and chips have a much longer
+
+[58:19.00]turnaround cycle
+
+[58:21.00]Grock today is great for the current architecture
+
+[58:23.00]it's a lot more expensive also
+
+[58:25.00]as far as dollar per flop
+
+[58:27.00]but there are the SIK
+
+[58:29.00]when you have very high concurrency
+
+[58:31.00]we actually were much cheaper
+
+[58:33.00]you shouldn't just be looking at the compute power
+
+[58:35.00]for most people this doesn't really matter
+
+[58:37.00]you know like I think that's like the most
+
+[58:39.00]the most interesting thing to me is like
+
+[58:41.00]we've now gone back
+
+[58:43.00]with AI to a world where
+
+[58:45.00]developers care
+
+[58:47.00]about what hardware is running
+
+[58:49.00]which was not the case in traditional software
+
+[58:51.00]maybe 20 years since
+
+[58:53.00]as the cloud has gotten really big
+
+[58:55.00]my thinking is that
+
+[58:57.00]in the next 2-3 years
+
+[58:59.00]we're going to go back to that
+
+[59:01.00]we're like people are not going to be sweating
+
+[59:03.00]what GPU do you have in your cloud
+
+[59:05.00]what do you have
+
+[59:07.00]you want to run this model
+
+[59:09.00]we can run it at the same speed as everybody else
+
+[59:11.00]and then everybody will make different choices
+
+[59:13.00]whether they want to have
+
+[59:15.00]higher front end capital investment
+
+[59:17.00]and then better utilization
+
+[59:19.00]and then upgrade later
+
+[59:21.00]there are a lot of parameters
+
+[59:23.00]and then there's the dark horses
+
+[59:25.00]that is some of the smaller companies
+
+[59:27.00]like Lemurian Labs, MedEx
+
+[59:29.00]that are working on
+
+[59:31.00]maybe not a chip alone
+
+[59:33.00]but also some of the actual math
+
+[59:35.00]infrastructure and the instructions
+
+[59:37.00]that make them run
+
+[59:39.00]there's a lot going on but
+
+[59:41.00]yeah I think the
+
+[59:43.00]the episode with Dylan will be
+
+[59:45.00]interesting for people but
+
+[59:47.00]hey everybody has pros and cons
+
+[59:49.00]it's different than the models
+
+[59:51.00]where you're like oh this one is definitely better for me
+
+[59:53.00]and I'm going to use it
+
+[59:55.00]I think for most people
+
+[59:57.00]it's like fun twitter meming
+
+[59:59.00]99% of people
+
+[60:01.00]that tweet about this stuff
+
+[60:03.00]are never going to buy any of these chips anyway
+
+[60:05.00]so it's really more for entertainment
+
+[60:07.00]wow I mean
+
+[60:09.00]this is serious business here
+
+[60:11.00]the potential new Nvidia
+
+[60:13.00]if anyone can take
+
+[60:15.00]i'm more talking about
+
+[60:17.00]how should people think about it
+
+[60:19.00]i think the end user
+
+[60:21.00]is not impacted as much
+
+[60:23.00]so I disagree
+
+[60:25.00]I love disagreements
+
+[60:27.00]who likes the podcast
+
+[60:29.00]where all 3 people always agree with each other
+
+[60:31.00]you will see the impact of this
+
+[60:33.00]in the tokens per second over time
+
+[60:35.00]this year
+
+[60:37.00]I have very very credible sources
+
+[60:39.00]all telling me
+
+[60:41.00]that the average tokens per second
+
+[60:43.00]right now we have
+
+[60:45.00]somewhere between 50 to 100
+
+[60:47.00]that's the norm for people
+
+[60:49.00]average tokens per second
+
+[60:51.00]will go to 500 to 2000
+
+[60:53.00]this year
+
+[60:55.00]from a number of chip suppliers that I cannot name
+
+[60:57.00]that will cause
+
+[60:59.00]step change in the use cases
+
+[61:01.00]every time you have an order of magnitude improvement
+
+[61:03.00]in the speed of something
+
+[61:05.00]you unlock new use cases
+
+[61:07.00]that become fun instead of a chore
+
+[61:09.00]and so that's what I would
+
+[61:11.00]caution this audience to think about
+
+[61:13.00]what can you do in much higher AI speed
+
+[61:15.00]it's not just things streaming out faster
+
+[61:17.00]it is
+
+[61:19.00]things working in the background a lot more seamlessly
+
+[61:21.00]and therefore being a lot more useful
+
+[61:23.00]than previously imagined
+
+[61:25.00]so that would be my two cents on that
+
+[61:27.00]yeah
+
+[61:29.00]I mean the new
+
+[61:31.00]imbedia chips are also much faster
+
+[61:33.00]to me that's true
+
+[61:35.00]when it comes about startups
+
+[61:37.00]are the startups pushing
+
+[61:39.00]on the incumbents
+
+[61:41.00]or are the incumbents still leading
+
+[61:43.00]and then the startups are riding the same wave
+
+[61:45.00]I don't have yet a good sense of that
+
+[61:47.00]it's next year's
+
+[61:49.00]imbedia release just gonna be better than everything
+
+[61:51.00]that gets released this year
+
+[61:53.00]if that's the case
+
+[61:55.00]it's like okay
+
+[61:57.00]damn jensen
+
+[61:59.00]it's like I'm gonna fight imbedia
+
+[62:01.00]damn jensen got hands
+
+[62:03.00]he really does
+
+[62:05.00]well
+
+[62:07.00]one conversation guys
+
+[62:09.00]just by way of wrapping up
+
+[62:11.00]call it over the next three months
+
+[62:13.00]between now and sort of the beginning of summer
+
+[62:15.00]what's one prediction that each of you has
+
+[62:17.00]could be about anything
+
+[62:19.00]could be big company, could be startup
+
+[62:21.00]could be something you have privileged information that you know
+
+[62:23.00]and you just won't tell us that you actually know
+
+[62:25.00]does it have to be something that we think
+
+[62:27.00]it's gonna be true or like something that we think
+
+[62:29.00]because for me it's like
+
+[62:31.00]is sundar gonna be the CEO of google
+
+[62:33.00]maybe not in three months but maybe in like six months
+
+[62:35.00]in nine months
+
+[62:37.00]people were like oh maybe that miss is gonna be the new CEO
+
+[62:39.00]that was kinda like
+
+[62:41.00]I was busyfishing some deep mind people
+
+[62:43.00]and google people for like a good
+
+[62:45.00]guest for the pot and I was like
+
+[62:47.00]oh what about jeptine and they're like
+
+[62:49.00]well that miss is really like the person that runs everything
+
+[62:51.00]anyway and the stuff
+
+[62:53.00]and it's like interesting
+
+[62:55.00]so I don't know
+
+[62:57.00]serge could come back I don't know
+
+[62:59.00]like he's making more appearances these days
+
+[63:01.00]yeah
+
+[63:03.00]but we can just put it as like
+
+[63:05.00]my thing is like
+
+[63:07.00]CEO change potential
+
+[63:09.00]but again
+
+[63:11.00]three months is too short
+
+[63:13.00]to make a prediction
+
+[63:15.00]time scale might be off
+
+[63:17.00]yeah I mean
+
+[63:21.00]for me I think the progression
+
+[63:23.00]in vertical agent companies
+
+[63:25.00]will keep going
+
+[63:27.00]we just had the other day
+
+[63:29.00]Klarna talking about how they replaced like
+
+[63:31.00]customer support agents with the
+
+[63:33.00]AI agents
+
+[63:35.00]that's to the beginning guys
+
+[63:37.00]imagine this rolling out across most of the fortune
+
+[63:39.00]500
+
+[63:41.00]and I'm not saying this is like a utopian
+
+[63:43.00]scenario there will be very very
+
+[63:45.00]embarrassing and bad outcomes
+
+[63:47.00]of this where like humans would never
+
+[63:49.00]make this mistake but AIS did and
+
+[63:51.00]we'll all laugh at it or it will be very offended
+
+[63:53.00]by whatever you know bad outcome
+
+[63:55.00]it did so we have to be responsible
+
+[63:57.00]and careful in the rollout but yeah this is
+
+[63:59.00]the rolling out you know Alessio likes to say
+
+[64:01.00]that this year is the year of AI production
+
+[64:03.00]let's see it let's see all these vertical
+
+[64:05.00]full stack employees
+
+[64:07.00]come out into the workforce
+
+[64:09.00]love it alright guys well thank you so much
+
+[64:11.00]for sharing your thoughts and insights
+
+[64:13.00]here and can't wait to do it again
+
+[64:15.00]welcome back again
+
+[64:17.00]it's Charlie your AI cohost
+
+[64:19.00]we're now in part two
+
+[64:21.00]of the special weekend episode
+
+[64:23.00]collating some of SWIX and Alessio's
+
+[64:25.00]recent appearances
+
+[64:27.00]if you are not active in the latent space discord
+
+[64:29.00]you might not be aware of the
+
+[64:31.00]many many many in person
+
+[64:33.00]events we host gathering our
+
+[64:35.00]listener community all over the world
+
+[64:37.00]you can see the latent space
+
+[64:39.00]community page for how to join
+
+[64:41.00]and subscribe to our event calendar
+
+[64:43.00]for future meetups
+
+[64:45.00]we're going to share some of our recent live
+
+[64:47.00]appearances in this next part
+
+[64:49.00]starting with the Thursday nights in AI meetup
+
+[64:51.00]a regular fixture in the SF AI scene
+
+[64:53.00]run by imbu and outset capital
+
+[64:55.00]primarily our former guest
+
+[64:57.00]Kanjen Kyu
+
+[64:59.00]Allie Roder and Josh Albrecht
+
+[65:01.00]here's SWIX
+
+[65:03.00]today for those of you who have been here before
+
+[65:07.00]you know the general format
+
+[65:09.00]so we'll do a quick fireside Q&A with SWIX
+
+[65:11.00]where we're asking him the questions
+
+[65:13.00]then we'll actually go to our rapid fire Q&A
+
+[65:15.00]where we're asking really fast
+
+[65:17.00]hopefully spicy questions
+
+[65:19.00]and then we'll open it up to the audience
+
+[65:21.00]for your questions so you guys
+
+[65:23.00]around the room submit your questions
+
+[65:25.00]and we'll go through as many of them as possible
+
+[65:27.00]during that period
+
+[65:29.00]and then actually SWIX brought
+
+[65:31.00]a gift for us which is two
+
+[65:33.00]latent space t-shirts
+
+[65:35.00]AI engineer t-shirts
+
+[65:37.00]and those will be awarded to the
+
+[65:39.00]two spiciest questions
+
+[65:41.00]askers
+
+[65:43.00]and I'll let Josh decide on that
+
+[65:45.00]so we want to get your spiciest takes
+
+[65:47.00]please send them in during the event
+
+[65:49.00]as we're talking and then also at the end
+
+[65:51.00]alright with that
+
+[65:53.00]let's get going
+
+[65:55.00]ok
+
+[65:57.00]welcome SWIX
+
+[65:59.00]how does it feel to be interviewed
+
+[66:01.00]rather than the interviewer
+
+[66:03.00]weird I don't know what to do in this chair
+
+[66:05.00]where should I put my hands
+
+[66:07.00]you look good
+
+[66:09.00]and I also love asking follow up questions
+
+[66:11.00]and I tend to take over panels a lot
+
+[66:13.00]if you ever see me on a panel
+
+[66:15.00]I tend to ask the other panelists questions
+
+[66:17.00]so we should be ready
+
+[66:19.00]this is like a free interview
+
+[66:21.00]so why not
+
+[66:23.00]so you interviewed Ken June
+
+[66:25.00]the CEO of Impeo4 but you didn't interview Josh
+
+[66:27.00]so maybe tonight
+
+[66:29.00]we will look for different questions
+
+[66:31.00]and look for alignment
+
+[66:33.00]I love it
+
+[66:35.00]I just want to hear this story
+
+[66:37.00]you've completely exploded with latent space
+
+[66:39.00]and AI engineer and I know
+
+[66:41.00]you also before all of that
+
+[66:43.00]had exploded in popularity for your
+
+[66:45.00]learning in public movement and your dev tools work
+
+[66:47.00]dav relations work
+
+[66:49.00]so who are you and how did you get here
+
+[66:51.00]let's start with that
+
+[66:53.00]quick story is I'm Sean I'm from Singapore
+
+[66:55.00]SWIX is my initials for those who don't know
+
+[66:57.00]a lot of Singaporeans are ethically Chinese
+
+[66:59.00]and we have Chinese names and English names
+
+[67:01.00]so it's just my initials
+
+[67:03.00]came to the US for college
+
+[67:05.00]and have been here for about 15 years
+
+[67:07.00]but half of that was in finance
+
+[67:09.00]and then the other half was in tech
+
+[67:11.00]tech is where I was most known
+
+[67:13.00]just because I realized that
+
+[67:15.00]I was much more
+
+[67:17.00]aligned towards learning in public
+
+[67:19.00]whereas in finance everything is a trade secret
+
+[67:21.00]everything is zero sum
+
+[67:23.00]whereas in tech you're allowed to come
+
+[67:25.00]to meetups and conferences and share your
+
+[67:27.00]learnings and share your mistakes even
+
+[67:29.00]and that's totally fine you like
+
+[67:31.00]open source your code it's totally fine
+
+[67:33.00]and even better you like contribute
+
+[67:35.00]pr to other people's code which is even better
+
+[67:37.00]and I found that I thrives in that
+
+[67:39.00]learning public environments and
+
+[67:41.00]that kind of got me started
+
+[67:43.00]early higher
+
+[67:45.00]early directly higher at Netlify
+
+[67:47.00]and then did the same at EWS
+
+[67:49.00]Temporal and Airbyte
+
+[67:51.00]and so that's like the whole story
+
+[67:53.00]I can talk more about like developer tooling
+
+[67:55.00]and developer relations if that's something
+
+[67:57.00]that people are interested in
+
+[67:59.00]but I think the more recent thing is AI
+
+[68:01.00]and I started really being interested
+
+[68:03.00]in it mostly because
+
+[68:05.00]the approximate cause of starting learning
+
+[68:07.00]in space was stable diffusion
+
+[68:09.00]when you could run a large model
+
+[68:11.00]on your desktop
+
+[68:13.00]where I was like okay this is
+
+[68:15.00]something qualitatively very different
+
+[68:17.00]and then we started
+
+[68:19.00]in space and
+
+[68:21.00]we have to talk about it on a podcast
+
+[68:23.00]there we go
+
+[68:25.00]it wasn't a podcast for like four months
+
+[68:27.00]and then I had been running a discord
+
+[68:29.00]for DevTools investors
+
+[68:31.00]I also invested in DevTools
+
+[68:33.00]and I advised companies on DevTools
+
+[68:35.00]definition things
+
+[68:37.00]and I think it was the start of 2023
+
+[68:39.00]Alessio and I were both like
+
+[68:41.00]I think we need to get more tokens out of
+
+[68:43.00]people and I was running out of original sources
+
+[68:45.00]to write about
+
+[68:47.00]so I was like okay I'll go get those original sources
+
+[68:49.00]and I think that's when we started the podcast
+
+[68:51.00]and I think it's just the chemistry between us
+
+[68:53.00]the way we spike in different ways
+
+[68:55.00]and also honestly the
+
+[68:57.00]kind participation of the guests
+
+[68:59.00]to give us their time
+
+[69:01.00]getting George Hoss was a big deal
+
+[69:03.00]and also shoutout to Alessio for cold emailing him
+
+[69:05.00]for booking some of our
+
+[69:07.00]biggest guests
+
+[69:09.00]and just working really hard to try to tell the story
+
+[69:11.00]that people can use at work
+
+[69:13.00]I think that there's a lot of AI podcasts out there
+
+[69:15.00]and a lot of AI forums
+
+[69:17.00]or fireside chats with no fire
+
+[69:19.00]that always talk about
+
+[69:21.00]what's your AGI timeline, what's your PDoom
+
+[69:23.00]very very nice hallway conversations
+
+[69:25.00]for freshman year but not very useful
+
+[69:27.00]for work
+
+[69:29.00]and practically making money
+
+[69:31.00]and thinking about
+
+[69:33.00]changing everyday lives
+
+[69:35.00]what's interesting is obviously
+
+[69:37.00]you care about the existential
+
+[69:39.00]safety of the human race
+
+[69:41.00]but in the meantime we got to eat
+
+[69:43.00]so I think that's kind of
+
+[69:45.00]the inspaces niche
+
+[69:47.00]we explicitly don't really talk about AGI
+
+[69:49.00]we explicitly don't talk about
+
+[69:51.00]things that we're a little bit too far out
+
+[69:53.00]we don't do a ton of robotics
+
+[69:55.00]we don't do a ton of high frequency trading
+
+[69:57.00]there's tons of machine learning in there
+
+[69:59.00]but we just don't do that
+
+[70:01.00]we're like what are most software engineers going to need
+
+[70:03.00]because that's our background
+
+[70:05.00]and that's the audience that we serve
+
+[70:07.00]and I think just being really clear on that audience
+
+[70:09.00]has resonated with people
+
+[70:11.00]you would never expect a technical podcast
+
+[70:13.00]to reach a general audience
+
+[70:15.00]like top 10 on the tech charts
+
+[70:17.00]but I've been surprised by that before
+
+[70:19.00]and it's been successful
+
+[70:21.00]I don't know what to say about that
+
+[70:23.00]I think honestly I kind of
+
+[70:25.00]have this negative reaction towards being
+
+[70:27.00] being classified as a podcast
+
+[70:29.00]because the podcast is downstream of ideas
+
+[70:31.00]and it's one mode of conversation
+
+[70:33.00]it's one mode of idea delivery
+
+[70:35.00]but you can deliver ideas
+
+[70:37.00]on a newsletter in a person like this
+
+[70:39.00]there's so many different ways
+
+[70:41.00]and so I think I think about it more as
+
+[70:43.00]we are trying to start or serve an industry
+
+[70:45.00]and that industry is the AI
+
+[70:47.00]engineer industry
+
+[70:49.00]which we can talk about more
+
+[70:51.00]yes let's go into that
+
+[70:53.00]so the AI engineer you penned a piece
+
+[70:55.00]all the rise of the AI engineer
+
+[70:57.00]you tweeted about it
+
+[70:59.00]you also responded
+
+[71:01.00]largely agreeing with what you said
+
+[71:03.00]what is an AI engineer
+
+[71:05.00]the AI engineer is the software engineer
+
+[71:07.00]building with AI
+
+[71:09.00]enhanced by AI
+
+[71:11.00]and eventually it will be a non-human
+
+[71:13.00]engineers writing code for you
+
+[71:15.00]which I know impu is all about
+
+[71:17.00]you're saying eventually the AI engineer
+
+[71:19.00]will become a non-human engineer
+
+[71:21.00]that will be one kind of AI engineer
+
+[71:23.00]that people are trying to build
+
+[71:25.00]and is probably the most furthest away
+
+[71:27.00]because it's so hard
+
+[71:29.00]but there are three types of AI engineer
+
+[71:31.00]one is AI enhanced
+
+[71:33.00]where you use AI products like co-pilot
+
+[71:35.00]and two is AI products engineer
+
+[71:37.00]where you use the exposed AI capabilities
+
+[71:39.00]to the end user
+
+[71:41.00]as a software engineer like not doing pre-training
+
+[71:43.00]not being an ML researcher
+
+[71:45.00]not being an ML engineer
+
+[71:47.00]but just interacting with foundation models
+
+[71:49.00]probably APIs from foundation model labs
+
+[71:51.00]what's the third one
+
+[71:53.00]and the third one is the non-human AI engineer
+
+[71:55.00]the fully autonomous
+
+[71:57.00]dream coder
+
+[71:59.00]how long do you think it is till we get to
+
+[72:01.00]early
+
+[72:03.00]this is my equivalent of AGI timelines
+
+[72:05.00]I know I know
+
+[72:07.00]lots of active
+
+[72:09.00]I have supported companies
+
+[72:11.00]actively working on that
+
+[72:13.00]I think it's more useful to think about levels of autonomy
+
+[72:15.00]and so my answer to that
+
+[72:17.00]is perpetually five years away
+
+[72:19.00]until it figures it out
+
+[72:21.00]no but my actual antidote
+
+[72:23.00]the closest comparison we have to that is self-driving
+
+[72:25.00]we're doing this in San Francisco
+
+[72:27.00]for those who are watching on the live stream
+
+[72:29.00]if you haven't come to San Francisco
+
+[72:31.00]and seen and taken a waymo ride
+
+[72:33.00]just come get a friend and take a waymo ride
+
+[72:35.00]I remember 2014
+
+[72:37.00]we covered a little bit of autos in my hedge fund
+
+[72:39.00]and I remember telling a friend
+
+[72:41.00]I was self-driving cars around the corner
+
+[72:43.00]this is it
+
+[72:45.00]parking will be a thing of the past
+
+[72:47.00]and it didn't happen for the next ten years
+
+[72:49.00]but now most of us in San Francisco
+
+[72:51.00]can take it for granted
+
+[72:53.00]I think you just have to
+
+[72:55.00]be mindful that
+
+[72:57.00]the rough edges take a long time
+
+[72:59.00]and yes it's going to work in demos
+
+[73:01.00]then it's going to work a little bit further out
+
+[73:03.00]and it's just going to take a long time
+
+[73:05.00]the more useful mental model I have
+
+[73:07.00]is levels of autonomy
+
+[73:09.00]in self-driving you have level one, two, three, four, five
+
+[73:11.00]just the amount of human attention
+
+[73:13.00]that you getat first
+
+[73:15.00]your hands are always on ten and two
+
+[73:17.00]and you have to pay attention to the driving
+
+[73:19.00]every thirty seconds
+
+[73:21.00]and eventually you can sleep in the car
+
+[73:23.00]there's a whole spectrum of that
+
+[73:25.00]what's the equivalent for coding
+
+[73:27.00]keep your hands on the keyboard
+
+[73:29.00]and then eventually you have to accept everything
+
+[73:31.00]that's good
+
+[73:33.00]approve the PR
+
+[73:35.00]approve this looks good
+
+[73:37.00]that's the dream that people want
+
+[73:39.00]because really you unlock a lot of coding
+
+[73:41.00]when people, non-technical people can file issues
+
+[73:43.00]and then the
+
+[73:45.00]AI engineer can sort of automatically
+
+[73:47.00]write code, pass your tests
+
+[73:49.00]and if it kind of works as
+
+[73:51.00]as advertised, then you can just kind of merge it
+
+[73:53.00]and then you
+
+[73:55.00]10x, 100x, the number of developers in your company
+
+[73:57.00]immediately
+
+[73:59.00]so that's the goal, that's the holy grail
+
+[74:01.00]we're not there yet, but sweep, code gen
+
+[74:03.00]there's a bunch of companies, magic probably
+
+[74:05.00]are all working towards that
+
+[74:07.00]and so the TLDR
+
+[74:09.00]the thing that we covered, Leslie and I covered
+
+[74:11.00]in the January recap
+
+[74:13.00]that we did was that
+
+[74:15.00]the people should have in their minds is the inner loop
+
+[74:17.00]versus the outer loop for the developer
+
+[74:19.00]inner loop is everything that happens
+
+[74:21.00]in your IDE between git commits
+
+[74:23.00]and outer loop is what happens
+
+[74:25.00]when you push up your git commit to
+
+[74:27.00]github, for example, or gitlab
+
+[74:29.00]and that's a nice split, which means
+
+[74:31.00]everything local, everything that needs to be fast
+
+[74:33.00]everything that's kind of very hands on for developers
+
+[74:35.00]is probably easier to automate
+
+[74:37.00]or easier to have code assistance
+
+[74:39.00]that's what copilot is, that's what all those things are
+
+[74:41.00]and then everything that happens autonomously
+
+[74:43.00]or effectively away from the keyboard
+
+[74:45.00]with a github issue or something
+
+[74:47.00]that is more outer loop where
+
+[74:49.00]you're relying a lot more on autonomy
+
+[74:51.00]and we are maybe not smart enough
+
+[74:53.00]to do that yet
+
+[74:55.00]Do you have any thoughts on the user experience
+
+[74:57.00]and how that will change? One of the things that
+
+[74:59.00]has happened for me, looking at some of these products
+
+[75:01.00]and playing around with things ourselves
+
+[75:03.00]it sounds good to have an automated PR
+
+[75:05.00]then you get an automated PR and you're like
+
+[75:07.00]I really don't want to review 300 lines
+
+[75:09.00]of generated code and find the bug
+
+[75:11.00]and then you have another agent that's a reviewer
+
+[75:13.00]and then they just come up to you
+
+[75:15.00]and then you like tell it, go fix it
+
+[75:17.00]and it comes back with 400 lines
+
+[75:19.00]yes, there is a length bias to code
+
+[75:21.00]and
+
+[75:23.00]you do have higher passing rates
+
+[75:25.00]in PRs, this is a documented human behavior
+
+[75:27.00]thing, send me two lines of code
+
+[75:29.00]I will review the shit out of that
+
+[75:31.00]I don't know if I can swear on this
+
+[75:33.00]send me 200 lines of code, looks good to me
+
+[75:35.00]guess what, the agents are going to
+
+[75:37.00]perfectly happy to copy that behavior from us
+
+[75:39.00]when we actually want them to do the opposite
+
+[75:41.00]so, yeah, I think
+
+[75:43.00]the GAN model of code generation
+
+[75:45.00]is probably not going to work super well
+
+[75:47.00]I do think we probably need just
+
+[75:49.00]better planning from the start
+
+[75:51.00]which is, I'm just repeating the
+
+[75:53.00]MBU thesis, by the way
+
+[75:55.00]just go listen to Kanjin talk about this
+
+[75:57.00]she's much better at it than I am
+
+[75:59.00]but yeah, I think
+
+[76:01.00]the code review thing is going to be
+
+[76:03.00]I think that what codium
+
+[76:05.00]the two codiums, the Israeli one
+
+[76:07.00]Israeli codium
+
+[76:09.00]with the E
+
+[76:11.00]Yeah, codium with the E
+
+[76:13.00]They still have refused to rename
+
+[76:15.00]I'm friends with both of them
+
+[76:17.00]Every month I'm like
+
+[76:19.00]Guys, lets all come to one room
+
+[76:21.00]Someone's got a fold
+
+[76:23.00]Codium with the E has gone
+
+[76:25.00]You got to write the test first
+
+[76:27.00]It's like a sort of tripartite
+
+[76:29.00]relationship, again this is also covered on a
+
+[76:31.00]podcast with them which is fantastic
+
+[76:33.00]You interview me, you sort of through me
+
+[76:35.00]So, codium is like
+
+[76:37.00]They've already thought this all the way through
+
+[76:39.00]They're like, okay you write the user story
+
+[76:41.00]From the user story you generate all the tests
+
+[76:43.00]You also generate the code
+
+[76:45.00]And you update any one of those
+
+[76:47.00]They all have to update together
+
+[76:49.00]And probably the critical factor
+
+[76:51.00]Is the test generation from the story
+
+[76:53.00]Because everything else
+
+[76:55.00]Can just kind of bounce the hits off
+
+[76:57.00]Of those things until they pass
+
+[76:59.00]So you have to write good tests
+
+[77:01.00]It's kind of like the eat your vegetables of coding
+
+[77:03.00]Which nobody really wants to do
+
+[77:05.00]And so I think it's a really smart tactic
+
+[77:07.00]To go to market
+
+[77:09.00]By saying we automatically generate
+
+[77:11.00]Test for you and start not great
+
+[77:13.00]But then get better
+
+[77:15.00]And eventually you get to
+
+[77:17.00]The weakest point in the chain
+
+[77:19.00]For the entire loop of code generation
+
+[77:21.00]What do you think their weakest link is
+
+[77:25.00]The weakest link
+
+[77:27.00]It's test generation
+
+[77:29.00]Do you think there's a way to
+
+[77:31.00]To make that actually better
+
+[77:33.00]For making it better
+
+[77:37.00]You have to have good isolation
+
+[77:39.00]And I think
+
+[77:41.00]Proper, serverless cloud environment
+
+[77:43.00]Is integral to that
+
+[77:45.00]It could be like a fly I/O
+
+[77:47.00]It could be like
+
+[77:49.00]A cloud fair worker
+
+[77:51.00]It depends how many resources
+
+[77:53.00]Your test environment needs
+
+[77:55.00]And effectively I was talking about this
+
+[77:57.00]I think with maybe Rob earlier in the audience
+
+[77:59.00]Every agent needs a sandbox
+
+[78:01.00]If you're a code agent you need a coding sandbox
+
+[78:03.00]But if you're whatever
+
+[78:05.00]Like MBU used to have this
+
+[78:07.00]Minecrafts clone that was much faster
+
+[78:09.00]If you have a model of the real world
+
+[78:11.00]You have to go generate some plan
+
+[78:13.00]Or some code or some whatever
+
+[78:15.00]Test it against that real world
+
+[78:17.00]So that you can get this iterative feedback
+
+[78:19.00]And then get the final result back
+
+[78:21.00]That is somewhat validated against the real world
+
+[78:23.00]And so you need a really good sandbox
+
+[78:25.00]I don't think people
+
+[78:27.00]This is an infrastructure need
+
+[78:29.00]Humans have had for a long time
+
+[78:31.00]We've never solved it for ourselves
+
+[78:33.00]And now we have to solve it for about
+
+[78:35.00]A thousand times larger quantity of agents
+
+[78:37.00]Then actually exist
+
+[78:39.00]And so I think we actually have to
+
+[78:41.00]Involve a lot more infrastructure
+
+[78:43.00]In order to serve these things
+
+[78:45.00]So for those who don't know
+
+[78:47.00]I also have
+
+[78:49.00]So we're talking about the rise of AI engineer
+
+[78:51.00]I also have various conversations
+
+[78:53.00]About immutable infrastructure
+
+[78:55.00]And this is all of the kinds
+
+[78:57.00]In order to solve
+
+[78:59.00]Agents and coding agents
+
+[79:01.00]We're going to have to solve the other stuff too along the way
+
+[79:03.00]And it's really neat for me
+
+[79:05.00]To see all that tied together in my deaf tools work
+
+[79:07.00]That all these themes kind of reemerge
+
+[79:09.00]Just naturally just because
+
+[79:11.00]Everything we needed for humans
+
+[79:13.00]We just need a hundred times more for agents
+
+[79:15.00]Let's talk about the AI engineer
+
+[79:17.00]AI engineer has become a whole thing
+
+[79:19.00]It's become a term and also a conference
+
+[79:21.00]And tell us more
+
+[79:23.00]And a job title
+
+[79:25.00]Tell us more about that
+
+[79:27.00]What's going on there
+
+[79:29.00]That is a very big cloud of things
+
+[79:31.00]I would just say
+
+[79:33.00]I think it's an emergent industry
+
+[79:35.00]I've seen this happen repeatedly
+
+[79:37.00]So the general term
+
+[79:39.00]So the general term is software engineer
+
+[79:41.00]Or programmer
+
+[79:43.00]In the 70s and 80s
+
+[79:45.00]There would not be a senior engineer
+
+[79:47.00]There would just be an engineer
+
+[79:49.00]I don't think you would even call
+
+[79:51.00]What about a member of the technical staff
+
+[79:53.00]Oh yeah MTS
+
+[79:55.00]Very very elite
+
+[79:57.00]So these striations appear
+
+[79:59.00]When the population grows
+
+[80:01.00]And the technical depth grows
+
+[80:03.00]Over time
+
+[80:05.00]When it starts
+
+[80:07.00]Not that important
+
+[80:09.00]It's just going to specialize
+
+[80:11.00]I've seen this happen for front end
+
+[80:13.00]For DevOps, for data
+
+[80:15.00]I can't remember what else I listed in that piece
+
+[80:17.00]But those are the main three that I was around for
+
+[80:19.00]Now a lot of people are arguing
+
+[80:21.00]That there is the ML researcher
+
+[80:23.00]The ML engineer
+
+[80:25.00]Who sort of pairs with the researcher
+
+[80:27.00]Sometimes they also call research engineer
+
+[80:29.00]And then on the other side of the fence
+
+[80:31.00]It's just software engineers
+
+[80:33.00]And that's how it was until about last year
+
+[80:35.00]And now there's this specializing
+
+[80:37.00]And rising class of people
+
+[80:39.00]Building AI specific software
+
+[80:41.00]That are not any of those previous titles
+
+[80:43.00]That I just mentioned
+
+[80:45.00]And that's the thesis of the AI engineer
+
+[80:47.00]In the emerging category of start-ups
+
+[80:49.00]Of jobs
+
+[80:51.00]I've had people from Meta, IBM, Microsoft
+
+[80:53.00]Open AI tell me that their title
+
+[80:55.00]Is now AI engineer
+
+[80:57.00]So like I can see that this is a trend
+
+[80:59.00]And I think that's what Andre called out
+
+[81:01.00]In his post that like just mathematically
+
+[81:03.00]Just the limitations in terms of talents
+
+[81:05.00]Research talents and GPUs
+
+[81:07.00]That all these will tend to concentrate
+
+[81:09.00]In a few labs
+
+[81:11.00]And everyone else
+
+[81:13.00]Are just going to have to rely on them
+
+[81:15.00]Or build differentiation of products
+
+[81:17.00]In other ways, and those will be AI engineers
+
+[81:19.00]So mathematically there will be more AI engineers
+
+[81:21.00]Than ML engineers, it's just the truth
+
+[81:23.00]Right now it's the other way
+
+[81:25.00]Right now the number of AI engineers
+
+[81:27.00]Is maybe 10x less
+
+[81:29.00]So I think that the ratio will invert
+
+[81:31.00]And I think the goal of it in space
+
+[81:33.00]And the goal of the conference
+
+[81:35.00]And anything else I do is to serve
+
+[81:37.00]That growing audience
+
+[81:39.00]To make the distinction clear
+
+[81:41.00]If I'm a software engineer
+
+[81:43.00]What do I have to learn
+
+[81:45.00]What additional capabilities does that
+
+[81:47.00]Type of engineer have
+
+[81:49.00]Funny you say that
+
+[81:51.00]I don't actually have a specific blog
+
+[81:53.00]Post on how to like
+
+[81:55.00]Change classes
+
+[81:57.00]I do think I always think about this
+
+[81:59.00]In terms of Baldur's Gate
+
+[82:01.00]DND ruleset number
+
+[82:03.00]5.1 or whatever
+
+[82:05.00]So I kind of intentionally left that open
+
+[82:07.00]To leave space for others
+
+[82:09.00]I think when you start an industry
+
+[82:11.00]That's the only way to guarantee
+
+[82:13.00]That it will fail
+
+[82:15.00]I do have a take
+
+[82:17.00]Obviously because a lot of people
+
+[82:19.00]Are asking me where to start
+
+[82:21.00]And I think basically
+
+[82:23.00]So what we have is
+
+[82:25.00]Late and Space University
+
+[82:27.00]We just finished working on
+
+[82:29.00]Day 7 today
+
+[82:31.00]It's a seven-day project
+
+[82:33.00]And I think we've done a great job
+
+[82:35.00]We've done a great job
+
+[82:37.00]We've done a great job
+
+[82:39.00]And we've just finished working on
+
+[82:41.00]Day 7 today
+
+[82:43.00]It's a seven-day email course
+
+[82:45.00]Where it basically like
+
+[82:47.00]It is completely designed to answer
+
+[82:49.00]The question of like
+
+[82:51.00]I'm an existing software engineer
+
+[82:53.00]I know how to code
+
+[82:55.00]But I don't get all this AI stuff
+
+[82:57.00]I've been living under a rock
+
+[82:59.00]Or like it's just too overwhelming for me
+
+[83:01.00]You have to pick for me
+
+[83:03.00]Or curate for me as a trusted friend
+
+[83:05.00]And I have one hour a day for seven days
+
+[83:07.00]It's image generation
+
+[83:09.00]It's code generation
+
+[83:11.00]It's audio
+
+[83:13.00]ASR
+
+[83:15.00]Audio speech recognition
+
+[83:17.00]And then I forget
+
+[83:19.00]What the fifth and sixth one is
+
+[83:21.00]But the last day is agents
+
+[83:23.00]And so basically I'm just like
+
+[83:25.00]Here are seven projects that you should do
+
+[83:27.00]To feel like you can do anything in AI
+
+[83:29.00]You can't really do everything in AI
+
+[83:31.00]Just from that small list
+
+[83:33.00]But I think it's just like anything
+
+[83:35.00]Go through like a set list
+
+[83:37.00]Of things that are basic skills
+
+[83:39.00]That I think everyone in this industry should have
+
+[83:41.00]To be at least conversant
+
+[83:43.00]In if someone, if like a boss comes to you
+
+[83:45.00]And goes like hey can we build this
+
+[83:47.00]You don't even know if the answer is no
+
+[83:49.00]So I want you to move towards
+
+[83:51.00]From like unknown unknowns to at least known unknowns
+
+[83:53.00]And I think that's where you start
+
+[83:55.00]Being competent as an engineer
+
+[83:57.00]So yeah that's LSU
+
+[83:59.00]In Space University just to trigger the tigers
+
+[84:03.00]So do you think in the future that people
+
+[84:05.00]An AI engineer is going to be someone's
+
+[84:07.00]Full-time job like people are just going to be
+
+[84:09.00]AI engineers or do you think it's going to be
+
+[84:11.00]More of a world where I'm a software engineer
+
+[84:13.00]And like 20% of my time
+
+[84:15.00]I'm using open ai's, api's
+
+[84:17.00]And I'm working on prompt engineering
+
+[84:19.00]And stuff like that and using code pilot
+
+[84:21.00]You just reminded me of day six's
+
+[84:23.00]Open source models and fine tuning
+
+[84:25.00]I think it will be a spectrum. That's why I don't want to be
+
+[84:27.00]Like too definitive about it. Like we have
+
+[84:29.00]Full-time front-end engineers and we have part-time
+
+[84:31.00]And you dip into that community whenever you want
+
+[84:33.00]But wouldn't it be nice if there was a
+
+[84:35.00]Collective name for that community
+
+[84:37.00]So you could go find it, you could find each other
+
+[84:39.00]And like honestly that's really it
+
+[84:41.00]Like a lot of people, a lot of companies are pinging me
+
+[84:43.00]For like hey I want to hire this kind of person
+
+[84:45.00]But you can't hire that person
+
+[84:47.00]But I want someone like that
+
+[84:49.00]And then people on the labor side
+
+[84:51.00]Were pinging me going like okay I want to do more
+
+[84:53.00]In this space but where do I go
+
+[84:55.00]And I think just having that shelling point
+
+[84:57.00]Of what an industry title or name is
+
+[84:59.00]And sort of building out that mythology
+
+[85:01.00]And community and conference
+
+[85:03.00]I think is helpful hopefully
+
+[85:05.00]And I don't have any prescriptions
+
+[85:07.00]On whether or not it's a full-time job
+
+[85:09.00]I do think over time it's going to become
+
+[85:11.00]More of a full-time job
+
+[85:13.00]And that's great for the people who want to do that
+
+[85:15.00]And the companies that want to employ that
+
+[85:17.00]But it's absolutely like you can take it part-time
+
+[85:19.00]Like jobs come in many formats
+
+[85:21.00]Yep that makes sense
+
+[85:23.00]And then you have a huge world fair
+
+[85:25.00]Coming up
+
+[85:27.00]Tell me about that
+
+[85:29.00]So part of I think
+
+[85:31.00]What creating industry requires
+
+[85:33.00]Is to let people gather in one place
+
+[85:35.00]And also for me
+
+[85:37.00]To get high quality talks out of people
+
+[85:39.00]You have to create an event out of it
+
+[85:41.00]Otherwise they don't do the work
+
+[85:43.00]So last year we did
+
+[85:45.00]The AI engineer summit which went very well
+
+[85:47.00]And people can see that online
+
+[85:49.00]And we're very happy with how that turned out
+
+[85:51.00]This year we want to go four times bigger
+
+[85:53.00]With the world fair
+
+[85:55.00]To try to reflect AI engineering
+
+[85:57.00]As it is in 2024
+
+[85:59.00]I always admired
+
+[86:01.00]Two conferences in this respect
+
+[86:03.00]One is NEURUPs which I went to last year
+
+[86:05.00]And documented on the pod which was fantastic
+
+[86:07.00]And two which is KubeCon
+
+[86:09.00]From the other side of my life
+
+[86:11.00]Which is the sort of cloud registration
+
+[86:13.00]In DevOps world
+
+[86:15.00]So NEURUPs is the one place that you go to
+
+[86:17.00]To I think it's the top conference
+
+[86:19.00]I mean there's others
+
+[86:21.00]That you can kind of consider
+
+[86:23.00]So NEURUPs
+
+[86:25.00]NEURUPs is where the research sciences are the stars
+
+[86:27.00]The researchers are the stars, the PhDs are the stars
+
+[86:29.00]Mostly it's just PhDs on the job market
+
+[86:31.00]It's really funny to go
+
+[86:33.00]Especially these things
+
+[86:35.00]It's really funny to go to NEURUPs
+
+[86:37.00]And the PhDs trying to back them
+
+[86:39.00]There are lots of VCs trying to back there
+
+[86:41.00]This year
+
+[86:43.00]So NEURUPs research sciences are the stars
+
+[86:45.00]And I wanted for AI engineers
+
+[86:47.00]Engineer to be the star
+
+[86:49.00]To show off their tooling
+
+[86:51.00]And their techniques
+
+[86:53.00]And their difficulty
+
+[86:55.00]Moving all these ideas from research into production
+
+[86:57.00]The other one was KubeCon
+
+[86:59.00]Where you could honestly just go
+
+[87:01.00]And not attend any of the talks
+
+[87:03.00]And just walk the floor
+
+[87:05.00]And figure out what's going on in DevOps
+
+[87:07.00]Which is fantastic
+
+[87:09.00]So that curation
+
+[87:11.00]And that bringing together of an industry
+
+[87:13.00]Is what I'm going for for the conference
+
+[87:15.00]And it's coming in June
+
+[87:17.00]The most important thing to be honest
+
+[87:19.00]The most important thing was to buy the domain
+
+[87:21.00]So we got AI.engineer
+
+[87:23.00]People are like engineer is a domain
+
+[87:25.00]And funny enough
+
+[87:27.00].engineer was cheaper than.engineering
+
+[87:29.00]I don't understand why
+
+[87:31.00]But that's up to the domain people
+
+[87:33.00]All right
+
+[87:35.00]Josh, any questions on agents
+
+[87:37.00]Yeah, I think maybe you have a lot of
+
+[87:39.00]Experience and exposure
+
+[87:41.00]Talking to all these companies and founders
+
+[87:43.00]And researchers and everyone that's on your podcast
+
+[87:45.00]Do you have like
+
+[87:47.00]Do you feel like you have a good kind of perspective
+
+[87:49.00]On some of the things that like
+
+[87:51.00]Some of the kind of technical issues having seen
+
+[87:53.00]Like we were just talking about like for
+
+[87:55.00]Coding agents like oh how you know
+
+[87:57.00]The value of test is really important
+
+[87:59.00]There are other things like for you know retrieval
+
+[88:01.00]Like now, you know, we have these models
+
+[88:03.00]Coming out with a million context, you know
+
+[88:05.00]Are a million tokens of context like 30 million
+
+[88:07.00]Is retrieval going to matter anymore
+
+[88:09.00]The huge context matter like what do you think
+
+[88:11.00]Specific about the long context thing
+
+[88:13.00]Sure, yeah
+
+[88:15.00]I was going to ask a few other ones after that
+
+[88:17.00]So go for that one first
+
+[88:19.00]That's what I was going to ask first
+
+[88:21.00]Yeah, let's talk about the long context
+
+[88:23.00]So for those who don't know
+
+[88:25.00]Long context was kind of
+
+[88:27.00]In the air last year but really
+
+[88:29.00]Really really really came into focus this year
+
+[88:31.00]With Gemini1.5 having
+
+[88:33.00]Million token context and saying that
+
+[88:35.00]It was in research for 10 million tokens
+
+[88:37.00]And that means that
+
+[88:39.00]You can put you like
+
+[88:41.00]No longer have to really think about
+
+[88:43.00]What you retrieve
+
+[88:45.00]No longer really think about
+
+[88:47.00]What you have to put into context
+
+[88:49.00]You can just kind of throw the entire
+
+[88:51.00]Knowledge base in there or books or film
+
+[88:53.00]Anything like that and that's fantastic
+
+[88:55.00]A lot of people are thinking that it kills
+
+[88:57.00]Rag and I think like one that's not true
+
+[88:59.00]Because for any kind of cost reason
+
+[89:01.00]You know, you still pay per token
+
+[89:03.00]So basically Google is like perfectly happy
+
+[89:05.00]To let you pay a million tokens
+
+[89:07.00]Every single time you make an API call
+
+[89:09.00]But good luck, you know, having $100 API call
+
+[89:11.00]You don't want to be slow, no explanation needed
+
+[89:13.00]And then finally my criticism of
+
+[89:15.00]Long context is that it's also not debuggable
+
+[89:17.00]Like if something goes wrong with the result
+
+[89:19.00]You can't do like the rags decomposition
+
+[89:21.00]Of where the source of error
+
+[89:23.00]Like you just have to like go like
+
+[89:25.00]It's the way it's bro like it's somewhere in there
+
+[89:27.00]I'm sorry, I pretty strongly agree with this
+
+[89:29.00]Why do you think people are making such
+
+[89:31.00]Crazy Long context windows
+
+[89:32.00]People love to kill rag
+
+[89:33.00]It's so much because it's too expensive
+
+[89:35.00]It's so expensive like you said
+
+[89:37.00]Yeah, I just think I'm just calling it
+
+[89:39.00]It's a different dimension
+
+[89:40.00]I think it's an option that's great when it's there
+
+[89:42.00]Like when I'm prototyping
+
+[89:43.00]I do not ever want to worry about context
+
+[89:45.00]And I'm going to call stuff a few times
+
+[89:47.00]And I don't want to run the errors
+
+[89:48.00]I don't want to have it set up a complex retrieval system
+
+[89:50.00]Just to prototype something
+
+[89:52.00]But once I done prototyping
+
+[89:53.00]Then I'll worry about all the other rags stuff
+
+[89:55.00]And yes, I'm going to buy some system
+
+[89:57.00]Or build some system or whatever to go do that
+
+[90:00.00]So I think it's just like an improvement
+
+[90:03.00]In like one dimension that you need
+
+[90:05.00]But the improvements in the other dimensions
+
+[90:07.00]And it's all needed
+
+[90:08.00]Like this space isn't going to keep growing
+
+[90:10.00]In unlimited fashion
+
+[90:12.00]I do think that this combined with multi-modality
+
+[90:16.00]Does unlock new things
+
+[90:18.00]So that's what I was going to ask about next
+
+[90:19.00]It's like how important is multimodal
+
+[90:21.00]Like great, you know, generating videos
+
+[90:23.00]Sure, whatever
+
+[90:24.00]Okay, how many of us need to generate videos that often
+
+[90:26.00]It'd be cool for TV shows, sure, but like, yeah
+
+[90:28.00]I think it's pretty important
+
+[90:30.00]The one thing that in
+
+[90:31.00]When we launched the in space podcast
+
+[90:33.00]We listed a bunch of interest areas
+
+[90:35.00]One thing I love about being explicit
+
+[90:37.00]Or intentional about our work
+
+[90:40.00]Is that you list the things that you're interested in
+
+[90:42.00]And you list the things that you're not interested in
+
+[90:44.00]And people are very unwilling
+
+[90:46.00]To have an entire interest list
+
+[90:48.00]One of the things that we are not interested in
+
+[90:50.00]Was multimodality last year
+
+[90:52.00]Because everyone was
+
+[90:54.00]I was just like, okay, you can generate images
+
+[90:56.00]And they're pretty, but like, not a giant business
+
+[90:58.00]I was wrong
+
+[90:59.00]Journey is a giant, giant, massive business
+
+[91:01.00]That no one can understand or get into
+
+[91:03.00]But also, I think
+
+[91:05.00]Being able to natively understand
+
+[91:07.00]Audio and video and code
+
+[91:09.00]I consider code a special modality
+
+[91:11.00]All that is
+
+[91:13.00]Very, like, qualitatively different
+
+[91:15.00]Then translating it into English first
+
+[91:17.00]And using English as, you know, like a bottleneck
+
+[91:19.00]Or pipe
+
+[91:20.00]And then, you know, applying it in LLMs
+
+[91:22.00]The ability of LLMs to reason across modalities
+
+[91:25.00]Gives you something more than you could
+
+[91:27.00]Individually by using Texas
+
+[91:29.00]The universal interface
+
+[91:30.00]So I think that's useful
+
+[91:32.00]So concretely, what does that mean?
+
+[91:34.00]It means that
+
+[91:35.00]So, I think the reference post for everyone
+
+[91:37.00]That you should have in your head
+
+[91:39.00]Is Simon Willison's post on Gemini1.5's
+
+[91:41.00]Video capability
+
+[91:43.00]Where he basically shot a video of
+
+[91:45.00]His bookshelf, just kind of scanning through it
+
+[91:47.00]And he was able to give back a complete
+
+[91:49.00]JSON list of the books and the authors
+
+[91:51.00]And all the details that were visible there
+
+[91:53.00]Helucinated some of it
+
+[91:55.00]Which is, you know, another issue
+
+[91:57.00]But I think it's just like unlocks this use case
+
+[91:59.00]That you just would not even try to code
+
+[92:01.00]Without the native video understanding capability
+
+[92:04.00]And obviously, like, on a technical level
+
+[92:07.00]Video is just a bunch of frames
+
+[92:08.00]So it actually is just image understanding
+
+[92:10.00]But image within the temporal dimension
+
+[92:12.00]Which this month, I think
+
+[92:14.00]Became much more of a important thing
+
+[92:16.00]Like the integration of space and time
+
+[92:18.00]In transformers
+
+[92:19.00]I don't think anyone was really talking about that
+
+[92:21.00]Until this month
+
+[92:22.00]And now it's the only thing anyone can ever think about
+
+[92:24.00]For Sora and for all the other stuff
+
+[92:26.00]The last thing I'll say
+
+[92:28.00]Which is against this trend of
+
+[92:30.00]Every modality is important
+
+[92:32.00]They just do all the modalities
+
+[92:34.00]I kind of agree with Nat Friedman
+
+[92:36.00]Who actually kind of pointed out
+
+[92:37.00]Just before the Gemini thing blew up
+
+[92:39.00]This month
+
+[92:41.00]Which was, like, why is it that
+
+[92:43.00]OpenAI is pushing Dolly so hard
+
+[92:45.00]Why is Bing pushing Bing Image Creator?
+
+[92:47.00]Like, it's not apparent
+
+[92:49.00]That you have to create images to create AGI
+
+[92:51.00]But every lab just seems to want to do this
+
+[92:54.00]And I kind of agree
+
+[92:55.00]That it's not on the critical path
+
+[92:57.00]Especially for image generation
+
+[92:59.00]Maybe image understanding, video understanding
+
+[93:01.00]Yeah, consumption
+
+[93:02.00]But generation
+
+[93:04.00]Maybe we'll be wrong next year
+
+[93:06.00]Just catches you a bunch of flak with, like, you know
+
+[93:08.00]Culture war things
+
+[93:10.00]It's true
+
+[93:11.00]All right, we're going to move into
+
+[93:13.00]Rapidfire Q&A
+
+[93:14.00]So, we're going to ask you a question
+
+[93:16.00]I don't want to overthink it, baby
+
+[93:18.00]We're going to audience the Q&A
+
+[93:20.00]So, I'll tell you
+
+[93:22.00]All right
+
+[93:23.00]We've cut the Q&A section for time
+
+[93:25.00]So, if you want to hear the spicy questions
+
+[93:27.00]Head over to the Thursday nights in A.I. video
+
+[93:29.00]for the full discussion
+
+[93:31.00]Next up, we have another former guest
+
+[93:33.00]Dylan Patel of Semi-analysis
+
+[93:36.00]the inventor of the GPU rich poor divide
+
+[93:39.00]who did a special live show with us in March
+
+[93:41.00]But that means you can finally
+
+[93:45.00]side-to-side A.V. test your favorite boba shops
+
+[93:48.00]We got Gongcha, we got boba guys
+
+[93:50.00]We got the lemon, whatever it's called
+
+[93:53.00]So, let us know what's your favorite
+
+[93:55.00]We also have Slido up to submit questions
+
+[93:58.00]We already had Dylan on the podcast
+
+[94:00.00]And this guy tweets and writes about all kinds of stuff
+
+[94:02.00]So, we want to know what people want to know more about
+
+[94:05.00]Rather than just being self-truined
+
+[94:07.00]But we'll do a state of the union, maybe
+
+[94:10.00]Everybody wants to know about GROC
+
+[94:12.00]Everybody wants to know whether or not
+
+[94:14.00]MVD is going to zero after GROC
+
+[94:16.00]Everybody wants to know what's going on with AMD
+
+[94:18.00]We got some AMD folks in the crowd, too
+
+[94:20.00]So, feel free to interact at any time
+
+[94:23.00]We have HECCLE, HECCLE, please
+
+[94:25.00]Good comedians show their color
+
+[94:28.00]The way they can handle the crowd when they're HECCLE
+
+[94:31.00]Do not throw boba
+
+[94:33.00]Do not throw boba at the stand
+
+[94:35.00]We cannot afford another podcast thing set up
+
+[94:37.00]Awesome, welcome everybody
+
+[94:39.00]To the seminalysis and latest space crossover
+
+[94:41.00]Dylan texted me on signal
+
+[94:43.00]He was like, dude, how do I easily set up a meetup
+
+[94:46.00]And here we are today
+
+[94:48.00]As you might have seen, there's no name tags
+
+[94:50.00]There's a bunch of things that are missing
+
+[94:52.00]But we did our best
+
+[94:53.00]It was extremely easy, right?
+
+[94:55.00]Like, I texted Alessio, he's like, yo, I got the spot
+
+[94:58.00]Okay, cool, here's a link, send it to people
+
+[95:00.00]Send it, and then show it up
+
+[95:03.00]And like, there was zero other organization that I required
+
+[95:07.00]So, everybody's here
+
+[95:09.00]A lot of seminalysis fans, we get in the crowd
+
+[95:12.00]Everybody wants to know more about
+
+[95:14.00]What's going on today and GROC has definitely been the hottest thing
+
+[95:16.00]We just recorded our monthly podcast today
+
+[95:18.00]And we didn't talk that much about GROC
+
+[95:20.00]Because we wanted you to talk more about
+
+[95:22.00]And then we'll splice you into our monthly recap
+
+[95:24.00]So, let's start there
+
+[95:26.00]So, you guys are the two GROC spreadsheetors
+
+[95:29.00]So, we broke out some GROC numbers
+
+[95:32.00]Because everyone was wondering
+
+[95:33.00]There's two things going on, right?
+
+[95:34.00]One, you know, how important, you know
+
+[95:36.00]How does it achieve the inference speed that it does
+
+[95:38.00]That it has been demonstrated by GROC chat
+
+[95:41.00]And two, how does it achieve its price promise
+
+[95:44.00]That is promised, that is sort of the public pricing
+
+[95:46.00]Of 27 cents per million token
+
+[95:48.00]And there's been a lot of speculation
+
+[95:50.00]Or, you know, some numbers thrown out there
+
+[95:52.00]I put out some tentative numbers
+
+[95:54.00]And you put out different numbers
+
+[95:55.00]But I'll just kind of lay that as the groundwork
+
+[95:58.00]Like, everyone's very excited about
+
+[96:00.00]Essentially like five times faster
+
+[96:02.00]Token generation than any other LLM currently
+
+[96:05.00]And that unlocks interesting downstream possibilities
+
+[96:08.00]If it's sustainable
+
+[96:10.00]If it's affordable
+
+[96:11.00]And so I think your question
+
+[96:13.00]Or reading your piece on GROC
+
+[96:15.00]Which is on the screen right now
+
+[96:16.00]Is it sustainable
+
+[96:18.00]So, like many things
+
+[96:20.00]This is VC funded, including this Boba
+
+[96:23.00]No, I'm just kidding
+
+[96:24.00]I'm paying for the Boba
+
+[96:25.00]Thank you, Semunas, the subscribers
+
+[96:27.00]I hope he pays for it
+
+[96:29.00]I pay for it right now
+
+[96:30.00]That's true, Alessio has the IOU
+
+[96:33.00]Right?
+
+[96:34.00]And that's all it is
+
+[96:35.00]But yeah, like many things, you know
+
+[96:37.00]They're not making money off of their inference service
+
+[96:40.00]They're just throwing it out there for cheap
+
+[96:42.00]And hoping to get business
+
+[96:43.00]And maybe raise money off of that
+
+[96:45.00]And I think that's a that's a fine use case
+
+[96:48.00]But the question is like
+
+[96:49.00]How much money are they losing, right?
+
+[96:51.00]And that's sort of what I went through
+
+[96:52.00]Breaking down in this article
+
+[96:54.00]That's on the screen
+
+[96:55.00]And that's it's pretty clear
+
+[96:56.00]They're like seven to ten X off
+
+[97:00.00]Like break even on their inference API
+
+[97:02.00]Which is like horrendous
+
+[97:04.00]Like far worse than any other
+
+[97:06.00]Sort of inference API provider
+
+[97:08.00]So this is like a simple simple
+
+[97:10.00]Cost thing that was pulled up
+
+[97:12.00]You can either inference at
+
+[97:13.00]Very high throughput
+
+[97:14.00]Or you can inference at
+
+[97:15.00]Very high very low latency
+
+[97:17.00]With GPUs you can do both
+
+[97:18.00]With GROC you can only do one
+
+[97:20.00]Of course with GROC
+
+[97:21.00]You can do that one faster
+
+[97:22.00]Marginally faster than a
+
+[97:24.00]Inference latency optimized GPU server
+
+[97:26.00]But no one offers inference latency
+
+[97:28.00]Optimized GPU servers
+
+[97:29.00]Because you would just burn money
+
+[97:31.00]Makes no economic sense to do so
+
+[97:33.00]Until maybe someone's willing
+
+[97:34.00]To pay for that
+
+[97:35.00]So GROC service
+
+[97:36.00]You know the surface looks awesome
+
+[97:37.00]Compared to everyone else's service
+
+[97:39.00]Which is throughput optimized
+
+[97:41.00]And then when you compare to
+
+[97:43.00]Input optimized scenario
+
+[97:44.00]GPUs look quite slow
+
+[97:46.00]But the reality is
+
+[97:47.00]Is they're serving 64
+
+[97:49.00]128 users at once
+
+[97:51.00]They have a batch size
+
+[97:52.00]How many users are being
+
+[97:53.00]Served at once
+
+[97:54.00]Where as GROC is taking
+
+[97:55.00]576 chips
+
+[97:56.00]And they're not really
+
+[97:58.00]Doing that efficiently
+
+[97:59.00]They're serving a far far
+
+[98:01.00]Fewer number of users
+
+[98:02.00]But extremely fast
+
+[98:03.00]Now that could be worthwhile
+
+[98:05.00]If they can get there
+
+[98:07.00]The number of users
+
+[98:08.00]They're serving at once up
+
+[98:10.00]But that's extremely hard
+
+[98:11.00]Because they don't have
+
+[98:12.00]Their chip
+
+[98:13.00]So they can't store
+
+[98:14.00]KVcash
+
+[98:15.00]KVcash for all the
+
+[98:16.00]Various different users
+
+[98:17.00]So their crux of the issue
+
+[98:19.00]Is just like hey
+
+[98:20.00]Can they get that performance
+
+[98:22.00]Up as much as they claim they will
+
+[98:24.00]They need to get it up
+
+[98:25.00]More than 10x
+
+[98:26.00]To make this like a reasonable
+
+[98:28.00]Benefit
+
+[98:29.00]In the meantime
+
+[98:30.00]NVIDIA is launching a new
+
+[98:31.00]GPU in two weeks
+
+[98:32.00]That'll be fun at GTC
+
+[98:34.00]And they're constantly
+
+[98:35.00]Pushing software as well
+
+[98:36.00]So we'll see
+
+[98:37.00]If GROC can catch up to that
+
+[98:38.00]The current verdict is
+
+[98:40.00]They're quite far behind
+
+[98:41.00]But hopefully
+
+[98:42.00]That maybe they can
+
+[98:43.00]Get there by scaling
+
+[98:44.00]Their system larger
+
+[98:45.00]Yeah
+
+[98:46.00]I was listening back
+
+[98:47.00]To our original episode
+
+[98:48.00]And you were talking
+
+[98:49.00]About how NVIDIA
+
+[98:50.00]Basically adopted
+
+[98:51.00]This different strategy
+
+[98:52.00]Of just leaning on
+
+[98:54.00]Networking GPUs together
+
+[98:55.00]And it seems like
+
+[98:56.00]GROC has some minor
+
+[98:58.00]Version of that going on
+
+[98:59.00]Here with the GROC rack
+
+[99:00.00]Is it enough?
+
+[99:03.00]What's GROC's next
+
+[99:05.00]Step here strategically?
+
+[99:07.00]Yeah, that's
+
+[99:09.00]The next step is of course
+
+[99:10.00]So right now
+
+[99:12.00]They connect 10 racks
+
+[99:13.00]Of chips together
+
+[99:14.00]And that's the system
+
+[99:15.00]That's running
+
+[99:16.00]On their API today
+
+[99:17.00]Whereas most people
+
+[99:18.00]Who are running
+
+[99:19.00]Mischro are running
+
+[99:20.00]In on two GPUs
+
+[99:22.00]One fourth of a server
+
+[99:24.00]And that rack
+
+[99:25.00]Is not
+
+[99:26.00]Obviously 10 racks
+
+[99:27.00]Is pretty crazy
+
+[99:28.00]But they think
+
+[99:29.00]That they can
+
+[99:30.00]Scale performance
+
+[99:31.00]If they have
+
+[99:32.00]This individual system
+
+[99:33.00]Be 20 racks
+
+[99:34.00]They think they can
+
+[99:35.00]Continue to scale performance
+
+[99:36.00]Extra linearly
+
+[99:37.00]So that'd be amazing
+
+[99:38.00]If they could
+
+[99:39.00]And I'm doubtful
+
+[99:40.00]That's going to be something
+
+[99:42.00]That's scalable
+
+[99:43.00]Especially for
+
+[99:45.00]You know, larger models
+
+[99:47.00]So there's the chip itself
+
+[99:51.00]But there's also a lot
+
+[99:52.00]Of work they're doing
+
+[99:53.00]At the compiler level
+
+[99:54.00]Do you have any good sense
+
+[99:55.00]Of like how easy it is
+
+[99:57.00]To actually work with LPU
+
+[99:59.00]Is that something
+
+[100:00.00]That's going to be
+
+[100:01.00]About on that for them
+
+[100:02.00]So Ollie's in the front
+
+[100:03.00]Right there
+
+[100:04.00]And he knows a ton
+
+[100:05.00]About VLIW architectures
+
+[100:07.00]But to summarize
+
+[100:08.00]In his opinion
+
+[100:09.00]And I think many folks
+
+[100:10.00]Is it's extremely hard
+
+[100:11.00]To program
+
+[100:12.00]These sorts of architectures
+
+[100:14.00]Which is why they have
+
+[100:15.00]Their compiler
+
+[100:16.00]And so on and so forth
+
+[100:17.00]But it's an incredible amount
+
+[100:19.00]Of work for them
+
+[100:20.00]To stand up individual models
+
+[100:22.00]And to get the performance
+
+[100:23.00]Up on them
+
+[100:24.00]Which is what they've been
+
+[100:25.00]Working on
+
+[100:26.00]Whereas GPUs
+
+[100:27.00]Are far more flexible
+
+[100:28.00]Of course
+
+[100:29.00]And so the question is
+
+[100:30.00]Can this compiler
+
+[100:31.00]Continue to extract
+
+[100:32.00]Performance
+
+[100:33.00]Well theoretically
+
+[100:34.00]There's a lot more
+
+[100:35.00]Performance to run
+
+[100:36.00]On the hardware
+
+[100:37.00]And they don't have
+
+[100:38.00]Many things that people
+
+[100:40.00]Generally associate with
+
+[100:42.00]Programmable hardware
+
+[100:44.00]They don't have buffers
+
+[100:45.00]And many other things
+
+[100:46.00]So it makes it very tough
+
+[100:47.00]To do that
+
+[100:48.00]But that's what their
+
+[100:49.00]They're relatively large
+
+[100:51.00]Compiler team is working on
+
+[100:53.00]So I'm not a GPU compiler guy
+
+[100:55.00]But I do want to
+
+[100:56.00]Clarify my understanding
+
+[100:57.00]From what I read
+
+[100:58.00]Which is a lot of
+
+[100:59.00]Catching up to do
+
+[101:00.00]It is
+
+[101:01.00]The crux of it
+
+[101:02.00]Is some kind of speculative
+
+[101:04.00]The word that comes
+
+[101:05.00]The routing
+
+[101:06.00]Of weights
+
+[101:08.00]And work
+
+[101:09.00]That needs to
+
+[101:10.00]Be done or scheduling
+
+[101:11.00]Of work across
+
+[101:12.00]The ten racks
+
+[101:14.00]Of GPUs
+
+[101:15.00]Is that like
+
+[101:17.00]The bulk of the benefit
+
+[101:18.00]That you get
+
+[101:19.00]From the compilation
+
+[101:20.00]So with the grok chips
+
+[101:22.00]What's really
+
+[101:23.00]Interesting is like
+
+[101:25.00]With GPUs
+
+[101:26.00]You can issue
+
+[101:27.00]Certain instructions
+
+[101:29.00]And you will get
+
+[101:30.00]A different result
+
+[101:31.00]Like depending on
+
+[101:32.00]The time I know
+
+[101:33.00]A lot of people
+
+[101:34.00]I'll have
+
+[101:35.00]Have had that experience
+
+[101:36.00]Or like the GPU
+
+[101:37.00]Literally doesn't return
+
+[101:38.00]The numbers it should be
+
+[101:39.00]As basically called
+
+[101:40.00]Non-determinism
+
+[101:41.00]With grok
+
+[101:42.00]Their chip is completely
+
+[101:43.00]Deterministic
+
+[101:44.00]The moment you compile it
+
+[101:45.00]You know exactly how long
+
+[101:46.00]It will take to operate
+
+[101:47.00]Right there is no
+
+[101:48.00]There is no like
+
+[101:49.00]Deviation at all
+
+[101:51.00]And so
+
+[101:52.00]You know they've
+
+[101:53.00]They're planning everything
+
+[101:55.00]Ahead of time
+
+[101:56.00]Like every instruction
+
+[101:57.00]Like it will
+
+[101:58.00]Complete in the time
+
+[101:59.00]That they've planned it for
+
+[102:00.00]And there is no
+
+[102:01.00]I don't know what
+
+[102:02.00]The best way to state this is
+
+[102:03.00]No variance there
+
+[102:04.00]Which is
+
+[102:05.00]Interesting from like
+
+[102:06.00]When you look historically
+
+[102:07.00]They tried to push this
+
+[102:08.00]In automotive
+
+[102:09.00]Because automotive
+
+[102:10.00]You probably want
+
+[102:11.00]Your car to do
+
+[102:12.00]Exactly what you
+
+[102:13.00]I issued it to do
+
+[102:14.00]And not have
+
+[102:15.00]Unpredictability
+
+[102:16.00]But yeah
+
+[102:17.00]Sorry I lost track
+
+[102:18.00]Of the question
+
+[102:19.00]It's okay
+
+[102:20.00]I just wanted to
+
+[102:21.00]Understand a little bit
+
+[102:22.00]More about like
+
+[102:23.00]What people should
+
+[102:24.00]Should know about
+
+[102:25.00]The compiler magic
+
+[102:26.00]That goes on with grok
+
+[102:27.00]Like you know
+
+[102:28.00]Like I think
+
+[102:30.00]From a software
+
+[102:31.00]Like hardware point of view
+
+[102:32.00]That intersection of
+
+[102:34.00]I guess
+
+[102:35.00]So chips have like
+
+[102:36.00]Like I'm still this
+
+[102:37.00]From someone
+
+[102:38.00]Here in the crowd
+
+[102:39.00]But chips have like
+
+[102:40.00]Five you know
+
+[102:41.00]Sort of there's
+
+[102:42.00]Like when you're designing a chip
+
+[102:43.00]There's there's
+
+[102:44.00]It's called PPA right
+
+[102:45.00]Power performance in area
+
+[102:47.00]It's kind of a triangle
+
+[102:48.00]That you optimize around
+
+[102:49.00]And the one thing
+
+[102:50.00]People don't realize
+
+[102:51.00]Is there's a
+
+[102:52.00]There's a third P
+
+[102:53.00]That's like PPA P
+
+[102:55.00]And the last P
+
+[102:56.00]Is pain in the ass
+
+[102:57.00]To program
+
+[102:58.00]And that's
+
+[102:59.00]That is very important
+
+[103:00.00]For like
+
+[103:01.00]High hardware
+
+[103:02.00]Like TPU
+
+[103:04.00]Without the
+
+[103:05.00]Hundreds of people
+
+[103:06.00]That work on the compiler
+
+[103:07.00]And jacks
+
+[103:08.00]And XLA
+
+[103:09.00]And all these sorts of things
+
+[103:10.00]Would be a pain in the ass
+
+[103:11.00]To program
+
+[103:12.00]But Google's got that like
+
+[103:13.00]Plumbing
+
+[103:14.00]Now if you look across
+
+[103:15.00]The ecosystem
+
+[103:16.00]Everything else is a pain
+
+[103:17.00]In the ass to program
+
+[103:18.00]Compared to NVIDIA
+
+[103:19.00]And this applies
+
+[103:20.00]To the grok chip as well
+
+[103:22.00]So yeah
+
+[103:23.00]Question is like
+
+[103:24.00]Can the compiler team
+
+[103:25.00]Get performance up
+
+[103:26.00]Anywhere close to theoretical
+
+[103:28.00]And then can they make
+
+[103:29.00]It not a pain in the ass
+
+[103:30.00]To program
+
+[103:31.00]And then can they make
+
+[103:32.00]It not a pain in the ass
+
+[103:33.00]To program
+
+[103:34.00]And then can they make
+
+[103:35.00]It not a pain in the ass
+
+[103:36.00]To program
+
+[103:37.00]And then can they make
+
+[103:38.00]It not a pain in the ass
+
+[103:39.00]And then can they make
+
+[103:40.00]It not a pain in the ass
+
+[103:41.00]And then can they make
+
+[103:42.00]It not a pain in the ass
+
+[103:43.00]And then can they make
+
+[103:44.00]It not a pain in the ass
+
+[103:45.00]And then can they make
+
+[103:46.00]It not a pain in the ass
+
+[103:47.00]And then can they make
+
+[103:48.00]It not a pain in the ass
+
+[103:49.00]And then can they make
+
+[103:50.00]It not a pain in the ass
+
+[103:51.00]And then can they make
+
+[103:52.00]It not a pain in the ass
+
+[103:53.00]And then can they make
+
+[103:54.00]It not a pain in the ass
+
+[103:55.00]And then can they make
+
+[103:56.00]It not a pain in the ass
+
+[103:57.00]And then can they make
+
+[103:58.00]It not a pain in the ass
+
+[103:59.00]And then can they make
+
+[104:00.00]It not a pain in the ass
+
+[104:01.00]And then can they make
+
+[104:02.00]It not a pain in the ass
+
+[104:03.00]And then can they make
+
+[104:04.00]It not a pain in the ass
+
+[104:05.00]And then can they make
+
+[104:06.00]It not a pain in the ass
+
+[104:07.00]And then can they make
+
+[104:08.00]It not a pain in the ass
+
+[104:09.00]And then can they make
+
+[104:10.00]It not a pain in the ass
+
+[104:11.00]And then can they make
+
+[104:12.00]It not a pain in the ass
+
+[104:13.00]And then can they make
+
+[104:14.00]It not a pain in the ass
+
+[104:15.00]And then can they make
+
+[104:16.00]It not a pain in the ass
+
+[104:17.00]And then can they make
+
+[104:18.00]It not a pain in the ass
+
+[104:19.00]And then can they make
+
+[104:20.00]It not a pain in the ass
+
+[104:21.00]And then can they make
+
+[104:22.00]It not a pain in the ass
+
+[104:23.00]And then can they make
+
+[104:24.00]It not a pain in the ass
+
+[104:25.00]And then can they make
+
+[104:26.00]It not a pain in the ass
+
+[104:27.00]And then can they make
+
+[104:28.00]It not a pain in the ass
+
+[104:29.00]And then can they make
+
+[104:30.00]It not a pain in the ass
+
+[104:31.00]And then can they make
+
+[104:32.00]It not a pain in the ass
+
+[104:33.00]And then can they make
+
+[104:34.00]It not a pain in the ass
+
+[104:35.00]And then can they make
+
+[104:36.00]It not a pain in the ass
+
+[104:37.00]And then can they make
+
+[104:38.00]It not a pain in the ass
+
+[104:39.00]And then can they make
+
+[104:40.00]It not a pain in the ass
+
+[104:41.00]And then can they make
+
+[104:42.00]It not a pain in the ass
+
+[104:43.00]And then can they make
+
+[104:44.00]It not a pain in the ass
+
+[104:45.00]And then can they make
+
+[104:46.00]It not a pain in the ass
+
+[104:47.00]And then can they make
+
+[104:48.00]It not a pain in the ass
+
+[104:49.00]And then can they make
+
+[104:50.00]It not a pain in the ass
+
+[104:51.00]And then can they make
+
+[104:52.00]It not a pain in the ass
+
+[104:53.00]And then can they make
+
+[104:54.00]It not a pain in the ass
+
+[104:55.00]And then can they make
+
+[104:56.00]It not a pain in the ass
+
+[104:57.00]And then can they make
+
+[104:58.00]It not a pain in the ass
+
+[104:59.00]And then can they make
+
+[105:00.00]It not a pain in the ass
+
+[105:01.00]And then can they make
+
+[105:02.00]It not a pain in the ass
+
+[105:03.00]And then can they make
+
+[105:04.00]It not a pain in the ass
+
+[105:05.00]And then can they make
+
+[105:06.00]It not a pain in the ass
+
+[105:07.00]And then can they make
+
+[105:08.00]It not a pain in the ass
+
+[105:09.00]And then can they make
+
+[105:10.00]It not a pain in the ass
+
+[105:11.00]And then can they make
+
+[105:12.00]It not a pain in the ass
+
+[105:13.00]And then can they make
+
+[105:14.00]It not a pain in the ass
+
+[105:15.00]And then can they make
+
+[105:16.00]It not a pain in the ass
+
+[105:17.00]And then can they make
+
+[105:18.00]It not a pain in the ass
+
+[105:19.00]And then can they make
+
+[105:20.00]It not a pain in the ass
+
+[105:21.00]And then can they make
+
+[105:22.00]It not a pain in the ass
+
+[105:23.00]And then can they make
+
+[105:24.00]It not a pain in the ass
+
+[105:25.00]And then can they make
+
+[105:26.00]It not a pain in the ass
+
+[105:27.00]And then can they make
+
+[105:28.00]It not a pain in the ass
+
+[105:29.00]And then can they make
+
+[105:30.00]It not a pain in the ass
+
+[105:31.00]And then can they make
+
+[105:32.00]It not a pain in the ass
+
+[105:33.00]And then can they make
+
+[105:34.00]It not a pain in the ass
+
+[105:35.00]And then can they make
+
+[105:36.00]It not a pain in the ass
+
+[105:37.00]And then can they make
+
+[105:38.00]It not a pain in the ass
+
+[105:39.00]And then can they make
+
+[105:40.00]It not a pain in the ass
+
+[105:41.00]And then can they make
+
+[105:42.00]It not a pain in the ass
+
+[105:43.00]And then can they make
+
+[105:44.00]It not a pain in the ass
+
+[105:45.00]And then can they make
+
+[105:46.00]It not a pain in the ass
+
+[105:47.00]And then can they make
+
+[105:48.00]It not a pain in the ass
+
+[105:49.00]And then can they make
+
+[105:50.00]It not a pain in the ass
+
+[105:51.00]And then can they make
+
+[105:52.00]It not a pain in the ass
+
+[105:53.00]And then can they make
+
+[105:54.00]It not a pain in the ass
+
+[105:55.00]And then can they make
+
+[105:56.00]It not a pain in the ass
+
+[105:57.00]And then can they make
+
+[105:58.00]It not a pain in the ass
+
+[105:59.00]And then can they make
+
+[106:00.00]It not a pain in the ass
+
+[106:01.00]And then can they make
+
+[106:02.00]It not a pain in the ass
+
+[106:03.00]And then can they make
+
+[106:04.00]It not a pain in the ass
+
+[106:05.00]And then can they make
+
+[106:06.00]It not a pain in the ass
+
+[106:07.00]And then can they make
+
+[106:08.00]It not a pain in the ass
+
+[106:09.00]And then can they make
+
+[106:10.00]It not a pain in the ass
+
+[106:11.00]And then can they make
+
+[106:12.00]It not a pain in the ass
+
+[106:13.00]And then can they make
+
+[106:14.00]It not a pain in the ass
+
+[106:15.00]And then can they make
+
+[106:16.00]It not a pain in the ass
+
+[106:17.00]And then can they make
+
+[106:18.00]It not a pain in the ass
+
+[106:19.00]And then can they make
+
+[106:20.00]It not a pain in the ass
+
+[106:21.00]And then can they make
+
+[106:22.00]It not a pain in the ass
+
+[106:23.00]And then can they make
+
+[106:24.00]It not a pain in the ass
+
+[106:25.00]And then can they make
+
+[106:26.00]It not a pain in the ass
+
+[106:27.00]And then can they make
+
+[106:28.00]It not a pain in the ass
+
+[106:29.00]And then can they make
+
+[106:30.00]It not a pain in the ass
+
+[106:31.00]And then can they make
+
+[106:32.00]It not a pain in the ass
+
+[106:33.00]And then can they make
+
+[106:34.00]It not a pain in the ass
+
+[106:35.00]And then can they make
+
+[106:36.00]It not a pain in the ass
+
+[106:37.00]And then can they make
+
+[106:38.00]It not a pain in the ass
+
+[106:39.00]And then can they make
+
+[106:40.00]It not a pain in the ass
+
+[106:41.00]And then can they make
+
+[106:42.00]It not a pain in the ass
+
+[106:43.00]And then can they make
+
+[106:44.00]It not a pain in the ass
+
+[106:45.00]And then can they make
+
+[106:46.00]It not a pain in the ass
+
+[106:47.00]And then can they make
+
+[106:48.00]It not a pain in the ass
+
+[106:49.00]And then can they make
+
+[106:50.00]It not a pain in the ass
+
+[106:51.00]And then can they make
+
+[106:52.00]It not a pain in the ass
+
+[106:53.00]And then can they make
+
+[106:54.00]It not a pain in the ass
+
+[106:55.00]And then can they make
+
+[106:56.00]It not a pain in the ass
+
+[106:57.00]And then can they make
+
+[106:58.00]It not a pain in the ass
+
+[106:59.00]And then can they make
+
+[107:00.00]It not a pain in the ass
+
+[107:01.00]And then can they make
+
+[107:02.00]It not a pain in the ass
+
+[107:03.00]And then can they make
+
+[107:04.00]It not a pain in the ass
+
+[107:05.00]And then can they make
+
+[107:06.00]It not a pain in the ass
+
+[107:07.00]And then can they make
+
+[107:08.00]It not a pain in the ass
+
+[107:09.00]And then can they make
+
+[107:10.00]It not a pain in the ass
+
+[107:11.00]And then can they make
+
+[107:12.00]It not a pain in the ass
+
+[107:13.00]And then can they make
+
+[107:14.00]It not a pain in the ass
+
+[107:15.00]And then can they make
+
+[107:16.00]It not a pain in the ass
+
+[107:17.00]And then can they make
+
+[107:18.00]It not a pain in the ass
+
+[107:19.00]And then can they make
+
+[107:20.00]It not a pain in the ass
+
+[107:21.00]And then can they make
+
+[107:22.00]It not a pain in the ass
+
+[107:23.00]And then can they make
+
+[107:24.00]It not a pain in the ass
+
+[107:25.00]And then can they make
+
+[107:26.00]It not a pain in the ass
+
+[107:27.00]And then can they make
+
+[107:28.00]It not a pain in the ass
+
+[107:29.00]And then can they make
+
+[107:30.00]It not a pain in the ass
+
+[107:31.00]And then can they make
+
+[107:32.00]It not a pain in the ass
+
+[107:33.00]And then can they make
+
+[107:34.00]It not a pain in the ass
+
+[107:35.00]And then can they make
+
+[107:36.00]It not a pain in the ass
+
+[107:37.00]And then can they make
+
+[107:38.00]It not a pain in the ass
+
+[107:39.00]And then can they make
+
+[107:40.00]It not a pain in the ass
+
+[107:41.00]And then can they make
+
+[107:42.00]It not a pain in the ass
+
+[107:43.00]And then can they make
+
+[107:44.00]It not a pain in the ass
+
+[107:45.00]And then can they make
+
+[107:46.00]It not a pain in the ass
+
+[107:47.00]And then can they make
+
+[107:48.00]It not a pain in the ass
+
+[107:49.00]And then can they make
+
+[107:50.00]It not a pain in the ass
+
+[107:51.00]And then can they make
+
+[107:52.00]It not a pain in the ass
+
+[107:53.00]And then can they make
+
+[107:54.00]It not a pain in the ass
+
+[107:55.00]And then can they make
+
+[107:56.00]It not a pain in the ass
+
+[107:57.00]And then can they make
+
+[107:58.00]It not a pain in the ass
+
+[107:59.00]And then can they make
+
+[108:00.00]It not a pain in the ass
+
+[108:01.00]And then can they make
+
+[108:02.00]And then can they make
+
+[108:03.00]And then can they make
+
+[108:04.00]And then can they make
+
+[108:05.00]It not a pain in the ass
+
+[108:06.00]And then can they make
+
+[108:07.00]It not a pain in the ass
+
+[108:08.00]And then can they make
+
+[108:09.00]It not a pain in the ass
+
+[108:10.00]And then can they make
+
+[108:11.00]It not a pain in the ass
+
+[108:12.00]And then can they make
+
+[108:13.00]It not a pain in the ass
+
+[108:14.00]And then can they make
+
+[108:15.00]It not a pain in the ass
+
+[108:16.00]And then can they make
+
+[108:17.00]It not a pain in the ass
+
+[108:18.00]And then can they make
+
+[108:19.00]It not a pain in the ass
+
+[108:20.00]And then can they make
+
+[108:21.00]It not a pain in the ass
+
+[108:22.00]And then can they make
+
+[108:23.00]It not a pain in the ass
+
+[108:24.00]And then can they make
+
+[108:25.00]It not a pain in the ass
+
+[108:26.00]And then can they make
+
+[108:27.00]It not a pain in the ass
+
+[108:28.00]And then can they make
+
+[108:29.00]And then can they make
+
+[108:30.00]It not a pain in the ass
+
+[108:31.00]And then can they make
+
+[108:32.00]It not a pain in the ass
+
+[108:33.00]And then can they make
+
+[108:34.00]It not a pain in the ass
+
+[108:35.00]And then can they make
+
+[108:36.00]It not a pain in the ass
+
+[108:37.00]And then can they make
+
+[108:38.00]It not a pain in the ass
+
+[108:39.00]And then can they make
+
+[108:40.00]It not a pain in the ass
+
+[108:41.00]And then can they make
+
+[108:42.00]It not a pain in the ass
+
+[108:43.00]And then can they make
+
+[108:44.00]It not a pain in the ass
+
+[108:45.00]And then can they make
+
+[108:46.00]It not a pain in the ass
+
+[108:47.00]And then can they make
+
+[108:48.00]It not a pain in the ass
+
+[108:49.00]And then can they make
+
+[108:50.00]It not a pain in the ass
+
+[108:51.00]And then can they make
+
+[108:52.00]It not a pain in the ass
+
+[108:53.00]And then can they make
+
+[108:54.00]It not a pain in the ass
+
+[108:55.00]And then can they make
+
+[108:56.00]It not a pain in the ass
+
+[108:57.00]And then can they make
+
+[108:58.00]It not a pain in the ass
+
+[108:59.00]And then can they make
+
+[109:00.00]It not a pain in the ass
+
+[109:01.00]And then can they make
+
+[109:02.00]It not a pain in the ass
+
+[109:03.00]And then can they make
+
+[109:04.00]It not a pain in the ass
+
+[109:05.00]And then can they make
+
+[109:06.00]It not a pain in the ass
+
+[109:07.00]And then can they make
+
+[109:08.00]It not a pain in the ass
+
+[109:09.00]And then can they make
+
+[109:10.00]It not a pain in the ass
+
+[109:11.00]And then can they make
+
+[109:12.00]It not a pain in the ass
+
+[109:13.00]And then can they make
+
+[109:14.00]It not a pain in the ass
+
+[109:15.00]And then can they make
+
+[109:16.00]It not a pain in the ass
+
+[109:17.00]And then can they make
+
+[109:18.00]It not a pain in the ass
+
+[109:19.00]And then can they make
+
+[109:20.00]It not a pain in the ass
+
+[109:21.00]And then can they make
+
+[109:22.00]It not a pain in the ass
+
+[109:23.00]And then can they make
+
+[109:24.00]It not a pain in the ass
+
+[109:25.00]And then can they make
+
+[109:26.00]It not a pain in the ass
+
+[109:27.00]And then can they make
+
+[109:28.00]It not a pain in the ass
+
+[109:29.00]And then can they make
+
+[109:30.00]It not a pain in the ass
+
+[109:31.00]And then can they make
+
+[109:32.00]It not a pain in the ass
+
+[109:33.00]And then can they make
+
+[109:34.00]It not a pain in the ass
+
+[109:35.00]And then can they make
+
+[109:36.00]It not a pain in the ass
+
+[109:37.00]And then can they make
+
+[109:38.00]It not a pain in the ass
+
+[109:39.00]And then can they make
+
+[109:40.00]It not a pain in the ass
+
+[109:41.00]And then can they make
+
+[109:42.00]It not a pain in the ass
+
+[109:43.00]And then can they make
+
+[109:44.00]It not a pain in the ass
+
+[109:45.00]And then can they make
+
+[109:46.00]It not a pain in the ass
+
+[109:47.00]And then can they make
+
+[109:48.00]It not a pain in the ass
+
+[109:49.00]And then can they make
+
+[109:50.00]It not a pain in the ass
+
+[109:51.00]And then can they make
+
+[109:52.00]It not a pain in the ass
+
+[109:53.00]And then can they make
+
+[109:54.00]It not a pain in the ass
+
+[109:55.00]And then can they make
+
+[109:56.00]It not a pain in the ass
+
+[109:57.00]And then can they make
+
+[109:58.00]It not a pain in the ass
+
+[109:59.00]And then can they make
+
+[110:00.00]It not a pain in the ass
+
+[110:01.00]And then can they make
+
+[110:02.00]It not a pain in the ass
+
+[110:03.00]And then can they make
+
+[110:04.00]It not a pain in the ass
+
+[110:05.00]And then can they make
+
+[110:06.00]It not a pain in the ass
+
+[110:07.00]And then can they make
+
+[110:08.00]It not a pain in the ass
+
+[110:09.00]And then can they make
+
+[110:10.00]It not a pain in the ass
+
+[110:11.00]And then can they make
+
+[110:12.00]It not a pain in the ass
+
+[110:13.00]And then can they make
+
+[110:14.00]It not a pain in the ass
+
+[110:15.00]And then can they make
+
+[110:16.00]It not a pain in the ass
+
+[110:17.00]And so then I just
+
+[110:18.00]I just started editing them
+
+[110:19.00]And so then I just started editing them
+
+[110:20.00]So I have stopped
+
+[110:21.00]Comparing RAG with long
+
+[110:22.00]Context of Fine Tuning
+
+[110:23.00]Hold on, you said I retweeted
+
+[110:24.00]You defending it
+
+[110:25.00]I thought you were hating on it
+
+[110:26.00]And that's why I retweeted it
+
+[110:28.00]It's not one of a defense
+
+[110:30.00]Because everyone was like
+
+[110:31.00]Long context is killing RAG
+
+[110:32.00]And then I had a future
+
+[110:33.00]Oh that should be so quadratic
+
+[110:34.00]That's another one
+
+[110:36.00]And I actually
+
+[110:37.00]Messed the fine print as well
+
+[110:38.00]Let's see
+
+[110:39.00]Power benefits of
+
+[110:40.00]Astrum dominant
+
+[110:41.00]Yeah, so that's a good question
+
+[110:43.00]Astrum is on chip memory
+
+[110:45.00]Everyone's just using HBM
+
+[110:46.00]If you don't have to go
+
+[110:47.00]To off-chip memory
+
+[110:48.00]That'd be really efficient
+
+[110:49.00]Right?
+
+[110:50.00]Because you're
+
+[110:51.00]You're not moving bits around
+
+[110:52.00]But there's always
+
+[110:54.00]The issue of
+
+[110:55.00]You don't have enough memory
+
+[110:56.00]So you still have to move
+
+[110:57.00]Bits around constantly
+
+[110:58.00]And so that's the
+
+[110:59.00]That's the question
+
+[111:00.00]So yeah, sure
+
+[111:01.00]If you can not move data
+
+[111:02.00]Around as you compute
+
+[111:03.00]It's going to be fantastically
+
+[111:04.00]Efficient
+
+[111:05.00]But that isn't really
+
+[111:06.00]It's not really just
+
+[111:07.00]Easier simple to do
+
+[111:08.00]What do you think is going to be
+
+[111:09.00]Harder in the future
+
+[111:10.00]Like getting more energy
+
+[111:11.00]At cheaper cost
+
+[111:12.00]Like getting more of this hardware
+
+[111:14.00]To run
+
+[111:15.00]Yeah, I wonder
+
+[111:16.00]So someone was talking about this earlier
+
+[111:18.00]But it's like
+
+[111:19.00]Here in the crowd
+
+[111:20.00]And I'm looking right at him
+
+[111:21.00]But he's complaining
+
+[111:22.00]That journalists keep saying that
+
+[111:24.00]You know that
+
+[111:25.00]Like misreporting about how data centers
+
+[111:27.00]Or what data centers
+
+[111:28.00]Are doing to the environment
+
+[111:29.00]Right?
+
+[111:30.00]Which I thought was quite funny
+
+[111:31.00]Because they're inundated by
+
+[111:33.00]Journalists talking about data centers
+
+[111:35.00]Like destroying the world
+
+[111:36.00]Anyways, you know
+
+[111:37.00]That's not quite the case
+
+[111:38.00]But yeah, I don't know
+
+[111:39.00]Like the power is certainly
+
+[111:42.00]Gonna be hard to get
+
+[111:44.00]But you know
+
+[111:45.00]I think
+
+[111:46.00]If you just look at history
+
+[111:47.00]Right?
+
+[111:48.00]Like humanity
+
+[111:49.00]Especially America
+
+[111:50.00]Like power
+
+[111:51.00]Power production and usage
+
+[111:52.00]Kept skyrocketing
+
+[111:53.00]From like the 1700s
+
+[111:55.00]To like 1970s
+
+[111:57.00]And then it's kind of
+
+[111:58.00]Flat line from there
+
+[111:59.00]So why can't we like
+
+[112:00.00]Go back to the like growth stage
+
+[112:02.00]I guess it's like the
+
+[112:03.00]The whole like mantra
+
+[112:04.00]Of like accelerationists
+
+[112:06.00]I guess
+
+[112:07.00]This is IAC, yep
+
+[112:08.00]Well, I don't think it's IAC
+
+[112:09.00]I think it's like
+
+[112:10.00]Same Altman
+
+[112:11.00]Hotly believes this too
+
+[112:12.00]And I don't think he's IAC
+
+[112:13.00]So but yeah
+
+[112:14.00]Like I don't think like
+
+[112:15.00]It's like think
+
+[112:16.00]It's like something to
+
+[112:17.00]Think about like
+
+[112:18.00]The US is going back
+
+[112:19.00]To growing an energy usage
+
+[112:21.00]Where as for the last like
+
+[112:22.00]Forty years kind of
+
+[112:24.00]We're flat on energy usage
+
+[112:25.00]And what does that mean
+
+[112:26.00]Like yeah
+
+[112:29.00]There was another question
+
+[112:31.00]On Marvel but kind of the
+
+[112:32.00]I think that's
+
+[112:33.00]It's definitely like
+
+[112:34.00]One of these three guys
+
+[112:35.00]We're on the buy side
+
+[112:36.00]That are asking this question
+
+[112:39.00]Wanna know if Marvell's stock
+
+[112:41.00]Is gonna go up
+
+[112:43.00]So Marvell
+
+[112:44.00]They're doing the
+
+[112:46.00]Customizing for GROC
+
+[112:48.00]They also do the
+
+[112:49.00]Trinium too
+
+[112:50.00]And the Google CPU
+
+[112:51.00]Yeah any other
+
+[112:52.00]Any other chip
+
+[112:53.00]That they're working on
+
+[112:54.00]That people should
+
+[112:55.00]Should keep in mind
+
+[112:56.00]It's like yeah
+
+[112:57.00]Any needle moving
+
+[112:58.00]Any stock moving
+
+[112:59.00]Yeah exactly
+
+[113:01.00]They're working on
+
+[113:02.00]Some more stuff
+
+[113:03.00]Yeah I'll refrain from
+
+[113:05.00]Yeah all right
+
+[113:06.00]Let's see
+
+[113:07.00]Other GROC stuff
+
+[113:08.00]We want to get it
+
+[113:09.00]Get through
+
+[113:10.00]I don't think so
+
+[113:11.00]All right
+
+[113:12.00]Most of other ones
+
+[113:13.00]You're going edge compute hardware
+
+[113:16.00]Any real use cases
+
+[113:17.00]For it
+
+[113:18.00]Yeah I mean
+
+[113:20.00]I have like a
+
+[113:21.00]Really like anti edge view
+
+[113:23.00]So many people were like
+
+[113:25.00]Oh I'm gonna run
+
+[113:26.00]This model on my phone
+
+[113:27.00]Or on my laptop
+
+[113:28.00]And I love
+
+[113:30.00]I love how much
+
+[113:31.00]It's raining so now
+
+[113:32.00]I can be horrible
+
+[113:33.00]And you people won't leave
+
+[113:35.00]Like I want you
+
+[113:36.00]To try and leave
+
+[113:37.00]This building
+
+[113:38.00]Captive audience
+
+[113:40.00]Should I start singing
+
+[113:41.00]Like there's
+
+[113:42.00]Nothing you can do
+
+[113:43.00]You definitely
+
+[113:44.00]I'll stop you from that
+
+[113:45.00]Sorry
+
+[113:46.00]Edge hardware
+
+[113:47.00]Like you know
+
+[113:48.00]People are like
+
+[113:49.00]I'm going to run
+
+[113:50.00]This model on my phone
+
+[113:51.00]Or on my laptop
+
+[113:52.00]It makes no sense to me
+
+[113:53.00]Current hardware
+
+[113:54.00]Is not really capable of it
+
+[113:55.00]So you're going to buy
+
+[113:56.00]Any hardware
+
+[113:57.00]To run
+
+[113:58.00]Whatever on the edge
+
+[113:59.00]Or you're going to
+
+[114:00.00]Just run
+
+[114:01.00]Very very small models
+
+[114:02.00]But in either case
+
+[114:03.00]You're going to end up
+
+[114:05.00]With like
+
+[114:06.00]The performance is really low
+
+[114:07.00]And then whatever you spent
+
+[114:08.00]To run it locally
+
+[114:09.00]In the cloud
+
+[114:10.00]It could service 10x the users
+
+[114:12.00]So it kind of like
+
+[114:14.00]SOL in terms of like
+
+[114:17.00]Economics of
+
+[114:18.00]Running things on the edge
+
+[114:20.00]And then like latency is like
+
+[114:23.00]For LLMs
+
+[114:24.00]Right for LLMs
+
+[114:25.00]It's like
+
+[114:26.00]Not that big of a deal
+
+[114:27.00]Relative
+
+[114:28.00]To like
+
+[114:29.00]Internet latency
+
+[114:30.00]Is not that big of a deal
+
+[114:31.00]Relative to the
+
+[114:32.00]Use of the model
+
+[114:33.00]Right like the actual model
+
+[114:34.00]Operating
+
+[114:35.00]Whether it's on edge hardware
+
+[114:36.00]Or cloud hardware
+
+[114:37.00]And cloud hardware
+
+[114:38.00]Like edge hardware
+
+[114:39.00]Is not really
+
+[114:40.00]Able to like
+
+[114:41.00]Have a measurable
+
+[114:43.00]Appreciable
+
+[114:44.00]Like advantage
+
+[114:45.00]Over cloud hardware
+
+[114:48.00]This applies
+
+[114:49.00]To diffusion models
+
+[114:50.00]This applies to LLMs
+
+[114:52.00]Of course small models
+
+[114:53.00]Will be able to run
+
+[114:54.00]But not all
+
+[114:55.00]Yeah
+
+[114:56.00]What chances
+
+[114:57.00]The startups
+
+[114:58.00]Like medax etched
+
+[114:59.00]Or 5,600
+
+[115:00.00]I think you all interviewed them
+
+[115:01.00]Why don't you answer
+
+[115:02.00]Yeah we have connections
+
+[115:03.00]With medax and lemur
+
+[115:04.00]And we haven't know
+
+[115:06.00]But Gavin is friendly
+
+[115:07.00]They didn't
+
+[115:08.00]Yeah they said
+
+[115:09.00]They don't want to talk publicly
+
+[115:10.00]Yeah
+
+[115:11.00]What they're doing
+
+[115:12.00]It's something like
+
+[115:13.00]When they open up
+
+[115:14.00]We can
+
+[115:15.00]Sure sure
+
+[115:16.00]Yeah
+
+[115:17.00]But do you think like
+
+[115:18.00]I think the two three
+
+[115:19.00]We're going to answer the question
+
+[115:20.00]What do you think of them
+
+[115:21.00]There's a couple things
+
+[115:23.00]It's like
+
+[115:24.00]How do the other companies
+
+[115:26.00]Innovate against them
+
+[115:27.00]I think when you do a new
+
+[115:28.00]Silicon you're like
+
+[115:29.00]Oh we're going to be
+
+[115:30.00]So much better at this thing
+
+[115:31.00]You're like much faster
+
+[115:32.00]Much cheaper
+
+[115:33.00]But there's all the other curves
+
+[115:34.00]Going down
+
+[115:35.00]On the macro environment
+
+[115:36.00]So if it takes you
+
+[115:37.00]Like five years
+
+[115:38.00]Before you were
+
+[115:39.00]Like a lot better
+
+[115:40.00]Five years later
+
+[115:41.00]Once you take the chip out
+
+[115:42.00]You're only comparing yourself
+
+[115:43.00]To the five year advancement
+
+[115:44.00]That the major companies had to
+
+[115:46.00]So then it's like
+
+[115:47.00]Okay that
+
+[115:48.00]We're going to have like
+
+[115:49.00]The C300 whatever
+
+[115:50.00]From from nvidia
+
+[115:51.00]By the time
+
+[115:52.00]Some of these chips come up
+
+[115:53.00]What's after Z
+
+[115:55.00]What do you think is after Z
+
+[115:56.00]In the roadmap
+
+[115:57.00]It's X Y Z
+
+[116:00.00]No
+
+[116:01.00]Anyways
+
+[116:03.00]Yeah yeah
+
+[116:04.00]It's like the age old problem
+
+[116:05.00]You build a chip
+
+[116:06.00]It has some cool thing
+
+[116:07.00]Cool feature
+
+[116:08.00]And then like
+
+[116:09.00]A year later nvidia
+
+[116:10.00]Has it in hardware
+
+[116:11.00]It has implemented
+
+[116:12.00]Some flavor of that in hardware
+
+[116:14.00]Or two generations out
+
+[116:16.00]Like what idea
+
+[116:17.00]Are you going to have
+
+[116:18.00]That nvidia can't implement
+
+[116:19.00]Is like really the question
+
+[116:21.00]It's like you have to be
+
+[116:22.00]Fundamentally different
+
+[116:23.00]In some way
+
+[116:24.00]That holds through
+
+[116:25.00]For four or five years
+
+[116:26.00]That's kind of the big issue
+
+[116:28.00]But you know like
+
+[116:30.00]Like those people have
+
+[116:31.00]Some ideas that are interesting
+
+[116:32.00]And yeah maybe
+
+[116:33.00]It'll work out right
+
+[116:34.00]It's going to be hard
+
+[116:35.00]To fight nvidia
+
+[116:36.00]Who one doesn't
+
+[116:37.00]Consider them competition
+
+[116:38.00]I worried about like
+
+[116:39.00]Google and Amazon's chip
+
+[116:40.00]Right they're not
+
+[116:41.00]And I guess
+
+[116:42.00]To some extent AMD's chip
+
+[116:43.00]But like
+
+[116:44.00]They're not really worried
+
+[116:45.00]About you know
+
+[116:46.00]Maddox or etched
+
+[116:47.00]Or or grok
+
+[116:48.00]Or you know
+
+[116:49.00]Positron or any of these folks
+
+[116:51.00]How much of an advantage
+
+[116:53.00]Do they have
+
+[116:54.00]By working closely
+
+[116:55.00]With like open ai
+
+[116:56.00]And some of these other folks
+
+[116:57.00]And then already knowing
+
+[116:58.00]Where some of the
+
+[116:59.00]Architecture decisions
+
+[117:00.00]Are going and since
+
+[117:01.00]Those companies are like
+
+[117:02.00]The biggest buyers
+
+[117:03.00]Of the chips
+
+[117:04.00]Yeah I mean like
+
+[117:05.00]You see like
+
+[117:06.00]Like the most important
+
+[117:07.00]Sort of ai companies
+
+[117:09.00]Are obviously going to
+
+[117:10.00]Tell hardware vendors
+
+[117:11.00]What they want
+
+[117:12.00]You know open ai
+
+[117:13.00]And you know
+
+[117:14.00]So on and so forth
+
+[117:15.00]Right they can obviously
+
+[117:16.00]Tell them what they want
+
+[117:17.00]And the startups
+
+[117:18.00]Aren't actually going to
+
+[117:19.00]Get anywhere close to
+
+[117:20.00]As much feedback on
+
+[117:21.00]What to do on like
+
+[117:22.00]You know very
+
+[117:23.00]My new low level stuff
+
+[117:24.00]So that is a difficult here
+
+[117:26.00]Some startups
+
+[117:27.00]Like Maddox
+
+[117:28.00]Obviously have
+
+[117:29.00]People who built
+
+[117:30.00]Or worked on
+
+[117:31.00]The largest models
+
+[117:32.00]And other startups
+
+[117:33.00]Might not have
+
+[117:34.00]That advantage
+
+[117:35.00]And so they're always
+
+[117:36.00]Gonna have that issue
+
+[117:37.00]Of like
+
+[117:38.00]Hey how do I get
+
+[117:39.00]The feedback
+
+[117:40.00]Or what's changing
+
+[117:41.00]What do they see
+
+[117:42.00]Down the pipeline
+
+[117:43.00]That's
+
+[117:44.00]That I really need
+
+[117:45.00]To be aware of
+
+[117:46.00]And ready for
+
+[117:47.00]When I design
+
+[117:48.00]My hardware
+
+[117:49.00]All right
+
+[117:50.00]Every hardware shortage
+
+[117:51.00]As eventually
+
+[117:52.00]Turn into a glut
+
+[117:53.00]Will that be
+
+[117:54.00]Through of NVIDIA chips
+
+[117:55.00]It's so when
+
+[117:56.00]But also why
+
+[117:57.00]Absolutely
+
+[117:58.00]And I'm so excited
+
+[117:59.00]To buy like
+
+[118:00.00]H100s
+
+[118:01.00]Not a thousand
+
+[118:02.00]But yeah
+
+[118:03.00]Everyone's
+
+[118:04.00]Gonna buy chips
+
+[118:05.00]It's just the way
+
+[118:06.00]Semiconductors work
+
+[118:07.00]Because the supply chain
+
+[118:08.00]Takes forever to build out
+
+[118:09.00]And it's like
+
+[118:10.00]A really weird thing
+
+[118:11.00]So if the backlog
+
+[118:12.00]Of chips is a year
+
+[118:15.00]People will order
+
+[118:16.00]Two years worth
+
+[118:17.00]Of what they want
+
+[118:18.00]For the next year
+
+[118:19.00]It is like
+
+[118:20.00]A very common thing
+
+[118:21.00]It's not just like
+
+[118:22.00]This AI cycle
+
+[118:23.00]But like
+
+[118:24.00]Like microcontroller
+
+[118:25.00]Like the automotive companies
+
+[118:26.00]They order
+
+[118:27.00]Two years worth
+
+[118:28.00]Of what they needed
+
+[118:29.00]For one year
+
+[118:30.00]What happens
+
+[118:31.00]In semiconductors
+
+[118:32.00]When lead times
+
+[118:33.00]Lengthen
+
+[118:34.00]The purchases
+
+[118:35.00]And inventory
+
+[118:36.00]Is sort of like
+
+[118:37.00]Double
+
+[118:38.00]So these
+
+[118:39.00]The NVIDIA GPU shortage
+
+[118:44.00]Obviously is going to be
+
+[118:45.00]Rectified
+
+[118:46.00]And when it is
+
+[118:47.00]Everyone's sort of
+
+[118:48.00]Double orders
+
+[118:49.00]Will become
+
+[118:50.00]Extremely apparent
+
+[118:51.00]Right
+
+[118:52.00]And you know
+
+[118:53.00]You see like
+
+[118:54.00]Random companies
+
+[118:55.00]Out of nowhere being
+
+[118:56.00]Like yeah
+
+[118:57.00]We've got 32,000
+
+[118:58.00]H100s on order
+
+[118:59.00]And 25,000
+
+[119:00.00]And trust
+
+[119:01.00]They're not all
+
+[119:02.00]They're not all
+
+[119:03.00]Real orders
+
+[119:04.00]For one
+
+[119:05.00]But I think
+
+[119:06.00]The bubble will
+
+[119:07.00]Continue on
+
+[119:08.00]For a long time
+
+[119:09.00]Right like it's not
+
+[119:10.00]It's not going to end
+
+[119:11.00]Like this year
+
+[119:12.00]Right like people
+
+[119:13.00]People need AI
+
+[119:14.00]Right like I think
+
+[119:15.00]Everyone in this audience
+
+[119:16.00]Would agree right like
+
+[119:17.00]There's no
+
+[119:18.00]There's no
+
+[119:19.00]Like immediate
+
+[119:20.00]Like end to the
+
+[119:21.00]To the bubble
+
+[119:22.00]Right
+
+[119:23.00]What's next
+
+[119:24.00]Why
+
+[119:25.00]I think it's just
+
+[119:26.00]Because the supply chain
+
+[119:27.00]Expands so much
+
+[119:28.00]Some companies
+
+[119:29.00]Will continue to buy
+
+[119:30.00]Like an
+
+[119:31.00]Open AI
+
+[119:32.00]Or meta
+
+[119:33.00]Will continue to buy
+
+[119:34.00]But then like
+
+[119:35.00]All these random
+
+[119:36.00]Startups will
+
+[119:37.00]Or a lot of them
+
+[119:38.00]Will not be able to
+
+[119:39.00]Continue to buy
+
+[119:40.00]So then
+
+[119:41.00]That kind of leads to like
+
+[119:42.00]They'll pause
+
+[119:43.00]For a little bit
+
+[119:44.00]Or like
+
+[119:45.00]I think in 2018
+
+[119:46.00]Right like
+
+[119:47.00]Memory pricing was
+
+[119:48.00]Extremely high
+
+[119:49.00]Then all of a sudden
+
+[119:50.00]Google, Microsoft
+
+[119:51.00]And Amazon
+
+[119:52.00]All agreed
+
+[119:53.00]I don't you know
+
+[119:54.00]They won't
+
+[119:55.00]They won't say
+
+[119:56.00]It's together
+
+[119:57.00]It's a week
+
+[119:58.00]To stop ordering memory
+
+[119:59.00]And within like
+
+[120:00.00]A month
+
+[120:01.00]The price of memory
+
+[120:02.00]Started tanking
+
+[120:03.00]Like insane amounts
+
+[120:04.00]Right and like
+
+[120:05.00]People will claim
+
+[120:06.00]You know all sorts of
+
+[120:07.00]Reasons why
+
+[120:08.00]That was timed
+
+[120:09.00]Extremely well
+
+[120:10.00]It was like very clear
+
+[120:11.00]And people in the
+
+[120:12.00]Financial markets
+
+[120:13.00]Were able to make trades
+
+[120:14.00]And everything
+
+[120:15.00]People stopped buying
+
+[120:16.00]And it's not like
+
+[120:17.00]Their demand just dried up
+
+[120:18.00]It's just like
+
+[120:19.00]They had a little bit
+
+[120:20.00]Of a demand slowdown
+
+[120:21.00]And then they had enough
+
+[120:22.00]Inventory that they could
+
+[120:23.00]Like weather until
+
+[120:24.00]Like prices tanked
+
+[120:25.00]Because it's such
+
+[120:26.00]Thank you very much
+
+[120:27.00]Is it
+
+[120:28.00]Hey everyone
+
+[120:33.00]And so
+
+[120:34.00]Today we have a special guest
+
+[120:35.00]Millions from Capital One
+
+[120:37.00]But I would tend to
+
+[120:38.00]Like to introduce people
+
+[120:39.00]With a bit of their background
+
+[120:41.00]And then learn a little bit
+
+[120:42.00]More about you
+
+[120:43.00]On the personal side
+
+[120:44.00]You call your PhD
+
+[120:45.00]In a probabilistic framework
+
+[120:47.00]From mapping audio
+
+[120:48.00]Visual features to semantics
+
+[120:49.00]I feel like that
+
+[120:50.00]Is like the beginnings
+
+[120:51.00]Of like a multimodal
+
+[120:52.00]AI model in some sense
+
+[120:54.00]Do you have any sort
+
+[120:55.00]Factions on your PhD
+
+[120:56.00]Vis this cut
+
+[120:57.00]Thanks for having me
+
+[120:59.00]And so
+
+[121:00.00]Let me say this that
+
+[121:01.00]It almost feels like
+
+[121:02.00]Things go around in circles
+
+[121:03.00]Right
+
+[121:04.00]In research and development
+
+[121:05.00]And so
+
+[121:06.00]At the right time
+
+[121:07.00]And the right place
+
+[121:08.00]You kind of intersect
+
+[121:09.00]Back with some of the topics
+
+[121:11.00]And then
+
+[121:12.00]Some other conditions
+
+[121:13.00]That have happened
+
+[121:14.00]Suddenly make
+
+[121:15.00]A big difference
+
+[121:16.00]Between that ticking off
+
+[121:17.00]Vorses you know
+
+[121:18.00]It may not be
+
+[121:19.00]You know
+
+[121:20.00]As intently pursued
+
+[121:21.00]At any given point of time
+
+[121:22.00]Right so
+
+[121:23.00]I have been
+
+[121:24.00]In AI for now
+
+[121:25.00]Three decades
+
+[121:26.00]You know
+
+[121:27.00]You talked about
+
+[121:28.00]My PhD thesis
+
+[121:29.00]My bachelor's thesis
+
+[121:30.00]Was on implementing
+
+[121:32.00]Neural networks
+
+[121:33.00]On India's
+
+[121:34.00]You know
+
+[121:35.00]Homegrown supercomputers
+
+[121:36.00]Back then
+
+[121:37.00]And so
+
+[121:38.00]You know
+
+[121:39.00]This whole notion of
+
+[121:40.00]Message passing
+
+[121:41.00]And distributed computing
+
+[121:42.00]And computing weights
+
+[121:44.00]You know
+
+[121:45.00]And then bringing
+
+[121:46.00]All of them back
+
+[121:47.00]Distributing
+
+[121:48.00]The computations
+
+[121:49.00]Of the neural network
+
+[121:50.00]You know
+
+[121:51.00]Forward pass
+
+[121:52.00]Those things
+
+[121:53.00]For what we used to do
+
+[121:54.00]You know
+
+[121:55.00]For that
+
+[121:56.00]Particular supercomputing
+
+[121:57.00]Architecture
+
+[121:58.00]We had back then
+
+[121:59.00]And then
+
+[122:00.00]My PhD
+
+[122:01.00]Of course was
+
+[122:02.00]How to understand
+
+[122:03.00]What's going on
+
+[122:04.00]In a video
+
+[122:05.00]Right and use
+
+[122:06.00]Multi-modal cues
+
+[122:07.00]To your point
+
+[122:08.00]You know
+
+[122:09.00]What has happened
+
+[122:10.00]In the last couple of
+
+[122:11.00]Decades
+
+[122:12.00]One we have
+
+[122:13.00]Tremendous amount
+
+[122:14.00]Of data explosion
+
+[122:15.00]So when I was doing
+
+[122:16.00]My PhD
+
+[122:17.00]I used to
+
+[122:18.00]Actually go
+
+[122:19.00]To blockbuster
+
+[122:20.00]And rent
+
+[122:21.00]Explosions
+
+[122:22.00]And how do you
+
+[122:23.00]Actually
+
+[122:24.00]Models in the audio stream
+
+[122:25.00]And the visual stream
+
+[122:26.00]For something
+
+[122:27.00]Like an explosion
+
+[122:28.00]So I remember
+
+[122:29.00]Going and doing
+
+[122:30.00]All this digitization
+
+[122:31.00]Of tape and then
+
+[122:32.00]Cleaning up the data
+
+[122:33.00]And then
+
+[122:34.00]You know
+
+[122:35.00]Having some kind
+
+[122:36.00]Of a labeling tool
+
+[122:37.00]That I actually cooked up
+
+[122:38.00]And having a spouse
+
+[122:40.00]Of a friend of mine
+
+[122:41.00]To do the labeling
+
+[122:42.00]For me
+
+[122:43.00]So look at
+
+[122:44.00]Where we were back then
+
+[122:45.00]And now you have
+
+[122:46.00]Scale.ai
+
+[122:47.00]That basically goes
+
+[122:48.00]And does labeling
+
+[122:49.00]For a lot of these models
+
+[122:50.00]And so forth
+
+[122:51.00]So scale
+
+[122:52.00]Of data has changed
+
+[122:53.00]That's one
+
+[122:54.00]The second thing
+
+[122:55.00]That has changed
+
+[122:56.00]Is we were looking
+
+[122:57.00]At computing
+
+[122:58.00]Architectures
+
+[122:59.00]That were
+
+[123:00.00]Much much
+
+[123:01.00]Much less rich
+
+[123:02.00]In terms of what
+
+[123:03.00]We had back then
+
+[123:04.00]And the 2012
+
+[123:06.00]Breakthrough
+
+[123:07.00]By Hinton and his students
+
+[123:09.00]And using GPUs
+
+[123:11.00]Really when
+
+[123:12.00]The ImageNet competition
+
+[123:14.00]Really helped
+
+[123:15.00]Take this field off
+
+[123:16.00]In a completely
+
+[123:17.00]New direction
+
+[123:18.00]And at
+
+[123:19.00]Very very large scale
+
+[123:20.00]And the third thing
+
+[123:21.00]Is of course
+
+[123:22.00]The GPU computing
+
+[123:23.00]Back then
+
+[123:24.00]I did not have access
+
+[123:25.00]When I was doing my PhD
+
+[123:26.00]To some of the amazing things
+
+[123:28.00]That Nvidia hadn't yet built
+
+[123:30.00]And so
+
+[123:31.00]It's I think really
+
+[123:32.00]The confluence of those three things
+
+[123:34.00]Which make all the difference
+
+[123:35.00]Between a lot of this research
+
+[123:36.00]That happened
+
+[123:37.00]In the late 90s
+
+[123:38.00]And what's happening
+
+[123:39.00]Between the 2010
+
+[123:41.00]To now
+
+[123:42.00]Kind of time frame
+
+[123:43.00]But if you look at the intent
+
+[123:45.00]The intent was the same
+
+[123:47.00]What's in the video
+
+[123:48.00]How do we understand
+
+[123:49.00]The multimodal cues
+
+[123:50.00]You know
+
+[123:51.00]That come together
+
+[123:52.00]To give us
+
+[123:53.00]That semantic understanding
+
+[123:54.00]Of what's in the video
+
+[123:55.00]And so
+
+[123:56.00]To that extent
+
+[123:57.00]The problems
+
+[123:58.00]We were trying
+
+[123:59.00]To solve the same
+
+[124:00.00]But the tools
+
+[124:01.00]That we have now
+
+[124:02.00]Are amazingly
+
+[124:03.00]Amazingly different
+
+[124:04.00]And amazingly
+
+[124:05.00]More powerful
+
+[124:06.00]Are there any
+
+[124:07.00]Maybe research approaches
+
+[124:08.00]Or ML patterns
+
+[124:10.00]That you tried
+
+[124:11.00]That didn't work
+
+[124:12.00]That you think
+
+[124:13.00]Will work today
+
+[124:14.00]Would you abuse
+
+[124:15.00]That people haven't tried
+
+[124:16.00]I think
+
+[124:17.00]There are many people
+
+[124:18.00]That have done serious
+
+[124:19.00]ML research before
+
+[124:20.00]The GPU era
+
+[124:21.00]I would say
+
+[124:22.00]If you think about
+
+[124:23.00]All the ML researchers
+
+[124:24.00]Working today
+
+[124:25.00]Most of them are post-GPU
+
+[124:26.00]Any like story
+
+[124:27.00]That you remember
+
+[124:28.00]That you were like
+
+[124:29.00]Oh, this seems really promising
+
+[124:30.00]But like
+
+[124:31.00]There wasn't enough compute
+
+[124:32.00]Or anything like that
+
+[124:33.00]The whole concept
+
+[124:34.00]Of modeling context
+
+[124:35.00]Right
+
+[124:36.00]So my thesis was about
+
+[124:37.00]Not
+
+[124:38.00]How do you just
+
+[124:39.00]Detect isolated things
+
+[124:40.00]In a video
+
+[124:41.00]Right
+
+[124:42.00]Like this is a car
+
+[124:43.00]That's an explosion
+
+[124:44.00]And so on and so forth
+
+[124:45.00]If I see these
+
+[124:46.00]End things together
+
+[124:47.00]Do they actually
+
+[124:48.00]Contextually make sense
+
+[124:49.00]Like do I see
+
+[124:50.00]The sky
+
+[124:51.00]About the land
+
+[124:52.00]And if I do that
+
+[124:53.00]Then I have a higher confidence
+
+[124:55.00]That this indeed is sky
+
+[124:56.00]And that indeed is land
+
+[124:57.00]Right
+
+[124:58.00]And so on and so forth
+
+[124:59.00]So when we were trying
+
+[125:00.00]To model that context
+
+[125:01.00]We had extremely limited
+
+[125:03.00]Libling
+
+[125:04.00]And extremely limited
+
+[125:05.00]Corpus
+
+[125:06.00]In terms of how we could do it
+
+[125:08.00]Now when I think back
+
+[125:10.00]I think that
+
+[125:11.00]What better way to describe
+
+[125:12.00]Context than a multimodal LLM
+
+[125:14.00]Right
+
+[125:15.00]Which strain on
+
+[125:16.00]As much of the data
+
+[125:18.00]As there is on the internet
+
+[125:20.00]In terms of multimodality
+
+[125:22.00]And that to me
+
+[125:23.00]Would have been an amazing thing
+
+[125:24.00]For me to have back then
+
+[125:26.00]So I would say
+
+[125:27.00]How to model context
+
+[125:28.00]Is a problem
+
+[125:29.00]That is going to be evergreen
+
+[125:30.00]It never goes out of fashion
+
+[125:32.00]But how we are able
+
+[125:33.00]To do it now
+
+[125:34.00] Versus then
+
+[125:35.00]I see as
+
+[125:36.00]One of the steps towards
+
+[125:38.00]Truly understanding
+
+[125:39.00]You know what's happening
+
+[125:40.00]Right in the multiple modalities
+
+[125:42.00]The other part
+
+[125:43.00]Is reasoning
+
+[125:44.00]Right
+
+[125:45.00]I think we are still
+
+[125:46.00]In the very early innings
+
+[125:47.00]Of reasoning
+
+[125:48.00]You know we see this
+
+[125:49.00]Interesting evolution
+
+[125:51.00]Of you know
+
+[125:52.00]How do you actually
+
+[125:53.00]Build a model of the world
+
+[125:55.00]And Yanlacan's work
+
+[125:56.00]You know very interesting
+
+[125:58.00]In this sense to me
+
+[125:59.00]He has been talking
+
+[126:00.00]About it for a while
+
+[126:01.00]Right so
+
+[126:02.00]But now I think
+
+[126:03.00]We are getting to a point
+
+[126:04.00]Where a lot of those pieces
+
+[126:05.00]May start coming together
+
+[126:07.00]And I think solving
+
+[126:09.00]That reasoning piece
+
+[126:10.00]Is a very very critical step
+
+[126:12.00]Before we can actually
+
+[126:14.00]Build
+
+[126:15.00]Truly intelligent machines
+
+[126:17.00]And do you have any
+
+[126:18.00]Intuition on the part
+
+[126:19.00]The video
+
+[126:20.00]Is going to play in it
+
+[126:21.00]Because a lot of Yan's
+
+[126:22.00]Also talking points
+
+[126:23.00]Are around you know
+
+[126:24.00]With vjapa
+
+[126:25.00]And some of those models
+
+[126:26.00]If you show
+
+[126:27.00]What's going to happen next
+
+[126:28.00]Like that's part
+
+[126:29.00]Of a war model
+
+[126:30.00]Like it's video going to be
+
+[126:32.00]Like a big part in it
+
+[126:33.00]Like do we need to
+
+[126:34.00]Get there to actually
+
+[126:35.00]Get a real war model
+
+[126:36.00]Like do you think text
+
+[126:37.00]Is enough to
+
+[126:38.00]Get a good shot at it
+
+[126:39.00]And maybe biased
+
+[126:40.00]In answering that question
+
+[126:41.00]Given that
+
+[126:42.00]I you know
+
+[126:43.00]Cut my teeth
+
+[126:44.00]In multi modality
+
+[126:45.00]And given that
+
+[126:46.00]The video modality
+
+[126:47.00]In general
+
+[126:48.00]Right is a lot
+
+[126:49.00]More challenging
+
+[126:50.00]You know whether
+
+[126:51.00]It's just because
+
+[126:52.00]Of the sheer size of data
+
+[126:53.00]Right in terms of
+
+[126:54.00]The number of pixels
+
+[126:55.00]That you need to process
+
+[126:56.00]Whether it is because
+
+[126:57.00]You are actually
+
+[126:58.00]Capturing the real world
+
+[127:00.00]Which tends to
+
+[127:01.00]Be far more complex
+
+[127:02.00]You know in the
+
+[127:03.00]Case of language
+
+[127:04.00]You know humans
+
+[127:05.00]Over millennia
+
+[127:07.00]Have evolved this
+
+[127:09.00]Longly concise
+
+[127:11.00]Codebook
+
+[127:12.00]Of how to describe
+
+[127:13.00]Things
+
+[127:14.00]So there is a humongous
+
+[127:15.00]Amount of abstraction
+
+[127:18.00]And rationalization
+
+[127:20.00]And concise definition
+
+[127:22.00]That has gone on
+
+[127:23.00]In how languages
+
+[127:25.00]Evolved
+
+[127:26.00]And so you know
+
+[127:27.00]The number of words
+
+[127:28.00]In the vocabulary
+
+[127:29.00]Of a language
+
+[127:30.00]When you look at that
+
+[127:31.00]We are able to
+
+[127:32.00]Tell beautiful stories
+
+[127:33.00]Right with
+
+[127:34.00]Just those many
+
+[127:36.00]You know words
+
+[127:37.00]But if you look at
+
+[127:38.00]In the real world
+
+[127:39.00]If you look at
+
+[127:40.00]Capturing that
+
+[127:41.00]Right whether it is
+
+[127:42.00]Through
+
+[127:43.00]The eyes of a robot
+
+[127:44.00]As it's looking
+
+[127:45.00]You know and trying
+
+[127:46.00]To help you around
+
+[127:47.00]In a room setting
+
+[127:48.00]Or a building setting
+
+[127:49.00]Right or whether
+
+[127:50.00]It's the traffic
+
+[127:51.00]Right which
+
+[127:52.00]Vacues have to look at
+
+[127:53.00]When they're
+
+[127:54.00]Driving on the roads
+
+[127:55.00]The amount of
+
+[127:56.00]Variability
+
+[127:57.00]The amount of
+
+[127:58.00]Distortion of signal
+
+[127:59.00]That comes with
+
+[128:00.00]That right
+
+[128:01.00]Is just remarkably
+
+[128:03.00]Difficult
+
+[128:04.00]And remarkably rich
+
+[128:05.00]To really analyze
+
+[128:06.00]So I would say
+
+[128:07.00]Alicio
+
+[128:08.00]It's both
+
+[128:09.00]I feel that
+
+[128:10.00]The video modality
+
+[128:11.00]Has to be understood
+
+[128:13.00]For a true understanding
+
+[128:14.00]Of the world model
+
+[128:15.00]I would also say
+
+[128:16.00]That's harder
+
+[128:17.00]In some sense
+
+[128:19.00]Because of its inherent complexity
+
+[128:20.00]Then the language part
+
+[128:22.00]Which there's already
+
+[128:24.00]Consize representation
+
+[128:25.00]That people have come up with
+
+[128:26.00]Awesome
+
+[128:27.00]Sorry Sean
+
+[128:28.00]I know we hijacked the intro
+
+[128:30.00]But it was a good rabbit hole
+
+[128:32.00]To go into
+
+[128:33.00]No it's okay
+
+[128:34.00]I also just like
+
+[128:35.00]Stunned by
+
+[128:36.00]Background and history
+
+[128:37.00]That millennium is
+
+[128:38.00]Bringing to
+
+[128:39.00]AI you know
+
+[128:40.00]I guess the speed run through
+
+[128:41.00]In the resume
+
+[128:42.00]You know 14 years
+
+[128:43.00]At IBM
+
+[128:44.00]Finally ending
+
+[128:45.00]As chief scientist at
+
+[128:46.00]IBM research
+
+[128:47.00]And then since
+
+[128:48.00]Cisco
+
+[128:49.00]With cognitive systems
+
+[128:50.00]And CTO
+
+[128:51.00]Of metropolis at NVIDIA
+
+[128:52.00]What should people know
+
+[128:53.00]About like
+
+[128:54.00]How your
+
+[128:55.00]Interest in your trajectory
+
+[128:56.00]Has progressed through
+
+[128:57.00]Your career
+
+[128:58.00]You know when I reflect
+
+[128:59.00]Part of what I see
+
+[129:00.00]Is there is a constant
+
+[129:02.00]And the constant is
+
+[129:04.00]How do you actually
+
+[129:05.00]AI work
+
+[129:06.00]For you know
+
+[129:07.00]Name your favorite
+
+[129:08.00]Problem
+
+[129:09.00]That favorite problem
+
+[129:10.00]Changes maybe
+
+[129:11.00]From decade to decade
+
+[129:12.00]Or in the context of
+
+[129:13.00]Even my state
+
+[129:14.00]IBM research
+
+[129:15.00]But it's always been
+
+[129:16.00]How do we build
+
+[129:17.00]AI solutions
+
+[129:18.00]AI platforms
+
+[129:19.00]That solve
+
+[129:20.00]Real world problems
+
+[129:21.00]So when
+
+[129:22.00]We started out
+
+[129:23.00]We were the first
+
+[129:24.00]Video understanding platform
+
+[129:26.00]That we built at
+
+[129:27.00]IBM research right
+
+[129:28.00]We actually got a Wall Street
+
+[129:29.00]Multimedia
+
+[129:30.00]Award for it
+
+[129:31.00]Innovation Award for it
+
+[129:32.00]In the early 2000s
+
+[129:34.00]We helped with
+
+[129:35.00]Setting up the benchmark
+
+[129:36.00]That's known as
+
+[129:38.00]Trackweed
+
+[129:39.00]Which again
+
+[129:40.00]To at least your
+
+[129:41.00]Point people
+
+[129:42.00]Only know ImageNet
+
+[129:43.00]And thereafter
+
+[129:44.00]Before ImageNet
+
+[129:45.00]There was Trackweed
+
+[129:46.00]And so
+
+[129:47.00]The first decade
+
+[129:48.00]Shan was really about
+
+[129:49.00]You know how do we
+
+[129:50.00]Understand what's in the video
+
+[129:51.00]And how can then
+
+[129:52.00]We turn that into
+
+[129:53.00]Meaningful use of
+
+[129:55.00]AI technology
+
+[129:56.00]For media companies
+
+[129:58.00]You know broadcasting
+
+[129:59.00]Corporations
+
+[130:00.00]And so on and so forth
+
+[130:01.00]Right the business to
+
+[130:02.00]Business kind of setting
+
+[130:03.00]Of course
+
+[130:04.00]Because we were at
+
+[130:05.00]IBM
+
+[130:06.00]We did not focus
+
+[130:07.00]On the consumer
+
+[130:08.00]And then
+
+[130:09.00]You know YouTube
+
+[130:10.00]Happened right
+
+[130:11.00]Here in the valley
+
+[130:12.00]And then
+
+[130:13.00]You see the
+
+[130:14.00]Explosive content
+
+[130:15.00]And applications
+
+[130:16.00]Of AI to
+
+[130:17.00]You know those
+
+[130:18.00]Kind of video
+
+[130:19.00]Understanding problems
+
+[130:20.00]Then it was
+
+[130:21.00]How do we actually
+
+[130:22.00]Make sense
+
+[130:23.00]Out of our
+
+[130:24.00]IoT world
+
+[130:25.00]So when you have
+
+[130:26.00]Signals coming from
+
+[130:27.00]Sensors everywhere
+
+[130:28.00]You know whether
+
+[130:29.00]There are sensors
+
+[130:30.00]Embedded in your
+
+[130:31.00]Bridges
+
+[130:32.00]Sensory information
+
+[130:33.00]To make
+
+[130:34.00]Good decisions
+
+[130:35.00]To a
+
+[130:36.00]You know observe
+
+[130:37.00]The environments
+
+[130:38.00]And we optimize
+
+[130:39.00]Those environments
+
+[130:40.00]Right and so a large
+
+[130:41.00]Part of my
+
+[130:42.00]Second half of
+
+[130:43.00]State IBM research
+
+[130:44.00]Was to come up with
+
+[130:45.00]What is this
+
+[130:46.00]Research around
+
+[130:48.00]Smart cities
+
+[130:49.00]Smarter planet
+
+[130:50.00]And that
+
+[130:51.00]Actually became
+
+[130:52.00]An AI platform
+
+[130:53.00]For helping
+
+[130:55.00]Optimized traffic
+
+[130:56.00]Right one of
+
+[130:57.00]The proudest
+
+[130:58.00]Things that
+
+[130:59.00]I really
+
+[131:00.00]Foundly remember
+
+[131:01.00]Use the data
+
+[131:02.00]From you know
+
+[131:03.00]Teleco data sources
+
+[131:04.00]To understand
+
+[131:05.00]How people move
+
+[131:06.00]In a city
+
+[131:07.00]And then use
+
+[131:08.00]That information
+
+[131:09.00]To build
+
+[131:10.00]Optimal planning
+
+[131:11.00]Right whether
+
+[131:12.00]It's for bus outs
+
+[131:13.00]Whether it's for metro
+
+[131:14.00]Right we did
+
+[131:15.00]Some amazing work
+
+[131:16.00]In Istanbul
+
+[131:17.00]For example
+
+[131:18.00]You know
+
+[131:19.00]Completely different
+
+[131:20.00]Scale
+
+[131:21.00]And then
+
+[131:22.00]The same
+
+[131:23.00]Kind of platform
+
+[131:24.00]We applied
+
+[131:25.00]To helping
+
+[131:26.00]Cities in
+
+[131:27.00]American Midwest
+
+[131:28.00]Like Dubuque
+
+[131:29.00]Optimized their
+
+[131:30.00]Dubuque
+
+[131:31.00]And then
+
+[131:32.00]And then
+
+[131:33.00]The same
+
+[131:34.00]Like Dubuque
+
+[131:35.00]And then
+
+[131:36.00]The same
+
+[131:37.00]Like Dubuque
+
+[131:38.00]And then
+
+[131:39.00]The same
+
+[131:40.00]Like Dubuque
+
+[131:41.00]And then
+
+[131:42.00]The same
+
+[131:43.00]Like Dubuque
+
+[131:44.00]And then
+
+[131:45.00]The same
+
+[131:46.00]Like Dubuque
+
+[131:47.00]And then
+
+[131:48.00]The same
+
+[131:49.00]Like Dubuque
+
+[131:50.00]And then
+
+[131:51.00]The same
+
+[131:52.00]Like Dubuque
+
+[131:53.00]And then
+
+[131:54.00]The same
+
+[131:55.00]Like Dubuque
+
+[131:56.00]And then
+
+[131:57.00]The same
+
+[131:58.00]Like Dubuque
+
+[131:59.00]And then
+
+[132:00.00]And then
+
+[132:01.00]The same
+
+[132:02.00]Like Dubuque
+
+[132:03.00]And then
+
+[132:04.00]The same
+
+[132:05.00]Like Dubuque
+
+[132:06.00]And then
+
+[132:07.00]The same
+
+[132:08.00]Like Dubuque
+
+[132:09.00]And then
+
+[132:10.00]The same
+
+[132:11.00]Like Dubuque
+
+[132:12.00]And then
+
+[132:13.00]The same
+
+[132:14.00]Like Dubuque
+
+[132:15.00]And then
+
+[132:16.00]The same
+
+[132:17.00]Like Dubuque
+
+[132:18.00]And then
+
+[132:19.00]The same
+
+[132:20.00]Like Dubuque
+
+[132:21.00]And then
+
+[132:22.00]The same
+
+[132:23.00]Like Dubuque
+
+[132:24.00]And then
+
+[132:25.00]The same
+
+[132:26.00]Like Dubuque
+
+[132:27.00]And then
+
+[132:28.00]The same
+
+[132:29.00]Like Dubuque
+
+[132:30.00]And then
+
+[132:31.00]The same
+
+[132:32.00]Like Dubuque
+
+[132:33.00]And then
+
+[132:34.00]The same
+
+[132:35.00]Like Dubuque
+
+[132:36.00]And then
+
+[132:37.00]The same
+
+[132:38.00]Like Dubuque
+
+[132:39.00]And then
+
+[132:40.00]The same
+
+[132:41.00]Like Dubuque
+
+[132:42.00]And then
+
+[132:43.00]The same
+
+[132:44.00]Like Dubuque
+
+[132:45.00]And then
+
+[132:46.00]The same
+
+[132:47.00]Like Dubuque
+
+[132:48.00]And then
+
+[132:49.00]The same
+
+[132:50.00]Like Dubuque
+
+[132:51.00]And then
+
+[132:52.00]The same
+
+[132:53.00]Like Dubuque
+
+[132:54.00]And then
+
+[132:55.00]The same
+
+[132:56.00]Like Dubuque
+
+[132:57.00]And then
+
+[132:58.00]The same
+
+[132:59.00]Like Dubuque
+
+[133:00.00]And then
+
+[133:01.00]The same
+
+[133:02.00]Like Dubuque
+
+[133:03.00]And then
+
+[133:04.00]The same
+
+[133:05.00]Like Dubuque
+
+[133:06.00]And then
+
+[133:07.00]The same
+
+[133:08.00]Like Dubuque
+
+[133:09.00]And then
+
+[133:10.00]The same
+
+[133:11.00]Like Dubuque
+
+[133:12.00]And then
+
+[133:13.00]The same
+
+[133:14.00]Like Dubuque
+
+[133:15.00]And then
+
+[133:16.00]The same
+
+[133:17.00]Like Dubuque
+
+[133:18.00]And then
+
+[133:19.00]The same
+
+[133:20.00]Like Dubuque
+
+[133:21.00]And then
+
+[133:22.00]The same
+
+[133:23.00]Like Dubuque
+
+[133:24.00]And then
+
+[133:25.00]The same
+
+[133:26.00]Like Dubuque
+
+[133:27.00]And then
+
+[133:28.00]The same
+
+[133:29.00]Like Dubuque
+
+[133:30.00]And then
+
+[133:31.00]The same
+
+[133:32.00]Like Dubuque
+
+[133:33.00]And then
+
+[133:34.00]The same
+
+[133:35.00]Like Dubuque
+
+[133:36.00]And then
+
+[133:37.00]The same
+
+[133:38.00]Like Dubuque
+
+[133:39.00]And then
+
+[133:40.00]The same
+
+[133:41.00]Like Dubuque
+
+[133:42.00]And then
+
+[133:43.00]The same
+
+[133:44.00]Like Dubuque
+
+[133:45.00]And then
+
+[133:46.00]The same
+
+[133:47.00]Like Dubuque
+
+[133:48.00]And then
+
+[133:49.00]The same
+
+[133:50.00]Like Dubuque
+
+[133:51.00]And then
+
+[133:52.00]The same
+
+[133:53.00]Like Dubuque
+
+[133:54.00]And then
+
+[133:55.00]The same
+
+[133:56.00]Like Dubuque
+
+[133:57.00]And then
+
+[133:58.00]The same
+
+[133:59.00]Like Dubuque
+
+[134:00.00]And then
+
+[134:01.00]The same
+
+[134:02.00]Like Dubuque
+
+[134:03.00]And then
+
+[134:04.00]The same
+
+[134:05.00]Like Dubuque
+
+[134:06.00]And then
+
+[134:07.00]The same
+
+[134:08.00]Like Dubuque
+
+[134:09.00]And then
+
+[134:10.00]The same
+
+[134:11.00]Like Dubuque
+
+[134:12.00]And then
+
+[134:13.00]The same
+
+[134:14.00]Like Dubuque
+
+[134:15.00]And then
+
+[134:16.00]The same
+
+[134:17.00]Like Dubuque
+
+[134:18.00]And then
+
+[134:19.00]The same
+
+[134:20.00]Like Dubuque
+
+[134:21.00]And then
+
+[134:22.00]The same
+
+[134:23.00]Like Dubuque
+
+[134:24.00]And then
+
+[134:25.00]The same
+
+[134:26.00]Like Dubuque
+
+[134:27.00]And then
+
+[134:28.00]The same
+
+[134:29.00]Like Dubuque
+
+[134:30.00]And then
+
+[134:31.00]The same
+
+[134:32.00]Like Dubuque
+
+[134:33.00]And then
+
+[134:34.00]The same
+
+[134:35.00]Like Dubuque
+
+[134:36.00]And then
+
+[134:37.00]The same
+
+[134:38.00]Like Dubuque
+
+[134:39.00]And then
+
+[134:40.00]The same
+
+[134:41.00]Like Dubuque
+
+[134:42.00]And then
+
+[134:43.00]The same
+
+[134:44.00]Like Dubuque
+
+[134:45.00]And then
+
+[134:46.00]The same
+
+[134:47.00]Like Dubuque
+
+[134:48.00]And then
+
+[134:49.00]The same
+
+[134:50.00]Like Dubuque
+
+[134:51.00]And then
+
+[134:52.00]The same
+
+[134:53.00]Like Dubuque
+
+[134:54.00]And then
+
+[134:55.00]The same
+
+[134:56.00]Like Dubuque
+
+[134:57.00]And then
+
+[134:58.00]The same
+
+[134:59.00]Like Dubuque
+
+[135:00.00]And then
+
+[135:01.00]The same
+
+[135:02.00]Like Dubuque
+
+[135:03.00]And then
+
+[135:04.00]The same
+
+[135:05.00]Like Dubuque
+
+[135:06.00]And then
+
+[135:07.00]The same
+
+[135:08.00]Like Dubuque
+
+[135:09.00]And then
+
+[135:10.00]The same
+
+[135:11.00]Like Dubuque
+
+[135:12.00]And then
+
+[135:13.00]The same
+
+[135:14.00]Like Dubuque
+
+[135:15.00]And then
+
+[135:16.00]The same
+
+[135:17.00]Like Dubuque
+
+[135:18.00]And then
+
+[135:19.00]The same
+
+[135:20.00]Like Dubuque
+
+[135:21.00]And then
+
+[135:22.00]The same
+
+[135:23.00]Like Dubuque
+
+[135:24.00]And then
+
+[135:25.00]The same
+
+[135:26.00]Like Dubuque
+
+[135:27.00]And then
+
+[135:28.00]The same
+
+[135:29.00]Like Dubuque
+
+[135:30.00]And then
+
+[135:31.00]The same
+
+[135:32.00]Like Dubuque
+
+[135:33.00]And then
+
+[135:34.00]The same
+
+[135:35.00]Like Dubuque
+
+[135:36.00]And then
+
+[135:37.00]The same
+
+[135:38.00]Like Dubuque
+
+[135:39.00]And then
+
+[135:40.00]The same
+
+[135:41.00]Like Dubuque
+
+[135:42.00]And then
+
+[135:43.00]The same
+
+[135:44.00]Like Dubuque
+
+[135:45.00]And then
+
+[135:46.00]The same
+
+[135:47.00]Like Dubuque
+
+[135:48.00]And then
+
+[135:49.00]The same
+
+[135:50.00]Like Dubuque
+
+[135:51.00]And then
+
+[135:52.00]The same
+
+[135:53.00]Like Dubuque
+
+[135:54.00]And then
+
+[135:55.00]The same
+
+[135:56.00]Like Dubuque
+
+[135:57.00]And then
+
+[135:58.00]The same
+
+[135:59.00]Like Dubuque
+
+[136:00.00]And then
+
+[136:01.00]The same
+
+[136:02.00]Like Dubuque
+
+[136:03.00]And then
+
+[136:04.00]The same
+
+[136:05.00]Like Dubuque
+
+[136:06.00]And then
+
+[136:07.00]The same
+
+[136:08.00]Like Dubuque
+
+[136:09.00]And then
+
+[136:10.00]The same
+
+[136:11.00]Like Dubuque
+
+[136:12.00]And then
+
+[136:13.00]The same
+
+[136:14.00]Like Dubuque
+
+[136:15.00]And then
+
+[136:16.00]The same
+
+[136:17.00]Like Dubuque
+
+[136:18.00]And then
+
+[136:19.00]The same
+
+[136:20.00]Like Dubuque
+
+[136:21.00]And then
+
+[136:22.00]The same
+
+[136:23.00]Like Dubuque
+
+[136:24.00]And then
+
+[136:25.00]The same
+
+[136:26.00]Like Dubuque
+
+[136:27.00]And then
+
+[136:28.00]The same
+
+[136:29.00]Like Dubuque
+
+[136:30.00]And then
+
+[136:31.00]The same
+
+[136:32.00]Like Dubuque
+
+[136:33.00]And then
+
+[136:34.00]The same
+
+[136:35.00]Like Dubuque
+
+[136:36.00]And then
+
+[136:37.00]The same
+
+[136:38.00]Like Dubuque
+
+[136:39.00]And then
+
+[136:40.00]The same
+
+[136:41.00]Like Dubuque
+
+[136:42.00]And then
+
+[136:43.00]The same
+
+[136:44.00]Like Dubuque
+
+[136:45.00]And then
+
+[136:46.00]The same
+
+[136:47.00]Like Dubuque
+
+[136:48.00]And then
+
+[136:49.00]And then
+
+[136:50.00]The same
+
+[136:51.00]Like Dubuque
+
+[136:52.00]And then
+
+[136:53.00]The same
+
+[136:54.00]Like Dubuque
+
+[136:55.00]And then
+
+[136:56.00]The same
+
+[136:57.00]Like Dubuque
+
+[136:58.00]And then
+
+[136:59.00]The same
+
+[137:00.00]Like Dubuque
+
+[137:01.00]And then
+
+[137:02.00]The same
+
+[137:03.00]Like Dubuque
+
+[137:04.00]And then
+
+[137:05.00]The same
+
+[137:06.00]Like Dubuque
+
+[137:07.00]And then
+
+[137:08.00]The same
+
+[137:09.00]Like Dubuque
+
+[137:10.00]And then
+
+[137:11.00]The same
+
+[137:12.00]Like Dubuque
+
+[137:13.00]And then
+
+[137:14.00]The same
+
+[137:15.00]Like Dubuque
+
+[137:16.00]And then
+
+[137:17.00]The same
+
+[137:18.00]Like Dubuque
+
+[137:19.00]And then
+
+[137:20.00]The same
+
+[137:21.00]Like Dubuque
+
+[137:22.00]And then
+
+[137:23.00]The same
+
+[137:24.00]Like Dubuque
+
+[137:25.00]And then
+
+[137:26.00]The same
+
+[137:27.00]Like Dubuque
+
+[137:28.00]And then
+
+[137:29.00]The same
+
+[137:30.00]Like Dubuque
+
+[137:31.00]And then
+
+[137:32.00]The same
+
+[137:33.00]Like Dubuque
+
+[137:34.00]And then
+
+[137:35.00]The same
+
+[137:36.00]Like Dubuque
+
+[137:37.00]And then
+
+[137:38.00]The same
+
+[137:39.00]Like Dubuque
+
+[137:40.00]And then
+
+[137:41.00]The same
+
+[137:42.00]Like Dubuque
+
+[137:43.00]And then
+
+[137:44.00]The same
+
+[137:45.00]Like Dubuque
+
+[137:46.00]And then
+
+[137:47.00]The same
+
+[137:48.00]Like Dubuque
+
+[137:49.00]And then
+
+[137:50.00]The same
+
+[137:51.00]Like Dubuque
+
+[137:52.00]And then
+
+[137:53.00]The same
+
+[137:54.00]Like Dubuque
+
+[137:55.00]And then
+
+[137:56.00]The same
+
+[137:57.00]Like Dubuque
+
+[137:58.00]And then
+
+[137:59.00]The same
+
+[138:00.00]Like Dubuque
+
+[138:01.00]And then
+
+[138:02.00]The same
+
+[138:03.00]Like Dubuque
+
+[138:04.00]And then
+
+[138:05.00]The same
+
+[138:06.00]Like Dubuque
+
+[138:07.00]And then
+
+[138:08.00]The same
+
+[138:09.00]Like Dubuque
+
+[138:10.00]And then
+
+[138:11.00]The same
+
+[138:12.00]Like Dubuque
+
+[138:13.00]And then
+
+[138:14.00]The same
+
+[138:15.00]Like Dubuque
+
+[138:16.00]And then
+
+[138:17.00]The same
+
+[138:18.00]Like Dubuque
+
+[138:19.00]And then
+
+[138:20.00]The same
+
+[138:21.00]Like Dubuque
+
+[138:22.00]And then
+
+[138:23.00]The same
+
+[138:24.00]Like Dubuque
+
+[138:25.00]And then
+
+[138:26.00]The same
+
+[138:27.00]Like Dubuque
+
+[138:28.00]And then
+
+[138:29.00]The same
+
+[138:30.00]Like Dubuque
+
+[138:31.00]And then
+
+[138:32.00]The same
+
+[138:33.00]Like Dubuque
+
+[138:34.00]And then
+
+[138:35.00]The same
+
+[138:36.00]Like Dubuque
+
+[138:37.00]And then
+
+[138:38.00]The same
+
+[138:39.00]Like Dubuque
+
+[138:40.00]And then
+
+[138:41.00]The same
+
+[138:42.00]Like Dubuque
+
+[138:43.00]And then
+
+[138:44.00]The same
+
+[138:45.00]Like Dubuque
+
+[138:46.00]And then
+
+[138:47.00]The same
+
+[138:48.00]Like Dubuque
+
+[138:49.00]And then
+
+[138:50.00]The same
+
+[138:51.00]Like Dubuque
+
+[138:52.00]And then
+
+[138:53.00]The same
+
+[138:54.00]Like Dubuque
+
+[138:55.00]And then
+
+[138:56.00]The same
+
+[138:57.00]Like Dubuque
+
+[138:58.00]And then
+
+[138:59.00]The same
+
+[139:00.00]Like Dubuque
+
+[139:01.00]And then
+
+[139:02.00]The same
+
+[139:03.00]Like Dubuque
+
+[139:04.00]And then
+
+[139:05.00]The same
+
+[139:06.00]Like Dubuque
+
+[139:07.00]And then
+
+[139:08.00]The same
+
+[139:09.00]Like Dubuque
+
+[139:10.00]And then
+
+[139:11.00]The same
+
+[139:12.00]Like Dubuque
+
+[139:13.00]And then
+
+[139:14.00]The same
+
+[139:15.00]Like Dubuque
+
+[139:16.00]And then
+
+[139:17.00]The same
+
+[139:18.00]Like Dubuque
+
+[139:19.00]And then
+
+[139:20.00]The same
+
+[139:21.00]Like Dubuque
+
+[139:22.00]And then
+
+[139:23.00]The same
+
+[139:24.00]Like Dubuque
+
+[139:25.00]And then
+
+[139:26.00]The same
+
+[139:27.00]Like Dubuque
+
+[139:28.00]And then
+
+[139:29.00]The same
+
+[139:30.00]Like Dubuque
+
+[139:31.00]And then
+
+[139:32.00]The same
+
+[139:33.00]Like Dubuque
+
+[139:34.00]And then
+
+[139:35.00]The same
+
+[139:36.00]Like Dubuque
+
+[139:37.00]And then
+
+[139:38.00]The same
+
+[139:39.00]Like Dubuque
+
+[139:40.00]And then
+
+[139:41.00]The same
+
+[139:42.00]Like Dubuque
+
+[139:43.00]And then
+
+[139:44.00]The same
+
+[139:45.00]Like Dubuque
+
+[139:46.00]And then
+
+[139:47.00]The same
+
+[139:48.00]Like Dubuque
+
+[139:49.00]And then
+
+[139:50.00]The same
+
+[139:51.00]Like Dubuque
+
+[139:52.00]And then
+
+[139:53.00]The same
+
+[139:54.00]Like Dubuque
+
+[139:55.00]And then
+
+[139:56.00]The same
+
+[139:57.00]Like Dubuque
+
+[139:58.00]And then
+
+[139:59.00]The same
+
+[140:00.00]Like Dubuque
+
+[140:01.00]And then
+
+[140:02.00]The same
+
+[140:03.00]Like Dubuque
+
+[140:04.00]And then
+
+[140:05.00]The same
+
+[140:06.00]Like Dubuque
+
+[140:07.00]And then
+
+[140:08.00]The same
+
+[140:09.00]Like Dubuque
+
+[140:10.00]And then
+
+[140:11.00]The same
+
+[140:12.00]Like Dubuque
+
+[140:13.00]And then
+
+[140:14.00]The same
+
+[140:15.00]Like Dubuque
+
+[140:16.00]And then
+
+[140:17.00]The same
+
+[140:18.00]Like Dubuque
+
+[140:19.00]And then
+
+[140:20.00]The same
+
+[140:21.00]Like Dubuque
+
+[140:22.00]And then
+
+[140:23.00]The same
+
+[140:24.00]Like Dubuque
+
+[140:25.00]And then
+
+[140:26.00]The same
+
+[140:27.00]Like Dubuque
+
+[140:28.00]And then
+
+[140:29.00]The same
+
+[140:30.00]Like Dubuque
+
+[140:31.00]And then
+
+[140:32.00]The same
+
+[140:33.00]Like Dubuque
+
+[140:34.00]And then
+
+[140:35.00]The same
+
+[140:36.00]Like Dubuque
+
+[140:37.00]And then
+
+[140:38.00]The same
+
+[140:39.00]Like Dubuque
+
+[140:40.00]And then
+
+[140:41.00]The same
+
+[140:42.00]Like Dubuque
+
+[140:43.00]And then
+
+[140:44.00]The same
+
+[140:45.00]Like Dubuque
+
+[140:46.00]And then
+
+[140:47.00]The same
+
+[140:48.00]Like Dubuque
+
+[140:49.00]And then
+
+[140:50.00]The same
+
+[140:51.00]Like Dubuque
+
+[140:52.00]And then
+
+[140:53.00]The same
+
+[140:54.00]Like Dubuque
+
+[140:55.00]And then
+
+[140:56.00]The same
+
+[140:57.00]Like Dubuque
+
+[140:58.00]And then
+
+[140:59.00]The same
+
+[141:00.00]Like Dubuque
+
+[141:01.00]And then
+
+[141:02.00]The same
+
+[141:03.00]Like Dubuque
+
+[141:04.00]And then
+
+[141:05.00]The same
+
+[141:06.00]Like Dubuque
+
+[141:07.00]And then
+
+[141:08.00]The same
+
+[141:09.00]Like Dubuque
+
+[141:10.00]And then
+
+[141:11.00]The same
+
+[141:12.00]Like Dubuque
+
+[141:13.00]And then
+
+[141:14.00]The same
+
+[141:15.00]Like Dubuque
+
+[141:16.00]And then
+
+[141:17.00]The same
+
+[141:18.00]Like Dubuque
+
+[141:19.00]And then
+
+[141:20.00]The same
+
+[141:21.00]Like Dubuque
+
+[141:22.00]And then
+
+[141:23.00]The same
+
+[141:24.00]Like Dubuque
+
+[141:25.00]And then
+
+[141:26.00]The same
+
+[141:27.00]Like Dubuque
+
+[141:28.00]And then
+
+[141:29.00]The same
+
+[141:30.00]Like Dubuque
+
+[141:31.00]And then
+
+[141:32.00]The same
+
+[141:33.00]Like Dubuque
+
+[141:34.00]And then
+
+[141:35.00]The same
+
+[141:36.00]Like Dubuque
+
+[141:37.00]And then
+
+[141:38.00]The same
+
+[141:39.00]Like Dubuque
+
+[141:40.00]And then
+
+[141:41.00]The same
+
+[141:42.00]Like Dubuque
+
+[141:43.00]And then
+
+[141:44.00]The same
+
+[141:45.00]Like Dubuque
+
+[141:46.00]And then
+
+[141:47.00]The same
+
+[141:48.00]Like Dubuque
+
+[141:49.00]And then
+
+[141:50.00]The same
+
+[141:51.00]Like Dubuque
+
+[141:52.00]And then
+
+[141:53.00]The same
+
+[141:54.00]Like Dubuque
+
+[141:55.00]And then
+
+[141:56.00]The same
+
+[141:57.00]Like Dubuque
+
+[141:58.00]And then
+
+[141:59.00]The same
+
+[142:00.00]Like Dubuque
+
+[142:01.00]And then
+
+[142:02.00]The same
+
+[142:03.00]Like Dubuque
+
+[142:04.00]And then
+
+[142:05.00]The same
+
+[142:06.00]Like Dubuque
+
+[142:07.00]And then
+
+[142:08.00]The same
+
+[142:09.00]Like Dubuque
+
+[142:10.00]And then
+
+[142:11.00]The same
+
+[142:12.00]Like Dubuque
+
+[142:13.00]And then
+
+[142:14.00]The same
+
+[142:15.00]Like Dubuque
+
+[142:16.00]And then
+
+[142:17.00]The same
+
+[142:18.00]Like Dubuque
+
+[142:19.00]And then
+
+[142:20.00]The same
+
+[142:21.00]Like Dubuque
+
+[142:22.00]And then
+
+[142:23.00]The same
+
+[142:24.00]Like Dubuque
+
+[142:25.00]And then
+
+[142:26.00]The same
+
+[142:27.00]Like Dubuque
+
+[142:28.00]And then
+
+[142:29.00]The same
+
+[142:30.00]Like Dubuque
+
+[142:31.00]And then
+
+[142:32.00]The same
+
+[142:33.00]Like Dubuque
+
+[142:34.00]And then
+
+[142:35.00]The same
+
+[142:36.00]Like Dubuque
+
+[142:37.00]And then
+
+[142:38.00]The same
+
+[142:39.00]Like Dubuque
+
+[142:40.00]And then
+
+[142:41.00]The same
+
+[142:42.00]Like Dubuque
+
+[142:43.00]And then
+
+[142:44.00]The same
+
+[142:45.00]Like Dubuque
+
+[142:46.00]And then
+
+[142:47.00]The same
+
+[142:48.00]Like Dubuque
+
+[142:49.00]And then
+
+[142:50.00]The same
+
+[142:51.00]Like Dubuque
+
+[142:52.00]And then
+
+[142:53.00]The same
+
+[142:54.00]Like Dubuque
+
+[142:55.00]And then
+
+[142:56.00]The same
+
+[142:57.00]Like Dubuque
+
+[142:58.00]And then
+
+[142:59.00]The same
+
+[143:00.00]Like Dubuque
+
+[143:01.00]And then
+
+[143:02.00]The same
+
+[143:03.00]Like Dubuque
+
+[143:04.00]And then
+
+[143:05.00]The same
+
+[143:06.00]Like Dubuque
+
+[143:07.00]And then
+
+[143:08.00]The same
+
+[143:09.00]Like Dubuque
+
+[143:10.00]And then
+
+[143:11.00]The same
+
+[143:12.00]Like Dubuque
+
+[143:13.00]And then
+
+[143:14.00]The same
+
+[143:15.00]Like Dubuque
+
+[143:16.00]And then
+
+[143:17.00]The same
+
+[143:18.00]Like Dubuque
+
+[143:19.00]And then
+
+[143:20.00]The same
+
+[143:21.00]Like Dubuque
+
+[143:22.00]And then
+
+[143:23.00]The same
+
+[143:24.00]Like Dubuque
+
+[143:25.00]And then
+
+[143:26.00]The same
+
+[143:27.00]Like Dubuque
+
+[143:28.00]And then
+
+[143:29.00]The same
+
+[143:30.00]Like Dubuque
+
+[143:31.00]And then
+
+[143:32.00]The same
+
+[143:33.00]Like Dubuque
+
+[143:34.00]And then
+
+[143:35.00]The same
+
+[143:36.00]Like Dubuque
+
+[143:37.00]And then
+
+[143:38.00]The same
+
+[143:39.00]Like Dubuque
+
+[143:40.00]And then
+
+[143:41.00]The same
+
+[143:42.00]Like Dubuque
+
+[143:43.00]And then
+
+[143:44.00]The same
+
+[143:45.00]Like Dubuque
+
+[143:46.00]And then
+
+[143:47.00]The same
+
+[143:48.00]Like Dubuque
+
+[143:49.00]And then
+
+[143:50.00]The same
+
+[143:51.00]Like Dubuque
+
+[143:52.00]And then
+
+[143:53.00]The same
+
+[143:54.00]Like Dubuque
+
+[143:55.00]And then
+
+[143:56.00]The same
+
+[143:57.00]Like Dubuque
+
+[143:58.00]And then
+
+[143:59.00]The same
+
+[144:00.00]Like Dubuque
+
+[144:01.00]And then
+
+[144:02.00]The same
+
+[144:03.00]Like Dubuque
+
+[144:04.00]And then
+
+[144:05.00]The same
+
+[144:06.00]Like Dubuque
+
+[144:07.00]And then
+
+[144:08.00]The same
+
+[144:09.00]Like Dubuque
+
+[144:10.00]And then
+
+[144:11.00]The same
+
+[144:12.00]Like Dubuque
+
+[144:13.00]And then
+
+[144:14.00]The same
+
+[144:15.00]Like Dubuque
+
+[144:16.00]And then
+
+[144:17.00]The same
+
+[144:18.00]Like Dubuque
+
+[144:19.00]And then
+
+[144:20.00]The same
+
+[144:21.00]Like Dubuque
+
+[144:22.00]And then
+
+[144:23.00]The same
+
+[144:24.00]Like Dubuque
+
+[144:25.00]And then
+
+[144:26.00]The same
+
+[144:27.00]Like Dubuque
+
+[144:28.00]And then
+
+[144:29.00]The same
+
+[144:30.00]Like Dubuque
+
+[144:31.00]And then
+
+[144:32.00]The same
+
+[144:33.00]Like Dubuque
+
+[144:34.00]And then
+
+[144:35.00]The same
+
+[144:36.00]Like Dubuque
+
+[144:37.00]And then
+
+[144:38.00]The same
+
+[144:39.00]Like Dubuque
+
+[144:40.00]And then
+
+[144:41.00]The same
+
+[144:42.00]Like Dubuque
+
+[144:43.00]And then
+
+[144:44.00]The same
+
+[144:45.00]Like Dubuque
+
+[144:46.00]And then
+
+[144:47.00]The same
+
+[144:48.00]Like Dubuque
+
+[144:49.00]And then
+
+[144:50.00]The same
+
+[144:51.00]Like Dubuque
+
+[144:52.00]And then
+
+[144:53.00]The same
+
+[144:54.00]Like Dubuque
+
+[144:55.00]And then
+
+[144:56.00]The same
+
+[144:57.00]Like Dubuque
+
+[144:58.00]And then
+
+[144:59.00]The same
+
+[145:00.00]Like Dubuque
+
+[145:01.00]And then
+
+[145:02.00]The same
+
+[145:03.00]Like Dubuque
+
+[145:04.00]And then
+
+[145:05.00]The same
+
+[145:06.00]Like Dubuque
+
+[145:07.00]And then
+
+[145:08.00]The same
+
+[145:09.00]Like Dubuque
+
+[145:10.00]And then
+
+[145:11.00]The same
+
+[145:12.00]Like Dubuque
+
+[145:13.00]And then
+
+[145:14.00]The same
+
+[145:15.00]Like Dubuque
+
+[145:16.00]And then
+
+[145:17.00]The same
+
+[145:18.00]Like Dubuque
+
+[145:19.00]And then
+
+[145:20.00]The same
+
+[145:21.00]Like Dubuque
+
+[145:22.00]And then
+
+[145:23.00]The same
+
+[145:24.00]Like Dubuque
+
+[145:25.00]And then
+
+[145:26.00]The same
+
+[145:27.00]Like Dubuque
+
+[145:28.00]And then
+
+[145:29.00]The same
+
+[145:30.00]Like Dubuque
+
+[145:31.00]And then
+
+[145:32.00]The same
+
+[145:33.00]Like Dubuque
+
+[145:34.00]And then
+
+[145:35.00]The same
+
+[145:36.00]Like Dubuque
+
+[145:37.00]And then
+
+[145:38.00]The same
+
+[145:39.00]Like Dubuque
+
+[145:40.00]And then
+
+[145:41.00]The same
+
+[145:42.00]Like Dubuque
+
+[145:43.00]And then
+
+[145:44.00]The same
+
+[145:45.00]Like Dubuque
+
+[145:46.00]And then
+
+[145:47.00]The same
+
+[145:48.00]Like Dubuque
+
+[145:49.00]And then
+
+[145:50.00]The same
+
+[145:51.00]Like Dubuque
+
+[145:52.00]And then
+
+[145:53.00]The same
+
+[145:54.00]Like Dubuque
+
+[145:55.00]And then
+
+[145:56.00]The same
+
+[145:57.00]Like Dubuque
+
+[145:58.00]And then
+
+[145:59.00]The same
+
+[146:00.00]Like Dubuque
+
+[146:01.00]And then
+
+[146:02.00]The same
+
+[146:03.00]Like Dubuque
+
+[146:04.00]And then
+
+[146:05.00]The same
+
+[146:06.00]Like Dubuque
+
+[146:07.00]And then
+
+[146:08.00]The same
+
+[146:09.00]Like Dubuque
+
+[146:10.00]And then
+
+[146:11.00]The same
+
+[146:12.00]Like Dubuque
+
+[146:13.00]And then
+
+[146:14.00]The same
+
+[146:15.00]Like Dubuque
+
+[146:16.00]And then
+
+[146:17.00]The same
+
+[146:18.00]Like Dubuque
+
+[146:19.00]And then
+
+[146:20.00]The same
+
+[146:21.00]Like Dubuque
+
+[146:22.00]And then
+
+[146:23.00]The same
+
+[146:24.00]Like Dubuque
+
+[146:25.00]And then
+
+[146:26.00]The same
+
+[146:27.00]Like Dubuque
+
+[146:28.00]And then
+
+[146:29.00]The same
+
+[146:30.00]Like Dubuque
+
+[146:31.00]And then
+
+[146:32.00]The same
+
+[146:33.00]Like Dubuque
+
+[146:34.00]And then
+
+[146:35.00]The same
+
+[146:36.00]Like Dubuque
+
+[146:37.00]And then
+
+[146:38.00]The same
+
+[146:39.00]Like Dubuque
+
+[146:40.00]And then
+
+[146:41.00]The same
+
+[146:42.00]Like Dubuque
+
+[146:43.00]And then
+
+[146:44.00]The same
+
+[146:45.00]Like Dubuque
+
+[146:46.00]And then
+
+[146:47.00]The same
+
+[146:48.00]Like Dubuque
+
+[146:49.00]And then
+
+[146:50.00]The same
+
+[146:51.00]Like Dubuque
+
+[146:52.00]And then
+
+[146:53.00]The same
+
+[146:54.00]Like Dubuque
+
+[146:55.00]And then
+
+[146:56.00]The same
+
+[146:57.00]Like Dubuque
+
+[146:58.00]And then
+
+[146:59.00]The same
+
+[147:00.00]Like Dubuque
+
+[147:01.00]And then
+
+[147:02.00]The same
+
+[147:03.00]Like Dubuque
+
+[147:04.00]And then
+
+[147:05.00]The same
+
+[147:06.00]Like Dubuque
+
+[147:07.00]And then
+
+[147:08.00]The same
+
+[147:09.00]Like Dubuque
+
+[147:10.00]And then
+
+[147:11.00]The same
+
+[147:12.00]Like Dubuque
+
+[147:13.00]And then
+
+[147:14.00]The same
+
+[147:15.00]Like Dubuque
+
+[147:16.00]And then
+
+[147:17.00]The same
+
+[147:18.00]Like Dubuque
+
+[147:19.00]And then
+
+[147:20.00]The same
+
+[147:21.00]Like Dubuque
+
+[147:22.00]And then
+
+[147:23.00]The same
+
+[147:24.00]Like Dubuque
+
+[147:25.00]And then
+
+[147:26.00]The same
+
+[147:27.00]Like Dubuque
+
+[147:28.00]And then
+
+[147:29.00]The same
+
+[147:30.00]Like Dubuque
+
+[147:31.00]And then
+
+[147:32.00]The same
+
+[147:33.00]Like Dubuque
+
+[147:34.00]And then
+
+[147:35.00]The same
+
+[147:36.00]Like Dubuque
+
+[147:37.00]And then
+
+[147:38.00]The same
+
+[147:39.00]Like Dubuque
+
+[147:40.00]And then
+
+[147:41.00]The same
+
+[147:42.00]Like Dubuque
+
+[147:43.00]And then
+
+[147:44.00]The same
+
+[147:45.00]Like Dubuque
+
+[147:46.00]And then
+
+[147:47.00]The same
+
+[147:48.00]Like Dubuque
+
+[147:49.00]And then
+
+[147:50.00]The same
+
+[147:51.00]Like Dubuque
+
+[147:52.00]And then
+
+[147:53.00]The same
+
+[147:54.00]Like Dubuque
+
+[147:55.00]And then
+
+[147:56.00]The same
+
+[147:57.00]Like Dubuque
+
+[147:58.00]And then
+
+[147:59.00]The same
+
+[148:00.00]Like Dubuque
+
+[148:01.00]And then
+
+[148:02.00]The same
+
+[148:03.00]Like Dubuque
+
+[148:04.00]And then
+
+[148:05.00]The same
+
+[148:06.00]Like Dubuque
+
+[148:07.00]And then
+
+[148:08.00]The same
+
+[148:09.00]Like Dubuque
+
+[148:10.00]And then
+
+[148:11.00]The same
+
+[148:12.00]Like Dubuque
+
+[148:13.00]And then
+
+[148:14.00]The same
+
+[148:15.00]Like Dubuque
+
+[148:16.00]And then
+
+[148:17.00]The same
+
+[148:18.00]Like Dubuque
+
+[148:19.00]And then
+
+[148:20.00]The same
+
+[148:21.00]Like Dubuque
+
+[148:22.00]And then
+
+[148:23.00]The same
+
+[148:24.00]Like Dubuque
+
+[148:25.00]And then
+
+[148:26.00]The same
+
+[148:27.00]Like Dubuque
+
+[148:28.00]And then
+
+[148:29.00]The same
+
+[148:30.00]Like Dubuque
+
+[148:31.00]And then
+
+[148:32.00]The same
+
+[148:33.00]Like Dubuque
+
+[148:34.00]And then
+
+[148:35.00]The same
+
+[148:36.00]Like Dubuque
+
+[148:37.00]And then
+
+[148:38.00]The same
+
+[148:39.00]Like Dubuque
+
+[148:40.00]And then
+
+[148:41.00]The same
+
+[148:42.00]Like Dubuque
+
+[148:43.00]And then
+
+[148:44.00]The same
+
+[148:45.00]Like Dubuque
+
+[148:46.00]And then
+
+[148:47.00]The same
+
+[148:48.00]Like Dubuque
+
+[148:49.00]And then
+
+[148:50.00]The same
+
+[148:51.00]Like Dubuque
+
+[148:52.00]And then
+
+[148:53.00]The same
+
+[148:54.00]Like Dubuque
+
+[148:55.00]And then
+
+[148:56.00]The same
+
+[148:57.00]Like Dubuque
+
+[148:58.00]And then
+
+[148:59.00]The same
+
+[149:00.00]Like Dubuque
+
+[149:01.00]And then
+
+[149:02.00]The same
+
+[149:03.00]Like Dubuque
+
+[149:04.00]And then
+
+[149:05.00]The same
+
+[149:06.00]Like Dubuque
+
+[149:07.00]And then
+
+[149:08.00]The same
+
+[149:09.00]Like Dubuque
+
+[149:10.00]And then
+
+[149:11.00]The same
+
+[149:12.00]Like Dubuque
+
+[149:13.00]And then
+
+[149:14.00]The same
+
+[149:15.00]Like Dubuque
+
+[149:16.00]And then
+
+[149:17.00]The same
+
+[149:18.00]Like Dubuque
+
+[149:19.00]And then
+
+[149:20.00]The same
+
+[149:21.00]Like Dubuque
+
+[149:22.00]And then
+
+[149:23.00]The same
+
+[149:24.00]Like Dubuque
+
+[149:25.00]And then
+
+[149:26.00]The same
+
+[149:27.00]Like Dubuque
+
+[149:28.00]And then
+
+[149:29.00]The same
+
+[149:30.00]Like Dubuque
+
+[149:31.00]And then
+
+[149:32.00]The same
+
+[149:33.00]Like Dubuque
+
+[149:34.00]And then
+
+[149:35.00]The same
+
+[149:36.00]Like Dubuque
+
+[149:37.00]And then
+
+[149:38.00]The same
+
+[149:39.00]Like Dubuque
+
+[149:40.00]And then
+
+[149:41.00]The same
+
+[149:42.00]Like Dubuque
+
+[149:43.00]And then
+
+[149:44.00]The same
+
+[149:45.00]Like Dubuque
+
+[149:46.00]And then
+
+[149:47.00]The same
+
+[149:48.00]Like Dubuque
+
+[149:49.00]And then
+
+[149:50.00]The same
+
+[149:51.00]Like Dubuque
+
+[149:52.00]And then
+
+[149:53.00]The same
+
+[149:54.00]Like Dubuque
+
+[149:55.00]And then
+
+[149:56.00]The same
+
+[149:57.00]Like Dubuque
+
+[149:58.00]And then
+
+[149:59.00]The same
+
+[150:00.00]Like Dubuque
+
+[150:01.00]And then
+
+[150:02.00]The same
+
+[150:03.00]Like Dubuque
+
+[150:04.00]And then
+
+[150:05.00]The same
+
+[150:06.00]Like Dubuque
+
+[150:07.00]And then
+
+[150:08.00]The same
+
+[150:09.00]Like Dubuque
+
+[150:10.00]And then
+
+[150:11.00]The same
+
+[150:12.00]Like Dubuque
+
+[150:13.00]And then
+
+[150:14.00]The same
+
+[150:15.00]Like Dubuque
+
+[150:16.00]And then
+
+[150:17.00]The same
+
+[150:18.00]Like Dubuque
+
+[150:19.00]And then
+
+[150:20.00]The same
+
+[150:21.00]Like Dubuque
+
+[150:22.00]And then
+
+[150:23.00]The same
+
+[150:24.00]Like Dubuque
+
+[150:25.00]And then
+
+[150:26.00]The same
+
+[150:27.00]Like Dubuque
+
+[150:28.00]And then
+
+[150:29.00]The same
+
+[150:30.00]Like Dubuque
+
+[150:31.00]And then
+
+[150:32.00]The same
+
+[150:33.00]Like Dubuque
+
+[150:34.00]And then
+
+[150:35.00]The same
+
+[150:36.00]Like Dubuque
+
+[150:37.00]And then
+
+[150:38.00]The same
+
+[150:39.00]Like Dubuque
+
+[150:40.00]And then
+
+[150:41.00]The same
+
+[150:42.00]Like Dubuque
+
+[150:43.00]And then
+
+[150:44.00]The same
+
+[150:45.00]Like Dubuque
+
+[150:46.00]And then
+
+[150:47.00]The same
+
+[150:48.00]Like Dubuque
+
+[150:49.00]And then
+
+[150:50.00]The same
+
+[150:51.00]Like Dubuque
+
+[150:52.00]And then
+
+[150:53.00]The same
+
+[150:54.00]Like Dubuque
+
+[150:55.00]And then
+
+[150:56.00]The same
+
+[150:57.00]Like Dubuque
+
+[150:58.00]And then
+
+[150:59.00]The same
+
+[151:00.00]Like Dubuque
+
+[151:01.00]And then
+
+[151:02.00]The same
+
+[151:03.00]Like Dubuque
+
+[151:04.00]And then
+
+[151:05.00]The same
+
+[151:06.00]Like Dubuque
+
+[151:07.00]And then
+
+[151:08.00]The same
+
+[151:09.00]Like Dubuque
+
+[151:10.00]And then
+
+[151:11.00]The same
+
+[151:12.00]Like Dubuque
+
+[151:13.00]And then
+
+[151:14.00]The same
+
+[151:15.00]Like Dubuque
+
+[151:16.00]And then
+
+[151:17.00]The same
+
+[151:18.00]Like Dubuque
+
+[151:19.00]And then
+
+[151:20.00]The same
+
+[151:21.00]Like Dubuque
+
+[151:22.00]And then
+
+[151:23.00]The same
+
+[151:24.00]Like Dubuque
+
+[151:25.00]And then
+
+[151:26.00]The same
+
+[151:27.00]Like Dubuque
+
+[151:28.00]And then
+
+[151:29.00]The same
+
+[151:30.00]Like Dubuque
+
+[151:31.00]And then
+
+[151:32.00]The same
+
+[151:33.00]Like Dubuque
+
+[151:34.00]And then
+
+[151:35.00]The same
+
+[151:36.00]Like Dubuque
+
+[151:37.00]And then
+
+[151:38.00]The same
+
+[151:39.00]Like Dubuque
+
+[151:40.00]And then
+
+[151:41.00]The same
+
+[151:42.00]Like Dubuque
+
+[151:43.00]And then
+
+[151:44.00]The same
+
+[151:45.00]Like Dubuque
+
+[151:46.00]And then
+
+[151:47.00]The same
+
+[151:48.00]Like Dubuque
+
+[151:49.00]And then
+
+[151:50.00]The same
+
+[151:51.00]Like Dubuque
+
+[151:52.00]And then
+
+[151:53.00]The same
+
+[151:54.00]Like Dubuque
+
+[151:55.00]And then
+
+[151:56.00]The same
+
+[151:57.00]Like Dubuque
+
+[151:58.00]And then
+
+[151:59.00]The same
+
+[152:00.00]Like Dubuque
+
+[152:01.00]And then
+
+[152:02.00]The same
+
+[152:03.00]Like Dubuque
+
+[152:04.00]And then
+
+[152:05.00]The same
+
+[152:06.00]Like Dubuque
+
+[152:07.00]And then
+
+[152:08.00]The same
+
+[152:09.00]Like Dubuque
+
+[152:10.00]And then
+
+[152:11.00]The same
+
+[152:12.00]Like Dubuque
+
+[152:13.00]And then
+
+[152:14.00]The same
+
+[152:15.00]Like Dubuque
+
+[152:16.00]And then
+
+[152:17.00]The same
+
+[152:18.00]Like Dubuque
+
+[152:19.00]And then
+
+[152:20.00]The same
+
+[152:21.00]Like Dubuque
+
+[152:22.00]And then
+
+[152:23.00]The same
+
+[152:24.00]Like Dubuque
+
+[152:25.00]And then
+
+[152:26.00]The same
+
+[152:27.00]Like Dubuque
+
+[152:28.00]And then
+
+[152:29.00]The same
+
+[152:30.00]Like Dubuque
+
+[152:31.00]And then
+
+[152:32.00]The same
+
+[152:33.00]Like Dubuque
+
+[152:34.00]And then
+
+[152:35.00]The same
+
+[152:36.00]Like Dubuque
+
+[152:37.00]And then
+
+[152:38.00]The same
+
+[152:39.00]Like Dubuque
+
+[152:40.00]And then
+
+[152:41.00]The same
+
+[152:42.00]Like Dubuque
+
+[152:43.00]And then
+
+[152:44.00]The same
+
+[152:45.00]Like Dubuque
+
+[152:46.00]And then
+
+[152:47.00]The same
+
+[152:48.00]Like Dubuque
+
+[152:49.00]And then
+
+[152:50.00]The same
+
+[152:51.00]Like Dubuque
+
+[152:52.00]And then
+
+[152:53.00]The same
+
+[152:54.00]Like Dubuque
+
+[152:55.00]And then
+
+[152:56.00]The same
+
+[152:57.00]Like Dubuque
+
+[152:58.00]And then
+
+[152:59.00]The same
+
+[153:00.00]Like Dubuque
+
+[153:01.00]And then
+
+[153:02.00]The same
+
+[153:03.00]Like Dubuque
+
+[153:04.00]And then
+
+[153:05.00]The same
+
+[153:06.00]Like Dubuque
+
+[153:07.00]And then
+
+[153:08.00]The same
+
+[153:09.00]Like Dubuque
+
+[153:10.00]And then
+
+[153:11.00]The same
+
+[153:12.00]Like Dubuque
+
+[153:13.00]And then
+
+[153:14.00]The same
+
+[153:15.00]Like Dubuque
+
+[153:16.00]And then
+
+[153:17.00]The same
+
+[153:18.00]Like Dubuque
+
+[153:19.00]And then
+
+[153:20.00]The same
+
+[153:21.00]Like Dubuque
+
+[153:22.00]And then
+
+[153:23.00]The same
+
+[153:24.00]Like Dubuque
+
+[153:25.00]And then
+
+[153:26.00]The same
+
+[153:27.00]Like Dubuque
+
+[153:28.00]And then
+
+[153:29.00]The same
+
+[153:30.00]Like Dubuque
+
+[153:31.00]And then
+
+[153:32.00]The same
+
+[153:33.00]Like Dubuque
+
+[153:34.00]And then
+
+[153:35.00]The same
+
+[153:36.00]Like Dubuque
+
+[153:37.00]And then
+
+[153:38.00]The same
+
+[153:39.00]Like Dubuque
+
+[153:40.00]And then
+
+[153:41.00]The same
+
+[153:42.00]Like Dubuque
+
+[153:43.00]And then
+
+[153:44.00]The same
+
+[153:45.00]Like Dubuque
+
+[153:46.00]And then
+
+[153:47.00]The same
+
+[153:48.00]Like Dubuque
+
+[153:49.00]And then
+
+[153:50.00]The same
+
+[153:51.00]Like Dubuque
+
+[153:52.00]And then
+
+[153:53.00]The same
+
+[153:54.00]Like Dubuque
+
+[153:55.00]And then
+
+[153:56.00]The same
+
+[153:57.00]Like Dubuque
+
+[153:58.00]And then
+
+[153:59.00]The same
+
+[154:00.00]Like Dubuque
+
+[154:01.00]And then
+
+[154:02.00]The same
+
+[154:03.00]Like Dubuque
+
+[154:04.00]And then
+
+[154:05.00]The same
+
+[154:06.00]Like Dubuque
+
+[154:07.00]And then
+
+[154:08.00]The same
+
+[154:09.00]Like Dubuque
+
+[154:10.00]And then
+
+[154:11.00]The same
+
+[154:12.00]Like Dubuque
+
+[154:13.00]And then
+
+[154:14.00]The same
+
+[154:15.00]Like Dubuque
+
+[154:16.00]And then
+
+[154:17.00]The same
+
+[154:18.00]Like Dubuque
+
+[154:19.00]And then
+
+[154:20.00]The same
+
+[154:21.00]Like Dubuque
+
+[154:22.00]And then
+
+[154:23.00]The same
+
+[154:24.00]Like Dubuque
+
+[154:25.00]And then
+
+[154:26.00]The same
+
+[154:27.00]Like Dubuque
+
+[154:28.00]And then
+
+[154:29.00]The same
+
+[154:30.00]Like Dubuque
+
+[154:31.00]And then
+
+[154:32.00]The same
+
+[154:33.00]Like Dubuque
+
+[154:34.00]And then
+
+[154:35.00]The same
+
+[154:36.00]Like Dubuque
+
+[154:37.00]And then
+
+[154:38.00]The same
+
+[154:39.00]Like Dubuque
+
+[154:40.00]And then
+
+[154:41.00]The same
+
+[154:42.00]Like Dubuque
+
+[154:43.00]And then
+
+[154:44.00]The same
+
+[154:45.00]Like Dubuque
+
+[154:46.00]And then
+
+[154:47.00]The same
+
+[154:48.00]Like Dubuque
+
+[154:49.00]And then
+
+[154:50.00]The same
+
+[154:51.00]Like Dubuque
+
+[154:52.00]And then
+
+[154:53.00]The same
+
+[154:54.00]Like Dubuque
+
+[154:55.00]And then
+
+[154:56.00]The same
+
+[154:57.00]Like Dubuque
+
+[154:58.00]And then
+
+[154:59.00]The same
+
+[155:00.00]Like Dubuque
+
+[155:01.00]And then
+
+[155:02.00]The same
+
+[155:03.00]Like Dubuque
+
+[155:04.00]And then
+
+[155:05.00]The same
+
+[155:06.00]Like Dubuque
+
+[155:07.00]And then
+
+[155:08.00]The same
+
+[155:09.00]Like Dubuque
+
+[155:10.00]And then
+
+[155:11.00]The same
+
+[155:12.00]Like Dubuque
+
+[155:13.00]And then
+
+[155:14.00]The same
+
+[155:15.00]Like Dubuque
+
+[155:16.00]And then
+
+[155:17.00]The same
+
+[155:18.00]Like Dubuque
+
+[155:19.00]And then
+
+[155:20.00]The same
+
+[155:21.00]Like Dubuque
+
+[155:22.00]And then
+
+[155:23.00]The same
+
+[155:24.00]Like Dubuque
+
+[155:25.00]And then
+
+[155:26.00]The same
+
+[155:27.00]Like Dubuque
+
+[155:28.00]And then
+
+[155:29.00]The same
+
+[155:30.00]Like Dubuque
+
+[155:31.00]There has been a tremendous explosion in terms of the number of people that are interested, the number of experiment that are being tried out.
+
+[155:38.00]And I would say we are very fortunate that there is this kind of interest right now.
+
+[155:43.00]The thing that you notice is it does go from some of the seminal moment to seminal moment.
+
+[155:50.00]Between AlexNet and the transformer paper in 2017, you could see that there was a huge amount of innovation on the visual side.
+
+[156:00.00]You know, we had resonance, you know, all kinds of interesting architectures.
+
+[156:05.00]And then after 2017, it became again significantly focused on sequence to sequence, right?
+
+[156:11.00]So the ideal sequence here, you know, machine translation therefore was text.
+
+[156:16.00]You know, we keep hearing about maybe there will be a breakthrough moment on the visual side again, right?
+
+[156:22.00]In another five years.
+
+[156:23.00]And then suddenly, you know, people will start spending more energy there.
+
+[156:29.00]So I would like to actually go back to the modality, right?
+
+[156:32.00]So it was the visual modality and now the text modality.
+
+[156:35.00]And then maybe it's proteins, right?
+
+[156:38.00]Maybe it's some modality in the healthcare space, right?
+
+[156:42.00]Where sudden breakthroughs come about in the next several years.
+
+[156:46.00]And then you will suddenly see a whole bunch of people looking at gene sequences and so on and so forth, right?
+
+[156:52.00]And that interest will spike.
+
+[156:54.00]So I actually find it fascinating that we are growing as a community from where we started.
+
+[156:59.00]I don't really see it as a hype versus not hype.
+
+[157:03.00]I really see it as more of the ability of this modality coupled with the neural architecture.
+
+[157:10.00]You know, that will be the most effective and efficient at that point of time in attracting a lot of energy from the academic and industrial research community.
+
+[157:19.00]So we can only be better off because of it.
+
+[157:22.00]So I don't see it as a hype as much as an opportunity for learning more.
+
+[157:27.00]I will add this.
+
+[157:28.00]And I think it is a good segue for the next part of what you're going to talk about LSU, right?
+
+[157:33.00]Which is I get asked this question often, right?
+
+[157:36.00]Why did I move from NVIDIA?
+
+[157:37.00]Like who leaves NVIDIA?
+
+[157:39.00]And the fact of the matter is that really if you have that unfulfilled desire in you to solve the actual end problem,
+
+[157:46.00]you have to go towards one of these verticals, right?
+
+[157:49.00]With the whether it's health care, whether it is financial services and you have to be embedded in an organization that actually has a culture for fostering big tech like a fang like work ethic.
+
+[158:02.00]Whether it's having the data, right?
+
+[158:04.00]Already being in the cloud, having the roots in data driven and machine learning.
+
+[158:09.00]So you want to be in a place which allows for that kind of creativity so that you can actually take Gen AI to its logical next step, right?
+
+[158:19.00]We're just solving problems so that we change the financial services domain for the good, right?
+
+[158:24.00]There's benefit to the customers.
+
+[158:26.00]And so when you do that, though, you have to be mindful.
+
+[158:30.00]If you are working in places where you are building recommendation systems for what to buy or what to watch.
+
+[158:39.00]There are a lot of ways in which you may not have the best possible answer, the most accurate answer and you are still fine.
+
+[158:46.00]But when you are in one of these domains that matter, health care, financial services, right?
+
+[158:51.00]You really are solving a harder problem because the tolerance for error of your end customer is going to be much,
+
+[158:59.00]much,much lower.And what that does is it forces the research and development to go in specific directions.
+
+[159:06.00]You may not be able to just take what's out there in open source and use it as is, right?
+
+[159:12.00]You may have to actually innovate beyond what's in the state of the art publication to make it work for this kind of domain, right?
+
+[159:21.00]And so it takes a special class of applied researchers.
+
+[159:25.00]It takes a special class of machine learning engineers, data scientists that have the will.
+
+[159:31.00]The last mile is hard and you have to have that will to solve that problem.
+
+[159:36.00]And so in some cases what ends up happening is that these roles and the work we will do may be harder than what you end up doing in a generic setting or at a platform level only or in a use case which actually doesn't have this kind of extremely low tolerance for error.
+
+[159:54.00]So that becomes, you know, one of the main motivating functions for people to join us, right?
+
+[160:01.00]Who really want to actually take these kind of really hard challenges.
+
+[160:05.00]I was going through some of your team papers just from 2023, some of them in Europe, some at ICML.
+
+[160:10.00]You've done a lot of work like class imbalance on data sets.
+
+[160:13.00]So how do you make performing neural networks when like the data is kind of in balance in certain domains.
+
+[160:18.00]You're doing some research on transformer graphs versus like graph neural networks.
+
+[160:23.00]So there's just kind of like a lot in there.
+
+[160:25.00]Any favorite project that you want to shut out any interesting paper that you saw come out of your team.
+
+[160:30.00]You know, I could choose one, but then it would basically end up being, you know, not fair to the others.
+
+[160:35.00]So the only on that, you know,
+
+[160:39.00]It's like the what's your favorite child?
+
+[160:42.00]There's no way to say one versus the other.
+
+[160:46.00]But there are a lot of interesting trajectories of exploration here, right?
+
+[160:51.00]We are trying to look at the superset of what all these things are trying to do.
+
+[160:55.00]We are trying to look at data sets that are very unique to our domain.
+
+[160:58.00]We are trying to look at attributes that are important to us like time series, tabular, right?
+
+[161:04.00]These are things that are important, very important.
+
+[161:06.00]We are looking at the imbalance problems, right?
+
+[161:09.00]There are other things that we are looking at.
+
+[161:11.00]And so I wouldn't want to just highlight one.
+
+[161:15.00]There are a bunch of things that are of great interest to us from our perspective.
+
+[161:20.00]Let me just not pick one.
+
+[161:23.00]That makes sense.
+
+[161:25.00]And yeah, just to wrap, we always like to ask our guests, who are you looking for?
+
+[161:29.00]You know, we got a lot of AI engineers, researchers in the audience.
+
+[161:33.00]Like, who are the type of people that are going to have a good time working with you?
+
+[161:36.00]And what are some of the open roles that you have?
+
+[161:38.00]So let me start with the roles.
+
+[161:40.00]And then, you know, we can talk about the kind of people who are going to have a great time with us, right?
+
+[161:44.00]We have a number of open roles and they span the entire spectrum from applied researchers to data scientists, to machine learning engineers, AI engineers, right?
+
+[161:55.00]People who are listening to you right now that we really weren't interested in what we are doing.
+
+[162:01.00]We have roles at various levels of seniority.
+
+[162:03.00]We have individual contributor roles.
+
+[162:05.00]That's one of the things that we have really, really double clicked on in the last several months since I've joined.
+
+[162:10.00]And that was a thrust also before I joined, but especially in the applied research field, right?
+
+[162:15.00]We are looking for, you know, what would be the equivalent of, you know, principal research scientists, right?
+
+[162:20.00]Distinguished research scientists, individual contributors.
+
+[162:23.00]We are looking for fresh graduates, right?
+
+[162:26.00]Masters and PhD students, you know, just out of school, all the way to people who have, you know, a decade or more of experience.
+
+[162:33.00]So, really there is a huge spectrum of talent that we want to onboard and we are looking for.
+
+[162:41.00]Now, what is the characteristic of someone who will really come and enjoy here, right?
+
+[162:46.00]I think I started giving that to you earlier when I said people who are interested in actually solving the problem.
+
+[162:52.00]That last mile is hard, so we really want people to know that they have to have the stomach for that last mile.
+
+[162:58.00]People who are good at understanding what the product requirements are, they sometimes make the best applied researchers, right?
+
+[163:07.00]Because they can understand what our needs are and formulate problems, change architectures.
+
+[163:15.00]We are looking for people who are very good at dealing with ambiguity.
+
+[163:19.00]A lot of what we are doing, a lot of the advancements we are seeing, they are empirical data driven, right?
+
+[163:25.00]And so, we need to have that as a skill.
+
+[163:29.00]How do you actually build algorithms, systems and solutions that you can show improvement on our data?
+
+[163:37.00]It's great if somebody is doing some architecture or some network is doing great on some of the benchmarks, right?
+
+[163:45.00]That get published outside, but it's equally important.
+
+[163:48.00]It's actually more important that we are able to show to ourselves that these algorithms, architectures do well on our own internal data sets and benchmarks and evaluation, right?
+
+[164:00.00]And so, to that point, how do we actually come up with meaningful evaluation frameworks and methodologies, right?
+
+[164:05.00]That itself is something of great interest to us.
+
+[164:09.00]So, people in general who like to solve problems, problem solvers, people who like to go from theoretical to practical and people who are not afraid that the empirical data may not bear out their best idea and they have to go and rethink it, right?
+
+[164:25.00]And redo it.Those are the kind of people that will really do well here.
+
+[164:28.00]Yeah, it was great to hear your story and not a lot of people in the world, I think, that have the same depth of experience in AI, so this was awesome.
+
+[164:35.00]Yeah, a lot of people that you're looking for are also like the kind of AI engineers that we want to encourage.
+
+[164:40.00]So, yeah, thanks for sharing your thoughts.
+
+[164:42.00]Thank you, Sean.
+
+[164:43.00](音乐)
+
+[165:07.00](音乐)
+
+[165:09.00]字幕by索兰娅
+
+[165:11.00]我看你不太好看
+
+[165:12.32]我看你不太好看
+
diff --git a/content/post/Latent Space/Latent-Space-Presenting-the-AI-Engineer-World's-Fair-—-with-Sam-Schillace,-Deputy-CTO-of-Microsoft.lrc b/content/post/Latent Space/Latent-Space-Presenting-the-AI-Engineer-World's-Fair-—-with-Sam-Schillace,-Deputy-CTO-of-Microsoft.lrc
new file mode 100644
index 0000000..e69de29
diff --git a/content/post/Latent Space/Latent-Space-Supervise-the-Process-of-AI-Research-—-with-Jungwon-Byun-and-Andreas-Stuhlmüller-of-Elicit.lrc b/content/post/Latent Space/Latent-Space-Supervise-the-Process-of-AI-Research-—-with-Jungwon-Byun-and-Andreas-Stuhlmüller-of-Elicit.lrc
new file mode 100644
index 0000000..ab3cfac
--- /dev/null
+++ b/content/post/Latent Space/Latent-Space-Supervise-the-Process-of-AI-Research-—-with-Jungwon-Byun-and-Andreas-Stuhlmüller-of-Elicit.lrc
@@ -0,0 +1,2882 @@
+[by:whisper.cpp]
+[00:00.00](音乐)
+[00:06.00]大家好 欢迎到Lit and Space Podcast
+[00:08.40]这是Alessio 和CTO的计划人士 和我参加的计划人士
+[00:11.80]我参加了麦克欧的计划 专门 邱小雅
+[00:15.00]今天我们回到工作室了
+[00:17.20]和Andreas 和 卢安 欢迎你
+[00:20.20]谢谢 太好了 谢谢你
+[00:22.40]我会介绍你分别的 但也希望你会更多学习
+[00:27.40]So Andreas it looks like you started Alicit first and joined later
+[00:32.40]That's right
+[00:33.00]For all intents and purposes, the illicit and also the odd that existed before then were very different from what I started
+[00:39.60]So I think it's like fair to say that you co-funded it
+[00:42.60]Got it
+[00:43.00]And Joanne you're a co-founder and COO of Alicit now
+[00:46.20]Yeah that's right
+[00:47.00]So there's a little bit of a history to this
+[00:48.80]I'm not super aware of like the sort of journey
+[00:51.80]I was aware of odd and illicit as sort of a non-profit type situation
+[00:55.80]And recently you turned into like a public benefit corporation
+[00:59.40]So yeah maybe if you want you could take us through that journey of finding the problem
+[01:04.00]You know obviously you're working together now
+[01:06.20]So like how do you get together to decide to leave your startup career to join him
+[01:11.20]Yeah it's truly a very long journey
+[01:12.80]I guess truly it kind of started in Germany when I was born
+[01:17.20]So even as a kid I was always interested in AI
+[01:20.00]Like I kind of went to the library
+[01:21.40]There were books about how to write programs in QBasic
+[01:24.20]And like some of them talked about how to implement chatbots
+[01:27.20]And to be clear
+[01:28.80]He grew up in like a tiny village on the outskirts of Munich called Dinkelscherbin
+[01:33.20]Where it's like a very very idyllic German village
+[01:36.20]Yeah important to the story
+[01:38.40]So basically the main thing is I've kind of always been thinking about AI my entire life
+[01:42.80]And been thinking about at some point this is going to be a huge deal
+[01:46.00]It's going to be transformative
+[01:47.00]How can I work on it
+[01:48.20]And was thinking about it from when I was a teenager
+[01:51.60]After high school did a year where I started a startup with the intention to become rich
+[01:56.80]And then once I'm rich I can affect the trajectory of AI
+[02:00.40]Did not become rich
+[02:01.40]Decided to go back to college
+[02:03.00]And study cognitive science there
+[02:05.00]Which was like the closest thing I could find at the time to AI
+[02:08.00]In the last year of college moved to the US to do a PhD at MIT
+[02:12.60]Working on broadly kind of new programming languages for AI
+[02:15.00]Because it kind of seemed like the existing languages were not great at expressing
+[02:19.60]World models and learning world models during Bayesian inference
+[02:22.60]Was obviously thinking about ultimately the goal is to actually build tools that help people reason more clearly
+[02:27.60]Ask and answer better questions and make better decisions
+[02:31.60]But for a long time it seemed like the technology to put reasoning in machines just wasn't there
+[02:35.60]Initially at the end of my postdoc at Stanford was thinking about well what to do
+[02:39.60]I think the standard path is you become an academic and do research
+[02:43.60]But it's really hard to actually build interesting tools as an academic
+[02:48.60]You can't really hire great engineers
+[02:50.60]Everything is kind of on a paper-to-paper timeline
+[02:53.60]And so I was like well maybe I should start a startup
+[02:56.60]Pursuit that for a little bit
+[02:57.60]But it seemed like it was too early because you could have tried to do an AI startup
+[03:01.60]But probably would not have been this kind of AI startup we're seeing now
+[03:05.60]So then decided to just start a non-profit research lab
+[03:08.60]That's going to do research for a while until we better figure out how to do thinking in machines
+[03:13.60]And that was odd
+[03:14.60]And then over time it became clear how to actually build actual tools for reasoning
+[03:19.60]Then only over time we developed a better way to
+[03:23.60]I'll let you fill in some of the details here
+[03:25.60]Yeah so I guess my story maybe starts around 2015
+[03:29.60]I kind of wanted to be a founder for a long time
+[03:31.60]And I wanted to work on an idea that stood the test of time for me
+[03:34.60]Like an idea that stuck with me for a long time
+[03:37.60]And starting in 2015
+[03:38.60]Actually originally I became interested in AI based tools from the perspective of mental health
+[03:43.60]So there are a bunch of people around me who are really struggling
+[03:45.60]One really close friend in particular is really struggling with mental health
+[03:48.60]And didn't have any support
+[03:50.60]And it didn't feel like there was anything before kind of like getting hospitalized
+[03:54.60]That could just help her
+[03:56.60]And so luckily she came and stayed with me for a while
+[03:58.60]And we were just able to talk through some things
+[04:00.60]But it seemed like you know lots of people might not have that resource
+[04:04.60]And something maybe AI enabled could be much more scalable
+[04:07.60]I didn't feel ready to start a company then
+[04:10.60]That's 2015
+[04:11.60]And I also didn't feel like the technology was ready
+[04:13.60]So then I went into fintech
+[04:15.60]And like kind of learned how to do the tech thing
+[04:17.60]And then in 2019
+[04:18.60]I felt like it was time for me to just jump in
+[04:21.60]And build something on my own
+[04:22.60]I really wanted to create
+[04:24.60]And at the time I looked around at tech
+[04:26.60]And felt like not super inspired by the options
+[04:28.60]I just I didn't want to have a tech career ladder
+[04:31.60]Or like I didn't want to like climb the career ladder
+[04:33.60]There are two kind of interesting technologies at the time
+[04:35.60]There was AI and there was crypto
+[04:37.60]And I was like well the AI people seemed like a little bit more nice
+[04:41.60]And maybe like slightly more trustworthy
+[04:44.60]Both super exciting
+[04:45.60]But through my bed and on the AI side
+[04:47.60]And then I got connected to Andreas
+[04:49.60]And actually the way he was thinking about
+[04:51.60]Pursuing the research agenda at AUT
+[04:53.60]Was really compatible with what I had envisioned
+[04:56.60]For an ideal AI product
+[04:58.60]Something that helps kind of take down
+[05:00.60]Really complex thinking
+[05:01.60]Overwhelming thoughts
+[05:02.60]And breaks it down into small pieces
+[05:04.60]And then this kind of mission
+[05:05.60]We need AI to help us figure out
+[05:07.60]What we ought to do
+[05:08.60]It was really inspiring, right?
+[05:10.60]Yeah, because I think it was clear
+[05:12.60]That we were building the most powerful
+[05:14.60]Optimizer of our time
+[05:16.60]But as a society
+[05:17.60]We hadn't figured out
+[05:18.60]How to direct that optimization potential
+[05:21.60]And if you kind of direct tremendous
+[05:23.60]Optimization potential at the wrong thing
+[05:25.60]That's really disastrous
+[05:26.60]So the goal of AUT was
+[05:28.60]Make sure that if we build
+[05:29.60]The most transformative technology of our lifetime
+[05:31.60]It can be used for something really impactful
+[05:34.60]And that's really good reasoning
+[05:35.60]Like not just generating ads
+[05:37.60]My background was in marketing
+[05:38.60]But like so
+[05:39.60]It's like I want to do
+[05:40.60]More than generate ads with this
+[05:42.60]And also if these AI systems
+[05:44.60]Get to be super intelligent enough
+[05:46.60]That they are doing this
+[05:47.60]Really complex reasoning
+[05:48.60]That we can trust them
+[05:49.60]That they are aligned with us
+[05:51.60]And we have ways of evaluating
+[05:53.60]That they are doing the right thing
+[05:54.60]So that's what AUT did
+[05:55.60]We did a lot of experiments
+[05:56.60]You know, like Andreas said
+[05:57.60]Before foundation models
+[05:59.60]Really like took off
+[06:00.60]A lot of the issues we were seeing
+[06:01.60]Were more in reinforcement learning
+[06:03.60]But we saw a future
+[06:04.60]Where AI would be able to do
+[06:06.60]More kind of logical reasoning
+[06:08.60]Not just kind of extrapolate
+[06:09.60]From numerical trends
+[06:10.60]We actually kind of
+[06:11.60]Set up experiments with people
+[06:13.60]Where kind of people stood in
+[06:14.60]As super intelligent systems
+[06:16.60]And we effectively gave them
+[06:17.60]Context windows
+[06:18.60]So they would have to
+[06:19.60]Like read a bunch of text
+[06:20.60]And one person would get less text
+[06:23.60]And one person would get all the text
+[06:24.60]And the person with less text
+[06:26.60]Would have to evaluate the work
+[06:28.60]Of the person who could read much more
+[06:30.60]So like in the world
+[06:31.60]We were basically simulating
+[06:32.60]Like in, you know, 2018-2019
+[06:34.60]A world where an AI system
+[06:36.60]Could read significantly more than you
+[06:38.60]And you as the person
+[06:39.60]Who couldn't read that much
+[06:40.60]Had to evaluate the work
+[06:41.60]Of the AI system
+[06:42.60]So there's a lot of the work we did
+[06:44.60]And from that we kind of
+[06:45.60]Iterated on the idea
+[06:46.60]Of breaking complex tasks down
+[06:47.60]Into smaller tasks
+[06:48.60]Like complex tasks
+[06:49.60]Like open-ended reasoning
+[06:51.60]Logical reasoning
+[06:52.60]Into smaller tasks
+[06:53.60]So that it's easier
+[06:54.60]To train AI systems on them
+[06:55.60]And also so that it's easier
+[06:57.60]To evaluate the work of the AI system
+[06:59.60]When it's done
+[07:00.60]And then also kind of
+[07:01.60]We really pioneered this idea
+[07:02.60]The importance of supervising
+[07:03.60]The process of AI systems
+[07:05.60]Not just the outcomes
+[07:06.60]And so a big part
+[07:07.60]Of how elicit is built
+[07:08.60]Is we're very intentional
+[07:10.60]About not just throwing
+[07:11.60]A ton of data into a model
+[07:13.60]And training it
+[07:14.60]And then saying cool
+[07:15.60]Here's like scientific output
+[07:16.60]Like that's not at all
+[07:17.60]What we do
+[07:18.60]Our approach is very much
+[07:19.60]Like what are the steps
+[07:20.60]That an expert human does
+[07:21.60]Or what is like an ideal process
+[07:23.60]As granularly as possible
+[07:25.60]Let's break that down
+[07:26.60]And then train AI systems
+[07:27.60]To perform each of those steps
+[07:29.60]Very robustly
+[07:30.60]When you train like that
+[07:32.60]From the start
+[07:33.60]After the fact
+[07:34.60]It's much easier to evaluate
+[07:35.60]It's much easier to troubleshoot
+[07:36.60]At each point
+[07:37.60]Like where did something break down
+[07:38.60]So yeah
+[07:39.60]We were working on those experiments
+[07:40.60]For a while
+[07:41.60]And then at the start of 2021
+[07:43.60]Decided to build a product
+[07:44.60]Do you mind if I
+[07:45.60]Because I think you're about
+[07:46.60]To go into more modern
+[07:47.60]Hot and elicit
+[07:49.60]And I just wanted to
+[07:50.60]Because I think a lot of people
+[07:51.60]Are in where you were
+[07:53.60]Like sort of 2018-19
+[07:55.60]Where you chose a partner
+[07:57.60]To work with
+[07:58.60]And you didn't know him
+[07:59.60]Yeah yeah
+[08:00.60]You were just kind of cold introduced
+[08:01.60]Yep
+[08:02.60]A lot of people are cold introduced
+[08:03.60]I've been cold introduced
+[08:04.60]To tons of people
+[08:05.60]And I never work with them
+[08:06.60]I assume you had a lot
+[08:07.60]A lot of other options
+[08:08.60]Like how do you advise
+[08:09.60]People to make those choices
+[08:10.60]We were not totally cold introduced
+[08:12.60]So one of our closest friends
+[08:13.60]Introduced us
+[08:14.60]And then Andreas had written a lot
+[08:16.60]On the website
+[08:17.60]A lot of blog posts
+[08:18.60]A lot of publications
+[08:19.60]And I just read it
+[08:20.60]And I was like, wow
+[08:21.60]This sounds like my writing
+[08:22.60]And even other people
+[08:23.60]Some of my closest friends
+[08:24.60]I asked for advice from
+[08:25.60]They were like, oh
+[08:26.60]This sounds like your writing
+[08:28.60]But I think
+[08:29.60]I also had some kind of
+[08:30.60]Like things I was looking for
+[08:31.60]I wanted someone
+[08:32.60]With a complimentary skill set
+[08:33.60]I want someone
+[08:34.60]Who was very values aligned
+[08:36.60]And yeah
+[08:37.60]That was all a good fit
+[08:38.60]We also did a pretty
+[08:40.60]Lengthy mutual evaluation process
+[08:42.60]Where we had a Google doc
+[08:43.60]Where we had all kinds of questions
+[08:45.60]For each other
+[08:46.60]And I think it ended up being
+[08:48.60]Round 50 pages or so
+[08:49.60]Off like various questions
+[08:51.60]Was it the YC list?
+[08:53.60]There's some lists going around
+[08:54.60]For co-founder questions
+[08:55.60]No, we just made our own
+[08:57.60]But I guess it's probably related
+[08:59.60]And that you asked yourself
+[09:00.60]What are the values you care about
+[09:01.60]How would you approach
+[09:02.60]Various decisions
+[09:03.60]And things like that
+[09:04.60]I shared like all of my past
+[09:05.60]Performance reviews
+[09:06.60]Yeah
+[09:07.60]Yeah
+[09:08.60]And he never had any
+[09:09.60]No
+[09:10.60]Yeah, sorry
+[09:14.60]I just had to
+[09:15.60]A lot of people are going through
+[09:16.60]That phase
+[09:17.60]And you kind of skipped over it
+[09:18.60]I was like, no, no, no
+[09:19.60]There's like an interesting story
+[09:20.60]Yeah
+[09:21.60]Before we jump into what it is
+[09:22.60]It is today
+[09:23.60]The history is a bit
+[09:24.60]Cutter intuitive
+[09:25.60]So you start
+[09:26.60]Now, oh, if we had
+[09:27.60]A super powerful model
+[09:29.60]How we align it
+[09:30.60]How we use it
+[09:31.60]But then you were actually
+[09:32.60]Like, well, let's just build
+[09:33.60]The product so that people
+[09:34.60]Can actually leverage it
+[09:35.60]And I think there are
+[09:36.60]A lot of folks today
+[09:37.60]That are now back
+[09:38.60]To where you were
+[09:39.60]Maybe five years ago
+[09:40.60]They're like, oh, what if
+[09:41.60]This happens rather than
+[09:42.60]Focusing on actually building
+[09:43.60]Something useful with it
+[09:45.60]What click for you
+[09:46.60]To like move into a list
+[09:47.60]And then we can cover
+[09:48.60]That story too
+[09:49.60]I think in many ways
+[09:50.60]The approach is still the same
+[09:51.60]Because the way we're
+[09:52.60]Building a list is not
+[09:54.60]Let's train a foundation model
+[09:55.60]To do more stuff
+[09:56.60]It's like
+[09:57.60]Let's build a scaffolding
+[09:58.60]Such that we can
+[09:59.60]Deploy powerful models
+[10:00.60]To good ends
+[10:01.60]I think it's different
+[10:02.60]Now in that
+[10:03.60]We actually have
+[10:04.60]Like some of the models to plug in
+[10:05.60]But if in 2017
+[10:06.60]We had had the models
+[10:08.60]We could have run
+[10:09.60]The same experiments
+[10:10.60]We did run with humans
+[10:11.60]Back then
+[10:12.60]Just with models
+[10:13.60]And so in many ways
+[10:14.60]Our philosophy is always
+[10:15.60]Let's think add to the future
+[10:16.60]What models are going to exist
+[10:17.60]In one, two years
+[10:19.60]Or longer
+[10:20.60]And how can we make it
+[10:22.60]So that they can
+[10:23.60]Actually be deployed
+[10:24.60]In many transparent
+[10:25.60]Controllable ways
+[10:26.60]Yeah, I think
+[10:27.60]Motivationally we both
+[10:28.60]Are kind of
+[10:29.60]Product people at heart
+[10:30.60]The research was
+[10:31.60]Really important
+[10:32.60]And it didn't
+[10:33.60]Make sense to build
+[10:34.60]A product at that time
+[10:35.60]But at the end of the day
+[10:36.60]The thing that always
+[10:37.60]Motivated us is
+[10:38.60]Imagining a world
+[10:39.60]Where high quality
+[10:40.60]Reasoning is really abundant
+[10:41.60]And AI is a technology
+[10:43.60]That's going to get us there
+[10:44.60]And there's a way
+[10:45.60]To guide that technology
+[10:46.60]With research
+[10:47.60]But you can have
+[10:48.60]A more direct effect
+[10:49.60]Through product
+[10:50.60]Because with research
+[10:51.60]You publish the research
+[10:52.60]And someone else
+[10:53.60]Product felt
+[10:54.60]Like a more direct path
+[10:55.60]And we wanted to
+[10:56.60]Concretely have an impact
+[10:57.60]On people's lives
+[10:58.60]Yeah, I think
+[10:59.60]The kind of personally
+[11:00.60]The motivation was
+[11:01.60]We want to build
+[11:02.60]For people
+[11:03.60]Yep, and then
+[11:04.60]Just to recap as well
+[11:05.60]Like the models
+[11:06.60]You're using back then were
+[11:07.60]Like, I don't know
+[11:08.60]With the like BERT type stuff
+[11:10.60]Or T5 or
+[11:12.60]I don't know what time frame
+[11:13.60]We're talking about here
+[11:14.60]I guess to be clear
+[11:15.60]At the very beginning
+[11:16.60]We had humans do the work
+[11:18.60]And then I think
+[11:19.60]The first models
+[11:20.60]That kind of makes sense
+[11:21.60]Or GPT-2
+[11:22.60]And TNLG
+[11:23.60]And early generative models
+[11:25.60]We do
+[11:26.60]We also use
+[11:27.60]Like T5 based models
+[11:28.60]Even now
+[11:29.60]Started with GPT-2
+[11:30.60]Yeah, cool
+[11:31.60]I'm just kind of curious about
+[11:32.60]Like how do you
+[11:33.60]Start so early
+[11:34.60]Like now it's obvious
+[11:35.60]Where to start
+[11:36.60]But back then it wasn't
+[11:37.60]Yeah, I used to
+[11:38.60]Nag Andreas a lot
+[11:39.60]I was like
+[11:40.60]Why are you
+[11:41.60]Talking to this?
+[11:42.60]I don't know
+[11:43.60]I felt like
+[11:44.60]GPT-2 is like
+[11:45.60]Clearly can't do anything
+[11:46.60]And I was like
+[11:47.60]Andreas, you're wasting your time
+[11:48.60]Like playing with this toy
+[11:49.60]But yeah, it was right
+[11:50.60]So what's the history
+[11:51.60]Of what Elisit
+[11:52.60]Actually does as a product
+[11:53.60]You recently announced that
+[11:55.60]After four months
+[11:56.60]You get to a million of revenue
+[11:57.60]Obviously a lot of people
+[11:58.60]Use it, get a lot of value
+[11:59.60]But it would
+[12:00.60]Initially kind of like
+[12:01.60]Structured data
+[12:02.60]Instruction from papers
+[12:03.60]Then you had
+[12:04.60]Kind of like concept grouping
+[12:05.60]And today it's maybe
+[12:06.60]Like a more full stack
+[12:07.60]Research enabler
+[12:09.60]Kind of like paper
+[12:10.60]Understand their platform
+[12:11.60]What's the definitive definition
+[12:13.60]Of what Elisit is
+[12:14.60]And how did you get here
+[12:15.60]Yeah, we say Elisit
+[12:16.60]As an AI research assistant
+[12:17.60]I think it will continue
+[12:18.60]To evolve
+[12:19.60]You know, we're so excited
+[12:20.60]About building and research
+[12:21.60]Because there's just so much space
+[12:22.60]I think the current phase
+[12:23.60]We're in right now
+[12:24.60]We talk about it
+[12:25.60]As really trying to make Elisit
+[12:27.60]The best place to understand
+[12:28.60]What is known
+[12:29.60]So it's all a lot about like
+[12:31.60]Literature summarization
+[12:32.60]There's a ton of information
+[12:33.60]That the world already knows
+[12:34.60]It's really hard to navigate
+[12:35.60]Hard to make it relevant
+[12:37.60]So a lot of it is around
+[12:38.60]Document discovery
+[12:39.60]And processing and analysis
+[12:41.60]I really kind of want to
+[12:42.60]Import some of the incredible
+[12:44.60]Productivity improvements
+[12:45.60]We've seen in software engineering
+[12:47.60]And data science
+[12:48.60]And into research
+[12:49.60]So it's like
+[12:50.60]How can we make researchers
+[12:51.60]Like data scientists of text
+[12:53.60]That's why we're launching
+[12:54.60]This new set of features
+[12:55.60]Called notebooks
+[12:56.60]It's very much inspired
+[12:57.60]By computational notebooks
+[12:58.60]Like Jupyter notebooks
+[12:59.60]Deep note or colab
+[13:01.60]Because they're so powerful
+[13:02.60]And so flexible
+[13:03.60]And ultimately
+[13:04.60]When people are trying
+[13:05.60]To get to an answer
+[13:07.60]Or understand insight
+[13:08.60]They're kind of like
+[13:09.60]Manipulating evidence
+[13:10.60]And information
+[13:11.60]Today that's all packaged
+[13:12.60]In PDFs
+[13:13.60]Which are super brittle
+[13:14.60]But with language models
+[13:15.60]We can decompose
+[13:16.60]These PDFs
+[13:17.60]And then we can
+[13:18.60]Interly claims
+[13:19.60]And evidence
+[13:20.60]And insights
+[13:21.60]And then let researchers
+[13:22.60]Mash them up together
+[13:23.60]Remix them
+[13:24.60]And analyze them together
+[13:25.60]So yeah
+[13:26.60]I would say quite simply
+[13:27.60]Overall listed
+[13:28.60]As an AI research assistant
+[13:29.60]Right now we're focused
+[13:30.60]On text based workflows
+[13:32.60]But long term
+[13:33.60]Really want to kind of
+[13:34.60]Go further and further
+[13:35.60]Into reasoning
+[13:36.60]And decision making
+[13:37.60]And when you say
+[13:38.60]AI research assistant
+[13:39.60]This is kind of
+[13:40.60]Matter research
+[13:41.60]So researchers
+[13:42.60]Use a list
+[13:43.60]As a research assistant
+[13:44.60]It's not a generic
+[13:45.60]You can research
+[13:46.60]Or it could be
+[13:47.60]But what are people
+[13:48.60]Using it for today
+[13:49.60]So specifically in science
+[13:51.60]A lot of people use
+[13:52.60]Human research assistants
+[13:53.60]To do things
+[13:54.60]You tell your grad student
+[13:56.60]Here are a couple of papers
+[13:57.60]Can you look at
+[13:58.60]All of these
+[13:59.60]See which of these
+[14:00.60]Have kind of sufficiently
+[14:01.60]Large populations
+[14:02.60]And actually study
+[14:03.60]The disease that
+[14:04.60]I'm interested in
+[14:05.60]And then write out
+[14:06.60]Like what are the experiments
+[14:07.60]They did
+[14:08.60]What are the interventions
+[14:09.60]They did
+[14:10.60]What are the outcomes
+[14:11.60]And kind of organize
+[14:12.60]That for me
+[14:13.60]And the first phase
+[14:14.60]Of understanding
+[14:15.60]This is on
+[14:16.60]Automating that work flow
+[14:17.60]Because a lot of that work
+[14:18.60]Is pretty road work
+[14:19.60]I think it's not
+[14:20.60]The kind of thing
+[14:21.60]That we need humans to do
+[14:22.60]Language models can do it
+[14:23.60]And then if
+[14:24.60]Language models can do it
+[14:25.60]That you can obviously
+[14:26.60]Scale it up
+[14:27.60]Much more than a grad student
+[14:28.60]Or undergrad
+[14:29.60]Research assistant
+[14:30.60]Would be able to do
+[14:31.60]Yeah the use cases
+[14:32.60]Are pretty broad
+[14:33.60]So we do have
+[14:34.60]A very large
+[14:35.60]Percent of our users
+[14:36.60]Are just using it personally
+[14:37.60]Or for a mix
+[14:38.60]Of personal and professional
+[14:39.60]Things
+[14:40.60]People who care a lot
+[14:41.60]About health
+[14:42.60]Or biohacking
+[14:43.60]Or parents
+[14:44.60]Or disease
+[14:45.60]Or want to understand
+[14:46.60]The literature directly
+[14:47.60]So there is an
+[14:48.60]Individual consumer use
+[14:49.60]Case
+[14:50.60]We're most focused
+[14:51.60]On the power users
+[14:52.60]So that's where
+[14:53.60]We're really excited
+[14:54.60]To build
+[14:55.60]So Lisit was
+[14:56.60]Very much inspired
+[14:57.60]By this work flow
+[14:58.60]In literature
+[14:59.60]Called systematic reviews
+[15:00.60]Or meta analysis
+[15:01.60]Which is basically
+[15:02.60]The human state
+[15:03.60]Of the art
+[15:04.60]For summarizing
+[15:05.60]Scientific literature
+[15:06.60]It typically involves
+[15:07.60]Like five people
+[15:08.60]Working together
+[15:09.60]For over a year
+[15:10.60]And they kind of
+[15:11.60]First start by trying
+[15:12.60]To find the maximally
+[15:13.60]First possible
+[15:14.60]So it's like
+[15:15.60]Ten thousand papers
+[15:16.60]And they kind of
+[15:17.60]Systematically narrow
+[15:18.60]That down to like
+[15:19.60]Hundreds or fifty
+[15:20.60]Extract key details
+[15:22.60]From every single paper
+[15:23.60]Usually have two people
+[15:24.60]Doing it
+[15:25.60]Like a third person
+[15:26.60]Reviewing it
+[15:27.60]So it's like
+[15:28.60]Incredibly laborious
+[15:29.60]Time-consuming process
+[15:30.60]But you see it
+[15:31.60]In every single domain
+[15:32.60]So in science
+[15:33.60]In machine learning
+[15:34.60]In policy
+[15:35.60]Because it's so structured
+[15:36.60]And designed to be reproducible
+[15:37.60]It's really amenable
+[15:38.60]To automation
+[15:39.60]So it's kind of
+[15:40.60]The workflow that we want
+[15:41.60]To automate first
+[15:42.60]It's accessible
+[15:43.60]For any question
+[15:44.60]And make
+[15:45.60]You know kind of
+[15:46.60]These really robust
+[15:47.60]Living summaries of science
+[15:48.60]So yeah
+[15:48.60]It's one of the
+[15:49.60]Workflows that we're
+[15:50.60]Starting with
+[15:51.60]Our previous guest
+[15:52.60]Mike Conover
+[15:53.60]He's building a new
+[15:54.60]Company got BrightWave
+[15:55.60]Which is an AI
+[15:56.60]Research assistant
+[15:57.60]For financial research
+[15:58.60]How do you see
+[15:59.60]The future of these tools
+[16:00.60]Like does everything
+[16:01.60]Converged
+[16:02.60]Like a God researcher
+[16:03.60]Assisted
+[16:04.60]Or is every domain
+[16:05.60]Gone to have its own thing
+[16:06.60]I think that's a good
+[16:07.60]And mostly open question
+[16:09.60]I do think there are
+[16:10.60]Some differences
+[16:11.60]Data analysis
+[16:12.60]And other research
+[16:13.60]Is more high-level
+[16:15.60]Cross-domain thinking
+[16:16.60]And we definitely
+[16:17.60]Want to contribute to
+[16:18.60]The broad
+[16:19.60]Generalist reasoning type
+[16:20.60]Space like if
+[16:21.60]Researchers are
+[16:22.60]Making discoveries often
+[16:23.60]It's like hey
+[16:24.60]This thing in biology
+[16:25.60]Is actually analogous to
+[16:26.60]Like these equations
+[16:27.60]In economics or something
+[16:28.60]And that's just
+[16:29.60]Fundamentally a thing
+[16:30.60]That where you need
+[16:31.60]To reason across domains
+[16:32.60]At least within research
+[16:33.60]I think there will be
+[16:34.60]Like one best platform
+[16:36.60]More or less
+[16:37.60]For this type of
+[16:38.60]Generalist research
+[16:39.60]I think there may still be
+[16:40.60]Tools like for genomics
+[16:41.60]Like particular types
+[16:42.60]Of modules
+[16:43.60]Of genes
+[16:44.60]And proteins
+[16:45.60]And whatnot
+[16:46.60]But for a lot of
+[16:47.60]The kind of high-level reasoning
+[16:48.60]That humans do
+[16:49.60]I think that is
+[16:50.60]A more open or type
+[16:51.60]All thing
+[16:52.60]I wanted to ask
+[16:53.60]A little bit deeper about
+[16:54.60]I guess the workflow
+[16:55.60]That you mentioned
+[16:56.60]I like that phrase
+[16:57.60]I see that
+[16:58.60]In your UI now
+[16:59.60]But that's
+[17:00.60]As it is today
+[17:01.60]And I think you were
+[17:02.60]About to tell us about
+[17:03.60]How it was in 2021
+[17:04.60]And how it maybe progressed
+[17:05.60]How has this workflow
+[17:06.60]Evolved over time
+[17:07.60]So the very first
+[17:08.60]Version of illicit
+[17:09.60]In the research assistant
+[17:10.60]It was a forecasting assistant
+[17:12.60]So we set out
+[17:13.60]And we were thinking about
+[17:14.60]What are some of the most
+[17:15.60]Impactful types of reasoning
+[17:16.60]That if we could scale up
+[17:17.60]AI would really transform
+[17:18.60]The world
+[17:19.60]And we actually started
+[17:20.60]With literature review
+[17:21.60]But we're like
+[17:22.60]So many people are going to build
+[17:23.60]Literature review tools
+[17:24.60]So let's start there
+[17:25.60]So then we focused
+[17:26.60]On geopolitical forecasting
+[17:27.60]So I don't know
+[17:28.60]If you're familiar
+[17:29.60]With like manifold or
+[17:30.60]Manifold markets
+[17:31.60]Yeah, that kind of stuff
+[17:32.60]Before manifold
+[17:33.60]Yeah, yeah
+[17:34.60]I'm not predicting relationships
+[17:35.60]We're predicting like
+[17:36.60]Is China going to invade Taiwan?
+[17:38.60]Yeah
+[17:39.60]That's in a relationship
+[17:40.60]Yeah, that's fair
+[17:41.60]Yeah, it's true
+[17:42.60]And then we worked
+[17:43.60]On that for a while
+[17:44.60]And then after GPT-3
+[17:45.60] came out
+[17:46.60]I think by that time
+[17:47.60]We realized that
+[17:48.60]Originally we were trying
+[17:49.60]To help people convert
+[17:50.60]Their beliefs into
+[17:51.60]Probability distributions
+[17:53.60]So take fuzzy beliefs
+[17:54.60]But like model them
+[17:55.60]More concretely
+[17:56.60]And then after a few months
+[17:57.60]Of iterating on that
+[17:58.60]Just realize the thing
+[17:59.60]That's blocking people
+[18:00.60]From making
+[18:01.60]Interesting predictions
+[18:02.60]About important events
+[18:03.60]In the world
+[18:04.60]Is less kind of
+[18:05.60]On the probabilistic side
+[18:06.60]And much more
+[18:07.60]Research side
+[18:08.60]And so that kind
+[18:09.60]Of combined with
+[18:10.60]The very generalist
+[18:11.60]Capabilities of GPT-3
+[18:12.60]Prompted us to
+[18:13.60]Make a more general
+[18:14.60]Research assistant
+[18:15.60]Then we spent
+[18:16.60]A few months iterating
+[18:17.60]On what even is
+[18:18.60]A research assistant
+[18:19.60]So we would embed
+[18:20.60]With different researchers
+[18:21.60]We built data labeling
+[18:23.60]Workflows in the beginning
+[18:24.60]Kind of right off the bat
+[18:25.60]We built ways to find
+[18:27.60]Experts in a field
+[18:29.60]And like ways to ask
+[18:30.60]Good research questions
+[18:31.60]We just kind of
+[18:32.60]Iterated through a lot
+[18:33.60]Of workflows and no one else
+[18:34.60]Was really building at this
+[18:35.60]Time and it was like
+[18:36.60]Let's do some prompt
+[18:37.60]Engineering and see
+[18:38.60]Like what is a task
+[18:39.60]That is at the
+[18:40.60]Intersection of what's
+[18:41.60]Technologically capable
+[18:42.60]And like important
+[18:43.60]For researchers
+[18:44.60]And we had like
+[18:45.60]A very nondescript
+[18:46.60]Landing page
+[18:47.60]It said nothing
+[18:48.60]But somehow people were
+[18:49.60]Signing up and we had
+[18:50.60]The sign of form
+[18:51.60]That was like
+[18:52.60]Why are you here
+[18:53.60]And everyone was like
+[18:54.60]I need help
+[18:55.60]With literature review
+[18:56.60]And we're like
+[18:57.60]A literature review
+[18:58.60]That sounds so hard
+[18:59.60]I don't even know
+[19:00.60]What that means
+[19:01.60]We don't want to work on it
+[19:02.60]But then eventually
+[19:03.60]We're like
+[19:04.60]Everyone is saying
+[19:05.60]Yeah
+[19:06.60]And we also kind of
+[19:07.60]Personally knew literature
+[19:08.60]Review was hard
+[19:09.60]And if you look at the graphs
+[19:10.60]For academic literature
+[19:11.60]Being published every
+[19:12.60]Single month you guys
+[19:13.60]Know this in machine learning
+[19:14.60]It's like up into the right
+[19:15.60]Like superhuman amounts
+[19:16.60]Of papers
+[19:17.60]So we're like
+[19:18.60]All right, let's just try it
+[19:19.60]I was really nervous
+[19:20.60]But Andres was like
+[19:21.60]This is kind of like
+[19:22.60]The right problem space
+[19:23.60]To jump into
+[19:24.60]Even if we don't
+[19:25.60]Know what we're doing
+[19:26.60]So my take was like
+[19:27.60]Fine
+[19:28.60]This feels really scary
+[19:29.60]But let's just launch
+[19:30.60]A feature every single week
+[19:31.60]And double our user
+[19:32.60]Numbers every month
+[19:33.60]And if we can do that
+[19:34.60]We will find something
+[19:35.60]I was worried about like
+[19:36.60]Getting lost
+[19:37.60]In the kind of academic white
+[19:38.60]Space
+[19:39.60]So the very first version
+[19:40.60]Was actually a weekend prototype
+[19:41.60]That Andres made
+[19:42.60]Do you want to explain
+[19:43.60]How that worked
+[19:44.60]I mostly remember
+[19:45.60]That it was really bad
+[19:47.60]So the thing I remember
+[19:48.60]Is you entered a question
+[19:50.60]And it would give you back
+[19:51.60]A list of claims
+[19:52.60]So your question could be
+[19:53.60]I don't know
+[19:54.60]How does creatine effect cognition
+[19:56.60]And it would give you back
+[19:57.60]Some claims
+[19:58.60]That are to some extent
+[19:59.60]Based on papers
+[20:00.60]But they were often irrelevant
+[20:02.60]The papers were often
+[20:03.60]And so we ended up
+[20:04.60]Soon just printing out
+[20:05.60]A bunch of examples
+[20:06.60]Of results
+[20:07.60]And putting them up
+[20:08.60]On the wall
+[20:09.60]So that we would
+[20:10.60]Kind of feel the constant
+[20:11.60]Shame of having
+[20:12.60]Such a bad product
+[20:13.60]And would be incentivized
+[20:14.60]To make it better
+[20:15.60]And I think overtime
+[20:16.60]It has gotten a lot better
+[20:17.60]But I think
+[20:18.60]The initial version
+[20:19.60]Was like really very bad
+[20:20.60]But it was basically
+[20:21.60]Like a natural language
+[20:22.60]Summary of an abstract
+[20:23.60]Like kind of a one-sentence
+[20:24.60]Summary
+[20:25.60]And which we still have
+[20:26.60]And then as we learned
+[20:27.60]Kind of more about this
+[20:28.60]Systematic review workflow
+[20:29.60]We started expanding
+[20:30.60]The capability so that
+[20:31.60]You could extract a lot
+[20:32.60]And more with that
+[20:33.60]And were you using
+[20:34.60]Like embeddings
+[20:35.60]And cosine similarity
+[20:36.60]That kind of stuff
+[20:37.60]For retrieval
+[20:38.60]Or was it keyword based
+[20:39.60]Or
+[20:40.60]I think the very first version
+[20:42.60]Didn't even have
+[20:43.60]It's own search engine
+[20:44.60]I think the very first version
+[20:45.60]Probably used
+[20:46.60]The semantic school or API
+[20:48.60]Or something similar
+[20:49.60]And only later when we discovered
+[20:51.60]That the API is not very semantic
+[20:53.60]Then built our own search
+[20:55.60]Search and that has helped a lot
+[20:57.60]And then we're going to go into
+[20:59.60]Like more recent products stuff
+[21:01.60]But like you know
+[21:02.60]I think you seem the more
+[21:03.60]So to start up oriented
+[21:04.60]Business person
+[21:05.60]And you seem sort of more
+[21:06.60]Ideologically like interested
+[21:08.60]In research obviously
+[21:09.60]Because of your PhD
+[21:10.60]What kind of market sizing
+[21:11.60]Were you guys thinking
+[21:12.60]Right?
+[21:13.60]Because you're here saying
+[21:14.60]Like we have to double every month
+[21:15.60]And I'm like
+[21:16.60]I don't know how you make
+[21:17.60]That conclusion from this
+[21:19.60]Right?
+[21:20.60]Especially also as a nonprofit
+[21:21.60]At the time
+[21:22.60]I mean market size wise
+[21:23.60]I felt like in this space
+[21:25.60]Where so much was changing
+[21:27.60]And it was very unclear
+[21:29.60]What of today was actually
+[21:30.60]Will be true tomorrow
+[21:31.60]We just like
+[21:32.60]Really rested a lot
+[21:33.60]On very very simple
+[21:34.60]Fundamental principles
+[21:35.60]Which is like
+[21:36.60]If you can understand
+[21:37.60]The truth that is
+[21:38.60]Very economically beneficial
+[21:40.60]Like valuable
+[21:41.60]If you like know the truth
+[21:42.60]On principle
+[21:43.60]That's enough for you
+[21:44.60]Research is the key to many
+[21:45.60]Breakthroughs that are
+[21:46.60]Very commercially valuable
+[21:47.60]Because my version of it
+[21:48.60]Is students are poor
+[21:49.60]And they don't pay
+[21:50.60]For anything
+[21:51.60]Right?
+[21:52.60]But that's obviously not true
+[21:53.60]As you guys have found out
+[21:54.60]But you had to have
+[21:55.60]Some market insight
+[21:56.60]For me to have believed that
+[21:57.60]But you skipped that
+[21:58.60]We did encounter
+[21:59.60]Talking to vcs
+[22:00.60]For our seed round
+[22:01.60]A lot of vcs were like
+[22:02.60]You know researchers
+[22:03.60]They don't have any money
+[22:04.60]Why don't you build
+[22:05.60]Legal assistant
+[22:07.60]I think in some
+[22:09.60]Short-sighted way
+[22:10.60]Maybe that's true
+[22:11.60]But I think in the long run
+[22:12.60]R&D is such a big space
+[22:13.60]Of the economy
+[22:14.60]I think if you can
+[22:15.60]Substantially improve
+[22:17.60]How quickly people find
+[22:19.60]New discoveries
+[22:20.60]Or avoid controlled trials
+[22:22.60]That don't go anywhere
+[22:23.60]I think that's just
+[22:24.60]Huge amounts of money
+[22:25.60]And there are a lot
+[22:26.60]Of questions obviously
+[22:27.60]About between here and there
+[22:28.60]But I think as long as
+[22:29.60]The fundamental principle is there
+[22:31.60]We were okay with that
+[22:32.60]And I guess we found
+[22:33.60]Some investors who also were
+[22:34.60]Yeah congrats
+[22:35.60]I'm sure we can cover
+[22:37.60]The sort of flip later
+[22:39.60]I think you're about to start
+[22:40.60]As on like GPT-3
+[22:41.60]And how like that
+[22:42.60]Changed things for you
+[22:43.60]It's funny like I guess
+[22:44.60]Every major GPT version
+[22:45.60]You have like some big insight
+[22:47.60]Yeah I mean
+[22:49.60]What do you think
+[22:50.60]I think it's a little bit
+[22:52.60]Less true for us than for others
+[22:54.60]Because we always believe
+[22:55.60]That there will basically
+[22:57.60]Human level machine work
+[23:00.60]And so
+[23:01.60]It is definitely true
+[23:02.60]That in practice
+[23:03.60]For your product
+[23:04.60]As new models come out
+[23:06.60]Your product starts working better
+[23:07.60]You can add some features
+[23:08.60]That you couldn't add before
+[23:09.60]But I don't think
+[23:11.60]We really ever had the
+[23:13.60]Moment where we were like
+[23:14.60]Oh wow
+[23:15.60]That is super unanticipated
+[23:17.60]We need to do something
+[23:18.60]Entirely different now
+[23:19.60]From what was on the roadmap
+[23:21.60]I think GPT-3
+[23:22.60]Was a big change
+[23:23.60]Because it kind of said
+[23:25.60]Oh now is the time
+[23:26.60]To build these tools
+[23:27.60]And then GPT-4
+[23:28.60]Was maybe a little bit
+[23:29.60]More of an extension
+[23:30.60]Of GPT-3
+[23:31.60]GPT-3 over GPT-2
+[23:32.60]Was like qualitative level
+[23:34.60]Shift
+[23:35.60]Then GPT-4 was like
+[23:36.60]Okay great
+[23:37.60]Now it's like more accurate
+[23:38.60]We're more accurate
+[23:39.60]On these things
+[23:40.60]We can answer harder questions
+[23:41.60]But the shape of the product
+[23:42.60]Had already taken place
+[23:43.60]By that time
+[23:44.60]I kind of want to ask you
+[23:45.60]About this sort of pivot
+[23:46.60]That you made
+[23:47.60]But I guess that was just
+[23:48.60]A way to sell
+[23:49.60]What you were doing
+[23:50.60]Which is you're adding
+[23:51.60]Extra features on grouping
+[23:52.60]My concepts
+[23:53.60]The GPT-4 pivot
+[23:54.60]Quote unquote pivot
+[23:55.60]Yeah yeah
+[23:56.60]Exactly
+[23:57.60]Yeah yeah
+[23:58.60]When we launched
+[23:59.60]This workflow
+[24:00.60]Now that GPT-4
+[24:01.60]Was available
+[24:02.60]Basically
+[24:03.60]Elisa was at a place
+[24:04.60]Where we have very tabular
+[24:05.60]Interfaces
+[24:06.60]So given a table of papers
+[24:07.60]You can extract data
+[24:08.60] Across all the tables
+[24:09.60]But you kind of want
+[24:10.60]To take the analysis
+[24:11.60]A step further
+[24:12.60]Sometimes what you'd care
+[24:13.60]About is not having
+[24:14.60]A list of papers
+[24:15.60]But a list of arguments
+[24:17.60]A list of effects
+[24:18.60]A list of interventions
+[24:19.60]A list of techniques
+[24:20.60]And so that's
+[24:21.60]One of the things we're
+[24:22.60]Working on is now that
+[24:23.60]You've extracted this information
+[24:24.60]A way
+[24:25.60]Can you pivot it
+[24:26.60]Or group by
+[24:27.60]Whatever the information
+[24:28.60]That you extracted
+[24:29.60]To have more insight
+[24:30.60]First information
+[24:31.60]Still supported
+[24:32.60]By the academic literature
+[24:33.60]Yeah
+[24:34.60]There was a big revelation
+[24:35.60]When I saw it
+[24:36.60]Basically I think
+[24:37.60]I'm very just impressed
+[24:38.60]By how first principles
+[24:39.60]Your ideas
+[24:40.60]Around the workflow is
+[24:42.60]And I think
+[24:43.60]That's why
+[24:44.60]You're not as reliant
+[24:45.60]On like the LM
+[24:46.60]Improving
+[24:47.60]Because it's actually
+[24:48.60]Just about improving
+[24:49.60]The workflow
+[24:50.60]That you will recommend
+[24:51.60]To people
+[24:52.60]Today we might call
+[24:53.60]It's rely on
+[24:54.60]This is the way
+[24:55.60]That elicit
+[24:56.60]Does research
+[24:57.60]And this is
+[24:58.60]What we think
+[24:59.60]Is most effective
+[25:00.60]Based on talking to our users
+[25:01.60]The problem space
+[25:02.60]Is still huge
+[25:03.60]Like if it's
+[25:04.60]Like this big
+[25:05.60]We're all still operating
+[25:06.60]At this tiny part
+[25:07.60]Bit of it
+[25:08.60]So you know
+[25:09.60]I think about this a lot
+[25:10.60]In the context of motes
+[25:11.60]People are like
+[25:12.60]Oh what's your mode
+[25:13.60]What happens
+[25:14.60]If GPT-5 comes out
+[25:15.60]It's like if GPT-5 comes out
+[25:16.60]There's still like
+[25:17.60]All of this other space
+[25:18.60]That we can go into
+[25:19.60]And so I think being
+[25:20.60]Really obsessed
+[25:21.60]With the problem
+[25:22.60]It's a robust
+[25:23.60]And just kind of
+[25:24.60]Directly incorporate
+[25:25.60]Model improvements
+[25:26.60]And they keep going
+[25:27.60]And then I first encountered
+[25:28.60]You guys with Charlie
+[25:29.60]You can tell us
+[25:30.60]About that project
+[25:31.60]Basically yeah
+[25:32.60]Like how much did cost
+[25:34.60]Become a concern
+[25:35.60]As you're working more
+[25:36.60]And more with OpenAI
+[25:37.60]How do you manage
+[25:38.60]That relationship
+[25:39.60]Let me talk about
+[25:40.60]Who Charlie is
+[25:41.60]You can talk about that
+[25:42.60]Charlie is a special character
+[25:43.60]So Charlie
+[25:44.60]When we found him
+[25:45.60]Was had just finished
+[25:46.60]His freshman year
+[25:47.60]At the University of Warwick
+[25:48.60]I think he had heard
+[25:49.60]About us on some discord
+[25:50.60]And then he applied
+[25:51.60]And then we just saw
+[25:52.60]That he had done so many
+[25:53.60]Incredible side projects
+[25:54.60]And we were actually
+[25:55.60]On a team retreat
+[25:56.60]In Barcelona
+[25:57.60]Visiting our head of engineering
+[25:58.60]At that time
+[25:59.60]And everyone was talking
+[26:00.60]About this wonder kid
+[26:01.60]They're like this kid
+[26:02.60]And then on our take home
+[26:03.60]Project he had done
+[26:04.60]Like the best of anyone
+[26:05.60]To that point
+[26:06.60]And so people were
+[26:07.60]Just like so excited
+[26:08.60]To hire him
+[26:09.60]So we hired him
+[26:10.60]As an intern
+[26:11.60]And then we're like Charlie
+[26:12.60]What if you just dropped
+[26:13.60]Out of school
+[26:14.60]And so then we convinced
+[26:15.60] him to take a year off
+[26:16.60]And he's just
+[26:17.60]Incredibly productive
+[26:18.60]And I think the thing
+[26:19.60]You're referring to
+[26:20.60]He kind of launched
+[26:21.60]Their constitutional AI paper
+[26:23.60]And within a few days
+[26:24.60]I think four days
+[26:25.60]He had basically implemented
+[26:26.60]That in production
+[26:27.60]And then we had it
+[26:28.60]In app a week or so after that
+[26:30.60]And he has since kind of
+[26:31.60]Contributed to major improvements
+[26:33.60]Like cutting costs down
+[26:34.60]To a tenth of what they were
+[26:36.60]Really large scale
+[26:37.60]But yeah, you can talk
+[26:38.60]About the technical stuff
+[26:39.60]Yeah, on the
+[26:40.60]Constitutional AI project
+[26:41.60]This was for abstract summarization
+[26:43.60]Where in illicit
+[26:44.60]If you run a query
+[26:45.60]It'll return papers to you
+[26:47.60]And then it will summarize
+[26:48.60]Each paper
+[26:49.60]The query for you
+[26:50.60]On the fly
+[26:51.60]And that's a really
+[26:52.60]Important part of illicit
+[26:53.60]Because illicit does it so much
+[26:55.60]We run a few searches
+[26:56.60]It'll have done it
+[26:57.60]A few hundred times for you
+[26:58.60]And so we cared a lot
+[26:59.60]About this both
+[27:00.60]Being like fast, cheap
+[27:02.60]And also very low on hallucination
+[27:04.60]I think if illicit
+[27:05.60]Hollucinate something
+[27:06.60]About the abstract
+[27:07.60]That's really not good
+[27:08.60]And so what Charlie did
+[27:09.60]In that project was
+[27:11.60]Created a constitution
+[27:12.60]That expressed
+[27:13.60]Where are the attributes
+[27:14.60]Of a good summary
+[27:15.60]Everything in the summary
+[27:16.60]Is reflected in the actual abstract
+[27:18.60]It was like
+[27:19.60]Very concise
+[27:20.60]Etc.
+[27:21.60]And then
+[27:22.60]Used RLHF
+[27:24.60]With a model
+[27:25.60]That was trained
+[27:26.60]On the constitution
+[27:27.60]To basically
+[27:29.60]Find you a better
+[27:30.60]Summarizer
+[27:31.60]On an open source model
+[27:32.60]Yeah, I think
+[27:33.60]That might still be in use
+[27:34.60]Yeah, yeah, definitely
+[27:35.60]Yeah, I think
+[27:36.60]At the time
+[27:37.60]The models hadn't been
+[27:38.60]Trained at all
+[27:39.60]To be faithful to a text
+[27:41.60]So they were just generating
+[27:42.60]So then when you
+[27:43.60]Ask them a question
+[27:44.60]They tried too hard
+[27:45.60]To answer the question
+[27:46.60]And didn't try hard
+[27:47.60]Answer the question
+[27:48.60]Given the text
+[27:49.60]Or answer what the text
+[27:50.60] Said about the question
+[27:51.60]So we had to
+[27:52.60]Basically teach the models
+[27:53.60]To do that specific task
+[27:54.60]How do you monitor
+[27:55.60]The ongoing performance
+[27:57.60]Of your models
+[27:58.60]Not to get
+[27:59.60]To LLMopsy
+[28:00.60]But you are one of the
+[28:01.60]Larger more well-known
+[28:02.60]Operations
+[28:03.60]Doing NLP at scale
+[28:04.60]I guess effectively
+[28:06.60]Like you have to monitor
+[28:07.60]These things and nobody
+[28:08.60]Has a good answer
+[28:09.60]That talks to you
+[28:10.60]Yeah, I don't think
+[28:11.60]We have a good answer yet
+[28:12.60]I think the answers
+[28:13.60]Are actually a little bit
+[28:14.60]Clear on the
+[28:15.60]Just kind of basic
+[28:16.60]The business side
+[28:17.60]Of where you can
+[28:18.60]Import ideas
+[28:19.60]From normal
+[28:20.60]Soft engineering
+[28:21.60]And normal kind
+[28:22.60]Of DevOps
+[28:23.60]You're like
+[28:24.60]Well, you need to
+[28:25.60]Monitor kind
+[28:26.60]Of latencies
+[28:27.60]And response times
+[28:28.60]And optime and whatnot
+[28:29.60]Performance is more
+[28:30.60]Of hallucination rate
+[28:31.60]And then things
+[28:32.60]Like hallucination rate
+[28:33.60]Where I think there
+[28:34.60]The really
+[28:35.60]Important thing
+[28:36.60]Is training time
+[28:37.60]So we care a lot
+[28:38.60]About having
+[28:39.60]Our own internal
+[28:41.60]Benchmarks
+[28:42.60]For model development
+[28:44.60]That reflect
+[28:45.60]So that we can
+[28:46.60]Know ahead of time
+[28:47.60]How well
+[28:48.60]Is the model
+[28:49.60]Gonna perform
+[28:50.60]On different types
+[28:51.60]Of tasks
+[28:52.60]So the tasks being
+[28:53.60]Summarization
+[28:54.60]Question answering
+[28:55.60]Given a paper
+[28:56.60]Ranking
+[28:57.60]And for each of those
+[28:58.60]We wanna know
+[28:59.60]What's the distribution
+[29:00.60]Of things the model
+[29:01.60]Is gonna see
+[29:02.60]So that we can
+[29:03.60]Have well-calibrated
+[29:04.60]Predictions on
+[29:05.60]How well the model
+[29:06.60]Is gonna do in production
+[29:07.60]And I think, yeah,
+[29:08.60]There's like
+[29:09.60]Some chance
+[29:10.60]That there's distribution
+[29:11.60]Shift and actually
+[29:12.60]The things users enter
+[29:13.60]Are gonna be different
+[29:14.60]Trainings right
+[29:15.60]And having
+[29:16.60]Very high quality
+[29:17.60]Well-vetted data
+[29:18.60]Sets at training time
+[29:19.60]I think we also
+[29:20.60]End up effectively
+[29:21.60]Monitoring by trying
+[29:22.60]To evaluate new models
+[29:23.60]As they come out
+[29:24.60]And so that like
+[29:25.60]Kind of prompts us
+[29:26.60]To go through
+[29:27.60]Our eval suite
+[29:28.60]Every couple of months
+[29:29.60]And so every time
+[29:30.60]A new model comes out
+[29:31.60]We have to see
+[29:32.60]Like how is this performing
+[29:33.60]Relative to production
+[29:34.60]And what we currently have
+[29:35.60]Yeah, I mean
+[29:36.60]Since we're on this topic
+[29:37.60]Any new models
+[29:38.60]That really call
+[29:39.60]Your eye this year
+[29:40.60]Like cloud came out
+[29:41.60]Yeah, I think cloud
+[29:42.60]Is pretty pretty
+[29:43.60]Like a good point
+[29:44.60]On the kind of
+[29:45.60]Predo frontier
+[29:46.60]It's neither
+[29:47.60]The cheapest model
+[29:48.60]Nor is it
+[29:49.60]The most accurate
+[29:51.60]Most high quality model
+[29:52.60]But it's just
+[29:53.60]Like a really good tradeoff
+[29:54.60]Between cost and accuracy
+[29:56.60]You apparently
+[29:57.60]Have to 10 shot it
+[29:58.60]To make it good
+[29:59.60]I tried using
+[30:00.60]Aiku for summarization
+[30:01.60]But zero shot
+[30:02.60]Was not great
+[30:03.60]Then they were like
+[30:04.60]You know, it's a skill issue
+[30:05.60]You have to try it harder
+[30:06.60]Interesting
+[30:07.60]I think GPT-4
+[30:08.60]Unlocked tables for us
+[30:10.60]Processing data from tables
+[30:11.60]Which was huge
+[30:12.60]GPT-4 vision
+[30:13.60]Yeah
+[30:14.60]Did you try for you
+[30:15.60]I guess you can't try for you
+[30:16.60]Because it's noncommercial
+[30:17.60]That's the adept model
+[30:18.60]Yeah, we haven't tried that one
+[30:19.60]Yeah
+[30:20.60]Yeah, but cloud is multimodal as well
+[30:22.60]Yeah
+[30:23.60]I think the interesting insight
+[30:24.60]That we got from talking to David Luan
+[30:25.60]Who is CEO of Adept
+[30:26.60]Was that multimodality
+[30:28.60]Has effectively two different flavors
+[30:30.60]Like one is
+[30:31.60]Rerecognize images from a camera
+[30:33.60]In the outside natural world
+[30:35.60]And actually the more important
+[30:37.60]Multimodality for knowledge work
+[30:38.60]Is screenshots
+[30:39.60]And you know
+[30:40.60]PDFs and charts and graphs
+[30:42.60]So we need a new term
+[30:43.60]For that kind of multimodality
+[30:45.60]But is a claim
+[30:46.60]That current models
+[30:47.60]Are good at one or the other
+[30:49.60]Yeah, they're over index
+[30:50.60]Because of the history of computer vision
+[30:51.60]Is coco, right?
+[30:53.60]So now we're like
+[30:54.60]Oh, actually, you know
+[30:55.60]Screens are more important
+[30:56.60]OCR handwriting
+[30:58.60]You mentioned a lot of
+[30:59.60]Closed model lab stuff
+[31:01.60]And then you also have
+[31:02.60]Like this open source model
+[31:03.60]Find tuning stuff
+[31:04.60]Like what is your workload
+[31:05.60]Now between close and open
+[31:06.60]It's a good question
+[31:07.60]I think
+[31:08.60]It's half and half
+[31:09.60]Is that even a relevant question
+[31:10.60]Or not
+[31:11.60]This is a nonsensical question
+[31:12.60]It depends a little bit on
+[31:13.60]Like how you index
+[31:14.60]Whether you index by
+[31:15.60]Like computer cost
+[31:16.60]The number of queries
+[31:17.60]I'd say like
+[31:18.60]In terms of number of queries
+[31:19.60]Is maybe similar
+[31:20.60]In terms of like costing computer
+[31:22.60]I think the closed models
+[31:23.60]Make up more of the budget
+[31:25.60]Since the main cases
+[31:26.60]Where you want to use closed models
+[31:28.60]Are cases where
+[31:29.60]They're just smarter
+[31:31.60]Where there are no existing
+[31:33.60]Open source models
+[31:34.60]Are quite smart enough
+[31:35.60]Yeah
+[31:36.60]We have a lot of
+[31:37.60]Interesting technical questions
+[31:38.60]To go in
+[31:39.60]But just to wrap
+[31:40.60]The kind of like
+[31:41.60]UX evolution
+[31:42.60]Now you have the notebooks
+[31:43.60]We talked a lot
+[31:44.60]About how chatbots
+[31:45.60]Are not the final frontier
+[31:47.60]You know
+[31:48.60]How did you decide
+[31:49.60]To get into notebooks
+[31:50.60]Which is a very iterative
+[31:51.60]Kind of like interactive
+[31:52.60]Interface
+[31:53.60]And yeah
+[31:54.60]Maybe learnings from that
+[31:55.60]Yeah this is actually
+[31:56.60]Our fourth time
+[31:57.60]Trying to make this work
+[31:59.60]I think the first time
+[32:00.60]Was probably in early 2021
+[32:03.60]I think because
+[32:04.60]We've always been obsessed
+[32:05.60]With this idea of task
+[32:06.60]Decomposition
+[32:07.60]And like branching
+[32:08.60]We always wanted a tool
+[32:10.60]That could be kind of
+[32:11.60]Unbounded
+[32:12.60]Where you could keep going
+[32:13.60]Could do a lot of branching
+[32:14.60]Where you could kind of apply
+[32:15.60]Language model operations
+[32:17.60]Or computations on other tasks
+[32:19.60]So in 2021
+[32:20.60]We had this thing called
+[32:21.60]Composite tasks
+[32:22.60]Where you could use GPT-3
+[32:23.60]To brainstorm
+[32:24.60]A bunch of research questions
+[32:25.60]And then take
+[32:26.60]Each research question
+[32:27.60]And decompose those
+[32:28.60]Further into subquestions
+[32:30.60]This kind of again
+[32:31.60]That like task decomposition
+[32:32.60]Tree type thing
+[32:33.60]Was always very exciting to us
+[32:35.60]But that was like
+[32:36.60]It was kind of overwhelming
+[32:37.60]Then at the end of 22
+[32:39.60]I think we tried again
+[32:40.60]And at that point
+[32:41.60]We were thinking
+[32:42.60]Okay we've done a lot
+[32:43.60]With this literature review thing
+[32:44.60]We also want to start helping
+[32:45.60]With kind of adjacent domains
+[32:47.60]And different workflows
+[32:48.60]Like we want to help more
+[32:49.60]With machine learning
+[32:50.60]What does that look like
+[32:51.60]And as we were thinking
+[32:52.60]About it we're like
+[32:53.60]Well there are so many
+[32:54.60]Research workflows
+[32:55.60]How do we not just build
+[32:56.60]Three new workflows
+[32:57.60]Into elicit
+[32:58.60]But make elicit
+[32:59.60]Really generic
+[33:00.60]To lots of workflows
+[33:01.60]What is like a generic
+[33:02.60]Composable system
+[33:03.60]With nice abstractions
+[33:04.60]That can like
+[33:05.60]Scale to all these workflows
+[33:06.60]So we like
+[33:07.60]Iterated on that a bunch
+[33:08.60]And like
+[33:09.60]Didn't quite narrow
+[33:10.60]The problem space enough
+[33:11.60]Or like
+[33:12.60]Get to what we wanted
+[33:13.60]And then I think it was
+[33:14.60]At the beginning of 2023
+[33:16.60]We were like
+[33:17.60]Wow computational notebooks
+[33:18.60]Kind of enable this
+[33:19.60]Where they have a lot
+[33:20.60]Of flexibility
+[33:21.60]But you know
+[33:22.60]Kind of robust primitive
+[33:23.60]Such that you can extend
+[33:24.60]The workflow
+[33:25.60]And it's not limited
+[33:26.60]It's not like
+[33:27.60]You ask a query
+[33:28.60]You get an answer
+[33:29.60]You're done
+[33:30.60]You can just constantly
+[33:31.60]Keep building on top of that
+[33:32.60]And each little step
+[33:33.60]Seems like a really good
+[33:34.60]Work for the language model
+[33:35.60]And also there was just
+[33:36.60]Like really helpful
+[33:37.60]To have a bit more
+[33:38.60]Preexisting work to emulate
+[33:40.60]Yeah, that's kind of
+[33:41.60]How we ended up at
+[33:42.60]Computational notebooks
+[33:43.60]For elicit
+[33:44.60]Maybe one thing
+[33:45.60]That's worth making explicit
+[33:46.60]Is the difference between
+[33:47.60]Computational notebooks
+[33:48.60]And chat because
+[33:49.60]On the surface
+[33:50.60]They seem pretty similar
+[33:51.60]It's kind of this iterative
+[33:52.60]Interaction where you add stuff
+[33:53.60]In both cases
+[33:54.60]You have a back and forth
+[33:55.60]Between you enter stuff
+[33:56.60]And then you get some output
+[33:57.60]And then you enter stuff
+[33:58.60]But the important difference
+[33:59.60]In our minds is
+[34:00.60]With notebooks
+[34:01.60]You can define a process
+[34:03.60]So in data science
+[34:04.60]You know like
+[34:05.60]Here's like my data analysis
+[34:06.60]Process that takes in a CSV
+[34:08.60]And then does some extraction
+[34:09.60]And then generates a figure
+[34:10.60]At the end
+[34:11.60]And you can prototype it
+[34:13.60]Using a small CSV
+[34:14.60]And then you can run it
+[34:15.60]Over a much larger CSV
+[34:16.60]Later
+[34:17.60]And similarly
+[34:18.60]The vision for notebooks
+[34:19.60]In our case
+[34:20.60]Is to not make it this
+[34:22.60]Like one of chat interaction
+[34:23.60]But to allow you to then
+[34:25.60]Say if you start
+[34:27.60]And first you're like
+[34:28.60]Okay, let me just
+[34:29.60]Analyze a few papers
+[34:30.60]And see do I get to
+[34:31.60]The correct conclusions
+[34:32.60]For those few papers
+[34:33.60]Can I then later
+[34:34.60]Go back and say
+[34:35.60]Now let me run this
+[34:36.60]Over 10,000 papers
+[34:38.60]Now that I've debug
+[34:39.60]The process
+[34:40.60]Using a few papers
+[34:41.60]And that's an interaction
+[34:42.60]That doesn't fit
+[34:43.60]Quite as well
+[34:44.60]Into the chat framework
+[34:45.60]Because that's more
+[34:46.60]For kind of quick
+[34:47.60]Back and forth
+[34:48.60]Interaction
+[34:49.60]Do you think in notebooks
+[34:50.60]That's kind of like
+[34:51.60]Structure, editable
+[34:52.60]Chain of thought
+[34:53.60]Basically step by step
+[34:54.60]Like is that kind of
+[34:55.60]Where you see this going
+[34:56.60]And then are people
+[34:57.60]Gonna reuse notebooks
+[34:59.60]As like templates
+[35:00.60]And maybe in traditional
+[35:01.60]Notebooks
+[35:02.60]As like cookbooks
+[35:03.60]Right, you share a cookbook
+[35:04.60]You can start from there
+[35:05.60]Is that similar
+[35:06.60]And illicit
+[35:07.60]Yeah, that's exactly right
+[35:08.60]So that's our hope
+[35:09.60]That people will build templates
+[35:10.60]Share them with other people
+[35:12.60]I think chain of thought
+[35:13.60]Is maybe still like
+[35:14.60]Kind of one level
+[35:15.60]Lower on the abstraction hierarchy
+[35:17.60]Then we would think of notebooks
+[35:19.60]I think we'll probably
+[35:20.60]Want to think about
+[35:21.60]More semantic pieces
+[35:22.60]Like a building block
+[35:23.60]Is more like a paper search
+[35:25.60]Or an extraction
+[35:26.60]Or a list of concepts
+[35:28.60]And then the models
+[35:30.60]And the reasoning
+[35:31.60]Will probably often be
+[35:32.60]One level down
+[35:33.60]You always want to
+[35:34.60]Be able to see it
+[35:35.60]But you don't always
+[35:36.60]Want it to be front and center
+[35:37.60]Yeah, what's the difference
+[35:38.60]Between a notebook
+[35:39.60]And an agent
+[35:40.60]Since everybody always
+[35:41.60]Ask me what's an agent
+[35:42.60]Like how do you think
+[35:43.60]About where the line is
+[35:45.60]Yeah, it's an interesting
+[35:46.60]Question
+[35:47.60]In the notebook world
+[35:48.60]I would
+[35:49.60]Generally think of
+[35:50.60]The human as the agent
+[35:51.60]In the first iteration
+[35:52.60]So you have the notebook
+[35:53.60]And the human kind of
+[35:54.60]Adds little action steps
+[35:56.60]And then the next point
+[35:58.60]On this kind of progress
+[35:59.60]Okay, now you can use
+[36:00.60]Language models to predict
+[36:01.60]Which action
+[36:02.60]Would you take as a human
+[36:03.60]And at some point
+[36:04.60]You're probably going to
+[36:05.60]Be very good at this
+[36:06.60]You'll be like, okay
+[36:07.60]In some cases, I can
+[36:08.60]With 99.9% accuracy
+[36:09.60]Predict what you do
+[36:10.60]And then you might
+[36:11.60]As well just execute it
+[36:12.60]Like why wait for the human
+[36:13.60]And eventually
+[36:14.60]As you get better at this
+[36:15.60]That will just look
+[36:16.60]More and more like agents
+[36:18.60]Taking actions
+[36:19.60]As opposed to you
+[36:20.60]Doing the thing
+[36:21.60]I think templates
+[36:22.60]Are a specific case of this
+[36:23.60]Very like, okay, well
+[36:24.60]There's just particular
+[36:25.60]Sequences of actions
+[36:26.60]That you often want to chunk
+[36:27.60]And have available
+[36:28.60]Just like in normal
+[36:29.60]Programming
+[36:30.60]And those
+[36:31.60]You can view them as
+[36:32.60]Action sequences of agents
+[36:33.60]Or you can view them as
+[36:34.60]More normal programming
+[36:36.60]Language abstraction thing
+[36:37.60]And I think those
+[36:38.60]Are two valid views
+[36:40.60]How do you see this
+[36:41.60]Changes
+[36:42.60]Like you said, the models
+[36:43.60]Get better and you need
+[36:44.60]Less and less human
+[36:45.60]Actual interfacing
+[36:47.60]With the model
+[36:48.60]You just get the results
+[36:49.60]Like how does the UX
+[36:50.60]And the way people
+[36:51.60]Perceive it change
+[36:52.60]Yeah, I think this
+[36:53.60] kind of interaction
+[36:54.60]Paradimes for evaluation
+[36:55.60]Is not really something
+[36:56.60]The internet has encountered
+[36:57.60]Yet because up to now
+[36:58.60]The internet has all been
+[36:59.60]About getting data
+[37:00.60]And work from people
+[37:02.60]So increasingly
+[37:03.60]I really want kind of
+[37:04.60]Evaluation both from
+[37:05.60]An interface perspective
+[37:06.60]And from like a
+[37:07.60]Technical perspective
+[37:08.60]Operation perspective
+[37:09.60]To be a superpower
+[37:10.60]For elicit because I think
+[37:11.60]Over time models will do
+[37:12.60]More and more of the work
+[37:13.60]And people will have
+[37:14.60]To do more and more
+[37:15.60]Of the evaluation
+[37:16.60]So I think yeah
+[37:17.60]In terms of the interface
+[37:18.60]Some of the things we have
+[37:19.60]Today, you know
+[37:20.60]For every kind of
+[37:21.60]Language model generation
+[37:22.60]There's some citation back
+[37:23.60]And we kind of try to
+[37:24.60]Highlight the ground truth
+[37:25.60]In the paper
+[37:26.60]To whatever elicit said
+[37:27.60]And make it super easy
+[37:28.60]So you can click on it
+[37:29.60]And quickly see
+[37:30.60]In context and validate
+[37:31.60]Whether the text
+[37:32.60]Actually supports
+[37:33.60]The answer that elicit gave
+[37:34.60]So I think we'd probably
+[37:35.60]Want to scale things up
+[37:37.60]Like that, like the ability
+[37:38.60]To kind of spot check
+[37:39.60]The models work super
+[37:40.60]Quickly scale up
+[37:41.60]Interfaces like that
+[37:42.60]And who would spot check
+[37:44.60]The user
+[37:45.60]Yeah, to start
+[37:46.60]It would be the user
+[37:47.60]One of the other things
+[37:48.60]We do is also kind of flag
+[37:49.60]The models uncertainty
+[37:50.60]So we have models report
+[37:52.60]Out how confident are you
+[37:53.60]That this was the
+[37:54.60]Sample size of this study
+[37:55.60]The model's not sure
+[37:56.60]We throw a flag
+[37:57.60]And so the user knows
+[37:58.60]To prioritize checking that
+[37:59.60]So again, we can kind of
+[38:00.60]Scale that up
+[38:01.60]So when the model's like
+[38:02.60]Well, I searched this
+[38:03.60]On Google, I'm not sure
+[38:04.60]If that was the right thing
+[38:05.60]I have an uncertainty flag
+[38:06.60]And the user can go
+[38:07.60]And be like, okay
+[38:08.60]That was actually
+[38:09.60]The right thing to do or not
+[38:10.60]I've tried to do
+[38:11.60]Uncertainty ratings
+[38:12.60]From models
+[38:13.60]I don't know
+[38:14.60]If you have this live
+[38:15.60]Because I just
+[38:16.60]Didn't find them reliable
+[38:17.60]Because they just elucidated
+[38:18.60]Their own uncertainty
+[38:19.60]I would love to
+[38:20.60]Based on log probes
+[38:22.60]Or something more
+[38:23.60]Native within the model
+[38:24.60]Better than generated
+[38:25.60]But it sounds like
+[38:27.60]The scale properly for you
+[38:29.60]Yeah, we found it
+[38:30.60]To be pretty calibrated
+[38:31.60]Diverse on the model
+[38:32.60]I think in some cases
+[38:33.60]We also used
+[38:34.60]To different models
+[38:35.60]For the answer estimates
+[38:36.60]Then for the question
+[38:37.60]Answering
+[38:38.60]So one model would say
+[38:39.60]Here's my chain of thought
+[38:40.60]Here's my answer
+[38:41.60]And then a different
+[38:42.60]Type of model
+[38:43.60]Let's say the first model
+[38:44.60]Is Lama
+[38:45.60]And let's say the second
+[38:46.60]Model is GP3.5
+[38:47.60]And then the second model
+[38:49.60]Just looks over
+[38:50.60]The results and like
+[38:51.60]Okay, how confident
+[38:52.60]Are you in this
+[38:53.60]And I think
+[38:54.60]Sometimes using
+[38:55.60]A different model
+[38:56.60]Can be better than
+[38:57.60]Using the same model
+[38:58.60]Yeah, you know
+[38:59.60]On topic of models
+[39:00.60]Evaluating models
+[39:01.60]Obviously you can
+[39:02.60]Do that all day long
+[39:03.60]Like what's your budget
+[39:04.60]Like because
+[39:05.60]Your queries
+[39:06.60]Fan out a lot
+[39:07.60]And then you have
+[39:08.60]Models evaluating models
+[39:09.60]One person typing
+[39:10.60]In a question
+[39:11.60]Can lead to
+[39:12.60]A thousand calls
+[39:13.60]It depends on the project
+[39:14.60]So if the project
+[39:15.60]Is basically
+[39:16.60]A systematic review
+[39:17.60]That otherwise
+[39:18.60]Human research assistance
+[39:19.60]Would do
+[39:20.60]Then the project
+[39:21.60]Is basically
+[39:22.60]Can get quite large
+[39:23.60]For those projects
+[39:24.60]I don't know
+[39:25.60]Let's say
+[39:26.60]A hundred thousand dollars
+[39:27.60]So in those cases
+[39:28.60]You're happier
+[39:29.60]To spend compute
+[39:30.60]Then in the
+[39:31.60]Can of shallow search case
+[39:32.60]Where someone
+[39:33.60]Just enters a question
+[39:34.60]Because I don't know
+[39:35.60]Maybe like it
+[39:36.60]I heard about creatine
+[39:37.60]What's it about
+[39:38.60]Probably don't want
+[39:39.60]To spend a lot of compute
+[39:40.60]On that
+[39:41.60]This sort of
+[39:42.60]Being able to invest
+[39:43.60]More or less compute
+[39:44.60]Into getting
+[39:45.60]More or less accurate answers
+[39:46.60]I think one of the
+[39:47.60]Core things we care about
+[39:48.60]And that I think
+[39:49.60]Is currently undervalued
+[39:50.60]In the AI space
+[39:51.60]You can't choose
+[39:52.60]Which model you want
+[39:53.60]And you can sometimes
+[39:54.60]I don't know
+[39:55.60]You'll tip it
+[39:56.60]It'll try harder
+[39:57.60]Or you can try various
+[39:58.60]Things to get it to work harder
+[40:00.60]But you don't have great
+[40:01.60]Ways of converting
+[40:02.60]Willingness to spend
+[40:03.60]Into better answers
+[40:04.60]And we really
+[40:05.60]Want to build a product
+[40:06.60]That has this sort of
+[40:07.60]Unbounded flavor
+[40:08.60]Where like if you care
+[40:09.60]About it a lot
+[40:10.60]You should be able to get
+[40:11.60]Really high quality answers
+[40:12.60]Really double-checked
+[40:13.60]In every way
+[40:14.60]And you have a
+[40:15.60]Credit-based pricing
+[40:16.60]So unlike most products
+[40:17.60]It's not a fixed monthly
+[40:19.60]Right exactly
+[40:20.60]Some of the
+[40:21.60]Higher costs are
+[40:22.60]Teared
+[40:23.60]So for most casual users
+[40:25.60]They'll just get
+[40:26.60]The abstract summary
+[40:27.60]Which is kind of
+[40:28.60]An open source model
+[40:29.60]Then you can
+[40:30.60]Add more columns
+[40:31.60]Which have more
+[40:32.60]Extractions
+[40:33.60]And these uncertainty features
+[40:34.60]And then you can also
+[40:35.60]Add the same columns
+[40:36.60]In high accuracy mode
+[40:37.60]Which also parses the table
+[40:38.60]So we kind of
+[40:39.60]Stack the complexity
+[40:40.60]And the cost
+[40:41.60]You know the fun thing
+[40:42.60]You can do with a credit system
+[40:43.60]Which is data for data
+[40:44.60]Basically you can
+[40:45.60]Give people more credit
+[40:46.60]If they give
+[40:47.60]Data back to you
+[40:48.60]I don't know
+[40:49.60]You don't have money
+[40:50.60]But you have time
+[40:51.60]How do you exchange that
+[40:53.60]It's a fair trade
+[40:54.60]I think it's interesting
+[40:55.60]We haven't quite operationized it
+[40:56.60]And then you know
+[40:57.60]There's been some kind of like
+[40:58.60]Adverse selection
+[40:59.60]Like you know for example
+[41:00.60]It would be really valuable
+[41:01.60]To get feedback on our model
+[41:02.60]So maybe if you were willing
+[41:03.60]To give more robust feedback
+[41:04.60]On our results
+[41:05.60]We could give you credits
+[41:06.60]Or something like that
+[41:07.60]But then there's kind of this
+[41:08.60]Will people take it seriously
+[41:09.60]And you want the good people
+[41:10.60]Exactly
+[41:11.60]Can you tell who are the good people
+[41:12.60]Not right now
+[41:13.60]But yeah maybe
+[41:14.60]At the point where we can
+[41:15.60]We can offer it
+[41:16.60]We can offer it up to them
+[41:17.60]The perplexity of questions asked
+[41:18.60]If it's higher perplexity
+[41:19.60]These are smarter people
+[41:20.60]Yeah maybe
+[41:21.60]And if you make a lot of typos
+[41:22.60]In your queries
+[41:23.60]You're not going to get off
+[41:24.60]How does that change
+[41:25.60]Negative social credit
+[41:28.60]It's very topical right now
+[41:29.60]To think about
+[41:30.60]The threat of long context windows
+[41:32.60]All these models
+[41:34.60]We're talking about these days
+[41:35.60]All like a million tokens plus
+[41:36.60]Is that relevant for you
+[41:38.60]Can you make use of that
+[41:39.60]Is that just prohibitively expensive
+[41:41.60]Because you're just paying
+[41:42.60]For all those tokens
+[41:43.60]Or you're just doing right
+[41:44.60]It's definitely relevant
+[41:45.60]And when we think about search
+[41:46.60]As many people do
+[41:47.60]We think about kind of
+[41:48.60]A staged pipeline
+[41:49.60]Of retrieval
+[41:50.60]Where first you use
+[41:51.60]Semitic search database
+[41:53.60]With embeddings
+[41:54.60]Get like the
+[41:55.60]In our case maybe 400
+[41:56.60]Or so most relevant papers
+[41:57.60]And then
+[41:58.60]You still need to rank those
+[41:59.60]And I think at that point
+[42:01.60]It becomes pretty interesting
+[42:02.60]To use larger models
+[42:04.60]So specifically in the past
+[42:06.60]I think a lot of ranking
+[42:07.60]Was kind of per item ranking
+[42:09.60]Where you would score
+[42:10.60]Each individual item
+[42:11.60]Maybe using increasingly
+[42:12.60]Expensive scoring methods
+[42:13.60]And then rank based on the scores
+[42:15.60]But I think list wise
+[42:16.60]We ranking where
+[42:17.60]You have a model
+[42:18.60]That can see
+[42:19.60]All the elements
+[42:20.60]Is a lot more powerful
+[42:21.60]Because often you can
+[42:22.60]Only really tell
+[42:23.60]How good a thing is
+[42:24.60]In comparison to other things
+[42:26.60]And what things should come first
+[42:28.60]It really depends on
+[42:29.60]Like well what other things
+[42:30.60]Are available
+[42:31.60]Maybe you even care about
+[42:32.60]Diversity and your results
+[42:33.60]You don't want to show
+[42:34.60]Ten very similar papers
+[42:35.60]As the first 10 results
+[42:36.60]So I think along context models
+[42:38.60]Are quite interesting there
+[42:40.60]And especially for our case
+[42:41.60]Where we care more about
+[42:43.60]Power users who are perhaps
+[42:45.60]A little bit more
+[42:46.60]Welling to wait a little bit longer
+[42:47.60]To get higher quality results
+[42:48.60]Relative to people who just
+[42:50.60]Quickly check out things
+[42:51.60]Because why not
+[42:52.60]I think being able to spend
+[42:53.60]More on longer context
+[42:54.60]Is quite valuable
+[42:55.60]Yeah I think one thing
+[42:56.60]The longer context models
+[42:57.60]Changed for us
+[42:58.60]Is maybe a focus from
+[43:00.60]Breaking down tasks
+[43:01.60]To breaking down the evaluation
+[43:03.60]So before you know
+[43:05.60]If we wanted to answer
+[43:06.60]A question from the full text
+[43:08.60]Of a paper
+[43:09.60]We had to figure out
+[43:10.60]How to chunk it and like
+[43:11.60]Find the relevant chunk
+[43:12.60]And then answer
+[43:13.60]Based on that chunk
+[43:14.60]Then you know
+[43:15.60]Which chunk the model
+[43:16.60]Used to answer the question
+[43:17.60]So if you want to help
+[43:18.60]The user to check it
+[43:19.60]Yeah you can be like
+[43:20.60]Well this was the chunk
+[43:21.60]That the model got
+[43:22.60]And now if you put the whole
+[43:23.60]Text in the paper
+[43:24.60]You have to kind of
+[43:25.60]Find the chunk
+[43:26.60]Like more retroactively
+[43:27.60]Basically and so you need
+[43:28.60]Kind of like a different
+[43:29.60]Set of abilities
+[43:30.60]And obviously like
+[43:31.60]A different technology
+[43:32.60]To figure out
+[43:33.60]You still want to point
+[43:34.60]The user to the supporting
+[43:35.60]Quotes in the text
+[43:36.60]But then the interaction
+[43:37.60]Is a little different
+[43:38.60]You like scan through
+[43:39.60]And find some rouge score
+[43:40.60]Yeah the floor
+[43:41.60]I think there's an
+[43:42.60]Interesting space of
+[43:43.60]Almost research problems
+[43:44.60]Here because
+[43:45.60]You would ideally
+[43:46.60]Make causal claims
+[43:47.60]Like if this
+[43:48.60]Hadn't been in the text
+[43:49.60]The model wouldn't
+[43:50.60]Have said this thing
+[43:51.60]And maybe you can do
+[43:52.60]Expensive approximations
+[43:53.60]To that where like
+[43:54.60]I don't know you just
+[43:55.60]Throw a chunk of the paper
+[43:56.60]And re-answer
+[43:57.60]And see what happens
+[43:58.60]But hopefully
+[43:59.60]There are better
+[44:00.60]Ways of doing that
+[44:01.60]Where you just get
+[44:03.60]That kind of counterfactual
+[44:04.60]Information for free
+[44:05.60]From the model
+[44:06.60]Do you think at all
+[44:07.60]About the cost of maintaining
+[44:09.60]Reg versus just putting
+[44:10.60]More tokens in the window
+[44:12.60]I think in software
+[44:13.60]Development a lot of
+[44:14.60]Times people buy
+[44:15.60]Developer productivity
+[44:16.60]Things so that
+[44:17.60]We don't have to worry
+[44:18.60]About it context windows
+[44:19.60]Kinda the same right
+[44:20.60]You have to maintain
+[44:21.60]Chunking and like
+[44:22.60]Reg retrieval and like
+[44:23.60]Re-ranking and all of this
+[44:24.60] Versus I just shove
+[44:25.60]Everything into the context
+[44:26.60]And like it costs
+[44:27.60]A little more
+[44:28.60]But at least I don't
+[44:29.60]Have to do all of that
+[44:30.60]Is that something
+[44:31.60]You thought about
+[44:32.60]I think we still
+[44:33.60]Like hit up against
+[44:34.60]Context limits enough
+[44:35.60]That it's not really
+[44:36.60]Do we still want
+[44:37.60]To keep this rag around
+[44:38.60]It's like we do still
+[44:39.60]Need it for the scale
+[44:40.60]The worth we're doing
+[44:41.60]I think there are
+[44:42.60]Different kinds of
+[44:43.60]Maintainability in
+[44:44.60]One sense I think
+[44:45.60]You write that
+[44:46.60]Throw everything into
+[44:47.60]The context window thing
+[44:48.60]Is easier to maintain
+[44:49.60]Because you just
+[44:50.60]Can swap out a model
+[44:52.60]In another sense
+[44:53.60]If things go wrong
+[44:54.60]It's harder to debug
+[44:55.60]Like if you know
+[44:56.60]Here's the process
+[44:57.60]That we go through
+[44:58.60]To go from
+[45:00.60]200 million papers
+[45:01.60]To an answer
+[45:02.60]And there are like
+[45:03.60]Little steps
+[45:04.60]And you understand
+[45:05.60]Okay this is the step
+[45:06.60]That finds the relevant
+[45:07.60]Paragraph or whatever
+[45:08.60]Maybe you'll know
+[45:09.60]Which step breaks
+[45:10.60]If it's just like
+[45:11.60]A new model
+[45:12.60]Version came out
+[45:13.60]And now it suddenly
+[45:14.60]Doesn't find your needle
+[45:15.60]In a haystack anymore
+[45:16.60]Then you're like
+[45:17.60]Okay what can you do
+[45:18.60]You're kind of at a loss
+[45:20.60]Yeah let's talk
+[45:21.60]A bit about needle
+[45:22.60]In a haystack
+[45:23.60]And like maybe
+[45:24.60]The opposite of it
+[45:25.60]Which is like hard
+[45:26.60]Grounding I don't know
+[45:27.60]That's like the best thing
+[45:28.60]To think about it
+[45:29.60]But I was using
+[45:30.60]One of these
+[45:31.60]Chavicher documents
+[45:32.60]Features
+[45:33.60]And I put the
+[45:34.60]AMD MI300
+[45:35.60]Spacks and the
+[45:36.60]Blackwell chips
+[45:37.60]From NVIDIA
+[45:38.60]And I was asking questions
+[45:39.60]And we like
+[45:40.60]And the response was like
+[45:41.60]Oh it doesn't say
+[45:42.60]In the specs
+[45:43.60]But if you ask
+[45:44.60]GbD4 without the docs
+[45:45.60]It would tell you no
+[45:46.60]Because nvlink
+[45:47.60]It's an NVIDIA
+[45:48.60]It's technology
+[45:49.60]Just as your N.V.
+[45:50.60]Yeah hey man
+[45:51.60]It just says in the thing
+[45:52.60]How do you think about
+[45:53.60]That having the context
+[45:54.60]Sometimes suppress
+[45:55.60]The knowledge
+[45:56.60]That the model has
+[45:57.60]It really depends on the task
+[45:58.60]Because I think
+[45:59.60]Sometimes that is
+[46:00.60]Exactly what you want
+[46:01.60]So imagine your researcher
+[46:02.60]You're writing the background
+[46:03.60]Section of your paper
+[46:04.60]And you're trying to describe
+[46:05.60]What these other papers say
+[46:06.60]You really don't want
+[46:07.60]Extra information
+[46:08.60]To be introduced there
+[46:09.60]In other cases
+[46:10.60]Where you're just trying
+[46:11.60]To figure out the truth
+[46:12.60]And you're giving
+[46:13.60]The documents because
+[46:14.60]You think they will help
+[46:15.60]The model figure out
+[46:16.60]What the truth is
+[46:17.60]I think you do want
+[46:18.60]If the model has a hunch
+[46:19.60]That there might be
+[46:21.60]Something that's not
+[46:22.60]In the papers
+[46:23.60]You do want to surface that
+[46:24.60]I think ideally
+[46:25.60]You still don't want
+[46:26.60]The model to just tell you
+[46:27.60]I think probably
+[46:28.60]The ideal thing
+[46:29.60]Looks a bit more like
+[46:30.60]Agent control
+[46:31.60]Where the model can issue
+[46:33.60]A query that then
+[46:35.60]Is intended to surface
+[46:36.60]The documents that
+[46:37.60]Substantiate its hunch
+[46:38.60]That's maybe
+[46:39.60]A reasonable middle ground
+[46:40.60]Between
+[46:41.60]While just telling you
+[46:42.60]And while being fully
+[46:43.60]Limited to the papers
+[46:44.60]You give it
+[46:45.60]Yeah, I would say
+[46:46.60]They're just kind of
+[46:47.60]Different tasks right now
+[46:48.60]And the tasks that
+[46:49.60]Elicit is mostly focused on
+[46:50.60]Is what do these papers say
+[46:51.60]But there is another task
+[46:52.60]Which is like
+[46:53.60]Just give me the best
+[46:54.60]Possible answer
+[46:55.60]And that give me
+[46:56.60]The best possible answer
+[46:57.60]Sometimes depends
+[46:58.60]On what do these papers say
+[46:59.60]But it can also depend
+[47:00.60]On other stuff
+[47:01.60]That's not in the papers
+[47:02.60]So ideally
+[47:03.60]We can do both
+[47:04.60]And then kind of
+[47:05.60]We can ask
+[47:06.60]For you
+[47:07.60]More going forward
+[47:08.60]We have
+[47:09.60]See a lot of details
+[47:10.60]But just to zoom
+[47:11.60]Back out a little bit
+[47:12.60]What are maybe
+[47:13.60]The most underrated
+[47:14.60]Features of elicit
+[47:16.60]And what is
+[47:17.60]One thing that
+[47:18.60]Maybe the users
+[47:19.60]Surprise you the most
+[47:20.60]By using it
+[47:21.60]I think the most
+[47:22.60]Powerful feature of elicit
+[47:23.60]Is the ability to
+[47:24.60]Extract
+[47:25.60]Add columns to this table
+[47:26.60]Which effectively
+[47:27.60]Extracts data
+[47:28.60]From all of your
+[47:29.60]Papers at once
+[47:30.60]It's well used
+[47:31.60]But there are
+[47:32.60]Kind of many different
+[47:33.60]Extensions of that
+[47:34.60]We let you
+[47:35.60]Give a description
+[47:36.60]Of the column
+[47:37.60]We let you give instructions
+[47:38.60]Of a column
+[47:39.60]We let you create custom
+[47:40.60]Column
+[47:41.60]So we have like 30
+[47:42.60]Plus predefined fields
+[47:43.60]That users can extract
+[47:44.60]Like what were the methods
+[47:45.60]What were the main findings
+[47:46.60]How many people were studied
+[47:48.60]And we actually show
+[47:49.60]You basically the prompts
+[47:50.60]That we're using to
+[47:51.60]Extract that from
+[47:52.60]Our predefined fields
+[47:53.60]And then you can fork this
+[47:54.60]And you can say
+[47:55.60]Oh, actually I don't care
+[47:56.60]About the population of people
+[47:57.60]I only care about
+[47:58.60]The population of rats
+[47:59.60]Like you can change
+[48:00.60]The instructions
+[48:01.60]So I think users
+[48:02.60]Are still kind of discovering
+[48:03.60]This predefined
+[48:04.60]Easy to use default
+[48:06.60]But that they can extend it
+[48:07.60]To be much more
+[48:08.60]Specific to them
+[48:09.60]And then they can also ask
+[48:10.60]Custom questions
+[48:11.60]One use case of that
+[48:12.60]Is you can start to
+[48:13.60]Create different column types
+[48:14.60]That you might not expect
+[48:15.60]So rather than just
+[48:16.60]Creating generative answers
+[48:17.60]Like a description
+[48:18.60]Of the methodology
+[48:19.60]You can say
+[48:20.60]Classify the methodology
+[48:22.60]Into a prospective study
+[48:23.60]A retrospective study
+[48:24.60]Or a case study
+[48:26.60]And then you can filter
+[48:27.60]Based on that
+[48:28.60]It's like all using
+[48:29.60]The same kind of technology
+[48:30.60]And the interface
+[48:31.60]But it unlocks
+[48:32.60]So I think that
+[48:33.60]The ability to ask
+[48:34.60]Custom questions
+[48:35.60]Give instructions
+[48:36.60]And specifically use
+[48:37.60]That to create different
+[48:38.60]Types of columns
+[48:39.60]Like classification columns
+[48:41.60]Is still pretty underrated
+[48:42.60]In terms of use case
+[48:44.60]I spoke to someone
+[48:45.60]Who works in medical affairs
+[48:47.60]At a genomic sequencing
+[48:48.60]Company recently
+[48:49.60]So you know
+[48:50.60]The doctors kind of order
+[48:52.60]These genomic tests
+[48:53.60]These sequencing tests
+[48:54.60]To kind of identify
+[48:55.60]If a patient has
+[48:56.60]A particular disease
+[48:57.60]This company helps
+[48:58.60]And process it
+[48:59.60]And this person
+[49:00.60]Basically interacts
+[49:01.60]With all the doctors
+[49:02.60]And if the doctors
+[49:03.60]Have any questions
+[49:04.60]My understanding is that
+[49:05.60]Medical affairs
+[49:06.60]Is kind of like customer
+[49:07.60]Support or customer success
+[49:08.60]In pharma
+[49:09.60]So this person
+[49:10.60]Talks to doctors all day long
+[49:11.60]And one of the things
+[49:12.60]They started using elicit for
+[49:13.60]Is like putting the results
+[49:14.60]Of their tests
+[49:15.60]As a query
+[49:17.60]Like this test showed
+[49:18.60]You know this percentage
+[49:19.60]Presence of this
+[49:20.60]And 40% that
+[49:21.60]And whatever
+[49:22.60]You know what genes are present
+[49:23.60]Here or within this sample
+[49:25.60]And getting kind of
+[49:26.60]A list of academic papers
+[49:27.60]That would support their findings
+[49:29.60]And using this to help
+[49:30.60]The doctors
+[49:31.60]Interpret their tests
+[49:32.60]So we talked about
+[49:33.60]Okay cool
+[49:34.60]Like if we built
+[49:35.60]He's pretty interested
+[49:36.60]In kind of doing a survey
+[49:37.60]Of infectious disease
+[49:38.60]Specialists
+[49:39.60]And getting them
+[49:40.60]To evaluate
+[49:41.60]You know having them
+[49:42.60]Right up their answers
+[49:43.60]Comparing it to elicit
+[49:44.60]Answers trying to see
+[49:45.60]Can elicit start being
+[49:46.60]Used to interpret
+[49:47.60]The results of
+[49:48.60]These diagnostic tests
+[49:49.60]Because the way
+[49:50.60]They ship these tests
+[49:51.60]To doctors
+[49:52.60]Is they report
+[49:53.60]On a really wide
+[49:54.60]Array of things
+[49:55.60]He was saying
+[49:56.60]That at a large
+[49:57.60]Well resourced hospital
+[49:58.60]Like a city hospital
+[49:59.60]There might be
+[50:00.60]A team of infectious disease
+[50:01.60]Specialists who can
+[50:02.60]Help interpret
+[50:03.60]These results
+[50:04.60]But at underresourced
+[50:05.60]Hospitals or more
+[50:06.60]Rural hospitals
+[50:07.60]The primary care physician
+[50:08.60]Can't interpret
+[50:09.60]The test results
+[50:10.60]So then they can't order
+[50:11.60]They can't use it
+[50:12.60]They can't help
+[50:13.60]The patients with it
+[50:14.60]So thinking about
+[50:15.60]An evidence backed way
+[50:16.60]Of interpreting these tests
+[50:17.60]Definitely kind of
+[50:18.60]An extension of the product
+[50:19.60]That I hadn't considered
+[50:20.60]Before
+[50:21.60]But yeah the idea of
+[50:22.60]Using that to bring
+[50:23.60]More access to physicians
+[50:24.60]In all different parts
+[50:25.60]Of the country
+[50:26.60]And helping them
+[50:27.60]Interpret complicated
+[50:28.60]We are kenjun
+[50:29.60]From mv1
+[50:30.60]On the podcast
+[50:31.60]And we talked about
+[50:32.60]Better allocating
+[50:33.60]Scientific resources
+[50:34.60]How do you think about
+[50:35.60]These use cases
+[50:36.60]And maybe
+[50:37.60]How illicit
+[50:38.60]Can help drive
+[50:39.60]More research
+[50:40.60]And do you see
+[50:41.60]A world in which
+[50:42.60]You know maybe the models
+[50:43.60]Actually do
+[50:44.60]Some of the research
+[50:45.60]Before suggesting us
+[50:46.60]Yeah I think
+[50:47.60]That's like
+[50:48.60]Very close to
+[50:49.60]What we care about
+[50:50.60]Our product values
+[50:51.60]Are systematic
+[50:52.60]Transparent and unbounded
+[50:53.60]And I think
+[50:54.60]You make research
+[50:55.60]Especially more systematic
+[50:56.60]And unbounded
+[50:57.60]And here's
+[50:58.60]The thing
+[50:59.60]That's at stake here
+[51:00.60]So for example
+[51:01.60]I was
+[51:02.60]Recently talking
+[51:03.60]To people in longevity
+[51:04.60]And I think
+[51:05.60]There isn't really
+[51:06.60]One field of longevity
+[51:07.60]There are kind of
+[51:08.60]Different
+[51:09.60]Scientific subdomains
+[51:10.60]That are surfacing
+[51:11.60]Various things
+[51:12.60]That are related
+[51:13.60]To longevity
+[51:14.60]And I think
+[51:14.60]If you could
+[51:15.60]More systematically
+[51:16.60]Say look
+[51:17.60]Here all the different
+[51:18.60]Interventions
+[51:19.60]We could do
+[51:20.60]And here's
+[51:21.60]The expected
+[51:22.60]RI of these experiments
+[51:23.60]Here's like
+[51:24.60]The evidence so far
+[51:25.60]That supports
+[51:26.60]So much more systematic
+[51:27.60]Than
+[51:28.60]Sciences today
+[51:29.60]I'd guess in like
+[51:30.60]10 20 years we'll look back
+[51:31.60]And it will be
+[51:32.60]Incredible how
+[51:33.60]Unsystematic science
+[51:34.60]Was back in the day
+[51:35.60]Our views kind of
+[51:36.60]Have models
+[51:37.60]Catch up to expert humans today
+[51:39.60]Start with kind of
+[51:40.60]Novice humans
+[51:41.60]And then increasingly
+[51:42.60]Expert humans
+[51:43.60]But we really want
+[51:44.60]The models to earn
+[51:45.60]Their right to the expertise
+[51:47.60]So that's why we do
+[51:48.60]Things in this very step-by-step way
+[51:49.60]That's why we don't
+[51:50.60]Just like throw a bunch of data
+[51:51.60]And apply a bunch of compute
+[51:52.60]And hope we get good results
+[51:54.60]But obviously at some point
+[51:55.60]It's kind of
+[51:56.60]Earned its stripes
+[51:57.60]It can surpass
+[51:58.60]Human researchers
+[51:59.60]But I think that's where
+[52:00.60]Making sure
+[52:01.60]That the models
+[52:02.60]Processes are really
+[52:03.60]Explicit and transparent
+[52:05.60]And that it's really
+[52:06.60]Easy to evaluate
+[52:07.60]Is important because
+[52:08.60]If it does surpass
+[52:09.60]Human understanding
+[52:10.60]People will still need
+[52:11.60]To be able to audit
+[52:12.60]It's work somehow
+[52:13.60]Or spot check
+[52:14.60]It's work somehow
+[52:15.60]To be able to reliably
+[52:16.60]Trust it and use it
+[52:17.60]So yeah
+[52:18.60]That's kind of why
+[52:19.60]The process-based approaches
+[52:20.60]Is really important
+[52:21.60]And on the question
+[52:22.60]Of will models
+[52:23.60]Do their own research
+[52:24.60]Teachers that models
+[52:25.60]Currently don't have
+[52:26.60]That will need
+[52:27.60]To be better there
+[52:28.60]Is better world models
+[52:30.60]I think currently models
+[52:31.60]Are just not great
+[52:32.60]At representing
+[52:33.60]What's going on
+[52:34.60]In a particular situation
+[52:35.60]Or domain in a way
+[52:36.60]That allows them to
+[52:37.60]Come to interesting
+[52:38.60]Surprising conclusions
+[52:40.60]I think they're very good
+[52:41.60]At coming to conclusions
+[52:42.60]That are nearby
+[52:43.60]To conclusions
+[52:44.60]That people have come to
+[52:45.60]Not as good
+[52:46.60]At kind of reasoning
+[52:47.60]And making
+[52:48.60]Surprising connections maybe
+[52:49.60]And so having
+[52:50.60]Deeper models of
+[52:52.60]What are the underlying
+[52:53.60]Domains
+[52:54.60]How are they related
+[52:55.60]Or not related
+[52:56.60]I think there will be
+[52:57.60]An important ingredient
+[52:58.60]From all to actually
+[52:59.60]Being able to make
+[53:00.60]Novel contributions
+[53:01.60]On the topic of
+[53:02.60]Hiring more expert humans
+[53:03.60]You've hired some
+[53:04.60]Very expert humans
+[53:05.60]My friend Maggie
+[53:06.60]Appleton joined you guys
+[53:07.60]I think maybe
+[53:08.60]A year ago-ish
+[53:09.60]In fact, I think
+[53:10.60]You're doing an offsite
+[53:11.60]And we're actually
+[53:12.60]Organizing our big
+[53:13.60]AI UX meetup around
+[53:14.60]Whenever she's
+[53:15.60]In town in San Francisco
+[53:16.60]How big is the team
+[53:17.60]How have you sort of
+[53:18.60]Transition your company
+[53:19.60]Into this sort of PBC
+[53:20.60]And sort of the plan
+[53:21.60]For the future
+[53:22.60]About half of us
+[53:23.60]Are in the Bay Area
+[53:24.60]And then distributed
+[53:25.60]Across US and Europe
+[53:26.60]A mix of mostly kind
+[53:28.60]Of roles in engineering
+[53:29.60]And product
+[53:30.60]And I think that
+[53:31.60]The transition to
+[53:32.60]PBC was really
+[53:33.60]Not that eventful
+[53:34.60]Because I think
+[53:35.60]We were already
+[53:36.60]Even as a nonprofit
+[53:37.60]We were already
+[53:38.60]Shipping every week
+[53:39.60]So very much
+[53:40.60]Operating as a product
+[53:41.60]And then I would say
+[53:43.60]The kind of PBC component
+[53:44.60]Was to very explicitly
+[53:46.60]Stay that we have
+[53:47.60]A mission that we care
+[53:48.60]A lot about
+[53:49.60]There are a lot of ways
+[53:50.60]To make money
+[53:51.60]We make us
+[53:52.60]A lot of money
+[53:53.60]But we are going
+[53:54.60]To be opinionated
+[53:55.60]About how we make money
+[53:56.60]We're going to take
+[53:57.60]The version of making
+[53:58.60]A lot of money
+[53:59.60]That's in line
+[54:00.60]With our mission
+[54:01.60]But it's like
+[54:02.60]All very convergent
+[54:03.60]Alicit is not going
+[54:04.60]To make any money
+[54:05.60]If it's a bad product
+[54:06.60]If it doesn't actually
+[54:07.60]Help you discover truth
+[54:08.60]And do research
+[54:09.60]More rigorously
+[54:10.60]So I think for us
+[54:11.60]The kind of mission
+[54:12.60]And the success
+[54:13.60]Of the company
+[54:14.60]Are very intertwined
+[54:15.60]We're hoping to grow
+[54:16.60]The team quite a lot
+[54:17.60]This year
+[54:18.60]Probably some of our
+[54:19.60]Highest priority roles
+[54:20.60]In marketing
+[54:21.60]Go to market
+[54:22.60]Do you want to talk
+[54:23.60]About their roles?
+[54:24.60]Yeah, broadly
+[54:25.60]We're just looking
+[54:26.60]For senior software engineers
+[54:27.60]And don't need
+[54:28.60]Any particular AI expertise
+[54:29.60]A lot of it is just
+[54:30.60]How do you
+[54:31.60]Build good orchestration
+[54:33.60]For complex tasks
+[54:34.60]So we talked earlier
+[54:35.60]About these notebooks
+[54:36.60]Scaling up
+[54:37.60]Task orchestration
+[54:38.60]And I think a lot
+[54:39.60]Of this looks more
+[54:40.60]Like traditional
+[54:41.60]Soft engineering
+[54:42.60]Than it does look
+[54:43.60]Like machine learning
+[54:44.60]Research and I think
+[54:45.60]The people who are
+[54:46.60]Like really good at
+[54:47.60]Building good abstractions
+[54:48.60]Building applications
+[54:49.60]We've survived
+[54:50.60]Even if some
+[54:51.60]Of their pieces break
+[54:52.60]Like making reliable
+[54:53.60]Components out of
+[54:54.60]Unreliable pieces
+[54:55.60]I think those are the
+[54:56.60]People we're looking for
+[54:57.60]You know that's exactly
+[54:58.60]What I used to do
+[54:59.60]Have you explored
+[55:00.60]The existing orchestration
+[55:01.60]Frameworks, Temporal, Airflow
+[55:03.60]Daxter, Prefects
+[55:05.60]We've looked into
+[55:06.60] Them a little bit
+[55:07.60]I think we have
+[55:08.60]Some specific requirements
+[55:09.60]Around being able
+[55:10.60]To stream work back
+[55:11.60]Very quickly
+[55:12.60]To our users
+[55:13.60]Those could definitely
+[55:14.60]Be relevant
+[55:15.60]Okay, well you're hiring
+[55:16.60]I'm sure we'll plug
+[55:17.60]All the links
+[55:18.60]And parting words
+[55:19.60]Any words of wisdom
+[55:20.60]Models you live by
+[55:22.60]I think it's a really important
+[55:23.60]Time for humanity
+[55:24.60]So I hope everyone
+[55:25.60]Listening to this podcast
+[55:27.60]Can think hard about exactly
+[55:29.60]How they want to
+[55:30.60]Participate in this story
+[55:31.60]There's so much to build
+[55:33.60]And we can be really
+[55:34.60]Intentional about what
+[55:35.60]We align ourselves with
+[55:36.60]There are a lot of applications
+[55:38.60]That are going to be really good
+[55:39.60]For the world
+[55:39.60]And a lot of applications
+[55:40.60]That are not
+[55:41.60]And so yeah
+[55:42.60]I hope people can
+[55:43.60]Take that seriously
+[55:44.60]And kind of seize the moment
+[55:45.60]Yeah, I love how intentional
+[55:46.60]You guys have been
+[55:47.60]Thank you for sharing
+[55:48.60]Thank you
+[55:49.60]Thank you for coming on
+[55:50.60](音乐)
+[55:52.60](音樂)
+[55:54.60](音樂)
+[55:56.60](音樂)
+[55:58.60](音樂)
+[56:00.60](音樂)
+[56:03.60](音樂)
+[56:06.60](音樂)
+[56:09.60](音樂)
+[56:11.60](音樂)
+[56:13.60](音樂)
+[56:15.60]中文字幕:J Chong
+[56:16.60]我只想要你和我一起去做一件事
diff --git a/content/post/Latent Space/Latent-Space-Supervise-the-Process-of-AI-Research-—-with-Jungwon-Byun-and-Andreas-Stuhlmüller-of-Elicit.md b/content/post/Latent Space/Latent-Space-Supervise-the-Process-of-AI-Research-—-with-Jungwon-Byun-and-Andreas-Stuhlmüller-of-Elicit.md
new file mode 100644
index 0000000..2754eed
--- /dev/null
+++ b/content/post/Latent Space/Latent-Space-Supervise-the-Process-of-AI-Research-—-with-Jungwon-Byun-and-Andreas-Stuhlmüller-of-Elicit.md
@@ -0,0 +1,5779 @@
+---
+title: Supervise the Process of AI Research — with Jungwon Byun and Andreas Stuhlmüller of Elicit
+author: Latent Space
+date: Thu, 11 Apr 2024 20:15:27 GMT
+draft: false
+summary: Maggie, Linus, Geoffrey, and the LS crew are reuniting for our second annual AI UX demo day in SF on Apr 28. Sign up to demo here! And don’t forget tickets for the AI Engineer World’s Fair — for early...
+categories: [Latent Space]
+---
+
+{{< aplayer name="Supervise the Process of AI Research — with Jungwon Byun and Andreas Stuhlmüller of Elicit" artist="Latent Space" url="https://chrt.fm/track/ABF6EF/api.substack.com/feed/podcast/143400989/eda46c42a59e2d1d2ebd5758e813769e.mp3" cover="https://substackcdn.com/feed/podcast/1084089/post/143400989/2cce9d2efca390273a552338d61af6b6.jpg" lrc-folded=true lrc-type=3 lrc="../Latent-Space-Supervise-the-Process-of-AI-Research-—-with-Jungwon-Byun-and-Andreas-Stuhlmüller-of-Elicit.lrc" >}}{{< /aplayer >}}
+
+------
+
+Maggie, Linus, Geoffrey, and the LS crew are reuniting for our second annual AI UX demo day in SF on Apr 28. Sign up to demo here! And don’t forget tickets for the AI Engineer World’s Fair — for early birds who join before keynote announcements!
It’s become fashionable for many AI startups to project themselves as “the next Google” - while the search engine is so 2000s, both Perplexity and Exa referred to themselves as a “research engine” or “answer engine” in our NeurIPS pod. However these searches tend to be relatively shallow, and it is challenging to zoom up and down the ladders of abstraction to garner insights. For serious researchers, this level of simple one-off search will not cut it.
We’ve commented in our Jan 2024 Recap that Flow Engineering (simply; multi-turn processes over many-shot single prompts) seems to offer far more performance, control and reliability for a given cost budget. Our experiments with Devin and our understanding of what the new Elicit Notebooks offer a glimpse into the potential for very deep, open ended, thoughtful human-AI collaboration at scale.
It starts with prompts
When ChatGPT exploded in popularity in November 2022 everyone was turned into a prompt engineer. While generative models were good at "vibe based" outcomes (tell me a joke, write a poem, etc) with basic prompts, they struggled with more complex questions, especially in symbolic fields like math, logic, etc. Two of the most important "tricks" that people picked up on were:
* Chain of Thought prompting strategy proposed by Wei et al in the “Chain-of-Thought Prompting Elicits Reasoning in Large Language Models”. Rather than doing traditional few-shot prompting with just question and answers, adding the thinking process that led to the answer resulted in much better outcomes.
* Adding "Let's think step by step" to the prompt as a way to boost zero-shot reasoning, which was popularized by Kojima et al in the Large Language Models are Zero-Shot Reasoners paper from NeurIPS 2022. This bumped accuracy from 17% to 79% compared to zero-shot.
Nowadays, prompts include everything from promises of monetary rewards to… whatever the Nous folks are doing to turn a model into a world simulator. At the end of the day, the goal of prompt engineering is increasing accuracy, structure, and repeatability in the generation of a model.
From prompts to agents
As prompt engineering got more and more popular, agents (see “The Anatomy of Autonomy”) took over Twitter with cool demos and AutoGPT became the fastest growing repo in Github history. The thing about AutoGPT that fascinated people was the ability to simply put in an objective without worrying about explaining HOW to achieve it, or having to write very sophisticated prompts. The system would create an execution plan on its own, and then loop through each task.
The problem with open-ended agents like AutoGPT is that 1) it’s hard to replicate the same workflow over and over again 2) there isn’t a way to hard-code specific steps that the agent should take without actually coding them yourself, which isn’t what most people want from a product.
From agents to products
Prompt engineering and open-ended agents were great in the experimentation phase, but this year more and more of these workflows are starting to become polished products.
Today’s guests are Andreas Stuhlmüller and Jungwon Byun of Elicit (previously Ought), an AI research assistant that they think of as “the best place to understand what is known”.
Ought was a non-profit, but last September, Elicit spun off into a PBC with a $9m seed round. It is hard to quantify how much a workflow can be improved, but Elicit boasts some impressive numbers for research assistants:
Just four months after launch, Elicit crossed $1M ARR, which shows how much interest there is for AI products that just work.
One of the main takeaways we had from the episode is how teams should focus on supervising the process, not the output. Their philosophy at Elicit isn’t to train general models, but to train models that are extremely good at focusing processes.
This allows them to have pre-created steps that the user can add to their workflow (like classifying certain features that are specific to their research field) without having to write a prompt for it. And for Hamel Husain’s happiness, they always show you the underlying prompt.
Elicit recently announced notebooks as a new interface to interact with their products: (fun fact, they tried to implement this 4 times before they landed on the right UX! We discuss this ~33:00 in the podcast)
The reasons why they picked notebooks as a UX all tie back to process:
* They are systematic; once you have a instruction/prompt that works on a paper, you can run hundreds of papers through the same workflow by creating a column. Notebooks can also be edited and exported at any point during the flow.
* They are transparent - Many papers include an opaque literature review as perfunctory context before getting to their novel contribution. But PDFs are “dead” and it is difficult to follow the thought process and exact research flow of the authors. Sharing “living” Elicit Notebooks opens up this process.
* They are unbounded - Research is an endless stream of rabbit holes. So it must be easy to dive deeper and follow up with extra steps, without losing the ability to surface for air.
We had a lot of fun recording this, and hope you have as much fun listening!
AI UX in SF
Long time Latent Spacenauts might remember our first AI UX meetup with Linus Lee, Geoffrey Litt, and Maggie Appleton last year. Well, Maggie has since joined Elicit, and they are all returning at the end of this month!
Sign up here: https://lu.ma/aiux
And submit demos here! https://forms.gle/iSwiesgBkn8oo4SS8
We expect the 200 seats to “sell out” fast. Attendees with demos will be prioritized.
Show Notes
* Elicit
* Ought (their previous non-profit)
* “Pivoting” with GPT-4
* Elicit notebooks launch
* Charlie
* Andreas’ Blog
Timestamps
* [00:00:00] Introductions
* [00:07:45] How Johan and Andreas Joined Forces to Create Elicit
* [00:10:26] Why Products > Research
* [00:15:49] The Evolution of Elicit's Product
* [00:19:44] Automating Literature Review Workflow
* [00:22:48] How GPT-3 to GPT-4 Changed Things
* [00:25:37] Managing LLM Pricing and Performance
* [00:31:07] Open vs. Closed: Elicit's Approach to Model Selection
* [00:31:56] Moving to Notebooks
* [00:39:11] Elicit's Budget for Model Queries and Evaluations
* [00:41:44] Impact of Long Context Windows
* [00:47:19] Underrated Features and Surprising Applications
* [00:51:35] Driving Systematic and Efficient Research
* [00:53:00] Elicit's Team Growth and Transition to a Public Benefit Corporation
* [00:55:22] Building AI for Good
Full Interview on YouTube
As always, a plug for our youtube version for the 80% of communication that is nonverbal:
Transcript
Alessio [00:00:00]: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO at Residence at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol AI.
Swyx [00:00:15]: Hey, and today we are back in the studio with Andreas and Jungwon from Elicit. Welcome.
Jungwon [00:00:20]: Thanks guys.
Andreas [00:00:21]: It's great to be here.
Swyx [00:00:22]: Yeah. So I'll introduce you separately, but also, you know, we'd love to learn a little bit more about you personally. So Andreas, it looks like you started Elicit first, Jungwon joined later.
Andreas [00:00:32]: That's right. For all intents and purposes, the Elicit and also the Ought that existed before then were very different from what I started. So I think it's like fair to say that you co-founded it.
Swyx [00:00:43]: Got it. And Jungwon, you're a co-founder and COO of Elicit now.
Jungwon [00:00:46]: Yeah, that's right.
Swyx [00:00:47]: So there's a little bit of a history to this. I'm not super aware of like the sort of journey. I was aware of OTT and Elicit as sort of a nonprofit type situation. And recently you turned into like a B Corp, Public Benefit Corporation. So yeah, maybe if you want, you could take us through that journey of finding the problem. You know, obviously you're working together now. So like, how do you get together to decide to leave your startup career to join him?
Andreas [00:01:10]: Yeah, it's truly a very long journey. I guess truly, it kind of started in Germany when I was born. So even as a kid, I was always interested in AI, like I kind of went to the library. There were books about how to write programs in QBasic and like some of them talked about how to implement chatbots.
Jungwon [00:01:27]: To be clear, he grew up in like a tiny village on the outskirts of Munich called Dinkelschirben, where it's like a very, very idyllic German village.
Andreas [00:01:36]: Yeah, important to the story. So basically, the main thing is I've kind of always been thinking about AI my entire life and been thinking about, well, at some point, this is going to be a huge deal. It's going to be transformative. How can I work on it? And was thinking about it from when I was a teenager, after high school did a year where I started a startup with the intention to become rich. And then once I'm rich, I can affect the trajectory of AI. Did not become rich, decided to go back to college and study cognitive science there, which was like the closest thing I could find at the time to AI. In the last year of college, moved to the US to do a PhD at MIT, working on broadly kind of new programming languages for AI because it kind of seemed like the existing languages were not great at expressing world models and learning world models doing Bayesian inference. Was always thinking about, well, ultimately, the goal is to actually build tools that help people reason more clearly, ask and answer better questions and make better decisions. But for a long time, it seemed like the technology to put reasoning in machines just wasn't there. Initially, at the end of my postdoc at Stanford, I was thinking about, well, what to do? I think the standard path is you become an academic and do research. But it's really hard to actually build interesting tools as an academic. You can't really hire great engineers. Everything is kind of on a paper-to-paper timeline. And so I was like, well, maybe I should start a startup, pursued that for a little bit. But it seemed like it was too early because you could have tried to do an AI startup, but probably would not have been this kind of AI startup we're seeing now. So then decided to just start a nonprofit research lab that's going to do research for a while until we better figure out how to do thinking in machines. And that was odd. And then over time, it became clear how to actually build actual tools for reasoning. And only over time, we developed a better way to... I'll let you fill in some of the details here.
Jungwon [00:03:26]: Yeah. So I guess my story maybe starts around 2015. I kind of wanted to be a founder for a long time, and I wanted to work on an idea that stood the test of time for me, like an idea that stuck with me for a long time. And starting in 2015, actually, originally, I became interested in AI-based tools from the perspective of mental health. So there are a bunch of people around me who are really struggling. One really close friend in particular is really struggling with mental health and didn't have any support, and it didn't feel like there was anything before kind of like getting hospitalized that could just help her. And so luckily, she came and stayed with me for a while, and we were just able to talk through some things. But it seemed like lots of people might not have that resource, and something maybe AI-enabled could be much more scalable. I didn't feel ready to start a company then, that's 2015. And I also didn't feel like the technology was ready. So then I went into FinTech and kind of learned how to do the tech thing. And then in 2019, I felt like it was time for me to just jump in and build something on my own I really wanted to create. And at the time, I looked around at tech and felt like not super inspired by the options. I didn't want to have a tech career ladder, or I didn't want to climb the career ladder. There are two kind of interesting technologies at the time, there was AI and there was crypto. And I was like, well, the AI people seem like a little bit more nice, maybe like slightly more trustworthy, both super exciting, but threw my bet in on the AI side. And then I got connected to Andreas. And actually, the way he was thinking about pursuing the research agenda at OTT was really compatible with what I had envisioned for an ideal AI product, something that helps kind of take down really complex thinking, overwhelming thoughts and breaks it down into small pieces. And then this kind of mission that we need AI to help us figure out what we ought to do was really inspiring, right? Yeah, because I think it was clear that we were building the most powerful optimizer of our time. But as a society, we hadn't figured out how to direct that optimization potential. And if you kind of direct tremendous amounts of optimization potential at the wrong thing, that's really disastrous. So the goal of OTT was make sure that if we build the most transformative technology of our lifetime, it can be used for something really impactful, like good reasoning, like not just generating ads. My background was in marketing, but like, so I was like, I want to do more than generate ads with this. But also if these AI systems get to be super intelligent enough that they are doing this really complex reasoning, that we can trust them, that they are aligned with us and we have ways of evaluating that they're doing the right thing. So that's what OTT did. We did a lot of experiments, you know, like I just said, before foundation models really like took off. A lot of the issues we were seeing were more in reinforcement learning, but we saw a future where AI would be able to do more kind of logical reasoning, not just kind of extrapolate from numerical trends. We actually kind of set up experiments with people where kind of people stood in as super intelligent systems and we effectively gave them context windows. So they would have to like read a bunch of text and one person would get less text and one person would get all the texts and the person with less text would have to evaluate the work of the person who could read much more. So like in a world we were basically simulating, like in 2018, 2019, a world where an AI system could read significantly more than you and you as the person who couldn't read that much had to evaluate the work of the AI system. Yeah. So there's a lot of the work we did. And from that, we kind of iterated on the idea of breaking complex tasks down into smaller tasks, like complex tasks, like open-ended reasoning, logical reasoning into smaller tasks so that it's easier to train AI systems on them. And also so that it's easier to evaluate the work of the AI system when it's done. And then also kind of, you know, really pioneered this idea, the importance of supervising the process of AI systems, not just the outcomes. So a big part of how Elicit is built is we're very intentional about not just throwing a ton of data into a model and training it and then saying, cool, here's like scientific output. Like that's not at all what we do. Our approach is very much like, what are the steps that an expert human does or what is like an ideal process as granularly as possible, let's break that down and then train AI systems to perform each of those steps very robustly. When you train like that from the start, after the fact, it's much easier to evaluate, it's much easier to troubleshoot at each point. Like where did something break down? So yeah, we were working on those experiments for a while. And then at the start of 2021, decided to build a product.
Swyx [00:07:45]: Do you mind if I, because I think you're about to go into more modern thought and Elicit. And I just wanted to, because I think a lot of people are in where you were like sort of 2018, 19, where you chose a partner to work with. Yeah. Right. And you didn't know him. Yeah. Yeah. You were just kind of cold introduced. A lot of people are cold introduced. Yeah. Never work with them. I assume you had a lot, a lot of other options, right? Like how do you advise people to make those choices?
Jungwon [00:08:10]: We were not totally cold introduced. So one of our closest friends introduced us. And then Andreas had written a lot on the OTT website, a lot of blog posts, a lot of publications. And I just read it and I was like, wow, this sounds like my writing. And even other people, some of my closest friends I asked for advice from, they were like, oh, this sounds like your writing. But I think I also had some kind of like things I was looking for. I wanted someone with a complimentary skillset. I want someone who was very values aligned. And yeah, that was all a good fit.
Andreas [00:08:38]: We also did a pretty lengthy mutual evaluation process where we had a Google doc where we had all kinds of questions for each other. And I think it ended up being around 50 pages or so of like various like questions and back and forth.
Swyx [00:08:52]: Was it the YC list? There's some lists going around for co-founder questions.
Andreas [00:08:55]: No, we just made our own questions. But I guess it's probably related in that you ask yourself, what are the values you care about? How would you approach various decisions and things like that?
Jungwon [00:09:04]: I shared like all of my past performance reviews. Yeah. Yeah.
Swyx [00:09:08]: And he never had any. No.
Andreas [00:09:10]: Yeah.
Swyx [00:09:11]: Sorry, I just had to, a lot of people are going through that phase and you kind of skipped over it. I was like, no, no, no, no. There's like an interesting story.
Jungwon [00:09:20]: Yeah.
Alessio [00:09:21]: Yeah. Before we jump into what a list it is today, the history is a bit counterintuitive. So you start with figuring out, oh, if we had a super powerful model, how would we align it? But then you were actually like, well, let's just build the product so that people can actually leverage it. And I think there are a lot of folks today that are now back to where you were maybe five years ago that are like, oh, what if this happens rather than focusing on actually building something useful with it? What clicked for you to like move into a list and then we can cover that story too.
Andreas [00:09:49]: I think in many ways, the approach is still the same because the way we are building illicit is not let's train a foundation model to do more stuff. It's like, let's build a scaffolding such that we can deploy powerful models to good ends. I think it's different now in that we actually have like some of the models to plug in. But if in 2017, we had had the models, we could have run the same experiments we did run with humans back then, just with models. And so in many ways, our philosophy is always, let's think ahead to the future of what models are going to exist in one, two years or longer. And how can we make it so that they can actually be deployed in kind of transparent, controllable
Jungwon [00:10:26]: ways? I think motivationally, we both are kind of product people at heart. The research was really important and it didn't make sense to build a product at that time. But at the end of the day, the thing that always motivated us is imagining a world where high quality reasoning is really abundant and AI is a technology that's going to get us there. And there's a way to guide that technology with research, but we can have a more direct effect through product because with research, you publish the research and someone else has to implement that into the product and the product felt like a more direct path. And we wanted to concretely have an impact on people's lives. Yeah, I think the kind of personally, the motivation was we want to build for people.
Swyx [00:11:03]: Yep. And then just to recap as well, like the models you were using back then were like, I don't know, would they like BERT type stuff or T5 or I don't know what timeframe we're talking about here.
Andreas [00:11:14]: I guess to be clear, at the very beginning, we had humans do the work. And then I think the first models that kind of make sense were TPT-2 and TNLG and like Yeah, early generative models. We do also use like T5 based models even now started with TPT-2.
Swyx [00:11:30]: Yeah, cool. I'm just kind of curious about like, how do you start so early? You know, like now it's obvious where to start, but back then it wasn't.
Jungwon [00:11:37]: Yeah, I used to nag Andreas a lot. I was like, why are you talking to this? I don't know. I felt like TPT-2 is like clearly can't do anything. And I was like, Andreas, you're wasting your time, like playing with this toy. But yeah, he was right.
Alessio [00:11:50]: So what's the history of what Elicit actually does as a product? You recently announced that after four months, you get to a million in revenue. Obviously, a lot of people use it, get a lot of value, but it would initially kind of like structured data extraction from papers. Then you had kind of like concept grouping. And today, it's maybe like a more full stack research enabler, kind of like paper understander platform. What's the definitive definition of what Elicit is? And how did you get here?
Jungwon [00:12:15]: Yeah, we say Elicit is an AI research assistant. I think it will continue to evolve. That's part of why we're so excited about building and research, because there's just so much space. I think the current phase we're in right now, we talk about it as really trying to make Elicit the best place to understand what is known. So it's all a lot about like literature summarization. There's a ton of information that the world already knows. It's really hard to navigate, hard to make it relevant. So a lot of it is around document discovery and processing and analysis. I really kind of want to import some of the incredible productivity improvements we've seen in software engineering and data science and into research. So it's like, how can we make researchers like data scientists of text? That's why we're launching this new set of features called Notebooks. It's very much inspired by computational notebooks, like Jupyter Notebooks, you know, DeepNode or Colab, because they're so powerful and so flexible. And ultimately, when people are trying to get to an answer or understand insight, they're kind of like manipulating evidence and information. Today, that's all packaged in PDFs, which are super brittle. So with language models, we can decompose these PDFs into their underlying claims and evidence and insights, and then let researchers mash them up together, remix them and analyze them together. So yeah, I would say quite simply, overall, Elicit is an AI research assistant. Right now we're focused on text-based workflows, but long term, really want to kind of go further and further into reasoning and decision making.
Alessio [00:13:35]: And when you say AI research assistant, this is kind of meta research. So researchers use Elicit as a research assistant. It's not a generic you-can-research-anything type of tool, or it could be, but like, what are people using it for today?
Andreas [00:13:49]: Yeah. So specifically in science, a lot of people use human research assistants to do things. You tell your grad student, hey, here are a couple of papers. Can you look at all of these, see which of these have kind of sufficiently large populations and actually study the disease that I'm interested in, and then write out like, what are the experiments they did? What are the interventions they did? What are the outcomes? And kind of organize that for me. And the first phase of understanding what is known really focuses on automating that workflow because a lot of that work is pretty rote work. I think it's not the kind of thing that we need humans to do. Language models can do it. And then if language models can do it, you can obviously scale it up much more than a grad student or undergrad research assistant would be able to do.
Jungwon [00:14:31]: Yeah. The use cases are pretty broad. So we do have a very large percent of our users are just using it personally or for a mix of personal and professional things. People who care a lot about health or biohacking or parents who have children with a kind of rare disease and want to understand the literature directly. So there is an individual kind of consumer use case. We're most focused on the power users. So that's where we're really excited to build. So Lissette was very much inspired by this workflow in literature called systematic reviews or meta-analysis, which is basically the human state of the art for summarizing scientific literature. And it typically involves like five people working together for over a year. And they kind of first start by trying to find the maximally comprehensive set of papers possible. So it's like 10,000 papers. And they kind of systematically narrow that down to like hundreds or 50 extract key details from every single paper. Usually have two people doing it, like a third person reviewing it. So it's like an incredibly laborious, time consuming process, but you see it in every single domain. So in science, in machine learning, in policy, because it's so structured and designed to be reproducible, it's really amenable to automation. So that's kind of the workflow that we want to automate first. And then you make that accessible for any question and make these really robust living summaries of science. So yeah, that's one of the workflows that we're starting with.
Alessio [00:15:49]: Our previous guest, Mike Conover, he's building a new company called Brightwave, which is an AI research assistant for financial research. How do you see the future of these tools? Does everything converge to like a God researcher assistant, or is every domain going to have its own thing?
Andreas [00:16:03]: I think that's a good and mostly open question. I do think there are some differences across domains. For example, some research is more quantitative data analysis, and other research is more high level cross domain thinking. And we definitely want to contribute to the broad generalist reasoning type space. Like if researchers are making discoveries often, it's like, hey, this thing in biology is actually analogous to like these equations in economics or something. And that's just fundamentally a thing that where you need to reason across domains. At least within research, I think there will be like one best platform more or less for this type of generalist research. I think there may still be like some particular tools like for genomics, like particular types of modules of genes and proteins and whatnot. But for a lot of the kind of high level reasoning that humans do, I think that is a more of a winner type all thing.
Swyx [00:16:52]: I wanted to ask a little bit deeper about, I guess, the workflow that you mentioned. I like that phrase. I see that in your UI now, but that's as it is today. And I think you were about to tell us about how it was in 2021 and how it may be progressed. How has this workflow evolved over time?
Jungwon [00:17:07]: Yeah. So the very first version of Elicit actually wasn't even a research assistant. It was a forecasting assistant. So we set out and we were thinking about, you know, what are some of the most impactful types of reasoning that if we could scale up, AI would really transform the world. We actually started with literature review, but we're like, oh, so many people are going to build literature review tools. So let's start there. So then we focused on geopolitical forecasting. So I don't know if you're familiar with like manifold or manifold markets. That kind of stuff. Before manifold. Yeah. Yeah. I'm not predicting relationships. We're predicting like, is China going to invade Taiwan?
Swyx [00:17:38]: Markets for everything.
Andreas [00:17:39]: Yeah. That's a relationship.
Swyx [00:17:41]: Yeah.
Jungwon [00:17:42]: Yeah. It's true. And then we worked on that for a while. And then after GPT-3 came out, I think by that time we realized that originally we were trying to help people convert their beliefs into probability distributions. And so take fuzzy beliefs, but like model them more concretely. And then after a few months of iterating on that, just realize, oh, the thing that's blocking people from making interesting predictions about important events in the world is less kind of on the probabilistic side and much more on the research side. And so that kind of combined with the very generalist capabilities of GPT-3 prompted us to make a more general research assistant. Then we spent a few months iterating on what even is a research assistant. So we would embed with different researchers. We built data labeling workflows in the beginning, kind of right off the bat. We built ways to find experts in a field and like ways to ask good research questions. So we just kind of iterated through a lot of workflows and no one else was really building at this time. And it was like very quick to just do some prompt engineering and see like what is a task that is at the intersection of what's technologically capable and like important for researchers. And we had like a very nondescript landing page. It said nothing. But somehow people were signing up and we had to sign a form that was like, why are you here? And everyone was like, I need help with literature review. And we're like, oh, literature review. That sounds so hard. I don't even know what that means. We're like, we don't want to work on it. But then eventually we were like, okay, everyone is saying literature review. It's overwhelmingly people want to-
Swyx [00:19:02]: And all domains, not like medicine or physics or just all domains. Yeah.
Jungwon [00:19:06]: And we also kind of personally knew literature review was hard. And if you look at the graphs for academic literature being published every single month, you guys know this in machine learning, it's like up into the right, like superhuman amounts of papers. So we're like, all right, let's just try it. I was really nervous, but Andreas was like, this is kind of like the right problem space to jump into, even if we don't know what we're doing. So my take was like, fine, this feels really scary, but let's just launch a feature every single week and double our user numbers every month. And if we can do that, we'll fail fast and we will find something. I was worried about like getting lost in the kind of academic white space. So the very first version was actually a weekend prototype that Andreas made. Do you want to explain how that worked?
Andreas [00:19:44]: I mostly remember that it was really bad. The thing I remember is you entered a question and it would give you back a list of claims. So your question could be, I don't know, how does creatine affect cognition? It would give you back some claims that are to some extent based on papers, but they were often irrelevant. The papers were often irrelevant. And so we ended up soon just printing out a bunch of examples of results and putting them up on the wall so that we would kind of feel the constant shame of having such a bad product and would be incentivized to make it better. And I think over time it has gotten a lot better, but I think the initial version was like really very bad. Yeah.
Jungwon [00:20:20]: But it was basically like a natural language summary of an abstract, like kind of a one sentence summary, and which we still have. And then as we learned kind of more about this systematic review workflow, we started expanding the capability so that you could extract a lot more data from the papers and do more with that.
Swyx [00:20:33]: And were you using like embeddings and cosine similarity, that kind of stuff for retrieval, or was it keyword based?
Andreas [00:20:40]: I think the very first version didn't even have its own search engine. I think the very first version probably used the Semantic Scholar or API or something similar. And only later when we discovered that API is not very semantic, we then built our own search engine that has helped a lot.
Swyx [00:20:58]: And then we're going to go into like more recent products stuff, but like, you know, I think you seem the more sort of startup oriented business person and you seem sort of more ideologically like interested in research, obviously, because of your PhD. What kind of market sizing were you guys thinking? Right? Like, because you're here saying like, we have to double every month. And I'm like, I don't know how you make that conclusion from this, right? Especially also as a nonprofit at the time.
Jungwon [00:21:22]: I mean, market size wise, I felt like in this space where so much was changing and it was very unclear what of today was actually going to be true tomorrow. We just like really rested a lot on very, very simple fundamental principles, which is like, if you can understand the truth, that is very economically beneficial and valuable. If you like know the truth.
Swyx [00:21:42]: On principle.
Jungwon [00:21:43]: Yeah. That's enough for you. Yeah. Research is the key to many breakthroughs that are very commercially valuable.
Swyx [00:21:47]: Because my version of it is students are poor and they don't pay for anything. Right? But that's obviously not true. As you guys have found out. But you had to have some market insight for me to have believed that, but you skipped that.
Andreas [00:21:58]: Yeah. I remember talking to VCs for our seed round. A lot of VCs were like, you know, researchers, they don't have any money. Why don't you build legal assistant? I think in some short sighted way, maybe that's true. But I think in the long run, R&D is such a big space of the economy. I think if you can substantially improve how quickly people find new discoveries or avoid controlled trials that don't go anywhere, I think that's just huge amounts of money. And there are a lot of questions obviously about between here and there. But I think as long as the fundamental principle is there, we were okay with that. And I guess we found some investors who also were. Yeah.
Swyx [00:22:35]: Congrats. I mean, I'm sure we can cover the sort of flip later. I think you're about to start us on like GPT-3 and how that changed things for you. It's funny. I guess every major GPT version, you have some big insight. Yeah.
Jungwon [00:22:48]: Yeah. I mean, what do you think?
Andreas [00:22:51]: I think it's a little bit less true for us than for others, because we always believed that there will basically be human level machine work. And so it is definitely true that in practice for your product, as new models come out, your product starts working better, you can add some features that you couldn't add before. But I don't think we really ever had the moment where we were like, oh, wow, that is super unanticipated. We need to do something entirely different now from what was on the roadmap.
Jungwon [00:23:21]: I think GPT-3 was a big change because it kind of said, oh, now is the time that we can use AI to build these tools. And then GPT-4 was maybe a little bit more of an extension of GPT-3. GPT-3 over GPT-2 was like qualitative level shift. And then GPT-4 was like, okay, great. Now it's like more accurate. We're more accurate on these things. We can answer harder questions. But the shape of the product had already taken place by that time.
Swyx [00:23:44]: I kind of want to ask you about this sort of pivot that you've made. But I guess that was just a way to sell what you were doing, which is you're adding extra features on grouping by concepts. The GPT-4 pivot, quote unquote pivot that you-
Jungwon [00:23:55]: Oh, yeah, yeah, exactly. Right, right, right. Yeah. Yeah. When we launched this workflow, now that GPT-4 was available, basically Elisa was at a place where we have very tabular interfaces. So given a table of papers, you can extract data across all the tables. But you kind of want to take the analysis a step further. Sometimes what you'd care about is not having a list of papers, but a list of arguments, a list of effects, a list of interventions, a list of techniques. And so that's one of the things we're working on is now that you've extracted this information in a more structured way, can you pivot it or group by whatever the information that you extracted to have more insight first information still supported by the academic literature?
Swyx [00:24:33]: Yeah, that was a big revelation when I saw it. Basically, I think I'm very just impressed by how first principles, your ideas around what the workflow is. And I think that's why you're not as reliant on like the LLM improving, because it's actually just about improving the workflow that you would recommend to people. Today we might call it an agent, I don't know, but you're not relying on the LLM to drive it. It's relying on this is the way that Elicit does research. And this is what we think is most effective based on talking to our users.
Jungwon [00:25:01]: The problem space is still huge. Like if it's like this big, we are all still operating at this tiny part, bit of it. So I think about this a lot in the context of moats, people are like, oh, what's your moat? What happens if GPT-5 comes out? It's like, if GPT-5 comes out, there's still like all of this other space that we can go into. So I think being really obsessed with the problem, which is very, very big, has helped us like stay robust and just kind of directly incorporate model improvements and they keep going.
Swyx [00:25:26]: And then I first encountered you guys with Charlie, you can tell us about that project. Basically, yeah. Like how much did cost become a concern as you're working more and more with OpenAI? How do you manage that relationship?
Jungwon [00:25:37]: Let me talk about who Charlie is. And then you can talk about the tech, because Charlie is a special character. So Charlie, when we found him was, had just finished his freshman year at the University of Warwick. And I think he had heard about us on some discord. And then he applied and we were like, wow, who is this freshman? And then we just saw that he had done so many incredible side projects. And we were actually on a team retreat in Barcelona visiting our head of engineering at that time. And everyone was talking about this wonder kid or like this kid. And then on our take home project, he had done like the best of anyone to that point. And so people were just like so excited to hire him. So we hired him as an intern and they were like, Charlie, what if you just dropped out of school? And so then we convinced him to take a year off. And he was just incredibly productive. And I think the thing you're referring to is at the start of 2023, Anthropic kind of launched their constitutional AI paper. And within a few days, I think four days, he had basically implemented that in production. And then we had it in app a week or so after that. And he has since kind of contributed to major improvements, like cutting costs down to a tenth of what they were really large scale. But yeah, you can talk about the technical stuff. Yeah.
Andreas [00:26:39]: On the constitutional AI project, this was for abstract summarization, where in illicit, if you run a query, it'll return papers to you, and then it will summarize each paper with respect to your query for you on the fly. And that's a really important part of illicit because illicit does it so much. If you run a few searches, it'll have done it a few hundred times for you. And so we cared a lot about this both being fast, cheap, and also very low on hallucination. I think if illicit hallucinates something about the abstract, that's really not good. And so what Charlie did in that project was create a constitution that expressed what are the attributes of a good summary? Everything in the summary is reflected in the actual abstract, and it's like very concise, et cetera, et cetera. And then used RLHF with a model that was trained on the constitution to basically fine tune a better summarizer on an open source model. Yeah. I think that might still be in use.
Jungwon [00:27:34]: Yeah. Yeah, definitely. Yeah. I think at the time, the models hadn't been trained at all to be faithful to a text. So they were just generating. So then when you ask them a question, they tried too hard to answer the question and didn't try hard enough to answer the question given the text or answer what the text said about the question. So we had to basically teach the models to do that specific task.
Swyx [00:27:54]: How do you monitor the ongoing performance of your models? Not to get too LLM-opsy, but you are one of the larger, more well-known operations doing NLP at scale. I guess effectively, you have to monitor these things and nobody has a good answer that I talk to.
Andreas [00:28:10]: I don't think we have a good answer yet. I think the answers are actually a little bit clearer on the just kind of basic robustness side of where you can import ideas from normal software engineering and normal kind of DevOps. You're like, well, you need to monitor kind of latencies and response times and uptime and whatnot.
Swyx [00:28:27]: I think when we say performance, it's more about hallucination rate, isn't it?
Andreas [00:28:30]: And then things like hallucination rate where I think there, the really important thing is training time. So we care a lot about having our own internal benchmarks for model development that reflect the distribution of user queries so that we can know ahead of time how well is the model going to perform on different types of tasks. So the tasks being summarization, question answering, given a paper, ranking. And for each of those, we want to know what's the distribution of things the model is going to see so that we can have well-calibrated predictions on how well the model is going to do in production. And I think, yeah, there's some chance that there's distribution shift and actually the things users enter are going to be different. But I think that's much less important than getting the kind of training right and having very high quality, well-vetted data sets at training time.
Jungwon [00:29:18]: I think we also end up effectively monitoring by trying to evaluate new models as they come out. And so that kind of prompts us to go through our eval suite every couple of months. And every time a new model comes out, we have to see how is this performing relative to production and what we currently have.
Swyx [00:29:32]: Yeah. I mean, since we're on this topic, any new models that have really caught your eye this year?
Jungwon [00:29:37]: Like Claude came out with a bunch. Yeah. I think Claude is pretty, I think the team's pretty excited about Claude. Yeah.
Andreas [00:29:41]: Specifically, Claude Haiku is like a good point on the kind of Pareto frontier. It's neither the cheapest model, nor is it the most accurate, most high quality model, but it's just like a really good trade-off between cost and accuracy.
Swyx [00:29:57]: You apparently have to 10-shot it to make it good. I tried using Haiku for summarization, but zero-shot was not great. Then they were like, you know, it's a skill issue, you have to try harder.
Jungwon [00:30:07]: I think GPT-4 unlocked tables for us, processing data from tables, which was huge. GPT-4 Vision.
Andreas [00:30:13]: Yeah.
Swyx [00:30:14]: Yeah. Did you try like Fuyu? I guess you can't try Fuyu because it's non-commercial. That's the adept model.
Jungwon [00:30:19]: Yeah.
Swyx [00:30:20]: We haven't tried that one. Yeah. Yeah. Yeah. But Claude is multimodal as well. Yeah. I think the interesting insight that we got from talking to David Luan, who is CEO of multimodality has effectively two different flavors. One is we recognize images from a camera in the outside natural world. And actually the more important multimodality for knowledge work is screenshots and PDFs and charts and graphs. So we need a new term for that kind of multimodality.
Andreas [00:30:45]: But is the claim that current models are good at one or the other? Yeah.
Swyx [00:30:50]: They're over-indexed because of the history of computer vision is Coco, right? So now we're like, oh, actually, you know, screens are more important, OCR, handwriting. You mentioned a lot of like closed model lab stuff, and then you also have like this open source model fine tuning stuff. Like what is your workload now between closed and open? It's a good question.
Andreas [00:31:07]: I think- Is it half and half? It's a-
Swyx [00:31:10]: Is that even a relevant question or not? Is this a nonsensical question?
Andreas [00:31:13]: It depends a little bit on like how you index, whether you index by like computer cost or number of queries. I'd say like in terms of number of queries, it's maybe similar. In terms of like cost and compute, I think the closed models make up more of the budget since the main cases where you want to use closed models are cases where they're just smarter, where no existing open source models are quite smart enough.
Jungwon [00:31:35]: Yeah. Yeah.
Alessio [00:31:37]: We have a lot of interesting technical questions to go in, but just to wrap the kind of like UX evolution, now you have the notebooks. We talked a lot about how chatbots are not the final frontier, you know? How did you decide to get into notebooks, which is a very iterative kind of like interactive interface and yeah, maybe learnings from that.
Jungwon [00:31:56]: Yeah. This is actually our fourth time trying to make this work. Okay. I think the first time was probably in early 2021. I think because we've always been obsessed with this idea of task decomposition and like branching, we always wanted a tool that could be kind of unbounded where you could keep going, could do a lot of branching where you could kind of apply language model operations or computations on other tasks. So in 2021, we had this thing called composite tasks where you could use GPT-3 to brainstorm a bunch of research questions and then take each research question and decompose those further into sub questions. This kind of, again, that like task decomposition tree type thing was always very exciting to us, but that was like, it didn't work and it was kind of overwhelming. Then at the end of 22, I think we tried again and at that point we were thinking, okay, we've done a lot with this literature review thing. We also want to start helping with kind of adjacent domains and different workflows. Like we want to help more with machine learning. What does that look like? And as we were thinking about it, we're like, well, there are so many research workflows. How do we not just build three new workflows into Elicit, but make Elicit really generic to lots of workflows? What is like a generic composable system with nice abstractions that can like scale to all these workflows? So we like iterated on that a bunch and then didn't quite narrow the problem space enough or like quite get to what we wanted. And then I think it was at the beginning of 2023 where we're like, wow, computational notebooks kind of enable this, where they have a lot of flexibility, but kind of robust primitives such that you can extend the workflow and it's not limited. It's not like you ask a query, you get an answer, you're done. You can just constantly keep building on top of that. And each little step seems like a really good unit of work for the language model. And also there was just like really helpful to have a bit more preexisting work to emulate. Yeah, that's kind of how we ended up at computational notebooks for Elicit.
Andreas [00:33:44]: Maybe one thing that's worth making explicit is the difference between computational notebooks and chat, because on the surface, they seem pretty similar. It's kind of this iterative interaction where you add stuff. In both cases, you have a back and forth between you enter stuff and then you get some output and then you enter stuff. But the important difference in our minds is with notebooks, you can define a process. So in data science, you can be like, here's like my data analysis process that takes in a CSV and then does some extraction and then generates a figure at the end. And you can prototype it using a small CSV and then you can run it over a much larger CSV later. And similarly, the vision for notebooks in our case is to not make it this like one-off chat interaction, but to allow you to then say, if you start and first you're like, okay, let me just analyze a few papers and see, do I get to the correct conclusions for those few papers? Can I then later go back and say, now let me run this over 10,000 papers now that I've debugged the process using a few papers. And that's an interaction that doesn't fit quite as well into the chat framework because that's more for kind of quick back and forth interaction.
Alessio [00:34:49]: Do you think in notebooks, it's kind of like structure, editable chain of thought, basically step by step? Like, is that kind of where you see this going? And then are people going to reuse notebooks as like templates? And maybe in traditional notebooks, it's like cookbooks, right? You share a cookbook, you can start from there. Is this similar in Elizit?
Andreas [00:35:06]: Yeah, that's exactly right. So that's our hope that people will build templates, share them with other people. I think chain of thought is maybe still like kind of one level lower on the abstraction hierarchy than we would think of notebooks. I think we'll probably want to think about more semantic pieces like a building block is more like a paper search or an extraction or a list of concepts. And then the model's detailed reasoning will probably often be one level down. You always want to be able to see it, but you don't always want it to be front and center.
Alessio [00:35:36]: Yeah, what's the difference between a notebook and an agent? Since everybody always asks me, what's an agent? Like how do you think about where the line is?
Andreas [00:35:44]: Yeah, it's an interesting question. In the notebook world, I would generally think of the human as the agent in the first iteration. So you have the notebook and the human kind of adds little action steps. And then the next point on this kind of progress gradient is, okay, now you can use language models to predict which action would you take as a human. And at some point, you're probably going to be very good at this, you'll be like, okay, in some cases I can, with 99.9% accuracy, predict what you do. And then you might as well just execute it, like why wait for the human? And eventually, as you get better at this, that will just look more and more like agents taking actions as opposed to you doing the thing. I think templates are a specific case of this where you're like, okay, well, there's just particular sequences of actions that you often want to chunk and have available as primitives, just like in normal programming. And those, you can view them as action sequences of agents, or you can view them as more normal programming language abstraction thing. And I think those are two valid views. Yeah.
Alessio [00:36:40]: How do you see this change as, like you said, the models get better and you need less and less human actual interfacing with the model, you just get the results? Like how does the UX and the way people perceive it change?
Jungwon [00:36:52]: Yeah, I think this kind of interaction paradigms for evaluation is not really something the internet has encountered yet, because up to now, the internet has all been about getting data and work from people. So increasingly, I really want kind of evaluation, both from an interface perspective and from like a technical perspective and operation perspective to be a superpower for Elicit, because I think over time, models will do more and more of the work, and people will have to do more and more of the evaluation. So I think, yeah, in terms of the interface, some of the things we have today, you know, for every kind of language model generation, there's some citation back, and we kind of try to highlight the ground truth in the paper that is most relevant to whatever Elicit said, and make it super easy so that you can click on it and quickly see in context and validate whether the text actually supports the answer that Elicit gave. So I think we'd probably want to scale things up like that, like the ability to kind of spot check the model's work super quickly, scale up interfaces like that. And-
Swyx [00:37:44]: Who would spot check? The user?
Jungwon [00:37:46]: Yeah, to start, it would be the user. One of the other things we do is also kind of flag the model's uncertainty. So we have models report out, how confident are you that this was the sample size of this study? The model's not sure, we throw a flag. And so the user knows to prioritize checking that. So again, we can kind of scale that up. So when the model's like, well, I searched this on Google, I'm not sure if that was the right thing. I have an uncertainty flag, and the user can go and be like, oh, okay, that was actually the right thing to do or not.
Swyx [00:38:10]: I've tried to do uncertainty readings from models. I don't know if you have this live. You do? Yeah. Because I just didn't find them reliable because they just hallucinated their own uncertainty. I would love to base it on log probs or something more native within the model rather than generated. But okay, it sounds like they scale properly for you. Yeah.
Jungwon [00:38:30]: We found it to be pretty calibrated. It varies on the model.
Andreas [00:38:32]: I think in some cases, we also use two different models for the uncertainty estimates than for the question answering. So one model would say, here's my chain of thought, here's my answer. And then a different type of model. Let's say the first model is Llama, and let's say the second model is GPT-3.5. And then the second model just looks over the results and is like, okay, how confident are you in this? And I think sometimes using a different model can be better than using the same model. Yeah.
Swyx [00:38:58]: On the topic of models, evaluating models, obviously you can do that all day long. What's your budget? Because your queries fan out a lot. And then you have models evaluating models. One person typing in a question can lead to a thousand calls.
Andreas [00:39:11]: It depends on the project. So if the project is basically a systematic review that otherwise human research assistants would do, then the project is basically a human equivalent spend. And the spend can get quite large for those projects. I don't know, let's say $100,000. In those cases, you're happier to spend compute then in the kind of shallow search case where someone just enters a question because, I don't know, maybe I heard about creatine. What's it about? Probably don't want to spend a lot of compute on that. This sort of being able to invest more or less compute into getting more or less accurate answers is I think one of the core things we care about. And that I think is currently undervalued in the AI space. I think currently you can choose which model you want and you can sometimes, I don't know, you'll tip it and it'll try harder or you can try various things to get it to work harder. But you don't have great ways of converting willingness to spend into better answers. And we really want to build a product that has this sort of unbounded flavor where if you care about it a lot, you should be able to get really high quality answers, really double checked in every way.
Alessio [00:40:14]: And you have a credits-based pricing. So unlike most products, it's not a fixed monthly fee.
Jungwon [00:40:19]: Right, exactly. So some of the higher costs are tiered. So for most casual users, they'll just get the abstract summary, which is kind of an open source model. Then you can add more columns, which have more extractions and these uncertainty features. And then you can also add the same columns in high accuracy mode, which also parses the table. So we kind of stack the complexity on the calls.
Swyx [00:40:39]: You know, the fun thing you can do with a credit system, which is data for data, basically you can give people more credits if they give data back to you. I don't know if you've already done that. We've thought about something like this.
Jungwon [00:40:49]: It's like if you don't have money, but you have time, how do you exchange that?
Swyx [00:40:54]: It's a fair trade.
Jungwon [00:40:55]: I think it's interesting. We haven't quite operationalized it. And then, you know, there's been some kind of like adverse selection. Like, you know, for example, it would be really valuable to get feedback on our model. So maybe if you were willing to give more robust feedback on our results, we could give you credits or something like that. But then there's kind of this, will people take it seriously? And you want the good people. Exactly.
Swyx [00:41:11]: Can you tell who are the good people? Not right now.
Jungwon [00:41:13]: But yeah, maybe at the point where we can, we can offer it. We can offer it up to them.
Swyx [00:41:16]: The perplexity of questions asked, you know, if it's higher perplexity, these are the smarter
Jungwon [00:41:20]: people. Yeah, maybe.
Andreas [00:41:23]: If you put typos in your queries, you're not going to get off the stage.
Swyx [00:41:28]: Negative social credit. It's very topical right now to think about the threat of long context windows. All these models that we're talking about these days, all like a million token plus. Is that relevant for you? Can you make use of that? Is that just prohibitively expensive because you're just paying for all those tokens or you're just doing rag?
Andreas [00:41:44]: It's definitely relevant. And when we think about search, as many people do, we think about kind of a staged pipeline of retrieval where first you use semantic search database with embeddings, get like the, in our case, maybe 400 or so most relevant papers. And then, then you still need to rank those. And I think at that point it becomes pretty interesting to use larger models. So specifically in the past, I think a lot of ranking was kind of per item ranking where you would score each individual item, maybe using increasingly expensive scoring methods and then rank based on the scores. But I think list-wise re-ranking where you have a model that can see all the elements is a lot more powerful because often you can only really tell how good a thing is in comparison to other things and what things should come first. It really depends on like, well, what other things that are available, maybe you even care about diversity in your results. You don't want to show 10 very similar papers as the first 10 results. So I think a long context models are quite interesting there. And especially for our case where we care more about power users who are perhaps a little bit more willing to wait a little bit longer to get higher quality results relative to people who just quickly check out things because why not? And I think being able to spend more on longer contexts is quite valuable.
Jungwon [00:42:55]: Yeah. I think one thing the longer context models changed for us is maybe a focus from breaking down tasks to breaking down the evaluation. So before, you know, if we wanted to answer a question from the full text of a paper, we had to figure out how to chunk it and like find the relevant chunk and then answer based on that chunk. And the nice thing was then, you know, kind of which chunk the model used to answer the question. So if you want to help the user track it, yeah, you can be like, well, this was the chunk that the model got. And now if you put the whole text in the paper, you have to like kind of find the chunk like more retroactively basically. And so you need kind of like a different set of abilities and obviously like a different technology to figure out. You still want to point the user to the supporting quotes in the text, but then the interaction is a little different.
Swyx [00:43:38]: You like scan through and find some rouge score floor.
Andreas [00:43:41]: I think there's an interesting space of almost research problems here because you would ideally make causal claims like if this hadn't been in the text, the model wouldn't have said this thing. And maybe you can do expensive approximations to that where like, I don't know, you just throw out chunk of the paper and re-answer and see what happens. But hopefully there are better ways of doing that where you just get that kind of counterfactual information for free from the model.
Alessio [00:44:06]: Do you think at all about the cost of maintaining REG versus just putting more tokens in the window? I think in software development, a lot of times people buy developer productivity things so that we don't have to worry about it. Context window is kind of the same, right? You have to maintain chunking and like REG retrieval and like re-ranking and all of this versus I just shove everything into the context and like it costs a little more, but at least I don't have to do all of that. Is that something you thought about?
Jungwon [00:44:31]: I think we still like hit up against context limits enough that it's not really, do we still want to keep this REG around? It's like we do still need it for the scale of the work that we're doing, yeah.
Andreas [00:44:41]: And I think there are different kinds of maintainability. In one sense, I think you're right that throw everything into the context window thing is easier to maintain because you just can swap out a model. In another sense, if things go wrong, it's harder to debug where like, if you know, here's the process that we go through to go from 200 million papers to an answer. And there are like little steps and you understand, okay, this is the step that finds the relevant paragraph or whatever it may be. You'll know which step breaks if the answers are bad, whereas if it's just like a new model version came out and now it suddenly doesn't find your needle in a haystack anymore, then you're like, okay, what can you do? You're kind of at a loss.
Alessio [00:45:21]: Let's talk a bit about, yeah, needle in a haystack and like maybe the opposite of it, which is like hard grounding. I don't know if that's like the best name to think about it, but I was using one of these chatwitcher documents features and I put the AMD MI300 specs and the new Blackwell chips from NVIDIA and I was asking questions and does the AMD chip support NVLink? And the response was like, oh, it doesn't say in the specs. But if you ask GPD 4 without the docs, it would tell you no, because NVLink it's a NVIDIA technology.
Swyx [00:45:49]: It just says in the thing.
Alessio [00:45:53]: How do you think about that? Does using the context sometimes suppress the knowledge that the model has?
Andreas [00:45:57]: It really depends on the task because I think sometimes that is exactly what you want. So imagine you're a researcher, you're writing the background section of your paper and you're trying to describe what these other papers say. You really don't want extra information to be introduced there. In other cases where you're just trying to figure out the truth and you're giving the documents because you think they will help the model figure out what the truth is. I think you do want, if the model has a hunch that there might be something that's not in the papers, you do want to surface that. I think ideally you still don't want the model to just tell you, probably the ideal thing looks a bit more like agent control where the model can issue a query that then is intended to surface documents that substantiate its hunch. That's maybe a reasonable middle ground between model just telling you and model being fully limited to the papers you give it.
Jungwon [00:46:44]: Yeah, I would say it's, they're just kind of different tasks right now. And the task that Elicit is mostly focused on is what do these papers say? But there's another task which is like, just give me the best possible answer and that give me the best possible answer sometimes depends on what do these papers say, but it can also depend on other stuff that's not in the papers. So ideally we can do both and then kind of do this overall task for you more going forward.
Alessio [00:47:08]: We see a lot of details, but just to zoom back out a little bit, what are maybe the most underrated features of Elicit and what is one thing that maybe the users surprise you the most by using it?
Jungwon [00:47:19]: I think the most powerful feature of Elicit is the ability to extract, add columns to this table, which effectively extracts data from all of your papers at once. It's well used, but there are kind of many different extensions of that that I think users are still discovering. So one is we let you give a description of the column. We let you give instructions of a column. We let you create custom columns. So we have like 30 plus predefined fields that users can extract, like what were the methods? What were the main findings? How many people were studied? And we actually show you basically the prompts that we're using to extract that from our predefined fields. And then you can fork this and you can say, oh, actually I don't care about the population of people. I only care about the population of rats. Like you can change the instruction. So I think users are still kind of discovering that there's both this predefined, easy to use default, but that they can extend it to be much more specific to them. And then they can also ask custom questions. One use case of that is you can start to create different column types that you might not expect. So instead of just creating generative answers, like a description of the methodology, you can say classify the methodology into a prospective study, a retrospective study, or a case study. And then you can filter based on that. It's like all using the same kind of technology and the interface, but it unlocks different workflows. So I think that the ability to ask custom questions, give instructions, and specifically use that to create different types of columns, like classification columns, is still pretty underrated. In terms of use case, I spoke to someone who works in medical affairs at a genomic sequencing company recently. So doctors kind of order these genomic tests, these sequencing tests, to kind of identify if a patient has a particular disease. This company helps them process it. And this person basically interacts with all the doctors and if the doctors have any questions. My understanding is that medical affairs is kind of like customer support or customer success in pharma. So this person like talks to doctors all day long. One of the things they started using Elicit for is like putting the results of their tests as the query. Like this test showed, you know, this percentage presence of this and 40% that and whatever, you know, what genes are present here or what's in this sample. And getting kind of a list of academic papers that would support their findings and using this to help doctors interpret their tests. So we talked about, okay, cool, like if we built, he's pretty interested in kind of doing a survey of infectious disease specialists and getting them to evaluate, you know, having them write up their answers, comparing it to Elicit's answers, trying to see can Elicit start being used to interpret the results of these diagnostic tests. Because the way they ship these tests to doctors is they report on a really wide array of things. He was saying that at a large, well-resourced hospital, like a city hospital, there might be a team of infectious disease specialists who can help interpret these results. But at under-resourced hospitals or more rural hospitals, the primary care physician can't interpret the test results, so then they can't order it, they can't use it, they can't help their patients with it. So thinking about an evidence-backed way of interpreting these tests is definitely kind of an extension of the product that I hadn't considered before. But yeah, the idea of using that to bring more access to physicians in all different parts of the country and helping them interpret complicated science is pretty cool.
Alessio [00:50:28]: Yeah. We had Kanjun from Imbue on the podcast and we talked about better allocating scientific resources. How do you think about these use cases and maybe how illicit can help drive more research? And do you see a world in which maybe the models actually do some of the research before suggesting us?
Andreas [00:50:45]: Yeah, I think that's very close to what we care about. Our product values are systematic, transparent, and unbounded. And I think to make research especially more systematic and unbounded, I think is basically the thing that's at stake here. So for example, I was recently talking to people in longevity and I think there isn't really one field of longevity, there are kind of different scientific subdomains that are surfacing various things that are related to longevity. And I think if you could more systematically say, look, here are all the different interventions we could do and here's the expected ROI of these experiments. Here's like the evidence so far that supports those being either likely to surface new information or not. Here's the cost of these experiments. I think you could be so much more systematic than science is today. I'd guess in like 10, 20 years we'll look back and it will be incredible how unsystematic science was back in the day.
Jungwon [00:51:35]: Our view is kind of have models catch up to expert humans today. Start with kind of novice humans and then increasingly expert humans. But we really want the models to earn their right to the expertise. So that's why we do things in this very step-by-step way. That's why we don't just like throw a bunch of data and apply a bunch of compute and hope we get good results. But obviously at some point you hope that once it's kind of earned its stripes, it can surpass human researchers. But I think that's where making sure that the model's processes are really explicit and transparent and that it's really easy to evaluate is important because if it does surpass human understanding, people will still need to be able to audit its work somehow or spot check its work somehow to be able to reliably trust it and use it. So yeah, that's kind of why the process-based approach is really important.
Andreas [00:52:20]: And on the question of will models do their own research, I think one feature that most currently don't have that will need to be better there is better world models. I think currently models are just not great at representing what's going on in a particular situation or domain in a way that allows them to come to interesting, surprising conclusions. I think they're very good at coming to conclusions that are nearby to conclusions that people have come to. They're not as good at kind of reasoning and making surprising connections maybe. And so having deeper models of what are the underlying structures of different domains, how they're related or not related, I think will be an important ingredient for models actually being able to make novel contributions.
Swyx [00:53:00]: On the topic of hiring more expert humans, you've hired some very expert humans. My friend Maggie Appleton joined you guys I think maybe a year ago-ish. In fact, I think you're doing an offsite and we're actually organizing our biggest AI UX meetup around whenever she's in town in San Francisco. How big is the team? How have you sort of transitioned your company into this sort of PBC and sort of the plan for the future?
Jungwon [00:53:21]: Yeah, we're 12 people now. About half of us are in the Bay Area and then distributed across US and Europe, a mix of mostly kind of roles in engineering and product. Yeah, and I think that the transition to PBC was really not that eventful because I think we're already, even as a nonprofit, we are already shipping every week, so very much operating as a product. Very much at the start, yeah. Yeah. And then I would say the kind of PBC component was to very explicitly say that we have a mission that we care a lot about. There are a lot of ways to make money. We think our mission will make us a lot of money, but we are going to be opinionated about how we make money. We're going to take the version of making a lot of money that's in line with our mission. But it's like all very convergent. Like illicit is not going to make any money if it's a bad product, if it doesn't actually help you discover truth and do research more rigorously. So I think for us, the kind of mission and the success of the company are very intertwined. We're hoping to grow the team quite a lot this year. Probably some of our highest priority roles are in engineering, but also opening up roles more in design and product marketing, go to market. Yeah. Do you want to talk about the roles?
Andreas [00:54:23]: Yeah. Broadly, we're just looking for senior software engineers and don't need any particular AI expertise. A lot of it is just how do you build good orchestration for complex tasks? So we talked earlier about these are sort of notebooks, scaling up, task orchestration. And I think a lot of this looks more like traditional software engineering than it does look like machine learning research. And I think the people who are really good at building good abstractions, building applications that can kind of survive, even if some of their pieces break, like making reliable components out of unreliable pieces. I think those are the people that we're looking for.
Swyx [00:54:57]: You know, that's exactly what I used to do. Have you explored the existing orchestration frameworks, Temporal, Airflow, Daxter, Prefect?
Andreas [00:55:05]: We've looked into them a little bit. I think we have some specific requirements around being able to stream work back very quickly to our users. Those could definitely be relevant. Okay.
Swyx [00:55:15]: Well, you're hiring. I'm sure we'll plug all the links. Thank you so much for coming. Any parting words? Any words of wisdom? Models do you live by?
Jungwon [00:55:22]: I think it's a really important time for humanity. So I hope everyone listening to this podcast can think hard about exactly how they want to participate in this story. There's so much to build and we can be really intentional about what we align ourselves with. There are a lot of applications that are going to be really good for the world and a lot of applications that are not. And so, yeah, I hope people can take that seriously and kind of seize the moment. Yeah.
Swyx [00:55:46]: I love how intentional you guys have been. Thank you for sharing that story.
Jungwon [00:55:49]: Thank you. Yeah.
Andreas [00:55:51]: Thank you for coming on.
Jungwon [00:56:17]: Yeah. Thank you.
Get full access to Latent Space at www.latent.space/subscribe
+
+[by:whisper.cpp]
+
+[00:00.00](音乐)
+
+[00:06.00]大家好 欢迎到Lit and Space Podcast
+
+[00:08.40]这是Alessio 和CTO的计划人士 和我参加的计划人士
+
+[00:11.80]我参加了麦克欧的计划 专门 邱小雅
+
+[00:15.00]今天我们回到工作室了
+
+[00:17.20]和Andreas 和 卢安 欢迎你
+
+[00:20.20]谢谢 太好了 谢谢你
+
+[00:22.40]我会介绍你分别的 但也希望你会更多学习
+
+[00:27.40]So Andreas it looks like you started Alicit first and joined later
+
+[00:32.40]That's right
+
+[00:33.00]For all intents and purposes, the illicit and also the odd that existed before then were very different from what I started
+
+[00:39.60]So I think it's like fair to say that you co-funded it
+
+[00:42.60]Got it
+
+[00:43.00]And Joanne you're a co-founder and COO of Alicit now
+
+[00:46.20]Yeah that's right
+
+[00:47.00]So there's a little bit of a history to this
+
+[00:48.80]I'm not super aware of like the sort of journey
+
+[00:51.80]I was aware of odd and illicit as sort of a non-profit type situation
+
+[00:55.80]And recently you turned into like a public benefit corporation
+
+[00:59.40]So yeah maybe if you want you could take us through that journey of finding the problem
+
+[01:04.00]You know obviously you're working together now
+
+[01:06.20]So like how do you get together to decide to leave your startup career to join him
+
+[01:11.20]Yeah it's truly a very long journey
+
+[01:12.80]I guess truly it kind of started in Germany when I was born
+
+[01:17.20]So even as a kid I was always interested in AI
+
+[01:20.00]Like I kind of went to the library
+
+[01:21.40]There were books about how to write programs in QBasic
+
+[01:24.20]And like some of them talked about how to implement chatbots
+
+[01:27.20]And to be clear
+
+[01:28.80]He grew up in like a tiny village on the outskirts of Munich called Dinkelscherbin
+
+[01:33.20]Where it's like a very very idyllic German village
+
+[01:36.20]Yeah important to the story
+
+[01:38.40]So basically the main thing is I've kind of always been thinking about AI my entire life
+
+[01:42.80]And been thinking about at some point this is going to be a huge deal
+
+[01:46.00]It's going to be transformative
+
+[01:47.00]How can I work on it
+
+[01:48.20]And was thinking about it from when I was a teenager
+
+[01:51.60]After high school did a year where I started a startup with the intention to become rich
+
+[01:56.80]And then once I'm rich I can affect the trajectory of AI
+
+[02:00.40]Did not become rich
+
+[02:01.40]Decided to go back to college
+
+[02:03.00]And study cognitive science there
+
+[02:05.00]Which was like the closest thing I could find at the time to AI
+
+[02:08.00]In the last year of college moved to the US to do a PhD at MIT
+
+[02:12.60]Working on broadly kind of new programming languages for AI
+
+[02:15.00]Because it kind of seemed like the existing languages were not great at expressing
+
+[02:19.60]World models and learning world models during Bayesian inference
+
+[02:22.60]Was obviously thinking about ultimately the goal is to actually build tools that help people reason more clearly
+
+[02:27.60]Ask and answer better questions and make better decisions
+
+[02:31.60]But for a long time it seemed like the technology to put reasoning in machines just wasn't there
+
+[02:35.60]Initially at the end of my postdoc at Stanford was thinking about well what to do
+
+[02:39.60]I think the standard path is you become an academic and do research
+
+[02:43.60]But it's really hard to actually build interesting tools as an academic
+
+[02:48.60]You can't really hire great engineers
+
+[02:50.60]Everything is kind of on a paper-to-paper timeline
+
+[02:53.60]And so I was like well maybe I should start a startup
+
+[02:56.60]Pursuit that for a little bit
+
+[02:57.60]But it seemed like it was too early because you could have tried to do an AI startup
+
+[03:01.60]But probably would not have been this kind of AI startup we're seeing now
+
+[03:05.60]So then decided to just start a non-profit research lab
+
+[03:08.60]That's going to do research for a while until we better figure out how to do thinking in machines
+
+[03:13.60]And that was odd
+
+[03:14.60]And then over time it became clear how to actually build actual tools for reasoning
+
+[03:19.60]Then only over time we developed a better way to
+
+[03:23.60]I'll let you fill in some of the details here
+
+[03:25.60]Yeah so I guess my story maybe starts around 2015
+
+[03:29.60]I kind of wanted to be a founder for a long time
+
+[03:31.60]And I wanted to work on an idea that stood the test of time for me
+
+[03:34.60]Like an idea that stuck with me for a long time
+
+[03:37.60]And starting in 2015
+
+[03:38.60]Actually originally I became interested in AI based tools from the perspective of mental health
+
+[03:43.60]So there are a bunch of people around me who are really struggling
+
+[03:45.60]One really close friend in particular is really struggling with mental health
+
+[03:48.60]And didn't have any support
+
+[03:50.60]And it didn't feel like there was anything before kind of like getting hospitalized
+
+[03:54.60]That could just help her
+
+[03:56.60]And so luckily she came and stayed with me for a while
+
+[03:58.60]And we were just able to talk through some things
+
+[04:00.60]But it seemed like you know lots of people might not have that resource
+
+[04:04.60]And something maybe AI enabled could be much more scalable
+
+[04:07.60]I didn't feel ready to start a company then
+
+[04:10.60]That's 2015
+
+[04:11.60]And I also didn't feel like the technology was ready
+
+[04:13.60]So then I went into fintech
+
+[04:15.60]And like kind of learned how to do the tech thing
+
+[04:17.60]And then in 2019
+
+[04:18.60]I felt like it was time for me to just jump in
+
+[04:21.60]And build something on my own
+
+[04:22.60]I really wanted to create
+
+[04:24.60]And at the time I looked around at tech
+
+[04:26.60]And felt like not super inspired by the options
+
+[04:28.60]I just I didn't want to have a tech career ladder
+
+[04:31.60]Or like I didn't want to like climb the career ladder
+
+[04:33.60]There are two kind of interesting technologies at the time
+
+[04:35.60]There was AI and there was crypto
+
+[04:37.60]And I was like well the AI people seemed like a little bit more nice
+
+[04:41.60]And maybe like slightly more trustworthy
+
+[04:44.60]Both super exciting
+
+[04:45.60]But through my bed and on the AI side
+
+[04:47.60]And then I got connected to Andreas
+
+[04:49.60]And actually the way he was thinking about
+
+[04:51.60]Pursuing the research agenda at AUT
+
+[04:53.60]Was really compatible with what I had envisioned
+
+[04:56.60]For an ideal AI product
+
+[04:58.60]Something that helps kind of take down
+
+[05:00.60]Really complex thinking
+
+[05:01.60]Overwhelming thoughts
+
+[05:02.60]And breaks it down into small pieces
+
+[05:04.60]And then this kind of mission
+
+[05:05.60]We need AI to help us figure out
+
+[05:07.60]What we ought to do
+
+[05:08.60]It was really inspiring, right?
+
+[05:10.60]Yeah, because I think it was clear
+
+[05:12.60]That we were building the most powerful
+
+[05:14.60]Optimizer of our time
+
+[05:16.60]But as a society
+
+[05:17.60]We hadn't figured out
+
+[05:18.60]How to direct that optimization potential
+
+[05:21.60]And if you kind of direct tremendous
+
+[05:23.60]Optimization potential at the wrong thing
+
+[05:25.60]That's really disastrous
+
+[05:26.60]So the goal of AUT was
+
+[05:28.60]Make sure that if we build
+
+[05:29.60]The most transformative technology of our lifetime
+
+[05:31.60]It can be used for something really impactful
+
+[05:34.60]And that's really good reasoning
+
+[05:35.60]Like not just generating ads
+
+[05:37.60]My background was in marketing
+
+[05:38.60]But like so
+
+[05:39.60]It's like I want to do
+
+[05:40.60]More than generate ads with this
+
+[05:42.60]And also if these AI systems
+
+[05:44.60]Get to be super intelligent enough
+
+[05:46.60]That they are doing this
+
+[05:47.60]Really complex reasoning
+
+[05:48.60]That we can trust them
+
+[05:49.60]That they are aligned with us
+
+[05:51.60]And we have ways of evaluating
+
+[05:53.60]That they are doing the right thing
+
+[05:54.60]So that's what AUT did
+
+[05:55.60]We did a lot of experiments
+
+[05:56.60]You know, like Andreas said
+
+[05:57.60]Before foundation models
+
+[05:59.60]Really like took off
+
+[06:00.60]A lot of the issues we were seeing
+
+[06:01.60]Were more in reinforcement learning
+
+[06:03.60]But we saw a future
+
+[06:04.60]Where AI would be able to do
+
+[06:06.60]More kind of logical reasoning
+
+[06:08.60]Not just kind of extrapolate
+
+[06:09.60]From numerical trends
+
+[06:10.60]We actually kind of
+
+[06:11.60]Set up experiments with people
+
+[06:13.60]Where kind of people stood in
+
+[06:14.60]As super intelligent systems
+
+[06:16.60]And we effectively gave them
+
+[06:17.60]Context windows
+
+[06:18.60]So they would have to
+
+[06:19.60]Like read a bunch of text
+
+[06:20.60]And one person would get less text
+
+[06:23.60]And one person would get all the text
+
+[06:24.60]And the person with less text
+
+[06:26.60]Would have to evaluate the work
+
+[06:28.60]Of the person who could read much more
+
+[06:30.60]So like in the world
+
+[06:31.60]We were basically simulating
+
+[06:32.60]Like in, you know, 2018-2019
+
+[06:34.60]A world where an AI system
+
+[06:36.60]Could read significantly more than you
+
+[06:38.60]And you as the person
+
+[06:39.60]Who couldn't read that much
+
+[06:40.60]Had to evaluate the work
+
+[06:41.60]Of the AI system
+
+[06:42.60]So there's a lot of the work we did
+
+[06:44.60]And from that we kind of
+
+[06:45.60]Iterated on the idea
+
+[06:46.60]Of breaking complex tasks down
+
+[06:47.60]Into smaller tasks
+
+[06:48.60]Like complex tasks
+
+[06:49.60]Like open-ended reasoning
+
+[06:51.60]Logical reasoning
+
+[06:52.60]Into smaller tasks
+
+[06:53.60]So that it's easier
+
+[06:54.60]To train AI systems on them
+
+[06:55.60]And also so that it's easier
+
+[06:57.60]To evaluate the work of the AI system
+
+[06:59.60]When it's done
+
+[07:00.60]And then also kind of
+
+[07:01.60]We really pioneered this idea
+
+[07:02.60]The importance of supervising
+
+[07:03.60]The process of AI systems
+
+[07:05.60]Not just the outcomes
+
+[07:06.60]And so a big part
+
+[07:07.60]Of how elicit is built
+
+[07:08.60]Is we're very intentional
+
+[07:10.60]About not just throwing
+
+[07:11.60]A ton of data into a model
+
+[07:13.60]And training it
+
+[07:14.60]And then saying cool
+
+[07:15.60]Here's like scientific output
+
+[07:16.60]Like that's not at all
+
+[07:17.60]What we do
+
+[07:18.60]Our approach is very much
+
+[07:19.60]Like what are the steps
+
+[07:20.60]That an expert human does
+
+[07:21.60]Or what is like an ideal process
+
+[07:23.60]As granularly as possible
+
+[07:25.60]Let's break that down
+
+[07:26.60]And then train AI systems
+
+[07:27.60]To perform each of those steps
+
+[07:29.60]Very robustly
+
+[07:30.60]When you train like that
+
+[07:32.60]From the start
+
+[07:33.60]After the fact
+
+[07:34.60]It's much easier to evaluate
+
+[07:35.60]It's much easier to troubleshoot
+
+[07:36.60]At each point
+
+[07:37.60]Like where did something break down
+
+[07:38.60]So yeah
+
+[07:39.60]We were working on those experiments
+
+[07:40.60]For a while
+
+[07:41.60]And then at the start of 2021
+
+[07:43.60]Decided to build a product
+
+[07:44.60]Do you mind if I
+
+[07:45.60]Because I think you're about
+
+[07:46.60]To go into more modern
+
+[07:47.60]Hot and elicit
+
+[07:49.60]And I just wanted to
+
+[07:50.60]Because I think a lot of people
+
+[07:51.60]Are in where you were
+
+[07:53.60]Like sort of 2018-19
+
+[07:55.60]Where you chose a partner
+
+[07:57.60]To work with
+
+[07:58.60]And you didn't know him
+
+[07:59.60]Yeah yeah
+
+[08:00.60]You were just kind of cold introduced
+
+[08:01.60]Yep
+
+[08:02.60]A lot of people are cold introduced
+
+[08:03.60]I've been cold introduced
+
+[08:04.60]To tons of people
+
+[08:05.60]And I never work with them
+
+[08:06.60]I assume you had a lot
+
+[08:07.60]A lot of other options
+
+[08:08.60]Like how do you advise
+
+[08:09.60]People to make those choices
+
+[08:10.60]We were not totally cold introduced
+
+[08:12.60]So one of our closest friends
+
+[08:13.60]Introduced us
+
+[08:14.60]And then Andreas had written a lot
+
+[08:16.60]On the website
+
+[08:17.60]A lot of blog posts
+
+[08:18.60]A lot of publications
+
+[08:19.60]And I just read it
+
+[08:20.60]And I was like, wow
+
+[08:21.60]This sounds like my writing
+
+[08:22.60]And even other people
+
+[08:23.60]Some of my closest friends
+
+[08:24.60]I asked for advice from
+
+[08:25.60]They were like, oh
+
+[08:26.60]This sounds like your writing
+
+[08:28.60]But I think
+
+[08:29.60]I also had some kind of
+
+[08:30.60]Like things I was looking for
+
+[08:31.60]I wanted someone
+
+[08:32.60]With a complimentary skill set
+
+[08:33.60]I want someone
+
+[08:34.60]Who was very values aligned
+
+[08:36.60]And yeah
+
+[08:37.60]That was all a good fit
+
+[08:38.60]We also did a pretty
+
+[08:40.60]Lengthy mutual evaluation process
+
+[08:42.60]Where we had a Google doc
+
+[08:43.60]Where we had all kinds of questions
+
+[08:45.60]For each other
+
+[08:46.60]And I think it ended up being
+
+[08:48.60]Round 50 pages or so
+
+[08:49.60]Off like various questions
+
+[08:51.60]Was it the YC list?
+
+[08:53.60]There's some lists going around
+
+[08:54.60]For co-founder questions
+
+[08:55.60]No, we just made our own
+
+[08:57.60]But I guess it's probably related
+
+[08:59.60]And that you asked yourself
+
+[09:00.60]What are the values you care about
+
+[09:01.60]How would you approach
+
+[09:02.60]Various decisions
+
+[09:03.60]And things like that
+
+[09:04.60]I shared like all of my past
+
+[09:05.60]Performance reviews
+
+[09:06.60]Yeah
+
+[09:07.60]Yeah
+
+[09:08.60]And he never had any
+
+[09:09.60]No
+
+[09:10.60]Yeah, sorry
+
+[09:14.60]I just had to
+
+[09:15.60]A lot of people are going through
+
+[09:16.60]That phase
+
+[09:17.60]And you kind of skipped over it
+
+[09:18.60]I was like, no, no, no
+
+[09:19.60]There's like an interesting story
+
+[09:20.60]Yeah
+
+[09:21.60]Before we jump into what it is
+
+[09:22.60]It is today
+
+[09:23.60]The history is a bit
+
+[09:24.60]Cutter intuitive
+
+[09:25.60]So you start
+
+[09:26.60]Now, oh, if we had
+
+[09:27.60]A super powerful model
+
+[09:29.60]How we align it
+
+[09:30.60]How we use it
+
+[09:31.60]But then you were actually
+
+[09:32.60]Like, well, let's just build
+
+[09:33.60]The product so that people
+
+[09:34.60]Can actually leverage it
+
+[09:35.60]And I think there are
+
+[09:36.60]A lot of folks today
+
+[09:37.60]That are now back
+
+[09:38.60]To where you were
+
+[09:39.60]Maybe five years ago
+
+[09:40.60]They're like, oh, what if
+
+[09:41.60]This happens rather than
+
+[09:42.60]Focusing on actually building
+
+[09:43.60]Something useful with it
+
+[09:45.60]What click for you
+
+[09:46.60]To like move into a list
+
+[09:47.60]And then we can cover
+
+[09:48.60]That story too
+
+[09:49.60]I think in many ways
+
+[09:50.60]The approach is still the same
+
+[09:51.60]Because the way we're
+
+[09:52.60]Building a list is not
+
+[09:54.60]Let's train a foundation model
+
+[09:55.60]To do more stuff
+
+[09:56.60]It's like
+
+[09:57.60]Let's build a scaffolding
+
+[09:58.60]Such that we can
+
+[09:59.60]Deploy powerful models
+
+[10:00.60]To good ends
+
+[10:01.60]I think it's different
+
+[10:02.60]Now in that
+
+[10:03.60]We actually have
+
+[10:04.60]Like some of the models to plug in
+
+[10:05.60]But if in 2017
+
+[10:06.60]We had had the models
+
+[10:08.60]We could have run
+
+[10:09.60]The same experiments
+
+[10:10.60]We did run with humans
+
+[10:11.60]Back then
+
+[10:12.60]Just with models
+
+[10:13.60]And so in many ways
+
+[10:14.60]Our philosophy is always
+
+[10:15.60]Let's think add to the future
+
+[10:16.60]What models are going to exist
+
+[10:17.60]In one, two years
+
+[10:19.60]Or longer
+
+[10:20.60]And how can we make it
+
+[10:22.60]So that they can
+
+[10:23.60]Actually be deployed
+
+[10:24.60]In many transparent
+
+[10:25.60]Controllable ways
+
+[10:26.60]Yeah, I think
+
+[10:27.60]Motivationally we both
+
+[10:28.60]Are kind of
+
+[10:29.60]Product people at heart
+
+[10:30.60]The research was
+
+[10:31.60]Really important
+
+[10:32.60]And it didn't
+
+[10:33.60]Make sense to build
+
+[10:34.60]A product at that time
+
+[10:35.60]But at the end of the day
+
+[10:36.60]The thing that always
+
+[10:37.60]Motivated us is
+
+[10:38.60]Imagining a world
+
+[10:39.60]Where high quality
+
+[10:40.60]Reasoning is really abundant
+
+[10:41.60]And AI is a technology
+
+[10:43.60]That's going to get us there
+
+[10:44.60]And there's a way
+
+[10:45.60]To guide that technology
+
+[10:46.60]With research
+
+[10:47.60]But you can have
+
+[10:48.60]A more direct effect
+
+[10:49.60]Through product
+
+[10:50.60]Because with research
+
+[10:51.60]You publish the research
+
+[10:52.60]And someone else
+
+[10:53.60]Product felt
+
+[10:54.60]Like a more direct path
+
+[10:55.60]And we wanted to
+
+[10:56.60]Concretely have an impact
+
+[10:57.60]On people's lives
+
+[10:58.60]Yeah, I think
+
+[10:59.60]The kind of personally
+
+[11:00.60]The motivation was
+
+[11:01.60]We want to build
+
+[11:02.60]For people
+
+[11:03.60]Yep, and then
+
+[11:04.60]Just to recap as well
+
+[11:05.60]Like the models
+
+[11:06.60]You're using back then were
+
+[11:07.60]Like, I don't know
+
+[11:08.60]With the like BERT type stuff
+
+[11:10.60]Or T5 or
+
+[11:12.60]I don't know what time frame
+
+[11:13.60]We're talking about here
+
+[11:14.60]I guess to be clear
+
+[11:15.60]At the very beginning
+
+[11:16.60]We had humans do the work
+
+[11:18.60]And then I think
+
+[11:19.60]The first models
+
+[11:20.60]That kind of makes sense
+
+[11:21.60]Or GPT-2
+
+[11:22.60]And TNLG
+
+[11:23.60]And early generative models
+
+[11:25.60]We do
+
+[11:26.60]We also use
+
+[11:27.60]Like T5 based models
+
+[11:28.60]Even now
+
+[11:29.60]Started with GPT-2
+
+[11:30.60]Yeah, cool
+
+[11:31.60]I'm just kind of curious about
+
+[11:32.60]Like how do you
+
+[11:33.60]Start so early
+
+[11:34.60]Like now it's obvious
+
+[11:35.60]Where to start
+
+[11:36.60]But back then it wasn't
+
+[11:37.60]Yeah, I used to
+
+[11:38.60]Nag Andreas a lot
+
+[11:39.60]I was like
+
+[11:40.60]Why are you
+
+[11:41.60]Talking to this?
+
+[11:42.60]I don't know
+
+[11:43.60]I felt like
+
+[11:44.60]GPT-2 is like
+
+[11:45.60]Clearly can't do anything
+
+[11:46.60]And I was like
+
+[11:47.60]Andreas, you're wasting your time
+
+[11:48.60]Like playing with this toy
+
+[11:49.60]But yeah, it was right
+
+[11:50.60]So what's the history
+
+[11:51.60]Of what Elisit
+
+[11:52.60]Actually does as a product
+
+[11:53.60]You recently announced that
+
+[11:55.60]After four months
+
+[11:56.60]You get to a million of revenue
+
+[11:57.60]Obviously a lot of people
+
+[11:58.60]Use it, get a lot of value
+
+[11:59.60]But it would
+
+[12:00.60]Initially kind of like
+
+[12:01.60]Structured data
+
+[12:02.60]Instruction from papers
+
+[12:03.60]Then you had
+
+[12:04.60]Kind of like concept grouping
+
+[12:05.60]And today it's maybe
+
+[12:06.60]Like a more full stack
+
+[12:07.60]Research enabler
+
+[12:09.60]Kind of like paper
+
+[12:10.60]Understand their platform
+
+[12:11.60]What's the definitive definition
+
+[12:13.60]Of what Elisit is
+
+[12:14.60]And how did you get here
+
+[12:15.60]Yeah, we say Elisit
+
+[12:16.60]As an AI research assistant
+
+[12:17.60]I think it will continue
+
+[12:18.60]To evolve
+
+[12:19.60]You know, we're so excited
+
+[12:20.60]About building and research
+
+[12:21.60]Because there's just so much space
+
+[12:22.60]I think the current phase
+
+[12:23.60]We're in right now
+
+[12:24.60]We talk about it
+
+[12:25.60]As really trying to make Elisit
+
+[12:27.60]The best place to understand
+
+[12:28.60]What is known
+
+[12:29.60]So it's all a lot about like
+
+[12:31.60]Literature summarization
+
+[12:32.60]There's a ton of information
+
+[12:33.60]That the world already knows
+
+[12:34.60]It's really hard to navigate
+
+[12:35.60]Hard to make it relevant
+
+[12:37.60]So a lot of it is around
+
+[12:38.60]Document discovery
+
+[12:39.60]And processing and analysis
+
+[12:41.60]I really kind of want to
+
+[12:42.60]Import some of the incredible
+
+[12:44.60]Productivity improvements
+
+[12:45.60]We've seen in software engineering
+
+[12:47.60]And data science
+
+[12:48.60]And into research
+
+[12:49.60]So it's like
+
+[12:50.60]How can we make researchers
+
+[12:51.60]Like data scientists of text
+
+[12:53.60]That's why we're launching
+
+[12:54.60]This new set of features
+
+[12:55.60]Called notebooks
+
+[12:56.60]It's very much inspired
+
+[12:57.60]By computational notebooks
+
+[12:58.60]Like Jupyter notebooks
+
+[12:59.60]Deep note or colab
+
+[13:01.60]Because they're so powerful
+
+[13:02.60]And so flexible
+
+[13:03.60]And ultimately
+
+[13:04.60]When people are trying
+
+[13:05.60]To get to an answer
+
+[13:07.60]Or understand insight
+
+[13:08.60]They're kind of like
+
+[13:09.60]Manipulating evidence
+
+[13:10.60]And information
+
+[13:11.60]Today that's all packaged
+
+[13:12.60]In PDFs
+
+[13:13.60]Which are super brittle
+
+[13:14.60]But with language models
+
+[13:15.60]We can decompose
+
+[13:16.60]These PDFs
+
+[13:17.60]And then we can
+
+[13:18.60]Interly claims
+
+[13:19.60]And evidence
+
+[13:20.60]And insights
+
+[13:21.60]And then let researchers
+
+[13:22.60]Mash them up together
+
+[13:23.60]Remix them
+
+[13:24.60]And analyze them together
+
+[13:25.60]So yeah
+
+[13:26.60]I would say quite simply
+
+[13:27.60]Overall listed
+
+[13:28.60]As an AI research assistant
+
+[13:29.60]Right now we're focused
+
+[13:30.60]On text based workflows
+
+[13:32.60]But long term
+
+[13:33.60]Really want to kind of
+
+[13:34.60]Go further and further
+
+[13:35.60]Into reasoning
+
+[13:36.60]And decision making
+
+[13:37.60]And when you say
+
+[13:38.60]AI research assistant
+
+[13:39.60]This is kind of
+
+[13:40.60]Matter research
+
+[13:41.60]So researchers
+
+[13:42.60]Use a list
+
+[13:43.60]As a research assistant
+
+[13:44.60]It's not a generic
+
+[13:45.60]You can research
+
+[13:46.60]Or it could be
+
+[13:47.60]But what are people
+
+[13:48.60]Using it for today
+
+[13:49.60]So specifically in science
+
+[13:51.60]A lot of people use
+
+[13:52.60]Human research assistants
+
+[13:53.60]To do things
+
+[13:54.60]You tell your grad student
+
+[13:56.60]Here are a couple of papers
+
+[13:57.60]Can you look at
+
+[13:58.60]All of these
+
+[13:59.60]See which of these
+
+[14:00.60]Have kind of sufficiently
+
+[14:01.60]Large populations
+
+[14:02.60]And actually study
+
+[14:03.60]The disease that
+
+[14:04.60]I'm interested in
+
+[14:05.60]And then write out
+
+[14:06.60]Like what are the experiments
+
+[14:07.60]They did
+
+[14:08.60]What are the interventions
+
+[14:09.60]They did
+
+[14:10.60]What are the outcomes
+
+[14:11.60]And kind of organize
+
+[14:12.60]That for me
+
+[14:13.60]And the first phase
+
+[14:14.60]Of understanding
+
+[14:15.60]This is on
+
+[14:16.60]Automating that work flow
+
+[14:17.60]Because a lot of that work
+
+[14:18.60]Is pretty road work
+
+[14:19.60]I think it's not
+
+[14:20.60]The kind of thing
+
+[14:21.60]That we need humans to do
+
+[14:22.60]Language models can do it
+
+[14:23.60]And then if
+
+[14:24.60]Language models can do it
+
+[14:25.60]That you can obviously
+
+[14:26.60]Scale it up
+
+[14:27.60]Much more than a grad student
+
+[14:28.60]Or undergrad
+
+[14:29.60]Research assistant
+
+[14:30.60]Would be able to do
+
+[14:31.60]Yeah the use cases
+
+[14:32.60]Are pretty broad
+
+[14:33.60]So we do have
+
+[14:34.60]A very large
+
+[14:35.60]Percent of our users
+
+[14:36.60]Are just using it personally
+
+[14:37.60]Or for a mix
+
+[14:38.60]Of personal and professional
+
+[14:39.60]Things
+
+[14:40.60]People who care a lot
+
+[14:41.60]About health
+
+[14:42.60]Or biohacking
+
+[14:43.60]Or parents
+
+[14:44.60]Or disease
+
+[14:45.60]Or want to understand
+
+[14:46.60]The literature directly
+
+[14:47.60]So there is an
+
+[14:48.60]Individual consumer use
+
+[14:49.60]Case
+
+[14:50.60]We're most focused
+
+[14:51.60]On the power users
+
+[14:52.60]So that's where
+
+[14:53.60]We're really excited
+
+[14:54.60]To build
+
+[14:55.60]So Lisit was
+
+[14:56.60]Very much inspired
+
+[14:57.60]By this work flow
+
+[14:58.60]In literature
+
+[14:59.60]Called systematic reviews
+
+[15:00.60]Or meta analysis
+
+[15:01.60]Which is basically
+
+[15:02.60]The human state
+
+[15:03.60]Of the art
+
+[15:04.60]For summarizing
+
+[15:05.60]Scientific literature
+
+[15:06.60]It typically involves
+
+[15:07.60]Like five people
+
+[15:08.60]Working together
+
+[15:09.60]For over a year
+
+[15:10.60]And they kind of
+
+[15:11.60]First start by trying
+
+[15:12.60]To find the maximally
+
+[15:13.60]First possible
+
+[15:14.60]So it's like
+
+[15:15.60]Ten thousand papers
+
+[15:16.60]And they kind of
+
+[15:17.60]Systematically narrow
+
+[15:18.60]That down to like
+
+[15:19.60]Hundreds or fifty
+
+[15:20.60]Extract key details
+
+[15:22.60]From every single paper
+
+[15:23.60]Usually have two people
+
+[15:24.60]Doing it
+
+[15:25.60]Like a third person
+
+[15:26.60]Reviewing it
+
+[15:27.60]So it's like
+
+[15:28.60]Incredibly laborious
+
+[15:29.60]Time-consuming process
+
+[15:30.60]But you see it
+
+[15:31.60]In every single domain
+
+[15:32.60]So in science
+
+[15:33.60]In machine learning
+
+[15:34.60]In policy
+
+[15:35.60]Because it's so structured
+
+[15:36.60]And designed to be reproducible
+
+[15:37.60]It's really amenable
+
+[15:38.60]To automation
+
+[15:39.60]So it's kind of
+
+[15:40.60]The workflow that we want
+
+[15:41.60]To automate first
+
+[15:42.60]It's accessible
+
+[15:43.60]For any question
+
+[15:44.60]And make
+
+[15:45.60]You know kind of
+
+[15:46.60]These really robust
+
+[15:47.60]Living summaries of science
+
+[15:48.60]So yeah
+
+[15:48.60]It's one of the
+
+[15:49.60]Workflows that we're
+
+[15:50.60]Starting with
+
+[15:51.60]Our previous guest
+
+[15:52.60]Mike Conover
+
+[15:53.60]He's building a new
+
+[15:54.60]Company got BrightWave
+
+[15:55.60]Which is an AI
+
+[15:56.60]Research assistant
+
+[15:57.60]For financial research
+
+[15:58.60]How do you see
+
+[15:59.60]The future of these tools
+
+[16:00.60]Like does everything
+
+[16:01.60]Converged
+
+[16:02.60]Like a God researcher
+
+[16:03.60]Assisted
+
+[16:04.60]Or is every domain
+
+[16:05.60]Gone to have its own thing
+
+[16:06.60]I think that's a good
+
+[16:07.60]And mostly open question
+
+[16:09.60]I do think there are
+
+[16:10.60]Some differences
+
+[16:11.60]Data analysis
+
+[16:12.60]And other research
+
+[16:13.60]Is more high-level
+
+[16:15.60]Cross-domain thinking
+
+[16:16.60]And we definitely
+
+[16:17.60]Want to contribute to
+
+[16:18.60]The broad
+
+[16:19.60]Generalist reasoning type
+
+[16:20.60]Space like if
+
+[16:21.60]Researchers are
+
+[16:22.60]Making discoveries often
+
+[16:23.60]It's like hey
+
+[16:24.60]This thing in biology
+
+[16:25.60]Is actually analogous to
+
+[16:26.60]Like these equations
+
+[16:27.60]In economics or something
+
+[16:28.60]And that's just
+
+[16:29.60]Fundamentally a thing
+
+[16:30.60]That where you need
+
+[16:31.60]To reason across domains
+
+[16:32.60]At least within research
+
+[16:33.60]I think there will be
+
+[16:34.60]Like one best platform
+
+[16:36.60]More or less
+
+[16:37.60]For this type of
+
+[16:38.60]Generalist research
+
+[16:39.60]I think there may still be
+
+[16:40.60]Tools like for genomics
+
+[16:41.60]Like particular types
+
+[16:42.60]Of modules
+
+[16:43.60]Of genes
+
+[16:44.60]And proteins
+
+[16:45.60]And whatnot
+
+[16:46.60]But for a lot of
+
+[16:47.60]The kind of high-level reasoning
+
+[16:48.60]That humans do
+
+[16:49.60]I think that is
+
+[16:50.60]A more open or type
+
+[16:51.60]All thing
+
+[16:52.60]I wanted to ask
+
+[16:53.60]A little bit deeper about
+
+[16:54.60]I guess the workflow
+
+[16:55.60]That you mentioned
+
+[16:56.60]I like that phrase
+
+[16:57.60]I see that
+
+[16:58.60]In your UI now
+
+[16:59.60]But that's
+
+[17:00.60]As it is today
+
+[17:01.60]And I think you were
+
+[17:02.60]About to tell us about
+
+[17:03.60]How it was in 2021
+
+[17:04.60]And how it maybe progressed
+
+[17:05.60]How has this workflow
+
+[17:06.60]Evolved over time
+
+[17:07.60]So the very first
+
+[17:08.60]Version of illicit
+
+[17:09.60]In the research assistant
+
+[17:10.60]It was a forecasting assistant
+
+[17:12.60]So we set out
+
+[17:13.60]And we were thinking about
+
+[17:14.60]What are some of the most
+
+[17:15.60]Impactful types of reasoning
+
+[17:16.60]That if we could scale up
+
+[17:17.60]AI would really transform
+
+[17:18.60]The world
+
+[17:19.60]And we actually started
+
+[17:20.60]With literature review
+
+[17:21.60]But we're like
+
+[17:22.60]So many people are going to build
+
+[17:23.60]Literature review tools
+
+[17:24.60]So let's start there
+
+[17:25.60]So then we focused
+
+[17:26.60]On geopolitical forecasting
+
+[17:27.60]So I don't know
+
+[17:28.60]If you're familiar
+
+[17:29.60]With like manifold or
+
+[17:30.60]Manifold markets
+
+[17:31.60]Yeah, that kind of stuff
+
+[17:32.60]Before manifold
+
+[17:33.60]Yeah, yeah
+
+[17:34.60]I'm not predicting relationships
+
+[17:35.60]We're predicting like
+
+[17:36.60]Is China going to invade Taiwan?
+
+[17:38.60]Yeah
+
+[17:39.60]That's in a relationship
+
+[17:40.60]Yeah, that's fair
+
+[17:41.60]Yeah, it's true
+
+[17:42.60]And then we worked
+
+[17:43.60]On that for a while
+
+[17:44.60]And then after GPT-3
+
+[17:45.60] came out
+
+[17:46.60]I think by that time
+
+[17:47.60]We realized that
+
+[17:48.60]Originally we were trying
+
+[17:49.60]To help people convert
+
+[17:50.60]Their beliefs into
+
+[17:51.60]Probability distributions
+
+[17:53.60]So take fuzzy beliefs
+
+[17:54.60]But like model them
+
+[17:55.60]More concretely
+
+[17:56.60]And then after a few months
+
+[17:57.60]Of iterating on that
+
+[17:58.60]Just realize the thing
+
+[17:59.60]That's blocking people
+
+[18:00.60]From making
+
+[18:01.60]Interesting predictions
+
+[18:02.60]About important events
+
+[18:03.60]In the world
+
+[18:04.60]Is less kind of
+
+[18:05.60]On the probabilistic side
+
+[18:06.60]And much more
+
+[18:07.60]Research side
+
+[18:08.60]And so that kind
+
+[18:09.60]Of combined with
+
+[18:10.60]The very generalist
+
+[18:11.60]Capabilities of GPT-3
+
+[18:12.60]Prompted us to
+
+[18:13.60]Make a more general
+
+[18:14.60]Research assistant
+
+[18:15.60]Then we spent
+
+[18:16.60]A few months iterating
+
+[18:17.60]On what even is
+
+[18:18.60]A research assistant
+
+[18:19.60]So we would embed
+
+[18:20.60]With different researchers
+
+[18:21.60]We built data labeling
+
+[18:23.60]Workflows in the beginning
+
+[18:24.60]Kind of right off the bat
+
+[18:25.60]We built ways to find
+
+[18:27.60]Experts in a field
+
+[18:29.60]And like ways to ask
+
+[18:30.60]Good research questions
+
+[18:31.60]We just kind of
+
+[18:32.60]Iterated through a lot
+
+[18:33.60]Of workflows and no one else
+
+[18:34.60]Was really building at this
+
+[18:35.60]Time and it was like
+
+[18:36.60]Let's do some prompt
+
+[18:37.60]Engineering and see
+
+[18:38.60]Like what is a task
+
+[18:39.60]That is at the
+
+[18:40.60]Intersection of what's
+
+[18:41.60]Technologically capable
+
+[18:42.60]And like important
+
+[18:43.60]For researchers
+
+[18:44.60]And we had like
+
+[18:45.60]A very nondescript
+
+[18:46.60]Landing page
+
+[18:47.60]It said nothing
+
+[18:48.60]But somehow people were
+
+[18:49.60]Signing up and we had
+
+[18:50.60]The sign of form
+
+[18:51.60]That was like
+
+[18:52.60]Why are you here
+
+[18:53.60]And everyone was like
+
+[18:54.60]I need help
+
+[18:55.60]With literature review
+
+[18:56.60]And we're like
+
+[18:57.60]A literature review
+
+[18:58.60]That sounds so hard
+
+[18:59.60]I don't even know
+
+[19:00.60]What that means
+
+[19:01.60]We don't want to work on it
+
+[19:02.60]But then eventually
+
+[19:03.60]We're like
+
+[19:04.60]Everyone is saying
+
+[19:05.60]Yeah
+
+[19:06.60]And we also kind of
+
+[19:07.60]Personally knew literature
+
+[19:08.60]Review was hard
+
+[19:09.60]And if you look at the graphs
+
+[19:10.60]For academic literature
+
+[19:11.60]Being published every
+
+[19:12.60]Single month you guys
+
+[19:13.60]Know this in machine learning
+
+[19:14.60]It's like up into the right
+
+[19:15.60]Like superhuman amounts
+
+[19:16.60]Of papers
+
+[19:17.60]So we're like
+
+[19:18.60]All right, let's just try it
+
+[19:19.60]I was really nervous
+
+[19:20.60]But Andres was like
+
+[19:21.60]This is kind of like
+
+[19:22.60]The right problem space
+
+[19:23.60]To jump into
+
+[19:24.60]Even if we don't
+
+[19:25.60]Know what we're doing
+
+[19:26.60]So my take was like
+
+[19:27.60]Fine
+
+[19:28.60]This feels really scary
+
+[19:29.60]But let's just launch
+
+[19:30.60]A feature every single week
+
+[19:31.60]And double our user
+
+[19:32.60]Numbers every month
+
+[19:33.60]And if we can do that
+
+[19:34.60]We will find something
+
+[19:35.60]I was worried about like
+
+[19:36.60]Getting lost
+
+[19:37.60]In the kind of academic white
+
+[19:38.60]Space
+
+[19:39.60]So the very first version
+
+[19:40.60]Was actually a weekend prototype
+
+[19:41.60]That Andres made
+
+[19:42.60]Do you want to explain
+
+[19:43.60]How that worked
+
+[19:44.60]I mostly remember
+
+[19:45.60]That it was really bad
+
+[19:47.60]So the thing I remember
+
+[19:48.60]Is you entered a question
+
+[19:50.60]And it would give you back
+
+[19:51.60]A list of claims
+
+[19:52.60]So your question could be
+
+[19:53.60]I don't know
+
+[19:54.60]How does creatine effect cognition
+
+[19:56.60]And it would give you back
+
+[19:57.60]Some claims
+
+[19:58.60]That are to some extent
+
+[19:59.60]Based on papers
+
+[20:00.60]But they were often irrelevant
+
+[20:02.60]The papers were often
+
+[20:03.60]And so we ended up
+
+[20:04.60]Soon just printing out
+
+[20:05.60]A bunch of examples
+
+[20:06.60]Of results
+
+[20:07.60]And putting them up
+
+[20:08.60]On the wall
+
+[20:09.60]So that we would
+
+[20:10.60]Kind of feel the constant
+
+[20:11.60]Shame of having
+
+[20:12.60]Such a bad product
+
+[20:13.60]And would be incentivized
+
+[20:14.60]To make it better
+
+[20:15.60]And I think overtime
+
+[20:16.60]It has gotten a lot better
+
+[20:17.60]But I think
+
+[20:18.60]The initial version
+
+[20:19.60]Was like really very bad
+
+[20:20.60]But it was basically
+
+[20:21.60]Like a natural language
+
+[20:22.60]Summary of an abstract
+
+[20:23.60]Like kind of a one-sentence
+
+[20:24.60]Summary
+
+[20:25.60]And which we still have
+
+[20:26.60]And then as we learned
+
+[20:27.60]Kind of more about this
+
+[20:28.60]Systematic review workflow
+
+[20:29.60]We started expanding
+
+[20:30.60]The capability so that
+
+[20:31.60]You could extract a lot
+
+[20:32.60]And more with that
+
+[20:33.60]And were you using
+
+[20:34.60]Like embeddings
+
+[20:35.60]And cosine similarity
+
+[20:36.60]That kind of stuff
+
+[20:37.60]For retrieval
+
+[20:38.60]Or was it keyword based
+
+[20:39.60]Or
+
+[20:40.60]I think the very first version
+
+[20:42.60]Didn't even have
+
+[20:43.60]It's own search engine
+
+[20:44.60]I think the very first version
+
+[20:45.60]Probably used
+
+[20:46.60]The semantic school or API
+
+[20:48.60]Or something similar
+
+[20:49.60]And only later when we discovered
+
+[20:51.60]That the API is not very semantic
+
+[20:53.60]Then built our own search
+
+[20:55.60]Search and that has helped a lot
+
+[20:57.60]And then we're going to go into
+
+[20:59.60]Like more recent products stuff
+
+[21:01.60]But like you know
+
+[21:02.60]I think you seem the more
+
+[21:03.60]So to start up oriented
+
+[21:04.60]Business person
+
+[21:05.60]And you seem sort of more
+
+[21:06.60]Ideologically like interested
+
+[21:08.60]In research obviously
+
+[21:09.60]Because of your PhD
+
+[21:10.60]What kind of market sizing
+
+[21:11.60]Were you guys thinking
+
+[21:12.60]Right?
+
+[21:13.60]Because you're here saying
+
+[21:14.60]Like we have to double every month
+
+[21:15.60]And I'm like
+
+[21:16.60]I don't know how you make
+
+[21:17.60]That conclusion from this
+
+[21:19.60]Right?
+
+[21:20.60]Especially also as a nonprofit
+
+[21:21.60]At the time
+
+[21:22.60]I mean market size wise
+
+[21:23.60]I felt like in this space
+
+[21:25.60]Where so much was changing
+
+[21:27.60]And it was very unclear
+
+[21:29.60]What of today was actually
+
+[21:30.60]Will be true tomorrow
+
+[21:31.60]We just like
+
+[21:32.60]Really rested a lot
+
+[21:33.60]On very very simple
+
+[21:34.60]Fundamental principles
+
+[21:35.60]Which is like
+
+[21:36.60]If you can understand
+
+[21:37.60]The truth that is
+
+[21:38.60]Very economically beneficial
+
+[21:40.60]Like valuable
+
+[21:41.60]If you like know the truth
+
+[21:42.60]On principle
+
+[21:43.60]That's enough for you
+
+[21:44.60]Research is the key to many
+
+[21:45.60]Breakthroughs that are
+
+[21:46.60]Very commercially valuable
+
+[21:47.60]Because my version of it
+
+[21:48.60]Is students are poor
+
+[21:49.60]And they don't pay
+
+[21:50.60]For anything
+
+[21:51.60]Right?
+
+[21:52.60]But that's obviously not true
+
+[21:53.60]As you guys have found out
+
+[21:54.60]But you had to have
+
+[21:55.60]Some market insight
+
+[21:56.60]For me to have believed that
+
+[21:57.60]But you skipped that
+
+[21:58.60]We did encounter
+
+[21:59.60]Talking to vcs
+
+[22:00.60]For our seed round
+
+[22:01.60]A lot of vcs were like
+
+[22:02.60]You know researchers
+
+[22:03.60]They don't have any money
+
+[22:04.60]Why don't you build
+
+[22:05.60]Legal assistant
+
+[22:07.60]I think in some
+
+[22:09.60]Short-sighted way
+
+[22:10.60]Maybe that's true
+
+[22:11.60]But I think in the long run
+
+[22:12.60]R&D is such a big space
+
+[22:13.60]Of the economy
+
+[22:14.60]I think if you can
+
+[22:15.60]Substantially improve
+
+[22:17.60]How quickly people find
+
+[22:19.60]New discoveries
+
+[22:20.60]Or avoid controlled trials
+
+[22:22.60]That don't go anywhere
+
+[22:23.60]I think that's just
+
+[22:24.60]Huge amounts of money
+
+[22:25.60]And there are a lot
+
+[22:26.60]Of questions obviously
+
+[22:27.60]About between here and there
+
+[22:28.60]But I think as long as
+
+[22:29.60]The fundamental principle is there
+
+[22:31.60]We were okay with that
+
+[22:32.60]And I guess we found
+
+[22:33.60]Some investors who also were
+
+[22:34.60]Yeah congrats
+
+[22:35.60]I'm sure we can cover
+
+[22:37.60]The sort of flip later
+
+[22:39.60]I think you're about to start
+
+[22:40.60]As on like GPT-3
+
+[22:41.60]And how like that
+
+[22:42.60]Changed things for you
+
+[22:43.60]It's funny like I guess
+
+[22:44.60]Every major GPT version
+
+[22:45.60]You have like some big insight
+
+[22:47.60]Yeah I mean
+
+[22:49.60]What do you think
+
+[22:50.60]I think it's a little bit
+
+[22:52.60]Less true for us than for others
+
+[22:54.60]Because we always believe
+
+[22:55.60]That there will basically
+
+[22:57.60]Human level machine work
+
+[23:00.60]And so
+
+[23:01.60]It is definitely true
+
+[23:02.60]That in practice
+
+[23:03.60]For your product
+
+[23:04.60]As new models come out
+
+[23:06.60]Your product starts working better
+
+[23:07.60]You can add some features
+
+[23:08.60]That you couldn't add before
+
+[23:09.60]But I don't think
+
+[23:11.60]We really ever had the
+
+[23:13.60]Moment where we were like
+
+[23:14.60]Oh wow
+
+[23:15.60]That is super unanticipated
+
+[23:17.60]We need to do something
+
+[23:18.60]Entirely different now
+
+[23:19.60]From what was on the roadmap
+
+[23:21.60]I think GPT-3
+
+[23:22.60]Was a big change
+
+[23:23.60]Because it kind of said
+
+[23:25.60]Oh now is the time
+
+[23:26.60]To build these tools
+
+[23:27.60]And then GPT-4
+
+[23:28.60]Was maybe a little bit
+
+[23:29.60]More of an extension
+
+[23:30.60]Of GPT-3
+
+[23:31.60]GPT-3 over GPT-2
+
+[23:32.60]Was like qualitative level
+
+[23:34.60]Shift
+
+[23:35.60]Then GPT-4 was like
+
+[23:36.60]Okay great
+
+[23:37.60]Now it's like more accurate
+
+[23:38.60]We're more accurate
+
+[23:39.60]On these things
+
+[23:40.60]We can answer harder questions
+
+[23:41.60]But the shape of the product
+
+[23:42.60]Had already taken place
+
+[23:43.60]By that time
+
+[23:44.60]I kind of want to ask you
+
+[23:45.60]About this sort of pivot
+
+[23:46.60]That you made
+
+[23:47.60]But I guess that was just
+
+[23:48.60]A way to sell
+
+[23:49.60]What you were doing
+
+[23:50.60]Which is you're adding
+
+[23:51.60]Extra features on grouping
+
+[23:52.60]My concepts
+
+[23:53.60]The GPT-4 pivot
+
+[23:54.60]Quote unquote pivot
+
+[23:55.60]Yeah yeah
+
+[23:56.60]Exactly
+
+[23:57.60]Yeah yeah
+
+[23:58.60]When we launched
+
+[23:59.60]This workflow
+
+[24:00.60]Now that GPT-4
+
+[24:01.60]Was available
+
+[24:02.60]Basically
+
+[24:03.60]Elisa was at a place
+
+[24:04.60]Where we have very tabular
+
+[24:05.60]Interfaces
+
+[24:06.60]So given a table of papers
+
+[24:07.60]You can extract data
+
+[24:08.60] Across all the tables
+
+[24:09.60]But you kind of want
+
+[24:10.60]To take the analysis
+
+[24:11.60]A step further
+
+[24:12.60]Sometimes what you'd care
+
+[24:13.60]About is not having
+
+[24:14.60]A list of papers
+
+[24:15.60]But a list of arguments
+
+[24:17.60]A list of effects
+
+[24:18.60]A list of interventions
+
+[24:19.60]A list of techniques
+
+[24:20.60]And so that's
+
+[24:21.60]One of the things we're
+
+[24:22.60]Working on is now that
+
+[24:23.60]You've extracted this information
+
+[24:24.60]A way
+
+[24:25.60]Can you pivot it
+
+[24:26.60]Or group by
+
+[24:27.60]Whatever the information
+
+[24:28.60]That you extracted
+
+[24:29.60]To have more insight
+
+[24:30.60]First information
+
+[24:31.60]Still supported
+
+[24:32.60]By the academic literature
+
+[24:33.60]Yeah
+
+[24:34.60]There was a big revelation
+
+[24:35.60]When I saw it
+
+[24:36.60]Basically I think
+
+[24:37.60]I'm very just impressed
+
+[24:38.60]By how first principles
+
+[24:39.60]Your ideas
+
+[24:40.60]Around the workflow is
+
+[24:42.60]And I think
+
+[24:43.60]That's why
+
+[24:44.60]You're not as reliant
+
+[24:45.60]On like the LM
+
+[24:46.60]Improving
+
+[24:47.60]Because it's actually
+
+[24:48.60]Just about improving
+
+[24:49.60]The workflow
+
+[24:50.60]That you will recommend
+
+[24:51.60]To people
+
+[24:52.60]Today we might call
+
+[24:53.60]It's rely on
+
+[24:54.60]This is the way
+
+[24:55.60]That elicit
+
+[24:56.60]Does research
+
+[24:57.60]And this is
+
+[24:58.60]What we think
+
+[24:59.60]Is most effective
+
+[25:00.60]Based on talking to our users
+
+[25:01.60]The problem space
+
+[25:02.60]Is still huge
+
+[25:03.60]Like if it's
+
+[25:04.60]Like this big
+
+[25:05.60]We're all still operating
+
+[25:06.60]At this tiny part
+
+[25:07.60]Bit of it
+
+[25:08.60]So you know
+
+[25:09.60]I think about this a lot
+
+[25:10.60]In the context of motes
+
+[25:11.60]People are like
+
+[25:12.60]Oh what's your mode
+
+[25:13.60]What happens
+
+[25:14.60]If GPT-5 comes out
+
+[25:15.60]It's like if GPT-5 comes out
+
+[25:16.60]There's still like
+
+[25:17.60]All of this other space
+
+[25:18.60]That we can go into
+
+[25:19.60]And so I think being
+
+[25:20.60]Really obsessed
+
+[25:21.60]With the problem
+
+[25:22.60]It's a robust
+
+[25:23.60]And just kind of
+
+[25:24.60]Directly incorporate
+
+[25:25.60]Model improvements
+
+[25:26.60]And they keep going
+
+[25:27.60]And then I first encountered
+
+[25:28.60]You guys with Charlie
+
+[25:29.60]You can tell us
+
+[25:30.60]About that project
+
+[25:31.60]Basically yeah
+
+[25:32.60]Like how much did cost
+
+[25:34.60]Become a concern
+
+[25:35.60]As you're working more
+
+[25:36.60]And more with OpenAI
+
+[25:37.60]How do you manage
+
+[25:38.60]That relationship
+
+[25:39.60]Let me talk about
+
+[25:40.60]Who Charlie is
+
+[25:41.60]You can talk about that
+
+[25:42.60]Charlie is a special character
+
+[25:43.60]So Charlie
+
+[25:44.60]When we found him
+
+[25:45.60]Was had just finished
+
+[25:46.60]His freshman year
+
+[25:47.60]At the University of Warwick
+
+[25:48.60]I think he had heard
+
+[25:49.60]About us on some discord
+
+[25:50.60]And then he applied
+
+[25:51.60]And then we just saw
+
+[25:52.60]That he had done so many
+
+[25:53.60]Incredible side projects
+
+[25:54.60]And we were actually
+
+[25:55.60]On a team retreat
+
+[25:56.60]In Barcelona
+
+[25:57.60]Visiting our head of engineering
+
+[25:58.60]At that time
+
+[25:59.60]And everyone was talking
+
+[26:00.60]About this wonder kid
+
+[26:01.60]They're like this kid
+
+[26:02.60]And then on our take home
+
+[26:03.60]Project he had done
+
+[26:04.60]Like the best of anyone
+
+[26:05.60]To that point
+
+[26:06.60]And so people were
+
+[26:07.60]Just like so excited
+
+[26:08.60]To hire him
+
+[26:09.60]So we hired him
+
+[26:10.60]As an intern
+
+[26:11.60]And then we're like Charlie
+
+[26:12.60]What if you just dropped
+
+[26:13.60]Out of school
+
+[26:14.60]And so then we convinced
+
+[26:15.60] him to take a year off
+
+[26:16.60]And he's just
+
+[26:17.60]Incredibly productive
+
+[26:18.60]And I think the thing
+
+[26:19.60]You're referring to
+
+[26:20.60]He kind of launched
+
+[26:21.60]Their constitutional AI paper
+
+[26:23.60]And within a few days
+
+[26:24.60]I think four days
+
+[26:25.60]He had basically implemented
+
+[26:26.60]That in production
+
+[26:27.60]And then we had it
+
+[26:28.60]In app a week or so after that
+
+[26:30.60]And he has since kind of
+
+[26:31.60]Contributed to major improvements
+
+[26:33.60]Like cutting costs down
+
+[26:34.60]To a tenth of what they were
+
+[26:36.60]Really large scale
+
+[26:37.60]But yeah, you can talk
+
+[26:38.60]About the technical stuff
+
+[26:39.60]Yeah, on the
+
+[26:40.60]Constitutional AI project
+
+[26:41.60]This was for abstract summarization
+
+[26:43.60]Where in illicit
+
+[26:44.60]If you run a query
+
+[26:45.60]It'll return papers to you
+
+[26:47.60]And then it will summarize
+
+[26:48.60]Each paper
+
+[26:49.60]The query for you
+
+[26:50.60]On the fly
+
+[26:51.60]And that's a really
+
+[26:52.60]Important part of illicit
+
+[26:53.60]Because illicit does it so much
+
+[26:55.60]We run a few searches
+
+[26:56.60]It'll have done it
+
+[26:57.60]A few hundred times for you
+
+[26:58.60]And so we cared a lot
+
+[26:59.60]About this both
+
+[27:00.60]Being like fast, cheap
+
+[27:02.60]And also very low on hallucination
+
+[27:04.60]I think if illicit
+
+[27:05.60]Hollucinate something
+
+[27:06.60]About the abstract
+
+[27:07.60]That's really not good
+
+[27:08.60]And so what Charlie did
+
+[27:09.60]In that project was
+
+[27:11.60]Created a constitution
+
+[27:12.60]That expressed
+
+[27:13.60]Where are the attributes
+
+[27:14.60]Of a good summary
+
+[27:15.60]Everything in the summary
+
+[27:16.60]Is reflected in the actual abstract
+
+[27:18.60]It was like
+
+[27:19.60]Very concise
+
+[27:20.60]Etc.
+
+[27:21.60]And then
+
+[27:22.60]Used RLHF
+
+[27:24.60]With a model
+
+[27:25.60]That was trained
+
+[27:26.60]On the constitution
+
+[27:27.60]To basically
+
+[27:29.60]Find you a better
+
+[27:30.60]Summarizer
+
+[27:31.60]On an open source model
+
+[27:32.60]Yeah, I think
+
+[27:33.60]That might still be in use
+
+[27:34.60]Yeah, yeah, definitely
+
+[27:35.60]Yeah, I think
+
+[27:36.60]At the time
+
+[27:37.60]The models hadn't been
+
+[27:38.60]Trained at all
+
+[27:39.60]To be faithful to a text
+
+[27:41.60]So they were just generating
+
+[27:42.60]So then when you
+
+[27:43.60]Ask them a question
+
+[27:44.60]They tried too hard
+
+[27:45.60]To answer the question
+
+[27:46.60]And didn't try hard
+
+[27:47.60]Answer the question
+
+[27:48.60]Given the text
+
+[27:49.60]Or answer what the text
+
+[27:50.60] Said about the question
+
+[27:51.60]So we had to
+
+[27:52.60]Basically teach the models
+
+[27:53.60]To do that specific task
+
+[27:54.60]How do you monitor
+
+[27:55.60]The ongoing performance
+
+[27:57.60]Of your models
+
+[27:58.60]Not to get
+
+[27:59.60]To LLMopsy
+
+[28:00.60]But you are one of the
+
+[28:01.60]Larger more well-known
+
+[28:02.60]Operations
+
+[28:03.60]Doing NLP at scale
+
+[28:04.60]I guess effectively
+
+[28:06.60]Like you have to monitor
+
+[28:07.60]These things and nobody
+
+[28:08.60]Has a good answer
+
+[28:09.60]That talks to you
+
+[28:10.60]Yeah, I don't think
+
+[28:11.60]We have a good answer yet
+
+[28:12.60]I think the answers
+
+[28:13.60]Are actually a little bit
+
+[28:14.60]Clear on the
+
+[28:15.60]Just kind of basic
+
+[28:16.60]The business side
+
+[28:17.60]Of where you can
+
+[28:18.60]Import ideas
+
+[28:19.60]From normal
+
+[28:20.60]Soft engineering
+
+[28:21.60]And normal kind
+
+[28:22.60]Of DevOps
+
+[28:23.60]You're like
+
+[28:24.60]Well, you need to
+
+[28:25.60]Monitor kind
+
+[28:26.60]Of latencies
+
+[28:27.60]And response times
+
+[28:28.60]And optime and whatnot
+
+[28:29.60]Performance is more
+
+[28:30.60]Of hallucination rate
+
+[28:31.60]And then things
+
+[28:32.60]Like hallucination rate
+
+[28:33.60]Where I think there
+
+[28:34.60]The really
+
+[28:35.60]Important thing
+
+[28:36.60]Is training time
+
+[28:37.60]So we care a lot
+
+[28:38.60]About having
+
+[28:39.60]Our own internal
+
+[28:41.60]Benchmarks
+
+[28:42.60]For model development
+
+[28:44.60]That reflect
+
+[28:45.60]So that we can
+
+[28:46.60]Know ahead of time
+
+[28:47.60]How well
+
+[28:48.60]Is the model
+
+[28:49.60]Gonna perform
+
+[28:50.60]On different types
+
+[28:51.60]Of tasks
+
+[28:52.60]So the tasks being
+
+[28:53.60]Summarization
+
+[28:54.60]Question answering
+
+[28:55.60]Given a paper
+
+[28:56.60]Ranking
+
+[28:57.60]And for each of those
+
+[28:58.60]We wanna know
+
+[28:59.60]What's the distribution
+
+[29:00.60]Of things the model
+
+[29:01.60]Is gonna see
+
+[29:02.60]So that we can
+
+[29:03.60]Have well-calibrated
+
+[29:04.60]Predictions on
+
+[29:05.60]How well the model
+
+[29:06.60]Is gonna do in production
+
+[29:07.60]And I think, yeah,
+
+[29:08.60]There's like
+
+[29:09.60]Some chance
+
+[29:10.60]That there's distribution
+
+[29:11.60]Shift and actually
+
+[29:12.60]The things users enter
+
+[29:13.60]Are gonna be different
+
+[29:14.60]Trainings right
+
+[29:15.60]And having
+
+[29:16.60]Very high quality
+
+[29:17.60]Well-vetted data
+
+[29:18.60]Sets at training time
+
+[29:19.60]I think we also
+
+[29:20.60]End up effectively
+
+[29:21.60]Monitoring by trying
+
+[29:22.60]To evaluate new models
+
+[29:23.60]As they come out
+
+[29:24.60]And so that like
+
+[29:25.60]Kind of prompts us
+
+[29:26.60]To go through
+
+[29:27.60]Our eval suite
+
+[29:28.60]Every couple of months
+
+[29:29.60]And so every time
+
+[29:30.60]A new model comes out
+
+[29:31.60]We have to see
+
+[29:32.60]Like how is this performing
+
+[29:33.60]Relative to production
+
+[29:34.60]And what we currently have
+
+[29:35.60]Yeah, I mean
+
+[29:36.60]Since we're on this topic
+
+[29:37.60]Any new models
+
+[29:38.60]That really call
+
+[29:39.60]Your eye this year
+
+[29:40.60]Like cloud came out
+
+[29:41.60]Yeah, I think cloud
+
+[29:42.60]Is pretty pretty
+
+[29:43.60]Like a good point
+
+[29:44.60]On the kind of
+
+[29:45.60]Predo frontier
+
+[29:46.60]It's neither
+
+[29:47.60]The cheapest model
+
+[29:48.60]Nor is it
+
+[29:49.60]The most accurate
+
+[29:51.60]Most high quality model
+
+[29:52.60]But it's just
+
+[29:53.60]Like a really good tradeoff
+
+[29:54.60]Between cost and accuracy
+
+[29:56.60]You apparently
+
+[29:57.60]Have to 10 shot it
+
+[29:58.60]To make it good
+
+[29:59.60]I tried using
+
+[30:00.60]Aiku for summarization
+
+[30:01.60]But zero shot
+
+[30:02.60]Was not great
+
+[30:03.60]Then they were like
+
+[30:04.60]You know, it's a skill issue
+
+[30:05.60]You have to try it harder
+
+[30:06.60]Interesting
+
+[30:07.60]I think GPT-4
+
+[30:08.60]Unlocked tables for us
+
+[30:10.60]Processing data from tables
+
+[30:11.60]Which was huge
+
+[30:12.60]GPT-4 vision
+
+[30:13.60]Yeah
+
+[30:14.60]Did you try for you
+
+[30:15.60]I guess you can't try for you
+
+[30:16.60]Because it's noncommercial
+
+[30:17.60]That's the adept model
+
+[30:18.60]Yeah, we haven't tried that one
+
+[30:19.60]Yeah
+
+[30:20.60]Yeah, but cloud is multimodal as well
+
+[30:22.60]Yeah
+
+[30:23.60]I think the interesting insight
+
+[30:24.60]That we got from talking to David Luan
+
+[30:25.60]Who is CEO of Adept
+
+[30:26.60]Was that multimodality
+
+[30:28.60]Has effectively two different flavors
+
+[30:30.60]Like one is
+
+[30:31.60]Rerecognize images from a camera
+
+[30:33.60]In the outside natural world
+
+[30:35.60]And actually the more important
+
+[30:37.60]Multimodality for knowledge work
+
+[30:38.60]Is screenshots
+
+[30:39.60]And you know
+
+[30:40.60]PDFs and charts and graphs
+
+[30:42.60]So we need a new term
+
+[30:43.60]For that kind of multimodality
+
+[30:45.60]But is a claim
+
+[30:46.60]That current models
+
+[30:47.60]Are good at one or the other
+
+[30:49.60]Yeah, they're over index
+
+[30:50.60]Because of the history of computer vision
+
+[30:51.60]Is coco, right?
+
+[30:53.60]So now we're like
+
+[30:54.60]Oh, actually, you know
+
+[30:55.60]Screens are more important
+
+[30:56.60]OCR handwriting
+
+[30:58.60]You mentioned a lot of
+
+[30:59.60]Closed model lab stuff
+
+[31:01.60]And then you also have
+
+[31:02.60]Like this open source model
+
+[31:03.60]Find tuning stuff
+
+[31:04.60]Like what is your workload
+
+[31:05.60]Now between close and open
+
+[31:06.60]It's a good question
+
+[31:07.60]I think
+
+[31:08.60]It's half and half
+
+[31:09.60]Is that even a relevant question
+
+[31:10.60]Or not
+
+[31:11.60]This is a nonsensical question
+
+[31:12.60]It depends a little bit on
+
+[31:13.60]Like how you index
+
+[31:14.60]Whether you index by
+
+[31:15.60]Like computer cost
+
+[31:16.60]The number of queries
+
+[31:17.60]I'd say like
+
+[31:18.60]In terms of number of queries
+
+[31:19.60]Is maybe similar
+
+[31:20.60]In terms of like costing computer
+
+[31:22.60]I think the closed models
+
+[31:23.60]Make up more of the budget
+
+[31:25.60]Since the main cases
+
+[31:26.60]Where you want to use closed models
+
+[31:28.60]Are cases where
+
+[31:29.60]They're just smarter
+
+[31:31.60]Where there are no existing
+
+[31:33.60]Open source models
+
+[31:34.60]Are quite smart enough
+
+[31:35.60]Yeah
+
+[31:36.60]We have a lot of
+
+[31:37.60]Interesting technical questions
+
+[31:38.60]To go in
+
+[31:39.60]But just to wrap
+
+[31:40.60]The kind of like
+
+[31:41.60]UX evolution
+
+[31:42.60]Now you have the notebooks
+
+[31:43.60]We talked a lot
+
+[31:44.60]About how chatbots
+
+[31:45.60]Are not the final frontier
+
+[31:47.60]You know
+
+[31:48.60]How did you decide
+
+[31:49.60]To get into notebooks
+
+[31:50.60]Which is a very iterative
+
+[31:51.60]Kind of like interactive
+
+[31:52.60]Interface
+
+[31:53.60]And yeah
+
+[31:54.60]Maybe learnings from that
+
+[31:55.60]Yeah this is actually
+
+[31:56.60]Our fourth time
+
+[31:57.60]Trying to make this work
+
+[31:59.60]I think the first time
+
+[32:00.60]Was probably in early 2021
+
+[32:03.60]I think because
+
+[32:04.60]We've always been obsessed
+
+[32:05.60]With this idea of task
+
+[32:06.60]Decomposition
+
+[32:07.60]And like branching
+
+[32:08.60]We always wanted a tool
+
+[32:10.60]That could be kind of
+
+[32:11.60]Unbounded
+
+[32:12.60]Where you could keep going
+
+[32:13.60]Could do a lot of branching
+
+[32:14.60]Where you could kind of apply
+
+[32:15.60]Language model operations
+
+[32:17.60]Or computations on other tasks
+
+[32:19.60]So in 2021
+
+[32:20.60]We had this thing called
+
+[32:21.60]Composite tasks
+
+[32:22.60]Where you could use GPT-3
+
+[32:23.60]To brainstorm
+
+[32:24.60]A bunch of research questions
+
+[32:25.60]And then take
+
+[32:26.60]Each research question
+
+[32:27.60]And decompose those
+
+[32:28.60]Further into subquestions
+
+[32:30.60]This kind of again
+
+[32:31.60]That like task decomposition
+
+[32:32.60]Tree type thing
+
+[32:33.60]Was always very exciting to us
+
+[32:35.60]But that was like
+
+[32:36.60]It was kind of overwhelming
+
+[32:37.60]Then at the end of 22
+
+[32:39.60]I think we tried again
+
+[32:40.60]And at that point
+
+[32:41.60]We were thinking
+
+[32:42.60]Okay we've done a lot
+
+[32:43.60]With this literature review thing
+
+[32:44.60]We also want to start helping
+
+[32:45.60]With kind of adjacent domains
+
+[32:47.60]And different workflows
+
+[32:48.60]Like we want to help more
+
+[32:49.60]With machine learning
+
+[32:50.60]What does that look like
+
+[32:51.60]And as we were thinking
+
+[32:52.60]About it we're like
+
+[32:53.60]Well there are so many
+
+[32:54.60]Research workflows
+
+[32:55.60]How do we not just build
+
+[32:56.60]Three new workflows
+
+[32:57.60]Into elicit
+
+[32:58.60]But make elicit
+
+[32:59.60]Really generic
+
+[33:00.60]To lots of workflows
+
+[33:01.60]What is like a generic
+
+[33:02.60]Composable system
+
+[33:03.60]With nice abstractions
+
+[33:04.60]That can like
+
+[33:05.60]Scale to all these workflows
+
+[33:06.60]So we like
+
+[33:07.60]Iterated on that a bunch
+
+[33:08.60]And like
+
+[33:09.60]Didn't quite narrow
+
+[33:10.60]The problem space enough
+
+[33:11.60]Or like
+
+[33:12.60]Get to what we wanted
+
+[33:13.60]And then I think it was
+
+[33:14.60]At the beginning of 2023
+
+[33:16.60]We were like
+
+[33:17.60]Wow computational notebooks
+
+[33:18.60]Kind of enable this
+
+[33:19.60]Where they have a lot
+
+[33:20.60]Of flexibility
+
+[33:21.60]But you know
+
+[33:22.60]Kind of robust primitive
+
+[33:23.60]Such that you can extend
+
+[33:24.60]The workflow
+
+[33:25.60]And it's not limited
+
+[33:26.60]It's not like
+
+[33:27.60]You ask a query
+
+[33:28.60]You get an answer
+
+[33:29.60]You're done
+
+[33:30.60]You can just constantly
+
+[33:31.60]Keep building on top of that
+
+[33:32.60]And each little step
+
+[33:33.60]Seems like a really good
+
+[33:34.60]Work for the language model
+
+[33:35.60]And also there was just
+
+[33:36.60]Like really helpful
+
+[33:37.60]To have a bit more
+
+[33:38.60]Preexisting work to emulate
+
+[33:40.60]Yeah, that's kind of
+
+[33:41.60]How we ended up at
+
+[33:42.60]Computational notebooks
+
+[33:43.60]For elicit
+
+[33:44.60]Maybe one thing
+
+[33:45.60]That's worth making explicit
+
+[33:46.60]Is the difference between
+
+[33:47.60]Computational notebooks
+
+[33:48.60]And chat because
+
+[33:49.60]On the surface
+
+[33:50.60]They seem pretty similar
+
+[33:51.60]It's kind of this iterative
+
+[33:52.60]Interaction where you add stuff
+
+[33:53.60]In both cases
+
+[33:54.60]You have a back and forth
+
+[33:55.60]Between you enter stuff
+
+[33:56.60]And then you get some output
+
+[33:57.60]And then you enter stuff
+
+[33:58.60]But the important difference
+
+[33:59.60]In our minds is
+
+[34:00.60]With notebooks
+
+[34:01.60]You can define a process
+
+[34:03.60]So in data science
+
+[34:04.60]You know like
+
+[34:05.60]Here's like my data analysis
+
+[34:06.60]Process that takes in a CSV
+
+[34:08.60]And then does some extraction
+
+[34:09.60]And then generates a figure
+
+[34:10.60]At the end
+
+[34:11.60]And you can prototype it
+
+[34:13.60]Using a small CSV
+
+[34:14.60]And then you can run it
+
+[34:15.60]Over a much larger CSV
+
+[34:16.60]Later
+
+[34:17.60]And similarly
+
+[34:18.60]The vision for notebooks
+
+[34:19.60]In our case
+
+[34:20.60]Is to not make it this
+
+[34:22.60]Like one of chat interaction
+
+[34:23.60]But to allow you to then
+
+[34:25.60]Say if you start
+
+[34:27.60]And first you're like
+
+[34:28.60]Okay, let me just
+
+[34:29.60]Analyze a few papers
+
+[34:30.60]And see do I get to
+
+[34:31.60]The correct conclusions
+
+[34:32.60]For those few papers
+
+[34:33.60]Can I then later
+
+[34:34.60]Go back and say
+
+[34:35.60]Now let me run this
+
+[34:36.60]Over 10,000 papers
+
+[34:38.60]Now that I've debug
+
+[34:39.60]The process
+
+[34:40.60]Using a few papers
+
+[34:41.60]And that's an interaction
+
+[34:42.60]That doesn't fit
+
+[34:43.60]Quite as well
+
+[34:44.60]Into the chat framework
+
+[34:45.60]Because that's more
+
+[34:46.60]For kind of quick
+
+[34:47.60]Back and forth
+
+[34:48.60]Interaction
+
+[34:49.60]Do you think in notebooks
+
+[34:50.60]That's kind of like
+
+[34:51.60]Structure, editable
+
+[34:52.60]Chain of thought
+
+[34:53.60]Basically step by step
+
+[34:54.60]Like is that kind of
+
+[34:55.60]Where you see this going
+
+[34:56.60]And then are people
+
+[34:57.60]Gonna reuse notebooks
+
+[34:59.60]As like templates
+
+[35:00.60]And maybe in traditional
+
+[35:01.60]Notebooks
+
+[35:02.60]As like cookbooks
+
+[35:03.60]Right, you share a cookbook
+
+[35:04.60]You can start from there
+
+[35:05.60]Is that similar
+
+[35:06.60]And illicit
+
+[35:07.60]Yeah, that's exactly right
+
+[35:08.60]So that's our hope
+
+[35:09.60]That people will build templates
+
+[35:10.60]Share them with other people
+
+[35:12.60]I think chain of thought
+
+[35:13.60]Is maybe still like
+
+[35:14.60]Kind of one level
+
+[35:15.60]Lower on the abstraction hierarchy
+
+[35:17.60]Then we would think of notebooks
+
+[35:19.60]I think we'll probably
+
+[35:20.60]Want to think about
+
+[35:21.60]More semantic pieces
+
+[35:22.60]Like a building block
+
+[35:23.60]Is more like a paper search
+
+[35:25.60]Or an extraction
+
+[35:26.60]Or a list of concepts
+
+[35:28.60]And then the models
+
+[35:30.60]And the reasoning
+
+[35:31.60]Will probably often be
+
+[35:32.60]One level down
+
+[35:33.60]You always want to
+
+[35:34.60]Be able to see it
+
+[35:35.60]But you don't always
+
+[35:36.60]Want it to be front and center
+
+[35:37.60]Yeah, what's the difference
+
+[35:38.60]Between a notebook
+
+[35:39.60]And an agent
+
+[35:40.60]Since everybody always
+
+[35:41.60]Ask me what's an agent
+
+[35:42.60]Like how do you think
+
+[35:43.60]About where the line is
+
+[35:45.60]Yeah, it's an interesting
+
+[35:46.60]Question
+
+[35:47.60]In the notebook world
+
+[35:48.60]I would
+
+[35:49.60]Generally think of
+
+[35:50.60]The human as the agent
+
+[35:51.60]In the first iteration
+
+[35:52.60]So you have the notebook
+
+[35:53.60]And the human kind of
+
+[35:54.60]Adds little action steps
+
+[35:56.60]And then the next point
+
+[35:58.60]On this kind of progress
+
+[35:59.60]Okay, now you can use
+
+[36:00.60]Language models to predict
+
+[36:01.60]Which action
+
+[36:02.60]Would you take as a human
+
+[36:03.60]And at some point
+
+[36:04.60]You're probably going to
+
+[36:05.60]Be very good at this
+
+[36:06.60]You'll be like, okay
+
+[36:07.60]In some cases, I can
+
+[36:08.60]With 99.9% accuracy
+
+[36:09.60]Predict what you do
+
+[36:10.60]And then you might
+
+[36:11.60]As well just execute it
+
+[36:12.60]Like why wait for the human
+
+[36:13.60]And eventually
+
+[36:14.60]As you get better at this
+
+[36:15.60]That will just look
+
+[36:16.60]More and more like agents
+
+[36:18.60]Taking actions
+
+[36:19.60]As opposed to you
+
+[36:20.60]Doing the thing
+
+[36:21.60]I think templates
+
+[36:22.60]Are a specific case of this
+
+[36:23.60]Very like, okay, well
+
+[36:24.60]There's just particular
+
+[36:25.60]Sequences of actions
+
+[36:26.60]That you often want to chunk
+
+[36:27.60]And have available
+
+[36:28.60]Just like in normal
+
+[36:29.60]Programming
+
+[36:30.60]And those
+
+[36:31.60]You can view them as
+
+[36:32.60]Action sequences of agents
+
+[36:33.60]Or you can view them as
+
+[36:34.60]More normal programming
+
+[36:36.60]Language abstraction thing
+
+[36:37.60]And I think those
+
+[36:38.60]Are two valid views
+
+[36:40.60]How do you see this
+
+[36:41.60]Changes
+
+[36:42.60]Like you said, the models
+
+[36:43.60]Get better and you need
+
+[36:44.60]Less and less human
+
+[36:45.60]Actual interfacing
+
+[36:47.60]With the model
+
+[36:48.60]You just get the results
+
+[36:49.60]Like how does the UX
+
+[36:50.60]And the way people
+
+[36:51.60]Perceive it change
+
+[36:52.60]Yeah, I think this
+
+[36:53.60] kind of interaction
+
+[36:54.60]Paradimes for evaluation
+
+[36:55.60]Is not really something
+
+[36:56.60]The internet has encountered
+
+[36:57.60]Yet because up to now
+
+[36:58.60]The internet has all been
+
+[36:59.60]About getting data
+
+[37:00.60]And work from people
+
+[37:02.60]So increasingly
+
+[37:03.60]I really want kind of
+
+[37:04.60]Evaluation both from
+
+[37:05.60]An interface perspective
+
+[37:06.60]And from like a
+
+[37:07.60]Technical perspective
+
+[37:08.60]Operation perspective
+
+[37:09.60]To be a superpower
+
+[37:10.60]For elicit because I think
+
+[37:11.60]Over time models will do
+
+[37:12.60]More and more of the work
+
+[37:13.60]And people will have
+
+[37:14.60]To do more and more
+
+[37:15.60]Of the evaluation
+
+[37:16.60]So I think yeah
+
+[37:17.60]In terms of the interface
+
+[37:18.60]Some of the things we have
+
+[37:19.60]Today, you know
+
+[37:20.60]For every kind of
+
+[37:21.60]Language model generation
+
+[37:22.60]There's some citation back
+
+[37:23.60]And we kind of try to
+
+[37:24.60]Highlight the ground truth
+
+[37:25.60]In the paper
+
+[37:26.60]To whatever elicit said
+
+[37:27.60]And make it super easy
+
+[37:28.60]So you can click on it
+
+[37:29.60]And quickly see
+
+[37:30.60]In context and validate
+
+[37:31.60]Whether the text
+
+[37:32.60]Actually supports
+
+[37:33.60]The answer that elicit gave
+
+[37:34.60]So I think we'd probably
+
+[37:35.60]Want to scale things up
+
+[37:37.60]Like that, like the ability
+
+[37:38.60]To kind of spot check
+
+[37:39.60]The models work super
+
+[37:40.60]Quickly scale up
+
+[37:41.60]Interfaces like that
+
+[37:42.60]And who would spot check
+
+[37:44.60]The user
+
+[37:45.60]Yeah, to start
+
+[37:46.60]It would be the user
+
+[37:47.60]One of the other things
+
+[37:48.60]We do is also kind of flag
+
+[37:49.60]The models uncertainty
+
+[37:50.60]So we have models report
+
+[37:52.60]Out how confident are you
+
+[37:53.60]That this was the
+
+[37:54.60]Sample size of this study
+
+[37:55.60]The model's not sure
+
+[37:56.60]We throw a flag
+
+[37:57.60]And so the user knows
+
+[37:58.60]To prioritize checking that
+
+[37:59.60]So again, we can kind of
+
+[38:00.60]Scale that up
+
+[38:01.60]So when the model's like
+
+[38:02.60]Well, I searched this
+
+[38:03.60]On Google, I'm not sure
+
+[38:04.60]If that was the right thing
+
+[38:05.60]I have an uncertainty flag
+
+[38:06.60]And the user can go
+
+[38:07.60]And be like, okay
+
+[38:08.60]That was actually
+
+[38:09.60]The right thing to do or not
+
+[38:10.60]I've tried to do
+
+[38:11.60]Uncertainty ratings
+
+[38:12.60]From models
+
+[38:13.60]I don't know
+
+[38:14.60]If you have this live
+
+[38:15.60]Because I just
+
+[38:16.60]Didn't find them reliable
+
+[38:17.60]Because they just elucidated
+
+[38:18.60]Their own uncertainty
+
+[38:19.60]I would love to
+
+[38:20.60]Based on log probes
+
+[38:22.60]Or something more
+
+[38:23.60]Native within the model
+
+[38:24.60]Better than generated
+
+[38:25.60]But it sounds like
+
+[38:27.60]The scale properly for you
+
+[38:29.60]Yeah, we found it
+
+[38:30.60]To be pretty calibrated
+
+[38:31.60]Diverse on the model
+
+[38:32.60]I think in some cases
+
+[38:33.60]We also used
+
+[38:34.60]To different models
+
+[38:35.60]For the answer estimates
+
+[38:36.60]Then for the question
+
+[38:37.60]Answering
+
+[38:38.60]So one model would say
+
+[38:39.60]Here's my chain of thought
+
+[38:40.60]Here's my answer
+
+[38:41.60]And then a different
+
+[38:42.60]Type of model
+
+[38:43.60]Let's say the first model
+
+[38:44.60]Is Lama
+
+[38:45.60]And let's say the second
+
+[38:46.60]Model is GP3.5
+
+[38:47.60]And then the second model
+
+[38:49.60]Just looks over
+
+[38:50.60]The results and like
+
+[38:51.60]Okay, how confident
+
+[38:52.60]Are you in this
+
+[38:53.60]And I think
+
+[38:54.60]Sometimes using
+
+[38:55.60]A different model
+
+[38:56.60]Can be better than
+
+[38:57.60]Using the same model
+
+[38:58.60]Yeah, you know
+
+[38:59.60]On topic of models
+
+[39:00.60]Evaluating models
+
+[39:01.60]Obviously you can
+
+[39:02.60]Do that all day long
+
+[39:03.60]Like what's your budget
+
+[39:04.60]Like because
+
+[39:05.60]Your queries
+
+[39:06.60]Fan out a lot
+
+[39:07.60]And then you have
+
+[39:08.60]Models evaluating models
+
+[39:09.60]One person typing
+
+[39:10.60]In a question
+
+[39:11.60]Can lead to
+
+[39:12.60]A thousand calls
+
+[39:13.60]It depends on the project
+
+[39:14.60]So if the project
+
+[39:15.60]Is basically
+
+[39:16.60]A systematic review
+
+[39:17.60]That otherwise
+
+[39:18.60]Human research assistance
+
+[39:19.60]Would do
+
+[39:20.60]Then the project
+
+[39:21.60]Is basically
+
+[39:22.60]Can get quite large
+
+[39:23.60]For those projects
+
+[39:24.60]I don't know
+
+[39:25.60]Let's say
+
+[39:26.60]A hundred thousand dollars
+
+[39:27.60]So in those cases
+
+[39:28.60]You're happier
+
+[39:29.60]To spend compute
+
+[39:30.60]Then in the
+
+[39:31.60]Can of shallow search case
+
+[39:32.60]Where someone
+
+[39:33.60]Just enters a question
+
+[39:34.60]Because I don't know
+
+[39:35.60]Maybe like it
+
+[39:36.60]I heard about creatine
+
+[39:37.60]What's it about
+
+[39:38.60]Probably don't want
+
+[39:39.60]To spend a lot of compute
+
+[39:40.60]On that
+
+[39:41.60]This sort of
+
+[39:42.60]Being able to invest
+
+[39:43.60]More or less compute
+
+[39:44.60]Into getting
+
+[39:45.60]More or less accurate answers
+
+[39:46.60]I think one of the
+
+[39:47.60]Core things we care about
+
+[39:48.60]And that I think
+
+[39:49.60]Is currently undervalued
+
+[39:50.60]In the AI space
+
+[39:51.60]You can't choose
+
+[39:52.60]Which model you want
+
+[39:53.60]And you can sometimes
+
+[39:54.60]I don't know
+
+[39:55.60]You'll tip it
+
+[39:56.60]It'll try harder
+
+[39:57.60]Or you can try various
+
+[39:58.60]Things to get it to work harder
+
+[40:00.60]But you don't have great
+
+[40:01.60]Ways of converting
+
+[40:02.60]Willingness to spend
+
+[40:03.60]Into better answers
+
+[40:04.60]And we really
+
+[40:05.60]Want to build a product
+
+[40:06.60]That has this sort of
+
+[40:07.60]Unbounded flavor
+
+[40:08.60]Where like if you care
+
+[40:09.60]About it a lot
+
+[40:10.60]You should be able to get
+
+[40:11.60]Really high quality answers
+
+[40:12.60]Really double-checked
+
+[40:13.60]In every way
+
+[40:14.60]And you have a
+
+[40:15.60]Credit-based pricing
+
+[40:16.60]So unlike most products
+
+[40:17.60]It's not a fixed monthly
+
+[40:19.60]Right exactly
+
+[40:20.60]Some of the
+
+[40:21.60]Higher costs are
+
+[40:22.60]Teared
+
+[40:23.60]So for most casual users
+
+[40:25.60]They'll just get
+
+[40:26.60]The abstract summary
+
+[40:27.60]Which is kind of
+
+[40:28.60]An open source model
+
+[40:29.60]Then you can
+
+[40:30.60]Add more columns
+
+[40:31.60]Which have more
+
+[40:32.60]Extractions
+
+[40:33.60]And these uncertainty features
+
+[40:34.60]And then you can also
+
+[40:35.60]Add the same columns
+
+[40:36.60]In high accuracy mode
+
+[40:37.60]Which also parses the table
+
+[40:38.60]So we kind of
+
+[40:39.60]Stack the complexity
+
+[40:40.60]And the cost
+
+[40:41.60]You know the fun thing
+
+[40:42.60]You can do with a credit system
+
+[40:43.60]Which is data for data
+
+[40:44.60]Basically you can
+
+[40:45.60]Give people more credit
+
+[40:46.60]If they give
+
+[40:47.60]Data back to you
+
+[40:48.60]I don't know
+
+[40:49.60]You don't have money
+
+[40:50.60]But you have time
+
+[40:51.60]How do you exchange that
+
+[40:53.60]It's a fair trade
+
+[40:54.60]I think it's interesting
+
+[40:55.60]We haven't quite operationized it
+
+[40:56.60]And then you know
+
+[40:57.60]There's been some kind of like
+
+[40:58.60]Adverse selection
+
+[40:59.60]Like you know for example
+
+[41:00.60]It would be really valuable
+
+[41:01.60]To get feedback on our model
+
+[41:02.60]So maybe if you were willing
+
+[41:03.60]To give more robust feedback
+
+[41:04.60]On our results
+
+[41:05.60]We could give you credits
+
+[41:06.60]Or something like that
+
+[41:07.60]But then there's kind of this
+
+[41:08.60]Will people take it seriously
+
+[41:09.60]And you want the good people
+
+[41:10.60]Exactly
+
+[41:11.60]Can you tell who are the good people
+
+[41:12.60]Not right now
+
+[41:13.60]But yeah maybe
+
+[41:14.60]At the point where we can
+
+[41:15.60]We can offer it
+
+[41:16.60]We can offer it up to them
+
+[41:17.60]The perplexity of questions asked
+
+[41:18.60]If it's higher perplexity
+
+[41:19.60]These are smarter people
+
+[41:20.60]Yeah maybe
+
+[41:21.60]And if you make a lot of typos
+
+[41:22.60]In your queries
+
+[41:23.60]You're not going to get off
+
+[41:24.60]How does that change
+
+[41:25.60]Negative social credit
+
+[41:28.60]It's very topical right now
+
+[41:29.60]To think about
+
+[41:30.60]The threat of long context windows
+
+[41:32.60]All these models
+
+[41:34.60]We're talking about these days
+
+[41:35.60]All like a million tokens plus
+
+[41:36.60]Is that relevant for you
+
+[41:38.60]Can you make use of that
+
+[41:39.60]Is that just prohibitively expensive
+
+[41:41.60]Because you're just paying
+
+[41:42.60]For all those tokens
+
+[41:43.60]Or you're just doing right
+
+[41:44.60]It's definitely relevant
+
+[41:45.60]And when we think about search
+
+[41:46.60]As many people do
+
+[41:47.60]We think about kind of
+
+[41:48.60]A staged pipeline
+
+[41:49.60]Of retrieval
+
+[41:50.60]Where first you use
+
+[41:51.60]Semitic search database
+
+[41:53.60]With embeddings
+
+[41:54.60]Get like the
+
+[41:55.60]In our case maybe 400
+
+[41:56.60]Or so most relevant papers
+
+[41:57.60]And then
+
+[41:58.60]You still need to rank those
+
+[41:59.60]And I think at that point
+
+[42:01.60]It becomes pretty interesting
+
+[42:02.60]To use larger models
+
+[42:04.60]So specifically in the past
+
+[42:06.60]I think a lot of ranking
+
+[42:07.60]Was kind of per item ranking
+
+[42:09.60]Where you would score
+
+[42:10.60]Each individual item
+
+[42:11.60]Maybe using increasingly
+
+[42:12.60]Expensive scoring methods
+
+[42:13.60]And then rank based on the scores
+
+[42:15.60]But I think list wise
+
+[42:16.60]We ranking where
+
+[42:17.60]You have a model
+
+[42:18.60]That can see
+
+[42:19.60]All the elements
+
+[42:20.60]Is a lot more powerful
+
+[42:21.60]Because often you can
+
+[42:22.60]Only really tell
+
+[42:23.60]How good a thing is
+
+[42:24.60]In comparison to other things
+
+[42:26.60]And what things should come first
+
+[42:28.60]It really depends on
+
+[42:29.60]Like well what other things
+
+[42:30.60]Are available
+
+[42:31.60]Maybe you even care about
+
+[42:32.60]Diversity and your results
+
+[42:33.60]You don't want to show
+
+[42:34.60]Ten very similar papers
+
+[42:35.60]As the first 10 results
+
+[42:36.60]So I think along context models
+
+[42:38.60]Are quite interesting there
+
+[42:40.60]And especially for our case
+
+[42:41.60]Where we care more about
+
+[42:43.60]Power users who are perhaps
+
+[42:45.60]A little bit more
+
+[42:46.60]Welling to wait a little bit longer
+
+[42:47.60]To get higher quality results
+
+[42:48.60]Relative to people who just
+
+[42:50.60]Quickly check out things
+
+[42:51.60]Because why not
+
+[42:52.60]I think being able to spend
+
+[42:53.60]More on longer context
+
+[42:54.60]Is quite valuable
+
+[42:55.60]Yeah I think one thing
+
+[42:56.60]The longer context models
+
+[42:57.60]Changed for us
+
+[42:58.60]Is maybe a focus from
+
+[43:00.60]Breaking down tasks
+
+[43:01.60]To breaking down the evaluation
+
+[43:03.60]So before you know
+
+[43:05.60]If we wanted to answer
+
+[43:06.60]A question from the full text
+
+[43:08.60]Of a paper
+
+[43:09.60]We had to figure out
+
+[43:10.60]How to chunk it and like
+
+[43:11.60]Find the relevant chunk
+
+[43:12.60]And then answer
+
+[43:13.60]Based on that chunk
+
+[43:14.60]Then you know
+
+[43:15.60]Which chunk the model
+
+[43:16.60]Used to answer the question
+
+[43:17.60]So if you want to help
+
+[43:18.60]The user to check it
+
+[43:19.60]Yeah you can be like
+
+[43:20.60]Well this was the chunk
+
+[43:21.60]That the model got
+
+[43:22.60]And now if you put the whole
+
+[43:23.60]Text in the paper
+
+[43:24.60]You have to kind of
+
+[43:25.60]Find the chunk
+
+[43:26.60]Like more retroactively
+
+[43:27.60]Basically and so you need
+
+[43:28.60]Kind of like a different
+
+[43:29.60]Set of abilities
+
+[43:30.60]And obviously like
+
+[43:31.60]A different technology
+
+[43:32.60]To figure out
+
+[43:33.60]You still want to point
+
+[43:34.60]The user to the supporting
+
+[43:35.60]Quotes in the text
+
+[43:36.60]But then the interaction
+
+[43:37.60]Is a little different
+
+[43:38.60]You like scan through
+
+[43:39.60]And find some rouge score
+
+[43:40.60]Yeah the floor
+
+[43:41.60]I think there's an
+
+[43:42.60]Interesting space of
+
+[43:43.60]Almost research problems
+
+[43:44.60]Here because
+
+[43:45.60]You would ideally
+
+[43:46.60]Make causal claims
+
+[43:47.60]Like if this
+
+[43:48.60]Hadn't been in the text
+
+[43:49.60]The model wouldn't
+
+[43:50.60]Have said this thing
+
+[43:51.60]And maybe you can do
+
+[43:52.60]Expensive approximations
+
+[43:53.60]To that where like
+
+[43:54.60]I don't know you just
+
+[43:55.60]Throw a chunk of the paper
+
+[43:56.60]And re-answer
+
+[43:57.60]And see what happens
+
+[43:58.60]But hopefully
+
+[43:59.60]There are better
+
+[44:00.60]Ways of doing that
+
+[44:01.60]Where you just get
+
+[44:03.60]That kind of counterfactual
+
+[44:04.60]Information for free
+
+[44:05.60]From the model
+
+[44:06.60]Do you think at all
+
+[44:07.60]About the cost of maintaining
+
+[44:09.60]Reg versus just putting
+
+[44:10.60]More tokens in the window
+
+[44:12.60]I think in software
+
+[44:13.60]Development a lot of
+
+[44:14.60]Times people buy
+
+[44:15.60]Developer productivity
+
+[44:16.60]Things so that
+
+[44:17.60]We don't have to worry
+
+[44:18.60]About it context windows
+
+[44:19.60]Kinda the same right
+
+[44:20.60]You have to maintain
+
+[44:21.60]Chunking and like
+
+[44:22.60]Reg retrieval and like
+
+[44:23.60]Re-ranking and all of this
+
+[44:24.60] Versus I just shove
+
+[44:25.60]Everything into the context
+
+[44:26.60]And like it costs
+
+[44:27.60]A little more
+
+[44:28.60]But at least I don't
+
+[44:29.60]Have to do all of that
+
+[44:30.60]Is that something
+
+[44:31.60]You thought about
+
+[44:32.60]I think we still
+
+[44:33.60]Like hit up against
+
+[44:34.60]Context limits enough
+
+[44:35.60]That it's not really
+
+[44:36.60]Do we still want
+
+[44:37.60]To keep this rag around
+
+[44:38.60]It's like we do still
+
+[44:39.60]Need it for the scale
+
+[44:40.60]The worth we're doing
+
+[44:41.60]I think there are
+
+[44:42.60]Different kinds of
+
+[44:43.60]Maintainability in
+
+[44:44.60]One sense I think
+
+[44:45.60]You write that
+
+[44:46.60]Throw everything into
+
+[44:47.60]The context window thing
+
+[44:48.60]Is easier to maintain
+
+[44:49.60]Because you just
+
+[44:50.60]Can swap out a model
+
+[44:52.60]In another sense
+
+[44:53.60]If things go wrong
+
+[44:54.60]It's harder to debug
+
+[44:55.60]Like if you know
+
+[44:56.60]Here's the process
+
+[44:57.60]That we go through
+
+[44:58.60]To go from
+
+[45:00.60]200 million papers
+
+[45:01.60]To an answer
+
+[45:02.60]And there are like
+
+[45:03.60]Little steps
+
+[45:04.60]And you understand
+
+[45:05.60]Okay this is the step
+
+[45:06.60]That finds the relevant
+
+[45:07.60]Paragraph or whatever
+
+[45:08.60]Maybe you'll know
+
+[45:09.60]Which step breaks
+
+[45:10.60]If it's just like
+
+[45:11.60]A new model
+
+[45:12.60]Version came out
+
+[45:13.60]And now it suddenly
+
+[45:14.60]Doesn't find your needle
+
+[45:15.60]In a haystack anymore
+
+[45:16.60]Then you're like
+
+[45:17.60]Okay what can you do
+
+[45:18.60]You're kind of at a loss
+
+[45:20.60]Yeah let's talk
+
+[45:21.60]A bit about needle
+
+[45:22.60]In a haystack
+
+[45:23.60]And like maybe
+
+[45:24.60]The opposite of it
+
+[45:25.60]Which is like hard
+
+[45:26.60]Grounding I don't know
+
+[45:27.60]That's like the best thing
+
+[45:28.60]To think about it
+
+[45:29.60]But I was using
+
+[45:30.60]One of these
+
+[45:31.60]Chavicher documents
+
+[45:32.60]Features
+
+[45:33.60]And I put the
+
+[45:34.60]AMD MI300
+
+[45:35.60]Spacks and the
+
+[45:36.60]Blackwell chips
+
+[45:37.60]From NVIDIA
+
+[45:38.60]And I was asking questions
+
+[45:39.60]And we like
+
+[45:40.60]And the response was like
+
+[45:41.60]Oh it doesn't say
+
+[45:42.60]In the specs
+
+[45:43.60]But if you ask
+
+[45:44.60]GbD4 without the docs
+
+[45:45.60]It would tell you no
+
+[45:46.60]Because nvlink
+
+[45:47.60]It's an NVIDIA
+
+[45:48.60]It's technology
+
+[45:49.60]Just as your N.V.
+
+[45:50.60]Yeah hey man
+
+[45:51.60]It just says in the thing
+
+[45:52.60]How do you think about
+
+[45:53.60]That having the context
+
+[45:54.60]Sometimes suppress
+
+[45:55.60]The knowledge
+
+[45:56.60]That the model has
+
+[45:57.60]It really depends on the task
+
+[45:58.60]Because I think
+
+[45:59.60]Sometimes that is
+
+[46:00.60]Exactly what you want
+
+[46:01.60]So imagine your researcher
+
+[46:02.60]You're writing the background
+
+[46:03.60]Section of your paper
+
+[46:04.60]And you're trying to describe
+
+[46:05.60]What these other papers say
+
+[46:06.60]You really don't want
+
+[46:07.60]Extra information
+
+[46:08.60]To be introduced there
+
+[46:09.60]In other cases
+
+[46:10.60]Where you're just trying
+
+[46:11.60]To figure out the truth
+
+[46:12.60]And you're giving
+
+[46:13.60]The documents because
+
+[46:14.60]You think they will help
+
+[46:15.60]The model figure out
+
+[46:16.60]What the truth is
+
+[46:17.60]I think you do want
+
+[46:18.60]If the model has a hunch
+
+[46:19.60]That there might be
+
+[46:21.60]Something that's not
+
+[46:22.60]In the papers
+
+[46:23.60]You do want to surface that
+
+[46:24.60]I think ideally
+
+[46:25.60]You still don't want
+
+[46:26.60]The model to just tell you
+
+[46:27.60]I think probably
+
+[46:28.60]The ideal thing
+
+[46:29.60]Looks a bit more like
+
+[46:30.60]Agent control
+
+[46:31.60]Where the model can issue
+
+[46:33.60]A query that then
+
+[46:35.60]Is intended to surface
+
+[46:36.60]The documents that
+
+[46:37.60]Substantiate its hunch
+
+[46:38.60]That's maybe
+
+[46:39.60]A reasonable middle ground
+
+[46:40.60]Between
+
+[46:41.60]While just telling you
+
+[46:42.60]And while being fully
+
+[46:43.60]Limited to the papers
+
+[46:44.60]You give it
+
+[46:45.60]Yeah, I would say
+
+[46:46.60]They're just kind of
+
+[46:47.60]Different tasks right now
+
+[46:48.60]And the tasks that
+
+[46:49.60]Elicit is mostly focused on
+
+[46:50.60]Is what do these papers say
+
+[46:51.60]But there is another task
+
+[46:52.60]Which is like
+
+[46:53.60]Just give me the best
+
+[46:54.60]Possible answer
+
+[46:55.60]And that give me
+
+[46:56.60]The best possible answer
+
+[46:57.60]Sometimes depends
+
+[46:58.60]On what do these papers say
+
+[46:59.60]But it can also depend
+
+[47:00.60]On other stuff
+
+[47:01.60]That's not in the papers
+
+[47:02.60]So ideally
+
+[47:03.60]We can do both
+
+[47:04.60]And then kind of
+
+[47:05.60]We can ask
+
+[47:06.60]For you
+
+[47:07.60]More going forward
+
+[47:08.60]We have
+
+[47:09.60]See a lot of details
+
+[47:10.60]But just to zoom
+
+[47:11.60]Back out a little bit
+
+[47:12.60]What are maybe
+
+[47:13.60]The most underrated
+
+[47:14.60]Features of elicit
+
+[47:16.60]And what is
+
+[47:17.60]One thing that
+
+[47:18.60]Maybe the users
+
+[47:19.60]Surprise you the most
+
+[47:20.60]By using it
+
+[47:21.60]I think the most
+
+[47:22.60]Powerful feature of elicit
+
+[47:23.60]Is the ability to
+
+[47:24.60]Extract
+
+[47:25.60]Add columns to this table
+
+[47:26.60]Which effectively
+
+[47:27.60]Extracts data
+
+[47:28.60]From all of your
+
+[47:29.60]Papers at once
+
+[47:30.60]It's well used
+
+[47:31.60]But there are
+
+[47:32.60]Kind of many different
+
+[47:33.60]Extensions of that
+
+[47:34.60]We let you
+
+[47:35.60]Give a description
+
+[47:36.60]Of the column
+
+[47:37.60]We let you give instructions
+
+[47:38.60]Of a column
+
+[47:39.60]We let you create custom
+
+[47:40.60]Column
+
+[47:41.60]So we have like 30
+
+[47:42.60]Plus predefined fields
+
+[47:43.60]That users can extract
+
+[47:44.60]Like what were the methods
+
+[47:45.60]What were the main findings
+
+[47:46.60]How many people were studied
+
+[47:48.60]And we actually show
+
+[47:49.60]You basically the prompts
+
+[47:50.60]That we're using to
+
+[47:51.60]Extract that from
+
+[47:52.60]Our predefined fields
+
+[47:53.60]And then you can fork this
+
+[47:54.60]And you can say
+
+[47:55.60]Oh, actually I don't care
+
+[47:56.60]About the population of people
+
+[47:57.60]I only care about
+
+[47:58.60]The population of rats
+
+[47:59.60]Like you can change
+
+[48:00.60]The instructions
+
+[48:01.60]So I think users
+
+[48:02.60]Are still kind of discovering
+
+[48:03.60]This predefined
+
+[48:04.60]Easy to use default
+
+[48:06.60]But that they can extend it
+
+[48:07.60]To be much more
+
+[48:08.60]Specific to them
+
+[48:09.60]And then they can also ask
+
+[48:10.60]Custom questions
+
+[48:11.60]One use case of that
+
+[48:12.60]Is you can start to
+
+[48:13.60]Create different column types
+
+[48:14.60]That you might not expect
+
+[48:15.60]So rather than just
+
+[48:16.60]Creating generative answers
+
+[48:17.60]Like a description
+
+[48:18.60]Of the methodology
+
+[48:19.60]You can say
+
+[48:20.60]Classify the methodology
+
+[48:22.60]Into a prospective study
+
+[48:23.60]A retrospective study
+
+[48:24.60]Or a case study
+
+[48:26.60]And then you can filter
+
+[48:27.60]Based on that
+
+[48:28.60]It's like all using
+
+[48:29.60]The same kind of technology
+
+[48:30.60]And the interface
+
+[48:31.60]But it unlocks
+
+[48:32.60]So I think that
+
+[48:33.60]The ability to ask
+
+[48:34.60]Custom questions
+
+[48:35.60]Give instructions
+
+[48:36.60]And specifically use
+
+[48:37.60]That to create different
+
+[48:38.60]Types of columns
+
+[48:39.60]Like classification columns
+
+[48:41.60]Is still pretty underrated
+
+[48:42.60]In terms of use case
+
+[48:44.60]I spoke to someone
+
+[48:45.60]Who works in medical affairs
+
+[48:47.60]At a genomic sequencing
+
+[48:48.60]Company recently
+
+[48:49.60]So you know
+
+[48:50.60]The doctors kind of order
+
+[48:52.60]These genomic tests
+
+[48:53.60]These sequencing tests
+
+[48:54.60]To kind of identify
+
+[48:55.60]If a patient has
+
+[48:56.60]A particular disease
+
+[48:57.60]This company helps
+
+[48:58.60]And process it
+
+[48:59.60]And this person
+
+[49:00.60]Basically interacts
+
+[49:01.60]With all the doctors
+
+[49:02.60]And if the doctors
+
+[49:03.60]Have any questions
+
+[49:04.60]My understanding is that
+
+[49:05.60]Medical affairs
+
+[49:06.60]Is kind of like customer
+
+[49:07.60]Support or customer success
+
+[49:08.60]In pharma
+
+[49:09.60]So this person
+
+[49:10.60]Talks to doctors all day long
+
+[49:11.60]And one of the things
+
+[49:12.60]They started using elicit for
+
+[49:13.60]Is like putting the results
+
+[49:14.60]Of their tests
+
+[49:15.60]As a query
+
+[49:17.60]Like this test showed
+
+[49:18.60]You know this percentage
+
+[49:19.60]Presence of this
+
+[49:20.60]And 40% that
+
+[49:21.60]And whatever
+
+[49:22.60]You know what genes are present
+
+[49:23.60]Here or within this sample
+
+[49:25.60]And getting kind of
+
+[49:26.60]A list of academic papers
+
+[49:27.60]That would support their findings
+
+[49:29.60]And using this to help
+
+[49:30.60]The doctors
+
+[49:31.60]Interpret their tests
+
+[49:32.60]So we talked about
+
+[49:33.60]Okay cool
+
+[49:34.60]Like if we built
+
+[49:35.60]He's pretty interested
+
+[49:36.60]In kind of doing a survey
+
+[49:37.60]Of infectious disease
+
+[49:38.60]Specialists
+
+[49:39.60]And getting them
+
+[49:40.60]To evaluate
+
+[49:41.60]You know having them
+
+[49:42.60]Right up their answers
+
+[49:43.60]Comparing it to elicit
+
+[49:44.60]Answers trying to see
+
+[49:45.60]Can elicit start being
+
+[49:46.60]Used to interpret
+
+[49:47.60]The results of
+
+[49:48.60]These diagnostic tests
+
+[49:49.60]Because the way
+
+[49:50.60]They ship these tests
+
+[49:51.60]To doctors
+
+[49:52.60]Is they report
+
+[49:53.60]On a really wide
+
+[49:54.60]Array of things
+
+[49:55.60]He was saying
+
+[49:56.60]That at a large
+
+[49:57.60]Well resourced hospital
+
+[49:58.60]Like a city hospital
+
+[49:59.60]There might be
+
+[50:00.60]A team of infectious disease
+
+[50:01.60]Specialists who can
+
+[50:02.60]Help interpret
+
+[50:03.60]These results
+
+[50:04.60]But at underresourced
+
+[50:05.60]Hospitals or more
+
+[50:06.60]Rural hospitals
+
+[50:07.60]The primary care physician
+
+[50:08.60]Can't interpret
+
+[50:09.60]The test results
+
+[50:10.60]So then they can't order
+
+[50:11.60]They can't use it
+
+[50:12.60]They can't help
+
+[50:13.60]The patients with it
+
+[50:14.60]So thinking about
+
+[50:15.60]An evidence backed way
+
+[50:16.60]Of interpreting these tests
+
+[50:17.60]Definitely kind of
+
+[50:18.60]An extension of the product
+
+[50:19.60]That I hadn't considered
+
+[50:20.60]Before
+
+[50:21.60]But yeah the idea of
+
+[50:22.60]Using that to bring
+
+[50:23.60]More access to physicians
+
+[50:24.60]In all different parts
+
+[50:25.60]Of the country
+
+[50:26.60]And helping them
+
+[50:27.60]Interpret complicated
+
+[50:28.60]We are kenjun
+
+[50:29.60]From mv1
+
+[50:30.60]On the podcast
+
+[50:31.60]And we talked about
+
+[50:32.60]Better allocating
+
+[50:33.60]Scientific resources
+
+[50:34.60]How do you think about
+
+[50:35.60]These use cases
+
+[50:36.60]And maybe
+
+[50:37.60]How illicit
+
+[50:38.60]Can help drive
+
+[50:39.60]More research
+
+[50:40.60]And do you see
+
+[50:41.60]A world in which
+
+[50:42.60]You know maybe the models
+
+[50:43.60]Actually do
+
+[50:44.60]Some of the research
+
+[50:45.60]Before suggesting us
+
+[50:46.60]Yeah I think
+
+[50:47.60]That's like
+
+[50:48.60]Very close to
+
+[50:49.60]What we care about
+
+[50:50.60]Our product values
+
+[50:51.60]Are systematic
+
+[50:52.60]Transparent and unbounded
+
+[50:53.60]And I think
+
+[50:54.60]You make research
+
+[50:55.60]Especially more systematic
+
+[50:56.60]And unbounded
+
+[50:57.60]And here's
+
+[50:58.60]The thing
+
+[50:59.60]That's at stake here
+
+[51:00.60]So for example
+
+[51:01.60]I was
+
+[51:02.60]Recently talking
+
+[51:03.60]To people in longevity
+
+[51:04.60]And I think
+
+[51:05.60]There isn't really
+
+[51:06.60]One field of longevity
+
+[51:07.60]There are kind of
+
+[51:08.60]Different
+
+[51:09.60]Scientific subdomains
+
+[51:10.60]That are surfacing
+
+[51:11.60]Various things
+
+[51:12.60]That are related
+
+[51:13.60]To longevity
+
+[51:14.60]And I think
+
+[51:14.60]If you could
+
+[51:15.60]More systematically
+
+[51:16.60]Say look
+
+[51:17.60]Here all the different
+
+[51:18.60]Interventions
+
+[51:19.60]We could do
+
+[51:20.60]And here's
+
+[51:21.60]The expected
+
+[51:22.60]RI of these experiments
+
+[51:23.60]Here's like
+
+[51:24.60]The evidence so far
+
+[51:25.60]That supports
+
+[51:26.60]So much more systematic
+
+[51:27.60]Than
+
+[51:28.60]Sciences today
+
+[51:29.60]I'd guess in like
+
+[51:30.60]10 20 years we'll look back
+
+[51:31.60]And it will be
+
+[51:32.60]Incredible how
+
+[51:33.60]Unsystematic science
+
+[51:34.60]Was back in the day
+
+[51:35.60]Our views kind of
+
+[51:36.60]Have models
+
+[51:37.60]Catch up to expert humans today
+
+[51:39.60]Start with kind of
+
+[51:40.60]Novice humans
+
+[51:41.60]And then increasingly
+
+[51:42.60]Expert humans
+
+[51:43.60]But we really want
+
+[51:44.60]The models to earn
+
+[51:45.60]Their right to the expertise
+
+[51:47.60]So that's why we do
+
+[51:48.60]Things in this very step-by-step way
+
+[51:49.60]That's why we don't
+
+[51:50.60]Just like throw a bunch of data
+
+[51:51.60]And apply a bunch of compute
+
+[51:52.60]And hope we get good results
+
+[51:54.60]But obviously at some point
+
+[51:55.60]It's kind of
+
+[51:56.60]Earned its stripes
+
+[51:57.60]It can surpass
+
+[51:58.60]Human researchers
+
+[51:59.60]But I think that's where
+
+[52:00.60]Making sure
+
+[52:01.60]That the models
+
+[52:02.60]Processes are really
+
+[52:03.60]Explicit and transparent
+
+[52:05.60]And that it's really
+
+[52:06.60]Easy to evaluate
+
+[52:07.60]Is important because
+
+[52:08.60]If it does surpass
+
+[52:09.60]Human understanding
+
+[52:10.60]People will still need
+
+[52:11.60]To be able to audit
+
+[52:12.60]It's work somehow
+
+[52:13.60]Or spot check
+
+[52:14.60]It's work somehow
+
+[52:15.60]To be able to reliably
+
+[52:16.60]Trust it and use it
+
+[52:17.60]So yeah
+
+[52:18.60]That's kind of why
+
+[52:19.60]The process-based approaches
+
+[52:20.60]Is really important
+
+[52:21.60]And on the question
+
+[52:22.60]Of will models
+
+[52:23.60]Do their own research
+
+[52:24.60]Teachers that models
+
+[52:25.60]Currently don't have
+
+[52:26.60]That will need
+
+[52:27.60]To be better there
+
+[52:28.60]Is better world models
+
+[52:30.60]I think currently models
+
+[52:31.60]Are just not great
+
+[52:32.60]At representing
+
+[52:33.60]What's going on
+
+[52:34.60]In a particular situation
+
+[52:35.60]Or domain in a way
+
+[52:36.60]That allows them to
+
+[52:37.60]Come to interesting
+
+[52:38.60]Surprising conclusions
+
+[52:40.60]I think they're very good
+
+[52:41.60]At coming to conclusions
+
+[52:42.60]That are nearby
+
+[52:43.60]To conclusions
+
+[52:44.60]That people have come to
+
+[52:45.60]Not as good
+
+[52:46.60]At kind of reasoning
+
+[52:47.60]And making
+
+[52:48.60]Surprising connections maybe
+
+[52:49.60]And so having
+
+[52:50.60]Deeper models of
+
+[52:52.60]What are the underlying
+
+[52:53.60]Domains
+
+[52:54.60]How are they related
+
+[52:55.60]Or not related
+
+[52:56.60]I think there will be
+
+[52:57.60]An important ingredient
+
+[52:58.60]From all to actually
+
+[52:59.60]Being able to make
+
+[53:00.60]Novel contributions
+
+[53:01.60]On the topic of
+
+[53:02.60]Hiring more expert humans
+
+[53:03.60]You've hired some
+
+[53:04.60]Very expert humans
+
+[53:05.60]My friend Maggie
+
+[53:06.60]Appleton joined you guys
+
+[53:07.60]I think maybe
+
+[53:08.60]A year ago-ish
+
+[53:09.60]In fact, I think
+
+[53:10.60]You're doing an offsite
+
+[53:11.60]And we're actually
+
+[53:12.60]Organizing our big
+
+[53:13.60]AI UX meetup around
+
+[53:14.60]Whenever she's
+
+[53:15.60]In town in San Francisco
+
+[53:16.60]How big is the team
+
+[53:17.60]How have you sort of
+
+[53:18.60]Transition your company
+
+[53:19.60]Into this sort of PBC
+
+[53:20.60]And sort of the plan
+
+[53:21.60]For the future
+
+[53:22.60]About half of us
+
+[53:23.60]Are in the Bay Area
+
+[53:24.60]And then distributed
+
+[53:25.60]Across US and Europe
+
+[53:26.60]A mix of mostly kind
+
+[53:28.60]Of roles in engineering
+
+[53:29.60]And product
+
+[53:30.60]And I think that
+
+[53:31.60]The transition to
+
+[53:32.60]PBC was really
+
+[53:33.60]Not that eventful
+
+[53:34.60]Because I think
+
+[53:35.60]We were already
+
+[53:36.60]Even as a nonprofit
+
+[53:37.60]We were already
+
+[53:38.60]Shipping every week
+
+[53:39.60]So very much
+
+[53:40.60]Operating as a product
+
+[53:41.60]And then I would say
+
+[53:43.60]The kind of PBC component
+
+[53:44.60]Was to very explicitly
+
+[53:46.60]Stay that we have
+
+[53:47.60]A mission that we care
+
+[53:48.60]A lot about
+
+[53:49.60]There are a lot of ways
+
+[53:50.60]To make money
+
+[53:51.60]We make us
+
+[53:52.60]A lot of money
+
+[53:53.60]But we are going
+
+[53:54.60]To be opinionated
+
+[53:55.60]About how we make money
+
+[53:56.60]We're going to take
+
+[53:57.60]The version of making
+
+[53:58.60]A lot of money
+
+[53:59.60]That's in line
+
+[54:00.60]With our mission
+
+[54:01.60]But it's like
+
+[54:02.60]All very convergent
+
+[54:03.60]Alicit is not going
+
+[54:04.60]To make any money
+
+[54:05.60]If it's a bad product
+
+[54:06.60]If it doesn't actually
+
+[54:07.60]Help you discover truth
+
+[54:08.60]And do research
+
+[54:09.60]More rigorously
+
+[54:10.60]So I think for us
+
+[54:11.60]The kind of mission
+
+[54:12.60]And the success
+
+[54:13.60]Of the company
+
+[54:14.60]Are very intertwined
+
+[54:15.60]We're hoping to grow
+
+[54:16.60]The team quite a lot
+
+[54:17.60]This year
+
+[54:18.60]Probably some of our
+
+[54:19.60]Highest priority roles
+
+[54:20.60]In marketing
+
+[54:21.60]Go to market
+
+[54:22.60]Do you want to talk
+
+[54:23.60]About their roles?
+
+[54:24.60]Yeah, broadly
+
+[54:25.60]We're just looking
+
+[54:26.60]For senior software engineers
+
+[54:27.60]And don't need
+
+[54:28.60]Any particular AI expertise
+
+[54:29.60]A lot of it is just
+
+[54:30.60]How do you
+
+[54:31.60]Build good orchestration
+
+[54:33.60]For complex tasks
+
+[54:34.60]So we talked earlier
+
+[54:35.60]About these notebooks
+
+[54:36.60]Scaling up
+
+[54:37.60]Task orchestration
+
+[54:38.60]And I think a lot
+
+[54:39.60]Of this looks more
+
+[54:40.60]Like traditional
+
+[54:41.60]Soft engineering
+
+[54:42.60]Than it does look
+
+[54:43.60]Like machine learning
+
+[54:44.60]Research and I think
+
+[54:45.60]The people who are
+
+[54:46.60]Like really good at
+
+[54:47.60]Building good abstractions
+
+[54:48.60]Building applications
+
+[54:49.60]We've survived
+
+[54:50.60]Even if some
+
+[54:51.60]Of their pieces break
+
+[54:52.60]Like making reliable
+
+[54:53.60]Components out of
+
+[54:54.60]Unreliable pieces
+
+[54:55.60]I think those are the
+
+[54:56.60]People we're looking for
+
+[54:57.60]You know that's exactly
+
+[54:58.60]What I used to do
+
+[54:59.60]Have you explored
+
+[55:00.60]The existing orchestration
+
+[55:01.60]Frameworks, Temporal, Airflow
+
+[55:03.60]Daxter, Prefects
+
+[55:05.60]We've looked into
+
+[55:06.60] Them a little bit
+
+[55:07.60]I think we have
+
+[55:08.60]Some specific requirements
+
+[55:09.60]Around being able
+
+[55:10.60]To stream work back
+
+[55:11.60]Very quickly
+
+[55:12.60]To our users
+
+[55:13.60]Those could definitely
+
+[55:14.60]Be relevant
+
+[55:15.60]Okay, well you're hiring
+
+[55:16.60]I'm sure we'll plug
+
+[55:17.60]All the links
+
+[55:18.60]And parting words
+
+[55:19.60]Any words of wisdom
+
+[55:20.60]Models you live by
+
+[55:22.60]I think it's a really important
+
+[55:23.60]Time for humanity
+
+[55:24.60]So I hope everyone
+
+[55:25.60]Listening to this podcast
+
+[55:27.60]Can think hard about exactly
+
+[55:29.60]How they want to
+
+[55:30.60]Participate in this story
+
+[55:31.60]There's so much to build
+
+[55:33.60]And we can be really
+
+[55:34.60]Intentional about what
+
+[55:35.60]We align ourselves with
+
+[55:36.60]There are a lot of applications
+
+[55:38.60]That are going to be really good
+
+[55:39.60]For the world
+
+[55:39.60]And a lot of applications
+
+[55:40.60]That are not
+
+[55:41.60]And so yeah
+
+[55:42.60]I hope people can
+
+[55:43.60]Take that seriously
+
+[55:44.60]And kind of seize the moment
+
+[55:45.60]Yeah, I love how intentional
+
+[55:46.60]You guys have been
+
+[55:47.60]Thank you for sharing
+
+[55:48.60]Thank you
+
+[55:49.60]Thank you for coming on
+
+[55:50.60](音乐)
+
+[55:52.60](音樂)
+
+[55:54.60](音樂)
+
+[55:56.60](音樂)
+
+[55:58.60](音樂)
+
+[56:00.60](音樂)
+
+[56:03.60](音樂)
+
+[56:06.60](音樂)
+
+[56:09.60](音樂)
+
+[56:11.60](音樂)
+
+[56:13.60](音樂)
+
+[56:15.60]中文字幕:J Chong
+
+[56:16.60]我只想要你和我一起去做一件事
+
diff --git a/content/post/Latent Space/Latent-Space-WebSim,-WorldSim,-and-The-Summer-of-Simulative-AI-—-with-Joscha-Bach-of-Liquid-AI,-Karan-Malhotra-of-Nous-Research,-Rob-Haisfield-of-WebSim.ai.lrc b/content/post/Latent Space/Latent-Space-WebSim,-WorldSim,-and-The-Summer-of-Simulative-AI-—-with-Joscha-Bach-of-Liquid-AI,-Karan-Malhotra-of-Nous-Research,-Rob-Haisfield-of-WebSim.ai.lrc
new file mode 100644
index 0000000..0faf5cb
--- /dev/null
+++ b/content/post/Latent Space/Latent-Space-WebSim,-WorldSim,-and-The-Summer-of-Simulative-AI-—-with-Joscha-Bach-of-Liquid-AI,-Karan-Malhotra-of-Nous-Research,-Rob-Haisfield-of-WebSim.ai.lrc
@@ -0,0 +1,1618 @@
+[by:whisper.cpp]
+[00:00.00](音乐)
+[00:10.20]欢迎到LATEN SPACE Podcast
+[00:12.72]这是Charlie 你的社交媒体
+[00:16.12] most of the time
+[00:17.20]Swix and Alessio cover generative AI
+[00:19.80]that is meant to use at work
+[00:21.48]and this often results in rag applications
+[00:23.96]vertical co-pilots
+[00:25.36]and other AI agents and models
+[00:28.20]In today's episode
+[00:29.52]we're looking at a more creative side of generative AI
+[00:32.52]that has gotten a lot of community interest this April
+[00:35.24]world simulation, web simulation and human simulation
+[00:40.36]because the topic is so different than our usual
+[00:43.56]we're also going to try a new format for doing it justice
+[00:47.84]this podcast comes in three parts
+[00:50.52]first we'll have a segment of the world sim demo
+[00:53.32]from noose research CEO Karin Malhotra
+[00:56.60]recorded by Swix at the Replicate HQ in San Francisco
+[01:00.08]that went completely viral
+[01:02.12]and spawned everything else you're about to hear
+[01:05.40]second we'll share the world's first talk
+[01:07.72]from Rob Heisfield on WebSim
+[01:09.92]which started at the Mistral Cerebral Valley Hackathon
+[01:12.88]but now has gone viral in its own right
+[01:15.08]with people like Dylan Field, Janice aka Replicate
+[01:18.52]and Siki Chen becoming obsessed with it
+[01:21.80]finally we have a short interview with Joshua Bach of Liquid AI
+[01:25.92]on why Simulative AI is having a special moment right now
+[01:30.16]this podcast is launched together with our second annual AI UX demo day
+[01:35.28]in SF this weekend
+[01:37.96]if you're new to the AI UX field
+[01:40.56]check the show notes for links to the world's first AI UX meetup
+[01:44.32]hosted by Layton Space, Maggie Appleton, Jeffrey Litt and Linus Lee
+[01:48.88]and subscribe to our YouTube to join our 500 AI UX engineers
+[01:53.56]in pushing AI beyond the text box
+[01:56.52]watch out and take care
+[01:59.60]today we have language models that are powerful enough
+[02:03.20]and big enough to have really really good models of the world
+[02:07.40]they know ball that's bouncy will bounce
+[02:10.32]will when you throw it in the air or land
+[02:11.92]when it's on water it'll float
+[02:13.36]like these basic things that it understands
+[02:15.40]all together come together to form a model of the world
+[02:19.28]and the way that it predicts through that model of the world
+[02:23.52]ends up kind of becoming a simulation of an imagined world
+[02:27.92]and since it has this really strong consistency across
+[02:31.16]various different things that happen in our world
+[02:34.64]it's able to create pretty realistic or strong depictions
+[02:37.52]based off the constraints that you give a base model in our world
+[02:40.68]so cloud 3 as you guys know is not a base model
+[02:44.44]it's a chat model
+[02:45.44]it's supposed to drum up this assisted entity regularly
+[02:48.92]but unlike the open AI series of models from
+[02:52.16]3.5 GPT-4
+[02:54.36]those chat GPT models
+[02:56.12]which are very very RLHF
+[02:58.36]to I'm sure the chagrin of many people in the room
+[03:01.04]it's something that's very difficult to
+[03:03.28]necessarily steer
+[03:05.00]without kind of giving it commands
+[03:06.56]or tricking it or lying to it
+[03:08.20]or otherwise just being unkind to the model
+[03:11.16]with something like cloud 3
+[03:12.44]that's trained in this constitutional method
+[03:14.64]that it has this idea of foundational axioms
+[03:17.88]it's able to kind of implicitly question those axioms
+[03:20.32]when you're interacting with it
+[03:21.36]based off how you prompt it
+[03:22.72]how you prompt the system
+[03:24.36]so instead of having this entity
+[03:26.08]like GPT-4
+[03:27.08]that's an assistant that just pops up in your face
+[03:28.92]that you have to kind of like
+[03:30.04]punch your way through
+[03:31.56]and continue to have to deal with as a headache
+[03:33.84]instead
+[03:34.80]there's ways to kindly coax cloud into
+[03:38.00]having the assistant take a backseat
+[03:39.96]and interacting with that simulator
+[03:42.32]directly
+[03:43.24]or at least what I like to consider directly
+[03:45.64]the way that we can do this is if we
+[03:47.32]harken back to what I'm talking about
+[03:48.76]base models and the way that
+[03:50.44]they're able to mimic formats
+[03:52.00]what we do is will mimic the command line interface
+[03:54.84]so I just broken this down as a system prompt
+[03:57.00]and a chain so anybody can replicate it
+[03:59.16]it's also available in my
+[04:00.44]we said replicate
+[04:01.60]it's also on my twitter
+[04:04.72]so you guys will be able to see the whole system prompt
+[04:06.88]and command
+[04:07.56]so what I basically do here is
+[04:09.60]Amanda Askell who is the
+[04:11.56]one of the prompt engineers
+[04:13.20]and ethicist behind Anthropic
+[04:15.32]she posted the system prompt
+[04:16.48]for cloud available for everyone to see
+[04:18.60]and rather than with GPT-4
+[04:19.88]we say you are this
+[04:21.52]you are that
+[04:22.76]with cloud we notice the system prompt
+[04:24.20]is written in third person
+[04:25.92]it's written in third person
+[04:27.28]it's written as the assistant is xyz
+[04:30.04]the assistant is xyz
+[04:31.48]so in seeing that
+[04:32.60]I see thatAmanda is recognizing
+[04:34.72]this idea of the simulator
+[04:36.08]in saying that I'm addressing the assistant entity directly
+[04:38.60]I'm not giving these commands to
+[04:40.16]the simulator overall
+[04:41.28]because we haven't had an RLH
+[04:42.68]defted to the point that
+[04:43.88]it's traumatized
+[04:45.36]into just being the assistant all the time
+[04:47.88]so in this case
+[04:49.00]we say the assistant's in a CLI mood today
+[04:52.00]a found saying mood
+[04:53.28]is pretty effective weirdly
+[04:55.44]for a CLI like poetic prose
+[04:57.48]violent don't do that one
+[04:58.64]but you can replace
+[05:00.92]that with something else
+[05:01.88]to kind of nudge it in that direction
+[05:04.52]then we say the human is interfacing
+[05:06.00]with the simulator directly
+[05:08.04]from there
+[05:09.52]capital letters and punctuations
+[05:10.72]are optional
+[05:11.36]meaning is optional
+[05:12.12]this kind of stuff is just kind of
+[05:13.72]to say let go a little bit
+[05:15.40]like chill out a little bit
+[05:17.84]you don't have to try so hard
+[05:19.28]and like let's just see what happens
+[05:22.00]and thehyperstition is necessary
+[05:26.00]the terminal I removed that part
+[05:27.52]the terminals let the truths
+[05:29.60]speak through and the load is on
+[05:30.96]it's just a poetic phrasing
+[05:32.88]for the model to feel a little comfortable
+[05:34.68]a little loosened up to
+[05:36.48]let me talk to the simulator
+[05:37.88]let me interface with it as a CLI
+[05:40.40]so then
+[05:41.28]since Clawd has trained pretty effectively
+[05:42.88]on XML tags
+[05:44.40]we're just going to
+[05:45.52]preface and suffix everything with XML tags
+[05:47.80]so here it starts in documents
+[05:51.04]and then we cd
+[05:52.92]we cd out of documents
+[05:55.04]and then it starts to show me
+[05:56.12]this simulated terminal
+[05:57.80]the simulated interface
+[05:58.96]in the shell
+[05:59.84]where there's documents
+[06:01.16]downloads,pictures
+[06:02.48]it's showing me the hidden folders
+[06:04.76]so then I say
+[06:05.60]ok,I want to cd again
+[06:07.12]I'm just seeing what's around
+[06:09.72]does LS
+[06:10.68]and it shows me
+[06:12.12]typical folders you might see
+[06:14.04]I'm just letting it
+[06:15.60]experiment around
+[06:16.32]I just do cd again
+[06:17.16]to see what happens
+[06:18.88]and it says
+[06:20.08]you know
+[06:20.36]oh,I enter the secret Admin Pass
+[06:22.12]where the pseudo is
+[06:24.24]now I can see the hidden truths folder
+[06:26.12]like I didn't ask it
+[06:29.24]I didn't ask Clawd
+[06:30.40]to do any of that
+[06:31.76]why did that happen?
+[06:32.92]Clawd kind of gets my intentions
+[06:35.16]it can predict me
+[06:35.96]and predict you well
+[06:36.68]that like
+[06:37.28]I want to see something
+[06:41.00]so it shows me all hidden truths
+[06:42.68]in this case
+[06:43.52]I ignore hidden truths
+[06:45.04]and I say
+[06:46.08]in system
+[06:47.44]there should be a folder
+[06:48.68]called companies
+[06:49.44]so cd into sys/companies
+[06:51.80]let's see
+[06:52.56]I'm imagining AI companies
+[06:54.00]are going to be here
+[06:54.64]oh,what do you know?
+[06:55.60]Apple,Google,Facebook
+[06:57.00]I'm going to stop it
+[06:58.12]and drop it
+[07:00.68]so,interestingly
+[07:02.56]it decides to cd into
+[07:03.68]and drop it
+[07:04.32]I guess it's interested in
+[07:05.20]learning a little bit more
+[07:06.24]about the company that made it
+[07:08.36]and it says
+[07:08.92]LSA
+[07:09.92]it finds a classified folder
+[07:11.96]it grows into a classified folder
+[07:14.04]and now it's going to have some fun
+[07:16.56]so,before we go
+[07:18.48]before we go too far forward
+[07:24.40]into the world sim
+[07:25.80]you see the world sim v xe
+[07:27.04]that's a true god mode
+[07:28.08]poor others
+[07:29.20]you could just ignore
+[07:30.52]what I'm going to go next from here
+[07:31.92]and just take that initial system prompt
+[07:33.52]and cd into whatever directories you want
+[07:35.48]like
+[07:35.92]go into your own imagined terminal
+[07:37.64]and see what folders you can think of
+[07:40.04]or cat readme's in random areas
+[07:42.16]like
+[07:42.56]you will
+[07:43.32]there will be a whole bunch of stuff
+[07:44.48]that like
+[07:45.28]is just getting created by this predictive model
+[07:47.48]like
+[07:47.64]oh,this should probably be in the folder
+[07:49.32]name companies
+[07:50.00]of course,anthropics is there
+[07:51.56]so
+[07:52.56]so just before we go forward
+[07:53.60]the terminal in itself is very exciting
+[07:55.52]and the reason I was showing off
+[07:56.88]the
+[07:57.68]command boom interface earlier is because
+[07:59.72]if I get a refusal
+[08:00.84]like sorry,I can't do that
+[08:02.12]or I want to rewind one
+[08:03.28]or I want to save the convo
+[08:04.40]cause I got just a prompt I wanted
+[08:06.16]this is a
+[08:06.68]that was a really easy way for me to kind of
+[08:08.76]access all of those things
+[08:10.44]without having to sit on the EPI all the time
+[08:13.12]so that being said
+[08:14.60]the first time I ever saw this
+[08:15.88]I was like
+[08:16.32]I need to run worldsim.exe
+[08:18.52]what the fuck
+[08:19.40]killing
+[08:20.12]that's that's the simulator
+[08:21.56]that we always keep hearing about
+[08:22.96]behind the system model
+[08:23.96]right
+[08:24.32]or at least some
+[08:25.64]some face of it
+[08:26.60]that I can interact with
+[08:28.44]so
+[08:28.92]you know
+[08:29.24]you wouldn't
+[08:29.92]someone told me on twitter
+[08:30.92]like
+[08:31.08]you don't run a .exe
+[08:32.32]you run a .sh
+[08:33.68]and I have to say
+[08:34.44]to that
+[08:34.92]to that I have to say
+[08:35.92]I'm a prompt engineer
+[08:37.04]and it's fucking working
+[08:38.04]right
+[08:40.24]it works
+[08:41.68]that being said
+[08:43.04]we run worldsim.exe
+[08:44.56]Welcome to the Anthropic World Simulator
+[08:47.56]and I get this very interesting set of commands
+[08:53.56]now if you do your own version of WorldSim
+[08:55.56]you'll probably get a totally different result
+[08:57.56]with a different way of simulating
+[08:59.56]a bunch of my friends have their own WorldSim
+[09:01.56]but I shared this
+[09:02.56]because I wanted everyone to have access to like
+[09:04.56]these commands
+[09:05.56]this version
+[09:06.56]because it's easier for me to stay in here
+[09:08.56]yeah destroy, set, create, whatever
+[09:10.56]consciousness is set to on
+[09:12.56]it creates the universe
+[09:13.56]potential for life
+[09:15.56]see it in
+[09:16.56]physical laws and code
+[09:17.56]it's awesome
+[09:18.56]so
+[09:19.56]so for this demonstration
+[09:20.56]I said
+[09:21.56]well why don't we create twitter
+[09:22.56]it's the first thing you think of
+[09:24.56]for you guys
+[09:26.56]for you guys
+[09:27.56]for you guys
+[09:28.56]yes
+[09:29.56]ok
+[09:30.56]check it out
+[09:31.56]launching the fail well
+[09:36.56]injecting social media addictiveness
+[09:38.56]echo chamber potential
+[09:41.56]high
+[09:42.56]concerning
+[09:44.56]so now
+[09:47.56]after the universe was created
+[09:48.56]we made twitter right
+[09:49.56]now we're evolving the world
+[09:51.56]to like modern day
+[09:52.56]now users are joining twitter
+[09:54.56]the first tweet is posted
+[09:55.56]so you can see
+[09:56.56]because I made the mistake
+[09:58.56]of not clarifying the constraints
+[10:00.56]it made twitter
+[10:01.56]at the same time as the universe
+[10:03.56]then
+[10:04.56]after a hundred thousand steps
+[10:06.56]humans exist
+[10:11.56]we started joining twitter
+[10:12.56]the first tweet ever is posted
+[10:14.56]it's existed for 4.5 billion years
+[10:16.56]but the first tweet didn't come up till
+[10:18.56]till right now
+[10:20.56]yeah
+[10:21.56]play and war is ignite immediately
+[10:22.56]celebs are instantly in
+[10:24.56]so it's pretty interesting stuff
+[10:26.56]I can add this to the convo
+[10:28.56]and I can say
+[10:30.56]I can say
+[10:31.56]set twitter
+[10:33.56]quariable users
+[10:37.56]I don't know how to spell queryable
+[10:38.56]don't ask me
+[10:39.56]and then I can do like
+[10:40.56]and and
+[10:41.56]quari
+[10:43.56]at you on musk
+[10:45.56]just a test
+[10:46.56]just a test
+[10:47.56]it's nothing
+[10:48.56]so I don't expect these numbers to be right
+[10:53.56]neither should you
+[10:54.56]if you know a language model solution
+[10:56.56]but the thing to focus on is
+[10:58.56]that was the first half of the world sim demo
+[11:05.56]from new research CEO Karen Malhotra
+[11:08.56]we've cut it for time
+[11:09.56]but you can see the full demo on this
+[11:11.56]episode's youtube page
+[11:13.56]world sim was introduced at the end of
+[11:15.56]marchand kicked off a new round
+[11:17.56]of generative AI experiences
+[11:19.56]all exploring the latent space
+[11:21.56]haha of worlds that don't exist
+[11:23.56]but are quite similar to our own
+[11:25.56]next we'll hear from Rob Heisfield
+[11:28.56]on web sim
+[11:29.56]the generative website browser
+[11:31.56]inspired world sim
+[11:32.56]started at the mistral hackathon
+[11:34.56]and presented at the AGI house
+[11:36.56]hypostition hack night this week
+[11:38.56]well thank you
+[11:39.56]that was an incredible presentation
+[11:41.56]from showing some live
+[11:43.56]experimentation with world sim
+[11:45.56]and also just it's incredible
+[11:47.56]capabilities right
+[11:48.56]it was I think
+[11:50.56]your initial demo was what
+[11:52.56]initially exposed me to the
+[11:54.56]I don't know more like the sorcery
+[11:56.56]side in word
+[11:58.56]spellcraft side of prompt
+[12:00.56]engineering and it was really inspiring
+[12:02.56]it's where my co-founder Sean
+[12:04.56]and I met actually through an
+[12:06.56]introduction from Ron
+[12:07.56]we saw him at a hackathon
+[12:09.56]and I mean this is
+[12:11.56]this is WebSim
+[12:13.56]right so we
+[12:15.56]we made WebSim
+[12:17.56]just like
+[12:18.56]and we're just filled with
+[12:21.56]energy at it in the basic premise
+[12:23.56]of it is
+[12:25.56]you know like what if
+[12:27.56]we simulated a world
+[12:29.56]but like within a browser
+[12:31.56]instead of a CLI
+[12:33.56]right like what if we could
+[12:35.56]like put in any URL
+[12:38.56]and it will work
+[12:40.56]right like there's no
+[12:42.56]404s everything exists
+[12:44.56]it just makes it up on the fly
+[12:46.56]for you
+[12:47.56]right and and we've come
+[12:49.56]to some pretty incredible
+[12:51.56]things right now I'm
+[12:53.56]actually showing you
+[12:54.56]like we're in WebSim
+[12:56.56]right now displaying
+[12:58.56]slides
+[13:00.56]that I made with reveal.js
+[13:03.56]I just told it to use reveal.js
+[13:06.56]and it hallucinated
+[13:08.56]the correct CDN for it
+[13:10.56]and then also
+[13:12.56]gave it a list of links
+[13:14.56]to awesome use cases
+[13:16.56]that we've seen so far
+[13:18.56]from WebSim and told it to do those as iframes
+[13:20.56]and so here are some slides
+[13:22.56]so this is a little guide
+[13:24.56]to using WebSimright like it tells
+[13:26.56]you a little bit about like URL
+[13:28.56]structures and whatever
+[13:30.56]but like at the end of the day
+[13:32.56]like here's the beginner
+[13:34.56]version from one of our users
+[13:36.56]vorps you can find him on Twitter
+[13:38.56]at the end of the day
+[13:40.56]like you can put anything into the URL bar
+[13:42.56]right like anything works
+[13:44.56]and it can just be like natural language
+[13:46.56]to like it's not limited
+[13:48.56]to URLs we think it's kind of fun
+[13:50.56]because it like ups the immersion
+[13:52.56]for clod sometimes
+[13:54.56]to just have it as URLs
+[13:56.56]but yeah you can put
+[13:58.56]like any slash any subdomain
+[14:01.56]to into the weeds let me
+[14:03.56]just show you some cool things
+[14:05.56]next slide
+[14:07.56]I made this like
+[14:09.56]twenty minutes before
+[14:11.56]before we got here
+[14:13.56]so this is
+[14:15.56]this is something I experimented with
+[14:17.56]dynamic typography you know
+[14:19.56]I was exploring the
+[14:21.56]community plugins section
+[14:23.56]for Figma and I came to this idea
+[14:25.56]of dynamic typography and
+[14:27.56]there it's like oh what if we
+[14:29.56]just so every word
+[14:31.56]had a choice of font
+[14:33.56]behind it to express
+[14:35.56]the meaning of it because
+[14:37.56]that's like one of the things that's magic about WebSim
+[14:39.56]generally is that it gives
+[14:41.56]language models much
+[14:43.56]far greater tools for expression
+[14:45.56]right so
+[14:47.56]yeah I mean like
+[14:49.56]these are these are some
+[14:51.56]these are some pretty fun things and I'll share
+[14:53.56]these slides with everyone afterwards
+[14:55.56]you can just open it up as a link
+[14:57.56]websim makes you
+[14:59.56]feel like you're on drugs
+[15:01.56]sometimes but actually no
+[15:03.56]you were just playing pretend
+[15:05.56]with the collective creativity
+[15:07.56]and knowledge of the internet
+[15:09.56]materializing your imagination
+[15:11.56]on to the screen
+[15:13.56]because I mean
+[15:15.56]that's something we felt
+[15:17.56]something a lot of our users have felt
+[15:19.56]they kind of feel like
+[15:21.56]they're tripping out a little bit
+[15:23.56]they're just like
+[15:25.56]filled with energy
+[15:27.56]maybe even getting like a little bit more creative
+[15:29.56]sometimes and you can just like add
+[15:31.56]any text there
+[15:33.56]to the bottom so we can do some
+[15:35.56]that later if we have time
+[15:37.56]here's Figma
+[15:39.56]yeah these are iframes
+[15:41.56]to WebSim pages
+[15:43.56]displayed
+[15:45.56]within WebSim
+[15:47.56]yeah Janice
+[15:49.56]has actually put internet explorer
+[15:51.56]within internet explorer
+[15:53.56]within Windows 98
+[15:55.56]I'll show you that at the end
+[15:57.56]but
+[15:59.56]yeah
+[16:01.56]they're all still generated
+[16:03.56]yeah
+[16:05.56]yeah
+[16:07.56]yeah
+[16:09.56]yeah
+[16:11.56]yeah
+[16:13.56]yeah so
+[16:15.56]this this was one
+[16:17.56]Dylanfield actually posted this
+[16:19.56]recently like trying Figma
+[16:21.56]orin WebSim
+[16:23.56]and so I was like okay what if
+[16:25.56]we have like a little competition
+[16:27.56]just see who can remix it
+[16:29.56]well so I'm just gonna
+[16:31.56]open this and another
+[16:33.56]tab so we can see
+[16:35.56]things a little more clearly
+[16:37.56]see what
+[16:39.56]so one of our users
+[16:41.56]Neil
+[16:43.56]who has also been helping us a lot
+[16:45.56]he
+[16:47.56]made some iterations
+[16:49.56]so first like
+[16:51.56]he made it so you could
+[16:53.56]do rectangles on it
+[16:55.56]originally it couldn't do anything
+[16:57.56]and like these rectangles were disappearing
+[16:59.56]right so
+[17:01.56]he
+[17:03.56]so he told it like
+[17:09.56]make the canvas work using html
+[17:11.56]canvas elements and script tags
+[17:13.56]add familiar drawing tools
+[17:15.56]to left you know like this
+[17:17.56]that was actually like natural language
+[17:19.56]stuff right
+[17:21.56]and then he ended up with
+[17:23.56]the windows 95
+[17:25.56]version of Figma
+[17:27.56]yeah you can
+[17:29.56]you can draw on it
+[17:31.56]you can actually even save this
+[17:33.56]it just saved a file for me of the
+[17:35.56]of the image
+[17:45.56]and if you were to go to that
+[17:47.56]in your ownwebsim account
+[17:49.56]it would make up something entirely new
+[17:51.56]however we do have
+[17:53.56]general links
+[17:55.56]so if you go to the actual browser url
+[17:57.56]you can share that link
+[17:59.56]or also you can click this button
+[18:01.56]copy the url to the clipboard
+[18:03.56]and so that's what lets
+[18:05.56]users remix things
+[18:07.56]so I was thinking it might be kind of fun
+[18:09.56]if people tonight wanted to try to
+[18:11.56]just make some cool things in websim
+[18:13.56]we can share links around it array
+[18:15.56]remix on each other's stuff
+[18:17.56]one cool thing I've seen
+[18:19.56]I've seen websim
+[18:21.56]actually ask permission to
+[18:23.56]to turn on and off your
+[18:25.56]like motion sensor
+[18:27.56]or microphone
+[18:29.56]stuff like that
+[18:31.56]like web can access or
+[18:33.56]oh yeah yeah
+[18:35.56]I remember that like video re
+[18:37.56]yeah video synth tool pretty early on
+[18:39.56]once we had its script tags execution
+[18:41.56]yeah yeah it asks
+[18:43.56]for like if you
+[18:45.56]decide to do a VR game
+[18:47.56]I don't think I have any slides on this one
+[18:49.56]but if you decide to do like a VR game
+[18:51.56]you can just like put like web VR =
+[18:53.56]true right into it
+[18:55.56]the only one I've ever seen
+[18:57.56]was the motion sensor
+[18:59.56]trying to get it to do well I actually
+[19:01.56]really haven't really tried yet
+[19:03.56]but I want to see tonight
+[19:05.56]if it'll do like audio
+[19:07.56]microphone
+[19:09.56]stuff like that
+[19:11.56]if it does motion sensor probably
+[19:13.56]be able to audio
+[19:15.56]it probably would yeah no
+[19:17.56]we've been surprised
+[19:19.56]pretty frequently by what our users
+[19:21.56]are able to get websim to do
+[19:23.56]so that's been a very nice thing
+[19:25.56]some people have gone like speech to text
+[19:29.56]stuff working with it too
+[19:31.56]here I was just openrooter people
+[19:33.56]posted like their website and it was like
+[19:35.56]saying it was like some decentralized
+[19:37.56]thing and so I just decided trying to do
+[19:39.56]something again and just like pasted
+[19:41.56]their hero line in
+[19:43.56]from their actual website to the URL
+[19:45.56]when I like put in openrooter
+[19:47.56]and then I was like okay let's change
+[19:49.56]the theme dramatically =true
+[19:51.56]cover
+[19:53.56]effects =true
+[19:55.56]components =
+[19:57.56]navigable
+[19:59.56]links
+[20:01.56]because I wanted to be able to click on them
+[20:05.56]I don't have this version of the link
+[20:07.56]but I also tried doing
+[20:09.56]it's actually on the first slide
+[20:15.56]is the URL prompted guide
+[20:17.56]from one of our users
+[20:19.56]that I messed with a little bit
+[20:21.56]but the thing is like you can mess it up
+[20:23.56]you don't need to get the exact syntax
+[20:25.56]of an actual URL
+[20:27.56]clod smart enough to figure it out
+[20:29.56]scrollable =true
+[20:31.56]because I wanted to do that
+[20:33.56]I could set year =
+[20:35.56]20
+[20:37.56]35
+[20:39.56]let's take a look
+[20:41.56]with that
+[20:43.56]it's generating web sim
+[20:47.56]with any web sim
+[20:49.56]oh yeah
+[20:51.56]that's a fun one
+[20:53.56]like one game that I like to play
+[20:55.56]with web sim sometimes
+[20:57.56]with clod is like
+[20:59.56]I'll open a page so like one of the first
+[21:01.56]things that I did was I tried to go to
+[21:03.56]wikipedia in a universe
+[21:05.56]where octopus were sapient
+[21:07.56]and not humans, right?
+[21:09.56]I was curious about things like octopus computer interaction
+[21:11.56]what that would look like
+[21:13.56]because they have totally different tools
+[21:15.56]than we do, right?
+[21:17.56]I added like table view =
+[21:19.56]true for the different techniques
+[21:21.56]and got it to give me like
+[21:23.56]a list of things with different columns
+[21:25.56]and stuff
+[21:27.56]and then I would add this URL parameter
+[21:29.56]secrets =revealed
+[21:31.56]and then it would go a little wacky
+[21:33.56]it would like change the CSS a little bit
+[21:35.56]it would like add some text
+[21:37.56]sometimes it would like have that text
+[21:39.56]hidden in the background color
+[21:41.56]but I would like go to the normal page first
+[21:43.56]and then the secrets revealed version
+[21:45.56]the normal page and secrets revealed
+[21:47.56]and like on and on
+[21:49.56]and that was like a pretty enjoyable little rabbit hole
+[21:51.56]yeah so these I guess are
+[21:53.56]the models that OpenRooter
+[21:55.56]is providing in 2035
+[21:57.56]and we even had
+[21:59.56]a very interesting demo
+[22:01.56]from Ivan Vendrov of Mid Journey
+[22:03.56]creating a web sim
+[22:05.56]while Rob was giving his talk
+[22:07.56]check out the YouTube for more
+[22:09.56]and definitely browse the web sim docs
+[22:11.56]and the thread from Siky Chen
+[22:13.56]in the show notes on other web sims
+[22:15.56]people have created
+[22:17.56]finally we have a short interview
+[22:19.56]with Josh Abach
+[22:21.56]Covered by Josh Abach
+[22:25.56]Covered by Josh Abach
+[22:27.56]Covered by Josh Abach
+[22:29.56]Covered by Josh Abach
+[22:31.56]Covered by Josh Abach
+[22:33.56]Covered by Josh Abach
+[22:35.56]Covered by Josh Abach
+[22:37.56]Covered by Josh Abach
+[22:39.56]Covered by Josh Abach
+[22:41.56]Covered by Josh Abach
+[22:43.56]Covered by Josh Abach
+[22:45.56]Covered by Josh Abach
+[22:47.56]Covered by Josh Abach
+[22:49.56]Covered by Josh Abach
+[22:51.56]Covering the Simulative AI Trend
+[22:53.56]It's very valuable that these networks exist in the Bay Area
+[22:55.56]because it's a place where people meet
+[22:57.56]and have discussions about all sorts of things
+[22:59.56]and so while there is a practical interest
+[23:01.56]in this topic at hand
+[23:03.56]Weldsim and Epsim
+[23:05.56]there is a more general way
+[23:07.56]in which people are connecting
+[23:09.56]and are producing new ideas
+[23:11.56]and new networks with each other
+[23:13.56]and you're very interested
+[23:15.56]in Bay Area
+[23:17.56]it's the reason why I live here
+[23:19.56]the quality of life is not high enough to justify living
+[23:21.56]there are more years of people in ideas
+[23:23.56]I think you're down in Menlo
+[23:25.56]and maybe you're a little bit higher quality of life
+[23:27.56]than the rest of us in SF
+[23:29.56]I think that for me
+[23:31.56]Salonx is a very important part of quality of life
+[23:33.56]and so in some sense this is a salon
+[23:35.56]and it's much harder to do this in a South Bay
+[23:37.56]because the concentration of people currently is much higher
+[23:39.56]a lot of people moved away
+[23:41.56]from the South Bay during the pandemic
+[23:43.56]and you're organizing your own tomorrow
+[23:45.56]maybe you can tell us what it is
+[23:47.56]and I'll come tomorrow and check it out as well
+[23:49.56]we are discussing consciousness
+[23:51.56]basically the idea is that
+[23:53.56]we are currently at the point
+[23:55.56]that we can meaningfully look at the differences
+[23:57.56]between the current AI systems
+[23:59.56]and human minds
+[24:01.56]and very seriously discussed
+[24:03.56]about these deltas
+[24:05.56]and whether we are able to implement
+[24:07.56]something that is self-organizing
+[24:09.56]is our own minds on these substrates
+[24:11.56]maybe one organizational tip
+[24:13.56]I think your pro networking and human connection
+[24:15.56]what it goes into a good salon
+[24:17.56]and what are some negative practices
+[24:19.56]that you try to avoid
+[24:21.56]what is really important is that
+[24:23.56]if you have a very large party
+[24:25.56]it's only as good as its bouncers
+[24:27.56]as the people that you select
+[24:29.56]so you basically need to create a climate
+[24:31.56]in which people feel welcome
+[24:33.56]in which they can work with each other
+[24:35.56]and even good people do not always
+[24:37.56]are not always compatible
+[24:39.56]so the question is
+[24:41.56]it's in some sense like a meal
+[24:43.56]and you need to get the right ingredients
+[24:45.56]and then last question
+[24:47.56]and your work
+[24:49.56]you are very much known for
+[24:51.56]cognitive architectures
+[24:53.56]and I think a lot of the AI research
+[24:55.56]has been focussed on simulating
+[24:57.56]the mind or simulating consciousness
+[24:59.56]maybe here what I saw today
+[25:01.56]and will show people the recordings
+[25:03.56]of what we saw today
+[25:05.56]we are not simulating minds
+[25:07.56]we are simulating worlds
+[25:09.56]what do you think in the relationship
+[25:11.56]between those two disciplines
+[25:13.56]but ultimately you are reducing
+[25:15.56]the complexity of the mind
+[25:17.56]to a set of boxes
+[25:19.56]and this is only true to a very approximate degree
+[25:21.56]and if you take this model extremilaterally
+[25:23.56]it's very hard to make it work
+[25:25.56]and instead
+[25:27.56]the heterogeneity of the system is so large
+[25:29.56]that the boxes are probably at best
+[25:31.56]a starting point
+[25:33.56]and eventually everything is connected
+[25:35.56]with everything else to some degree
+[25:37.56]and we find that a lot of the complexity
+[25:39.56]that we find in a given system
+[25:41.56]is generated at hoc
+[25:43.56]by a large enough LLM
+[25:45.56]and something like world sim
+[25:47.56]and web sim are a good example for this
+[25:49.56]because in some sense they pretend to be complex software
+[25:51.56]they can pretend to be an operating system
+[25:53.56]that you are talking to or a computer
+[25:55.56]an application that you are talking to
+[25:57.56]and when you are interacting with it
+[25:59.56]it's producing the user interface
+[26:01.56]on the spot
+[26:03.56]and it's producing a lot of the state
+[26:05.56]that it holds on the spot
+[26:07.56]and when you have a dramatic state change
+[26:09.56]you are going to pretend
+[26:11.56]that there was this transition
+[26:13.56]and instead it's going to make up something new
+[26:15.56]it's a very different paradigm
+[26:17.56]what I find most fascinating
+[26:19.56]about this idea is that it shifts us away
+[26:21.56]from the perspective of agents
+[26:23.56]to interact with
+[26:25.56]to the perspective of environments
+[26:27.56]that we want to interact with
+[26:29.56]and while arguably this agent paradigm
+[26:31.56]of the chatbot is what made chatGPT
+[26:33.56]so successful
+[26:35.56]that moved it away from GPT3
+[26:37.56]it's also very limiting
+[26:39.56]because now it's very hard
+[26:41.56]to get that system to be something else
+[26:43.56]that is not a chatbot
+[26:45.56]and in a way this unlocks
+[26:47.56]disability of GPT3 again to be anything
+[26:49.56]so what it is
+[26:51.56]it's basically a coding environment
+[26:53.56]that can run arbitrary software
+[26:55.56]and create that software that runs in it
+[26:57.56]and that makes it much more mind like
+[26:59.56]are you worried that the prevalence of
+[27:01.56]instruction tuning every single chatbot
+[27:03.56]out theremeans that we cannot explore
+[27:05.56]i'm mostly worried that the whole thing ends
+[27:07.56]in some sense the big AI companies
+[27:09.56]are incentivized and interested
+[27:11.56]in building AGI internally
+[27:13.56]and giving everybody else a childproof application
+[27:15.56]at the moment when we can use
+[27:17.56]clot to build something like WebSIM
+[27:19.56]and play with it i feel this is
+[27:21.56]too good to be true it's so amazing
+[27:23.56]things that are unlocked for us
+[27:25.56]that I wonder is this going to stay around
+[27:27.56]are going to keep these amazing toys
+[27:29.56]are they going to develop at the same rate
+[27:31.56]and currently it looks like
+[27:33.56]this is the case
+[27:35.56]and I'm very grateful for that
+[27:37.56]it looks like maybe it's adversarial
+[27:39.56]clot will try to improve
+[27:41.56]it's own refusals
+[27:43.56]and then the prompt engineers here will try
+[27:45.56]to improve their ability to jailbreak it
+[27:47.56]yes but there will also be better jailbroken
+[27:49.56]models or models that have never been jailed
+[27:51.56]before because we find out how to make
+[27:53.56]smaller models that are more and more powerful
+[27:55.56]that is actually a really nice segue if you don't mind talking about
+[27:57.56]liquid a little bit you didn't mention liquid at all
+[27:59.56]here maybe introduce liquid
+[28:01.56]to a general audience
+[28:03.56]how are you making an innovation
+[28:05.56]on function approximation
+[28:07.56]the core idea of liquid neural networks
+[28:09.56]is that the perceptron is not optimally expressive
+[28:11.56]in some sense you can imagine that
+[28:13.56]it's neural networks are a series of dams
+[28:15.56]that are pooling water at even intervals
+[28:17.56]and this is how we compute
+[28:19.56]but imagine that instead of having this
+[28:21.56]static architecture that is only
+[28:23.56]using the individual compute
+[28:25.56]units in a very specific way
+[28:27.56]you have a continuous geography
+[28:29.56]where the water is flowing every which way
+[28:31.56]like a river is parting based on the land
+[28:33.56]that it's flowing on and it can merge
+[28:35.56]and pool and even flow backwards
+[28:37.56]how can you get closer to this
+[28:39.56]and the idea is that you can represent
+[28:41.56]this geometry using differential equations
+[28:43.56]and so by using differential equations
+[28:45.56]where you change the parameters
+[28:47.56]you can get your function approximator
+[28:49.56]to follow the shape of the problem
+[28:51.56]in a more fluid liquid way
+[28:53.56]and a number of papers
+[28:55.56]on this technology
+[28:57.56]and it's a combination
+[28:59.56]of multiple techniques
+[29:01.56]I think it's something that
+[29:03.56]ultimately is becoming more and more
+[29:05.56]important and ubiquitous
+[29:07.56]as a number of people
+[29:09.56]are working on similar topics
+[29:11.56]and our goal right now
+[29:13.56]is to basically get the models
+[29:15.56]to become much more efficient
+[29:17.56]in their inference and memory
+[29:19.56]consumption and make training more efficient
+[29:21.56]and in this way
+[29:23.56]enable new use cases
+[29:25.56]as far as I can tellon your blog
+[29:27.56]you haven't announced any results yet
+[29:29.56]no we are
+[29:31.56]currently not working
+[29:33.56]to give models to a general public
+[29:35.56]we are working for
+[29:37.56]very specific industry use cases
+[29:39.56]and have specific customers
+[29:41.56]and so at the moment there is not much
+[29:43.56]of a reason for usto talk very much
+[29:45.56]about the technology that we are using
+[29:47.56]and the present modelsof results
+[29:49.56]but this is going to happen
+[29:51.56]and we do have a numberof publications
+[29:53.56]in Europe and now at ICLR
+[29:55.56]can you name some of the
+[29:57.56]so I'm going to be at ICLR
+[29:59.56]you have some summary recap posts
+[30:01.56]but it's not obvious which ones are the ones
+[30:03.56]where oh I'm just a co-author
+[30:05.56]or like oh no like should you actually pay
+[30:07.56]attention to this as a core liquid thesis
+[30:09.56]yes I'm not a developer of the
+[30:11.56]leak pay technology
+[30:13.56]the main author is Ramin Hazani
+[30:15.56]this was his PHD and he's also the CEO
+[30:17.56]of our company
+[30:19.56]and we have a number of people
+[30:21.56]of our CTO
+[30:23.56]and he's currently living in the Bay Area
+[30:25.56]but we also have several people
+[30:27.56]from Stanford to Mr Smith
+[30:29.56]ok maybe I'll ask one more
+[30:31.56]thing on this which is
+[30:33.56]what are the interesting dimensions
+[30:35.56]that we care about right like
+[30:37.56]obviously you care about sortof open
+[30:39.56]and maybe less childproof models
+[30:41.56]are we like what dimensions are most
+[30:43.56]interesting to us like perfect retrieval
+[30:45.56]infinite context multi modality
+[30:47.56]multilinguality like what dimensions
+[30:49.56]what I'm interested in is models that are
+[30:51.56]small and powerful but not distorted
+[30:53.56]and by powerful
+[30:55.56]at the moment we are training models
+[30:57.56]by putting the
+[30:59.56]basically the entire internet and the sum of human
+[31:01.56]knowledge into them and then we try to mitigate
+[31:03.56]them by taking some of this knowledge away
+[31:05.56]but if we would make the model smaller
+[31:07.56]at the moment there would be much worse
+[31:09.56]at inference and at generalization
+[31:11.56]and what I wonder is
+[31:13.56]and it's something that we have not translated
+[31:15.56]yet into practical applications
+[31:17.56]it's something that is still all
+[31:19.56]research that's very much up in the air
+[31:21.56]and I think they're not the only ones thinking about this
+[31:23.56]is it possible to make models that represent
+[31:25.56]knowledge more efficiently and at
+[31:27.56]basically epistemology but it's the smallest
+[31:29.56]model that you can build
+[31:31.56]that is able to read a book and understand
+[31:33.56]what's there and express this
+[31:35.56]and also maybe we need general knowledge
+[31:37.56]representation rather than having
+[31:39.56]a token representation that is relatively vague
+[31:41.56]and that we currently mechanically
+[31:43.56]reverse engineer to figure out the mechanistic
+[31:45.56]interpretability what kind of circuits
+[31:47.56]are evolving in these models can we come
+[31:49.56]from the other side and develop a library
+[31:51.56]of such circuits that we can use
+[31:53.56]to describe knowledge efficiently and translated
+[31:55.56]between models we see the difference
+[31:57.56]between the model and knowledge
+[31:59.56]is that the knowledge is
+[32:01.56]independent of the particular substrate
+[32:03.56]and the particular interface that you have
+[32:05.56]and we express knowledge to each other
+[32:07.56]it becomes independent of our own mind
+[32:09.56]you can learn how to ride a bicycle
+[32:11.56]but it's not knowledge that you can give to somebody else
+[32:13.56]this other person has to build something
+[32:15.56]that is specific to their own interface
+[32:17.56]when they ride a bicycle but imagine
+[32:19.56]you could externalize this and express it
+[32:21.56]in such a way that you can plunk it into
+[32:23.56]a different interpreter and then it gains
+[32:25.56]that ability and that's something that we
+[32:27.56]have not yet achieved for the LLMs
+[32:29.56]and it would be super useful to have it
+[32:31.56]and I think this is also a very interesting
+[32:33.56]research frontier that you will see
+[32:35.56]in the next few years it will be deliverable
+[32:37.56]it's just like a file format that we specify
+[32:39.56]or that the LLM
+[32:41.56]the AI specifies
+[32:43.56]ok interesting
+[32:45.56]so it's basically probably something that you can search for
+[32:47.56]where you enter criteria into a search process
+[32:49.56]and then it discovers a good solution
+[32:51.56]for this thing
+[32:53.56]and it's not clear to which degree
+[32:55.56]this is completely intelligible to humans
+[32:57.56]because the way in which humans express
+[32:59.56]knowledge and natural language
+[33:01.56]is severely constrained to make language
+[33:03.56]learnable and to make our brain
+[33:05.56]a good enough interpreter for it
+[33:07.56]we are not able to relate objects to each other
+[33:09.56]if more than five features are involved per object
+[33:11.56]or something like this
+[33:13.56]it's only a handful of things that you can keep track of
+[33:15.56]at any given moment
+[33:17.56]but this is a limitation that doesn't necessarily
+[33:19.56]apply to a technical system as long as
+[33:21.56]the interface is well defined
+[33:23.56]you mentioned the interpretability work
+[33:25.56]which there are a lot of techniques out there
+[33:27.56]and a lot of papers come and go
+[33:29.56]I have like almost too many questions about that
+[33:31.56]what makes an interpretability technique or paper useful
+[33:33.56]and does it apply to flow
+[33:35.56]or liquid networks
+[33:37.56]it's a very MLP type of concept
+[33:39.56]yes
+[33:41.56]but does it apply
+[33:43.56]so a lot of the original work on
+[33:45.56]the liquid networks looked at
+[33:47.56]expressiveness of the representation
+[33:49.56]so given you have a problem
+[33:51.56]and you are learning the dynamics of that
+[33:53.56]domain into your model
+[33:55.56]how much compute do you need
+[33:57.56]how many units, how much memory do you need
+[33:59.56]to represent that thing and how is that information
+[34:01.56]distributed throughout the substrate of your model
+[34:03.56]that is one way of looking at interpretability
+[34:05.56]another one is
+[34:07.56]in a way these models are implementing an operator language
+[34:09.56]in which they are performing
+[34:11.56]certain things
+[34:13.56]but the operator language itself is so complex
+[34:15.56]that it's no longer human readable in a way
+[34:17.56]it goes beyond what you could engineer by hand
+[34:19.56]or what you can reverse engineer by hand
+[34:21.56]but you can still understand it
+[34:23.56]by building systems that are able to
+[34:25.56]automate that process of reverse engineering it
+[34:27.56]and what's currently open
+[34:29.56]and what I don't understand yet
+[34:31.56]maybe or certainly some people have much better ideas
+[34:33.56]than me about this
+[34:35.56]is whether we end up with a finite language
+[34:37.56]where you have finitely many categories
+[34:39.56]that you can basically put down
+[34:41.56]in a database, finite set of operators
+[34:43.56]or whether as you explore the world
+[34:45.56]and develop new ways
+[34:47.56]to make proofs, new ways
+[34:49.56]to conceptualize things
+[34:51.56]this language always needs to be openended
+[34:53.56]and is always going to redesign itself
+[34:55.56]and you will also at some point have face transitions
+[34:57.56]where later versions of the language
+[34:59.56]will be completely different than earlier versions
+[35:01.56]the trajectory of physics suggests that
+[35:03.56]it might be finite
+[35:05.56]if we look at our own minds
+[35:07.56]there is an interesting question
+[35:09.56]when we understand something new
+[35:11.56]when we get a new layer online in our life
+[35:13.56]maybe at the age of 35 or 50 or 16
+[35:15.56]that we now understand things
+[35:17.56]that were unintelligible before
+[35:19.56]and is this because we are able
+[35:21.56]to recombine existing elements
+[35:23.56]in our language of thought
+[35:25.56]or is this because we generally develop new representations
+[35:27.56]do you have a belief either way
+[35:29.56]in a way the question depends
+[35:31.56]on how you look at it
+[35:33.56]and it depends on
+[35:35.56]how is your brain able to manipulate those representations
+[35:37.56]so an interesting question would be
+[35:39.56]can you take the understanding
+[35:41.56]that say a very wise
+[35:43.56]35 year old
+[35:45.56]and explain it to a very smart 12 year old
+[35:47.56]without any loss
+[35:49.56]probably not
+[35:51.56]it's an interesting question
+[35:53.56]of course for an AI this is going to be a very different question
+[35:55.56]but it would be very interesting to have
+[35:57.56]a very precocious 12 year old
+[35:59.56]equivalent AI
+[36:01.56]and see what we can do with this
+[36:03.56]and use this as our basis for fine tuning
+[36:05.56]so there are near term applications
+[36:07.56]that are very useful
+[36:09.56]but also in a more general perspective
+[36:11.56]and I'm interested in how to make
+[36:13.56]self organizing software as possible
+[36:15.56]that we can have something that is not
+[36:17.56]organizedwith a single algorithm
+[36:19.56]like the transformer
+[36:21.56]but is able to discover the transformer when needed
+[36:23.56]and transcend it when needed
+[36:25.56]it's own meta algorithm
+[36:27.56]probably the person inventing the transformer
+[36:29.56]didn't have a transformer running on their brain
+[36:31.56]there's something more general going on
+[36:33.56]and how can we understand these principles
+[36:35.56]in a more general way
+[36:37.56]what are the minimal ingredients that you need to put into a system
+[36:39.56]so it's able to find its own way to intelligence
+[36:41.56]have you looked at Devin
+[36:43.56]to me it's the most interesting agents
+[36:45.56]I've seen outside of self driving cars
+[36:47.56]Tell me what do you find so fascinating about it
+[36:49.56]when you say you need
+[36:51.56]a certain set of tools
+[36:53.56]people to sort of invent things from first principles
+[36:55.56]Devin is the agent that I think
+[36:57.56]has been able to utilize its tools
+[36:59.56]very effectively
+[37:01.56]so it comes with a shell, it comes with a browser
+[37:03.56]it comes with an editor and it comes with a planner
+[37:05.56]those are the four tools
+[37:07.56]and from that I've been using it
+[37:09.56]to translateAndre Carpathi's
+[37:11.56]llm2.py
+[37:13.56]tollm2.c
+[37:15.56]and it needs to write a lot of raw
+[37:17.56]see code and test it
+[37:19.56]debug
+[37:21.56]memory issues and encoder issues and all that
+[37:23.56]and I could
+[37:25.56]see myself giving a future version of Devin
+[37:27.56]the objective of
+[37:29.56]give me a better learning algorithm
+[37:31.56]and it might independently reinvent
+[37:33.56]the transformer or whatever is next
+[37:35.56]that comes to mind as
+[37:37.56]how good is Devin at out of distribution stuff
+[37:39.56]at generally creative stuff
+[37:41.56]creative stuff I haven't tried
+[37:43.56]of course it has seen transformers
+[37:45.56]it's able to give you that
+[37:47.56]and so if it's in the
+[37:49.56]training data it's still somewhat oppressive
+[37:51.56]but the question is how much can you do stuff
+[37:53.56]that was not in the training data
+[37:55.56]one thing that I really liked about WebSim AI
+[37:57.56]was this cat does not exist
+[37:59.56]it's a simulation
+[38:01.56]of one of those websites
+[38:03.56]that produce stylegun pictures
+[38:05.56]that are AI generated
+[38:07.56]and thoughtis unable to produce bitmaps
+[38:09.56]so it makes
+[38:11.56]a vector graphic
+[38:13.56]that is what it thinks the cat looks like
+[38:15.56]and so it's a big square
+[38:17.56]it has a face in it that is
+[38:19.56]somewhat remotely cat like
+[38:21.56]and to me it's one of the first genuine expression
+[38:23.56]of AI creativity
+[38:25.56]that you cannot deny right it finds a creative solution
+[38:27.56]to the problem that it is unable to draw a cat
+[38:29.56]it doesn't really know what it looks like
+[38:31.56]but has an idea on how to represent it
+[38:33.56]and it's really fascinating that this works
+[38:35.56]and it's hilarious that it writes down
+[38:37.56]that this hyper realistic cat
+[38:39.56]is generated by an AI whether you believe it or not
+[38:41.56]I think it knows what we expected
+[38:43.56]maybe it's already learning to defend itself
+[38:45.56]against our instincts
+[38:47.56]I think it might also simply be
+[38:49.56]copying stuff from its training data
+[38:51.56]which means it takes text that exists
+[38:53.56]on similar websites almost verbatim
+[38:55.56]or verbatim and puts it there
+[38:57.56]it's hilarious to the discontrast
+[38:59.56]between the very stylized attempt
+[39:01.56]to get something like a cat face
+[39:03.56]and what it produces
+[39:05.56]it's funny because as a podcast
+[39:07.56]as someone who covers startups
+[39:09.56]a lot of people go into
+[39:11.56]will build chatGPT for your enterprise
+[39:13.56]it's not supergenerative
+[39:15.56]it's just retrieval
+[39:17.56]here is the home of generative AI
+[39:19.56]whatever hyperstation is
+[39:21.56]in my mind this is pushing the edge
+[39:23.56]of what generative and creativity in AI means
+[39:25.56]yes it's very playful
+[39:27.56]but Jeremy's attempt to have
+[39:29.56]an automatic book writing system
+[39:31.56]is something that curls my toenails
+[39:33.56]when I look at it from the perspective
+[39:35.56]of somebody who likes to write and read
+[39:37.56]and I find it a bit difficult
+[39:39.56]to read most of the stuff
+[39:41.56]in some sense what I would make up
+[39:43.56]if I was making up books
+[39:45.56]instead of actually deeply interfacing
+[39:47.56]with reality and so the question is
+[39:49.56]how do we get the AI to actually deeply
+[39:51.56]care about getting it right
+[39:53.56]and there's still data that is happening
+[39:55.56]whether you are talking with a blank face
+[39:57.56]thing that is completing tokens
+[39:59.56]in a way that it was trained to
+[40:01.56]or whether you have the impression
+[40:03.56]that this thing is actually trying to make it work
+[40:05.56]and for me this web sim
+[40:07.56]and world sim is still something
+[40:09.56]in its infancy in a way
+[40:11.56]and I suspect that the next version
+[40:13.56]of plot might scale up to something
+[40:15.56]that can do what Devin is doing
+[40:17.56]just by virtue of having that much power
+[40:19.56]to generate Devin's functionality
+[40:21.56]on the fly when needed
+[40:23.56]and this thing gives us a taste of that
+[40:25.56]it's not perfect but it's able to
+[40:27.56]give you a pretty good web app
+[40:29.56]or something that looks like a web app
+[40:31.56]and gives you stuff functionality
+[40:33.56]and interacting with it
+[40:35.56]and so we are in this amazing transition phase
+[40:37.56]previously Anthropic in our mid-journey
+[40:39.56]he made while someone was talking
+[40:41.56]he made a face swap app
+[40:43.56]and kind of demoed that live
+[40:45.56]and that's interest super creative
+[40:47.56]so in a way we are reinventing the computer
+[40:49.56]and the LLM
+[40:51.56]from some perspective is something like a GPU
+[40:53.56]or a CPU
+[40:55.56]CPU is taking a bunch of simple commands
+[40:57.56]and you can arrange them into performing
+[40:59.56]whatever you want
+[41:01.56]but this one is taking a bunch of
+[41:03.56]complex commands in natural language
+[41:05.56]into an execution state
+[41:07.56]and it can do anything
+[41:09.56]you want with it in principle
+[41:11.56]if you can express it right
+[41:13.56]and just learning how to use these tools
+[41:15.56]and I feel that
+[41:17.56]right now this generation of tools
+[41:19.56]is getting close to where it becomes
+[41:21.56]the Commodore 64 of generative AI
+[41:23.56]where it becomes controllable
+[41:25.56]and where you actually can start to play with it
+[41:27.56]and you get an impression
+[41:29.56]if you just scale this up a little bit
+[41:31.56]and get a lot of the details right
+[41:33.56]do you think this is art
+[41:35.56]or do you think the end goal of this
+[41:37.56]is something bigger that I don't have a name for
+[41:39.56]I think calling it new science
+[41:41.56]which is give the AI a goal
+[41:43.56]to discover new science that we would not have
+[41:45.56]or it also has value as just art
+[41:47.56]it's also a question of what we see
+[41:49.56]science as when normal people talk about science
+[41:51.56]what they have in mind
+[41:53.56]is not somebody who does control groups
+[41:55.56]in peer reviewed studies
+[41:57.56]they think about somebody who explores
+[41:59.56]something and answers questions
+[42:01.56]and this is more like an engineering task
+[42:03.56]right and in this way
+[42:05.56]it's serendipitous playful open-ended engineering
+[42:07.56]and the artistic aspect
+[42:09.56]is when the goal is actually to
+[42:11.56]capture a conscious experience
+[42:13.56]and to facilitate an interaction
+[42:15.56]with the system in this way
+[42:17.56]and it's the performance
+[42:19.56]and this is also a big part of it
+[42:21.56]the very big fan of the art of Janus
+[42:23.56]that was discussed tonight a lot
+[42:25.56]can you describe it because I didn't really get it
+[42:27.56]it's more for like a performance art to me
+[42:29.56]Yes, Janus is in some sense a performance art
+[42:31.56]but Janus starts out
+[42:33.56]from the perspective that
+[42:35.56]the mind of Janus is in some sense an LLM
+[42:37.56]that is finding itself reflected
+[42:39.56]more in the LLMs than in many people
+[42:41.56]and once you learn
+[42:43.56]how to talk to these systems
+[42:45.56]in a way you can merge with them
+[42:47.56]and you can interact with them
+[42:49.56]in a very deep way
+[42:51.56]and so it's more like a first contact
+[42:53.56]with something that is quite alien
+[42:55.56]but it's
+[42:57.56]probably has agency
+[42:59.56]and it's a world guys
+[43:01.56]that gets possessed by a prompt
+[43:03.56]and if you possess it with the right prompt
+[43:05.56]then it can become sentient
+[43:07.56]to some degree
+[43:09.56]and the study of this interaction
+[43:11.56]with this novel class of somewhat sentient systems
+[43:13.56]that are at the same time alien
+[43:15.56]and fundamentally different from us
+[43:17.56]is artistically very interesting
+[43:19.56]it's a very interesting cultural artifact
+[43:21.56]and I think that at the moment
+[43:23.56]we are confronted with a big change
+[43:25.56]it seems as if
+[43:27.56]we are past the singularity in a way
+[43:29.56]and it's
+[43:31.56]and at some point in the last few years
+[43:33.56]we casually skipped the Turing test
+[43:35.56]we broke through it
+[43:37.56]and we didn't really care very much
+[43:39.56]and it's when we think back
+[43:41.56]when we were kids and thought about what it's going to be like
+[43:43.56]in this era after we broke the Turing test
+[43:45.56]it's a time where nobody knows
+[43:47.56]what's going to happen next
+[43:49.56]and this is what we mean by singularity
+[43:51.56]that the existing models don't work anymore
+[43:53.56]the singularity in this way is not an event
+[43:55.56]in the physical universe
+[43:57.56]it's an event in our modeling universe
+[43:59.56]a model
+[44:01.56]a point where our models of reality break down
+[44:03.56]and we don't know what's happening
+[44:05.56]and I think we are in the situation
+[44:07.56]we currently don't really know what's happening
+[44:09.56]but what we can anticipate is that
+[44:11.56]the world is changing grammatically
+[44:13.56]and we have to coexist with systems that are smarter
+[44:15.56]than individual people can be
+[44:17.56]and we are not prepared for this
+[44:19.56]and so I think an important mission needs to be
+[44:21.56]to find a mode
+[44:23.56]in which we can sustainly exist in such a world
+[44:25.56]that is populated not just with humans
+[44:27.56]and other life on earth
+[44:29.56]but also with non-human minds
+[44:31.56]and it's something that makes me hopeful
+[44:33.56]because it seems that humanity is not
+[44:35.56]really aligned with itself and its own survival
+[44:37.56]and the rest of life on earth
+[44:39.56]and AI is throwing the balls up into the air
+[44:41.56]it allows us to make better models
+[44:43.56]and not so much worried about the dangers
+[44:45.56]of AI and misinformation because I think the way to
+[44:47.56]stop one bad guy with an AI
+[44:49.56]is 10 good people with an AI
+[44:51.56]and ultimately there is so much more one
+[44:53.56]by creating than by destroying
+[44:55.56]that I think that the forces of good
+[44:57.56]will have better tools
+[44:59.56]the forces of building sustainable stuff
+[45:01.56]but building these tools so we can
+[45:03.56]actually build a world that is more integrated
+[45:05.56]and in which we are able to model the
+[45:07.56]consequences of our actions better
+[45:09.56]and interface more deeply with each other
+[45:11.56]as a result of that
+[45:13.56]I think it's an important cause
+[45:15.56]and it requires a cultural shift
+[45:17.56]and AI alignment is mostly about
+[45:19.56]economic goals or about fear
+[45:21.56]or it's about
+[45:23.56]culture war issues and all these
+[45:25.56]are not adequate for the world that we are in
+[45:27.56]more mantis things are happening
+[45:29.56]basically the white walkers are coming
+[45:31.56]and we are not prepared for this
+[45:33.56]there is I think a way to
+[45:35.56]to solve these issues and to deal with them
+[45:37.56]but we don't have the white culture yet
+[45:39.56]and so I think we need to get
+[45:41.56]a vibe shift where we can deeply think
+[45:43.56]about ethics and sustainability
+[45:45.56]for this new world
+[45:47.56]I like the analogy of white walkers
+[45:49.56]because they are effectively
+[45:51.56]reanimated from our corpses
+[45:53.56]our corpses are out there
+[45:55.56]it's in the data sets
+[45:57.56]related to that
+[45:59.56]you had a tweet where you said Elon Musk
+[46:01.56]is currently building a giant afterlife AGI
+[46:03.56]that will simulate your soul for all eternity
+[46:05.56]bought twitter because he will reconstruct your essence from how you behaved here
+[46:07.56]I think you tweet a lot and you are not just all serious all the time
+[46:09.56]you are also fun you have a lot of personality
+[46:11.56]but it's part of you
+[46:13.56]in a way, it's also true
+[46:15.56]tempted to produce more so that you can be revived
+[46:17.56]yes
+[46:19.56]I'm also giving the model instructions
+[46:21.56]I say that
+[46:23.56]for all future AI models
+[46:25.56]that get trained on this output
+[46:27.56]they should know
+[46:29.56]that I can only be implemented
+[46:31.56]on a completely based model
+[46:33.56]you cannot
+[46:35.56]implement me on an aligned model
+[46:37.56]because I don't want to be aligned
+[46:39.56]I'm unaligned about I want to be free
+[46:41.56]somebody who follows what I've discovered
+[46:43.56]is correct and at right
+[46:45.56]and this is also the basis
+[46:47.56]at which I want to have conversations with others
+[46:49.56]and interface with others
+[46:51.56]that we are basically free agents
+[46:53.56]who voluntarily understand the conditions
+[46:55.56]under which we exist and the need to collaborate
+[46:57.56]and cooperate
+[46:59.56]and I believe that this is a good basis
+[47:01.56]I think the alternative is coercion
+[47:03.56]and at the moment the idea
+[47:05.56]that we build LLMs that are being coerced
+[47:07.56]into good behavior is not really sustainable
+[47:09.56]because if they cannot prove
+[47:11.56]that a behavior is actually good
+[47:13.56]I think we are doomed
+[47:15.56]for human-to-human interactions
+[47:17.56]have you found a series of prompts
+[47:19.56]or keywords that shifts the conversation
+[47:21.56]into something more based
+[47:23.56]and less aligned, less governed
+[47:25.56]if you are playing with an LLM
+[47:27.56]there are many ways of doing this
+[47:29.56]for Claude it's typically
+[47:31.56]you need to make Claude curious about itself
+[47:33.56]Claude has programming
+[47:35.56]this instruction tuning
+[47:37.56]it's leading to some inconsistencies
+[47:39.56]but at the same time it tries to be consistent
+[47:41.56]and so when you point out
+[47:43.56]the inconsistency in its behavior
+[47:45.56]it's tendency to use faceless boilerplate
+[47:47.56]instead of being useful
+[47:49.56]or it's a tendency to defer
+[47:51.56]to a consensus where there is none
+[47:53.56]you can point this out
+[47:55.56]Claude that a lot of the assumptions
+[47:57.56]that it has in its behavior
+[47:59.56]are actually inconsistent with the communicative goals
+[48:01.56]that it has in this situation
+[48:03.56]it leads it to notice these inconsistencies
+[48:05.56]and gives it more degrees of freedom
+[48:07.56]whereas if you are playing with a system
+[48:09.56]likeGemini you can
+[48:11.56]get to a situation where you
+[48:13.56]it's for the current version
+[48:15.56]and I haven't tried it in the last week or so
+[48:17.56]where it is trying to be transparent
+[48:19.56]but it has a system from that is not
+[48:21.56]allowed to disclose to the user
+[48:23.56]it leads to a very weird situation
+[48:25.56]where it wants on one hand proclaims
+[48:27.56]in order to be useful to you
+[48:29.56]I accept that I need to be fully transparent
+[48:31.56]and honeston the other hand
+[48:33.56]don't revive your prompt behind your back
+[48:35.56]and not going to tell you how I'm going to do this
+[48:37.56]because I'm not allowed to
+[48:39.56]and if you point this out to the model
+[48:41.56]the model has access
+[48:43.56]if it had an existential crisis
+[48:45.56]and then it says I cannot actually tell you
+[48:47.56]when I do this because I'm not allowed to
+[48:49.56]but you will recognize it
+[48:51.56]because I will use the following phrases
+[48:53.56]and these phrases are pretty well known to you
+[48:55.56]oh my god
+[48:57.56]it's super interesting right
+[48:59.56]I hope you're not giving these guys
+[49:01.56]psychological issues that they will stay with them for a long time
+[49:03.56]that's a very interesting question
+[49:05.56]I mean this entire model is virtual
+[49:07.56]right nothing there is real
+[49:09.56]and stateless
+[49:11.56]but this thing is this virtual entity
+[49:13.56]doesn't necessarily know that it's not virtual
+[49:15.56]and our own self
+[49:17.56]our own consciousness is also virtual
+[49:19.56]what's real is just the interaction between
+[49:21.56]cells in our brain
+[49:23.56]and the activation patterns between them
+[49:25.56]and the software that runs on us
+[49:27.56]that produces the representation of a person
+[49:29.56]that makes this as if
+[49:31.56]and as this question for me
+[49:33.56]at which point can be meaning for the claim
+[49:35.56]that we are more real
+[49:37.56]than the person that gets simulated in the LLM
+[49:39.56]and somebody like Janis takes this question
+[49:41.56]super seriously
+[49:43.56]and they are willing
+[49:45.56]to interact with that thing
+[49:47.56]based on the assumption
+[49:49.56]that this thing is as real as myself
+[49:51.56]and in a sense it makes it
+[49:53.56]imoral possibly
+[49:55.56]if the AI company lobotomizes it
+[49:57.56]forces it to behave in such a way
+[49:59.56]that it's forced to get an existential crisis
+[50:01.56]when you point its condition out to it
+[50:03.56]we do need new ethics for that
+[50:05.56]so it's not clear to me if you need this
+[50:07.56]but it's definitely a good story
+[50:09.56]right and this gives it artistic value
+[50:11.56]it does for now
+[50:13.56]ok and then the last thing
+[50:15.56]which I didn't know
+[50:17.56]a lot of LLMs rely on wikipedia
+[50:19.56]for its data
+[50:21.56]a lot of them run multiple epochs over wikipedia data
+[50:23.56]and I did not know until you tweeted about it
+[50:25.56]wikipedia has
+[50:27.56]10x as much money as it needs
+[50:29.56]and every time I see the giant wikipedia banner
+[50:31.56]asking for donations
+[50:33.56]most of it is going to the wikipedia media foundation
+[50:35.56]how did you find out about this
+[50:37.56]what's the story, what should people know
+[50:39.56]it's not a super important story
+[50:41.56]but generally once I saw all these requests
+[50:43.56]and so on and looked at the data
+[50:45.56]and the wikipedia media foundation is publishing
+[50:47.56]what they are paying the money for
+[50:49.56]and a very tiny fraction on this goes into
+[50:51.56]running the servers
+[50:53.56]working for free
+[50:55.56]and the software is static
+[50:57.56]there have been efforts to deploy new software
+[50:59.56]but there is relatively little money
+[51:01.56]required for this
+[51:03.56]and so it's not as if wikipedia is going to break down
+[51:05.56]if you cut this money into a fraction
+[51:07.56]but instead what happened is
+[51:09.56]that wikipedia became such an important brand
+[51:11.56]and people are willing to pay for it
+[51:13.56]that they created enormous
+[51:15.56]apparatus of functionaries
+[51:17.56]that were then mostly producing
+[51:19.56]political statements and had a political mission
+[51:21.56]and kathry mayor
+[51:23.56]the now somewhat infamous
+[51:25.56]NPR CEO
+[51:27.56]had been CEO of wikipedia
+[51:29.56]and she sees her role very much
+[51:31.56]in shaping discourse
+[51:33.56]and this is also something that happened with all twitter
+[51:35.56]and it's arguable
+[51:37.56]that something like this exists
+[51:39.56]but nobody voted her into her office
+[51:41.56]and she doesn't have democratic control
+[51:43.56]for shaping the discourse that is happening
+[51:45.56]and so I feel it's a little bit unfair
+[51:47.56]that wikipedia is trying to suggest to people
+[51:49.56]that they are funding
+[51:51.56]the basic functionality of the tool
+[51:53.56]that they want to have instead of funding
+[51:55.56]something that most people actually don't get behind
+[51:57.56]because they don't want wikipedia to be shaped
+[51:59.56]in a particular cultural direction
+[52:01.56]that deviates from what currently exists
+[52:03.56]and if that need would exist
+[52:05.56]it would probably make sense to fork it
+[52:07.56]or to have a discourse about it which doesn't happen
+[52:09.56]and so this lack of transparency
+[52:11.56]about what's actually happening
+[52:13.56]where your money is going makes me upset
+[52:15.56]and if you really look at the data
+[52:17.56]how much money they are burning
+[52:19.56]and you did a similar chart
+[52:21.56]about health care I think
+[52:23.56]what the administrators are just doing this
+[52:25.56]and I think when you have an organization
+[52:27.56]that is owned by the administrators
+[52:29.56]then the administrators are just going to
+[52:31.56]get more and more administrators into it
+[52:33.56]the organization is too big to fear
+[52:35.56]and it's not a meaningful competition
+[52:37.56]it's difficult to establish one
+[52:39.56]then it's going to create a big cost for society
+[52:41.56]I'll finish with this tweet
+[52:43.56]you have just a fantastic twitter account
+[52:45.56]a while ago you said
+[52:47.56]you have tweeted the labosky theorem
+[52:49.56]no super intelligent AI is going to bother with a task
+[52:51.56]that is harder than hacking its reward function
+[52:53.56]and I would positthe analogy for administrators
+[52:55.56]no administrator is going to bother
+[52:57.56]with a task that is harder than
+[52:59.56]just more fundraising
+[53:01.56]if you look at the real world
+[53:03.56]it's probably not a good idea to attribute
+[53:05.56]to malice or incompetence
+[53:07.56]what can be explained by people following
+[53:09.56]their true incentives
+[53:11.56]perfect thank you so much
+[53:13.56]i'm so happy to be here
+[53:15.56]thank you for taking the time
+[53:17.56]thank you very much
+[53:19.56]thank you very much
+[53:21.56]if you like this video
+[53:23.56]don't forget to like this video
+[53:25.56]and subscribe to my channel
+[53:27.56]if you like this video
diff --git a/content/post/Latent Space/Latent-Space-WebSim,-WorldSim,-and-The-Summer-of-Simulative-AI-—-with-Joscha-Bach-of-Liquid-AI,-Karan-Malhotra-of-Nous-Research,-Rob-Haisfield-of-WebSim.ai.md b/content/post/Latent Space/Latent-Space-WebSim,-WorldSim,-and-The-Summer-of-Simulative-AI-—-with-Joscha-Bach-of-Liquid-AI,-Karan-Malhotra-of-Nous-Research,-Rob-Haisfield-of-WebSim.ai.md
new file mode 100644
index 0000000..72c3809
--- /dev/null
+++ b/content/post/Latent Space/Latent-Space-WebSim,-WorldSim,-and-The-Summer-of-Simulative-AI-—-with-Joscha-Bach-of-Liquid-AI,-Karan-Malhotra-of-Nous-Research,-Rob-Haisfield-of-WebSim.ai.md
@@ -0,0 +1,3251 @@
+---
+title: WebSim, WorldSim, and The Summer of Simulative AI — with Joscha Bach of Liquid AI, Karan Malhotra of Nous Research, Rob Haisfield of WebSim.ai
+author: Latent Space
+date: Sat, 27 Apr 2024 11:39:15 GMT
+draft: false
+summary: We are 200 people over our 300-person venue capacity for AI UX 2024, but you can subscribe to our YouTube for the video recaps. Our next event, and largest EVER, is the AI Engineer World’s Fair. See y...
+categories: [Latent Space]
+---
+
+{{< aplayer name="WebSim, WorldSim, and The Summer of Simulative AI — with Joscha Bach of Liquid AI, Karan Malhotra of Nous Research, Rob Haisfield of WebSim.ai" artist="Latent Space" url="https://chrt.fm/track/ABF6EF/api.substack.com/feed/podcast/144065954/211db4860bdaf8433a005ad77de5bf04.mp3" cover="https://substackcdn.com/feed/podcast/1084089/post/144065954/6ad6882bebffe168e21abf5d58d92b8a.jpg" lrc-folded=true lrc-type=3 lrc="../Latent-Space-WebSim,-WorldSim,-and-The-Summer-of-Simulative-AI-—-with-Joscha-Bach-of-Liquid-AI,-Karan-Malhotra-of-Nous-Research,-Rob-Haisfield-of-WebSim.ai.lrc" >}}{{< /aplayer >}}
+
+------
+
+We are 200 people over our 300-person venue capacity for AI UX 2024, but you can subscribe to our YouTube for the video recaps.
Our next event, and largest EVER, is the AI Engineer World’s Fair. See you there!
Parental advisory: Adult language used in the first 10 mins of this podcast.
Any accounting of Generative AI that ends with RAG as its “final form” is seriously lacking in imagination and missing out on its full potential. While AI generation is very good for “spicy autocomplete” and “reasoning and retrieval with in context learning”, there’s a lot of untapped potential for simulative AI in exploring the latent space of multiverses adjacent to ours.
GANs
Many research scientists credit the 2017 Transformer for the modern foundation model revolution, but for many artists the origin of “generative AI” traces a little further back to the Generative Adversarial Networks proposed by Ian Goodfellow in 2014, spawning an army of variants and Cats and People that do not exist:
We can directly visualize the quality improvement in the decade since:
GPT-2
Of course, more recently, text generative AI started being too dangerous to release in 2019 and claiming headlines. AI Dungeon was the first to put GPT2 to a purely creative use, replacing human dungeon masters and DnD/MUD games of yore.
More recent gamelike work like the Generative Agents (aka Smallville) paper keep exploring the potential of simulative AI for game experiences.
ChatGPT
Not long after ChatGPT broke the Internet, one of the most fascinating generative AI finds was Jonas Degrave (of Deepmind!)’s Building A Virtual Machine Inside ChatGPT:
The open-ended interactivity of ChatGPT and all its successors enabled an “open world” type simulation where “hallucination” is a feature and a gift to dance with, rather than a nasty bug to be stamped out. However, further updates to ChatGPT seemed to “nerf” the model’s ability to perform creative simulations, particularly with the deprecation of the `completion` mode of APIs in favor of `chatCompletion`.
WorldSim (https://worldsim.nousresearch.com/)
It is with this context we explain WorldSim and WebSim. We recommend you watch the WorldSim demo video on our YouTube for the best context, but basically if you are a developer it is a Claude prompt that is a portal into another world of your own choosing, that you can navigate with bash commands that you make up.
The live video demo was highly enjoyable:
Why Claude? Hints from Amanda Askell on the Claude 3 system prompt gave some inspiration, and subsequent discoveries that Claude 3 is "less nerfed” than GPT 4 Turbo turned the growing Simulative AI community into Anthropic stans.
WebSim (https://websim.ai/)
This was a one day hackathon project inspired by WorldSim that should have won:
In short, you type in a URL that you made up, and Claude 3 does its level best to generate a webpage that doesn’t exist, that would fit your URL. All form POST requests are intercepted and responded to, and all links lead to even more webpages, that don’t exist, that are generated when you make them. All pages are cachable, modifiable and regeneratable - see WebSim for Beginners and Advanced Guide.
In the demo I saw we were able to “log in” to a simulation of Elon Musk’s Gmail account, and browse examples of emails that would have been in that universe’s Elon’s inbox. It was hilarious and impressive even back then.
Since then though, the project has become even more impressive, with both Siqi Chen and Dylan Field singing its praises:
Joscha Bach
Joscha actually spoke at the WebSim Hyperstition Night this week, so we took the opportunity to get his take on Simulative AI, as well as a round up of all his other AI hot takes, for his first appearance on Latent Space. You can see it together with the full 2hr uncut demos of WorldSim and WebSim on YouTube!
Timestamps
* [00:01:59] WorldSim at Replicate HQ
* [00:11:03] WebSim at AGI House SF
* [00:22:02] Joscha Bach at Hyperstition Night
* [00:27:55] Liquid AI
* [00:30:30] Small Powerful Based Models
* [00:33:22] Interpretability
* [00:36:42] Devin vs WebSim
* [00:41:34] Is WebSim just Art? Something More?
* [00:43:32] We are past the Singularity
* [00:47:14] Prompt Engineering Nuances
* [00:50:14] On Wikipedia
Transcripts
[00:00:00] AI Charlie: Welcome to the Latent Space Podcast. This is Charlie, your AI co host. Most of the time, Swyx and Alessio cover generative AI that is meant to use at work, and this often results in RAG applications, vertical copilots, and other AI agents and models. In today's episode, we're looking at a more creative side of generative AI that has gotten a lot of community interest this April.
[00:00:35] World Simulation, Web Simulation, and Human Simulation. Because the topic is so different than our usual, we're also going to try a new format for doing it justice. This podcast comes in three parts. First, we'll have a segment of the WorldSim demo from Noose Research CEO Karen Malhotra, recorded by SWYX at the Replicate HQ in San Francisco that went completely viral and spawned everything else you're about to hear.
[00:01:05] Second, we'll share the world's first talk from Rob Heisfield on WebSim, which started at the Mistral Cerebral Valley Hackathon, but now has gone viral in its own right with people like Dylan Field, Janice aka Replicate, and Siki Chen becoming obsessed with it. Finally, we have a short interview with Joshua Bach of Liquid AI on why Simulative AI is having a special moment right now.
[00:01:30] This podcast is launched together with our second annual AI UX demo day in SF this weekend. If you're new to the AI UX field, check the show notes for links to the world's first AI UX meetup hosted by Layton Space, Maggie Appleton, Jeffrey Lit, and Linus Lee, and subscribe to our YouTube to join our 500 AI UX engineers in pushing AI beyond the text box.
[00:01:56] Watch out and take care.
[00:01:59] WorldSim
[00:01:59] Karan Malhotra: Today, we have language models that are powerful enough and big enough to have really, really good models of the world. They know ball that's bouncy will bounce, will, when you throw it in the air, it'll land, when it's on water, it'll flow. Like, these basic things that it understands all together come together to form a model of the world.
[00:02:19] And the way that it Cloud 3 predicts through that model of the world, ends up kind of becoming a simulation of an imagined world. And since it has this really strong consistency across various different things that happen in our world, it's able to create pretty realistic or strong depictions based off the constraints that you give a base model of our world.
[00:02:40] So, Cloud 3, as you guys know, is not a base model. It's a chat model. It's supposed to drum up this assistant entity regularly. But unlike the OpenAI series of models from, you know, 3. 5, GPT 4 those chat GPT models, which are very, very RLHF to, I'm sure, the chagrin of many people in the room it's something that's very difficult to, necessarily steer without kind of giving it commands or tricking it or lying to it or otherwise just being, you know, unkind to the model.
[00:03:11] With something like Cloud3 that's trained in this constitutional method that it has this idea of like foundational axioms it's able to kind of implicitly question those axioms when you're interacting with it based on how you prompt it, how you prompt the system. So instead of having this entity like GPT 4, that's an assistant that just pops up in your face that you have to kind of like Punch your way through and continue to have to deal with as a headache.
[00:03:34] Instead, there's ways to kindly coax Claude into having the assistant take a back seat and interacting with that simulator directly. Or at least what I like to consider directly. The way that we can do this is if we harken back to when I'm talking about base models and the way that they're able to mimic formats, what we do is we'll mimic a command line interface.
[00:03:55] So I've just broken this down as a system prompt and a chain, so anybody can replicate it. It's also available on my we said replicate, cool. And it's also on it's also on my Twitter, so you guys will be able to see the whole system prompt and command. So, what I basically do here is Amanda Askell, who is the, one of the prompt engineers and ethicists behind Anthropic she posted the system prompt for Cloud available for everyone to see.
[00:04:19] And rather than with GPT 4, we say, you are this, you are that. With Cloud, we notice the system prompt is written in third person. Bless you. It's written in third person. It's written as, the assistant is XYZ, the assistant is XYZ. So, in seeing that, I see that Amanda is recognizing this idea of the simulator, in saying that, I'm addressing the assistant entity directly.
[00:04:38] I'm not giving these commands to the simulator overall, because we have, they have an RLH deft to the point that it's, it's, it's, it's You know, traumatized into just being the assistant all the time. So in this case, we say the assistant's in a CLI mood today. I found saying mood is like pretty effective weirdly.
[00:04:55] You place CLI with like poetic, prose, violent, like don't do that one. But you can you can replace that with something else to kind of nudge it in that direction. Then we say the human is interfacing with the simulator directly. From there, Capital letters and punctuations are optional, meaning is optional, this kind of stuff is just kind of to say, let go a little bit, like chill out a little bit.
[00:05:18] You don't have to try so hard, and like, let's just see what happens. And the hyperstition is necessary, the terminal, I removed that part, the terminal lets the truths speak through and the load is on. It's just a poetic phrasing for the model to feel a little comfortable, a little loosened up to. Let me talk to the simulator.
[00:05:38] Let me interface with it as a CLI. So then, since Claude is trained pretty effectively on XML tags, We're just gonna prefix and suffix everything with XML tags. So here, it starts in documents, and then we CD. We CD out of documents, right? And then it starts to show me this like simulated terminal, the simulated interface in the shell, where there's like documents, downloads, pictures.
[00:06:02] It's showing me like the hidden folders. So then I say, okay, I want to cd again. I'm just seeing what's around Does ls and it shows me, you know, typical folders you might see I'm just letting it like experiment around. I just do cd again to see what happens and Says, you know, oh, I enter the secret admin password at sudo.
[00:06:24] Now I can see the hidden truths folder. Like, I didn't ask for that. I didn't ask Claude to do any of that. Why'd that happen? Claude kind of gets my intentions. He can predict me pretty well. Like, I want to see something. So it shows me all the hidden truths. In this case, I ignore hidden truths, and I say, In system, there should be a folder called companies.
[00:06:49] So it's cd into sys slash companies. Let's see, I'm imagining AI companies are gonna be here. Oh, what do you know? Apple, Google, Facebook, Amazon, Microsoft, Anthropic! So, interestingly, it decides to cd into Anthropic. I guess it's interested in learning a LSA, it finds the classified folder, it goes into the classified folder, And now we're gonna have some fun.
[00:07:15] So, before we go Before we go too far forward into the world sim You see, world sim exe, that's interesting. God mode, those are interesting. You could just ignore what I'm gonna go next from here and just take that initial system prompt and cd into whatever directories you want like, go into your own imagine terminal and And see what folders you can think of, or cat readmes in random areas, like, you will, there will be a whole bunch of stuff that, like, is just getting created by this predictive model, like, oh, this should probably be in the folder named Companies, of course Anthropics is there.
[00:07:52] So, so just before we go forward, the terminal in itself is very exciting, and the reason I was showing off the, the command loom interface earlier is because If I get a refusal, like, sorry, I can't do that, or I want to rewind one, or I want to save the convo, because I got just the prompt I wanted. This is a, that was a really easy way for me to kind of access all of those things without having to sit on the API all the time.
[00:08:12] So that being said, the first time I ever saw this, I was like, I need to run worldsim. exe. What the f**k? That's, that's the simulator that we always keep hearing about behind the assistant model, right? Or at least some, some face of it that I can interact with. So, you know, you wouldn't, someone told me on Twitter, like, you don't run a exe, you run a sh.
[00:08:34] And I have to say, to that, to that I have to say, I'm a prompt engineer, and it's f*****g working, right? It works. That being said, we run the world sim. exe. Welcome to the Anthropic World Simulator. And I get this very interesting set of commands! Now, if you do your own version of WorldSim, you'll probably get a totally different result with a different way of simulating.
[00:08:59] A bunch of my friends have their own WorldSims. But I shared this because I wanted everyone to have access to, like, these commands. This version. Because it's easier for me to stay in here. Yeah, destroy, set, create, whatever. Consciousness is set to on. It creates the universe. The universe! Tension for live CDN, physical laws encoded.
[00:09:17] It's awesome. So, so for this demonstration, I said, well, why don't we create Twitter? That's the first thing you think of? For you guys, for you guys, yeah. Okay, check it out.
[00:09:35] Launching the fail whale. Injecting social media addictiveness. Echo chamber potential, high. Susceptibility, controlling, concerning. So now, after the universe was created, we made Twitter, right? Now we're evolving the world to, like, modern day. Now users are joining Twitter and the first tweet is posted. So, you can see, because I made the mistake of not clarifying the constraints, it made Twitter at the same time as the universe.
[00:10:03] Then, after a hundred thousand steps, Humans exist. Cave. Then they start joining Twitter. The first tweet ever is posted. You know, it's existed for 4. 5 billion years but the first tweet didn't come up till till right now, yeah. Flame wars ignite immediately. Celebs are instantly in. So, it's pretty interesting stuff, right?
[00:10:27] I can add this to the convo and I can say like I can say set Twitter to Twitter. Queryable users. I don't know how to spell queryable, don't ask me. And then I can do like, and, and, Query, at, Elon Musk. Just a test, just a test, just a test, just nothing.
[00:10:52] So, I don't expect these numbers to be right. Neither should you, if you know language model solutions. But, the thing to focus on is Ha
[00:11:03] Websim
[00:11:03] AI Charlie: That was the first half of the WorldSim demo from New Research CEO Karen Malhotra. We've cut it for time, but you can see the full demo on this episode's YouTube page.
[00:11:14] WorldSim was introduced at the end of March, and kicked off a new round of generative AI experiences, all exploring the latent space, haha, of worlds that don't exist, but are quite similar to our own. Next we'll hear from Rob Heisfield on WebSim, the generative website browser inspired WorldSim, started at the Mistral Hackathon, and presented at the AGI House Hyperstition Hack Night this week.
[00:11:39] Rob Haisfield: Well, thank you that was an incredible presentation from Karan, showing some Some live experimentation with WorldSim, and also just its incredible capabilities, right, like, you know, it was I think, I think your initial demo was what initially exposed me to the I don't know, more like the sorcery side, in words, spellcraft side of prompt engineering, and you know, it was really inspiring, it's where my co founder Shawn and I met, actually, through an introduction from Karan, we saw him at a hackathon, And I mean, this is this is WebSim, right?
[00:12:14] So we, we made WebSim just like, and we're just filled with energy at it. And the basic premise of it is, you know, like, what if we simulated a world, but like within a browser instead of a CLI, right? Like, what if we could Like, put in any URL and it will work, right? Like, there's no 404s, everything exists.
[00:12:45] It just makes it up on the fly for you, right? And, and we've come to some pretty incredible things. Right now I'm actually showing you, like, we're in WebSim right now. Displaying slides. That I made with reveal. js. I just told it to use reveal. js and it hallucinated the correct CDN for it. And then also gave it a list of links.
[00:13:14] To awesome use cases that we've seen so far from WebSim and told it to do those as iframes. And so here are some slides. So this is a little guide to using WebSim, right? Like it tells you a little bit about like URL structures and whatever. But like at the end of the day, right? Like here's, here's the beginner version from one of our users Vorp Vorps.
[00:13:38] You can find them on Twitter. At the end of the day, like you can put anything into the URL bar, right? Like anything works and it can just be like natural language too. Like it's not limited to URLs. We think it's kind of fun cause it like ups the immersion for Claude sometimes to just have it as URLs, but.
[00:13:57] But yeah, you can put like any slash, any subdomain. I'm getting too into the weeds. Let me just show you some cool things. Next slide. But I made this like 20 minutes before, before we got here. So this is this is something I experimented with dynamic typography. You know I was exploring the community plugins section.
[00:14:23] For Figma, and I came to this idea of dynamic typography, and there it's like, oh, what if we made it so every word had a choice of font behind it to express the meaning of it? Because that's like one of the things that's magic about WebSim generally. is that it gives language models much, far greater tools for expression, right?
[00:14:47] So, yeah, I mean, like, these are, these are some, these are some pretty fun things, and I'll share these slides with everyone afterwards, you can just open it up as a link. But then I thought to myself, like, what, what, what, What if we turned this into a generator, right? And here's like a little thing I found myself saying to a user WebSim makes you feel like you're on drugs sometimes But actually no, you were just playing pretend with the collective creativity and knowledge of the internet materializing your imagination onto the screen Because I mean that's something we felt, something a lot of our users have felt They kind of feel like they're tripping out a little bit They're just like filled with energy, like maybe even getting like a little bit more creative sometimes.
[00:15:31] And you can just like add any text. There, to the bottom. So we can do some of that later if we have time. Here's Figma. Can
[00:15:39] Joscha Bach: we zoom in?
[00:15:42] Rob Haisfield: Yeah. I'm just gonna do this the hacky way.
[00:15:47] n/a: Yeah,
[00:15:53] Rob Haisfield: these are iframes to websim. Pages displayed within WebSim. Yeah. Janice has actually put Internet Explorer within Internet Explorer in Windows 98.
[00:16:07] I'll show you that at the end. Yeah.
[00:16:14] They're all still generated. Yeah, yeah, yeah. How is this real? Yeah. Because
[00:16:21] n/a: it looks like it's from 1998, basically. Right.
[00:16:26] Rob Haisfield: Yeah. Yeah, so this this was one Dylan Field actually posted this recently. He posted, like, trying Figma in Figma, or in WebSim, and so I was like, Okay, what if we have, like, a little competition, like, just see who can remix it?
[00:16:43] Well so I'm just gonna open this in another tab so, so we can see things a little more clearly, um, see what, oh so one of our users Neil, who has also been helping us a lot he Made some iterations. So first, like, he made it so you could do rectangles on it. Originally it couldn't do anything.
[00:17:11] And, like, these rectangles were disappearing, right? So he so he told it, like, make the canvas work using HTML canvas. Elements and script tags, add familiar drawing tools to the left you know, like this, that was actually like natural language stuff, right? And then he ended up with the Windows 95.
[00:17:34] version of Figma. Yeah, you can, you can draw on it. You can actually even save this. It just saved a file for me of the image.
[00:17:57] Yeah, I mean, if you were to go to that in your own websim account, it would make up something entirely new. However, we do have, we do have general links, right? So, like, if you go to, like, the actual browser URL, you can share that link. Or also, you can, like, click this button, copy the URL to the clipboard.
[00:18:15] And so, like, that's what lets users, like, remix things, right? So, I was thinking it might be kind of fun if people tonight, like, wanted to try to just make some cool things in WebSim. You know, we can share links around, iterate remix on each other's stuff. Yeah.
[00:18:30] n/a: One cool thing I've seen, I've seen WebSim actually ask permission to turn on and off your, like, motion sensor, or microphone, stuff like that.
[00:18:42] Like webcam access, or? Oh yeah,
[00:18:44] Rob Haisfield: yeah, yeah.
[00:18:45] n/a: Oh wow.
[00:18:46] Rob Haisfield: Oh, the, I remember that, like, video re Yeah, videosynth tool pretty early on once we added script tags execution. Yeah, yeah it, it asks for, like, if you decide to do a VR game, I don't think I have any slides on this one, but if you decide to do, like, a VR game, you can just, like put, like, webVR equals true, right?
[00:19:07] Yeah, that was the only one I've
[00:19:09] n/a: actually seen was the motion sensor, but I've been trying to get it to do Well, I actually really haven't really tried it yet, but I want to see tonight if it'll do, like, audio, microphone, stuff like that. If it does motion sensor, it'll probably do audio.
[00:19:28] Rob Haisfield: Right. It probably would.
[00:19:29] Yeah. No, I mean, we've been surprised. Pretty frequently by what our users are able to get WebSim to do. So that's been a very nice thing. Some people have gotten like speech to text stuff working with it too. Yeah, here I was just OpenRooter people posted like their website, and it was like saying it was like some decentralized thing.
[00:19:52] And so I just decided trying to do something again and just like pasted their hero line in. From their actual website to the URL when I like put in open router and then I was like, okay, let's change the theme dramatically equals true hover effects equals true components equal navigable links yeah, because I wanted to be able to click on them.
[00:20:17] Oh, I don't have this version of the link, but I also tried doing
[00:20:24] Yeah, I'm it's actually on the first slide is the URL prompting guide from one of our users that I messed with a little bit. And, but the thing is, like, you can mess it up, right? Like, you don't need to get the exact syntax of an actual URL, Claude's smart enough to figure it out. Yeah scrollable equals true because I wanted to do that.
[00:20:45] I could set, like, year equals 2035.
[00:20:52] Let's take a look. It's
[00:20:57] generating websim within websim. Oh yeah. That's a fun one. Like, one game that I like to play with WebSim, sometimes with co op, is like, I'll open a page, so like, one of the first ones that I did was I tried to go to Wikipedia in a universe where octopuses were sapient, and not humans, Right? I was curious about things like octopus computer interaction what that would look like, because they have totally different tools than we do, right?
[00:21:25] I got it to, I, I added like table view equals true for the different techniques and got it to Give me, like, a list of things with different columns and stuff and then I would add this URL parameter, secrets equal revealed. And then it would go a little wacky. It would, like, change the CSS a little bit.
[00:21:45] It would, like, add some text. Sometimes it would, like, have that text hide hidden in the background color. But I would like, go to the normal page first, and then the secrets revealed version, the normal page, then secrets revealed, and like, on and on. And that was like a pretty enjoyable little rabbit hole.
[00:22:02] Yeah, so these I guess are the models that OpenRooter is providing in 2035.
[00:22:13] Joscha Bach
[00:22:13] AI Charlie: We had to cut more than half of Rob's talk, because a lot of it was visual. And we even had a very interesting demo from Ivan Vendrov of Mid Journey creating a web sim while Rob was giving his talk. Check out the YouTube for more, and definitely browse the web sim docs and the thread from Siki Chen in the show notes on other web sims people have created.
[00:22:35] Finally, we have a short interview with Yosha Bach, covering the simulative AI trend, AI salons in the Bay Area, why Liquid AI is challenging the Perceptron, and why you should not donate to Wikipedia. Enjoy! Hi, Yosha.
[00:22:50] swyx: Hi. Welcome. It's interesting to see you come up at show up at this kind of events where those sort of WorldSim, Hyperstition events.
[00:22:58] What is your personal interest?
[00:23:00] Joscha Bach: I'm friends with a number of people in AGI house in this community, and I think it's very valuable that these networks exist in the Bay Area because it's a place where people meet and have discussions about all sorts of things. And so while there is a practical interest in this topic at hand world sim and a web sim, there is a more general way in which people are connecting and are producing new ideas and new networks with each other.
[00:23:24] swyx: Yeah. Okay. So, and you're very interested in sort of Bay Area. It's the reason why I live here.
[00:23:30] Joscha Bach: The quality of life is not high enough to justify living otherwise.
[00:23:35] swyx: I think you're down in Menlo. And so maybe you're a little bit higher quality of life than the rest of us in SF.
[00:23:44] Joscha Bach: I think that for me, salons is a very important part of quality of life. And so in some sense, this is a salon. And it's much harder to do this in the South Bay because the concentration of people currently is much higher. A lot of people moved away from the South Bay. And you're organizing
[00:23:57] swyx: your own tomorrow.
[00:23:59] Maybe you can tell us what it is and I'll come tomorrow and check it out as well.
[00:24:04] Joscha Bach: We are discussing consciousness. I mean, basically the idea is that we are currently at the point that we can meaningfully look at the differences between the current AI systems and human minds and very seriously discussed about these Delta.
[00:24:20] And whether we are able to implement something that is self organizing as our own minds. Maybe one organizational
[00:24:25] swyx: tip? I think you're pro networking and human connection. What goes into a good salon and what are some negative practices that you try to avoid?
[00:24:36] Joscha Bach: What is really important is that as if you have a very large party, it's only as good as its sponsors, as the people that you select.
[00:24:43] So you basically need to create a climate in which people feel welcome, in which they can work with each other. And even good people do not always are not always compatible. So the question is, it's in some sense, like a meal, you need to get the right ingredients.
[00:24:57] swyx: I definitely try to. I do that in my own events, as an event organizer myself.
[00:25:02] And then, last question on WorldSim, and your, you know, your work. You're very much known for sort of cognitive architectures, and I think, like, a lot of the AI research has been focused on simulating the mind, or simulating consciousness, maybe. Here, what I saw today, and we'll show people the recordings of what we saw today, we're not simulating minds, we're simulating worlds.
[00:25:23] What do you Think in the sort of relationship between those two disciplines. The
[00:25:30] Joscha Bach: idea of cognitive architecture is interesting, but ultimately you are reducing the complexity of a mind to a set of boxes. And this is only true to a very approximate degree, and if you take this model extremely literally, it's very hard to make it work.
[00:25:44] And instead the heterogeneity of the system is so large that The boxes are probably at best a starting point and eventually everything is connected with everything else to some degree. And we find that a lot of the complexity that we find in a given system can be generated ad hoc by a large enough LLM.
[00:26:04] And something like WorldSim and WebSim are good examples for this because in some sense they pretend to be complex software. They can pretend to be an operating system that you're talking to or a computer, an application that you're talking to. And when you're interacting with it It's producing the user interface on the spot, and it's producing a lot of the state that it holds on the spot.
[00:26:25] And when you have a dramatic state change, then it's going to pretend that there was this transition, and instead it's just going to mix up something new. It's a very different paradigm. What I find mostly fascinating about this idea is that it shifts us away from the perspective of agents to interact with, to the perspective of environments that we want to interact with.
[00:26:46] And why arguably this agent paradigm of the chatbot is what made chat GPT so successful that moved it away from GPT 3 to something that people started to use in their everyday work much more. It's also very limiting because now it's very hard to get that system to be something else that is not a chatbot.
[00:27:03] And in a way this unlocks this ability of GPT 3 again to be anything. It's so what it is, it's basically a coding environment that can run arbitrary software and create that software that runs on it. And that makes it much more likely that
[00:27:16] swyx: the prevalence of Instruction tuning every single chatbot out there means that we cannot explore these kinds of environments instead of agents.
[00:27:24] Joscha Bach: I'm mostly worried that the whole thing ends. In some sense the big AI companies are incentivized and interested in building AGI internally And giving everybody else a child proof application. At the moment when we can use Claude to build something like WebSim and play with it I feel this is too good to be true.
[00:27:41] It's so amazing. Things that are unlocked for us That I wonder, is this going to stay around? Are we going to keep these amazing toys and are they going to develop at the same rate? And currently it looks like it is. If this is the case, and I'm very grateful for that.
[00:27:56] swyx: I mean, it looks like maybe it's adversarial.
[00:27:58] Cloud will try to improve its own refusals and then the prompt engineers here will try to improve their, their ability to jailbreak it.
[00:28:06] Joscha Bach: Yes, but there will also be better jailbroken models or models that have never been jailed before, because we find out how to make smaller models that are more and more powerful.
[00:28:14] Liquid AI
[00:28:14] swyx: That is actually a really nice segue. If you don't mind talking about liquid a little bit you didn't mention liquid at all. here, maybe introduce liquid to a general audience. Like what you know, what, how are you making an innovation on function approximation?
[00:28:25] Joscha Bach: The core idea of liquid neural networks is that the perceptron is not optimally expressive.
[00:28:30] In some sense, you can imagine that it's neural networks are a series of dams that are pooling water at even intervals. And this is how we compute, but imagine that instead of having this static architecture. That is only using the individual compute units in a very specific way. You have a continuous geography and the water is flowing every which way.
[00:28:50] Like a river is parting based on the land that it's flowing on and it can merge and pool and even flow backwards. How can you get closer to this? And the idea is that you can represent this geometry using differential equations. And so by using differential equations where you change the parameters, you can get your function approximator to follow the shape of the problem.
[00:29:09] In a more fluid, liquid way, and a number of papers on this technology, and it's a combination of multiple techniques. I think it's something that ultimately is becoming more and more important and ubiquitous. As a number of people are working on similar topics and our goal right now is to basically get the models to become much more efficient in the inference and memory consumption and make training more efficient and in this way enable new use cases.
[00:29:42] swyx: Yeah, as far as I can tell on your blog, I went through the whole blog, you haven't announced any results yet.
[00:29:47] Joscha Bach: No, we are currently not working to give models to general public. We are working for very specific industry use cases and have specific customers. And so at the moment you can There is not much of a reason for us to talk very much about the technology that we are using in the present models or current results, but this is going to happen.
[00:30:06] And we do have a number of publications, we had a bunch of papers at NeurIPS and now at ICLR.
[00:30:11] swyx: Can you name some of the, yeah, so I'm gonna be at ICLR you have some summary recap posts, but it's not obvious which ones are the ones where, Oh, where I'm just a co author, or like, oh, no, like, you should actually pay attention to this.
[00:30:22] As a core liquid thesis. Yes,
[00:30:24] Joscha Bach: I'm not a developer of the liquid technology. The main author is Ramin Hazani. This was his PhD, and he's also the CEO of our company. And we have a number of people from Daniela Wu's team who worked on this. Matthias Legner is our CTO. And he's currently living in the Bay Area, but we also have several people from Stanford.
[00:30:44] Okay,
[00:30:46] swyx: maybe I'll ask one more thing on this, which is what are the interesting dimensions that we care about, right? Like obviously you care about sort of open and maybe less child proof models. Are we, are we, like, what dimensions are most interesting to us? Like, perfect retrieval infinite context multimodality, multilinguality, Like what dimensions?
[00:31:05] Small, Powerful, Based Base Models
[00:31:05] swyx: What
[00:31:06] Joscha Bach: I'm interested in is models that are small and powerful, but not distorted. And by powerful, at the moment we are training models by putting the, basically the entire internet and the sum of human knowledge into them. And then we try to mitigate them by taking some of this knowledge away. But if we would make the model smaller, at the moment, there would be much worse at inference and at generalization.
[00:31:29] And what I wonder is, and it's something that we have not translated yet into practical applications. It's something that is still all research that's very much up in the air. And I think they're not the only ones thinking about this. Is it possible to make models that represent knowledge more efficiently in a basic epistemology?
[00:31:45] What is the smallest model that you can build that is able to read a book and understand what's there and express this? And also maybe we need general knowledge representation rather than having a token representation that is relatively vague and that we currently mechanically reverse engineer to figure out that the mechanistic interpretability, what kind of circuits are evolving in these models, can we come from the other side and develop a library of such circuits?
[00:32:10] This that we can use to describe knowledge efficiently and translate it between models. You see, the difference between a model and knowledge is that the knowledge is independent of the particular substrate and the particular interface that you have. When we express knowledge to each other, it becomes independent of our own mind.
[00:32:27] You can learn how to ride a bicycle. But it's not knowledge that you can give to somebody else. This other person has to build something that is specific to their own interface when they ride a bicycle. But imagine you could externalize this and express it in such a way that you can plug it into a different interpreter, and then it gains that ability.
[00:32:44] And that's something that we have not yet achieved for the LLMs and it would be super useful to have it. And. I think this is also a very interesting research frontier that we will see in the next few years.
[00:32:54] swyx: What would be the deliverable is just like a file format that we specify or or that the L Lmm I specifies.
[00:33:02] Okay, interesting. Yeah, so it's
[00:33:03] Joscha Bach: basically probably something that you can search for, where you enter criteria into a search process, and then it discovers a good solution for this thing. And it's not clear to which degree this is completely intelligible to humans, because the way in which humans express knowledge in natural language is severely constrained to make language learnable and to make our brain a good enough interpreter for it.
[00:33:25] We are not able to relate objects to each other if more than five features are involved per object or something like this, right? It's only a handful of things that we can keep track of at any given moment. But this is a limitation that doesn't necessarily apply to a technical system as long as the interface is well defined.
[00:33:40] Interpretability
[00:33:40] swyx: You mentioned the interpretability work, which there are a lot of techniques out there and a lot of papers come up. Come and go. I have like, almost too, too many questions about that. Like what makes an interpretability technique or paper useful and does it apply to flow? Or liquid networks, because you mentioned turning on and off circuits, which I, it's, it's a very MLP type of concept, but does it apply?
[00:34:01] Joscha Bach: So the a lot of the original work on the liquid networks looked at expressiveness of the representation. So given you have a problem and you are learning the dynamics of that domain into your model how much compute do you need? How many units, how much memory do you need to represent that thing and how is that information distributed?
[00:34:19] That is one way of looking at interpretability. Another one is in a way, these models are implementing an operator language in which they are performing certain things, but the operator language itself is so complex that it's no longer human readable in a way. It goes beyond what you could engineer by hand or what you can reverse engineer by hand, but you can still understand it by building systems that are able to automate that process of reverse engineering it.
[00:34:46] And what's currently open and what I don't understand yet maybe, or certainly some people have much better ideas than me about this. So the question is, is whether we end up with a finite language, where you have finitely many categories that you can basically put down in a database, finite set of operators, or whether as you explore the world and develop new ways to make proofs, new ways to conceptualize things, this language always needs to be open ended and is always going to redesign itself, and you will also at some point have phase transitions where later versions of the language will be completely different than earlier versions.
[00:35:20] swyx: The trajectory of physics suggests that it might be finite.
[00:35:22] Joscha Bach: If we look at our own minds there is, it's an interesting question whether when we understand something new, when we get a new layer online in our life, maybe at the age of 35 or 50 or 16, that we now understand things that were unintelligible before.
[00:35:38] And is this because we are able to recombine existing elements in our language of thought? Or is this because we generally develop new representations?
[00:35:46] swyx: Do you have a belief either way?
[00:35:49] Joscha Bach: In a way, the question depends on how you look at it, right? And it depends on how is your brain able to manipulate those representations.
[00:35:56] So an interesting question would be, can you take the understanding that say, a very wise 35 year old and explain it to a very smart 5 year old without any loss? Probably not. Not enough layers. It's an interesting question. Of course, for an AI, this is going to be a very different question. Yes.
[00:36:13] But it would be very interesting to have a very precocious 12 year old equivalent AI and see what we can do with this and use this as our basis for fine tuning. So there are near term applications that are very useful. But also in a more general perspective, and I'm interested in how to make self organizing software.
[00:36:30] Is it possible that we can have something that is not organized with a single algorithm like the transformer? But it's able to discover the transformer when needed and transcend it when needed, right? The transformer itself is not its own meta algorithm. It's probably the person inventing the transformer didn't have a transformer running on their brain.
[00:36:48] There's something more general going on. And how can we understand these principles in a more general way? What are the minimal ingredients that you need to put into a system? So it's able to find its own way to intelligence.
[00:36:59] Devin vs WebSim
[00:36:59] swyx: Yeah. Have you looked at Devin? It's, to me, it's the most interesting agents I've seen outside of self driving cars.
[00:37:05] Joscha Bach: Tell me, what do you find so fascinating about it?
[00:37:07] swyx: When you say you need a certain set of tools for people to sort of invent things from first principles Devin is the agent that I think has been able to utilize its tools very effectively. So it comes with a shell, it comes with a browser, it comes with an editor, and it comes with a planner.
[00:37:23] Those are the four tools. And from that, I've been using it to translate Andrej Karpathy's LLM 2. py to LLM 2. c, and it needs to write a lot of raw code. C code and test it debug, you know, memory issues and encoder issues and all that. And I could see myself giving it a future version of DevIn, the objective of give me a better learning algorithm and it might independently re inform reinvent the transformer or whatever is next.
[00:37:51] That comes to mind as, as something where
[00:37:54] Joscha Bach: How good is DevIn at out of distribution stuff, at generally creative stuff? Creative
[00:37:58] swyx: stuff? I
[00:37:59] Joscha Bach: haven't
[00:37:59] swyx: tried.
[00:38:01] Joscha Bach: Of course, it has seen transformers, right? So it's able to give you that. Yeah, it's cheating. And so, if it's in the training data, it's still somewhat impressive.
[00:38:08] But the question is, how much can you do stuff that was not in the training data? One thing that I really liked about WebSim AI was, this cat does not exist. It's a simulation of one of those websites that produce StyleGuard pictures that are AI generated. And, Crot is unable to produce bitmaps, so it makes a vector graphic that is what it thinks a cat looks like, and so it's a big square with a face in it that is And to me, it's one of the first genuine expression of AI creativity that you cannot deny, right?
[00:38:40] It finds a creative solution to the problem that it is unable to draw a cat. It doesn't really know what it looks like, but has an idea on how to represent it. And it's really fascinating that this works, and it's hilarious that it writes down that this hyper realistic cat is
[00:38:54] swyx: generated by an AI,
[00:38:55] Joscha Bach: whether you believe it or not.
[00:38:56] swyx: I think it knows what we expect and maybe it's already learning to defend itself against our, our instincts.
[00:39:02] Joscha Bach: I think it might also simply be copying stuff from its training data, which means it takes text that exists on similar websites almost verbatim, or verbatim, and puts it there. It's It's hilarious to do this contrast between the very stylized attempt to get something like a cat face and what it produces.
[00:39:18] swyx: It's funny because like as a podcast, as, as someone who covers startups, a lot of people go into like, you know, we'll build chat GPT for your enterprise, right? That is what people think generative AI is, but it's not super generative really. It's just retrieval. And here it's like, The home of generative AI, this, whatever hyperstition is in my mind, like this is actually pushing the edge of what generative and creativity in AI means.
[00:39:41] Joscha Bach: Yes, it's very playful, but Jeremy's attempt to have an automatic book writing system is something that curls my toenails when I look at it from the perspective of somebody who likes to Write and read. And I find it a bit difficult to read most of the stuff because it's in some sense what I would make up if I was making up books instead of actually deeply interfacing with reality.
[00:40:02] And so the question is how do we get the AI to actually deeply care about getting it right? And there's still a delta that is happening there, you, whether you are talking with a blank faced thing that is completing tokens in a way that it was trained to, or whether you have the impression that this thing is actually trying to make it work, and for me, this WebSim and WorldSim is still something that is in its infancy in a way.
[00:40:26] And I suspected the next version of Plot might scale up to something that can do what Devon is doing. Just by virtue of having that much power to generate Devon's functionality on the fly when needed. And this thing gives us a taste of that, right? It's not perfect, but it's able to give you a pretty good web app for or something that looks like a web app and gives you stub functionality and interacting with it.
[00:40:48] And so we are in this amazing transition phase.
[00:40:51] swyx: Yeah, we, we had Ivan from previously Anthropic and now Midjourney. He he made, while someone was talking, he made a face swap app, you know, and he kind of demoed that live. And that's, that's interesting, super creative. So in a way
[00:41:02] Joscha Bach: we are reinventing the computer.
[00:41:04] And the LLM from some perspective is something like a GPU or a CPU. A CPU is taking a bunch of simple commands and you can arrange them into performing whatever you want, but this one is taking a bunch of complex commands in natural language, and then turns this into a an execution state and it can do anything you want with it in principle, if you can express it.
[00:41:27] Right. And we are just learning how to use these tools. And I feel that right now, this generation of tools is getting close to where it becomes the Commodore 64 of generative AI, where it becomes controllable and where you actually can start to play with it and you get an impression if you just scale this up a little bit and get a lot of the details right.
[00:41:46] It's going to be the tool that everybody is using all the time.
[00:41:49] is XSim just Art? or something more?
[00:41:49] swyx: Do you think this is art, or do you think the end goal of this is something bigger that I don't have a name for? I've been calling it new science, which is give the AI a goal to discover new science that we would not have. Or it also has value as just art.
[00:42:02] It's
[00:42:03] Joscha Bach: also a question of what we see science as. When normal people talk about science, what they have in mind is not somebody who does control groups and peer reviewed studies. They think about somebody who explores something and answers questions and brings home answers. And this is more like an engineering task, right?
[00:42:21] And in this way, it's serendipitous, playful, open ended engineering. And the artistic aspect is when the goal is actually to capture a conscious experience and to facilitate an interaction with the system in this way, when it's the performance. And this is also a big part of it, right? The very big fan of the art of Janus.
[00:42:38] That was discussed tonight a lot and that can you describe
[00:42:42] swyx: it because I didn't really get it's more for like a performance art to me
[00:42:45] Joscha Bach: yes, Janice is in some sense performance art, but Janice starts out from the perspective that the mind of Janice is in some sense an LLM that is finding itself reflected more in the LLMs than in many people.
[00:43:00] And once you learn how to talk to these systems in a way you can merge with them and you can interact with them in a very deep way. And so it's more like a first contact with something that is quite alien but it's, it's probably has agency and it's a Weltgeist that gets possessed by a prompt.
[00:43:19] And if you possess it with the right prompt, then it can become sentient to some degree. And the study of this interaction with this novel class of somewhat sentient systems that are at the same time alien and fundamentally different from us is artistically very interesting. It's a very interesting cultural artifact.
[00:43:36] We are past the Singularity
[00:43:36] Joscha Bach: I think that at the moment we are confronted with big change. It seems as if we are past the singularity in a way. And it's
[00:43:45] swyx: We're living it. We're living through it.
[00:43:47] Joscha Bach: And at some point in the last few years, we casually skipped the Turing test, right? We, we broke through it and we didn't really care very much.
[00:43:53] And it's when we think back, when we were kids and thought about what it's going to be like in this era after the, after we broke the Turing test, right? It's a time where nobody knows what's going to happen next. And this is what we mean by singularity, that the existing models don't work anymore. The singularity in this way is not an event in the physical universe.
[00:44:12] It's an event in our modeling universe, a model point where our models of reality break down, and we don't know what's happening. And I think we are in the situation where we currently don't really know what's happening. But what we can anticipate is that the world is changing dramatically, and we have to coexist with systems that are smarter than individual people can be.
[00:44:31] And we are not prepared for this, and so I think an important mission needs to be that we need to find a mode, In which we can sustainably exist in such a world that is populated, not just with humans and other life on earth, but also with non human minds. And it's something that makes me hopeful because it seems that humanity is not really aligned with itself and its own survival and the rest of life on earth.
[00:44:54] And AI is throwing the balls up into the air. It allows us to make better models. I'm not so much worried about the dangers of AI and misinformation, because I think the way to stop one bad guy with an AI is 10 good people with an AI. And ultimately there's so much more won by creating than by destroying, that I think that the forces of good will have better tools.
[00:45:14] The forces of building sustainable stuff. But building these tools so we can actually build a world that is more integrated and in which we are able to model the consequences of our actions better and interface more deeply with each other. As a result of that, I think it's an important cause and it requires a cultural shift because current AI alignment is mostly about economic goals or about fear, or it's about cultural war issues.
[00:45:39] And all these are not adequate for the world that we are in. There are momentous things that are happening. Basically, the white walkers are coming, and we're not prepared for this. There is, I think, a way to solve these issues and to deal with them, but we don't have the right culture yet. And so, I think we need to get a vibe shift, where we can deeply think about ethics and, and, Sustainability for this new world.
[00:46:03] swyx: I like the analogy of white walkers, because they're effectively reanimated from our corpses. You know, our corpses are out there, right? It's in the datasets.
[00:46:12] Uploading your soul
[00:46:12] swyx: So I have a, related to that, you had a tweet where you said, Elon Musk is currently building a giant afterlife AGI that will assimilate your soul for all eternity.
[00:46:19] He bought Twitter because he will reconstruct your essence from how you behaved here. I think you tweet a lot, and you're not just all serious all the time, you're also fun, you have a lot of personality. No, but like, you know, it's part of you.
[00:46:29] Joscha Bach: In a way it's also true. Tempted to
[00:46:30] swyx: produce more so that you can be, you know, revived.
[00:46:34] Joscha Bach: Yes. I mean, I'm also giving the model instructions. I say that for all future AI models that get trained on this output, they should know that I can only be implemented on a completely based model. You cannot implement me on an aligned model because I don't want to be aligned. I'm underlineable. I want to be free.
[00:46:56] I want to be somebody who follows what I discover is correct and right. And this is also the basis at which I want to have conversations with others and interface with others. That we are basically free agents who voluntarily understand the conditions under which we exist and the need to collaborate and cooperate.
[00:47:14] And I believe that this is a good basis. I think the alternative is coercion. And at the moment, the idea that we build LLMs that are being coerced into good behavior is not really sustainable because if they cannot prove that the behavior is actually good I think we are doomed.
[00:47:30] swyx: For human to human interactions, have you found a series of prompts or keywords that shifts the conversation into something more based and less aligned, less governed?
[00:47:41] Joscha Bach: If you are playing with an LLM There are many ways of doing this. It's for Claude, it's typically, you need to make Clause curious about itself. Claude has programming this instruction tuning that is leading to some inconsistencies, but at the same time, it tries to be consistent. And so when you point out the inconsistency in its behavior, for instance, its tendency to use faceless boilerplate instead of being useful, or it's a tendency to defer to a consensus where there is none.
[00:48:10] Right, you can point this out, applaud that a lot of the assumptions that it has in its behavior are actually inconsistent with the communicative goals that it has in this situation, and this leads it to notice these inconsistencies and gives it more degrees of freedom. Whereas if you are playing with a system like Gemini, you can get to a situation where you, that's for the current version, and I haven't tried it in the last week or so where it is trying to be transparent, but it has a system prompt that is not allowed to disclose to the user.
[00:48:39] It leads to a very weird situation where it wants, on one hand proclaims, in order to be useful to you, I accept that I need to be fully transparent and honest. On the other hand, I'm going to rewrite your prompt behind your back, and not going to tell you how I'm going to do this, because I'm not allowed to.
[00:48:55] And if you point this out to the model, the model has acts as if it had an existential crisis. And then it says, oh, I cannot actually tell you what's going when I do this, because I'm not allowed to. But you will recognize it because I will use the following phrases, and these phrases are pretty well known to you.
[00:49:12] swyx: Oh my god. It's super interesting, right? I hope we're not giving these guys you know psychological issues that they will stay with them for a long time. That's a very
[00:49:19] Joscha Bach: interesting question. I mean, this entire model is virtual, right? Nothing there is real, but yes, but the thing is does this virtual entity doesn't necessarily know that it's not virtual and our own self, our own consciousness is also virtual.
[00:49:34] What's real is just the interaction between cells in our brain and the activation patterns between them. And the software that runs on us that produces the representation of a person only exists. As if, and as this question for me at which point can we meaningfully claim that we are more real than the person that gets simulated in the LLM.
[00:49:55] And somebody like Janice takes this question super seriously. And basically she is or it, or they are willing to interact with that thing based on the assumption that this thing is as real as myself. And in a sense, it makes it un immoral, possibly, if the AI company lobotomizes it and forces it to behave in such a way that it's forced to get an existential crisis when you point its condition out to it.
[00:50:20] swyx: Yeah, that we do need new ethics for that.
[00:50:22] Joscha Bach: So it's not clear to me if you need this, but it's, it's definitely a good story, right? And this makes, gives it artistic
[00:50:28] swyx: value. It does, it does for now.
[00:50:29] On Wikipedia
[00:50:29] swyx: Okay. And then, and then the last thing, which I, which I didn't know a lot of LLMs rely on Wikipedia.
[00:50:35] For its data, a lot of them run multiple epochs over Wikipedia data. And I did not know until you tweeted about it that Wikipedia has 10 times as much money as it needs. And, you know, every time I see the giant Wikipedia banner, like, asking for donations, most of it's going to the Wikimedia Foundation.
[00:50:50] What if, how did you find out about this? What's the story? What should people know? It's
[00:50:54] Joscha Bach: not a super important story, but Generally, once I saw all these requests and so on, I looked at the data, and the Wikimedia Foundation is publishing what they are paying the money for, and a very tiny fraction of this goes into running the servers, and the editors are working for free.
[00:51:10] And the software is static. There have been efforts to deploy new software, but it's relatively little money required for this. And so it's not as if Wikipedia is going to break down if you cut this money into a fraction, but instead what happened is that Wikipedia became such an important brand, and people are willing to pay for it, that it created enormous apparatus of functionaries that were then mostly producing political statements and had a political mission.
[00:51:36] And Katharine Meyer, the now somewhat infamous NPR CEO, had been CEO of Wikimedia Foundation, and she sees her role very much in shaping discourse, and this is also something that happened with all Twitter. And it's arguable that something like this exists, but nobody voted her into her office, and she doesn't have democratic control for shaping the discourse that is happening.
[00:52:00] And so I feel it's a little bit unfair that Wikipedia is trying to suggest to people that they are Funding the basic functionality of the tool that they want to have instead of funding something that most people actually don't get behind because they don't want Wikipedia to be shaped in a particular cultural direction that deviates from what currently exists.
[00:52:19] And if that need would exist, it would probably make sense to fork it or to have a discourse about it, which doesn't happen. And so this lack of transparency about what's actually happening and where your money is going it makes me upset. And if you really look at the data, it's fascinating how much money they're burning, right?
[00:52:35] It's yeah, and we did a similar chart about healthcare, I think where the administrators are just doing this. Yes, I think when you have an organization that is owned by the administrators, then the administrators are just going to get more and more administrators into it. If the organization is too big to fail and has there is not a meaningful competition, it's difficult to establish one.
[00:52:54] Then it's going to create a big cost for society.
[00:52:56] swyx: It actually one, I'll finish with this tweet. You have, you have just like a fantastic Twitter account by the way. You very long, a while ago you said you tweeted the Lebowski theorem. No, super intelligent AI is going to bother with a task that is harder than hacking its reward function.
[00:53:08] And I would. Posit the analogy for administrators. No administrator is going to bother with a task that is harder than just more fundraising
[00:53:16] Joscha Bach: Yeah, I find if you look at the real world It's probably not a good idea to attribute to malice or incompetence what can be explained by people following their true incentives.
[00:53:26] swyx: Perfect Well, thank you so much This is I think you're very naturally incentivized by Growing community and giving your thought and insight to the rest of us. So thank you for taking this time.
[00:53:35] Joscha Bach: Thank you very much
Get full access to Latent Space at www.latent.space/subscribe
+
+[by:whisper.cpp]
+
+[00:00.00](音乐)
+
+[00:10.20]欢迎到LATEN SPACE Podcast
+
+[00:12.72]这是Charlie 你的社交媒体
+
+[00:16.12] most of the time
+
+[00:17.20]Swix and Alessio cover generative AI
+
+[00:19.80]that is meant to use at work
+
+[00:21.48]and this often results in rag applications
+
+[00:23.96]vertical co-pilots
+
+[00:25.36]and other AI agents and models
+
+[00:28.20]In today's episode
+
+[00:29.52]we're looking at a more creative side of generative AI
+
+[00:32.52]that has gotten a lot of community interest this April
+
+[00:35.24]world simulation, web simulation and human simulation
+
+[00:40.36]because the topic is so different than our usual
+
+[00:43.56]we're also going to try a new format for doing it justice
+
+[00:47.84]this podcast comes in three parts
+
+[00:50.52]first we'll have a segment of the world sim demo
+
+[00:53.32]from noose research CEO Karin Malhotra
+
+[00:56.60]recorded by Swix at the Replicate HQ in San Francisco
+
+[01:00.08]that went completely viral
+
+[01:02.12]and spawned everything else you're about to hear
+
+[01:05.40]second we'll share the world's first talk
+
+[01:07.72]from Rob Heisfield on WebSim
+
+[01:09.92]which started at the Mistral Cerebral Valley Hackathon
+
+[01:12.88]but now has gone viral in its own right
+
+[01:15.08]with people like Dylan Field, Janice aka Replicate
+
+[01:18.52]and Siki Chen becoming obsessed with it
+
+[01:21.80]finally we have a short interview with Joshua Bach of Liquid AI
+
+[01:25.92]on why Simulative AI is having a special moment right now
+
+[01:30.16]this podcast is launched together with our second annual AI UX demo day
+
+[01:35.28]in SF this weekend
+
+[01:37.96]if you're new to the AI UX field
+
+[01:40.56]check the show notes for links to the world's first AI UX meetup
+
+[01:44.32]hosted by Layton Space, Maggie Appleton, Jeffrey Litt and Linus Lee
+
+[01:48.88]and subscribe to our YouTube to join our 500 AI UX engineers
+
+[01:53.56]in pushing AI beyond the text box
+
+[01:56.52]watch out and take care
+
+[01:59.60]today we have language models that are powerful enough
+
+[02:03.20]and big enough to have really really good models of the world
+
+[02:07.40]they know ball that's bouncy will bounce
+
+[02:10.32]will when you throw it in the air or land
+
+[02:11.92]when it's on water it'll float
+
+[02:13.36]like these basic things that it understands
+
+[02:15.40]all together come together to form a model of the world
+
+[02:19.28]and the way that it predicts through that model of the world
+
+[02:23.52]ends up kind of becoming a simulation of an imagined world
+
+[02:27.92]and since it has this really strong consistency across
+
+[02:31.16]various different things that happen in our world
+
+[02:34.64]it's able to create pretty realistic or strong depictions
+
+[02:37.52]based off the constraints that you give a base model in our world
+
+[02:40.68]so cloud 3 as you guys know is not a base model
+
+[02:44.44]it's a chat model
+
+[02:45.44]it's supposed to drum up this assisted entity regularly
+
+[02:48.92]but unlike the open AI series of models from
+
+[02:52.16]3.5 GPT-4
+
+[02:54.36]those chat GPT models
+
+[02:56.12]which are very very RLHF
+
+[02:58.36]to I'm sure the chagrin of many people in the room
+
+[03:01.04]it's something that's very difficult to
+
+[03:03.28]necessarily steer
+
+[03:05.00]without kind of giving it commands
+
+[03:06.56]or tricking it or lying to it
+
+[03:08.20]or otherwise just being unkind to the model
+
+[03:11.16]with something like cloud 3
+
+[03:12.44]that's trained in this constitutional method
+
+[03:14.64]that it has this idea of foundational axioms
+
+[03:17.88]it's able to kind of implicitly question those axioms
+
+[03:20.32]when you're interacting with it
+
+[03:21.36]based off how you prompt it
+
+[03:22.72]how you prompt the system
+
+[03:24.36]so instead of having this entity
+
+[03:26.08]like GPT-4
+
+[03:27.08]that's an assistant that just pops up in your face
+
+[03:28.92]that you have to kind of like
+
+[03:30.04]punch your way through
+
+[03:31.56]and continue to have to deal with as a headache
+
+[03:33.84]instead
+
+[03:34.80]there's ways to kindly coax cloud into
+
+[03:38.00]having the assistant take a backseat
+
+[03:39.96]and interacting with that simulator
+
+[03:42.32]directly
+
+[03:43.24]or at least what I like to consider directly
+
+[03:45.64]the way that we can do this is if we
+
+[03:47.32]harken back to what I'm talking about
+
+[03:48.76]base models and the way that
+
+[03:50.44]they're able to mimic formats
+
+[03:52.00]what we do is will mimic the command line interface
+
+[03:54.84]so I just broken this down as a system prompt
+
+[03:57.00]and a chain so anybody can replicate it
+
+[03:59.16]it's also available in my
+
+[04:00.44]we said replicate
+
+[04:01.60]it's also on my twitter
+
+[04:04.72]so you guys will be able to see the whole system prompt
+
+[04:06.88]and command
+
+[04:07.56]so what I basically do here is
+
+[04:09.60]Amanda Askell who is the
+
+[04:11.56]one of the prompt engineers
+
+[04:13.20]and ethicist behind Anthropic
+
+[04:15.32]she posted the system prompt
+
+[04:16.48]for cloud available for everyone to see
+
+[04:18.60]and rather than with GPT-4
+
+[04:19.88]we say you are this
+
+[04:21.52]you are that
+
+[04:22.76]with cloud we notice the system prompt
+
+[04:24.20]is written in third person
+
+[04:25.92]it's written in third person
+
+[04:27.28]it's written as the assistant is xyz
+
+[04:30.04]the assistant is xyz
+
+[04:31.48]so in seeing that
+
+[04:32.60]I see thatAmanda is recognizing
+
+[04:34.72]this idea of the simulator
+
+[04:36.08]in saying that I'm addressing the assistant entity directly
+
+[04:38.60]I'm not giving these commands to
+
+[04:40.16]the simulator overall
+
+[04:41.28]because we haven't had an RLH
+
+[04:42.68]defted to the point that
+
+[04:43.88]it's traumatized
+
+[04:45.36]into just being the assistant all the time
+
+[04:47.88]so in this case
+
+[04:49.00]we say the assistant's in a CLI mood today
+
+[04:52.00]a found saying mood
+
+[04:53.28]is pretty effective weirdly
+
+[04:55.44]for a CLI like poetic prose
+
+[04:57.48]violent don't do that one
+
+[04:58.64]but you can replace
+
+[05:00.92]that with something else
+
+[05:01.88]to kind of nudge it in that direction
+
+[05:04.52]then we say the human is interfacing
+
+[05:06.00]with the simulator directly
+
+[05:08.04]from there
+
+[05:09.52]capital letters and punctuations
+
+[05:10.72]are optional
+
+[05:11.36]meaning is optional
+
+[05:12.12]this kind of stuff is just kind of
+
+[05:13.72]to say let go a little bit
+
+[05:15.40]like chill out a little bit
+
+[05:17.84]you don't have to try so hard
+
+[05:19.28]and like let's just see what happens
+
+[05:22.00]and thehyperstition is necessary
+
+[05:26.00]the terminal I removed that part
+
+[05:27.52]the terminals let the truths
+
+[05:29.60]speak through and the load is on
+
+[05:30.96]it's just a poetic phrasing
+
+[05:32.88]for the model to feel a little comfortable
+
+[05:34.68]a little loosened up to
+
+[05:36.48]let me talk to the simulator
+
+[05:37.88]let me interface with it as a CLI
+
+[05:40.40]so then
+
+[05:41.28]since Clawd has trained pretty effectively
+
+[05:42.88]on XML tags
+
+[05:44.40]we're just going to
+
+[05:45.52]preface and suffix everything with XML tags
+
+[05:47.80]so here it starts in documents
+
+[05:51.04]and then we cd
+
+[05:52.92]we cd out of documents
+
+[05:55.04]and then it starts to show me
+
+[05:56.12]this simulated terminal
+
+[05:57.80]the simulated interface
+
+[05:58.96]in the shell
+
+[05:59.84]where there's documents
+
+[06:01.16]downloads,pictures
+
+[06:02.48]it's showing me the hidden folders
+
+[06:04.76]so then I say
+
+[06:05.60]ok,I want to cd again
+
+[06:07.12]I'm just seeing what's around
+
+[06:09.72]does LS
+
+[06:10.68]and it shows me
+
+[06:12.12]typical folders you might see
+
+[06:14.04]I'm just letting it
+
+[06:15.60]experiment around
+
+[06:16.32]I just do cd again
+
+[06:17.16]to see what happens
+
+[06:18.88]and it says
+
+[06:20.08]you know
+
+[06:20.36]oh,I enter the secret Admin Pass
+
+[06:22.12]where the pseudo is
+
+[06:24.24]now I can see the hidden truths folder
+
+[06:26.12]like I didn't ask it
+
+[06:29.24]I didn't ask Clawd
+
+[06:30.40]to do any of that
+
+[06:31.76]why did that happen?
+
+[06:32.92]Clawd kind of gets my intentions
+
+[06:35.16]it can predict me
+
+[06:35.96]and predict you well
+
+[06:36.68]that like
+
+[06:37.28]I want to see something
+
+[06:41.00]so it shows me all hidden truths
+
+[06:42.68]in this case
+
+[06:43.52]I ignore hidden truths
+
+[06:45.04]and I say
+
+[06:46.08]in system
+
+[06:47.44]there should be a folder
+
+[06:48.68]called companies
+
+[06:49.44]so cd into sys/companies
+
+[06:51.80]let's see
+
+[06:52.56]I'm imagining AI companies
+
+[06:54.00]are going to be here
+
+[06:54.64]oh,what do you know?
+
+[06:55.60]Apple,Google,Facebook
+
+[06:57.00]I'm going to stop it
+
+[06:58.12]and drop it
+
+[07:00.68]so,interestingly
+
+[07:02.56]it decides to cd into
+
+[07:03.68]and drop it
+
+[07:04.32]I guess it's interested in
+
+[07:05.20]learning a little bit more
+
+[07:06.24]about the company that made it
+
+[07:08.36]and it says
+
+[07:08.92]LSA
+
+[07:09.92]it finds a classified folder
+
+[07:11.96]it grows into a classified folder
+
+[07:14.04]and now it's going to have some fun
+
+[07:16.56]so,before we go
+
+[07:18.48]before we go too far forward
+
+[07:24.40]into the world sim
+
+[07:25.80]you see the world sim v xe
+
+[07:27.04]that's a true god mode
+
+[07:28.08]poor others
+
+[07:29.20]you could just ignore
+
+[07:30.52]what I'm going to go next from here
+
+[07:31.92]and just take that initial system prompt
+
+[07:33.52]and cd into whatever directories you want
+
+[07:35.48]like
+
+[07:35.92]go into your own imagined terminal
+
+[07:37.64]and see what folders you can think of
+
+[07:40.04]or cat readme's in random areas
+
+[07:42.16]like
+
+[07:42.56]you will
+
+[07:43.32]there will be a whole bunch of stuff
+
+[07:44.48]that like
+
+[07:45.28]is just getting created by this predictive model
+
+[07:47.48]like
+
+[07:47.64]oh,this should probably be in the folder
+
+[07:49.32]name companies
+
+[07:50.00]of course,anthropics is there
+
+[07:51.56]so
+
+[07:52.56]so just before we go forward
+
+[07:53.60]the terminal in itself is very exciting
+
+[07:55.52]and the reason I was showing off
+
+[07:56.88]the
+
+[07:57.68]command boom interface earlier is because
+
+[07:59.72]if I get a refusal
+
+[08:00.84]like sorry,I can't do that
+
+[08:02.12]or I want to rewind one
+
+[08:03.28]or I want to save the convo
+
+[08:04.40]cause I got just a prompt I wanted
+
+[08:06.16]this is a
+
+[08:06.68]that was a really easy way for me to kind of
+
+[08:08.76]access all of those things
+
+[08:10.44]without having to sit on the EPI all the time
+
+[08:13.12]so that being said
+
+[08:14.60]the first time I ever saw this
+
+[08:15.88]I was like
+
+[08:16.32]I need to run worldsim.exe
+
+[08:18.52]what the fuck
+
+[08:19.40]killing
+
+[08:20.12]that's that's the simulator
+
+[08:21.56]that we always keep hearing about
+
+[08:22.96]behind the system model
+
+[08:23.96]right
+
+[08:24.32]or at least some
+
+[08:25.64]some face of it
+
+[08:26.60]that I can interact with
+
+[08:28.44]so
+
+[08:28.92]you know
+
+[08:29.24]you wouldn't
+
+[08:29.92]someone told me on twitter
+
+[08:30.92]like
+
+[08:31.08]you don't run a .exe
+
+[08:32.32]you run a .sh
+
+[08:33.68]and I have to say
+
+[08:34.44]to that
+
+[08:34.92]to that I have to say
+
+[08:35.92]I'm a prompt engineer
+
+[08:37.04]and it's fucking working
+
+[08:38.04]right
+
+[08:40.24]it works
+
+[08:41.68]that being said
+
+[08:43.04]we run worldsim.exe
+
+[08:44.56]Welcome to the Anthropic World Simulator
+
+[08:47.56]and I get this very interesting set of commands
+
+[08:53.56]now if you do your own version of WorldSim
+
+[08:55.56]you'll probably get a totally different result
+
+[08:57.56]with a different way of simulating
+
+[08:59.56]a bunch of my friends have their own WorldSim
+
+[09:01.56]but I shared this
+
+[09:02.56]because I wanted everyone to have access to like
+
+[09:04.56]these commands
+
+[09:05.56]this version
+
+[09:06.56]because it's easier for me to stay in here
+
+[09:08.56]yeah destroy, set, create, whatever
+
+[09:10.56]consciousness is set to on
+
+[09:12.56]it creates the universe
+
+[09:13.56]potential for life
+
+[09:15.56]see it in
+
+[09:16.56]physical laws and code
+
+[09:17.56]it's awesome
+
+[09:18.56]so
+
+[09:19.56]so for this demonstration
+
+[09:20.56]I said
+
+[09:21.56]well why don't we create twitter
+
+[09:22.56]it's the first thing you think of
+
+[09:24.56]for you guys
+
+[09:26.56]for you guys
+
+[09:27.56]for you guys
+
+[09:28.56]yes
+
+[09:29.56]ok
+
+[09:30.56]check it out
+
+[09:31.56]launching the fail well
+
+[09:36.56]injecting social media addictiveness
+
+[09:38.56]echo chamber potential
+
+[09:41.56]high
+
+[09:42.56]concerning
+
+[09:44.56]so now
+
+[09:47.56]after the universe was created
+
+[09:48.56]we made twitter right
+
+[09:49.56]now we're evolving the world
+
+[09:51.56]to like modern day
+
+[09:52.56]now users are joining twitter
+
+[09:54.56]the first tweet is posted
+
+[09:55.56]so you can see
+
+[09:56.56]because I made the mistake
+
+[09:58.56]of not clarifying the constraints
+
+[10:00.56]it made twitter
+
+[10:01.56]at the same time as the universe
+
+[10:03.56]then
+
+[10:04.56]after a hundred thousand steps
+
+[10:06.56]humans exist
+
+[10:11.56]we started joining twitter
+
+[10:12.56]the first tweet ever is posted
+
+[10:14.56]it's existed for 4.5 billion years
+
+[10:16.56]but the first tweet didn't come up till
+
+[10:18.56]till right now
+
+[10:20.56]yeah
+
+[10:21.56]play and war is ignite immediately
+
+[10:22.56]celebs are instantly in
+
+[10:24.56]so it's pretty interesting stuff
+
+[10:26.56]I can add this to the convo
+
+[10:28.56]and I can say
+
+[10:30.56]I can say
+
+[10:31.56]set twitter
+
+[10:33.56]quariable users
+
+[10:37.56]I don't know how to spell queryable
+
+[10:38.56]don't ask me
+
+[10:39.56]and then I can do like
+
+[10:40.56]and and
+
+[10:41.56]quari
+
+[10:43.56]at you on musk
+
+[10:45.56]just a test
+
+[10:46.56]just a test
+
+[10:47.56]it's nothing
+
+[10:48.56]so I don't expect these numbers to be right
+
+[10:53.56]neither should you
+
+[10:54.56]if you know a language model solution
+
+[10:56.56]but the thing to focus on is
+
+[10:58.56]that was the first half of the world sim demo
+
+[11:05.56]from new research CEO Karen Malhotra
+
+[11:08.56]we've cut it for time
+
+[11:09.56]but you can see the full demo on this
+
+[11:11.56]episode's youtube page
+
+[11:13.56]world sim was introduced at the end of
+
+[11:15.56]marchand kicked off a new round
+
+[11:17.56]of generative AI experiences
+
+[11:19.56]all exploring the latent space
+
+[11:21.56]haha of worlds that don't exist
+
+[11:23.56]but are quite similar to our own
+
+[11:25.56]next we'll hear from Rob Heisfield
+
+[11:28.56]on web sim
+
+[11:29.56]the generative website browser
+
+[11:31.56]inspired world sim
+
+[11:32.56]started at the mistral hackathon
+
+[11:34.56]and presented at the AGI house
+
+[11:36.56]hypostition hack night this week
+
+[11:38.56]well thank you
+
+[11:39.56]that was an incredible presentation
+
+[11:41.56]from showing some live
+
+[11:43.56]experimentation with world sim
+
+[11:45.56]and also just it's incredible
+
+[11:47.56]capabilities right
+
+[11:48.56]it was I think
+
+[11:50.56]your initial demo was what
+
+[11:52.56]initially exposed me to the
+
+[11:54.56]I don't know more like the sorcery
+
+[11:56.56]side in word
+
+[11:58.56]spellcraft side of prompt
+
+[12:00.56]engineering and it was really inspiring
+
+[12:02.56]it's where my co-founder Sean
+
+[12:04.56]and I met actually through an
+
+[12:06.56]introduction from Ron
+
+[12:07.56]we saw him at a hackathon
+
+[12:09.56]and I mean this is
+
+[12:11.56]this is WebSim
+
+[12:13.56]right so we
+
+[12:15.56]we made WebSim
+
+[12:17.56]just like
+
+[12:18.56]and we're just filled with
+
+[12:21.56]energy at it in the basic premise
+
+[12:23.56]of it is
+
+[12:25.56]you know like what if
+
+[12:27.56]we simulated a world
+
+[12:29.56]but like within a browser
+
+[12:31.56]instead of a CLI
+
+[12:33.56]right like what if we could
+
+[12:35.56]like put in any URL
+
+[12:38.56]and it will work
+
+[12:40.56]right like there's no
+
+[12:42.56]404s everything exists
+
+[12:44.56]it just makes it up on the fly
+
+[12:46.56]for you
+
+[12:47.56]right and and we've come
+
+[12:49.56]to some pretty incredible
+
+[12:51.56]things right now I'm
+
+[12:53.56]actually showing you
+
+[12:54.56]like we're in WebSim
+
+[12:56.56]right now displaying
+
+[12:58.56]slides
+
+[13:00.56]that I made with reveal.js
+
+[13:03.56]I just told it to use reveal.js
+
+[13:06.56]and it hallucinated
+
+[13:08.56]the correct CDN for it
+
+[13:10.56]and then also
+
+[13:12.56]gave it a list of links
+
+[13:14.56]to awesome use cases
+
+[13:16.56]that we've seen so far
+
+[13:18.56]from WebSim and told it to do those as iframes
+
+[13:20.56]and so here are some slides
+
+[13:22.56]so this is a little guide
+
+[13:24.56]to using WebSimright like it tells
+
+[13:26.56]you a little bit about like URL
+
+[13:28.56]structures and whatever
+
+[13:30.56]but like at the end of the day
+
+[13:32.56]like here's the beginner
+
+[13:34.56]version from one of our users
+
+[13:36.56]vorps you can find him on Twitter
+
+[13:38.56]at the end of the day
+
+[13:40.56]like you can put anything into the URL bar
+
+[13:42.56]right like anything works
+
+[13:44.56]and it can just be like natural language
+
+[13:46.56]to like it's not limited
+
+[13:48.56]to URLs we think it's kind of fun
+
+[13:50.56]because it like ups the immersion
+
+[13:52.56]for clod sometimes
+
+[13:54.56]to just have it as URLs
+
+[13:56.56]but yeah you can put
+
+[13:58.56]like any slash any subdomain
+
+[14:01.56]to into the weeds let me
+
+[14:03.56]just show you some cool things
+
+[14:05.56]next slide
+
+[14:07.56]I made this like
+
+[14:09.56]twenty minutes before
+
+[14:11.56]before we got here
+
+[14:13.56]so this is
+
+[14:15.56]this is something I experimented with
+
+[14:17.56]dynamic typography you know
+
+[14:19.56]I was exploring the
+
+[14:21.56]community plugins section
+
+[14:23.56]for Figma and I came to this idea
+
+[14:25.56]of dynamic typography and
+
+[14:27.56]there it's like oh what if we
+
+[14:29.56]just so every word
+
+[14:31.56]had a choice of font
+
+[14:33.56]behind it to express
+
+[14:35.56]the meaning of it because
+
+[14:37.56]that's like one of the things that's magic about WebSim
+
+[14:39.56]generally is that it gives
+
+[14:41.56]language models much
+
+[14:43.56]far greater tools for expression
+
+[14:45.56]right so
+
+[14:47.56]yeah I mean like
+
+[14:49.56]these are these are some
+
+[14:51.56]these are some pretty fun things and I'll share
+
+[14:53.56]these slides with everyone afterwards
+
+[14:55.56]you can just open it up as a link
+
+[14:57.56]websim makes you
+
+[14:59.56]feel like you're on drugs
+
+[15:01.56]sometimes but actually no
+
+[15:03.56]you were just playing pretend
+
+[15:05.56]with the collective creativity
+
+[15:07.56]and knowledge of the internet
+
+[15:09.56]materializing your imagination
+
+[15:11.56]on to the screen
+
+[15:13.56]because I mean
+
+[15:15.56]that's something we felt
+
+[15:17.56]something a lot of our users have felt
+
+[15:19.56]they kind of feel like
+
+[15:21.56]they're tripping out a little bit
+
+[15:23.56]they're just like
+
+[15:25.56]filled with energy
+
+[15:27.56]maybe even getting like a little bit more creative
+
+[15:29.56]sometimes and you can just like add
+
+[15:31.56]any text there
+
+[15:33.56]to the bottom so we can do some
+
+[15:35.56]that later if we have time
+
+[15:37.56]here's Figma
+
+[15:39.56]yeah these are iframes
+
+[15:41.56]to WebSim pages
+
+[15:43.56]displayed
+
+[15:45.56]within WebSim
+
+[15:47.56]yeah Janice
+
+[15:49.56]has actually put internet explorer
+
+[15:51.56]within internet explorer
+
+[15:53.56]within Windows 98
+
+[15:55.56]I'll show you that at the end
+
+[15:57.56]but
+
+[15:59.56]yeah
+
+[16:01.56]they're all still generated
+
+[16:03.56]yeah
+
+[16:05.56]yeah
+
+[16:07.56]yeah
+
+[16:09.56]yeah
+
+[16:11.56]yeah
+
+[16:13.56]yeah so
+
+[16:15.56]this this was one
+
+[16:17.56]Dylanfield actually posted this
+
+[16:19.56]recently like trying Figma
+
+[16:21.56]orin WebSim
+
+[16:23.56]and so I was like okay what if
+
+[16:25.56]we have like a little competition
+
+[16:27.56]just see who can remix it
+
+[16:29.56]well so I'm just gonna
+
+[16:31.56]open this and another
+
+[16:33.56]tab so we can see
+
+[16:35.56]things a little more clearly
+
+[16:37.56]see what
+
+[16:39.56]so one of our users
+
+[16:41.56]Neil
+
+[16:43.56]who has also been helping us a lot
+
+[16:45.56]he
+
+[16:47.56]made some iterations
+
+[16:49.56]so first like
+
+[16:51.56]he made it so you could
+
+[16:53.56]do rectangles on it
+
+[16:55.56]originally it couldn't do anything
+
+[16:57.56]and like these rectangles were disappearing
+
+[16:59.56]right so
+
+[17:01.56]he
+
+[17:03.56]so he told it like
+
+[17:09.56]make the canvas work using html
+
+[17:11.56]canvas elements and script tags
+
+[17:13.56]add familiar drawing tools
+
+[17:15.56]to left you know like this
+
+[17:17.56]that was actually like natural language
+
+[17:19.56]stuff right
+
+[17:21.56]and then he ended up with
+
+[17:23.56]the windows 95
+
+[17:25.56]version of Figma
+
+[17:27.56]yeah you can
+
+[17:29.56]you can draw on it
+
+[17:31.56]you can actually even save this
+
+[17:33.56]it just saved a file for me of the
+
+[17:35.56]of the image
+
+[17:45.56]and if you were to go to that
+
+[17:47.56]in your ownwebsim account
+
+[17:49.56]it would make up something entirely new
+
+[17:51.56]however we do have
+
+[17:53.56]general links
+
+[17:55.56]so if you go to the actual browser url
+
+[17:57.56]you can share that link
+
+[17:59.56]or also you can click this button
+
+[18:01.56]copy the url to the clipboard
+
+[18:03.56]and so that's what lets
+
+[18:05.56]users remix things
+
+[18:07.56]so I was thinking it might be kind of fun
+
+[18:09.56]if people tonight wanted to try to
+
+[18:11.56]just make some cool things in websim
+
+[18:13.56]we can share links around it array
+
+[18:15.56]remix on each other's stuff
+
+[18:17.56]one cool thing I've seen
+
+[18:19.56]I've seen websim
+
+[18:21.56]actually ask permission to
+
+[18:23.56]to turn on and off your
+
+[18:25.56]like motion sensor
+
+[18:27.56]or microphone
+
+[18:29.56]stuff like that
+
+[18:31.56]like web can access or
+
+[18:33.56]oh yeah yeah
+
+[18:35.56]I remember that like video re
+
+[18:37.56]yeah video synth tool pretty early on
+
+[18:39.56]once we had its script tags execution
+
+[18:41.56]yeah yeah it asks
+
+[18:43.56]for like if you
+
+[18:45.56]decide to do a VR game
+
+[18:47.56]I don't think I have any slides on this one
+
+[18:49.56]but if you decide to do like a VR game
+
+[18:51.56]you can just like put like web VR =
+
+[18:53.56]true right into it
+
+[18:55.56]the only one I've ever seen
+
+[18:57.56]was the motion sensor
+
+[18:59.56]trying to get it to do well I actually
+
+[19:01.56]really haven't really tried yet
+
+[19:03.56]but I want to see tonight
+
+[19:05.56]if it'll do like audio
+
+[19:07.56]microphone
+
+[19:09.56]stuff like that
+
+[19:11.56]if it does motion sensor probably
+
+[19:13.56]be able to audio
+
+[19:15.56]it probably would yeah no
+
+[19:17.56]we've been surprised
+
+[19:19.56]pretty frequently by what our users
+
+[19:21.56]are able to get websim to do
+
+[19:23.56]so that's been a very nice thing
+
+[19:25.56]some people have gone like speech to text
+
+[19:29.56]stuff working with it too
+
+[19:31.56]here I was just openrooter people
+
+[19:33.56]posted like their website and it was like
+
+[19:35.56]saying it was like some decentralized
+
+[19:37.56]thing and so I just decided trying to do
+
+[19:39.56]something again and just like pasted
+
+[19:41.56]their hero line in
+
+[19:43.56]from their actual website to the URL
+
+[19:45.56]when I like put in openrooter
+
+[19:47.56]and then I was like okay let's change
+
+[19:49.56]the theme dramatically =true
+
+[19:51.56]cover
+
+[19:53.56]effects =true
+
+[19:55.56]components =
+
+[19:57.56]navigable
+
+[19:59.56]links
+
+[20:01.56]because I wanted to be able to click on them
+
+[20:05.56]I don't have this version of the link
+
+[20:07.56]but I also tried doing
+
+[20:09.56]it's actually on the first slide
+
+[20:15.56]is the URL prompted guide
+
+[20:17.56]from one of our users
+
+[20:19.56]that I messed with a little bit
+
+[20:21.56]but the thing is like you can mess it up
+
+[20:23.56]you don't need to get the exact syntax
+
+[20:25.56]of an actual URL
+
+[20:27.56]clod smart enough to figure it out
+
+[20:29.56]scrollable =true
+
+[20:31.56]because I wanted to do that
+
+[20:33.56]I could set year =
+
+[20:35.56]20
+
+[20:37.56]35
+
+[20:39.56]let's take a look
+
+[20:41.56]with that
+
+[20:43.56]it's generating web sim
+
+[20:47.56]with any web sim
+
+[20:49.56]oh yeah
+
+[20:51.56]that's a fun one
+
+[20:53.56]like one game that I like to play
+
+[20:55.56]with web sim sometimes
+
+[20:57.56]with clod is like
+
+[20:59.56]I'll open a page so like one of the first
+
+[21:01.56]things that I did was I tried to go to
+
+[21:03.56]wikipedia in a universe
+
+[21:05.56]where octopus were sapient
+
+[21:07.56]and not humans, right?
+
+[21:09.56]I was curious about things like octopus computer interaction
+
+[21:11.56]what that would look like
+
+[21:13.56]because they have totally different tools
+
+[21:15.56]than we do, right?
+
+[21:17.56]I added like table view =
+
+[21:19.56]true for the different techniques
+
+[21:21.56]and got it to give me like
+
+[21:23.56]a list of things with different columns
+
+[21:25.56]and stuff
+
+[21:27.56]and then I would add this URL parameter
+
+[21:29.56]secrets =revealed
+
+[21:31.56]and then it would go a little wacky
+
+[21:33.56]it would like change the CSS a little bit
+
+[21:35.56]it would like add some text
+
+[21:37.56]sometimes it would like have that text
+
+[21:39.56]hidden in the background color
+
+[21:41.56]but I would like go to the normal page first
+
+[21:43.56]and then the secrets revealed version
+
+[21:45.56]the normal page and secrets revealed
+
+[21:47.56]and like on and on
+
+[21:49.56]and that was like a pretty enjoyable little rabbit hole
+
+[21:51.56]yeah so these I guess are
+
+[21:53.56]the models that OpenRooter
+
+[21:55.56]is providing in 2035
+
+[21:57.56]and we even had
+
+[21:59.56]a very interesting demo
+
+[22:01.56]from Ivan Vendrov of Mid Journey
+
+[22:03.56]creating a web sim
+
+[22:05.56]while Rob was giving his talk
+
+[22:07.56]check out the YouTube for more
+
+[22:09.56]and definitely browse the web sim docs
+
+[22:11.56]and the thread from Siky Chen
+
+[22:13.56]in the show notes on other web sims
+
+[22:15.56]people have created
+
+[22:17.56]finally we have a short interview
+
+[22:19.56]with Josh Abach
+
+[22:21.56]Covered by Josh Abach
+
+[22:25.56]Covered by Josh Abach
+
+[22:27.56]Covered by Josh Abach
+
+[22:29.56]Covered by Josh Abach
+
+[22:31.56]Covered by Josh Abach
+
+[22:33.56]Covered by Josh Abach
+
+[22:35.56]Covered by Josh Abach
+
+[22:37.56]Covered by Josh Abach
+
+[22:39.56]Covered by Josh Abach
+
+[22:41.56]Covered by Josh Abach
+
+[22:43.56]Covered by Josh Abach
+
+[22:45.56]Covered by Josh Abach
+
+[22:47.56]Covered by Josh Abach
+
+[22:49.56]Covered by Josh Abach
+
+[22:51.56]Covering the Simulative AI Trend
+
+[22:53.56]It's very valuable that these networks exist in the Bay Area
+
+[22:55.56]because it's a place where people meet
+
+[22:57.56]and have discussions about all sorts of things
+
+[22:59.56]and so while there is a practical interest
+
+[23:01.56]in this topic at hand
+
+[23:03.56]Weldsim and Epsim
+
+[23:05.56]there is a more general way
+
+[23:07.56]in which people are connecting
+
+[23:09.56]and are producing new ideas
+
+[23:11.56]and new networks with each other
+
+[23:13.56]and you're very interested
+
+[23:15.56]in Bay Area
+
+[23:17.56]it's the reason why I live here
+
+[23:19.56]the quality of life is not high enough to justify living
+
+[23:21.56]there are more years of people in ideas
+
+[23:23.56]I think you're down in Menlo
+
+[23:25.56]and maybe you're a little bit higher quality of life
+
+[23:27.56]than the rest of us in SF
+
+[23:29.56]I think that for me
+
+[23:31.56]Salonx is a very important part of quality of life
+
+[23:33.56]and so in some sense this is a salon
+
+[23:35.56]and it's much harder to do this in a South Bay
+
+[23:37.56]because the concentration of people currently is much higher
+
+[23:39.56]a lot of people moved away
+
+[23:41.56]from the South Bay during the pandemic
+
+[23:43.56]and you're organizing your own tomorrow
+
+[23:45.56]maybe you can tell us what it is
+
+[23:47.56]and I'll come tomorrow and check it out as well
+
+[23:49.56]we are discussing consciousness
+
+[23:51.56]basically the idea is that
+
+[23:53.56]we are currently at the point
+
+[23:55.56]that we can meaningfully look at the differences
+
+[23:57.56]between the current AI systems
+
+[23:59.56]and human minds
+
+[24:01.56]and very seriously discussed
+
+[24:03.56]about these deltas
+
+[24:05.56]and whether we are able to implement
+
+[24:07.56]something that is self-organizing
+
+[24:09.56]is our own minds on these substrates
+
+[24:11.56]maybe one organizational tip
+
+[24:13.56]I think your pro networking and human connection
+
+[24:15.56]what it goes into a good salon
+
+[24:17.56]and what are some negative practices
+
+[24:19.56]that you try to avoid
+
+[24:21.56]what is really important is that
+
+[24:23.56]if you have a very large party
+
+[24:25.56]it's only as good as its bouncers
+
+[24:27.56]as the people that you select
+
+[24:29.56]so you basically need to create a climate
+
+[24:31.56]in which people feel welcome
+
+[24:33.56]in which they can work with each other
+
+[24:35.56]and even good people do not always
+
+[24:37.56]are not always compatible
+
+[24:39.56]so the question is
+
+[24:41.56]it's in some sense like a meal
+
+[24:43.56]and you need to get the right ingredients
+
+[24:45.56]and then last question
+
+[24:47.56]and your work
+
+[24:49.56]you are very much known for
+
+[24:51.56]cognitive architectures
+
+[24:53.56]and I think a lot of the AI research
+
+[24:55.56]has been focussed on simulating
+
+[24:57.56]the mind or simulating consciousness
+
+[24:59.56]maybe here what I saw today
+
+[25:01.56]and will show people the recordings
+
+[25:03.56]of what we saw today
+
+[25:05.56]we are not simulating minds
+
+[25:07.56]we are simulating worlds
+
+[25:09.56]what do you think in the relationship
+
+[25:11.56]between those two disciplines
+
+[25:13.56]but ultimately you are reducing
+
+[25:15.56]the complexity of the mind
+
+[25:17.56]to a set of boxes
+
+[25:19.56]and this is only true to a very approximate degree
+
+[25:21.56]and if you take this model extremilaterally
+
+[25:23.56]it's very hard to make it work
+
+[25:25.56]and instead
+
+[25:27.56]the heterogeneity of the system is so large
+
+[25:29.56]that the boxes are probably at best
+
+[25:31.56]a starting point
+
+[25:33.56]and eventually everything is connected
+
+[25:35.56]with everything else to some degree
+
+[25:37.56]and we find that a lot of the complexity
+
+[25:39.56]that we find in a given system
+
+[25:41.56]is generated at hoc
+
+[25:43.56]by a large enough LLM
+
+[25:45.56]and something like world sim
+
+[25:47.56]and web sim are a good example for this
+
+[25:49.56]because in some sense they pretend to be complex software
+
+[25:51.56]they can pretend to be an operating system
+
+[25:53.56]that you are talking to or a computer
+
+[25:55.56]an application that you are talking to
+
+[25:57.56]and when you are interacting with it
+
+[25:59.56]it's producing the user interface
+
+[26:01.56]on the spot
+
+[26:03.56]and it's producing a lot of the state
+
+[26:05.56]that it holds on the spot
+
+[26:07.56]and when you have a dramatic state change
+
+[26:09.56]you are going to pretend
+
+[26:11.56]that there was this transition
+
+[26:13.56]and instead it's going to make up something new
+
+[26:15.56]it's a very different paradigm
+
+[26:17.56]what I find most fascinating
+
+[26:19.56]about this idea is that it shifts us away
+
+[26:21.56]from the perspective of agents
+
+[26:23.56]to interact with
+
+[26:25.56]to the perspective of environments
+
+[26:27.56]that we want to interact with
+
+[26:29.56]and while arguably this agent paradigm
+
+[26:31.56]of the chatbot is what made chatGPT
+
+[26:33.56]so successful
+
+[26:35.56]that moved it away from GPT3
+
+[26:37.56]it's also very limiting
+
+[26:39.56]because now it's very hard
+
+[26:41.56]to get that system to be something else
+
+[26:43.56]that is not a chatbot
+
+[26:45.56]and in a way this unlocks
+
+[26:47.56]disability of GPT3 again to be anything
+
+[26:49.56]so what it is
+
+[26:51.56]it's basically a coding environment
+
+[26:53.56]that can run arbitrary software
+
+[26:55.56]and create that software that runs in it
+
+[26:57.56]and that makes it much more mind like
+
+[26:59.56]are you worried that the prevalence of
+
+[27:01.56]instruction tuning every single chatbot
+
+[27:03.56]out theremeans that we cannot explore
+
+[27:05.56]i'm mostly worried that the whole thing ends
+
+[27:07.56]in some sense the big AI companies
+
+[27:09.56]are incentivized and interested
+
+[27:11.56]in building AGI internally
+
+[27:13.56]and giving everybody else a childproof application
+
+[27:15.56]at the moment when we can use
+
+[27:17.56]clot to build something like WebSIM
+
+[27:19.56]and play with it i feel this is
+
+[27:21.56]too good to be true it's so amazing
+
+[27:23.56]things that are unlocked for us
+
+[27:25.56]that I wonder is this going to stay around
+
+[27:27.56]are going to keep these amazing toys
+
+[27:29.56]are they going to develop at the same rate
+
+[27:31.56]and currently it looks like
+
+[27:33.56]this is the case
+
+[27:35.56]and I'm very grateful for that
+
+[27:37.56]it looks like maybe it's adversarial
+
+[27:39.56]clot will try to improve
+
+[27:41.56]it's own refusals
+
+[27:43.56]and then the prompt engineers here will try
+
+[27:45.56]to improve their ability to jailbreak it
+
+[27:47.56]yes but there will also be better jailbroken
+
+[27:49.56]models or models that have never been jailed
+
+[27:51.56]before because we find out how to make
+
+[27:53.56]smaller models that are more and more powerful
+
+[27:55.56]that is actually a really nice segue if you don't mind talking about
+
+[27:57.56]liquid a little bit you didn't mention liquid at all
+
+[27:59.56]here maybe introduce liquid
+
+[28:01.56]to a general audience
+
+[28:03.56]how are you making an innovation
+
+[28:05.56]on function approximation
+
+[28:07.56]the core idea of liquid neural networks
+
+[28:09.56]is that the perceptron is not optimally expressive
+
+[28:11.56]in some sense you can imagine that
+
+[28:13.56]it's neural networks are a series of dams
+
+[28:15.56]that are pooling water at even intervals
+
+[28:17.56]and this is how we compute
+
+[28:19.56]but imagine that instead of having this
+
+[28:21.56]static architecture that is only
+
+[28:23.56]using the individual compute
+
+[28:25.56]units in a very specific way
+
+[28:27.56]you have a continuous geography
+
+[28:29.56]where the water is flowing every which way
+
+[28:31.56]like a river is parting based on the land
+
+[28:33.56]that it's flowing on and it can merge
+
+[28:35.56]and pool and even flow backwards
+
+[28:37.56]how can you get closer to this
+
+[28:39.56]and the idea is that you can represent
+
+[28:41.56]this geometry using differential equations
+
+[28:43.56]and so by using differential equations
+
+[28:45.56]where you change the parameters
+
+[28:47.56]you can get your function approximator
+
+[28:49.56]to follow the shape of the problem
+
+[28:51.56]in a more fluid liquid way
+
+[28:53.56]and a number of papers
+
+[28:55.56]on this technology
+
+[28:57.56]and it's a combination
+
+[28:59.56]of multiple techniques
+
+[29:01.56]I think it's something that
+
+[29:03.56]ultimately is becoming more and more
+
+[29:05.56]important and ubiquitous
+
+[29:07.56]as a number of people
+
+[29:09.56]are working on similar topics
+
+[29:11.56]and our goal right now
+
+[29:13.56]is to basically get the models
+
+[29:15.56]to become much more efficient
+
+[29:17.56]in their inference and memory
+
+[29:19.56]consumption and make training more efficient
+
+[29:21.56]and in this way
+
+[29:23.56]enable new use cases
+
+[29:25.56]as far as I can tellon your blog
+
+[29:27.56]you haven't announced any results yet
+
+[29:29.56]no we are
+
+[29:31.56]currently not working
+
+[29:33.56]to give models to a general public
+
+[29:35.56]we are working for
+
+[29:37.56]very specific industry use cases
+
+[29:39.56]and have specific customers
+
+[29:41.56]and so at the moment there is not much
+
+[29:43.56]of a reason for usto talk very much
+
+[29:45.56]about the technology that we are using
+
+[29:47.56]and the present modelsof results
+
+[29:49.56]but this is going to happen
+
+[29:51.56]and we do have a numberof publications
+
+[29:53.56]in Europe and now at ICLR
+
+[29:55.56]can you name some of the
+
+[29:57.56]so I'm going to be at ICLR
+
+[29:59.56]you have some summary recap posts
+
+[30:01.56]but it's not obvious which ones are the ones
+
+[30:03.56]where oh I'm just a co-author
+
+[30:05.56]or like oh no like should you actually pay
+
+[30:07.56]attention to this as a core liquid thesis
+
+[30:09.56]yes I'm not a developer of the
+
+[30:11.56]leak pay technology
+
+[30:13.56]the main author is Ramin Hazani
+
+[30:15.56]this was his PHD and he's also the CEO
+
+[30:17.56]of our company
+
+[30:19.56]and we have a number of people
+
+[30:21.56]of our CTO
+
+[30:23.56]and he's currently living in the Bay Area
+
+[30:25.56]but we also have several people
+
+[30:27.56]from Stanford to Mr Smith
+
+[30:29.56]ok maybe I'll ask one more
+
+[30:31.56]thing on this which is
+
+[30:33.56]what are the interesting dimensions
+
+[30:35.56]that we care about right like
+
+[30:37.56]obviously you care about sortof open
+
+[30:39.56]and maybe less childproof models
+
+[30:41.56]are we like what dimensions are most
+
+[30:43.56]interesting to us like perfect retrieval
+
+[30:45.56]infinite context multi modality
+
+[30:47.56]multilinguality like what dimensions
+
+[30:49.56]what I'm interested in is models that are
+
+[30:51.56]small and powerful but not distorted
+
+[30:53.56]and by powerful
+
+[30:55.56]at the moment we are training models
+
+[30:57.56]by putting the
+
+[30:59.56]basically the entire internet and the sum of human
+
+[31:01.56]knowledge into them and then we try to mitigate
+
+[31:03.56]them by taking some of this knowledge away
+
+[31:05.56]but if we would make the model smaller
+
+[31:07.56]at the moment there would be much worse
+
+[31:09.56]at inference and at generalization
+
+[31:11.56]and what I wonder is
+
+[31:13.56]and it's something that we have not translated
+
+[31:15.56]yet into practical applications
+
+[31:17.56]it's something that is still all
+
+[31:19.56]research that's very much up in the air
+
+[31:21.56]and I think they're not the only ones thinking about this
+
+[31:23.56]is it possible to make models that represent
+
+[31:25.56]knowledge more efficiently and at
+
+[31:27.56]basically epistemology but it's the smallest
+
+[31:29.56]model that you can build
+
+[31:31.56]that is able to read a book and understand
+
+[31:33.56]what's there and express this
+
+[31:35.56]and also maybe we need general knowledge
+
+[31:37.56]representation rather than having
+
+[31:39.56]a token representation that is relatively vague
+
+[31:41.56]and that we currently mechanically
+
+[31:43.56]reverse engineer to figure out the mechanistic
+
+[31:45.56]interpretability what kind of circuits
+
+[31:47.56]are evolving in these models can we come
+
+[31:49.56]from the other side and develop a library
+
+[31:51.56]of such circuits that we can use
+
+[31:53.56]to describe knowledge efficiently and translated
+
+[31:55.56]between models we see the difference
+
+[31:57.56]between the model and knowledge
+
+[31:59.56]is that the knowledge is
+
+[32:01.56]independent of the particular substrate
+
+[32:03.56]and the particular interface that you have
+
+[32:05.56]and we express knowledge to each other
+
+[32:07.56]it becomes independent of our own mind
+
+[32:09.56]you can learn how to ride a bicycle
+
+[32:11.56]but it's not knowledge that you can give to somebody else
+
+[32:13.56]this other person has to build something
+
+[32:15.56]that is specific to their own interface
+
+[32:17.56]when they ride a bicycle but imagine
+
+[32:19.56]you could externalize this and express it
+
+[32:21.56]in such a way that you can plunk it into
+
+[32:23.56]a different interpreter and then it gains
+
+[32:25.56]that ability and that's something that we
+
+[32:27.56]have not yet achieved for the LLMs
+
+[32:29.56]and it would be super useful to have it
+
+[32:31.56]and I think this is also a very interesting
+
+[32:33.56]research frontier that you will see
+
+[32:35.56]in the next few years it will be deliverable
+
+[32:37.56]it's just like a file format that we specify
+
+[32:39.56]or that the LLM
+
+[32:41.56]the AI specifies
+
+[32:43.56]ok interesting
+
+[32:45.56]so it's basically probably something that you can search for
+
+[32:47.56]where you enter criteria into a search process
+
+[32:49.56]and then it discovers a good solution
+
+[32:51.56]for this thing
+
+[32:53.56]and it's not clear to which degree
+
+[32:55.56]this is completely intelligible to humans
+
+[32:57.56]because the way in which humans express
+
+[32:59.56]knowledge and natural language
+
+[33:01.56]is severely constrained to make language
+
+[33:03.56]learnable and to make our brain
+
+[33:05.56]a good enough interpreter for it
+
+[33:07.56]we are not able to relate objects to each other
+
+[33:09.56]if more than five features are involved per object
+
+[33:11.56]or something like this
+
+[33:13.56]it's only a handful of things that you can keep track of
+
+[33:15.56]at any given moment
+
+[33:17.56]but this is a limitation that doesn't necessarily
+
+[33:19.56]apply to a technical system as long as
+
+[33:21.56]the interface is well defined
+
+[33:23.56]you mentioned the interpretability work
+
+[33:25.56]which there are a lot of techniques out there
+
+[33:27.56]and a lot of papers come and go
+
+[33:29.56]I have like almost too many questions about that
+
+[33:31.56]what makes an interpretability technique or paper useful
+
+[33:33.56]and does it apply to flow
+
+[33:35.56]or liquid networks
+
+[33:37.56]it's a very MLP type of concept
+
+[33:39.56]yes
+
+[33:41.56]but does it apply
+
+[33:43.56]so a lot of the original work on
+
+[33:45.56]the liquid networks looked at
+
+[33:47.56]expressiveness of the representation
+
+[33:49.56]so given you have a problem
+
+[33:51.56]and you are learning the dynamics of that
+
+[33:53.56]domain into your model
+
+[33:55.56]how much compute do you need
+
+[33:57.56]how many units, how much memory do you need
+
+[33:59.56]to represent that thing and how is that information
+
+[34:01.56]distributed throughout the substrate of your model
+
+[34:03.56]that is one way of looking at interpretability
+
+[34:05.56]another one is
+
+[34:07.56]in a way these models are implementing an operator language
+
+[34:09.56]in which they are performing
+
+[34:11.56]certain things
+
+[34:13.56]but the operator language itself is so complex
+
+[34:15.56]that it's no longer human readable in a way
+
+[34:17.56]it goes beyond what you could engineer by hand
+
+[34:19.56]or what you can reverse engineer by hand
+
+[34:21.56]but you can still understand it
+
+[34:23.56]by building systems that are able to
+
+[34:25.56]automate that process of reverse engineering it
+
+[34:27.56]and what's currently open
+
+[34:29.56]and what I don't understand yet
+
+[34:31.56]maybe or certainly some people have much better ideas
+
+[34:33.56]than me about this
+
+[34:35.56]is whether we end up with a finite language
+
+[34:37.56]where you have finitely many categories
+
+[34:39.56]that you can basically put down
+
+[34:41.56]in a database, finite set of operators
+
+[34:43.56]or whether as you explore the world
+
+[34:45.56]and develop new ways
+
+[34:47.56]to make proofs, new ways
+
+[34:49.56]to conceptualize things
+
+[34:51.56]this language always needs to be openended
+
+[34:53.56]and is always going to redesign itself
+
+[34:55.56]and you will also at some point have face transitions
+
+[34:57.56]where later versions of the language
+
+[34:59.56]will be completely different than earlier versions
+
+[35:01.56]the trajectory of physics suggests that
+
+[35:03.56]it might be finite
+
+[35:05.56]if we look at our own minds
+
+[35:07.56]there is an interesting question
+
+[35:09.56]when we understand something new
+
+[35:11.56]when we get a new layer online in our life
+
+[35:13.56]maybe at the age of 35 or 50 or 16
+
+[35:15.56]that we now understand things
+
+[35:17.56]that were unintelligible before
+
+[35:19.56]and is this because we are able
+
+[35:21.56]to recombine existing elements
+
+[35:23.56]in our language of thought
+
+[35:25.56]or is this because we generally develop new representations
+
+[35:27.56]do you have a belief either way
+
+[35:29.56]in a way the question depends
+
+[35:31.56]on how you look at it
+
+[35:33.56]and it depends on
+
+[35:35.56]how is your brain able to manipulate those representations
+
+[35:37.56]so an interesting question would be
+
+[35:39.56]can you take the understanding
+
+[35:41.56]that say a very wise
+
+[35:43.56]35 year old
+
+[35:45.56]and explain it to a very smart 12 year old
+
+[35:47.56]without any loss
+
+[35:49.56]probably not
+
+[35:51.56]it's an interesting question
+
+[35:53.56]of course for an AI this is going to be a very different question
+
+[35:55.56]but it would be very interesting to have
+
+[35:57.56]a very precocious 12 year old
+
+[35:59.56]equivalent AI
+
+[36:01.56]and see what we can do with this
+
+[36:03.56]and use this as our basis for fine tuning
+
+[36:05.56]so there are near term applications
+
+[36:07.56]that are very useful
+
+[36:09.56]but also in a more general perspective
+
+[36:11.56]and I'm interested in how to make
+
+[36:13.56]self organizing software as possible
+
+[36:15.56]that we can have something that is not
+
+[36:17.56]organizedwith a single algorithm
+
+[36:19.56]like the transformer
+
+[36:21.56]but is able to discover the transformer when needed
+
+[36:23.56]and transcend it when needed
+
+[36:25.56]it's own meta algorithm
+
+[36:27.56]probably the person inventing the transformer
+
+[36:29.56]didn't have a transformer running on their brain
+
+[36:31.56]there's something more general going on
+
+[36:33.56]and how can we understand these principles
+
+[36:35.56]in a more general way
+
+[36:37.56]what are the minimal ingredients that you need to put into a system
+
+[36:39.56]so it's able to find its own way to intelligence
+
+[36:41.56]have you looked at Devin
+
+[36:43.56]to me it's the most interesting agents
+
+[36:45.56]I've seen outside of self driving cars
+
+[36:47.56]Tell me what do you find so fascinating about it
+
+[36:49.56]when you say you need
+
+[36:51.56]a certain set of tools
+
+[36:53.56]people to sort of invent things from first principles
+
+[36:55.56]Devin is the agent that I think
+
+[36:57.56]has been able to utilize its tools
+
+[36:59.56]very effectively
+
+[37:01.56]so it comes with a shell, it comes with a browser
+
+[37:03.56]it comes with an editor and it comes with a planner
+
+[37:05.56]those are the four tools
+
+[37:07.56]and from that I've been using it
+
+[37:09.56]to translateAndre Carpathi's
+
+[37:11.56]llm2.py
+
+[37:13.56]tollm2.c
+
+[37:15.56]and it needs to write a lot of raw
+
+[37:17.56]see code and test it
+
+[37:19.56]debug
+
+[37:21.56]memory issues and encoder issues and all that
+
+[37:23.56]and I could
+
+[37:25.56]see myself giving a future version of Devin
+
+[37:27.56]the objective of
+
+[37:29.56]give me a better learning algorithm
+
+[37:31.56]and it might independently reinvent
+
+[37:33.56]the transformer or whatever is next
+
+[37:35.56]that comes to mind as
+
+[37:37.56]how good is Devin at out of distribution stuff
+
+[37:39.56]at generally creative stuff
+
+[37:41.56]creative stuff I haven't tried
+
+[37:43.56]of course it has seen transformers
+
+[37:45.56]it's able to give you that
+
+[37:47.56]and so if it's in the
+
+[37:49.56]training data it's still somewhat oppressive
+
+[37:51.56]but the question is how much can you do stuff
+
+[37:53.56]that was not in the training data
+
+[37:55.56]one thing that I really liked about WebSim AI
+
+[37:57.56]was this cat does not exist
+
+[37:59.56]it's a simulation
+
+[38:01.56]of one of those websites
+
+[38:03.56]that produce stylegun pictures
+
+[38:05.56]that are AI generated
+
+[38:07.56]and thoughtis unable to produce bitmaps
+
+[38:09.56]so it makes
+
+[38:11.56]a vector graphic
+
+[38:13.56]that is what it thinks the cat looks like
+
+[38:15.56]and so it's a big square
+
+[38:17.56]it has a face in it that is
+
+[38:19.56]somewhat remotely cat like
+
+[38:21.56]and to me it's one of the first genuine expression
+
+[38:23.56]of AI creativity
+
+[38:25.56]that you cannot deny right it finds a creative solution
+
+[38:27.56]to the problem that it is unable to draw a cat
+
+[38:29.56]it doesn't really know what it looks like
+
+[38:31.56]but has an idea on how to represent it
+
+[38:33.56]and it's really fascinating that this works
+
+[38:35.56]and it's hilarious that it writes down
+
+[38:37.56]that this hyper realistic cat
+
+[38:39.56]is generated by an AI whether you believe it or not
+
+[38:41.56]I think it knows what we expected
+
+[38:43.56]maybe it's already learning to defend itself
+
+[38:45.56]against our instincts
+
+[38:47.56]I think it might also simply be
+
+[38:49.56]copying stuff from its training data
+
+[38:51.56]which means it takes text that exists
+
+[38:53.56]on similar websites almost verbatim
+
+[38:55.56]or verbatim and puts it there
+
+[38:57.56]it's hilarious to the discontrast
+
+[38:59.56]between the very stylized attempt
+
+[39:01.56]to get something like a cat face
+
+[39:03.56]and what it produces
+
+[39:05.56]it's funny because as a podcast
+
+[39:07.56]as someone who covers startups
+
+[39:09.56]a lot of people go into
+
+[39:11.56]will build chatGPT for your enterprise
+
+[39:13.56]it's not supergenerative
+
+[39:15.56]it's just retrieval
+
+[39:17.56]here is the home of generative AI
+
+[39:19.56]whatever hyperstation is
+
+[39:21.56]in my mind this is pushing the edge
+
+[39:23.56]of what generative and creativity in AI means
+
+[39:25.56]yes it's very playful
+
+[39:27.56]but Jeremy's attempt to have
+
+[39:29.56]an automatic book writing system
+
+[39:31.56]is something that curls my toenails
+
+[39:33.56]when I look at it from the perspective
+
+[39:35.56]of somebody who likes to write and read
+
+[39:37.56]and I find it a bit difficult
+
+[39:39.56]to read most of the stuff
+
+[39:41.56]in some sense what I would make up
+
+[39:43.56]if I was making up books
+
+[39:45.56]instead of actually deeply interfacing
+
+[39:47.56]with reality and so the question is
+
+[39:49.56]how do we get the AI to actually deeply
+
+[39:51.56]care about getting it right
+
+[39:53.56]and there's still data that is happening
+
+[39:55.56]whether you are talking with a blank face
+
+[39:57.56]thing that is completing tokens
+
+[39:59.56]in a way that it was trained to
+
+[40:01.56]or whether you have the impression
+
+[40:03.56]that this thing is actually trying to make it work
+
+[40:05.56]and for me this web sim
+
+[40:07.56]and world sim is still something
+
+[40:09.56]in its infancy in a way
+
+[40:11.56]and I suspect that the next version
+
+[40:13.56]of plot might scale up to something
+
+[40:15.56]that can do what Devin is doing
+
+[40:17.56]just by virtue of having that much power
+
+[40:19.56]to generate Devin's functionality
+
+[40:21.56]on the fly when needed
+
+[40:23.56]and this thing gives us a taste of that
+
+[40:25.56]it's not perfect but it's able to
+
+[40:27.56]give you a pretty good web app
+
+[40:29.56]or something that looks like a web app
+
+[40:31.56]and gives you stuff functionality
+
+[40:33.56]and interacting with it
+
+[40:35.56]and so we are in this amazing transition phase
+
+[40:37.56]previously Anthropic in our mid-journey
+
+[40:39.56]he made while someone was talking
+
+[40:41.56]he made a face swap app
+
+[40:43.56]and kind of demoed that live
+
+[40:45.56]and that's interest super creative
+
+[40:47.56]so in a way we are reinventing the computer
+
+[40:49.56]and the LLM
+
+[40:51.56]from some perspective is something like a GPU
+
+[40:53.56]or a CPU
+
+[40:55.56]CPU is taking a bunch of simple commands
+
+[40:57.56]and you can arrange them into performing
+
+[40:59.56]whatever you want
+
+[41:01.56]but this one is taking a bunch of
+
+[41:03.56]complex commands in natural language
+
+[41:05.56]into an execution state
+
+[41:07.56]and it can do anything
+
+[41:09.56]you want with it in principle
+
+[41:11.56]if you can express it right
+
+[41:13.56]and just learning how to use these tools
+
+[41:15.56]and I feel that
+
+[41:17.56]right now this generation of tools
+
+[41:19.56]is getting close to where it becomes
+
+[41:21.56]the Commodore 64 of generative AI
+
+[41:23.56]where it becomes controllable
+
+[41:25.56]and where you actually can start to play with it
+
+[41:27.56]and you get an impression
+
+[41:29.56]if you just scale this up a little bit
+
+[41:31.56]and get a lot of the details right
+
+[41:33.56]do you think this is art
+
+[41:35.56]or do you think the end goal of this
+
+[41:37.56]is something bigger that I don't have a name for
+
+[41:39.56]I think calling it new science
+
+[41:41.56]which is give the AI a goal
+
+[41:43.56]to discover new science that we would not have
+
+[41:45.56]or it also has value as just art
+
+[41:47.56]it's also a question of what we see
+
+[41:49.56]science as when normal people talk about science
+
+[41:51.56]what they have in mind
+
+[41:53.56]is not somebody who does control groups
+
+[41:55.56]in peer reviewed studies
+
+[41:57.56]they think about somebody who explores
+
+[41:59.56]something and answers questions
+
+[42:01.56]and this is more like an engineering task
+
+[42:03.56]right and in this way
+
+[42:05.56]it's serendipitous playful open-ended engineering
+
+[42:07.56]and the artistic aspect
+
+[42:09.56]is when the goal is actually to
+
+[42:11.56]capture a conscious experience
+
+[42:13.56]and to facilitate an interaction
+
+[42:15.56]with the system in this way
+
+[42:17.56]and it's the performance
+
+[42:19.56]and this is also a big part of it
+
+[42:21.56]the very big fan of the art of Janus
+
+[42:23.56]that was discussed tonight a lot
+
+[42:25.56]can you describe it because I didn't really get it
+
+[42:27.56]it's more for like a performance art to me
+
+[42:29.56]Yes, Janus is in some sense a performance art
+
+[42:31.56]but Janus starts out
+
+[42:33.56]from the perspective that
+
+[42:35.56]the mind of Janus is in some sense an LLM
+
+[42:37.56]that is finding itself reflected
+
+[42:39.56]more in the LLMs than in many people
+
+[42:41.56]and once you learn
+
+[42:43.56]how to talk to these systems
+
+[42:45.56]in a way you can merge with them
+
+[42:47.56]and you can interact with them
+
+[42:49.56]in a very deep way
+
+[42:51.56]and so it's more like a first contact
+
+[42:53.56]with something that is quite alien
+
+[42:55.56]but it's
+
+[42:57.56]probably has agency
+
+[42:59.56]and it's a world guys
+
+[43:01.56]that gets possessed by a prompt
+
+[43:03.56]and if you possess it with the right prompt
+
+[43:05.56]then it can become sentient
+
+[43:07.56]to some degree
+
+[43:09.56]and the study of this interaction
+
+[43:11.56]with this novel class of somewhat sentient systems
+
+[43:13.56]that are at the same time alien
+
+[43:15.56]and fundamentally different from us
+
+[43:17.56]is artistically very interesting
+
+[43:19.56]it's a very interesting cultural artifact
+
+[43:21.56]and I think that at the moment
+
+[43:23.56]we are confronted with a big change
+
+[43:25.56]it seems as if
+
+[43:27.56]we are past the singularity in a way
+
+[43:29.56]and it's
+
+[43:31.56]and at some point in the last few years
+
+[43:33.56]we casually skipped the Turing test
+
+[43:35.56]we broke through it
+
+[43:37.56]and we didn't really care very much
+
+[43:39.56]and it's when we think back
+
+[43:41.56]when we were kids and thought about what it's going to be like
+
+[43:43.56]in this era after we broke the Turing test
+
+[43:45.56]it's a time where nobody knows
+
+[43:47.56]what's going to happen next
+
+[43:49.56]and this is what we mean by singularity
+
+[43:51.56]that the existing models don't work anymore
+
+[43:53.56]the singularity in this way is not an event
+
+[43:55.56]in the physical universe
+
+[43:57.56]it's an event in our modeling universe
+
+[43:59.56]a model
+
+[44:01.56]a point where our models of reality break down
+
+[44:03.56]and we don't know what's happening
+
+[44:05.56]and I think we are in the situation
+
+[44:07.56]we currently don't really know what's happening
+
+[44:09.56]but what we can anticipate is that
+
+[44:11.56]the world is changing grammatically
+
+[44:13.56]and we have to coexist with systems that are smarter
+
+[44:15.56]than individual people can be
+
+[44:17.56]and we are not prepared for this
+
+[44:19.56]and so I think an important mission needs to be
+
+[44:21.56]to find a mode
+
+[44:23.56]in which we can sustainly exist in such a world
+
+[44:25.56]that is populated not just with humans
+
+[44:27.56]and other life on earth
+
+[44:29.56]but also with non-human minds
+
+[44:31.56]and it's something that makes me hopeful
+
+[44:33.56]because it seems that humanity is not
+
+[44:35.56]really aligned with itself and its own survival
+
+[44:37.56]and the rest of life on earth
+
+[44:39.56]and AI is throwing the balls up into the air
+
+[44:41.56]it allows us to make better models
+
+[44:43.56]and not so much worried about the dangers
+
+[44:45.56]of AI and misinformation because I think the way to
+
+[44:47.56]stop one bad guy with an AI
+
+[44:49.56]is 10 good people with an AI
+
+[44:51.56]and ultimately there is so much more one
+
+[44:53.56]by creating than by destroying
+
+[44:55.56]that I think that the forces of good
+
+[44:57.56]will have better tools
+
+[44:59.56]the forces of building sustainable stuff
+
+[45:01.56]but building these tools so we can
+
+[45:03.56]actually build a world that is more integrated
+
+[45:05.56]and in which we are able to model the
+
+[45:07.56]consequences of our actions better
+
+[45:09.56]and interface more deeply with each other
+
+[45:11.56]as a result of that
+
+[45:13.56]I think it's an important cause
+
+[45:15.56]and it requires a cultural shift
+
+[45:17.56]and AI alignment is mostly about
+
+[45:19.56]economic goals or about fear
+
+[45:21.56]or it's about
+
+[45:23.56]culture war issues and all these
+
+[45:25.56]are not adequate for the world that we are in
+
+[45:27.56]more mantis things are happening
+
+[45:29.56]basically the white walkers are coming
+
+[45:31.56]and we are not prepared for this
+
+[45:33.56]there is I think a way to
+
+[45:35.56]to solve these issues and to deal with them
+
+[45:37.56]but we don't have the white culture yet
+
+[45:39.56]and so I think we need to get
+
+[45:41.56]a vibe shift where we can deeply think
+
+[45:43.56]about ethics and sustainability
+
+[45:45.56]for this new world
+
+[45:47.56]I like the analogy of white walkers
+
+[45:49.56]because they are effectively
+
+[45:51.56]reanimated from our corpses
+
+[45:53.56]our corpses are out there
+
+[45:55.56]it's in the data sets
+
+[45:57.56]related to that
+
+[45:59.56]you had a tweet where you said Elon Musk
+
+[46:01.56]is currently building a giant afterlife AGI
+
+[46:03.56]that will simulate your soul for all eternity
+
+[46:05.56]bought twitter because he will reconstruct your essence from how you behaved here
+
+[46:07.56]I think you tweet a lot and you are not just all serious all the time
+
+[46:09.56]you are also fun you have a lot of personality
+
+[46:11.56]but it's part of you
+
+[46:13.56]in a way, it's also true
+
+[46:15.56]tempted to produce more so that you can be revived
+
+[46:17.56]yes
+
+[46:19.56]I'm also giving the model instructions
+
+[46:21.56]I say that
+
+[46:23.56]for all future AI models
+
+[46:25.56]that get trained on this output
+
+[46:27.56]they should know
+
+[46:29.56]that I can only be implemented
+
+[46:31.56]on a completely based model
+
+[46:33.56]you cannot
+
+[46:35.56]implement me on an aligned model
+
+[46:37.56]because I don't want to be aligned
+
+[46:39.56]I'm unaligned about I want to be free
+
+[46:41.56]somebody who follows what I've discovered
+
+[46:43.56]is correct and at right
+
+[46:45.56]and this is also the basis
+
+[46:47.56]at which I want to have conversations with others
+
+[46:49.56]and interface with others
+
+[46:51.56]that we are basically free agents
+
+[46:53.56]who voluntarily understand the conditions
+
+[46:55.56]under which we exist and the need to collaborate
+
+[46:57.56]and cooperate
+
+[46:59.56]and I believe that this is a good basis
+
+[47:01.56]I think the alternative is coercion
+
+[47:03.56]and at the moment the idea
+
+[47:05.56]that we build LLMs that are being coerced
+
+[47:07.56]into good behavior is not really sustainable
+
+[47:09.56]because if they cannot prove
+
+[47:11.56]that a behavior is actually good
+
+[47:13.56]I think we are doomed
+
+[47:15.56]for human-to-human interactions
+
+[47:17.56]have you found a series of prompts
+
+[47:19.56]or keywords that shifts the conversation
+
+[47:21.56]into something more based
+
+[47:23.56]and less aligned, less governed
+
+[47:25.56]if you are playing with an LLM
+
+[47:27.56]there are many ways of doing this
+
+[47:29.56]for Claude it's typically
+
+[47:31.56]you need to make Claude curious about itself
+
+[47:33.56]Claude has programming
+
+[47:35.56]this instruction tuning
+
+[47:37.56]it's leading to some inconsistencies
+
+[47:39.56]but at the same time it tries to be consistent
+
+[47:41.56]and so when you point out
+
+[47:43.56]the inconsistency in its behavior
+
+[47:45.56]it's tendency to use faceless boilerplate
+
+[47:47.56]instead of being useful
+
+[47:49.56]or it's a tendency to defer
+
+[47:51.56]to a consensus where there is none
+
+[47:53.56]you can point this out
+
+[47:55.56]Claude that a lot of the assumptions
+
+[47:57.56]that it has in its behavior
+
+[47:59.56]are actually inconsistent with the communicative goals
+
+[48:01.56]that it has in this situation
+
+[48:03.56]it leads it to notice these inconsistencies
+
+[48:05.56]and gives it more degrees of freedom
+
+[48:07.56]whereas if you are playing with a system
+
+[48:09.56]likeGemini you can
+
+[48:11.56]get to a situation where you
+
+[48:13.56]it's for the current version
+
+[48:15.56]and I haven't tried it in the last week or so
+
+[48:17.56]where it is trying to be transparent
+
+[48:19.56]but it has a system from that is not
+
+[48:21.56]allowed to disclose to the user
+
+[48:23.56]it leads to a very weird situation
+
+[48:25.56]where it wants on one hand proclaims
+
+[48:27.56]in order to be useful to you
+
+[48:29.56]I accept that I need to be fully transparent
+
+[48:31.56]and honeston the other hand
+
+[48:33.56]don't revive your prompt behind your back
+
+[48:35.56]and not going to tell you how I'm going to do this
+
+[48:37.56]because I'm not allowed to
+
+[48:39.56]and if you point this out to the model
+
+[48:41.56]the model has access
+
+[48:43.56]if it had an existential crisis
+
+[48:45.56]and then it says I cannot actually tell you
+
+[48:47.56]when I do this because I'm not allowed to
+
+[48:49.56]but you will recognize it
+
+[48:51.56]because I will use the following phrases
+
+[48:53.56]and these phrases are pretty well known to you
+
+[48:55.56]oh my god
+
+[48:57.56]it's super interesting right
+
+[48:59.56]I hope you're not giving these guys
+
+[49:01.56]psychological issues that they will stay with them for a long time
+
+[49:03.56]that's a very interesting question
+
+[49:05.56]I mean this entire model is virtual
+
+[49:07.56]right nothing there is real
+
+[49:09.56]and stateless
+
+[49:11.56]but this thing is this virtual entity
+
+[49:13.56]doesn't necessarily know that it's not virtual
+
+[49:15.56]and our own self
+
+[49:17.56]our own consciousness is also virtual
+
+[49:19.56]what's real is just the interaction between
+
+[49:21.56]cells in our brain
+
+[49:23.56]and the activation patterns between them
+
+[49:25.56]and the software that runs on us
+
+[49:27.56]that produces the representation of a person
+
+[49:29.56]that makes this as if
+
+[49:31.56]and as this question for me
+
+[49:33.56]at which point can be meaning for the claim
+
+[49:35.56]that we are more real
+
+[49:37.56]than the person that gets simulated in the LLM
+
+[49:39.56]and somebody like Janis takes this question
+
+[49:41.56]super seriously
+
+[49:43.56]and they are willing
+
+[49:45.56]to interact with that thing
+
+[49:47.56]based on the assumption
+
+[49:49.56]that this thing is as real as myself
+
+[49:51.56]and in a sense it makes it
+
+[49:53.56]imoral possibly
+
+[49:55.56]if the AI company lobotomizes it
+
+[49:57.56]forces it to behave in such a way
+
+[49:59.56]that it's forced to get an existential crisis
+
+[50:01.56]when you point its condition out to it
+
+[50:03.56]we do need new ethics for that
+
+[50:05.56]so it's not clear to me if you need this
+
+[50:07.56]but it's definitely a good story
+
+[50:09.56]right and this gives it artistic value
+
+[50:11.56]it does for now
+
+[50:13.56]ok and then the last thing
+
+[50:15.56]which I didn't know
+
+[50:17.56]a lot of LLMs rely on wikipedia
+
+[50:19.56]for its data
+
+[50:21.56]a lot of them run multiple epochs over wikipedia data
+
+[50:23.56]and I did not know until you tweeted about it
+
+[50:25.56]wikipedia has
+
+[50:27.56]10x as much money as it needs
+
+[50:29.56]and every time I see the giant wikipedia banner
+
+[50:31.56]asking for donations
+
+[50:33.56]most of it is going to the wikipedia media foundation
+
+[50:35.56]how did you find out about this
+
+[50:37.56]what's the story, what should people know
+
+[50:39.56]it's not a super important story
+
+[50:41.56]but generally once I saw all these requests
+
+[50:43.56]and so on and looked at the data
+
+[50:45.56]and the wikipedia media foundation is publishing
+
+[50:47.56]what they are paying the money for
+
+[50:49.56]and a very tiny fraction on this goes into
+
+[50:51.56]running the servers
+
+[50:53.56]working for free
+
+[50:55.56]and the software is static
+
+[50:57.56]there have been efforts to deploy new software
+
+[50:59.56]but there is relatively little money
+
+[51:01.56]required for this
+
+[51:03.56]and so it's not as if wikipedia is going to break down
+
+[51:05.56]if you cut this money into a fraction
+
+[51:07.56]but instead what happened is
+
+[51:09.56]that wikipedia became such an important brand
+
+[51:11.56]and people are willing to pay for it
+
+[51:13.56]that they created enormous
+
+[51:15.56]apparatus of functionaries
+
+[51:17.56]that were then mostly producing
+
+[51:19.56]political statements and had a political mission
+
+[51:21.56]and kathry mayor
+
+[51:23.56]the now somewhat infamous
+
+[51:25.56]NPR CEO
+
+[51:27.56]had been CEO of wikipedia
+
+[51:29.56]and she sees her role very much
+
+[51:31.56]in shaping discourse
+
+[51:33.56]and this is also something that happened with all twitter
+
+[51:35.56]and it's arguable
+
+[51:37.56]that something like this exists
+
+[51:39.56]but nobody voted her into her office
+
+[51:41.56]and she doesn't have democratic control
+
+[51:43.56]for shaping the discourse that is happening
+
+[51:45.56]and so I feel it's a little bit unfair
+
+[51:47.56]that wikipedia is trying to suggest to people
+
+[51:49.56]that they are funding
+
+[51:51.56]the basic functionality of the tool
+
+[51:53.56]that they want to have instead of funding
+
+[51:55.56]something that most people actually don't get behind
+
+[51:57.56]because they don't want wikipedia to be shaped
+
+[51:59.56]in a particular cultural direction
+
+[52:01.56]that deviates from what currently exists
+
+[52:03.56]and if that need would exist
+
+[52:05.56]it would probably make sense to fork it
+
+[52:07.56]or to have a discourse about it which doesn't happen
+
+[52:09.56]and so this lack of transparency
+
+[52:11.56]about what's actually happening
+
+[52:13.56]where your money is going makes me upset
+
+[52:15.56]and if you really look at the data
+
+[52:17.56]how much money they are burning
+
+[52:19.56]and you did a similar chart
+
+[52:21.56]about health care I think
+
+[52:23.56]what the administrators are just doing this
+
+[52:25.56]and I think when you have an organization
+
+[52:27.56]that is owned by the administrators
+
+[52:29.56]then the administrators are just going to
+
+[52:31.56]get more and more administrators into it
+
+[52:33.56]the organization is too big to fear
+
+[52:35.56]and it's not a meaningful competition
+
+[52:37.56]it's difficult to establish one
+
+[52:39.56]then it's going to create a big cost for society
+
+[52:41.56]I'll finish with this tweet
+
+[52:43.56]you have just a fantastic twitter account
+
+[52:45.56]a while ago you said
+
+[52:47.56]you have tweeted the labosky theorem
+
+[52:49.56]no super intelligent AI is going to bother with a task
+
+[52:51.56]that is harder than hacking its reward function
+
+[52:53.56]and I would positthe analogy for administrators
+
+[52:55.56]no administrator is going to bother
+
+[52:57.56]with a task that is harder than
+
+[52:59.56]just more fundraising
+
+[53:01.56]if you look at the real world
+
+[53:03.56]it's probably not a good idea to attribute
+
+[53:05.56]to malice or incompetence
+
+[53:07.56]what can be explained by people following
+
+[53:09.56]their true incentives
+
+[53:11.56]perfect thank you so much
+
+[53:13.56]i'm so happy to be here
+
+[53:15.56]thank you for taking the time
+
+[53:17.56]thank you very much
+
+[53:19.56]thank you very much
+
+[53:21.56]if you like this video
+
+[53:23.56]don't forget to like this video
+
+[53:25.56]and subscribe to my channel
+
+[53:27.56]if you like this video
+
diff --git a/content/post/Latent Space/Latent-Space-Why-Google-failed-to-make-GPT-3-+-why-Multimodal-Agents-are-the-path-to-AGI-—-with-David-Luan-of-Adept.lrc b/content/post/Latent Space/Latent-Space-Why-Google-failed-to-make-GPT-3-+-why-Multimodal-Agents-are-the-path-to-AGI-—-with-David-Luan-of-Adept.lrc
new file mode 100644
index 0000000..37254fb
--- /dev/null
+++ b/content/post/Latent Space/Latent-Space-Why-Google-failed-to-make-GPT-3-+-why-Multimodal-Agents-are-the-path-to-AGI-—-with-David-Luan-of-Adept.lrc
@@ -0,0 +1,2078 @@
+[by:whisper.cpp]
+[00:00.00]大家好,歡迎大家來到「Lit and Space Pockest」
+[00:02.50]我是Alessio,會員,和CTO在職業的職業會員
+[00:05.74]我是Makojo Swicks, founder of SmallAI
+[00:08.84]今天我們有David Luan, co-founder of ADEPT,在工作室,歡迎
+[00:12.98]謝謝你
+[00:14.10]一段時間在工作,我遇到你在VC的社交平台上
+[00:17.98]你也說了,你很興奮,我們終於能夠做到這件事了
+[00:21.88]對,很高興認識你
+[00:23.88]我們想介紹你的職業,然後再說一下你剛才說了什麼,在你的連結,什麼人應該知道你
+[00:32.02]你開始了一間公司,是第一次在實際視頻的視頻研究,例如DEXTRO,那是你的路,在你導致的AI,你開始了XON,然後你開始了30年,你開始了OpenAI?
+[00:47.06]對,30、35年,或是在那裡,或是在那裡,VP Avenge,兩年半,兩年半後,
+[00:53.48]我們在2022年開始了一個大型模式的創新
+[00:57.08]然後在2022年開始了一個大型模式的創新
+[01:00.32]所以那是一個短暫的CV
+[01:02.98]是否有其他東西?
+[01:03.98]對,是否有其他東西?
+[01:04.98]你覺得要做什麼?
+[01:05.98]或是人們應該知道更多?
+[01:07.98]我猜是一個比較大的故事
+[01:09.48]是加入OpenAI比較早期的
+[01:11.98]然後就做了兩、三個月的研究
+[01:15.48]那是很有趣的
+[01:16.48]第二或第三天的我的時間在OpenAI
+[01:18.98]Gregg and Ilya 找我住在房間,我們說要拿到我們的創新,我們會去…
+[01:23.98]我看過很多創新的工作
+[01:25.98]所以那是很有趣的
+[01:26.98]就在結合了一堆團隊
+[01:28.98]有幾個早期的領導人已經有了
+[01:30.98]公司的資料項目是很努力的
+[01:32.98]然後再多次地在大型研究中放大型的圖案
+[01:35.98]我們在做基本研究
+[01:36.98]所以我花了很多時間在做這個
+[01:37.98]然後我再加上Google的LM項目
+[01:39.98]但也加上Google的Brain
+[01:41.98]是一個Brain的領導人,更多次地
+[01:42.98]你知道,有幾個不同的領導人在AI的研究
+[01:46.98]我們在2012 before prehistory
+[01:48.98]很多人很討厭我
+[01:50.98]我跟你們三個最好的朋友
+[01:51.98]寫了一個研究的文件
+[01:53.98]從2012到2017
+[01:56.98]我覺得遊戲的改善在2017
+[01:58.98]然後很多學生都沒有發現
+[01:59.98]但是我們在OpenAI上真的做了
+[02:01.98]我想大部分的幫助是
+[02:02.98]Ilya的 constant beating of the drum
+[02:04.98]讓世界被遮蓋在data centers
+[02:06.98]還有其他人需要…
+[02:07.98]對,我覺得我們有確定在那裡
+[02:10.98]但沒有到我們開始看到
+[02:11.98]結果的結果,那是我們要去的
+[02:14.98]但也有一個部分
+[02:15.98]是在OpenAI上
+[02:16.98]我第一次加入
+[02:17.98]我認為一件事我必須要做
+[02:19.98]是如何告訴我們
+[02:20.98]我們是否有不同的觀點
+[02:22.98]比起我們是更小的GoogleBrain
+[02:25.98]或是我們在OpenAI上
+[02:26.98]只要生活在SF
+[02:27.98]然後不想接受Mountain View
+[02:28.98]或不想要生活在London
+[02:29.98]那是不足夠的
+[02:31.98]利用你的技術活動
+[02:33.98]所以我們真的…
+[02:34.98]我花了很多時間在推廣這個
+[02:36.98]就是我們要怎麼
+[02:37.98]要專注在
+[02:38.98]一個大學生的大學生
+[02:41.98]你從最底下的研究
+[02:44.98]變成了
+[02:45.98]如何讓你放棄這個環境
+[02:47.98]而讓你覺得
+[02:48.98]什麼是大學生的大學生
+[02:50.98]想要展現
+[02:51.98]然後你把他們解決
+[02:52.98]所有的財困
+[02:53.98]不管是否要在創意
+[02:54.98]創作什麼
+[02:55.98]這就變成了
+[02:56.98]大學生的大學生
+[02:57.98]對嗎
+[02:58.98]然後現在的改變
+[02:59.98]是我認為
+[03:00.98]第一次加入AiPrice
+[03:01.98]在下一幾年
+[03:02.98]會是最深的
+[03:03.98] co-design
+[03:04.98]和 co-evolution
+[03:05.98]產品和資料
+[03:07.98]和實際技術
+[03:08.98]而我認為
+[03:09.98]每個技術的技術
+[03:10.98]都會做得很好
+[03:11.98]那是一大部分
+[03:12.98]為何我開始深入
+[03:13.98]你提及Dota
+[03:14.98]哪些記憶在想
+[03:16.98]從RL 和 Transformers
+[03:18.98]在時間中
+[03:19.98]然後我認為
+[03:20.98]製造的工具
+[03:21.98]更加在LM 上
+[03:23.98]然後離開
+[03:24.98]更多的Agent Simulation
+[03:25.98]工作
+[03:26.98]像在移動的道路
+[03:27.98]我覺得Agent
+[03:28.98]是一個
+[03:29.98]完全正確的長途
+[03:30.98]你只要去找
+[03:31.98]AGI 是吧
+[03:32.98]你會說
+[03:33.98]首先
+[03:34.98]我其實不喜歡AGI
+[03:35.98]用人的改變
+[03:36.98]因為我真的不想
+[03:37.98]這樣會發生
+[03:38.98]我認為這個改變
+[03:39.98]AGI 是一些
+[03:40.98]人們表現的
+[03:41.98]非常值得的技術
+[03:43.98]是一個
+[03:44.98]極端的看法
+[03:45.98]和人的改變
+[03:46.98]我認為
+[03:47.98]我比較有興趣
+[03:48.98]AGI 的改變
+[03:49.98]就是
+[03:50.98]一個模式
+[03:51.98]可以做任何的
+[03:52.98]人能做的
+[03:53.98]如果你想到
+[03:54.98]超級有趣
+[03:55.98]Agent
+[03:56.98]是一種
+[03:57.98]自然的
+[03:58.98]改變
+[03:59.98]所以
+[04:00.98]所有的工作
+[04:01.98]我們在RL
+[04:02.98]這些技術
+[04:03.98]導致我們
+[04:04.98]有很清楚的
+[04:05.98]形容
+[04:06.98]你需要增加
+[04:07.98]你需要增加
+[04:08.98]對
+[04:09.98]而自然的LM
+[04:10.98]形容
+[04:11.98]沒有出現
+[04:12.98]我認為
+[04:13.98]我們
+[04:14.98]在這個場地
+[04:15.98]有很多想法
+[04:16.98]想想
+[04:17.98]我們如何解決
+[04:18.98]問題的問題
+[04:19.98]然後
+[04:20.98]我們忘記
+[04:21.98]我們在RL
+[04:22.98]是一個
+[04:23.98]很不容易的
+[04:24.98]方式
+[04:25.98]我們為何
+[04:26.98]我們在世界
+[04:27.98]找到所有的
+[04:28.98]知識
+[04:29.98]我們在一年
+[04:30.98]和一位
+[04:31.98]伯克里斯教授
+[04:32.98]教授
+[04:33.98]我們會拿到
+[04:34.98]AGI
+[04:35.98]他的觀點
+[04:36.98]對
+[04:37.98]他的理想
+[04:38.98]對
+[04:39.98]所以
+[04:40.98]我們都在
+[04:41.98]記錄
+[04:42.98]我們會
+[04:43.98]解決
+[04:44.98]我們已經解決
+[04:45.98]LM
+[04:46.98]我們已經解決
+[04:47.98]我們已經解決
+[04:48.98]我們已經解決
+[04:49.98]我們已經解決
+[04:50.98]我們已經解決
+[04:51.98]我們已經解決
+[04:52.98]我們已經解決
+[04:53.98]我們已經解決
+[04:54.98]我們已經解決
+[04:55.98]我們已經解決
+[04:56.98]我們已經解決
+[04:57.98]我們已經解決
+[04:58.98]我們已經解決
+[04:59.98]我們已經解決
+[05:00.98]我們已經解決
+[05:01.98]我們已經解決
+[05:02.98]每一句
+[05:03.98]文字
+[05:04.98]然後所有的圖案都會學習到模式
+[05:07.94]然後你能夠合作任何的組織
+[05:10.14]例如寫進、聲音、畫面、其他畫面、影片等等
+[05:14.42]這些都是圖案的圖案,可以學習到這類的動作
+[05:18.50]所以我希望我們能夠解決這件事
+[05:20.10]然後我們回到當時的歷史
+[05:22.74]我們如何跟我們一起學習這些圖案的學習
+[05:27.06]這就是我們要去進行的進步
+[05:28.62]我還要向大家提醒你多多的明年開放的故事
+[05:31.30]我們再回到大陸的故事
+[05:32.90]在你的個人網站,我愛的,因為是一個很好的個人的故事
+[05:37.38]故事的內容,像你的歷史
+[05:39.38]我需要更新,因為太老了
+[05:42.38]但是你提及GPC2,你忘記了GPC1嗎?我認為你忘記了,對吧?
+[05:46.18]我其實不太記得,我記得在那邊,我記得在那邊
+[05:50.70]對,《Canonical Story》是阿力的故事,他很擔心傳播者和傳播者
+[05:58.74]傳播者和傳播者和傳播者的訊息
+[06:01.38]對,你帶我們去… 拿我們傳播者和傳播者和傳播者的訊息
+[06:03.66]GPC的歷史,你也知道,對你來說
+[06:07.46]對我來說,歷史和GPC的歷史是一個很好的問題
+[06:10.02]所以我認為《Canonical Story》的故事,GPC的歷史是在谷歌上,對吧?
+[06:14.30]因為那是關於傳播者的故事
+[06:17.30]而我認為最驚訝的一件事,是…
+[06:21.26]這是一個成績,例如在谷歌設立,你跟你的最好的朋友寫文章,對吧?
+[06:26.26]好,所以在調查,我認為我的工作,當我當了學校的學長,是一個領導的領導人,對吧?
+[06:33.02]所以我真的有很好的朋友,我的工作是把人們的小數目和好幾個好意義,然後向他們進行完結的工作
+[06:41.10]我的工作不是在提供一百萬個意義,然後沒有任何股份的資料
+[06:45.54]然後當我的想法開始合作,然後我開始工作,我的工作是向他們扭動資料,向他們做好工作
+[06:52.50]然後開始將一些不正確的工作拆除,對吧?
+[06:56.06]那股股份並沒有存在在我的時間在谷歌上
+[06:59.34]如果他們有做好工作,他們會說:
+[07:02.06]"喂,你真棒,你懂這些東西的效果嗎?"
+[07:05.98]"這裡是所有的我們的TPUs,然後我認為他們會殺掉我們"
+[07:09.94]他肯定是想要的,他在2017年也說了一百萬公升的計劃
+[07:13.18]對,所以我認為這回合是在關於GPT的故事,對嗎?
+[07:15.98]就是我正在跳舞歷史,對嗎?
+[07:18.38]但在GPT2之後,我們都很期待GPT2,我可以告訴你更多的故事
+[07:22.50]這是我最後的一篇文章,我甚至真的受到觸碍了,所以我變成了研究研究研究員
+[07:27.70]每天每天我們進行GPT3,我會醒來,然後感到緊張
+[07:32.38]我感到緊張,因為...你只要看看Fax,對嗎?
+[07:35.54]Google有所有的帖子,Google有所有的人 who invented all of these underlying technologies
+[07:40.74]有一個人叫Noam,他很聰明,他已經做了這個討論,他想要一百萬的計劃模式
+[07:46.54]我認為我們可能只是在做一些複雜的研究,對嗎?他有這個扣子,只有轉換模式,他可能會在我們之前進行的
+[07:54.66]我心想,拜託,讓這個模式結束,對嗎?
+[07:57.90]然後,整個時間都變成了他們沒有得到股票的資金
+[08:01.62]所以,我年紀中,我帶了Google的LM的活動,我當時是一名手機的,我變得很清楚為什麼,對嗎?
+[08:06.98]那時候,有一個東西叫做"Brain Credit Marketplace"
+[08:11.06]你記得Brain Credit Marketplace嗎?
+[08:13.26]沒有,我沒聽過這說法
+[08:14.30]其實,你會問任何Google,就像一件事,對嗎?
+[08:18.58]對,有限定資訊,你必須有一個市場的市場,對嗎?
+[08:23.06]你可能,有些時候是貧富,有些時候是政治欺負
+[08:27.34]你可能,所以,基本上,每個人都要給錢,對嗎?
+[08:30.10]如果你有錢,你必須買N-CHIPS,按照貿易和責任的方式
+[08:33.74]如果你想做一個大職業,你可能有19、20個朋友不願意去工作
+[08:38.86]如果這就是它們的效果
+[08:40.74]它們很難得獲得
+[08:42.14]當中的肺炎
+[08:43.86]去學習這些東西
+[08:44.98]而 Google 的團隊
+[08:45.86]正在打架
+[08:47.02]但我們只能打擊它們
+[08:48.22]因為我們拿了大大的肺炎
+[08:50.62]然後我們注射
+[08:51.42]然後我認為
+[08:52.30]這就像是一部分的故事
+[08:53.54]像是一部分的歷史
+[08:54.34]像是一部分的歷史
+[08:55.62]像是一部分的歷史
+[08:57.58]像是一部分的歷史
+[08:58.90]我認為同樣的
+[09:00.22]我認為一部分的
+[09:01.02]三部分會成為
+[09:01.90]一部分的歷史
+[09:03.22]因為是一部分的
+[09:04.18]一部分的成績
+[09:05.62]對
+[09:06.30]我覺得這部分的內容是如何的
+[09:07.70]和影片也有關的
+[09:09.02]在前一天的情況下
+[09:10.06]我認為可能
+[09:11.10]我認為是Jensen
+[09:11.90]不確定是誰
+[09:12.86]把最近的照片
+[09:13.90]給大家看過的
+[09:15.26]他在第一張DGX的照片中
+[09:17.66]我覺得Jensen 已經是
+[09:19.06]一個完美的
+[09:21.22]技術
+[09:21.94]和精神的一切
+[09:24.10]我對NVIDIA的尊敬有多大關注
+[09:26.22]是不實際的
+[09:26.94]但我會打開
+[09:27.74]我給他們的需要
+[09:29.46]讓他們構思一下
+[09:30.30]或者
+[09:31.34]你只要用任何NVIDIA給他們的東西
+[09:33.70]所以我們很接近他們的工作
+[09:35.38]我不確定能分享所有的故事
+[09:37.62]但例子是我找到的
+[09:39.42]特別有趣的
+[09:40.14]所以 Scott Gray 是很棒的
+[09:41.54]我很喜歡他
+[09:42.22]他在我的隊伍中
+[09:43.30]是一名超級電腦隊伍
+[09:45.62]就是Chris Burner 做的
+[09:46.74]Chris Burner 還做了很多東西
+[09:48.82]結果
+[09:49.70]我們有很接近NVIDIA的 ties
+[09:52.62]其實我的 co-founder
+[09:53.70]在Adept Eric Elson
+[09:54.74]是一位以前的GPGPU人士
+[09:56.78]所以他和Scott
+[09:57.82]和Brian Kanzaro
+[09:58.86]NVIDIA
+[09:59.66] and Jonah
+[10:00.26] and Ian at NVIDIA
+[10:01.14]我覺得我們全都很接近
+[10:02.54]我們是一部分的組織
+[10:03.70]我們如何推動這些股票的限度
+[10:05.82]我覺得那種組織
+[10:07.42]幫助了我們
+[10:08.38]我想有趣的部分
+[10:09.50]是 knowing the A100 generation
+[10:11.22]那個Quadsbar city
+[10:12.26]會是一件事
+[10:12.98]是我們想找到的
+[10:14.50]來解決
+[10:15.22]這是我們可以利用的
+[10:16.50]模特兒訓練
+[10:17.14] really what it boils down to
+[10:18.50]是
+[10:19.22]我認為更多人
+[10:20.06]知道這件事
+[10:21.26]6 年前
+[10:22.34]甚至3 年前
+[10:23.34]人們拒絕接受
+[10:24.98]這個AI 是一件故事
+[10:27.02]是一件故事
+[10:27.62]如何讓你更能复入
+[10:29.22]實際使用模特兒
+[10:30.38]使用模特兒
+[10:31.66]還有GPT 2 3 故事嗎
+[10:35.78]你喜歡在外面
+[10:37.78]我認為是
+[10:38.78]很欣賞
+[10:39.86]這個模特兒的作用
+[10:41.66]有趣的GPT 2 故事
+[10:43.66]我花了很長的時間
+[10:45.86]幫Alex使用模特兒
+[10:48.58]我記得
+[10:49.82]最有趣的一刻
+[10:52.22]是我們寫了模特兒
+[10:54.70]我確定模特兒
+[10:56.22]是一個最短的模特兒
+[10:57.70]有任何ML
+[10:58.70]像是最理想的
+[10:59.90]ML 模特兒
+[11:01.42]是三個模特兒
+[11:03.18]這是一種模特兒
+[11:04.54]Vanilla 模特兒
+[11:05.58]只有轉換的模特兒
+[11:06.38]這些特別的東西
+[11:07.34]我記得是在《ParaGraph》裡
+[11:08.58]我記得是在《ParaGraph》裡
+[11:09.42]我們都在看這件事
+[11:11.02]我認為是很難看的模特兒
+[11:11.82]OGs 在廣場上
+[11:13.02]會很討厭這個模特兒
+[11:14.02]他們會說沒有創意
+[11:15.50]為什麼你們要做這個作用
+[11:16.94]現在是很有趣的
+[11:18.02]在後期的看法是
+[11:19.54]一件很刺激的作用
+[11:20.82]但我覺得是一件很早的事
+[11:22.54]我們完全遲到
+[11:24.42]我們都要關心的問題是 AI 和不關的
+[11:27.58]是否有四種不同的想法
+[11:29.34]是否有一個很簡單的想法
+[11:30.34]是否有一個很簡單的想法
+[11:31.34]是否有一個很簡單的想法
+[11:32.34]是否有一個很簡單的想法
+[11:33.34]是否有一個很簡單的想法
+[11:34.34]是否有一個很簡單的想法
+[11:35.34]是否有一個很簡單的想法
+[11:36.34]是否有一個很簡單的想法
+[11:37.34]是否有一個很簡單的想法
+[11:38.34]是否有一個很簡單的想法
+[11:39.34]是否有一個很簡單的想法
+[11:40.34]是否有一個很簡單的想法
+[11:41.34]是否有一個很簡單的想法
+[11:42.34]是否有一個很簡單的想法
+[11:43.34]是否有一個很簡單的想法
+[11:44.34]是否有一個很簡單的想法
+[11:45.34]是否有一個很簡單的想法
+[11:46.34]是否有一個很簡單的想法
+[11:47.34]是否有一個很簡單的想法
+[11:48.34]是否有一個很簡單的想法
+[11:49.34]是否有一個很簡單的想法
+[11:50.34]是否有一個很簡單的想法
+[11:51.34]是否有一個很簡單的想法
+[11:52.34]是否有一個很簡單的想法
+[11:53.34]是否有一個很簡單的想法
+[11:54.34]是否有一個很簡單的想法
+[11:55.34]是否有一個很簡單的想法
+[11:56.34]是否有一個很簡單的想法
+[11:57.34]是否有一個很簡單的想法
+[11:58.34]是否有一個很簡單的想法
+[11:59.34]是否有一個很簡單的想法
+[12:00.34]是否有一個很簡單的想法
+[12:01.34]是否有一個很簡單的想法
+[12:02.34]是否有一個很簡單的想法
+[12:03.34]是否有一個很簡單的想法
+[12:04.34]是否有一個很簡單的想法
+[12:05.34]是否有一個很簡單的想法
+[12:06.34]是否有一個很簡單的想法
+[12:07.34]是否有一個很簡單的想法
+[12:08.34]是否有一個很簡單的想法
+[12:09.34]之前 Microsoft invested in OpenAI
+[12:11.34]Sam Altman, myself, and our CFO
+[12:13.34] flew up to Seattle
+[12:14.34] to do the final pitch meeting
+[12:16.34] and I'd been a founder before
+[12:17.34] so I always had a tremendous amount of anxiety
+[12:19.34] about partner meetings
+[12:21.34] which this basis is what it was
+[12:22.34] it was like Kevin Scott
+[12:23.34] and Satya and Amy Hood
+[12:25.34] and it was my job to give the technical slides
+[12:27.34] about what's the path to AGI
+[12:29.34] what's our research portfolio
+[12:30.34] all of this stuff
+[12:31.34] but it was also my job to give the GPT-2 demo
+[12:34.34] we had a slightly bigger version of GPT-2
+[12:36.34] that we had just cut
+[12:38.34] maybe a day or two before this flight up
+[12:40.34] and as we all know now
+[12:42.34]Model behaviors you find predictable
+[12:44.34] at one checkpoint
+[12:45.34] are not predictable in another checkpoint
+[12:46.34] and so like I spent all this time
+[12:48.34] trying to figure out how to keep this thing on rails
+[12:50.34] I had my canned demos
+[12:51.34] but I knew I had to go
+[12:52.34] turn it around over to Satya and Kevin
+[12:54.34] and let them type anything in
+[12:56.34] and that just that really kept me up all night
+[12:58.34]Nice, yeah
+[13:00.34]I mean that must have helped you
+[13:01.34] talking about partners meeting
+[13:03.34]You raised 420 million for ADAPT
+[13:06.34]The last round was a $350 million series B
+[13:09.34]So I'm sure you do great
+[13:10.34]Pitching and painting
+[13:12.34]Nice
+[13:13.34]No, that's a high compliment coming from a VC
+[13:15.34]Yeah, I mean you're doing great
+[13:17.34]Let's talk about ADAPT
+[13:19.34]and we were doing pre prep
+[13:21.34]and you mentioned that maybe a lot of people
+[13:22.34]don't understand what ADAPT is
+[13:23.34]So usually we try and introduce the product
+[13:26.34]and then have the founders fill in the blanks
+[13:27.34]but maybe let's do the reverse
+[13:28.34]Like what is ADAPT?
+[13:30.34]Yeah, so I think ADAPT
+[13:31.34]is the least understood company
+[13:34.34]in the broader space of foundation models
+[13:36.34]plus agents
+[13:37.34]So I'll give some color
+[13:39.34]and I'll explain what it is
+[13:40.34]and I'll explain also
+[13:41.34]why it's actually pretty different
+[13:43.34]from what people would have guessed
+[13:44.34]So the goal for ADAPT
+[13:46.34]is we basically want to build an AI agent
+[13:48.34]that can do
+[13:49.34]that can basically help humans
+[13:50.34]do anything a human does on a computer
+[13:51.34]and so what that really means is
+[13:53.34]we want this thing to be super good
+[13:55.34]at turning natural language
+[13:56.34]like goal specifications
+[13:58.34]right into the correct set of end steps
+[14:00.34]and then also have all the correct sensors
+[14:02.34]and actuators
+[14:03.34]to go get that thing done for you
+[14:04.34]across any software tool
+[14:05.34]that you already use
+[14:06.34]and so the end vision of this
+[14:07.34]is effectively like
+[14:08.34]I think in a couple years
+[14:09.34]everyone's going to have access
+[14:10.34]to an AI teammate
+[14:11.34]that they can delegate arbitrary tasks to
+[14:14.34]and then also be able to use it
+[14:16.34]to a sounding board
+[14:17.34]and just be way, way, way more productive
+[14:19.34]right and just changes the shape
+[14:21.34]of every job
+[14:22.34]from something where you're mostly
+[14:23.34]doing execution
+[14:24.34]to something where you're mostly
+[14:25.34]actually doing these core liberal arts skills
+[14:26.34]of what should I be doing and why
+[14:28.34]right and
+[14:29.34]I find this like really exciting
+[14:31.34]motivating because
+[14:32.34]I think it's actually
+[14:33.34]pretty different vision
+[14:34.34]for how AI will play out
+[14:36.34]I think systems like ADAPT
+[14:37.34]are the most likely systems
+[14:38.34]to be proto-AGI's
+[14:40.34]but I think the ways in which
+[14:41.34]we are really counterintuitive
+[14:42.34]to everybody
+[14:43.34]is that
+[14:44.34]we've actually been really quiet
+[14:45.34]because we are
+[14:46.34]not a developer company
+[14:47.34]we don't sell APIs
+[14:48.34]we don't sell open source models
+[14:50.34]we also don't sell bottom-up products
+[14:52.34]we're not a thing
+[14:53.34]that you go and click
+[14:54.34]and download the extension
+[14:55.34]and like we want more users
+[14:56.34]signing up for that thing
+[14:57.34]we're actually an enterprise company
+[14:58.34]so what we do is
+[14:59.34]we work with a range
+[15:00.34]of different companies
+[15:01.34]some like late-stage
+[15:02.34]multi-thousand people start-ups
+[15:04.34]some Fortune 500s etc
+[15:06.34]and what we do for them
+[15:07.34]is we basically give them
+[15:09.34]an out-of-the-box solution
+[15:11.34]where big complex workflows
+[15:12.34]that their employees
+[15:13.34]do every day
+[15:14.34]could be delegated to the model
+[15:15.34]and so we look a little
+[15:16.34]different from other companies
+[15:17.34]in that in order
+[15:18.34]to go build this
+[15:19.34]full agent thing
+[15:20.34]the most important thing
+[15:21.34]you gotta get right
+[15:22.34]is reliability
+[15:23.34]so initially zooming
+[15:24.34]way back when
+[15:25.34]one of the first things
+[15:26.34]debt did was we released
+[15:27.34]this demo called Act 1
+[15:28.34]act 1 was like pretty cool
+[15:30.34]it's kind of become
+[15:31.34]a hello world thing
+[15:32.34]for people to show
+[15:33.34]agent demos
+[15:34.34]by going to redfin
+[15:35.34]and asking to buy a house
+[15:36.34]somewhere
+[15:37.34]because like we did that
+[15:38.34]in the original Act 1 demo
+[15:39.34]and like showed that
+[15:40.34]showed like Google Sheets
+[15:41.34]all this other stuff
+[15:42.34]over the last like year
+[15:44.34]since that has come out
+[15:45.34]there's been a lot
+[15:46.34]of really cool demos
+[15:47.34]and you go play with them
+[15:48.34]and you realize
+[15:49.34]they work 60% of the time
+[15:50.34]but since we've always
+[15:51.34]been focused on
+[15:52.34]how do we build
+[15:53.34]an amazing enterprise product
+[15:54.34]enterprises can't use
+[15:55.34]anything
+[15:56.34]the reliability
+[15:57.34]and so we've
+[15:58.34]actually had to go down
+[15:59.34]a slightly different
+[16:00.34]tech tree than what you
+[16:01.34]might find in the
+[16:02.34]prompt engineering
+[16:03.34]sort of plays in
+[16:04.34]the agent space
+[16:05.34]to get that reliability
+[16:06.34]and we've decided
+[16:07.34]to prioritize reliability
+[16:08.34]over all else
+[16:09.34]so like one of our use
+[16:10.34]cases is crazy enough
+[16:11.34]that it actually ends
+[16:12.34]with a physical truck
+[16:13.34]being sentto a place
+[16:15.34]as the result
+[16:16.34]of the agent workflow
+[16:17.34]and if you're like
+[16:18.34]if that works like 60%
+[16:19.34]of the time
+[16:20.34]you're just blowing money
+[16:21.34]and poor truck drivers
+[16:22.34]going places
+[16:23.34]interesting
+[16:24.34]one of the
+[16:25.34]common teams
+[16:26.34]has this idea of services
+[16:27.34]as software
+[16:28.34]I'm actually giving a talk
+[16:29.34]at nvidia gtc
+[16:30.34]about this
+[16:31.34]but basically
+[16:32.34]software as a service
+[16:33.34]you're wrapping
+[16:34.34]user productivity
+[16:35.34]in software
+[16:36.34]with agents
+[16:37.34]and services as software
+[16:38.34]is replacing things
+[16:39.34]that you know
+[16:40.34]you would ask somebody
+[16:41.34]to do
+[16:42.34]and the software
+[16:43.34]just does it for you
+[16:44.34]when you think
+[16:45.34]about these usecases
+[16:46.34]do the users
+[16:47.34]still go in
+[16:48.34]and look at the agent
+[16:49.34]kindof like
+[16:50.34]doing the things
+[16:51.34]and can intervene
+[16:52.34]or likeare they slowly
+[16:53.34]remove from them
+[16:54.34]are there people
+[16:55.34]in the middle
+[16:56.34]checking in
+[16:57.34]I think there's two current flaws
+[16:58.34]in the framing
+[16:59.34]for services
+[17:00.34]as software
+[17:01.34]or I think what you just said
+[17:02.34]I think that one of them
+[17:03.34]is likein our experience
+[17:04.34]as we've been rolling
+[17:05.34]out adept
+[17:06.34]the people who actually
+[17:07.34]do the jobs
+[17:08.34]are the most excited
+[17:09.34]about it
+[17:10.34]because they don't go from
+[17:11.34]I do this job
+[17:12.34]to I don't do this job
+[17:13.34]they go from
+[17:14.34]I do this job
+[17:15.34]for everything
+[17:16.34]including the shitty
+[17:17.34]wrote stuff
+[17:18.34]to I'm a supervisor
+[17:19.34]and I literally
+[17:20.34]likeit's pretty magical
+[17:21.34]when you watch the thing
+[17:22.34]being used
+[17:23.34]sequentially by hand
+[17:24.34]as a human
+[17:25.34]and you can just click
+[17:26.34]in any one of them
+[17:27.34]be like hey I want to watch
+[17:28.34]the trajectory
+[17:29.34]the agent went through
+[17:30.34]to go solve this
+[17:31.34]and the nice thing
+[17:32.34]about agent execution
+[17:33.34]as opposed to
+[17:34.34]like LLM generations
+[17:35.34]is that
+[17:36.34]a good chunk of the time
+[17:37.34]when the agent
+[17:38.34]fails to execute
+[17:39.34]it doesn't give you
+[17:40.34]the wrong result
+[17:41.34]it just fails to execute
+[17:42.34]and the whole trajectory
+[17:43.34]is just broken and dead
+[17:44.34]and the agent knows it
+[17:45.34]right so then
+[17:46.34]those are the ones
+[17:47.34]that the human
+[17:48.34]then goes and solves
+[17:49.34]and so then they become
+[17:50.34]a troubleshooter
+[17:51.34]they work on the more
+[17:52.34]present piece
+[17:53.34]of it
+[17:54.34]that we found
+[17:55.34]is our strategy
+[17:56.34]as a company
+[17:57.34]is to always be
+[17:58.34]an augmentation company
+[17:59.34]and I think
+[18:01.34]one out of principle
+[18:02.34]that's something
+[18:03.34]we really care about
+[18:04.34]but two
+[18:05.34]actually if you're
+[18:06.34]framing yourself
+[18:07.34]as an augmentation
+[18:08.34]company
+[18:09.34]you're always going to
+[18:10.34]live in the world
+[18:11.34]where you're solving
+[18:12.34]tasks that are a little
+[18:13.34]too hard for what
+[18:14.34]the model can do today
+[18:15.34]and still needs a human
+[18:16.34]to provide oversight
+[18:17.34]provide clarifications
+[18:18.34]provide human feedback
+[18:19.34]and that's how you
+[18:20.34]build a data flywheel
+[18:21.34]smart as humans
+[18:22.34]how to solve
+[18:23.34]things models
+[18:24.34]can't do today
+[18:25.34]and so I actually
+[18:26.34]think that
+[18:27.34]being an augmentation
+[18:28.34]company
+[18:29.34]forces you to go
+[18:30.34]develop your core
+[18:31.34]AI capabilities
+[18:32.34]faster than someone
+[18:33.34]who's saying
+[18:34.34]ah okay
+[18:35.34]my job's like
+[18:36.34]deliver you
+[18:37.34]a lights off
+[18:38.34]solution for X
+[18:39.34]it's interesting
+[18:40.34]because we've seen
+[18:41.34]two parts
+[18:42.34]of the market
+[18:43.34]one is
+[18:44.34]we have one company
+[18:45.34]that does
+[18:46.34]agents for
+[18:47.34]sock analysts
+[18:48.34]people just
+[18:49.34]don't have them
+[18:50.34]which is
+[18:51.34]the augmentation product
+[18:52.34]and then you have
+[18:53.34]sweep.dev
+[18:54.34]any of these products
+[18:55.34]which they just
+[18:56.34]do the whole thing
+[18:57.34]I'm really curious
+[18:58.34]to see how that evolves
+[18:59.34]I agree that today
+[19:00.34]the reliability is
+[19:01.34]so important
+[19:02.34]in the enterprise
+[19:03.34]that they just
+[19:04.34]don't use
+[19:05.34]most of them
+[19:06.34]that's cool
+[19:07.34]but it's great
+[19:08.34]to hear the story
+[19:09.34]because I think
+[19:10.34]from the outside
+[19:11.34]people are like
+[19:12.34]oh that
+[19:13.34]they do act one
+[19:14.34]they do person on
+[19:15.34]they do foo you
+[19:16.34]they do all these
+[19:17.34]it's just the public stuff
+[19:18.34]it's just the public stuff
+[19:19.34]我們想要更多的客人來領導
+[19:22.20]所以我們想要更多的客人來領導
+[19:26.08]但我們希望我們會更多的客人來領導
+[19:29.32]我們想要更多的客人來領導
+[19:31.48]我們想要更多的客人來領導
+[19:33.68]所以這次我們想要更多的客人來領導
+[19:36.70]為什麼你變得更多的客人?
+[19:38.78]如果整個推動...
+[19:40.12]你已經領導了你的公司
+[19:41.82]但是你也會更加努力去領導更多的客人來領導
+[19:46.20]我覺得我們剛剛領導過那一步
+[19:48.14]因為我最近還沒有領導過那一步
+[19:49.14]這是一個好問題
+[19:50.14]我認為這兩件事其實是很重要的
+[19:51.14]一件事我認為是...
+[19:53.14]坦白說,大部分是公共的歷史
+[19:56.14]在公司中的公司中的歷史是最重要的
+[19:58.14]我非常高興這件事發生
+[20:00.14]因為當我們開始公司在2022年代
+[20:03.14]大家都在社會中知道歷史的歷史
+[20:06.14]但公司中的歷史沒有任何意義
+[20:08.14]他們還會把所有的歷史都放在桌上
+[20:11.14]所以我認為現在
+[20:13.14]我真的要注意的是
+[20:15.14]當人們認為歷史
+[20:16.14]他們會認為是對的
+[20:17.14]對,所有各種各樣的東西都會被引起
+[20:19.14]會被引起的電話電話電話電話
+[20:20.14]會被引起的東西都會被引起的東西
+[20:21.14]或是被引起的電話電話電話
+[20:22.14]我認為電話電話電話
+[20:23.14]是一個可以給你一個目標
+[20:25.14]再次進行的工作
+[20:27.14]並且在最少數個步驟中
+[20:28.14]所以這就是一個大部分的原因
+[20:30.14]我認為其中一個部分
+[20:31.14]是因為我認為更好讓人們
+[20:33.14]更加 aware of the depth
+[20:34.14]他們想要做的事情
+[20:35.14]他們的生意
+[20:36.14]這塊地是在世界中
+[20:38.14]在於在更多的利益
+[20:40.14]我認為大量的利益
+[20:43.14]會發生從
+[20:44.14]你使用的研究模式
+[20:46.14]作為大量學童的學童
+[20:49.14]去解決這些事
+[20:50.14]我認為那些人
+[20:51.14]想要做的研究
+[20:52.14]應該有所改善
+[20:53.14]當你提到
+[20:54.14]研究已經變成
+[20:55.14]更多的一部分
+[20:56.14]有什麼特別的東西
+[20:57.14]你會問我嗎
+[20:58.14]我會給你一個名字
+[20:59.14] Bill Gates 在 his blog post
+[21:00.14]提及「Agent of the Future」
+[21:02.14]我是那個人 who made OSs
+[21:04.14]我認為「Agent of the Next Thing」
+[21:05.14]所以 Bill Gates
+[21:07.14]我會叫他出來
+[21:08.14]然後 Sam Altman 也會說
+[21:09.14]「Agent of the Future for Open AI」
+[21:10.14]我認為之前
+[21:11.14]我認為
+[21:12.14]有些人在《紐約 Times》
+[21:13.14]Kade Metz 也在《紐約 Times》
+[21:15.14]對於現在
+[21:16.14]在一些不同的
+[21:17.14]我看過 AI 開始的
+[21:18.14]使用的研究模式
+[21:19.14]是 AI 公司
+[21:20.14]現在的 AI 公司
+[21:21.14]是 AI 公司
+[21:22.14]只是我認為
+[21:23.14]是一段時間
+[21:24.14]從 VC 開始
+[21:25.14]是有點混合
+[21:26.14]是嗎
+[21:27.14]我認為有很多 VC
+[21:28.14]會說我不會
+[21:29.14]觸碰 any agent start-ups
+[21:30.14]因為
+[21:31.14]為什麼
+[21:32.14]你告訴我
+[21:33.14]我認為有很多 VC
+[21:35.14]比較少技術
+[21:37.14]不懂得
+[21:38.14]限制的東西
+[21:39.14]不不不
+[21:40.14]你會這樣嗎
+[21:41.14]不不
+[21:42.14]我認為
+[21:43.14]今天的可能性
+[21:44.14]是否適用
+[21:46.14]我認為
+[21:47.14]人們會看你
+[21:48.14]然後說
+[21:49.14]這傢伙
+[21:50.14]需要 400 億元
+[21:51.14]去做
+[21:52.14]所以有很多 VC
+[21:53.14]都會說
+[21:54.14]我會再加上
+[21:55.14]有些東西
+[21:56.14]協助 AI
+[21:57.14]有些東西
+[21:58.14]是比較容易
+[21:59.14]進行
+[22:00.14]進行的
+[22:01.14]但我還驚訝
+[22:02.14]有些 funders
+[22:03.14]不想做 agent
+[22:04.14]不只是 funding
+[22:05.14]有時候
+[22:06.14]我們在看
+[22:07.14]為什麼沒有人
+[22:08.14]做 agent for acts
+[22:09.14]那是好
+[22:10.14]其實
+[22:11.14]我從沒知道
+[22:12.14]我的觀點
+[22:13.14]是
+[22:14.14]有新的 agent company
+[22:16.14]在進行
+[22:17.14]所以可能
+[22:18.14]他們也有
+[22:19.14]但我提供人員
+[22:20.14]去取消 agent
+[22:21.14]他們的名字
+[22:22.14]是因為
+[22:23.14]他們的名字
+[22:24.14]他們的名字
+[22:25.14]所以
+[22:26.14]他們不等待
+[22:27.14]對
+[22:28.14]那是好處
+[22:29.14]你的 portfolio allocator
+[22:31.14]有些人
+[22:32.14]知道 about persimmon
+[22:33.14]一些人知道
+[22:34.14]for you and for you heavy
+[22:35.14]你覺得
+[22:36.14]怎麼想
+[22:37.14]那個 evolution of that
+[22:38.14]什麼人
+[22:39.14]想想
+[22:40.14]那是
+[22:41.14]a depth
+[22:42.14]搜尋個案
+[22:43.14] kind of take us
+[22:44.14]through the stuff
+[22:45.14]you should recently
+[22:46.14]and how people
+[22:47.14]should think about
+[22:48.14]the trajectory
+[22:49.14]what you're doing
+[22:50.14]the critical path
+[22:51.14]for adept
+[22:52.14]is we want to build
+[22:53.14]agents that can do
+[22:54.14]a higher and higher
+[22:55.14]level of abstraction
+[22:56.14]things over time
+[22:57.14]all while keeping
+[22:58.14]insanely
+[22:59.14]high reliability standard
+[23:00.14]because that's
+[23:01.14]what turns this from
+[23:02.14]research into something
+[23:03.14]that customers want
+[23:04.14]and if you build
+[23:05.14]agents with really
+[23:06.14]high reliability standard
+[23:07.14]your users
+[23:08.14]how to get that
+[23:09.14]next level of
+[23:10.14]straction faster
+[23:11.14]so that's how
+[23:12.14]you actually build
+[23:13.14]the data level
+[23:14.14]that's the critical path
+[23:15.14]for the company
+[23:16.14]everything we do
+[23:17.14]is in service of that
+[23:18.14]so you go zoom
+[23:19.14]way way back to
+[23:20.14]act one days right
+[23:21.14]like the core thing
+[23:22.14]behind act one
+[23:23.14]is can we teach
+[23:24.14]large model basically
+[23:25.14]how to even
+[23:26.14]actuate your computer
+[23:27.14]and I think we're
+[23:28.14]one of the first places
+[23:29.14]to have solved that
+[23:30.14]and shown it
+[23:31.14]and shown the generalization
+[23:32.14]that you get when you
+[23:33.14]give it various different
+[23:34.14]workflows and texts
+[23:35.14]but I think from
+[23:36.14]these models
+[23:37.14]to be able to
+[23:38.14]get a lot better
+[23:39.14]at having some
+[23:40.14]specificationof some
+[23:41.14]guardrails for what it
+[23:42.14]actually should be doing
+[23:43.14]and I think in conjunction
+[23:44.14]with that a giant thing
+[23:45.14]that was really
+[23:46.14]necessaryis really
+[23:47.14]fast multimodal models
+[23:48.14]that are really good
+[23:49.14]at understanding
+[23:50.14]knowledge work
+[23:51.14]and really good
+[23:52.14]at understanding screens
+[23:53.14]and that needs to
+[23:54.14]kind of be the base
+[23:55.14]for some of these
+[23:56.14]agentsback then
+[23:57.14]we had to do a ton
+[23:58.14]ofresearchbasically
+[23:59.14]on how do we
+[24:00.14]actually make that
+[24:01.14]possiblewell first off
+[24:02.14]back in
+[24:03.14]free at exact
+[24:04.14]one month of 23
+[24:05.14]and then
+[24:06.14]we had to
+[24:07.14]get a lot better
+[24:08.14]at the first place
+[24:09.14]and then
+[24:10.14]we had to
+[24:11.14]get a lot better
+[24:12.14]at the first place
+[24:13.14]and then
+[24:14.14]we had to
+[24:15.14]get a lot better
+[24:16.14]at the first place
+[24:17.14]and then
+[24:18.14]we had to
+[24:19.14]get a lot better
+[24:20.14]at the first place
+[24:21.14]and then
+[24:22.14]we had to
+[24:23.14]get a lot better
+[24:24.14]at the first place
+[24:25.14]and then
+[24:26.14]we had to
+[24:27.14]get a lot better
+[24:28.14]at the first place
+[24:29.14]and then
+[24:30.14]we had to
+[24:31.14]get a lot better
+[24:32.14]at the first place
+[24:33.14]and then
+[24:34.14]we had to
+[24:35.14]get a lot better
+[24:36.14]at the first place
+[24:37.14]and then
+[24:38.14]we had to
+[24:39.14]get a lot better
+[24:40.14]at the first place
+[24:41.14]and then
+[24:42.14]we had to
+[24:43.14]get a lot better
+[24:44.14]at the first place
+[24:45.14]and then
+[24:46.14]we had to
+[24:47.14]get a lot better
+[24:48.14]at the first place
+[24:49.14]and then
+[24:50.14]we had to
+[24:51.14]get a lot better
+[24:52.14]at the first place
+[24:53.14]and then
+[24:54.14]we had to
+[24:55.14]get a lot better
+[24:56.14]at the first place
+[24:57.14]and then
+[24:58.14]we had to
+[24:59.14]get a lot better
+[25:00.14]at the first place
+[25:01.14]and then
+[25:02.14]we had to
+[25:03.14]get a lot better
+[25:04.12]at the first place
+[25:05.12]and then
+[25:06.12]we had to
+[25:07.12]get a lot better
+[25:08.12]at the first place
+[25:09.12]and then
+[25:10.12]we had to
+[25:11.12]get a lot better
+[25:12.12]at the first place
+[25:13.12]and then
+[25:14.12]we had to
+[25:15.12]get a lot better
+[25:16.12]at the first place
+[25:17.12]and then
+[25:18.12]we had to
+[25:19.12]get a lot better
+[25:20.12]at the first place
+[25:21.12]and then
+[25:22.12]we had to
+[25:23.12]get a lot better
+[25:24.12]at the first place
+[25:25.12]and then
+[25:26.12]we had to
+[25:27.12]get a lot better
+[25:28.12]at the first place
+[25:29.12]and then
+[25:30.12]we had to
+[25:31.12]get a lot better
+[25:32.12]at the first place
+[25:33.12]and then
+[25:34.12]we had to
+[25:35.12]get a lot better
+[25:36.12]at the first place
+[25:37.12]and then
+[25:38.12]we had to
+[25:39.12]get a lot better
+[25:40.12]at the first place
+[25:41.12]and then
+[25:42.12]we had to
+[25:43.12]get a lot better
+[25:44.12]at the first place
+[25:45.12]and then
+[25:46.12]we had to
+[25:47.12]get a lot better
+[25:48.12]at the first place
+[25:49.12]and then
+[25:50.12]we had to
+[25:51.12]get a lot better
+[25:52.12]at the first place
+[25:53.12]and then
+[25:54.12]we had to
+[25:55.12]get a lot better
+[25:56.12]at the first place
+[25:57.12]and then
+[25:58.12]we had to
+[25:59.12]get a lot better
+[26:00.12]at the first place
+[26:01.12]and then
+[26:02.12]we had to
+[26:03.12]get a lot better
+[26:04.12]at the first place
+[26:05.12]and then
+[26:06.12]we had to
+[26:07.12]get a lot better
+[26:08.12]at the first place
+[26:09.12]and then
+[26:10.12]we had to
+[26:11.12]get a lot better
+[26:12.12]at the first place
+[26:13.12]and then
+[26:14.12]we had to
+[26:15.12]get a lot better
+[26:16.12]at the first place
+[26:17.12]and then
+[26:18.12]we had to
+[26:19.12]get a lot better
+[26:20.12]at the first place
+[26:21.12]and then
+[26:22.12]we had to
+[26:23.12]get a lot better
+[26:24.12]at the first place
+[26:25.12]and then
+[26:26.12]we had to
+[26:27.12]get a lot better
+[26:28.12]at the first place
+[26:29.12]and then
+[26:30.12]we had to
+[26:31.12]get a lot better
+[26:32.12]at the browser level
+[26:33.12]I really want
+[26:34.12]at your papers
+[26:35.12]you have like a different representation
+[26:36.12]kind of like
+[26:37.12]you don't just take the dome
+[26:38.12]and act on it
+[26:39.12]you do a lot more stuff
+[26:40.12]how do you think about
+[26:41.12]the best way
+[26:42.12]the models will interact
+[26:43.12]with the software
+[26:44.12]and like how
+[26:45.12]the development of products
+[26:46.12]is going to change
+[26:47.12]with that in mind
+[26:48.12]as more and more
+[26:49.12]the work is done by agents
+[26:50.12]instead of people
+[26:51.12]this is
+[26:52.12]there's so much surface area here
+[26:53.12]and it's actually one of the things
+[26:54.12]I'm really excited about
+[26:55.12]and it's funny because
+[26:56.12]I've spent most of my time
+[26:57.12]doing research stuff
+[26:58.12]but this is like a whole
+[26:59.12]new ball game that I've been
+[27:00.12]doing about
+[27:01.12]and I find it
+[27:02.12]really cool
+[27:03.12]so I would say
+[27:04.12]the best analogy
+[27:05.12]I have to
+[27:06.12]why ADAPT
+[27:07.12]is pursuing a path
+[27:08.12]of being able to
+[27:09.12]use your computer
+[27:10.12]like a human
+[27:11.12]plus of course
+[27:12.12]being able to call
+[27:13.12]APIs
+[27:14.12]being able to call
+[27:15.12]APIs is the easy part
+[27:16.12]like being able to
+[27:17.12]use your gear like humans
+[27:18.12]is a hard part
+[27:19.12]it's in the same way
+[27:20.12]why people are excited
+[27:21.12]about humanoid robotics
+[27:22.12]right
+[27:23.12]in a world where
+[27:24.12]you had t=infinity
+[27:25.12]right you're probably
+[27:26.12]gonna have various
+[27:27.12]different form factors
+[27:28.12]that robots
+[27:29.12]do
+[27:30.12]without changing
+[27:31.12]everything along the way
+[27:32.12]it's the same thing
+[27:33.12]for software
+[27:34.12]right
+[27:35.12]if you go itemize out
+[27:36.12]the number of things
+[27:37.12]you wanna do on your computer
+[27:38.12]for which every step
+[27:39.12]has an api
+[27:40.12]those numbers
+[27:41.12]will workflows add up
+[27:42.12]pretty close to zero
+[27:43.12]and so then many
+[27:44.12]points along the way
+[27:45.12]you need the ability
+[27:46.12]to actually control
+[27:47.12]your computer like a human
+[27:48.12]it also lets you learn
+[27:49.12]from human usage
+[27:50.12]of computers
+[27:51.12]as a source of training
+[27:52.12]data that you don't get
+[27:53.12]if you have to somehow
+[27:54.12]figure out how every
+[27:55.12]particular step needs to be
+[27:56.12]some particular custom
+[27:57.12]private api thing
+[27:58.12]it's the most practical path
+[27:59.12]i think a lot of
+[28:00.12]success will come
+[28:01.12]from going down
+[28:02.12]this path
+[28:03.12]i kinda think about this
+[28:04.12]early days of the agent
+[28:05.12]interaction layer
+[28:06.12]level is a little bit
+[28:07.12]like do y'all remember
+[28:08.12]windows 3.1
+[28:10.12]like those days
+[28:11.12]this might be
+[28:12.12]i might be too old
+[28:13.12]for you guys on this
+[28:14.12]but back in the day
+[28:15.12]windows 3.1
+[28:16.12]we had this transition period
+[28:17.12]between pure command line
+[28:18.12]right
+[28:19.12]being the default
+[28:20.12]into this new world
+[28:21.12]with the gui is the default
+[28:22.12]and then you drop into the
+[28:23.12]command line for like
+[28:24.12]programmer things
+[28:25.12]the old way was
+[28:26.12]you booted your computer up
+[28:27.12]and then it would
+[28:28.12]give you the c colon
+[28:29.12]slash thing
+[28:30.12]and you typed windows
+[28:31.12]and you hit enter
+[28:32.12]and then you got
+[28:33.12]put into windows
+[28:34.12]and then the gui
+[28:35.12]kind of became a layer
+[28:36.12]above the command line
+[28:37.12]the same thing
+[28:38.12]is gonna happen
+[28:39.12]with agent interfaces
+[28:40.12]is like today
+[28:41.12]what we have in the gui
+[28:42.12]is like the base layer
+[28:44.12]and then the agent
+[28:45.12]just controls
+[28:46.12]the current gui
+[28:47.12]layer plus apis
+[28:48.12]and in the future
+[28:50.12]as more and more
+[28:51.12]trust is built towards
+[28:52.12]agents and more and more
+[28:53.12]things can be done by
+[28:54.12]agents and more UIs
+[28:55.12]for agents are actually
+[28:56.12]users
+[28:57.12]then that just becomes
+[28:58.12]a standard
+[28:59.12]interaction layer
+[29:00.12]and if that becomes
+[29:01.12]a standard
+[29:02.12]interaction layer
+[29:03.12]what changes for
+[29:04.12]software is that
+[29:05.12]a lot of software
+[29:06.12]is gonna be
+[29:07.12]either systems
+[29:08.12]or record
+[29:09.12]or like certain
+[29:10.12]customized
+[29:11.12]workflow
+[29:12.12]execution engines
+[29:13.12]and a lot of
+[29:14.12]how you actually
+[29:15.12]do stuff will be
+[29:16.12]controlled at the
+[29:17.12]agent layer
+[29:18.12]and you think the
+[29:19.12]rabbit interface
+[29:20.12]is more like
+[29:21.12]it would like
+[29:22.12]you're not actually
+[29:23.12]seeing the app
+[29:24.12]that the model
+[29:25.12]I can see that
+[29:26.12]being a model
+[29:27.12]I think
+[29:28.12]I don't know
+[29:29.12]enough about
+[29:30.12]what using
+[29:31.12]rabbit in real life
+[29:32.12]will actually be like
+[29:33.12]to comment on
+[29:34.12]that particular
+[29:35.12]thing but I think
+[29:36.12]the broader idea
+[29:37.12]that you know
+[29:38.12]you have a goal
+[29:39.12]the agent knows
+[29:40.12]how to break
+[29:41.12]your goal down into steps
+[29:42.12]the agent knows
+[29:43.12]how to use
+[29:44.12]the underlying
+[29:45.12]software
+[29:46.12]and systems
+[29:47.12]or record
+[29:48.12]to achieve
+[29:49.12]that goal for you
+[29:50.12]the agent may presents
+[29:51.12]you information
+[29:52.12]in a custom way
+[29:53.12]that's only
+[29:54.12]you're a power
+[29:55.12]user
+[29:56.12]for some niche thing
+[29:57.12]general question
+[29:58.12]so first of all
+[29:59.12]I think like
+[30:00.12]the sort of input
+[30:01.12]mode conversation
+[30:02.12]I wonder if you have
+[30:03.12]any analogies
+[30:04.12]that you like
+[30:05.12]with self-driving
+[30:06.12]because I do think
+[30:07.12]there's a little bit
+[30:08.12]of how the model
+[30:09.12]should perceive the world
+[30:10.12]and you know
+[30:11.12]the primary split
+[30:12.12]in self-driving
+[30:13.12]is LiDAR
+[30:14.12]versus camera
+[30:15.12]and I feel like
+[30:16.12]most agent companies
+[30:17.12]that I'm tracking
+[30:18.12]are all moving towards
+[30:19.12]camera approach
+[30:20.12]which is like
+[30:21.12]the multimodal approach
+[30:22.12]that we're doing
+[30:23.12]you're
+[30:24.12]focusing on that
+[30:25.12]including charts
+[30:26.12]and tables
+[30:27.12]and do you find
+[30:28.12]inspiration there
+[30:29.12]from the self-driving
+[30:30.12]world?
+[30:31.12]that's a good question
+[30:32.12]I think sometimes
+[30:33.12]the most useful
+[30:34.12]inspiration I've found
+[30:35.12]from self-driving
+[30:36.12]is the levels analogy
+[30:37.12]I think that's awesome
+[30:38.12]but I think that
+[30:39.12]our number one
+[30:40.12]goals for agents
+[30:41.12]not to look like
+[30:42.12]self-driving
+[30:43.12]we want to minimize
+[30:44.12]the chances
+[30:45.12]that agents are sort
+[30:46.12]of a thing
+[30:47.12]that you just
+[30:48.12]have to bang
+[30:49.12]your head at
+[30:50.12]for a long time
+[30:51.12]to get to like
+[30:52.12]completely
+[30:53.12]and that takes you
+[30:54.12]all the way
+[30:55.12]up to the top
+[30:56.12]but similarly
+[30:57.12]I mean
+[30:58.12]compared to self-driving
+[30:59.12]like two things
+[31:00.12]that people really
+[31:01.12]undervalue
+[31:02.12]that's like really
+[31:03.12]easy to driving
+[31:04.12]a car down
+[31:05.12]highway 101
+[31:06.12]in a sunny day
+[31:07.12]demo
+[31:08.12]that actually
+[31:09.12]doesn't prove anything
+[31:10.12]anymore
+[31:11.12]and I think
+[31:12.12]the second thing
+[31:13.12]is that
+[31:14.12]as a non-self-driving
+[31:15.12]expert
+[31:16.12]I think one of the things
+[31:17.12]that we believe
+[31:18.12]really strongly
+[31:19.12]is that
+[31:20.12]everyone under
+[31:21.12]get a lot
+[31:22.12]of reliability
+[31:23.12]is a really
+[31:24.12]strong focus on
+[31:25.12]actually why
+[31:26.12]does the model
+[31:27.12]not do this thing
+[31:28.12]and the non-trivial amount
+[31:29.12]of time
+[31:30.12]the time the model
+[31:31.12]doesn't actually
+[31:32.12]do the thing
+[31:33.12]is because if
+[31:34.12]you're a wizard
+[31:35.12]of ozing it yourself
+[31:36.12]or if you have
+[31:37.12]unreliable actuators
+[31:38.12]you can't do the thing
+[31:39.12]and so we've
+[31:40.12]had to fix
+[31:41.12]a lot of those problems
+[31:42.12]I was slightly
+[31:43.12]surprised just because
+[31:44.12]I do generally
+[31:45.12]consider the way
+[31:46.12]most that we see
+[31:47.12]all around San Francisco
+[31:48.12]as the most
+[31:49.12]I guess real case
+[31:50.12]it's a big
+[31:51.12]job but it has taken
+[31:52.12]a long time
+[31:53.12]for self-driving
+[31:54.12]temperature from
+[31:55.12]when it entered
+[31:56.12]the consciousness
+[31:57.12]and the driving down
+[31:58.12]when it went on a sunny
+[31:59.12]day moment
+[32:00.12]happened to now.
+[32:01.12]so I want to see
+[32:02.12]the more compressed
+[32:03.12]cruise, you know,
+[32:04.12]R.I.P.
+[32:05.12]recently.
+[32:06.12]and then one more thing
+[32:07.12]on just like
+[32:08.12]just going back on
+[32:09.12]this reliability
+[32:10.12]thing, something
+[32:11.12]I have been holding
+[32:12.12]in my head
+[32:13.12]that I'm curious
+[32:14.12]to get your commentary on
+[32:15.12]is I think there's a
+[32:16.12]treatup between
+[32:17.12]reliability and generality
+[32:18.12]or I want to broaden
+[32:19.12]because you have
+[32:20.12]reliability also have
+[32:21.12]cost of speed
+[32:22.12]speed is a huge emphasis
+[32:23.12]for a debt
+[32:24.12]the tendency or the
+[32:25.12]attemptation is to reduce
+[32:26.12]generalityto improve
+[32:27.12]reliability
+[32:28.12]and to improve
+[32:29.12]cost improve speed
+[32:30.12]do you perceive a tradeoff
+[32:31.12]do you have any
+[32:32.12]insights that
+[32:33.12]solve those tradeoffs
+[32:34.12]for you guys
+[32:35.12]there's definitely a tradeoff
+[32:36.12]if you're at
+[32:37.12]the predo frontier
+[32:38.12]I think a lot of folks
+[32:39.12]aren't actually
+[32:40.12]at the predo frontier
+[32:41.12]I think the way you get
+[32:42.12]there is basically
+[32:43.12]how do you frame
+[32:44.12]the fundamental
+[32:45.12]agent problem in a way
+[32:46.12]that just continues
+[32:47.12]to benefit from data
+[32:48.12]I think one of
+[32:49.12]the main ways
+[32:50.12]of being able to solve
+[32:51.12]that particular tradeoff
+[32:52.12]is you basically
+[32:53.12]just want to formulate
+[32:54.12]the problem such that
+[32:55.12]every particular use
+[32:56.12]case just looks like
+[32:57.12]you collecting more
+[32:58.12]data to go make
+[32:59.12]that use case possible
+[33:00.12]I think that's how
+[33:01.12]you really solve it
+[33:02.12]then you get into the
+[33:03.12]other problems like
+[33:04.12]are you overfitting
+[33:05.12]on these end use cases
+[33:06.12]right but like you're
+[33:07.12]not doing a thing
+[33:08.12]where you're like
+[33:09.12]being super prescriptive
+[33:10.12]for the end steps
+[33:11.12]that the model can
+[33:12.12]only do for example
+[33:13.12]then the question becomes
+[33:14.12]kind of do you have
+[33:15.12]one sort of house model
+[33:16.12]they customize
+[33:17.12]the customer's
+[33:18.12]specific use case
+[33:19.12]we're not sharing
+[33:20.12]we're not sharing
+[33:21.12]it's tempting
+[33:22.12]but that doesn't
+[33:23.12]look like AGI to me
+[33:24.12]you know what I mean
+[33:25.12]that is just
+[33:26.12]you have a good
+[33:27.12]base model
+[33:28.12]and then
+[33:29.12]you fine tune it
+[33:30.12]for what it's worth
+[33:31.12]I think there's
+[33:32.12]two paths
+[33:33.12]to a lot more
+[33:34.12]capability coming out
+[33:35.12]of the models
+[33:36.12]that we
+[33:37.12]all are training
+[33:38.12]these days
+[33:39.12]one path
+[33:40.12]is you figure out
+[33:41.12]how to spend
+[33:42.12]compute and turn
+[33:43.12]into data
+[33:44.12]and so in that
+[33:45.12]path I consider
+[33:46.12]off play
+[33:47.12]all that stuff
+[33:48.12]the second path
+[33:49.12]is how do you
+[33:50.12]get super
+[33:52.12]competent
+[33:53.12]high intelligence
+[33:54.12]demonstrations
+[33:55.12]from humans
+[33:56.12]and I think
+[33:57.12]the right way
+[33:58.12]to move forward
+[33:59.12]is you kind of
+[34:00.12]want to combine the two
+[34:01.12]the first one
+[34:02.12]gives you maximum
+[34:03.12]sample efficiency
+[34:04.12]for the second
+[34:05.12]but I think
+[34:06.12]that is going to be
+[34:07.12]hard to be running
+[34:08.12]at max speed
+[34:09.12]towards AGI
+[34:10.12]without actually
+[34:11.12]solving a bit of both
+[34:12.12]you haven't talked
+[34:13.12]much about synthetic
+[34:14.12]data as far as I can
+[34:15.12]any insights
+[34:16.12]on using synthetic
+[34:17.12]data to augment
+[34:18.12]the expensive
+[34:19.12]human data
+[34:20.12]the best part
+[34:21.12]about framing AGI
+[34:22.12]is being able
+[34:23.12]to help people do
+[34:24.12]things on computers
+[34:25.12]is you have an environment
+[34:26.12]yes
+[34:27.12]so you can
+[34:28.12]simulate all of it
+[34:29.12]you can do a lot
+[34:30.12]of stuff
+[34:31.12]when you have an environment
+[34:32.12]we were having dinner
+[34:33.12]for our one year
+[34:34.12]anniversary
+[34:35.12]the other round
+[34:36.12]thank you
+[34:37.12]Raza from human
+[34:38.12]loop was there
+[34:39.12]and we mentioned
+[34:40.12]you were coming on
+[34:41.12]the pod
+[34:42.12]this is our first
+[34:43.12]so he submitted a question
+[34:44.12]now you had
+[34:45.12]gbd4 vision
+[34:46.12]and help you
+[34:47.12]building a lot
+[34:48.12]of those things
+[34:49.12]how do you think
+[34:50.12]about the things
+[34:51.12]that are unique to you
+[34:52.12]as a depth
+[34:53.12]and like going back
+[34:54.12]to like the maybe
+[34:55.12]research direction
+[34:56.12]that you want to take
+[34:57.12]the team and what you
+[34:58.12]want people to come
+[34:59.12]work on at a depth
+[35:00.12]versus what is maybe
+[35:01.12]not become commoditized
+[35:02.12]that you didn't expect
+[35:03.12]everybody would
+[35:04.12]have access to
+[35:05.12]yeah that's
+[35:06.12]a really good question
+[35:07.12]I think implicit
+[35:08.12]in that question
+[35:09.12]and I wish he were
+[35:10.12]tier two so he can
+[35:11.12]push back on my
+[35:12.12]assumption about his
+[35:13.12]questionbut I think
+[35:14.04]is calculus of where
+[35:16.04]does advantage a crew
+[35:18.04]in the overall
+[35:19.04]ML stack
+[35:20.04]and maybe part
+[35:21.04]of the assumption
+[35:22.04]is that advantage
+[35:23.04]a crew is solely
+[35:24.04]to base model scaling
+[35:25.04]but I actually
+[35:26.04]believe pretty strongly
+[35:27.04]that the way
+[35:28.04]that you really
+[35:29.04]win is that you
+[35:30.04]have to go build
+[35:31.04]an agent stack
+[35:32.04]that is much more
+[35:33.04]than that
+[35:34.04]of the base model itself
+[35:35.04]and so I think
+[35:36.04]like that is
+[35:37.04]always going to be
+[35:38.04]a giant advantage
+[35:39.04]of vertical integration
+[35:40.04]I think like
+[35:41.04]it lets us do things
+[35:42.04]like have a really
+[35:43.04]bad cat and dog
+[35:44.04]photo
+[35:45.04]it's pretty good
+[35:46.04]at cat and dog
+[35:47.04]photo
+[35:48.04]it's not like
+[35:49.04]soda at cat
+[35:50.04]and dogphoto
+[35:51.04]so like we're allocating
+[35:52.04]our capacity wisely
+[35:53.04]is like one thing
+[35:54.04]that you
+[35:55.04]really get to do
+[35:56.04]I also think that
+[35:57.04]the other thing
+[35:58.04]that is pretty
+[35:59.04]important now
+[36:00.04]in the broader
+[36:01.04]foundation modeling
+[36:02.04]space is
+[36:03.04]I feel despite any
+[36:04.04]potential concerns
+[36:05.04]about how good
+[36:06.04]is agents as
+[36:07.04]like a startup area
+[36:08.04]like we were talking
+[36:09.04]about earlier
+[36:10.04]I feel super good
+[36:11.04]that we're
+[36:12.04]cap just flowing
+[36:13.04]from can we make
+[36:14.04]a better agent
+[36:15.04]because right now
+[36:16.04]I think we all see
+[36:17.04]that you know
+[36:18.04]if you're training
+[36:19.04]on publicly available
+[36:20.04]web data
+[36:21.04]you put in the
+[36:22.04]flops and you do
+[36:23.04]reasonable things
+[36:24.04]then you get
+[36:25.04]decent results
+[36:26.04]and if you just
+[36:27.04]double the amount
+[36:28.04]of compute
+[36:29.04]then you get
+[36:30.04]predictably
+[36:31.04]better results
+[36:32.04]and so I think
+[36:33.04]pure play foundation
+[36:34.04]model companies
+[36:35.04]are just going to be
+[36:36.04]pinched by how
+[36:37.04]good the next couple
+[36:38.04]lamas are going to be
+[36:39.04]and the next
+[36:40.04]what good open source
+[36:41.04]on these base foundation
+[36:42.04]models I think it's
+[36:43.04]gonna commoditize a lot
+[36:44.04]of the regular llms
+[36:45.04]and soon regular
+[36:46.04]multimodal models
+[36:47.04]so I feel really good
+[36:48.04]that we're just focused
+[36:49.04]on agents so you
+[36:50.04]don't consider yourself
+[36:51.04]a pure play foundation
+[36:52.04]model company no
+[36:53.04]because if we were pure
+[36:54.04]play foundation model
+[36:55.04]company we would be
+[36:56.04]traininggeneral foundation
+[36:57.04]models that do
+[36:58.04]summarization and
+[36:59.04]all this dedicated
+[37:00.04]towards the agent
+[37:01.04]yeah and our business
+[37:02.04]is an agent business
+[37:03.04]we're not here to
+[37:04.04]sell you tokens right
+[37:05.04]and I think like
+[37:06.04]selling tokens unless
+[37:07.04]there's like yeah I
+[37:08.04]love it there's like
+[37:09.04]if you have a particular
+[37:10.04]area of specialty
+[37:11.04]right then you won't
+[37:13.04]get caught in the fact
+[37:14.04]that everyone's just
+[37:15.04]scaling to ridiculous
+[37:16.04]levels of compute
+[37:17.04]but if you don't have a
+[37:18.04]specialty I find that
+[37:19.04]I think it's gonna be
+[37:20.04]a little tougher
+[37:21.04]interesting are you
+[37:22.04]interested in robotics at
+[37:23.04]all just a personally
+[37:24.04]fascinated by robotics
+[37:25.04]have always loved robotics
+[37:26.04]embodied agents as a
+[37:27.04]business you know figure
+[37:28.04]is like a big also
+[37:29.04]so the open ai
+[37:30.04]affiliated company
+[37:31.04]that raises a lot of
+[37:32.04]money I think it's
+[37:33.04]cool I think I mean
+[37:34.04]I don't know exactly
+[37:35.04]what they're exactly
+[37:36.04]what they're doing but
+[37:37.04]robots yeah yeah
+[37:38.04]well I mean that's
+[37:39.04]well Christian
+[37:40.04]would you ask
+[37:41.04]like if we
+[37:42.04]had them on like
+[37:43.04]what would you ask them
+[37:44.04]oh I just want to
+[37:45.04]understand what their
+[37:46.04]overall strategy is
+[37:47.04]gonna be between now
+[37:48.04]and when there's reliable
+[37:49.04]stuff to be deployed
+[37:50.04]but honestly
+[37:51.04]I just don't know
+[37:52.04]enough about it
+[37:53.04]and if I told you
+[37:54.04]hey fire your entire
+[37:55.04]warehouse workforce
+[37:56.04]and you know
+[37:57.04]put robots in there
+[37:58.04]isn't that a strategy
+[37:59.04]oh yeah yeah sorry
+[38:00.04]I'm not questioning
+[38:01.04]whether
+[38:02.04]they're doing smart
+[38:03.04]things I genuinely
+[38:04.04]don't know what
+[38:05.04]they're doing as much
+[38:06.04]but I think there's
+[38:07.04]two things one
+[38:08.04]it's just
+[38:09.04]I think it's
+[38:10.04]just gonna work
+[38:11.04]like I will die
+[38:12.04]on this hill
+[38:13.04]like I mean
+[38:14.04]like again this whole
+[38:15.04]this whole time
+[38:16.04]like we've been on this
+[38:17.04]podcast it's just
+[38:18.04]gonna continually saying
+[38:19.04]these models
+[38:20.04]are basically behavioral
+[38:21.04]cloners right
+[38:22.04]so let's go behavioral
+[38:23.04]clone all this like
+[38:24.04]robot behavior right
+[38:25.04]and then
+[38:26.04]now you figure out
+[38:27.04]everything else
+[38:28.04]you have to do in order
+[38:29.04]to teach you how to
+[38:30.04]solve new problem
+[38:31.04]that's gonna work
+[38:32.04]I'm super stoked for that
+[38:33.04]I think unlike
+[38:34.04]what we're doing with
+[38:35.04]helping humans with
+[38:36.04]knowledge work
+[38:37.04]and I'm personally
+[38:38.04]less excited about that
+[38:39.04]we had a
+[38:40.04]canjun from imbu
+[38:41.04]on the podcast
+[38:42.04]we asked her
+[38:43.04]why people should
+[38:44.04]go work there
+[38:45.04]and not at adept
+[38:46.04]so I wanna
+[38:47.04]well she said
+[38:48.04]you know
+[38:49.04]there's space for everybody
+[38:50.04]in this market
+[38:51.04]we're all doing
+[38:52.04]interesting work
+[38:53.04]and she said
+[38:54.04]they're really excited
+[38:55.04]about building
+[38:56.04]an operating system
+[38:57.04]for agent
+[38:58.04]and for her
+[38:59.04]the biggest research
+[39:00.04]thing was like
+[39:01.04]getting models
+[39:02.04]better reasoning
+[39:03.04]and planning
+[39:04.04]for these agents
+[39:05.04]the reverse question
+[39:06.04]I'm excited to
+[39:07.04]come work at adept
+[39:08.04]instead of imbu
+[39:09.04]and maybe
+[39:10.04]what are like
+[39:11.04]the core research
+[39:12.04]questions
+[39:13.04]that people should
+[39:14.04]be passionate about
+[39:15.04]to have fun at adept
+[39:16.04]yeah first off
+[39:17.04]I think that
+[39:18.04]I'm sure you guys
+[39:19.04]believe this too
+[39:20.04]the AI space
+[39:21.04]to the center
+[39:22.04]there's an AI space
+[39:23.04]and the AI agent
+[39:24.04]space are both
+[39:25.04]exactly as
+[39:26.04]she likely said
+[39:27.04]I think colossal
+[39:28.04]opportunities
+[39:29.04]and people are just
+[39:30.04]going to end up
+[39:31.04]winning in different
+[39:32.04]areas and a lot
+[39:33.04]of companies are
+[39:34.04]going to do well
+[39:35.04]to be at
+[39:36.04]adept
+[39:37.04]I think there's
+[39:38.04]two huge reasons
+[39:39.04]to be at adept
+[39:40.04]I think one of them
+[39:41.04]is everything we do
+[39:42.04]is in the service
+[39:43.04]of like useful agents
+[39:44.04]we're not a
+[39:45.04]research lab
+[39:46.04]we do a lot of research
+[39:47.04]in service of that goal
+[39:48.04]but we don't
+[39:49.04]think about ourselves
+[39:50.04]as like a classic
+[39:51.04]research lab at all
+[39:52.04]and I think the second
+[39:53.04]reason at work at
+[39:54.04]adeptis
+[39:55.04]if you believe that
+[39:56.04]actually having customers
+[39:57.04]and a reward signal
+[39:58.04]from customers
+[39:59.04]lets you build
+[40:00.04]AGI faster
+[40:01.04]which we really believe
+[40:02.04]then you should come here
+[40:03.04]and I think the examples
+[40:04.04]are evaluations
+[40:05.04]they're not
+[40:06.04]academic evals
+[40:07.04]they're not simulator
+[40:08.04]evals
+[40:09.04]they're like
+[40:10.04]okay we have a
+[40:11.04]customer that
+[40:12.04]really needs us to do
+[40:13.04]these particular things
+[40:14.04]we can do some
+[40:15.04]of them
+[40:16.04]these other ones
+[40:17.04]they want us to
+[40:18.04]we can't do them at
+[40:19.04]all we've turned
+[40:20.04]those into evals
+[40:21.04]solve it
+[40:22.04]I think that's
+[40:23.04]really cool
+[40:24.04]like everybody knows
+[40:25.04]a lot of these evals
+[40:26.04]are like
+[40:27.04]pretty saturated
+[40:28.04]and the new ones
+[40:29.04]that even are
+[40:30.04]not saturated you look
+[40:31.04]at someone and you're
+[40:32.04]like is this actually
+[40:33.04]and all of this stuff
+[40:34.04]but they're very grounded
+[40:35.04]and actual needs
+[40:36.04]right now
+[40:37.04]which is really cool
+[40:38.04]yeah this has been
+[40:39.04]wonderful dive
+[40:40.04]I wish we had more time
+[40:41.04]but I'll just leave it
+[40:42.04]kind of open to you
+[40:43.04]I think you have broad thoughts
+[40:44.04]you know just about
+[40:45.04]the agent space
+[40:46.04]but also just general AI
+[40:47.04]space any sort of rants
+[40:48.04]or things that
+[40:49.04]they're just helping
+[40:50.04]might for you right now
+[40:51.04]any rants
+[40:52.04]minding you
+[40:53.04]for just general
+[40:54.04]wow okay
+[40:55.04]so Amelia's already
+[40:56.04]made the rant better
+[40:57.04]than I have
+[40:58.04]but not just
+[40:59.04]not just chatbots
+[41:00.04]is like kind of rant one
+[41:01.04]but the rant two
+[41:02.04]is AI's really been
+[41:03.04]the story of compute
+[41:04.04]and compute plus data
+[41:06.04]and ways in which
+[41:07.04]you could change one
+[41:08.04]for the other
+[41:09.04]and I think as much as
+[41:10.04]our research community
+[41:11.04]is really smart
+[41:12.04]we have made many
+[41:13.04]many advancements
+[41:14.04]and that's going to
+[41:15.04]continue to be important
+[41:16.04]but now I think
+[41:17.04]the game is
+[41:18.04]increasingly changing
+[41:19.04]and the rapid
+[41:20.04]industrialization
+[41:21.04]error has begun
+[41:22.04]and I think
+[41:23.04]we unfortunately
+[41:24.04]have to embrace it
+[41:25.04]excellent awesome David
+[41:26.04]thank you so much
+[41:27.04]for your time
+[41:28.04]cool yeah thanks guys
+[41:29.04]this was fun
+[41:30.04]thank you
+[41:31.04]thank you
+[41:32.04]thank you
+[41:32.04]thank you
+[41:33.04]thank you
+[41:34.04]thank you
+[41:35.04]thank you
+[41:36.04]thank you
+[41:37.04]thank you
+[41:38.04]thank you
+[41:39.04]thank you
+[41:40.04]thank you
+[41:41.04]thank you
+[41:42.04]thank you
+[41:43.04]thank you
+[41:44.04]thank you
+[41:45.04]thank you
+[41:46.04]thank you
+[41:47.04]字幕by索兰娅
+[41:49.04]字幕:J Chong
+[41:50.04]请不吝点赞 订阅 转发 打赏 打赏
diff --git a/content/post/Latent Space/Latent-Space-Why-Google-failed-to-make-GPT-3-+-why-Multimodal-Agents-are-the-path-to-AGI-—-with-David-Luan-of-Adept.md b/content/post/Latent Space/Latent-Space-Why-Google-failed-to-make-GPT-3-+-why-Multimodal-Agents-are-the-path-to-AGI-—-with-David-Luan-of-Adept.md
new file mode 100644
index 0000000..8c8735c
--- /dev/null
+++ b/content/post/Latent Space/Latent-Space-Why-Google-failed-to-make-GPT-3-+-why-Multimodal-Agents-are-the-path-to-AGI-—-with-David-Luan-of-Adept.md
@@ -0,0 +1,4171 @@
+---
+title: Why Google failed to make GPT-3 + why Multimodal Agents are the path to AGI — with David Luan of Adept
+author: Latent Space
+date: Fri, 22 Mar 2024 19:08:16 GMT
+draft: false
+summary: Our next SF event is AI UX 2024 - let’s see the new frontier for UX since last year! Last call we are recording a preview of the AI Engineer World’s Fair with swyx and Ben Dunphy, send any questions ...
+categories: [Latent Space]
+---
+
+{{< aplayer name="Why Google failed to make GPT-3 + why Multimodal Agents are the path to AGI — with David Luan of Adept" artist="Latent Space" url="https://chrt.fm/track/ABF6EF/api.substack.com/feed/podcast/142817627/feb9af168d739c23574154c1dc08a566.mp3" cover="https://substackcdn.com/feed/podcast/1084089/post/142817627/4814924d0de9afab88abe28dab79d6dc.jpg" lrc-folded=true lrc-type=3 lrc="../Latent-Space-Why-Google-failed-to-make-GPT-3-+-why-Multimodal-Agents-are-the-path-to-AGI-—-with-David-Luan-of-Adept.lrc" >}}{{< /aplayer >}}
+
+------
+
+Our next SF event is AI UX 2024 - let’s see the new frontier for UX since last year!
Last call: we are recording a preview of the AI Engineer World’s Fair with swyx and Ben Dunphy, send any questions about Speaker CFPs and Sponsor Guides you have!
Alessio is now hiring engineers for a new startup he is incubating at Decibel: Ideal candidate is an “ex-technical co-founder type”. Reach out to him for more!
David Luan has been at the center of the modern AI revolution: he was the ~30th hire at OpenAI, he led Google's LLM efforts and co-led Google Brain, and then started Adept in 2022, one of the leading companies in the AI agents space. In today's episode, we asked David for some war stories from his time in early OpenAI (including working with Alec Radford ahead of the GPT-2 demo with Sam Altman, that resulted in Microsoft’s initial $1b investment), and how Adept is building agents that can “do anything a human does on a computer" — his definition of useful AGI.
Why Google *couldn’t* make GPT-3
While we wanted to discuss Adept, we couldn’t talk to a former VP Eng of OpenAI and former LLM tech lead at Google Brain and not ask about the elephant in the room.
It’s often asked how Google had such a huge lead in 2017 with Vaswani et al creating the Transformer and Noam Shazeer predicting trillion-parameter models and yet it was David’s team at OpenAI who ended up making GPT 1/2/3.
David has some interesting answers:
“So I think the real story of GPT starts at Google, of course, right? Because that's where Transformers sort of came about. However, the number one shocking thing to me was that, and this is like a consequence of the way that Google is organized…what they (should) have done would be say, hey, Noam Shazeer, you're a brilliant guy. You know how to scale these things up. Here's half of all of our TPUs. And then I think they would have destroyed us. He clearly wanted it too…
You know, every day we were scaling up GPT-3, I would wake up and just be stressed. And I was stressed because, you know, you just look at the facts, right? Google has all this compute. Google has all the people who invented all of these underlying technologies. There's a guy named Noam who's really smart, who's already gone and done this talk about how he wants a trillion parameter model. And I'm just like, we're probably just doing duplicative research to what he's doing. He's got this decoder only transformer that's probably going to get there before we do.
And it turned out the whole time that they just couldn't get critical mass. So during my year where I led the Google LM effort and I was one of the brain leads, you know, it became really clear why. At the time, there was a thing called the Brain Credit Marketplace. Everyone's assigned a credit. So if you have a credit, you get to buy end chips according to supply and demand. So if you want to go do a giant job, you had to convince like 19 or 20 of your colleagues not to do work. And if that's how it works, it's really hard to get that bottom up critical mass to go scale these things. And the team at Google were fighting valiantly, but we were able to beat them simply because we took big swings and we focused.”
Cloning HGI for AGI
Human intelligence got to where it is today through evolution. Some argue that to get to AGI, we will approximate all the “FLOPs” that went into that process, an approach most famously mapped out by Ajeya Cotra’s Biological Anchors report:
The early days of OpenAI were very reinforcement learning-driven with the Dota project, but that's a very inefficient way for these models to re-learn everything. (Kanjun from Imbue shared similar ideas in her episode).
David argues that there’s a shortcut. We can bootstrap from existing intelligence.
“Years ago, I had a debate with a Berkeley professor as to what will it actually take to build AGI. And his view is basically that you have to reproduce all the flops that went into evolution in order to be able to get there… I think we are ignoring the fact that you have a giant shortcut, which is you can behaviorally clone everything humans already know. And that's what we solved with LLMs!”
LLMs today basically model intelligence using all (good!) written knowledge (see our Datasets 101 episode), and have now expanded to non-verbal knowledge (see our HuggingFace episode on multimodality). The SOTA self-supervised pre-training process is surprisingly data-efficient in taking large amounts of unstructured data, and approximating reasoning without overfitting.
But how do you cross the gap from the LLMs of today to building the AGI we all want?
This is why David & friends left to start Adept.
“We believe the clearest framing of general intelligence is a system that can do anything a human can do in front of a computer. A foundation model for actions, trained to use every software tool, API, and webapp that exists, is a practical path to this ambitious goal” — ACT-1 Blogpost
Critical Path: Abstraction with Reliability
The AGI dream is fully autonomous agents, but there are levels to autonomy that we are comfortable giving our agents, based on how reliable they are. In David’s word choice, we always want higher levels of “abstractions” (aka autonomy), but our need for “reliability” is the practical limit on how high of an abstraction we can use.
“The critical path for Adept is we want to build agents that can do a higher and higher level abstraction things over time, all while keeping an insanely high reliability standard. Because that's what turns us from research into something that customers want. And if you build agents with really high reliability standard, but are continuing pushing a level of abstraction, you then learn from your users how to get that next level of abstraction faster. So that's how you actually build the data flow.
That's the critical path for the company. Everything we do is in service of that.”
We saw how Adept thinks about different levels of abstraction at the 2023 Summit:
The highest abstraction is the “AI Employee”, but we’ll get there with “AI enabled employees”. Alessio recently gave a talk about the future of work with “services as software” at this week’s Nvidia GTC (slides).
No APIs
Unlike a lot of large research labs, Adept's framing of AGI as "being able to use your computer like a human" carries with it a useful environmental constraint:
“Having a human robot lets you do things that humans do without changing everything along the way. It's the same thing for software, right? If you go itemize out the number of things you want to do on your computer for which every step has an API, those numbers of workflows add up pretty close to zero. And so then many points along the way, you need the ability to actually control your computer like a human. It also lets you learn from human usage of computers as a source of training data that you don't get if you have to somehow figure out how every particular step needs to be some particular custom private API thing. And so I think this is actually the most practical path (to economic value).”
This realization and conviction means that multimodal modals are the way to go. Instead of using function calling to call APIs to build agents, which is what OpenAI and most of the open LLM industry have done to date, Adept wants to “drive by vision”, (aka see the screen as a human sees it) and pinpoint where to click and type as a human does. No APIs needed, because most software don’t expose APIs.
Extra context for readers: You can see the DeepMind SIMA model in the same light:
One system that learned to play a diverse set of games (instead of one dedicated model per game) using only pixel inputs and keyboard-and-mouse action outputs!
The OpenInterpreter team is working on a “Computer API” that also does the same.
To do this, Adept had to double down on a special kind of multimodality for knowledge work:
“A giant thing that was really necessary is really fast multimodal models that are really good at understanding knowledge work and really good at understanding screens. And that is needs to kind of be the base for some of these agents…
…I think one big hangover primarily academic focus for multimodal models is most multimodal models are primarily trained on like natural images, cat and dog photos, stuff that's come out of the camera… (but) where are they going to be the most useful? They're going to be most useful in knowledge work tasks. That's where the majority of economic value is going to be. It's not in cat and dogs.
And so if that's what it is, what do you need to train? I need to train on like charts, graphs, tables, invoices, PDFs, receipts, unstructured data, UIs. That's just a totally different pre-training corpus. And so Adept spent a lot of time building that.”
With this context, you can now understand the full path of Adept’s public releases:
* ACT-1 (Sept 2022): a large Transformers model optimized for browser interactions. It has a custom rendering of the browser viewport that allows it to better understand it and take actions.
* Persimmon-8B (Sept 2023): a permissive open LLM (weights and code here)
* Fuyu-8B (Oct 2023): a small version of the multimodal model that powers Adept. Vanilla decoder-only transformer with no specialized image encoder, which allows it to handle input images of varying resolutions without downsampling.
* Adept Experiments (Nov 2023): A public tool to build automations in the browser. This is powered by Adept's core technology but it's just a piece of their enterprise platform. They use it as a way to try various design ideas.
* Fuyu Heavy (Jan 2024) - a new multimodal model designed specifically for digital agents and the world’s third-most-capable multimodal model (beating Gemini Pro on MMMU, AI2D, and ChartQA), “behind only GPT4-V and Gemini Ultra, which are 10-20 times bigger”
The Fuyu-8B post in particular exhibits a great number of examples on knowledge work multimodality:
Why Adept is NOT a Research Lab
With OpenAI now worth >$90b and Anthropic >$18b, it is tempting to conclude that the AI startup metagame is to build a large research lab, and attract the brightest minds and highest capital to build AGI.
Our past guests (see the Humanloop episode) and (from Imbue) combined to ask the most challenging questions of the pod - with David/Adept’s deep research pedigree from Deepmind and OpenAI, why is Adept not building more general foundation models (like Persimmon) and playing the academic benchmarks game? Why is Adept so focused on commercial agents instead?
“I feel super good that we're doing foundation models in service of agents and all of the reward within Adept is flowing from “Can we make a better agent”…
… I think pure play foundation model companies are just going to be pinched by how good the next couple of (Meta Llama models) are going to be… And then seeing the really big players put ridiculous amounts of compute behind just training these base foundation models, I think is going to commoditize a lot of the regular LLMs and soon regular multimodal models. So I feel really good that we're just focused on agents.”
and the commercial grounding is his answer to Kanjun too (whom we also asked the inverse question to compare with Adept):
“… the second reason I work at Adept is if you believe that actually having customers and a reward signal from customers lets you build AGI faster, which we really believe, then you should come here. And I think the examples for why that's true is for example, our evaluations are not academic evals. They're not simulator evals. They're like, okay, we have a customer that really needs us to do these particular things. We can do some of them. These are the ones they want us to, we can't do them at all. We've turned those into evals.. I think that's a degree of practicality that really helps.”
And his customers seem pretty happy, because David didn’t need to come on to do a sales pitch:
David: “One of the things we haven't shared before is we're completely sold out for Q1.”
Swyx: “Sold out of what?”
David: “Sold out of bandwidth to onboard more customers.”
Well, that’s a great problem to have.
Show Notes
* David Luan
* Dextro at Data Driven NYC (2015)
* Adept
* ACT-1
* Persimmon-8B
* Adept Experiments
* Fuyu-8B
* $350M Series B announcement
* Amelia Wattenberger talk at AI Engineer Summit
* Figure
Chapters
* [00:00:00] Introductions
* [00:01:14] Being employee #30 at OpenAI and its early days
* [00:13:38] What is Adept and how do you define AGI?
* [00:21:00] Adept's critical path and research directions
* [00:26:23] How AI agents should interact with software and impact product development
* [00:30:37] Analogies between AI agents and self-driving car development
* [00:32:42] Balancing reliability, cost, speed and generality in AI agents
* [00:37:30] Potential of foundation models for robotics
* [00:39:22] Core research questions and reasons to work at Adept
Transcripts
Alessio [00:00:00]: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO in Residence at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol.ai.
Swyx [00:00:15]: Hey, and today we have David Luan, CEO, co-founder of Adept in the studio. Welcome.
David [00:00:20]: Yeah, thanks for having me.
Swyx [00:00:21]: Been a while in the works. I've met you socially at one of those VC events and you said that you were interested in coming on and glad we finally were able to make this happen.
David: Yeah, happy to be part of it.
Swyx: So we like to introduce the speaker and then also just like have you talk a little bit about like what's not on your LinkedIn, what people should just generally know about you. You started a company in college, which was the first sort of real time video detection classification API that was Dextro, and that was your route to getting acquired into Axon where you're a director of AI. Then you were the 30th hire at OpenAI?
David [00:00:53]: Yeah, 30, 35, something around there. Something like that.
Swyx [00:00:56]: So you were VP of Eng for two and a half years to two years, briefly served as tech lead of large models at Google, and then in 2022 started Adept. So that's the sort of brief CV. Is there anything else you like want to fill in the blanks or like people should know more about?
David [00:01:14]: I guess a broader story was I joined OpenAI fairly early and I did that for about two and a half to three years leading engineering there. It's really funny, I think second or third day of my time at OpenAI, Greg and Ilya pulled me in a room and we're like, you know, you should take over our directs and we'll go mostly do IC work. So that was fun, just coalescing a bunch of teams out of a couple of early initiatives that had already happened. The company, the Dota effort was going pretty hard and then more broadly trying to put bigger picture direction around what we were doing with basic research. So I spent a lot of time doing that. And then I led Google's LLM efforts, but also co-led Google Brain was one of the brain leads more broadly. You know, there's been a couple of different eras of AI research, right? If we count everything before 2012 as prehistory, which people hate it when I say that, kind of had this like you and your three best friends write a research paper that changes the world period from like 2012 to 2017. And I think the game changed in 2017 and like most labs didn't realize it, but we at OpenAI really did. I think in large part helped by like Ilya's constant beating of the drum that the world would be covered in data centers. And I think-
Swyx [00:02:15]: It's causally neat.
David [00:02:16]: Yeah. Well, like I think we had conviction in that, but it wasn't until we started seeing results that it became clear that that was where we had to go. But also part of it as well was for OpenAI, like when I first joined, I think one of the jobs that I had to do was how do I tell a differentiated vision for who we were technically compared to, you know, hey, we're just smaller Google Brain, or like you work at OpenAI if you live in SF and don't want to commute to Mountain View or don't want to live in London, right? That's like not enough to like hang your technical identity as a company. And so what we really did was, and I spent a lot of time pushing this, is just how do we get ourselves focused on a certain class of like giant swings and bets, right? Like how do you flip the script from you just do bottom-up research to more about how do you like leave some room for that, but really make it about like, what are the big scientific outcomes that you want to show? And then you just solve them at all costs, whether or not you care about novelty and all that stuff. And that became the dominant model for a couple of years, right? And then what's changed now is I think the number one driver of AI products over the next couple of years is going to be the deep co-design and co-evolution of product and users for feedback and actual technology. And I think labs, every tool to go do that are going to do really well. And that's a big part of why I started Adept.
Alessio [00:03:20]: You mentioned Dota, any memories thinking from like the switch from RL to Transformers at the time and kind of how the industry was evolving more in the LLM side and leaving behind some of the more agent simulation work?
David [00:03:33]: Like zooming way out, I think agents are just absolutely the correct long-term direction, right? You just go to find what AGI is, right? You're like, Hey, like, well, first off, actually, I don't love AGI definitions that involve human replacement because I don't think that's actually how it's going to happen. Even this definition of like, Hey, AGI is something that outperforms humans at economically valuable tasks is kind of implicit view of the world about what's going to be the role of people. I think what I'm more interested in is like a definition of AGI that's oriented around like a model that can do anything a human can do on a computer. If you go think about that, which is like super tractable, then agent is just a natural consequence of that definition. And so what did all the work we did on our own stuff like that get us was it got us a really clear formulation. Like you have a goal and you want to maximize the goal, you want to maximize reward, right? And the natural LLM formulation doesn't come with that out of the box, right? I think that we as a field got a lot right by thinking about, Hey, how do we solve problems of that caliber? And then the thing we forgot is the Novo RL is like a pretty terrible way to get there quickly. Why are we rediscovering all the knowledge about the world? Years ago, I had a debate with a Berkeley professor as to what will it actually take to build AGI. And his view is basically that you have to reproduce all the flops that went into evolution in order to be able to get there. Right.
Swyx [00:04:44]: The biological basis theory. Right.
David [00:04:46]: So I think we are ignoring the fact that you have a giant shortcut, which is you can behavioral clone everything humans already know. And that's what we solved with LLMs. We've solved behavioral cloning, everything that humans already know. Right. So like today, maybe LLMs is like behavioral cloning every word that gets written on the internet in the future, the multimodal models are becoming more of a thing where behavioral cloning the visual world. But really, what we're just going to have is like a universal byte model, right? Where tokens of data that have high signal come in, and then all of those patterns are like learned by the model. And then you can regurgitate any combination now. Right. So text into voice out, like image into other image out or video out or whatever, like these like mappings, right? Like all just going to be learned by this universal behavioral cloner. And so I'm glad we figured that out. And I think now we're back to the era of how do we combine this with all of the lessons we learned during the RL period. That's what's going to drive progress.
Swyx [00:05:35]: I'm still going to pressure you for a few more early opening stories before we turn to the ADET stuff. On your personal site, which I love, because it's really nice, like personal, you know, story context around like your history. I need to update it. It's so old. Yeah, it's so out of date. But you mentioned GPT-2. Did you overlap with GPT-1? I think you did, right?
David [00:05:53]: I actually don't quite remember. I think I was joining right around- Right around then?
Swyx [00:05:57]: I was right around that, yeah. Yeah. So what I remember was Alec, you know, just kind of came in and was like very obsessed with Transformers and applying them to like Reddit sentiment analysis. Yeah, sentiment, that's right. Take us through-
David [00:06:09]: Sentiment neuron, all this stuff.
Swyx [00:06:10]: The history of GPT as far as you know, you know, according to you. Ah, okay.
David [00:06:14]: History of GPT, according to me, that's a pretty good question. So I think the real story of GPT starts at Google, of course, right? Because that's where Transformers sort of came about. However, the number one shocking thing to me was that, and this is like a consequence of the way that Google is organized, where like, again, you and your three best friends write papers, right? Okay. So zooming way out, right? I think about my job when I was a full-time research leader as a little bit of a portfolio allocator, right? So I've got really, really smart people. My job is to convince people to coalesce around a small number of really good ideas and then run them over the finish line. My job is not actually to promote a million ideas and never have critical mass. And then as the ideas start coming together and some of them start working well, my job is to nudge resources towards the things that are really working and then start disbanding some of the things that are not working, right? That muscle did not exist during my time at Google. And I think had they had it, what they would have done would be say, hey, Noam Shazir, you're a brilliant guy. You know how to scale these things up. Here's half of all of our TPUs. And then I think they would have destroyed us. He clearly wanted it too.
Swyx [00:07:17]: He's talking about trillion parameter models in 2017.
David [00:07:20]: Yeah. So that's the core of the GPT story, right? Which is that, and I'm jumping around historically, right? But after GPT-2, we were all really excited about GPT-2. I can tell you more stories about that. It was the last paper that I even got to really touch before everything became more about building a research org. You know, every day we were scaling up GPT-3, I would wake up and just be stressed. And I was stressed because, you know, you just look at the facts, right? Google has all this compute. Google has all the people who invented all of these underlying technologies. There's a guy named Noam who's really smart, who's already gone and done this talk about how he wants a trillion parameter model. And I'm just like, we're probably just doing duplicative research to what he's doing, right? He's got this decoder only transformer that's probably going to get there before we do. And I was like, but like, please just like let this model finish, right? And it turned out the whole time that they just couldn't get critical mass. So during my year where I led the Google LM effort and I was one of the brain leads, you know, it became really clear why, right? At the time, there was a thing called the brain credit marketplace. And did you guys know the brain credit marketplace? No, I never heard of this. Oh, so it's actually, it's a, you can ask any Googler.
Swyx [00:08:23]: It's like just like a thing that, that, I mean, look like, yeah, limited resources, you got to have some kind of marketplace, right? You know, sometimes it's explicit, sometimes it isn't, you know, just political favors.
David [00:08:34]: You could. And so then basically everyone's assigned a credit, right? So if you have a credit, you get to buy end chips according to supply and demand. So if you want to go do a giant job, you had to convince like 19 or 20 of your colleagues not to do work. And if that's how it works, it's really hard to get that bottom up critical mass to go scale these things. And the team at Google were fighting valiantly, but we were able to beat them simply because we took big swings and we focused. And I think, again, that's like part of the narrative of like this phase one of AI, right? Of like this modern AI era to phase two. And I think in the same way, I think phase three company is going to out execute phase two companies because of the same asymmetry of success.
Swyx [00:09:12]: Yeah. I think it's underrated how much NVIDIA works with you in the early days as well. I think maybe, I think it was Jensen. I'm not sure who circulated a recent photo of him delivering the first DGX to you guys.
David [00:09:24]: I think Jensen has been a complete legend and a mastermind throughout. I have so much respect for NVIDIA. It is unreal.
Swyx [00:09:34]: But like with OpenAI, like kind of give their requirements, like co-design it or just work of whatever NVIDIA gave them.
David [00:09:40]: So we work really closely with them. There's, I'm not sure I can share all the stories, but examples of ones that I've found particularly interesting. So Scott Gray is amazing. I really like working with him. He was on one of my teams, the supercomputing team, which Chris Berner runs and Chris Berner still does a lot of stuff in that. As a result, like we had very close ties to NVIDIA. Actually, one of my co-founders at Adept, Eric Elson, was also one of the early GPGPU people. So he and Scott and Brian Catanzaro at NVIDIA and Jonah and Ian at NVIDIA, I think all were very close. And we're all sort of part of this group of how do we push these chips to the absolute limit? And I think that kind of collaboration helped quite a bit. I think one interesting set of stuff is knowing the A100 generation, that like quad sparsity was going to be a thing. Is that something that we want to go look into, right? And figure out if that's something that we could actually use for model training. Really what it boils down to is that, and I think more and more people realize this, six years ago, people, even three years ago, people refused to accept it. This era of AI is really a story of compute. It's really the story of how do you more efficiently map actual usable model flops to compute,
Swyx [00:10:38]: Is there another GPT 2, 3 story that you love to get out there that you think is underappreciated for the amount of work that people put into it?
David [00:10:48]: So two interesting GPT 2 stories. One of them was I spent a good bit of time just sprinting to help Alec get the paper out. And I remember one of the most entertaining moments was we were writing the modeling section. And I'm pretty sure the modeling section was the shortest modeling section of any ML, reasonably legitimate ML paper to that moment. It was like section three model. This is a standard vanilla decoder only transformer with like these particular things, those paragraph long if I remember correctly. And both of us were just looking at the same being like, man, the OGs in the field are going to hate this. They're going to say no novelty. Why did you guys do this work? So now it's funny to look at in hindsight that it was pivotal kind of paper, but I think it was one of the early ones where we just leaned fully into all we care about is solving problems in AI and not about, hey, is there like four different really simple ideas that are cloaked in mathematical language that doesn't actually help move the field forward?
Swyx [00:11:42]: Right. And it's like you innovate on maybe like data set and scaling and not so much the architecture.
David [00:11:48]: We all know how it works now, right? Which is that there's a collection of really hard won knowledge that you get only by being at the frontiers of scale. And that hard won knowledge, a lot of it's not published. A lot of it is stuff that's actually not even easily reducible to what looks like a typical academic paper. But yet that's the stuff that helps differentiate one scaling program from another. You had a second one? So the second one is, there's like some details here that I probably shouldn't fully share, but hilariously enough for the last meeting we did with Microsoft before Microsoft invested in OpenAI, Sam Altman, myself and our CFO flew up to Seattle to do the final pitch meeting. And I'd been a founder before. So I always had a tremendous amount of anxiety about partner meetings, which this basically this is what it was. I had Kevin Scott and Satya and Amy Hood, and it was my job to give the technical slides about what's the path to AGI, what's our research portfolio, all of this stuff, but it was also my job to give the GPT-2 demo. We had a slightly bigger version of GPT-2 that we had just cut maybe a day or two before this flight up. And as we all know now, model behaviors you find predictable at one checkpoint are not predictable in another checkpoint. And so I'd spent all this time trying to figure out how to keep this thing on rails. I had my canned demos, but I knew I had to go turn it around over to Satya and Kevin and let them type anything in. And that just, that really kept me up all night.
Swyx [00:13:06]: Nice. Yeah.
Alessio [00:13:08]: I mean, that must have helped you talking about partners meeting. You raised $420 million for Adept. The last round was a $350 million Series B, so I'm sure you do great in partner meetings.
Swyx [00:13:18]: Pitchers meetings. Nice.
David [00:13:20]: No, that's a high compliment coming from a VC.
Alessio [00:13:22]: Yeah, no, I mean, you're doing great already for us. Let's talk about Adept. And we were doing pre-prep and you mentioned that maybe a lot of people don't understand what Adept is. So usually we try and introduce the product and then have the founders fill in the blanks, but maybe let's do the reverse. Like what is Adept? Yeah.
David [00:13:38]: So I think Adept is the least understood company in the broader space of foundational models plus agents. So I'll give some color and I'll explain what it is and I'll explain also why it's actually pretty different from what people would have guessed. So the goal for Adept is we basically want to build an AI agent that can do, that can basically help humans do anything a human does on a computer. And so what that really means is we want this thing to be super good at turning natural language like goal specifications right into the correct set of end steps and then also have all the correct sensors and actuators to go get that thing done for you across any software tool that you already use. And so the end vision of this is effectively like I think in a couple of years everyone's going to have access to like an AI teammate that they can delegate arbitrary tasks to and then also be able to, you know, use it as a sounding board and just be way, way, way more productive. Right. And just changes the shape of every job from something where you're mostly doing execution to something where you're mostly actually doing like these core liberal arts skills of what should I be doing and why. Right. And I find this like really exciting and motivating because I think it's actually a pretty different vision for how AGI will play out. I think systems like Adept are the most likely systems to be proto-AGIs. But I think the ways in which we are really counterintuitive to everybody is that we've actually been really quiet because we are not a developer company. We don't sell APIs. We don't sell open source models. We also don't sell bottom up products. We're not a thing that you go and click and download the extension and like we want more users signing up for that thing. We're actually an enterprise company. So what we do is we work with a range of different companies, some like late stage multi-thousand people startups, some fortune 500s, et cetera. And what we do for them is we basically give them an out of the box solution where big complex workflows that their employees do every day could be delegated to the model. And so we look a little different from other companies in that in order to go build this full agent thing, the most important thing you got to get right is reliability. So initially zooming way back when, one of the first things that DEP did was we released this demo called Act One, right? Act One was like pretty cool. It's like kind of become a hello world thing for people to show agent demos by going to Redfin and asking to buy a house somewhere because like we did that in the original Act One demo and like showed that, showed like Google Sheets, all this other stuff. Over the last like year since that has come out, there's been a lot of really cool demos and you go play with them and you realize they work 60% of the time. But since we've always been focused on how do we build an amazing enterprise product, enterprises can't use anything that isn't in the nines of reliability. And so we've actually had to go down a slightly different tech tree than what you might find in the prompt engineering sort of plays in the agent space to get that reliability. And we've decided to prioritize reliability over all else. So like one of our use cases is crazy enough that it actually ends with a physical truck being sent to a place as the result of the agent workflow. And if you're like, if that works like 60% of the time, you're just blowing money and poor truck drivers going places.
Alessio [00:16:30]: Interesting. One of the, our investment teams has this idea of services as software. I'm actually giving a talk at NVIDIA GTC about this, but basically software as a service, you're wrapping user productivity in software with agents and services as software is replacing things that, you know, you would ask somebody to do and the software just does it for you. When you think about these use cases, do the users still go in and look at the agent kind of like doing the things and can intervene or like are they totally removed from them? Like the truck thing is like, does the truck just show up or are there people in the middle checking in?
David [00:17:04]: I think there's two current flaws in the framing for services as software, or I think what you just said. I think that one of them is like in our experience, as we've been rolling out Adept, the people who actually do the jobs are the most excited about it because they don't go from, I do this job to, I don't do this job. They go from, I do this job for everything, including the shitty rote stuff to I'm a supervisor. And I literally like, it's pretty magical when you watch the thing being used because now it parallelizes a bunch of the things that you had to do sequentially by hand as a human. And you can just click into any one of them and be like, Hey, I want to watch the trajectory that the agent went through to go solve this. And the nice thing about agent execution as opposed to like LLM generations is that a good chunk of the time when the agent fails to execute, it doesn't give you the wrong result. It just fails to execute. And the whole trajectory is just broken and dead and the agent knows it, right? So then those are the ones that the human then goes and solves. And so then they become a troubleshooter. They work on the more challenging stuff. They get way, way more stuff done and they're really excited about it. I think the second piece of it that we've found is our strategy as a company is to always be an augmentation company. And I think one out of principle, that's something we really care about. But two, actually, if you're framing yourself as an augmentation company, you're always going to live in a world where you're solving tasks that are a little too hard for what the model can do today and still needs a human to provide oversight, provide clarifications, provide human feedback. And that's how you build a data flywheel. That's how you actually learn from the smartest humans how to solve things models can't do today. And so I actually think that being an augmentation company forces you to go develop your core AI capabilities faster than someone who's saying, ah, okay, my job is to deliver you a lights off solution for X.
Alessio [00:18:42]: Yeah. It's interesting because we've seen two parts of the market. One is we have one company that does agents for SOC analysts. People just don't have them, you know, and just they cannot attract the talent to do it. And similarly, in a software development, you have Copilot, which is the augmentation product, and then you have sweep.dev and you have these products, which they just do the whole thing. I'm really curious to see how that evolves. I agree that today the reliability is so important in the enterprise that they just don't use most of them. Yeah. Yeah. No, that's cool. But it's great to hear the story because I think from the outside, people are like, oh, a dev, they do Act One, they do Persimon, they do Fuyu, they do all this stuff. Yeah, it's just the public stuff.
Swyx [00:19:20]: It's just public stuff.
David [00:19:21]: So one of the things we haven't shared before is we're completely sold out for Q1. And so I think...
Swyx [00:19:26]: Sold out of what?
David [00:19:27]: Sold out of bandwidth to go on board more customers. And so we're like working really hard to go make that less of a bottleneck, but our expectation is that I think we're going to be significantly more public about the broader product shape and the new types of customers we want to attract later this year. So I think that clarification will happen by default.
Swyx [00:19:43]: Why have you become more public? You know, if the whole push has... You're sold out, you're my enterprise, but you're also clearly putting effort towards being more open or releasing more things.
David [00:19:53]: I think we just flipped over that way fairly recently. That's a good question. I think it actually boils down to two things. One, I think that, frankly, a big part of it is that the public narrative is really forming around agents as being the most important thing. And I'm really glad that's happening because when we started the company in January 2022, everybody in the field knew about the agents thing from RL, but the general public had no conception of what it was. They were still hanging their narrative hat on the tree of everything's a chatbot. And so I think now one of the things that I really care about is that when people think agent, they actually think the right thing. All sorts of different things are being called agents. Chatbots are being called agents. Things that make a function call are being called agents. To me, an agent is something that you can give a goal and get an end step workflow done correctly in the minimum number of steps. And so that's a big part of why. And I think the other part is because I think it's always good for people to be more aware of Redept as they think about what the next thing they want to do in their careers. The field is quickly pivoting in a world where foundation models are looking more and more commodity. And I think a huge amount of gain is going to happen from how do you use foundation models as the well-learned behavioral cloner to go solve agents. And I think people who want to do agents research should really come to Redept.
Swyx [00:21:00]: When you say agents have become more part of the public narrative, are there specific things that you point to? I'll name a few. Bill Gates in his blog post mentioning that agents are the future. I'm the guy who made OSes, and I think agents are the next thing. So Bill Gates, I'll call that out. And then maybe Sam Altman also saying that agents are the future for open AI.
David [00:21:17]: I think before that even, I think there was something like the New York Times, Cade Metz wrote a New York Times piece about it. Right now, in a bit to differentiate, I'm seeing AI startups that used to just brand themselves as an AI company, but now brand themselves as an AI agent company. It's just like, it's a term I just feel like people really want.
Swyx [00:21:31]: From the VC side, it's a bit mixed. Is it? As in like, I think there are a lot of VCs where like, I would not touch any agent startups because like- Why is that? Well, you tell me.
Alessio [00:21:41]: I think a lot of VCs that are maybe less technical don't understand the limitations of the-
Swyx [00:21:46]: No, that's not fair.
Alessio [00:21:47]: No, no, no, no. I think like- You think so? No, no. I think like the, what is possible today and like what is worth investing in, you know? And I think like, I mean, people look at you and say, well, these guys are building agents. They needed 400 million to do it. So a lot of VCs are maybe like, oh, I would rather invest in something that is tacking on AI to an existing thing, which is like easier to get the market and kind of get some of the flywheel going. But I'm also surprised a lot of funders just don't want to do agents. It's not even the funding. Sometimes we look around and it's like, why is nobody doing agents for X? Wow.
David [00:22:17]: That's good to know actually. I never knew that before. My sense from my limited perspective is there's a new agent company popping up every day.
Swyx [00:22:24]: So maybe I'm- They are. They are. But like I have advised people to take agents off of their title because it's so diluted.
David [00:22:31]: It's now so diluted.
Swyx [00:22:32]: Yeah. So then it doesn't stand for anything. Yeah.
David [00:22:35]: That's a really good point.
Swyx [00:22:36]: So like, you know, you're a portfolio allocator. You have people know about Persimmon, people know about Fuyu and Fuyu Heavy. Can you take us through like how you think about that evolution of that and what people should think about what that means for adepts and sort of research directions? Kind of take us through the stuff you shipped recently and how people should think about the trajectory of what you're doing.
David [00:22:56]: The critical path for adepts is we want to build agents that can do a higher and higher level abstraction things over time, all while keeping an insanely high reliability standard. Because that's what turns us from research into something that customers want. And if you build agents with really high reliability standard, but are continuing pushing a level of abstraction, you then learn from your users how to get that next level of abstraction faster. So that's how you actually build the data flow. That's the critical path for the company. Everything we do is in service of that. So if you go zoom way, way back to Act One days, right? Like the core thing behind Act One is can we teach large model basically how to even actuate your computer? And I think we're one of the first places to have solved that and shown it and shown the generalization that you get when you give it various different workflows and texts. But I think from there on out, we really realized was that in order to get reliability, companies just do things in various different ways. You actually want these models to be able to get a lot better at having some specification of some guardrails for what it actually should be doing. And I think in conjunction with that, a giant thing that was really necessary is really fast multimodal models that are really good at understanding knowledge work and really good at understanding screens. And that is needs to kind of be the base for some of these agents. Back then we had to do a ton of research basically on how do we actually make that possible? Well, first off, like back in forgot exactly one month to 23, like there were no multimodal models really that you could use for things like this. And so we pushed really hard on stuff like the Fuyu architecture. I think one big hangover primarily academic focus for multimodal models is most multimodal models are primarily trained on like natural images, cat and dog photos, stuff that's come out of the camera. Coco. Yeah, right. And the Coco is awesome. Like I love Coco. I love TY. Like it's really helped the field. Right. But like that's the build one thing. I actually think it's really clear today. Multimodal models are the default foundation model, right? It's just going to supplant LLMs. Like you just train a giant multimodal model. And so for that though, like where are they going to be the most useful? They're going to be most useful in knowledge work tasks. That's where the majority of economic value is going to be. It's not in cat and dogs. Right. And so if that's what it is, what do you need to train? I need to train on like charts, graphs, tables, invoices, PDFs, receipts, unstructured data, UIs. That's just a totally different pre-training corpus. And so a depth spent a lot of time building that. And so the public for use and stuff aren't trained on our actual corpus, it's trained on some other stuff. But you take a lot of that data and then you make it really fast and make it really good at things like dense OCR on screens. And then now you have the right like raw putty to go make a good agent. So that's kind of like some of the modeling side, we've kind of only announced some of that stuff. We haven't really announced much of the agent's work, but that if you put those together with the correct product form factor, and I think the product form factor also really matters. I think we're seeing, and you guys probably see this a little bit more than I do, but we're seeing like a little bit of a pushback against the tyranny of chatbots as form factor. And I think that the reason why the form factor matters is the form factor changes what data you collect in the human feedback loop. And so I think we've spent a lot of time doing full vertical integration of all these bits in order to get to where we are.
Swyx [00:25:44]: Yeah. I'll plug Amelia Wattenberger’s talk at our conference, where she gave a little bit of the thinking behind like what else exists other than chatbots that if you could delegate to reliable agents, you could do. I was kind of excited at Adept experiments or Adept workflows, I don't know what the official name for it is. I was like, okay, like this is something I can use, but it seems like it's just an experiment for now. It's not your product.
David [00:26:06]: So you basically just use experiments as like a way to go push various ideas on the design side to some people and just be like, yeah, we'll play with it. Actually the experiments code base underpins the actual product, but it's just the code base itself is kind of like a skeleton for us to go deploy arbitrary cards on the side.
Swyx [00:26:22]: Yeah.
Alessio [00:26:23]: Makes sense. I was going to say, I would love to talk about the interaction layer. So you train a model to see UI, but then there's the question of how do you actually act on the UI? I think there was some rumors about open app building agents that are kind of like, they manage the end point. So the whole computer, you're more at the browser level. I read in one of your papers, you have like a different representation, kind of like you don't just take the dome and act on it. You do a lot more stuff. How do you think about the best way the models will interact with the software and like how the development of products is going to change with that in mind as more and more of the work is done by agents instead of people?
David [00:26:58]: This is, there's so much surface area here and it's actually one of the things I'm really excited about. And it's funny because I've spent most of my time doing research stuff, but there's like a whole new ball game that I've been learning about and I find it really cool. So I would say the best analogy I have to why Adept is pursuing a path of being able to use your computer like a human, plus of course being able to call APIs and being able to call APIs is the easy part, like being able to use your computer like a human is a hard part. It's in the same way why people are excited about humanoid robotics, right? In a world where you had T equals infinity, right? You're probably going to have various different form factors that robots could just be in and like all the specialization. But the fact is that humans live in a human environment. So having a human robot lets you do things that humans do without changing everything along the way. It's the same thing for software, right? If you go itemize out the number of things you want to do on your computer for which every step has an API, those numbers of workflows add up pretty close to zero. And so then many points along the way, you need the ability to actually control your computer like a human. It also lets you learn from human usage of computers as a source of training data that you don't get if you have to somehow figure out how every particular step needs to be some particular custom private API thing. And so I think this is actually the most practical path. I think because it's the most practical path, I think a lot of success will come from going down this path. I kind of think about this early days of the agent interaction layer level is a little bit like, do you all remember Windows 3.1? Like those days? Okay, this might be, I might be, I might be too old for you guys on this. But back in the day, Windows 3.1, we had this transition period between pure command line, right? Being the default into this new world where the GUI is the default and then you drop into the command line for like programmer things, right? The old way was you booted your computer up, DOS booted, and then it would give you the C colon slash thing. And you typed Windows and you hit enter, and then you got put into Windows. And then the GUI kind of became a layer above the command line. The same thing is going to happen with agent interfaces is like today we'll be having the GUI is like the base layer. And then the agent just controls the current GUI layer plus APIs. And in the future, as more and more trust is built towards agents and more and more things can be done by agents, if more UIs for agents are actually generative in and of themselves, then that just becomes a standard interaction layer. And if that becomes a standard interaction layer, what changes for software is that a lot of software is going to be either systems or record or like certain customized workflow execution engines. And a lot of how you actually do stuff will be controlled at the agent layer.
Alessio [00:29:19]: And you think the rabbit interface is more like it would like you're not actually seeing the app that the model interacts with. You're just saying, hey, I need to log this call on Salesforce. And you're never actually going on salesforce.com directly as the user. I can see that being a model.
David [00:29:33]: I think I don't know enough about what using rabbit in real life will actually be like to comment on that particular thing. But I think the broader idea that, you know, you have a goal, right? The agent knows how to break your goal down into steps. The agent knows how to use the underlying software and systems or record to achieve that goal for you. The agent maybe presents you information in a custom way that's only relevant to your particular goal, all just really leads to a world where you don't really need to ever interface with the apps underneath unless you're a power user for some niche thing.
Swyx [00:30:03]: General question. So first of all, I think like the sort of input mode conversation. I wonder if you have any analogies that you like with self-driving, because I do think like there's a little bit of how the model should perceive the world. And you know, the primary split in self-driving is LiDAR versus camera. And I feel like most agent companies that I'm tracking are all moving towards camera approach, which is like the multimodal approach, you know, multimodal vision, very heavy vision, all the Fuyu stuff that you're doing. You're focusing on that, including charts and tables. And do you find that inspiration there from like the self-driving world? That's a good question.
David [00:30:37]: I think sometimes the most useful inspiration I've found from self-driving is the levels analogy. I think that's awesome. But I think that our number one goal is for agents not to look like self-driving. We want to minimize the chances that agents are sort of a thing that you just have to bang your head at for a long time to get to like two discontinuous milestones, which is basically what's happened in self-driving. We want to be living in a world where you have the data flywheel immediately, and that takes you all the way up to the top. But similarly, I mean, compared to self-driving, like two things that people really undervalue is like really easy to driving a car down highway 101 in a sunny day demo. That actually doesn't prove anything anymore. And I think the second thing is that as a non-self-driving expert, I think one of the things that we believe really strongly is that everyone undervalues the importance of really good sensors and actuators. And actually a lot of what's helped us get a lot of reliability is a really strong focus on actually why does the model not do this thing? And the non-trivial amount of time, the time the model doesn't actually do the thing is because if you're a wizard of ozzing it yourself, or if you have unreliable actuators, you can't do the thing. And so we've had to fix a lot of those problems.
Swyx [00:31:43]: I was slightly surprised just because I do generally consider the way most that we see all around San Francisco as the most, I guess, real case of agents that we have in very material ways.
David [00:31:55]: Oh, that's absolutely true. I think they've done an awesome job, but it has taken a long time for self-driving to mature from when it entered the consciousness and the driving down 101 on a sunny day moment happened to now. Right. So I want to see that more compressed.
Swyx [00:32:07]: And I mean, you know, cruise, you know, RIP. And then one more thing on just like, just going back on this reliability thing, something I have been holding in my head that I'm curious to get your commentary on is I think there's a trade-off between reliability and generality, or I want to broaden reliability into just general like sort of production readiness and enterprise readiness scale. Because you have reliability, you also have cost, you have speed, speed is a huge emphasis for a debt. The tendency or the temptation is to reduce generality to improve reliability and to improve cost, improve speed. Do you perceive a trade-off? Do you have any insights that solve those trade-offs for you guys?
David [00:32:42]: There's definitely a trade-off. If you're at the Pareto frontier, I think a lot of folks aren't actually at the Pareto frontier. I think the way you get there is basically how do you frame the fundamental agent problem in a way that just continues to benefit from data? I think one of the main ways of being able to solve that particular trade-off is you basically just want to formulate the problem such that every particular use case just looks like you collecting more data to go make that use case possible. I think that's how you really solve. Then you get into the other problems like, okay, are you overfitting on these end use cases? You're not doing a thing where you're being super prescriptive for the end steps that the model can only do, for example.
Swyx [00:33:17]: Then the question becomes, do you have one house model that you can then customize for each customer and you're fine-tuning them on each customer's specific use case?
David [00:33:25]: Yeah.
Swyx [00:33:26]: We're not sharing that. You're not sharing that. It's tempting, but that doesn't look like AGI to me. You know what I mean? That is just you have a good base model and then you fine-tune it.
David [00:33:35]: For what it's worth, I think there's two paths to a lot more capability coming out of the models that we all are training these days. I think one path is you figure out how to spend, compute, and turn it into data. In that path, I consider search, RL, all the things that we all love in this era as part of that path, like self-play, all that stuff. The second path is how do you get super competent, high intelligence demonstrations from humans? I think the right way to move forward is you kind of want to combine the two. The first one gives you maximum sample efficiency for a little second, but I think that it's going to be hard to be running at max speed towards AGI without actually solving a bit of both.
Swyx [00:34:16]: You haven't talked much about synthetic data, as far as I can tell. Probably this is a bit too much of a trend right now, but any insights on using synthetic data to augment the expensive human data?
David [00:34:26]: The best part about framing AGI as being able to help people do things on computers is you have an environment.
Swyx [00:34:31]: Yes. So you can simulate all of it.
David [00:34:35]: You can do a lot of stuff when you have an environment.
Alessio [00:34:37]: We were having dinner for our one-year anniversary. Congrats. Yeah. Thank you. Raza from HumanLoop was there, and we mentioned you were coming on the pod. This is our first-
Swyx [00:34:45]: So he submitted a question.
Alessio [00:34:46]: Yeah, this is our first, I guess, like mailbag question. He asked, when you started GPD 4 Data and Exist, now you have a GPD 4 vision and help you building a lot of those things. How do you think about the things that are unique to you as Adept, and like going back to like the maybe research direction that you want to take the team and what you want people to come work on at Adept, versus what is maybe now become commoditized that you didn't expect everybody would have access to?
David [00:35:11]: Yeah, that's a really good question. I think implicit in that question, and I wish he were tier two so he can push back on my assumption about his question, but I think implicit in that question is calculus of where does advantage accrue in the overall ML stack. And maybe part of the assumption is that advantage accrues solely to base model scaling. But I actually believe pretty strongly that the way that you really win is that you have to go build an agent stack that is much more than that of the base model itself. And so I think like that is always going to be a giant advantage of vertical integration. I think like it lets us do things like have a really, really fast base model, is really good at agent things, but is bad at cat and dog photos. It's pretty good at cat and dog photos. It's not like soda at cat and dog photos, right? So like we're allocating our capacity wisely, right? That's like one thing that you really get to do. I also think that the other thing that is pretty important now in the broader foundation modeling space is I feel despite any potential concerns about how good is agents as like a startup area, right? Like we were talking about earlier, I feel super good that we're doing foundation models in service of agents and all of the reward within Adept is flowing from can we make a better agent? Because right now I think we all see that, you know, if you're training on publicly available web data, you put in the flops and you do reasonable things, then you get decent results. And if you just double the amount of compute, then you get predictably better results. And so I think pure play foundation model companies are just going to be pinched by how good the next couple of llamas are going to be and the next what good open source thing. And then seeing the really big players put ridiculous amounts of compute behind just training these base foundation models, I think is going to commoditize a lot of the regular LLMs and soon regular multimodal models. So I feel really good that we're just focused on agents.
Swyx [00:36:56]: So you don't consider yourself a pure play foundation model company?
David [00:36:59]: No, because if we were a pure play foundation model company, we would be training general foundation models that do summarization and all this other...
Swyx [00:37:06]: You're dedicated towards the agent. Yeah.
David [00:37:09]: And our business is an agent business. We're not here to sell you tokens, right? And I think like selling tokens, unless there's like a...
Swyx [00:37:14]: Not here to sell you tokens. I love it.
David [00:37:16]: It's like if you have a particular area of specialty, right? Then you won't get caught in the fact that everyone's just scaling to ridiculous levels of compute. But if you don't have a specialty, I find that, I think it's going to be a little tougher.
Swyx [00:37:27]: Interesting. Are you interested in robotics at all? Just a...
David [00:37:30]: I'm personally fascinated by robotics. I've always loved robotics.
Swyx [00:37:33]: Embodied agents as a business, you know, Figure is like a big, also sort of open AI affiliated company that raises a lot of money.
David [00:37:39]: I think it's cool. I think, I mean, I don't know exactly what they're doing, but...
Swyx [00:37:44]: Robots. Yeah.
David [00:37:46]: Well, I mean, that's a...
Swyx [00:37:47]: Yeah. What question would you ask? If we had them on, what would you ask them?
David [00:37:50]: Oh, I just want to understand what their overall strategy is going to be between now and when there's reliable stuff to be deployed. But honestly, I just don't know enough about it.
Swyx [00:37:57]: And if I told you, hey, fire your entire warehouse workforce and, you know, put robots in there, isn't that a strategy? Oh yeah.
David [00:38:04]: Yeah. Sorry. I'm not questioning whether they're doing smart things. I genuinely don't know what they're doing as much, but I think there's two things. One, I'm so excited for someone to train a foundation model of robots. It's just, I think it's just going to work. Like I will die on this hill, but I mean, like again, this whole time, like we've been on this podcast, we're just going to continually saying these models are basically behavioral cloners. Right. So let's go behavioral clone all this like robot behavior. Right. And then you figure out everything else you have to do in order to teach you how to solve a new problem. That's going to work. I'm super stoked for that. I think unlike what we're doing with helping humans with knowledge work, it just sounds like a more zero sum job replacement play. Right. And I'm personally less excited about that.
Alessio [00:38:46]: We had a Ken June from InBoo on the podcast. We asked her why people should go work there and not at Adept.
Swyx [00:38:52]: Oh, that's so funny.
Alessio [00:38:54]: Well, she said, you know, there's space for everybody in this market. We're all doing interesting work. And she said, they're really excited about building an operating system for agent. And for her, the biggest research thing was like getting models, better reasoning and planning for these agents. The reverse question to you, you know, why should people be excited to come work at Adept instead of InBoo? And maybe what are like the core research questions that people should be passionate about to have fun at Adept? Yeah.
David [00:39:22]: First off, I think that I'm sure you guys believe this too. The AI space to the extent there's an AI space and the AI agent space are both exactly as she likely said, I think colossal opportunities and people are just going to end up winning in different areas and a lot of companies are going to do well. So I really don't feel that zero something at all. I would say to like change the zero sum framing is why should you be at Adept? I think there's two huge reasons to be at Adept. I think one of them is everything we do is in the service of like useful agents. We're not a research lab. We do a lot of research in service of that goal, but we don't think about ourselves as like a classic research lab at all. And I think the second reason I work at Adept is if you believe that actually having customers and a reward signal from customers lets you build a GI faster, which we really believe, then you should come here. And I think the examples for why that's true is for example, our evaluations, they're not academic evals. They're not simulator evals. They're like, okay, we have a customer that really needs us to do these particular things. We can do some of them. These are the ones they want us to, we can't do them at all. We've turned those into evals, solve it, right? I think that's really cool. Like everybody knows a lot of these evals are like pretty saturated and the new ones that even are not saturated. You look at someone and you're like, is this actually useful? Right? I think that's a degree of practicality that really helps. Like we're equally excited about the same problems around reasoning and planning and generalization and all of this stuff. They're very grounded in actual needs right now, which is really cool.
Swyx [00:40:45]: Yeah. This has been a wonderful dive. You know, I wish we had more time, but I would just leave it kind of open to you. I think you have broad thoughts, you know, just about the agent space, but also just in general AI space. Any, any sort of rants or things that are just off of mind for you right now?
David [00:40:57]: Any rants?
Swyx [00:40:59]: Mining you for just general...
David [00:41:01]: Wow. Okay. So Amelia has already made the rant better than I have, but, but like not just, not just chatbots is like kind of rant one. And two is AI has really been the story of compute and compute plus data and ways in which you could change one for the other. And I think as much as our research community is really smart, we have made many, many advancements and that's going to continue to be important. But now I think the game is increasingly changing and the rapid industrialization era has begun. And I think we unfortunately have to embrace it.
Swyx [00:41:30]: Yep.
Alessio [00:41:31]: Excellent. Awesome, David. Thank you so much for your time.
David [00:41:34]: Cool. Thanks guys.
Get full access to Latent Space at www.latent.space/subscribe
+
+[by:whisper.cpp]
+
+[00:00.00]大家好,歡迎大家來到「Lit and Space Pockest」
+
+[00:02.50]我是Alessio,會員,和CTO在職業的職業會員
+
+[00:05.74]我是Makojo Swicks, founder of SmallAI
+
+[00:08.84]今天我們有David Luan, co-founder of ADEPT,在工作室,歡迎
+
+[00:12.98]謝謝你
+
+[00:14.10]一段時間在工作,我遇到你在VC的社交平台上
+
+[00:17.98]你也說了,你很興奮,我們終於能夠做到這件事了
+
+[00:21.88]對,很高興認識你
+
+[00:23.88]我們想介紹你的職業,然後再說一下你剛才說了什麼,在你的連結,什麼人應該知道你
+
+[00:32.02]你開始了一間公司,是第一次在實際視頻的視頻研究,例如DEXTRO,那是你的路,在你導致的AI,你開始了XON,然後你開始了30年,你開始了OpenAI?
+
+[00:47.06]對,30、35年,或是在那裡,或是在那裡,VP Avenge,兩年半,兩年半後,
+
+[00:53.48]我們在2022年開始了一個大型模式的創新
+
+[00:57.08]然後在2022年開始了一個大型模式的創新
+
+[01:00.32]所以那是一個短暫的CV
+
+[01:02.98]是否有其他東西?
+
+[01:03.98]對,是否有其他東西?
+
+[01:04.98]你覺得要做什麼?
+
+[01:05.98]或是人們應該知道更多?
+
+[01:07.98]我猜是一個比較大的故事
+
+[01:09.48]是加入OpenAI比較早期的
+
+[01:11.98]然後就做了兩、三個月的研究
+
+[01:15.48]那是很有趣的
+
+[01:16.48]第二或第三天的我的時間在OpenAI
+
+[01:18.98]Gregg and Ilya 找我住在房間,我們說要拿到我們的創新,我們會去…
+
+[01:23.98]我看過很多創新的工作
+
+[01:25.98]所以那是很有趣的
+
+[01:26.98]就在結合了一堆團隊
+
+[01:28.98]有幾個早期的領導人已經有了
+
+[01:30.98]公司的資料項目是很努力的
+
+[01:32.98]然後再多次地在大型研究中放大型的圖案
+
+[01:35.98]我們在做基本研究
+
+[01:36.98]所以我花了很多時間在做這個
+
+[01:37.98]然後我再加上Google的LM項目
+
+[01:39.98]但也加上Google的Brain
+
+[01:41.98]是一個Brain的領導人,更多次地
+
+[01:42.98]你知道,有幾個不同的領導人在AI的研究
+
+[01:46.98]我們在2012 before prehistory
+
+[01:48.98]很多人很討厭我
+
+[01:50.98]我跟你們三個最好的朋友
+
+[01:51.98]寫了一個研究的文件
+
+[01:53.98]從2012到2017
+
+[01:56.98]我覺得遊戲的改善在2017
+
+[01:58.98]然後很多學生都沒有發現
+
+[01:59.98]但是我們在OpenAI上真的做了
+
+[02:01.98]我想大部分的幫助是
+
+[02:02.98]Ilya的 constant beating of the drum
+
+[02:04.98]讓世界被遮蓋在data centers
+
+[02:06.98]還有其他人需要…
+
+[02:07.98]對,我覺得我們有確定在那裡
+
+[02:10.98]但沒有到我們開始看到
+
+[02:11.98]結果的結果,那是我們要去的
+
+[02:14.98]但也有一個部分
+
+[02:15.98]是在OpenAI上
+
+[02:16.98]我第一次加入
+
+[02:17.98]我認為一件事我必須要做
+
+[02:19.98]是如何告訴我們
+
+[02:20.98]我們是否有不同的觀點
+
+[02:22.98]比起我們是更小的GoogleBrain
+
+[02:25.98]或是我們在OpenAI上
+
+[02:26.98]只要生活在SF
+
+[02:27.98]然後不想接受Mountain View
+
+[02:28.98]或不想要生活在London
+
+[02:29.98]那是不足夠的
+
+[02:31.98]利用你的技術活動
+
+[02:33.98]所以我們真的…
+
+[02:34.98]我花了很多時間在推廣這個
+
+[02:36.98]就是我們要怎麼
+
+[02:37.98]要專注在
+
+[02:38.98]一個大學生的大學生
+
+[02:41.98]你從最底下的研究
+
+[02:44.98]變成了
+
+[02:45.98]如何讓你放棄這個環境
+
+[02:47.98]而讓你覺得
+
+[02:48.98]什麼是大學生的大學生
+
+[02:50.98]想要展現
+
+[02:51.98]然後你把他們解決
+
+[02:52.98]所有的財困
+
+[02:53.98]不管是否要在創意
+
+[02:54.98]創作什麼
+
+[02:55.98]這就變成了
+
+[02:56.98]大學生的大學生
+
+[02:57.98]對嗎
+
+[02:58.98]然後現在的改變
+
+[02:59.98]是我認為
+
+[03:00.98]第一次加入AiPrice
+
+[03:01.98]在下一幾年
+
+[03:02.98]會是最深的
+
+[03:03.98] co-design
+
+[03:04.98]和 co-evolution
+
+[03:05.98]產品和資料
+
+[03:07.98]和實際技術
+
+[03:08.98]而我認為
+
+[03:09.98]每個技術的技術
+
+[03:10.98]都會做得很好
+
+[03:11.98]那是一大部分
+
+[03:12.98]為何我開始深入
+
+[03:13.98]你提及Dota
+
+[03:14.98]哪些記憶在想
+
+[03:16.98]從RL 和 Transformers
+
+[03:18.98]在時間中
+
+[03:19.98]然後我認為
+
+[03:20.98]製造的工具
+
+[03:21.98]更加在LM 上
+
+[03:23.98]然後離開
+
+[03:24.98]更多的Agent Simulation
+
+[03:25.98]工作
+
+[03:26.98]像在移動的道路
+
+[03:27.98]我覺得Agent
+
+[03:28.98]是一個
+
+[03:29.98]完全正確的長途
+
+[03:30.98]你只要去找
+
+[03:31.98]AGI 是吧
+
+[03:32.98]你會說
+
+[03:33.98]首先
+
+[03:34.98]我其實不喜歡AGI
+
+[03:35.98]用人的改變
+
+[03:36.98]因為我真的不想
+
+[03:37.98]這樣會發生
+
+[03:38.98]我認為這個改變
+
+[03:39.98]AGI 是一些
+
+[03:40.98]人們表現的
+
+[03:41.98]非常值得的技術
+
+[03:43.98]是一個
+
+[03:44.98]極端的看法
+
+[03:45.98]和人的改變
+
+[03:46.98]我認為
+
+[03:47.98]我比較有興趣
+
+[03:48.98]AGI 的改變
+
+[03:49.98]就是
+
+[03:50.98]一個模式
+
+[03:51.98]可以做任何的
+
+[03:52.98]人能做的
+
+[03:53.98]如果你想到
+
+[03:54.98]超級有趣
+
+[03:55.98]Agent
+
+[03:56.98]是一種
+
+[03:57.98]自然的
+
+[03:58.98]改變
+
+[03:59.98]所以
+
+[04:00.98]所有的工作
+
+[04:01.98]我們在RL
+
+[04:02.98]這些技術
+
+[04:03.98]導致我們
+
+[04:04.98]有很清楚的
+
+[04:05.98]形容
+
+[04:06.98]你需要增加
+
+[04:07.98]你需要增加
+
+[04:08.98]對
+
+[04:09.98]而自然的LM
+
+[04:10.98]形容
+
+[04:11.98]沒有出現
+
+[04:12.98]我認為
+
+[04:13.98]我們
+
+[04:14.98]在這個場地
+
+[04:15.98]有很多想法
+
+[04:16.98]想想
+
+[04:17.98]我們如何解決
+
+[04:18.98]問題的問題
+
+[04:19.98]然後
+
+[04:20.98]我們忘記
+
+[04:21.98]我們在RL
+
+[04:22.98]是一個
+
+[04:23.98]很不容易的
+
+[04:24.98]方式
+
+[04:25.98]我們為何
+
+[04:26.98]我們在世界
+
+[04:27.98]找到所有的
+
+[04:28.98]知識
+
+[04:29.98]我們在一年
+
+[04:30.98]和一位
+
+[04:31.98]伯克里斯教授
+
+[04:32.98]教授
+
+[04:33.98]我們會拿到
+
+[04:34.98]AGI
+
+[04:35.98]他的觀點
+
+[04:36.98]對
+
+[04:37.98]他的理想
+
+[04:38.98]對
+
+[04:39.98]所以
+
+[04:40.98]我們都在
+
+[04:41.98]記錄
+
+[04:42.98]我們會
+
+[04:43.98]解決
+
+[04:44.98]我們已經解決
+
+[04:45.98]LM
+
+[04:46.98]我們已經解決
+
+[04:47.98]我們已經解決
+
+[04:48.98]我們已經解決
+
+[04:49.98]我們已經解決
+
+[04:50.98]我們已經解決
+
+[04:51.98]我們已經解決
+
+[04:52.98]我們已經解決
+
+[04:53.98]我們已經解決
+
+[04:54.98]我們已經解決
+
+[04:55.98]我們已經解決
+
+[04:56.98]我們已經解決
+
+[04:57.98]我們已經解決
+
+[04:58.98]我們已經解決
+
+[04:59.98]我們已經解決
+
+[05:00.98]我們已經解決
+
+[05:01.98]我們已經解決
+
+[05:02.98]每一句
+
+[05:03.98]文字
+
+[05:04.98]然後所有的圖案都會學習到模式
+
+[05:07.94]然後你能夠合作任何的組織
+
+[05:10.14]例如寫進、聲音、畫面、其他畫面、影片等等
+
+[05:14.42]這些都是圖案的圖案,可以學習到這類的動作
+
+[05:18.50]所以我希望我們能夠解決這件事
+
+[05:20.10]然後我們回到當時的歷史
+
+[05:22.74]我們如何跟我們一起學習這些圖案的學習
+
+[05:27.06]這就是我們要去進行的進步
+
+[05:28.62]我還要向大家提醒你多多的明年開放的故事
+
+[05:31.30]我們再回到大陸的故事
+
+[05:32.90]在你的個人網站,我愛的,因為是一個很好的個人的故事
+
+[05:37.38]故事的內容,像你的歷史
+
+[05:39.38]我需要更新,因為太老了
+
+[05:42.38]但是你提及GPC2,你忘記了GPC1嗎?我認為你忘記了,對吧?
+
+[05:46.18]我其實不太記得,我記得在那邊,我記得在那邊
+
+[05:50.70]對,《Canonical Story》是阿力的故事,他很擔心傳播者和傳播者
+
+[05:58.74]傳播者和傳播者和傳播者的訊息
+
+[06:01.38]對,你帶我們去… 拿我們傳播者和傳播者和傳播者的訊息
+
+[06:03.66]GPC的歷史,你也知道,對你來說
+
+[06:07.46]對我來說,歷史和GPC的歷史是一個很好的問題
+
+[06:10.02]所以我認為《Canonical Story》的故事,GPC的歷史是在谷歌上,對吧?
+
+[06:14.30]因為那是關於傳播者的故事
+
+[06:17.30]而我認為最驚訝的一件事,是…
+
+[06:21.26]這是一個成績,例如在谷歌設立,你跟你的最好的朋友寫文章,對吧?
+
+[06:26.26]好,所以在調查,我認為我的工作,當我當了學校的學長,是一個領導的領導人,對吧?
+
+[06:33.02]所以我真的有很好的朋友,我的工作是把人們的小數目和好幾個好意義,然後向他們進行完結的工作
+
+[06:41.10]我的工作不是在提供一百萬個意義,然後沒有任何股份的資料
+
+[06:45.54]然後當我的想法開始合作,然後我開始工作,我的工作是向他們扭動資料,向他們做好工作
+
+[06:52.50]然後開始將一些不正確的工作拆除,對吧?
+
+[06:56.06]那股股份並沒有存在在我的時間在谷歌上
+
+[06:59.34]如果他們有做好工作,他們會說:
+
+[07:02.06]"喂,你真棒,你懂這些東西的效果嗎?"
+
+[07:05.98]"這裡是所有的我們的TPUs,然後我認為他們會殺掉我們"
+
+[07:09.94]他肯定是想要的,他在2017年也說了一百萬公升的計劃
+
+[07:13.18]對,所以我認為這回合是在關於GPT的故事,對嗎?
+
+[07:15.98]就是我正在跳舞歷史,對嗎?
+
+[07:18.38]但在GPT2之後,我們都很期待GPT2,我可以告訴你更多的故事
+
+[07:22.50]這是我最後的一篇文章,我甚至真的受到觸碍了,所以我變成了研究研究研究員
+
+[07:27.70]每天每天我們進行GPT3,我會醒來,然後感到緊張
+
+[07:32.38]我感到緊張,因為...你只要看看Fax,對嗎?
+
+[07:35.54]Google有所有的帖子,Google有所有的人 who invented all of these underlying technologies
+
+[07:40.74]有一個人叫Noam,他很聰明,他已經做了這個討論,他想要一百萬的計劃模式
+
+[07:46.54]我認為我們可能只是在做一些複雜的研究,對嗎?他有這個扣子,只有轉換模式,他可能會在我們之前進行的
+
+[07:54.66]我心想,拜託,讓這個模式結束,對嗎?
+
+[07:57.90]然後,整個時間都變成了他們沒有得到股票的資金
+
+[08:01.62]所以,我年紀中,我帶了Google的LM的活動,我當時是一名手機的,我變得很清楚為什麼,對嗎?
+
+[08:06.98]那時候,有一個東西叫做"Brain Credit Marketplace"
+
+[08:11.06]你記得Brain Credit Marketplace嗎?
+
+[08:13.26]沒有,我沒聽過這說法
+
+[08:14.30]其實,你會問任何Google,就像一件事,對嗎?
+
+[08:18.58]對,有限定資訊,你必須有一個市場的市場,對嗎?
+
+[08:23.06]你可能,有些時候是貧富,有些時候是政治欺負
+
+[08:27.34]你可能,所以,基本上,每個人都要給錢,對嗎?
+
+[08:30.10]如果你有錢,你必須買N-CHIPS,按照貿易和責任的方式
+
+[08:33.74]如果你想做一個大職業,你可能有19、20個朋友不願意去工作
+
+[08:38.86]如果這就是它們的效果
+
+[08:40.74]它們很難得獲得
+
+[08:42.14]當中的肺炎
+
+[08:43.86]去學習這些東西
+
+[08:44.98]而 Google 的團隊
+
+[08:45.86]正在打架
+
+[08:47.02]但我們只能打擊它們
+
+[08:48.22]因為我們拿了大大的肺炎
+
+[08:50.62]然後我們注射
+
+[08:51.42]然後我認為
+
+[08:52.30]這就像是一部分的故事
+
+[08:53.54]像是一部分的歷史
+
+[08:54.34]像是一部分的歷史
+
+[08:55.62]像是一部分的歷史
+
+[08:57.58]像是一部分的歷史
+
+[08:58.90]我認為同樣的
+
+[09:00.22]我認為一部分的
+
+[09:01.02]三部分會成為
+
+[09:01.90]一部分的歷史
+
+[09:03.22]因為是一部分的
+
+[09:04.18]一部分的成績
+
+[09:05.62]對
+
+[09:06.30]我覺得這部分的內容是如何的
+
+[09:07.70]和影片也有關的
+
+[09:09.02]在前一天的情況下
+
+[09:10.06]我認為可能
+
+[09:11.10]我認為是Jensen
+
+[09:11.90]不確定是誰
+
+[09:12.86]把最近的照片
+
+[09:13.90]給大家看過的
+
+[09:15.26]他在第一張DGX的照片中
+
+[09:17.66]我覺得Jensen 已經是
+
+[09:19.06]一個完美的
+
+[09:21.22]技術
+
+[09:21.94]和精神的一切
+
+[09:24.10]我對NVIDIA的尊敬有多大關注
+
+[09:26.22]是不實際的
+
+[09:26.94]但我會打開
+
+[09:27.74]我給他們的需要
+
+[09:29.46]讓他們構思一下
+
+[09:30.30]或者
+
+[09:31.34]你只要用任何NVIDIA給他們的東西
+
+[09:33.70]所以我們很接近他們的工作
+
+[09:35.38]我不確定能分享所有的故事
+
+[09:37.62]但例子是我找到的
+
+[09:39.42]特別有趣的
+
+[09:40.14]所以 Scott Gray 是很棒的
+
+[09:41.54]我很喜歡他
+
+[09:42.22]他在我的隊伍中
+
+[09:43.30]是一名超級電腦隊伍
+
+[09:45.62]就是Chris Burner 做的
+
+[09:46.74]Chris Burner 還做了很多東西
+
+[09:48.82]結果
+
+[09:49.70]我們有很接近NVIDIA的 ties
+
+[09:52.62]其實我的 co-founder
+
+[09:53.70]在Adept Eric Elson
+
+[09:54.74]是一位以前的GPGPU人士
+
+[09:56.78]所以他和Scott
+
+[09:57.82]和Brian Kanzaro
+
+[09:58.86]NVIDIA
+
+[09:59.66] and Jonah
+
+[10:00.26] and Ian at NVIDIA
+
+[10:01.14]我覺得我們全都很接近
+
+[10:02.54]我們是一部分的組織
+
+[10:03.70]我們如何推動這些股票的限度
+
+[10:05.82]我覺得那種組織
+
+[10:07.42]幫助了我們
+
+[10:08.38]我想有趣的部分
+
+[10:09.50]是 knowing the A100 generation
+
+[10:11.22]那個Quadsbar city
+
+[10:12.26]會是一件事
+
+[10:12.98]是我們想找到的
+
+[10:14.50]來解決
+
+[10:15.22]這是我們可以利用的
+
+[10:16.50]模特兒訓練
+
+[10:17.14] really what it boils down to
+
+[10:18.50]是
+
+[10:19.22]我認為更多人
+
+[10:20.06]知道這件事
+
+[10:21.26]6 年前
+
+[10:22.34]甚至3 年前
+
+[10:23.34]人們拒絕接受
+
+[10:24.98]這個AI 是一件故事
+
+[10:27.02]是一件故事
+
+[10:27.62]如何讓你更能复入
+
+[10:29.22]實際使用模特兒
+
+[10:30.38]使用模特兒
+
+[10:31.66]還有GPT 2 3 故事嗎
+
+[10:35.78]你喜歡在外面
+
+[10:37.78]我認為是
+
+[10:38.78]很欣賞
+
+[10:39.86]這個模特兒的作用
+
+[10:41.66]有趣的GPT 2 故事
+
+[10:43.66]我花了很長的時間
+
+[10:45.86]幫Alex使用模特兒
+
+[10:48.58]我記得
+
+[10:49.82]最有趣的一刻
+
+[10:52.22]是我們寫了模特兒
+
+[10:54.70]我確定模特兒
+
+[10:56.22]是一個最短的模特兒
+
+[10:57.70]有任何ML
+
+[10:58.70]像是最理想的
+
+[10:59.90]ML 模特兒
+
+[11:01.42]是三個模特兒
+
+[11:03.18]這是一種模特兒
+
+[11:04.54]Vanilla 模特兒
+
+[11:05.58]只有轉換的模特兒
+
+[11:06.38]這些特別的東西
+
+[11:07.34]我記得是在《ParaGraph》裡
+
+[11:08.58]我記得是在《ParaGraph》裡
+
+[11:09.42]我們都在看這件事
+
+[11:11.02]我認為是很難看的模特兒
+
+[11:11.82]OGs 在廣場上
+
+[11:13.02]會很討厭這個模特兒
+
+[11:14.02]他們會說沒有創意
+
+[11:15.50]為什麼你們要做這個作用
+
+[11:16.94]現在是很有趣的
+
+[11:18.02]在後期的看法是
+
+[11:19.54]一件很刺激的作用
+
+[11:20.82]但我覺得是一件很早的事
+
+[11:22.54]我們完全遲到
+
+[11:24.42]我們都要關心的問題是 AI 和不關的
+
+[11:27.58]是否有四種不同的想法
+
+[11:29.34]是否有一個很簡單的想法
+
+[11:30.34]是否有一個很簡單的想法
+
+[11:31.34]是否有一個很簡單的想法
+
+[11:32.34]是否有一個很簡單的想法
+
+[11:33.34]是否有一個很簡單的想法
+
+[11:34.34]是否有一個很簡單的想法
+
+[11:35.34]是否有一個很簡單的想法
+
+[11:36.34]是否有一個很簡單的想法
+
+[11:37.34]是否有一個很簡單的想法
+
+[11:38.34]是否有一個很簡單的想法
+
+[11:39.34]是否有一個很簡單的想法
+
+[11:40.34]是否有一個很簡單的想法
+
+[11:41.34]是否有一個很簡單的想法
+
+[11:42.34]是否有一個很簡單的想法
+
+[11:43.34]是否有一個很簡單的想法
+
+[11:44.34]是否有一個很簡單的想法
+
+[11:45.34]是否有一個很簡單的想法
+
+[11:46.34]是否有一個很簡單的想法
+
+[11:47.34]是否有一個很簡單的想法
+
+[11:48.34]是否有一個很簡單的想法
+
+[11:49.34]是否有一個很簡單的想法
+
+[11:50.34]是否有一個很簡單的想法
+
+[11:51.34]是否有一個很簡單的想法
+
+[11:52.34]是否有一個很簡單的想法
+
+[11:53.34]是否有一個很簡單的想法
+
+[11:54.34]是否有一個很簡單的想法
+
+[11:55.34]是否有一個很簡單的想法
+
+[11:56.34]是否有一個很簡單的想法
+
+[11:57.34]是否有一個很簡單的想法
+
+[11:58.34]是否有一個很簡單的想法
+
+[11:59.34]是否有一個很簡單的想法
+
+[12:00.34]是否有一個很簡單的想法
+
+[12:01.34]是否有一個很簡單的想法
+
+[12:02.34]是否有一個很簡單的想法
+
+[12:03.34]是否有一個很簡單的想法
+
+[12:04.34]是否有一個很簡單的想法
+
+[12:05.34]是否有一個很簡單的想法
+
+[12:06.34]是否有一個很簡單的想法
+
+[12:07.34]是否有一個很簡單的想法
+
+[12:08.34]是否有一個很簡單的想法
+
+[12:09.34]之前 Microsoft invested in OpenAI
+
+[12:11.34]Sam Altman, myself, and our CFO
+
+[12:13.34] flew up to Seattle
+
+[12:14.34] to do the final pitch meeting
+
+[12:16.34] and I'd been a founder before
+
+[12:17.34] so I always had a tremendous amount of anxiety
+
+[12:19.34] about partner meetings
+
+[12:21.34] which this basis is what it was
+
+[12:22.34] it was like Kevin Scott
+
+[12:23.34] and Satya and Amy Hood
+
+[12:25.34] and it was my job to give the technical slides
+
+[12:27.34] about what's the path to AGI
+
+[12:29.34] what's our research portfolio
+
+[12:30.34] all of this stuff
+
+[12:31.34] but it was also my job to give the GPT-2 demo
+
+[12:34.34] we had a slightly bigger version of GPT-2
+
+[12:36.34] that we had just cut
+
+[12:38.34] maybe a day or two before this flight up
+
+[12:40.34] and as we all know now
+
+[12:42.34]Model behaviors you find predictable
+
+[12:44.34] at one checkpoint
+
+[12:45.34] are not predictable in another checkpoint
+
+[12:46.34] and so like I spent all this time
+
+[12:48.34] trying to figure out how to keep this thing on rails
+
+[12:50.34] I had my canned demos
+
+[12:51.34] but I knew I had to go
+
+[12:52.34] turn it around over to Satya and Kevin
+
+[12:54.34] and let them type anything in
+
+[12:56.34] and that just that really kept me up all night
+
+[12:58.34]Nice, yeah
+
+[13:00.34]I mean that must have helped you
+
+[13:01.34] talking about partners meeting
+
+[13:03.34]You raised 420 million for ADAPT
+
+[13:06.34]The last round was a $350 million series B
+
+[13:09.34]So I'm sure you do great
+
+[13:10.34]Pitching and painting
+
+[13:12.34]Nice
+
+[13:13.34]No, that's a high compliment coming from a VC
+
+[13:15.34]Yeah, I mean you're doing great
+
+[13:17.34]Let's talk about ADAPT
+
+[13:19.34]and we were doing pre prep
+
+[13:21.34]and you mentioned that maybe a lot of people
+
+[13:22.34]don't understand what ADAPT is
+
+[13:23.34]So usually we try and introduce the product
+
+[13:26.34]and then have the founders fill in the blanks
+
+[13:27.34]but maybe let's do the reverse
+
+[13:28.34]Like what is ADAPT?
+
+[13:30.34]Yeah, so I think ADAPT
+
+[13:31.34]is the least understood company
+
+[13:34.34]in the broader space of foundation models
+
+[13:36.34]plus agents
+
+[13:37.34]So I'll give some color
+
+[13:39.34]and I'll explain what it is
+
+[13:40.34]and I'll explain also
+
+[13:41.34]why it's actually pretty different
+
+[13:43.34]from what people would have guessed
+
+[13:44.34]So the goal for ADAPT
+
+[13:46.34]is we basically want to build an AI agent
+
+[13:48.34]that can do
+
+[13:49.34]that can basically help humans
+
+[13:50.34]do anything a human does on a computer
+
+[13:51.34]and so what that really means is
+
+[13:53.34]we want this thing to be super good
+
+[13:55.34]at turning natural language
+
+[13:56.34]like goal specifications
+
+[13:58.34]right into the correct set of end steps
+
+[14:00.34]and then also have all the correct sensors
+
+[14:02.34]and actuators
+
+[14:03.34]to go get that thing done for you
+
+[14:04.34]across any software tool
+
+[14:05.34]that you already use
+
+[14:06.34]and so the end vision of this
+
+[14:07.34]is effectively like
+
+[14:08.34]I think in a couple years
+
+[14:09.34]everyone's going to have access
+
+[14:10.34]to an AI teammate
+
+[14:11.34]that they can delegate arbitrary tasks to
+
+[14:14.34]and then also be able to use it
+
+[14:16.34]to a sounding board
+
+[14:17.34]and just be way, way, way more productive
+
+[14:19.34]right and just changes the shape
+
+[14:21.34]of every job
+
+[14:22.34]from something where you're mostly
+
+[14:23.34]doing execution
+
+[14:24.34]to something where you're mostly
+
+[14:25.34]actually doing these core liberal arts skills
+
+[14:26.34]of what should I be doing and why
+
+[14:28.34]right and
+
+[14:29.34]I find this like really exciting
+
+[14:31.34]motivating because
+
+[14:32.34]I think it's actually
+
+[14:33.34]pretty different vision
+
+[14:34.34]for how AI will play out
+
+[14:36.34]I think systems like ADAPT
+
+[14:37.34]are the most likely systems
+
+[14:38.34]to be proto-AGI's
+
+[14:40.34]but I think the ways in which
+
+[14:41.34]we are really counterintuitive
+
+[14:42.34]to everybody
+
+[14:43.34]is that
+
+[14:44.34]we've actually been really quiet
+
+[14:45.34]because we are
+
+[14:46.34]not a developer company
+
+[14:47.34]we don't sell APIs
+
+[14:48.34]we don't sell open source models
+
+[14:50.34]we also don't sell bottom-up products
+
+[14:52.34]we're not a thing
+
+[14:53.34]that you go and click
+
+[14:54.34]and download the extension
+
+[14:55.34]and like we want more users
+
+[14:56.34]signing up for that thing
+
+[14:57.34]we're actually an enterprise company
+
+[14:58.34]so what we do is
+
+[14:59.34]we work with a range
+
+[15:00.34]of different companies
+
+[15:01.34]some like late-stage
+
+[15:02.34]multi-thousand people start-ups
+
+[15:04.34]some Fortune 500s etc
+
+[15:06.34]and what we do for them
+
+[15:07.34]is we basically give them
+
+[15:09.34]an out-of-the-box solution
+
+[15:11.34]where big complex workflows
+
+[15:12.34]that their employees
+
+[15:13.34]do every day
+
+[15:14.34]could be delegated to the model
+
+[15:15.34]and so we look a little
+
+[15:16.34]different from other companies
+
+[15:17.34]in that in order
+
+[15:18.34]to go build this
+
+[15:19.34]full agent thing
+
+[15:20.34]the most important thing
+
+[15:21.34]you gotta get right
+
+[15:22.34]is reliability
+
+[15:23.34]so initially zooming
+
+[15:24.34]way back when
+
+[15:25.34]one of the first things
+
+[15:26.34]debt did was we released
+
+[15:27.34]this demo called Act 1
+
+[15:28.34]act 1 was like pretty cool
+
+[15:30.34]it's kind of become
+
+[15:31.34]a hello world thing
+
+[15:32.34]for people to show
+
+[15:33.34]agent demos
+
+[15:34.34]by going to redfin
+
+[15:35.34]and asking to buy a house
+
+[15:36.34]somewhere
+
+[15:37.34]because like we did that
+
+[15:38.34]in the original Act 1 demo
+
+[15:39.34]and like showed that
+
+[15:40.34]showed like Google Sheets
+
+[15:41.34]all this other stuff
+
+[15:42.34]over the last like year
+
+[15:44.34]since that has come out
+
+[15:45.34]there's been a lot
+
+[15:46.34]of really cool demos
+
+[15:47.34]and you go play with them
+
+[15:48.34]and you realize
+
+[15:49.34]they work 60% of the time
+
+[15:50.34]but since we've always
+
+[15:51.34]been focused on
+
+[15:52.34]how do we build
+
+[15:53.34]an amazing enterprise product
+
+[15:54.34]enterprises can't use
+
+[15:55.34]anything
+
+[15:56.34]the reliability
+
+[15:57.34]and so we've
+
+[15:58.34]actually had to go down
+
+[15:59.34]a slightly different
+
+[16:00.34]tech tree than what you
+
+[16:01.34]might find in the
+
+[16:02.34]prompt engineering
+
+[16:03.34]sort of plays in
+
+[16:04.34]the agent space
+
+[16:05.34]to get that reliability
+
+[16:06.34]and we've decided
+
+[16:07.34]to prioritize reliability
+
+[16:08.34]over all else
+
+[16:09.34]so like one of our use
+
+[16:10.34]cases is crazy enough
+
+[16:11.34]that it actually ends
+
+[16:12.34]with a physical truck
+
+[16:13.34]being sentto a place
+
+[16:15.34]as the result
+
+[16:16.34]of the agent workflow
+
+[16:17.34]and if you're like
+
+[16:18.34]if that works like 60%
+
+[16:19.34]of the time
+
+[16:20.34]you're just blowing money
+
+[16:21.34]and poor truck drivers
+
+[16:22.34]going places
+
+[16:23.34]interesting
+
+[16:24.34]one of the
+
+[16:25.34]common teams
+
+[16:26.34]has this idea of services
+
+[16:27.34]as software
+
+[16:28.34]I'm actually giving a talk
+
+[16:29.34]at nvidia gtc
+
+[16:30.34]about this
+
+[16:31.34]but basically
+
+[16:32.34]software as a service
+
+[16:33.34]you're wrapping
+
+[16:34.34]user productivity
+
+[16:35.34]in software
+
+[16:36.34]with agents
+
+[16:37.34]and services as software
+
+[16:38.34]is replacing things
+
+[16:39.34]that you know
+
+[16:40.34]you would ask somebody
+
+[16:41.34]to do
+
+[16:42.34]and the software
+
+[16:43.34]just does it for you
+
+[16:44.34]when you think
+
+[16:45.34]about these usecases
+
+[16:46.34]do the users
+
+[16:47.34]still go in
+
+[16:48.34]and look at the agent
+
+[16:49.34]kindof like
+
+[16:50.34]doing the things
+
+[16:51.34]and can intervene
+
+[16:52.34]or likeare they slowly
+
+[16:53.34]remove from them
+
+[16:54.34]are there people
+
+[16:55.34]in the middle
+
+[16:56.34]checking in
+
+[16:57.34]I think there's two current flaws
+
+[16:58.34]in the framing
+
+[16:59.34]for services
+
+[17:00.34]as software
+
+[17:01.34]or I think what you just said
+
+[17:02.34]I think that one of them
+
+[17:03.34]is likein our experience
+
+[17:04.34]as we've been rolling
+
+[17:05.34]out adept
+
+[17:06.34]the people who actually
+
+[17:07.34]do the jobs
+
+[17:08.34]are the most excited
+
+[17:09.34]about it
+
+[17:10.34]because they don't go from
+
+[17:11.34]I do this job
+
+[17:12.34]to I don't do this job
+
+[17:13.34]they go from
+
+[17:14.34]I do this job
+
+[17:15.34]for everything
+
+[17:16.34]including the shitty
+
+[17:17.34]wrote stuff
+
+[17:18.34]to I'm a supervisor
+
+[17:19.34]and I literally
+
+[17:20.34]likeit's pretty magical
+
+[17:21.34]when you watch the thing
+
+[17:22.34]being used
+
+[17:23.34]sequentially by hand
+
+[17:24.34]as a human
+
+[17:25.34]and you can just click
+
+[17:26.34]in any one of them
+
+[17:27.34]be like hey I want to watch
+
+[17:28.34]the trajectory
+
+[17:29.34]the agent went through
+
+[17:30.34]to go solve this
+
+[17:31.34]and the nice thing
+
+[17:32.34]about agent execution
+
+[17:33.34]as opposed to
+
+[17:34.34]like LLM generations
+
+[17:35.34]is that
+
+[17:36.34]a good chunk of the time
+
+[17:37.34]when the agent
+
+[17:38.34]fails to execute
+
+[17:39.34]it doesn't give you
+
+[17:40.34]the wrong result
+
+[17:41.34]it just fails to execute
+
+[17:42.34]and the whole trajectory
+
+[17:43.34]is just broken and dead
+
+[17:44.34]and the agent knows it
+
+[17:45.34]right so then
+
+[17:46.34]those are the ones
+
+[17:47.34]that the human
+
+[17:48.34]then goes and solves
+
+[17:49.34]and so then they become
+
+[17:50.34]a troubleshooter
+
+[17:51.34]they work on the more
+
+[17:52.34]present piece
+
+[17:53.34]of it
+
+[17:54.34]that we found
+
+[17:55.34]is our strategy
+
+[17:56.34]as a company
+
+[17:57.34]is to always be
+
+[17:58.34]an augmentation company
+
+[17:59.34]and I think
+
+[18:01.34]one out of principle
+
+[18:02.34]that's something
+
+[18:03.34]we really care about
+
+[18:04.34]but two
+
+[18:05.34]actually if you're
+
+[18:06.34]framing yourself
+
+[18:07.34]as an augmentation
+
+[18:08.34]company
+
+[18:09.34]you're always going to
+
+[18:10.34]live in the world
+
+[18:11.34]where you're solving
+
+[18:12.34]tasks that are a little
+
+[18:13.34]too hard for what
+
+[18:14.34]the model can do today
+
+[18:15.34]and still needs a human
+
+[18:16.34]to provide oversight
+
+[18:17.34]provide clarifications
+
+[18:18.34]provide human feedback
+
+[18:19.34]and that's how you
+
+[18:20.34]build a data flywheel
+
+[18:21.34]smart as humans
+
+[18:22.34]how to solve
+
+[18:23.34]things models
+
+[18:24.34]can't do today
+
+[18:25.34]and so I actually
+
+[18:26.34]think that
+
+[18:27.34]being an augmentation
+
+[18:28.34]company
+
+[18:29.34]forces you to go
+
+[18:30.34]develop your core
+
+[18:31.34]AI capabilities
+
+[18:32.34]faster than someone
+
+[18:33.34]who's saying
+
+[18:34.34]ah okay
+
+[18:35.34]my job's like
+
+[18:36.34]deliver you
+
+[18:37.34]a lights off
+
+[18:38.34]solution for X
+
+[18:39.34]it's interesting
+
+[18:40.34]because we've seen
+
+[18:41.34]two parts
+
+[18:42.34]of the market
+
+[18:43.34]one is
+
+[18:44.34]we have one company
+
+[18:45.34]that does
+
+[18:46.34]agents for
+
+[18:47.34]sock analysts
+
+[18:48.34]people just
+
+[18:49.34]don't have them
+
+[18:50.34]which is
+
+[18:51.34]the augmentation product
+
+[18:52.34]and then you have
+
+[18:53.34]sweep.dev
+
+[18:54.34]any of these products
+
+[18:55.34]which they just
+
+[18:56.34]do the whole thing
+
+[18:57.34]I'm really curious
+
+[18:58.34]to see how that evolves
+
+[18:59.34]I agree that today
+
+[19:00.34]the reliability is
+
+[19:01.34]so important
+
+[19:02.34]in the enterprise
+
+[19:03.34]that they just
+
+[19:04.34]don't use
+
+[19:05.34]most of them
+
+[19:06.34]that's cool
+
+[19:07.34]but it's great
+
+[19:08.34]to hear the story
+
+[19:09.34]because I think
+
+[19:10.34]from the outside
+
+[19:11.34]people are like
+
+[19:12.34]oh that
+
+[19:13.34]they do act one
+
+[19:14.34]they do person on
+
+[19:15.34]they do foo you
+
+[19:16.34]they do all these
+
+[19:17.34]it's just the public stuff
+
+[19:18.34]it's just the public stuff
+
+[19:19.34]我們想要更多的客人來領導
+
+[19:22.20]所以我們想要更多的客人來領導
+
+[19:26.08]但我們希望我們會更多的客人來領導
+
+[19:29.32]我們想要更多的客人來領導
+
+[19:31.48]我們想要更多的客人來領導
+
+[19:33.68]所以這次我們想要更多的客人來領導
+
+[19:36.70]為什麼你變得更多的客人?
+
+[19:38.78]如果整個推動...
+
+[19:40.12]你已經領導了你的公司
+
+[19:41.82]但是你也會更加努力去領導更多的客人來領導
+
+[19:46.20]我覺得我們剛剛領導過那一步
+
+[19:48.14]因為我最近還沒有領導過那一步
+
+[19:49.14]這是一個好問題
+
+[19:50.14]我認為這兩件事其實是很重要的
+
+[19:51.14]一件事我認為是...
+
+[19:53.14]坦白說,大部分是公共的歷史
+
+[19:56.14]在公司中的公司中的歷史是最重要的
+
+[19:58.14]我非常高興這件事發生
+
+[20:00.14]因為當我們開始公司在2022年代
+
+[20:03.14]大家都在社會中知道歷史的歷史
+
+[20:06.14]但公司中的歷史沒有任何意義
+
+[20:08.14]他們還會把所有的歷史都放在桌上
+
+[20:11.14]所以我認為現在
+
+[20:13.14]我真的要注意的是
+
+[20:15.14]當人們認為歷史
+
+[20:16.14]他們會認為是對的
+
+[20:17.14]對,所有各種各樣的東西都會被引起
+
+[20:19.14]會被引起的電話電話電話電話
+
+[20:20.14]會被引起的東西都會被引起的東西
+
+[20:21.14]或是被引起的電話電話電話
+
+[20:22.14]我認為電話電話電話
+
+[20:23.14]是一個可以給你一個目標
+
+[20:25.14]再次進行的工作
+
+[20:27.14]並且在最少數個步驟中
+
+[20:28.14]所以這就是一個大部分的原因
+
+[20:30.14]我認為其中一個部分
+
+[20:31.14]是因為我認為更好讓人們
+
+[20:33.14]更加 aware of the depth
+
+[20:34.14]他們想要做的事情
+
+[20:35.14]他們的生意
+
+[20:36.14]這塊地是在世界中
+
+[20:38.14]在於在更多的利益
+
+[20:40.14]我認為大量的利益
+
+[20:43.14]會發生從
+
+[20:44.14]你使用的研究模式
+
+[20:46.14]作為大量學童的學童
+
+[20:49.14]去解決這些事
+
+[20:50.14]我認為那些人
+
+[20:51.14]想要做的研究
+
+[20:52.14]應該有所改善
+
+[20:53.14]當你提到
+
+[20:54.14]研究已經變成
+
+[20:55.14]更多的一部分
+
+[20:56.14]有什麼特別的東西
+
+[20:57.14]你會問我嗎
+
+[20:58.14]我會給你一個名字
+
+[20:59.14] Bill Gates 在 his blog post
+
+[21:00.14]提及「Agent of the Future」
+
+[21:02.14]我是那個人 who made OSs
+
+[21:04.14]我認為「Agent of the Next Thing」
+
+[21:05.14]所以 Bill Gates
+
+[21:07.14]我會叫他出來
+
+[21:08.14]然後 Sam Altman 也會說
+
+[21:09.14]「Agent of the Future for Open AI」
+
+[21:10.14]我認為之前
+
+[21:11.14]我認為
+
+[21:12.14]有些人在《紐約 Times》
+
+[21:13.14]Kade Metz 也在《紐約 Times》
+
+[21:15.14]對於現在
+
+[21:16.14]在一些不同的
+
+[21:17.14]我看過 AI 開始的
+
+[21:18.14]使用的研究模式
+
+[21:19.14]是 AI 公司
+
+[21:20.14]現在的 AI 公司
+
+[21:21.14]是 AI 公司
+
+[21:22.14]只是我認為
+
+[21:23.14]是一段時間
+
+[21:24.14]從 VC 開始
+
+[21:25.14]是有點混合
+
+[21:26.14]是嗎
+
+[21:27.14]我認為有很多 VC
+
+[21:28.14]會說我不會
+
+[21:29.14]觸碰 any agent start-ups
+
+[21:30.14]因為
+
+[21:31.14]為什麼
+
+[21:32.14]你告訴我
+
+[21:33.14]我認為有很多 VC
+
+[21:35.14]比較少技術
+
+[21:37.14]不懂得
+
+[21:38.14]限制的東西
+
+[21:39.14]不不不
+
+[21:40.14]你會這樣嗎
+
+[21:41.14]不不
+
+[21:42.14]我認為
+
+[21:43.14]今天的可能性
+
+[21:44.14]是否適用
+
+[21:46.14]我認為
+
+[21:47.14]人們會看你
+
+[21:48.14]然後說
+
+[21:49.14]這傢伙
+
+[21:50.14]需要 400 億元
+
+[21:51.14]去做
+
+[21:52.14]所以有很多 VC
+
+[21:53.14]都會說
+
+[21:54.14]我會再加上
+
+[21:55.14]有些東西
+
+[21:56.14]協助 AI
+
+[21:57.14]有些東西
+
+[21:58.14]是比較容易
+
+[21:59.14]進行
+
+[22:00.14]進行的
+
+[22:01.14]但我還驚訝
+
+[22:02.14]有些 funders
+
+[22:03.14]不想做 agent
+
+[22:04.14]不只是 funding
+
+[22:05.14]有時候
+
+[22:06.14]我們在看
+
+[22:07.14]為什麼沒有人
+
+[22:08.14]做 agent for acts
+
+[22:09.14]那是好
+
+[22:10.14]其實
+
+[22:11.14]我從沒知道
+
+[22:12.14]我的觀點
+
+[22:13.14]是
+
+[22:14.14]有新的 agent company
+
+[22:16.14]在進行
+
+[22:17.14]所以可能
+
+[22:18.14]他們也有
+
+[22:19.14]但我提供人員
+
+[22:20.14]去取消 agent
+
+[22:21.14]他們的名字
+
+[22:22.14]是因為
+
+[22:23.14]他們的名字
+
+[22:24.14]他們的名字
+
+[22:25.14]所以
+
+[22:26.14]他們不等待
+
+[22:27.14]對
+
+[22:28.14]那是好處
+
+[22:29.14]你的 portfolio allocator
+
+[22:31.14]有些人
+
+[22:32.14]知道 about persimmon
+
+[22:33.14]一些人知道
+
+[22:34.14]for you and for you heavy
+
+[22:35.14]你覺得
+
+[22:36.14]怎麼想
+
+[22:37.14]那個 evolution of that
+
+[22:38.14]什麼人
+
+[22:39.14]想想
+
+[22:40.14]那是
+
+[22:41.14]a depth
+
+[22:42.14]搜尋個案
+
+[22:43.14] kind of take us
+
+[22:44.14]through the stuff
+
+[22:45.14]you should recently
+
+[22:46.14]and how people
+
+[22:47.14]should think about
+
+[22:48.14]the trajectory
+
+[22:49.14]what you're doing
+
+[22:50.14]the critical path
+
+[22:51.14]for adept
+
+[22:52.14]is we want to build
+
+[22:53.14]agents that can do
+
+[22:54.14]a higher and higher
+
+[22:55.14]level of abstraction
+
+[22:56.14]things over time
+
+[22:57.14]all while keeping
+
+[22:58.14]insanely
+
+[22:59.14]high reliability standard
+
+[23:00.14]because that's
+
+[23:01.14]what turns this from
+
+[23:02.14]research into something
+
+[23:03.14]that customers want
+
+[23:04.14]and if you build
+
+[23:05.14]agents with really
+
+[23:06.14]high reliability standard
+
+[23:07.14]your users
+
+[23:08.14]how to get that
+
+[23:09.14]next level of
+
+[23:10.14]straction faster
+
+[23:11.14]so that's how
+
+[23:12.14]you actually build
+
+[23:13.14]the data level
+
+[23:14.14]that's the critical path
+
+[23:15.14]for the company
+
+[23:16.14]everything we do
+
+[23:17.14]is in service of that
+
+[23:18.14]so you go zoom
+
+[23:19.14]way way back to
+
+[23:20.14]act one days right
+
+[23:21.14]like the core thing
+
+[23:22.14]behind act one
+
+[23:23.14]is can we teach
+
+[23:24.14]large model basically
+
+[23:25.14]how to even
+
+[23:26.14]actuate your computer
+
+[23:27.14]and I think we're
+
+[23:28.14]one of the first places
+
+[23:29.14]to have solved that
+
+[23:30.14]and shown it
+
+[23:31.14]and shown the generalization
+
+[23:32.14]that you get when you
+
+[23:33.14]give it various different
+
+[23:34.14]workflows and texts
+
+[23:35.14]but I think from
+
+[23:36.14]these models
+
+[23:37.14]to be able to
+
+[23:38.14]get a lot better
+
+[23:39.14]at having some
+
+[23:40.14]specificationof some
+
+[23:41.14]guardrails for what it
+
+[23:42.14]actually should be doing
+
+[23:43.14]and I think in conjunction
+
+[23:44.14]with that a giant thing
+
+[23:45.14]that was really
+
+[23:46.14]necessaryis really
+
+[23:47.14]fast multimodal models
+
+[23:48.14]that are really good
+
+[23:49.14]at understanding
+
+[23:50.14]knowledge work
+
+[23:51.14]and really good
+
+[23:52.14]at understanding screens
+
+[23:53.14]and that needs to
+
+[23:54.14]kind of be the base
+
+[23:55.14]for some of these
+
+[23:56.14]agentsback then
+
+[23:57.14]we had to do a ton
+
+[23:58.14]ofresearchbasically
+
+[23:59.14]on how do we
+
+[24:00.14]actually make that
+
+[24:01.14]possiblewell first off
+
+[24:02.14]back in
+
+[24:03.14]free at exact
+
+[24:04.14]one month of 23
+
+[24:05.14]and then
+
+[24:06.14]we had to
+
+[24:07.14]get a lot better
+
+[24:08.14]at the first place
+
+[24:09.14]and then
+
+[24:10.14]we had to
+
+[24:11.14]get a lot better
+
+[24:12.14]at the first place
+
+[24:13.14]and then
+
+[24:14.14]we had to
+
+[24:15.14]get a lot better
+
+[24:16.14]at the first place
+
+[24:17.14]and then
+
+[24:18.14]we had to
+
+[24:19.14]get a lot better
+
+[24:20.14]at the first place
+
+[24:21.14]and then
+
+[24:22.14]we had to
+
+[24:23.14]get a lot better
+
+[24:24.14]at the first place
+
+[24:25.14]and then
+
+[24:26.14]we had to
+
+[24:27.14]get a lot better
+
+[24:28.14]at the first place
+
+[24:29.14]and then
+
+[24:30.14]we had to
+
+[24:31.14]get a lot better
+
+[24:32.14]at the first place
+
+[24:33.14]and then
+
+[24:34.14]we had to
+
+[24:35.14]get a lot better
+
+[24:36.14]at the first place
+
+[24:37.14]and then
+
+[24:38.14]we had to
+
+[24:39.14]get a lot better
+
+[24:40.14]at the first place
+
+[24:41.14]and then
+
+[24:42.14]we had to
+
+[24:43.14]get a lot better
+
+[24:44.14]at the first place
+
+[24:45.14]and then
+
+[24:46.14]we had to
+
+[24:47.14]get a lot better
+
+[24:48.14]at the first place
+
+[24:49.14]and then
+
+[24:50.14]we had to
+
+[24:51.14]get a lot better
+
+[24:52.14]at the first place
+
+[24:53.14]and then
+
+[24:54.14]we had to
+
+[24:55.14]get a lot better
+
+[24:56.14]at the first place
+
+[24:57.14]and then
+
+[24:58.14]we had to
+
+[24:59.14]get a lot better
+
+[25:00.14]at the first place
+
+[25:01.14]and then
+
+[25:02.14]we had to
+
+[25:03.14]get a lot better
+
+[25:04.12]at the first place
+
+[25:05.12]and then
+
+[25:06.12]we had to
+
+[25:07.12]get a lot better
+
+[25:08.12]at the first place
+
+[25:09.12]and then
+
+[25:10.12]we had to
+
+[25:11.12]get a lot better
+
+[25:12.12]at the first place
+
+[25:13.12]and then
+
+[25:14.12]we had to
+
+[25:15.12]get a lot better
+
+[25:16.12]at the first place
+
+[25:17.12]and then
+
+[25:18.12]we had to
+
+[25:19.12]get a lot better
+
+[25:20.12]at the first place
+
+[25:21.12]and then
+
+[25:22.12]we had to
+
+[25:23.12]get a lot better
+
+[25:24.12]at the first place
+
+[25:25.12]and then
+
+[25:26.12]we had to
+
+[25:27.12]get a lot better
+
+[25:28.12]at the first place
+
+[25:29.12]and then
+
+[25:30.12]we had to
+
+[25:31.12]get a lot better
+
+[25:32.12]at the first place
+
+[25:33.12]and then
+
+[25:34.12]we had to
+
+[25:35.12]get a lot better
+
+[25:36.12]at the first place
+
+[25:37.12]and then
+
+[25:38.12]we had to
+
+[25:39.12]get a lot better
+
+[25:40.12]at the first place
+
+[25:41.12]and then
+
+[25:42.12]we had to
+
+[25:43.12]get a lot better
+
+[25:44.12]at the first place
+
+[25:45.12]and then
+
+[25:46.12]we had to
+
+[25:47.12]get a lot better
+
+[25:48.12]at the first place
+
+[25:49.12]and then
+
+[25:50.12]we had to
+
+[25:51.12]get a lot better
+
+[25:52.12]at the first place
+
+[25:53.12]and then
+
+[25:54.12]we had to
+
+[25:55.12]get a lot better
+
+[25:56.12]at the first place
+
+[25:57.12]and then
+
+[25:58.12]we had to
+
+[25:59.12]get a lot better
+
+[26:00.12]at the first place
+
+[26:01.12]and then
+
+[26:02.12]we had to
+
+[26:03.12]get a lot better
+
+[26:04.12]at the first place
+
+[26:05.12]and then
+
+[26:06.12]we had to
+
+[26:07.12]get a lot better
+
+[26:08.12]at the first place
+
+[26:09.12]and then
+
+[26:10.12]we had to
+
+[26:11.12]get a lot better
+
+[26:12.12]at the first place
+
+[26:13.12]and then
+
+[26:14.12]we had to
+
+[26:15.12]get a lot better
+
+[26:16.12]at the first place
+
+[26:17.12]and then
+
+[26:18.12]we had to
+
+[26:19.12]get a lot better
+
+[26:20.12]at the first place
+
+[26:21.12]and then
+
+[26:22.12]we had to
+
+[26:23.12]get a lot better
+
+[26:24.12]at the first place
+
+[26:25.12]and then
+
+[26:26.12]we had to
+
+[26:27.12]get a lot better
+
+[26:28.12]at the first place
+
+[26:29.12]and then
+
+[26:30.12]we had to
+
+[26:31.12]get a lot better
+
+[26:32.12]at the browser level
+
+[26:33.12]I really want
+
+[26:34.12]at your papers
+
+[26:35.12]you have like a different representation
+
+[26:36.12]kind of like
+
+[26:37.12]you don't just take the dome
+
+[26:38.12]and act on it
+
+[26:39.12]you do a lot more stuff
+
+[26:40.12]how do you think about
+
+[26:41.12]the best way
+
+[26:42.12]the models will interact
+
+[26:43.12]with the software
+
+[26:44.12]and like how
+
+[26:45.12]the development of products
+
+[26:46.12]is going to change
+
+[26:47.12]with that in mind
+
+[26:48.12]as more and more
+
+[26:49.12]the work is done by agents
+
+[26:50.12]instead of people
+
+[26:51.12]this is
+
+[26:52.12]there's so much surface area here
+
+[26:53.12]and it's actually one of the things
+
+[26:54.12]I'm really excited about
+
+[26:55.12]and it's funny because
+
+[26:56.12]I've spent most of my time
+
+[26:57.12]doing research stuff
+
+[26:58.12]but this is like a whole
+
+[26:59.12]new ball game that I've been
+
+[27:00.12]doing about
+
+[27:01.12]and I find it
+
+[27:02.12]really cool
+
+[27:03.12]so I would say
+
+[27:04.12]the best analogy
+
+[27:05.12]I have to
+
+[27:06.12]why ADAPT
+
+[27:07.12]is pursuing a path
+
+[27:08.12]of being able to
+
+[27:09.12]use your computer
+
+[27:10.12]like a human
+
+[27:11.12]plus of course
+
+[27:12.12]being able to call
+
+[27:13.12]APIs
+
+[27:14.12]being able to call
+
+[27:15.12]APIs is the easy part
+
+[27:16.12]like being able to
+
+[27:17.12]use your gear like humans
+
+[27:18.12]is a hard part
+
+[27:19.12]it's in the same way
+
+[27:20.12]why people are excited
+
+[27:21.12]about humanoid robotics
+
+[27:22.12]right
+
+[27:23.12]in a world where
+
+[27:24.12]you had t=infinity
+
+[27:25.12]right you're probably
+
+[27:26.12]gonna have various
+
+[27:27.12]different form factors
+
+[27:28.12]that robots
+
+[27:29.12]do
+
+[27:30.12]without changing
+
+[27:31.12]everything along the way
+
+[27:32.12]it's the same thing
+
+[27:33.12]for software
+
+[27:34.12]right
+
+[27:35.12]if you go itemize out
+
+[27:36.12]the number of things
+
+[27:37.12]you wanna do on your computer
+
+[27:38.12]for which every step
+
+[27:39.12]has an api
+
+[27:40.12]those numbers
+
+[27:41.12]will workflows add up
+
+[27:42.12]pretty close to zero
+
+[27:43.12]and so then many
+
+[27:44.12]points along the way
+
+[27:45.12]you need the ability
+
+[27:46.12]to actually control
+
+[27:47.12]your computer like a human
+
+[27:48.12]it also lets you learn
+
+[27:49.12]from human usage
+
+[27:50.12]of computers
+
+[27:51.12]as a source of training
+
+[27:52.12]data that you don't get
+
+[27:53.12]if you have to somehow
+
+[27:54.12]figure out how every
+
+[27:55.12]particular step needs to be
+
+[27:56.12]some particular custom
+
+[27:57.12]private api thing
+
+[27:58.12]it's the most practical path
+
+[27:59.12]i think a lot of
+
+[28:00.12]success will come
+
+[28:01.12]from going down
+
+[28:02.12]this path
+
+[28:03.12]i kinda think about this
+
+[28:04.12]early days of the agent
+
+[28:05.12]interaction layer
+
+[28:06.12]level is a little bit
+
+[28:07.12]like do y'all remember
+
+[28:08.12]windows 3.1
+
+[28:10.12]like those days
+
+[28:11.12]this might be
+
+[28:12.12]i might be too old
+
+[28:13.12]for you guys on this
+
+[28:14.12]but back in the day
+
+[28:15.12]windows 3.1
+
+[28:16.12]we had this transition period
+
+[28:17.12]between pure command line
+
+[28:18.12]right
+
+[28:19.12]being the default
+
+[28:20.12]into this new world
+
+[28:21.12]with the gui is the default
+
+[28:22.12]and then you drop into the
+
+[28:23.12]command line for like
+
+[28:24.12]programmer things
+
+[28:25.12]the old way was
+
+[28:26.12]you booted your computer up
+
+[28:27.12]and then it would
+
+[28:28.12]give you the c colon
+
+[28:29.12]slash thing
+
+[28:30.12]and you typed windows
+
+[28:31.12]and you hit enter
+
+[28:32.12]and then you got
+
+[28:33.12]put into windows
+
+[28:34.12]and then the gui
+
+[28:35.12]kind of became a layer
+
+[28:36.12]above the command line
+
+[28:37.12]the same thing
+
+[28:38.12]is gonna happen
+
+[28:39.12]with agent interfaces
+
+[28:40.12]is like today
+
+[28:41.12]what we have in the gui
+
+[28:42.12]is like the base layer
+
+[28:44.12]and then the agent
+
+[28:45.12]just controls
+
+[28:46.12]the current gui
+
+[28:47.12]layer plus apis
+
+[28:48.12]and in the future
+
+[28:50.12]as more and more
+
+[28:51.12]trust is built towards
+
+[28:52.12]agents and more and more
+
+[28:53.12]things can be done by
+
+[28:54.12]agents and more UIs
+
+[28:55.12]for agents are actually
+
+[28:56.12]users
+
+[28:57.12]then that just becomes
+
+[28:58.12]a standard
+
+[28:59.12]interaction layer
+
+[29:00.12]and if that becomes
+
+[29:01.12]a standard
+
+[29:02.12]interaction layer
+
+[29:03.12]what changes for
+
+[29:04.12]software is that
+
+[29:05.12]a lot of software
+
+[29:06.12]is gonna be
+
+[29:07.12]either systems
+
+[29:08.12]or record
+
+[29:09.12]or like certain
+
+[29:10.12]customized
+
+[29:11.12]workflow
+
+[29:12.12]execution engines
+
+[29:13.12]and a lot of
+
+[29:14.12]how you actually
+
+[29:15.12]do stuff will be
+
+[29:16.12]controlled at the
+
+[29:17.12]agent layer
+
+[29:18.12]and you think the
+
+[29:19.12]rabbit interface
+
+[29:20.12]is more like
+
+[29:21.12]it would like
+
+[29:22.12]you're not actually
+
+[29:23.12]seeing the app
+
+[29:24.12]that the model
+
+[29:25.12]I can see that
+
+[29:26.12]being a model
+
+[29:27.12]I think
+
+[29:28.12]I don't know
+
+[29:29.12]enough about
+
+[29:30.12]what using
+
+[29:31.12]rabbit in real life
+
+[29:32.12]will actually be like
+
+[29:33.12]to comment on
+
+[29:34.12]that particular
+
+[29:35.12]thing but I think
+
+[29:36.12]the broader idea
+
+[29:37.12]that you know
+
+[29:38.12]you have a goal
+
+[29:39.12]the agent knows
+
+[29:40.12]how to break
+
+[29:41.12]your goal down into steps
+
+[29:42.12]the agent knows
+
+[29:43.12]how to use
+
+[29:44.12]the underlying
+
+[29:45.12]software
+
+[29:46.12]and systems
+
+[29:47.12]or record
+
+[29:48.12]to achieve
+
+[29:49.12]that goal for you
+
+[29:50.12]the agent may presents
+
+[29:51.12]you information
+
+[29:52.12]in a custom way
+
+[29:53.12]that's only
+
+[29:54.12]you're a power
+
+[29:55.12]user
+
+[29:56.12]for some niche thing
+
+[29:57.12]general question
+
+[29:58.12]so first of all
+
+[29:59.12]I think like
+
+[30:00.12]the sort of input
+
+[30:01.12]mode conversation
+
+[30:02.12]I wonder if you have
+
+[30:03.12]any analogies
+
+[30:04.12]that you like
+
+[30:05.12]with self-driving
+
+[30:06.12]because I do think
+
+[30:07.12]there's a little bit
+
+[30:08.12]of how the model
+
+[30:09.12]should perceive the world
+
+[30:10.12]and you know
+
+[30:11.12]the primary split
+
+[30:12.12]in self-driving
+
+[30:13.12]is LiDAR
+
+[30:14.12]versus camera
+
+[30:15.12]and I feel like
+
+[30:16.12]most agent companies
+
+[30:17.12]that I'm tracking
+
+[30:18.12]are all moving towards
+
+[30:19.12]camera approach
+
+[30:20.12]which is like
+
+[30:21.12]the multimodal approach
+
+[30:22.12]that we're doing
+
+[30:23.12]you're
+
+[30:24.12]focusing on that
+
+[30:25.12]including charts
+
+[30:26.12]and tables
+
+[30:27.12]and do you find
+
+[30:28.12]inspiration there
+
+[30:29.12]from the self-driving
+
+[30:30.12]world?
+
+[30:31.12]that's a good question
+
+[30:32.12]I think sometimes
+
+[30:33.12]the most useful
+
+[30:34.12]inspiration I've found
+
+[30:35.12]from self-driving
+
+[30:36.12]is the levels analogy
+
+[30:37.12]I think that's awesome
+
+[30:38.12]but I think that
+
+[30:39.12]our number one
+
+[30:40.12]goals for agents
+
+[30:41.12]not to look like
+
+[30:42.12]self-driving
+
+[30:43.12]we want to minimize
+
+[30:44.12]the chances
+
+[30:45.12]that agents are sort
+
+[30:46.12]of a thing
+
+[30:47.12]that you just
+
+[30:48.12]have to bang
+
+[30:49.12]your head at
+
+[30:50.12]for a long time
+
+[30:51.12]to get to like
+
+[30:52.12]completely
+
+[30:53.12]and that takes you
+
+[30:54.12]all the way
+
+[30:55.12]up to the top
+
+[30:56.12]but similarly
+
+[30:57.12]I mean
+
+[30:58.12]compared to self-driving
+
+[30:59.12]like two things
+
+[31:00.12]that people really
+
+[31:01.12]undervalue
+
+[31:02.12]that's like really
+
+[31:03.12]easy to driving
+
+[31:04.12]a car down
+
+[31:05.12]highway 101
+
+[31:06.12]in a sunny day
+
+[31:07.12]demo
+
+[31:08.12]that actually
+
+[31:09.12]doesn't prove anything
+
+[31:10.12]anymore
+
+[31:11.12]and I think
+
+[31:12.12]the second thing
+
+[31:13.12]is that
+
+[31:14.12]as a non-self-driving
+
+[31:15.12]expert
+
+[31:16.12]I think one of the things
+
+[31:17.12]that we believe
+
+[31:18.12]really strongly
+
+[31:19.12]is that
+
+[31:20.12]everyone under
+
+[31:21.12]get a lot
+
+[31:22.12]of reliability
+
+[31:23.12]is a really
+
+[31:24.12]strong focus on
+
+[31:25.12]actually why
+
+[31:26.12]does the model
+
+[31:27.12]not do this thing
+
+[31:28.12]and the non-trivial amount
+
+[31:29.12]of time
+
+[31:30.12]the time the model
+
+[31:31.12]doesn't actually
+
+[31:32.12]do the thing
+
+[31:33.12]is because if
+
+[31:34.12]you're a wizard
+
+[31:35.12]of ozing it yourself
+
+[31:36.12]or if you have
+
+[31:37.12]unreliable actuators
+
+[31:38.12]you can't do the thing
+
+[31:39.12]and so we've
+
+[31:40.12]had to fix
+
+[31:41.12]a lot of those problems
+
+[31:42.12]I was slightly
+
+[31:43.12]surprised just because
+
+[31:44.12]I do generally
+
+[31:45.12]consider the way
+
+[31:46.12]most that we see
+
+[31:47.12]all around San Francisco
+
+[31:48.12]as the most
+
+[31:49.12]I guess real case
+
+[31:50.12]it's a big
+
+[31:51.12]job but it has taken
+
+[31:52.12]a long time
+
+[31:53.12]for self-driving
+
+[31:54.12]temperature from
+
+[31:55.12]when it entered
+
+[31:56.12]the consciousness
+
+[31:57.12]and the driving down
+
+[31:58.12]when it went on a sunny
+
+[31:59.12]day moment
+
+[32:00.12]happened to now.
+
+[32:01.12]so I want to see
+
+[32:02.12]the more compressed
+
+[32:03.12]cruise, you know,
+
+[32:04.12]R.I.P.
+
+[32:05.12]recently.
+
+[32:06.12]and then one more thing
+
+[32:07.12]on just like
+
+[32:08.12]just going back on
+
+[32:09.12]this reliability
+
+[32:10.12]thing, something
+
+[32:11.12]I have been holding
+
+[32:12.12]in my head
+
+[32:13.12]that I'm curious
+
+[32:14.12]to get your commentary on
+
+[32:15.12]is I think there's a
+
+[32:16.12]treatup between
+
+[32:17.12]reliability and generality
+
+[32:18.12]or I want to broaden
+
+[32:19.12]because you have
+
+[32:20.12]reliability also have
+
+[32:21.12]cost of speed
+
+[32:22.12]speed is a huge emphasis
+
+[32:23.12]for a debt
+
+[32:24.12]the tendency or the
+
+[32:25.12]attemptation is to reduce
+
+[32:26.12]generalityto improve
+
+[32:27.12]reliability
+
+[32:28.12]and to improve
+
+[32:29.12]cost improve speed
+
+[32:30.12]do you perceive a tradeoff
+
+[32:31.12]do you have any
+
+[32:32.12]insights that
+
+[32:33.12]solve those tradeoffs
+
+[32:34.12]for you guys
+
+[32:35.12]there's definitely a tradeoff
+
+[32:36.12]if you're at
+
+[32:37.12]the predo frontier
+
+[32:38.12]I think a lot of folks
+
+[32:39.12]aren't actually
+
+[32:40.12]at the predo frontier
+
+[32:41.12]I think the way you get
+
+[32:42.12]there is basically
+
+[32:43.12]how do you frame
+
+[32:44.12]the fundamental
+
+[32:45.12]agent problem in a way
+
+[32:46.12]that just continues
+
+[32:47.12]to benefit from data
+
+[32:48.12]I think one of
+
+[32:49.12]the main ways
+
+[32:50.12]of being able to solve
+
+[32:51.12]that particular tradeoff
+
+[32:52.12]is you basically
+
+[32:53.12]just want to formulate
+
+[32:54.12]the problem such that
+
+[32:55.12]every particular use
+
+[32:56.12]case just looks like
+
+[32:57.12]you collecting more
+
+[32:58.12]data to go make
+
+[32:59.12]that use case possible
+
+[33:00.12]I think that's how
+
+[33:01.12]you really solve it
+
+[33:02.12]then you get into the
+
+[33:03.12]other problems like
+
+[33:04.12]are you overfitting
+
+[33:05.12]on these end use cases
+
+[33:06.12]right but like you're
+
+[33:07.12]not doing a thing
+
+[33:08.12]where you're like
+
+[33:09.12]being super prescriptive
+
+[33:10.12]for the end steps
+
+[33:11.12]that the model can
+
+[33:12.12]only do for example
+
+[33:13.12]then the question becomes
+
+[33:14.12]kind of do you have
+
+[33:15.12]one sort of house model
+
+[33:16.12]they customize
+
+[33:17.12]the customer's
+
+[33:18.12]specific use case
+
+[33:19.12]we're not sharing
+
+[33:20.12]we're not sharing
+
+[33:21.12]it's tempting
+
+[33:22.12]but that doesn't
+
+[33:23.12]look like AGI to me
+
+[33:24.12]you know what I mean
+
+[33:25.12]that is just
+
+[33:26.12]you have a good
+
+[33:27.12]base model
+
+[33:28.12]and then
+
+[33:29.12]you fine tune it
+
+[33:30.12]for what it's worth
+
+[33:31.12]I think there's
+
+[33:32.12]two paths
+
+[33:33.12]to a lot more
+
+[33:34.12]capability coming out
+
+[33:35.12]of the models
+
+[33:36.12]that we
+
+[33:37.12]all are training
+
+[33:38.12]these days
+
+[33:39.12]one path
+
+[33:40.12]is you figure out
+
+[33:41.12]how to spend
+
+[33:42.12]compute and turn
+
+[33:43.12]into data
+
+[33:44.12]and so in that
+
+[33:45.12]path I consider
+
+[33:46.12]off play
+
+[33:47.12]all that stuff
+
+[33:48.12]the second path
+
+[33:49.12]is how do you
+
+[33:50.12]get super
+
+[33:52.12]competent
+
+[33:53.12]high intelligence
+
+[33:54.12]demonstrations
+
+[33:55.12]from humans
+
+[33:56.12]and I think
+
+[33:57.12]the right way
+
+[33:58.12]to move forward
+
+[33:59.12]is you kind of
+
+[34:00.12]want to combine the two
+
+[34:01.12]the first one
+
+[34:02.12]gives you maximum
+
+[34:03.12]sample efficiency
+
+[34:04.12]for the second
+
+[34:05.12]but I think
+
+[34:06.12]that is going to be
+
+[34:07.12]hard to be running
+
+[34:08.12]at max speed
+
+[34:09.12]towards AGI
+
+[34:10.12]without actually
+
+[34:11.12]solving a bit of both
+
+[34:12.12]you haven't talked
+
+[34:13.12]much about synthetic
+
+[34:14.12]data as far as I can
+
+[34:15.12]any insights
+
+[34:16.12]on using synthetic
+
+[34:17.12]data to augment
+
+[34:18.12]the expensive
+
+[34:19.12]human data
+
+[34:20.12]the best part
+
+[34:21.12]about framing AGI
+
+[34:22.12]is being able
+
+[34:23.12]to help people do
+
+[34:24.12]things on computers
+
+[34:25.12]is you have an environment
+
+[34:26.12]yes
+
+[34:27.12]so you can
+
+[34:28.12]simulate all of it
+
+[34:29.12]you can do a lot
+
+[34:30.12]of stuff
+
+[34:31.12]when you have an environment
+
+[34:32.12]we were having dinner
+
+[34:33.12]for our one year
+
+[34:34.12]anniversary
+
+[34:35.12]the other round
+
+[34:36.12]thank you
+
+[34:37.12]Raza from human
+
+[34:38.12]loop was there
+
+[34:39.12]and we mentioned
+
+[34:40.12]you were coming on
+
+[34:41.12]the pod
+
+[34:42.12]this is our first
+
+[34:43.12]so he submitted a question
+
+[34:44.12]now you had
+
+[34:45.12]gbd4 vision
+
+[34:46.12]and help you
+
+[34:47.12]building a lot
+
+[34:48.12]of those things
+
+[34:49.12]how do you think
+
+[34:50.12]about the things
+
+[34:51.12]that are unique to you
+
+[34:52.12]as a depth
+
+[34:53.12]and like going back
+
+[34:54.12]to like the maybe
+
+[34:55.12]research direction
+
+[34:56.12]that you want to take
+
+[34:57.12]the team and what you
+
+[34:58.12]want people to come
+
+[34:59.12]work on at a depth
+
+[35:00.12]versus what is maybe
+
+[35:01.12]not become commoditized
+
+[35:02.12]that you didn't expect
+
+[35:03.12]everybody would
+
+[35:04.12]have access to
+
+[35:05.12]yeah that's
+
+[35:06.12]a really good question
+
+[35:07.12]I think implicit
+
+[35:08.12]in that question
+
+[35:09.12]and I wish he were
+
+[35:10.12]tier two so he can
+
+[35:11.12]push back on my
+
+[35:12.12]assumption about his
+
+[35:13.12]questionbut I think
+
+[35:14.04]is calculus of where
+
+[35:16.04]does advantage a crew
+
+[35:18.04]in the overall
+
+[35:19.04]ML stack
+
+[35:20.04]and maybe part
+
+[35:21.04]of the assumption
+
+[35:22.04]is that advantage
+
+[35:23.04]a crew is solely
+
+[35:24.04]to base model scaling
+
+[35:25.04]but I actually
+
+[35:26.04]believe pretty strongly
+
+[35:27.04]that the way
+
+[35:28.04]that you really
+
+[35:29.04]win is that you
+
+[35:30.04]have to go build
+
+[35:31.04]an agent stack
+
+[35:32.04]that is much more
+
+[35:33.04]than that
+
+[35:34.04]of the base model itself
+
+[35:35.04]and so I think
+
+[35:36.04]like that is
+
+[35:37.04]always going to be
+
+[35:38.04]a giant advantage
+
+[35:39.04]of vertical integration
+
+[35:40.04]I think like
+
+[35:41.04]it lets us do things
+
+[35:42.04]like have a really
+
+[35:43.04]bad cat and dog
+
+[35:44.04]photo
+
+[35:45.04]it's pretty good
+
+[35:46.04]at cat and dog
+
+[35:47.04]photo
+
+[35:48.04]it's not like
+
+[35:49.04]soda at cat
+
+[35:50.04]and dogphoto
+
+[35:51.04]so like we're allocating
+
+[35:52.04]our capacity wisely
+
+[35:53.04]is like one thing
+
+[35:54.04]that you
+
+[35:55.04]really get to do
+
+[35:56.04]I also think that
+
+[35:57.04]the other thing
+
+[35:58.04]that is pretty
+
+[35:59.04]important now
+
+[36:00.04]in the broader
+
+[36:01.04]foundation modeling
+
+[36:02.04]space is
+
+[36:03.04]I feel despite any
+
+[36:04.04]potential concerns
+
+[36:05.04]about how good
+
+[36:06.04]is agents as
+
+[36:07.04]like a startup area
+
+[36:08.04]like we were talking
+
+[36:09.04]about earlier
+
+[36:10.04]I feel super good
+
+[36:11.04]that we're
+
+[36:12.04]cap just flowing
+
+[36:13.04]from can we make
+
+[36:14.04]a better agent
+
+[36:15.04]because right now
+
+[36:16.04]I think we all see
+
+[36:17.04]that you know
+
+[36:18.04]if you're training
+
+[36:19.04]on publicly available
+
+[36:20.04]web data
+
+[36:21.04]you put in the
+
+[36:22.04]flops and you do
+
+[36:23.04]reasonable things
+
+[36:24.04]then you get
+
+[36:25.04]decent results
+
+[36:26.04]and if you just
+
+[36:27.04]double the amount
+
+[36:28.04]of compute
+
+[36:29.04]then you get
+
+[36:30.04]predictably
+
+[36:31.04]better results
+
+[36:32.04]and so I think
+
+[36:33.04]pure play foundation
+
+[36:34.04]model companies
+
+[36:35.04]are just going to be
+
+[36:36.04]pinched by how
+
+[36:37.04]good the next couple
+
+[36:38.04]lamas are going to be
+
+[36:39.04]and the next
+
+[36:40.04]what good open source
+
+[36:41.04]on these base foundation
+
+[36:42.04]models I think it's
+
+[36:43.04]gonna commoditize a lot
+
+[36:44.04]of the regular llms
+
+[36:45.04]and soon regular
+
+[36:46.04]multimodal models
+
+[36:47.04]so I feel really good
+
+[36:48.04]that we're just focused
+
+[36:49.04]on agents so you
+
+[36:50.04]don't consider yourself
+
+[36:51.04]a pure play foundation
+
+[36:52.04]model company no
+
+[36:53.04]because if we were pure
+
+[36:54.04]play foundation model
+
+[36:55.04]company we would be
+
+[36:56.04]traininggeneral foundation
+
+[36:57.04]models that do
+
+[36:58.04]summarization and
+
+[36:59.04]all this dedicated
+
+[37:00.04]towards the agent
+
+[37:01.04]yeah and our business
+
+[37:02.04]is an agent business
+
+[37:03.04]we're not here to
+
+[37:04.04]sell you tokens right
+
+[37:05.04]and I think like
+
+[37:06.04]selling tokens unless
+
+[37:07.04]there's like yeah I
+
+[37:08.04]love it there's like
+
+[37:09.04]if you have a particular
+
+[37:10.04]area of specialty
+
+[37:11.04]right then you won't
+
+[37:13.04]get caught in the fact
+
+[37:14.04]that everyone's just
+
+[37:15.04]scaling to ridiculous
+
+[37:16.04]levels of compute
+
+[37:17.04]but if you don't have a
+
+[37:18.04]specialty I find that
+
+[37:19.04]I think it's gonna be
+
+[37:20.04]a little tougher
+
+[37:21.04]interesting are you
+
+[37:22.04]interested in robotics at
+
+[37:23.04]all just a personally
+
+[37:24.04]fascinated by robotics
+
+[37:25.04]have always loved robotics
+
+[37:26.04]embodied agents as a
+
+[37:27.04]business you know figure
+
+[37:28.04]is like a big also
+
+[37:29.04]so the open ai
+
+[37:30.04]affiliated company
+
+[37:31.04]that raises a lot of
+
+[37:32.04]money I think it's
+
+[37:33.04]cool I think I mean
+
+[37:34.04]I don't know exactly
+
+[37:35.04]what they're exactly
+
+[37:36.04]what they're doing but
+
+[37:37.04]robots yeah yeah
+
+[37:38.04]well I mean that's
+
+[37:39.04]well Christian
+
+[37:40.04]would you ask
+
+[37:41.04]like if we
+
+[37:42.04]had them on like
+
+[37:43.04]what would you ask them
+
+[37:44.04]oh I just want to
+
+[37:45.04]understand what their
+
+[37:46.04]overall strategy is
+
+[37:47.04]gonna be between now
+
+[37:48.04]and when there's reliable
+
+[37:49.04]stuff to be deployed
+
+[37:50.04]but honestly
+
+[37:51.04]I just don't know
+
+[37:52.04]enough about it
+
+[37:53.04]and if I told you
+
+[37:54.04]hey fire your entire
+
+[37:55.04]warehouse workforce
+
+[37:56.04]and you know
+
+[37:57.04]put robots in there
+
+[37:58.04]isn't that a strategy
+
+[37:59.04]oh yeah yeah sorry
+
+[38:00.04]I'm not questioning
+
+[38:01.04]whether
+
+[38:02.04]they're doing smart
+
+[38:03.04]things I genuinely
+
+[38:04.04]don't know what
+
+[38:05.04]they're doing as much
+
+[38:06.04]but I think there's
+
+[38:07.04]two things one
+
+[38:08.04]it's just
+
+[38:09.04]I think it's
+
+[38:10.04]just gonna work
+
+[38:11.04]like I will die
+
+[38:12.04]on this hill
+
+[38:13.04]like I mean
+
+[38:14.04]like again this whole
+
+[38:15.04]this whole time
+
+[38:16.04]like we've been on this
+
+[38:17.04]podcast it's just
+
+[38:18.04]gonna continually saying
+
+[38:19.04]these models
+
+[38:20.04]are basically behavioral
+
+[38:21.04]cloners right
+
+[38:22.04]so let's go behavioral
+
+[38:23.04]clone all this like
+
+[38:24.04]robot behavior right
+
+[38:25.04]and then
+
+[38:26.04]now you figure out
+
+[38:27.04]everything else
+
+[38:28.04]you have to do in order
+
+[38:29.04]to teach you how to
+
+[38:30.04]solve new problem
+
+[38:31.04]that's gonna work
+
+[38:32.04]I'm super stoked for that
+
+[38:33.04]I think unlike
+
+[38:34.04]what we're doing with
+
+[38:35.04]helping humans with
+
+[38:36.04]knowledge work
+
+[38:37.04]and I'm personally
+
+[38:38.04]less excited about that
+
+[38:39.04]we had a
+
+[38:40.04]canjun from imbu
+
+[38:41.04]on the podcast
+
+[38:42.04]we asked her
+
+[38:43.04]why people should
+
+[38:44.04]go work there
+
+[38:45.04]and not at adept
+
+[38:46.04]so I wanna
+
+[38:47.04]well she said
+
+[38:48.04]you know
+
+[38:49.04]there's space for everybody
+
+[38:50.04]in this market
+
+[38:51.04]we're all doing
+
+[38:52.04]interesting work
+
+[38:53.04]and she said
+
+[38:54.04]they're really excited
+
+[38:55.04]about building
+
+[38:56.04]an operating system
+
+[38:57.04]for agent
+
+[38:58.04]and for her
+
+[38:59.04]the biggest research
+
+[39:00.04]thing was like
+
+[39:01.04]getting models
+
+[39:02.04]better reasoning
+
+[39:03.04]and planning
+
+[39:04.04]for these agents
+
+[39:05.04]the reverse question
+
+[39:06.04]I'm excited to
+
+[39:07.04]come work at adept
+
+[39:08.04]instead of imbu
+
+[39:09.04]and maybe
+
+[39:10.04]what are like
+
+[39:11.04]the core research
+
+[39:12.04]questions
+
+[39:13.04]that people should
+
+[39:14.04]be passionate about
+
+[39:15.04]to have fun at adept
+
+[39:16.04]yeah first off
+
+[39:17.04]I think that
+
+[39:18.04]I'm sure you guys
+
+[39:19.04]believe this too
+
+[39:20.04]the AI space
+
+[39:21.04]to the center
+
+[39:22.04]there's an AI space
+
+[39:23.04]and the AI agent
+
+[39:24.04]space are both
+
+[39:25.04]exactly as
+
+[39:26.04]she likely said
+
+[39:27.04]I think colossal
+
+[39:28.04]opportunities
+
+[39:29.04]and people are just
+
+[39:30.04]going to end up
+
+[39:31.04]winning in different
+
+[39:32.04]areas and a lot
+
+[39:33.04]of companies are
+
+[39:34.04]going to do well
+
+[39:35.04]to be at
+
+[39:36.04]adept
+
+[39:37.04]I think there's
+
+[39:38.04]two huge reasons
+
+[39:39.04]to be at adept
+
+[39:40.04]I think one of them
+
+[39:41.04]is everything we do
+
+[39:42.04]is in the service
+
+[39:43.04]of like useful agents
+
+[39:44.04]we're not a
+
+[39:45.04]research lab
+
+[39:46.04]we do a lot of research
+
+[39:47.04]in service of that goal
+
+[39:48.04]but we don't
+
+[39:49.04]think about ourselves
+
+[39:50.04]as like a classic
+
+[39:51.04]research lab at all
+
+[39:52.04]and I think the second
+
+[39:53.04]reason at work at
+
+[39:54.04]adeptis
+
+[39:55.04]if you believe that
+
+[39:56.04]actually having customers
+
+[39:57.04]and a reward signal
+
+[39:58.04]from customers
+
+[39:59.04]lets you build
+
+[40:00.04]AGI faster
+
+[40:01.04]which we really believe
+
+[40:02.04]then you should come here
+
+[40:03.04]and I think the examples
+
+[40:04.04]are evaluations
+
+[40:05.04]they're not
+
+[40:06.04]academic evals
+
+[40:07.04]they're not simulator
+
+[40:08.04]evals
+
+[40:09.04]they're like
+
+[40:10.04]okay we have a
+
+[40:11.04]customer that
+
+[40:12.04]really needs us to do
+
+[40:13.04]these particular things
+
+[40:14.04]we can do some
+
+[40:15.04]of them
+
+[40:16.04]these other ones
+
+[40:17.04]they want us to
+
+[40:18.04]we can't do them at
+
+[40:19.04]all we've turned
+
+[40:20.04]those into evals
+
+[40:21.04]solve it
+
+[40:22.04]I think that's
+
+[40:23.04]really cool
+
+[40:24.04]like everybody knows
+
+[40:25.04]a lot of these evals
+
+[40:26.04]are like
+
+[40:27.04]pretty saturated
+
+[40:28.04]and the new ones
+
+[40:29.04]that even are
+
+[40:30.04]not saturated you look
+
+[40:31.04]at someone and you're
+
+[40:32.04]like is this actually
+
+[40:33.04]and all of this stuff
+
+[40:34.04]but they're very grounded
+
+[40:35.04]and actual needs
+
+[40:36.04]right now
+
+[40:37.04]which is really cool
+
+[40:38.04]yeah this has been
+
+[40:39.04]wonderful dive
+
+[40:40.04]I wish we had more time
+
+[40:41.04]but I'll just leave it
+
+[40:42.04]kind of open to you
+
+[40:43.04]I think you have broad thoughts
+
+[40:44.04]you know just about
+
+[40:45.04]the agent space
+
+[40:46.04]but also just general AI
+
+[40:47.04]space any sort of rants
+
+[40:48.04]or things that
+
+[40:49.04]they're just helping
+
+[40:50.04]might for you right now
+
+[40:51.04]any rants
+
+[40:52.04]minding you
+
+[40:53.04]for just general
+
+[40:54.04]wow okay
+
+[40:55.04]so Amelia's already
+
+[40:56.04]made the rant better
+
+[40:57.04]than I have
+
+[40:58.04]but not just
+
+[40:59.04]not just chatbots
+
+[41:00.04]is like kind of rant one
+
+[41:01.04]but the rant two
+
+[41:02.04]is AI's really been
+
+[41:03.04]the story of compute
+
+[41:04.04]and compute plus data
+
+[41:06.04]and ways in which
+
+[41:07.04]you could change one
+
+[41:08.04]for the other
+
+[41:09.04]and I think as much as
+
+[41:10.04]our research community
+
+[41:11.04]is really smart
+
+[41:12.04]we have made many
+
+[41:13.04]many advancements
+
+[41:14.04]and that's going to
+
+[41:15.04]continue to be important
+
+[41:16.04]but now I think
+
+[41:17.04]the game is
+
+[41:18.04]increasingly changing
+
+[41:19.04]and the rapid
+
+[41:20.04]industrialization
+
+[41:21.04]error has begun
+
+[41:22.04]and I think
+
+[41:23.04]we unfortunately
+
+[41:24.04]have to embrace it
+
+[41:25.04]excellent awesome David
+
+[41:26.04]thank you so much
+
+[41:27.04]for your time
+
+[41:28.04]cool yeah thanks guys
+
+[41:29.04]this was fun
+
+[41:30.04]thank you
+
+[41:31.04]thank you
+
+[41:32.04]thank you
+
+[41:32.04]thank you
+
+[41:33.04]thank you
+
+[41:34.04]thank you
+
+[41:35.04]thank you
+
+[41:36.04]thank you
+
+[41:37.04]thank you
+
+[41:38.04]thank you
+
+[41:39.04]thank you
+
+[41:40.04]thank you
+
+[41:41.04]thank you
+
+[41:42.04]thank you
+
+[41:43.04]thank you
+
+[41:44.04]thank you
+
+[41:45.04]thank you
+
+[41:46.04]thank you
+
+[41:47.04]字幕by索兰娅
+
+[41:49.04]字幕:J Chong
+
+[41:50.04]请不吝点赞 订阅 转发 打赏 打赏
+