[by:whisper.cpp] [00:00.00] (upbeat music) [00:02.58] - Hey everyone, welcome to the Latent Space podcast. [00:07.96] This is Alessio, partner and CTO-in-residence [00:10.40] at Decibel Partners, and I'm joined by my co-host, [00:12.66] swyx, founder of Smol AI. [00:14.60] - Hey, and today we have in the studio [00:16.56] Erik Bernhardsson from Modal, welcome. [00:18.56] - Hi, it's awesome being here. [00:20.64] - Yeah, awesome seeing you in person. [00:22.40] I've seen you online for a number of years [00:24.96] as you were building on Modal. [00:26.28] And I think you're just making a San Francisco trip [00:29.16] just to see people here, right? [00:31.04] I've been to two Modal events in San Francisco here. [00:33.48] - Yeah, that's right, we're based in New York, [00:34.84] so I figured sometimes I have to come out to [00:37.72] the capital of AI and make a presence. [00:40.20] - What do you think are the pros and cons [00:41.94] of building in New York? [00:43.44] - I mean, I never built anything elsewhere. [00:45.32] Like I lived in New York the last 12 years. [00:47.60] I love the city, obviously there's a lot more stuff [00:49.80] going on here and there's a lot more customers [00:51.16] and that's why I'm out here. [00:52.52] I do feel like for me where I'm in life, [00:54.32] like I'm a very boring person, [00:55.84] like I kind of work hard and then I go home [00:57.76] and hang out with my kids, like I don't have time [01:00.60] to go to like events and meetups and stuff anyway. [01:03.28] So in that sense, like New York is kind of nice. [01:04.92] Like I walk to work every morning, [01:06.52] it's like five minutes away from my apartment. [01:07.88] It's like very time efficient in that sense. [01:09.84] - Yeah, yeah. [01:11.28] Sounds like a good life. [01:12.44] So we'll do a brief bio and then we'll talk about [01:14.76] anything else that people should know about you. [01:16.64] Actually, I was surprised to find out you're from Sweden.
[01:19.28] You went to college at KTH. [01:21.32] - Yeah, yeah. [01:22.60] - And your master's was in implementing [01:24.36] a scalable music recommender system. [01:26.16] - Yeah. - I had no idea. [01:27.28] - Yeah, yeah, yeah. [01:28.12] So I actually studied physics, [01:29.20] but I grew up coding and I did a lot of programming [01:31.24] competitions and then as I was like thinking about, [01:33.48] you know, graduating, I got in touch with an obscure [01:36.60] music streaming startup called Spotify, [01:39.32] which was then like 30 people. [01:40.72] And for some reason I convinced them like, [01:42.08] why don't I just come and like write a master's thesis [01:44.00] with you and like I'll do some cool collaborative [01:45.44] filtering despite not knowing anything [01:46.76] about collaborative filtering really, I sort of, you know, [01:48.36] but no one knew anything back then. [01:49.76] So I spent six months at Spotify, [01:51.80] basically building a prototype of a music recommendation [01:54.60] system and then turned that into a master's thesis. [01:56.80] - Yeah. [01:57.64] - And then later when I graduated, [01:58.60] I joined Spotify full time. [02:00.16] - Yeah, yeah. [02:01.00] So that was the start of your data career. [02:02.96] You also wrote a couple of popular open source tools [02:06.16] while you were there. [02:07.32] And then you joined, is that correct or? [02:09.28] - No, that's right. [02:10.12] I mean, I was at Spotify for seven years. [02:11.32] This is a long stint and Spotify was a wild place [02:13.92] early on. [02:14.76] I mean, the data space was also a wild place. [02:16.48] I mean, it was like Hadoop cluster in the like [02:18.56] foosball room on the floor. [02:20.24] There's a lot of crude, like very basic infrastructure [02:22.92] and I didn't know anything about it. [02:24.56] And like I was hired to kind of figure out data stuff.
[02:27.88] And I started hacking on a recommendation system [02:31.20] and then, you know, got sidetracked [02:33.12] and a bunch of other stuff. [02:33.96] I fixed a bunch of reporting things [02:35.44] and set up A/B testing and started doing like [02:37.36] business analytics and later got back [02:38.84] to music recommendation system. [02:40.04] And a lot of the infrastructure didn't really exist. [02:42.04] Like there was like Hadoop back then, [02:43.52] which is kind of bad and I don't miss it, [02:46.20] but spent a lot of time with that. [02:48.20] As a part of that, I ended up building a workflow engine [02:50.76] called Luigi, which is like briefly like somewhat like [02:53.24] widely ended up being used by a bunch of companies. [02:56.20] Sort of like, you know, kind of like Airflow, [02:57.64] but like before Airflow, [02:59.04] I think it did some things better, some things worse. [03:01.36] I also built a vector database called annoy, [03:02.88] which is like for a while, it was actually [03:04.28] quite widely used in 2012. [03:06.00] So it's like way before like all this like vector database [03:08.48] stuff ended up happening. [03:09.76] And funny enough, I was actually obsessed [03:11.40] with like vectors back then. [03:12.52] Like I was like, this is gonna be huge. [03:13.72] Like just give it like a few years. [03:15.56] I didn't know it was gonna take like nine years. [03:17.12] And then it's gonna suddenly be like 20 startups [03:18.96] doing vector databases in one year. [03:20.72] So it did happen in that sense. [03:21.96] I was right. [03:22.80] I was glad I didn't start a startup [03:23.76] in the vector database space. [03:25.32] I would have started way too early. [03:26.92] But yeah, that was, yeah, this was a fun [03:29.24] seven years of Spotify, it was a great culture, [03:31.12] a great company. 
[03:31.96] - Yeah, just to take a quick tangent [03:33.68] on this vector database thing, [03:34.76] 'cause we probably won't revisit it, [03:36.20] but like, has anything architecturally changed [03:38.32] in the last nine years? [03:39.88] Or... [03:40.72] (laughing) [03:42.80] - I mean, sort of like, I'm actually not following [03:44.68] like super like closely. [03:46.20] I think, you know, some of the best algorithms [03:48.92] are still the same as like hierarchical [03:50.60] navigable small world or whatever. [03:52.48] - Exactly. [03:53.32] Yeah, HNSW. [03:54.66] I think now there's like product quantization. [03:56.68] There's like some other stuff [03:57.52] that I haven't really followed super closely. [03:59.24] I mean, obviously like back then it was like, [04:00.64] you know, Annoy was like very simple. [04:01.88] It's like a C++ library with Python bindings [04:04.36] and you can mmap big files into memory [04:07.04] and like do some lookups. [04:08.12] And I used like this kind of recursive, [04:10.72] like hyperplane splitting strategy, [04:12.88] which is not that good, [04:14.18] but it sort of was good enough at that time. [04:16.12] But I think a lot of like HNSW is still like [04:18.76] what people generally use. [04:20.24] Now of course like databases are much better [04:22.76] in the sense like they support like insertion, updates [04:24.88] and stuff like that. [04:25.72] And I never supported that. [04:26.84] Yeah, this is sort of exciting to finally see [04:28.48] like vector databases becoming a thing. [04:30.36] - Yeah, yeah. [04:31.32] And then maybe one takeaway on the most interesting lesson [04:34.44] from Daniel Ek. [04:35.28] - I mean, I think Daniel, like, you know, [04:38.72] he started Spotify when he was young. [04:40.08] Like he was like 25, something like that. [04:42.40] And I was like a good lesson. [04:43.24] But like he, in a way, like, I think he was a very good leader.
[04:46.64] Like there's anything like, and those scandals are like, [04:49.24] no, he wasn't very eccentric at all. [04:50.88] He was just kind of like very like level headed, [04:53.24] like just like ran the company very well, [04:54.92] like never made any like obvious mistakes or, [04:57.08] I think there were like a few bets that maybe like in hindsight [04:59.16] were like a little, you know, like took us, you know, [05:01.64] too far in one direction or another. [05:03.08] But overall, I mean, I think he was a great CEO, [05:05.32] like definitely, you know, up there, like generational CEO, [05:08.48] at least for like Swedish startups. [05:09.96] - Yeah, yeah, for sure. [05:11.64] Okay, we should probably move to make our way towards Modal. [05:14.24] So then you spent six years as CTO of Better. [05:17.56] - Yeah. [05:18.40] - As a CTO engineer and then you scaled up [05:19.96] to like 300 engineers. [05:21.52] - I joined as a CTO when there was like no tech team. [05:23.84] And yeah, that was a wild chapter in my life. [05:25.80] Like the company did very well for a while. [05:28.20] And then like during the pandemic. [05:29.52] - That's well. [05:30.36] - Yeah, it was kind of a weird story. [05:31.20] But yeah, it kind of collapsed. [05:32.12] And then they actually went public for you. [05:33.24] - Laid off people poorly. [05:34.64] - Yeah, yeah, it was like a bunch of stories. [05:36.36] Yeah, I mean, the company like grew from like 10 people [05:38.92] when I joined to 10,000, now it's back to a thousand. [05:40.80] And yeah, they actually went public a few months ago. [05:42.72] It's kind of crazy. [05:43.56] They're still around, like, you know, [05:44.40] they're still, you know, doing stuff. [05:46.08] So yeah, very kind of interesting six years of my life [05:49.20] for non-technical reasons, mostly like, [05:51.04] but yeah, like I managed like 300, 400, [05:52.36] - Management, scaling.
[05:53.20] - Yeah, like learning a lot of that, [05:54.20] like recruiting, I spent all my time recruiting [05:55.88] and stuff like that. [05:56.72] And so managing at scale, it's like nice. [05:59.28] Like now in a way, like when I'm building my own startup, [06:01.20] like that's actually something I like don't feel [06:03.04] nervous about at all. [06:03.88] Like I've managed at scale. [06:04.76] Like I feel like I can do it again. [06:06.28] It's like very different things that I'm nervous about [06:07.76] as a startup founder. [06:09.08] But yeah, I started Modal three years ago [06:10.52] after sort of, after leaving Better, [06:12.20] I took a little bit of time off during the pandemic. [06:14.12] And but yeah, pretty quickly I was like, [06:16.16] I got to build something. [06:17.04] I just want to, you know, [06:18.16] and then yeah, Modal took form in my head, took shape. [06:21.88] - And as far as I understand, [06:23.12] and maybe we can sort of trade off questions. [06:24.92] So the quick history is started Modal in 2021, [06:27.84] got your seed with Sarah from Amplify in 2022. [06:30.64] Last year you just announced your series A with Redpoint. [06:32.92] - That's right. [06:33.76] - And that brings us up to mostly today. [06:36.24] Most people I think were expecting you [06:37.76] to build for the data space. [06:39.84] - But it is the data space. [06:40.92] - It is the data space. [06:42.64] When I think of data space, [06:43.48] so I come from like Snowflake, BigQuery, [06:46.20] you know, Fivetran, Airbyte and that kind of stuff. [06:48.12] - Yeah. [06:48.96] - And what Modal became [06:51.04] is more general purpose than that. [06:52.76] - Yeah, yeah. [06:54.24] I don't know, it was like fun. [06:55.20] I actually ran into like Edo Liberty, [06:56.64] the CEO of Pinecone like a few weeks ago. [06:58.16] And he was like, I was so afraid [06:59.68] you were building a vector database.
[07:01.40] (laughing) [07:02.64] No, I started Modal because, you know, [07:05.32] like in a way like I work with data, [07:06.92] like throughout most of my career, [07:08.36] like every different part of the stack, right? [07:10.32] Like I've done everything from like business analytics [07:12.76] to like deep learning, you know, [07:14.72] like building, you know, [07:15.92] training neural networks at scale like, [07:17.84] like everything in between, right? [07:19.04] And so one of the thoughts, [07:20.36] like in one of the observations I had [07:21.84] when I started Modal or like why I started was like, [07:23.96] I just wanted to make, [07:25.04] build better tools for data teams. [07:26.52] And like very, like that's sort of abstract thing. [07:28.68] But like, I find that the data stack is, you know, [07:31.24] full of like point solutions that don't integrate well. [07:33.96] And still when you look at like data teams today, [07:36.32] you know, like every startup ends up building [07:38.16] their own internal Kubernetes wrapper, whatever. [07:40.84] And, you know, all the different data engineers [07:42.72] and machine learning engineers [07:43.56] end up kind of struggling with the same things. [07:45.88] So I started to think about like, [07:46.92] how do I build a new data stack, [07:49.40] which is kind of a megalomaniac project? [07:51.08] Like, because you kind of wanted to like [07:52.56] throw out everything. [07:53.40] - It's over almost a modern data stack. [07:55.28] (laughing) [07:56.12] - Yeah, like a post-modern data stack. [07:58.40] And so I started to think about that. [08:00.08] And a lot of it came with like, [08:01.08] like more focus on like the human side of like, [08:02.68] how do I make data teams more productive? [08:04.08] And like, what is the technology tools that they need? [08:06.24] And like, you know, drew out a lot of charts [08:08.44] of like, how the data stack looks, [08:09.76] you know, what are the different components?
[08:11.28] And this show is actually very interesting, [08:12.44] like workflow scheduling, [08:13.36] 'cause it kind of sits in like a nice sort of, [08:15.32] you know, it's like a hub in the graph of like data products. [08:18.24] But it was kind of hard to like kind of do that in a vacuum [08:21.32] and also to monetize it to some extent. [08:22.84] And I got very interested in like the layers below [08:25.44] at some point. [08:26.28] And like, at the end of the day, [08:28.20] like most people have code that has to run somewhere. [08:31.04] So I think about like, [08:31.88] okay, well, how do you make that nice? [08:34.04] Like, how do you make that? [08:35.04] And in particular, like the thing I always like [08:36.44] thought about like developer productivity is like, [08:38.00] I think the best way to measure developer productivity [08:40.56] is like in terms of the feedback loops. [08:41.72] Like how quickly when you iterate, like when you write code, [08:44.68] like how quickly can you get feedback? [08:46.04] And at the innermost loop, [08:46.96] it's like writing code and then running it. [08:48.84] And like, as soon as you start working with the cloud, [08:50.80] like it's like, takes minutes suddenly [08:52.60] 'cause you have to build a Docker container [08:53.88] and push it to the cloud and like run it, you know. [08:55.68] So that was like the initial focus for me. [08:57.52] It was like, I just want to solve that problem. [08:59.16] Like I want to, you know, build something that lets [09:01.80] you run things in the cloud and like retain this sort of, [09:03.64] you know, the joy of productivity [09:05.92] as when you're running things locally. [09:07.52] And in particular, I was quite focused on data teams [09:09.56] 'cause I think they had a couple of unique needs [09:11.84] that weren't well served by the infrastructure at that time [09:14.36] or like still isn't like, in particular, like Kubernetes.
[09:16.92] I feel like it's like kind of worked okay for backend teams, [09:19.68] but not so well for data teams. [09:21.16] And very quickly, I got sucked into like a very deep [09:23.04] like rabbit hole of like- [09:24.00] - Not well for data teams because of burstiness. [09:25.80] - Yeah, for sure. [09:26.64] So like burstiness is like one thing, right? [09:28.08] Like, you know, like you often have this like fan out. [09:30.20] You want to like apply some function [09:31.52] over very large assets. [09:32.84] Another thing tends to be like hardware requirements. [09:34.68] Like you need like GPUs. [09:35.68] And like I've seen this with many companies. [09:37.24] Like you go, you know, the data scientists [09:38.76] go to a platform team and they're like, [09:39.92] can we add GPUs to the Kubernetes? [09:41.48] They're like, no, like that's, you know, complex. [09:43.64] We're not gonna, or like, so like just getting GPU access. [09:46.20] And then like, I mean, I also like data code. [09:48.28] Like frankly, or like machine learning code, [09:50.24] like tends to be like super annoying [09:52.48] in terms of like environments. [09:53.52] Like you have enough having like a lot of like custom [09:55.56] like containers and like environment conflicts. [09:58.08] And like it's very hard to set up like a unified container [10:01.56] that like can serve like a data scientist. [10:03.92] Because like there's always like packages that break. [10:05.76] And so I think there's a lot of different reasons [10:07.84] why the technology wasn't well suited for backend. [10:11.44] And I think the attitude at that time was often like, [10:13.28] you know, like you had friction [10:14.80] between the data team and the platform team. [10:16.36] Like, well, it works for the backend stuff. [10:18.24] You know, why don't you just like, you know, make it work. 
[10:20.24] But like, I actually felt like data teams, you know, [10:22.32] or at this point now, like there's so much, [10:24.84] so many people working with data and like they, [10:26.36] to some extent like deserve their own tools [10:28.08] and their own tool chains. [10:28.92] And like optimizing for that is not something [10:31.04] people have done. [10:31.88] So that's sort of like very abstract, [10:33.72] philosophical reason why I started Modal. [10:35.12] And then, and then I got sucked into like rabbit hole [10:37.04] of like container cold start and, you know, like whatever, [10:40.20] Linux page cache, you know, file system optimizations. [10:43.44] - Yeah, tell people, I think the first time I met you, [10:46.00] I think you told me some numbers, but I don't remember. [10:48.08] Like what are the main achievements that you were unhappy [10:50.24] with the status quo and then you built your own container [10:52.32] stack as well? [10:53.16] - Yeah, I mean, like in particular it was like, [10:54.40] in order to have that loop, right? [10:56.08] You want to be able to start like take code on your laptop, [10:59.32] whatever and like run in the cloud very quickly [11:01.24] and like running in custom containers [11:02.60] and maybe like spin up like a hundred containers, [11:04.16] a thousand, you know, things like that. [11:05.48] And so container cold start was the initial, [11:07.64] like from like a developer productivity point of view, [11:09.40] it was like really what I was focusing on is, [11:11.92] I want to take code, I want to stick it in container, [11:13.52] I want to execute in the cloud and like, you know, [11:14.96] make it feel like fast.
[11:16.40] And when you look at like how Docker works for instance, [11:18.60] like Docker, you have this like fairly convoluted, [11:21.00] like very resource inefficient way, they, you know, [11:23.60] you build a container, you upload the whole container [11:25.68] and then you download it and you run it. [11:27.68] And Kubernetes is also like not very fast [11:29.40] at like starting containers. [11:30.24] So like I started kind of like, you know, [11:31.92] going a layer deeper like Docker is actually like, you know, [11:34.00] there's like a couple of different primitives, [11:35.08] but like a lower level primitive is runc, [11:36.96] which is like a container runner. [11:38.44] And I was like, what if I just take the container runner, [11:40.80] like runc and I point it to like my own root file system [11:44.52] and then I build like my own virtual file system [11:46.44] that exposes files over the network instead. [11:49.68] And that was like the sort of very crude version of Modal. [11:51.40] It's like, now I can actually start containers very quickly [11:54.06] because it turns out like when you start a Docker container, [11:56.28] like first of all, like most Docker images [11:58.72] are like several gigabytes. [11:59.76] And like 99% of that is never going to be consumed. [12:02.36] Like there's a bunch of like, you know, [12:03.80] like time zone information for like Uzbekistan, [12:06.16] whatever, like no one's going to read it. [12:07.92] And then there's a very high overlap [12:09.60] between the files that are going to be read. [12:10.64] There's going to be like libtorch or whatever, [12:12.04] like it's going to be read. [12:12.88] So you can also cache it very well. [12:14.16] So that was like the first sort of stuff we started working on [12:16.48] was like, let's build this like container file system. [12:19.60] And, you know, a couple of like, you know, [12:21.24] just using runc directly.
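
[Note on the file system idea just described] A minimal sketch of lazy, content-addressed file loading, assuming an in-memory dictionary stands in for the network store. The names `RemoteStore`, `LazyImage`, and `build_image` are hypothetical and are not taken from Modal's implementation; the sketch only illustrates why fetching files on first read, and caching them by content hash, avoids downloading a whole multi-gigabyte image.

```python
import hashlib


class RemoteStore:
    """Hypothetical stand-in for a blob store reached over the network."""

    def __init__(self):
        self._blobs = {}
        self.fetches = 0  # counts simulated network round trips

    def put(self, data: bytes) -> str:
        # Address each blob by the hash of its content, so identical
        # files in different images share one entry.
        digest = hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data
        return digest

    def get(self, digest: str) -> bytes:
        self.fetches += 1
        return self._blobs[digest]


class LazyImage:
    """A container image represented as a manifest of path -> content hash."""

    def __init__(self, manifest, store, cache):
        self.manifest = manifest
        self.store = store
        self.cache = cache  # shared across images on the same host

    def read(self, path: str) -> bytes:
        digest = self.manifest[path]
        if digest not in self.cache:
            # Only files that are actually read are ever fetched.
            self.cache[digest] = self.store.get(digest)
        return self.cache[digest]


def build_image(files, store):
    """Upload file contents and return the manifest for the image."""
    return {path: store.put(data) for path, data in files.items()}


store = RemoteStore()
shared_cache = {}
libtorch = b"pretend this is libtorch"

image_a = LazyImage(
    build_image(
        {
            "/lib/libtorch.so": libtorch,
            "/usr/share/zoneinfo/Asia/Tashkent": b"tz data nobody reads",
        },
        store,
    ),
    store,
    shared_cache,
)
image_b = LazyImage(
    build_image(
        {"/lib/libtorch.so": libtorch, "/app/main.py": b"print('hi')"},
        store,
    ),
    store,
    shared_cache,
)

# Both images read the same library; the time zone file is never touched.
image_a.read("/lib/libtorch.so")
image_b.read("/lib/libtorch.so")
```

In this toy setup the second image starts without fetching the shared library again, and the unread time zone file never crosses the simulated network.
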
[12:22.76] And that actually enabled us to like get to this point [12:25.16] of like, you write code and then you can launch it in the cloud [12:27.80] within like a second or two, like something like that. [12:30.08] And, you know, there's been many optimizations since then, [12:32.16] but that was sort of a starting point. [12:33.44] - Can we talk about the developer experience as well? [12:36.76] I think one of the magic things about Modal is [12:39.60] at the very basic layer, it's like a Python function decorator, [12:42.84] it's just like stub and whatnot, [12:45.00] but then you also have a way to define a full container. [12:48.00] What were kind of the design decisions that went into it? [12:50.12] Where did you start? [12:51.08] How easy did you want it to be? [12:52.44] And then maybe how much complexity did you then [12:55.04] add on to make sure that every use case fit? [12:57.16] - I mean, Modal, I almost feel like it's like [12:58.72] almost like two products kind of glued together. [13:00.80] Like there's like the low level like container runtime, [13:02.96] like file system and all that stuff, like in Rust. [13:04.52] And then there's like the Python SDK, right? [13:06.28] Like how do you express applications? [13:07.88] And I think, I mean, swyx, like, [13:09.52] I think your blog was like the self-provisioning runtime [13:11.44] was like to me always like to sort of, [13:12.60] for me like an eye-opening thing. [13:13.80] It's like, so I didn't think about like. [13:15.04] - You wrote your post four months before me. [13:16.88] - Yeah? [13:17.72] - The Software 2.0, Infra 2.0. [13:20.16] - Yeah, well, I don't know. [13:21.00] Like convergence of minds, like. [13:22.20] - I guess we're like both thinking, maybe you put, [13:25.32] I think better words than like, you know, [13:26.96] maybe something I was like thinking about for a long time. [13:28.96] - Yeah, and I can tell you how I was thinking about it [13:30.72] on my own, but I wanna hear it.
[13:31.56] - Yeah, yeah, I would love it. [13:32.88] And like to me, like what I always wanted to build was like, [13:35.52] I don't know, like I don't know if you use like Pulumi. [13:37.32] Like Pulumi is like nice, like in the sense, [13:38.72] like it's like Pulumi is like, [13:39.96] you describe infrastructure in code, right? [13:42.24] And to me, that was like so nice. [13:44.12] Like finally I can like, you know, put a for loop [13:46.44] that creates S3 buckets or whatever. [13:48.36] And I think like Modal sort of goes one step further [13:50.56] in the sense that like, what if you also put the app code [13:53.12] inside the infrastructure code and like glue it all together [13:55.56] and then like you only have one single place [13:57.00] that defines everything. [13:58.08] And it's all programmable. [13:59.44] It don't have any config files. [14:00.84] Like Modal has like zero config, there's no config. [14:02.96] It's all code. [14:03.80] And so that was like the goal that I wanted, like part of that. [14:06.28] And then the other part was like, [14:07.44] I often find that so much of like my time was spent [14:09.92] on like the plumbing between containers. [14:13.12] And so my thinking was like, well, [14:14.68] if I just build this like Python SDK [14:16.92] and make it possible to like bridge like different containers [14:19.96] just like a function call. [14:20.88] Like, and I can say, oh, this function runs in this container. [14:23.96] And this other function runs in this container. [14:25.72] And I can just call it just like a normal function. [14:28.16] Then, you know, I can build these applications [14:30.36] that may span a lot of different environments. [14:32.52] Maybe they fan out, start other containers. [14:34.68] But it's all just like inside Python. [14:36.32] You just like have this beautiful kind of nice like DSL [14:38.88] almost for like, you know, [14:40.32] how to control infrastructure in the cloud. 
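
[Note on the SDK idea just described] A toy sketch of declaring infrastructure next to application code with a function decorator, assuming everything runs in one local process. `ToyApp`, its `function` decorator, and the `image` and `gpu` parameters are hypothetical names invented for illustration; this is not Modal's SDK, and a real platform would dispatch each call to a separate container instead of running it inline.

```python
import functools


class ToyApp:
    """Hypothetical registry that pairs functions with the environment they ask for."""

    def __init__(self):
        self.registry = {}  # function name -> requested environment
        self.calls = []     # log of (function name, image, gpu) per call

    def function(self, image="default", gpu=None):
        def decorator(fn):
            self.registry[fn.__name__] = {"image": image, "gpu": gpu}

            @functools.wraps(fn)
            def wrapper(*args, **kwargs):
                # A real system would serialize the arguments and run the
                # function in a container built from `image`; here we only
                # record where it would have run and call it locally.
                self.calls.append((fn.__name__, image, gpu))
                return fn(*args, **kwargs)

            return wrapper

        return decorator


app = ToyApp()


@app.function(image="python-slim")
def tokenize(text):
    return text.split()


@app.function(image="torch-cuda", gpu="A100")
def embed(tokens):
    # Placeholder for GPU work: map each token to its length.
    return [len(token) for token in tokens]


def pipeline(text):
    # Crossing the "container boundary" looks like a normal function call.
    return embed(tokenize(text))


result = pipeline("hello modal world")
```

The point of the design is that the environment declaration and the business logic live in the same file, so there is no separate config to keep in sync with the code.
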
[14:42.32] So that was sort of like how we ended up [14:44.12] with the Python SDK as it is, [14:45.72] which is still evolving all the time. [14:47.04] By the way, we keep changing syntax quite a lot. [14:48.56] 'Cause I think it's still somewhat exploratory. [14:51.40] But we're starting to converge on something [14:52.56] that feels like reasonably good now. [14:54.48] - Yeah, along the way, you, with this expressiveness, [14:57.52] you enabled the ability to, for example, [15:00.24] attach a GPU to a function. [15:01.76] - Totally, yeah. [15:02.60] It's like, you just like say, you know, [15:03.68] on the function decorator, you're like GPU equals, [15:05.72] you know, A100 and then, or like GPU equals, you know, [15:09.32] A10 or a T4 something like that. [15:10.76] And then you get that GPU and like, you know, [15:12.20] you just run the code and it runs. [15:13.32] Like you don't have to, you know, [15:15.12] go through hoops to, you know, [15:16.48] start an EC2 instance or whatever. [15:18.60] - Yeah. [15:19.44] - So it's all code. [15:20.28] - Yeah, so on my end, the reason I wrote [15:21.92] self-provisioning runtimes was I was working at AWS [15:24.80] and we had AWS CDK, which is kind of like, you know, [15:27.72] the Amazon Basics Pulumi. [15:29.00] - Yeah, totally. [15:29.84] - And then, and then like, it creates, [15:32.00] it compiles to CloudFormation. [15:33.56] - Yeah. [15:34.40] - And then on the other side, you have to like, [15:35.24] get all the config stuff and then put it [15:36.80] into your application code and make sure that they line up. [15:39.68] So then you're writing code to define your infrastructure, [15:42.72] then you're writing code to define your application. [15:44.52] And I was just like, this is like, [15:46.16] obvious that it's going to converge, right? [15:47.48] - Yeah, totally. [15:48.72] But isn't there like, I might be wrong, [15:50.48] but like, was it like SAM or Chalice or one of those?
[15:52.88] Like, isn't that like an AWS thing [15:54.40] where actually they kind of did that? [15:56.44] I feel like there's like one problem. [15:57.28] - SAM, yeah, yeah, yeah, yeah, yeah. [15:59.52] Still very clunky. [16:00.48] - Okay. [16:01.32] - It's not as elegant as Modal. [16:02.80] - I love AWS for like, the stuff it's built, [16:05.60] you know, like historically in order for me to like, [16:07.56] you know, what it enables me to build. [16:09.20] But like, AWS has always like struggled [16:10.64] with developer experience. [16:11.68] Like, and that's big. [16:13.36] I mean, they have to not break things. [16:15.32] - Yeah, yeah, and totally. [16:16.40] And they have to, you know, [16:17.36] build products for a very wide range of use cases. [16:20.32] And I think that's hard. [16:21.16] - Yeah, yeah, so it's easier to design for. [16:23.36] Yeah, so anyway, I was pretty convinced [16:25.84] that this would happen. [16:26.80] I wrote that thing. [16:27.84] And then, you know, imagine my surprise [16:29.12] that you guys had it on your landing page at some point. [16:31.60] - Yeah. [16:32.44] - I think Akshat was just like-- [16:33.28] - Oh, is that it? - Just throw that in there. [16:34.40] - Did you trademark it? [16:35.56] - No, but I definitely got sent a few pitch decks [16:38.44] with my post on there. [16:39.56] - Nice. - And it was like, [16:40.40] really interesting. [16:41.36] This is my first time like, [16:42.40] kind of putting a name to a phenomenon. [16:43.92] - Yeah. - And I think [16:44.76] this is a useful skill for people [16:46.08] to just communicate what they're trying to do. [16:47.72] - Yeah, no, I think it's a beautiful concept, yeah. [16:49.96] - Yeah, yeah. [16:50.92] But obviously you implemented it. [16:52.48] What became more clear in your explanation today [16:55.00] is that actually you're not that tied to Python.
[16:57.08] - No, I mean, I think that all the lower level stuff [16:59.96] is, you know, just running containers [17:01.80] and like scheduling things and, you know, [17:04.04] serving container data and stuff. [17:05.24] So, like one of the benefits of data teams is obviously like, [17:07.48] they're all like using Python, right? [17:09.04] So that made it a lot easier. [17:10.56] I think, you know, if we had to focus on other workloads, [17:13.28] like, you know, for various things, [17:14.40] like we've like been kind of like half thinking [17:16.56] about like CI or like things like that. [17:18.36] But like, anyway, that's like harder [17:20.04] 'cause like you also, then you have to build like, [17:21.84] you know, multiple SDKs, whereas, you know, [17:25.04] focusing on data teams, you can only, you know, [17:26.80] Python like covers like 95% of all teams. [17:29.20] That made it a lot easier. [17:30.04] But like, I mean, like definitely like in the future, [17:31.60] we can add other support, like supporting other languages. [17:34.08] JavaScript for sure is the obvious next language. [17:37.12] But, you know, who knows, like, you know, Rust, Go, R, [17:40.56] whatever, PHP, Haskell, I don't know. [17:42.44] - You know, I think for me, I actually am a person [17:45.56] who like kind of liked the idea [17:47.72] of programming language advancements [17:49.88] being improvements in developer experience. [17:52.56] But all I saw out of the academic sort of PLT type people [17:56.20] is just type level improvements. [17:58.20] And I always think like, for me, like one of the core reasons [18:00.96] for self-provisioning runtimes and then why I like Modal, [18:03.12] it's like, this is actually a productivity increase. [18:05.72] - Totally. [18:06.56] It's a language level thing, you know, [18:07.96] you managed to stick it on top of an existing language, [18:10.08] but it is your own language. [18:11.44] - Yeah. [18:12.28] - DSL on top of Python. [18:13.12] - Yeah.
[18:13.96] - And so language level increase on the order [18:15.28] of like automatic memory management, you know, [18:17.28] you could sort of make that analogy that like, [18:19.48] maybe you lose some level of control, [18:21.44] but most of the time you're okay [18:22.84] with whatever Modal gives you. [18:24.44] And like, that's fine. [18:25.44] - Yeah, I mean, that's how I look at it too. [18:28.04] Like, you know, you look at developer productivity [18:29.68] over the last number of decades, like, you know, [18:31.76] it's come in like small increments of like, you know, [18:34.28] dynamic typing or like, it's like one thing, [18:36.24] it's not suddenly like for a lot of use cases, [18:37.68] you don't even care about type systems [18:39.04] or better compiler technology or like, you know, [18:41.52] the cloud or like, you know, relational databases. [18:43.64] And, you know, I think, you know, [18:44.88] you look at like that, you know, history, [18:47.08] it's a steadily, you know, it's like, you know, [18:49.84] the developers have been getting like probably 10x [18:52.24] more productive every decade for the last four, [18:55.28] four decades or something. [18:56.12] It was kind of crazy, like on an exponential scale, [18:58.00] we're talking about 10x or is there a 10,000x, [19:00.76] like, you know, improvement in developer productivity? [19:02.40] What we can build today, you know, is arguably like, [19:05.00] you know, a fraction of the cost of what it, you know, [19:06.56] took to build it in the 80s. [19:07.92] Maybe it wasn't even possible in the 80s. [19:09.28] So that, to me, like, that's like so fascinating. [19:11.40] I think it's going to keep going for the next few decades. [19:13.68] - Yeah, yeah. [19:14.88] - Another big thing in the Infra 2.0 wishlist [19:17.56] was truly serverless infrastructure. [19:19.92] The other, on your landing page, [19:21.32] you called them native cloud functions, [19:23.44] something like that.
[19:24.76] I think the issue I've seen with serverless [19:26.88] has always been people really wanted it to be stateful, [19:30.00] even though state less was much easier to do. [19:32.36] And I think now with AI, [19:34.08] most model inference is like state less, [19:36.68] you know, outside of the context. [19:37.92] So that's kind of made it a lot easier to just put a model, [19:41.24] like a AI model on model to run. [19:44.60] How do you think about how that changes, [19:46.24] how people think about infrastructure too? [19:48.20] - Yeah, I mean, I think model is definitely going [19:50.24] in the direction of like doing more stateful things [19:52.20] and working with data and like high IO use cases. [19:55.16] I do think one like massive serendipitous thing [19:57.72] that happened like halfway, you know, [19:59.20] a year and a half into like the, you know, [20:00.88] building model was like gen AI started exploding. [20:03.20] And the IO pattern of gen AI, [20:05.04] it's like fits the serverless model like so well, [20:07.88] because it's like, you know, [20:09.28] you send this tiny piece of, like a prompt, right? [20:11.68] Or something like that. [20:12.64] And then like you have this GPU [20:13.92] that does like trillions of flops. [20:15.68] And then it sends back like a tiny piece of information, [20:17.88] right? [20:18.72] And that turns out to be something like, you know, [20:19.80] if you can get serverless working with GPU, [20:21.84] that just like works really well, right? [20:23.68] So I think from that point of view, like serverless always, [20:26.08] to me, felt like a little bit of like a solution [20:27.88] looking for a problem. [20:28.92] I don't actually like don't think like backend [20:30.64] is like the problem that needs sort of it. [20:32.72] Or like not as much, but I look at data [20:34.56] and in particular like things like gen AI, [20:35.84] like model inference, like it's like clearly a good fit. 
[20:38.44] So I think that is, you know, to a large extent, [20:41.48] explains like why we saw, you know, [20:43.24] the initial sort of like killer app [20:45.44] for Modal being model inference, [20:47.24] which actually wasn't like necessarily what we were focused on, [20:49.60] but that's where we've seen like by far [20:51.00] the most usage and growth. [20:52.36] - And this was before you started offering like fine-tuning [20:55.48] or language models. [20:56.32] It was mostly Stable Diffusion. [20:58.92] - Yeah, yeah. [20:59.76] - Like Modal, like I always built it [21:01.16] to be a very general purpose compute platform, [21:03.16] like something where you could run everything. [21:04.28] And I used to call Modal [21:05.12] like a better Kubernetes for data teams for a long time. [21:08.04] What we realized was like, yeah, that's like, you know, [21:10.04] a year and a half in like we barely had any users [21:12.36] or any revenue and like we were like, well, maybe we should look [21:14.84] at like some use, trying to think of use cases. [21:16.60] And that was around the same time Stable Diffusion came out. [21:19.68] And the beauty of Modal is like, [21:21.00] you can run almost anything on Modal, right? [21:23.40] Like model inference turned out to be like the place [21:25.12] where we found initially or like clearly this has like 10x [21:28.24] like better ergonomics than anything else. [21:30.12] But we're also like, you know, [21:31.48] going back to my original vision, [21:32.72] like we're thinking a lot about, you know, not, okay, [21:35.08] now we do inference really well. [21:36.16] Like what about training? [21:37.00] What about fine-tuning? [21:37.84] What about, you know, end-to-end lifecycle deployment? [21:39.48] What about data pre-processing? [21:40.76] What about, you know, I don't know, real-time streaming? [21:43.16] What about, you know, large data munging? [21:45.96] Like there's just data observability.
[21:47.92] I think there's so many things like kind of going back [21:50.60] to what I said about like redefining data stack, [21:52.44] like starting with the foundation of compute. [21:55.28] Like one of the exciting things about model is like, [21:57.24] we've sort of, you know, [21:58.56] we've been working on that for three years [21:59.84] and it's maturing, [22:00.68] but like this is so many things you can do, [22:03.08] like with just like a better compute primitive [22:05.70] and also go up to stack [22:06.64] and like do all this other stuff on top of it. [22:08.84] - Yeah. [22:09.68] How do you think about a rather like, [22:11.52] I would love to learn more about the underlying infrastructure [22:14.20] and like how you make that happen. [22:16.08] Because with fine tuning and training, [22:18.24] it's a static memory. [22:20.04] Like you exactly know what you're gonna load in memory one. [22:22.56] And it's kind of like a set amount of compute [22:24.36] versus inference, just like data is like very bursty. [22:27.48] How do you make batches work [22:29.88] with a serverless developer experience? [22:32.60] You know, like what are like some fun technical challenge [22:35.10] is all to make sure you get max utilization on this GPUs. [22:38.00] What we hear from people is like, we have GPUs, [22:40.40] but we can really only get like, you know, [22:42.28] 30, 40, 50% maybe utilization. [22:45.12] - Yeah. [22:45.96] - What some of the fun stuff you're working on [22:46.90] to get a higher number there. [22:48.28] - Yeah, I think on the inference side, [22:49.60] like that's where like, you know, [22:50.84] like from a cost perspective [22:51.96] and like utilization perspective, [22:53.24] we've seen, you know, like very good numbers. [22:55.12] And in particular, like it's our ability [22:56.48] to start containers and stop containers very quickly. 
[22:59.08] And that means that we can autoscale extremely fast [23:01.84] and scale down very quickly, [23:03.50] which means like we can always adjust the sort of capacity, [23:05.88] the number of GPUs running to the exact traffic volume. [23:09.28] And so in many cases, like that actually leads [23:11.12] to a sort of interesting thing where like, [23:12.24] we obviously run our things on like the public cloud, [23:14.08] like AWS, GCP, we run on Oracle. [23:16.32] But in many cases, like users who do inference [23:19.44] on those platforms or those clouds, [23:21.72] even though we charge a slightly higher price per GPU hour, [23:25.12] a lot of users like moving their large-scale [23:26.68] inference use cases to Modal, [23:27.68] like end up saving a lot of money. [23:29.08] 'Cause we only charge for, like, the time [23:30.70] the GPU is actually running. [23:32.20] And that's a hard problem, right? [23:33.28] Like, you know, if you have to constantly adjust [23:35.36] the number of machines, if you have to start containers, [23:37.20] stop containers, like that's a very hard problem. [23:38.92] And starting containers quickly is a very difficult thing. [23:41.48] I mentioned we had to build our own file system for this. [23:44.28] We also, you know, built our own container scheduler [23:46.48] for that. [23:47.32] We've implemented recently CPU memory checkpointing [23:50.04] so we can take running containers [23:51.40] and snapshot entire CPU, like including registers [23:54.40] and everything and restore it from that point, [23:57.12] which means we can restore it from an initialized state. [23:59.84] We're looking at GPU checkpointing next, [24:01.52] it's like a very interesting thing.
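The cost argument in this answer (a higher price per GPU hour can still come out cheaper when you are billed only while the GPU is running) can be sketched as simple arithmetic. All numbers and function names below are hypothetical and chosen only for illustration; they are not Modal's or any cloud's actual prices.

```python
def always_on_cost(gpus_provisioned: int, hours: float, price_per_gpu_hour: float) -> float:
    """Cost of keeping a fixed fleet running, whether or not it is busy."""
    return gpus_provisioned * hours * price_per_gpu_hour


def usage_billed_cost(busy_gpu_hours: float, price_per_gpu_hour: float) -> float:
    """Cost when you pay only for the GPU hours that actually ran work."""
    return busy_gpu_hours * price_per_gpu_hour


# Hypothetical scenario: 10 GPUs provisioned for a day at 40% utilization.
HOURS = 24
PROVISIONED = 10
UTILIZATION = 0.4
RESERVED_PRICE = 2.0   # assumed $/GPU-hour on an always-on fleet
ELASTIC_PRICE = 2.5    # assumed (higher) $/GPU-hour when billed per use

busy_hours = PROVISIONED * HOURS * UTILIZATION
fixed = always_on_cost(PROVISIONED, HOURS, RESERVED_PRICE)
elastic = usage_billed_cost(busy_hours, ELASTIC_PRICE)

# Usage billing stays cheaper as long as utilization is below this ratio.
break_even_utilization = RESERVED_PRICE / ELASTIC_PRICE

print(f"always-on: ${fixed:.2f}, usage-billed: ${elastic:.2f}")
print(f"break-even utilization: {break_even_utilization:.0%}")
```

Under these assumed prices, the always-on fleet only wins once sustained utilization passes the break-even ratio, which is well above the 30 to 50% range mentioned in the question.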
[24:02.72] So I think with inference stuff, [24:03.84] that's where serverless really shines, [24:06.84] because you can drive, you know, [24:08.32] you can push the frontier of latency [24:10.96] versus utilization quite substantially, [24:13.44] you know, which either ends up being a latency advantage [24:15.44] or a cost advantage or both, right? [24:17.56] Training is probably arguably like less [24:19.20] of an advantage doing serverless, frankly, [24:21.24] 'cause you know, you can just like spin up a bunch of machines [24:23.16] and try to satisfy, like, you know, [24:25.00] train as much as you can on each machine. [24:27.20] For that area, like we've seen like, you know, [24:29.12] arguably like less usage, like for Modal, [24:31.48] but there are always like some interesting use cases. [24:32.76] Like we do have a couple of customers, [24:34.00] like Ramp for instance, like they do fine-tuning with Modal [24:36.00] and they basically like one of the patterns they have [24:38.08] is like very bursty type fine-tuning, [24:39.68] where they fine-tune 100 models in parallel. [24:41.56] And that's like a separate thing [24:42.56] that Modal does really well, right? [24:43.52] Like we can start up 100 containers very quickly, [24:46.12] run a fine-tuning training job on each one of them [24:48.84] that only runs for, I don't know, 10, 20 minutes. [24:51.32] And then, you know, you can do hyperparameter tuning [24:53.28] in that sense, like just pick the best model [24:54.76] and things like that. [24:55.60] So there are like interesting training. [24:56.92] I think when you get to like training [24:58.08] like very large foundation models, [24:59.48] that's a use case we don't support super well [25:01.12] 'cause that's very high IO, you know, [25:03.16] you need to have like InfiniBand and all these things. [25:05.16] And those are things we haven't supported yet [25:07.40] and might take a while to get to that.
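The bursty pattern described here (launch many short fine-tuning jobs in parallel, then keep the best result) can be sketched with the Python standard library alone. This is a minimal illustration, not Modal's API: `fine_tune` is a hypothetical stand-in whose toy loss replaces a real 10 to 20 minute training job, and threads stand in for containers.

```python
from concurrent.futures import ThreadPoolExecutor


def fine_tune(learning_rate: float) -> tuple[float, float]:
    """Hypothetical stand-in for one fine-tuning job.

    Returns (learning_rate, validation_loss). The quadratic below is a toy
    loss curve with its minimum near 3e-4, not a real training run.
    """
    validation_loss = (learning_rate - 3e-4) ** 2
    return learning_rate, validation_loss


def sweep(learning_rates: list[float], max_parallel: int = 100) -> tuple[float, float]:
    """Run every candidate in parallel and return the one with the lowest loss."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        results = list(pool.map(fine_tune, learning_rates))
    return min(results, key=lambda result: result[1])


# 100 candidate learning rates, one job per candidate.
candidates = [i * 1e-5 for i in range(1, 101)]
best_lr, best_loss = sweep(candidates)
print(f"best learning rate: {best_lr:.5f}")
```

On a serverless platform each call to `fine_tune` would run in its own container, so the wall-clock time of the sweep is roughly that of one job rather than one hundred.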
[25:09.32] So that's like probably like an area [25:10.60] where like we're relatively weak in. [25:11.96] - Yeah, have you cared at all [25:13.32] about lower level model optimization? [25:15.76] There's other cloud providers that do custom kernels [25:18.52] to get better performance or are you just [25:20.60] given that you're not just an AI compute company? [25:24.32] - Yeah, I mean, I think like we want to support [25:26.04] like a generic, like general workloads in a sense [25:28.12] that like we want users to give us a container essentially [25:30.24] or code and then we want to run that. [25:32.92] So I think, you know, we benefit from those things [25:36.44] in the sense that like we can tell our users, you know, [25:39.04] to use those things. [25:40.44] But I don't know if we want to like poke into users' containers [25:43.20] and like do those things automatically. [25:44.48] That's sort of, I think a little bit tricky [25:46.08] from the outside to do [25:46.92] 'cause we want to be able to take like arbitrary code [25:49.20] and execute it, but certainly like, you know, [25:51.04] we can tell our users to like use those things. [25:52.80] - Yeah, I may have betrayed my own biases [25:55.84] because I don't really think about Modal [25:57.52] as for data teams anymore. [25:59.76] I think you started AI. [26:00.96] I think you're much more for AI engineers [26:03.08] and my favorite anecdote, which I think you know, [26:06.20] but I don't know if you directly experienced it. [26:08.84] I went through the Vercel AI Accelerator, [26:10.56] which you supported. [26:11.96] And in the Vercel AI Accelerator, [26:13.80] a bunch of startups gave like free credits [26:15.44] and like signups and talks and all that stuff. [26:18.00] The only ones that stuck [26:18.92] are the ones that actually appealed to engineers [26:21.00] and the top tool used by far was Modal. [26:24.56] - That's awesome.
[26:25.40] - For people building with AI apps. [26:27.04] - Yeah, I mean, it might be also like a terminology question, [26:29.32] like the AI versus data, right? [26:30.64] Like I've, you know, maybe I'm just like old and jaded, [26:32.84] but like I've seen so many like different titles. [26:34.92] Like for a while it was like, you know, [26:37.32] I was a data scientist and a machine learning engineer. [26:39.96] And then, you know, there was like analytics engineers [26:42.00] and then it was like AI, you know, so like to me, it's like, [26:44.92] I just like, in my head, that's to me just like data, [26:48.64] or like engineer, you know, like I don't really, [26:50.92] 'cause that's why I've been like, you know, [26:51.96] just calling it data teams. [26:53.56] But like, of course, like, you know, AI is like, you know, [26:56.48] like such a massive fraction of our like workloads. [26:59.48] - It's a different Venn diagram of things you do, right? [27:02.64] So the stuff that you're talking about [27:04.02] where you need like infinity bands [27:05.84] for like highly parallel training, [27:08.00] that's not, that's more of the ML engineer, [27:09.72] that's more of the research scientist. [27:11.12] - Yeah, yeah. [27:11.96] - And less of the AI engineer, [27:13.24] which is more sort of try to work at the application. [27:15.96] - Yeah. [27:16.80] I mean, to be fair to it, like we have a lot of users [27:18.36] that are like doing stuff [27:19.60] that I don't think fits neatly into like AI. [27:22.40] Like we have a lot of people using like [27:23.68] more of a web scraping, like it's kind of nice. [27:25.40] Like you can just like, you know, fire up like a hundred [27:27.92] or a thousand containers running Chromium [27:29.52] and just like render a bunch of web pages [27:30.88] and takes, you know, whatever. 
[27:32.16] Or like, you know, protein folding, is that, [27:35.04] I mean, maybe that's, I don't know, like, [27:36.72] but like, you know, we have a bunch of users doing that [27:38.60] or like, you know, in terms of in the realm of biotech, [27:41.72] like sequence alignment, like people using, [27:43.76] or like a couple of people using like Modal [27:45.68] to run like large, like mixed integer programming problems, [27:48.12] like, you know, using Gurobi or like things like that. [27:50.16] So video processing is another thing that keeps coming up. [27:53.02] Like, you know, let's say you have like petabytes of video [27:55.72] and you want to just like transcode it, [27:56.88] like, or you can fire up a lot of containers [27:58.64] and just like run FFmpeg or like, [28:00.56] so there are those things too. [28:01.68] Like, I mean, like that being said, [28:03.04] like AI is by far our biggest use case, but, [28:05.16] you know, like, again, like Modal [28:06.40] is kind of general purpose in that sense. [28:08.04] - Yeah. [28:08.88] Well, maybe let's stick with the Stable Diffusion thing [28:10.48] and then we'll move on to the other use cases [28:12.12] or AI that you want to highlight. [28:14.28] The other big player in my mind is Replicate. [28:16.96] - Yeah. [28:17.80] - In this era. [28:18.64] They're much more, I guess, custom built for that purpose, [28:21.76] whereas you're more general purpose. [28:23.20] How do you position yourself with them? [28:26.48] Are they just for like different audiences [28:28.12] or are you just head-on competing? [28:29.72] - I think there's like a tiny sliver of the Venn diagram [28:32.48] where we're competitive [28:33.52] and then like 99% of the area we're not competitive. [28:37.16] I mean, I think for people who, [28:39.64] if you think of like front-end engineers, [28:40.68] I think that's where like really they found good fit.
[28:42.28] It's like, you know, people who built a cool web app [28:44.36] and they want some sort of AI capability [28:46.36] and they just, you know, an off the shelf model [28:48.56] is like perfect for them. [28:49.92] That's like, I like use Replicate, that's great, right? [28:52.80] Like, I think where we shine is like custom models [28:55.56] or custom workflows, you know, [28:57.44] running things at very large scale. [28:58.80] We need to care about utilization, care about costs. [29:01.16] You know, we have much lower prices [29:02.92] because we spend a lot more time [29:03.92] optimizing our infrastructure, you know, [29:05.68] and that's where we're competitive, right? [29:06.96] Like, you know, and you look at some of the use cases [29:08.60] like Suno is a big user. [29:10.36] Like, they're running like large scale like AI. [29:12.00] - Oh, we're talking with Mikey in a month. [29:14.00] - Yeah, so I mean, they're using model [29:15.76] for like production infrastructure. [29:16.88] Like they have their own like custom model, [29:18.76] like custom code and custom weights, you know, [29:20.36] for AI generating music, Suno.ai, you know, [29:22.88] that those are the types of use cases [29:24.52] that we like, you know, things that are like very custom [29:26.68] or like it's like, you know, [29:28.24] and those are the things like [29:29.24] it's very hard to run a Replicate, right? [29:30.96] And that's fine. [29:31.80] Like I think they focus on a very different part [29:33.80] of the stack in that sense. [29:35.08] - And then the other company pattern [29:36.80] that I pattern match you to is Modular. [29:39.64] - Is it the names? [29:40.96] - No, no, no. [29:42.20] But yes, the name is very similar. [29:44.16] I think there's something that might be insightful there [29:46.24] from a linguistics point of view. [29:47.60] Oh, no, they have Mojo, the sort of Python SDK. 
[29:50.52] And they have the Modular Inference Engine, [29:52.00] which is their sort of, their cloud stack, [29:54.08] their sort of compute inference stack. [29:56.00] I don't know if anyone's made the comparison to you before, [29:57.96] but like I see you evolving a little bit in parallel there. [30:01.60] - No, I mean, maybe. [30:03.28] Yeah, like it's not a company. [30:04.44] I'm like super like, I mean, [30:06.20] I know the basics, but like, [30:07.72] I guess they're similar in the sense [30:08.88] like they want to do a lot of, [30:10.40] they have sort of big picture vision. [30:12.40] - Yes, they also want to build very general purpose. [30:14.40] And they also are marketing themselves as like, [30:17.72] if you want to do off the shelf stuff, go somewhere else. [30:20.44] If you want to do custom stuff, [30:21.44] who are the best place to do it? [30:22.48] - Yeah, yeah. [30:23.32] - There is some overlap there. [30:24.60] There's not overlap in the sense that you are a close source [30:27.56] platform people have to host their code on you. [30:30.60] Whereas for them, they're very insistent [30:32.24] on not running their own cloud service. [30:34.36] They're a box software. [30:35.48] - Yeah, yeah. [30:36.32] - They're licensed C software. [30:37.16] - I'm sure their VCs at some point [30:38.60] can have forced them to reconsider. [30:40.28] - No, no, Chris is very, very insistent [30:42.44] and very convincing. [30:43.40] (laughing) [30:44.76] So anyway, I would just make that comparison, [30:47.52] let people make the links if they want to. [30:48.88] But it's an interesting way to see the cloud market develop [30:52.04] from my point of view. [30:52.88] 'Cause I came up in this field thinking cloud is one thing. [30:55.88] And I think your vision is like something slightly different. [30:58.40] And I see the different takes on it. [31:00.36] - Yeah, and like one thing I've, you know, [31:02.40] like I've written a bit about it in my blog too. 
[31:04.08] It's like, I think of us as like a second layer [31:06.24] of cloud provider in the sense that like, [31:07.56] I think snowflake is like kind of a good analogy. [31:09.60] Like snowflake, you know, is infrastructure [31:12.08] is a service, right? [31:12.92] But they actually run on like major clouds, right? [31:15.44] And I mean, like you can like analyze this very deeply. [31:18.16] But like one of the things I always thought about is like, [31:19.52] why did snowflake already like win over Redshift? [31:21.64] And I think snowflake, you know, to me, one, [31:24.68] because like, I mean, in the end, [31:25.84] like AWS makes all the money anyway. [31:27.32] Like and like snowflake just had the ability [31:29.76] to like focus on like developer experience [31:32.68] or like, you know, user experience. [31:34.24] And to me, like really proved that you can build [31:36.56] a cloud provider, a layer off from, you know, [31:39.56] the traditional like public clouds. [31:40.76] And in that layer, that's also where I would put modal. [31:44.32] It's like, you know, we're building a cloud provider. [31:45.88] Like, we're, you know, we're like a multi-tenant environment [31:48.52] that runs the user code, [31:49.80] but also building on top of the public cloud. [31:51.36] So I think there's a lot of room in that space. [31:53.12] I think it's very sort of interesting direction. [31:55.56] - How do you think of that compared [31:57.20] to the traditional past history? [31:59.56] Like, you know, yeah, AWS, [32:01.36] then you had Heroku, then you ran the railway. [32:04.48] - Yeah, I mean, I think they're all, [32:06.04] those are all like great. [32:06.88] Like, I think the problem that they all faced [32:09.08] was like the graduation problem, right? [32:11.28] Like, you know, Heroku or like, I mean, like also like, [32:14.16] Heroku, there's like a counterfactual future of like, [32:16.80] what would have happened if Salesforce didn't buy them, right? 
[32:18.88] Like, that's a sort of separate thing. [32:20.08] But like, I think what Heroku, I think always struggled with [32:23.04] was like, eventually companies would get big enough [32:26.36] that you couldn't really justify running in Heroku. [32:28.68] So they would just go and like move it to, you know, [32:30.64] whatever AWS or, you know, in particular. [32:32.96] And you know, that's something that keeps me up at night too. [32:34.92] Like, what does that graduation risk like look like for modal? [32:37.92] I always think like the only way to build [32:39.68] a successful infrastructure company in the long run [32:41.40] in the cloud today is you have to appeal [32:43.68] to the entire spectrum, right? [32:45.16] Or at least like the enterprise. [32:46.56] Like you have to capture the enterprise market. [32:48.44] But the truly good companies capture the whole spectrum, right? [32:50.68] Like, I think a company is like, [32:52.16] I don't like Data Dog or Mongo or something like that. [32:53.92] We're like, they both capture like the hobbyists [32:56.32] and acquire them, but also like, you know, [32:58.24] have very large enterprise customers. [33:00.12] I think that arguably was like where I, [33:01.96] in my opinion, like Heroku struggle was like, [33:04.68] how do you maintain the customers [33:06.36] as they get more and more advanced? [33:07.72] I don't know what the solution is, [33:08.88] but I think this, you know, [33:11.04] that's something I would have thought deeply [33:12.36] if I was at Heroku at that time. [33:14.32] - What's the AI graduation problem? [33:16.16] Is it, I need to fine tune the model, [33:18.04] I need better economics, [33:19.36] any insights from customers? [33:21.48] - Yeah, I mean, better economics certainly. [33:23.08] But although like, I would say like, even for people who like, [33:25.64] you know, need like thousands of GPUs, [33:27.68] just because we can drive utilization so much better. 
[33:29.80] Like we, there's actually like a cost advantage [33:32.12] of staying on Modal. [33:33.36] But yeah, I mean, certainly like, you know, [33:34.72] and like the fact that VCs like love, you know, [33:36.88] throwing money at least used to, you know, [33:38.88] at companies who needed to buy GPUs. [33:40.76] I think that didn't help the problem. [33:42.32] And in training, I think, you know, [33:43.52] there's less software differentiation. [33:45.44] So in training, I think there's certainly like better economics [33:47.48] of like buying big clusters. [33:48.96] But I mean, I hope it's gonna change, right? [33:51.12] Like I think, you know, we're still pretty early in the cycle [33:53.44] of like building AI infrastructure. [33:55.96] And I think a lot of these companies over in the long run, [33:59.00] like, you know, except maybe the super big ones, [34:01.44] like, you know, Facebook and Google, [34:03.12] they're always gonna build their own ones. [34:04.16] But like everyone else, like, to some extent, you know, [34:06.84] I think they're better off like buying platforms. [34:09.28] And you know, someone's gonna have to build those platforms. [34:11.52] - Yeah, cool. [34:13.28] Let's move on to language models. [34:15.48] And just specifically that workload, [34:17.12] just to flesh it out a little bit. [34:18.72] You already said that Ramp is like fine-tuning 100 models [34:22.28] at once, simultaneously, on Modal. [34:24.92] Closer to home, my favorite example is Erik Bot. [34:28.56] Maybe you want to tell that story? [34:30.08] - Yeah, I mean, it was a prototype thing we built for fun, [34:32.84] but it was pretty cool. [34:33.68] Like we basically built this thing that hooks up to Slack. [34:35.96] It like downloads all the Slack history [34:38.04] and, you know, fine-tunes a model based on a person. [34:40.20] And then you can chat with that.
[34:41.72] And so you can like, you know, clone yourself [34:43.40] and like talk to yourself. [34:44.24] It's like, I mean, it's like a nice demo. [34:46.16] And it's like, I think like it's like fully contained in Modal. [34:48.92] Like there's a Modal app that does everything, right? [34:51.12] Like it downloads Slack, you know, [34:52.64] integrates the Slack API, like downloads the stuff, [34:55.04] the data, like just runs the fine-tuning. [34:57.04] And then like creates like dynamically [34:58.76] an inference endpoint and it's all like self-contained [35:01.16] and like, you know, a few hundred lines of code. [35:02.36] So I think it's sort of a good kind of use case for Modal, [35:05.20] like it kind of demonstrates a lot of the capabilities [35:07.32] of Modal. [35:08.16] - Yeah. [35:08.98] - And on a more personal side, [35:09.96] how close did you feel Erik Bot was to you? [35:13.72] - It definitely captured the like, the language. [35:16.90] - Uh-huh. [35:18.12] - Yeah. [35:18.96] I mean, I don't know, like the content, [35:20.52] I always feel this way about like AI. [35:22.12] And it's gotten better, [35:22.96] but like you look at like AI output of text. [35:25.64] Like, and it's like, when you glance at it, [35:27.60] it's like, yeah, this seems really smart, you know, [35:29.96] but then you actually like look a little bit deeper. [35:31.48] It's like, what does this mean? [35:32.92] What does this person say? [35:33.76] It's like kind of vacuous, right? [35:35.00] And that's like kind of what I felt like, you know, [35:36.68] talking to like my clone version. [35:38.52] Like it's like says like things like the grammar is correct. [35:41.20] Like some of the sentences make a lot of sense, [35:42.96] but like, what are you trying to say? [35:44.40] Like there's no content here, I don't know. [35:46.48] I mean, it's like, I got that feeling also with ChatGPT [35:49.00] in the like early versions, right? [35:50.16] Now it's like better, but.
[35:51.28] - That's funny. [35:52.12] Yeah, I built this thing called smol-podcaster [35:53.68] to automate a lot of our back office work. [35:56.08] So to speak, and it's great at transcript, [35:58.60] it's great at doing chapters. [36:00.60] And then I was like, okay, [36:01.76] how about you come up with a short summary? [36:03.80] And it's like, it sounds good, [36:05.44] but it's like, it's not even the same ballpark [36:07.20] as what we end up writing. [36:08.84] And it's hard to see how it's going to get there. [36:11.32] - Oh, I have ideas. [36:12.84] - Yeah. [36:13.76] - I'm certain it's going to get there, [36:15.44] but like, I agree with you, right? [36:17.12] And like, I have the same thing. [36:18.24] I don't know if you read any like AI-generated books. [36:20.68] Like they just like kind of seem funny, right? [36:22.72] Like they're off, right? [36:23.84] But like you glance at them and it's like, [36:25.24] oh, it's kind of cool. [36:26.52] Like looks correct, [36:27.56] but then it's like very weird when you actually read them. [36:29.72] - Yeah. [36:30.88] So for what it's worth, [36:32.20] I think anyone can join the Modal Slack. [36:33.60] It's just open to the public. [36:34.76] - Yeah, totally. [36:35.60] If you go to modal.com, there's a button in the footer. [36:38.48] - Yeah, and then you can talk to Erik Bot. [36:40.40] And then sometimes I really like pinging Erik Bot, [36:42.68] and then you answer afterwards, [36:44.08] but then you're like, [36:44.92] - Really? [36:45.76] - Yeah, mostly correct. [36:46.60] Like whatever. [36:47.44] - Cool. [36:48.28] - Any other broader lessons, you know, [36:49.68] just broadening out from like the single use case [36:52.00] of fine-tuning, like what are you seeing people do [36:55.20] with fine-tuning or just language models [36:57.44] on Modal in general?
[36:58.60] - Yeah, I mean, I think language models is interesting [37:00.60] because so many people get started with APIs, [37:04.16] and that's just, you know, [37:05.24] they're just dominating a space, [37:06.44] in particular OpenAI, right? [37:07.80] And that's not necessarily like a place [37:09.32] where we aim to compete. [37:10.64] I mean, maybe at some point, [37:11.48] but like it's just not like a core focus for us. [37:13.20] And I think sort of separately sort of question [37:15.32] if like there's economics in that long term, [37:16.80] but like, so we, we tend to focus on more [37:19.08] like the areas like around it, right? [37:21.00] Like fine-tuning, like another use case we have [37:23.76] is a bunch of people, Ramp included, [37:25.44] are doing batch embeddings on Modal. [37:27.32] So let's say, you know, we have like a, [37:29.24] actually we're like writing a blog post [37:30.52] like we take all of Wikipedia [37:32.84] and like parallelize embeddings in 15 minutes [37:35.36] and produce vectors for each article. [37:37.80] So those types of use cases, [37:39.12] I think Modal suits really well, [37:40.72] I think also a lot of like custom inference, [37:42.76] like we have like that. [37:43.60] - Yeah, when you say parallelize, [37:45.44] I think you should give people an idea [37:47.08] of the order of magnitude of parallelism [37:48.84] because I think people don't understand how parallel. [37:51.60] So like, I think your classic hello world with Modal [37:54.32] is like some kind of Fibonacci function, right? [37:57.12] - Yeah, we have a bunch of different ones. [37:57.96] - A recursive function. [37:58.80] - Yeah, yeah, I mean, like, yeah, [38:00.04] I mean it's like pretty easy in Modal [38:01.20] to like fan out to like, you know, [38:02.92] at least like a hundred GPUs like in a few seconds.
[38:05.00] And, you know, if you give it like a couple of minutes, [38:06.96] like we can, you know, you can fan out [38:08.48] to like thousands of GPUs. [38:09.72] Like we run it relatively large scale. [38:12.36] And yeah, we've run, you know, many thousands of GPUs [38:16.40] at certain points when we need it. [38:17.84] You know, big backfills [38:18.96] or some customers had very large compute needs. [38:21.00] - Yeah, yeah. [38:21.84] And I mean, that's super useful for a number of things. [38:25.28] So one of my early interactions with modal as well [38:27.52] was with a small developer, [38:28.92] which is my sort of coding agent. [38:30.72] The reason I chose modal was a number of things. [38:32.72] One, I just wanted to try it out. [38:33.92] And I just had an excuse to try it. [38:35.76] Akshad offered to onboard me personally. [38:37.52] - Yeah, good excuse. [38:38.76] - But the most interesting thing was that [38:40.48] you could have that sort of local development experience [38:43.12] as well as running it on my laptop, [38:44.40] but then it would seamlessly translate to a cloud service. [38:47.16] Or like cloud hosted environment. [38:49.32] And then it could fan out with concurrency controls. [38:51.72] So I could say like, because like, you know, [38:53.44] the number of times I hit the GPT-3 API at the time [38:57.16] was going to be subject to the rate limit from there. [38:59.80] But I wanted to fan out [39:00.84] without worrying about the kind of stuff. [39:02.88] With modal, I can just kind of declare that in my config. [39:06.04] And that's it. [39:06.88] - Oh, like a concurrency limit? [39:07.88] - Yeah. [39:08.72] - Yeah, there's a lot of control there. [39:09.56] - Yeah, so I just want to highlight that to people. 
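The idea of fanning out many calls while capping how many run at once, so an upstream API's rate limit is not exceeded, can be sketched with an `asyncio.Semaphore`. This is a generic illustration under stated assumptions, not Modal's configuration syntax: `call_api` is a hypothetical stand-in for a rate-limited request, and the limit of 5 is arbitrary.

```python
import asyncio


async def call_api(i: int, limiter: asyncio.Semaphore, stats: dict) -> int:
    """Hypothetical rate-limited call; the sleep stands in for network latency."""
    async with limiter:  # at most `concurrency_limit` callers pass this point at once
        stats["in_flight"] += 1
        stats["peak"] = max(stats["peak"], stats["in_flight"])
        await asyncio.sleep(0.01)
        stats["in_flight"] -= 1
        return i * 2


async def fan_out(n_tasks: int, concurrency_limit: int) -> tuple[list[int], int]:
    """Launch n_tasks calls but never let more than concurrency_limit run together."""
    limiter = asyncio.Semaphore(concurrency_limit)
    stats = {"in_flight": 0, "peak": 0}
    results = await asyncio.gather(
        *(call_api(i, limiter, stats) for i in range(n_tasks))
    )
    return list(results), stats["peak"]


results, peak = asyncio.run(fan_out(n_tasks=50, concurrency_limit=5))
print(f"{len(results)} calls finished, peak concurrency {peak}")
```

The caller writes a plain fan-out and declares the limit once; the queueing happens inside the semaphore, which is roughly the "queue under the hood" experience described in the conversation.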
[39:11.56] I was like, yeah, this is a pretty good use case for like, [39:13.72] writing this kind of LLM application code [39:16.68] inside of this environment [39:18.08] that just understands fan out and rate limiting natively. [39:22.52] You don't actually have an exposed queue system, [39:24.48] but you have it under the hood. [39:25.56] - Totally. [39:26.40] - That kind of stuff. [39:27.30] - It's a self-provisioning code. [39:29.08] (laughing) [39:30.80] - So the last part of Modal I wanted to touch on [39:33.32] and obviously feel free, [39:34.24] I know you're working on new features, [39:36.64] was the sandbox that was introduced last year. [39:39.72] And this is something that I think was inspired [39:42.52] by Code Interpreter. [39:43.36] You can tell me the longer history behind that. [39:45.52] - Yeah, like we originally built it for the use case. [39:48.52] Like there was a bunch of customers [39:50.16] who looked into code generation applications [39:52.56] and then they came to us and asked us, [39:54.84] is there a safe way to execute code? [39:56.76] And yeah, we spent a lot of time [39:57.76] on like container security. [39:58.76] We use gVisor, for instance, [40:00.12] which is a Google product [40:01.20] that provides pretty strong isolation of code. [40:03.84] So we built a product where you can like basically [40:05.76] like run arbitrary code inside a container [40:07.84] and monitor its output or like get it back in a safe way.
[40:12.72] I mean, over time it's like evolved into more of like, [40:15.32] I think the long-term direction [40:16.44] is actually I think more interesting, [40:17.64] which is that I think Modal as a platform [40:20.92] where like, I think the core like container infrastructure [40:24.08] we offer could actually be like, [40:26.24] unbundled from like the client SDK [40:28.52] and offered to like other, [40:30.36] we're talking to a couple of like other companies [40:32.20] that want to run through their packages, [40:34.96] like run, execute jobs on Modal like kind of programmatically. [40:39.76] So that's actually the direction like sandbox is going. [40:41.76] It's like turning into more like a platform for platforms [40:44.04] is kind of what I've been thinking about it. [40:45.24] - Oh boy, platform, that's the old Kubernetes line. [40:48.36] - Yeah, yeah, yeah, but it's like, you know, [40:50.24] like having that ability to like programmatically, [40:53.32] you know, create containers and execute them, [40:55.64] I think it's really cool. [40:57.52] And I think it opens up a lot of interesting capabilities [41:00.36] that are sort of separate from the like core Python SDK in Modal. [41:04.92] So I'm really excited about it. [41:06.24] I mean, it's like one of those features [41:07.32] that we kind of released and like, you know, [41:09.56] then we kind of look at like what users actually build with it [41:11.84] and people are starting to build like kind of crazy things. [41:13.84] And then, you know, we double down on some of those things [41:16.16] 'cause when we see like, you know, potential new product features. [41:19.24] And so sandbox, I think in that sense, [41:20.76] it's like kind of in that direction, [41:22.56] we found a lot of like interesting use cases [41:24.80] in the direction of like platformized container runner. [41:27.76] - Can you be more specific about what you're doubling down on [41:30.12] after seeing users in action?
[41:32.08] - Yeah, I mean, we're working with like some companies that, [41:35.24] I mean, without getting into specifics, [41:36.96] like that need the ability to take their users' code [41:40.72] and then launch containers on Modal. [41:44.00] And it's not about security necessarily, [41:45.76] like they just want to use Modal as a back end, right? [41:47.92] Like they may already provide like Kubernetes as a back end, [41:50.76] Lambda as a back end, [41:51.76] and now they want to add Modal as a back end, right? [41:53.76] And so, you know, they need a way to programmatically define jobs [41:57.60] on behalf of their users and execute them. [41:59.48] And so, I don't know, that's kind of abstract, [42:01.80] but does that make sense? [42:02.76] - Yeah, I totally get it. [42:03.60] It's sort of one level of recursion [42:05.56] to sort of be the Modal for their customers. [42:08.96] - Exactly, yeah, exactly. [42:10.20] And Cloudflare has done this, [42:11.80] you know, the Kenton Varda from Cloudflare [42:13.64] was like the tech lead on this thing, [42:15.16] called it sort of functions as a service, as a service. [42:17.28] - Yeah, that was exactly right. [42:19.52] FaaSaaS. [42:20.84] - FaaSaaS. [42:21.68] - FaaSaaS. [42:22.52] - Yeah, like, I mean, like that, [42:23.84] I think any base layer, second layer cloud provider [42:27.72] like yourself, compute provider like yourself should provide. [42:30.76] You know, it's a marker of maturity and success [42:32.80] that people just trust you to do that. [42:34.68] They'd rather build on top of you than compete with you. [42:37.16] The more interesting thing for me is like, [42:38.84] what does it mean to serve a computer, [42:41.28] like an LLM developer rather than a human developer, right? [42:44.76] Like that's what a sandbox is to me. [42:46.64] - Yeah, for sure. [42:47.48] - That you have to redefine Modal [42:48.72] to serve a different non-human audience. [42:51.24] - Yeah, yeah, yeah.
[42:52.24] And I think there's some really interesting people, [42:54.00] you know, building very cool things. [42:55.40] - Yeah, so I don't have an answer, [42:57.16] but you know, I imagine things like, [42:59.52] hey, the way you give feedback is different. [43:02.24] You maybe have to like stream errors, log errors differently. [43:06.28] I don't really know. [43:07.56] - Yeah. [43:08.40] - Obviously there's like safety considerations. [43:10.08] Maybe you have an API to like restrict access to the web. [43:13.12] - Yeah. [43:13.96] - I don't think anyone would use it, [43:16.00] but it's there if you want it. [43:17.04] - Yeah, yeah. [43:18.16] - Any other sort of design considerations? [43:20.12] I have no idea. [43:21.44] - With sandboxes? [43:22.28] - Yeah, yeah. [43:23.12] Open-ended question here. [43:24.48] - Yeah, I mean, no, I think, yeah, [43:26.28] the network restrictions I think make a lot of sense. [43:28.80] Yeah, I mean, I think, you know, long-term like, [43:31.56] I think there's a lot of interesting use cases [43:32.92] where like the LLM in itself can like decide, [43:36.12] I want to install these packages and like run this thing. [43:38.48] And like, obviously for a lot of those use cases, [43:40.52] like you want to have some sort of control [43:42.32] that it doesn't like install malicious stuff [43:44.12] and steal your secrets and things like that. [43:45.68] But I think that's what's exciting [43:47.72] about the sandbox primitive is like, [43:48.92] it lets you do that in a relatively safe way. [43:50.72] - Do you have any thoughts on the inference wars? [43:54.84] A lot of providers are just rushing to the bottom [43:57.36] to get the lowest price per million tokens. [43:59.76] Some of them, you know, Sean ran the math. [44:02.52] They're just losing money. [44:03.68] There's like the physics of it. [44:05.80] Just don't work out for them to make any money on it.
[44:08.32] How do you think about your pricing [44:10.08] and like how much premium you can get [44:12.64] and you can kind of command versus using lower prices [44:15.84] as kind of like a wedge into getting there, [44:17.64] especially once you have a model instrumented. [44:20.08] What are the trade-offs and any thoughts on strategies? [44:23.68] - I mean, we focus more on like custom models [44:25.40] and custom code. [44:26.44] And I think in that space, there's like less competition. [44:29.44] And I think we can have a pricing markup, right? [44:32.20] Like, you know, people will always compare our prices [44:34.56] to like, you know, the GPU power they can get elsewhere. [44:36.64] And so how big can that markup be? [44:38.68] Like it never can be, you know, [44:39.68] we can never charge like 10X more, [44:41.24] but we can certainly charge a premium. [44:42.48] And like, you know, for that reason, [44:43.40] like we can have pretty good margins. [44:44.92] The LLM space is like the opposite. [44:46.36] Like the switching cost of LLMs is zero. [44:48.64] Like, if all you're doing is like straight up, [44:50.64] like at least like open source, right? [44:52.40] Like if all you're doing is like, you know, [44:54.08] using some, you know, inference endpoint [44:56.80] that serves an open source model. [44:58.80] And, you know, some other provider comes along [45:00.36] and like offers a lower price. [45:01.60] You're just going to switch, right?
[45:02.44] So I don't know, to me that reminds me a lot of like, [45:05.36] all this like 15 minute delivery wars [45:07.28] or like, you know, like Uber versus Lyft, you know, [45:09.92] and like maybe going back even further, [45:11.32] like I think a lot about like the sort of, you know, [45:13.36] flip side of this is like, this actually positive side [45:15.56] of it is like, I thought a lot about like the fiber optics [45:17.84] boom of like '98, '99, like the other day, or like, you know, [45:21.36] and also like the overinvestment in GPU today, like, like, [45:24.64] yeah, like, you know, I don't know, like in the end, [45:26.72] like I don't think VCs will have the return they expected, [45:29.76] like, you know, in these things, [45:31.60] but guess who's going to benefit? [45:32.84] Like, you know, it's the consumers. [45:34.52] Like someone's like reaping the value of this. [45:36.84] And that's I think an amazing flip side is that, you know, [45:39.60] we should be very grateful, you know, the fact that like, [45:41.92] VCs want to subsidize these things, which is, you know, [45:45.08] like you go back to fiber optics, [45:46.32] like there was an extreme like overinvestment [45:48.68] in fiber optics networks in like '98. [45:50.80] And no one made money who did that. [45:52.96] But consumers, you know, got tremendous benefits [45:56.80] of all the fiber optics cables that were laid, you know, [45:58.96] throughout the country in the decades after. [46:01.20] I feel something similar about like GPUs today [46:03.88] and also like specifically looking like more narrowly [46:05.68] at like the LLM inference market, like, that's great. [46:08.00] Like, you know, I'm very happy that, you know, [46:11.16] there's a price war. [46:12.80] Modal is like not necessarily like participating [46:15.40] in that price war, right?
[46:16.24] Like I think, you know, it's going to shake out [46:17.84] and then someone's going to win [46:19.16] and then they're going to raise prices or whatever. [46:20.68] Like we'll see how that works out. [46:22.28] But for that reason, like we're not like hyper focused [46:24.76] on like serving, you know, just like straight up, [46:27.32] like here's an endpoint for an open source model. [46:29.96] We think the value in Modal comes from all these, you know, [46:32.84] the other use cases, the more custom stuff, [46:34.84] like fine tuning and complex, you know, [46:36.76] guided output, like type stuff. [46:38.28] Or like also like in other like outside of LLMs, [46:40.44] like focus a lot more like image, audio, video stuff. [46:43.48] 'Cause that's where there's a lot more proprietary models. [46:45.96] There's a lot more like custom workflows. [46:47.52] And that's where I think, you know, Modal is more, you know, [46:51.04] there's a lot of value in software differentiation. [46:53.52] I think focusing on developer experience [46:55.28] and developer productivity. [46:56.80] That's where I think, you know, [46:57.68] you can have more of a competitive moat. [47:00.48] - I'm curious what the difference is going to be now [47:03.12] that it's an enterprise. [47:04.08] So like with DoorDash, Uber, [47:06.80] they're going to charge you more and like as a customer, [47:08.84] like you're going to decide to not take an Uber. [47:10.96] But if you're a company building AI features [47:13.04] in your product using the subsidized prices, [47:15.44] and then, you know, the VC money dries up in a year [47:18.52] and like prices go up, it's like, [47:20.56] you can't really take the features back. [47:22.76] Without a lot of backlash, [47:23.84] but you also cannot really kill your margins [47:26.08] by paying the new price. [47:27.48] - So I don't know what that's going to look like, but. [47:29.44] - But like margins are going to go up for sure.
[47:31.08] But I don't know if prices will go up. [47:32.60] Cause like GPU prices have to drop eventually, right? [47:36.44] So like, you know, like in the long run, [47:38.36] I still think like prices may not go up that much, [47:41.60] but certainly margins will go up. [47:42.80] Like I think you said, [47:43.64] swyx, that margins are negative right now. [47:45.24] Like, you know, obviously that's not sustainable. [47:49.48] So certainly margins will have to go up. [47:50.92] Like some companies are going to have to make money [47:52.28] for it in this space. [47:53.32] Otherwise like they're not going to provide the service. [47:55.16] But that's equilibrium too, right? [47:56.40] Like at some point, like, you know, [47:57.80] this sort of stabilizes and one or two [48:00.24] or three providers make money. [48:02.48] - Yeah, what else is maybe underrated about Modal, [48:05.24] something that people don't talk enough about, [48:08.00] or yeah, that we didn't cover in the discussion. [48:11.40] - Yeah, I think, what are some other things? [48:13.96] We talked about a lot of stuff. [48:15.04] Like we have the bursty parallelism. [48:16.56] I think that's pretty cool. [48:17.84] Working on a lot of like, trying to figure out like, [48:20.28] you know, like kind of thinking more about the roadmap, [48:22.00] but like one of the things I'm very excited about is [48:24.56] building primitives for like more like [48:26.32] I/O intensive workloads. [48:27.88] And so like we're building some like crude stuff right now [48:30.92] where like you can like create like direct TCP tunnels [48:33.04] to containers and that lets you like pipe data. [48:35.08] And like, like, you know, we haven't really explored [48:37.28] as much as we should, [48:38.20] but like there's a lot of interesting applications.
[48:39.80] Like you can actually do like kind of real time video stuff [48:42.44] in Modal now, because you can like create a tunnel to, [48:45.68] exactly you can create a raw TCP socket to a container, [48:48.84] feed it video and then like, you know, [48:50.72] get the video back. [48:51.68] And I think like it's still like a little bit like, [48:54.56] you know, not fully ergonomically like figured out, [48:56.84] but I think there's a lot of like super cool stuff. [48:59.40] Like when we start enabling those more like high I/O workloads. [49:04.40] I'm super excited about. [49:05.84] I think also like, you know, working with large data sets [49:07.96] or kind of taking the ability to map and fan out [49:10.84] and like building more like higher level, [49:12.28] like functional primitives, like filters and group-bys [49:14.84] and joins like, I think there's a lot of like [49:16.60] really cool stuff you can do. [49:18.00] But this is like maybe like, you know, years out like. [49:21.24] - Yeah, we can just broaden out from Modal a little bit, [49:23.36] but you still have a lot of, you have a lot of great tweets. [49:25.40] So it's very easy to just kind of go through them. [49:28.52] Why is Oracle underrated? [49:30.44] - I love Oracle's GPUs. [49:32.52] I don't know why, you know, [49:34.16] what the economics looks like for Oracle, [49:36.26] but I think they're great value for money. [49:38.04] Like we run a bunch of stuff in Oracle [49:39.76] and they have bare metal machines, [49:41.20] like two terabytes of RAM, they're like super fast SSDs. [49:44.16] You know, I mean, I mean, we love AWS and GCP too. [49:46.40] We have great relationships with them. [49:47.72] But I think Oracle's surprising like, you know, [49:50.06] if you told me like three years ago [49:51.32] that I would be using Oracle cloud, like what, wait, why? [49:55.20] But now I'm, you know, I'm a happy customer.
[49:57.00] - And it's a combination of pricing [49:58.80] and the kinds of SKUs I guess they offer. [50:01.92] - Yeah, great, great machines, good prices, you know. [50:04.96] - That's it. [50:05.80] - Yeah, yeah. [50:06.64] - That's all you care about. [50:07.48] - Yeah, the sales team is pretty fun too, like I like them. [50:09.16] - In Europe, people often talk about Hetzner. [50:11.80] - Yeah, like we've focused on the main clouds, right? [50:14.88] Like we've, you know, Oracle, AWS, GCP, [50:16.60] we'll probably add Azure at some point. [50:18.08] I think, I mean, there's definitely a long tail [50:19.92] of like, you know, CoreWeave, Hetzner, [50:22.62] yeah, like Lambda, like all these things. [50:25.14] And like over time, I think we'll look at those too. [50:27.14] Like, you know, wherever we can get the right, you know, [50:29.10] GPUs at the right price. [50:31.26] Yeah, I mean, I think it's fascinating. [50:32.46] Like it's a tough business. [50:35.14] Like I wouldn't want to try to build like a cloud provider. [50:37.94] You know, it's just, you just have to be like incredibly [50:40.46] focused on like, you know, efficiency and margins [50:43.02] and things like that. [50:43.86] But I mean, I'm glad people are trying. [50:45.90] - Yeah, and you can ramp up on any of these clouds [50:48.62] very quickly, right? [50:49.46] - Yeah, I mean, yeah, like, I think so. [50:52.00] Like, you know, what Modal does is like programmatic, [50:54.96] you know, launching and termination of machines. [50:57.16] So that's like, what's nice about the clouds is, you know, [51:00.44] they have relatively like mature APIs for doing that, [51:03.52] as well as like, you know, support for Terraform, [51:05.44] for all the networking and all that stuff. [51:07.12] That makes it easier to work with the big clouds.
[51:09.24] But yeah, I mean, some of those things, I think, you know, [51:11.20] I also expect the smaller clouds to like embrace those things [51:14.00] in the long run, but also think, you know, you know, [51:16.04] we can also probably integrate with some of the clouds. [51:17.88] Like even without that, there's always an HTML API [51:20.96] that you can use, just like script something [51:23.66] that launches instances, like through the web. [51:25.82] - Yeah, yeah. [51:26.66] I think a lot of people are always curious about [51:28.18] whether or not you will buy your own hardware someday. [51:31.26] I think you're pretty firm in that. [51:32.62] It's not your interest. [51:33.98] But like, your story and your growth does remind me [51:37.34] a little bit of Cloudflare, which obviously, you know, [51:40.58] invests a lot in its own physical network. [51:42.66] - Yeah, I don't remember like early days, [51:44.62] like did they have their own hardware or? [51:46.70] - No, they push out a lot with like agreements [51:49.18] through other, you know, providers. [51:52.26] - Yeah, okay, interesting. [51:53.10] - But now it's all their own hardware. [51:55.50] - Yeah. - Sorry, I understand. [51:57.54] - Yeah, I mean, my feeling is that [52:00.02] when you're a venture funded startup, [52:01.62] like buying physical hardware is maybe not the best use [52:05.60] of the money. [52:06.44] - I really wanted to put you in a room [52:08.14] with Eiso Kant from Poolside. [52:10.54] - Yeah. [52:11.38] - 'Cause he has the complete opposite view. [52:12.22] - Yeah. - It is great.
[52:13.80] - I mean, I don't like, I just think from like [52:15.18] a capital efficiency point of view, [52:16.40] like do you really want to tie up that much money [52:18.06] in like, you know, physical hardware [52:19.34] and think about depreciation and like, [52:21.18] like as much as possible, like I, you know, [52:24.74] I favor a more capital efficient way of like, [52:27.10] we don't want to own the hardware, [52:28.22] 'cause then, and ideally we want to, [52:30.50] we want the sort of margin structure to be sort of like, [52:33.66] 100% correlated revenue and COGS in the sense that like, [52:36.70] you know, when someone comes and pays us, [52:38.98] you know, one dollar for compute, like, you know, [52:41.02] we immediately incur a cost of like whatever, [52:43.54] like 70 cents, 80 cents, you know, [52:45.40] and there's like complete correlation [52:46.96] between cost and revenue. [52:48.36] 'Cause then you can leverage up in like, [52:49.88] a kind of a nice way, you can scale very efficiently, [52:52.08] you know, like that's not, you know, [52:54.28] turns out like that's hard to do. [52:55.96] Like you can't just only use like, [52:57.32] spot and on-demand instances. [52:58.52] Like over time, we've actually started adding [53:00.72] a pretty significant amount of reservations too. [53:02.84] So I don't know, like reservations is always like, [53:04.72] one step towards owning your own hardware. [53:07.08] Like, I don't know, like, do we really want to be, [53:09.00] you know, thinking about switches and cooling [53:12.12] and HVAC and like power supplies? [53:14.16] - Exactly, recovery. [53:15.46] - Yeah, like, is that the thing I want to think about? [53:17.62] Like, I don't know, like I like to make developers happy, [53:19.82] but who knows, like maybe one day, like, [53:21.68] but I don't think it's gonna happen anytime soon.
[53:24.10] - Yeah, obviously, for what it's worth, [53:26.22] obviously I believe in running in cloud, [53:28.82] but it's interesting to have the devil's advocate [53:31.94] on the other side. [53:32.94] The main thing you have to do is be confident [53:34.70] that you can manage your depreciation better [53:36.66] than the typical assumption, which is two to three years. [53:40.10] - Yeah, yeah. [53:40.94] And so then when you have a CTO that tells you, [53:42.92] no, I think I can make these things last seven years, [53:45.72] then it changes the math. [53:46.56] - Yeah, yeah, but, you know, [53:48.24] are you deluding yourself then? [53:49.64] - Yeah. [53:50.48] - That's the question, right? [53:51.40] It's like the Waste Management scandal. [53:53.32] Do you know about that? [53:54.16] Like, they had all this like, [53:55.12] like accounting scandal back in the 90s, [53:57.24] like this garbage company, like was, [54:00.32] they like started assuming their garbage trucks [54:03.20] had a 10 year depreciation schedule, [54:05.68] booked like a massive profit, you know, [54:07.60] the stock went to like, you know, up like, you know, [54:09.76] and then it turns out actually the, [54:10.96] all those garbage trucks broke down and like, [54:13.34] you can't really depreciate them over 10 years. [54:15.26] And so, so then the whole company, you know, [54:17.04] they had to restate all the earnings and they use. [54:20.14] - Nice. [54:21.46] Let's go into some personal nuggets. [54:24.00] You received the IOI Gold Medal, [54:26.34] which is the International Olympiad in Informatics. [54:29.66] - 20 years ago. [54:30.82] - Yeah. [54:31.66] How are these models, like, going to change [54:35.36] competitive programming? [54:36.38] Like, do you think people will still love the craft? [54:39.66] I feel like over time we're kind of like, [54:41.84] programming has kind of lost maybe a little bit [54:44.96] of its luster in the eyes of a lot of people.
[54:48.00] Yeah, I'm curious to see what you think. [54:51.32] - I mean, maybe, but like, I don't know, like, you know, [54:54.00] I've been coding for almost 30 or more than 30 years. [54:56.52] And like, I feel like, you know, you look at like, [54:59.16] programming and, you know, where it is today [55:01.08] versus where it was, you know, 30, 40, 50 years ago. [55:05.52] There's like, probably a thousand times more developers [55:08.60] today than, you know, so like, in every year, [55:10.74] there's more and more developers. [55:12.10] And at the same time, developer productivity [55:13.82] keeps going up. [55:14.70] And when I look at the real world, [55:16.26] I just think there's so much software [55:18.14] that's still waiting to be built. [55:20.10] Like, I think we can, you know, [55:21.80] 10X the amount of developers and still, you know, [55:24.34] have a lot of people making a lot of money, you know, [55:27.14] building amazing software and also being, [55:29.50] while at the same time being more productive. [55:31.26] Like, I never understood this, like, you know, [55:33.18] AI is going to, you know, replace engineers. [55:35.54] That's very rarely how this actually works. [55:38.22] When AI makes engineers more productive, [55:40.68] like the demand actually goes up [55:42.28] because the cost of engineers goes down [55:43.84] because you can build software more cheaply. [55:45.32] And that's, I think, the story of software in the world [55:47.52] over the last few decades. [55:48.76] So, I mean, I don't know how this, [55:50.48] like relates to like, competitive programming is a, [55:53.12] I don't know, kind of going back to your question. [55:55.56] Competitive programming to me was always kind of a weird, [55:57.68] kind of, you know, niche, like kind of, I don't know, [56:00.36] I loved it. [56:01.20] It's like puzzle solving. 
[56:03.20] And like, my experience is like, you know, [56:05.48] half of competitive programmers are able to translate that [56:08.66] to actual, like, building cool stuff in the world. [56:11.86] Half just, like, get really, you know, [56:13.70] sucked into this, like, puzzle stuff [56:15.10] and, you know, it never loses its grip on them. [56:18.46] But, like, for me, it was an amazing way [56:20.30] to get started with coding or get very deep into coding [56:23.90] and, you know, kind of battle off with, like, [56:26.26] other smart kids and traveling to different countries [56:29.02] when I was a teenager. [56:30.30] - So, I was just going to mention, like, [56:31.38] it's not just that he personally is a competitive programmer. [56:34.46] Like, I think a lot of people at Modal are competitive programmers. [56:37.98] I think you met Akshat through-- [56:39.30] - Akshat, co-founder, is also IOI Gold Medal. [56:41.78] By the way, Gold Medal, doesn't mean you win. [56:43.90] Like, but, although we actually had an intern that won IOI. [56:47.24] Gold Medal is, like, the top 20, 30 people, roughly. [56:49.96] - Yeah, obviously, it's very hard to get hired at Modal. [56:52.28] But what is it like to work with, like, [56:54.52] such a talent density, like, you know, [56:56.68] how is that contributing to the culture at Modal? [56:59.04] - Yeah, I mean, I think humans are the root cause [57:01.88] of, like, everything at a company, right? [57:04.00] Like, you know, bad code, because it's bad human, [57:06.90] or, like, whatever, you know, bad culture. [57:08.30] So, like, I think, you know, like, talent density is very important [57:11.08] in, like, keeping the bar high and, like, hiring smart people. [57:13.46] And, you know, it's not always, like, the case [57:15.14] that, like, hiring competitive programmers is the right strategy, [57:17.54] right? If you're building something very different, [57:19.12] like, you may not, you know.
[57:20.32] But we actually end up having a lot of, like, [57:22.12] hard, you know, complex challenges. [57:24.74] Like, you know, I talked about, like, the cloud, [57:27.58] you know, the resource allocation. [57:29.72] Like, turns out, like, that actually, like, [57:31.26] you can phrase that as, like, a mixed integer programming problem. [57:33.54] Like, we now have that running in production. [57:35.04] Like, constantly optimizing how we allocate cloud resources. [57:37.84] There's a lot of, like, interesting, like, complex, [57:39.68] like, scheduling problems, and, like, [57:41.68] how do you do all the bin packing of all the containers? [57:43.84] Like, so, you know, I think, for what we're building, [57:46.64] you know, it makes a lot of sense to hire these people [57:48.32] who, like, like, those very hard problems. [57:50.56] Yeah. And they don't necessarily have to know [57:52.62] the details of the stack. [57:53.72] They just need to be very good at algorithms. [57:55.80] Uh, no, but, like, my feeling is, like, people who are, like, [57:58.70] pretty good at competitive programming, [58:00.32] they can also pick up, like, other stuff, like, elsewhere. [58:03.48] Not always the case, but, you know, [58:05.24] there's definitely a high correlation. [58:06.88] Oh, yeah. I'm just, I'm interested in that, just because, [58:09.62] you know, like, there's competitive mental talents [58:12.62] in other areas, like, competitive, um, speed memorization. [58:16.02] Yeah. Whatever. [58:16.98] And, like, you know, don't, don't really see those transfer. [58:19.66] And I always assumed, in my narrow perception, [58:22.26] that competitive programming is so specialized. [58:24.96] It's so obscure, even, like, so divorced from real-world, [58:28.46] uh, scenarios that, um, it doesn't actually transfer that much. [58:31.50] But obviously, I think, for the problems that you work on, [58:33.00] it does.
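The container bin packing mentioned above can be illustrated with a classic heuristic. This is a sketch of first-fit decreasing, not the mixed integer program that is described as running in production; the function name, the demand numbers, and the single-resource model (one capacity figure per machine) are all assumptions made for the example.

```python
def first_fit_decreasing(demands, capacity):
    """Pack resource demands onto machines of equal capacity.

    Returns a list of machines, each a list of the demands placed on it.
    This is a heuristic: it is fast but does not guarantee the minimum
    number of machines.
    """
    machines = []  # demands placed on each machine
    free = []      # remaining capacity of each machine

    # Placing the largest demands first tends to waste less space.
    for demand in sorted(demands, reverse=True):
        if demand > capacity:
            raise ValueError(f"demand {demand} exceeds capacity {capacity}")
        for i, room in enumerate(free):
            if demand <= room:
                machines[i].append(demand)
                free[i] -= demand
                break
        else:
            # No existing machine has room, so start a new one.
            machines.append([demand])
            free.append(capacity - demand)
    return machines


# Hypothetical container sizes (say, GPUs requested) on 10-GPU machines.
placement = first_fit_decreasing([4, 8, 1, 4, 2, 1], capacity=10)
```

With these made-up numbers the six containers fit on two machines. A real scheduler has to weigh several resources, prices, and constraints at once, which is where an exact formulation such as mixed integer programming becomes useful.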
[58:33.82] But, but it's also, like, you know, frankly, [58:35.70] it's, like, translates to some extent, [58:37.62] not because, like, the problems are the same, [58:39.36] but just because, like, it sort of filters for the, you know, [58:41.34] people who are, like, willing to go very deep [58:43.64] and work hard on things, right? [58:45.60] Like, I, I, I feel like a similar thing is, like, [58:48.44] a lot of good developers are, like, talented musicians. [58:51.74] Like, why? Like, why is this a correlation? [58:53.74] And, like, my theory is, like, you know, [58:55.48] it's the same sort of skill. Like, you have to, like, [58:57.02] just hyper-focus on something and practice a lot. [58:59.62] Like, and, and there's something similar that I think [59:01.32] creates, like, good developers. [59:02.78] Yeah. [59:03.70] Sweden also had a lot of very good counter-strike players. [59:06.48] I don't know, why did, why did Sweden have [59:08.74] fiber optics before all of Europe? [59:10.70] I feel like, I grew up in Italy, [59:12.88] and our internet was terrible. [59:15.42] And then, I feel like, all the Nordics [59:17.58] are, like, amazing internet. [59:18.88] I remember getting online, and people in the Nordics [59:21.18] are, like, five-ping, ten-ping. [59:23.06] Yeah, we had very good network back then. [59:25.48] Yeah, do you know why? [59:26.86] I mean, I'm sure, like, you know, I think the government, [59:29.96] you know, did certain things quite well, right? [59:32.36] Like, in the '90s, like, there was, like, [59:34.16] a bunch of tax rebates for, like, buying computers. [59:36.22] And I think there were similar, like, investments [59:37.90] in infrastructure. I mean, like, and I think, like, [59:39.80] I always think about, you know, it's like, [59:41.44] I still can't use my phone in the subway in New York. [59:43.90] And that was something I could use in Sweden in '95. [59:47.10] You know, we're talking, like, 40 years almost, right? 
[59:49.34] Like, like, why? [59:51.24] And I don't know, like, I think certain infrastructure, [59:53.44] you know, Sweden was just better, I don't know. [59:55.62] Yeah. [59:56.54] And also, you never owned a TV or a car? [59:59.22] Never owned a TV or a car. I never had a driver's license. [60:01.38] How do you do that in Sweden, though? [60:03.08] Like, that's cold. [60:03.92] I grew up in a city. I mean, like, [60:05.42] I took the subway everywhere, with a bike or whatever. [60:08.76] Yeah. I always lived in cities, so I don't, you know, [60:11.40] I never felt. I mean, I, like, we have a, [60:14.16] like, me and my wife have a car, but like, I-- [60:16.80] That doesn't count. [60:18.14] I mean, it's in her name, 'cause I don't have a driver's license. [60:20.48] She drives me everywhere. It's nice. [60:23.04] Nice. [60:23.88] Yeah, it's fantastic. [60:24.84] I was gonna ask you, like, the last thing I had [60:26.84] on this list was your advice to people thinking [60:28.88] about running some sort of run code in the cloud startup, [60:31.42] is only do it if you're genuinely excited [60:33.46] about spending five years thinking [60:34.82] about load balancing, page faults, cloud security, and DNS. [60:37.06] So, basically, like, it sounds like you're summing up [60:38.56] a lot of pain running Modal. [60:40.98] Yeah. Yeah. [60:41.82] Like, one thing I struggled with, like, [60:43.06] I talked to a lot of people starting companies [60:45.40] in the data space or, like, AI space or whatever, [60:48.02] and they sort of come at it at, like, you know, [60:49.88] from, like, an application developer point of view, [60:51.58] and they're like, "I'm gonna make this better." [60:53.24] But, like, guess how you have to make it better? [60:54.86] It's like, you have to go very deep [60:56.26] on the infrastructure layer.
[60:57.26] And so, one of my frustrations has been, like, [60:59.04] so many startups are, like, in my opinion, [61:00.38] like, Kubernetes wrappers, [61:01.48] and not very, like, thick wrappers, [61:02.80] like, fairly thin wrappers. [61:04.12] And I think, you know, every startup is a wrapper, [61:06.24] to some extent, but, like, you need to be, like, [61:07.60] a fat wrapper. [61:08.44] You need to, like, go deep and, like, build some stuff. [61:10.32] And that's, like, you know, if you build a tech company, [61:12.36] you're gonna want to have-- you're gonna have to spend, [61:14.24] you know, five, 10, 20 years of your life, like, [61:16.96] going very deep and, like, you know, [61:18.40] building the infrastructure you need [61:19.96] in order to, like, make your product truly stand out [61:22.20] and be competitive. [61:23.48] And so, you know, I think that goes for everything. [61:25.40] I mean, like, you're starting a whatever, you know, [61:28.10] online retailer of, I don't know, bathroom sinks. [61:31.14] You have to be willing to spend 10 years of your life [61:33.50] thinking about, you know, whatever, bathroom sinks. [61:36.18] Like, otherwise, it's gonna be hard. [61:38.06] - Yeah, I think that's good advice for everyone. [61:39.58] And, yeah, congrats on all your success. [61:41.62] It's pretty exciting to watch it. [61:43.46] It's just the beginning. [61:44.30] - Yeah, yeah, yeah, it's exciting. [61:45.86] And everyone should sign up and try it at modal.com. [61:48.62] - Yeah, now it's GA. [61:49.46] - Yeah. [61:50.46] - Used to be behind a wait list. [61:51.82] - Yeah. [61:52.66] - Awesome, Eric. [61:53.48] Thank you so much for coming on. [61:54.30] - Yeah, it's amazing. [61:55.14] - Thanks. [61:56.74] (upbeat music)