Published: Dec 19, 2024
TensorWave CEO, Darrick Horton, speaks with theCUBE + NYSE Wired

In an exclusive sit-down during theCUBE's Cyber and AI Innovators Week at the NYSE, TensorWave CEO Darrick Horton shared how the company is disrupting the AI compute landscape.
Key insights from the interview include:
- Why AMD? Darrick reveals how TensorWave's partnership with AMD is creating a cost-effective, open alternative to NVIDIA’s monopoly.
- Focused Excellence: Learn how specialization enables TensorWave to outperform hyperscalers like AWS and Google Cloud in AI-specific workloads.
- Scaling AI: Discover TensorWave’s approach to building gigawatt-class data centers and embracing cutting-edge cooling technologies like liquid and hybrid solutions.
- What’s Next: Scaling globally with hundreds of thousands of GPUs in 2025.
Curious how TensorWave is reshaping AI infrastructure for training, fine-tuning, and inference?
Watch the full interview now 👇
Welcome back to media week: NYSE Wired and theCUBE's exclusive coverage of Cyber and AI Innovators. My name is Dave Vellante, and John Furrier is my cohost this week. We've got three days of wall-to-wall coverage. I'm super excited to have Darrick Horton, who's the CEO of TensorWave. Welcome to theCUBE.
NYSE. What do you think of this? Pretty nice. Yes, it's great to be here. Yeah. Well, thank you. Tell us about TensorWave. You guys are at the forefront of AI. You're partnering with AMD. Everybody's NVIDIA crazy, but you're partnering with AMD. I know you do some stuff with NVIDIA as well, but tell us about the company.
Yes. So TensorWave is a GPU cloud provider focused on building infrastructure for AI workloads, like training and inference, for companies of all sizes. So we focus on building that infrastructure, optimizing it, and making sure that the end customers that are utilizing it have the best possible experience.
And we went with AMD for a bunch of reasons that we can get into here if you'd like. Yeah, I'm interested. But before we get there, how is it that a company like yours can compete with these giant hyperscalers? You get this question all the time. AWS, Google, Microsoft, they're insurmountable. How do you compete with them?
So the fun thing is, in this industry, we actually don't consider the hyperscalers to be competition. And that's because we operate in very different markets. So, the hyperscalers, you know, they do everything, right? Every type of compute, every type of cloud, every service. What that means is, they're not really the best at anything, right?
They're a good place to go if you need a bunch of little things. But if you need the best, most optimized experience, you don't go to the hyperscalers for that. And also, because of their business model, they're very expensive. They tailor it to companies that have to use them for compliance or other reasons.
And so from a price perspective, they're not competitive, and from a performance perspective, they're also not competitive. So, that's ironic. Yeah, this is ironic, because here's why: in the early days, the cloud was ostensibly less expensive or better than traditional IT. Now traditional IT has gotten much better.
You are best of breed, and you're more cost effective. The question is why? It's because of focus, right? That's what I boil it down to. The hyperscalers do everything, so they're not good at anything. We do one thing, and we do that one thing better than anyone else. It's because of the focus. We're able to put all of our resources into deploying one type of compute and optimizing the whole stack, so the end customer gets the best performance, the best cost, and the best experience.
Give me a 101 on why AI workloads are different from general-purpose workloads. Sure, so AI workloads are very compute intensive, and there's a special type of math happening behind the scenes for almost all AI workloads: it's known as matrix multiplication. And it just so happens that GPUs, the things we use for graphics, are very good at matrix multiplication.
And so early on, AI workloads started using GPUs to go faster. And then GPUs were tweaked into these purpose-built AI accelerators that are really good at just that one thing. And so that thing is really what's going on in these AI workloads and in these data centers that are built for AI.
Everything is built to be optimized for just that thing, whereas a traditional data center and cloud have a lot more things going on, right? You have CPUs and you have storage, and you have those things in AI as well, but to differing degrees. That's what AI infrastructure is really about: GPU compute for matrix multiplication.
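To make the matrix-multiplication point concrete, here is a minimal illustrative sketch (not from the interview) that times the same matrix multiplication on CPU and on a GPU, assuming PyTorch with a working accelerator backend (ROCm for AMD or CUDA for NVIDIA). The exact speedup depends on the hardware, but this gap is why accelerators dominate AI workloads.

```python
# Minimal sketch: time the same matrix multiplication on CPU vs. GPU.
# Assumes PyTorch with a ROCm or CUDA backend; ROCm builds expose the
# accelerator through the same torch.cuda namespace.
import time
import torch

def time_matmul(device: str, n: int = 4096, repeats: int = 10) -> float:
    a = torch.randn(n, n, device=device)
    b = torch.randn(n, n, device=device)
    torch.matmul(a, b)                      # warm-up run
    if device != "cpu":
        torch.cuda.synchronize()            # wait for the warm-up to finish
    start = time.perf_counter()
    for _ in range(repeats):
        torch.matmul(a, b)
    if device != "cpu":
        torch.cuda.synchronize()            # GPU kernels run asynchronously
    return (time.perf_counter() - start) / repeats

print(f"CPU: {time_matmul('cpu') * 1000:.1f} ms per 4096x4096 matmul")
if torch.cuda.is_available():
    print(f"GPU: {time_matmul('cuda') * 1000:.1f} ms per 4096x4096 matmul")
```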
It's interesting, we had an analyst, he's our analyst emeritus now, his name is David Floyer. Before the term accelerated computing came out, he called it matrix computing, for the reasons that you just mentioned. I've also had people tell me that you need GPUs to do inferencing. That's right.
Because it's pretty simple math, matrix math, but there's a lot of it. That's right. But others have said, oh no, I can use x86 to do inference. So there's some debate there. Where do you weigh in? You can use x86 for inference. So on the inference side, things are a little more flexible.
GPUs are still going to give you the best performance for most workloads. But CPUs exist, edge devices exist, FPGA-based devices exist, and ASIC-based, purpose-built function devices also exist. And there's a spectrum here. There are trade-offs at every level on the inference side in terms of cost, efficiency, power consumption, and just raw performance.
And so for most workloads in the inference space, GPUs are still king. But if you need to get the best performance inside of a drone, you can't put a GPU in a drone, right? And so you might select a different type of device. Okay. So your business model essentially is renting AI compute to your clients. Tell us a little bit about the business model and a little bit about who the customers are.
Absolutely. So fundamentally, at the core, our business model is infrastructure as a service. So we build the infrastructure, optimize the physical layer, build the data center, stock it full of GPUs, and then lease that out to companies. But how we do that depends a little bit on the workload and on the specific requirements of the company.
We are a bit more hands-on, white-glove, bespoke than a lot of our GPU cloud competitors, because we'd like to ensure that the customers have the best possible experience. And so that means we might tweak the model a little bit for certain customers. But generally we're deploying this infrastructure.
We're leasing it out to them, and then we'll have varying levels of abstraction on top of that. A very sophisticated customer with their own software might just want raw access to the hardware, and we can facilitate that. But other customers might want some abstraction on top of that: maybe a software layer, a platform, or an API.
That makes it easier for them to consume those resources in an efficient manner, so they're getting the best bang for their buck, and they feel like they're having a good experience utilizing the hardware effectively. So that abstraction is additional value-add that you provide as part of the business model, obviously.
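As an illustration of the API end of that abstraction spectrum, here is a hypothetical sketch of a customer calling a managed, OpenAI-compatible inference endpoint instead of operating GPU servers directly. The endpoint URL, model name, and key below are placeholders, not TensorWave's actual API.

```python
# Hypothetical sketch of the "API layer" abstraction: send a request to a
# managed, OpenAI-compatible inference endpoint rather than provisioning
# GPU nodes. The URL, model name, and key are illustrative placeholders.
import json
import urllib.request

API_URL = "https://inference.example-gpu-cloud.com/v1/chat/completions"  # placeholder
API_KEY = "YOUR_API_KEY"                                                  # placeholder

payload = {
    "model": "example-open-weights-model",  # placeholder model name
    "messages": [{"role": "user", "content": "Summarize what a GPU cloud does."}],
    "max_tokens": 128,
}

request = urllib.request.Request(
    API_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    },
)

with urllib.request.urlopen(request) as response:
    result = json.load(response)
    print(result["choices"][0]["message"]["content"])
```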
Why AMD? Why AMD? That's an excellent question. It really goes down to the fundamentals of why we started this company. If you look at a bit over a year ago, when the company was founded, we noticed a problem in the AI space, and that was that there was a monopoly with NVIDIA, right? Every AI end user was using NVIDIA, and they had to use NVIDIA, right?
Regardless of whether NVIDIA was, you know, what they wanted to use, they had to use NVIDIA. It was the only solution in town. And so the end customer, they could buy from NVIDIA, or they could buy from NVIDIA Cloud One or NVIDIA Cloud Two, and that's the extent of their choice. And that struck us as an opportunity.
It's a problem that needs to be solved. It's unsustainable. It has to be solved at some point. It has to be addressed. And so we set out to figure out, can we address this? Can we bring a viable alternative to the market? And critically, in order for a viable alternative to be successful in this space, it has to check a lot of different boxes.
It has to be cost effective. It has to be available. It has to be scalable. And it has to be easy to use; it has to work, right? And that's probably the most critical of all. There have been a number of projects and hardware solutions over the years that on paper have excellent performance, but if they're not easy to use, you're not going to get market traction.
So we did the research: is there any solution out there that checks these boxes? The thing that was closest was AMD. And it just so happens that our team has a long history of working with AMD, and so we knew all the right people to contact and to work with at AMD to make this happen. And so we reached out and, you know, went at it.
And really, AMD is very well aligned with us on all of the goals, in terms of bringing a viable alternative to market that gives people choice when they're looking for compute for their AI workloads. They're very committed to open source, open frameworks, and allowing everybody to have access.
Instead of NVIDIA's model, which is to build a walled garden and not let anybody in, AMD is the polar opposite of that. So we're very well aligned in terms of vision and values. So you're building data centers. That's right. Okay. It's a very capital-intensive business. That it is. So help us understand how you're raising capital.
How much money have you raised? How are you deploying it? Yes. So we did our seed round earlier this year. The seed round was $44 million. Nice seed. That was all on SAFEs, so I think we might have set a record for the most money raised on a SAFE. But that was really just to get us going, and we built out our first couple of data centers with that and built out the team.
And right now we're going for the Series A, which is mid nine figures. But in this industry, generally what happens is you do a priced round, and then alongside that you get a large amount of collateralized debt financing, because you have so many, you know, physical resources, and you have contracts backing those physical resources.
So it's a very debt-friendly business. And so that's really where the majority of the capital comes from: this large-scale collateralized debt financing. So your Series A will price the round, is that right? That's right. The SAFE investors will get that price.
That's right. Awesome. We're excited. Very exciting. How do you see this market shaping up? So who are your customers today? Are they doing training? Are they doing inference? Are they doing a combination? How do you see these markets shaping up? I'm particularly interested in, you know, enterprise AI, since we're right here at the New York Stock Exchange.
Yes, enterprise is the big one. But there are several sectors here, and I can break them down. So it's really three groups. The smallest group is the startups, right? These are small contracts, companies that maybe raised their pre-seed round. They have a little bit of money, they need some initial compute, but those contracts are very small.
We do engage with those companies, but it's really more of a marketing effort than anything, right? And then in the middle you have enterprise. And enterprise, these are mid-market companies. They have budgets, but they might not have, you know, the world's top-tier AI talent, right? And then on the far end of the spectrum, you have hyperscale.
And these are very, very large contracts. However, there's only a small handful of customers that fit into this category. Okay. Big, big AI houses, the most popular names, and the hyperscale clouds themselves fall into that category. And that's building large clusters of hundreds of thousands of GPUs.
So you have these three categories, very broadly speaking. Within those categories, you have inference, you have fine-tuning, and you have pre-training. Those are the three big workloads. And so you have three big workloads spread across these three different customer groups. So you said inference, fine-tuning, and training, pre-training.
Yeah, it's pre-training, right? Yeah, right, right. Okay. Yeah, the foundational training, fine-tuning, and then inference, right? Okay, exactly. Now, training and fine-tuning have started to bleed into each other. So pre-training is the foundational step, and then after that you do smaller-scale training, fine-tuning, and then eventually inference.
And so bigger customers tend to have all of those, right? They need pre-training, they need fine-tuning, and they need inference. But as you move down towards smaller customers, it tends to be that pre-training gets filtered out, and then at some point fine-tuning gets filtered out, generally speaking. And that's because pre-training and fine-tuning are incredibly expensive.
Whereas with inference, you can get in for a much lower sum. And that's because pre-training anything competitive today requires thousands, tens of thousands, or hundreds of thousands of GPUs. And so you're talking hundreds of millions or several billion in CapEx. Even if you're only renting those for six months, it's still a significant sum.
And so not that many companies can afford to do pre-training, whereas a lot of companies can afford to do fine-tuning, and every company can afford to do inference. You also have another driving factor, which is what customers will need in the long run. There's a lot of pre-training happening now, and that has been happening over the past couple of years.
But the more we pre-train these models, especially open-source models, the more everyone has access to them. And they might want to fine-tune them a little bit, but really what they want to do is run them, right? And so inference is really the workload that scales. And inference scales infinitely.
Every single company on this planet needs inference, whether they know it or not. They all need inference. Nice. Whereas a smaller fraction of them need pre-training. So while we do engage in all of these workloads, we see inference being the thing that really scales. I want to ask you about energy and building data centers.
So you need power. That's right. How are you handling that problem? Are you building your own data centers, or are you partnering with colos? How are you solving that problem? It's a little bit of all of the above. We started doing colos; it's more cost effective initially. But it's clear for any cloud company that you have to build your own data centers as you get going.
And so that's always been on our roadmap, and we are transitioning into that as we speak. So today we have colocated data centers all over the U.S. But going into next year, we're looking at a few things: really building out globally, and building out very large-scale facilities in the U.S. So today, our facilities are tens of megawatts, 50 megawatts,
individually, but we have under contract sites that can do over a gigawatt. And so that's really where things are going, especially with these larger hyperscale deals. Every single hyperscaler out there is trying to get their hands on gigawatt-class data centers. It's an insane shift, because just three years ago, a 50-megawatt data center was state of the art.
And now we're talking about data centers 20 times that being not big enough, not dense enough. Right. And so we love to see it. It's a lot of fun, you know, keeping up with this pace. But we do have over a gigawatt now under contract, and that's something that our competitors cannot say. We were at Supercomputing in Atlanta.
I don't know if you were there. I was there. Yeah. And I was struck by how many liquid cooling vendors there are in the industry now. We had some on theCUBE; we actually had a hose manufacturer on theCUBE, they were called Omni Services. What do you guys do? They make hoses to deliver, you know, liquid cooling, and they're talking about connection integrity. So what's your take on cooling?
Liquid cooling, direct liquid cooling, hybrid cooling. I'd love to get your thoughts on two-phase, if you're up for it. I would love to talk about it. No, let's get into it. Great. Well, liquid cooling is necessary; it's happening. We use it almost exclusively now. This generation, the current generation of GPUs, is the last generation that it will be possible to air cool.
Right. This generation, most people are doing liquid cooling. You can still technically get away with air cooling. But next generation, not possible, right? Completely infeasible, because the power density of the chips is getting so high. And so AI has forced a lot of really good innovations in this space.
One of them being liquid cooling, others being, you know, on the energy side and the density side. But liquid cooling is really the key enabler for getting these higher-density chips. And the reason behind that is, when you're building these clusters of GPUs, especially for training workloads, the physical space that they take up is actually really important.
It's ideal if you can keep everything as close as possible, primarily for networking reasons, actually. And so that means you have to be dense. And so a rack used to be 10 kW, and then maybe 20 if you were really pushing it. That was state of the art just a few years ago with air cooling. And then today, with today's tech, you can easily do 120 up to 200 kW per rack.
So we're talking, you know, 10x from just a few years ago. And then the trajectory over the next couple of years is to go towards 1-megawatt racks, a single rack. That's, you know, 1,000 kW. And for reference, a house, you know, a general house might use 10 kW, right? And so you have a hundred homes' worth of power in one rack.
That's what we're talking about here. But liquid cooling is absolutely required, right? You can't even get anywhere close to those densities on air, because air is actually a pretty bad conductor of heat. So you've got to get a little more creative. And then technologies like immersion, which have other, more attractive heat-transfer properties, dual-phase or single-phase immersion, are also being looked at
a little more intensely now, although direct-to-chip liquid cooling is what's clearly winning at the moment, right? And in fact, you know, we talked about it. I was in a lab that Dell has; they're actually using warm water and a combination, it's hybrid, and they're able to tune the fans. Also, at Supercomputing,
we had a panel, and one of the brains behind two-phase was really, you know, pushing two-phase, saying it's more efficient. Do you have thoughts on that? Some people are pushing two-phase; they're combining the two, two-phase and direct-to-chip, which is a really interesting technology. I think that has promise.
And so it's essentially combining the benefits of evaporative cooling that you get with two-phase, but with the relative simplicity of a direct-to-chip design versus an immersion two-phase design, which is incredibly complex. Which the GPU manufacturers won't warranty. Exactly, not immersion, today.
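As a rough back-of-envelope illustration of why air cannot keep up at these densities, the sketch below (illustrative assumptions, not figures from the interview) compares the coolant flow needed to carry 120 kW out of a rack with water versus air, using textbook fluid properties and an assumed 10 °C coolant temperature rise.

```python
# Back-of-envelope sketch: coolant flow needed to remove 120 kW of rack heat.
# Assumes a 10 K coolant temperature rise and room-condition fluid properties;
# these are illustrative assumptions, not data from the interview.

RACK_HEAT_W = 120_000      # 120 kW rack
DELTA_T_K = 10.0           # assumed coolant temperature rise

WATER_CP = 4186.0          # specific heat, J/(kg*K)
WATER_DENSITY = 997.0      # kg/m^3
AIR_CP = 1005.0            # J/(kg*K)
AIR_DENSITY = 1.2          # kg/m^3

def volumetric_flow(heat_w: float, cp: float, density: float, delta_t: float) -> float:
    """Q = m_dot * cp * delta_T, so m_dot = Q / (cp * delta_T); divide by density for volume."""
    return heat_w / (cp * delta_t) / density   # m^3 per second

water = volumetric_flow(RACK_HEAT_W, WATER_CP, WATER_DENSITY, DELTA_T_K)
air = volumetric_flow(RACK_HEAT_W, AIR_CP, AIR_DENSITY, DELTA_T_K)

print(f"Water: ~{water * 1000 * 60:.0f} liters per minute")   # roughly 170 L/min
print(f"Air:   ~{air * 60:.0f} cubic meters per minute")      # roughly 600 m^3/min
```

The same heat that a couple of water loops can carry would take hundreds of cubic meters of air per minute, which is why direct-to-chip liquid cooling becomes mandatory at these rack densities.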
All right. All right. We got the bell. It's a special treat when they ring the bell while we're live on the program. It's like you win a prize. There it is. Fantastic. That's it. Yes, it is. Of course, this happens twice a day. And then of course the options exchange, they close in another half hour.
They'll ring the bell as well. So that's good. That's good. It's like a special moment at the New York Stock Exchange. Well, Darrick, phenomenal having you on. What's next for you guys? What should we be looking for? Our Series A and Series B, scaling out globally and across the U.S. with these larger-scale data centers.
That's really what we're focused on. So 2025 will be a big year for us as we scale into the hundreds of thousands of GPUs. Yeah. Well, supposedly Elon's going to try to get there in the first quarter. That's right. So we're going to test the scaling laws to see if they hold. We're very excited to see.
I hope they do. Darrick, thanks so much. I really appreciate your time. Thanks for having me. Thank you for watching this episode. We're here for three days of wall-to-wall coverage with NYSE Wired and theCUBE community. This is our Cyber and AI Innovators Week. Keep it right there. Dave Vellante for John Furrier.
We'll be right back.