Tetra Compute will be a Managed Hosting Provider (MHP) for machine learning models.
This week I started working on Customer Development (CD) to figure out what that description means.
CD is functionally selling vaporware.
The product only exists as bullet points in a sales presentation and a design in Figma. The product in these designs and presentations is as perfect as it will ever be, so if I can’t generate interest here it means my ideas are wrong, or I’m talking to the wrong people.
The goal of CD isn’t a perfect pitch, product or strategy. It’s that I understand (1) who to call and (2) what to sell them so (3) those calls get commitment or advancements more often than not.
Some markets and products don’t really need CD and it’s more important to start smiling and dialing. The market might be new, so any assumptions are probably wrong. Or, it could be the product is so simple that it doesn’t need to be honed, it just needs to find the right audience.
But there is some threshold of market size and product complexity where CD becomes essential. I think this is more true for products that get close to being systems of record. The more layers of business functionality sit on top of your product, the more you should really understand the use cases surrounding the tool you’re about to build.
As you’ll read below, the market for MHPs is anyone who uses computers to solve problems. For Tetra, it’s everyone who will need LLMs.
The market for MHPs is three decades old and so large it’s hard to comprehend. MHPs are mission-critical low-level infrastructure for their customers, so it’s important to spend the time upfront to think clearly about these CD questions.
To understand CD question #1 we need to understand #2 - what we’re attempting to sell.
Today I’ll be diving into three questions:
What the hell is an MHP?
What are the main categories of MHPs today?
Where does Tetra Compute fit into the market?
Get ready, this is a long one.
What the hell is an MHP?
Let’s align on some extreme basics.
Computers do computational Tasks. They crunch numbers and data. We don’t care what those Tasks are or how the Tasks are performed. All we care about is that I can write a program and execute it on a computer to produce some value. That’s all a Task is.
Computers can talk to each other over a Network. This can be a network inside of one business building, the local Wifi Network in your home, or it can be the entire Internet.
Networks allow computers and their programs to talk to each other. With Networks, you can share Task loads. In many business cases we choose to put most of our programs together on powerful computers on our Network and have them do all of our Tasks. When a weaker computer needs to do a Task, it can use the Network to request that the powerful computers perform the Task. The powerful computer executes the Task and responds over the Network with the output.
We call these centralized, powerful computers Servers.
There are a lot of different kinds of Servers. None of these are important for us to dive into here. In fact, let’s take the word “Server” out of the equation. Let’s call it something more abstract. Let’s call it a Computational Resource, or even more simply, a Resource.
I promise this is going somewhere.
We have all the pieces to our MHP puzzle now, believe it or not.
We have Computational Tasks being performed by a Computational Resource. Resources communicate with each other and our computers over Networks.
If you have a working Network of Resources performing Tasks, you have what we’ll call Infrastructure.
In ye olde dark days of the 80s and early 90s, before I and most of my current readers were born, Infrastructure was pretty straightforward. You had a bunch of computers in the same building wired into the same Network and they could talk to each other. Some big blocky Resource in the basement handled Tasks and you had a guy who was awkward at parties making sure it didn’t overheat. If there were too many Tasks for the Resource, you could add some more horsepower to the Resource or add more Resources in parallel. If you added more Resources, new requests for Tasks would be routed to whichever Resource had more capacity.
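For the engineers in the room, that routing rule fits in a few lines. This is only a rough sketch of "send the Task to whichever Resource has more capacity". The Resource class, its capacity numbers and the route_task function are names I made up for illustration, not how any real system from that era worked.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    capacity: int          # hypothetical: max Tasks this Resource can run at once
    active_tasks: int = 0  # Tasks currently running on it

    @property
    def free_capacity(self) -> int:
        return self.capacity - self.active_tasks


def route_task(resources: list[Resource]) -> Resource:
    """Assign a new Task to the Resource with the most free capacity."""
    target = max(resources, key=lambda r: r.free_capacity)
    if target.free_capacity <= 0:
        # Every Resource is full: time to add horsepower or another Resource.
        raise RuntimeError("no Resource has capacity left")
    target.active_tasks += 1
    return target


# Two Resources in the basement, one bigger than the other.
basement = [
    Resource("big-blocky-1", capacity=4),
    Resource("big-blocky-2", capacity=2),
]
assignments = [route_task(basement).name for _ in range(5)]
```

The bigger Resource soaks up most of the Tasks until the two have the same room left, and once both are full the only options are the ones described above: more horsepower or more Resources.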
Then the Internet happened. We ran some cables across the oceans and into every home in the developed world. Everyone started connecting to the Network to end all Networks. Now a company could have its Resources in a New York office work with Resources in the San Francisco office. Equally important, random folks on the Internet could go to Amazon.com and order books.
In the neolithic age of the Internet, this is equivalent to inventing farming.
For the first time your Infrastructure wasn’t just responsible for your business-related Tasks. Tasks were your business. Your entire business now depended on your Infrastructure in an existential way.
This, understandably, required some new innovations.
Again, I promise I’m almost to MHPs.
So, Infrastructure matters now. Well, it turns out Infrastructure is pretty damn hard. It’s even harder for successful businesses. It takes a lot of time, money and specialized teams to grow something like Amazon and not constantly crash. Those chunky computers in the basement turned into entire warehouses full of Resources. Now you needed a legion of awkward IT folks and engineers to build Tasks, Networks and Infrastructure. And it turns out that those nerds are expensive.
For many companies struggling to transition to the Internet, Infrastructure was not their core competency. Hiring technical employees to manage these systems was painful.
Predictably, an entire industry popped up to solve these pains.
These companies had a simple value proposition. “Hey, I have a warehouse full of Resources that are hooked up to the Internet Network. I’ll make sure the lights stay on in the warehouse. You can send your programs to me. I’ll run them on my Resources and send you the results back over the Internet. If you need more Resources, call me and I’ll set up a new one for you. In exchange, you’ll pay me less than you’d need for your own IT staff.”
Pretty damn good deal. Let’s condense that pitch to a single sentence: “My company offers Infrastructure as a service.”
In modern startup bro parlance we call this IaaS: Infrastructure as a Service.
An IaaS provider is a Managed Hosting Provider. (Yay! We made it!) IaaS providers offer the lowest level of abstraction. If you don’t want to manage your Resources, IaaS providers will manage the hosting of them in their warehouses.
Early pioneers of IaaS in the 90s were functionally outsourced IT teams that kept all of their customers’ Resources in a single warehouse. This changed rapidly with the introduction of Amazon Web Services in 2006 and the move to cloud computing in the mid 2000s.
Cloud IaaS providers offered more than an outsourced IT team. Instead, the IT team became a configuration file. Configure AWS with what kind of Resource you want and how you want it to scale and AWS will automatically keep up with the number of Tasks you throw at it. No more calling Jim at Rackspace down in San Antonio to order a new Resource and put it on a rack for you. That happens automatically. AWS still operates warehouses full of Resources, but they’ve standardized those Resources and simplified the problem for you, unlocking a huge amount of potential for your engineering teams.
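To make "the IT team became a configuration file" a little more concrete, here is a toy sketch. The config keys and the desired_resources function are hypothetical and are not real AWS settings or APIs. They only show the idea: you declare the limits once, and the number of Resources follows the Task load without a phone call.

```python
import math

# Hypothetical scaling config. These keys are invented for illustration
# and do not correspond to actual AWS options.
scaling_config = {
    "resource_type": "standard",
    "min_resources": 1,         # never scale below this
    "max_resources": 10,        # never scale above this (protects the bill)
    "tasks_per_resource": 100,  # assumed load one Resource can handle
}


def desired_resources(pending_tasks: int, config: dict) -> int:
    """Return how many Resources should be running for the current load."""
    needed = math.ceil(pending_tasks / config["tasks_per_resource"])
    # Clamp to the limits declared in the config.
    return max(config["min_resources"], min(config["max_resources"], needed))


quiet_night = desired_resources(0, scaling_config)
normal_day = desired_resources(250, scaling_config)
black_friday = desired_resources(5000, scaling_config)
```

With these made-up numbers, a quiet night keeps one Resource running, a normal day gets three, and a Black Friday spike is capped at ten no matter how many Tasks show up.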
But as with everything in engineering: A powerful new abstraction allows us to beat back the boundaries of complexity and unlock a new world of potential; and in pursuing the new potential we find nothing but new borders of complexity.
That’s my flowery way of saying - you’ve replaced your IT team with a configuration file, but now it is so easy to make entire fleets of Resources that you need a team of site reliability engineers to make sure everything doesn’t come crashing down. And those engineers are even more expensive.
These new reliability engineers are there for two reasons: The first is to tame the complexity and make it visible. The second is to build tools to help manage the complexity so it’s easier for software engineers to maintain the Resources they’ve built. This means building tools like logging, deployment pipelines and a bunch of other jargon we don’t need to consider right now.
We’ll call these sets of tools Platforms.
Many companies didn’t need this complexity. They might have simpler or smaller workloads. They may have less traffic. In many cases, this technical expertise is not their core competency.
Predictably, an entire industry popped up to solve these pains.
This new industry’s value proposition is: “Hey, all these IaaS providers are super powerful, but getting your Resources configured correctly can be brutal. A few wrong choices can lead to a lot of wasted money. We’ll give you a software Platform that helps you put Resources into IaaS providers. You won’t have as many configuration options as IaaS, but things will be cheaper and more predictable. We’ll even throw in a bunch of tooling for your Resources that you’d have to buy, configure or build for yourself otherwise. In exchange, you’re going to pay us a bit more than you pay for IaaS Resources, but it’s still way less than paying for an engineering team.”
Pretty damn good deal. Let’s simplify this one too. “I’ll sell you a Platform as a service”.
Translated into startup bro it’s PaaS: Platform as a Service.
PaaS are Managed Hosting Providers at a higher level of abstraction. They help you manage Resources on modern Infrastructure, typically IaaS providers. They’re Infrastructure middlemen.
All the pieces of our understanding are on the board now. Let’s review.
Companies have Tasks that need to be done on Resources over a Network. A set of Resources over a Network is called Infrastructure.
Some companies need their Tasks to be done privately, or they have the expertise to host their own Infrastructure. This Infrastructure is private. Other companies don’t really care where their Tasks are done, do not have the internal talent or time to manage it themselves, or have huge Task requirements. These companies can buy Infrastructure from IaaS providers. This Infrastructure is public and generally called “the cloud”.
IaaS Infrastructure can be complex and difficult to manage. You need a Platform to manage it. If you don’t want to configure the Platform yourself you can buy one from PaaS providers.
Together, IaaS and PaaS make up the market of Managed Hosting Providers.
We did it! :)
… almost.
Turns out there are more than a handful of ways an MHP can play their game. Let’s dive in before circling back to Tetra Compute.
Modern Categories of MHPs
What I’ve written above is obviously a very simplified framework for dipping your toes into this market. But it’s been surprisingly helpful in understanding how to quickly categorize different companies as I come across them. I want to break down some of the more interesting subcategories and hybrid categories that have popped up in my research.
Old-school IaaS
Infrastructure in a warehouse. Fully old-school seems to be a dying breed.
IaaS.
Highly configurable, highly scalable Infrastructure
AWS EC2 (elastic compute cloud)
Hybrid IaaS > PaaS.
Highly configurable Platform. An onramp to full IaaS
Hybrid PaaS > IaaS
Companies that have their own independent cloud Infrastructure and have vertically integrated their own PaaS offering on top of that Infrastructure.
Full PaaS
Vertical Market PaaS
PaaS focused on specific Resources, specific Tasks, or both
WP Engine (WordPress hosting)
Shopify (e-commerce hosting)
Squarespace (turbonormie website hosting)
Multi-cloud PaaS
Unified Platforms to help companies use multiple IaaS providers
Private PaaS
Platforms that are downloaded and installed locally on private, self-hosted Infrastructure that doesn’t touch the cloud
Hybrid Private + Public PaaS
Unified Platforms to help companies manage public Infrastructure from IaaS Providers in addition to self-hosted Infrastructure
Clarifying Tetra’s Core Assumptions
Let’s use the terms and MHP styles above to more clearly define the idea for Tetra Compute.
There are businesses that need to perform Tasks with Machine Learning models. Machine Learning models can run on standard Resources, but they run dramatically better on custom-built Resources that have more powerful GPUs.
These businesses can buy those Resources and self-host them on their own Infrastructure.
There are issues with that:
GPU Resources are more expensive than standard Resources and have a big supply bottleneck right now (why NVIDIA is currently worth more than one trillion dollars).
The Tasks that run on these Resources are harder to write and they require a different set of programs to host
So IaaS providers are stepping in. They’re using their capital to build warehouses of GPU Resources. Every IaaS provider now has the option to have GPU Resources.
With GPU Resources unlocked at the IaaS layer, PaaS providers have built GPU Resources into their Platforms.
The market pivoted quickly here. At first glance it feels like the industry rapidly sucked all of the air out of the room. Where does Tetra Compute play?
I believe Tetra Compute can be a Vertical Market PaaS focused specifically on GPU Resources and ML Tasks. I don’t believe a Platform built for traditional Resources and Tasks will be able to keep up with the evolution of this market and the specific needs that arise from managing AI in an enterprise.
Why?
Traditional software development is a decades-old industry at this point. The broad categories of problems a Platform needs to solve are pretty clear. As an industry we’ve gotten a lot better at solving them at higher and higher levels of complexity as our Infrastructure has improved.
Some of the existing Platform tooling will definitely carry over into the AI landscape.
But AI development has some fundamental differences from standard software development; it’s more experimental and exploratory due to the underlying statistical nature of AI.
And the foundations of the AI market are moving at exponential rates:
AI models are making leaps and bounds on the same set of capabilities every quarter
The architectures under those models are seeing a lot of experimentation which could create entirely new capabilities, unlocking new forms of Tasks
The hardware that architecture runs on is seeing unprecedented levels of investment. New capabilities can be unlocked on existing architectures if they can use more hardware. The proliferation of hardware could unlock new architectures.
Separately, there is increased investment in non-AI Tasks that run better on GPU hardware.
Tetra Compute is a bet that these market forces and technical differences will create a gap in Platform tooling that we can fill.
This vision pretty closely mirrors WP Engine’s approach to the WordPress ecosystem.
You can host WordPress websites on any IaaS or PaaS provider you want, but WP Engine makes it dramatically easier to host WordPress websites specifically. They completely specialize in this one hosting stack. That specialization allows them to build a Platform that has WordPress-focused tools that no other PaaS Platform is going to have.
Now I need to start calling customers and figuring out what those tools are.
More on that in the next Customer Development update.
If you made it this far, thanks for reading. You’re a GOAT. Would love to hear thoughts.