

I have been thinking a lot about the competition between OpenAI, Anthropic,
Meta, and Google for who has the best frontier AI model.
I think it comes down to four key areas:
- The Model Itself
- Post-training
- Internal Tooling
- Agent Functionality
Let’s look at each of these.
The Model
The model is obviously one of the most important components because it's
the base of everything.
So here we're talking about how big and powerful the base model is, e.g.,
the size of the neural net. This is a competition around training clusters,
energy requirements, time requirements, etc. And with each generation (e.g., GPT
3→4→5), it gets drastically harder to scale.
So it’s largely a resources competition there, plus some smart engineering
to use those resources as efficiently as possible.
But a lot of people are figuring out now that it’s not just the model that
matters. The post-training of the model is also super key.
Post-training
Post-training refines and shapes model knowledge to enhance its accuracy,
relevance, and performance in real-world applications.
I think of it as a set of highly proprietary tricks that magnify the
overall quality of the raw model. Another way to think of this is to say
that it’s a way to connect model weights to human problems.
I’ve come to believe that post-training is pivotal to the overall
performance of a model, and that a company can potentially still dominate if
they have a somewhat worse base model but do this better than others.
I’ve been shouting from the rooftops for nearly two years that there is
likely massive slack in the rope, and that the stagnation we saw in
2023 and 2024 around model size will be leapfrogged by these tricks.
Post-training is perhaps the most powerful category of those tricks. It’s
like teaching a giant alien brain how to be smart, when it had tremendous potential before but no direction.
The model itself might be powerful, but it’s unguided. Post-training
teaches the model about the types of real-world things it will have to work
on, and makes it better at solving them.
So that’s the model and post-training, which are definitely the two most
important pieces. But tooling matters as well.
Internal tooling
What we’re seeing in 2024 is that
the connective tissue around an AI model really matters. It makes the
models more usable. Here are some examples:
- High-quality APIs
- Larger context sizes
- Simple fine-tuning
- Needle-in-a-haystack performance
- Strict output control
- External tooling functionality (functions, etc.)
- Trust/Safety features
- Mobile apps
- Prompt testing/evaluation frameworks
- Voice mode on apps
- OS integration
- Integrations with things like Make, Zapier, and n8n
- Anthropic’s prompt caching
Just like with post-training, these things aren’t as important as the model
itself, but they matter because things are only useful to the extent that
they can be used.
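To make one of those items concrete, here’s a minimal sketch of what “strict output control” buys you: a guarantee that the model’s response conforms to a schema before anything downstream touches it. Everything here is hypothetical and hand-rolled for illustration (the schema, the `validate_output` helper, and the canned response are mine, not any provider’s API); real JSON-mode / structured-output features enforce this natively.

```python
import json

# Hypothetical schema: the fields we require from the model,
# mapped to the Python types we expect back.
SCHEMA = {"title": str, "sentiment": str, "confidence": float}

def validate_output(raw: str, schema: dict) -> dict:
    """Parse a model's raw text response and enforce a strict schema.

    Raises ValueError if the response is not JSON, or if any required
    field is missing or has the wrong type -- the kind of guarantee
    strict-output-control features give you natively.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        raise ValueError(f"model did not return JSON: {e}")
    for key, expected_type in schema.items():
        if key not in data:
            raise ValueError(f"missing required field: {key}")
        if not isinstance(data[key], expected_type):
            raise ValueError(f"field {key!r} should be {expected_type.__name__}")
    return data

# Simulated model response (in practice this comes from an API call).
raw_response = '{"title": "Q3 Report", "sentiment": "positive", "confidence": 0.92}'
result = validate_output(raw_response, SCHEMA)
print(result["sentiment"])  # positive
```

The point of pushing this into the ecosystem itself is that your application code stops needing defensive parsing like this at every call site.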
So, Tooling is about the integration of AI functionality into
customer workflows.
Next, let’s talk about Agents.
Agents
Right now AI Agent functionality is mostly externally developed and
integrated. There are projects like CrewAI, AutoGen, LangChain, LangGraph,
etc., that do this with varying levels of success.
But first—real quick—what is an agent?
❝
An AI agent is an AI component that interprets instructions and takes on
more of the work in a total AI workflow than just LLM response, e.g.,
executing functions, performing data lookups, etc., before passing on
results.
Real-world AI Definitions
So basically, an AI Agent is
something that emulates giving work to a human who can think, adjust
to the input given, and intelligently do things for you as part of a
workflow.
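Here’s a toy sketch of the loop an agent runs: interpret an instruction, pick a tool, execute it, and hand back the result, rather than just returning LLM text. Everything in it is invented for illustration (the tool names, the keyword routing); real frameworks like the ones above have a model decide which tool to call and with what arguments.

```python
# Toy agent loop. The tools and the keyword-based routing are
# illustrative stand-ins, not how real agents pick tools.

def search_tool(query: str) -> str:
    return f"(pretend search results for '{query}')"

def lookup_tool(record_id: str) -> str:
    return f"(pretend database record for '{record_id}')"

TOOLS = {"search": search_tool, "lookup": lookup_tool}

def run_agent(instruction: str) -> str:
    """Pick and execute the right tool for an instruction, then report back."""
    for name, tool in TOOLS.items():
        if name in instruction.lower():
            result = tool(instruction)
            return f"Agent used {name}: {result}"
    # No tool matched: fall through to a plain (pretend) LLM response.
    return f"LLM answer to: {instruction}"

print(run_agent("Search for Q3 revenue numbers"))
```

The distinction the definition above is drawing is exactly the branch in that function: an agent does work (executes functions, looks things up) before passing results on, instead of only generating a response.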
I think the future of Agent functionality is to have it deeply integrated
into the models themselves. Not in the weights, but in the ecosystem
overall.
In other words, we soon won’t be writing code that creates an Agent in
Langchain or something, which then calls a particular model and returns the
results to the agent.
Instead, we’ll just send our actual goal to the model itself, and the
model will figure out which parts need agents spun up, with which
tools (like search, planning, writing, etc.), and
it’ll just go do it and give you back the result when it’s done.
This is part of the whole ecosystem story: taking pieces that are
external right now (Agent Frameworks) and bringing them internal to the
native model ecosystem.
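One way to picture that shift: today your code wires up the orchestration; in the native-ecosystem version, that wiring lives behind a single goal-level call. The sketch below is entirely hypothetical (the `NativeModel` class, its `run` method, and the fixed plan are invented to make the idea runnable; no real API looks like this yet).

```python
# Hypothetical sketch of the interface shift: hand the ecosystem a goal,
# and it plans, spins up agents, and picks tools internally.

class NativeModel:
    """Stand-in for a model ecosystem with built-in agent orchestration."""

    def run(self, goal: str) -> str:
        # A real ecosystem would decide this plan dynamically; we fake
        # it with a fixed sequence so the example actually runs.
        plan = ["search", "plan", "write"]
        steps = [f"{step} done for goal: {goal}" for step in plan]
        return "; ".join(steps)

# One goal-level call instead of hand-written orchestration code.
model = NativeModel()
print(model.run("Summarize this week's sales data"))
```

The external-framework version of this would be dozens of lines of explicit agent and tool setup; that’s the code that moves inside the ecosystem.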
Analysis
Here’s how I see this playing out.
Models continue to get bigger and bigger, but you can only multiply by 10 so
many times before you run out of GPUs and energy. After a number of years,
gains in model power will have to come from efficiency gains, algorithm
improvements, and other tricks.
At some point, most of the gains will start coming from post-training,
because that’s where we harness and direct the power of the models.
It’s how effectively we’re explaining our problems to the model, and giving
it ways of unlocking its intelligence to help us solve them. So gains there
are multiplicative or exponential on top of the gains of model intelligence.
Tooling will continue to make it easier and easier to use these AI
ecosystems in daily life. From the command line to voice, integrated into
the tools and workflows we use every day, e.g., email,
calendar, reading, writing, etc. In short: it’ll just get easier to use these
models wherever you are and whatever you’re doing. And it won’t require you
to contort yourself in order to do so.
And finally, and most significantly, we’re going to move from using AI
ourselves to giving tasks to AI Agents, which will ultimately become
integrated into Digital Assistants. This is the big one, because individuals
and companies will then be able to spin up massive teams of agents to do
work for them, effectively multiplying their output many times over.
Summary
- We should start thinking about top AI models as Model Ecosystems rather than just models, because it’s not just the neural nets doing the work.
- There are four main components to a Model Ecosystem: the Model itself, Post-training, Internal Tooling, and Agent Functionality.
- #1 (The Model) is the most well-known piece, and it’s largely judged by its size (billions of parameters).
- #2 (Post-training) is all about teaching that big model how to solve real-world problems.
- #3 (Internal Tooling) is about making it easier to use a given model.
- #4 (Agent Functionality) emulates human intelligence, decision-making, and action as part of workflows, ultimately multiplying the capabilities of companies and individuals.
- The company that wins the AI Model Wars will need to excel at all four of these, not just build neural nets with the most parameters.
NOTES
- Thanks to Jai Patel for informing many thoughts on this, especially around pre-training.
- Some additional, related reading:
  - We’ve Been Thinking About AI All Wrong: AI is just a way to execute Intelligence Tasks that only humans can (could) do (danielmiessler.com/p/weve-been-thinking-about-ai-all-wrong)
  - Companies Are Just a Graph of Algorithms: AI is about to see your company as a series of components to be optimized (danielmiessler.com/p/companies-graph-of-algorithms)