How large did SemiAnalysis' Claude token spend get?
Dylan reports that the firm's Claude token spend inflected sharply to about $7 million per year (a run rate discussed in the interview).
Video Summary
enterprise token spend has surged: SemiAnalysis moved to roughly $7M/year in Claude token usage
execution costs have collapsed: ideas are abundant but high-quality ideas become the scarce resource
token demand is driving supply-side pressure in GPUs, memory, TSMC capacity and CPUs
advanced models (e.g., Anthropic Mythos) can be token-expensive per-call but more efficient overall
AI-driven productivity gains can create “phantom GDP” not captured by traditional metrics and widen inequality, possibly triggering public backlash
With powerful models and cloud tooling, implementing ideas is far cheaper and faster than before, so the limiting factor becomes selecting high-value ideas worth token spend.
Key bottlenecks include memory (DRAM/NAND) capacity, foundry constraints at TSMC, GPU availability (Nvidia), and rising demand for CPUs for training and RL environments.
Phantom GDP refers to value created by AI-driven information and productivity that improves output or decision-making but may not show up in traditional GDP statistics due to falling costs or measurement gaps.
Rapid adoption, visible job impacts, inequitable access to cutting-edge models, and poor public communication by AI leaders could fuel negative sentiment and large-scale protests.
"What used to matter a lot was execution; now, ideas are cheap and plentiful but execution has become very easy."
The landscape of AI development has shifted, as execution has become simpler while the proliferation of ideas has increased.
This shift emphasizes that only the strongest ideas can warrant the investment in inexpensive execution options.
"Last year, we thought we were heavy users of AI; this year, the spend has skyrocketed."
Dylan Patel describes how his team's engagement with AI tokens has transformed from what they thought was heavy usage last year to an exponential increase in spending this year.
The firm's annualized spend on AI tools has increased from roughly $5 million to $7 million, driven primarily by heavier use of Claude Code and other AI applications.
"What's really remarkable is that people who have never coded before are now using Claude Code."
The widespread adoption of AI has empowered non-technical staff to utilize AI tools for coding, significantly increasing operational efficiency.
The firm's annual spending on Claude Code has risen to $7 million, representing over 25% of its total salary expenditure, a major shift in budget allocation toward AI tools.
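The two figures above also bound the firm's salary bill. The sketch below is only back-of-the-envelope arithmetic on the numbers quoted in the summary (about $7 million per year, described as over 25% of salaries); the function and variable names are illustrative, and the result is an implied ceiling, not a figure stated in the interview.

```python
# Figures quoted in the summary: ~$7M/year run rate, described as
# "over 25%" of total salary expenditure.
ANNUAL_TOKEN_SPEND_USD = 7_000_000
MIN_SHARE_OF_SALARIES = 0.25


def implied_salary_ceiling(token_spend: float, min_share: float) -> float:
    """If token spend is at least `min_share` of salaries,
    then salaries can be at most token_spend / min_share."""
    return token_spend / min_share


ceiling = implied_salary_ceiling(ANNUAL_TOKEN_SPEND_USD, MIN_SHARE_OF_SALARIES)
print(f"Implied salary bill: under ${ceiling:,.0f} per year")
```

Because the share is described as "over" 25%, the true salary bill would sit somewhere below this ceiling.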
"One person on the team was able to create an application that is GPU accelerated...this is insane."
A team member developed an application that rapidly analyzes chip images using Claude tokens, replacing work that previously required an entire team for a year.
This highlights how AI and cloud computing can drastically reduce time and resource requirements in complex technical tasks.
"If I don't adopt AI, someone else will, and they will beat me."
The integration of AI into business practices is not just an opportunity; it has become a necessity for maintaining competitive advantage.
Firms must continually innovate and enhance their service offerings; failure to do so risks being relegated to a commoditized state where their services lose value due to competition.
"In just three weeks, we mapped the entire US grid and created a dashboard for power deficits and surpluses."
By leveraging Claude, the company massively accelerated the development of energy data services, achieving results that previously took larger teams decades.
This rapid progress exemplifies how adopting AI tools can lead to significant improvements in productivity and competitive differentiation in once-stagnant industries.
"To achieve enterprise adoption at scale, you have to deliver on core capabilities like SSO, SCIM, RBAC, and audit logs."
The discussion highlights the role WorkOS plays in enabling companies like OpenAI, Cursor, and Anthropic to become enterprise-ready.
Instead of spending extensive time building these capabilities in-house, organizations can use WorkOS APIs to ship them quickly and operate efficiently from day one.
Many leading AI teams use WorkOS so they can focus on product development and innovation without getting bogged down in foundational infrastructure.
"If I sell you information for a dollar, you're only buying it because you know that information helps you make a decision that lets you make more than a dollar."
The conversation emphasizes that information services inherently create value for their customers, who often derive much greater returns from such information.
Investment firms maintain their own information services but still continue to purchase external data, often because external providers can deliver insights more rapidly and innovatively.
There is significant interest in the supply and demand dynamics of AI tokens, particularly how companies are navigating this rapidly evolving market and responding to soaring demand for high-quality tokens.
"Anthropic has gone from 9 billion in revenue to what they're at 35-40 billion now, but their compute has not grown to the same degree."
A macro view of company performance highlights that Anthropic's revenue has soared without a corresponding increase in compute resources.
The model indicates a potential shift in margins and operational efficiency at the company: higher demand lets it tighten usage limits and allocate resources more selectively.
This creates an environment where token access becomes a primary concern, as businesses recognize their value generation capacities are tied to the most efficient usage of these advanced AI tools.
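The revenue figures in the quote above imply a growth multiple that can be worked out directly. The sketch below uses only the quoted numbers (9 billion, then 35 to 40 billion); the compute growth factor is a purely hypothetical placeholder, since the source says only that compute "has not grown to the same degree".

```python
# Revenue figures as quoted in the interview summary (billions of USD).
REVENUE_BEFORE_B = 9.0
REVENUE_NOW_RANGE_B = (35.0, 40.0)

# ASSUMPTION: hypothetical compute growth factor, not from the source.
HYPOTHETICAL_COMPUTE_GROWTH = 2.0


def growth_multiple(before: float, after: float) -> float:
    """How many times larger `after` is than `before`."""
    return after / before


low, high = (growth_multiple(REVENUE_BEFORE_B, r) for r in REVENUE_NOW_RANGE_B)
print(f"Revenue growth: {low:.1f}x to {high:.1f}x")

# Revenue per unit of compute rises whenever revenue outgrows compute.
per_compute_low = low / HYPOTHETICAL_COMPUTE_GROWTH
per_compute_high = high / HYPOTHETICAL_COMPUTE_GROWTH
print(f"Revenue per unit of compute: {per_compute_low:.1f}x to {per_compute_high:.1f}x")
```

Any compute growth factor below the revenue multiple gives the same qualitative result: more revenue earned per unit of compute.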
"If you have enough capital, you should get an enterprise subscription where you pay per token, not with these subscriptions."
Users express a strong desire to access the latest AI models, suggesting a competitive urgency to be first in line for advancements like version 4.7 of an AI model.
The demand for cutting-edge models illustrates how the market is driven not just by cost but by the potential for value creation and economic opportunity.
As costs for using advanced AI models drop, businesses must strategically leverage those tokens to maximize value and enhance their offerings. The drop also signals a shift toward the commercialization of AI capabilities, where token arbitrage becomes a primary business focus.
"The cost to hit a certain capability tier used to cost X and now it costs 1/100th or 1/1,000th of that."
Insights into the changing economic landscape of AI technologies show that costs for achieving comparable capabilities are decreasing dramatically, opening the field for innovation and new applications.
Predictions suggest that within a year, the pricing for the same AI model quality might decrease significantly, making advanced technologies more accessible.
This heightened accessibility will likely lead to the emergence of new use cases, driving demand further as businesses seek to integrate AI solutions that are economically viable and encourage broader usage across different sectors.
"Mythos is more expensive as a model, but it spends fewer tokens to perform tasks, making it more efficient overall."
The conversation highlights the efficiency of Anthropic's Mythos model compared to others, emphasizing that while individual tokens might be more expensive, the reduced token consumption leads to lower costs for various tasks.
Dylan Patel notes that rapid advancements in the AI model landscape have compressed release cadences, allowing for faster iterations and improvements.
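The Mythos efficiency point above comes down to cost per task being price per token times tokens used. The sketch below illustrates that relationship with entirely hypothetical prices and token counts; the model names and numbers are placeholders, not published figures for Mythos or any other model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    name: str
    usd_per_million_tokens: float  # hypothetical price, not a real quote
    tokens_per_task: int           # hypothetical token usage per task

    def cost_per_task(self) -> float:
        """Cost of one task = price per token * tokens consumed."""
        return self.usd_per_million_tokens * self.tokens_per_task / 1_000_000


# Placeholder profiles: one pricier per token but frugal, one cheaper but verbose.
pricier = ModelProfile("pricier-model", usd_per_million_tokens=30.0, tokens_per_task=40_000)
cheaper = ModelProfile("cheaper-model", usd_per_million_tokens=10.0, tokens_per_task=200_000)

for model in (pricier, cheaper):
    print(f"{model.name}: ${model.cost_per_task():.2f} per task")
```

With these placeholder numbers the model that costs three times as much per token is still cheaper per task, because it uses one fifth of the tokens.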
"The idea is cheap; implementation is what truly matters."
The discussion underscores a shift in which generating ideas has become easier, while implementing them, though still costly in tokens, has also become simpler. This shift allows for more rapid experimentation and iteration within the AI space.
As the cost of implementation decreases, the focus now shifts towards identifying which ideas justify the expenditure on tokens for successful deployment.
"Now ideas are cheap and plentiful, but execution is very easy."
The transition from difficult execution to accessible implementation has created an environment where many ideas can be tested quickly. This landscape poses a complex challenge: discerning which concepts are worthy of investment.
This transition could lead to a complete reordering of economic structures, in which only the most valuable ideas make effective use of the available resources.
"Who's going to have access to the newest model? That's going to be limited to certain companies."
The conversation reveals concerns about access to cutting-edge AI models, suggesting that only a select group of companies will be granted early or exclusive access, leading to a competitive disparity.
Companies capable of affording the high costs of infrastructure and AI usage may dominate in various sectors, further intensifying market inequality.
"Your ability to choose the right idea to implement is what truly matters."
A crucial point addressed is the importance of capital allocation for executing AI developments. As financial resources become the key to leveraging the newest models, disparities based on wealth could dictate market leaders.
The example of well-connected investors receiving preferential access to AI models highlights the potential for imbalanced competition within emerging technologies.
"We’ll start seeing real breakthroughs in robotics that enable few-shot learning."
There is optimism for future advancements in robotics, with expectations of significant improvements driven by the integration of AI models that enhance the learning capabilities of robots.
The conversation draws attention to the difficulties of merging software advancements with the hardware world, suggesting that future breakthroughs might follow the software singularity, enabling broader applications in physical robotics.
"There's going to be a huge explosion in physical good acceleration and deflationary effects there."
The future of robotics is expected to evolve significantly, with specific applications becoming more niche, such as robots designed for particular tasks like cleaning or folding clothes.
A rental model for these robots or the download of specific functionalities onto standard robots will likely emerge, which could drive an increase in token demand as the capabilities of such robots improve.
The speaker believes that token demand will continue to grow significantly, suggesting that there is no foreseeable slowdown in this demand surge.
"Mythos is a materially larger model than prior models, and it proves that the scaling laws still work."
The Mythos AI model demonstrates that larger models improve performance, adhering to known scaling laws of AI development.
The increase in model quality is correlated not only with more compute power but also with greater efficiency in compute usage, leading to decreased costs for significantly enhanced capabilities over time.
Companies like Google, Anthropic, and OpenAI are experimenting with different scaling strategies, with OpenAI focusing on a more measured approach that could yield better results over time.
"There's an interesting dynamic where Anthropic seems to be leading for now, but OpenAI continues to push aggressively in the compute space."
Anthropic appears to have the upper hand with their recent offering of the Mythos model, although their growth is limited by compute resource constraints.
OpenAI is aggressively investing in compute resources, which allows them to scale their capabilities quickly despite some criticism regarding their aggressive tactics.
The economic implications are evident: as the demand for AI tokens grows, so will the revenues for organizations that can effectively manage their compute resources to meet this demand.
"If you don't use more tokens, you'll never escape the permanent underclass."
To leverage the full potential of AI models, users must utilize more tokens to generate and capture economic value effectively.
The landscape is evolving, with early adopters being at an advantage as AI utilization becomes more commonplace and essential in the market.
Failure to adapt to this trend may lead to disparities, creating a permanent divide where individuals or organizations not utilizing AI effectively fall behind in economic terms.
"As demand skyrockets, prices are going up for everything on the supply side."
The rise in demand for AI technologies is leading to increasing prices for necessary components, including GPUs, which are seeing extended useful life and rising renewal costs.
This indicates a shift in market dynamics, where older technology can remain viable for longer, and the supply chain must adapt to meet the escalating demand for AI capabilities.
Companies are re-evaluating their technology lifespan and renewal strategies as the economic landscape shifts to accommodate new AI models and their requirements.
"Margins are expanding in the cloud layer, and hardware margins, with Nvidia still charging substantial gross margins, are extremely healthy."
The cloud service sector is experiencing expanding profit margins, reflecting growing demand for these services. Hardware companies, particularly Nvidia, maintain substantial gross margins of around 75% on their products.
Memory chip manufacturers are also witnessing skyrocketing margins, while sectors like optics and logic see slower margin growth, although many are benefiting from large prepayments.
There is a consistent trend across the supply chain where companies are either sold out or experiencing an increase in margins. Companies making chips, like Nvidia, often face higher capital costs due to substantial prepayments.
"Supply chains are usually very fast to react, but our supply chains now are more complex than ever, leading to longer lead times."
Modern supply chains are more intricate than in the past, leading to longer lead times for production and delivery of key components. This complexity can contribute to challenges in quickly ramping supply to meet rising demand.
Even though the memory industry's demand is robust, companies can only increase capacity by low double-digit percentages annually, which is insufficient to meet the current demand surge.
Memory prices are expected to continue rising significantly because the incremental supply will not arrive until later in 2028, causing many companies to face the pressure of higher pricing and demand destruction.
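The memory argument above is a compounding mismatch: supply that grows by low double digits each year falls further behind demand that grows faster. The sketch below indexes both to 100 and projects a few years forward. Both growth rates are assumptions chosen for illustration (12% is one reading of "low double-digit"; the demand rate is hypothetical), not figures from the interview.

```python
# ASSUMPTIONS: illustrative growth rates, not from the source.
SUPPLY_GROWTH = 0.12   # one reading of "low double-digit" annual capacity growth
DEMAND_GROWTH = 0.30   # hypothetical annual demand growth
YEARS = 3


def project(index_start: float, annual_growth: float, years: int) -> list[float]:
    """Compound an index forward, returning values for year 0 through `years`."""
    return [index_start * (1 + annual_growth) ** year for year in range(years + 1)]


supply = project(100.0, SUPPLY_GROWTH, YEARS)
demand = project(100.0, DEMAND_GROWTH, YEARS)

for year, (s, d) in enumerate(zip(supply, demand)):
    # A ratio above 1 means demand exceeds what capacity can serve.
    print(f"year {year}: supply={s:.0f} demand={d:.0f} demand/supply={d / s:.2f}")
```

Under any pair of rates where demand outgrows supply, the ratio widens every year, which is the pressure on prices the passage describes.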
"You need tons of CPUs, especially for reinforcement learning, as they play a critical role in running complex environments for model training."
CPUs are essential for executing various tasks within reinforcement learning, as they manage the environment where AI models train and evaluate their performance on different tasks.
The involvement of CPUs is crucial during training, where vast amounts of internet data are processed by models to generate outputs from the environment.
As AI applications grow in complexity, the demand for CPUs has skyrocketed, leading to supply shortages even as other parts of the semiconductor ecosystem, such as GPUs, are dominated by a few key players like Nvidia.
"Once you have these great models and you're deploying them, those models are generating useful output."
Iteration is critical in developing AI models, as successful trajectories are continually assessed and refined.
Central Processing Units (CPUs) play a significant role in executing these models and generating valuable outputs.
There's a high demand for CPUs due to their utility in running deployed applications that utilize AI models.
"The hardest area for us and for everyone is understanding token economics."
There is a good grasp of costs associated with infrastructure, token prices, and margins of AI labs.
However, modeling usage and adoption remains a complex and elusive task.
Frequent discrepancies in revenue predictions highlight the challenge of calibrating models based on user engagement with tokens.
"What is the value being created by these tokens, especially since it doesn't show up in GDP statistics?"
The transformation of tokens into enhanced information flows through the economy is not accurately captured by traditional GDP metrics.
Tokens create value by improving decision-making processes for businesses, leading to better competitive positioning.
The concept of "phantom GDP" emerges when recognizing the value created that does not materialize in conventional economic indicators.
"People hate AI, and with Anthropic adding revenue, it will start causing business changes downstream."
There is a belief that large-scale protests against AI companies may emerge as public sentiment shifts negatively.
The popularity of AI is perceived to be lower than that of politicians, indicating a growing public apprehension.
As AI companies face backlash, they need to engage more positively with the public and illustrate how AI can uplift society.
"AI leadership needs to stop getting on interviews as they lack charisma and present a misunderstood image of AI."
AI industry leaders should focus on presenting uplifting narratives about AI rather than constantly emphasizing its future capabilities.
Building connections with the public is crucial, as individuals often view AI developers as an unseen elite.
A shift in communication strategy is necessary to alleviate fear and misperceptions surrounding AI technology.