Why Michael Burry Is Wrong About AI Depreciation (But Still Might Be Right)
Nvidia ships new silicon every 12 months. So why are hyperscalers betting on 6-year depreciation cycles?
There’s a cognitive dissonance at the heart of the AI infrastructure boom. Nvidia, the undisputed king of datacenter GPUs, has turbocharged its release cadence to an almost absurd annual rhythm—Hopper in 2022, Blackwell in 2024, Blackwell Ultra in 2025, Vera Rubin slated for late 2026, and Rubin Ultra gunning for 2027. Each generation delivers roughly 40-50% better performance per dollar, a Moore’s Law on steroids that should render last year’s $40,000 chip about as useful as a flip phone.
Yet Microsoft, Google, and their peers, in the course of complying with accounting standards, tell their auditors that those chips will be productive for six years. Meta says five and a half. CoreWeave, the GPU cloud upstart, confidently pencils in six years of useful life for its fleet. These aren’t accounting tricks—they’re audited, defended projections based on something the balance sheet can’t quite capture: the cascading lifecycle of AI compute.
The Waterfall Economy
Here’s the secret hyperscalers don’t advertise: GPUs rarely die. They just get demoted.
When an H100 arrives at a Microsoft datacenter, it’s immediately thrown into the computational thunderdome—training GPT-5 or whatever ungodly large language model needs 16,000 GPUs running in lockstep. The chip is maxed out, thermal limits pushed, NVLink connections humming at 900 GB/s. This is the frontier, where utilization rates hit 60-70% and a single GPU failure can cascade into hours of lost training time. Meta’s Llama 3 training run saw 30.1% of disruptions caused by GPU failures over 54 days—a brutal 9% annualized failure rate that suggests a three-year ceiling for this tier of work.
But here’s also where the waterfall begins. When Blackwell Ultra ships later this year, those H100s won’t be scrapped. They’ll cascade down to inference—the unglamorous but enormous work of actually running trained models for users. Inference is slower-paced, less thermally punishing, and constitutes the majority of AI workload volume. Microsoft CEO Satya Nadella laid it out plainly: a GPU trains a model, then generates synthetic data for the next model, then runs inference “in all sorts of ways.” It’s not locked into one job forever.
Behind inference comes another tier: serving small language models, handling video transcoding for cloud customers, powering Google Colab notebooks. IBM Cloud and Google Cloud still rent out 2016-era Tesla P100 GPUs—8-year-old silicon—because someone, somewhere, needs exactly that much compute and doesn’t want to pay H100 prices. Google’s own 7- and 8-year-old TPUs report 100% utilization. One data point even surfaced nine-year-old M4000 GPUs running at near-total capacity, still generating revenue.
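To make the waterfall concrete, here is a minimal sketch of how a fleet operator might map GPU cohorts onto workload tiers by age. The tier names, age cutoffs, and the GpuCohort structure are hypothetical illustrations for this article, not any hyperscaler’s actual scheduling policy.

```python
from dataclasses import dataclass

# Hypothetical workload tiers, ordered from most to least demanding.
# The age windows (years since deployment) are illustrative assumptions,
# not any cloud provider's published policy.
TIERS = [
    ("frontier training", 0.0, 2.0),   # large-scale pretraining runs
    ("inference",         2.0, 4.0),   # serving trained models to users
    ("light serving",     4.0, 6.0),   # SLMs, transcoding, notebooks
    ("long tail",         6.0, 99.0),  # research clusters, batch jobs
]

@dataclass
class GpuCohort:
    name: str          # e.g. "H100"
    age_years: float   # time since deployment
    count: int

def assign_tier(cohort: GpuCohort) -> str:
    """Demote a cohort to the tier whose age window it falls into."""
    for tier, min_age, max_age in TIERS:
        if min_age <= cohort.age_years < max_age:
            return tier
    return "retired"

fleet = [
    GpuCohort("B200", 0.5, 10_000),
    GpuCohort("H100", 2.5, 25_000),
    GpuCohort("A100", 4.5, 40_000),
    GpuCohort("P100", 8.0, 5_000),
]
for cohort in fleet:
    print(f"{cohort.name}: {assign_tier(cohort)}")
```

On these made-up cutoffs, the 2016-era P100s land in the long-tail tier rather than the scrap heap, which is exactly where IBM and Google still rent them out.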
The Power Problem Nvidia Won’t Solve
There’s an economic kicker: newer isn’t always better when you’re power-constrained. Modern datacenter GPUs consume 700 watts or more—a tangible thermal stress on silicon and a real constraint on datacenter capacity. Older chips draw less power. If you’re a hyperscaler staring at maxed-out electrical infrastructure and a queue of inference workloads that don’t need bleeding-edge performance, running a fleet of three-year-old A100s starts looking smarter than idling capacity while waiting for substation upgrades.
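A rough sketch of that trade-off, with every number (fleet size, hourly rate, power cost, wait time) invented purely for illustration:

```python
# Hypothetical comparison: keep an already-paid-for A100 fleet serving
# inference, or idle that capacity for six months while waiting for the
# substation upgrade that a 700 W-class replacement would require.
# Every figure below is an illustrative assumption, not a measured one.
A100_COUNT = 5_000
REVENUE_PER_GPU_HOUR = 1.10       # assumed tier-two inference rate, $/hr
POWER_COST_PER_GPU_HOUR = 0.05    # assumed: ~400 W at ~$0.12/kWh

WAIT_MONTHS = 6
HOURS = WAIT_MONTHS * 30 * 24

keep_running = A100_COUNT * HOURS * (REVENUE_PER_GPU_HOUR - POWER_COST_PER_GPU_HOUR)
idle_and_wait = 0.0               # idled capacity earns nothing

print(f"Run the old fleet for {WAIT_MONTHS} months: ${keep_running:,.0f}")
print(f"Idle and wait for the upgrade:            ${idle_and_wait:,.0f}")
```

On these assumptions the old fleet clears north of $20 million while the new racks wait on the substation, which is the whole argument for keeping it plugged in.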
The H100 retains 60-83% of its value after 18 months in the secondary market—a depreciation curve that looks nothing like normal IT equipment. That residual value exists because the use cases cascade downward for years. The chip doesn’t become worthless; it becomes appropriately priced for tier-two workloads.
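It is worth checking what that resale range implies against the books. A back-of-the-envelope comparison, assuming straight-line depreciation and an illustrative $30,000 purchase price:

```python
# Book value under straight-line depreciation vs. the 60-83% resale range
# quoted above. The purchase price is an assumed figure for illustration.
price = 30_000.0      # assumed H100 acquisition cost, USD
months = 18

for life_years in (3, 6):
    book = price * max(0.0, 1 - months / (life_years * 12))
    print(f"{life_years}-year schedule: book value ${book:,.0f} ({book / price:.0%} of cost)")

low, high = 0.60 * price, 0.83 * price
print(f"Observed resale range: ${low:,.0f} to ${high:,.0f}")
```

On these assumptions the secondary market is pricing an 18-month-old H100 closer to its six-year book value (75% of cost) than its three-year book value (50%).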
Enterprise Inertia vs. Silicon Valley Time
Nvidia’s annual cadence creates a vortex of FOMO, but actual customers don’t move at marketing-cycle speed. Large enterprises are only now adopting H100s en masse. The B200s Nvidia is shipping this year? Not even on the roadmap for most corporate AI teams. Yet. One customer recently sought a massive cluster of A100s for inference despite three newer chip generations existing, simply because A100s fit their validated, production-ready stack.

CapEx cycles, R&D alignment, migration planning—these things take time measured in quarters, not product launch countdowns. The result is operational inertia that extends the practical lifecycle of any given chip generation well beyond what the spec sheet suggests.
The Accounting Tells the Truth
Here’s what depreciation schedules actually reveal: Microsoft and Google didn’t just wake up one morning and decide to depreciate servers over six years for fun. Those numbers are defended to auditors with engineering data, failure analysis, and utilization metrics. Depreciation expert Dustin Madsen notes that companies must be able to justify their useful-life estimates in practice (and these estimates can vary from one entity to another, even if the underlying “machine” is the same)—and six years keeps passing the audit test.
Meta’s decision to extend its server and networking asset life to five and a half years took effect in 2025, right as the AI infrastructure arms race hit peak intensity. CoreWeave made a similar call, bumping from five to six years. Extending book life does nothing for these companies’ tax bills, since tax depreciation runs on its own statutory schedules; what it does is lift reported earnings today while committing them to defend the longer estimates to auditors, year after year.
The dissonance resolves once you see the full picture: technological obsolescence for frontier training happens fast—18 to 36 months at the cutting edge. But economic and physical usefulness extends far longer because the waterfall never stops. There’s always a tier below, always another workload that can’t justify top-tier silicon but needs more than a CPU.
The Industrial Truck Theory
Think of it less like smartphones (which become e-waste the moment software support ends) and more like a logistics company’s truck fleet. The newest, most fuel-efficient rigs handle the long-haul routes. But when a better model arrives, the old trucks aren’t junked—they’re reassigned to regional routes, then local delivery, then parts runs. Each tier generates less revenue per mile but remains profitable for years, right up until the transmission finally dies.
That’s the GPU lifecycle in 2025. The H100 training your competitor’s foundation model today will be running someone’s customer service chatbot in 2028, then rendering YouTube videos in 2029, then sitting in a university research cluster in 2030. Nvidia will have shipped three more generations by then. The chip will still be working.
The one-year release cadence is real. The 9% annual failure rate for frontier workloads is real. But so is the six-year depreciation schedule—not despite the facts, but because of them. In the cascading economy of AI compute, nothing is ever truly obsolete. It’s just waiting for the right workload to fall its way.
The Accounting Skeptic’s Objection

Michael Burry, the investor famous for predicting the 2008 financial crisis, recently took short positions on Nvidia and Palantir, arguing that extended depreciation schedules represent “one of the more common frauds of the modern era.” His claim: hyperscalers are artificially boosting earnings by extending useful life assumptions beyond what 2-3 year product cycles justify, understating depreciation by an estimated $176 billion through 2026-2028. But this critique conflates two distinct timelines—the technological obsolescence cycle for frontier training (where Burry is correct that chips age out in 18-36 months) with the economic utility cycle enabled by the waterfall model. The accounting isn’t fraudulent if the chips genuinely generate revenue for six years, just not at the cutting edge. Oracle, Meta, and Microsoft must defend these depreciation schedules to auditors with utilization data and failure analysis—and they keep passing.
That said, Burry may still be right that AI infrastructure stocks are overvalued, just for different reasons: future demand uncertainty, energy constraints, or the possibility that model improvements plateau before the cascading tiers can absorb all this capacity. The depreciation debate, however, hinges on whether you believe inference workloads will grow fast enough to keep those old chips humming. So far, the hyperscalers are betting they will.
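To see the mechanics Burry is objecting to, here is the arithmetic in its simplest form, using a hypothetical round capex number rather than any company’s reported spend:

```python
# How the useful-life assumption moves annual depreciation expense, and
# therefore reported operating income. The capex base is a hypothetical
# round number, not any company's actual figure.
capex = 100e9  # assumed $100B of GPU and server capex

for life_years in (3, 6):
    print(f"{life_years}-year life: ${capex / life_years / 1e9:.1f}B depreciation per year")

# 3-year life: $33.3B per year; 6-year life: $16.7B per year.
# The difference is expense deferred, not avoided. Whether deferring it is
# "understatement" or a faithful estimate depends entirely on whether the
# chips really earn revenue in years four through six.
```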
How Accounting Standards Support the 3-6 Year Flexibility
Both US GAAP and IFRS provide substantial judgment-based flexibility that makes the 3-6 year depreciation range defensible for GPUs:
US GAAP Flexibility: GAAP requires companies to depreciate assets over their “expected useful life” but explicitly makes this the company’s responsibility to determine, considering factors like technological obsolescence and economic factors rather than prescribing fixed schedules. Computer equipment typically falls in the 3-5 year range under industry standards, but GAAP provides no universal standard. Critically, useful life is an estimate of how long an asset will be economically productive, not necessarily how long it will physically last, and must be based on expected usage. Annual reviews of useful life estimates can evaluate whether adjustments are needed for changes in asset usage and technological advances.
IFRS Flexibility: IAS 16 allows significant discretion in determining useful life, which should be specific to the entity and can be considerably shorter than the life span determined by others, dictated by the entity’s activity profile and asset management policy. The useful life and residual value must be reviewed at the end of each reporting period, and if either changes significantly, that change should be accounted for prospectively as a change in accounting estimate. Useful life determination takes into account expected usage, physical wear and tear, technical or commercial obsolescence, and the depreciation pattern should reflect how the asset’s economic benefits are expected to be consumed.
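Both frameworks treat a revised useful life the same way mechanically: past periods are not restated; the remaining book value is simply spread over the new remaining life. A minimal sketch with purely illustrative numbers:

```python
def revised_annual_depreciation(cost, salvage, years_elapsed,
                                original_life, revised_life):
    """Prospective change in estimate (IAS 16 / ASC 250 treatment):
    spread the remaining book value over the revised remaining life."""
    annual_old = (cost - salvage) / original_life
    book_value = cost - annual_old * years_elapsed
    return (book_value - salvage) / (revised_life - years_elapsed)

# Illustrative only: $10M of servers, no salvage value, originally on a
# 4-year life, extended to 6 years after 2 years in service.
print(revised_annual_depreciation(10e6, 0, 2, 4, 6))  # 1.25M/yr, down from 2.5M
```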
Why This Supports the Waterfall Case
The standards intentionally avoid prescriptive tables precisely because asset utility is entity-specific and usage-dependent. A GPU used exclusively for frontier training at 60-70% utilization (consuming it rapidly) legitimately warrants 3-year depreciation. The same GPU cascading through inference, then SLM serving, then cloud rendering over 6 years also legitimately warrants 6-year depreciation—because its economic benefit period genuinely extends that long for that specific hyperscaler’s business model.
The cascading use model isn’t accounting manipulation; it’s a documented operational reality that both frameworks explicitly accommodate through their emphasis on “expected usage” and “entity-specific consumption patterns.” Companies must convince auditors using engineering data and internal analysis (I make no assumptions about their skills or ability to understand technical matters, of course), which is exactly what Microsoft, Meta, and Google do when defending their extended schedules.
Burry’s “fraud” allegation misunderstands this: the standards intentionally allow the same asset class to have different depreciation periods across companies based on proven utilization patterns. It’s not aggressive accounting—it’s the waterfall model operating exactly as GAAP and IFRS contemplate.
Accounting is “fudgy” that way, for obvious reasons.
The Utilization Paradox (Hypothetical Example)
The hypothetical utilization trajectory (70% → 50% → 40% → 100%) reflects two things:
Early tiers can’t sustain 100% utilization due to thermal/reliability constraints
Late tiers CAN run at 100% because the workloads are so light that full utilization doesn’t stress the chip
It’s like the difference between a race car (which can’t run at redline continuously) and a delivery van (which can run all day because it’s never pushed hard).
The power decline is assumed to be roughly linear, tracking generational improvements, while the failure rate drops exponentially once chips escape the thermal hell of frontier training.
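A toy model of that trajectory. Every tier-by-tier figure below (duration, utilization, power draw, failure rate) is invented to show the shape of the curves described above, not a measurement from any fleet:

```python
# Toy model of one GPU cohort cascading down the tiers over six years.
# All figures are invented to illustrate the shape of the curves:
# utilization dips after the frontier, then climbs back to saturation
# as workloads get lighter; power and failure rates fall with each demotion.
tiers = [
    # tier                years    util  watts  annual failure rate
    ("frontier training", (0, 2),  0.70, 700,   0.090),
    ("inference",         (2, 4),  0.50, 550,   0.030),
    ("light serving",     (4, 5),  0.40, 450,   0.010),
    ("long tail / batch", (5, 6),  1.00, 350,   0.003),
]

surviving = 1.0
for name, (start, end), util, watts, fail in tiers:
    surviving *= (1 - fail) ** (end - start)
    print(f"years {start}-{end}: {name:<18} util {util:.0%}, "
          f"~{watts} W, ~{fail:.1%}/yr failures")

print(f"Share of the cohort still alive at year six: {surviving:.0%}")
```

On these invented numbers, roughly three-quarters of a cohort is still alive, and fully utilized, when the six-year schedule runs out.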
References
NVIDIA shifts to annual release cadence - CNBC, March 2025
Nvidia Blackwell Platform announcement - NVIDIA Newsroom
NVIDIA GPU Upgrade Planning: Blackwell & Rubin cadence analysis - CUDO Compute, July 2025
The Annual Cadence Gamble: Can NVIDIA Keep Launching a New Platform Each Spring? - Medium, July 2025
Nvidia May Move to Yearly GPU Architecture Releases - Tom’s Hardware, October 2023
Meta Llama 3 training: 30.1% of disruptions from GPU failures - Tom’s Hardware, July 2024
Meta report details hundreds of GPU interruptions to Llama 3 training - DCD, July 2024
Meta’s huge 16,384 NVIDIA H100 AI GPU cluster details - TweakTown, July 2024
Microsoft extends life of cloud servers to six years - The Register, August 2022
Microsoft anticipates $3.3bn savings by extending server life - Computer Weekly
Microsoft extends server life, cuts cloud costs - CIO Dive, July 2022
Google says TPU demand outstripping supply, 8-year-old hardware at 100% utilization - DCD, November 2024
Big Tech’s Deteriorating Earnings Quality: GPU depreciation analysis - MBI Deep Dives, January 2025
Amazon revises server lifespan amid AI shift - Deep Quarry, February 2025
H100 GPU Price Guide 2025 - Jarvislabs
Nvidia H100 GPU Resale Price and Market Analysis - Oplexa, October 2024
$2 H100s: How the GPU Bubble Burst - Latent Space, October 2024
Understanding GPU Lifecycles: Frankly It’s Complicated, Nov 2025
The Question Everyone is Asking: How Long Before A GPU Depreciates, Nov 2025