Discussion about this post

JP

The 44% TCO gap is huge. It also explains why inference pricing is so strange right now. If Google can strip Nvidia's margin from the BOM, those savings eventually filter into cheaper Gemini pricing. Meanwhile, providers stuck on Nvidia hardware either eat the margin or pass the cost on. The Chinese providers chose a third path: subsidise aggressively, lock users in, raise prices later. Same playbook as the 2014 cloud wars, but with state backing this time. Covered the full pattern and where it ends: https://sulat.com/p/the-real-cost-of-cheap-ai-inference

