Meta announced it will develop and deploy four new generations of its Meta Training and Inference Accelerator chips within the next two years, the most aggressive custom silicon roadmap any hyperscaler has disclosed outside of Google.
The company is simultaneously expanding its co-development partnership with Broadcom and sourcing silicon from Arm, AWS, AMD and Nvidia. The strategy is explicit: match different chips to different workloads rather than running everything on Nvidia GPUs.
That is a polite way of saying Meta is building the infrastructure to stop writing Nvidia a blank cheque.
What MTIA does
Despite the "Training" in its name, MTIA is designed chiefly to handle inference, the process of running a trained AI model to generate responses, rather than training, the process of building the model in the first place. Training requires the brute-force parallel processing that Nvidia GPUs provide. Inference, which accounts for the vast majority of AI compute at scale, can run more efficiently on custom silicon designed for the specific workload.
Meta serves billions of inferences a day across ranking, recommendations and generative AI. Every efficiency gain on inference translates directly into lower operating costs at a scale no other company matches.
The company's most advanced model, Muse Spark, is trained across thousands of GPUs but serves its inference workload on MTIA chips. That split, Nvidia for training and custom silicon for inference, is the template Meta is building towards.
Why four generations in two years
The pace is unusual. Most chip development cycles run 18 to 24 months per generation. Four generations in two years works out to roughly one every six months, three to four times faster than that traditional semiconductor cadence. Meta is therefore likely shipping incremental improvements rather than wholesale redesigns with each version.
The speed matters because inference workloads are changing rapidly. As models become multimodal, handling text, images, video and audio simultaneously, the silicon optimised for last year's workloads becomes less efficient for this year's. Faster iteration lets Meta keep its custom chips aligned with its model development roadmap.
The Nvidia relationship changes shape
Meta is not abandoning Nvidia. GPUs remain essential for training and will continue to run alongside MTIA and CPUs in Meta's data centres. But the relationship is shifting from dependency to diversification.
When a single company controls the supply of your most critical component, every purchase is a negotiation conducted from a position of weakness. Meta, which is spending $200bn on AI infrastructure this year, cannot afford that dynamic.
Broadcom's expanded role is the clearest signal. Google already co-develops the custom chips in its TPU programme with Broadcom. Meta following the same path suggests a structural shift in how hyperscalers procure AI compute: proprietary silicon for inference, Nvidia for training, and an architecture designed so that no single supplier holds the leverage.
Meta framed the announcement around performance and efficiency. The subtext is about power. Not electrical power, though that matters too. Commercial power. The kind you lose when your entire AI strategy runs on someone else's hardware.