What is Gemini 3 under the hood? Is it still just a basic LLM based on transformers? Or are there all kinds of other ML technologies bolted on now? I feel like I've lost the plot.
I'm no expert in this field, but I'm pretty sure they're all still fundamentally built on the transformer architecture, or at least on refinements of the original transformer.
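It's a mixture-of-experts model. Roughly: it's still a transformer, but the feed-forward part of each layer is split into N smaller "expert" sub-networks, and a small router picks only a few of them for each token at inference time, so most of the parameters sit idle on any given token. The experts aren't separate standalone models, and they don't neatly map to one subject area each; whatever specialization they have is learned during training.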
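If it helps to see the routing idea in code, here's a toy sketch of a mixture-of-experts layer in plain Python. To be clear, this is not how Gemini is implemented (those details aren't public at this level); `ToyMoELayer`, the sizes, and the random weights are all made up for illustration. It only shows the general mechanism: a router scores the experts for each token, the top `top_k` get run, and their outputs are blended.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def make_matrix(rows, cols, rng):
    """Random weights; a real model would learn these in training."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


class ToyMoELayer:
    """Hypothetical toy layer: each expert is a single linear map."""

    def __init__(self, dim, num_experts, top_k, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        # The router produces one score per expert for a given token.
        self.router = make_matrix(num_experts, dim, rng)
        self.experts = [make_matrix(dim, dim, rng) for _ in range(num_experts)]

    def forward(self, token):
        scores = matvec(self.router, token)
        # Keep only the top_k highest-scoring experts for this token.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        chosen = ranked[: self.top_k]
        # Normalize the chosen experts' scores into mixing weights.
        weights = softmax([scores[i] for i in chosen])
        out = [0.0] * len(token)
        for w, i in zip(weights, chosen):
            expert_out = matvec(self.experts[i], token)
            out = [o + w * e for o, e in zip(out, expert_out)]
        # Experts that were not chosen are never evaluated for this token.
        return out, chosen


layer = ToyMoELayer(dim=4, num_experts=8, top_k=2)
tokens = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.5, -0.5, 0.25, 0.0]]
for token in tokens:
    _, chosen = layer.forward(token)
    print("token", token, "-> experts", chosen)
```

Different tokens can land on different experts, and in a real model this choice happens independently in every MoE layer, which is why it's more accurate to think "a few experts per token per layer" than "one specialist model per question".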