Hacker News

What is Gemini 3 under the hood? Is it still just a basic LLM based on transformers? Or are there all kinds of other ML technologies bolted on now? I feel like I've lost the plot.




I am very ignorant in this field, but I'm pretty sure that under the hood they are all still fundamentally built on the transformer architecture, or at least on refinements of the original transformer architecture.

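It's a mixture-of-experts model. Basically, each layer holds N smaller expert sub-networks, and at inference time a router activates only a few of them for each token instead of running all N. The experts aren't cleanly tuned to one subject area each; whatever specialization they have emerges during training.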
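For anyone who wants to see the routing idea concretely, here is a toy sketch in plain Python. It assumes the common top-k scheme, where a router scores every expert for a token, keeps the k best, and mixes their outputs by softmax weight. Everything in it (`EXPERTS`, `ROUTER_WEIGHTS`, `router`, `moe_layer`, the scaling experts) is made up for illustration; Gemini's actual internals aren't public at this level of detail.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_route(router_logits, k):
    # Pick the k highest-scoring experts and renormalize their scores.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))

def moe_layer(token, experts, router, k=2):
    # Only the k selected experts run; the rest are skipped entirely.
    routes = top_k_route(router(token), k)
    out = [0.0] * len(token)
    for idx, weight in routes:
        expert_out = experts[idx](token)
        out = [o + weight * y for o, y in zip(out, expert_out)]
    return out, routes

# Hypothetical toy experts: each one just scales its input.
def make_scaling_expert(scale):
    return lambda token: [scale * x for x in token]

EXPERTS = [make_scaling_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]

# Hypothetical fixed router: one weight row per expert (learned in a real model).
ROUTER_WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

def router(token):
    return [sum(w * x for w, x in zip(row, token)) for row in ROUTER_WEIGHTS]

if __name__ == "__main__":
    for token in ([2.0, 1.0], [-1.0, 3.0]):
        out, routes = moe_layer(token, EXPERTS, router, k=2)
        print(token, "-> experts", [i for i, _ in routes], "output", out)
```

The two example tokens get sent to different pairs of experts, which is the point: the choice is made per token (and, in a real model, per layer), not once for the whole prompt.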

The industry is still seeing how far they can take transformers. We've yet to reach a dollar value where it stops being worth pumping money into them.


