
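It's a mixture-of-experts model. Basically N smaller expert sub-networks put together with a router in front of them, and at inference only a few of them (often 1 or 2) are active for each token, so you pay the compute cost of a much smaller model. Each expert ends up better at some kinds of input, though that specialization is learned during training and usually doesn't map to a clean subject area.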
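To make the routing idea concrete, here is a minimal sketch in plain Python. It is not taken from any particular model: `make_expert`, `moe_forward`, the toy "experts" that just scale their input, and the choice of `top_k=2` are all assumptions for illustration. Real implementations use learned feed-forward blocks and do this per token inside each MoE layer.

```python
import math
import random

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def make_expert(scale):
    # Hypothetical stand-in for an expert feed-forward block:
    # it simply multiplies its input by a constant.
    return lambda x: [scale * v for v in x]

def moe_forward(x, router_weights, experts, top_k=2):
    # Router: one logit per expert (dot product of the token with that expert's row).
    logits = [sum(w * v for w, v in zip(row, x)) for row in router_weights]
    # Keep only the top_k highest-scoring experts; the rest are never run.
    chosen = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Renormalize the gate weights over the chosen experts only.
    gates = softmax([logits[i] for i in chosen])
    # Output is the gate-weighted sum of the chosen experts' outputs.
    out = [0.0] * len(x)
    for g, i in zip(gates, chosen):
        y = experts[i](x)
        out = [o + g * v for o, v in zip(out, y)]
    return out, chosen

# Usage: 8 toy experts, 2 active for this token (all values are made up).
random.seed(0)
dim, n_experts = 4, 8
router_weights = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_experts)]
experts = [make_expert(i + 1) for i in range(n_experts)]
token = [0.5, -1.0, 0.25, 2.0]
output, active = moe_forward(token, router_weights, experts, top_k=2)
print("active experts:", active)
```

The point of the sketch is that all N experts have to be held in memory, but only `top_k` of them do any work for a given token.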



