
In theory, auto-regressive models have no inherent limit on context: each next token is generated conditioned on all previous tokens.
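
To make that concrete, here's a minimal sketch of a greedy decoding loop. Nothing in the loop itself caps the sequence length; `model` is a hypothetical stand-in for any next-token predictor that returns logits over the vocabulary, not a real library API.

```python
def generate(model, prompt_tokens, max_new_tokens, eos_id):
    # Sketch of autoregressive decoding: every step re-conditions on the
    # FULL history of tokens so far, so the math imposes no context limit.
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        logits = model(tokens)  # conditions on all previous tokens
        next_token = max(range(len(logits)), key=lambda i: logits[i])  # greedy pick
        tokens.append(next_token)
        if next_token == eos_id:
            break
    return tokens
```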

In practice, a fixed context window is chosen at training time so that, during inference, the serving system knows how much GPU memory to allocate for a prompt and can reject any prompt that would exceed the memory limit.
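
The memory that scales with context is dominated by the KV cache. Here's a back-of-the-envelope estimate; the dimensions below (layers, heads, head size, fp16) are assumptions roughly in the range of a 7B-class dense model with standard multi-head attention, not any specific released model.

```python
def kv_cache_bytes(context_len, n_layers=32, n_kv_heads=32,
                   head_dim=128, bytes_per_elem=2, batch=1):
    # 2x accounts for caching both the key and the value tensor per layer.
    # The total grows linearly with context length, which is why servers
    # must budget for (and cap) the maximum context up front.
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * batch * context_len

for ctx in (4_096, 32_768, 131_072):
    print(f"{ctx:>7} tokens -> {kv_cache_bytes(ctx) / 2**30:.1f} GiB")
```

Under these assumptions the cache costs about 0.5 MiB per token, so 4K tokens needs ~2 GiB but 128K needs ~64 GiB, which quickly dominates a single GPU's memory.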

Of course, quality also degrades as context gets longer, but I suspect the memory limit is the primary reason context windows are capped.
