This is actually along the lines of what I'm working on in my free time at the moment. I'm trying to extend a local model's memory so that smaller self-hosted models can become a better option than paying someone else.
Once this is working better, it should let me extend the abilities of local models without running into the massive context limitations I personally kept hitting when self-hosting.