Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Do you have a write up of the tech stack and setup? Or willing to give the gist here?

I'd like to make a private Qwen or similar for my kids to prompt with a button and voice control. It doesn't need vision... Although eventually that'd be very cool.

Siri just sucks.

We might not be there yet...



I also ran across an interesting robot toy demo today that had voice built in. it was whimsical and seemed like it was aimed towards primary education and kids. Someone here might know the name.


You can use Ollama or LM Studio, both in API mode, to return the responses. I believe they offer audio support, but I'm not entirely sure.

However, if you're looking for instruction following (like an agent), I've tried to implement my own agent and have lost faith. Even GPT-4.1 will regularly gaslight me that no, it definitely ran the tool call to add the event to my calendar, when it just didn't. I can't get any more adherence out of it.


We're definitely there, there's just no "ready-made" apps yet. But the technology is possible, go to e.g. vapi.ai to test it.



All this lead to, is paying or using APIs and more paying. That's not what I was asking for.


yeah i made a post on here, but the algo sent it to the gulag abyss.

https://news.ycombinator.com/item?id=43926673


That's a good product site but it doesn't help me in anyway...




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: