I like that this relies on generating SQL rather than just being a black-box cha...

Xyra · 2026-01-01T00:14:15 1767226455

Exactly, people sometimes want precision and control. Also, it's very hard to beat SQL query planners when you have lots of materialized views and indexes. For most use cases, this is a lot more powerful for exploring these documents than having them all as JSON on your local machine and writing whatever Python you wanted.

Yeah, I've put a lot of care into rate-limiting and security. We do AST parsing and block certain joins, and Hacker News has not bricked or overloaded my machine yet--there's actually a lot more bandwidth for people to run expensive queries.

As for getting good semantic queries for different domains: besides using our embed endpoint to embed arbitrary text as a search vector, Claude can also use compositions of centroids (averages) of vectors in our database as search vectors. For example, it can effortlessly average every lesswrong chunk embedding over text mentioning "optimization" and search with that. You can actually ask Claude to run an experiment averaging the "optimization" vectors from different sources, and see what kind of different results you get when using them on different sources. Then the fun challenge would be figuring out legible vectors that bridge the gap between these different platforms' vectors. Maybe there's half the cosine distance when you average the lesswrong "optimization" vector with embed("convex/nonconvex optimization, SGD, loss landscapes, constrained optimization.")
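
To make the centroid idea concrete, here is a minimal sketch in plain Python. It is not the service's actual code: the 3-d vectors are toy stand-ins for real chunk embeddings, `anchor` is a hypothetical stand-in for what an embed call might return, and all function names are made up for illustration.

```python
import math

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def normalize(v):
    """Scale a vector to unit length so averaging weights each input equally."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_distance(a, b):
    """1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Toy stand-ins for chunk embeddings that mention "optimization" (assumed data).
lesswrong_chunks = [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]]
hn_chunks = [[0.0, 1.0, 0.0], [0.2, 0.8, 0.0]]

lw = centroid(lesswrong_chunks)
hn = centroid(hn_chunks)

# Hypothetical embedding of a legible "bridge" description.
anchor = [0.0, 1.0, 0.0]

# Average the source centroid with the bridge vector, then compare distances.
bridged = centroid([normalize(lw), normalize(anchor)])
before = cosine_distance(lw, hn)
after = cosine_distance(bridged, hn)
```

With real embeddings, comparing `before` and `after` is one way to check whether a candidate bridge description actually closes the gap between two sources.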

kiney · 2026-01-01T13:32:46 1767274366

If performance becomes a problem, statically hosting SQLite DBs with client-side queries and HTTP range requests is an interesting approach:

https://github.com/phiresky/sql.js-httpvfs

Xyra · 2026-01-01T19:30:43 1767295843

Thanks, that's very interesting.

plagiarist · 2026-01-01T15:44:53 1767282293

That's a neat thought. What's the granularity of the text getting embedded? I assume that makes a large difference in what the average vector ends up representing?

Xyra · 2026-01-01T19:25:06 1767295506

~300 token chunks right now. Have other exciting embedding strategies in the works.

bredren · 2025-12-31T21:22:18 1767216138

This is the route I went for making Claude Code and Codex conversation histories local and queryable by the CLIs themselves.

Create the DB and provide the tools and skill.

This blog entry explains how: https://contextify.sh/blog/total-recall-rag-search-claude-co...

It is a macOS client at present, but I have a Linux-ready engine I could use early feedback on if anyone is interested in giving it a go.

keeeba · 2025-12-31T10:55:43 1767178543

I don’t have the experiments to prove this, but from my experience it’s highly variable between embedding models.

Larger, more capable embedding models are better able to separate the different uses of a given word in the embedding space; smaller models are not.

Xyra · 2025-12-31T19:21:57 1767208917

I'm using Voyage-3.5-lite at halfvec(2048), which, from my limited research, seems to be one of the best embedding models. Chunking is semi-sophisticated (breaking on paragraphs and sentences) at ~300 tokens.
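
For anyone wondering what paragraph/sentence-aware chunking can look like, here is a rough sketch. It is an assumption about the general technique, not the actual pipeline: `count_tokens` is a crude whitespace stand-in for a real tokenizer, and the function names are hypothetical.

```python
import re

def count_tokens(text):
    # Crude stand-in for a real tokenizer: counts whitespace-separated words.
    return len(text.split())

def split_units(text, max_tokens):
    """Yield paragraphs, falling back to sentences for oversized paragraphs."""
    for para in re.split(r"\n\s*\n", text):
        para = para.strip()
        if not para:
            continue
        if count_tokens(para) <= max_tokens:
            yield para
        else:
            # Split after sentence-ending punctuation followed by whitespace.
            for sentence in re.split(r"(?<=[.!?])\s+", para):
                if sentence:
                    yield sentence

def chunk_text(text, max_tokens=300):
    """Greedily pack units into chunks of at most max_tokens.

    A single sentence longer than max_tokens is kept whole as its own chunk.
    """
    chunks, current, current_tokens = [], [], 0
    for unit in split_units(text, max_tokens):
        n = count_tokens(unit)
        if current and current_tokens + n > max_tokens:
            chunks.append(" ".join(current))
            current, current_tokens = [], 0
        current.append(unit)
        current_tokens += n
    if current:
        chunks.append(" ".join(current))
    return chunks

sample = "One two three. Four five six.\n\nSeven eight nine ten."
small_chunks = chunk_text(sample, max_tokens=4)
```

The greedy packing never splits inside a sentence, which is why chunk sizes come out as "about" the limit rather than exactly at it.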

When Claude is using our embed endpoint to embed arbitrary text as a search vector, it should work pretty well across domains. One can also use compositions of centroids (averages) of vectors in our database as search vectors.

A4ET8a8uTh0_v2 · 2025-12-31T11:21:50 1767180110

I've been thinking about this a fair bit lately. We have all sorts of benchmarks that describe a lot of factors in detail, but they are all very abstract, and they do not seem to map clearly onto well-observed behaviors. I think we need a different way to list those.

freakynit · 2026-01-02T18:50:07 1767379807

This is the same route I followed for https://zenquery.app .... It uses an LLM to generate SQL rather than working directly on data files. It saves a ton of cost as well, since you don't need to send the entire file(s) to the LLM, just the schema.
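
A minimal sketch of the "send only the schema" idea, using Python's built-in sqlite3. This is not how zenquery works internally (the thread doesn't say); the `orders` table, the prompt wording, and the function names are all made up for illustration, and the actual LLM call is left out.

```python
import sqlite3

def schema_text(conn):
    """Return the CREATE statements for all tables and views, one per line."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE type IN ('table', 'view') AND sql IS NOT NULL "
        "ORDER BY name"
    ).fetchall()
    return "\n".join(row[0] + ";" for row in rows)

def build_prompt(conn, question):
    """Build an LLM prompt that contains the schema but none of the row data."""
    return (
        "Given this SQLite schema:\n"
        f"{schema_text(conn)}\n"
        f"Write one SELECT statement that answers: {question}"
    )

# Hypothetical example database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [("alice", 10.0), ("bob", 25.5)],
)

prompt = build_prompt(conn, "What is the total revenue?")
```

The prompt size depends only on the schema, so it stays the same whether the table has two rows or two billion.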

llmslave2 · 2025-12-31T21:58:16 1767218296

> I like that this relies on generating SQL rather than just being a black-box chat bot.

When people say AI is a bubble but will still be transformational, I think of stuff like this. The number of use cases for natural language interpretation and translation is enormous, even without all the BS vibe coding nonsense. I reckon once the bubble pops, most investment will go into tools that operate something like this.