
Sure!

I use SingleFile to save a copy of every article / post / SO & forum discussion I find interesting or useful. I sort them into two buckets: work, and not-work.

I’ve been doing this for 10+ years (before SingleFile I used things like .pdf, plain .html, .webarchive files - although these all have drawbacks).

In the pre-LLM era, I would then interface with these almost exclusively through a search front-end. I use HoudahSpot on Mac and easySearch on iOS. That lets me see everything interesting I’ve read on a particular subject just by typing it in (with the usual caveats that apply to basic keyword search, although in practice that alone has proven very effective). Because it’s just a folder of essentially zipped .html files, there’s no lock-in.
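To illustrate the no-lock-in point, here is a minimal sketch of keyword search over such a folder using only the Python standard library. It is not what HoudahSpot or easySearch do internally. It assumes the saved pages are plain-text .html files (not the compressed variant), and the names `html_to_text` and `search_archive` are made up for this example.

```python
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def html_to_text(html):
    """Strip tags and collapse whitespace (hypothetical helper)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def search_archive(root, query):
    """Return .html files under root whose text contains every query term.

    Matching is case-insensitive substring matching, so "rag" also
    matches "storage". Assumption: files are readable as UTF-8 text.
    """
    terms = query.lower().split()
    hits = []
    for path in sorted(Path(root).rglob("*.html")):
        raw = path.read_text(encoding="utf-8", errors="ignore")
        text = html_to_text(raw).lower()
        if terms and all(term in text for term in terms):
            hits.append(path)
    return hits
```

Calling `search_archive("archive/work", "rag pipeline")` (the folder name is a placeholder) would list every saved page mentioning both terms. A real search tool adds indexing and ranking on top, but the files stay readable by anything that can open HTML.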

Now that we’ve got LLMs, I plug those 10+ years of files straight into my RAG pipeline using llama-index. It’s quite nice :)



Sorry for the ignorance, but if the forum posts require login to access then you won't be able to use SingleFile, right?

Also, how is the quality of the output compared to a .pdf? I'm used to printing PDFs from Chrome for articles that I want to save, but the layout can become awkward sometimes, and navigation bars can appear several times and hide portions of the text.

I like this Chrome feature, but it's not consistently reliable.


If you use the browser extension, then pages requiring login are no problem because you are already logged in.

The output compared to PDF is like night and day. It is high fidelity versus low fidelity. At this point, I only use PDF if I need it for some specific reason.


SingleFile operates in the context of your browser, so it scrapes pages with your cookie jar. That means you will be authenticated, and specifically it'll scrape pages as you see them.

In most cases SingleFile output looks identical to the real thing, though I generally only use it on simpler sites such as recipes and technical blogs.



