Author of SingleFile here. One of the major differences is that Monolith doesn't use a web browser to take page captures; as a result, it doesn't support JavaScript, for example. SingleFile, on the other hand, requires a Chromium-based browser to be installed. It should also produce smaller pages and can generate ZIP or self-extracting ZIP files, though it will take longer to capture a page. Note that since version 2, executable files of the CLI tool are available for download [1].
SingleFile is amazing - I use it tens of times every day across desktop and mobile. I can’t recall a single instance of it breaking. Thank you sincerely for your excellent work.
Thanks a lot! Believe me, there have been a lot of bugs (900+ issues closed to date) because saving a web page is actually hard. You were lucky not to suffer ;)
I bet! The proof of that must surely be in how poor a job formats like .webarchive do of it.
SingleFile just makes this one really complex, really important thing trivially easy, and in a portable format. For anyone curating a knowledge base it’s an absolute godsend.
I didn’t see any donation instructions on your GitHub - I for one would certainly love to chip in if you could point me in the right direction?
Anybox (on Mac and iOS) also supports SingleFile, presenting itself as a WebDAV server that archives can be saved to. It’s flawless and hugely convenient in my experience.
Just stumbled across Monolith and SingleFile recently and it's fascinating to see how these tools approach the challenge of web archiving in different ways. SingleFile seems to be a powerhouse, especially for those who rely heavily on JavaScript-laden pages. The ability to produce smaller pages and even generate ZIP files is pretty handy for content archiving and sharing.
That said, Monolith's approach of not requiring a web browser could be a game changer for simpler projects or where installing a Chromium-based browser isn't viable. It strikes me as a more straightforward, lightweight solution, albeit with the clear trade-off of not supporting JavaScript.
Has anyone run into situations where one tool clearly outperformed the other in real-world usage? I'm particularly curious about the impact on performance and convenience when choosing between these two, especially for mobile use. Also, kudos to the authors and contributors of these tools. The tech community benefits greatly from such innovations that help preserve and share knowledge.
Is this an LLM-generated comment? The structure of this response seems too close to the “while X, it’s also important to Y” construction that LLMs like to use.
Anyway, to answer your question: lots of pages need JavaScript to work correctly, so SingleFile is the better option.
I use SingleFile to save a copy of every article / post / SO & forum discussion I find interesting or useful. I sort them into two buckets: work, and not-work.
I’ve been doing this for 10+ years (before SingleFile I used things like .pdf, plain .html, and .webarchive files, although these all have drawbacks).
In the pre-LLM era, I would then interface with these almost exclusively through a search front-end. I use HoudahSpot on Mac and easySearch on iOS. That lets me see everything interesting I’ve read on a particular subject just by typing it in (with the usual caveats that apply to basic keyword search, although in practice that alone has proven very effective). Because it’s just a folder of essentially zipped .html files, there’s no lock-in.
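Because the archive is just a folder of HTML files, the search layer doesn't need anything exotic; even a stdlib-only keyword scan works as a crude fallback to the dedicated search tools above. A minimal sketch in Python (the folder layout and query are hypothetical, and the tag stripping is deliberately naive):

```python
import re
from pathlib import Path

def keyword_search(folder: str, query: str) -> list[Path]:
    """Return archived HTML files whose visible text mentions the query (case-insensitive)."""
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    hits = []
    for path in Path(folder).glob("**/*.html"):
        text = path.read_text(errors="ignore")
        # Crudely strip tags so matches inside markup/attributes don't count.
        text = re.sub(r"<[^>]+>", " ", text)
        if pattern.search(text):
            hits.append(path)
    return sorted(hits)
```

Real desktop search tools add ranking, snippets, and indexing, but the point stands: plain files on disk mean any tool, however simple, can read them.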
Now that we’ve got LLMs, I plug those 10+ years of files straight into my RAG pipeline using llama-index. It’s quite nice :)
Sorry for the ignorance, but if the forum posts require a login to access, then you won't be able to use SingleFile, right?
Also, how does the quality of the output compare to a .pdf? I'm used to printing PDFs from Chrome for articles that I want to save, but the layout can become awkward sometimes, and navigation bars can appear several times and hide portions of the text.
I like this feature of Chrome, but it's not consistently reliable.
If you use the browser extension, then pages requiring login are no problem because you are already logged in.
The output compared to PDF is like night and day: high fidelity versus low fidelity. At this point, I only use PDF if for some reason I specifically need that format.
SingleFile operates in the context of your browser, so it saves pages with your cookie jar, meaning you will be authenticated; specifically, it'll capture pages as you see them.
In most cases SingleFile's output looks identical to the real thing, though I generally only use it on simpler sites such as recipes and technical blogs.
I was about to post a similar question: What does this mean for those using the Firefox versions of the extensions (SingleFile as well as the version that zips the result)?
For me it bridged the gap that warped into existence between the time when "take screenshot" existed on Firefox and when web pages figured out that some people did this to archive pages and started putting crap in to either mess with the layout or otherwise "break" the resulting file.
It snapshots a web page to a single HTML file. At least that's what I use it for. I use it both to archive stuff and to have proof that some site published something.
The next order up would be ArchiveBox or whatever archive.org uses (the name escapes me) - a very heavy caching proxy that can save entire websites into a single directory in a way that wget/curl and all the other crawlers cannot.
If you care that the exact layout and everything is perfect, right now I think SingleFile is aces.
It takes whatever is in the DOM of the page you are viewing and sticks it into a single HTML file that can be served later and will reproduce the source page with high fidelity.
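The core trick behind that single-file output is resource inlining: external assets referenced by the DOM get rewritten into data: URIs embedded directly in the markup, so the HTML file carries everything it needs. A stdlib-only sketch of the idea (this is not SingleFile's actual code; the page fragment and image bytes are stand-ins for fetched content):

```python
import base64

def to_data_uri(content: bytes, mime: str) -> str:
    """Encode raw resource bytes as a data: URI suitable for inlining into HTML."""
    return f"data:{mime};base64,{base64.b64encode(content).decode('ascii')}"

# Hypothetical page fragment and bytes standing in for a downloaded image.
html = '<img src="logo.png">'
png_bytes = b"fake-png-bytes"

# Rewrite the external reference so the asset lives inside the document itself.
inlined = html.replace('src="logo.png"',
                       f'src="{to_data_uri(png_bytes, "image/png")}"')
```

A real tool does this for every image, stylesheet, script, and font the page references, which is why the resulting file stays faithful to what you saw.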
I use it to export an HTML file that I can stick in my logseq archive for later. So much better than just printing to a PDF!
https://www.npmjs.com/package/single-file-cli