Putting Z-Library on IPFS

imhoguy · on Nov 19, 2022

This post really starts to show the right direction:

Z-Library - "the desktop app", with built-in Tor (for seeders' safety), IPFS (for p2p distribution), IPNS (to download updated indexes), with local search engine (no SPOF & convenience), optional at-rest file encryption, and some random pinning algorithm to let users donate 1-10GB of local disk space to host random chunks of the library.
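A rough sketch of the "donate 1-10GB for random chunks" idea, in Python. The function name `pick_chunks_to_pin`, the index of chunk IDs to sizes, and the shuffle-then-fill selection are all assumptions made for illustration; this is not an existing Z-Library or IPFS API, and a real client would still have to fetch and pin whatever gets selected.

```python
import random

def pick_chunks_to_pin(chunk_sizes, budget_bytes, seed=None):
    """Pick a random subset of chunks whose total size fits in budget_bytes.

    chunk_sizes: dict mapping a chunk identifier (hypothetical, e.g. a CID
    string taken from a published index) to its size in bytes.
    Returns (list of chosen identifiers, total bytes used).
    """
    rng = random.Random(seed)
    ids = list(chunk_sizes)
    rng.shuffle(ids)  # random order, so different users end up with different chunks
    chosen, used = [], 0
    for cid in ids:
        size = chunk_sizes[cid]
        if used + size <= budget_bytes:  # skip anything that would exceed the budget
            chosen.append(cid)
            used += size
    return chosen, used

# Hypothetical index: 1000 chunks of 50 MB each, and a 5 GB donation.
index = {f"chunk-{i:04d}": 50 * 10**6 for i in range(1000)}
chosen, used = pick_chunks_to_pin(index, budget_bytes=5 * 10**9, seed=42)
```

Purely random selection only gives even coverage if there are many participants; a real scheme would probably want to favour chunks that currently have few hosts.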

It needs to be dead easy to let anyone use and contribute.

RobotToaster · on Nov 19, 2022

Since it needs to be on all the time, a raspberry pi image maybe? Literally just configure your wifi password, burn the image, and let it rip.

bozhark · on Nov 19, 2022

If I could tack it on to the current network with a single pi+Ethernet connection it would be worthwhile

rahimnathwani · on Nov 20, 2022

Seems like a lot of I/O for a poor SD card.

sneak · on Nov 20, 2022

Mostly just O, if it's caching.

gizmo385 · on Nov 19, 2022

Requiring a raspberry pi and burning an image is not generally accessible to a non technical person imo

tfsh · on Nov 19, 2022

If you're not slightly technically inclined you wouldn't be interested in contributing to Z-Lib's decentralised hosting, so this isn't an issue.

dendriti · on Nov 20, 2022

If you're limiting support of Z-Libs to only those 'slightly technically inclined', you're leaving a lot of support on the table.

joshspankit · on Nov 20, 2022

That’s a fair point.

The main demographic is people who are probably already technically inclined... but there are lots of supporters of intellectual freedom who are not the main demographic of Z-Library.

DoItToMe81 · on Nov 20, 2022

Even my grandma with dementia knows how to burn CDs. The software to burn a disk image to an SD card is even easier to use.

delcaran · on Nov 21, 2022

With alpine and an external HD you just need to copy some file on a windows-formatted SD card to have a fully working system that doesn't even write to the SD.

wrs · on Nov 19, 2022

Considering the goals of IPFS, I’m surprised it has trouble at this scale. I mean, it’s supposed to be “interplanetary” but this data would fit on two hard drives. The post talks about performance problems with advertising “hundreds” of content IDs. Is it a design problem or just an early implementation problem?

ReaLNero · on Nov 20, 2022

Decentralization is expensive, e.g. downloading a Google Drive file is immediate while a torrent takes some time to initialize connections, or the PoW protocol for cryptocurrencies.

Perhaps we could narrow down the purpose of IPFS to storing a small amount of data in a database that is impossible to delete and difficult to censor?

GauntletWizard · on Nov 20, 2022

Design. IPFS is a dead end in design because its properties are designed with statistical perfection to its anonymity. This can only ever be achieved statistically, because there's always the attack of "Start removing nodes at random, and when the content disappears when you remove a node, that was your one".

This is a provably unsolvable problem, but it's okay: nobody actually has this problem but the worst social outcasts. In some places that's itself a problem: what "The worst social outcasts" is varies from place to place, but running IPFS is enough proof for a death sentence in those places.

dale_glass · on Nov 20, 2022

You might be confusing IPFS with something else, like Freenet.

As far as I know, there's no such statistical anonymity in it.

csande17 · on Nov 20, 2022

Yes, it's fairly easy for attackers to discover the IP address of the people currently hosting a particular piece of content on IPFS: https://docs.ipfs.tech/concepts/privacy-and-encryption/#what...

I think the parent comment is asserting that even if you run IPFS over Tor or something, or even if they try and make IPFS more anonymous in the future, it will always still be vulnerable to the "cut off your Internet and see which node in the network goes down" attack.

dale_glass · on Nov 19, 2022

I'm fairly sure this isn't really safe or really reliable.

To my knowledge, IPFS isn't really private, in that both the nodes hosting content can be easily known, and the users requesting content can be monitored. This is bad news for something law enforcement has already taken a serious interest in.

IPFS also requires "pinning", which means that unless other people decide to dedicate a few TB to this out of their own initiative, what we have currently is a single machine providing data through an obscure mechanism. If this machine is taken down, the content goes with it.

The amount of people that have 31 TB worth of spare storage, care about this particular issue, and are willing to get into legal trouble for it (or at least anger their ISP/host) is probably not terribly large. The work could be split up, but then there needs to be some sort of coordination to somehow divide up hosting the archive among a group of volunteers.
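One way the "divide up hosting among volunteers" coordination could work without a central assigner is rendezvous (highest-random-weight) hashing: everyone who knows the volunteer list computes the same assignment locally. This is only a sketch of that general technique, not something proposed in the thread; the volunteer names, shard names, and the `replicas=3` default are hypothetical.

```python
import hashlib

def score(volunteer: str, shard: str) -> int:
    """Deterministic pseudo-random weight for a (volunteer, shard) pair."""
    digest = hashlib.sha256(f"{volunteer}:{shard}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def assign(shard: str, volunteers, replicas: int = 3):
    """Return the `replicas` volunteers with the highest score for this shard."""
    ranked = sorted(volunteers, key=lambda v: score(v, shard), reverse=True)
    return ranked[:replicas]

# Hypothetical volunteer list and shard name.
volunteers = [f"volunteer-{i}" for i in range(10)]
hosts = assign("shard-0001", volunteers)
```

A useful property of this scheme is that when a volunteer who was not hosting a given shard drops out, that shard's assignment does not change, so little data has to move.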

armchairhacker · on Nov 19, 2022

> To my knowledge, IPFS isn't really private, in that both the nodes hosting content can be easily known, and the users requesting content can be monitored. This is bad news for something law enforcement has already taken a serious interest in.

You can access the data through a VPN.

If necessary, the hosts can also encrypt the filenames and data so that, until law enforcement gets the encryption key, they can't know who accesses what (public key and other necessary info would be communicated through Signal). Rotate the filenames so when one is discovered, past requests can't be tracked. Maybe there is a way to slightly break the protocol to further hide the requests.

> IPFS also requires "pinning", which means that unless other people decide to dedicate a few TB to this out of their own initiative, what we have currently is a single machine providing data through an obscure mechanism. If this machine is taken down, the content goes with it.

Do you have to explicitly choose what data to pin? If so then this is an issue. If not, and you just pin random chunks, then if we normalize people using IPFS and distributing legal data this will be solved. If we normalize it enough, there will be too many people hosting and using IPFS for law enforcement to reasonably take down. Or we could just have enough activists that are willing to risk being fined or arrested.

---

That being said though, I'm still not convinced on IPFS because it seems like it cannot handle much and is excessively inefficient (case in point: this article). The authors of IPFS should release a new protocol which addresses issues like the article's, hopefully before too much adoption.

dale_glass · on Nov 20, 2022

> You can access the data through a VPN.

Okay, and then law enforcement asks the VPN. Yeah, it improves matters some, but we're talking about a huge book archive here. People aren't going to maintain OpSec.

> If necessary, the hosts can also encrypt the filenames and data so that, until law enforcement gets the encryption key, they can't know who accesses what (public key and other necessary info would be communicated through Signal).

That's a plan suitable for some sort of terrorist organization maybe, but exactly how is that going to work for an archive of millions of books that are intended to be served to the general public? What's the key distribution mechanism? How do you distribute keys to everyone but the cops?

> Do you have to explicitly pin the data? If so then this is an issue. If not, and you just pin random chunks, then if we normalize people using IPFS this will be solved. If we normalize it enough, there will be too many people hosting IPFS for law enforcement to reasonably take down.

IPFS isn't Freenet. My understanding is that it's a content-addressable, multi-source system. Meaning the main difference from plain HTTP is that stuff is named by hash, and that if there's a dozen people serving a given file, then the system can spread the load among them, or tolerate some of them going offline. You ask for hash X, the system figures out where to get it.

Unless people make the intentional choice to mirror content, then it's not very different from serving stuff over HTTP, only with a worse user experience.

What you suggest sounds more like Freenet, but I doubt that it'd work great even there. Freenet does the "store random chunks" sort of thing, but this means that it's extremely inefficient, and easily loses data. Freenet was made for plausible deniability, so any storage is probabilistic, and data is replicated as it moves through the network and eventually lost if nodes go offline or it just falls out of storage due to the lack of interest. Storing 31TB would require a lot of nodes dedicating a lot of storage, and a lot of interest in accessing all of that data on a regular basis.

> If we normalize it enough, there will be too many people hosting IPFS for law enforcement to reasonably take down. Or if we just get enough activists that are willing to risk being fined or arrested.

That's not a great plan for something that already got people into legal trouble

kloudalpha · on Nov 22, 2022

>Okay, and then law enforcement asks the VPN. Yeah, it improves matters some, but we're talking about a huge book archive here. People aren't going to maintain OpSec

You're making it seem like this is such an impossible task, but libgen already uses ipfs as a mirror..

joshspankit · on Nov 20, 2022

By glossing over important points and focusing on ones you feel you can argue against you’re coming across as someone who simply wants to argue.

I’m only saying this because I suspect that otherwise if people fail to change your mind you’ll walk away thinking you were vindicated by their silence, when it’s just as likely that they’ve followed the old adage: “Never Wrestle with a Pig. You Both Get Dirty and the Pig Likes It”.

dmitriid · on Nov 20, 2022

> By glossing over important points and focusing on ones you feel you can argue against

That's exactly what the person suggesting using VPN and encrypting file names is doing.

davidgerard · on Nov 20, 2022

yeah. IPFS is basically equivalent to BitTorrent. Go search some torrents from ten years ago, see how many stayed seeding the whole time.

It is better than nothing in the circumstances, but be aware of the species of creature it is.

Gigachad · on Nov 19, 2022

I’ve been interested in IPFS for a while now but I’m still unable to come up with a good explanation of how it’s different to regular torrents. Does anyone have any examples of stuff you can do with IPFS that you can’t with torrents?

Seems like over the last 5 years they haven’t really done anything but add crypto buzzwords to the project site.

georgyo · on Nov 19, 2022

The biggest thing is that in IPFS each chunk has a CID, and that CID is globally unique and sharable from other CIDs.

You can think of each CID as a magnet link, and each CID can point to more CIDs.

Whereas in a torrent the peer tracking is done at the whole torrent level, instead of the chunks inside the torrent.

Torrent V2[0], which has poor adoption thus far, should also allow for a similar ability. To my knowledge this is not being taken advantage of yet.

[0]: https://blog.libtorrent.org/2020/09/bittorrent-v2/

mdp2021 · on Nov 19, 2022

> each chunk has a CID

An extremely interesting problem, fingerprinting generic binary - reducing kilobytes and megabytes into a handful of bytes, and avoiding collision.

Much simpler on large data.

rudolph9 · on Nov 19, 2022

The content is the address, no central issuer of the tracker file needed just the CID content ID. Also since the CID is a hash of the data it can be used to validate the data :wink:
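The "content is the address" point can be illustrated with a few lines of Python. This is a simplified stand-in: real IPFS CIDs wrap the digest in multihash/multibase encoding and large files are chunked first, whereas the hypothetical `content_id` below is just a hex SHA-256 of the raw bytes.

```python
import hashlib

def content_id(data: bytes) -> str:
    """Simplified stand-in for a CID: the hex SHA-256 digest of the content."""
    return hashlib.sha256(data).hexdigest()

def verify(data: bytes, expected_id: str) -> bool:
    """Check that bytes received from an untrusted peer match the requested ID."""
    return content_id(data) == expected_id

book = b"Call me Ishmael."
cid = content_id(book)

ok = verify(book, cid)                      # unmodified content matches its ID
tampered = verify(b"Call me Ishmael!", cid)  # any change yields a different ID
```

Because the ID is derived from the bytes, it does not matter which peer served them: a downloader can check the result locally.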

bakugo · on Nov 19, 2022

That's quite literally how bittorrent with DHT works. To download a public torrent you only need its content hash, the metadata and peers are then fetched via DHT assuming at least one reachable DHT node is aware of the torrent.

nodja · on Nov 19, 2022

That is exactly how bittorrent works as well tho. Bittorrent hasn't needed any trackers for public torrents since 2005 and can work solely off magnet links using the DHT network. Just like how IPFS does it.

yellowsir · on Nov 20, 2022

the difference is, when you have a lot of (small) files. With ipfs you have one global peer pool, with torrents you need a pool per file.

teraflop · on Nov 19, 2022

That's basically the same as torrents with magnet links.

As other commenters pointed out, the biggest real difference is that IPFS objects can cross-reference each other by hash, which allows you to do partial updates.

layer8 · on Nov 19, 2022

An “update” means a new CID though, and this basically works like a persistent data structure [0]?

[0] https://en.wikipedia.org/wiki/Persistent_data_structure
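A toy version of the "objects reference each other by hash" idea, showing why an update produces a new root ID while the unchanged pieces are shared, much like a persistent data structure. The in-memory `store` dict, the JSON list used as the root object, and the helper names are assumptions for illustration; IPFS uses its own DAG formats.

```python
import hashlib
import json

def put(store: dict, data: bytes) -> str:
    """Store a block under the hash of its content and return that hash."""
    key = hashlib.sha256(data).hexdigest()
    store[key] = data
    return key

def put_file(store: dict, chunks):
    """Store each chunk, then a root block that lists the chunk hashes."""
    child_ids = [put(store, c) for c in chunks]
    root_id = put(store, json.dumps(child_ids).encode())
    return root_id, child_ids

store = {}
root_v1, ids_v1 = put_file(store, [b"chapter 1", b"chapter 2", b"chapter 3"])
# "Update" the second chapter: only that chunk and the root block are new.
root_v2, ids_v2 = put_file(store, [b"chapter 1", b"chapter 2 (revised)", b"chapter 3"])
```

After both versions are stored, the store holds four distinct chunks and two roots; anyone already hosting version 1 only needs the revised chunk and the new root to host version 2.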

fomine3 · on Nov 21, 2022

Winny (since 2002) and successor Japanese P2P networks were like that. They use MD5/SHA file hash as address. https://en.wikipedia.org/wiki/Winny

russellbeattie · on Nov 19, 2022

A few months ago I decided on a whim that I should download and de-drm my Audible audiobooks. I have a lot of them from a decade+ of monthly credits being used. I hadn't really considered how much space it would take up so I just set off the download process and did other stuff. 650 audiobooks took up nearly 2TB of space before I stopped it, only noticing when my Mac gave me a warning about not having enough space left on the drive. Whoops. (Audible keeps track of your total listening time... It adds up to over 9 months solid. Crazy, but I can think of worse ways to spend that time.)

Anyways, if 6 million ebooks takes up 31TB, I wonder how much space the equivalent amount of audiobooks would use? And how much space a decent library of movies has? I'm kinda thinking that guy on reddit who somehow got a hold of a used Netflix edge server was on to something.

getcrunk · on Nov 19, 2022

the only way i could get that number to work is : 650 books x 15hrs x 3600 seconds x 320kbit/s divided by 8 bits/byte = 1.4tb

id say change the audio format and cut the bit-rate roughly in half.

but welcome to data hoarding, start with an 8tb external wd drive!

summm · on Nov 19, 2022

Audiobooks are usually mono, so 700 kbit/s for uncompressed CD quality. To me parent's numbers sounds like uncompressed. I suspect something went wrong with de-drm/decoding. Should be more like 48 kbit/s for basically transparent mono voice with opus. At this rate, 9 solid months will fit into 142 GB...
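The back-of-envelope numbers in the two comments above can be reproduced in Python. Assumptions: decimal units (1 kbit = 1000 bits, 1 TB = 10^12 bytes), CD audio at 44.1 kHz and 16 bits per sample, and an average month of 30.44 days; the helper name `audio_bytes` is made up for this sketch.

```python
def audio_bytes(hours: float, kbit_per_s: float) -> float:
    """Size in bytes of `hours` of audio at a constant bitrate in kbit/s."""
    return hours * 3600 * kbit_per_s * 1000 / 8

# 650 books x 15 hours each at 320 kbit/s
estimate_tb = audio_bytes(650 * 15, 320) / 1e12   # about 1.4 TB

# Uncompressed mono CD audio: 44,100 samples/s x 16 bits per sample
cd_mono_kbit = 44100 * 16 / 1000                  # 705.6 kbit/s, i.e. roughly 700

# Nine solid months of listening at 48 kbit/s (Opus-style voice bitrate)
nine_months_hours = 9 * 30.44 * 24
opus_gb = audio_bytes(nine_months_hours, 48) / 1e9  # about 142 GB
```

Both figures quoted in the thread check out under these assumptions: roughly 1.4 TB for 650 fifteen-hour books at 320 kbit/s, and roughly 142 GB for nine months of audio at 48 kbit/s.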

russellbeattie · on Nov 20, 2022

Whatever OpenAudible.org's default options are, they obviously aren't as efficient.

russellbeattie · on Nov 19, 2022

Hmm... well, I do tend to try to maximize the amount of listening I get per credit, so I have lots of 20+ hour books. Stephen Fry's reading of the complete works of Sherlock Holmes is nearly 63 hours [1] and that's just a "single" audiobook (and highly recommended). Lots of Great Courses recordings are similar. Rise and Fall of the Third Reich is 57 hours, etc. Also, there's a ton of metadata like images and PDFs, though I'm not sure how much that takes up. And in fact (just checking again) I'm at only 638 books, so it was probably more like 625 books when I did this. Apparently it adds up.

1. https://www.audible.com/pd/Sherlock-Holmes-Audiobook/B06WLMW...

QuadrupleA · on Nov 19, 2022

> It was sad day for the free flow of information, knowledge, and culture.

Quick counterpoint to this type of revolutionary rhetoric - writing a book (a good one) takes a lot of time, patience, and hard work. Making it available for free, against the author's wishes, while very easy in the digital age and obviously great for readers, is robbing from the author (if they're still alive). It devalues the craft of writing and makes it even more un-viable as a profession.

This may be inevitable, and it has mostly happened with music, journalism, etc., but by putting energy into this kind of project you're only furthering its death along, and perhaps furthering the death of society and culture a little by taking away a lot of the incentive to write anything of value.

Pirate away if you want, but thinking this is some kind of virtuous revolutionary act is narcissistic BS.

OrvalWintermute · on Nov 19, 2022

Counterpoint to your counterpoint

For much of human civilization there has been an inherent social contract of sorts around communication.

I can take what you tell me, mix it with other things, and retransmit it. While I love books, bits and bytes are not books. There is no physical media. Now, we are in an era of inherently zero cost redistribution, and the vast and giant fraud perpetrated on us by the cabal of distributors, publishers, authors and agents finally is seeing pushback.

The vast amount of exploitation of public research, and non-value adding commercialization inherent in some of the scientific publishing and the gatekeeping boards is some of the worst of the lot.

Most of all, because it actively interferes with the transfer of knowledge and serves to penalize those without means, and those in poorer countries.

Considering the vastly exploitative system and the outrageous prices and robber baron profits these industries have made...

I consider it my duty to accelerate its creative destruction and return the ability to recommunicate ideas back to its inherent social contract.

Using your bits and bytes to store data is a revolutionary act that this vampire industry of ghouls seeks to prevent.

Down with the literary Robber Barons!

QuadrupleA · on Nov 20, 2022

I agree on scientific publishing - if the research is publicly funded then having a journal charge exorbitant prices for access is completely corrupt.

I don't think it's the same thing for books by individual authors though. Good writing is a lot of work, whether it's bytes in a text file or ink on wood pulp. Giving authors a way to make a living at it so they don't have to wait tables the other 8 hours a day would be nice.

If publishers, agents, etc. are providing a real service to the author - marketing, networking, getting good work seen - then I don't see anything wrong with them getting a cut, if the author chooses to hire them. Or if the author wants to take that on, build a social media following, beat the pavement doing book tours etc. that's fine too.

z3c0 · on Nov 19, 2022

> This may be inevitable, and it has mostly happened with music, journalism, etc.,

What on earth are you talking about? Some of the best artworks - in all categories mentioned, and beyond - come with a flippancy towards marketability, aka the avant-garde. The detachment of capital from expression will only prompt more earnest expression.

QuadrupleA · on Nov 19, 2022

Art is made by people, who need food, housing, etc. If they can support themselves in some way through their craft, then they can spend more time at it, and less washing dishes at a restaurant, and be better at it.

Yes business can cheapen and corrupt, and produce commercialized crap (which apparently people like and will pay for so it's serving some need). But there are plenty of artists out there with integrity who also make a living at it.

z3c0 · on Nov 20, 2022

You're conflating demands for capital with an artist's desire to create, and most artists would do the latter regardless of the former. In fact, you even said it yourself - artists who create for capital are forced to produce more shallow art for the masses. Art is a pursuit of the divine, not capital, and pursuing one will come at the expense of the other.

barry-cotter · on Nov 20, 2022

> What on earth are you talking about? Some of the best artworks - in all categories mentioned, and beyond - come with a flippancy towards marketability, aka the avant-garde. The detachment of capital from expression will only prompt more earnest expression.

Is this an “Artists should come from rich families“ or “Be like Picasso and Warhol. Attend the right parties with the right people and you can attract the right patrons too.”?

The only people who can be flippant with regards to marketability are those who have family money, an extremely supportive spouse, a patron or are willing to live in poverty.

zozbot234 · on Nov 20, 2022

> The only people who can be flippant with regards to marketability are those who have family money, an extremely supportive spouse, a patron or are willing to live in poverty.

Art has always been pursued under these exact circumstances, and literature was no exception historically. Mass-marketable art is very much the exception, not the rule. And often it's even the least interesting kind, because it's the most predictable in its features - more of a skilled craft than art in a narrowly creative sense.

z3c0 · on Nov 20, 2022

> The only people who can be flippant with regards to marketability are those who have family money, an extremely supportive spouse, a patron or are willing to live in poverty.

In this aside, you just described most artists. Picasso and Warhol are the two biggest pop stars in art history, and the worst possible examples of what constitutes the average artist.

js8 · on Nov 19, 2022

I think you should read https://lessig.org/product/free-culture/, maybe you will change your mind.

criddell · on Nov 19, 2022

Do you see any irony that you are pointing to what is basically an ad for a non-free book?

hbbio · on Nov 19, 2022

On March 25, 2004, Professor Lessig released a PDF version of this book under a Creative Commons license. This HTML version, produced by José Menéndez, is released under the same license.

https://www.ibiblio.org/ebooks/Lessig/Free_Culture/Free%20Cu...

dudul · on Nov 19, 2022

I'm surprised I need to pay to read this book :)

waboremo · on Nov 19, 2022

Were you paid for this comment? Or did you write it out of your own desire to convey what you are thinking?

I wonder if the answer to these questions also resolves the unfounded idea that writers will stop writing when capitalism ceases to exist.

QuadrupleA · on Nov 19, 2022

So writing an HN comment and writing a good book are the same level of effort? And I just wrote two books?

Basically you're saying authors should be forced to work for free, with day jobs to pay the bills, and writing on nights and weekends, rather than having their craft be a viable profession.

Fine, and it's probably where things are going, but kinda sad in my opinion. And the hypocrisy of book stealers thinking they're some kind of Che Guevara is pretty silly.

Eisenstein · on Nov 20, 2022

This entire line of reasoning is predicated on the fact that there is no public funding of such things. If there were a way to make a living creating works without relying on selling them, then this wouldn't be an issue. Instead of talking about why sharing is bad, why don't we talk about how to publicly support creation?

IanCal · on Nov 19, 2022

I don't think you reading their comment without paying them was against their wishes, was it?

Vt71fcAqt7 · on Nov 19, 2022

>the unfounded idea that writers will stop writing when capitalism ceases to exist.

Where did GP make that claim? Closest I found was that copyright violation is "taking away a lot of the incentive to write anything of value." That may or may not be true but they certainly did not say that "writers will stop writing when capitalism ceases to exist."

dredmorbius · on Nov 20, 2022

<https://news.ycombinator.com/item?id=33676788>

enriquto · on Nov 19, 2022

Just don't read Kafka then.

omnimus · on Nov 19, 2022

In reality it's a lot more nuanced. Check out the book Chokepoint Capitalism. Authors are often robbed by the middlemen and copyright helps the middlemen.

QuadrupleA · on Nov 19, 2022

I believe Kindle books give a 30% revshare to Amazon. It's a lot, but they do marketing, distribution, payments, the e-book reader, etc. so that authors don't have to. That's still 70% to the author (and publishers, agents and other middlemen if the author chooses to hire them).

It's not a zero-sum game, where either the author wins or the middleman wins. It's complicated.

dmitriid · on Nov 20, 2022

> That's still 70% to the author (and publishers, agents and other middlemen if the author chooses to hire them).

Your mistake is putting the author first in this list.

Vt71fcAqt7 · on Nov 19, 2022

>Authors are often robbed by the middlemen and copyright helps the middlemen.

Can you explain a bit more what you mean? How does this justify copyright violation? Unless you ask the author, how do you know they would let you read the book for free? Surely they get paid more when people buy the book, right? IIRC most authors receive royalties.

giantrobot · on Nov 20, 2022

Most authors never see a dime from royalties. Publishers pay them an advance and then use the author's royalties to pay back that advance.

Royalties often come out to pennies per book. Even an advance of a few thousand dollars ends up taking tens of thousands of unit sales to cover the author's advance and have them see any direct royalty payments.

Publishers also pull all sorts of shady accounting to stiff authors on royalties. They will charge them for editing or cover art against their royalties. Worst is when they charge the author if a retailer returns unsold copies.

That's all besides the fact Z-Library is little different than any other library. A library will lend out a book a large number of times and only pay for it once. Most of the readers were never ever going to buy the book and only read it because it was available at a library.

Very few authors ever see any money from royalties and fewer still manage to live off royalties.

Vt71fcAqt7 · on Nov 20, 2022

Wouldn't this long-term result in publishers offering less advance money? Also, I doubt most people pirating books care to check whether the author is getting royalties or not.

>fact Z-Library is little different than any other library.

The difference is that copyright does not cover benefit from a work but reproduction. Libraries would be illegal too if they involved reproduction of the work.

In any case, this is all still far from the "it was sad day for the free flow of information, knowledge, and culture" argument given by the OP. Most authors really don't want their works reproduced at no cost, and this idea is basically a "what's yours is mine" mentality.

giantrobot · on Nov 20, 2022

> Wouldn't this long-term result in publishers offering less advance money? Also, I doubt most people pirating books care to check whether the author is getting royalties or not.

Publishers make money off selling books. Old books do not sell (obviously there's outliers). Publishers offer advances to get new books to sell. If they stop offering advances they won't get new books.

For most authors royalties are not a thing and never will be. Worrying about royalties on their behalf is pointless.

None of that changes the fact that pirates are no different from a non-customer. Equivocating about "reproduction" is just ridiculous. A library loaning a book out to a hundred non-customers is no different than a hundred people pirating a copy of a book or borrowing a copy from a friend. They're all non-customers.

Treating pirates as some special class of non-customers is ridiculous.

Vt71fcAqt7 · on Nov 20, 2022

>Old books do not sell

>For most authors royalties are not a thing and never will be.

Can you provide some numbers to back up these claims?

Also, I'm not clear on where you addressed my point that this would "long-term result in publishers offering less advance money." If it's that "if they stop offering advances they won't get new books" how do you know that they are already offering them exactly the minimum such that if they give them smaller advances they won't write more books? Surely it would depend on how much they think they will make from the sales? And what about self published books? Do you admit that those should not be pirated?

>None of that changes the fact that pirates are no different from a non-customer.

Do you mean on the average or in totality? Because there certainly are people who would pay but instead pirate.

>Equivocating about "reproduction" is just ridiculous.

That is what the law of copyright is.

>A library loaning a book out to a hundred non-customers is no different than a hundred people pirating a copy of a book or borrowing a copy from a friend.

Authors do not have control of what buyers do with their books beyond the fact that they cannot copy it. That is the condition of the sale. It is usually expressed something like:

>All rights reserved. No part of this publication may be reproduced, distributed, or transmitted in any form or by any means, including photocopying, recording, or other electronic or mechanical methods, without the prior written permission of the publisher, except in the case of brief quotations embodied in critical reviews and certain other noncommercial uses permitted by copyright law.

Basically I'm making two points here. One is that authors do, on net, lose money from piracy, and the second is that even if not, they expressly told all the buyers that the condition of the sale is that they cannot reproduce their works.

As for real evidence for my claim, I point you to the Authors Guild[0] which fought against Google[1] providing for free copyrighted works even if it contained a link to their store. And it in general defends authors' copyright. It "has counted among its board members notable authors of fiction, nonfiction, and poetry, including numerous winners of the Nobel and Pulitzer Prizes and National Book Awards. It has over 9,000 members."

[0]https://en.wikipedia.org/wiki/Authors_Guild [1]https://en.wikipedia.org/wiki/Authors_Guild_v._Google

halpmeh · on Nov 19, 2022

Isn’t IPFS a terrible technology to host copyrighted material on? As far as I know, it has no privacy features, so you can see exactly who is hosting which content.

enriquto · on Nov 19, 2022

I hope we reach an "I'm Spartacus!" situation here, where none of us can be realistically accused of anything.

dale_glass · on Nov 20, 2022

As far as I know, there's no plausible deniability on IPFS. If you decide to help host some piece of content, you personally have to make the choice to mirror it. Which makes you liable.

gmuslera · on Nov 20, 2022

Is it a particular piece of content or a category of content? You may not be liable if you are hosting The Iliad or some other public domain work. How much do you (or can you) choose or know which pieces of content you are hosting, if we are not talking about the whole collection?

dale_glass · on Nov 20, 2022

Yeah, you can pin specific pieces of content, and pin recursively.

But what's the point in doing this for legal content? You can just get the Iliad from Project Gutenberg.

gmuslera · on Nov 20, 2022

Every year there are books that enter the public domain, either because they were published long enough ago or because they were released that way recently. Is every one of them in Project Gutenberg?

Some of those books are even in their own GitHub project, but it would be nice to have them somewhere, and maybe in a way that could resist censorship or blocking.

Or other things than books, like projects or research released in public that some commercial platform centralizes and puts a price tag on.

acdha · on Nov 20, 2022

IPFS probably isn’t the platform for anyone worried about censorship: it gives searchers the IPs hosting specific files which they can either pursue or block depending on whose jurisdiction they fall under.

zozbot234 · on Nov 20, 2022

Sure, but so does BitTorrent and they were previously using just that. A comparison of the two technologies is interesting regardless.

roenxi · on Nov 19, 2022

Between the IPFS & the XMR address at the bottom of the blog post - this is using important technology that didn't exist a few years ago - this blog post wasn't technically feasible in 2012.

It has become substantially cheaper to do this sort of crowdfunded guerrilla knowledge sharing.

typingmonkey · on Nov 19, 2022

You cannot "put stuff" on ipfs. Either you seed it or someone else does. If no one seeds it, it is gone. I bet most of it will be offline in 3 months.

tinalumfoil · on Nov 19, 2022

This is very nitpicky. The entire web works by servers “seeding” data to the client, so by your logic you can’t “put stuff” on the web. The difference with torrents and IPFS is you can have multiple servers seeding the same content, and not be dependent on any one.

I’m not making any predictions about how long Z library stays up, but the illegal seeding of movies and tv shows has remained very strong until today.

dale_glass · on Nov 20, 2022

IPFS is actually worse to put stuff in than the web or torrents, in my experience.

Last time I tried it, the ipfs service used its own storage scheme. Meaning it's not like pointing Apache at a directory. You take your stuff, and upload it into ipfsd first, and it puts that data into its storage system.

So to do this from scratch (not mirroring somebody else's content) needs a minimum of >62TB -- 31TB of content, which ipfsd will then package into 31TB more + overhead in its storage area.

And of course if you're doing this, you're expecting other people to mirror this stuff, so count on hundreds of terabytes of traffic.

So this is easily ~$3K in hard disks alone, plus the NAS/server hardware, plus traffic, plus the willingness to risk the FBI coming and grabbing all of it.
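The arithmetic behind that estimate can be sketched roughly as follows. This is only an illustration of the figures quoted in the comment (31TB of content, a second copy in the node's datastore, about $3K of disks); the function name and the `overhead_fraction` parameter are hypothetical, and the thread gives no number for the overhead, so it defaults to zero. The no-copy case is included only for comparison.

```python
def storage_needed_tb(content_tb: float, copies_into_repo: bool = True,
                      overhead_fraction: float = 0.0) -> float:
    """Disk space (TB) needed to publish `content_tb` of files from scratch.

    copies_into_repo=True models the behaviour described in the comment:
    the original files plus a second copy inside the node's own datastore.
    overhead_fraction is a hypothetical knob for datastore overhead; the
    thread gives no figure for it, so it defaults to 0.
    With copies_into_repo=False the datastore is assumed to hold only
    references, which this sketch treats as negligible.
    """
    repo_tb = content_tb * (1.0 + overhead_fraction) if copies_into_repo else 0.0
    return content_tb + repo_tb


CONTENT_TB = 31.0        # collection size quoted in the thread
DISK_BUDGET_USD = 3000   # the "~$3K" estimate quoted in the thread

total_tb = storage_needed_tb(CONTENT_TB)
print(f"storage needed: {total_tb:.0f} TB")
print(f"implied disk price: ${DISK_BUDGET_USD / total_tb:.2f} per TB")
```

Traffic, NAS/server hardware, and any real datastore overhead are left out, so this is a lower bound on the comment's estimate rather than a full budget.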

aabbcc1241 · on Nov 20, 2022

I know the pain of having to store the payload multiple times when using systems like ipfs and zeronet.

It may be helpful to leave the files in the file system as is, and store the metadata in a sqlite database.

Then integrate it with a p2p network layer for crowd seeding and even content discovery.

boramalper · on Nov 21, 2022

IPFS isn't my favourite tool either but you are wrong.

> So to do this from scratch (not mirroring somebody else's content) needs a minimum of >62TB -- 31TB of content, which ipfsd will then package into 31TB more + overhead in its storage area.

IPFS has had a `nocopy` option for quite some time now, which avoids copying.

> And of course if you're doing this, you're expecting other people to mirror this stuff, so count on hundreds of terabytes of traffic.

Of course you are expecting other people to mirror this stuff, and naturally that will generate some traffic. How is this not a problem with web mirrors or torrents?
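For readers who have not used the `nocopy` option mentioned above, here is a minimal sketch of what it looks like with the Kubo (go-ipfs) command line, assuming the experimental filestore feature that `--nocopy` depends on. The helper name and the `/data/library` path are hypothetical, and the sketch only builds and prints the commands rather than running them.

```python
import shlex


def nocopy_add_commands(path: str) -> list[list[str]]:
    """Return the IPFS CLI invocations as argv lists, without running them."""
    return [
        # --nocopy relies on the experimental filestore, which must be enabled first.
        ["ipfs", "config", "--json", "Experimental.FilestoreEnabled", "true"],
        # Add the directory recursively, referencing files in place instead of copying them.
        ["ipfs", "add", "--nocopy", "-r", path],
    ]


# "/data/library" is a placeholder path for illustration only.
for argv in nocopy_add_commands("/data/library"):
    print(shlex.join(argv))
```

Since the datastore then only references the original files, they presumably need to stay where they are for the content to remain available.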

dale_glass · on Nov 21, 2022

> IPFS has had a `nocopy` option for quite some time now, which avoids copying.

Oh, I didn't know about that one. Thanks!

> Of course you are expecting other people to mirror this stuff, and naturally that will generate some traffic. How is this not a problem with web mirrors or torrents?

I mean, if I put a book archive on the web, I'd expect the vast majority of people to just grab whichever book they were interested in. Mirroring is a possibility, but a non-trivial thing to accomplish, and can be discouraged.

Meanwhile, on IPFS I'd expect a much higher likelihood of somebody trying to replicate the whole archive, so one would do well to keep that in mind and to be prepared for it.

boramalper · on Nov 21, 2022

You're welcome!

Re #2: Perhaps that's true, but on the other hand, the load will be distributed across all seeders with IPFS whereas your web server will be the only one shouldering it.

typingmonkey · on Nov 19, 2022

Fair point. In my experience, people often think ipfs works like a shared drive or filecoin, so I had to point that out.

tshaddox · on Nov 19, 2022

That’s clear from the article, which is about the nontrivial task of making tens of terabytes of data available on IPFS.

blooalien · on Nov 19, 2022

> … "If noone seeds it, it is gone." …

This is a very good argument for primarily using ipfs to host high quality content that people will want to preserve long-term…

yieldcrv · on Nov 19, 2022

people are pinning stuff on ipfs with filecoin

there is a free service that does this for you up to about 5 gigabytes, it pins with filecoin

https://web3.storage/pricing/

I have entire sites hosted for free this way, whereas other hosts like vercel and netlify will charge you for traffic. you can just put your big assets on ipfs+filecoin pins and have unlimited traffic. the ipfs CDNs help with performance.

miohtama · on Nov 19, 2022

Interesting. How does the CDN work as web browsers are not yet natively supporting IPFS? Is someone operating the CDN?

yieldcrv · on Nov 20, 2022

The gateways are HTTPS; choose one that caches on its own, like Cloudflare’s.
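To make the gateway idea concrete: a path-style HTTP gateway serves content at `/ipfs/<cid>/...`, so an ordinary browser can fetch it over HTTPS. The sketch below only builds such a URL; the host `gateway.example` and the CID placeholder are hypothetical, and the function name is an illustrative choice.

```python
def gateway_url(gateway_host: str, cid: str, path: str = "") -> str:
    """Build a path-style gateway URL: https://<host>/ipfs/<cid>[/<path>]."""
    suffix = "/" + path.lstrip("/") if path else ""
    return f"https://{gateway_host}/ipfs/{cid}{suffix}"


# Both the host and the CID below are placeholders, not real endpoints or content.
print(gateway_url("gateway.example", "<cid>", "assets/cover.png"))
```

Whether a given gateway caches, rate-limits, or blocks particular content depends on whoever operates it.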

miohtama · on Nov 19, 2022

IPFS is a caching or distribution layer, not persistent storage. The Filecoin team is attempting to build storage that backs IPFS (the Filecoin and IPFS teams have a lot of overlap).

Xeoncross · on Nov 19, 2022

Isn't that true of anything shared on a network?

acdha · on Nov 19, 2022

Yes, but there are people who claim that IPFS is uncensorable or permanent. I saw that a lot from NFT salespeople who were claiming you’d own something forever with no further payments needed, or even that a copyright claim couldn’t take it down.

boredhedgehog · on Nov 20, 2022

I would have expected there to be some more effort to curate and cull the collection before rehosting it in full size. Z-Library's automatic deduplication was at best perfunctory; too many works were featured in various formats and editions with no indication of their merits or distinctions. I wouldn't be surprised if the file size could be cut in half without any valuable information lost.

js8 · on Nov 19, 2022

I wish there was some "library request" mechanism for shadow libraries. I don't mind sharing torrents, especially for older and rare works (which are unlikely to be targeted for copyright), but I can't keep all of them active at the same time; a lot of them I have on backup disks.

I imagine the client could request that data become available, then the server (the person managing a part of the shadow library) could somehow make the files available after a certain period, and the client would automatically download the available files.

tkiolp4 · on Nov 20, 2022

I love this but I don’t understand one thing. It’s relatively easy to find “Anna’s Archive” on the internet, and although it doesn’t host copyrighted content itself, it does link to websites that do host copyrighted PDFs, epubs, etc.

What’s stopping the FBI (or others) from taking down such linked websites? Isn’t it just a matter of time until all of them are taken down and the owners sued?

gw67 · on Nov 19, 2022

How to search for a book? Is there a search engine?

flotzam · on Nov 19, 2022

https://en.wikipedia.org/wiki/Anna's_Archive

luuuzeta · on Nov 20, 2022

Funnily enough Z-Library made it easy for me to buy physical copies of books I wouldn't have bought otherwise.

Sugimot0 · on Nov 20, 2022

What advantage does ipfs hold over tor or i2p? it seems like ipfs is a poor fit for this use case, but I'm not very familiar with these things.

yonatron · on Nov 19, 2022

I haven't seen one commenter here concerned with copyright violation or piracy. You're ok with that? Have you never created content of your own, made of your own blood, sweat and tears, and then had it pirated and given away free? Do you condone this? Is this not thievery in your eyes?

braingenious · on Nov 19, 2022

I have made content and I condone this wholeheartedly. Hell, I consider it to be such a net positive that in my ideal world one of society’s primary mandates would be maintaining a universal “library of everything”

I think it’s a downright silly concept to pitch spending public money on cops to hunt down archivists rather than spending money on archivists. I think only the most embarrassingly naive folks would advocate for the world to work that way.

Imagine being so terrified of an imaginary John Galt scenario that you actually put people in jail to avoid the very thought of it.

dredmorbius · on Nov 20, 2022

$5.25 per month for a typical U.S. household, or an 0.01% income tax basis, would provide compensation equal to all current book sales, and avoid both the deadweight losses of information access denial of the present system as well as the Federal Crime of Giving People Books.

<https://news.ycombinator.com/item?id=33647625>

Criminalisation of digital distribution was only legislated in 2008 (again in the U.S.):

<https://news.ycombinator.com/item?id=33643398>

dredmorbius · on Nov 20, 2022

Correction: that should be 0.1% (1/1000th of income), on an annualised basis.
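As a quick sanity check on the correction, the two rates can be compared by asking what annual household income each would imply for a $5.25 monthly charge. This is just arithmetic on the figures in the comment; the function name is hypothetical, and the sketch says nothing about whether the total would actually match book sales.

```python
def implied_annual_income(monthly_fee_usd: float, tax_rate: float) -> float:
    """Annual income at which `tax_rate` of income equals 12 monthly fees."""
    return monthly_fee_usd * 12 / tax_rate


MONTHLY_FEE = 5.25  # dollars per household per month, from the comment

for label, rate in [("0.1% (corrected)", 0.001), ("0.01% (original)", 0.0001)]:
    income = implied_annual_income(MONTHLY_FEE, rate)
    print(f"{label}: about ${income:,.0f} per year")
```

At 0.1% the implied income is about $63,000 a year, while 0.01% would imply about $630,000, which is why the corrected figure is the one that fits a "typical" household.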

dudul · on Nov 19, 2022

I have written a book that ended up on Z-Library (and some other shadow libraries). It did bother me a little bit at first yes. I had spent hours of my spare time working on that book and, while I definitely didn't do it for the money, it still hurt a bit to think that folks could just download it for free.

But at the end of the day, I wrote my book to communicate knowledge I had and thought was valuable, and if people who couldn't afford the $30 got some value out of it then it was worth it. Now obviously it would be different for someone trying to make a living out of their writing.

mdp2021 · on Nov 19, 2022

> and if people who couldn't afford the $30

Be sure that some people in fact paid you the $30 only because they tried it first and said "this is worth it" (it is how it sometimes actually works).

dudul · on Nov 19, 2022

Meh, that's an argument often heard "I'll download this game to try it and if I like it I'll buy it legit", but how often does it really happen? :). Maybe some did, but it's impossible to quantify.

giantrobot · on Nov 20, 2022

Someone that downloads something and never pays for it is no different than them never downloading it but also never paying for it. They're non-customers.

There's also no difference between the game downloader and them playing a copy at their friend's house and deciding not to buy it. There's also no difference if they were to buy the game second hand.

mdp2021 · on Nov 19, 2022

I am not sure about the rhetoric: why should you quantify it?

If it is to calculate missed revenue: are you considering that the other side of what I have written - a fact - is that a number of people who downloaded an available copy of your book would not have bought it anyway?

Those relevant to the missed revenue are those who would have paid but did not: the number in that group is something that you can hardly quantify, and on the other hand you have gained buyers that would not have bought your book if they had not tried it.

dmitriid · on Nov 20, 2022

"an increase in illegal consumption over time is found to correlate with an increase in legal consumption and vice versa"

In a study commissioned by Google: https://www.ivir.nl/publicaties/download/Global-Online-Pirac...

getcrunk · on Nov 19, 2022

The problem is that, in general, that content made of "blood, sweat and tears" is already being pillaged by the robber barons who take 90%+ of the profit from your work, at least in the fields of movies, TV, music and books. Not saying anything is right or wrong, just pointing out this complicating factor.

acdha · on Nov 20, 2022

There’s a serious citation needed on those numbers but even if they are accurate, these authors are not slaves. They voluntarily sign contracts rather than give their work away, self-publish, or stop writing. It seems rather self-serving to argue that you are better equipped to make that decision for them and that means you should get something you want without paying for it.

timmytokyo · on Nov 20, 2022

So by this logic, it's ok to further immiserate the poor author since she was already 90% immiserated by the big bad publisher?

RunSet · on Nov 20, 2022

> I haven't seen one commenter here concerned with copyright violation

While we are enumerating things we haven't seen...

In the United States, copyright exists "to promote the progress of the useful arts and sciences", yet copyright is defined in terms of "life of the creator plus some years".

I have never seen an explanation of how copyright is expected to incentivize a dead creator to perform further creative acts.

mdp2021 · on Nov 19, 2022

> Is this not thievery in your eyes

Certainly not: a thief removes property, leaving the victim without the re-appropriated.

bozhark · on Nov 19, 2022

99% of the people that use something like z-library would _never_ buy access to the articles.

scotty79 · on Nov 19, 2022

Yes.

How about you? How badly was your life damaged by book piracy (or any other kind)?

cratermoon · on Nov 19, 2022

It depends on who steals it. If an individual, say one of my peers, wants to read and enjoy it, that's very different from a global clothing retailer using it to sell their stuff.

RobotToaster · on Nov 19, 2022

>You're ok with that?

Yes

NKosmatos · on Nov 19, 2022

Very difficult questions, with even harder answers. IMHO it’s totally different trying to liberate the knowledge locked behind paywalls (books, research papers, studies…) and the real piracy of digital content (movies, software, audio…). Many different factors to consider and I think it all comes down to personal views and preferences.

dudul · on Nov 19, 2022

How is a book a "locked paywall" and a movie is "real piracy"? I fail to see how the difference in medium between a book and a movie makes the situation so completely different.

NKosmatos · on Nov 19, 2022

Not all books are available in all countries, especially poor ones. My whole point is that we’re talking about freedom of knowledge. Someone who writes a book or publishes research is doing it for the spread of information/knowledge and for the greater good. I agree that these people have to make a living as well and that’s where it’s getting difficult. In my mind, piracy of knowledge is ok, but piracy for profit or for not wanting to pay is bad.

dudul · on Nov 20, 2022

And not all books are about spreading information and/or knowledge. We're not only talking about academic books here; there is also fiction. What does the latest Stephen King have to do with freedom of knowledge? I understand what you are talking about and as an author myself I do agree as per some of my comments under this thread, but let's not pretend all the books made available for free are about "knowledge".

dmitriid · on Nov 20, 2022

Who are you to say what's knowledge and what's not?

Why can't I, a speaker of language X, buy books in that language outside of the country where this language is spoken, on the same platform where they're sold?

See e.g. availability of, say, Japanese or Russian books on Apple Books in Europe, US, Russia and Japan.

29athrowaway · on Nov 19, 2022

And this is exactly why I will never use IPFS.

I don't want anything to do with anything illegal.

dale_glass · on Nov 20, 2022

As far as I know, replication of data on IPFS is a purely voluntary and specific action -- you choose which exact content you want to mirror.

So if you run a node and want to avoid illegal content, the solution is just not to use the system to pin such content.
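One way to picture that opt-in model is an allowlist: a node operator only ever issues pin commands for content they have explicitly approved. The sketch below is a hypothetical illustration, not part of IPFS itself; the function name and the placeholder CIDs are made up, and it only builds `ipfs pin add -r` command lines (the Kubo CLI's recursive pin) without running them.

```python
def pin_commands(candidate_cids: list[str], approved_cids: list[str]) -> list[list[str]]:
    """Build `ipfs pin add -r <cid>` commands only for explicitly approved CIDs."""
    approved = set(approved_cids)
    return [
        ["ipfs", "pin", "add", "-r", cid]
        for cid in candidate_cids
        if cid in approved  # anything not on the allowlist is simply never pinned
    ]


# Placeholder identifiers, not real CIDs.
candidates = ["cid-public-domain-book", "cid-unknown-upload"]
approved = ["cid-public-domain-book"]

for argv in pin_commands(candidates, approved):
    print(" ".join(argv))
```

Nothing is pinned unless the operator adds it to the approved list, which mirrors the "purely voluntary and specific action" described above.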

mdp2021 · on Nov 19, 2022

Checked https://www.ipfs.tech/ , it seems like a progressive project - "those things that do something good".

Unless you mean that you do not want to have anything to do with hammers and screwdrivers.

29athrowaway · on Nov 19, 2022

Gift cards are a great idea. But who are the #1 users of gift cards right now? Scammers who found in gift cards a good way to conceal their identity and financial info.

Today, thousands of elderly people will be scammed by idiots that spend their entire day preying on older people. Elderly people that will lose everything and end up homeless.

Fuck the current implementation of gift cards. They cause more problems than they fix. We can't have nice things because there are people out there who are fucking disgusting. Therefore we need accountability built into each technology.

Why? because there's a lot of people that fucking suck that's why. Build a simple application that allows people to share plain text and figure out what happens next. Harassment, death threats, organized crime, and all sorts of horrible things.

Technocrats and tech enthusiasts are often too idealistic to think about the real world consequences of technology.

The person that was having fun implementing convolutional neural networks and decided to publish their research for the benefit of the world would never have imagined that years later the Chinese would take that research and build a mass surveillance system that tracked the entire population in real time, enslaving billions of people.

The people that built Tor would never have imagined the caliber of fucked up shit that now takes place there. And the same with every technology.

mdp2021 · on Nov 20, 2022

> gift cards ... accountability built into the technology

It depends on what you mean with that «accountability built into the technology».

Related case, as you are also speaking of cards: this friend of mine wanted to buy a foreign book (less available in bookstores, where you pay cash). So he purchased an anonymous pre-paid credit card. He found out that during the mess of the past three years, somebody has made it very difficult to keep those cards anonymous: they now require a telephone number to activate them, and in regions where, "coincidentally", telephone numbers must be associated with legal identities.

Now: as you should know, you can bet your life that if our transactions were recorded, we would strictly avoid commerce.

(Just like, pretty similarly, we would strictly avoid a non anonymous Internet.)

So: your call for accountability should work within the boundaries of what is acceptable.

> too idealistic

Don't worry, we grow up, and some of us restrain production - exactly in the awareness of what may happen.

29athrowaway · on Nov 20, 2022

I would much rather a tourist be inconvenienced than have thousands of elderly people lose their livelihood every year.

Imagine working all your life so that then everything gets taken away by some jerk.

mdp2021 · on Nov 20, 2022

You probably misread it: tracked commerce would be the end of commerce; tracked Internet would be the end of Internet - when you speak of «accountability», take into account who would accept which implementation.