Show HN: Fully-searchable Library Genesis on IPFS

(libgen.fun)

452 points | by sixtyfourbits 941 days ago

22 comments

  • madars 941 days ago
    Tech details from the Getting Started guide:

    > How does this work?

    > SQLite compiled into WebAssembly fetches pages of the database hosted on IPFS through HTTP range requests using sql.js-httpvfs layer, and then evaluates your query in your browser.

    The same guide, https://libgen-crypto.ipns.dweb.link/, also explains how you can download the page to search locally without constant internet access.

    sql.js-httpvfs was previously discussed on HN here: Hosting SQLite databases on GitHub Pages or any static file hoster (1812 points) https://news.ycombinator.com/item?id=27016630
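
A minimal sketch of the idea behind the quoted explanation: a SQLite file is a sequence of fixed-size pages, so reading one page from a static host amounts to one HTTP range request. The function names and the 4096-byte page size are assumptions for illustration; this is not the sql.js-httpvfs API, and the real layer may batch or prefetch pages differently.

```python
def page_byte_range(page_number: int, page_size: int = 4096) -> tuple:
    """Return the inclusive (first, last) byte offsets of a 1-indexed page."""
    if page_number < 1:
        raise ValueError("SQLite pages are numbered from 1")
    first = (page_number - 1) * page_size
    return first, first + page_size - 1


def range_header(page_number: int, page_size: int = 4096) -> dict:
    """Build the HTTP Range header asking a static host for exactly one page."""
    first, last = page_byte_range(page_number, page_size)
    return {"Range": f"bytes={first}-{last}"}
```

A query that touches only a handful of B-tree pages therefore downloads only those byte ranges rather than the whole database file.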

  • c54 941 days ago
    This is great! I've been half-barely-following IPFS development for a few years now but I think this is a salient use case that I could actually see myself using.

    I think also with IPFS I can share files with peers pretty easily? It's nicer than uploading to a filesharing site, and easier than setting up a torrent.

    So, what's next @sixtyfourbits? Is there a read-only wikipedia on ipfs yet?

    edit: found it, but I think it's not searchable https://en.wikipedia-on-ipfs.org/wiki/

    • sixtyfourbits 941 days ago
      Next up is the scimag collection (also maintained by library genesis), which is a backup of all articles on sci-hub.

      The torrents are already widely replicated, but whether or not IPFS is going to be able to scale to the level required for 85M articles is still an unknown.

      • lettergram 941 days ago
        I thought the new release allows for an exabyte of data. Not sure it can handle bandwidth though
  • easrng 941 days ago
    This is cool, but more centralized than it needs to be. The update check resolves libgen.crypto using @unstoppabledomains/resolution[0] with its default Ethereum provider, Infura[1]. That means that if Infura disables the default API key, goes down, or starts censoring responses, the update check will fail and users will be stuck on an older version of the site. Using the .crypto domain for updates is unnecessary, a simple IPNS[2] lookup (Not to be confused with DNSLink[3]) would've sufficed.

    [0]: https://www.npmjs.com/package/@unstoppabledomains/resolution

    [1]: https://github.com/unstoppabledomains/resolution/blob/HEAD/R...

    [2]: https://docs.ipfs.io/concepts/ipns/

    [3]: https://docs.ipfs.io/concepts/dnslink/

    • dudehere 941 days ago
      You can switch to literally any other resolver in the browser settings or use alternative URLs to access the antisite.

      Also, in future there can be other access paths to this search. All of them go through different infrastructures and may or may not work for a particular user.

    • dvcrn 941 days ago
      IPNS is so slow that using it for anything just makes for a very unpleasant experience
    • miohtama 941 days ago
      POKT Network offers P2P Ethereum API nodes. It is early though, and paying is difficult (consumers do not want to pay)

      https://www.pokt.network/

  • cmeacham98 941 days ago
    > Put http:// as the protocol prefix instead of various https://, ipfs://, or ipns://. The most universal format is http://, since transport-level security (TLS) is not yet fully adopted in dWeb systems (neither it is essential for our applications where a domain name resides on a blockchain and SSL traffic encryption is employed further on)

    I understand that this isn't really their fault because they'd have to get a CA to issue a cert for the non-standard .crypto TLD, but this has to be untrue, assuming I understand correctly that the HTTP version is just hosting a JS IPFS client? And therefore the non-IPFS link is susceptible to MITM attacks if I understand correctly.

    • dudehere 941 days ago
      No difference ICANN or blockchain TLD for making an SSL certificate.

      Where can this MITM stick his foot in? The blockchain domain record is encrypted and contains the publicly visible target IP-address. Even if accessed via http, it can be instantaneously checked. And upon visiting that IP address, the site forces https encryption. Nobody can know from mere traffic analysis what exactly you are doing within the site, since it's encrypted by SSL.

      IPFS encryption isn't necessary either, since content is addressed by its hash which automatically guarantees its authenticity. It doesn't matter if you use https or http.

      It's only imperfect in the transition during DNS resolution. A 3rd party can know what site you go to, but not more than that. Suspecting a public generic blockchain-domain resolving node set up for freedom in defacing would be a bit too much. Usually those are OpenNIC servers or huge DNS providers, unless you specify a custom DNS resolver in your settings.

      For a book site it's a pretty decently protected transition, actually maximal available for a non-expiring (unmanned) service which it is. Legit certificates expire or cost money. The system behind libgen.crypto is fully unmanned, i.e. eternal, except the files themselves which need hosting.
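
A minimal sketch of the content-addressing claim in the comment above: if data is requested by its hash, the receiver can recompute the hash and reject anything that was altered in transit. The function names are hypothetical, and a plain SHA-256 hex digest stands in for a real IPFS CID, which uses a different encoding.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Toy content identifier: the SHA-256 hex digest of the bytes."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected_id: str) -> bool:
    """Accept the data only if it hashes to the identifier that was requested."""
    return content_id(data) == expected_id
```

The check only protects a reader when both the expected identifier and the code performing the check arrive untampered.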

      • cmeacham98 941 days ago
        > And upon visiting that IP address, the site forces https encryption.

        This isn't true if there is a MITM attacker. When you visit an HTTP website, the website doesn't get to redirect you to HTTPS or anything else, because the game is already lost.

        When you visit a website over HTTP the attacker goes first. The legit response never made it to the client, because it was replaced in transit with a redirect to the attacker's scam/phishing/malware website.

        • dudehere 941 days ago
          How do you get a wrong IP from the blockchain in the first place? The legit software suggested to use reads out the blockchain record and forwards you to the only IP. Unclear to me where you can pick crabs on the way.
          • brenns10 940 days ago
            A MITM attack requires that an attacker be able to intercept and inject messages between you and the other side of your connection. So when you connect via HTTP to the IP, the attacker just masquerades as the intended target and either doesn't redirect to HTTPS or does so with their own self-signed certificate.
            • dudehere 940 days ago
              The problem with SSL certification on blockchain domains is that certificates from authorities cannot be made eternal. Either unmanned, or from CA. No real alternative.
            • dudehere 940 days ago
              Ok, true. I think realistically, just use a community approved (and actually the only) software from the company who risks its huge business, if they become a MITM.
              • cmeacham98 940 days ago
                Tons of people have an opportunity to MITM you and there's nothing any community approval can do about it because it's not their end of the connection.

                The random on the coffee shop wifi.

                Or the hacker on your apartment/university's poorly configured network.

                Or your shady ISP.

                Or your own government (especially likely in authoritarian countries decentralized solutions are supposed to help).

                I see elsewhere you mentioned certificates on the blockchain. That could work, but someone has to actually create a standard and write the code to validate the certificate and get other people to use it, which hasn't happened yet.

                • dudehere 940 days ago
                  I am aware of SSL on blockchain domains, but this is a pain in the ass. As I mention somewhere in this thread, if random downloaders lose the same concerns about privacy as serial killers and child traffickers, I think it's better to buy a book, than to actually satisfy such demands. It can be a good joke for stand-up, though, but since there have been only a few individuals caught in the entire mankind history for making or hosting such libraries, it's not more than a joke which should not derail the world from using high tech. No need to be afraid as an academic exercise. In reality nobody needs users. At all. A few owners vs a billion users... No, bro, no. We are wasting time discussing how a book without any private data is "defaced". It is possible, if to stick a dynamite up the ass, but likely without real exposure or, let alone, the interest of all those 12 lawyers in the world fighting with piracy.

                  Have a glass of wine and relax. There is a long line of people to catch before you get on the list.

                  • cmeacham98 940 days ago
                    I'm not worried about getting "caught" for piracy, I'm worried about some asshole in the middle hijacking my connection.

                    Story time: back when I was in college a few years ago, the university network had some weird configuration where everybody in my dorm was on a single large local network. Somebody thought it would be funny if they ARP poisoned the network and redirected all HTTP traffic to shock websites. This would last 2-3 weeks until either they decided to stop or University IT finally caught them.

                    Regardless, I'm glad we moved the goalposts to "you don't need privacy" and conceded that my original comment pointing out how insecure this was is correct.

                    • dudehere 940 days ago
                      There is always a trade-off between convenience and security, and in the comment about http user's convenience is considered a priority.

                      Papers are distributed with IP-addresses stamped in many pdf files upon their downloading from publishers, and nobody seems discussing it. This is incomparably more harmful than some random MITM somewhere done by someone and requiring an infrastructure invasion. But even this has not yet posed a real threat.

                      BitTorrent: anybody directly intercepts the IP-addresses of seeders, and again, no much worry. No need to hack in as with MITM, it's just yours, go watch.

                      So, no problem with MITM in this project, at all. People who want to steal the project's reputation or name, simply squat domains or make various groups.

                      In my opinion MITM is no much different from intercepting a phone conversation by connecting to physical wires going to your apartment. It's very localized.

                      • dudehere 939 days ago
                        True, but I gave a solution for such countries and other cases along the lines: use VPN. It fully recovers security. You only need to quit your local network to bypass MITM risk. VPN does a lot more for your security and privacy.

                        It is not less secure since there's no equivalent more secure option. Don't mix problems of your network access with global decentralization. Decentralization alone is a way for better security by obscurity, but you should appreciate that whoever makes the project are volunteers having scarce resource and who don't want to make it a job for making it perfect for infinitesimal concerns.

                        I have no idea what "original" you refer to in this context. If the Web is more secure with broken HTTPS here and there and fully centralized access, you probably didn't fully understand what the dWeb project is doing.

                        • cmeacham98 939 days ago
                          Using a VPN only sidesteps the MITM risk to my VPN provider, their ISP, and everyone after that on the line. That's probably better than my normal internet if it is known to be compromised, but isn't really a much better solution.

                          And yes, the equivalent more secure option is running a website on the boring old normal internet. This solution actually gives more power to centralized operators, allowing your ISP and government to take over the connection whenever they want, a problem that doesn't exist for normal websites.

                          If your more secure alternative must be decentralized, then Tor hidden services are the go-to option, running on a decentralized network with actual working and battle tested security.

                          You can claim the problem is "infinitesimal" all you want, but until you point out a problem this solution solves that has more users being actively attacked than every person in China, I'll just assume you must be trolling.

                      • cmeacham98 939 days ago
                        You can claim that MITM attacks are "localized" and "one-off", but in reality there are entire countries that MITM their citizens, the most well known example being the Great Firewall of China.

                        What is the point of a decentralized solution that is less secure than the original and can be easily thwarted by more actors than the original (which we know happens to entire countries in the real world)?

                    • dudehere 940 days ago
                      I've just reread my message above, it has many mobile typos. Sorry about that. I hope it wasn't too derailing.

                      About MITM I'd like to add that this event is an exception even for a single person, since (the same) MITM cannot occur on different millions of network we all randomly switch. Anybody would see that the target site doesn't behave as normal at some point, should such an event happen.

                      Indeed, malicious networks exist and the key points here about them would be: 1) the current libgen.crypto implementation is read-only and doesn't request anything of value to be transmitted over the network; 2) your personal visiting statistics would quickly reveal, if MITM attack occurred. Eventually MITM is not more than site defacing. It's not going to be unnoticed in a read-only project, if it starts behaving suspiciously.

                      Everyone knows what results to expect from LG (remember, the original LG project sets reputation and ethics as the top priority), there should be no issue to simply stop browsing.

                      Also, to avoid local network tricks (which can be very harmful), use VPN whenever possible. Nowadays it seems to be a universal tool everybody should have.

                      And don't connect to random WiFi networks ever. Only to those which belong to organizations you visit and are trusted.

                      Your post was correct, yes, since it stems from a mere HTTP protocol observation, but it ignores why it's the only way to access for some systems with some features, and that the expected harm of it for an average individual is practically zero. All variations of LG have been running without SSL for longer than a decade globally, and no problem. So, on the practical foot it's not a concern, (take into account my other comments about various issues introducing HTTPS in every part of the system).

                      Let's quantify it somehow to actually see if this is a concern beyond an academic exercise:

                      1 user out of a million users on a million networks a year may get a wrong forward due to a MITM attack on his network and notice that it is not the site he has seen a hundred times before. The probability of such an event for an average individual is something like 0.00000000000001 per annum. I call it a practical zero.

                      Should one get a small permanent job servicing certification for a dozen randomly expiring systems and paying money with the risk that an expired certificate, should the person die, would practically block access to resource, to get the practical zero to real zero?

                      My answer would be definitely not, this would be waste of life. We all know HTTP has this flaw, but return to that comment about using http: it actually tells you may not have access at all, if you use https (not always, though, but that comment is a hint, not a statement you don't need security). Here's the choice: access with http or secure no access via https? I think there is no real choice. Neither that comment tells you more than to remember a pattern to use with dWeb domain names which reliably works.

                      Summarizing, your logic is correct but not practically helpful.

                      Story time: about 10 years ago a forker from ebookoid came in to the LG forum and started aggressively promoting his site, an LG fork, selling books, while pointing out how poor LG's security was since it had no SSL/HTTPS, and his site had it. A scammer with a legit encryption was humiliating a legit project without encryption.

                      I hope you get my point: don't make a storm in a glass of water, because some less knowledgeable people may take it as a real breach which it is not )

          • dudehere 940 days ago
            Okay, I agree that if the intermediate network is not trusted, there can be MITM. The good thing is that the original LG offers and will keep offering multiple verification ways.

            1. Affiliated sites listed by bookwarrior, the Founder of LG: https://libgen.life/viewtopic.php?f=26&t=7896

            2. Blockchain records viewable via blockchain explorers and similar public tools. E.g., you may check the libgen.lib record on https://peername.com/, press Whois button after the search. The Peername extension simply can't handle SSL, and EmerDNS domains such as .lib aren't yet supported on IPFS by browsers. It's being worked on, though. For now only IP address forwarding works, but you can choose another way as per below.

            libgen.crypto record is googlable and can be seen on OpenSea. I'm not finding the IPFS CID, though, but .crypto does support HTTPS, and so do IPFS gateways, after which the CID takes you to the correct location. You may use https://libgen.crypto/ However, in this case there can probably still be MITM with legit SSL certificates. I'm not sure.

            Concluding, if you once learn a legit blockchain domain name, you can trust it's record since the record cannot be modified without direct owner's intervention. It's cryptographically strong. It's not the case with conventional Web domains which are fundamentally rented.

          • stavros 940 days ago
            Someone is sitting between you and the legit software.
  • 8K832d7tNmiQ 941 days ago
    It's probably just a few steps away to finally build an IPFS version of sci-hub. Sadly, I'm not a fan of the site's method of searching the libgen index by using sqlite's partial load feature [1] mainly because of the possible limited available storage issue.

    [1]:https://phiresky.github.io/blog/2021/hosting-sqlite-database...

    • morsch 941 days ago
      I remember the sqlite via static host discussion, but I don't understand what you mean by "the possible limited available storage issue". Can you explain?
    • dudehere 941 days ago
      It's a feature which actually makes fast decentralized search real-time. If no partial function is involved, it won't work before downloading the entire database. It's a feature and beauty, not something to dislike. It's just given.
  • mbStavola 941 days ago
    Immutability is a blessing and a curse for IPFS.

    It's cool for preventing things like censorship. Something like SciHub would really benefit from it.

    However, for "real world" use cases, many people want to be able to remove or modify what they've uploaded. With IPFS, as far as I'm aware, doing either doesn't really change the underlying data but just creates a new object in IPFS instead which you'd point to via IPNS. Anyone who still wanted to view the old content still could, provided they had the right content id.

    God forbid you accidentally upload a "personal" photo, your only hope is that someone never comes across the content id of that image. There is no way to undo it!

    • eric__cartman 941 days ago
      From my understanding if you accidentally upload a personal file, as long as no one downloaded it in the time you took to realize your mistake taking down the only node that has the file (your computer in this case) should effectively "erase it" in the sense that unless the node comes back up, even if someone has the id of that file they are SOL.
      • unknownOrigin 941 days ago
        Ok, I have to ask, what is an actual difference between torrents and ipfs? I don't care for technical details, I mean the business logic, so to speak.

        - Both use DHTs to search for sources of fingerprinted content.
        - Both use nodes (seeds in BT terminology) that actually store the content.
        - Neither has an "archive" system, so unless at least one node has the file, it may as well not exist at all.
        - Both can have content censored by going after the node operators.

        Am I getting any of these wrong?

        • grumbel 941 days ago
          The difference is the granularity. A torrent is like a tar file, it's a big blob of static data that can't be updated. IPFS in contrast works more like a file system, you have a top level directory that points to the content within it. If you want to change something, you just update the top level directory, while all the content within it can stay the same. Each file on IPFS has its own checksum and can be addressed individually.

          IPFS doesn't help much with censorship, as it has all the same issues as torrent in that area. It doesn't help much with privacy either, as it's all rather public. It's really for legitimate uses, not outside-the-law kind of stuff.

          The benefit of IPFS is that its granularity makes it much more useful for smaller tasks. For example you can host Git repositories or source trees on there. And since IPFS on Linux can be mounted as a file system, you can just access them with a simple `cd` command, no manual download or extraction needed.
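
A minimal sketch of the "top level directory that points to the content" idea described above, assuming a toy hashing scheme rather than IPFS's actual object format: each file has its own hash, and the directory hash is computed over the list of (name, file hash) links. All names here are hypothetical.

```python
import hashlib


def file_id(data: bytes) -> str:
    """Toy per-file identifier: SHA-256 of the file's bytes."""
    return hashlib.sha256(data).hexdigest()


def directory_id(entries: dict) -> str:
    """Hash a directory as the sorted listing of (name, child hash) links."""
    listing = "\n".join(
        f"{name} {file_id(data)}" for name, data in sorted(entries.items())
    )
    return hashlib.sha256(listing.encode("utf-8")).hexdigest()


# Two versions of a small source tree; only README differs.
v1 = {"README": b"hello", "main.c": b"int main(void) { return 0; }"}
v2 = {**v1, "README": b"hello, world"}
```

Editing README changes its hash and therefore the directory hash, while `main.c` keeps the same identifier and does not need to be stored or fetched again.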

        • easrng 941 days ago
          BitTorrent is easier to understand (IMO) and can sometimes be faster at lookups than IPFS. IPFS dedups content, whereas if you have an identical file in 2 torrents but only one is seeded, you can't download it from the other one. As far as I know those are the only differences.

          Edit: Actually BitTorrent v2 dedups files so it seems like IPFS and BitTorrent are now functionally identical.

          • dudehere 941 days ago
            BitTorrent, in my view, is a messy format, while IPFS is more clear in design. BitTorrent, namely, has fixed chunks and data overlaps in them from adjacent files which make the format artificially merging different files and create difficulties to treat parts individually, e.g. pick a single random file and check its only hash from a torrent. It looks like they couldn't get the basics deeply comprehended at the design stage.

            This is much neater in IPFS. Files and data blocks are individually handled, and there is no situation when a hash embraces several independent file fragments. Adaptive block sizes are also supported which is extremely useful for handling such collections as LG is (however I haven't checked if it's really used at present) instead of having an extra layer of hashes to rehash files individually after the torrent hashed their chopped "tape/tar" chunks. The forced BitTorrent serialization and subsequent fixed-size chunk chopping are basically absent in IPFS. This helps structuring the search and facilitates deduplication, too, through the strict Merkle tree correspondence to files, as opposed to the randomized data chunks having a fixed size for no real necessity and meaningless hashes for wider applications.

            To me these are the key aspects, even torrent bug fixes, that IPFS possesses.
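
A minimal sketch of the overlap described above, assuming the original (v1) BitTorrent layout in which files are concatenated in order and cut into fixed-length pieces. The function name is hypothetical; it only computes which piece indices contain bytes from more than one file.

```python
def pieces_spanning_files(file_sizes: list, piece_length: int) -> list:
    """Return indices of pieces that contain bytes from more than one file."""
    # Byte offsets in the concatenated stream where one file ends and the next begins.
    boundaries = []
    offset = 0
    for size in file_sizes[:-1]:
        offset += size
        boundaries.append(offset)
    # A piece mixes two files when a boundary falls strictly inside it,
    # i.e. the boundary is not aligned to a multiple of the piece length.
    return sorted({b // piece_length for b in boundaries if b % piece_length != 0})
```

For example, two 10-byte files with 8-byte pieces share the second piece, so that piece's hash cannot be checked with only one of the two files on disk.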

            • 0-_-0 940 days ago
              >BitTorrent, namely, has fixed chunks and data overlaps in them from adjacent files

              BitTorrent 2 fixes that: https://blog.libtorrent.org/2020/09/bittorrent-v2/ (hash trees section)

              • dudehere 940 days ago
                Thanks, I actually did suspect they do. They still aren't Web friendly and require dedicated software unlike IPFS via gateways being more Web-ready.

                Who knows, BitTorrent might have never fixed this without seeing how IPFS works.

                • easrng 937 days ago
                  Though it isn't an ideal solution (it's rather large and doesn't have DHT support), https://webtorrent.io supports torrents in browsers.
                  • dudehere 935 days ago
                    Evolution makes different systems adopt each other's features and eventually they may equate as operating systems did. IPFS has key necessary features more on the surface and is a lot more adaptive to modern operating systems compared to torrents, with much less internal games nearly absent in a nodal IPFS software design.

                    Multiple things make IPFS a more architecture-oriented solution than application-oriented BitTorrent.

                    There are various application features yet holding back BitTorrent and LG will utilize them in future.

    • dudehere 941 days ago
      Immutability is not real. There is always a hosting with files. For this reason a community should manage such uploads, and not a random community without rules or values, and not onto a nameless server, for things to be reasonably flexible at first.

      Permanence comes from distributivity and from real hardware, not from IPFS. Its pinning is only a few days long, in reality. It's not really a hosting, rather a sporadic buffer.

  • ajvs 941 days ago
    Awesome project, now all we need is the SciHub version of this.
  • dr_dshiv 941 days ago
    https://libgen-crypto.ipns.dweb.link/

    Wow, this is way more user friendly than normal! Nice work.

  • CuriouslyC 940 days ago
    IPFS is pretty cool. We just need to come up with a really good solution to distributed search, and get some big content creators to sign on, and it'll take off.
  • severine 941 days ago
    You could add a link to IPFS Lite ("IPFS Lite node with modern UI to support standard use cases of IPFS ") on F-Droid. Seems actively updated, can anyone vouch for its quality?

    Source: https://gitlab.com/remmer.wilts/ipfs-lite

    F-Droid: https://f-droid.org/en/packages/threads.server/

    • namibj 940 days ago
      Works, but last I tried, not even a local gateway, so you're limited to the built-in web-view.
  • isoprophlex 941 days ago
    And it's fast as hell too!

    This is fantastic yo, mad props.

    Thanks for liberating our collective knowledge, ipfs style. Keep it up!

  • bscphil 941 days ago
    Couple of thoughts:

    1. > Show HN

    Did you @sixtyfourbits make this? Any stories about how you came to be involved in the project? IPFS seems like a pretty ideal way to handle sharing documents like this, I'm surprised LibGen hasn't used it before (previously, you would get redirected to one of many constantly dying domains that may have ads and frequently 404 on the actual book you're looking for).

    2. Also, this interface frequently doesn't work in Firefox for me. It hangs while trying to load the file. Fortunately, you can check the browser dev tools and find the actual IPFS gateway link (which uses ipfs.io), and go to that link directly. My experience is that the direct link not only works far more frequently, it's actually faster as well. So this raises an obvious question: rather than load the file in a fancy interface, why doesn't the link just take you directly to the IPFS gateway?

    3. Is there any concern that systematically using a legitimate service like IPFS to share illegal material will create a situation similar to that of Bittorrent, which is similarly often presumed illegal until proven otherwise? That seems like a shame. I suspect the only reason why rights holders have not cracked down on IPFS is that it's not yet big enough to be on their radar.

    • dudehere 939 days ago
      1. The main contributor, the author, of this code sees this thread, but previously he asked not to mention his nickname. However, to give it an own tag, libgen.crypto is referred to as antisite. If he wishes otherwise, we'll happily present him to the public. Phiresky deserves the other half of the appraisal for his work on the underlying VFS. By looking at how people find each other to get tiny bits together into something mind-blowing we see how LG functions as a social development core, as a heritage harvester, and as a living organism.

      Everybody who makes a contribution, even a smaller one, no need to make revolutions every day, directly affects the civilization through Library Genesis. LG is built of contributions in the same way as our body is made of cells. It's our heritage.

    • dudehere 941 days ago
      2. It's random for a random user. I'd suggest to try every suggested natively supporting browser with the list of given URLs to find which combination works the best for you.

      3. An arbitrary IPFS gateway can be set up or rented, it's not a taboo. They are usually $10/month.

      • bscphil 941 days ago
        2. I don't understand what you mean? Are you saying the gateway is random? I tried several different browsers and got ipfs.io every time. Are you saying it's random whether it works or not? If so, that seems ... bad.

        3. It's not about whether it's difficult to act as an IPFS node, it's about whether doing so will (in the future) bring you under legal scrutiny the same way running a node serving copyrighted content on the Bittorrent network will do now. DMCA against the major gateways will probably work to make files difficult to access, and IPFS necessarily reveals the IP address of the node you connect to, if you don't use a gateway. Similar techniques are used to get the IP addresses of Bittorrent users, and send them demands for financial compensation or sue them in court for distributing copyrighted material. If the same becomes common for IPFS, it would not be unlikely to see college networks come under pressure to ban access to IPFS, and this would limit access to LibGen's database in a significant way.

        • dudehere 941 days ago
          2. Yes, random whether a chosen gateway works or not. A list of alternative gateways can be supported, in principle. Use local IPFS node software to use ipfs://... links instead, if the hardcoded IPFS gateway doesn't work for you.

          3. IPFS node and gateway are different things. A gateway is vulnerable to takedowns. A node isn't nearly as vulnerable. And if you run IPFS Desktop or similar software on a VPN connection, what's really left to be afraid of, conceptually? Pretty much nothing. It may throttle the traffic a bit, but that's no problem. Pick MullVad VPN or some other like NordVPN etc. Free test days are available usually.

          I don't know why exactly, but edonkey was killed by intercepting participants' IPs, but even without encryption it has not yet happened to BitTorrent. With VPN it's just unfeasible.

          • bscphil 940 days ago
            > Yes, random whether a chosen gateway works or not

            That might be true, but what I was saying in the OP was that the interface at libgen.fun usually does not work for me in Firefox, but the direct link to the gateway (the same one that the interface is using internally) almost always works for me.

            > And if you run IPFS Desktop or similar software on a VPN connection, what's really left to be afraid of, conceptually?

            Yes, but most users are going to download through an interface like this one. The concern is that this will put a legal burden on prominent public gateways once they become the targets of DMCAs and that rights holders may even be able to put pressure on university networks to block IPFS entirely, harming the whole network.

            You seem sure that people downloading or pinning content on the IPFS network are all using a VPN. I'm not so certain of that. The situation is rather similar to Bittorrent, I expect. It's certainly true that Bittorrent as a whole hasn't been killed (nor could it plausibly be), but (a) it's routinely blocked on certain networks, making it harder for certain people to use it even for legal purposes, (b) most people in fact don't use VPNs and rights holders do send takedowns (via ISPs) threatening lawsuits or demanding payments, (c) even on private trackers, VPN use isn't ubiquitous. If any of these trackers reuse public torrent hashes, their users are at risk of being port scanned.

            • dudehere 940 days ago
              Didn't quite get the problem with .fun. Doesn't FF render it properly? The code behind it is pretty conservative, really nothing outstanding compared to the code of 2008. What's your JS version in the browser?

              Real life has shown that users are never hunted. Operators are, since they hold the content. One random user out of a million such users a month, mostly on a protected connection, really should not think anybody will want to find him. It's absolutely infeasible and should only be mentioned as a joke.

  • ramon 941 days ago
    Nice, is it open source? Can we learn from this implementation? Is there any documentation?
  • mijailt 941 days ago
    It's not working for me on Firefox 92.0. It hangs on "Initializing...", with a single error on the console which says

    Loading Worker from “http://libgen.crypto.ipns.localhost:8080/dist/257fb50677e116... was blocked because of a disallowed MIME type (“text/plain”)

  • betwixthewires 941 days ago
    Bravo. This is big stuff.

    There are some good critiques and ideas in this thread I won't go over since it's been done, I have been putting off building something akin to this (not the same thing but somewhat similar) hoping someone more motivated would do it, and I'm very excited to see it happen.

  • unraveller 941 days ago
    It's a fantastic day for digital monks! I just wish the url took a searchable ?query=aristotle on load so you could add it to your browser's search engine list
    • dudehere 941 days ago
      Yeah, there has been a discussion on that matter. Opening such a URL would take as long as the entire bootstrapping plus the search query processing. This is doable but may exhibit inadequate performance.

      I agree with you, though, that regardless of the search time it is valuable to have a linking standard like that.

      • stavros 940 days ago
        Wouldn't #query=Aristotle work and be just as fast?
        • dudehere 940 days ago
          The database still needs to bootstrap first; then the code can search it. The initialization phase has to come before showing a record.
          • stavros 940 days ago
            Sure, that's why the anchor would be faster, because you can use the same initialization code and still accept the query in the URL.
            • dudehere 935 days ago
              An anchor does not solve the bootstrapping issue. You may click a URL with an anchor or without, but the new page in the browser will need to load the database fragment and search over it before showing you the retrieved record. There's no faster way known yet in this implementation.
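For what it's worth, the two positions above are compatible, and a rough sketch may make that clearer. The idea is that a `#query=...` (or `?query=...`) link changes nothing about bootstrapping; it only supplies the search term that runs once bootstrapping is done. The site itself runs in the browser, so this Python version is only an illustration of the logic, and `open_page`, `bootstrap` and `search` are hypothetical names, not anything from the actual libgen.fun code:

```python
from typing import Callable, Optional
from urllib.parse import parse_qs, urlsplit


def query_from_url(url: str) -> Optional[str]:
    """Return the search term from '#query=...' or '?query=...', if present."""
    parts = urlsplit(url)
    # Check the fragment first, then the regular query string.
    for raw in (parts.fragment, parts.query):
        values = parse_qs(raw).get("query")
        if values:
            return values[0]
    return None


def open_page(url: str, bootstrap: Callable, search: Callable):
    """Bootstrap exactly as on a normal load, then run the linked query."""
    db = bootstrap()            # hypothetical: the usual, slow initialization
    term = query_from_url(url)  # cheap: just string parsing
    return search(db, term) if term else None
```

The cost of opening such a link is still dominated by `bootstrap()`, which is the point made above; the link format only saves the user from typing the query by hand.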
  • therealcamino 940 days ago
    Honest question: what do you envision happening if Library Genesis becomes widely known and people looking for e-books make that their first stop instead of Amazon or another store or the public library? In this thread I see discussion about the technical aspects, scientific research aspects, and censorship aspects, but nothing about what the economic effects will be if you're successful.
    • dudehere 940 days ago
      It's a very good question.

      LG should fit in the abyss for the poor, but let the business evolve. A rebalancing from the legal entities will be required, but then a global balance can be established. Even widely known, it should take its place, and businesses their place. The two sides aren't mutually exclusive, but rather complementary.

      Business cannot offer what LG does, and in this frame it is pointless to battle LG.

  • DantesKite 941 days ago
    What does this mean? That if any legal entity ever tried taking Library Genesis down, they'd be unable to?
    • dudehere 941 days ago
      It's a composite design, not everything placed in one location. Even from this primitive explanation it is clear that taking it down is a nontrivial task. And with time this system will only grow stronger without applying more effort, just because the number of supporting participants will grow when they pin the antisite. This will make the blocking formidable.
      • DantesKite 941 days ago
        Thank you for taking the time to explain.
        • dudehere 941 days ago
          You are very welcome.
  • snvzz 941 days ago
    Pretty nice. I have used it without issue to download a few public domain books.

    Relative to the usual website, it is a bit crude. A lot of metadata is missing, thus it is hard to decide which book to download. Particularly annoying are the lack of ISBNs and such, and the inability to click the author and see other books by them.

    The worst point is that files come with no proper filename (just a hash!), thus inviting everyone to rename them in a non-standard manner, rather than offer a filename people won't have to rename.

    • dudehere 941 days ago
      It was necessary to crop the meta info that much to host for free. It's a feature, not a real limit.

      The clickability problem is explained further down this thread; it's the same as making a URL that addresses a book. Not that nice. It's a technological peculiarity.

      The naming may have a solution a bit later. It wasn't clear initially how to approach it.

      • snvzz 941 days ago
        >It was necessary to crop the meta info that much to host for free. It's a feature, not a real limit.

        Maybe have a step of indirection (an extra page) showing the metadata, with the ability to download directly still in the index should the metadata "page" not be fetchable.

        >The clickability problem is explained down this thread, it's the same as making an URL addressing a book. Not that nice. It's a technological peculiarity.

        It's good as long as libgen is aware. To be clear, it's good to be up at all, and it doesn't need to be perfect on the first iteration.

        >The naming may have a solution a bit later. It wasn't clear initially how to approach it.

        Showing a filename somewhere to allow downloaders to manually rename the file to a standard form would be a step in the right direction.

        • dudehere 941 days ago
          Yes, I think issuing multiple free opportunities might give a more complete solution without compromising the unmanned service.

          My browsers actually do offer to rename the files. It might be that yours is set not to prompt.

          • dudehere 940 days ago
            Typo:

            ... using multiple free opportunities (accounts)...

  • hugoroussel 941 days ago
    The website looks down :(

    I love libgen, will switch to your version for sure :)

    • fabianhjr 941 days ago
      It is available over IPFS. Do you have a client installed and running?

      EDIT: if you do have IPFS and the companion extension then libgen.crypto should resolve to something like `http://libgen.crypto.ipns.localhost:8080/` which currently works as advertised.
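A side note on the URL shapes, as a hedged sketch: on a local gateway the name keeps its dots (`libgen.crypto.ipns.localhost:8080`), while on public HTTPS subdomain gateways the name is, as far as I understand the gateway convention, "inlined" into a single DNS label so that a wildcard TLS certificate can cover it: every `-` becomes `--` and every `.` becomes `-`. The helper names below are made up for illustration:

```python
def inline_dnslink(name: str) -> str:
    """Encode a dotted name as a single DNS label for subdomain gateways.

    Assumed convention: '-' becomes '--', then '.' becomes '-'.
    """
    label = name.replace("-", "--").replace(".", "-")
    if len(label) > 63:
        # A single DNS label cannot be longer than 63 characters.
        raise ValueError("inlined name does not fit in one DNS label")
    return label


def subdomain_gateway_url(name: str, gateway_host: str, scheme: str = "https") -> str:
    """Build an IPNS subdomain-gateway URL for the given name."""
    return f"{scheme}://{inline_dnslink(name)}.ipns.{gateway_host}/"
```

With `name="libgen.crypto"` and `gateway_host="dweb.link"` this yields the same `libgen-crypto.ipns.dweb.link` host that the Getting Started guide is linked under elsewhere in this thread.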

    • dudehere 941 days ago
      It's not a website. It is more suitable to call it an antisite, since it has no single location where its content is hosted.

      To access it, you need to use the software given in the description:

      https://libgen.fun/dweb.html

  • nynx 941 days ago
    Pretty neat. Too bad searches take such a long time.
    • dudehere 941 days ago
      Try the various URLs given in the intro in different browsers, not just one combination of URL and browser. Some combinations are much faster, so use those. With the fastest one you find, a search usually takes just a few seconds.

      Also, consider pinning as described on the libgen.crypto antisite itself. If you pin it, the search is going to be instantaneous.

  • ducktective 941 days ago
    > decentralized Web

    I thought the web and internet were decentralized already?

    Interesting project, btw! Many thanks to the devs.

    • dudehere 941 days ago
      You have both options, either to use a convenient Web gateway without decentralization, or to use full P2P access. It's up to you, the code and offered domain names give you all the freedom.
    • superkuh 941 days ago
      I agree. IPFS is more centralized in that 99% of the people that attempt to view data on the IPFS network go through the actual web proxies not IPFS. And when that happens they're easy to take down with legal or traffic attacks.

      Additionally, IPFS's devs have already stated on their community forums that content like sci-hub is not welcome there.

      • sixtyfourbits 941 days ago
        The gateways are indeed centralized (though there are several different public gateways). But anyone who installs IPFS on their computer can access the content directly. Also some browsers support IPFS natively, e.g. https://brave.com/ipfs-support/

        From what I understand, the notion of sci-hub/libgen "not being welcome" was only about discussion on the official forums. See https://discuss.ipfs.io/t/mirror-of-sci-hub-in-ipfs/1613 and https://news.ycombinator.com/item?id=25209246. But IPFS is a protocol just like bittorrent or HTTP, and the software is open source; it doesn't and can't enforce copyright restrictions.

        • superkuh 941 days ago
          >But IPFS is a protocol just like bittorrent or HTTP,

          Yes, but it's a protocol with a centralized single group doing development who can change whatever they want without the users' consent. Take a look at what is happening to the Tor ecosystem on Oct 15th this year: all tor v2 routing support is being dropped from the main client and infrastructure (for security reasons). Entire communities built on onionland and other tor v2 features, as well as all URLs/links, search engine databases, etc, will just go poof when the devs drop support.

          Unfortunately being a protocol isn't enough. It has to be a community protocol, not a proprietary one where everyone follows one group's code. HTTP and bittorrent are safe from these kinds of attacks. IPFS isn't (yet) and that's why their butt-covering anti-sci-hub/libgen stance is worrying.

          • gzer0 941 days ago
            That's being quite unfair to the developers. The entire process has been public and announced well in advance.

            You are welcome to download the Tor source code and add v2 functionality back in, and you’ll be able to visit sites hosted by people who have done the same. No one is stopping you.

            To very quickly summarize why we are deprecating, in one word: Safety. Onion service v2 uses RSA1024 and 80 bit SHA1 (truncated) addresses [1]. It also still uses the TAP [2] handshake which has been entirely removed from Tor for many years now _except_ v2 services. Its simplistic directory system exposes it to a variety of enumeration and location-prediction attacks that give HSDir relays too much power to enumerate or even block v2 services. Finally, v2 services are not being developed nor maintained anymore. Only the most severe security issues are being addressed.

            That being said, the deprecation timeline is now quite simple because v3 has reached a good maturity level:

              * v3 has been the default since Tor 0.3.5.1-alpha.
              * v3 has feature parity with v2.
              * v3 now has Onion Balance support [3]
              * The entire network has supported v3 since the End-of-Life of the 0.2.9.x series earlier this year.
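For readers unfamiliar with the "80 bit SHA1 (truncated)" detail, here is a minimal sketch. As far as I know, a v2 onion address was the base32 encoding of the first 80 bits of the SHA-1 digest of the service's DER-encoded RSA public key. The function name is made up, and the key bytes used to call it would normally be a real DER-encoded key rather than arbitrary bytes:

```python
import base64
import hashlib


def v2_onion_address(der_public_key: bytes) -> str:
    """Derive a v2-style onion address from DER-encoded public key bytes."""
    digest = hashlib.sha1(der_public_key).digest()  # full 160-bit SHA-1 digest
    truncated = digest[:10]                         # keep only the first 80 bits
    # 10 bytes encode to exactly 16 base32 characters, with no padding.
    label = base64.b32encode(truncated).decode("ascii").lower()
    return label + ".onion"
```

The whole address space is therefore only 2**80 values, on top of the 1024-bit RSA key. v3 addresses are 56 characters long because they embed the full ed25519 public key instead of a truncated hash.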
            • a1369209993 941 days ago
              > [1] [2] [3]

              Citation (literally) needed.

          • dudehere 941 days ago
            If somebody wants to run his project his own way, why should it be decentralized? How else would you control direction and quality?

            Everything has its limits. Perfect decentralization of development led to a libgen collapse that not many know about; they think its bigger servers are still the real libgen. They are not, and are mimicking libgen after capturing its vital parts. This happened exactly because of idealistic views that development can be entrusted to different anonymous individuals without regard to centralized management.

            Decentralization implies loss of control. While it's good for results, it's very destructive for development and team building.

      • TacticalCoder 941 days ago
        > Additionally, IPFS's devs have already stated on their community forums that content like sci-hub is not welcome there.

        I'm not familiar with IPFS: is content on IPFS actually moderated somehow?

        • IceWreck 941 days ago
          No, the discussion of "sci-hub on IPFS" is not allowed in their forums.

          There's nothing stopping you from actually doing that, just talk about it somewhere other than their official forums.

        • dudehere 935 days ago
          Public IPFS gateways are typically DMCA-compliant, but nothing stops anyone from renting one for $10/month or setting up their own dedicated IPFS gateway.
      • jazzyjackson 941 days ago
        I think people don’t notice that IPFS is not built for privacy: you are broadcasting that you have possession of a particular hash, not a smart place to be a pirate.
        • zozbot234 941 days ago
          Doesn't torrent have the same issue?
        • chrisco255 941 days ago
          I mean, maybe? It's trivial to add noise to a file to produce a different hash. It's a neutral routing protocol that doesn't make prescriptions one way or the other.
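A small sketch of the "add noise" point, with assumptions stated: IPFS identifies content by a hash of the data (SHA-256 by default, computed over a chunked DAG rather than the raw file), so plain SHA-256 over the bytes is only a stand-in here, and the sample bytes are made up:

```python
import hashlib


def content_digest(data: bytes) -> str:
    """Hex SHA-256 of the raw bytes (a simplified stand-in for a CID)."""
    return hashlib.sha256(data).hexdigest()


original = b"example book contents"
padded = original + b"\x00"  # one byte of "noise" appended to the file

# The two digests are unrelated, so the padded copy gets a different identifier.
same_identifier = content_digest(original) == content_digest(padded)
```

This only shows that a copy can be re-published under a new hash. It does nothing about the point raised above, namely that whoever provides either copy is still announcing that they hold it.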
        • mnd999 941 days ago
          Isn’t this a whole load of pirate ebooks though?
        • dudehere 941 days ago
          You are always welcome to use a VPN. There's no problem with having a plain decentralized network without privacy features.
      • jancsika 941 days ago
        > Additionally, IPFS's devs have already stated on their community forums that content like sci-hub is not welcome there.

        Do you have a link?