MuPDF WASM Viewer Demo

(mupdf.com)

318 points | by aragonite 12 days ago

17 comments

  • keepamovin 12 days ago
    This has excellent performance, and is incredibly fast. Well done!

    Mutools is fantastic for PDF. I use it as a backup converter when imagemagick fails in my document viewer: https://github.com/dosyago/chai

    • supernes 12 days ago
      This WASM "demo" is literally the only viewer on Windows I've found that can handle documents like the Baldur's Gate 3 Artbook (around 300MB of high-res artworks in PDF format). Native and browser-based viewers all choke even on a decent system with lots of memory.
      • Modified3019 12 days ago
        I’m curious, have you tried SumatraPDF (uses muPDF under the hood)?

        https://github.com/sumatrapdfreader/sumatrapdf

      • eviks 11 days ago
        That should literally be impossibly surprising

        Checked Sumatrapdf on Windows and it's better - while on some fast mouse wheel scrolling the web version is more responsive, but then it doesn't adjust text quality, so shows blurry version (though zooming seems to fix it) while the local version always shows high quality text (sometimes after a delay)

        Pageup/down scroll is without a delay locally but it scrolls through blank pages, so I guess mouse scroll just always does quality rendering, thus it's the only operation that's slower locally

        So while pages load faster in the web version, they are of much worse quality initially

      • remram 12 days ago
        Even the MuPDF native app fails?
      • snvzz 12 days ago
        Even okular?
    • hn_acker 12 days ago
      Off-topic: Chai's license seems to be proprietary now. 7 months ago the license was AGPL as given by the LICENSE and README.md files [1][2]. 1 month ago, the LICENSE file was changed to the text of the PolyForm Noncommercial License 1.0.0 [3], but the README.md still says AGPL. If Chai continues to use MuPDF (AGPL [4]), then isn't Chai's new license contradictory (unless the Chai developers got a license exception from Artifex Software)?

      [1] https://github.com/dosyago/chai/blob/a6b7fb50647ae001185bdc8...

      [2] https://github.com/dosyago/chai/blob/a6b7fb50647ae001185bdc8...

      [3] https://github.com/dosyago/chai/commit/45da5f12ab8a817dc4f74...

      [4] https://github.com/ArtifexSoftware/mupdf/blob/master/COPYING

      • keepamovin 12 days ago
        Don't worry about it, you're confused, it's okay. Hahahaha! :) AGPL requires you release the source and we do that.

        As an aside — would be interesting to get Artifex's comment on this — but I'm not even sure it applies as we install the mutool binary via apt, call it from bash, and we don't modify or use their libraries at all. Would this even need to comply with AGPL at all? I don't know.

        If you'd carried on your search of our source a little bit further, you'd see we use the mutool binary: https://github.com/dosyago/chai/blob/37c1a1ec0941d81e0d6f8af... ; and you also may have discovered what the AGPL means: https://artifex.com/licensing/agpl/ Hahahaha! :)
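
For readers wondering what "calling the mutool binary" looks like in practice, here is a minimal sketch of invoking it as a separate process. It is not chai's actual code: the function names, the `page-%d.png` output pattern, and the 150 DPI default are illustrative assumptions, and it says nothing about how the license applies.

```python
import subprocess
from pathlib import Path


def build_mutool_draw_command(pdf_path, out_dir, dpi=150):
    """Build the argument list for rendering every page of a PDF to PNG.

    Hypothetical helper: the output file naming is an assumption.
    """
    # mutool replaces %d in the output name with the page number.
    out_pattern = str(Path(out_dir) / "page-%d.png")
    return ["mutool", "draw", "-r", str(dpi), "-o", out_pattern, str(pdf_path)]


def render_pdf(pdf_path, out_dir, dpi=150):
    """Run mutool as an external process; requires mutool on PATH."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    cmd = build_mutool_draw_command(pdf_path, out_dir, dpi)
    # check=True raises CalledProcessError if mutool exits with an error.
    subprocess.run(cmd, check=True)
    return cmd
```

The caller only exchanges files and command-line arguments with mutool; nothing from MuPDF is linked into the calling program.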

        • lionkor 11 days ago
          I would highly suggest you read the actual license text, because it's a lot more than "just share the source".
          • brewmarche 11 days ago
            Looks like they’re calling the mupdf CLI, which makes chai not a derived work of mupdf IIUC. This would mean that AGPL applies to mupdf only.
            • keepamovin 10 days ago
              Hahaha! :) Thank you! This is correct!

              Appreciate you droppin by with your dose of clarity! Hahaha :)

              • keepamovin 9 days ago
                I don't know why our threads have been covered with concern trolling around licensing regarding anything to do with our products, trying to provoke confusion and mistrust!!! Hahaha :)

                Actually I think I do know: there are a couple of competitors to BrowserBox operating around the world, one based in Europe, one in the Americas, that all use a similar WebRTC video streaming method out of containerized headful Chrome with getUserMedia extensions.

                Our method is different, but lower resource usage, and more customizable. Theirs requires GPUs and has higher base costs. But that's not the thing they're mad about: it's actually much more inflexible because they are not instrumenting headless Chrome, they are just streaming the viewport of headful using xvfb.

                They can't 'downgrade' off GPUs because their whole video codec depends on it, and they don't / can't control if the browser is paused via an alert modal, doing a basic auth prompt, downloading a file. Even multiple tabs are a major challenge for that method, and mobile form factors? Basically a non-starter.

                These competitors resent our flexibility and are jealous of our larger control of the browser that enables us to more easily deliver all these things. They're concerned that implementing the same will run them afoul of our codebase, and they actually hoped to use/test/deploy BrowserBox when it was open-source, but became furious when we made it require a paid license for commercial use.

                Consequently, they've been acting shady across a range of threads, trying to "not compete" with us by using concern trolling in an attempt to tarnish our reputation. Shadily and dodgily not competing but being abusive and dishonest!

                The sad thing is: I like their products! I respect their technical accomplishments and admit they have better quality video streaming! At least right now -- but we never optimized for that.

                The issue is we can "snap in" a video codec layer whenever we please, if we want. They have 'run the experiment' and proved it is indeed possible to achieve real-time interactive streaming at relatively low latency, albeit higher fixed resource cost. This is appropriate for some applications!

                It's just that the method they chose has less control and customizability, and they cannot "downgrade" out of their rigid set up because they have no alternative. They were lured by the promise of higher quality streaming into a more inflexible architecture, while we pursued the "low end" of virtual browsers for automation which ended up giving us abundant control, and low latency streaming that's more flexible overall, not just across devices, but on low resource situations, while also performing very well at the high end.

                They have really ramped up their shady tactics since we launched CloudTabs and integrated with Puter. CloudTabs is our BrowserBox demo Saas, that was just meant to be a big funnel for licensees but ended up growing independently. Since launching 18 days ago we already have 6500+ users, just through Puter. We have a bunch more going straight to CloudTabs. It's crazy. Nothing can stop this train! Not even the lies of phoney 'competitors' acting shadily from the corners instead of actually competing with integrity! Hahaha!!! :)

          • keepamovin 11 days ago
            Hahaha! :) You want to pretend we're doing something wrong, right? Okay you read it and point out something. Hahahaha! :) You know your quoted "just share the source" comes from their page, right? Hahaha! :)
            • lionkor 10 days ago
              You need to share the source under the same license, you need to publish changes under the same license, and there are quite a few gotchas in relation to who you have to publish it to (not anyone who asks).
              • keepamovin 10 days ago
                Haha! :) Oh I get you're confused. I was confused a lot about licenses when I started out. I'm probably still confused a lot! Hahaha :)

                Another commenter has dealt with that:

                https://news.ycombinator.com/item?id=40105915

                This usage is considered "at arms length".

                If you're interested in chai, I encourage you to check out a way it's being used for real in this live demo of BrowserBox / CloudTabs: https://browse.cloudtabs.net/signupless_sessions

                Search for a docx, PDF file whatever and you can convert to images without ever downloading to your device. We've got 6000+ users in 17 days on our Puter Browser app: https://puter.com/app/cloudtabs-browserbox

                We have a lot of exciting things coming soon, too. :)

            • mfru 11 days ago
              Hahaha!
    • adrian_b 12 days ago
      Indeed, this appears to have the same advantage over other viewers as the standalone MuPDF, i.e. much greater rendering and navigating speed.

      MuPDF is my main PDF viewer, due to its unmatched performance and good full-screen UX, even if I sometimes encounter PDF files that cannot be rendered by MuPDF, when I have to fall back to other viewers, e.g. Okular.

      • keepamovin 12 days ago
        Coming from someone who is clearly in academia / research adjacent (!! judging by your comments) this is high praise!! PDFs are close to currency of the realm. Haha! :)
      • Modified3019 12 days ago
        The developers of muPDF would likely be interested in the files that are rendered incorrectly so that can be fixed
    • leononame 12 days ago
      The performance is astonishing. On my underpowered android, the PDF was super smooth. It's miles ahead of other PDF viewers (the worst offender is the Google Docs in browser pdf viewer, it's just horrible on my phone to a point where I refuse to even look at those documents on my phone). Really impressive
    • hatch_q 12 days ago
      I don't see it having any better performance than integrated chrome pdf viewer. Furthermore, with it using wasm i'd expected it to have custom renderer, but it's just pdf to html converter.

      And loading times are quite bad (10 times slower - compared to firefox or chrome pdf viewers).

      • keepamovin 12 days ago
        Loading the mupdf.js bundle is slow right now. When I checked it out it was super fast. Guess it's a server/ratelimit/caching issue with the HN hug being top of front page.

        Which is what I guess you mean about 10x slower -- so you're making an unfair comparison as you're counting the network at peak, whereas browser plugins load from disk or memory.

        But I actually thought the load of the PDF (once the app was loaded) was, for MuPDF.js, slightly faster than the browser plugin. When I watched it, tho I have not benched it. Do you have any benchmarks?

      • SiempreViernes 12 days ago
        I downloaded this file https://indico.cta-observatory.org/event/5245/contributions/... and tried timing how long it took for the standard firefox vs this MuPDF viewer to render the first slide and there is like at least 3 second difference.

        As others in the thread also report significant speed gains I think you either have some weird issue with your setup or how you measure performance.

      • CryZe 12 days ago
        It uses a custom renderer, which seems to just blit its image data onto a canvas. The HTML is just there so you can actually select the text.
        • dsp_person 12 days ago
          Does this generally satisfy accessibility needs too?
    • bramhaag 12 days ago
      [flagged]
      • keepamovin 12 days ago
        Hahaha! :) Which words would you have used a thesaurus for? I don't see anything that complex there.
  • bebnel 12 days ago
    Tried this on Firefox for Android but the table of contents takes up half the screen width ways, and I couldn't figure out how to get rid of it. Plus, it froze the browser when I backed out of the page.

    Also it's quite slow to load the WASM, about five seconds before it starts processing the PDF.

    This is on a fairly recent mid-range Samsung phone (Galaxy A52s 5G).

    Edit: turns out it's View -> Outline to remove the contents pane. There's no "fit to page" option so I still couldn't see the whole page - the 50% zoom out option wasn't sufficient to see everything corner to corner.

    • Da5idBlackSun 12 days ago
      You can get rid of it by clicking outline in the menu. Then it works surprisingly fast!
    • SiempreViernes 12 days ago
      Yeah, it needs a "fit page" option. But as a first example it's very impressive.
  • MaxLeiter 12 days ago
    Very cool project. I noticed in the dependencies section it's using its own JS interpreter: https://mujs.com/
  • filmor 11 days ago
    If webOS came out with WASM already in place, it would have been even better :)

    When the Touchpad was firesaled, I got one and was disappointed with the PDF viewer. Because there was no such thing as WASM (or even asm.js) at the time, it used out of the box a service provided by Adobe that on request from the UI rendered tiles of the PDF at different resolutions, depending on the zoom level.

    Since the frontend code was JS, it was easy to implement an alternative via mupdf (https://github.com/filmor/webos-pdf/blob/master/arxservice.c...). Via the same inefficient process (rendering png tiles onto the filesystem), the mupdf implementation was about 3 times faster than the original (though, it's been 13 years, the actual speedup might have been less :)).
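
As a rough illustration of the tile approach described above (rendering a page as fixed-size tiles whose pixel dimensions depend on the zoom level), here is a small sketch. It is not taken from the webOS service or from mupdf; the function name, the 256-pixel tile size, and the convention of one pixel per point at zoom 1.0 are all assumptions.

```python
import math


def tile_grid(page_width_pt, page_height_pt, zoom, tile_size_px=256):
    """Return (left, top, width, height) pixel rectangles covering a page.

    Assumes 1 point maps to 1 pixel at zoom 1.0; edge tiles are clipped
    so the rectangles cover the page exactly once.
    """
    width_px = math.ceil(page_width_pt * zoom)
    height_px = math.ceil(page_height_pt * zoom)
    tiles = []
    for top in range(0, height_px, tile_size_px):
        for left in range(0, width_px, tile_size_px):
            tiles.append((
                left,
                top,
                min(tile_size_px, width_px - left),
                min(tile_size_px, height_px - top),
            ))
    return tiles
```

A viewer would then request only the tiles that intersect the visible area, and ask for a fresh grid whenever the zoom level changes.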

  • btown 12 days ago
    Very cool! But this being AGPL has me wonder: if you as a user download an AGPL licensed WASM binary, because the browser is making network connections by design, are you required to share the source with any third party your browser makes any request to?

    And if your browser has proprietary/not “generally available” compiled/minified code loaded, from Widevine to your corporate Chrome extension, are you in violation of AGPL if you don’t share all the sources to all those things, which by law you cannot have?

    Not a lawyer, but the idea of AGPL WASM blobs gives me shivers.

    • graemep 12 days ago
      > Very cool! But this being AGPL has me wonder: if you as a user download an AGPL licensed WASM binary, because the browser is making network connections by design, are you required to share the source with any third party your browser makes any request to?

      No. Clearly not. A browser making connections to servers is not the same as a "user interacting with it remotely".

      > And if your browser has proprietary/not “generally available” compiled/minified code loaded, from Widevine to your corporate Chrome extension, are you in violation of AGPL if you don’t share all the sources to all those things, which by law you cannot have?

      Only if you distributed the browser with AGPL code in it. If it just runs WASM code it's no different from any interpreter running GPL code.

    • pessimizer 12 days ago
      > are you required to share the source with any third party your browser makes any request to?

      The source to your changes to the binary, you mean, maybe? I don't understand the question. You seem to be implying that using an AGPL WASM binary would require you to share the browser's source. If that binary were a network application that was serving other clients, it makes sense to me that you'd be required to share modifications that you made to that binary, but I have no idea of how you would include the entire browser in that calculation.

      If there's anything scary, I'd say it would be that if you were serving a WASM blob to someone's browser, you'd have to be prepared to also distribute the source (and changes) to the binary if it was AGPL licensed.

    • p4bl0 12 days ago
      I'm sure it's not on purpose, but that sounds like fear-mongering against the AGPL license. None of the concerns raised here are remotely true. No worries :).
  • ttul 12 days ago
    A WASM-based PDF viewer may have security advantages over a native PDF viewer such as the viewer embedded in Safari or Chrome. I wonder if anyone has put MuPDF into a browser extension?
    • kevincox 12 days ago
      IIRC Chrome's PDF viewer is built upon NaCl which is basically a precursor to WASM. So it has lots of the same benefits and is basically running inside the web sandbox already. Until Firefox shipped PDF.js it was likely the most secure widely used PDF viewer due to this layered architecture.

      Of course NaCl is no longer available to web clients, so I don't know what the exact state of the Chromium PDF viewer is. Is NaCl maintained just for it or is it using some other sandbox now?

  • pama 12 days ago
    FYI: This does not work with iPhone lockdown mode.
    • azakai 12 days ago
      Lockdown mode disables WebAssembly, which this uses.

      (But it could be built with the Emscripten flag to translate the wasm to JS, which would work, but would be slower.)

      • eviks 11 days ago
        how much slower would that be?
        • azakai 11 days ago
          Usually around 2x slower execution. Load times would be even worse.
    • solardev 12 days ago
      What is iPhone lockdown mode?
  • felipefar 12 days ago
    Besides opening PDFs (and despite the project's name), MuPDF can also read EPUBs, but currently this WASM viewer can't open them. They must have had to reduce functionality of the library to port faster to WASM.

    But I wonder if there are any intrinsic issues with displaying EPUBs using WASM?

    • woodson 12 days ago
      Since EPUBs are just zipped HTML files, I guess rendering can be done more efficiently by the browser itself rather than by custom rendering on an HTML canvas.
      • ffsm8 11 days ago
        I wrote an epub reader a few years ago and this is false.

        most epubs are just zipped html, but not all.

        There are different versions of this file format and some need to be parsed as XML. The chapter files will mostly adhere to HTML, with caveats around anchor tags, image sources and similar, and the metadata won't be parsed by (or work with) HTML parsers.

        • felipefar 11 days ago
          Can't you preprocess them with a custom parser and then hand over a conformant HTML to the browser?
          • ffsm8 11 days ago
            Yes, but while that'd be a lot easier for epub (vs PDF), you can fundamentally do that for any other format too.

            You could even - technically - take a picture as an input and then render it via background color rgb() using divs pixel-by-pixel. That'd still fall under that description.

      • felipefar 12 days ago
        You're right. In that case you could implement the unzipping and metadata parsing in WASM and just hand the HTML to the browser to be rendered.

        It's still useful since browsers don't come with built-in EPUB support.
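
The "unzip and parse the metadata, then hand the HTML to the browser" idea can be sketched in a few lines. This is only an illustration in Python using the standard library, not how MuPDF handles EPUB: the function name is hypothetical, and it assumes a well-formed container and package file with no encryption or fallback handling.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def spine_documents(epub_bytes: bytes) -> list[str]:
    """Return archive paths of the content documents in reading order."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        # container.xml points at the package (OPF) file.
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        opf_path = rootfile.attrib["full-path"]
        package = ET.fromstring(archive.read(opf_path))

    # The manifest maps item ids to hrefs; the spine lists ids in reading order.
    manifest = {
        item.attrib["id"]: item.attrib["href"]
        for item in package.findall("opf:manifest/opf:item", OPF_NS)
    }
    base = posixpath.dirname(opf_path)  # hrefs are relative to the OPF file
    return [
        posixpath.join(base, manifest[ref.attrib["idref"]])
        for ref in package.findall("opf:spine/opf:itemref", OPF_NS)
    ]
```

Note that the metadata is parsed as XML here, which matches the point made elsewhere in the thread that an HTML parser alone is not enough.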

  • khimaros 12 days ago
    "Loading MuPDF.js" indefinitely on Mull for Android (installed from F-Droid)
  • cess11 12 days ago
    That's pretty neat. I could use something like this professionally, might actually 'contact sales' and see what kind of money they expect.
    • merb 10 days ago
      The last time I contacted them, the money was not my problem. However the problem was that the license used a fork instead of the open source code, which meant we needed to use binaries from artifex which was exactly what we wanted to avoid.
    • reaperman 12 days ago
      What is your application? I'm curious why AGPL wouldn't be a reasonable choice. Seems like you could modify it and link it in a way that doesn't trigger AGPL on anything except your modifications to MuPDF itself, which don't seem like they'd be too valuable to give away in most cases.
      • wwarner 12 days ago
        What is the point of a dual license if you can't pay for the commercial rights which guarantee that you're not infringing on the copyleft license?

        Whoever I spoke to at Artifex [0] several years ago told me the terms of the commercial license were that if any output of muPDF were visible to the public, my company would owe $10k/mo plus some share of the revenue of the company. Unlimited internal use fell under the AGPL and was therefore free.

        Btw, the software is incredible, it was a shame I couldn't use it!

        [0] https://mupdf.com/licensing/index.html#commercial

        • reaperman 12 days ago
          Okay reading all of that link, in totality it's getting fairly ridiculous.
        • cess11 12 days ago
          Thanks for sharing. That's kinda pricey, in the vicinity of 1-2 employees.
      • cess11 12 days ago
        I'm in two markets, both heavily reliant on PDF rendering, viewing and distribution. Commonly PDF/A, which isn't very popular so having to adapt libraries happens every now and then. We're both into desktop applications and network services.

        While I'm quite FOSS positive, a lot around GPL style licensing hasn't been tested in court and I don't want to be a pioneer in this area. Besides the business aspects of having to maybe publish some of the 'secret sauce', of course.

    • wwarner 12 days ago
      I tried paying for muPDF a few years ago and it was way way more than my company could afford. Had to use a different system that was not dual licensed.
  • lofaszvanitt 12 days ago
    It has issues rendering the CIS_Ubuntu_Linux_22.04_LTS_Benchmark_v1.0.0 file. Lags like hell, it's ok with other files.
  • solardev 12 days ago
    Is there no mobile view? The sidebar stays open and the page width can't be shrunken with a pinch to zoom.
  • tuananh 11 days ago
    tried this on firefox (v125) and got this error in console

    CompileError: at offset 0: failed to match magic number
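
For context, this error generally means the bytes handed to the WebAssembly compiler do not begin with the four-byte magic number `\0asm`, for example because the server returned something other than the .wasm file. The actual cause in this report is unknown. A minimal sketch of the check, with hypothetical helper names:

```python
WASM_MAGIC = b"\x00asm"  # the first four bytes of every WebAssembly binary


def looks_like_wasm(data: bytes) -> bool:
    """True if the payload starts with the WebAssembly magic number."""
    return data[:4] == WASM_MAGIC


def describe_payload(data: bytes) -> str:
    """Guess what a downloaded payload is from its leading bytes."""
    if looks_like_wasm(data):
        # The four bytes after the magic number hold the little-endian version.
        version = int.from_bytes(data[4:8], "little")
        return f"wasm binary, version {version}"
    head = data[:15].lstrip().lower()
    if head.startswith(b"<!doctype") or head.startswith(b"<html"):
        return "HTML page (possibly an error or redirect response)"
    if data[:2] == b"\x1f\x8b":
        return "gzip-compressed data"
    return "unknown data"
```

Running this against the response body saved from the browser's network tab would show whether the download is a real wasm module.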

  • rtcode_io 12 days ago
    [flagged]
    • madmoose 12 days ago
      On the other hand, about a decade ago I sent them a feature patch to support multipage tiff images and they sent me a Christmas card for a couple of years, so my impression of them is pretty good.
    • ktosobcy 12 days ago
      > No need for this! They don't have a permissive license

      TBH I'm split on this - they would be awesome if not for greedy big-corps that often take complete advantage of the FOSS without giving anything back :/

      • janice1999 12 days ago
        MuPDF has a dual license. You can pay to use it commercially. To me that seems the best of both worlds. The project and developers get a revenue stream and the open source community continues to benefit from the code.
        • rtcode_io 12 days ago
          No, thanks. I don't tolerate rudeness. My case is closed.

          PDFium exists and my components already use it.

          • rice7th 12 days ago
            Again, this is not rude at all. You should get a dictionary and search for the meaning of the words you intend to say before you type them (this last sentence was deliberately rude so you can see the difference between the comments)
            • rtcode_io 12 days ago
              If I ask a question out of curiosity and the developer makes baseless assumptions of criminal conduct and forwards me to some clueless assistant instead, it's very, extremely, utterly rude!

              No one needs to put up with such treatment. We have alternatives that are leagues ahead.

              • rice7th 12 days ago
                No it's not. That is not the definition of rudeness. If it told you to fuck off, then that's rude, but if it told you that they will take legal action against you, then that is not rude, simple as.
                • rtcode_io 12 days ago
                  He can't do anything against me. I was simply looking for a WASM PDF renderer. I have not touched their sources with a pole. They had just removed WASM support from their public repository. I did ask the developer why. He then used my email domain to go visit my website and jump to conclusions all without asking me!!! Now I am old enough to have a working definition of rudeness and it is not artificially narrow. I responded to his secretary and asked for an apology, got nothing in return except the dev admitting he thought I was young and ignorant.

                  I then helped with the PDFium WASM builds instead: https://github.com/paulocoutinhox/pdfium-lib/issues?q=is%3Ai...

                  • kapildev 12 days ago
                    Your link uses `author @me` which is why I don't see anything.

                    Here is what is in the search box: `is:issue author:@me`

                • gala8y 12 days ago
                  Yes, it is.
        • throw7381 12 days ago
          Does anyone know the cost of the commercial licence?
      • rtcode_io 12 days ago
        I do love all things GPL! Call my distros GNU/Linux.

        They hide behind GPL as a tool to assume a weird sense of false righteousness, and see themselves entitled to make groundless accusations!

    • bramhaag 12 days ago
      > They don't have a permissive license

      Good

      > and are generally very rude.

      Any examples of this?

    • wheybags 12 days ago
      They may or may not have acted like assholes, but regardless - this comment is not making you look good.
      • rtcode_io 12 days ago
        I am letting people know of permissive alternatives (https://github.com/paulocoutinhox/pdfium-lib) and their usage in web components (PDFium.wasm + PDF.js).

        My comment also serves as a promise to open-source my components under the same permissive license.

        I don't want people exposed to unnecessary stress.

    • janice1999 12 days ago
      Verifying license compliance is not "unprofessional", it's literally the opposite. Work in industry long enough and you will see it's a good thing when people are up front about licenses and pro-active about ensuring it's not misused.
      • rtcode_io 12 days ago
        I have been working for 12 years. You should see those emails before you make such comments. I had only asked them why WASM builds were taken out of the public repository, the dev just assumed I was using Mu and told on me to his secretary!? He then tried explaining himself saying he assumed I was young and lacked experience, respect for licensing. Just like you have done here with your comment yourself!
        • rice7th 12 days ago
          Janice wasn't rude at all in their comment, suggesting that you have a distorted idea of what rude is, apparently confusing rudeness with meticulousness.
          • rtcode_io 12 days ago
            I don't think Janice was rude. Mu people were rude, and Janice is making the same baseless assumptions they were making.
    • skywhopper 12 days ago
      Yeah, people who are rude, condescending, and uptight about licensing are a real drag, huh?
      • rtcode_io 12 days ago
        I was not using Mu anything anywhere! I was looking for a WASM renderer (then found PDFium and helped with its compilation).

        The dev just assumed I might be and replied to me as if I were 5 and deserved scolding o_O.