CatalyzeX appear to be just posting (or spamming, depending on your point of view) interesting papers/videos to AI/ML subreddits to publicise their Chrome plugin (that "finds and shows implementation code for any... research papers").
The link to the paper is at the top of the post under "paper link"; from there you can click through to the paper on arxiv.org.
Chrome browser extensions aren't really compiled, though I suppose they can be obfuscated. If you install this extension you ought to be able to find it in your Chrome profile directory and look at the manifest and JS source.
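If you want to do that check without reading the whole manifest by eye, here is a rough Python sketch that pulls out the URL patterns an unpacked extension asks for. The keys it reads (`permissions`, `host_permissions`, `optional_permissions`, `content_scripts[].matches`) are standard manifest fields, but the function names and the directory argument are made up for illustration, and where the unpacked extension lives in your profile depends on your OS and Chrome version.

```python
import json
from pathlib import Path


def load_manifest(extension_dir: str) -> dict:
    """Read manifest.json from an unpacked extension directory (path is yours to supply)."""
    return json.loads((Path(extension_dir) / "manifest.json").read_text(encoding="utf-8"))


def requested_hosts(manifest: dict) -> list[str]:
    """Collect every URL match pattern the manifest asks for, in first-seen order."""
    patterns: list[str] = []
    # Manifest V3 lists host patterns under "host_permissions";
    # Manifest V2 mixes them into "permissions" next to API names like "storage".
    for key in ("host_permissions", "permissions", "optional_permissions"):
        for entry in manifest.get(key, []):
            if isinstance(entry, str) and ("://" in entry or entry == "<all_urls>"):
                patterns.append(entry)
    # Content scripts declare the pages they are injected into via "matches".
    for script in manifest.get("content_scripts", []):
        patterns.extend(script.get("matches", []))
    # Drop duplicates while keeping the original order.
    return list(dict.fromkeys(patterns))


if __name__ == "__main__":
    # Hypothetical manifest, not the real CatalyzeX one.
    example = {
        "permissions": ["storage", "*://www.google.com/*"],
        "content_scripts": [{"matches": ["*://arxiv.org/*"], "js": ["content.js"]}],
    }
    print(requested_hosts(example))
```

This only tells you where the extension is allowed to run, not what it does there; for that you still have to read the JS.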
> because a browser extension bypasses any browser security one may have, and has access to one's entire computer.
This isn't true. They don't have any access outside of the browser with that extension. Further, the list of sites that they've asked for permissions to seems vaguely reasonable. I have not actually looked at the extension code to see what they're doing, but they have only asked for permissions to the following:
And unfortunately the extension security model isn't all that granular, so "Read and change your data" really just means they wanted to touch the DOM somewhere on those sites. They could be recording data, but it's more likely they wanted to add a button/link somewhere.
It means they can view/modify the DOM of those sites.
Lots of harmless reasons to want to do that (ex: adding a link/button) but yes, it also means they can grab the text content of the site (read your data) and change the site DOM or your text nodes (change your data).
Just because they can doesn't mean they do, though (but it also doesn't mean they don't... shrug).
Again - the problem here is there's absolutely no way to tell the browser "Hey, I don't want to see text nodes, just put this button here in the DOM".
I should add - they appear to have only requested www.google.com, which is usually just search. Gmail/Docs/Sheets/Suite/Other are usually under separate subdomains - ex: gmail is mail.google.com.
I mean - yes, although there's no real reason to even bother with the fake login form. They could just monitor the real login form the next time you log in.
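To make the subdomain point concrete, here is a simplified Python sketch of how a host in an extension match pattern is compared against a URL. It is my own approximation of the documented rules (exact host, `*.` prefix for subdomains, bare `*` for any host); it ignores the path part and special cases like `<all_urls>`, and `host_matches` is a made-up name, not a Chrome API.

```python
from urllib.parse import urlsplit


def host_matches(pattern: str, url: str) -> bool:
    """Check a URL's scheme and host against a match pattern (path is ignored)."""
    scheme_pat, _, rest = pattern.partition("://")
    host_pat = rest.split("/", 1)[0]
    parts = urlsplit(url)
    host = parts.hostname or ""

    # A "*" scheme is treated here as "http or https".
    if scheme_pat == "*":
        if parts.scheme not in ("http", "https"):
            return False
    elif scheme_pat != parts.scheme:
        return False

    if host_pat == "*":
        return True  # any host
    if host_pat.startswith("*."):
        base = host_pat[2:]
        # "*.google.com" covers google.com itself and every subdomain.
        return host == base or host.endswith("." + base)
    return host == host_pat  # no wildcard: the host must be identical


if __name__ == "__main__":
    pattern = "*://www.google.com/*"
    for url in ("https://www.google.com/search?q=x",
                "https://mail.google.com/",
                "https://accounts.google.com/"):
        print(url, host_matches(pattern, url))
```

Under these rules a pattern for www.google.com does not cover mail.google.com or accounts.google.com; it would take something like `*://*.google.com/*` to reach those.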
That said - at least for google, login is on accounts.google.com - so they aren't asking for that here.
Additionally, these extensions do go through review by Google - something as blatant as a content_script that's phoning home with login details would ideally be caught (I develop extensions for work, but haven't tried submitting something malicious for review - so I can't really comment on whether they DO actually catch it).
Hi, co-author of CatalyzeX/the extension here! Most of the comments here seem to have been addressed, just re-iterating the following:
- The extension lets you easily jump to the code for papers (the papers are all open-access on the web)
- The extension code is not obfuscated so it's easy to check out the underlying code for yourself (just search online for how to view extension source code). We just haven't officially open-sourced it as of now as it still needs a fair bit of cleaning up and optimizing
- What we're using the browser permissions for is to add code buttons in-line on the webpage(s) you're on. Permissions offered by Chromium browsers are unfortunately not more granular than what we're using to simply update the DOM.
FWIW, in the latest Chrome, if an extension wants access to all websites, you can toggle it to be "click to enable", or whitelist only a few websites where it is allowed to run. Go to the extensions page -> choose the extension and modify where it can run.
At least... I'm guessing. Regrettably, the details of the bug link back to some internal Atlassian instance that Mozilla seems to be keeping. Same with related bugs.
This is a little surprising to me, since all of Mozilla's development tracking used to be on Bugzilla. I hope this isn't a move to something more like Safari or Microsoft, where issues vanish into an internal system with external updates ignored.
> This really smells fishy, because a browser extension bypasses any browser security one may have, and has access to one's entire computer.
IIRC that one is no longer true since the advent of sandboxing (and the end of native-code extensions via NPAPI and friends); all permissions now have to be granted by the user explicitly.
> What valid reasoning could there be to justify a browser extension for this purpose?
Given the permissions it asks for (access to catalyzex.com, arXiv, Google Scholar, Twitter and Google) and the description, I'd guess that whenever you search for some research paper it will forward the search to CatalyzeX and annotate the search results.
Unfortunately, the access to Google and Twitter can also be used to exfiltrate your credentials or to commit actions on your behalf there, so I'd be very careful. Too bad the Chrome extension store (unlike the Firefox extension store) does not allow you to directly download the extension to examine it, or to prevent automatic updates.
This is correct! :) Also, the extension's source code is not obfuscated so it's possible to download, extract, and go through the readable code.
If anyone's keen, just Google or Bing search to see how to do so (or lmk!). We haven't officially open-sourced as of now as the code still needs cleaning up, refactoring, etc :)
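For anyone who'd rather script the "extract" step: as far as I understand the format, a .crx file is a ZIP archive with a small header in front ("Cr24" magic, a version number, then length fields). Below is a minimal Python sketch under that assumption. The function names are made up, and how you get hold of the .crx file is left to you.

```python
import io
import struct
import zipfile


def crx_to_zip_bytes(crx: bytes) -> bytes:
    """Strip the CRX header and return the embedded ZIP archive bytes."""
    if crx[:4] != b"Cr24":
        raise ValueError("not a CRX file (missing 'Cr24' magic)")
    (version,) = struct.unpack_from("<I", crx, 4)
    if version == 3:
        # CRX3: magic, version, header length, header, then the ZIP.
        (header_len,) = struct.unpack_from("<I", crx, 8)
        start = 12 + header_len
    elif version == 2:
        # CRX2: magic, version, key length, signature length, key, signature, ZIP.
        key_len, sig_len = struct.unpack_from("<II", crx, 8)
        start = 16 + key_len + sig_len
    else:
        raise ValueError(f"unsupported CRX version: {version}")
    return crx[start:]


def list_files(crx: bytes) -> list[str]:
    """List the file names packed inside a CRX."""
    with zipfile.ZipFile(io.BytesIO(crx_to_zip_bytes(crx))) as zf:
        return zf.namelist()


def extract(crx_path: str, out_dir: str) -> None:
    """Unpack a CRX file on disk into out_dir so the JS can be read."""
    with open(crx_path, "rb") as fh:
        data = crx_to_zip_bytes(fh.read())
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        zf.extractall(out_dir)
```

Once it's unpacked, manifest.json and the content scripts are plain text, so you can read exactly what runs on each site.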
Other issues aside, CatalyzeX has annoyed me for a long time. The creators of the extension used to spam almost all ML/DL related social media. Maybe they still do, I don't know cos now I've stopped following all of those platforms.
Why did it annoy me so much? - They would share the popular, latest papers that most people in the ML sphere would anyways come across if they follow academia twitter or ML subreddit. That is fine in itself but no, those posts would hijack the original source (mostly arxiv or respective project page) by taking you to the CatalyzeX website which is mainly designed to drive traffic to it and has all sorts of irritating design patterns. Mostly, it just felt dishonest and a blatant shadowing of the original authors' hard work to me.