Show HN: Curated, Pre-trained ML Models for Transfer Learning

(modeldepot.io)

324 points | by hsikka 2263 days ago

9 comments

minimaxir 2263 days ago
Two of my projects, textgenrnn (https://github.com/minimaxir/textgenrnn) and reactionrnn (https://github.com/minimaxir/reactionrnn) are among the pretrained models.
While this is allowed by the MIT License (and there is sufficient attribution to the source repos), it might be helpful to state more explicitly that the curated models are forks/modifications. These models also depend on the source packages, and I can't promise those packages won't have breaking changes if I do decide to update them.
I do like the new accompanying examples for those projects, and it's good to see the projects actually being used! :)
- rm999 2263 days ago
  Hey, I just installed and played around with reactionRNN; I hate to be negative (especially to people open sourcing models, kudos!), but your model seems to perform quite poorly. It immediately failed my "easy" smell tests: https://i.imgur.com/FvfuZgy.png, and didn't really work on most of my other tests: "This book sucks" is 0% angry, "I'm going to go home and listen to emo music and cry" is 0% sad, "Check out this hilarious youtube video" is only 26% haha. Your example "He was only 41." is 100% sad, but "He was only 42." is 0% sad. These aren't hand-picked, these are literally things I just typed in. From what I can tell it usually gets anything negative wrong, and usually picks "haha" for the positive ones.
  Subjective performance is worse than your included examples. I've been building models my whole career, and what I've learned is most people will take claimed performance at face value until it burns them. It's beneficial to no one if someone comes up with an idea based off your repo description, builds it out, then finds it doesn't work adequately. My advice is to update your examples and test cases, and keep finding ways to improve the model.
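A smell test like the one described above can be scripted so it runs on every model update. The following is a minimal sketch, not reactionrnn's actual API: `run_smell_tests`, `stub_predict`, and the 0.5 threshold are hypothetical names and values chosen for illustration, and the stub stands in for whatever predictor is being checked.

```python
from typing import Callable, Dict, List, Tuple

# A predictor maps text to reaction scores, e.g. {"sad": 0.9, "haha": 0.1}.
Predictor = Callable[[str], Dict[str, float]]


def run_smell_tests(
    predict: Predictor,
    cases: List[Tuple[str, str]],
    min_score: float = 0.5,
) -> List[Tuple[str, str, float]]:
    """Return (text, expected_label, score) for every case scoring below min_score."""
    failures = []
    for text, expected in cases:
        score = predict(text).get(expected, 0.0)
        if score < min_score:
            failures.append((text, expected, score))
    return failures


def stub_predict(text: str) -> Dict[str, float]:
    """Hypothetical stand-in for a real model; it only reacts to one keyword."""
    if "hilarious" in text.lower():
        return {"haha": 0.9, "sad": 0.0, "angry": 0.1}
    return {"haha": 0.6, "sad": 0.2, "angry": 0.2}


# Test sentences taken from the comment above, paired with the expected reaction.
CASES = [
    ("This book sucks", "angry"),
    ("I'm going to go home and listen to emo music and cry", "sad"),
    ("Check out this hilarious youtube video", "haha"),
]

failures = run_smell_tests(stub_predict, CASES)
for text, expected, score in failures:
    print(f"FAIL expected={expected} score={score:.2f} text={text!r}")
```

Swapping `stub_predict` for a real model's prediction function keeps the same cases reusable as a regression check.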
  - minimaxir 2263 days ago
    The result is the reaction to a given text; it won't fully be the same as a sentiment analysis, unfortunately.
    Additionally, as I put in the README notes, keep in mind that the network is trained on modern (2016-2017) language. As a result, inputting rhetorical/ironic statements will often yield love/wow responses and not sad/angry. That type of systemic bias in text analysis is currently unsolved and there isn't an easy way to account for it.
    "I am so angry" and "I'm going to go home and listen to emo music and cry" are phrases that would likely be posted on Facebook ironically, and therefore classifying them is tricky.
    For the other examples, yes, that result might be overfitting on characters; while setting up the model, I had difficulty accounting for that while still getting the model to converge.
    I'll admit it's not a perfect model (it was a side project while I was frustrated during a job hunt), but it's a great proof of concept. Unfortunately, since Facebook crippled their /posts endpoint, I can't get more data to improve the model...
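The "41" versus "42" case is a minimal-pair check: two inputs that differ by one character should not produce wildly different scores. A rough way to measure that sensitivity is sketched below. `max_label_shift` and `brittle_stub` are hypothetical helpers for illustration; the stub is deliberately built to mimic character-level overfitting and is not the real model.

```python
from typing import Callable, Dict

Predictor = Callable[[str], Dict[str, float]]


def max_label_shift(predict: Predictor, text_a: str, text_b: str) -> float:
    """Largest absolute change in any label's score between two near-identical inputs."""
    scores_a = predict(text_a)
    scores_b = predict(text_b)
    labels = set(scores_a) | set(scores_b)
    return max(abs(scores_a.get(l, 0.0) - scores_b.get(l, 0.0)) for l in labels)


def brittle_stub(text: str) -> Dict[str, float]:
    """Hypothetical model that has memorized one exact string."""
    if text == "He was only 41.":
        return {"sad": 1.0, "wow": 0.0}
    return {"sad": 0.0, "wow": 1.0}


shift = max_label_shift(brittle_stub, "He was only 41.", "He was only 42.")
print(f"max score shift: {shift:.2f}")
```

A shift close to 1.0 on a pair like this suggests the model is keying on exact characters and not on meaning.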
- hsikka 2263 days ago
  Hey, thank you! That is an excellent point, and definitely a huge potential problem that we're gunning to fix, both in the submissions already up and those that will come in the future. With this version of ModelDepot, we really just wanted to pique people's curiosity and get them to see the power and potential of sharing well documented models.
  Also, thank you for contributing to the ML community at large! I really enjoy your work personally, and hearing that you dig the examples we patched together means the world :)
rememberlenny 2263 days ago
This is great. Thank you for making this.
The aggregation of models like this isn't new, but the simple layout of available models is useful. I can imagine being able to filter by model origin (OpenAI/Facebook/Google/etc.), origin date (year), and comparative use case (why should I use VGG vs. an RNN).
Overall, great for people who are and aren't deeply immersed in ML.
- hsikka 2263 days ago
  Thanks! You hit the nail right on the head there. While aggregation is nothing new, we're hoping that the notebooks and tutorials provide an entry point for getting up and running with a model more quickly.
  We're actually hoping to build ModelDepot into a place where anybody can share (well documented) models, and be able to discover the right ones for their needs.
  - RasputinsBro 2263 days ago
    Hey, I can't really see the website. The page is blank..
    - mikeshi42 2263 days ago
      Hey! Sorry the site isn't rendering properly :'( Can you let us know what browser/OS version you're on?
      - RasputinsBro 2263 days ago
        Tor Browser.
        The page just reads "You need to enable JavaScript to run this app."
        - ZeroCool2u 2263 days ago
          Honestly, laughed pretty hard at this comment.
          But seriously, you're gonna need to whitelist the site and enable JS. Or just use a normal non tor browser.
andreyk 2262 days ago
Definitely a cool project. It's crazy there is no standard model zoo anywhere yet. But, a few thoughts:
-No about page? It's not clear at all, without clicking on the sign up button, that model submission/voting is a thing. And there are many more details I'd be curious to know, seeing this for the first time.
-A few things, like colorization, loaded reallllyyy slowly
-No search/stronger filtering?
-Ideally various projects should have weights in the ONNX standard, and there should be tabs/selection of different platforms. You'd have five different entries for implementations of ResNets in each of the major platforms? Quite silly.
-This seems like it should be an open source project, given that it is submission/open source driven
-Although this is aesthetically pleasing, at the end of the day it's just some links to cool Github repos... I feel like it'd make a lot more sense to create this as a CodaLab worksheet (https://worksheets.codalab.org/worksheets/0x818930127c4d47de...) ; the platform there is just more mature for this sort of thing. Or make this a fork of that with the whole Zoo angle.
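The point about one entry per architecture, with a tab per platform, can be sketched as a small data structure. This is only an illustration under stated assumptions: `ModelEntry`, `add_weights`, the format labels, and the example.com URLs are all hypothetical and do not describe how ModelDepot stores its listings.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ModelEntry:
    """One catalog entry per architecture, with weights keyed by format."""
    name: str
    task: str
    # Maps a format label (e.g. "onnx", "keras") to a download URL.
    weights: Dict[str, str] = field(default_factory=dict)

    def formats(self) -> List[str]:
        """Format labels available for this model, in alphabetical order."""
        return sorted(self.weights)


def add_weights(catalog: Dict[str, ModelEntry], name: str, task: str,
                fmt: str, url: str) -> None:
    """Attach weights to an existing entry instead of creating a duplicate."""
    entry = catalog.setdefault(name, ModelEntry(name=name, task=task))
    entry.weights[fmt] = url


catalog: Dict[str, ModelEntry] = {}
# The URLs below are placeholders, not real download links.
add_weights(catalog, "resnet50", "image-classification", "onnx",
            "https://example.com/resnet50.onnx")
add_weights(catalog, "resnet50", "image-classification", "keras",
            "https://example.com/resnet50.h5")
add_weights(catalog, "resnet50", "image-classification", "pytorch",
            "https://example.com/resnet50.pt")

print(len(catalog), catalog["resnet50"].formats())
```

Three submissions for the same architecture end up as a single entry with three selectable formats.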
daniellerommel 2263 days ago
For those using Wolfram Language or Mathematica, there is an unreleased Neural Net Repository that is worth checking out. A few examples:
Caffe: https://resources.wolframcloud.com/NeuralNetRepository/resou...
Keras: https://resources.wolframcloud.com/NeuralNetRepository/resou...
hendler 2263 days ago
I like the looks of https://openmined.org/ for decentralized sharing of data and models.
partycoder 2263 days ago
One day there will be a trained model called "software engineer" and we will be out of a job.
The training data may come from github, extracting requirements from issues, and mapping it to the code that closed the ticket.
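As a rough illustration of how such pairs might be collected: GitHub closes an issue when a commit message contains a keyword such as "fixes" or "closes" followed by an issue reference, so those messages can link an issue's text to the change that resolved it. The sketch below assumes the issues and commits have already been downloaded; `pair_issues_with_commits` and the sample data are hypothetical, and the regex only covers the simple "keyword #number" form.

```python
import re
from typing import Dict, List, Tuple

# Matches closing keywords (close/closes/closed, fix/fixes/fixed,
# resolve/resolves/resolved) followed by an issue reference such as "#12".
CLOSING_REF = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#(\d+)", re.IGNORECASE
)


def pair_issues_with_commits(
    issues: Dict[int, str],
    commits: List[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Return (issue_text, diff) pairs for commits that close a known issue."""
    pairs = []
    for message, diff in commits:
        for number in CLOSING_REF.findall(message):
            issue_text = issues.get(int(number))
            if issue_text is not None:
                pairs.append((issue_text, diff))
    return pairs


# Made-up sample data: issue number -> issue text, and (commit message, diff).
issues = {
    12: "Crash when the input list is empty",
    15: "Add a --verbose flag",
}
commits = [
    ("Fixes #12: guard against empty input", "+ if not items: return []"),
    ("Refactor logging", "- log(x)\n+ logger.info(x)"),
    ("closes #99", "+ pass"),  # references an issue we do not have
]

pairs = pair_issues_with_commits(issues, commits)
print(pairs)
```

Each pair is one (requirement, implementation) training example; commits with no closing reference, or with a reference to an unknown issue, are skipped.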
- mikeshi42 2263 days ago
  But then we can finally spend _all_ of our time on HN!
  But we really do think ML, even in the near future, will automate some of our jobs away (and that's a good thing!), if you're interested in some of our thoughts on that: https://medium.com/modeldepot/we-previously-talked-about-how...
  - MayeulC 2263 days ago
    I was about to answer roughly the same.
    If someone could step up and "take my job", it would free me up for other things. I would be glad to be able to rely on a good "software engineer" model to build the next generation tooling.
    All in all, more performance, more efficiency, improving our tools bottom-up, going for the low hanging fruit first, to free us from the less interesting tasks. Isn't it what we always do and always did?
    If a task is repetitive or deterministic, I always feel like I could be spending my time on something more interesting, not on the implementation details. The problem is that I have to take care of it myself, if I want it to be done properly (and that's true as well in other fields than software, of course).
    Disclaimer: still a student (although nearly graduated). I won't bite you for having more experience than I do :) And the phrasing here might be a little provocative on purpose, don't take it too hard please.
    - partycoder 2261 days ago
      I strongly disagree.
      First, you assume that AI will remain less proficient than humans, and therefore suitable only for grunt work. Artificial intelligence has many advantages over us, and this is often overlooked.
      - How much time does it take to train one software engineer? Decades' worth of time. Once you have one, what does it take to get another? About the same amount of time and money. In contrast, once you train one AI-based engineer, the time to get the second one is about the time it takes to serialize the state of the first and deploy it multiple times. You could recruit and discard AI engineers on demand for whatever you need.
      - They won't sleep, take breaks for lunch or coffee, get distracted, waste time talking about their car, vacations, wine tasting and other stupid things. They will just work. 24 hours a day. No breaks.
      - They won't consume information at a rate of 200 to 1000 words per minute, and they won't be limited by their visual field or by how fast they can type. They may not even communicate using words, just transfer thoughts directly among themselves, or even entire trained neural ensembles. Plus, once they learn something, that may affect them collectively rather than individually.
      - They will be aggressively supervised and ranked. Then, since their generations are shorter, this will cause them to continuously improve at a much faster pace than humans.
      If this happens, the need for having the same amount of human engineers will be lower if not zero.
      Some people may argue that the human brain has billions of neurons and synaptic connections. Sure. But which portion of that is actively used in developing software? Remove motor functions and autonomous functions like regulating heart rate, respiration and such, and you are left with much less.
    - hsikka 2263 days ago
      Well said!
- sgt101 2263 days ago
  Who will write the specs that cause the code to be generated?
  oh...
DOsinga 2263 days ago
This is great, I like the inclusion of the notebooks, makes it easy to get started. The download link for the Image Colorization model doesn't seem to work for me though - it gives an Access Denied error.
- mikeshi42 2263 days ago
  Hey! Sorry, slight typo on our end, the notebook should still work fine, but the manual download didn't work for a bit.
  It's fixed now though! Thanks for the heads up and I hope you've found it useful! Happy to hear any other feedback as well :)
  - DOsinga 2263 days ago
    Yep, works now. Thanks!
orliesaurus 2263 days ago
Thanks for submitting! Can you tell me how this differs from something like https://algorithmia.com/ ? I am curious to understand the main differences between their approach and yours! Cheers :)
- edraferi 2263 days ago
  Algorithmia is a FaaS platform with strong service discovery and a lot of AI functions. Model Depot is a listing of open-source models.
  You could find a network on Model Depot and deploy it on Algorithmia. Then you (and others, if you like) can use the model via RPC using HTTP calls or one of the Algorithmia client libraries.
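Calling a hosted model over HTTP usually comes down to a JSON POST with an API key. The sketch below shows that general shape only; it is not Algorithmia's documented API. The endpoint URL, the `Bearer` authorization scheme, the payload fields, and `build_inference_request` are all assumptions made for illustration.

```python
import json
import urllib.request


def build_inference_request(endpoint: str, api_key: str,
                            payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a hosted model."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Hypothetical auth scheme; a real service defines its own.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_inference_request(
    "https://example.com/v1/models/colorizer",  # placeholder endpoint
    "YOUR_API_KEY",
    {"image_url": "https://example.com/photo.png"},
)
print(request.get_method(), request.full_url)

# Sending is left commented out because the endpoint is a placeholder:
# with urllib.request.urlopen(request, timeout=30) as response:
#     result = json.loads(response.read())
```

A service's own client library would normally wrap this step, including authentication and response parsing.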
- mikeshi42 2263 days ago
  Of course! We're huge believers in creating a transparent platform that talks about models at a high-level for general understanding, but also want to make sure we include and cite the relevant original source code, and applicable research papers backing the model.
  We're not trying to build something black box, or tied to proprietary infrastructure/APIs that'll cost you money. The models are free for you to use, download, and extend, according to their respective license. You can deploy the model on your own servers and use it as many times as you like, without paying a single cent. You can further fine-tune/transfer learn the model for your specific deployment, if the out of the box model doesn't work exactly how you'd like.
  We're both trying to empower engineers with Machine Learning, we just hope to accomplish that through making it transparent and free to use, so that they're extensible and flexible for whatever use case you need. Along the way we also hope that the developers can learn a bit of ML as well, so they can better leverage the models for their use case!
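For readers new to the idea, fine-tuning by transfer learning means keeping the pretrained layers frozen and training only a small new head on your own data. The toy sketch below shows that split in plain Python. It is an illustration under heavy assumptions: `pretrained_features` is a hypothetical stand-in with hard-coded weights, not a real downloaded model, and the head is a simple logistic regression trained by gradient descent.

```python
import math
from typing import List, Sequence, Tuple


def pretrained_features(x: Sequence[float]) -> List[float]:
    """Stand-in for frozen pretrained layers: fixed weights, never updated."""
    frozen = [[1.0, -1.0], [0.5, 0.5]]
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in frozen]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_head(inputs: Sequence[Sequence[float]], labels: Sequence[int],
               epochs: int = 200, lr: float = 0.5) -> Tuple[List[float], float]:
    """Train only the new logistic-regression head on top of frozen features."""
    # Features are computed once because the base never changes.
    feats = [pretrained_features(x) for x in inputs]
    weights = [0.0] * len(feats[0])
    bias = 0.0
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            pred = sigmoid(sum(w * v for w, v in zip(weights, f)) + bias)
            err = pred - y  # gradient of the log loss w.r.t. the logit
            weights = [w - lr * err * v for w, v in zip(weights, f)]
            bias -= lr * err
    return weights, bias


def predict(x: Sequence[float], weights: List[float], bias: float) -> float:
    """Probability of class 1: frozen features followed by the trained head."""
    f = pretrained_features(x)
    return sigmoid(sum(w * v for w, v in zip(weights, f)) + bias)


# Tiny made-up task: label is 1 when the first value exceeds the second.
inputs = [(2.0, 0.0), (1.0, -1.0), (0.0, 2.0), (-1.0, 1.0)]
labels = [1, 1, 0, 0]

weights, bias = train_head(inputs, labels)
print(predict((3.0, 0.0), weights, bias) > 0.5)
print(predict((0.0, 3.0), weights, bias) > 0.5)
```

With a real framework the same pattern applies: load the published weights, mark the base layers as non-trainable, and fit a new output layer on the task-specific data.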
- muninn_ 2263 days ago
  Did you take a look at the two websites and do any comparison?
nukenuke 2263 days ago
Are there any good models for transfer learning with audio?
- hsikka 2263 days ago
  That is an awesome question, we haven't found any yet but are actively looking! We'll let you know if we find some good ones :)