While I almost always use TensorFlow for work, I appreciate Skymind's open source Deeplearning4j library for use with Common Lisp (via Armed Bear CL), Java, and Scala. Sometimes living on the JVM is the best choice.
Thanks! Of note: we also contribute to Keras and maintain JavaCPP (https://github.com/bytedeco/javacpp), which lets us call native C/C++ libraries, many of which are otherwise only accessible from Python, directly from the JVM. We need to make some of these projects better known; hopefully we can focus more on community growth this year.
We control the whole framework down to the bare metal.
We also have our own built-in memory allocators.
There's a bit more to the JVM than just training models. Java-based application servers, for example, are still widely deployed.
There's also a whole data-engineering niche we target here (the Spark pipelines) where Python isn't a good fit for the team.
Then there's the fact that, thanks to our model import, it's still easier to run Keras on Spark with us than with any other framework.
And that doesn't even account for what we're doing with inference. Many vendors in this space just run Kubernetes, bundling other tools they don't control. Because of our low-level control, we actually engage with various large companies on custom chip development, even ones running other DL frameworks.
Because if I have to use Databricks, then I need to keep things as notebooks, something I'm desperately trying to migrate my team away from so that we can have actually maintainable code that gets deployed, monitored, and held to the same rigour and maintainability standards as normal dev code.
Also, being forced to use a cluster is catastrophic overkill for so, so many tasks. I have teammates wanting to use Spark/Databricks just to process a handful of files in S3 totalling a few GB, tops. Realistically we could do the same work in a single container with Python/Julia/Scala/language of choice, in the same amount of time or less, and with an order of magnitude better maintainability.
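To make the "single container" point concrete, here's a minimal sketch using only the Python standard library. The file names and the per-file transform are hypothetical stand-ins; a real job would read the objects from S3 (e.g. via boto3) instead of a local temp directory.

```python
# Minimal sketch: processing a "handful of files" in one process instead of
# a cluster. File contents and the per-file transform are hypothetical.
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def process_file(path: Path) -> int:
    # Stand-in for the real per-file work: count lines.
    return sum(1 for _ in path.open())

# Create a few small sample files to stand in for the S3 objects.
tmp = Path(tempfile.mkdtemp())
for i in range(4):
    (tmp / f"part-{i}.txt").write_text("row\n" * (i + 1))

# A thread pool in a single container is often enough for a few GB of input.
with ThreadPoolExecutor(max_workers=4) as pool:
    total = sum(pool.map(process_file, sorted(tmp.glob("part-*.txt"))))

print(total)  # 1 + 2 + 3 + 4 = 10
```

No scheduler, no cluster config, and the whole thing is an ordinary script that can be tested and deployed like any other code.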
Good point. Maybe that's just as easy. We see companies with on-prem Spark clusters telling us that we're the easiest way to do DL there. Enterprise clients are doing a lot on prem, and their future is probably hybrid for a long time to come.
After doing a significant deep dive into Databricks as a third-party solution for my team at work, we decided Databricks and a deep commitment to Spark were a very poor choice for machine learning, though Spark seems generally fine as an interface for map-reduce or scheduled cluster compute tasks.
Spark is in 2019 what Hadoop was in ~2014. In 5-6 years, Spark will be the cure-all that a bunch of people put all their eggs into without realizing its deep-seated limitations. This is especially true for machine learning.
Those tools (along with MLlib and virtually anything relying on a py4j bridge) were precisely the setup I tested and found to have unacceptably poor performance across our range of both large and small workloads, in addition to deep inflexibility in controlling the runtime environment on a per-project basis (our most critical requirement).
See my other comment below with a link to a previous discussion.
Congrats to the Skymind team. I have given a few talks on machine learning using DL4J, and it's been nothing but an excellent framework for Java developers to learn.
Fwiw, the Skymind team built Deeplearning4j, is the second-largest contributor to Keras after Google, and is the sole maintainer of Hyperopt.
https://github.com/deeplearning4j/
https://github.com/hyperopt/hyperopt/
Our code serves as a bridge between the Python data science ecosystem and tools like Spark, Kafka, Hadoop, etc.
https://deeplearning4j.org/docs/latest/deeplearning4j-scaleo...
You can also import Keras models to train them on a Spark cluster with DL4J:
https://deeplearning4j.org/docs/latest/keras-import-overview
Depending on what you're looking to do, we're still the standard for precompiled binaries packaged as JAR files: https://repo1.maven.org/maven2/org/nd4j/nd4j-native/1.0.0-be...
You won't find any other framework shipping prebuilt AVX binaries and IBM POWER support at the same time.
Happy to talk more depending on what your focus is.
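For reference, pulling those prebuilt binaries into a JVM project is a one-dependency affair. A sketch of the Maven coordinates, taken from the Maven Central URL above; the version element is a placeholder, since the link is truncated (check Maven Central for the current release):

```xml
<!-- groupId/artifactId taken from the repo1.maven.org URL above;
     the version below is a placeholder, not a confirmed release. -->
<dependency>
  <groupId>org.nd4j</groupId>
  <artifactId>nd4j-native</artifactId>
  <version>LATEST-RELEASE-HERE</version>
</dependency>
```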
https://news.ycombinator.com/item?id=19321372
People dramatically underestimate how far you can get with even a single machine and slightly better engineering. The best example of this I've seen is the super impressive work done by Frank McSherry:
http://www.frankmcsherry.org/assets/COST.pdf
http://www.frankmcsherry.org/graph/scalability/cost/2015/02/...
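McSherry's COST argument is easy to reproduce in miniature: a single-threaded pass over edge data often beats a cluster once you account for coordination overhead. A toy sketch in the same spirit as the graph computations he benchmarks (the edge list here is synthetic, and this is an illustration of the idea, not his implementation):

```python
# Toy illustration of the COST argument: a single-threaded union-find
# computes connected components over an edge list with zero cluster
# overhead. The graph below is synthetic.

def connected_components(num_nodes, edges):
    parent = list(range(num_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
    return len({find(x) for x in range(num_nodes)})

# Components: {0, 1, 2}, {3, 4}, and isolated node 5 -> 3 components.
edges = [(0, 1), (1, 2), (3, 4)]
print(connected_components(6, edges))  # 3
```

On a laptop this style of code chews through edge lists with hundreds of millions of entries, which is exactly the regime where the COST paper found single-threaded implementations outperforming distributed systems.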