Solving the Rubik’s cube with a robot hand

(openai.com)

171 points | by gdb 1653 days ago

18 comments

  • daenz 1653 days ago
    >We’ve trained a pair of neural networks to solve the Rubik’s Cube with a human-like robot hand. The neural networks are trained entirely in simulation

    Simulated training is so cool. Relatedly, is anyone interested in a plugin for Blender that allows you to easily build physically accurate simulation environments for robots and then apply reinforcement learning to the virtual robots? I have a hodgepodge of code for doing exactly this, and I'm curious whether anyone else would be interested in it.

    • sgillen 1653 days ago
      Personally I think it would be useful to focus on the "easily build physically accurate simulation environments for robots" part using Blender. IMO it makes the most sense to turn the created environment into an OpenAI Gym environment, so that most of the existing RL algorithms can be applied to the robots. If you do want to spin your own RL, this approach does not stop you from doing so.

      AFAIK there are a lot of publicly available RL algorithms out there, but not many (any?) Blender-like interfaces for making physically accurate simulations.

    • orasis 1653 days ago
      The trick here isn’t “accurate” simulation, it’s that they used a bunch of different simulations with randomly perturbed physics and the RL learned policies that worked across this wide range of “realities”.
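
      For anyone who wants the idea in code, here is a minimal sketch of per-episode physics randomization. It is not OpenAI's actual setup: the parameter names, nominal values, and the ±30% range are assumptions made up for illustration, and the simulator and RL update are left as a comment.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class PhysicsParams:
    friction: float
    cube_mass_kg: float
    motor_gain: float


# Hypothetical nominal values, chosen only for illustration.
NOMINAL = PhysicsParams(friction=1.0, cube_mass_kg=0.09, motor_gain=1.0)
SPREAD = 0.3  # each parameter is scaled by a factor in [1 - SPREAD, 1 + SPREAD]


def sample_params(rng: random.Random) -> PhysicsParams:
    """Draw one randomly perturbed 'reality' around the nominal physics."""
    def scale(value: float) -> float:
        return value * rng.uniform(1.0 - SPREAD, 1.0 + SPREAD)

    return PhysicsParams(
        friction=scale(NOMINAL.friction),
        cube_mass_kg=scale(NOMINAL.cube_mass_kg),
        motor_gain=scale(NOMINAL.motor_gain),
    )


def sample_worlds(num_episodes: int, seed: int = 0) -> List[PhysicsParams]:
    """Return the physics used for each training episode."""
    rng = random.Random(seed)
    worlds = []
    for _ in range(num_episodes):
        params = sample_params(rng)  # fresh physics for every episode
        # A real setup would rebuild the simulator with `params` and run
        # one RL rollout/update here; that part is omitted in this sketch.
        worlds.append(params)
    return worlds
```

      A policy that only ever sees one set of physics can overfit to it; one trained across many sampled worlds has to work for all of them, and the real world is hopefully just one more sample.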
    • artemish 1653 days ago
      Yes please!
    • RugnirViking 1653 days ago
      Yes, I'd be interested.
    • ofou 1653 days ago
      Also Yes, please!
    • carapace 1653 days ago
      Also yes, please!
    • askytb 1653 days ago
      Yes, absolutely
  • ZhuanXia 1653 days ago
    Gwern on the luck of the last mover:

    "Launching too early means failure, but being conservative & launching later is just as bad because regardless of forecasting, a good idea will draw overly-optimistic researchers or entrepreneurs to it like moths to a flame: all get immolated but the one with the dumb luck to kiss the flame at the perfect instant, who then wins everything, at which point everyone can see that the optimal time is past."

    Robotics has been a money pit for startups and corporations for a long time. Think of the billions Toyota has spent on home robotics research, to little avail.

    But at some point it won't be. Some entity will "kiss the flame" at the right moment. The wealth they create will be beyond any company ever, by an almost incomparable margin.

    • dang 1653 days ago
      Please put quotation marks around a quote, so we can know what's the quote and what's not! I've added them to your comment now, based on https://www.gwern.net/Timing. It's good to link to the source of a quote too. (Your comment was fine otherwise.)
    • hinkley 1653 days ago
      I worked at a startup that was trying to do what the App Store did for mobile about 4 years before that. They got little traction and ended up selling to a cellphone manufacturer.

      The next startup after that, the boss was very excited to have competitors, because without them you are alone in trying to validate that sector. Competition means people are voting with their feet and wallets that you are, if not right, at least not wrong.

      It kind of felt like I understood him on a level many of my coworkers did not.

      But you're describing kind of the opposite end of the spectrum. When there are too many people, you have no control over the narrative. If you are surrounded by idiots, you get painted with the same brush.

      And now that I'm thinking about it, it would even be tenuous for you to buy up your more clue-ful competitors because combining forces may improve your narrative but now it's one voice instead of two. That's a new wrinkle in the post-hype consolidation pattern that I hadn't considered before.

    • carapace 1653 days ago
      Part of the problem is that the popular conception of robots tends to be a kind of fetish. What I mean is, the things that are easy for robots to do are already addressed. You can buy off-the-shelf robots that work really well. They're not cheap though.

      But those don't look like "robots", they look like arms with tools on the end of them.

      The kind of humanoid servant robot from books and movies, however, is still pretty much fictional. The required capabilities are mostly really hard, even after you factor in the recent advances in ML et al.

      I remember when Sony made that little humanoid robot that danced. I was like, "Big deal! I like to dance. Make a robot that does the dishes."

      - - - -

      To make it big with robots (per se as opposed to just building an automated factory, or toys) you have to find the economic niches.

      • DuskStar 1653 days ago
        > But those don't look like "robots", they look like arms with tools on the end of them.

        > I was like, "Big deal! I like to dance. Make a robot that does the dishes."

        From these comments, I think you're missing a really huge category of robots - appliances. Why does a dishwasher or laundry machine not qualify as a robot, after all?

        • carapace 1652 days ago
          I sometimes do call them robots, but you could exclude them on the basis of lack of mobility, or, better yet, lack of decision making. (Although a friend of mine has a laundry dryer with a moisture sensor.)

          In a sense, anything with a PID controller or even just a governor could be considered a "robot", or at least "automation", eh?

          https://en.wikipedia.org/wiki/Centrifugal_governor

    • paraschopra 1653 days ago
      It’s not the tech per se that is usually lacking; what’s truly required is tech that solves a human need better than how it is being solved currently.

      Since solving it better than how it is being solved requires much more than tech (distribution, habits, pricing), it does take a number of experiments before value from tech is unlocked by an entrepreneur.

  • sytelus 1653 days ago
    Many caveats but impressive progress in manipulation, especially sim2real:

    - Only 20% of attempts successful on hardest configs with 26+ moves

    - Solving steps are not generated by RL (but could be[1])

    - Cube is modified internally to transmit additional state via bluetooth

    - Highly calibrated and fine tuned environment+MuJoCo based sim to match simulation to reality as much as possible

    - OpenAI Five algorithm is pretty much reused as-is

    - Cumulative training time = 13 thousand years, same order of magnitude as the 40 thousand years

    - 32+64 V100 GPUs per training cycle

    [1] https://arxiv.org/abs/1805.07470

  • hinkley 1653 days ago
    Some of Vernor Vinge's books deal with the 'alien' in alien intelligence in ways that were quite illuminating/shocking for me at the time. They weren't just humanoids with animal instincts. He created intelligent spiders that were believable. And the only sympathetic treatment of a hive mind I've yet encountered (Card's are pale in comparison)

    But one of my favorite inventions of his was a creature that had somehow evolved wheels. With veins and nerves and such there is hardly a creature on earth that can rotate a limb much farther than 200°, and the ones that can, like owls, we treat with a certain reverence.

    Developing an artificial wrist that can spin arbitrarily would be, I'd think, a quite compelling compensation for someone having to use a prosthetic arm. It would also make for some wicked Rubik's solving skills. I wonder how proprioception would deal with that though...

    • chewxy 1653 days ago
      On that note, I also highly recommend Adrian Tchaikovsky's Children of Time.
  • minimaxir 1653 days ago
    I appreciate the plush giraffe perturbation. Reinforcement learning needs to account for all eventualities, including giraffes.
  • perl4ever 1653 days ago
    "What people don't appreciate, when they picture Terminator-style automatons striding triumphantly across a mountain of human skulls, is how hard it is to keep your footing on something as unstable as a mountain of human skulls."

    ...I'm not feeling so confident now.

  • gapo 1653 days ago
    It's great that OpenAI has continued to exist as a technological organization with no clear revenue expectations. At the same time I am not sure how long they can sustain doing what they are doing OR whether there is this newfound feasibility for private research organizations to exist in this space provided they produce clear high-quality output like OpenAI is doing.
    • breck 1653 days ago
      Well they just raised $1B, so I hope they can exist at least a little while ;).

      https://news.ycombinator.com/item?id=20497548

    • hos234 1653 days ago
      I'll believe they are doing something useful the day they set up a burger shack opposite a McDonald's and outcompete on inventory or queuing or something practical. Nobody in industry cares about Rubik's cubes and Go.
      • sytelus 1653 days ago
        I think you are underestimating the power of such progress. Look around at all the objects you have, from iPhones to laptops to pencil sharpeners. They were made in some factory, and very likely human hands played some role there. Now imagine you can throw in $100 human-like hands that operate as dexterously as a human's, guided by cameras just as a human is guided by sight, without taking rest or vacations or requiring medical insurance. What do you think the impact of this will be? People call it Industrial Revolution 4.0. It will change the world beyond billions or trillions of dollars. The investment in places like OpenAI is the bargain of a lifetime.
      • hereiskkb 1653 days ago
        Yes, the industry does not care about Rubik's cubes and Go. The industry, however, would not exist in the first place if not preceded by researchers on equivalent endeavours that would make zero monetary sense at first look. Markets are created by those who can envision one with the given technology, no matter how incredible the technology might look.
  • jefft255 1653 days ago
    As a roboticist, it's really clear to me that this sort of transfer in a controlled environment is hard but doable. I think it's already been demonstrated many times and I'm not that convinced that there is anything new in there except more GPUs + a fancier robot.

    I'll be impressed by RL if a) they manage to do sim2real in open environments, think Doom -> office building, or b) they manage to get data efficient enough that sim2real is still necessary but you don't have to do real data collection with 10 parallel robots for days on end.

    As someone in mobile robotics as opposed to pure manipulation, I read these papers and I'm like: "How the hell am I supposed to get this to work on a robot moving in the real world???". I don't see anyone being close to this right now.

    • ilaksh 1653 days ago
      As a roboticist what do you think of my theory that what's missing is more biomimetic artificial muscles with greater power-to-weight ratio?
      • jefft255 1653 days ago
        Honestly I don’t know; you’re out of my area here. I’m really into perception/SLAM/planning. A greater power-to-weight ratio is always good. I never really cared about biomimetism for the sake of it. If the way to get a better power-to-weight ratio is biomimetism then great, but if you can get it without trying to imitate nature then that’s great too.
    • lucidrains 1653 days ago
      • jefft255 1653 days ago
        While this is a real mobile robot, this does not really address my concerns: 1) It is not navigating in an open environment, rather a constrained workspace just like other manipulation demos. 2) They're not using any visual information for task planning and reasoning, rather doing low-level control for locomotion.

        Of course, I cannot emphasize enough that this is a good research paper! We're just really, really far away from mobile robots trained end-to-end with RL for doing any kind of tasks in open environments. And that's fine. It means more cool research to do for me.

        • lucidrains 1653 days ago
          They are going commercial soon, so we shall see how well they stand up to different open world environments.
  • est31 1653 days ago
    This is cool. I wonder about the hardware. Why does the mount for the hand have a fan? Does it contain the inference computer? Power transformers?
  • askytb 1653 days ago
    Does anyone have any experience with soft robotics? For example these guys: https://www.youtube.com/watch?v=X6CRe2ieuYE advertise their gripper as supposedly being able to handle weight/size variety with no training at all, just with the use of different materials in the gripper
  • breck 1653 days ago
    A few weeks back I was at a program synthesis conference and gave a short lightning talk where I said deep learning so far has been used to solve the easy computer chess, and the easy computer go, etc...not to take away from those accomplishments at all, I was just saying that having a robot beat grandmasters at real world physics chess where you have to move the pieces with many degrees of freedom is a harder problem, but trivial for a 7 year old.

    I thought we were still a decade away from having machines beat humans at real chess and real go, but this makes me think maybe it’s just 5 years out. Very impressive.

    • PeterisP 1653 days ago
      Manipulating chess pieces is trivial for e.g. a pick and place robot, which are quite widely used for industrial activities that are quite close to moving chess or go pieces.

      In particular, far from being "just 5 years out", robot hands that execute chess moves have already been demoed many times, including by hobbyists with very limited resources. Reliable computer vision was a bit trickier a decade ago, but that's not a problem now. Having a robot beat grandmasters at "real chess" (i.e. the same thing as "virtual chess" but also manipulating the physical pieces) would not be considered a hard problem or a valuable achievement. It's a nifty parlor trick that could make a cute demo 10 years ago, and could be used as a homework project for engineering students nowadays. However, that's likely to be two separate projects, as the mechanical manipulation and visual recognition are likely to be different skillsets and thus different students.

      Here's a random article from 2010 https://newatlas.com/chess-terminator-robot-takes-on-kramnik...

      Here's a hobbyist project from 2013 https://www.robotshop.com/community/blog/show/a-chess-playin...

      Here's a tutorial from 2017 on how to make the chess piece manipulation yourself - https://www.youtube.com/watch?v=NefiXZ7BCsE

      Here's a student project, replacing the vision with sensors - https://www.instructables.com/id/Chess-Robot/

      • breck 1653 days ago
        Great links, thanks very much for bringing me up to speed on this domain. The Chess Terminator is the sort of thing I'm talking about.

        > Manipulating chess pieces is trivial for e.g. a pick and place robot,

        Perhaps in a sterile, well-known, controlled environment; but not in a real world, novel, potentially adversarial environment.

        I guess my point about AGI is that I would bet a 7-year-old could currently beat the best AI in the world at real, physical chess, played in a randomly chosen park. Kids can quickly figure out strategies in the real world, which has more degrees of freedom than the digital world of computer chess. In other words, perhaps a kid may figure out that if they place a piece in a certain position, the computer is unable to "see" or "execute" the desired move, perhaps because of the angle of the sun or some line-of-sight obstruction. While an adult might be generous and offer help, a lot of children will take advantage of the robot's weaknesses.

        • PeterisP 1653 days ago
          IMHO that's not chess anymore, as that explicitly violates the rules of chess - if you manage to get an advantage by distracting your opponent and obscuring line of sight to the pieces, that's simply violating the laws of chess (specifically, FIDE "12.6 It is forbidden to distract or annoy the opponent in any manner whatsoever.") and appropriately punishable by the arbiters.

          Chess is a well-defined, strict game not only from the "on-board" perspective but also regarding how the opponents can behave - e.g. it's explicitly specified that if your phone makes a sound during a match, then you lose the game; the rules of chess IMHO are exactly a sterile, well-known, controlled environment, and attempting to transform it into a novel, potentially adversarial environment would generally be a violation of both the spirit and letter of the laws of chess.

          E.g. https://en.m.wikipedia.org/wiki/Chess_boxing is a fine physical, adversarial form of sports, but it's not chess.

          • breck 1653 days ago
            Haha, thank you! I stand corrected. (Next time I play with my nieces and nephews, I'm going to be stricter about rules :) )
            • PeterisP 1653 days ago
              Hah, I have a niece that will assert that I violate that "forbidden to annoy opponent" rule because I annoy her simply by existing.
      • carapace 1653 days ago
        Years ago I played chess with/on a machine that used a magnet under the board and metal bases on each piece.

        The pieces seemed to move by themselves.

  • imtringued 1652 days ago
    This is a pretty pathetic result and is damning for the progress of AI. Instead of focusing on efficiency, AI researchers simply throw more resources at the problem with the hope that it's enough. The end result after 13000 years of training is a robot hand that can do nothing but solve Rubik's cubes and fails 40% of the time.
  • yCloser 1653 days ago
    world record one handed is 6":88, average of 5 is 9":48 https://www.worldcubeassociation.org/results/rankings/333oh/...

    non-world-class: doing <30" one handed is very doable, anyone can do less than 1 min (yes, if you know how to solve and you trained one handed. ofc not if you never solved a cube in your life)

    that said... I really don't understand how the hand keeps the cube "floating" around. In one handed the technique is pretty much to keep the cube fixed holding front/back centers with thumb and index. Something like https://www.youtube.com/watch?v=mUF3aPDTO-4

    I understand the achievement, but wow, this solve is HORRIBLE. What did they train the network with to get this?!

  • throwaway07Ju19 1653 days ago
    Around 1988, I read a book that claimed the ideal robot hand would have fingers that repeatedly bifurcate until it has a digit so small it can manipulate matter at the atomic level. Implausible but fun to think about.
  • ilaksh 1653 days ago
    I think artificial muscles that are more biomimetic with better power-to-weight ratios are going to make a huge improvement in robot capabilities at some point. Especially for humanoids.
  • ____Sash---701_ 1653 days ago
    Any YC companies going after the robotics industry?
  • a13n 1653 days ago
    Would be cool to see it done with two hands (or one), solved faster than the human world record. It's still pretty clumsy looking.