Teaching a neural network to use a calculator

(reiinakano.com)

78 points | by baylearn 1623 days ago

5 comments

  • FraserGreenlee 1623 days ago
    Here the neural network was given examples of how to use the calculator for each question, which means it wasn't generating its own abstractions.

    If you wanted to use this approach to solve other (e.g. programming) problems, you would need examples of every step required for almost every problem.

    Using neural networks in this way is akin to locality-sensitive hashing. Instead, the network should understand what its lowest-level operators do and discover useful combinations of them that can solve new problems.

  • fyp 1623 days ago
    I haven't been following this field, but anyone know what happened to Neural Programmer Interpreters (2015)? It seemed like such a promising direction back then. It showed that a neural network can learn to use arbitrary commands to execute algorithms such as multidigit addition and bubble sort: http://www-personal.umich.edu/~reedscot/iclr_project.html

    That seems like a much better demo of using blackbox tools as substeps in problem solving. Is there a reason why it shouldn't work when the blackbox is a more complex function like sympy's eval?

  • JHonaker 1622 days ago
    > Something that intrigued me in Saxton et. al.’s paper was how high a baseline transformer scored on probability tasks (~0.77 and ~0.73), given that working these out is a multi-step process. How could basic pattern-matching score so highly on such a task? Is mere perception enough to figure out something like the probability product rule, on such a generic architecture without any prior knowledge of numbers or probability?

    > To try and explain this, we point out that although questions are unique, a lot of them will share the same answers. For example, Calculate prob of sequence aad from abcda, Calculate prob of sequence bbz from zbbmn, and Calculate prob of sequence rpr from {r: 2, p: 1, x:2} all lead to the same answer, 1/30.

    > Doing a bit of analysis on training set questions, we find that out of 1 million samples each, swr_p_level_set and swr_p_sequence have 977179 and 978045 unique questions, respectively. This seems reasonable, as duplicates are limited to <3% of the training set and the distribution over questions appears fairly uniform.

    > On the other hand, doing analysis on training set answers reveals that out of 1 million samples each, swr_p_level_set and swr_p_sequence have 1458 and 1865 unique answers, respectively.

    > Counting the collective number of samples that share the top K most common answers reveals even more imbalance.

    This is the real takeaway for me from the article.
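
    The quoted 1/30 figure is easy to check: each question is a draw-without-replacement probability, so the three superficially different questions really do collapse to the same answer. A minimal sketch (the function name `sequence_prob` is mine, not from the article):

    ```python
    from collections import Counter
    from fractions import Fraction

    def sequence_prob(seq, pool):
        """Probability of drawing the letters of `seq`, in order,
        without replacement from the multiset `pool`."""
        counts = Counter(pool)
        total = sum(counts.values())
        p = Fraction(1)
        for ch in seq:
            p *= Fraction(counts[ch], total)
            counts[ch] -= 1
            total -= 1
        return p

    print(sequence_prob("aad", "abcda"))            # 1/30
    print(sequence_prob("bbz", "zbbmn"))            # 1/30
    print(sequence_prob("rpr", {"r": 2, "p": 1, "x": 2}))  # 1/30
    ```

    With only ~1500-1900 distinct answers over a million questions, and the mass concentrated on the most common ones, a model can score well by learning the answer distribution rather than the multi-step procedure.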

  • king07828 1622 days ago
    From the title, I was expecting the neural network to take an input (e.g., speech or a string "5+11+3=") and then control mouse movements to push the keys on a calculator program (e.g., Windows Calculator). I.e., a neural network driving an existing user interface based on commands from a user.

    But the article is more about using neural network transformers to build steps of a mathematical proof with each step checked by a symbolic "calculator". I.e., transformers applied to mathematical proofs.

  • The_rationalist 1623 days ago
    The fact that a neural network isn't even able to calculate, even when trained specifically to do so, shows how limited neural-network-only AGIs are.
    • j-pb 1623 days ago
      Of course you could train a NN to do arithmetic, but this is much more impressive. Training a NN to solve problems with the tools available means more abstraction, and is closer to AGI than essentially learning a LUT.
      • Sean1708 1622 days ago
        > Of course you could train a NN to do arithmetic

        Are we really capable of teaching a NN to parse and calculate an arbitrary arithmetic expression? Because that sounds incredibly impressive...

    • Hitton 1623 days ago
      I'm not sure. Humans are general intelligences, but they have to learn basic maths too.
    • stestagg 1623 days ago
      Here’s one that does so using roman numerals:

      http://static.offd.es/numerals/

      It’s unsurprisingly easy to implement

      • The_rationalist 1623 days ago
        Interesting, but can it do more than addition? Also, it doesn't seem to have 100% accuracy.
        • stestagg 1623 days ago
          Yeah, I only trained addition. Actually, exploring the impact of training a net to perform a range of operations with the minimum plausible neuron count would be quite interesting.

          I don’t see any reason why it would be significantly harder to do, however

          You’re right about accuracy. I didn’t let the model train long enough to push the error low enough to guarantee exact results over the input range. But then again, this was designed as a toy experiment, not something people should rely on.
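
          The "easy to implement, but accuracy isn't guaranteed" point can be seen even in the simplest possible version of this experiment (my own toy sketch, not stestagg's model): a single linear neuron trained by gradient descent on pairs of numbers learns y = a + b, but only approximately, so exact answers depend on rounding a small residual error away.

          ```python
          import numpy as np

          rng = np.random.default_rng(0)

          # Toy experiment: one linear neuron learns y = a + b
          # from 1000 random examples via plain gradient descent.
          X = rng.uniform(0, 100, size=(1000, 2))
          y = X.sum(axis=1)

          w = np.zeros(2)
          lr = 1e-4
          for _ in range(2000):
              pred = X @ w
              grad = X.T @ (pred - y) / len(X)
              w -= lr * grad

          # The weights converge toward [1, 1]; until the error is
          # driven low enough, outputs are only approximately exact,
          # which is the accuracy caveat above in miniature.
          print(w)
          ```

          Character-level models over numeral strings (roman or decimal) face the same trade-off with far more parameters: the loss can get very small without ever guaranteeing exactness.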