Apple machine learning in 2020: What’s new?

(machinethink.net)

95 points | by dsr12 4 days ago

6 comments

  • saagarjha 4 days ago

    Remember when Apple created a “Machine Learning journal”? Well, it seems like they’ve stopped publishing to it and now have gone back to introducing stuff at presentations, if at all: https://machinelearning.apple.com/

    • jcagalawan 4 days ago

      They are still publishing in other venues, like this pretty recent CVPR repo and paper.

      https://github.com/apple/ml-quant

      • jldugger 4 days ago

        IIRC, that blog is paired with conference proceedings and COVID has thrown a lot of that in the air.

        • DonaldPShimoda 4 days ago

          I'm not in ML, but in PL and HCI almost all conferences have proceeded on-schedule, just in a virtual format.

          The only exception I'm aware of is HOPL (History of Programming Languages). They still published the papers/proceedings as usual, but have postponed a physical gathering instead of meeting virtually because the conference convenes only once every 10-15 years.

        • machello13 4 days ago

          Huh? Their posting was never consistent. They had a gap from December 2018 to June 2019. Their last post was in December 2019. It's likely they just don't post in the months leading up to WWDC.

        • singhrac 4 days ago

          I think it's still wild that neither TensorFlow nor PyTorch works on Apple's MBP GPUs: AMD can't run ROCm on anything but Linux, and NVIDIA drivers aren't supported if you want to use an external GPU.

          • grej 4 days ago

            This, combined with Microsoft's roadmap for a WSL that works with CUDA GPUs, is going to cost Apple a lot of ML/AI/HPC developer mindshare. Yes, we do a lot of our work on remote machines, but it's not always the most convenient way to experiment. I doubt my next machine will be a MacBook.

            • ypcx 4 days ago

              There seems to be ongoing work on Vulkan Compute support for TensorFlow. But the mlir repo moved at the end of 2019, and I don't see where (or whether) the discussion and PR continued, because the new repo doesn't even use GitHub Issues.

                https://github.com/tensorflow/mlir/issues/60
                https://github.com/tensorflow/mlir/pull/118
                https://github.com/tensorflow/mlir/
            • teruakohatu 4 days ago

              > For example, the camera on the iPhone is different than the camera on the iPad, so you may want to create two versions of a model and send one to iPhone users of the app and the other to iPad users.

              Are app developers shipping models that are so brittle they cannot handle a different revision of Apple's camera?

              I can understand shipping more complex models for devices with better CPU/GPU or whatever Apple's AI accelerator is called, but not different cameras!

              • janhenr 4 days ago

                There might be other reasons for shipping different models for the iPad vs. the iPhone. For example, if the iPad is more often used indoors than outdoors, you could ship a version of your big CNN fine-tuned to that smaller set of classes.

                • Ar-Curunir 4 days ago

                  Another reason is that the more powerful iPad processor can handle larger networks

                • julvo 4 days ago

                  Depends on the model. E.g. image enhancement or super resolution models are sensitive to the camera model and can be trained to fix artifacts introduced by specific cameras.

                  • tomaskafka 4 days ago

                    The SE has one camera, the 11 has two, and the iPad has two plus LiDAR ...

                    • Enginerrrd 4 days ago

                      Fairly precise camera calibrations remain important in photogrammetry applications.

                      • m463 4 days ago

                        Models pick up on really weird things in the training data.

                        Someone was telling me about an ML cancer-detection model that was unexpectedly training on the ruler found in most images of cancer.

                        I can see how models trained on images from one sensor could inadvertently optimize for sensor size or geometry.

                        • sillysaurusx 4 days ago

                          Image augmentations are hard to add to training. They may seem easy, but they require a lot of thought.

                          (To back up a bit: image augmentations are how you solve the problem of "how do I make my model robust across different cameras?" It might be tempting to gather labeled data from a variety of cameras, but that doesn't necessarily result in a model that can handle newer, higher-resolution cameras. So one solution is to distort the training data with augmentations so that the model can't tell which resolution the input images came from.)

                          The other way to deal with it is to just downscale the camera's image to, say, 416x416. But that introduces a question: can different cameras give images that look different when downscaled to 416x416? Sure they can! Cameras have a dizzying array of features, and they perform differently in different lighting conditions.
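
                          For example, a rough TensorFlow sketch of that preprocessing (the function name is mine, not from any particular codebase):

                            import tensorflow as tf

                            def preprocess(image):
                                # assumes a float32 HWC image with values in [0, 1]
                                # fixed input resolution, whatever camera produced the photo
                                image = tf.image.resize(image, [416, 416])
                                # re-encode at a random JPEG quality to wash out compression artifacts
                                image = tf.image.random_jpeg_quality(image, 60, 100)
                                # jitter hue and brightness so the model can't key on a camera's color response
                                image = tf.image.random_hue(image, max_delta=0.08)
                                image = tf.image.random_brightness(image, max_delta=0.2)
                                return tf.clip_by_value(image, 0.0, 1.0)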

                          To return to the point about image augmentations being hard to add: it's easy to describe what your training code should do ("just distort the hue a bit"), and there seem to be operations explicitly for that: https://www.tensorflow.org/api_docs/python/tf/image/adjust_h... But when you go to train with them, you'll discover that backpropagation isn't implemented, i.e. they break in training code.

                          I've been trying to build a TensorFlow equivalent of Kornia (https://github.com/kornia/kornia), a wonderful library that implements image augmentations using nothing but differentiable primitives. Work is a bit slow, but I hope to release it in Mel (https://github.com/shawwn/mel), which will hopefully look less like a TODO soon.
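
                          To give a feel for what "differentiable" buys you, here is a tiny Kornia snippet (illustrative only; check the repo for the current API):

                            import torch
                            import kornia

                            # a batch we want gradients for, e.g. the output of a generator
                            images = torch.rand(4, 3, 64, 64, requires_grad=True)

                            # color jitter built from plain tensor ops, so it stays differentiable
                            jitter = kornia.augmentation.ColorJitter(
                                brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1)

                            out = jitter(images)
                            out.mean().backward()
                            print(images.grad is not None)  # True: gradients flow through the augmentation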

                          But all of this still raises the question of which augmentations to add. Work in this area is ongoing; see Gwern's excellent writeup at https://github.com/tensorfork/tensorfork/issues/35

                          Training a model per camera isn't necessarily a terrible idea, either. In the future I predict that we'll see more and more "on-demand" models: models that are JIT optimized for a target configuration (in this case, a specific camera).

                          Robustness often comes at the cost of quality / accuracy (https://arxiv.org/abs/2006.14536 recently highlighted this). In situations where that last 2% of accuracy is crucial, there are all kinds of tricks; training separate models is but one of many.

                          • perturbation 4 days ago

                            > To return to the point about image augmentations being hard to add: it's easy to describe what your training code should do ("just distort the hue a bit"), and there seem to be operations explicitly for that: https://www.tensorflow.org/api_docs/python/tf/image/adjust_h.... But when you go to train with them, you'll discover that backpropagation isn't implemented, i.e. they break in training code.

                            Why not do the data augmentation during preprocessing, so that the transformations don't have to be differentiable? I.e., map over a tf.Dataset with the transformation (and append the result to the original dataset).
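
                            Something like this, where dataset is a placeholder tf.data.Dataset of (image, label) pairs:

                              import tensorflow as tf

                              def jitter(image, label):
                                  # runs in the input pipeline, so nothing here needs gradients
                                  image = tf.image.random_hue(image, max_delta=0.08)
                                  image = tf.image.random_brightness(image, max_delta=0.2)
                                  return image, label

                              augmented = dataset.map(
                                  jitter, num_parallel_calls=tf.data.experimental.AUTOTUNE)
                              dataset = dataset.concatenate(augmented)  # append to the original dataset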

                            • spott 4 days ago

                              Why are you trying to backpropagate through data augmentations? I've never done that (or heard of it being done). Usually I just apply the augmentations to the input samples and then feed the augmented samples to the network.

                              Differentiable augmentations aren't necessary unless the augmentations are midstream (so you have to propagate gradients through them to parameters upstream, which is weird) or have learnable parameters (at which point you aren't learning how to work on different views of the same sample; you are learning how to modify a sample to be more learnable, which is a different problem from the one you are trying to solve).

                              Don't get me wrong, augmenting samples to reduce device bias is a hard problem, but you might be making it harder than it needs to be.

                              • gwern 4 days ago

                                The data augmentations we are interested in are in fact 'midstream', as they augment the examples before they pass into the D or the classification loss, but you must backprop from there back through the augmentation into the original model, because you don't want the augmentations to 'leak': the G is not supposed to generate augmented samples; the augmentation is there to regularize the D and reduce its ability to memorize real datapoints. It would probably be better to consider them as a kind of consistency or metric loss along the lines of SimCLR (which has helped inspire these very new GAN data augmentation techniques). It's a bit weird, which is perhaps why, despite its simplicity (indicated by no fewer than 4 simultaneous inventions of it in the past few months), it hasn't been done before. You really should read the linked GitHub thread if you are interested.
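
                                Schematically, it looks something like this (a toy TensorFlow sketch; the stand-in G/D and the WGAN-style loss are placeholders, not anyone's actual training code):

                                  import tensorflow as tf

                                  # toy stand-ins just to make the sketch runnable
                                  generator = tf.keras.Sequential([
                                      tf.keras.layers.Dense(32 * 32 * 3),
                                      tf.keras.layers.Reshape((32, 32, 3))])
                                  discriminator = tf.keras.Sequential([
                                      tf.keras.layers.Flatten(),
                                      tf.keras.layers.Dense(1)])

                                  def diff_augment(x):
                                      # differentiable ops only, applied to BOTH real and fake
                                      # images right before they reach D
                                      x = x + tf.random.uniform(
                                          [tf.shape(x)[0], 1, 1, 1], -0.2, 0.2)  # brightness jitter
                                      x = tf.image.random_flip_left_right(x)
                                      return x

                                  z = tf.random.normal([8, 128])
                                  real = tf.random.uniform([8, 32, 32, 3])

                                  with tf.GradientTape() as tape:
                                      fake = generator(z, training=True)
                                      # G's gradient flows back through diff_augment, so the
                                      # augmentations never become part of what G learns to emit
                                      g_loss = -tf.reduce_mean(
                                          discriminator(diff_augment(fake), training=True))
                                  g_grads = tape.gradient(g_loss, generator.trainable_variables)
                                  # the D update would likewise see diff_augment(real) and
                                  # diff_augment(fake) rather than the raw images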

                                • spott 4 days ago

                                  Ah! I can see that in a GAN architecture. That makes much more sense.

                                  It wasn't clear from your original post that you were augmenting generated images, not real data.

                                  • gwern 3 days ago

                                    You're augmenting the real data too.

                              • gwern 4 days ago

                                > Training a model per camera isn't necessarily a terrible idea, either. In the future I predict that we'll see more and more "on-demand" models: models that are JIT optimized for a target configuration (in this case, a specific camera).

                                Meta-learning, or perhaps learning camera embeddings to condition on, would be one way, although that might all be implicit if you use a deep enough NN and train on a sufficiently diverse corpus of phones+photos.
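
                                The embedding version might look roughly like this in Keras (the constants and the tiny backbone are placeholders):

                                  import tensorflow as tf
                                  from tensorflow.keras import layers

                                  NUM_CAMERAS = 12  # placeholder: number of distinct camera modules
                                  NUM_CLASSES = 10  # placeholder: task classes

                                  image_in = tf.keras.Input(shape=(224, 224, 3))
                                  camera_in = tf.keras.Input(shape=(), dtype="int32")  # which camera took the photo

                                  # tiny stand-in backbone; a real model would be much deeper
                                  x = layers.Conv2D(32, 3, activation="relu")(image_in)
                                  x = layers.GlobalAveragePooling2D()(x)

                                  # learned per-camera embedding, concatenated with the image features
                                  cam = layers.Embedding(NUM_CAMERAS, 8)(camera_in)
                                  cam = layers.Flatten()(cam)

                                  out = layers.Dense(NUM_CLASSES, activation="softmax")(
                                      layers.Concatenate()([x, cam]))
                                  model = tf.keras.Model([image_in, camera_in], out)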

                            • ur-whale 4 days ago

                              Apple's secretive engineering culture is complete anathema to the ML world and to what the likes of DeepMind, OpenAI, and Google AI are doing in terms of sharing.

                              This is IMO very visible in the "output" Apple has produced in the ML space: mostly infrastructure, and very little in the way of innovative tech and research.

                              And changing culture is a very hard proposition from a management perspective, unless you build a completely independent, skunkworks-like entity within the mothership.

                              • elpakal 4 days ago

                                I hope Apple makes a CoreML -> Keras (or other model format) converter. That would make it much more appealing for me to use their GUIs and buy a Mac.
