6 comments

  • areoform 115 days ago

    > The company said the users who were affected chose the option in Facebook’s Messenger app to have their voice chats transcribed. The contractors were checking whether Facebook’s artificial intelligence correctly interpreted the messages, which were anonymized.

    Where exactly is this setting? I've looked through Facebook's settings and Messenger's settings, but this option is rarer than a cheap white truffle. Does anyone know?

  • minimaxir 115 days ago

    Unlike previous articles about tech-companies-listening-to-user-audio, this is over voice transcription rather than smart speaker QA.

    Facebook does have a smart speaker (Portal) with voice commands (https://portal.facebook.com/help/2149102838698668/) but that isn't mentioned in the article.

    • cameronbrown 115 days ago

      Paywall Workaround: https://outline.com/pyzYAB

      • pmantas 115 days ago

        Am I the only one who sees a problem with them actively working to convert a non-indexable data source into an indexable and searchable one?

        • kerng 115 days ago

          Is it from WhatsApp audio conversations? Or just random recordings through the apps - which Zuckerberg denied before?

          • Calvin02 115 days ago

            RTFA - "The company said the users who were affected chose the option in Facebook’s Messenger app to have their voice chats transcribed. The contractors were checking whether Facebook’s artificial intelligence correctly interpreted the messages, which were anonymized."

            • dekhn 114 days ago

              This sounds totally reasonable to me? Low quality machine learning algorithm needs human labellers?

              • vokep 113 days ago

                But why sample on real-world data from non-employees?

                • dekhn 112 days ago

                  Huh? Because that's the product you're trying to improve!

          • sgt101 115 days ago

            Is it fair to wonder why they are using people when automagical AI transcription should do this for them, like the man from Google/DeepMind/Amazon/IBM/Microsoft said? Or is FAIR really not up to much?

            • ipsum2 115 days ago

              Not exactly sure what you're asking, but all tech companies hire people to transcribe audio precisely to gather data to train ML models to do transcription.

              • derefr 115 days ago

                Is there a reason that these ML models are being hoarded as "secret sauce" when, for these companies, all the rivals they're concerned about also have all the resources required to build one that's nearly as good? It feels strange that we've got six different tech giants that have all independently spent tons of capital building up the training data required to sell people smart speakers/mobile speech control/etc. with these ML models, without any of them entering into cross-licensing agreements.

                It seems like it'd make a lot more sense for Apple, Google, Amazon, Facebook, etc. to all pool their training data in an "industry working group" to build and license out one "best" model, the way that IWGs are formed to build and license out e.g. AV codecs.

                • Calvin02 115 days ago

                  > "to all pool their training data"

                  The press would skewer them alive and politicians would have a field day about tech companies violating privacy and sharing data.

                  • Smithalicious 115 days ago

                    It's bad enough that one BigCorp has my data. I'd rather not have them also give it out to every other BigCorp

                    • etaioinshrdlu 115 days ago

                      ML is an extremely competitive field right now and everyone's trying to get an advantage over everyone else. Not too ripe for cooperation right now.

                      • solarkraft 115 days ago

                        It's the same reason car makers don't all use the same platform. Everyone is hoping to get a slight edge over the others to perform better in the market.

                      • sgt101 115 days ago

                        Hang on, everyone's been at this for years now - are we seriously saying that Facebook et al. don't have large training sets for speech transcription? Why are they still labelling this? Why do they need 100s of contractors?

                        I can see that a couple of folks might be engaged in carefully reviewing low-confidence transcription events, but 100s?