And I'm sure iOS Safari will still keep annoying restrictions on using it. I 100% understand the reasoning behind not allowing sound to play unless it's "user initiated", but it's really frustrating how small that window is and how you can't ask for that "permission". I have a web app that uses the camera for scanning (don't get me started on how Chrome/FF/etc. on iOS can't use the camera for streaming), and in Safari I want to play a little beep noise (toggleable on/off) after a successful scan, but that's impossible.
I just hacked together a workaround for this very thing myself. You can call play() in a click handler but immediately pause(), and then call play() again on the same Audio element later when your scanner succeeds.
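A minimal sketch of that trick, assuming the page already has a button that starts scanning (the helper name, the asset path, and `startButton`/`startScanner`/`onScanSuccess` are all placeholders of mine):

```javascript
// Sketch of the unlock trick: call play() inside a real user gesture,
// pause immediately, then reuse the same element later outside a gesture.
function unlockAudio(audio) {
  const p = audio.play();                 // allowed: we're inside a gesture
  if (p && typeof p.then === 'function') {
    p.then(() => audio.pause()).catch(() => {});
  } else {
    audio.pause();                        // older browsers: play() returns undefined
  }
  return audio;
}

// In the page (placeholder names, hypothetical asset path):
// const beep = new Audio('beep.mp3');
// startButton.addEventListener('click', () => {
//   unlockAudio(beep);                   // prime it inside the click handler
//   startScanner();
// });
// function onScanSuccess() {
//   beep.currentTime = 0;
//   beep.play();                         // now works outside a gesture
// }
```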
The frustrating and unnecessary thing is the trend to completely intertwine permissions policy with the API.
If I can't play a sound, just make it a browser feature -- let the web page call the APIs as normal, mute the tab, and have the browser notify the user.
Instead, responsibility is pushed from the browser developers (small in number) to every single web page developer, who has to deal with rejections, prompt users, retry the request, and cover a bunch of edge cases they'll mostly never see.
The policy ends up frozen in time around the needs of, e.g., 'desktop' and won't be able to adapt in the future.
There's also the inevitable concern around fingerprinting a user based on the pattern or timing of rejected events.
Not sure about the Web Audio spec specifically, but there are certainly places where media specs are basically just a reflection of "what Safari does" to work its way around corners it's boxed itself into.
Which, to the parent's point, makes the restriction itself pointless? It also seems like the kind of "workaround" that will one day be patched, and suddenly your app stops working.
- try not to obtain and initialize the audio context until the triggering event has occurred
- make sure the audio context + playback occurs directly as a result of the triggering event. If the event just sets some state, and then something else is periodically watching for that state to change, it may fail. A workaround here is to just mute+play some sound on that first tap, then you're good for any later Audio contexts created+initiated however you'd like.
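A sketch of those two tips together, assuming a page context (the factory argument exists only so the snippet can run outside a browser; `button` is a placeholder):

```javascript
// Lazily create/resume the AudioContext. Crucially, this must be called
// *synchronously inside* the gesture handler -- not from a timer or poller
// that later notices a flag the handler set.
let ctx = null;
function getContext(factory = () => new AudioContext()) {
  if (!ctx) ctx = factory();              // create only after the gesture
  if (ctx.state === 'suspended') ctx.resume();
  return ctx;
}

// In the page (placeholder element name):
// button.addEventListener('click', () => {
//   const audio = getContext();          // directly in the handler
//   const osc = audio.createOscillator();
//   osc.connect(audio.destination);
//   osc.start();
//   osc.stop(audio.currentTime + 0.1);   // short confirmation tone
// });
```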
Brings back memories of a web app I worked on (SpeakerBlast) that turned audio devices into one synced stereo system. We used the silent-file trick and had each audience member check into SpeakerBlast (we also used that to count and display the number of connected speakers). We always had to find workarounds after Apple changed things up.
I really wish you could access audio data across origins. Preventing that access seems to me like it's mostly just nice for YouTube, Spotify, SoundCloud et al., so others can't do things with their audio/video data. Why should my browser care to help them like that?
For example, I have built a few music visualizers and have had to run a local youtube-dl server that disables CORS, just to be able to visualize the music for a YouTube video. I just want to draw patterns on my screen while some music plays; I'd even be fine with the ads! But I have to engage in some form of piracy to do this.
The music is already playing through my speakers, so I should be able to access that data!
(Edit: to clarify, I know that one can set CORS options on audio elements to enable cross-origin requests, but it requires the server to allow you to -- which is what I'm doing with the youtube-dl server. That is the real problem: CORS is being used by YouTube for copyright protection, not for user safety. I know that accessing a YouTube video across origins from a page I wrote myself is safe, but CORS enables YouTube to stop me from accessing that data, in the name of safety.)
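For concreteness, this is the element-level side of that (a sketch; the URL is a placeholder, and it only yields readable sample data if the server actually sends `Access-Control-Allow-Origin` for your origin):

```javascript
// Configure an <audio> element for a CORS-mode fetch. Playback works
// either way, but analysis (e.g. via an AnalyserNode) only sees real
// samples if the server opts in; otherwise the source is treated as
// tainted and outputs silence to the graph.
function corsAudio(url, AudioCtor = Audio) {
  const el = new AudioCtor();
  el.crossOrigin = 'anonymous';   // request the CORS-mode fetch
  el.src = url;
  return el;
}

// In the page (placeholder URL, e.g. the local youtube-dl proxy):
// const el = corsAudio('http://localhost:8080/track.mp3');
// const ctx = new AudioContext();
// const src = ctx.createMediaElementSource(el);
// const analyser = ctx.createAnalyser();
// src.connect(analyser).connect(ctx.destination);
// const bins = new Uint8Array(analyser.frequencyBinCount);
// analyser.getByteFrequencyData(bins);  // all zeros if CORS was refused
```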
I’d rather not allow other tabs to listen in on my meetings in google meet, zoom or teams. How would this work securely? Seems like it would need to be opt-in for each site as well as requiring user consent.
Those things should already require authorization to access their streams, and your credentials for e.g. meet.google.com shouldn't be available to send along from tabs/windows at other domains. Still, it would likely be reasonable to default to preventing access to/from domains unless they are marked as safe, in much the same way accesses are already requested/allowed.
I acknowledge that a feature like that would take real work to write and support and keep secure, but control over access to data should be the user's choice, not the server's.
You can from an extension, iirc – so it shouldn't be too hard to use an extension to smuggle the stream between the contexts without needing to copy the data.
Yes! I am using the WaveSurfer [0] library to visualize audio waveforms for a project, which uses this. It works, but Chrome always says 'ScriptProcessorNode is deprecated, use AudioWorkletNode instead'.
Worklets run off the main thread. That is good for a lot of what the script processors are used for (compute-intensive stuff), but having a script node on the main thread is the only way I know to get perfect timing. Plus, ScriptProcessorNode is just very easy to use for simple scripting, compared to the worklets.
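For contrast, here is roughly what the worklet side looks like: a passthrough sketch (the processor name and file name are my own, and the base-class fallback exists only so the file also parses outside a worklet scope):

```javascript
// passthrough-processor.js -- loaded via audioWorklet.addModule().
// In a real worklet scope, AudioWorkletProcessor and registerProcessor
// are globals; the fallback stub lets this file run elsewhere too.
const Base = globalThis.AudioWorkletProcessor || class {};

class PassthroughProcessor extends Base {
  // Called on the audio rendering thread, one 128-frame block at a time.
  process(inputs, outputs) {
    const input = inputs[0];
    const output = outputs[0];
    for (let ch = 0; ch < input.length; ch++) {
      output[ch].set(input[ch]);   // copy samples straight through
    }
    return true;                   // keep the node alive
  }
}

if (globalThis.registerProcessor) {
  registerProcessor('passthrough', PassthroughProcessor);
}

// On the main thread (placeholder file name and source node):
// const ctx = new AudioContext();
// await ctx.audioWorklet.addModule('passthrough-processor.js');
// const node = new AudioWorkletNode(ctx, 'passthrough');
// sourceNode.connect(node).connect(ctx.destination);
```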
The API is already available in all the major browsers, so nothing really changes with this announcement. The API itself lets you generate and manipulate audio data in a way that was impossible with just the HTML Audio tag.
Most of the Web Audio stuff has already been present in all major browsers for a little while; this mostly standardises what's already there. The main thing this brings that until now was just a feature of Chrome is the AudioWorklet, so real-time low-level audio processing in JS worklets will work cross-browser when it's implemented in the other browsers. It's very difficult to implement low-level audio processing off the main thread in non-Chromium browsers at the moment.
AudioWorklet (which allows you to work with audio sample-by-sample in a dedicated, high priority "thread") is available and works well in Firefox and the latest Safari. I haven't tried it in Edge, but I believe it's also working well.
This is true for many web features, including many basic PWA features like web push notifications, url capture, etc...
As a mobile-first web developer, iOS is the thing that's really holding us back.
The very frustrating thing about this is that the end user tends to blame the app, and there's not a good way to communicate to them "hey, we would totally do this, but mobile Safari has limited functionality and Apple won't allow other browsers to be installed"
I'm not sure it does make any practical difference, but I could be wrong. But as someone who has been a little bit involved in the process since 2010 it's an enjoyable personal milestone. And I'm really happy for all of the wonderful people I've met along the way who contributed way more than I did.
The W3C usually adds new specs years after major browser vendors implement them. A browser being created after that time would be expected to be compliant with W3C specs.
[0] https://github.com/katspaugh/wavesurfer.js
Does that work in practice? You can 'expect' all you want if they don't choose to implement it.