We've set up a Jitsi installation, but our stakeholders are wary about the limits on the number of participants -- they can't really do department meetings, let alone an all-hands type of meeting with ~1000 people on the call. Also, some parts of their org are invested in the Zoom usage/meeting stats too, so that is another piece we'd have to figure out. We've also had mixed results testing with team members far away on rough connections, where audio/video was workable in Zoom but sometimes unworkable with our internal setup.
Has anyone scaled up to 1000 and beyond with any open source solution? Any stories of success or failure? Are there other options to consider than Jitsi? Why is there no one crushing it like GitLab or Mattermost in this space?
- Like 90% of the benefit of Zoom (and other large video & voice chat providers) is their backend infrastructure. You cannot justify replicating that for a single-customer or small-group installation. It'd be far too expensive. So your open-source solution is ~always going to be worse. ("We've also had mixed results testing with team members far away on rough connections, where audio/video was workable in Zoom but sometimes unworkable with our internal setup.")
- "they can't really do department meetings, let alone an all-hands type of meeting with ~1000 people on the call." Those should be broadcast as one-way streaming video, anyway. Set it up like a live video "podcast" or Twitch stream, with text chat for questions from the audience if needed.
But, they just want it to all be the same platform, not require tech-averse execs to learn new tools, not have to work out the logistics of who needs to be on the "real" call each time, keep emoji reactions in the moment, etc. They do these town-hall sessions where they want to be able to have a back-and-forth with employees asking questions.
At the end of the day, it's a tough proposition to say, "you can either have the features you've come to rely on, or you can have your privacy and control". But I guess that is where we are.
Just as an example, they probably use UDP for latency, which has no ordering as part of the protocol. However, for streaming stuff, you really want to drop out of order packets and prioritize whatever the latest packet is.
Zoom probably has some kind of custom routing that handles that. If they get congested or clients are unstable, it prioritizes serving the latest audio/video rather than trying to push things out in the order they came in, only for the Zoom client to discard the out-of-order packets anyway. Amazon can't really do that, because they don't know which bits of the data carry the ordering information.
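To make the "drop out-of-order packets, keep the latest" idea concrete, here is a minimal sketch of the receiving side. It is not how Zoom does it (nobody in this thread knows that); it just assumes every packet carries a sender-assigned sequence number. The `Packet` and `LatestFirstReceiver` names are made up for illustration, and real protocols such as RTP also have to handle sequence-number wraparound and jitter buffering, which this ignores.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Packet:
    seq: int       # sender-assigned sequence number (assumed to exist)
    payload: str   # stand-in for an encoded audio/video frame


class LatestFirstReceiver:
    """Plays packets as they arrive and discards any that show up late."""

    def __init__(self) -> None:
        self.newest_seq: Optional[int] = None
        self.dropped = 0

    def accept(self, packet: Packet) -> Optional[str]:
        # Anything not newer than what was already played is stale: drop it
        # instead of rewinding the stream.
        if self.newest_seq is not None and packet.seq <= self.newest_seq:
            self.dropped += 1
            return None
        self.newest_seq = packet.seq
        return packet.payload


def play(arrivals: List[Packet]) -> Tuple[List[str], int]:
    """Feed packets in arrival order; return what was played and the drop count."""
    receiver = LatestFirstReceiver()
    played = []
    for packet in arrivals:
        frame = receiver.accept(packet)
        if frame is not None:
            played.append(frame)
    return played, receiver.dropped


# Packet 3 arrives after packet 4, so it is discarded rather than played late.
arrivals = [Packet(s, f"f{s}") for s in (1, 2, 4, 3, 5)]
played, dropped = play(arrivals)
print(played, dropped)  # ['f1', 'f2', 'f4', 'f5'] 1
```

The point about a generic network in the middle still stands: it only sees opaque UDP datagrams, so it cannot make this decision on the receiver's behalf unless it understands where the sequence number lives.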
On the pricing side, people are probably the component that scales the least. You're probably going to need a dedicated person to monitor/fix/upgrade it, and deal with calls from people who have issues. Between salary and benefits, that's at least a 6 figure expense. Zoom's most expensive tier is $240/license/year, so $100,000 would buy you 416 licenses. And $100,000 is a lowball here; that's roundabouts a $60k salary, and the other $40k is paid out in benefits/space/etc. This also doesn't count the bandwidth charges, nor the hardware to run the system on.
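The arithmetic is easy to redo with your own numbers. A rough sketch, using only the two figures quoted above ($240/license/year and $100,000/year per dedicated person); the function names are just for illustration, and bandwidth and hardware are left out, as they are in the estimate itself:

```python
ZOOM_PRICE_PER_LICENSE_PER_YEAR = 240   # most expensive tier, per the figure above
STAFF_COST_PER_YEAR = 100_000           # salary plus benefits/space, per the estimate above


def self_hosting_staff_cost(staff_count: int,
                            cost_per_person: int = STAFF_COST_PER_YEAR) -> int:
    """Yearly people cost of running the self-hosted system."""
    return staff_count * cost_per_person


def licenses_for_budget(budget: int,
                        price_per_license: int = ZOOM_PRICE_PER_LICENSE_PER_YEAR) -> int:
    """How many whole yearly licenses the same budget would buy."""
    return budget // price_per_license


one_admin = licenses_for_budget(self_hosting_staff_cost(1))
two_admins = licenses_for_budget(self_hosting_staff_cost(2))
print(one_admin)   # 416
print(two_admins)  # 833
```

So one dedicated person costs about as much as 416 licenses, and the "hire two so you're not stuck when one quits" option costs about as much as 833.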
Imo, it's also just generally a bad idea to DIY anything business-critical that isn't a core competency of the company. It tends to cause issues. What do you do when the person that built this bespoke Zoom alternative quits? Can you risk it breaking and not working until you can hire someone new and they figure out how it works? You could hire 2 of them, but Zoom is definitely cheaper then. What do you do if the open source project gets abandoned? It also distracts from things that actually make you money. It makes more sense to me to focus on your core line of business, and use the additional proceeds to just pay for Zoom.
As the grandparent mentioned, the problem is the client side. Since there is no standard videoconferencing protocol, every free software project needs to develop their own clients. And it's difficult for a free software project to have the manpower and expertise to develop quality clients for the web, Android and iOS, so in effect what we currently have are mostly half-baked web clients.
There is some hope, though. The IETF have been working on standard protocols for ingress (https://datatracker.ietf.org/wg/wish/), and if their protocols get deployed, you'll be able to use the same streaming software (think OBS) or IP camera with multiple distinct videoconferencing servers. An interoperable interactive videoconferencing protocol is nowhere near, but as more people understand videoconferencing technology, there is some hope that people will get together and start working on multi-protocol clients (remember Pidgin?).
Full disclosure: I'm the author of Galene (https://galene.org), and I've been actively participating in the Pion community (https://github.com/pion/webrtc) and collaborating with the authors of ion-sfu (https://github.com/pion/ion-sfu) and LiveKit (https://github.com/livekit).
It is targeting the virtual classroom market, but it does general videoconferencing just right.
You can do sub-second latencies with DASH and HLS. Not sub-second enough to do pleasant two-way communication, but literally an order of magnitude better than 10 seconds!
https://www.wowza.com/
Consider this: the "host" creates a PeerTube livestreaming link and sets it as unlisted. That way only people with the link can watch the stream. Sure, it won't have any moderation, but is that even possible in a 1000-person Zoom meeting, beyond controlling who can open the mic or share their screen?
Our org switched from Google Meet to their streaming equivalent once we hit around 250 people; nothing worked well with that many participants.
Lucky for your org, streaming has been a solved problem since the 90s! It looks like Jitsi might support streaming by itself, or you can find online tutorials on Nginx + Varnish RTMP streaming and DIY if you prefer.
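For the DIY route, the publishing side is usually just an encoder pushing to an RTMP ingest URL. A hedged sketch, assuming you have `ffmpeg` built with libx264 and some RTMP-capable server already listening; the hostname, application name and stream key below are placeholders, not anything from a real setup. The function only builds the argument list, so you can inspect it before running anything:

```python
from typing import List


def rtmp_push_command(source: str, ingest_url: str) -> List[str]:
    """Build an ffmpeg argument list that pushes `source` to an RTMP ingest URL."""
    return [
        "ffmpeg",
        "-re",                  # read the input at its native rate, like a live source
        "-i", source,
        "-c:v", "libx264",      # H.264 video
        "-preset", "veryfast",  # trade compression efficiency for encoding speed
        "-c:a", "aac",          # AAC audio
        "-f", "flv",            # RTMP carries FLV-muxed streams
        ingest_url,
    ]


# Placeholder file and URL; substitute your own.
cmd = rtmp_push_command("townhall.mp4", "rtmp://stream.example.internal/live/townhall")
print(" ".join(cmd))
```

To actually start the stream you would hand `cmd` to `subprocess.run(cmd, check=True)`. Viewers then watch whatever the server repackages it into (typically HLS or DASH), with text chat on the side for questions.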
To answer your literal question, there is no enterprise-grade open-source Zoom alternative because it's incredibly hard to make work for free. You've seen just how hard it is at N=1000. Adding in self-hosted STUN or TURN, or building your own proxy and video muxer, is only going to be harder. There's a reason YouTube, Twitch, Facebook, Instagram, Vimeo, etc. are good at it: they're zillion-dollar companies that can afford to invest in the massive infrastructure and engineering required to make it not-terrible.
"Chat" services of various kinds have been around on the internet since the 1990s, the one thing that they all have in common is that they burn out. For instance there was CUSeeMe, Paltalk, Tivejo, AOL Instant Messenger, ICQ, Skype, WebEx, Facebook Messenger, and many others. Google alone has made 11.
Some of them are still around, but there's a general pattern that a company does the work to create an application but doesn't do the work to keep it up. Then some new thing comes along, gets popular for a while, then deteriorates.
Unless Zoom breaks the pattern, a few years from now there will be something new and people will be saying "It's like the way Zoom used to be back when Zoom worked".
I'd say open source alternatives would run into the same problem as commercial chat apps.
In most cases, I see little real added value in a live video feed. What I find more generally useful is simple desktop/screen sharing with audio (e.g. GoToMeeting, TeamViewer).
If you must see the participants, allow a photo/avatar. Watching real lips moving in a stuttered video is quite chic and nouveau and all ... it's nice with family ... but it doesn't really add much value in a typical business setting IMO.
As with most things, there are exceptions but I use desktop sharing way more often.
For larger meetings I can understand the fatigue, but I absolutely love video chat for smaller meetings. I can easily tell when someone is distracted or might not have understood what I was saying. I've stopped talking mid-sentence before because I could tell my boss/coworker had just got a message or heard something off-screen that they needed to address right away, whereas without video I would not have known. I would have just kept talking until they finally said "Hold on a second". With video I know almost right when it happens, and I don't have to repeat myself or try to figure out how much to repeat.
Meetings are essential. Seeing people face to face is essential. I had lunch with a teammate recently and it bridges so many gaps you can't do with Zoom. And Zoom is orders of magnitude better than Slack or email.
If you don't think that face-to-face video meetings are important, then you don't understand human interaction.
I'm fiddling now and then on an alternative conferencing frontend (Pyrite - https://github.com/garage44/pyrite) for Galene (https://galene.org), which is an SFU that uses Pion.
- Webex supposedly does 1000 people.
- Google Meet: No, only up to 500 with "business plus".
- MS Teams: Confusing numbers: 1000 for chat only, but 20(?) for audio/video.
- Bluejeans: 200 for enterprise; 500 "view-only"
- Anything else?
So the first answer that comes to mind is that there are no alternatives at all for 1000 participants. If you only need ~200 there are many more options.
Maybe there is a systemic lesson here.
https://youtu.be/W-PPPGy49kc