Ask HN: Why is there no enterprise grade open-source zoom alternative?

We have a client who is looking to replace zoom in their org. They like to self-host everything, and are wary of using outside services. We help them run their own email, and over the years we have helped them set up GitLab, Mattermost, etc., which they are happy enough with.

We've set up a Jitsi installation but our stakeholders are wary about the limits on number of participants -- they can't really do department meetings, let alone an all-hands type of meeting with ~1000 people on the call. Also, some parts of their org are invested in the zoom usage/meeting stats too, and so that is another piece we'd have to figure out. We've also had mixed results testing with team members far away on rough connections, where audio/video was workable in zoom but sometimes unworkable with our internal setup.

Has anyone scaled up to 1000 and beyond with any open source solution? Any stories of success or failure? Are there other options to consider than Jitsi? Why is there no one crushing it like GitLab or Mattermost in this space?

41 points | by montroser 702 days ago

13 comments

  • corrral 702 days ago
    - You need 3 clients, minimum—Web, iOS, Android—or you're not an alternative. None of them can be way different from or far worse than the others.

    - Like 90% of the benefit of Zoom (and other large video & voice chat providers) is their backend infrastructure. You cannot justify replicating that for a small single-customer or small group installation. It'd be far too expensive. So your open-source solution is ~always going to be worse. ("We've also had mixed results testing with team members far away on rough connections, where audio/video was workable in zoom but sometimes unworkable with our internal setup.")

    - "they can't really do department meetings or let alone an all-hands type of meeting with ~1000 people on the call." Those should be broadcast as one-way streaming video, anyway. Set it up like a live video "podcast" or twitch stream with text chat for questions from the audience, if needed.

    • montroser 702 days ago
      We've talked about broadcast calls with text chat for Q&A, and it could probably be made to work...

      But, they just want it to all be the same platform, not require tech-averse execs to learn new tools, not have to work out the logistics of who needs to be on the "real" call each time, emoji reactions in the moment, etc. They do these town-hall sessions where they want to be able to have back-and-forth with employees asking questions.

      At the end of the day, it's a tough proposition to say, "you can either have the features you've come to rely on, or you can have your privacy and control". But I guess that is where we are.

    • djohnston 702 days ago
      I'm not familiar with the nitty gritty of video streaming infra but I'm curious, with the crazy array of services on AWS/most other cloud providers, what in particular is missing or cannot scale down cost efficiently?
      • everforward 702 days ago
        I would guess a big part is switching/routing specifically geared towards low-latency audio/video.

        Just as an example, they probably use UDP for latency, which has no ordering as part of the protocol. However, for streaming stuff, you really want to drop out of order packets and prioritize whatever the latest packet is.

        Zoom probably has some kind of custom routing that handles that. If they get congested or clients are unstable, it prioritizes serving the latest audio/video rather than trying to push things out in the order they came in, only for the Zoom client to discard the out of order packets anyways. Amazon can't really do that, because they don't know which bits of data are the ordering.
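To make the reordering point above concrete, here is a minimal Python sketch of a receiver that plays only packets newer than anything it has already seen and discards late arrivals. The `Packet` type, its `seq` field, and `LatestOnlyReceiver` are hypothetical names for illustration. This is not a description of how Zoom or any other product works, and real media stacks (RTP, for example) also deal with sequence-number wraparound and jitter buffering, which this sketch ignores.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Packet:
    seq: int          # hypothetical, monotonically increasing sequence number
    payload: bytes    # encoded audio/video data


class LatestOnlyReceiver:
    """Accepts a packet only if it is newer than every packet seen so far."""

    def __init__(self) -> None:
        self.highest_seq: Optional[int] = None
        self.dropped = 0

    def accept(self, packet: Packet) -> bool:
        """Return True if the packet should be played, False if it is stale."""
        if self.highest_seq is not None and packet.seq <= self.highest_seq:
            self.dropped += 1   # arrived late or duplicated: discard
            return False
        self.highest_seq = packet.seq
        return True


def play_order(packets: Iterable[Packet]) -> List[int]:
    """Sequence numbers that would actually be played, in arrival order."""
    receiver = LatestOnlyReceiver()
    return [p.seq for p in packets if receiver.accept(p)]


arrivals = [Packet(1, b""), Packet(3, b""), Packet(2, b""), Packet(4, b"")]
print(play_order(arrivals))  # [1, 3, 4]
```

Packet 2 arrives after packet 3 and is dropped rather than played late, which is the behaviour the comment describes for real-time media.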

        On the pricing side, people are probably the component that scales the least. You're probably going to need a dedicated person to monitor/fix/upgrade it, and deal with calls from people who have issues. Between salary and benefits, that's at least a 6 figure expense. Zoom's most expensive tier is $240/license/year, so $100,000 would buy you 416 licenses. And $100,000 is a lowball here; that's roundabouts a $60k salary, and the other $40k is paid out in benefits/space/etc. This also doesn't count the bandwidth charges, nor the hardware to run the system on.
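The license arithmetic above can be checked with a few lines of Python. The $240/license/year price and the $100,000 staffing figure come from the comment. The 1,000-seat case is an assumption based on the ~1000-person meeting size in the original question, and it assumes every attendee would need a paid license, which may not be true.

```python
def licenses_for_budget(budget_usd: int, price_per_license_usd: int) -> int:
    """Whole licenses that a yearly budget covers."""
    return budget_usd // price_per_license_usd


def yearly_license_cost(seats: int, price_per_license_usd: int) -> int:
    """Total yearly cost for a given number of seats."""
    return seats * price_per_license_usd


print(licenses_for_budget(100_000, 240))  # 416
print(yearly_license_cost(1_000, 240))    # 240000
```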

        Imo, it's also just generally a bad idea to DIY anything business-critical that isn't a core competency of the company. It tends to cause issues. What do you do when the person that built this bespoke Zoom alternative quits? Can you risk it breaking and not working until you can hire someone new and they figure out how it works? You could hire 2 of them, but Zoom is definitely cheaper then. What do you do if the open source project gets abandoned? It also distracts from things that actually make you money. It makes more sense to me to focus on your core line of business, and use the additional proceeds to just pay for Zoom.

      • jech 700 days ago
        There's nothing particularly difficult on the server side: a quality SFU should be capable of handling on the order of 400 video flows per core, and there are quite a few high-quality free software SFUs available (Janus, Jitsi, ion-sfu, livekit, Galene). To give some perspective: we're using Galene for lectures, and our single-CPU server runs at around 40% CPU usage in a room with 120 students (who keep their cameras switched off during the lecture, of course, and only occasionally switch them on to ask questions).
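As a rough back-of-the-envelope sketch of what "400 video flows per core" implies, the Python below counts flows in a single room. The counting model is an assumption, not something stated in the comment: each camera that is on counts as one inbound flow plus one outbound flow per other participant, and inbound and outbound flows are treated as equally expensive. Audio, simulcast, and screen sharing are ignored.

```python
import math

FLOWS_PER_CORE = 400  # order-of-magnitude figure quoted in the comment above


def video_flows(participants: int, publishers: int) -> int:
    """Video flows the SFU handles in one room under the assumed model."""
    if not 0 <= publishers <= participants:
        raise ValueError("publishers must be between 0 and participants")
    # Each publisher sends 1 flow in; the SFU forwards it to the other
    # (participants - 1) people: 1 + (participants - 1) = participants flows.
    return publishers * participants


def cores_needed(participants: int, publishers: int,
                 flows_per_core: int = FLOWS_PER_CORE) -> int:
    """Cores required, rounded up, under the assumed model."""
    return math.ceil(video_flows(participants, publishers) / flows_per_core)


print(cores_needed(120, 1))       # 1     (lecture, one camera on)
print(cores_needed(1000, 5))      # 13    (town hall, five cameras on)
print(cores_needed(1000, 1000))   # 2500  (everyone's camera on)
```

Under these assumptions the cost grows with the number of cameras that are on, not just the head count, which is why a large room with few active cameras stays cheap.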

        As the grandparent mentioned, the problem is the client side. Since there is no standard videoconferencing protocol, every free software project needs to develop their own clients. And it's difficult for a free software project to have the manpower and expertise to develop quality clients for the web, Android and iOS, so in effect what we currently have are mostly half-baked web clients.

        There is some hope, though. The IETF have been working on standard protocols for ingress (https://datatracker.ietf.org/wg/wish/), and if their protocols get deployed, you'll be able to use the same streaming software (think OBS) or IP camera with multiple distinct videoconferencing servers. An interoperable interactive videoconferencing protocol is nowhere near, but as more people understand videoconferencing technology, there is some hope that people will get together and start working on multi-protocol clients (remember Pidgin?).

        Full disclosure: I'm the author of Galene (https://galene.org), and I've been actively participating in the Pion community (https://github.com/pion/webrtc) and collaborating with the authors of ion-sfu (https://github.com/pion/ion-sfu) and LiveKit (https://github.com/livekit).

  • prmoustache 702 days ago
    What you are looking for is BigBlueButton: https://bigbluebutton.org/

    It is targeting the virtual classroom market but it does general videoconferencing just right.

  • mgamache 702 days ago
    note: you can't really have 1000 people on a call. It's just a live streaming event at that point. You can use standard live streaming techniques for doing this. Yes, it will have 10-20 second latency (or more), but in large calls, most people don't interact with the speaker.
    • DANK_YACHT 702 days ago
      I don't know about Zoom, but I run a video platform and you can definitely have 1000 people on a call. The caveat is that you can't consume everyone's video and audio, so consuming a new person's audio and video takes a second or two. This works fine if you have 2-20 main speakers and a few of the 1000 people "raising their hand" to talk.
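One common way to get the behaviour described above is to forward only a bounded set of recently active speakers to everyone else. The Python sketch below illustrates that idea. The `ActiveSpeakerSet` class and its method names are hypothetical, and this is not a description of the commenter's platform or of Zoom.

```python
from collections import OrderedDict
from typing import List


class ActiveSpeakerSet:
    """Tracks the most recently active speakers, up to a fixed limit."""

    def __init__(self, max_forwarded: int) -> None:
        if max_forwarded < 1:
            raise ValueError("max_forwarded must be at least 1")
        self.max_forwarded = max_forwarded
        self._speakers: "OrderedDict[str, None]" = OrderedDict()

    def on_speech(self, participant_id: str) -> None:
        """Record that a participant spoke (or was given the floor)."""
        # Move the participant to the most-recent end of the ordering.
        self._speakers.pop(participant_id, None)
        self._speakers[participant_id] = None
        # Evict the least recently active speakers beyond the limit.
        while len(self._speakers) > self.max_forwarded:
            self._speakers.popitem(last=False)

    def forwarded(self) -> List[str]:
        """Participants whose media is forwarded, oldest activity first."""
        return list(self._speakers)


room = ActiveSpeakerSet(max_forwarded=2)
for speaker in ["alice", "bob", "carol", "bob"]:
    room.on_speech(speaker)
print(room.forwarded())  # ['carol', 'bob']
```

When a new person is promoted into the set, clients have to start receiving that person's media, which is one plausible source of the "second or two" delay mentioned in the comment.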
    • lultimouomo 702 days ago
      > You can use standard live streaming techniques for doing this. Yes, it will have 10-20 second latency (or more)

      You can do sub-second latencies with DASH and HLS. Not sub-second enough to do pleasant two-way communication, but literally an order of magnitude better than 10 seconds!

      • karmakaze 702 days ago
        I used a combination of WebSockets and HLS so that speakers could be live, and listeners would get the stream via HLS. This was for an audio platform built with Twilio and home-grown HLS proxies using ffmpeg. There was a process where a person could go from being a listener to a call-in speaker--the web client had dual Twilio & HLS clients and could switch. The hosts had a dedicated iOS hosting app that always ran the Twilio APIs. They also had video APIs but we were doing internet call-in radio talk shows.
    • bombcar 702 days ago
      This is the answer. If necessary, you livestream a call run on a separate service (think: a call of 5-20 people on Zoom, where the audio and video/screenshare is piped into YouTube).
      • mgamache 702 days ago
        Right, if you want to self-host you can run something like Wowza (paid product, but it's a complete solution)

        https://www.wowza.com/

      • 2Gkashmiri 702 days ago
        at that point, livestreaming is already a pretty stable affair in the foss world. owncast and peertube both are solid players where you can do good livestreaming. heck, even pixelfed which is a sorta instagram alternative is getting livestreaming so that is a nice idea.

        consider this. the "host" creates a peertube livestreaming link and sets it as unlisted. that way only people with the link can watch the stream. sure it wont have any moderation but is that possible in a 1000 zoom meeting beyond who can open the mic or share screen?

        • bombcar 702 days ago
          I think the idea behind a 1000 person zoom meeting is to "make" everyone feel like a participant, even if the reality is that they're not, they're just watching/listening. Live-streaming makes that more clear.
  • mgamache 702 days ago
    One of the things that makes Zoom (and for-profit streaming) better is noise cancelation. Zoom has spent a lot of money perfecting video calling without requiring headsets. This is not easy. A second reason is they use some proprietary stream control and compression. It's based on standards and you could copy it, but you would have to license h264 to make this work in open source. Open source requires other people to pay this license (typically it's part of a browser using WebRTC).
    • timbit42 702 days ago
      I won't call it perfect if it can't cancel out urination.
  • web007 702 days ago
    1000 _participants_ on a call, or 1000 _viewers_?

    Our org switched from Google Meet to their streaming equivalent once we hit around 250 people, nothing worked well with that many participants.

    Lucky for your org, streaming has been a solved problem since the 90s! It looks like Jitsi might support streaming by itself, or you can find online tutorials on Nginx + Varnish RTMP streaming and DIY if you prefer.

    To answer your literal question, there is no enterprise grade open-source zoom alternative because it's incredibly difficult and hard to make work for free. You have seen how hard just in your N=1000 sample size. Adding in self-hosted STUN or TURN or building your own proxy and video muxer is only going to be harder. There's a reason YouTube, Twitch, Facebook, Instagram, Vimeo, etc. are good at it, because they're zillion dollar companies that can afford to invest in the massive infrastructure and engineering required to make it not-terrible.

  • PaulHoule 702 days ago
    I'd point to this related question.

    "Chat" services of various kinds have been around on the internet since the 1990s, the one thing that they all have in common is that they burn out. For instance there was CUSeeMe, Paltalk, Tivejo, AOL Instant Messenger, ICQ, Skype, WebEx, Facebook Messenger, and many others. Google alone has made 11.

    Some of them are still around, but there's a general pattern that a company does the work to create an application but doesn't do the work to keep it up. Then some new thing comes along, gets popular for a while, then deteriorates.

    Unless Zoom breaks the pattern a few years from now there will be something new and people will be saying "It's like the way Zoom used to be back when Zoom worked".

    I'd say open source alternatives would run into the same problem as commercial chat apps.

  • jqpabc123 702 days ago
    Sorry, I must be a luddite. I don't get the video conferencing craze.

    In most cases, I see little real, added value in a live video feed. What I find more generally useful is simple desktop/screen sharing with audio (i.e. GotoMeeting, TeamViewer)

    If you must see the participants, allow a photo/avatar. Watching real lips moving in a stuttered video is quite chic and nouveau and all ... it's nice with family ... but it doesn't really add much value in a typical business setting IMO.

    As with most things, there are exceptions but I use desktop sharing way more often.

    • joshstrange 702 days ago
      I understand people not wanting to be in a large meeting with their camera on but for anything <10 or so where everyone is participating I greatly prefer video chats. Video is still not a perfect substitute for in-person but it's GoodEnough (tm) for me. You don't get the full body motion cues but you get the majority of them and that's invaluable. I've heard/used the phrase "X, you look like you don't agree", "X, do you see an issue with this approach?", "X, do you have something to add?", etc. Personally I get a lot of value from being on either side of that exchange, either me saying it to my boss or my boss saying it to me. With audio-only it would be impossible to notice these things or see that a team member has opened their mouth 2-3 times but someone else talked first, with video I can say "I think X has something to add".

      For larger meetings I can understand the fatigue but I absolutely love video chat for smaller meetings. I can easily tell when someone is distracted or might not have understood what I was saying. I've just stopped talking before because I can tell my boss/coworker just got a message or heard something off-screen that they need to address right away whereas without video I would have not known. I would have just kept talking until they finally said "Hold on a second", with video I know almost right when it happens and I don't have to repeat myself or try to figure out how much to repeat.

    • SeasonalEnnui 702 days ago
      I envy your position. I would not be able to participate in meetings without the ability to read lips/faces.
    • robonerd 702 days ago
      Audio quality is always worse than conference calls in the 80s and 90s (until people started using cell phones.)
      • wildzzz 702 days ago
        I used to dread conference calls whenever most people were on their cellphones, me included. Quality was dog shit. If everyone was using a deskphone, quality was better but you could easily tell you were still on a phone call. Whatever system we call into can't make any use of the high quality bandwidth you'd get from internal deskphone calls or between two iPhones using the internet to do voice calls. Work has been giving out USB headsets lately so people can actually talk through the conferencing software on their laptop rather than using a dial-in number. The improvement in voice clarity is enormous and I will never go back to phoning into meetings. I've been encouraging my coworkers to do the same. We don't have webcams on our work PCs so voice or chat is the only option.
        • robonerd 702 days ago
          The problem with all webchat, regardless of microphone quality, is the erratic latency and frequent hiccups every time somebody's neighbor microwaves a burrito. Consistent latency is generally tolerable if kept to a reasonable minimum, but erratic latency is incredibly fatiguing for me to listen to.
    • baskethead 702 days ago
      You do sound like a luddite. Also it sounds like you're not used to remote work or interacting with other people in a remote-first environment.

      Meetings are essential. Seeing people face to face is essential. I had lunch with a teammate recently and it bridges so many gaps you can't do with Zoom. And Zoom is orders of magnitude better than Slack or email.

      If you don't think that face-to-face video meetings are important, then you don't understand human interaction.

      • robonerd 702 days ago
        Maybe one in ten meetings qualifies as "essential", and I'm being generous.
  • baskethead 702 days ago
    With 1000 people you should be streaming. There's no conceivable way 1000 people can have an interactive meeting.
  • jvanveen 702 days ago
    For more interesting related projects, you may also want to check out https://github.com/pion/awesome-pion

    I'm fiddling now and then on an alternative conferencing frontend (Pyrite - https://github.com/garage44/pyrite) for Galene (https://galene.org), which is an SFU that uses Pion.

  • diffeomorphism 702 days ago
    Why are there so few "enterprise grade zoom alternatives" regardless of being foss or not?

    - Webex supposedly does 1000 people.

    - Google meet: No, only up to 500 with "business plus".

    - MS Teams: Confusing numbers but 1000 for chat only, but 20(?) for audio/video.

    - Bluejeans: 200 for enterprise; 500 "view-only"

    - Anything else?

    So the first answer that comes to mind is that there are no alternatives at all for 1000 participants. If you only need ~200 there are many more options.

    • eurasiantiger 702 days ago
      How many real-life meetings have 200, 500, let alone 1000 active participants? Representational democracy comes to mind, which is interesting, because the "representational" part only exists to mitigate this very problem.

      Maybe there is a systemic lesson here.

    • speedgoose 702 days ago
      I have been in an MS Teams meeting with 1000+ participants where the presenter had video enabled. But maybe 1000 is the max and the number of people was higher because counting fast in a distributed system is hard.
    • dartharva 702 days ago
      If you really want to address hundreds and thousands of people at once, maybe you're better off livestreaming on YouTube or Twitch.
  • lskerachid 691 days ago
    Did you consider Zoom on premises?
  • emrahcom 700 days ago
    how we added support for 500 participants to Jitsi Meet

    https://youtu.be/W-PPPGy49kc