21 comments

  • robto 828 days ago
    See, this is what I keep going on about to my friends - Matrix is actually a graph syncing protocol, not an instant messaging protocol, and is a great way to build federated applications.

    I get that Element needs to focus on one use case in order to pay the bills, but there's a lot of room for applications that have barely been explored - calendars, contacts, games, social media, heck - you could build a federated gitlab-like collaboration site with truly federated data. You get multi-device sync, e2e encryption, and identity management out of the box.

    Thanks for putting this together, I hope there will be a gold rush as people realize how much low-hanging fruit there is!

    • all2 828 days ago
      > calendars, contacts

      I'm interested in this just to keep my devices sync'd up. I know Samsung/Apple/MS etc all have syncing services, but I really really want to get all my data out of their ecosystems.

      Just a distributed calendar app would be pretty amazing. No more iCal subscription/export/whatevers, 'just' federate the calendar between devices, and arbitrarily share it with anyone else.

      I have no clue where to start with building an app on these abstractions. Does anyone have any pointers? Documentation to read? High level overviews? Etc.?

      • robto 828 days ago
        I'd start with reading the Matrix spec. You'll want to know how the syncing protocol works, and this will give you enough familiarity that you can start playing around with curl to actually see it in action. I'd highly recommend understanding how conflict resolution works; the "Analysis of the Matrix Event Graph Replicated Data Type"[0] lays it out pretty well.

        Matrix-CRDT posted here should abstract most of the nitty gritty stuff, but there are no shortcuts to designing with CRDTs so familiarizing yourself with those is important.

        Other than that, just start playing and asking questions! The #matrix:matrix.org room is full of friendly and helpful people and I'm sure you'd be able to get answers.

        [0]https://arxiv.org/pdf/2011.06488.pdf
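
        For anyone who wants to poke at syncing before reaching for curl, here is a minimal sketch of what a long-polling request to the client-server API's /sync endpoint looks like. It only builds the request and does not send it. The homeserver URL, the token and the since value are placeholders, and the helper name build_sync_request is made up for illustration.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request


def build_sync_request(homeserver: str, access_token: str,
                       since: Optional[str] = None,
                       timeout_ms: int = 30000) -> Request:
    """Build (but do not send) a long-polling /sync request.

    `build_sync_request` is a hypothetical helper, not part of any SDK.
    """
    params = {"timeout": str(timeout_ms)}
    if since is not None:
        # `since` is the `next_batch` token from the previous /sync response.
        params["since"] = since
    url = f"{homeserver.rstrip('/')}/_matrix/client/v3/sync?{urlencode(params)}"
    return Request(url, headers={"Authorization": f"Bearer {access_token}"})


# Placeholder homeserver, token and sync token, for illustration only.
req = build_sync_request("https://matrix.example.org", "HYPOTHETICAL_TOKEN",
                         since="s72594_4483_1934")
print(req.full_url)
```

        Passing the request to urllib.request.urlopen would perform the call; the next_batch field of the JSON response is what goes into since on the following request.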

        • YousefED 828 days ago
          The downside of building this directly on Matrix events is that I don’t think it would work offline, right? (Until p2p matrix gets more common, that is).

          My biased suggestion would be to see how far you get with storing the data to be synced in SyncedStore, and connecting Matrix-CRDT and y-indexeddb for local storage.

      • chrismorgan 827 days ago
        For these specific purposes, the JMAP working group at IETF is the place to start, because it’s the hub where this stuff is being incubated (though some of the work happens in other working groups). See <https://datatracker.ietf.org/wg/jmap/documents/>; the drafts JMAP for Calendars (integrating JSCalendar, published as RFC 8984 six months ago by the same group of people through the calext WG) and JSContact are particularly relevant here.

        Like Matrix, JMAP is commonly misunderstood, with people thinking it’s about email (it’s replacing IMAP, which is Internet Mail Access Protocol, right?), but it’s actually fundamentally an object synchronisation protocol, with the core in RFC 8620 being domain-neutral, and the email model completely separate in RFC 8621. (IMAP has also shifted a bit over time from being about mail with proper synchronisation being an unsound and secondary concern, to nailing down sound object synchronisation in extensions. But JMAP does it better.)

        Matrix is about decentralisation while JMAP is client-server, but you can probably reuse much or all of the underlying data models, just as JMAP for Calendars and JMAP for Contacts are establishing JMAP-independent data models first.

      • eddieroger 828 days ago
        I've been trying to fit CalDAV/CardDAV to that use case for some time, but given how the majority of any kind of useful implementation of that is in PHP (which itself is fine, it's just not my forte), or Python (Apple's server implementation), I've all but given up, setting sights on the big players you mention. But maybe there is something here. The hangup will be acting as a provider to the devices you care about - Apple has APIs in to contacts and calendars, but I wonder if they'll be as useful as native accounts would be? A fun thought experiment for a bit...
        • all2 827 days ago
          I wasn't even thinking in terms of compatibility. I don't really use the event functionality in any of the calendar apps, and I don't really share my calendars. I'm more concerned about just having the information I need where I need it.

          I am interested in APIs, though, because those bring to light the underlying concepts people are trying to communicate. Examining APIs (I think) will show the problem domain clearly.

      • mab122 827 days ago
        For calendar I would say: https://github.com/39aldo39/DecSync + https://syncthing.net/ and for sharing with someone you would run some web client on VPS talking to https://github.com/39aldo39/Radicale-DecSync
    • mxuribe 828 days ago
      I think from the very beginning the matrix project folks have used language similar to matrix being data syncing...etc...But, that, the first use case is chat...but won't end there...etc. Obviously i'm paraphrasing heavily...but once i read that, i've used similar language when i introduce matrix to friends...basically, chat/instant messenger is the first but not last/not only use case/scenario for this very cool technology.
  • klabb3 828 days ago
    > Personally, I'm convinced the technologies in this ecosystem (CRDTs, etc) are really powerful and can do more to decentralize software than many web3 technologies that currently receive much of the hype.

    100% agree! Web3 is really only tackling monetary instruments, which is interesting but very narrow. The majority of apps people need don't require global consensus, and it's too slow and expensive anyway. Yet, there's so much interesting stuff that can be decentralized/federated.

    > Matrix takes care of a lot of stuff: Authentication, E2EE, federation, hosting, etc. - so I can focus on the client.

    This is great to hear! What else besides CRDTs do you think is needed in order for matrix to be a platform for general purpose apps like this?

    • YousefED 828 days ago
      Thanks! Personally, I’m looking forward to stronger / cleaner (TypeScript) sdks with for example easier auth and e2ee key management. I heard this is on their roadmap for this year, so that’s pretty exciting!
  • jeroenhd 828 days ago
    This is an interesting use of the Matrix protocol. I like the many creative ways Matrix is being used outside the chat space, where its predominant use lies. However, I have several questions about this approach.

    - What happens with E2EE when there are key exchange problems? (i.e. a server goes down and the account is moved)

    - What happens when someone redacts a message? I assume you'd want to disable redactions for these rooms or you'll end up with broken update trees!

    - How do you communicate clearly that any data you add is in the event chain forever? In Word and GDocs you can remove old revisions of a document, but I don't think you can in this system? That could be a feature, of course!

    - What's the performance impact of such a system on a server editing reasonably large documents? If twenty people each edit ten to twenty documents you end up with quite a large set of state, especially over time!

    • YousefED 828 days ago
      Thanks for the questions!

      - If you'd enable E2EE for the Matrix client that you pass to Matrix-CRDT, key management is covered in the same way that Matrix does it normally. If key sharing with a particular user is broken, then you won't see updates from that user anymore and v.v. Basically, the transport between you and that user is broken. As CRDTs such as Yjs are designed specifically to work Local-first, at the core there is no assumption that all clients should always be connected to each other. Once the clients are able to communicate again, potential conflicts would be "resolved" according to the CRDT design.

      - Redacting: yes, basically you need to trust clients not to fiddle with messages (I think this is fair as you trust them to work with you on the same data already)

      - UX / communication: good question! Technically it would be possible to purge old (deleted) data, but I think this is still something we need to explore together while we start to see more mature software built on these technologies.

      - Kevin already answered your performance questions. Matrix-CRDT makes an additional optimization so that not the entire history of the Matrix room needs to be retrieved (see "Snapshots" in the readme)

      • Arathorn 828 days ago
        The best way to do snapshots might be to persist them as binary blobs (or binary diffs) in the Matrix media repository rather than snapshotting as Matrix events (whose limit is 65KB) ftr.
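
        A rough sketch of that decision, assuming the "65KB" limit means 65536 bytes for the whole serialised event. The event type, the headroom value and the function names are invented for illustration; the point is mostly that base64 inflates a binary snapshot by about a third, so it hits the cap sooner than its raw size suggests.

```python
import base64
import json

# Assumption: the "65KB" cap from the comment, read as 65536 bytes.
MAX_EVENT_BYTES = 65536


def snapshot_event(snapshot: bytes) -> dict:
    """Wrap a binary snapshot in a JSON-safe event body (hypothetical event type)."""
    return {
        "type": "org.example.crdt.snapshot",
        "content": {"snapshot": base64.b64encode(snapshot).decode("ascii")},
    }


def choose_storage(snapshot: bytes, headroom: int = 4096) -> str:
    """Return "event" if the snapshot fits in one event, else "media".

    `headroom` is a guessed allowance for fields a server adds to the event.
    """
    size = len(json.dumps(snapshot_event(snapshot)).encode("utf-8"))
    return "event" if size + headroom <= MAX_EVENT_BYTES else "media"


print(choose_storage(b"x" * 1_000))    # small snapshot
print(choose_storage(b"x" * 50_000))   # under 65536 raw, but not once base64-encoded
```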
      • alexisread 828 days ago
        Amazing work on this btw Yousef! With the UX/Communication, I kinda see this as similar to tombstoning (with or without envelope-encrypted data), and archiving (eg. to permanent storage). I'd imagine this would also be necessary for GDPR compliance.

        Additionally, exposing the CRDT stream would allow for reactive index building for searches on the data (ie. timely dataflow operations to aggregate an index).

    • kevinjahns 828 days ago
      I can only answer the last question. Yjs uses several performance optimizations to produce small documents (both in memory and in the encoded state). Since humans type relatively slowly (<60 actions per minute), it is impossible for humans to create a document that has performance problems. I showed this in [1].

      Relm [2] even models a 3d world using Yjs. I don't necessarily recommend doing this as 3d applications usually produce a lot more actions per minute than text applications. This required some workarounds and deep knowledge of how Yjs' optimizations work. But it's definitely possible.

      [1]: https://blog.kevinjahns.de/are-crdts-suitable-for-shared-edi... [2]: https://www.relm.us/

  • samwillis 828 days ago
    Awesome project!

    I have said it before and I will say it again, I think this year is going to be the year of Yjs. The ecosystem around it is brilliant and I believe it will become the leading CRDT framework (it probably already is).

    Happy to see you using TipTap too for the demo, a brilliant rich text editor framework (with support for Vue.js, React and Svelte) built on top of ProseMirror with first class Yjs support.

    https://tiptap.dev

    The TipTap guys (Philipp and Hans) along with Kevin Jahns of Yjs have started the y-collective in order to centralise funding and organising of the Yjs ecosystem:

    https://opencollective.com/y-collective

    Personally I'm excited about the use of Yjs and CRDTs outside of just rich text editing, I think there is great potential to build a distributed offline enabled database targeting mobile and PWAs with it. Something like PouchDB but with automatic conflict resolution.

    Finally there is a Rust port in development to improve the (already very good) performance and make it cross platform with other languages.

    https://github.com/y-crdt/y-crdt

    • hanspagel 828 days ago
      Thanks for the mention, Sam! Great to have you and Yousef in the community.

      BTW, we are already tinkering on some interesting stuff with the Rust port, too. :-)

      • samwillis 828 days ago
        Thanks! Sadly haven’t had the time in the last 6 months to continue working on what I was experimenting with. Hopefully will soon.
    • thruflo 828 days ago
      > I think there is great potential to build a distributed offline enabled database targeting mobile and PWAs with it. Something like PouchDB but with automatic conflict resolution.

      Take a look at https://concordant.io

    • YousefED 828 days ago
      Thanks! And definitely agree, can 100% recommend TipTap!
  • ath92 828 days ago
    Amazing work. I think there's a whole host of apps that just store some small amount of data for each user that could work entirely on something like this.

    Some nice things:

    - Anyone can host their own Matrix homeserver, which can then sync its data within a federation, allowing users pretty fine-grained control over what data they share and with whom.

    - Add end-to-end encryption, and it becomes even better: data stored somewhere in the cloud (i.e. the user doesn't have to manage a server), but the server knows almost nothing about what you've stored aside from some metadata.

    • YousefED 828 days ago
      Exactly. One of the reasons to build on top of Matrix was that it opens the path to end-to-end-encryption. Matrix E2EE (Olm) is heavily based on Signal, and wouldn't be trivial to build from the ground up.

      I have a prototype running that uses Matrix-CRDT and Matrix E2EE, and it worked great. It's still a bit of a hassle to set up though (mainly, configuring the matrix sdk correctly), I hope to make that easier later this year.

  • JanisIO 828 days ago
    Thank you for your work! First SyncedStore, now this. Yjs keeps getting better and is already my favourite technology to work with. Going to include this in my app :) https://app.lity.cc
    • sunbum 828 days ago
      Kinda a dick move that it hijacks the back button :(
  • infogulch 828 days ago
    I recall that matrix itself has some CRDT-like data structure to transmit messages. If that's accurate, could you map some of the functionality of yjs directly into native matrix ops?
    • Arathorn 828 days ago
      On the Matrix core team, we're working on 'native' collaborative editing via the Matrix DAG - while shamelessly learning from CRDT-over-Matrix projects like this one :)
      • kevinjahns 828 days ago
        What I dislike about these attempts is that you will just end up with yet another CRDT implementation that is incompatible with the existing ecosystem (editors, drawing apps, state management, ...).

        Instead, I want to encourage you to build an API that others can use to efficiently store shared data. Feel free to ping me if you need input.

        - Kevin (Author of Yjs)

        • infogulch 828 days ago
          I'm happy to see representatives from both the matrix and yjs communities interact more directly.

          Can you expand on what you mean by this?

          > an API that others can use to efficiently store shared data

          What would you expect this API look like in a bit more detail? Would it be able to abstract any of the underlying CRDT logic? Would it just be a raw stream of authenticated messages with partial ordering? Something in-between?

          • kevinjahns 828 days ago
            Instead of building another shared-editing solution specialized for Matrix, there could be an API that can be used to store and distribute real-time updates efficiently (probably in the Matrix DAG).

            The matrix-crdt works really well. To avoid overloading the Matrix server with many small messages (each single keystroke produces an update message), it stores merged updates in the DAG after a short debounce. The optional WebRTC extension allows you to distribute messages immediately "off the chain", so you don't notice the debounce.
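
            The debounce idea can be sketched roughly like this. This is not matrix-crdt's actual code: the class name is made up, time is passed in by hand to keep the example deterministic, and plain byte concatenation stands in for a real CRDT update merge.

```python
from typing import Callable, List


class DebouncedSender:
    """Collect small updates and send them as one merged message once
    no new update has arrived for `delay` seconds. (Hypothetical helper.)"""

    def __init__(self, send: Callable[[bytes], None], delay: float) -> None:
        self._send = send
        self._delay = delay
        self._pending: List[bytes] = []
        self._last_update_at = 0.0

    def push(self, update: bytes, now: float) -> None:
        """Buffer an update and restart the quiet period."""
        self._pending.append(update)
        self._last_update_at = now

    def tick(self, now: float) -> None:
        """Call periodically; flushes if the quiet period has elapsed."""
        if self._pending and now - self._last_update_at >= self._delay:
            # Stand-in for a real CRDT merge: plain concatenation.
            self._send(b"".join(self._pending))
            self._pending.clear()


sent: List[bytes] = []
sender = DebouncedSender(sent.append, delay=0.5)
for i, key in enumerate([b"h", b"e", b"y"]):
    sender.push(key, now=i * 0.1)  # three keystrokes, 100 ms apart
    sender.tick(now=i * 0.1)       # still inside the quiet period: nothing sent
sender.tick(now=1.0)               # quiet for long enough: one merged message
print(sent)
```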

            After a time the message-log gets pretty huge. So in matrix-crdt, a random client will eventually store a "snapshot" of the current state in the DAG and remove old entries. This way, new clients don't need to download the huge message-log.
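
            A toy version of the writer side of that, under heavy assumptions: the log is a plain list, the CRDT state is a grow-only set, and the threshold and the name maybe_compact are invented. A real implementation would encode the library's document state instead.

```python
from typing import FrozenSet, List, Tuple

# (kind, payload): kind is "update" or "snapshot"; the payload is a toy
# grow-only set standing in for real CRDT state.
Event = Tuple[str, FrozenSet[str]]


def maybe_compact(log: List[Event], max_entries: int) -> List[Event]:
    """Replace a long log with a single snapshot of the merged state."""
    if len(log) <= max_entries:
        return log  # short enough, leave it alone
    merged: FrozenSet[str] = frozenset().union(*(payload for _, payload in log))
    return [("snapshot", merged)]


log: List[Event] = [("update", frozenset({w})) for w in ("a", "b", "c", "d")]
compacted = maybe_compact(log, max_entries=3)
print(compacted)
```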

            It would be nice if there was a possibility to create a server-component that does the merging.

            (Btw, all credit to the above approach goes to Yousef)

            Now, there might be a better solution to store CRDT data in the Matrix DAG - the developers probably know best and might be able to expose some hidden API that would make everything even more efficient.

            I'm just asking that instead of creating yet another CRDT and integrating it into Matrix, open up this space, provide better APIs, and let others integrate their CRDTs.

            > Would it be able to abstract any of the underlying CRDT logic?

            Modern shared-editing frameworks don't require you to think about internal logic. They just set some requirements on the ordering of update messages. CRDTs in particular don't care in which order you transmit data, which makes them a very interesting choice in practice.
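
            To make the ordering point concrete, here is a small sketch using a grow-only counter, one of the simplest CRDTs (much simpler than what Yjs does for text). The merge rule is a per-replica maximum, so every delivery order, including one with a duplicated message, ends in the same state. The names are illustrative.

```python
from itertools import permutations
from typing import Dict, Iterable, Tuple

Update = Tuple[str, int]  # (replica id, that replica's counter value)


def apply_all(updates: Iterable[Update]) -> Dict[str, int]:
    """Apply updates in the order given and return the resulting state."""
    state: Dict[str, int] = {}
    for replica, count in updates:
        # Merge = per-replica maximum: commutative, associative, idempotent.
        state[replica] = max(state.get(replica, 0), count)
    return state


# The last entry is a duplicate delivery of bob's update.
updates = [("alice", 1), ("bob", 1), ("alice", 2), ("bob", 1)]

# Collect the final state for every possible delivery order.
results = {tuple(sorted(apply_all(p).items())) for p in permutations(updates)}
print(len(results))                    # number of distinct final states
print(sum(apply_all(updates).values()))  # the counter's value
```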

            • josephg 828 days ago
              It might be worth putting together a chat between matrix and a few of us! I have some thoughts on this too, having written two differently designed CRDTs with diamond types.

              Replaying a series of changes from an operation log is quite doable (blog post incoming). But having a way to compress / annotate the operation stream will lead to far better performance in lots of ways. Especially as Kevin says - with CRDTs like Yjs and automerge which consider document order (not time order) as the canonical representation.

            • ItsMonkk 828 days ago
              Have you guys taken a look at all of the various torrent-based[0] approaches that have been going around HN? Feels like if you combine the storage component of those with the real-time approach you've got here, it would feel like magic.

              [0]: https://news.ycombinator.com/item?id=29917818

        • Arathorn 828 days ago
          So to be clear, on the Matrix side we have absolutely zero desire to build another general purpose CRDT implementation - Yjs, automerge, Collabs etc are already here :)

          However, all the current collaborative editing apps which use Matrix operate by serialising opaque CRDT updates as Matrix events - a bit like Wave used to send blobs of base64 as OT updates over XMPP. Matrix-CRDT is cool in terms of also transporting updates over a WebRTC ephemeral transport to get lower latency (although it's missing a trick that the WebRTC looks to be signalled over websockets rather than just using Matrix's VoIP signalling to give you E2EE decentralised WebRTC signalling for free ;)

          Now, Matrix already is a constrained CRDT (monotonic semi-lattice, i think?) which provides primitives for key-value and conversation-timeline storage. Our merge resolution algorithm is detailed at https://matrix.org/blog/2020/06/16/matrix-decomposition-an-i.... However, we always intended to be able to store object graphs in Matrix too - and have been experimenting with APIs for traversing a DAG of objects overlaid within Matrix's room DAG (e.g. https://github.com/matrix-org/matrix-doc/blob/kegan/msc/thre... - which isn't really about threading specifically, but could be traversing any kind of object graph).

          So, what we're looking at now is (i think?) precisely what you're proposing that we do - i.e. use an existing CRDT implementation to model the collaborative evolution of an arbitrary object graph - while also snapshotting that as we go as evolutions within the Matrix room DAG. I haven't been working on it myself, and we haven't published the research yet, but I'd assume we'd do something like pass CRDT updates as Matrix EDUs (Ephemeral Data Units) between clients, with the server then maintaining or generating snapshots of views of the graph. The key thing we're aiming for is to build on the existing decentralised namespace and identity model and end-to-end encryption that Matrix already provides.

          This could look something like:

          * Client A negotiates a WebRTC data channel with Client B for low-latency collaboration (via standard Matrix m.call.invite WebRTC signalling, so you get E2EE and decentralisation for free)

          * CRDT updates also get sent to the server via the Matrix client-server API (n.b. that in the nearish future the server may actually be running locally within the client, thanks to P2P Matrix: https://matrix.org/blog/2021/05/06/introducing-the-pinecone-...). For E2EE rooms, updates would have to be at the granularity of an encrypted event (but perhaps clients participating in the E2EE could decrypt, coalesce and re-store these).

          * Server maintains a view of the object graph, letting clients navigate lazily through the tree, view it or bits of it as versioned snapshots, or start participating in the CRDT itself.

          (It's worth noting that Matrix events are deliberately capped to 65KB - anything bigger than that should be persisted as a binary blob on disk. In this model it's probably fine, given you'd want events to be as small as possible - possibly even keystrokes.)

          Again, this is very handwavy and I'm not actually working on it myself, but it hopefully gives a vague idea of what we're thinking about.

          • southerntofu 826 days ago
            This is pretty cool! Some people were asking on another thread about the differences between XMPP and Matrix the other day: first-class support for CRDTs (or equivalent consensus-reaching) is in my view a key property of Matrix and despite not using Matrix for chat yet, i can definitely see myself using it for collaborative apps (Etherpad is still bugged after all these years, and HedgeDoc is in my personal opinion going in the wrong direction by removing explicit macros).

            Can't wait for higher-level library (yJS or others) to support a matrix backend so i can mess around with that!

          • Arathorn 828 days ago
            btw, everyone should come hang out in https://matrix.to/#/#beyond-chat:matrix.org to discuss this interactively if you want :-)
    • YousefED 828 days ago
      Indeed, I think Matrix state resolution has some CRDT-like properties. Afaik, the goal of this is mainly to manage room state across servers when using federation - but I'd have to dive deeper into this.

      However, not all CRDTs are the same (actually, I think the definition is still somewhat vague - most things "eventually consistent" could be called a CRDT). Yjs is specifically designed for high-performance operations and has great support for rich-text collaboration. It also works great while offline - and you can connect different providers as you like (for example, store updates both locally in IndexedDB, and transmit messages over websocket / webrtc / and now, also Matrix). Definitely wanted to keep this flexibility.

    • pkulak 828 days ago
      Only to federate between servers, and none of those operations are exposed to clients, as far as I know. If everything is happening on the same server, it's basically just an event stream.

      That said, using this to stream CRDTs, and THEN federating those streams between servers would be pretty wild.

  • octopoc 828 days ago
    (dreaming big here) You could make this a basis for a generic conversational UI[1] for web data. It wouldn't be a perfect UI for most applications, but it would provide a couple of important features:

    1. It would be easy to manage data sharing. Post updates to data in channels that the relevant people have access to.

    2. It would be easy to debug data changes. I could see it becoming the equivalent of a terminal app but for the web.

    [1] https://en.wikipedia.org/wiki/Conversational_user_interface

  • kukabynd 828 days ago
    Awesome, thanks for sharing! I’ve been building using yjs and it’s indeed something much more exciting than a bunch of bs we see with web3.
    • noman-land 828 days ago
      Matrix is for sure part of web3 since it's an open protocol.
      • tinco 828 days ago
        Open protocols and decentralization are web1 not web3. It's how the web was originally envisioned. Web3 is specifically about replacing centralized authority with cryptographic authority. But whether that's actually useful enough for anyone for it to be considered a revolution on the web like web2 was remains to be seen.
        • noman-land 828 days ago
          Unlike web1 and web2, web3 is an ethos more than a set of technologies. web1 and web2 are descriptions of the past, whereas web3 is a vision for the future.
          • heartbeats 828 days ago
            Unlike web1 and web2, web3 is a grift. In contrast, this appears to be a genuinely innovative technology.
            • DarylZero 828 days ago
              Web 1.0 is Star Wars.

              Web 2.0 is The Empire Strikes Back.

              Web 3.0 is The Room.

              • heartbeats 827 days ago
                Web 3.0 is The Phantom Menace.
          • tinco 828 days ago
            Alright, but if you can use that ethos to argue that e-mail, Bittorrent and IRC are the future I'm going to look at you funny.
            • opan 828 days ago
              I'd like to think email, irc, and bittorrent will still be around far into the future, at least.
  • maelito 828 days ago
    I've used yjs for a project that aims at computing individual climate footprints as a group in a conference.

    It works beautifully with almost no code. I used Yjs P2P, but P2P is blocked in numerous networks.

    If you want to try it, there's a hackernews room, but sorry, it's in French: https://nosgestesclimat.fr/conférence/hackernews

    So using Matrix as a backend is a great idea.

    The only caveat I found to Yjs is that if you want to persist data (and not lose it when the server crashes), nothing's really plug-and-play, so I went with Supabase.

  • noman-land 828 days ago
    This is so cool. Come on, everyone! Let's make matrix the new thing, already.
  • Cthulhu_ 828 days ago
    Yjs looks interesting; I'm currently building a management UI that for now is a very straightforward set of REST APIs and a database, but there's a lot of concerns about people editing the same configuration simultaneously; I wonder if something like yjs can be used to create a collaborative editing environment.

    I'd probably have to re-architect the whole front- and back-end though. Although on the other hand, there will be some code editors (currently using Monaco) where this could be usable.

    • er4hn 828 days ago
      My understanding of CRDT is that it shines the most when you have collaborative documents. For a mgmt UI along the lines of "listen on port X, log at level Y" where does CRDT become handy? Wouldn't you just want "last writer wins" for that sort of use case?
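
      For what it's worth, "last writer wins" can itself be written as a tiny CRDT (an LWW register), so the two aren't mutually exclusive. A minimal sketch, assuming each write carries a timestamp and a writer id used as a tie-breaker; the class and field names are made up.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class LWWRegister:
    value: Any
    timestamp: int  # logical or wall-clock time of the write
    writer: str     # tie-breaker so concurrent writes resolve identically everywhere

    def merge(self, other: "LWWRegister") -> "LWWRegister":
        # Keep the write with the larger (timestamp, writer) pair.
        return max(self, other, key=lambda r: (r.timestamp, r.writer))


# Two admins set "listen on port X" on different replicas.
a = LWWRegister(value=8080, timestamp=10, writer="alice")
b = LWWRegister(value=9090, timestamp=12, writer="bob")

# Merging in either direction picks the same winner.
print(a.merge(b).value, b.merge(a).value)
```

      The catch is that the losing write disappears silently, which is usually fine for a port number and usually not fine for a paragraph of text.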
  • panick21_ 828 days ago
    It would be interesting to have a Datomic-like store backed by something like this.

    https://github.com/replikativ/datahike

    Is an Open Source variant of Datomic.

    Lambdaforge wants to eventually have this work with CRDTs.

    Using the Matrix ecosystem for this is quite interesting as it solves many problems for you already.

    • dunham 828 days ago
      I'd also love to see an offline-first version of the datomic model.

      I've thought about datomic + crdts in the past, and it seems like ":db/unique :db.unique/identity" properties would be an issue. Whether or not the transaction fails depends on the current content of the database.

      I don't know if there is a way around this (while still being a CRDT) or how necessary this feature is.
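
      A toy illustration of the problem, not Datomic's or Datahike's actual behaviour: each replica enforces uniqueness against what it knows locally, both accept a write while offline, and the merged state then violates the constraint. All names here are hypothetical.

```python
from typing import Dict, Set


class Replica:
    """Offline replica enforcing a unique-email rule against local data only."""

    def __init__(self) -> None:
        self.emails: Dict[str, Set[str]] = {}  # email -> entity ids claiming it

    def insert(self, entity: str, email: str) -> bool:
        # Local check only: reject if the email is already taken *here*.
        if self.emails.get(email):
            return False
        self.emails.setdefault(email, set()).add(entity)
        return True

    def merge(self, other: "Replica") -> None:
        # CRDT-style merge: set union per email, never rejects anything.
        for email, entities in other.emails.items():
            self.emails.setdefault(email, set()).update(entities)

    def violations(self) -> Dict[str, Set[str]]:
        """Emails claimed by more than one entity."""
        return {e: ids for e, ids in self.emails.items() if len(ids) > 1}


laptop, phone = Replica(), Replica()
ok_laptop = laptop.insert("user-1", "sam@example.org")  # accepted locally
ok_phone = phone.insert("user-2", "sam@example.org")    # also accepted locally
laptop.merge(phone)
print(laptop.violations())
```

      Both transactions were valid where they ran, so the conflict only becomes visible after the merge, which is exactly why a "fail the transaction" rule is hard to keep in an offline-first setting.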

  • outside1234 828 days ago
    Lots of Matrix ignorance here - so sorry if this is a dumb question.

    When you auth with Matrix, is that a centralized action (one company / org behind it) or is that a decentralized action (it's more a protocol for auth amongst the community of users)?

    ie. Is using Matrix for auth decentralized or a different centralization?

    • YousefED 828 days ago
      You authenticate against a Matrix server (the “homeserver”). This can be a server that you host yourself, or one that is readily available (matrix.org is the most popular one).

      This design is fairly similar to email where you can be @gmail, @hotmail, or @whateveryoulike.com
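
      The email analogy shows up directly in the identifier format: a Matrix user ID looks like @localpart:server. A small sketch of splitting one apart (the function name is made up, and this does no validation beyond the basic shape):

```python
from typing import Tuple


def parse_user_id(user_id: str) -> Tuple[str, str]:
    """Split "@localpart:server" into (localpart, server)."""
    if not user_id.startswith("@") or ":" not in user_id:
        raise ValueError(f"not a Matrix user ID: {user_id!r}")
    # Split on the first colon only, since the server part may carry a port.
    localpart, server = user_id[1:].split(":", 1)
    return localpart, server


print(parse_user_id("@alice:matrix.org"))
print(parse_user_id("@bob:example.org:8448"))
```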

  • outside1234 828 days ago
    Just to understand, is the concept here that you send all of the updates for a particular data item to a Matrix "room" and then converge on the current state of that data item by each client replaying that data?
    • YousefED 828 days ago
      Yep, that's pretty much correct. With the addition of sending occasional “snapshots” so clients don't need to read the entire room history initially (more about this at the bottom of the readme).
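
      Roughly, the reader side could look like the sketch below. It is a simplification under stated assumptions, not the library's code: the room history is a plain list, the state is a toy set, and load_state is an invented name. The client reads newest-first and stops at the most recent snapshot.

```python
from typing import List, Set, Tuple

Event = Tuple[str, Set[str]]  # (kind, payload); kind is "update" or "snapshot"


def load_state(room_log: List[Event]) -> Tuple[Set[str], int]:
    """Read newest-first and stop at the most recent snapshot.

    Returns the reconstructed state and how many events had to be read.
    """
    state: Set[str] = set()
    events_read = 0
    for kind, payload in reversed(room_log):
        events_read += 1
        state |= payload  # set union: the order of application is irrelevant
        if kind == "snapshot":
            break  # everything older is already folded into the snapshot
    return state, events_read


room_log: List[Event] = [
    ("update", {"milk"}),
    ("update", {"eggs"}),
    ("snapshot", {"milk", "eggs"}),
    ("update", {"bread"}),
]
state, events_read = load_state(room_log)
print(sorted(state), events_read)
```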
  • lgas 828 days ago
    This sounds like (at least a big step towards) distributed event sourcing which is something that I have been anticipating for a while now but haven't heard any news of.
  • outside1234 828 days ago
    Is there any community where projects like this have a center of gravity? (eg. conferences, Discord, Matrix, ...) for people interested in getting involved in the community?
    • YousefED 828 days ago
      I just set up a Discord @ https://discord.gg/9GxSXCeQyN. Please join!

      It’s called TypeCell as that’s the (yet to be announced) project both Matrix-CRDT and SyncedStore are building blocks for.

      If you’re interested in tech like this and maybe also “future of programming” (those bret victor things ;) ), knowledge management software (notion, roam, obsidian) - i’m sure you’ll like it!

      • DenseComet 828 days ago
        Heh. I get why you're using Discord, but it's unfortunate that a project building on Matrix does not use it for chat.
        • YousefED 828 days ago
          Fair point haha. I’ll also set up a matrix room and bridge it to Discord.
          • trenchgun 828 days ago
            Excellent.

            Discord is not just proprietary, but it also has a messy UI and bad UX.

      • brunoqc 828 days ago
        oh! I dream of a CRDT note-taking app.
  • pixel_tracing 828 days ago
    Awesome project! Quick question I notice this uses WebRTC, why integrate / bake this in directly rather than abstract it out?
    • YousefED 828 days ago
      Good spot. The WebRTC part is currently experimental and might be superseded by a different solution. If that's not the case I'll probably abstract it out indeed!
  • toyg 828 days ago
    I wish this was available for Python. Does anybody know what the Py-equivalent would be (if there is one)?
  • hammyhavoc 828 days ago
    This is a whole other level of cool. Jesus Christ.