12 comments

  • steventhedev 1 day ago

    Few bits of advice/comments:

    1. The schedule syntax. It can seem much easier to work directly with English, but there's a reason most systems use cron-style schedules and it isn't because it's legacy. There's a benefit in having a syntax that is "alien" when you're crossing that human/computer/dev border, as it forces your users to think about how to achieve something. The other half of the reason is that it's a lot easier to validate a strict syntax with a formal grammar than anything even close to natural language. I can basically guarantee you that if someone had come up with a way to accurately parse English recurrence statements, it would have already been adopted by cron in the past (or at least by a website like crontab guru[0]).

    2. There is no description of the job execution semantics in the distributed sense. Is it at most once, or at least once? This is important if you're designing jobs, so if someone is switching to your system from an existing system with a different semantic it will cause a lot of friction especially if you don't even list it in your documentation.

    3. You have two data stores. You're using Raft for backing your key/value store, and mongo for both job metadata and also communicating with the dashboard. This is a bit confusing because in the docs you wrote that metadata goes in a sidecar yaml file. What job metadata are you referring to then? Results? Fundamentally, having multiple data stores makes your projects orders of magnitude more complicated to operate and monitor. My guess is you'll see this as a major pain point in the future when you want to scale a cluster in different ways (more jobs, more frequent jobs, longer histories, more nodes in the cluster).
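
    To make point 2 concrete, here is a minimal Python sketch of the two semantics. Everything in it is hypothetical (the `Crash` exception, the runner functions, the flaky job); it is not how Odin or any particular scheduler is implemented, just an illustration of why the distinction matters when designing jobs.

```python
class Crash(Exception):
    """Simulates a worker dying partway through a job (hypothetical)."""


def run_at_most_once(job):
    # The run is attempted exactly one time; if the worker crashes,
    # the job is simply lost and never retried.
    try:
        job()
    except Crash:
        pass


def run_at_least_once(job, max_attempts=3):
    # The run is retried until it completes, so any side effect that
    # happened before the crash can happen again.
    for _ in range(max_attempts):
        try:
            job()
            return
        except Crash:
            continue


def make_flaky_job(effects):
    """Hypothetical billing job that crashes on its first call, after charging."""
    calls = {"n": 0}

    def job():
        calls["n"] += 1
        effects.append("charge")
        if calls["n"] == 1:
            raise Crash()
        effects.append("receipt")

    return job


effects_amo = []
run_at_most_once(make_flaky_job(effects_amo))

effects_alo = []
run_at_least_once(make_flaky_job(effects_alo))

print(effects_amo)  # the job never finished
print(effects_alo)  # the job finished, but charged twice
```

    Under at most once the job can be left half done, and under at least once the job body has to be idempotent. Which one you get changes how you write the job.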

    All in all, it looks like a pretty decent distributed cron, but I wouldn't touch your project for my production systems. It's lacking a lot of features we need for getting Legacy (TM) work done[1] like running a specific job on a specific worker because it needs access to special hardware or some filesystem or because our PHB decided that as a policy only one of our servers can run billing code on Tuesdays.

    [0]: https://crontab.guru/

    [1]: "Legacy code is code that makes money"

    • ojs 1 day ago

      Replying to each of your points:

      1. I completely agree. We have recently acknowledged the power in an "alien" syntax as you put it. In v2.0.0 Odin will also support the classic cron schedule string syntax!

      2. You are right, the repo is lacking a detailed description of this. The execution semantics boil down to at most once.

      3. The YAML file just lends some simple schedule and runtime info to the job. We would actually rather this moved into the code in v2.0.0, so that all workflow info is self-contained and no supplementary files are needed. The multiple data store thing is entirely an issue; in v2.0.0 we are aiming to make more aspects (such as the ones you have described) pluggable, using whatever data stores you specify. That will require a lot of docs but we are up to the task.

      Thank you so much for the advice and comments. I will share them with the development team; feedback is always appreciated :)

      • boulos 1 day ago

        The original App Engine Cron had its own "human friendly" format for the schedule argument [1]. The problem is that what people want to copy/paste from the internet is some form of vixie-cron usually.

        However, all cron formats are subject to lots of ... edge cases depending on what you think of as normal. For example, since Vixie cron (and its descendants) are pattern matchers, there isn't a way to express "Run this every 33 minutes". Even a "sane schedule" gets to have fun with timezones and their adjustments (both for daylight savings but also countries adjusting their rules).
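
        A quick Python sketch of why the pattern-matching model can't do "every 33 minutes". It assumes the usual reading of a `*/33` minute field (match any minute of the hour divisible by 33); `minute_matches` is a made-up helper, not part of any cron implementation.

```python
def minute_matches(step, minute_of_hour):
    # A "*/step" minute field matches minute 0, step, 2*step, ... within
    # each hour; the pattern restarts at the top of every hour.
    return minute_of_hour % step == 0


# Walk two hours of wall-clock minutes and record when "*/33" would fire.
fire_minutes = [m for m in range(120) if minute_matches(33, m % 60)]

# Gaps between consecutive firings, in minutes.
gaps = [b - a for a, b in zip(fire_minutes, fire_minutes[1:])]

print(fire_minutes)  # [0, 33, 60, 93]
print(gaps)          # [33, 27, 33]
```

        The 27-minute gap is the hour boundary resetting the pattern, which is exactly the edge case.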

        Airbnb's Chronos chose instead to use ISO 8601 Repeating Intervals [2] which in written form are R[n]/<timestamp>/<period> (where n is the number of repetitions and timestamp is the start time). That's a good format for an infinitely repeating thing with arbitrary period, but not able to actually express "at 6PM every weekday" (which crons do, and many humans want).
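
        For illustration, a rough Python sketch of expanding such a string into run times. This is not Chronos's parser: `parse_duration` and `occurrences` are hypothetical helpers, only a small subset of ISO 8601 durations (days, hours, minutes, seconds) is handled, and treating n as the total number of runs is an assumption made here.

```python
import re
from datetime import datetime, timedelta

# Simplified subset of ISO 8601 durations: PnDTnHnMnS (no years, months, weeks).
_DURATION = re.compile(r"^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$")


def parse_duration(text):
    match = _DURATION.match(text)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    days, hours, minutes, seconds = (int(g or 0) for g in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def occurrences(spec, limit=5):
    """Return up to `limit` run times for an R[n]/<timestamp>/<period> string."""
    repeat, start, period = spec.split("/")
    if not repeat.startswith("R"):
        raise ValueError(f"expected a repeating interval, got {spec!r}")
    # Bare "R" means repeat forever. Assumption: n is the total number of runs.
    count = int(repeat[1:]) if len(repeat) > 1 else None
    # fromisoformat() in older Python versions does not accept a trailing "Z".
    first = datetime.fromisoformat(start.replace("Z", "+00:00"))
    step = parse_duration(period)
    total = limit if count is None else min(limit, count)
    return [first + i * step for i in range(total)]


runs = occurrences("R/2020-06-10T18:00:00Z/PT33M", limit=3)
print([r.strftime("%H:%M") for r in runs])  # ['18:00', '18:33', '19:06']
```

        Note that the period really is 33 minutes across the hour boundary, unlike the cron pattern, but there is nowhere in the string to say "weekdays only".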

        Finally, all repeating schedules go out the window once you get to someone who probably needs something more explicit: every NASDAQ trading day after market close (which is usually 4:30 but sometimes not, not open every weekday, etc.). The usual answer is "Just make the schedule to run every weekday at 1 PM Eastern for the early closes and usually exit, every weekday at the usual 4PM Eastern and have your code check that the market isn't on holiday". The same kind of works for the 33-minute period, except the answer ends up being "check every minute and do your own modulus to figure out if you should run" (which isn't particularly helpful).
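
        The "do your own modulus" workaround might look something like this in Python. It is only a sketch: `should_run` is a hypothetical guard you would call at the top of a job scheduled every minute, and anchoring the period to the Unix epoch is an arbitrary choice made here.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def should_run(now, period_minutes=33):
    # Count whole minutes since a fixed anchor so the period does not
    # reset at hour or day boundaries the way a cron pattern does.
    minutes_since_epoch = int((now - EPOCH).total_seconds() // 60)
    return minutes_since_epoch % period_minutes == 0


# Simulate cron invoking the job once a minute for 200 minutes.
start = datetime(2020, 6, 10, 0, 0, tzinfo=timezone.utc)
ticks = (start + timedelta(minutes=i) for i in range(200))
fired = [t for t in ticks if should_run(t)]

# Gaps between consecutive real runs, in minutes.
gaps = [int((b - a).total_seconds() // 60) for a, b in zip(fired, fired[1:])]
print(gaps)
```

        Every gap comes out at 33 minutes, but the scheduler still wakes the job up every minute and its history fills with no-op runs, which is why it isn't a particularly helpful answer.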

        Enjoy scheduling!

        [1] https://cloud.google.com/appengine/docs/standard/python/conf...

        [2] https://en.wikipedia.org/wiki/ISO_8601#Repeating_intervals

    • mottosso 1 day ago

      Thought this was related to Odin the programming language.

      https://github.com/odin-lang/Odin

      • noarchy 1 day ago

        Or Odin the flashing software for Samsung devices.

        https://samsungodin.com/

        • ojs 1 day ago

          So much software seems to share the name, I am currently trying to figure out something a little more unique!

          • white-flame 1 day ago

            Just start concatenating words. If you like "Odin", but it's too common, just use something like "OdinFlow" which integrates something about its domain.

            The actual full name doesn't matter that much, and it can still be called "Odin" as shorthand in the docs, but this way the full name can still have both the implications you want and uniqueness. Plus, adding "Flow" now makes it a tiny bit more self-describing than just a proper name.

            • A more obscure figure from the Germanic pantheon, perhaps?

              • ojs 1 day ago

                Trying to move away from figures of any pantheon now!

          • ttymck 1 day ago

            This looks awesome! Excited to give it a spin.

            If I wanted to take a stab at a Scala SDK, is that something you would be interested in supporting, or should community/contrib SDKs be managed as separate projects/plugins?

            • ojs 1 day ago

              If you wanted to give a Scala SDK a go that would be great! We'd certainly be interested in any contributions you're willing to make to the project!

            • lordofgibbons 1 day ago

              Thanks for posting this. I'm curious how this differs from Uber's Cadence, and Temporal (https://www.temporal.io/), which is a fork of Cadence.

              They both also do distributed workflow scheduling and state management.

              • maxmcd 1 day ago

                This seems to be more of a distributed cron or general purpose job runner. I believe Temporal/Cadence allow for much more rich composition. Linked actions with Odin seems limited to this mechanism: https://github.com/theycallmemac/odin/blob/master/DOCS.md#li...

                • ojs 1 day ago

                  So Odin actually offers configurable workflows in four supported runtimes at the moment: Python, Go, Node and Bash. Temporal seems to only support Java and Go as far as I can see!

                  Specifically, the observability hooks in Odin would be the differentiating feature. Information is gathered to help infer the internal state of jobs. This means Odin can directly help diagnose where the problems are and get to the root cause of any interruptions. Debugging, but you're one step ahead.

                  • dnautics 1 day ago

                    I got confused because I'm most familiar with Oban, which is a distributed workflow manager and scheduler written in a different language:

                    https://github.com/sorentwo/oban

                  • jitl 1 day ago

                    Does Odin support task queuing like SQS, RabbitMQ, Redis, etc, or is this just for scheduled jobs? The debugging info looks very cool, but I need this for queue jobs more than scheduled jobs.

                    Relatedly, how many jobs/s or queries/s do you expect Odin to support in the suggested 3-leader configuration on typical VMs? What kind of load do you run this in production?

                    • sumobashriki 1 day ago

                      What role does MongoDB serve? Why would you even use MongoDB instead of Badger or Bolt

                      • threeseed 1 day ago

                        Because Badger/Bolt are key/value stores and MongoDB is a document store.

                        I can't imagine how anyone can model even moderately complex use cases on a key/value store. Even with some sort of ORM, filtering and manipulation would be very cumbersome.

                        • ojs 1 day ago

                          Mongo serves as a data store for job metadata, which is then accessed to be displayed on the dashboard.

                          We are currently working on having the database component as something more pluggable, so down the line you will be able to use the DB of your preference :D

                        • saxonww 1 day ago

                          [edited to remove snark]

                          An earlier discussion on why Odin: https://news.ycombinator.com/item?id=23460066

                        • rubio8 1 day ago

                          Nice job on this, it looks well featured. Would like to see a Ruby SDK.

                          • ojs 1 day ago

                            I'm not a very confident Ruby developer but it's definitely going to be something added to the development roadmap!

                          • meritt 1 day ago

                            This looks like a really complex crontab that requires mongoDB of all things.

                            • AtlasBarfed 9 hours ago

                              I harp on orchestration and workflow so I'm a bit of a cranky old man on this subject.

                              I get you can technically do orchestration and workflow with this, but you can also technically do it with any turing complete language.

                              Orchestration and workflow should involve UI design elements for workflows, integration/interfaces, and visual execution tracking and workflow instance initiating?

                              And yeah, MongoDB, which just got torn apart by aphyr for anything concurrent or distributed.

                              • Hallo james

                                • ojs 1 day ago

                                  Hallo senan

                                • samblr 1 day ago

                                  Point the submitted link to the repo (it's pointing to the README).