Using Argo to Train Predictive Models at FlightAware

(flightaware.engineering)

49 points | by rainingdeerbox 6 days ago

6 comments

  • vtuulos 6 days ago
    This is very cool! If you are interested in using a similar setup for your ML workloads (Argo + K8S), I'd love to chat with you (ville@outerbounds.co).

    We are working on integrating Argo & K8S for Metaflow, an open-source ML framework originally developed at Netflix https://github.com/Netflix/metaflow/pull/434

    • verst 6 days ago
      Out of curiosity, was Kubeflow ever considered and was there a deliberate decision against Kubeflow?

      Kubeflow Pipelines under the hood (and not very hidden) uses Argo Workflows.

      • KidComputer 6 days ago
        I found Kubeflow Pipelines to be worse than vanilla Argo. Not only is the documentation poor for the python DSL that gets compiled to the Argo workflow spec but some operations are downright impossible to express in the DSL. Vanilla Argo + RBAC works wonderfully especially with Argo Server since v3+.
        • verst 6 days ago
          I agree. For anything sophisticated you have to use the Kubernetes Python client combined with the KFP DSL. The compiled Argo Workflow spec can become such a mess that you quickly run into a size limitations -- especially if you choose to deploy a Python function directly (which then uses Kaniko to build the containers in cluster and run them via Argo Workflows). At that point I would rather use Argo directly. And KFP is generally quite behind in Argo Workflow versions.

          And it's not like KFP makes CI/CD (MLOps) any easier than Argo Workflows itself.

        • RocketSyntax 6 days ago
          Or Apache Airflow. I'd expect this and KFP to have better integrations that cater to data science. You can use Elyra extension to design the workflows right from JupyterLab.
          • FridgeSeal 6 days ago
            Not from said company, but did do a workflow tool comparison at my work and we also went with Argo.

            Argo had better kubernetes + surrounding ecosystem integration out of the box, it was designed to run containers by default which suited us because we had mixed language workloads. Airflow was mostly Python specific, unless you then ran plugins and extensions, the config/pipeline definition was written in Python which I didn’t want to do after witnessing my teammates write the worst Python I’ve seen in my career, and last time I evaluated it, it depended on a bunch of external, Python specific tools (celery etc) that I had previously found painful to run.

            • verst 6 days ago
              > my teammates write the worst Python I’ve seen in my career

              Isn't that fairly common which is why there are "ML engineers" that then productionize (clean up and optimize) the original code to be plugged into a production workflow / pipeline system?

        • maximilianroos 6 days ago
          Argo is extremely capable at its focus — coordinating workflows on Kubernetes.

          I had used Airflow for a few years, and looked into Prefect; in retrospect I'm very happy we chose Argo.

          Use Argo if: - Your tasks are containerized. - You're using Kubernetes, and can benefit from what it can offer — individually sized containers, autoscaling, fault-tolerance. - You have loosely coupled tasks — which pass at most pass files to each other, rather than python objects. - You don't have tens of thousands of tasks / streaming / etc.

          Airflow can run on Kubernetes, but with Airflow we ended up having equally sized workers up 24/7 — whether or not it was running an expensive job, a query on a remote system, or nothing.

          • iakh 6 days ago
            Why create a new model for each airport? That doesn't seem right. Delays are compounding and many are regional.
            • KingOfCoders 6 days ago
              I would think they train for predicting the normal case from positions near the airport to touch down and from touch down to the parking positions at the airport.
            • phissenschaft 6 days ago
              Really like the flexibility provided by Argo. It's the missing workflow concept from Kubernetes.
              • KingOfCoders 6 days ago
                I would like to know why they don't use GPUs for training but vCPUs.