Give your data team magical powers

Open-source data pipeline tool for transforming and integrating data.

Upgrade your data workflow

Effortlessly integrate and synchronize data from third-party sources.

Build real-time and batch pipelines to transform data using Python, SQL, and R.

Run, monitor, and orchestrate thousands of pipelines without losing sleep.

Build

Have you met anyone who said they loved developing in Airflow? That’s why we designed an easy developer experience that you’ll enjoy.

Easy developer experience

Start developing locally with a single command or launch a dev environment in your cloud using Terraform.

Language of choice

Write code in Python, SQL, or R in the same data pipeline for ultimate flexibility.

Engineering best practices

Each step in your pipeline is a standalone file containing modular code that’s reusable and testable with data validations. No more DAGs with spaghetti code.
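To make the idea concrete, here is a minimal sketch of what a standalone, testable step with a data validation could look like. It is plain Python written for illustration and does not use Mage's actual API; the names `transform` and `validate_output` and the order fields are hypothetical.

```python
# Hypothetical sketch in plain Python; it does not use Mage's actual API.
from typing import Dict, List

Row = Dict[str, object]


def transform(rows: List[Row]) -> List[Row]:
    """One pipeline step: keep completed orders and add a total in cents."""
    return [
        {**row, "total_cents": round(float(row["price"]) * int(row["quantity"]) * 100)}
        for row in rows
        if row["status"] == "completed"
    ]


def validate_output(rows: List[Row]) -> None:
    """Data validation for the step's output; raises AssertionError on bad data."""
    assert rows, "output is empty"
    assert all(row["total_cents"] >= 0 for row in rows), "negative total found"


if __name__ == "__main__":
    sample = [
        {"status": "completed", "price": "2.50", "quantity": 2},
        {"status": "cancelled", "price": "9.99", "quantity": 1},
    ]
    result = transform(sample)
    validate_output(result)
    print(result)
```

Because the step and its validation live together in one file, the step can be reused in another pipeline or tested on its own without running the whole pipeline.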

Preview

Are you wasting time trying to test your DAGs in production? Get instant feedback every time you run code in development.

Interactive code

Immediately see results from your code’s output with an interactive notebook UI.

Data is a first-class citizen

Each block of code in your pipeline produces data that can be versioned, partitioned, and catalogued for future use.

Collaborate on cloud

Develop collaboratively on cloud resources, version control with Git, and test pipelines without waiting for an available shared staging environment.

Launch

Don’t have a large team dedicated to Airflow? Mage makes it easy for a single developer to scale up and manage thousands of pipelines.

Fast deploy

Deploy Mage to AWS, GCP, Azure, or DigitalOcean with only 2 commands using maintained Terraform templates.

Scaling made simple

Transform very large datasets directly in your data warehouse or through a native integration with Spark.

Fully featured observability

Operationalize your pipelines with built-in monitoring, alerting, and observability through an intuitive UI.

You’ll love Mage. I bet Airflow gets dethroned by Mage next year!

Zach Wilson
Staff Data Engineer @ Airbnb

One thing that hasn't been highlighted much about Mage is the community.

The slack channel has been great and not only did they help me with my immediate problems but they also took a SERIOUS look at my feature requests and included one of them in the latest release!

Ajith Shetty
Senior Data Engineer @ Miniclip

Awestruck when I used Mage for the first time. It’s super clean and user-friendly.

Jon White
Principal Architect @ Red Alpha

I can say even after just trying it once, Mage would help any Data Engineering team write uniform, clean, well tested Data Pipelines. This is NOT something found in Airflow, Prefect, or Dagster.

Daniel Beach
Senior Data Engineer @ Rippleshot

The go to tool for any team looking to build and orchestrate data pipelines. Very friendly UI with a great developer experience, saving time in development. Mage is going to be the clear winner in the data pipeline tooling space.

Sujith Kumar
Data Architect @ Zero Pixels

I want to thank the Mage team for building such a great product. I am happy and excited to start using Mage as one of our daily data tools.

Zach Wilson
Staff Data Engineer @ Airbnb

I just loved using it, so easy and intuitive to use.

Zach Wilson
Staff Data Engineer @ Airbnb

Probably will make people better programmers in general.

Zach Wilson
Staff Data Engineer @ Airbnb


Questions & Answers

Mage was built with data engineers and data scientists in mind, but it is not limited to those roles. Other data professionals can find value in it too.

You can get started quickly by installing Mage using Docker (recommended), pip, or conda. See the documentation for details.

Mage is free as long as you self-host it (on AWS, GCP, Azure, or DigitalOcean).

Mage differentiates itself from Airflow and other tools through four core design principles:

  1. Focus on providing the easiest developer experience.

  2. Build engineering best practices into every aspect of data pipeline building.

  3. Treat data as a first-class citizen, because everything in Mage is about data.

  4. Make scaling simple and possible without the overhead of a large dedicated infra or DevOps team.

We currently support SQL, Python, R, and PySpark.

Yes, Mage works with Spark. A step-by-step tutorial covers using Mage with Spark on EMR.

We love and welcome community contributions! The contributing doc will get you started.

To request a feature, open a “Feature request” using the New issue button on GitHub, or join our feature-request Slack channel.