Set up and run Airflow locally
There are a few ways to run Airflow locally:
- Run it in a Python virtual environment
- Run it using `docker-compose`
- Deploy it using the Helm chart
Python virtual environment deployment
The Airflow community created a guide that shows how to run Airflow in a virtual environment.
This setup is lightweight and suitable for testing something quickly. It is not a production-ready setup because it only processes tasks sequentially.
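If you want to try it, a minimal sketch looks roughly like this. The Airflow and Python versions below are just examples; the official constraints file keeps dependency versions compatible with the Airflow release you pick.

```bash
# Create and activate a virtual environment.
python -m venv airflow-venv
source airflow-venv/bin/activate

# Install Airflow pinned by the official constraints file.
# The version numbers are examples; adjust them to your setup.
AIRFLOW_VERSION=2.7.1
PYTHON_VERSION=3.8
pip install "apache-airflow==${AIRFLOW_VERSION}" \
  --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"

# Start a local instance (webserver, scheduler, and a default admin user).
airflow standalone
```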
Docker-compose
Using `docker-compose` is the preferred way to run Airflow locally. Again, the Airflow community has kindly created a guide, including a pre-baked `docker-compose.yaml` file.
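If you want to follow along, the setup from that guide boils down to something like this. The version in the URL is an assumption; substitute the release you actually want to run.

```bash
# Download the pre-baked docker-compose.yaml (version number is an example).
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.7.1/docker-compose.yaml'

# Create the folders that get mounted into the containers and record your
# host user id so files are not written as root (see the note further down).
mkdir -p ./dags ./logs ./plugins
echo "AIRFLOW_UID=$(id -u)" > .env

# Initialize the metadata database and create the default user.
docker-compose up airflow-init
```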
When you run `docker-compose up`, a whole Airflow cluster comes up, including:
- `airflow-scheduler`: The scheduler monitors all tasks and DAGs, then triggers the task instances once their dependencies are complete.
- `airflow-webserver`: The webserver is available at http://localhost:8080.
- `airflow-worker`: The worker that executes the tasks given by the scheduler.
- `airflow-init`: The initialization service.
- `flower`: The Flower app for monitoring the environment. It is available at http://localhost:5555.
- `postgres`: The database.
- `redis`: The broker that forwards messages from the scheduler to the worker.
All these services allow you to run Airflow with `CeleryExecutor`.
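Once everything is up, a quick sanity check could look like this (service names as in the community `docker-compose.yaml`; adjust if yours differ):

```bash
# All services should eventually report a healthy state.
docker-compose ps

# Confirm the cluster is indeed running with the CeleryExecutor.
docker-compose exec airflow-scheduler airflow config get-value core executor
```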
There is only one trick/bug that you should be aware of: Docker has this weird habit of creating volumes as the root user. When running Airflow with `docker-compose`, there is a GitHub issue that includes a temporary solution.
If you see errors like `Errno 13 - Permission denied: '/opt/airflow/logs/scheduler'` after running `docker-compose up`, you need to stop `docker-compose` and run `chmod -R 777 logs/`. You should then be able to start your Airflow cluster again using `docker-compose up`.
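In other words, the workaround is just the following, assuming you run `docker-compose` from the folder that contains `logs/`:

```bash
# Stop the cluster, open up the mounted logs directory, and start again.
docker-compose down
chmod -R 777 logs/
docker-compose up
```

Setting `AIRFLOW_UID` in the `.env` file, as shown earlier, usually avoids the root-owned volume in the first place.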
Helm Chart
How can we forget Kubernetes these days when it comes to deploying something? To deploy Airflow on Kubernetes, there is a guide from the Airflow community.
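If you want to try it against a local or remote cluster, the chart's quick start is roughly this; the release and namespace names are just examples:

```bash
# Add the official Airflow Helm repository and install the chart.
helm repo add apache-airflow https://airflow.apache.org
helm repo update
helm install airflow apache-airflow/airflow --namespace airflow --create-namespace
```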