Context manager DAG

DAGs can be used as context managers, which automatically assigns every Operator/Sensor created inside the with block to that DAG. This is helpful when a DAG has many tasks: you no longer need to repeat dag=dag in each Operator/Sensor. The latest Airflow documentation recommends this style.
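
For comparison, here is what the same two tasks would look like without a context manager; every Operator then needs an explicit dag argument. A minimal sketch (the DAG id here is just illustrative, and imports/default_args are omitted):

dag = DAG(
    "explicit_dag_example",
    schedule_interval="0 12 * * *",
    start_date=datetime(2021, 12, 1),
)

# Each task must be attached to the DAG by hand.
start = BashOperator(task_id="start", bash_command="echo start", dag=dag)
end = BashOperator(task_id="end", bash_command="echo stop", dag=dag)

start >> end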

Below is a modified version of our first DAG from the previous page.

Create a file named 2_context_manager_dag.py that contains the following code:

from datetime import datetime, timedelta
from airflow import DAG
from airflow.operators.bash import BashOperator


default_args = {
    "owner": "airflow",
    "depends_on_past": False,
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
    "email": ["airflow@example.com"],
    "email_on_failure": False,
    "email_on_retry": False,
}

with DAG(
    "2_context_manager_dag",
    default_args=default_args,
    description="context manager Dag",
    schedule_interval="0 12 * * *",
    start_date=datetime(2021, 12, 1),
    catchup=False,
    tags=["custom"],
) as dag:

    start = BashOperator(
        task_id="start",
        bash_command="echo start",
    )

    end = BashOperator(
        task_id="end",
        bash_command="echo stop",
    )

    start >> end
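
Once the file is saved into your DAGs folder, you can confirm that it parses and exercise a single task without starting the scheduler. A quick check, assuming the Airflow CLI is installed and configured:

airflow dags list
airflow tasks test 2_context_manager_dag start 2021-12-01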

So far, in the two DAGs that we wrote, the tasks run one after another. That works, but it may not be the most efficient approach. What if we have a pipeline in which some tasks can run in parallel?
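
As a preview, Airflow can express parallel branches by putting a list of tasks in a dependency chain. A minimal sketch with hypothetical task names:

start >> [extract_a, extract_b] >> end

Here extract_a and extract_b have no dependency on each other, so the scheduler is free to run them at the same time.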