Case study: generate nudges
In the previous chapters, we learned the core Airflow concepts and how to write DAGs and custom Operators. In this chapter, let's work through a real-world use case and build an Airflow pipeline together.
Background
An e-commerce company (let's call it Cell-mate) is a famous online shop selling phones and accessories. It has various data sources that export data as CSV files to a Google Cloud Storage (GCS) bucket on a daily basis.
Data sources
Cell-mate has three primary data sources:
- Accounts: all the account information of the customers.
- Items: all the items that are listed on the website.
- Activities: all the item-viewing activities of the customers.
Goal
The product team at Cell-mate would like to build a pipeline that generates nudge emails for customers who recently viewed items on the website but didn't make a purchase. As a first step, they would like the nudge data to be stored somewhere the email campaign service can read it.
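Before we touch Airflow, it can help to see the nudge rule itself in plain Python. The sketch below is only an illustration under assumptions: the field names (`account_id`, `item_id`, `type`, `date`), the `"view"` and `"purchase"` activity types, the 7-day lookback window, and the `find_nudges` helper are all hypothetical, since the actual CSV schema has not been introduced yet.

```python
from datetime import date, timedelta

# Hypothetical activity records; the real CSV columns may differ.
activities = [
    {"account_id": "a1", "item_id": "i1", "type": "view", "date": date(2024, 1, 10)},
    {"account_id": "a1", "item_id": "i1", "type": "purchase", "date": date(2024, 1, 11)},
    {"account_id": "a1", "item_id": "i2", "type": "view", "date": date(2024, 1, 11)},
    {"account_id": "a2", "item_id": "i1", "type": "view", "date": date(2024, 1, 1)},
    {"account_id": "a2", "item_id": "i3", "type": "view", "date": date(2024, 1, 9)},
]


def find_nudges(activities, today, lookback_days=7):
    """Return sorted (account_id, item_id) pairs viewed recently but never purchased."""
    cutoff = today - timedelta(days=lookback_days)

    # Every (account, item) pair that has at least one purchase record.
    purchased = {
        (a["account_id"], a["item_id"])
        for a in activities
        if a["type"] == "purchase"
    }

    nudges = set()
    for a in activities:
        key = (a["account_id"], a["item_id"])
        # Keep recent views whose (account, item) pair was never purchased.
        if a["type"] == "view" and a["date"] >= cutoff and key not in purchased:
            nudges.add(key)
    return sorted(nudges)


nudges = find_nudges(activities, today=date(2024, 1, 12))
print(nudges)
```

With this sample data, `a1` is nudged about `i2` and `a2` about `i3`: `a1` already bought `i1`, and `a2` viewed `i1` outside the lookback window. In the real pipeline this logic would more likely run where the data lives rather than in memory, but the rule stays the same.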
Let's continue and create the DAG first.