Installing Apache Airflow on Windows | Easy & Fast Approach | DAGs

Muhammad Talha Khan
5 min read · Nov 29, 2021

Guys, on my way to becoming a Data Engineer, my classmates and I were installing Apache Airflow by following some confusing documentation that included unnecessary steps.

So after that day, I decided to document the steps we followed during our session to install Apache Airflow on our Windows machines.

Before starting the installation, let me tell you: what is Apache Airflow?

Apache Airflow was originally created at Airbnb, and is now an Apache Software Foundation project, to author, schedule, and monitor workflows (such as ETL pipelines) easily.

Let’s dive into the installation. Here is what you will need:

  1. A computer or laptop (of course!) 😉
  2. Windows 11 or an up-to-date Windows 10
  3. Ubuntu (installed from the Microsoft Store)
  4. Python (3.9)
  5. Apache Airflow (current version)

STEP 1: TURNING THE WINDOWS FEATURE ON FOR LINUX SUBSYSTEM

a. Click the Start button

b. Type “Turn Windows features on or off”

Start Menu (Turn Windows Feature On or Off)

c. The Windows Features window will appear; look for “Windows Subsystem for Linux”

Windows Features (Subsystem for Linux)

d. Tick its checkbox and click OK; a prompt will ask you to restart your computer.

e. After the restart, head to the Microsoft Store and search for “Ubuntu”

Ubuntu (Microsoft Store)

f. Install it.

STEP 2: DOWNLOAD & INSTALL C++ BUILD TOOLS

To get Apache Airflow working, you need to install the C++ Build Tools, which you can download from here.

Build Tools (Microsoft)

Once the download is complete, leave it on your system for now and move on to the next step.

STEP 3: SELECT YOUR USERNAME & PASSWORD FOR UBUNTU

Now search for “Ubuntu” in the Start menu and open it.

The first time it opens, it completes the initial setup by asking you to choose a username and password. I had already configured mine during an earlier Airflow installation.

Ubuntu (Main Screen)

You can see my username is set up as “Spidey”; you guys can pick your own.

STEP 4: INSTALLING PIP INSIDE UBUNTU

Here are some commands you will be copying and pasting into Ubuntu. But do you guys know how to paste?

sudo apt-get install software-properties-common
sudo apt-add-repository universe
sudo apt-get update
sudo apt-get install python3-setuptools
sudo apt install python3-pip
sudo -H pip3 install --upgrade pip

Run these commands one by one.

A tip for pasting: once you have copied a command, right-click inside the Ubuntu window to paste it. (Thank me later 😉)

You can also verify your pip installation by running pip -V
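If you want to check everything from this step in one go, here is a minimal sketch. The helper name have is my own invention for illustration, and the list of tools is an assumption about what this guide needs; adjust it to taste.

```shell
# Hypothetical helper: succeeds if the given command can be found on PATH.
have() {
    command -v "$1" >/dev/null 2>&1
}

# Report each tool this guide relies on (the list itself is an assumption).
for tool in python3 pip3 pip; do
    if have "$tool"; then
        echo "found:   $tool -> $(command -v "$tool")"
    else
        echo "missing: $tool"
    fi
done
```

Anything reported as missing means one of the commands above did not finish cleanly, so it is worth re-running it before moving on.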

STEP 5: INSTALLING DEPENDENCIES FOR APACHE AIRFLOW

We have saved you some time by working these out beforehand. Just copy and paste the commands below to install the dependencies in Ubuntu.

sudo apt-get install libmysqlclient-dev
sudo apt-get install libssl-dev
sudo apt-get install libkrb5-dev
sudo apt-get install libsasl2-dev

STEP 6: INSTALLING APACHE AIRFLOW

Once again we have saved you some time with a ready-made command, and you already know what to do with it. 😉

sudo SLUGIFY_USES_TEXT_UNIDECODE=yes pip install apache-airflow
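As an alternative, the official Airflow installation docs recommend pinning the Airflow version and passing a constraints file to pip, which tends to avoid dependency clashes. The sketch below only builds and prints such a command; it does not install anything. The version 2.2.2 is my assumption (use whichever release you want), 3.9 is the Python version from the list at the top, and constraint_url is a hypothetical helper name.

```shell
# Hypothetical helper: build the constraints-file URL for a given
# Airflow version ($1) and Python major.minor version ($2).
constraint_url() {
    echo "https://raw.githubusercontent.com/apache/airflow/constraints-$1/constraints-$2.txt"
}

AIRFLOW_VERSION="2.2.2"   # assumption: replace with the release you want
PYTHON_VERSION="3.9"      # the Python version listed in this guide

url="$(constraint_url "$AIRFLOW_VERSION" "$PYTHON_VERSION")"

# Print the pinned install command instead of running it.
echo "pip install \"apache-airflow==${AIRFLOW_VERSION}\" --constraint \"${url}\""
```

Copy the printed line into Ubuntu if you prefer a pinned install over the plain command above.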

After installation, we move on to a few changes required for things to run smoothly.

Add your user's local bin directory to PATH so that the shell can find the airflow command. Replace <username> with your own username in the command below:

export PATH=$PATH:/home/<username>/.local/bin

For example, my username is “Spidey”, so:

export PATH=$PATH:/home/Spidey/.local/bin
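Keep in mind that an export typed at the prompt only lasts for the current terminal window. Here is a minimal sketch of one way to make it stick by writing the line to a shell startup file, without adding it twice. The function name add_path_line is hypothetical, and the demo below works on a throwaway temp file so nothing on your system is touched.

```shell
# Hypothetical helper: append a PATH export for directory $1 to file $2,
# unless that exact line is already present.
add_path_line() {
    dir="$1"
    rc_file="$2"
    line="export PATH=\"\$PATH:$dir\""
    touch "$rc_file"
    if ! grep -qxF -- "$line" "$rc_file"; then
        printf '%s\n' "$line" >> "$rc_file"
    fi
}

# Demo on a temporary file: the second call adds nothing.
demo_rc="$(mktemp)"
add_path_line "/home/<username>/.local/bin" "$demo_rc"
add_path_line "/home/<username>/.local/bin" "$demo_rc"
cat "$demo_rc"
rm -f "$demo_rc"
```

For real use you would pass "$HOME/.local/bin" and "$HOME/.bashrc" instead, assuming bash is your login shell in Ubuntu, and then open a new terminal.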

Yayyyy! 😎 You just installed Apache Airflow.

Now, open another instance of Ubuntu to run Airflow commands.

STEP 7: APACHE AIRFLOW COMMANDS W/ SETUP

First-time users will need to go through all of the steps below:

a. Command to initialize the database

airflow db init

Once done, all the necessary files will be created inside the airflow directory in your home folder. We will now make some changes to Airflow’s setup.

b. Commands to open the config file

cd airflow
ls
sudo nano airflow.cfg

Now make the following changes:

dags_folder = /mnt/c/dags
base_log_folder = /mnt/c/dags/logs

Note: The DAGs and logs folder paths above map Airflow to your Windows C: drive. You will need to create two folders: one on your C: drive at C:\dags, and another inside it at C:\dags\logs.

You can change the location and specify the folders of your choice. I used the above directory as it is easy to locate and access. You will also avoid any potential permissions issues in this directory.
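If you would rather not edit the file by hand in nano, the same two changes can be scripted. This is only a sketch under some assumptions: set_cfg_key is a hypothetical helper, it expects each setting on a single "key = value" line, and the value must not contain the characters | or &. The demo runs on a small sample file, not on your real airflow.cfg.

```shell
# Hypothetical helper: in file $1, replace the value of the line
# starting with "<key> =" (key is $2) with the new value $3.
set_cfg_key() {
    file="$1"
    key="$2"
    value="$3"
    sed "s|^$key *=.*|$key = $value|" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Demo on a sample file that mimics the relevant lines of airflow.cfg.
cfg="$(mktemp)"
printf '%s\n' \
    '[core]' \
    'dags_folder = /home/<username>/airflow/dags' \
    '[logging]' \
    'base_log_folder = /home/<username>/airflow/logs' > "$cfg"

set_cfg_key "$cfg" dags_folder /mnt/c/dags
set_cfg_key "$cfg" base_log_folder /mnt/c/dags/logs
cat "$cfg"
rm -f "$cfg"
```

To apply it for real, you would point it at the airflow.cfg in your airflow directory. The two folders from the note can also be created from inside Ubuntu with mkdir -p /mnt/c/dags/logs.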

Now run,

airflow db init

If you receive an error mentioning the psycopg2 package, run the following commands:

sudo apt-get update -y
sudo apt-get install -y libpq-dev
pip install psycopg2

Now run it again,

airflow db init

Hurrah! You did it. Now start up the webserver and the scheduler:

Open a new Ubuntu instance, run the first command, and leave it running. Then open another terminal window and run the second command.

airflow webserver -p 8080
airflow scheduler
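The webserver can take a little while to come up. If you want to confirm it is answering before you switch to the browser, here is a rough sketch that polls the webserver's /health endpoint with curl. The function name wait_for_url and the retry count are my own choices, and it assumes curl is installed in your Ubuntu.

```shell
# Hypothetical helper: request URL $1 up to $2 times, one second apart.
# Returns 0 as soon as a request succeeds, 1 if none of them do.
wait_for_url() {
    url="$1"
    tries="$2"
    i=0
    while [ "$i" -lt "$tries" ]; do
        if curl -fsS --max-time 2 "$url" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

if wait_for_url "http://localhost:8080/health" 2; then
    echo "webserver is up"
else
    echo "webserver not reachable yet"
fi
```

If it keeps reporting that the webserver is not reachable, check the terminal where airflow webserver is running for errors.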

Afterwards, open your browser and type:

localhost:8080

When you hit Enter, the following page will come up:

Airflow Window Showing DAGs

You can find a detailed guide on how to run Airflow here. This is my first guide since completing this journey, so stay tuned. Lots more stuff is in the pipeline and will be coming to YouTube.

If you need any help with it, comment and let me know!
