vnpedroso/local_airflow_instance

Local Airflow Instance

A local airflow instance deployed with the official airflow helm chart in a minikube cluster

components

  • airflow-namespace.yml: creates the namespace in which we will deploy our airflow instance
  • airflow-variables-configmap.yml: creates variables to be deployed within our airflow instance
  • airflow-values.yml: main yaml config file to be used as a reference for the deployment of the helm chart
  • Dockerfile: the dockerfile containing the customization of airflow base image used in the instance
  • /dags: folder containing a sample dag that is built into the customized airflow image defined in the Dockerfile
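
The two manifests listed above have to reach the cluster somehow. A minimal sketch of one way to do it, assuming both are plain Kubernetes manifests in the repository root, that the configmap manifest declares its own namespace, and that the cluster from the next section is already running (the `apply_manifests` helper name is hypothetical):

```shell
# Hypothetical helper: applies the namespace first, then the configmap.
# Assumes both files are ordinary Kubernetes manifests in the current directory.
apply_manifests() {
  kubectl apply -f ./airflow-namespace.yml &&
  kubectl apply -f ./airflow-variables-configmap.yml
}
```

Calling `apply_manifests` before installing the chart matters because `helm upgrade --install ... -n airflow-namespace` expects the namespace to exist already.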

starting minikube cluster

I'm using minikube in this example, but this should also work with k3d, k3s, kind or any other tool that runs a Kubernetes cluster locally. To start your minikube cluster, run the following:

minikube start --memory=<MEMORY_SPECS> --cpus=<CPU_SPECS>
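
As an illustration of filling in the placeholders, here is a small sketch with default values. The numbers (8192 MB of memory, 4 CPUs) and the `start_cluster` helper name are assumptions for the example, not requirements stated by this repository; adjust them to your machine.

```shell
# Hypothetical helper: starts minikube with illustrative defaults.
# $1 = memory in MB (default 8192), $2 = number of CPUs (default 4).
start_cluster() {
  memory_mb="${1:-8192}"
  cpus="${2:-4}"
  minikube start --memory="${memory_mb}" --cpus="${cpus}"
}
```

For example, `start_cluster 4096 2` would start a smaller cluster.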

building customized image

Lines 48 to 55 of the airflow-values.yml file define the image used in the deployment of the helm chart. The image name and tag defined there must match the name and tag of the image that you build and load into the minikube cluster:

docker build -t airflow_extended_image:1.0 .
minikube image load airflow_extended_image:1.0
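
To confirm that the image actually reached the cluster, it can be looked up in the list of images minikube knows about. A minimal sketch, assuming the name and tag used in the build command above (the `image_is_loaded` helper and the two variables are hypothetical names):

```shell
# Must match the image name and tag set in airflow-values.yml.
IMAGE_NAME="airflow_extended_image"
IMAGE_TAG="1.0"

# Hypothetical helper: succeeds if the image appears in `minikube image ls`.
image_is_loaded() {
  minikube image ls | grep -qF -- "${IMAGE_NAME}:${IMAGE_TAG}"
}
```

A call such as `image_is_loaded && echo "image found"` prints the message only when the image is present.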

installing airflow helm chart

To install the official airflow helm chart, first add its repository:

helm repo add apache-airflow https://airflow.apache.org

Then deploy the airflow instance specifying the airflow-values.yml as the reference for helm:

helm upgrade --install airflow apache-airflow/airflow -f ./airflow-values.yml -n airflow-namespace --debug 
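
Once the command returns, it can help to check the state of the release and of its pods. A small sketch, assuming the release name `airflow` and the namespace `airflow-namespace` from the command above (the `check_release` helper name is hypothetical):

```shell
# Hypothetical helper: shows the helm release status, then lists the pods.
check_release() {
  helm status airflow -n airflow-namespace &&
  kubectl get pods -n airflow-namespace
}
```

The pod names printed here are the ones needed later when opening a shell in the webserver or scheduler pod.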

checking variables created with airflow-variables-configmap.yml

Variables created with the configmap will not appear in the webserver UI. However, they are available in your instance. You can check them by opening a shell inside either the webserver pod or the scheduler pod:

kubectl exec --stdin --tty <webserver_pod_name or scheduler_pod_name> -n <namespace_name> -- /bin/bash

Once inside the pod, start a Python interpreter (python) and run the following to check the variables:

from airflow.models import Variable
Variable.get("<variable_name>")
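
The same check can be done without an interactive session. A minimal sketch, assuming the chart created a deployment named `airflow-webserver` in `airflow-namespace` (this follows from the release name `airflow`, but verify it in your cluster); the `get_airflow_variable` helper name and the variable `my_var` are hypothetical:

```shell
# Hypothetical helper: prints the value of one Airflow variable.
# $1 = variable name. Assumes a deployment called airflow-webserver exists.
get_airflow_variable() {
  kubectl exec deploy/airflow-webserver -n airflow-namespace -- \
    python -c 'import sys; from airflow.models import Variable; print(Variable.get(sys.argv[1]))' "$1"
}
```

For example, `get_airflow_variable my_var` would print the value of a variable called `my_var`, if one is defined.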

enabling airflow webserver

Line 984 of the airflow-values.yml file sets the type of the service that exposes the webserver. If using minikube, you can set it to LoadBalancer and run the following in a dedicated terminal:

minikube tunnel
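
While the tunnel is running, the service should receive an external IP. A small sketch for reading it, assuming the service is named `airflow-webserver` as in the port-forward command in this README (the `webserver_external_ip` helper name is hypothetical):

```shell
# Hypothetical helper: prints the external IP assigned to the webserver service.
# Prints nothing if no IP has been assigned yet.
webserver_external_ip() {
  kubectl get svc airflow-webserver -n airflow-namespace \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}
```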

If using ClusterIP as service type, which is the default, run the following in a dedicated terminal:

kubectl port-forward -n airflow-namespace svc/airflow-webserver 8080:8080

The webserver will then be available on your localhost at port 8080. Access it through your browser at http://127.0.0.1:8080
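
A quick way to confirm that the webserver answers before opening the browser is to query its health endpoint. This sketch assumes an Airflow 2.x webserver, which serves `/health`, and the local port 8080 used above (the `webserver_health` helper name is hypothetical):

```shell
# Hypothetical helper: requests the webserver health endpoint.
# $1 = local port (default 8080). Fails if the server returns an HTTP error.
webserver_health() {
  curl --silent --fail "http://127.0.0.1:${1:-8080}/health"
}
```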
