<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Airflow on Alexander Junge&#39;s website</title>
    <link>https://www.alexanderjunge.net/tags/airflow/</link>
    <description>Recent content in Airflow on Alexander Junge&#39;s website</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en-US</language>
    <lastBuildDate>Mon, 22 Feb 2021 00:00:00 +0000</lastBuildDate>
    
	<atom:link href="https://www.alexanderjunge.net/tags/airflow/index.xml" rel="self" type="application/rss+xml" />
    
    
    <item>
      <title>Spotlight: New TaskFlow API in Apache Airflow 2</title>
      <link>https://www.alexanderjunge.net/blog/taskflow-airflow-2/</link>
      <pubDate>Mon, 22 Feb 2021 00:00:00 +0000</pubDate>
      
      <guid>https://www.alexanderjunge.net/blog/taskflow-airflow-2/</guid>
      <description>I recently switched to Apache Airflow 2, which was released in December 2020. I am a big fan of the new TaskFlow API and want to highlight it here.
The TaskFlow API lets users write DAGs more concisely, with less boilerplate code. Specifying task dependencies and exchanging data between tasks via XComs is also much easier.
A minimal example of a DAG using the TaskFlow API looks something like this:</description>
    </item>
    
    <item>
      <title>Querying arXiv preprints using Airflow</title>
      <link>https://www.alexanderjunge.net/blog/arxiv-airflow-fastapi-psql/</link>
      <pubDate>Sun, 02 Feb 2020 00:00:00 +0000</pubDate>
      
      <guid>https://www.alexanderjunge.net/blog/arxiv-airflow-fastapi-psql/</guid>
      <description>I experimented with Apache Airflow to schedule hourly workflows that fetch recent preprint articles from different arXiv categories via the public arXiv.org REST API. These articles are then stored in a PostgreSQL database via a custom-built FastAPI-based REST API.
The setup looks like this:
The code is fully dockerized and available on GitHub along with more detailed documentation.</description>
    </item>
    
  </channel>
</rss>