In Data Engineer’s Lunch #29: Introduction to Apache Nifi, we introduce Apache Nifi, cover its core concepts, and walk through a demo that pulls CSV data and transforms it into individual JSON records. If you were not able to attend live, the recording, which includes a more in-depth discussion and the demo, is embedded below. Data Engineer’s Lunch is hosted live every Monday at noon EST; register here to attend. You can also subscribe to our YouTube Channel to watch prior and future Data Engineer’s Lunches on demand!
Apache Nifi was built to automate the flow of data between systems. Nifi supports powerful and scalable directed graphs of data routing, transformation, and system mediation logic. One of its biggest advantages is the web-based user interface in which you build your data pipeline, so you can often build an entire pipeline without writing any code. When you do need custom logic, there are processors that run scripts. With Apache Nifi, you can track your dataflow from beginning to end and watch individual flowfiles as they move from processor to processor through queues.
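To make the idea of flowfiles moving between processors through queues more concrete, here is a minimal sketch in plain Python. It is only a toy model and not Nifi's API: the FlowFile class, the two processors (uppercase and tag_length), and run_pipeline are all hypothetical names we made up for illustration, and we assume a flowfile can be thought of as a content payload plus a set of attributes.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Toy stand-in for a flowfile: a content payload plus key/value attributes."""
    content: str
    attributes: dict = field(default_factory=dict)


def uppercase(flowfile):
    """Hypothetical processor: transform the content, keep the attributes."""
    return FlowFile(flowfile.content.upper(), dict(flowfile.attributes))


def tag_length(flowfile):
    """Hypothetical processor: record the content length as an attribute."""
    attrs = dict(flowfile.attributes, length=str(len(flowfile.content)))
    return FlowFile(flowfile.content, attrs)


def run_pipeline(inputs, processors):
    """Push flowfiles through each processor in order, with a queue between steps."""
    queue = deque(inputs)
    for process in processors:
        next_queue = deque()
        while queue:
            # Take the oldest flowfile, process it, and queue it for the next step.
            next_queue.append(process(queue.popleft()))
        queue = next_queue
    return list(queue)


results = run_pipeline(
    [FlowFile("hello"), FlowFile("nifi")],
    [uppercase, tag_length],
)
for ff in results:
    print(ff.content, ff.attributes)
```

In the real tool you would wire this up by dragging processors onto the canvas and connecting them, and the queues between them are what you see filling and draining in the UI.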
The purpose of this walkthrough is to be a simple introduction to Apache Nifi and how to build data flows using the UI. In the walkthrough, we will cover how we can take a CSV file, break it up, and convert the individual records into JSON format. Again, this walkthrough is mainly for getting accustomed to Apache Nifi, and we will cover more complex data flows in future walkthroughs.
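If it helps to see the end result before building the flow, the sketch below shows the same transformation in plain Python: one CSV input goes in, and one JSON record per row comes out. This is not how Nifi does it internally and it does not use any Nifi processors; the sample data and the csv_to_json_records helper are assumptions we made up for illustration.

```python
import csv
import io
import json

# Made-up sample data; the CSV file used in the demo may look different.
CSV_TEXT = """id,name,city
1,Ada,London
2,Grace,New York
"""


def csv_to_json_records(csv_text):
    """Split CSV text into rows and return one JSON string per row."""
    # DictReader uses the header line as the keys for every row.
    reader = csv.DictReader(io.StringIO(csv_text))
    return [json.dumps(row) for row in reader]


records = csv_to_json_records(CSV_TEXT)
for record in records:
    print(record)
```

Note that every value comes out as a string here, since CSV carries no type information; converting fields such as id to numbers would be an extra step.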
We will be using Gitpod for this walkthrough so that anyone can follow along without having to worry about OS incompatibilities. Click this link to get started!
1. Download Apache Nifi
curl -L -s https://mirrors.advancedhosters.com/apache/nifi/1.13.2/nifi-1.13.2-bin.tar.gz | tar xvz -C /workspace/example-introduction-to-nifi
4. Watch Video Below for Detailed Setup
If you missed Data Engineer’s Lunch #28: Petl for Data Engineering, be sure to check it out as well; it is also available on our YouTube channel!