Course Code: nifidevadm
Duration: 28 hours
Prerequisites:
  • Java programming experience.
  • Experience with Maven.
  • Experience with the Linux command line.
Overview:

Developers

In this instructor-led, live training, participants will learn the fundamentals of flow-based programming as they develop a number of demo extensions, components and processors using Apache NiFi.

By the end of this training, participants will be able to:

  • Understand NiFi's architecture and dataflow concepts.
  • Develop extensions using NiFi and third-party APIs.
  • Develop their own custom Apache NiFi processor.
  • Ingest and process real-time data from disparate and uncommon file formats and data sources.

Administrators

Apache NiFi (Hortonworks DataFlow) is a real-time integrated data logistics and simple event processing platform that enables the moving, tracking and automation of data between systems. It is based on flow-based programming and provides a web-based user interface to manage dataflows in real time.

In this instructor-led, live training (onsite or remote), participants will learn how to deploy and manage Apache NiFi in a live lab environment.

By the end of this training, participants will be able to:

  • Install and configure Apache NiFi.
  • Source, transform and manage data from disparate, distributed data sources, including databases and big data lakes.
  • Automate dataflows.
  • Enable streaming analytics.
  • Apply various approaches for data ingestion.
  • Transform Big Data into business insights.

Format of the Course

  • Interactive lecture and discussion.
  • Lots of exercises and practice.
  • Hands-on implementation in a live-lab environment.
Course Outline:

Developers

Introduction

  • Data at rest vs data in motion

Overview of Big Data Tools and Technologies

  • Hadoop (HDFS and MapReduce) and Spark

Installing and Configuring NiFi

Overview of NiFi Architecture

Development Approaches

  • Application development tools and mindset
  • Extract, Transform, and Load (ETL) tools and mindset

Design Considerations

Components, Events, and Processor Patterns

Exercise: Streaming Data Feeds into HDFS

Error Handling

Controller Services

Exercise: Ingesting Data from IoT Devices using Web-Based APIs

Exercise: Developing a Custom Apache NiFi Processor using JSON

Testing and Troubleshooting

Contributing to Apache NiFi

Administrators

Introduction to Apache NiFi

  • Data at rest vs data in motion

Overview of Big Data and Apache Hadoop

  • HDFS and MapReduce architecture

Setting up and Running a NiFi Cluster

  • Cluster Integration
  • Load Balancing/Redundancy
  • Mass Orchestration of NiFi (via Ansible)

NiFi Operations

  • Database Aggregation, Splitting and Transformation
  • Data Extractions, Logging, etc.
  • Integrating with Splunk (optional)

Monitoring and Recovery

  • Recovering without Data Loss
  • Autonomous Recovery

Optimizing NiFi

  • Performance tuning
  • Optimizing NiFi Setup

Best Practices

Troubleshooting

Summary and Conclusion