Course Code: apachinifiadmin
Duration: 21 hours
Prerequisites:
  • Experience with Linux command line.


Audience

  • System administrators
  • Data engineers
  • Developers
  • DevOps

Overview:

Apache NiFi (Hortonworks DataFlow) is a real-time integrated data logistics and simple event processing platform that enables the movement, tracking and automation of data between systems. It is built on the flow-based programming model and provides a web-based user interface for managing dataflows in real time.

This instructor-led, live training (online or onsite) is aimed at system administrators who wish to set up and manage Apache NiFi.

Format of the Course

  • Interactive lecture and discussion.
  • Lots of exercises and practice.
  • Hands-on implementation in a live-lab environment.

Course Outline:

Apache NiFi Introduction

  • [Brief Introduction on Big Data] – If required
  • [Brief overview of different Data Ingestion Tools & comparison] – If required
  • [Apache NiFi Use cases] – If required
  • [Apache NiFi Architecture, Features, Terminologies & Internals]
  • [Apache NiFi integrations & looking at Hortonworks DataFlow (HDF)] – If required
  • [Apache NiFi Single Node setup] – If required

Setup and Configuration

  • Discussion to understand the current setup and the issues/problems faced.
  • NiFi Cluster set-up (including ZooKeeper) with High Availability (HA)
  • Understanding ZooKeeper.
  • Understanding Content repository archiving & Content claims.
  • Understanding Data Provenance & repository.
  • NiFi user authorization and configuration with LDAP groups – If required
  • NiFi Registry (imports/exports)
  • NiFi Registry APIs and Command line interfaces
  • Deploying new FlowFiles in Cluster mode & internals.

NiFi Performance tuning, optimizations and system configurations

  • Deep dive into the configuration properties (by section).
  • Better understanding of tasks, threads, connections and workload intensity.
  • Optimized memory settings.
  • Best Practices
  • Configuring properties per environment with password encryption – If required
  • NiFi Cluster Administration Tasks
  • NiFi Reporting, Monitoring, Logging, Backup & Recovery.
  • Integration with other tools such as Nagios, Splunk or the ELK Stack.

Troubleshooting

  • Discussion of issues faced and a second look at solutions.
  • Performance issues with provenance store
  • Automated testing
  • Promotion of data from development to production
  • Configuration property options.

Summary and Conclusion