Apache Oozie workflow scheduler for Hadoop

What is Apache Oozie?

  • Workflow scheduler system to manage Hadoop jobs
  • Oozie workflow jobs are Directed Acyclic Graphs (DAGs) of actions
  • Oozie is a scalable, reliable, and extensible system

Oozie Features:

  • Execute and monitor workflows in Hadoop
  • Trigger execution based on data availability
  • Periodic scheduling of workflows
  • HTTP, command-line, and web interfaces
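
In Oozie, data-triggered and periodic execution are handled by a coordinator application that wraps a workflow. The following is a minimal sketch of a coordinator definition, not a definitive configuration: the application name, dataset name, dates, the /data/logs path layout, and the ${nameNode} and ${workflowAppPath} parameters are all assumptions made for illustration.

```xml
<!-- Hypothetical coordinator: runs once a day, but only after that day's input data exists -->
<coordinator-app name="example-coord" frequency="${coord:days(1)}"
                 start="2015-01-01T00:00Z" end="2015-12-31T00:00Z" timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.4">
    <datasets>
        <!-- One dataset instance per day; the directory layout is an assumption -->
        <dataset name="logs" frequency="${coord:days(1)}"
                 initial-instance="2015-01-01T00:00Z" timezone="UTC">
            <uri-template>${nameNode}/data/logs/${YEAR}/${MONTH}/${DAY}</uri-template>
            <!-- The instance counts as available once this flag file is present -->
            <done-flag>_SUCCESS</done-flag>
        </dataset>
    </datasets>
    <input-events>
        <!-- Wait for the dataset instance that matches the current run -->
        <data-in name="input" dataset="logs">
            <instance>${coord:current(0)}</instance>
        </data-in>
    </input-events>
    <action>
        <workflow>
            <!-- HDFS path of the workflow application to launch (assumed parameter) -->
            <app-path>${workflowAppPath}</app-path>
            <configuration>
                <property>
                    <name>inputDir</name>
                    <value>${coord:dataIn('input')}</value>
                </property>
            </configuration>
        </workflow>
    </action>
</coordinator-app>
```

Here the frequency attribute covers periodic scheduling, while the input-events block holds each run back until the day's data is available.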

Oozie workflow nodes:

  • Control flow
    • start/end/kill
    • decision
    • fork/join
  • Actions
    • map-reduce
    • pig
    • hdfs
    • sub-workflow, etc.
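
To show how these nodes fit together, here is a rough sketch of a workflow.xml that uses each control-flow node and three of the action types. The node names, the script.pig file, the /tmp/example-wf path, and the ${jobTracker}, ${nameNode}, ${inputDir}, and ${outputDir} parameters are assumptions for illustration. Note that the HDFS action is written with the fs element in the workflow schema.

```xml
<!-- Hypothetical workflow: decision, then fork into two actions, join, cleanup, end -->
<workflow-app name="example-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="check-input"/>

    <!-- decision: continue only if the input directory exists -->
    <decision name="check-input">
        <switch>
            <case to="fork-jobs">${fs:exists(inputDir)}</case>
            <default to="fail"/>
        </switch>
    </decision>

    <!-- fork: run the two actions below in parallel -->
    <fork name="fork-jobs">
        <path start="mr-node"/>
        <path start="pig-node"/>
    </fork>

    <!-- map-reduce action; mapper and reducer classes are omitted for brevity -->
    <action name="mr-node">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.input.dir</name>
                    <value>${inputDir}</value>
                </property>
                <property>
                    <name>mapred.output.dir</name>
                    <value>${outputDir}</value>
                </property>
            </configuration>
        </map-reduce>
        <ok to="join-jobs"/>
        <error to="fail"/>
    </action>

    <!-- pig action; script.pig is a hypothetical script stored with the application -->
    <action name="pig-node">
        <pig>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <script>script.pig</script>
            <param>INPUT=${inputDir}</param>
        </pig>
        <ok to="join-jobs"/>
        <error to="fail"/>
    </action>

    <!-- join: wait for both forked paths to finish -->
    <join name="join-jobs" to="cleanup"/>

    <!-- hdfs (fs) action: delete a temporary directory -->
    <action name="cleanup">
        <fs>
            <delete path="${nameNode}/tmp/example-wf"/>
        </fs>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <!-- kill: ends the workflow in an error state -->
    <kill name="fail">
        <message>Workflow failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>

    <end name="end"/>
</workflow-app>
```

Every action declares both an ok and an error transition, which is how the graph stays explicit about where a failure leads.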

Oozie Workflow Application:

A workflow application is an HDFS directory containing the following important files:

  • Definition file: workflow.xml
  • Configuration file: config-default.xml
  • App files: lib directory
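
As an illustration of the configuration file, the sketch below shows a config-default.xml that supplies default values for parameters referenced in workflow.xml. The parameter names and all values (host names, ports, paths) are assumptions, not values from a real cluster.

```xml
<!-- Hypothetical config-default.xml: default values for workflow parameters -->
<configuration>
    <property>
        <name>jobTracker</name>
        <value>localhost:8021</value>
    </property>
    <property>
        <name>nameNode</name>
        <value>hdfs://localhost:8020</value>
    </property>
    <property>
        <name>inputDir</name>
        <value>/user/example/input</value>
    </property>
    <property>
        <name>outputDir</name>
        <value>/user/example/output</value>
    </property>
</configuration>
```

Properties supplied when the job is submitted take precedence over these defaults. Both XML files sit at the top level of the application directory, and any JAR files placed in the lib directory are added to the classpath of the workflow's actions.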

You can download and try different Apache Oozie examples from the link below:
Apache Oozie

© 2015, www.techkatak.com. All rights reserved.