Workflow Basics

Learn about linking data engineering and data analytics tasks into scalable and reliable pipelines.

What's a Workflow?

A workflow is a set of tasks executed in an order determined by their dependencies to accomplish a larger goal. These tutorials focus on workflow orchestration tools for data engineering and analytics that automate, schedule, and manage data processing pipelines, from simple ETL jobs to complex machine learning tasks.
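To make the definition concrete, here is a minimal sketch of a workflow as tasks plus dependencies, using only the Python standard library (`graphlib`, Python 3.9+). It is not how any of the tools covered in this repository are implemented; the `run_workflow` helper and the three toy tasks are hypothetical names chosen for illustration.

```python
from graphlib import TopologicalSorter


def run_workflow(tasks, dependencies):
    """Run every task after its upstream tasks; return (order, results).

    tasks: maps a task name to a callable taking a dict of upstream results.
    dependencies: maps a task name to the set of task names it depends on.
    """
    order = []
    results = {}
    # static_order() yields each task only after all of its predecessors.
    for name in TopologicalSorter(dependencies).static_order():
        upstream = {dep: results[dep] for dep in dependencies.get(name, ())}
        results[name] = tasks[name](upstream)
        order.append(name)
    return order, results


# Hypothetical three-step ETL pipeline: extract -> transform -> load.
tasks = {
    "extract": lambda up: [1, 2, 3],
    "transform": lambda up: [x * 10 for x in up["extract"]],
    "load": lambda up: sum(up["transform"]),
}
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}

order, results = run_workflow(tasks, dependencies)
print(order)            # ['extract', 'transform', 'load']
print(results["load"])  # 60
```

Orchestration tools build on this same idea and add what the sketch leaves out, such as scheduling, retries, monitoring, and persistence of intermediate results.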

We distinguish:

  • Workflows related to your office productivity tools (not covered here)
  • Workflows for DevOps (CI/CD)
  • Workflows for Data Engineering and Data Analytics

Getting Started

Clone this repository and explore the docs:

git clone https://github.com/UVADS/workflow-basics.git
cd workflow-basics

Contents

  • Introduction
    • Orchestration
    • Choosing a tool
    • Monitoring
    • Deployment
    • Persistence
    • Resilience
    • Reproducibility
    • Portability & Sharing
  • Quickstart
    • Airflow
    • Nextflow
    • Snakemake
    • Prefect
    • Dagster
    • Targets
  • Examples
    • Airflow Examples
    • Nextflow Examples
    • Prefect Examples
    • Targets Examples

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

Distributed under the MIT License.