Pipeline Basics

Move data from point A to point B (or "source to target" if you want to be fancy)

What's a data pipeline?

Before your data can be shared with the public, it has to be moved from wherever it lives (your database, system, app, spreadsheet, etc.) onto our platform. We may also have to change columns or values before publishing it. This process of moving and cleaning data is called a "data pipeline".
It is often called ETL, after its three steps:
  • Extract data from its source
  • Transform the data (for example, rename columns or clean up values)
  • Load the data into its target destination
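
The three steps above can be sketched in a few lines of Python. This is only an illustration, not how our platform works internally: the sample data, the table name library_visits, and the function names are all made up for the example, and an in-memory SQLite database stands in for the target.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for a spreadsheet export.
SOURCE_CSV = """name,visits
 Central Library ,1200
East Branch,
West Branch,845
"""


def extract(text):
    """Extract: read rows from the source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim whitespace, convert types, drop incomplete rows."""
    cleaned = []
    for row in rows:
        name = row["name"].strip()
        visits = row["visits"].strip()
        if not name or not visits:
            continue  # skip rows with missing values
        cleaned.append((name, int(visits)))
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into the target (here, a SQLite table)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS library_visits (name TEXT, visits INTEGER)"
    )
    conn.executemany("INSERT INTO library_visits VALUES (?, ?)", rows)
    conn.commit()


def run_pipeline(text, conn):
    """Run the three steps in order: extract, then transform, then load."""
    load(transform(extract(text)), conn)


conn = sqlite3.connect(":memory:")  # stand-in target for this sketch
run_pipeline(SOURCE_CSV, conn)
print(conn.execute("SELECT name, visits FROM library_visits ORDER BY name").fetchall())
# prints [('Central Library', 1200), ('West Branch', 845)]
```

Keeping each step in its own function means the source or the target can change (say, a database instead of a spreadsheet export) without touching the cleaning logic.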

Which pipeline is best for me?

The first question to ask yourself is: how often will this dataset update?
  • If the answer is Yearly or Never, you could consider manual publishing
  • If it updates more often than Quarterly, the process should be automated
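
The rule of thumb above can also be written as a small function. This is a rough sketch: the frequency labels and the updates-per-year numbers are assumptions made for the example, and the function name suggest_pipeline is hypothetical. Frequencies that fall between the two bullets are reported as "not specified" rather than guessed.

```python
# Approximate updates per year for each label (illustrative values only).
UPDATES_PER_YEAR = {
    "never": 0,
    "yearly": 1,
    "quarterly": 4,
    "monthly": 12,
    "weekly": 52,
    "daily": 365,
}


def suggest_pipeline(frequency):
    """Map an update frequency label to the kind of pipeline suggested above."""
    per_year = UPDATES_PER_YEAR[frequency.lower()]
    if per_year <= 1:
        return "manual"      # Yearly or Never
    if per_year > UPDATES_PER_YEAR["quarterly"]:
        return "automated"   # more often than Quarterly
    return "not specified"   # the guidance above does not cover this range


for label in ("Never", "Yearly", "Quarterly", "Monthly"):
    print(label, "->", suggest_pipeline(label))
```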
The next two pages cover manual and automated data pipelines.