DataSF | Publishing Process

Pipeline Basics

Move data from Point A to B (or "source to target" if you want to be fancy)


Last updated 11 months ago

What's a data pipeline?

To share your data with the public, we will need to move it from wherever it lives (your database, system, app, spreadsheet, etc.) onto our platform. We may also have to change columns or values before publishing it. The process of moving and cleaning data is a 'data pipeline'.

This process is often called ETL (Extract, Transform, Load) because the steps are:

  • Extract data from its source

  • Perform Transformations on the data

  • Load the data into its target destination
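
The three steps above can be sketched in a few lines of Python. This is only an illustration, not how our platform works: the function names, the sample CSV text, and the plain list standing in for the target are all assumptions made for the example.

```python
import csv
import io


def extract(source_text):
    """Extract: read rows from the source (here, CSV text) as dictionaries."""
    return list(csv.DictReader(io.StringIO(source_text)))


def transform(rows):
    """Transform: tidy column names and trim stray whitespace from values."""
    cleaned = []
    for row in rows:
        cleaned.append({
            key.strip().lower().replace(" ", "_"): value.strip()
            for key, value in row.items()
        })
    return cleaned


def load(rows, target):
    """Load: write the cleaned rows to the target (here, a plain list)."""
    target.extend(rows)
    return len(rows)


# Hypothetical sample data standing in for a department's export.
SAMPLE_SOURCE = "Permit ID, Street Name \n101, Market St \n102, Mission St \n"

published = []  # stand-in for the publishing platform
loaded_count = load(transform(extract(SAMPLE_SOURCE)), published)
```

In a real pipeline, the extract step would read from your database or system and the load step would write to the publishing platform, but the order of the steps stays the same.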

Which pipeline is best for me?

The first question to ask yourself is, how often will this dataset update?

  • If the answer is Yearly or Never, you could consider manual publishing.

  • If the answer is more frequent than Quarterly, the process should be automated.

The next two pages cover manual and automated data pipelines.
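
As a rough sketch, the rule of thumb above could be written as a small Python helper. The function name, the frequency labels, and the fallback message are hypothetical. Frequencies the guidance does not mention (such as quarterly) are deliberately left undecided rather than guessed.

```python
# Labels are assumptions for illustration; adjust them to your own terms.
MANUAL_FREQUENCIES = {"never", "yearly"}
AUTOMATED_FREQUENCIES = {"monthly", "weekly", "daily", "hourly"}  # more frequent than quarterly


def suggest_pipeline(update_frequency):
    """Suggest a pipeline type from how often the dataset updates."""
    frequency = update_frequency.strip().lower()
    if frequency in MANUAL_FREQUENCIES:
        return "manual"
    if frequency in AUTOMATED_FREQUENCIES:
        return "automated"
    # The guidance above does not cover this case, so no suggestion is made.
    return "not covered"
```

For example, `suggest_pipeline("Yearly")` suggests a manual pipeline, while `suggest_pipeline("daily")` suggests an automated one.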