Column Headers & Order

Column Headers

  • Use only alphanumeric characters or these three special characters: period (.), dash (-), and underscore (_)

    • Ampersand (&) should be replaced by “and” if needed

  • Each header must be unique

    • For example, a dataset can’t have two headers called "duration"

  • Units of measure should be omitted

    • Units can and should be provided with the data dictionary

  • Keep headers short (fewer than 30 characters)

    • A full description can and should be provided with the data dictionary

Example: date_received, applicants_address, supervisor_district
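
The header rules above can be checked automatically before a dataset is published. The following is a minimal Python sketch, not an official tool: the function name check_headers and the problem messages are hypothetical, "alphanumeric" is assumed to mean ASCII letters and digits, and uniqueness is assumed to be case-sensitive (so "duration" and "Duration" would count as different headers).

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits.
# Allowed special characters: period, underscore, and dash.
ALLOWED = re.compile(r"[A-Za-z0-9._-]+")
MAX_LENGTH = 30  # headers should be shorter than this


def check_headers(headers):
    """Return a list of (header, problem) pairs; an empty list means no problems were found."""
    problems = []
    seen = set()
    for header in headers:
        if not ALLOWED.fullmatch(header):
            problems.append((header, "empty or contains disallowed characters"))
        if len(header) >= MAX_LENGTH:
            problems.append((header, "30 characters or longer"))
        if header in seen:  # assumption: exact, case-sensitive comparison
            problems.append((header, "duplicate"))
        seen.add(header)
    return problems


# The example headers pass; a repeated header and an ampersand are flagged.
print(check_headers(["date_received", "applicants_address", "supervisor_district"]))
print(check_headers(["duration", "duration", "R&D_budget"]))
```

Units of measure are not checked here, since detecting them reliably would need a list of unit names that this guide does not provide.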

Column Order

  • Unique identifiers should be in the left-most column if applicable

  • Date and time variables should be in the first column for time series data

  • Fixed or classified variables should be ordered with the highest-level variable on the left and most granular variable on the right

  • Observed variables should always be in the rightmost columns. These are measured variables, often numeric, for example:

    • Duration

    • Number of Units

    • Number of Stories

    • Year Built

    • People Served
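
The ordering rules above can be sketched as a small sort. This is an illustration under stated assumptions, not a prescribed method: the role labels, the function name order_columns, and the headers permit_id, state, county, and city are hypothetical. It also assumes that when a dataset has both a unique identifier and a date column, the identifier comes first, which the guidance above does not spell out.

```python
# Assumption: each column is tagged with one of these four hypothetical roles.
# Lower rank means further left.
ROLE_RANK = {"identifier": 0, "datetime": 1, "fixed": 2, "observed": 3}


def order_columns(roles):
    """Order headers by role.

    `roles` maps header -> role. Fixed variables are expected to be listed
    from highest-level to most granular already; because sorted() is stable
    and dicts keep insertion order, that relative order is preserved.
    """
    return sorted(roles, key=lambda header: ROLE_RANK[roles[header]])


columns = {
    "duration": "observed",
    "state": "fixed",           # hypothetical header
    "date_received": "datetime",
    "county": "fixed",          # hypothetical header
    "permit_id": "identifier",  # hypothetical header
    "city": "fixed",            # hypothetical header
    "number_of_units": "observed",
}
print(order_columns(columns))
```

The sort only groups columns by role. Deciding which fixed variable is the highest level still has to be done by whoever prepares the data.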
