Column Headers & Order
Column Headers
Use only alphanumeric characters or these three special characters: period (.), dash (-), and underscore (_)
Ampersand (&) should be replaced by “and” if needed
Each header must be unique
For example, a dataset can’t have two headers called "duration"
Units of measure should be omitted
Units can and should be provided with the data dictionary
Keep headers short (fewer than 30 characters)
A full description can and should be provided with the data dictionary
Example: date_received, applicants_address, supervisor_district
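The mechanical rules above (allowed characters, uniqueness, length) can be checked automatically. The following is a minimal sketch, not an official validator: the function name check_headers and the problem labels are hypothetical, and the rule about omitting units of measure still needs a human review.

```python
import re
from collections import Counter

# Allowed characters: letters, digits, period, dash, underscore.
ALLOWED = re.compile(r"[A-Za-z0-9._-]+")
MAX_LENGTH = 30  # headers should be shorter than this


def check_headers(headers):
    """Return a list of (header, problem) pairs; empty when all rules pass."""
    problems = []
    counts = Counter(headers)
    for header in headers:
        if not ALLOWED.fullmatch(header):
            problems.append((header, "empty or disallowed character"))
        if len(header) >= MAX_LENGTH:
            problems.append((header, "30 characters or longer"))
        if counts[header] > 1:
            problems.append((header, "duplicate"))
    return problems


good = check_headers(["date_received", "applicants_address", "supervisor_district"])
bad = check_headers(["duration", "duration", "R&D"])
```

Here good is an empty list, while bad flags both copies of "duration" as duplicates and "R&D" for the ampersand.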
Column Order
Unique identifiers should be in the left-most column if applicable
Date and time variables should be in the first column for time series data
Fixed or classified variables should be ordered with the highest-level variable on the left and the most granular variable on the right, for example:
311 cases: service_name, service_subtype, service_details
Police incidents: category, descript
Observed variables should always be in the rightmost columns. These are measured variables, often numeric, for example:
Duration
Number of Units
Number of Stories
Year Built
People Served
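The ordering rules above can be sketched as a small helper that concatenates column groups from left to right. This is only an illustration under stated assumptions: the function order_columns and the case_id column are hypothetical, the caller decides which group each column belongs to, and placing identifiers ahead of date and time columns is one reading of the guidance rather than a stated rule.

```python
def order_columns(identifiers=(), date_time=(), classified=(), observed=()):
    """Concatenate column groups left to right in the recommended order.

    `classified` is expected to be listed already from the highest-level
    variable to the most granular one.
    """
    ordered = [*identifiers, *date_time, *classified, *observed]
    if len(set(ordered)) != len(ordered):
        raise ValueError("a column appears in more than one group")
    return ordered


columns = order_columns(
    identifiers=["case_id"],  # hypothetical identifier column
    date_time=["date_received"],
    classified=["service_name", "service_subtype", "service_details"],
    observed=["duration"],
)
```

With these inputs, columns starts with case_id and date_received, continues through the service hierarchy, and ends with the observed duration column.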