Intro to Metadata

Resources to help people use your data

What is Metadata?

Metadata is "data that provides information about other data." Put simply, metadata is the information you include with your dataset when you publish it. It should help people understand how the dataset was made and how to use it.

See an example of a completed metadata page on the Open Data Portal here.

Overview of Metadata

When you create a new dataset in the Open Data Portal, you will be prompted to include metadata. Some fields are required, while others are optional. The more information you include, the easier it will be for others to use your dataset.

Use our Metadata Template to get started.

The next page contains our metadata standards for each field, but at a high level there are two types of metadata you should configure:

Dataset Metadata

Dataset metadata is information about the dataset overall: how the dataset was created, how the data should be used, and what each row represents. Essentially, it is a primer for anyone interested in using the data.

Dataset metadata can be added in the Open Data Portal, on a draft page, by clicking the "Edit Dataset Metadata" section (see screenshot below).

The "Description" section of the dataset metadata should follow this template:

<strong>A. SUMMARY</strong>
[What is this dataset?  What purpose does it serve? 
Are there relevant links to your department's website? 
Try to make this description understandable by the general public.]

<strong>B. HOW THE DATASET IS CREATED</strong>
[Describe the business process that generates this data.]

<strong>C. UPDATE PROCESS</strong>
[What is the process for updating this dataset and how often will it happen?]

<strong>D. HOW TO USE THIS DATASET</strong>
[Describe what an analyst should know about this dataset to use it appropriately,
 e.g. what to watch out for, what to filter, etc.]

<strong>E. RELATED DATASETS</strong>
[Please list any closely related datasets here with hyperlinks to the datasets.
 See below for hyperlink formatting]

Hyperlink formatting:
<u><a href="https://tinyurl.com/7fruttpx">Whatever text you want</a></u>
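
As a rough illustration of how the template fits together, the Python sketch below assembles a description from the five section headings and formats a hyperlink in the style shown above. The helper names `build_description` and `hyperlink` are hypothetical and are not part of the Open Data Portal; the sketch only produces text that could be pasted into the "Description" section.

```python
from html import escape

# Section headings copied from the template, in order.
SECTIONS = [
    "A. SUMMARY",
    "B. HOW THE DATASET IS CREATED",
    "C. UPDATE PROCESS",
    "D. HOW TO USE THIS DATASET",
    "E. RELATED DATASETS",
]


def hyperlink(url, text):
    """Format a link the way the template's hyperlink example does."""
    return f'<u><a href="{escape(url, quote=True)}">{escape(text)}</a></u>'


def build_description(answers):
    """Join one answer per section into a single description string.

    `answers` maps each heading in SECTIONS to the text for that section.
    Raises ValueError if any section is missing or blank (an assumption of
    this sketch: every section of the template should be filled in).
    """
    missing = [s for s in SECTIONS if not answers.get(s, "").strip()]
    if missing:
        raise ValueError("Sections not filled in: " + ", ".join(missing))
    blocks = [f"<strong>{s}</strong>\n{answers[s].strip()}" for s in SECTIONS]
    return "\n\n".join(blocks)
```

Calling `build_description` with a dictionary keyed by the five headings returns the sections in template order, separated by blank lines. Text for the related datasets section can include links produced by `hyperlink`.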

Column Metadata

Column metadata is information about each individual column in the dataset. Each column should have a short description of what the column represents and how to use it. This metadata can be filled out by clicking on the "Edit Column Metadata" field in the Open Data Portal.

Column metadata is especially important if the columns have abbreviated names, such as "dt" instead of "date", because it ensures there is a human-readable version of the variable. If possible, also link to any additional documentation on how the variable was created. Below is how metadata will be rendered in the Open Data Portal.

There are three required columns in every dataset:

  • data_as_of

  • data_loaded_at

  • a primary key
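
Before publishing, the three required columns can be checked with a short script. The Python sketch below is a minimal example, assuming the dataset is available as a list of row dictionaries. The function name `check_required_columns` and the example key column `record_id` are hypothetical. The sketch also treats a primary key as a column whose values are present and unique in every row, which is the usual database meaning and an assumption here; see the metadata standards for the actual requirements.

```python
REQUIRED_COLUMNS = ("data_as_of", "data_loaded_at")


def check_required_columns(rows, primary_key):
    """Return a list of problems found; an empty list means the check passed.

    rows: list of dicts, one per dataset row.
    primary_key: name of the column intended as the primary key.
    """
    problems = []

    # Every row must contain the two timestamp columns and the key column.
    for name in (*REQUIRED_COLUMNS, primary_key):
        if any(name not in row for row in rows):
            problems.append(f"missing column: {name}")

    # Assumption: a primary key has no empty and no repeated values.
    if all(primary_key in row for row in rows):
        keys = [row[primary_key] for row in rows]
        if any(k is None or k == "" for k in keys):
            problems.append(f"primary key '{primary_key}' has empty values")
        if len(set(keys)) != len(keys):
            problems.append(f"primary key '{primary_key}' has duplicate values")

    return problems
```

For example, `check_required_columns(rows, "record_id")` returns an empty list when every row has `data_as_of`, `data_loaded_at`, and a distinct `record_id`. Note that an empty list of rows also passes, so the check is only meaningful on a dataset that has data.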

See Metadata Standards to learn more
