Step 1: Collect Needs and Requirements
Follow the steps below to identify stakeholder needs and requirements for the data. These steps may be iterative. For example, you may draft an initial purpose, but after talking to your data users, you may need to update it.
The steps are:
1.1 Define the purpose for the data
1.2 Identify your stakeholders
1.3 Assess stakeholder needs and requirements
1.1 Define the purpose for the data
Provide a brief background and motivation for this dataset. Why do you need to collect it? What business need, problem or process will it address? Update this purpose as you learn more.
1.2 Identify your stakeholders
Your stakeholders are the source of your needs and requirements. Use the lifecycle model to identify them: answer the questions listed for each lifecycle phase, and refer to the roles and responsibilities page for the roles involved.
| Lifecycle Phase | Relevant Roles | Questions |
| --- | --- | --- |
| Plan and Define | Business and Data Stewards | Who identifies objectives and assigns priorities and resources? Who develops processes, business rules and standards? Whose support can help this project succeed or fail? |
| Obtain | Data Producers | Who acquires information from sources? Who enters new data and creates records in the system? |
| Maintain | Data Steward and Custodian | Who decides what should be updated? Who makes actual changes in the system? Who needs to know about these changes? Who supports the storing technology? |
| Access and Use | Data Users | Who directly accesses the data? What do web analytics for the relevant web pages tell you about who is accessing and how? Who (within your government) uses the information? What business processes rely on this data? Who (at other agencies) uses the information? Who (in the business sector) uses the information? Who (in the media, non-profit & public interest sector) uses the information? Who has requested or downloaded the data? What other programs or agencies will use this data? |
| Dispose | Data Steward and Custodian | Who sets the retention policy? Who archives the data? Who needs to be informed? |
Adapted from POSMAD interaction matrix detail, copyright 2005-2008 Danette McGilvray, Granite Falls Consulting, and Data Quality Guide for Governments by Stephanie Singer.
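One way to keep track of the answers is a simple stakeholder register that records which lifecycle phases each stakeholder is involved in, so that phases with nobody identified stand out. The following is a minimal sketch, not a prescribed tool: the `Stakeholder` class, the `phases_without_stakeholders` helper and the example entries are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass, field

# Lifecycle phases as named in the stakeholder table above.
PHASES = ["Plan and Define", "Obtain", "Maintain", "Access and Use", "Dispose"]


@dataclass
class Stakeholder:
    name: str   # hypothetical person or group
    role: str   # e.g. "Data Steward", "Data Producer"
    phases: list[str] = field(default_factory=list)


def phases_without_stakeholders(register: list[Stakeholder]) -> list[str]:
    """Return the lifecycle phases that no stakeholder in the register covers."""
    covered = {phase for s in register for phase in s.phases}
    return [phase for phase in PHASES if phase not in covered]


# Hypothetical example entries, for illustration only.
register = [
    Stakeholder("Permit office manager", "Business Steward", ["Plan and Define"]),
    Stakeholder("Counter staff", "Data Producer", ["Obtain"]),
    Stakeholder("IT database team", "Data Custodian", ["Maintain", "Dispose"]),
]

gaps = phases_without_stakeholders(register)
print(gaps)  # ['Access and Use']
```

In this made-up register nobody has been identified for the Access and Use phase, which suggests returning to the Data Users questions before moving on.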
1.3 Assess stakeholder needs and requirements
New data collection is an opportunity to add value for each of your stakeholders. A variety of tools exist for working with stakeholders and doing user-centered design. This guide does not try to be exhaustive, but here are some tips to get you started.
For each lifecycle phase, work to answer the questions below.
| Lifecycle Phase | Topic | Questions to ask |
| --- | --- | --- |
| Plan and Define | Define organization need and value | Why do we need to obtain this data? What issues could this solve? Is it a new requirement or a new business process? If so, are there existing requirements? Or is this meant to improve an existing process or service? Why does it need to be improved? How does that relate to the data we obtain? |
| Plan and Define | Identify existing requirements | Are some of our requirements already determined, for example by ordinance, law or executive order? What laws or regulations will govern this data? |
| Obtain | Understand those responsible for obtaining the data | Who will be obtaining the data? Will it be a limited number of people or a broader group? How much turnover is there? What is the working environment like for those collecting our data? How do they typically obtain data? Do they work under pressure or in difficult situations? What does that consist of? How will this impact their ability to collect quality data? What tools and systems are they using? What is their level of comfort with data? What do they understand about the needs and uses for the data? How much do they value the data? What are their other priorities? |
| Obtain | Plan for collection | How should they create or collect the data? What quality controls should be in place at collection? How frequently should they obtain the data? How will you maintain motivation to collect quality data? For example, can you provide regular examples of uses of the data? If the data is coming from somewhere else, e.g. a data warehouse or integration, how will it be changed to support your needs? |
| Maintain | Storing and archiving the data | What technology and tools do we have? Where should the data live technically? What security controls are required? Does the dataset contain personally identifiable information (PII)? How is the data backed up? How frequently should it be backed up? |
| Maintain | Change management and versioning | How are changes made to the dataset? How do you track changes made over time? Will you need to know what the data showed or contained at points in the past? How will we track and manage versions? |
| Maintain | Quality of the data | What quality controls should be used and how frequently? What data profiling tools do we need? How will we measure data quality on an ongoing basis? |
| Access and Use | Access | Who is able to access the data for editing? For viewing? What legal requirements apply? What are the risks if the data is inappropriately accessed and used? How will the data be made available to the rest of the enterprise? Is the data suitable for publication? If not, how can the data be modified to be suitable for publication? See datasf.org/publishing for help. |
| Access and Use | Use | How do users expect to use the data? What types of analysis will they do? What research questions do they have? What reports or dashboard requirements do they have? What business processes rely on the data? What are the requirements of the data to support those business processes? How could the data improve their business process? What business rules do they expect to apply to the data? How will user feedback on data quality be solicited and incorporated? |
| Dispose | | How long should previous versions of the data be retained before disposal? How long should the data be held when it is no longer in use? Is the historical data of value? What are the risks and benefits of keeping the historical data? What laws or data retention policies apply? |
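To make the Maintain question about measuring data quality on an ongoing basis more concrete, the sketch below computes one possible measure: completeness, the share of records with a non-empty value in each required field. It is only an illustration under stated assumptions. The field names, the sample rows and the 0.9 target are hypothetical, and stakeholders may care about other measures, such as accuracy or timeliness, that this does not cover.

```python
def completeness(rows: list[dict], required_fields: list[str]) -> dict[str, float]:
    """Return, per required field, the share of rows with a non-empty value."""
    total = len(rows)
    scores = {}
    for name in required_fields:
        # Treat missing keys, None and empty strings as "not filled in".
        filled = sum(1 for row in rows if row.get(name) not in (None, ""))
        scores[name] = filled / total if total else 0.0
    return scores


# Hypothetical sample records, for illustration only.
rows = [
    {"permit_id": "A1", "issued_date": "2020-01-05"},
    {"permit_id": "A2", "issued_date": ""},
    {"permit_id": "", "issued_date": "2020-02-11"},
    {"permit_id": "A4", "issued_date": None},
]

TARGET = 0.9  # hypothetical target agreed with stakeholders

scores = completeness(rows, ["permit_id", "issued_date"])
below_target = [name for name, score in scores.items() if score < TARGET]

print(scores)        # {'permit_id': 0.75, 'issued_date': 0.5}
print(below_target)  # ['permit_id', 'issued_date']
```

Running a check like this on a schedule, and sharing the results with the people who collect the data, is one way to connect the quality questions under Maintain with the motivation questions under Obtain.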