Police Department Stop Data

Published 9/15/23

Overview

The Racial and Identity Profiling Act (RIPA), or California Assembly Bill (AB) 953, was signed into law in October 2015. The bill requires law enforcement agencies to collect specific information on each stop, including elements of the stop, its circumstances, and the perceived identity characteristics of the individual(s) stopped. The information obtained by officers is reported to the California Department of Justice. Collection of these data began on July 1, 2018, and is ongoing.

How can this dataset be best used?

This dataset includes information about police stops, including stop date, time, duration, and general location, as well as some details about the person(s) stopped, why the person was stopped, and what happened during the stop. Each row is a person stopped, with a record identifier for the stop and a unique identifier for the person. A single stop may involve multiple people, in which case one record identifier is associated with more than one unique identifier (e.g., a vehicle stop with a driver and three passengers produces four rows that share one record identifier).
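
As a minimal sketch of this row structure, the Python snippet below counts stops and persons separately. The column names `stop_id` and `person_id` and the sample values are hypothetical stand-ins for the dataset's record identifier and unique identifier fields; check the published field definitions for the actual names.

```python
from collections import Counter

# Hypothetical rows: one row per person stopped. "stop_id" stands in for the
# record identifier and "person_id" for the unique person identifier.
rows = [
    {"stop_id": "A100", "person_id": "A100-1"},  # driver
    {"stop_id": "A100", "person_id": "A100-2"},  # passenger
    {"stop_id": "A100", "person_id": "A100-3"},  # passenger
    {"stop_id": "A100", "person_id": "A100-4"},  # passenger
    {"stop_id": "B200", "person_id": "B200-1"},  # single pedestrian stop
]


def persons_per_stop(rows):
    """Return a Counter mapping each stop identifier to its number of persons."""
    return Counter(row["stop_id"] for row in rows)


counts = persons_per_stop(rows)
total_persons = len(rows)   # every row is one person
total_stops = len(counts)   # distinct record identifiers
```

Counting rows gives the number of persons stopped, while counting distinct record identifiers gives the number of stops. Treating the row count as a stop count would overstate stops whenever a stop involves more than one person.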

The information collected by officers can be used to help understand the number, location, reason for and outcomes of stops conducted by the San Francisco Police Department (SFPD). Geographic information is anonymized and provided to help understand stops conducted across neighborhoods, police and supervisorial districts. Provision of date information allows for analysis of data trends over time. Reasons for stop, search, and seizure, along with outcomes of any stop, search, or seizure, allow for additional analysis against time, geographic and perceived demographic data.

What data is collected?

The department’s stop dataset provides counts of stops by type, date, time, and location. Department officers collect detailed perceived demographic information during pedestrian and traffic stops as required by the Racial and Identity Profiling Act (RIPA) of 2015 and associated stop data regulations. The stop data regulations guide data collection and define the elements included in the dataset.

Dataset users should review the RIPA stop data categories of information that an officer must collect for each stop to assist with analysis. It is important to note that an officer’s perception of a person’s demographic identifiers (race/ethnicity, gender, LGBTQ, etc.) may differ from how the person self-identifies, because officers are prohibited from asking the person stopped to self-identify their characteristics.

What transformations have been applied to this dataset?

The SFPD applies several transformations to this data to ensure privacy, accuracy, and compliance with State law and regulation. These transformations occur after data submission and include geocoding, PII cleaning, and transformations and joins for ease of use and readability.

Geocoding

By the end of each shift, officers enter all stop data into a Stop Data Collection System. The address fields are manually cleaned for mapping suitability, and addresses are geocoded to the nearest intersection. In place of the uncoded free text that makes up the location in raw stops data, the Department has geocoded and added Location and District fields to this dataset. The Location field provides an anonymized location of the stop, while the District field provides the police district of the stop location. Users are reminded that geographic information is anonymized and provided to help understand stops conducted across neighborhoods, police and supervisorial districts.

A certain percentage of stops have location information that can’t be geocoded. This may be due to errors in data input at the officer level (typos in entry or providing an address that doesn't exist). More often, it is due to officers providing a level of detail that can't be coded to a geographic coordinate, most often at the Airport (e.g., Terminal 3, door 22). In these cases, the location of the stop is coded as unknown.
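
Before mapping or comparing districts, it can help to measure how many rows lack a usable location. The sketch below is one way to do that in Python. The field name `district` and the marker string "unknown" are assumptions for illustration, not confirmed values from the dataset.

```python
def split_by_geocoding(rows, district_field="district", unknown_values=("unknown",)):
    """Split rows into (geocoded, ungeocoded) based on the district value.

    The field name and the "unknown" marker are assumptions. The comparison is
    case-insensitive and treats missing or empty values as ungeocoded.
    """
    geocoded, ungeocoded = [], []
    for row in rows:
        value = (row.get(district_field) or "").strip().lower()
        if value == "" or value in unknown_values:
            ungeocoded.append(row)
        else:
            geocoded.append(row)
    return geocoded, ungeocoded


# Hypothetical sample rows, one person per stop for simplicity.
rows = [
    {"stop_id": "A100", "district": "Mission"},
    {"stop_id": "B200", "district": "Unknown"},
    {"stop_id": "C300", "district": "Central"},
    {"stop_id": "D400", "district": None},
]
geocoded, ungeocoded = split_by_geocoding(rows)
share_unknown = len(ungeocoded) / len(rows)
```

Because each row is a person rather than a stop, a stop-level share would first need the rows reduced to one per record identifier.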

Personally Identifiable Information

After the geocoding process is complete, the stops data file is formatted, and personally identifiable information is redacted from the narrative fields. Each narrative field is reviewed for names, addresses and juvenile terms.

Readability and ease of use

Readability and ease-of-use transformations have also been applied to make the data more useful to end users. These changes include:

  • The dataset itself is 'de-pivoted' and 're-pivoted' to create a dataset that contains a single stopped individual per row. This is unlike the state dataset, in which a single row represents a stop and contains all individuals stopped.

  • The addition of a person number, unique id, perceived age group, neighborhood, CJIS offense code text, statute and type, traffic violation code text, statute and type, custodial arrest code text, statute and type, lat, long and point geocoding, data load time, data as of, and supervisor district columns.

  • Fields in which more than one data element can be entered are delimited with a "|".

  • The conversion of numeric codes to English text in as many columns as practical, including perceived race/ethnicity, perceived gender, reason for stop, CJIS codes associated with stops, searches, and arrests, and others.

  • Numerical coding is included in the data with columns ending in _code.

  • The Agency name field was removed because all stops listed were conducted by SFPD officers.


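The de-pivoting step in the list above can be pictured with a short Python sketch that turns stop-level records into person-level rows. This is an illustration only, not the SFPD's actual process: the nested `persons` layout, the field names, the sample values, and the `stop_id-number` format of `unique_id` are all hypothetical.

```python
def one_row_per_person(stop_records):
    """Flatten stop-level records into person-level rows."""
    person_rows = []
    for stop in stop_records:
        # Stop-level fields are repeated on every person row.
        shared = {key: value for key, value in stop.items() if key != "persons"}
        for number, person in enumerate(stop["persons"], start=1):
            row = dict(shared)
            row["person_number"] = number
            # Hypothetical identifier format, used here only for illustration.
            row["unique_id"] = f"{stop['stop_id']}-{number}"
            row.update(person)
            person_rows.append(row)
    return person_rows


# Hypothetical state-style input: one record per stop, persons nested inside.
stop_records = [
    {
        "stop_id": "A100",
        "stop_date": "2023-01-05",
        "persons": [
            {"perceived_age_group": "25-34"},
            {"perceived_age_group": "18-24"},
        ],
    },
    {
        "stop_id": "B200",
        "stop_date": "2023-01-06",
        "persons": [{"perceived_age_group": "35-44"}],
    },
]
rows = one_row_per_person(stop_records)
```

Repeating the stop-level fields on each person row keeps every row self-describing, at the cost of some duplication.
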
Other information

The SFPD seeks to mirror the CA DOJ’s release standards for RIPA data. The Reason for Stop Narrative, the Basis for Stop Narrative, and personally identifiable information of officers are not published in the CA DOJ RIPA data and are also not included in this dataset.

The Agency name and the K-12 School name are fields published in the CA DOJ’s RIPA report but are not provided in this dataset. The K-12 School name is a field populated by the K-12 School Code field. Users may search the CA Department of Education’s School Directory for a corresponding California school by using the K-12 School Code in the dataset.

What is not captured in this dataset?

  • Most, but not all, stops or individuals stopped are captured in this data. Various exceptions to data entry are outlined in the RIPA regulations.

  • Demographic information pertaining to individual identity is based on officer perception rather than self-identification, and includes race/ethnicity, gender, age, LGBT identity, English fluency, and disability.

  • This dataset does not include any personally identifiable information of officers conducting stops or of suspects stopped.

  • The order in which actions are taken is not specifically captured. When multiple reasons for stop, search, etc. are selected, these data do not indicate whether all reasons were present at the outset of the stop or were developed over the course of the stop.

  • This dataset does not capture stops by other law enforcement agencies within San Francisco (for example, BART PD or US Park Police).

What privacy controls is this dataset subject to?

The release of this data must balance the need for disclosure to the public against the risk of violating the privacy of those individuals present within the dataset. All incident locations are shown at the intersection or 100 block level only. Some stops occur outside SFPD districts. These will be marked as “Out of SF” or “Unknown”. Per AB 953, law enforcement agencies are prohibited from reporting unique identifying information on persons stopped. Per GOV 12525.5, officer badge numbers and other Unique Identifying Information are withheld from the dataset.

Juvenile Data

De-identified stop data on juveniles is included in this dataset, consistent with CA DOJ RIPA data releases.

How is a “stop” defined?

The SFPD uses the RIPA program definitions under AB 953, the Racial and Identity Profiling Act of 2015. A ‘stop’ is defined as 1) any detention, as defined in regulations, by a peace officer of a person or 2) any peace officer interaction with a person in which the officer conducts a search as defined in regulation. Stops include Traffic Stops and Pedestrian Detentions, and may be Self-Initiated or Dispatched. There are some exceptions to this definition, as noted in the regulations.

Can a stop have more than one person?

Yes. During a single encounter, multiple persons may be stopped. When multiple persons are stopped during one encounter, relevant information will be submitted for each person within the single report only under certain circumstances. For vehicle stops, for example, one of the following conditions must be met for a passenger to be entered: (A) the passenger is observed or suspected of violating the Vehicle Code or any other applicable law or ordinance; or (B) excluding “Vehicle Impound” and “None”, the passenger is subjected to any of the data value actions identified in section 999.226, subdivision (a)(12)(A), “Actions Taken by Officer During Stop.”
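
The two passenger conditions can be read as a simple either/or test. The Python sketch below is a simplified reading of the paragraph above, not a restatement of the regulations: the function name, its parameters, and the example action label are hypothetical, and only "Vehicle Impound" and "None" come from the text.

```python
# Data values that do not, on their own, make a passenger reportable under
# condition (B), per the paragraph above.
EXCLUDED_ACTIONS = {"Vehicle Impound", "None"}


def passenger_is_reportable(suspected_violation, actions_taken):
    """Return True if a vehicle passenger would be entered under (A) or (B)."""
    # (A): the passenger is observed or suspected of violating a law.
    if suspected_violation:
        return True
    # (B): the passenger is subjected to any action other than the excluded ones.
    return any(action not in EXCLUDED_ACTIONS for action in actions_taken)


# Hypothetical examples; "Search of person" is an illustrative action label.
no_action = passenger_is_reportable(False, ["None"])
impound_only = passenger_is_reportable(False, ["Vehicle Impound"])
searched = passenger_is_reportable(False, ["Search of person"])
suspected = passenger_is_reportable(True, [])
```

Under this reading, a passenger who is neither suspected of a violation nor subjected to a qualifying action produces no row, which is one reason person counts for vehicle stops can be lower than the number of occupants.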

What are CJIS offense codes?

Criminal Justice Information System (CJIS) offense codes form a table of all state-level offenses maintained by the California Department of Justice. These codes are used across multiple criminal justice applications to standardize data entry for some data fields. Copies of the latest CJIS offense code tables can be found on the page "Law Enforcement Code Tables | State of California - Department of Justice - Office of the Attorney General". This dataset uses the RIPA code table posted on that page.

Data field definitions

Where a data field can contain multiple values because officers may select more than one, the values are pipe-delimited, i.e., separated by "|".
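
A minimal Python sketch of handling such fields follows. The sample cell values are hypothetical; only the "|" delimiter comes from the description above.

```python
from collections import Counter


def split_multi_value(cell):
    """Split a pipe-delimited cell into a list of trimmed, non-empty values."""
    if not cell:
        return []
    return [part.strip() for part in cell.split("|") if part.strip()]


# Hypothetical cells from one multi-select column, including empty ones.
cells = ["Consent given|Officer safety", "Officer safety", "", None]

# Tally each individual value across all rows.
tally = Counter(value for cell in cells for value in split_multi_value(cell))
```

Splitting before counting matters because the combined string "Consent given|Officer safety" would otherwise be tallied as its own category instead of as two separate selections.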

Last updated