Dataset Analysis

We chose the five types of police datasets because together they provide context for police accountability.

Of the 169 cities, 127 have no data available. Only 42 cities, about one-quarter, publish any data related to their police force. No city has all five types of data we are looking for.
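The share of cities publishing data follows directly from the two counts above. As a minimal sketch of that arithmetic (the function and constant names are illustrative, not part of the original analysis):

```python
# Counts taken from the text: 169 cities surveyed, 127 with no police data.
TOTAL_CITIES = 169
CITIES_WITHOUT_DATA = 127


def share_publishing(total: int, without_data: int) -> tuple[int, float]:
    """Return the count and percentage of cities publishing any police data."""
    publishing = total - without_data
    return publishing, 100 * publishing / total


count, percent = share_publishing(TOTAL_CITIES, CITIES_WITHOUT_DATA)
print(f"{count} of {TOTAL_CITIES} cities ({percent:.1f}%) publish police data")
# prints "42 of 169 cities (24.9%) publish police data"
```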

Number and percentage of the 169 cities, by how many types of police data each city has.

Use of Force is the most commonly available type of data, which we believe is related to high public attention and strict gun management. Force Demographics is the least commonly available, which we believe is related to privacy regulations and low public attention. Philadelphia has four of the five types of police data, lacking only Force Demographics. New York also has four, lacking only Traffic Stops.

Number and percentage of cities that have each type of police data.

Our results show an even more dire picture of transparency in police data than expected. While we hypothesized that the five core police datasets would not be widely available, we did not expect so many cities to have no police data at all. It is important to note that the scope of this project does not include crime data. Instead, we focus on datasets that specifically describe police force members' behaviors, actions, and demographics.

Our results have also provided useful anecdotal information for our analyses. For example, some cities that we expected to have robust police data because of their general leadership in data governance, such as Portland, Oregon, have little data available. In Portland, police data are hosted on a separate website from the city's other open data and are available in aggregate form, as PDFs.

Other cities, such as Dallas, TX, are more open with their data, publishing every instance of use of force by police officers since 2013. However, Dallas publishes the data only as separate datasets and pages for each year, making it difficult to analyze data across multiple years. This kind of complication is unfortunately not uncommon in open data practices, especially with police data.
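To illustrate the extra step that per-year publication imposes on an analyst, the sketch below merges yearly CSV files into a single list of rows. It assumes each year's dataset has been downloaded as a CSV file; the function name `combine_yearly` and the added `source_year` column are hypothetical, and the real files may use different formats or column names from year to year.

```python
import csv
from typing import Dict, List, TextIO


def combine_yearly(files: Dict[int, TextIO]) -> List[dict]:
    """Merge per-year CSV files into one list of rows.

    `files` maps a year to an open CSV file for that year (an assumed
    layout). Each row is tagged with a hypothetical `source_year` key so
    that records from different years remain distinguishable.
    """
    combined = []
    for year in sorted(files):
        for row in csv.DictReader(files[year]):
            row["source_year"] = year
            combined.append(row)
    return combined
```

If column names change between years, the merged rows will carry different keys, so a real analysis would likely also need a column-mapping step before comparing years.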