Would Your Murder Be Solved?

An interactive dashboard with crime solve rate data

Welcome to our 'true crime' webpage, a data analytics project designed by Lindsay, Laura and Natalie of UC Berkeley Extension.

All crime data displayed on the page relate to California homicide cases and are based on the following data sources from Kaggle.com: U.S. Homicide Reports, 1980-2014 and U.S. Census Demographic Data.

Read the instructions below, then click to navigate to each section of the page.

Tableau Visualizations:
Select any of the five tabs in the scroll bar for visual analyses of murder demographics and solve rates.
The first tab, 'Solve Rate By Victim Demographics & Murder Details', is interactive. Click to select victim demographic and murder weapon options from five of the graphs to see the solve rate displayed for cases that fit those criteria.

Machine Learning Analysis:
A description of the machine learning models trained on the Kaggle datasets, and a visualization of the aspects of the crime that matter most in determining whether a case is solved.

Filter Search Table:
Select any of the drop-down options to filter the table displaying all California homicide cases from 1980-2014. The 'Solve Rate' above the search panel will display the percentage of cases matching the selected criteria that were solved.




Machine Learning Analysis

Using Python’s scikit-learn library, we trained random forest classification and regression models on the combined Kaggle datasets to predict the likelihood of a case being solved based on various aspects of the crime, including victim demographics, perpetrator demographics, murder weapon, and location demographics. The regression model predicts a percent likelihood of the case being solved, whereas the classification model predicts a binary 'solved' or 'unsolved' outcome.
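
To illustrate the difference between the two model types, here is a minimal sketch on synthetic data. It is not the project's actual code: the feature matrix, the labelling rule, and the settings (n_estimators, random_state) are all assumptions made for the example, and fitting the regressor on a 0/1 'solved' label is one plausible way to obtain a likelihood, not necessarily the way the project did it. Real case data would also need its categorical fields (weapon, county, and so on) encoded as numbers first.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

# Synthetic stand-in data (hypothetical): three numeric features per case.
rng = np.random.default_rng(0)
X = rng.random((200, 3))

# Hypothetical labelling rule: cases with a high first feature are "solved" (1).
y = (X[:, 0] + 0.2 * rng.random(200) > 0.6).astype(int)

# The classifier returns a binary label; the regressor, fitted on the same
# 0/1 target, returns a value between 0 and 1 that can be read as a likelihood.
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
reg = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

new_case = np.array([[0.9, 0.5, 0.5]])
label = clf.predict(new_case)[0]
likelihood = reg.predict(new_case)[0]

print("solved" if label == 1 else "unsolved")
print(f"estimated likelihood of being solved: {likelihood:.0%}")
```

Because each tree in the regressor averages 0/1 labels, its output always stays between 0 and 1, which is what makes it readable as a percentage.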

The graph to the right displays a ranking of the most important aspects of the crime in determining the outcome for the classification model. A link to the code used for creating, training and running the machine learning models can be found here.
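
A ranking like this can be read from a fitted scikit-learn random forest through its feature_importances_ attribute. The sketch below uses synthetic data and hypothetical feature names (victim_age, weapon_code, county_code), so the numbers it prints say nothing about the real cases; it only shows how such a ranking could be produced.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical feature names, used only for this illustration.
feature_names = ["victim_age", "weapon_code", "county_code"]

# Synthetic data in which the outcome depends only on the first column.
rng = np.random.default_rng(0)
X = rng.random((300, 3))
y = (X[:, 0] > 0.5).astype(int)

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# feature_importances_ holds one impurity-based score per column.
ranking = sorted(
    zip(feature_names, clf.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)

for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

The scores are normalized to sum to 1, so each one can be read as a share of the model's total importance.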



Random Forest Feature Ranking

Historical Solve Rate Calculator


Solve Rate

Filter Search


Victim Gender:


Victim Age:


Victim Race:


Victim Ethnicity:


Murder Weapon:


County:


Victim Gender | Victim Age | Victim Race | Victim Ethnicity | Murder Weapon | County | Solved?
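
The 'Solve Rate' figure for a set of filter selections comes down to a simple calculation: keep the cases that match every selected option, then take the share of those that were solved. Here is a minimal sketch in plain Python, not the page's own code. The four rows are made-up placeholders rather than real cases, and the field names and the solve_rate helper are hypothetical.

```python
# Made-up placeholder rows (not real cases); field names are hypothetical.
cases = [
    {"victim_gender": "Female", "weapon": "Handgun", "county": "Alameda", "solved": True},
    {"victim_gender": "Male", "weapon": "Knife", "county": "Alameda", "solved": False},
    {"victim_gender": "Male", "weapon": "Handgun", "county": "Los Angeles", "solved": True},
    {"victim_gender": "Female", "weapon": "Knife", "county": "Los Angeles", "solved": True},
]


def solve_rate(rows, **criteria):
    """Return the percentage of rows matching all criteria that were solved.

    Returns None when no row matches, since a rate is undefined in that case.
    """
    matching = [
        row for row in rows
        if all(row[field] == value for field, value in criteria.items())
    ]
    if not matching:
        return None
    return 100 * sum(row["solved"] for row in matching) / len(matching)


print(solve_rate(cases))                    # no filters selected
print(solve_rate(cases, county="Alameda"))  # one filter selected
```

Returning None for an empty selection is a design choice for this sketch; a page could equally display a dash or a 'no matching cases' message.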