ProPublica is a great place for any would-be muckraker to start. Need some inspiration or fire in your belly? Dip into hundreds of ongoing investigations about corrupt doctors and landlords, politicians and lenders to inspire your own reporting and research.
They provide data in two convenient ways: clean datasets often created from multiple sources and APIs that give you access to troves of data.
Occasionally, ProPublica will put out an APB for information about an ongoing investigation. They recently asked readers to send in the names of any new White House staffers whom the administration had hired with no public disclosure.
The Center for Responsive Politics, a nonprofit, nonpartisan research group, runs OpenSecrets, an indisputably indispensable database of information. This is where you go when you want to cherchez l’argent. After the House passed the AHCA, OpenSecrets published all health care industries’ contributions to members of the 115th Congress.
Readers can access data with custom-built APIs or dive into its bulk data repository.
All you have to do is sign up for a My OpenSecrets account.
Other important features:
- The data is provided in compressed CSV text files.
- There is a data dictionary for every file.
- The OpenData User’s Guide includes additional information on how to use and link the data sets, and even has scripts that create all data tables to ease the import into your database software. The User’s Guide also contains all data dictionaries.
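If you would rather peek at a bulk file before importing it into a database, a few lines of Python will do. The sketch below is only an illustration: it assumes the file is gzip-compressed, comma-delimited, and has no header row, and the column names in `EXAMPLE_COLUMNS` are hypothetical stand-ins for whatever the file's data dictionary actually lists. Check the User's Guide for the real layout before relying on it.

```python
import csv
import gzip


def read_compressed_csv(path, column_names):
    """Yield one dict per row of a gzip-compressed CSV file with no header row.

    column_names is the ordered list of fields, copied from the data dictionary.
    """
    with gzip.open(path, mode="rt", newline="", encoding="utf-8") as handle:
        for row in csv.reader(handle):
            yield dict(zip(column_names, row))


# Hypothetical column names; replace them with the fields in the data dictionary.
EXAMPLE_COLUMNS = ["cycle", "recipient", "industry", "amount"]


def total_by(rows, key, amount_field="amount"):
    """Sum the amount field for each distinct value of `key`."""
    totals = {}
    for row in rows:
        totals[row[key]] = totals.get(row[key], 0.0) + float(row[amount_field])
    return totals
```

Calling `total_by(read_compressed_csv("contributions.csv.gz", EXAMPLE_COLUMNS), "recipient")` (the file name is made up) would return a dictionary of totals per recipient, which is often enough to decide whether a dataset is worth a closer look.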
Govtrack is where you go if you are looking for federal legislation or for information about your representative and senators in Congress, including voting records and original research on bills and votes. However, the site is less user-friendly. Some data requires authorization through an API key, and the data that doesn’t requires familiarity with JSON. Nevertheless, it is a valuable source of official data from Congress that includes:
One of my favorite examples of data visualization storytelling comes from a 2016 collaboration between ProPublica and The Texas Tribune. Busted is a story about human error, lax regulatory oversight, and bias. I love how the authors use data visualizations as evidence of their own reporting. The visualizations aren’t meant to dazzle you with their complexity. Instead, they are the clues in a complex investigation. The underlying data are the facts of the case.
The social injustice
Cheap roadside drug tests widely used by law enforcement produce false positive results at an alarming rate and send hundreds of thousands of innocent people to prison. This is yet another story about America’s misguided war on drugs. According to the article, “every year at least 100,000 people nationwide plead guilty to drug-possession charges that rely on field-test results as evidence.” And yet “74 percent of the convicted didn’t possess any drugs at the time of their arrest.”
How do journalists identify a social wrong? By comparing outcomes.
The authors use simple bar charts to uncover relationships. They do this in three different ways.
First, they show a comparison of distribution:
Over half of those proven innocent pleaded guilty within a week of their arrest.
Then they show a comparison of magnitude:
Blacks are three times more likely to be wrongfully convicted of a drug crime in Houston than people of other races.
Finally, they measure and compare the lengths of time spent in prison:
The bar graph shows the 416 wrongful convictions identified by the conviction-integrity unit of the Harris County district attorney’s office. All 416 have been exonerated by DNA tests, yet all 416 remain in prison. This graph ranks the length of time they have each spent in prison.
The journalists in this story use a bar graph in three different ways to show what a social wrong looks like. It’s encouraging to see simple visualizations used to such great effect.
Group Project By: Glynnis McIntyre, Ellen Studer, Zina Thompkins
See full storymap here.
Original Sources of Data:
Cline Center for Democracy, “SPEED Project – Civil Unrest Event Data,” The University of Illinois.
Databanks International, “The 2016 Edition of the CNTS Data Archive.”
The Economist Intelligence Unit.
We pulled the Democracy Score Index for 2014 data; 2010-2013 data
Modified Data Set here.