
990 Decoder -- Charity Navigator

ETL toolkit for 2.5 million electronic nonprofit tax returns released by the IRS.

IRS Form 990 Decoder

This repository contains everything you need to get started exploring the IRS Form 990 dataset hosted by Amazon Web Services on S3. This includes instructions for an easier-to-use 990 database provided free to the public by Charity Navigator.

New version available!

As part of the Nonprofit Open Data Collective, Charity Navigator has been proud to contribute to the eFile Master Concordance, which provides a standardized mapping between the many XML schemas in the primary dataset. The concordance is still in draft. A validation event is taking place in November 2017.
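To illustrate what a mapping between schemas means in practice, here is a minimal sketch, not the project's actual code. It assumes a concordance can be represented as a lookup from schema-specific element paths to one standardized variable name. The element paths, the variable name `total_revenue`, the `extract` function, and the two toy documents are all hypothetical. Real filings are much larger and may use XML namespaces, which this sketch ignores.

```python
import xml.etree.ElementTree as ET

# Hypothetical concordance: two schema versions store the same concept
# under different element paths. Both paths map to one standard name.
CONCORDANCE = {
    "TotalRevenue/CurrentYear": "total_revenue",  # made-up older layout
    "CYTotalRevenueAmt": "total_revenue",         # made-up newer layout
}


def extract(xml_text, concordance=CONCORDANCE):
    """Return a dict of standardized variable names found in one filing."""
    root = ET.fromstring(xml_text)
    row = {}
    for path, variable in concordance.items():
        node = root.find(path)  # path is relative to the root element
        if node is not None and node.text is not None:
            row[variable] = node.text.strip()
    return row


# Toy filings, invented for this example.
OLD_STYLE = (
    "<Return><TotalRevenue><CurrentYear>1000</CurrentYear>"
    "</TotalRevenue></Return>"
)
NEW_STYLE = "<Return><CYTotalRevenueAmt>1000</CYTotalRevenueAmt></Return>"

if __name__ == "__main__":
    print(extract(OLD_STYLE))
    print(extract(NEW_STYLE))
```

Both toy filings produce the same standardized row even though their layouts differ, which is the kind of uniformity a concordance is meant to provide across schema versions.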

As a result of this hard work, Charity Navigator now has code capable of extracting all fields from the entire IRS eFile dataset. The data will be made publicly available after the Validatathon event. If you wish to preview the data, please follow the instructions in the readme for the new code.

Working with the original 990 Toolkit

Due to the imminent release of a much richer dataset, we have deprecated our original toolkit. If you wish to access it anyway, you can view the original documentation here.

Authors (original version)

Code and visualizations: David Bruce Borenstein

Documentation: David Bruce Borenstein and Zach Weinsteiger

Crosswalk between XML and database columns (990): Vince Bogucki

Crosswalk between XML and database columns (EZ): David Bruce Borenstein and Zach Weinsteiger