About the Data
The NJ-SHO Data Dashboard is powered by the NJ-SHO Data Warehouse, which has longitudinal information on 24 million individuals of all ages spanning almost 20 years. Using rigorous data integration methods, we have linked motor vehicle crash reports, driver licensing and citation records, hospital discharges, and other administrative datasets. The Warehouse has licensing and health information for people even if they have never been involved in a crash.
By linking records for the same individuals across datasets, their experiences can be seen within the larger context of their lives. This data integration resource goes far beyond just crash data to examine demographic and community characteristics of drivers, passengers, and pedestrians to promote transportation equity for all.
Current NJ-SHO Data Sources
Data Type | Database | Data Contains | Years Obtained | Record Information |
---|---|---|---|---|
Birth Certificate | NJ Birth Certificate | Birth Certificate data for all births occurring in NJ | 1979 - 2019 | ~4.6M Births |
Electronic Health Records (EHRs) | Childhood EHRs | EHR data on all CHOP healthcare network patients who were ever residents of NJ | 2005 - 2020 | ~540K Patients |
Hospital Discharge | NJ Hospital Discharge Data Collection System | Detailed utilization data on all NJ inpatient, outpatient, and ED discharges; files are derived from hospital uniform billing information | 2004 - 2019 | ~76M Visits |
Trauma Registry | To be integrated | | | |
Emergency Medical Services | To be integrated | | | |
Driver Licensing | NJ Driver Licensing | Detailed data on every driver who had a NJ license at some point during study period | 2004 - 2020 | ~11.3M Drivers |
Traffic Citation | NJ Administrative Office of the Courts (AOC) | Date and type of all license-related events in NJ | 2004 - 2020 | ~92M Events |
Motor Vehicle Crash | NJ Crash Report | Crash-, vehicle-, driver-, passenger-, and pedestrian/pedalcyclist-level data for all police-reported crashes in NJ | 2004 - 2019 | ~8.8M Drivers, ~3M Passengers, ~137K Pedestrians/Bicyclists |
Community Indicators | US Census and American Community Survey | Age-, sex-, and race/ethnicity-specific population data; census tract-level geographic and socioeconomic indicators | 2004 - 2020 | Census tract-level variables are assigned to individuals based on geocoded residential address |
Community Indicators | Various | Various geographic measures of equity, disparities, and accessibility including community resilience estimates, social vulnerability indicators, and walkability scores | 2010 - 2020 | Variables are assigned to individuals based on geocoded residential address |
Death Certificate | NJ Death Certificate | Death certificate data for all deaths in NJ | 2004 - 2019 | ~1.2M Deaths |
Medicare | Centers for Medicare & Medicaid Services (CMS) | Medicare claims data, including demographics, dates of enrollment, inpatient diagnoses, outpatient diagnoses, and prescription drugs | 2007 - 2019 | 1.5 Million |
Vehicle Information | National Highway Traffic Safety Administration | Decoded VIN of a specified vehicle and detailed vehicle information | 2004 - 2019 | Varies by year; 2019 data: 92% of vehicles |
The table above describes the sources and years of the most recent data integration. The next data integration, which adds further years of data for current sources plus Emergency Medical Services, is ongoing. Administrative data sources must be finalized before the lengthy integration process can begin; consequently, the Data Warehouse does not include real-time data.
Technical Documentation
For detailed information on the NJ-SHO Data Warehouse, including descriptions of all data sources and the methods used to develop the Warehouse, download the technical documentation using the link below.
NJ-SHO Data Warehouse Technical Documentation
Data Integration Methodology
The Data Warehouse was created using rigorous data integration methods. Data were linked using a combination of a probabilistic linkage process in LinkSolv and a hierarchical deterministic process in SAS. The linkage was evaluated according to strict thresholds and was determined to be high quality. We have published a peer-reviewed research paper in Injury Prevention on our initial data integration methodology. This research paper also offers guidance to others interested in undertaking data integration. Download a free PDF of the paper.
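To give a sense of what a hierarchical deterministic process looks like, here is a minimal Python sketch. It is an illustration only, not the NJ-SHO implementation (which, as noted above, uses LinkSolv and SAS). The field names, the order of the match passes, and the sample records are all hypothetical. The idea is to try the strictest exact-match rule first, accept only unambiguous one-to-one matches, and then retry the remaining records with progressively looser rules.

```python
# Hypothetical match passes, ordered from strictest to loosest.
# Each pass lists the fields that must all agree exactly.
MATCH_PASSES = [
    ("last_name", "first_name", "dob", "sex", "zip"),
    ("last_name", "first_name", "dob"),
    ("last_name", "dob", "zip"),
]


def normalize(value):
    """Trim and upper-case a value; treat missing or blank values as None."""
    if value is None:
        return None
    text = str(value).strip().upper()
    return text or None


def make_key(record, fields):
    """Build a comparison key, or None if any required field is missing."""
    values = [normalize(record.get(field)) for field in fields]
    if any(v is None for v in values):
        return None
    return tuple(values)


def group_by_key(records, fields, skip_ids):
    """Group the ids of not-yet-linked records by their comparison key."""
    groups = {}
    for record in records:
        if record["id"] in skip_ids:
            continue
        key = make_key(record, fields)
        if key is not None:
            groups.setdefault(key, []).append(record["id"])
    return groups


def link_hierarchical(left, right, passes=MATCH_PASSES):
    """Return {left_id: (right_id, pass_number)} for one-to-one matches."""
    links = {}
    used_right = set()
    for pass_no, fields in enumerate(passes, start=1):
        left_groups = group_by_key(left, fields, links.keys())
        right_groups = group_by_key(right, fields, used_right)
        for key, left_ids in left_groups.items():
            right_ids = right_groups.get(key, [])
            # Accept only unambiguous matches: one record on each side.
            if len(left_ids) == 1 and len(right_ids) == 1:
                links[left_ids[0]] = (right_ids[0], pass_no)
                used_right.add(right_ids[0])
    return links


# Made-up records for illustration only.
licenses = [
    {"id": "L1", "last_name": "Rivera", "first_name": "Ana",
     "dob": "1990-04-02", "sex": "F", "zip": "08540"},
    {"id": "L2", "last_name": "Chen", "first_name": "Wei",
     "dob": "1985-11-30", "sex": "M", "zip": "07030"},
    {"id": "L3", "last_name": "Okafor", "first_name": "Ngozi",
     "dob": "1972-01-15", "sex": "F", "zip": "08901"},
]
crashes = [
    {"id": "C1", "last_name": "rivera ", "first_name": "ANA",
     "dob": "1990-04-02", "sex": "F", "zip": "08540"},
    {"id": "C2", "last_name": "Chen", "first_name": "Wei",
     "dob": "1985-11-30", "sex": "M", "zip": "07302"},
    {"id": "C3", "last_name": "Okafor", "first_name": "N",
     "dob": "1972-01-15", "sex": "F", "zip": ""},
]

links = link_hierarchical(licenses, crashes)
print(links)
```

In this toy data, the first pair links on the strictest pass, the second links only after the ZIP code requirement is dropped, and the third stays unlinked because the first name disagrees and the ZIP code is missing. Records left unlinked by exact-match rules like these are the kind of cases a probabilistic method is designed to handle, since it weighs partial agreement across fields instead of requiring exact equality.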
Data Security and Governance
The NJ-SHO Data Warehouse is governed by legal agreements between Children’s Hospital of Philadelphia (CHOP) and data owners that establish approved uses of the data and stringent security measures. Linkage and research activities have also been reviewed and approved by the Institutional Review Boards at CHOP and the NJ Department of Health. Careful protections are in place to ensure that these integrated data remain private and confidential. Data that identify an individual will never be released. Only researchers with special agreements can access the full de-identified data.
Citing the Data
If you use data from the NJ-SHO Data Dashboard for presentations, publications, or other reports, please acknowledge it as a data source using the citation below. We also request that you provide the title and full citation for any publications, research reports, or educational materials that use data or documentation from the NJ-SHO Center for Integrated Data by emailing us at njsho@chop.edu.