Data Search Tools
Data Collections/Aggregators/Lists
DataHub - mostly business and finance data
AwesomeData Public Datasets List
Stanford Large Network Dataset Collection - social networks and other communication/online network datasets
Wikipedia’s “List of datasets for machine learning research”
I^3 Open Innovation Dataset Index - curated, searchable, community-editable portal of innovation datasets
DISCERN: Duke Innovation & Scientific Enterprises Research Network - links innovation data to Compustat firms
Datasets for Development Economics
Public datasets collected by Anthony Lee Zhang
U.S. Government Data
FRED - US economic data
data.gov - US government data
Centers for Disease Control and Prevention
House of Representatives Open Government
Charles Stewart Congressional Data
The Data Liberation Project by Jeremy Singer-Vine - an initiative to identify, obtain, reformat, clean, document, publish, and disseminate government datasets of public interest
General Datasets
Opportunity Insights - neighborhood-level data on economic mobility and inequality, among other datasets
NASA Earth Data - all the earth science data you could want
FiveThirtyEight Data - mostly US politics and sports data
BuzzFeedNews Data - open-source data BuzzFeed News has released
YouTube Data - 8 million categorized YouTube videos
Spotify Data - lots of music/podcast-related data
CERN Data - particle physics data, image data
Refugee Resettlement Data 1975-2018 - digitization of the original refugee master files as initially recorded by the Office of Refugee Resettlement and made available by the National Archives
LIFE-M: Longitudinal, Intergenerational Family Electronic Micro-database - collection of data on four generations of 20th-century Americans
Other
Data Is Plural - weekly newsletter of useful/curious datasets by Jeremy Singer-Vine
Primer: Where to find data by Sebastian Tello-Trillo - guide on how to search for data, with collections of several field-specific datasets