OPR Data Archive
Welcome to OPR's Data Archive. This page provides a quick overview of our holdings and links to other data resources including joint projects hosted elsewhere.
New Data Release
The Latin American Migration Project (LAMP), which extends the MMP design to migration flows originating in other Latin American countries, has released new data for four communities in Ecuador (LAMP-ECU4).
LAMP-ECU4 provides information on 4 communities, 803 households, and 4,732 people.
The LAMP Public Use Data include data from Puerto Rico, the Dominican Republic, El Salvador, Nicaragua, Costa Rica, Paraguay, Peru, Haiti, Colombia, and Ecuador. Registration is required to access the data.
The Network Scale-up Method for Heavy Drug Users study (NSUM) was conducted to evaluate the method for estimating the sizes of groups most at risk for HIV/AIDS. Using four different data sources, the authors produced five estimates of the number of heavy drug users in Curitiba, Brazil. This data release includes three data files in Comma Separated Values (CSV) format, three R programs, a documentation file for the data and R code, and the questionnaire instruments (in Portuguese) as Portable Document Format (PDF) files.
The Fragile Families and Child Wellbeing Study, conducted by the Bendheim-Thoman Center for Research on Child Wellbeing (CRCW), has released the Nine-Year Follow Up (Wave 5) public use data files. The study follows a cohort of nearly 5,000 children born in the U.S. between 1998 and 2000. It oversamples births to unmarried couples, and, when weighted, the data are representative of births in large U.S. cities at the turn of the century. The In-Home Longitudinal Study of Pre-School Aged Children, conducted by the Center for Health and Wellbeing (CHW), collects information from a subset of the Fragile Families Core respondents on children's cognitive and emotional development, health, and home environment.
The Mexican Migration Project (MMP), an ongoing multidisciplinary study of migration from Mexico to the United States, has released data for 134 communities (MMP134): the original 128 communities plus 6 new communities, two from the state of San Luis Potosi, three from the state of Puebla, and one from the state of Guanajuato. The MMP134 has information on 21,522 Mexican households and 957 U.S. households, and individual-level data on 144,258 persons. These data include information on 7,398 household heads with migration experience to the U.S. and 48 household heads with Canadian migration experience.
The Survey of Unemployed Workers in New Jersey (Krueger and Mueller, 2011) invited unemployed workers to participate in the study each week for up to 12 weeks (and an additional 12 weeks for some). The release includes two data files: (1) the Entry Survey Public Use Data file, which has demographic, income, and wealth information on 6,025 unemployed workers sampled from the universe of roughly 360,000 individuals receiving Unemployment Insurance (UI) benefits in New Jersey as of September 28, 2009; and (2) the Weekly Follow-up Survey Public Use Data file, which contains focused information on job search activities, reservation wages, and receipt of job offers. The Weekly data contain 39,201 person-week observations overall.
The Game of Contacts (GC) data were collected as nested items in a behavioral surveillance study of heavy drug users in Curitiba, Brazil. This public data release includes two data files on 294 respondents in comma-separated values (CSV) format, 13 R programs, data documentation, and a copy of the interviewer form (in Portuguese) used to record the game of contacts data. By running the R programs, one can reproduce all the graphical and tabular results reported in Salganik et al. (2010), "The Game of Contacts: Estimating the Social Visibility of Groups," Social Networks.
The Addis Ababa Mortality Surveillance Project (AAMSP) is hosted by Addis Ababa University and revolves around surveillance of burials at all known cemeteries of Addis Ababa, Ethiopia. This release of the public use data covers the first five years of the burial surveillance, starting in 2001, and includes a set of adult verbal autopsy interviews conducted in 2004. Registration is required to access the data.
Project90 was a prospective study of the influence of network structure on the dynamics of HIV transmission in a community of high-risk heterosexuals. The data were collected between 1988 and 1992 in Colorado Springs, CO. Stephen Muth and John Potterat kindly provided the data to Sharad Goel and Matthew Salganik in 2007, and they were later used in the paper S. Goel and M. J. Salganik (2010), "Assessing respondent-driven sampling," Proceedings of the National Academy of Sciences (PNAS). The release of these data allows others to replicate the analyses of Goel and Salganik.
The Immigrant Identity Project (IIP), also known as Transnational Identities and Behavior: An Ethnographic Comparison of First and Second Generation Latino Immigrants, has released data for public use, which include a quantitative data sheet, 165 interview transcripts (with personal and place names masked), and 306 pictures related to American/Latino identity taken by the respondents themselves. Registration is required to access the data.
The Texas Higher Education Opportunity Project (THEOP) is a multi-year study that investigates college planning and enrollment behavior under a policy that guarantees admission to any Texas public college or university to high school seniors who graduate in the top decile of their class. THEOP released administrative data that consist of College Application Data and College Transcript Data obtained from nine Texas universities in December 2008. It also released the Sophomore Cohort Wave 2 Survey Data in February 2009.
The Success and Failure in Cultural Markets project (CM), motivated by puzzling aspects of contemporary cultural markets, has released data from a series of four web-based experiments involving a total of 27,267 participants. Included in this release are 167 data files, 48 music files (mp3 format), and detailed documentation. The experiments were conducted by Prof. Matthew J. Salganik between 2004 and 2007.
The National Longitudinal Survey of Freshmen (NLSF) has released the wave 4 (Junior in Spring 2002) and wave 5 (Senior in Spring 2003) public use datasets. Information on participants' graduation from college is available in a separate graduation dataset. The two final waves contain information similar to that in wave 2 (Freshman in Spring 2000) and wave 3 (Sophomore in Spring 2001), as well as detailed information on topics such as extracurricular group involvement, health and emotional problems, college debts, future plans for employment, career, and higher education, respondents' perceptions of their own and other racial and ethnic groups in terms of identity, and incidences of discrimination and prejudice. The NLSF follows a cohort of first-time freshmen at selective colleges and universities through their college careers. Equal numbers of whites, blacks, Hispanics, and Asians were sampled at each of the 28 participating schools, for a total of nearly 4,000 respondents.
The New Immigrant Survey (NIS) is a panel survey of a nationally representative sample of new legal immigrants to the United States. The first full cohort (NIS2003-1) data are now available for download.
The Social Environment and Biomarkers of Aging Study (SEBAS) is an unusually rich, population-based longitudinal study focusing on the health and well-being of older persons in Taiwan. SEBAS explores the relationship between life challenges and mental and physical health, the impact of social environment on the health and well-being of the elderly, and biological markers of health and stress. SEBAS is a joint project of Georgetown University's Center for Population and Health (CPH) and OPR. Public use data from the project are available at ICPSR under study 3792.
The following are historical datasets archived at OPR. When the data are officially disseminated by others, OPR's copy is for internal use only.
Datasets connected with the Princeton European Fertility Project, including the famous Hutterite fertility data first analyzed by Mindel Sheps and later used to establish standards for the analysis of the European fertility decline.
U.S. Cohort and Period Fertility Tables 1917-1980, produced by the National Institute of Child Health and Human Development, National Institutes of Health, compiled by Robert L. Heuser.
Population and death statistics tables from developing countries amassed by the Organisation for Economic Co-operation and Development (OECD).
The World Fertility Survey (WFS), a collection of high-quality, internationally comparable surveys of human fertility conducted in 41 developing countries in the late 1970s and early 1980s.
Data and Statistical Services
Princeton University's Data Library, maintained by Data and Statistical Services (DSS), part of Firestone Library's Social Science Reference Center, has an extensive data collection and offers statistical consulting.
Inter-university Consortium for Political and Social Research
If you can't find the data you need at Princeton, the next step is the ICPSR Archive at the University of Michigan. In particular, the Data Sharing for Demographic Research project (DSDR) provides resources to demographic data producers and users.
Office of Population Research, Princeton University, Wallace Hall, Princeton NJ 08544
Phone: (609) 258-4870 Fax: (609) 258-1039 Email: webmaster