Welcome to OPR's Data Archive. This page provides a quick overview of our holdings
and links to other data resources including joint projects hosted elsewhere.
The Fragile Families and Child Wellbeing Study, conducted by the Bendheim-Thoman
Center for Research on Child Wellbeing (CRCW),
has now released the Nine-Year Follow Up (Wave 5) public use datafiles.
The Fragile Families and Child Wellbeing Study follows a cohort of nearly 5,000
children born in the U.S. between 1998 and 2000. The study oversamples births to
unmarried couples; when weighted, the data are representative of births in
large U.S. cities at the turn of the century.
The In-Home Longitudinal Study of Pre-School Aged Children collects information,
from a subset of the Fragile Families Core respondents, on children's cognitive
and emotional development, health, and home environment, and is conducted by the
Center for Health and Wellbeing (CHW).
The Mexican
Migration Project (MMP), an ongoing multidisciplinary study of
migration from Mexico to the United States, has released data for 134 communities
(MMP134), which includes the original 128 communities plus six additional communities:
two from the state of San Luis Potosi, three from the state of Puebla, and one from
the state of Guanajuato. The MMP134 has information on 21,522 Mexican households,
957 U.S. households, and individual-level data on 144,258 persons. These data contain
information on 7,398 household heads with migration experience to the U.S. and information
on 48 household heads with Canadian migration experience.
The
Survey of Unemployed Workers in New Jersey (Krueger
and Mueller, 2011) invited unemployed workers to participate in
the study each week for up to 12 weeks (and an additional 12 weeks for some). The release
includes two data files: (1) The Entry Survey Public Use Data file, which has demographic,
income and wealth information on 6,025 unemployed workers sampled from the universe
of the roughly 360,000 individuals receiving Unemployment Insurance (UI) benefits
in New Jersey as of September 28, 2009; and (2) The Weekly Follow-up Survey Public
Use Data file, which contains focused information on the job search activities,
reservation wage, and receipt of job offers. The Weekly data contain 39,201 person-week
observations in total.
The Game
of Contacts (GC) data were collected as nested items in a behavioral
surveillance study of heavy drug users in Curitiba, Brazil. This public data release
includes two data files on 294 respondents in comma separated values (CSV) format,
13 R programs, data documentation, and a copy of the interviewer form (in Portuguese)
to record the game of contacts data. By running the R programs, one can reproduce
all the graphical and tabular results reported in Salganik et al. (2010), "The
Game of Contacts: Estimating the Social Visibility of Groups," Social Networks.
The Addis
Ababa Mortality Surveillance Project (AAMSP) is hosted by Addis
Ababa University and revolves around surveillance of burials at all known cemeteries
of Addis Ababa, Ethiopia. This release of the public use data pertains to the first
five years of the burial surveillance starting in 2001 and includes a set of adult
verbal autopsy interviews conducted in 2004. Registration is required for accessing
the data.
The Latin
American Migration Project (LAMP), which extends the
MMP design to a study of migration flows originating in other Latin American
countries, has now released data for El Salvador and 4 new communities for Colombia.
Public Use Data now include data from Puerto Rico, the Dominican Republic, El Salvador,
Nicaragua, Costa Rica, Paraguay, Peru, Haiti and Colombia. Registration is required
for accessing the data.
Project90
was a prospective study of the influence of network structure on the dynamics of
HIV transmission in a community of high-risk heterosexuals. The data were collected
between 1988 and 1992 in Colorado Springs, CO. Stephen Muth and John Potterat kindly
provided the data to Sharad Goel and Matthew Salganik in 2007, and they were later
used in their paper, S. Goel and M. J. Salganik (2010) "Assessing respondent-driven
sampling" Proceedings of the National Academy of Sciences (PNAS). The release of
these data allows others to replicate the analyses of Goel and Salganik.
The Immigrant
Identity Project (IIP), also known as Transnational Identities
and Behavior: An Ethnographic Comparison of First and Second Generation Latino Immigrants,
has released data for public use, which include a quantitative data sheet, 165 interview
transcripts (personal and place names are masked), and 306 pictures taken by respondents
themselves related to American/Latino identity. Registration is required for accessing
the data.
The Texas
Higher Education Opportunity Project (THEOP) is a multi-year
study that investigates college planning and enrollment behavior under a policy
that guarantees admission to any Texas public college or university to high school
seniors who graduate in the top decile of their class. THEOP
released administrative data that consist of College Application Data and College
Transcript Data obtained from nine Texas universities in December 2008. It also
released the Sophomore Cohort Wave 2 Survey Data in February 2009.
The Success
and Failure in Cultural Markets project (CM), motivated by puzzling
aspects of contemporary cultural markets, has released data from a series of four web-based
experiments involving a total of 27,267 participants. Included in this release are
167 data files, 48 music files (mp3 format), and detailed documentation. The experiments
were conducted by Prof. Matthew J. Salganik between 2004 and 2007.
The National
Longitudinal Survey of Freshmen (NLSF) has released the wave
4 (Junior in Spring 2002) and wave 5 (Senior in Spring 2003) public use datasets.
Information on participants' graduation from college is available in a separate
graduation dataset. The two final waves contain information similar to wave 2 (Freshman
in Spring 2000) and wave 3 (Sophomore in Spring 2001), as well as detailed information
on extracurricular group involvement, health and emotional problems, college debts,
future plans for employment, career and higher education, respondents' perception
of their own/other racial and ethnic groups in terms of identity, incidences of
discrimination and prejudice, to name a few. The NLSF follows a cohort of first-time
freshmen at selective colleges and universities through their college careers. Equal
numbers of whites, blacks, Hispanics, and Asians were sampled at each of the 28
participating schools, with nearly 4,000 respondents.
The New
Immigrant Survey (NIS) is a panel survey of a nationally representative
sample of new legal immigrants to the United States. The first full cohort (NIS2003-1)
data are now available for download.
The following are historical datasets archived at OPR. When the data are officially disseminated
by others, OPR's copy is for internal use only.
Datasets connected with the Princeton European Fertility Project,
including the famous Hutterite fertility data first analyzed by
Mindel Sheps and later used to establish standards for the analysis of the European
fertility decline.
U.S. Cohort and Period Fertility Tables 1917-1980, produced by
the National Institute of Child Health and Human Development, National Institutes of Health,
compiled by Robert L. Heuser.
Population and death statistics tables from developing countries
amassed by the Organisation for Economic Co-operation and Development (OECD).
The World Fertility Survey (WFS), a collection of high-quality,
internationally comparable surveys of human fertility conducted in 41 developing
countries in the late seventies and early eighties.