November 7, 2009

Files created at OPR from the EGSF data
During the analysis of the EGSF data at OPR, staff created a number of data
files that may be useful to others at OPR who want to analyze the EGSF data.
These files are available only to people at OPR.
You can click here for a list of these files.
- Prenatal care and delivery archive derived from Questionnaire
Section D (prenatal.zip)
- The prenatal care and delivery file is a flat file (one record for
each live birth in the five years before the survey, up to two births
per woman) created from the hierarchical files containing Section D
of the questionnaire (prenatal care and delivery). Most variables are
unmodified. The main change is that variables stored on multiple
records in the original data are represented here by multiple
variables. Some variables have been combined to create a single
variable out of multiple questions. This file was created from the 6
Section D files in the raw data. The public use data have a less
hierarchical structure, with only 4 files. A codebook documents which
files the variables were derived from, with references to the file
number in both the raw data and the public use data. The following
files are in the zipped archive:
- prenatal.ssd01 - SAS version of the prenatal file (SAS 6.11
version for Sun 4 Unix)
- prenatal.dta - Stata version of the prenatal file. (Stata 5.0 for
Unix or PC.)
- prenatal.cbk - Simple ASCII codebook for the prenatal file.
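The flattening described above, in which variables stored on several records in the hierarchical source become several variables on a single birth record, can be pictured with a short Python sketch. This is only an illustration and not the program used at OPR: the field names (`birth_id`, `visit`, `provider`) and the sample rows are hypothetical.

```python
from collections import defaultdict


def flatten(records, key, seq, fields):
    """Collapse several records per key into one wide record.

    Each name in `fields` becomes name_1, name_2, ... according to
    the record's sequence number stored under `seq`.
    """
    wide = defaultdict(dict)
    for rec in records:
        row = wide[rec[key]]
        row[key] = rec[key]
        for name in fields:
            row[f"{name}_{rec[seq]}"] = rec[name]
    # Return one record per key, ordered by key.
    return [wide[k] for k in sorted(wide)]


# Hypothetical hierarchical input: several records per birth.
visits = [
    {"birth_id": 101, "visit": 1, "provider": "midwife"},
    {"birth_id": 101, "visit": 2, "provider": "doctor"},
    {"birth_id": 102, "visit": 1, "provider": "health post"},
]

flat = flatten(visits, key="birth_id", seq="visit", fields=["provider"])
for row in flat:
    print(row)
```

A birth with fewer source records simply lacks the later numbered variables, which a statistical package would store as missing values.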
- Morbidity symptoms file archive derived from Questionnaire Section E
(symptoms.zip)
- The morbidity symptoms file is a flat file (one record for each living
child under five years of age at the time of the survey, with a maximum of
two births per woman) based on the very complex hierarchical files
comprising Section E
(child morbidity) of the EGSF questionnaire. The file contains details of
symptoms of illness shown by the child in the two weeks prior to the
interview, as well as information about treatments applied, advice elicited
by the mother from others, and medical care providers consulted. Some
information from the birth history, some anthropological measurement data,
and some variables pertaining to the mother have been attached to the
records. This file was created from the 10 section E files in the raw data.
The public use data has a less hierarchical structure, and has only 4 files.
There is a codebook which documents which file each variable was derived
from, with references to file number in both the raw data and public use
data. The following files are in the zipped archive:
- symptoms.ssd01 - SAS version of the symptoms file (SAS 6.11
version for Sun 4 Unix)
- symptoms.dta - Stata version of the symptoms file. (Stata 5.0 for
Unix or PC.)
- symptoms.cbk - Simple ASCII codebook for the symptoms file.
- Medicines archive derived from Questionnaire Section E, with
additional data added:
(medtable.zip)
- The medicine table archive contains additional information about
medicines prescribed to children who had symptoms of illness in the
two weeks prior to the interview, based on the information about
those medicines provided by the mother in Section E of the
questionnaire, and further described and categorized by Junio Robles,
a Guatemalan physician at INCAP. A small data file
contains one record for each unique medicine mentioned by
the mothers, with Junio's description and a categorization developed
jointly by Barbara Vaughan and Junio. There is also a larger
hierarchical file which contains records for each medicine used,
linked to the child for whom it was prescribed and the person or
persons who advised the mother to use the medicine. For each child treated
by a medicine, there is one or
more records for each medicine used, depending on the number of people who
advised it. The file also contains a broad categorization of the
persons giving advice developed by Patrick Heuveline, so that both
the medicines and the persons advising them have a succinct summary.
These records could be linked to the records in the symptoms extract
file described above. A rudimentary codebook is in the archive
as well. The zipped archive contains the following files:
- medtable.ssd01 - Hierarchical child/medicine/adviser file in SAS
format
- medtable.dta - Hierarchical child/medicine/adviser file in Stata
format
- medtable.cbk - Codebook for hierarchical child/medicine/adviser file
- Jararev.xls - Excel version of medicine description file, created by
J. Robles and corrected by B. Vaughan
- jararev.dat - ASCII version of medicine description file
- jararev.cbk - Codebook for medicine description file
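One way to picture how the two medicine files fit together: each record in the hierarchical file names a child, a medicine, and an adviser, and the medicine can be looked up in the small description file. The Python sketch below is only an illustration. The field names (`child_id`, `med_code`, `adviser`, `category`) and all values are hypothetical; the actual variable names are in the codebooks.

```python
# Hypothetical description file: one record per unique medicine.
medicines = {
    "M01": {"description": "medicine A", "category": "category X"},
    "M02": {"description": "medicine B", "category": "category Y"},
}

# Hypothetical hierarchical file: one record per child, medicine,
# and adviser, so a medicine advised by two people appears twice.
usage = [
    {"child_id": 1, "med_code": "M01", "adviser": "pharmacist"},
    {"child_id": 1, "med_code": "M01", "adviser": "relative"},
    {"child_id": 2, "med_code": "M02", "adviser": "doctor"},
]


def attach_category(usage, medicines):
    """Look up each usage record's medicine and add its category."""
    linked = []
    for rec in usage:
        info = medicines.get(rec["med_code"], {})
        linked.append({**rec, "category": info.get("category")})
    return linked


linked = attach_category(usage, medicines)

# Summarize the distinct medicine categories used for each child.
per_child = {}
for rec in linked:
    per_child.setdefault(rec["child_id"], set()).add(rec["category"])
print(per_child)
```

Because the hierarchical file repeats a medicine once per adviser, any count of medicines per child should be taken over distinct medicine codes rather than over records.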
- Health post archive derived from Community Health Post Survey
(puestosm.zip)
- The file puestosm.dta is a Stata datafile containing one record for
each interviewed health post in the communities surveyed. It was created by
Arodys Robles, who merged files puestp00.dta, puestp02.dta,
puestp04.dta, puestp05.dta, puestp07.dta,
puestp08.dta, and puestp11.dta, using the variable POSTID,
which uniquely identifies each health post. The file has 48 records and 399
variables. (Two health posts which appear in file puestp00.dta, but
for which interviews were not successfully completed, were dropped from this
file.) The variable COMMID can be used to link this file to other
community-level files.
No variables were redefined or modified, so the main EGSF codebook
should be used for this file. All variables found in the source files
were retained.
This file does not contain puestp06.dta (services provided, 816
records), puestp09.dta (types of instruments, 902 records), or
puestp10.dta (types of medicines, 1776 records).
The zipped archive contains the following files:
- puestosm.dta - Stata data file
- puestosm.cbk - description of data file, created by B. Vaughan
- mergepue.log - log file of merge
- mergepue.do - merge program that produced data file
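The actual merge is documented in mergepue.do and mergepue.log in the archive. For readers unfamiliar with this kind of one-to-one merge, the Python sketch below shows the general idea of combining several one-record-per-post files on POSTID. It is not a translation of the do-file: the sample values, the variable `n_staff`, and the rule of keeping only posts present in every file are assumptions made for illustration.

```python
def merge_on_key(tables, key):
    """Merge one-record-per-key tables, keeping keys found in all of them."""
    indexed = [{row[key]: row for row in table} for table in tables]
    common = set(indexed[0])
    for idx in indexed[1:]:
        common &= set(idx)
    merged = []
    for k in sorted(common):
        row = {}
        for idx in indexed:
            row.update(idx[k])  # variables from each file sit side by side
        merged.append(row)
    return merged


# Hypothetical extracts of two source files.
cover = [
    {"POSTID": 1, "COMMID": 10},
    {"POSTID": 2, "COMMID": 11},
    {"POSTID": 3, "COMMID": 11},  # listed, but no completed interview
]
staff = [
    {"POSTID": 1, "n_staff": 4},
    {"POSTID": 2, "n_staff": 2},
]

posts = merge_on_key([cover, staff], "POSTID")
print(posts)
```

The merged records keep COMMID, so they can be linked to other community-level files in the same way.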
- Private doctor/clinic archive derived from Community Survey of
Private Doctors and Clinics (unionmed.zip)
- The file unionmed.dta is a Stata datafile containing one record
for each private health provider (doctor or clinic) in the communities
surveyed. It was created by Arodys Robles, who merged files
medicm00.dta through medicm05.dta and medicm08.dta using
the variable PID, which uniquely identifies each doctor or clinic. The file
has 31 records and 429 variables. The variable COMMID can be used to link
this file to other community-level files.
No variables were redefined or modified, so the main EGSF codebook
should be used for this file. All of the variables found in the
source files were retained in this file.
This file does not contain medicm06.dta (personnel, 589 records)
and medicm07.dta (medicines prescribed, 893 records).
The zipped archive contains the following files:
- unionmed.dta - Stata data file
- unionmed.cbk - description of data file, created by B. Vaughan
- mergemed.do - merge program that produced data file
- Community data derived from Key Informant Survey
(community.zip)
- M. Gragnolati constructed two classes of community-level variables.
- First, global variables come from the community questionnaire. The
EGSF collected information from three key informants in each community. When
there was lack of agreement among the answers given by the three key
informants, M. Gragnolati reconciled the responses as follows:
- For continuous
variables, such as distances and prices, he computed the median value.
- For ordinal variables, when possible, he selected the answer on
which two informants agreed. When all responses were different, he chose the
value which fell between the other two.
- For categorical variables, when possible, he selected the answer
on which two informants agreed. When all responses were different, he chose
the answer given by the mayor.
Note that the variables referring to the providers (clavei18) were not
reconciled in this way. M. Gragnolati did not try to reconcile differing
answers for these variables. In general, each answer given by any informant
was accepted as true.
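The reconciliation rules for the three key informants can be sketched in Python as follows. This is a minimal illustration based on the description here, not M. Gragnolati's own code. It assumes ordinal answers are stored as sortable codes and that the position of the mayor's answer is known (the `mayor_index` parameter is hypothetical).

```python
from collections import Counter
from statistics import median


def reconcile(values, kind, mayor_index=0):
    """Reconcile three informants' answers into a single value.

    kind: "continuous", "ordinal", or "categorical".
    mayor_index: position of the mayor's answer in `values`
                 (an assumption about how the data are laid out).
    """
    if kind not in ("continuous", "ordinal", "categorical"):
        raise ValueError(f"unknown variable type: {kind}")
    if kind == "continuous":
        return median(values)
    # Ordinal and categorical: use the answer two informants agree on.
    answer, count = Counter(values).most_common(1)[0]
    if count >= 2:
        return answer
    if kind == "ordinal":
        # All three differ: take the value between the other two.
        return sorted(values)[1]
    # Categorical, all three differ: take the mayor's answer.
    return values[mayor_index]


print(reconcile([2.0, 5.0, 3.0], "continuous"))
print(reconcile([1, 3, 2], "ordinal"))
print(reconcile(["a", "b", "c"], "categorical", mayor_index=0))
```

The provider variables are outside the scope of this function, since for them every informant's answer was accepted rather than reconciled.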
Second, contextual variables are obtained by aggregating data
collected from the individual questionnaire. M. Gragnolati calculated the
proportion of households in each community with a certain attribute (e.g.,
piped water, sanitation, or TV).
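A contextual variable of this kind is a within-community proportion. The Python sketch below shows the calculation under the assumption that each household record carries the community identifier COMMID. The attribute name `piped_water` and the sample records are hypothetical.

```python
from collections import defaultdict


def community_proportions(households, attribute):
    """Return, for each community, the share of households with the attribute."""
    totals = defaultdict(int)
    with_attr = defaultdict(int)
    for hh in households:
        totals[hh["COMMID"]] += 1
        if hh[attribute]:
            with_attr[hh["COMMID"]] += 1
    return {comm: with_attr[comm] / totals[comm] for comm in totals}


# Hypothetical household records.
households = [
    {"COMMID": 10, "piped_water": True},
    {"COMMID": 10, "piped_water": False},
    {"COMMID": 11, "piped_water": True},
]

props = community_proportions(households, "piped_water")
print(props)
```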
Further detail is presented in Michele Gragnolati's dissertation.
The zipped archive contains the following files:
- communi.dta - Stata data file
- communi.cbk - description of data file, created by B. Vaughan
from notes provided by M. Gragnolati and Dana Glei.