Hutterite Fertility Data

Basic Information

  • Author: A.G. Steinberg
  • Conducted by: A.G. Steinberg, Western Reserve University
  • Data Prepared by: Office of Population Research, Princeton University
  • Source of Data: Data collected from historical records, family Bibles and followup interviews.
  • Date of Data Collection: 1953 through 1961
  • Sample Size: 722 families with live birth records; 552 have pregnancy histories based on followup interviews.
  • Record Weights: None.
  • Access: Public

Additional Information

These data are of great historical interest. They represent a natural fertility population with a very high level of marital fertility and were used as a standard in the European Fertility Study against which the fertility levels of other populations were measured.


A.G. Steinberg collected the data as part of a genetic study in 1953 (51 families) and 1958-1961 (685 families). An attempt was made to determine, from written records and family Bibles, the dates of birth and death of everyone who had ever lived in the communities studied. Followup interviews were conducted with 562 families in order to obtain complete pregnancy histories. The dataset we have contains only 722 families and 552 pregnancy histories, which is 14 families and 10 pregnancy histories fewer than Mindel Sheps reports.
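The shortfall quoted above can be checked with a little arithmetic. The sketch below assumes that the totals Mindel Sheps reports are the same as the collection figures given in this paragraph (51 + 685 families, 562 interviews); the variable names are ours and do not come from the codebook.

```python
# Figures quoted in the paragraph above; variable names are illustrative only.
families_1953 = 51
families_1958_1961 = 685
families_collected = families_1953 + families_1958_1961  # assumed to match Sheps' total

families_in_file = 722
histories_interviewed = 562
histories_in_file = 552

missing_families = families_collected - families_in_file
missing_histories = histories_interviewed - histories_in_file

print(missing_families, missing_histories)  # prints: 14 10
```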

History of the Dataset

As far as we can ascertain, Mindel Sheps obtained the data in the form of punched cards from A.G. Steinberg in the early 1960's. (The printed codebook at OPR is dated July, 1963.) In the late 1970's, they were transferred from cards to magnetic tape at the Office of Population Research. Copies of the data exist at other universities; to the best of our knowledge, they are copies of the data we have, as they contain the same data errors that were found in our dataset.

The project begun by A.G. Steinberg still continues under the aegis of the University of Chicago, with periodic reinterviews of these same Hutterite communities. For further information, contact

Dr. Carol Ober
Dept. of Obstetrics and Gynecology
University of Chicago
5841 S. Maryland Ave.
Chicago, IL 60637
(312) 702-1234

Dr. Ober is particularly interested in receiving copies of any publications based on the Hutterite data. Because she maintains an ongoing relationship with the respondents and their descendants, it is a professional courtesy to forward to her the results of any research.

Data Structure and Content

The data file is hierarchical, and is an image of the original punched cards. Each case begins with a family record containing the birth dates and marriage date of the parents, the status of the marriage and the number of pregnancies and live births. The family card is followed in the file by a variable number of "live birth" cards, which are identified by confinement number, rather than birth number.

The Detailed Code Description for the live birth card indicates that confinements are numbered for the couple, rather than the mother; i.e., the first confinement is the first for the couple jointly, not necessarily the first for the mother. These birth records contain a duplication of much information in the family card as well as a duplication of information from the previous live birth record, if any.

For those women who were personally interviewed, the live birth records (if any) are followed by a pregnancy record, containing the outcome of all the woman's pregnancies, so that miscarriages and stillbirths can be placed in their proper birth intervals. These records have a code indicating whether they pertain to the first or second marriage; however, none of the families that have "second marriage" coded here have a pregnancy card for the first marriage, so the record is again complete for the couple, but not for the woman.
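The layout described above (one family card, then zero or more live birth cards, then an optional pregnancy record) can be read by grouping card images into cases. The following is a minimal sketch, not the procedure used to prepare the file: the codebook's column positions are not reproduced here, so the caller supplies a hypothetical `card_kind` function that classifies each line, and the sample lines and their F/B/P prefixes are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class Case:
    """One couple: a family card plus the cards that follow it."""
    family: str
    live_births: List[str] = field(default_factory=list)
    pregnancy_cards: List[str] = field(default_factory=list)


def group_cases(lines: Iterable[str], card_kind: Callable[[str], str]) -> List[Case]:
    """Group card images into cases; card_kind returns 'family', 'birth' or 'pregnancy'."""
    cases: List[Case] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        kind = card_kind(line)
        if kind == "family":
            cases.append(Case(family=line))  # a family card starts a new case
        elif not cases:
            raise ValueError("card found before any family card")
        elif kind == "birth":
            cases[-1].live_births.append(line)
        elif kind == "pregnancy":
            cases[-1].pregnancy_cards.append(line)
        else:
            raise ValueError(f"unrecognised card kind: {kind!r}")
    return cases


# Hypothetical classifier and toy data: real cards must be classified
# using the card-type columns documented in the codebook.
def toy_kind(line: str) -> str:
    return {"F": "family", "B": "birth", "P": "pregnancy"}[line[0]]


sample = ["F001", "B001 1", "B001 2", "P001", "F002", "B002 1"]
cases = group_cases(sample, toy_kind)
summary = [(len(c.live_births), len(c.pregnancy_cards)) for c in cases]
print(summary)  # prints: [(2, 1), (1, 0)]
```

Keeping the classifier separate from the grouping logic means the sketch makes no claim about which columns identify a card; only `toy_kind` would need replacing once the codebook layout is applied.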

The codebook describes other cards that follow the pregnancy card, specifically card types 50, 51, 61, 62 and 60, but these are not in the dataset. However, they appear to contain variables that were derived from data in the earlier cards, so their absence is of little consequence. The codebook has been included in its entirety because it might be useful in elucidating earlier research that used these data.

The coding of the data reveals a compulsion to get the maximum amount of information into each column of the records, with a contradictory tendency to waste space by repeating data on successive records. These tendencies are typical of the state of data processing at the time the data were collected. While real estate on cards was extremely valuable, computer memory was even more so. Thus, data read on the first card were discarded and read again so that they didn't have to be stored in memory.

Data Quality

The data contained a number of obvious errors, and other users of the data have pointed out further errors to us. Those that caused serious problems in reading the data, or that could be corrected from duplicated information in the data, were fixed. These corrections are recorded in a separate document.

There are obvious inconsistencies in some variables. Tabulations that were made while we were attempting to clarify discrepancies between the data and the codebook are included with the documentation, along with notes that may prove useful.

Many of the dates were arrived at by guesswork; these estimated dates are well documented in the data. There is no indication why the number of cases we have differs from what Mindel Sheps reports in her research.

Location of Data and Documentation

A file list is available, and the codebook may be viewed online.

Key Reference

  • Sheps, Mindel. "An Analysis of Reproductive Patterns in an American Isolate." Population Studies 19, pp. 65-80.

We would greatly appreciate additional information about these data. If you can provide information, then please contact