HRS 1994 (Wave 2) Codebook

HRS 1994 (Wave 2) Final Release Codebook

   1 Introduction and Acknowledgments


   2 Contact Information


   3 Obtaining the Data
      3-a  Registration/Conditions of Use
      3-b  Internet Site


   4 Description of Files
      4-a  List of Files and Links to Online Documentation
      4-b  File Types
           4-b-1  Documentation
              4-b-1-a  Data Description
              4-b-1-b  Interview/Questionnaire/Box-and-Arrow
              4-b-1-c  Codebook
              4-b-1-d  Questionnaire and Codebook Files List
           4-b-2  Raw Data
           4-b-3  Descriptor Statements
      4-c  Weights
      4-d  Imputations
      4-e  Identification Variables
      4-f  Structure
           4-f-1  HRS Tracker File
           4-f-2  Individual-level Files
           4-f-3  Household-level Financial Files
           4-f-4  Family and Household Listing Files
      4-g  Merging
           4-g-1  Individual (Respondent) Level File Creation
           4-g-2  Individual (Family/Helper/Household Member) Level File Creation
           4-g-3  Household level file creation
           4-g-4  Merging with 1992 HRS (Wave 1)


   5 Using the Files
      5-a  Setup
      5-b  Decompressing the Files
           5-b-1  Decompressing Files Using HRS2.BAT
           5-b-2  Decompressing Files Yourself
      5-c  Subdirectory Structure
      5-d  Using the Files with SAS
      5-e  Using the Files with SPSS
      5-f  Using the Files with STATA
      5-g  Using the Files with Other Software


   6 Data Description
      6-a  Masking for Confidentiality


   7 If You Have Special Needs or Problems

1. Introduction and Acknowledgments

The Health and Retirement Study (HRS) is a national longitudinal study that focuses on persons born between the years 1931 and 1941 and their health, retirement, and economic status. It is a cooperative agreement between the Institute for Social Research at the University of Michigan and the National Institute on Aging.

Funding has been provided by the National Institute on Aging at NIH, the Social Security Administration, the Department of Labor Pension and Welfare Benefits Administration, the Office of the Assistant Secretary for Planning and Evaluation at DHHS, the State of Florida Department of Elder Affairs, the NIH Office of Research on Minority Health, and the NIH Office of Research on Women's Health.

The data, with appropriate masking for purposes of respondent confidentiality, are being made available to the public via the Internet in hopes that a broad group of persons will make use of this very important collection of data. This document is intended to serve as an outline and approach to using the data, but not to be a comprehensive guide.

This release of the 1994 HRS (Wave 2) data set is intended for use by the general public. By receiving these data, which have been freely provided, you are agreeing to use them solely for research and statistical purposes and to make no effort to determine respondent identities. In addition, you are agreeing in good faith to send a copy of any publications you produce based on these data to the address below.

HRS Papers and Publications
Institute for Social Research, Room 3050
The University of Michigan
P.O. Box 1248
Ann Arbor, MI (USA) 48106-1248

2. Contact Information

If you have questions, concerns, or comments that are not adequately addressed here or on our web page (http://www.umich.edu/~hrswww), please feel free to contact us.

        E-Mail: [email protected]


        Postal service: Health and Retirement Study
                        Institute for Social Research, Room 3050
                        P.O. Box 1248
                        Ann Arbor, MI 48106-1248


        FAX: (734) 647-1186


        Phone: (734) 647-1186

3. Obtaining the Data

3-a Registration/Conditions of Use

Before working with HRS data, you must first register. Through your registration, we are able to convey to our sponsors the size and diversity of our user community, allowing us to continue to collect this important data. Registered users will receive user support, as well as information related to errors in the data, future releases, workshops, and publication lists. The information provided will not be for commercial use, and will not be redistributed to third parties.

If you have already registered, thank you; you need not register again unless the information submitted has changed.

If you have not yet registered, you may register your use of HRS data by completing the online registration form at the HRS Public File Download Area.

3-b Internet Site

Health and Retirement Study public release datasets are available through the Internet. To access the HRS 1994 data and other relevant information, point your Web browser to the HRS Web Site at: http://hrsonline.isr.umich.edu. Choose "Data" and then "Access to Public Data".

4. Description of Files

The descriptions that follow deal only with files included with and specific to the 1994 HRS (Wave 2) Final Release.

Files associated with the same data set generally have the same prefix. For instance, SAS file "W2A.SAS" and EXTRACT file "W2A.EDI" go with data file "W2A.DA". Questionnaire and codebook files have a slightly different prefix in that they are preceded by a two digit number (and underscore) that indicates the order in which the files should be printed to create a properly ordered, complete copy of the documentation.

In addition to the files provided in the 1994 HRS (Wave 2) Final Release, there are two other HRS public release files users will probably want to obtain. The first is the HRS Tracker File, which provides weights, tracking, demographic, and other information for the entire sample in a single data set. The second is the HRS Concordance, a database that allows users to track content and identify similar questions longitudinally. Both files are available from the HRS Web Page, in the same area of Datasets and Files as the 1994 HRS (Wave 2) Final Public Release.

4-a List of Files and Links to Online Documentation

The number of files contained in the 1994 HRS (Wave 2) Final Release may seem daunting at first. It eases the mind, however, to realize that there is no need to access every one of the files. Some files are specific to the codebook, others are for SAS users, and so on. Indeed, it is unlikely that any one user will need every data set, or even every variable within a data set. Rather, it is best to determine which content areas are of interest and focus on just the files containing the relevant variables.

Data File Content

W2CS Household and Individual Coversheet Data

W2HHLIST Coversheet: Household Listing

W2A Section A: Demographics, and Miscellaneous

W2B Section B: Health

W2C Section C: Cognition

W2D Section D: Housing

W2E Section E: Family Structure

W2KIDS Section E: Children file

W2PARS Section E: Parents file

W2SIBS Section E: Siblings file

W2FA Section FA: Employment (Employees)

W2FB Section FB: Employment (Self-Employed)

W2FC Section FC: Employment (Unemployed)

W2G Section G: Last Job, R Not Working Now

W2H Section H: Job History

W2J Section J: Disability

W2K Section K: Net Worth

W2N Section N: Income

W2R Section R: Health Insurance

W2S Section S: Widowhood

W2V Section V: Capital Gains

W2MOD0 Experimental Module 0: Activities and Nutrition

W2MOD1 Experimental Module 1: Depression Scale

W2MOD2 Experimental Module 2: Similarities

W2MOD3 Experimental Module 3: Physical Functioning

W2MOD4 Experimental Module 4: Spending and Saving

W2MOD5 Experimental Module 5: Risk Aversion

W2MOD6 Experimental Module 6: Social Support

W2MOD7 Experimental Module 7: Transfers

W2MOD8 Experimental Module 8: Help with ADLs

W2MOD9 Experimental Module 9: Activities and Time Use

4-b File Types

4-b-1 Documentation

There are three types of documentation available for use specifically with the 1994 HRS (Wave 2) Final Release. They are the Data Description, Interview/Questionnaire/Box-and-Arrow, and Codebook. Users of the data will want to become familiar with all three and reference them often.

4-b-1-a Data Description

Subdirectory: (any) Suffix: TXT

The Data Description, which you are currently reading, gives a rough overview of the data set. It is stored as an ASCII text file, and should be looked over prior to working with the data.

4-b-1-b Interview/Questionnaire/Box-and-Arrow

Subdirectory: C:\HRS\WAVE2\IVIEW Suffix: WP5

There are three names the research community uses that all refer to basically the same piece of documentation: interview, questionnaire, and box-and-arrow. For purposes of this document, we will refer to the document as a questionnaire.

The 1994 HRS (Wave 2) Final Release Questionnaire is stored as a set of WordPerfect Version 5.0 files. The questionnaire is the only part of the HRS Wave 2 Public Release that is not in ASCII text form. Because of the graphical nature of the questionnaire, adequate conversion to ASCII text format was not feasible.

The questionnaire is helpful when used in tandem with the codebook, as the questionnaire graphically depicts skip patterns and the flow of the interview, which some users find very helpful.

For a list of all 1994 HRS (Wave 2) Final Release Questionnaire files, see Part 4-b-1-d of this document.

4-b-1-c Codebook

Subdirectory: C:\HRS\WAVE2\CODEBOOK Suffix: TXT

The 1994 HRS (Wave 2) Final Release Codebook is stored as ASCII text files. There should be a codebook file that corresponds to each dataset.

The codebook conveys variable names, labels, question text, code values, and code labels. It also conveys some skip logic in a non-graphical format. In addition, frequencies or means are presented for each variable. Please note that the frequencies and means are UNWEIGHTED. In addition, the means include missing data values, and are intended only to be used to check that your data read in correctly. The means and associated univariates should not be used to examine the data analytically.

When accessing the codebook, it is sometimes also useful to reference the associated questionnaire files. While it is possible to work with the data at some level without the questionnaire, it is nearly impossible to use the data without the codebook.

For a list of all 1994 HRS (Wave 2) Final Release Codebook files, see Part 4-b-1-d of this document.

4-b-1-d Questionnaire and Codebook Files List

For users who wish to print out the entire HRS Wave 2 Questionnaire, or the entire 1994 HRS (Wave 2) Final Release Codebook, the first two digits of each file name indicate the order in which they should be printed.

Questionnaire Files   Codebook Files   Content
-------------------   --------------   -------
*1                    01_W2MAS         Master Codes
01_W2CS.WP5           02_W2CS          Household and Individual Coversheet Data
02_W2A.WP5            03_W2A           Section A: Demographics, and Miscellaneous
03_W2B.WP5            04_W2B           Section B: Health
04_W2D.WP5            05_W2D           Section D: Housing
05_W2E.WP5            06_W2E           Section E: Family Structure
06_W2EE.WP5           *2               Section EE: Family Structure
*3                    19_W2KID         Children Information from Section E
*3                    20_W2SIB         Sibling Information from Section E
*3                    21_W2PAR         Parent Information from Section E
*3                    22_W2HHL         Household Listing from Section E
07_W2FA.WP5           07_W2FA          Section FA: Employment (Employees)
08_W2FB.WP5           08_W2FB          Section FB: Employment (Self-Employed)
09_W2FC.WP5           09_W2FC          Section FC: Employment (Unemployed)
10_W2J.WP5            10_W2J           Section J: Disability
11_W2K.WP5            11_W2K           Section K: Net Worth
12_W2V.WP5            12_W2V           Section V: Capital Gains
13_W2C.WP5            13_W2C           Section C: Cognition
14_W2N.WP5            14_W2N           Section N: Income
15_W2R.WP5            15_W2R           Section R: Health Insurance
16_W2S.WP5            16_W2S           Section S: Widowhood
17_W2G.WP5            17_W2G           Section G: Last Job, R Not Working Now
18_W2H.WP5            18_W2H           Section H: Job History
19_W2MD0.WP5          23_W2MD0         Module 0: Activities and Nutrition
20_W2MD1.WP5          24_W2MD1         Module 1: Depression Scale
21_W2MD2.WP5          25_W2MD2         Module 2: Similarities
22_W2MD3.WP5          26_W2MD3         Module 3: Physical Functioning
23_W2MD4.WP5          27_W2MD4         Module 4: Spending and Saving
24_W2MD5.WP5          28_W2MD5         Module 5: Risk Aversion
25_W2MD6.WP5          29_W2MD6         Module 6: Social Support
26_W2MD7.WP5          30_W2MD7         Module 7: Transfers
27_W2MD8.WP5          31_W2MD8         Module 8: Help with ADLs
28_W2MD9.WP5          32_W2MD9         Module 9: Activities and Time Use

*1 - The Master Codes file contains large codeframes and is referred to as needed in order to save space and avoid repetition.

*2 - Section EE has its own questionnaire sub-section, but is combined into the other Section E portions of the codebook.

*3 - The coversheet household listing and select parts of Section E (Family) were broken out into separate files after collection. The questionnaire represents how they were actually collected, and the codebook indicates how they ended up.

4-b-2 Raw Data

Subdirectory: C:\HRS\WAVE2\DATA Suffix: DA

Files with the extension "DA" are raw data files. HRS Wave 2 data are stored in ASCII text format, with fixed-length records. All HRS Wave 2 data should be numeric.
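As a rough illustration of what "fixed-length records" means in practice, the sketch below slices one record into fields by column position. The layout shown (field names, start positions, and widths) is purely hypothetical; the real column positions come from the descriptor statements supplied for each software package, not from this example.

```python
# HYPOTHETICAL layout: (field name, zero-based start column, width).
# The actual positions are defined in the descriptor statements
# distributed with each data file, not here.
LAYOUT = [
    ("HHID", 0, 5),
    ("PN", 5, 3),
    ("W2SUBHH", 8, 1),
    ("W100", 9, 2),
]


def parse_record(line, layout=LAYOUT):
    """Slice one fixed-length record into a dict of field strings."""
    return {name: line[start:start + width] for name, start, width in layout}


# Fields are kept as strings so that leading zeros (e.g. PN "010") survive.
record = parse_record("56789010007")
```

Keeping identifiers as strings is a design choice of this sketch; analysts who convert them to numbers should take care when reassembling combined identifiers later.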

4-b-3 Descriptor Statements

Subdirectory          Suffixes   Software
--------------------  --------   --------
C:\HRS\WAVE2\EXTRACT  EDI        EXTRACT
C:\HRS\WAVE2\OSIRIS   DI         OSIRIS
C:\HRS\WAVE2\SAS      SAS, SAI   SAS
C:\HRS\WAVE2\SPSS     SPS, SPI   SPSS
C:\HRS\WAVE2\STATA    DO, DCT    STATA

For software packages to understand the content of raw data files (with the "DA" suffix), descriptor statements are required. Because of the proprietary nature of software packages, descriptor statements specific to the software package are required. Please reference Part 5 of this document for information on using the descriptor statements provided with the software package of your choice.

4-c Weights

Household and person-level weights for the 1994 HRS (Wave 2) Final Release are present in the HRS Tracker File. The HRS Tracker File can be obtained from the same area of the HRS Web Site as the 1994 HRS (Wave 2) Final Release.

For the time being, the weights are not included with the 1994 HRS (Wave 2) Final Release data files. In future versions of the data, we will include them in each dataset; until that time, we apologize for the inconvenience.

4-d Imputations

A large number of variables were imputed in the 1994 HRS (Wave 2) Final Release data set. Those analysts who wish to use the imputations need do nothing out of the ordinary. For those who do not care to use the imputed values, or perhaps just wish to know what the original value was, we have created imputation indicators.

The codebook indicates whether a particular variable was imputed. Variables that are imputed should have the tag "[IMPUTED]" below the variable label in the codebook. A variable that is imputed should have an associated variable that is its imputation indicator. The variable name for the imputation indicator is always the variable number plus an additional 10,000. For example, if W100 were imputed, its imputation indicator would be W10100; for W9999 the imputation indicator would be W19999, et cetera. The presence of an imputation indicator for a variable is further evidence that a variable has been imputed.

The imputation indicators are one-digit codes reflecting the original value of the data. The meanings of the imputation indicators are shown in the table below.

        Indicator
        Value       Meaning (original value of variable)
        ---------   ------------------------------------
        1           Partially missing data: brackets were used
                    to obtain the value
        2           Original value was termed Inap.
        3           Original value was not missing, and was inside
                    the valid range of codes (most probably, a
                    prior variable was imputed and changed the
                    skip pattern of this variable)
        4           Original value was not missing, but was outside
                    the valid range of codes
        5           Missing data: refused
        6           Partially missing data: the range card was
                    used to obtain the value
        7           Missing data: loss/negative, DK/NA;
                    "Q not relevant to R", "other"
        8           Missing data: DK/don't know
        9           Missing data: NA/Not ascertained

Analysts who so choose should be able to use the imputation indicator in combination with the imputed variable to restore the original values for the variable.
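The naming rule for imputation indicators (variable number plus 10,000) can be sketched in a few lines. This is only an illustration: it assumes variable names consist of a letter prefix followed by a number, as in the W100 and W9999 examples, and the function and dictionary names are our own rather than part of the release.

```python
import re

# Meanings copied from the indicator table above.
INDICATOR_MEANING = {
    1: "Partially missing data: brackets were used to obtain the value",
    2: "Original value was termed Inap.",
    3: "Original value was not missing, and was inside the valid range of codes",
    4: "Original value was not missing, but was outside the valid range of codes",
    5: "Missing data: refused",
    6: "Partially missing data: the range card was used to obtain the value",
    7: 'Missing data: loss/negative, DK/NA; "Q not relevant to R", "other"',
    8: "Missing data: DK/don't know",
    9: "Missing data: NA/Not ascertained",
}


def imputation_indicator_name(var_name):
    """Return the indicator name for an imputed variable (W100 -> W10100)."""
    # Assumption: names look like a letter prefix followed by digits.
    match = re.fullmatch(r"([A-Za-z]+)(\d+)", var_name)
    if match is None:
        raise ValueError(f"unexpected variable name: {var_name!r}")
    prefix, number = match.group(1), int(match.group(2))
    if number >= 10000:
        raise ValueError(f"{var_name!r} already looks like an indicator")
    return f"{prefix}{number + 10000}"
```

For example, imputation_indicator_name("W9999") gives "W19999", and INDICATOR_MEANING[5] looks up the meaning of an indicator value of 5.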

4-e Identification Variables

Identification variables are distinguishable from other variables in that they identify a record in a data set for a particular level of analysis.

Household level. Upon being interviewed, each sample household was assigned a Household Identifier (HHID). The HHID is stable, and uniquely identifies the original household across time. At each cross-section, however, the status of a household may change due to the severance of a partnership or the death of a respondent. The Sub-Household Identifier (SUBHH) for each cross-section, when used in combination with the HHID, uniquely identifies a household as of a particular cross-section. All households are assigned a SUBHH of 0 in the first year of collection. Thereafter, a SUBHH of 0 indicates that the original household remains intact. A SUBHH of 1 or 2 recognizes households that have broken off from the original household due to the severance of a partnership. A SUBHH of 3 indicates a deceased respondent, who for practical reasons is considered to now be in a household of their own.

In summary, to identify an original household use the HHID by itself, but to identify a household as of a particular cross-section use the HHID in combination with the SUBHH.

Example 11-1. Two respondents in a sample household are married as of the first cross-section. Each respondent is assigned a HHID of 12345 and a SUBHH of 0. As of the second cross-section the two respondents are still married, and each retains their HHID of 12345 and their SUBHH of 0.
Example 11-2. Two respondents in a sample household are married as of the first cross-section. Each respondent is assigned a HHID of 23456 and a SUBHH of 0. As of the second cross-section, the married couple divorces. At the second cross-section, both respondents retain their HHID of 23456, but each is assigned a SUBHH of 1 and 2, respectively.
Example 11-3. Two respondents in a sample household are married as of the first cross-section. Each respondent is assigned a HHID of 34567 and a SUBHH of 0. One respondent dies before the next wave. At the next wave, both respondents retain their HHID of 34567; the living respondent retains their SUBHH of 0, but the deceased respondent is assigned a SUBHH of 3.
Example 11-4. A respondent who has never been married is in the first cross-section. The respondent is assigned a HHID of 45678 and a SUBHH of 0. As of the second cross-section, the respondent marries. Both the respondent and their new spouse are assigned a HHID of 45678 and a SUBHH of 0 as of the second cross-section. (The household was not divided or otherwise changed; it was added to.)

Individual level. Individuals, whether they be respondents, children, siblings, or otherwise, are at their root persons associated with a sample household. For that reason, they are able to share a single identifier, the Person Number (PN). Person numbers are unique within a household, meaning no two persons associated with a household should ever have the same PN. In addition, the PN assigned to a person never changes. When used together, the HHID of the original household and the PN form an identifier for the person that is unique across time. Because HHID and PN do form a unique person identifier, a single combined variable called HHIDPN has been included in many files for the convenience of the analyst (though HHID and PN appear separately as well).

Example 11-5. A sample household with a HHID of 56789 contains two respondents assigned PNs of 010 and 020, respectively. Associated with the household are three children with PNs of 101, 102, and 201, and two siblings with PNs of 051 and 052. A friend who lives with the respondents is assigned a PN of 080. All eight persons will keep those same PNs across time.

When dealing with individual level family, helper, and household member files, be aware that households broken off from the original household due to the severance of a partnership can each contain a separate report on the same person.

Example 11-6. A sample household at the first cross-section contains a respondent (who has a PN of 010), their spouse who is also a respondent (and has a PN of 020), and their mutual child (who has a PN of 201). As of the first cross-section, the household has a HHID of 67891 and a SUBHH of 0. Prior to the second cross-section, the respondents divorce. Thus, as of the second cross-section there are two sub-households. The first sub-household has a HHID of 67891 and a SUBHH of 1; it contains the first respondent (PN 010) and a report on their mutual child (PN 201). The second sub-household has a HHID of 67891 and a SUBHH of 2; it contains the other respondent (PN 020) and a report on their mutual child (PN 201). Note that the child is reported on by both sub-households and thus the information appears twice.

Because the current identification variable scheme was not implemented until after multiple waves of data were collected, some exceptions to the PN identification method remain in the data. Those exceptions will be made consistent in the near future, but until that time, the exceptions are listed below.

Special case: HRS Wave 1 Person Identifiers. As of the public release of HRS Wave 1, we had not yet decided on using the HHIDPN as a unique person identifier for respondents. For this reason, in HRS Wave 1 only, the Case ID uniquely identifies each respondent, not the HHIDPN. Analysts who plan to merge HRS Wave 1 to other HRS data sets will find that both the Case ID and HHIDPN are present in the HRS Tracker File, available via the web page in the same location as other HRS public release data sets. Because both identifiers are present, the HRS Tracker File can be used as a bridge to merge the HHIDPN and other variables on to HRS Wave 1 files. There are plans to re-release HRS Wave 1 with the new identification scheme sometime in late summer, 1998.
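The bridging idea can be sketched as a simple lookup. The column names CASEID and V1, and all of the values, are hypothetical placeholders chosen for illustration; consult the HRS Tracker File documentation for the actual variable names.

```python
# HYPOTHETICAL tracker rows: each carries both identifiers.
tracker = [
    {"CASEID": "000123", "HHIDPN": "12345010"},
    {"CASEID": "000124", "HHIDPN": "12345020"},
]

# HYPOTHETICAL Wave 1 rows, identified only by Case ID.
wave1 = [
    {"CASEID": "000123", "V1": 5},
    {"CASEID": "000124", "V1": 3},
    {"CASEID": "999999", "V1": 1},  # no match in the tracker rows above
]


def add_hhidpn(wave1_rows, tracker_rows):
    """Attach HHIDPN to Wave 1 rows, using the tracker rows as a bridge."""
    lookup = {row["CASEID"]: row["HHIDPN"] for row in tracker_rows}
    # Rows with no tracker match get HHIDPN = None so they are easy to spot.
    return [{**row, "HHIDPN": lookup.get(row["CASEID"])} for row in wave1_rows]


bridged = add_hhidpn(wave1, tracker)
```

Once HHIDPN is attached, the Wave 1 rows can be merged with other files in the same way as any other respondent-level file.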

Special case: HRS Wave 1 Parents. Parents at HRS Wave 1 were assigned PNs in the range 31-49. As of HRS Wave 2, it was decided that PNs in the range 10-49 would be reserved for respondents. Because the parent PNs overlap with the respondent PNs, the parent PNs are not unique. To solve this problem, it is suggested that analysts add 40 to the parents' HRS Wave 1 PNs prior to merging with other files. The new HRS Wave 1 parent PNs will then be in the range 71-89. HRS staff will fix the problem in this manner in the HRS Wave 1 public data set as of the re-release in late summer, 1998.
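The suggested adjustment amounts to a one-line shift. The sketch below assumes PNs are held as integers, and the function name is our own; it rejects values outside the stated Wave 1 parent range rather than silently changing them.

```python
def shift_wave1_parent_pn(pn):
    """Move a Wave 1 parent PN from the range 31-49 into the range 71-89."""
    if not 31 <= pn <= 49:
        raise ValueError(f"PN {pn} is outside the Wave 1 parent range 31-49")
    return pn + 40
```

Apply this to the Wave 1 parent records only, and only once, before merging with other files.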

Special case: HRS Wave 2 Parents. For some reason, PNs were not a part of the HRS Wave 2 parents data file. As a result, for now there are no PNs in the data set for parents at HRS Wave 2. Merging HRS Wave 1 and HRS Wave 2 parent data thus becomes fairly difficult. HRS processing staff will attempt to add PNs to the HRS Wave 2 parents file in the near future. In the meantime, analysts will need to employ an alternative means of merging. One suggestion is to merge based on the relationship of the parent to each respondent.

In summary, identification variables you will find in HRS Wave 2 are...



HHID    HRS Household Identifier
        [five digits]
        Uniquely identifies the original HRS Wave 1 household.


W2SUBHH HRS Wave 2 Sub-household Identifier
        [one digit]
        When used along with the HHID, uniquely identifies the
        household as of the HRS Wave 2 cross-section.  Household
        composition at each wave can change due to the death of a
        respondent in a household, or the splitting of a partnered
        pair (as an example, perhaps due to divorce).


        A "0" in this variable indicates the original Wave 1
        household is still intact.
        A "1" or "2" indicates a Wave 1 household that has split
        as of this wave.
        A "3" indicates a deceased person who, because of their
        deceased status, is considered for practical reasons to be
        in a sub-household of their own.


PN      Person Number
        [three digits]
        Uniquely identifies a person associated with a household;
        this person may be a respondent, spouse/partner, child,
        sibling, parent, or other household member.
        Variants of this identifier are RPN (Respondent Person
        Number, in the family sections) and IPN (Informant
        Person Number, in the household listing).


HHIDPN  HRS Household Identifier + Person Number
        [eight digits]
        A convenient combination of HHID (the first five digits)
        and PN (the last three digits).  Uniquely identifies a
        person both cross-sectionally and longitudinally.
        Variants of this identifier are HHIDRPN and HHIDIPN.
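For analysts who need to build or take apart the combined identifier themselves, a minimal sketch follows. It assumes the identifiers are handled as zero-padded strings (five digits for HHID, three for PN); the function names are our own, and the data files may store these values in another form.

```python
from typing import Tuple


def make_hhidpn(hhid, pn) -> str:
    """Combine a five-digit HHID and a three-digit PN into an HHIDPN."""
    hhid_s, pn_s = str(hhid).zfill(5), str(pn).zfill(3)
    if len(hhid_s) != 5 or len(pn_s) != 3:
        raise ValueError("HHID must fit in 5 digits and PN in 3 digits")
    return hhid_s + pn_s


def split_hhidpn(hhidpn) -> Tuple[str, str]:
    """Split an eight-digit HHIDPN back into (HHID, PN)."""
    s = str(hhidpn).zfill(8)
    if len(s) != 8:
        raise ValueError("HHIDPN must fit in 8 digits")
    return s[:5], s[5:]
```

Zero-padding matters here: a PN read in as the number 10 must become "010" before it is joined to the HHID.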

4-f Structure

The file structure for the HRS is most easily understood once the method for collecting the data is understood.

First, at each cross-section, there are questions asked of all respondents, questions asked of a designated Financial Respondent on behalf of the entire household, and questions asked of a designated Family Respondent on behalf of the entire household.

Second, most questions are also asked in other waves, introducing a longitudinal aspect to the file structure.

We like to refer to the way our data is collected as being at different "levels". One example of a level of collection is the household level; another is the individual/respondent level. Data at these different levels are associated by the identification variables in Section 4-e. The HRS can thus be thought of as one large relational database.
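To make the relational idea concrete, the sketch below attaches household-level values to respondent-level records by matching on HHID and W2SUBHH. The analysis variables HEALTH and NETWORTH and all of the values are invented for illustration; only the identification variables follow Section 4-e.

```python
# Respondent-level rows (one per respondent); HEALTH is a made-up variable.
respondents = [
    {"HHID": "23456", "W2SUBHH": "1", "PN": "010", "HEALTH": 2},
    {"HHID": "23456", "W2SUBHH": "2", "PN": "020", "HEALTH": 3},
    {"HHID": "12345", "W2SUBHH": "0", "PN": "010", "HEALTH": 1},
    {"HHID": "12345", "W2SUBHH": "0", "PN": "020", "HEALTH": 4},
]

# Household-level rows (one per household); NETWORTH is a made-up variable.
households = [
    {"HHID": "12345", "W2SUBHH": "0", "NETWORTH": 150000},
    {"HHID": "23456", "W2SUBHH": "1", "NETWORTH": 40000},
    {"HHID": "23456", "W2SUBHH": "2", "NETWORTH": 65000},
]


def attach_household(respondent_rows, household_rows):
    """Copy household-level values onto each matching respondent row."""
    by_key = {(h["HHID"], h["W2SUBHH"]): h for h in household_rows}
    merged = []
    for r in respondent_rows:
        # Respondents with no matching household keep only their own fields.
        hh = by_key.get((r["HHID"], r["W2SUBHH"]), {})
        merged.append({**hh, **r})
    return merged


merged = attach_household(respondents, households)
```

Note that the two respondents in the intact household (SUBHH 0) receive the same household values, while the two respondents from the split household (SUBHH 1 and 2) each receive the values of their own sub-household.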

4-f-1 HRS Tracker File

We attempt to aid users in tracking the HRS sample longitudinally through use of our "tracker file". The HRS Tracker File is available from the same area as the 1994 HRS (Wave 2) Final Release.

The tracker file contains records corresponding to each of the 13142 respondents that were a part of the sample as of HRS Wave 1 and HRS Wave 2.

Information in the tracker file includes select identification variables, demographic information, weights, whether a person gave an interview in a particular wave, and whether a person was the Family or Financial Respondent in a particular wave, among other things.

4-f-2 Individual-level Files

When we say that a file was collected at the individual level, we mean that all respondents were to be asked the questions in these sections. Thus, there should be a record in each file for each of the 11596 respondents that gave information in HRS Wave 2.

Asked of all respondents:



W2A           Section A: Demographics
W2B           Section B: Health
W2C           Section C: Cognition
W2E           Section E/EE: Family Structure *
W2FA          Section FA: Employment (Employees)
W2FB          Section FB: Employment (Self-Employed)
W2FC          Section FC: Employment (Unemployed)
W2G           Section G: Last Job, R Not Working Now
W2H           Section H: Job History
W2J           Section J: Disability
W2R           Section R: Health Insurance
W2S           Section S: Widowhood

There was also a set of experimental modules that were not asked of all respondents, but rather of a subset of available respondents.

Asked of a subset of respondents:



W2E           Section E/EE: Family Structure *
W2MOD0        Experimental Module 0: Activities and Nutrition
W2MOD1        Experimental Module 1: Depression Scale
W2MOD2        Experimental Module 2: Similarities
W2MOD3        Experimental Module 3: Physical Functioning
W2MOD4        Experimental Module 4: Spending and Saving
W2MOD5        Experimental Module 5: Risk Aversion
W2MOD6        Experimental Module 6: Social Support
W2MOD7        Experimental Module 7: Transfers
W2MOD8        Experimental Module 8: Help with ADLs
W2MOD9        Experimental Module 9: Activities and Time Use


* In Section E, some questions were asked of all respondents,
  and others of just Family Respondents.

4-f-3 Household-level Financial Files

Each household was to have a person designated as the "Financial Respondent", who answered the household-level financial questions on behalf of the entire household.

Examples of household-level financial files are...

W2D           Section D: Housing
W2K           Section K: Net Worth
W2N           Section N: Income
W2V           Section V: Capital Gains

4-f-4 Family and Household Listing Files

In reality, family files are still individual level files. In this case, however, there is not one respondent per individual line, but one household member, child, sibling, or parent per individual line.

Files of this nature include...

W2HHLIST      Coversheet: Household Listing


W2KIDS        Section E: Children file
W2PARS        Section E: Parents file
W2SIBS        Section E: Siblings file

The information in the household listing was not given by each household member, but instead by a single informant. The information in the family files was not given by each family member, but rather by a single person designated as the "Family Respondent" in each household.

4-g Merging

Merging would not be a particularly difficult task if all datasets were structured the same. Unfortunately, because of the hierarchical nature of the data many datasets are not structured alike, and so merging becomes one of the more difficult data management tasks facing the analyst.

Many analyses require variables that appear in separate files. Before doing analysis work, the files will need to be merged in an appropriate manner. Prior to doing any data management, however, analysts should ask themselves two questions:

First, what variables are of interest? Predetermining what variables are needed for an analysis allows the analyst to subset their files to include only the necessary variables, weights, and identification variables. The smaller files are, the more manageable they are to work with.

Second, what should the final analysis file look like? Knowing beforehand whether the intended analysis requires one household per data set record, one respondent per data set record, or some other configuration makes planning the merging of the files much easier.

After these two questions are answered, there are three main types of final analysis files that analysts create. Descriptions of the three file types are below, followed by instructions on how to construct them.

Individual (respondent) level files contain information about one respondent on each record. If there is more than one respondent in a household, each will have their own record in the file. Examples of this sort of file are the demographics, health, and experimental modules files.

Individual (family/helper/household member) level files have information about one person (who is not a respondent) on each record. If there is more than one person of that type associated with the household, each will have their own record in the file. Examples of this sort of file are the children, parents, siblings, helper, and household listing files.

Household level files have information about one household on each record. Examples include household financial data, and family data that is not specific to a single person associated with the household.
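As a rough illustration of how the second file type relates to the third, the sketch below takes a file with one child per record and strings the children out so that there is one record per household. The AGE variable and all values are invented; the numbered suffixes are one possible way of renaming person-specific variables so that they do not overwrite each other.

```python
from collections import defaultdict

KEYS = ("HHID", "W2SUBHH")


def string_out(person_rows):
    """Turn one-person-per-record rows into one-household-per-record rows."""
    groups = defaultdict(list)
    for row in person_rows:
        groups[(row["HHID"], row["W2SUBHH"])].append(row)

    household_rows = []
    for (hhid, subhh), persons in sorted(groups.items()):
        record = {"HHID": hhid, "W2SUBHH": subhh}
        for position, person in enumerate(persons, start=1):
            for name, value in person.items():
                if name not in KEYS:
                    # PN_1, AGE_1, PN_2, AGE_2, ... keep each person distinct.
                    record[f"{name}_{position}"] = value
        household_rows.append(record)
    return household_rows


# Made-up children records; AGE is a hypothetical variable.
children = [
    {"HHID": "56789", "W2SUBHH": "0", "PN": "101", "AGE": 30},
    {"HHID": "56789", "W2SUBHH": "0", "PN": "102", "AGE": 28},
    {"HHID": "56789", "W2SUBHH": "0", "PN": "201", "AGE": 12},
    {"HHID": "67891", "W2SUBHH": "1", "PN": "201", "AGE": 15},
    {"HHID": "67891", "W2SUBHH": "2", "PN": "201", "AGE": 15},
]
wide = string_out(children)
```

In this sketch, households with fewer persons simply lack the later columns; in a statistical package those positions would normally appear as missing values.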

4-g-1 Individual (Respondent) Level File Creation

This set of instructions can be used to create a final analysis
file with one respondent per line.


1.   Subset your original files to include only the necessary
     weight, identification, and analysis variables.


2.   Identify subsetted files that are already at the individual
     (respondent) level and merge them together.
     2a.  Sort them by HHIDPN.
     2b.  Merge them by HHIDPN.
     The result should be an individual (respondent) level file with
     all individual (respondent) level variables in it.


3.   Identify subsetted files that are at the household level and
     merge them together.
     3a.  Sort them by HHID and SUBHH.
     3b.  Merge them by HHID and SUBHH.
     The result should be a household level file with all household
     level variables in it.


4.   Identify subsetted files that are at the individual (family/
     helper/household member) level, make them into household level
     files, and merge them together.
     4a.  Sort them by HHID and SUBHH.
     4b.  Determine the maximum number of persons per household in
          each subsetted file.  In step 4c, you will create this
          many sub-files from each subsetted file.
     4c.  For each subsetted file, create a number of sub-files,
          the first of which contains the first person in the
          household, the second of which contains the second person
          in the household, and so on until the nth file contains
          the nth person in the household.
     4d.  Uniquely rename variables in each sub-file so that they
          do not write over each other when merging.  You will not
          want to rename HHID and SUBHH and weight variables.  You
          will want to uniquely rename the PN of the person as well
          as other variables that specifically refer to that person,
          such as age and education.
     4e.  Sort all of the sub-files by HHID and SUBHH, if this is
          not already the case.
     4f.  Merge all of the sub-files together by HHID and SUBHH.
     The result should be a household level file with all individual
     (family/helper/household  member) variables strung out on each
     line.


5.   Merge the resultant files from steps 2, 3, and 4 together.
     5a.  Sort the resultant files by HHID and SUBHH.
     5b.  Merge the household level files from steps 3 and 4 to the
          respondent level file from step 2 by HHID and SUBHH.  Be
          sure to have your merging routine allow for multiple
          matches in the respondent level file.  Because there can
          be multiple respondents per household, you need to allow
          the household level data to be matched to each.
     The result should be an individual (respondent) level file, with
     household and individual (family/helper/household member) data
     present on each respondent's record.  Account in your analyses
     for the fact that information not originally at the individual
     (respondent) level is duplicated in the file for households with
     more than one respondent.
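
The one-to-many merge in step 5b can be sketched in Python as follows. This is only an illustration under stated assumptions, not part of the HRS distribution: the identifier values, the variable names W_AGE and W_HHINC, and the helper merge_household_onto_respondents are all hypothetical.

```python
# Hypothetical respondent-level records (result of step 2), one per respondent.
respondents = [
    {"HHID": "78912", "SUBHH": "0", "PN": "010", "W_AGE": 58},
    {"HHID": "78913", "SUBHH": "0", "PN": "010", "W_AGE": 61},
    {"HHID": "78913", "SUBHH": "0", "PN": "020", "W_AGE": 57},
]

# Hypothetical household-level records (result of steps 3 and 4), one per household.
households = [
    {"HHID": "78912", "SUBHH": "0", "W_HHINC": 10000},
    {"HHID": "78913", "SUBHH": "0", "W_HHINC": 40000},
]


def merge_household_onto_respondents(respondents, households):
    """Attach household-level variables to every respondent in the household."""
    # Index the household file by HHID and SUBHH; each key appears once.
    by_key = {(h["HHID"], h["SUBHH"]): h for h in households}
    merged = []
    for r in respondents:
        hh = by_key.get((r["HHID"], r["SUBHH"]), {})
        # Respondent values win on a name clash; the key variables are identical.
        merged.append({**hh, **r})
    return merged


analysis_file = merge_household_onto_respondents(respondents, households)
for rec in analysis_file:
    print(rec["HHID"], rec["PN"], rec["W_HHINC"])
```

Both respondents in household 78913 receive the same household income value, which is the duplication the step warns about.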

Example 11-7a. Sample households with HHID 78912 and 78913 have one and two respondents in them, respectively. Household 78912 has a household income of $10,000 a year, and household 78913 $40,000 a year. When an individual (respondent) level data file is created from these two households, there are three records that result, one for each respondent. If we now run a mean household income on the individual (respondent) file we get an average household income of $30,000, which is incorrect because we have counted the income of household 78913 twice (they have two records in the file).
Example 11-7b. Sample households with HHID 78912 and 78913 have one and two respondents in them, respectively. Household 78912 has a household income of $10,000 a year, and household 78913 $40,000 a year. When an individual (respondent) level data file is created from these two households, there are three records that result, one for each respondent. Before running a mean household income, we keep the first record for each HHID and SUBHH, and discard all but the first record for households with a duplicate HHID and SUBHH. If we now run a mean household income on the revised individual (respondent) file we get an average household income of $25,000, which is correct as each household is counted only once.
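
The arithmetic in the two examples can be reproduced with a short Python sketch. The variable name HHINC and the SUBHH value of "0" are assumptions made for illustration only.

```python
from statistics import mean

# The three respondent-level records described in the examples.
# SUBHH is set to "0" here purely for illustration.
records = [
    {"HHID": "78912", "SUBHH": "0", "HHINC": 10000},
    {"HHID": "78913", "SUBHH": "0", "HHINC": 40000},
    {"HHID": "78913", "SUBHH": "0", "HHINC": 40000},
]

# Naive mean over respondent records counts household 78913 twice.
naive_mean = mean(r["HHINC"] for r in records)

# Keep only the first record for each HHID and SUBHH combination.
first_per_household = {}
for r in records:
    first_per_household.setdefault((r["HHID"], r["SUBHH"]), r)

household_mean = mean(r["HHINC"] for r in first_per_household.values())

print(naive_mean)      # 30000
print(household_mean)  # 25000
```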

4-g-2 Individual (Family/Helper/Household Member) Level File Creation

This set of instructions can be used to create a final analysis file with one family member, household member, or helper per line. Before creating this particularly complex type of file, consider whether a final analysis file at the household or individual (respondent) level might be an acceptable alternative.

1.   Subset your original files to include only the necessary
     weight, identification, and analysis variables.


2.   Identify subsetted files that are at the individual
     (respondent) level, merge them together, and make them into
     household level files.
     2a.  Sort them by HHIDPN.
     2b.  Merge them by HHIDPN.
     2c.  Take the resultant file and create two sub-files, the
          first of which contains the first person in the household,
          the second of which contains the second person in the
          household, if present.
     2d.  Uniquely rename variables in each sub-file so that they
          do not write over each other when merging.  You will not
          want to rename HHID and SUBHH and weight variables.  You
          will want to uniquely rename the PN of the respondent as
          well as other variables that specifically refer to that
          respondent, such as age and education.
     2e.  Sort the two sub-files by HHID and SUBHH.
     2f.  Merge the two sub-files together by HHID and SUBHH.
     The result should be a household level file with all
     individual (respondent) level variables in it.


3.   Identify subsetted files that are at the household level and
     merge them together.
     3a.  Sort them by HHID and SUBHH.
     3b.  Merge them by HHID and SUBHH.
     The result should be a household level file with all household
     level variables in it.


4.   Identify the child, parent, sibling, helper, or household
     member individual level file that you want to merge all other
     files to.  We will call this the "core" file.   For instance,
     you may want to attach information about household members
     and respondents' parents to each of their children.  In this
     case, the child file is the "core" file, and the household
     member and parent files are going to be merged to it.  In
     other words, the final analysis file will have one child per
     line, not one parent or household member per line.  After you
     have identified the core file, leave it alone until step 6.


5.   Identify subsetted files other than the core file that are at
     the individual (family/helper/household member) level, make
     them into household level files, and merge them together.
     5a.  Sort them by HHID and SUBHH.
     5b.  Determine the maximum number of persons per household in
          each subsetted file.  In step 5c, you will create this
          many sub-files from each subsetted file.
     5c.  For each subsetted file, create a number of sub-files,
          the first of which contains the first person in the
          household, the second of which contains the second
          person in the household, and so on until the nth file
          contains the nth person in the household.
     5d.  Uniquely rename variables in each sub-file so that they
          do not write over each other when merging.  You will not
          want to rename HHID and SUBHH and weight variables.  You
          will want to uniquely rename the PN of the person as well
          as other variables that specifically refer to that
          person, such as age and education.
     5e.  Sort all of the sub-files by HHID and SUBHH, if this is
          not already the case.
     5f.  Merge all of the sub-files together by HHID and SUBHH.
     The result should be a household level file with all individual
     (family/helper/household  member) variables, except those in
     the core file, strung out on each line.


6.   Merge the resultant files from steps 2, 3, 4, and 5 together.
     6a.  Sort the resultant files by HHID and SUBHH.
     6b.  Merge the household level files from steps 2, 3, and 5 to
          the core file from step 4 by HHID and SUBHH.  Be sure to
          have your merging routine allow for multiple matches in
          the core file.  Because there can be multiple persons per
          household in the core file, you need to allow the
          household level data to be matched to each.
     The result should be an individual (core) level file, with
     individual (respondent), household, and individual (family/
     helper/household member) level data present on each core
     individual's record.  Account in your analyses for the fact
     that household and individual (family/helper/household member)
     information is duplicated in the file for households with more
     than one core individual.

Example 11-8a. Sample households with HHID 89123 and 89124 have one and two resident children in them, respectively. Household 89123 has a household income of $10,000 a year, and household 89124 $40,000 a year. When an individual (child) level data file is created from these two households, there are three records that result, one for each child. If we now run a mean household income on the individual (child) file we get an average household income of $30,000, which is incorrect because we have counted the income of household 89124 twice (the household has two children, and each has a record in the file).
Example 11-8b. Sample households with HHID 89123 and 89124 have one and two resident children in them, respectively. Household 89123 has a household income of $10,000 a year, and household 89124 $40,000 a year. When an individual (child) level data file is created from these two households, there are three records that result, one for each child. Before running a mean household income, we keep the first record for each HHID and SUBHH, and discard all but the first record for households with a duplicate HHID and SUBHH. If we now run a mean household income on the revised individual (child) file we get an average household income of $25,000, which is correct as each household is counted only once.

4-g-3 Household Level File Creation

This set of instructions can be used to create a final analysis file with one household per line.

1.   Subset your original files to include only the necessary
     weight, identification, and analysis variables.


2.   Identify subsetted files that are at the individual
     (respondent) level, merge them together, and make them into
     household level files.
     2a.  Sort them by HHIDPN.
     2b.  Merge them by HHIDPN.
     2c.  Take the resultant file and create two sub-files, the
          first of which contains the first person in the household,
          the second of which contains the second person in the
          household, if present.
     2d.  Uniquely rename variables in each sub-file so that they do
          not write over each other when merging.  You will not want
          to rename HHID and SUBHH and weight variables.  You will
          want to uniquely rename the PN of the respondent as well
          as other variables that specifically refer to that
          respondent, such as age and education.
     2e.  Sort the two sub-files by HHID and SUBHH.
     2f.  Merge the two sub-files together by HHID and SUBHH.
     The result should be a household level file with all individual
     (respondent) level variables in it.


3.   Identify subsetted files that are already at the household
     level and merge them together.
     3a.  Sort them by HHID and SUBHH.
     3b.  Merge them by HHID and SUBHH.
     The result should be a household level file with all household
     level variables in it.


4.   Identify subsetted files that are at the individual (family/
     helper/household member) level, make them into household level
     files, and merge them together.
     4a.  Sort them by HHID and SUBHH.
     4b.  Determine the maximum number of persons per household in
          each subsetted file.  In step 4c, you will create this
          many sub-files from each subsetted file.
     4c.  For each subsetted file, create a number of sub-files, the
          first of which contains the first person in the household,
          the second of which contains the second person in the
          household, and so on until the nth file contains the nth
          person in the household.
     4d.  Uniquely rename variables in each sub-file so that they do
          not write over each other when merging.  You will not want
          to rename HHID and SUBHH and weight variables.  You will
          want to uniquely rename the PN of the person as well as
          other variables that specifically refer to that person,
          such as age and education.
     4e.  Sort all of the sub-files by HHID and SUBHH, if this is
          not already the case.
     4f.  Merge all of the sub-files together by HHID and SUBHH.
     The result should be a household level file with all individual
     (family/helper/household  member) variables strung out on each
     line.


5.   Merge the resultant files from steps 2, 3, and 4 together.
     5a.  Sort the resultant files by HHID and SUBHH.
     5b.  Merge the resultant files together by HHID and SUBHH.
     The result should be a household level file, with individual
     (respondent) and individual (family/helper/household member)
     data strung out on each household's record.
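
Steps 4b through 4f, which turn a person-level file into a household-level file, can be sketched in Python as follows. This is a minimal illustration, assuming every record carries the same variable names; the variable KIDAGE, the identifier values, the numeric suffix scheme, and the helper to_household_level are hypothetical. Weight variables are left out for brevity.

```python
from collections import defaultdict

# Hypothetical child-level records, already sorted by HHID and SUBHH (step 4a).
children = [
    {"HHID": "89123", "SUBHH": "0", "PN": "101", "KIDAGE": 30},
    {"HHID": "89124", "SUBHH": "0", "PN": "101", "KIDAGE": 28},
    {"HHID": "89124", "SUBHH": "0", "PN": "102", "KIDAGE": 25},
]

KEYS = ("HHID", "SUBHH")  # identification variables that are never renamed


def to_household_level(person_records, keys=KEYS):
    """Return one record per household with person variables strung out."""
    grouped = defaultdict(list)
    for rec in person_records:
        grouped[tuple(rec[k] for k in keys)].append(rec)

    # Step 4b: the maximum number of persons in any household.
    max_persons = max(len(people) for people in grouped.values())

    wide = []
    for key, people in grouped.items():
        row = dict(zip(keys, key))
        # Steps 4c-4f: the nth person's variables get the suffix _n.
        for n in range(1, max_persons + 1):
            person = people[n - 1] if n <= len(people) else None
            for name in person_records[0]:
                if name in keys:
                    continue
                row[f"{name}_{n}"] = person[name] if person else None
        wide.append(row)
    return wide


for row in to_household_level(children):
    print(row)
```

Households with fewer persons than the maximum get empty (None) values in the unused slots, so every household record has the same set of variables.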

4-g-4 Merging with HRS Wave 1

Longitudinal merges are basically the same as cross-sectional merges except that you need to make sure your variable names do not overlap and thus overwrite each other. That should not generally be a problem in the HRS, as Wave 1 variables are preceded by a "V" and most Wave 2 variables are preceded by a "W".

Because HRS Wave 1 was distributed prior to a revision in how we think about our identification variables, it uses a different individual-level identification variable for respondents. The variable that uniquely identifies HRS Wave 1 respondents is W1CASE. Fortunately, both W1CASE and HHIDPN are present in the HRS Tracker File (see Part 4-f-1). Thus, the HRS Tracker File can be used to add HHIDPN to HRS Wave 1, or W1CASE to HRS Wave 2, solving the problem.
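
Using the Tracker File as a crosswalk between the two identifiers might look like the following Python sketch. All identifier values and the variable names V_EXAMPLE and W_EXAMPLE are made up for illustration; only W1CASE and HHIDPN come from the HRS files.

```python
# Hypothetical extract of the HRS Tracker File: one row per person,
# carrying both identifiers. All values are made up.
tracker = [
    {"HHIDPN": "78912010", "W1CASE": "100010"},
    {"HHIDPN": "78913010", "W1CASE": "100020"},
]

# Hypothetical Wave 1 records keyed by W1CASE.
wave1 = [
    {"W1CASE": "100010", "V_EXAMPLE": 1},
    {"W1CASE": "100020", "V_EXAMPLE": 2},
]

# Hypothetical Wave 2 records keyed by HHIDPN.
wave2 = [
    {"HHIDPN": "78912010", "W_EXAMPLE": 5},
    {"HHIDPN": "78913010", "W_EXAMPLE": 6},
]

# Build the crosswalk and add HHIDPN to each Wave 1 record.
case_to_hhidpn = {t["W1CASE"]: t["HHIDPN"] for t in tracker}
wave1_keyed = [
    {**rec, "HHIDPN": case_to_hhidpn[rec["W1CASE"]]}
    for rec in wave1
    if rec["W1CASE"] in case_to_hhidpn
]

# Merge the two waves by HHIDPN; the V and W prefixes keep names distinct.
wave2_by_id = {rec["HHIDPN"]: rec for rec in wave2}
longitudinal = [
    {**rec, **wave2_by_id[rec["HHIDPN"]]}
    for rec in wave1_keyed
    if rec["HHIDPN"] in wave2_by_id
]
print(longitudinal)
```

This sketch keeps only persons found in both waves; an analysis that needs unmatched cases would have to handle them separately.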

Otherwise, dealing with HRS Wave 1 data is mostly similar to dealing with HRS Wave 2 data.

5. Using the Files

5-a Setup

While a particular setup is not required for using the HRS files, we do recommend the following.

Create the directory C:\HRS\WAVE2 on your hard drive.

Copy all of the HRS Wave 2 files you retrieved from the HRS Web Page to C:\HRS\WAVE2.

By using this directory structure, you ensure that HRS2.BAT will work appropriately with your files and that you will not have to change the path names in your descriptor files. This method also keeps your files well organized and makes user support easier.

5-b Decompressing the Files

First, verify that the files you copied from the HRS Web Page are in directory C:\HRS\WAVE2.

At this point, you may choose to decompress the files by one of two methods: using HRS2.BAT, or by doing it yourself.

5-b-1 Decompressing Files Using HRS2.BAT

We have provided a DOS batch utility called HRS2.BAT to aid in decompression of your files. You may run the utility from DOS, or launch it using the Run command in Windows.

The first screen identifies the program and announces that you may press the [CTRL] and [BREAK] keys at any time to halt the program. A brief reminder of responsible use of the data is included. You are then prompted to press a key to continue.

The second screen reminds you that you need to have created the directory C:\HRS\WAVE2 on your hard drive. It also lists the compressed (self-extracting) files that you may decompress with HRS2.BAT and what they contain. If you have completed the instructions as stated, you may press 'Y' (yes) to go on. Pressing 'N' (for no) will result in the program halting execution so you may make the appropriate changes and run it again.

The third screen of HRS2.BAT is the one in which you choose which of the files you wish to decompress. The files, along with their compressed size, decompressed size, and number of decompressed files are in the table below. Make sure you have enough space on your hard drive to decompress the files you need.

When the computer prompts you to extract (decompress) a particular file type, answer 'Y' (yes) if you wish to do so and 'N' (no) if you do not. You do not need to decompress file types you do not plan on using. For example, if you do not plan to use the Wave 2 Interview, you need not decompress the associated file "IVIEW.EXE". However, while you can choose not to decompress files of a particular type, once you choose a type of file to decompress, with HRS2.BAT you must decompress all files of that type.



                                Size
                  #  -------------------------
File          Files    Compressed Decompressed  Contents
------------  -----  ------------ ------------  -------------------
CODEBOOK.EXE     32       323,905    2,376,446  Codebook files
DATA.EXE         31     4,487,889   74,491,950  Data files
EXTRACT.EXE      31        71,791      304,928  EXTRACT descriptors
IVIEW.EXE        28       459,108    2,247,699  Wave 2 Interview
OSIRIS.EXE       31        69,351      299,956  OSIRIS descriptors
SAS.EXE          62        88,553      259,520  SAS descriptors
SPSS.EXE         62        88,606      256,370  SPSS descriptors
STATA.EXE        62        76,259      294,195  STATA descriptors
------------  -----  ------------ ------------  -------------------
Total           339     5,665,462   80,531,064

HRS2.BAT will print 'End of program.' to the screen when it is done. Look in the appropriate subdirectories for the decompressed files; the subdirectories are listed in Part 5-c.

5-b-2 Decompressing Files Yourself

An advantage to decompressing files yourself is that you can decompress only the files you need.

HRS Wave 2 files that have the EXE suffix after their filename are PKZip self-extracting files. The software for decompression is already built into the files. You may use many of the same commands on these files as you would when running PKUnzip on a ZIP file. You may also use other PKZip-compliant software such as WinZIP to manipulate these files.

When decompressing files yourself, we still recommend that you decompress the files to the same subdirectory structure as used by HRS2.BAT. The recommended subdirectory structure is outlined in Part 5-c.

5-c Subdirectory Structure

After decompression, whether you used HRS2.BAT or decompressed them yourself, the file directories should be as follows if you followed our recommendations:

Subdirectory            Is to contain
------------            -------------
C:\HRS\WAVE2\CODEBOOK   Codebook files
C:\HRS\WAVE2\DATA       Data files
C:\HRS\WAVE2\EXTRACT    EXTRACT descriptor files
C:\HRS\WAVE2\IVIEW      Wave 2 Interview
C:\HRS\WAVE2\OSIRIS     OSIRIS descriptor files
C:\HRS\WAVE2\SAS        SAS descriptor files
C:\HRS\WAVE2\SPSS       SPSS descriptor files
C:\HRS\WAVE2\STATA      STATA descriptor files

5-d Using the Files With SAS

To create a SAS system file for a particular dataset, the following three file types must be present for that dataset:

Directory                        Files
--------------------             -----
C:\HRS\WAVE2\SAS\                *.SAS
C:\HRS\WAVE2\SAS\                *.SAI
C:\HRS\WAVE2\DATA\               *.DA

Files with the suffix "SAS" are short SAS programs which you may use to make a SAS system file. Load them into SAS and submit them as is; if you followed our recommended setup and all goes well, the SAS system file should then appear in directory C:\HRS\WAVE2\SAS.

Files with the suffix "SAI" are the SAS input statements used by the SAS programs to describe the data.

Files with the suffix "DA" contain the raw data for SAS to read.

NOTE: If you do not want to read the entire dataset into SAS, you may edit the SAI file to read in only the variables you desire.

5-e Using the Files With SPSS

To create an SPSS system file for a particular dataset, the following three file types must be present for that dataset:

Directory                        Files
------------------               -----
C:\HRS\WAVE2\SPSS\               *.SPS
C:\HRS\WAVE2\SPSS\               *.SPI
C:\HRS\WAVE2\DATA\               *.DA

Files with the suffix "SPS" are short SPSS programs which you may use to make an SPSS system file. Load them into SPSS and submit them as is; if you followed our recommended setup and all goes well, the SPSS system file should then appear in directory C:\HRS\WAVE2\SPSS.

Files with the suffix "SPI" are the SPSS input statements used by the SPSS programs to describe the data.

Files with the suffix "DA" contain the raw data for SPSS to read.

NOTE: If you do not want to read the entire dataset into SPSS, you may edit the SPI file to read in only the variables you desire.

5-f Using the Files With STATA

To use STATA with a particular dataset, the following three file types must be present for that dataset:

Directory                        Files
--------------------             -----
C:\HRS\WAVE2\STATA\              *.DO
C:\HRS\WAVE2\STATA\              *.DCT
C:\HRS\WAVE2\DATA\               *.DA

Files with the suffix "DO" are short STATA programs ("do files") which you may use to read in the data. Load them into STATA and submit them as is; if you followed our recommended setup and all goes well, STATA should read the data in appropriately.

Files with the suffix "DCT" are STATA dictionaries used by STATA to describe the data.

Files with the suffix "DA" contain the raw data for STATA to read.

NOTE: Due to STATA's unique method of memory management, sometimes STATA has trouble reading in exceptionally large files. To aid in overcoming this problem, or if you do not want to read the entire dataset into STATA, you may edit the DCT file to read in only the variables you desire.

5-g Using the Files With Other Software

Using the data with software other than SAS, SPSS, or STATA requires that you be able to describe the raw data files located in subdirectory C:\HRS\WAVE2\DATA. The raw data should all be numeric, and are stored in ASCII text format with fixed-length records.

Five basic types of descriptors are needed to describe the variables in the HRS Wave 2 raw data files:

1. Variable name (Var Name)
2. Variable label (Var Label)
3. Column (Column)
4. Width (Width)
5. Number of implicit decimals (D)
   [For example, if the number of implicit decimals is
   "2", a data point stored as "202" should be read by
   the software package as "2.02".]

All of these descriptors are stored in a convenient table as part of the EXTRACT descriptor statements (the ones with the "EDI" extension) under subdirectory C:\HRS\WAVE2\EXTRACT.

You may use the EXTRACT descriptors to create and edit a set of data descriptor statements appropriate for reading the data into a software package of your choice.
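
As an illustration of how the five descriptors are used, the following Python sketch slices one fixed-length record into named fields. The descriptor values and the sample record are hypothetical, not taken from any EDI file, and columns are assumed to be counted from 1, which should be checked against the actual descriptors.

```python
from decimal import Decimal

# Hypothetical descriptors: (name, starting column, width, implicit decimals).
# Columns are assumed to be 1-based; verify this against the EDI files.
DESCRIPTORS = [
    ("HHID", 1, 5, 0),
    ("PN", 6, 3, 0),
    ("VALUE", 9, 3, 2),
]


def parse_record(line, descriptors=DESCRIPTORS):
    """Slice one fixed-length record into named numeric fields."""
    record = {}
    for name, column, width, decimals in descriptors:
        text = line[column - 1 : column - 1 + width].strip()
        if not text:
            record[name] = None  # blank field
        elif decimals:
            # Shift the decimal point left by the number of implicit decimals.
            record[name] = Decimal(text).scaleb(-decimals)
        else:
            record[name] = int(text)
    return record


print(parse_record("78912010202"))
```

Here the last three characters, "202", are read with two implicit decimals and become 2.02, matching the example in the descriptor list above. Decimal is used rather than float so that the stored digits are preserved exactly.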

6. Data Description

Respondents. There were a total of 13,006 persons that were eligible to give an interview in HRS Wave 2. 11,596 persons gave an interview, and 1,410 did not.

[NOTE: The 136 HRS Wave 1 respondents that were given over to the AHEAD sample prior to HRS Wave 2 were not eligible to be interviewed by HRS as of HRS Wave 2, and are not present in Table 6-1.]


Table 6-1. Respondents


+-------------------------------------------+--------+
| Not interviewed at HRS Wave 2             |  1,375 |
| Not interviewed at HRS Wave 2; new spouse |     35 |
| SUB-TOTAL                                 |  1,410 |
+-------------------------------------------+--------+
| Interviewed at HRS Wave 2                 | 11,522 |
| Interviewed at HRS Wave 2; new spouse     |     74 |
| SUB-TOTAL                                 | 11,596 |
+-------------------------------------------+--------+
| TOTAL                                     | 13,006 |
+-------------------------------------------+--------+

Households. There were 7,227 cross-sectional households at HRS Wave 2. This was determined by calculating the number of unique HHID+W2SUBHHs.
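
Counting households as unique HHID+W2SUBHH combinations can be sketched in Python as follows. The records and identifier values below are made up for illustration and do not come from the HRS files.

```python
# Hypothetical respondent-level records; identifier values are made up.
records = [
    {"HHID": "78912", "W2SUBHH": "0"},
    {"HHID": "78913", "W2SUBHH": "0"},
    {"HHID": "78913", "W2SUBHH": "0"},
    {"HHID": "78914", "W2SUBHH": "1"},
    {"HHID": "78914", "W2SUBHH": "2"},
]

# Each distinct HHID + W2SUBHH pair is one cross-sectional household.
households = {(r["HHID"], r["W2SUBHH"]) for r in records}
print(len(households))  # 4
```

In this sketch, the two records that share HHID 78914 but differ in W2SUBHH are counted as two separate households.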

As shown in Table 6-2, there were 2,494 single households in which the respondent gave an interview. Of 4,733 paired households, in 4,369 both respondents gave an interview, and in 364 only one respondent gave an interview.



Table 6-2.  Households


+-------------------------------------------+--------+
| Single household, gave interview          |  2,494 |
+-------------------------------------------+--------+
| Paired household, both gave interview     |  4,369 |
| Paired household, only one gave interview |    364 |
| SUB-TOTAL                                 |  4,733 |
+-------------------------------------------+--------+
| TOTAL                                     |  7,227 |
+-------------------------------------------+--------+

Financial Respondents. Of the 7,227 HRS Wave 2 households, 6,979 had a Financial Respondent, and 248 did not.

Households may be missing a Financial Respondent for a variety of reasons, including non-response, interviewer error, and errors in the instrument.


Table 6-3.  Financial Respondents


+-------------------------------------------+--------+
| No Financial Respondent in household      |        |
| Single household, gave interview          |    156 |
| Paired household, both gave interview     |      5 |
| Paired household, only one gave interview |     87 |
| SUB-TOTAL                                 |    248 |
+-------------------------------------------+--------+
| One Financial Respondent in household     |        |
| Single household, gave interview          |  2,338 |
| Paired household, both gave interview     |  4,364 |
| Paired household, only one gave interview |    277 |
| SUB-TOTAL                                 |  6,979 |
+-------------------------------------------+--------+
| TOTAL                                     |  7,227 |
+-------------------------------------------+--------+

Family Respondents. Of the 7,227 HRS Wave 2 households, 6,915 had a single Family Respondent, 10 had two Family Respondents, and 302 had no Family Respondents.

Households may be missing a Family Respondent for a variety of reasons, including non-response, interviewer error, and errors in the instrument.

Households that have two Family Respondents are also in error, likely due to interviewer error or errors in the instrument. Because the responses given by the two Family Respondents were sometimes inconsistent, the data for both were retained, and the inconsistencies should be addressed by analysts prior to using family data.


Table 6-4.  Family Respondents


+-------------------------------------------+--------+
| No Family Respondent in household         |        |
| Single household, gave interview          |    144 |
| Paired household, both gave interview     |      9 |
| Paired household, only one gave interview |    149 |
| SUB-TOTAL                                 |    302 |
+-------------------------------------------+--------+
| One Family Respondent in household        |        |
| Single household, gave interview          |  2,350 |
| Paired household, both gave interview     |  4,350 |
| Paired household, only one gave interview |    215 |
| SUB-TOTAL                                 |  6,915 |
+-------------------------------------------+--------+
| Two Family Respondents in household       |        |
| Single household, gave interview          |      0 |
| Paired household, both gave interview     |     10 |
| Paired household, only one gave interview |      0 |
| SUB-TOTAL                                 |     10 |
+-------------------------------------------+--------+
| TOTAL                                     |  7,227 |
+-------------------------------------------+--------+

Variables and Records Per File. Table 6-5 lists the number of variables and records located in each file in the HRS Wave 2 data set.

From the number of records, you can tell how many persons or households (as appropriate) are in each file.


Table 6-5.  Number of Variables and Records Per File


Data File     Variables  Records  Contains
-----------   ---------  -------  ---------------------------------
W2A.DA               75   11,596  Section A: Demographics
W2B.DA              176   11,596  Section B: Health
W2C.DA               82   11,596  Section C: Cognition
W2CS.DA              56   13,006  Coversheet Data
W2D.DA              140    6,979  Section D: Housing
W2E.DA               72   11,596  Section E: Family Structure
W2FA.DA             568   11,596  Section FA: (Employees)
W2FB.DA             460   11,596  Section FB: (Self-Employed)
W2FC.DA             228   11,596  Section FC: (Unemployed)
W2G.DA               74   11,596  Section G: Last Job, R Not Working
W2H.DA              200   11,596  Section H: Job History
W2HHLIST.DA          17   21,635  Coversheet: Household Listing
W2J.DA              141   11,596  Section J: Disability
W2K.DA              108    6,979  Section K: Net Worth
W2KIDS.DA            36   22,930  Section E: Children file
W2MOD0.DA            57      222  Module 0: Activities and Nutrition
W2MOD1.DA            16      815  Module 1: Depression Scale
W2MOD2.DA             9      817  Module 2: Similarities
W2MOD3.DA            19      771  Module 3: Physical Functioning
W2MOD4.DA            52    1,561  Module 4: Spending and Saving
W2MOD5.DA             7      801  Module 5: Risk Aversion
W2MOD6.DA            32      203  Module 6: Social Support
W2MOD7.DA             8      827  Module 7: Transfers
W2MOD8.DA            40      822  Module 8: Help with ADLs
W2MOD9.DA            28      179  Module 9: Activities and Time Use
W2N.DA              577    6,979  Section N: Income
W2PARS.DA            41   12,247  Section E: Parents file
W2R.DA               77   11,596  Section R: Health Insurance
W2S.DA               78   11,596  Section S: Widowhood
W2SIBS.DA            54   17,880  Section E: Siblings file
W2V.DA               68    6,979  Section V: Capital Gains
-----------   ---------  -------  ---------------------------------
TOTAL             3,596

Notes:

The coversheet file has 13,006 records, one for each potential HRS Wave 2 respondent.
Files with 11,596 records have a record for each respondent.
Files with 6,979 records are financial files, with a record for each household that had a Financial Respondent.
Modules are only asked of a portion of all respondents.
All other files (household listing, kids, sibs, parents) have one household or family member per line.

6-a Masking for Confidentiality
The Health and Retirement Study is dedicated to maintaining the confidentiality of study respondents. For that reason, a number of variables have been recoded, removed, or set to zero for the public release data set. A record of some of those changes follows:

Names, addresses, and similar variables are not present.
Days of birth have been set to zero or removed altogether.
State level geography data is recoded in all places to a level no more detailed than U.S. Census Region and Division.
Data on the highest educational degree earned has been further grouped together to increase cell sizes.
Industry and occupation codes were recoded into a group of thirteen and seventeen codes, respectively, from the original three-digit U.S. Census code. The codes themselves may be found as part of the codebook.

7. If You Have Special Needs or Problems

If you have any special requests or needs, feel free to contact the HRS Staff (for contact information, see Part 2 of this document). We will do our best to help. Suggestions and/or questions concerning data content, codes, methodology, and research-related topics are particularly welcome. Technical support beyond problems with the files themselves is limited.

Last change: 17 November 2004

Data File	Content
W2CS	Household and Individual Coversheet Data
W2HHLIST	Coversheet: Household Listing
W2A	Section A: Demographics, and Miscellaneous
W2B	Section B: Health
W2C	Section C: Cognition
W2D	Section D: Housing
W2E	Section E: Family Structure
W2KIDS	Section E: Children file
W2PARS	Section E: Parents file
W2SIBS	Section E: Siblings file
W2FA	Section FA: Employment (Employees)
W2FB	Section FB: Employment (Self-Employed)
W2FC	Section FC: Employment (Unemployed)
W2G	Section G: Last Job, R Not Working Now
W2H	Section H: Job History
W2J	Section J: Disability
W2K	Section K: Net Worth
W2N	Section N: Income
W2R	Section R: Health Insurance
W2S	Section S: Widowhood
W2V	Section V: Capital Gains
W2MOD0	Experimental Module 0: Activities and Nutrition
W2MOD1	Experimental Module 1: Depression Scale
W2MOD2	Experimental Module 2: Similarities
W2MOD3	Experimental Module 3: Physical Functioning
W2MOD4	Experimental Module 4: Spending and Saving
W2MOD5	Experimental Module 5: Risk Aversion
W2MOD6	Experimental Module 6: Social Support
W2MOD7	Experimental Module 7: Transfers
W2MOD8	Experimental Module 8: Help with ADLs
W2MOD9	Experimental Module 9: Activities and Time Use

Questionnaire Files	Codebook Files	Content
*1	01_W2MAS	Master Codes
01_W2CS.WP5	02_W2CS	Household and Individual Coversheet Data
02_W2A.WP5	03_W2A	Section A: Demographics, and Miscellaneous
03_W2B.WP5	04_W2B	Section B: Health
04_W2D.WP5	05_W2D	Section D: Housing
05_W2E.WP5	06_W2E	Section E: Family Structure
06_W2EE.WP5	*2	Section EE: Family Structure
*3	19_W2KID	Children Information from Section E
*3	20_W2SIB	Sibling Information from Section E
*3	21_W2PAR	Parent Information from Section E
*3	22_W2HHL	Household Listing from Section E
07_W2FA.WP5	07_W2FA	Section FA: Employment (Employees)
08_W2FB.WP5	08_W2FB	Section FB: Employment (Self-Employed)
09_W2FC.WP5	09_W2FC	Section FC: Employment (Unemployed)
10_W2J.WP5	10_W2J	Section J: Disability
11_W2K.WP5	11_W2K	Section K: Net Worth
12_W2V.WP5	12_W2V	Section V: Capital Gains
13_W2C.WP5	13_W2C	Section C: Cognition
14_W2N.WP5	14_W2N	Section N: Income
15_W2R.WP5	15_W2R	Section R: Health Insurance
16_W2S.WP5	16_W2S	Section S: Widowhood
17_W2G.WP5	17_W2G	Section G: Last Job, R Not Working Now
18_W2H.WP5	18_W2H	Section H: Job History
19_W2MD0.WP5	23_W2MD0	Module 0: Activities and Nutrition
20_W2MD1.WP5	24_W2MD1	Module 1: Depression Scale
21_W2MD2.WP5	25_W2MD2	Module 2: Similarities
22_W2MD3.WP5	26_W2MD3	Module 3: Physical Functioning
23_W2MD4.WP5	27_W2MD4	Module 4: Spending and Saving
24_W2MD5.WP5	28_W2MD5	Module 5: Risk Aversion
25_W2MD6.WP5	29_W2MD6	Module 6: Social Support
26_W2MD7.WP5	30_W2MD7	Module 7: Transfers
27_W2MD8.WP5	31_W2MD8	Module 8: Help with ADLs
28_W2MD9.WP5	32_W2MD9	Module 9: Activities and Time Use
*1 - The Master Codes file contains large codeframes and is referred to as needed in order to save space and avoid repetition.
*2 - Section EE has its own questionnaire sub-section, but is combined into the other Section E portions of the codebook.
*3 - The coversheet household listing and select parts of Section E (Family) were broken out into separate files after collection. The questionnaire represents how they were actually collected, and the codebook indicates how they ended up.
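
Because the questionnaire and codebook files are numbered differently, a small lookup can help when moving between them. The following is a minimal Python sketch built from a few rows copied from the table above. It assumes the rows are available as tab-separated text and treats any cell beginning with an asterisk as a footnote marker rather than a file name. The names ROWS, parse_row, and by_questionnaire are illustrative only.

```python
# A few rows copied from the table above (tab-separated):
# questionnaire file, codebook file, content.
ROWS = """\
*1\t01_W2MAS\tMaster Codes
01_W2CS.WP5\t02_W2CS\tHousehold and Individual Coversheet Data
06_W2EE.WP5\t*2\tSection EE: Family Structure
*3\t19_W2KID\tChildren Information from Section E
19_W2MD0.WP5\t23_W2MD0\tModule 0: Activities and Nutrition
"""


def parse_row(line):
    """Split one row; a cell starting with '*' is a footnote marker, not a file."""
    questionnaire, codebook, content = line.split("\t")
    return {
        "questionnaire": None if questionnaire.startswith("*") else questionnaire,
        "codebook": None if codebook.startswith("*") else codebook,
        "content": content,
    }


table = [parse_row(line) for line in ROWS.splitlines()]

# Map each questionnaire file to its codebook file (None if it has no own file).
by_questionnaire = {
    row["questionnaire"]: row["codebook"] for row in table if row["questionnaire"]
}
print(by_questionnaire["19_W2MD0.WP5"])
```

A value of None, as for 06_W2EE.WP5, signals that the footnotes should be consulted to find where that material appears in the codebook.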

HRS 1994 (Wave 2) Final Release Codebook

Table of Contents