Consumer Expenditures and Income
Consumer expenditure surveys are specialized studies in which the primary emphasis is on data related to family expenditures for goods and services used in day-to-day living. The Consumer Expenditure Survey (CE) of the Bureau of Labor Statistics (BLS, the Bureau) also collects information on the amount and sources of family income, changes in assets and liabilities, and demographic and economic characteristics of family members.
The Bureau's studies of family living conditions rank among its oldest data-collecting functions. The first nationwide expenditure survey was conducted during 1888–1891 to study workers' spending patterns as elements of production costs. With special reference to competition in foreign trade, the survey emphasized the worker's role as a producer, rather than as a consumer. In response to rapid price changes prior to the turn of the 20th century, a second survey was administered in 1901. The resulting data provided the weights for an index of prices of food purchased by workers that was used as a deflator for workers' incomes and expenditures for all kinds of goods until World War I. A third survey, conducted during 1917–1919, provided weights for computing a cost-of-living index, now known as the Consumer Price Index (CPI). The Bureau conducted its next major survey, covering only urban wage earners and clerical workers, during 1934–1936, primarily to revise CPI weights.
During the Great Depression of the 1930s, the use of consumer expenditure surveys extended from the study of the welfare of selected groups to more general economic analysis. Concurrent with its 1934–1936 investigation, the Bureau cooperated with four other Federal agencies in a fifth survey, the 1935–1936 study of consumer purchases, which presented consumption estimates for both urban and rural segments of the population. The sixth survey, in 1950, covered only urban consumers; it was an abbreviated version of the 1935–1936 study. The seventh survey, the 1960–1961 Survey of Consumer Expenditures, once again included both urban and rural families, and provided the basis for revising the CPI weights, while supplying material for broader economic, social, and market analyses.
The next major survey to collect information on expenditures of householders in the United States was conducted in 1972–1973. That survey, while providing continuity with the content of the Bureau's previous surveys, departed from the past in its collection techniques. Unlike earlier surveys, the U.S. Census Bureau, under contract to BLS, conducted all sample selection and field work. Another significant change was the use of two independent surveys to collect the information—a diary survey and an interview panel survey. A third major change was the switch from an annual recall to a quarterly recall (in the Interview Survey) and daily recordkeeping of expenditures (in the Diary Survey). Again, the resulting data were used to revise CPI weights.
The need for more timely data than could be supplied by surveys conducted every 10 to 12 years—intensified by the rapidly changing economic conditions of the 1970s—led to the initiation of the current continuing survey in late 1979. Since then, data have been available annually. The objectives of the CE remain the same: to provide the basis for revising weights and associated pricing samples for the CPI and to meet the need for timely and detailed information on the spending patterns of different types of families.
Like the 1972–1973 survey, the current survey consists of two separate surveys, each with a different data collection technique and sample. In the Interview Survey, each family in the sample is interviewed every 3 months over five calendar quarters. The sample for each quarter is divided into three panels, with consumer units being interviewed every 3 months in the same panel of every quarter. The Diary (or recordkeeping) Survey is completed by the respondent family for two consecutive 1-week periods.
The sample housing unit is notified in advance by a letter informing the occupants about the purpose of the survey and the upcoming visit by the interviewer. Both the Interview and the Diary Survey are conducted primarily by personal visits with some telephone usage. The interviewer uses a structured questionnaire to collect both the demographic and expenditure data in the Interview Survey. The demographic data for the Diary Survey are collected by the interviewer, whereas the expenditure data are entered on the diary form by the respondent. If, after attempts to contact the household, no adult is available, both surveys accept responses from any eligible household member who is at least 16 years old.
The unit for which expenditure reports are collected is the set of eligible individuals constituting a consumer unit that is defined as 1) all members of a particular housing unit who are related by blood, marriage, adoption, or some other legal arrangement, such as foster children; 2) a person living alone or sharing a household with others, or living as a roomer in a private home, lodging house, or in permanent living quarters in a hotel or motel, but who is financially independent; or 3) two or more unrelated persons living together who pool their income to make joint expenditure decisions. Students living in university-sponsored housing are also included in the sample as separate consumer units.
Survey participants report dollar amounts for goods and services purchased by any member of the consumer unit during the reporting period, whether payment was or was not made at the time of purchase. Expenditure amounts include all sales and excise taxes for all items purchased by the consumer unit. Excluded from both surveys are all business-related expenditures and expenditures for which the family is reimbursed.
The Interview Survey collects detailed data on an estimated 60 to 70 percent of total family expenditures. In addition, global estimates—that is, estimated average expenditures for a 3-month period—are obtained for food and other selected items. These global estimates account for an additional 20 to 25 percent of total expenditures. On average, it takes 60 minutes to complete the interview.
In the Diary Survey, detailed data are collected on all expenditures made by consumer units during their 2-week participation in the survey. It takes approximately 25 minutes over three visits for the interviewer to collect the demographic data and to instruct the respondent on how to keep the diary. It is estimated that it takes the respondent 15 minutes each day to complete the diary.
Quality control is provided by a re-interview program, which constitutes a means of evaluating the performance of the individual interviewer to determine how well the procedures are being carried out in the field. The re-interview is conducted by a senior field representative. A subsample of approximately 12 percent of households in the Interview Survey and 10 percent in the Diary Survey is re-interviewed on an ongoing basis.
All data collected in both surveys are subject to Census Bureau and BLS confidentiality requirements that prevent the disclosure of the respondents' identities. The information that respondents provide is used solely for statistical purposes. All Census Bureau data collectors take an oath of confidentiality and are subject to fines and imprisonment for improperly disclosing information provided by respondents. Names and addresses are removed from all forms and datasets prior to transmission from the Census Bureau to BLS and are not included in any statistical releases. At BLS, the data are processed and stored on secure servers, with access limited to employees having security clearances. As a further precaution, BLS applies certain restrictions to the microdata shown on the public-use CD-ROMs. These include geographical and value restrictions that prevent identification of respondents.
The Interview Survey is designed to collect data on the types of expenditures that respondents can be expected to recall for a period of 3 months or longer. In general, expenditures reported in the Interview Survey are either relatively large, such as those for property, automobiles, or major appliances, or expenditures that occur on a fairly regular basis, such as for rent, utility bills, or insurance premiums.
Each occupied sample unit is interviewed once per quarter for five consecutive quarters. After the fifth interview, the sample unit is dropped from the survey and replaced by a new sample unit. For the survey as a whole, 20 percent of the sample is replaced each quarter. New families are introduced into the sample on a regular basis, as other families complete their participation. Data collected in each quarter are treated independently, so that estimates are not dependent upon a particular family participating in the survey for a full five quarters.
For the initial interview, information is collected on demographic and family characteristics and on the inventory of major durable goods of each consumer unit. Expenditure information is also collected in this interview, using a 1-month recall, but is used—along with the inventory information—solely for bounding purposes, that is, to classify the unit for analysis and to prevent duplicate reporting of expenditures in subsequent interviews.
The second through fifth interviews use uniform questionnaires to collect expenditure information in each quarter. Data collected in these questionnaires, which are arranged by major expenditure component (for example, housing, transportation, medical, and education), form the basis of the expenditure estimates derived from the Interview Survey. Wage, salary, and other information on the employment of each member of a consumer unit is also collected or updated during each of these interviews. Expenditure data are collected via two major types of questions. The first set of questions asks the respondent for the month of purchase of each reported expenditure. The second asks for a quarterly amount of expenditures. The use of these two questions varies, depending on the types of expenditures collected. Approximately 65 percent of the data are collected using the direct monthly method, whereas about 35 percent are collected with the quarterly recall approach.
In the fifth and final interview, an annual supplement is used to obtain a financial profile of the consumer unit. This profile consists of information on the income of the consumer unit as a whole, including unemployment compensation; income from royalties, dividends, and estates; and alimony and child support. A 12-month recall period is used in the collection of income- and asset-type data.
The primary objective of the Diary Survey is to obtain expenditures data on small, frequently purchased items, which are normally difficult to recall. These items include food and beverage expenditures at home and in eating places; housekeeping supplies and services; nonprescription drugs; and personal care products and services. The Diary Survey is not limited to these types of expenditures, but, rather, includes all expenses that the consumer unit incurs during the survey week. Expenses incurred by family members while away from home overnight and for credit and installment plan payments are excluded.
Two separate questionnaires are used to collect Diary data: a Household Characteristics Questionnaire and a Record of Daily Expenses. In the Household Characteristics Questionnaire, the interviewer records information pertaining to age, sex, race, marital status, and family composition, as well as information on the work experience and earnings of each member of the consumer unit. This socioeconomic information is used by the Bureau to classify the consumer unit for publication of statistical tables, as well as for economic analysis. Data on household characteristics also provide the link in the integration of Diary expenditure data with Interview expenditure data that permits the publication of a full profile of consumer expenditures by demographic characteristics.
The daily expense record is designed as a self-reporting, product-oriented diary in which respondents record a detailed description of all expenses for two consecutive 1-week periods. Data collected each week are considered statistically independent. The diary is divided by day of purchase and by four classifications of goods and services—food away from home, food at home, clothing, and all other goods and services—a breakdown designed to aid the respondent in recording the entire consumer unit's daily purchases. The items reported are subsequently coded by the Census Bureau so BLS can aggregate individual purchases for representation in the CPI and for presentation in statistical tables.
Integrated survey data
Integrated data from the BLS Diary and Interview Surveys provide a complete accounting of consumer expenditures and income, which neither survey component alone is designed to do. Some expenditure items are collected only by the Diary or Interview Survey. For example, the Diary collects data on detailed food expenditures and items, such as postage and nonprescription drugs, which are not collected in the Interview. The Interview collects data on expenditures for overnight travel and information on reimbursements, such as for medical-care costs or automobile repairs, which are not collected in the Diary. Data on average annual expenditures that come exclusively from the Interview Survey, including global estimates, such as those for food and alcoholic beverages, average about 95 percent of the total estimated spending, based on integrated Diary and Interview data. For items unique to one or the other survey, the choice of which survey to use as the source of data is obvious. However, there is considerable overlap in coverage between the surveys. Because of the overlap, the integration of the data presents the problem of determining the appropriate survey component from which to select the expenditure items. When data are available from both survey sources, the more reliable of the two is selected, as determined by statistical methods. The selection of the survey source is evaluated every 2 years.
Data collection and processing
Due to differences in format and design, the Interview Survey and the Diary Survey are collected and processed separately. The U.S. Census Bureau, under contract with BLS, carries out data collection for both surveys. In addition to its collection duties, the Census Bureau does field editing and coding, checks consistency, ensures quality control, and transmits the data to BLS. In preparing the data for analysis and publication, BLS performs additional review and editing procedures.
Quarterly Interview Survey. Beginning in April 2003, Census field representatives (FRs) started collecting the Interview data using a Computer Assisted Personal Interview (CAPI) instrument. This was a major improvement over the paper-and-pencil data collection that had been in place since 1980. The CAPI instrument enforces question skip patterns, allows for data confirmation of high expenditure values, and reduces processing time. The FR performs some coding of expenses, by selecting from a predetermined list, for vehicle make and model, trip destination, and job types for alterations, maintenance, and repair.
Data are electronically transferred from an FR's laptop at completion of the interview to the Census Master Control System. The Census Bureau's Demographic Surveys Division then reformats the data into datasets and does some special processing for output to BLS (such as converting missing values to special characters and merging data records into the required BLS output structure). Some data, like vehicle and mortgage records, are copied into an input file that is loaded on the laptops for subsequent interviews the next quarter. This way, only a few fields are updated each quarter, and the entire data record does not have to be re-collected. (As mentioned earlier, names and addresses of respondents are not transmitted.)
At BLS, a series of automated edits is applied to monthly data. These edits check for inconsistencies, identify missing expenditure amounts for later imputation, impute missing demographic variables, calculate weights, and adjust data to include sales tax and exclude business expenses or reimbursed expenditures.
Monthly data files are then combined into quarterly databases, and a more extensive data review is carried out. During this review, BLS verifies counts and means by region, checks for inconsistencies in family relationship coding, and inspects selected extreme values for expenditure and income categories. Other adjustments convert mortgage and vehicle payments into principal and interest (using associated data on the interest rate and term of the loan). In addition, BLS verifies the various data transformations it performs. Cases of questionable data values or relationships are investigated, and errors are corrected prior to release of the data for public use.
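The conversion of a loan payment into principal and interest can be illustrated with the standard fixed-rate amortization arithmetic. The sketch below is only an illustration under that assumption (level monthly payments, a fixed annual rate, monthly compounding); the function names and the example loan are hypothetical and do not describe the BLS production routine.

```python
def level_payment(loan_amount, annual_rate, term_months):
    """Level monthly payment implied by the loan amount, rate, and term."""
    r = annual_rate / 12.0  # monthly interest rate
    if r == 0:
        return loan_amount / term_months
    return loan_amount * r / (1.0 - (1.0 + r) ** -term_months)


def split_payment(balance, annual_rate, monthly_payment):
    """Split one monthly payment into its interest and principal parts.

    Assumes a fixed-rate loan with monthly compounding (an illustrative
    assumption; the source does not specify the actual routine).
    """
    interest = balance * annual_rate / 12.0  # interest accrued this month
    principal = monthly_payment - interest   # remainder reduces the balance
    return interest, principal


# Hypothetical loan: $100,000 at 6 percent for 30 years, first payment.
payment = level_payment(100_000, 0.06, 360)
interest, principal = split_payment(100_000, 0.06, payment)
print(round(payment, 2), round(interest, 2), round(principal, 2))
```

Later payments would be split the same way after reducing the balance by the principal already repaid.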
Three major types of data adjustment routines—imputation, allocation, and time adjustment—improve estimates derived from the Interview Survey. Data imputation routines account for missing or invalid entries and affect all fields in the database, except assets. Missing or invalid attributes or expenditures are imputed. Additionally, allocation routines are applied when respondents provide insufficient detail to meet tabulation requirements. For example, combined expenditures for the fuels and utilities group are allocated among the components of that group, such as natural gas and electricity. Time adjustment routines are used to classify expenditures by month, prior to aggregation of the data to calendar-year expenditures. Tabulations are made before and after data adjustment routines, to analyze the results.
The CE implemented multiple imputations of income data, starting with the publication of 2004 data. Prior to that, only income data collected from complete income reporters were published. However, even complete income reporters did not provide information on all sources of income for which they reported receipt. With the collection of bracketed income data starting in 2001, this problem was reduced—but not eliminated. One limitation is that bracketed data only provide a range in which income falls, rather than a precise value for that income. In contrast, imputation allows income values to be estimated when they are not reported. In multiple imputations, several estimates are made for the same consumer unit, and the average of these estimates is published.
The CE Interview questionnaire is revised every 2 years to incorporate new products and services and to make such changes as clarifying instructions, improving navigation through the instrument, incorporating changes required for the CPI, and streamlining the interview by deleting outdated items. Whereas changes to the questionnaire are made biennially, CE staff continuously monitor the emergence of new goods and services available in the marketplace, as well as changes in the relative importance of existing items in consumers' budgets. New items are incorporated in a product index that enables Census field representatives to classify these new items by the appropriate item codes. Also, updated information on how to report new goods and services is provided to the field representatives on a regular basis.
Diary Survey. At the beginning of the 2-week collection period, the Census Bureau interviewer, using the Household Characteristics Questionnaire (a CAPI instrument), records demographic information on members of each sampled consumer unit. At this time, the interviewer also leaves the Diary questionnaire—or daily expenditure record—with the consumer unit, to record expenditures for the week.
At the end of the first week, the interviewer collects the diary, reviews the entries, answers any questions that the respondent may have, and leaves a second diary. At the end of the second week, the interviewer picks up the second diary and reviews the entries. During this time, the interviewer again uses the Household Characteristics Questionnaire to collect previous-year information on work experience and income. Each week of a consumer unit's participation in the survey is treated as a separate occurrence.
The Census Bureau performs preliminary processing activities, including a number of data edits and adjustments. Data in the diaries are reviewed during a field edit for completeness and consistency. After the diaries are sent to the Census Bureau's National Processing Center, expenditure data captured in the diaries are key-entered into electronic formats; and a computer file of the database containing these data is produced and transmitted monthly to Census headquarters, along with image files of the diaries. Census headquarters merges the expenditure data with the data collected in the Household Characteristics Questionnaire and transmits the merged file monthly to BLS. Names and addresses of respondents are removed prior to transmitting the data.
At BLS, data are processed by computer to calculate population weights, impute demographic characteristics for missing or inconsistent demographic data, impute values for weeks worked when nonresponse is encountered, and apply appropriate sales taxes to the expenditure items.
As in the Interview Survey, three monthly Diary data files are combined into quarterly databases, and BLS screens the data for invalid coding and inconsistent relationships, as well as for extreme values recorded or keyed erroneously. BLS then corrects any coding and extreme-value errors found.
Two types of data adjustment routines, allocation and imputation, improve the Diary Survey estimates. Allocation routines transform reports of nonspecific items into specific ones. For example, when respondents report expenditures for meat rather than beef or pork, allocations are made, using proportions derived from item-specific reports in other completed diaries. BLS imputes missing attributes, such as age, sex, or package type, that are needed for mapping Diary expenditures. Income data from the Diary Survey are processed in the same way as in the Interview Survey.
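A proportional allocation of this kind can be sketched as follows. This is a minimal illustration, assuming the shares are simply proportional to totals taken from item-specific reports; the function name `allocate` and the reference totals are hypothetical, and the actual BLS routine may differ.

```python
def allocate(amount, reference_totals):
    """Spread a nonspecific expenditure across specific items.

    reference_totals maps each specific item to the total reported for it
    in item-specific reports; each item's share is proportional to its total.
    """
    total = sum(reference_totals.values())
    if total <= 0:
        raise ValueError("reference totals must sum to a positive amount")
    return {item: amount * value / total
            for item, value in reference_totals.items()}


# Hypothetical reference totals from other diaries (illustrative numbers only).
shares = allocate(20.0, {"beef": 60.0, "pork": 40.0})
print(shares)
```

The allocated amounts always sum to the original nonspecific amount, so total spending is unchanged by the routine.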
Selection of households
The Consumer Expenditure Survey is a nationwide household survey designed to represent the total U.S. civilian noninstitutional population. The selection of households begins with the definition and selection of primary sampling units (PSUs). PSUs are small clusters of counties grouped together into geographic entities called "core-based statistical areas" (CBSAs). There are two types of CBSAs: metropolitan and micropolitan. Metropolitan CBSAs are areas that have an urban "core" with 50,000 or more people, plus the adjacent areas that have a high degree of social and economic integration with the core, as measured by commuting ties. Micropolitan CBSAs are areas that have an urban core of 10,000 to 50,000 people, plus the adjacent areas that have a high degree of social and economic integration with the core, as measured by commuting ties.
The sample of PSUs currently used in the survey consists of 91 areas, of which 75 urban areas are also used by the Consumer Price Index program. The 91 PSUs are classified into four categories:
- 21 "A" PSUs, which are metropolitan CBSAs with a population over 2 million people
- 38 "X" PSUs, which are metropolitan CBSAs with a population under 2 million people
- 16 "Y" PSUs, which are micropolitan CBSAs
- 16 "Z" PSUs, which are non-CBSA areas and are often referred to as "rural" PSUs
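The four categories can be expressed as a small classification rule. The sketch below is illustrative only: the function name and argument encoding are hypothetical, and because the text does not say how a metropolitan CBSA of exactly 2 million people is classified, the code arbitrarily treats it as "X".

```python
def classify_psu(cbsa_type, population=None):
    """Return the PSU category letter for an area.

    cbsa_type: "metropolitan", "micropolitan", or None for non-CBSA areas.
    population: CBSA population, needed only for metropolitan areas.
    """
    if cbsa_type == "metropolitan":
        if population is None:
            raise ValueError("population is required for metropolitan CBSAs")
        # Assumption: exactly 2 million falls in "X" (not stated in the text).
        return "A" if population > 2_000_000 else "X"
    if cbsa_type == "micropolitan":
        return "Y"
    if cbsa_type is None:
        return "Z"
    raise ValueError(f"unknown CBSA type: {cbsa_type!r}")


# Counts of PSUs per category, as listed in the text.
PSU_COUNTS = {"A": 21, "X": 38, "Y": 16, "Z": 16}
print(classify_psu("metropolitan", 3_500_000), sum(PSU_COUNTS.values()))
```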
Within these 91 PSUs, the sampling frame (the list of addresses from which the sample is drawn) is generated from the 2000 Census 100-Percent Detail File. The sampling frame is augmented by a sample of addresses drawn from new construction permits and by extra housing units identified through coverage improvement techniques.
The population represented by the CE is the total U.S. civilian noninstitutional population, both urban and rural. The population includes people living in houses, condominiums, apartments, and group quarters such as college dormitories. Excluded are military personnel living on base, nursing home residents, and people in prisons.
The U.S. Census Bureau selects a sample of approximately 12,000 addresses per year to participate in the Diary Survey. Usable diaries are obtained from approximately 7,100 households at those addresses. (Diaries are not obtained from the other addresses due to refusals, vacancies, ineligibility, or the nonexistence of a housing unit at the selected address.) The actual placement of diaries is spread equally over all 52 weeks of the year.
The Interview Survey is a rotating panel survey, in which approximately 15,000 addresses are contacted in each calendar quarter of the year. One-fifth of the addresses contacted each quarter are new to the survey; their "bounding" interviews supply baseline data that are not used to compute the survey's published expenditure estimates. Excluding these bounding interviews and interviews not completed (due to refusals, vacancies, ineligibility, or the nonexistence of a housing unit at the selected address), usable interviews are obtained from approximately 7,100 households each quarter. After a housing unit has been in the sample for five consecutive quarters, it is dropped from the survey, and a new housing unit is selected to replace it.
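The rotation pattern can be made concrete with a short sketch. It assumes quarters are numbered consecutively and that the cohorts entering each quarter are of equal size; both the numbering and the function names are hypothetical conveniences, not part of the survey documentation.

```python
def interview_number(entry_quarter, current_quarter):
    """Interview number (1 to 5) for a housing unit, or None if not in sample.

    A unit entering in quarter q is interviewed in quarters q through q + 4.
    """
    n = current_quarter - entry_quarter + 1
    return n if 1 <= n <= 5 else None


def active_cohorts(current_quarter):
    """Entry quarters of the five cohorts interviewed in current_quarter."""
    return [current_quarter - k for k in range(5)]


quarter = 10  # any quarter index works; 10 is arbitrary
cohorts = active_cohorts(quarter)
bounding = [q for q in cohorts if interview_number(q, quarter) == 1]
# Share of cohorts giving a bounding (first) interview this quarter.
print(len(bounding) / len(cohorts))
```

With equal-sized cohorts, one of the five active cohorts is new in any quarter, which matches the one-fifth replacement described above.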
Response data for the 2009 CE are shown in table 1. For the Interview Survey, the totals refer to housing units in the second through fifth quarters of the survey (the non-bounding interviews), with each unique housing unit providing up to four usable interviews. For the Diary Survey, totals refer to housing units in weeks 1 and 2 of the survey, with each unique housing unit providing up to two usable interviews. Most Diary respondents participate for both weeks.
There are three general categories of nonresponse:
- Type A nonresponses are refusals, temporary absences, and noncontacts.
- Type B nonresponses are vacant housing units, housing units with temporary residents, and housing units under construction.
- Type C nonresponses are nonresidential addresses, such as destroyed or abandoned housing units, and housing units converted to nonresidential use.
Response rates are defined to be the percent of eligible housing units (that is, the designated sample less Type B and Type C nonresponses) from which usable interviews are collected. In the 2009 Interview Survey, there were 47,609 eligible housing units, from which 35,756 usable interviews were collected, resulting in a response rate of 75.1 percent. In the 2009 Diary Survey, there were 18,879 eligible housing units, from which 14,495 usable interviews were collected, resulting in a response rate of 76.8 percent.
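Table 1. Analysis of responses in the Consumer Expenditure Survey, 2009

| Item | Interview Survey | Diary Survey |
| Housing units designated for the survey | | |
| Less: Type B or C nonresponses | | |
| Equals: eligible units | 47,609 | 18,879 |
| Less: Type A nonresponses | 11,853 | 4,384 |
| Equals: Interview units | 35,756 | 14,495 |
| Percent of eligible units interviewed | 75.1 | 76.8 |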
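The response-rate definition can be checked against the 2009 figures quoted in the text. The sketch below is a minimal illustration; the function name is hypothetical.

```python
def response_rate(eligible_units, usable_interviews):
    """Percent of eligible housing units that yielded a usable interview."""
    if eligible_units <= 0:
        raise ValueError("eligible_units must be positive")
    return 100.0 * usable_interviews / eligible_units


# 2009 figures quoted in the text.
interview_rate = response_rate(47_609, 35_756)
diary_rate = response_rate(18_879, 14_495)
print(round(interview_rate, 1), round(diary_rate, 1))  # 75.1 76.8
```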
The estimation of population quantities of interest, such as the average expenditure per consumer unit on a particular item, is achieved through the use of weights. Each consumer unit in the survey is assigned a weight that is the number of similar consumer units in the U.S. civilian noninstitutional population the sampled consumer unit represents. Using these weights, BLS estimates the average expenditure per consumer unit on a particular item category by

\[ y = \frac{\sum_{i \in s} w_i \, y_i}{\sum_{i \in s} w_i} \]

where
y = the average expenditure per consumer unit on the item category,
yi = the expenditure made by the ith consumer unit on the item category,
wi = the weight of the ith consumer unit in the sample, and
s = the sample of consumer units that participated in the survey.
For example, if yi is the expenditure on butter made by the ith consumer unit in the sample during a given time period, then y is an estimate of the average expenditure on butter made by all consumer units in the U.S. civilian noninstitutional population during that time period.
If one wants to estimate the proportion of consumer units that purchased butter during a given time period, then the same formula is applied, where yi is set equal to 1 if the ith consumer unit purchased butter during the time period, and 0 if it did not. When this 1/0 definition of yi is used, y is an estimate of the proportion of all consumer units in the U.S. civilian noninstitutional population that purchased butter during the given time period.
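Both uses of the estimator, the average expenditure and the proportion of purchasers, amount to a weighted mean. The sketch below illustrates this with made-up numbers; the function name, the expenditures, and the weights are all hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted average: sum(w_i * y_i) / sum(w_i) over the sample."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w * y for w, y in zip(weights, values)) / total_weight


# Hypothetical butter expenditures (dollars) and weights for three consumer units.
spending = [4.0, 0.0, 6.0]
weights = [10_000, 12_000, 8_000]

average_spending = weighted_mean(spending, weights)

# Proportion that purchased: recode each y_i as 1 if it bought butter, else 0.
purchased = [1 if y > 0 else 0 for y in spending]
share_purchasing = weighted_mean(purchased, weights)
print(average_spending, share_purchasing)
```

Here a positive expenditure is used as the indicator of a purchase, which is an assumption of the example rather than a rule stated in the text.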
Several factors are involved in computing the weight of each consumer unit for which a usable interview is received. Each consumer unit is initially assigned a base weight that is equal to the inverse of the consumer unit's probability of being selected for the sample. Base weights in the CE are typically around 10,000, which means that a consumer unit in the sample represents 10,000 consumer units in the U.S. civilian noninstitutional population—itself plus 9,999 other consumer units that were not selected for the sample. The base weight is then adjusted by the following factors to correct for certain nonsampling errors:
Weighting control factor. This adjusts for subsampling in the field. Subsampling occurs when a data collector visits a particular address and discovers multiple housing units where only one housing unit was expected.
Noninterview adjustment factor. This adjusts for interviews that cannot be conducted in occupied housing units, due to a consumer unit's refusal to participate in the survey or the inability to contact anyone at the sample unit in spite of repeated attempts. This adjustment is based on region of the country, household tenure (owner or renter), consumer unit size, and race of the reference person.
Calibration factor. This adjusts the weights to 24 "known" population counts to account for frame undercoverage. These "known" population counts are for age, race, household tenure (owner or renter), region of the country, and urban or rural. The population counts are updated quarterly. Each consumer unit is given its own unique calibration factor. There are infinitely many sets of calibration factors that make the weights add up to the 24 "known" population counts, and the CE selects the set that minimizes the amount of change made to the "initial weights" (initial weight = base weight × weighting control factor × noninterview adjustment factor).
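The chain of adjustments can be sketched in code. The first function follows the initial-weight product given in the text. The second is a deliberately simplified stand-in for calibration: it rescales weights to match control totals on a single margin, whereas the CE calibrates to 24 population counts simultaneously while minimizing the change to the initial weights. All names and numbers below are hypothetical.

```python
def initial_weight(base_weight, weighting_control, noninterview_adj):
    """Initial weight = base weight x weighting control x noninterview factor."""
    return base_weight * weighting_control * noninterview_adj


def ratio_calibrate(initial_weights, groups, control_totals):
    """Scale weights so each group's weights sum to its control total.

    Simplified one-margin stand-in for calibration, for illustration only;
    it is not the CE's minimum-change calibration to 24 counts.
    """
    sums = {}
    for w, g in zip(initial_weights, groups):
        sums[g] = sums.get(g, 0.0) + w
    return [w * control_totals[g] / sums[g]
            for w, g in zip(initial_weights, groups)]


# Hypothetical units: (base weight, weighting control factor, noninterview factor).
units = [(10_000, 1.0, 1.2), (10_000, 2.0, 1.1), (10_000, 1.0, 1.3)]
tenure = ["owner", "renter", "owner"]

init = [initial_weight(*u) for u in units]
final = ratio_calibrate(init, tenure, {"owner": 30_000, "renter": 20_000})
print(init, final)
```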
Precision of the estimates
The precision of the estimator y is measured by its standard error. Standard errors measure the sampling variability of the CE estimates. That is, standard errors measure the uncertainty in the survey estimates caused by the fact that a random sample of consumer units from across the United States is used instead of collecting data from every consumer unit in the nation.
Standard errors are estimated using the method of "balanced repeated replication." In this method the sampled PSUs are divided into 43 groups (called strata), and the consumer units within each stratum are randomly divided into two half samples. Half of the consumer units are assigned to one half sample, and the other half are assigned to the other half sample. Then 44 different estimates of y are created using data from only one half sample per stratum. There are many combinations of half samples that can be used to create these "replicate" estimates, and the CE uses 44 of them that are created in a "balanced" way with a 44 x 44 Hadamard matrix. The standard error of y is then estimated by
\[ SE(y) = \sqrt{\frac{1}{44} \sum_{r=1}^{44} (y_r - y)^2} \]

where y_r is the rth replicate estimate of y.
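Given the replicate estimates, the standard error is straightforward to compute. The sketch below assumes the usual balanced repeated replication form, namely the square root of the average squared deviation of the replicate estimates from the full-sample estimate. It does not construct the half samples or the Hadamard matrix, and the replicate values shown are hypothetical.

```python
import math


def brr_standard_error(full_estimate, replicate_estimates):
    """Root of the mean squared deviation of replicates from the full estimate."""
    if not replicate_estimates:
        raise ValueError("at least one replicate estimate is required")
    squared = [(y_r - full_estimate) ** 2 for y_r in replicate_estimates]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical replicate estimates; the CE uses 44 of them.
replicates = [101.0, 99.0, 102.0, 98.0]
se = brr_standard_error(100.0, replicates)
print(se)
```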
The coefficient of variation is a related measure of sampling variability and measures the variability of the survey estimate relative to the mean. The coefficient of variation, expressed in percent, is defined by the equation

\[ CV(y) = 100 \times \frac{SE(y)}{y} \]
Table 2. Precision of the Consumer Expenditure Survey expenditure estimates, integrated Diary and Interview Survey data, 2009
Expenditure category | Average annual expenditure per consumer unit, y | Standard error, SE(y) | Coefficient of variation, CV(y) (in percent)
Personal insurance and pensions
Information from the CE is available in press releases, reports, and analytical papers. Microdata from the survey are available on CD-ROM. Data are also available from the Bureau of Labor Statistics website (http://www.bls.gov/cex) and the BLS Consumer Expenditure Survey Division.
Publications from the CE generally include tabulations of average expenditures and income arrayed by consumer unit characteristics, such as consumer unit size, age of reference person, or income. Tabulations by two variables (cross-tabulations) are available for selected characteristics, such as age by income or consumer unit size by income. Integrated Diary and Interview Survey data are published on an annual basis, and tabulations starting with 1984 are available on the BLS website.
The Diary and Interview Survey microdata that are available on CD-ROM contain files of expenditure and income reports of each consumer unit. To protect the identities of respondents, selected geographic detail is eliminated, and selected income and expenditure variables may be topcoded. The Interview files contain expenditure data in two formats: MTAB files that present monthly values in an item-coding framework based on the CPI pricing scheme, and EXPN files that organize expenditures by the section of the Interview questionnaire in which they are collected. Expenditure values on the EXPN files cover different time periods, depending on specific questions asked; and files also contain relevant non-expenditure information not found on the MTAB files. The annual Interview and Diary microdata files are available on CD-ROM beginning with 1990, as well as for selected earlier years.
Articles that include analyses of CE data are published online in the Monthly Labor Review (MLR), in Focus on Prices and Spending, and in CE anthologies. Other survey information is available on the Internet, including answers to frequently asked questions, copies of the Interview and Diary Survey instruments, a glossary of terms, and order forms for survey products. Starting with the 2000 data, estimates of standard errors for integrated Diary and Interview Survey data are available on the BLS website.
Consumer expenditure surveys undergo continuous evaluation by comparing results with other data and by performing internal statistical, qualitative, and cognitive analyses to address current methodological concerns. To improve expenditure estimates, research related to the data collection instruments, field procedures, and sources of potential survey error (e.g., nonresponse bias or measurement error) began in the mid-1980s and has since become standard practice. A separate Branch of Research and Program Development was established within the Division of Consumer Expenditure Survey in 1999, with the mission of developing and conducting methodological studies to improve survey instruments, field procedures, and overall survey data quality.
The user-friendly diary form introduced in 2005 is one example of a research-driven survey improvement. This diary was the culmination of an extensive series of studies, including a large-scale field test. Through group discussions with data collection staff and cognitive testing with respondents, researchers learned that respondents preferred a more open format. The diary now obtains purchase data in four general categories, rather than the numerous detailed subcategories used previously. Another major improvement was the addition of checkboxes to collect data about meals eaten away from home, thus eliminating the need for respondents to write out this information. Checkboxes have the added benefit of facilitating data coding, and so contribute to streamlined processing. In addition to being more efficient, the new diary is visually more appealing, as it is smaller, contains fewer pages, and is printed in brighter colors than its predecessor. An evaluation of results from the new diary shows that it is more effective and yields marked improvements in the collection of some types of expenditures.
Other in-house Bureau studies recently completed include (1) examining the effect of incentives on response rates and data quality; (2) investigating the dimensions of nonresponse bias; (3) collecting and analyzing supplemental information about the survey ("paradata"), such as data from the Contact History Instrument, which may shed light on improvements to field procedures; and (4) exploring ways to increase within-household participation in the Diary Survey. Research results have been presented at the annual conferences of the American Association for Public Opinion Research and the American Statistical Association (ASA), and papers from both conferences can be found in the proceedings of the Joint Statistical Meetings.
Since 2009, research efforts have placed an emphasis on updating the design of the Consumer Expenditure Survey. At that time, the CE program initiated the Gemini Project, with the goal of promoting improved expenditure estimates in the CE—through an improved survey design—by reducing measurement error, particularly error caused by underreporting. The multi-year survey redesign project includes research, pilot testing, and transition to the new survey design. The scope of the redesign project is broad and open to a wide range of design alternatives, such as new modes, modular surveys, and innovative technologies. During the course of the Gemini Project, program staff will develop and test potential survey design changes, large and small, aimed at improving data quality, increasing the analytic value of the data to users, and supporting greater operational flexibility to respond to changes in the data-collection environment.
Redesign-related research has focused on a variety of topics aimed at assessing the potential of design alternatives. Ongoing studies evaluate the feasibility and impact of split questionnaire designs; shorter interviews; alternative recall periods; varying interviewing frequencies; recall aids; and new data collection technologies, such as data-capture devices, financial software, and an online diary. In addition, the program is researching the use of transaction records in a variety of formats, supplemented by respondent recall, as the primary data source for information about respondent expenditures.
In early 2011, the program contracted with the National Academies Committee on National Statistics to form a consensus expert panel, coordinate several research events, and produce a report containing redesign recommendations and other outside independent proposals. For further information on the Gemini Project, see www.bls.gov/cex/geminiproject.htm.
As an important part of the CE quality assurance program, estimates from CE data are compared regularly with corresponding estimates from other data sources to evaluate the soundness of CE estimates at any point in time, as well as the consistency of estimates over time. Among the comparisons that are made are those for a wide range of expenditure categories as well as comparisons of data on income and on assets and liabilities. For information on published articles and presentations comparing CE data with those from other sources, see www.bls.gov/cex/cecomparison.htm.
The importance of the Consumer Expenditure Survey is that it allows data users to relate the expenditures and income of consumers to the characteristics of those consumers. Survey data are of value to government and private agencies interested in studying the welfare of particular segments of the population, such as the elderly, low-income families, urban families, and those receiving food stamps. Data also are used by economic policymakers interested in the effects of policy changes on levels of living among diverse socioeconomic groups, and econometricians find the data useful in constructing economic models. Market researchers find consumer expenditure data valuable in analyzing the demand for groups of goods and services. Additionally, the U.S. Department of Commerce uses the survey data as a source of information for revising its benchmark estimates of selected items in the expenditure and income components of the National Accounts, and the Department of Defense uses the data in determining cost-of-living allowances for military personnel.
As in the past, the revision of the Consumer Price Index remains a primary reason for undertaking this extensive survey. Results from the Consumer Expenditure Survey are used to select new market baskets of goods and services for the index, to determine the relative importance of components, and to derive new cost weights for the baskets. In August 2002, the Bureau of Labor Statistics began publishing a new index, the "Chained Consumer Price Index for All Urban Consumers" (C-CPI-U), which supplements the existing consumer price indexes. The use of expenditure data from different time periods distinguishes the C-CPI-U from the existing CPI measures, which use only a single expenditure base period to compute the price change over time. The chained index is designed to measure the change in the "cost of living," as compared with existing indexes that are designed to measure the change in the cost of a fixed market basket of goods and services in retail outlets. The C-CPI-U uses expenditure data from different time periods to reflect the effect of substitution that consumers make across item categories in response to changes in the relative prices of goods and services.
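The difference between a single expenditure base period and expenditure data from more than one period can be illustrated with a two-item sketch. It compares a fixed-basket (Laspeyres-type) index, which weights price changes by base-period expenditure shares, with a Tornqvist index, which averages the shares of both periods. The Tornqvist formula is used here as an assumption for illustration, since the text above does not name the formula behind the C-CPI-U; the items, prices, and quantities are hypothetical.

```python
import numpy as np

def expenditure_shares(prices, quantities):
    """Share of total spending that goes to each item."""
    spend = np.asarray(prices, dtype=float) * np.asarray(quantities, dtype=float)
    return spend / spend.sum()

def laspeyres_index(p0, p1, q0):
    """Fixed-basket index: price relatives weighted by base-period shares only."""
    relatives = np.asarray(p1, dtype=float) / np.asarray(p0, dtype=float)
    return float(np.sum(expenditure_shares(p0, q0) * relatives))

def tornqvist_index(p0, p1, q0, q1):
    """Geometric index weighted by the average of both periods' shares."""
    relatives = np.asarray(p1, dtype=float) / np.asarray(p0, dtype=float)
    avg_shares = 0.5 * (expenditure_shares(p0, q0) + expenditure_shares(p1, q1))
    return float(np.exp(np.sum(avg_shares * np.log(relatives))))

# Hypothetical items: beef becomes 50 percent dearer, chicken is unchanged,
# and consumers shift purchases from beef toward chicken.
p0, q0 = [4.0, 2.0], [10.0, 10.0]
p1, q1 = [6.0, 2.0], [6.0, 14.0]

fixed = laspeyres_index(p0, p1, q0)
chained = tornqvist_index(p0, p1, q0, q1)
print(f"fixed-basket index: {fixed:.4f}")
print(f"Tornqvist index:    {chained:.4f}")
```

Because the second-period shares reflect the shift away from the item whose price rose, the index that uses both periods shows a smaller increase than the fixed-basket index in this example.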
Sample surveys are subject to two types of error: sampling and non-sampling. Sampling error is the uncertainty caused by the fact that observations are taken from a random sample of population members and not from the entire population. Non-sampling error is the rest of the error and arises regardless of whether data are collected from a sample or from the entire population. Non-sampling errors can be attributed to many sources, such as differences in the interpretation of questions, inability or unwillingness of respondents to provide correct information, and data processing errors.
Another way of analyzing error is to divide it into variance and bias. The variance is a measure of how close different estimates would be to each other if it were possible to repeat the survey over and over, using different samples. While it is not feasible to repeat the survey over and over, statistical theory allows the variance to be estimated. A small variance indicates that multiple independent samples would produce values that are consistently very close to each other. Bias is the difference between the "expected" value of an estimate and its "true" value. A statistic may have a small variance but a large bias, or it may have a large variance but a small bias. For an estimate to be considered accurate, both its variance and its bias must be small.
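The requirement that both components be small can be summarized with the mean squared error, a standard summary measure that the text above does not name. Writing y for the estimate and Y for the true value (Y is notation assumed here), the decomposition is

\[ \mathrm{MSE}(y) = E\left[(y - Y)^2\right] = \mathrm{Var}(y) + \left[\mathrm{Bias}(y)\right]^2, \qquad \mathrm{Bias}(y) = E[y] - Y \]

so an estimate with a small variance can still have a large mean squared error if its bias is large, and vice versa.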
The Bureau of Labor Statistics is constantly trying to reduce the error in the Consumer Expenditure Survey estimates. Variance and sampling error are reduced by using a sample of respondents that is as large as possible, given resource constraints. Improving the accuracy of the estimates is the primary reason for the significant expansion in the sample for both the Interview and Diary Surveys that occurred in 1999. Going forward, the Bureau will strive to reduce non-sampling error through a series of computerized and professional data reviews, as well as through continuous survey process improvements and theoretical research.