Food Composition Data: A User's Perspective (1987)

Managing food composition data at the national level



Nutrition Monitoring Division, Human Nutrition Information Service, US Department of Agriculture, Washington, D.C., USA

Introduction

The management of food composition data at the national level is carried out with the US Department of Agriculture's National Nutrient Data Bank (NDB). A distinction should be made between this management and nutrient data-base management in the more usual sense, such as is carried out in support of the many computerized dietary analysis systems - often called nutrient data-base systems - now in operation. They differ in that the NDB summarizes individual analytical values into a nutrient data base of representative values for foods. These in turn can serve as the foundation for the dietary analysis systems. Essentially, the NDB is the provider of summarized data, and the managers of data systems built upon those summarized data are the NDB's primary users. It is the purpose of this paper to provide insight into the NDB's present mode of operation, describe modifications for improvement now under way, discuss efforts for improving the quality of data, and indicate new applications of the system that may benefit INFOODS.

The Nutrient Data Bank was conceived and established as a computerized means of storing and compiling data on the nutrient composition of foods and of providing average, or representative, nutrient values to data users. Because the computerized system serves as the mechanism for the revision of Agriculture Handbook No. 8, the expansion of data stored in the NDB parallels progress on the handbook revision. The current publication status is shown in table 1 [9]. Food groups covered by AH-8, section nos. 18-22, are those most actively pursued in the data-entering stage at the present time.

The essential features of the NDB system have been described in detail elsewhere [5, 8]. For this discussion it may be helpful to describe briefly the NDB at each of its three levels. Data Base 1 (DB1) consists of the individual entries of nutrients in a food item, together with detailed descriptions of the food item and particulars concerning the measured value. At present, over 800,000 individual records are stored in the NDB, and additions continue to be made at the rate of about 6,000 to 9,000 per month.

Table 1. Status of Agriculture Handbook No. 8 revisions

Sections published [9]                     Sections in preparation
8-1   Dairy and egg products               8-13  Beef products
8-2   Spices and herbs                     8-14  Beverages
8-3   Baby foods                           8-15  Fish and shellfish
8-4   Fats and oils                        8-16  Legumes
8-5   Poultry products                     8-17  Lamb, veal, and game
8-6   Soups, sauces, and gravies           8-18  Bakery products
8-7   Sausages and luncheon meats          8-19  Sugars and sweets
8-8   Breakfast cereals                    8-20  Cereal grains, flours, and pasta
8-9   Fruits and fruit juices              8-21  Fast foods
8-10  Pork and pork products               8-22  Mixed dishes
8-11  Vegetables and vegetable products    8-23  Miscellaneous foods
8-12  Nut and seed products

Data Base 2 (DB2) consists of summarized values of nutrients in food items that have like descriptions. Individual values are averaged and standard deviations calculated for each grouping. Data at this stage of summary provide the opportunity to examine specific food descriptions, such as year of harvest or region of growth. The application of DB2 information to development of an international data base of cereal grain foods was described in a previous publication [6]. At present, data in DB2 are generally too limited for meaningful statistical distinctions, but the potential for such use by INFOODS should be kept in mind as a means of providing more detailed access to data than is now possible.
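The summarizing step that produces DB2 can be illustrated with a minimal sketch. It is not the NDB's own program: the record layout, the food descriptions, and the values are invented for illustration, and the function name summarize_db2 is hypothetical. The sketch groups individual values that share a description and nutrient, then reports the count, mean, and sample standard deviation for each grouping.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical DB1-style records: (food description, nutrient, measured value).
# Descriptions and values are invented for illustration only.
db1_records = [
    ("wheat flour, white, region A", "iron_mg", 1.0),
    ("wheat flour, white, region A", "iron_mg", 1.2),
    ("wheat flour, white, region A", "iron_mg", 1.4),
    ("wheat flour, white, region B", "iron_mg", 0.9),
    ("wheat flour, white, region B", "iron_mg", 1.1),
]


def summarize_db2(records):
    """Average the values of records that have like descriptions."""
    groups = defaultdict(list)
    for description, nutrient, value in records:
        groups[(description, nutrient)].append(value)

    summary = {}
    for key, values in groups.items():
        # A standard deviation needs at least two observations.
        sd = stdev(values) if len(values) > 1 else None
        summary[key] = {"n": len(values), "mean": mean(values), "sd": sd}
    return summary


db2 = summarize_db2(db1_records)
for key, stats in db2.items():
    print(key, stats)
```

Because the description is part of the grouping key, a descriptor such as region of growth stays visible at this level and can be compared across groupings when enough observations exist.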

Data Base 3 (DB3) contains data at the level familiar from Agriculture Handbook No. 8. The aim is to provide data that are representative of foods across the nation on a year-round basis. To this end, groupings within DB2 that are indistinguishable at point of purchase or consumption, or that have nearly identical nutrient profiles, may be combined to yield overall mean values. The total number of observations and standard error are also calculated. A provision of the NDB system allows the components to be weighted to produce averages that are more representative for the nation. DB3 is also used to create the computerized version, the USDA Nutrient Data Base for Standard Reference (available from National Technical Information Service, 5285 Port Royal Road, Springfield, VA 22161).
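The combining step for DB3 might be sketched as follows. The paper does not give the NDB's formulas, so everything here is an assumption made for illustration: the groupings, the weights (imagined as shares of the national supply), the function name combine_db3, and the choice to compute the standard error from the pooled individual values.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical DB2 groupings judged indistinguishable at point of purchase.
# "weight" is an assumed share of the national supply, invented for illustration.
groups = {
    "region A": {"values": [1.0, 1.2, 1.4], "weight": 0.75},
    "region B": {"values": [0.9, 1.1], "weight": 0.25},
}


def combine_db3(groups):
    """Combine groupings into one representative entry."""
    all_values = [v for g in groups.values() for v in g["values"]]
    n = len(all_values)

    # Overall mean and standard error from the pooled observations
    # (an assumed method; the source does not state the formula used).
    overall_mean = mean(all_values)
    std_error = stdev(all_values) / sqrt(n)

    # Weighted mean of the group means, normalized by the total weight.
    total_weight = sum(g["weight"] for g in groups.values())
    weighted_mean = (
        sum(g["weight"] * mean(g["values"]) for g in groups.values()) / total_weight
    )
    return {
        "n": n,
        "mean": overall_mean,
        "std_error": std_error,
        "weighted_mean": weighted_mean,
    }


db3_entry = combine_db3(groups)
print(db3_entry)
```

In this invented case the weighted mean (1.15) differs from the pooled mean (1.12) because the more heavily weighted grouping has the higher group mean.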

The Nutrient Data Bank is still in its formative period and has not yet reached the stage of continuous maintenance management. At this time, attention is still focused on completing the revision of Agriculture Handbook No. 8, and data management is thus devoted primarily to control of data input and output.

Data input

Data are extracted from the scientific literature on a continual basis, sought from industry, and obtained from government laboratories. In recent years especially, USDA's Human Nutrition Information Service (HNIS) has supported extra-mural contracts for the generation of data on specific foods in order to supply information otherwise lacking. The proportions of data from these sources vary with both the nutrient and the food. The data extracted from scientific literature are usually limited in their range of either nutrients or foods. Reports seldom supply comprehensive data on food composition, and details of analytical methodology and quality control are often incompletely described. Industry has been particularly helpful in providing analytical data upon which label claims are based, but these data are generally limited to the nutrients required for the label and are developed primarily for processed foods. USDA's Nutrient Composition Laboratory (NCL) provides data on selected nutrients in core foods. As discussed by Beecher elsewhere in chapter 18 of this volume, the goals of NCL coincide with ours and we are seeking ways in which to further co-ordinate our activities. The Food and Drug Administration's Revised Total Diet Study [7] is also providing additional, well-documented analyses of foods from known geographic locations.

For work performed under contract we are able to stipulate how samples are to be drawn and handled, the methods to be used for analysis, and precautions to be taken in proper performance. Contractors are required to validate analytical procedures and are asked to develop suitable quality control using standard reference and control materials. Many contractors have voluntarily taken part in analysing control samples routinely examined by co-operators in the National Food Processors Association programme, and for several years have participated in meetings in which they can discuss specific analytical problems and share information on possible solutions. There is no doubt that the co-operation and dedication of our contractors have increased the reliability of results. We realize, however, that absolute control over analytical measurements cannot be attained without imposing quality-control tests supervised by an outside laboratory. A co-operative project with the Nutrient Composition Laboratory is providing such control in a current study and will serve as a model for future applications.

All data are carefully screened before insertion into the NDB. Data are excluded only with proper justification. The samples analysed must be representative of the food supply, and there must be evidence that the samples have been treated appropriately to avoid contamination or loss of nutrients. The analytical method must be known to be acceptable in the particular application or proven by the researcher. Data that are questionable because of insufficient explanation may be flagged for further evaluation. Flagged data are not included in the computation of means.
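The effect of flagging on the computed means can be shown with a short sketch. The record layout and the function name mean_of_accepted are hypothetical; only the rule itself, that flagged data are left out of the means, comes from the text.

```python
from statistics import mean

# Hypothetical screened records for one food and nutrient.
records = [
    {"value": 10.0, "flagged": False},
    {"value": 12.0, "flagged": False},
    {"value": 30.0, "flagged": True},  # questionable; held for further evaluation
]


def mean_of_accepted(records):
    """Return the mean of unflagged values, or None if none remain."""
    accepted = [r["value"] for r in records if not r["flagged"]]
    return mean(accepted) if accepted else None


print(mean_of_accepted(records))
```

The flagged record stays in storage, so it can be released into the computation later if the further evaluation supports it.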

The Nutrient Data Bank system is now undergoing major revision. The main purpose is to make the system more efficient, taking advantage of advances in computer technology that have been made since the original system was designed. Two of the new features are of special interest. First, each food-group specialist assigned to work on the NDB will have direct access to the nutrient data through an interactive terminal and will be able to test the effects of different groupings of descriptors in the steps of creating both DB2 and DB3. Second, provision has been made to allow for the attachment of codes to the individual data that will express their reliability in various terms such as adequacy of sampling, methodology, and laboratory quality control. This approach was utilized by Exler when developing a table on the iron content of foods [4]. A similar treatment of selenium data was addressed by Beecher (see chap. 18). We expect that the attached codes can be used to develop computer-generated confidence codes for the calculated means.
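One way in which attached reliability codes could yield a confidence code for a mean is sketched below. The rating scale (0 to 3), the rule that the weakest criterion limits a record's score, the thresholds, and the letter codes are all assumptions invented for this example; they are not the scheme used by Exler or planned for the NDB.

```python
# Hypothetical ratings (0 = unacceptable, 3 = fully documented) attached to
# each individual value for the three criteria named in the text.
records = [
    {"value": 2.1, "sampling": 3, "method": 2, "quality_control": 3},
    {"value": 2.4, "sampling": 2, "method": 2, "quality_control": 1},
    {"value": 2.2, "sampling": 3, "method": 3, "quality_control": 3},
]


def record_score(record):
    """Assumed rule: a value is only as reliable as its weakest criterion."""
    return min(record["sampling"], record["method"], record["quality_control"])


def confidence_code(records):
    """Map the summed record scores to a letter code (thresholds are invented)."""
    total = sum(record_score(r) for r in records)
    if total >= 6:
        return "a"  # considerable confidence in the mean
    if total >= 3:
        return "b"  # some confidence
    return "c"      # little confidence


print(confidence_code(records))
```

Summing the scores lets both the number of values and their individual quality raise the confidence attached to the mean.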

The range of nutrients included in the NDB follows the interests of the nutrition/health community. Originally it was planned that nutrients should be limited to those for which Recommended Dietary Allowances (RDAs) have been established [3].

Recent observations on possible relationships between dietary components and health have led to demands for data on additional components. The report on Diet, Nutrition and Cancer [2], for example, pointed to the possible roles of carotene, dietary fibre, and selenium. Because of the lag between expressed interest in a new food component and the ability to generate data for its content in foods, we must anticipate users' needs and take an early initiative to supply new information. In our search for data, we concentrate on those components currently covered in the revised handbook, plus those of growing importance. Data for additional components, although not actively pursued, are entered when included in analytical reports.

The question of which foods should be included in the data bank is similar to the question about nutrients, and is perhaps as complicated. To serve as a national source of information we must be sure to cover those foods most frequently consumed by the general population, but we cannot neglect others that may be important only to certain population subgroups. Foods reported in the Nationwide Food Consumption Surveys provide both types of information. We try to keep abreast of trends in food production, the introduction of new processed foods, changes in formulation or processing, the introduction of new cultivars, and changes in breeding or feeding practices, in order to anticipate possible changes in our nation's food supply and to prepare for the impact such changes may have on measuring nutritional components.

Data output

The primary product of the NDB is the revision of Agriculture Handbook No. 8. In machine-readable form this is represented by the USDA Nutrient Data Base for Standard Reference, which basically consists of the data contained in the revised sections of the handbook, supplemented by the older data for food groups that have not yet been revised. The Standard Reference tape is updated as new revisions are released and thus always represents the most up-to-date information available. Each release of the USDA Nutrient Data Base for Standard Reference is identified by a release number and the year. The most recent is Release No. 5, 1985, which covers revised sections of the handbook through no. 8-12 [9].

The USDA Nutrient Data Base for Standard Reference is utilized for the creation of specialized data bases, both within USDA and by users who purchase the tape for adaptation to their specific needs. A good example of a data base derived for USDA use is the Nutrient Data Base for Individual Food Intake Surveys, which is in the final stages of completion. This has been created by developing a computerized linking file that connects the survey food codes to the food codes on the USDA Nutrient Data Base for Standard Reference. For survey items that are composite foods, formulas are included by which values for the composite items can be calculated from the nutrient content of the individual components. For the forthcoming Continuing Survey of Food Intake by Individuals (CSFII), we were asked to include data for components not regularly present in the Standard Reference base. This made it necessary to expand it to include data for dietary fibre, alcohol, vitamin E, and carotenes for approximately 1,700 food items. In addition, 4,000 values for other nutrients were added to supply values not yet contained in the Standard Reference data base because the food groups have not yet been updated. We have named this expanded data base the Primary Data Set for Food Consumption Surveys.

Table 2. Proportion of nutrient values based on analytical data in Primary Data Set

Nutrient         Percentage of values
Protein          98
Calcium          95
Magnesium        78
Carotene         54
Dietary fibre    29

The linking file, the Primary Data Set, and a computerized table of retention factors are accessed by a computer program to create the Nutrient Data Base for Individual Food Intake Surveys.
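How the three files might work together can be sketched in a few lines. The food codes, nutrient values, retention factor, and file layouts below are invented for illustration, and survey_nutrients is a hypothetical name; the actual program and file formats are not described here. The sketch also ignores any change in weight during preparation.

```python
# Hypothetical Primary Data Set: nutrient values per 100 g, keyed by food code.
primary_data = {
    "SR001": {"protein_g": 10.0, "vitamin_c_mg": 20.0},
    "SR002": {"protein_g": 2.0, "vitamin_c_mg": 40.0},
}

# Hypothetical retention factors: fraction of a nutrient retained after preparation.
retention = {"boiled": {"vitamin_c_mg": 0.5}}

# Hypothetical linking file: survey food code -> components as
# (Standard Reference code, grams in the formula, preparation or None).
linking_file = {
    "SURV100": [("SR001", 100.0, None)],
    "SURV200": [("SR001", 50.0, None), ("SR002", 150.0, "boiled")],
}


def survey_nutrients(survey_code):
    """Calculate nutrients per 100 g of a survey food from its components."""
    components = linking_file[survey_code]
    total_grams = sum(grams for _, grams, _ in components)

    totals = {}
    for sr_code, grams, preparation in components:
        factors = retention.get(preparation, {})
        for nutrient, per_100g in primary_data[sr_code].items():
            # Nutrients without a listed factor are assumed fully retained.
            amount = per_100g * grams / 100.0 * factors.get(nutrient, 1.0)
            totals[nutrient] = totals.get(nutrient, 0.0) + amount

    # Express the composite on a per-100 g basis.
    return {n: amount * 100.0 / total_grams for n, amount in totals.items()}


print(survey_nutrients("SURV200"))
```

A survey item that corresponds to a single Standard Reference food is simply a formula with one component, so the same routine serves both simple and composite items.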

In creating the Primary Data Set we were placed in the unique position of being both data users and data providers at the same time. Care has been taken to document the sources of the expanded data so that they can be further evaluated and updated as additional information becomes available. Codes have been attached to all added nutrient values to indicate whether they are from analytical data in the revised handbook, new analytical data not yet finalized for the handbook, older data from sections of the handbook not yet revised, or whether they are imputed values or assumed values of zero, such as for cholesterol in plant foods.

Besides documenting the data sources, this coding system provides a new way to measure the state of knowledge of food composition data. A quantitative measure of available analytical data for each nutrient under consideration can be calculated by determining the relative proportion of analytical to imputed values in the expanded data base. Examples of such calculations are shown in table 2. Calcium and protein are representative of nutrients that have been analysed regularly over a long period and for which analytical data are thus most available. At the other extreme is dietary fibre, for which analysed values are just beginning to be reported. Although analyses for magnesium are now commonly included in food composition studies, only limited information was available for the 1963 edition of Agriculture Handbook No. 8, and it is apparent that analytical values currently available are not as comprehensive as those for calcium. Analysed values on hand for carotene are almost entirely those determined in plant products by AOAC procedures [1]. For the Primary Data Set the remaining values for carotene were those assumed to be present in arriving at expressions for total retinol equivalents of vitamin A.
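The calculation behind such percentages can be sketched as follows. The code names and sample values are invented for illustration, and percent_analytical is a hypothetical name. Treating older handbook data as analytical is an assumption of this sketch; the text does not say how those values were counted.

```python
from collections import Counter

# Hypothetical source codes. Counting "older_handbook" as analytical is an
# assumption of this sketch, not a rule stated in the text.
ANALYTICAL_CODES = {"revised_handbook", "new_analytical", "older_handbook"}

# Hypothetical (nutrient, source code) pairs, one per food item.
coded_values = [
    ("protein", "revised_handbook"),
    ("protein", "new_analytical"),
    ("protein", "older_handbook"),
    ("protein", "imputed"),
    ("dietary_fibre", "new_analytical"),
    ("dietary_fibre", "imputed"),
    ("dietary_fibre", "imputed"),
    ("dietary_fibre", "assumed_zero"),
]


def percent_analytical(values):
    """Percentage of each nutrient's values that rest on analytical data."""
    totals = Counter()
    analytical = Counter()
    for nutrient, code in values:
        totals[nutrient] += 1
        if code in ANALYTICAL_CODES:
            analytical[nutrient] += 1
    return {n: 100.0 * analytical[n] / totals[n] for n in totals}


print(percent_analytical(coded_values))
```

With these invented values the result is 75 per cent for protein and 25 per cent for dietary fibre; the figures in table 2 come from the full Primary Data Set, not from this example.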

It must be understood that the calculations shown in table 2 pertain directly only to the Primary Data Set. They indicate the basis for information on hand for calculating the composition of foods in the CSFII data base. It should be further understood that the procedure distinguishes only between analytical and imputed data, without attempting to address the reliability of the analytical data.

Special considerations

Determining when a value in a data base should be changed is a major problem in managing the NDB. The problem is universal and should be addressed by INFOODS to ensure common approaches to the solution.

There are two major aspects to the problem. The first concerns data reliability. It is generally recognized that not all data are created equal, because of either deficiencies in the analytical methods themselves or inappropriate application of the methods. The concerted efforts being directed toward improving data reliability will gradually make better values available, but for some period the new data will coexist with those now on hand. We must learn how to deal with changes warranted by the availability of new data, and in the interim it is more important than ever to document the sources of data, recognizing whatever shortcomings they may have. The second aspect concerns actual changes in the food supply. The introduction of new cultivars, the adoption of new feeding practices, and technological changes in food processing are all capable of altering the composition of food. Two examples are the recent development of more highly coloured yellow vegetables, with a resultant increase in carotene content, and the reduction in sodium in some processed foods. Not only must such changes be recognized, but the time at which each change occurred must somehow be accounted for. This is of particular importance to HNIS in proceeding with its continuing survey, and it will also be important in comparing changes in the food supply available in different regions of the world.
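One possible way of accounting for the time of a change is to store each value with the date from which it applies. This is only a sketch of the idea and not a practice described for the NDB; the dates, the sodium values, and the function name value_on are invented for illustration.

```python
from datetime import date

# Hypothetical history for sodium (mg per 100 g) in one processed food:
# each entry is (date from which the value applies, value).
sodium_history = [
    (date(1975, 1, 1), 400.0),
    (date(1984, 1, 1), 300.0),  # assumed reformulation with reduced sodium
]


def value_on(history, when):
    """Return the value in effect on a given date, or None if none applies."""
    applicable = [(start, value) for start, value in history if start <= when]
    if not applicable:
        return None
    # The entry with the latest start date not after `when` is in effect.
    return max(applicable, key=lambda entry: entry[0])[1]


print(value_on(sodium_history, date(1980, 6, 1)))
print(value_on(sodium_history, date(1985, 6, 1)))
```

A survey conducted in a given year could then be matched with the composition in effect at that time, and the earlier value remains available for comparison.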



Conclusions

The management of food composition data in the United States should be no different from that in any other country and should be representative of the situation confronting INFOODS. Certainly the successful solution of problems depends upon co-operative interaction between data users and data providers.



References

1. Association of Official Analytical Chemists, Official Methods of Analysis, 14th ed. (Association of Official Analytical Chemists, Inc., Arlington, Va., 1984).

2. Committee on Diet, Nutrition, and Cancer, Commission on Life Sciences, National Research Council, Diet, Nutrition, and Cancer: Directions for Research (National Academy Press, Washington, D.C., 1983).

3. Committee on Dietary Allowances, Food and Nutrition Board, Commission on Life Sciences, National Research Council, Recommended Dietary Allowances, 9th ed. (National Academy Press, Washington, D.C., 1980).

4. J. Exler, Iron Content of Food, Home Economics Research Report, no. 45 (Human Nutrition Information Service, USDA, Washington, D.C., 1982).

5. F. N. Hepburn, "The USDA National Nutrient Data Bank," Am. J. Clin. Nutr., 35: 1297-1301 (1982).

6. F. N. Hepburn and B. P. Perloff, "The Nutrient Data Bank," Cereal Foods World, 24: 224-225 (1979).

7. J. A. T. Pennington, "Revision of the Total Diet Study Food List and Diets," J. Am. Diet. Assoc., 82: 166-173 (1983).

8. R. L. Rizek, B. P. Perloff, and L. P. Posati, "USDA's Nutrient Data Bank," Food Tech. in Australia, 33: 112-114 (1981).

9. US Department of Agriculture, "Composition of Foods: Raw, Processed, Prepared," Agriculture Handbook No. 8 (Science and Education Administration, USDA, Washington, D.C., 1976-1984).