Food Composition Data: A User's Perspective (1987)
Managing food composition data
Maintaining a food composition data base for multiple research studies: the NCC food table
I. MARILYN BUZZARD and DIANE FESKANICH
Nutrition Co-ordinating Center, Division of Biometry, School of Public Health, University of Minnesota, Minneapolis, Minnesota, USA
A food composition data base was developed at the Nutrition Co-ordinating Center (NCC), University of Minnesota, in 1974 for use in analysing dietary data for two long-term multi-centred cardiovascular studies, the Multiple Risk Factor Intervention Trial (MRFIT) and the Lipid Research Clinics (LRC) Programs [2-4]. Standardized methods for the collection and analysis of dietary data and centralized processing of the data were implemented to minimize inter-clinic differences among the 34 centres involved and to allow comparability of dietary data between the two collaborative research programmes supported by the National Heart, Lung and Blood Institute (NHLBI).
The original nutrient data base and coding system were designed to allow detailed specification of both the quality and quantity of dietary fat. Margarines, oils, and shortenings were classified by type and brand, and fats used in recipes and food preparation were documented in detail, as described by Dennis et al. [1]. Within food groups, food items with similar fat content were grouped together. The original data base included approximately 1,200 entries with values for 31 nutrients.
In 1977 the NCC nutrient data base coding system became available for other research studies, and the system has been used by numerous investigators over the past eight years. The majority of users are medical researchers in the United States and Canada involved in the investigation of relationships between diet and disease. The data base has been expanded to meet the research needs of each study. The current NCC nutrient data base (referred to here as the NCC Food Table) includes approximately 1,800 entries and values for 61 food components.
The purpose of this paper is to describe the needs of the users of the NCC Food Table and to discuss the NCC approaches to maintaining the table to meet these needs. NCC guidelines for minimizing redundancy, without loss of the specificity required by the users of the system, are also described.
Specific user needs and approaches to these needs
The Need for Standardized Methods of Determining Nutrient Values for All Foods and Beverages Consumed by the Study Population
Users of a nutrient data base require that the system be capable of handling not only common foods and beverages, but also any uncommon items that might be consumed by the study population. New entries are frequently added to the NCC Food Table to meet study-specific needs. Food composition data are gathered for regional foods and local recipes which appear on dietary intake forms, and also for new manufactured products or reformulations of existing products consumed by the study population. These new foods or beverages are compared with existing Food Table entries. If the item cannot be represented by an existing entry or a combination of existing entries, a new entry is developed. All coding decisions are documented and cross referenced as appropriate to ensure that the item will be coded in the same way if it appears on future dietary intake forms.
The Need for Updated and Complete Nutrient Profiles
Nutrient data bases must be updated continually to meet the needs of a wide range of investigators. The updated file must reflect new published or provisional data provided by the United States Department of Agriculture (USDA), new analytic data from the scientific literature, and current food composition data for new or reformulated commercial products.
To ensure that nutrient values in the NCC Food Table are current, several dozen scientific journals are reviewed by the NCC nutrition staff on a monthly basis. Any new data on food composition, new products, or food labelling are extracted. These data, along with any other newly available data from USDA or food manufacturers, are compared with existing values in the Food Table. Limits have been established for each nutrient to guide data-base nutritionists in determining when the difference between the new and the existing value is sufficient to warrant a modification to the Food Table; modification procedures are implemented only when differences exceed the established limit values. This system prevents the expenditure of considerable time and effort on making changes to the Food Table that are insignificant considering the variability inherent in nutrient data.
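The limit-value rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the NCC procedure itself: the nutrient names, the limit values, and the assumption that limits are absolute differences per 100 g are all hypothetical, since the actual limits are not given here.

```python
# Hypothetical per-nutrient limits (absolute difference per 100 g of food).
# The actual NCC limit values are not published in this paper.
LIMITS = {"energy_kcal": 10.0, "fat_g": 0.5, "sodium_mg": 20.0}


def nutrients_to_modify(existing, new, limits=LIMITS):
    """Return {nutrient: (old, new)} for values that differ by more than the limit."""
    flagged = {}
    for nutrient, new_value in new.items():
        old_value = existing.get(nutrient)
        limit = limits.get(nutrient)
        if old_value is None or limit is None:
            continue  # nothing to compare against; left for manual review
        if abs(new_value - old_value) > limit:
            flagged[nutrient] = (old_value, new_value)
    return flagged


# Illustrative values only.
existing = {"energy_kcal": 120.0, "fat_g": 4.0, "sodium_mg": 300.0}
new = {"energy_kcal": 124.0, "fat_g": 5.0, "sodium_mg": 310.0}
changes = nutrients_to_modify(existing, new)
# Only fat exceeds its limit, so only fat would trigger a modification.
```

In this sketch the small shifts in energy and sodium are ignored, which mirrors the stated aim of not spending effort on changes that fall within the variability inherent in nutrient data.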
Major food manufacturers are contacted annually for updated information on the nutrient composition of their products, and new data are compared with existing values. Differences that exceed the established limit values are identified as being caused by either product reformulation or better analytical data. If the product has been reformulated, a new entry is added to the Food Table and the old entry deactivated so that it cannot be used in future coding. If the nutrient differences are the result of better analytical data, modification procedures are implemented.
New nutrients have been added to the NCC Food Table to accommodate investigators' current research interests. USDA data are the preferred sources, but when these are incomplete or unavailable other sources, including foreign food tables, scientific literature, data from food manufacturers, or unpublished laboratory data, are used. All sources of raw data are reviewed and compared by a data-base nutritionist, and values are selected or imputed for the Food Table based on factors which include the reliability of the data sources, the laboratory analytic method used, and the number of samples analysed.
Adding new nutrients to the NCC Food Table often requires making separate entries for items grouped within a single entry. For example, the addition of sodium to the nutrient profile resulted in new entries for canned vegetables, which were previously included in the entries for vegetables cooked from fresh or frozen sources.
Users also need a data base that is complete for all nutrient values. Since missing values are calculated as zeros in the nutrient analysis, they are tolerated only in cases where zero is as good an estimate as any other for a given nutrient in a given food item. The NCC data-base nutritionists impute values, whenever possible, from the nutrient content of similar foods, recipes, or product ingredient lists. As soon as better data become available, the values are updated to reflect the analytic data.
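The effect of a missing value on a calculated intake, and of imputing it from a similar food, can be illustrated as follows. The foods, the values, and the helper names are hypothetical, and copying the value of a single similar food is only one of the imputation strategies mentioned above.

```python
def nutrient_total(intake, food_table, nutrient):
    """Sum one nutrient over (food, grams) pairs.

    A missing value (None) contributes zero, as in the analysis described above.
    """
    total = 0.0
    for food, grams in intake:
        per_100g = food_table[food].get(nutrient)
        if per_100g is None:
            per_100g = 0.0
        total += per_100g * grams / 100.0
    return total


def impute_from_similar(food_table, food, nutrient, similar_food):
    """Fill a missing value with the value of a similar food."""
    if food_table[food].get(nutrient) is None:
        food_table[food][nutrient] = food_table[similar_food][nutrient]


# Hypothetical values per 100 g, for illustration only.
food_table = {
    "bran muffin": {"fibre_g": None},
    "oat muffin": {"fibre_g": 4.0},
}
intake = [("bran muffin", 50.0), ("oat muffin", 100.0)]

before = nutrient_total(intake, food_table, "fibre_g")
impute_from_similar(food_table, "bran muffin", "fibre_g", "oat muffin")
after = nutrient_total(intake, food_table, "fibre_g")
```

With the missing value counted as zero, the total understates intake; after imputation the bran muffin contributes its estimated share.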
The Need for Documentation of Sources of Nutrient Values
Users of a nutrient data base require documentation of the sources of nutrient information used in the calculation of their dietary data to enable them to judge the reliability of the nutrient analysis.
The NCC reference system includes a six-character field for proximate composition and a two-character field for all other food components. The leading character in the proximate composition field designates the general category of the reference source, and the remaining characters specify the actual page or code number in the designated source. The two-character code for nutrient references designates the source as a USDA reference, other food table reference, specific journal reference, or other source of nutrient data, including calculations and imputations.
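A six-character reference field of this kind could be decoded roughly as shown below. The category letters and the sample field are invented for illustration; the paper does not list the actual NCC codes.

```python
from dataclasses import dataclass

# Hypothetical category letters; the real NCC codes are not given in the paper.
SOURCE_CATEGORIES = {
    "U": "USDA publication",
    "J": "journal article",
    "M": "manufacturer data",
}


@dataclass
class ProximateReference:
    category: str
    locator: str  # page or code number within the designated source


def parse_proximate_reference(field):
    """Split a six-character field into source category and locator."""
    if len(field) != 6:
        raise ValueError("proximate reference field must be six characters")
    category = SOURCE_CATEGORIES.get(field[0], "unknown source")
    return ProximateReference(category, field[1:].strip())


ref = parse_proximate_reference("U00456")
```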
A paper file is maintained for every entry in the NCC Food Table. All details of calculations and imputations of nutrient values, including the rationale for selection of the foods on which these determinations are based, are documented in the paper file. General guidelines by food groups are maintained for calculating or imputing nutrient values.
Codes for reliability of nutrient data have been discontinued due to the difficulty of establishing objective guidelines to meet the needs of all users. Sources of all nutrient data are documented as completely as possible in the NCC Food Table so that the user is able to judge reliability in relation to a specific research setting.
The Need for Quality-control Procedures to Ensure the Accuracy of the Data Base
Nutrient data-base users need assurances that the values used in the calculation of their dietary intake data are accurate and that the computer programs used to calculate the data are reliable.
Despite ongoing efforts to maintain the accuracy of a nutrient data base, mistakes can find their way into a system as a result of human error or mechanical or software malfunctions. A number of quality-control procedures have been developed by the NCC to document the accuracy of the Food Table and the calculation software.
The NCC Food Table is updated on a daily basis by the data-base nutrition staff. All Food Table modifications are made to a separate file called the Reference Food Table. Nutrient modifications are edited in the data-entry system by comparing new nutrient values with NCC established nutrient ranges. A value that falls out of range must be verified by an NCC nutritionist.
The Reference Food Table is periodically checked for accuracy by running a series of computer-generated integrity reports that are reviewed by the nutrition staff. Some integrity checks are algorithms that are calculated for each entry and compared with a predetermined value. Other integrity checks are listings of individual nutrients by food groups. Any nutrient value that deviates substantially from other values in that food group is verified by a nutritionist.
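As one possible example of an integrity check of the first kind, the sketch below sums the proximate components of each entry and compares the result with a predetermined value of 100 g per 100 g of food. The choice of this particular check, the tolerance, and the sample entries are assumptions; the paper does not specify which algorithms the NCC uses.

```python
PROXIMATES = ("water_g", "protein_g", "fat_g", "carbohydrate_g", "ash_g")


def proximate_sum_ok(entry, target=100.0, tolerance=2.0):
    """True if the proximates (per 100 g) sum to the target within the tolerance."""
    total = sum(entry[name] for name in PROXIMATES)
    return abs(total - target) <= tolerance


def entries_failing_check(table):
    """Names of entries that a nutritionist would need to verify."""
    return sorted(name for name, entry in table.items() if not proximate_sum_ok(entry))


# Hypothetical entries; the second contains a deliberate data-entry error.
table = {
    "whole milk": {"water_g": 88.0, "protein_g": 3.3, "fat_g": 3.3,
                   "carbohydrate_g": 4.7, "ash_g": 0.7},
    "hard cheese": {"water_g": 37.0, "protein_g": 25.0, "fat_g": 33.0,
                    "carbohydrate_g": 13.0, "ash_g": 4.0},
}
failing = entries_failing_check(table)
```

A check of this type does not prove that an entry is correct; it only flags entries whose values are internally inconsistent so that they can be reviewed.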
After all integrity reports have been verified, a test set of dietary intake records is analysed to check the various calculation procedures. The results of the nutrient calculations are compared with the previous calculations of the test records. Any differences observed must be verified as being due to recent modifications to the Reference Food Table. After satisfactory implementation of these procedures, the Reference Food Table is available for research use and becomes the current version of the NCC Food Table.
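The comparison of test-record results can be sketched as a simple regression check: a record whose results changed, but which uses no recently modified entry, is reported as unexplained. The record identifiers, entry codes, and data layout below are hypothetical.

```python
def unexplained_differences(previous, current, entries_used, modified_entries,
                            tolerance=1e-6):
    """Return ids of test records whose results changed although none of the
    entries they use were modified since the previous run."""
    problems = []
    for record_id, old_values in previous.items():
        new_values = current[record_id]
        changed = any(
            abs(new_values[nutrient] - old_value) > tolerance
            for nutrient, old_value in old_values.items()
        )
        touches_modified = bool(entries_used[record_id] & modified_entries)
        if changed and not touches_modified:
            problems.append(record_id)
    return problems


# Illustrative data: results of the previous and the current run.
previous = {"rec1": {"fat_g": 70.0}, "rec2": {"fat_g": 55.0}, "rec3": {"fat_g": 40.0}}
current = {"rec1": {"fat_g": 72.5}, "rec2": {"fat_g": 55.0}, "rec3": {"fat_g": 41.0}}
entries_used = {"rec1": {"E100", "E200"}, "rec2": {"E300"}, "rec3": {"E300", "E400"}}
modified_entries = {"E100"}

problems = unexplained_differences(previous, current, entries_used, modified_entries)
```

Here the change in rec1 is explained by the modification to entry E100, whereas the change in rec3 is not and would need to be investigated before the table is released.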
The Need for Stability of the Nutrient Data Base for Long-term Studies
Although the majority of researchers want to use the most current nutrient information available at the outset of a study, they require that the data base remain stable for the duration of their study. This may present a problem for those who maintain a nutrient data base for multiple users.
The NCC has resolved this problem by maintaining multiple versions of the Food Table. The most recent version available at the beginning of a long-term study is used for nutrient calculations for the duration of the study. The only permissible change to a study-assigned Food Table is the addition of new entries for foods or beverages not previously included. It is essential that no modifications be made to previously established entries; changing the data base while a study is in progress will confound the interpretation of the dietary data. At the end of the study, the investigator may choose to rerun all data on an updated version of the Food Table.
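The rule that a study-assigned table accepts new entries but no changes to established ones can be expressed as an append-only structure. The class name, version label, and entry codes below are hypothetical and serve only to illustrate the rule.

```python
class StudyFoodTable:
    """A study-assigned Food Table version: entries may be added, never changed."""

    def __init__(self, version, entries):
        self.version = version
        self._entries = {code: dict(nutrients) for code, nutrients in entries.items()}

    def add_entry(self, code, nutrients):
        if code in self._entries:
            raise ValueError(f"entry {code!r} already exists and may not be modified")
        self._entries[code] = dict(nutrients)

    def nutrients(self, code):
        # Return a copy so that callers cannot alter the stored values.
        return dict(self._entries[code])


table = StudyFoodTable("hypothetical-v1", {"E100": {"fat_g": 4.0}})
table.add_entry("E999", {"fat_g": 12.0})  # a food not previously included

try:
    table.add_entry("E100", {"fat_g": 5.0})  # attempt to change an established entry
    modification_blocked = False
except ValueError:
    modification_blocked = True
```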
The Need for Comparability of Nutrient Data between Studies
Investigators may require a nutrient data base that will permit comparison of their study results with other studies using the same data base. Inter-study comparability of dietary data can provide information beyond the scope of a single study.
A system has been established at the NCC that allows comparison of dietary data between past and future studies. Such comparison is possible even though the NCC Food Table is updated routinely. This is accomplished by ensuring that no entry is ever deleted from the data base. When a product is taken off the market, the item is "deactivated," which means that it can no longer be used in coding; however, the item remains in the data base and its nutrients continue to be updated as new nutrients are added to the Food Table. Maintaining these deactivated entries in the Food Table makes possible the recalculation of nutrients for dietary intake data collected in the past while using an updated version of the Food Table. The results of these recalculations can then be compared with those of current studies.
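Deactivation can be modelled as a flag that is consulted at coding time but ignored at calculation time. The sketch below assumes a simple in-memory table with invented entry codes; it is not a description of the NCC software.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    name: str
    nutrients: dict
    active: bool = True


def code_item(table, code):
    """Coding time: only active entries may be assigned to new intake records."""
    if not table[code].active:
        raise LookupError(f"entry {code!r} is deactivated and cannot be used in coding")
    return code


def nutrient_value(table, code, nutrient):
    """Calculation time: deactivated entries remain available for old records."""
    return table[code].nutrients[nutrient]


# Hypothetical entries: the old product has been taken off the market.
table = {
    "E100": Entry("margarine, old formulation", {"fat_g": 80.0}, active=False),
    "E101": Entry("margarine, reformulated", {"fat_g": 60.0}),
}

try:
    code_item(table, "E100")
    coding_refused = False
except LookupError:
    coding_refused = True

old_record_value = nutrient_value(table, "E100", "fat_g")
```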
The Need for Flexibility in the Level of Specificity Required for Documenting Dietary Detail
Some researchers require considerable detail in documenting dietary intake data while others select methodologies that document dietary intake in broad food group categories. A nutrient data base that meets the needs of diverse users must be able to provide nutrient calculations for dietary intake data documented at different levels of specificity.
The NCC has met this need for flexibility by providing procedures that capture the highest level of detail (such as the brand names, preparation methods, and salt additions) or revert to default values when the higher levels of detail are not specified. Nutrient values for the default or "unknown" assignments are based either on weighted averages of values for the items that fall into that general category or on a single item that is representative of the items in that category.
Guidelines for creating defaults using weighted nutrient values or selected representative values are based on food intake patterns of specific study populations. Determination of defaults for fats used in food preparation methods takes into consideration whether the item was commercially processed, prepared at home, or prepared in a restaurant; if the latter, the default assignments are based on the price range of the restaurant.
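A weighted-average default can be computed as sketched below. The food items, nutrient values, and weights are hypothetical; in practice the weights would be derived from the intake patterns of the study population.

```python
def weighted_default(items):
    """Weighted average of nutrient values.

    items: list of (nutrients_per_100g, weight) pairs, where the weight reflects
    how often the item is consumed in the study population.
    """
    total_weight = sum(weight for _, weight in items)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    nutrient_names = items[0][0].keys()
    return {
        name: sum(values[name] * weight for values, weight in items) / total_weight
        for name in nutrient_names
    }


# Hypothetical default for "margarine, type unknown".
stick = {"fat_g": 80.0, "sat_fat_g": 16.0}
tub = {"fat_g": 60.0, "sat_fat_g": 10.0}
default_margarine = weighted_default([(stick, 3.0), (tub, 1.0)])
```

With a weight of 3 for the stick product and 1 for the tub product, the default lies three-quarters of the way towards the stick values.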
The Need for Flexibility in Specifying Food Quantities for Data Input
Researchers need to be able to report dietary data as described by study participants without having to transform food quantities into specified units. The NCC Food Table meets this need by maintaining densities and/or weights of food-specific units. For entries that contain density data, food quantities can be reported using any cubic or household measure of volume. The nutrient calculation programs convert volume measurements to a gram weight using the food density. For foods that are not easily described in common volume measurements, such as stalks of celery, slices of bread, or pieces of pie, weights are provided for food-specific servings. Dimensions are described for each food-specific serving.
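The conversion from a reported quantity to a gram weight can be sketched as follows. The millilitre equivalents are rounded values for US household measures; the density and the serving weight are hypothetical examples, and the function names are not taken from the NCC software.

```python
# Approximate millilitres per US household measure (rounded).
ML_PER_UNIT = {
    "ml": 1.0,
    "tsp": 4.93,
    "tbsp": 14.79,
    "fl_oz": 29.57,
    "cup": 236.59,
}


def grams_from_volume(amount, unit, density_g_per_ml):
    """Convert a volume measure to grams using the food's density."""
    return amount * ML_PER_UNIT[unit] * density_g_per_ml


def grams_from_servings(count, serving_weight_g):
    """Convert food-specific servings (slices, stalks, pieces) to grams."""
    return count * serving_weight_g


# Hypothetical density for milk and hypothetical weight of one slice of bread.
milk_g = grams_from_volume(1.0, "cup", density_g_per_ml=1.03)
bread_g = grams_from_servings(2, serving_weight_g=28.0)
```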
Minimizing redundancy in the nutrient data base
Ongoing maintenance and expansion of a nutrient data base to include increasing numbers of nutrients and other food components can be efficiently accomplished only if the number of entries requiring routine maintenance can be kept to a minimum without losing the specificity required by the many users of the system.
All entries in the NCC Food Table are either elemental or composite entries. Elemental entries may be defined as entries for which nutrient values are maintained in the Food Table. Composite entries are defined by an ingredient list of two or more elemental entries with specified amounts. Nutrient values for composite entries are calculated from the ingredient list by the computer. This system limits the routine maintenance of the Food Table to the elemental entries.
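The distinction between elemental and composite entries can be illustrated with a small calculation. The entries and values are hypothetical, and the sketch ignores cooking yields and nutrient losses, which a real recipe calculation would need to consider.

```python
# Elemental entries: nutrient values (per 100 g) are stored and maintained.
ELEMENTAL = {
    "cooked rice": {"energy_kcal": 130.0, "fat_g": 0.3},
    "cheddar cheese": {"energy_kcal": 400.0, "fat_g": 33.0},
}

# Composite entries: only an ingredient list (elemental entry, grams) is stored.
COMPOSITE = {
    "cheese rice": [("cooked rice", 150.0), ("cheddar cheese", 30.0)],
}


def nutrients_for(code, grams):
    """Nutrients in `grams` of an elemental or composite entry."""
    if code in ELEMENTAL:
        return {name: value * grams / 100.0 for name, value in ELEMENTAL[code].items()}
    ingredients = COMPOSITE[code]
    recipe_weight = sum(ingredient_grams for _, ingredient_grams in ingredients)
    scale = grams / recipe_weight
    totals = {}
    for ingredient, ingredient_grams in ingredients:
        for name, value in nutrients_for(ingredient, ingredient_grams * scale).items():
            totals[name] = totals.get(name, 0.0) + value
    return totals


portion = nutrients_for("cheese rice", 90.0)  # half of the 180 g recipe
```

Because the composite entry stores no nutrient values of its own, an update to either elemental entry is reflected automatically the next time the composite is calculated.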
The following guidelines have been developed by the NCC to minimize redundancy in the Food Table by limiting the number of elemental entries.
1. Include Foods Only in the Forms in Which They Are Eaten
Since analysis of the dietary intake of individuals living in the United States and Canada is the common need of all current users of the NCC system, the Food Table need not include foods in forms in which they are never consumed by the study populations. Thus, no entries are included for most raw meats, uncooked pasta, or certain raw vegetables such as potatoes. Many foods commonly eaten in other countries do not appear in the NCC Food Table because they have not been encountered frequently enough on the dietary records received at the NCC. As various ethnic foods increase in popularity in this country, it is expected that more foreign foods will be added to the Food Table. Sushi is an example of a recent addition to the NCC Food Table.
For composite entries of cooked recipes, cooked ingredients are substituted for the raw ingredients whenever possible. For example, if a casserole is made with raw rice, the corresponding amount of cooked rice is entered as a recipe ingredient in the composite entry. Use of the nutrient values of cooked rather than raw ingredients makes the calculated nutrient content more similar to the actual nutrient content of the food as eaten. This system also reduces the number of elemental entries required in the Food Table by eliminating the need to maintain raw food entries as recipe ingredients for composite entries.
2. Combine Foods of Similar Nutrient Content into a Single Entry
When similar foods differ by less than the established limit values for each nutrient, the foods are grouped together in a single entry. For example, spinach cooked from fresh and cooked from frozen are combined in a single entry. If at some point a new nutrient or other food component is added to the Food Table for which the value in cooked fresh spinach differs from that in cooked frozen spinach by more than its established limit value, the two items would be given separate entries.
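The grouping rule can be expressed as a test applied to every nutrient that has an established limit. All values and limits below are invented for illustration; the example shows how adding a new component with its own limit could force two previously combined foods into separate entries.

```python
def can_share_entry(food_a, food_b, limits):
    """True if the two foods differ by no more than the limit for every nutrient."""
    return all(
        abs(food_a[name] - food_b[name]) <= limit
        for name, limit in limits.items()
    )


# Hypothetical values per 100 g and hypothetical limits.
from_fresh = {"energy_kcal": 23.0, "fat_g": 0.3, "sodium_mg": 70.0}
from_frozen = {"energy_kcal": 25.0, "fat_g": 0.3, "sodium_mg": 95.0}

limits_before = {"energy_kcal": 10.0, "fat_g": 0.5}
limits_after = {"energy_kcal": 10.0, "fat_g": 0.5, "sodium_mg": 20.0}

combined_before = can_share_entry(from_fresh, from_frozen, limits_before)
combined_after = can_share_entry(from_fresh, from_frozen, limits_after)
```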
3. Add New Foods as Composite Entries Rather than as Elemental Entries Whenever Possible
Even though some mixed dishes have been analysed for nutrients in the composite state, it is preferable to add them to the Food Table as composite entries rather than as elemental entries as long as the amounts of ingredients can be specified and the calculated nutrient values closely match the values provided for the composite item. Most home-prepared foods in the NCC Food Table are maintained as composite entries. Some commercial products with well-defined ingredients are also entered into the Food Table as composite entries. Approximately one-third of the current NCC Food Table consists of composite entries.
4. Use of Prep Codes and Fat Codes
"Prep codes" to specify amounts of fat added in various food preparation methods and "fat codes" to designate the type of fat used are other procedures used by the NCC to limit the number of elemental entries. These procedures have been described in detail elsewhere [1]. Prep codes prompt the appropriate computer algorithms to calculate the amount of fat, salt, or other additions for various food preparation methods of a basic food. Thus a single entry can be used for many different preparations of that item. For example, a piece of light-meat chicken without skin can be breaded and fried in corn oil, baked with butter, or broiled without added fat by invoking the appropriate prep codes and fat codes with the same elemental item.
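The way in which one elemental entry serves several preparations can be sketched as follows. The code names, the amounts of added fat, and the nutrient values are hypothetical, and the sketch handles only added fat, leaving out breading and salt.

```python
# Hypothetical prep codes: grams of fat added per 100 g of the basic food.
PREP_FAT_G_PER_100G = {
    "broiled_no_fat": 0.0,
    "baked_with_fat": 3.0,
    "breaded_fried": 8.0,
}

# Hypothetical fat codes: nutrient values per 100 g of the fat itself.
FAT_CODES = {
    "corn_oil": {"fat_g": 100.0, "sat_fat_g": 13.0},
    "butter": {"fat_g": 81.0, "sat_fat_g": 51.0},
}


def prepared_nutrients(base_per_100g, grams, prep_code, fat_code=None):
    """Nutrients in `grams` of a basic food prepared as the prep code specifies."""
    added_fat_g = PREP_FAT_G_PER_100G[prep_code] * grams / 100.0
    if added_fat_g > 0 and fat_code is None:
        raise ValueError("this prep code requires a fat code")
    result = {name: value * grams / 100.0 for name, value in base_per_100g.items()}
    if added_fat_g > 0:
        fat = FAT_CODES[fat_code]
        for name in result:
            result[name] += fat.get(name, 0.0) * added_fat_g / 100.0
    return result


# One elemental entry (hypothetical values), three preparations.
chicken = {"fat_g": 4.0, "sat_fat_g": 1.0}
fried = prepared_nutrients(chicken, 100.0, "breaded_fried", "corn_oil")
baked = prepared_nutrients(chicken, 100.0, "baked_with_fat", "butter")
broiled = prepared_nutrients(chicken, 100.0, "broiled_no_fat")
```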
5. Use of Add-Principal-Fat (APF) Recipes
Foods that contain significant ingredient or cooking fats are designated as APF recipes and are maintained in the NCC Food Table as composite entries. French fried potatoes, pie crust, and salad dressing are examples. All APF recipes require specification of the type or brand of the predominant fat ingredient. Thus, the nutrients in corn bread could be calculated using bacon fat, soybean oil, shortening, or any other appropriate fat. This system allows considerable flexibility for specifying ingredient and cooking fats without increasing the number of entries in the Food Table.
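An APF recipe can be thought of as a composite entry with an open slot for the principal fat, filled in at coding time. The recipe, its amounts, and the placeholder name are hypothetical.

```python
PRINCIPAL_FAT = "<principal fat>"  # placeholder resolved when the record is coded

# Hypothetical APF recipe: (ingredient, grams).
CORN_BREAD_RECIPE = [
    ("cornmeal", 150.0),
    ("milk", 240.0),
    ("egg", 50.0),
    (PRINCIPAL_FAT, 55.0),
]


def resolve_apf_recipe(recipe, fat_entry):
    """Return the ingredient list with the principal-fat slot filled in."""
    if not any(ingredient == PRINCIPAL_FAT for ingredient, _ in recipe):
        raise ValueError("recipe has no principal-fat slot")
    return [
        (fat_entry if ingredient == PRINCIPAL_FAT else ingredient, grams)
        for ingredient, grams in recipe
    ]


with_bacon_fat = resolve_apf_recipe(CORN_BREAD_RECIPE, "bacon fat")
with_soybean_oil = resolve_apf_recipe(CORN_BREAD_RECIPE, "soybean oil")
```

The stored recipe is left unchanged, so a single composite entry covers every fat that might be specified.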
6. Use of Coding Guides for Brand-name Products and Food Characteristics
Coding guides are alphabetical indices of specific types, classes, or brands of foods that designate the particular NCC Food Table entries into which they are classified. For example, the Brand Name Margarine Guide specifies which of the approximately 60 margarine entries in the Food Table should be used for each of approximately 400 brand-name margarines. The Beef Guide specifies which Food Table entry should be used for each type or cut of beef. The current NCC system includes approximately 80 guides. Each guide also includes directions for coding items when the type or brand is not specified.
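A coding guide behaves like a lookup table with a default for unspecified brands. The brand names and entry codes below are invented; returning None for a brand that is not in the guide is an assumption made here to represent an item that would be passed to the nutrition staff.

```python
# Hypothetical excerpt of a brand-name guide: brand -> Food Table entry code.
MARGARINE_GUIDE = {
    "brand a stick": "MARG-STICK-01",
    "brand b soft tub": "MARG-TUB-07",
}
UNSPECIFIED_MARGARINE = "MARG-UNKNOWN"  # default entry when no brand is reported


def code_for_brand(guide, brand, default_code):
    """Return the entry code for a brand, the default if the brand is not
    specified, or None if the brand is not listed in the guide."""
    if brand is None or not brand.strip():
        return default_code
    key = " ".join(brand.lower().split())  # normalize case and spacing
    return guide.get(key)


listed = code_for_brand(MARGARINE_GUIDE, "  Brand A   Stick ", UNSPECIFIED_MARGARINE)
unspecified = code_for_brand(MARGARINE_GUIDE, None, UNSPECIFIED_MARGARINE)
unlisted = code_for_brand(MARGARINE_GUIDE, "Brand Z", UNSPECIFIED_MARGARINE)
```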
7. Handling of "Uncodables"
When an item appearing on a dietary intake record cannot be coded according to established procedures, the item is documented on an Uncodable Form to be resolved by the nutrition staff. If the uncodable item is a new product on the market, ingredient and nutrient information is requested from the manufacturer to determine proper coding. If the uncodable item is a composite food consisting of unknown amounts of ingredients, the nutritionist makes a judgement on proportions. Decisions on the coding of uncodable items are stored in a cross-referenced file to facilitate standardization for future coding decisions. Uncodables that begin to appear frequently on intake records are added to the Food Table as new entries. An example of an uncodable recently converted to a new entry is trail mix. The uncodables system prevents the unwieldy expansion of the Food Table that would result from the inclusion of many infrequently consumed items.
Use of the various procedures outlined above effectively limits the number of elemental entries in the NCC Food Table. Thus, updating can be routinely implemented and new nutrients added with minimum effort while maintaining maximum flexibility for detailed specificity to meet user needs.
A table of food composition designed to meet the ongoing nutrient analysis needs of multiple research studies must be continually updated and expanded. Standardized methods of updating and imputing nutrient values must be established, and sources of all values must be carefully documented. Computerized edit checks and other quality-control procedures must be incorporated into the system to ensure the accuracy of the data base.
To accommodate the needs of multiple long-term studies, a number of versions of the data base are maintained by the NCC. Flexibility is provided to meet different levels of specificity of dietary detail required by different research protocols.
Procedures have been established by the NCC to facilitate ongoing maintenance of the data base without loss of the specificity required by the users of the system.
References
1. B. Dennis, N. Ernst, M. Hjortland, J. Tillotson, and V. Grambsch, "The NHLBI Nutrition Data System," J. Am. Diet. Assoc., 77: 641-647 (1980).
2. Lipid Research Clinics Epidemiology Committee, "Plasma Lipid Distributions in Selected North American Populations. The LRC Program Prevalence Study," Circulation, 60: 427 (1979).
3. Lipid Research Clinics Program, "The Coronary Primary Prevention Trial. Design and Implementation," J. Chronic Dis., 32: 609 (1979).
4. Multiple Risk Factor Intervention Trial Group, "Statistical Design Considerations in the NHLBI Multiple Risk Factor Intervention Trial (MRFIT)," J. Chronic Dis., 30: 261 (1977).