Food Composition Data: A User's Perspective (1987)
Report and recommendations of the conference
A food composition data system
Users obviously need more than just data - they need the machinery to interact with these data. This aspect of the subject can be discussed under the rubric "food data systems," used to describe the data and all the programmes or tasks involved in keeping the data relevant and available to the user.
The first point to be made is the distinction between those data bases and systems that are tailored for a single specific purpose or task, and those that attempt to be general purpose. This distinction is discussed at length by Hoover (paper 10) in terms of two tiers of users. It is important to note that the specific data bases are constructed from the general, and the validity of a special-purpose data base depends on the validity of both the data in the general data base and the procedures by which they were selected.
The design and building of specific-purpose nutrient data bases and systems are straightforward since such systems can usually be completely defined in advance (although in fact they rarely are). Additionally, the actual data involved are usually fixed for the duration of the task. Much of the design effort here is focused on the user interface - making the system easy to use. A number of commercial firms supply such systems [5, 6]; however, a standard problem is that documentation of the source and quality of the data is frequently missing, leaving the user without guidance in this area.
General or broad-purpose nutrient data systems tend to focus on the data rather than on the details of interface with the user, although all systems must address this latter aspect. A number of papers given at the conference address the management issues (papers 7, 10, 12, and 13), others describe the magnitude and complexity of the task (papers 11, 15, and 17), and paper 21 focuses on the tools and concepts available to the system designer. The major points are summarized below, organized into the three categories of (a) the data themselves, (b) documentation of the data, and (c) preparation of subsets of the data.
The Data Themselves
The ideal general-purpose data base contains all the data that anyone might need, in a form that makes them readily accessible for any purpose. To approach this ideal it is necessary to be concerned with the following areas.
The data base must be updated continually and aggressively with new foods and new analyses, including re-analyses with better techniques, analyses of new products on the market, and new formulations of existing products. Thus, standardized procedures must be implemented for routine collection of new nutrient information from available sources, including governmental publications, the scientific literature, and manufacturers' data on commercial products.
An important aspect of adding data to a data base is that each new piece of data must be carefully evaluated for reliability. Moreover, all nutrient data files should be routinely checked for consistency, to identify possible anomalies and errors in the data. Such procedures could include, for example, comparing nutrient values within food groups or comparing actual data with predicted values. For instance, the weights of the macronutrients plus ash and water should theoretically sum to 100 grams per 100 grams of food, while the sum of the calorie contributions of each macronutrient (including alcohol) can be compared with the total value for calories.
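The two consistency checks just mentioned might be sketched as follows. This is a minimal illustration, not a prescribed procedure: the `FoodRecord` layout, the function names, and the tolerances are assumptions made for the example. So is the use of the general energy factors of 4, 9, 4, and 7 kcal per gram for protein, fat, carbohydrate, and alcohol, and the decision to count alcohol in the weight sum.

```python
from dataclasses import dataclass

# General energy factors in kcal per gram (an assumption of this sketch;
# a data base may instead use food-specific factors).
KCAL_PER_G = {"protein": 4.0, "fat": 9.0, "carbohydrate": 4.0, "alcohol": 7.0}


@dataclass
class FoodRecord:
    """Hypothetical record layout; all amounts are per 100 g of food."""
    name: str
    water: float
    protein: float
    fat: float
    carbohydrate: float
    ash: float
    alcohol: float = 0.0
    energy_kcal: float = 0.0  # energy value as stated in the data base


def proximate_sum(food):
    """Weight of water, ash, and macronutrients (alcohol included), in grams."""
    return (food.water + food.ash + food.protein + food.fat
            + food.carbohydrate + food.alcohol)


def predicted_energy(food):
    """Energy predicted from the macronutrient contributions, in kcal."""
    return sum(getattr(food, name) * factor for name, factor in KCAL_PER_G.items())


def consistency_flags(food, mass_tol=3.0, energy_tol=0.10):
    """Return human-readable warnings; an empty list means no anomaly was found.

    mass_tol is in grams, energy_tol is a fraction of the stated energy.
    Both defaults are illustrative, not recommended values.
    """
    flags = []
    total = proximate_sum(food)
    if abs(total - 100.0) > mass_tol:
        flags.append(f"{food.name}: proximates sum to {total:.1f} g per 100 g")
    predicted = predicted_energy(food)
    if food.energy_kcal > 0 and abs(predicted - food.energy_kcal) > energy_tol * food.energy_kcal:
        flags.append(
            f"{food.name}: stated {food.energy_kcal:.0f} kcal, predicted {predicted:.0f} kcal"
        )
    return flags
```

A flag is only a prompt for review. A discrepancy may reflect a data-entry error, a different analytical method, or a legitimate property of the food.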
Having confidence in the individual data is one aspect of the question of the reliability of a data base. A check on the working of the entire data base, including a check on calculation procedures, can be provided by calculation of a selected, carefully constructed set of dietary records. Such a test should be routinely carried out, with disagreement between successive runs carefully investigated and explained.
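A routine test of this kind could be organized along the following lines. The sketch assumes a dietary record is a list of (food code, grams) pairs and that the nutrient table maps each food code to values per 100 g. The function names and the tolerance are hypothetical.

```python
import math


def nutrient_totals(record, table):
    """Total nutrients for one dietary record.

    record: list of (food_code, grams) pairs
    table:  {food_code: {nutrient: amount per 100 g}}
    """
    totals = {}
    for code, grams in record:
        for nutrient, per_100g in table[code].items():
            totals[nutrient] = totals.get(nutrient, 0.0) + per_100g * grams / 100.0
    return totals


def compare_runs(previous, current, rel_tol=1e-6):
    """Return {nutrient: (old, new)} for every total that changed between runs.

    A nutrient missing from one run is reported with None on that side.
    """
    differences = {}
    for nutrient in sorted(set(previous) | set(current)):
        old = previous.get(nutrient)
        new = current.get(nutrient)
        if old is None or new is None or not math.isclose(
            old, new, rel_tol=rel_tol, abs_tol=1e-9
        ):
            differences[nutrient] = (old, new)
    return differences
```

In use, the totals from the previous run would be stored and compared with the totals recomputed after each update of the data or the software. Any entry returned by `compare_runs` should be traceable to a deliberate change.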
Documentation of the Data
General-purpose data bases need to contain information about the source and quality of each of their data points. At a minimum, the user should be able to trace back each piece of data either to a source document or, in the case of analytic data, a laboratory reference; or, if it is estimated, it should be possible to ascertain just how this was done and from what other data. Moreover, it is important to maintain older data as part of the system. In the case of foods and food preparations which have been modified or are no longer on the market, data should be retained for comparison purposes, and so that dietary information collected in the past can be evaluated.
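One way to picture these minimum requirements is a record that carries its provenance with it. The structure below is a hypothetical sketch: the field names, the three source categories, and the `retired` marker for superseded values are assumptions made for illustration, not a layout proposed at the conference.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class SourceType(Enum):
    DOCUMENT = "source document"
    LABORATORY = "laboratory analysis"
    ESTIMATED = "estimated"


@dataclass(frozen=True)
class DataPoint:
    food_code: str
    nutrient: str
    value: float
    unit: str
    source_type: SourceType
    reference: str = ""                 # citation or laboratory reference
    method: str = ""                    # how an estimate was made
    derived_from: Tuple[str, ...] = ()  # food codes an estimate was based on
    retired: Optional[str] = None       # date the value was superseded, if any


def is_traceable(point):
    """True if the value can be traced back as the text requires."""
    if point.source_type is SourceType.ESTIMATED:
        return bool(point.method) and bool(point.derived_from)
    return bool(point.reference)


def current_values(points):
    """Values in force now; retired values stay in the full list for
    evaluating dietary information collected in the past."""
    return [p for p in points if p.retired is None]
```

Marking older values as retired rather than deleting them keeps them available for comparison purposes.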
Preparation of Data for the Ultimate User
A major responsibility of the general-purpose data system is to prepare subsets of its data for the "front-line" users - these are the special-purpose data bases mentioned above. In order to do this at all well, such a system must support a flexible query language, an information data base that adequately describes the data, and sufficient manipulative machinery. Areas of specific importance are:
Access to the Data
The system should provide a variety of different ways to access the data. For example, foods should be indexed by food group and type of processing and preparation undergone, as well as by common name and food code number. Moreover, linkages to other data, such as food-specific quantity units, are also an essential part of retrieving the necessary data.
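Multiple access paths of this kind can be sketched with a small in-memory index. The class, its method names, and the idea of storing quantity units as a mapping from unit name to grams are assumptions made for the example. A real system would more likely rely on a data base management system.

```python
from collections import defaultdict


class FoodIndex:
    """Hypothetical index giving access by code, name, group, and processing."""

    FIELDS = ("name", "group", "processing")

    def __init__(self):
        self.by_code = {}                # food code -> descriptive fields
        self.by_key = defaultdict(set)   # (field, value) -> set of food codes
        self.units = {}                  # food code -> {unit name: grams}

    def add(self, code, name, group, processing, units=None):
        entry = {"name": name, "group": group, "processing": processing}
        self.by_code[code] = entry
        for field in self.FIELDS:
            self.by_key[(field, entry[field].lower())].add(code)
        self.units[code] = dict(units or {})

    def find(self, **criteria):
        """Codes matching all criteria, e.g. find(group="fruit", processing="raw")."""
        if not criteria:
            return sorted(self.by_code)
        matches = [self.by_key.get((field, value.lower()), set())
                   for field, value in criteria.items()]
        return sorted(set.intersection(*matches))

    def grams(self, code, unit, count=1.0):
        """Convert a food-specific quantity unit to grams."""
        return self.units[code][unit] * count
```

The `grams` lookup stands for the linkage to food-specific quantity units: a reported quantity such as "2 medium" has to be converted to a weight before nutrient values can be retrieved.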
Aggregation of Data
Many users require data on quite general foods (for example, "apples" rather than "Red Delicious apples"). A general-purpose data base often contains some of these entries, with nutrient levels estimated by combining the data of several specific foods for which analytic data exist. It is essential that the data base include information on just how these estimations were calculated, and, further, that it provide the information, and perhaps the machinery, necessary for the users to make further combinations of data to suit their specific purposes.
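As an illustration, a general entry could be derived as a weighted mean of specific foods, with the method and weights stored alongside the result so that users can see, and repeat, the calculation. The function name, the record layout, and the choice to average only nutrients present for every component are assumptions of this sketch.

```python
def aggregate(name, components, weights=None):
    """Build a general food entry from specific foods.

    components: {food_code: {nutrient: amount per 100 g}}
    weights:    {food_code: relative weight}, e.g. market shares;
                equal weights are used if none are given.
    """
    if not components:
        raise ValueError("at least one component food is required")
    codes = sorted(components)
    if weights is None:
        weights = {code: 1.0 for code in codes}
    total_weight = sum(weights[code] for code in codes)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")

    # Average only nutrients reported for every component, so that a
    # missing value is never silently treated as zero.
    shared = set.intersection(*(set(components[code]) for code in codes))
    values = {
        nutrient: sum(components[code][nutrient] * weights[code] for code in codes)
        / total_weight
        for nutrient in sorted(shared)
    }
    return {
        "name": name,
        "values": values,
        "method": "weighted mean of component foods",
        "weights": {code: weights[code] / total_weight for code in codes},
    }
```

Because the normalized weights are returned with the values, a user with different needs can supply other weights and rerun the same calculation.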
Presentation of Data
Presentation of data, either on a screen or in hard-copy reports, needs to be flexible to permit the design of special-purpose formats to meet specific user needs. For example, options for presentation of data should permit the display of calculated nutrients as a percentage of calories, or other calculated combinations of nutrient values, such as saturated fat as a percentage of total fat or in ratio to polyunsaturated fat. Other options might include comparison of calculated nutrient intakes with recommended standards for specific age-sex groups, or the reporting of nutrients for each individual food item, for single meals, for single days, or for the average of multiple days.
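Several of these derived values are simple to compute once intakes have been totalled. The sketch below assumes intakes are held as a mapping from nutrient name to amount, uses the general energy factors of 4, 9, 4, and 7 kcal per gram, and takes total calories to be the sum of the macronutrient contributions. The nutrient keys and function names are hypothetical, and any reference standard passed in must come from the user's own source.

```python
KCAL_PER_G = {"protein_g": 4.0, "fat_g": 9.0, "carbohydrate_g": 4.0, "alcohol_g": 7.0}


def percent_of_energy(intake):
    """Share of calculated calories supplied by each macronutrient, in percent."""
    energy = {n: intake.get(n, 0.0) * factor for n, factor in KCAL_PER_G.items()}
    total = sum(energy.values())
    if total == 0:
        return {}
    return {n: 100.0 * kcal / total for n, kcal in energy.items()}


def fat_profile(intake):
    """Saturated fat as a percentage of total fat, and the P/S ratio.

    A value is None when its denominator is missing or zero.
    """
    fat = intake.get("fat_g", 0.0)
    saturated = intake.get("saturated_fat_g", 0.0)
    polyunsaturated = intake.get("polyunsaturated_fat_g", 0.0)
    return {
        "saturated_pct_of_fat": 100.0 * saturated / fat if fat else None,
        "p_s_ratio": polyunsaturated / saturated if saturated else None,
    }


def percent_of_standard(intake, standard):
    """Intake as a percentage of a recommended standard, nutrient by nutrient."""
    return {n: 100.0 * intake[n] / ref
            for n, ref in standard.items() if n in intake and ref}


def average_intake(days):
    """Mean daily intake over several days; a nutrient absent on a day counts as 0."""
    nutrients = sorted({n for day in days for n in day})
    return {n: sum(day.get(n, 0.0) for day in days) / len(days) for n in nutrients}
```

The same functions can be applied to the totals for a single food item, a meal, a day, or the output of `average_intake`, which is what makes the reporting level a presentation option rather than a property of the data.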