Food Composition Data: A User's Perspective (1987)
Systems considerations in the design of INFOODS
The International Network of Food Data Systems (INFOODS) was organized in 1982 as a global collaborative of people and organizations interested in working towards improving the amount, quality, and availability of food composition data. Currently it is focusing on the development of standards and guidelines for (a) the terminologies and nomenclatures used in describing foods and food components, (b) the gathering of food composition data, including sampling, assay, and reporting procedures, and (c) the storing and interchange of food composition data.
INFOODS is co-ordinated by a small secretariat, based at the Massachusetts Institute of Technology, which has responsibility for initiation, co-ordination, and administration of international task forces to work on specific problems. Additionally, this secretariat serves as a resource for, and clearing-house for information about, food composition activities around the world. INFOODS works with, and is organizing where necessary, regional groups throughout the world; these provide information and assistance for food composition work in their geographic areas. INFOODS presently is funded primarily by the United States government and administratively supported by the United Nations University.
It is generally assumed that the major product of INFOODS will be one or two integrated computer systems for nutrient and nutritional data. In terms of both technical problems and the requirements of different groups of users, that goal presents serious challenges. It is useful to review those challenges and the reasons why a different strategy may be in order.
Technical problems and user requirements are challenges because they raise questions to which we do not know the answers, as well as several to which, at this point, we probably do. The validity of our belief in the answers we have depends on whether certain analogies hold between systems for the management, recording, and analysis of nutrition data and those for other types of data - especially statistical and social measurement data - and their scientific application. In addition, a recurring theme in systems design is that large systems usually involve complex choices to which there are few "correct" answers. Instead, there are many trade-offs in which the needs and preferences of one group are optimized at the expense of others. Making these choices explicitly and with an understanding of their implications, and remembering, far into the future, the reasons for the options chosen, tends to promote better systems that are both more internally consistent and more consistent with their avowed goals. Inevitably, even when choices are made explicitly and the reasons for them are remembered, some of the decisions will turn out, as time passes, to have been wrong. As a result, one of the major challenges - almost a meta-challenge - is designing for damage containment, to ensure that a few wrong decisions do not result in the total uselessness of the system or the need to rebuild it from scratch. An understanding of how the wrong decisions were arrived at contributes to containment of the damage.
One theme that is not important is the question of "personal" v. "large" computers as ends in themselves. There can be specific reasons for choosing smallish machines - cost, space, even the psychological advantage of being able to pull the thing's plug if it behaves offensively; and there are also some reasons for choosing large ones (or complexes of small ones) - economies of scale, the ability to retain large data bases of foods or consumption histories, and convenient sharing of information among scientists. But in discussing the reasons for choosing a machine we should not get drawn into a debate about the relative merits of small and large computers as such. It is especially important to avoid that debate because a mixed strategy, in which different equipment is used for different operations, may be the best overall approach given the present state of the art.
Before discussing the issues, challenges, and problems involved in trying to construct integrated systems, we should look at the question of why such systems should be considered. Small non-integrated systems have several advantages. They are typically cheaper to build and easier to maintain, and do not require large team efforts over a long period of time. Perhaps as important is one of the major discoveries of the microcomputer revolution - that considerable "friendliness" is a characteristic of machines that are not very capable. When capability is limited, it becomes possible to list all the commands, to list all the options, and to provide clear error messages that identify all choices. In other words, a message such as "No, you cannot type that answer, you must use one of the following three" is a reasonable and possible option. It is neither reasonable nor possible if there are tens of options to a particular command. Nor is it feasible to respond to an inquiry about what a command is called by listing all commands when there are several hundred from which to choose. The limited environments of small and unintegrated systems also tend to make them comparatively easy to document.
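The point about limited capability making clear error messages possible can be illustrated with a minimal sketch. The three-command set and the `respond` function below are hypothetical, invented only for this illustration; they do not describe any actual INFOODS software.

```python
# Hypothetical command set for a very small food-composition lookup tool.
# With only three commands, an error message can name every valid choice.
COMMANDS = {
    "list": "show the names of all foods on file",
    "show": "display the component values for one food",
    "quit": "leave the program",
}


def respond(entry: str) -> str:
    """Acknowledge a valid command, or reject the entry and list all choices."""
    word = entry.strip().lower()
    if word in COMMANDS:
        return f"OK: {word} ({COMMANDS[word]})"
    choices = ", ".join(sorted(COMMANDS))
    return (
        f"No, you cannot type '{entry.strip()}'; "
        f"you must use one of the following {len(COMMANDS)}: {choices}"
    )


if __name__ == "__main__":
    print(respond("show"))
    print(respond("delete"))
```

The same approach stops being helpful once `COMMANDS` holds several hundred entries, since the rejection message would then have to list them all.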
In this paper, large-scale systems are assumed to be groups of programs that provide a more or less common face to users, that permit free movement of data and intermediate results between different commands or other program components and analyses, and that let the user determine the order and content of both analyses and display formats. Such assumptions make the large-scale system a different type of object, rather than just a larger one, from most traditional program packages or packaged programs.
If one can figure out what is to be done with the data and what analytic and accessing capabilities are needed, it is often quite possible to design a collection of several medium-sized programs or small-scale systems, intended for quite different purposes and users and having different interfaces, that operate from a single data base. In terms of the complexities of getting the data-base design right, that type of arrangement raises the same issues as the large-scale system, but it is much easier from a software design standpoint. Also, the individual programs may be much easier to fit onto a small machine than a complete large-scale system would be. So that is one of the alternatives to be considered.
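As a rough sketch of this alternative, two small programs with different interfaces can read from one shared data base. Everything here is an assumption made for illustration: the table layout, the function names, and the food names and amounts, which are placeholders and not real composition values.

```python
import sqlite3


def build_database() -> sqlite3.Connection:
    """Create the single shared data base (in memory, with placeholder rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE composition (food TEXT, component TEXT, amount REAL)"
    )
    # Placeholder values only; they are not measured food composition data.
    conn.executemany(
        "INSERT INTO composition VALUES (?, ?, ?)",
        [
            ("food_a", "protein", 2.0),
            ("food_a", "fat", 1.0),
            ("food_b", "protein", 4.0),
            ("food_b", "fat", 3.0),
        ],
    )
    return conn


def lookup_food(conn: sqlite3.Connection, food: str) -> dict:
    """Small program 1: look up every component recorded for one food."""
    rows = conn.execute(
        "SELECT component, amount FROM composition WHERE food = ?", (food,)
    )
    return dict(rows.fetchall())


def mean_by_component(conn: sqlite3.Connection) -> dict:
    """Small program 2: average each component across all foods."""
    rows = conn.execute(
        "SELECT component, AVG(amount) FROM composition GROUP BY component"
    )
    return dict(rows.fetchall())


if __name__ == "__main__":
    db = build_database()
    print(lookup_food(db, "food_a"))
    print(mean_by_component(db))
```

Both functions depend on the same table design, which is why getting the data-base design right matters as much here as in a large-scale system, even though each program stays small.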
A potential advantage of large systems is that they should be able to provide a user with more flexibility. At their best, they contain a wider resource base - more tools that can be applied - for most situations. If designed well, they should have a longer life expectancy than smaller systems because they can be extended further and can be used in more innovative and creative ways, including ways not anticipated by their designers. Larger systems can support a wider variety of models and analyses, and consequently permit comparisons among techniques. Such additional analytic capabilities are usually supplemented by facilities for handling large or complex data sets that are beyond the capabilities of a small system.
Most of the issues raised in this paper apply to smaller systems as well as larger ones, but become much more important as systems become larger. The would-be developers of a large system must consider these issues in the early design stages to avoid severe problems later on. The major challenges are easily stated: planning and designing what the system is to do and how to implement it, and then testing those ideas. Also essential, although seemingly obvious, is that resources adequate to the task be available not only at the beginning but also over a long enough span to do the entire job. The best long-term strategies, which focus on building tools and prototypes and conducting experiments before a final commitment is made, tend to be poor short-term ones from the standpoint of sponsors or agencies looking for results. The fear of ending up with only a prototype when the resources run out has prevented many prototypes from being built, and as a result many problems have occurred in production that would have been easily identified and eliminated in prototype experiments. The resource issue will not be addressed here, except to note that tasks usually take much longer and cost much more than expected.
It is worth noting that a very large fraction of the time and cost overruns in computer system building and programming can be attributed to a lack of clarity about what the goals are, what users are to be served, and what facilities are to be incorporated. Clear thinking, careful analysis of requirements and constraints imposed by the users (as distinct from ones imposed by real or imagined technical considerations), and careful design consistent with that thinking and analysis are usually richly rewarded, and the failure to perform such thinking and analysis is equally richly punished.