Food Composition Data: A User's Perspective (United Nations University - UNU, 1987, 223 pages)

Data representations

Most integrated environments use some kind of file system to retain information - a worksheet, a special system file, or even a full data-base management system. In addition to providing a compact way to save information during and between sessions, these files can serve as the mechanism by which all user commands, other than those intended to read raw data into files or to display the contents of files, communicate with each other and with the users. In other words, computational commands do not read, clean, or process raw files, nor do they print results. Having commands work this way ensures (given adequate data representations) that any command can use the outputs of any other appropriate command as inputs. That level of compatibility applies to commands written in the future, as the system is extended, as well as to those designed initially. This type of strategy is also complementary to agent strategies and to primitive tool-based systems (discussed below).
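
The arrangement just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any actual INFOODS component: the `Workspace` class, the command names (`read_raw`, `scale`, `column_mean`, `display`), and the column-oriented data set layout are all hypothetical, and an in-memory dictionary stands in for the worksheet or system file.

```python
from typing import Dict, List

# Assumed layout for this sketch: a data set maps column names to lists of numbers.
DataSet = Dict[str, List[float]]


class Workspace:
    """Hypothetical stand-in for the shared worksheet or system file."""

    def __init__(self) -> None:
        self._sets: Dict[str, DataSet] = {}

    def save(self, name: str, data: DataSet) -> None:
        self._sets[name] = data

    def load(self, name: str) -> DataSet:
        return self._sets[name]


def read_raw(ws: Workspace, name: str, text: str) -> None:
    """Input command: the only place where raw text is parsed."""
    lines = [line.split(",") for line in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    data = {col.strip(): [float(row[i]) for row in rows]
            for i, col in enumerate(header)}
    ws.save(name, data)


def scale(ws: Workspace, source: str, target: str, factor: float) -> None:
    """Computational command: reads a stored data set, writes a stored data set."""
    data = ws.load(source)
    ws.save(target, {col: [v * factor for v in vals] for col, vals in data.items()})


def column_mean(ws: Workspace, source: str, target: str) -> None:
    """Computational command: its result is itself a data set, so it can be reused."""
    data = ws.load(source)
    ws.save(target, {col: [sum(vals) / len(vals)] for col, vals in data.items()})


def display(ws: Workspace, name: str) -> str:
    """Output command: the only place where results are formatted for the user."""
    data = ws.load(name)
    return "\n".join(f"{col}: {vals}" for col, vals in data.items())


ws = Workspace()
read_raw(ws, "foods", "protein,fat\n10,2\n20,4\n")
scale(ws, "foods", "foods_x2", 2.0)      # output of one command...
column_mean(ws, "foods_x2", "means")     # ...is the input of the next
report = display(ws, "means")
```

Because `scale` and `column_mean` touch only stored data sets, a command added later can consume their results without either command being modified, provided the shared representation is adequate for what the new command needs.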

As a system-building approach, such a strategy has a long tradition in statistical and social science computation [8, 9]. At the same time, many users find it inconvenient (unless it is hidden) for trivial sets of operations. It also leads to inconvenience and unpredictability when one discovers, late in the life of a system, that the data representation forms are inadequate, that there is no mechanism for cleanly extending them, and that the only practical solution is to have some commands that simply print results. For example, some statistical systems in the recent past have run into major difficulties as the requirements of new or proposed procedures forced a choice: either move from columns and data matrices to symmetric matrices and multi-dimensional arrays, or accept that some routines would display results that could not be captured in the file system. We are aware of several situations in which systems have been reorganized internally in major ways, requiring users to convert data sets, in order to cope with these problems as they unfold. Naturally enough, the problems tend to be buried as much as possible, rather than being cited explicitly in the literature.
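
The difficulty with an inadequate representation can be made concrete with a small sketch. Everything here is hypothetical and chosen only for illustration: `MatrixStore` is an imagined file system that accepts nothing but rectangular data matrices, and `three_way_counts` is an imagined newer procedure whose natural result is a three-dimensional array.

```python
class MatrixStore:
    """Hypothetical store whose only representation is a 2-D data matrix."""

    def __init__(self):
        self._items = {}

    def save(self, name, value):
        # Accept only a non-empty rectangular list of rows of plain numbers.
        fits = (
            isinstance(value, list) and len(value) > 0
            and all(isinstance(row, list) for row in value)
            and all(isinstance(x, (int, float)) for row in value for x in row)
            and len({len(row) for row in value}) == 1
        )
        if not fits:
            raise TypeError(f"{name!r} does not fit the data-matrix representation")
        self._items[name] = value

    def load(self, name):
        return self._items[name]


def three_way_counts(store, source, target):
    """Count observations classified by three binary codes (a 2 x 2 x 2 array)."""
    counts = [[[0, 0], [0, 0]], [[0, 0], [0, 0]]]
    for a, b, c in store.load(source):
        counts[a][b][c] += 1
    try:
        store.save(target, counts)
        return "saved"
    except TypeError:
        # The result cannot be captured in the store, so the command can
        # only hand back text for display; no later command can reuse it.
        return "printed only: " + repr(counts)


store = MatrixStore()
store.save("codes", [[0, 1, 1], [0, 1, 1], [1, 0, 0]])
outcome = three_way_counts(store, "codes", "table")
```

In this sketch the procedure is forced into the second branch: its result is displayed but never stored, which is the kind of mixed strategy the paragraph above warns about. The alternative, widening what `MatrixStore.save` accepts, corresponds to extending the representation and possibly converting existing data sets.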

To a degree, the more heavily a system relies on a single fixed set of data structures, the more dependent it becomes on the correctness of those data structures and file representations; that dependence is a technical drawback of the approach. So once again there is a challenge in trying to make the right decision - in balancing the compatibility advantages against convenience for trivial tasks and against the risk of having to use a mixed strategy or undertake a major redesign if the data representations prove inadequate to future developments. There are some alternative methods for data conversion, such as globally changing all files that would not exist in an integrated environment. However, a conversion of broad scope would threaten the integrity of multiple dependent programs operating off a common data base or data representation, as well as that of a more structured system.