Food Composition Data: A User's Perspective (1987)

Data representations

Most integrated environments use some kind of file system to retain information: a worksheet, a special system file, or even a full data-base management system. In addition to providing a compact way to save information during and between sessions, these files can serve as the mechanism by which commands communicate with each other and with users. The only exceptions are the commands intended to read raw data into files and to display the contents of files. In other words, computational commands do not read, clean, or process raw files, nor do they print results. Having commands work this way ensures (given adequate data representations) that any command can use the outputs of any other appropriate command as inputs. That level of compatibility applies to commands written in the future, as the system is extended, as well as to those designed initially. This type of strategy is also complementary to agent strategies and to primitive tool-based systems (discussed below).
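
The arrangement described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the JSON file layout, the function names (`save_dataset`, `cmd_scale`, `cmd_means`, `cmd_display`), and the sample nutrient columns are hypothetical and are not taken from INFOODS or from any system the text refers to. The point is only that computational commands read and write the common file representation, while a separate display command is the sole producer of user-facing output.

```python
import json
import os
import tempfile


def save_dataset(path, columns):
    """Write a dataset (column name -> list of numbers) in the common file format."""
    with open(path, "w") as f:
        json.dump({"columns": columns}, f)


def load_dataset(path):
    """Read a dataset back from the common file format."""
    with open(path) as f:
        return json.load(f)["columns"]


def cmd_scale(in_path, out_path, column, factor):
    """Computational command: dataset file in, dataset file out, nothing printed."""
    columns = load_dataset(in_path)
    columns[column] = [value * factor for value in columns[column]]
    save_dataset(out_path, columns)


def cmd_means(in_path, out_path):
    """Computational command: stores each column mean as a one-element column."""
    columns = load_dataset(in_path)
    means = {name: [sum(vals) / len(vals)] for name, vals in columns.items()}
    save_dataset(out_path, means)


def cmd_display(in_path):
    """Display command: the only step that turns a dataset file into text for a user."""
    columns = load_dataset(in_path)
    return "\n".join(f"{name}: {vals}" for name, vals in sorted(columns.items()))


def demo():
    """Chain two computational commands, then display the final file."""
    workdir = tempfile.mkdtemp()
    raw = os.path.join(workdir, "raw.json")
    scaled = os.path.join(workdir, "scaled.json")
    summary = os.path.join(workdir, "summary.json")
    # Hypothetical sample data standing in for an already-loaded raw file.
    save_dataset(raw, {"protein_g": [2.0, 4.0], "fat_g": [1.0, 3.0]})
    cmd_scale(raw, scaled, "protein_g", 10.0)
    cmd_means(scaled, summary)
    return cmd_display(summary)
```

Because `cmd_means` writes the same kind of file it reads, its output could in turn be passed to `cmd_scale` or to any command added later, which is the compatibility property the paragraph describes.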

As a system-building approach, such a strategy has a long tradition in statistical and social science computation [8, 9]. At the same time, many users find it inconvenient (unless it is hidden) for trivial sets of operations. It also leads to inconvenience and unpredictability when one discovers, late in the life of a system, that the data representation forms are inadequate, that there is no mechanism for cleanly extending them, and that the only practical solution is to have some commands that simply print results. For example, some statistical systems in the recent past have run into major difficulties as the requirements of new or proposed procedures forced a choice: either move from columns and data matrices to symmetric matrices and multi-dimensional arrays, or accept that some routines would display results that could not be captured in the file system. We are aware of several situations in which systems have been reorganized internally in major ways, requiring users to convert data sets, in an attempt to cope with these problems as they unfold. Naturally enough, the problems tend to be buried as much as possible rather than cited explicitly in the literature.
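
The representation problem in the example above can be made concrete with a small sketch. It assumes a hypothetical record format in which every stored object carries a `kind` tag, so that a symmetric matrix can be captured in a file rather than only printed. The tag, the field names, and the functions `covariance_matrix`, `pack_symmetric`, and `unpack_symmetric` are illustrative assumptions, not the design of any system the text cites.

```python
def covariance_matrix(columns):
    """Return (names, matrix): the sample covariance matrix of equal-length columns."""
    names = sorted(columns)
    n = len(columns[names[0]])
    means = {name: sum(columns[name]) / n for name in names}
    matrix = [
        [
            sum(
                (columns[a][i] - means[a]) * (columns[b][i] - means[b])
                for i in range(n)
            ) / (n - 1)
            for b in names
        ]
        for a in names
    ]
    return names, matrix


def pack_symmetric(names, matrix):
    """Store only the lower triangle, tagged so other commands can recognize it."""
    lower = [matrix[i][j] for i in range(len(names)) for j in range(i + 1)]
    return {"kind": "symmetric_matrix", "names": names, "lower": lower}


def unpack_symmetric(record):
    """Rebuild the full matrix from a tagged lower-triangle record."""
    if record.get("kind") != "symmetric_matrix":
        raise ValueError("record is not a symmetric matrix")
    names = record["names"]
    size = len(names)
    matrix = [[0.0] * size for _ in range(size)]
    values = iter(record["lower"])
    for i in range(size):
        for j in range(i + 1):
            # Fill both halves from the single stored value.
            matrix[i][j] = matrix[j][i] = next(values)
    return names, matrix
```

A system whose files can hold only columns has no place for such a record, so a covariance command would be left with printing its result. A tagged record is one possible extension mechanism among several, and it still requires every existing command to decide what to do with kinds it does not recognize.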

To a degree, the more heavily a system relies on a single fixed set of data structures, the more dependent it becomes on the correctness of those data structures and file representations; this dependence is a technical drawback of the approach. So once again there is a challenge in making the right decision: the compatibility advantages must be balanced against convenience for trivial tasks and against the risk of having to use a mixed strategy or make a major redesign if the data representations prove inadequate to future developments. There are some alternative methods for data conversion, such as globally changing all files that would not exist in an integrated environment. However, a conversion of broad scope carries risks that would threaten the integrity of multiple dependent programs operating off a common data base or data representation, as well as that of a more structured system.