Research Methods in Nutritional Anthropology (United Nations University, 1989, 201 p.) 
6. Elementary mathematical models and statistical methods for nutritional anthropology 

Many food-related activities can be conceptualized as dynamic processes that occur over time. Jerome (1975), for example, has called attention to the importance of food-consumption cycles and rhythms and has described the processes by which foods become incorporated into the dietary patterns of urban Americans. Food procurement, processing, and distribution activities should also lend themselves to processual analysis.
Mathematics is very helpful for representing and analysing processes. Deterministic mathematical models abstract the structure of a process and specify exact outcomes. Stochastic mathematical models represent processes as a set of outcomes occurring randomly according to a set of associated probabilities. Because the theory and data concerning human activities that can be seen as processes seldom allow us to state exactly what will happen, it is reasonable to assume that stochastic process models are more realistic than deterministic models. Thus, we focus exclusively on stochastic models here.
Stochastic processes comprise a vast field of mathematical statistics, and we will restrict our attention to two that in their most elementary forms are rather easy to grasp and have been widely used in anthropology and the other behavioural sciences: the finite Markov chain and the Poisson process.
Each stochastic process is derived from a particular set of assumptions. The choice of any one to model a reallife situation should, therefore, be justified, in so far as possible, by whether or not these assumptions appear reasonable in the light of the empirical problem at hand and/or substantive theory regarding it. If it can be established that a stochastic process model fits an empirical process, then the assumptions upon which the model is based can be said to characterize the empirical process. The model can then be used to extrapolate probability distributions of expected occurrences.
Markov Chains
A finite Markov chain is a stochastic process consisting of a finite set of states (outcomes), {s_{1}, s_{2}, . . ., s_{n}}, and an associated set of transition probabilities, {p_{ij}}, such that the conditional probability of outcome s_{j} of an experiment or trial, given a previous outcome s_{i}, is p_{ij}. The states are mutually exclusive, so an element in the process, such as a person or an object, can be in one, and only one, state at one time (or after one trial). All elements in the same state are assumed to have the same probability of remaining in, or changing, state after each trial (the homogeneity assumption). The transition probabilities {p_{ij}} remain constant throughout the duration of the process (the stationarity assumption). And the conditional probability of outcome s_{j} depends at most on the immediately prior outcome s_{i} (the one-step dependency assumption). Considered together, these assumptions characterize a finite Markov chain process. When they appear reasonably true of sequential phenomena, a Markov chain model is worth considering. A number of excellent works can be consulted for detailed explanations of finite Markov chains and their potential uses (Kemeny and Snell, 1960; Bartholomew, 1973). White (1974) describes many uses in anthropology. Here we will flesh out this skeletal description of Markov chains with an empirical illustration.
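The mechanics of these assumptions can be sketched in a few lines of Python. This is an illustrative addition, not part of the original text: the two states "A" and "B" and their transition probabilities are hypothetical, chosen only to show how the one-step dependency assumption drives a simulated sequence.

```python
import random

# A minimal sketch of a finite Markov chain with two hypothetical states
# ("A" and "B") and illustrative transition probabilities -- these numbers
# are not drawn from the study; they only demonstrate the mechanics.
states = ["A", "B"]
P = {"A": [0.7, 0.3],   # P(A -> A), P(A -> B)
     "B": [0.4, 0.6]}   # P(B -> A), P(B -> B)

def simulate(start, n_trials, seed=0):
    """Generate a state sequence in which each outcome depends only on
    the immediately prior state (the one-step dependency assumption)."""
    rng = random.Random(seed)
    chain = [start]
    for _ in range(n_trials):
        chain.append(rng.choices(states, weights=P[chain[-1]])[0])
    return chain

seq = simulate("A", 20)
```

Because the transition probabilities are fixed throughout, the sketch also embodies the stationarity assumption; relaxing it would require making P a function of the trial number.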
Example 8
In work on the food-consumption patterns and preferences of middle-income Americans, we collected daily records over a six-week period of the major type of meat consumed at evening meals by three female and three male adults. Our objectives are to: (a) estimate the relative frequencies or probabilities of meats used over various time periods; (b) extrapolate the relative proportions of meats used over the long run (the indefinite future); and (c) extrapolate meat-use cycles (the expected time before the same meat type is used again, and the time from the use of one type to the use of another). Each person made a daily record of the one of the following meat types he/she ate the most: (a) beef, (b) pork, (c) poultry, (d) seafood, and (e) other (e.g. variety meats such as "cold cuts", or no meat consumption at all).
A discrete-state finite Markov chain was selected to model meat-use sequences because the number of states is finite, given by the five meat types, and the states are mutually exclusive: a person can be in one, and only one, of the states at a given time. This is true by definition (i.e. a person can eat only one meat type most often). Other reasons for selecting this kind of Markov chain are that: (a) observations at more than one time interval (over more than one daily trial) are available; (b) it is assumed that persons in the same state (i.e. eating the same meat) have the same probability of remaining in or changing state; (c) it is assumed that the meat type a person has at time t + 1 depends at most on the type he/she had at time t; and finally (d) it is assumed that these transition probabilities will remain constant for the duration of the process considered. We also assume, for purposes of illustration, that the number of observations is sufficient to estimate the transition probabilities accurately.
While it can certainly be argued that these assumptions are tenuous, we shall deem them sufficiently reasonable to merit exploration within the context of this example. We will evaluate the goodness of fit of the Markov chain model by constructing it from the data for the first three-week period and then comparing its extrapolations with the actual observations over the second three-week period.
Table 5 presents a matrix, F, of 126 transition frequencies of meat types for the six respondents over the first 21-day period. Each state is labelled along the rows and columns and refers to a meat type. Each f_{ij} element in the matrix denotes the frequency with which one state is followed by another (including, on the main diagonal, by the same state). Thus, the 20 in the first row and first column refers to the number of times beef was followed by beef on the succeeding day. The 14 in row 1, column 2 refers to the number of times beef was followed by pork on the succeeding day, and so on. The other rows are interpreted similarly. Since each person's sequence begins with the meat type used on the day the observations began, we classified that first meat type as following the meat type it most often followed during the first three-week period.
Table 5. Meat-use transition frequency matrix
Beef  Pork  Poultry  Seafood  Other  
Beef  20  14  6  0  8 
Pork  16  11  6  1  2 
Poultry  5  4  3  2  2 
Seafood  2  1  0  0  0 
Other  4  6  3  0  10 
Dividing each f_{ij} element by its row sum produces the matrix of transition probabilities, P, in table 6 (i.e. a matrix of probabilities estimated from relative frequencies of occurrence). Each p_{ij} element therefore denotes the probability of change (or stability) in meat type from one day to the next. Thus, the probability of beef being followed by beef is p_{11} = 20/48 = .417, and p_{12} = 14/48 = .291 is the probability that beef will be followed by pork. All rows are interpreted similarly. Notice that all p_{ij}'s are non-negative and each row sums to unity. Thus, P provides the probabilities of remaining in or changing state over a one-day interval.
Table 6. Meat-use transition probability matrix
Beef  Pork  Poultry  Seafood  Other  
Beef  .417  .291  .125  .000  .167 
Pork  .444  .305  .167  .028  .056 
Poultry  .313  .250  .187  .125  .125 
Seafood  .667  .333  .000  .000  .000 
Other  .174  .261  .130  .000  .435 
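The row-normalization that produces table 6 from table 5 can be sketched in Python. This is an illustrative addition, not part of the original analysis; the frequencies are those of table 5.

```python
# Transition frequency matrix F from table 5; rows and columns are ordered
# beef, pork, poultry, seafood, other (row = previous day's meat type).
F = [[20, 14, 6, 0, 8],
     [16, 11, 6, 1, 2],
     [ 5,  4, 3, 2, 2],
     [ 2,  1, 0, 0, 0],
     [ 4,  6, 3, 0, 10]]

# Dividing each element by its row sum gives the transition probability
# matrix P of table 6; every row of P sums to unity.
P = [[f / sum(row) for f in row] for row in F]
```

For example, `P[0][0]` is 20/48 (about .417) and `P[0][1]` is 14/48 (about .291), matching the first row of table 6.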
With the data transformed into these arrays, we can use some matrix algebra and theorems of Markov chains to ask some pertinent questions of the model. First, what is the expected distribution of meat use after n days, given an initial distribution on some prior day? For example, if two people had beef, three had pork, and one had "other" on one day, what is the expected distribution of meat use two days into the future? To answer this question we use a fundamental equation of a Markov chain:
(72) p^{t + n} = p^{t}P^{n}
where p^{t} is a row vector whose ordered components denote the initial probability distribution across the states; P is the matrix of transition probabilities and P^{n} its nth power, n being the number of trials (or days); and p^{t + n} is the resultant vector whose ordered components denote the probability distribution across the states after t + n trials (or days). Thus, to extrapolate the meat-use distribution two days ahead, we premultiply P², the transition probability matrix squared, by p^{t}:
(73) p^{t + 2} = p^{t}P²
Two matrix operations are involved. The first is powering a matrix (successively multiplying a matrix by itself), which can be defined element by element as
(P²)_{ij} = Σ_{k} p_{ik}p_{kj}
where the p_{ij} are the matrix elements of P. The second operation is premultiplying the product matrix by the row vector p^{t}, which can be defined as
(p^{t}P²)_{j} = Σ_{i} p^{t}_{i}(P²)_{ij}
In our example, P² is obtained by multiplying the matrix in table 6 by itself in this way.
The initial probability distribution p^{t} can be found by dividing the frequency of each element by the sum. Thus 2 beef, 3 pork, 0 poultry, 0 seafood, 1 other is transformed into
p^{t} = (2/6, 3/6, 0, 0, 1/6)
or
p^{t} = (.333, .50, .00, .00, .167)
The vector-matrix product, p^{t + 2}, which specifies the probability distribution expected after two days, is given by the resultant vector
p^{t + 2} = (.36, .29, .15, .03, .17)
Thus, in two days it is expected that the proportions of the sample having each meat will be .36 beef, .29 pork, .15 poultry, .03 seafood, and .17 other. Expected frequencies can be found by multiplying each proportion by the sample size and rounding to the nearest whole number. We want to emphasize that a forecast can be made n days into the future simply by raising P to the appropriate nth power and then premultiplying it by p^{t}. We also mention, but will not discuss, the possibility of reversing the procedure and retrodicting prior distributions. In either case, if observed data are available, they can be compared with the expected distributions to evaluate the accuracy of the model. We will illustrate this next.
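The two-step forecast of equation (73) can be reproduced with a short Python sketch. This is an illustrative addition, not part of the original text, and because it starts from the rounded probabilities of table 6, small differences from the figures quoted above are to be expected.

```python
# Transition probabilities from table 6 (beef, pork, poultry, seafood, other).
P = [[.417, .291, .125, .000, .167],
     [.444, .305, .167, .028, .056],
     [.313, .250, .187, .125, .125],
     [.667, .333, .000, .000, .000],
     [.174, .261, .130, .000, .435]]

# Initial distribution: 2 beef, 3 pork, 1 other among 6 people.
p_t = [2/6, 3/6, 0, 0, 1/6]

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

P2 = matmul(P, P)                                   # P squared
# Premultiply P squared by the initial row vector (equation 73).
p_t2 = [sum(p * c for p, c in zip(p_t, col)) for col in zip(*P2)]
```

The components of `p_t2` sum to one and give the expected meat-use distribution two days ahead.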
Recall that the second objective of our study was to extrapolate the ultimate, long-term distribution of meat use. Markov chains allow us to do this directly; how it is done depends on the type of Markov chain. If some power of P has all positive, non-zero elements (as is true here of P²), then the chain is regular and some power of P will have identical rows. At this power, further exponentiation of P will not change the values of the elements. Any row of this "fixed-state" matrix, P^{e}, specifies the equilibrium vector, p^{e}, which is the ultimate probability distribution. The equilibrium vector can be found more easily by solving the matrix equation
(74) p^{e} = p^{e} P
where p^{e} is the ultimate equilibrium vector and P is the original transition probability matrix. For a two-state chain, writing p^{e} = (x_{1}, x_{2}), equation (74) becomes the matrix product (x_{1}, x_{2})P = (x_{1}, x_{2}), from which the following system of equations is derived:
(75a) p_{11}x_{1} + p_{21}x_{2} = x_{1}
(75b) p_{12}x_{1} + p_{22}x_{2} = x_{2}
Now, recalling that x_{1} and x_{2} must sum to 1, we add
(75c) x_{1} + x_{2} = 1
With three equations and two unknowns, we drop either (75a) or (75b) (the number of equations must equal the number of unknowns) and solve the remaining system simultaneously for x_{1} and x_{2}. In our example the ultimate equilibrium vector, p^{e}, will contain five components, so we premultiply P by a five-component row vector of unknowns (x_{1}, x_{2}, x_{3}, x_{4}, x_{5}). A system of five equations is produced, plus the equation
x_{1} +x_{2} + x_{3} + x_{4} + x_{5} = 1
or
(76)
.417x_{1} + .444x_{2} + .313x_{3} + .667x_{4} + .174x_{5} = x_{1}
.291x_{1} + .305x_{2} + .250x_{3} + .333x_{4} + .261x_{5} = x_{2}
.125x_{1} + .167x_{2} + .187x_{3} + .000x_{4} + .130x_{5} = x_{3}
.000x_{1} + .028x_{2} + .125x_{3} + .000x_{4} + .000x_{5} = x_{4}
.167x_{1} + .056x_{2} + .125x_{3} + .000x_{4} + .435x_{5} = x_{5}
x_{1} + x_{2} + x_{3} + x_{4} + x_{5} = 1
which when solved simultaneously gives the equilibrium vector
(77) p^{e} = (.375, .286, .143, .027, .169)
This vector specifies the long-term expected proportions of meat use. Using the actual proportions of meat use over the last 21 days, we can compare this expected distribution with the one observed and evaluate the model's goodness of fit (table 7). The observed meat-use distribution over the last 21 days is in 97 per cent agreement with the distribution predicted by the model.
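The equilibrium vector can also be approximated numerically rather than by solving system (76) algebraically: for a regular chain, repeatedly premultiplying P by any starting distribution converges to p^{e}. The following Python sketch is an illustrative addition, not part of the original analysis.

```python
# Transition probabilities from table 6.
P = [[.417, .291, .125, .000, .167],
     [.444, .305, .167, .028, .056],
     [.313, .250, .187, .125, .125],
     [.667, .333, .000, .000, .000],
     [.174, .261, .130, .000, .435]]

p = [0.2] * 5   # any starting distribution works for a regular chain
for _ in range(200):
    # premultiply: new p_j = sum over i of p_i * P[i][j]
    p = [sum(pi * row[j] for pi, row in zip(p, P)) for j in range(5)]
# p now approximates the equilibrium vector of equation (77)
```

After a couple of hundred iterations the vector agrees with (.375, .286, .143, .027, .169) to within the rounding of the input probabilities.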
A χ² test of goodness of fit, with 5 - 1 = 4 degrees of freedom, where
(78) χ² = Σ_{i=1}^{S} (O_{i} - E_{i})²/E_{i}
and where O_{i} = observed frequencies, E_{i} = expected frequencies, and S = the number of states (meats), has a probability p > .90. This test shows that the expected and observed distributions do not differ significantly and that any difference between them is largely attributable to chance. Overall, the fit is excellent and suggests that the meat-use sequence for this sample can be modelled quite accurately with a Markov chain.
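The χ² computation of equation (78), applied to the frequencies given in table 7, can be reproduced directly. This Python sketch is an illustrative addition, not part of the original text.

```python
# Observed and expected meat-use frequencies over the last 21 days (table 7),
# ordered beef, pork, poultry, seafood, other.
observed = [48, 33, 18, 5, 22]
expected = [47.25, 36.04, 18.02, 3.40, 21.29]

# Equation (78): sum of (O - E)^2 / E across the five states.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
# chi_sq comes to roughly 1.05, below 1.064, the chi-square value whose
# upper-tail probability is .90 with 4 degrees of freedom -- hence p > .90
```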
Finally, let us turn to our third objective and see how a Markov chain can be used to study the meat-use cycle. Here we want to know the expected (mean) number of days before meat s_{j} is used, given that meat s_{i} was used last. For instance, if beef was eaten today, how many days, on average, will it be before beef is eaten again, pork is eaten, poultry is eaten, and so on? With a regular Markov chain these values are obtained from a matrix of mean first passage times, M. Since the mathematical computations are lengthy and involved, the reader is referred to Kemeny and Snell (1960); we proceed directly to the results in table 8.
Table 7. Comparison of expected and actual meat-use distributions
Proportion expected  Proportion observed  Expected frequency  Observed frequency  
Beef  .375  .380  47.25  48 
Pork  .286  .262  36.04  33 
Poultry  .143  .143  18.02  18 
Seafood  .027  .04  3.40  5 
Other  .169  .175  21.29  22 
Table 8. Matrix of mean first passage times
Beef  Pork  Poultry  Seafood  Other  
Beef  2.67  3.55  24.75  35.33  8.11 
Pork  2.48  3.50  23.79  33.78  9.12 
Poultry  2.96  3.92  6.99  21.52  9.75 
Seafood  1.81  3.33  58.93  37.04  9.29 
Other  3.57  3.70  24.61  35.33  5.92 
Each m_{ij} element denotes the number of days it will be (on average) before a particular type of meat is used, given the meat type that was used last (table 8). Thus, m_{11} = 2.67 is the expected number of days before beef is used again; m_{12} = 3.55 is the expected number of days before pork is used if beef was used last; and so on. All rows of M are interpreted similarly. It might be noted in closing that, although this analysis has been performed for the group as a whole (n = 6), a Markov-chain analysis of each individual could also be performed and the results compared. This procedure would also enable us to examine the assumptions of the model in much more detail.
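A quick consistency check on table 8 can be made without the full first-passage computation: for a regular chain, each diagonal entry m_{ii} (the mean recurrence time of state i) equals the reciprocal of the corresponding component of the equilibrium vector. The Python sketch below is an illustrative addition, not part of the original text.

```python
# Equilibrium vector from equation (77): beef, pork, poultry, seafood, other.
p_e = [.375, .286, .143, .027, .169]

# Mean recurrence times 1/p_e reproduce the diagonal of M in table 8.
recurrence = [1 / x for x in p_e]
# -> approximately [2.67, 3.50, 6.99, 37.04, 5.92]
```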
In sum, Markov chains, as one type of stochastic process model, appear useful for linking past, present, and future events in an explicit way. While many of the assumptions are rather stringent and, for long-range forecasting, re-estimates of the model's parameters are usually necessary, over the short run Markov models tend to be quite robust, even when all the assumptions are not completely satisfied. We hope that this pilot study will stimulate further exploration of the many potential uses of Markov chains in nutritional anthropology.
Poisson Process
We now provide a brief example of the application of another stochastic process model, the common Poisson process (Feller, 1957). The Poisson is normally used to represent stochastic processes operating continuously over some unit of measurement, such as time or space, and to generate the expected number of occurrences of events therein. The major assumptions governing a Poisson process are: (a) there is a positive constant λ (lambda), the average rate of occurrence, which remains the same for all units; (b) occurrences are independent (i.e. the occurrence of one event does not condition the probability of another); (c) the probability of one occurrence in a single unit is proportional to the size of the unit; and (d) the probability of two or more occurrences in a very small unit is negligible. The Poisson probability function, p, is derived from these assumptions:
(79) p(x) = λ^{x}e^{-λ}/x!
where x is the number of occurrences in a given unit of measurement, λ is the mean of the distribution, and e = 2.71828 . . . is the base of the natural logarithm. The expected number of occurrences thus depends on λ, the average number of occurrences per unit of measurement.
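Equation (79) is straightforward to evaluate. The Python sketch below is an illustrative addition, not part of the original text; it computes the first few probabilities for a small rate such as the λ = .449 that appears later in this example.

```python
import math

def poisson_pmf(x, lam):
    """Equation (79): probability of x occurrences when the mean is lam."""
    return lam ** x * math.exp(-lam) / math.factorial(x)

# For a small rate the probabilities fall off rapidly with x.
probs = [poisson_pmf(x, 0.449) for x in range(4)]
```

The four probabilities for x = 0, 1, 2, 3 already account for nearly all of the distribution's mass.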
Example 9
Let us now use a Poisson process to represent an empirical problem, and in so doing illustrate the required computations.
It is widely known, and a matter of international concern, that animal-source proteins are frequently in short supply in the diets of many tropical populations. This is true of a rural parish we studied in the Buganda region of Uganda. The major portion of the diet consists of non-fat carbohydrates: plantains, sweet potatoes, cassava, and yams. Animal proteins (dairy products, eggs, meat, fish, and poultry), though available locally, are expensive to purchase and produce. A 24-hour recall of foods consumed was collected in a social survey of a random sample of 107 household heads (HHs) conducted over a six-week period in 1967. The reported frequency and percentage distribution of animal-source protein food use is shown in table 9.
Table 9
Frequency  Percentage  
Beef  11  10.28 
Fish  11  10.28 
Eggs  2  1.87 
Milk  2  1.87 
Tea/coffee with milk  19  17.76 
Poultry  2  1.87 
Termites  1  .93 
None  70  65.42 
With no other information available, we assumed that: (a) the use of animal-protein food was a random occurrence; (b) the animal-protein use-proneness of each HH was the same; (c) the use of an animal protein by one HH was independent of its use by another HH; (d) the use of one animal-protein food did not condition the probability of the use of another; and (e) the probability of an HH using two or more animal-protein foods was small. For these reasons we thought a Poisson process would accurately generate the probabilities of animal-source protein-use occurrences. The actual use distribution is shown in table 10.
Table 10
No. of protein foods  Frequency  Proportion  
0  70  .654 
1  28  .262 
2  7  .065 
3  2  .019 
Using the probability function (79) to calculate the expected numbers of occurrences yields the results in table 11.
Table 11.
Expected number  Observed number  Expected proportion  Observed proportion  
0  68.27  70  .638  .654 
1  30.71  28  .287  .262 
2  6.85  7  .064  .065 
3  1.07  2  .01  .019 
Comparing these expected occurrences with the actual occurrences reveals close agreement (97.5 per cent) (table 11).
A chi-square test of goodness of fit, with a value of 1.09 and 2 degrees of freedom (df = number of classes minus 1, minus 1 for each parameter estimated), shows no significant difference between the two distributions (p > .50).
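Table 11 and the chi-square value can be recomputed from the raw counts in table 10. The Python sketch below is an illustrative addition, not part of the original text; small differences from the published figures reflect rounding in the original tables.

```python
import math

counts = {0: 70, 1: 28, 2: 7, 3: 2}   # table 10
n = sum(counts.values())               # 107 households
lam = sum(x * f for x, f in counts.items()) / n   # mean rate, about .449

# Expected frequencies from equation (79), scaled by the sample size.
expected = [n * lam ** x * math.exp(-lam) / math.factorial(x)
            for x in range(4)]

# Chi-square over the four classes (df = 4 - 1 - 1 = 2, since lam is
# estimated from the data).
chi_sq = sum((counts[x] - expected[x]) ** 2 / expected[x] for x in range(4))
```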
We conclude, therefore, that the Poisson distribution provides a close approximation to the distribution of actual occurrences of animal-protein use in this sample population. Further, this indicates that the assumptions of the Poisson model characterize the animal-source protein-use process in this region.
Before closing, it might be informative to consider some reasons why a discrepancy between the Poisson distribution of expected values and the actual distribution of observed values might have occurred. First, the Poisson assumes that λ = .449, the protein-use proneness value, is identical for all HHs. If the sample population were heterogeneous in this respect, a discrepancy could occur. Second, the Poisson requires that the use of one protein food not affect the use of another. If several protein-source foods were consumed together, or if eating one protein food caused others not to be eaten, this could produce a discrepancy. Third, the Poisson requires independence of protein-use occurrences among HHs. If protein-food exchanges occurred among several HHs, producing simultaneous use (or non-use), then this, too, could create a discrepancy.
In sum, departures from a Poisson process often occur when the sample population is heterogeneous in occurrence proneness, when reinforcement causes one occurrence to condition the probability of another, or when contagion reduces the independence of cases. Since the differences between the expected and actual distributions were small, these conditions are probably not present to a significant extent in this example. Had they been, or if it were assumed that these conditions do in fact characterize the nature of some process, then other stochastic process models, with assumptions predicated on these conditions, would have to be explored.