Early Supplementary Feeding and Cognition (Society for Research in Child Development, 1993, 123 pages)

V. Methods of the cross-sectional follow-up

This chapter describes the sample of subjects included in the follow-up study and the methods used to assess three sets of variables: the social and economic background of the families, the schooling histories of the subjects within the formal educational system of the villages, and the subjects' performance on a battery of psychoeducational and information-processing tests in adolescence. 5

5 Caroline Heckathorn was responsible for the analysis of the data on reliability and validity. We gratefully acknowledge her contributions to this chapter.

Subjects

As a result of a 1987 census, 1,704 subjects were identified as composing the potential sample for follow-up assessment in the behavioral area. Owing to the large number of outcome variables of interest and the amount of time needed to test each subject, the cohorts born prior to 1965 were excluded from this aspect of the follow-up study. The intent was to maximize the amount of information to be collected without loss of the theoretically most important cohorts; since cohorts born between 1962 and 1965 had received supplementation at a noncritical developmental period (age 4-7), these were considered to be of least theoretical interest.

Of the 1,704 subjects, 1,545 were residing in the villages, and, of those, 93% completed the battery of psychoeducational tests that were selected for the follow-up assessment. With the inclusion of individuals who had migrated out of the villages but who could be contacted for testing either in Guatemala City or in surrounding villages, coverage of all potential subjects decreased to approximately 83%. Additional data cleaning resulted in the elimination of approximately 30 subjects from the psychology battery. A breakdown of the final number of subjects available for the follow-up psychoeducational tests is presented in Figure 3.


FIG. 3. - Breakdown of the follow-up sample (sample sizes available for a given analysis vary as a function of outcome variable and covariate). Non = nonmigratory, residing in the village. Mig = migratory, not residing in the village.

Comparisons of participants and nonparticipants allow us to make some inferences regarding the representativeness of the follow-up sample. Participants in the follow-up had higher mean birth weights, were less frequently ill with diarrhea from birth to 3 years of life, and had higher average energy intakes from the supplement during the first 3 years of life than nonparticipants (Rivera & Castro, 1990). However, these differences existed in both the Atole and the Fresco villages, and the rate of participation among village residents was similar in both types of sites (94% and 93% for females and 86% and 84% for males in the Atole and Fresco villages, respectively).

In addition, the percentage of migrants - defined as subjects not living in the villages at the time of the follow-up - from the entire sample was also similar in the Atole (32.7%, N = 377) and the Fresco (34.4%, N = 350) sites (Rivera & Castro, 1990), as was the proportion of migrants who participated in the follow-up (41.4% in the Fresco and 40.0% in the Atole villages).

Comparisons of the results as a function of migratory status, and with migrants removed from analyses, yielded two important pieces of information. First, although the migrants who enlisted in the follow-up performed significantly better than the nonmigrants on all psychoeducational tests, their performance was similar across Atole and Fresco subgroups. The only exception occurred on the Raven's Progressive Matrices test: migrants who were natives of the Fresco villages performed significantly better on this test than those from the Atole villages. Second, when migrants were removed from the main analyses, the results remained virtually unchanged. Thus, although the follow-up sample may have been slightly better off than the entire longitudinal sample and migrants may be better off than nonmigrants, none of the observed sampling differences should modify any observed treatment effects.

Ideally, the entire sample would be broken down by cohorts defined by periods of exposure (e.g., prenatal only, postnatal only, first 2 years of life) so as to assess whether developmental period and duration of exposure modified the effects of the nutrient supplement; however, restrictions of statistical power precluded such an approach. 6 Accordingly, our analyses focus on the entire sample and on a cohort of maximum exposure, that is, subjects who were exposed to the treatment during gestation and for at least the first 2 years of postnatal life. Sample sizes in these two cohorts are large, with power to detect even small effects (power = .98 for an effect size greater than .20). In addition, the final analyses included a "late" cohort whose age at first exposure was 24 months or older (power = .74 for an effect size of .25).

6 For subjects with prenatal exposure only, sample size was 77 (52 for tests requiring literacy). Estimated power for an effect size of .25 was .58 (.42 for literate group). For subjects with postnatal exposure only, sample size was 30. Estimated power for an effect size of .25 was .26.
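As a rough check on the power figures in footnote 6, the sketch below applies the Fisher z approximation for a two-sided test of a correlation. It rests on assumptions the source does not confirm: that the effect size of .25 is a correlation coefficient, that alpha was .05, and that this approximation resembles the procedure actually used. The function names are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def power_for_correlation(r, n, z_crit=1.959964):
    """Approximate two-sided power to detect a correlation r with n subjects.

    Uses the Fisher z transformation. z_crit is the two-sided critical
    value for alpha = .05, which is an assumed alpha level.
    """
    shift = math.atanh(r) * math.sqrt(n - 3)
    # Probability of exceeding the critical value in either tail.
    return normal_cdf(shift - z_crit) + normal_cdf(-shift - z_crit)


# Sample sizes taken from footnote 6: prenatal only (77), prenatal only
# for tests requiring literacy (52), and postnatal only (30).
for n in (77, 52, 30):
    print(n, round(power_for_correlation(0.25, n), 2))
```

Under these assumptions the approximation gives values close to those in the footnote, which suggests, without establishing, that the reported power figures refer to a correlation-type effect size.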

The composition of the follow-up sample according to age of exposure and its duration is presented diagrammatically in Figure 4. The cohort of maximum exposure includes all subjects who received nutritional treatment during a period of accelerated brain growth, which represents a sensitive, if not the most sensitive, period when considering the effects of a nutritional intervention. These subjects were born between 1970 and 1974 and ranged in age from 13 to 19 years at the time of the follow-up study. They represent the most suitable sample on which to test the effects of the treatment. By contrast, the cohort of late exposure, born during or prior to 1967, helps determine whether exposure during the period of accelerated brain growth is a necessary condition for treatment effects.

Socioeconomic indicators

Census data from each of the four villages were collected prior to the initiation of the study (1967), in the midst of the supplementation period (1974), and prior to the follow-up (1987). From these data, three indicators of socioeconomic status (SES) that reflect family wealth and potential for child stimulation were constructed: house quality, mothers' education, and fathers' occupation. Father's education was not included because the informants (mothers) often lacked knowledge about their husbands' schooling. In the final analyses, these indicators were standardized and summed to yield a single composite socioeconomic index.


FIG. 4. - Breakdown of the follow-up sample by age and duration of exposure

For each census, all women over the age of 15 who had ever been married and all mothers in each village were interviewed, and information was obtained on each member of the nuclear family living in the household. The census forms for 1974 and 1987 were identical; the one used in 1967 was somewhat different but yielded comparable information. The informants also provided information about family structure, marital status, religion, number of pregnancies, number of live children, and relationship of household head to head of the extended family. Observations of the quality of the house were also made (e.g., type of walls, floor, and roof). For each family member, the parity, relationship to the rest of the family, birth date, education, occupation (if over age 10), and date of change of status (death or migration) were coded. In addition, in 1987, when many of the subjects had started their own families, information about their families of origin was also recorded. Table 9 shows the number of families per village that were included in each round of the census as well as the total number of families for whom data were obtained at at least one of the three assessments.

TABLE 9: NUMBER OF FAMILIES WITH SES DATA BY YEAR OF CENSUS AND VILLAGE


Village      1967    1974    1987    Total a
Fresco 03     174     225     369      471
Atole 06      178     236     368      464
Fresco 08     132     187     254      325
Atole 14      110     136     237      281

NOTE - Villages are identified by census codes.

a Number of families for whom data were obtained at any assessment date; new families formed in 1987 are included.

Construction of SES Measures

House Quality

In developing countries, it is often difficult to obtain accurate information on family income, a variable that traditionally has served as a proxy for a number of social-environmental variables that affect cognitive growth and educational development. This is particularly the case in rural agricultural communities and explains why indicators of house quality have often been used as proxies for income in such communities (e.g., Johnston, Low, de Baessa, & MacVean, 1987).

Nine variables describing house quality were assessed at each of the three census periods: an overall rating of the type of house (on a scale of 1-4); ownership of house (no = 0, yes = 1); number of rooms; type of floor (1-5); type of walls (1-7); type of roof (1-4); location of the kitchen (1-3); type of toilet (1-4); and number of possessions (1-6). In all instances, higher scale scores reflected the higher quality of the dwelling. Data were also available for type of water disposal, source of water, and presence of electricity. However, these variables were not used in the assessment of house quality because they tended to be village specific and contributed little to within-village variation.

To generate an index of within-village variation in house quality and of socially meaningful between-village differences, a factor analysis was performed to generate factor loadings for each assessment year. The nine variables were standardized within each village to allow for comparability across villages and then factor analyzed at each year using a principal components analysis. The results are shown in Table 10. Since the second principal component did not account for more than 13% of the variance at any year, it was dropped from further analyses; loadings on the first factor were similar at each time period.

In order to recapture between-village differences that were removed through standardization, final house quality scores were constructed by multiplying the original individual raw scores by the factor loadings. An alternative approach, used in Ruel (1991), eliminated variables that varied systematically by village and derived the factor scores without multiplying the raw scores by the factor loadings. These scores, which were calculated for the 1974 and 1988 data, correlated at r =.91 and .88, respectively, with the scores that we used for analyses reported here.
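The weighting step described above can be sketched as follows. The loadings are the 1987 values reported in Table 10; the household record, the variable names, and the function name are hypothetical, and the source does not say whether any further scaling was applied to the weighted sum.

```python
# First-component loadings for 1987, as reported in Table 10.
LOADINGS_1987 = {
    "possessions": 0.60, "house": 0.88, "ownership": -0.04,
    "rooms": 0.62, "floor": 0.82, "walls": 0.82,
    "roof": 0.56, "toilet": 0.55, "kitchen": 0.29,
}


def house_quality_score(raw, loadings):
    """Sum of raw item scores, each multiplied by its factor loading."""
    return sum(loadings[name] * raw[name] for name in loadings)


# Hypothetical household (illustrative values only, not study data).
household = {
    "possessions": 3, "house": 2, "ownership": 1, "rooms": 2, "floor": 2,
    "walls": 4, "roof": 3, "toilet": 1, "kitchen": 2,
}

print(round(house_quality_score(household, LOADINGS_1987), 2))
```

Because the weights are applied to raw rather than within-village standardized scores, households in villages with generally better housing receive higher scores, which is how the between-village differences are recaptured.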

TABLE 10: FACTOR LOADINGS ON THE FIRST PRINCIPAL COMPONENTS FACTOR OF THE NINE INDICES OF HOUSE QUALITY (Using within-Village Standardized Scores)

Variable       1967    1974    1987
Possessions     .50     .53     .60
House           .84     .88     .88
Ownership      -.03    -.01    -.04
Rooms           .77     .74     .62
Floor           .51     .65     .82
Walls           .80     .83     .82
Roof            .52     .54     .56
Toilet          .32     .46     .55
Kitchen         .71     .64     .29
Eigenvalue     3.32    3.66    3.57
% variance      .37     .41     .40
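The rows of Table 10 can be cross-checked against each other. For a principal component of standardized variables, the eigenvalue equals the sum of the squared loadings, and the proportion of variance is the eigenvalue divided by the number of variables (nine here). The sketch below recomputes both from the published loadings; small discrepancies are expected because the loadings are rounded to two decimals. The names used are hypothetical.

```python
# Loadings from Table 10, in row order: possessions, house, ownership,
# rooms, floor, walls, roof, toilet, kitchen.
LOADINGS = {
    1967: [0.50, 0.84, -0.03, 0.77, 0.51, 0.80, 0.52, 0.32, 0.71],
    1974: [0.53, 0.88, -0.01, 0.74, 0.65, 0.83, 0.54, 0.46, 0.64],
    1987: [0.60, 0.88, -0.04, 0.62, 0.82, 0.82, 0.56, 0.55, 0.29],
}
# Eigenvalues as reported in Table 10.
REPORTED_EIGENVALUE = {1967: 3.32, 1974: 3.66, 1987: 3.57}


def eigenvalue_from_loadings(loadings):
    """Sum of squared loadings on one component."""
    return sum(x * x for x in loadings)


for year, loadings in LOADINGS.items():
    eigenvalue = eigenvalue_from_loadings(loadings)
    share = eigenvalue / len(loadings)  # proportion of total variance
    print(year, round(eigenvalue, 2), round(share, 2), REPORTED_EIGENVALUE[year])
```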

Parents' Education

Parents' education has been consistently shown to be positively related to the cognitive development of the offspring (Sigman, Neumann, Jansen, & Bwibo, 1989). The mechanisms underlying this close-to-universal association are numerous, and they probably vary across cultures. In a variety of studies, parents' education has been positively related to frequency of educational opportunities available to the children, verbal stimulation provided to the offspring, and parents' aspirations for their children (Levine et al., 1991; Sigman et al., 1988; Sigman et al., 1989).

Informants reported parents' literacy (coded 0 = none, 1 = some) at all three census periods and parents' years of schooling in 1974 and 1987. The mean number of years of schooling in 1974 and 1987 was, respectively, .98 and 2.1 for mothers and 1.3 and 2.5 for fathers.

Parents' Occupation

Occupational status is a "carrier" variable that may be associated with income status in the community, availability of resources, and family socialization practices. The indirect effects of parents' occupation on the cognitive development of children are thought to occur through the earning capacity of the parents and the consequent resources for stimulation that that earning capacity permits. Both mother's occupation and father's occupation were assessed; however, since only about 20% of women at the follow-up reported having an occupation, mother's occupation was excluded from further analyses.

TABLE 11: CORRELATION OF SES MEASURES OVER THE THREE CENSUS PERIODS


                 1967-1974      1967-1987      1974-1987
House factor     .69** (466)    .37** (360)    .50** (533)
Mother:
  Literacy       .72** (429)    .63** (326)    .77** (491)
  Education a                                  .80** (488)
Father:
  Literacy       .82** (380)    .82** (282)    .77** (433)
  Education a                                  .73** (397)
  Occupation     .36** (371)    .27** (282)    .37** (427)

NOTE - Pearson product-moment correlation coefficients were computed for house quality and education and Spearman rank-order coefficients for literacy and occupation. Sample size is given in parentheses.

a Data are not available for 1967.

**p <.01.

There were six occupational categories listed in the 1967 census and 19 in the 1974 and 1987 censuses. For purposes of comparisons across years, the original 0-19 scales used in 1974 and 1987 were collapsed to be similar to the 1967 scale; these original and recoded scales are highly correlated (r =.88). In preliminary analyses, the recoded scale demonstrated adequate linear properties and was used in all subsequent statistical calculations as an ordinal variable.

Reliability of Measures

As shown in Table 11, most indicators remained stable over time. As expected, the highest correlations are obtained for variables that have little intraindividual variability and little expected change, such as years of school attainment, while the smallest values occur for variables that might be more likely to change over time, such as the characteristics of the family's home.

In subsequent, final analyses, indicators obtained in 1987 were used. An SES composite was created by summing three standardized variables: house factor score, father's occupation, and mother's schooling. As noted earlier, mother's occupation was dropped owing to its low occurrence, and father's education (as reported by the mother) was also excluded since the mothers often could not provide accurate information.
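A minimal sketch of such a composite follows, assuming each indicator is converted to a z score across families and the three z scores are added with equal weight. The source does not specify whether population or sample standard deviations were used or how missing values were handled; the data and function names below are hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Convert raw values to z scores (population SD, an assumed choice)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def ses_composite(house_factor, father_occupation, mother_schooling):
    """Sum of the three standardized indicators, one value per family."""
    columns = (house_factor, father_occupation, mother_schooling)
    z_columns = [standardize(col) for col in columns]
    return [sum(parts) for parts in zip(*z_columns)]


# Hypothetical 1987 indicators for four families (not study data).
house_factor = [10.2, 12.5, 8.1, 14.0]
father_occupation = [1, 3, 2, 4]
mother_schooling = [0, 2, 1, 5]

composite = ses_composite(house_factor, father_occupation, mother_schooling)
print([round(c, 2) for c in composite])
```

Standardizing before summing keeps any one indicator from dominating the composite merely because it is measured on a wider scale.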

Schooling variables

Five variables were generated from the individual schooling data: age at which the child entered school; number of times the child passed, failed, and withdrew; and the highest grade that the child reached. The data are shown in Table 12. On average, children began school more than a year after the expected age (i.e., 7 years for first grade), and some did so as late as age 15 years. Children who attended school attained, on average, a level somewhat lower than the fourth grade (3.7), and failure was not uncommon (see Table 12). A more complete description of the Guatemalan educational system and of the measures of school efficiency in these villages is provided elsewhere (Gorman & Pollitt, 1992).

The psychological test battery

With the intention of assessing two distinct aspects of cognition, two psychological test batteries were used in the follow-up. The psychoeducational test battery included Raven's Progressive Matrices and tests of complex intellectual aptitudes, abilities, and achievements that are heavily influenced by experience, education, and cultural upbringing. Illustrative of the latter are two standardized tests of reading and vocabulary and a knowledge test that was developed locally. The theoretical justification for the selection of these tests was the expectation that proficiency in reading and vocabulary and breadth of general knowledge will determine in part the potential that an adolescent or a young adult has to contribute to his or her community's social and economic development. Our particular concern was whether the nutritional supplement made a difference in terms of the crystallization of those mental abilities.

TABLE 12: MEANS, STANDARD DEVIATIONS, AND RANGES OF THE SCHOOL VARIABLES CODED FOR THE GUATEMALAN ADOLESCENT POPULATION

Variable                  N        Range         M       SD
Age at entry            1,083    6.00-15.00    8.36    1.37
Pass                    1,056    0-7.00        3.63    2.12
Fail                    1,056    0-6.00        1.03    1.05
Withdrawals             1,056    0-4.00         .17     .46
Highest grade reached   1,089    0-6.00        3.72    2.07

The second test battery included elementary cognitive tasks, such as simple and choice reaction time (RT), that measure a single attribute of information processing: speed. A paired associates test was also included in this battery. The between-subject variability in RT tests is generally not accounted for by schooling and cultural background, yet test performance still maintains a low-level correlation (r's ranging from -.10 to -.30) with g or a general ability factor. Theoreticians currently claim that RT is a sensitive indicator of differences in brain function (Eysenck, 1986; Jensen, 1991; Vernon, 1987). In the present study, inclusion of these tests was justified by the assumption that RT would be particularly sensitive to the effects of nutrition on central nervous system activity.

The tests included in the two batteries and their psychometric properties are described below.

Psychoeducational Tests

The battery included tests of literacy, numeracy, and general knowledge, two standardized educational achievement tests, and Raven's Progressive Matrices (RPM). The achievement tests were part of the Interamerican Series originally designed to assess reading abilities of Spanish-speaking children in Texas (Manuel, 1967).

Tests of literacy, numeracy, and general knowledge were administered individually by four trained testers. The achievement and intelligence tests were administered either individually or in a group, depending on subject availability, time, and logistical constraints. All the testers were females with certification as primary school teachers, and they came from Guatemala City or from a medium-sized town located near the villages. Testers received extensive training by both Guatemalan and U.S. psychologists during pre-testing and the pilot study.

Interrater reliability was calculated for literacy, numeracy, and general knowledge tests on the basis of four testing sessions with five raters (four testers and the psychologist) at each session. Percentage agreement varied between 86% and 100% for literacy, 97% and 100% for numeracy, and 94% and 100% for general knowledge.

Literacy

The literacy test consisted of two parts: a preliteracy measure of knowledge of letters, syllables, words, and short phrases and a reading test based on material familiar to the subjects. All subjects who reported having achieved 4 or fewer years of schooling were given the preliteracy test. Subjects who achieved between 4 and 6 years of schooling were asked to read the headline of a newspaper article aloud ("Futbol Guatemalteco bien representado en Caracas"). If mistakes were made in word recognition or pronunciation, then the preliteracy test was administered. Subjects who achieved more than 6 years of schooling were presumed to be literate.

The preliteracy test was scored on a four-point scale as follows: 1 = unable to complete prereading test, suspended; 2 = completed test with at least five errors, suspended; 3 = completed test with fewer than five errors, continued; 4 = only reading test, not preliteracy test, administered.
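The routing and scoring rules above can be expressed as a short sketch. The text assigns subjects with exactly 4 years of schooling to both the first and the second group; the code assumes that the first rule takes precedence. Function and parameter names are hypothetical.

```python
def needs_preliteracy_test(years_schooling, headline_errors=False):
    """Decide whether the preliteracy test is administered.

    headline_errors: True if the subject made word recognition or
    pronunciation mistakes while reading the newspaper headline.
    """
    if years_schooling <= 4:   # assumed precedence for exactly 4 years
        return True
    if years_schooling <= 6:   # headline screening decides
        return headline_errors
    return False               # more than 6 years: presumed literate


def preliteracy_score(took_pretest, completed=False, errors=0):
    """Return the four-point preliteracy code."""
    if not took_pretest:
        return 4               # only the reading test was administered
    if not completed:
        return 1               # unable to complete, suspended
    return 2 if errors >= 5 else 3


print(preliteracy_score(needs_preliteracy_test(3), completed=True, errors=2))
```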

The reading test consisted of 19 questions about two different sets of stimuli: a cedula (identification card) and related personal data, and a newspaper article about a soccer game. For each stimulus, subjects were asked to read a short paragraph and then respond verbally to a series of questions regarding the information they had read. Coding was done by individual testers, and scoring was based on the total number of correct answers.

Numeracy

Subjects were asked to read aloud a list of numbers ranging from one to three digits, to read a list of prices of familiar articles, and to order a list of items sequentially by their prices. They were also shown three pictures reflecting common situations of buying, working, and transportation and asked to answer questions regarding costs, wages, fares, and distances that required the ability to add, subtract, multiply, or divide. There was a total of 41 items. Coding was done by individual testers, and scoring was based on the number of correct answers across all items.

Knowledge

The knowledge test consisted of 22 questions regarding common experiences related to school, work, transportation, legal-political structures, and health. Subjects were presented with situations that required either basic knowledge or simple decision-making skills to be understood. They were given three possible choices and asked to select the option that best answered the question. Coding was done by individual testers, and scoring reflected the total number of correct answers.

Achievement Tests

The Interamerican Reading Series is a standardized test that consists of three parts: level of comprehension, speed of comprehension, and vocabulary. As a result of the pilot study, and owing to time constraints, only the level of comprehension and vocabulary sections were included. All subjects who passed the preliteracy test, independent of years of schooling, were given the achievement tests. The tests were timed and given either individually or in a group of up to four subjects. Scores were the number of correct answers on each of the two scales.

Intelligence

Intelligence was assessed with Raven's Progressive Matrices (RPM), which consists of five scales (A-E) containing 12 items each. Data from pilot testing indicated very low variance on scales D and E; consequently, only scales A, B, and C were administered. The test was administered either individually or in a group, and scoring reflected the number of correct answers summed across the three scales.

Information Processing

Tests of simple, choice, and memory reaction time (RT) (Sternberg, 1966) composed the computerized battery of tests to assess information processing. In addition, a paired associates test was administered as part of this battery. The intent of the battery was to assess the speed with which an individual processed information in completing elementary cognitive tasks. As described below, two of the RT tests (i.e., choice and memory) also allowed an assessment of efficiency, that is, speed in relation to errors in response.

The computer programs for each test were designed for this study. Two Guatemalan testers from a medium-sized town centrally located near the villages were trained in the use of the computer program and data management. They had limited previous experience with computers but were trained extensively during both the pilot and the pretesting stages of the project.

Subjects to be tested were first introduced to the computer as if it were a television and a typewriter (both familiar objects). They were then given a chance to interact with the computer in a series of warm-up exercises prior to the administration of the test battery.

Simple Reaction Time

This task consisted of repeated presentations of a randomly selected stimulus (geometric figures such as a circle or triangle) at the center of a computer screen. The duration of the presentation was 0.5 sec, with an interstimulus interval that varied systematically between 0.5 and 2 sec. The subjects were instructed to press the bar of the keyboard as quickly as possible on appearance of the stimulus. The test consisted of 30 trials. The lapse between presentation of the target stimulus and the bar press was recorded for each response. The score was the mean reaction time across successful trials.

Choice Reaction Time/Accuracy

The task consisted of the presentation of 12 geometric figures, from which the subject selected two that then became target figures. A series of five figures (two target and three randomly selected from the initial set of 12) flashed on the screen sequentially with a display period of 0.5 sec and interstimulus intervals that varied systematically between 0.5 and 3 sec. Subjects were instructed to press the bar when the two target figures appeared in sequential order and to refrain from pressing the bar in response to any other figures or to the target figures when not presented in sequential order. The test consisted of 30 trials. In addition to calculating reaction time for all correct responses, the percentage positive (presence of motor response) and negative (inhibition of motor response) correct and the number of errors of omission and commission were also calculated. The standardized error and reaction time scores were then used to calculate measures of efficiency (total error score plus reaction time) and impulsivity (total error score minus reaction time) (Salkind & Wright, 1977). These two measures capture variation in style of response, taking into account both accuracy and speed. Large negative scores on the efficiency measure are interpreted as highly efficient responses, and large positive scores on the impulsivity index indicate impulsive responses.
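The two indices can be illustrated with a small sketch, assuming that "standardized" means z scores computed across subjects. The data and function names are hypothetical, and the source does not state which standard deviation formula was used.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values across subjects (population SD, an assumed choice)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def efficiency_and_impulsivity(errors, reaction_times):
    """Return (efficiency, impulsivity) lists, one value per subject.

    efficiency  = z(errors) + z(reaction time); large negative = efficient
    impulsivity = z(errors) - z(reaction time); large positive = impulsive
    """
    z_err = z_scores(errors)
    z_rt = z_scores(reaction_times)
    efficiency = [e + t for e, t in zip(z_err, z_rt)]
    impulsivity = [e - t for e, t in zip(z_err, z_rt)]
    return efficiency, impulsivity


# Hypothetical subjects: total errors and mean reaction time in seconds.
errors = [0, 1, 4, 5]
reaction_times = [0.40, 0.70, 0.45, 0.75]

efficiency, impulsivity = efficiency_and_impulsivity(errors, reaction_times)
print([round(v, 2) for v in efficiency])
print([round(v, 2) for v in impulsivity])
```

In this illustration the first subject (few errors, fast) has the most negative efficiency score, and the third subject (many errors, fast) has the highest impulsivity score.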

Memory Task

This task follows Sternberg's (1966) paradigm. It consisted of the horizontal presentation of six geometric figures at the top of the computer screen for 3 sec; the figures then flashed off the screen, and a single target figure appeared at the center of the screen. Subjects had to press one of two different keys depending on whether the target figure was one of the six previously displayed figures or not. The test included 20 trials. As in the previous testing, scores consisted of reaction time, percentage of positive and negative correct, impulsivity, and efficiency.

Paired Associates

The task consisted of four pairs of randomly selected geometric figures that appeared at the top-left-hand corner of the screen for 5 sec. Figures were presented in two horizontal rows, paired vertically. Pairs were then flashed off the screen, and one of the four figures from the top row appeared in the middle of the screen; concurrently, the four figures from the bottom row appeared at the bottom of the screen. Each of the four figures was numbered (1-4). Subjects were requested to select the numbered figure that had been paired originally with the target figure by selecting the corresponding number on the keyboard. Each trial consisted of the presentation of four target figures (selected in random order). A bell rang after every correct response; incorrect answers received no feedback. The four pairs were consistent across all trials, while order of presentation was random. The test was completed after 30 trials or when all four pairs had been successfully matched on three consecutive trials. The score was the number of trials required to reach criterion.
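The stopping rule and score can be sketched as follows. The source does not say what score was assigned to subjects who never reached criterion; the code assumes they receive the maximum of 30 trials. The function name and the input format (number of pairs matched correctly on each trial) are hypothetical.

```python
def trials_to_criterion(correct_per_trial, pairs=4, run_length=3, max_trials=30):
    """Number of trials needed to match all pairs on consecutive trials.

    correct_per_trial: count of correctly matched pairs on each trial.
    Returns max_trials if criterion is never reached (an assumed convention).
    """
    streak = 0
    for trial_number, n_correct in enumerate(correct_per_trial[:max_trials], start=1):
        # A perfect trial extends the streak; any miss resets it.
        streak = streak + 1 if n_correct == pairs else 0
        if streak == run_length:
            return trial_number
    return max_trials


print(trials_to_criterion([2, 3, 4, 4, 3, 4, 4, 4]))
```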

Procedure

Each of the four villages was visited twice by a research team, once during the dry and once during the rainy season. The teams were rotated, and each team visited each village during one round of testing. The team stayed in the village for 3-9 weeks, depending on the size of the village and coverage rates. Teams were made up of a doctor, two anthropometrists, several interviewers for sociodemographic data collection, and three persons trained to collect the behavioral data: one person on each team administered the information-processing tests, and two administered the psychoeducational tests.

Subjects were asked to complete the series of psychoeducational and information-processing tests on two separate days; completion of both series in a single day was strongly discouraged and occurred infrequently. When administered on the same day, a break was given between the two testing sessions. The information-processing evaluation lasted approximately 30 min, while the psychoeducational assessment averaged 1 hour and 15 min. In the case of illiterate subjects, all tests were administered individually.

In each community, two staff members recruited subjects and made appointments for testing. All testing was conducted in houses in the community rented by the project and adapted appropriately. In addition to psychological assessments, subjects were given medical and anthropometric examinations and interviewed regarding sociodemographic characteristics.

Reliability of Tests

Test-Retest

Test-retest stability coefficients (Pearson product-moment correlation) for the psychoeducational and information-processing tests were assessed on a subsample of the Guatemalan adolescent study population (N = 217). Subjects who agreed to participate in retesting were assigned randomly to one or more of the information-processing and/or psychoeducational tests. The test-retest interim period ranged across subjects from 2 to 34 days, with a mean of 17.7 days (SD = 7.99). Tests with a test-retest stability coefficient of .40 or less were dropped from further analyses.
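A minimal sketch of the stability coefficient and the retention rule follows. The paired scores are hypothetical; only the .40 cutoff comes from the text. The function names are assumptions made for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between paired scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def is_retained(r, cutoff=0.40):
    """Tests with a coefficient of .40 or less are dropped."""
    return r > cutoff


# Hypothetical first-session and retest scores for six subjects.
first_session = [12, 18, 25, 30, 22, 15]
retest = [14, 17, 27, 29, 20, 16]

r = pearson_r(first_session, retest)
print(round(r, 2), is_retained(r))
```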

As shown in Table 13, the stability coefficients for the psychoeducational tests were high, ranging from .85 to .98. These coefficients are similar to published test-retest values for Raven's Progressive Matrices (Rash, 1959; Stinissen, 1956 [cited in Raven, Court, & Raven, 1984]) and the Interamerican Series (Manuel, 1967).

The stability coefficients for the reaction time and paired associates tests were also moderate to high. However, the other variables on the choice reaction test had coefficients under .40. The means and frequency distributions for these variables indicate that the test was not sufficiently difficult to capture individual differences; the frequency distribution of errors of commission, for example, showed the majority of subjects to have made few or no such errors.

Test-retest stability coefficients were also assessed on the basis of sub-samples of subjects with longer versus shorter interims between testing. Differences in range of time between testing sessions did not significantly affect reliability coefficients.

Internal Homogeneity

Using the entire sample, Cronbach's alphas were calculated to assess the internal consistency of the RPM, the Interamerican Series, and the knowledge, numeracy, and reading tests. Alphas obtained for the RPM and the Interamerican vocabulary and reading tests were high (.79-.98) and similar to the internal consistency measures published for these tests in the literature (Arnold, 1969; Barahini, 1973; Stinissen, 1956 [cited in Raven et al., 1984]; Swinnen, 1958 [cited in Raven et al., 1984]).

TABLE 13: TEST-RETEST CORRELATIONS OF THE COGNITIVE MEASURES DERIVED FOR THE FOLLOW-UP PSYCHOEDUCATIONAL BATTERY TESTS


                                          r
Psychoeducational battery tests:
  Raven (N = 88)                         .87
  Knowledge (N = 87)                     .88
  Interamerican (N = 70):
    Reading                              .85
    Vocabulary                           .87
  Reading (N = 70)                       .88
  Literacy (N = 89)                      .98
  Numeracy (N = 89)                      .90
Information-processing battery tests:
  Reaction time (N = 82)                 .73
  Paired associates (N = 85):
    Trials to criterion                  .47
  Choice reaction time:
    Reaction time                        .46
    % positive correct                   .18 a
    % negative correct                   .09 a
    Efficiency                           .23 a
    Impulsivity                          .35 a
  Memory task (N = 70):
    Reaction time                        .71
    % positive correct                   .12 a
    % negative correct                   .72
    Efficiency                           .68
    Impulsivity                          .72

a Dropped from further analysis.

The internal homogeneity of the numeracy, knowledge, and reading tests was of particular interest as these tests had been constructed specifically for use with the Guatemalan adolescents. The coefficient alpha for the numeracy test was .95; alphas for the knowledge (.67) and the reading (.75) tests were not as high, but we nevertheless considered them to fall within an acceptable range. Item deletions proved to increase the coefficient only marginally and hence were not considered necessary for subsequent analyses.
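For readers unfamiliar with the statistic, the following is a minimal sketch of coefficient alpha and of the "alpha if item deleted" check described above. It uses the standard formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The item scores and function names are hypothetical and are not taken from the study data.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Coefficient alpha for a list of subjects, each a list of item scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha requires at least two items")
    item_columns = list(zip(*scores))
    sum_item_var = sum(pvariance(col) for col in item_columns)
    total_var = pvariance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return k / (k - 1) * (1 - sum_item_var / total_var)


def alpha_if_item_deleted(scores):
    """Alpha recomputed with each item removed in turn."""
    k = len(scores[0])
    return [
        cronbach_alpha([row[:i] + row[i + 1:] for row in scores])
        for i in range(k)
    ]


# Hypothetical right/wrong (1/0) responses of six subjects to four items.
scores = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [1, 0, 1, 1],
]

print("alpha:", round(cronbach_alpha(scores), 2))
for i, a in enumerate(alpha_if_item_deleted(scores), start=1):
    print(f"alpha without item {i}:", round(a, 2))
```

If removing an item raises alpha only marginally, as was the case here, there is little to gain from dropping it.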

Tester Differences

Assessment of differences among testers was made by comparing mean scores obtained on each variable by each of the two information-processing testers and by each of the four psychoeducational testers.

As shown in Table 14, significant intertester differences were observed on all psychoeducational tests, except for the Interamerican reading test. A series of analyses was run to assess whether the differences were a function of length of time spent in the village, round of testing, systematic disposition of an individual tester, or teams of testers (since two testers were always working in each village together). The results suggest that the differences were more likely to be related to teams rather than to individual testers and that they were not systematic - no one tester appeared to be biasing the results in a specific direction. Nevertheless, because some of these differences were large and potentially capable of affecting findings on the effects of treatment, final analyses of the psychoeducational outcomes were run both with and without controlling for testers. Comparisons of results indicated that none of the treatment effects were modified significantly by tester variation.
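The F values in Table 14 come from comparing tester means on each test. The sketch below shows how a one-way analysis of variance F ratio is computed from scores grouped by tester. The scores are hypothetical, the function name is an assumption, and the Duncan post hoc comparisons used in the table are not reproduced.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for score lists, one list per tester."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more scores than groups")
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of tester means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual scores around their own tester's mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical scores on one test, grouped by the four psychoeducational testers.
scores_by_tester = [
    [12, 11, 13, 12, 10],
    [10, 11, 9, 12, 10],
    [11, 12, 12, 10, 13],
    [11, 10, 12, 11, 11],
]

f_ratio, df_b, df_w = one_way_anova_f(scores_by_tester)
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

With four testers the between-groups degrees of freedom equal 3, which agrees with the first df value given in the note to Table 14.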

TABLE 14: SIGNIFICANCE OF DIFFERENCES IN MEANS OBTAINED ON THE PSYCHOEDUCATIONAL TESTS BY EACH TESTER

| Test | Tester 1 | Tester 2 | Tester 3 | Tester 4 | F value |
|---|---|---|---|---|---|
| Literacy | 3.42 a | 3.30 a,b | 3.22 b | 3.05 c | 6.57*** |
| Reading | 14.62 c | 14.72 c | 16.58 a | 15.75 b | 19.44*** |
| Raven's Matrices | 11.58 a | 10.66 b | 11.36 a | 11.01 a,b | 2.62* |
| Numeracy | 33.33 a | 30.82 b | 32.66 a,b | 31.91 b | 5.55*** |
| Knowledge | 13.86 a | 13.59 a,b | 13.04 c | 13.32 b,c | 3.64** |
| Interamerican: Reading | 17.34 a | 17.03 a | 17.07 a | 16.69 a | .63 |
| Interamerican: Vocabulary | 25.02 b | 25.07 b | 26.75 a | 26.43 a | 3.63** |

NOTE - Duncan test; means with the same letter are not significantly different. df(3, 1,405) for literacy, numeracy, knowledge, and RPM; df(3, 1,052) for reading and Interamerican reading and vocabulary.

*p <.05.

**p <.01.

***p <.001.

Among the information-processing tests, the only scores on which significant tester differences were obtained were the choice reaction time and memory reaction time variables. These differences, however, were not large (.028 and .09 sec, respectively), suggesting that their statistical significance can possibly be attributed to the large sample size and does not represent behaviorally meaningful differences between testers.

Validity

To assess construct validity, a factor analysis was conducted to test the original assumption that the overall battery of tests assessed two distinct domains of cognition: (a) complex intellectual aptitudes, abilities, and educational achievements and (b) elementary aspects of information processing. A factor analysis with varimax rotation performed on the full set of variables showed that all the psychoeducational tests loaded strongly on the first factor, which reflects an overall, general abilities factor (see Table 15). Two of the reaction time variables (choice and simple reaction time) loaded most heavily on Factor 2, and a memory variable (percentage of negative items correct) loaded on Factor 3. Factor 4 included the number of trials to reach criterion on the paired associates test and reaction time on the memory test. The composition of this last factor was somewhat unexpected as it had been assumed that memory reaction time would load with the other two reaction time measures.

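TABLE 15: FACTOR LOADINGS OF ADOLESCENT COGNITIVE OUTCOME VARIABLES

| Outcome variable | Factor 1 | Factor 2 | Factor 3 | Factor 4 |
|---|---|---|---|---|
| Raven's Matrices | .610 | | | |
| Numeracy | .806 | | | |
| Reading | .862 | | | |
| Knowledge | .684 | | | |
| Literacy | .664 | | | |
| Interamerican vocabulary | .802 | | | |
| Interamerican reading | .736 | | | |
| Choice reaction time | | .830 | | |
| Simple reaction time | | .821 | | |
| Memory - reaction time | | | | .708 |
| Trials to criterion | | | | -.647 |
| Memory - % negative correct | | | .756 | |
| Eigenvalues | 3.90 | 1.40 | 1.23 | 1.05 |
| % variance | 51.4 | 18.5 | 16.1 | 13.8 |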

The factor analysis supports the assumption that the psychoeducational and the information-processing test batteries assess two distinct cognitive domains. A clear division exists between Factor 1 (psychoeducational) and Factors 2, 3, and 4 (information processing). The factor-analytic separation of the simple and choice RT tests from the memory RT suggests that, in these subjects, the different measures of RT are not tapping the same cognitive functions and may, therefore, be sensitive to different types of influences. The existence of these distinct domains was confirmed with an oblique rotation.
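The "% variance" row of Table 15 can be checked against the eigenvalues. The sketch below assumes that each percentage expresses a factor's eigenvalue as a share of the sum of the four retained eigenvalues; this is an inference from the tabled figures, not something stated in the text. Under that assumption the computed shares agree with the reported ones to within about 0.1 percentage point, a discrepancy consistent with rounding of the eigenvalues.

```python
# Values copied from Table 15.
eigenvalues = [3.90, 1.40, 1.23, 1.05]
reported_pct = [51.4, 18.5, 16.1, 13.8]

# Assumed definition: share of the variance captured by the four retained factors.
total = sum(eigenvalues)
computed_pct = [100 * ev / total for ev in eigenvalues]

for factor, (computed, reported) in enumerate(zip(computed_pct, reported_pct), start=1):
    print(f"Factor {factor}: computed {computed:.1f}%, reported {reported:.1f}%")
```

On this reading, Factor 1 accounts for roughly half of the variance explained by the four factors together, not half of the total variance in the test scores.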

Concurrent test validity was also addressed by calculating correlations between the test scores and the educational variables. Positive and statistically significant correlations - ranging from r = .18 to r = .58 - were found between highest grade achieved and all the tests contained in the psychoeducational battery. Correlations between grade attainment and the information-processing variables were also statistically significant, but much lower, ranging from -.10 (simple RT) to -.22 (memory efficiency).
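A correlation as small as -.10 can still be statistically significant when the sample is large. The sketch below illustrates this with the usual t statistic for a correlation, t = r × sqrt(n - 2) / sqrt(1 - r²), and a normal approximation to the two-sided p value. The sample size of 1,000 is hypothetical, chosen only to be of the same order as the degrees of freedom given in the note to Table 14; the actual N for these correlations is not reported here.

```python
from math import erfc, sqrt


def correlation_t(r, n):
    """t statistic for testing whether a correlation r from n pairs differs from 0."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


def two_sided_p_normal(t):
    """Two-sided p value from the normal approximation (adequate for large n)."""
    return erfc(abs(t) / sqrt(2))


n = 1000  # hypothetical sample size, not taken from the monograph

# Endpoints of the correlation ranges reported in the text.
for r in (-0.10, -0.22, 0.18, 0.58):
    t = correlation_t(r, n)
    p = two_sided_p_normal(t)
    print(f"r = {r:+.2f}: t = {t:6.2f}, p = {p:.4f}, shared variance = {100 * r * r:.0f}%")
```

Under these assumptions a correlation of -.10 is significant yet corresponds to about 1% of shared variance, which is why the information-processing correlations are described as significant but much lower.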