INSIDE TRANSITIONAL BILINGUAL CLASSROOMS: ACCURATELY DESCRIBING THE LANGUAGE LEARNING PROCESS

Karen Leigh Bruce, Rafael Lara-Alecio, Richard I. Parker, Jan E. Hasbrouck
Texas A&M University

Laurie Weaver
University of Houston, Clear Lake

Beverly Irby
Sam Houston State University

Abstract

The field of bilingual education lacks reliable methods for accurately describing the instructional process in transitional bilingual classrooms. In this article, a four-dimensional pedagogical model of transitional bilingual education was operationalized and an observation tool created. Pilot-testing of the observation tool occurred in four Grade 5 transitional bilingual classrooms, for the initial purpose of judging interrater reliability and stability of observation-based results over time. Finally, several hypotheses were posed about instruction within the four classrooms, and observation results were used to confirm or challenge these hypotheses. Results demonstrated high interrater reliability, but found that adequate stability over time would require more extensive observations. Most of the hypotheses posed about instruction in the target classrooms were disconfirmed by observation data.


Recent meta-analyses of bilingual education have found a major methodological problem with most evaluation research: inadequate description of the bilingual instructional process being evaluated. Even classrooms operating under the same philosophy or avowed model may look very different inside (Cziko, 1992; Lam, 1992). The labels "bilingual" and "transitional bilingual" do have generally accepted definitions (Ovando & Collier, 1985; Peregoy & Boyle, 1993). However, the definitions are so broad as to tell very little about the teaching/learning processes occurring in the classroom, their variety, and patterns of occurrence (Escamilla, 1992; Strong, 1986). In other cases, the label is a misnomer: the phenomenon of nearly exclusive English instruction within "bilingual" classrooms is not uncommon (Losey, 1995; Sapiens, 1982; Vasquez, 1993).

To improve our understanding of bilingual education, we need descriptive accuracy. Our descriptions should reflect actual instructional practices, and be validated through reliable direct classroom observations. Relatively few studies in bilingual education have involved direct observation of instruction (Brisk, 1991; Escamilla, 1992; Heras, 1994; Krashen & Biber, 1988; Strong, 1986), and fewer still have used a "wide lens" to attempt to broadly describe the teaching/learning process. As a result, assumptions about present practice may not be well-founded, and our attempts at program evaluation are clouded by confusion (Cziko, 1992; Lam, 1992).

Second language acquisition research suggests which features should be attended to in describing the instructional process in bilingual education classrooms. This research lends support to the content validity of any observation scheme. The literature suggests the importance of the following variables: (a) which language of instruction is used, and for what content (Heras, 1994); (b) how the first and second languages may be used together (Heras, 1994); (c) how students are physically grouped for instruction (Strong, 1986); (d) what types of learning activities occur, and with what opportunity for student language use (Berducci, 1993); and (e) how listening, speaking, writing and reading communication modes are utilized for language learning (Krashen & Biber, 1988). Most descriptive research in bilingual classrooms has offered only piecemeal descriptions, focusing on only one or two of these variables. In contrast, our study relies on a wider observation lens to focus simultaneously on multiple features of bilingual classroom teaching and learning.

In addition to attending to important instructional features (content validity), observation-based classroom descriptions should be technically sound, i.e., yield descriptions which do not vary much from one observer to the next (interrater reliability), or from one occasion to the next (stability over time [Suen & Ary, 1989]). Technically sound classroom observations are especially important when the results are used to monitor implementation of an instructional model, for program evaluation purposes, or when results are officially communicated outside of the school. In addition, classroom observations should have "utility," i.e., they should be readily interpretable and yield pedagogically useful results. In summary, our criteria for good-quality classroom observations are (a) content validity, (b) interrater reliability, (c) stability over time, and (d) utility.

The purpose of this study was to create procedures for accurately describing the instructional process in transitional bilingual classrooms. Transitional bilingual programs are those in which the first language and English are used in some combination for instruction, and where the first language is used as a temporary bridge to English language instruction (Ovando & Collier, 1985; Peregoy & Boyle, 1993). We sought procedures which reflected current pedagogical theory (content validity), produced unambiguous results (interrater reliability), yielded classroom descriptions which varied little from one occasion to the next (stability over time), and produced easily interpretable and practically useful results (utility).

Our work began with a recently introduced pedagogical transitional bilingual model (Lara-Alecio & Parker, 1994). The model was developed as a basis for more accurate and useful descriptions of instruction in transitional bilingual classrooms. The present article discusses how the model was operationalized, and a tool for conducting observations in transitional bilingual classrooms was developed. We further describe the attainment of interrater reliability with this observation tool, discuss stability of results over time, and use results of classroom observations to confirm or challenge hypotheses about instruction in four Grade 5 transitional bilingual classrooms.

The Transitional Bilingual Pedagogical Model

As reported in more detail by Lara-Alecio & Parker (1994), the Transitional Bilingual Pedagogical (TBP) Model was developed to identify the components of transitional bilingual programs, the most frequently implemented approach to bilingual education (Ovando & Collier, 1985). In transitional bilingual programs, first language instruction is envisioned as a temporary bridge to English language instruction and acquisition (Peregoy & Boyle, 1993). The TBP Model consists of four dimensions: (a) Language Content, (b) Language of Instruction, (c) Communication Mode, and (d) Activity Structures. These four model dimensions are depicted in Figure 1.

Language Content

The model's "Language Content" dimension derives from Cummins's (1986) influential distinction of Basic Interpersonal Communications Skills (BICS) and Cognitive-Academic Language Proficiency (CALP) language competencies. While the BICS and CALP distinction was initially useful, the main limitations (Trueba, 1989) of this simple dichotomy are that it has obscured all classroom communication on a continuum between BICS and CALP, and has discouraged examination of student progress in this vast "middle area." The TBP Model reformulates BICS and CALP as malleable levels of discourse, rather than as fixed or long-term abilities. The Model includes four levels of language content: (1) Social Routines (social exchanges and conversation), (2) Classroom Routines (repetitive school-related tasks), (3) Light Cognitive Content (e.g. discussing community news), and (4) Dense Cognitive Content (entailing conceptually demanding, specialized vocabulary).

Language of Instruction

The Model's second dimension, the "Language of Instruction," presents four progressive uses of native (L1) and second (L2) language in the classroom: (a) content presented in L1, (b) L1 introduces L2, (c) L2 supported and clarified by L1, and (d) content presented in L2. This dimension acknowledges the concept of "transition" (as in "transitional bilingual"), and affirms the importance of the content areas as rich sources of language input for LEP children (Cummins, 1986) and as vehicles for language learning (Krashen, 1985). Sapiens (1982) and others have observed teachers' combined use of first and second languages to effectively provide access to class content. Language of Instruction usually refers to the teacher's use of language. However, it also may refer to the reading text used, or the language used by students in cooperative learning groups.

Communication Mode

The Model distinguishes two receptive modes (Aural, Reading) and two expressive language modes (Verbal, Writing). Cummins' (1986) "reciprocal interaction model" and the "context-specific" model of Diaz et al. (1970) both support the practice of multiple modalities for second language acquisition. These modalities (especially Reading, Writing, and Verbal Expression) also are important curriculum skill areas. Their differentiation within the TBP Model indicates that English facility may not be unitary, but may vary by communication mode.

Activity Structures

Activity Structures are teacher-structured, stable, recurring learning situations, each with its own expectations for teacher and student communication (Brophy & Evertson, 1978; Doyle, 1981). Communication that is expected and fostered in one activity structure may be inappropriate and discouraged in a second. Our traditional pedagogical emphasis on "the lesson" with objectives, curriculum content, and assignments, unfortunately ignores "activity structures." Influenced by Vygotsky's notion of Zone of Proximal Development (Cole & Griffin, 1983), classroom ethnographers similarly describe the "structure of events," each type of structure with its own opportunities, implied values and expectations for student participation (Erickson, 1982). Activity structures are operationally defined in the TBP Model as combinations of: (a) type of teacher behavior (e.g. directing, leading, evaluating, observing), and (b) the expectation for student responding (e.g. listening, performing, discussing, asking questions, answering questions, cooperative learning). A few classroom activity structures (e.g. time spent disciplining, transitions between classes) are considered non-academic. Most classroom activity structures are defined by combinations of two activities, signifying the main teacher behavior plus the primary student expected behavior (Parker, Hasbrouck & Tindal, 1994). Thus when a teacher mainly lectures or presents information, and students are mainly expected to listen, the activity structure is identified as lecture/listen (Lec/Lis). The Activity Structures of the TBP Model are described in greater detail in the Appendix. Three additional model-based indices round out the description of activity structures: duration of the activity, curriculum subject, and physical grouping of the students.

Developing the Observation Protocol

The TBP Model aims to guide observations and data collection on the teaching/learning process within transitional bilingual classrooms. To serve this purpose, each Model dimension had to be operationally defined. Two graduate student researchers wrote descriptions for each dimension, and for different levels of each dimension. These researchers then tested these descriptors against actual observations, and re-wrote definitions to reduce ambiguities. Concurrent, independent classroom observations were followed by debriefings to reconcile differences in perception between the two student researchers. The result was the TBP Observation Protocol, which guides collection of observational data from the four dimensions of the TBP Model (Language Content, Language of Instruction, Communication Mode, and Activity Structures) within actual classroom settings.
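As an illustration only, one coded observation under the four Model dimensions could be represented as a small record. The level names below come from the Model description above; the record layout, field names, and validation are assumptions made for this sketch and are not taken from the TBP Observation Protocol itself.

```python
from dataclasses import dataclass

# Level names follow the Model description; everything else here
# (constant names, field names, validation) is hypothetical.
LANGUAGE_CONTENT = (
    "Social Routines",
    "Classroom Routines",
    "Light Cognitive Content",
    "Dense Cognitive Content",
)
LANGUAGE_OF_INSTRUCTION = ("L1", "L1-2", "L2-1", "L2")
COMMUNICATION_MODE = ("Aural", "Reading", "Verbal", "Writing")


@dataclass(frozen=True)
class Observation:
    """One coded observation interval (hypothetical record layout)."""

    language_content: str
    language_of_instruction: str
    communication_mode: str
    activity_structure: str  # teacher/student pair, e.g. "Lec/Lis"

    def __post_init__(self):
        # Reject codes that are not levels of the corresponding dimension.
        checks = (
            (self.language_content, LANGUAGE_CONTENT),
            (self.language_of_instruction, LANGUAGE_OF_INSTRUCTION),
            (self.communication_mode, COMMUNICATION_MODE),
        )
        for value, allowed in checks:
            if value not in allowed:
                raise ValueError(f"{value!r} is not one of {allowed}")


# A lecture in English, clarified in Spanish, on demanding content.
example = Observation("Dense Cognitive Content", "L2-1", "Aural", "Lec/Lis")
```

Activity structures are left as free text in this sketch because the full list of structures is given in the Appendix rather than in the text.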

Figure 2. Transitional Bilingual Observation Protocol. [Figure not reproduced; the form's header fields are Date, Start Time, Observer, School, and Class.]

Most observations used in the development of the TBP Observation Protocol took place within four Grade 5 transitional bilingual classrooms in a low-SES district near Houston, Texas. These classrooms were part of an intensive summer instruction program for at-risk bilingual students, supported by a transitional English grant from the U.S. Department of Education Office of Bilingual Education for Math and Language Arts. Students' entering English skills on the Idea Oral Language Proficiency Test (1982) were Non-English Speaker (NES): 13%, Limited English Speaker (LES): 38%, and Fluent English Speaker (FES): 49%. Students were heterogeneously grouped in classrooms, each class with a bilingual certified teacher and a fluent bilingual teaching assistant.

A common philosophy was developed by the grant writers to guide the organization and structure of these classrooms. The philosophical guidelines provided to the teachers were based on findings from research on effective bilingual classrooms and included: (a) using Spanish to introduce new content and difficult concepts, (b) using Spanish to introduce English concepts, and clarify confusions in English, (c) providing frequent opportunities for verbal communication, (d) emphasizing cooperative learning (pairs and groups), and (e) utilizing and emphasizing equally all four language modes: Writing, Reading, Aural/Listening, Speaking (Ammon, 1985; Diaz, Moll & Mehan, 1986; Gutierrez, 1992; Reyes, 1991; Reyes & Laliberty, 1992; Sapiens, 1982; Trueba, 1987; Wong Fillmore, Ammon, McLaughlin & Ammon, 1985).

Interrater Reliability

After TBP Model dimensions were operationalized and the observation protocol developed, we sought to answer the question of inter-observer reliability using the TBP Protocol. Reliability was calculated three times, for three consecutive sets of observations within the same four summer program classrooms, a total of 16 classroom observations over a two-week period.

The first reliability sample was based on 30-second and one-minute Momentary Time Sampling (MTS), wherein Model dimensions were judged and coded for that instant at the end of each short time period (Suen & Ary, 1989). MTS was utilized for 100 minutes, across the four classrooms. However, MTS provided a window too brief to adequately judge all coding categories, so Interval Time Sampling (ITS) was used instead (Suen & Ary, 1989).

Using ITS, observers coded events at the end of each short period, based on judgment of all events during that period (Suen & Ary, 1989). In the present study, the observers began with one-minute intervals, but soon switched to longer four-minute intervals, which provided better opportunity for judging all model dimensions. Observations were conducted during morning activities, which included student reports and homework review, plus some regularly scheduled time in math activity centers. This reliability sampling was distributed over the four classrooms for a total of 161 minutes. For the third and final sample, interrater reliability was calculated over the same classrooms for a total of 60 minutes.
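The difference between the two sampling schemes can be sketched in a few lines. This is a toy illustration under stated assumptions, not the study's procedure: the input is a hypothetical minute-by-minute stream of codes, and the "most frequent code in the interval" rule is only a mechanical stand-in for the observers' holistic judgment of each interval.

```python
from collections import Counter


def momentary_time_sampling(codes, interval):
    """Keep only the code seen at the final instant of each interval."""
    return [codes[end - 1] for end in range(interval, len(codes) + 1, interval)]


def interval_time_sampling(codes, interval):
    """Code each whole interval by its most frequent event.

    The majority rule is an assumption standing in for observer judgment.
    """
    result = []
    for start in range(0, len(codes) - interval + 1, interval):
        window = codes[start:start + interval]
        result.append(Counter(window).most_common(1)[0][0])
    return result


# Hypothetical stream: eight one-minute codes, sampled in four-minute intervals.
stream = ["Lec", "Lec", "Lec", "Ask", "Ask", "Ask", "Ask", "Lec"]
mts_codes = momentary_time_sampling(stream, 4)
its_codes = interval_time_sampling(stream, 4)
```

In this toy stream the two schemes disagree on both intervals, because the event occurring at the final instant is not the one that dominated the interval. That is the kind of distortion a too-brief observation window can introduce.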

Interrater agreement on the TBP Observation Protocol was calculated with Cohen's Kappa (Cohen, 1960), a conservative measure of agreement which accounts for error from chance agreement and from skewed data (or uneven marginals) in the observation sample. Generally speaking, Kappa values above .75 indicate strong agreement, between .40 and .75 fair to good agreement, and below .40, poor agreement (Fleiss, 1981; Wilkerson, 1992). Some statisticians recommend a more liberal Kappa interpretation by comparison with its maximum value for a particular dataset: KappaMax. The two indices may be expressed as a ratio: Kappa/KappaMax (Umesh, Peterson & Sauber, 1989). Table 1 reflects our preference for balanced consideration of three agreement indices: Percent Agreement, Kappa, and Kappa/KappaMax. Kappa was not calculated for perfect agreement.
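The three agreement indices can be computed from two observers' paired codes as in the following minimal sketch. It uses the standard definitions: Kappa is observed agreement corrected for chance agreement, and KappaMax is the largest Kappa attainable given the two observers' marginal totals. The function name, return format, and toy data are assumptions for illustration and do not reproduce the study's calculations.

```python
from collections import Counter


def agreement_indices(rater_a, rater_b):
    """Return percent agreement, Cohen's Kappa, and Kappa/KappaMax."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of intervals")
    n = len(rater_a)
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = set(count_a) | set(count_b)

    # Observed proportion of intervals on which the raters agree.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from the product of the marginals.
    p_chance = sum(count_a[c] * count_b[c] for c in categories) / n ** 2
    # Highest agreement the marginals allow (used for KappaMax).
    p_max = sum(min(count_a[c], count_b[c]) for c in categories) / n

    if p_chance == 1.0:
        # Both raters used one identical category throughout: Kappa is undefined.
        return {"percent_agreement": p_observed, "kappa": None, "kappa_ratio": None}

    kappa = (p_observed - p_chance) / (1 - p_chance)
    kappa_max = (p_max - p_chance) / (1 - p_chance)
    ratio = kappa / kappa_max if kappa_max else None
    return {"percent_agreement": p_observed, "kappa": kappa, "kappa_ratio": ratio}


# Toy data: ten intervals coded for Language of Instruction by two observers.
observer_1 = ["L2"] * 6 + ["L1"] * 4
observer_2 = ["L2"] * 5 + ["L1"] * 5
indices = agreement_indices(observer_1, observer_2)
```

In the toy data the observers disagree on one interval in ten. Kappa is lower than the raw percent agreement because half of the agreement would be expected by chance, while Kappa/KappaMax is higher because the differing marginals make perfect agreement impossible.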

Table 1 shows agreement results for observations based on the three consecutive reliability samples obtained over two weeks. Results are presented for each model dimension (Language Content, Language of Instruction, Communication Mode, and Activity Structure) plus Physical Grouping and Curriculum Subject.

In Table 1, the final "percentage of agreement" scores (from Sample Three) range from 93% to 100%, and the corresponding Kappas are all of respectable size. Reliability scores did not improve equally for all dimensions from the first to the third sample. For three of the six categories (Language of Instruction, Communication Mode, Physical Grouping), scores actually deteriorated when the observers switched from MTS to ITS. However, most losses were regained for the final 60-minute sample.

Measurement Stability

Two stability-related questions are germane to the confidence we have in observation-based descriptive results: (a) What length of observation provides a stable picture of bilingual instruction? and (b) To what extent does the description of bilingual instruction change from day to day? The two questions are related, but different. We performed separate analyses to answer these two questions, both on a total corpus of 321 minutes (5.3 hours), collected over two weeks within the four target classrooms. Our purpose in providing descriptions for all four classrooms at once was to describe a cohesive program being implemented, from teacher inservicing to instructional planning and delivery.

Observation Length

To answer the first question: "What length of observation is needed for a stable description?" we randomized the complete 321-minute corpus, divided it into four equal samples (80 minutes, or 1.3 hrs. each), and compiled summary "percent of occurrence" scores for each sample. Mean score variability (standard error of the mean - SEM) was then calculated for these four independent samples, which then helped form 94% confidence intervals (CIs) for the 80-minute samples. The CIs are here expressed as "Coefficients of Variation" (Wilkinson, 1992), the "percent of mean score change" across 80-minute samples (see Table 2). The Coefficient of Variation is less technical than a CI, and is unitized, permitting comparisons of relative stability across different variables. Larger Coefficients of Variation indicate poorer score stability. Generally speaking, stability scores of 15% or less are good, those of 25-50% are mediocre, and those of 60% and greater represent poor stability.
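One plausible reading of this calculation is sketched below: the half-width of the confidence interval around the mean "percent of occurrence" score, expressed as a percentage of that mean. The text does not give the exact formula, so the function, its name, and the use of a normal critical value are assumptions. With only four samples, a t-based critical value would give a wider interval.

```python
from statistics import NormalDist, mean, stdev


def stability_percent(sample_scores, confidence=0.94):
    """CI half-width as a percent of the mean score (assumed formula)."""
    n = len(sample_scores)
    if n < 2:
        raise ValueError("at least two samples are needed to estimate variability")
    centre = mean(sample_scores)
    if centre == 0:
        raise ValueError("a category that never occurs has no relative stability")
    # Standard error of the mean across the independent samples.
    sem = stdev(sample_scores) / n ** 0.5
    # Two-sided normal critical value for the requested confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return 100 * z * sem / centre


# Hypothetical relative frequencies of one category in four 80-minute samples.
score = stability_percent([0.30, 0.35, 0.35, 0.40])
```

Because the result is a percentage of the mean, a rare category with the same absolute spread as a common one receives a much larger (poorer) stability score.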

In Table 2, under Activity Structures, the relative frequency of occurrence of the category Ask/Ans is .35. In other words, the activity structure in which the teacher mainly asks questions and students are expected to answer occurs about one-third of the time. The stability score for Ask/Ans from our random sample of 80 minutes (Rand. Stabil.) is 22%. So we expect the Ask/Ans frequency of .35 to fluctuate up or down 22% (from about .27 to .43) over repeated 80-minute samples. This is a great deal of fluctuation, indicating only mediocre stability, and poorer than its second stability score, stability over time, discussed in the next section.

 

The observed variables with poor stability within an 80-minute sample are Communication Mode (average 45%) and Activity Structures (average 43%). In contrast, Group (average 7%), Language Content (average 13%) and Language of Instruction (average 26%) showed excellent-to-fair stability. To obtain high stability, a category must occur frequently, and be evenly distributed over time. Low frequency categories (e.g., the Activity Structure in which the teacher gives directions and the students perform, Dir/Per) generally showed low stability scores (114% for Dir/Per). Thus, the answer to our question about requisite observation length for a stable description is that 80 minutes is too short for some TBP Observation Protocol categories.

Stability Over Time

The second stability question we sought to answer was: "How much variation exists among observation results from one day to the next?" To answer this question, we did not randomize the data, but rather selected and compared three intact samples collected over consecutive days. These three samples composed the same total corpus of 321 minutes (5.3 hrs): 100 minutes, 161 minutes, and 60 minutes. We again calculated frequencies of occurrence, SEMs and 94% CIs, and converted to stability percent scores. These scores are labeled "Order Stability" in Table 2.

Table 2 indicates generally poorer stability for Order Stability than for Random Stability, even though the observation time was greater for the Order Stability samples (107 minutes on average) than for the Random Stability samples (80 minutes). Most of the Order Stability scores were one-half to one-third as stable (two-to-three times larger) as for Random Stability. This finding indicates distinct within-classroom differences from one day to the next.

Model-Based Expectations

Several expectations were held regarding how English and Spanish would be used by the teachers while providing instruction in the four transitional bilingual classrooms used in this study. These expectations were based on the goals and philosophy developed for this particular, grant-funded transitional bilingual program, the inservice delivered to all the teachers, and the records of students' entry-level language skills.

(1) Teachers were expected to use Spanish (L1) in support of English (L2) at least half of the class time, and to spend the remaining half of the class time roughly split between English and Spanish. The transitional program philosophy and preservice training had emphasized the mutual support roles of the two languages.

(2) Teachers were expected to make approximately equal use of both language support strategies: (2a) Dense Cognitive Content initially explained in Spanish, with English support (L1-2), and (2b) Spanish used to clarify other concepts of Light Cognitive Content presented initially in English (L2-1).

(3) Most class time was expected to be dedicated to teaching new academic content, and language learning was expected to occur through that content instruction.

(4) Different patterns of language use were expected for the expressive and receptive modes. We expected that Spanish, English, and both languages combined would be used for information input (Aural, Reading). In contrast, we expected more emphasis on English for output. Especially in Written Expression, we expected a stress on English performance. We also expected a varied, language-rich environment, with lots of Verbal/Aural-intensive activities, but with nearly half of class time spent in Reading or Written Expression.

(5) Most class time was expected to involve activity structures which permitted active learning by students, especially cooperative group learning. By active learning we meant any activity structure which includes active participation (more than simply listening) by students. We expected that Dense Cognitive Content especially would be learned interactively, as opposed to a lecture/listen format. Finally, we expected to see an emphasis on challenging students to apply or perform with Dense Cognitive Content, beyond merely answering or asking questions.

Model-based Classroom Descriptions

The TBP Observation Protocol was used to conduct a total of five hours of observation (eight sessions conducted over three days by two graduate students in Bilingual/ESL at Texas A&M University). Results were used to describe the programs being offered to the students in the four Grade 5 transitional bilingual classrooms in terms of TBP Model dimensions.

Results from the observations were organized around our previously identified expectations, or hypotheses, for these classrooms. To test the hypotheses, observational data were coded, entered into an electronic spreadsheet, and summarized through simple frequency tabulations or cross-tabulations.
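The two kinds of summary named here can be sketched as follows. The study used an electronic spreadsheet; this Python equivalent, its function names, and the toy codes are assumptions for illustration and are not study data.

```python
from collections import Counter


def percent_of_occurrence(codes):
    """Simple frequency tabulation: percent of observations per code."""
    total = len(codes)
    return {code: 100 * count / total for code, count in Counter(codes).items()}


def cross_tabulate(row_codes, column_codes):
    """Count how often each pair of codes was recorded for the same interval."""
    if len(row_codes) != len(column_codes):
        raise ValueError("both dimensions must cover the same intervals")
    return Counter(zip(row_codes, column_codes))


# Toy data: four intervals coded on two Model dimensions.
language = ["L2", "L2", "L1", "L2-1"]
content = ["Dense", "Dense", "Social", "Dense"]

language_percents = percent_of_occurrence(language)
language_by_content = cross_tabulate(language, content)
```

Dividing each cell of the cross-tabulation by the total number of intervals gives relative frequencies of the kind reported in the figures.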

Results

Expectation #1: Combined use of Spanish and English at least half of class time, and the remaining roughly split between English and Spanish. To test this hypothesis, we tabulated frequencies for the dimension, Language of Instruction. Frequencies were then converted to percents of total observations, yielding: L1-2: <1%, L2-1: 9%, L1: 20%, L2: 71%. Our hypothesis was not supported by the data. During most instructional time (71%) the teacher used English only. Use of Spanish as the instructional language occurred only 20% of the time.

Expectation #2: Approximately equal use of both language support strategies, L1-2 and L2-1. From the above data, we note that the two languages were used in combination only about 9% of the time, and almost all of that use was L2-1. Contrary to our expectations, when the languages were used together, Spanish was not used for initial introduction of new concepts, but only for clarification in mainly English presentations.

Expectation #3: (a) There will be an academic emphasis in classrooms, (b) Spanish or Spanish/English mixture used for Dense Cognitive Content and some Classroom Routines, and (c) English used for less demanding Light Cognitive Content and Social Routines. Results from cross-tabulating Language of Instruction with Language Content are presented in Figure 3.

Figure 3:

 

Figure 3 confirms that the classrooms maintained a strong academic focus, involving Dense Cognitive Content 73% of the time. Our hypothesis about Spanish versus English use for different language content was not supported. Secondary calculations from Figure 3 show that English was the language of choice for all content, including Dense Cognitive Content (68%) and Classroom Routines (85%). Spanish alone was used the most (59% of the time) for the least demanding language content, Social Routines. As expected, the two languages were used together (Spanish to clarify English) mainly for Dense Cognitive Content, but this seldom occurred (only 12% of the time).

Expectation #4: (a) Mixture of Spanish, English, and both languages combined for information input (Aural/Reading), (b) greater emphasis on English for output (Verbal Expression, Writing), especially Writing, and (c) an even blend of Verbal/Aural activities with academic tool skills (Reading & Writing). To test this hypothesis, we cross-tabulated Language of Instruction with Communication Mode, creating graphic and tabular summaries presented in Figure 4.

Figure 4: Language of Instruction by Communication Mode: Relative Frequencies

Figure 4 data partially support the first part of our hypothesis regarding language use for information input. Secondary calculations from Figure 4 indicate that English was used for Aural/Listening and Reading activities about 67% of the time, while Spanish (alone or supporting English) was used for the remaining 33% of the time. However, our hypothesis of much greater English use for language output was not borne out. Patterns of language usage varied little between input and output modes. Finally, the data did support the presence of an even mix of Verbal/Aural-intensive activities (57%) with academic tool skills (Writing and Reading) (43%).

Expectation #5: (a) Active learning and cooperative learning emphasized, (b) active learning used with Dense Cognitive Content, and (c) student application or performance required with Dense Cognitive Content. To test these hypotheses, we cross-tabulated Activity Structures with Language Content, summarizing the results in Figure 5.

Figure 5: Activity Structure by Language Content: Relative Frequencies

Activity Structures involving student active learning are all of those which entail a response other than passive listening (Lis), e.g. answering (Ans), asking (Ask), performing (Per), cooperative learning (Cop), and frequently-changing interactions between teachers and students (Inter). These active structures made up 75% of class time, a sizable commitment to active learning. However, contrary to our hypothesis, cooperative learning structures made up less than 2% of class time. Our prediction that Dense Cognitive Content would be handled mainly through active learning was confirmed, but this could be accounted for solely by high overall frequencies of both Dense Cognitive Content (73% of total) and active learning Activity Structures (75% of total). Proportionately, Dense Cognitive Content was no more likely to be part of active than of passive structures. Nor did the data confirm our hypothesis of an emphasis on student application or performance with Dense Cognitive Content.
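The reasoning about overall frequencies can be made concrete with a small sketch. If Dense Cognitive Content (73%) and active structures (75%) occurred independently of each other, they would still coincide in about 55% of class time. The helper names below are hypothetical, and the observed joint share is not reported in the text, so none is filled in here.

```python
def expected_joint_share(share_a, share_b):
    """Joint share expected if the two codes occur independently."""
    return share_a * share_b


def association_ratio(observed_joint, share_a, share_b):
    """Observed joint share relative to the independence expectation.

    A ratio near 1.0 means the pairing is explained by base rates alone.
    """
    return observed_joint / expected_joint_share(share_a, share_b)


# Overall shares reported in the text.
dense_share = 0.73
active_share = 0.75
baseline = expected_joint_share(dense_share, active_share)
```

An observed joint share close to this baseline would correspond to the finding that Dense Cognitive Content was proportionately no more likely to appear in active structures than in passive ones.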

Discussion

The field of bilingual education lacks adequate methods for describing teaching and learning within classrooms where students are transitioning from L1 to L2. A transitional bilingual pedagogical (TBP) Model was constructed (Lara-Alecio & Parker, 1994) to describe these classrooms in a manner consistent with current second language acquisition theory. In the current study we operationalized the TBP Model, created a TBP Observation Protocol, calculated reliability and stability estimates, and used the protocol to describe four Grade 5 classrooms in a special grant-funded transitional bilingual program.

Strong interrater reliability was established for all six observation categories assayed: Language Content, Language of Instruction, Communication Mode, Activity Structures, Physical Grouping, and Curriculum Subject. Stability indices were not as strong. In terms of size, a total of 80 minutes obtained from short observations over four days did prove adequate for stable descriptions of two model dimensions: Language Content and Physical Grouping. The other three observation categories proved too variable; most problematic were Communication Mode and Activity Structures. In these authors' judgment, samples of 2-2.5 hrs. will be needed to provide stable descriptions for all dimensions.

We also computed indices for stability of descriptions over time (three different days, over two weeks). A single day's observation proved entirely inadequate (for all dimensions assayed) for describing classroom instruction. In our judgment, samples from three or more days will be necessary to provide adequate measurement stability.

To demonstrate the utility of the TBP protocol, we performed a limited field test, observing and describing four classrooms engaged in an intensive transitional bilingual summer program. We posed several hypotheses, based on our knowledge of the program's philosophy and goals, students' language capabilities, and teacher inservicing. Most of these hypotheses were not confirmed by reliable observation data, underscoring the dangers in judging a program by its packaging. We did not return this summary information to the program director nor to the classroom teachers, to obtain their reactions. However, these observation-based data have practical potential uses for teacher self-monitoring and planning, and for program evaluation, as well as for research.

Teacher Self-Monitoring and Planning

Transitional bilingual programs are based on the premise that unfamiliar concepts are best learned in the known language first and then transferred to the second language, once the appropriate vocabulary has been learned (Cummins, 1988). Results of this study, however, indicate there may be a mismatch between theoretical premise and practice. The teachers observed in this study were found to use English more than Spanish during instructional time. In addition, Spanish was used to clarify English presentations rather than for the introduction of new concepts. The TBP Protocol has potential for classroom use by bilingual teachers. By using the TBP Protocol, teachers could examine when each language is used and for what purposes to ascertain how closely their classroom instructional practices match established bilingual instruction theory and program philosophy. Using these results, bilingual teachers could then plan their instruction activities, and the language combinations to deliver such instruction, to more closely align their classroom practices with the theory and philosophy of transitional bilingual instruction.

Program Evaluation

As funds to support various forms of bilingual educational programs become less available, funding agencies are likely to increase their requirements for evaluations of these programs. The TBP Observation Protocol could be used by evaluators of these programs to (a) accurately document what is actually occurring in classrooms, and (b) use those results to identify what aspects of the programs are key to attainment of program goals. Classrooms in which children are learning more of the presented content as well as mastering English language skills while developing and/or maintaining strong native language skills could be identified. Then, using data collected with the TBP Observation Protocol, program evaluators could obtain answers to critical questions such as: What are the instructional components being emphasized by teachers in these effective classrooms? What language combinations are being used when teaching various concepts? Answers to these and related evaluation questions could be readily used by program developers to improve instructional programs.

Research

Researchers of bilingual programs presently rely on inadequate descriptions of bilingual instruction (Cziko, 1992; Lam, 1992). Without clear, accurate descriptions of programs, even the best designed and conducted research cannot provide interpretable results (Escamilla, 1992; Strong, 1986). The TBP Model identified key elements or components of transitional bilingual programs, which were then operationalized and used reliably in classroom observation. Further use of a tool such as the TBP Observation Protocol could assist researchers considerably in building a trustworthy database of information about bilingual education programs.

The predominance of English found in our bilingual classrooms has been documented in other studies. Sapiens (1982) found that students acknowledged and responded to English as the "official" language even when teachers attempted to maintain Spanish in formal instruction. Spanish often becomes an informal, social language, relegated to the BICS language level (Vasquez, 1993). Other pressures on students and teachers to use English include the lack of Spanish instructional materials, and expectations from school administration (Losey, 1995).

The results of this study raise an important issue: If the purpose of transitional bilingual programs is to introduce new concepts in the known language and to provide clarification and reinforcement in the second language (Peregoy & Boyle, 1993), why does this not occur in the transitional bilingual classroom? Further research should examine this apparent lack of agreement between theory and practice. In addition, it is important to examine the reasons why this mismatch is occurring. Research has indicated that pressure from administrators, as well as the perception of Spanish language instruction as remedial instruction, may influence bilingual teachers to implement English language instruction in place of Spanish language instruction (Weaver, 1995). Further research is needed to address this issue, and the use of a reliable observation protocol to guide collection of data from within actual classrooms could play a valuable role.

This study offers one approach to improving the descriptive accuracy of instructional processes in transitional bilingual classrooms. Until these more accurate descriptions are available, pedagogical theory will be difficult to advance (Escamilla, 1992; Strong, 1986), and program evaluations will continue to produce ambiguous results (Cziko, 1992; Lam, 1992).

References

Ammon, P. (1985). Helping children to write in English as a second language: Some observations and some hypotheses. In S.W. Freedman (Ed.), The acquisition of written language: Revision and response (pp. 65-84). Norwood, NJ: Ablex.

Berducci, D. (1993). Inside the SLA classroom: Verbal interaction in three SL classes. Language Learning Journal, 8, 12-16. New York: Longman.

Brisk, M. (1991). Toward multilingual and multicultural mainstream education. Journal of Education, 173, 114-129.

Brophy, J. & Evertson, C. (1978). Context variables in teaching. Educational Psychologist, 12, 310-316.

Cohen, J.A. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37-46.

Cole, M. & Griffin, P. (1983). A socio-historical approach to re-mediation. The Quarterly Newsletter of the Laboratory of Comparative Human Cognition, 5 (4), 69-74.

Cummins, J. (1986). Language proficiency, bilingualism and academic achievement. In P. Richard-Amato (Ed.), Making it happen: Interaction in the second language classroom (pp. 382-395). New York: Longman.

Cummins, J. (1988). Language proficiency, bilingualism and academic achievement. In P. Richard-Amato (Ed.), Making it happen: Interaction in the second language classroom (pp. 382-395). New York: Longman.

Cziko, G.A. (1992). The evaluation of bilingual education: From necessity and probability to possibility. Educational Researcher, 21, 10-15.

Diaz, S., Moll, L.C. & Mehan, H. (1986). Sociocultural resources in instruction: A context-specific approach. In Beyond language: Social and cultural factors in schooling language minority students (pp. 187-230). Sacramento, CA: Bilingual Education Office, California State Department of Education.

Doyle, W. (1981). Research on classroom contexts. Journal of Teacher Education, 32, 3-6.

Erickson, F. (1982). Classroom discourse as improvisation: Relationships between academic task structure and social participation structure in lessons. In L.C. Wilkinson (Ed.), Communicating in the classroom (pp. 119-158). New York: Macmillan.

Escamilla, K. (1992). Theory to practice: A look at maintenance bilingual education classrooms. The Journal of Educational Issues of Language Minority Students, 11, 1-25.

Fleiss, J.L. (1981). Statistical methods for rates and proportions. New York: Wiley & Sons.

Gutierrez, K.D. (1992). A comparison of instructional contexts in writing process classrooms with Latino children. Education and Urban Society, 24, 244-262.

Heras, A. (1994). The construction of understanding in a sixth-grade bilingual classroom. Linguistics and Education, 5, 275-299.

Idea Oral Language Proficiency Test (1982). Brea, CA: Ballard & Tighe.

Krashen, S. & Biber, D. (1988). On course: Bilingual education's success in California. Sacramento: California Association for Bilingual Education.

Krashen, S. (1985). The Input hypothesis: Issues and implications. New York: Longman.

Lam, T.C.M. (1992). Review of practices and problems in the evaluation of bilingual education. Review of Educational Research, 62, 181-203.

Lara-Alecio, R. & Parker, R. (1994). A pedagogical model for transitional English bilingual classrooms. Bilingual Research Journal, 18 (3&4), 119-133.

Losey, K.M. (1995). Mexican American students and classroom interaction: An overview and critique. Review of Educational Research, 65, 283-318.

Ovando, C. & Collier, V. (1985). Bilingual and ESL classrooms: Teaching in multicultural contexts. New York: McGraw-Hill.

Parker, R., Hasbrouck, J.E., & Tindal, G. (1994). The Activity Structures Observation System (ASOS) (Training module). College Station, TX: Texas A&M University, Disabled and At-Risk Children and Youth (D.A.R.C.Y.).

Peregoy, S. & Boyle, O. (1993). Reading, writing and learning in ESL: A resource book for K-8. New York: Longman.

Reyes, M. de la Luz & Laliberty, E.A. (1992). A teacher's "Pied Piper" effect on young authors. Education and Urban Society, 24, 263-278.

Reyes, M. de la Luz (1991). A process to literacy using dialogue journals and literature logs with second language learners. Research in the Teaching of English, 25, 291-313.

Sapiens, A. (1982). The use of Spanish and English in a high school civics class. In J. Amastae & L. Elías-Olivares (Eds.), Spanish in the United States: Sociolinguistic aspects (pp. 386-412). Cambridge: Cambridge University Press.

Strong, M. (1986). Teacher language to limited English speakers in bilingual and submersion classrooms. In R. R. Day (Ed.), Talking to learn: Conversation in second language acquisition (pp. 53-63). Rowley, MA: Newbury House.

Suen, H.K., & Ary, D. (1989). Analyzing quantitative behavioral observation data (pp. 105-115). New Jersey: Lawrence Erlbaum Associates.

Trueba, H. (1989). Raising silent voices: Educating the linguistic minorities for the 21st century. New York: Newbury House.

Trueba, H.T. (1987). Organizing classroom instruction in specific sociocultural contexts: Teaching Mexican youth to write in English. In S.R. Goldman & H.T. Trueba (Eds.), Becoming literate in English as a second language (pp. 235-252). Norwood, NJ: Ablex.

Umesh, U.N., Peterson, R.A., & Sauber, M.H. (1989). Interjudge agreement and the maximum value of Kappa. Educational and Psychological Measurement, 49, 835-855.

Vasquez, O. (1993). A look at language as a resource: Lessons from la clase mágica. In M. B. Arias and U. Casanova (Eds.), Bilingual education: Politics, practice and research (pp. 199-224). Ninety-Second Yearbook of the National Society for the Study of Education, Pt. II. Chicago, IL: University of Chicago Press.

Weaver, L. (1995). A critical ethnography of writing instruction in a bilingual classroom. Unpublished dissertation, University of Houston, Texas.

Wilkinson, L. (1992). Statistics: Systat Manual (Version 5.2). Evanston, IL: Systat Inc.

Wong Fillmore, L., Ammon, P., McLaughlin, B. & Ammon, M.S. (1985). Learning English through bilingual instruction (Contract No. 400-80-0030). Washington, DC: National Institute of Education.

 

Appendix

Examples of Activity Structures

(From Activity Structures Observation System; Parker, Hasbrouck & Tindal, 1994)

When teacher and student behaviors are combined, they form the Activity Structures of a classroom. The following are examples of some of the various possible combinations.

Activity Structures Where Teacher Behaviors Drive Student Behaviors

#1 Teacher: Lecture (lec) Student: Listen (lis). Teacher lectures or makes a presentation using audio-visual materials while students listen and watch.

Teacher introduces and shows a video tape or a film while students watch/listen. Primary teacher teaches students about the safety rules for the playground.

#2 Teacher: Lecture (lec) Student: Perform (per). Teacher lectures or makes a presentation while students take notes.

#3 Teacher: Direct (dir) Student: Listen (lis). Teacher gives directions for the format or procedures for academic assignments while students listen (e.g. teacher giving directions about formatting a report with headings, margins, spacing, using pens only, etc.).

#4 Teacher: Direct (dir) Student: Perform (per). Teacher gives directions for the format or procedures for academic assignments while students take notes. Teacher gives an order/directive related to content/skills/subject matter and students comply (e.g. "Take out your books, turn to Chapter 12, and read pages 23-29 silently to yourselves.")

#5 Teacher: Demonstrate (dem) Student: Listen (lis). Teacher demonstrates or models a procedure, action or activity while students watch and listen (e.g. "I am going to cut out this pattern first. You just watch carefully this time and see how I do this.")

#6 Teacher: Lead (led) Student: Perform (per). Teacher leads students through a desired performance while students perform the task with or slightly behind the teacher (e.g., "I am going to cut out this pattern again and this time I want you to follow along and cut out your patterns at the same time.")

#7 Teacher: Ask (ask) Student: Perform (per). Teacher asks content/subject matter/skills related questions and students answer questions in writing, or in other non-verbal ways (e.g. "When you have figured out the answer to this problem hold up the number of fingers for your answer.")

#8 Teacher: Ask (ask) Student: Answer (ans). Teacher verbally asks questions related to subject matter/skills and students answer verbally. Teacher leads/controls a group discussion with students participating.

Activity Structures Where Teacher Behaviors Are Driven By Student Behaviors

#9 Teacher: Answer (ans) Student: Ask (ask). Student(s) lead/control a group discussion with teacher participating/facilitating. Students verbally ask content/skills/subject matter questions while teacher answers verbally.

#10 Teacher: Evaluate (ev) Student: Perform (per). Student(s) perform a task, or respond to a directive while the teacher overtly judges the correctness or quality of the response, or performance (e.g. students complete writing assignment and the teacher comments to students about their performance as he/she walks around the room). Student reads a report while the teacher overtly judges the correctness or quality of the response or performance (e.g. while student reads a report the teacher nods, smiles, and takes notes to later give to students).

#11 Teacher: Observe (obs) Student: Perform (per). Student reads a report while the teacher observes or supervises making no overt judgments.

#12 Teacher: Evaluate (ev) Student: Discover (dis). Individual student(s) engage in exploratory activity to discover the answer to a skills/subject matter related question while the teacher overtly judges the correctness or quality of their work (e.g. students each have a mechanical puzzle in front of them and are trying to "solve" it individually while the teacher moves around the room giving encouragement and feedback to students as they work).

#13 Teacher: Evaluate (ev) Student: Cooperate (cop). Groups of 2 or more student(s) work together to do a subject matter related activity and assist each other while the teacher overtly judges the correctness or quality of their work (e.g. small groups of students are working together to solve a logistics problem presented to their group in the form of a written scenario. The teacher moves from group to group making comments on their progress and offering suggestions).

#14 Teacher: Observe (obs) Student: Discover (dis). Individual student(s) engage in exploratory activity to discover the answer to a skills/subject matter related question while the teacher observes or supervises without any overt judgments about quality or accuracy.

#15 Teacher: Observe (obs) Student: Cooperate (cop). Groups of two or more student(s) work together to do a subject matter related activity and assist each other while the teacher observes or supervises without any overt judgments about quality or accuracy.

#16 Feedback (feed). Teachers give verbal feedback to students about their non-academic behavior (e.g. a discussion about students' behavior in the lunchroom and on the playground that afternoon). Teachers discipline student(s).

#17 Free time (free). Students participate in "free time" or play activities not directly related to skills/subject matter.

#18 Transition (tran). Time spent at the beginning and/or end of the day in activities not directly related to skills/subject matter such as managerial routines including taking attendance, collecting money, lunch count, cleaning desks, etc.

#19 Interruption (int). Any interruption to the regular planned classroom activity such as a parent stopping by and asking the teacher a series of questions about their child or messages given over the intercom.

#20 Outside Activity (out). Any activity outside of the classroom including time spent on the playground, in the cafeteria, in the hallway, in assemblies, etc.
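The numbered structures above amount to a lookup from behavior codes to a structure number. The sketch below shows one way an observer's interval codes might be mapped and tallied. The function and variable names are hypothetical and are not part of the ASOS materials; #14 is assumed to pair the teacher code "obs" with the student code "dis", following its description; and the sample intervals are invented for illustration.

```python
from collections import Counter

# (teacher code, student code) -> activity structure number, as listed above.
# Assumption: #14 is read as "obs" + "dis", following its description.
PAIRED_STRUCTURES = {
    ("lec", "lis"): 1, ("lec", "per"): 2, ("dir", "lis"): 3,
    ("dir", "per"): 4, ("dem", "lis"): 5, ("led", "per"): 6,
    ("ask", "per"): 7, ("ask", "ans"): 8, ("ans", "ask"): 9,
    ("ev", "per"): 10, ("obs", "per"): 11, ("ev", "dis"): 12,
    ("ev", "cop"): 13, ("obs", "dis"): 14, ("obs", "cop"): 15,
}

# Structures identified by a single code rather than a teacher/student pair.
SINGLE_STRUCTURES = {"feed": 16, "free": 17, "tran": 18, "int": 19, "out": 20}


def structure_number(teacher=None, student=None, single=None):
    """Return the activity structure number for one coded interval."""
    if single is not None:
        return SINGLE_STRUCTURES[single]
    return PAIRED_STRUCTURES[(teacher, student)]


def tally_structures(intervals):
    """Count how often each structure number occurs across coded intervals."""
    return Counter(structure_number(**interval) for interval in intervals)


# Invented intervals for illustration only (not data from the study).
example = [
    {"teacher": "lec", "student": "lis"},
    {"teacher": "lec", "student": "lis"},
    {"teacher": "ask", "student": "ans"},
    {"single": "tran"},
]
counts = tally_structures(example)
```

An unrecognized code combination raises a KeyError here, which flags a coding error rather than silently assigning a structure.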