Annual Report to the Teacher Education Accreditation Council
New York University
First Post-Continuing-Accreditation Year
June 2013

FINAL DRAFT
June 10, 2013

Department of Teaching and Learning
The Steinhardt School of Culture, Education, and Human Development
New York University

CONTENTS

INTRODUCTION
EVIDENCE BASE
PROGRAM OPTIONS
UPDATED PROGRAM OUTCOMES
    DRSTOS-R
    New York State Teacher Certification Exams (NYSTCE)
    Student Teacher End-of-Term Feedback Surveys (ETFQ)
    Educational Beliefs Multicultural Awareness Scale (EBMAS)
    Grade Point Averages
    Program Exit and Follow-Up Surveys
    Graduate Employment and Retention
Appendix A. NYCDOE Teacher Education Program Report: NYU
Appendix B. EBMAS report
ATTACHMENT (APPENDIX E)

Tables

Table 1. Program options, completers, and enrollments
Table 2. Percentage of late-placement student teachers meeting standards on the Domain Referenced Student Teacher Observation Scale-Revised (DRSTOS-R) by academic year
Table 3. Summary of performance on DRSTOS-R Total Scores for student teachers in their last placements by program certification areas, fall 2010 – spring 2012
Table 4. Mean scaled scores, effect sizes, and passing rates for teacher-education graduates on the NYSTCE exams: Classes of 2011 & 2012
Table 5. Mean scores on the ETFQ Claim Scales for teacher-education students in last student teaching placements (Classes of 2011 and 2012)
Table 6. Mean EBMAS scale scores by degree and year compared to the program standard of 4.50
Table 7. Mean GPAs of NYU BS & MA teacher education graduates by claims (Class of 2012)
Table 8. Numbers and percents of Steinhardt teacher-education program completers who reported on the Program Exit Survey that their programs prepared them very or moderately well to begin teaching: Classes of '11 & '12
Table 9. Numbers and percents of Steinhardt teacher-education program completers who reported on the One-Year Follow-Up Survey that their programs prepared them very or moderately well to begin teaching: Classes of '11 & '12
Table 10. Comparison of the demographics of NYC schools in which NYU graduates first taught and all NYC schools disaggregated by school type (Classes of 2006 – '11)
Table 11. Retention status as of Sept. 2012 of Steinhardt graduates who began teaching in NYC public schools within one year of graduation (Classes of 2006 - 2011)
Figure 1. Mean scaled scores for NYSTCE Content Specialty Tests (BS graduates 2011 – 12)
Figure 2. Mean scaled scores for NYSTCE Content Specialty Tests (MA graduates 2011 – 12)

INTRODUCTION

Our teacher education program was granted continuing accreditation on June 11, 2012; this is the first annual report submitted by NYU's Steinhardt School of Culture, Education, and Human Development to the Teacher Education Accreditation Council (TEAC). This report focuses on our self-study activities for the two academic years, 2010 – 11 and 2011 – 12. Prepared according to the specifications on TEAC's web site, the report includes (1) an update of Appendix E, the evidence that the program's self-study relies upon, (2) an update of the Table of Program Options, including student enrollment, graduation numbers, and descriptions of three program options developed since the Brief, and (3) updates of the data tables in the Results Section of the Inquiry Brief that support the program's claims. In addition, the report describes changes in the measures that are being used in the ongoing self-study of the effectiveness of Steinhardt's teacher education programs and the most recent results from the analyses of data.
Analyses of the new data indicate that Steinhardt's teacher education program continues to meet the claims for the development of competent, caring, and confident teachers who are committed to working in inner-city schools.

Using data to inform program planning and improvement by faculty and administrators is a cultivated tradition at Steinhardt. Toward that end, the findings of this report have been shared and discussed at several faculty venues. First, the findings disaggregated by Teacher Education Program Areas were presented to the faculty of Teaching and Learning at the April 10, 2013 department meeting. Second, detailed results were delivered in a PowerPoint presentation at a meeting of the Teaching and Learning faculty on April 15th; the presentation was then emailed to all Teacher Education Faculty for follow-up discussions at individual program area meetings. Last, the data were reviewed at the April 17th meeting of the Teacher Education Working Group (TEWG), which discussed areas of concern that will be addressed in the fall. TEWG will also plan an “internal audit” during the 2013-2014 academic year.

THE EVIDENCE BASE

The updated Appendix E (see attachment) details the evidence that NYU continues to collect, analyze, and report to faculty to assess the effectiveness of its teacher education program and inform continuous program improvement. The core of the evidence base remains in place, with some modifications and additions to measures and methods. These changes are prompted by the institution of new teacher evaluation and certification systems by the New York State Education Department (NYSED), new investments in assessment and accountability at Steinhardt, and the results of ongoing research on the measurement of teacher and program effectiveness.

First, the new NYSED systems are upgrading the data that will be available as evidence in several ways.
Beginning in 2014, aspiring teachers will have to pass the Teacher Performance Assessment (edTPA) in order to obtain initial certification. The data from the edTPA will provide a rich measure of our students' practice-based skills benchmarked against a large national database. We plan on using the edTPA to supplement the home-grown DRSTOS-R, which has become institutionalized at Steinhardt as a valid, reliable, and useful measure of developing pedagogical proficiency. The new teacher evaluation system will yield effectiveness ratings for graduates teaching in New York State public schools, part of which will be based on a new Growth Percentile Measure (GPM) that uses the standardized test performance of the pupils they teach. We are negotiating a process to obtain the effectiveness ratings and GPM for our graduates in the same way that we obtained Value-Added Modeling data from the New York City Department of Education for the most recent Brief.

Second, building on the foundation provided by Steinhardt's Center for Research on Teaching and Learning (CRTL) in the assessment and evaluation of our teacher education program, Steinhardt is launching a new Center for Research on Higher Education Outcomes (CRHEO) this summer. CRHEO will continue to maintain the extant CRTL evidence base while exploring and developing new methods and measures to assess the effectiveness of clinically based training programs. As part of the efforts to expand the evidence base with authentic performance-based assessments, Steinhardt is conducting due diligence on electronic portfolio systems for an anticipated pilot in the near future.

Third, this year for the first time, the New York City Department of Education (NYCDOE) shared data that they have collected on NYU graduates teaching in the NYC public schools. One caveat is that the NYCDOE report covers all graduates of NYU, including those who were not educated at the Steinhardt School.
Nevertheless, these data represent an independent examination and confirm our conclusions that we are preparing high-quality teachers. These data can be found in Appendix A.

Last, Steinhardt continues to conduct research on the validity and reliability of its measures for assessing teacher and program effectiveness. CRTL has been able to mine the large and rich database that it has built to study the psychometric properties of its measures, resulting in improvements in the quality and usefulness of the data. An example is the recently completed study of CRTL's Educational Beliefs and Multicultural Attitudes Scale (EBMAS), which found new evidence supporting its validity and reliability. A copy of this article can be found in Appendix B.

PROGRAM OPTIONS

Table 1 displays the list of program options, including for each the number of completers for the Class of 2012, which includes graduates in September 2011 and January and May 2012, and the number of students enrolled in fall 2012. Three new program options have been added since the Brief: Clinically-Based English Education (CREE), Clinically-Rich Integrated Science (CRISP), and Teaching Spanish as a Foreign Language and Teaching English as a Second Language, a joint program with the Graduate School of Arts and Sciences (GSAS). These new program options are described below.

CREE is a residency-based program designed to provide intensive fieldwork, combined with campus and online work, to prepare highly qualified teachers of English Language Arts. The program has been developed in partnership with the Great Oaks Foundation, which hosts the residencies in its charter schools (with plans to expand to other schools). Great Oaks is a not-for-profit educational foundation whose mission is to advance quality public education for poor children and families. It currently operates a charter school in Newark, NJ.
This program leads to an Advanced Certificate, for initial certification in Teaching English, as well as an MA, for professional certification in Teaching English.

CRISP is a residency-based MA leading to initial/professional teacher certification. The program is designed to root teacher education deeply in the daily life of schools struggling to teach students challenged by poverty and special needs, while at the same time connecting both residents and their school-based mentors to the best practices of science and of science education at NYU and throughout the city. Its power to do these things is not just a matter of design, however. It derives too from the substantial history of collaboration between NYU and its partner schools. The CRISP design has four major and integrated components: (1) an intensive clinical experience mentored by both school and university faculty in a paid teaching residency within one of three high-need NYU partner schools, referred to here as host schools; (2) rigorous coursework taken in the public schools, drawing on the materials and expertise of the host schools and their immediate neighborhood (e.g., social service agencies, non-formal science centers, and other nearby schools); (3) rigorous coursework that draws on the learning resources of a well-networked research university (e.g., science departments, ties to the larger science community within and beyond New York, and research in the learning and health sciences); and (4) a performance assessment system based on the prospective New York State Teacher Education Standards, and drawing on the work in progress of the Teacher Performance Assessment Consortium. Across the four components is an overarching emphasis on the use of technology to enhance agency, mentoring, and validity (the development of skills and knowledge that lead to pupil learning) in teacher preparation.

The M.A.
degree in Teaching Spanish as a Foreign Language and Teaching English as a Second Language (TESOL) provides students with top-rate professional training in three areas: 1) mastery of the Spanish language; 2) teaching Spanish as a foreign language; and 3) teaching English as a Second Language (ESL). The M.A. leads to dual teacher certification in Teaching a Foreign Language (Grades 7-12) and TESOL (All Grades). The program entails two years of study. Year One takes place in Madrid, Spain, where students study Spanish language and participate in a teaching assistantship as English teachers in Spanish schools. Year Two takes place in New York City, where students take coursework in language education and TESOL, and complete student teaching placements in New York City schools. This Master's degree program is a joint offering of The Steinhardt School's Department of Teaching and Learning, the Graduate School of Arts and Sciences' Department of Spanish and Portuguese, and NYU in Madrid. As such, students gain the advantages available from the expertise of faculty in both Schools and both locations.

Table 1.
Program options, completers, and enrollments

| Option Name | Level (UG, Grad) | N completers (Class of 2012) | N enrolled (Fall 2012) |
|---|---|---|---|
| Teaching Educational Theatre, All Grades | UG, grad | 26 | 56 |
| Teaching Music, All Grades | UG, grad | 29 | 104 |
| Teaching Dance, All Grades | Grad | 17 | 26 |
| Teaching Art, All Grades | Grad | 25 | 21 |
| Childhood Education | UG, grad | 6 | 15 |
| Early Childhood Education | UG, grad | 1 | 10 |
| Teaching English, 7-12 | UG, grad | 32 | 74 |
| Teaching a Foreign Language 7-12 (Chinese, French, German, Hebrew, Italian, Japanese, Latin, Russian, or Spanish) | UG, grad | 14 | 32 |
| Science Education (Teaching Biology, Chemistry, Physics & Earth Science, 7-12) * | UG, grad | 6 | 22 |
| Teaching Mathematics, 7-12 * | UG, grad | 47 | 52 |
| Teaching Social Science, 7-12 | UG, grad | 22 | 41 |
| Bilingual Education * | Grad | 0 | 0 |
| Literacy (B-6, 5-12) | Grad | 17 | 20 |
| Teachers of English to Speakers of Other Languages * | Grad | 17 | 25 |
| Special Education: Childhood * | Grad | 4 | 11 |
| Special Education: Early Childhood * | Grad | 2 | 3 |
| Dual Certification: Educational Theatre, All Grades & English Education, 7-12 | Grad | 13 | 15 |
| Dual Certification: Educational Theatre, All Grades & Social Studies, 7-12 | Grad | 7 | 5 |
| Dual Certification: Childhood Education/Childhood Special Education * | UG, grad | 84 | 182 |
| Dual Certification: Early Childhood Education/Early Childhood Special Education * | UG, grad | 47 | 102 |
| Teaching French as a Foreign Language/TESOL Joint Degree GSAS * | Grad | 9 | 14 |
| Teaching Spanish as a Foreign Language/TESOL Joint Degree GSAS (New) * | Grad | | 9 |
| Clinically Based English Education (New) | Grad | | 7 |
| Clinically Rich Integrated Science (New) | Grad | | 18 |
| Total N | | 449 | 906 |

*High-need areas

UPDATED PROGRAM OUTCOMES

DRSTOS-R

Table 2 presents DRSTOS-R ratings for students in their final student teaching placement for the Classes of 2011 and 2012, for a total of 318 BS students and 430 MA students. The total results across the two classes parallel those in the Brief for both BS and MA student teachers.
As in the Brief, the percentages of MA students continue to meet or exceed the program standard of 70% with a mean of at least 3.0 for all four domains and the Total Scale. The BS students continued to fall below the program standard for all domains except Professional Responsibilities. However, the BS students did show large gains of about 10 percentage points for the three scales and overall, scoring at or above 70% for all domains for the first time since the scale was first administered in 2005. Disaggregated results by program options are displayed in Table 3. For BS students, the program standard was met for only two groups, science and social studies, with only three students assessed for the latter. For MA students, the program standard was met for 10 of the 12 program options, with social studies and dance showing the highest percentages scoring means of at least 3.0.

Table 2. Percentage of Late-Placement Student Teachers Meeting Standards on the Domain Referenced Student Teacher Observation Scale Revised (DRSTOS-R) by Academic Year (See notes and footnotes below the table.)
N = Total (N); % = % Meeting Standards (Mean >= 3.0)

| Degree | Scale Domain | Claims | Number of Items | N, 2010 - 2011 | %, 2010 - 2011 | N, 2011 - 2012 | %, 2011 - 2012 | N, Total | %, Total** |
|---|---|---|---|---|---|---|---|---|---|
| BS Students | Planning & Preparation | 1 | 6 | 158 | 74.1% | 160 | 68.1% | 318 | 71.1% |
| BS Students | Classroom Environment | 3,4 | 7 | 158 | 75.3% | 160 | 69.4% | 318 | 72.3% |
| BS Students | Instruction | 2 | 7* | 158 | 76.6% | 160 | 71.3% | 318 | 73.9% |
| BS Students | Professional Responsibilities | CCT Learning to Learn | 3 | 158 | 81.6% | 160 | 83.1% | 318 | 82.4% |
| BS Students | Total Score | 3 | 21* | 158 | 72.2% | 160 | 66.9% | 318 | 69.5% |
| MA Students | Planning & Preparation | 1 | 6 | 210 | 80.0% | 220 | 76.8% | 430 | 78.4% |
| MA Students | Classroom Environment | 3,4 | 7 | 210 | 80.5% | 220 | 79.5% | 430 | 80.0% |
| MA Students | Instruction | 2 | 7* | 210 | 80.0% | 220 | 81.8% | 430 | 80.9% |
| MA Students | Professional Responsibilities | CCT Learning to Learn | 3 | 210 | 86.7% | 220 | 90.0% | 430 | 88.4% |
| MA Students | Total Score | 3 | 21* | 210 | 78.1% | 220 | 78.2% | 430 | 78.1% |

Notes. Scale is (1) Not Yet Proficient (2) Partially Proficient (3) Entry Level Proficient (4) Proficient. The standard for proficiency is 3.
* Two additional items were added to “Instruction” in spring 2012, increasing the number of items from 5 to 7.
** Values in bold font meet the program standard of 80% >=3; values in bold italics fall within the 95% confidence interval around the standard, which means they are not significantly lower than the standard, p<.05.

Table 3. Summary of performance on DRSTOS-R Total Scores for student teachers in their last placements by program certification areas, fall 2010 – spring 2012

| Level | Program * | N Assessed | % >=3** | M | SD |
|---|---|---|---|---|---|
| Undergraduate | Dual Early Childhood | 45 | 68.9% | 3.23 | 0.54 |
| Undergraduate | Dual Childhood | 169 | 74.0% | 3.30 | 0.54 |
| Undergraduate | Ed. Theatre | 18 | 66.7% | 3.22 | 0.43 |
| Undergraduate | English | 27 | 59.3% | 3.12 | 0.44 |
| Undergraduate | Math | 5 | 20.0% | 2.80 | 0.18 |
| Undergraduate | MMS | 5 | 60.0% | 3.34 | 0.49 |
| Undergraduate | Music | 26 | 57.7% | 3.11 | 0.40 |
| Undergraduate | Science | 3 | 100.0% | 3.98 | 0.03 |
| Undergraduate | Social Studies | 19 | 78.9% | 3.34 | 0.39 |
| Graduate | Early Childhood/Dual Early Childhood | 48 | 83.3% | 3.34 | 0.46 |
| Graduate | Childhood/Dual Childhood | 86 | 73.3% | 3.27 | 0.45 |
| Graduate | Science | 14 | 78.6% | 3.56 | 0.29 |
| Graduate | English | 42 | 76.2% | 3.28 | 0.49 |
| Graduate | Social Studies | 27 | 92.6% | 3.53 | 0.36 |
| Graduate | Math | 17 | 70.6% | 3.09 | 0.34 |
| Graduate | MMS | 66 | 81.8% | 3.51 | 0.51 |
| Graduate | Educational Theatre | 47 | 70.2% | 3.32 | 0.41 |
| Graduate | Art | 39 | 74.4% | 3.19 | 0.31 |
| Graduate | Dance | 22 | 100.0% | 3.58 | 0.20 |
| Graduate | Music | 21 | 66.7% | 3.06 | 0.51 |

** Values in bold font meet the program standard of 80% >=3; values in bold italics fall within the 95% confidence interval around the standard, which means they are not significantly lower than the standard, p<.05.

New York State Teacher Certification Exams (NYSTCE)

Table 4 displays the results of the performance of graduates on the NYSTCE exams in 2011 and 2012. Consistent with the findings reported in the Brief, graduates continue to show strong performance on the three sets of exams, exceeding the dual program standards of 90% passing and an effect size of at least 0.80, indicating that the mean scale score exceeded passing to a large and educationally meaningful degree. Figures 1 and 2 display the mean scores on the major Content Specialty Tests (CSTs) for BS and MA graduates, respectively, in 2011 and 2012 combined. As can be seen, the mean scores of both BS and MA Steinhardt students exceeded the passing score of 220 for all CSTs, although there were large differences among the specialty areas, with mean scores in math exceeding all other areas for both degree options.

Table 4.
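Mean scaled scores, effect sizes, and passing rates for teacher-education graduates on the NYSTCE exams: Classes of 2011 & 2012

| Exam/Claim | Degree | Class | N | Scaled score Mean | Scaled score SD | Effect size* | % Passing** |
|---|---|---|---|---|---|---|---|
| LAST/Cross-cutting theme 1, Learning to Learn | BS | 2011 | 73 | 271.6 | 15.4 | 3.35 | 100.0% |
| LAST/Cross-cutting theme 1, Learning to Learn | BS | 2012 | 113 | 264.2 | 21.2 | 2.09 | 92.9% |
| LAST/Cross-cutting theme 1, Learning to Learn | BS | Total | 186 | 267.1 | 19.4 | 2.43 | 95.7% |
| LAST/Cross-cutting theme 1, Learning to Learn | MA | 2011 | 222 | 268.4 | 23.1 | 2.10 | 95.5% |
| LAST/Cross-cutting theme 1, Learning to Learn | MA | 2012 | 187 | 270.4 | 18.3 | 2.76 | 98.9% |
| LAST/Cross-cutting theme 1, Learning to Learn | MA | Total | 409 | 269.3 | 21.0 | 2.34 | 97.1% |
| ATS-W/Claim 2, Pedagogical Knowledge | BS | 2011 | 79 | 267.7 | 17.0 | 2.80 | 97.5% |
| ATS-W/Claim 2, Pedagogical Knowledge | BS | 2012 | 108 | 269.0 | 20.0 | 2.45 | 94.4% |
| ATS-W/Claim 2, Pedagogical Knowledge | BS | Total | 186 | 267.1 | 19.4 | 2.43 | 95.7% |
| ATS-W/Claim 2, Pedagogical Knowledge | MA | 2011 | 228 | 269.0 | 15.7 | 3.11 | 98.7% |
| ATS-W/Claim 2, Pedagogical Knowledge | MA | 2012 | 197 | 270.4 | 13.9 | 3.64 | 99.5% |
| ATS-W/Claim 2, Pedagogical Knowledge | MA | Total | 425 | 269.7 | 14.9 | 3.33 | 99.1% |
| CST/Claim 1, Content Knowledge | BS | 2011 | 126 | 255.0 | 21.2 | 1.65 | 94.4% |
| CST/Claim 1, Content Knowledge | BS | 2012 | 163 | 249.8 | 23.0 | 1.30 | 91.4% |
| CST/Claim 1, Content Knowledge | BS | Total | 289 | 252.1 | 22.4 | 1.43 | 92.7% |
| CST/Claim 1, Content Knowledge | MA | 2011 | 381 | 254.2 | 25.9 | 1.32 | 90.3% |
| CST/Claim 1, Content Knowledge | MA | 2012 | 326 | 254.1 | 23.0 | 1.48 | 92.0% |
| CST/Claim 1, Content Knowledge | MA | Total | 707 | 254.1 | 24.6 | 1.39 | 91.1% |

* ES = Effect Size = SDs Above Passing = (MSS - 220)/SD; the program standard is an ES >= .80, large and meaningful.
** Passing score = 220 on a scale of 100 – 300. The program standard is 90% passing.
*** If a student has multiple tests, data are based on the most recent exam.

Figure 1
Note: Ns are as follows: Math = 15, El. Ed. = 94, Social Studies = 15, Studs. With Disabilities = 93, Music = 23, English = 23

Figure 2
Note: Ns are as follows: Math = 56, Literacy = 19, For. Lang. = 67, El. Ed. = 131, ESOL = 72, Studs. With Disabilities = 120, English = 53, Science = 20, Theater = 49, Visual Arts = 39, Social Studies = 24, Dance = 32, Music = 20

Student Teacher End-of-Term Feedback Surveys (ETFQ)

Table 5 displays the results of the assessment of Claims 1, 2, and 3 for the Classes of 2011 and 2012 using ETFQ data. The total mean scores for each of the three claim scales met the criterion of 4.0 (nominally equivalent to a rating of “Well”) for both BS and MA program finishers.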
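As a rough illustration of two calculations used in this part of the report, the sketch below computes the effect size defined in the notes to Table 4, (MSS - 220)/SD, and compares a scale mean with a fixed criterion such as the ETFQ criterion of 4.0. The function names are hypothetical, and the large-sample normal (z) approximation is an assumption made for the sketch; the report does not state which significance test was used.

```python
from math import sqrt
from statistics import NormalDist


def effect_size(mean_scaled_score, sd, passing_score=220.0):
    """SDs above passing, per the Table 4 notes: (MSS - 220) / SD."""
    return (mean_scaled_score - passing_score) / sd


def compare_mean_to_standard(mean, sd, n, standard, alpha=0.05):
    """Classify an observed mean against a fixed program standard.

    Assumption: a two-sided large-sample z test stands in for whatever
    test the report's authors actually applied.
    """
    if mean >= standard:
        return "meets standard"
    z = (mean - standard) / (sd / sqrt(n))        # standardized shortfall
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = .05
    if abs(z) < z_crit:
        return "not significantly different"
    return "below standard"


if __name__ == "__main__":
    # BS Class of 2011 on the LAST (Table 4): mean 271.6, SD 15.4
    print(round(effect_size(271.6, 15.4), 2))
    # BS total on ETFQ Claim 1 (Table 5): mean 3.92, SD 0.90, N 338
    print(compare_mean_to_standard(3.92, 0.90, 338, standard=4.0))
```

With the Table 4 values for the BS Class of 2011 on the LAST, the first function returns 3.35 after rounding, which agrees with the tabled effect size.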
For MA students, the means exceeded the program standard on all three claim scales, while for BS students, the means exceeded the standard for Claims 2 and 3 and the mean for Claim 1 was not statistically significantly different from the standard. These results are consistent with the findings in the Brief and indicate that program completers continue to meet program standards on these two measures in the two years following the accreditation study.

Table 5. Mean scores on the ETFQ Claim Scales for teacher-education students in last student teaching placements (Classes of 2011 and 2012)

| Claim Scale | Statistic | 2011 BS | 2011 MA | 2012 BS | 2012 MA | Total BS | Total MA |
|---|---|---|---|---|---|---|---|
| Claim 1. Content Knowledge | Mean | 3.91 | 4.05 | 3.93 | 4.10 | 3.92 | 4.07 |
| Claim 1. Content Knowledge | N | 206 | 343 | 132 | 191 | 338 | 534 |
| Claim 1. Content Knowledge | SD | 0.86 | 0.90 | 0.96 | 0.96 | 0.90 | 0.92 |
| Claim 2. Pedagogical Knowledge | Mean | 4.00 | 4.12 | 4.11 | 4.22 | 4.04 | 4.16 |
| Claim 2. Pedagogical Knowledge | N | 206 | 343 | 132 | 191 | 338 | 534 |
| Claim 2. Pedagogical Knowledge | SD | 0.92 | 0.84 | 0.81 | 0.81 | 0.88 | 0.83 |
| Claim 3. Clinical Knowledge | Mean | 4.08 | 4.18 | 4.08 | 4.20 | 4.08 | 4.19 |
| Claim 3. Clinical Knowledge | N | 206 | 343 | 132 | 191 | 338 | 534 |
| Claim 3. Clinical Knowledge | SD | 0.82 | 0.82 | 0.86 | 0.84 | 0.83 | 0.83 |

Notes: Claim 1 scale items are Items 9 and 18; Claim 2 scale items are Items 7 and 15; and Claim 3 scale items are Items 8, 11, 16, and 19. Items are measured on a 5-point Likert scale with scale values of (1) “Very Poorly”, (2) “Poorly”, (3) “Average”, (4) “Well”, and (5) “Very Well”.
* Total means in bold meet the program standard of 4.0; means in bold italics are not significantly different from the program standard of 4.0.

Educational Beliefs Multicultural Awareness Scale (EBMAS)

Table 6 displays the comparison of mean EBMAS scale scores against the program standard of 4.5 for BS and MA program finishers in the Classes of 2011 and 2012. Continued research on the EBMAS found a factor structure that was different from the one that emerged in earlier analyses and led to a change in the scoring to yield five scale scores associated with three claims rather than the four reported in the Brief.
Since the new scoring structure was based on research using a much larger sample size and was better aligned with the theory underlying the construction of EBMAS, the five-score structure will be used in subsequent assessments. As shown in the table, two scales, Personal Teacher Efficacy 1 (PTE 1) and Personal Teacher Efficacy 2 (PTE 2), are associated with Claim 3; two, General Teacher Efficacy (GTE) and Social Justice (SJ), with Claim 4; and one, Multicultural Awareness (MA), with Cross-Cutting Theme 2. For both BS and MA students, all observed means either met or were not statistically significantly different from the program standard of 4.50, thereby supporting the claims. The highest mean scores were for the Multicultural Awareness and Social Justice scales and the lowest for PTE 1 and PTE 2, especially for BS students. Overall, the results were better than those in the Brief, suggesting progress in this measure during the two years since the accreditation study (see Appendix B).

Table 6. Mean EBMAS scale scores by degree and year compared to the program standard of 4.50

| Scale (Claim)** | Year | BS N | BS Mean | BS SD | BS M - 4.50* | MA N | MA Mean | MA SD | MA M - 4.50* |
|---|---|---|---|---|---|---|---|---|---|
| PTE 1. Personal Efficacy: Student Problem Solving (Claim 3) | 2010 - 11 | 54 | 4.70 | 0.66 | 0.20 | 114 | 4.40 | 0.80 | -0.10 |
| PTE 1. Personal Efficacy: Student Problem Solving (Claim 3) | 2011 - 12 | 68 | 4.55 | 0.88 | 0.05 | 120 | 4.51 | 0.67 | 0.01 |
| PTE 1. Personal Efficacy: Student Problem Solving (Claim 3) | Total | 122 | 4.61 | 0.78 | 0.11 | 234 | 4.46 | 0.73 | -0.04 |
| PTE 2. Personal Efficacy: Student Success (Claim 3) | 2010 - 11 | 54 | 4.25 | 0.76 | -0.25 | 114 | 4.36 | 0.70 | -0.14 |
| PTE 2. Personal Efficacy: Student Success (Claim 3) | 2011 - 12 | 68 | 4.24 | 0.67 | -0.26 | 119 | 4.49 | 0.64 | -0.01 |
| PTE 2. Personal Efficacy: Student Success (Claim 3) | Total | 122 | 4.24 | 0.71 | -0.26 | 233 | 4.42 | 0.67 | -0.08 |
| GTE. General Teacher Efficacy (Claim 4) | 2010 - 11 | 54 | 4.94 | 0.90 | 0.44 | 114 | 4.77 | 0.95 | 0.27 |
| GTE. General Teacher Efficacy (Claim 4) | 2011 - 12 | 68 | 5.02 | 0.83 | 0.52 | 120 | 4.96 | 0.88 | 0.46 |
| GTE. General Teacher Efficacy (Claim 4) | Total | 122 | 4.99 | 0.86 | 0.49 | 234 | 4.87 | 0.91 | 0.37 |
| MA, Multicultural Awareness (CCT 2) | 2010 - 11 | 54 | 5.54 | 0.57 | 1.04 | 114 | 5.35 | 0.68 | 0.85 |
| MA, Multicultural Awareness (CCT 2) | 2011 - 12 | 68 | 5.50 | 0.66 | 1.00 | 120 | 5.50 | 0.49 | 1.00 |
| MA, Multicultural Awareness (CCT 2) | Total | 122 | 5.52 | 0.62 | 1.02 | 234 | 5.42 | 0.58 | 0.92 |
| SJ, Social Justice (Claim 4) | 2010 - 11 | 54 | 5.37 | 0.51 | 0.87 | 114 | 5.28 | 0.70 | 0.78 |
| SJ, Social Justice (Claim 4) | 2011 - 12 | 68 | 5.34 | 0.61 | 0.84 | 119 | 5.39 | 0.52 | 0.89 |
| SJ, Social Justice (Claim 4) | Total | 122 | 5.36 | 0.56 | 0.86 | 233 | 5.34 | 0.61 | 0.84 |

* Values in bold font indicate the program standard of 4.5 has been met or exceeded; values in bold italics are not significantly different from the program standard.
** TEAC Claims: Claim 3, Clinical Competence; Claim 4, Caring Professional; CCT 2, Multicultural Perspective.
Responses are measured on a 6-point scale of agreement as follows: (1) Strongly Disagree (2) Moderately Disagree (3) Slightly Disagree (4) Slightly Agree (5) Moderately Agree (6) Strongly Agree.

Grade Point Averages

Table 7 presents the mean values for four types of GPAs associated with three claims and Cross Cutting Theme 1 (CCT1), Learning to Learn, for BS and MA graduates in the class of 2012. As can be seen in the table, the program standard of 3.0 was exceeded for all three claims by both BS and MA students; for CCT1, MA students exceeded the standard while the mean for BS students did not differ significantly from the standard. The findings for the claims are consistent with those reported in the Brief and better than those in the Brief for CCT1.

Table 7.
Mean GPAs of NYU BS & MA teacher education graduates by claims (Class of 2012)

| Claim | GPA* | Statistic | BS** | MA** |
|---|---|---|---|---|
| 1 | CK | Mean | 3.04 | 3.46 |
| 1 | CK | SD | 0.69 | 0.50 |
| 1 | CK | N | 137 | 61 |
| 2 | PK | Mean | 3.56 | 3.84 |
| 2 | PK | SD | 0.32 | 0.40 |
| 2 | PK | N | 137 | 325 |
| 3 | CS | Mean | 3.85 | 3.87 |
| 3 | CS | SD | 0.36 | 0.32 |
| 3 | CS | N | 120 | 283 |
| Cross-cutting theme 1 | CCT1 | Mean | 2.92 | 3.46 |
| Cross-cutting theme 1 | CCT1 | SD | 0.90 | 0.50 |
| Cross-cutting theme 1 | CCT1 | N | 105 | 61 |

* Types of GPA: CK=Content Knowledge; PK=Pedagogical Knowledge; CS=Clinical Skill; CCT1=Cross Cutting Theme, Learning to Learn
** Means in bold font meet or exceed the program standard of 3.0 on a 4-point scale; means in bold italics do not differ significantly from the standard of 3.0.

Program Exit and Follow-Up Surveys

Tables 8 and 9 display the results of two surveys administered to BS and MA program completers and graduates, respectively, in the classes of 2011 and 2012. The Program Exit Survey (Table 8) is administered in May to program completers to elicit their perceptions of the extent to which the program prepared them to begin teaching. The One-Year Follow-Up Survey (Table 9) is administered to the same samples eight months after graduation to assess their preparation for teaching after most had entered the teaching profession. In both tables, the items are clustered by the claims and cross-cutting themes they address.

As can be seen in Table 8, at program exit BS students met the program standard of 80% feeling “Well” or “Very Well” prepared to begin teaching with respect to Content Knowledge, Clinical Skill, and Cross-Cutting Theme 2, Multicultural Perspective. They met the standard for four of the six items related to Pedagogical Knowledge, falling short on “addressing the needs of students with limited English proficiency” and “working with parents”, and two of the three items related to Caring Professionals. Overall, the results for MA students were not as strong as those for the BS students.
They met the standard for all items related to Content Knowledge, Clinical Skill, and Cross-Cutting Theme 2, but met the standard for only three of the six Pedagogical Knowledge items and none of the Caring Professional items. Neither BS nor MA students met the standard for Cross-Cutting Theme 3, Knowledge of Technology.

Table 8. Numbers and percents of Steinhardt teacher-education program completers who reported on the Program Exit Survey that their programs prepared them very or moderately well to begin teaching: Classes of '11 & '12

Entries are N (%) of respondents who answered Very Well (VW, 4) or Moderately Well (MW, 3).

| Claim | How well did your teacher education program prepare you to: | BS (N = 90) VW (4) | BS MW (3) | BS (4+3)* | MA (N = 141) VW (4) | MA MW (3) | MA (4+3)* |
|---|---|---|---|---|---|---|---|
| 1. Content knowledge | Have a mastery of your subject area | 35 (38.9%) | 39 (43.3%) | 74 (82.2%) | 55 (39.6%) | 48 (34.5%) | 103 (74.1%) |
| 1. Content knowledge | Implement state/district curriculum & standards | 45 (50.0%) | 36 (40.0%) | 81 (90.0%) | 64 (46.0%) | 44 (31.7%) | 108 (77.7%) |
| 2. Pedagogical knowledge | Understand how students learn | 47 (52.2%) | 38 (38.7%) | 85 (90.9%) | 75 (53.6%) | 51 (36.4%) | 126 (90.0%) |
| 2. Pedagogical knowledge | Use different pedagogical approaches | 43 (49.4%) | 35 (40.2%) | 78 (89.6%) | 76 (54.3%) | 46 (32.9%) | 122 (87.2%) |
| 2. Pedagogical knowledge | Use student performance assessment techniques | 47 (53.4%) | 32 (36.4%) | 79 (89.8%) | 66 (47.5%) | 43 (30.9%) | 109 (78.4%) |
| 2. Pedagogical knowledge | Address needs of students with disabilities | 32 (35.6%) | 43 (47.8%) | 75 (83.4%) | 47 (34.1%) | 47 (34.1%) | 94 (68.2%) |
| 2. Pedagogical knowledge | Address needs of students with limited English proficiency | 15 (16.7%) | 36 (40.0%) | 51 (56.7%) | 31 (22.5%) | 33 (23.9%) | 64 (46.4%) |
| 2. Pedagogical knowledge | Work with parents | 20 (22.5%) | 36 (40.4%) | 56 (62.9%) | 20 (14.2%) | 41 (29.1%) | 61 (43.3%) |
| 3. Clinical skill | Maintain order & discipline in the classroom | 44 (48.9%) | 36 (40.0%) | 80 (88.9%) | 56 (40.3%) | 62 (44.6%) | 118 (84.9%) |
| 3. Clinical skill | Impact my students' ability to learn | 48 (53.3%) | 33 (36.7%) | 81 (90.0%) | 62 (44.3%) | 66 (47.1%) | 128 (91.4%) |
| 4. Caring Professionals | Work collaboratively with teachers, administrators and other school personnel | 36 (40.4%) | 32 (36.0%) | 68 (76.4%) | 52 (36.9%) | 50 (35.5%) | 102 (72.4%) |
| 4. Caring Professionals | Identify & use resources within the community where you teach | 30 (33.3%) | 40 (44.4%) | 70 (77.7%) | 41 (29.3%) | 53 (37.9%) | 94 (67.2%) |
| 4. Caring Professionals | Participate as a stakeholder in the community where you teach | 15 (16.7%) | 41 (45.6%) | 56 (62.3%) | 34 (24.3%) | 46 (32.9%) | 80 (57.2%) |
| Cross-cutting theme 2 | Address needs of students from diverse cultures | 44 (49.4%) | 34 (38.2%) | 78 (87.6%) | 66 (47.5%) | 41 (29.5%) | 107 (77.0%) |
| Cross-cutting theme 3 | Integrate technology into teaching | 25 (27.8%) | 26 (28.9%) | 51 (56.7%) | 48 (34.0%) | 38 (27.0%) | 86 (61.0%) |

* Total percents in bold meet or exceed the program criterion of 80%; those in bold italics have the program criterion within the 95% confidence interval for the observed value.

Table 9. Numbers and percents of Steinhardt teacher-education program completers who reported on the One-Year Follow-Up Survey that their programs prepared them very or moderately well to begin teaching: Classes of '11 & '12

Entries are N (%) of respondents who answered Very Well (VW, 4) or Moderately Well (MW, 3).

| Claim | How well did your teacher education program prepare you to: | BS (N = 63) VW (4) | BS MW (3) | BS (4+3)* | MA (N = 120) VW (4) | MA MW (3) | MA (4+3)* |
|---|---|---|---|---|---|---|---|
| 1. Content knowledge | Have a mastery of your subject area | 28 (44.4%) | 25 (39.7%) | 53 (84.1%) | 37 (30.8%) | 55 (45.8%) | 92 (76.6%) |
| 1. Content knowledge | Implement state/district curriculum & standards | 27 (42.9%) | 26 (41.3%) | 53 (84.2%) | 27 (22.5%) | 44 (36.7%) | 71 (59.2%) |
| 2. Pedagogical knowledge | Understand how students learn | 39 (61.9%) | 15 (23.8%) | 54 (85.7%) | 45 (37.5%) | 48 (40.0%) | 93 (77.5%) |
| 2. Pedagogical knowledge | Use different pedagogical approaches | 33 (52.4%) | 24 (38.1%) | 57 (90.5%) | 44 (36.7%) | 52 (43.3%) | 96 (80.0%) |
| 2. Pedagogical knowledge | Use student performance assessment techniques | 30 (47.6%) | 26 (41.3%) | 56 (88.9%) | 34 (28.3%) | 53 (44.2%) | 87 (72.5%) |
| 2. Pedagogical knowledge | Address needs of students with disabilities | 31 (49.2%) | 16 (25.4%) | 47 (74.6%) | 32 (26.7%) | 35 (29.2%) | 67 (55.9%) |
| 2. Pedagogical knowledge | Address needs of students with limited English proficiency | 19 (30.2%) | 14 (22.2%) | 33 (52.4%) | 23 (19.2%) | 31 (25.8%) | 54 (45.0%) |
| 2. Pedagogical knowledge | Work with parents | 18 (28.6%) | 23 (36.5%) | 41 (65.1%) | 15 (12.5%) | 31 (25.8%) | 46 (38.3%) |
| 3. Clinical skill | Maintain order & discipline in the classroom | 21 (33.3%) | 20 (31.7%) | 41 (65.0%) | 15 (12.5%) | 42 (35.0%) | 57 (47.5%) |
| 3. Clinical skill | Impact my students' ability to learn | 37 (58.7%) | 19 (30.2%) | 56 (88.9%) | 40 (33.3%) | 46 (38.3%) | 86 (71.6%) |
| 4. Caring Professionals | Work collaboratively with teachers, administrators and other school personnel | 32 (50.8%) | 20 (31.7%) | 52 (82.5%) | 40 (33.3%) | 40 (33.3%) | 80 (66.6%) |
| 4. Caring Professionals | Identify & use resources within the community where you teach | 27 (42.9%) | 22 (34.9%) | 49 (77.8%) | 31 (25.8%) | 39 (32.5%) | 70 (58.3%) |
| 4. Caring Professionals | Participate as a stakeholder in the community where you teach | 22 (34.9%) | 21 (33.3%) | 43 (68.2%) | 18 (15.0%) | 43 (35.8%) | 61 (50.8%) |
| Cross-cutting theme 2 | Address needs of students from diverse cultures | 35 (55.6%) | 19 (30.2%) | 54 (85.8%) | 41 (34.2%) | 46 (38.3%) | 87 (72.5%) |
| Cross-cutting theme 3 | Integrate technology into teaching | 23 (36.5%) | 24 (38.1%) | 47 (74.6%) | 38 (31.7%) | 33 (27.5%) | 71 (59.2%) |

* Total percents in bold meet or exceed the program criterion of 80%; those in bold italics have the program criterion within the 95% confidence interval for the observed value.

As can be seen in Table 9, the BS graduates' perceptions of their preparation for teaching one year after graduation were similar to the ones they had at program exit, while the MA students felt less prepared after graduation than they did at program exit. MA students met program standards on only three of the 15 items on the Follow-Up Survey, compared to eight of 15 on the Program Exit Survey.
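The criterion check described in the notes to Tables 8 and 9 can be sketched as follows. This is a minimal illustration, assuming a normal-approximation (Wald) 95% confidence interval for a proportion; the report does not specify how its intervals were computed, so results near the boundary may not match the bolding in the original tables. The function name and the three labels are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def classify_against_criterion(n_prepared, n_respondents,
                               criterion=0.80, confidence=0.95):
    """Compare the share of respondents feeling prepared with the criterion.

    Assumption: a Wald interval, p +/- z * sqrt(p * (1 - p) / n), is used
    purely for illustration.
    """
    p = n_prepared / n_respondents
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    half_width = z * sqrt(p * (1 - p) / n_respondents)
    if p >= criterion:
        return "meets criterion"
    if p + half_width >= criterion:
        return "criterion within CI"
    return "below criterion"


if __name__ == "__main__":
    # BS follow-up counts from Table 9 (N = 63)
    for label, count in [("Use different pedagogical approaches", 57),
                         ("Integrate technology into teaching", 47),
                         ("Work with parents", 41)]:
        print(label, "->", classify_against_criterion(count, 63))
```

Under these assumptions, 47 of 63 BS respondents (74.6%) falls below 80% but keeps the criterion inside its interval, whereas 41 of 63 (65.1%) does not.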
For BS graduates, there were two noteworthy differences between their responses to the two surveys. First, whereas they fell below the standard in Technology at graduation, their perceptions were higher on the Follow-Up Survey and met the standard. Second, they met the standard on one of the two Clinical Skill items on the Follow-Up Survey, compared to two out of two at Program Exit. The results of the two surveys for 2011 and 2012 are generally in line with those reported in the Brief and suggest the need for continued work on improving the curriculum and experiences of Steinhardt teacher education students in certain areas of teaching skills.

Graduate Employment and Retention

Table 10 displays a comparison of the demographic characteristics of the NYC public schools in which Steinhardt graduates from the classes of 2006 – 11 were employed and the demographics of all NYC public schools, by school level from elementary through high school. The program standard is that the demographic characteristics of the schools of Steinhardt graduates will be statistically similar to those of NYC public schools overall. As can be seen in the table, the schools of Steinhardt graduates are highly diverse. Nevertheless, they tend to have statistically significantly lower percentages of Black and Hispanic students and of students eligible for free lunch than NYC public schools overall. On the other hand, the middle schools of Steinhardt graduates had higher percentages of ELL students. The differences in percentages of minority and poor students are largely attributable to the tendency of Steinhardt graduates to be employed in schools in District 2 in Manhattan, the district in which NYU is situated and one with lower percentages of poor and minority students than the city overall. These results are similar to those reported in the Brief.
Table 11 displays the results of an analysis of retention data obtained from the NYCDOE for Steinhardt graduates from the classes of 2006 – 11 who were employed in NYC public schools. The program standard is 70% of graduates remaining employed or leaving after serving at least three years in the NYC public schools, a standard that is better than the average for new teachers in the NYC public schools. This standard uses a single criterion as opposed to the multiple criteria, differentiated by year of graduation, that were used in the Brief. The change in the standard is intended to simplify tracking progress on this indicator, thereby increasing the reliability of inferences based on the data. As can be seen in Table 11, the standard was met for both BS and MA graduates from all classes. Overall, as of September 2012, 80% of all graduates from the classes of 2006 – 11 who taught in NYC public schools remained employed or left after serving at least three years. These results are consistent with the results reported in the Brief and are higher than the 60% overall three-year retention rate cited in a staff report from the New York City Council.¹

¹ New York City Council (July 2009). A staff report of the New York City Council Investigation Division on teacher attrition and retention. Retrieved on June 4, 2013 from http://www.nyc.gov/html/records/pdf/govpub/1024teachersal.pdf

Table 10. Comparison of the demographics of NYC schools in which NYU graduates first taught and all NYC schools disaggregated by school type (Classes of 2006 – '11)

                                             Grads' Schools     All NYC Schools    Diff. in
School Type   Demographic          N Grads   Mean      SD       Mean      SD       Means*
Elementary    % ELL                968       14.7      12.3     16.9      13.1     -2.2
              % Spec. Ed.          989       17.3      6.7      16.9      6.3      0.4
              % Black & Hispanic   989       63.1      24.2     70.8      31.3     -7.7
              % Free lunch         989       61.7      27.7     68.7      23.0     -7.0
              N Enrolled           989       655.8     284.1    639.3     277.9    16.5
Middle        % ELL                327       16.4      13.7     11.1      12.2     5.3
              % Spec. Ed.          384       17.9      6.6      16.6      7.4      1.3
              % Black & Hispanic   384       73.5      27.3     81.0      25.1     -7.5
              % Free lunch         384       68.7      20.8     68.9      19.3     -0.2
              N Enrolled           384       691.8     428.7    584.6     419.2    107.2
K-8           % ELL                14        10.6      9.9      11.6      11.0     -1.0
              % Spec. Ed.          14        13.6      8.2      16.6      6.7      -3.0
              % Black & Hispanic   14        65.4      18.7     78.3      27.4     -12.9
              % Free lunch         14        56.0      34.8     67.7      21.9     -11.7
              N Enrolled           14        502.6     129.5    684.6     290.8    -182.0
High School   % ELL                416       10.4      16.9     12.6      18.5     -2.2
              % Spec. Ed.          418       11.9      6.6      12.8      6.8      -0.9
              % Black & Hispanic   418       72.9      11.9     82.3      22.1     -9.4
              % Free lunch         418       55.9      21.5     61.4      19.9     -5.5
              N Enrolled           418       846.7     913.4    898.3     1027.2   -51.6

Note 1: School demographic data were not available for all graduates who were working in NYC public schools.
* Differences in bold italics are statistically significant at p < .05. The program standard is that the means for the percent of at-risk students in the schools of NYU graduates will be equal to or higher than the means for all NYC public schools. The standard does not apply to enrollment.

Table 11. Retention status as of Sept. 2012 of Steinhardt graduates who began teaching in NYC public schools within one year of graduation (Classes of 2006 – 2011)

                    Retention status *
Degree    Class     Left before 3 years   Still employed   Left after 3 years   Total Hired
                    N (% in class)        N (% in class)   N (% in class)       N (% in class)
BS/BMUS   2006      6 (17.6%)             18 (52.9%)       10 (29.4%)           34 (100.0%)
          2007      9 (25.7%)             22 (62.9%)       4 (11.4%)            35 (100.0%)
          2008      7 (19.4%)             28 (77.8%)       1 (2.8%)             36 (100.0%)
          2009      8 (25.0%)             23 (71.9%)       1 (3.1%)             32 (100.0%)
          2010      7 (23.3%)             23 (76.7%)       0 (0.0%)             30 (100.0%)
          2011      2 (8.7%)              21 (91.3%)       0 (0.0%)             23 (100.0%)
          Total     39 (20.5%)            135 (71.1%)      16 (8.4%)            190 (100.0%)
MA        2006      41 (23.6%)            95 (54.6%)       38 (21.8%)           174 (100.0%)
          2007      39 (19.1%)            139 (68.1%)      26 (12.7%)           204 (100.0%)
          2008      34 (20.7%)            116 (70.7%)      14 (8.5%)            164 (100.0%)
          2009      31 (20.1%)            119 (77.3%)      4 (2.6%)             154 (100.0%)
          2010      23 (17.4%)            107 (81.1%)      2 (1.5%)             132 (100.0%)
          2011      14 (13.2%)            92 (86.8%)       0 (0.0%)             106 (100.0%)
          Total     182 (19.5%)           668 (71.5%)      84 (9.0%)            934 (100.0%)

* Program standard is a total of 70% still employed or leaving after 3 years of service in NYCDOE public schools. All classes have met the standard.

APPENDIX A

APPENDIX B.

The Developing Teaching Dispositions of NYU Steinhardt's Teacher Education Students: An Analysis of Responses to the Educational Beliefs and Multicultural Attitudes Scale (EBMAS)

Data Collected in Academic Years 2009-10 – 2011-12

March 21, 2013

Robert Tobias, Research Consultant and Director (Retired) of the Center for Research on Teaching and Learning, Steinhardt's Department of Teaching and Learning

ABSTRACT

This report presents updated findings from the analysis of EBMAS, a component of NYU Steinhardt's student and program assessment system, for the three academic years 2009-10 through 2011-12. During that time, EBMAS was administered to 1,450 undergraduate and graduate students who were at the beginning, middle, or end of their pre-service teacher-education programs.
The report presents findings from continued research on EBMAS, results from the use of the scale to assess TEAC program claims, and analyses of the differences in scores for students grouped by demographic, experiential, and program characteristics. These findings update the results reported for a smaller dataset in NYU's 2011 TEAC Inquiry Brief for re-accreditation. The findings lead to the overall conclusion that EBMAS has been a valid and reliable tool for assessing the developing teaching dispositions of NYU Steinhardt teacher education students and, consequently, that the data have important implications for the graduates' readiness to teach and the effectiveness of the program in preparing competent and caring educators. NYU graduates generally have strong beliefs in the general efficacy of teaching to promote the learning and positive behavior of all pupils, value social justice, and have a strong awareness of and positive attitude toward the importance of a multicultural perspective. They also have moderate confidence in their personal efficacy to teach all students, although with less certainty than their other beliefs. The exploration of differences in scores between students grouped by demographic, experiential, and program characteristics revealed some differences that warrant discussion among program faculty and administrators. In addition, recent differences in the factor structure of the scale that emerged from PCA highlight the importance of continuing research on its psychometric properties.

CONTENTS

ABSTRACT
CONTENTS
TABLES
INTRODUCTION
BACKGROUND ON EBMAS
METHOD
    Data Collection
    Data
    Data Analysis
RESULTS
    Description of the Participants
    Validity and Reliability
    Assessment of TEAC Claims
    Differences in Scale Scores by Student Demographics and Experience
    Perceived Effectiveness of the Steinhardt Teacher Education Program
SUMMARY AND CONCLUSIONS
REFERENCES CITED

TABLES

Table 1. N and percent of students taking EBMAS by academic year
Table 2. Race/ethnicity of total sample and sample with data
Table 3. Number and percent of total sample and sample with data in certification areas
Table 4. Credits completed within degree groups for EBMAS respondents
Table 5. Summary of ANOVA for differences in EBMAS subscale scores among students at different stages of their programs (Undergraduates)
Table 6. Summary of T-test for differences in EBMAS subscale scores between new and late stage students (graduate students)
Table 7. Summary of ANOVA for tests of significance of the main and interaction effects of year and degree on EBMAS subscale scores (late stage BS and MA students in Classes of 2010 - 2012)
Table 8. Mean EBMAS subscale scores by degree and year compared to the program standard of 4.50 (Classes of 2010 - 2012)
Table 9. Summary of t-tests and ANOVAs for test of significance of differences in EBMAS subscale scores by descriptive characteristics of late-stage undergraduate student teachers (Classes of 2010 - 12)
Table 10. Homogeneous subsets of EBMAS PTE 1 means by certification area for late-stage undergraduates (Classes of 2010 - 12)
Table 11. Homogeneous subsets of EBMAS PTE 2 means by race/ethnicity for late-stage undergraduates (Classes of 2010 - 12)
Table 12. Homogeneous subsets of EBMAS GTE means by race/ethnicity for late-stage undergraduates (Classes of 2010 - 12)
Table 13. Mean EBMAS subscale scores for late-stage undergraduate students with varying types of student teaching experience (Classes of 2010 - 2012)
Table 14. Summary of t-tests and ANOVAs for test of significance of differences in EBMAS subscale scores by descriptive characteristics of late-stage graduate student teachers (Classes of 2010 - 12)
Table 15. Homogeneous subsets of EBMAS GTE means by certification area for late-stage graduate students (Classes of 2010 - 12)
Table 16. Homogeneous subsets of EBMAS MA means by certification area for late-stage graduate students (Classes of 2010 - 12)
Table 17. Homogeneous subsets of EBMAS PTE 1 means by race/ethnicity for late-stage graduate students (Classes of 2010 - 12)
Table 18. Homogeneous subsets of EBMAS GTE means by race/ethnicity for late-stage graduate students (Classes of 2010 - 12)
Table 19. Homogeneous subsets of EBMAS SJ means by race/ethnicity for late-stage graduate students (Classes of 2010 - 12)
Table 20. Mean EBMAS subscale scores for late-stage graduate students by significant descriptive characteristics (Classes of 2010 - 2012)
Table 21. Summary of ANOVA comparing mean scores of late-stage undergraduates to question 17 for the three Classes of 2009-10 – 2011-12
Table 22. Summary of ANOVA comparing mean scores of late-stage graduate students to question 17 for the three Classes of 2009-10 – 2011-12
Table 23. Mean scores of late-stage graduate students to question 17 of EBMAS (Classes of 2009-10 – 2011-12)
Table 24. Mean scores of late-stage graduate students to question 17 of EBMAS (Classes of 2009-10 – 2011-12)

INTRODUCTION

One component of NYU Steinhardt's comprehensive system for assessing the development of its teacher education students and the effectiveness of its teacher education program is the Educational Beliefs and Multicultural Attitudes Scale (EBMAS). NYU recognizes that the qualities of a competent and caring teacher go beyond knowledge of subject matter and pedagogical skills that can be tested and observed. Also important are beliefs and attitudes toward teaching, learners and learning, and cultural communities (what Burant et al. (2007) refer to as teaching dispositions), as well as beliefs in one's teaching efficacy. EBMAS was developed by Steinhardt's Center for Research on Teaching and Learning (CRTL) to measure these unobservable dispositions using survey research methodology. EBMAS data were used in NYU's TEAC Inquiry Brief (Tobias, Pietanza, & McDonald, 2011) as evidence supporting the claims that its graduates were competent and caring teachers.

This paper presents updated EBMAS findings from the continued assessment of NYU's teacher education students during the 2010 – 11 and 2011 – 12 academic years. In addition to presenting the overall findings for the assessment, the paper describes the results from CRTL's continued research on EBMAS, including analysis of its psychometric properties and investigation of differences in scores for students disaggregated by program, demographics, and experience. Following this introduction, the paper provides background and history on the development of EBMAS. Next is a section on the survey methods, followed by the presentation of results and a discussion of their implications.

BACKGROUND ON EBMAS

CRTL developed EBMAS in fall 2009 as a measure of teacher candidates' developing dispositions toward teaching. EBMAS replaced its precursor, the Educational Beliefs Questionnaire (EBQ), which was administered to Steinhardt teacher-education students from 2004 to 2008. The initial form of EBMAS, which was administered to NYU teacher education students in fall 2009 and spring 2010 as part of the TEAC re-accreditation self-inquiry study, consisted of 39 items. In addition to the EBQ, EBMAS items were drawn from the Teacher Efficacy Scale (TES) (Gibson and Dembo, 1984) and the Teacher Multicultural Attitude Survey (TMAS) (Ponterotto et al., 1998). Item selection was based on alignment with the goals of the NYU program and the clarity of the items.

It was hypothesized that the original 39-item scale would measure four constructs: General Teacher Efficacy (GTE), defined as the overall belief that teaching can promote the learning of all students regardless of home background or community; Personal Teacher Efficacy (PTE), which is the teacher's own belief that he or she can educate all children regardless of background; Multicultural Attitudes (MA), which is the teacher's awareness of, comfort with, and sensitivity to issues of cultural pluralism in the classroom; and Social Justice (SJ), defined as the belief in the moral and social responsibility of teachers to educate all children equitably. However, factor analysis of the data from the administration of the 39-item survey found a slightly different factor structure that comprised 28 of the items and explained 48% of the variance. While GTE emerged as a major factor as expected, the results of the factor analysis differed from expectation with respect to the other three factors.
First, the PTE items split into two factors: one, labeled PTE 1, included items that asked the extent to which aspiring teachers felt capable of dealing with a variety of classroom situations and pupil problems; the second, labeled PTE 2, included items that asked the extent to which they felt that the successes of their pupils could be attributed to their teaching. The differences between these two factors are subtle and had not been observed in previous research on teacher efficacy, which had focused exclusively on practicing teachers. Second, the items designed to measure MA and SJ separately loaded on a single factor, which was labeled MA/SJ. Therefore, four subscale scores were computed using the 28 items aligned with the empirical factors and were used to assess the claims in the TEAC Inquiry Brief. In addition, EBMAS was reformatted to a 28-item version, which has been used in all administrations of the survey subsequent to spring 2010.

CRTL continues to do research on EBMAS, which includes ongoing study of its factor structure, validity, and reliability. The results of some of this research have led to further modification of the subscales, which has changed the scoring and reporting of results in this paper. The new scoring system will also be described in the 2013 Annual Report to TEAC.

METHOD

Data Collection

CRTL attempts to administer EBMAS to all teacher education students, both graduate and undergraduate, twice: once at the beginning of their first semester at Steinhardt and again near the end of their last semester. A sample of undergraduates in the dual childhood and early childhood programs also takes EBMAS at mid-preparation, at the beginning of their junior year. In actual implementation, undergraduates in the mid-preparation administration have varying levels of accumulated credits, resulting in their division into Early and Middle preparation groups for comparative analysis.
The data included in this paper are from EBMAS administrations for the three academic years 2009 – 10 through 2011 – 12. During this period, EBMAS was administered in two formats: paper-and-pencil and on-line through Survey Gizmo. Because the audience is captive, the return rate for the paper-and-pencil format tends to be much higher; it was therefore the mode of administration for all semesters except fall 2011. The paper-and-pencil form was administered by instructors at sessions of the following classes/events:

• Undergraduates: The entry EBMAS was given by instructors of the New Student Seminar, which was attended by all teacher education students. The exit EBMAS was administered at seminars embedded in the final term of student teaching. Students who took the mid-preparation EBMAS were assessed in seminars at the beginning of the first semester of student teaching.
• Graduates: Fast Track MA students took the entry EBMAS during the orientation sessions at the beginning of the first summer session. Fall new enrollees were assessed at the beginning of Inquiries, the core pedagogical course. The exit EBMAS was given near the end of the seminar associated with their last student teaching placement.

The on-line form was administered in the same timeframe as the paper-and-pencil form via email invitations. In order to maximize response rates, email invitations were sent in the name of faculty, usually program directors, whom the students would recognize. Arrangements for EBMAS administration were coordinated collaboratively with the Director of Clinical Services, the program directors, and the chairs of the arts departments.

Data

The EBMAS items are statements of beliefs that students respond to using a six-point Likert scale of agreement ranging from (1) Strongly Disagree to (6) Strongly Agree, with the intermediate categories labeled (2) Moderately Disagree, (3) Slightly Disagree, and so on. Item statements are counterbalanced, with some stated in the positive form and some in the negative.
Student workers key-entered completed paper-and-pencil survey forms into SPSS files within two weeks of each administration. On-line survey data were downloaded into SPSS files within two weeks of survey closing. The data consist of the individual numerical scores for each EBMAS item and computed mean scores (scale = 1 – 6) for each of the scales. In computing the scale scores, responses to reverse-coded items were flipped so that high scores always indicated positive beliefs and attitudes. Each record also contained demographic data and information on educational experience and academic programs.

Data Analysis

First, descriptive statistics were computed on the demographic and experience characteristics of the participants to describe the sample. Next, in order to determine whether the data continued to support the hypothesized factor structure of the scale, the updated full database was submitted to principal components analysis (PCA) with varimax rotation. Empirical deviations in the rotated factor solution were examined in relation to the theoretical structure underlying EBMAS. Following this examination, the scales were modified accordingly and checked for internal consistency reliability using Cronbach's coefficient alpha. Then, evidence of substantive validity continued to be explored by comparing the subscale scores of groups of students at the early, middle, and late stages of their teacher education programs. Next, in order to continue to assess the TEAC claims for the Annual Report to TEAC, mean subscale scores were computed for program completers in the Classes of 2011 and 2012 and compared to the program standard of a mean equal to or greater than 4.50. Means for question 17, which asks students the extent to which their teacher education program has given (will give) them the skills to be an effective teacher, were calculated separately as a measure of perceived program effectiveness.
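
The scoring steps described above (flipping reverse-coded items on the 1 – 6 scale, averaging items into a scale score, and checking internal consistency with Cronbach's alpha) can be sketched in a few lines. This is a minimal illustration rather than the SPSS procedure CRTL used: the item identifiers, the choice of reverse-coded items, and the response values are hypothetical.

```python
from statistics import mean, variance

def reverse_code(score, low=1, high=6):
    """Flip a Likert response so that high always means a positive belief."""
    return low + high - score

def scale_score(responses, items, reverse_items):
    """Mean of one respondent's item scores after reverse-coding.

    responses: dict mapping item id -> raw score (1-6)
    items: item ids that belong to the scale (hypothetical ids here)
    reverse_items: subset of items stated in the negative form
    """
    return mean(
        reverse_code(responses[i]) if i in reverse_items else responses[i]
        for i in items
    )

def cronbach_alpha(rows):
    """Cronbach's alpha for rows of already reverse-coded item scores.

    rows: one list of item scores per respondent, all the same length.
    """
    k = len(rows[0])
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical three-item scale; "q3" is assumed to be negatively worded.
items, reverse_items = ["q1", "q2", "q3"], {"q3"}
raw = [
    {"q1": 5, "q2": 6, "q3": 2},
    {"q1": 4, "q2": 4, "q3": 3},
    {"q1": 2, "q2": 3, "q3": 5},
    {"q1": 6, "q2": 5, "q3": 1},
]
coded = [
    [reverse_code(r[i]) if i in reverse_items else r[i] for i in items]
    for r in raw
]
print([round(scale_score(r, items, reverse_items), 2) for r in raw])
print(round(cronbach_alpha(coded), 3))
```

A respondent's scale score can then be compared directly with the program standard of 4.50, and alpha gives a rough check that the items in a subscale move together.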
In order to assess the stability reliability of the results across time, ANOVAs were applied to the mean subscale scores with year (2009 – 10, 2010 – 11, and 2011 – 12) as the independent variable. Finally, ANOVAs and t-tests for independent samples were applied to mean subscale scores of participants grouped by descriptive and experience variables, including certification area, Fast Track program, gender, ethnicity, international students, prior teaching experience, student teaching, and experience teaching minorities.

RESULTS

Description of the Participants

The full dataset had data for a total of 1,684 students, which included all teacher education students who had taken EBMAS during the fall 2009 through fall 2012 semesters. This report focuses on the 1,450 NYU teacher education students who took EBMAS during the three academic years 2009-10 – 2011-12. As can be seen in Table 1, the plurality (N = 609, 42%) of the total sample took the survey in 2010 – 11, followed by 2011 – 12 (N = 501, 34.6%). More than half (N = 787, 54.8%) were graduate students, of whom 257 (32.7%) were in the Fast Track program. More than four-fifths (82.7%) of the total sample was female and 107 (7.1%) were international students. Of the 1,381 students who provided usable data on race/ethnicity, nearly three-fifths (58.1%) identified as White or European American and 23.2% as Asian (see Table 2).

Table 1. N and percent of students taking EBMAS by academic year

Academic Year   N Students   % Total Sample
2009 – 10       340          23.4
2010 – 11       609          42.0
2011 – 12       501          34.6
Total Sample    1450         100.0

Table 2. Race/ethnicity of total sample and sample with data

Race/Ethnicity        N Students   % of Total   % With Data
Latino                128          8.8          9.3
African American      55           3.8          4.0
Asian                 320          22.1         23.2
White/EuroAmerican    802          55.3         58.1
Multi-ethnic          76           5.2          5.5
Total with data       1381         95.2         100.0

Note: 69 students did not provide usable data on race/ethnicity.

Table 3 displays the distribution of respondents by certification area.
Nearly one-quarter (N = 354, 24.6%) were in the Dual Childhood/Childhood Special Education major. Other majors with large numbers of respondents were English (184), Math (155), Dual Early Childhood/Early Childhood Special Education (154), and Foreign Language/TESOL/Bilingual Education. In a later part of the Results section, scale scores will be disaggregated by certification areas within degree programs with N's of at least five.

The survey also asked the participants about the number of credits they had accumulated in the program and their prior experiences in education. Consistent with CRTL's protocol for administering EBMAS, the largest numbers of undergraduate respondents were in the beginning or later stages of their programs. As can be seen in Table 4, 350 (55.6%) of the undergraduates had between 0 – 15 credits and a total of 176 (27.9%) had 90 or more; the latter were grouped together as the Late stage group in the analysis of scale scores by stage of program, which is presented below. For the graduate students, 429 (55.0%) were in the beginning of the program with 0 – 15 credits. All of the other graduate respondents were considered to be in the late stage of their studies.

In response to a question about whether they had prior teaching experience, 1,220 (84.1%) responded yes. When respondents were asked to describe this experience, however, only 12% of the experiences could be classified as actual teaching, and most of this teaching was in foreign countries. Thirty-eight percent of those reporting their experiences were tutors, 13% were teacher aides or assistants, 12% cited student teaching, 11% were counselors in camps or after-school programs, 10% worked in non-formal education programs, such as parks and zoos, and the rest worked as interns or substitute teachers. In response to a direct question about whether they had student taught, 41.6% responded yes. Finally, 43% indicated that their teaching or student teaching experiences included minority students.

Table 3. Number and percent of total sample and sample with data in certification areas

Certification Area                                 N Students   % of Total   % With Data
Childhood Ed                                       26           1.8          1.8
Dual Childhood/Childhood Special Ed                354          24.4         24.6
Dance Ed                                           23           1.6          1.6
Ed Theater*                                        84           5.8          5.8
Foreign Language Ed                                69           4.8          4.8
Social Studies Ed                                  73           5.0          5.1
Science Ed                                         78           5.4          5.4
Music Ed                                           91           6.3          6.3
Dual Early Childhood/Early Childhood Special Ed    154          10.6         10.7
Foreign Language/TESOL/Bilingual Ed                112          7.7          7.8
English                                            184          12.7         12.8
Math                                               155          10.7         10.8
TESOL/Bilingual Ed                                 27           1.9          1.9
Early Childhood Ed                                 8            0.6          0.6
Total                                              1438         99.2         100.0

* Includes dual majors with social studies and English.
Note: The above data do not include 11 students who did not report their certification areas and one who reported it as Special Education.

Table 4. Credits completed within degree groups for EBMAS respondents

                     Undergraduate              Graduate
Credits Completed    N      % within Degree     N      % within Degree
0-15                 350    55.6%               429    55.0%
16-30                23     3.7%                99     12.7%
31-45                19     3.0%                170    21.8%
46-60                32     5.1%                76     9.7%
61-75                13     2.1%
76-90                16     2.5%
91-105               16     2.5%
106-120              56     8.9%
120 or more          104    16.5%
Total                629    100.0%              774    100.0%

Note: 41 students did not respond to this question and six gave out-of-range values, for a total of 47 students with missing data who were excluded from this table.

Validity and Reliability

Structural Validity and Reliability: In order to re-examine the empirical evidence for the clustering of items into subscales for the calculation of scores for specific dispositional constructs, PCA was applied to the full dataset of 1,684 students, which included the fall 2012 administration that was only used in this analysis. The results were similar to those for the PCA that was run on the 2009 – 10 sample, which had taken the earlier 39-item version, with one exception.
The current PCA yielded five factors, with the MA/SJ subscale items splitting into two factors; the split subscale was more consistent with the theoretical logic that guided the original construction of the scale. That is, the items that were originally intended to measure MA and those intended to measure SJ split, with each set showing high loadings on one factor and low loadings on the other. The five factors accounted for 49.7% of the item variance, slightly more than the earlier PCA, and were better aligned with the intended theoretical structure of the original scale. The coefficient alphas for the five scales were moderate to large, confirming their internal consistency reliability, as follows: PTE1, alpha = .754; PTE2, alpha = .740; GTE, alpha = .649; MA, alpha = .848; SJ, alpha = .666. The evidence suggests that the five-factor structure has reasonable empirical validity and reliability and better theoretical validity than the four-factor structure. Therefore, the items will be clustered into five subscales for EBMAS scoring in this and future analyses.

Substantive Validity: Stages: The NYU Inquiry Brief for continuing accreditation reported that the EBMAS subscale scores of students in the later stage of their program were statistically significantly higher than for those in the early stage, which was considered to be evidence for the substantive validity of the scale (Tobias et al., 2011). In order to continue to assess the substantive validity of the five subscales of the new EBMAS scoring system, ANOVAs and t-tests were applied to test for the statistical significance of differences in the mean subscale scores of groups of students that varied in their stage of program completion. The results, which are displayed in Table 5 for undergraduate students and Table 6 for graduate students, mostly support the substantive validity of EBMAS, although with a few exceptions. As can be seen in Table 5, undergraduate students in the later stages of their programs, i.e.
groups 3 and 4, had higher mean scores than those in the earlier stages, i.e. groups 1 and 2, for four of the five subscales. Note that due to the length of the undergraduate program and the assessment schedule, which allows for three assessments of some undergraduates, undergraduates are divided into four stage groups for this analysis. The late-stage group scored higher than the new group for four of the subscales and higher than the early-stage group for three; the middle-stage group scored higher than the new group for three subscales and higher than the early-stage group for two. There were no statistically significant differences between the new and early-stage groups on any subscale, and none among any of the groups for PTE 2, which assesses the extent to which students believe they are or will be responsible for the academic and behavioral accomplishments of their students.

As can be seen in Table 6, due to the shorter duration of the graduate program, these students were divided into two groups, new and late-stage, for this analysis. The results of t-tests for the significance of differences in mean EBMAS subscale scores between the two groups were equivocal, as they had been in the TEAC Inquiry Brief. Consistent with the TEAC results, the late-stage graduate students had a statistically significantly higher mean score than the new students in PTE 1, which measures their belief that they can or will be able to handle their students' academic and behavioral problems in the classroom. This finding is theoretically reasonable, since program experiences, especially student teaching, are designed to bolster their teaching skill and confidence. However, as we observed for the undergraduates above and consistent with the results reported in the TEAC Inquiry Brief, there were no statistically significant differences in mean PTE 2 scores.
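
The independent-samples comparison described here can be approximated from summary statistics alone. The sketch below assumes a pooled-variance (equal variances) t-test and uses the rounded PTE 1 means, standard deviations, and group sizes reported in Table 6. The function name `pooled_t` is illustrative, and because the inputs are rounded, the recomputed t statistic will be close to, but not identical with, the value printed in the table.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic assuming equal variances.

    Returns (t, degrees_of_freedom). Inputs are group summary statistics.
    """
    df = n1 + n2 - 2
    pooled_variance = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df

# PTE 1, graduate students (rounded values from Table 6):
# late-stage group: mean 4.38, SD 0.75, N 343
# new group:        mean 3.87, SD 0.82, N 429
t, df = pooled_t(4.38, 0.75, 343, 3.87, 0.82, 429)
print(f"t = {t:.2f}, df = {df}")
```

The degrees of freedom from this pooled form match the PTE 1 row of Table 6. The GTE and MA rows report smaller degrees of freedom, which suggests an unequal-variance adjustment was applied there; that case is not covered by this sketch.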
The contradictory findings for PTE 1 and PTE 2 add evidence supporting the fundamental difference between the constructs measured by these two subscales and suggest that the former can be impacted by pre-service program experiences, while the latter may not. Disparate findings were also observed in the results for the MA and SJ subscales for the graduate students. Whereas the mean score for the late-stage group was higher than for the new group on the SJ subscale, the reverse was true for the MA subscale. In this regard, it should be noted that these are tests for independent samples and not repeated measures, and the mean MA subscale score of the new students was already quite high.

Table 5. Summary of ANOVA for differences in EBMAS subscale scores among students at different stages of their programs (Undergraduates)

Subscale  Stage of Program  N    Mean  Std. Deviation  F      Sig    Stages with significant differences
PTE1      New (1)           350  3.53  0.85            68.84  0.000  1 & 2 < 3 & 4
          Early (2)          41  3.74  0.90
          Middle (3)         61  4.40  0.89
          Late (4)          176  4.59  0.80
PTE2      New (1)           350  4.28  0.71             0.75  0.524  None
          Early (2)          42  4.15  0.68
          Middle (3)         61  4.33  0.68
          Late (4)          175  4.23  0.73
GTE       New (1)           350  4.80  0.79             5.42  0.001  1 < 4
          Early (2)          42  4.86  0.95
          Middle (3)         61  5.11  0.92
          Late (4)          176  5.07  0.81
MA        New (1)           350  5.01  0.69            28.35  0.000  1 & 2 < 3 & 4
          Early (2)          42  5.14  0.71
          Middle (3)         61  5.51  0.53
          Late (4)          176  5.52  0.59
SJ        New (1)           350  4.96  0.60            24.34  0.000  1 < 3 & 4; 2 < 4
          Early (2)          42  5.05  0.60
          Middle (3)         61  5.35  0.52
          Late (4)          176  5.37  0.54

Table 6. Summary of t-tests for differences in EBMAS subscale scores between new and late-stage students (graduate students)

Subscale  Stage  N    Mean  Std. Dev.  M diff.  t      df   Signif.
PTE1      Late   343  4.38  0.75        0.51     8.69  770  0.000
          New    429  3.87  0.82
PTE2      Late   342  4.42  0.68       -0.01    -0.12  767  0.904
          New    427  4.42  0.68
GTE       Late   343  4.84  0.92       -0.08    -1.31  697  0.191
          New    429  4.92  0.83
MA        Late   342  5.36  0.63       -0.12    -2.69  683  0.007
          New    429  5.48  0.55
SJ        Late   342  5.29  0.62        0.09     2.06  769  0.040
          New    429  5.20  0.56

Stability Reliability: In order to assess the stability of EBMAS subscale scores over time, ANOVAs were applied to the differences in the mean subscale scores across the three years of the study (see Table 7). This analysis included only students in the late stage of the program. In addition to testing for the main effects of year, the analyses tested for the main effects of degree program and the interaction effects of year and degree. As can be seen in Table 7, there was no statistically significant interaction effect or main effect for year. There was a statistically significant main effect for degree program for four of the five subscales, but this does not detract from the evidence supporting the stability of the findings over time. Inspection of the total means across the three years in Table 8 reveals that the mean scores of undergraduates were significantly higher than those for graduate students for three of the subscales, PTE 1, GTE, and MA, while the mean for graduate students was significantly higher for PTE 2.

Table 7. Summary of ANOVA for tests of significance of the main and interaction effects of year and degree on EBMAS subscale scores (late-stage BS and MA students in Classes of 2010 - 2012)

          Year                   Degree                 Year by Degree
Subscale  Df       F     Sig     Df       F     Sig     Df       F     Sig
PTE1      2 & 513  2.37  0.094   1 & 513  8.85  0.003   2 & 513  1.55  0.213
PTE2      2 & 511  0.59  0.601   1 & 511  4.23  0.003   2 & 511  0.39  0.674
GTE       2 & 513  1.36  0.258   1 & 513  8.20  0.004   2 & 513  2.27  0.104
MA        2 & 512  1.70  0.183   1 & 512  8.20  0.004   2 & 512  2.38  0.094
SJ        2 & 512  0.48  0.619   1 & 512  2.68  0.102   2 & 512  2.09  0.125

* Effects (F, sig.) in bold font are statistically significant at p < .05

Assessment of TEAC Claims

NYU uses the EBMAS as one of its measures of two of its four TEAC claims (Claim 3, Clinical Competence, and Claim 4, Caring Professional) and one of the three cross-cutting themes (CCT), CCT 2, Multicultural Perspective. The program standard established by the faculty for attainment of the claims is a mean score for late-stage students of at least 4.50 on the subscales aligned with each claim. Table 8 displays the results of the assessment of the claims using EBMAS for the three academic years, 2009-10 through 2011-12, the first of which is a reanalysis of the data that were reported in the TEAC Inquiry Brief for reaccreditation. Consistent with the high stability reliability of this measure reported above, the results show high consistency across the three years. Undergraduates met the program standard in all three years on four of the five subscales: PTE 1 (Claim 3), GTE (Claim 4), MA (CCT 2), and SJ (Claim 4). On the other hand, undergraduates fell below the program standard in PTE 2 (Claim 3) by about one-quarter point in all three years. This is further evidence of the fundamental difference between these two types of PTE and suggests that although undergraduates are confident they know how to help their students learn and behave, they are less sure that the successes of their students can be attributed to their teaching. Table 8 shows similar positive results for graduate students on the Claim 4 measures, GTE and SJ, and the CCT 2 measure, MA, but somewhat different outcomes on the Claim 3 measures. Although graduate students fell below the program standard for both PTE 1 and PTE 2 across the three years combined, the shortfall was only about a tenth of a point overall and they did meet the standard in PTE 1 in 2011-12. Moreover, the mean scores for the two subscales have been increasing across the three years.
Thus, the overall findings continue to provide evidence supporting the claims.

Table 8. Mean EBMAS subscale scores by degree and year compared to the program standard of 4.50 (Classes of 2010 - 2012)

                                 Undergraduate                      Graduate
Subscale/Claims **  Year         N    Mean  Std. Dev.  M - 4.50 *   N    Mean  Std. Dev.  M - 4.50 *
PTE1 (Claim 3)      2009 - 10     54  4.52  0.84        0.02        109  4.22  0.76       -0.28
                    2010 - 11     54  4.70  0.66        0.20        114  4.40  0.80       -0.10
                    2011 - 12     68  4.55  0.88        0.05        120  4.51  0.67        0.01
                    Total        176  4.59  0.80        0.09        343  4.38  0.75       -0.12
PTE2 (Claim 3)      2009 - 10     53  4.18  0.78       -0.32        109  4.40  0.70       -0.10
                    2010 - 11     54  4.25  0.76       -0.25        114  4.36  0.70       -0.14
                    2011 - 12     68  4.24  0.67       -0.26        119  4.49  0.64       -0.01
                    Total        175  4.23  0.73       -0.27        342  4.42  0.68       -0.08
GTE (Claim 4)       2009 - 10     54  5.25  0.67        0.75        109  4.78  0.93        0.28
                    2010 - 11     54  4.94  0.90        0.44        114  4.77  0.95        0.27
                    2011 - 12     68  5.02  0.83        0.52        120  4.96  0.88        0.46
                    Total        176  5.07  0.81        0.57        343  4.84  0.92        0.34
MA (CCT 2)          2009 - 10     54  5.52  0.51        1.02        108  5.22  0.67        0.72
                    2010 - 11     54  5.54  0.57        1.04        114  5.35  0.68        0.85
                    2011 - 12     68  5.50  0.66        1.00        120  5.50  0.49        1.00
                    Total        176  5.52  0.59        1.02        342  5.36  0.63        0.86
SJ (Claim 4)        2009 - 10     54  5.41  0.49        0.91        109  5.19  0.60        0.69
                    2010 - 11     54  5.37  0.51        0.87        114  5.28  0.70        0.78
                    2011 - 12     68  5.34  0.61        0.84        119  5.39  0.52        0.89
                    Total        176  5.37  0.54        0.87        342  5.29  0.62        0.79

* Values in bold font indicate the program standard has been met or exceeded.
** TEAC Claims: Claim 3, Clinical Competence; Claim 4, Caring Professional; CCT 2, Multicultural Perspective

Differences in Scale Scores by Student Demographics and Experience

A series of statistical analyses was performed to determine whether EBMAS scale scores varied for groups of students who differed in key measured program, experiential, and demographic variables. In order to control for stage in the program and degree, the participants were late-stage students and separate analyses were performed for undergraduate and graduate students.
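The comparison against the program standard in Table 8 amounts to subtracting 4.50 from each mean and checking the sign. The sketch below reproduces that arithmetic for the three-year total means of late-stage students; the dictionary layout and function name are illustrative assumptions rather than part of the program's scoring system.

```python
PROGRAM_STANDARD = 4.50  # faculty-set standard for late-stage mean scores

# Three-year total means for late-stage students, taken from Table 8
total_means = {
    "undergraduate": {"PTE1": 4.59, "PTE2": 4.23, "GTE": 5.07, "MA": 5.52, "SJ": 5.37},
    "graduate": {"PTE1": 4.38, "PTE2": 4.42, "GTE": 4.84, "MA": 5.36, "SJ": 5.29},
}


def gaps_against_standard(means, standard=PROGRAM_STANDARD):
    """Return {subscale: (mean minus standard, standard met?)} for one group."""
    return {
        subscale: (round(mean - standard, 2), mean >= standard)
        for subscale, mean in means.items()
    }


results = {group: gaps_against_standard(means) for group, means in total_means.items()}

for group, gaps in results.items():
    for subscale, (gap, met) in gaps.items():
        status = "met" if met else "not met"
        print(f"{group:13s} {subscale:5s} {gap:+.2f}  {status}")
```

Run on the total means, this flags PTE 2 for undergraduates and both PTE subscales for graduate students as falling short of the standard, matching the narrative above.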
Undergraduates: Table 9 summarizes the statistical analyses performed on the EBMAS subtest scores of groups of undergraduates varying in descriptive characteristics. As the bold font indicates, statistically significant differences in at least one of the five subscales were observed for four of the six measured descriptive characteristics as follows: PTE 1 for certification area; PTE 2 and GTE for race/ethnicity; PTE 1 for student teaching; and four subscales, PTE 1, GTE, MA, and SJ, for experience teaching/student teaching minorities. No significant differences were observed for gender or international versus American students. Further analyses were performed to determine the nature and size of these statistically significant differences.

Table 9. Summary of t-tests and ANOVAs for test of significance of differences in EBMAS subscale scores by descriptive characteristics of late-stage undergraduate student teachers (Classes of 2010 - 12)

                       PTE 1          PTE 2          GTE            MA             SJ
Descriptor             t/F    Sig     t/F    Sig     t/F    Sig     t/F    Sig     t/F    Sig
Certification area      5.13  0.000    0.35  0.909    0.33  0.922    1.35  0.236   0.698  0.652
Gender                 -0.61  0.542   -0.24  0.815   -1.57  0.119    0.28  0.777   -0.05  0.757
Race/ethnicity          2.32  0.060    2.71  0.032    4.87  0.001    1.25  0.293    1.02  0.398
International student  -0.60  0.584   -0.06  0.96    -0.89  0.375   -0.81  0.417   -0.87  0.386
Student teaching        2.91  0.004    0.10  0.918    0.69  0.489    0.33  0.740    0.31  0.759
Taught minorities       2.52  0.013    0.03  0.974    2.93  0.011    2.73  0.011    1.97  0.051

* Effects (F, sig.) in bold font are statistically significant at p < .05

First, Scheffe post-hoc comparisons among pairs of PTE 1 means were performed for seven certification areas with a minimum N of 5. Table 10 displays the means in rank order from low to high and in homogeneous subsets; that is, means in the same subset do not differ significantly but means in one subset differ significantly from means not in that same subset. Accordingly, the mean PTE 1 scores for undergraduates in math and music are significantly lower than the mean for dual early childhood/early childhood special education. No other differences between certification areas were statistically significant.

Table 10. Homogeneous subsets of EBMAS PTE 1 means by certification area for late-stage undergraduates (Classes of 2010 - 12)

                                                      Subset for alpha = 0.05
Certification area                               N    1      2
math                                             11   3.85
music                                            17   4.04
science ed                                        5   4.12   4.12
English ed                                       23   4.61   4.61
dual childhood/childhood special ed              50   4.63   4.63
ed theater                                        7   4.69   4.69
dual early childhood/early childhood special ed  54          4.89

Means for groups in homogeneous subsets are displayed.
a. Uses Harmonic Mean Sample Size = 12.183.
b. The group sizes are unequal with a minimum N of 5. The harmonic mean of the group sizes is used. Type I error levels are not guaranteed.
c. Means in bold font for certification areas in subset 1 are statistically significantly smaller than the means in bold font for certification areas in subset 2.

Tables 11 and 12 display the results of the respective Scheffe post-hoc comparisons of PTE 2 and GTE means between racial/ethnic groups. For PTE 2, the mean for African Americans was significantly lower than the means for Latinos and Whites/European-Americans. For GTE, the mean for Asian undergraduates was significantly lower than the mean for Latinos.

Table 11. Homogeneous subsets of EBMAS PTE 2 means by race/ethnicity for late-stage undergraduates (Classes of 2010 - 12)

                           Subset for alpha = 0.05
Race/Ethnicity        N    1      2
African American       8   3.84
Asian                 30   3.95   3.95
Multi-racial          15   4.25   4.25
Latino                30          4.35
White/Euro-American   82          4.35

Means for groups in homogeneous subsets are displayed.
a. Uses Harmonic Mean Sample Size = 18.48.
b. The group sizes are unequal. The harmonic mean of the group sizes is used. Type I error levels are not guaranteed.
c. Means in bold font for groups in subset 1 are statistically significantly smaller than the means in bold font for groups in subset 2.

Table 12. Homogeneous subsets of EBMAS GTE means by race/ethnicity for late-stage undergraduates (Classes of 2010 - 12)

                           Subset for alpha = 0.05
Race/Ethnicity        N    1      2
Asian                 30   4.63
White/Euro-American   82   5.12   5.12
Multi-racial          15   5.23   5.23
African American       8   5.34   5.34
Latino                30          5.38

Means for groups in homogeneous subsets are displayed.
a. Uses Harmonic Mean Sample Size = 18.48.
b. The group sizes are unequal. The harmonic mean of the group sizes is used. Type I error levels are not guaranteed.
c. Means in bold font for groups in subset 1 are statistically significantly smaller than the means in bold font for groups in subset 2.

Table 13 displays the means for the EBMAS subscales that had statistically significant differences on t-tests for independent samples comparing students who had and had not student taught and those who had and had not taught or student taught substantial numbers of minority students. It should be remembered that most of the “teaching experiences” of the respondents involved assisting teachers, after-school programs, or non-formal schooling (see above).

Table 13. Mean EBMAS subscale scores for late-stage undergraduate students with varying types of student teaching experience (Classes of 2010 - 2012) *

Experience                                  Subscale  Yes/No  N    Mean  SD
Have you student taught?                    PTE1      Yes     158  4.64  0.80
                                                      No       16  4.04  0.69
Have you taught minority students before?   PTE1      Yes     148  4.63  0.82
                                                      No       23  4.18  0.57
                                            GTE       Yes     148  5.14  0.80
                                                      No       23  4.61  0.80
                                            MA        Yes     148  5.57  0.56
                                                      No       23  5.16  0.69
                                            SJ        Yes     148  5.41  0.54
                                                      No       23  5.17  0.57

* Data are displayed only for descriptors and scales that showed statistically significant mean differences (see Table X).

As can be seen in Table 13, the mean PTE 1 score for undergraduates who had student taught was significantly higher than for those who had not.
This makes theoretical sense, since the students who had student taught would have had the opportunity to test their personal teaching skill in practice. In addition, students who had experience teaching/student teaching minorities had significantly higher PTE 1, GTE, MA, and SJ scores than those who had not. These findings support the program’s theory and practice of providing student teachers field opportunities in high-minority schools.

Graduate Students: Table 14 summarizes the statistical analyses performed on the EBMAS subtest scores of groups of graduate students varying in descriptive characteristics. Statistically significant differences in at least one of the five subscales were observed for all but one (taught minorities) of the seven measured descriptive characteristics as follows: MA for Fast Track; GTE and MA for certification area; PTE 2 and MA for gender; PTE 1, GTE, and SJ for race/ethnicity; GTE and SJ for international students; and PTE 1, GTE, and SJ for student teaching.

First, a t-test for independent samples revealed that the mean MA score of students in the Fast Track program was significantly lower than the mean for those in the regular program, M = 5.31, SD = 0.64, N = 0.64 for the former versus M = 5.45, SD = 0.61, N = 394 for the latter. This was the only significant difference observed in the EBMAS scores of Fast Track students.

Next, Tables 15 and 16 display the results of Scheffe post-hoc comparisons of mean GTE and MA scores, respectively, between pairs of certification areas. As can be seen in Table 15, the mean GTE score of graduate students in the TESOL/bilingual area was significantly lower than that of students in social studies education; and, as indicated in Table 16, the mean MA scores of students in dance education and mathematics were significantly lower than the means for dual early childhood/early childhood special education, dual childhood/childhood special education, TESOL/bilingual education, and foreign language education.
Table 14. Summary of t-tests and ANOVAs for test of significance of differences in EBMAS subscale scores by descriptive characteristics of late-stage graduate student teachers (Classes of 2010 - 12)

                       PTE 1          PTE 2          GTE            MA             SJ
Descriptor             t/F    Sig     t/F    Sig     t/F    Sig     t/F    Sig     t/F    Sig
Fast Track             -1.86  0.063    1.22  0.222    1.01  0.314   -2.22  0.027    0.51  0.609
Certification area      1.43  0.152    0.73  0.719    2.63  0.002    4.07  0.000    1.64  0.080
Gender                  0.01  0.989   -2.00  0.047   -1.29  0.197   -2.75  0.008   -1.02  0.308
Race/ethnicity          2.97  0.020    0.11  0.979   16.42  0.000    1.91  0.109    3.61  0.007
International student   0.92  0.359   -0.81  0.418   -5.22  0.000   -0.85  0.395   -2.51  0.017
Student teaching        4.18  0.000   -0.73  0.464    3.35  0.001    0.41  0.681    2.06  0.040
Taught minorities       1.57  0.118   -0.72  0.471    1.51  0.132   -0.29  0.774    1.14  0.256

* Effects (F, sig.) in bold font are statistically significant at p < .05

Table 15. Homogeneous subsets of EBMAS GTE means by certification area for late-stage graduate students (Classes of 2010 - 12)

                                                      Subset for alpha = 0.05
Certification area                               N    1      2
TESOL/bilingual ed                               12   4.33
music ed                                         14   4.43
dance ed                                          6   4.46   4.46
foreign language/TESOL/Bilingual ed              51   4.59   4.59
Math ed                                          56   4.64   4.64
science ed                                       24   4.79   4.79
English ed                                       45   4.90   4.90
foreign language ed                              15   5.00   5.00
dual childhood/childhood special ed              62   5.09   5.09
childhood ed                                     11   5.09   5.09
dual early childhood/early childhood special ed  16   5.14   5.14
ed theater                                       14   5.25   5.25
social studies ed                                13          5.42

Means for groups in homogeneous subsets are displayed.
a. Uses Harmonic Mean Sample Size = 16.102.
b. The group sizes are unequal. The harmonic mean of the group sizes is used. Type I error levels are not guaranteed.
c. Means in bold font for certification areas in subset 1 are statistically significantly smaller than the means in bold font for certification areas in subset 2.

Table 16.
Homogeneous subsets of EBMAS MA means by certification area for late-stage graduate students (Classes of 2010 - 12)

                                                   Subset for alpha = 0.05
Certification area                            N    1      2
dance ed                                       6   4.85
Math ed                                       55   4.96
Music ed                                      14   5.07   5.07
childhood ed                                  11   5.18   5.18
science ed                                    24   5.30   5.30
English ed                                    45   5.39   5.39
ed theater                                    14   5.44   5.44
foreign language/TESOL/Bilingual ed           51   5.46   5.46
social studies ed                             13   5.53   5.53
dual early childhood/early childhood special  16          5.55
dual childhood/childhood special              62          5.56
TESOL/Bilingual Ed                            12          5.56
foreign language ed                           15          5.63

See footnotes in Table 15 above.

Next, Scheffe post-hoc comparisons were applied to all pairs of race/ethnic group means for PTE 1 (Table 17), GTE (Table 18), and SJ (Table 19). The results show that the PTE 1 mean for the multi-racial group was significantly lower than that for Latinos, the GTE mean for Asian students was significantly lower than that for all other groups except multi-racial, and the SJ mean for Asians was significantly lower than that for Latinos. The consistently lower EBMAS scores of Asian graduate students warrant discussion and further exploration.

Table 17. Homogeneous subsets of EBMAS PTE 1 means by race/ethnicity for late-stage graduate students (Classes of 2010 - 12)

                           Subset for alpha = 0.05
Race/Ethnicity        N    1      2
Multi                 16   3.83
Asian                 73   4.32   4.32
White/Euro-American  204   4.42   4.42
African American      14   4.46   4.46
Latino                16          4.63

Means for groups in homogeneous subsets are displayed.
a. Uses Harmonic Mean Sample Size = 23.25.
b. The group sizes are unequal. The harmonic mean of the group sizes is used. Type I error levels are not guaranteed.
c. Means in bold font for groups in subset 1 are statistically significantly smaller than the means in bold font for groups in subset 2.

Table 18.
Homogeneous subsets of EBMAS GTE means by race/ethnicity for late-stage graduate students (Classes of 2010 - 12)

                           Subset for alpha = 0.05
Race/Ethnicity        N    1      2
Asian                 73   4.17
Multi                 16   4.80   4.80
White/Euro-American  204          5.03
Latino                16          5.26
African American      14          5.30

Means for groups in homogeneous subsets are displayed.
a. Uses Harmonic Mean Sample Size = 23.25.
b. The group sizes are unequal. The harmonic mean of the group sizes is used. Type I error levels are not guaranteed.
c. Means in bold font for groups in subset 1 are statistically significantly smaller than the means in bold font for groups in subset 2.

Table 19. Homogeneous subsets of EBMAS SJ means by race/ethnicity for late-stage graduate students (Classes of 2010 - 12)

                           Subset for alpha = 0.05
Race/Ethnicity        N    1      2
Asian                 73   5.12
Multi-racial          16   5.19   5.19
African American      14   5.23   5.23
White/Euro-American  203   5.37   5.37
Latino                16          5.58

Means for groups in homogeneous subsets are displayed.
a. Uses Harmonic Mean Sample Size = 23.25.
b. The group sizes are unequal. The harmonic mean of the group sizes is used. Type I error levels are not guaranteed.
c. Means in bold font for groups in subset 1 are statistically significantly smaller than the means in bold font for groups in subset 2.

Last, Table 20 summarizes the mean scores for subscales that showed statistically significant t-tests for independent samples on three dichotomous descriptors. As can be seen in the table, the mean PTE 2 score of females was significantly higher than that of males; graduate students who had student taught had significantly higher mean PTE 1, GTE, and SJ scores than those who had not; and the mean GTE and SJ scores of international students were significantly lower than those of their American counterparts. The last finding may be related to the consistently lower scores found for Asian students, as described above.

Table 20. Mean EBMAS subscale scores for late-stage graduate students by significant descriptive characteristics (Classes of 2010 - 2012) *

Descriptor                          Subscale  Values  N    Mean  SD
Gender                              PTE2      male     60  4.26  0.74
                                              female  282  4.45  0.66
Have you student taught?            PTE1      Yes     254  4.47  0.71
                                              No       81  4.09  0.77
                                    GTE       Yes     254  4.93  0.90
                                              No       81  4.55  0.92
                                    SJ        Yes     253  5.33  0.61
                                              No       81  5.18  0.54
Are you an international student?   GTE       yes      31  4.05  0.81
                                              no      312  4.92  0.89
                                    SJ        yes      31  4.93  0.85
                                              no      311  5.33  0.58

* Data are displayed only for descriptors and scales that showed statistically significant mean differences (see Table X).

Perceived Effectiveness of the Steinhardt Teacher Education Program

One of the EBMAS questions (Question 17) directly asks students about the effectiveness of their teacher education program. The question is posed as a statement, “My teacher training program and/or experience has given me the necessary skills to be an effective teacher.” As a direct measure of the perceived quality of the program, it warrants separate analysis. First, Tables 21 and 22 show summaries of ANOVAs comparing the mean program ratings of late-stage undergraduate and graduate students, respectively, across the three years. There were no statistically significant differences between years for the undergraduates, and the ratings were consistently high, at or near five each year. On the other hand, there were statistically significant differences between years for the graduate students, with the mean for 2009-10 significantly lower than that for 2011-12. Moreover, a t-test for independent samples found that the overall mean for undergraduates was significantly higher than the graduates’ mean, M = 5.17 (SD = .96) for the former versus M = 4.83 (SD = 1.07) for the latter, t = 3.54, df = 514, p < .001. On the positive side, the mean for graduate students has been increasing over the three years.

Table 21.
Summary of ANOVA comparing mean scores of late-stage undergraduates on Question 17 (Classes of 2009-10 to 2011-12) *

Year       N    Mean  SD    F     Sig.
2009 - 10   53  5.34  0.76  2.26  0.108
2010 - 11   54  5.24  0.85
2011 - 12   68  4.99  1.15
Total      175  5.17  0.96

* Question 17: My teacher training program and/or experience has given me the necessary skills to be an effective teacher.

Table 22. Summary of ANOVA comparing mean scores of late-stage graduate students on Question 17 (Classes of 2009-10 to 2011-12) *

Year       N    Mean  SD    F     Sig.   Sig. Differences
2009 - 10  109  4.60  1.25  4.15  0.017  2009-10 < 2011-12
2010 - 11  113  4.88  1.08
2011 - 12  119  4.99  0.84
Total      341  4.83  1.07

* See footnote for Table 21 above.

Although there were no statistically significant differences in mean ratings between the students grouped by certification areas, these data are displayed for information and discussion in Table 23 for undergraduates and Table 24 for graduate students.

Table 23. Mean scores of late-stage undergraduate students on Question 17 of EBMAS (Classes of 2009-10 to 2011-12) *

Certification area                            N    Mean **
math                                          11   4.45
music                                         16   5.13
dual childhood/childhood special              50   5.14
ed theater                                     7   5.14
science ed                                     5   5.20
English                                       23   5.26
dual early childhood/early childhood special  54   5.43

* Question 17: My teacher training program and/or experience has given me the necessary skills to be an effective teacher.
** No statistically significant differences between means at p < .05

Table 24.
Mean scores of late-stage graduate students on Question 17 of EBMAS (Classes of 2009-10 to 2011-12) *

Certification area                            N    Mean **
math                                          55   4.55
foreign language/TESOL/Bilingual ed           50   4.58
dual childhood/childhood special              62   4.76
English                                       45   4.78
childhood ed                                  11   4.82
dance ed                                       6   4.83
TESOL/Bilingual Ed                            12   4.92
foreign language ed                           15   4.93
dual early childhood/early childhood special  16   4.94
ed theater                                    14   5.14
social studies ed                             13   5.15
music                                         14   5.29
science ed                                    24   5.33

* Question 17: My teacher training program and/or experience has given me the necessary skills to be an effective teacher.
** No statistically significant differences between means at p < .05

Finally, there were no statistically significant differences in mean ratings of program effectiveness for the descriptors Fast Track, gender, student teaching, previous teaching, and taught minorities. However, among graduate students, international students had a more positive perception of the program’s effectiveness than American students. The mean for international students was 5.17 (SD = 0.75) and the mean for American students was 4.78 (SD = 1.10), with the difference statistically significant (t = 2.47, df = 42.1, p = .018). It is interesting that although the graduate international students had lower EBMAS subscale scores than the American students, they had a more positive perception of the effectiveness of the program.

SUMMARY AND CONCLUSIONS

This report presented updated findings from the analysis of EBMAS, a component of NYU Steinhardt’s student and program assessment system, for the three academic years 2009-10 through 2011-12. During that time, EBMAS was administered to 1,450 undergraduate and graduate students who were at the beginning, middle, or end of their preservice teacher-education programs.
The report presented findings from continued research on EBMAS, results from the use of the scale to assess TEAC program claims, and analyses of the differences in scores for students grouped by demographic, experience, and program characteristics. These findings update the results reported for a smaller dataset in NYU’s 2011 TEAC Inquiry Brief for re-accreditation. The key findings are as follows:

1. A new PCA of the updated dataset largely replicated the factor structure that emerged from the PCA of the TEAC dataset, but with one important difference. The MA/SJ subscale based on the earlier PCA split into two factors, which led to a new scoring system using five subscales: PTE 1, PTE 2, GTE, MA, and SJ. The new factor and subscale structure is more consistent with the theory underlying the development of EBMAS than the previous four-subscale structure.

2. The substantive validity of EBMAS was strengthened by new evidence that late-stage students had higher scores than new students, and additional evidence was found supporting the scale’s internal consistency reliability and stability. Therefore, inferences about student dispositional development and program effectiveness based on EBMAS can be made with confidence.

3. Mean subscale scores for late-stage undergraduate and graduate students continued to meet and exceed the TEAC program standards for Claim 4, Caring Professionals, and the Cross-Cutting Theme of Multicultural Perspective; however, the standards for Claim 3, Clinical Competence, were only partially met for undergraduates and weakly supported for graduate students. The mean scores of undergraduates were significantly higher than those for graduate students for three of the subscales, PTE 1, GTE, and MA, while the mean for graduate students was significantly higher for PTE 2. These findings are largely consistent with the 2011 TEAC Inquiry Brief, although the scores of graduate students appear to be increasing over the three years.

4.
There were several noteworthy differences in the scores of students grouped by demographic, experience, and program variables. For undergraduates, students in the Dual Early Childhood/Early Childhood Special Education certification area had higher mean PTE 1 scores than those in Math Education and Music Education; Latino and White/European-American students had higher PTE 2 scores than African American students, and Latino students had higher GTE scores than Asian students; and students who had student taught had higher PTE 1 scores than those who had not, while those who had taught/student taught minority pupils not only had higher PTE 1 scores, but also had higher GTE, MA, and SJ scores. Among graduate students, those in the Fast Track program had lower mean MA scores than those in the regular program; students in Social Studies Education had a higher mean GTE score than those in TESOL/Bilingual Education and Music Education, while those in Foreign Language Education, TESOL/Bilingual Education, Dual Childhood/Childhood Special, and Dual Early Childhood/Early Childhood Special had higher mean MA scores than those in Dance Education and Math Education; Latino students had higher PTE 1 scores than Multi-racial students and higher GTE and SJ scores than Asian students, while White/European-American and African-American students also had higher GTE scores than Asian students; female students had higher PTE 2 scores than males; those who had student taught had higher PTE 1, GTE, and SJ scores than those who had not; and international students had lower GTE and SJ scores than American students.

5. Overall, students gave very high ratings to their teacher education program in terms of giving them the necessary skills to be an effective teacher. Undergraduates gave significantly higher ratings to their program than graduate students, although the mean rating for graduate students in the most recent year, 2011-12, was significantly higher than in 2009-10.
There were no statistically significant differences in these ratings for descriptive variables, with the exception of a significantly higher mean rating for international graduate students than American students, despite the former’s generally lower EBMAS scale scores.

These findings lead to the overall conclusion that EBMAS has been a valid and reliable tool for assessing the developing teaching dispositions of NYU Steinhardt teacher education students and, consequently, that the data have important implications for the readiness to teach of the graduates and the effectiveness of the program in preparing competent and caring educators. NYU graduates generally have strong beliefs in the general efficacy of teaching to promote the learning and positive of all pupils, value social justice, and have a strong awareness of and positive attitude toward the importance of a multicultural perspective. They also have moderate confidence in their personal efficacy to teach all students, although with less certainty than their other beliefs. The exploration of differences in scores between students grouped by demographic, experience, and program characteristics revealed some differences that warrant discussion among program faculty and administrators. Finally, recent differences in the factor structure of the scale that emerged from PCA highlight the importance of continuing research on its psychometric properties.

REFERENCES CITED

Burant, T.J., Chubbuck, S.M., & Whipp, J.L. (2007). Reclaiming the moral in the dispositions debate. Journal of Teacher Education, 58(5), 397-411.

Gibson, S., & Dembo, M.H. (1984). Teacher efficacy: A construct validation. Journal of Educational Psychology, 76(4), 569-582.

Ponterotto, J.G., Baluch, S., Greig, T., & Rivera, L. (1998). Development and initial score validation of the teacher multicultural attitude survey. Educational and Psychological Measurement, 58(6), 1002-1016.

Tobias, R., Pietanza, R., & McDonald, J. TEAC Inquiry Brief.
Submission to the Teacher Education Accreditation Council, September 2011.

Appendix E: Inventory: Status of evidence from measures and indicators for TEAC Quality Principle I

Table columns:
Type of Evidence (Note: items under each category are examples. Program may have more or different evidence.)
Available and in the Brief
   Relied on: Reasons for including the results in the Brief (Location in Annual Report)
   Not relied on: Reasons for not relying on this evidence
Not Available and Not in the Brief
   For future use: Reasons for including in future Briefs
   Not for future use: Reasons for not including in future Briefs

Grades

1. Student grades and grade point averages
Content Knowledge GPA, Pedagogical Knowledge GPA, Clinical Skills GPA, and Cross-Cutting Theses GPA are valid and reliable measures of student mastery of the skills and knowledge that are associated with the claims. (pp. 14-15)

Scores on standardized tests

2. Student scores on standardized license or board examinations
Scaled scores on the NYSTCE Content Specialty Tests and Assessment of Teaching Skills-Written exams are valid, reliable, and sensitive measures of Content Knowledge and Pedagogical Knowledge, while scaled scores on the Liberal Arts and Sciences Test are valid measures of the cross-cutting theme of Learning-to-Learn, which requires a broad and deep understanding of the tools and concepts of the liberal arts and sciences. (pp. 9-12)

3. Student scores on undergraduate and/or graduate admission tests of subject matter knowledge and aptitude
NYU’s claim of Content Knowledge pertains to the knowledge of program completers. Faculty believes that admissions tests for undergraduates taken four or more years prior to graduation are not valid measures of the claim because they are distal in time and not well aligned with the constructs in content. Admissions tests are optional for graduate admissions and few students submit them.

4. Standardized scores and gains of the program graduates’ own students
In its Brief, NYU used the VAM test score gains of the pupils of graduates teaching in grades 4-8 in the NYC public schools to measure Clinical Competence. Recently, the NYC Department of Education (NYCDOE) discontinued the calculation of VAM measures and transitioned to the use of Growth Percentile Measures (GPM), which are used by the NYS Education Department as part of its new teacher evaluation system. This system has been the focus of political and collective bargaining and we are in negotiations with NYCDOE to obtain release of the data for our graduates. We expect a successful conclusion to these negotiations and anticipate receiving these data in time for the next TEAC Annual Report in 2014.

Ratings

5. Ratings of portfolios of academic and clinical accomplishment
Portfolio data were not included in the original Brief because of concerns about logistics, cost, and low reliability of the measures. Recently, there have been advances in portfolio technology and increased interest as part of the institution of a new evaluation system for prospective teachers by NYS. We are conducting due diligence of the new systems and plan on piloting some for possible adoption. We anticipate that these data will be available for the next Brief.

6. Third-party rating of program's students
NYU considered using third-party ratings of program students but determined the procedures to be not feasible logistically. However, the faculty considers this to be valuable additional evidence and will attempt to design feasible methods in the future.

7. Ratings of in-service, clinical, and PDS teaching
An important measure used to assess all four claims and the cross-cutting theme of Learning-to-Learn is the DRSTOS-R. This observation protocol is used by field supervisors to assess the developing pedagogical proficiency of student teachers in clinical practice. Evidence of empirical validity and reliability is presented in the Brief. (pp. 8-9)
NYU believes that in-service ratings of the teaching of its graduates can provide useful data for reflecting back upon the quality of graduates’ program preparation. As part of the institution of a new teacher evaluation system in NYS, all teachers will receive effectiveness ratings. NYU plans on obtaining these ratings for its graduates when the new system takes effect in 2014. The new state evaluation system will also rate pre-service teachers using the edTPA. NYU plans on using these ratings to supplement or replace the DRSTOS-R data.

8. Ratings by cooperating teacher and college/university supervisors, of practice teachers' work samples
Student teachers’ work samples are used as an important source of evidence for DRSTOS-R assessments. The work samples include journals, lesson plans, written reflections on practice, and pupil work. Field supervisors review the work samples and then use them holistically to arrive at the ratings of related DRSTOS-R items. This evidence is cited in the protocols completed by the field supervisors. (pp. 8-9)

Rates

9. Rates of completion of courses and program
The faculty believes these data are not valid measures of the claims and, therefore, they are not included in the Brief.

10. Graduates' career retention rates
NYU continues to obtain data from its Graduate Tracking Study to compute three-year retention rates for graduates teaching in the NYC public schools. These data are reliable and valid for assessing the claim that graduates are Caring Professionals who have the commitment and skill to sustain their careers in inner-city schools. (pp. 18-20)

11. Graduates' job placement rates
Job placement rates will not be used in future Briefs to support the claims, since they are subject to the vicissitudes of the job market. Accordingly, they are used by faculty for information purposes, but not tested against any program standard.

12. Rates of graduates' professional advanced study
NYU has been collecting these data in its Program Exit Surveys since 2009. Faculty believes additional data from future surveys will be needed in order to generate reliable estimates of rates of professional advanced study.

13. Rates of graduates' leadership roles
NYU will be collecting these data in a planned Five-Year Follow-Up Survey and they will appear in future reports.

14. Rates of graduates' professional service activities
NYU will be collecting these data in a planned Five-Year Follow-Up Survey and they will appear in future reports.

Case studies and alumni competence

15. Evaluations of graduates by their own pupils
NYU believes that the questionable reliability and validity of these data render the high resource expenditures required to collect them unwarranted.

16. Alumni self-assessment of their accomplishments
NYU will be collecting these data in a planned Five-Year Follow-Up Survey and they will appear in future reports.

17. Third-party professional recognition of graduates (e.g. NPTS)
NYU will be collecting these data in a planned Five-Year Follow-Up Survey and they will appear in future reports.

18. Employers' evaluations of the program's graduates
Principals’ ratings of all teachers will be part of the new NYS teacher evaluation system. NYU plans to obtain these data for its graduates and use them in future studies.

19. Graduates' authoring of textbooks, curriculum materials, etc.
NYU will be collecting these data in a planned Five-Year Follow-Up Survey and they will appear in future reports.

20. Case studies of graduates’ own pupils’ learning and accomplishment
NYU believes the cost of collecting these data would be excessive and the inferences that might be drawn from them concerning graduates’ effectiveness would have weak validity.

Other Data

21. Students’ self-ratings of growth during student teaching
NYU uses the ETFQ to assess student teachers’ perceptions of growth in Content Knowledge, Pedagogical Knowledge, and Clinical Skills. The results of this assessment have theoretical validity and have been consistent across many cohorts. (pp. 12-13)

22. Students’ dispositions to teaching
NYU has developed EBMAS, a survey that assesses students’ self perceptions of general teaching efficacy, personal teaching efficacy, social justice, and multicultural attitudes. EBMAS has demonstrated empirical validity and internal consistency reliability for measuring these dispositions which research has linked to teacher quality. (pp. 13-14)

23. Graduates’ ratings of their preparation for teaching
NYU conducts two surveys of teacher-education program graduates: the Program Exit Survey and the One-Year Follow-Up Survey. These surveys assess the extent to which graduates feel that the program has prepared them to be successful teachers. The surveys show consistency of results for successive administrations, convergence of findings between the two surveys, and consistency with the results from a source survey developed by Arthur Levine. In addition, the items are well aligned with NYU’s claims. (pp. 15-18)

24. Demographics of graduates’ schools of employment
Through its electronic graduate tracking study, NYU assesses the demographic characteristics of the NYC public schools in which graduates are employed. These data are used to assess the graduates’ commitment to working in inner-city schools, which is aligned with the claim of Caring Professionals. (pp. 18-19)