One serious problem with the current teacher evaluation system is how difficult it is to write an article on the subject. Teachers are afraid. Several of the teachers I asked for their opinions refused to comment on the evaluation systems, or told me their opinions “off the record.” Their jobs are at stake, and they believed that saying the wrong thing in print could get them in trouble. The education of Pittsburgh’s students, just as important as the jobs of these teachers, is also on the line. An education system in which a few words that may not sound politically correct could get a high-quality teacher in big trouble is compromising the future of its students.
It is always important to ensure that teachers are effective. Effective teaching matters as much to teachers and students, all of whom are directly involved in the process of learning, as it does to administrators looking at the test scores and the bottom line. According to Ms. Papale, Pittsburgh Obama’s ninth and eleventh grade English teacher, “We want all of our colleagues to be doing their share. It makes it easier on us.” But there are sometimes a few teachers who cannot maintain sufficient control of their classrooms, who do not seem to be invested in instilling knowledge in the students in their classes, or who simply do not seem to understand what they are teaching.
This is where teacher evaluation comes in. Just as it uses standardized tests for students, the Pittsburgh Public Schools district is currently using several tools to evaluate its teachers. Value-Added Measures, commonly abbreviated as VAM, is one such method. VAM attempts to measure the academic growth of students that can be attributed to a particular teacher. This is done by examining how students’ standardized test scores have improved, and by comparing the students’ test scores to those of other students. The Pittsburgh Public Schools are also currently implementing student and principal evaluations of teachers. The students of at least one class taught by each teacher evaluate that teacher using a survey called Tripod, which contains 89 questions relating to the teacher and the class. The principals at each school also rate the teachers there. All of the above factors are combined into a composite score that affects whether the teacher is in line to be laid off, to be put on an improvement plan, or to get pay raises and bonuses.
The Pittsburgh Public Schools recently received a $40 million grant from the Bill and Melinda Gates Foundation to improve the quality of teachers. The system that the Gates Foundation has pushed to be implemented at PPS is based on the system that Bill Gates had used – and recently abandoned – at Microsoft. Perhaps the worst part of the system is that it forces a certain percentage of employees to be placed into each of several categories. This means that it forces some teachers to fail each year. But it is unrealistic to say that, no matter how good a school is, some teachers have to fail. If the principal hired only the best applicants in the first place, as he or she would logically attempt to do, there might be no need for anyone to fail. The system fosters unhealthy competition among colleagues, and causes teachers an unnecessary amount of anxiety.
Starting this year, 50% of the teacher evaluation is based on an administrator’s classroom observation. The other 50% consists of student outcomes. The “student outcome” category can be divided further: VAM for a specific teacher counts for 30% of that teacher’s score, Tripod surveys for 15%, and VAM for the school in general for 5%. This information was obtained from a publication called “Education Committee Update: Empowering Effective Teachers,” published in January 2013. The publication uses idealistic and vague language, saying for example that the district’s goals are to “accelerate student achievement” and to “become a district of first choice.” Its method for doing this is to institute a strict, high-stakes teacher evaluation system that may not effectively distinguish good teachers from bad.
Evaluating a teacher is not easy to do. PPS is trying to make the system more quantitative, but this does not always mean that it is more objective. “There are 1,000 different ways to be a good teacher and 10,000 different ways to be a bad teacher… And just because you can’t punch the boxes doesn’t mean you’re a bad teacher,” says Mr. Boyce, a teacher at the Pittsburgh Gifted Center. A good teacher will instill knowledge in his or her students. Beyond that, there are many options and many different ways to be a good teacher.
Further, some of the things that teachers give their students are difficult to quantify. As summarized by Mark Rauterkus, a PPS father, the best thing a teacher can do is teach students “a thirst for knowledge and how to discover things for themselves. If a teacher teaches a student a love of learning in a subject, that’s fantastic.”
Mr. Dumbroski teaches eighth grade English at Obama, and is also involved in the teacher evaluation process as an administrator. According to him, a good teacher is “somebody who’d do whatever is humanly possible to get the most out of every single student with whom they come into contact.” For Mr. Dumbroski, a good teacher teaches more than academic lessons; he can teach social skills and life lessons as well. “Here’s a hint for how to be a good teacher,” Mr. Dumbroski says, “Remember that students are people, too.” In his classroom, Mr. Dumbroski attempts to connect with and teach each and every one of his students.
Mr. Kocur, Obama’s tenth grade English teacher, agrees. “First and foremost, a good teacher needs to be able to communicate with a variety of different kids.” It is important to Mr. Kocur that teachers have empathy. “Kids don’t care how much you know until they know how much you care,” Mr. Kocur says, quoting leadership expert John C. Maxwell.
To Ms. Hetrick, the most important quality of a teacher is “passion.” This includes passion for the content being taught, as well as passion for the process of teaching. It is necessary for a teacher to care passionately about his or her students in order to instill in them a passion for learning.
Yet whether a teacher makes a student want to learn, connects and empathizes with his students, cares about his students as individuals, or has passion is difficult to measure on any sort of evaluation.
Also, a teacher who is effective for one group of students may be ineffective for others, and different students show improvements at different rates. With increasing class sizes, it is becoming more and more difficult for teachers to teach to the individual students in their classes. Further, “mainstream”-level classes have been abolished, which has resulted in students of ever more varying abilities being placed in the same class. The teacher’s job is becoming more and more challenging.
The current default model for education at PPS is one in which students have minimal choice in their classes, and in which teachers have minimal choice in what is taught. Curricula are set by the district, and each student has to take a certain set of classes with few possible variations. Teachers have to cover a specified curriculum on which the students will be tested at the end of the year as mandated by the district. In a system in which neither the teacher nor the students have choice over what is taught, some of the results being evaluated may not be attributable to the teacher. However, there are alternatives to the current model. For example, the Pittsburgh Gifted Center is based on a different model. There, teachers design their own curricula so they are able to teach at a pace that they feel best fits the needs of the class. Additionally, students can choose the courses they want to take so they are often more motivated to participate in the classes in which they are enrolled.
The classroom observation component, half of teachers’ composite scores, is being shifted to a system called the Research-based Inclusive System of Evaluation, or RISE, in which teachers are rated in categories such as planning, instruction, and leadership on a scale of one to four based on their performance. “I actually think the RISE components do a pretty good job of identifying everything we’re looking for,” says Ms. Hetrick, Obama’s ceramics and IBDP visual arts teacher. The system is numerical and more standardized than previously used systems in which principal evaluations were based on value judgments. When Ms. Hetrick looks at the criteria, she sees the “distinguished” category as something she wants to work towards, and appreciates that the RISE rubric’s different levels seem to make sense.
Mr. Boyce points out the human factor that goes into the RISE system. “If you want to make me look like a good teacher you can make me look like a good teacher. If you want to make me look like a bad teacher, you can do that, too. I guess I kind of like the old go-or-no-go thing because I was in the military. RISE is just a fancy way of doing the same thing.” It is true that even in the RISE system, a principal has a lot of sway.
Pittsburgh Obama is fortunate to have a principal, Dr. Walters, who is a strong and fair leader. Yet principals like Dr. Walters are few and far between. One teacher from another school, who would like to remain unnamed, reports that his personal differences with his principal got in the way of her objectivity and brought down his rating. She rated him as “basic” in just enough categories that he would fail his RISE evaluation, despite the fact that his VAM scores were above the school’s average. “You can say it’s objective until you’re blue in the face,” agrees Mr. Kocur. “But it still comes down to an administrator walking in and saying what he thinks of you.”
The VAM system, which makes up 30% of teacher scores, seems mathematically pure at first glance, but does not necessarily treat all teachers fairly. Mr. Boyce believes that the sample of students in one classroom is not big enough. “They take a sample size of 30 students and apply that to 10,000. Because really, they’re using my students to say how I’d perform across the board. And that’s not realistic. In research science, a sample size of 30 typically means nothing but some preliminary results that could lead to further research.”
While the VAM system assumes that students are randomly assigned to teachers, this is rarely the case. Students can sometimes, but not always, choose their schools and classes. Differences in students from class to class greatly affect how the scores turn out. Mr. Boyce’s classes at the Gifted Center are more likely to perform well because many of his students chose to be in his class.
Further, a study by the Gates Foundation has shown that VAM is more applicable to math than it is to language subjects. Children learn language from a variety of sources, including family and peers as well as school, while they learn math primarily from their teachers. Yet in the Gates Foundation model, VAM is being used across the board in both math and language subjects.
However, VAM is not being used in all subjects. It is used in the subjects for which there are standardized tests: English, math, and some sciences and social studies classes. Teachers for other subjects get a different type of score, called a 3f, which shows student growth based on criteria that the teacher decides at the outset. While it is unfair to grade different teachers on different standards, it is also unfair to the students to create standardized tests in even more subjects simply as a way to grade the teachers. Ms. Berry, a middle school math teacher at Obama who is involved in the Pittsburgh Federation of Teachers and who is a member of the committee that decides on evaluation criteria, says that “they shouldn’t test kids in every subject, because then it’s just a test for the teacher. I think it’s just mean to kids, the amount of testing they do…. Some things you should learn just because they are beautiful to learn.”
Student surveys, the Tripod questionnaires, make up 15% of teachers’ scores in the new system. According to Mr. Boyce, to be a good teacher “you teach for a long time, and you don’t repeat the things that don’t work. You do repeat the things that do work, and you throw something new in there every once in a while.” Mr. Boyce believes that the student Tripod surveys are helpful in this way. They allow him and other teachers to recognize their weaknesses and find things that they can improve on. The Tripod surveys allow him to better understand how his students feel about him, to fix misconceptions, and to work on problems. Student feedback is useful if it can improve a teacher’s practices without putting too much at stake.
There are, however, some problems with high-stakes student surveys such as Tripod. “I don’t like some of the questions on Tripod,” explains Mr. Schaefer, a history teacher at Obama. Of the eighty-nine questions on the Tripod survey, many depend on factors that are not within the teacher’s control. It is a joke among students and teachers at Obama and other PPS schools that one of the questions is whether the class “feels like a big, happy family.” In fact, many students do not take the survey seriously in general. Some are fatigued by the length of the questionnaire and the fact that many questions are repetitive (evidently in order to ensure fairness), while others allow their personal feelings about a teacher to taint the results. Yet it is not a joke that such subjective questions are putting teachers’ jobs and student education on the line. It should not be expected at a middle school or high school level that the class “feels like a big, happy family,” and whether it does or doesn’t likely depends more on the students than the teacher.
Mr. Dumbroski believes that student feedback is useful, but that the Tripod survey does not go about collecting feedback the right way. Instead, he has his students write down on note cards at the end of each grading period what they feel he did right and what they would want him to work on. One suggestion that he received in this way was to give the students more choice; in response, he began to allow students to propose their own ideas for projects, rather than always choosing from a list of suggestions. Also, some students would write on the note cards that the pace of the class was too fast, while others would write that it was too slow. As a result, Mr. Dumbroski has begun to give more individual attention to students in his class. Mr. Dumbroski feels that this note card system lets him understand what is most important to his students more effectively than the Tripod does.
Colleges and universities are beginning to count student evaluation of teachers for less and less. According to Mr. Boyce, studies have begun to discredit such evaluations. “This is the exact system that colleges and universities are getting away from,” says Mr. Boyce. “Statistically, kids who perceive their grades as being good in a class are more likely to rate their teachers higher. Teachers can then manipulate the system by inflating grades.” So it is questionable whether a system that is being abandoned at the college and university level should be taken up and implemented for middle and high schools.
The last 5% of teachers’ scores is based on the building’s VAM scores in general. This is illogical, as the teacher has little or no influence on how students perform in a class down the hall. VAM’s inability to control for variables is magnified when it is applied to classes that a teacher doesn’t even teach.
As the system is currently set up, the above factors will be tallied up and turned into a score out of 300. If a teacher gets less than 140 points, that teacher is considered failing and put on an improvement plan. If he or she does not improve sufficiently within two years, the teacher will be fired. Teachers were given scores last year, although the scores will not count as grounds for being fired until after this year.
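The tally described above can be sketched in a few lines of code. This is only an illustration, not the district's actual formula: the weights (50% observation, 30% teacher VAM, 15% Tripod, 5% school VAM), the 300-point scale, and the 140-point cut-off come from this article, but the assumption that each component is first expressed as a fraction between 0 and 1 and then scaled linearly, along with every name in the code, is hypothetical.

```python
# Weights as reported in this article (fractions of the composite score).
WEIGHTS = {
    "observation": 0.50,  # administrator's classroom observation (RISE)
    "teacher_vam": 0.30,  # VAM for the individual teacher
    "tripod": 0.15,       # student Tripod surveys
    "school_vam": 0.05,   # VAM for the building as a whole
}
MAX_POINTS = 300  # composite is reported out of 300
CUTOFF = 140      # below this, a teacher is considered failing


def composite_score(components):
    """Combine component ratings into a score out of 300.

    ASSUMPTION: each component is a fraction in [0, 1] and the
    composite is a simple weighted sum scaled to 300 points.
    The district's real conversion is not given in the article.
    """
    if set(components) != set(WEIGHTS):
        raise ValueError("expected exactly these components: %s" % sorted(WEIGHTS))
    for name, value in components.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError("%s must be between 0 and 1" % name)
    weighted = sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
    return MAX_POINTS * weighted


def is_failing(score):
    """A teacher with fewer than 140 points is considered failing."""
    return score < CUTOFF


# Hypothetical teacher: weak observation rating, stronger student outcomes.
example = {
    "observation": 0.40,
    "teacher_vam": 0.60,
    "tripod": 0.50,
    "school_vam": 0.50,
}
score = composite_score(example)
print(round(score, 1), "failing" if is_failing(score) else "not failing")
```

Under these assumptions, the observation rating alone moves the total by up to 150 points, while the building-wide VAM moves it by at most 15.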
Last year, nine percent of teachers were placed in the lowest category. Nationally, fewer than one percent of teachers are failed, although teachers in other districts are graded with different standards. If last year’s results are any predictor of this year’s, Pittsburgh’s teachers will be failed at roughly ten times the national rate. The Pittsburgh Federation of Teachers, Pittsburgh’s teacher union, believes that the 140-point cut-off is far too high.
The Bill and Melinda Gates Foundation and Superintendent Lane both disagree. The Gates Foundation has been hinting at withdrawing the remaining $15 million of the $40 million grant that it awarded to the Pittsburgh Public Schools if the district does not comply with the 140-point standard. “We have not made any decisions about the future of the grant, but we are continuing to watch this very carefully,” reports Vicki Phillips, the director of College-Ready Education at the Gates Foundation.
This puts the pressure on the district and Dr. Lane to go along with the Gates Foundation’s strict cut-off. Dr. Lane says, “I’m still a firm believer that there is a correlation between effective teaching and student learning outcomes.” To back this up, she cites a study that concludes it is twice as likely for students who show significant improvement to have teachers in the top category. What she does not mention is that sometimes there are students who show improvement under “failing” teachers or who show less improvement under top-rated teachers.
The Pittsburgh Federation of Teachers and Dr. Lane are currently at odds over the issue of teacher evaluation. “I thought we were partners in reform, but the partnership [with the union] has been rocky, let’s just say that,” Dr. Lane says. While both agree that effective teaching is important, they have strong differences when it comes to how to implement teacher-evaluation reform.
Ms. Berry acknowledges that not all teachers are good teachers. To be a good teacher, it is necessary to have coherent lessons, a good relationship with students, and a desire to work. “It has a lot to do with personality,” she says. Implementing any plan to improve failing teachers is difficult, because some people do not have the personality for teaching. “It does take a long time,” Ms. Berry says, “to improve a poor teacher. I don’t know how you’d set it [an improvement plan] up so that it’d be fair to teachers and students.” On the one hand, no one wants to fire a teacher who may be simply misunderstood, but on the other hand, the students and the whole system suffer under poor teachers.
To Mr. Dumbroski, the problem is that “No matter what, any type of evaluation tool, there’s going to be something wrong with it.” No matter how perfect the RISE criteria can be, the people who are checking the boxes are not perfect. No matter how many numbers and mathematical equations are used to compute RISE, VAM, and Tripod scores, the result is variable and subjective. There is a certain amount of bias that is impossible to remove.
To Mr. Kocur, teachers are not the problem in the first place. “At some point, people should stop being politically correct and put the blame where it is due. Behind nearly every good student, there is a supportive parent, and vice versa.” Teachers are held responsible by national, state, and local governments (including the school board), by the media, and by philanthropic organizations like the Gates Foundation for failures in student education. But according to Mr. Kocur, students should be learning from their parents before they can talk. Children who have parents who read to them, who help them with homework, and who model and encourage a positive attitude towards learning, are more likely to succeed in school than those who do not. This is as important as teachers are to student education, yet it is rarely considered in government laws and plans, by the media, or by philanthropic organizations who want to donate to education.
The way in which teacher evaluation tools should be utilized is debatable. While some of the measures that PPS is taking are more legitimate than others, it is clear that there are many problems with the current system. The PPS students and teachers are hoping that Superintendent Lane, the school board, and the Bill and Melinda Gates Foundation will consider how difficult it is to rank teachers by how well they teach, before making scores so critical to a teacher’s future.
Sunday, February 09, 2014
Fwd: The Eagle on PPS Teacher Evaluations by Lucy N, a swimmer.