A teacher is observed in her first period class and gets a low rating; in her second period class she gets higher marks. She’s teaching the same material in the same way — why are the results different?

A new study points to an answer: the types of students teachers instruct may influence how administrators evaluate their performance. More low-achieving, black, Hispanic, and male students lead to lower scores. And that phenomenon hurts some teachers more than others: Black teachers are more likely to teach low-performing students and students of color.

Separately, the study finds that male teachers tend to get lower ratings, though it’s not clear if that’s due to differences in actual performance or bias.

The results suggest that evaluations are one reason teachers may be deterred from working in classrooms where students lag farthest behind.

The study, conducted by Shanyce Campbell at the University of California, Irvine, analyzed teacher ratings compiled by the Measures of Effective Teaching Project, an effort funded by the Bill and Melinda Gates Foundation. (Gates is also a supporter of Chalkbeat.)

The paper finds that for every 25 percent increase in black or Hispanic students taught, there was a dip in a teacher’s rating similar to the difference in performance between a first- and second-year teacher. (Having more low-performing or male students had a slightly smaller effect.)

That’s troubling, Campbell said, because it means that teachers of color, who most frequently work with students of color, may not be getting a fair shot.

“If evaluations are inequitable, then this further pushes them out,” Campbell said.

The findings are consistent with previous research that shows how classroom evaluations can be biased by the students teachers serve.

Cory Cain, an assistant principal and teacher at the Urban Prep charter network in Chicago, said he and his school often grapple with questions of bias when trying to evaluate teachers fairly. His school serves only boys, and its students are predominantly black.

“We’re very clear that everyone is susceptible to bias. It doesn’t matter what’s your race or ethnicity,” he said.

Although Cain is black, that doesn’t mean he doesn’t see how black boys are portrayed in the media, he said. He also knows that teachers are often nervous they will do poorly on their evaluations if students are misbehaving or struggling with the content on a given day, since it can be difficult for observers to fully assess their teaching in short sessions.

The study, co-authored by Matthew Ronfeldt of the University of Michigan, can’t show why evaluation scores are skewed, but one potential explanation is that classrooms appear higher-functioning when students are higher-achieving, even if that’s not because of the teacher. In that sense, the results might not be due to bias itself, but to conflating student success with teacher performance.

Campbell said she hopes her findings will add nuance to the debate over the best ways to judge teachers.

One idea the study floats is adjusting evaluation scores based on the composition of the classroom, similar to what is done for value-added scores, though that idea has received some pushback, Campbell said.

“I’m not saying we throw them both out,” Campbell said of classroom observations and value-added scores. “I’m saying we need to be mindful.”