The Gates Foundation's $575 million Intensive Partnerships for Effective Teaching program used algorithmic assessment systems to evaluate teacher performance, failed to achieve its educational goals, and harmed teachers' careers.
The Gates Foundation implemented a $575 million program, Intensive Partnerships for Effective Teaching, that used data-driven algorithms to assess teacher performance in public schools. The initiative combined data from multiple sources, including student test scores, principal observations, and student and parent surveys, to rate teacher effectiveness using value-added models. The goal was to reward effective teachers, remove ineffective ones, and narrow the achievement gap for low-income minority students. An independent evaluation by the RAND Corporation found that the initiative did not achieve its goals for student achievement or graduation rates, particularly for the targeted population of low-income minority students. The algorithmic assessment systems were described as "little better than random number generators," with secret formulas that prevented expert review. The program resulted in teachers being unfairly evaluated and driven out of the profession during a nationwide teacher shortage. The value-added models and other assessment measures were found to be statistically weak and biased, yet were used for high-stakes decisions about teacher promotion and termination.
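The entry above refers to "value-added models," a family of statistical methods that estimate a teacher's contribution as the deviation of their students' scores from the growth predicted by prior achievement. The program's actual formula was kept secret, so the following is only a minimal illustrative sketch of the general technique; all names, data, and model choices here are assumptions, not the program's method.

```python
# Minimal illustrative sketch of a value-added model (VAM):
# fit "expected growth" (current score ~ prior score) across all
# students, then score each teacher by their students' mean residual.
# Synthetic data only; the real program's formula was not public.
import random
import statistics

random.seed(0)

# Synthetic roster: (teacher_id, prior_score, current_score)
students = []
for teacher in ["A", "B", "C"]:
    for _ in range(30):
        prior = random.gauss(50, 10)
        # Current score driven mostly by prior achievement plus noise.
        current = 5 + 0.9 * prior + random.gauss(0, 8)
        students.append((teacher, prior, current))

# Least-squares line current ~ prior: the "expected growth" baseline.
priors = [p for _, p, _ in students]
currents = [c for _, _, c in students]
mean_p = statistics.fmean(priors)
mean_c = statistics.fmean(currents)
slope = (sum((p - mean_p) * (c - mean_c) for _, p, c in students)
         / sum((p - mean_p) ** 2 for p in priors))
intercept = mean_c - slope * mean_p

def value_added(teacher_id):
    """Teacher's VAM estimate: mean residual over their students."""
    residuals = [c - (intercept + slope * p)
                 for t, p, c in students if t == teacher_id]
    return statistics.fmean(residuals)

effects = {t: value_added(t) for t in "ABC"}
```

The sketch also hints at why such scores can be noisy: with small classes, a teacher's "effect" is an average of residuals dominated by the random noise term, which is one of the statistical weaknesses the RAND evaluation criticized.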
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
AI systems that fail to perform reliably or effectively under varying conditions, leading to errors and failures that can have significant consequences, especially in critical applications or domains that require moral reasoning.
AI system
Due to a decision or action made by an AI system
Unintentional
Due to an unexpected outcome from pursuing a goal
Post-deployment
Occurring after the AI model has been trained and deployed