We learn, therefore we score?

“All desks apart, we start with a test today!” In the Netherlands, primary schools are currently required to participate in a ‘pupil monitoring system’ that regularly evaluates students’ academic performance (math and language) with standardized tests. The claimed advantage of standardized tests is that they provide an ‘unbiased’ record of a student’s progress over time. While it might be a good idea to track children’s academic development, there are also some serious disadvantages that should make us cautious.

Quantum Mechanics in the Classroom: The Heisenberg Effect

In recent years, testing has become an integral part of schooling. Researchers have pointed out that, because of this trend, schooling might start to resemble test-training. Amrein and Berliner (2002) call this the Heisenberg effect, a term adapted from quantum mechanics:

“The more important any quantitative social indicator becomes in social decision-making, the more likely it will be to distort and corrupt the social process it is intended to monitor”

For example, if schools’ average test scores are made public, some schools can eventually become stigmatized as, let’s say, low-performing. This may lead them to find ways to improve students’ scores, for example by intensive test-training in the classroom, or by refusing admission to students with learning disabilities. And there you have your Heisenberg effect: the instrument that was initially intended to monitor students’ learning is now partly determining what students practice in school.

Testing Noise

The tests themselves may have pitfalls as well. Messick (1989) first raised the problem of ‘construct-irrelevant variance’, or noise, by showing that students with reading difficulties often score lower on math tests that require reading. This was one of the first indications that standardized tests measure not only the construct they claim to measure, but that their outcome is also strongly influenced by other, interfering factors. These factors include students’ reading difficulties (that is, if the test is not a reading test), attention problems, and communication problems, such as a limited vocabulary or difficulty interpreting questions or verbalizing answers. This is a considerable problem for students with learning difficulties, who score significantly lower in all academic domains (Reid et al., 2004). In our study, for example, we found that special needs students scored significantly lower than their peers in regular education on two standardized tests of academic performance. When the students worked on hands-on scientific tasks together with a teacher, however, we found no substantial differences in their level of reasoning (Van Der Steen et al., 2012).

“Test scores are not always the objective context-independent measures of students’ understanding they are claimed to be”

Universally Designed

If we want to eliminate the disadvantages of standardized tests as much as possible, we might be better served by universally designed testing methods. This would not only help to reduce construct-irrelevant variance, but it might also change society’s ideas about the accuracy of single standardized test scores and the consequences attached to them (although, I admit, it would not offer a complete solution for the Heisenberg effect).

“Applying the universal design principles to standardized tests reduces barriers in educational material and instruction by providing accommodations and supports for all students, including students with disabilities or developmental delays” (Rose & Meyer, 2002)

For example, computerized universally designed tests would contain text-to-speech software, a built-in dictionary to help students understand the wording of the questions, and other suitable facilities. Students would be able to either type in their answers or record their verbal answers. In the case of multiple-choice questions, it is even possible to let the computer program assess students’ performance automatically and adaptively select the next item.

In sum, scoring on tests does not equal learning, but if we do use tests, all children should get an equal opportunity to score well. A ‘universal design’ would greatly increase the accessibility of tests for all student populations, even for those who are now lagging behind.

Relevant Publications and Links

Amrein, A. L., & Berliner, D. C. (2002). High-stakes testing & student learning. Education Policy Analysis Archives, 10, 1-74.

Messick, S. (1989). Meaning and values in test validation: The science and ethics of assessment. Educational Researcher, 18, 5-11. doi: 10.3102/0013189X018002005

Reid, R., Gonzalez, J. E., Nordness, P. D., Trout, A., & Epstein, M. H. (2004). A meta-analysis of the academic status of students with emotional/behavioral disturbance. The Journal of Special Education, 38, 130-143. doi: 10.1177/00224669040380030101

Rose, D. H., & Meyer, A. (2002). Teaching every student in the digital age: Universal design for learning. Alexandria, VA: Association for Supervision and Curriculum Development.

Van Der Steen, S., Steenbeek, H., Wielinski, J., & Van Geert, P. (2012). A Comparison between Young Students with and without Special Needs on Their Understanding of Scientific Concepts. Education Research International, 2012. doi: 10.1155/2012/260403


NOTE: “Taking a test” by Renato Ganoza is licensed under CC BY 2.0.

Dr. Steffie van der Steen completed her Master’s in Mind, Brain and Education at Harvard University, where she received an intellectual contribution/faculty tribute award. After finishing her Master’s, Steffie joined the University of Groningen as a PhD student in Developmental Psychology. She defended her dissertation, for which she studied the longitudinal development of children’s understanding of scientific concepts, in May 2014. In her research, Steffie takes a process approach, focusing on the person-context interactions and the variability that constitute development and learning, rather than on group averages. She has worked with both typically developing children and children with emotional/behavioral problems. In addition to continuing her research, Steffie has been appointed assistant professor at the Open University, where she teaches several statistics-related courses. At the University of Groningen she lectures in the first-year course Developmental Psychology. She has made research visits to Harvard University and Florida Atlantic University. Steffie dreams of writing a children’s book one day, if she finds the time. For more information, you can visit this website.

Select publications

Van Der Steen, S., Steenbeek, H., Van Dijk, M. W. G., & Van Geert, P. (2013). A Process Approach to Children’s Understanding of Scientific Concepts: A Longitudinal Case Study. Learning and Individual Differences. doi: 10.1016/j.lindif.2013.12.004

Van Der Steen, S., Steenbeek, H., & Van Geert, P. (2012). Using the Dynamics of a Person-Context System to Describe Children’s Understanding of Air Pressure. In H. Kloos, B. J. Morris, & J. L. Amaral (Eds.), Current Topics in Children’s Learning and Cognition (pp. 21-44).

Van Der Steen, S., Steenbeek, H., Wielinski, J., & Van Geert, P. (2012). A Comparison between Young Students with and without Special Needs on Their Understanding of Scientific Concepts. Education Research International, 2012. doi: 10.1155/2012/260403
