
“So you think you can tell…”: Why we are confident (and wrong) in AI detection

We are growing used to encountering content related to artificial intelligence (AI) on social media every day. This includes tools that help optimize our work, advertisements, and a great deal of content created simply for fun. Exaggerated images and unrealistic videos are now often used as internet memes, where the whole point is that we can tell that what we are seeing is not real. As a result, we may come to believe that spotting AI-generated content is easy. But is that really the case? What happens when the content is made with the intention of going undetected?

Take a moment to look at the faces shown below. Some of them are photographs of real people, and others were artificially generated. Which ones do you think are real? And how sure are you of your answers?


Can you tell which of these images are AI generated? Find the answers below the reference section.

This short exercise illustrates one of the central problems in AI detection: it is not only about whether we judge correctly, but also about how confident we feel while deciding. Research on this topic has revealed an interesting pattern: when people are asked to distinguish between real and AI-generated content, their accuracy is often surprisingly low, frequently close to chance level (Cooke et al., 2025; Diel et al., 2024; Fiedler & Döpke, 2025). At the same time, when asked how confident they are in their judgments, they rate their own skills as quite high (Köbis et al., 2021). In other words, people frequently feel certain about their ability to detect AI-generated content, even when their actual responses were incorrect. These findings are particularly consistent for hyperrealistic generations, such as synthetic human faces (Miller et al., 2023; Tucciarelli et al., 2022).
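To make the gap between accuracy and confidence concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the twelve trial outcomes and the confidence ratings are invented, not taken from any of the studies cited above, and the helper names (`binomial_two_sided_p`, `summarize`) are hypothetical. It computes a rater's accuracy, checks with an exact binomial test whether that accuracy can be told apart from coin-flipping, and reports how far average confidence exceeds accuracy.

```python
from math import comb


def binomial_two_sided_p(successes: int, trials: int, chance: float = 0.5) -> float:
    """Exact two-sided binomial p-value against a fixed chance rate."""
    probs = [
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(trials + 1)
    ]
    observed = probs[successes]
    # Add up every outcome that is no more likely than the observed one.
    tail = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, tail)


def summarize(correct: list[bool], confidence: list[float]) -> dict[str, float]:
    """Compare accuracy with mean self-rated confidence (both on a 0-1 scale)."""
    if not correct or len(correct) != len(confidence):
        raise ValueError("need one confidence rating per trial")
    hits = sum(correct)
    accuracy = hits / len(correct)
    mean_confidence = sum(confidence) / len(confidence)
    return {
        "accuracy": accuracy,
        "mean_confidence": mean_confidence,
        # Positive values mean the rater feels more certain than is warranted.
        "overconfidence": mean_confidence - accuracy,
        "p_vs_chance": binomial_two_sided_p(hits, len(correct)),
    }


# Invented example: 12 real-or-AI judgments, each with a confidence rating
# (stated probability of being right, from 0.5 = guessing to 1.0 = certain).
correct = [True, False, True, False, False, True, True, False, True, False, False, True]
confidence = [0.9, 0.8, 0.7, 0.9, 0.8, 0.8, 0.7, 0.9, 0.8, 0.7, 0.8, 0.8]

result = summarize(correct, confidence)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

With these made-up numbers the rater is right half the time while feeling about 80% sure, so the overconfidence gap is roughly 0.3, and the binomial test gives no reason to think performance differs from guessing. Real studies use far more trials and more careful measures of confidence, so this is only a sketch of the idea.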

“People frequently feel certain about their ability to detect AI-generated content, even when their actual responses were incorrect.”

This “mismatch” between accuracy in detection and confidence is especially relevant in the context of AI because AI-generated content has been specifically designed to look real, and to be as convincing as possible. To achieve this, generative models have progressively aimed to match and reproduce features that we instinctively rely on, while also removing many of the cues that we use for detection, such as visible irregularities, low-level artifacts, lighting inconsistencies, or unnatural smoothness (Karras et al., 2020). So, as these generative models improve, many of these signals become less available or less reliable.

Detection, then, is not just about thinking harder, paying more attention, or finding better reasoning strategies. In fact, interventions that try to improve detection by teaching participants what kinds of details to look for, such as inconsistencies or visual artifacts, have not proven very useful in the long run (Somoray & Miller, 2023).

Instead, it seems more likely that the answer lies in the way our perception itself is organized.

To understand this, we need to think of perception not simply as a “process” that happens inside our heads, but as something that develops constantly as we interact with the world. In daily life, we make sense of what is around us by interacting with it: looking around, listening, moving closer, comparing details, and so on. What we perceive depends on the information available in this interaction, along with our ability to pick up on it (Gibson, 1979; Lobo et al., 2018). Over time, we start noticing cues that help us understand what we have in front of us; however, not all of these cues are equally useful or meaningful in every context.

“Confidence might reflect how convincing the information we are attuned to feels, rather than how informative it actually is.”

What matters, then, is not just what we perceive, but which aspects of that information we can actually rely on. In psychology, this is sometimes described as “attunement”, which refers to becoming sensitive to certain signals in our environment and learning to use them to guide our judgments (Michaels & Carello, 1981). The biggest issue here is that, in the case of AI-generated stimuli, many of the cues we have developed and trusted to judge whether something is “real” are no longer reliable (Nightingale & Farid, 2022; Somoray & Miller, 2023). This helps explain why subjective confidence and detection accuracy can come apart: confidence might better reflect how convincing the information that we are attuned to feels, rather than how informative it actually is.

If the problem lies in the signals we are currently trusting, what can we do? We could assume that we simply need more knowledge about how AI works: if we get better at analyzing what we see, we might also become better at detecting. However, evidence so far suggests that this is likely not enough, as effort alone does not necessarily improve our accuracy (Köbis et al., 2021; Somoray & Miller, 2023). What we need instead is to become attuned to the right kind of information. This is no trivial challenge, as many of the cues we learn throughout our lives are still useful in everyday contexts even if they are no longer reliable for AI-generated content.

So, as generative models continue to produce increasingly realistic content, our challenge becomes not only technological but also perceptual: it concerns how we perceive what we see and how we judge whether our own certainty is justified. AI is changing not just the way we see things, but also how we decide what to trust. When the signals that guide our reasoning are no longer useful, our confidence in them does not guarantee that we are right. The key issue, then, is not simply whether we can tell what is real from what is not, but also whether the information we are paying attention to is actually useful for that purpose. In a world where artificially generated media looks progressively more authentic, our biggest challenge might not be just detecting mistakes, but also recognizing when our own sense of certainty is leading us in the wrong direction.

 

References

Cooke, D., Edwards, A., Barkoff, S., & Kelly, K. (2025). As good as a coin toss: Human Detection of AI-Generated Content. Communications of the ACM, 68(10), 100–109. https://doi.org/10.1145/3729417

Diel, A., Lalgi, T., Schröter, I. C., MacDorman, K. F., Teufel, M., & Bäuerle, A. (2024). Human performance in detecting deepfakes: A systematic review and meta-analysis of 56 papers. Computers in Human Behavior Reports, 16, 100538. https://doi.org/10.1016/j.chbr.2024.100538

Fiedler, A., & Döpke, J. (2025). Do humans identify AI-generated text better than machines? Evidence based on excerpts from German theses. International Review of Economics Education, 49, 100321. https://doi.org/10.1016/j.iree.2025.100321

Gibson, J. (1979). The ecological approach to visual perception. Houghton Mifflin.

Jacobs, D. M., & Michaels, C. F. (2007). Direct learning. Ecological Psychology, 19(4), 321–349. https://doi.org/10.1080/10407410701432337

Karras, T., Laine, S., Aittala, M., Hellsten, J., Lehtinen, J., & Aila, T. (2020). Analyzing and Improving the Image Quality of StyleGAN. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 8107–8116. https://doi.org/10.1109/cvpr42600.2020.00813

Köbis, N. C., Doležalová, B., & Soraperra, I. (2021). Fooled twice: People cannot detect deepfakes but think they can. iScience, 24(11), 103364. https://doi.org/10.1016/j.isci.2021.103364

Lobo, L., Heras-Escribano, M., & Travieso, D. (2018). The History and Philosophy of Ecological Psychology. Frontiers in Psychology, 9, 2228. https://doi.org/10.3389/fpsyg.2018.02228

Michaels, C. F., & Carello, C. (1981). Direct perception. Prentice-Hall.

Miller, E. J., Steward, B. A., Witkower, Z., Sutherland, C. A. M., Krumhuber, E. G., & Dawel, A. (2023). AI hyperrealism: Why AI faces are perceived as more real than human ones. Psychological Science, 34(12), 1390–1403. https://doi.org/10.1177/09567976231207095

Nightingale, S. J., & Farid, H. (2022). AI-synthesized faces are indistinguishable from real faces and more trustworthy. Proceedings of the National Academy of Sciences, 119(8). https://doi.org/10.1073/pnas.2120481119

Somoray, K., & Miller, D. J. (2023). Providing detection strategies to improve human detection of deepfakes: An experimental study. Computers in Human Behavior, 149, 107917. https://doi.org/10.1016/j.chb.2023.107917

Tucciarelli, R., Vehar, N., Chandaria, S., & Tsakiris, M. (2022). On the realness of people who do not exist: The social processing of artificial faces. iScience, 25(12), 105441. https://doi.org/10.1016/j.isci.2022.105441

Images B, E and F are artificially generated. The rest are real photographs.

Sources:
Real human faces: Flickr-Faces-HQ Dataset (Karras et al., 2019a).
Synthetic human faces: StyleGAN (Karras et al., 2019b).
Karras, T., Laine, S., & Aila, T. (2019a). Flickr-Faces-HQ dataset [Software]. GitHub. https://github.com/NVlabs/ffhq-dataset
Karras, T., Laine, S., & Aila, T. (2019b). A style-based generator architecture for generative adversarial networks. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 4401–4410. https://doi.org/10.48550/arXiv.1812.04948
