Paper The following article is Open access

Visualising relativity: assessing high school students' understanding of complex physics concepts through AI-generated images

Maira G de Souza, Mihye Won, David Treagust and Agostinho Serrano

Published 6 February 2024 © 2024 The Author(s). Published by IOP Publishing Ltd
Citation: Maira Giovana de Souza et al 2024 Phys. Educ. 59 025018, DOI 10.1088/1361-6552/ad1e71

0031-9120/59/2/025018

Abstract

This study investigates how students utilized artificial intelligence (AI)-generated images to represent their understanding of general relativity concepts. Ten high school students participated in an extracurricular course on relativity theory. Using an AI chatbot, these students created visual representations of 'relativity' before and after the course. The produced images, the accompanying prompts, student interviews, and their test scores were analysed to examine students' conceptual understanding and interactions with AI. Students with a clearer understanding of relativity tended to focus their prompts on more central concepts like spacetime deformation. In contrast, those with a weaker understanding leaned towards more tangential ideas. The clarity of their prompts was directly linked to more effective AI interactions, leading to more meaningful image generation. Despite this, some students faced challenges in crafting coherent prompts, resulting in less relevant images, indicating that understanding the concept does not always translate into successful AI engagement. The study underscores the potential of AI-generated images as a tool to illuminate student conceptualisation and interaction skills with AI in the context of complex physics concepts, offering a novel approach to evaluating understanding in advanced scientific topics.


Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 license. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

With the current advent of generative artificial intelligence (AI), its usage in educational fields has fostered diverse new possibilities [1] which change how we can approach the teaching and learning of complex concepts, such as Einstein's relativity theory (RT).

RT currently provides the most accepted and accurate description of the universe [2]. However, its concepts challenge our intuitive understanding of the world because relativistic phenomena are not observed directly in everyday life [3, 4]. Grasping these counterintuitive concepts therefore requires strong reasoning skills and more abstract and flexible conceptions of reality. Visualization skills are crucial for abstract reasoning [5], and students' many difficulties with the visualization of relativistic phenomena [6, 7] present a challenge for educators. In this context, research on students' visualization abilities is essential for a deeper understanding of the theory.

Therefore, resources that aid both the visualization of relativistic effects and the externalization of students' imagery processes become significant, and AI-generated images are a tool with remarkable potential. This new resource not only helps students' visualization, but also provides a window into their cognitive and imagery processes, thereby helping educators to identify students' difficulties and conceptions.

There are already some studies involving the use of student selected or generated images. By exploring students' visual literacy skills and the use of images in academic work, Matusiak et al [8] found that students lack skills in selecting, evaluating, and using images. The use of student-made images can also promote engagement and motivation, because the use of visual thinking has been found to improve student satisfaction and learning outcomes [9]. Moreover, visual thinking provides alternative assessment methods [10] that can allow students to demonstrate their understanding in a creative and engaging manner.

Student-created images can also improve students' observation skills [11]. This approach also allows students to express their thinking through the use of images, promoting higher-order thinking skills. Moreover, artistic reflection and image creation have been found to be effective in fostering students' reflective thinking skills, such as critical analysis, and evaluation skills [12].

Consequently, the use of student-made images can be a valuable educational approach. In this context, AI-generated images bring a new perspective and can be a powerful tool for learning purposes. However, it is important to note that, beyond visual literacy and thinking skills, a completely new set of skills is required, because students are interacting with new external resources [13]. Students must be able to interact adequately with the AI through the chatbot; that is, they must know how to write the prompts, how to modify the generated images and how to interpret them. Even though the interaction with the chatbot happens through natural language (NL), programming-type thinking is necessary [14].

Considering the scenario discussed, this study deals with students' use of AI-generated images about a complex topic, RT. The focus of the present research consists of investigating the conceptual focus of the students about their AI-generated images. In doing so, we aim to answer the following question: 'How do students approach AI to represent and express their understanding of general relativity?'. Moreover, we also analysed the factors that influenced the quality of students' images.

2. Methodology

2.1. Research context and intervention

The present study was developed with ten year-12 students at a Brazilian public school. The students were invited by the first author, who was their physics teacher, to participate in a short extracurricular course called 'Einstein's Relativity: from GPS to black holes'. Pre-tests and post-tests were answered by students before and after the course activities. The tests consisted of 11 conceptual questions, ten of multiple choice and one open question, that were validated by three experts—one in physics and two in physics education.

The test was focused on the concept of relative spacetime, covering Special and General Relativity. The 11 questions were divided into two sections: Questions 1–6 explored Special Relativity scenarios, addressing space contraction (Questions 1 and 2) and time dilation (Questions 3–6). Questions 7–11 delved into General Relativity, covering gravitational time dilation (Questions 7 and 8), space deformation (Questions 9 and 10), and spacetime curvature and gravity (Question 11).

Each multiple choice question had five answer options: one with the scientifically correct answer, two with scientifically correct elements but also including alternative conceptions, and two with only alternative conceptions elements.

After the end of the course, all students were interviewed individually through the Report Aloud protocol [15]. Using this protocol, a constant dialogue between the interviewer and interviewee was developed, where the students described what they were thinking at the time they performed some task. The focus consisted of investigating students' reasoning processes during specific activities.

The interviews were recorded and fully transcribed. For the present work, the excerpts dealing with AI-image generation were first literally translated into English and then adjusted for sentence structure. This process allowed a smoother reading while maintaining the meaning of the transcript.

The objective of the developed course was to promote learning about RT, by using multiple representations [16]. Therefore the students interacted with different resources, namely, videos, images, computer simulations, experiments, group activities and generative AI, to approach relativistic phenomena. In the present paper, we discuss one of the generative AI activities in detail as well as the main results.

2.2. AI image-generation activity

Students were also engaged in an activity using generative AI right before and after the course. Using Bing AI through their own smartphones [17], the students were asked to generate an image for the concept 'relativity'. For that, they received a written guide on how to use the tool. The students were prompted to think about what they imagine for the concept, and then they articulated their visual image with written text on the worksheet. Using their description as input, the students asked Bing AI to generate an image. The AI, by default, generates four images. From these, the students selected the one that most accurately mirrored their envisioned concept.

After the image generation, the students analysed how the selected image aligned with or diverged from their initial expectations. In instances where the outcome differed from what they anticipated, they were encouraged to critically evaluate their descriptive prompts, consider potential modifications, and regenerate images. After the activity, they submitted their final chosen images and handed over the completed worksheet to the teacher for review.

2.3. Data analysis

This analysis has a greater emphasis on the post-test results (questionnaire and images) based on the information provided by students after the activities. In this regard, students' written descriptions (prompts) were examined, focusing on the clarity and the main concepts used.

Structured prompts that show clarity and specificity and provide the necessary context can be considered good prompts [18]. However, as we aimed to investigate the meaning of the images and concepts involved, how accurately the images represented students' thoughts was also considered.

In this sense, a prompt was considered effective if the student expressed satisfaction with the resulting image. Since the focus was on whether students could use prompts to externalize their conceptual understanding through images, success was defined as whether the resulting image matched what they had imagined.

Therefore, to identify what students meant with their images based on the concepts involved, and if these images were accurate (considering what the students imagined), the interview excerpts were also analysed. The students' explanations were compared to their prompts and images.

Consequently, a match between student's expectations and the AI output indicates an ability to effectively create prompts to guide the AI, even if the conceptual accuracy is limited. If the resulting image met the student's expectations, this meant that he or she could create a good prompt.

To identify the concepts in focus during the process, the keywords used by the students in the prompt and in their interview explanations were highlighted. Afterwards, these keywords were compared to the key concepts of relativity. Finally, looking at the images, it was possible to assess students' conceptual understanding.

After the image analysis, students' test scores were calculated. The score analysis focused only on identifying and differentiating the image generation processes and prompts of students with higher and lower scores.

To calculate the scores, only the ten multiple choice questions were considered. For each question, five points were awarded for the correct answer, three points for either of the two partially correct answers, and no points for either of the two completely wrong answers. The maximum score of the test was 50 points.
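As a minimal sketch of this scoring rule, the calculation can be written as below. The category labels, the function name `score_test` and the example answer sheet are illustrative assumptions, not part of the study's materials or data.

```python
# Points per answer category, following the rule described in the text:
# 5 for the correct option, 3 for a partially correct option, 0 otherwise.
POINTS = {"correct": 5, "partial": 3, "wrong": 0}


def score_test(categories):
    """Return the total score for ten multiple choice answers.

    `categories` is a list of ten strings, each one of
    "correct", "partial" or "wrong" (hypothetical labels).
    """
    if len(categories) != 10:
        raise ValueError("expected exactly ten multiple choice answers")
    return sum(POINTS[category] for category in categories)


# Hypothetical answer sheet, not a real student's data:
# six correct, three partially correct and one wrong answer.
example_sheet = ["correct"] * 6 + ["partial"] * 3 + ["wrong"]
print(score_test(example_sheet))  # 6*5 + 3*3 + 0 = 39
```

With ten correct answers the function returns 50, the maximum score stated above.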

3. Results and discussion

Through the results it was possible to identify different focuses and interactions between the students and the AI image-generator. To illustrate these scenarios, three exemplary students are discussed here. To protect the students' identities, pseudonyms were employed for reference.

3.1. Images and prompts

The first student, Luke, generated an image similar to the representations commonly used to deal with General Relativity. In the generated image (figure 1) there is a golden grid in which stars are reflected, and the bright light source in the centre has its rays bent. The key concept of relativity, that mass can deform spacetime, can be identified in the image, which makes it meaningful with respect to RT.

Figure 1. Luke's AI-generated image, in which it is possible to observe the idea of curved spacetime and light bending.

Looking at the prompt used by Luke, the image seems coherent with the student's description because he mentioned the curved spacetime and light bending in a concise sentence.

PromptLuke: "I imagine a sheet representing the space with massive stars bending the space and changing the light trajectory".

Moreover, the student was satisfied with the result obtained, as he mentioned in the interview that it was even better than he had expected. Luke related the word 'relativity' to the four-dimensional universe deformed by massive objects, as he explained:

"It was much more related to space, there were many more stars. The spacetime sheet [gave me] the idea of a four-dimensional universe that included the time being bent by massive objects."

Analysing the student's prompt and speech, we could identify that he focused on spacetime deformation caused by massive objects. He used this idea, one of the key concepts of General Relativity (GR) [2], to generate the image. As he could externalize and describe his thoughts adequately, Luke had the desired output from the AI.

Moreover, Luke showed a good understanding of RT, so he could articulate the concepts related to the theory to construct a coherent prompt and generate an image with physical meaning. As evidenced by the positive result obtained, Luke had adequate skills to communicate in NL what he meant and interacted successfully with the AI.

Another student, Sam, focused on similar concepts as Luke; however, the results were completely different. The first prompt provided by her was quite vague.

PromptSam(a): "The relativity is a theory about space and time".

As a result, Sam was not satisfied with the first image, which did not represent the concepts that she had thought of as 'relativity'—space and time. Therefore, Sam tried to modify the image using the second prompt:

PromptSam(b): "I think of a plane deformed sphere comparing to the Earth".

According to her, the resulting image (figure 2) still was not as she had expected:

"... [relativity] would be like a trampoline but the AI didn't understand what I wanted... The trampoline is bent because the black hole deforms spacetime as shown on this trampoline, as I imagined. But then I couldn't describe this [deformation and] it didn't understand me. I also couldn't express what I was imagining."

Figure 2. Sam's AI-generated image. The black object beside the Earth is possibly an attempt to represent deformed space.

Therefore, analysing Sam's description during the interview, her focus during the image generation was the spacetime deformation caused by massive objects. These were the same concepts used by Luke, but expressed through a trampoline analogy instead of the lycra sheet.

However, the AI-generated image was completely different. It shows the Earth and a black object beside it, possibly the result of the student's attempt to represent the space distortion using 'plane deformed sphere'. As a 'plane deformed' object seems contradictory, the generated image was somewhat undefined, with no significant physical meaning. Depending on how one looks at it, the image resembles a planet or a hole in space.

Looking at Sam's explanation during the interview, a more accurate description could be 'a sphere deforming the plane' which would be more related to the trampoline analogy made by her. However, Sam could not adequately express and describe what she was thinking to prompt the AI, even after asking for modifications of the image generated.

Even though the second generated image was better compared to the first one, it still was not representing what Sam was imagining. This unsatisfactory result reflects the difficulty in students expressing their ideas verbally and generating adequate prompts to communicate with the AI.

It is important to note that this student showed a good understanding of RT and focused on the key concept, the spacetime deformation by mass, to generate the image. During the interview, she could also explain these concepts correctly. However, due to the lack of ability to develop prompts using them [13], the resulting image does not adequately represent 'relativity' according to the student.

There are also students who focused on different concepts, such as light-speed. In these cases, the images were completely different from the previous ones. For example, Tom's image (figure 3) showed some colourful and bright lines, in an abstract way. During the image generation process, he listed some concepts.

PromptTom: "Relative time, close to the light-speed, curved lines, spacetime".

Figure 3. Tom's AI-generated image, in which it is possible to note the relation to the 'light-speed' concept.

Even though these concepts are related to RT, Tom could not articulate them to capture the main point of the theory, the curved spacetime. Moreover, he could not explain these concepts adequately during the interview. In this sense, he did not describe something concrete, and this was reflected in the image generated.

The image conveyed a sense of movement, related to the 'light-speed' and 'curved lines' in his prompt, and reflected the ideas he focused on during the process:

"Actually, I just thought about the light-speed. According to the movies light is just a trail, right, and luminous. So, I thought about this actually happening, how it would be in the image".

As Tom mentioned during the interview, he related 'relativity' to the light-speed, thinking about a 'luminous trail'. In this sense, the image generated represented what he meant; Tom also affirmed this during the interview. This result shows that this student had the ability to interact with the AI using prompts to obtain the desired output from it.

However, it is possible to note, by the description and the interview, that the student did not have clear ideas about 'relativity'. Tom also did not demonstrate a reasonable understanding of RT and could not explain the main concepts of the theory during the interview [7].

Therefore, as Tom's case shows, even when a student has the skills to externalize his or her thoughts and generate effective prompts, that is, to obtain the expected result, other difficulties remain. There may be a challenge if the student has a poor conceptual understanding with no clearly defined image. In such cases, AI-generated images may fail to represent the main conceptual ideas of RT, representing instead the student's fuzzy ideas with no significant conceptual meaning.

3.2. Score patterns

The students' post-test scores were compared to their respective AI-generated images, prompts and explanations (figure 4). It was possible to identify some patterns related to each student's focus in the image generation process.

Figure 4. Summary of students' prompts, generated images, explanations of the images and post-test scores. The main concepts used by each student are highlighted.

The students with the higher scores (>28) had a greater focus on 'spacetime deformation' and 'gravity', which are among the most important concepts in General Relativity. Moreover, even when the image was not as expected, as in Sam's case, these students tried to represent the concepts concretely, indicating that their ideas were clear.

The only exception was Tom, who admitted that he had 'guessed' many of the test answers and could not provide reasonable explanations for them at any point in the interview. Thus, he was considered a 'low score student' despite scoring at a reasonable level on the test.

On the other hand, the four students with lower scores (<28) focused on different concepts, mainly the light-speed. Looking at their prompts and explanations during the interview, it was possible to note that these students did not possess clear ideas about relativity.

Even though some students such as Leo and Lara mentioned concepts related to the theory, they did not articulate them coherently in the prompt or explain them during the interview. Therefore, as they did not provide concrete descriptions, the generated images were also abstract with no conceptual meaning.

In this sense, these students could not generate meaningful images because they did not have a reasonable conceptual understanding of RT. Even the students among them who had the skills to develop good prompts and obtain the expected result generated fuzzy images due to this lack of understanding.

4. Conclusion

The emergence of generative AI tools creates new opportunities for teaching complex subjects like RT. General Relativity's complex nature makes it inherently challenging to represent visually in traditional representations like diagrammatic sketches. This investigation reveals that AI-generated images could serve as a valuable educational medium: prompting students to visualise and engage with complex ideas while offering teachers a glimpse of students' conceptualisation of challenging concepts [6, 7].

Our analysis, encompassing not only the images but also the prompts and explanations, centred on students' conceptualisation of the key ideas during the image creation process. The aim was to discern the intent behind their representations and their satisfaction with the outcomes. The focus of our analysis was not on the scientific precision of the images in depicting RT but rather on what these images reveal about students' conceptual understandings and reflections. This approach underscores the potential of AI-generated images as a tool for exploring student conceptions, offering insights into their understanding and interpretive abilities.

This study also highlights a broader issue: the challenge of effectively translating understanding into coherent AI prompts for image generation. This difficulty is not limited to those with a limited understanding; even those with a better conceptual understanding may struggle to capture their thoughts in the concise language required for AI interaction [13]. It is crucial for educators to recognize and address the challenges students face in generating effective prompts. Guidance in this area can mitigate frustration and enhance the learning experience. Many students struggled to articulate their thoughts as precise instructions for the AI [19], underscoring the need for skills in conceptual understanding, clear textual expression, and concise prompt formulation.

This study also reveals the potential of using AI-generated images to evaluate and foster students' communicative skills in interpreting and conveying scientific concepts, a methodology that can be extended to other intricate subjects like Quantum Physics [20]. While our findings are promising, they are preliminary and highlight the need for further research. This exploratory study opens the door to a novel and engaging way of teaching and learning, one that intertwines technological innovation with educational practice.

Acknowledgments

Firstly, we would like to thank the students for their participation in this study and the Coordination for the Improvement of Higher Education Personnel (CAPES) for funding this research. Curtin University is acknowledged for welcoming the first author as a visiting student. Finally, we thank the reviewer for the comments, which improved the quality of the manuscript.

Data availability statement

All data that support the findings of this study are included within the article (and any supplementary files).

Funding

This research was developed with funding from the Coordination for the Improvement of Higher Education Personnel (CAPES), under Grant 88881.846299/2023-01.

Ethical statement

This research has been reviewed and given a favourable opinion by the Lutheran University of Brazil Ethics Committee with Reference Number 52435921.0.0000.5349.

Appendix: post-test questions

Relativity questionnaire

Questions 1 and 2: Two trucks are on the same road, in opposite directions, as shown in the figure. Truck 2 is stopped due to an engine problem. The distance between them, measured by truck 2, is 100 m.

  (1) If truck 1 is moving at 80 km h−1, the distance measured by it will be, approximately:

    (a) Also 100 m.
    (b) A little bigger than 100 m.
    (c) Much bigger than 100 m.
    (d) A little smaller than 100 m.
    (e) Much smaller than 100 m.

  (2) Now, considering that truck 1 is moving at a very high speed (0.7 c), the distance measured by it will be, approximately:

    (a) Also 100 m.
    (b) A little bigger than 100 m.
    (c) Much bigger than 100 m.
    (d) A little smaller than 100 m.
    (e) Much smaller than 100 m.

Questions 3 and 4: A person travelling by ship throws a small ball upwards while the ship passes the coast. A person on the coast observes that the ball returns to the thrower's hand 10 s after being thrown.

  (3) If the ship travels at a speed of 30 knots (56 km h−1), the time interval for the ball to return to the hand of the person on the ship will be, approximately:

    (a) Also 10 s.
    (b) A little bigger than 10 s.
    (c) Much bigger than 10 s.
    (d) A little smaller than 10 s.
    (e) Much smaller than 10 s.

  (4) Now, if the ship travels at a very high speed (0.6 c), the time interval for the ball to return to the hand of the person on the ship will be, approximately:

    (a) Also 10 s.
    (b) A little bigger than 10 s.
    (c) Much bigger than 10 s.
    (d) A little smaller than 10 s.
    (e) Much smaller than 10 s.

Questions 5 and 6: Consider two friends of the same age. One of them travels on a cruise, while the other stays in the city. On the return from the trip, one year has passed for the friend who stayed in the city.

  (5) If the cruise travels at a speed of 30 knots (56 km h−1), the time interval for the friend who travelled will be:

    (a) Also 1 year.
    (b) A little bigger than 1 year.
    (c) Much bigger than 1 year.
    (d) A little smaller than 1 year.
    (e) Much smaller than 1 year.

  (6) Now, if the cruise travels at a very high speed (0.5 c), the time interval for the friend who travelled will be:

    (a) Also 1 year.
    (b) A little bigger than 1 year.
    (c) Much bigger than 1 year.
    (d) A little smaller than 1 year.
    (e) Much smaller than 1 year.

  (7) Two astronauts are on the International Space Station. Consider that one of them goes down to the Earth for an activity and finishes it in 8 h, according to his watch. What was the time interval for the astronaut who stayed on the space station?

    (a) Also 8 h.
    (b) A little more than 8 h.
    (c) Much more than 8 h.
    (d) A little less than 8 h.
    (e) Much less than 8 h.

  (8) Consider now that the astronauts are on a spaceship close to a supermassive black hole. One of them goes on a mission and gets closer to the black hole, finishing the mission in 3 h, according to his watch. What was the time interval for the astronaut who stayed on the spaceship?

    (a) Also 3 h.
    (b) A little more than 3 h.
    (c) Much more than 3 h.
    (d) A little less than 3 h.
    (e) Much less than 3 h.

  (9) Two points close to the Earth are 1 km apart. What would be the distance between these points if they were close to a black hole, at an altitude of 500 km?

    (a) Also 1 km.
    (b) A little more than 1 km.
    (c) Much more than 1 km.
    (d) A little less than 1 km.
    (e) Much less than 1 km.

  (10) Two points close to the Earth are 1 km apart. What would be the distance between these points if they were close to a black hole, at an altitude of 20 km?

    (a) Also 1 km.
    (b) A little more than 1 km.
    (c) Much more than 1 km.
    (d) A little less than 1 km.
    (e) Much less than 1 km.

Correct answers:

(1) D; (2) E; (3) D; (4) E; (5) D; (6) E; (7) B; (8) C; (9) B; (10) C.
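
For the Special Relativity items (Questions 1 to 6), the size of the effect can be checked with the Lorentz factor γ = 1/√(1 − v²/c²). The sketch below is an illustrative check and not part of the original questionnaire. The function names are hypothetical, and it assumes the idealised reading of each item: uniform motion, with the moving observer measuring the contracted distance or the shorter elapsed time. It does not cover the General Relativity items (Questions 7 to 10), which depend on details, such as the mass of the black hole, that the questions leave unspecified.

```python
import math

# Speed of light in km/h (299 792.458 km/s times 3600 s/h).
C_KM_PER_H = 299_792.458 * 3600


def lorentz_factor(beta):
    """Lorentz factor for a speed given as a fraction of c (beta = v/c)."""
    return 1.0 / math.sqrt(1.0 - beta * beta)


def contracted_length(rest_length, beta):
    """Length measured by an observer moving at beta relative to the object."""
    return rest_length / lorentz_factor(beta)


def proper_time(coordinate_time, beta):
    """Time elapsed for the moving observer, given the time measured at rest."""
    return coordinate_time / lorentz_factor(beta)


# Question 2: 100 m seen from a truck at 0.7 c.
print(round(contracted_length(100, 0.7), 2))  # 71.41 (metres)

# Question 4: 10 s on the coast, ship at 0.6 c.
print(round(proper_time(10, 0.6), 2))  # 8.0 (seconds)

# Question 6: 1 year in the city, cruise at 0.5 c.
print(round(proper_time(1.0, 0.5), 3))  # 0.866 (years)

# Questions 1, 3 and 5: at everyday speeds the relative change is of
# order 1e-15, so the result is smaller than the rest value only by an
# unmeasurably small amount.
beta_truck = 80 / C_KM_PER_H
print(lorentz_factor(beta_truck) - 1.0)
```

The high-speed values differ from the rest values by a large fraction, while the everyday-speed values differ only in the fifteenth decimal place or so, which is the distinction the 'much smaller' and 'little smaller' options are designed to probe.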


Biographies

Maira G de Souza

Maira G de Souza is a PhD student at the Post-Graduate Program of Science and Mathematics Teaching at Lutheran University of Brazil and a visiting PhD student at the School of Education at Curtin University. She is conducting research on high school students' learning of Relativity Theory by investigating their conceptions and mental representations using a multiple representations approach. She worked as a Physics teacher for seven years.

Mihye Won

Mihye Won is Associate Professor in the School of Education at Curtin University. Her research focuses on ways to support students' conceptual understanding and scientific thinking skills such as creative and critical thinking. She is currently exploring the use of student-generated multiple representations and new visualisation media such as immersive virtual reality (VR).

David Treagust

David Treagust is John Curtin Distinguished Professor in the School of Education at Curtin University. He supervises research students on topics related to understanding students' ideas about science concepts and how these ideas relate to conceptual change and multiple representations, the design of science curricula and teachers' classroom practices.

Agostinho Serrano

Agostinho Serrano is Full Professor at the Post-Graduate Program of Science and Mathematics Teaching at Lutheran University of Brazil. He is currently conducting research on gestural analysis methodology and mental representations in the learning of scientific concepts, and acquisition of scientific representations through information technology (IT).