Special relativity at Czech schools: a half-century comparison
GIREP-2022, Journal of Physics: Conference Series 2750 (2024) 012019, IOP Publishing, doi:10.1088/1742-6596/2750/1/012019

In this contribution, we present the results of one bachelor's and one master's thesis, both devoted to the teaching of special relativity at the upper-secondary or introductory university level in the Czech Republic. We compare the test results of grammar-school students in the school years 1976/1977 and 2021/2022. The results are close to average and very similar in both cases, which suggests that the level of understanding has not changed in the meantime. We also present a set of interactive spacetime diagrams used in the preparation of future physics teachers within our relativity course.


Introduction and motivation
The special theory of relativity (SR) is more than a century old and still attracts spontaneous public attention, but relativistic effects such as length contraction and time dilation are not supported by our everyday experience and intuition. Therefore, SR is usually considered a difficult topic by most students and many secondary school teachers. There have been many valuable attempts to make this topic more accessible and understandable at various levels of education (see e.g. [1,2,3]). In Czech grammar schools, SR was included in the physics curriculum in the sixties. Later, there was an effort to evaluate the study materials and also to bring some innovations (e.g. omitting the Lorentz transformations). Thus, a test survey with N = 397 pupils was organized in the school year 1976/1977 to check the knowledge and understanding of length contraction, time dilation, the relativistic addition of velocities and the concepts of relativistic mass and energy [4].
Since that time, the lesson time devoted to the relativistic curriculum at most grammar schools has been reduced, so now it roughly corresponds to a very simplified version of chapter 37 in [5] with fewer examples and problems. Though it may vary slightly between school curricula, the key topics are the same as those introduced more than 50 years ago and covered in the most used textbook: space and time in classical mechanics, historical development of SR, basic principles (postulates) of SR, relativity of simultaneity, time dilation, length contraction, relativistic addition of velocities, basics of relativistic dynamics, mass-energy relation, Einstein's biography. We prepared a similar survey after nearly 50 years, with 24 multiple-choice test problems that are the same as or closely analogous to the original ones. A total of N = 395 students from 10 different grammar schools participated in this survey from January to June 2022. The tasks covered the same types and extent of problems as the test in the seventies [4].
SR is usually taught in the final years of study at grammar schools in the Czech Republic. In some grammar schools, physics in the fourth year is no longer compulsory for all pupils. Thus, students with a greater interest in physics often take SR in seminars. On the other hand, some grammar schools do not include SR in the physics curriculum at all. The reason given for omitting this part of the curriculum is that the knowledge of this theory cannot be directly supported by experiments, so the teaching could be merely formal. On the other hand, let us also recall the most frequently cited reasons for including relativity in the curriculum [4]:
• SR is a kind of extension of classical physics. It can provide students with a deeper and more complete view of the world around them. So far, we do not know any physical phenomenon that contradicts relativistic physics, which is not true of classical physics. Students should understand that classical physics is not wrong, but is an approximation of relativity theory, which is a more complete model for describing the laws of physics. It is important to familiarize students with classical physics so that they can understand the basic findings of modern physics.
• The introduction of SR gives room for discussion with pupils about the limits of the applicability of classical physics. The connection between the laws of classical and modern physics is better demonstrated. We can show pupils that the laws of classical physics are a limiting case of more general laws, namely those of the theory of relativity. Many pupils also encounter relativistic findings and phenomena in popular science literature and science fiction films and have a natural interest in these topics.
• Thought experiments, which can develop pupils' physical thinking, can be used more extensively in the interpretation of SR.
• The knowledge gained from studying the fundamentals of SR is also applicable to teaching the fundamentals of quantum physics, atomic physics and nuclear physics. Without knowledge of SR, these parts of physics cannot be understood.
• Teaching relativity theory will enable students to better understand the basic concepts of physics (energy, mass, time, simultaneity, etc.).
• The theory of relativity is also linked to the personality of its creator Albert Einstein, who is an important figure in the history of physics. In the context of the Czech Republic, it is possible to recall his work at the Prague university in the years 1911-1912 [6].

Research methodology
Students were given a test based on a previous test from the 1970s. The test was administered in the 2021/22 school year in 13 schools and was completed by 395 secondary school pupils, of whom 186 were male and 206 were female (3 respondents did not indicate their gender). The primary motivation for repeating the research was to determine the knowledge of today's pupils in the field of SR. In addition, the aim was to highlight some problem areas in the curriculum that could be targeted in the teaching of SR in an effort to improve the understanding of relativistic concepts and phenomena. The time limit was 30 minutes, as in the previous test. The number of questions (24) and the total maximum score (24) were also maintained. The questions were divided into eight groups, and introductory information was given for each group of questions. Students had a choice of 4 options, of which just one was correct, and they could use calculators. An English translation of the test is available from: http://muj.optol.cz/richterek/data/media/diplomky/test_sr_translation.pdf.
The questions can be divided thematically into 4 categories: kinematics (questions 5-8 and 11-16), addition of velocities in classical and relativistic physics (3, 4, 17-20), dynamics (2, 9, 10 and 21-24), and a last category containing only question 1, which focuses generally on the principles of SR and the limits of applicability of classical mechanics. This last category is referred to as "other" later in the text. Let us give examples of some test questions.

Problem examples
P7 (the easiest problem, P max = 0.876): Which of the graphs shows the length of a moving rod in classical physics?
P12 (the most difficult problem, P min = 0.291): Let us have two inertial frames of reference S and S′ in uniform rectilinear relative motion at a speed close to c. We have placed the same length scales and the same clocks in both systems. An observer in S′ detects two events occurring at the same place. Do these events occur at the same place in S?
a) No, there will never be local events in system S.
b) Yes, in any case.
c) Only at low relative speed of the systems.
d) Only if the events are simultaneous in S′.
The scores in the test problems are only weakly correlated (see the matrix of Spearman correlation coefficients in Fig. 1). For problems P5 and P6 the scores are even anticorrelated, although the situation is the same in both of them, only viewed from different frames of reference: Alice flies in a rocket at a speed close to c towards Bob, who is standing on an asteroid. Evidently, pupils find it difficult to grasp the relativity of the views from different frames of reference. We used the standard definitions of item and test characteristics given e.g. in [7]. These make it possible to spot the most difficult and most discriminating test problems and to check the basic reliability of the whole test.

Research Results
The histograms of the test scores are in Fig. 2 and Fig. 3. The mean score was 13.7, the median 14, the standard deviation 4.48, the skewness −0.0489 and the kurtosis −0.738; the Kuder-Richardson reliability index KR-21 was 0.766 and Ferguson's delta 0.977. The results indicate that the test [4] is a reliable assessment tool. For the original variant of the test in 1976/77 the results were quite similar [4]: mean score 13.0, standard deviation 5.50 and KR-21 0.870, but there is still one interesting difference. Following the test from the seventies, in problems P13-P24 we asked the students not only to choose the right answer but also to provide some reasoning. However, a considerable part of the students did not provide any reason for their choice. If we insisted on the reasoning and counted an answer without it as wrong, the scores in our test would be clearly worse, with a mean score of only 10.5. There may be several causes of this effect: besides guessing, it may reflect the fact that most tests do not require any additional reasoning and the students are used to just marking their choice. Therefore, we decided not to take the reasoning into account in the statistics and used only the answers chosen by the respondents.
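As a rough illustration of the two whole-test characteristics quoted above, the following Python sketch evaluates the textbook forms of the Kuder-Richardson formula 21 and of Ferguson's delta from a list of total scores. The function names and the toy scores are illustrative assumptions and are not survey data; the exact computational conventions used in the survey (for example, the variance estimator) are not stated here and may differ.

```python
from collections import Counter
from statistics import mean, pvariance


def kr21(scores, k):
    """Kuder-Richardson formula 21 for a test of k dichotomous items.

    Uses the population variance of the total scores (an assumption).
    """
    m = mean(scores)
    var = pvariance(scores)
    return (k / (k - 1)) * (1 - m * (k - m) / (k * var))


def ferguson_delta(scores, k):
    """Ferguson's delta: how evenly the totals spread over 0..k."""
    n = len(scores)
    sum_sq = sum(f * f for f in Counter(scores).values())
    return (n * n - sum_sq) / (n * n - n * n / (k + 1))


# Synthetic total scores for a hypothetical 4-item test (NOT survey data).
toy_scores = [0, 1, 1, 2, 2, 2, 3, 3, 4, 4]
print("KR-21:", round(kr21(toy_scores, 4), 3))
print("Ferguson's delta:", round(ferguson_delta(toy_scores, 4), 3))
```

A delta close to 1 means that the total scores cover the available range almost uniformly, which is the sense in which a test separates respondents well.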

Item characteristics
The item characteristics are compared in Fig. 4. The item difficulty indices P ranged from 29% (for question 12) to 88% (question 7). None of the questions had a difficulty index of less than 20% and only one question (question 7) had a difficulty index greater than 80%, indicating that the [...] of simultaneous events. The average difficulty index is P = 57%.
The mean of the discrimination index D was 0.55, with a minimum of 0.12 for question 6 and a maximum of 0.77 for question 8. For questions 6 and 12, the values are lower than 0.30, indicating that these questions did not discriminate between weaker and stronger students. From this perspective, these questions are not very appropriate. On the other hand, 22 of 24 questions (92%) meet the criterion stated in [7] (D ≥ 0.30).
The point-biserial coefficient (sometimes referred to as the reliability index for each item) is a measure of the consistency of a single test item with the whole test. It reflects the correlation between students' scores on an individual item and their scores on the entire test. The average value for all questions was r pbs = 0.40, with a minimum of 0.053 for question 6 and a maximum of 0.56 for question 8. According to [7], the point-biserial coefficient of an item should be at least 0.20, a criterion that is again met by 22 of 24 questions (92%). As already mentioned above, the problematic question 6 just interchanges the frames of reference in comparison with question 5, and it is connected with different time intervals measured in different frames.
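The three item characteristics discussed above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the discrimination index is computed from the top and bottom quarters of respondents ranked by total score (one common convention; the split actually used in the survey is not specified here), the response matrix is synthetic, and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def item_stats(responses, item):
    """Return (P, D, r_pbs) for one item.

    responses: one list of 0/1 item scores per student.
    Assumes the item is neither answered correctly by all nor by none.
    """
    n = len(responses)
    totals = [sum(r) for r in responses]
    correct = [r[item] for r in responses]

    # Difficulty index: fraction of students answering correctly.
    p = sum(correct) / n

    # Discrimination index: top quarter minus bottom quarter (assumed split).
    order = sorted(range(n), key=lambda i: totals[i])
    q = n // 4
    low = sum(correct[i] for i in order[:q])
    high = sum(correct[i] for i in order[-q:])
    d = (high - low) / q

    # Point-biserial correlation between the item score and the total score.
    mean_correct = mean(t for t, c in zip(totals, correct) if c)
    r_pbs = (mean_correct - mean(totals)) / pstdev(totals) * sqrt(p / (1 - p))
    return p, d, r_pbs


# Synthetic answers of 8 students to a hypothetical 4-item test (NOT survey data).
toy = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
for item in range(4):
    p, d, r = item_stats(toy, item)
    print(f"item {item}: P={p:.2f} D={d:.2f} r_pbs={r:.2f}")
```

Items with a low D or r pbs are those whose outcome says little about the overall performance of a student, which is how questions 6 and 12 stand out in the survey.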

Gender differences
The test scores fifty years ago showed that boys scored significantly better than girls. Therefore, we performed the Mann-Whitney U test, which confirmed this for our survey as well: males scored higher than females at the α = 0.05 significance level (p = 0.017). Females achieved a mean score of 13.3 points (median 13), while for males the mean was 14.2 points (median 14). The score distributions are compared in Fig. 5. However, Student's t-test did not show the difference in mean scores to be statistically significant (p = 0.061). Females were notably more successful only in question 12 (by 10%) and slightly better in questions 2, 15, 17 and 21-24. The male scores were better in 16 of 24 questions (2/3), most remarkably in questions 8 and 9 (by 16%). This means that the small gender bias of the test is less evident than 50 years ago.
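For readers unfamiliar with the Mann-Whitney U test, a minimal pure-Python sketch is given below. It uses the normal approximation with a tie correction and no continuity correction, so its p-values can differ slightly from those of statistical packages; it is not the procedure used to obtain the values reported above. The function name and the two toy samples are illustrative assumptions.

```python
from math import erfc, sqrt


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic of sample x, approximate p-value).
    """
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                    # size of the group of tied values
        avg_rank = (i + 1 + j) / 2   # mean of the ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t ** 3 - t
        i = j
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    z = (u1 - mu) / sigma
    return u1, erfc(abs(z) / sqrt(2))


# Synthetic test scores of two hypothetical groups (NOT survey data).
group_a = [9, 11, 12, 13, 13, 14, 16, 18]
group_b = [12, 14, 15, 15, 16, 17, 19, 20]
u, p = mann_whitney_u(group_a, group_b)
print(f"U = {u}, p = {p:.3f}")
```

Because the test works with ranks rather than with the scores themselves, it can flag a shift between two distributions even when a t-test on the means does not reach significance, as happened in the survey.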

Class type differences
Pupils who chose physics as a voluntary subject in the upper years of grammar school can be expected to be more interested and involved, and therefore to achieve better results. Pupils from the seminars achieved an average score of 14.3 (median 15). Pupils tested as a whole class had an average score of 13.1 (median 13). The test was also given in one class with a mathematics specialisation, which reached a mean score of 15.3 (median 15); Fig. 6 shows the test scores for the different types of classes. We also performed the Mann-Whitney U test, which confirmed that pupils from seminars scored better than pupils who were taught relativity within compulsory physics as a whole class at the α = 0.05 significance level (p = 0.005). Similarly, Student's t-test showed the difference in mean scores to be statistically significant (p = 0.007).
Predictably, students who chose voluntary extra seminars on physics were more successful, which suggests that the topic of SR is very suitable for them.

Comparison of past and actual surveys
The tests used in the past and current surveys are very similar (with only a minor reformulation of 2 problems, keeping the principle of reasoning and calculation unchanged), so they can be considered equivalent. While in the past test the average score was 13.0 points out of 24 (54.2%), in the current test the average score is 13.7 points out of 24 (57.1%). While in the seventies SR was an obligatory part of grammar school physics, today it is mostly introduced within non-obligatory seminars for students declaring a deeper interest in physics. Moreover, there are grammar schools where SR is omitted. Taking this into account, we would expect better results from the more motivated students of today, but no such improvement can be found in the overall results. From this perspective, there is no evidence of a significant improvement in teaching SR at this level within the last decades. On the other hand, as the lesson time allocated to physics as a school subject has been reduced, we were naturally interested in whether and how this could affect the test results. In that sense, the comparable results represent a positive message.
The test instructions also included a requirement for students to provide a calculation or justification for their answers to questions 13-24, in addition to marking the option; otherwise their answer would not be credited. This condition was also required in the previous research. However, it has now become apparent that pupils are not used to providing a justification in multiple-choice tests. The majority of pupils did not indicate the calculation. Some pupils had only one or two errors in questions 1-12 and completed the second half of the test also with almost no errors. It is very unlikely that such pupils merely guessed most of the answers. In addition, from some schools we did not receive the completed tests and knew only the individual answers. Because of this, it was not possible to assess whether the answers were justified or not. This also reduced the total number of tests in which the justification could be judged and evaluated.
Therefore, the answers were used in the evaluation regardless of whether the pupil had justified the answer or not. In the previous research fifty years ago, however, the justification was required. With the emphasis on justifying the answer, today's students had a mean score of 10.5 points with a standard deviation of 4.46, i.e. pupils from the 1970s scored 2.53 more points than today's pupils. The Kuder-Richardson reliability of the previous test was KR-21 = 0.87. According to today's results, it is KR-21 = 0.80, so the reliability of the previous test was higher.
The previous research also involved 6 classes focusing on mathematics and physics. These achieved a mean score of 15.9. One mathematics class participated in today's research and achieved a mean score of 15.3, so the students with a specialisation in mathematics achieved similar results.
As mentioned above, the test problems can be divided by topic into four categories: kinematics, dynamics, the addition of velocities and other. The results in those categories are only weakly correlated (see the matrix of Spearman correlation coefficients in Fig. 7); the results in kinematics are most strongly correlated with the total scores. The best results were achieved in the "other" category (66.1%), which included only the first question and concerned the principles of SR and the limits of applicability of classical mechanics. In the addition of velocities category, the success rate of 62.6% was also quite good (see Fig. 8). The worst scores were in the area of dynamics (a success rate of only 52.4%).
For question 12, which was about co-located events, students chose each option about equally often and even chose one distractor more often than the correct answer. This suggests that pupils were largely guessing the answers, i.e. that they found the concept of co-located events problematic. Some teachers from the participating schools mentioned that they had not discussed this concept in their lessons. It may be assumed that this gap is to some extent due to the omission of the Lorentz transformations from the compulsory curriculum. Another problem arose with the question concerning the quantification of time dilation. The pupils were asked to calculate how long an event would take for an observer in the system S′ if it took 10.0 s for an observer in the system S. Students did not have a problem with the mathematical expression of time dilation, but rather with determining which reference frame was involved. This problem was also evident in questions 5 and 6, where observers fly past each other at high speed and send flashes of light towards each other. Both questions are designed so that the other observer measures a longer time interval between the flashes. Students often marked the answer with a shorter interval in one question and the option with the longer time interval in the other question. Again, this suggests that pupils struggle with the concept of reference frames. Probably, more emphasis should be placed on this topic in the lessons. Another possible explanation might be that students simply "invert the effect" without thinking about reference frames at all, in the sense "if one measures shorter, then the other measures longer".
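The role of the Lorentz transformations in both difficulties can be made concrete with a short Python sketch. It transforms the separation of two events from S′ to S using the standard Lorentz transformation in units where c = 1 (times in seconds, lengths in light-seconds). The relative speed β = 0.6, the function names and the sample intervals are illustrative assumptions and do not reproduce any particular test item.

```python
from math import sqrt


def gamma(beta):
    """Lorentz factor for relative speed v = beta * c (requires |beta| < 1)."""
    return 1.0 / sqrt(1.0 - beta * beta)


def to_frame_s(dt_prime, dx_prime, beta):
    """Transform the separation (dt', dx') of two events from S' to S.

    S' moves along the +x axis of S at speed beta * c; units with c = 1.
    """
    g = gamma(beta)
    dt = g * (dt_prime + beta * dx_prime)
    dx = g * (dx_prime + beta * dt_prime)
    return dt, dx


beta = 0.6  # assumed relative speed, gives gamma = 1.25

# Two events at the same place in S', separated by 10.0 s of S' time.
dt, dx = to_frame_s(10.0, 0.0, beta)
print(f"in S: dt = {dt:.2f} s, dx = {dx:.2f} light-seconds")

# Two events at the same place AND at the same time in S'.
dt0, dx0 = to_frame_s(0.0, 0.0, beta)
print(f"in S: dt = {dt0:.2f} s, dx = {dx0:.2f} light-seconds")
```

For events co-located in S′ the transformation reduces to dx = γβ dt′ and dt = γ dt′. The separation in S therefore vanishes only when the events are also simultaneous in S′, and the frame in which the events are co-located always measures the shorter (proper) time interval.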
Another problem area is the topic of energy and momentum. For the most part, pupils could not calculate the relativistic kinetic energy and only estimated the result. There is also a misconception that a particle has zero rest energy, an answer that occurred frequently in question 23. Pupils were also unable to derive the relation for relativistic momentum and so largely chose the option that corresponded to momentum according to classical physics.
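To show how far the classical expressions chosen by many pupils deviate from the relativistic ones, the following Python sketch compares them for a particle of mass m. Energies are given in units of mc² and momenta in units of mc, so no particular particle has to be assumed; the function names and the chosen speeds are illustrative assumptions.

```python
from math import sqrt


def gamma(beta):
    """Lorentz factor for speed v = beta * c (requires |beta| < 1)."""
    return 1.0 / sqrt(1.0 - beta * beta)


def energies(beta):
    """Return (rest, relativistic kinetic, classical kinetic) energy in m*c^2."""
    return 1.0, gamma(beta) - 1.0, 0.5 * beta * beta


def momenta(beta):
    """Return (relativistic, classical) momentum in units of m*c."""
    return gamma(beta) * beta, beta


for beta in (0.1, 0.6, 0.9):
    rest, e_kin, e_cl = energies(beta)
    p_rel, p_cl = momenta(beta)
    print(f"beta = {beta}: E0 = {rest:.1f}, Ek = {e_kin:.4f} "
          f"(classical {e_cl:.4f}), p = {p_rel:.4f} (classical {p_cl:.4f})")
```

At β = 0.1 the classical and relativistic values nearly coincide, while at β = 0.9 they differ by a factor of more than two. The rest energy equals mc² at every speed, including zero, which contradicts the misconception of zero rest energy.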

Spacetime diagrams visualisation in GeoGebra
As SR is a rather abstract topic, it is very useful to employ simulations that illustrate relativistic effects visually (see e.g. [2,3,8]). Besides robust and professional tools like Captain Einstein (http://captaineinstein.org), in our practice of preparing future physics teachers we have developed a set of interactive spacetime diagrams in the free software GeoGebra that can be easily modified, adapted, and shared online or included in an LMS like Moodle. With the help of such diagrams, it is possible to illustrate the kinematic effects and the geometrical roots of the Lorentz transformation. Besides the widely used Minkowski diagrams (an example is in Fig. 9), we have also had good experience with the Loedel and Brehme diagrams, in which the scales on all axes are the same and the coordinate values can be compared straightforwardly (for an example, see e.g. https://www.geogebra.org/m/ubsqk8kq).
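The practical difference between the two kinds of diagrams can be quantified with a short Python sketch based on the standard geometry of spacetime diagrams. It is not taken from the GeoGebra applets; the function names and the value β = 0.6 are illustrative assumptions. In a Minkowski diagram the primed axes are tilted by arctan β and carry stretched units, whereas in a Loedel diagram the two sets of axes are inclined by arcsin β and share the same units.

```python
from math import asin, atan, degrees, sqrt


def minkowski_axes(beta):
    """Tilt (degrees) of the primed axes and their unit length relative to
    the unprimed axes in a Minkowski diagram."""
    tilt = degrees(atan(beta))
    unit_ratio = sqrt((1 + beta * beta) / (1 - beta * beta))
    return tilt, unit_ratio


def loedel_angle(beta):
    """Angle (degrees) between the x and x' axes in a Loedel diagram,
    where all axes carry equal units."""
    return degrees(asin(beta))


beta = 0.6  # assumed relative speed of the two frames
tilt, unit_ratio = minkowski_axes(beta)
print(f"Minkowski: tilt = {tilt:.2f} deg, unit ratio = {unit_ratio:.3f}")
print(f"Loedel: angle between axes = {loedel_angle(beta):.2f} deg")
```

The unit ratio larger than 1 is what makes direct reading of primed coordinates from a Minkowski diagram error-prone, and it is the reason why diagrams with equal scales allow coordinate values to be compared straightforwardly.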

Conclusions
With the ongoing reduction of the curriculum, it is even more important to choose the most suitable concepts and ideas, and the tools to introduce them effectively. The performance of Czech students in the test has remained nearly unchanged over the decades; the difference lies in the reasoning given for the answers. As expected, students with a deeper interest (those choosing voluntary seminars in physics) scored better. Thus, it still makes sense to think about how to improve the introduction of the theory at this level. With the further development of particle physics and of systems like GPS, we may suppose that understanding the basics of SR and its applications will be even more important in the future.

Figure 1. Correlations between the scores in individual problems of the test.

Figure 2. Histogram of the test scores with normal distribution of the same mean and standard deviation.

Figure 5. Gender differences in test scores.

Figure 6. Test scores according to the class type.

Figure 7. Matrix with Spearman correlation coefficients between scores in problems from various topics: particle kinematics (KINEM), addition of velocities (VELOC), particle dynamics (DYNAM), other (OTHER) and the total test scores (TOTAL).

Figure 9. An example of a GeoGebra applet to demonstrate the relativity of simultaneity and time dilation (available online: https://www.geogebra.org/m/ex8admrz).