Physics and Mathematics university students’ ideas about computer simulations

In an increasingly complex world, the science of complex systems is well positioned to provide epistemological lenses and methodological tools for analysing reality. Among these tools, computer simulations play a crucial role, but the ways in which graduate and undergraduate students conceptualize them have not been extensively explored. Framed within a wider research project on the educational role of simulations of complex systems, this work aims to provide insights into how university Physics and Mathematics students understand simulations. For this purpose, a study was designed with a group of bachelor and master students within a course of Physics Teaching. This paper presents the results of the analysis of the preliminary questionnaires, in which 27 students were asked to express their ideas about simulations. The bottom-up process of qualitative analysis made it possible to identify, and organize in categories, different ways in which the students conceptualize simulations, in terms of: i) the scope for which simulations are used, ii) their relationship with experiments and models, and iii) the examples of simulations they refer to.


Introduction
Since the 1950s, the use of computer simulations in the sciences has become more and more widespread. Their development is a core part of research in physics, climate science, ecology, sociology, and many other disciplines. There are even specific disciplines, like the science of complex systems, "whose very existence has emerged alongside the development of the computational models they study" [1]. Their role in the scientific enterprise has become so important that some authors have called them the "third pillar of science", alongside theories and laboratory experiments [2,3]. Although this revolution is by now routine in academic research, the epistemological debates about computer simulations remain intense. The main challenge consists in characterizing the peculiarities of simulations and, in particular, whether and how they constitute a novelty with respect to models and to computational sciences in general [4,5].
In spite of the increasing relevance of the topic, simulations are rarely addressed at school and university levels from an epistemological and methodological point of view. Here we need to clarify that two macro-meanings are associated with the term "simulation". On one side, there are scientific simulations, the third pillar of research, whose definition we discuss in the next section and which are the object of this paper. On the other, there are educational simulations, "interacting learning environments in which a model simulates characteristics of a system depending on actions made by the student" [6]. While the former aim to reach a better expert understanding of a phenomenon on the basis of theories, the latter aim to foster a better understanding of the model and of the theoretical principles underlying it [7]. In high schools, the use of educational simulations has increased in recent years [8,9], but they are still used almost exclusively as a teaching aid for showing physical, biological or chemical phenomena in ways that differ from traditional laboratory experiences [10]. At the university level, scientific simulations are not part of most undergraduate scientific curricula [11]. In Physics bachelor curricula, the only type of simulation introduced in mandatory laboratory courses is the Monte Carlo method, even though there is a debate about whether it is a simulation or just a computation [12]. Other types of simulations, such as agent-based or equation-based simulations of complex systems, are only rarely mentioned. More generally, these ways of introducing simulations often make it hard to appreciate the cultural relevance of computer simulations and the epistemological challenges they pose to the methods of contemporary science [11].
IOP Publishing doi: 10.1088/1742-6596/1929/1/012059
Framed within a wider research project on the educational potential of simulations of complex systems and on the differences between experts and novices in dealing with these tools [13,14,15], this work aims to help characterize the ideas that university Physics and Mathematics students have about computer simulations, an issue so far unexplored by educational research. In this paper, we present: i) the theoretical framework about scientific simulations drawn from the literature in epistemology of science; ii) the study, with a description of the context in which it was carried out, the data collection tool and the sample of students; iii) the methodology of data analysis; iv) the results of the data analysis, discussed in terms of their contribution to answering the research question and to the research framework.

Theoretical framework
In this section we outline the theoretical framework about scientific simulations drawn from the literature in epistemology of science. It is articulated in three subsections that discuss: i) some definitions of simulations; ii) the relation of simulations with experiments and models; iii) the main types of simulations.

The search for a definition of simulations
The philosophical literature on simulations has grown dramatically during the past 40 years, and many attempts have been made to define what a simulation is [12]. The resulting definitions are not very informative by themselves, but they deserve attention because of the ways in which they address important epistemological and methodological issues, such as the relation of simulations with experiments and models or the scientific uses of simulations. We discuss here three main definitions and the issues they highlight. In 1991, Humphreys defined a simulation as "any computer-implemented method for exploring the properties of mathematical models where analytic methods are not available" [16]. Here, simulations are understood as purely computational tools, used for exploratory aims; simulations of models for which analytic methods are available are excluded from this definition. In 1996, Hartmann broadened the categories considered by saying that "a simulation imitates one process by another process" [17]. Under this definition, simulations are not necessarily computational tools, but they all have an imitational aim; by focusing on processes, the definition excludes simulations that use a model to represent the structure (not the dynamics) of systems. A third definition was provided again by Humphreys in 2004: "System S provides a core simulation of an object or process B just in case S is a concrete computational device that produces, via a temporal process, solutions to a computational model [...] that correctly represents B, either dynamically or statically" [18]. Here, three layers intersect: R, the real system; B, the model of the object or process; and S, the simulating system. The simulation aims to solve a model, and S is always a temporal process, even if R is not necessarily dynamic.

The relation of simulation with experiments and models
Given the variety of simulations and of their uses, the liveliest debate concerns not so much the search for a univocal definition of simulations as the characterization of their position with respect to models and experiments, the other two "pillars of science" [2,3]. Indeed, it is only through comparison with them that the "novelty" of computer simulations can be discussed and articulated. Sketching a complete framework of the different epistemological positions on these topics would be an extensive work that goes beyond the aims of this paper. Likewise, discussing the novelty of simulations for the epistemology of science is not the object of this section. Here we only summarize the main issues that, in the next sections, will help orient the methodological choices for the data analysis. Accurate reviews can be found in [1,12].
In the experimental sciences, and in physics in particular, simulations often flank traditional laboratory experiments, and it has become somewhat natural to see them as computational versions of experiments [19]. This idea holds in particular when a simulation study is designed to learn what happens to a system as a result of various possible interventions on its parameters. In this sense, the interaction with the surface of the simulation recalls the experimental process. Another focal question in the relation with experiments is whether the data obtained from a simulation can count as measurements. On this point, Norton and Suppe [20] claim that if a simulation is valid, that is, if formal relations hold between a base model, the modelled physical system itself and the computer running the algorithm, "a simulation can be used as an instrument for probing or detecting real world phenomena. Empirical data about real phenomena are produced under conditions of experimental control" [20]. Despite these common traits, there are views according to which simulations differ from experiments. The first argument points to the different similarity relation that experiments and simulations have with their targets: in a simulation, rather than experimenting with the object of interest, one controls the parameters of a model [21]. Connected to this, there is a difference in the degree of materiality, and some authors argue that this makes experiments epistemically privileged compared with simulations, since simulations have only a formal relation to their targets [22]. Another argument regards the sources of justification. For a simulation, justification rests on our trust in the background model, while, for experiments, justification relies on the fact that the experimental object and the target are of the same kind.
Simulations are often related to models as well. For example, it is said that simulations are "based on" models or that there is a model "underlying" the simulation. But simulations and models differ mainly in their temporal expansion and in their epistemic opacity. Regarding the first point, the model underlying a simulation is often referred to as a static one [12], while time evolution is intrinsic to the dynamical modelling of the simulation. The second difference lies in the methods by which models can be solved: indeed, simulations are used in particular when an analytic solution to the "underlying" model is not available. This is, for example, the case with complex systems [23]. In this case, the simulation executes sequences of calculations and obtains a list of numbers that can be interpreted as the numerical solution of the model. The specificity of this kind of calculation is that, although the code can usually be written in procedural-imperative, human-readable languages, the way in which the simulation "solves" the model and derives the results is in general beyond the reach of human agents. Humphreys has named this feature the "epistemic opacity" of computer simulations [4]; it is absent in standard analytically solvable models.

Types of computer simulations
Two main types of computer simulations can be distinguished: equation-based and agent-based ones. Both types are used for different sorts of purposes, such as prediction and explanation. In equation-based simulations, the evolution of a target system is described by differential equations. Once numerically solved, these equations make it possible to determine the future state of the system starting from its present state. The variables that appear in the equations refer to the macroscopic system. In agent-based simulations, the dynamics of the target system is generated by making individual agents evolve according to behavioural rules. There is no description of the macroscopic properties of the system; these instead "emerge" as a result of the execution of the simulation.
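As a purely illustrative sketch (not taken from the study), the contrast between the two types can be made concrete in Python. The first function integrates the Lotka-Volterra predator-prey equations with a forward-Euler step, so the macroscopic variables appear directly in the equations. The second runs a toy voter model on a ring: only a local imitation rule is coded, and the fraction of agents holding opinion 1 is merely observed afterwards. The function names, the parameter values, the Euler scheme and the choice of the voter model are assumptions made here for brevity.

```python
import random


def equation_based(prey0, pred0, steps, dt=0.001,
                   a=1.0, b=0.1, c=1.5, d=0.075):
    """Forward-Euler integration of the Lotka-Volterra equations.

    dx/dt = a*x - b*x*y   (prey)
    dy/dt = -c*y + d*x*y  (predators)
    The parameter values are illustrative assumptions.
    """
    x, y = prey0, pred0
    trajectory = [(x, y)]
    for _ in range(steps):
        # The macroscopic variables x and y are updated directly.
        dx = (a * x - b * x * y) * dt
        dy = (-c * y + d * x * y) * dt
        x, y = x + dx, y + dy
        trajectory.append((x, y))
    return trajectory


def agent_based(n_agents=50, steps=2000, seed=0):
    """Toy voter model on a ring (an assumed example of an agent-based rule).

    At each step one randomly chosen agent copies the opinion (0 or 1)
    of one of its two neighbours. No equation for the global fraction
    is written; it is only measured after each update.
    """
    rng = random.Random(seed)
    opinions = [rng.randint(0, 1) for _ in range(n_agents)]
    history = [sum(opinions) / n_agents]
    for _ in range(steps):
        i = rng.randrange(n_agents)
        neighbour = (i + rng.choice((-1, 1))) % n_agents
        opinions[i] = opinions[neighbour]
        history.append(sum(opinions) / n_agents)
    return history


if __name__ == "__main__":
    print(equation_based(10.0, 5.0, steps=5)[-1])
    print(agent_based(steps=5))
```

In the first function the future state follows from the present one through the discretized equations; in the second, the time series of the global fraction is a by-product of the agents' local rule.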
Another large class of computer simulations is that of Monte Carlo methods. They are algorithms that use randomness to calculate the properties of a mathematical model. The Monte Carlo approach does not have imitative purposes, since the probabilistic analogy does not serve as a representation of the deterministic system [12]. That is why Monte Carlo simulations can be considered simulations but, in general, not simulations of the systems they refer to [1]. There are exceptions, namely the cases in which Monte Carlo techniques are used to solve stochastic dynamical equations that refer to a physical system: in these cases the probabilistic analogy is itself a representation of the system it simulates [24].
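A minimal sketch of the non-imitative use of Monte Carlo methods, assuming a textbook example that does not come from the study: random sampling is used to estimate a deterministic quantity, the integral of x^2 over [0, 1], whose exact value is 1/3. The function name, the seed and the sample size are illustrative assumptions.

```python
import random


def monte_carlo_integral(f, n_samples, seed=0):
    """Estimate the integral of f over [0, 1] by averaging f at
    uniformly distributed random points (plain Monte Carlo)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += f(rng.random())
    return total / n_samples


if __name__ == "__main__":
    estimate = monte_carlo_integral(lambda x: x * x, 100_000)
    print(estimate)  # close to 1/3; the exact digits depend on the seed
```

Here the random draws do not represent any process in a target system; they are only a computational device, which is the sense in which such a calculation is not a simulation of the system it refers to.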

The study
The goal of this work is to provide insights into how a group of university Physics and Mathematics students, attending a course of Physics Teaching, understand simulations. In particular, we are interested in pointing out to what extent the definitions of simulations they construct reflect the debates in epistemology of science illustrated in the previous section. The study presented in this paper provides a qualitative survey of the ways in which university students conceptualize scientific simulations, in the absence of specific teaching focused on methodological and epistemological aspects.
The overarching question guiding this work is: in the absence of a systematic intervention, what level and kind of knowledge do university Physics and Mathematics students display about computer simulations? Because this question is general and refers only to a generic "knowledge about simulations", we needed more specific research questions (RQs) to orient the design of the study and its analysis. They are: RQ1) What are students' ideas about the scope of simulations? RQ2) What are students' epistemological ideas about the relationship of simulations with respect to models and experiments? RQ3) What are the simulations that students take as reference to provide their answers?
To answer the RQs, the data analysed for this paper were collected through a questionnaire submitted to the participants in a course of Physics Teaching before a series of instructional activities. For this reason, even if it is not the specific focus of this paper, we provide an overview of the whole study and of the activities carried out. In the following paragraphs we present: i) the context in which the study was carried out; ii) the data collection tool; iii) the sample of students; iv) the methodology of data analysis.

Context
The study was carried out in December 2018 within a course of "Physics Teaching" at the Department of Physics and Astronomy of the University of Bologna. The course is traditionally attended mainly by bachelor Physics students, who can choose it from the "optional list" of the curriculum. In recent years, master students in Physics, Physics Education and Mathematics Education have also started to attend it. During the course, fundamental physics topics are addressed (e.g. kinematics, mechanics, optics) and the students are guided to develop, through these disciplinary topics, knowledge and competences typical of the research field of Physics Education. Specific attention is paid to the role of history and epistemology in physics teaching and learning, with a particular focus on the role of models in physics (and in science in general) and on modelling processes. The course usually runs from October to December, and the intervention we describe here came just at the end of it.

Intervention and data collection
On the whole, the intervention was designed and implemented in three main phases preceded by a preliminary activity. Before the beginning of the intervention, which we describe briefly at the end of this paragraph, an online questionnaire was submitted to the students of the course. This was the data collection tool for the analysis presented in the next section. The questionnaire consisted of five open-ended questions and one closed-ended question. They aimed to give insight into students' knowledge of simulations and complex systems. More specifically, after a section that asked for information about the university curriculum attended by the students, the questions were formulated as follows:
After this preliminary phase, the study was articulated in three parts. The first consisted of a lecture of an hour and a half about the specificities of computer simulations used to analyse complex systems. Examples were drawn not only from physics but also from complex systems in other fields (e.g. social sciences, economics, climatology). Indeed, the aim of the lecture was to show how wide the research on complexity is and how powerful its conceptual tools are, to the point that they provide descriptions and explanations of very different phenomena. One week after the lecture, a focus group activity was carried out; the task assigned was to analyse two simulations of complex systems (Schelling's racial segregation and Lotka-Volterra predator-prey interaction) and answer some questions related to explanation and trust. The intervention ended with a one-hour dialogic lecture in which the results of a pilot study with secondary school students [11] were presented, in order to trigger meta-reflections about the use of simulations in the classroom.

Sample
In total, 36 students (20 males and 16 females) participated in at least one phase of the study. Attendance was not constant throughout the phases of the study. In particular, 27 students participated in the preliminary activity by filling in the online questionnaire, while 29 of them took part in the focus group discussion. In the following, we focus on the participants in the preliminary activity, because the questionnaires they filled in are the data considered and analysed for this paper. From here on, when we refer to "the students" we mean the 27 participants in the preliminary activity. The students were distributed across university curricula as represented in figure 1. The majority of them were undergraduate students in the third year of the bachelor course in Physics. The others were graduate students enrolled in master courses in Physics (Particle Physics, Materials Physics and Earth System Physics), Physics Education and Mathematics Education.

Methods of Data Analysis
The analysis was carried out to answer the aforementioned RQs: RQ1) What are students' ideas about the scope of simulations? RQ2) What are students' epistemological ideas about the relationship of simulations with respect to models and experiments? RQ3) What are the simulations that students take as reference to provide their answers? To address them, we considered the 27 responses to the preliminary questionnaire described in section 3.2, in particular the answers to the first four questions of the protocol. The data analysis followed a qualitative methodology, through a theoretically oriented iterative process of analysis and interpretation in which the formulation of hypotheses was progressively refined by enlarging the empirical base, until theoretical saturation was reached [25]. Due to the exploratory nature of this research, the analysis was mainly conducted with a bottom-up strategy, that is, the categories were obtained, and the markers clarified, starting from students' answers. Nevertheless, once extracted from the data, the categories were also organized on the basis of studies in the epistemology of simulations, mainly referring to the uses of simulations for scientific inquiry and to the distinction among models, experiments and simulations [10]. The data were also analysed to look for possible differences between Physics and Mathematics students and between bachelor and master students. Triangulation among researchers was carried out to ensure the validity and reliability of the analysis. More specific methodological choices for data analysis are made explicit and detailed in the next section.

Data Analysis and Results
Following the three RQs, the analysis was articulated so as to recognize three main levels in students' answers. These are: i) the level of the scope of simulations; ii) the level of epistemological ideas about simulations; iii) the level of the types of simulations used as references to provide the answers.

Students' ideas about the scope of simulations
The first item of the questionnaire asked students to attempt a definition of simulation in science. Nevertheless, the vast majority of students gave their definitions in terms of scope. Indeed, they did not elaborate definitions of simulation itself, but rather described "what a simulation is supposed to be designed/realized for".
Among the scopes of simulations, we identified four macro-categories. The first is related to the aim of recreating "something" (e.g. physical phenomena, models, processes, situations) in a virtual environment. The second refers to the aim of displaying the evolution of "something" (e.g. models, systems) starting from facts (e.g. initial conditions imposed in the simulation, data obtained from laboratory experiments, knowledge of the past evolution). Even if the first aim (recreating) and the second one (displaying the evolution) could be considered strictly related, we prefer to distinguish them because the second involves a dynamical aspect that is absent in the first. The third macro-category regards the use of simulations for obtaining predictions. The last one refers to the scope of testing "something" (e.g. hypotheses, models, theories, algorithms, alternative scenarios) against facts (e.g. real-world data, data to be obtained from laboratory experiments). We detail these categories in table 1, where we provide operational descriptions, each accompanied by an example of students' sentence. To ensure students' anonymity, their names have been omitted and only the reference to gender has been kept.

Table 1. Operational description of the markers for the categories of ideas about the scopes of simulations.
Category | Example of students' sentence | Student
[Rec] Recreating something in a virtual environment | "Technique applied in the study of physical phenomena that are difficult to reproduce in the laboratory; a mathematical model and calculation tools are then used to reproduce the phenomenon in a 'virtual' way" | F23
[Evo] Displaying the evolution of something starting from facts | |
[Evo1] Displaying the evolution of a model | "Simulation is for studying the evolution of a model or a theory" | M33
[Evo2] Displaying the evolution of a system | "The simulation represents the operation of the system over time" | F16
[Evo3] Starting from initial conditions | "It is a test in which you recreate the initial conditions from which a certain phenomenon originates, and you study it" | M24
[Evo4] Starting from experimental data | "A simulation shows the effects of a model or a theory starting from the experimental data collected initially" | M31
[Evo5] Starting from past known evolution | "A simulation is something based on the study of the past evolution or the known behaviour of a certain phenomenon" | F25
[Pre] Obtaining predictions | "Simulations are used to predict the results of a certain phenomenon" | M36
[Test] Testing something against facts | |
[Test1] Testing algorithms | "Data acquisitions can be simulated to test analysis algorithms" | M29
[Test2] Testing theories, hypotheses or models | "The simulations are used for the corroboration of a theory" | M5

The frequency of the four macro-categories in students' answers is reported in figure 2.a, while figure 2.b provides a more detailed picture of the different categories. A student's answer could fall into more than one category, so the totals can add up to more than 27. In this graph and in the following ones, we do not distinguish between bachelor and master students because no significant differences were found in the recurrence of answers; this is probably because the course is attended by students in their last year of the bachelor course and by others in their first semester of the master course.
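The remark that totals can exceed the number of respondents follows from the coding being multi-label. A minimal sketch of such a tally is given below, assuming that each answer is stored as a list of category codes; the respondent identifiers and the codings are hypothetical and are not the study's data.

```python
from collections import Counter


def macro(code):
    """Map a sub-category code such as 'Evo2' to its macro-category 'Evo'."""
    return code.rstrip("0123456789")


def tally(coded_answers):
    """Count how many answers fall in each sub-category and macro-category.

    An answer contributes at most once to each category, but it may
    contribute to several categories at the same time.
    """
    sub_counts = Counter()
    macro_counts = Counter()
    for codes in coded_answers.values():
        sub_counts.update(set(codes))
        macro_counts.update({macro(c) for c in codes})
    return sub_counts, macro_counts


# Hypothetical codings for three respondents (not the study's data).
example = {
    "S1": ["Rec", "Test2"],
    "S2": ["Evo1", "Evo3"],
    "S3": ["Pre"],
}

if __name__ == "__main__":
    sub_counts, macro_counts = tally(example)
    print(dict(macro_counts))
    print(sum(macro_counts.values()), "category assignments for", len(example), "answers")
```

With these hypothetical codings, three answers yield four macro-category assignments, which is the same effect that makes the totals in the figures exceed 27.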

Students' epistemological ideas about simulations
Going beyond the level of the scope for which simulations are designed and used, the second level we address is that of students' epistemological ideas about simulations. To perform this analysis, we considered in particular the responses to the third item of the questionnaire (In your opinion, is simulating closer to modelling or experimenting? Why?). Partly because of the way in which the question was formulated, three macro-categories of ideas emerged: simulations as experimental tools, simulations as modelling tools and simulations as "in-between" tools. Even though in their answers to this question the students positioned themselves in one of these macro-categories, the richness of the reasoning they displayed throughout the whole questionnaire allowed us to refine the analysis and to articulate the macro-categories into more specific ones. We detail these categories in table 2 through operational descriptions, each accompanied by an example of students' sentences. The frequency of the three macro-categories in students' answers is reported in figure 3.a, while figure 3.b provides a more detailed picture of the different categories.

Table 2. Operational description of the markers for the categories of epistemological ideas about simulations.
Category | Example of students' sentence | Student
[Exp] Experimental tool | "I believe that simulating is closer to experimenting. In a simulation, starting from a significant model (which is already built, and does not derive from the simulation itself), we obtain a result that tends to be what we could measure (therefore, experiment) in reality" | M32
[Exp1] Experimental tool for data acquisition | "Simulation is a process to obtain 'fictitious' data produced by following various physical/mathematical models which therefore show how a sample of 'real' data would be if certain theoretical criteria and experimental criteria were met" | F10
[Mod2] Model in which non-relevant aspects or elements are removed | "It is closer to modelling because of a phenomenon we take into account only the characteristics we consider necessary for the purposes of our research" | M36
[Exp-Mod] In-between tool | "The simulation shares, in the process of scientific discovery, the role of experimentation and, in this sense, it resembles it. At the same time, however, a simulation contains the model and evolves according to its rules, which instead cannot be said of the experiment, which takes place following the laws of the real system, which are the object of the modelling attempt" | M29
 | "To simulate a phenomenon it is necessary to develop a mathematical model that describes it completely; on the other hand, a simulation is a sort of 'virtual experiment', so one must then be able to apply the model and interpret the information provided by the simulation, as is done in experimental physics" | F23
 | "I see simulating as close both to modelling, since a numerical-abstract procedure (algorithms) is carried out, and to experimentation, since it is as if a 'parallel' experiment was performed beside the actual and 'physical' experiment (in the sense of concrete)" | F13

Students' references and known examples about simulations
The third and last level regards the types of simulations that the students encountered in their school or academic curricula and used as references in answering the questionnaire. The majority of the students mentioned, as the only type of simulation they knew, the Monte Carlo method used in particle physics to generate data according to probability distributions. Indeed, a module within the course of Physics Laboratory, in the second year of the Physics bachelor, includes the basics of Monte Carlo computation. Few students referred to examples of simulations of complex systems, and to agent-based simulations in particular. Others cited a wide variety of simulations, both material and computational: for example, electric circuit simulators, flight simulators, simulations for anti-seismic materials. In figure 4, we report the frequency of these references in students' questionnaires. In our study, only Mathematics students (5 out of 6) fall in the "no references" category.

Discussion of the results
The qualitative data analysis allowed us to identify categories of ways in which university Physics and Mathematics students interpret computer simulations. In this section, we summarize the main results and discuss them in the light of the theoretical framework. The first result regards students' ideas about the scope of simulations. The recurrences in students' answers were organized in categories that relate to four different purposes of simulations: recreating something in a virtual environment, displaying the evolution of a model, obtaining predictions and testing. The first two purposes recall Hartmann's definition ("a simulation imitates one process by another process") [17], where imitation is related to recreation, and the existence of processes is reflected in the role of displaying the evolution, rather than the structure, of a system. Among the macro-categories, the most represented is testing, and in particular testing models. This reflects the idea of simulation as a tool to verify a model, where the computational support makes it possible to run a model from initial conditions, obtain predictions and then compare them against real-world data or data obtained from laboratory experiments. Another widely represented macro-category is the idea that a simulation aims to recreate phenomena or behaviours in a virtual environment. Among the main scopes of simulations, no student mentioned their role in providing explanations of phenomena, which is instead one of the main issues when agent-based simulations are considered [12]. The lack of this category can be ascribed to the students' lack of experience with this type of simulation. Indeed, for most of them the Monte Carlo method was the only reference for computer simulations.
The second important result consists in having mapped students' ideas about the epistemological and methodological position of computer simulations with respect to models and experiments. This mapping was done initially according to macro-categories (simulations as experimental, modelling or in-between tools), then detailed in sub-categories that highlight specific aspects of the experimental or modelling practices that students recognized in simulations. When the students were asked whether simulating is closer to experimenting or to modelling, they positioned themselves in one of the three categories in almost equal numbers. The analysis revealed that the experimental tool category contains only Physics students, while all the Mathematics students involved in the study, except one, belong to the modelling tool category. When asked to clarify why they selected one category or another, students' reasoning becomes rich, because different arguments interact, coming from both experimental and modelling practices. In this respect, the responses of students who identified simulations as intermediate tools between models and experiments deserve particular discussion. In table 2 we have reported three sentences from three different students for this category. They are Physics students, and their sentences were selected because their arguments differ and make it possible to underline different aspects. One student (M29) says that simulation "resembles" an experiment and has its role in the scientific enterprise, but at the same time a model "is contained" in the simulation, and this model includes explicit rules and laws that cannot be recognized in the world that is the object of laboratory experiments. The second student (F23) recognizes "the need" of a mathematical model behind the simulation because, on the basis of this model, interpretations can be formulated when the virtual experiment is carried out. Another student (F13) sees the modelling aspects in the "abstractness" of the algorithm, while the experimental ones are recognized in the conduction of a virtual experiment that runs in parallel with the "concreteness" of laboratory experiments. These three sentences partially reflect the plurality of debates (formalism vs resemblance, necessity of mathematics vs need of interpretation, abstract vs concrete) that epistemology faces when dealing with computer simulations. In particular, they refocus attention on the importance of making explicit one's own views and conceptualizations of models and experiments when reasoning about simulations and their role within the scientific enterprise.
The last comment regards the "great absentee". Even though it is a focal issue and a prominent object of discussion in the epistemology of science as well as in the research communities working on simulations of complex systems, no student mentioned anything related to the opacity of computer simulations. We can ascribe this to the fact that most students had not really encountered simulations other than Monte Carlo ones, in which opacity does not emerge as an important element. However, we claim that an introduction of this crucial epistemological issue, together with examples of specific