Analysis of the difficulty of text generated by the ChatGPT artificial intelligence, text from a lower-secondary physics textbook, and other sources in the Czech language

ChatGPT (Generative Pre-trained Transformer) is a chatbot that was launched by OpenAI as a prototype on November 30, 2022. ChatGPT quickly gained attention for its detailed and well-formulated answers in many areas of knowledge. Depending on the verbal assignment, it can write computer programmes, poetry, business correspondence, or seemingly technical texts. This article analyses and compares Czech-language texts describing hydrostatic pressure that were produced by ChatGPT, taken from lower-secondary physics textbooks, and drawn from other sources. Nestler's text-difficulty analysis was chosen to assess the didactic properties of the selected texts. The results of the analysis and selected partial coefficients are discussed.


Introduction
With the rapid advances in artificial intelligence (AI) and natural language processing (NLP), novel applications are emerging to revolutionise various aspects of our lives. One such ground-breaking development is ChatGPT, an advanced language model based on the GPT-3.5 architecture developed by OpenAI. ChatGPT is designed to interact with users in a conversational manner, simulating humanlike conversation and generating text responses based on the input received [1][2][3]. This article explores the potential of using ChatGPT in the production of educational materials, focussing primarily on its ability to improve the production of text-based educational materials. Traditional textbooks serve as the main source of educational information. In the lay community, opinions are often voiced about the problems textbooks have in engaging students effectively and presenting complex concepts in a way that is easy to understand. A very superficial and simple hypothesis is presented that "ChatGPT generates quality educational content that exceeds current textbook standards" [4, 5]. One of the main advantages of using ChatGPT in an educational setting is its ability to improve continuously. The model learns from a vast amount of data and user interactions, allowing it to refine its responses and generate increasingly accurate and informative content. As learners interact with the system, their queries and interactions contribute to the model, creating a dynamic and constantly evolving learning resource. It is the ability to enquire further, i.e. to interact, that is the main advantage over a printed publication [2][3][4].

A particularly valuable aspect of ChatGPT is its ability to generate customised and complex definitions. Students can query the model for definitions of various scientific phenomena and laws, allowing them to gain a deeper understanding of complex physics concepts. By interacting with the model, students can explore different perspectives and explanations, contributing to a holistic understanding of the topic. The potential impact of incorporating ChatGPT into educational materials is significant. It can fundamentally change the way students learn, offering them an interactive and personalised learning experience. Overcoming the limitations of static textbooks, ChatGPT can generate educational content that adapts to individual learning styles, engages students in a conversational way, and promotes deeper understanding [2][5][6][7][8].
In the last ten years, several papers in professional journals, dissertations, and conference proceedings have been devoted to the analysis of textbooks. However, these studies dealt exclusively with textbooks. In line with foreign trends, the research has been approached in different ways: the conception of individual topics has been analysed, textbooks have been analysed comprehensively, and the most used textbooks have been examined so that the authors can further elaborate or modify the topics presented in them. So far, no one has compared a physics text created by artificial intelligence with textbook texts [9][10][11][12][13][14][15][16][17][18].

Objectives of the evaluation
The aim of the text is to analyse the difficulty of didactic learning texts, in contrast to chatbot-generated text. Excessive difficulty limits the usability of teaching texts in the classroom, so it is necessary to address this topic [19,20].

Method of analysis of teaching texts
The textbook is an essential tool through which students acquire knowledge and attitudes. It is also a tool that conveys a system of values, and these values then shape students' attitudes and world view. "Quality textbooks thus become the source of a nation's future success." [21]. The evaluation of textbooks should be based on the findings of scientific research [21,22].

The evaluation tool used here is Nestler's text difficulty analysis. "Didactic difficulty is the set of text features that exist objectively in any text and in the learning process affect the perception, comprehension, and processing of textual information by the learner." [23]. The entire method is based on this definition. Many research methods have been developed abroad; in the Czech Republic, the measurement of the difficulty of didactic texts is dealt with by Průcha, who chose Nestler's method as the most appropriate and partially modified it. According to Nestler, the text difficulty measure determines the degree of difficulty of a textbook text from two groups of properties: the linguistic representation of the text (its syntactic structure) and the content of the text (the semantic classes of its concepts). This matters precisely for textbook texts, whose difficulty arises both from factors of presentation (the form of language) and from factors of content (knowledge, information). An advantage of this approach is the high correlation between the calculated coefficients and the actual difficulty students have in working with the textbooks in question; the method thus achieves high validity. The method was adapted to its present form by Pluskal, who added two more categories of evaluation. The following text briefly describes the procedure for using this method [23,24].

The procedure can be described as a sequence of steps. It begins with the selection of samples from the textbook, followed by calculating the syntactic text difficulty Dst, the semantic text difficulty Dsm, the total text difficulty D, and the coefficients of density of scientific information i and h, and it ends with the interpretation of the results of the overall analysis [23,24]. The coefficients can be interpreted in many ways: we can use them to compare textbooks of a given subject or school year, textbooks published by different publishers, or the development of textbooks in a historical context, and we can also compare textbooks from different countries. Although the method achieves high validity, classifying the terms under study into categories can be extremely difficult and requires consultation with experts on the subject and on subject didactics. The analysis can also serve as an evaluation tool when textbooks are revised and corrected for new editions [22][23][24].
The syntactic text difficulty Dst is computed from the total number of words N, the number of verbs in the active form V, and the number of sentences S. It is determined according to equation 1 [25].

The semantic text difficulty Dsm relates the total number of words N to the total number of terms T, which comprise new general terms T1, new scientific terms T2, factual terms T3, quantitative terms T4, and repeated terms T5. The resulting value is given by equation 2 [25].

The total text difficulty D is the sum of the syntactic and semantic text difficulty, D = Dst + Dsm (equation 3) [25].

The density of scientific information in the number of words, i, gives the proportion of terms that carry technical information in the total number of words (equation 4) [25]. The density of scientific information in the number of terms, h, is determined from the number of nouns carrying expert information relative to the total number of nouns (equation 5) [25].

The coefficient P1 represents the proportion of new general terms, P2 the proportion of new scientific terms, P3 the proportion of factual terms, P4 the proportion of quantitative terms, and P5 the proportion of repeated terms. The individual coefficients are determined by equations 6 to 10 [25].
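The equations themselves are not reproduced in this copy of the text. The block below is a reconstruction, not a quotation: it assumes the form in which the Nestler method, as modified by Průcha and Pluskal, is commonly presented, rewritten in the symbols used above. In particular, the weights 1, 3, 2, 2, 1 in equation 2, the constants 0.1 and 100, and the reading of P1 to P5 as percentages of all terms are assumptions that should be checked against [25].

```latex
\begin{align}
D_{st} &= 0.1 \cdot \frac{N}{S} \cdot \frac{N}{V} \tag{1} \\
D_{sm} &= 100 \cdot \frac{T}{N} \cdot \frac{T_1 + 3T_2 + 2T_3 + 2T_4 + T_5}{N} \tag{2} \\
D      &= D_{st} + D_{sm} \tag{3} \\
i      &= 100 \cdot \frac{T_2 + T_3 + T_4}{N} \tag{4} \\
h      &= 100 \cdot \frac{T_2 + T_3 + T_4}{T} \tag{5} \\
P_k    &= 100 \cdot \frac{T_k}{T}, \qquad k = 1, \dots, 5 \tag{6--10}
\end{align}
```

Under this reading, N/S is the average sentence length in words and N/V the average number of words per verb, so Dst grows with long sentences that contain few verbs, while Dsm grows with the share of terms among all words and weights new scientific terms most heavily.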

Selection of samples for analysis
The texts analysed must be at least 200 words long, which strongly constrains the choice of topics for analysis [19][20][21][22][23][24][25]. The topic selected was hydrostatic pressure, which is related to other physics concepts such as pressure, the Earth's gravity, the depth of a liquid column, and others.
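The length requirement, together with the rule that a sample ends at the end of a sentence, can be illustrated with a short Python sketch. It is only an illustration: the function name `select_sample` and the naive sentence splitter (a sentence ends at ".", "!" or "?" followed by whitespace) are assumptions, and the analysis described here was not necessarily carried out this way.

```python
import re


def select_sample(text: str, min_words: int = 200) -> str:
    """Return the shortest run of whole sentences, taken from the start
    of `text`, that contains at least `min_words` words."""
    # Naive splitter (assumption): break after '.', '!' or '?' plus whitespace.
    # Abbreviations such as "i.e. " would be split incorrectly.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chosen, count = [], 0
    for sentence in sentences:
        chosen.append(sentence)
        count += len(sentence.split())
        if count >= min_words:
            # The sample always ends together with the end of a sentence.
            return " ".join(chosen)
    raise ValueError(f"text has only {count} words; at least {min_words} are required")
```

Because whole sentences are kept, samples taken from different sources will usually differ slightly in word count, which is why the coefficients are expressed relative to N rather than as raw counts.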

Text from textbooks
The texts were selected from textbooks published by the three largest publishers of physics textbooks: Prometheus [26], Fraus [27], and SPN [28]. The largest publishing houses were chosen because their textbooks are used by the largest number of students.

Text from websites
At the time of writing, no website with the necessary structure and content offers teaching texts for lower-secondary students. For this reason, two websites were chosen: one aimed at upper-secondary students, the "Multimedia Encyclopaedia of Physics" [29], and the other aimed at the general public, Wikipedia [30]. Both selected texts can be expected to have a higher difficulty level and therefore serve mainly as a basis for comparison.

Text from AI
The text generated by ChatGPT (version 3. "Hydrostatic pressure is a physical quantity that refers to the pressure exerted by a standing fluid, such as water. Pressure is a force that acts on a specific area and is uniformly distributed in that area. The hydrostatic pressure depends on the depth at which a given point in the fluid is located and the density of the fluid. When a body is immersed in a liquid, a force is exerted on it from the deepest point in the liquid to the lowest point, in the direction opposite to gravity. This force is due to the weight of the column of liquid above the body, and is called the hydrostatic force. The hydrostatic force depends on the weight of the column of liquid above the body and the height of the column. Hydrostatic pressure is defined as the ratio of the hydrostatic force acting on a given surface to the area of that surface. The formula to calculate the hydrostatic pressure is P = F/A, where P is the hydrostatic pressure, F is the hydrostatic force, and A is the area on which the force acts.
The essence of hydrostatic pressure is that the pressure in a fluid increases with depth. This means that the bottom of a container of liquid is under more pressure than the top. Here, the pressure is evenly distributed on all sides and is perpendicular to the surfaces on which it acts. Hydrostatic pressure is an important concept in the study of hydrodynamics and is also used in many everyday situations."

Text from the student
The last text chosen for comparison was produced by a first-year student of a master's programme in physics teaching for lower-secondary schools. The assignment for text production was (loose translation): "Write a definition of the physical quantity hydrostatic pressure for a lower-secondary school student in the Czech Republic. Use at least 200 words. It is as if you were creating a text for a student's notebook to replace a textbook. Without inspiration from the textbook or other sources." The same assignment was given to 15 students. All solutions were then presented to 29 bachelor's students, who voted anonymously for the best text: one that was error-free, well-structured, and contained all the necessary information, and that they subjectively found the best to read.

Analysis procedure
Evaluation of the text was carried out in the following steps:
1. Expository texts of at least 200 words were selected. Each sample ended at the end of a sentence, so the samples differ in word count. Headings, explanatory notes, tasks and exercises, pictorial sections, and data in tables were not counted.
2. In each sample, the total number of words N, sentences S, and verbs V in a particular form was counted.
3. All nouns, that is, the terms T, were found in each sample.
4. The terms found were divided into five distinct categories based on their use in the text: new general terms T1, new scientific terms T2, factual terms T3, quantitative terms T4, and repeated terms T5.
5. From the values obtained, two levels of text difficulty were calculated: syntactic text difficulty Dst (equation 1) and semantic text difficulty Dsm (equation 2). Their sum was then used to determine the total text difficulty D (equation 3). The coefficients of density of scientific information i (equation 4) and h (equation 5) were also calculated, and the partial proportions of all categories of terms P1-P5 were determined (equations 6-10).
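Step 5 can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used: the formulas in the code are assumptions based on the form in which the Nestler method (as modified by Průcha and Pluskal) is commonly presented, the names `SampleCounts` and `text_difficulty` are hypothetical, and the example counts are invented rather than taken from Table 1.

```python
from dataclasses import dataclass


@dataclass
class SampleCounts:
    """Counts taken from one text sample (steps 2 to 4 of the procedure)."""
    words: int      # N, total number of words
    sentences: int  # S, number of sentences
    verbs: int      # V, number of verbs
    t1: int         # T1, new general terms
    t2: int         # T2, new scientific terms
    t3: int         # T3, factual terms
    t4: int         # T4, quantitative terms
    t5: int         # T5, repeated terms

    @property
    def terms(self) -> int:
        """T, assumed here to be the sum of the five term categories."""
        return self.t1 + self.t2 + self.t3 + self.t4 + self.t5


def text_difficulty(c: SampleCounts) -> dict[str, float]:
    """Compute Dst, Dsm, D, i, h and P1..P5 under the ASSUMED formulas."""
    n, t = c.words, c.terms
    # Assumed equation 1: 0.1 * (words per sentence) * (words per verb).
    d_st = 0.1 * (n / c.sentences) * (n / c.verbs)
    # Assumed equation 2: term categories weighted 1, 3, 2, 2, 1.
    weighted = c.t1 + 3 * c.t2 + 2 * c.t3 + 2 * c.t4 + c.t5
    d_sm = 100 * (t / n) * (weighted / n)
    # Terms carrying expert information: scientific, factual, quantitative.
    expert = c.t2 + c.t3 + c.t4
    result = {
        "Dst": d_st,
        "Dsm": d_sm,
        "D": d_st + d_sm,           # equation 3: sum of both difficulties
        "i": 100 * expert / n,      # assumed equation 4, relative to words
        "h": 100 * expert / t,      # assumed equation 5, relative to terms
    }
    # Assumed equations 6-10: each category as a percentage of all terms.
    for k, count in enumerate((c.t1, c.t2, c.t3, c.t4, c.t5), start=1):
        result[f"P{k}"] = 100 * count / t
    return result


# Hypothetical counts, invented for illustration only (not from Table 1).
example = SampleCounts(words=200, sentences=20, verbs=25,
                       t1=20, t2=10, t3=5, t4=5, t5=40)
for name, value in text_difficulty(example).items():
    print(f"{name} = {value:.1f}")
```

For the invented counts above, the assumed formulas give Dst = 8.0, Dsm = 22.0 and D = 30.0. The sketch does not guard against samples with no verbs or no terms, which would cause a division by zero; the counting and categorisation in steps 2 to 4 remain manual and depend on expert judgement.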

Results and discussion
The results of the analysis are summarised in the following table (table 1).

Overall difficulty of the text
The results of the text-difficulty analysis (table 1) show that the overall difficulty D of the learning texts ranges from 20.6 to 50.2. Based on the values obtained, the texts can be arranged in order of increasing difficulty as follows: SPN, ChatGPT, Student, Encyclopaedia, Wikipedia, Fraus, and Prometheus. The maximum recommended value of the overall difficulty of the text for the first year of upper secondary school is 35 [21]. The second-lowest value for ChatGPT is surprising and may indicate that such text will be popular with, and preferred by, students in the future. It is important to note that the encyclopaedia is intended for upper-secondary school students, and therefore a higher difficulty level is expected. However, the texts in the Fraus and Prometheus textbooks can be considered difficult.

Syntactic text difficulty
This parameter reflects the style of expression of the textbook authors [25,32]. When the syntactic difficulty is high, students must struggle with the text itself while reading rather than focus on the topic [32]. Values above 20 indicate inappropriately long sentences. In table 1, the only problematic source in this respect is Wikipedia, which was chosen primarily as a reference for comparison because it is written for the general public.

Semantic text difficulty
There was also a considerable range among the semantic (conceptual) difficulty values. For comparison, geography textbooks contain texts with an average difficulty of 23.2 [33], chemistry textbooks 31.1 [32], and natural history textbooks 24.0 [34]. ChatGPT generates text with the second-lowest semantic difficulty.

Density of scientific information
More precise information is provided by comparing the coefficients of density of expert information in the number of words i and in the number of terms h [32]. ChatGPT produces the second-lowest values of both parameters.

Proportion of terms
In terms of the proportion of terms, the text generated by ChatGPT contains no quantitative terms: it uses no calculations or references to numerical values in its explanation. It is also very sparse in factual data. Words are often repeated in the text, which can be considered more helpful for understanding than the use of synonyms would be [19,21]. Technical terms are more prominent than common terms; on this parameter the text scores highest, together with Wikipedia. This is in sharp contrast to the textbook texts, where the topic is explained through inference and the modification of relations. Those texts (Fraus and Prometheus) are then rated as more difficult.

Conclusion
The analysis provides an overview of the individual parameters of the difficulty of teaching texts on hydrostatic pressure. The main objective was to compare the difficulty of the text generated by ChatGPT with texts from textbooks and other sources. The generated text achieved excellent parameters in terms of text difficulty. It is important to emphasise, however, that the paper in no way evaluates the technical correctness of the text or the appropriateness of the chosen explanatory approach. The generated text uses symbols for some physical quantities that are non-standard in the Czech Republic, and it contains some very questionable passages. Subjectively, it can be argued that much of the information is redundant and misleading.

A persistent shortcoming of the analysis is the subjective assessment of the type of some terms. The results are also limited by the choice of the text analysed; selecting a larger number of texts from different areas of school physics would be beyond the scope of this paper. It would be worthwhile to follow how the generated text changes over time; the hypothesis offered is that it will improve. Comparisons with other science subjects would also be important. Of course, we are also investigating parameters other than the mere difficulty of the texts.

ChatGPT is an interesting tool that is likely to have a major impact on everyone's lives. It will certainly affect the course of education and teaching as well.

Declaration of Interest
The author declares that there are no financial relationships that could be perceived as a potential conflict of interest and that this work is not commissioned or intended for advertising purposes for the major publishing houses.The content of this article is based on independent research and does not reflect any endorsement or influence from these publishers.

Table 1. Difficulty of teaching texts from various sources.