Study of the impact of speech-like interference on the psychological and emotional state of a human

The protection of speech information should be not only as effective as possible but also comfortable for the people in the protected room. Active protection devices, namely noise generators, are currently the most common means for this purpose. Speech-like interference is the most effective type of masking noise, but it is also reported to cause more discomfort than other types. This study aims to assess the annoyance caused by speech-like interference generated from syllabic and word articulation tables, as well as by reverberation speech-like interference. The experiments showed that speech-like interference based on syllabic articulation tables is the most annoying. The main negative effects were headache, annoyance, and fatigue. Modifying the speech-like interference by trimming its peaks improved how listeners perceived it, while its effectiveness did not decrease.


Introduction
To protect speech information from leakage through technical channels, active protection devices, i.e. acoustic and vibroacoustic noise generators, are widely used. Currently, such generators mainly use white noise with a normal probability distribution as the driving noise. This raises the problem of selecting the optimal interference: one that, while providing the required security index (in general, the coefficient of verbal intelligibility of speech W), has the minimum integral level, that is, introduces minimal discomfort and the fewest unmasking signs during negotiations [1]. Research in psychoacoustics [2] shows that such interference is speech-like interference, whose spectral-temporal characteristics are similar to those of a real speech signal. Speech-like interference is defined as "an acoustic signal synthesized by a random law, which corresponds to a speech signal in its main characteristics, but does not contain semantic information" [3].

Problem statement
It is known that acoustic noise can have a significant negative impact on the human body, so various specifications have been developed to regulate noise levels, taking into account their frequency, exposure time, etc. (for example, [4,5]). This paper studies the effect of speech-like interference on humans and gives recommendations for reducing this effect. The purpose of the study is to evaluate the impact on a person of "speech chorus" speech-like interference based on syllable and word articulation tables, as well as of reverberation interference.
To achieve this goal, the following tasks were performed:
• the evaluation of the impact on a human of speech-like interference based on syllable articulation tables;
• the evaluation of the impact on a human of speech-like interference based on word articulation tables;
• the evaluation of the effect of speech-like reverberation interference on a human;
• the comparison of the results obtained with the influence of "white noise".

Formation of the speech-like interference of the "speech chorus" type
The syllabic and word articulation tables from GOST 16600-72 [8] were used as the basis for creating the speech-like interference.
Artificial speech was synthesized with the TTSApp utility. This utility was chosen because it allows the user to adjust parameters such as the sample rate and the speech speed, to add other voices, and to save the recording to a file. Audio files were recorded at a sampling rate of 44 kHz, 16 bit, mono. Adobe Audition 3.0 was used to process the audio signals and obtain their spectra. The speech-like interference was generated from 5 different computer voices (3 male and 2 female) at an average speech level of 70 dB. Each speaker voiced, via the TTSApp utility, sets of syllables and words taken from [8]. Each lexical unit consisted of 3-4 sounds. The order in which the lexical units were voiced was different for each speaker. The voiced sets were then saved as audio recordings in .wav format.
To create speech-like interference of the "speech chorus" type, the speakers' voices must be overlapped with each other with a certain time delay, which was done using Adobe Audition 3.0. Beforehand, all recordings were aligned to an overall integral level of 70 dB, with a discrepancy of no more than 1 dB. The time delay between the overlapped voices was 10 ms, to prevent the pauses in the computer voice recordings from coinciding. The resulting interference was then saved to an audio file in mp3 format. Reverberation speech-like interference was generated by overlapping the previously created speech-like interference based on word articulation tables with itself 5 times with a delay interval of 10 ms using Adobe Audition 3.0.
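The mixing procedure described above can be illustrated with a minimal Python sketch. It is not the authors' Adobe Audition workflow: the function names are hypothetical, levels are expressed in dB relative to digital full scale rather than as the acoustic 70 dB level, and "overlapping with itself 5 times" is assumed to mean five copies in total.

```python
import math

SAMPLE_RATE = 44_000  # Hz, sampling rate stated in the text
DELAY_MS = 10         # ms, stagger between overlapped voices stated in the text


def rms_db(signal):
    """RMS level of a signal in dB relative to an amplitude of 1.0."""
    mean_square = sum(x * x for x in signal) / len(signal)
    return 10 * math.log10(mean_square)


def align_level(signal, target_db):
    """Scale a signal so that its RMS level equals target_db."""
    gain = 10 ** ((target_db - rms_db(signal)) / 20)
    return [x * gain for x in signal]


def overlay(tracks, delay_ms=DELAY_MS, rate=SAMPLE_RATE):
    """Sum tracks, starting the k-th track k * delay_ms after the first."""
    step = int(rate * delay_ms / 1000)  # delay expressed in samples
    length = max(len(t) + k * step for k, t in enumerate(tracks))
    mix = [0.0] * length
    for k, track in enumerate(tracks):
        for n, x in enumerate(track):
            mix[n + k * step] += x
    return mix


def reverberate(signal, copies=5, delay_ms=DELAY_MS, rate=SAMPLE_RATE):
    """Overlay a signal with delayed copies of itself.

    Assumption: `copies` counts the original signal as well.
    """
    return overlay([signal] * copies, delay_ms, rate)
```

In this sketch, a "speech chorus" would be obtained as `overlay([align_level(v, target) for v in voices])`, where `voices` holds the decoded samples of the five recordings and `target` is a common level chosen by the user.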

Results
Because of the situation caused by COVID-19, the research was carried out not in a specially equipped laboratory but in residential premises; the layout of one of them is shown in figure 1. A laptop placed in the middle of the room and playing an audio recording of the speech-like interference served as the active protection device, i.e. the source of the interference.
To assess the condition of the testees, tests of attention and fatigue were used. The Thorndike test was used to evaluate the speed of decision making (the test time was fixed), and Kraepelin's method was used to estimate the number of correct decisions made in the same time.
The Thorndike test consists of searching for 10 specified three-digit numbers in a 10x10 table filled with numbers from 100 to 999. To determine the level of attention, the time taken to pass the test, the number of correctly selected numbers, and the number of incorrectly selected numbers are recorded. This method is aimed at studying the selectivity of perceptual attention [6].
Kraepelin's method is as follows: 20 rows of 40 pairs of digits each are given; the testee adds the digits located one above the other and writes the result under them, discarding the tens digit. The working time for a single row is 30 seconds, and for the whole test it is 10 minutes. This method is used for the qualitative and quantitative assessment of performance, exercise effects, and fatigue [7]. The selected testing methods were not used simultaneously but at different stages of the experiment.
Experiments to identify the effects of speech-like interference on humans were conducted in 2 stages.
Stage 1. First, the required noise level was set: 40 or 50 dB. These interference levels were chosen because, when active protection devices are used, vibration or acoustic emitters are most often set to a level of 70 dB, which corresponds to the average integral level of speech.
However, since the emitters are installed on or near the trial design, people in the center of the room, where the meeting table is located, hear the interference at a level of 40-50 dB. The noise level was monitored using a ZET110 noise level meter, with measurements made at a distance of 1 m from the source. The entire experiment lasted 90 minutes. To record changes in the participants' psychoemotional state, testing was performed before the experiment and then at 30-minute intervals, giving 4 control points in total. At each control point, the testees also briefly described their condition verbally. Three people took part in the experiment: 2 men and 1 woman, all 21 years old. Each testee stayed at a distance of 1.5-2 m during the experiment. First, it was decided to test the effect of noise on a person in conditions that require silence and maximum concentration, so the testee read a book while listening to the noise. The required duration of the interference was obtained by "gluing" together audio recordings of the generated speech-like interference.
The results of testing at the control points using the Thorndike test at a noise level of 40 dB are shown in table 1, where t_ij is the average time, in seconds, in which the i-th testee passed the test at the j-th control point. All the testees noted that they experienced a headache, annoyance, and a feeling of fatigue after an hour of listening, although the testing did not show any abnormalities in their condition.
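The two attention tests described earlier in this section can be sketched in Python as follows. This is only an illustration of the stated rules, not the material actually given to the testees: the function names are hypothetical, the table is assumed to contain distinct numbers, and the Kraepelin digits are assumed to range from 0 to 9.

```python
import random


def make_thorndike_table(rng, size=10, n_targets=10):
    """Return (table, targets) for one Thorndike test.

    The table is a size x size grid of three-digit numbers (100-999);
    the targets are the numbers the testee has to find in it.
    Assumption: all numbers in the table are distinct.
    """
    numbers = rng.sample(range(100, 1000), size * size)
    table = [numbers[r * size:(r + 1) * size] for r in range(size)]
    targets = rng.sample(numbers, n_targets)
    return table, targets


def score_thorndike(targets, selected):
    """Return (correct, incorrect) counts of the selected numbers."""
    correct = len(set(selected) & set(targets))
    incorrect = len(set(selected) - set(targets))
    return correct, incorrect


def make_kraepelin_row(rng, pairs=40):
    """Return one row as a list of (top, bottom) digit pairs."""
    return [(rng.randint(0, 9), rng.randint(0, 9)) for _ in range(pairs)]


def kraepelin_key(row):
    """Expected answers: the sum of each pair with the tens digit discarded."""
    return [(top + bottom) % 10 for top, bottom in row]


def score_kraepelin(row, answers):
    """Return (attempted, errors) for the answers written within the time limit."""
    key = kraepelin_key(row)
    errors = sum(1 for given, expected in zip(answers, key) if given != expected)
    return len(answers), errors
```

For example, the pair (7, 8) has the expected answer 5, because 7 + 8 = 15 and the ten is discarded.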
The testees noted that the discomfort increased as the level of interference increased. Based on the results of this stage, we can conclude that any interference, like any noise, creates a stressful environment. People cannot quantify their condition, but all the testees noted qualitatively that they would have preferred to finish the experiment earlier, because they had to endure unpleasant sensations of annoyance and fatigue for at least the last 20-30 minutes of the experiment.
Stage 2. This time, Kraepelin's method was used as the test, and the duration of the experiment was reduced to 30 minutes. Consequently, testing was carried out at 2 control points. The interference was presented only at a level of 50 dB, and during the experiment the testee watched a TV program or listened to music played at a level of 70 dB at a distance of 1 m. This type of activity was chosen because those involved in negotiations must focus either on their own speech or on the speech of another speaker; otherwise, a lack of concentration on the issue under discussion may affect their decisions and possibly have negative consequences. This stage was carried out to assess how long active protection devices with the generated speech-like interference can be used without noticeable discomfort for the testees (i.e. headache, annoyance, etc.). The generated speech-like interference was used as the interference. The results of testing at the control points at 50 dB using Kraepelin's method are shown in table 3. At the end of this stage, the testees noted that after 30 minutes there were no noticeable negative effects in the body or head. All their attention was taken by the TV program, and the interference was almost inaudible. Testing using Kraepelin's method did not show any significant deviations.
The testees who took the test at CP 2 mostly improved their results, because they got used
At the end of all stages of the experiment, it can be concluded that if a meeting lasts more than 30 minutes while speech-like interference is used for active protection, the probability of noticeable discomfort for employees increases. The most common complaints are headache, annoyance, fatigue, tinnitus, and loss of concentration.
To reduce the annoying effect of the interference, its parameters must be changed without compromising its effectiveness. The peak factor, defined as the difference between the quasi-maximum and the average level of the signal, was chosen as the parameter to modify. The peak factor of speech signals is 10-12 dB. The peak factor is needed to determine the limits of the instantaneous signal power. Using Adobe Audition, the peaks were trimmed by 5% of the total integrated noise level, and then the required interference level was set. The result of the processing is shown in figures 2 and 3.
With the processed interference, the experiment was repeated with the parameters of stage 2; the results obtained with Kraepelin's method are presented in table 4. The testees noted that the processed speech-like interference sounds more pleasant and less annoying than the unprocessed interference. However, the test results did not show any changes.
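The peak factor and the effect of peak trimming can be illustrated with a minimal Python sketch. It does not reproduce the Adobe Audition processing: the function names are hypothetical, the quasi-maximum is assumed to be a high quantile of the absolute sample values (the text does not define it), and the trimming is modeled as a simple hard limit at a user-chosen ceiling.

```python
import math


def rms(signal):
    """Root-mean-square (average) amplitude of a signal."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def peak_factor_db(signal, quantile=0.99):
    """Quasi-maximum level minus average (RMS) level, in dB.

    Assumption: the quasi-maximum is taken as the `quantile` of |x|.
    """
    magnitudes = sorted(abs(x) for x in signal)
    index = min(len(magnitudes) - 1, int(quantile * len(magnitudes)))
    quasi_max = magnitudes[index]
    return 20 * math.log10(quasi_max) - 20 * math.log10(rms(signal))


def trim_peaks(signal, ceiling):
    """Hard-limit every sample to the range [-ceiling, ceiling]."""
    return [max(-ceiling, min(ceiling, x)) for x in signal]


def set_level(signal, target_rms):
    """Rescale a signal to the required average (RMS) amplitude."""
    gain = target_rms / rms(signal)
    return [x * gain for x in signal]
```

In this sketch, trimming lowers the quasi-maximum more than it lowers the average level, so the peak factor decreases; `set_level` then restores the required interference level, mirroring the order of operations described in the text.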
After the interference has been modified, it must be checked whether it has retained its ability to mask the informative signal at the required level. Articulation tests [8] were performed to evaluate the effectiveness of the processed interference. Six tests were performed in total: 3 with the raw speech-like interference and 3 with the speech-like interference with trimmed peaks. In each test, a 150-second audio recording of a dialogue between two men was used as the informative signal. For each signal-to-noise ratio, the recording was listened to twice, and verbal intelligibility was recorded at each listening. The results of the tests are shown in table 5, where W_1 is the speech intelligibility with the unprocessed interference and W_2 is the speech intelligibility with the processed interference. Based on these results, we can conclude that the modification did not affect the effectiveness of the speech-like interference, while making it less annoying.
There are probably several reasons for the lack of changes in the test results in stages 1 and 2 of the experiments. One may be the young age of the participants, since the influence of annoying factors increases with age. In addition, the nervous system of young people can quickly return to normal after an annoying stimulus. Another reason for the lack of changes in the test results despite the discomfort and general malaise may be the choice of tests, which may be unable to detect small deviations in a person's psychoemotional state. However, discomfort in the form of headaches and annoyance still appeared in the participants, so speech-like interference does affect the human nervous system.
So-called cacophony may be one of the causes of the testees' annoyance when listening to speech-like interference based on syllabic and word articulation tables. This term denotes the formation of sound combinations that make no sense and sound unpleasant.
Since the speech-like interference is generated from syllables and words, i.e. combinations of sounds, cacophony arises when the voices overlap. The reduction of unpleasant effects on a human after trimming the peaks of the speech-like interference can be explained by the fact that the peaks carry a large instantaneous signal energy, which is most likely an irritant for the auditory system. This is clearly visible in the level diagram.

Conclusion
Experiments were conducted to identify the negative impact of "speech chorus" speech-like interference based on syllabic and word articulation tables. All testees experienced discomfort in the form of headaches, annoyance, and fatigue. The Thorndike test and Kraepelin's method were used to monitor changes in the state of the testees. The appearance of discomfort is primarily influenced by a person's individual characteristics (sensitive hearing, psychological diseases, and an unstable central nervous system) and by the activities that the person carries out while indoors. If a person is focused on solving the task set at the meeting, the risk of headaches and other consequences of using active protection devices is minimized. Negotiations should perhaps be limited to 1 hour, and to no more than 2 hours in any case, or breaks should be introduced every 30 minutes. After the modification of the speech-like interference described above, the irritating effect on a human decreased, as evidenced by the results of the repeated experiment. It is worth noting that the processed interference not only caused less discomfort but also, according to the articulation tests, did not lose its effectiveness. The reduction of unpleasant effects on a human after trimming the peaks of the speech-like interference can be explained by the fact that the peaks carry a large instantaneous signal energy, which is most likely an irritant for the auditory system. This is clearly visible in the level diagram. After trimming, the noise level diagram has no sharp changes in energy level, which results in less irritation of all parts of the human auditory system.