Selection of countermeasures against the propagation of harmful information via the Internet

Today, the Internet allows accessing and propagating various kinds of information. Some of this information can be undesired for a particular user or even harmful. While there are quite a large number of systems for responding to unwanted and harmful information, and a well-established set of countermeasures, such as filtering, blocking access, and notification, existing systems are often based on a single type of countermeasure. From our point of view, there is a need for a technique for the automated selection of optimal countermeasures against undesired or harmful information on the Internet, depending on the type of such information and on the characteristics and needs of the protected system. In this paper we propose a set of interconnected models, including a threat model, a countermeasure model and an information object model, as well as a countermeasure selection technique that uses these models to protect against harmful information on the Internet. The proposed technique uses single-criteria optimization on the basis of an introduced countermeasure selection index and allows selecting countermeasures that are applicable and optimal for a particular system against specific threats. The application of the technique is demonstrated in experiments.


Introduction
Today, the Internet allows accessing and propagating various kinds of information. Some of this information can be undesired for a particular user or even harmful. Harmful information can include annoying suggestions for gambling or lotteries; proposals, requests and/or orders to commit suicide, and/or descriptions of suicide as a way to solve problems; data about the production, transfer, sale or storage of child pornography; descriptions of the production, development and use of drugs and psychotropic substances; and others. Some information is forbidden for propagation by legislative acts; other information can be specified as undesired by a particular user. Today, parental control systems filter traffic based on security policies. Social media platforms protect a child from unwanted information if the user has specified their age. Some operating systems, such as iOS, provide similar protection by blocking access to pages with such content; again, the user must specify the correct age. When implementing parental control security policies, social networks and operating systems only analyze the messages and pages that the user interacts with. The tasks of countering the propagation of unwanted and harmful information and its distributors, and of evaluating user behavior on such pages, are neither set nor solved by social network platforms.
In fact, all existing tools implement the following actions: 1) checking information, and 2) restricting access to a resource with unwanted information [1,2]. While today there are quite a large number of systems for responding to unwanted information, and a well-established set of countermeasures, such as filtering, blocking access, and notification, existing systems are often limited to the above-mentioned approach based on one type of countermeasure, i.e. blocking access. From our point of view, there is a need to classify countermeasures and to develop models and a technique for the automated selection of optimal countermeasures against unwanted or harmful information on the Internet, depending on the type of such information and on the characteristics and needs of the protected system. In this paper we propose a set of interconnected models, including a threat model, a countermeasure model and an information object (IO) model, as well as a countermeasure selection technique for protection against harmful information on the Internet. The proposed technique uses single-criteria optimization on the basis of an introduced countermeasure selection index and allows selecting countermeasures that are applicable and optimal for a particular system against specific threats.
The paper is organized as follows. The second section reviews related work and solutions. The proposed models and technique are given in Section 3. The implementation and experiments are described in Section 4. The paper ends with conclusions and a discussion of future research.

Related works
Parental control systems, anti-virus programs, and the methods, models and algorithms developed in this field are the first line of protection against unwanted information.
Due to the huge volume of media content available from various sources (for example, TV, the Internet, online games, etc.), parental controls are becoming increasingly important for censorship and access control of media content. Parental control systems traditionally allow users to restrict access to specific types of media content based on media characteristics, such as ratings, time, or resource type. In [3], mechanisms for filtering access to media content are proposed. The authors of [4] suggest a way to intercept information about a file at the moment it is saved to the device; their approach can run in single-user mode (offline). In [5], a method for controlling traffic and access to information in a local network environment is proposed. In [6], a method for making parental control approval decisions about user content is proposed, in which the parent approves the child's rights to upload a media file from an external website or resource.
Most existing systems for countering unwanted information on the Internet use methods based on the classification of web pages and resources, and countermeasures such as URL blacklists. However, traditional methods are being improved by machine learning algorithms. In [7] the authors state that blacklisting is now largely ineffective at catching both variants of malicious URLs and newly created ones. In addition, creating and updating lists requires human input, and blacklisting is a time-consuming process. Classical machine learning solutions rely on a feature engineering stage to extract characteristics such as linguistic, lexical, contextual, or semantic attributes, URL string statistics, n-grams, bags of words, link structures, content composition, DNS information, network traffic, etc. Such solutions must recognize new malicious URLs, and these systems are not yet accurate enough. The authors tested several deep learning methods, such as recurrent neural networks, long short-term memory networks, convolutional neural networks, and identity-recurrent neural networks. Algorithms and methods based on deep learning can extract features automatically from raw input texts. [8] presents a method for detecting malicious web pages by analyzing and scanning the domain.
In fact, the blacklist and whitelist policy is not really applicable for protection against unwanted information, for example in social networks, because the URL of an IO contains not the attributes of the IO itself but the attributes of the path to the object. On the page of a news portal, the domain of the social network platform, the authors, and the messages are all connected. Adding the path of such a portal to the blacklist leads to excessive blocking.
The need to use algorithms for semantic text analysis confronts the owners of systems for countering unwanted information with the problem of analyzing extremely large amounts of information. Search and analytical systems and their clients (browsers) take the lead in developing such approaches. In [9], a machine learning method is demonstrated that allows studying a variety of messages in the social network Twitter, as well as user moods, and conducting analytics. Such works are also popular in marketing, economics and finance for assessing consumer sentiment. In [10], a method is proposed for evaluating the tone of messages in the social network Twitter based only on emoticons. In [11], the Heracles platform for developing text analysis algorithms is proposed. In addition, there are also many solutions that protect users of the Internet and social networks from unwanted information based on the analysis of the relationship graph. Patent [12] is devoted to methods for protecting a social network user from prohibited objects (text, photos, media files) that were previously saved in a database. This technical result is achieved by building a social graph, a cluster of accounts in a social network, and a database of prohibited objects that are associated with the social graphs and accounts. The invention relates to parental control systems that restrict user access to social network resources in accordance with the rules set by the parent. In [13], a method for quickly identifying similar communities and social network participants is proposed. During the analysis, the system selects criteria for profile similarity and generates a list of account IDs that meet these criteria. Finally, in the created social graph, edges characterize the direct or indirect connections between social network accounts.
A review of existing studies shows that most existing solutions are based on a methodology in which the IO is checked and, if unwanted information is detected, access to the resource is limited. The review also shows that today there is a well-established set of countermeasures, such as filtering, blocking access, and notification. However, existing systems for countering unwanted information use only one type of measure from this set. Therefore, there is a need to classify countermeasures and to develop a model and a technique for automatically selecting optimal countermeasures against unwanted information on the Internet.

Countermeasure selection technique
The developed technique is based on analytical modelling of the counteraction process and on decision support methods. In the process of analytical modelling we specified the object affected by harmful information, the threats against this object, and the countermeasures against these threats. The developed technique uses single-objective optimization of a countermeasure selection index. The first subsection considers the countermeasures, their types, and their applicability against particular threats, as well as the models of the countermeasure, the threat and the object. The second subsection describes the proposed technique.

Countermeasures and models for countermeasure selection
We understand a countermeasure as an action or tool for protecting an object from an information security threat, where the information security threat is represented by harmful information. We limit the object to an IO on the Internet. We outlined the following properties for the specification of information objects: object size; type of the information object (harmful or not); and harmful object class. Based on these properties we specified the object model as follows: IO = <size, type, class>, where:
- size is the size class of the information object; it can take values {s, m, l}, where s means small (sio), m means medium (mio), and l means large (lio);
- type is the type of the information object; it can take values {h, n}, where h means harmful and n means not harmful;
- class is the class of a harmful information object; it can take values {cour, reg, lot, suic, por, dru, from6, from12, from16, from18, none}, where cour is an IO containing data earlier forbidden by a court's decision; reg is an IO containing data earlier blocked by authorized bodies; lot is an IO containing suggestions for gambling or lotteries; suic is an IO containing a proposal, request and/or order to commit suicide, and/or describing suicide as a way to solve problems, as well as endorsement of suicide; por is an IO containing data about the production, transfer, sale or storage of child pornography, or the acquisition of child pornography; dru is an IO containing a description of the production, development and use of drugs and psychotropic substances and/or a description of the cultivation locations of plants containing drugs and/or the places of their sale and prices; from6 is an IO containing information for children from 6 years old; from12 is an IO containing information for children from 12 years old; from16 is an IO containing information for children from 16 years old; from18 is an IO containing information forbidden for children; and none means not applicable (if type is n).
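The IO model above can be encoded as a small data structure. The following is a minimal sketch in Python; the field and value names follow the model in the text (the field `class` is renamed `cls` because `class` is a reserved word in Python), and the validation rules are our reading of the model, not code from the paper:

```python
from dataclasses import dataclass

SIZES = {"s", "m", "l"}    # small (sio), medium (mio), large (lio)
TYPES = {"h", "n"}         # harmful, not harmful
CLASSES = {"cour", "reg", "lot", "suic", "por", "dru",
           "from6", "from12", "from16", "from18", "none"}

@dataclass(frozen=True)
class IO:
    """Information object model IO = <size, type, class>."""
    size: str  # one of SIZES
    type: str  # one of TYPES
    cls: str   # one of CLASSES

    def __post_init__(self):
        # Reject attribute values outside the enumerations of the model.
        if self.size not in SIZES or self.type not in TYPES or self.cls not in CLASSES:
            raise ValueError("invalid IO attribute")
        # The model states that class is "none" when the object is not harmful.
        if self.type == "n" and self.cls != "none":
            raise ValueError("non-harmful objects must have class 'none'")

# A medium-sized harmful object with drug-related content:
io = IO(size="m", type="h", cls="dru")
```

Making the dataclass frozen keeps information objects immutable, so they can safely be used as dictionary keys or set members when matching countermeasures against them.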
We analyzed possible countermeasures and outlined the following properties for their specification: the class of affected threats; implementation type; implementation agent; class of the information object; and complexity of implementation. Based on these properties we specified the countermeasure model rm as follows: rm = <rm_name, rm_thr, rm_obj, rm_impl, rm_agent, rm_cost>, where:
- rm_name is the countermeasure type (message filtration for sio in the browser; blocking a mio/lio URL by the internet provider; mio/lio URL filtration by antivirus/parental controls/browser/operating system; black SEO; user notification for sio/mio/lio by browser/antivirus/parental controls/operating system; administrator notification for sio/mio/lio; internet provider notification for sio/mio/lio);
- rm_thr is the class of affected threats (cour, reg, lot, suic, por, dru, from6, from12, from16, from18, none);
- rm_obj is the class of the information object (small, medium, or large);
- rm_impl is the implementation type (manual or automated);
- rm_agent is the implementation agent (browser, internet provider, parental controls, antivirus, mobile application, operating system);
- rm_cost is the complexity of countermeasure implementation (it depends on rm_impl and rm_agent and is currently specified by experts).
The class of affected threats connects the threat model and the countermeasure model, the class of the information object connects the countermeasure model and the object model, and the agent and implementation type connect the countermeasure model with the protected information system. Connections between the threats, countermeasures and object sizes are shown in table 1 using the example of the drug information threat.
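The countermeasure model and the connections it establishes can likewise be sketched in code. The helper `is_applicable` below is our illustration of how rm_thr, rm_obj and rm_agent link a countermeasure to an information object and to the protected system; the function name and the example values are ours, not from the paper:

```python
from collections import namedtuple
from dataclasses import dataclass

# Minimal stand-in for the information object model IO = <size, type, class>.
IO = namedtuple("IO", ["size", "type", "cls"])

@dataclass(frozen=True)
class Countermeasure:
    """Countermeasure model rm = <rm_name, rm_thr, rm_obj, rm_impl, rm_agent, rm_cost>."""
    rm_name: str        # e.g. "mio/lio URL filtration by antivirus"
    rm_thr: frozenset   # classes of affected threats, e.g. {"dru", "lot"}
    rm_obj: frozenset   # applicable object size classes, subset of {"s", "m", "l"}
    rm_impl: str        # "manual" or "automated"
    rm_agent: str       # "browser", "internet provider", "parental controls", ...
    rm_cost: float      # expert-assigned implementation complexity

def is_applicable(rm, io, available_agents):
    """A countermeasure applies if it affects the object's threat class,
    covers the object's size class, and its implementation agent is
    present in the protected system."""
    return (io.cls in rm.rm_thr
            and io.size in rm.rm_obj
            and rm.rm_agent in available_agents)

cm = Countermeasure("mio/lio URL filtration by antivirus",
                    frozenset({"dru"}), frozenset({"m", "l"}),
                    "automated", "antivirus", 2.0)
io = IO("m", "h", "dru")
applicable = is_applicable(cm, io, {"antivirus", "browser"})  # True
```

A system without an antivirus agent (for example, only parental controls) would make the same countermeasure inapplicable, which is exactly the role the rm_agent property plays in the model.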
The countermeasures should stop the propagation of harmful information. The technique for countermeasure selection is described in the next subsection.

Technique for countermeasure selection
The developed technique takes as input the harmful information object, the threats relevant to this object, and the list of countermeasures specified using the models described in the previous subsection. The technique includes two main stages: (1) selection of the applicable countermeasures considering the characteristics of the protected system, the information object and the threats; (2) selection of the optimal countermeasures on the basis of the integral countermeasure selection index. The technique incorporates the following steps:
1. Input an information object io that has type = h; its class is given as input from the harmful information detection system.
2. Get rm_agent and the other characteristics that depend on the protected system.
3. Select the applicable countermeasures cms by matching the threat class, the object size and the available agents.
4. Calculate the countermeasure selection index rm_ind for each countermeasure in cms.
5. Select the countermeasures scms from cms with the maximum rm_ind.
6. Output scms.
As the result, the technique outputs the set of optimal countermeasures against the detected harmful information object for the particular system under protection.
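The steps above can be sketched as a single-criteria selection routine. Note that the paper introduces the index rm_ind but does not give its exact formula, so the `rm_ind` function below is a hypothetical placeholder (lower implementation cost yields a higher index); all other names follow the models in the text:

```python
from collections import namedtuple

IO = namedtuple("IO", ["size", "type", "cls"])
CM = namedtuple("CM", ["rm_name", "rm_thr", "rm_obj", "rm_impl", "rm_agent", "rm_cost"])

def rm_ind(cm):
    # Hypothetical countermeasure selection index: the paper does not give
    # its formula, so here a lower rm_cost simply means a higher index.
    return 1.0 / (1.0 + cm.rm_cost)

def select_countermeasures(io, countermeasures, available_agents):
    """Steps 1-6 of the technique: keep the applicable countermeasures,
    then return those with the maximum selection index."""
    if io.type != "h":
        return []  # only harmful objects are processed (step 1)
    # Stage 1 / step 3: applicability by threat class, size and agent.
    cms = [cm for cm in countermeasures
           if io.cls in cm.rm_thr
           and io.size in cm.rm_obj
           and cm.rm_agent in available_agents]
    if not cms:
        return []
    # Stage 2 / steps 4-5: single-criteria optimization of rm_ind.
    best = max(rm_ind(cm) for cm in cms)
    return [cm for cm in cms if rm_ind(cm) == best]

cms = [CM("antivirus filtration", {"dru"}, {"m", "l"}, "automated", "antivirus", 2.0),
       CM("provider blocking", {"dru"}, {"m", "l"}, "automated", "internet provider", 5.0)]
sel = select_countermeasures(IO("m", "h", "dru"), cms, {"antivirus", "internet provider"})
# With this placeholder index the cheaper antivirus countermeasure is selected.
```

Returning all countermeasures that share the maximum index (rather than a single one) keeps ties explicit, which matches the fact that the technique outputs a set scms rather than one countermeasure.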
Implementation and experiments

Figure 1 represents the distribution of the results of the technique's operation, i.e. the distribution of the optimal countermeasures against simulated information objects containing harmful information (threats).

Figure 1. Distribution of optimal countermeasures for the system modelled in the experiment: 44.37% are countermeasures implemented by antivirus filtering tools; 31.72% are filtering in the browser; 23.04% are notifications to the administrator of the resource where the information object is located; 0.86% are filtering by the parental control system.
As part of the experiment, we used a set of threats, information signs, and objects specific to the legal field of Russia. However, any operator, whether a parent or an authorized body, can set its own rules and countermeasures. The proposed technique is flexible (it can be implemented on the user's, internet provider's or authorized body's side) and automatically determines which countermeasures are the most or least available and optimal under the given conditions.

Conclusion
The paper considered the challenge of counteracting the propagation via the Internet of information that is undesired for a particular user or even harmful. The models of the counteraction process, including the threat model, the countermeasure model and the information object model, were developed. A technique for the automated selection of optimal countermeasures against such information using the developed models was proposed. It considers the information type as well as the characteristics and needs of the protected system. It uses single-criteria optimization on the basis of an introduced countermeasure selection index and allows selecting countermeasures that are applicable and optimal for a particular system against specific threats. The technique is flexible and can be implemented on the user's, internet provider's or authorized body's side. The operation of the technique was tested on different types of objects, systems and countermeasures. For the experiment a dataset (~10 million) containing information about objects with various threats was used. The experimental data showed that the choice of optimal countermeasures was distributed as follows: 44.37% antivirus filtering tools, 31.72% filtering in the browser, 23.04% notification of the resource administrator, and 0.86% filtering by the parental control system. The experiment demonstrated the possibility of implementing the developed technique and its applicability to the selection of countermeasures. The technique can be applied in public control systems (schools, educational institutions), in the corporate segment (helping to restrict access to information resources in the company's network), and for personal use (parental control systems, filtering).