Research Problems in Recommender Systems

With the continuous growth of web applications around the globe, it is a challenge for users to find the information they need in a limited time. The number of handheld mobile devices is increasing, and most business revolves around finding the right data. Without a proper recommender system, it is very difficult to get the required information from web applications. Web applications use recommender systems to provide suitable data to users based on their choices and interests. Different types of recommender systems have been proposed for different kinds of needs. The two most basic types are the collaborative filtering recommender system and the content-based recommender system. Sometimes these two are combined to improve the quality of recommendations; the resulting system is known as a hybrid recommender system. The purpose of this paper is to help readers understand the basics of recommender systems. It identifies key research areas that are open to new researchers. After reading this paper, new researchers can understand the basic problems of recommender systems that need improvement and can make those problems their area of research.


Introduction
Recommender systems help us get the data we need by filtering the information that is relevant to the user. Today almost every system holds a large amount of data. [1,2] Examples of systems in which recommenders are needed are YouTube, Netflix and E-Commerce platforms such as Flipkart and Amazon. The scenario we face today is that data is increasing while screen size is decreasing. By screen size we mean that systems were initially used from desktops and laptops with screens of around 15 inches, whereas now they are used from mobile devices with screens ranging from 4 to 7 inches. [3,4,5,6,7] When a user searches for an item and it does not appear in the first 5 or 10 results, the user leaves the system and looks for the item on another system. The biggest irony is that the item may be available but did not appear at the top of the user's search results. In this case the user will buy the item from a competing E-Commerce platform rather than from the platform that was unable to provide a recommendation. A recommendation system can therefore increase the sales of an application. Many E-Commerce platforms have failed because they lacked a good recommendation system. A good recommender system also saves the user's time and keeps the user engaged with the system, resulting in higher revenue. [8] Top companies that use recommendation systems include Google, YouTube, Netflix, Flipkart, Amazon, Prime, gaana.com and many more. Every system comes with its advantages and disadvantages, and recommendation systems likewise face many problems that are yet to be solved effectively.
Figure 1. Model of recommendation process
The purpose of this paper is to make the reader aware of recommender systems and their major techniques. The paper also explores research problems in recommender systems, based on our extensive study of the papers listed in the reference section. We have identified some key areas of research that are open to new researchers, so that Masters and PhD students can take up these topics as their area of research and contribute to the development and improvement of the new generation of recommender systems.
We have included papers from as early as 1997 up to 2020; more than 50 papers are covered in the study. We have not only identified problems but have also included papers presenting the latest solutions, so that researchers can understand each problem in detail.

GENERAL CONCEPTS
A recommender system is a system that helps users choose items they may need. Different artificial intelligence and machine learning techniques are applied to achieve this. [9,10] Examples of recommender systems can be found on google.com, amazon.com, Netflix, and other popular e-commerce, music and video portals available online. Since these systems have millions of items, they cannot function properly without a good recommender system.
In a recommendation-system application, there are two classes of entities, which we shall refer to as users and items. The formal definition of the recommender system is:
• C: The set of all users.
• S: The set of all possible items that can be recommended, for example, videos, songs and books.
• U: A utility function that measures the usefulness of a specific item $s \in S$ to user $c \in C$, i.e., $U: C \times S \to R$, where R is a totally ordered set.
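Under the definition above, the recommendation task is commonly stated as choosing, for each user, an item of maximal utility. The formula below restates this; the symbol $s'_c$ (the item chosen for user $c$) is notation introduced here for illustration and is not part of the definition above.

```latex
\forall c \in C, \qquad s'_c = \arg\max_{s \in S} U(c, s)
```

In practice U is known only for the (user, item) pairs that have been rated, so it has to be estimated for the remaining pairs before the maximum can be taken.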
The space S of feasible items can be very large, ranging into hundreds of thousands or even millions of items in some applications, such as recommending books or CDs. Similarly, the user space can also be very large, millions in some cases. [11] In a recommender system, the usefulness of an item is determined by its rating. A rating is a measure of how much the user liked an item. Because it is given by the user, a reliable rating has extra value in understanding the user's choices. Ratings can be collected in different ways. A common form is a scale of 1 to 5, as used for apps in Google Play. A scale of 1 to 10 is also used by many rating methods, including those of many customer service agencies. Whatever rating scale is used, one end expresses liking and the other end expresses the extent of dislike. [12] A user profile can be generated by storing traits such as age, gender, area, email, mobile number and so on. An item profile can be generated from the features of the item. For a book, these are the language, author, genre, cost, publisher, etc. For a television, they are the brand, features, power consumption and many other attributes. The way the profile is created has an important impact on the recommendation system. [13] Ratings are available for a subset of the data rather than the entire data. A rating matrix is created between users and items, and this becomes the heart of the recommender system. The way the rating matrix is analysed defines the recommender system. Different domains use different techniques for extracting data from the user-item matrix. [14,15,5] Recommender systems recommend different items to the user based on whatever machine learning or artificial intelligence techniques they apply to the rating matrix. [15] Good recommender systems improve with user feedback.
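To make the idea of a user-item rating matrix concrete, the following is a minimal sketch in Python. The user names, item names, ratings and the helper `build_rating_matrix` are all hypothetical and chosen only for illustration; unrated cells are stored as `None`.

```python
# Hypothetical explicit ratings on a 1-to-5 scale: (user, item, rating).
ratings = [
    ("alice", "book_a", 5),
    ("alice", "book_b", 3),
    ("bob", "book_a", 4),
    ("bob", "book_c", 1),
    ("carol", "book_b", 2),
]

def build_rating_matrix(ratings):
    """Return (users, items, matrix); matrix[i][j] is None when unrated."""
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    row = {u: k for k, u in enumerate(users)}
    col = {i: k for k, i in enumerate(items)}
    matrix = [[None] * len(items) for _ in users]
    for user, item, value in ratings:
        matrix[row[user]][col[item]] = value
    return users, items, matrix

users, items, matrix = build_rating_matrix(ratings)
for user, values in zip(users, matrix):
    print(user, values)
```

Even in this tiny example several cells are empty, which reflects the point that ratings cover only a subset of the data.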
A good recommender system also makes good recommendations even when very few ratings are available. The accuracy of a good recommender system increases as the user's history in the system grows. [16]

TECHNIQUES USED IN RECOMMENDATION SYSTEM
Recommender systems are categorised on the basis of the rating techniques they use. The way a system takes ratings or predicts user preferences defines the recommender system. Figure 2 describes the basic techniques used in recommender systems, and the coming sections give a brief description of each. Other techniques also exist, which are combined to generate hybrid recommender systems; Figure 4 describes them.

Collaborative Filtering
The collaborative filtering system is a system where we take input from users and use those inputs to find relations between users and items. [10,14,15] For example, a user in a collaborative filtering system is asked to rate a particular item. Similarly, other users rate different items. In this way, we get a user-item rating matrix. The ratings given by a user provide historical data about the user's choices. From these choices we can create a profile of the user. We can then work out which new items the user may purchase on the basis of this history and profile.

Content based
A content-based system recommends items to the user based on the description of the item rather than the history of the user. [9,17] These types of recommender systems are used in newspaper and article recommendation systems. They may have little information about the user. Figure 3 explains the process of recommendation.
These recommender systems do not have a history of the user. The most common technique used by content-based recommender systems is item profiling, in which the properties of each item are recorded; the popularity of the item is also recorded. User profiling is sometimes done as well, using whatever basic information the system has about the user. Item profiles are based on the properties of items, and the recommender system uses these properties to recommend items to different users. Unlike collaborative filtering recommender systems, content-based systems do not have rating information. User profiles can also be created from the likes and dislikes a user expresses for particular items. [9,17] When we want to recommend an item to the user, the user's interests are compared against the item properties, also known as the content features of items. The process includes the following: [6,18]
• The system contains a large database of the items to be recommended. This database consists of the features of the items and is known as the item profile database.
• Users provide a little information about their preferences, likes and dislikes, and from this information the system builds a user profile.
• The recommendation is made on the basis of a comparison of the item profiles with the user's interests.
One can make better personalised recommendations by utilising the features of items and users. An item profile is defined by its essential features. For example, a book can be described using its title, genre, language, publisher, cost, etc. Using a weighting procedure, similarity can be calculated between items. In some domains, features can be represented by boolean values, while in others they can be represented by a restricted set of values. Consider the example of a newspaper, where articles are analysed on the basis of different kinds of topics. A boolean value indicates whether a word is present in the article or not. An integer value could indicate the number of times a word appears in an article. This method gives successful recommendations in content-based recommender systems without using explicit ratings. [19,20]
Figure 3. Content based Process
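The boolean and integer word features described above can be sketched as follows. This is a minimal illustration, not the method of any cited paper: the article texts, the keyword-based user profile and the function names are hypothetical, and cosine similarity is assumed as the comparison between the user profile and each item profile.

```python
import math
from collections import Counter

def count_vector(text):
    """Integer features: how many times each word appears."""
    return Counter(text.lower().split())

def boolean_vector(text):
    """Boolean features: 1 if the word is present, regardless of count."""
    return {word: 1 for word in text.lower().split()}

def cosine(a, b):
    """Cosine similarity between two sparse word vectors (dicts)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical articles and a user interest profile made of keywords.
articles = {
    "sports": "cricket match cricket score team win",
    "finance": "market stock price bank market",
    "travel": "beach hotel flight travel",
}
user_profile = count_vector("cricket team score")

def rank_articles(profile, articles, vectorize=count_vector):
    """Order article names by similarity to the profile, best first."""
    scored = [(cosine(profile, vectorize(text)), name)
              for name, text in articles.items()]
    return [name for score, name in sorted(scored, reverse=True)]

print(rank_articles(user_profile, articles))
```

Passing `vectorize=boolean_vector` switches the same ranking from word counts to word presence. No explicit ratings are used at any point.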

Hybrid Recommendation System
This system makes recommendations using a mixture of techniques, and it is one of the recent trends in recommendation systems. The table given shows different types of recommendation techniques; a hybrid system uses a combination of them. The recommendation system used by Facebook seems to make use of all these techniques. A hybrid recommender takes advantage of all the techniques, but one should be careful, as a hybrid may involve a lot of computation and may give conflicting results. [6,21,22] Figure 4 describes some popular techniques that are combined to generate hybrid recommender systems. Techniques are combined when doing so improves the results of the recommendation.
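One simple way of combining two recommenders is a weighted blend of their scores. The sketch below assumes this weighting scheme purely for illustration; it is not claimed to be what Facebook or any cited system does, and the function name, the `alpha` parameter and the scores are hypothetical.

```python
def weighted_hybrid(cf_scores, content_scores, alpha=0.5):
    """Blend two score dicts; alpha is the weight on collaborative filtering.

    An item missing from one recommender contributes 0 from that side.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    items = set(cf_scores) | set(content_scores)
    return {
        item: alpha * cf_scores.get(item, 0.0)
        + (1.0 - alpha) * content_scores.get(item, 0.0)
        for item in items
    }

# Hypothetical scores in [0, 1] produced by two separate recommenders.
cf_scores = {"item_a": 0.9, "item_b": 0.2}
content_scores = {"item_b": 0.8, "item_c": 0.6}

blended = weighted_hybrid(cf_scores, content_scores, alpha=0.5)
ranking = sorted(blended, key=blended.get, reverse=True)
print(ranking)
```

The two recommenders disagree about which item is best, and the blend resolves the conflict through `alpha`; choosing that weight well is part of the extra cost of a hybrid.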

RESEARCH PROBLEMS
Recommendation systems have a great future. However, some problems are yet to be solved by the research community to make these systems more efficient. Some of the problems that we believe can be solved are listed below. Figure 5 shows all the research problems addressed in the following sections. These research problems can give ideas for work in the area.

Gathering Known Ratings for Matrix
It has been observed that most users do not give any ratings. A research problem therefore arises: how do we know whether users are satisfied with a product, and to what extent? There are two ways of collecting ratings. One is explicit, such as asking users after they have purchased or viewed an item. The other is to predict their rating for a particular item based on their preferences for other items; this is known as the implicit method of collecting ratings. According to our research, there is a sufficient gap between explicit and implicit methods, which gives researchers an opportunity to start research in this domain. [24,25]

Cold Start Problem
A cold start problem arises when no information is found about the user or item in the system. A system that requires information about the user and item before recommending fails in this case. We can distinguish three subproblems in the domain of cold start problems. The first arises when we do not have any information about a new user who is entering the system. This only happens when a user joins the system for the first time, for example when joining Amazon or Flipkart for the first time. This problem is known as the new user cold-start problem. [26] The second arises when we introduce a new item to the system, one that is the first of its kind. The recommender system is unable to find any ratings associated with this item. A collaborative filtering system, which needs the user-item rating matrix in order to give a recommendation, is unable to start; this is known as the cold start item problem. The third arises when we launch the system for the first time. In this case, we have neither user information nor item information. In other words, we do not have the user-item rating matrix that collaborative recommender systems require to work properly. This problem is known as the cold start system problem. Well-known content-based solutions can be applied to cold start problems. [27,28] Other solutions use a combination of various machine learning techniques. [29,30]
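The three cold start cases described above can be told apart by inspecting the rating data. The following is a minimal sketch; the function name `cold_start_case`, the dictionary layout (user mapped to a dictionary of item ratings) and the sample data are assumptions made for illustration.

```python
def cold_start_case(ratings, user, item):
    """Classify which cold start case applies, if any.

    ratings maps user -> {item: rating}. Returns one of
    "system", "new_user", "new_item" or None.
    """
    if not any(ratings.values()):
        return "system"       # no ratings at all: empty user-item matrix
    if not ratings.get(user):
        return "new_user"     # this user has rated nothing yet
    if not any(item in rated for rated in ratings.values()):
        return "new_item"     # nobody has rated this item yet
    return None               # some history exists for both user and item

# Hypothetical data: u1 has rated one item, u2 has just registered.
ratings = {"u1": {"a": 4}, "u2": {}}
print(cold_start_case(ratings, "u2", "a"))
print(cold_start_case(ratings, "u1", "b"))
```

A system could use such a check to decide when to fall back from collaborative filtering to a content-based approach.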

Sparsity Problem
In recommendation systems, it has been noticed that most users use the system but do not give ratings or feedback in a proper way. So even though many users may use the recommender system many times, we may have very few ratings from them for the items they have liked, purchased or even disliked. Ratings seem useless to users, so they avoid giving them, or sometimes give false ratings, such as 5 stars (considered best) or 1 star (considered an unlikeable item), without even noticing what kind of product it is. The recommender system takes these as input and then displays unwanted results to the user, who may lose interest in the platform, leading to inefficient operation. Because ratings have no significance for users, they often leave the bulk of products unrated, which leads to the same problem. [31] The sparsity problem arises from the sparsity of the user-item rating matrix. In mathematical terms, the user-item rating matrix is sparse: most of its entries are missing. This gives rise to the problem known as the sparsity problem in recommender systems. [32,33]
This problem belongs to the domain of collaborative filtering recommender systems. It gives new researchers unique opportunities to find new ways to predict the missing data. [34]
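As an illustration of measuring sparsity and predicting a missing rating, the sketch below uses a basic user-based collaborative filtering scheme: a similarity-weighted average of other users' ratings. This particular scheme, the cosine similarity over co-rated items, the function names and the data are assumptions chosen for the example, not a method taken from the cited papers.

```python
import math

# Hypothetical user -> {item: rating} data on a 1-to-5 scale.
ratings = {
    "alice": {"a": 5, "b": 3, "c": 4},
    "bob":   {"a": 5, "b": 3},
    "carol": {"a": 1, "b": 5, "c": 2, "d": 5},
}

def sparsity(ratings):
    """Fraction of user-item cells that have no rating."""
    items = {i for rated in ratings.values() for i in rated}
    total = len(ratings) * len(items)
    known = sum(len(rated) for rated in ratings.values())
    return 1.0 - known / total if total else 1.0

def similarity(r1, r2):
    """Cosine similarity over the items both users have rated."""
    common = set(r1) & set(r2)
    if not common:
        return 0.0
    dot = sum(r1[i] * r2[i] for i in common)
    n1 = math.sqrt(sum(r1[i] ** 2 for i in common))
    n2 = math.sqrt(sum(r2[i] ** 2 for i in common))
    return dot / (n1 * n2)

def predict(ratings, user, item):
    """Similarity-weighted average of other users' ratings for item."""
    num = den = 0.0
    for other, rated in ratings.items():
        if other == user or item not in rated:
            continue
        w = similarity(ratings[user], rated)
        num += w * rated[item]
        den += w
    return num / den if den else None

print(sparsity(ratings))
print(predict(ratings, "bob", "c"))
```

The predicted rating for bob on item "c" leans towards alice's rating because her past ratings match his more closely than carol's do. As the matrix gets sparser, fewer co-rated items are available and the similarities become less reliable, which is the core of the sparsity problem.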

Scalability
Scalability is the property that defines whether a system will be able to cope as it grows. [34] In the case of recommender systems, a scalability problem can be understood as a situation where a recommender system performs very well with few users, such as 1000 users, but starts performing in an undesirable way as the number of users grows to 10000 or 100000. When the system faces scalability issues, it becomes slow, starts failing, and exhibits problems that never appeared when the load of users and recommendations was smaller.
Scalability issues can be divided into two parts: hardware scalability and software scalability. Hardware scalability is about adding hardware to solve the scalability problem. For example, one can upgrade the processor, RAM and server configuration. But increasing hardware capacity alone cannot solve the problem. [34] Software scalability is about writing algorithms and using methods that continue to work well when the hardware configuration is increased as needed in future. This is a major problem and is not as easy as it seems, because some algorithms perform very well when the amount of data they operate on is small but start performing inefficiently as the data increases. The accuracy of the prediction decreases as the data increases. Some algorithms are not able to utilise the increased capacity of the hardware and hence create a scalability problem. [32] This is therefore also an open area of research in recommender systems. Upcoming technology is moving towards parallel processing, Hadoop and other big data architectures, so this problem needs to be addressed. It gives upcoming researchers a new area in which to start research and solve the problem using innovative methods. [33]

Over Specialization problem
This problem arises when the recommended items are too similar to each other. For example, a user regularly buys grocery items from a shopping website, and whenever he or she opens the website, the recommender system recommends only sugar, perhaps of different brands (on the basis of the most frequent purchase). Seeing the same item again and again, with no new suggestions, may lead the user to switch to a shopping website whose recommender system offers better, more interesting and personalised results. In this problem, the user loses interest in the recommended items. One solution is recommendation diversification. In this method, we list items that are dissimilar to each other but still of interest to the user. [3]
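Recommendation diversification can be sketched as a greedy re-ranking that trades relevance against redundancy, in the spirit of maximal marginal relevance. This technique is swapped in here as an illustration and is not taken from the source; the catalogue, the relevance scores, the category-based similarity and the `trade_off` parameter are all hypothetical.

```python
def diversify(candidates, relevance, similarity, k, trade_off=0.7):
    """Greedily pick k items, balancing relevance against redundancy.

    At each step the score of a candidate is
    trade_off * relevance - (1 - trade_off) * max similarity to picked items.
    """
    picked = []
    remaining = list(candidates)
    while remaining and len(picked) < k:
        def score(item):
            redundancy = max((similarity(item, p) for p in picked), default=0.0)
            return trade_off * relevance[item] - (1 - trade_off) * redundancy
        best = max(remaining, key=score)
        picked.append(best)
        remaining.remove(best)
    return picked

# Hypothetical grocery catalogue: item -> category, plus relevance scores.
category = {"sugar_x": "sugar", "sugar_y": "sugar", "sugar_z": "sugar",
            "tea": "beverage", "flour": "baking"}
relevance = {"sugar_x": 0.9, "sugar_y": 0.85, "sugar_z": 0.8,
             "tea": 0.6, "flour": 0.5}

def same_category(a, b):
    """Toy similarity: 1.0 for items in the same category, else 0.0."""
    return 1.0 if category[a] == category[b] else 0.0

print(diversify(category, relevance, same_category, k=3))
```

With `trade_off=1.0` the function reduces to ranking by relevance alone and returns three brands of sugar, which is the over-specialised behaviour described above. Lower values penalise items that resemble ones already picked.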

Lack of Data
Perhaps the biggest issue facing recommender systems is that they need a lot of data to make recommendations effectively. In the world of recommender systems, it is common practice to use a publicly available dataset from a different environment to evaluate the efficiency of recommendation algorithms. [35] These datasets are very important and are used as benchmarks for developing new recommendation algorithms. Top companies such as Google and Amazon make good recommendations because they have a lot of consumer data. A recommender system first needs consumer and item data (from different sources), then it must perform some statistical analysis based on some procedure (observation of user behaviour or events), and only then does the recommendation algorithm do its work. The more consumer and item data we have, the better the recommendations the algorithm can produce. [36,37]

Changing data
An important challenge for recommender systems is changing data: they are usually biased towards the past and have difficulty dealing with, or recommending, what is new. Many recommendation systems collect user data or make use of past user behaviour, but relying on past behaviour alone is not an efficient basis for good recommendations, because users' trends and interests are always changing, and it becomes very difficult for the recommendation system to react to a changing data environment. For example, a newlywed couple usually looks for dresses, beauty products and lifestyle items, but with the passage of time and change of circumstances their interest shifts to baby products such as diapers and baby food. This is entirely different from their previous searches, which may confuse the recommender system and lead it to display results that are no longer of interest to the buyer. [38]

Changing user preferences
The next important challenge is coordinating with the rapidly changing preferences of the user. As observed above, a user's intention in browsing an item may differ from one time to another, so recommendation systems that are based entirely on user preferences may make wrong recommendations. For example, today I may browse books for myself, but tomorrow I may browse sports items; or a 10-year-old child may search for multiple items without any thought, just scrolling with no intention to purchase. In such cases a recommendation system based on user preferences may recommend the wrong items, so keeping up with frequently changing user preferences is a most important issue in recommendation systems. [38]
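One way to reduce the bias towards old behaviour is to weight each past interaction by its age, so that recent activity counts for more. The sketch below assumes exponential decay with a configurable half-life; this is an illustrative choice rather than a method from the source, and the function name, the `half_life_days` parameter and the browsing log are hypothetical.

```python
def decayed_interest(events, now, half_life_days=30.0):
    """Sum interactions per category, halving each weight every half_life_days.

    events is a list of (category, day) pairs; now is the current day number.
    """
    scores = {}
    for category, day in events:
        age = now - day
        weight = 0.5 ** (age / half_life_days)
        scores[category] = scores.get(category, 0.0) + weight
    return scores

# Hypothetical browsing log: many old lifestyle views, a few recent baby ones.
events = [("lifestyle", d) for d in (0, 5, 10, 15, 20)] + \
         [("baby_products", d) for d in (350, 360)]

scores = decayed_interest(events, now=365)
top = max(scores, key=scores.get)
print(top)
```

A plain count would favour "lifestyle" (five views against two), whereas the decayed scores favour the recent interest. A short half-life reacts quickly to change but is also more easily misled by one-off browsing of the kind described above.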

Unpredictable items
The next important challenge in recommender systems is unpredictable items based on different rankings. An item recommender is used to produce ranked lists of both predictable and unpredictable items. Unpredictable items may include unusual products that did not exist before or have no relation to existing ones. In this scenario the system may not be able to recommend the product to any user, and the product will remain unrecommended for a long time. For example, suppose Amazon launches a new product for sale, such as an AI-based robot. Since this belongs to an entirely new category, no browsing history for the category is found, which leads to no recommendations for the user. In the same manner, movie recommendations fall into this category of unpredictable items. Content-based filtering techniques address this problem. [39]

Conclusion
This paper has introduced recommender systems to new researchers. It has also identified key problems in recommender systems that need research. This paper can help PhD and Masters students choose their area of research. The research gap has been presented in this paper in the form of different problems of recommender systems.
The recommendation system finds its utility in major areas of web applications. As these problems get solved, recommendation systems will become more and more useful. With more reliable recommendations, web applications will be more intelligent and usable.