Implementation of Knowledge based Collaborative Filtering and Machine Learning for E-Commerce Recommendation System

This is the era of the I-way. The development of high-speed computing and large storage devices has changed the way people work and has shifted traditional business processes towards online business. This shift creates problems such as information overload and irrelevant information, which confuse both customers and enterprises. Recommendation systems address these problems, and the design and development of efficient recommendation systems is a key area of recent research. Collaborative filtering (CF) and content-based filtering algorithms are widely used to implement such systems: collaborative filtering uses users' features, while content-based filtering uses items' features. Most CF methods are rating-based or review-based and process homogeneous information. In this paper we propose a knowledge-based collaborative filtering algorithm for large datasets that uses the various activities users perform while interacting with items on an E-commerce web site, such as click, select and purchase. The performance of the system is compared with the base models on a real-world Amazon E-commerce dataset using precision, recall and NDCG as evaluation parameters, for various combinations of the activities performed by users on items.


INTRODUCTION
The Internet has transformed traditional ways of doing business; almost every company wants its own web site to support and conduct its business. Because the Internet provides a very large marketplace, every customer is faced with many choices. Suppose a customer wants to read a book but has no specific area in mind: there are many books of the same kind, so the customer spends a lot of time searching for a relevant one. A site or app that suggests relevant books based on what the customer has read previously saves the customer much of that time. This feature of a web site is known as a recommendation system.
Traditionally, a person bought a product only when it was suggested by friends or relatives. This was the usual way of purchasing when there was any doubt about the product, but in the era of the I-way that circle has expanded to include online sites that use some sort of recommendation engine [1]. A recommendation engine (Figure 1) uses various algorithms to filter products and recommend the most relevant ones to a customer on the basis of his/her past behavior; that is, it recommends products that the user is likely to buy.
Some popular websites that use recommendation systems are shown in Table 1.

Figure 1 Architecture of Recommendation Engine
Almost every collaborative filtering method uses unstructured data such as ratings, reviews or images to profile users for personalized recommendation. In this paper we extend the power of collaborative filtering (CF) using large-scale structured heterogeneous user behavior data. The main building block of the proposed CF combines traditional CF with a knowledge base. The behavior of the users can be represented by a directed graph called a knowledge graph.
Table 1 Some popular sites that used recommendation system

Knowledge graph
The relation between customer and product is represented by a directed graph called a knowledge graph [2,3]. It is a directed graph of triplets (subject, predicate, object), called SPO triplets. The subject and object are entities, and the predicate shows the relationship between them [4]. Entities are denoted by nodes and relationships by edges.
Consider the statement "A customer C1 buys a product P1 of category cat1 and brand b1 that falls in price range r1, and customer C1 selects product P2 of brand b2 and category cat1 in price range r2". The SPO triplets of this statement are listed in Table 2. The other activities of the customer can be written as "Customer C1 also_view product P3 of brand b3 and category cat1, which fall_in price range r1. Customer C1 also_buy product P4 of brand b2 and category cat1, which fall_in price range r2."
1.1.1. Construction of Knowledge Base. During the construction of a knowledge base [5], it is necessary to consider parameters such as completeness, accuracy and quality of data, which determine the usefulness of the knowledge base. There are four major groups of knowledge base construction methods: the curated method, the collaborative method, the automated semi-structured method and the automated unstructured method.
1.1.1.1. Curated method. In this method triplets are created manually by a closed group of experts. The accuracy of a curated knowledge base is very high, but the method is not scalable because it depends on human experts.
1.1.1.2. Collaborative method. In this method triplets are created manually by an open group of volunteers. This method is widely used in Wikipedia and Freebase and scales better, but it also has limitations, because of which the growth of Wikipedia has been slowing down.
1.1.1.3. Automated semi-structured method. In this method triplets are extracted automatically from semi-structured text by means of rules. This method is applied to Wikipedia infoboxes and has produced large and highly accurate knowledge graphs such as YAGO [6] and DBpedia [7], but semi-structured text covers only a small fraction of the information stored on the web.
1.1.1.4. Automated unstructured method. In this method triplets are created automatically from unstructured text using machine learning and natural language processing. This method tries to read the web and extract facts from the natural language text of web pages, as in NELL and Knowledge Vault.
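Whichever construction method is used, the result is a set of SPO triplets. As a minimal sketch of that representation, the Python snippet below stores the triplets of the customer C1 example as tuples and indexes the outgoing edges by subject. The `Triple` type, the `build_graph` helper and the predicate names other than buy, select and fall_in are hypothetical and introduced only for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One (subject, predicate, object) edge of the knowledge graph."""
    subject: str
    predicate: str
    obj: str


def build_graph(triples):
    """Index directed edges by subject: subject -> [(predicate, object), ...]."""
    graph = defaultdict(list)
    for t in triples:
        graph[t.subject].append((t.predicate, t.obj))
    return graph


# Triplets for the customer C1 example (illustrative data only).
# "has_category" and "has_brand" are assumed predicate names.
triples = [
    Triple("C1", "buy", "P1"),
    Triple("P1", "has_category", "cat1"),
    Triple("P1", "has_brand", "b1"),
    Triple("P1", "fall_in", "r1"),
    Triple("C1", "select", "P2"),
    Triple("P2", "has_brand", "b2"),
]

graph = build_graph(triples)
print(graph["C1"])  # edges leaving the customer node
print(graph["P1"])  # edges leaving the product node
```

Indexing by subject keeps the graph directed: an edge is found from its head entity, which matches the head/tail view of triplets used for the knowledge graph.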
A knowledge graph is similar to a knowledge base, which is classified as schema-based or schema-free. Some popular schema-based knowledge bases are listed in Table 3.
The schema-based approach uses predefined, globally unique identifiers for entities and relations in a fixed vocabulary, while in the schema-free approach entities and relations are identified using open information extraction techniques.
Let $e_h$ and $e_t$ be the head and tail entities in the knowledge graph and $r_k$ the edge between them. Then $e_t$ can be related to $e_h$ by the translation $e_t = \mathrm{trans}(e_h, r_k) = e_h + r_k$. Applying this relation to all nodes makes the relations among the nodes easy to compute.
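The translation relation can be illustrated with a small sketch, assuming that entities and relations are embedded as real-valued vectors of equal length. The embedding values below are made up for illustration, and ranking candidate tails by Euclidean distance is an assumption, not necessarily the scoring used in this paper.

```python
import math


def translate(head, relation):
    """Predicted tail embedding: e_t = e_h + r_k (element-wise sum)."""
    return [h + r for h, r in zip(head, relation)]


def distance(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy 2-dimensional embeddings (hypothetical values, not learned from data).
customer = [1.0, 2.0]
buy = [0.5, -1.0]
products = {"P1": [1.5, 1.0], "P2": [4.0, 4.0]}

# Translate the customer by the "buy" relation, then rank products by how
# close their embeddings are to the predicted tail.
predicted_tail = translate(customer, buy)
ranked = sorted(products, key=lambda p: distance(predicted_tail, products[p]))
print(ranked)
```

Products whose embeddings lie closest to the translated head are the most plausible tails of the (customer, buy, product) triplet.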

RELATED WORK
Recommendation systems try to identify a user's interests in a specific domain of content based on his/her previous experience. When a user interacts with an E-commerce site, he/she provides implicit or explicit information about his/her tastes, such as clicks, ratings and comments. Recommendation systems fall into two main categories: personalized [10,11] and non-personalized. Personalized systems use the history of customers' navigation and behavior, as in content-based filtering [12], collaborative filtering [13] and PageRank [14] in social network analysis, while non-personalized systems do not require any historical data and use characteristics of the products, as in popularity-based recommendation.

Content based filtering system
This system recommends products on the basis of the user's past preferences. It stores the information related to each user in a vector known as the profile vector, and the information related to each product in another vector, the product vector. The algorithm computes the cosine of the angle between the profile vector and the product vector. It uses traditional classification and clustering techniques such as Support Vector Machines [13] or Nearest Neighbors algorithms [12]. There are two types of users, implicit and explicit: implicit users are those whose information is updated automatically by the system, while explicit users give their feedback to the system within a given range. According to Aggarwal [1], the approach has some drawbacks: the accuracy of the system depends heavily on the application-specific features used to describe items, and it suffers from overspecialization and sensitivity to training size. The three major limitations of content-based filtering are overspecialization, cold start and limited content.
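The cosine comparison between a profile vector and product vectors can be sketched as follows. The three-dimensional feature vectors are hypothetical; a real system would derive them from item attributes.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between vectors u and v (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical feature space: one weight per product feature.
profile = [1.0, 0.0, 1.0]
products = {
    "P1": [1.0, 0.0, 1.0],  # same direction as the profile
    "P2": [0.0, 1.0, 0.0],  # orthogonal to the profile
}

scores = {pid: cosine_similarity(profile, vec) for pid, vec in products.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

Because cosine similarity depends only on direction, a product is ranked by how its feature mix matches the profile, not by the magnitude of its feature values.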

Collaborative filtering algorithm
Collaborative filtering uses user behavior to recommend items. It is the most commonly used family of algorithms in industry, since it does not depend on any additional information. There are two types of collaborative filtering techniques: memory-based and model-based.
Memory-based collaborative filtering [15] uses item-based and user-based approaches, in which recommendations are generated from the preferences of the nearest neighbors [16]. Model-based collaborative filtering uses factorization approaches such as SVD and tensor factorization [17] and is widely used to predict the products a customer is most likely to purchase. Graph-based or social-network-based recommendation systems [18] use information available from social networks, such as user preferences and the influence of friends, to overcome the cold-start and data-sparsity problems of recommendation systems.

DESIGN ARCHITECTURE OF PROPOSED MODEL
The architecture of the proposed model (see Fig. 4) contains four basic components: the user, who supplies data to the system and receives a list of recommended items; event data pre-processing, which transforms the data according to the system requirements; the ranking algorithm, which generates the rank score of each item based on the user's preferences; and the matching algorithm, which measures the similarity among items and among users using item-item and user-user similarity methods, respectively.

Data pre-processing
Input datasets contain many attributes in different domains. Every input dataset consists of individual data objects, and all datasets share common properties such as the type of the data objects, size, dimensionality, sparsity and abstraction. The following methods are used to process the data before it is used by the proposed recommendation system.

Feature selection.
The input from users and items has many attributes. Feature selection is a dimension-reduction method that is primarily used to remove redundant and irrelevant attributes from the datasets. The system considers the c_id, ip_address and session attributes from the customer dataset, P_id and P_cat from the product dataset, and P_view, P_select and P_buy from the relation dataset; all other attributes are removed.

Data binarization.
Binarization assigns binary values to attributes. Here the values of the relations are used to compute the preference score of a product, so the attributes P_view, P_select and P_buy are binarized to 0 for no and 1 for yes. The preference matrix is then built from these binary values.
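A minimal sketch of this binarization step follows, assuming that the raw relation data hold event counts per (customer, product) pair. That input layout and the `binarize` helper are assumptions made for illustration; only the attribute names come from the text.

```python
def binarize(count):
    """Map an event count to 1 (the event happened) or 0 (it did not)."""
    return 1 if count > 0 else 0


# Hypothetical raw event counts keyed by (customer id, product id).
raw_events = {
    ("C1", "P1"): {"P_view": 3, "P_select": 1, "P_buy": 0},
    ("C1", "P2"): {"P_view": 0, "P_select": 0, "P_buy": 0},
    ("C2", "P1"): {"P_view": 5, "P_select": 2, "P_buy": 1},
}

# Replace every count by its binary indicator.
binary_events = {
    pair: {attr: binarize(n) for attr, n in counts.items()}
    for pair, counts in raw_events.items()
}
print(binary_events[("C1", "P1")])
```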

Computation of preference or ranking score using knowledge recommendation system (KRS)
The preference or ranking score of a product is computed from the number of customers who participated in events on it in a specific period of time.

Event Database.
The event database is the collection of customer events performed on different categories (P_cat) of products (see Table 4). Let $C^b_{ij}$ indicate that customer i buys product j, $C^s_{ij}$ that customer i only selects product j, and $C^v_{ij}$ that customer i views product j. The preference order between the relations is product_buy > product_select > product_view. The customer's preference for a product can be represented in matrix form as $C_{ij}$, meaning that customer i prefers product j in terms of buy, select or view. If there are m customers and n products, then i = 1, 2, …, m and j = 1, 2, …, n, and the product preference matrix can be written as the m x n matrix
$$C = [C_{ij}]_{m \times n} = \begin{pmatrix} C_{11} & \cdots & C_{1n} \\ \vdots & \ddots & \vdots \\ C_{m1} & \cdots & C_{mn} \end{pmatrix} \qquad (2)$$
The web server log (see Table 6) contains specific data such as c_id, url, the location of the customer, navigational details and timestamp. The web server logs are represented as the tuple {u_id, ip_address, url, timestamp, location}, which records the user accessing the site, the pages requested, the corresponding browsing time and the location of the user. Of these, the c_id of the customer is the most useful for recommendation. The u_id together with the urls visited by the customer can be arranged in an association matrix
$$V = [v_{ij}] \qquad (3)$$
where $v_{ij}$ is the browsing information at a particular time, indicating that user i visited the first j pages (products).
The URL gives information about the product, such as product_id and product_category (see Table 5). The user's information together with the product's information is used for the recommendation.

The dataset consists of product details, customer details, page details, actions, and the categories and brands of products. The relation between user and item is denoted by the triple ($C_i$, $R_k$, $P_j$), which shows that customer i is related to product j through relation k. The first aim is to find the ordered pairs ($C_i$, $P_j$) from the bought, selected and viewed lists of the customers; then we find the ordered pairs ($C_k$, $P_j$), that is, the list of customers related to product $P_j$.
The customer's preference rank is calculated by a formula in which X, Y and Z are the weight-adjusting coefficients for the three shopping relations. Because the relations are not equally strong indicators of preference, the weights are ordered Z > Y > X, in line with the preference order of the relations. The values X = 0.25, Y = 0.5 and Z = 1 gave better results. The formula yields the preference score of each product; a higher value means a higher preference.
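One plausible reading of the preference rank is a weighted sum of the three binary relation indicators, with X weighting view, Y weighting select and Z weighting buy. Both the weighted-sum form and this assignment of weights to relations are assumptions inferred from the ordering Z > Y > X and the preference order buy > select > view. Under those assumptions, a sketch of the computation is:

```python
# Weights reported in the text; mapping them to view/select/buy is an assumption.
X, Y, Z = 0.25, 0.5, 1.0


def preference_score(viewed, selected, bought):
    """Assumed weighted sum of binary indicators: X*view + Y*select + Z*buy."""
    return X * viewed + Y * selected + Z * bought


# Hypothetical binarized events: (customer, product) -> (view, select, buy).
events = {
    ("C1", "P1"): (1, 1, 1),
    ("C1", "P2"): (1, 0, 0),
    ("C2", "P1"): (1, 1, 0),
}
customers = ["C1", "C2"]
products = ["P1", "P2"]

# m x n preference matrix: one row per customer, one column per product.
# Pairs without any recorded event score 0.
matrix = [
    [preference_score(*events.get((c, p), (0, 0, 0))) for p in products]
    for c in customers
]
print(matrix)
```

With these weights a purchase alone outweighs a view and a select together, which reflects the stated preference order.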

Matching algorithm for recommendation (Association Rule Mining)
Rule mining is used to group data of similar categories so that customer profiles are associated with each product, which is valuable for recommendation to both the customer and the enterprise. Since products fall into three categories of interaction, association rules can be generated from three different transaction sets: the buy transaction set, the selected_but_not_buy transaction set and the also_view transaction set. For each transaction set from the web logs, the association rules are generated in three steps. The first sets the minimum support and minimum confidence, the second replaces each product in the transaction set with its corresponding product category, and the third generates the association rules for each transaction set using the Apriori algorithm. The result can be given by a matrix P = $P_{kl}$ called the product matrix, where k = 1, 2, …, n runs over the products and l = 1, 2, 3 or 4; its entries represent the association degree among the product categories in the different transaction sets.
As before, the relation among customer, relation type and product is denoted by the triple ($C_i$, $R_k$, $P_j$). There are thus two matrices, the customer preference matrix and the product association matrix. Before recommending, the matching score of each product associated with each customer is computed; this score indicates how close the product is to the customer.
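The three steps can be sketched on a toy buy transaction set. The code below is a simplified Apriori pass limited to itemsets of size one and two, not a full implementation, and the transactions (already mapped to product categories, as in the second step) are invented for illustration. The function name `mine_pair_rules` is hypothetical.

```python
from itertools import combinations


def mine_pair_rules(transactions, min_support, min_confidence):
    """Return rules (lhs, rhs, support, confidence) between single categories."""
    n = len(transactions)

    # Level 1: count single categories and keep the frequent ones.
    item_count = {}
    for t in transactions:
        for item in set(t):
            item_count[item] = item_count.get(item, 0) + 1
    frequent = {i for i, c in item_count.items() if c / n >= min_support}

    # Level 2: candidate pairs are built only from frequent singles
    # (the Apriori pruning step).
    pair_count = {}
    for t in transactions:
        for pair in combinations(sorted(set(t) & frequent), 2):
            pair_count[pair] = pair_count.get(pair, 0) + 1

    rules = []
    for (a, b), c in pair_count.items():
        support = c / n
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = c / item_count[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Hypothetical buy transactions after products are replaced by categories.
buy_transactions = [
    ["books", "movies"],
    ["books", "movies", "mobile_phones"],
    ["books"],
    ["mobile_phones"],
]

rules = mine_pair_rules(buy_transactions, min_support=0.5, min_confidence=0.7)
print(rules)
```

The same function could be run separately on the selected_but_not_buy and also_view transaction sets, giving one rule set per relation.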

Customer-Customer collaborative filtering.
This algorithm finds the similarity score between customers. Based on this score, it picks the most similar customers and recommends products that these similar customers have liked, selected or bought previously.
User-based collaborative filtering works on a set of customers who have the same relation to the product as customer u, aggregating these relations using formula (6):
$$\hat{r}_{u,j} = \frac{\sum_{k \in N_u} \mathrm{Sim}(u,k)\, r_{k,j}}{\sum_{k \in N_u} |\mathrm{Sim}(u,k)|} \qquad (6)$$
where $N_u$ is the set of K customers whose interests are similar to those of the target customer u, Sim(u,k) is the similarity between customer u and customer k, and $r_{k,j}$ is the rating given by customer k to product j.
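A minimal sketch of user-based prediction follows, assuming that formula (6) takes the usual similarity-weighted-average form over the K most similar customers. The similarity values are supplied directly as hypothetical inputs; in practice they would be computed from the preference matrix.

```python
def predict_user_based(target, product, ratings, similarity, k=2):
    """Similarity-weighted average of the neighbours' scores for `product`."""
    # Candidate neighbours: other customers who have a score for the product.
    neighbours = [
        (similarity[target][other], scores[product])
        for other, scores in ratings.items()
        if other != target and product in scores
    ]
    # Keep the k most similar customers (the neighbourhood N_u).
    neighbours.sort(key=lambda pair: pair[0], reverse=True)
    top = neighbours[:k]
    denom = sum(abs(sim) for sim, _ in top)
    if denom == 0:
        return 0.0
    return sum(sim * score for sim, score in top) / denom


# Hypothetical preference scores r_{k,j} and precomputed similarities Sim(u,k).
ratings = {
    "u": {"P1": 1.0},
    "a": {"P1": 1.0, "P2": 1.0},
    "b": {"P2": 0.5},
    "c": {"P2": 0.25},
}
similarity = {"u": {"a": 0.5, "b": 0.5, "c": 0.1}}

prediction = predict_user_based("u", "P2", ratings, similarity, k=2)
print(prediction)
```

Normalizing by the sum of absolute similarities keeps the prediction on the same scale as the input scores regardless of how many neighbours are used.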

Product-Product collaborative filtering.
Product-based collaborative filtering considers the similarity among products or services. It assumes that similar products are related in a similar way to the same customer. Hence the products recommended to customer u are scored or ranked by aggregating the similarity between each candidate product and the products that customer u was related to in the past. The score can be computed by formula (7):
$$\hat{r}_{u,j} = \frac{\sum_{k \in N_i} \mathrm{Sim}(j,k)\, r_{u,k}}{\sum_{k \in N_i} |\mathrm{Sim}(j,k)|} \qquad (7)$$
where $N_i$ denotes the set of products or items neighboring j and Sim(j,k) is the similarity value between products j and k.
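Product-based scoring can be sketched in the same way, assuming that formula (7) is a similarity-weighted average over the products the customer has already interacted with and that Sim(j,k) is the cosine similarity between product columns of the preference matrix. Both choices are assumptions, and the small matrix is invented for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def predict_item_based(customer_row, target, columns):
    """Score product `target` from the products the customer already has."""
    num = den = 0.0
    for k, score in enumerate(customer_row):
        if k == target or score == 0:
            continue  # skip the target itself and products never interacted with
        sim = cosine(columns[target], columns[k])
        num += sim * score
        den += abs(sim)
    return num / den if den else 0.0


# Rows are customers, columns are products P0, P1, P2 (hypothetical scores).
matrix = [
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],  # target customer: interacted with P0 only
]
columns = list(zip(*matrix))  # one vector per product

score_p1 = predict_item_based(matrix[3], 1, columns)
score_p2 = predict_item_based(matrix[3], 2, columns)
print(score_p1, score_p2)
```

Here P1 is preferred over P2 for the target customer because P1 co-occurs with P0 in other customers' histories while P2 does not.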

EXPERIMENTS AND RESULT ANALYSIS
Experiments are performed on Amazon E-commerce datasets [19], which comprise five sub-datasets: automotive, mobile phones, home appliances, movies and books. The behavior data, i.e. relations (events) such as view, select and buy, represent interactions collected over a period of 4.5 months. The original data contain 2,756,101 events, including 2,664,312 view-only, 69,332 select-only and 22,457 buy events, produced by 1,407,580 unique customers on 8,885 unique products.
The statistics of these datasets are summarized in the table. We use Top-N recommendation measures, including Precision, Recall, Hit Ratio and NDCG [20], to evaluate the model and the baselines. The first three measures evaluate the quality of the recommendation system, and the last measures accuracy and the ranking positions of the correct products in the output list. There are many relations among the entities of the knowledge graph, but this paper considers only three:
Buy relation: This relation shows that a customer c bought the product p.
Select relation: This relation shows that a customer c adds the product p to the cart (add_to_cart).
View relation: This relation shows that a customer c visits the page of product p.

Computation of baselines
We use the following methods for baseline performance comparison.
4.1.1. Bayesian Personalized Ranking (BPR) [21]. It is a popular method for Top-N recommendation that uses matrix factorization as the prediction component. It is based on triplets (u, i, j), where user u interacted with item i but not with item j, and it models the relationship between items i and j with respect to user u as a pairwise preference of i over j.
4.1.2. HFT [22]. It is a method that uses textual reviews; we use HFT under the BPR pairwise ranking framework for fair comparison.
4.1.3. Visual Bayesian Personalized Ranking (VBPR) [23]. This method is used for recommendation with images.
4.1.4. Deep Convolutional Neural Network (DCNN) [24]. It is a review-based deep recommendation method that jointly models the users and the products.
4.1.5. Joint Representation Learning (JRL) [19]. It is a model that can leverage multi-modal information for Top-N recommendation.
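Since several of the baselines above share the BPR pairwise ranking framework, a short sketch of its objective may help. In the standard BPR formulation the pairwise score is the difference of two matrix-factorization scores, $\hat{x}_{uij} = \hat{x}_{ui} - \hat{x}_{uj}$, and the loss for one triplet is $-\ln \sigma(\hat{x}_{uij})$. The latent vectors below are hypothetical, and regularization and the SGD update are omitted.

```python
import math


def score(user_vec, item_vec):
    """Matrix-factorization score: dot product of user and item factors."""
    return sum(p * q for p, q in zip(user_vec, item_vec))


def bpr_loss(user_vec, pos_item_vec, neg_item_vec):
    """Standard BPR loss for one (u, i, j) triplet: -ln(sigmoid(x_ui - x_uj))."""
    x_uij = score(user_vec, pos_item_vec) - score(user_vec, neg_item_vec)
    return -math.log(1.0 / (1.0 + math.exp(-x_uij)))


# Hypothetical 2-dimensional latent factors.
user = [1.0, 0.0]
item_i = [2.0, 0.0]  # item the user interacted with
item_j = [0.0, 1.0]  # item the user did not interact with

loss_good = bpr_loss(user, item_i, item_j)  # i correctly ranked above j
loss_bad = bpr_loss(user, item_j, item_i)   # ranking reversed
print(loss_good, loss_bad)
```

The loss is small when the interacted item scores higher than the non-interacted one and grows when the order is reversed, which is what drives the factors towards a correct ranking.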

Ranking accuracy
Ranking accuracy deals with the level of utility of the recommended products or services with respect to the ranking given by the user. Discounted Cumulative Gain (DCG) is a very popular metric for evaluating ranking accuracy. The Normalized DCG (NDCG) [27] is defined in terms of the following quantities: m denotes the total number of users in the test dataset De, $I_u$ is the set of products/services liked by user u, $v_j$ is the position of j in the recommended list, $g_{uj}$ is the utility gain given by user u to product j, and IDCG is the ideal value, computed from the real values using the same formula as DCG. Another way to evaluate the accuracy of the recommended list is to consider the tradeoff between the length of the list RL and the number of products/services that are actually relevant to the user. The list RL contains true positives (tp) and false positives (fp), while false negatives (fn) are relevant products missing from it. Precision and Recall [10] are then computed as
$$\text{Precision} = \frac{tp}{tp + fp}, \qquad \text{Recall} = \frac{tp}{tp + fn}$$
The two can be summarized in a single metric, the F-measure [10]:
$$F = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$
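The evaluation measures can be sketched for a single user as follows. Precision, recall and F-measure follow their standard definitions; for NDCG the sketch assumes the common variant in which each gain $g_{uj}$ is discounted by $\log_2(v_j + 1)$, which may differ in detail from the exact definition used in the experiments. The recommendation list and gains are invented.

```python
import math


def precision_recall_f(recommended, relevant):
    """Precision, recall and F-measure of one recommendation list."""
    rec, rel = set(recommended), set(relevant)
    tp = len(rec & rel)   # recommended and relevant
    fp = len(rec) - tp    # recommended but not relevant
    fn = len(rel) - tp    # relevant but not recommended
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


def ndcg(recommended, gains):
    """NDCG with the assumed discount log2(position + 1), positions from 1."""
    dcg = sum(
        gains.get(p, 0.0) / math.log2(pos + 1)
        for pos, p in enumerate(recommended, start=1)
    )
    ideal = sorted(gains.values(), reverse=True)[: len(recommended)]
    idcg = sum(g / math.log2(pos + 1) for pos, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg else 0.0


# Hypothetical top-4 list and the products the user actually liked.
recommended = ["P1", "P2", "P3", "P4"]
gains = {"P1": 1.0, "P3": 1.0, "P5": 1.0}  # utility gain per liked product

precision, recall, f_measure = precision_recall_f(recommended, gains)
ndcg_value = ndcg(recommended, gains)
print(precision, recall, f_measure, ndcg_value)
```

Averaging these per-user values over the m users of the test set gives the dataset-level figures.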

Settings of parameters
All the parameters used in this filter are initialized in the range (0,1) and updated by Stochastic Gradient Descent (SGD). The learning rate is selected from {1.0, 0.1, 0.01, 0.001, 0.0001} and the dimension from {10, 50, 100, 200, 300, 400, 500, 600}; the final learning rate is 0.01 and the final dimension is 200. For computing the baselines, 70% of each user's products are used for training and the rest for testing. The system generates the top 10 recommendations for each user in the test dataset.

Performance Comparison
The performance of the proposed filter is shown in Table 7 and Table 8. Table 7 compares its performance with various base models, and Table 8 shows its performance for the possible combinations of relations. From the experimental results (see Fig. 5) it is clear that both review-based and rating-based models enhance the performance of the recommendation system, but a model based on heterogeneous information sources, such as JRL, performs better than these baselines. The knowledge-based collaborative filtering (KCF) in turn performs better than JRL consistently over the five datasets and all evaluation measures, which validates the proposed system. The final result is significantly better than that of all other models.
Table 7 Performance on top 10 recommendation between baselines and proposed system.
Table 8 Performance on top 10 recommendation when incorporating varieties of relational structures used in the knowledge graph.

CONCLUSION
This paper discussed the concept of the knowledge graph, its learning and its creation. We created a knowledge-based collaborative filtering method that processes heterogeneous, unstructured information by converting it into structured form using a knowledge graph, which is a directed graph. The triplet relation between user and product played a vital role in the development of the proposed CF. The experiments used real-world datasets to measure the performance of various filters used in recommendation systems, based on ratings, reviews and heterogeneous information. The results show that the proposed filter performs better than the filters discussed; we therefore conclude that the proposed CF is well suited to building recommendation systems.