The identifying hidden data features problem solution

S Y Petrova; M A Boikova

doi:10.1088/1742-6596/1352/1/012039

Journal of Physics: Conference Series

Paper • The following article is Open access

S Y Petrova¹ and M A Boikova¹

Published under licence by IOP Publishing Ltd
Journal of Physics: Conference Series, Volume 1352, The International Scientific and Practical Conference on Mathematical Modeling, Programming and Applied Mathematics, 27–28 June 2019, Veliky Novgorod, Russian Federation

Citation: S Y Petrova and M A Boikova 2019 J. Phys.: Conf. Ser. 1352 012039

DOI: 10.1088/1742-6596/1352/1/012039

Author e-mails

svetayp@list.ru

Author affiliations

¹ Yaroslav-the-Wise Novgorod State University, ul. B. St. Petersburgskaya, 41 173003 Veliky Novgorod, Russia

Abstract

In this article, we consider recommender models based on matrix factorization, which demonstrate excellent performance in collaborative filtering. The standard matrix factorization approach in MLlib deals with explicit ratings. To work with implicit data, we used the trainImplicit method. To simulate the processing of real-time data streams, we used the Spark Streaming library, which is responsible for receiving data from the input source and converting the raw data into a discretized stream (DStream) consisting of Spark RDDs. The rank parameter determines the number of hidden features in the low-rank approximation matrices. As a rule, the greater the number of factors, the better, but for a large number of users or items the rank directly affects the memory usage of the computing system and the amount of data required for training. The choice of rank in our problem was therefore a compromise.
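The memory side of the rank trade-off described in the abstract can be illustrated with a little arithmetic. The sketch below is not taken from the paper: it assumes the model keeps one dense user-factor matrix and one dense item-factor matrix, each with `rank` columns, and the function names, the example user and item counts, and the 4-byte value size are hypothetical choices made only for illustration.

```python
def factor_matrix_values(num_users: int, num_items: int, rank: int) -> int:
    """Count the values stored in the two low-rank factor matrices.

    The user matrix holds num_users x rank values and the item matrix
    holds num_items x rank values, so the total grows linearly in rank.
    """
    return rank * (num_users + num_items)


def factor_matrix_bytes(num_users: int, num_items: int, rank: int,
                        bytes_per_value: int = 4) -> int:
    """Rough lower bound on model size in bytes.

    bytes_per_value=4 is an assumption (single-precision floats); runtime
    overhead such as object headers or partitioning is ignored.
    """
    return factor_matrix_values(num_users, num_items, rank) * bytes_per_value


if __name__ == "__main__":
    # Hypothetical dataset size, chosen only to show the scaling.
    users, items = 1_000_000, 100_000
    for rank in (10, 50, 200):
        mib = factor_matrix_bytes(users, items, rank) / 2**20
        print(f"rank={rank:>3}: about {mib:,.1f} MiB of factor values")
```

Because the estimate is linear in the rank, doubling the number of hidden features doubles the factor storage, which is why a larger rank is not free even when it improves the approximation.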

Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
