A HYBRIDIZED RECOMMENDATION SYSTEM ON MOVIE DATA USING CONTENT-BASED AND COLLABORATIVE FILTERING
Chapter One:
Introduction
1.1 Background of the Study
The rapid growth of information on the internet has resulted in massive amounts of data and an increase in the number of online users. This massive data explosion has inundated people with vast amounts of information, posing a significant difficulty in terms of information overload.
As a result, it has become extremely difficult for people to process such information manually and to find what is relevant to them. Users frequently face significant uncertainty when trying to make informed and accurate decisions from the vast amount of information available.
Large online corporations such as Amazon, Google, and Facebook have struggled to keep up with the information explosion, and recommendation systems were adopted to address it. Figure 1.1 depicts how recommender engines have stepped in to help users avoid such uncertainty.
The massive increase in online data and users fueled the rise of big data. In the Big Data world, the Recommendation System has received the greatest attention. Big Data has greatly improved the ability to provide large-scale suggestions.
This has made recommendation systems more important to users, because they pick out the right piece of information from large volumes of data.
A recommendation system is a type of information filtering that uses a user’s prior behaviour, or the behaviour of similar users, to build a list of items specifically suited to that user’s preferences.
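The idea of using "the behaviour of similar users" can be illustrated with a minimal user-based collaborative filtering sketch. It is not the Mahout implementation used in this project: the ratings, the user names, and the choice of cosine similarity are all assumptions made for illustration.

```python
from math import sqrt

# Toy ratings (hypothetical data): user -> {movie: rating on a 1-5 scale}
ratings = {
    "alice": {"Heat": 5, "Alien": 4, "Up": 1},
    "bob":   {"Heat": 4, "Alien": 5, "Up": 2, "Blade Runner": 5},
    "carol": {"Heat": 1, "Alien": 2, "Up": 5, "Blade Runner": 2},
}

def cosine(u, v):
    """Cosine similarity computed over the movies both users have rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[m] * v[m] for m in common)
    norm_u = sqrt(sum(u[m] ** 2 for m in common))
    norm_v = sqrt(sum(v[m] ** 2 for m in common))
    return dot / (norm_u * norm_v)

def predict(user, movie):
    """Similarity-weighted average of other users' ratings for `movie`."""
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or movie not in their:
            continue
        sim = cosine(ratings[user], their)
        num += sim * their[movie]
        den += abs(sim)
    # None signals that no other user has rated the movie
    return num / den if den else None

print(predict("alice", "Blade Runner"))
```

Because alice's ratings resemble bob's more than carol's, the prediction is pulled towards bob's rating. A movie that nobody has rated yields `None`, which is the cold start situation discussed in Section 1.2.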
Currently, in E-commerce, Recommendation Systems (RSs) are widely utilised for information filtering procedures to give personalised information by anticipating user preferences for certain items[1].
RSs seek to recommend items (movies, music, books, news, web pages, etc.) that are most likely to pique users’ interests. Amazon, Netflix, and similar portals use RSs widely to recommend material to their consumers.
RSs strive to reduce information overload by displaying engaging and relevant material. RSs have become an essential component of every e-commerce portal.
Figure 1.1: The Relevance of a Recommendation Engine for Users
1.2 Statement of the Problem
Recently, a number of machine learning and hybrid filtering algorithms have been implemented to generate excellent suggestions and address the issues with pure Collaborative Filtering (CF). Sparsity, cold start, scalability, neighbour transitivity, and accuracy are the major issues with CF [1].
To address the issues of CF, alternative recommendation techniques such as content-based filtering [1], [5] and knowledge-based filtering [1], [4] have been merged with CF utilising hybrid algorithms.
In this paper, we present a novel hybrid system that blends content-based filtering with collaborative approaches. This study aims to improve recommendation accuracy by combining content features from MovieLens Data and IMDB, as well as collaborative filtering using the Mahout Framework on Hadoop.
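This chapter does not state how the content-based and collaborative signals are combined. One common option is a weighted linear blend, sketched below under stated assumptions: the weight `alpha`, the Jaccard measure over genre sets, the placeholder movie titles, and the collaborative scores (taken to be already normalised to the range 0 to 1) are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap between two feature sets (for example, genres)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def hybrid_score(cf_score, content_score, alpha=0.7):
    """Weighted blend; alpha is a hypothetical tuning weight in [0, 1]."""
    if cf_score is None:  # cold start: no collaborative evidence available
        return content_score
    return alpha * cf_score + (1 - alpha) * content_score

# Hypothetical features of a movie the user liked, plus two candidates
liked = {"Sci-Fi", "Thriller"}
candidates = {
    "Movie A": {"features": {"Sci-Fi", "Thriller", "Action"}, "cf": 0.40},
    "Movie B": {"features": {"Comedy"}, "cf": None},  # new item, no ratings
}

scores = {
    title: hybrid_score(info["cf"], jaccard(liked, info["features"]))
    for title, info in candidates.items()
}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

Falling back to the content score when no collaborative score exists is one simple way to keep new items rankable, which is why content features are often paired with CF to soften the cold start problem.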
1.3 Aim and Objectives
The project’s goal is to create a Hybridised Recommendation System on movie data using collaborative and content-based filtering techniques. The system is built on a Hadoop [9] platform using Apache Mahout [10] and the MovieLens dataset [11]. It is evaluated in terms of scalability and speedup, and it is intended to address the data sparsity and cold start issues associated with pure CF.
Objectives:
The following steps have been identified to reach this goal:
Investigate how to combine collaborative filtering with content-based approaches to create a hybrid recommendation system.
Determine the most successful hybrid system by merging content-based characteristics into a collaborative approach (using Apache Mahout).
This will be built on top of Hadoop to address scalability concerns.
Analyse the impact of modifying various Mahout Component settings on our hybrid model.
Compare the performance of the hybrid recommendation engine against existing models. Our new technique will investigate the impact of various content elements on recommendation accuracy.
Use the popular MovieLens datasets [11].
The movie content characteristics will be taken from the Internet Movie Database (IMDB). Our goal is to locate acceptable item features by matching user ratings from the MovieLens dataset to movie features from IMDB.
Demonstrate that the extracted movie content features improve the prediction accuracy of our hybrid recommendation system.
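The matching step in the objectives above, which pairs MovieLens movies with IMDB features, could be sketched as a join on title and release year. This is only an illustration: MovieLens titles carry the year in parentheses, but the shape of the IMDB records and the normalisation rule below are assumptions, and the project may match records differently.

```python
import re

# Matches 'Title (YYYY)' and captures the two parts
TITLE_YEAR = re.compile(r"^(?P<title>.+?)\s*\((?P<year>\d{4})\)\s*$")

def key(title_with_year):
    """Split 'Title (YYYY)' into a normalised (title, year) join key."""
    m = TITLE_YEAR.match(title_with_year)
    if not m:
        return None
    return m.group("title").strip().lower(), int(m.group("year"))

# Hypothetical MovieLens-style rows: movie id -> title with release year
movielens = {1: "Toy Story (1995)", 2: "Jumanji (1995)", 3: "Unknown Film"}

# Hypothetical IMDB-derived features keyed by (lowercased title, year)
imdb = {
    ("toy story", 1995): {"director": "John Lasseter", "genres": ["Animation"]},
}

matched, unmatched = {}, []
for movie_id, title in movielens.items():
    features = imdb.get(key(title))  # a None key simply finds nothing
    if features is None:
        unmatched.append(movie_id)
    else:
        matched[movie_id] = features

print(matched, unmatched)
```

Keeping the unmatched ids makes it easy to see how many rated movies end up without content features, which affects how much the content-based side can contribute.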
1.4 Significance of the Study
When compared to other recently created recommendation algorithms, collaborative filtering (CF) has proven to be the most promising and widely used [2], [3].
Although CF has demonstrated success in a variety of application situations, it still has significant limitations, including difficulty handling data sparsity, cold start issues, and scalability [4].
Its appropriateness and relevance are diminished by data sparsity, a situation in which users rate only a small fraction of the available items.
Another disadvantage of the CF technique is that it cannot handle the exponential expansion of both users and items in the database. This study aims to improve the prediction accuracy of an existing collaborative filtering framework by introducing content-based variables.
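Sparsity is commonly quantified as the share of user-item pairs that have no rating. The sketch below computes it for a toy rating set; the data and the function name `sparsity` are hypothetical and serve only to illustrate the definition.

```python
# Toy ratings (hypothetical data): user -> {movie: rating}
toy_ratings = {
    "u1": {"m1": 4, "m2": 5},
    "u2": {"m1": 3},
    "u3": {"m3": 2},
    "u4": {},  # a user with no ratings at all
}

def sparsity(ratings):
    """Fraction of user-item cells that are empty in the rating matrix."""
    items = {movie for user_ratings in ratings.values() for movie in user_ratings}
    cells = len(ratings) * len(items)
    observed = sum(len(user_ratings) for user_ratings in ratings.values())
    return 1 - observed / cells if cells else 0.0

print(f"{sparsity(toy_ratings):.2%} of the user-item matrix is empty")
```

Here 4 users and 3 movies give 12 possible cells, of which 4 hold a rating, so two thirds of the matrix is empty.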