Abstract:
This paper proposes a multimodal graph-based recommendation system using a hybrid filtering approach. The proposed approach
uses multiple sources of data and advanced graph-based deep learning algorithms to provide more accurate and personalized
recommendations to users. Our framework captures user and item attributes using text, images, videos, and metadata. We incorporate these
attributes into the graph of user-item interactions using collaborative filtering and content-based filtering. Graph convolutional networks
(GCNs) help us identify collaborative filtering attributes. The intrinsic characteristics of items can be better understood and utilized with
graph-based content-based filtering. The proposed model first groups related users and items using unsupervised
clustering, then refines its recommendations using a cross-attention approach. In addition, we use a Variational Graph Autoencoder
(VGAE) that encodes complex interactions in a latent space, enabling accurate link prediction. Experimental results
show that the proposed model provides more accurate and personalized recommendations than existing models. We conduct comprehensive
experiments on the publicly available MovieLens 1M, TikTok, MovieLens 10M, and MicroVideo 1.7M datasets. Our proposed
model outperforms state-of-the-art multimedia recommender systems across several evaluation metrics,
including precision, accuracy, recall, Normalized Discounted Cumulative Gain (NDCG), and F1-score.
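
For readers unfamiliar with the ranking metrics named above, the following is a minimal sketch of how precision, recall, F1-score, and NDCG are commonly computed for a single user's top-k recommendation list. It is illustrative only: the abstract does not state the cutoff k or the exact metric definitions used in the experiments, so the binary-relevance assumption, the function names, and the toy data below are all assumptions of this sketch.

```python
import math


def precision_recall_f1_at_k(ranked, relevant, k):
    """Precision@k, recall@k and F1@k for one user (binary relevance assumed)."""
    top_k = ranked[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    # F1 is the harmonic mean of precision and recall; defined as 0 with no hits.
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


def ndcg_at_k(ranked, relevant, k):
    """NDCG@k with binary gains and a log2 position discount."""
    # DCG: each relevant item at 0-based position r contributes 1 / log2(r + 2).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked[:k])
        if item in relevant
    )
    # Ideal DCG: all relevant items (up to k) placed at the top of the list.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical toy example: five recommended items, three held-out relevant items.
ranked_items = ["a", "b", "c", "d", "e"]
relevant_items = {"a", "c", "f"}

precision, recall, f1 = precision_recall_f1_at_k(ranked_items, relevant_items, k=5)
ndcg = ndcg_at_k(ranked_items, relevant_items, k=5)
print(precision, recall, f1, ndcg)
```

In practice these per-user values would be averaged over all test users; libraries may differ in details such as how ties and users with no relevant items are handled.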