Knowledge-Aware Graph Self-Supervised Learning for Recommendation
Abstract
1. Introduction
- In this paper, we propose a general self-supervised learning paradigm from a novel perspective: jointly modeling the KG and the user–item interaction graph to improve the robustness of recommendations and alleviate the data-noise and sparsity problems.
- We propose the Knowledge-aware Graph Self-supervised Learning (KGSL) framework, which constructs contrastive views from the user–item interaction graph and the semantic-based item similarity graph for data augmentation, taking into account both structural information and semantic neighbor information.
- Extensive experiments on three real-world datasets demonstrate that the proposed KGSL method outperforms several competitive baselines. Additionally, ablation studies and parameter investigations illustrate the impact of individual components and parameters on model performance.
2. Related Work
2.1. GNN-Based Recommendation
2.2. Auxiliary Information-Based Recommendation
2.3. SSL-Based Recommendation
3. Preliminaries
3.1. Notation and Description of Concepts
3.2. Self-Supervised Learning
4. The Proposed Methodology
4.1. Semantic-Based Item Similarity Graph
4.1.1. Relationship-Aware Knowledge Aggregation
4.1.2. Generating the Semantic-Based Item Similarity Graph
4.2. Representation Learning Based on GNN
4.3. Construction of the Self-Supervised Learning Task
4.4. Joint Optimization
Algorithm 1: The Algorithm of KGSL
4.5. Complexity Analysis of KGSL
- Adjacency Matrix Normalization: Before performing graph convolution operations, the adjacency matrix of each graph must be normalized. In KGSL, each training iteration generates augmented views for both the user–item interaction graph and the semantic-based item similarity graph. Since the numbers of non-zero elements in the original graphs and in the augmented views are and , respectively, the overall computational complexity of this step is .
- GNN: For the l-th convolutional layer, the complexity of performing the matrix multiplication is . Therefore, the complexity of graph convolution over a total of L layers is . Adding the cost of performing graph convolutions on the two augmented views, the overall complexity becomes .
- SSL Task: For the time complexity of the self-supervised tasks, only inner-product operations are considered. As shown in Equation (20), when calculating the loss for an item node in the user–item interaction graph , all other item nodes are treated as negative samples. Since KGSL sets up two self-supervised tasks, the overall time complexity is denoted as .
- Recommendation Task: Similarly, only inner-product calculations are considered. Since the BPR method computes the loss function by pairing each positive sample with a sampled negative sample, the overall computational complexity for the entire training process is denoted as .
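The operations whose costs are counted above can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: dense matrices are used for brevity, whereas the stated O(nnz)-style costs assume sparse storage; function names and the dropout scheme are illustrative.

```python
import numpy as np

def normalize_adj(adj):
    # Symmetric normalization D^{-1/2} A D^{-1/2}; with sparse storage this
    # costs time proportional to the number of non-zero elements of A.
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg, dtype=float)
    d_inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    return d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]

def edge_dropout(adj, rho, rng):
    # Generate an augmented view by keeping each (undirected) edge with
    # probability 1 - rho, so the view has about (1 - rho) * nnz(A) non-zeros.
    mask = np.triu(rng.random(adj.shape) >= rho)
    mask = mask | mask.T  # drop both directions of an edge together
    return adj * mask

def bpr_loss(pos_scores, neg_scores):
    # BPR pairs every positive interaction with one sampled negative, hence
    # one inner product per observed interaction per epoch.
    # logaddexp(0, x) = log(1 + e^x) is a stable -log sigmoid(pos - neg).
    return np.mean(np.logaddexp(0.0, neg_scores - pos_scores))
```

With `rho = 0` the augmented view equals the original graph, and as `rho` grows the view sparsifies, which is the knob the complexity analysis parameterizes by the edge-dropout rate.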
5. Experiment
5.1. Experimental Setup
5.1.1. Dataset Description
- MovieLens-1M, a movie recommendation dataset collected from the MovieLens movie recommendation platform. It contains over 1 million explicit ratings from more than 6000 users for over 4000 movies, with ratings ranging from 1 to 5. In addition to rating information, the dataset includes auxiliary information: Microsoft Satori organized the movies and their associated attribute entities into a knowledge graph, which is used for research and development of personalized recommendation systems.
- Last-FM, a music recommendation dataset collected from the online music platform Last.fm. It includes the listening history records of over 1000 users on the Last.fm website over the course of a year. This dataset covers more than 4000 artists and over 10,000 songs. Additionally, the dataset includes information about artists, songs, labels, and genres. Microsoft Satori also organized this information into a corresponding knowledge graph.
- Book-Crossing, a book recommendation dataset provided by the reader-focused social networking site Book-Crossing. It includes rating and content information for over 27,000 books available on the website, with user ratings ranging from 1 to 10. Similar to the previous two datasets, the book content information from the original dataset is also organized into a corresponding knowledge graph created by Microsoft Satori.
5.1.2. Evaluation Metrics
5.1.3. Baselines
- Neural Networks for Recommendation
- NeuMF [8] is an NN-based CF recommendation algorithm. It employs neural networks instead of matrix factorization to model higher-order interactions and learns more complex nonlinear interaction features. In the comparative experiments, the model’s entity embedding dimension is set to 50, and the number of layers in the graph encoder is set to 2.
- Graph Neural Network for Recommendation
- NGCF [14] is a GNN-based recommendation algorithm. It organizes user–item interaction data into a user–item bipartite graph and utilizes the information propagation and aggregation mechanism of GNNs to explicitly encode the high-order connectivity between users and items as collaborative information. Finally, it uses the user and item embeddings containing this high-order collaborative information to make rating predictions. In the comparative experiments, the model’s entity embedding dimension is set to 50, and the number of layers in the graph encoder is set to 2.
- LightGCN [13] is a GNN-based recommendation algorithm that builds upon NGCF. It introduces a lightweight graph convolution operation to learn from the user–item bipartite graph: instead of the non-linear activation functions and feature transformation operations used in standard graph neural networks, LightGCN uses simple weighted aggregators. This further improves the training efficiency of the recommendation algorithm and the encoding capability of the user and item embedding vectors. In the comparative experiments, the model’s entity embedding dimension is set to 50, and the number of layers in the graph encoder is set to 2.
- Self-Supervised Learning for Recommender Systems
- SGL [50] is an SSL recommendation algorithm based on graph neural networks. Its SSL task performs data augmentation via graph-structure perturbation on the user–item interaction graph and then maximizes the mutual information between embeddings of the same node under different views. In the comparative experiments, the model’s entity embedding dimension is set to 50, and the number of layers in the graph encoder is set to 2.
- MCCLK [56] is an SSL recommendation algorithm based on graph neural networks. It takes the user–item interaction graph and the item–entity knowledge graph as separate local views, then concatenates them to generate a user–item–entity graph as the global view. Finally, it designs self-supervised learning tasks based on a multi-level cross-view contrastive learning mechanism to enhance the recommendation task. In the comparative experiments, the model’s entity embedding dimension is set to 50, and the number of layers in the graph encoder is set to 2.
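The mutual-information objective described for SGL above (and the cross-view contrast in MCCLK) is commonly instantiated as an InfoNCE loss. The following is a minimal sketch under that assumption; the function name, shapes, and temperature value are illustrative, not the original implementations. Matching rows of the two views form the positive pairs, and all other rows act as negatives.

```python
import numpy as np

def info_nce(z1, z2, tau=0.2):
    # z1, z2: (N, d) embeddings of the same N nodes under two augmented views.
    # Row i of z1 with row i of z2 is the positive pair; every other row of z2
    # serves as a negative sample.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = (z1 @ z2.T) / tau  # (N, N) temperature-scaled cosine similarities
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))  # mean -log softmax of the positives
```

Note that the full (N, N) similarity matrix is what makes such SSL terms dominate the training cost when all other nodes are used as negatives.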
5.2. Performance Comparison with Baselines
5.3. Ablation Study of the KGSL Framework
5.4. Hyperparameter Sensitivity Analysis
5.5. Study on the KGSL Effectiveness
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Wang, J.; Louca, R.; Hu, D.; Cellier, C.; Caverlee, J.; Hong, L. Time to Shop for Valentine’s Day: Shopping Occasions and Sequential Recommendation in E-commerce. In Proceedings of the 13th International Conference on Web Search and Data Mining, Houston, TX, USA, 3–7 February 2020; pp. 645–653. [Google Scholar]
- Liu, S.; Chen, Z.; Liu, H.; Hu, X. User-video co-attention network for personalized micro-video recommendation. In Proceedings of the World Wide Web Conference, San Francisco, CA, USA, 13–17 May 2019; pp. 3020–3026. [Google Scholar]
- Gharibshah, Z.; Zhu, X. User response prediction in online advertising. ACM Comput. Surv. (CSUR) 2021, 54, 1–43. [Google Scholar] [CrossRef]
- Hu, Y.; Koren, Y.; Volinsky, C. Collaborative filtering for implicit feedback datasets. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, Pisa, Italy, 15–19 December 2008; pp. 263–272. [Google Scholar]
- Shao, P.; Wu, L.; Chen, L.; Zhang, K.; Wang, M. FairCF: Fairness-aware collaborative filtering. Sci. China Inf. Sci. 2022, 65, 222102. [Google Scholar] [CrossRef]
- Wu, L.; He, X.; Wang, X.; Zhang, K.; Wang, M. A survey on accuracy-oriented neural recommendation: From collaborative filtering to information-rich recommendation. IEEE Trans. Knowl. Data Eng. 2022, 35, 4425–4445. [Google Scholar] [CrossRef]
- Su, X.; Khoshgoftaar, T.M. A survey of collaborative filtering techniques. Adv. Artif. Intell. 2009, 2009, 421425. [Google Scholar] [CrossRef]
- He, X.; Liao, L.; Zhang, H.; Nie, L.; Hu, X.; Chua, T.S. Neural collaborative filtering. In Proceedings of the 26th International Conference on World Wide Web, Perth, Australia, 3–7 May 2017; pp. 173–182. [Google Scholar]
- Koren, Y.; Bell, R.; Volinsky, C. Matrix factorization techniques for recommender systems. Computer 2009, 42, 30–37. [Google Scholar] [CrossRef]
- Rendle, S.; Freudenthaler, C.; Gantner, Z.; Schmidt-Thieme, L. BPR: Bayesian personalized ranking from implicit feedback. arXiv 2012, arXiv:1205.2618. [Google Scholar]
- Ji, G.; He, S.; Xu, L.; Liu, K.; Zhao, J. Knowledge graph embedding via dynamic mapping matrix. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Beijing, China, 26–31 July 2015; pp. 687–696. [Google Scholar]
- Zou, D.; Wei, W.; Mao, X.L.; Wang, Z.; Qiu, M.; Zhu, F.; Cao, X. Multi-level cross-view contrastive learning for knowledge-aware recommender system. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, 11 July 2022; pp. 1358–1368. [Google Scholar]
Parameters | Definitions | Parameters | Definitions
---|---|---|---
— | User set | — | Mapping matrices of head and tail entities in KGE
— | Item set | — | Embeddings of head and tail entities in the r space of the KG
— | User–item interaction matrix | — | Embedding of an item in the KG
— | Entity set | — | Similarity between items
— | Relation set | — | Rate of edge dropout
— | User–item graph | — | Node sets of the graphs and
— | Item knowledge graph | — | Edge sets of the graphs and
— | Semantic-based item similarity graph | — | Embedding dimension
— | Embeddings of users and items in | — | Number of encoder layers
— | Embedding of items in | — | Number of self-supervised signals
Dataset | #Users | #Items | #Interactions | #Entities | #Relations | #Triples
---|---|---|---|---|---|---
MovieLens-1M | 5986 | 2347 | 298,856 | 6729 | 8 | 20,195
Last-FM | 1872 | 3846 | 42,346 | 9366 | 60 | 15,518
Book-Crossing | 17,860 | 14,910 | 139,746 | 24,039 | 10 | 19,793
Dataset | Metric | NeuMF | NGCF | LightGCN | SGL | MCCLK | KGSL | Improve (%)
---|---|---|---|---|---|---|---|---
MovieLens-1M | Recall@10 | 21.468 | 21.607 | 24.316 | 24.609 | 24.738 | 25.875 | 4.60%
 | NDCG@10 | 18.734 | 21.223 | 22.097 | 22.733 | 22.931 | 23.834 | 3.94%
 | HR@10(%) | 9.002 | 10.019 | 10.696 | 6.326 | 7.923 | 11.058 | 3.38%
 | Recall@20 | 29.818 | 30.556 | 32.916 | 33.470 | 33.835 | 34.852 | 3.01%
 | NDCG@20 | 23.224 | 24.293 | 26.417 | 27.269 | 27.331 | 27.786 | 1.66%
 | HR@20(%) | 6.717 | 7.343 | 7.708 | 4.921 | 5.791 | 7.914 | 2.67%
Last-FM | Recall@10 | 19.363 | 27.396 | 27.927 | 28.141 | 27.993 | 29.585 | 5.13%
 | NDCG@10 | 13.944 | 20.635 | 20.760 | 20.948 | 20.901 | 22.181 | 5.89%
 | HR@10(%) | 3.866 | 4.770 | 5.968 | 4.072 | 4.536 | 5.937 | -0.52%
 | Recall@20 | 25.641 | 29.865 | 34.814 | 29.994 | 35.903 | 37.362 | 4.06%
 | NDCG@20 | 14.801 | 16.453 | 22.563 | 19.554 | 23.012 | 24.141 | 4.91%
 | HR@20(%) | 2.832 | 3.473 | 4.027 | 2.533 | 2.817 | 4.078 | 1.27%
Book-Crossing | Recall@10 | 7.701 | 7.915 | 9.263 | 8.904 | 9.535 | 9.952 | 4.37%
 | NDCG@10 | 4.767 | 4.498 | 5.795 | 5.363 | 6.051 | 6.174 | 2.03%
 | HR@10(%) | 1.348 | 1.128 | 1.564 | 1.102 | 1.381 | 1.594 | 1.92%
 | Recall@20 | 10.955 | 9.230 | 11.251 | 10.821 | 11.537 | 12.088 | 4.78%
 | NDCG@20 | 5.745 | 4.737 | 6.414 | 5.906 | 6.442 | 7.161 | 11.16%
 | HR@20(%) | 0.896 | 0.783 | 1.028 | 0.689 | 0.937 | 1.054 | 2.73%
Dataset | Metrics | KGSL-NS | KGSL-NK | KGSL
---|---|---|---|---
MovieLens-1M | Recall@10 | 24.238 | 24.729 | 25.875
 | NDCG@10 | 22.074 | 22.738 | 23.834
Last-FM | Recall@10 | 27.829 | 28.445 | 29.585
 | NDCG@10 | 20.586 | 21.163 | 22.181
Book-Crossing | Recall@10 | 9.415 | 9.495 | 9.952
 | NDCG@10 | 5.884 | 6.122 | 6.174
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Li, S.; Jia, Y.; Wu, Y.; Wei, N.; Zhang, L.; Guo, J. Knowledge-Aware Graph Self-Supervised Learning for Recommendation. Electronics 2023, 12, 4869. https://doi.org/10.3390/electronics12234869