A Brief Survey of Machine Learning and Deep Learning Techniques for E-Commerce Research
Abstract
1. Introduction
2. The Utilized Machine Learning and Deep Learning Techniques
2.1. Machine Learning Techniques
- Support vector machine (SVM) [11] is a machine learning model used for classification and regression. An SVM operates by identifying an optimal hyperplane that maximizes the margin between distinct classes, which is determined by critical data points known as support vectors. It handles linearly separable data directly and non-linearly separable data through the kernel trick; commonly used kernels include the linear, polynomial, radial basis function, and sigmoid kernels. It is particularly effective for binary classification and can be extended to multi-class problems [12].
- Decision Tree [13] is a model used for prediction tasks, functioning by segmenting the predictor space into simple regions for analysis. It uses a tree-like structure to make decisions based on feature values. At each internal node of the tree, a decision or splitting criterion is applied to determine the best feature and threshold for splitting the data [14]. In classification tasks, each leaf node represents a class label while, in regression tasks, the leaf nodes contain the predicted continuous value in that subset.
- Random Forest [15,16] is an ensemble learning method that combines multiple decision trees to make predictions. It enhances classification and regression tasks by training multiple trees on various sub-samples of the data set and aggregating the predictions of individual trees to improve accuracy and prevent over-fitting [17].
- Naïve Bayes [18] is a probabilistic classifier built on the “naïve” assumption that features are conditionally independent of each other given the class. It utilizes the Bayes theorem to calculate the posterior probabilities of classes based on observed feature values. Depending on the assumed distribution type of the features, there are Gaussian, Multinomial, and Bernoulli Naïve Bayes algorithms. Naïve Bayes is widely recognized for its simplicity and efficiency in training and prediction tasks, making it popular for various applications [19].
- Logistic regression [20] utilizes the logistic function or the sigmoid function to estimate the probabilities of inputs belonging to different classes. This method can be extended to softmax regression or multinomial logistic regression by replacing the sigmoid function with the softmax function. Logistic and softmax regression provide straightforward and interpretable approaches to classification problems, allowing for accurate and probabilistic predictions [21].
- Principal component analysis (PCA) [22] is a linear technique used to map high-dimensional input features to a lower-dimensional space whose axes are typically referred to as latent factors or principal components. PCA transforms the original data into a set of orthogonal components, ordered so that each successive component explains as much of the remaining variance in the data as possible [23].
- K-nearest neighbors (KNN) [27] is a non-parametric algorithm that predicts the class label (for classification) or the target value (for regression) of a test instance based on its similarity to its K nearest neighbors in the training data. In classification, the majority vote among the neighbors determines the class label while, in regression, the average (or weighted average) of the target values is taken [28].
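As an illustration of the K-nearest neighbors procedure described in the list above, the following is a minimal sketch using only the Python standard library. The function name `knn_predict`, the choice of Euclidean distance, the default `k=3`, and the toy customer data are assumptions made for illustration and do not come from the surveyed works.

```python
import math
from collections import Counter


def knn_predict(train_points, train_targets, query, k=3, task="classification"):
    """Predict for `query` from its k nearest training points (Euclidean distance)."""
    # Pair the distance from each training point to the query with that point's target.
    distances = [
        (math.dist(point, query), target)
        for point, target in zip(train_points, train_targets)
    ]
    # Sort by distance only, then keep the targets of the k closest neighbors.
    distances.sort(key=lambda pair: pair[0])
    nearest = [target for _, target in distances[:k]]
    if task == "classification":
        # Classification: majority vote among the neighbors.
        return Counter(nearest).most_common(1)[0][0]
    # Regression: unweighted average of the neighbors' target values.
    return sum(nearest) / len(nearest)


# Hypothetical toy data: two features per customer.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["no", "no", "no", "yes", "yes", "yes"]    # purchased or not
spend = [10.0, 12.0, 11.0, 50.0, 52.0, 48.0]        # amount spent

predicted_label = knn_predict(points, labels, (5.1, 5.0), k=3)
predicted_spend = knn_predict(points, spend, (1.0, 1.0), k=3, task="regression")
```

In this toy setting the query near the second cluster is labeled "yes" by majority vote, and the regression query receives the mean spend of its three closest neighbors (11.0). A distance-weighted average, also mentioned above, would replace the final line of the function.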
2.2. Deep Learning Techniques
- An Artificial Neural Network (ANN) [30] is a computational model inspired by the structure and functionality of biological neural networks in the human brain. It is composed of interconnected artificial neurons or nodes, organized into layers including the input layer, hidden layers, and output layer. The connections between neurons have associated weights, which are adjusted iteratively by propagating the error from the output layer back to the input layer, guided by a defined objective or loss function [31].
- A Convolutional Neural Network (CNN) [32,33] consists of convolutional layers that apply filters to extract features from input data, followed by pooling layers that reduce the spatial dimensions. CNNs have demonstrated exceptional performance in image classification, object detection, and image segmentation [34].
- The Visual Geometry Group network (VGG) [35] is a deep convolutional neural network architecture (e.g., VGG-16 and VGG-19, with 16 and 19 weight layers, respectively) developed by the Visual Geometry Group at the University of Oxford. It showcases the effectiveness of deep convolutional neural networks in capturing complex image features and hierarchies [36].
- A Temporal Convolutional Network (TCN) [37] utilizes dilated convolutional layers to capture temporal patterns and dependencies in the input data. These dilated convolutions enable an expanded receptive field without significantly increasing the number of parameters or computational complexity.
- Recurrent Neural Networks (RNNs) [30] are designed to process sequential data. Their key characteristic is the recurrent connection, which creates a loop-like structure that carries information across time steps, enabling the network to maintain a form of memory or context about previous steps [38].
- Long Short-Term Memory (LSTM) [39] is a type of RNN architecture that excels at capturing long-term dependencies and processing sequential data. It utilizes a memory cell and a set of gates that regulate the flow of information; in particular, the memory cell retains information over time, the input gate determines which values to update in the memory cell, the forget gate decides what information to discard from the memory cell, and the output gate selects the relevant information to be output at each time step [38].
- Bidirectional Long Short-Term Memory (BiLSTM) [40] combines two LSTMs that process the input sequence in opposite directions: one LSTM processes the sequence in the forward direction, while the other processes it in the backward direction. This bidirectional processing allows the model to capture information from both past and future contexts, providing a more comprehensive understanding of the input sequence. It has demonstrated strong performance in various natural language processing tasks.
- The Gated Recurrent Unit (GRU) [41] is a simplified alternative to the LSTM network, offering comparable performance with fewer parameters and less computation. In GRU, the update gate determines the amount of the previous hidden state to retain and the extent to which the new input is incorporated. The reset gate controls how much of the previous hidden state is ignored and whether the hidden state should be reset, based on the current input [42].
- The attention-based BiGRU [43,44] adopts attention mechanisms to dynamically assign different weights to different time steps of the sequence, allowing the model to attend to more informative or salient parts of the input. It has demonstrated superior performance in various natural language processing tasks [45].
- Deep Q-Networks (DQN) [47] combine reinforcement learning (RL) and deep learning, using a deep neural network to approximate the Q-function and thereby learn optimal policies in complex environments. The Q-function, also known as the action-value or quality function, represents the expected cumulative reward an agent can achieve by taking a specific action in a given state and following a certain policy thereafter. In recent years, deep RL has gained substantial attention and success in various domains, including robotics, game playing, and autonomous systems [49].
- A Generative Adversarial Network (GAN) [50] is composed of a generator network and a discriminator network, which engage in a competitive game. The generator aims to produce synthetic data samples, while the discriminator tries to discern between real and fake samples. Through iterative training in this adversarial process, GANs have exhibited remarkable capabilities in tasks such as image generation, image-to-image translation, and text generation [51,52].
- Transformers [53,54] are neural networks that use self-attention to capture relationships between words or tokens in a sequence. Self-attention involves calculating attention scores based on the relevance of each element to others, obtaining attention weights through the softmax function, and computing weighted sums using these attention weights. In transformers, the encoder computes representations for each element using self-attention, capturing dependencies and relationships, while the decoder uses this information to generate an output sequence [55].
- Bidirectional Encoder Representations from Transformers (BERT) [56] is a powerful pre-trained language model introduced by Google in 2018. BERT is trained in a bidirectional manner, learning to predict missing words by considering both the preceding and succeeding context, resulting in a better understanding of the overall sentence or document. BERT’s ability to capture contextual information and leverage pre-training has paved the way for advancements in understanding and generating human language [57].
- Autoencoders [58,59] are neural networks that learn to reconstruct their input data. They consist of an encoder network that maps input data to a compressed latent space and a decoder network that reconstructs the original data from the latent representation. They can be employed for tasks such as dimensionality reduction, anomaly detection, and generative modeling [60].
- A Deep Belief Network (DBN) [63,64] is a generative deep learning model built by stacking multiple restricted Boltzmann machines (RBMs), each trained in an unsupervised manner. An RBM is a two-layer stochastic neural network with binary visible and hidden nodes; it learns representations by adjusting its weights so that configurations resembling the training data are assigned low energy [65].
- Graph Neural Networks (GNNs) [66,67,68] are a class of deep learning models designed to learn node representations by aggregating information from neighboring nodes in a graph. By propagating information through the graph structure in this way, they enable effective learning and prediction on graph-structured data [69].
- A Directed Acyclic Graph Neural Network (DAGNN) [70] is an architecture specifically designed for directed acyclic graphs, where the nodes represent entities or features, and edges denote dependencies or relationships. DAGNNs can effectively capture complex dependencies and facilitate learning and inference in domains with intricate relationships among variables.
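The three self-attention steps described for transformers in the list above (attention scores, softmax weights, weighted sums) can be sketched in plain Python as follows. This is a deliberately simplified illustration: each token vector is used directly as query, key, and value, with no learned projection matrices, and the scores are divided by the square root of the dimension as in the standard scaled dot-product formulation. The function names and toy embeddings are assumptions for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into non-negative weights that sum to 1."""
    peak = max(scores)  # subtracted for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Simplified self-attention: each token vector acts as query, key and value."""
    dim = len(tokens[0])
    outputs, all_weights = [], []
    for query in tokens:
        # 1) Attention scores: scaled dot product of the query with every token.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        # 2) Attention weights obtained through the softmax function.
        weights = softmax(scores)
        # 3) Weighted sum of the value vectors, one output dimension at a time.
        context = [
            sum(w * value[d] for w, value in zip(weights, tokens))
            for d in range(dim)
        ]
        outputs.append(context)
        all_weights.append(weights)
    return outputs, all_weights


# Three hypothetical 2-dimensional token embeddings.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contexts, weights = self_attention(tokens)
```

Each row of `weights` sums to 1, and a token attends most strongly to the tokens whose vectors have the largest dot product with its own. A full transformer layer would add learned query, key, and value projections and multiple attention heads, which are omitted here.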
2.3. Optimization Techniques for Machine and Deep Learning
- Gradient Descent is an iterative algorithm that updates the model’s parameters by moving in the direction of steepest descent of the loss function.
- Stochastic Gradient Descent (SGD) is a variant of the Gradient Descent algorithm that is particularly suitable for large-scale data sets. It is widely used in deep learning, where it updates the network parameters based on a randomly selected subset of training examples, called a mini-batch.
- Adaptive Moment Estimation (Adam) is an extension of gradient descent that incorporates adaptive learning rates for different parameters. It dynamically adjusts each parameter’s step size based on running estimates of the first and second moments of the gradients.
- Root Mean Square Propagation (RMSProp) is an optimization algorithm that adapts the learning rate individually for each parameter based on a moving average of past squared gradients.
- Adagrad adapts the learning rate for each parameter based on its accumulated historical gradients. It places more emphasis on less frequent features by reducing the learning rate for frequently occurring features.
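To make the update rules in the list above concrete, the sketch below applies plain gradient descent and an Adam-style update (adaptive moment estimation) to a one-dimensional toy loss f(x) = (x - 3)^2. The toy loss, the learning rate, the number of steps, and the hyperparameters `beta1=0.9`, `beta2=0.999`, and `eps=1e-8` are assumptions chosen for illustration (the latter three are commonly used defaults), not settings taken from the surveyed works.

```python
import math


def grad(x):
    """Gradient of the toy loss f(x) = (x - 3)^2, whose minimum lies at x = 3."""
    return 2.0 * (x - 3.0)


def gradient_descent(x, lr=0.1, steps=500):
    """Plain gradient descent: repeatedly step against the gradient."""
    for _ in range(steps):
        x -= lr * grad(x)
    return x


def adam(x, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Adam-style update using running first and second moments of the gradient."""
    m, v = 0.0, 0.0  # moment estimates, initialized at zero
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g       # first moment: running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g   # second moment: running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)          # bias correction for the zero initialization
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)  # step scaled by the adaptive learning rate
    return x


x_gd = gradient_descent(0.0)
x_adam = adam(0.0)
```

Both runs start at x = 0 and end close to the minimum at x = 3. Stochastic gradient descent would follow the same loop as `gradient_descent`, but with `grad` evaluated on a randomly drawn mini-batch rather than on the full data set.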
2.4. Ensemble Techniques for Machine and Deep Learning
- Bagging (Bootstrap Aggregating) [73] involves training multiple models independently on different subsets of the training data, typically using the same learning algorithm. The final prediction is obtained by averaging or voting the predictions of the individual models. Random Forest is an example of a popular ensemble method that utilizes bagging [74].
- AdaBoost (Adaptive Boosting) [75] sequentially trains multiple homogeneous weak models and adjusts the weights of the training examples to emphasize misclassified instances. The final prediction is a weighted combination of the predictions from the individual models, with more weight given to more accurate models [76].
- Gradient Boosting [77] is a boosting method that incorporates the principles of gradient descent. It assembles an ensemble of weak learners sequentially: at each iteration, a new model is fitted to the residual errors of the current ensemble, which correspond to the negative gradients of a pre-determined loss function, so that adding it reduces the loss [78].
- XGBoost (Extreme Gradient Boosting) [79] is an optimized and highly efficient implementation of gradient boosting. It introduces regularization techniques to control model complexity and prevent over-fitting, and it supports parallel processing to accelerate training on large data sets. It also offers built-in functionality for handling missing values, feature importance analysis, and early stopping [80].
- Stacking [81,82] enhances the predictive accuracy by integrating heterogeneous weak learners. These base models are trained in parallel to provide a range of predictions, upon which a meta-model is subsequently trained, synthesizing them into a unified final output. This not only leverages the strengths of individual models, but also reduces the risk of over-fitting.
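The re-weighting idea behind AdaBoost described in the list above can be illustrated with a single boosting round. The sketch assumes the classical discrete AdaBoost formulas for binary classification, in which a weak model with weighted error e receives the model weight 0.5 * ln((1 - e) / e). The function name `adaboost_round` and the five-example toy setting are hypothetical.

```python
import math


def adaboost_round(weights, is_correct):
    """One AdaBoost re-weighting step for a single weak model.

    weights    -- current example weights (summing to 1)
    is_correct -- True where the weak model classified the example correctly
    """
    # Weighted error of the weak model on the training set.
    error = sum(w for w, ok in zip(weights, is_correct) if not ok)
    # Model weight: larger for more accurate models (assumes 0 < error < 1).
    alpha = 0.5 * math.log((1.0 - error) / error)
    # Increase the weights of misclassified examples and decrease the others.
    updated = [
        w * math.exp(-alpha if ok else alpha)
        for w, ok in zip(weights, is_correct)
    ]
    # Re-normalize so that the weights again sum to 1.
    total = sum(updated)
    return alpha, [w / total for w in updated]


# Five hypothetical examples with uniform weights; the weak model misses the last one.
start_weights = [0.2] * 5
alpha, new_weights = adaboost_round(start_weights, [True, True, True, True, False])
```

After this round the single misclassified example carries half of the total weight (0.5), while each correctly classified example drops to 0.125, so the next weak model is pushed to focus on the earlier mistake.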
2.5. Techniques to Prevent Over-Fitting and Improve Generalization
- Cross-validation [84,85] is a widely used technique to estimate the performance of a model on unseen data. It involves partitioning the available data into multiple subsets, training the model on some of them, and evaluating its performance on the remaining held-out subset, typically repeating the procedure so that each subset is held out in turn. The resulting estimates can guide the selection of hyperparameters and model architecture.
- Dropout [89] is a technique commonly used in deep learning models. It randomly deactivates a fraction of the neurons during training, effectively creating an ensemble of smaller sub-networks. This encourages the network to learn more robust and less dependent representations, reducing over-fitting and improving generalization.
- Early stopping [90,91] involves monitoring the model’s performance on a validation set during training and stopping the training process when the performance on the validation set starts to degrade. This prevents the model from over-optimizing the training data and helps to find an optimal point that balances training accuracy and generalization.
- Data augmentation [92] involves artificially increasing the size of the training set by applying various transformations to the existing data. This introduces diversity into the training data, reducing the risk of over-fitting and helping the model to better generalize to unseen examples.
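Two of the techniques in the list above, cross-validation and early stopping, lend themselves to a short sketch. The code below assumes the common k-fold variant of cross-validation with contiguous, unshuffled folds, and an early-stopping rule with a `patience` parameter (the number of epochs without improvement that is tolerated). Both function names, the `patience` mechanism, and the validation-loss values are assumptions for illustration.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k contiguous folds."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)  # spread the remainder over the first folds
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, validation
        start += size


def early_stopping_epoch(val_losses, patience=2):
    """Return the epoch with the best validation loss seen before training is stopped."""
    best_loss, best_epoch, waited = float("inf"), -1, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, waited = loss, epoch, 0  # improvement: reset the counter
        else:
            waited += 1
            if waited >= patience:  # no improvement for `patience` epochs: stop
                break
    return best_epoch


folds = list(k_fold_indices(10, 3))
best_epoch = early_stopping_epoch([0.9, 0.7, 0.6, 0.65, 0.7, 0.5], patience=2)
```

With 10 samples and 3 folds, the validation folds have sizes 4, 3, and 3, and every sample is held out exactly once. In the early-stopping example, training halts after two epochs without improvement and epoch 2 is reported as the best one, even though a later epoch in the hypothetical list would have been better; this is the trade-off controlled by `patience`.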
3. The Main Research Topics of Machine and Deep Learning in E-Commerce
3.1. Sentiment Analysis
3.2. Recommendation System
3.3. Fake Review Detection
3.4. Fraud Detection
3.5. Customer Churn Prediction
3.6. Customer Purchase Behavior Prediction
3.7. Prediction of Sales
3.8. Product Classification and Image Recognition
3.9. Other Directions
4. The Main Challenges and Trends for Machine and Deep Learning in E-Commerce
- Imbalanced data pose a significant challenge for both machine learning and deep learning-based classification tasks in e-commerce. This issue is prevalent in fields such as fraud detection, fake review detection, customer churn prediction, and re-purchase behavior classification [164,165,166,174], where one class significantly outweighs the others, leading to biased models with poor performance on the minority class. To address this issue, various methods can be applied, including re-sampling techniques, weighted training, and transfer learning, helping to enhance model performance and achieve more accurate predictions in e-commerce applications.
- Preventing over-fitting and achieving robust generalization is another challenge for machine learning and deep learning in e-commerce [109,180,183]. Ensembling techniques, such as bagging and boosting, combine multiple models to improve the overall performance and reduce over-fitting. Hybrid models integrate different types of machine learning and deep learning algorithms, leveraging their respective strengths to enhance the overall generalization ability. Regularization techniques, data augmentation, dropout, cross-validation, transfer learning, and early stopping also collectively contribute to building more reliable and accurate models for e-commerce tasks.
- Multi-modal learning poses significant challenges for machine learning and deep learning approaches in e-commerce [185,209]. Integrating data from diverse sources such as text, images, and audio requires careful alignment and pre-processing: feature extraction becomes complex and resource-intensive, labeling and annotating multi-modal data is time-consuming, and the development of fusion strategies to combine modalities for accurate prediction becomes challenging. Despite these challenges, multi-modal learning has great promise for enhancing e-commerce applications such as product classification [185], recommendation [210], sentiment analysis [211], and customer behavior prediction [212].
- Model interpretability poses a significant challenge for both machine learning and deep learning approaches in the context of e-commerce [6,213]. Due to the complexity of deep learning architectures, these models are often considered “black boxes”, making it difficult to understand the reasoning behind their decisions. In e-commerce applications, the ability to interpret why a model makes a specific recommendation or classification is crucial for building trust with users. Interpretability techniques, such as feature visualization [214], attention mechanisms [215], and gradient-based methods [216], are being explored to shed light on the inner workings of machine and deep learning models, enabling better transparency and accountability in the e-commerce decision-making process.
- Personalization is a prominent research area in e-commerce, aiming to enhance the user experience on various platforms [217,218]. AI-powered customer services employing machine learning and deep learning-driven chatbots and virtual assistants not only provide support, but can also help to predict customer needs and deliver tailored assistance and recommendations. Real-time inference capabilities are essential for e-commerce platforms, offering instantaneous recommendations and predictions to users. Reinforcement learning holds great potential in the realm of personalized marketing, allowing for the tailoring of promotions and advertisements to align with individual customer preferences [158]. Additionally, transfer learning stands out as a valuable strategy for refining pre-trained models, thereby enhancing their performance in specialized e-commerce tasks and mitigating the need for extensive data collection and training efforts [194]. These research-driven trends signify the growing importance of personalization and customer-centric strategies in the dynamic e-commerce landscape.
- Machine learning and deep learning-enabled chatbots and virtual assistants are emerging as a new trend in e-commerce [219,220]. These technologies harness the power of natural language processing (NLP) and conversational AI to deliver efficient and personalized customer support [219]. AI-driven chatbots analyze customer queries and interactions, providing real-time assistance and recommendations, thereby enhancing the overall shopping experience [220]. The integration of machine learning and deep learning with chatbots can facilitate continuous learning and adaptation to evolving user behaviors, enhancing their effectiveness in addressing customer needs. As e-commerce platforms endeavor to elevate customer engagement and streamline support processes, the adoption of machine and deep learning for chatbots and virtual assistants is poised to gain traction within the e-commerce industry.
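As a small illustration of the weighted training mentioned for imbalanced data in the list above, the sketch below computes per-class weights that are inversely proportional to class frequency. The specific formula (number of samples divided by the number of classes times the class count) is one common heuristic and is an assumption here, as are the function name and the hypothetical 90/10 fraud data set.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency: n_samples / (n_classes * count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical transaction labels: 90 legitimate, 10 fraudulent.
labels = ["legit"] * 90 + ["fraud"] * 10
weights = balanced_class_weights(labels)
```

Here the minority class receives a weight of 5.0 and the majority class a weight of about 0.56, so that errors on fraudulent transactions contribute more to a weighted loss. How such weights enter training depends on the model and library in use.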
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
AI | Artificial intelligence |
SVM | Support vector machine |
PCA | Principal component analysis |
KNN | K-nearest neighbors |
ANN | Artificial neural network |
CNN | Convolutional Neural Network |
VGG | Visual Geometry Group Network |
TCN | Temporal Convolutional Network |
RNN | Recurrent Neural Network |
LSTM | Long Short-Term Memory |
BiLSTM | Bidirectional Long Short-Term Memory |
GRU | Gated Recurrent Unit |
RL | Reinforcement Learning |
DQN | Deep Q-Network |
GAN | Generative Adversarial Network |
BERT | Bidirectional Encoder Representations from Transformers |
SDAE | Stacked Denoising Autoencoder |
DBN | Deep Belief Network |
GNN | Graph Neural Network |
DAGNN | Directed Acyclic Graph Neural Network |
SGD | Stochastic Gradient Descent |
AdaBoost | Adaptive Boosting |
XGBoost | Extreme Gradient Boosting |
References
- Sarker, I.H. Machine learning: Algorithms, real-world applications and research directions. SN Comput. Sci. 2021, 2, 160.
- Burkart, N.; Huber, M.F. A survey on the explainability of supervised machine learning. J. Artif. Intell. Res. 2021, 70, 245–317.
- Bi, Q.; Goodman, K.E.; Kaminsky, J.; Lessler, J. What is machine learning? A primer for the epidemiologist. Am. J. Epidemiol. 2019, 188, 2222–2239.
- Janiesch, C.; Zschech, P.; Heinrich, K. Machine learning and deep learning. Electron. Mark. 2021, 31, 685–695.
- Dong, S.; Wang, P.; Abbas, K. A survey on deep learning and its applications. Comput. Sci. Rev. 2021, 40, 100379.
- Sarker, I.H. Deep learning: A comprehensive overview on techniques, taxonomy, applications and research directions. SN Comput. Sci. 2021, 2, 420.
- Laudon, K.C.; Traver, C.G. E-Commerce; Pearson Boston: Boston, MA, USA, 2013.
- Nosratabadi, S.; Mosavi, A.; Duan, P.; Ghamisi, P.; Filip, F.; Band, S.S.; Reuter, U.; Gama, J.; Gandomi, A.H. Data science in economics: Comprehensive review of advanced machine learning and deep learning methods. Mathematics 2020, 8, 1799.
- Song, X.; Yang, S.; Huang, Z.; Huang, T. The application of artificial intelligence in electronic commerce. Proc. J. Phys. Conf. Ser. 2019, 1302, 032030.
- Di Corso, E.; Proto, S.; Vacchetti, B.; Bethaz, P.; Cerquitelli, T. Simplifying text mining activities: Scalable and self-tuning methodology for topic detection and characterization. Appl. Sci. 2022, 12, 5125.
- Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297.
- Noble, W.S. What is a support vector machine? Nat. Biotechnol. 2006, 24, 1565–1567.
- Breiman, L.; Friedman, J.; Olshen, R.; Stone, C. Cart. In Classification and Regression Trees; Wadsworth Publishing Group: Belmont, CA, USA, 1984.
- Charbuty, B.; Abdulazeez, A. Classification based on decision tree algorithm for machine learning. J. Appl. Sci. Technol. Trends 2021, 2, 20–28.
- Ho, T.K. Random decision forests. In Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal, QC, Canada, 14–16 August 1995; Volume 1, pp. 278–282.
- Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
- Speiser, J.L.; Miller, M.E.; Tooze, J.; Ip, E. A comparison of random forest variable selection methods for classification prediction modeling. Expert Syst. Appl. 2019, 134, 93–101.
- Hand, D.J.; Yu, K. Idiot’s Bayes—Not so stupid after all? Int. Stat. Rev. 2001, 69, 385–398.
- Chen, S.; Webb, G.I.; Liu, L.; Ma, X. A novel selective naïve Bayes algorithm. Knowl.-Based Syst. 2020, 192, 105361.
- Cramer, J.S. The Origins of Logistic Regression; University of Amsterdam and Tinbergen Institute: Amsterdam, The Netherlands, 2002.
- Pampel, F.C. Logistic Regression: A Primer; Number 132; Sage Publications: New York, NY, USA, 2020.
- Hotelling, H. Analysis of a complex of statistical variables into principal components. J. Educ. Psychol. 1933, 24, 417.
- Prechelt, L. Early stopping-but when? In Neural Networks: Tricks of the Trade; Springer: Berlin/Heidelberg, Germany, 2002; pp. 55–69.
- Shorten, C.; Khoshgoftaar, T.M. A survey on image data augmentation for deep learning. J. Big Data 2019, 6, 1–48.
- Agarwal, B.; Nayak, R.; Mittal, N.; Patnaik, S. Deep Learning-Based Approaches for Sentiment Analysis; Springer: Berlin/Heidelberg, Germany, 2020.
- Cao, J.; Li, J.; Yin, M.; Wang, Y. Online reviews sentiment analysis and product feature improvement with deep learning. Trans. Asian-Low-Resour. Lang. Inf. Process. 2023, 22, 1–17.
- Gope, J.C.; Tabassum, T.; Mabrur, M.M.; Yu, K.; Arifuzzaman, M. Sentiment analysis of Amazon product reviews using machine learning and deep learning models. In Proceedings of the 2022 International Conference on Advancement in Electrical and Electronic Engineering (ICAEEE), IEEE, Gazipur, Bangladesh, 24–26 February 2022; pp. 1–6.
- Xu, F.; Pan, Z.; ** an intelligent system with deep learning algorithms for sentiment analysis of e-commerce product reviews. Comput. Intell. Neurosci. 2022, 2022, 3840071.
- Hung, B.T.; Semwal, V.B.; Gaud, N.; Bijalwan, V. Hybrid deep learning approach for aspect detection on reviews. In Proceedings of the 2020 Integrated Intelligence Enable Networks and Computing (IIENC 2020), Chamoli, India, 5–7 September 2020; Springer: Berlin/Heidelberg, Germany, 2021; pp. 991–999.
- Yang, L.; Li, Y.; Wang, J.; Sherratt, R.S. Sentiment analysis for E-commerce product reviews in Chinese based on sentiment lexicon and deep learning. IEEE Access 2020, 8, 23522–23530.
- Liu, Y.; Lu, J.; Yang, J.; Mao, F. Sentiment analysis for e-commerce product reviews by deep learning model of Bert-BiGRU-Softmax. Math. Biosci. Eng. 2020, 17, 7819–7837.
- Zhou, H.; Wu, G. Research on sentiment analysis of chinese e-commerce comments based on deep learning. Proc. J. Phys. Conf. Ser. 2019, 1237, 022002.
- Venkatesan, R.; Sabari, A. Deepsentimodels: A novel hybrid deep learning model for an effective analysis of ensembled sentiments in e-commerce and s-commerce platforms. Cybern. Syst. 2023, 54, 526–549.
- Lin, P.; Luo, X.; Fan, Y. A survey of sentiment analysis based on deep learning. Int. J. Comput. Inf. Eng. 2020, 14, 473–485.
- Jain, S.; Roy, P.K. E-commerce review sentiment score prediction considering misspelled words: A deep learning approach. Electron. Commer. Res. 2022, 1–25.
- Sharm, N.; Jain, T.; Narayan, S.S.; Kandakar, A.C. Sentiment analysis of Amazon smartphone reviews using machine learning & deep learning. In Proceedings of the 2022 IEEE International Conference on Data Science and Information System (ICDSIS), Hassan, India, 29–30 July 2022; pp. 1–4.
- Wang, C.; Zhu, X.; Yan, L. Sentiment analysis for e-commerce reviews based on deep learning hybrid model. In Proceedings of the 2022 5th International Conference on Signal Processing and Machine Learning, New York, NY, USA, 4–6 August 2022; pp. 38–46.
- Mehul, A.R.; Mahmood, S.M.; Tabassum, T.; Chakraborty, P. Sentiment polarity detection using machine learning and deep learning. In Proceedings of the 2023 International Conference on Electrical, Computer and Communication Engineering (ECCE), Chittagong, Bangladesh, 23–25 February 2023; pp. 1–5.
- Sheikh, A.S.; Guigourès, R.; Koriagin, E.; Ho, Y.K.; Shirvany, R.; Vollgraf, R.; Bergmann, U. A deep learning system for predicting size and fit in fashion e-commerce. In Proceedings of the 13th ACM Conference on Recommender Systems, New York, NY, USA, 16–20 September 2019; pp. 110–118.
- Maher, M.; Ngoy, P.M.; Rebriks, A.; Ozcinar, C.; Cuevas, J.; Sanagavarapu, R.; Anbarjafari, G. Comprehensive empirical evaluation of deep learning approaches for session-based recommendation in e-commerce. Entropy 2022, 24, 1575.
- Pujastuti, E.; Laksito, A.; Hardi, R.; Perwira, R.; Arfriandi, A. Handling sparse rating matrix for e-commerce recommender system using hybrid deep learning based on LSTM, SDAE and latent factor. Int. J. Intell. Eng. Syst. 2022, 15, 379–393.
- Ahmed, A.; Saleem, K.; Khalid, O.; Rashid, U. On deep neural network for trust aware cross domain recommendations in E-commerce. Expert Syst. Appl. 2021, 174, 114757.
- Shoja, B.M.; Tabrizi, N. Customer reviews analysis with deep neural networks for e-commerce recommender systems. IEEE Access 2019, 7, 119121–119130.
- Yan, Y.; Liu, Z.; Zhao, M.; Guo, W.; Yan, W.P.; Bao, Y. A practical deep online ranking system in e-commerce recommendation. In Proceedings of the Conference on Machine Learning and Knowledge Discovery in Databases, Dublin, Ireland, 10–14 September 2018; Springer: Berlin/Heidelberg, Germany, 2019; pp. 186–201.
- Zhao, X.; ** behaviour from clickstream data using deep learning. Expert Syst. Appl. 2020, 150, 113342.
- Vieira, A. Predicting online user behaviour using deep learning algorithms. ar** and Validating a Directed Acyclic Graph (DAG) Network for Deep Learning. Electronics 2022, 11, 2940.
- Qi, Y.; Li, C.; Deng, H.; Cai, M.; Qi, Y.; Deng, Y. A deep neural framework for sales forecasting in e-commerce. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, Beijing, China, 3–7 November 2019; pp. 299–308.
- Zahavy, T.; Krishnan, A.; Magnani, A.; Mannor, S. Is a picture worth a thousand words? A deep multi-modal architecture for product classification in e-commerce. In Proceedings of the 2018 AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32. [Google Scholar]
- Gupta, P.; Raman, S. Translate2Classify: Machine translation for e-commerce product categorization in comparison with machine learning & deep learning classification. In Proceedings of the 2021 Data Analytics and Management (ICDAM 2021), Polkowise, Poland, 26 June 2021; Springer: Berlin/Heidelberg, Germany, 2022; pp. 769–788. [Google Scholar]
- Dai, J.; Wang, T.; Wang, S. A deep forest method for classifying e-commerce products by using title information. In Proceedings of the 2020 International Conference on Computing, Networking and Communications (ICNC), Big Island, HI, USA, 17–20 February 2020; pp. 1–5.
- Samia, B.; Soraya, Z.; Malika, M. Fashion images classification using machine learning, deep learning and transfer learning models. In Proceedings of the 2022 7th International Conference on Image and Signal Processing and Their Applications (ISPA), Mostaganem, Algeria, 8–9 May 2022; pp. 1–5.
- Sharma, R.; Vishvakarma, A. Retrieving similar e-commerce images using deep learning. ar** strategy for e-commerce commodity demand forecasting. Mob. Inf. Syst. 2021, 2021, 5568208.
- Symeonidis, P.; Tiakas, E.; Manolopoulos, Y. Product recommendation and rating prediction based on multi-modal social networks. In Proceedings of the Fifth ACM Conference on Recommender Systems, Chicago, IL, USA, 23–27 October 2011; pp. 61–68.
- Jiming, L.; Peixiang, Z.; Ying, L.; Weidong, Z.; Jie, F. Summary of Multi-modal Sentiment Analysis Technology. J. Front. Comput. Sci. Technol. 2021, 15, 1165.
- Herberz, M.; Hahnel, U.J.; Brosch, T. The importance of consumer motives for green mobility: A multi-modal perspective. Transp. Res. Part A Policy Pract. 2020, 139, 102–118.
- Wanner, J.; Herm, L.V.; Heinrich, K.; Janiesch, C. Stop ordering machine learning algorithms by their explainability! An empirical investigation of the tradeoff between performance and explainability. In Proceedings of the 2021 Conference on e-Business, e-Services and e-Society, Newcastle upon Tyne, UK, 13–14 September 2022; Springer: Berlin/Heidelberg, Germany, 2021; pp. 245–258.
- Zhang, Q.s.; Zhu, S.C. Visual interpretability for deep learning: A survey. Front. Inf. Technol. Electron. Eng. 2018, 19, 27–39.
- Yang, Z.B.; Zhang, J.P.; Zhao, Z.B.; Zhai, Z.; Chen, X.F. Interpreting network knowledge with attention mechanism for bearing fault diagnosis. Appl. Soft Comput. 2020, 97, 106829.
- Nielsen, I.E.; Dera, D.; Rasool, G.; Ramachandran, R.P.; Bouaynaya, N.C. Robust explainability: A tutorial on gradient-based attribution methods for deep neural networks. IEEE Signal Process. Mag. 2022, 39, 73–84.
- Tan, K.S.; Subramanian, P. Proposition of machine learning driven personalized marketing approach for E-commerce. J. Comput. Theor. Nanosci. 2019, 16, 3532–3537.
- Goldenberg, D.; Kofman, K.; Albert, J.; Mizrachi, S.; Horowitz, A.; Teinemaa, I. Personalization in practice: Methods and applications. In Proceedings of the 14th ACM International Conference on Web Search and Data Mining, Jerusalem, Israel, 8–12 March 2021; pp. 1123–1126.
- Rakhra, M.; Gopinadh, G.; Addepalli, N.S.; Singh, G.; Aliraja, S.; Reddy, V.S.G.; Reddy, M.N. E-commerce assistance with a smart chatbot using artificial intelligence. In Proceedings of the 2021 2nd International Conference on Intelligent Engineering and Management, London, UK, 28–30 April 2021; pp. 144–148.
- Landim, A.; Pereira, A.; Vieira, T.; de B. Costa, E.; Moura, J.; Wanick, V.; Bazaki, E. Chatbot design approaches for fashion E-commerce: An interdisciplinary review. Int. J. Fash. Des. Technol. Educ. 2022, 15, 200–210.
Year | 2018 | 2019 | 2020 | 2021 | 2022 | 2023 |
---|---|---|---|---|---|---|
Number | 10 | 16 | 22 | 26 | 40 | 54 |
Ranking | Word | Frequency | Ranking | Word | Frequency | Ranking | Word | Frequency |
---|---|---|---|---|---|---|---|---|
1 | learning | 159 | 11 | product | 19 | 21 | network | 9 |
2 | deep | 132 | 12 | fraud | 18 | 22 | churn | 9 |
3 | ecommerce | 77 | 13 | approach | 16 | 23 | online | 8 |
4 | machine | 51 | 14 | neural | 16 | 24 | reinforcement | 8 |
5 | analysis | 38 | 15 | reviews | 16 | 25 | comparative | 8 |
6 | sentiment | 28 | 16 | classification | 16 | 26 | hybrid | 8 |
7 | detection | 24 | 17 | system | 15 | 27 | card | 8 |
8 | model | 24 | 18 | review | 12 | 28 | credit | 8 |
9 | recommendation | 21 | 19 | customer | 12 | 29 | algorithm | 7 |
10 | prediction | 20 | 20 | data | 11 | 30 | techniques | 7 |
Ranking | Word | Frequency | Ranking | Word | Frequency | Ranking | Word | Frequency |
---|---|---|---|---|---|---|---|---|
1 | learning | 381 | 11 | models | 110 | 21 | classification | 83 |
2 | deep | 309 | 12 | online | 108 | 22 | products | 82 |
3 | ecommerce | 252 | 13 | accuracy | 107 | 23 | system | 77 |
4 | model | 246 | 14 | sentiment | 103 | 24 | detection | 75 |
5 | data | 175 | 15 | network | 99 | 25 | information | 74 |
6 | reviews | 160 | 16 | prediction | 96 | 26 | features | 73 |
7 | product | 145 | 17 | proposed | 92 | 27 | cnn | 69 |
8 | machine | 144 | 18 | recommendation | 88 | 28 | algorithms | 68 |
9 | analysis | 131 | 19 | customer | 84 | 29 | sales | 68 |
10 | neural | 125 | 20 | dataset | 84 | 30 | performance | 65 |
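
A ranking like the two word-frequency tables above could be produced along the following lines. This is only a minimal sketch: the survey does not describe its tokenization or stop-word handling, so the `top_words` helper, the `STOP_WORDS` set, and the toy `titles` list are all assumptions made for illustration. The one detail taken from the tables is that "e-commerce" appears as the single token "ecommerce", which the sketch reproduces by dropping hyphens before tokenizing.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the survey does not state which words were excluded.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "using", "with"}


def top_words(texts, k=30):
    """Return the k most frequent non-stop-word tokens across all texts."""
    counts = Counter()
    for text in texts:
        # Lowercase and drop hyphens so "E-commerce" is counted as "ecommerce".
        normalized = text.lower().replace("-", "")
        tokens = re.findall(r"[a-z]+", normalized)
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(k)


# Toy titles for illustration only; this is not the surveyed corpus.
titles = [
    "Deep learning for e-commerce fraud detection",
    "Sentiment analysis of e-commerce reviews using deep learning",
    "A deep learning approach to product recommendation",
]

# Print rows in the same "Ranking | Word | Frequency" layout as the tables.
for rank, (word, freq) in enumerate(top_words(titles, k=5), start=1):
    print(f"{rank} | {word} | {freq} |")
```

Note that this sketch does not merge singular and plural forms, which matches the tables listing "review" and "reviews" (and "model" and "models") as separate entries.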
References | Models | Data Set |
---|---|---|
[98] | LSTM, GRU (compared with Naïve Bayes, Decision Tree, Random Forest, and SVM) | Indian Hotel booking data from booking.com |
[104] | Comparative study of LSTM, Bi-LSTM, GRU-CNN, CNN-RNN, CNN-LSTM, and CNN-BiLSTM | IMDB Movie Review, Amazon Product Review |
[105] | LSTM, CNN-LSTM | Reviews of cameras, laptops, mobile phones, tablets, televisions, and video surveillance products from the Amazon website |
[106] | LSTM, CNN, CNN-LSTM | Vietnamese VLSP 2018 data set |
[107] | Combination sentiment lexicon, CNN, and attention-based BiGRU | Book evaluation of dangdang.com |
[108] | RNN, BERT, BiGRU, BERT-BiLSTM, Softmax function | COAE2014-task4, ChnSentiCorp-Htl-ba-6000, Reviews about mobile phone products from Suning and Taobao |
[109] | LSTM, Bi-directional LSTM, LSTM Attention, TCN, TCN Attention model | 10,679 positive comments and 10,428 negative comments collected from Chinese e-commerce platforms |
[110] | Hybrid model of spotted hyena optimized LSTM and BiCNN (compared with CNN, BiGRU, CNN-LSTM, and Attention LSTM) | Amazon Product Reviews (Text Data), Twitter Emoji data sets (Emojis), Shopping Customer data (Text + Emojis) |
[112] | LSTM encoder–decoder (compared with LSTM, Bi-LSTM, and attention-based LSTM) | Reviews on electronics products from the Amazon website |
[113] | Combination of SVM, logistic regression, naïve Bayes model with LSTM, RNN | Comments of smartphones on Amazon and Flipkart, Kaggle |
[114] | CNN attention, RNN attention (compared with CNN and RNN) | 8 cross-border e-commerce APP reviews from APP Store |
[115] | Combination of LSTM and Sigmoid kernel | Amazon-based customer reviews |
References | Models | System Types |
---|---|---|
[116] | Matrix factorization, deep MLP, SGD | Personalized size and fit recommendation |
[117] | Hybrid model of Matrix Factorization, RNN-GRU, attention mechanism, and GAN | Session-based recommendation |
[118] | Hybrid model of SDAE, LSTM, and Probabilistic Matrix Factorization | E-commerce recommendation system from sparse rating matrix |
[119] | Generalized matrix factorization, deep MLP | Cross-domain recommendations |
[120] | Latent Dirichlet Allocation, deep neural network | E-commerce recommendation based on customer reviews analysis |
[121] | Deep neural network | Production online recommendation |
[122] | RNN, GRU | Page-wise recommendation and interaction, 2D page real time feedback |
[123] | Combination of CNN, GAN, RL, and Deep Q-Network | Session-based interactive recommendation |
[124] | Time window-based RNN | Product sequence recommendation |
[125] | Comparative analysis of CNN, RNN, LSTM, and GRU | Product recommendation for online shopping |
[126] | CNN | Image retrieval and visual recommendation |
[127] | Comparative review of KNN, SVM, random forest, CNN, LSTM-RNN, and GNN | Search engine recommendation review |
[128] | Pairwise deep RL | Recommendations with negative feedback |
[129] | CNN, LSTM | Cross-border Niche product recommendation |
[130] | RL | Product recommendation in online advertising |
[131] | CNN | Fashion collocation recommendation model |
[132] | Deep neural network, collaborative filtering | Recommendation engine of social media websites |
[133] | CNN, RNN | Cold start and data sparsity of recommendation system |
[134] | Transformer | Sequential signals underlying user behavior sequences for recommendation in Alibaba |
[135] | GAN | Content-based recommendation system |
[136] | CNN, autoencoders | E-commerce fashion market recommendation |
[137] | CNN, RNN | Shopping basket recommendation |
References | Models | Auxiliary Techniques or Application Fields |
---|---|---|
[138] | Comparative analysis of Naïve Bayes, SVM, KNN, and decision tree | Weka Text Classification, film reviews data set |
[139] | Comparison analysis of Naïve Bayes, SVM, logistic regression, and decision tree | Tokenization and removing stop words, Amazon data sets |
[140] | Comparison analysis of KNN, SVM, naïve Bayes, and logistic regression | Yelp data set |
[141] | CNN, Bi-LSTM (compared with Naïve Bayes, SVM, KNN, and decision tree) | GloVe embedding method, bag of words model, Amazon e-commerce reviews |
[142] | Hybrid model of SVM, MLP, and CNN-LSTM | Emotional expressions, extreme rating |
[143] | CNN-BiLSTM (compared with Random forest) | Amazon platform |
[144] | CNN (compared with SVM, logistic regression, and naïve Bayes) | GloVe word embedding, Ott data set |
[145] | Comparative analysis of CNN, MLP, LSTM, naïve Bayes, KNN, and SVM | Yelp Database |
[146] | Comparative Study of CNN, Bi-LSTM, CNN-Bi-LSTM, logistic regression, random forest, naïve Bayes, and SVM | Word2Vec, FastText, and GloVe embeddings, E-commerce website called DarazBD |
[147] | Bi-GRU | Amazon, Yelp |
[148] | CNN + LSTM (compared with MLP, naïve Bayes, and SVM) | Ott and Yelp data sets |
[149] | RNN+CNN (compared with BiLSTM and SVM) | Amazon product reviews data set |
[150] | BiLSTM+Attention, CNN-BiLSTM (compared with CNN, BiLSTM, logistic regression, naïve Bayes, and BERT) | Real-world data sets from http://Yelp.com (accessed on 2 November 2022) |
[151] | Hybrid model of CNN, RNN, and attention mechanism | Cross-domain spam detection, Hotel, restaurant, and doctor reviews |
[152] | Comparison analysis of MLP, CNN, LSTM, Naïve Bayes, KNN, and SVM | Ott Data set, Yelp Data set |
References | Models | Research Topic |
---|---|---|
[153] | Comparison analysis of Decision tree, random forest, SVM, logistic regression, XGBoost, CNN, LSTM, RNN, GAN, and RBM | Credit card fraud detection method comparison |
[154] | Comparative analysis of random forest and deep neural network | Credit card fraud detection |
[155] | Hybrid model of BiLSTM and BiGRU (compared with naïve Bayes, Adaboost, random forest, decision tree, and logistic regression) | Credit card fraud detection |
[156] | Deep network, RL | Vulnerability of deep fraud detector |
[157] | Hybrid model of Markov Decision Process and RL | Impression allocation for combating Fraud in e-commerce |
[158] | RL | Order fraud evaluation |
[159] | Machine learning methods | Comprehensive survey on fraud detection |
[160] | Comparative analysis of Random forest, SVM, KNN, KNN-SVM-CNN, and RBM | Fraudulent transaction tracing |
[161] | Hybrid model of Encoders, SVM and CNN | Financial fraud detection |
[162] | Comparative analysis of Random forest and Adaboost | Credit card fraud detection |
References | Models | Research Topic or Application Field |
---|---|---|
[165] | AdaBoost, deep network | Imbalanced data processing |
[166] | RNN | Imbalanced classes of real e-commerce data |
[167] | Hybrid model of PCA, AdaBoost, and decision tree | High-dimensional and unbalanced data |
[168] | Deep neural network | Telco data set, Churn factor analysis |
[169] | Comparative analysis of PCA, SVM, naïve Bayes, random forest, and deep network | Brazilian e-commerce data set |
[170] | CNN | Telco Customer, Distributed model |
[171] | Hybrid model of CNN, decision tree, and grid search optimization | Diagnosis of employee churn |
[172] | Comparative analysis of Naïve Bayes, SVM, decision tree, random forest, and logistic regression | IBM Watson Analytics HR data, Employee attrition prediction |
References | Models | Direction or Application Fields |
---|---|---|
[197,198] | Machine learning optimization | Last-mile delivery, third-party logistics, and bin packing problem |
[199] | Comparative analysis of Random Forest, XGBoost, Logistic Regression, and Neural Network | Detection and prediction of company short-, middle-, and long-term defaults and bankruptcy |
[200] | Deep neural network | Financial early warning model |
[201] | LSTM | Financial risk prediction for Chinese e-commerce enterprises from 2012 to 2022 |
[202] | CNN-LSTM | Service capacity allocation for cross-border e-commerce |
[203] | Deep neural network | Identification of entrepreneurs in rural e-commerce |
[204] | Deep neural network | Social e-commerce tax evasion detection |
[205] | Hybrid model of AdaBoost and Deep neural network | E-commerce industry marketing promotion |
[206] | CNN | Task scheduling based on deadline and cost |
[207] | CNN | Cross-border e-commerce platform for commodity automatic pricing |
[208] | LSTM | Named Entity Recognition |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhang, X.; Guo, F.; Chen, T.; Pan, L.; Beliakov, G.; Wu, J. A Brief Survey of Machine Learning and Deep Learning Techniques for E-Commerce Research. J. Theor. Appl. Electron. Commer. Res. 2023, 18, 2188-2216. https://doi.org/10.3390/jtaer18040110