Article

Development of a Personalized Multiclass Classification Model to Detect Blood Pressure Variations Associated with Physical or Cognitive Workload

1 Department of Electronics and Telecommunications, Politecnico di Torino, 10129 Torino, Italy
2 Tyndall National Institute, University College Cork, Lee Maltings Complex, Dyke Parade, T12R5CP Cork, Ireland
* Authors to whom correspondence should be addressed.
Sensors 2024, 24(11), 3697; https://doi.org/10.3390/s24113697
Submission received: 2 May 2024 / Revised: 23 May 2024 / Accepted: 4 June 2024 / Published: 6 June 2024
(This article belongs to the Special Issue Wearable Technologies and Sensors for Healthcare and Wellbeing)

Abstract

Comprehending the regulatory mechanisms influencing blood pressure control is pivotal for continuous monitoring of this parameter. Implementing a personalized machine learning model, utilizing data-driven features, presents an opportunity to facilitate tracking blood pressure fluctuations in various conditions. In this work, data-driven photoplethysmography features extracted from the brachial and digital arteries of 28 healthy subjects were used to feed a random forest classifier in an attempt to develop a system capable of tracking blood pressure. We evaluated the behavior of this classifier according to the different sizes of the training set and degrees of personalization used. Aggregated accuracy, precision, recall, and F1-score were equal to 95.1%, 95.2%, 95%, and 95.4%, respectively, when 30% of a target subject’s pulse waveforms were combined with data from five randomly selected source subjects available in the dataset. Experimental findings illustrated that incorporating a pre-training stage with data from different subjects made it viable to discern morphological distinctions in beat-to-beat pulse waveforms under conditions of cognitive or physical workload.

1. Introduction

Population growth and the aging demographic are recognized as predominant factors linked to the rise in the incidence of cardiovascular diseases (CVDs) [1]. The estimated number of adults aged 30 to 79 with hypertension increased from approximately 650 million to 1.28 billion between 1990 and 2019 [2,3]. Hypertension, also known as elevated blood pressure (BP), is a medical condition that promotes the onset of coronary diseases and other pathologies affecting vital organs, such as the brain and kidneys [4,5,6,7].
Noninvasive BP measurement systems have emerged to overcome the limitations of invasive measurement [8]. These devices employ multi-modal sensors that exploit diverse physical principles to extract information regarding the status of the cardiovascular system [9,10,11,12,13]. Continuous arterial BP measurement has been widely acknowledged as a more accurate determinant of cardiovascular risks since alterations in systolic or diastolic BP, along with changes in the shape of BP waveforms over time, reflect the progression of arterial and arteriolar modifications [14,15]. Moreover, by analyzing the arterial pressure waveforms, the cardiovascular status can be assessed through the estimation of physiological parameters [16,17]. A promising approach for continuous BP measurement is through computational modeling of the circulatory system [18,19,20]. These models integrate noninvasive data, like aortic flow and peripheral readings, to estimate BP. In parallel, calibrated methods for cuffless BP, including Pulse Transit Time (PTT) and pulse wave analysis (PWA), stand out as viable solutions for BP assessment [21]. PTT is the time delay for the pressure wave to travel between proximal and distal arterial sites [22]. As defined in different arterial stiffness studies, PTT is conventionally identified as the foot-to-foot time delay between proximal and distal arterial BP waveforms [23,24]. PWA relies on extracting features from an arterial waveform and associating them with BP units through a calibration model. This approach offers greater convenience compared to the PTT method, as it necessitates only a single sensor [25], and it can also be employed in conjunction with PTT to enhance accuracy. PWA employs data-driven feature extraction to extrapolate relevant information for the BP assessment. Numerous features have been investigated to analyze the morphology of the arterial pulse waves captured by photoplethysmography (PPG) sensors [26,27,28].
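The foot-to-foot definition of PTT can be illustrated with a minimal sketch. It assumes that the proximal and distal pulses are sampled simultaneously at the same rate and that the pulse foot can be approximated by the minimum sample preceding the systolic peak; both the foot detector and the function names are illustrative assumptions, not the detection method used in this work.

```python
def pulse_foot_index(pulse):
    """Index of the pulse foot, approximated here (assumption) as the
    lowest sample located before the systolic peak."""
    peak = max(range(len(pulse)), key=lambda k: pulse[k])
    return min(range(peak + 1), key=lambda k: pulse[k])


def foot_to_foot_ptt(proximal, distal, fs_hz):
    """Foot-to-foot delay in seconds between two simultaneously
    sampled pulses recorded at fs_hz samples per second."""
    delay_samples = pulse_foot_index(distal) - pulse_foot_index(proximal)
    return delay_samples / fs_hz


if __name__ == "__main__":
    # Synthetic pulses: the distal foot occurs three samples later.
    proximal = [0.2, 0.1, 0.0, 0.5, 1.0, 0.6, 0.3, 0.2, 0.1]
    distal = [0.3, 0.25, 0.2, 0.1, 0.05, 0.0, 0.5, 1.0, 0.6]
    print(foot_to_foot_ptt(proximal, distal, fs_hz=100.0))
```

In practice the foot is often located with more robust criteria (for example, tangent intersection), so the simple minimum used here should be read only as a placeholder.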
Several machine learning (ML) algorithms, including Support Vector Machines (SVM), random forests (RF), and feedforward neural networks (NN), have been employed in BP assessment [29,30,31]. In many instances, nonlinear models have demonstrated superior performance compared to linear models, although this outcome is contingent on the specific dataset and approach utilized, i.e., PTT, Pulse Arrival Time (PAT), or PWA using only PPG data. Additionally, more advanced methods, such as Recurrent Neural Networks (RNN) [32] and Long Short-Term Memory (LSTM) networks [33], have also been proposed. Although these models may offer a significant advantage over previously mentioned models by incorporating the ability to capture variations in extracted features over time, Deep Learning (DL) models require a large number of data samples to provide reasonably accurate BP values. Although DL and ML have found extensive application in BP assessment, the considerable inter-subject variability has posed challenges in formulating a sufficiently generalized model whose performance could also be maintained outside of the initial dataset. Therefore, drawing inspiration from established practices in the field of human activity recognition [34,35], numerous studies have suggested the formulation of person-specific models for the examination of this clinical parameter [36,37,38,39].
This paper presents a personalized ML model designed to detect BP changes in response to various stimuli. For this specific application, we developed a customized acquisition system to perform real-time acquisition and visualization of raw PPG pulse waveforms at the level of brachial (elbow) [10] and digital (thumb) [40] arteries. A specific data collection protocol was deployed to perform an accurate assessment and to induce a BP variation owing to the execution of both physical and cognitive tasks [41,42].
Our contributions in this application are as follows: (1) we propose ten pre-training strategies to calibrate an RF model based on individual physiological characteristics; (2) pre-training improves blood pressure classification accuracy on beat-to-beat pulse waveforms by up to 60% compared to a generalized model; (3) overfitting is mitigated by expanding the number of source subjects, reducing the need for additional target subject data; (4) we cut the data required to personalize the model by up to 30% while maintaining evaluation metrics above 95%. The structure of this article is as follows. Section 2 guides the reader through a detailed description of the hardware design of the device developed to retrieve the PPG raw data used in this work. Then, the data capture protocol and the data processing pipeline are presented, along with the description of the training strategies and the evaluation metrics employed to quantify the performance of the ML model. In Section 3, the results of the processing stage are reported, followed by the results retrieved for each subject in the dataset. Section 4 details the discussion and limitations of the proposed analysis. Finally, Section 5 concludes the paper.

2. Materials and Methods

2.1. Hardware Device

In this work, a custom-designed data acquisition system was employed to collect the arterial pulse waveforms between the brachial artery and the digital artery [43]. Figure 1 illustrates the fabricated supports and their positioning on the subject during data collection. The thumb-mounted holder (Figure 1, center) mimics a standard pulse oximeter design, featuring an elastic spring for sensor adherence to the finger. On the top left side of Figure 1, the sensor holder is seen to be positioned on the elbow, with mechanical fixtures that allow the operator to regulate arm pressure, making it steady during the data acquisition process as recommended in [21,44,45]. Each enclosure was designed to guarantee that the sensor maintains firm contact with the sample site, applying steady pressure and avoiding the need for the operator to hold it in its position. This feature enhances the replicability of the acquisition setup for any specific subject, thereby improving measurement consistency. Further information on the developed hardware device is available in [43].

2.2. Data Collection Protocol

A pre-clinical trial was carried out at the Tyndall National Institute in University College Cork (UCC), Cork, Ireland, to study the blood pressure variations related to the execution of cognitive and physical tasks. In this experiment, approved by the UCC Clinical Research Ethics Committee of the Cork Teaching Hospitals, a cohort of 31 healthy volunteers aged 21 to 34 years was recruited. Table 1 details the physiological parameters of the people involved. In accordance with the guidelines established for accurate BP measurement [46], no participant included in this study had a pre-existing cardiovascular condition or was undergoing treatment with medications that could influence BP readings. Moreover, every individual was asked to refrain from smoking or consuming coffee in the 60 min before the session. The first step in the data capture consisted of the individual reclining in a supine position for 10 min to ensure that their hemodynamic conditions and vasomotor tone returned to a baseline level. Subsequently, the subject received instructions regarding the prescribed posture for data capture, which included sitting with back support, both feet flat on the floor, and hands resting on the table at a height equivalent to that of the heart.
In accordance with the study protocol reported in Figure 2, after obtaining the anamnesis information, the operator identified the best location for the brachial artery through tactile arterial palpation. Once located, the spot was marked with ink to be sure that the acquisition site did not change over the duration of the data capture. Each data collection session was divided into three principal sections, denoted as follows: the resting phase (REST), the phase dedicated to cognitive testing (CT), and the after-exercise phase (AE). During each phase, a series of three data acquisitions, each one lasting one minute, was performed using the presented device. Then, the commercial cuff-based device BPM Connect [47] (Withings, Issy-les-Moulineaux, France) was used as a gold standard to measure the reference BP values for each specific section. In total, a set of three reference measurements was collected throughout the entire data collection. To prevent any potential recovery effects between measurements using both devices, we conducted the reference assessment immediately after completing the three acquisitions. Each phase was designed to induce changes in both blood pressure and PPG data collected from each participant. The resulting alterations in the PPG pulse waveforms are illustrated in the bottom section of Figure 2.
In the CT section, the subject was cognitively stimulated through two cognitive tests: the Stroop test [48] and the n-back test [49,50]. Both tests were executed through a custom-designed graphical user interface (GUI) structured to make a gradual augmentation in the level of complexity. Prior to commencing the actual measurement, the operator provided the participant with detailed instructions regarding the tests to be undertaken. Additionally, the participants had the opportunity to familiarize themselves with the GUI through the execution of a short demonstration. Then, the device was positioned on the subject. The last three minutes of this section were recorded during the execution of the n-back test and later subdivided into the three acquisitions related to the CT part. Hence, the reference BP measurement was taken again with the Withings device. Finally, the AE section of the data capture was performed. During this stage, each subject was engaged in a 10-min walking session on a calibrated treadmill. The treadmill’s configuration was kept uniform across all data collection sessions. The speed was configured at 8 km/h, and the inclination was adjusted to its maximum level to induce BP variation even in trained subjects. Then, the last three acquisitions with the proposed device were carried out along with the last reference BP measurement.

2.3. Data Processing

The data processing pipeline designed for this application can be divided into three major sections: pre-processing, pulse wave quality assessment (PWQA), and lastly, the identification of specific fiducial points employed to derive the features for data analysis. Thumb and elbow PPG measurements were processed following the same procedure within the MATLAB (Natick, MA, USA) environment. Each acquisition was band-pass filtered between 0.3 Hz and 15 Hz [51], to remove the DC offset and the high-frequency noise. Time series segmentation and labeling remain challenging areas of study, with researchers exploring various methods to improve accuracy and efficiency [52,53].
This study addressed this challenge by segmenting the collected data into consecutive, non-overlapping 3-s windows. This initial segmentation was used to identify and remove portions of the signal possibly corrupted by the presence of motion artifacts [54]. Subsequently, every single pulse wave within the acquisition was identified through the localization of the pulse onset, known in the literature as the beginning of the systolic phase within the cardiac cycle. The template matching approach was selected to perform the PWQA [55]. As the first step, a reference template was computed from all the available epochs. Then, Pearson’s correlation coefficient was used as the signal quality index (SQI) between the ith pulse and the template. All the pulses showing an SQI below the defined acceptance threshold (i.e., 0.95, 0.95, and 0.9 for REST, CT, and AE, respectively) were marked as unacceptable and discarded. Feature measurements were obtained from the PPG pulse wave through the identification of key reference points on the pulse wave and its derivatives, which were then used to compute a variety of characteristics. The identified fiducial points included the systolic peak (sys), dicrotic notch (dic), and diastolic peak (dia) on the pulse wave, as well as the point of maximum upslope on the first derivative (ms). Additionally, the a, b, c, d, e, and f waves on the second derivative were detected [56,57,58,59]. These reference points are visually represented in Figure 3 for the baseline radial artery PPG pulse wave. Detailed criteria for identifying these fiducial points and the extracted features are provided in Table 2 and Table 3, respectively.
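The windowing and template-matching steps described above can be sketched as follows. This is a minimal illustration rather than the authors' MATLAB pipeline: it assumes that pulses have already been segmented and resampled to a common length, that the template is the point-wise mean of all pulses, and that a trailing partial window is dropped. Function names are hypothetical; only the thresholds come from the text.

```python
import math

# Acceptance thresholds per phase, as stated in the text.
SQI_THRESHOLD = {"REST": 0.95, "CT": 0.95, "AE": 0.90}


def split_windows(signal, fs_hz, window_s=3.0):
    """Consecutive, non-overlapping windows; a trailing partial
    window is dropped (assumption)."""
    n = int(round(window_s * fs_hz))
    return [signal[k:k + n] for k in range(0, len(signal) - n + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def mean_template(pulses):
    """Point-wise mean of equal-length pulses (assumed template)."""
    return [sum(column) / len(pulses) for column in zip(*pulses)]


def assess_quality(pulses, phase):
    """Keep the pulses whose SQI reaches the threshold of the phase."""
    template = mean_template(pulses)
    threshold = SQI_THRESHOLD[phase]
    return [p for p in pulses if pearson(p, template) >= threshold]


if __name__ == "__main__":
    base = [0.0, 0.5, 1.0, 0.7, 0.4, 0.2, 0.1]
    clean = [[g * v for v in base] for g in (1.0, 1.1, 0.9, 1.05)]
    artifact = list(reversed(base))  # distorted morphology
    kept = assess_quality(clean + [artifact], "REST")
    print(len(kept), "of", len(clean) + 1, "pulses kept")
```

Because Pearson's coefficient is insensitive to amplitude scaling, the rescaled copies of the base pulse pass the check while the morphologically different pulse is rejected.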

2.4. Model Training

This study examined the efficacy of personalized against generalized training strategies to identify significant alterations in blood pressure levels. As delineated in Section 2.2, the data collection protocol was meticulously structured to induce BP variations through the execution of both physical and cognitive tasks. This setup enabled a thorough investigation of BP fluctuations in individuals subjected to diverse stimuli. In this context, a macroscopic variation in BP was defined as the difference between the reference values measured throughout the data collection procedure, regardless of the magnitude. Consequently, the phases of the data capture process (i.e., REST, CT, AE) were used as target labels for the analysis, as they inherently reflected BP changes.
Our investigation compared the outcomes derived from applying ten different person-specific models (PSM) against those obtained by a person-independent model (PIM) when applied to the identical dataset, utilizing an RF classifier. Although features were extracted from signals at both sites, only those from the thumb were utilized. Pulse waves collected from the elbow were used to calculate the PTT, which was then used as a feature. Figure 4 (left branch) shows the workflow for our generalized approach. To optimize the model’s performance, we used a Leave-One-Subject-Out strategy across all users in the dataset. The optimization process involved the following parameters: the number of trees in each forest, which ranged from 50 to 100; the maximum depth of each tree in the forest, which ranged from 10 to 50; and the number of features used in the training process, which ranged from 3 to 25. The feature selection process was applied only at the training stages by ranking the first n features according to the mutual information between each feature and the target label. The right branch, highlighted in red, summarizes the ten personalized strategies.
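The generalized workflow (Leave-One-Subject-Out evaluation over a hyperparameter grid, with features ranked by a relevance score) can be outlined with a short sketch. The helper names are hypothetical, the grid uses only the endpoints of the ranges reported above because the step sizes are not stated, and the mutual information scores are assumed to be computed elsewhere on the training folds only.

```python
from itertools import product


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) so that
    each subject is used exactly once as the test set."""
    for held_out in sorted(set(subject_ids)):
        train = [k for k, s in enumerate(subject_ids) if s != held_out]
        test = [k for k, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


def hyperparameter_grid(n_trees, max_depths, n_features):
    """All combinations of the candidate hyperparameter values."""
    return [
        {"n_trees": t, "max_depth": d, "n_features": f}
        for t, d, f in product(n_trees, max_depths, n_features)
    ]


def top_n_features(scores, n):
    """Names of the n features with the highest relevance score,
    e.g., mutual information with the phase label."""
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    return ranked[:n]


if __name__ == "__main__":
    subjects = ["s1", "s1", "s2", "s3", "s3"]
    for held_out, train, test in leave_one_subject_out(subjects):
        print(held_out, train, test)
    # Endpoints of the reported ranges; intermediate steps are unknown.
    grid = hyperparameter_grid((50, 100), (10, 50), (3, 25))
    print(len(grid), "candidate configurations")
```

Restricting the feature ranking to the training indices of each fold keeps the held-out subject from influencing feature selection.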
The tested PSMs differed in the quantity of data used during the training phase and the fraction of the target subject data employed to personalize the model. Starting from $PSM_{SD}$, in which we used 50% of the data from the kth subject for training, data from randomly selected individuals were gradually added to the training set. Specifically, the number of source subjects varied across 5, 10, and 15 individuals. Different fractions of the target subject data were also tested to customize the model. This fraction was progressively expanded, beginning from an initial value of 15%, and subsequently increased to 30% and 50%. Each combination of these parameters, when applied to the RF, was labeled as $PSM_{i,j}$, where $i$ identifies the number of source individuals, $i \in \{5, 10, 15\}$, and $j$ refers to the percentage of data belonging to the kth target subject, $j \in \{15\%, 30\%, 50\%\}$. The right side of Figure 4 details the workflow followed by each $PSM_{i,j}$ before applying the RF model. The initial step involved randomly selecting a portion of data samples from each class. To avoid class imbalance, we made sure that each class was equally represented by selecting 15%, 30%, or 50% of pulse waveforms from each class. Following this, pulse waveforms from different source subjects (5, 10, or 15) were included.
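A minimal sketch of how a $PSM_{i,j}$ training set could be assembled is given below. It assumes that each pulse is stored as a (subject, phase, features) tuple and that the target pulses not used for personalization form the test set; the latter is an assumption, as is the function name.

```python
import random


def build_psm_split(pulses, target, n_sources, target_fraction, seed=0):
    """Return (train_indices, test_indices) for one target subject.

    pulses: list of (subject_id, phase_label, features) tuples.
    The training set holds every pulse of n_sources randomly chosen
    source subjects plus target_fraction of the target pulses, drawn
    separately from each phase. The remaining target pulses are used
    for testing (assumption).
    """
    rng = random.Random(seed)
    others = sorted({s for s, _, _ in pulses if s != target})
    sources = set(rng.sample(others, n_sources))

    # Group the target subject's pulse indices by phase label.
    by_phase = {}
    for idx, (subject, phase, _) in enumerate(pulses):
        if subject == target:
            by_phase.setdefault(phase, []).append(idx)

    # Draw the same fraction from every phase.
    target_train = []
    for phase in sorted(by_phase):
        idxs = by_phase[phase]
        k = max(1, round(target_fraction * len(idxs)))
        target_train.extend(rng.sample(idxs, k))

    chosen = set(target_train)
    train = sorted(
        [i for i, (s, _, _) in enumerate(pulses) if s in sources]
        + target_train
    )
    test = [
        i for i, (s, _, _) in enumerate(pulses)
        if s == target and i not in chosen
    ]
    return train, test


if __name__ == "__main__":
    phases = ("REST", "CT", "AE")
    data = [(s, p, None) for s in "ABCDEFG" for p in phases
            for _ in range(10)]
    train, test = build_psm_split(data, "A", n_sources=5,
                                  target_fraction=0.30)
    print(len(train), "training pulses,", len(test), "test pulses")
```

Sampling within each phase applies the same fraction to REST, CT, and AE, which mirrors the class-balancing step described above.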
Then, unlike the generalized approach, the PSM method incorporated a 10-fold cross-validation to fine-tune the model’s hyperparameters and identify the most informative subset of features. Finally, all the mentioned solutions applied the RF model to predict the actual class of the input pulses. The output from each subgroup was merged for visualization purposes, although each PSM was tested individually.

2.5. Evaluation Metrics

As described in the previous subsection, the ten presented models were trained and tested for each subject in the dataset. Therefore, metrics such as accuracy, precision, recall, and F1-score were computed to evaluate the fluctuations in classification performance from subject to subject. Finally, an average of all indexes was computed along with its standard deviation to summarize the performance of each model. In a multiclass classification problem with three classes (REST, CT, and AE), the definitions are as follows:
  • True positives (TP): correctly predicted instances of a class.
  • False positives (FP): instances incorrectly predicted as a certain class.
  • False negatives (FN): instances of a class that are incorrectly predicted as another class.
  • True negatives (TN): all instances that are correctly not classified as the class under evaluation.
Figure 5 clarifies how TP, FP, FN, and TN are identified respectively for each class.
The accuracy score, computed as the ratio of correctly predicted instances to the total number of instances, was used to quantify the correctness of the predicted labels compared to the actual labels (Equation (1)).
$$\mathrm{Accuracy} = \frac{\text{True Positives for All Classes}}{\text{Total Instances}} \tag{1}$$
Given these definitions, the evaluation metrics such as precision and recall scores were computed individually for each of the three classes referring to different sections of the data capture ($\alpha \in \{\mathrm{REST}, \mathrm{CT}, \mathrm{AE}\}$) as specified in Equations (2) and (3).
$$\mathrm{Precision}_{\alpha} = \frac{TP_{\alpha}}{TP_{\alpha} + FP_{\alpha}} \tag{2}$$
$$\mathrm{Recall}_{\alpha} = \frac{TP_{\alpha}}{TP_{\alpha} + FN_{\alpha}} \tag{3}$$
where $FP_{\alpha}$ and $FN_{\alpha}$ are the overall numbers of false positives and false negatives referred to the target class $\alpha \in \{\mathrm{REST}, \mathrm{CT}, \mathrm{AE}\}$ under evaluation.
Then, for every tested subject $sub_i$, the macro-averaged value of precision and recall, Equations (4) and (5), was calculated according to the following:
$$\overline{\mathrm{Precision}}_{sub_i} = \frac{1}{N} \sum_{\alpha=1}^{N} \mathrm{Precision}_{\alpha} \tag{4}$$
$$\overline{\mathrm{Recall}}_{sub_i} = \frac{1}{N} \sum_{\alpha=1}^{N} \mathrm{Recall}_{\alpha} \tag{5}$$
where $N$ is the number of classes occurring in this study.
Finally, the macro-averaged F1-score was computed as reported in Equation (6):
$$\overline{F1}_{sub_i} = \frac{1}{N} \sum_{\alpha=1}^{N} \frac{2 \times \mathrm{Precision}_{\alpha} \times \mathrm{Recall}_{\alpha}}{\mathrm{Precision}_{\alpha} + \mathrm{Recall}_{\alpha}} \tag{6}$$
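As an illustration of Equations (1) to (6), the sketch below computes the per-subject metrics from lists of true and predicted phase labels. The function name is hypothetical, and returning 0 when a denominator is zero is an assumption, since the text does not specify how undefined ratios are handled.

```python
def macro_metrics(y_true, y_pred, classes=("REST", "CT", "AE")):
    """Accuracy and macro-averaged precision, recall, and F1-score."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1_scores = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        # Undefined ratios are set to 0.0 (assumption).
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)

    n = len(classes)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1_scores) / n,
    }


if __name__ == "__main__":
    y_true = ["REST", "REST", "CT", "CT", "AE", "AE"]
    y_pred = ["REST", "CT", "CT", "CT", "AE", "REST"]
    for name, value in macro_metrics(y_true, y_pred).items():
        print(f"{name}: {value:.3f}")
```

Note that the macro-averaged F1-score in Equation (6) averages the per-class F1 values, which in general differs from the harmonic mean of the macro-averaged precision and recall.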

3. Results

3.1. Data Processing

Table 4 shows the results of the three processing stages described divided per section of the data capture (REST, CT, AE), for a single site.
After eliminating epochs corrupted by motion artifacts, the total number of segmented waveforms amounted to 19,274, distributed as follows: 6348 for the resting phase (REST), 6213 during cognitive testing (CT), and 6713 after the physical exercise phase (AE). Data collected from subject #26 were discarded due to corruption of the recording on both sites during the CT section. The variance in the number of detected waves aligned with the execution of the scheduled tasks during the data capture. Specifically, the approximately 400-wave difference between the REST section and the measurement following treadmill walking could be attributed to the observed increases in heart rate (HR) and BP in the measurements conducted with the reference device. Regarding the CT section, although there was an increase in SBP and DBP values (Table 5), the heart rate remained essentially constant compared with the resting value. This phenomenon was reflected in the number of waves detected (6213, CT vs. 6348, REST).
As a result of the PWQA, approximately 3.2% of the total pulses were excluded due to their insufficient similarity to the reference template. Due to the low data quality found in the CT section, data from subjects #17 and #29 were discarded from the dataset used for data analysis. Finally, following the validation of the fiducial points, an additional 4% of data points were discarded for a total of 17,886 pulse waves used in the analysis phase collected from 28 out of 31 subjects.

3.2. Model Evaluation

Table 6 compares the aggregated BP classification performance between ten different PSMs with the results achieved using a generalized approach. The results were expressed in terms of mean value and related standard deviation using the scoring criteria (accuracy, precision, recall, and F1-score) mentioned in Section 2.5. The evaluation metrics computed for each subject according to the training strategy are reported in Table A1, Table A2, Table A3, Table A4, Table A5, Table A6, Table A7 and Table A8 in Appendix A. The person-independent model, denoted as PIM, was trained across the complete dataset employing a Leave-One-Subject-Out cross-validation. Subsequently, performance evaluation was conducted by aggregating the outcomes obtained for each individual. No personalization was applied in this case. The low scores retrieved for each metric (0.36, 0.36, 0.31, and 0.37) suggest the requirement for personalization to model the PPG-BP relationship effectively.
Figure 6 displays the results gathered using the RF trained using 50% of the data of the target subject under investigation ($PSM_{SD}$). In this case, subjects #2, #14, #21, #25, and #28 displayed a marked decrease in classification performance, showcasing accuracy values of 0.63, 0.67, 0.75, 0.78, and 0.67, alongside precision values of 0.43, 0.5, 0.8, 0.8, and 0.5. Through a systematic assessment using the proposed training approaches, we examined how an alteration in the number of subjects and the percentage of data used for model customization impacted the classification performance. The evaluation metrics computed for each personalized model ($PSM_{i,j}$) are reported in Table 6, and the average accuracy score is represented in Figure 7.
In the latter figure, two discernible trends can be identified. Specifically, the average accuracy increases with the percentage of target subject data utilized during the pre-training phase and decreases as the number of source subjects grows. The observed accuracy values of 96.4%, 95.7%, and 94.5% in the first set ($PSM_{5,50\%}$, $PSM_{10,50\%}$, $PSM_{15,50\%}$) declined to 95.1%, 92.6%, and 91.6% in the second set ($PSM_{5,30\%}$, $PSM_{10,30\%}$, $PSM_{15,30\%}$), and further decreased to 91.4%, 87.4%, and 86% in the third set ($PSM_{5,15\%}$, $PSM_{10,15\%}$, $PSM_{15,15\%}$). We excluded combinations that demonstrated overfitting across multiple subjects by discarding those with accuracy and F1 values below 0.95. Therefore, $PSM_{5,30\%}$, $PSM_{5,50\%}$, and $PSM_{10,50\%}$ were selected. Given the comparable overall performances across subjects for these combinations, our choice for the best combination was guided by a balance between performance metrics and the minimized data requirement for model customization. This led us to favor $PSM_{5,30\%}$.
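The selection rule can be reproduced from the accuracy values quoted above with a few lines of code. The sketch applies only the accuracy criterion, because the per-combination F1 values are reported in Table 6 and are not repeated in the text; the tie-breaking order (least target subject data first, then fewest source subjects) is our reading of the "minimized data requirement" criterion, and the function name is hypothetical.

```python
# Average accuracy per (source subjects, % of target data), from the text.
ACCURACY = {
    (5, 50): 0.964, (10, 50): 0.957, (15, 50): 0.945,
    (5, 30): 0.951, (10, 30): 0.926, (15, 30): 0.916,
    (5, 15): 0.914, (10, 15): 0.874, (15, 15): 0.860,
}


def select_psm(accuracy, min_accuracy=0.95):
    """Return (retained combinations, preferred combination).

    Combinations below min_accuracy are discarded. Among the rest,
    the one needing the least target data is preferred, and ties are
    broken by the smaller number of source subjects (assumption).
    """
    kept = sorted(c for c, acc in accuracy.items() if acc >= min_accuracy)
    best = min(kept, key=lambda c: (c[1], c[0]))
    return kept, best


if __name__ == "__main__":
    kept, best = select_psm(ACCURACY)
    print("retained:", kept)
    print("preferred:", best)
```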

4. Discussion

This study compared the performance of person-dependent and generalized models adopted to track BP macro-variations associated with physical or cognitive workload using a random forest classifier. This model was chosen due to its ability to handle the nonlinear relationships that exist between the extracted features and the variation in BP [57]. In other studies, RF outperformed other nonlinear models such as neural networks and SVMs with a nonlinear kernel [60]. Moreover, RF is less prone to overfitting compared to these two ML algorithms [29]. Generalized solutions often struggle with the high inter-subject variability within the dataset, making it challenging to develop a universally applicable model. The choice between personalized and universal models depends on the specific context and objectives of the problem being addressed.
Personalized models, finely tuned to individual users’ characteristics, take into account factors like age, gender, medical history, and lifestyle to provide more accurate and relevant predictions of BP. This tailored precision proves particularly crucial for individuals affected by complex health conditions or unique risk factors. Despite these advantages, the construction and maintenance of personalized models for each user pose challenges. This process can be resource-intensive, especially in the field of large-scale applications involving a significant number of subjects. Moreover, privacy and data protection concerns come to the forefront, as the development of personalized models often necessitates access to sensitive user data.
Generalized models, in contrast, are crafted to exhibit proficiency across a diverse spectrum of users without the need for individual customization. This inherent versatility makes them more scalable and simpler to implement, eliminating the necessity for tailoring to each user’s unique attributes. The cost-effectiveness and ease of maintenance associated with generalized models make them particularly advantageous for applications boasting a large user base. However, this broad approach comes with a trade-off since generalized models may fail to capture the distinctive characteristics and preferences of individual users. Consequently, the predictions generated by these models may result in a lack of accuracy compared to their personalized counterparts.
This phenomenon, highlighted in [61], is also reflected in our findings where the averaged metrics of the generalized approach (0.36, 0.36, 0.31, 0.37) underline the difficulties in defining a univocal representative model for subjects with different physiological characteristics. A hybrid approach combining personalized and universal models, as investigated in this study, may be beneficial for blood pressure monitoring. A universal model could be used as a baseline to provide initial predictions for all users, and personalized models could be applied to increase the model’s performance where personalization is deemed critical, such as users with complex health conditions or unique risk factors accommodating the inherent diversity in BP patterns among different subjects.
In [62], the authors used a transfer learning technique that personalized specific layers of a pre-trained network to improve the performance of PPG-based BP estimation, highlighting the influence of the number of data samples and source subjects used for training. Our analysis of the results shows that, on average, all the PSMs improved the performance of the generalized model regardless of the number of source subjects employed for training. Moreover, by observing the metrics displayed in Table 6, strategies including data obtained from different individuals demonstrated better performance in comparison to the model constructed exclusively using data from the tested subject ($PSM_{SD}$) where, as reported in Figure 6, the classification performance of eight subjects witnessed a substantial decline. Subject #2 emerged as the most adversely affected, with metrics dropping to 0.63, 0.43, 0.67, and 0.52 for accuracy, precision, recall, and F1-score, respectively. These fluctuations in classification performance are a direct consequence of overfitting, whereby the model cannot correctly predict data that differ from the small training set available. To mitigate this issue, we included data from 5, 10, or 15 randomly selected subjects from the dataset in addition to diverse fractions of the target subject’s data (15%, 30%, 50%). In this way, we were able to evaluate the behavior of the model according to different sizes of the training set, degrees of personalization, and combinations of hyperparameters. Table 6 revealed a distinct inverse correlation between the classification metrics and the increase in the number of individuals. This diminishing pattern suggests the potential implications linked to the higher variability introduced by additional source subjects with respect to the initial quantity of data used to pre-train the model.
Hence, this phenomenon may reduce random forest customization and consequently lead to poorer classification performance for the target subject under evaluation. In fact, as depicted in Figure 7, $PSM_{5,15\%}$, $PSM_{10,15\%}$, and $PSM_{15,15\%}$ showed a more pronounced decrease in accuracy as the number of subjects increased compared to the cases with 30% and 50% of the target subject’s data.
This phenomenon is further visible in Figure 8 and Figure 9. Notably, when utilizing only 30% of the data for the pre-training stage, this adverse trend was further accentuated by a more pronounced variability (Figure 8b,c) compared to the scenario with 50% of the data (Figure 9b,c), where the standard deviation was progressively reduced. In the definition of the best solution within the context of our application, we opted to discard any tested combinations exhibiting aggregated accuracy and F1 values below 0.95. This approach ensured that combinations displaying overfitting across multiple subjects were not considered. As a result, our selected PSMs were confined to $PSM_{5,30\%}$, $PSM_{5,50\%}$, and $PSM_{10,50\%}$. Upon analyzing the performance of various combinations across subjects within the dataset, it is evident that their overall performance values were generally comparable. However, an exception arose with subject #21 (Figure 8a and Figure 9a,b), which exhibited a drop in performance exceeding 10% compared to the training phase in all three combinations, although slightly less evident in $PSM_{10,50\%}$. This trend was attributed to subject #21 having the lowest number of pulses in the dataset, resulting in a diminished dataset for personalization compared to other subjects. Notably, the combination $PSM_{15,50\%}$ demonstrated a substantial improvement, utilizing more data for personalization along with an increased number of individuals. Hence, due to the similarity observed among the performance metrics, the selection of the best combination was guided by the consideration of the data required for model customization, leading us to favor $PSM_{5,30\%}$. Employing 30% of the total available data, equivalent to approximately 162 s for the personalization phase, represents a noteworthy outcome.
This achievement is particularly significant as it reflects a substantial reduction in the time required for this task compared to the approach outlined in [62], where 250 s of data recording per subject was used for the pre-training stage. Therefore, combining a subset of source subjects with an adequate fraction of data for pre-training leads to increased robustness and generalizability of personalized models across a broader spectrum of cases in BP assessment when compared to standard generalized models.
Despite these improvements, some limitations of the proposed study need to be discussed. The performance of the proposed approach was evaluated on a limited sample of 28 subjects, falling short of the 85 subjects required by the AAMI [29]. To enhance model validation and generalization for accurate BP monitoring, it is crucial to include a diverse range of values that truly represents the population, including both males and females across different age ranges. In future work, we intend to extend the validation process to a larger and more diverse cohort of individuals, in line with the standards set by the AAMI. Typically, to assess blood pressure variations, multiple data collection sessions are conducted over several days to ensure that the algorithm performs consistently over time for the same individual. However, our data collection protocol was designed to induce short-term variations in BP linked to diverse stimuli rather than to support long-term monitoring. Moreover, increased proficiency in the cognitive tests would likely result in reduced BP variation owing to heightened familiarity with the tasks.
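The selection rule described above can be sketched in a few lines. This is only an illustrative sketch, not the authors' code: the aggregated values are copied from Table 6, the rule is read as "keep a combination only if both accuracy and F1-score are at least 0.95", and the tie-break (least target-subject data first, then fewest source subjects) is an assumption that mirrors the stated preference for the combination needing the least personalization data. The names `RESULTS` and `select_strategy` are hypothetical.

```python
# Aggregated test-set (accuracy, F1-score) per person-specific strategy, from Table 6.
# Keys are (number of source subjects, percentage of target-subject data).
RESULTS = {
    (5, 50): (0.964, 0.962),
    (10, 50): (0.957, 0.956),
    (15, 50): (0.945, 0.944),
    (5, 30): (0.951, 0.950),
    (10, 30): (0.926, 0.923),
    (15, 30): (0.916, 0.914),
    (5, 15): (0.914, 0.913),
    (10, 15): (0.874, 0.871),
    (15, 15): (0.860, 0.853),
}

THRESHOLD = 0.95  # combinations below this value are discarded


def select_strategy(results, threshold=THRESHOLD):
    """Return the combinations that pass the threshold and the preferred one."""
    # Assumed reading of the rule: both metrics must reach the threshold.
    kept = {
        combo: metrics
        for combo, metrics in results.items()
        if metrics[0] >= threshold and metrics[1] >= threshold
    }
    # Assumed tie-break: least target data first, then fewest source subjects.
    best = min(kept, key=lambda combo: (combo[1], combo[0]))
    return sorted(kept), best


kept, best = select_strategy(RESULTS)
print("Retained combinations:", kept)
print("Preferred combination:", best)
```

Under these assumptions the three retained combinations and the preferred one coincide with those named in the text, which suggests the rule as read here is at least consistent with the reported choice.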

5. Conclusions

Cuffless blood pressure measurement has gained attention owing to clinical demand and recent advances in data acquisition systems, embedded systems, and machine learning techniques. This paper presented a personalized multiclass classification model aimed at detecting blood pressure variations associated with physical or cognitive workload. Several training strategies were implemented, each differing in the percentage of the dataset used and in the subset of individuals forming the training set. Experimental results demonstrated that including a pre-training stage with data from diverse subjects enabled the model to discern morphological distinctions in beat-to-beat PPG waveforms under various stressors better than a generalized model fitted on the whole dataset. Understanding the regulatory mechanisms influencing blood pressure, combined with a reduction in the number of sensors needed to track it, constitutes a further step toward unobtrusive cuffless BP monitoring and better management of this parameter.

Author Contributions

Conceptualization, A.V., S.T. and B.O.; methodology, A.V. and S.T.; software, A.V. and S.T.; validation, A.V., P.M.R., D.D., S.T. and B.O.; formal analysis, A.V. and S.T.; investigation, A.V., P.M.R., D.D., S.T. and B.O.; resources, P.M.R., D.D., S.T., and B.O.; data curation, A.V., B.O. and S.T.; writing—original draft preparation, A.V. and S.T.; writing—review and editing, A.V., P.M.R., D.D., S.T. and B.O.; visualization, A.V., P.M.R., D.D., S.T. and B.O.; supervision, D.D., S.T. and B.O.; project administration, S.T. and B.O.; funding acquisition, S.T. and B.O. All authors have read and agreed to the published version of the manuscript.

Funding

This publication has emanated from research supported by a research grant from the Disruptive Technologies Innovation Fund (DTIF) project HOLISTICS funded by Enterprise Ireland (EI). Aspects of this research have emanated from research conducted with the financial support of Science Foundation Ireland under Grant 12/RC/2289-P2-INSIGHT and 16/RC/3918-CONFIRM, which are co-funded under the European Regional Development Fund.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki, and approved by the UCC Clinical Research Ethics Committee of the Cork Teaching Hospitals (protocol code ECM 4 (ff) 10 November 2020 and ECM 3 (b) 13 December 2022. Date of approval: 28 November 2022).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study. Written informed consent has been obtained from the patients to publish this paper.

Data Availability Statement

The data presented in this study are available on request from the authors.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

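Table A1. Test set results using the person-independent model (PIM).
| Sub-ID | Test Set Precision | Test Set Recall | Test Set F1 | Test Set Accuracy | RF Max Depth | RF Number of Estimators |
|---|---|---|---|---|---|---|
| 1 | 0.35 | 0.36 | 0.28 | 0.35 | 20 | 80 |
| 2 | 0.19 | 0.22 | 0.20 | 0.23 | 20 | 80 |
| 3 | 0.38 | 0.29 | 0.19 | 0.28 | 20 | 80 |
| 4 | 0.19 | 0.27 | 0.19 | 0.27 | 20 | 80 |
| 5 | 0.54 | 0.36 | 0.32 | 0.34 | 20 | 80 |
| 6 | 0.43 | 0.54 | 0.44 | 0.51 | 20 | 80 |
| 7 | 0.74 | 0.74 | 0.74 | 0.75 | 20 | 80 |
| 8 | 0.36 | 0.40 | 0.35 | 0.39 | 20 | 80 |
| 9 | 0.35 | 0.40 | 0.36 | 0.42 | 20 | 80 |
| 10 | 0.09 | 0.27 | 0.14 | 0.26 | 20 | 80 |
| 11 | 0.41 | 0.45 | 0.40 | 0.53 | 20 | 80 |
| 12 | 0.64 | 0.60 | 0.60 | 0.64 | 20 | 80 |
| 13 | 0.16 | 0.16 | 0.16 | 0.19 | 20 | 80 |
| 14 | 0.77 | 0.59 | 0.53 | 0.60 | 20 | 80 |
| 15 | 0.15 | 0.33 | 0.21 | 0.44 | 20 | 80 |
| 16 | 0.40 | 0.19 | 0.16 | 0.18 | 20 | 80 |
| 17 | - | - | - | - | - | - |
| 18 | 0.30 | 0.30 | 0.29 | 0.30 | 20 | 80 |
| 19 | 0.09 | 0.20 | 0.12 | 0.20 | 20 | 80 |
| 20 | 0.38 | 0.45 | 0.38 | 0.43 | 20 | 80 |
| 21 | 0.36 | 0.27 | 0.20 | 0.29 | 20 | 80 |
| 22 | 0.18 | 0.10 | 0.11 | 0.10 | 20 | 80 |
| 23 | 0.09 | 0.20 | 0.13 | 0.21 | 20 | 80 |
| 24 | 0.46 | 0.54 | 0.49 | 0.56 | 20 | 80 |
| 25 | 0.56 | 0.63 | 0.52 | 0.63 | 20 | 80 |
| 26 | - | - | - | - | - | - |
| 27 | 0.15 | 0.34 | 0.20 | 0.35 | 20 | 80 |
| 28 | 0.72 | 0.60 | 0.53 | 0.59 | 20 | 80 |
| 29 | - | - | - | - | - | - |
| 30 | 0.25 | 0.30 | 0.20 | 0.35 | 20 | 80 |
| 31 | 0.13 | 0.24 | 0.14 | 0.23 | 20 | 80 |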
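Table A2. Test set results using person-specific model $PSM_{SD}$.
| Sub-ID | Test Set Precision | Test Set Recall | Test Set F1 | Test Set Accuracy | RF Max Depth | RF Number of Estimators |
|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 1 | 1 | 50 | 60 |
| 2 | 0.43 | 0.67 | 0.52 | 0.63 | 10 | 60 |
| 3 | 1 | 1 | 1 | 1 | 10 | 60 |
| 4 | 1 | 1 | 1 | 1 | 10 | 60 |
| 5 | 0.95 | 0.94 | 0.94 | 0.95 | 10 | 90 |
| 6 | 0.85 | 0.85 | 0.86 | 0.86 | 50 | 100 |
| 7 | 0.99 | 0.99 | 0.99 | 0.99 | 10 | 60 |
| 8 | 0.98 | 0.98 | 0.98 | 0.98 | 10 | 60 |
| 9 | 1 | 1 | 1 | 1 | 10 | 60 |
| 10 | 0.98 | 0.98 | 0.98 | 0.98 | 10 | 60 |
| 11 | 0.95 | 0.97 | 0.96 | 0.96 | 10 | 60 |
| 12 | 0.99 | 0.98 | 0.98 | 0.99 | 10 | 90 |
| 13 | 1 | 1 | 1 | 1 | 20 | 70 |
| 14 | 0.5 | 0.67 | 0.56 | 0.68 | 10 | 60 |
| 15 | 1 | 1 | 1 | 1 | 10 | 60 |
| 16 | 1 | 1 | 1 | 1 | 10 | 60 |
| 17 | - | - | - | - | - | - |
| 18 | 1 | 1 | 1 | 1 | 10 | 60 |
| 19 | 1 | 1 | 1 | 1 | 10 | 80 |
| 20 | 0.99 | 0.85 | 0.84 | 0.84 | 10 | 70 |
| 21 | 0.8 | 0.78 | 0.77 | 0.75 | 20 | 60 |
| 22 | 1 | 1 | 1 | 1 | 10 | 100 |
| 23 | 1 | 1 | 1 | 1 | 50 | 60 |
| 24 | 1 | 1 | 1 | 1 | 10 | 60 |
| 25 | 0.8 | 0.78 | 0.78 | 0.78 | 10 | 60 |
| 26 | - | - | - | - | - | - |
| 27 | 1 | 1 | 1 | 1 | 10 | 60 |
| 28 | 0.50 | 0.67 | 0.55 | 0.67 | 10 | 90 |
| 29 | - | - | - | - | - | - |
| 30 | 0.98 | 0.98 | 0.98 | 0.98 | 30 | 60 |
| 31 | 0.86 | 0.86 | 0.86 | 0.86 | 10 | 70 |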
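Table A3. Test set results using person-specific model $PSM_{5,30\%}$.
| Sub-ID | Test Set Precision | Test Set Recall | Test Set F1 | Test Set Accuracy | RF Max Depth | RF Number of Estimators | Subjects for Training |
|---|---|---|---|---|---|---|---|
| 1 | 0.97 | 0.97 | 0.97 | 0.97 | 40 | 80 | 2 10 12 13 31 |
| 2 | 0.97 | 0.97 | 0.97 | 0.97 | 20 | 70 | 3 4 7 20 23 |
| 3 | 0.96 | 0.96 | 0.96 | 0.96 | 20 | 70 | 12 21 23 24 30 |
| 4 | 0.97 | 0.96 | 0.97 | 0.97 | 50 | 100 | 5 9 12 15 25 |
| 5 | 0.89 | 0.89 | 0.89 | 0.89 | 30 | 70 | 15 18 20 25 31 |
| 6 | 0.90 | 0.90 | 0.90 | 0.90 | 20 | 70 | 7 8 20 24 28 |
| 7 | 0.94 | 0.94 | 0.94 | 0.94 | 50 | 80 | 3 7 12 13 31 |
| 8 | 0.97 | 0.97 | 0.97 | 0.97 | 20 | 70 | 3 7 12 13 31 |
| 9 | 0.97 | 0.97 | 0.97 | 0.98 | 50 | 70 | 2 12 13 18 19 |
| 10 | 0.95 | 0.95 | 0.95 | 0.94 | 40 | 90 | 4 13 20 21 24 |
| 11 | 0.97 | 0.96 | 0.97 | 0.97 | 20 | 90 | 7 8 14 15 18 |
| 12 | 0.96 | 0.94 | 0.95 | 0.95 | 10 | 70 | 3 4 14 16 22 |
| 13 | 0.92 | 0.91 | 0.91 | 0.93 | 50 | 80 | 3 5 8 12 30 |
| 14 | 0.99 | 0.99 | 0.99 | 0.99 | 30 | 90 | 4 11 12 24 30 |
| 15 | 0.98 | 0.97 | 0.97 | 0.97 | 20 | 70 | 5 11 21 24 31 |
| 16 | 0.97 | 0.97 | 0.97 | 0.97 | 30 | 80 | 1 12 20 25 31 |
| 17 | - | - | - | - | - | - | - |
| 18 | 0.96 | 0.95 | 0.96 | 0.96 | 50 | 100 | 11 20 23 27 30 |
| 19 | 1.00 | 1.00 | 1.00 | 1.00 | 20 | 100 | 2 3 14 24 30 |
| 20 | 0.94 | 0.94 | 0.94 | 0.94 | 30 | 100 | 1 5 8 30 31 |
| 21 | 0.82 | 0.81 | 0.82 | 0.81 | 20 | 100 | 3 4 5 19 25 |
| 22 | 0.97 | 0.97 | 0.97 | 0.97 | 30 | 80 | 12 15 27 28 30 |
| 23 | 0.95 | 0.95 | 0.95 | 0.95 | 30 | 70 | 2 10 14 19 21 |
| 24 | 0.99 | 0.99 | 0.99 | 0.99 | 20 | 90 | 13 19 27 30 31 |
| 25 | 0.93 | 0.93 | 0.93 | 0.93 | 30 | 90 | 18 27 28 30 31 |
| 26 | - | - | - | - | - | - | - |
| 27 | 1.00 | 1.00 | 1.00 | 1.00 | 50 | 70 | 8 9 16 25 28 |
| 28 | 0.95 | 0.94 | 0.94 | 0.94 | 30 | 70 | 2 6 20 24 31 |
| 29 | - | - | - | - | - | - | - |
| 30 | 0.92 | 0.93 | 0.93 | 0.93 | 50 | 70 | 1 8 11 28 31 |
| 31 | 0.94 | 0.94 | 0.94 | 0.94 | 30 | 70 | 3 19 20 22 30 |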
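As a quick cross-check of how the per-subject results relate to the aggregated figures, the accuracy column of Table A3 can be averaged over the 28 analysed subjects. The sketch below assumes an unweighted mean across subjects and uses the sample standard deviation; which estimator was used for the reported spread is not stated here, so the standard deviation is only indicative. The name `ACCURACY_A3` is a hypothetical label for the values copied from the table.

```python
from statistics import mean, stdev

# Test-set accuracy per subject from Table A3 (subjects 17, 26 and 29 have no results).
ACCURACY_A3 = {
    1: 0.97, 2: 0.97, 3: 0.96, 4: 0.97, 5: 0.89, 6: 0.90, 7: 0.94,
    8: 0.97, 9: 0.98, 10: 0.94, 11: 0.97, 12: 0.95, 13: 0.93, 14: 0.99,
    15: 0.97, 16: 0.97, 18: 0.96, 19: 1.00, 20: 0.94, 21: 0.81, 22: 0.97,
    23: 0.95, 24: 0.99, 25: 0.93, 27: 1.00, 28: 0.94, 30: 0.93, 31: 0.94,
}

values = list(ACCURACY_A3.values())
avg_accuracy = mean(values)    # unweighted average over subjects (assumption)
sd_accuracy = stdev(values)    # sample standard deviation (assumption)

print(f"subjects: {len(values)}")
print(f"accuracy: {avg_accuracy:.3f} +/- {sd_accuracy:.3f}")
```

The mean computed this way is about 0.951, in line with the aggregated accuracy reported for this strategy; small differences in the spread are expected because the per-subject values are rounded to two decimals.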
Table A4. Test set results using person-specific model $PSM_{10,30\%}$.
| Sub-ID | Test Set Precision | Test Set Recall | Test Set F1 | Test Set Accuracy | RF Max Depth | RF Number of Estimators | Subjects for Training |
|---|---|---|---|---|---|---|---|
| 1 | 0.93 | 0.93 | 0.93 | 0.93 | 20 | 100 | 2 7 8 12 13 14 15 20 22 31 |
| 2 | 0.88 | 0.88 | 0.88 | 0.88 | 40 | 90 | 1 3 7 15 18 21 22 23 28 31 |
| 3 | 0.97 | 0.97 | 0.97 | 0.97 | 40 | 90 | 2 8 9 11 12 13 16 20 25 27 |
| 4 | 0.97 | 0.96 | 0.97 | 0.97 | 30 | 80 | 1 2 10 15 16 21 22 23 27 31 |
| 5 | 0.90 | 0.90 | 0.90 | 0.91 | 40 | 100 | 2 4 6 9 21 22 23 27 28 30 |
| 6 | 0.86 | 0.85 | 0.85 | 0.85 | 50 | 80 | 4 5 7 9 13 19 21 22 27 31 |
| 7 | 0.96 | 0.96 | 0.96 | 0.96 | 30 | 100 | 3 4 5 8 10 14 15 20 22 31 |
| 8 | 0.97 | 0.97 | 0.97 | 0.97 | 40 | 90 | 4 5 6 9 12 13 14 16 23 27 |
| 9 | 0.97 | 0.97 | 0.97 | 0.97 | 50 | 100 | 1 3 4 14 15 18 21 23 24 25 |
| 10 | 0.93 | 0.93 | 0.93 | 0.93 | 20 | 100 | 6 9 12 14 15 16 22 25 28 30 |
| 11 | 0.87 | 0.85 | 0.86 | 0.88 | 30 | 100 | 3 10 12 18 21 23 24 27 28 30 |
| 12 | 0.89 | 0.86 | 0.86 | 0.89 | 50 | 100 | 1 4 6 7 11 18 19 21 23 31 |
| 13 | 0.95 | 0.92 | 0.93 | 0.93 | 20 | 90 | 2 3 6 10 11 14 15 18 20 28 |
| 14 | 0.97 | 0.97 | 0.97 | 0.97 | 30 | 80 | 4 7 9 11 12 16 19 20 23 25 |
| 15 | 0.99 | 0.99 | 0.99 | 0.99 | 30 | 100 | 1 2 4 11 13 20 24 25 27 31 |
| 16 | 0.92 | 0.92 | 0.92 | 0.92 | 50 | 100 | 6 10 12 13 15 19 25 27 28 30 |
| 17 | - | - | - | - | - | - | - |
| 18 | 0.93 | 0.93 | 0.93 | 0.94 | 50 | 70 | 1 5 6 10 14 20 21 27 28 30 |
| 19 | 0.98 | 0.98 | 0.98 | 0.98 | 50 | 70 | 6 9 10 14 20 22 23 25 27 28 |
| 20 | 0.94 | 0.94 | 0.94 | 0.94 | 50 | 90 | 1 2 6 7 8 11 19 21 27 31 |
| 21 | 0.82 | 0.77 | 0.78 | 0.76 | 30 | 90 | 2 5 6 10 14 16 22 24 25 31 |
| 22 | 0.88 | 0.85 | 0.85 | 0.86 | 40 | 80 | 4 5 6 7 10 12 18 24 25 30 |
| 23 | 0.95 | 0.95 | 0.95 | 0.95 | 40 | 70 | 8 9 16 19 20 21 24 25 28 31 |
| 24 | 0.99 | 0.99 | 0.99 | 0.99 | 30 | 100 | 1 4 6 8 10 11 14 22 27 30 |
| 25 | 0.92 | 0.92 | 0.91 | 0.92 | 50 | 100 | 1 3 4 12 15 19 21 22 28 31 |
| 26 | - | - | - | - | - | - | - |
| 27 | 0.94 | 0.94 | 0.94 | 0.94 | 40 | 90 | 6 7 9 14 16 18 24 28 30 31 |
| 28 | 0.97 | 0.97 | 0.97 | 0.97 | 40 | 100 | 1 4 8 13 14 18 19 23 24 25 |
| 29 | - | - | - | - | - | - | - |
| 30 | 0.91 | 0.88 | 0.89 | 0.91 | 30 | 90 | 1 4 6 10 13 18 22 25 27 28 |
| 31 | 0.90 | 0.90 | 0.90 | 0.90 | 50 | 80 | 2 5 10 15 16 19 20 23 24 30 |
Table A5. Test set results using person-specific model $PSM_{15,30\%}$.
| Sub-ID | Test Set Precision | Test Set Recall | Test Set F1 | Test Set Accuracy | RF Max Depth | RF Number of Estimators | Subjects for Training |
|---|---|---|---|---|---|---|---|
| 1 | 0.98 | 0.98 | 0.98 | 0.98 | 40 | 90 | 3 6 9 12 14 18 19 20 22 23 24 27 28 30 31 |
| 2 | 0.91 | 0.89 | 0.90 | 0.89 | 50 | 100 | 1 4 5 7 8 12 13 16 18 19 21 22 27 28 30 |
| 3 | 0.99 | 0.99 | 0.99 | 0.99 | 40 | 100 | 1 2 4 5 6 8 11 13 15 19 22 23 24 27 30 |
| 4 | 0.94 | 0.92 | 0.92 | 0.93 | 20 | 100 | 1 3 5 8 9 11 12 14 15 16 18 20 21 22 27 |
| 5 | 0.89 | 0.89 | 0.88 | 0.89 | 40 | 100 | 3 4 7 11 12 13 15 18 19 20 21 22 24 30 31 |
| 6 | 0.87 | 0.87 | 0.87 | 0.86 | 40 | 90 | 3 5 9 10 15 16 18 19 20 22 23 25 27 28 30 |
| 7 | 0.97 | 0.96 | 0.97 | 0.97 | 20 | 70 | 3 4 5 6 8 11 12 13 16 19 22 23 25 27 31 |
| 8 | 0.95 | 0.95 | 0.95 | 0.95 | 50 | 80 | 1 2 5 6 10 14 15 16 18 19 23 25 27 28 30 |
| 9 | 0.90 | 0.90 | 0.90 | 0.90 | 30 | 100 | 2 6 7 10 11 12 15 16 18 21 22 23 24 28 30 |
| 10 | 0.95 | 0.95 | 0.95 | 0.95 | 20 | 100 | 2 4 6 8 9 15 18 19 22 23 24 27 28 30 31 |
| 11 | 0.92 | 0.90 | 0.91 | 0.92 | 20 | 90 | 2 4 5 6 7 8 12 13 15 16 21 22 27 30 31 |
| 12 | 0.91 | 0.90 | 0.90 | 0.92 | 40 | 70 | 2 3 7 9 10 13 15 18 19 20 21 25 27 30 31 |
| 13 | 0.84 | 0.86 | 0.85 | 0.86 | 20 | 70 | 4 5 6 9 12 14 15 16 20 22 23 24 25 28 31 |
| 14 | 0.96 | 0.96 | 0.96 | 0.96 | 50 | 100 | 1 2 6 8 9 13 16 19 20 21 22 23 24 25 30 |
| 15 | 0.94 | 0.92 | 0.93 | 0.92 | 20 | 90 | 1 2 4 5 8 11 12 13 14 16 19 22 23 28 30 |
| 16 | 0.96 | 0.95 | 0.95 | 0.95 | 40 | 90 | 1 2 3 4 7 8 10 11 12 18 19 20 23 24 25 |
| 17 | - | - | - | - | - | - | - |
| 18 | 0.96 | 0.96 | 0.96 | 0.96 | 40 | 100 | 3 6 10 12 13 14 15 16 19 20 21 24 25 27 30 |
| 19 | 0.95 | 0.94 | 0.94 | 0.94 | 30 | 90 | 1 3 5 6 7 11 12 14 18 21 23 24 25 30 31 |
| 20 | 0.92 | 0.92 | 0.92 | 0.92 | 30 | 70 | 1 2 3 5 6 7 9 11 13 14 22 25 27 28 31 |
| 21 | 0.83 | 0.79 | 0.80 | 0.79 | 50 | 90 | 1 2 4 5 6 7 8 9 14 15 20 23 27 28 31 |
| 22 | 0.89 | 0.89 | 0.89 | 0.89 | 20 | 80 | 1 2 5 6 8 10 13 15 16 18 20 24 25 27 31 |
| 23 | 0.90 | 0.90 | 0.90 | 0.90 | 50 | 80 | 4 5 8 10 11 12 14 15 19 21 22 24 25 28 30 |
| 24 | 0.89 | 0.89 | 0.88 | 0.88 | 30 | 80 | 4 5 6 7 9 10 13 18 20 21 23 25 27 28 31 |
| 25 | 0.91 | 0.91 | 0.91 | 0.91 | 30 | 90 | 2 3 4 5 8 9 12 13 16 20 22 24 27 28 31 |
| 26 | - | - | - | - | - | - | - |
| 27 | 0.96 | 0.95 | 0.95 | 0.95 | 20 | 100 | 2 3 4 5 7 9 18 19 20 21 22 23 28 30 31 |
| 28 | 0.95 | 0.95 | 0.95 | 0.95 | 30 | 80 | 2 4 5 8 10 11 12 13 14 16 18 19 22 23 25 |
| 29 | - | - | - | - | - | - | - |
| 30 | 0.85 | 0.86 | 0.85 | 0.86 | 50 | 90 | 1 2 4 5 7 9 11 13 16 18 19 22 23 27 31 |
| 31 | 0.87 | 0.85 | 0.85 | 0.86 | 20 | 90 | 1 2 4 6 7 8 12 13 16 18 20 23 27 28 30 |
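Table A6. Test set results using person-specific model $PSM_{5,50\%}$.
| Sub-ID | Test Set Precision | Test Set Recall | Test Set F1 | Test Set Accuracy | RF Max Depth | RF Number of Estimators | Subjects for Training |
|---|---|---|---|---|---|---|---|
| 1 | 0.98 | 0.98 | 0.98 | 0.98 | 30 | 80 | 10 14 16 27 31 |
| 2 | 0.98 | 0.98 | 0.98 | 0.98 | 20 | 60 | 8 9 22 24 25 |
| 3 | 0.98 | 0.98 | 0.98 | 0.98 | 20 | 100 | 9 10 16 21 25 |
| 4 | 0.98 | 0.98 | 0.98 | 0.98 | 40 | 100 | 13 20 22 30 31 |
| 5 | 0.94 | 0.94 | 0.94 | 0.95 | 40 | 80 | 7 15 22 28 31 |
| 6 | 0.92 | 0.92 | 0.92 | 0.92 | 50 | 80 | 7 8 15 21 22 |
| 7 | 0.96 | 0.96 | 0.96 | 0.96 | 20 | 100 | 1 15 19 28 30 |
| 8 | 0.98 | 0.98 | 0.98 | 0.98 | 50 | 70 | 4 12 20 23 27 |
| 9 | 1 | 1 | 1 | 1 | 20 | 90 | 5 6 8 24 25 |
| 10 | 0.97 | 0.97 | 0.97 | 0.97 | 20 | 80 | 8 9 11 19 23 |
| 11 | 0.98 | 0.97 | 0.97 | 0.98 | 20 | 100 | 2 10 15 19 23 |
| 12 | 0.97 | 0.95 | 0.96 | 0.96 | 20 | 60 | 2 4 15 16 25 |
| 13 | 0.97 | 0.96 | 0.96 | 0.96 | 30 | 80 | 6 10 11 21 27 |
| 14 | 1 | 1 | 1 | 1 | 40 | 90 | 3 7 20 21 27 |
| 15 | 0.98 | 0.98 | 0.98 | 0.98 | 20 | 100 | 6 10 12 16 27 |
| 16 | 0.99 | 0.99 | 0.99 | 0.99 | 30 | 70 | 4 5 9 10 13 |
| 17 | - | - | - | - | - | - | - |
| 18 | 0.97 | 0.98 | 0.97 | 0.97 | 30 | 90 | 10 11 14 20 31 |
| 19 | 0.98 | 0.98 | 0.98 | 0.98 | 20 | 100 | 5 6 8 10 13 |
| 20 | 0.97 | 0.97 | 0.97 | 0.97 | 30 | 80 | 6 8 22 23 28 |
| 21 | 0.88 | 0.88 | 0.87 | 0.86 | 50 | 80 | 4 12 14 24 31 |
| 22 | 0.95 | 0.96 | 0.95 | 0.96 | 50 | 80 | 7 15 16 22 28 |
| 23 | 1 | 1 | 1 | 1 | 40 | 90 | 9 11 19 22 27 |
| 24 | 0.99 | 0.99 | 0.99 | 0.99 | 40 | 90 | 7 11 16 18 19 |
| 25 | 0.82 | 0.81 | 0.80 | 0.81 | 20 | 80 | 1 14 15 19 22 |
| 26 | - | - | - | - | - | - | - |
| 27 | 1 | 1 | 1 | 1 | 30 | 90 | 1 7 9 12 28 |
| 28 | 0.98 | 0.98 | 0.98 | 0.98 | 30 | 70 | 5 12 21 22 27 |
| 29 | - | - | - | - | - | - | - |
| 30 | 0.94 | 0.95 | 0.95 | 0.95 | 30 | 60 | 1 2 5 21 23 |
| 31 | 0.94 | 0.94 | 0.94 | 0.94 | 40 | 80 | 7 22 23 25 30 |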
Table A7. Test set results using person-specific model $PSM_{10,50\%}$.
| Sub-ID | Test Set Precision | Test Set Recall | Test Set F1 | Test Set Accuracy | RF Max Depth | RF Number of Estimators | Subjects for Training |
|---|---|---|---|---|---|---|---|
| 1 | 0.97 | 0.97 | 0.97 | 0.97 | 20 | 70 | 2 4 5 11 14 18 19 22 25 27 |
| 2 | 0.94 | 0.93 | 0.94 | 0.94 | 30 | 90 | 1 4 12 16 18 21 22 23 24 28 |
| 3 | 0.99 | 0.99 | 0.99 | 0.99 | 30 | 90 | 1 2 8 12 16 19 21 25 28 30 |
| 4 | 0.99 | 0.99 | 0.99 | 0.99 | 30 | 80 | 8 9 11 12 13 15 18 20 23 30 |
| 5 | 0.91 | 0.91 | 0.91 | 0.91 | 20 | 80 | 7 9 11 20 21 23 24 28 30 31 |
| 6 | 0.91 | 0.91 | 0.91 | 0.91 | 20 | 90 | 2 9 15 19 22 23 27 28 30 31 |
| 7 | 0.99 | 0.99 | 0.99 | 0.99 | 40 | 60 | 3 5 12 13 21 22 23 25 27 30 |
| 8 | 0.98 | 0.98 | 0.98 | 0.98 | 20 | 100 | 1 5 10 11 12 14 18 19 28 31 |
| 9 | 0.96 | 0.96 | 0.96 | 0.96 | 40 | 100 | 2 3 5 14 15 19 21 23 24 28 |
| 10 | 0.95 | 0.95 | 0.95 | 0.95 | 30 | 70 | 2 9 12 15 19 23 24 25 28 30 |
| 11 | 0.97 | 0.95 | 0.96 | 0.96 | 40 | 90 | 3 5 8 10 13 21 25 27 28 30 |
| 12 | 0.98 | 0.97 | 0.98 | 0.98 | 40 | 60 | 1 2 5 6 8 20 21 23 27 31 |
| 13 | 0.95 | 0.9 | 0.92 | 0.93 | 30 | 100 | 3 6 11 15 16 19 21 23 24 28 |
| 14 | 0.98 | 0.98 | 0.98 | 0.98 | 50 | 100 | 6 10 15 18 20 23 25 27 28 30 |
| 15 | 1 | 0.99 | 0.99 | 0.99 | 30 | 100 | 4 6 8 9 10 13 14 16 19 24 |
| 16 | 0.95 | 0.95 | 0.95 | 0.95 | 20 | 70 | 1 4 9 12 15 20 22 23 25 27 |
| 17 | - | - | - | - | - | - | - |
| 18 | 0.97 | 0.96 | 0.96 | 0.96 | 30 | 90 | 1 2 6 7 10 13 15 16 21 24 |
| 19 | 0.97 | 0.97 | 0.97 | 0.97 | 20 | 100 | 3 5 6 12 16 18 21 24 25 28 |
| 20 | 0.96 | 0.96 | 0.96 | 0.96 | 30 | 80 | 2 3 5 9 12 15 18 22 24 25 |
| 21 | 0.88 | 0.87 | 0.88 | 0.86 | 50 | 100 | 1 2 3 5 6 8 10 11 15 16 |
| 22 | 0.96 | 0.97 | 0.96 | 0.96 | 30 | 90 | 2 6 8 13 20 21 23 24 27 30 |
| 23 | 0.96 | 0.96 | 0.96 | 0.96 | 30 | 90 | 7 9 11 12 15 19 22 27 28 30 |
| 24 | 0.96 | 0.95 | 0.96 | 0.96 | 20 | 100 | 4 6 7 13 14 18 19 20 27 30 |
| 25 | 0.93 | 0.94 | 0.93 | 0.93 | 40 | 60 | 1 2 3 10 12 14 18 23 28 31 |
| 26 | - | - | - | - | - | - | - |
| 27 | 1 | 1 | 1 | 1 | 40 | 70 | 4 5 16 18 20 21 23 25 30 31 |
| 28 | 0.97 | 0.97 | 0.97 | 0.97 | 30 | 90 | 2 3 6 7 10 15 16 18 21 30 |
| 29 | - | - | - | - | - | - | - |
| 30 | 0.94 | 0.95 | 0.94 | 0.95 | 30 | 90 | 5 7 9 10 11 21 22 25 27 28 |
| 31 | 0.93 | 0.93 | 0.93 | 0.93 | 20 | 90 | 1 3 9 10 11 12 18 21 22 30 |
Table A8. Test set results using person-specific model $PSM_{15,50\%}$.
| Sub-ID | Test Set Precision | Test Set Recall | Test Set F1 | Test Set Accuracy | RF Max Depth | RF Number of Estimators | Subjects for Training |
|---|---|---|---|---|---|---|---|
| 1 | 0.94 | 0.94 | 0.94 | 0.94 | 50 | 90 | 2 3 5 6 9 10 13 15 16 19 22 23 25 27 31 |
| 2 | 0.95 | 0.94 | 0.94 | 0.94 | 20 | 100 | 3 4 5 9 12 13 14 15 16 18 19 20 21 25 30 |
| 3 | 0.97 | 0.97 | 0.97 | 0.97 | 20 | 100 | 1 2 5 9 10 11 12 15 19 21 22 23 24 25 30 |
| 4 | 0.97 | 0.96 | 0.96 | 0.96 | 50 | 80 | 2 3 7 8 10 12 14 16 18 21 22 23 27 28 30 |
| 5 | 0.92 | 0.92 | 0.92 | 0.92 | 20 | 90 | 1 3 10 12 13 14 15 16 18 19 20 24 25 27 28 |
| 6 | 0.9 | 0.91 | 0.9 | 0.9 | 30 | 100 | 1 3 5 7 8 9 11 12 13 16 20 22 25 27 28 |
| 7 | 0.97 | 0.97 | 0.97 | 0.97 | 20 | 80 | 1 2 4 9 10 13 14 16 21 22 23 24 28 30 31 |
| 8 | 0.97 | 0.97 | 0.97 | 0.97 | 50 | 80 | 2 5 7 10 13 14 18 19 21 22 23 25 27 28 31 |
| 9 | 0.97 | 0.97 | 0.97 | 0.97 | 50 | 90 | 2 5 6 7 8 13 14 15 16 19 20 21 23 30 31 |
| 10 | 0.95 | 0.96 | 0.95 | 0.95 | 20 | 100 | 2 3 8 13 14 15 18 21 22 23 24 25 27 28 30 |
| 11 | 0.97 | 0.96 | 0.96 | 0.97 | 20 | 80 | 2 3 7 9 10 12 13 14 15 16 20 22 24 27 28 |
| 12 | 0.96 | 0.95 | 0.96 | 0.96 | 30 | 90 | 3 5 6 8 10 11 18 19 20 22 23 25 27 30 31 |
| 13 | 0.88 | 0.91 | 0.89 | 0.9 | 30 | 90 | 1 2 8 9 10 11 14 15 19 20 21 22 23 25 30 |
| 14 | 0.96 | 0.96 | 0.96 | 0.96 | 50 | 90 | 1 4 6 9 10 11 12 15 16 18 20 23 24 28 31 |
| 15 | 0.98 | 0.98 | 0.98 | 0.98 | 20 | 60 | 1 2 7 9 12 13 14 20 21 22 23 24 25 27 28 |
| 16 | 0.93 | 0.92 | 0.92 | 0.92 | 40 | 80 | 4 5 8 9 10 12 18 20 21 22 25 28 27 30 31 |
| 17 | - | - | - | - | - | - | - |
| 18 | 0.98 | 0.97 | 0.97 | 0.98 | 30 | 90 | 1 3 6 12 13 14 15 16 19 21 22 23 27 28 31 |
| 19 | 0.96 | 0.96 | 0.96 | 0.96 | 40 | 100 | 1 3 4 5 8 10 11 12 13 18 20 21 22 25 27 |
| 20 | 0.94 | 0.94 | 0.93 | 0.93 | 50 | 100 | 2 5 7 8 9 11 12 14 16 18 22 23 25 27 28 |
| 21 | 0.91 | 0.9 | 0.9 | 0.88 | 20 | 100 | 2 4 5 9 10 13 14 16 18 19 20 23 24 27 28 |
| 22 | 0.96 | 0.96 | 0.96 | 0.96 | 40 | 100 | 4 6 7 8 11 13 14 15 19 20 21 23 27 28 30 |
| 23 | 0.97 | 0.97 | 0.97 | 0.97 | 20 | 100 | 1 2 8 9 11 12 14 19 20 21 22 24 27 28 30 |
| 24 | 0.95 | 0.95 | 0.95 | 0.95 | 40 | 90 | 2 3 4 6 8 11 14 16 18 20 21 23 28 30 31 |
| 25 | 0.9 | 0.9 | 0.9 | 0.9 | 30 | 80 | 1 2 3 5 6 7 9 14 16 18 19 21 24 27 28 |
| 26 | - | - | - | - | - | - | - |
| 27 | 1 | 1 | 1 | 1 | 20 | 90 | 2 4 8 13 14 15 16 18 19 20 21 23 24 28 30 |
| 28 | 0.96 | 0.96 | 0.96 | 0.96 | 20 | 80 | 2 5 7 8 9 11 12 14 16 18 19 23 24 28 30 |
| 29 | - | - | - | - | - | - | - |
| 30 | 0.93 | 0.94 | 0.93 | 0.93 | 30 | 90 | 2 5 8 9 10 11 12 14 16 20 21 22 25 28 31 |
| 31 | 0.91 | 0.91 | 0.91 | 0.91 | 30 | 90 | 1 2 3 6 10 11 12 13 14 15 21 22 25 28 30 |

References

  1. Yeates, K.; Lohfeld, L.; Sleeth, J.; Morales, F.; Rajkotia, Y.; Ogedegbe, O. A global perspective on cardiovascular disease in vulnerable populations. Can. J. Cardiol. 2015, 31, 1081.
  2. World Health Organization (Ed.) World Health Statistics 2022: Monitoring Health for the SDGs, Sustainable Development Goals; World Health Organization: Geneva, Switzerland, 2022.
  3. Zhou, B.; Carrillo-Larco, R.M.; Danaei, G.; Riley, L.M.; Paciorek, C.J.; Stevens, G.A.; Gregg, E.W.; Bennett, J.E.; Solomon, B.; Singleton, R.K.; et al. Worldwide trends in hypertension prevalence and progress in treatment and control from 1990 to 2019: A pooled analysis of 1201 population-representative studies with 104 million participants. Lancet 2021, 398, 957–980.
  4. Lackland, D.T.; Weber, M.A. Global burden of cardiovascular disease and stroke: Hypertension at the core. Can. J. Cardiol. 2015, 31, 569–571.
  5. Olsen, M.H.; Angell, S.Y.; Asma, S.; Boutouyrie, P.; Burger, D.; Chirinos, J.A.; Damasceno, A.; Delles, C.; Gimenez-Roqueplo, A.P.; Hering, D.; et al. A call to action and a lifecourse strategy to address the global burden of raised blood pressure on current and future generations: The Lancet Commission on hypertension. Lancet 2016, 388, 2665–2712.
  6. Choi, J.; Kang, Y.; Park, J.; Joung, Y.; Koo, C. Development of Real-Time Cuffless Blood Pressure Measurement Systems with ECG Electrodes and a Microphone Using Pulse Transit Time (PTT). Sensors 2023, 23, 1684.
  7. Zhou, B.; Perel, P.; Mensah, G.A.; Ezzati, M. Global epidemiology, health burden and effective interventions for elevated blood pressure and hypertension. Nature 2021, 18, 785–802.
  8. Owida, H.A. Biomechanical Sensing Systems for Cardiac Activity Monitoring. Int. J. Biomater. 2022, 22, 8312564.
  9. Athaya, T.; Choi, S. A Review of Noninvasive Methodologies to Estimate the Blood Pressure Waveform. Sensors 2022, 22, 3953.
  10. Wan, Q.; Chen, Q.; Freithaler, M.A.; Velagala, S.R.; Liu, Y.; To, A.C.; Mahajan, A.; Mukkamala, R.; ** Personalized Models of Blood Pressure Estimation from Wearable Sensors Data Using Minimally-trained Domain Adversarial Neural Networks. In Proceedings of the 5th Machine Learning for Healthcare Conference, Virtual, 7–8 August 2020; pp. 97–120.
  11. Leitner, J.; Chiang, P.H.; Dey, S. Personalized Blood Pressure Estimation Using Photoplethysmography: A Transfer Learning Approach. IEEE J. Biomed. Health Inform. 2022, 26, 218–228.
Figure 1. System employed to collect PPG raw data from the selected sites, elbow (brachial artery) and thumb (digital artery).
Figure 2. Data collection protocol followed in this study along with the evolution of the averaged pulse waveforms morphology according to each section of the data capture.
Figure 3. (a) Feature extracted from a PPG waveform. (b) Maximum of the first derivative (ms) detected on the velocity plethysmography (VPG). (c) Fiducial points detected on the acceleration plethysmography (APG).
Figure 4. Overview of the tested training strategies. (Left) Workflow employed for the generalized approach (PIM). (Center) Tested combinations for person-specific strategies (PSMs). (Right) Workflow adopted by every $PSM_{i,j}$.
Figure 5. Definition of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN) instances in a multiclass problem. (a) REST class. (b) Cognitive task (CT) class. (c) After-exercise (AE) class.
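The one-vs-rest definition of TP, FP, FN, and TN for the three classes, and the macro-averaged metrics built on them, can be illustrated with a short sketch. The confusion matrix below is hypothetical (it is not taken from the study), and the function names are illustrative only; the sketch assumes the usual convention of rows as true classes and columns as predicted classes.

```python
LABELS = ["REST", "CT", "AE"]

# Hypothetical confusion matrix: rows are true classes, columns are predictions.
CONFUSION = [
    [50, 3, 2],   # true REST
    [4, 45, 1],   # true CT
    [1, 2, 52],   # true AE
]


def one_vs_rest_counts(cm, k):
    """Return (TP, FP, FN, TN) for class index k treated as the positive class."""
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fp = sum(cm[r][k] for r in range(len(cm))) - tp  # predicted k, true class differs
    fn = sum(cm[k]) - tp                             # true k, predicted as another class
    tn = total - tp - fp - fn
    return tp, fp, fn, tn


def macro_metrics(cm):
    """Macro-averaged precision, recall, F1 and overall accuracy."""
    precisions, recalls, f1_scores = [], [], []
    for k in range(len(cm)):
        tp, fp, fn, _ = one_vs_rest_counts(cm, k)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)
    n = len(cm)
    accuracy = sum(cm[k][k] for k in range(n)) / sum(sum(row) for row in cm)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1_scores) / n,
    }


for index, label in enumerate(LABELS):
    print(label, one_vs_rest_counts(CONFUSION, index))
metrics = macro_metrics(CONFUSION)
print(metrics)
```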
Figure 6. Values of evaluation metrics (accuracy, precision, recall, and F1-score) according to the training strategy denoted as $PSM_{SD}$. A 90% threshold (indicated by the gray dashed line) is used to identify subjects whose performance drops by more than 10% compared to the training phase.
Figure 7. Averaged values of accuracy score according to different combinations of number of source subjects and diverse fractions of data employed to personalize the RF model.
Figure 8. Evaluation metrics computed for each individual employing a fraction of the target subject data set equal to 30% and a diverse number of source subjects (N). A 90% threshold (indicated by the gray dashed line) is used to identify subjects whose performance drops by more than 10% compared to the training phase. (a) N = 5. (b) N = 10. (c) N = 15.
Figure 9. Evaluation metrics computed for each individual employing a fraction of the target subject data set equal to 50% and a diverse number of source subjects (N). A 90% threshold (indicated by the gray dashed line) is used to identify subjects whose performance drops by more than 10% compared to the training phase. (a) N = 5. (b) N = 10. (c) N = 15.
Table 1. Overview of the characteristics of the study populations.
| Characteristics | μ ± σ | Range |
|---|---|---|
| Number of Subjects | 31 | - |
| Male | 20 (64%) | - |
| Smokers | 4 (13%) | - |
| Age (years) | 27.77 ± 3.70 | 21–34 |
| Height (cm) | 172.74 ± 9.27 | 158–192 |
| Weight (kg) | 69.52 ± 12.72 | 53–99 |
| BMI (kg/m²) | 23.22 ± 3.30 | 18.16–31.24 |

Abbreviations: BMI, body mass index; μ, mean value; σ, standard deviation.
Table 2. Criteria for identifying fiducial points on PPG pulse waves.
| Signal | Fiducial Point | Description |
|---|---|---|
| PPG, s | Sys | Maximum of the pulse waveform |
| | Dic | First local minimum after the systolic peak, or coincident with e |
| | Dia | First local maximum after dic and before 0.8 T (where T is the duration of the cardiac cycle) |
| VPG, s′ | ms | Maximum of the first derivative, s′ |
| APG, s″ | a | The maximum of s″ preceding the maximum of the first derivative, ms |
| | b | First local minimum following a |
| | c | The greatest maximum of s″ between b and e (or, if no maxima, then the first maximum on s′ after e) |
| | d | The lowest minimum on s″ after c and before e (or, if no minima, then coincident with c) |
| | e | The second maximum of s″ after ms and before 0.6 T (unless the c wave is an inflection point, in which case take the first maximum) |
| | f | The first local minimum of s″ after e and before 0.8 T |

Abbreviations: PPG, photoplethysmogram; VPG, velocity plethysmogram; APG, acceleration plethysmogram; s, original pulse; s′, first derivative of the original pulse; s″, second derivative of the original pulse.
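Two of the simplest criteria in Table 2, the systolic peak (maximum of the pulse) and ms (maximum of the first derivative), can be illustrated on a toy waveform. This is only a sketch under strong assumptions: the pulse is a synthetic bell-shaped curve rather than a real PPG beat, the 100 Hz sampling rate is arbitrary, the derivative is a plain forward difference, and the function names are hypothetical. The remaining fiducial points (dic, dia, a to f) need the windowing rules of the table and are not covered here.

```python
import math

FS = 100  # assumed sampling rate in Hz for this toy example


def synthetic_pulse(n_samples=100, peak_index=25, width=8.0):
    """Bell-shaped toy pulse with its maximum at peak_index (not a real PPG beat)."""
    return [math.exp(-((i - peak_index) / width) ** 2) for i in range(n_samples)]


def first_derivative(signal, fs):
    """Forward-difference approximation of the first derivative (one sample shorter)."""
    return [(signal[i + 1] - signal[i]) * fs for i in range(len(signal) - 1)]


def detect_sys_and_ms(signal, fs):
    """Return sample indices of the systolic peak and of the maximum slope (ms)."""
    sys_index = max(range(len(signal)), key=signal.__getitem__)
    slope = first_derivative(signal, fs)
    ms_index = max(range(len(slope)), key=slope.__getitem__)
    return sys_index, ms_index


pulse = synthetic_pulse()
sys_index, ms_index = detect_sys_and_ms(pulse, FS)
print(f"sys at {sys_index / FS:.2f} s, ms at {ms_index / FS:.2f} s")
```

As expected for an upstroke followed by a peak, the maximum slope is found before the systolic peak.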
Table 3. Definition of the extracted features from PPG pulse wave and its derivatives.
| Signal | Type | Feature | Definition | Formula |
|---|---|---|---|---|
| PPG, s | Time | ΔT | Time delay between systolic and diastolic peaks | $t_{dia} - t_{sys}$ |
| | | SI | Stiffness index; h is the subject’s height | $h / (t_{dia} - t_{sys})$ |
| | | CT | Crest time: time between pulse onset and systolic peak | $t_{sys} - t_0$ |
| | | w | Pulse width at 50% of systolic peak amplitude, $A_{sys}$ | - |
| | | IPR | Instantaneous pulse rate | $60 / T$ |
| | | T | Period of the cardiac cycle | - |
| | | $t_{dia}$ | Duration of the diastole | $T - t_{dic}$ |
| | | $t_{dic}$ | Time to dicrotic notch | $t_{dic} - t_0$ |
| | Amplitude | $A_0$ | Amplitude of pulse onset | $s(t_0)$ |
| | | $A_{sys}$ | Amplitude of the systolic peak | $s(t_{sys})$ |
| | | $A_{dic}$ | Amplitude of the dicrotic notch | $s(t_{dic})$ |
| | | $A_{dia}$ | Amplitude of the diastolic peak | $s(t_{dia})$ |
| | | RI | Reflection index | $(s(t_{dia}) - s(t_0)) / (s(t_{sys}) - s(t_0))$ |
| | | K | Pulse waveform characteristic value | $(s_{\mu} - A_0) / (A_{sys} - A_0)$ |
| | | $K_1$ | Systolic characteristic value | $(s_{\mu,sys} - A_0) / (A_{sys} - A_0)$ |
| | | $K_2$ | Diastolic characteristic value | $(s_{\mu,dia} - A_0) / (A_{sys} - A_0)$ |
| | | $s_{\mu,sys}$ | Mean value of the systolic phase of the pulse waveform | - |
| | | $s_{\mu,dia}$ | Mean value of the diastolic phase of the pulse waveform | - |
| | | $s_{\mu}$ | Mean value of pulse waveform | - |
| | | $s_{\sigma}$ | Standard deviation of pulse waveform | - |
| | | $s_{skewness}$ | Skewness of pulse waveform | - |
| | | $s_{kurtosis}$ | Kurtosis of pulse waveform | - |
| | Area | $A_1$ | Area under the curve between the pulse onset ($t_0$) and the dicrotic notch ($t_{dic}$) | - |
| | | $A_2$ | Area under the curve between the dicrotic notch ($t_{dic}$) and the end of the pulse ($t_{end}$) | - |
| | | IPA | Inflection point area | $A_2 / A_1$ |
| VPG, s′ | Time | $t_{ms}$ | Time to the maximum slope computed on the first derivative of the pulse | $t_{ms} - t_0$ |
| | Amplitude | $A_{ms}$ | Amplitude of the maximum slope | $s(t_{ms})$ |
| APG, s″ | Time | $t_{bd}$ | Time elapsing between b and d | $t_d - t_b$ |
| | | $t_{bc}$ | Time elapsing between b and c | $t_c - t_b$ |
| | Amplitude | b/a | Amplitude ratio of early systolic negative wave over early systolic positive wave | $s''(t_b) / s''(t_a)$ |
| | | c/a | Amplitude ratio of late systolic re-increasing wave over early systolic positive wave | $s''(t_c) / s''(t_a)$ |
| | | d/a | Amplitude ratio of late systolic decreasing wave over early systolic positive wave | $s''(t_d) / s''(t_a)$ |
| | | e/a | Amplitude ratio of early diastolic positive wave over early systolic positive wave | $s''(t_e) / s''(t_a)$ |
| | | AGI | Aging index | $(s''(t_b) - s''(t_c) - s''(t_d) - s''(t_e)) / s''(t_a)$ |
| Combined | | IPAD | Inflection point area combined with d-peak | $A_2 / A_1 + s''(t_d) / s''(t_a)$ |
| | | k | Elasticity constant | $s(t_{sys}) ((s(t_{sys}) - s(t_{ms})) / (s(t_{sys}) - s(t_0)))$ |
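A few of the time and amplitude features in Table 3 can be computed directly from fiducial-point values, as in the sketch below. The formulas follow the table definitions (ΔT = t_dia − t_sys, SI = h/ΔT, CT = t_sys − t_0, IPR = 60/T, and the reflection index RI); the input values are made up for illustration, expressing the height in metres is an assumption (the unit used for SI is not stated in the table), and the function name is hypothetical.

```python
def pulse_features(t0, t_sys, t_dia, period, height, s0, s_sys, s_dia):
    """Compute a subset of the Table 3 features from fiducial times and amplitudes.

    Times are in seconds; height is assumed to be in metres (an assumption).
    """
    delta_t = t_dia - t_sys                        # time delay between peaks
    return {
        "delta_t": delta_t,
        "SI": height / delta_t,                    # stiffness index
        "CT": t_sys - t0,                          # crest time
        "IPR": 60.0 / period,                      # instantaneous pulse rate (bpm)
        "RI": (s_dia - s0) / (s_sys - s0),         # reflection index
    }


# Illustrative values only (not taken from the study's recordings).
features = pulse_features(
    t0=0.0, t_sys=0.20, t_dia=0.50, period=0.80,
    height=1.75, s0=0.1, s_sys=1.1, s_dia=0.6,
)
for name, value in features.items():
    print(f"{name}: {value:.3f}")
```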
Table 4. Data processing results.
| Data Capture Phase | Segmented Pulses | PWQA | Fiducial Points Validation |
|---|---|---|---|
| REST | 6348 | 6074 | 5935 |
| CT | 6213 | 5849 | 5630 |
| AE | 6713 | 6620 | 6321 |
| Total Pulses | 19,274 | 18,543 (96.2%) | 17,886 (92.8%) |

Abbreviations: REST, measurements at rest; CT, cognitive task section; AE, measurements after physical tasks; PWQA, pulse wave quality assessment.
Table 5. Averaged reference blood pressure values.
| Data Capture Phase | SBP (mmHg) | DBP (mmHg) | HR (bpm) |
|---|---|---|---|
| REST | 109 ± 11.6 | 67.8 ± 6.7 | 71.7 ± 8.3 |
| CT | 114.5 ± 3 | 71.2 ± 8.2 | 70 ± 8.7 |
| AE | 115.4 ± 12.2 | 72.9 ± 7.3 | 77.3 ± 11.9 |

Abbreviations: REST, measurements at rest; CT, cognitive task section; AE, measurements after physical tasks; SBP, systolic blood pressure; DBP, diastolic blood pressure; HR, heart rate.
Table 6. Macro-averaged evaluation metrics computed for each model.
| MLA | Training Strategy | Training Set Accuracy * | Training Set Precision * | Training Set Recall * | Training Set F1-Score * | Test Set Accuracy * | Test Set Precision * | Test Set Recall * | Test Set F1-Score * |
|---|---|---|---|---|---|---|---|---|---|
| RF | PIM | 0.99 | 0.99 | 0.99 | 0.99 | 0.360 ± 0.200 | 0.360 ± 0.180 | 0.310 ± 0.180 | 0.370 ± 0.180 |
| | $PSM_{SD}$ | 1 | 1 | 1 | 1 | 0.925 ± 0.117 | 0.912 ± 0.165 | 0.926 ± 0.113 | 0.912 ± 0.147 |
| | $PSM_{5,50\%}$ | 1 | 1 | 1 | 1 | 0.964 ± 0.041 | 0.964 ± 0.040 | 0.963 ± 0.038 | 0.962 ± 0.042 |
| | $PSM_{10,50\%}$ | 1 | 1 | 1 | 1 | 0.957 ± 0.030 | 0.958 ± 0.028 | 0.955 ± 0.030 | 0.956 ± 0.028 |
| | $PSM_{15,50\%}$ | 1 | 1 | 1 | 1 | 0.945 ± 0.028 | 0.947 ± 0.028 | 0.945 ± 0.026 | 0.944 ± 0.027 |
| | $PSM_{5,30\%}$ | 1 | 1 | 1 | 1 | 0.951 ± 0.039 | 0.952 ± 0.037 | 0.950 ± 0.038 | 0.950 ± 0.037 |
| | $PSM_{10,30\%}$ | 1 | 1 | 1 | 1 | 0.926 ± 0.051 | 0.930 ± 0.044 | 0.922 ± 0.052 | 0.923 ± 0.051 |
| | $PSM_{15,30\%}$ | 1 | 1 | 1 | 1 | 0.916 ± 0.045 | 0.918 ± 0.043 | 0.914 ± 0.045 | 0.914 ± 0.045 |
| | $PSM_{5,15\%}$ | 1 | 1 | 1 | 1 | 0.914 ± 0.043 | 0.916 ± 0.043 | 0.913 ± 0.042 | 0.913 ± 0.042 |
| | $PSM_{10,15\%}$ | 1 | 1 | 1 | 1 | 0.874 ± 0.069 | 0.885 ± 0.061 | 0.872 ± 0.072 | 0.871 ± 0.072 |
| | $PSM_{15,15\%}$ | 1 | 1 | 1 | 1 | 0.860 ± 0.083 | 0.867 ± 0.083 | 0.852 ± 0.091 | 0.853 ± 0.091 |

Abbreviations: MLA, machine learning algorithm; PIM, person-independent model; $PSM_{SD}$, person-specific model with 50% of data from the kth subject for the training set, 25% as validation set, and 25% as test set; $PSM_{i,j}$, person-specific model where i identifies the number of source individuals, i ∈ {5, 10, 15}, and j refers to the percentage of data belonging to the kth target subject, j ∈ {15%, 30%, 50%}. Note: * macro-averaged values computed on the 28 subjects employed for the analysis.
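To make the $PSM_{i,j}$ notation concrete, the sketch below assembles the ingredients of one person-specific training set: i randomly chosen source subjects plus a fraction j of the target subject's pulses. It is an illustration under stated assumptions rather than the study's pipeline: the target's pulses are drawn at random, the remaining pulses are treated as held-out data, the target is excluded from the source pool, and `build_psm_split` together with its parameters is a hypothetical name. Subject IDs 17, 26, and 29 are left out because the appendix tables report no results for them.

```python
import random

EXCLUDED_IDS = {17, 26, 29}  # IDs without results in the appendix tables
SUBJECT_IDS = [s for s in range(1, 32) if s not in EXCLUDED_IDS]  # 28 subjects


def build_psm_split(target_id, n_sources, target_fraction, n_target_pulses, seed=0):
    """Pick source subjects and split the target's pulse indices for PSM_{i,j}.

    Assumptions (not confirmed by the source): random pulse selection, and all
    remaining target pulses are kept aside as held-out data.
    """
    rng = random.Random(seed)
    candidates = [s for s in SUBJECT_IDS if s != target_id]  # assumed: target excluded
    sources = sorted(rng.sample(candidates, n_sources))

    pulse_indices = list(range(n_target_pulses))
    rng.shuffle(pulse_indices)
    n_personal = round(target_fraction * n_target_pulses)
    personalization = sorted(pulse_indices[:n_personal])
    held_out = sorted(pulse_indices[n_personal:])
    return sources, personalization, held_out


# Example: PSM_{5,30%} for a hypothetical target subject with 600 valid pulses.
sources, personalization, held_out = build_psm_split(
    target_id=21, n_sources=5, target_fraction=0.30, n_target_pulses=600
)
print("source subjects:", sources)
print("personalization pulses:", len(personalization), "held-out pulses:", len(held_out))
```

Fixing the seed keeps the draw reproducible, which matters when comparing strategies that differ only in i or j.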
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Valerio, A.; Demarchi, D.; O’Flynn, B.; Motto Ros, P.; Tedesco, S. Development of a Personalized Multiclass Classification Model to Detect Blood Pressure Variations Associated with Physical or Cognitive Workload. Sensors 2024, 24, 3697. https://doi.org/10.3390/s24113697
