Pedestrian Walking Distance Estimation Based on Smartphone Mode Recognition

Wang, Qu; Ye, Langlang; Luo, Haiyong; Men, Aidong; Zhao, Fang; Ou, Changhai

doi:10.3390/rs11091140

by Qu Wang ¹, Langlang Ye ², Haiyong Luo ²,*, Aidong Men ¹, Fang Zhao ³ and Changhai Ou ⁴

¹ School of Information and Communication Engineering, Beijing University of Posts and Telecommunication, Beijing 100876, China

² Beijing Key Laboratory of Mobile Computing and Pervasive Device, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China

³ School of Software Engineering, Beijing University of Posts and Telecommunication, Beijing 100876, China

⁴ School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798, Singapore

* Author to whom correspondence should be addressed.

Remote Sens. 2019, 11(9), 1140; https://doi.org/10.3390/rs11091140

Submission received: 17 April 2019 / Revised: 3 May 2019 / Accepted: 11 May 2019 / Published: 13 May 2019

(This article belongs to the Special Issue Concurrent Positioning, Mapping and Perception of Multi-source Data Fusion for Smart Applications)

Abstract: Stride length and walking distance estimation are becoming a key aspect of many applications. One way to enhance the accuracy of pedestrian dead reckoning is to accurately estimate the stride length of pedestrians. Existing stride length estimation (SLE) algorithms perform well when the pedestrian walks at normal speed and the smartphone mode is fixed (handheld). The mode represents a specific state of the carried smartphone. The error of existing SLE algorithms increases in complex scenes with many mode changes. Considering that stride length estimation is very sensitive to smartphone modes, this paper focused on combining smartphone mode recognition and stride length estimation to provide an accurate walking distance estimation. We combined multiple classification models to recognize five smartphone modes (calling, handheld, pocket, armband, swing). In addition to using a combination of time-domain and frequency-domain features of smartphone built-in accelerometers and gyroscopes during the stride interval, we constructed higher-order features based on the acknowledged studies (Kim, Scarlett, and Weinberg) to model stride length using the regression model of machine learning. In the offline phase, we trained the corresponding stride length estimation model for each mode. In the online prediction stage, we called the corresponding stride length estimation model according to the smartphone mode of a pedestrian. To train and evaluate the performance of our SLE, a dataset with smartphone mode, actual stride length, and total walking distance was collected. We conducted extensive experiments to verify the performance of the proposed algorithm and compare it with the state-of-the-art SLE algorithms. Experimental results demonstrated that the proposed walking distance estimation method achieved significant accuracy improvement over existing individual approaches when a pedestrian was walking in both indoor and outdoor complex environments with multiple mode changes.

Keywords: indoor positioning; machine learning; pedestrian dead reckoning; stride length estimation; smartphone mode recognition

1. Introduction

Applications that attempt to track pedestrian motion level (walking distance) for health purposes require an accurate step detection and stride length estimation (SLE) technique [1]. Walking distance is used to assess the physical activity level of the user, which helps provide feedback and motivate a more active lifestyle [2,3]. Another type of application based on walking distance is navigation. Among various indoor localization methods, pedestrian dead reckoning (PDR) [4] has become a mainstream and practical method, because PDR does not require any infrastructure. In addition to general applications involving asset and personnel tracking, health monitoring, precision advertising, and location-specific push notifications, PDR is available for emergency scenarios, such as anti-terrorism action, emergency rescue, and exploration missions. Furthermore, smartphone-based PDR mainly benefits from the extensive use of smartphones: pedestrians always carry smartphones that have integrated inertial sensors. Stride length estimation is a key component of PDR, and its accuracy directly affects the performance of PDR systems. Therefore, in addition to providing more accurate motion level estimation, precise stride length estimation based on built-in smartphone inertial sensors enhances the positioning accuracy of PDR. Most visible light positioning [5,6], Wi-Fi positioning [7,8,9], and magnetic positioning [10,11,12] methods critically depend on PDR. Hence, motion level estimation based on smartphones contributes to assisting and supporting patients undergoing health rehabilitation and treatment, activity monitoring of daily living, navigation, and numerous other applications [13].

The methods for estimating pedestrian step length fall into two categories: direct methods, based on the integration of acceleration, and indirect methods, which leverage a model or assumption to compute step length. In principle, the double integration of the acceleration component in the forward direction is the best method to compute the stride length of pedestrians because it does not rely on any model or assumption, and does not require training phases or individual information (leg length, height, weight) [14]. Kourogi et al. [15] leveraged the correlation between vertical acceleration and walking velocity to estimate walking speed, and calculated stride length by multiplying walking speed with step interval. However, the non-negligible bias and noise of the accelerometers and gyroscopes cause the distance error to grow without bound, cubically in time [14]. Moreover, it is difficult to obtain the acceleration component in the forward direction from the sensor's measurements, and to keep the sensor heading constantly parallel to the pedestrian's walking direction [16]. Additionally, low-cost smartphone sensors are not reliable and accurate enough to estimate the stride length of a pedestrian by double integrating the acceleration [17]. Developing a step length estimation algorithm using MEMS (micro-electro-mechanical systems) sensors is recognized as a difficult problem.

Considerable research based on models or assumptions has been conducted to improve the accuracy of SLE. These approaches can be grouped into empirical relationships [18,19], biomechanical models [18,20,21], linear models [22], nonlinear models [23,24,25], regression-based methods [22,26], and neural networks [27,28,29,30]. One of the most renowned SLE algorithms was presented by Weinberg [23]. To estimate the walking distance, he leveraged the range of the vertical acceleration values during each step, according to Equation (1).

L = k \cdot \sqrt[4]{a_{\max} - a_{\min}}

(1)

where $a_{\max}$ and $a_{\min}$ denote the maximum and minimum acceleration values on the Z-axis in each stride, respectively. k represents the calibration coefficient, which is obtained from the ratio of the actual distance and the estimated distance.

As shown in Equation (2), Kim et al. [24] developed an empirical method, based on the average of the acceleration magnitude in each stride during walking, to calculate movement distance.

L = k \cdot \sqrt[3]{\frac{\sum_{i = 1}^{N} | a_{i} |}{N}}

(2)

where $a_{i}$ represents the measured acceleration value of the $i^{th}$ sample in each step, and N represents the number of samples corresponding to each step. k is the calibration coefficient.

To estimate the travel distance of a pedestrian accurately, Ladetto et al. [22] leveraged the linear relationship between step length and frequency and the local variance of acceleration to calculate the motion distance with the following equation:

L = α \cdot f + β \cdot v + γ

(3)

where f is the step frequency, which represents the reciprocal of one stride interval, v is the acceleration variance during the interval of one step, α and β denote the weighting factors of step frequency and acceleration variance, respectively, and γ represents a constant that is used to fit the relationship between the actual distance and the estimated distance.
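The three classical models in Equations (1)–(3) can be written as a minimal Python sketch. The function names, the sample values, and the coefficient values below are illustrative assumptions, not values from the paper; in practice k, α, β, and γ would be calibrated against measured distances.

```python
def weinberg_stride(vertical_acc, k):
    """Equation (1): k times the fourth root of the vertical acceleration range."""
    return k * (max(vertical_acc) - min(vertical_acc)) ** 0.25


def kim_stride(acc, k):
    """Equation (2): k times the cube root of the mean absolute acceleration."""
    mean_abs = sum(abs(a) for a in acc) / len(acc)
    return k * mean_abs ** (1.0 / 3.0)


def ladetto_stride(step_frequency, acc_variance, alpha, beta, gamma):
    """Equation (3): linear model in step frequency and acceleration variance."""
    return alpha * step_frequency + beta * acc_variance + gamma


# Hypothetical single-stride samples (m/s^2) and hypothetical coefficients.
z_axis = [8.0, 12.0, 24.0, 10.0]   # range = 24 - 8 = 16
magnitude = [6.0, 8.0, 10.0]       # mean absolute value = 8

print(weinberg_stride(z_axis, k=0.5))
print(kim_stride(magnitude, k=0.5))
print(ladetto_stride(2.0, 1.0, alpha=0.3, beta=0.1, gamma=0.2))
```

All three models reduce a stride to one or two summary statistics of the acceleration signal, which is why a fixed calibration coefficient transfers poorly between carrying modes.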

Kang et al. [31] simultaneously measured the inertial sensor and global positioning system (GPS) position while walking outdoors with a reliable GPS fix, and regarded the velocity from the GPS as labels to train a hybrid multiscale convolutional and recurrent neural network model. After that, Kang leveraged the prediction velocity and moving time to estimate the traveled distance. However, it is challenging to obtain accurate labels, since GPS contains a positional error. Zhu et al. [32] measured the duration of the swing phase in each gait cycle by accelerometer and gyroscope, and then combined the acceleration information during the swing phase to obtain the step length. ** are not within the scope of this article. To reduce redundancy and maximize compatibility, all the data were published in JSON (JavaScript Object Notation) format. As shown in Figure 5, each stride holds nine degrees-of-freedom sensor data and the corresponding stride number, smartphone mode, stride length, and total walking distance. More detailed info about the dataset can be found in GitHub (https://github.com/wq1989/WalkingDistanceEstimation).

2.2. Pre-Processing and Walk Detection

The accelerometer data provided by the Android service were fairly noisy. High-frequency oscillations from the device and ambient environment seriously skew the clean oscillations of human motion. Normally, the step frequency was lower than 3 Hz (3 steps per second) [40]. To minimize the impact of the smartphone shaking and sensor noise, and improve the robustness of smartphone mode recognition and stride length estimation, we utilized a 1st order Butterworth filter [41] with a cutoff frequency = 3 Hz to remove the high frequency oscillations of the time-series sensors feature signal, and extract useful information from the low-cost sensor signals. Figure 6 shows the signal before and after the Butterworth filter. After using a Butterworth filter, the signal was smoother, and the insignificant parts of the signal were eliminated (see the red curve).
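As an illustration of this pre-processing step, the following is a minimal sketch of a first-order Butterworth low-pass filter with a 3 Hz cutoff, written in plain Python. It is not the authors' implementation: the coefficients come from the standard bilinear transform with frequency pre-warping, the 200 Hz sampling rate is an assumption taken from the smartphone row of the device table, and the function name is hypothetical.

```python
import math


def butterworth_lowpass_1st(signal, cutoff_hz=3.0, sample_rate_hz=200.0):
    """Apply a first-order Butterworth low-pass filter to a list of samples."""
    # Bilinear transform with pre-warping of the cutoff frequency.
    k = math.tan(math.pi * cutoff_hz / sample_rate_hz)
    b = k / (1.0 + k)           # feed-forward coefficient (b0 == b1)
    a1 = (k - 1.0) / (k + 1.0)  # feedback coefficient
    # Start from the steady state of the first sample to avoid a start-up jump.
    prev_x = signal[0]
    prev_y = signal[0]
    filtered = []
    for x in signal:
        y = b * x + b * prev_x - a1 * prev_y
        filtered.append(y)
        prev_x, prev_y = x, y
    return filtered


# A 1 Hz component (gait-like) passes, a 50 Hz component (vibration-like) is damped.
fs = 200.0
slow = [math.sin(2 * math.pi * 1.0 * n / fs) for n in range(400)]
fast = [math.sin(2 * math.pi * 50.0 * n / fs) for n in range(400)]
print(max(abs(v) for v in butterworth_lowpass_1st(slow)[200:]))
print(max(abs(v) for v in butterworth_lowpass_1st(fast)[200:]))
```

A first-order filter has a gentle roll-off, so components slightly above 3 Hz are attenuated only moderately; this sketch also filters causally, which introduces a small phase lag.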

As shown in Figure 7, unexpectedly rotating or shaking a smartphone may cause marked fluctuations in accelerometer and gyroscope readings without any step event. If only accelerometer and gyroscope readings are considered for walk detection, such abnormal movements may lead to unreliable step detection results. We therefore combined the accelerometer and gyroscope with a magnetometer to reduce the influence of random motion (shaking or rotating the smartphone). This is based on the assumption that the magnitude of a magnetic reading changes significantly when the user is walking indoors, due to the magnetic field diversity at different locations. We denoted the magnitudes of the gyroscope, acceleration, and magnetic field at time t as $g_{t}$, $a_{t}$, and $m_{t}$, respectively. We introduced a sliding window of N observed values to eliminate anomalous data and considered the average magnitude of acceleration $h_{a}$, the standard deviation of the gyroscope $h_{g}$, and the standard deviation of magnetic field magnitude $h_{m}$ for walk detection, as in Equations (4)–(6).

h_{a} = \frac{1}{N} \sum_{t = 1}^{N} a_{t}

(4)

h_{g} = \frac{1}{N} \sum_{t = 1}^{N} {\left( g_{t} - \frac{1}{N} \sum_{j = 1}^{N} g_{j} \right)}^{2}

(5)

h_{m} = \frac{1}{N} \sum_{t = 1}^{N} {\left( m_{t} - \frac{1}{N} \sum_{j = 1}^{N} m_{j} \right)}^{2}

(6)

If some or all of $h_{a}$, $h_{g}$, and $h_{m}$ were below certain thresholds, then the user was classified as static (not walking). Otherwise, the user was identified as moving. To effectively reduce the power consumption, walking detection was used to trigger the following walking distance estimation method.
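The walk detection rule can be sketched as follows. The window statistics follow Equations (4)–(6) as written (mean squared deviation for the gyroscope and magnetic field terms), while the threshold values and function names are hypothetical, since the text does not state them. The rule is read here as: if any statistic falls below its threshold, the window is static.

```python
def window_statistics(acc, gyro, mag):
    """Compute h_a, h_g, h_m over one sliding window of magnitudes."""
    n = len(acc)
    h_a = sum(acc) / n
    gyro_mean = sum(gyro) / n
    h_g = sum((g - gyro_mean) ** 2 for g in gyro) / n
    mag_mean = sum(mag) / n
    h_m = sum((m - mag_mean) ** 2 for m in mag) / n
    return h_a, h_g, h_m


def is_walking(acc, gyro, mag, th_a, th_g, th_m):
    """Return True only when all three statistics reach their thresholds."""
    h_a, h_g, h_m = window_statistics(acc, gyro, mag)
    return h_a >= th_a and h_g >= th_g and h_m >= th_m


# Hypothetical thresholds and windows, for illustration only.
thresholds = dict(th_a=9.5, th_g=0.05, th_m=1.0)
walking = is_walking([9, 11, 10, 12], [0.1, 0.9, 0.2, 1.0], [40, 50, 42, 52], **thresholds)
shaking = is_walking([9, 11, 10, 12], [0.1, 0.9, 0.2, 1.0], [45, 45, 45, 45], **thresholds)
print(walking, shaking)
```

The second window mimics shaking the phone in place: the inertial readings fluctuate, but the magnetic field magnitude stays constant, so the window is rejected.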

2.3. Feature Extraction

Feature extraction from accelerometer and gyroscope data streams is a crucial operation for smartphone mode recognition and stride length estimation. An excellent set of features provides accurate and comprehensive descriptions of motion distance. To capture either temporal variations or periodic characteristics of walking, both time-domain and frequency-domain features were considered in each stride.

Statistical Features: Table 4 describes the main statistical features extracted from each stride observation, with a brief definition of each. Mean, median, standard deviation, skewness, kurtosis, energy, maximum value, interquartile range, minimum value, and amplitude were considered.
Time-Domain Features: These represent how the inertial sensors’ signals vary with time. Table 5 shows the time-domain features. The number of peaks, g-crossing rate, zero-crossing ratio, gyroscope-accelerometer correlation, and inter-axis correlation were extracted from each stride observation.
Frequency-Domain Features: These represent the inertial sensors’ signals according to their frequency components (see Table 6). A fast Fourier transform (FFT) was applied, and the first dominant frequency, the second dominant frequency, and the amplitudes of the first and second dominant frequencies were used as features.
High-Order Features: In addition to the time-domain and frequency-domain features, we also built higher-order features based on the acknowledged studies, including Kim [24], Ladetto [22] and Weinberg [23]. All of the extracted high-order features are summarized in Table 7. The features mentioned above were extracted from the observations of the accelerometer and gyroscope in one stride.
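A subset of the per-stride features listed above can be computed as in the sketch below. It covers only a few statistical and time-domain features, uses the standard population definitions of standard deviation, skewness, and kurtosis, and treats the peak threshold as a hypothetical parameter; the function and key names are illustrative.

```python
import math


def stride_features(samples, peak_threshold):
    """Compute a few statistical and time-domain features for one stride."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / n)
    # Standardized third and fourth central moments (0 for a constant signal).
    skew = sum((s - mean) ** 3 for s in samples) / (n * std ** 3) if std > 0 else 0.0
    kurt = sum((s - mean) ** 4 for s in samples) / (n * std ** 4) if std > 0 else 0.0
    # Fraction of consecutive sample pairs whose product is negative.
    zcr = sum(1 for a, b in zip(samples, samples[1:]) if a * b < 0) / (n - 1)
    # Interior local maxima that exceed the pre-set threshold.
    peaks = sum(
        1
        for i in range(1, n - 1)
        if samples[i] > samples[i - 1]
        and samples[i] >= samples[i + 1]
        and samples[i] > peak_threshold
    )
    return {
        "mean": mean,
        "std": std,
        "skewness": skew,
        "kurtosis": kurt,
        "amplitude": max(samples) - min(samples),
        "zero_crossing_ratio": zcr,
        "num_peaks": peaks,
    }


# Hypothetical gyroscope-axis samples for one stride.
print(stride_features([-1.0, 1.0, -1.0, 1.0], peak_threshold=0.5))
```

In a full pipeline, one such feature dictionary would be produced per sensor axis and per stride and then concatenated into the input vector for the classifier and regressors.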

2.4. Smartphone Mode Recognition

Once the data pre-processing and features extraction were completed, features were used to train the multi-class classifier and predict smartphone modes in a timely way.

2.4.1. Smartphone Mode Definition and Analysis

As shown in Figure 8, in addition to the normal mode of handheld, the calling, pocket, arm-hand and swinging-hand modes were also considered.

Handheld: Pedestrian holds their phone horizontally with the hand in front of their chest while walking (see Figure 8a).
Calling: Pedestrian makes or receives a phone call while walking (see Figure 8b).
Trouser pocket: Pedestrian carries the smartphone in a trouser pocket while walking (see Figure 8c).
Swinging-hand: Arm swinging is the natural motion of the arms when walking with the hands-free, and it is synchronized with the opposite side’s foot (see Figure 8d).
Arm-hand: In scenes such as emergency rescue, users usually tie their smartphone to their arm (see Figure 8e).

In consideration of the different sensor characteristics, corresponding to different smartphone modes, we analyzed the differences of inertial sensors in the five usage modes. As shown in Figure 9 and Figure 10, the mode in the black dotted rectangle is the handheld mode; the mode in the red dotted rectangle is the calling mode; the mode in the blue dotted rectangle is the swinging-hand mode; the mode in the blue dotted rectangle is the arm-hand mode; the mode in the blue dotted rectangle is the trouser pocket mode. From the figures, we found that the observations of inertial sensors under different modes showed slight differences. Therefore, we made full use of the extracted statistical features, time-domain and frequency-domain features of inertial sensors, to identify different smartphone modes.

2.4.2. Classification Model

The key step in smartphone modes recognition is classification, which takes advantage of the extracted features. In this study, based on these features, six state-of-the-art single classifiers (Extreme Gradient Boost (XGBoost) [43], LightGBM [44], K-Nearest Neighbor (KNN) [45], Decision Tree (DT) [46], AdaBoost [47], and support vector machines (SVM) [48]) were compared to recognize smartphone modes. Each classifier presents its advantages and disadvantages.

To improve the accuracy and robustness of smartphone mode recognition, we fused the results of multiple classifiers. Stacking is an ensemble method in which a new model is trained on the combined predictions of two (or more) previous models. In general, the stacked model outperforms each of the individual models, because it smooths their predictions, giving more weight to each base model where it performs best and less where it performs poorly. As shown in Figure 11, we used a two-layer stacking model for smartphone mode recognition. In the first layer, the nonlinear models AdaBoost [47], DT [46], KNN [45], LightGBM [44], SVM [48], and XGBoost [43] were trained, and their predictions formed the training and test sets for the second layer. Logistic regression was employed as the second-level model to output the final prediction.

We took the F1 score as a performance metric to quantify the classification performance of different models. Precision is the ratio of correctly predicted positive conditions to the total predicted positive conditions for each class. Recall is the ratio of correctly predicted positive conditions to all the true positive conditions for each class. The F1 score combines precision and recall and represents the detection result with less bias than accuracy in multi-class classification problems, especially with disproportionate samples in each class [49].

\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN} = \frac{\text{correct predictions}}{\text{all predictions}}

(7)

\text{precision} = \frac{TP}{TP + FP} = \frac{\text{positives predicted correctly}}{\text{all positive predictions}}

(8)

\text{recall} = \frac{TP}{TP + FN} = \frac{\text{positives predicted correctly}}{\text{all positive observations}}

(9)

F1 = 2 \times \frac{\text{precision} \times \text{recall}}{\text{precision} + \text{recall}}

(10)

The definitions of the above metrics use the true positive (TP), true negative (TN), false positive (FP), and false negative (FN). A high F1 score indicates a high level of classification performance and agreement between the classification and ground truth.
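For a multi-class problem such as mode recognition, Equations (8)–(10) are evaluated per class by treating that class as positive and all others as negative. The sketch below shows this one-vs-rest computation; the function name and the toy labels are illustrative assumptions.

```python
def per_class_scores(y_true, y_pred, label):
    """Return (precision, recall, F1) for one class, one-vs-rest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy ground truth and predictions (hypothetical).
y_true = ["handheld", "handheld", "calling", "calling", "pocket"]
y_pred = ["handheld", "calling", "calling", "calling", "pocket"]

for mode in ("handheld", "calling", "pocket"):
    print(mode, per_class_scores(y_true, y_pred, mode))
```

In the toy data, one handheld stride is misclassified as calling, which lowers the recall of the handheld class and the precision of the calling class while leaving the pocket class unaffected.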

Figure 12 compares the accuracy of smartphone mode prediction for the six single classifiers and a stacking ensemble classifier in the three trajectories. Figure 12 indicates that the classification model based on stacking ensemble outperformed all single classifiers in the three tested trajectories. The average recognition accuracy of the classification model based on stacking ensemble reached about 98.47%. The average recognition accuracy of the stacking ensemble classifier was improved by 26.3%, 1.9%, 33.5%, 22.7%, 22.8%, and 2.0% compared to AdaBoost [47], DT [46], KNN [45], LightGBM [44], SVM [48], and XGBoost [43], respectively. The precision, recall, and F-measure score of each smartphone mode using the stacking classifier are summarized in Table 8.

2.5. Stride Length Estimation Based on Regression Model

2.5.1. Single Regression Models

Compared to traditional SLE methods, a regression model of machine learning has excellent generalization ability and distinct advantages in terms of approximating nonlinear continuous function. To make full use of the advantages of different machine learning models and obtain the best SLE accuracy, we trained six regression models, including Extreme Gradient Boost (XGBoost) [43], LightGBM [44], K-Nearest Neighbor (KNN) [45], Decision Tree (DT) [46], AdaBoost [47], and Support Vector Regression (SVR) [50].

2.5.2. Stacking Regression Model

To improve the accuracy and robustness of stride length estimation, the stacking regression technique of ensemble learning [51] was employed to combine multiple regression models via a meta-regressor (see Figure 13). In the offline training phase, we selected XGBRegressor [43], DecisionTreeRegressor [46], AdaBoostRegressor [47], and LightGBM [44] as single regression models. The single regression models were trained on the complete training set. We selected SVR [50] with kernel = ’rbf’ as the meta-regressor. The meta-regressor was fitted on the outputs (meta-features) of the single regression models in the ensemble. In the online phase, the trained stacking regression model predicted the stride length of the pedestrian in real time. More details of stacking regression can be found in References [51,52].

Figure 14 compares the stride length estimation of the stacking model and single regression models. The stacking model achieved the smallest SLE error in that the estimation errors of the average, the 75th percentile, and the 90th percentile were 0.039 m, 0.051 m, 0.075 m, respectively.

2.6. Walking Distance Estimation Based on Smartphone Mode Recognition

The characteristics of the inertial signals differed between the carrying modes, which degrades the accuracy of a single stride length model. Therefore, we trained five stride length models corresponding to five smartphone modes (handheld, swing, pocket, arm-hand, and calling) in the offline phase. In the online phase, the proposed stacking classifier was used to detect the smartphone mode in a timely way. Once the smartphone mode was identified, we estimated the walking distance of the pedestrian by selecting the stride length model corresponding to the smartphone mode. Denoting N as the total number of strides and $L_{i}$ as the estimated length of the i-th stride, the walking distance D was calculated as follows:

D = \sum_{i = 1}^{N} L_{i}

(11)
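The online phase described above amounts to a dispatch step followed by the sum in Equation (11). The sketch below shows that control flow only: the classifier and the per-mode stride models are stand-in callables with hypothetical names and constant outputs, not the trained stacking models.

```python
def walking_distance(strides, recognize_mode, stride_models):
    """Sum per-stride length estimates, choosing the model by recognized mode."""
    total = 0.0
    for features in strides:
        mode = recognize_mode(features)        # e.g. the stacking classifier
        total += stride_models[mode](features)  # mode-specific regressor
    return total


# Stand-ins for trained models (hypothetical, constant outputs in metres).
def recognize_mode(features):
    return features["mode_hint"]


stride_models = {
    "handheld": lambda features: 0.70,
    "pocket": lambda features: 0.60,
}

strides = [
    {"mode_hint": "handheld"},
    {"mode_hint": "handheld"},
    {"mode_hint": "pocket"},
]
print(walking_distance(strides, recognize_mode, stride_models))
```

Keeping the classifier and the regressors behind plain callables makes it straightforward to swap in real trained models, and a misrecognized mode affects only the strides for which the wrong regressor was selected.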

2.7. Performance Evaluation Metrics

We utilized the error rate of the stride length and walking distance as metrics to evaluate the proposed method. The error rate of the stride length was calculated with the following formula:

E_{s} = \frac{1}{N} \sum_{i = 1}^{N} \left( \frac{| L_{e}^{i} - L_{t}^{i} |}{L_{t}^{i}} \times 100\% \right)

(12)

where $L_{e}^{i}$ and $L_{t}^{i}$ represent the predicted stride length and the actual stride length of the i-th stride, respectively.

The error rate of walking distance was calculated with the following formula:

E_{cd} = \frac{| \sum_{i = 1}^{M} L_{e}^{i} - \sum_{i = 1}^{M} L_{t}^{i} |}{\sum_{i = 1}^{M} L_{t}^{i}} \times 100\%

(13)

where $L_{e}^{i}$ and $L_{t}^{i}$ denote the estimated stride length and the actual stride length of the i-th stride, respectively, and M is the number of strides in the walk.
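Equations (12) and (13) can be computed as in the following sketch; the function names and sample stride lengths are illustrative assumptions. The example also shows why the two metrics differ: per-stride errors of opposite sign cancel in the walking distance but not in the stride length error rate.

```python
def stride_error_rate(estimated, actual):
    """Equation (12): mean relative stride length error, in percent."""
    relative = [abs(e - t) / t for e, t in zip(estimated, actual)]
    return 100.0 * sum(relative) / len(relative)


def distance_error_rate(estimated, actual):
    """Equation (13): relative error of the summed walking distance, in percent."""
    return 100.0 * abs(sum(estimated) - sum(actual)) / sum(actual)


# Hypothetical stride lengths in metres: one overestimate, one underestimate.
estimated = [1.1, 0.9]
actual = [1.0, 1.0]
print(stride_error_rate(estimated, actual))
print(distance_error_rate(estimated, actual))
```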

Modes	Handheld		Calling		Swing-Hand		Arm-Hand		Trouser Pocket
Sensors	acc (m/s²)	gyro (rad/s)	acc (m/s²)	gyro (rad/s)	acc (m/s²)	gyro (rad/s)	acc (m/s²)	gyro (rad/s)	acc (m/s²)	gyro (rad/s)
Mean	9.59	0.15	9.79	0.53	10.33	0.99	9.84	0.76	11.02	1.75
STD	0.65	0.14	1.03	0.95	2.38	1.25	1.61	0.55	4.19	1.01

Devices	x-IMU		Smartphone
Items	Accelerometer	Gyroscope	Accelerometer	Gyroscope
Range	±16 g	±2000°/s	±8 g	±2000°/s
Stability	0.00049 g	0.06°/s	0.001 g	0.001°/s
Sample frequency	400 Hz	400 Hz	200 Hz	200 Hz

Subjects	Gender	Age	Height (cm)	Weight (kg)
S1	M	30	169	68
S2	F	25	156	46
S3	F	25	161	53
S4	M	27	181	82
S5	M	26	173	61

Feature	Description
Mean	The mean of a signal. $\bar{s} = \frac{\sum_{i = 1}^{N} s_{i}}{N}$ where $s_{i}$ are the samples, $i = 1, \dots, N$ .
Standard deviation	$std = \sqrt{\frac{1}{N} \sum_{i = 1}^{N} {(‖ s_{i} ‖ - \bar{‖ s ‖})}^{2}}$ where $s_{i}$ are the samples, $i = 1, \dots, N$ .
Skewness	$s_{skew} = \frac{1}{N σ^{3}} \sum_{i = 1}^{N} {(‖ s_{i} ‖ - \bar{‖ s ‖})}^{3}$ , where σ is the standard deviation. Skewness is a measure of the asymmetry of the probability distribution.
Kurtosis	$s_{kurt} = \frac{1}{N σ^{4}} \sum_{i = 1}^{N} {(‖ s_{i} ‖ - \bar{‖ s ‖})}^{4}$ , where σ is the standard deviation. Kurtosis is a descriptor of the shape of a probability distribution.
Interquartile range	Quartiles divide an ordered data set into four equal parts. The interquartile range (IQR) is the first quartile subtracted from the third quartile.

Feature	Description
Magnitude area	The sum of absolute values of a signal.
Number of peaks	The count of maximum points within one stride window of the signal where the maximum points should be above a pre-set value.
Zero-crossing ratio	$ZCR = \frac{1}{N - 1} \sum_{i = 1}^{N - 1} \mathbb{1}\{s_{i} s_{i + 1} < 0\}$ , where $\mathbb{1}\{\cdot\}$ equals 1 when the condition holds and 0 otherwise. The zero-crossing ratio is a measure of how many times within a stride a signal changes from a positive value to a negative value, and vice versa [42].
Inter-axis correlation	$R_{\vec{x}, \vec{y}} = \frac{N (\sum_{i = 1}^{N} x_{i} y_{i}) - (\sum_{i = 1}^{N} x_{i}) (\sum_{i = 1}^{N} y_{i})}{\sqrt{[N \sum_{i = 1}^{N} x_{i}^{2} - {(\sum_{i = 1}^{N} x_{i})}^{2}] [N \sum_{i = 1}^{N} y_{i}^{2} - {(\sum_{i = 1}^{N} y_{i})}^{2}]}}$ where $x_{i}$ and $y_{i}$ are the samples from two axes, $i = 1, \dots, N$ .
Accelerometer–gyroscope correlation	The cross-correlation coefficient between the acceleration and gyroscope.

Feature	Description
Spectrum energy	$energy = \sum_{i = 1}^{N} {‖ s_{i} ‖}^{2}$ . Depicts the energy distribution of each frequency point.
Spectral entropy	Depicts the degree of uncertainty in the magnitude distribution of the source.
Frequency points	FFT (fast Fourier transform): direct component, 1, 2, 3, 4, 5 Hz.

Feature	Description
Weinberg	$\text{Weinberg} = \sqrt[4]{a_{\max} - a_{\min}}$ . Weinberg utilizes the difference of the vertical acceleration values during the stride to estimate stride length. $a_{\max}$ and $a_{\min}$ represent the maximum and minimum acceleration values on the Z-axis in each stride, respectively.
Kim	$\text{Kim} = \sqrt[3]{\frac{\sum_{i = 1}^{N} | a_{i} |}{N}}$ . Kim estimates stride length based only on average acceleration during the stride. $a_{i}$ represents the measured acceleration value of the $i^{th}$ sample in each step, and N represents the number of samples corresponding to each step.
Scarlett	$\text{Scarlett} = \frac{\frac{\sum_{i = 1}^{N} | a_{i} |}{N} - a_{\min}}{a_{\max} - a_{\min}}$ . Scarlett eliminates the spring effect of the human gait and estimates stride length based on minimum, maximum, and average acceleration.

Modes	Precision (%)	Recall (%)	F-measure (%)	Support
Handheld	99.56	97.84	98.70	232
Calling	96.05	100.00	97.98	243
Pocket	97.32	99.09	98.20	110
Arm-hand	98.35	96.76	97.55	247
Swing	100.00	98.38	99.18	247
Avg/total	98.37	98.33	98.33	1079

Attributes	Regression Only		Regression Based on Smartphone Mode
Attributes	Error	Error Rate ¹	Error	Error Rate
Mean	0.058	5.12%	0.036	3.04%
Std	0.074	-	0.048	-
25%	0.019	1.36%	0.012	0.85%
50%	0.041	3.01%	0.025	1.85%
75%	0.073	5.18%	0.046	3.30%
95%	0.144	11.31%	0.095	7.03%

Segment	1	2	3	4	5	6	7	8	9	10	11	12	13	14
Scenarios	Office			Stair		Street					Skyway		Station	Street
Modes
Recognition Accuracy (%)	97.2	96.6	98.4	97.3	99.1	99.6	97.8	98.6	96.8	98.9	97.5	98.7	96.9	98.5

Attributes	Proposed	Tapeline [38]	Ladetto [22]	Weinberg [23]	Kim [24]
Avg error (m)	43.55	57.42	69.94	106.04	87.08
Avg error rate ¹ (%)	2.62	3.28	4.21	6.39	5.25

Type	Attributes	Proposed	Tapeline [38]	Ladetto [22]	Weinberg [23]	Kim [24]
Outdoor stadium	error (m)	22.56	29.87	38.44	56.36	49.13
Outdoor stadium	error rate (%)	2.51	3.33	4.28	6.28	5.47
Road with inclination	error (m)	10.21	14.02	16.42	25.41	22.24
Road with inclination	error rate (%)	2.79	3.83	4.48	6.93	6.07

	Training Dataset Size	Test Dataset Size	Training Time	Testing Time
Smartphone mode recognition	2000	1000	180.99 s	20.35 s
Stride length regression			9.13 s	2.77 s
Total			3 min 10.12 s	23.12 s
Tapeline	8000	1000	2 h 18 min 26 s	86.9 s