Activity Recognition Using a Single Accelerometer Placed at the Wrist or Ankle : Medicine & Science in Sports & Exercise


APPLIED SCIENCES

Activity Recognition Using a Single Accelerometer Placed at the Wrist or Ankle

MANNINI, ANDREA1; INTILLE, STEPHEN S.2; ROSENBERGER, MARY3; SABATINI, ANGELO M.1; HASKELL, WILLIAM3

Medicine & Science in Sports & Exercise 45(11):p 2193-2203, November 2013. | DOI: 10.1249/MSS.0b013e31829736d6

Abstract

Purpose 

Large physical activity surveillance projects such as the UK Biobank and NHANES are using wrist-worn accelerometer-based activity monitors that collect raw data. The goal is to increase wear time by asking subjects to wear the monitors on the wrist instead of the hip, and then to use information in the raw signal to improve activity type and intensity estimation. The purpose of this work was to obtain an algorithm to process wrist and ankle raw data and to classify behavior into four broad activity classes: ambulation, cycling, sedentary, and other activities.

Methods 

Participants (N = 33) wearing accelerometers on the wrist and ankle performed 26 daily activities. The accelerometer data were collected, cleaned, and preprocessed to extract features that characterize 2-, 4-, and 12.8-s data windows. Feature vectors encoding information about frequency and intensity of motion extracted from analysis of the raw signal were used with a support vector machine classifier to identify a subject’s activity. Results were compared with categories classified by a human observer. Algorithms were validated using a leave-one-subject-out strategy. The computational complexity of each processing step was also evaluated.

Results 

With 12.8-s windows, the proposed strategy showed high classification accuracies for ankle data (95.0%) that decreased to 84.7% for wrist data. Shorter (4 s) windows only minimally decreased performances of the algorithm on the wrist to 84.2%.

Conclusions 

A classification algorithm using 13 features shows good classification into the four classes given the complexity of the activities in the original data set. The algorithm is computationally efficient and could be implemented in real time on mobile devices with only 4-s latency.

The accurate quantification of daily physical activity and energy expenditure would advance science and assist with proper management of pathologies such as obesity, diabetes, and cardiovascular diseases (21). Accelerometer-based activity monitors are capable of quantifying human motion, covering the range of acceleration amplitudes and frequencies required to measure human movement (4). Moreover, their low power consumption, small dimensions, and light weight contribute to ease of wearability and make long-term activity monitoring practical. Accelerometer-based activity monitors often integrate raw accelerometer data over short windows of time (“epochs”) into “activity counts” (28). Activity counts summarize data in an epoch, simplifying data management, analysis, and interpretation; however, this discards information about the structure of the raw accelerometer signal that algorithms could otherwise use to infer specific activity type (3,17), gait parameters (gait phase detection, walking speed estimation) (16,23), posture transition and balance (18,33), rehabilitation progress (20), and the detection of falls (5,9). Activity type information inferred from raw data analysis can be used to improve estimates of energy expenditure (1,10) or walk/run speed (26) by switching regression parameters to those tuned to the recognized activity.

Activity classification using accelerometers can be obtained using one or more sensors on the body. A multisensor configuration is preferable because it allows detecting a larger variety and finer complexity of activities by capturing both upper and lower body motion independently (3). Although multisensor systems are becoming more practical, single-sensor systems may simplify study administration and lower study administrative cost and are therefore current practice in larger studies. With a single sensor, common location choices are the hip, thigh, upper arm, wrist, or ankle. The hip location has been used extensively in physical activity measurement studies. It generally captures major body motions, but algorithms using hip data can underestimate overall expenditure on activities such as bicycling or arm ergometry, where the hip movement is not proportional to movement of the limbs (30). Hip sensors must be attached to the outside of clothing or worn on a belt, and that can lead to lower wear-time compliance, especially during sleep. Water-resistant wrist-worn devices can be used without discomfort during activities of daily living, including sleep, and can remain on even if clothes are changed and do not require a special belt or clip, thus leading to improved wear time (12). However, given the large amount of gesticulation, involving upper limbs, that does not correspond to large body movement (and therefore large energy expenditure), estimating energy expenditure from wrist movement counts may introduce more error than the same calculations at the hip (22). Nevertheless, the NHANES and UK Biobank surveillance studies switched from the hip to the wrist location in the most recent round of data collection to improve wear time (6); the intent is to log and use raw data analysis to characterize activity despite the additional wrist gesticulation. 
Preliminary data from the 2011–2014 NHANES study suggest that wrist placement will result in higher wear time (27), but processing the raw data to characterize overall activity is challenging because of wrist gesticulation and variability in movement. For example, one study that compared different sensor placement sites reported lower activity type detection performance for the wrist with respect to other locations for the detection of sit-to-stand or lie-to-stand transitions (2). The wrist may move differently during the same activity, depending on what the hand is holding or stabilizing. For example, the raw motion signature for the wrist while walking holding a heavy bag, walking with a full cup, or walking with a mobile phone held to the head may differ. Several recent studies explored the problem of detecting activities from wrist-worn sensors (8,14,25,31,32). As described in the discussion section, the prior studies have limitations due to the amount of data tested, the complexity of the activities tested, or the validation approach used when reporting results.

An alternative placement location that deserves consideration is just above the ankle, worn under clothing. Drawbacks cited for the ankle location are concerns about tight-fitting boots and concerns that subjects will refuse to wear an ankle-worn device that may resemble an alcohol monitor or police tether. A sufficiently small and thin sensor may, however, minimize these problems. As with the wrist location, a sensor could be attached to the ankle under the clothing and left on day and night, perhaps increasing wear time and compliance. Further, the ankle placement site may be particularly useful for pedestrian navigation and gait analysis purposes (24), such as gait segmentation (19) or walking speed estimation (15). Despite the potential benefits, we are unaware of prior work studying the effect of ankle placement on data analysis, wear time, and acceptability.

In this work, we replicated a recently reported algorithm for detecting activity class from raw accelerometer data collected at the wrist (32) and tested it on a data set with 33 participants performing a set of daily physical activities. We aimed at classifying activities within four classes (ambulation, cycling, sedentary activities, and other activities), using a leave-one-subject-out validation to characterize algorithm performance. Various combinations of window lengths (i.e., the amount of data acquired to give a single classification output) and feature sets (sets of variables used for classification purposes) were tested to develop an algorithm. The algorithm is computationally efficient and suitable for implementation on mobile phones, supporting real-time interventions that use wrist-mounted wireless accelerometers. Real-time capabilities allow fast feedback to the user that may be useful in future systems designed to use this information to improve compliance.

MATERIALS AND METHODS

Data acquisition and annotation

This study uses a data set of acceleration data tagged with activity type that was previously acquired for other studies from 42 adults recruited from the Stanford, California, community. Stanford University’s Human Use Committee approved the protocol, and written informed consent was obtained from all subjects before participation. Triaxial accelerometers called Wockets (12) were attached using custom Velcro bands to each participant’s ankle, thigh, hip, upper arm, and wrist. The wrist Wocket was placed on the dorsal aspect of the dominant wrist midway between the radial and the ulnar process. The ankle Wocket was placed on the outside of the ankle, just above the lateral malleolus. The thigh Wocket was located on the anterior thigh midway between the top of the patella and the inguinal fold. The arm sensor was worn over the lateral side of the arm midway between the shoulder and the elbow. Arm and thigh Wockets were attached with both adhesive tape and a sleeve worn over the Wocket and around the sensor. The hip Wocket was worn on a belt around the participant’s waist on the dominant side of the body. The Wockets are small, thin, and lightweight devices (43 × 30 × 7 mm, 13 g) that include a triaxial accelerometer (MMA7331LCR1), a microprocessor (ATMEGA1284P), a Bluetooth radio (RN-41), and a rechargeable lithium battery (3.7 V, 240 mA·h). They are optimized for long-term wearability for physical activity monitoring studies, where mobile phones are used for data collection. Raw acceleration data (range ± 4g, g = 9.81 m·s−2) were acquired at 90 Hz and sent using the Bluetooth wireless protocol to a smartphone.

The experimental protocol consisted of asking participants to perform a guided sequence of laboratory-based physical activities and simulated daily activities. Activities were annotated during the execution of tasks using a voice recorder, and then timings on the voice recording were used to annotate start/stop times for specific activities being observed. Data and annotation were synchronized using custom software (12). Twenty-six activities with more than 0.5 min of steady state data were labeled in the original data set. Those activities have been clustered into four more general categories for this study: sedentary (lying, sitting, Internet search, reading, typing, writing, sorting files or paperwork, and standing still), cycling (indoor and outdoor), ambulation (natural walking, treadmill walking, carrying a box, and stairs up/down), and other activities (sweeping with broom and painting with roller or brush). In the current data set, other sedentary activities such as driving a car or riding public transit were not available. Data that were not labeled or for which the label was “unknown” were discarded. Multitasking behaviors were not allowed during experiments, except for the activity walking-carrying a load.

Data from nine participants were discarded due to high data loss or to technical problems affecting the wrist or ankle sensor, as reported in notes taken by the staff at the time of data collection. Ankle and wrist data from the remaining 33 participants (11 men and 22 women, age = 18–75 yr, height = 168.5 ± 9.3 cm [range 149–189], weight = 70.0 ± 15.6 kg [range 48–114]) were imported into the Mathworks Matlab (version 7.6, Natick, MA) environment, which was used for all evaluations described. All other available data were discarded as they were not pertinent to the aims of this study. The data set and Matlab code used in this study are available to interested researchers (http://mhealth.ccs.neu.edu/datasets). The data set was acquired with a protocol designed to encourage natural behavior within a laboratory setting. Participants were told what to do but not how to do it, and staff annotated the activities as previously described.

This data collection procedure allows for natural participant variability in how activities are performed but can also lead to errors in annotation at activity transitions due to reaction time when labeling. For this reason, in this work, we discarded one window (12.8 s) before and after each label transition. When smaller windows were considered, to keep the analysis consistent, 12 s before and after each transition were still discarded, corresponding to three or six windows for 4- and 2-s windows, respectively. Another type of error is that some short activity changes are not labeled at all. For example, the data set contains examples where a participant stops briefly during nontreadmill walking, such as at a door that had to be opened. In such cases, even though the participant is standing still briefly, the label for the data is still “ambulation.” Some errors can be detected by using the ankle acceleration recordings because in data labeled as “ambulation,” the signal magnitude vector (SMV), $\mathrm{SMV} = \sqrt{a_x^2 + a_y^2 + a_z^2}$ (where $a_x$, $a_y$, and $a_z$ are the accelerations along the three sensor axes), of the ankle sensor must show movement. We therefore use the ankle sensor to identify these errors in labeling and correct labels indicating ambulation. In particular, 2-s windows labeled as “ambulation” with an SMV SD less than 0.1 g are marked as labeling errors and discarded. In the results section, data with this correction are referred to as corrected data.
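The ankle-based label check described above can be sketched in a few lines. This is an illustrative sketch, not the authors' Matlab code: the function names are hypothetical, and the use of the sample (rather than population) standard deviation is an assumption, since the text does not say which was used.

```python
import math
import statistics

def smv(sample):
    """Signal magnitude of one (x, y, z) acceleration sample, in g."""
    x, y, z = sample
    return math.sqrt(x * x + y * y + z * z)

def is_mislabeled_ambulation(window, label, sd_threshold_g=0.1):
    """Return True if a 2-s ankle window labeled 'ambulation' barely moves.

    window: list of (x, y, z) samples in g (about 180 samples at 90 Hz).
    The 0.1 g threshold comes from the text; sample SD is an assumption.
    """
    if label != "ambulation":
        return False
    magnitudes = [smv(s) for s in window]
    return statistics.stdev(magnitudes) < sd_threshold_g
```

Windows for which the check returns True would be dropped from both the ankle and the wrist data sets, matching the "corrected data" described in the text.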

Data loss due to Bluetooth wireless transmission errors was handled by discarding windows with less than 80% of the number of samples expected at the nominal 90-Hz sampling rate. In such cases, a new window was started at the end of the data gap. Some fluctuations in the sampling rate may occur in the remaining windows due to the wireless connection. Before extracting frequency domain features, the SMV samples in each window were linearly interpolated to obtain the same number of samples in every window.
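A minimal sketch of these two steps follows, assuming each window carries its own sample timestamps. The function names and the choice to interpolate onto evenly spaced instants between the first and last timestamp are illustrative assumptions, not details given in the text.

```python
def has_enough_samples(n_samples, window_s, rate_hz=90.0, min_fraction=0.8):
    """Keep a window only if it holds at least 80% of the expected samples."""
    expected = window_s * rate_hz
    return n_samples >= min_fraction * expected

def resample_linear(times, values, n_out):
    """Linearly interpolate (times, values) onto n_out evenly spaced instants.

    times must be increasing; the output spans times[0] to times[-1].
    """
    if len(times) != len(values) or len(times) < 2 or n_out < 2:
        raise ValueError("need matching inputs with at least two samples")
    t0, t1 = times[0], times[-1]
    step = (t1 - t0) / (n_out - 1)
    out = []
    j = 0  # index of the left neighbor of the current output instant
    for k in range(n_out):
        t = t0 + k * step
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        span = times[j + 1] - times[j]
        w = (t - times[j]) / span if span > 0 else 0.0
        out.append(values[j] + w * (values[j + 1] - values[j]))
    return out
```

For a 12.8-s window at 90 Hz the expected count is 1152 samples, so under this rule a window with fewer than 922 samples would be discarded.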

Data preprocessing and features evaluation

Three-axis raw accelerometer data were preprocessed to extract the SMV according to the previously introduced definition. The resulting 90-Hz SMV signal is independent of the orientation of the sensing node. The SMV were low-pass filtered using a 15-Hz cutoff fourth-order Butterworth filter to limit the bandwidth of the signal to the frequencies common in human motion (4), removing high frequency noise.

To classify data within the four defined activity classes, the SMV data were broken into 12.8-s nonoverlapping windows. This window size was proposed by Zhang et al. (32). The window lengths of 4 and 2 s were also tested because shorter windows reduce latency in providing feedback in real-time implementations and would therefore be preferable. Windowed portions of signals were processed to extract features commonly used in raw data processing of accelerometer signals (3) and the feature set proposed by Zhang et al. (32). Mean and SD values of SMV were considered jointly with a time-frequency analysis of 12.8-s windows. The analysis of power spectral density is aimed at characterizing (a) total power in the frequencies between 0.3 and 15 Hz, (b) first and second dominant frequencies and their powers in the same frequency band, (c) dominant frequency in the band 0.6–2.5 Hz and its power, (d) the ratio between the power of the first dominant frequency and the total power (0.3–15 Hz), and (e) the ratio between the dominant frequency of the current segment and the segment before. We also considered two features based on wavelet transforms found to improve classification accuracy in Zhang et al. (32):

[The equations defining the two wavelet features are missing from this copy.]

Here, d_j is the decomposed signal at level j of the SMV. The selected wavelet was the Daubechies “db10” for its close match to walking data (32). The maximum level considered for decomposition was J = 8, whereas the levels considered for the evaluation of these features were α = 5 and β = 6. We also introduced two simple features evaluated on each windowed portion of SMV: the minimum and the maximum values within the window.
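The spectral features listed above (total band power, dominant frequency and its power, and their ratio) can be illustrated with a small sketch. It uses a naive discrete Fourier transform from the standard library for clarity rather than speed; the function names, the mean removal, and the power normalization are assumptions, and the wavelet features are not covered.

```python
import cmath
import math

def power_spectrum(signal, rate_hz):
    """One-sided power spectrum of a mean-removed signal via a naive DFT."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [v - mean for v in signal]
    freqs, powers = [], []
    for k in range(n // 2 + 1):
        acc = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                  for i, v in enumerate(centered))
        freqs.append(k * rate_hz / n)
        powers.append(abs(acc) ** 2 / n)
    return freqs, powers

def band_features(signal, rate_hz, lo_hz=0.3, hi_hz=15.0):
    """Total power, dominant frequency, its power, and their ratio in a band."""
    freqs, powers = power_spectrum(signal, rate_hz)
    band = [(p, f) for f, p in zip(freqs, powers) if lo_hz <= f <= hi_hz]
    total = sum(p for p, _ in band)
    peak_power, peak_freq = max(band)
    return {
        "total_power": total,
        "dominant_freq": peak_freq,
        "dominant_power": peak_power,
        "power_ratio": peak_power / total if total > 0 else 0.0,
    }
```

Calling `band_features(smv_window, 90.0, 0.6, 2.5)` would give the dominant frequency in the narrower 0.6–2.5 Hz band mentioned in item (c). The time-domain features (mean, SD, minimum, maximum) follow directly from the windowed SMV values.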

Computationally simple features are preferred to those that require substantial processing such as wavelet analysis if the complex features only modestly improve results. Simple features minimize latencies and maximize battery life of computing devices that run the classification algorithms. Algorithms that permit real-time implementations may ultimately be used in measurement and intervention studies that provide real-time feedback to participants based on detected activity. To identify the best trade-off between accuracy and classifier complexity, eight different feature set combinations were evaluated. In particular, the effect of removing the wavelet-based features from the training set was investigated.

Classifier validation

Preliminary testing showed that the highest classification rates were achieved using support vector machine (SVM) classifiers (29). The best outcomes were achieved using a radial basis function kernel with upper complexity bound C = 100 and γ = 0.1. SVM classifiers are desirable because the optimization criteria are convex, which implies that a global optimal solution exists (29), and many toolboxes exist that simplify application of the algorithms to particular data sets. Here, the SVM implementation from the LibSVM toolbox (7) was used.

Two different validation approaches were compared for both ankle and wrist data. The first approach was n-fold cross validation (n-fold validation) (13). In this approach, data (consisting of the windowed sections of data-label pairs) are randomized and divided into n different subsets (folds). The algorithm is trained on n−1 subsets and tested on the remaining one. The second approach was leave-one-subject-out cross validation (LOSO validation) (11). In this case, the subsets correspond to data from the various participants. Recognition models were trained on data from all subjects except one that is used for the test phase. In both cases, the procedure was repeated to test all data. At the end of the procedure, results are aggregated by summing all the obtained confusion matrices. Cross-validation is a well-established technique used in pattern recognition experiments to avoid training and testing on the same data when only small data sets are available (13). The drawback of the n-fold validation approach is that temporally adjacent bits of data may be split into the training and test sets, encouraging the algorithm to overfit the data, inflating positive results. The LOSO validation approach avoids this problem and is more likely to generalize to new data; it is therefore a preferable method.
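The LOSO procedure can be sketched as follows. This is an illustrative outline with hypothetical function names; the training and prediction steps are left to whichever classifier is used.

```python
def loso_splits(subject_ids):
    """Yield (held_out, train_idx, test_idx), holding out each subject once.

    subject_ids[i] is the participant who produced window i.
    """
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test

def add_confusion(total, fold):
    """Element-wise sum of two square confusion matrices (lists of rows)."""
    return [[a + b for a, b in zip(row_t, row_f)]
            for row_t, row_f in zip(total, fold)]
```

Because the split is made on participant identity rather than on shuffled windows, no window from the held-out participant, including windows temporally adjacent to test windows, can reach the training set.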

Results are evaluated in terms of overall accuracy and F1 score for each class. The F1 score is defined as the harmonic mean of precision and recall, F1 = 2(precision × recall)/(precision + recall), where precision = TP / (TP + FP) and recall = TP / (TP + FN). True positives (TP) are data correctly classified within the selected class. False positives (FP) are those data that are incorrectly classified as belonging to the selected class. False negatives (FN) are data belonging to the selected class that are incorrectly classified in another one. The F1 score merges information about precision and recall in a single number; it ranges from 0 to 1, where 1 is a perfect classification.
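As a sketch of how these metrics follow from an aggregated confusion matrix, the code below assumes rows are true classes and columns are predicted classes (an assumption about layout; the function names are hypothetical). It uses the equivalent form F1 = 2TP / (2TP + FP + FN).

```python
def f1_per_class(confusion):
    """Per-class F1 scores; confusion[i][j] = true class i predicted as j."""
    n = len(confusion)
    scores = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                       # missed members of c
        fp = sum(confusion[r][c] for r in range(n)) - tp  # wrongly assigned to c
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores

def overall_accuracy(confusion):
    """Fraction of all windows that lie on the diagonal."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total
```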

RESULTS

As a first step, preliminary activity classification outcomes obtained by testing the algorithm performance with different feature sets (FS) are shown in Table 1. Corrected wrist data were processed using LOSO validation. The effect of discarding features is evaluated: time-frequency analysis (FS 2), wavelet transform (FS 3), mean and SD (FS 4), and maximum and minimum value (FS 5). FS 6 to FS 8 show the effects of discarding sets of the features. The introduction of the “minimum” and “maximum” values improves the classification accuracy by 1.8 percentage points (going from FS 6 to FS 3), whereas removing the wavelet-based features reduces the wrist data classification accuracy by 0.5 percentage points. All subsequent results presented here therefore refer to FS 3, which achieves an 84.7% overall accuracy with LOSO validation.

TABLE 1: Results classifying 12.8-s windows of activity into four activity classes from wrist data using an SVM recognizer and LOSO cross validation with 10 different FS.

TABLE 2: Wrist and ankle classification confusion matrices for the four target activity groups using the SVM classifier with FS 3.

Table 2 shows the results of the activity classification problem in terms of confusion matrices that were obtained before and after the introduction of the ankle data–based correction. The amount of ambulation data discarded by this correction is 1.1% of the ambulation data (0.4% of the total). The ankle data–based correction was applied to both ankle and wrist data sets to remove windows that were clearly mislabeled as ambulation given the ankle data. Results are shown for both wrist and ankle data, considering the two proposed validation approaches: parts A and B contain LOSO validation of uncorrected and corrected data, respectively, and part C shows results obtained with 10-fold validation of corrected data. Results for 10-fold validation of uncorrected data were omitted for brevity. Table 2B shows that the wrist-worn sensor was ∼10 percentage points less accurate than the ankle-worn sensor, which performed strongly on the selected activities with LOSO validation (wrist = 84.7%, ankle = 95.0%). Most errors for the wrist occur in the cycling class, and the accuracy of walking detection is 87.2%.

A detailed version of the confusion matrices shown in Table 2B is presented in Table 3, and classification accuracies by participant and activity category are reported in Figure 1. These data provide insight into the nature of the classification errors, as discussed later.

TABLE 3: Wrist and ankle classification details showing category recognition for each specific activity type.

FIGURE 1: Comparison of wrist/ankle classification accuracies by participant and activity class, showing the large variability in detection of the cycling and other classes from the wrist data.

Time-frequency analysis improves with longer windows, but increasing window size reduces the time resolution of outcomes and increases latency in a real-time implementation. The 12.8-s windows introduce substantial latency. Table 4 shows LOSO validation results for the ankle and wrist corrected data, varying the size of windows between 12.8, 4, and 2 s.

TABLE 4: Results using the SVM classifier on the corrected data and FS 3 with three different sizes of windows.

For the real-time implementation of the algorithm, the computational complexity of feature computation and the classification algorithm must be considered against overall performance. On a 1.6-GHz Intel Centrino2 processor with 4-GB RAM using Mathworks Matlab R2008b on a 64-bit Windows 7 operating system, the algorithm using 12.8-s windows was measured on the data set to require the following processing time per window: 6.0 ± 0.5 ms (mean ± SD) to preprocess the window, 104.9 ± 4.0 ms to evaluate features from it, and 0.6 ± 0.1 ms to classify it according to the trained classification rules and parameters. The total time needed for classifying each window is therefore 111.5 ± 4.1 ms. Using shorter windows (4 s) requires 96.8 ± 4.2 ms per window. Introducing the wavelet analysis increases the time by ∼12% (13 ms for 12.8-s windows and 12 ms for 4-s windows, on average).
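As a rough check on what these timings imply, the per-window total and the fraction of each window's duration spent on processing can be worked out from the reported means. The figures apply only to the test machine described above; a mobile processor would behave differently.

```latex
\begin{align*}
t_{12.8} &= 6.0 + 104.9 + 0.6 = 111.5\ \text{ms}, &
\frac{t_{12.8}}{12.8\ \text{s}} &= \frac{0.1115}{12.8} \approx 0.87\%,\\
t_{4} &= 96.8\ \text{ms}, &
\frac{t_{4}}{4\ \text{s}} &= \frac{0.0968}{4} \approx 2.4\%.
\end{align*}
```

In both cases the processing time is a small fraction of the window duration, so on that machine the latency is dominated by the time needed to collect the window rather than by computation.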

DISCUSSION

In this work, we obtained a classification algorithm that shows good assignment of a wide variety of activities into four distinct classes using raw data from a triaxial accelerometer worn either on the wrist or ankle. The algorithm is computationally efficient and could be implemented in real time with short latency to recognize the activity of the user.

In Table 5, our activity classification algorithm is compared with state-of-the-art solutions. The solutions or studies published to date on this topic have four limitations. The first limitation is that the algorithms have been trained and evaluated on small pools of participants with little data per participant (8,14,25,31). A second limitation is that some studies used n-fold validation but not LOSO validation (8,25,32). The third limitation is that some studies used 50% overlapping windows (8,14,31). With overlapping windows, some of the same data appears in two windows, potentially inflating recognition results—especially if this overlapping window technique is applied jointly with n-fold validation, as in the study of Chernbumroong et al. (8).

TABLE 5: Recent results using raw data for activity classification from the wrist, as compared with the approach presented in this study.

Another limitation of the prior research, and perhaps the most important, is that the algorithms proposed may have been tested on data sets that fail to represent nonlaboratory behavior. The prior studies evaluate algorithms on stereotypical activities performed for only a few seconds. As a consequence, the total amount of classified minutes of wrist data studied and the total minutes per activity class recognized were limited, as shown in Table 5. The variability of the behavior within activities lasting for only a few seconds is likely to be reduced, especially for highly structured behaviors such as treadmill walking. For example, the small postural changes someone might make while sitting in a chair reading or while working on a computer typing are more likely to be recorded if the data logging focuses on activities lasting more than a few seconds. All previous studies used short activities except the sport activities classification by Siirtola et al. (25), in which activities lasted for 10 min. That study, however, did not include sedentary activities of daily living where wrist sensor data may be difficult to interpret; some are included in our protocol (see Table 3).

The classification rates achieved by Zhang et al. (32) are higher than those reported here. This work uses the LOSO validation instead of the n-fold validation because LOSO is a more realistic test. Further, this work includes cycling data, which appears to be the most confounding activity for wrist-based activity recognition. In Zhang et al., according to the reported confusion matrices (and assuming nonoverlapping windows), 331 min of data was classified. Here, 1609 and 1633 min of recordings, for wrist and ankle, respectively, were classified. This difference in the number of available windows per class between the wrist and the ankle sensors is due to gaps in data related to the wireless Bluetooth data transmission.

As expected, activity classification results obtained on our data set with 10-fold validation are higher (1.9 percentage points for the wrist data) than those using a LOSO validation (see Table 2, parts B and C). The 10-fold validation uses a subset of data from all participants for training the model: it is less affected by data intersubject variability than the LOSO validation approach in which the tested subject is not considered in the definition of classification rules. Using LOSO validation is a better test because it ensures that absolutely no data from an individual has been included in the training set, and therefore bias is not introduced into the testing set.

Tables 2 and 3 highlight some challenges with wrist-based activity recognition:

  • Cycling activities are more difficult to detect than the other classes from wrist data, due to the stable position of the wrist on the bike handle (see also the lower F1 score values for cycling in Tables 1 and 4). Only the study of Siirtola et al. (25) included cycling.
  • Assuming that natural walking is typically level walking at a self-selected speed (∼3 mph), the “treadmill 0% incline 3 mph” activity is likely to be similar to natural walking, and ambulation detection is best for these two activities on the wrist. Other variations on walking, such as different speeds and inclinations, increase error rates. This increase may result from having fewer training instances for these walking variations.
  • The classifier is capable of recognizing ambulation from wrist data even if the participant is carrying a load, which would alter the nature of the signal. The previous studies do not include examples of carrying objects that might affect ambulatory wrist movement.
  • Using ankle data, the most challenging classification category is the “other activities” class. In this case, the participant may perform the subactivities without moving his or her feet. This is evident for the “painting” subclass, whereas the “sweeping with broom” subclass yields higher classification rates probably because it involves moving the feet.
  • Sedentary activities are often classified as “other activities” if the subject is in an upright position or if the subject is sorting files or paperwork. This is more evident for wrist data, probably because the hand movement in these activities can be similar to the “other activities” class.
  • Downhill cycling using the ankle sensor location is recognized less well than level cycling and exercise bike pedaling. This is probably related to the frequency of pedaling, which is lower in uphill conditions and may be zero while going downhill. The Siirtola et al. (25) study that included cycling had only cycling and spinning activities without any distinction in the incline of the path being reported.

Figure 1 shows performance by participant, highlighting person-specific outliers such as the large variability in the “cycling” and “other activities” detection from the wrist data. Figure 1 also shows differences in classification performance between the ankle and the wrist data. For example, with cycling wrist classification, it is evident that data from some participants are better classified than others (see, e.g., 2 and 3 and 4 and 5). This high intersubject variability suggests that further improvement may require introduction of a subject-specific adaptation of classification rules, where parameters of the trained classifier are tailored to the particular participant’s behavior with minimal intervention by the user. This subject-specific adaptation, common in fields such as speech recognition, is one of our focuses for future research. In doing so, the critical importance of accurate labeling of activities, discussed earlier, must be considered.

In this work, we clustered 26 activity types into four broad activity classes of ambulation, cycling, sedentary, and other activities. We also evaluated the same algorithm on the same data set using the simpler binary classification problem detecting “ambulation” versus “everything else.” The LOSO overall accuracy for wrist data rises to 93.5% in this case. This confirms that the recognition of ambulation from wrist data is feasible, even with an algorithm suitable for real-time implementation.

Window length

To better describe typical everyday activities, it would be preferable to limit window length to smaller values. In fact, many posture and ambulation “bouts” are intermixed and last less than 12.8 s. In Table 4, it is confirmed that by reducing the time length of windows from 12.8 to 4 s, overall classification rates for wrist data slightly decrease. A further reduction to 2-s windows results in a greater decrease in classification rates. The ankle data recognition is only modestly affected by shorter window lengths. The cycling data for the wrist sensor show that the highest classification accuracy is obtained with 2-s windows. For some activities such as cycling, shorter windows may be preferable because there is less likely to be variability during the window (e.g., stopping pedaling), which may create a misclassified activity due to a transition in the type of movement. Moreover, more windows are available for training if they are shorter, and this may improve classifier performances. The results on this data set suggest that in real-time systems, window lengths could be reduced from 12.8 to 4 s, reducing latency substantially, with little effect on overall performance.
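To illustrate how window length changes the number of windows available once data around label transitions are excluded, the sketch below lays nonoverlapping windows on a fixed grid and drops any that touch a guard zone around a transition. The function name, the fixed grid starting at time zero, and the numerical tolerance are illustrative assumptions; the 12-s guard is the value the methods give for the 4- and 2-s windows.

```python
def window_starts(duration_s, window_s, transitions_s=(), guard_s=12.0):
    """Start times of nonoverlapping windows that avoid all guard zones.

    A window is kept only if it ends at least guard_s before, or starts at
    least guard_s after, every label transition in transitions_s.
    """
    eps = 1e-9  # tolerance for floating-point window boundaries
    starts = []
    k = 0
    while (k + 1) * window_s <= duration_s + eps:
        start, end = k * window_s, (k + 1) * window_s
        clear = all(end <= t - guard_s + eps or start >= t + guard_s - eps
                    for t in transitions_s)
        if clear:
            starts.append(start)
        k += 1
    return starts
```

For a hypothetical 60-s recording with one transition at 32 s, 4-s windows leave nine usable windows and 2-s windows leave eighteen, so halving the window length doubles the number of training instances while halving the latency.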

Window length affects computational efficiency. The presented results were obtained after testing several FS combinations. As shown in Table 1, wavelet transform features slightly improve the classification accuracy. However, the improvement of 0.5 percentage points comes at the cost of a 12% increase in computation time, which is substantial when considering real-time implementation on mobile devices. This is the reason why FS 3 from Table 1 was used in this work. The improvement in classification accuracy achieved by introducing the minimum and maximum SMV value features is larger than that obtained with the wavelet features. This is not surprising because it is evident from the data that the range of recorded SMV of the acceleration differs for different activities, and, as a comparison of columns FS 2, FS 3, and FS 7 in Table 1 shows, the time-frequency analysis features already provide most of the benefit of the wavelet features.
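The minimum and maximum SMV features are inexpensive to compute, as the following sketch suggests. It assumes that the SMV of a sample is the Euclidean norm of the three acceleration axes expressed in g; this definition, the function names, and the toy samples are assumptions made for illustration.

```python
import math
from typing import List, Tuple

Sample = Tuple[float, float, float]  # (x, y, z) acceleration in g


def smv(sample: Sample) -> float:
    """Magnitude of one three-axis sample (assumed Euclidean norm)."""
    x, y, z = sample
    return math.sqrt(x * x + y * y + z * z)


def smv_range_features(window: List[Sample]) -> Tuple[float, float]:
    """Return (minimum SMV, maximum SMV) over one window of samples."""
    if not window:
        raise ValueError("window must contain at least one sample")
    magnitudes = [smv(s) for s in window]
    return min(magnitudes), max(magnitudes)


# Toy windows (not study data): a still sensor reads about 1 g throughout,
# whereas vigorous movement widens the range of magnitudes.
still_window = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]
active_window = [(3.0, 4.0, 0.0), (0.0, 0.0, 1.0)]

still_features = smv_range_features(still_window)
active_features = smv_range_features(active_window)
```

Each feature requires a single pass over the window, in contrast to a wavelet decomposition, which is consistent with the trade-off between accuracy and computation time discussed above.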

Generalizability of approach

The proposed methodology could be reproduced on any device that outputs raw three-axis acceleration data at 40 Hz or higher; it is not dependent on the Wocket sensors. Devices with a dynamic range other than ±4g might require that the algorithms be retrained or that the raw data be mapped into the ±4g range. In this work, a 90-Hz sampling rate was used because it was the upper limit of our system and because data recorded at a higher rate can always be downsampled. However, the frequency content of human movement is limited, and tests confirmed that the sampling rate could be reduced to 40 Hz without degrading classification accuracy.
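A minimal sketch of how data from another device might be adapted is given below. It treats “mapping into the ±4g range” as simple clipping and reduces the sampling rate by linear interpolation; both choices, as well as the function names, are assumptions made for illustration. A practical pipeline would normally apply a low-pass (anti-aliasing) filter before reducing the rate, which is omitted here.

```python
from typing import List


def clip_to_range(samples: List[float], limit_g: float = 4.0) -> List[float]:
    """Clip each sample (in g) to the interval [-limit_g, +limit_g]."""
    return [max(-limit_g, min(limit_g, s)) for s in samples]


def resample_linear(samples: List[float], rate_in: float,
                    rate_out: float) -> List[float]:
    """Resample a single-axis signal by linear interpolation.

    No anti-aliasing filter is applied; this is a simplification.
    """
    if len(samples) < 2 or rate_in <= 0 or rate_out <= 0:
        raise ValueError("need at least two samples and positive rates")
    duration_s = (len(samples) - 1) / rate_in
    n_out = int(duration_s * rate_out) + 1
    out = []
    for k in range(n_out):
        pos = k * rate_in / rate_out          # position in input-sample units
        i = min(int(pos), len(samples) - 2)   # left neighbor index
        frac = pos - i                        # distance past the left neighbor
        out.append(samples[i] * (1.0 - frac) + samples[i + 1] * frac)
    return out


# One second of a ramp sampled at 90 Hz (91 samples), reduced to 40 Hz.
ramp_90hz = [float(i) for i in range(91)]
ramp_40hz = resample_linear(ramp_90hz, 90.0, 40.0)
clipped = clip_to_range([-6.0, 0.5, 7.2])
```

Because 90/40 is not an integer ratio, simple decimation (keeping every nth sample) is not possible, which is why interpolation is used in this sketch.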

Most of the computational complexity of using SVM classification occurs during the training of the classifier. Once the models are built from the training data, the algorithm can be run efficiently in real time. Processing a 12.8-s window takes slightly longer than processing a 4-s window, although the classification step is more complex in the 4-s case because a larger training data set may result in a larger number of support vectors that must be processed. Therefore, a real-time system using 4-s windows would reduce not only latency but also overall computational complexity. This would make the real-time implementation of activity class detection more computationally feasible on mobile devices.
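The dependence of classification cost on the number of support vectors can be seen in a sketch of the decision function of a single binary SVM. A radial basis function kernel is assumed here purely for illustration, because the kernel is not specified in this section; the function names and the toy model are hypothetical. A four-class problem would combine several such binary machines.

```python
import math
from typing import List, Sequence


def rbf_kernel(a: Sequence[float], b: Sequence[float], gamma: float) -> float:
    """Radial basis function kernel between two feature vectors."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def svm_decision(x: Sequence[float],
                 support_vectors: List[Sequence[float]],
                 dual_coefs: List[float],
                 bias: float,
                 gamma: float) -> float:
    """Decision value of one binary SVM for feature vector x.

    One kernel evaluation is needed per support vector, so the cost grows
    linearly with the number of support vectors and with the feature count.
    """
    total = sum(c * rbf_kernel(sv, x, gamma)
                for sv, c in zip(support_vectors, dual_coefs))
    return total + bias


# Toy model (not a trained classifier): two support vectors in two dimensions.
toy_svs = [[0.0, 0.0], [1.0, 1.0]]
toy_coefs = [1.0, -1.0]

near_first = svm_decision([0.0, 0.0], toy_svs, toy_coefs, bias=0.0, gamma=1.0)
near_second = svm_decision([1.0, 1.0], toy_svs, toy_coefs, bias=0.0, gamma=1.0)
```

The sign of the decision value selects the class; with a 13-feature vector, each window costs one 13-dimensional kernel evaluation per support vector, independent of the window length once the features are extracted.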

Transition cropping and ankle-based correction of wrist data

Research assistants were trained to record activities in real time on a portable computing device, but it is still difficult to annotate behavior in real time without making mistakes at some activity transitions. We discarded some windows around activity transitions to minimize data labeling inaccuracies that might affect the results. It is worth noting that the overall LOSO validation accuracy for wrist data does not change significantly when all data are used without this transition rejection, whereas the ankle data classification accuracy is reduced by 2.3 percentage points when transition data are used. The erroneous classifications of some ambulation data as sedentary activity for wrist data (see Table 2A) are resolved by cleaning the data using the ankle-based correction (see Table 2B). Using the corrected data also reduces errors classifying sedentary data as ambulation because, after the windows mislabeled as ambulation are removed, overall performance improves. The errors in labeling that we were able to discover by comparing the wrist and ankle data demonstrate the importance of providing the training algorithm with a cleanly labeled data set: even a small percentage of mislabeled data affects performance on a small data set. Despite laboratory conditions and human “gold standard” direct-observation labeling, we still found errors in our data because of the rapid switching between ambulation and nonambulation during everyday movement. As new mobile systems are developed that require end users to provide calibration data, the importance of accurate activity labeling must be considered carefully. If labeling is obtained in less-controlled conditions (e.g., by end users or the systems themselves), errors are likely to be far more common. Using both wrist and ankle sensors in future work may provide one method by which labeled ambulation data may be verified to improve training of wrist-only algorithms.
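The two cleaning steps can be sketched as simple filters over per-window labels. The sketch below is an illustration under stated assumptions rather than the procedure used in this study: the margin of one window around each transition, the rule that an “ambulation” annotation is dropped when an ankle-derived flag contradicts it, and all function names are hypothetical.

```python
from typing import List


def crop_transitions(labels: List[str], margin: int = 1) -> List[int]:
    """Indices of windows kept after discarding those near a label change.

    For a change between windows i-1 and i, windows i-margin .. i+margin-1
    are discarded (the margin value is an assumption).
    """
    n = len(labels)
    drop = set()
    for i in range(1, n):
        if labels[i] != labels[i - 1]:
            drop.update(range(max(0, i - margin), min(n, i + margin)))
    return [i for i in range(n) if i not in drop]


def drop_ankle_disagreements(annotated: List[str],
                             ankle_ambulation: List[bool]) -> List[int]:
    """Indices of windows kept after removing suspect ambulation labels.

    A window annotated as ambulation is removed when the ankle-derived flag
    indicates no ambulation (hypothetical rule for illustration).
    """
    return [i for i, (lab, flag) in enumerate(zip(annotated, ankle_ambulation))
            if not (lab == "ambulation" and not flag)]


# Toy annotation sequence (not study data).
labels = ["sedentary", "sedentary", "sedentary", "ambulation",
          "ambulation", "ambulation", "sedentary", "sedentary"]
kept_after_crop = crop_transitions(labels, margin=1)

kept_after_ankle = drop_ankle_disagreements(
    ["ambulation", "ambulation", "sedentary"], [True, False, False])
```

Both filters return the indices of retained windows, so the same selection can be applied to the wrist features, the ankle features, and the labels.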

To make a fair evaluation of classification accuracy possible, all windows containing multiple activities were discarded from both the training and test phases. As a consequence, this work establishes an upper bound on the overall performance of the algorithm. In future work, we intend to deal with such transition windows in the test phase, providing information on the likelihood of each window being classified in each of the available classes (i.e., a soft-assignment labeling approach). Results could then be compared with the likelihoods of the labels for any given segment: rather than detecting a single activity, the algorithm would estimate the likelihood of each activity, and that estimate would be compared with the likelihoods of all possible labels. A strong assumption made in this work is that people are not multitasking during most activities, and specifically not during the activities we asked them to perform. As a consequence, new algorithms may be necessary to detect complex multitasking behaviors. Complex situations like walking and talking on the phone or walking and pushing a stroller are left to future work.
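One possible form of such a soft-assignment comparison is sketched below. The soft label of a window is taken to be the fraction of its samples annotated with each class, and the agreement with the classifier output is measured by the total variation distance; both choices, the function names, and the toy values are assumptions made for illustration and are not part of the present study.

```python
from collections import Counter
from typing import Dict, List, Sequence

CLASSES = ("ambulation", "cycling", "sedentary", "other")


def soft_label(sample_annotations: List[str],
               classes: Sequence[str] = CLASSES) -> Dict[str, float]:
    """Fraction of samples in one window annotated with each class."""
    if not sample_annotations:
        raise ValueError("window must contain at least one annotated sample")
    counts = Counter(sample_annotations)
    n = len(sample_annotations)
    return {c: counts.get(c, 0) / n for c in classes}


def total_variation(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Half the L1 distance between two class distributions.

    0 means identical distributions; 1 means no overlap.
    """
    return 0.5 * sum(abs(p[c] - q[c]) for c in p)


# Toy transition window: three quarters sedentary, one quarter ambulation.
target = soft_label(["sedentary", "sedentary", "sedentary", "ambulation"])

# Hypothetical classifier output for the same window.
predicted = {"ambulation": 0.5, "cycling": 0.0, "sedentary": 0.5, "other": 0.0}

distance = total_variation(target, predicted)
```

With this formulation, a transition window no longer has to be discarded or forced into a single class, and a classifier is penalized in proportion to how far its class likelihoods are from the annotated mixture.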

Study strengths and weaknesses

This study has several weaknesses: a relatively small and homogeneous sample of adults; a selected group of physical activities that account for only a portion of the time and activities many adults perform throughout their day; and the use of data collected only during steady-state activities. Study strengths include data collection according to well-defined and executed protocols, extensive cleaning of data using customized software to ensure accuracy, advanced data analytical procedures, an unbiased validation approach, and direct comparison of results with recently published data.

CONCLUSIONS

As large surveillance studies move toward using wrist-worn accelerometers that collect raw data, the research community will need methods to compute summary statistics on the raw data. Of particular importance is the reliable detection of sedentary versus ambulatory activity because there are many sedentary activities that involve relatively large amounts of wrist movement (e.g., animated talking, keyboard typing, and repetitive desk work). Ambulatory activities such as walking typically involve repetitive, cyclic motion of various parts of the body, and so intuitively one way to identify such movement from raw accelerometer data at the wrist might be to use frequency-domain features. In addition, we might expect a monitor on the ankle, which would rarely move repetitively and extensively except during ambulation, to perform better than a sensor at the wrist. In this work, we have confirmed this intuition on a data set of 26 activities collected from 33 people. We find that when trying to detect whether a person is engaged in one of four categories of activities (ambulation, cycling, sedentary activities, and other activities), a good solution is to use frequency-domain features (plus several other simple features) computed on short windows (12.8 s or 4 s) of raw data. On the same data set, the ankle outperforms the wrist by 10.3 percentage points (95.0% vs 84.7% with 12.8-s windows). We find that once frequency-domain features are included, the addition of the more computationally complex wavelet features provides only modest improvements that probably do not justify the computational cost, especially when looking toward a future in which devices will provide real-time (low-latency) feedback as part of just-in-time interventions.

This study was funded by the National Heart, Lung and Blood Institute, National Institutes of Health award no. 5UO1HL091737 to the Massachusetts Institute of Technology (Stephen Intille, PI) with a subaward to Stanford University (William Haskell, PI). Mr. Mannini was funded by the Italian Ministry of Education, Universities and Research, MIUR. The present study does not constitute endorsement by the American College of Sports Medicine. The authors have no conflicts of interest to disclose.

Fahd Albinali and Jason Nawyn provided help with the Wocket sensors and data collection and data cleaning.

REFERENCES

1. Albinali F, Intille SS, Haskell W, Rosenberger M. Using wearable activity type detection to improve physical activity energy expenditure estimation. In: Proceedings of the 12th ACM International Conference on Ubiquitous Computing. Copenhagen, Denmark; 2010. pp. 311–20.
2. Atallah L, Lo B, King R, Yang G-Z. Sensor positioning for activity recognition using wearable accelerometer. IEEE Trans Biomed Circuits Sys. 2011; 5 (4): 320–29.
3. Bao L, Intille SS. Activity recognition from user-annotated acceleration data. Pervasive. 2004; 301: 1–17.
4. Bhattacharya A, McCutcheon EP, Shvartz E, Greenleaf JE. Body acceleration distribution and O2 uptake in humans during running and jumping. J Appl Physiol. 1980; 49 (5): 881–7.
5. Bourke K, O’Donovan K, Olaighin G. The identification of vertical velocity profiles using an inertial sensor to investigate pre-impact detection of falls. Med Eng Phys. 2008; 30 (7): 937–46.
6. Centre UBC. Category 2 Enhanced Phenotyping at Baseline Assessment Visit in last 100–150,000 Participants. Stockport Cheshire: UK Biobank Coordinating Centre; 2009. p. 1–33.
7. Chang C-C, Lin C-J. LIBSVM: a library for support vector machines. ACM Trans Intel Syst Tech. 2011; 2 (27): 1–27.
8. Chernbumroong S, Atkins AS, Yu H. Activity classification using a single wrist-worn accelerometer. In: Proceedings of the 5th International Conference on Software, Knowledge Information, Industrial Management and Applications; 2011. pp. 1–6.
9. Degen T, Jaeckel H, Rufer M, Wyss S. SPEEDY: a fall detector in a wrist watch. In: Proceedings of the 7th IEEE Symposium on Wearable Computing; 2003. pp. 184–9.
10. Dongwoo K, Kim HC. Activity energy expenditure assessment system based on activity classification using multi-site triaxial accelerometers. In: Proceedings of the 29th IEEE International Engineering in Medicine and Biology Conference. Lyon, France; 2007. pp. 2285–7.
11. Esterman M, Tamber-Rosenau BJ, Chiu Y-C, Yantis S. Avoiding non-independence in fMRI data analysis: leave one subject out. NeuroImage. 2010; 50 (2): 572–6.
12. Intille SS, Albinali F, Mota S, Kuris B, Botana P, Haskell WL. Design of a wearable physical activity monitoring system using mobile phones and accelerometers. In: Proceedings of the 33rd IEEE International Eng in Medicine and Biology Conference. Boston, MA; 2011. pp. 3636–39.
13. Jain AK, Duin RPW, Mao J. Statistical pattern recognition: a review. IEEE Trans Pattern Anal Mach Intell. 2000; 22 (1): 4–37.
14. Kao T-P, Lin C-W, Wang J-S. Development of a portable activity detector for daily activity recognition. In: Proceedings of the IEEE International Symposium on Industrial Electronics. Seoul; 2009. pp. 115–20.
15. Li Q, Young M, Naing V, Donelan JM. Walking speed estimation using a shank-mounted inertial measurement unit. J Biomech. 2010; 43 (8): 1640–3.
16. Mannini A, Sabatini AM. Gait phase detection and discrimination between walking-jogging activities using hidden Markov models applied to foot motion data from a gyroscope. Gait Posture. 2012; 36 (4): 657–61.
17. Mathie MJ, Coster ACF, Lovell NH, Celler BG. Accelerometry: providing an integrated, practical method for long-term, ambulatory monitoring of human movement. Physiol Meas. 2004; 25 (2): R1–20.
18. Mayagoitia RE, Lotters JC, Veltink PH, Hermens H. Standing balance evaluation using a triaxial accelerometer. Gait Posture. 2002; 16 (1): 55–9.
19. Pappas IPI, Popovic MR, Keller T, Dietz V, Morari M. A reliable gait phase detection system. IEEE Trans Neural Syst Rehabil Eng. 2001; 9 (2): 113–25.
20. Patel S, Hughes R, Hester T, et al. A novel approach to monitor rehabilitation outcomes in stroke survivors using wearable technology. Proc IEEE Inst Electr Electron Eng. 2010; 98 (3): 450–61.
21. Plasqui G, Westerterp KR. Physical activity assessment with accelerometers: an evaluation against doubly labeled water. Obesity. 2007; 15 (10): 2371–9.
22. Rosenberger M, Haskell WL, Albinali F, Mota S, Nawyn J, Intille S. Estimating activity and sedentary behavior from an accelerometer on the hip or wrist. Med Sci Sports Exerc. 2013; 45 (5): 964–75.
23. Rueterbories J, Spaich EG, Larsen B, Andersen OK. Methods for gait event detection and analysis in ambulatory systems. Med Eng Phys. 2010; 32 (6): 545–52.
24. Sabatini AM. Quaternion-based strap-down integration method for applications of inertial sensing to gait analysis. Med Biol Eng Comput. 2005; 43 (2002): 94–101.
25. Siirtola P, Laurinen P, Haapalainen E, Roning J, Kinnunen H. Clustering-based activity classification with a wrist-worn accelerometer using basic features. In: Proceedings of the IEEE Symposium on Computational Intelligence and Data Mining. Nashville, TN; 2009. pp. 95–100.
26. Song Y, Shin S, Kim S, Lee D, Lee KH. Speed estimation from a tri-axial accelerometer using neural networks. In: Proceedings of the 29th IEEE Engineering in Medicine and Biology Society Conference. Lyon, France; 2007. pp. 3224–7.
27. Troiano R, Mc Clain J. Objective measures of physical activity, sleep, and strength in U.S. National Health and Nutrition Examination Survey (NHANES) 2011–2014. In: Proceedings of the 8th International Conference on Diet and Activity Methods. Rome, Italy; 2012.
28. Troiano RP, Berrigan D, Dodd KW, Mâsse LC, Tilert T, McDowell M. Physical activity in the United States measured by accelerometer. Med Sci Sports Exerc. 2008; 40 (1): 181–8.
29. Vapnik V. The Nature of Statistical Learning Theory. New York: Springer; 2000. p. 314.
30. Welk GJ, Blair SN, Wood K, Jones S, Thompson RW. A comparative evaluation of three accelerometry-based physical activity monitors. Med Sci Sports Exerc. 2000; 32 (9Suppl): S489–97.
31. Yang J-Y, Chen Y-P, Lee G-Y, Liou S-N, Wang J-S. Activity recognition using one triaxial accelerometer: a neuro-fuzzy classifier with feature reduction. Lect Notes Comput Sci. 2007; 4740: 395–400.
32. Zhang S, Rowlands AV, Murray P, Hurst TL. Physical activity classification using the GENEA wrist-worn accelerometer. Med Sci Sports Exerc. 2012; 44 (4): 742–8.
33. Zijlstra W, Bisseling RW, Schlumbohm S, Baldus H. A body-fixed-sensor-based analysis of power during sit-to-stand movements. Gait Posture. 2010; 31 (2): 272–8.
Keywords:

ACTIVITY CLASSIFICATION; INERTIAL SENSING; MOBILE HEALTH; LEAVE-ONE-SUBJECT-OUT VALIDATION; ACTIVITY MEASUREMENT; ENERGY EXPENDITURE

© 2013 American College of Sports Medicine