J R Stat Soc Ser C Appl Stat. 2024 Jul 29;73(5):1221-1241.
doi: 10.1093/jrsssc/qlae033. eCollection 2024 Nov.

Walking fingerprinting

Lily Koffman et al. J R Stat Soc Ser C Appl Stat. 2024.

Abstract

We consider the problem of predicting an individual's identity from accelerometry data collected during walking. In a previous paper, we transformed the accelerometry time series into an image by constructing the joint distribution of the acceleration and lagged acceleration for a vector of lags. Predictors derived by partitioning this image into grid cells were used in logistic regression to predict individuals. Here, we (a) implement machine learning methods for prediction using the grid cell-derived predictors; (b) derive inferential methods to screen for the most predictive grid cells while adjusting for correlation and multiple comparisons; and (c) develop a novel multivariate functional regression model that avoids partitioning the predictor space. Prediction methods are compared on two open-source accelerometry data sets: (a) 32 individuals walking on a 1.06 km path; and (b) 153 study participants, each completing six repetitions of walking on a 20 m path on two occasions at least 1 week apart. In the 32-individual study, all methods achieve at least 95% rank-1 accuracy, while in the 153-individual study, accuracy varies from 41% to 98%, depending on the method and prediction task. The methods provide insights into why some individuals are easier to predict than others.

Keywords: accelerometry; biometrics; functional data.
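
The grid cell-derived predictors described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `grid_cell_counts`, the cell width of 0.25, the upper bound of 3.0, and the 100 Hz sampling rate (so that a lag of one sample is one centisecond) are all assumptions made for illustration. For one second of acceleration magnitudes and one lag u, it counts how many pairs (v(s - u), v(s)) fall into each square grid cell.

```python
from collections import Counter
from math import floor


def grid_cell_counts(v, lag, cell_width=0.25, v_max=3.0):
    """Count (lagged, current) acceleration pairs per square grid cell.

    v          : acceleration magnitudes for one second (assumed 100 Hz)
    lag        : lag u in samples (centiseconds if sampled at 100 Hz)
    cell_width : side length of each grid cell (assumed value)
    v_max      : pairs with a value at or above this are dropped (assumed)
    """
    counts = Counter()
    for s in range(lag, len(v)):
        x, y = v[s - lag], v[s]  # the pair (v(s - u), v(s))
        if 0 <= x < v_max and 0 <= y < v_max:
            # Cell index (a, b) covers [a*w, (a+1)*w) x [b*w, (b+1)*w)
            cell = (floor(x / cell_width), floor(y / cell_width))
            counts[cell] += 1
    return counts


# Toy usage: four samples, lag of one sample
toy_second = [0.8, 0.9, 1.1, 0.8]
toy_counts = grid_cell_counts(toy_second, lag=1)
```

With the assumed cell width, cell index (3, 3) covers values in [0.75, 1.00) on both axes. Repeating the count for each lag and each second yields one vector of predictors per second, which could then be passed to any classifier.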

Conflict of interest statement

Conflicts of interest: Ciprian Crainiceanu is consulting for Bayer and Johnson and Johnson on methods development for wearable and implantable technologies. The details of these contracts are disclosed through the Johns Hopkins University eDisclose system. The research presented here is not related to and was not supported by this consulting work.

Figures

Figure 1.
Eight 3-s intervals shown from different study participants. Data are a single time series obtained as the sum of squares of observed accelerations along the three axes. The left panels provide the information about the identity of study participants, whereas the right panels do not. The questions are (a) among the individuals shown in the right plots, is there any individual whose data are displayed in the left panels? and (b) if yes, then which ones? The answer to the riddle is provided at the end of the discussion and in the acknowledgements.
Figure 2.
Subset of the empirical joint distribution of acceleration and lagged acceleration for Subject 19 in the IU data. The pairs {v_ij(s - u), v_ij(s)} are plotted for lags u = 1, 15, 30, 45 centiseconds (columns) and seconds j = 1, 2 (rows).
Figure 3.
Predictor extraction for Subject 19. The values of X_ijc for Subject 19 in the IU data are shown for u = 1, 15, 30, 45 (columns) and j = 1, 2 (rows). The white number in each grid cell is the value of X_ijc for that cell. For example, X_ijc = 61 for i = 19, j = 2, and c = [0.75, 1.00) × [0.75, 1.00), shown in the bottom-left corner of the plot. Only a subset of the grid cells is shown because the other grid cells have no observations for these two seconds.
Figure 4.
Classification metrics over varying number of seconds in testing data. First row: rank-1 accuracies; Second row: rank-5 accuracies. Each column corresponds to different data and prediction tasks. The lines show how accuracy for each model changes as the number of seconds averaged over in the testing data is increased.
Figure 5.
Significant grid cells from image partitioning, Subject 143, ZJU Session 1. (a) Unadjusted: grid cells that are significant in distinguishing Subject 143 from the other subjects in the ZJU S1 task. (b) Correlation and multiplicity adjusted: grid cells that remain significant after adjusting for correlation and multiplicity.
Figure 6.
Comparison of data from well and poorly predicted subjects. Panel (a) demonstrates a subset of the joint distribution for subject 5 (left) and subject 79 (right) in session 1 (top row) and session 2 (bottom row). The images are similar between sessions and these individuals were correctly identified in session 2 from their session 1 data. Panel (b) shows the same subset of the joint distribution for subject 3 (left) and subject 136 (right) in session 1 (top row) and session 2 (bottom row). The images do not look similar, and hence these individuals were not correctly predicted in the ZJU S1S2 task.
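
The rank-1 and rank-5 accuracies reported in the abstract and in Figure 4 can be illustrated with a small sketch. The helper names `rank_k_accuracy` and `average_scores` are hypothetical, and averaging per-second scores before ranking is an assumption about what "number of seconds averaged over" means; neither is taken from the paper's code. A prediction counts as a rank-k hit when the true subject is among the k highest-scoring candidates.

```python
def average_scores(per_second_scores):
    """Average subject -> score dicts across seconds (assumed aggregation)."""
    n = len(per_second_scores)
    subjects = per_second_scores[0].keys()
    return {sub: sum(d[sub] for d in per_second_scores) / n for sub in subjects}


def rank_k_accuracy(scores, truth, k):
    """Fraction of test cases whose true subject is in the top k scores.

    scores : list of dicts mapping subject label -> predicted score
    truth  : list of true subject labels, aligned with scores
    k      : number of top-ranked candidates that count as a hit
    """
    hits = 0
    for case_scores, true_label in zip(scores, truth):
        top_k = sorted(case_scores, key=case_scores.get, reverse=True)[:k]
        hits += true_label in top_k
    return hits / len(truth)


# Toy usage with three hypothetical subjects and two test cases
toy_scores = [
    {"A": 0.6, "B": 0.3, "C": 0.1},
    {"A": 0.2, "B": 0.5, "C": 0.3},
]
toy_truth = ["A", "C"]
rank1 = rank_k_accuracy(toy_scores, toy_truth, k=1)
rank2 = rank_k_accuracy(toy_scores, toy_truth, k=2)
```

In the toy data, the second case is missed at rank 1 but recovered at rank 2, which is why rank-5 accuracy is never lower than rank-1 accuracy for the same model.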
