J Med Internet Res. 2023 Dec 15;25:e44599.
doi: 10.2196/44599.

Tensorial Principal Component Analysis in Detecting Temporal Trajectories of Purchase Patterns in Loyalty Card Data: Retrospective Cohort Study

Reija Autio et al. J Med Internet Res.

Abstract

Background: Loyalty card data automatically collected by retailers provide an excellent source for evaluating health-related purchase behavior of customers. The data comprise information on every grocery purchase, including expenditures on product groups and the time of purchase for each customer. Such data, in which each customer has an expenditure value for every product group at each time point, can be formulated as 3D tensorial data.

Objective: This study aimed to use the modern tensorial principal component analysis (PCA) method to uncover the characteristics of health-related purchase patterns from loyalty card data. Another aim was to identify card holders with distinct purchase patterns. We also considered the interpretation, advantages, and challenges of tensorial PCA compared with standard PCA.

Methods: Loyalty card program members from the largest retailer in Finland were invited to participate in this study. Our LoCard data consist of the purchases of 7251 card holders who consented to the use of their data from the year 2016. The purchases were reclassified into 55 product groups and aggregated across 52 weeks. The data were then analyzed using tensorial PCA, allowing us to effectively reduce the time and product group-wise dimensions simultaneously. The augmentation method was used for selecting the suitable number of principal components for the analysis.
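The simultaneous reduction of the product group and week dimensions described above can be illustrated with a minimal sketch. It assumes one common formulation of tensorial PCA, in which a separate covariance matrix is estimated for each mode of the centered customer × product group × week array and each is eigendecomposed; this is not necessarily the authors' implementation. The function name `tensorial_pca`, its parameters, the choice of 3 components per mode, and the synthetic data are all hypothetical, and the augmentation method for selecting the number of components is not implemented here.

```python
import numpy as np


def tensorial_pca(X, n_products=3, n_weeks=3):
    """Sketch of mode-wise PCA for an array of shape (customers, product groups, weeks).

    Assumption: each mode gets its own covariance matrix, and the leading
    eigenvectors of each are used as loadings for that mode.
    """
    n, p, q = X.shape
    # Center every (product group, week) cell across customers.
    Xc = X - X.mean(axis=0, keepdims=True)

    # Mode-wise covariance matrices: p x p for product groups, q x q for weeks.
    cov_prod = np.einsum("ipt,irt->pr", Xc, Xc) / (n * q)
    cov_week = np.einsum("ipt,ips->ts", Xc, Xc) / (n * p)

    # eigh returns eigenvalues in ascending order, so reverse the columns
    # to put the leading components first.
    _, vec_prod = np.linalg.eigh(cov_prod)
    _, vec_week = np.linalg.eigh(cov_week)
    U_prod = vec_prod[:, ::-1][:, :n_products]
    U_week = vec_week[:, ::-1][:, :n_weeks]

    # Project each customer's matrix from both sides:
    # scores[i] = U_prod.T @ Xc[i] @ U_week
    scores = np.einsum("pa,ipt,tb->iab", U_prod, Xc, U_week)
    return scores, U_prod, U_week


# Synthetic stand-in data (NOT the LoCard data): 200 customers,
# 55 product groups, 52 weeks of non-negative expenditures.
rng = np.random.default_rng(0)
X = rng.gamma(shape=2.0, scale=1.0, size=(200, 55, 52))

scores, U_prod, U_week = tensorial_pca(X)
print(scores.shape, U_prod.shape, U_week.shape)
```

In this sketch, `scores[i, a, b]` is the score of customer `i` for the combination of product group component `a` and week component `b`, which is why both dimensions have to be read together when interpreting a score.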

Results: Using tensorial PCA, we were able to systematically search for typical food purchasing patterns across time and product groups as well as detect different purchasing behaviors across groups of card holders. For example, we identified customers who purchased large amounts of meat products and separated them further into groups based on time profiles, that is, customers whose purchases of meat remained stable, increased, or decreased throughout the year or varied between seasons of the year.

Conclusions: Using tensorial PCA, we can effectively examine customers' purchasing behavior in more detail than with traditional methods because it can handle time and product group dimensions simultaneously. When interpreting the results, both time and product dimensions must be considered. In further analyses, these time and product groups can be directly associated with additional consumer characteristics such as socioeconomic and demographic predictors of dietary patterns. In addition, they can be linked to external factors that impact grocery purchases such as inflation and unexpected pandemics. This enables us to identify what types of people have specific purchasing patterns, which can help in the development of ways in which consumers can be steered toward making healthier food choices.

Keywords: diet; food; food expenditure; loyalty card data; principal components; purchase pattern; seasonality; tensorial data.

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

Figure 1
Money spent (per €1000) by product group across customers and the average purchase basket of participants (the y-axis shows the percentage of money spent on each product group). Only the product groups covering >1% of all purchases are illustrated. In total, 27 product groups, accounting for 11.01% of all purchases, were omitted.
Figure 2
The upper line graph shows the percentage of purchases made in each week. The dashed red line marks 1.9% (1/52), the weekly share if all purchases were distributed evenly across the year. The heat map shows the weekly pattern of the total money spent on each product group (rows); the color indicates the row-wise z score for each product group. Holiday weeks 12, 25, and 51 clearly stand out. The figure also shows the summed patterns for the product groups: some are purchased more often during summer (eg, beer, wine, and cider), whereas others are purchased more during winter (eg, frozen fruits and frozen vegetables). The summed data are clustered using correlation distance and average linkage.
Figure 3
Loadings of tensorial principal component analysis components (x-axis) for (A) weeks and (B) product groups. Red indicates a high positive loading, and green indicates a high negative loading, both equally interesting.
Figure 4
Illustration of the average purchase pattern of the card owners having the highest (A) and lowest (B) 10% scores of the third principal component of weeks (PC3—summer vs winter) and the first principal component of product groups (PC1—product average). The values have been standardized against all customers, after which the averages were computed, that is, a dark green value indicates that the average money spent within the illustrated group on each product group is 0.4 SDs lower than the average across all customers and a strong red value indicates that the average money spent within the illustrated group on each product group is 0.4 SDs higher than the average across all customers.
Figure 5
Average percentages of weekly expenditures on ready-to-eat foods, with the deciles divided based on the product group PC2-ready-to-eat and time PC1-weekly average (A), time PC2-spring versus autumn (B), and time PC3-summer versus winter (C). (A) The first time component finds the groups of participants with different levels but temporally stable purchase behavior for ready-to-eat food. (B) The second time component and especially its extreme deciles reveal the participants with increased or decreased use of ready-to-eat food. (C) With the third time component, we can identify participants with seasonal change in the ready-to-eat food purchase pattern.
Figure 6
Average percentages of weekly expenditures on red meat, with the deciles divided based on the product group PC3-red meat and time PC1-weekly average (A), time PC2-spring versus autumn (B), and time PC3-summer versus winter (C). (A) The first time component finds the groups of participants with different levels but temporally stable purchase behavior for red meat. (B) The second time component reveals the participants with increased or decreased meat purchases. (C) The third time component detects the participants with a summer versus winter difference in meat purchases.
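
The decile-based comparisons in Figures 4 to 6 can be sketched as follows: customers are ranked by one score, the highest and lowest 10% are selected, and the average standardized purchase pattern of each group is computed. This is a minimal illustration under stated assumptions, not the authors' code. The function name `extreme_decile_profiles` is hypothetical, the data are synthetic, and the score used here is a simple summer-minus-winter contrast standing in for a tensorial PCA score.

```python
import numpy as np


def extreme_decile_profiles(X, score):
    """Average standardized purchase pattern of the top and bottom 10% by score.

    X: array of shape (customers, product groups, weeks).
    score: array of shape (customers,), one score per customer.
    """
    # Standardize each (product group, week) cell against all customers.
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant cells
    Z = (X - X.mean(axis=0)) / std

    # Cut points for the lowest and highest deciles of the score.
    lo_cut, hi_cut = np.quantile(score, [0.1, 0.9])
    high = Z[score >= hi_cut].mean(axis=0)
    low = Z[score <= lo_cut].mean(axis=0)
    return high, low


# Synthetic stand-in data (NOT the LoCard data).
rng = np.random.default_rng(1)
X = rng.gamma(shape=2.0, scale=1.0, size=(500, 55, 52))

# Hypothetical score: mean spending in weeks 21-33 minus weeks 1-13,
# used here only as a placeholder for a "summer vs winter" component score.
summer, winter = slice(20, 33), slice(0, 13)
score = X[:, :, summer].mean(axis=(1, 2)) - X[:, :, winter].mean(axis=(1, 2))

high, low = extreme_decile_profiles(X, score)
print(high.shape, low.shape)
```

Each returned array has one value per product group and week, expressed in SDs relative to all customers, which is the scale described in the Figure 4 caption.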
