Analysis of supermarket grocery data for prediction of nutritional and health outcomes at population level

November 4, 2022

Agnes Kiragga, Maureen Ng'etich, Steve Cygu, and Elizabeth Kimani

Globally, non-communicable diseases (NCDs) are the leading cause of death. The majority of NCDs are driven by changes in lifestyle and diet. Diseases including hypertension, diabetes, and cardiovascular diseases are associated with high consumption of sugary, salty, or oily foods. While undernutrition is still a widespread concern in Africa, nutrition-related conditions such as obesity have become a rising epidemic due to urbanization, changes in lifestyle, and income growth.

Diet is traditionally measured using records of food and drink consumption, which are often self-reported. There has been an increase in online tools, including diet-disease surveys, that investigate nutrient composition data. An alternative to keeping nutrient consumption diaries is to use food and drink purchases. On the assumption that purchased food is later consumed, purchase records could reflect individual or household dietary patterns. Supermarket data is currently underutilized in determining nutritional trends, particularly in Africa, despite the ballooning number of grocery stores. Compared to traditional markets, supermarkets provide more processed foods high in fat, sugar, and salt, which could affect health outcomes in the form of increased prevalence of non-communicable diseases. We therefore aim to explore the possibility of harnessing supermarket sales data to estimate the uptake of key nutrients and their likely impact on the rising incidence of diet-related non-communicable diseases, in order to inform food regulation policies. Our aims include:

Aim 1: To explore and map the availability, variety, trends, and categories of food- and nutrition-related data that can be shared from grocery data in Kenyan supermarkets.

Hypothesis: We hypothesize that 20-30% of supermarket proprietors will be willing to share their data for research and that approximately 50% of the data will be usable for research.

Aim 2: To create a minimum standardized and harmonized dataset of key grocery data to support future health-related research projects.

Hypothesis: Over 50% of digital data systems in supermarkets can be harmonized and used to generate a standardized minimum dataset for health research using grocery data.

Aim 3: To use a standardized grocery dataset to explore the purchasing habits of residents in Nairobi and to predict outcomes of unhealthy consumption using mathematical modelling and data science.

Hypothesis: A dynamic mathematical model can be used to predict future health burdens from consumer data with at least 60% accuracy.

Aim 4: To use the data generated in Aims 1-3 to inform a policy analysis of the social, economic, and political variables that influence policy decisions on the sale of unhealthy foods, and the potential impact on regulatory policies. The policy analysis will provide an understanding of the local environment and the policies governing food imports, as well as the governance of supermarkets and the likely effect of health-related policies on economic gains in Kenya. The proposed policy analysis will have many facets, involving empirical research and statistical data collection, as well as the participation of key stakeholders such as economists, community members, and public officials tasked with enacting policies.

Potential impact: The proposed work combines expertise from different themes and units at the Center. It has the potential to advance our understanding of food systems and their influence on overall health through NCDs. The proposed policy analysis will provide information to guide the enactment of new policies or the amendment of existing policies and frameworks on the sale of healthy food and the governance of supermarkets in Kenya.