Analysis of the impact of the data cleaning process on malnutrition indicators

  • Agustín Nicolás Dramis, Grupo de Bioestadística Aplicada (GBA, FCEyN-UBA), Consejo Nacional de Investigaciones Científicas y Técnicas de Argentina (CONICET)
  • María Soledad Fernández, Grupo de Bioestadística Aplicada (GBA, FCEyN-UBA), Consejo Nacional de Investigaciones Científicas y Técnicas de Argentina (CONICET)
  • Adriana Alicia Pérez, Grupo de Bioestadística Aplicada (GBA, FCEyN-UBA)
  • Pablo Guillermo Turjanski, Grupo de Bioestadística Aplicada (GBA, FCEyN-UBA), Consejo Nacional de Investigaciones Científicas y Técnicas de Argentina (CONICET)
Keywords: Data Quality - Simulation - Anthropometric data

Abstract

The systematic recording of anthropometric measurements makes it possible to evaluate the nutritional status of populations and provides a fundamental input for designing, directing, and evaluating public policies. These measurements are usually entered manually by healthcare professionals, a process prone to data entry errors that can distort the assessment of the population's nutritional status. To address this issue, the WHO introduced guidelines for removing individually implausible values. However, these guidelines are not considered sufficient to detect all errors, and other available methods detect longitudinal inconsistencies across records of the same individual. In this study, we simulated an anthropometric database (based on a real one) and randomly introduced four types of errors described in the literature. We then measured the impact of these errors, and of the cross-sectional and longitudinal cleaning processes, on the prevalence of a malnutrition indicator. The prevalence increased after each type of error was introduced and converged towards its original value after the cleaning processes were applied. This highlights the importance of cleaning the data before analyzing nutritional indicators.
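
The workflow described in the abstract (simulate, introduce errors, clean, compare prevalence) can be illustrated with a minimal sketch. It is not the authors' code and makes several assumptions that the abstract does not state: the indicator is taken to be stunting (height-for-age z-score below -2), z-scores are drawn from a standard normal distribution rather than from a real database, a single hypothetical error type (a fixed downward shift of the z-score) stands in for the four error types from the literature, and only the cross-sectional cleaning step is shown, using the WHO flag range for height-for-age z-scores (below -6 or above +6). All function names, rates, and the shift value are illustrative.

```python
import random

STUNTING_CUTOFF = -2.0            # assumed indicator: height-for-age z-score < -2
FLAG_LOW, FLAG_HIGH = -6.0, 6.0   # WHO flag range for height-for-age z-scores


def prevalence(z_scores):
    """Share of records whose z-score falls below the malnutrition cutoff."""
    return sum(z < STUNTING_CUTOFF for z in z_scores) / len(z_scores)


def inject_errors(z_scores, rate, shift, rng):
    """Shift a random fraction of z-scores (hypothetical stand-in error type)."""
    return [z + shift if rng.random() < rate else z for z in z_scores]


def clean_cross_sectional(z_scores):
    """Drop records whose z-score lies outside the plausible (unflagged) range."""
    return [z for z in z_scores if FLAG_LOW <= z <= FLAG_HIGH]


rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Simulated population: 20,000 z-scores from a standard normal (an assumption).
original = [rng.gauss(0.0, 1.0) for _ in range(20_000)]

# Corrupt roughly 5% of the records with a large downward shift.
with_errors = inject_errors(original, rate=0.05, shift=-8.0, rng=rng)

# Cross-sectional cleaning: remove individually implausible values.
cleaned = clean_cross_sectional(with_errors)

p_original = prevalence(original)
p_errors = prevalence(with_errors)
p_cleaned = prevalence(cleaned)

print(f"original:    {p_original:.4f}")
print(f"with errors: {p_errors:.4f}")
print(f"cleaned:     {p_cleaned:.4f}")
```

Under these assumptions, the prevalence computed on the corrupted data is higher than on the original data, and the cleaned value moves back towards the original one. A few shifted records remain inside the plausible range and survive this step, which is the kind of residual error that longitudinal checks on the same individual are meant to catch.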

Published
2023-07-21
How to Cite
Dramis, A., Fernández, M., Pérez, A., & Turjanski, P. (2023). Analysis of the impact of the data cleaning process on malnutrition indicators. Proceedings of JAIIO, 9(5), 20-27. Retrieved from https://ojs.sadio.org.ar/index.php/JAIIO/article/view/740
Section
CAIS - Congreso Argentino de Informática y Salud