Abstract:
Log data, captured during use of mobile health (mHealth) applications by
health providers, can play an important role in characterizing user engagement
with the application. The log data can also be employed in understanding health
provider work patterns and performance. However, given that these logs are raw
data, they require robust cleaning and curation if accurate conclusions are to be
derived from analyzing them. This paper describes a systematic data cleaning
process for mHealth-derived logs based on Broeck’s framework, which involves
iterative screening, diagnosis, and treatment of the log data. For this study, log data
from the demonstrative mUzima mHealth application are used. The employed data
cleaning process uncovered data inconsistencies, duplicate logs, and missing data within
logs that required imputation, among other issues. After the data cleaning process,
only 39,229 log records out of the initial 91,432 usage logs (42.9%) could be
included in the final dataset suitable for analyses of health provider work patterns.
This work highlights the significance of having a systematic data cleaning approach
for log data to derive useful information on health provider work patterns and
performance.
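
The screening, diagnosis, and treatment cycle summarized above can be illustrated with a minimal Python sketch. This is only an illustration under stated assumptions, not the procedure used in the study: the record fields (`id`, `user`, `action`, `ts`), the sample values, the drop/impute policy, and the mode-based imputation are all hypothetical, and a real cleaning pass would repeat the cycle until no new issues surface.

```python
from collections import Counter

# Hypothetical log records; field names and values are illustrative only
# and are not taken from the mUzima log schema.
logs = [
    {"id": 1, "user": "p01", "action": "open_form", "ts": "2019-03-01T08:00:00"},
    {"id": 2, "user": "p01", "action": "open_form", "ts": "2019-03-01T08:00:00"},  # duplicate
    {"id": 3, "user": "p02", "action": None, "ts": "2019-03-01T09:15:00"},         # missing action
    {"id": 4, "user": "p02", "action": "save_form", "ts": "2019-03-01T09:20:00"},
    {"id": 5, "user": "p03", "action": "save_form", "ts": None},                   # missing timestamp
    {"id": 6, "user": "p03", "action": "save_form", "ts": "2019-03-01T10:05:00"},
]


def screen(records):
    """Screening: flag suspect records without modifying them."""
    flags, seen = {}, set()
    for r in records:
        key = (r["user"], r["action"], r["ts"])
        if key in seen:
            flags[r["id"]] = "duplicate"
        elif r["ts"] is None:
            flags[r["id"]] = "missing_ts"
        elif r["action"] is None:
            flags[r["id"]] = "missing_action"
        seen.add(key)
    return flags


def diagnose(flags):
    """Diagnosis: map each flag to a handling decision (assumed policy)."""
    policy = {"duplicate": "drop", "missing_ts": "drop", "missing_action": "impute"}
    return {rid: policy[flag] for rid, flag in flags.items()}


def treat(records, decisions):
    """Treatment: drop or impute flagged records; keep the rest unchanged."""
    # Simple stand-in for imputation: use the most frequent observed action.
    known = [r["action"] for r in records if r["action"] is not None]
    mode_action = Counter(known).most_common(1)[0][0]
    cleaned = []
    for r in records:
        decision = decisions.get(r["id"])
        if decision == "drop":
            continue
        if decision == "impute":
            r = {**r, "action": mode_action}
        cleaned.append(r)
    return cleaned


flags = screen(logs)
decisions = diagnose(flags)
cleaned = treat(logs, decisions)
retention_pct = round(100 * len(cleaned) / len(logs), 1)
print(flags, [r["id"] for r in cleaned], retention_pct)
```

The retention figure is computed the same way as the proportion reported above, that is, retained records divided by initial records (39,229 / 91,432, or about 42.9%).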