
Challenges arise at every step of Big Health Data management, from data collection at the source through to application. In this article, Silverio A, et al. describe the challenges of big health data and how to deal with them, including:
Missing Data: Missing data in big health data cannot be fully controlled. The article therefore suggests several methods for handling large amounts of missing data, such as imputation techniques, mixed-effects regression models, generalized estimating equations, and inference. When less than 10% of the data are missing, the suggested methods should preserve the original distribution and avoid introducing outliers.
Selection Bias: Large-scale health data from EHRs carry many confounding factors because the data come from different sources and interventions and are inconsistent and independent. Big data analyses are observational studies that reflect actual cases in the real world. I believe that information from big data analysis can be an important element in supporting the design of randomized controlled trials to confirm a hypothesis.
Data Analysis and Training: I personally believe that more clinicians and researchers are becoming interested in training on big data analysis using appropriate statistical and methodological tools. However, only a few researchers are able to interpret more complex data. It would be great if clinical researchers themselves were trained in informatics, coding, data analysis, etc. Well-trained researchers also need to be available as advisors.
Data Privacy and Ethical Issues: Health data in Thailand are mostly centralized. Data confidentiality and privacy are primary concerns. The Personal Data Protection Act (PDPA) has been implemented in Thailand to handle ethical issues. Patients give broad consent to allow their health data to be used in further research. However, we also need to ensure that our system security is strong enough to protect the data from cyberattackers.
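
To make the missing-data point above more concrete, here is a minimal Python sketch of one imputation technique. It is only an illustration, not the method used in the article: it uses simple median imputation as a stand-in for the imputation family, the blood pressure readings are made up, and the function names `missing_fraction` and `impute_median` are hypothetical helpers introduced for this example.

```python
from statistics import median
from typing import List, Optional, Sequence


def missing_fraction(values: Sequence[Optional[float]]) -> float:
    """Return the share of entries that are missing (None)."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


def impute_median(values: Sequence[Optional[float]]) -> List[float]:
    """Replace each missing entry with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    fill = median(observed)
    return [fill if v is None else v for v in values]


# Made-up systolic blood pressure readings; None marks a missing value.
systolic = [120.0, 135.0, None, 128.0, 142.0, 118.0,
            131.0, 125.0, 138.0, 122.0, 129.0]

frac = missing_fraction(systolic)
completed = impute_median(systolic)

# Report how much was missing before imputing (here 1 of 11 readings).
print(f"missing: {frac:.1%}")
print(completed)
```

With only one reading in eleven missing, filling the gap with the median leaves the overall distribution almost unchanged and cannot create an outlier. Single imputation like this does understate variability as the missing share grows, which is one reason to prefer the model-based approaches listed above (mixed-effects regression, generalized estimating equations) for larger gaps.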