
First of all, this paper is very good in mentioning how useful and important of big health data. Particularly when they mentioned that stent thrombosis in patient with CAD undergoing PCI was found through large-scale studies.
There are many challenges for research in using big health data indicated in this paper:
– Disease definition/ Unstructured data: disease definition may be varied among countries. By using clinical coding standards (such as ICD-10) in classifying diseases and health problem could help in heterogeneous of disease classification. Yet, the data from each EHR and sources can be different and could cause big obstacles in collecting and analyzing of data. Therefore, interoperability standards should be set among systems, hospitals, and stakeholders. This will ease data transfer and management.
– Legal and ethical issues: especially when many countries have started to be enacted PDPA just like Thailand, or HIPPA rules which was legislated in the States. This could be difficult for future research in using patient’s data in the past. I would suggest that hospital should have agreement at first by asking for consent in sharing medical data for example, laboratory results or imaging results etc. In addition, we must make sure that patient identifiable information must be blinded prior to sharing. Nonetheless, the ethical committee should be in part in considering what research can or cannot do.
– Data security/ Data integrity: researcher team must have specialized IT staff in managing information and system security. The data system should have policy of access control and audit trail to make sure that data are complete, consistent, and accurate to original. Furthermore, backing up data method is crucial to prevent data lost.
– Data quality and missing data/ Data inconsistency/ Training: researchers must be well trained to make sure that data are completely and correctly collected. If we implement interoperability standards at first, we will have a set of information that must be collected from subjects. Thus, this helps in prevent missing of data. In addition, our lives would be easier by having IT helps in checking consistency of data. Most importantly, to solve missing data and ensure quality of data, researchers must also have competence and knowledge in data analytics with multiple testing, in order to know which methods could solve those complexities.
There are many more solutions and suggestions for each challenge mentioned above. However, some of them might or might not be applicable depend on real world setting.