- This topic has 14 replies, 8 voices, and was last updated 1 month, 3 weeks ago by Wirichada Pan-ngum.
2025-01-07 at 11:32 am #46452
Wirichada Pan-ngum
Keymaster
What are your suggestions on coping with those challenges? (10 marks)
2025-01-27 at 4:34 pm #46709
Siriluk Dungdawadueng
Participant
I would like to share some suggestions for coping with the challenges associated with using big health data in cardiovascular research and clinical care:
Missing Data
– Data Imputation Techniques: Use statistical methods to estimate and fill in missing values.
– Data Quality Improvement: Implement standardized data collection protocols to minimize missing data.
– Collaborative Data Sharing: Encourage data sharing among institutions to fill gaps.
Selection Bias
– Random Sampling: Use random sampling techniques to ensure a representative sample.
– Propensity Score Matching: Match patients with similar characteristics to reduce bias.
– Sensitivity Analysis: Conduct sensitivity analyses to assess the impact of potential biases.
Data Analysis and Training
– Training Programs: Offer training programs for healthcare professionals on big data analytics.
– Interdisciplinary Teams: Form teams with expertise in data science, statistics, and clinical care.
– User-Friendly Tools: Develop and use tools that simplify data analysis for non-experts.
Interpretation
– Clear Guidelines: Establish clear guidelines for interpreting big data results.
– Expert Consultation: Consult with experts in data science and clinical care for accurate interpretation.
– Validation Studies: Conduct validation studies to confirm findings from big data analyses.
Privacy and Ethical Issues
– Data Encryption: Use encryption to protect patient data.
– Anonymization: Anonymize data to protect patient identities.
– Ethical Frameworks: Develop and adhere to ethical frameworks for data use.
– Regulatory Compliance: Ensure compliance with data protection regulations (e.g., GDPR, HIPAA).
Additional Suggestions
– Standardization: Standardize data formats and definitions to facilitate data integration and comparison.
– Collaboration: Foster collaboration between institutions, researchers, and policymakers to address common challenges.
– Patient Involvement: Involve patients in the research process to ensure their perspectives and concerns are considered.
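To make the imputation suggestion concrete, here is a minimal, standard-library-only sketch of mean imputation, the simplest of the techniques listed above (real studies would usually prefer multiple imputation, and the variable names here are purely illustrative):

```python
# Minimal sketch of mean imputation: replace each missing value with the
# mean of the observed values in the same column. This is only meant to
# illustrate the idea; multiple imputation is preferred in practice.

def mean_impute(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

# Example: systolic blood pressure readings with two missing entries.
sbp = [120, None, 135, 128, None, 142]
print(mean_impute(sbp))  # → [120, 131.25, 135, 128, 131.25, 142]
```

Note that mean imputation understates variability, which is one reason the literature favours multiple imputation for anything beyond a quick exploratory fill.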
2025-02-01 at 9:25 pm #46824
Wannisa Wongkamchan
Participant
Thank you for sharing these valuable suggestions! I agree that addressing missing data, improving data quality, and resolving privacy concerns are crucial for making big health data more reliable in cardiovascular research. Collaboration between experts and institutions, along with clear guidelines and ethical frameworks, will help ensure accurate analysis and real-world applicability.
2025-02-02 at 6:33 pm #46828
Aye Thinzar Oo
Participant
I appreciate your useful suggestions! I concur that it is essential to tackle issues related to missing data, improve data quality, and address privacy concerns to enhance the reliability of big health data in cardiovascular research.
2025-02-03 at 8:42 pm #46836
Cing Sian Dal
Participant
Your discussion was interesting and stimulating. I agree that anonymization of data is key to ensuring privacy and confidentiality, which is the most important thing in any health data system.
2025-01-29 at 3:06 pm #46800
Wannisa Wongkamchan
Participant
Big health data has great potential to improve cardiovascular research, but there are many challenges that need to be addressed. Here are ways to address them:
1. Missing Data
Missing data is a common problem in big health datasets and can lead to biased results. To handle missing data, we can use imputation techniques, such as mean imputation, regression imputation, or multiple imputation, to estimate missing values based on available data. In cases where the proportion of missing data is too high (over 60%), researchers should consider collecting additional data or using sensitivity analysis to check how missing values affect results. Improving data entry practices and integrating multiple data sources may help reduce missing data problems. It is also important to improve data collection by training healthcare workers to record information more consistently.
2. Selection Bias
Selection bias happens when the data does not represent the entire population, leading to incorrect conclusions. One way to fix this is to collect data from many hospitals, regions, and patient demographics to make the dataset more representative. Another way is using statistical methods, like propensity score matching, to balance the differences between patient groups. While big data allows for large sample sizes, a large sample does not always mean better accuracy, so careful validation with randomized controlled trials (RCTs) is necessary before applying findings to clinical practice.
3. Data Analysis and Training
Analyzing big health data requires advanced statistical and programming skills, which many researchers and clinicians are not trained to use. To improve this, more training programs on biostatistics, statistical methods, AI, and programming should be provided to healthcare professionals. Simple AI and data analytics tools can help doctors and researchers work with data more easily without needing advanced technical skills. Collaboration between clinicians, data scientists, and engineers is also important to ensure that data are analyzed accurately and interpreted correctly.
4. Interpretation and Translational Applicability of Results
Since big data analyses are complex and not always easy to apply in real-world medicine, the interpretation and translational applicability of results can be difficult. To improve this, research findings should be presented in a simple and clear format for doctors and policymakers. AI models should be tested in real clinical settings before being widely used. It is also important to standardize data collection and ensure that studies use high-quality, well-documented datasets to avoid incorrect interpretations.
5. Privacy and Ethical Issues
Handling patient data requires strict privacy protection to prevent misuse and ensure ethical research practices. Strong security measures, like encryption, secure storage, and access controls, can help keep data safe from cyberattacks. Additionally, following legal regulations such as the PDPA, GDPR, and HIPAA ensures that data is used ethically. Most importantly, patients should be informed about how their data is used and should have the right to give or refuse consent.
In conclusion, big health data offers many opportunities for cardiovascular research, but it also faces challenges. Addressing missing data, reducing selection bias, improving training, making results easier to use, and protecting privacy will help us utilize big data more effectively and safely, ultimately improving healthcare outcomes.
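Propensity score matching, mentioned in point 2 above, boils down to pairing each treated patient with a control whose estimated probability of treatment is similar. Here is a minimal sketch of the matching step itself; in practice the scores come from a logistic regression of treatment on covariates, whereas the scores and IDs below are hypothetical:

```python
# Minimal sketch of 1:1 greedy nearest-neighbour matching on a propensity
# score. Each treated unit is paired with the closest still-unused control.
# Real analyses would also apply a caliper and check covariate balance.

def match_on_score(treated, controls):
    """treated, controls: lists of (unit_id, propensity_score) pairs.
    Returns a list of (treated_id, control_id) matched pairs."""
    available = dict(controls)  # control_id -> score
    pairs = []
    for t_id, t_score in treated:
        if not available:
            break
        # pick the remaining control with the smallest score difference
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        pairs.append((t_id, c_id))
        del available[c_id]  # each control is used at most once
    return pairs

treated = [("T1", 0.62), ("T2", 0.35)]
controls = [("C1", 0.60), ("C2", 0.30), ("C3", 0.90)]
print(match_on_score(treated, controls))  # → [('T1', 'C1'), ('T2', 'C2')]
```

Greedy matching is order-dependent; optimal matching and caliper constraints are common refinements, but the one-control-per-treated logic is the core idea.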
2025-02-02 at 6:40 pm #46829
Aye Thinzar Oo
Participant
Thank you for sharing this detailed information. Your post correctly identifies that selection bias leads to unrepresentative data, which can skew findings and lead to incorrect conclusions. Collecting diverse data and utilizing statistical methods are good suggestions.
2025-02-01 at 8:46 pm #46821
Alex Zayar Phyo Aung
Participant
Missing data: Standardized data collection tools should be used across the EHR, and entry of essential variables should be made mandatory.
Selection bias: Representative sampling should be used across the study. Randomized controlled trials (if possible, though resource-intensive for large-scale surveys) should also be used for comparative studies.
Data analysis and training: Domain knowledge is a must for researchers and enumerators to better understand the meaning of the data and its interpretation. Knowledge of statistical methodologies is also an important factor for hypothesis testing and statistical modelling.
Privacy and ethical issues: Most researchers anonymize the data. However, consent from patients is not usually obtained, especially for big data analysis and use. Getting consent should be mandatory to ensure the data will be used for good and that patients understand what they are agreeing to.
Additional suggestions: In addition to the areas mentioned in the literature, I would suggest that data quality is one of the major challenges for big data analysis. Accuracy, consistency, and completeness (related to missing data) are crucial for data governance; otherwise the result will be garbage in, garbage out.
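The completeness checks suggested above can be automated before data ever reach analysis. A small sketch (the field names are hypothetical, chosen only for illustration):

```python
# Sketch of an automated completeness check: flag records that lack any
# mandatory field so they can be fixed at source rather than discovered
# during analysis. REQUIRED_FIELDS is an illustrative assumption.

REQUIRED_FIELDS = {"patient_id", "age", "sex", "diagnosis_code"}

def completeness_report(records):
    """Return (complete_records, problems), where problems maps a record's
    index to the set of required fields it is missing or leaves blank."""
    complete, problems = [], {}
    for i, rec in enumerate(records):
        missing = {f for f in REQUIRED_FIELDS if rec.get(f) in (None, "")}
        if missing:
            problems[i] = missing
        else:
            complete.append(rec)
    return complete, problems

records = [
    {"patient_id": "P1", "age": 54, "sex": "F", "diagnosis_code": "I21"},
    {"patient_id": "P2", "age": None, "sex": "M", "diagnosis_code": ""},
]
ok, bad = completeness_report(records)
# one complete record; record 1 is missing age and diagnosis_code
print(len(ok), bad)
```

Running such a report routinely, and feeding the problem list back to the collecting facility, is one practical way to avoid garbage in, garbage out.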
2025-02-01 at 9:33 pm #46825
Wannisa Wongkamchan
Participant
Thank you for sharing. I agree that data quality is one of the major challenges. While anonymization helps protect patient identities, obtaining informed consent should be a standard practice in big data research. In practice, however, obtaining explicit consent from every patient can be challenging, especially in retrospective studies where data has already been collected. Balancing ethical considerations with research feasibility remains a key challenge.
2025-02-02 at 6:45 pm #46830
Aye Thinzar Oo
Participant
Thank you for sharing your thoughts. I concur that ensuring data quality is a significant challenge in research. I also agree that striking a balance between ethical considerations and the feasibility of research remains a critical issue.
2025-02-03 at 8:33 pm #46835
Cing Sian Dal
Participant
I love your garbage-in, garbage-out analogy regarding data quality. In the end, low-quality data can waste a lot of time, so it should be handled in the first place.
2025-02-02 at 6:31 pm #46827
Aye Thinzar Oo
Participant
Missing Data: A significant challenge in health facility data collection is missing data, which can undermine data quality and integrity. Missing data leads to incomplete analysis, biased results, and poor decision-making in healthcare. Additionally, the lack of standardized data collection methods makes it difficult to integrate information across different healthcare facilities, further complicating analysis.
Selection Bias: Selection bias in health facility data arises systematically during collection, leading to unrepresentative samples. Routine healthcare data may not capture diverse patient populations because of factors like access to care and geography. As a result, findings based on these data may not accurately reflect population health outcomes.
Data Analysis and Training: Data users and researchers must possess the necessary domain knowledge to interpret the data accurately. A proper understanding of the data is essential for accurate statistical analysis and hypothesis testing, ensuring meaningful and reliable results.
Interpretation and Translational Applicability of Results: This involves several key issues: data storage, searching, and capture present technical difficulties for analysis, and determining whether results apply across different healthcare settings is complex because of variability in data quality and context.
Ethical and Privacy Issues: Big health data faces significant privacy and ethical challenges in data collection. Patient health information must be protected to prevent misuse, requiring robust security measures to defend against cyber threats. Encryption and secure data storage are essential to safeguard patient privacy.
Conclusion: I propose that data quality should be recognized as a critical challenge in big data analysis. Ensuring accuracy, consistency, and completeness in data collection is essential for effective data governance. Moreover, exploring new approaches to manage and analyze large volumes of data will help improve decision-making and outcomes in healthcare.
2025-02-03 at 8:25 pm #46834
Cing Sian Dal
Participant
The article focuses on utilizing Big Data to address cardiovascular diseases and mentions the challenges of practically approaching it.
# (1) Missing data: Data may be omitted by clinicians who consider it unnecessary, or missing because patients refuse or disagree with data collection; some missingness is simply unsolvable. As the paper notes, less than 10% missing data is generally manageable; with 10-60% missing, different methods give different results; and above 60% missing there is no valid statistical solution.
The paper mentions several solutions: (a) complete-case analysis, (b) available-case analysis, (c) imputation techniques, (d) mixed effects regression models, (e) generalized estimating equations, (f) pattern mixture models and selection models.
In addition, my suggestion for this missing-data issue is that data fields should be validated during submission if the data collected is aimed at further research. There could also be a benefit program for patients, such as a monthly lucky draw.
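The submission-time validation idea can be sketched very simply: reject a record at the point of entry if required fields are absent or out of range, instead of discovering the gap at analysis time. The field names and ranges below are illustrative assumptions, not from any real system:

```python
# Sketch of submission-time validation: a record is accepted only if it
# passes every rule; otherwise the data-entry form can show the problems
# immediately. Field names and valid ranges are hypothetical.

def validate_record(rec):
    """Return a list of human-readable problems; an empty list means the
    record passes validation."""
    errors = []
    if not rec.get("patient_id"):
        errors.append("patient_id is required")
    age = rec.get("age")
    if age is None:
        errors.append("age is required")
    elif not (0 <= age <= 120):
        errors.append(f"age out of range: {age}")
    if rec.get("sex") not in ("M", "F"):
        errors.append("sex must be 'M' or 'F'")
    return errors

print(validate_record({"patient_id": "P1", "age": 54, "sex": "F"}))  # → []
print(validate_record({"age": 130, "sex": "X"}))  # three problems reported
```

Catching these problems while the patient and clinician are still present is far cheaper than any imputation method applied years later.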
# (2) Selection Bias: As patients differ in geographic profile, insurance coverage, and medical history, variable distributions differ between treatment groups. Consequently, a large volume of data no longer ensures a representative sample, preventing valid inference and generating false positive results.
The paper suggests (a) propensity score analysis, (b) instrumental variable analysis, (c) Mendelian randomization for genetic studies, (d) considering results as hypothesis-generating, and (e) validating through RCTs.
My preferred option is propensity score analysis, which matches patients with similar characteristics across treatment groups to reduce bias.
# (3) Data Analysis / Training: Lack of formal training in informatics, coding, data analysis, and large-database handling, together with inefficient algorithms, leads to this complexity, resulting in suboptimal analysis and inefficient data processing.
In my opinion, this could be improved by providing formal training programs and fostering collaboration between clinicians and data scientists. Analyzing single-handedly could lead to doing the right thing in the wrong way.
# (4) Interpretation and Translational Applicability of Results: Studies that are complex and not self-explanatory, with poorly described variables, subjective assumptions in analysis, and questionable data quality, can produce unclear conclusions and biased interpretation.
To improve the interpretation and translational applicability of results, firstly, the variables and metadata in the datasets should be consistently well-defined so they are easier to interpret and use across studies; standardization will address this. Secondly, independent validation studies should be established to check whether replicating a study confirms the same results.
# (5) Privacy and Ethical Issues: Medical servers can be targeted by cybercriminals, and there is a risk of identifying individual information, which compromises individual privacy.
The paper discusses (1) using broad consent models, (2) implementing a “social contract”, (3) continuous improvement of data security systems, and (4) balancing privacy protection with community benefits.
My concern is about balancing privacy protection with community benefits. If the primary aim is benefits, the system is open to abuse such as corruption and public manipulation. Even where there are no benefits, ethical standards and privacy policies should not harm people.
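One common building block behind the privacy measures discussed above is pseudonymization: replacing direct identifiers with stable, non-reversible tokens. A minimal sketch follows; a real deployment would also generalize quasi-identifiers (age, postcode) and manage the salt as a per-project secret, and the salt value shown is of course a placeholder:

```python
# Sketch of pseudonymization via salted hashing: the same patient ID always
# maps to the same token, so records can still be linked across tables,
# but the original ID cannot be read back from the token.

import hashlib

SALT = b"example-project-salt"  # hypothetical; keep secret, one per project

def pseudonymize(patient_id: str) -> str:
    """Replace a direct identifier with a stable, non-reversible token."""
    return hashlib.sha256(SALT + patient_id.encode()).hexdigest()[:16]

token = pseudonymize("HN-123456")
assert token == pseudonymize("HN-123456")  # stable: linkage still works
assert token != pseudonymize("HN-123457")  # distinct patients stay distinct
print(token)
```

Pseudonymization alone is not full anonymization (re-identification through quasi-identifiers remains possible), which is why the paper pairs it with consent models and ongoing security improvement.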
2025-02-03 at 9:52 pm #46838
Tanaphum Wichaita
Participant
Disease Definition
Setting clear disease definitions helps us determine what data to collect and use. Using standard systems like the ICD (International Classification of Diseases) or SNOMED CT ensures consistency. Doctors and data experts should work together to refine these definitions so they match real-world cases.
Data Quality and Missing Data
If data is missing or incorrect, it becomes unusable. To fix this, we need to check data accuracy at collection time. If data is missing, we can use methods like multiple imputation or regression imputation to fill in gaps with estimated values.
Unstructured Data
Audio recordings and videos may contain useful medical details, such as a patient’s heartbeat sound or movement patterns in a video. However, these types of data are difficult to analyze directly. To make them useful, we can convert speech to text or numbers, or use machine learning to recognize patterns. This helps turn unstructured data into structured data that can be used in research and treatment.
Data Analysis and Training
Learning how to handle big data is essential. Machine learning can help identify patterns in large datasets, while tools like Apache Spark and Hadoop process big data efficiently. Training programs and workshops can help healthcare professionals improve their data skills.
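Regression imputation, mentioned above, predicts a missing value from other variables instead of using a flat column mean. A minimal single-variable sketch (the age and blood-pressure numbers are invented for illustration; real pipelines would use multiple imputation to reflect uncertainty):

```python
# Sketch of regression imputation: fit a least-squares line on the complete
# (x, y) pairs, then predict y wherever it is missing. Single imputation
# only; multiple imputation is preferred in real analyses.

def regression_impute(xs, ys):
    """Fill None entries in ys using a least-squares fit of y on x."""
    pairs = [(x, y) for x, y in zip(xs, ys) if y is not None]
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    slope = sum((x - mx) * (y - my) for x, y in pairs) / sxx
    intercept = my - slope * mx
    return [intercept + slope * x if y is None else y
            for x, y in zip(xs, ys)]

ages = [40, 60, 80]
sbp = [120, None, 160]  # systolic BP with one missing value
print(regression_impute(ages, sbp))  # → [120, 140.0, 160]
```

Because the imputed value sits exactly on the fitted line, single regression imputation understates variance, which is exactly why the post also mentions multiple imputation.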
2025-02-10 at 5:47 am #46857
Wirichada Pan-ngum
Keymaster
Good discussions. If we were studying on site, I would love to take you on a tour of our data center, BIOPHICS, where Siriluk works! Talking to people who have a lot of experience working on different projects would be a very helpful way to learn.