Health information in various Electronic Health Records across different hospitals can be considered Big Data.
1. Volume — Because the number of patients is large, the volume of data is large as well. Several of the aspects described below add to it.
2. Velocity — A healthcare facility works like a factory: data is added continually, every day, with no holidays.
3. Variety — At a minimum, there will always be formats that digitize readily, such as typed entries, and formats that do not, such as scans and videos. If a variety of data is not accepted, some valuable information will be lost.
4. Variability — Not all data is equally important. Data can be categorized by importance.
5. Value — All data recorded in the system should have value. Value may also include interoperability.
6. Validity — Some information is actually lost by digitization.
7. Veracity — Data entry should be accurate, but in reality it is not always.
8. Venue — Different venues may or may not use the same application, and even with the same application, guidelines on how to use it are not well developed.
9. Vocabulary — Data models and data structures are usually the same when controlled by the same application. In reality, different facilities do not always use the same application, and there are also several supplementary applications.
10. Vagueness — After all, the value of information depends on context. Research is needed on which context to use, and on the possibility of using the information in a new context.
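To make the Vocabulary point concrete, here is a minimal sketch of translating records from two applications into one shared set of field names. The application names (`app_a`, `app_b`), the field names, and the helper `to_common` are all hypothetical and chosen only for illustration; real systems would more likely rely on an agreed standard than on a hand-written table.

```python
# Hypothetical field names used by two applications for the same concepts.
FIELD_MAPS = {
    "app_a": {"pt_id": "patient_id", "sbp": "systolic_bp", "dbp": "diastolic_bp"},
    "app_b": {"PatientNumber": "patient_id", "Systolic": "systolic_bp", "Diastolic": "diastolic_bp"},
}

def to_common(record, source):
    """Rename known fields to the shared vocabulary; return unknown fields separately."""
    mapping = FIELD_MAPS[source]
    common = {}
    unmapped = {}
    for field, value in record.items():
        if field in mapping:
            common[mapping[field]] = value
        else:
            # Kept aside rather than dropped, so no information is silently lost.
            unmapped[field] = value
    return common, unmapped

a, _ = to_common({"pt_id": "p1", "sbp": 120, "dbp": 80}, "app_a")
b, extra = to_common(
    {"PatientNumber": "p1", "Systolic": 120, "Diastolic": 80, "Cuff": "large"}, "app_b"
)
```

After mapping, `a` and `b` hold the same keys and values, while the field that has no counterpart in the shared vocabulary ends up in `extra`.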
I might also add:
– Volatility — Health information is usually not volatile. A complete natural history helps.
– Virality and viscosity — Because the data is big, emphasis is not placed equally on every part of it, resulting in bias.
– Visualization — Needed to prevent bias and to make use of the data.
– Redundancy — Data will be repeated across different sources and in different contexts.
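As an illustration of the Redundancy point, the sketch below removes exact repeats of the same observation reported by different sources. The `Observation` fields, the source names, and the matching rule (same patient, code, value, and timestamp) are assumptions made for this example; deciding what counts as "the same" data in practice is harder and depends on context.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Observation:
    patient_id: str
    code: str        # hypothetical observation code, e.g. "glucose"
    value: float
    taken_at: str    # timestamp kept as a plain string for simplicity
    source: str      # which system reported the observation

def deduplicate(observations):
    """Keep the first occurrence of each (patient, code, value, time) combination."""
    seen = {}
    for obs in observations:
        key = (obs.patient_id, obs.code, obs.value, obs.taken_at)
        # setdefault stores obs only if the key has not been seen before.
        seen.setdefault(key, obs)
    return list(seen.values())

records = [
    Observation("p1", "glucose", 5.4, "2020-01-01T08:00", "lab_system"),
    Observation("p1", "glucose", 5.4, "2020-01-01T08:00", "ward_app"),
    Observation("p1", "glucose", 6.1, "2020-01-02T08:00", "lab_system"),
]
unique = deduplicate(records)
```

Here the second record repeats the first from another source, so `unique` keeps two observations. The source is deliberately left out of the key; including it would treat the same measurement from two systems as two different facts.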