I think Twitter is one of the best-known examples of Big Data, and it has been used in many health informatics projects. It can illustrate all the characteristics of big data, as follows:
1 Volume: Twitter generates more than 500 million “tweets” a day. A tweet can contain up to 280 characters, or roughly 40 words. That means Twitter can create up to about 20,000 million (20 billion) words a day!
2 Velocity: You can watch how massively and continuously Twitter generates data in real time here: https://www.internetlivestats.com/twitter-statistics/
3 Variety: Tweets are mostly text and pictures, and the data are unstructured.
4 Veracity: Tweets are posted by people of all ages, ethnicities, and backgrounds worldwide, and they reflect thoughts, views, facts, and situations at a specific time and place. In Thailand, you can check the daily talk of the town through the top tweets of the day. A tweet can be a rumor or the truth.
5 Value: Text and pictures in tweets are posted in real time and can help predict the prevalence of pandemic diseases such as flu, which is valuable for health informatics.
6 Variability: You can see trends and shifts of meaning in tweets, because the data are generated differently as time passes, affected by different situations on a minute-by-minute, daily, or monthly basis.
7 Visualization: A source like Twitter may not sound trustworthy, so all information analyzed from Twitter must be presented clearly and well to support the purpose of the intended project.
8 Validity: For the analyzed data to give accurate results, it is vital to select qualifying tweets consistently and to define each piece of evidence written or pictured in tweets; otherwise the trends can be biased.
9 Vulnerability: For a large and valuable amount of data like big data, it’s essential to protect the data by adopting data safety practices. A data breach in big data can affect a vast population.
10 Volatility: As data comes into the server in such a rapid stream, as shown under Velocity, it’s crucial to decide how long to store collected tweet data before discarding it to free storage space for incoming tweet data.
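
As a rough sketch of two of the points above (the Volume arithmetic in item 1 and the retention question in item 10), something like the following Python could work. The function names, the `created_at` field, and the 30-day window are my own assumptions for illustration, not anything Twitter prescribes.

```python
from datetime import datetime, timedelta, timezone

TWEETS_PER_DAY = 500_000_000  # figure quoted in item 1
WORDS_PER_TWEET = 40          # rough per-tweet word count used in item 1


def words_per_day(tweets=TWEETS_PER_DAY, words=WORDS_PER_TWEET):
    """Back-of-the-envelope estimate of words generated per day."""
    return tweets * words


def prune_old(tweets, now, keep_days=30):
    """Keep only tweets inside the retention window.

    `keep_days` and the `created_at` key are hypothetical choices;
    a real project would pick them from its own storage budget.
    """
    cutoff = now - timedelta(days=keep_days)
    return [t for t in tweets if t["created_at"] >= cutoff]


if __name__ == "__main__":
    print(f"{words_per_day():,} words per day")

    now = datetime(2024, 1, 31, tzinfo=timezone.utc)
    sample = [
        {"id": 1, "created_at": now - timedelta(days=40)},  # outside window
        {"id": 2, "created_at": now - timedelta(days=5)},   # inside window
    ]
    print([t["id"] for t in prune_old(sample, now)])
```

The retention window is the knob that matters here: a shorter window saves storage but limits how far back a health trend can be traced.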