- This topic has 16 replies, 16 voices, and was last updated 11 months, 1 week ago by Noi Yar.
-
2023-09-23 at 10:50 am #41850SaranathKeymaster
Can you give an example of data that you think could be considered “Big Data”? Which characteristics of that data fit the 5Vs, 7Vs, or 10Vs of Big Data?
-
2023-09-25 at 2:30 pm #41883Weerapat PipithruengkraiParticipant
I think Electronic Health Records (EHR) data would be considered as “Big Data”, since it is digital records of patient’s health information, such as medical history, diagnoses, lab results, etc. In my opinion, EHR data have the following big data characteristics:
5Vs characteristics (Volume, Velocity, Variety, Veracity, and Value)
Volume: The volume of EHR data can be massive, as it consists of comprehensive patient records that include medical history, diagnoses, medications, lab results, and much more. Moreover, because EHR data is recorded for each individual patient, it is expected to keep growing every day.
Velocity: EHR data is created and updated continuously as patients receive medical care. Each patient interaction contributes to the data stream in real time.
Variety: EHR data is highly diverse, as it includes data from various specialities and departments. It can take all structural forms: structured data (standardized health records), semi-structured data (clinical notes and reports), and unstructured data (medical images, hand-written notes).
Veracity: EHR data must be accurate, complete, and consistent, since it is essential for healthcare operations and data quality can affect healthcare outcomes.
Value: EHR data is essential for healthcare providers, as it is fundamental information for healthcare operations such as diagnosis and treatment. In addition, EHR can improve the quality and efficiency of healthcare delivery by supporting research, communication, and continuity of care among providers and patients.
7Vs characteristics (Adding Variability and Visualization):
Variability: Variability in EHR data can refer to format, structure, meaning, or context.
Visualization: EHR data can be presented with visualization techniques that convey information, patterns, and insights in a form that is easy to understand.
10Vs characteristics (Adding Vulnerability, Validity, and Volatility):
Vulnerability: EHR data is highly sensitive and subject to privacy regulations, so it must be protected against data breaches and unauthorized access.
Validity: The validity of EHR data refers to the correctness and reliability of the patient information. It is essential to reduce data errors to improve outcomes.
Volatility: Patient data must be updated regularly to keep it relevant and to support continuity of care.
-
2023-09-28 at 9:31 pm #41954Ching To ChungParticipant
An easy example I could think of would be Youtube Videos.
5Vs characteristics (Volume, Velocity, Variety, Veracity, and Value)
Volume: The volume of YouTube videos is enormous. 3.7 million videos are uploaded every day with 200,000+ hours of content.
Velocity: Videos are constantly being uploaded, viewed, and shared in real time by users all across the globe, so there is an unending stream of uploads, views, likes, comments, and shares, which demonstrates the velocity of the platform.
Variety: YouTube videos are diverse in nature and require different handling. Some are music videos and should be included in the “YouTube Music” app and optimized for streaming. Some have very high video quality (60fps, 4K, etc.) and require special treatment. Some are for kids, and it is important to check that they contain no malicious content.
Veracity: YouTube videos need to be accurate and of high quality to ensure a good viewing experience for the audience. Also, videos with illegal content (violent, pornographic, etc.) are often uploaded to YouTube. It is therefore important for algorithms to be in place to moderate and filter out inappropriate content.
Value: YouTube videos possess significant value, both for viewers and organizations (as you can probably tell from how much Youtubers earn). For users, YouTube offers entertainment and information. For organizations and content creators, YouTube provides a platform for marketing, advertising, brand building, and reaching a wide audience, and increasing profit. Analyzing YouTube video data can also offer insights into user preferences and trends, which can be valuable for marketing.
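To make the Volume and Velocity figures above more concrete, here is a minimal sketch that converts the daily totals quoted in this post into per-second and per-video rates. The two input numbers are taken from the post as-is and are not independently verified; the function and variable names are made up for illustration.

```python
# Daily figures quoted in the post above (assumed, not independently verified).
VIDEOS_PER_DAY = 3_700_000
HOURS_PER_DAY = 200_000  # the post says "200,000+", so treat this as a lower bound

SECONDS_PER_DAY = 24 * 60 * 60
MINUTES_PER_DAY = 24 * 60


def upload_rates(videos_per_day: int, hours_per_day: int) -> dict:
    """Convert daily upload totals into finer-grained rates."""
    return {
        "videos_per_second": videos_per_day / SECONDS_PER_DAY,
        "content_hours_per_minute": hours_per_day / MINUTES_PER_DAY,
        "avg_minutes_per_video": hours_per_day * 60 / videos_per_day,
    }


rates = upload_rates(VIDEOS_PER_DAY, HOURS_PER_DAY)
for name, value in rates.items():
    print(f"{name}: {value:.1f}")
```

Under these assumptions the stream works out to roughly 43 new videos every second and about 139 hours of new content every minute, which is one way to see why both Volume and Velocity apply here.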
-
2023-09-29 at 2:29 pm #41975Suppasit SrisaengParticipant
Thailand’s National Health Data Center (HDC) or the 43 files could be considered as “Big Data” in healthcare.
10 Vs:
Volume: The 43 files cover extensive health data across Thailand, encompassing millions of medical records.
Velocity: Given Thailand’s sizable population and the constant flow of healthcare services, data is generated and updated rapidly.
Variety: The data includes various types such as textual clinical data and numeric lab tests making it diverse.
Veracity: Ensuring data accuracy is crucial. The HDC uses rigorous verification methods to maintain high data quality.
Value: The data has immense value for epidemiological studies, policy-making, and healthcare improvements.
Variability: The data can be highly inconsistent due to seasonal diseases or public health crises, affecting its structure and interpretation.
Vulnerability: Health data is sensitive; thus, strong security measures are essential.
Visualization: Effective visual tools are needed to interpret this complex and multi-layered data.
Volatility: How long data should be stored is vital, especially when considering the speed of medical advancements.
Validity: The data must be valid and conform to compliance standards, like those set by HIPAA or local Thai regulations.
-
2023-09-29 at 4:12 pm #41997Supida BamrungtrakulsukParticipant
An example of big data from our daily lives is the health and fitness data collected by a wearable smart device, the Apple Watch.
5Vs characteristics
1. Volume
The Apple Watch continuously collects data and tracks physical activities, such as step counts, heart rate measurement, blood oxygen, time asleep, respiratory rate, EKG, and body temperature etc., resulting in a vast volume of data being generated for each user over time.
2. Velocity
The Apple Watch shows real-time data, for instance heart rate monitoring, which is synced continuously to the connected iPhone for immediate health insights and alerts.
3. Variety
The Apple Watch consists of various sensors that capture a diverse range of data types; for example, GPS for location information during workouts, accelerometers for step counts and sleep tracking, optical heart sensor for heart rate, and gyroscopes for standing activity etc.
4. Veracity
Data accuracy and quality collected by Apple Watch could be influenced by some factors, such as proper device fit and usage.
5. Value
Monitoring a user’s health and fitness data from the Apple Watch could provide health insights for informed decisions about their well-being, including tracking daily activity goals, monitoring heart rate trends, and detecting irregular heart rhythms or falls, which can be life-saving.
-
2023-09-30 at 12:03 am #41999Teeraboon LertwanichwattanaParticipant
In my opinion, the data collected through the Mor Chana Mobile app for COVID-19 contact tracing can be classified as Big Data for several reasons.
1. Volume: This term refers to the large amount of contact information gathered from individuals who might have been exposed to COVID-19.
2. Velocity: Data is collected and updated rapidly, allowing for a swift response to prevent further transmission of the virus.
3. Variety: Various types of data are collected, including GPS coordinates, social media interactions, public transportation usage, and workplace attendance records.
4. Veracity: This aspect ensures the accuracy and reliability of contact data, taking into account potential discrepancies and errors.
5. Value: The data is valuable when analyzed to identify high-risk areas, super-spreader events, and transmission patterns, aiding in effective decision-making.
6. Variability: Contact patterns can vary due to factors such as lockdowns, social distancing measures, or changes in public compliance, making the data dynamic and adaptable to different situations.
7. Validity: It’s essential to verify that the contact data comes from reliable sources and is not based on misinformation or unreliable reports.
-
2023-09-30 at 11:24 am #42040Soe HtikeParticipant
I think “Facebook” can be a good example of big data. It fits in describing big data characteristics like –
1. Volume – Facebook data is generated in massive quantities daily. Billions of users worldwide generate text, images, videos, and interactions on social media platforms.
2. Velocity – Facebook data is generated in real time. Users post status updates, photos, and comments continuously. The platform needs to process and display this data promptly.
3. Variety – Facebook data is highly diverse. It includes text posts, multimedia content (photos and videos), user profiles, friend networks, and metadata. This variety of data types adds complexity to data processing.
4. Veracity – Veracity is a significant concern for Facebook, as it faces issues related to fake accounts, misinformation, and privacy breaches. Ensuring the accuracy and trustworthiness of user-generated content is essential.
5. Value – Facebook leverages its Big Data to provide personalized content, targeted advertising, and user engagement insights. Advertisers use Facebook’s data to reach specific demographics, and users benefit from a more tailored experience.
6. Variability – Facebook data exhibits variability due to trending topics, viral content, and shifting user behaviors. The platform must adapt to these changes, such as algorithm updates, to prioritize certain types of content.
7. Vulnerability – Facebook faces vulnerabilities related to data privacy and security. Data breaches and unauthorized access to user accounts are ongoing concerns. The platform must invest in robust security measures.
8. Visibility – Facebook aims to maintain visibility into user activities to improve user experience and content recommendations. It employs algorithms to monitor content and user behavior.
9. Validity – Ensuring the validity of user-generated content on Facebook is essential to combat misinformation and fake accounts. The platform employs content moderation and reporting mechanisms.
10. Volatility – Facebook data is subject to rapid changes influenced by global events, user trends, and platform updates. Understanding these fluctuations is crucial for content delivery and engagement.
-
2023-09-30 at 8:55 pm #42058Sirithep PlParticipant
Asset tracking data could be considered Big Data. In a large organization or hospital, each department has many valuable assets that have to be recorded and tracked. These data are consistent with the 5Vs of Big Data characteristics as follows:
1. “Volume”: Each department has different assets to record, for example,
in-patient care units have durable assets (e.g. patient beds, medical devices) and consumable products (e.g. saline solutions, drugs, etc.)
Laboratory units have durable assets (e.g. laboratory machines, microscopes) and chemical solutions, etc.
These data are large in quantity and essential for management within a department and across the organization.
2. “Velocity”: The speed of generating, storing, and analyzing data is one of the characteristics of Big Data. In a hospital or organization, data about durable and consumable goods in each department are recorded and tracked frequently: daily or monthly for consumable products, and quarterly or yearly for durable assets.
3. “Variety”: The recorded asset data come in various file types and values across departments because the types of assets differ.
4. “Value”: These data are valuable to the hospital or organization for managing resources, administering hospital services, and budgeting.
5. “Veracity”: These data must be accurate and reliable so that they can be analyzed in the appropriate context. If they are not accurate, mistakes will occur and may mislead asset management.
-
2023-09-30 at 11:34 pm #42060Nichcha SubdeeParticipant
I would like to present ‘Twitter’ or the current name ‘X’ as an example that is considered as Big Data. Twitter can be categorized into the 5Vs of Big Data characteristics as follows:
1. Volume = There are millions of tweets posted daily from across the world, including texts, images, and various other media content.
2. Velocity = Twitter’s users post and share their tweets in real time. In the event of incidents such as earthquakes or other natural disasters, users in affected areas and relevant organizations can rapidly share and update crucial information, enabling swift responses.
3. Variety = Twitter has both structured and unstructured data, ranging from textual content to images, GIFs, and videos.
4. Veracity = Despite serving as an information source, Twitter still has to deal with a lot of fake accounts or misleading information. The validation of each tweet’s accuracy remains an ongoing challenge.
5. Value = The data on Twitter is valuable for businesses and other organizations because it allows them to analyze trends and get real-time insights into public opinions, which helps in making decisions.
-
2023-10-02 at 5:59 pm #42089Panyada CholsakhonParticipant
E-commerce has recently become very popular, and I think big data is crucial for bringing profit to many online businesses and helping them reach their objectives. Amazon is a good example, as it is the largest e-commerce platform and uses big data, which has the 5Vs: Volume, Velocity, Variety, Veracity, and Value, as follows:
1. Volume: As the largest online shopping platform in the world, Amazon sells various kinds of goods and services and serves millions of customers each day. The business deals with data from many sources, such as stock, transactions, customer behaviour, customer reviews, etc. Big data helps it run the business smoothly and supports decision-making.
2. Velocity: Being the largest e-commerce platform in the world means that millions of people use it at the same time, 24/7. The speed and freshness of the data are very important, as real-time business analysis can affect customers’ purchasing behaviour and the overall outcome.
3. Variety: Amazon uses all types of data: structured data such as customer name, age, address, telephone number, price, and number of items; semi-structured data such as reviews and product descriptions; and unstructured data such as images, video reviews, and social media comments.
4. Veracity: This is the accuracy and reliability of the data, which indicates data quality. It is crucial for making meaningful decisions from large data sets.
5. Value: Big data helps the business meet its goals by extracting actionable insights that create value, such as improving customer experience and increasing sales.
-
2023-10-02 at 6:52 pm #42091Teerawat PholyiamParticipant
Telemedicine (this is used for primary consultations and initial diagnosis, remote patient care) can be considered as a source of big data. Big data typically refers to datasets that are large, complex, and constantly growing, often exceeding the capacity of traditional data processing tools.
Telemedicine data possesses several important characteristics, often categorized as the 7Vs:
Volume: Telemedicine generates vast amounts of data, including patient records, images, and real-time information.
Velocity: Data is produced and updated in real-time, necessitating swift processing for timely medical interventions.
Variety: Telemedicine data comes in diverse formats, such as structured patient records, unstructured medical images, and semi-structured reports.
Veracity: Accuracy and reliability of data are crucial to ensure quality patient care.
Value: The primary goal is to extract valuable insights from data to improve patient care and decision-making.
Variability: Data can vary significantly due to patient differences and technology usage, requiring effective management.
Visualization: Data visualization aids in understanding patient conditions, treatment progress, and health trends.
These characteristics underscore the complexity and significance of telemedicine data, which must be managed, analyzed, and secured effectively to provide high-quality healthcare.
-
2023-10-04 at 7:24 am #42118Pyae Thu TunParticipant
That’s a good example, Teerawat. Telemedicine data is a valuable resource that has the potential to improve healthcare services. By effectively managing, analyzing, and securing telemedicine data, we can improve patient care and advance biomedical research.
-
-
2023-10-03 at 1:03 pm #42107Thitikan PohpoachParticipant
Netflix, a leading subscription-based streaming service, is one example of big data. It allows subscribers to watch TV shows, movies, and documentaries.
Volume – size of data
According to Statista.com, Netflix had around 238.39 million paid subscribers worldwide as of the second quarter of 2023. This marked an increase of 5.89 million subscribers compared with the previous quarter.
Velocity – how quickly data is generated and how quickly that data moves
Netflix uses data processing software and traditional business intelligence tools such as Hadoop and Teradata, as well as its own open-source solutions such as Lipstick and Genie, to gather, store, and process massive amounts of information. These platforms influence its decisions on what content to create and promote to viewers.
Gabrielle Sadeh. How Netflix uses big data to create content and enhance user experience. 2019. Available from https://www.clickz.com/how-netflix-uses-big-data-content/228201/
Variety – diversity of data types
Netflix holds a variety of information. In particular, the following types of information are held:
– Browsing activities: sites and movies visited, downloads, searches
– Financial transactions
– Interests etc.
Big data can be used to improve the operations of Netflix, provide better customer service, and create personalized marketing campaigns, all of which increase value. The technology can be utilized to monitor customer behavior at both a micro and macro level, which makes the experience more convenient and enjoyable. By applying a series of algorithms to the subscriber’s viewing preferences, Netflix is able to predict what we are likely to watch next.
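As a small worked example of the subscriber figures quoted under Volume, the sketch below derives the previous quarter's subscriber count and the quarter-on-quarter growth rate. Both inputs are the Statista numbers cited in this post; the function names are hypothetical and chosen only for illustration.

```python
# Figures quoted in the post (in millions of paid subscribers, per Statista).
SUBSCRIBERS_Q2_2023_M = 238.39
NET_ADDITIONS_M = 5.89  # increase compared with the previous quarter


def previous_quarter(current_m: float, additions_m: float) -> float:
    """Subscriber count one quarter earlier, in millions."""
    return current_m - additions_m


def growth_pct(current_m: float, additions_m: float) -> float:
    """Quarter-on-quarter growth as a percentage of the earlier count."""
    return additions_m / previous_quarter(current_m, additions_m) * 100


q1_m = previous_quarter(SUBSCRIBERS_Q2_2023_M, NET_ADDITIONS_M)
growth = growth_pct(SUBSCRIBERS_Q2_2023_M, NET_ADDITIONS_M)
print(f"Previous quarter: {q1_m:.2f} million")
print(f"Quarter-on-quarter growth: {growth:.2f}%")
```

On these numbers the previous quarter comes to about 232.50 million subscribers, a growth of roughly 2.5% in one quarter.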
-
2023-10-03 at 10:10 pm #42115Myat Htoo LinnParticipant
I would like to mention LinkedIn which is a professional networking platform dealing with big data like many other social media. This will fit into the following big data characteristics;
Volume: This platform generates and stores massive volumes of data daily which includes user profiles, connections, job postings, messages, and content. As of 2023, LinkedIn has 900 million users. Of these, 310 million are Monthly Active Users (MAUs) on LinkedIn.
Variety: Data comes in various forms, such as structured data (user profiles, job listings), semi-structured data (user-generated content, messages), and unstructured data (text, images, videos).
Velocity: LinkedIn updates real-time data. User interactions, job postings, and content creation occur continuously, requiring systems to process and make sense of data streams rapidly.
Veracity: Data quality is crucial on the platform to ensure accurate user profiles and recommendations because this is related to professional learning and job-related networks. Managing data quality involves cleaning and validating data to eliminate errors and inconsistencies.
Value: The primary goal of leveraging big data on LinkedIn is to extract actionable insights. This includes understanding user behavior, improving user engagement, enhancing content recommendations, and helping recruiters find suitable candidates.
-
2023-10-03 at 10:19 pm #42116PhyoParticipant
I would choose ‘banking data’ as big data. It has the following characteristics of big data.
Volume: Large amounts of data, such as bank account registrations, user transactions, cash deposits, transfers, withdrawals, etc., are uploaded and updated every day in a steady flow.
Velocity: Data moves between users and servers at very high speed whenever a transaction or other banking activity takes place.
Variety: Several data types are included in the banking data such as transactions, credit amounts, cash reports, mobile top-ups, etc.
Value: The executives in banking can make quick business decisions in response to real-time analysis of big data.
Veracity: The information and reports generated can be trusted by both account holders and bankers.
-
2023-10-04 at 7:19 am #42117Pyae Thu TunParticipant
One example of Big Data is medical imaging data, that is, the X-ray, CT, and MRI images recorded for patients. These are stored in electronic format, and health service providers can access the records from different locations.
Medical imaging data is characterized by the 5Vs:
Volume: Medical imaging data is generated in massive quantities, with billions of images produced each year.
Velocity: Medical imaging data is generated and processed at a high speed, as patients interact with the healthcare system.
Variety: Medical imaging data comes in a variety of formats, including 2D and 3D images, and images with different levels of contrast and resolution.
Veracity: Medical imaging data is generally accurate, but it can contain artifacts and errors.
Value: Medical imaging data is valuable for healthcare providers and researchers, as it can be used to diagnose diseases, assess treatment progress, and develop new treatments.
-
2023-10-04 at 11:50 pm #42133Noi YarParticipant
Genomic data from scientific experiments by pharmaceutical companies is a type of Big Data that fits into the 7Vs of Big Data characteristics in the following ways:
Volume: Pharmaceutical companies generate a massive amount of genomic data from their scientific experiments. This is because they need to sequence the DNA of millions of patients and healthy individuals in order to identify genetic variants that are associated with diseases.
Variety: Genomic data is very varied. It includes data from a variety of sources, such as blood samples, tissue samples, and tumors. It also includes data from a variety of technologies, such as DNA sequencing machines and microarrays.
Velocity: Genomic data is generated at a very high velocity. This is because pharmaceutical companies are constantly conducting new scientific experiments.
Veracity: Genomic data can be challenging to verify. This is because it is often generated by complex sequencing machines and algorithms.
Value: Genomic data is very valuable. It can be used to develop new drugs, diagnose diseases, and predict a person’s risk of developing certain diseases.
Variability: Genomic data can be very variable. This is because the amount of data that is generated on a given day can vary depending on the number of experiments that are being conducted.
Visualization: Genomic data can be visualized in a number of ways, such as through charts, graphs, and heatmaps. Visualization can help to make genomic data more understandable and actionable.