-
Pimwadee Chaovalit replied to the topic Q&A Meeting arrangement session week 2 in the forum Data mining and machine learning 1 month, 2 weeks ago
Hi,
The data link might have been changed. Please download the data from the website listed in task #1 instead. Follow the link, then click the download button at the top right of the screen.
I hope this helps.
Pimwadee
-
Pimwadee Chaovalit replied to the topic To identify missing values in the dataset in the forum Archive 2023 1 year ago
Hi Siriphak,
Thank you for your question. I would use unique() to list all possible answers for the data.
You may also use str() or summary() on your data.
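As a rough sketch of how these functions could be used together: the data frame below is made up, and the column names and the "?" and empty-string placeholders are assumptions for illustration, not values from the course dataset.

```r
# Hypothetical survey data (not the course dataset).
survey <- data.frame(
  age    = c(23, NA, 31, 45, NA),
  answer = c("yes", "no", "?", "", "yes"),
  stringsAsFactors = FALSE
)

# unique() lists every distinct value, so odd placeholders stand out.
unique(survey$answer)

# str() shows the column types; summary() reports NA counts for numeric columns.
str(survey)
summary(survey)

# Count explicit NA values per column.
na_counts <- colSums(is.na(survey))

# Recode the assumed placeholder strings to NA, then count again.
survey$answer[survey$answer %in% c("?", "")] <- NA
na_counts_after <- colSums(is.na(survey))
```

Note that missing values coded as text (such as "?") are not counted by is.na() until they are recoded, which is why listing the values with unique() first is useful.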
I hope that helps.
Pimwadee
-
Pimwadee Chaovalit replied to the topic Week 2 Decision tree: sample size & continuous attribute in the forum Archive 2023 1 year, 1 month ago
Hi Tanyawat,
I’m sorry it took a while for me to circle back to answering this question in written form.
1) The answer to this question is highly data-dependent, but the rule of thumb for training data is the more the better. You then set aside 70-80% of all the data you have for training.
If there are clear patterns hidden in your…
-
Pimwadee Chaovalit replied to the topic The outliers in clustering in the forum Archive 2023 1 year, 1 month ago
Hi Abdillah,
You are right to be concerned about outliers. Outliers are ubiquitous in all kinds of datasets! When we are trying to make sense of something, we probably don’t want to think too much about exceptional cases like outliers, because they can distort our understanding of the nature of the data.
There is an exception if we want to…
-
Pimwadee Chaovalit replied to the topic Q&A Meeting arrangement session week 3 in the forum Archive 2021 3 years ago
For the purpose of this course, you do not need to compare your tree model with these methods. Since our main objective here is to learn and practice tree construction and interpretation, running decision tree construction (including parameter tuning and pruning) and decision tree evaluation is all you need. We do not require you to know…
-
Pimwadee Chaovalit replied to the topic Q&A Meeting arrangement session week 2 in the forum Archive 2021 3 years, 1 month ago
Hi Navinee,
Good question. The short answer is that there is no agreed-upon number in the industry for what counts as a high or low between_SS / total_SS. But let me take you through some examples, which I found on . It is helpful to consider between_SS / total_SS together with some plots of the data.
Now please consider case 1. This clustering…
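For reference, here is a minimal sketch of how the ratio could be computed from a kmeans() result. The simulated points are an assumption for illustration and are not one of the cases discussed in this reply.

```r
set.seed(42)

# Two hypothetical, well-separated groups of 2-D points.
points <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.5), ncol = 2),
  matrix(rnorm(40, mean = 5, sd = 0.5), ncol = 2)
)

fit <- kmeans(points, centers = 2, nstart = 10)

# between_SS / total_SS: the share of total variation explained by the clusters.
ratio <- fit$betweenss / fit$totss
print(ratio)

# Plot the points colored by cluster to read the ratio alongside the data.
plot(points, col = fit$cluster, pch = 19)
```

A ratio close to 1 means most of the variation lies between clusters rather than within them, but as noted, it should be read together with a plot of the data.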
-
Pimwadee Chaovalit replied to the topic Q&A Meeting arrangement session week 2 in the forum Archive 2021 3 years, 1 month ago
Dear Rawinan,
Thanks for your question. Accuracy is the percentage of correct predictions, while error rate is the percentage of incorrect predictions. Those two numbers are calculated across all classes. Precision tells you how correct the algorithm was in predicting the positive class. Finally, recall tells you how much of the actual positive…
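A small worked sketch of these four measures, using the standard definitions. The labels below are invented, and treating "pos" as the positive class is an assumption for illustration.

```r
# Hypothetical actual and predicted labels; "pos" is the positive class.
actual    <- c("pos", "pos", "pos", "neg", "neg", "neg", "neg", "pos")
predicted <- c("pos", "neg", "pos", "neg", "pos", "neg", "neg", "pos")

# Counts from the confusion matrix.
tp <- sum(actual == "pos" & predicted == "pos")  # true positives
fp <- sum(actual == "neg" & predicted == "pos")  # false positives
fn <- sum(actual == "pos" & predicted == "neg")  # false negatives
tn <- sum(actual == "neg" & predicted == "neg")  # true negatives

accuracy   <- (tp + tn) / length(actual)  # correct predictions, all classes
error_rate <- 1 - accuracy                # incorrect predictions, all classes
precision  <- tp / (tp + fp)              # predicted positives that were right
recall     <- tp / (tp + fn)              # actual positives that were found

# table() gives the full confusion matrix for inspection.
table(actual, predicted)
```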
-
Pimwadee Chaovalit replied to the topic COVID-19 header information link is dead in the forum Assignment 4 years, 2 months ago
Thank you for informing us. I have chosen the README.md link and replaced the dead link with it on the assignment page.
-
Pimwadee Chaovalit replied to the topic Week 2 Assignment : Data Source in the forum Assignment 4 years, 2 months ago
I believe the dataset has been processed from its original form. The data with a bunch of numbers and no B or M diagnosis is in fact described here: http://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.names
An excerpt from the above file is below:
=========================
7. Attribute Information:…
-
Pimwadee Chaovalit replied to the topic Class in decision tree analysis assignment in the forum Assignment 4 years, 2 months ago
Referring to the file “breast-cancer-wisconsin.names” within this data folder (http://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/), the class attribute is in the last column.
-
Pimwadee Chaovalit replied to the topic R script for assignment 1 in the forum Assignment 4 years, 3 months ago
Hi. It seems from the error message (NA/NaN/Inf in foreign function call (arg 1)) that the error has something to do with invalid data types. Perhaps the input data is not conforming to the required input for kmeans?
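One possible way to check for this before calling kmeans() is sketched below. The data frame and its column names are hypothetical; the assumption is that the input contains a missing value or a non-numeric column, which are common causes of this error.

```r
# Hypothetical input with one missing value and one non-numeric column.
measurements <- data.frame(
  height = c(1.2, 1.4, NA, 5.1, 5.3, 5.0),
  weight = c(10, 11, 12, 50, 52, 51),
  label  = c("a", "b", "c", "d", "e", "f")
)

# kmeans() needs numeric input, so keep the numeric columns only.
numeric_cols <- measurements[sapply(measurements, is.numeric)]

# is.finite() is FALSE for NA, NaN, and Inf, so this flags every problem row.
bad_rows <- !apply(numeric_cols, 1, function(r) all(is.finite(r)))
clean <- numeric_cols[!bad_rows, ]

set.seed(1)
fit <- kmeans(clean, centers = 2)
print(fit$cluster)
```

Dropping the flagged rows is only one option; whether to remove or impute them depends on the data.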
-
Pimwadee Chaovalit replied to the topic Week 1 section 1.5 Nearest and Farthest neighbour clustering in the forum General Topic 4 years, 3 months ago
Thanks for the question! Yes, the reading assignment 2 of section 1.5 mentions both single linkage and complete linkage, which are two different ways for agglomerative clustering to merge smaller clusters into bigger clusters. Even though we did not mean to elaborate on their definitions and differences in this course, we are happy to see its…