Name: TMHG 535 Mathematical and Economic Modeling Applications in Biomedical and Public Health
Start: 2025-8-24
End: 2025-9-21

This topic has 1 reply, 2 voices, and was last updated 1 year, 10 months ago by Pimwadee Chaovalit.

Viewing 1 reply thread

Author

Posts
- 2023-10-05 at 6:17 pm #42150
  
  ABDILLAH FARKHAN
  Participant
  
  Hi, I am glad to have the chance to study clustering here, but I am concerned about data points at abnormal positions or distances (what we call outliers). Some algorithms, such as K-means and hierarchical clustering, may be tolerant and assign outliers to clusters. However, other algorithms such as DBSCAN do not assign outliers to any cluster (from what I learned in YouTube videos). So I would like to ask: how should we handle outliers? Should we take them into account in the clusters, or do we just ignore them? Thank you.
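
To make the DBSCAN behaviour mentioned in this post concrete, here is a minimal sketch of the algorithm in plain Python. It is not a production implementation (in practice one would likely use a library), and the toy points, `eps`, and `min_pts` values are assumptions chosen only for illustration. Points that do not have enough neighbours within `eps` and are not reachable from a dense point keep the label `-1`, which is how the algorithm leaves outliers out of every cluster.

```python
from math import dist

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN sketch: returns one label per point; -1 marks noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # not dense enough; may stay an outlier
            continue
        cluster += 1                   # point i is a core point: start a cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is also a core point: keep expanding
    return labels

# Hypothetical toy data: two tight groups and one far-away point.
points = [
    (1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2),
    (5.0, 5.0), (5.1, 4.9), (4.9, 5.2), (5.2, 5.1),
    (20.0, 20.0),
]
labels = dbscan(points, eps=0.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The last point is reported as noise rather than being forced into either group, which matches the observation that DBSCAN does not assign outliers to a cluster.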
- 2023-10-08 at 4:26 am #42199
  
  Pimwadee Chaovalit
  Keymaster
  
  Hi Abdillah,
  
  You are right to be concerned about outliers. Outliers show up in all kinds of datasets! When we are trying to make sense of the data, we usually don’t want to focus too much on exceptional cases like outliers, because they can distort our understanding of the nature of the data.
  
  The exception is when we want to study those extreme cases; then outliers are important. For clustering, though, algorithms that are sensitive to outliers, like k-means or hierarchical clustering, need to be used with care. If we know the data has outliers, one option is to run the clustering with and without those cases to see whether the results change.
  
  I hope that makes sense. Thanks for sparking the discussion!
  
  Pimwadee
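
The suggestion in the reply above, running the clustering with and without the suspected outliers and comparing the results, can be sketched as follows. This is a small illustrative k-means (Lloyd's algorithm) in plain Python, not the course's code. The toy data, the single outlier at `(50, 50)`, and the choice of starting centers (one point from each real group) are all assumptions made for the example.

```python
from math import dist

def kmeans(points, centers, max_iter=100):
    """Plain Lloyd's algorithm sketch; returns (labels, centers)."""
    centers = list(centers)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break                      # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centers

# Hypothetical toy data: two real groups plus one extreme case.
group_a = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2)]
group_b = [(5.0, 5.0), (5.1, 4.9), (4.9, 5.2), (5.2, 5.1)]
outlier = (50.0, 50.0)

# Same starting centers in both runs: one point from each real group.
start = [group_a[0], group_b[0]]

labels_clean, _ = kmeans(group_a + group_b, start)
labels_with, _ = kmeans(group_a + group_b + [outlier], start)

print(labels_clean)  # [0, 0, 0, 0, 1, 1, 1, 1]
print(labels_with)   # [0, 0, 0, 0, 0, 0, 0, 0, 1]
```

In this toy case the two real groups are separated cleanly without the outlier, but once it is included the outlier takes a cluster for itself and the two groups are merged. A change this large between the two runs is a sign that the outlier, rather than the structure of the data, is driving the result.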

The outliers in clustering
