Detecting outliers: analysis technique for data scientists
This year marked the eighth reunion of my high school classmates. A few years ago, we rediscovered each other through various social media channels, and we have been meeting regularly in person ever since.
As happens in any group, most of the friends enjoy talking to each other and reliving memories of the past. However, a couple of them dominate the conversations.
Prashant is a very outspoken person. He is always seen talking to someone or other from the moment he enters the room. He is a bundle of energy and cannot be missed when he is around.
Suhas, on the other hand, enjoys organizing the event: collecting the RSVPs, arranging various activities, and then collecting the contributions and settling the payments. Suhas ensures that everyone participates and enjoys the event.
This year, unfortunately, neither Prashant nor Suhas could attend the reunion. Who do you think was missed by all?
During data analysis, when we look at outliers, we find that some have a higher impact on the results than others. It is therefore important to detect outliers and analyze them further. We get different and interesting results depending on what we look for and what we eliminate. We can categorize them as