It was the first day of high school. The class was full of students from different parts of the city, and a few had moved in from nearby towns. Apart from a handful who had been with me in primary school, most were new faces.

Naturally, the day started with a round of introductions. Each introduction contained so much information that it was simply not possible to remember everything. Mentally, I started noting a few important things about everyone.

• Gender – The most obvious piece of information. At high school age, that happens to be the most important attribute, for some reason.
• Interest in courses – He seems to be a bright student; he can help with my studies.
• Location of stay – He is from my neighborhood; can we travel to school together?
• Interest in sports – Wow! This guy plays chess; I must meet him after class.

All of this was happening so naturally that I did not even realize I was using the basic principle of tagging.

Tagging means identifying an attribute of an item and associating its value with that item, so that the value can be used to make future decisions. Tagging in itself is a very simple process. The difficulty lies in identifying the right attributes, identifying the possible values for each of them, and having an accurate way of measuring those values. It also consumes a lot of resources, in both elapsed time and budget, because the simple process has to be repeated a large number of times before machines can make effective use of the results. It can also demand a great deal of human effort.
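To make the definition concrete, here is a minimal sketch in Python of the three pieces described above: a set of attributes, the allowed values for each, and the association of a value with an item. Everything in it is an assumption made up for illustration, including the attribute names, the allowed values, the `Item` class, and the `tag` and `find` helpers; it is one possible way to picture tagging, not a prescribed method.

```python
from dataclasses import dataclass, field

# Hypothetical attributes and their possible values (illustrative only).
ALLOWED_VALUES = {
    "neighborhood": {"north", "south", "east", "west"},
    "sport": {"chess", "football", "none"},
}


@dataclass
class Item:
    """Anything we want to tag; here, a classmate."""
    name: str
    tags: dict = field(default_factory=dict)


def tag(item, attribute, value):
    """Associate an attribute value with an item, rejecting unknown ones."""
    if attribute not in ALLOWED_VALUES:
        raise ValueError(f"unknown attribute: {attribute}")
    if value not in ALLOWED_VALUES[attribute]:
        raise ValueError(f"unknown value for {attribute}: {value}")
    item.tags[attribute] = value


def find(items, attribute, value):
    """Use the tags to make a decision: who matches this attribute value?"""
    return [item.name for item in items if item.tags.get(attribute) == value]


# Placeholder classmates, tagged during the "introductions".
classmates = [Item("Student A"), Item("Student B"), Item("Student C")]
tag(classmates[0], "sport", "chess")
tag(classmates[1], "neighborhood", "north")
tag(classmates[2], "sport", "chess")
tag(classmates[2], "neighborhood", "north")

chess_players = find(classmates, "sport", "chess")
neighbors = find(classmates, "neighborhood", "north")
```

Notice that the code itself is trivial. The hard parts are the ones the sketch simply assumes: choosing which attributes go into `ALLOWED_VALUES`, deciding their possible values, and assigning the right value to each item.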

How do you perform tagging of your data set?

