Information is Messy – Building Taxonomies

For one of my projects I am working with taxonomies: sets of categories that I use for analyzing customer support emails. The project, UserChamp, aims to give product managers actionable information from their users, drawn from support conversations. Right now I create the categories by hand and then manually categorize email conversations from our users’ support. To scale the system and get more users on board, I need to productize and partly automate the categorization of support emails. And to do that, I need to formalize and productize how the taxonomy for each customer company is created and maintained.

Taxonomies are hard to work with for a few reasons. The applicable categories differ by company and by the context the company operates in. Depending on what the product is and who the users are, a category may be very common at one company and completely absent at another. As an extension of that, putting together a usable taxonomy requires domain knowledge from the company in question. For a complete outsider, creating a taxonomy from support tickets with no knowledge of the system or its users is very difficult. At the same time, the categories chosen need to form a taxonomy that works as a whole. Categories need to be granular enough to be useful, while the overall number of categories stays limited. Striking that balance takes knowledge of taxonomy design that is most likely not available in the team that has the domain knowledge.

So I need a middle ground. My working hypothesis right now is that a set of generic categories can serve as a starting point. A client would be presented with the generic categories and a sample of their own support tickets (something on the order of 500). They would then categorize the tickets as best they could. Any tickets that don’t fit a category go into a separate “Other” category; after the first pass, these are broken out into groups and labelled as new categories. This gets things going and yields a working taxonomy of issue categories, but changes will be needed down the line: new issue categories may appear and old ones disappear. So it is also necessary to have a set of principles for how to adapt the taxonomy over time. The working theory here is to set a threshold for how big a topic in the “Other” category has to be before it becomes a candidate for a new separate category.
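
To make the threshold idea concrete, here is a minimal sketch in Python. It is not how UserChamp works today, just one way the rule could look. Everything in it is an assumption for illustration: the function name `promotion_candidates`, the representation of a ticket as a `(category, topic)` pair, and the two cut-offs `min_share` and `min_count`. It also assumes that tickets in “Other” have already been grouped into topics by hand during the first pass.

```python
from collections import Counter

OTHER = "Other"  # catch-all category for tickets that fit nowhere else


def promotion_candidates(tickets, min_share=0.05, min_count=10):
    """Return topics inside "Other" that are big enough to become categories.

    tickets   -- list of (category, topic) pairs; topic is only meaningful
                 for tickets filed under "Other" (hypothetical data shape)
    min_share -- assumed cut-off: fraction of ALL tickets a topic must reach
    min_count -- assumed cut-off: absolute number of tickets a topic must reach
    """
    total = len(tickets)
    if total == 0:
        return []

    # Count how often each topic shows up among the "Other" tickets.
    topic_counts = Counter(
        topic for category, topic in tickets if category == OTHER and topic
    )

    # A topic is a candidate only if it clears both thresholds.
    return sorted(
        topic
        for topic, count in topic_counts.items()
        if count >= min_count and count / total >= min_share
    )


# Example: 100 sampled tickets, 20 of which landed in "Other".
sample = (
    [("Billing", None)] * 60
    + [("Login", None)] * 20
    + [(OTHER, "export to CSV")] * 15
    + [(OTHER, "dark mode")] * 5
)
print(promotion_candidates(sample))
```

Using both a share and an absolute count is a design choice in this sketch, not something I have validated: the share keeps the rule comparable across companies with different ticket volumes, and the count avoids promoting a topic on the strength of a handful of tickets in a small sample.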

