I used to loathe working on categorization projects because
I knew eventually someone would say, “I don’t know, put it in a miscellaneous
folder.” This meant that the overall design of the categories was flawed or
that we didn’t have enough time and energy to work out every minute detail,
only to have it change in a few months anyway.
Flexible categorization makes sense, but the tools are still
designed to tag content with fixed values. Big Data solutions might eventually
help with this if you have millions of dollars to spend on them. For small to
medium-sized systems, we are stuck with good old-fashioned indexing and search.
However, this might prove to be a better long-term solution to content mining.
I still believe that the better the metadata, the better the search results.
The Misc folder serves a number of different purposes:
In general, this folder can be used to analyze new trends in
metadata values; that is, patterns of values will become apparent as more
content goes there. Over time, the patterns will become folders/categories and
their metadata values will become part of the indexing process. Likewise,
categories that are almost empty will be merged with others because their index
values are too restrictive.
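As a rough sketch of that kind of trend analysis, here is how it might look in Python. It assumes each item is a plain dictionary of metadata and that doc_type is the field being watched. The field name, the thresholds, and the sample data are all made up for illustration and are not tied to any particular tool.

```python
from collections import Counter


def suggest_promotions(misc_items, field, min_count=3):
    """Return metadata values that recur in Misc at least min_count times.

    These are candidates for becoming their own folder/category.
    """
    counts = Counter(item[field] for item in misc_items if field in item)
    return [value for value, n in counts.most_common() if n >= min_count]


def sparse_categories(categories, max_items=1):
    """Return names of categories with so few items that they could be merged."""
    return sorted(name for name, items in categories.items() if len(items) <= max_items)


# Hypothetical contents of the Misc folder.
misc = [
    {"title": "Q1 invoice", "doc_type": "invoice"},
    {"title": "Q2 invoice", "doc_type": "invoice"},
    {"title": "Q3 invoice", "doc_type": "invoice"},
    {"title": "Parking memo", "doc_type": "memo"},
    {"title": "Scanned page"},  # no doc_type at all
]

# Hypothetical existing categories and the items filed in each.
categories = {
    "Contracts": ["c1", "c2"],
    "Faxes": ["f1"],
    "Reports": ["r1", "r2", "r3"],
}

promote = suggest_promotions(misc, "doc_type", min_count=3)
merge = sparse_categories(categories, max_items=1)
```

The thresholds are judgment calls. The idea is only that recurring values in Misc point to new categories, and near-empty categories point to merges.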
In taxonomy, a miscellaneous folder is a black box,
something that gets everything outside the scope of the people working on it
at the time. The emphasis is on “at the time” because as ways of organizing
information change, so does the taxonomy.
In workflow, miscellaneous really means a bucket where
all the routing mistakes are sent or, more likely, where any new, unanticipated
content types go. This works well because, as the content builds up, it becomes
obvious which queue each item should be routed to. The alternative would be to
ignore the outliers, which would leave them for discovery projects in the future.
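The routing idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the queue names, the doc_type field, and the routing table are hypothetical, and a real workflow engine would have its own way of expressing rules.

```python
from collections import Counter, defaultdict

# Hypothetical routing table: content type -> queue name.
ROUTES = {"invoice": "accounts_payable", "resume": "human_resources"}
MISC_QUEUE = "misc"


def route(item, routes=ROUTES):
    """Pick a queue for an item; anything unrecognized falls through to Misc."""
    return routes.get(item.get("doc_type"), MISC_QUEUE)


def build_queues(items, routes=ROUTES):
    """Group items by the queue they are routed to."""
    queues = defaultdict(list)
    for item in items:
        queues[route(item, routes)].append(item)
    return queues


def misc_backlog(queues):
    """Count the content types piling up in Misc, to show which routes are missing."""
    return Counter(item.get("doc_type", "unknown") for item in queues.get(MISC_QUEUE, []))


# Hypothetical incoming content.
incoming = [
    {"doc_type": "invoice"},
    {"doc_type": "warranty_claim"},
    {"doc_type": "warranty_claim"},
    {"doc_type": "resume"},
    {},  # no type recorded
]

queues = build_queues(incoming)
backlog = misc_backlog(queues)
```

Nothing is dropped here. Outliers still land somewhere visible, and the backlog count shows when a type such as warranty_claim has built up enough to earn its own queue.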