Tuesday, January 26, 2016

The Role of MISC in ECM

I used to loathe working on categorization projects because I knew that eventually someone would say, “I don’t know, put it in a miscellaneous folder.” This meant either that the overall design of the categories was flawed, or that we didn’t have the time and energy to work out every minute detail of a scheme that would change in a few months anyway.

Flexible categorization makes sense, but the tools are still designed to tag content with fixed values. Big Data solutions might eventually help with this if you have millions of dollars to spend on them. For small to medium-sized systems, we are stuck with good old-fashioned indexing and search. However, indexing and search might prove to be the better long-term solution for content mining. I still believe that the better the metadata, the better the search results.

The Misc folder serves a number of different purposes:

In general, this folder can be used to analyze new trends in metadata values: patterns of values become apparent as more content lands there. Over time, those patterns become folders or categories, and their metadata values become part of the indexing process. Likewise, categories that are almost empty will be merged with others because their index values are too restrictive.
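As a rough illustration of that kind of analysis, here is a minimal Python sketch. It assumes each document is a plain dictionary of metadata; the field name `doc_type`, the thresholds, and both function names are made up for this example and do not come from any particular ECM product.

```python
from collections import Counter


def suggest_categories(misc_items, field, min_count=3):
    """Return values of `field` seen at least `min_count` times in Misc.

    These recurring values are candidates for their own folder/category.
    """
    counts = Counter(item[field] for item in misc_items if item.get(field))
    return [value for value, n in counts.most_common() if n >= min_count]


def nearly_empty(category_sizes, max_items=2):
    """Return (sorted) the categories holding `max_items` documents or fewer.

    These are candidates for merging into a broader category.
    """
    return sorted(name for name, size in category_sizes.items() if size <= max_items)


# Hypothetical contents of the Misc folder.
misc = [
    {"title": "Q1 lease", "doc_type": "lease"},
    {"title": "Q2 lease", "doc_type": "lease"},
    {"title": "Parking lease", "doc_type": "lease"},
    {"title": "Lunch menu", "doc_type": "menu"},
    {"title": "Untitled scan"},  # no metadata at all
]

# Hypothetical document counts per existing category.
sizes = {"invoices": 1200, "contracts": 340, "faxes": 1, "telex": 0}

new_candidates = suggest_categories(misc, "doc_type")
merge_candidates = nearly_empty(sizes)
```

With this sample data, `lease` shows up often enough to be proposed as a category, while `faxes` and `telex` are flagged for merging. The thresholds are judgment calls and would need tuning for a real repository.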

In taxonomy, a miscellaneous folder is a black box, something that gets everything outside the scope of the people working on it at the time. The emphasis is on “at the time” because as ways of organizing information change, so does the taxonomy.

In workflow, miscellaneous really means a place/bucket where all the routing mistakes are sent or, more likely, where any new, unanticipated content types go. This works well because, as the content builds up, it becomes obvious which queue it should be routed to. The alternative would be to ignore the outliers, which would leave them for discovery projects in the future.
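The workflow case can be sketched the same way. This is only an illustration under assumed names: the routing table, the queue names, and the `doc_type` field are hypothetical, and a real workflow engine would have its own way of defining a default route.

```python
from collections import defaultdict

# Hypothetical routing table: content type -> work queue.
ROUTES = {
    "invoice": "accounts_payable",
    "resume": "human_resources",
}


def route(doc_type, routes=ROUTES):
    """Return the queue for a content type, falling back to 'misc'."""
    return routes.get(doc_type, "misc")


def fill_queues(documents, routes=ROUTES):
    """Group documents by the queue they are routed to."""
    queues = defaultdict(list)
    for doc in documents:
        queues[route(doc.get("doc_type"), routes)].append(doc)
    return dict(queues)


# Hypothetical incoming documents; two have a type nobody planned for.
incoming = [
    {"id": 1, "doc_type": "invoice"},
    {"id": 2, "doc_type": "w9"},
    {"id": 3, "doc_type": "w9"},
    {"id": 4},  # type missing entirely
]

queues = fill_queues(incoming)
```

Here the two `w9` documents and the untyped one all land in `misc`. Once enough of one type piles up there, adding an entry to the routing table sends future arrivals to the right queue.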
