Sunday, April 24, 2011

The Intersections of Metadata, eDiscovery, Taxonomy, and Records Management

Designing and implementing the systems that manage content (as opposed to creating, reviewing, and approving it), such as metadata, eDiscovery, taxonomy, and records management, can be a challenge if done in a vacuum. Each of these systems of content description and rules has intersection points with the others.

I have witnessed what happens when one system is designed without taking the others into account: change management nightmares. Think about freezing a file plan for RM and then having to change it… Add to this CMIS and other web services that try to perform similar actions on content, and we have a web of interactions that collide with, push, and pull on each other.

Why do monolithic ECM companies have to apply layer upon layer (the apps mentioned above) of abstraction, rules, and XML configuration to do very core things to content? Because it sells new products and brings in market share from other vendors: it allows ECM companies to grow and to please their shareholders. So where do the specialized products and do-it-all products meet? At the following components of description and action:

Tip: plan for ways to migrate large amounts of information into the repository.
Provide a mechanism to get content into the repository and to describe it as table entries in a database that point to the file location.
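
As a rough illustration of "table entries that point to the file location", here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and the open_catalog and ingest helpers are my own assumptions for the example, not any particular ECM product's schema.

```python
import hashlib
import sqlite3
from datetime import datetime, timezone


def open_catalog(db_path=":memory:"):
    """Open (or create) a hypothetical catalog database with one table of content entries."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS content_item (
               id          INTEGER PRIMARY KEY,
               file_path   TEXT NOT NULL UNIQUE,  -- pointer to where the file lives
               title       TEXT NOT NULL,
               sha256      TEXT NOT NULL,         -- fingerprint for later validation
               ingested_at TEXT NOT NULL
           )"""
    )
    return conn


def ingest(conn, file_path, title, data):
    """Record one piece of content: the row describes it, file_path points to it."""
    digest = hashlib.sha256(data).hexdigest()
    now = datetime.now(timezone.utc).isoformat()
    cur = conn.execute(
        "INSERT INTO content_item (file_path, title, sha256, ingested_at) "
        "VALUES (?, ?, ?, ?)",
        (file_path, title, digest, now),
    )
    conn.commit()
    return cur.lastrowid
```

For a bulk migration the same insert would be wrapped in a loop over the source files, committing in batches rather than per row.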

Tip: plan for inevitable change with a metadata repository.
Describe the important and pertinent aspects of the content for the purposes of discovery, 21 CFR Part 11, ISO 15489, MoReq2, SAS 70, etc. Describe for the rules and regulations, not for the applications. Describe for the audit. Describe for the user trying to process invoices through approval and payment.

Tip: Google is Google for a reason; don’t try to copy them.
Searching for content, especially lots of content, is a major challenge for large repositories. For lawyers trying to find content pertinent to a class action suit, this can be good or bad depending on the company’s strategy and how it weighs the fines for not auditing correctly against the risk of finding self-damaging evidence. The key is how to handle results, and this is still in its infancy.

Tip: think about the ramifications of completely changing the folder structure or overlaying it with multiple ways of “seeing” the information.
Many companies discount how powerful the folder hierarchy metaphor still is. They throw content into repositories and hope they can find it through searching. Only later do they figure out that folders can be thought of as virtual, in the sense that they can change in structure and labeling without disrupting other ways to find the content they need.
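
To make the "virtual folder" idea concrete, here is a small sketch in which folder paths are computed from metadata instead of being stored with the content. The build_view function and the department, year, and doc_type attributes are assumptions invented for this example.

```python
from collections import defaultdict


def build_view(items, keys):
    """Group item ids under a virtual folder path built from the given metadata keys."""
    view = defaultdict(list)
    for item in items:
        # Items missing an attribute land in an "unfiled" folder at that level.
        path = "/" + "/".join(str(item.get(k, "unfiled")) for k in keys)
        view[path].append(item["id"])
    return dict(view)


# Hypothetical items; only the metadata matters here, not the content itself.
items = [
    {"id": 1, "department": "Finance", "year": 2010, "doc_type": "Invoice"},
    {"id": 2, "department": "Finance", "year": 2011, "doc_type": "Invoice"},
    {"id": 3, "department": "Legal", "year": 2011, "doc_type": "Contract"},
]

# Two different "folder structures" over the same content, neither moving a file.
by_department = build_view(items, ["department", "year"])
by_type = build_view(items, ["doc_type", "year"])
```

Changing the hierarchy then means changing the list of keys, which is why good metadata has to come first.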

Tip: be careful of file names and deep folder paths.
Get the content and metadata out of the system for discovery, migration, or long-term storage. The challenge is keeping the attributes and audit trails together and preserving the context of the content: locations, original modification dates, user data, and validation.
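
One way to keep attributes, audits, and context together on the way out is to write a manifest next to each exported file. This is only a sketch: the export_item function, the manifest field names, and the ".manifest.json" suffix are assumptions for illustration, not a standard export format.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def export_item(src, metadata, audit_events, out_dir):
    """Copy one file to out_dir and write a JSON manifest that preserves its context."""
    src = Path(src)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    data = src.read_bytes()
    modified = datetime.fromtimestamp(src.stat().st_mtime, timezone.utc)
    (out_dir / src.name).write_bytes(data)

    manifest = {
        "original_path": str(src),                  # location context
        "original_modified": modified.isoformat(),  # survives even if the copy's mtime changes
        "sha256": hashlib.sha256(data).hexdigest(), # lets the receiver validate the content
        "metadata": metadata,                       # descriptive attributes
        "audit": audit_events,                      # who did what, and when
    }
    manifest_path = out_dir / (src.name + ".manifest.json")
    manifest_path.write_text(json.dumps(manifest, indent=2))
    return manifest_path
```

Recording the original path in the manifest also helps with the tip above: a long file name or deep folder path can be shortened in the export without losing where the content came from.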

Tip: here’s where poor metadata and lazy content management really cost a company huge bucks in maintaining backups of worthless content.
Delete unwanted content, period. If you are like some pharma or financial companies and send all of your old content to Iron Mountain, you are a hoarder and should seriously look at your retention policy.
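
A retention policy only works if something can decide, per item, whether it is time to delete. Here is a minimal sketch of that decision; the retention periods, the record_type and legal_hold attributes, and the is_disposable function are all hypothetical and would come from your own file plan and legal department.

```python
from datetime import date

# Hypothetical retention periods in years, keyed by record type.
RETENTION_YEARS = {"invoice": 7, "email": 3}


def is_disposable(item, today):
    """Return True if the item is past its retention period and not on legal hold."""
    if item.get("legal_hold"):
        return False  # a hold for discovery always overrides the schedule
    years = RETENTION_YEARS.get(item["record_type"])
    if years is None:
        return False  # no rule for this type: keep it until someone classifies it
    created = item["created"]
    try:
        cutoff = created.replace(year=created.year + years)
    except ValueError:
        # Created on Feb 29 and the cutoff year is not a leap year.
        cutoff = created.replace(year=created.year + years, day=28)
    return today >= cutoff
```

Notice that the check depends entirely on metadata (record type, creation date, hold status), which is exactly where poor description turns into paying to store worthless content.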
