Thursday, December 22, 2011

Quick Overview of Designing a Basic DCTM Solution

Make sure the business requirements are complete and describe use cases on how the users plan to use the system. The users might say they want everything the "old" system has, but work with them to be as detailed as possible.

List out the attributes which describe the assets.
Custom object types should be created where the attributes are not standard and need to be entered, validated, and searched by the users via forms on the application server.

Security: Users/Groups/ACLs
Access Control must be detailed out and matched to the custom object types. It may be that you will only have one custom type and one main ACL.
  • Users need to be mapped to groups
  • Groups to ACLs
  • ACLs to folders/objects
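As a rough sketch, the mapping chain above can be modeled as plain lookup tables. All names below are invented examples, not real repository objects:

```python
# Hypothetical illustration of the chain: users -> groups -> ACLs -> folders.
user_to_groups = {
    "jsmith": ["authors"],
    "mjones": ["reviewers"],
}
group_to_acl = {
    "authors": "acl_draft_content",
    "reviewers": "acl_review_content",
}
acl_to_folders = {
    "acl_draft_content": ["/Projects/Drafts"],
    "acl_review_content": ["/Projects/In Review"],
}

def folders_for_user(user):
    """Resolve which folders a user can reach by walking the chain."""
    folders = []
    for group in user_to_groups.get(user, []):
        acl = group_to_acl.get(group)
        if acl:
            folders.extend(acl_to_folders.get(acl, []))
    return folders

print(folders_for_user("jsmith"))  # ['/Projects/Drafts']
```

Laying the chain out this way, even on paper, makes it obvious where a user's access actually comes from and which ACL to adjust when it is wrong.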

Lifecycles
Lifecycles are used to automate changing attributes on content as it goes through stages of development: Draft, Review, Approve, Obsolete, etc.
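A minimal sketch of that promotion chain, assuming a simple linear lifecycle. This is an illustration only, not Documentum's actual lifecycle API, and the attribute names are examples:

```python
# The Draft -> Review -> Approve -> Obsolete chain as a tiny state machine.
LIFECYCLE = ["Draft", "Review", "Approve", "Obsolete"]

def promote(state):
    """Return the next lifecycle state, or the same state if already final."""
    i = LIFECYCLE.index(state)
    return LIFECYCLE[min(i + 1, len(LIFECYCLE) - 1)]

def on_promote(doc, state):
    """Automate attribute changes on promotion, as a real lifecycle would."""
    doc["current_state"] = state          # illustrative attribute names
    doc["status"] = state.lower()
    return doc

doc = {"object_name": "spec.docx"}
doc = on_promote(doc, promote("Draft"))
print(doc["status"])  # review
```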

Business Process
Using the Use Cases as a guide, create workflows which have activities that follow the procedures for publishing and/or storing assets.

Metadata
The key to finding content is describing it well enough that users can find it better than they do now.

Folder Structure
Keep folder structure simple, 2-3 levels deep max if that, and rely more on search to find content.

Most important steps:
Document functional and technical specs
Build prototypes for the users.

Thursday, October 20, 2011

FAQ for Information: Frequently Needed Information (FNI)

How many times have you vaguely remembered a project you worked on a few years ago and wanted to review it as a template for what you are working on now? Frequently Needed Information is a need, like tags and metadata, but one that is thoughtful and idiosyncratic: a bucket of information set aside just for you.
When you need it, you know where to look, like a drawer in the middle of the kitchen that gives quick access to useful utensils. The drawer carries implicit shared knowledge too, since members of the family also know to use it. This is not Enterprise 2.0 or social media hype; this is essential information readiness.

Is there a feature like this in Webtop, CenterStage, or xCP? Not yet. Most frequently used is not necessarily what I need to look at for the next project. Should this be automatic or tagged? Should I only have to tag the content in the authoring program and not from the content management system application UI? There's room for improvement.

Jeremy Rifkin knows how to build the pillars of the "Third Industrial Revolution". If energy creation is going the way of distributed green energy, then why can't we figure out distributed ECM? Is it that centralized control and production of ECM software is still too proprietary and entrenched in our hierarchical businesses? 

Decentralized pods of information gathering and projection could emerge as a possible solution. Every business unit would be on its own, but would have to play within the rules and make sense to the whole, or it would fail. Business units already have relationships and control patterns in their normal daily tasks; now they need to fashion their use of technology to these habits.

The "man" is turning into the "people"; we just don't realize it yet. Facebook realizes it with relationships; now it's time for business to let go of the reins of information and to start the new information revolution within their companies. Rules and regulations will still dictate processes and procedures in some aspects of information gathering, but individuals and groups will slowly gain more and more license to gather their own utensils in the kitchen drawer and use them effectively. Enterprise 2.0 should really be Distributed Information 1.0.

Saturday, September 24, 2011

DCTM Outage Scenarios

Outage Windows
Typically, weekend outage times are acceptable to users of the system. These times will be used to deploy most fixes or upgrades.

Each server has specific requirements for OS upgrades as well as application upgrades, and these upgrades may require downtime. Separate evaluations will have to be done by reviewing the risk matrix to determine the number of integration dependencies.

Server Failures and VM Clones
Services on servers fail for a variety of reasons. Each server should have a recovery policy associated with it. For example, a clone of each server could be maintained for fast recovery of that particular server.

Routine Maintenance
Occasionally, patches will be applied to DCTM software installations. These patches may require restarting the services.

For problems with individual applications on servers, a procedure for fixing the issue in development, testing in Validation, and deploying to Production will be followed.

The SLA required by the GxP rules states that a 4-hour outage is acceptable. This means that HA for the DCTM system is not required.

Thursday, September 22, 2011

The risks for service outage

The risks for service outage can be broken down into three categories:

Server: each server has services which are vulnerable to outage. These servers are the Content Server, Index Server, Application Server, Database Server, and the Storage Server.
Systemic: The dependency of each server’s integration(s) with each other is vulnerable to outage. For example, if the content server goes down, the application will be out; if the database or storage goes out, the content server is down, etc.
Disaster: This would mean that the whole server room is down. The disaster scenario would cause the DR system to synch and start up.

The risks of services going down are real, and outages happen most often at the server level. User complaints occur during times when performance is slow, which may be a sign that a service is in trouble. Integrations between DCTM and other services are often risky because it is assumed that the other services are always up. If a company is growing, the network will be changing, databases will stumble, even electrical circuits will blow, so keep all of this in mind and in your recovery plans regardless of assurances that this "will never happen".

Risk Matrix by Server

Server                       | Integration Dependency         | Risk Level               | Mitigation
Storage App/Storage Services | Database, Content, Index, App  | Low (If HA, redundancy)  | monitoring scripts
Database Server              | Content Server                 | Low (If HA, redundancy)  | monitoring scripts
LDAP Server                  | App/Content Server             | Low (If HA, redundancy)  | monitoring scripts
DNS Server                   | All Servers                    | Low (If HA, redundancy)  | monitoring scripts
Repository Services          | App Servers, Index Servers     | Med (If standalone)      | monitoring scripts
Java Method Server (JBoss)   | Index agents, Jobs, workflow   | Med (If standalone)      | monitoring scripts
xPlore Servers and Agents    | App Server Search              | Med (If standalone)      | monitoring scripts

Disaster Recovery systems are replicated systems which constitute a low but viable risk.
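A "monitoring script" of the kind listed in the matrix can be as simple as a TCP reachability check. The host names and ports below are placeholders, not a real deployment:

```python
# Minimal service-reachability monitor: try to open a TCP connection to each
# service's host/port and report which ones answer.
import socket

SERVICES = {
    "content_server": ("cs01.example.com", 1489),
    "app_server": ("app01.example.com", 8080),
    "database": ("db01.example.com", 1521),
}

def is_up(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def check_all():
    """Check every configured service and return {name: up/down}."""
    return {name: is_up(host, port) for name, (host, port) in SERVICES.items()}
```

In practice a script like this would run from cron and page someone on a False result; the point is that the "Mitigation" column need not imply anything heavier.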

Tuesday, September 6, 2011

In the Aftermath of EMC Sales and Sales Engineers

Any consultant who has landed a project after EMC sales and sales engineers have "sold" the DCTM software suite knows that resetting the client's expectations can be a challenge. The motivations of sales and implementation are two completely different animals. EMC Sales wants licenses and commissions; consultants want to design, develop, and deploy the best possible solution (ideally). The intersection of these two perspectives is the customer who, more times than not, ends up feeling deceived and cheated.

So how do we accommodate the claims of EMC sales? First, accept that the client will want more than the software can deliver. For example, if the sales engineer said InputAccel for invoices can learn automatically how to pick up line items from an invoice, then you need to immediately explain in fuller detail what validation means and the steps taken for IA to actually “learn” the layout of an invoice.

Another example is the claim that it takes only a few weeks to implement an enterprise-wide solution for content management. If you installed the vanilla products and walked away, maybe, but the client would be left with a car, no clue how to drive it, and no roads to follow.

Second, do not make promises that you know you can't keep. If you bid low to get a project, get ready to pay the consequences. Be honest and as comprehensive as possible. Show the client, in detail, where they will have to pay more to accomplish what EMC sales had envisioned for them. The client wants a great deal and everything for free, but it is your job to bring them back to reality.

Sunday, April 24, 2011

The Intersections of Metadata, eDiscovery, Taxonomy, and Records Management

Designing and implementing systems which manage content (outside of creating/reviewing/approving content) such as metadata, ediscovery, taxonomy, and records management can be a challenge if done in a vacuum. Each of these systems of content description and rules has intersection points with one another.

I have witnessed what happens when one system is designed without taking the others into account: change management nightmares. Think about freezing a file plan for RM and then having to change it… Add to this, CMIS and other web services which try to do similar actions on content and we have a web of interactions that collide and push and pull on each other.

Why do monolithic ECM companies have to apply layer upon layer (apps mentioned above) of abstraction and rules and xml configurations to do very core things to content? Because it sells new products and brings in market share from other vendors: it allows ECM companies to grow and to please their shareholders. So where do the specialized products and do-it-all products meet? At the following components of description and action:

Tip: Plan for developing ways to migrate large amounts of information into the repository.
A mechanism to get content into the repository and to describe it as table entries in a database that points to the file location.
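That core mechanism can be sketched with an in-memory database table whose rows point at file locations. The table and column names here are illustrative only, not Documentum's actual schema:

```python
# The heart of a repository, reduced to its essence: metadata rows in a
# database that point to where the content file actually lives.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE document (
        object_name   TEXT,
        content_path  TEXT,
        creation_date TEXT
    )
""")
conn.execute(
    "INSERT INTO document VALUES (?, ?, ?)",
    ("invoice_001.pdf", "/filestore/00/01/invoice_001.pdf", "2011-04-24"),
)
row = conn.execute(
    "SELECT content_path FROM document WHERE object_name = ?",
    ("invoice_001.pdf",),
).fetchone()
print(row[0])  # /filestore/00/01/invoice_001.pdf
```

Bulk import is then, at bottom, generating these rows at scale while moving the files into place, which is why migration tooling matters so much.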

Tip: plan for inevitable change with a metadata repository.
Describe the important/pertinent aspects of the content for the purposes of discovery, 21 CFR part 11, ISO-15489, MOREQ2, SAS 70, etc. Describe for the rules and regulations not for the applications. Describe for the audit. Describe for the User trying to process invoices through approval and payment.

Tip: Google is Google for a reason, don’t try to copy them.
Searching for content, especially lots of content, is a major challenge for large repositories. For lawyers trying to find content pertinent to a class action suit, this can be good or bad depending on the company's strategy and how it weighs the fines for not auditing correctly against finding self-damaging evidence. The key is how to handle results, and this is still in its infancy.

Tip: think about the ramifications of completely changing the folder structure or overlaying it with multiple ways of “seeing” the information.
Many companies discount how powerful the folder hierarchy metaphor still is. They throw content into repositories and hope they can find it through searching. Only later do they figure out that folders can be thought of as virtual, in the sense that they can change in structure and labeling without disrupting other ways to find the content they need.

Tip: be careful of file names and deep folder paths.
Get the content and metadata out of the system for discovery, migration, or long-term storage. The issues are getting the attributes and audits together and maintaining the context of the content with locations and original modification dates, user data, and validation.

Tip: here’s where poor metadata and lazy content management really cost a company huge bucks in maintaining backups of worthless content.
Delete unwanted content period. If you are like some pharma or financial companies and send all of your old content to Iron Mountain, you are a hoarder and should seriously look at your retention policy.

Saturday, April 16, 2011

Parsing the xCP Buzz Words

Taking all of the buzz words out of the Documentum xCP pitch we’re left with “accelerated”, “content” and “platform”.

If a client would accept building a TaskSpace application from a napkin in their production environment, then I'd say this accelerates the application build process, but no client's IT department would allow it. TaskSpace apps are built in development and deployed to test and prod via Composer; quick, right? Well, what about the requirements and functional aspects of the solution? Are those made quicker? No, and here's why:

Let’s say we have a solution where we scan/capture invoices, process them, and finally report on them. Easy, just like the slick end-to-end demo that EMC sales did, right? Install InputAccel and you’re done? Install Forms and Process Builder and slap together a workflow? I don’t think so.

The problem with smoke and mirrors is that we as solution architects and developers get blamed for how long it takes to build a solution that the sales guys touted as a piece of cake, three months to build max. The three-month schedule should be more like six to nine months. The sales guys are long gone, and the customer is annoyed and starts to cover their own asses as the bean counters tap their fingers.

These products might be easier to use for cookie-cutter solutions, but what about the 20% of a solution that doesn’t fit the mold? You need requirements, which take time; you need functional specs to set up the configurations for scanning, forms, processes, use cases, etc. This takes more time than is usually allotted. This is not accelerated.

What is going to happen with the old content? The legacy stuff needs to be migrated. Where are the requirements for this? What are the new attributes and object model for the new system? What is the mapping of old to new attributes? The sales guys didn’t talk about this. This is not part of the acceleration.

The “platform” is still a mashed up combination of export connections (InputAccel) and xml integrations between Forms Builder and TaskSpace and the Content Server. One application has variables, the other has attributes. One can parse scanned pages, the other reads a whole document. In order to put the whole solution together you have to be part developer, part UI designer, and part lucky. The reporting aspect of this platform is an afterthought and with BAM can bog the whole Java Method Server down to a halt.

Next generation of xCP
The next generation of xCP needs to address the following: 

  • Better coupling between requirements and functional specs, configuration, and validation of configurations. 
  • A smoother ride when developing/configuring the pieces of the solution puzzle in terms of common language of computing as well as nomenclature in manuals and tutorials. 
  • Build on open source platforms which are in common use, take a tip from Alfresco. 
  • Slowly eliminate the bottlenecks of configuration. For example, on a large project each product will have experts assigned to work on their one piece, yet they always seem to hurry up and wait for others in the config chain to finish or make changes.

Monday, March 28, 2011

Documentum's 2009 "Road Map": Optimism to Reality

Looking back on EMC Documentum’s product “Road Map” announcements and the hype revolving around them shows how marketing works and how it tries to scare or lure customers into upgrading and/or buying more products.

I read an article entitled EMC World 2009: Beyond D6.5, A Product Roadmap written by Pie in May, 2009. It was very informative with all the buzz words and I’m sure accurate for that moment in time. However, there were statements like “D7 not 2009, will be 2010”. This gives the impression that EMC is moving quickly on its major releases.

It is March, 2011 and we’re at D6.6. I’m not going to bet that D7 comes out this year. I remember reading Pie’s article and being excited to work with DFS and CMIS. I was also surprised at how many installations were still at 5.3sp2. I’m working on an upgrade right now from 5.3sp2 to 6.6. This is a large installation which requires lots of planning and coordination. A year ago, I looked at a 4.3 installation still going strong...

The owners of legacy DCTM systems wait until post-support ends and then leapfrog over many releases. Which technique is best suited for your company depends on many factors, but I’m willing to bet that they save a lot of money by waiting. There is less disruption of the user experience as well. The IT department suffers because the technology is antiquated by the time of the upgrade; three CIOs have probably swung through the company.

Taken with a grain of salt, these “road maps” usher in new excitement about technology X.0. These unveilings show us what we want to hear; they excite us, they allow us to dream of new interactions, new trends, new connections. I enjoy these “road maps”, but I would rather they were called “dreamscapes” instead. Like most companies that are strapped these days and cutting IT budgets, EMC needs to figure out ways not to oversell. They need to tell it like it is: a map with changing roads, where the distances will always be longer, the scale will morph, and their products will slip in and out of relevancy to the hype.

EMC Documentum's longevity has to do with the product's original vision, not the overselling antics of sales and marketing.

Friday, March 25, 2011

Kicking the Share Drive Habit

Business productivity in the world of “just get this thing done already” means using whatever MS Excel can offer to track the content out there and report on it. This process usually equates to lots of manual double checking, lots of verification, and lots of busy work. Lots of human intervention is not a bad thing, however at some point the “finding” and “versioning” become unwieldy. Some type of content management system must be purchased.

The purchasing, deployment, and education of the first system is the most crucial step in setting the stage for the future health of the organization’s information. It was not that long ago when IT Directors were saying that content management systems did not belong in the enterprise services stack. Some industry IT shops still harbor misguided and regrettably wrong impressions about the complexities of information and especially content and process automation.

So here’s a guide to moving content off the share drive and into a content management system.

One of the key concepts that is missed by most ECM vendors is that each bulk import/migration needs to be executed with a certain percentage of customization in order to get the best results. You could bulk import with an off the shelf tool, but don’t expect to get all of your content into the target content management system.

Remember: the cost of exceptions could add up to more than the cost of the off the shelf tool.

Analyze Your Content and Metadata

Use a file listing app (whether it’s off the shelf or home grown) to build inventory Excel/CSV files with the following criteria:

• Absolute file path for folder and file location
• File names
• File properties such as creation date
• Create drop-down lists based on the target content management system
  • User names
  • Fixed values like state or vendor name
  • Destination folder paths

Use or Create an Import/Migration Application

• Incorporate the work done in the analysis
• Design it based on the specific requirements of the import or migration at hand
• Pre-flight the Excel tracking files:
  • Does the file exist?
  • Are the date values valid?
  • Do user names exist?
  • Are illegal characters handled during the migration?

Make Validation Simple

Using the absolute path values of the source content, make sure the target has an attribute holding that same value. This will make it much easier not only to validate the results, but also to recover from failures where rerunning thousands of imports (for one failure) would be incredibly inefficient.
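As a sketch of this idea, if each imported object carries a source_path attribute (an assumed name), validation and failure recovery reduce to a set comparison:

```python
# Validate an import by comparing source paths against the source_path
# attribute stored on each imported object.
def find_failures(source_paths, imported_objects):
    """Return the source paths with no matching object in the target system."""
    imported = {obj["source_path"] for obj in imported_objects}
    return [p for p in source_paths if p not in imported]

sources = ["/share/a.doc", "/share/b.doc", "/share/c.doc"]
target = [
    {"object_name": "a.doc", "source_path": "/share/a.doc"},
    {"object_name": "c.doc", "source_path": "/share/c.doc"},
]
print(find_failures(sources, target))  # ['/share/b.doc']
```

Only the paths returned here need to be re-imported, instead of rerunning the entire batch for one failure.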