Friday, December 18, 2009

ECM Framework Equals Aligned Goals

All organizations need an ECM architectural framework in order to move in unison toward common goals. I know all about the phased, lumbering approach of ECM deployments in the enterprise and everyone’s intersecting plans of what to do about it, but I have been through this before a few times and I have to underscore some ideas and questions.

I’m going to use a content management software upgrade as an example shock to the ECM system. My focus is on the results of a lack of clear goals around ECM.

Why use an architecture framework?

Silos of ideas need to be reviewed and galvanized into a commonly understood direction and focus. We all have visions, but are they all in line, fully clued into each other? There are budget forecasts, collaboration strategies, content management strategies, roadmaps of what will be executed and when, top 10 projects, ECM strategic meetings, but is everyone looking at the same picture?

  • Budgets show how many bucks and in what quarter.
  • Strategies based on Gartner or McKinsey reports overwrite each other.
  • Roadmaps show what, who and when.
  • But where’s the why and how?
    What are the goals???


For example, at the functional ECM level, how will an upgrade be impacted by SharePoint? What is the anticipated impact of a usability report? Can we start executing the obvious changes now? Have the mission-critical, regulation-focused groups been frustrated with our services? Why aren’t we working with them to implement their submissions to the FDA, and shouldn’t that be our number one priority?

Due Diligence in a Vacuum
At the document management level, an upgrade should be done “in place” with limited scope, and repositories should eventually be consolidated to save money and centralize control. At the ECM level, however, the approach to the upgrade is orders of magnitude more complicated and interrelated with other strategic and governance factors.

Ideas and questions around the upgrade and restructuring at an architectural level

  • Will consolidation of repositories meet the imperative cultural goals of the company? What is the SLA threshold for upgrading a system? One week of downtime? Zero downtime? Plan for a year or execute in 2 months? Wouldn’t it make sense to isolate mission-critical content by business unit and their specific SLAs for FDA submission, audits, security, etc.?
  • How does the upgrade fit into a more tactical approach based on commonly understood ECM goals / strategies (framework)? If the long-term vision is to migrate away from LL, then let’s face it strategically. If we thought of ECM in terms of services like business process, records management, FDA submission, publishing, migration, security, library, retention etc., would this help us define our goals?
  • Should the upgrade methodically move non-regulated content to a parallel upgraded system to free up the rest of the company from the rules and regulations of a validated system? Should we assume that SharePoint will consume all of our projects eventually, and if so, are we being methodical about this?
  • Should the controlled repository be migrated to another upgraded repository, or should we separate repositories by service, for example, business process management/SOP (current controlled upgraded in place), file share sandbox and portal (new repository), etc.
  • Should each mission critical business unit have its own repository (within the framework of the ECM stack) to be unencumbered by the goals/governance of other mission critical content of other business units?
  • At one company I worked at, the content, structure, security, and business processes were consolidated around common business goals and anticipated use: finance, the portal, mission-critical business units, and remote sites each had their own repository, with workflows and migration tools for integration between them. If agility in change management is one of our goals, this distributed approach should be considered.
  • What are the primary goals, mission critical drivers of ECM at the Enterprise level, Business Unit, and Service levels?

The mere fact that there are so many questions around an upgrade is why you should absolutely consider a framework for change management/deployment of services.

Here are two among many: Zachman (the traditional framework, born at IBM) and TOGAF (complicated, but comprehensive).

Tuesday, December 8, 2009

eDiscovery Due Diligence Approach

Requirements, Requirements, Requirements

Define your legal hold requirements, not at a high level, but at a very specific level. Create a few scenarios. For example, the chain of custody of a copy of information vs. “frozen” information in context will present itself very differently in court. These requirements, like records management requirements, filter down to all information in the organization. It is worth the effort to understand the intricacies of each existing system and its integrations.

As-Is Systems

Determine what information is needed for the Legal Hold tool to function correctly. For example, assuming identity management is important to discovery, is IDM/AD in good shape? If it isn’t, when will it be? Does the Information Security group have enough resources to deal with this new software?

Inventory the existing information repository vendors to determine if they have eDiscovery add-ons which might be adapted or used outright. For example, Open Text has an eDiscovery module tailored to its Livelink software.

Integrations

Interaction of the proposed software solution with existing systems is very important. For example, how well does EMC’s solution adapt to Open Text repositories? A few more: Email holds? File system shares? Identity Management?

Tradeshows and Research Analysts

At tradeshows, the players with the deepest pockets are going to wow the audience with all of the bells and whistles. “Yeah, we can do that”, but it’s a customization… Even the demos are usually canned and not real. The reality is that it comes down to your specific requirements. Gartner’s quadrant could be based on a pure-play model, not an integrated one. I agree with Gartner that solutions are still in their awkward stage, which is all the more reason to develop specific requirements.


Search vs. eDiscovery focused

Autonomy is an excellent search tool; however, is it going to integrate well with our other systems, specifically around the access control aspects? This could dovetail nicely into an Enterprise Search tool effort…

Security Group Participation

The Information Security group must be an integral part of this whole approach. Without their participation and buy-in from the start, it will be an uphill battle. They will obviously work in tandem with Legal to perform eDiscovery activities. They need to be comfortable with driving their cruiser.

Professional Services after the purchase

I’m not sure of the overall percentage of services vs. software in Legal Hold software, but I’d say it is substantial. Weighing the specific requirements against the software’s out-of-the-box offerings will be worth the effort.

Monday, November 16, 2009

When An Interview Becomes Consulting

When you're interviewing with a potential employer and he asks a specific question about a current issue that his team is having with their content management system, how are you supposed to answer? If you answer vaguely he might think you don't really know what you're talking about, but if you answer completely he might cut the interview short and send you home in order to run down to his team and tell them how to fix the problem.

These days financial companies are the worst because they were used to living high off the hog for the past few years, spending big bucks on contractors to do their development work. Now they have a workforce of implementors, but no one experienced enough with development to take on the new custom work. So what do they do? They go ahead and try the development tasks and get into trouble along the way. In the meantime they are creating an environment in IT that is toxic to developers, so anyone with talent moves on and the manager is stuck in a never-ending interview cycle. But, hey, I have a great idea: let's get some really experienced developers in here, interview them, and get their advice on our issue! Brilliant! Short-sighted, just like this latest rise in the stock market.

So how do you avoid free consulting? Here are some tips:
  1. Confuse the hell out of them, laugh and walk out
  2. Say you want to meet with the developers themselves to solve this issue
  3. Start answering the question, but stop short of answering saying they have to hire you to find out
  4. Answer the question, but introduce other aspects of the issue and build in explanations of risk
  5. Ask if they've ever tested their disaster recovery system, fully
  6. If the manager is an ass, ignore his questions, talk around them until you get kicked out
  7. Ask how large their IT compliance team is and whether they are hiring
  8. Start saying "just kidding" after every other sentence
The bottom line is don't give sleaze balls free consulting. We're valuable and they know it. Now all we have to do is wait until they pay us the big bucks again in 2012.

Friday, October 16, 2009

Composer and ACHells

Hosed by Composer again! This time trying to install Permission Set from Dev to Test Repositories:

Environments:

  • Dev Repository name: LoserDev
  • Dev Repository Domain User: LoserDev
  • Test Repository name: LoserTest
  • Test Repository Domain User: LoserTest

  1. Created a Permission set in Dev, which set the ACL domain as LoserDev which corresponded to the LoserDev install parameter in Composer.
  2. Installed it into LoserDev and everything unit tested fine.
  3. Went to install the dar file to the LoserTest repository and got an error: “user LoserDev does not exist in Repository”.
  4. ** Don’t be tempted to create the user in the repository. This will allow the dar to install, but will really confuse the UI with ACLs that kind of work, but not really.
  5. Went back to the LoserDev Project and opened the LoserDev install parameter, typed in “dm_dbo” into the default value box, saved it, and created another dar.
  6. Went to install the dar file to LoserTest: same error. What the?
  7. Went back to the LoserDev Composer project, checked the LoserDev install parameter, and the default value was blank. Hmmm.
  8. Typed the default value of dm_dbo into the user parameter value again and hit the enter key. Aha! Saved it, created the dar, and the install to LoserTest worked.

Bottom line: In Composer, make sure you see an asterisk (*) in the tab, to guarantee that your work is getting saved to the underlying xml file which is used to create the dar file.
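
A quick way to double-check which domain a permission set actually landed under after a dar install is to query dm_acl in the target repository (the ACL’s owner_name is its domain). Here is a minimal DFC sketch of that check; the repository name, credentials, and the permission set name “my_permission_set” are placeholders, so treat it as an illustration rather than a drop-in utility.

import com.documentum.com.DfClientX;
import com.documentum.fc.client.*;
import com.documentum.fc.common.*;

public class CheckAclDomain {
    public static void main(String[] args) throws DfException {
        // Assumed repository and credentials -- replace with your own.
        String repo = "LoserTest";
        DfClientX clientx = new DfClientX();
        IDfSessionManager sMgr = clientx.getLocalClient().newSessionManager();
        IDfLoginInfo login = clientx.getLoginInfo();
        login.setUser("dmadmin");
        login.setPassword("password");
        sMgr.setIdentity(repo, login);

        IDfSession session = sMgr.getSession(repo);
        try {
            // owner_name on dm_acl is the "domain" that the install parameter resolves to.
            IDfQuery query = clientx.getQuery();
            query.setDQL("select object_name, owner_name from dm_acl where object_name = 'my_permission_set'");
            IDfCollection coll = query.execute(session, IDfQuery.DF_READ_QUERY);
            try {
                while (coll.next()) {
                    System.out.println(coll.getString("object_name") + " -> domain: " + coll.getString("owner_name"));
                }
            } finally {
                coll.close();
            }
        } finally {
            sMgr.release(session);
        }
    }
}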

Documentum Composer Wrestles with Lifecycles

Hosed by Composer again! Whoever thought Composer was ready for primetime with Lifecycle management was really in the clouds. Here’s what happened:

  • I created a Composer Project (call it Poser) and a new lifecycle (let’s call it DOA) along with many other artifacts.
  • I installed the Poser.
  • DOA had issues with ACLs and was not working correctly
  • I thought it would be better to create a delta dar for modifying just DOA
  • I created Composer Project 2 (Hoser) and imported DOA from the repository
  • I fixed DOA and installed Hoser.
  • Now Webtop showed 2 DOA lifecycles.
  • Naturally I denied that anything wrong was happening, chose the wrong DOA, tried it, and got frustrated.
  • I ran DQL, looked at ACLs, searched Powerlink, and downloaded Doc App Builder 5.3sp65.
  • So I went back to Hoser, imported the original DOA, and checked the “uninstaller” checkbox to uninstall both DOAs.
  • I installed Hoser again.
  • Now Webtop showed the original DOA still installed… What the? What got uninstalled?
  • At this point I installed Documentum App Builder, created a docapp, imported the lifecycles and uninstalled them. I fixed the DOA and had no other issues. This is still the true work horse!

Bottom line: once you create a lifecycle and actions, stick with that project, don’t create new projects using the original artifacts.

Monday, October 5, 2009

Content Architecture using Memetics

Virus of the Mind: The New Science of the Meme by Richard Brodie describes the foundation of memes as being distinctions, associations, and strategies. When applied to Enterprise Content Management there are some interesting comparisons. Advertisers know how to push our buttons to drive sales, just as content architects know the best ways to describe and find content. Or do they?


Distinctions

Example: It’s snowing, or our content is all on a share drive.

These are the ways to describe content that are particular to the business unit, the company, and the industry. This metadata is vital for the survival of the content; in other words, can a user find it among thousands or millions of other pieces of content? What key information can be drawn out of the content file or its context to direct a successful search result?

Associations

Example: Snow is dangerous when driving, or without metadata I get thousands of search results.

Relationships among content and its environment are key to understanding the thoughts (memes) behind the content. A taxonomy helps by categorizing a business unit’s way of thinking for its search or retention purposes. This taxonomy would have to fit into the enterprise as a whole. The issue here is to start at the level where the content is created and is useful to the local users, then expand the levels out in a way that doesn’t disturb the functional aspects of the original group. Too much of an imposition will get rejected or, worse, slowly ignored.

Rules and regulations come into play for controlling and focusing content for common delivery to people and interfaces outside of the company’s mindset.

Fuzzy vs. Absolute: Users want to be able to fill out metadata and find that exact content later. This means the content architecture has to balance the business unit’s requirements with the enterprise's.

Strategies

Example: If I have an all wheel drive I’ll make it through the snow, or with a taxonomy I can make sense of complex organizations.

Repetition: this is used to drill home the importance of certain ways of thinking (memes). For example, a naming convention will reinforce ways of thinking about content and its context (association meme).

Cognitive Dissonance: this is used to reward a User for taking the time to fill out metadata correctly. For example, filling out metadata and associations is rewarded with less change management and less hassle in the future when ways of organizing content change.


Content Silos

However you want to attack the issues of content silos, they will always exist. The strategy memes of the business unit will always differ in meaning and scope from the enterprise’s. I’ve come to the conclusion that the approach that makes importing and/or changing content easiest on the user is the hardest to figure out in terms of scale and performance on the system. This means finding the right balance when splitting up the system’s resources for each business unit, weighing the content demands, the ability to find content, the access control, and the application of the latest rules and regulations. This balance, when seen visually, will make sense, but the challenge is to get agreement from all the parties involved: the governance. This is where strategy memes make inroads: they help far-removed executives understand the long-term benefits when seen from the past, present, and future.

Wednesday, September 16, 2009

Why Test Scripts Suck

I’ll trade five IT testing professionals for one motivated business user when it comes to making sure an application works as designed. Why? Because that business user has all of her day-to-day requirements, pain points, and frustrations invested in the new application. On the other hand, the testing professional has to make sure the test script is executed flawlessly, that’s it, on to the next project.

Below is the typical best-practice software development mantra that project managers will promote. I’ve added some notes under each item and a few new mantras.

The software implements the required functions.

Have the requirements been allowed to change during the project? Flexibility is key to the perceived success of any project. If concessions cannot be made without huge push back, and it’s a pain to change requirements from the business’s perspective, the project should be stopped and re-evaluated for its purpose (and management). This is very pronounced with large projects.

Added to normal Project Manager’s software development lifecycle list:

Prototyping

Prototyping functionality for business users to experience (see) what’s been talked about and promised. If a third party or internal team cannot commit to show their application during development for fear of “giving a bad impression” or “scaring” the business then there are trust and lack of communication issues going on that need to be dealt with now rather than during the full User Acceptance Testing.

The software has passed unit testing.

Make sure the developers know the architecture of the application as a whole, its requirements and the importance of the unit testing and integration into the larger application or service. If one developer is slacking the project is at risk of failure. Project managers should have a good idea of who can perform and who needs help at this point in the project timeline.

Added to normal Project Manager’s software development lifecycle list:

Code Reviews

A junior developer cannot possibly know all of the ins and outs of the application if they are focused on coding specific components or services. All code, at least initially, should be reviewed by senior developers and architects to assure efficiency and scalability.

The software source code has been checked into the repository.

This can be a pain in the ass if the project is small; however, it is necessary if you are developing with others and integrating into a larger repository. This is also a good checkpoint for senior developers to do quick code reviews.

The software has been compiled into the current build (for compiled systems) and deployed into the appropriate test environment.

Without proper safeguards, one developer’s code could break a whole series of other test scripts. Smoke testing is highly advised before fully committing the code.

The team has developed appropriate test scripts that exercise the software's capabilities.

These scripts are usually end-to-end tests run by a few clueless testers, not by irrational business users who change their minds, cancel, go to lunch, upload their whole hard drive, etc.


The software has passed integration and system testing, and been deployed into the user acceptance test environment.

Many large projects are desperate for true testing environments but usually skimp on resources for them. This poses an issue when the new build is supposed to be deployed and fully functional in a test environment that has kludges.

Added to normal Project Manager’s software development lifecycle list:

Performance and Scalability Testing

Third-party developers will often comment during the development phase that they wonder whether this will scale or how it will perform under load. These developers talk to the project manager and usually the discussion ends there. If this is brought up at this point in the project with no time allotted to it, then forget about it. Also, why would the third party be motivated to do this when this is a typical reason they get called back in to do more business?

The users have had an opportunity to use and respond to the software, and their change requests have been acknowledged and implemented where appropriate.

Again, this is important, but there should be no surprises at this time. The users should have had their requirement changes prototyped and tested by this point.

The software has been documented in accordance with whatever standards your project follows.

Have you ever seen a test script written for documentation accuracy?

If documentation is not ongoing during the whole project this document will be worthless. I have not worked on a project where the design document is perfect after being signed off on. During development and fixing, the design doc needs to be corrected, changed, or expanded on.

Also, the deployment and knowledge transfer documentation should be complete and tested.

Sunday, September 13, 2009

What the F’ Happened to the Customer’s Vision?

It seems every new technology or architecture or new way of looking at the complexities of content is like building a new platform on quicksand. It eventually sinks below the surface and then a new “genius” comes along with a solution that gets sold to our “shock and awe” addicted users.

The customer used to always be right, now they are sold what’s “right”. What is sold to the customer is pretty and “easy-to-use” technology which is over their heads. They become reliant on experts to build the solution and to come up with language that makes the Manager/Director look good to his superiors.

Once ECM is in place, the users look at it and inevitably want their old system back. After a while they become more comfortable with the new ways of doing things. Then they want continuous improvement. By the time this happens a new version is released, and new bugs cause the experts to come in and fix them. The continuous improvement requirements get scaled down by technology issues of performance and content growth. The IT department thinks they own the system. The Business Units get frustrated with IT. Yada, yada, yada.

At this point “shadow IT” starts its cycle again. In the late 90’s it was websites popping up everywhere as intranets via easy-to-use, inexpensive website publishing tools. Now it’s SharePoint portals. These portals are what the customers want. They want messy rooms (unstructured content spaces) where they can play with content and ideas, not technology. Metadata, security, taxonomy, workflow, lifecycles, retention, etc. need to be worked into these “messy rooms”, periodically cleaning them up, organizing the useful content and throwing away the building blocks. EMC’s CenterStage, like SharePoint, is trying to fulfill the need for users to produce and edit (collaborate on) content while the system handles structuring and storing behind the scenes.

This introduces the big gray area of ECM: the void between structured and unstructured content. Let’s say an invoice is structured content because it originated from a database and has a number. The problem is that this invoice was printed to paper, signed, scanned, and placed back into a content repository. The number is still there, but the systems are different. Even though the systems are interoperable, there is no source of record anymore. Which is more important: the financial aspect of the content or the actual scanned proof of purchase? It depends on who you ask.

Thursday, June 4, 2009

Solution Pattern for OOTB Webtop and Search

Scenario

  • The client requires a multi-tiered custom object model with most of the attributes at the child level.
  • There are 10 distinct types of content which share some common attributes, but have very specific attributes as well.
  • The client wants to search for attribute/value pairs across all of the children documents easily by selecting one parent object, typing in search criteria, and executing.
  • All imports, checkins, new templates, and properties interfaces need to be tailored to the client’s specific requirements for conditional attribute population in specific order.
  • All content types have some mutually exclusive required attributes that cannot be null.
  • And of course, the client does not want a lot of Webtop customization.

Possible Solution Routes
  • Traditional: Customize Webtop import, checkin, new doc, properties, and search pages.
  • New: TBO with common attributes pushed to the parent object type, with limited WDK customization.

Design Road less Traveled By

We isolated the absolutely necessary WDK components that needed to be developed to satisfy display and functional requirements. There is some functional requirements work here to decide which attributes are common and have the most impact on finding content in the repository. Searching across all child object types was a critical requirement, so we focused the design of the object model on shared or common attributes. The design of the object model looked like this:

Parent (common_attr 1, common_attr2)
Child 1(common_attr 1) --- Child2(common_attr 1, common_attr2) --- Childx(common_attrx)…
Every time a document is saved, the common attributes of that child doc type are replicated to the common attributes of its parent. Besides display customization of search results and grid columns, there is no need to customize the query builder of the search page. The query also performs better: because the attributes are at the parent level, there are no database table unions happening behind the scenes during the execution of the search.

Developing the TBO

The main purpose of the TBO is to override the Save and Checkin methods that are triggered during the use of Webtop. This TBO basically gets the common attributes of the content being saved or checked in and sets their values to the corresponding parent attributes. This TBO is deployed to override the parent object type.
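
To make the replication concrete, here is a rough sketch of the copy logic in DFC. The exact TBO override signatures vary by DFC version, so this is written as a plain helper that the TBO’s save and checkin overrides would call before delegating to the superclass; the attribute names (invoice_number, common_attr1, and so on) are hypothetical.

import com.documentum.fc.client.IDfSysObject;
import com.documentum.fc.common.DfException;
import java.util.LinkedHashMap;
import java.util.Map;

public class CommonAttrReplicator {

    // Hypothetical mapping: child-specific attribute -> inherited parent attribute.
    private static final Map<String, String> CHILD_TO_PARENT = new LinkedHashMap<String, String>();
    static {
        CHILD_TO_PARENT.put("invoice_number", "common_attr1");
        CHILD_TO_PARENT.put("department_code", "common_attr2");
    }

    // Call this from the TBO's save/checkin overrides before delegating to the superclass.
    public static void replicate(IDfSysObject obj) throws DfException {
        for (Map.Entry<String, String> entry : CHILD_TO_PARENT.entrySet()) {
            String childAttr = entry.getKey();
            String parentAttr = entry.getValue();
            // Only copy attributes the object actually defines.
            if (obj.hasAttr(childAttr) && obj.hasAttr(parentAttr)) {
                obj.setString(parentAttr, obj.getString(childAttr));
            }
        }
    }
}

Because every child type inherits the parent’s common attributes, the copy happens on the same object being saved, so no extra repository round trips are needed and the search page can query the single parent type.
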
Developing the Webtop Components

The customizations are to the object and doc lists. The “documentum.webtop.webcomponent.objectlist” and “doclist” components have the common attributes added to them in order to show the same view of the attribute columns for browsing folder contents and search results. This could have been a preference setting; however, there was a bug with column sorting in that particular version, so we had to customize these. The “search60” component was also changed to show only the parent object type and its children object types in the search.

Monday, May 4, 2009

Max Session: Obscure Documentum server.ini key saves the day!

Environment
Windows D6.5 installation using Webtop, some WDK customizations, and a TBO for major customizations. The TBO has to create a superuser session to do some work.

Issue
Max sessions are reached on the application server after a few hours of use even though the “Active” sessions are far below the configured threshold. In other words, the Tomcat server is counting both “Active” and “Inactive” sessions in its determination of “max” sessions.

Keep in mind that there is a lot of customization in the TBO and this required creating and releasing a session manager for a superuser during each save of the document. This is what is building up the "Inactive" sessions. 
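
For what it’s worth, the usual way to keep these superuser sessions from piling up is to scope them tightly and release them in a finally block. A minimal sketch, assuming a dedicated service account (the name svc_tbo_super is made up) rather than the install owner:

import com.documentum.com.DfClientX;
import com.documentum.fc.client.*;
import com.documentum.fc.common.*;

public class SuperUserWork {
    public static void doPrivilegedWork(String repository) throws DfException {
        DfClientX clientx = new DfClientX();
        IDfSessionManager sMgr = clientx.getLocalClient().newSessionManager();

        // Hypothetical dedicated superuser account (not the install owner).
        IDfLoginInfo login = clientx.getLoginInfo();
        login.setUser("svc_tbo_super");
        login.setPassword("secret");
        sMgr.setIdentity(repository, login);

        IDfSession superSession = sMgr.getSession(repository);
        try {
            // ... privileged work here, e.g. fixing ACLs on the saved object ...
        } finally {
            // Always release; holding sessions here is what builds up the
            // "Inactive" sessions that linger until history_cutoff expires them.
            sMgr.release(superSession);
        }
    }
}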

Patch Solution Given
The initial solution was to jack the max sessions up to 100,000. This caused the Tomcat service to die once a week or so, basically maxing out the memory allocation to the process.

Max Sessions Investigation
I opened the Documentum Content Server Administration Guide and searched for “max” or “session”.

I listed the variables involved in the sessions being maxed out:

  • Application versions
  • Property files
  • Custom code


I looked into implicit and explicit sessions

Hint: For testing superuser in TBOs, I used a different superuser than the install owner account.


Application Server

  • Measured by Active and Inactive Sessions
  • dfc.properties Key/Value Pair: dfc.session.max_count = 1000 (default)
  • DQL: execute show_sessions
  • DA: Administration > User Management > Sessions (All)
  • web.xml: HTTP session timeout is set in the \app server\ web.xml (default is 30 min)
  • Hint: To find leaks set dfc.diagnostics.resources.enable = true (default is false)


Content Server

  • Measured by Active Sessions
  • server.ini Key/Value Pair
  • concurrent_sessions = 100 (default is 100, max is 1024). These sessions are “Active” sessions from the content server’s perspective
  • history_sessions = (how many timed out sessions show in list_sessions)
  • history_cutoff = (default is 240 minutes)
  • client_session_timeout: default is 5 min
  • check_user_interval: frequency in seconds which the CS checks the login status for changes.
  • Default is 0, meaning it checks only at connection time.
  • login_ticket_timeout: length of time a ticket is valid, default is 5min
  • DQL: execute list_sessions

Final Solution

I added “history_cutoff = 5” in the Content Server’s server.ini file. The “history_cutoff” key controls the longevity of the inactive sessions. The default value of this key is 240 minutes (4 hrs). This would explain why the max session limit was only occasionally hit.

My testing has shown that if you set the “history_cutoff” key to a much smaller value, like 5 to 30 minutes, the inactive sessions clear reasonably soon and do not fill the max sessions of the Tomcat server.

To test this I set the following:

Set up WDK Automated Test Framework to run the same tasks over and over again to build up Active and Inactive sessions.

Set up the Content and Tomcat servers with these base line settings:
server.ini file: concurrent_sessions = 20
dfc.properties in the webtop/WEB-INF/classes: dfc.session.max_count = 30

Result: The Tomcat server fails when the total of Active and Inactive sessions exceeds 30.

Then set up the Content and Tomcat servers with these settings:
Settings with history_cutoff changed
server.ini file: history_cutoff = 5 and concurrent_sessions = 20
dfc.properties in the webtop/WEB-INF/classes: dfc.session.max_count = 30

Result: The Tomcat server fails only if the number of active sessions exceeds 20, thus relieving it of the inactive session burden.

Saturday, April 25, 2009

Creating and Deploying Templates Using API and DQL Scripts

Recently I thought I was finished working on a project that had some templates in the design and deployment. We were “done,” which meant the budget was depleted and the customer wanted us gone; no more billable time. I’m not sure what happened to “the customer is always right,” but that statement and sentiment are coming back into popularity. In this economy it makes more sense to bend over backwards to please a client than to bicker over getting paid for our own mistakes.

The issue with the templates eluded my developer and me. My developer had created an API script (see below) to load in the templates. The script created the doc objects, set the content, and set the i_folder_id of the doc to the “/Templates” cabinet object id. The templates were “linked” into the cabinet and seemed to function as desired.

However, as the in-house developer at the client site found out later, the templates were not truly linked to the “/Templates” cabinet. The in-house developer had the advantage of sorting this out over a much longer period of time than we as consultants had. That being said, I should have figured this out, but I was confident that my developer’s script was correct, plus a dump of the template object looked okay.

Here’s the one attribute of the template object that we missed: “i_reference_cnt”. It was “0” instead of “1”. The “i_folder_id” was correct, but the “i_reference_cnt” was not set correctly. The script was setting the i_folder_id value when it didn’t have to. The object gets linked to the home cabinet of the session account by default. A follow-up DQL can be run to move the object from the home cabinet to the ‘/Templates’ folder.
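
For comparison, here is a rough DFC equivalent of the corrected deployment, assuming a session is already available; it uses link() instead of touching i_folder_id directly, which keeps i_reference_cnt in step. The names and paths mirror the API script that follows and are illustrative.

import com.documentum.fc.client.IDfSession;
import com.documentum.fc.client.IDfSysObject;
import com.documentum.fc.common.DfException;

public class TemplateLoader {
    // Assumes the caller already holds a session; names and paths are illustrative.
    public static void createTemplate(IDfSession session) throws DfException {
        IDfSysObject doc = (IDfSysObject) session.newObject("dm_document");
        doc.setObjectName("Test template");
        doc.setTitle("Test Template");
        doc.setString("owner_name", "dmadmin");
        doc.setBoolean("a_is_template", true);
        doc.setContentType("msw8");
        doc.setFile("C:\\temp\\test.doc");

        // link() maintains i_folder_id AND i_reference_cnt together,
        // unlike setting i_folder_id directly as the original script did.
        doc.link("/Templates");
        doc.save();
    }
}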

There’s a support note on Powerlink which describes how to create custom templates using DQL and copying them using DA. You can also try the following API and DQL script, which was modified to work correctly for deploying template files.

API Script

create,c,dm_document

set,c,l,object_name
Test template

set,c,l,owner_name
dmadmin

set,c,l,a_is_template
true

set,c,l,title
Test Template

setfile,c,l,C:\temp\test.doc,msw8

save,c,l

DQL Script

update dm_document objects move to '/Templates' where object_name = 'Test template'

go

Tuesday, March 17, 2009

Documentation: What and When

In software development we test everything but the project’s documentation. I can’t tell you how many times I’ve had to scour a project’s documentation for information that should be organized for quick reference and be up-to-date with the latest configuration and customizations. Instead the documentation is usually missing some crucial bit of detail that forces me to search for answers and waste time, mine and the client’s.

So, to get back to putting more emphasis on verifying or using documentation: how do we do this, besides test scripts and installation docs, or design and requirements docs?

One approach is to log more diligently all of the issues that happen during the development and deployment of the project. These logs have vital setup information and deployment hurdles that will never get documented formally. The testers and developers are the keepers of this knowledge and need to document it as they work through the problems they encounter.

The problems lead to the most important aspects of the project’s success. The problem’s solutions will suffice for the time being, but they will strike again in a similar fashion, in a pattern. These patterns are what need to be understood.

For example, you deployed a workflow with an auto activity that timed out during the QA testing. The timeout setting was increased, but no one documented it. When the workflow gets deployed to Production the same thing happens, but users see the workflow has paused and are now concerned and annoyed. The first thing you do is read through the documentation which has no reference to timeout changes. Then you look at logs and see that a method has failed with no reason why. The workflow supervisor’s inbox is filled with errors but you don’t know that because no one documented how to occasionally check that user’s inbox. No one even considered a fast system with a few workflows timing out.

I think the point here is that documenting is not only writing about the design of the system, its configuration, and customizations, but detailing the pitfalls and hurdles of the process as well. There could be two sets of documents: one for the client and one for your sanity when things go wrong, which they will; it’s just a matter of time. Next time you'll be more prepared, with a cheat sheet and quick references to previous issues and complex configurations and deployments.

Monday, March 2, 2009

Documentum Maintenance/Procedure Checklist

After the initial Documentum installation and rollout of the first phase, it is essential to
follow a maintenance/procedure checklist to assure maximum system performance and stability.

Documentum Administrator
Many of the maintenance procedures and jobs are configured or accessed through Documentum
Administrator (DA):
  • Server and Repository configurations
  • LDAP configuration
  • Users, Groups, Roles
  • Security (ACLs)
  • Storage (Locations, Storage, and Filestores)
  • Index Agent’s failed index list should be understood and resubmitted if necessary
Maintenance

Logs to Monitor
It is highly recommended to check all logs periodically for errors and warnings.

Application Server
Name: stdout_yyyymmdd.log (example: stdout_20090218.log)
Location: \Program Files\Apache Software Foundation\Tomcat 6.0\logs
Purpose: shows warnings and errors from Webtop and TBOs.

Content Server Repository Log
  • Name: DocbaseName.log
  • Location: C:\Documentum\dba\log
  • Purpose: Shows the repository startup output and any warnings or errors.
Java Method Server Log
  • Name: access.log and DctmServer_MethodServer_DocbaseName.log
  • Location: C:\Documentum\bea9.2\domains\DctmDomain\servers\DctmServer_MethodServer\logs
  • Purpose: tracks access and status of the Java Method Server
Index Server Log
  • Name: access.log and DctmServer_IndexAgent.log
  • Location: C:\Documentum\bea9.2\domains\DctmDomain\servers\DctmServer_IndexAgent\logs
  • Purpose: tracks access and status of index agent
Disk Space Management

The Content Server has a state of the docbase job (dm_StateOfDocbase) which monitors
this. The data drive should also be monitored.
  • The SQL Server transaction log should be monitored
  • The Webtop cache files should be monitored
  • The Index data drive should be monitored

Database Maintenance and Logs
  • Disk space should be monitored
  • Transaction logs should be monitored
  • CPU and RAM usage patterns
Jobs

Some of the jobs below are not active OOTB. They have to be set to active and started on a schedule. Be sure to set the run times so that they do not conflict with other jobs and backup
schedules.

dm_ContentWarning
  • Purpose: Warnings for low availability on DM content/fulltext disk devices
  • Method args: -window_interval 720, -queueperson, -percent_full 85
  • Note that the "-percent_full" value is "85", which you may want to lower for more lead time to deal with disk space.

dm_DMClean
  • Purpose: Executes dmclean on a schedule
  • Method args: -queueperson, -clean_content TRUE, -clean_note TRUE, -clean_acl TRUE,
    -clean_wf_template TRUE, -clean_now TRUE, -clean_castore FALSE, -clean_aborted_wf FALSE, -window_interval 1440

dm_LogPurge
  • Purpose: Removes outdated server/session and job/method logs
  • Method args: -queueperson, -cutoff_days 30, -window_interval 1441
  • Note the "cutoff_days" parameter should be set to a reasonable number of days, balancing compliance and troubleshooting issues.
dm_StateOfDocbase
  • Purpose: Lists docbase configuration and status information
  • Shows: Number of docs and Total size of content, among many other stats.
dm_AuditMgt
  • Purpose: Removes old audit trail entries. A key parameter is the cutoff in days, basically how many days’ worth of audits to keep.
  • Method args: -queueperson, -custom_predicate r_gen_source=1, -window_interval 1440,
    -cutoff_days 1
  • Note the "cutoff_days" parameter should be set to a reasonable number of days, balancing compliance and troubleshooting issues.


dm_QueueMgt

  • Purpose: Deletes dequeued items from dm_queue
  • args -queueperson, -cutoff_days 90, -custom_predicate, -window_interval 1440

dm_UpdateStats

  • Purpose: Updates RDBMS statistics and reorgs tables (if RDBMS supports)
  • args: -window_interval 120, -queueperson, -dbreindex READ, -server_name SQL2\SQL2005

dm_ConsistencyChecker

  • Purpose: Checks the consistency and integrity of objects in the docbase

dm_DataDictionaryPublisher

  • Purpose: Publishes data dictionary information

dm_LDAPSynchronization

  • Purpose: One-way synchronization of LDAP users and groups to Docbase
  • Method args: -window_interval 1440, -queueperson, -create_default_cabinet true, -full_sync
    false

dm_FTStateOfIndex

  • Purpose: State of Index

dm_FTIndexAgentBoot

  • Purpose: Boots Index Agents
  • Method args: -window_interval 12000, -queueperson dmadmin, -batchsize 1000,
    -writetodb_threshold 1000000, -serverbase F, -usefilter F, -dumpfailedid F,
    -matchsysobjversion F, -matchallversion F


dm_GwmTask_Alert

  • Purpose: Sends email alert if task duration is exceeded

dm_GwmClean

  • Purpose: Cleans all the orphan decision objects

DQLs to run to check on audit trails and dmi_queue_items

The following statements are some of the DQLs that EMC support had us run to determine the
number of audit trails and queue items that were in the repository:


Select count(*) from dmi_queue_item

Select count(*) from dm_audittrail

Backup Procedures

Ideally, the Content Server should be shutdown prior to running the back up of the SQL Server database and started back up afterward. This will reduce any likelihood of the repository becoming out of synch with the database and the content files.

OS and Software Upgrades/Patches

Before applying any patches or upgrades to any of the Documentum suite and supporting applications, be sure to check for compatibility. Apply any patches or upgrades to the dev and QA environments and test them first.

Network Connectivity Interruption

If any network interruption occurs, then service logs should be checked for compromised activity. The Content Server and Tomcat server may need to be restarted. The logs of the application and content servers should be periodically monitored for errors and warnings.


Performance


RAM and CPU Utilization Maxed Out

If RAM is filled or CPU utilization is maxed out, then the service responsible should be checked. If the service is a Documentum service, it should be restarted and the root cause determined. Utilization should be monitored, and any anticipated spikes in use or
additional services need to be load tested and analyzed. What should you do if Tomcat performance slows? If concurrent users reach EMC’s limit of 20, EMC recommends adding a second Tomcat server.


Further Java Memory Allocation settings to consider.

EMC Support gives the basic JVM settings to cover for common exceptions and crashes. There
are a number of other settings to add as more traffic occurs on the Tomcat server. From the
DCM Installation Guide: “To achieve better performance, add these parameters to the application server startup command line:

  • -server -XX:+UseParallelOldGC

Document caching can consume at least 80MB of memory. User session caching can consume approximately 2.5 MB to 3 MB per user. Fifty connected users can consume over 200 MB of VM memory on the application server. Increase the values to meet the demands of the expected user load.”

Monitor Sessions

DA

  • Location: Under Administration > User Management > Sessions
  • Or DQL: execute show_sessions (to show all active and inactive sessions)


DQL

  • execute list_sessions (to show active sessions)

Via docbasic ebs script

  • Purpose: run this script at a command line prompt to output how many active and inactive sessions are current on the content server. Set the interval between outputs and how many loops to run.
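
If docbasic isn’t handy, the same check can be scripted with DFC. A rough sketch, assuming an existing session; the loop count and polling interval are whatever you pass in:

import com.documentum.com.DfClientX;
import com.documentum.fc.client.*;
import com.documentum.fc.common.DfException;

public class SessionMonitor {
    public static void monitor(IDfSession session, int loops, long intervalMillis)
            throws DfException, InterruptedException {
        for (int i = 0; i < loops; i++) {
            int all = countRows(session, "execute show_sessions");     // active + inactive
            int active = countRows(session, "execute list_sessions");  // active only
            System.out.println("all=" + all + " active=" + active + " inactive=" + (all - active));
            Thread.sleep(intervalMillis);
        }
    }

    private static int countRows(IDfSession session, String dql) throws DfException {
        IDfQuery query = new DfClientX().getQuery();
        query.setDQL(dql);
        IDfCollection coll = query.execute(session, IDfQuery.DF_EXEC_QUERY);
        int count = 0;
        try {
            while (coll.next()) {
                count++;
            }
        } finally {
            coll.close();
        }
        return count;
    }
}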


Troubleshooting Max Sessions error

Before restarting Tomcat:
Try logging into the content server from docweb using the Doc App Builder
application. If you can, then this isolates the max session error to the Tomcat/Webtop
server.

  • Using DA, look at how many “active” user sessions are currently in the repository,
    and how many “inactive” sessions.
  • Try reducing the session timeout value in the web.xml on the Tomcat server to see if
    the inactive sessions get cleared out faster.

Security and Server Access Maintenance

  • Test users and test content should be deleted out of Production
  • The database schema owner account should be locked down
  • The Documentum install owner, “dmadmin”, should be locked down
  • Only scheduled, authorized access to Production should be allowed for all
    servers of the system.
  • Repository audit trails should be configured for certain events, such as deletion of
    content.

Long Term High Availability and Scaling Recommendations

  • As more users access the system, it may become necessary to create a second Tomcat
    (clustered) instance to ease the load on just one application server.
  • As more content gets added to the system, more disk space will need to be added to
    the filestore drive.
  • Set up failover services for all key components
  • Add more Java Method Servers if lifecycle processing overwhelms the existing one.
  • A comprehensive content archiving plan will need to be designed and implemented.
  • Setup a disaster recovery site if the system’s service level agreement (SLA) is
    sooner than a new system could be built with backups.

Sunday, February 22, 2009

Applying Malcolm Gladwell to a Documentum Project

In Outliers, Malcolm Gladwell’s stories of successful people show a few concepts that can be applied to Documentum projects. These concepts explain how some individuals succeed and some fail. A successful project can point to these concepts as part of a successful pattern.

“Concerted cultivation”
Taking an honest interest in what the business customer really needs out of a Documentum content management system (beyond your personal business/monetary objectives) is what this is about. Fostering the open disclosure of issues the business would like to solve, in an honest and transparent manner, will go miles toward making the project successful. If the issue is a lack of accountability in tracking their processes, then they need workflow now, not just Webtop and a “we'll see in the future if they need workflow” attitude. If the issue is that they want to convert from paper and be more productive, then they need scanning and basic classification, not just Author Integration Services.

Cultivate what the client will need to learn in order to appreciate the solution; do not hold back information in hopes of establishing an ongoing support contract. If you don’t give enough knowledge transfer, the client will resent the issues it encounters and might blame you for not training them enough. If you don't document all of the potential issues that might come up, then the customer could point their finger at you and say "fix this for free".

Being the right person present at the right time
Chances are it wasn’t that you or your company won a Documentum project because of your raw talent; it was more likely that you had the right architect give the pre-sales demo, or a contact at the company whom you knew from a previous engagement, or some other connection to the company that won the work. Chances are even better that you happened to have previous experience that fit well with what the company was looking for.

This brings up one of my issues with pre-sales. Most of us get paid when we’re working on a project; if we’re not, we’re on the bench slaving over a statement of work or a response to an RFP. Why don’t consulting companies realize that they need a research and development group just as much as a software company does? It doesn’t have to be for product development, but solution development. Researching issues and developing solutions is what we do, but most of us solve issues on the project, write documentation, and then we’re on to the next project. We don’t have what we need, which is an occasional sabbatical to put the solution together and be able to truly work through the requirements, content, functionality, testing, implementation, architecture, and most importantly the lessons learned. Once I was asked what types of content were part of a solution that I had only helped with technically, and I couldn't answer for sure what types of content there were. This was because my time on the project was specifically allocated to deploy it to production, not to fully understand what the content was. This was sobering. I'm in this business to help solve content management issues and I was so caught up in the technology that I didn't even know what content was being managed...

“Mitigated speech”
It’s no secret that developers and architects from India are well trained and very talented in the computer science fields. What has taken me a while to understand is how a developer will communicate with an architect or project manager. He might ask a question like, “do you think the system is a little slow?” You might respond, “It’s not bad considering that it’s a QA environment.” The developer might be downplaying what is being said; that is, he might really be saying, “do some load testing, you idiot, or this solution might crash production when it’s released”. This is how “mitigated speech” works, and it needs to be dealt with at the beginning of the project. Make sure you’ve created a few avenues for communication that wring out concerns clearly and effectively.

“It is those who are successful…who are most likely to be given the kinds of special opportunities that lead to further success.”
Obviously, this applies to all project outcomes; however, you may think you’ve succeeded when in fact the business feels coerced into paying you the final amount due. During the final stage of a project, which includes stabilization and knowledge transfer, there may be times when giving a few extra days for free leaves a nice feeling with the customer. If the stabilization of the project is bumpy, like most are, beware of saying “this is not in the budget, we won’t do it without getting paid.” This kind of approach, during a time when the customer is stressed out and any problems are easily blamed on the solution provider, will not result in the kind of return business that would be achieved if you went the extra mile for free. Letting the customer push you around a little at the end may help extend the support contract until a further project comes along. These days a customer needs to feel like they’re getting a deal on your services. Make them feel that way…

The success of a project is not because of the one person or the architect or the design, it is because of the overall efforts of everyone involved, the timing, the company culture, the governance, mutual trust, and confidence in each other. One project’s success could catapult your whole company into phenomenal success, but it was a combination of experience, who you know, how much you’ve practiced, and whether you were in the right place at the right time. And luck.

Sunday, January 18, 2009

Users, Groups, and Roles with ACLs and Presets

At some point in the evolution of any repository’s design and implementation, the issues of managing users, groups and roles, and of reducing object type clutter rise to the top of the priority list. Also, finding a common way in Webtop of displaying attributes for documents listed in folders and search results becomes vital to maintaining a consistent user experience and getting a grip on endless customization costs.

You can talk to the business about users, groups, and roles, but the ramifications of your design will not click with them until they see Webtop in action and what presets and ACLs are actually doing to their user experience. For example, a user has the ability to import content and as the owner of the content they can promote it even if they have Read access only: this is hard to understand unless it was shown during a demonstration.

Here are some tips on how to design and implement users, groups, and roles with ACLs and presets:

Manage User Group Membership from LDAP or Active Directory Groups
I strongly recommend managing group membership using LDAP integration; otherwise you’ll have to run a script to add members to groups the first time (see the sketch below) and then manage group membership one by one with Documentum’s tool instead of a more robust user directory management tool.
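
For the one-time seeding, the DFC calls are simple enough. A minimal sketch, assuming you already have a superuser session; the group and user names come from whatever list you feed it:

import com.documentum.fc.client.IDfGroup;
import com.documentum.fc.client.IDfSession;
import com.documentum.fc.common.DfException;

public class GroupSeeder {
    // Adds each user to the named repository group; names are illustrative.
    public static void addMembers(IDfSession session, String groupName, String[] userNames)
            throws DfException {
        IDfGroup group = (IDfGroup) session.getObjectByQualification(
                "dm_group where group_name = '" + groupName + "'");
        if (group == null) {
            throw new IllegalArgumentException("Group not found: " + groupName);
        }
        for (String user : userNames) {
            group.addUser(user);  // adds the user to the group's member list
        }
        group.save();
    }
}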

Either use or exclude the dm_world default group from accessing your content
For tighter control of security, create a custom group that is your company’s base group from which to build. This base group should have a consistent level of minimal access to all content in the repository, such as Read access.

Using Presets to reduce Action in Webtop
Now that you have a base custom group, you can exclude from that group any actions, such as “Create New Documents” or “Import” from Webtop. Review the manual on what the basic client capabilities of Consumer, Contributor and Coordinator are before trying to add these to Presets.

Create Test Users that have Inline Passwords, as well as Users from LDAP and Local Domain Users
Sometimes LDAP will be “slow” or will fail over (I know user directories never fail, but they do, and you need to be able to access the repository when and if that ever happens). Chances are that your install owner account (“dmadmin”) will be a local account (I hope). So if the domain or LDAP or the network is down, at least you’ll be able to access the repository with inline or local accounts.

dmadmin: superuser best practice
- Add the docu group with Delete permission and full privileges to all ACLs (see the sketch after this list)
- Don’t include docu with the custom world group
- Don’t use dmadmin to test
- Don’t have dmadmin in any other groups, especially ones controlled by presets because the link to Webtop administration of presets may vanish and you’ll have to create or move a new user into the preset group.
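
Here is a rough sketch of what granting an admin group Delete on every ACL could look like in DFC, assuming an existing superuser session; the group name is whatever your “docu” admin group is called, and in practice you would probably scope the DQL to your custom ACLs rather than every dm_acl in the repository.

import com.documentum.com.DfClientX;
import com.documentum.fc.client.*;
import com.documentum.fc.common.DfException;

public class AclGrantSweep {
    // Grants DELETE to the admin group on every ACL returned by the query.
    public static void grantAdminGroup(IDfSession session, String adminGroup) throws DfException {
        IDfQuery query = new DfClientX().getQuery();
        query.setDQL("select r_object_id from dm_acl");
        IDfCollection coll = query.execute(session, IDfQuery.DF_READ_QUERY);
        try {
            while (coll.next()) {
                IDfACL acl = (IDfACL) session.getObject(coll.getId("r_object_id"));
                acl.grant(adminGroup, IDfACL.DF_PERMIT_DELETE, null); // null = no extended permits
                acl.save();
            }
        } finally {
            coll.close();
        }
    }
}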

Create contributor Roles that are responsible for importing and creating new content for each group of content types
For a financial company, this means creating a group for Tax, Treasury, Marketing, Operations, Accounting, and so on. Each group is responsible for a certain group of content; for example, Tax imports tax returns, withholding docs, etc., and Treasury has capital calls and distributions.

Create Presets for each Role that include the content types they import only
This controls the object type dropdowns when importing, checking in, and creating new content from templates. Be sure to include a parent type that will not be used, but will be the first in the dropdown, so that custom attributes will be refreshed when the custom type is selected from the dropdown.

Scope Search Object Type Dropdown by Object Type Parent
Users of Webtop usually need to limit the number of object types to search for in advanced search. There are only so many types that users need to look at for searching. Less is more in the case of the advanced search object types.

Scope Object and Doc Lists by Role
The display of custom attributes in the columns of search results and when browsing folders should be customized by Role to establish consistency and attribute value listing expectations. Custom attributes will be the important ones to show and sort in these circumstances.

For searching across all object types think about replication
First, you’ll need to figure out the common attributes across all of the object types. Second, although you might be able to roll up a few of the custom attributes into the parent type, different user interface requirements will probably force the design to repeat certain attributes in every child object type. One solution is to develop a TBO that replicates some of these common child attribute values to the corresponding parent attribute in order to search for them from the one parent object type in advanced search.