Thursday, June 4, 2009

Solution Pattern for OOTB Webtop and Search

Scenario

  • The client requires a multi-tiered custom object model with most of the attributes at the child level.
  • There are 10 distinct types of content which share some common attributes, but have very specific attributes as well.
  • The client wants to search for attribute/value pairs across all of the child documents easily by selecting one parent object, typing in search criteria, and executing.
  • All imports, checkins, new templates, and properties interfaces need to be tailored to the client’s specific requirements for conditional attribute population in specific order.
  • All content types have some mutually exclusive required attributes that cannot be null.
  • And of course, the client does not want a lot of Webtop customization.

Possible Solution Routes
  • Traditional: Customize Webtop import, checkin, new doc, properties, and search pages.
  • New: TBO with common attributes pushed to the parent object type, with limited WDK customization.

Design Road less Traveled By

We isolated the absolutely necessary WDK components that needed to be developed to satisfy display and functional requirements. Some functional requirements work is needed here to decide which attributes are common and have the most impact on finding content in the repository. Searching across all child object types was a critical requirement, so we focused the design of the object model on shared or common attributes. The design of the object model looked like this:

Parent (common_attr1, common_attr2)
Child1 (common_attr1) --- Child2 (common_attr1, common_attr2) --- Childx (common_attrx)…
Every time a document is saved, the common attributes of that child doc type are replicated to the common attributes of its parent. Besides display customization of search results and grid columns, there is no need to customize the query builder of the search page. The query also performs better: because the attributes are at the parent level, there are no database table unions happening behind the scenes during the execution of the search.

Developing the TBO
The main purpose of the TBO is to override the Save and Checkin methods that are triggered during the use of Webtop. This TBO basically gets the common attributes of the content being saved or checked in and sets their values to the parent attributes. This TBO is deployed to override the parent object type.
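
The replication step can be sketched in a few lines. This is only an illustration, written in Python rather than the Java/DFC a real TBO would use; the attribute names and the dict standing in for a document are hypothetical, not taken from the project.

```python
# Hypothetical mapping from child-level attribute names to the shared
# parent-level attribute names (all names are illustrative only).
COMMON_ATTR_MAP = {
    "tax_client_name": "common_client_name",
    "tax_fiscal_year": "common_fiscal_year",
}


def replicate_common_attrs(doc, attr_map=COMMON_ATTR_MAP):
    """Copy each child attribute value onto its parent-level attribute.

    `doc` is a plain dict standing in for the document's attributes. In a
    real TBO this step would run inside the overridden save/checkin logic.
    Returns the list of parent attributes whose values changed.
    """
    changed = []
    for child_attr, parent_attr in attr_map.items():
        if child_attr not in doc:
            continue  # this child type does not carry the attribute
        if doc.get(parent_attr) != doc[child_attr]:
            doc[parent_attr] = doc[child_attr]
            changed.append(parent_attr)
    return changed
```

Only values that actually differ are written, so repeated saves of an unchanged document do no extra work.
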
Developing the Webtop Components

The customizations are to the object and doc lists. The “documentum.webtop.webcomponent.objectlist” and “doclist” components have the common attributes added to them in order to show the same view of the attribute columns for browsing folder contents and search results. This could have been a preference setting; however, the particular version at the time had a bug in column sorting, so we had to customize these components. The “search60” component was also changed to show only the parent object type and its child object types in the search.

Monday, May 4, 2009

Max Session: Obscure Documentum server.ini key saves the day!

Environment
Windows D6.5 installation using Webtop, some WDK customizations, and a TBO for major customizations. The TBO has to create a superuser session to do some work.

Issue
Max sessions are reached on the application server after a few hours of use even though the “Active” sessions are far below the configured threshold. In other words, the Tomcat server is counting both “Active” and “Inactive” sessions in its determination of “max” sessions.

Keep in mind that there is a lot of customization in the TBO and this required creating and releasing a session manager for a superuser during each save of the document. This is what is building up the "Inactive" sessions. 

Patch Solution Given
The initial solution was to jack the max sessions up to 100,000. This caused the Tomcat service to die once a week or so, basically maxing out the memory allocation to the process.

Max Sessions Investigation
I opened the Documentum Content Server Administration Guide and searched for “max” or “session”.

I listed the variables in the session being maxed out

  • Application versions
  • Property files
  • Custom code


I looked into implicit and explicit sessions

Hint: For testing superuser in TBOs, I used a different superuser than the install owner account.


Application Server

  • Measured by Active and Inactive Sessions
  • dfc.properties Key/Value Pair: dfc.session.max_count = 1000 (default)
  • DQL: execute show_sessions
  • DA: Administration > User Management > Sessions (All)
  • web.xml: HTTP session timeout is set in the \app server\ web.xml (default is 30 min): <session-timeout>30</session-timeout>
  • Hint: To find leaks set dfc.diagnostics.resources.enable = true (default is false)


Content Server

  • Measured by Active Sessions
  • server.ini Key/Value Pair
  • concurrent_sessions = 100 (default is 100, max is 1024). These sessions are “Active” sessions from the content server’s perspective
  • history_sessions = (how many timed out sessions show in list_sessions)
  • history_cutoff = (default is 240 minutes)
  • client_session_timeout: default is 5 min
  • check_user_interval: frequency in seconds at which the CS checks the login status for changes. Default is 0, meaning it checks only at connection time.
  • login_ticket_timeout: length of time a ticket is valid, default is 5 min
  • DQL: execute list_sessions

Final Solution

I added “history_cutoff = 5” to the Content Server’s server.ini file. The “history_cutoff” key controls the longevity of the inactive sessions. The default value of this key is 240 minutes (4 hrs). This would explain why the max session limit is hit only occasionally.

My testing has shown that setting the “history_cutoff” key to a much smaller value, such as 5 to 30 minutes, allows the inactive sessions to clear reasonably soon, so they do not fill the max sessions of the Tomcat server.

To test this I set the following:

Set up WDK Automated Test Framework to run the same tasks over and over again to build up Active and Inactive sessions.

Set up the Content and Tomcat servers with these base line settings:
server.ini file: concurrent_sessions = 20
dfc.properties in the webtop/WEB-INF/classes: dfc.session.max_count = 30

Result: The Tomcat server fails when the total of Active and Inactive sessions exceeds 30.

Then set up the Content and Tomcat servers with these settings:
Settings with history_cutoff changed
server.ini file: history_cutoff = 5 and concurrent_sessions = 20
dfc.properties in the webtop/WEB-INF/classes: dfc.session.max_count = 30

Result: The Tomcat server fails only if the number of active sessions exceeds 20, thus relieving it of the inactive session burden.
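
The arithmetic behind these two results can be illustrated with a toy model. It is a sketch under a simplifying assumption that matches my reading of the behavior above, not a description of DFC internals: each save releases a session that lingers as “Inactive” for history_cutoff minutes, and the client fails once active plus inactive sessions exceed dfc.session.max_count. The function name and the steady save rate are made up for the example.

```python
def minutes_until_max(active, saves_per_minute, history_cutoff, max_count,
                      horizon=480):
    """Return the first minute at which active + inactive > max_count.

    Returns None if the limit is never exceeded within `horizon` minutes.
    Assumes a constant number of active sessions and a steady rate of
    released (inactive) sessions, each lingering `history_cutoff` minutes.
    """
    for minute in range(1, horizon + 1):
        # Sessions released within the last `history_cutoff` minutes still count.
        inactive = saves_per_minute * min(minute, history_cutoff)
        if active + inactive > max_count:
            return minute
    return None


# Default cutoff of 240 minutes: inactive sessions pile up past the limit.
print(minutes_until_max(active=10, saves_per_minute=1,
                        history_cutoff=240, max_count=30))  # 21
# Cutoff of 5 minutes: inactive sessions plateau at 5, so the limit holds.
print(minutes_until_max(active=10, saves_per_minute=1,
                        history_cutoff=5, max_count=30))    # None
```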

Saturday, April 25, 2009

Creating and Deploying Templates Using API and DQL Scripts

Recently I thought I was finished working on a project that had some templates in the design and deployment. We were “done” which meant the budget was depleted and the customer wanted us gone; no more billable time. I’m not sure what happened to the “customer is always right”, but this statement and sentiment is coming back in popularity. In this economy it makes more sense to bend over backwards to please a client, than bicker over getting paid for our own mistakes.

The issue with the templates eluded my developer and me. My developer had created an API script (see below) to load in the templates. The script created the doc objects, set the content, and set the i_folder_id of the doc to the “/Templates” cabinet object id. The templates were “linked” into the cabinet and seemed to function as desired.

However, as the in-house developer at the client site found out later, the templates were not truly linked to the “/Templates” cabinet. The in-house developer had the advantage of sorting this out over a much longer period of time than we as consultants had. That being said, I should have figured this out, but I was confident that my developer’s script was correct, plus a dump of the template object looked okay.

Here’s the one attribute of the template object that we missed: “i_reference_cnt”. It was “0” instead of “1”. The “i_folder_id” was correct, but the “i_reference_cnt” was not set correctly. The script was setting the i_folder_id value when it didn’t have to. The object gets linked to the home cabinet of the session account by default. A follow-up DQL can be run to move the object from the home cabinet to the ‘/Templates’ folder.

There’s a support note on Powerlink which describes how to create custom templates using DQL and copy them using DA. You can also try the following API and DQL script that was modified to work correctly for deploying template files.

API Script

create,c,dm_document

set,c,l,object_name
Test template

set,c,l,owner_name
dmadmin

set,c,l,a_is_template
true

set,c,l,title
Test Template

setfile,c,l,C:\temp\test.doc,msw8

save,c,l

DQL Script

update dm_document objects move to '/Templates' where object_name = 'Test template'

go
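
If there are many templates to deploy, the two scripts above can be generated rather than typed by hand. The following is a small sketch of that idea; the function names are hypothetical, and the output simply mirrors the API and DQL text shown above, to be run through the usual API and DQL tools.

```python
def template_api_script(object_name, title, file_path,
                        file_format="msw8", owner_name="dmadmin"):
    """Build the API script text that creates one template document."""
    lines = [
        "create,c,dm_document",
        "set,c,l,object_name",
        object_name,
        "set,c,l,owner_name",
        owner_name,
        "set,c,l,a_is_template",
        "true",
        "set,c,l,title",
        title,
        f"setfile,c,l,{file_path},{file_format}",
        "save,c,l",
    ]
    return "\n".join(lines) + "\n"


def move_to_folder_dql(object_name, folder_path="/Templates"):
    """Build the follow-up DQL that moves the saved object into the folder."""
    # DQL string literals escape a single quote by doubling it.
    safe_name = object_name.replace("'", "''")
    return (
        f"update dm_document objects move to '{folder_path}' "
        f"where object_name = '{safe_name}'\ngo\n"
    )


print(template_api_script("Test template", "Test Template", r"C:\temp\test.doc"))
print(move_to_folder_dql("Test template"))
```

Note that the generated API script never sets i_folder_id; the move is left to the DQL so that the link is created properly.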

Tuesday, March 17, 2009

Documentation: What and When

In software development we test everything but the project’s documentation. I can’t tell you how many times I’ve had to scour a project’s documentation for information that should be organized for quick reference and be up-to-date with the latest configuration and customizations. Instead the documentation is usually missing some crucial bit of detail that forces me to search for answers and waste time, mine and the client’s.

So, to get back to putting more emphasis on verifying and using documentation: how do we do this, beyond test scripts and installation docs, or design and requirements docs?

One approach is to log more diligently all of the issues that happen during the development and deployment of the project. These logs have vital setup information and deployment hurdles that will never get documented formally. The testers and developers are the keepers of this knowledge and need to document it as they work through the problems they encounter.

The problems lead to the most important aspects of the project’s success. The solutions will suffice for the time being, but the problems will strike again in a similar fashion, in a pattern. These patterns are what need to be understood.

For example, you deployed a workflow with an auto activity that timed out during the QA testing. The timeout setting was increased, but no one documented it. When the workflow gets deployed to Production the same thing happens, but users see the workflow has paused and are now concerned and annoyed. The first thing you do is read through the documentation which has no reference to timeout changes. Then you look at logs and see that a method has failed with no reason why. The workflow supervisor’s inbox is filled with errors but you don’t know that because no one documented how to occasionally check that user’s inbox. No one even considered a fast system with a few workflows timing out.

I think the point here is that documenting is not only writing about the design of the system, its configuration and customizations, but detailing the pitfalls and hurdles of the process as well. There could be two sets of documents, one for the client and one for your sanity when things go wrong, which they will; it’s just a matter of time. Next time you'll be more prepared, with a cheat sheet and quick references to previous issues and complex configurations and deployments.

Monday, March 2, 2009

Documentum Maintenance/Procedure Checklist

After the initial Documentum installation and rollout of the first phase, it is essential to
follow a maintenance/procedure checklist to assure maximum system performance and stability.

Documentum Administrator
Many of the maintenance procedures and jobs are configured or accessed through Documentum
Administrator (DA):
  • Server and Repository configurations
  • LDAP configuration
  • Users, Groups, Roles
  • Security (ACLs)
  • Storage (Locations, Storage, and Filestores)
  • Index Agent’s failed index list should be understood and resubmitted if necessary
Maintenance

Logs to Monitor
It is highly recommended to check all logs periodically for errors and warnings.

Application Server
Name: stdout_yyyymmdd.log (example: stdout_20090218.log)
Location: \Program Files\Apache Software Foundation\Tomcat 6.0\logs
Purpose: shows warnings and errors from Webtop and TBOs.

Content Server Repository Log
  • Name: DocbaseName.log
  • Location: C:\Documentum\dba\log
  • Purpose: Shows the repository startup output and any warnings or errors.
Java Method Server Log
  • Name: access.log and DctmServer_MethodServer_DocbaseName.log
  • Location: C:\Documentum\bea9.2\domains\DctmDomain\servers\DctmServer_MethodServer\logs
  • Purpose: tracks access and status of the Java Method Server
Index Server Log
  • Name: access.log and DctmServer_IndexAgent.log
  • Location: C:\Documentum\bea9.2\domains\DctmDomain\servers\DctmServer_IndexAgent\logs
  • Purpose: tracks access and status of index agent
Disk Space Management

The Content Server has a state of the docbase job (dm_StateOfDocbase) which monitors
this. Also the data drive should be monitored.
  • The SQL Server transaction log should be monitored
  • The Webtop cache files should be monitored
  • The Index data drive should be monitored
Database Maintenance and Logs
  • Disk space should be monitored
  • Transaction logs should be monitored
  • CPU and RAM usage patterns
Jobs

Some of the jobs below are not active OOTB. They have to be set to active and started on a schedule. Be sure to set the run times so that they do not conflict with other jobs and backup
schedules.

dm_ContentWarning
  • Purpose: Warnings for low availability on DM content/fulltext disk devices
  • Method args: -window_interval 720, -queueperson, -percent_full 85
  • Note that the "-percent_full" value is "85", which you may want to lower for more lead time to deal with disk space.

dm_DMClean
  • Purpose: Executes dmclean on a schedule
  • Method args: -queueperson, -clean_content TRUE, -clean_note TRUE, -clean_acl TRUE,
    -clean_wf_template TRUE, -clean_now TRUE, -clean_castore FALSE, -clean_aborted_wf FALSE, -window_interval 1440

dm_LogPurge
  • Purpose: Removes outdated server/session and job/method logs
  • Method args: -queueperson, -cutoff_days 30, -window_interval 1441
  • Note the "cutoff_days" parameter should be set to a reasonable number of days, balancing compliance and troubleshooting needs.
dm_StateOfDocbase
  • Purpose: Lists docbase configuration and status information
  • Shows: Number of docs and Total size of content, among many other stats.
dm_AuditMgt
  • Purpose: Removes old audit trail entries. A key parameter is the cutoff in days, basically how many days' worth of audits to keep.
  • Method args: -queueperson, -custom_predicate r_gen_source=1, -window_interval 1440,
    -cutoff_days 1
  • Note the "cutoff_days" parameter should be set to a reasonable number of days, balancing compliance and troubleshooting needs.


dm_QueueMgt

  • Purpose: Deletes dequeued items from dm_queue
  • Method args: -queueperson, -cutoff_days 90, -custom_predicate, -window_interval 1440

dm_UpdateStats

  • Purpose: Updates RDBMS statistics and reorgs tables (if RDBMS supports)
  • args: -window_interval 120, -queueperson, -dbreindex READ, -server_name SQL2\SQL2005

dm_ConsistencyChecker

  • Purpose: Checks the consistency and integrity of objects in the docbase

dm_DataDictionaryPublisher

  • Purpose: Publishes data dictionary information

dm_LDAPSynchronization

  • Purpose: One-way synchronization of LDAP users and groups to Docbase
  • Method args: -window_interval 1440, -queueperson, -create_default_cabinet true, -full_sync
    false

dm_FTStateOfIndex

  • Purpose: State of Index
  • dm_FTIndexAgentBoot: Boot Index Agents
  • Method args: -window_interval 12000, -queueperson dmadmin, -batchsize 1000,
    -writetodb_threshold 1000000, -serverbase F, -usefilter F, -dumpfailedid F,
    -matchsysobjversion F, -matchallversion F


dm_GwmTask_Alert

  • Purpose: Sends email alert if task duration is exceeded

dm_GwmClean

  • Purpose: Cleans all the orphan decision objects

DQLs to run to check on audit trails and dmi_queue_items

The following statements are some of the DQLs that EMC support had us run to determine the
number of audit trails and queue items that were in the repository:


Select count(*) from dmi_queue_item

Select count(*) from dm_audittrail

Backup Procedures

Ideally, the Content Server should be shut down prior to running the backup of the SQL Server database and started back up afterward. This reduces the likelihood of the repository becoming out of sync with the database and the content files.

OS and Software Upgrades/Patches

Before applying any patches or upgrades to any of the Documentum suite and supporting applications, be sure to check for compatibility. Apply any patches or upgrades to the dev and QA environments and test them first.

Network Connectivity Interruption

If any network interruption occurs, then service logs should be checked for compromised activity. The Content Server and Tomcat server may need to be restarted. The logs of the application and content servers should be periodically monitored for errors and warnings.


Performance


RAM and CPU Utilization Maxed Out

If RAM is filled or CPU utilization is maxed out then the service responsible should be checked. If the service is a Documentum service, it should be restarted and the root cause should be determined. Utilization should be monitored and any anticipated spikes in use or
additional services need to be load tested and analyzed. What should you do if Tomcat performance slows? If the concurrent users reach EMC’s limit of 20, EMC recommends adding a second Tomcat server.


Further Java Memory Allocation settings to consider.

EMC Support gives the basic JVM settings to cover for common exceptions and crashes. There
are a number of other settings to add as more traffic occurs on the Tomcat server. From the
DCM Installation Guide: “To achieve better performance, add these parameters to the application server startup command line:

  • -server -XX:+UseParallelOldGC

Document caching can consume at least 80MB of memory. User session caching can consume approximately 2.5 MB to 3 MB per user. Fifty connected users can consume over 200 MB of VM memory on the application server. Increase the values to meet the demands of the expected user load.”
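
As a rough worked example of the figures quoted above, the estimate is simply the document cache plus a per-user amount. The function below is only a back-of-the-envelope sketch; its name and defaults are mine, taken from the 80 MB and 2.5 to 3 MB numbers in the quote, and real usage will vary with the application.

```python
def estimate_vm_memory_mb(users, doc_cache_mb=80.0, per_user_mb=(2.5, 3.0)):
    """Return a (low, high) estimate in MB of VM memory for `users` sessions."""
    low = doc_cache_mb + users * per_user_mb[0]
    high = doc_cache_mb + users * per_user_mb[1]
    return low, high


# Fifty connected users: 80 + 50 * 2.5 = 205 MB up to 80 + 50 * 3 = 230 MB,
# which agrees with the guide's "over 200 MB".
print(estimate_vm_memory_mb(50))  # (205.0, 230.0)
```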

Monitor Sessions

DA

  • Location: Under Administration > User Management > Sessions
  • Or DQL: execute show_sessions (to show all active and inactive sessions)


DQL

  • execute list_sessions (to show active sessions)

Via docbasic ebs script

  • Purpose: run this script at a command line prompt to output how many active and inactive sessions are currently on the content server. Set the interval between outputs and how many loops to run.
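
The counting that such a script performs can be sketched as follows. This is not the docbasic script itself: it assumes the session rows have already been fetched (for example from show_sessions) into dicts, and that each row carries a session_status value of "Active" or "Inactive". Both the function name and that column name are assumptions to verify against your own output.

```python
from collections import Counter


def tally_sessions(rows):
    """Count active and inactive sessions in a list of session rows.

    Each row is a dict; the "session_status" key is an assumed column name.
    Rows with any other status are ignored in the returned totals.
    """
    counts = Counter(row.get("session_status", "Unknown") for row in rows)
    return {
        "Active": counts.get("Active", 0),
        "Inactive": counts.get("Inactive", 0),
    }


sample = [
    {"user_name": "dmadmin", "session_status": "Active"},
    {"user_name": "tbo_super", "session_status": "Inactive"},
    {"user_name": "tbo_super", "session_status": "Inactive"},
]
print(tally_sessions(sample))  # {'Active': 1, 'Inactive': 2}
```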


Troubleshooting Max Sessions error

Before restarting Tomcat:
Try logging into the content server from docweb using the Doc App Builder
application. If you can, then this isolates the max session error to the Tomcat/Webtop
server.

  • Using DA, look at how many “active” user sessions are currently in the repository,
    and how many “inactive” sessions.
  • Try reducing the session timeout value in the web.xml on the Tomcat server to see if
    the inactive sessions get cleared out faster.

Security and Server Access Maintenance

  • Test users and test content should be deleted out of Production
  • The database schema owner account should be locked down
  • The Documentum install owner, “dmadmin” should be locked down
  • Only scheduled, authorized access to the Production should be allowed for all
    servers of the system.
  • Repository audit trails should be configured for certain events, such as deleting of
    content.

Long Term High Availability and Scaling Recommendations

  • As more users access the system, it may become necessary to create a second Tomcat
    (clustered) instance to ease the load on just one application server.
  • As more content gets added to the system, more disk space will need to be added to
    the filestore drive.
  • Set up failover services for all key components
  • Add more Java Method Servers if lifecycle processing overwhelms the existing one.
  • A comprehensive content archiving plan will need to be designed and implemented.
  • Set up a disaster recovery site if the system’s service level agreement (SLA) requires
    recovery sooner than a new system could be built from backups.

Sunday, February 22, 2009

Applying Malcolm Gladwell to a Documentum Project

In Outliers, Malcolm Gladwell’s stories of successful people show a few concepts that can be applied to Documentum projects. These concepts explain how some individuals succeed and some fail. A successful project can point to these concepts as part of a successful pattern.

“Concerted cultivation”
Taking an honest interest in what the business customer really needs out of a Documentum content management system (beyond your personal business/monetary objectives) is what this is about. Fostering the open disclosure of issues the business would like to solve in an honest and transparent manner will go miles toward making the project successful. If the issue is lack of accountability in tracking their processes, then they need workflow, not “just Webtop, and we'll see in the future if they need workflow”. If the issue is that they want to convert from paper and be more productive, then they need scanning and basic classification, not just Author Integration Services.

Cultivate what the client will need to learn in order to appreciate the solution; do not hold back information in hopes of establishing an ongoing support contract. If you don’t give enough knowledge transfer, the client will resent the issues it encounters and might blame you for not training them enough. If you don't document all of the potential issues that might come up, then the customer could point their finger at you and say "fix this for free".

Being the right person present at the right time
Chances are it wasn’t that you or your company won a Documentum project because of your raw talent, it was more likely that you had the right architect give the pre-sales demo or a contact at the company who you knew from a previous engagement, or some other connection to the company that won the work. Chances are even better that you happened to have previous experience that fitted well with what the company was looking for.

This brings up one of my issues with pre-sales. Most of us get paid when we’re working on a project; if we’re not, we’re on the bench slaving over a statement of work or a response to an RFP. Why don’t consulting companies realize that they need a research and development group just as much as a software company does? It doesn’t have to be for product development, but for solution development. Researching issues and developing solutions is what we do, but most of us solve issues on the project, write documentation, and then we’re on to the next project. What we need, and don’t have, is an occasional sabbatical to put the solution together and truly work through the requirements, content, functionality, testing, implementation, architecture, and most importantly the lessons learned. Once I was asked what types of content were part of a solution that I had only helped with technically, and I couldn't answer for sure. This was because my time on the project was specifically allocated to deploying it to production, not to fully understanding what the content was. This was sobering. I'm in this business to help solve content management issues and I was so caught up in the technology that I didn't even know what content was being managed...

“Mitigated speech”
It’s no secret that developers and architects from India are well trained and very talented in the computer science fields. What has taken me a while to understand is how a developer will communicate with an architect or project manager. He might ask a question like, “do you think the system is a little slow?” You might respond, “It’s not bad considering that it’s a QA environment.” The developer might be downplaying what is being said, that is, he might really be saying “do some load testing you idiot or this solution might crash production when it’s released”. This is how “mitigated speech” works and it needs to be dealt with at the beginning of the project. Make sure you’ve created a few avenues for communication that wring out concerns clearly and effectively.

“It is those who are successful…who are most likely to be given the kinds of special opportunities that lead to further success.”
Obviously, this applies to all project outcomes; however, you may think you’ve succeeded when in fact the business feels coerced into paying you the final amount due. During the final stage of a project, which includes stabilization and knowledge transfer, there may be times when giving a few extra days for free gives a nice feeling to the customer. If the stabilization of the project is bumpy, like most are, beware of saying “this is not in the budget, we won’t do it without getting paid.” This kind of approach, during a time when the customer is stressed out and any problems are easily blamed on the solution provider, will not result in the kind of return business that would be achieved if you went the extra mile for free. Letting the customer push you around a little at the end may help extend the support contract until a further project comes along. These days a customer needs to feel like they’re getting a deal on your services. Make them feel that way…

The success of a project is not because of the one person or the architect or the design, it is because of the overall efforts of everyone involved, the timing, the company culture, the governance, mutual trust, and confidence in each other. One project’s success could catapult your whole company into phenomenal success, but it was a combination of experience, who you know, how much you’ve practiced, and whether you were in the right place at the right time. And luck.

Sunday, January 18, 2009

Users, Groups, and Roles with ACLs and Presets

At some point in the evolution of any repository’s design and implementation, the issues of managing users, groups and roles, and of reducing object type clutter rise to the top of the priority list. Also, finding a common way in Webtop of displaying attributes for documents listed in folders and search results becomes vital to maintaining a consistent user experience and getting a grip on endless customization costs.

You can talk to the business about users, groups, and roles, but the ramifications of your design will not click with them until they see Webtop in action and what presets and ACLs are actually doing to their user experience. For example, a user has the ability to import content and as the owner of the content they can promote it even if they have Read access only: this is hard to understand unless it was shown during a demonstration.

Here are some tips on how to design and implement users, groups, and roles with ACLs and presets:

Manage User Group Membership from LDAP or Active Directory Groups
I strongly recommend managing group membership using LDAP integration; otherwise you’ll have to run a script to add members to groups the first time and then manage group membership one by one with Documentum’s tool instead of a more robust user directory management tool.

Either use or exclude the dm_world default group from accessing your content
For tighter control of security, create a custom group that is your company’s base group from which to build. This base group should have a consistent level of minimal access to all content in the repository, such as Read access.

Using Presets to reduce Action in Webtop
Now that you have a base custom group, you can exclude from that group any actions, such as “Create New Documents” or “Import” from Webtop. Review the manual on what the basic client capabilities of Consumer, Contributor and Coordinator are before trying to add these to Presets.

Create Test Users that have Inline Passwords, as well as Users from LDAP and Local Domain Users
Sometimes LDAP will be “slow” or “fail over” (I know, user directories never fail, but they do), and you need to be able to access the repository when and if that happens. Chances are that your install owner account (“dmadmin”) will be a local account (I hope). So if domains or LDAP or the network is down, at least you’ll be able to access the repository with inline or local accounts.

dmadmin: superuser best practice
- Add docu group as delete, full privilege to all ACLs
- Don’t include docu with the custom world group
- Don’t use dmadmin to test
- Don’t have dmadmin in any other groups, especially ones controlled by presets because the link to Webtop administration of presets may vanish and you’ll have to create or move a new user into the preset group.

Create contributor Roles that are responsible for importing and creating new content for each group of content types
For a financial company, this means creating a group for Tax, Treasury, Marketing, Operations, Accounting, and so on. Each group is responsible for a certain group of content; for example, Tax imports tax returns, withholding docs, etc., and Treasury has capital calls and distributions.

Create Presets for each Role that include the content types they import only
This controls the object type dropdowns when importing, checking in, and creating new content from templates. Be sure to include a parent type that will not be used, but will be the first in the dropdown, so that custom attributes will be refreshed when the custom type is selected from the dropdown.

Scope Search Object Type Dropdown by Object Type Parent
Users of Webtop usually need to limit the number of object types to search for in advanced search. There are only so many types that users need to look at for searching. Less is more in the case of the advanced search object types.

Scope Object and Doc Lists by Role
The display of custom attributes in the columns of search results and when browsing folders should be customized by Role to establish consistency and attribute value listing expectations. Custom attributes will be the important ones to show and sort in these circumstances.

For searching across all object types think about replication
First, you’ll need to figure out the common attributes across all of the object types. Second, although you might be able to roll up a few of the custom attributes into the parent type, different user interface requirements will probably force the design to repeat certain attributes in every child object type. One solution is to develop a TBO that replicates some of these common child attribute values to the corresponding parent attribute in order to search for them from the one parent object type in advanced search.