Monday, December 15, 2008

Composer the Poser

When a teenager wears skateboarding shoes and a brand-name hoodie but doesn't skateboard, he is referred to by real boarders as a "poser". Unfortunately, the same can be true for EMC's Composer. It wears all of the key features of the Documentum Application Builder (DAB), but it ain't no DAB.

Don't get me wrong: like a good EMC citizen, I took a DAB-built system and created Composer projects for Production, QA, and Dev. I worked through some initial bugs in the Composer build and was determined to deploy the next phase of the project using D6 SP1 Composer. I had heard that D6.5 was not going to include DAB, so I thought I'd better get on the bandwagon and get used to Composer.

I read through Jason Duke's article on the Blue Fish site and was assured that I'd get through the process with a few issues but no show-stoppers. I figured out that all of the directory structures needed to be consistent in each environment. I also worked out a deployment plan of multiple, decoupled DAR files for easier development coordination and implementation. I rewrote my design document to follow the new and "improved" configuration UI of Composer.

The first roadblock was value assistance for a query. The XML file created by Composer was missing the "complete_list = true" key/value pair for the query element, which meant that the dropdown was fixed at 20 pixels wide. EMC support sent a fix for Webtop (not Composer) that corrected the width of the dropdown, but the dropdown graphic was the new look-and-feel graphic. This was glaringly bad for users to see. What were they thinking?

The second roadblock was that if you include groups and roles in the Composer project, deploying it will overwrite the existing groups and members unless you remember to change the install instructions each time. There is no global way to tell Composer to ignore existing artifacts, so each time you deploy, you have to set the install option again for each artifact. When the project install fails, which it will, over and over again, you'll be clicking and clicking. Bottom line: this needs a lot of work before it's production ready.

The third roadblock was that value assistance for query-based dropdowns changes the rendering of the select box compared to the fixed-value dropdown. Hello, a customer can tell the difference between the two and won't like it. The fix is to go into DAB and save it again, but this is no excuse. Gotta wonder what level of regression testing was done...

After chasing a few more bugs with Composer D6 SP1, the patch came out, but that didn't help much. Common sense won out and we've gone back to using DAB for deployment. As clunky as it is, at least it isn't a poser. By the way, DAB was released for D6.5. Gee, I wonder why?

Sunday, November 9, 2008

LDAP Unwrapped

Every time I configure a Documentum repository for LDAP synchronization, I start out impressed with the new features that have been added to the latest version, but end up frustrated with the myriad of problems to solve to get the synch actually working.

The first thing you have to do is download the latest bug fixes, to avoid what always happens to me: chasing errors on the EMC Developer Network, searching and searching, and finally finding some reference to bugs and fixes. The ftp address for D6 SP1 fixes is ftp://dev_pre:qa5.grN6@ftp2.lss.emc.com/sustaining/Content_Server/6.0_SP/LDAP .

The appendix below lists the bugs that are currently fixed. These fixes will make the synchronization actually work if you have more than one configuration set up to run in succession.

Start with the User Directory
You need to engage the LDAP administrator in order to determine the group membership structure of LDAP and whether it makes sense to use it. That is, are the groups set up with content management in mind, or are they a mixture of file system security and ad hoc assignments?

Also, have the user accounts been set up consistently? For example, Active Directory works fine with accounts set up like "smith,john" and "smith.joe"; however, Documentum throws an error while processing "smith.joe" because of the period.
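For the users that did get imported, a quick DQL check (a sketch only) flags any login names containing a period:

select user_name, user_login_name
from dm_user
where user_login_name like '%.%'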


You'd think managing group membership from one source application would be easier than from multiple applications; however, decoupling LDAP groups from Documentum groups may be more practical given the differing purposes and rules within each application. The classification rules (driven in part by group/role definitions) for a content management system are usually different from, and definitely more robust than, those of a file system.

Automatic creation of users' cabinets from LDAP
It's not obvious how to do this from the LDAP configuration form. From the LDAP Server Configuration Properties' Mapping tab, you have to add the following mapping:
- Property: default_folder
- Type: dm_user
- Map To: /${cn}
- Map Type: Expression

This mapping will create a cabinet for each LDAP user as they are added to the repository. The cabinet will be owned by the user and will be private to that user. This cabinet will be the user’s home cabinet.
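To verify the mapping worked after a synch, a quick check (a sketch; it assumes your synched users have user_source set to 'LDAP'):

select user_name, default_folder
from dm_user
where user_source = 'LDAP'

Each default_folder should come back as /<the user's cn>, the path of the new private cabinet.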

Appendix: D6 SP1 LDAP Bug Fixes


Bug fixes in LDAP_DOCAPP_HOTFIX
153994 - Unable to create LDAP config object on AIX/BEA as it throws java.lang.NoClassDefFoundError (It applies to all OS/Appserver combinations)
153355 - The Directory type property is not properly set during creation of an LDAP config object
153322 - DA hangs on the Mapping tab during creation of an LDAP config object
151022 - Cannot create LDAP config object without setting proper OU info
150397 - LDAP config object cannot pull the SamAccountName attribute in AD2000 and AD mixed mode
151570 - Cannot create proper LDAP config object

Bug fixes in D6SP1_dmldap_hotfix
145896 - When using "\" in the CN values for LDAP, the LDAP synchronization propagates the users to Documentum properly but group membership synchronization fails because of this special character.
149446 - LDAP Synch fails when using subtype of dm_user
151224,154269 - LDAP users with apostrophes in their names (e.g. Robert O'Leary) are successfully imported to Documentum; however, a DQL query fails due to the apostrophe in the user name.
149443 - LDAP Synch fails when mapped attribute is null on Directory server
154399 - LDAP Sync job tries to deactivate the user even though the user is not present in the docbase.
154511 - If dm_ldap_config is configured to map LDAP attributes to docbase attributes of a subtype of dm_user, running LDAP Sync fails to populate the additional attributes introduced in the subtype with LDAP data.
154704 - LDAP Sync throws DmLdapException:: THREAD: [ACTIVE] ExecuteThread: '3' for queue: 'weblogic.kernel.Default (self-tuning)'; MSG: home_docbase - Property not found; ERRORCODE: ff; NEXT: null
155432 - The LDAPSync job appears to be querying older versions of the serverconfig object to get the ldap_config_id. If the older versions do not have a value set, the job will fail.
156347 - LDAP Sync job fails to remove a sub-group when the sub-group is deleted from AD.
157177 - LDAP Sync throws NullPointerException
157650 - com.documentum.fc.common.DfException constructor throws Exception:
java.lang.StringIndexOutOfBoundsException: String index out of range: 90
157759 - LDAP Sync throws Exception [DM_GROUP_E_UNABLE_TO_SAVE_EXISTING]error: "Cannot save group geoda-grp because a user already exists with the same name"; ERRORCODE: 100; NEXT: null
158197 - LDAP job fails to create the user's cabinet even though the argument create_default_cabinet is set to true.
158766 - Running LDAP Sync throws the following LDAP NamingException:
javax.naming.NamingException: [LDAP: error code 1 - 000020EF: SvcErr:DSID-020513B8, problem 5012 (DIR_ERROR), data 8333]; remaining name 'CN=GAU2876SPE,OU=Security Groups,OU=Groups,OU=USPTO,DC=prepro,DC=local'"; ERRORCODE: 100; NEXT: null
153709 - Request for LDAPSynch job to have a switch to ignore case when comparing usernames
162830 - LDAP Sync job removes the user's default_group attribute
162684 - LDAPSynch leaves trailing spaces when setting user attributes with mapped custom values
164164 - LDAPSynch fails with DM_LDAP_SYNC_E_EXCEPTION_ERROR "String index out of range: 32" on getTruncatedString adding a member to a group if the username has trailing blank spaces that exceed the 32-character length limit.

Tuesday, October 21, 2008

Blame Everything on Testing

Your project is executing according to the schedule: code development is done and unit tested; the Documentum applications have been installed and smoke tested; and UAT is about to begin. The designated tester on the team has been writing and running test scripts on the Test system, and there's a week allocated for testing there. Oh, by the way, there's a migration of legacy content being tested as well.

Yessss, I’ll be able to take my planned vacation
The testing is going great in Test. What do you think could go wrong? Well, first, there are LDAP users in Prod that are not in Test. Second, there are slight, undocumented differences in the ACLs between the two environments. Third, the migration will not be fully tested until it is run in the Prod environment. Fourth, the UAT users were not fully engaged during the testing and script-writing phase of the project. Fifth, basic performance tests were run on Test, but real-use tests to soak the system have not been executed. Sixth, a key developer is rolled off the project to start a new project at another company. Seventh and finally, the sponsor of the project left the company and a new, more technical sponsor is critiquing the application.

Can I get a refund for my vacation?
LDAP Users are in Production: The users that are imported will need to be added to groups if those memberships aren't coming from LDAP. The user logins will need to be tested. Renaming users will need to be tested.

ACLs: It is extremely tempting to make ACL changes manually in Dev, QA, and then Prod, without deploying them through a docapp or scripts each time. This inevitably causes slight differences between the environments.

Migration: When the migration is finally run in Prod, more focus will be on the results. The real users will be searching and finding bugs. Rerunning the migration to Prod a few times should be built into the schedule.

UAT: No matter how much you tell the UAT testers that they need to be engaged and thinking about testing while reviewing the test scripts, they’ll find a ton of issues with the application during UAT. Showing them demos and prototypes during the development of the application helps, but they still won’t be able to really test until they sit down and seriously try to figure things out.

Performance: The Test environment was not tested for performance, and there's not enough money in the budget for a separate performance environment. You're left with trial by fire, and it'll be challenging.

Key Developer Rolls Off: This happens a lot. The Project Manager schedules a developer to be available for a few days after unit testing. The best documentation of code and customizations could exist, but if the developer is no longer on the project, the learning curve is huge for someone else trying to fix bugs quickly during UAT.

New Sponsor: In this time of layoffs, you'll get a number of sponsors who are reassigned or laid off. This means that the new person will come in with preconceived notions as to what needs to be done. This new person will want to make a difference, and a quick win is to delay the project while reviewing what it is supposed to do. What business problems are being solved, let's look at the bigger picture here, why is this over budget, can we consolidate more, etc.?

Staycation
The testing phase of projects will give you a few gray hairs. There are so many potential potholes. However many days are scheduled for UAT, double them no matter what. If you want to go on vacation afterward, schedule it for next year.

Tuesday, September 30, 2008

Docapp Distractions

I know Composer is out there and useful in many installations; however, it's tempting to still use Docapp Builder to build object models in real time, and once you start, you kind of have to keep going. Also, there are still a ton of 5.x installations out there that need to be maintained and eventually upgraded.

Different versions of Docapp Builder and Docapp Installer
I was recently in a situation where one developer was developing on D6 SP1 with version 6 of the Docapp Installer and another was developing using 5.3 SP5 Update 1. The D6 developer was creating docapp archives for Forms and Process Builder which, when installed using the 5.3 SP5 Update 1 installer, left out certain form properties and functionality. It was a real waste of time troubleshooting these issues.

Creating a workflow docapp with auto activities for lifecycle promotion
If you create a workflow with an auto activity that promotes a lifecycle state, the lifecycle will get added to the Docapp archive and installed on the target docbase regardless of whether you added the lifecycle to the Docapp yourself.

ACLs with non dm_dbo domains
If you have set up ACLs with an owner that is not the dm_dbo (the docbase owner account), then these permission sets will not show up in the Docapp's Lifecycle, Action, Add permission set drop-down list.
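A quick way to audit this (a sketch; dm_dbo is an alias rather than a stored value, so compare the results against your actual docbase owner account name):

select distinct owner_name
from dm_acl
order by owner_name

Any owner besides the docbase owner account marks permission sets that won't appear in those drop-down lists.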

Versions
If you are splitting the docapp work between a few developers, get ready to deal with a lot of versions of the docapp and all of the workflows, lifecycles, activities, etc.

ACL inheritance from Folder
If your docbase is set up with ACL inheritance from folders, be sure to set the "System/Applications" folder ACLs. If you don't, every time you deploy a docapp, its components, such as a lifecycle, will inherit the ACLs of the docapp folder and may or may not work in the application.
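Before deploying, you can confirm what ACL the folder carries (a minimal check):

select object_name, acl_name, acl_domain
from dm_folder
where any r_folder_path = '/System/Applications'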

Scoping of Display configurations
This is very strange when it happens. Let's say you start out with a small object model of two levels and configure the display. Then you need to add a new parent level and redo the display. Sometimes the scope remains with the original model and object type. The only way I've found to fix this was to dissect the scope config in the database and cut out the disease.

Mandatory attribute changes with old references
If you ever decide to change an object type's attribute from required to not required, sometimes the old setting is held in a reference or cache somewhere. The only way I've been able to clear it is to delete the attribute using DAB, close DAB, clear the cache, and add the attribute back again.

Importance of having one source for docapp creation
I can't stress enough that docapps should be created and controlled from one source, with the same versions of the Docapp Builder and Installer on the source and target machines. Configure an object type's display only when the attributes and object model are final. Add workflows in with lifecycles. Try not to version docapps or their objects unless absolutely necessary. Finally, beware of the permissions that are inherited during the docapp installation.

Friday, September 19, 2008

Solution Demos: Catch 22

Damned if you demo, damned if you don’t, especially if your client is new to Documentum. You can talk about functionality, show the vanilla installation of Webtop, and still get surprised looks when you show the final solution. On the flip side, you can demo the solution when it’s not fully tested and have to dance around errors and incomplete development.

Demos of a client's project are essential if the client reads the Functional Requirements Spec and the Tech Design without too many comments. This means they don't understand it and/or don't have enough time to review it. This can be frustrating, but if you demo the solution in front of those same future users, you'll get invaluable feedback and will eventually have a more successful launch of the solution.

Steps to successfully demonstrate your solution:

  • Test the part of the solution that will be demo’d
  • Plan how to talk around any part of the solution that is not completed
  • Have an agenda of what will be shown during the demo
  • Repeat to the clients/users that this is not a finished solution, that there will be bugs
  • Try to have fun with your audience, this should not be too serious
  • Have other technical resources there to take notes and comment on questions that pertain to their customizations
  • Listen to your client's comments and take their feedback seriously
  • Make notes of comments that could form future opportunities, like "Does this mean we have to sign on to another application each time we want to access it?" This could mean that single sign-on is in the future. Or, "I already fill out this information in the database, why do I need to do this twice?" This could mean an integration opportunity to ease the transition to using Documentum, not to mention make finding content easier with more integrated attributes.
  • Schedule a few demos to reduce the risk of disillusionment after an error causes one demo to fall short of expectations.
  • Try to impress the client each time by showing something extra in the product that they might not have seen.
  • Whatever you do, do not take screenshots and dummy up the solution. This will lead to more questions and anxiety over the progress of the overall solution.

Friday, August 22, 2008

From Functional Requirements to Tech Design

The most important part of designing a Documentum solution is to understand and translate the business and functional requirements into Documentum configurations and customizations. You have to create maps of all of the major components of the requirements, matching each requirement to a solution. You might use a traceability matrix, or you might organize your specification similarly to the functional requirements. It depends on the project scope and business expectations. My opinion is that if you need a traceability matrix, the solution is too wide in scope or there are trust issues with the business.

Document Types to Object Types

Scope of project phase
When reviewing the list of doc types in the functional specification, do not take for granted that these are the final types of documents. A business analyst cares about capturing all of the possible documents, not necessarily thinking about the scope of the project when he does so. Be sure to explicitly describe the scope of the project in your design. For example, the doc types may cover all content in the enterprise in order to create a comprehensive object model; however, the scope of the current phase of the project may be a subset of those types. Do not over-commit to the business on what content will be covered in the first phase. When you demo the functionality, only show that phase's content being imported and published. Try to reduce scope creep by repeating the project's phase expectations and assuring the business that all content will eventually be brought into the system. This is a good opportunity to talk about a roadmap of the phases required to fully actualize the business's total content management with Documentum.

Model review and rework
Roll up attributes as you work through each object type, finding commonalities across object types. Think about all the UI, TBO, and ACL ramifications of your design. If the object model is huge, try to consolidate; look into using aspects as an alternative to many object types or many attributes.

Implementation design
Depending on the scope, you may be releasing the object types in phases. If this is the case, make sure the phase's object types cover all of the document types, business processes, and security requirements. You may want to prototype some of the object types and their relationships to importing and folder linking. For one-offs in a dev environment, Documentum Application Builder is the fast way to do this. For more methodical approaches, Composer is more portable and the best practice going forward.

Metadata to Attributes

Match with OOTB attributes first
Don't reinvent the wheel: search for a suitable existing attribute first, then create a custom one only if needed. If there's a requirement for a comments field, use the log_entry attribute. If there's a status field, use the r_version_label.
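For example, using log_entry as a comments field requires no model change at all; a throwaway illustration (the document name is made up):

update dm_document objects
set log_entry = 'Reviewed and approved by legal'
where object_name = 'Q3 Financial Report'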

Source of values
  • Value assistance could be used for easily maintained attribute values, which would be maintained using Docapp Builder or Composer. This uses Docbasic.
  • You could get a custom object to query for attribute values, which would be maintained by the business via the UI.
  • You could query registered tables or views, which could integrate with linked database tables (see the sketch after this list).
  • You could read and parse a properties file maintained on the server.
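For the registered table option, the value assistance query you enter in DAB or Composer is ordinary DQL. A minimal sketch, assuming a hypothetical registered table named status_codes with a status_label column:

select status_label
from dm_dbo.status_codes
order by status_label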


UI constraints: repeating attribute, large text boxes
Think about how the users will react to the WDK repeating-attribute dialog box. For large strings, you'll have to put a text area in the forms.

Taxonomy to Folder Structure

Search ramifications
Giving context to search results is an important way to transfer knowledge. Repeating key attribute/value pairs (for searching) with folder hierarchy labels (for browsing) usually makes sense. If it doesn’t, maybe you are trying to put too much metadata on the object.

Simplify wherever possible
If you are going deeper than four levels, you had better have a good reason. At a certain depth there's a "project" or "deal" or "product" level which repeats over time. This level should be closer to the top, have some automated way to build new folders as new "projects" are added, and have an automated or easy way to archive, thus reducing the clutter of old content over time.

Business Process vs. share drive organization
Folder structure may be organized in many ways. A business process oriented structure may make more sense than strictly following the share drive approach of silos of data structured by business group. Avoid organizing content by date only. If there’s an end of quarter date use it, but don’t rely on it solely to find content.

Mix of common, evolved classification with more structured rules
Chances are good that you are moving from one established way of structuring content to another. Doing this right, so the users know what they are doing in the new system without hours of training, can be difficult, especially because the old way is usually not the best way. If the company is large, your headache is merging structures; if the company is small, your headache is reorganizing silos of information. Sometimes both.

Modify according to folder/object type map
As the rubber hits the road and you're building folders and automating the linking of content to them, expect changes to the design, additions to the attributes needed, and delays to build and customize the right way.


Folder Structure mapping to Object Type plus Attributes

  • Match Object Type and Attribute Key/Value pair Combinations to Folder paths
  • Prototype the search and browse functionality
  • Do subfolders follow the document templates?


Business Process to Workflow
This is where you'll read the functional requirements for a process of getting content approved and have to figure out the specific activities involved in a workflow. Sometimes common business processes may not translate into a formal workflow; in this case, describe the steps of using the application and how the work can get done without one. You'll have to determine how to create a decision point that splits the workflow if there's a yes/no decision to be made. A common example of an auto activity is the requirement to send a notification email with a link to the workflow's package.

Importing content
Check all assumptions involved with importing content into the system. It might be assumed that all content related to the first phase should somehow get into the system. The problem with this is that the contributors of the related content may not be ready to participate in the Documentum system. They may not be in the first phase.

Sunday, May 18, 2008

The Art of the Custom Documentum Object Model

Before designing a Documentum object model, you'll need to take a litmus test of the culture of trust between the sponsor of the project and the IT organization. If the company is large, there will be multiple levels of politics. You'll need to judge from how the requirements gathering sessions went to figure out your approach to the object model design. Part of gathering requirements is educating your client on the types of objects that make up the content management system without getting too technical or wrapped up in explanations that are long-winded and lost on the client. During this education, ask questions like:

  • How do the different business units communicate with each other?
  • Do they share information, is there emphasis on security?
  • Are there databases that they use to look up information?
  • How effective are Marketing and Sales at driving the accumulation of knowledge into content published to consumers?


The answers to these types of questions help determine the meaning of the object model's hierarchy levels. As the architect of the content management system, you are the only one qualified to make unbiased design decisions, and hopefully you have no agenda in your design. The most common object model hierarchy has an enterprise object as a child of dm_document, dm_folder, etc., and then child objects underneath it. The ramifications of your design magnify at the second level of the hierarchy. Here are some scenarios of what happens when the architect gets influenced in the wrong ways:

Forced to design without all the requirements
Have you been given enough time to really design a model that reflects the whole organization? If the IT Manager on the project says, “Don’t worry about the whole organization, we have three business units in front of us now, this is all we need to worry about for now,” you know there will be issues with the design if you create a model without knowing the bigger picture. Scalability, performance and reporting all suffer when a design is not drawn from the full foundational background.

Influenced by the wrong folks in IT
Most notorious for screwing up an object model are the turf-war database architects who don't understand object-oriented design. They demand to know what the relationships are between all of these tables, where the schema is, etc. They will flip out over table unions and joins if you tell them too much. They couldn't care less how much the object model is integrated with the UIs and security. So when you say that the enterprise-level attributes are reserved for only the most far-reaching attributes across the whole company, like retention period, stick to your guns when they push back and shake their heads. If you need more than three levels of custom object types, make your case as sound and simple as possible, and try to include the monetary impact of the customizations that will be needed if your design isn't followed.

Confusing Department Security with Content Functionality
Most companies are set up by departments. Each department shares some information and restricts access to the rest. This doesn’t mean that the object model has to follow the org chart of the company. It may make more sense in the long run to figure out the function of each content type in the enterprise and really study the use cases of the content that is most critical to the company’s success. For example, for a government organization which has vital records (like birth certificates) to scan, index, and store, it makes sense to design an object model around function, in this case vital records, instead of the name of the agency that keeps track of the vital records. Agencies and departments will reorganize over time, functions such as birth records will not.

One repository vs. many
In many cases, multiple repositories designed around one global registry makes sense. There's more flexibility built into this design throughout the technology stack, as well as with the changes in the business units over time. This, however, does not mean that each repository should have autonomy in its object model design. In fact, there should still be an enterprise-level custom object for each object type being customized, and the object model should be the same in each repository. You'll have to be more diligent with migrating docapp archives between repositories, especially with the install options.

Decoupling Internal Business Process and External Publishing
If the end goal of the content is to publish it to a portal, there will be conflicts between the internal structure of the content (how the business works with each other) and external structure (how consumers view and search the content). Do not underestimate how long it will take to weed out the navigational systems for each side, the security and identity management, the functional driving attributes, etc. In the best case there will be enough decoupling of the objects and their attributes that the design can provide decoupled and scalable solutions to the conflicts between content management and content publishing.

Some General Rules of Object Model Design

  • Determine if content types are functional or departmental in nature
  • The security model of the repository whether it’s user, object, or folder based may have an overriding influence
  • Build in flexibility to enable the object model to expand in all directions
  • Move attributes that span all of the object types up one level if possible

Saturday, February 9, 2008

Polluting the ECM Ecosphere

You know those email spams that fill up your inbox? Well what about the trail of junk that Content Management Systems leave behind as they forge ahead solving complex business problems?

As I've worked on large, small, and medium-sized CMSs, I'm always amazed at how polluted they are with logs, audit files, orphaned work items, queue items, reports, ACLs, versions, etc. The out-of-the-box cleanup jobs focus on getting rid of unwanted versions, orphaned content, logs, and queue items, to an extent.

The problem is, with some business reporting requirements, they are too good at deleting files that may be of use for historical analysis. Business users get nervous when you say "we have to clean things up to maintain performance". They say, "Can we wait a while until we really need to do this? What are the risks? What if you delete something that we need at the end of the quarter, or the year, or in ten years?"

M. Scott Roth’s “Seven Jobs Every Documentum Developer Should Know and Use” article details the use of seven Documentum jobs: DMClean, DMFilescan, LogPurge, ConsistencyChecker, UpdateStats, QueueMgt, and StateOfDocbase. There’s a job to trim versions, but for whatever reason Scott didn’t include it in his job list. These jobs are all essential to keeping your repository clean and performing the way you expect, but what do you do about ACLs, workflow history, and versions if the deletion is not specific enough?
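Before worrying about what these jobs miss, check whether they're even running. A quick status query against the default job names (a sketch, assuming the standard dm_ prefixes):

select object_name, a_last_invocation, a_last_completion, a_current_status
from dm_job
where object_name in ('dm_DMClean', 'dm_DMFilescan', 'dm_LogPurge',
'dm_ConsistencyChecker', 'dm_UpdateStats', 'dm_QueueMgt', 'dm_StateOfDocbase')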

Content pollution is rampant in all industries and is a direct result of rushed design and overambitious technical solutions to relatively simple business problems. Take a regulated content management system, for example. This system most likely creates new versions of content for every change to its file or its metadata. There could also be an audit trail which records every version's change, a backup of the file system and the database for nightly and weekly data security, disaster recovery with off-site replication, multiple renditions, and multiple language versions.

The upshot is that the proliferation of versions, logs, and backups is great for storage "archive" companies, but can lead to confusion and a false sense of security. Who's making the design decisions? Most likely it's a business user who doesn't want change, thus forcing an overworked IT Manager and ECM Architect to work out the solution, which puts garbage control on the back burner. "We'll deal with logs and versions later; right now we have to roll out the project on time and within budget."

So how do we design with conservation in mind? For one, we think a year or two into the future and try to extrapolate the effects of thousands or millions of scraps of content floating around in the CMS, slowing down queries and filling up the more expensive hard disk space. Here are some more ideas:

Design to recycle:
For each log and object type, ask how it will be created, versioned, and disposed of. What is the purpose of this content? How long will it be useful?

Efficiency:
Think of conservative approaches to logging events and to versioning content (regardless of OOTB functionality).

Upgrades, new development, and performance testing:
Logs, database temp space, and temporary migrated content files can pile up everywhere during special testing and migrations. These files are often "hidden" and sometimes move along to production systems only to clog things up later.

Site Cache temp files and orphaned site files:
Site Caching Services is notorious for leaving stray temp files, logs, and orphan folders all over the place, especially during failed publishing attempts.

Docapp messes:
Docapp installs, when not performed carefully, can leave references to old lifecycles, workflows, object types, and attributes. These orphaned objects can not only clog the system, they can corrupt production environments with hardcoded references to filestores and nonexistent owner names.

Repository and LDAP synch logs:
Every time an LDAP synch job runs, logs get stored in the repository and on the Content Server file system. Every time a repository starts up, a new log starts for it. These logs fill up the server file system, which is usually not a large disk.

DFC traces:
During development and testing, trace logs are essential for tracking down bugs and slow performance. These files are usually forgotten and build into huge, space-choking surprises when you least expect it.

Environments such as Sandbox, Dev, Test, Performance, Staging, Prod, DR, Off shore, Business Continuance:
All these environments double, triple, or x-tuple the amount of disk space needed for solutions. Think about ways to migrate subsets of needed content, perhaps without versions. Reduce logging in environments that are not used very often or that are dormant for a period of time.

Integrations with other applications:
Many integrations of systems require multiple renditions of content for presentation. For example, email messages from Outlook get saved as .msg files in Documentum. Even when EMC's EmailXtender is installed, an integration with Outlook requires copies of the original email to be imported into Documentum's repository.

Friday, January 4, 2008

Queue Item Maintenance

Issue:
Task events are recorded in tables, namely the dmi_queue_item table. These records build up proportionally to the number of tasks executed. As time goes on and performance potentially slows down, these queue items will need to be deleted.
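To gauge how much has built up before picking a cutoff, a simple count of the already-processed items (delete_flag = 1 marks dequeued items):

select count(*)
from dmi_queue_item
where delete_flag = 1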

Requirements:
Periodically delete old records from the dmi_queue_item table, but keep track of a document’s workflow history for a certain amount of time beyond the dmi_queue_item cleaning.

Solution:
The first thing to determine is whether the out-of-the-box dm_QueueMgt job can do what's required. This job runs a method that can delete queue items older than a cutoff date, which is useful because the dmi_queue_item table keeps a record of many other events in the repository that need to be deleted on a scheduled basis. However, we also want to keep the workflow history of documents. The solution was to create a custom table which holds the queue items required to maintain the workflow history of documents, and to create a new job and method to populate it as part of queue management.

Solution Details:
First, create a custom table using DQL (note: this table has the same columns as the dmi_queue_item table; the column types below use Oracle syntax):

UNREGISTER TABLE wf_history_s
EXECUTE exec_sql WITH query='DROP TABLE wf_history_s'
EXECUTE exec_sql WITH query='CREATE TABLE wf_history_s
(
r_object_id VARCHAR2(32),
object_type VARCHAR2(32),
id_1 VARCHAR2(32),
string_5 VARCHAR2(200),
string_4 VARCHAR2(200),
string_3 VARCHAR2(200),
string_2 VARCHAR2(200),
string_1 VARCHAR2(200),
workflow_id VARCHAR2(32),
policy_id VARCHAR2(32),
registry_id VARCHAR2(32),
audit_signature VARCHAR2(255),
audited_obj_vstamp INTEGER,
user_name VARCHAR2(32),
time_stamp_utc DATE,
audit_version INTEGER,
chronicle_id VARCHAR2(32),
controlling_app VARCHAR2(32),
object_name VARCHAR2(255),
audited_obj_id VARCHAR2(32),
version_label VARCHAR2(32),
acl_domain VARCHAR2(32),
attribute_list_id VARCHAR2(32),
host_name VARCHAR2(128),
user_id VARCHAR2(32),
i_audited_obj_class INTEGER,
event_source VARCHAR2(64),
event_name VARCHAR2(64),
r_gen_source INTEGER,
owner_name VARCHAR2(32),
time_stamp DATE,
event_description VARCHAR2(64),
session_id VARCHAR2(32),
current_state VARCHAR2(64),
application_code VARCHAR2(64),
acl_name VARCHAR2(32),
attribute_list VARCHAR2(2000),
i_is_archived VARCHAR2(32),
id_5 VARCHAR2(32),
id_4 VARCHAR2(32),
id_3 VARCHAR2(32),
id_2 VARCHAR2(32)
)'

REGISTER TABLE dm_dbo.wf_history_s
(
r_object_id CHAR(32),
object_type CHAR(32),
id_1 CHAR(32),
string_5 CHAR(200),
string_4 CHAR(200),
string_3 CHAR(200),
string_2 CHAR(200),
string_1 CHAR(200),
workflow_id CHAR(32),
policy_id CHAR(32),
registry_id CHAR(32),
audit_signature CHAR(255),
audited_obj_vstamp INT,
user_name CHAR(32),
time_stamp_utc TIME,
audit_version INT,
chronicle_id CHAR(32),
controlling_app CHAR(32),
object_name CHAR(255),
audited_obj_id CHAR(32),
version_label CHAR(32),
acl_domain CHAR(32),
attribute_list_id CHAR(32),
host_name CHAR(128),
user_id CHAR(32),
i_audited_obj_class INT,
event_source CHAR(64),
event_name CHAR(64),
r_gen_source INT,
owner_name CHAR(32),
time_stamp TIME,
event_description CHAR(64),
session_id CHAR(32),
current_state CHAR(64),
application_code CHAR(64),
acl_name CHAR(32),
attribute_list CHAR(2000),
i_is_archived CHAR(32),
id_5 CHAR(32),
id_4 CHAR(32),
id_3 CHAR(32),
id_2 CHAR(32)
)

UPDATE dm_registered OBJECTS
SET owner_table_permit = 15,
SET group_table_permit = 15,
SET world_table_permit = 15
WHERE table_name = 'wf_history_s'

Second, create a custom Documentum method to be executed by the custom queue management job. This class should have the following methods and logic:

a. Populate the workflow history table according to criteria. Here's an example DQL:

"insert into dm_dbo.wf_history_s " +
"(r_object_id, event_name, time_stamp, user_name, audited_obj_id, string_4, workflow_id, string_3) " +
"SELECT '0000000000000000' as r_object_id, task_name as event_name, date_sent as time_stamp, sent_by as user_name, r_object_id as audited_obj_id, name as string_4 , router_id, task_state as string_3 " +
"FROM dmi_queue_item " +
"WHERE r_object_id not in (select audited_obj_id from dm_dbo.wf_history_s) " +
"AND router_id != '0000000000000000' " +
"AND date_sent < DATEADD(Day, -"+sCutOffDate+", date(today)) " +
"AND delete_flag = 1";

b. If the workflow history table gets populated successfully, delete the dmi_queue_item rows according to criteria. Here's an example DQL:

"DELETE dmi_queue_item objects " +
"WHERE router_id != '0000000000000000' " +
"AND date_sent < DATEADD(Day, -"+m_cutoff+", date(today)) " +
"AND delete_flag = 1";

c. Write the job report to the repository.
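Putting a, b, and c together, here's a minimal sketch of the method logic in Java (the class name, the run signature, and the session handling are illustrative; only the two DQL statements above come from the actual solution):

import com.documentum.com.DfClientX;
import com.documentum.fc.client.IDfCollection;
import com.documentum.fc.client.IDfQuery;
import com.documentum.fc.client.IDfSession;

public class WfHistoryQueueMgt {

    // Steps a and b: copy qualifying queue items into the history table,
    // then delete the originals only if the copy succeeded.
    public void run(IDfSession session, int cutoffDays) throws Exception {
        String insertDql =
            "insert into dm_dbo.wf_history_s "
          + "(r_object_id, event_name, time_stamp, user_name, audited_obj_id, string_4, workflow_id, string_3) "
          + "SELECT '0000000000000000' as r_object_id, task_name, date_sent, sent_by, "
          + "r_object_id, name, router_id, task_state "
          + "FROM dmi_queue_item "
          + "WHERE r_object_id not in (select audited_obj_id from dm_dbo.wf_history_s) "
          + "AND router_id != '0000000000000000' "
          + "AND date_sent < DATEADD(Day, -" + cutoffDays + ", date(today)) "
          + "AND delete_flag = 1";
        runDql(session, insertDql); // throws on failure, so the delete below won't run

        String deleteDql =
            "DELETE dmi_queue_item objects "
          + "WHERE router_id != '0000000000000000' "
          + "AND date_sent < DATEADD(Day, -" + cutoffDays + ", date(today)) "
          + "AND delete_flag = 1";
        runDql(session, deleteDql);

        // Step c, writing the job report to the repository, is omitted from this sketch.
    }

    private void runDql(IDfSession session, String dql) throws Exception {
        IDfQuery query = new DfClientX().getQuery();
        query.setDQL(dql);
        IDfCollection results = query.execute(session, IDfQuery.DF_EXEC_QUERY);
        if (results != null) {
            results.close(); // always release the collection
        }
    }
}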

Third, create the custom queue management job.