Tuesday, May 26, 2015

How to Triage ECM slowness

Who’s the first team a User calls when the ECM application slows down? The ECM team, of course. But nine times out of ten, the slowness is caused by other systems. Whether it’s the database, the network, or the User’s open applications, sluggish performance has many sources.

Question: who’s working on what client, in what environment, and where?

Network: we were fixing something

From Who: multiple sites simultaneously, or one building?

Large companies with multiple buildings most likely have networks that are somewhat patched together, leaving some Users with subpar network performance. Also, some areas of the company may be hogging bandwidth with applications that drag the whole network down.
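The "multiple sites, or one building?" question can be answered from the complaints themselves if you jot down who called and from where. Here is a minimal sketch in Python, assuming you keep a simple list of reports; the `Complaint` record, its fields, and the `network_scope` function are hypothetical names made up for illustration, not part of any ECM product.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Complaint:
    # All fields are illustrative assumptions about what you would note down.
    user: str
    site: str         # building or office the User is calling from
    application: str  # application the User reports as slow


def network_scope(complaints):
    """Classify slowness reports as 'none', 'one user', 'one site', or 'multiple sites'."""
    if not complaints:
        return "none"
    # Count reports per site; more than one site points at something shared.
    sites = Counter(c.site for c in complaints)
    if len(sites) > 1:
        return "multiple sites"
    # Everything comes from one site: is it a single User or the whole building?
    users = {c.user for c in complaints}
    return "one user" if len(users) == 1 else "one site"


reports = [
    Complaint("alice", "Building A", "ECM"),
    Complaint("bob", "Building B", "ECM"),
]
print(network_scope(reports))
```

Reports from several sites point away from one PC and toward something shared, while a single User at a single site suggests starting at their desk.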

Database: This won’t impact the application…

From Who: All applications that use that db server, or one application?

Typically, the database server is a shared environment, thanks to our buddies who consolidated individual servers at the expense of “decoupling”. This shared environment could be at the mercy of reporting for BI initiatives slowing it down. If it’s an Oracle RAC, sometimes the nodes don’t reboot as advertised. Sharing an environment with tier 1 applications could put the lower tiers at risk, because the lower ones will not be the priority if there’s a business outage.

Backups: we were trying to restore another application

From Who: One application or many?

Backups might happen late at night during “off” hours, but there’s still a performance hit on databases and file stores. There’s also the possible wave of activity after a recovery that clogs all downstream applications.

Security: we were hacked

From Who: One app, or many?

New layers of security bring extra processing, and with it the potential for slowness. This is usually agreed upon at the design stage, but complained about after implementation.

Virus protection: half of our share drive files are encrypted

Many times I’ve looked for causes of slowness on my PC or on a server, only to find the task manager showing a huge percentage of CPU being used by the virus protection software. Hint: double-check when the full scan is scheduled.

User’s 5k open applications: who me?

If one User complains, log onto their PC and check out what applications they are running (assuming they didn’t close some while they waited for you). Try closing and opening Outlook. What’s in their startup folder? Check their browsing history for views and downloads.

Service Desk: this is a routine patch

Even when the Service Desk is being proactive with mandatory testing of patches to Windows or IE, there are always issues, especially with the interaction of multiple open web browser (“no footprint”) applications.

Upshot

When you get blamed for the slowness of your ECM application, have a script of questions ready to triage the issue. Check the possible larger issues first and move toward the User at hand. Slowness happens because everyone wants information faster; in our zeal to always get faster, we stumble occasionally.
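The script of questions could be as simple as an ordered list that you walk from the broadest suspect down to the User’s own PC. Below is a rough sketch in Python. The areas come from the sections above, but the exact wording of each question, the ordering (Service Desk before the User’s PC), and the `first_suspect` helper are my own assumptions for illustration.

```python
# Ordered from the widest blast radius down to the single User (assumed ordering).
TRIAGE_QUESTIONS = [
    ("Network", "Is it slow at multiple sites simultaneously, or in one building?"),
    ("Database", "Is it all applications that use that db server, or one application?"),
    ("Backups", "Is a backup or restore running, and does it hit one application or many?"),
    ("Security", "Were new layers of security applied, to one app or many?"),
    ("Virus protection", "When is the full scan scheduled?"),
    ("Service Desk", "Was a Windows or IE patch rolled out recently?"),
    ("User's PC", "What applications are open, and what is in the startup folder?"),
]


def first_suspect(answers):
    """Return the first area, in triage order, whose check came back positive.

    `answers` maps an area name to True when that check found a problem.
    Returns None when nothing has been flagged yet.
    """
    for area, _question in TRIAGE_QUESTIONS:
        if answers.get(area):
            return area
    return None


# Print the script so it can be read out on a call.
for area, question in TRIAGE_QUESTIONS:
    print(f"{area}: {question}")
```

For example, if both the database check and the User’s PC check look suspicious, `first_suspect({"Database": True, "User's PC": True})` returns `"Database"`, which keeps you working on the larger issue first.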

