The risks of service outage fall into three categories:
Server: Each server runs services that are vulnerable to outage. These servers are the Content Server, Index Server, Application Server, Database Server, and Storage Server.
Systemic: The integrations between servers are themselves vulnerable to outage. For example, if the Content Server goes down, the application is unavailable; if the database or storage goes down, the Content Server goes down with it.
Disaster: The whole server room is down. In this scenario the DR system would sync and start up.
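The systemic category is easier to reason about when the dependencies are written down as a graph. The following is a minimal sketch, not a definitive model: the service labels and the `impacted_by` helper are illustrative names, and the edges cover only the examples given above (database and storage feed the Content Server, which feeds the application).

```python
from collections import deque

# Assumed dependency map: each key lists the services that depend on it directly.
# The labels are illustrative; the edges come from the examples in the text.
DEPENDENTS = {
    "storage": ["content_server"],
    "database": ["content_server"],
    "content_server": ["application"],
    "application": [],
}


def impacted_by(failed, dependents=DEPENDENTS):
    """Return every service that goes down, directly or transitively,
    when the service named `failed` goes down."""
    seen = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)
```

With this map, `impacted_by("database")` returns `["application", "content_server"]`, which matches the chain described above. Extending the map with your own integrations (LDAP, DNS, index servers) is a quick way to see which single outage takes down the most services.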
Service outages are a real risk and happen most often at the server level. User complaints tend to come in when performance is slow, which may be a sign that a service is in trouble. Integrations between DCTM and other services are often risky because the other services are assumed to be always up. If a company is growing, the network will change, databases will stumble, and even electrical circuits will blow, so account for all of this in your recovery plans regardless of assurances that this "will never happen".
Risk Matrix by Server
| Scope | Server Outage | Description | Integration Dependency | Risk Level | Monitoring |
|---|---|---|---|---|---|
| Systemic | Storage App | Storage Services | Database, Content, Index, App | Low (if HA, redundancy) | Monitoring scripts |
| Systemic | Oracle | Database Server | Content Server | Low (if HA, redundancy) | Monitoring scripts |
| Systemic | LDAP Server | LDAP | App/Content Server | Low (if HA, redundancy) | Monitoring scripts |
| Systemic | DNS Server | DNS | All Servers | Low (if HA, redundancy) | Monitoring scripts |
| Server | DCTM | Repository Services | App Servers, Index Servers | Med (if standalone) | Monitoring scripts |
| Server | DCTM | Java Method Server (JBoss) | Index agents, Jobs, workflow | Med (if standalone) | Monitoring scripts |
| Server | Application | Tomcat | | Med (if standalone) | Monitoring scripts |
| Server | Index | xPlore Servers and Agents | App Server Search | Med (if standalone) | Monitoring scripts |
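Every row in the matrix relies on monitoring scripts. As a rough sketch of what such a script could look like, the Python below opens a TCP connection to a service and reports whether it is down, slow, or OK. The function names, the thresholds, and any hostnames or ports you pass in are assumptions for illustration, not values from a real environment. A successful TCP connection only shows that something is listening, not that the service is healthy.

```python
import socket
import time


def check_service(host, port, timeout=3.0):
    """Attempt a TCP connection; return (is_up, seconds_taken)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False, time.monotonic() - start


def classify(is_up, elapsed, slow_after=1.0):
    """Map a check result to DOWN, SLOW, or OK.
    `slow_after` is an assumed threshold in seconds; tune it per service."""
    if not is_up:
        return "DOWN"
    return "SLOW" if elapsed > slow_after else "OK"


def run_checks(targets, timeout=3.0, slow_after=1.0):
    """Check each (name, host, port) tuple and return {name: status}."""
    results = {}
    for name, host, port in targets:
        is_up, elapsed = check_service(host, port, timeout=timeout)
        results[name] = classify(is_up, elapsed, slow_after=slow_after)
    return results
```

A scheduled job could call `run_checks` with one `(name, host, port)` entry per row of the matrix and alert on anything that is not `OK`. The `SLOW` status is there because slow performance is often the first sign that a service is in trouble.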
Disaster Recovery systems are replicated systems, and they carry a low but real risk.