It’s been another incredibly busy couple of weeks for the Real World ITIL team. It’s a good moment though to record a few more thoughts on our current project, so we’ll continue our story about Availability Management (AM) for this entry. If you’re just joining us, you might wish to read the first entry in this series, because we’ll be referring to the project phases described there.

Before we get started, though, let’s pause to say thanks to Robin Yearsley for his response to one of our recent entries. We’re happy to direct our readers to his new search engine at http://www.ITServiceToday.com, and also to congratulate Robin’s Dr. ITIL website on its first anniversary! May cyberspace be long graced with its ‘onlineness’.

And, as always, we invite thoughtful commentary from all of our readers - just click on the Comment link below. We’ll be happy to respond to your thoughts.

So why are we focusing on AM in particular? The company-in-question operates financial trading systems as an essential part of its business. In this kind of business, every minute of unplanned downtime might cost millions. You can imagine that the availability of their production systems is a very serious matter indeed.

In fact, before we continue our discussion any further, we must note that the systems at the company-in-question have an excellent record of availability. We’ve seen it in action first-hand: their production environment without question performs extremely well, and the staff maintaining it is exceedingly competent. This is important to keep in mind.

Therefore, throughout this multi-part case study, we are not discussing how to correct a problematic situation but rather how to continuously improve operations in an environment where the stakes for effective infrastructure service delivery are very high.

Getting back to our story, this concern over availability obviously applies to the realtime transaction applications used by traders during the business day. But (perhaps less obviously) it also applies to a number of other supporting technologies, such as the systems that reconcile the company’s books at night and the networks that transport transactions to clearing houses, etc.

As we’ve talked with our client’s staff, however, it’s been interesting to note that, while no one in the company would disagree in principle with our statements in the paragraph above, people in different positions have different perspectives on what our focus should be when attempting to ensure adequate availability of production systems.

For example, it hasn’t necessarily been clear to staff-level server engineers why process improvements are needed in order to improve availability. After all, they’ve invested a great deal of money, thought and effort in designing a fault-tolerant infrastructure.

On the other hand, management (who know this fact) wonder why they must still sometimes invest thought and effort in dealing with the cumulative business consequences of whatever availability issues may have occurred, despite the fact that their applications are running on expensive fault-tolerant platforms. Is there nothing else that can be done to improve uptime?

We often speak of the delivery of infrastructure services as consisting of three components: people, process and technology. When considering how to make availability improvements, we realize that the people part (human error) can never be completely eliminated, no matter how competent the staff. As for technology, if your company has already invested in fault-tolerant platforms, then there is limited opportunity for improvement there - certainly so without unduly increasing capital cost. This, of course, leaves process improvement as the best way to improve availability without spending more money.

So here’s where the ITIL framework enters the story.

As we go on, keep in mind the fact that the applications group here submits nearly 3,000 change requests each week. This high rate-of-change is driven by a need to maintain a competitive edge in business applications as well as several other factors such as regulatory changes that may affect applications and data storage.

Given this rate-of-change for production, the company has found it challenging to arrange dedicated windows for performing preventative maintenance for the purpose of supporting existing availability standards. Therefore one of our highest priorities in Phase 1 (taking advantage of the fact that we are also re-engineering the Change Management processes at the same time) is to ensure through appropriate negotiations that the ‘windows’ of planned downtime defined for application changes also allow formally-defined times for activities that serve to improve availability.

We must also ensure that these two types of windows, once defined, are aligned with the uptime ‘promises’ made to customers by Service Level Management through the Service Catalog. This alignment will help to correct a tacit, longstanding, ‘unwritten rule’ between end-users and the infrastructure group that all systems will be made available 24 x 7 x 365 regardless of criticality to the business. This informal cultural understanding will be replaced by a formal, businesslike policy aimed at providing suitable systems availability in a cost-effective manner.
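For readers who like to see the arithmetic behind this alignment, here is a minimal Python sketch. It simply adds up the planned downtime in the weekly windows and checks whether the uptime ‘promise’ is still achievable. The window names, their durations and the two availability targets are invented for illustration and are not figures from the company-in-question; the `Window` class and both function names are likewise our own hypothetical choices.

```python
from dataclasses import dataclass

MINUTES_PER_WEEK = 7 * 24 * 60  # 10,080 minutes in a week


@dataclass(frozen=True)
class Window:
    """A recurring weekly window of planned downtime (hypothetical structure)."""
    name: str
    minutes_per_week: int


def planned_availability(windows):
    """Fraction of the week the service is up if every window is fully used."""
    planned_down = sum(w.minutes_per_week for w in windows)
    return (MINUTES_PER_WEEK - planned_down) / MINUTES_PER_WEEK


def fits_promise(windows, promised_availability):
    """True if planned downtime alone still leaves the promised uptime achievable."""
    return planned_availability(windows) >= promised_availability


# Illustrative values only: two windows totalling three hours per week.
windows = [
    Window("application changes", 120),
    Window("preventative maintenance", 60),
]

print(f"Best-case availability: {planned_availability(windows):.4%}")
print("Fits a 98% promise:  ", fits_promise(windows, 0.98))
print("Fits a 99.9% promise:", fits_promise(windows, 0.999))
```

Note that this only covers planned downtime. Any unplanned outage eats further into the margin, which is one reason a blanket 24 x 7 x 365 expectation cannot coexist with defined maintenance windows.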

Another high priority in Phase 1 has been to formally establish an authoritative Availability Plan, in the form of a ‘living’ online document, whose Table of Contents will look something like (in first draft, subject to change) the following:

I. Introduction & Executive Summary

II. Availability Management Mission
…. a. AM Goals & Objectives
…. b. Availability Management Board charter
…. c. Availability Architecture Board charter
…. d. Availability Manager job description
…. e. AM workflow maps

III. Improvement Activities
…. a. General Maintenance Processes & Improvement Plans
…. b. Specific Availability Improvement Initiatives

IV. AM Guidelines
…. a. A record of ‘lessons learned’
…. b. Maintenance schedule definitions
…. c. Maintenance window work protocols
…. d. Security-related standards & guidance
…. e. Fault analysis methods & procedures

V. Interface Protocols to Other ITIL Areas
…. a. Service Level Management
…. b. Service Continuity Management
…. c. Financial Management
…. d. Incident Management
…. e. Problem Management
…. f. Capacity Management
…. g. Change Management

Once the Availability Plan has been written, we will of course have to advertise its publication and teach people how (and why) to use it. An Availability Manager (a full-time, mid-level manager whom the organization intends to appoint under the executive function for IT Infrastructure) will oversee this. Availability Management, after all, is really one part of the headquarters function of IT - something that needs to apply to all systems and processes.

Has your organization implemented an Availability Plan? Do you agree or disagree with the Table of Contents above? Share your thoughts by clicking on the Comment link below.

Well, that’s all the time we have for blogging this week. We hope this information has been useful and invite you to share your thoughts on Availability Management, too. Next time, we’ll wrap up this discussion and then move on to new topics.

Until next time, thanks for reading Real World ITIL!

Regards,
Scott (your moderator)

Posted in: ITIL Implementation  Tags:

Hello to All-

I am getting ready to present the following topic in NYC to Senior IT management in the financial vertical space. I thought it might be of interest to some out there.

The talk is entitled, “IT Governance Frameworks: ITIL® and CobiT® - Overhead or Strategic Weapon?”

I have only 10 minutes to make an executive level presentation–so it stays high level.

Slide 1 deals with the root issue for most IT transformation activities: IT operating in technical silos rather than as one enterprise, and often not realizing it because IT still takes a mostly technical rather than process view of the world.


Slide 2: Heightened Interest in Process Frameworks

(business drivers)
Sarbanes Oxley / Compliance
Alignment With Business
Service Quality & Accuracy
Efficiency / Cost Reduction
Agility
Risk Reduction
Business Continuity / Disaster Recovery for Data Centers


Slide 3: What is ITIL?

ITIL (IT Infrastructure Library) is an integrated set of best practices (processes) for enterprise delivery of IT’s products & services
Service Support: 5 Mgmt. Disciplines
Incident, Problem, Configuration, Change, and Release
Service Delivery: 5 Mgmt. Disciplines
Service Level, Financial, Capacity, Service Continuity, Availability

Slide 4: ITIL Pros and Cons

Strengths
De facto standard, comprehensive, independent, large body of knowledge, educational support, common language
Shortcomings
Not prescriptive, not the goal, ‘grand design’ trap, no value measurement, meets organizational resistance
Business Values
Align with business, policy compliance, risk reduction, service quality, agility, accountability, better business continuity / disaster recovery, efficiency

Key Challenge: How to get the business value?

Slide 5: What is CobiT?

CobiT (Control Objectives for Information and Related Technology) is a comprehensive model for enterprise control of the IT environment / IT Governance.
Generally accepted as de facto guidance for Sarbanes-Oxley Section 404 compliance
4 Domains
Planning & Organization
Acquisition and Implementation
Delivery and Support
Monitoring

34 Processes
Over 300 detailed control objectives for processes

Slide 6: CobiT Pros & Cons

Strengths
De facto standard, comprehensive, independent, evolving, large body of knowledge, educational support, common language, maturity model, good fit with ITIL
Shortcomings
Not fully prescriptive, very broad, ‘grand design’ trap, no value measurement, meets change resistance, unclear governance ownership

Business Values
compliance, corporate risk reduction, accountability, use in base-lining control maturity

Key Challenges: Scope and order of application, business value, i.e., how to make it other than an overhead burden?

Slide 9 is a picture. It shows CMMi–ITIL–CobiT as an effective process framework over the components of IT–Applications, Infrastructure, and DataCenter. They combine logically to form most of the end-to-end guidance, with little overlap.


Slide 10 tells a Business Story around enterprise process.

Challenge: How to meet Sarbanes Oxley 404 compliance AND improve operational effectiveness?

Approach:
Create a base level framework of ITIL & COBIT components
Engineer in compliance at start of work - use ITIL Change Mgmt.
Update policy, work processes, imbed in technology
Drive adoption via web and policy

Results:
Automated 404 compliance for IT
Reduced change processing time by 30%
Improved accuracy & reduced risk (system abends fell by 90%)
Became proactive–emergency changes dropped from majority of changes to <10%

I welcome any thoughts you may have–if the Powerpoint may be of any use to you, feel free to drop me a line and I will send you a copy.

Cheers-

Don Casson

UPDATED:
You can now download the presentation from evergreensys.com


Welcome to 2006 and another edition of Real World ITIL! It’s good to be back after a long holiday break because there’s a lot of ITIL work to do this year and it’s time to get busy. As we noted in our last blog entry, we’ve got some implementation deadlines looming within the next few months and there’s plenty to do before then.

Our current project involves implementing six ITIL process areas at a ‘version 1.0’ level to act as a sort of ITIL ‘jump start’ for the company-in-question. Of course, what the team is really designing is a single, integrated system of workflows that serves six different infrastructure delivery needs.

One of these needs is to ensure that proper standards, methods, controls and reporting have been established concerning the accessibility of the company’s production systems. ITIL’s Availability Management (AM) process is intended to meet this need and thus it is our subject for this week.

In the present case, we have some real world constraints on timelines and also on how quickly the organization can absorb change. We have therefore divided the implementation of AM into a few distinct phases in order to keep our scope tight (and therefore achievable). These phases are subject to change, of course :-)

Phase 1 (4-6 months): Basic Availability Management

  • Define a ‘Version 1.0’ Availability Plan at a component level (without use of a CMDB, which is under construction)
  • Define and implement standard ‘availability windows’ in the production environment for changes and maintenance activities driven by the infrastructure group (making sure to synchronize these windows with the Change Management system)
  • Design and implement basic, non-automated workflows for: measuring and reporting of Availability metrics for management purposes, improving availability following the occurrence of Incidents and Problems (reactive mode only) and maintenance of the Availability Plan document
  • Develop requirements for tool sets to automate AM and review available tools
  • Establish clear working relationships with Service Level Management (SLM) and Change Management (CM) at a fundamental level
  • Make recommendations for Availability specifications in the company’s new Service Catalog
  • Designate and implement a minimal dedicated staffing model to operate the workflows
  • Plan Phase 2 with focus on continuous improvement of the existing model

Phase 2 (4-6 months): Intermediate

  • Evolve the Availability Plan to express a product and service perspective (beyond just components)
  • Define and implement standard ‘availability windows’ in the production environment for changes and maintenance activities driven by the applications group
  • Evolve the workflows developed in Phase 1 to become more integrated with the other ITIL areas (especially the Configuration Management team, which is currently building a CMDB)
  • Adjust the Availability specifications in the nascent Service Catalog using real world operational experience
  • Strengthen interrelationship with SLM and CM by implementing appropriate Operating Level Agreements. Begin providing routine, periodic AM reports to SLM using manual methods.
  • Build relationships with other ITIL areas as appropriate
  • Continue operating AM with minimal staff commitment and implement two virtual ‘Boards’ - an Availability Architecture Board and an Availability Management Board for policy and oversight
  • Select and acquire an automation tool to support AM
  • Plan Phase 3 with focus on continuous improvement of existing model

Phase 3 (future): Advanced

  • Develop full partnership with an evolved SLM function, including participating in the design of future Service Level Agreements and the Service Catalog
  • Implement full reliance on the Configuration Management Database (CMDB)
  • Implement methods and processes for proactive governance of factors affecting the availability of production systems
  • Make the AM staff as ‘virtual’ as possible (because AM has become part of the organizational consciousness) but maintain the two virtual Boards
  • Fully automate AM processes, analyses and measurement systems to the greatest extent possible
  • Plan future evolution and continuous improvements

This list, of course, is only a general framework for implementing AM (there may be some essential items missing). There certainly are many other ways of implementing AM in an effective manner. However, we think this phased approach will get the company where it needs to be within the given constraints. We intend to succeed through taking manageable steps, achieving quick wins where possible, and by stressing early end-user adoption of the ITIL concepts and processes.
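To make the Phase 1 item on measuring and reporting Availability metrics a little more concrete, here is a small Python sketch of the basic calculation: agreed service time minus unplanned downtime, divided by agreed service time. It is only a sketch under simplifying assumptions. Planned maintenance windows are excluded from agreed service time, outages are assumed not to overlap each other or the planned windows, and every date, duration and function name below is invented for illustration.

```python
from datetime import datetime, timedelta


def total_duration(intervals):
    """Sum the lengths of (start, end) datetime pairs."""
    return sum((end - start for start, end in intervals), timedelta())


def availability_pct(period_start, period_end, unplanned_outages, planned_windows=()):
    """Percentage of agreed service time during which the component was up.

    Agreed service time is the reporting period minus planned windows.
    Assumes intervals do not overlap (a simplification for this sketch).
    """
    agreed = (period_end - period_start) - total_duration(planned_windows)
    if agreed <= timedelta(0):
        raise ValueError("no agreed service time in this period")
    down = total_duration(unplanned_outages)
    return 100.0 * ((agreed - down) / agreed)


# Illustrative one-week reporting period with made-up events.
week_start = datetime(2006, 1, 2)
week_end = week_start + timedelta(days=7)

planned = [  # one three-hour maintenance window
    (datetime(2006, 1, 7, 22, 0), datetime(2006, 1, 8, 1, 0)),
]
outages = [  # two unplanned outages: 45 and 54 minutes
    (datetime(2006, 1, 3, 10, 0), datetime(2006, 1, 3, 10, 45)),
    (datetime(2006, 1, 5, 14, 0), datetime(2006, 1, 5, 14, 54)),
]

pct = availability_pct(week_start, week_end, outages, planned)
print(f"Availability for the week: {pct:.2f}%")
```

In a non-automated Phase 1 workflow, the outage list would come from Incident records entered by hand; once a CMDB and tooling are in place, the same calculation could be fed automatically.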

In our next article, we’ll discuss certain of these line items in greater detail. Until then, thanks for reading Real World ITIL! We’ll see you back here next time.

Regards,
Scott (your moderator)
