What is Incident Management in ITIL?

Q: 1. What is Service Management according to ITIL?

ITIL service management provides a set of best practices and techniques for selecting, planning, delivering, and maintaining IT services within a business that aligns the IT department's actions and expenses with changing business demands. Service Management is an organizational capability that is utilized to deliver a service to the customer.

Q: 2. What is the purpose of IT service management?

Service Management focuses on providing value to the customer and also on the customer relationship. Service Management provides a framework to structure IT-related activities and the interactions of IT technical personnel with customers and clients.

Q: 3. Is Service Management part of ITIL?

Service management refers to the way you manage the information systems that deliver value to your customers. It is a generic term. Of the dozens of service management frameworks found in the wild, the most widely adopted is ITIL.

By Manikandan Mohanakrishnan

Updated on Apr 08, 2022

When most people think of IT, the incident management process typically comes to mind. It focuses solely on handling and escalating incidents as they occur in order to restore defined service levels. Incident management does not deal with root cause analysis or problem resolution. Its main goal is to take user incidents from a reported stage to a closed stage.

Once established, effective incident management provides recurring value for the business. It allows incidents to be resolved in timeframes previously unseen. For most organizations, the process moves support from emailing back and forth to a formal ticketing system with built-in:

Prioritization
Categorization
SLA requirements

The formal structures take time to develop but result in better outcomes for users, support staff, and the business. The data gathered from tracking incidents allows for better problem management and business decisions.

Incident management also involves creating incident models: defined processes for handling recurring issues that allow support staff to resolve them quickly and efficiently. In some organizations, dedicated staff have incident management as their only role. In most businesses, the task is relegated to the service desk and its owners, managers, and stakeholders.

The visibility of incident management makes it the easiest process to implement and get buy-in for, since its value is evident to users at all levels of the organization. Everyone has issues they need support or facilities staff to resolve, and handling them quickly aligns with the needs of users at all levels.

What Is ITIL Incident Management?

ITIL defines an incident as an unplanned interruption to a service or a reduction in the quality of a service. The purpose of the incident management practice is to minimize the negative impact of incidents by restoring normal service operation as quickly as possible.

This article describes the incident management process. It can be used as a reference for implementing and operating the process on an ongoing basis. This process guide is based on the best practices described in the Information Technology Infrastructure Library (ITIL).

Every participant in the process is expected to understand and adhere to the process and roadmap, from which the service improvement team and IT delivery staff can define and implement lower-level operating procedures.

Incident Management Overview

The policy governs the Incident Management process and all procedures implemented to manage and execute it. This policy statement defines the system of governance used to ensure that support team members and contractors follow the prescribed process as required. The Incident Management policy is defined as the set of rules listed below:

All incidents should be recorded in an ITSM ticketing tool as the single source of incident data.
There shall be a defined escalation process to ensure timely resolution of incidents within the agreed SLA.
There shall be one parent incident for an issue; other related incidents shall be marked as child incidents.
The Service Desk will be the single focal point for Requestors.
The incident manager is responsible for producing all reports and KPI reports at the defined frequency.
The Service Desk team should strive to match similar incidents and use the Known Error Database (KEDB) to find workarounds or permanent solutions to known issues.
If the support team resolves an issue and finds that the incident should have a different operational category from the one under which it was opened, the support team personnel should change the category at resolution to reflect this.
Incident priority shall be classified based on impact and urgency in accordance with documented criteria.
Support teams shall route tickets to the appropriate support group after mutual discussion to avoid re-assignment. If the assigned support group does not know the correct resolver group, the ticket can be escalated to the service desk team, which will then send it to the appropriate team.
Service Desk teams should escalate incidents to the incident manager if the service desk team cannot find an appropriate group or if the technical team does not feel it can resolve the issue.
The incident manager will provide all support for hierarchical escalations. For all major incidents (P1/P2 tickets), the incident manager will inform management of, and escalate, all issues that require decision making at the highest level. Hence the support team should route incidents to the incident manager if they require hierarchical escalation.
Tickets can be moved to the Pending/On Hold state only if there is a justified reason and it has been communicated to the requestor.
All resolved incidents should be closed after an agreed time.
After resolution, if the support analyst feels that the resolution was only a workaround and that a permanent solution needs more investigation, the analyst should propose a problem record and inform the problem manager. In addition, any known error should also be communicated to the problem manager.
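
The parent/child rule in the policy above can be illustrated with a minimal Python sketch. The `Incident` class, its field names, and the `link_as_child` helper are hypothetical and do not correspond to any particular ITSM tool; the sketch only shows one possible way to keep a single parent per issue.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Incident:
    """Hypothetical ticket record; field names are illustrative only."""
    incident_id: str
    summary: str
    parent_id: Optional[str] = None                      # set when this is a child
    child_ids: List[str] = field(default_factory=list)   # filled on the parent


def link_as_child(parent: Incident, child: Incident) -> None:
    """Mark `child` as a related report of the issue tracked by `parent`."""
    if parent.incident_id == child.incident_id:
        raise ValueError("an incident cannot be its own parent")
    if parent.parent_id is not None:
        raise ValueError("a child incident cannot also act as a parent")
    if child.child_ids:
        raise ValueError("an incident that has children cannot become a child")
    if child.parent_id is not None and child.parent_id != parent.incident_id:
        raise ValueError("the incident is already linked to another parent")
    child.parent_id = parent.incident_id
    if child.incident_id not in parent.child_ids:
        parent.child_ids.append(child.incident_id)
```

Keeping the link in both directions is a design choice made here for readability; a real tool would typically store the relationship once and derive the other side.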

Tasks in Incident Management

The Incident Management process consists of the following major sub-processes, each of which includes further procedures:

Incident Detection and Recording.
Incident Classification and Support.
Incident Investigation and Diagnosis.
Incident Resolution and Recovery.
Confirmation and Closure.

Incident Management Main Function

This document defines standards for delivering IT services. If a customer has specific requirements, the document can be customized to meet them. It also serves as material for high-level training and education for end requestors and IT communities.

The intended audience for this document includes all incident management process roles, Service Desk Analyst, Manager, other service management process owners (Problem Manager, Change Manager), Application Development and Maintenance staff involved in incident management.

Incident Process

The high-level Incident Management process is depicted in the following diagram.

Organization Roles & Responsibilities

Requestor

The Requestor is the person authorized to report issues. The Requestor can:

Contact the Service Desk to report issue(s).
Use the Self-Help tool to report issue(s).
Provide a detailed description of the issue.

Service Desk Analyst (SDA)

The Service Desk is the first role that the Requestor interfaces with. This includes initial support.

Acts as the Single Point of Contact for all incidents.
Captures all required incident details and logs/updates the incident.
Categorizes and prioritizes the incident.
Relates new incidents to existing ones when applicable.
Provides initial support for the issue reported by the Requestor and routes the incident to the relevant support team, if needed.
Tracks the incident until closure to ensure incidents are resolved within agreed SLAs.
Escalates incidents as appropriate after pre-determined threshold points are reached for unresolved incidents.
Keeps the Requestor informed of the incident status.
Escalates the incident to the incident manager if unable to find an appropriate support group.

Support Analyst/ Technical Support

Resolve incidents within agreed service levels.
Escalate the unresolved incidents to higher support levels at the appropriate time.
Make appropriate use of available resources to resolve incidents (people, tools, and processes).
Communicate the incident status internally and externally as applicable.
Interface with other processes as required to resolve the incident.
Maintain up-to-date knowledge on the relevant technical platform.

Incident Manager

Reviews effectiveness and efficiency of the process.
Creates procedures for incident management.
Acts as an escalation point to action any misrouted tickets.
Ensures that incident management processes and tools are integrated with other processes.
Is responsible for the success or failure of the process.
Ensures that the process is defined, documented, maintained and communicated at all levels within the organization and to vendors.
Is responsible for the requirement and guidelines of tool usage.
Establishes and communicates the process roles and responsibilities.
Establishes and communicates the process, service levels, and process performance metrics.
Provides adequate process training for the organization.
Establishes targets for process improvement.
Monitors and reports on the performance of the process.
Participates in other ITSM process initiatives and process reviews.
Ensures third-party performance of the incident management process.

Major Incident Manager

Takes ownership of a major incident.
Opens War Rooms/Bridge Calls and manages communications.
Coordinates among various teams/suppliers for resolution.
Initiates the major incident bridge and involves all required stakeholders.
Determines stakeholders for communication updates and reports.
Determines content of the communication.
Prepares the major incident report and presents it to management.
Works closely with the Problem Manager after closure of a major incident and hands over the major incident synopsis to the Problem Manager.
Participates in the RCA call to detail the incident (if required).
Highest priority of the incident will also be reported.

Detection and Recording

Detailed Description of Detection and Recording

Procedure	Input	Description	Output
Identify Incident	Disruption of service.	Requestor identifies a disruption of service.	Incident Identified
Email/Phone	Disruption of service	Requestor calls/sends email to service desk.	Call/Email sent to Service Desk
Monitoring Tool Alert	Disruption of service	Monitoring tool opens a ticket with the support team without manual intervention.	Incident Logged
Validate Details	Email Incidents	Service Desk validates the data updated by Requestor.	Validation done.
Collect and Record Information	Incident identified.	Service Desk collects and verifies the basic details.	Incident details collected.
Existing Record?	Incident Details Collected	Service Desk checks whether the reported incident relates to an existing incident record.	Record identified.
Current Incident	Incident Details on Call	Service Desk fills in the incident form based on the details received.	Incident created.
Is Incident Escalated?	Incident details on Call	Service Desk checks if the incident is due for escalation.	Decision
Invoke Escalation Procedure	Incident details on Call	For calls due for escalation or repeat calls, initiate the escalation procedure.	Escalation Invoked
Trigger Priority Change	Incident details on Call	Follow the priority change process.	Decision
Update Incident	Incident details on Call	Update the existing incident with the purpose of the Requestor's call.	Updated Incident.
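
The "Existing Record?" decision in the table above can be sketched in Python as follows. This is a minimal illustration, not a description of any real ticketing tool: `IncidentLog`, `IncidentRecord`, and the matching rule (same requestor and same service on an open record) are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class IncidentRecord:
    """Hypothetical incident record with only the fields the sketch needs."""
    incident_id: str
    requestor: str
    service: str
    notes: List[str] = field(default_factory=list)
    is_open: bool = True


class IncidentLog:
    """In-memory stand-in for the ITSM ticketing tool (assumption)."""

    def __init__(self) -> None:
        self._records: Dict[str, IncidentRecord] = {}
        self._counter = 0

    def find_existing(self, requestor: str, service: str) -> Optional[IncidentRecord]:
        # Assumed matching rule: an open record for the same requestor and service.
        for record in self._records.values():
            if record.is_open and record.requestor == requestor and record.service == service:
                return record
        return None

    def record(self, requestor: str, service: str, details: str) -> Tuple[IncidentRecord, bool]:
        """Update the existing record if one matches, else create a new one.

        Returns the record and a flag that is True when a new incident was created.
        """
        existing = self.find_existing(requestor, service)
        if existing is not None:
            existing.notes.append(details)      # "Update Incident" row
            return existing, False
        self._counter += 1
        new = IncidentRecord(f"INC{self._counter:04d}", requestor, service, [details])
        self._records[new.incident_id] = new    # "Current Incident" row
        return new, True
```

A second call from the same requestor about the same service then updates the open record instead of creating a duplicate.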

Classification and Initial Support

Detailed Description of Classification and Initial Support

Procedure	Input	Description	Output
Operational and Product Categorization	Incident Creation	Service Desk does the operational and product categorization and checks if it qualifies for an Incident.	Categorization completed.
Prioritization & Linkage to CI	Categorization completed. Prioritization	Service Desk sets the impact and urgency of the ticket to generate the priority of the incident and links the configuration item (CI).	Prioritization & CI Linkage
Initial Support	Prioritization	Provide initial support to drive the incident towards resolution.	Resolved Incident & Assignment
Incident Resolved	Initial Support	Checks to see if incident is resolved or not.	Assign to support Analyst.
Assign to Support Analyst	Initial Support	After verification of the technical team, investigation and diagnosis is started.	Assign to support Analyst.
Is the incident routed correctly?	Validation of Incident	Incident routed incorrectly.	Escalated to Incident Manager
Escalation	Support group is not identified.	Ticket assigned to correct support team/ hierarchical escalation done.	Appropriate support group is assigned.
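
The routing and escalation rows above could look roughly like this in code. The category names, support group names, and the `route_incident` function are hypothetical; the only behavior taken from the table is that an incident with no identifiable support group is escalated to the Incident Manager.

```python
from typing import Dict

# Hypothetical mapping from operational category to support group.
ROUTING: Dict[str, str] = {
    "network": "Network Support",
    "email": "Messaging Support",
    "database": "Database Support",
}

INCIDENT_MANAGER = "Incident Manager"
SERVICE_DESK = "Service Desk"


def route_incident(category: str, resolved_by_initial_support: bool = False) -> str:
    """Return the group that owns the incident after classification."""
    if resolved_by_initial_support:
        # Initial support fixed the issue, so no reassignment is needed.
        return SERVICE_DESK
    # Unknown categories fall through to the Incident Manager for escalation.
    return ROUTING.get(category.strip().lower(), INCIDENT_MANAGER)
```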

Investigation & Diagnosis

Description of Investigation and Diagnosis

Procedure	Input	Description	Output
Accept the ticket.	Ticket Assigned	The Support Analyst accepts the ticket and ensures the response SLA is met.	Ticket Accepted
Is it a Major Incident?	Ticket Acknowledged	Validate if the incident qualifies for Major Incident.	Major Incident Validated
Refer Knowledge Article	Ticket Accepted	Knowledge article is referred for Workaround/Solution.	Workaround/Solution Checked
Apply Workaround	Workaround Found	Workaround/Solution is applied.	Workaround/Solution applied.
Investigate Further	Workaround not found.	Technical support specialist will investigate further.	Investigation started.
Vendor Support	Vendor Support	Vendor contacted and ticket opened.	Vendor ticket logged.
Contact Customer	Vendor Support not required.	Customer is contacted in case further information is required.	Information gathered.
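
The knowledge article lookup in the table above can be sketched as a simple keyword match. The `KNOWN_ERRORS` entries, the substring matching, and the function names are assumptions for illustration; a real known error database would have its own search mechanism.

```python
from typing import Dict, Optional

# Hypothetical known-error entries: symptom keyword -> documented workaround.
KNOWN_ERRORS: Dict[str, str] = {
    "vpn timeout": "Restart the VPN client and reconnect.",
    "mailbox full": "Archive old mail to free up quota.",
}


def find_workaround(symptom: str) -> Optional[str]:
    """Return a documented workaround if the symptom mentions a known error."""
    text = symptom.lower()
    for keyword, workaround in KNOWN_ERRORS.items():
        if keyword in text:
            return workaround
    return None


def diagnose(symptom: str) -> str:
    """Pick the next step: apply a workaround or investigate further."""
    workaround = find_workaround(symptom)
    if workaround is not None:
        return f"Apply workaround: {workaround}"
    return "Investigate further"
```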

Resolution and Recovery

Description of Resolution and Recovery

Procedure	Input	Description	Output
Carry out the tasks for incident resolution.	Solution/Workaround identified.	Once the solution/workaround is identified, the Service Desk/Support Analyst can start executing the resolution tasks.	Resolution tasks executed.
Is Incident Resolved?	Incident with updated logs	Support Analyst resolves the incident.	Decision taken.
Functional Escalation Required	Incident with resolution steps	Support Analyst follows functional escalation.	Decision taken.
Update worklog and resolve incident.	Resolved Incident	Incident work log updated.	Updated incident status and worklog

Confirmation & Closure

Procedure	Input	Description	Output
Confirm resolution with Requestor.	Incident Resolved	Resolution confirmed with Requestor.	Resolution accepted/rejected by Requestor
Solution Accepted?	Resolved incident.	Requestor to validate incident resolution.	Decision taken.
Reopen Incident	Resolution rejected by Requestor.	Requestor contacts service desk to reopen incident.	Incident Reassigned
Closure of the incident	Incident was not reopened in 5 calendar days.	Incident auto closed in 5 business days.	Incident Closed
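
The auto-closure rule in the last row can be sketched as a small check. The five-day period comes from the table; counting it in calendar days, and the names `AUTO_CLOSE_AFTER` and `should_auto_close`, are assumptions made for this example.

```python
from datetime import datetime, timedelta

# Five days as in the table; treated as calendar days here (assumption).
AUTO_CLOSE_AFTER = timedelta(days=5)


def should_auto_close(resolved_at: datetime, now: datetime, reopened: bool) -> bool:
    """Return True when a resolved incident is eligible for automatic closure."""
    if reopened:
        # The Requestor rejected the resolution, so the incident stays open.
        return False
    return now - resolved_at >= AUTO_CLOSE_AFTER
```

A scheduled job could call this check for every resolved incident and close the ones that qualify.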

Major Incident Management

Definition

A major incident (MI) is an incident that results in significant disruption to the business and demands a response beyond the routine incident management process. Major incidents follow a separate procedure with shorter timescales and greater urgency, which accelerates the resolution of incidents with high business impact.

Major Incident Priority Assessment Criteria

Incident priority is based on two factors: impact and urgency. Impact is a measure of the criticality of the issue. Urgency is the speed with which an incident needs to be resolved.

Urgency Code	Description
Critical	A full-service outage of a critical system. The system is non-operational and an urgent response is required. The damage caused by the incident increases rapidly. Delay in resolution may lead to high revenue, business, or productivity loss.

Impact Code	Description
Critical	Multiple systems are non-operational with major financial implications and need to be restored immediately. A large number of customers are affected and/or unable to perform their BAU activities, with business reputation at stake. No workaround is available.

Outage caused to a financial application.
Data Corruption.
100% impact to network.
Business critical services are impacted.
Severe problem during critical periods (e.g., month end processing)
Security Violation (e.g., denial of service or widespread virus).

Description of Major Incident Management Handling

Procedure	Input	Description	Output
MI Identified	Incident Submitted as Major Incident	Major incident process is invoked by Service Desk/ Support Specialist.	Incident accepted by Major Incident Manager
Open Conference Bridge & notify stakeholders.	Incident Reviewed and communication sent.	Major Incident Manager opens a conference bridge and the initial communication is sent.	Conference Bridge Opened & Communications sent.
Inform related support groups.	Incident Reviewed	Major Incident Manager drives the bridge towards incident resolution and support groups are involved.	Related support groups involved.
Determine stakeholders for communication.	Incident Reviewed	Stakeholders and communication plan are determined.	Stakeholders identified. Communication plan decided.
Co-ordinate resolution	Fix applied/to be applied.	Major Incident Manager and support groups coordinate to resolve the incident.	Co-ordination for resolution. Incident Resolved
Collect Status	Co-ordination for resolution	If incident is not yet resolved, the status is collected by Major Incident Manager, and history is updated.	Status update
Communicate the status to stakeholders.	Stakeholders identified. Communication plan determined. Status update	Major Incident Manager sends communication to stakeholders.	Communication to stakeholders
Perform Major Incident Review	Incident Resolved	Major Incident Review and Incident report is prepared.	Incident report prepared and submitted.
Lessons Learned and Follow-up	Major Incident Review	Lessons learned are recorded in the Incident Review and Report.	Lessons learned/ preventive actions documented, and follow-up done.
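
The "Collect Status" and "Communicate the status to stakeholders" steps can be sketched as a small record that keeps a history and formats an update. The `MajorIncident` class, its fields, and the message format are hypothetical; they only illustrate the loop of collecting a status and sending it to the identified stakeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MajorIncident:
    """Hypothetical major incident record used by the Major Incident Manager."""
    incident_id: str
    summary: str
    stakeholders: List[str]
    history: List[str] = field(default_factory=list)
    resolved: bool = False

    def collect_status(self, status: str) -> None:
        """Append the latest status from the bridge call to the history."""
        self.history.append(status)

    def status_update(self) -> str:
        """Build the text of the next stakeholder communication."""
        latest = self.history[-1] if self.history else "No update yet"
        state = "RESOLVED" if self.resolved else "ONGOING"
        recipients = ", ".join(self.stakeholders)
        return f"[{state}] {self.incident_id}: {self.summary} | Latest: {latest} | To: {recipients}"
```

The full history remains available afterwards as input to the major incident review.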

Prioritization Guidelines

This section describes the assessment of urgency and impact criteria and the priority matrix calculation.

Urgency	Urgency Assessment
Immediate Attention is required.	Critical
Urgent attention is required as impact is same day.	High
Urgent attention is required as impact is within 3 working days.	Medium
There is no immediate attention required. Business as usual can continue, possibly with a workaround until resolved.	Low
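
The urgency levels above combine with impact to produce a priority. The article does not give the priority matrix itself, so the mapping in the sketch below is purely an assumption: each level is ranked from 1 (Critical) to 4 (Low), the two ranks are added, and the sum is bucketed into P1 to P4. Treating P1 and P2 as major incidents follows the policy section, which refers to major incidents as P1/P2 tickets.

```python
from enum import IntEnum


class Level(IntEnum):
    """Shared scale for impact and urgency; a lower number is more severe."""
    CRITICAL = 1
    HIGH = 2
    MEDIUM = 3
    LOW = 4


def priority(impact: Level, urgency: Level) -> str:
    """Derive a priority code from impact and urgency (assumed matrix)."""
    score = int(impact) + int(urgency)   # ranges from 2 to 8
    if score <= 2:
        return "P1"
    if score <= 4:
        return "P2"
    if score <= 6:
        return "P3"
    return "P4"


def is_major(priority_code: str) -> bool:
    """P1 and P2 tickets are handled through the major incident procedure."""
    return priority_code in {"P1", "P2"}
```

An organization would replace the bucketing with its own documented matrix; the point of the sketch is only that priority is a function of both inputs.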

The Incident Statuses

The following section explains the status of an Incident:

New: Users cannot select the status New; the application assigns it after an Incident is logged.
Assigned: After the Incident is logged, it is assigned to a Workgroup, based on the selected Tenant. The Workgroup then specifies the Category, Classification, Urgency, Impact, Priority, Workgroup, and SLA Service Window based on the Symptom provided by the End User. The status of the Incident changes to Assigned.
In Progress: When the Incident is assigned to an Analyst, the status of the Incident is changed to In Progress. The Analyst can refer to various Knowledge Articles or Similar Incidents to work on the Incident.
Pending: If the Analyst cannot continue working on the Incident because the End User needs to provide some details or the Incident depends on another activity being completed, the status of the Incident is moved to Pending.
Resolved: After the Incident is In Progress, the Analyst should resolve the Incident within the provided Service Window. After an Incident is resolved, the status of the Incident is changed to Resolved. Resolved incidents can be added to the Knowledge Base by selecting the Add to KB check box on the Incident Details page. The End User can reopen the Incident if the resolution is not satisfactory.
Closed: After an Incident is resolved, the status of the Incident can be changed to Closed based on the configuration (manual closure or auto closure).
Canceled: The status of an Incident can be changed to Canceled if the End User does not want any further investigation on the Incident (for reasons such as the issue being resolved or being unable to replicate the issue). This option is available to users if configured by the Administrator.
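
The statuses above behave like a small state machine, which can be sketched as a table of allowed transitions. The exact transition set is an assumption: the article does not say from which statuses an incident may be canceled, so the sketch allows cancellation from any status before Resolved, and it sends a reopened incident back to Assigned.

```python
from enum import Enum
from typing import Dict, Set


class Status(Enum):
    NEW = "New"
    ASSIGNED = "Assigned"
    IN_PROGRESS = "In Progress"
    PENDING = "Pending"
    RESOLVED = "Resolved"
    CLOSED = "Closed"
    CANCELED = "Canceled"


# Assumed transition table; a real tool defines its own workflow.
ALLOWED: Dict[Status, Set[Status]] = {
    Status.NEW: {Status.ASSIGNED, Status.CANCELED},
    Status.ASSIGNED: {Status.IN_PROGRESS, Status.CANCELED},
    Status.IN_PROGRESS: {Status.PENDING, Status.RESOLVED, Status.CANCELED},
    Status.PENDING: {Status.IN_PROGRESS, Status.CANCELED},
    Status.RESOLVED: {Status.CLOSED, Status.ASSIGNED},   # ASSIGNED means reopened
    Status.CLOSED: set(),
    Status.CANCELED: set(),
}


def transition(current: Status, target: Status) -> Status:
    """Return the new status, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```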

Best Practice for Implementing Incident Management

The following incident management practice has been designed for all parties that participate in service management, whether internal (such as IT departments or users) or external service providers. All of them, including but not limited to IT functional areas such as Application, Infrastructure, and Information Security, will adhere to the incident management process.

The process goal describes a specific purpose or achievement toward which the efforts of the process are directed. The purpose of the incident management practice is to minimize the negative impact of incidents by restoring normal service operation as quickly as possible in a controlled and predictable manner. It is a fundamental element of service management. The quick restoration of a service is a key factor in user and customer satisfaction, the credibility of the provider, and the value the organization creates in its service relationships.

The scope of the practice covers the activities undertaken to reach its goal. The scope of incident management includes:

Detecting and registering incidents.
Diagnosing and investigating incidents.
Managing incident records.
Communications with relevant stakeholders throughout the incident lifecycle.
Reviewing incidents and initiating improvements to service and to the incident management practice after resolution.

Advantages of Incident Management

Helps minimize the business impact of incidents and increase effectiveness by timely resolution.
Enables proactive identification of beneficial system amendments and enhancements.
Improves proactive monitoring, thus enabling accurate measurement of performance against SLAs.
Promotes dissemination of information on different aspects of service quality.
Enables better utilization of staff that in turn leads to greater efficiency.
Enhances customer and user satisfaction.

Conclusion

The incident management process is triggered in one of three ways: when a Requestor contacts the service desk, the Single Point of Contact (SPOC), to report a service disruption; when an auto-detected event generates an incident in the tracking tool; or when an internal support group identifies a service disruption (or potential disruption) on a managed system and generates an incident. Incident management is considered complete once a workaround or solution is implemented and the incident is resolved and closed.