Incident Management: Building an ITIL Process
Businesses depend on their IT systems functioning around the clock, every day of the year. If your customers can’t complete their purchases because your website’s backend is down, or if your employees can’t access their work on the cloud, you’re in trouble. You need to be prepared to handle IT issues when they inevitably arise.
Thankfully, by establishing a reliable incident management process supported by a robust service desk, you’ll have the tools you need to keep IT operations running smoothly.
Incident management involves identifying and resolving disruptions to IT services. These disruptions, or incidents, include anything that can affect business operations. They take many forms, from downed servers to buggy software.
The goal of incident management is to minimize the impact incidents have on end users, who may include employees, customers, or vendors. An incident management team is therefore your first line of defense when unanticipated problems occur.
The IT Infrastructure Library, commonly known as ITIL, is one of the most widely adopted frameworks for providing IT services. ITIL 4, the latest version of the framework, includes a wide variety of best practices for delivering IT services, including incident management. ITIL is an excellent guide for implementing incident management at your organization.
While the terms aren’t always used clearly, incidents are not the same as problems, service requests, or changes.
A problem is what causes an incident or series of incidents. Say an employee’s desktop computer is unresponsive due to a newly introduced glitch in the company’s proprietary software. Incident management focuses on quickly addressing the issue and restoring functionality, which in this case may involve rolling back to a previous version of the software or providing the employee with another device to work on. Problem management comes into play after the immediate crisis is resolved, and in this case would require fixing the software bug to prevent future incidents.
Service requests are formal requests from an end user, like a customer or employee. They may involve requests for information, hardware, or software, and they’re typically straightforward in comparison to the unexpected nature of incidents.
Incident management often involves changes to your IT services or infrastructure. The change might have caused the incident in the first place, or changes may be needed to resolve the incident.
The incident management process includes several key steps that lead from initial identification of an incident to resolution and closure.
The first step in managing an incident is, of course, identifying it. You’ll likely discover incidents through a variety of channels, though a service desk — discussed in more detail below — should serve as the single point of contact (SPOC) for each. Make it easy for end users who encounter incidents to report them, whether in person, by phone, or online through email, chat, and fillable forms. Your service team should always be available to assist with reporting as needed and to tackle automated monitoring alerts as soon as possible.
Recording all relevant information about an incident is just as important as identifying it in the first place. You can easily track incidents as tickets through your service desk. These tickets should provide an at-a-glance summary of the incident for everyone on your IT team who gets involved, such as contact info for the end user who reported the incident, a brief overview of the issue, and other key data points, like the date and time the incident occurred.
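As a rough illustration of the ticket described above, here is a minimal Python sketch of an incident record. The class name, field names, and status value are assumptions made for this example, not the schema of any particular service desk product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class IncidentTicket:
    """At-a-glance record of one incident (all field names are illustrative)."""

    reporter_contact: str   # how to reach the end user who reported the incident
    summary: str            # brief overview of the issue
    occurred_at: datetime   # date and time the incident occurred
    # When the ticket was logged; defaults to the current UTC time.
    logged_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    status: str = "open"

    def headline(self) -> str:
        """One-line summary for anyone on the IT team who picks up the ticket."""
        return (
            f"[{self.status}] {self.summary} "
            f"(reported by {self.reporter_contact}, "
            f"occurred {self.occurred_at:%Y-%m-%d %H:%M})"
        )


ticket = IncidentTicket(
    reporter_contact="jane@example.com",
    summary="Checkout page returns HTTP 500",
    occurred_at=datetime(2024, 3, 1, 9, 30, tzinfo=timezone.utc),
)
print(ticket.headline())
# [open] Checkout page returns HTTP 500 (reported by jane@example.com, occurred 2024-03-01 09:30)
```

Keeping the required fields few and explicit makes it harder to log a ticket that is missing the details the next responder will need.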
After you’ve identified the incident and logged the necessary details, it’s time to plan next steps. These depend on what category the incident falls into. Categorization should be multi-level and, to the extent possible, automated through online forms and rules in your service desk platform. Quickly and accurately sorting incidents ensures that they’ll be resolved efficiently and improves user satisfaction.
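To make the idea of automated, multi-level categorization concrete, here is a small Python sketch of keyword-based rules. The categories, keywords, and the `categorize` function are hypothetical; a real service desk platform would typically provide its own rule engine and form fields for this.

```python
# Hypothetical two-level rules: a keyword maps to (category, subcategory).
# Rules are checked in order, so put more specific keywords first.
RULES = [
    ("password", ("Access", "Credentials")),
    ("vpn", ("Network", "VPN")),
    ("wifi", ("Network", "Wireless")),
    ("printer", ("Hardware", "Printer")),
    ("crash", ("Software", "Application error")),
]

# Anything the rules cannot place is routed to a person instead of guessed at.
DEFAULT = ("Uncategorized", "Needs manual review")


def categorize(summary: str) -> tuple[str, str]:
    """Return (category, subcategory) for an incident summary."""
    text = summary.lower()
    for keyword, category in RULES:
        if keyword in text:
            return category
    return DEFAULT


print(categorize("VPN drops every ten minutes"))   # ('Network', 'VPN')
print(categorize("Coffee machine is broken"))      # ('Uncategorized', 'Needs manual review')
```

The explicit fallback matters: an incident that is sorted into the wrong queue can take longer to resolve than one flagged for manual review.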
Along with categorizing the incident, you need to decide how to prioritize it. Assigning an appropriate priority level (low, medium, or high) is critical, as it determines how quickly the incident will be addressed and how many resources will be dedicated to that effort. Factors like the incident’s effect on end users and daily operations should weigh heavily in this analysis.
Depending on the complexity of the incident, additional investigation may be required to determine exactly what the incident is and how it should be handled. This may involve the incident management team conferring with other IT professionals or even other business units. A detailed knowledge base that includes standard questions and troubleshooting guides for common issues is especially helpful here. Diagnosis may reveal that the incident can be quickly resolved, or that it requires escalation.
If your service desk personnel don’t have the expertise or resources necessary to resolve the incident up front, they should escalate it to those who do. The incident may require on-site assistance, the knowledge of an IT specialist, or the input of a manager. During escalation, clear communication is key to keeping the incident management process moving efficiently and preventing issues from getting lost in transit.
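One common way to arrive at a priority level is to combine impact (how many users or operations are affected) with urgency (how quickly the business needs a fix). The Python sketch below shows one possible scoring scheme; the numeric weights and thresholds are assumptions chosen for illustration, and your organization should tune them to its own service levels.

```python
# Illustrative weights; both the scale and the thresholds below are assumptions.
LEVELS = {"low": 1, "medium": 2, "high": 3}


def priority(impact: str, urgency: str) -> str:
    """Combine impact and urgency into a low/medium/high priority."""
    score = LEVELS[impact] * LEVELS[urgency]  # ranges from 1 to 9
    if score >= 6:
        return "high"
    if score >= 3:
        return "medium"
    return "low"


print(priority("high", "high"))      # high
print(priority("high", "low"))       # medium
print(priority("medium", "low"))     # low
```

Writing the rule down, even in a form this simple, keeps prioritization consistent from one service desk agent to the next.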
If you find that a significant proportion of issues can’t be resolved by your front-line team, look into it. Do they need more personnel or other resources? Is there a particular type of incident that they’re unable to address? Resolving issues on first contact is a critical part of keeping incident management lean and effective.
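The questions above can be answered with a couple of simple calculations over your ticket history. The following Python sketch assumes each ticket is a dictionary with hypothetical `category` and `escalated` keys; adapt the field names to whatever your service desk exports.

```python
from collections import Counter

# Hypothetical export of already-resolved tickets.
tickets = [
    {"category": "Network", "escalated": True},
    {"category": "Network", "escalated": True},
    {"category": "Hardware", "escalated": False},
    {"category": "Access", "escalated": False},
]


def first_contact_resolution_rate(tickets):
    """Share of tickets resolved by the front line without escalation."""
    if not tickets:
        return 0.0
    resolved_first = sum(1 for t in tickets if not t["escalated"])
    return resolved_first / len(tickets)


def escalation_rate_by_category(tickets):
    """Escalation rate per category, to spot incident types the front line struggles with."""
    totals, escalated = Counter(), Counter()
    for t in tickets:
        totals[t["category"]] += 1
        if t["escalated"]:
            escalated[t["category"]] += 1
    return {category: escalated[category] / totals[category] for category in totals}


print(first_contact_resolution_rate(tickets))   # 0.5
print(escalation_rate_by_category(tickets))     # {'Network': 1.0, 'Hardware': 0.0, 'Access': 0.0}
```

In this made-up sample, every network incident was escalated, which would suggest that the front-line team needs training or documentation for that category specifically.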
It’s finally time to fix things. At this stage, the incident is in front of the right IT professionals who have what they need to resolve it. Resolution might take the form of a permanent fix or a temporary solution, with the underlying issues left to a separate problem management process to handle. The incident should be resolved as quickly as possible, but not at the expense of any necessary testing — you don’t want to make it any worse or frustrate users by promising a fix that doesn’t deliver.
Once the incident has been resolved and service is restored, all that’s left is closing the relevant ticket out. This should only occur once the end user has been contacted and confirms that they’re satisfied with the resolution. Any escalated incidents should be returned to the front-line personnel at the service desk before closure as well.
To speed the process along, keep communication open between the relevant internal and external parties, and allow for the automatic closure of resolved issues when appropriate. This ensures that your team isn’t wasting time on rote tasks and prevents already-resolved tickets from lingering in your system.
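Automatic closure can be as simple as a scheduled job that closes tickets left in a resolved state for a set period. The Python sketch below assumes a three-day grace period and hypothetical `status` and `resolved_at` fields; both are illustrative choices, not a recommendation from ITIL or any specific tool.

```python
from datetime import datetime, timedelta


def auto_close(tickets, now, grace=timedelta(days=3)):
    """Close tickets that have been resolved for at least `grace`.

    Returns the IDs of the tickets closed on this run. The grace period gives
    the end user time to report that the fix did not work.
    """
    closed_ids = []
    for ticket in tickets:
        if ticket["status"] == "resolved" and now - ticket["resolved_at"] >= grace:
            ticket["status"] = "closed"
            closed_ids.append(ticket["id"])
    return closed_ids


now = datetime(2024, 3, 10, 12, 0)
tickets = [
    {"id": "INC-1", "status": "resolved", "resolved_at": datetime(2024, 3, 5, 9, 0)},
    {"id": "INC-2", "status": "resolved", "resolved_at": datetime(2024, 3, 9, 9, 0)},
    {"id": "INC-3", "status": "in_progress", "resolved_at": None},
]
print(auto_close(tickets, now))   # ['INC-1']
```

Only INC-1 is closed here: INC-2 was resolved too recently, and INC-3 has not been resolved at all.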
How can you effectively implement ITIL incident management at your organization? Start by following these tips.
If it wasn’t clear already, frequent communication is a must for all stages of incident management. This applies to contacts with end users, members of your IT team, and other members of your organization involved in the incident management process.
Updating statuses in real time is one of the most critical communication tasks that comes with incident management. From opening an issue to assigning and closing it, everyone on your team needs to know where an incident stands at all times to avoid wasting time or duplicating work.
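One way to keep statuses trustworthy is to allow only the transitions that make sense in your process. The Python sketch below models this as a small lookup table; the status names and permitted moves are assumptions for illustration and will differ from one service desk setup to another.

```python
# Hypothetical workflow: each status maps to the statuses it may move to next.
ALLOWED = {
    "open": {"assigned"},
    "assigned": {"in_progress", "escalated"},
    "in_progress": {"escalated", "resolved"},
    "escalated": {"in_progress", "resolved"},
    "resolved": {"closed", "in_progress"},  # reopen if the fix did not hold
    "closed": set(),
}


def transition(current: str, new: str) -> str:
    """Return the new status, or raise ValueError if the move is not allowed."""
    if new not in ALLOWED.get(current, set()):
        raise ValueError(f"cannot move ticket from {current!r} to {new!r}")
    return new


status = "open"
for step in ("assigned", "in_progress", "resolved", "closed"):
    status = transition(status, step)
print(status)   # closed
```

Rejecting a jump such as open straight to closed forces someone to record who picked the incident up and how it was resolved.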
Take advantage of handy tools to make communication quick and simple as well. Make text and video chat services available to team members, and ensure that your documentation and tracking tools allow for easy commenting.
Modern IT tools make it easy to document past incidents and track ongoing ones. For instance, Confluence can be used to build and maintain a knowledge base, while Jira Service Management is an ideal solution for tracking all your incident-related tickets. The more documentation you have on incident types, and the better you’re able to track ongoing incidents, the more successful your incident management process will become.
Take advantage of automation to avoid spending an inordinate amount of time on incident management. From generating notices to categorizing and assigning incidents, incident management is full of repetitive tasks that can easily be automated in whole or in part.
Exceptional incident management processes don’t spring up overnight. Even after years of work, you’ll still find ways to improve, whether by expanding your knowledge base to address a frequent issue in more detail or streamlining communication between other departments and your IT team. Track metrics that relate to your incident management goals and collect feedback from end users, then make changes based on the insights you gain. And conduct post-incident reviews to determine what went right and what could have gone better.
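Mean time to resolve is one metric you might track toward those goals. The Python sketch below computes it from pairs of opened and resolved timestamps; the function name and sample data are made up for this example.

```python
from datetime import datetime, timedelta


def mean_time_to_resolve(incidents):
    """Average of (resolved - opened) over all incidents, or None if there are none."""
    durations = [resolved - opened for opened, resolved in incidents]
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)


# Hypothetical sample: (opened, resolved) timestamps for two incidents.
incidents = [
    (datetime(2024, 3, 1, 9, 0), datetime(2024, 3, 1, 10, 0)),   # 1 hour
    (datetime(2024, 3, 2, 9, 0), datetime(2024, 3, 2, 12, 0)),   # 3 hours
]
print(mean_time_to_resolve(incidents))   # 2:00:00
```

A single average can hide outliers, so it is worth reviewing this number alongside user feedback and post-incident reviews rather than on its own.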
Service desks are a must for effective ITIL incident management. They’re an organization’s SPOC for everything related to incident management and beyond. As the first responders to any IT incident or request, they’re the central hub of your IT service management (ITSM) ecosystem.
While service desks handle responsibilities beyond incident management, their capabilities make them uniquely suited to filling this role. As they’re integrated with other IT departments as well as non-IT business units, incidents can be easily escalated throughout your organization as needed. And they maintain a knowledge base with information on frequently encountered issues, making incidents much simpler to resolve.
The task of establishing an incident management process and service desk that can meet all the needs of your business may be intimidating, but don’t worry — there’s an easy way to realize the benefits of incident management today. Outsourced service desks are more affordable than providing comparable capabilities in-house, and they let you take advantage of a team of experts who can handle whatever incidents come their way.
Contegix offers an exceptional IT service desk together with comprehensive services for managing and maintaining your technical infrastructure. And Contegix’s experts are available to assist you with developing a strategically designed incident management solution using the Atlassian ecosystem. Whether you’re looking to build your incident management infrastructure from the ground up or fine-tune your existing process, consulting with Contegix can help.
If you’re looking for a trusted partner to help you prevent and resolve incidents, contact Contegix today.