SLA, SLO, SLI의 차이와 고려할 점
Description: SLA, SLO, SLI 각 용어에 대한 의미와 이 들을 읽을 때 및 사용할 떄 고려해야할 점에 대해 생각해 봅니다.
Differences between SLA, SLO, SLI, and Points to Consider
TL;DR
- Think of each concept as an SLA as an accountability, SLO as a theoretical metric, and SLI as a practical metric.
- SLAs are the most sensitive as they relate to legal liability and are difficult to set in each situation, while SLOs require appropriate metrics that are not complicated, and SLI requires simple methods and metrics to fulfill them.
- So it's a good idea for the technical team to first select the SLOs, measure the SLI, and then talk about the SLAs with the legal team (and similar functions) in the company.
Service Level Agreement
Appointment for responsible service time etc
For example, responding to an incident within hours, a concept that encompasses both SLOs and SLIs as described below.
Includes accountability for violations of SLOs, etc.
SLAs are very difficult to measure and report on. There are always nuances depending on the situation.
For example, if you have a response time of 4 hours, it depends on the context of the issue, how difficult it is to respond, and how to legally handle issues related to response times beyond 4 hours, no matter how fast you are.
Service Level Objectives
Expected system availability goals
Usually expressed as a percentage and using a measurable metric, such as performance is guaranteed to be within 500 ms for a period of one year.
Less vague than SLAs, but there is risk with overly complex metrics
Simplicity and clarity are key point
Service Level Indicators
A quantitative representation of the success rate of a service level as a percentage.
Metrics in the form of tangible evidence of SLOs
Putting all of these concepts together
an SLA is a guarantee for 99.99% of the year
the value of the SLO is 99.99%, and
The value of the SLI is an actual value that may or may not equal 99.99%, and if it falls short, the SLA triggers liability coverage for the provider's fault.
ref