SLA guide: the 7 clauses your software contract is missing (2026)

Delta Air Lines lost $500 million from one outage. SLA credits covered less than 0.1% of actual losses. Here's what to demand instead.

$500 million. That’s what Delta Air Lines estimated in losses from the CrowdStrike outage in July 2024 [1]. Cancelled flights, stranded passengers, operational chaos that lasted days. The SLA credits Delta could claim from CrowdStrike? Virtually nothing – less than 0.1% of the actual damage. CrowdStrike’s standard agreements cap liability at fees paid, and service credits are calculated as a percentage of your monthly subscription cost. Not a percentage of what you lost. A percentage of what you pay them.

That gap – between what an outage costs you and what an SLA actually covers – is the single most misunderstood concept in software contracts. And it’s the reason most SLAs are theater.

Every SLA is written by the provider

Read any SLA guide online. They’ll explain uptime percentages, talk about “nines,” and maybe include a template. What they won’t tell you is that every standard SLA in the software industry was drafted by the provider’s legal team, reviewed by the provider’s risk department, and approved because it limits the provider’s exposure to nearly zero.

You’re not a provider. You’re a buyer. This guide is written for you.

We’ve reviewed SLAs on behalf of clients across 40+ software projects – SaaS contracts, cloud infrastructure agreements, custom development deals. The patterns are always the same: impressive-sounding uptime numbers at the top, and an exclusions section at the bottom that quietly removes almost every scenario where you’d actually need protection.

What the nines actually mean

Every SLA discussion starts with the uptime table. Here’s the version most articles give you:

Uptime | Downtime per year | Downtime per month
99% | 3.65 days | 7.3 hours
99.9% | 8.76 hours | 43.8 minutes
99.95% | 4.38 hours | 21.9 minutes
99.99% | 52.6 minutes | 4.38 minutes
99.999% | 5.26 minutes | 26.3 seconds
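
The figures in the table follow from simple arithmetic, and it can help to reproduce them for whatever target a vendor quotes. A minimal sketch, assuming a 365-day year and a month defined as one twelfth of it (the function name is ours, not a standard one):

```python
HOURS_PER_YEAR = 365 * 24  # 8,760; leap years ignored, as in the table


def allowed_downtime_minutes(uptime_percent: float, period_hours: float) -> float:
    """Minutes of downtime permitted in a period at a given uptime target."""
    return (1 - uptime_percent / 100) * period_hours * 60


for target in (99.0, 99.9, 99.95, 99.99, 99.999):
    per_year = allowed_downtime_minutes(target, HOURS_PER_YEAR)
    per_month = allowed_downtime_minutes(target, HOURS_PER_YEAR / 12)
    print(f"{target}%: {per_year:.1f} min/year, {per_month:.2f} min/month")
```

Running it for 99.9% gives roughly 525.6 minutes per year, which is the 8.76 hours shown in the table.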

Now here’s the part they leave out: how downtime is measured matters more than the number itself.

AWS measures availability as the percentage of 5-minute intervals during which at least one instance in a region is running [2]. If your single instance is down for 4 minutes, then up for 1 minute, then down for 4 minutes again – AWS can count that as 100% available. Azure uses a similar approach but with different interval definitions per service [3].
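
To see why the measurement method matters, compare interval-based counting with minute-by-minute counting on the same outage pattern. This is a simplified sketch, not any provider's actual methodology: it assumes an interval counts as available if the service was up for at least one minute inside it, and both function names are hypothetical.

```python
def interval_availability(up_by_minute: list[bool], interval: int = 5) -> float:
    """Share of intervals that count as available.

    Assumed rule (a simplification, not contract text): an interval is
    available if the service was up for at least one minute inside it.
    """
    chunks = [up_by_minute[i:i + interval] for i in range(0, len(up_by_minute), interval)]
    return sum(any(chunk) for chunk in chunks) / len(chunks)


def minute_availability(up_by_minute: list[bool]) -> float:
    """Share of individual minutes the service was actually up."""
    return sum(up_by_minute) / len(up_by_minute)


# Down 4 minutes, up 1, down 4, up 1: the pattern described above.
pattern = [False] * 4 + [True] + [False] * 4 + [True]
print(interval_availability(pattern))  # 1.0
print(minute_availability(pattern))    # 0.2
```

Under that assumed rule, the same ten minutes score 100% by interval and 20% by minute.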

Salesforce is worse. Their standard enterprise contract contains no numeric uptime guarantee at all [4]. They publish a “trust” status page and reference “commercially reasonable efforts.” That’s not an SLA. That’s a press release.

What cloud providers actually promise

Let’s be specific. Here’s what the three largest cloud providers put in their SLAs as of early 2026:

AWS (EC2 SLA) [2]:

  • Below 99.99% availability: 10% service credit
  • Below 99.0%: 30% service credit
  • That’s it. The maximum penalty AWS faces is giving you 30% off next month’s bill

Azure (Virtual Machines SLA) [3]:

  • Below 99.99%: 10% credit
  • Below 99.0%: 25% credit
  • Below 95.0%: 100% credit
  • Better than AWS, but 100% credit on a $200/month VM is… $200

Google Cloud (Compute Engine) [5]:

  • Below 99.99%: 10% credit
  • Below 99.0%: 25% credit
  • Below 95.0%: 50% credit
  • Notice: even at 95% uptime – meaning 36 hours of downtime per month – you still pay half the bill
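
The three schedules are easier to compare side by side in code. A minimal sketch, using only the tiers quoted in the lists above (they should be checked against each provider's current SLA before relying on them); `TIERS` and `service_credit` are illustrative names:

```python
# Credit tiers as quoted in this article: (availability threshold %, credit %).
# A tier applies when monthly availability falls below its threshold.
TIERS = {
    "aws": [(99.99, 10), (99.0, 30)],
    "azure": [(99.99, 10), (99.0, 25), (95.0, 100)],
    "gcp": [(99.99, 10), (99.0, 25), (95.0, 50)],
}


def service_credit(provider: str, monthly_fee: float, availability: float) -> float:
    """Credit in dollars for one month at the given availability percentage."""
    credit_percent = 0
    for threshold, percent in TIERS[provider]:
        if availability < threshold:
            credit_percent = percent  # tiers are ordered from mildest to worst
    return monthly_fee * credit_percent / 100


for provider in TIERS:
    print(provider, service_credit(provider, 200.0, 94.0))
```

For a $200/month VM at 94% availability, the quoted tiers give $60 from AWS, $200 from Azure, and $100 from Google Cloud.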

Now run the math on a real scenario. The AWS us-east-1 outage in December 2021 lasted approximately 11 hours and took down Disney+, Slack, Coinbase, and parts of Amazon’s own retail site [6]. If you were running a $3/month Lightsail instance that went down for the full duration, your SLA credit at the 10% tier would be approximately $0.30. Even at the 30% tier, which 11 hours of downtime in a month (about 98.5% availability) would reach under the schedule above, it would be $0.90.

$0.30
SLA credit for a 12-hour outage on a $3/month cloud instance – while your actual business losses could run into thousands
Based on AWS SLA credit structure [2]

This isn’t a flaw in the system. It’s the system working exactly as designed. Cloud providers price their services assuming SLA credits will cost them almost nothing. The credits exist to satisfy procurement departments, not to make you whole after an outage.

The 10-50x gap

Fortune 1000 companies lose between $1 million and $5.4 million per hour of downtime, depending on the industry [7]. Even a mid-market SaaS company with $5 million ARR loses roughly $570 per hour of complete downtime in direct revenue alone – before accounting for customer churn, support tickets, reputation damage, or contractual penalties to their own customers.

The typical SLA credit covers 2-10% of your actual business losses from a qualifying outage [8]. We call this the 10-50x gap: for every dollar your SLA pays out, you lose ten to fifty dollars.
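
Both numbers in this section are one-line calculations. A small sketch, assuming revenue is spread evenly across all 8,760 hours of the year (the function names are ours):

```python
HOURS_PER_YEAR = 8_760


def hourly_revenue_loss(annual_revenue: float) -> float:
    """Direct revenue lost per hour of complete downtime."""
    return annual_revenue / HOURS_PER_YEAR


def gap_multiple(coverage_fraction: float) -> float:
    """Dollars lost for every dollar of SLA credit received."""
    return 1 / coverage_fraction


print(round(hourly_revenue_loss(5_000_000)))  # 571
print(gap_multiple(0.10), gap_multiple(0.02))
```

A credit covering 10% of losses corresponds to a 10x gap, and one covering 2% corresponds to a 50x gap.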

This is why “we have a 99.9% SLA” means almost nothing as a standalone statement. The uptime number is marketing. The credit structure, exclusions, measurement method, and claim process are the actual contract.

The 7 clauses to demand in your software SLA

These are specific to software projects – custom development contracts, SaaS agreements, managed services deals. Not generic boilerplate. Every one of these comes from real contract negotiations we’ve participated in.

1. Uptime with a defined measurement method

Don’t accept “99.9% uptime” without knowing how it’s calculated. Demand the formula in the contract:

Availability = (Total minutes in period - Downtime minutes) / Total minutes in period

Specify that downtime starts when you report it or when monitoring detects it – whichever comes first. Not when the provider acknowledges it. Many SLAs start the clock only after the provider confirms an incident, which can shave hours off the downtime they report.

Measurement period: monthly. Annual measurement hides bad months inside good ones.
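
The formula and the clock rule can be written down precisely enough to attach to a contract. A minimal sketch, assuming each incident records three timestamps; the `Incident` class and its field names are hypothetical:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    reported_at: datetime   # when the customer reported the problem
    detected_at: datetime   # when monitoring first flagged it
    resolved_at: datetime

    def downtime(self) -> timedelta:
        # The clock starts at whichever signal came first.
        return self.resolved_at - min(self.reported_at, self.detected_at)


def monthly_availability(incidents: list[Incident], days_in_month: int = 30) -> float:
    """Availability percentage for one month, per the formula above."""
    total_minutes = days_in_month * 24 * 60
    down_minutes = sum(i.downtime().total_seconds() / 60 for i in incidents)
    return (total_minutes - down_minutes) / total_minutes * 100


outage = Incident(
    reported_at=datetime(2026, 3, 10, 9, 0),
    detected_at=datetime(2026, 3, 10, 9, 20),
    resolved_at=datetime(2026, 3, 10, 11, 0),
)
print(f"{monthly_availability([outage], days_in_month=31):.3f}%")
```

In this example the customer reported the problem 20 minutes before monitoring caught it, so the full two hours count as downtime.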

2. Response time tiers – not just resolution time

Most SLAs define only resolution targets. Demand response targets as well:

Severity | Example | Response time | Resolution target
Critical | Service down, all users affected | 15 minutes | 4 hours
High | Major feature broken, workaround exists | 1 hour | 8 hours
Medium | Non-critical bug, limited user impact | 4 hours | 48 hours
Low | Cosmetic issue, enhancement request | 1 business day | Next release

Response means a human has acknowledged the issue and begun working on it. Not an auto-reply ticket confirmation. Define this explicitly, because providers will count automated “We received your ticket” emails as a response.
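
A check like the following makes the "human response" definition testable. It is a sketch under stated assumptions: the targets come from the table above, the Low tier is omitted because it needs a business calendar, and `response_met` is a hypothetical helper that expects the timestamp of the first reply written by a person.

```python
from datetime import datetime, timedelta

# Targets from the table above. "Low" is left out because its targets
# ("1 business day", "next release") need a business calendar.
RESPONSE_TARGETS = {
    "critical": timedelta(minutes=15),
    "high": timedelta(hours=1),
    "medium": timedelta(hours=4),
}


def response_met(severity: str, opened_at: datetime, human_ack_at: datetime) -> bool:
    """True if a human acknowledged the ticket within the target.

    human_ack_at must be the first reply from a person; automated
    confirmations are deliberately not accepted as input here.
    """
    return human_ack_at - opened_at <= RESPONSE_TARGETS[severity]


opened = datetime(2026, 3, 10, 9, 0)
print(response_met("critical", opened, opened + timedelta(minutes=12)))  # True
print(response_met("critical", opened, opened + timedelta(minutes=40)))  # False
```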

3. Credits that escalate with duration

Flat-rate credits are provider-friendly. Demand escalating credits:

  • First hour below SLA: 10% credit
  • 1-4 hours: 25% credit
  • 4-12 hours: 50% credit
  • 12+ hours: 100% credit on the affected service for that month
  • 24+ hours: right to terminate without penalty

The escalation creates real financial pressure to fix problems fast. A flat 10% credit whether the outage lasts 20 minutes or 20 hours gives the provider zero incentive to rush.
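
The schedule above translates into a small lookup. A sketch, assuming the outage has already breached the SLA and that each band includes its lower bound (the list above leaves the exact boundaries open, so a contract should spell them out):

```python
def escalating_credit(outage_hours: float) -> tuple[int, bool]:
    """Return (credit percent, termination right) for one outage.

    Boundary handling is an assumption: each band includes its lower
    bound, so exactly 4 hours falls in the 4-12 hour band.
    """
    if outage_hours < 1:
        percent = 10
    elif outage_hours < 4:
        percent = 25
    elif outage_hours < 12:
        percent = 50
    else:
        percent = 100
    return percent, outage_hours >= 24


for hours in (0.5, 3, 8, 20, 30):
    print(hours, escalating_credit(hours))
```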

4. A short, specific exclusions list

This is where SLAs go to die. Standard cloud SLAs exclude:

  • Scheduled maintenance (often unlimited with 48-hour notice)
  • Force majeure (defined so broadly it covers almost anything)
  • Third-party service failures (even ones the provider chose to depend on)
  • Customer-caused issues (subjectively determined by the provider)
  • Beta features, preview services, free tiers

Demand that the exclusions list is closed – meaning only the items listed are excluded. No “including but not limited to” language. Cap scheduled maintenance at a specific number of hours per month (4 is reasonable). Require that force majeure events are named, not described with catch-all language.
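
One way to picture a closed exclusions list is as an enumeration: anything not in it counts. The sketch below is illustrative only. The two exclusion names and the event format are assumptions, and the 4-hour cap is the figure suggested above.

```python
from enum import Enum
from typing import Optional


class Exclusion(Enum):
    """Closed list: anything not named here counts against the SLA."""
    SCHEDULED_MAINTENANCE = "scheduled_maintenance"
    NAMED_FORCE_MAJEURE = "named_force_majeure"


MAINTENANCE_CAP_HOURS = 4.0  # monthly cap suggested above


def countable_downtime(events: list[tuple[float, Optional[Exclusion]]]) -> float:
    """Hours of downtime that count against the SLA for one month.

    Each event is (hours, exclusion or None). Maintenance beyond the
    monthly cap is counted as ordinary downtime.
    """
    counted = 0.0
    maintenance = 0.0
    for hours, exclusion in events:
        if exclusion is None:
            counted += hours
        elif exclusion is Exclusion.SCHEDULED_MAINTENANCE:
            maintenance += hours
        # NAMED_FORCE_MAJEURE hours are excluded entirely.
    return counted + max(0.0, maintenance - MAINTENANCE_CAP_HOURS)


month = [
    (1.5, None),
    (3.0, Exclusion.SCHEDULED_MAINTENANCE),
    (2.5, Exclusion.SCHEDULED_MAINTENANCE),
    (6.0, Exclusion.NAMED_FORCE_MAJEURE),
]
print(countable_downtime(month))  # 3.0
```

Here 5.5 hours of maintenance exceed the cap by 1.5 hours, so those 1.5 hours are added to the 1.5 hours of unexcused downtime.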

The architecture decisions behind your software directly affect which exclusions apply. A single-region deployment means every regional outage is your problem. Multi-region architecture with automated failover makes regional outages the provider’s problem – because the service should stay up.

5. Proactive monitoring with shared dashboards

Demand that your provider runs uptime monitoring and shares the results with you in real time. Not a monthly report. A live dashboard. If they won’t share monitoring data, they’re hiding something.

Specifically require:

  • Synthetic monitoring (automated checks hitting real endpoints every 60 seconds)
  • Status page accessible without authentication
  • Incident timeline published within 24 hours of resolution
  • Monthly uptime report delivered by the 5th of the following month

This prevents the most common SLA dispute: provider claims 99.95%, you experienced 4 hours of downtime, and neither side has objective data. Shared monitoring makes the conversation about numbers, not narratives.
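
If the provider will not run synthetic checks, you can run a basic one yourself. A minimal sketch using only the Python standard library; the `opener` parameter is a hypothetical hook for testing, and the scheduling loop (one call every 60 seconds) is left to whatever job runner you already use.

```python
import urllib.request


def check_endpoint(url: str, timeout: float = 5.0, opener=urllib.request.urlopen) -> bool:
    """One synthetic check: True if the endpoint answers with HTTP 2xx."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError (4xx/5xx) and timeouts are all OSError subclasses.
        return False


def uptime_percent(results: list[bool]) -> float:
    """Share of successful checks, a number both sides can verify."""
    return 100 * sum(results) / len(results)


# One day of checks at 60-second intervals with a single failure.
day = [True] * 1439 + [False]
print(f"{uptime_percent(day):.2f}%")  # 99.93%
```

Storing the raw check results, not just the percentage, is what lets both parties reconstruct the timeline in a dispute.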

6. Root cause analysis with a deadline

After every Critical or High severity incident, demand a written root cause analysis (RCA) within 5 business days. The RCA must include:

  • Timeline with UTC timestamps
  • Root cause identification
  • Immediate remediation taken
  • Preventive measures with implementation dates

This isn’t bureaucracy. It’s accountability. A provider who won’t write an RCA either doesn’t know why their system failed (alarming) or doesn’t want you to know (more alarming). Either way, you need a new provider.
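
The deadline and the required contents can both be checked mechanically. A sketch under two assumptions that the contract should settle explicitly: the five business days are counted from the day the incident was resolved, and weekends are the only non-business days. The section keys are hypothetical labels for the four items listed above.

```python
from datetime import date, timedelta


def add_business_days(start: date, days: int) -> date:
    """Advance by business days, skipping Saturdays and Sundays.

    Public holidays are ignored here; a real contract should name the
    calendar it uses.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            remaining -= 1
    return current


REQUIRED_SECTIONS = ("timeline_utc", "root_cause", "remediation", "preventive_measures")


def rca_is_compliant(resolved_on: date, delivered_on: date, sections: set[str]) -> bool:
    """True if the RCA arrived on time and covers every required section."""
    on_time = delivered_on <= add_business_days(resolved_on, 5)
    return on_time and set(REQUIRED_SECTIONS) <= sections


print(add_business_days(date(2026, 3, 11), 5))  # 2026-03-18
```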

7. Termination rights tied to SLA performance

This is the clause with real teeth. If the provider misses their SLA target in any 3 of 12 consecutive months, you can terminate without penalty and receive a pro-rated refund of prepaid fees.

Without termination rights, credits are your only remedy. And as we’ve established, credits are pocket change. The right to walk away – with your data, without a penalty, and without waiting for a contract renewal date – is the single most powerful clause in any SLA. Providers who won’t agree to it don’t trust their own infrastructure.
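
The "3 of 12 consecutive months" trigger is a sliding-window test. A minimal sketch, assuming you keep one pass/fail record per month, oldest first (the function name and defaults are ours):

```python
def termination_right(met_sla_by_month: list[bool], misses: int = 3, window: int = 12) -> bool:
    """True if any `window` consecutive months contain at least `misses` misses.

    met_sla_by_month holds one entry per month, oldest first; True means
    the provider met its target that month.
    """
    missed = [not met for met in met_sla_by_month]
    if len(missed) <= window:
        return sum(missed) >= misses
    return any(
        sum(missed[start:start + window]) >= misses
        for start in range(len(missed) - window + 1)
    )


history = [True] * 14
for month in (0, 6, 13):  # three misses, but never three inside one 12-month window
    history[month] = False
print(termination_right(history))  # False
```

The example shows why the window wording matters: three misses spread over 14 months do not trigger the clause.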

The fine print that kills you

Even with strong SLA clauses, four provisions regularly undermine the entire agreement:

Claim windows. AWS requires you to submit an SLA credit claim within 30 days of the incident [2]. Miss the window and your claim is void – no matter how severe the outage. Some providers use 15-day windows. Check yours.

Aggregate vs. individual measurement. If your provider runs 100 servers and 3 go down, is availability 97% or 100%? Many SLAs measure across the fleet, not per-instance. Your three servers being down is a rounding error in the provider’s aggregate number.

Credit caps. Almost every SLA caps total credits at 100% of the monthly fee for the affected service. Not your total spend. Not your actual losses. The monthly fee for that specific service. If your $50/month database goes down and costs you $200,000 in lost revenue, your maximum credit is $50.

“Commercially reasonable efforts.” This phrase appears in more SLAs than any specific uptime number. It means nothing enforceable. If your SLA says the provider will use “commercially reasonable efforts” to maintain 99.9% uptime, that’s not a 99.9% guarantee. It’s a statement of intent with no penalty for failure.
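
Three of these provisions reduce to arithmetic you can check against your own contract. The sketch below uses the figures from the examples above; the function names are hypothetical, and the 30-day default is the window the text attributes to AWS, so substitute your provider's actual terms.

```python
from datetime import date, timedelta


def can_still_claim(incident_date: date, today: date, window_days: int = 30) -> bool:
    """Claim window: True while a credit claim can still be filed."""
    return today <= incident_date + timedelta(days=window_days)


def fleet_availability(instances_up: int, instances_total: int) -> float:
    """Aggregate measurement: availability across the provider's fleet."""
    return 100 * instances_up / instances_total


def capped_credit(losses: float, monthly_fee: float) -> float:
    """Credit cap: the most you can recover is the affected service's fee."""
    return min(losses, monthly_fee)


print(can_still_claim(date(2026, 1, 5), date(2026, 2, 10)))  # False
print(fleet_availability(97, 100))  # 97.0, even if all 3 failed servers are yours
print(capped_credit(200_000, 50))   # 50
```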

How to calculate if your SLA actually protects you

Here’s the formula we use with clients. It takes two minutes and tells you whether your SLA is real or decorative.

Step 1: Estimate your hourly cost of downtime.

For a SaaS product: (Annual revenue / 8,760 hours) + (estimated support costs per hour of outage) + (customer churn risk per hour).

For an e-commerce site: (Average hourly revenue) + (cart abandonment losses) + (ad spend wasted during downtime).

Step 2: Calculate your maximum annual exposure.

(Hours of downtime your SLA allows per year) x (hourly cost from Step 1) = your maximum annual exposure, the risk the SLA leaves unprotected.

Step 3: Calculate your maximum SLA payout.

(Monthly service fee) x (maximum credit percentage) x 12 = annual protection.

Step 4: The coverage ratio.

(Annual protection / Maximum annual exposure) x 100 = your SLA coverage percentage.

If that number is below 10%, your SLA is decorative. Below 1%? It’s marketing copy.
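
Steps 2 through 4 fit in a single function. A sketch with hypothetical inputs: an all-in downtime cost of $2,000 per hour from Step 1, a 99.9% SLA (8.76 allowed hours per year), a $300 monthly fee, and a 30% maximum credit.

```python
def coverage_ratio(
    hourly_downtime_cost: float,
    allowed_downtime_hours_per_year: float,
    monthly_fee: float,
    max_credit_fraction: float,
) -> float:
    """Steps 2-4: SLA coverage as a percentage of maximum annual exposure."""
    exposure = allowed_downtime_hours_per_year * hourly_downtime_cost  # Step 2
    protection = monthly_fee * max_credit_fraction * 12                # Step 3
    return protection / exposure * 100                                 # Step 4


# Hypothetical inputs, chosen only to illustrate the calculation.
ratio = coverage_ratio(
    hourly_downtime_cost=2_000,
    allowed_downtime_hours_per_year=8.76,
    monthly_fee=300,
    max_credit_fraction=0.30,
)
print(f"{ratio:.1f}%")  # 6.2%
```

With these inputs the ratio lands at about 6%, which falls under the 10% line described above.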

2-10%
typical SLA coverage ratio – the percentage of actual business losses covered by standard cloud provider SLA credits
Industry analysis across AWS, Azure, and GCP standard SLA terms [2][3][5]

The cost of building software includes the cost of enforcing the SLA that comes with it. Budget for monitoring tools, incident response processes, and – if the numbers justify it – legal review of the contract.

What actually protects you (it’s not the contract)

Here’s the uncomfortable truth: no SLA clause will save you if your software is architecturally fragile. A contract is paper. Infrastructure is physics.

The companies that survive outages without catastrophic losses do three things:

They build redundancy into the architecture. Multi-region deployments. Automated failover. Database replicas that promote automatically. If a single point of failure can take down your product, your SLA is a band-aid on a bullet wound.

They monitor before the SLA triggers. By the time downtime hits your SLA threshold, you’ve already lost money. Proactive monitoring with alerting at degraded-performance thresholds catches problems before they become outages.

They test failure regularly. Chaos engineering isn’t just for Netflix. Even a simple monthly drill – kill a database replica, simulate a region failure, test the failover – reveals whether your architecture actually works or just looks good on a diagram.
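
One way to monitor before the SLA triggers is to track how much of the month's downtime budget is left and alert well before it runs out. A sketch, with alert thresholds that are purely illustrative assumptions (warn at half the budget remaining, page at a quarter):

```python
def error_budget_remaining(sla_percent: float, downtime_minutes: float, days_in_month: int = 30) -> float:
    """Fraction of the month's downtime budget still unused (can go negative)."""
    budget = (1 - sla_percent / 100) * days_in_month * 24 * 60
    return 1 - downtime_minutes / budget


def alert_level(remaining: float) -> str:
    """Hypothetical thresholds: warn at half the budget, page at a quarter."""
    if remaining <= 0:
        return "sla_breached"
    if remaining <= 0.25:
        return "page"
    if remaining <= 0.5:
        return "warn"
    return "ok"


for minutes in (5, 25, 35, 50):
    print(minutes, alert_level(error_budget_remaining(99.9, minutes)))
```

At 99.9% over a 30-day month the budget is about 43 minutes, so 25 minutes of downtime already warrants a warning.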

The best SLA in the world is one you never have to invoke.


Your SLA protects you on paper. Your architecture protects you in practice. We build both – software designed from the foundation for the availability your business requires, backed by SLAs with the seven clauses above. Tell us what you’re building and we’ll scope the architecture and the agreement together.

References

[1] Delta Airlines, “CrowdStrike outage cost Delta Air Lines $500 million,” reported by CNN Business, August 2024. cnn.com

[2] AWS, “Amazon Compute Service Level Agreement,” updated 2025. aws.amazon.com/compute/sla/

[3] Microsoft Azure, “SLA for Virtual Machines,” updated 2025. azure.microsoft.com/en-us/support/legal/sla/virtual-machines/

[4] Salesforce, “Master Subscription Agreement,” 2025. salesforce.com/company/legal/agreements/ – references “commercially reasonable efforts” without numeric uptime commitment in standard terms.

[5] Google Cloud, “Compute Engine Service Level Agreement,” updated 2025. cloud.google.com/compute/sla

[6] AWS, “Summary of the AWS Service Event in the Northern Virginia (US-EAST-1) Region,” December 2021. aws.amazon.com/message/12721/ – affected Disney+, Slack, Coinbase, and Amazon retail.

[7] Ponemon Institute / ITIC, “Hourly Cost of Downtime Survey,” 2024. itic-corp.com – Fortune 1000 average: $1M-$5.4M per hour depending on industry.

[8] Uptime Institute, “Annual Outage Analysis,” 2024. uptimeinstitute.com – Analysis of SLA credit recovery vs. actual outage costs across enterprise contracts.

Frequently asked questions

What is an SLA?

A Service Level Agreement defines measurable performance guarantees – uptime, response time, resolution time – and the penalties when a provider fails to meet them. Most standard SLAs protect the provider, not you.

What uptime should I demand in an SLA?

99.9% (8.76 hours downtime per year) is the baseline for business-critical software. For payment processing or healthcare, demand 99.99% (52.6 minutes per year). Anything below 99.9% is not a real SLA – it's a disclaimer.

Do cloud provider SLAs actually protect my business?

No. AWS, Azure, and Google Cloud SLAs cap remedies at service credits – typically 10-30% of your monthly bill for the affected service. If your $3/month instance goes down for 12 hours, your credit is $0.30 at the 10% tier and $0.90 at the 30% tier. Your actual losses could be thousands or millions.
