CVSS 3.1 vs CVSS 4.0 — A Practical Comparison for Security Teams

Published May 19, 2026 · By AxVeil Research · 13 min read

CVSS 4.0 has been a published specification at FIRST.org since November 2023, but most vulnerability-management programmes still anchor SLAs on CVSS 3.1. That gap is becoming a problem in 2026: CNAs increasingly publish both vectors, regulators are referencing the new CVSS-BTE score, and the supplemental metrics finally let security teams encode the real blast radius of a finding. This article gives you a metric-by-metric comparison, three re-scored CVEs, and a migration plan you can ship to engineering and auditors without breaking existing remediation SLAs.

What changed structurally

CVSS 3.1 had three metric groups: Base, Temporal, Environmental. CVSS 4.0 keeps Base and Environmental, replaces Temporal with Threat, and adds a Supplemental group. The full picture is now four named scores plus the Supplemental group:

CVSS-B — Base only. The intrinsic severity, vendor-neutral. Replaces "CVSS Base Score v3.1".
CVSS-BT — Base + Threat. Threat replaces Temporal and now uses a single Exploit Maturity metric.
CVSS-BE — Base + Environmental. Used by asset owners to tailor a score to their deployment.
CVSS-BTE — All three, the recommended score for prioritisation. Closest analogue to the old "final" CVSS v3.1 score.
Supplemental — Safety, Automatable, Recovery, Value Density, Vulnerability Response Effort, Provider Urgency. Does not change the numeric score; provides routing and context.
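Which of the four names applies follows from the metric groups a vector actually uses. A minimal Python sketch, assuming the vector is already well formed and treating any metric set to X (Not Defined) as absent; the function name `score_name` is illustrative and not part of the specification:

```python
# Metric abbreviations per group in a CVSS 4.0 vector. Supplemental metrics
# are left out on purpose: they never change the score or its name.
THREAT = {"E"}
ENVIRONMENTAL = {
    "CR", "IR", "AR",
    "MAV", "MAC", "MAT", "MPR", "MUI",
    "MVC", "MVI", "MVA", "MSC", "MSI", "MSA",
}


def score_name(vector: str) -> str:
    """Return CVSS-B, CVSS-BT, CVSS-BE or CVSS-BTE for a CVSS:4.0 vector."""
    parts = vector.split("/")
    if parts[0] != "CVSS:4.0":
        raise ValueError("not a CVSS 4.0 vector")
    metrics = dict(p.split(":", 1) for p in parts[1:])
    # A metric explicitly set to X (Not Defined) counts as not used.
    used = {name for name, value in metrics.items() if value != "X"}
    name = "CVSS-B"
    if used & THREAT:
        name += "T"
    if used & ENVIRONMENTAL:
        name += "E"
    return name
```

Storing this label next to each score keeps a vendor-supplied CVSS-B from being compared directly with an asset owner's CVSS-BTE.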

The vector string is incompatible. CVSS 3.1 starts with CVSS:3.1/. CVSS 4.0 starts with CVSS:4.0/. Parsers must branch on the prefix; you cannot run a v3.1 vector through a v4.0 calculator and expect a meaningful number.
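Branching on the prefix can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a complete validator: `calc_v31` and `calc_v40` are hypothetical names for whatever calculators your pipeline wraps, and only two cross-version mistakes are checked.

```python
def parse_vector(vector: str) -> tuple[str, dict[str, str]]:
    """Split a CVSS vector into (version, metrics); reject unknown prefixes."""
    prefix, _, rest = vector.partition("/")
    if prefix not in ("CVSS:3.1", "CVSS:4.0"):
        raise ValueError(f"unsupported or missing CVSS prefix: {prefix!r}")
    metrics: dict[str, str] = {}
    for item in rest.split("/"):
        key, sep, value = item.partition(":")
        if not (key and sep and value):
            raise ValueError(f"malformed metric: {item!r}")
        if key in metrics:
            raise ValueError(f"duplicate metric: {key}")
        metrics[key] = value
    return prefix.split(":")[1], metrics


def calculator_for(vector: str) -> str:
    """Return the name of the (hypothetical) calculator that should score it."""
    version, metrics = parse_vector(vector)
    # Attack Requirements (AT) exists only in 4.0.
    if version == "3.1" and "AT" in metrics:
        raise ValueError("AT is a 4.0 metric but the vector is labelled 3.1")
    # In 3.1, S is Scope (U or C); in 4.0, S is the Supplemental Safety metric.
    if version == "4.0" and metrics.get("S") in ("U", "C"):
        raise ValueError("Scope is a 3.1 metric but the vector is labelled 4.0")
    return "calc_v31" if version == "3.1" else "calc_v40"
```

Rejecting a mislabelled vector early is cheaper than discovering later that a dashboard mixed scores from two different formulas.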

New Base metrics worth knowing

Attack Requirements (AT) — supplements Attack Complexity (AC). AT captures pre-conditions in the target environment (race condition window, specific config, MITM position). AC now only measures defender-controlled mitigations like ASLR. Splitting the two stops the historical over-weighting of "complex" bugs that were trivial in practice.
User Interaction (UI) — now three values: None, Passive, Active. A drive-by exploit (Passive) scores higher than one requiring a deliberate click (Active).
Vulnerable System (VC, VI, VA) + Subsequent System (SC, SI, SA) — replaces Scope. You now score impact to the immediate component and impact to downstream systems as separate triples. This is the change that most often shifts severity bands.
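The Base metric set described above can be written down as a small lookup table, which is handy for catching 3.1 habits (such as UI:R or a missing AT) in hand-written 4.0 vectors. A minimal Python sketch; the helper name `base_problems` is an assumption of this example:

```python
# Allowed values for the eleven CVSS 4.0 Base metrics.
BASE_V4 = {
    "AV": "NALP",  # Network, Adjacent, Local, Physical
    "AC": "LH",    # Low, High
    "AT": "NP",    # None, Present
    "PR": "NLH",   # None, Low, High
    "UI": "NPA",   # None, Passive, Active
    "VC": "HLN", "VI": "HLN", "VA": "HLN",  # Vulnerable System impact
    "SC": "HLN", "SI": "HLN", "SA": "HLN",  # Subsequent System impact
}


def base_problems(metrics: dict[str, str]) -> list[str]:
    """List missing or invalid Base metrics in an already-parsed 4.0 vector."""
    problems = []
    for name, allowed in BASE_V4.items():
        value = metrics.get(name)
        if value is None:
            problems.append(f"{name}: missing")
        elif len(value) != 1 or value not in allowed:
            problems.append(f"{name}: invalid value {value!r}")
    return problems
```

An empty list means all eleven Base metrics are present with legal values; it says nothing about whether the analyst chose the right ones.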

Threat group — simpler than Temporal

CVSS 3.1 Temporal had Exploit Code Maturity (E), Remediation Level (RL), and Report Confidence (RC). CVSS 4.0 Threat keeps only Exploit Maturity (E). RL and RC were removed; in practice most analysts left them at Not Defined, so they added noise without changing prioritisation. Exploit Maturity values are Attacked (A: active in-the-wild exploitation, CISA KEV-tier), POC (P), Unreported (U), and Not Defined (X).
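Because Threat is now a single metric, it can be derived mechanically from the enrichment data most teams already hold. The Python sketch below is one possible mapping, not a rule from the specification: treating KEV membership as the trigger for Attacked follows the "KEV-tier" framing above, and the three boolean inputs are hypothetical field names.

```python
def exploit_maturity(in_kev: bool, public_poc: bool, intel_checked: bool) -> str:
    """Pick an Exploit Maturity (E) value from simple enrichment signals.

    in_kev        -- the CVE is listed in the CISA KEV catalogue
    public_poc    -- proof-of-concept exploit code is publicly available
    intel_checked -- threat intelligence was consulted for this CVE
    """
    if in_kev:
        return "A"  # Attacked
    if public_poc:
        return "P"  # POC
    if intel_checked:
        return "U"  # Unreported: intel was consulted and found nothing
    # Not Defined. To my understanding the 4.0 calculation treats X as the
    # worst case (Attacked), so leaving E unset never lowers a score.
    return "X"
```

Only return U when intelligence was actually consulted; defaulting to U for unknown CVEs would quietly deflate scores.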

Environmental — closer to the way teams actually triage

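Environmental in 4.0 lets you re-state every Base metric for your environment. Most importantly, you can set Modified versions of the new Subsequent System impact triple (MSC, MSI, MSA). If your blast radius is well contained (for example, a CVE in a parser running inside a hardened sandbox with no network egress), you can set the Modified Subsequent System impacts to None and lower the CVSS-BTE score. How far it drops depends on the rest of the vector: with every other Base metric at its worst value, removing Subsequent System impact takes a 10.0 to 9.3, and larger drops come from also adjusting the other Modified metrics, the Security Requirements (CR, IR, AR), or Exploit Maturity. The score finally reflects the architecture, not just the bug.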
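The override mechanics are simple: each Base metric has a Modified counterpart prefixed with M, and a defined Modified value takes precedence. A minimal Python sketch of that resolution step, assuming an already-parsed metrics dictionary; `effective_base` is an illustrative name, and the numeric scoring itself is left to a real calculator:

```python
BASE = ("AV", "AC", "AT", "PR", "UI", "VC", "VI", "VA", "SC", "SI", "SA")


def effective_base(metrics: dict[str, str]) -> dict[str, str]:
    """Resolve each Base metric to its Modified value when one is defined."""
    effective = {}
    for name in BASE:
        modified = metrics.get("M" + name, "X")
        # X (Not Defined) means "no override": fall back to the Base value.
        # Note that MSI and MSA also accept S (Safety), which has no Base
        # equivalent; it is passed through unchanged here.
        effective[name] = metrics[name] if modified == "X" else modified
    return effective


# The sandboxed-parser case: Subsequent System impact collapsed to None.
sandboxed = {
    "AV": "N", "AC": "L", "AT": "N", "PR": "N", "UI": "N",
    "VC": "H", "VI": "H", "VA": "H", "SC": "H", "SI": "H", "SA": "H",
    "MSC": "N", "MSI": "N", "MSA": "N",
}
print(effective_base(sandboxed)["SC"])  # N
```

Keeping the vendor's Base values untouched and layering overrides on top makes it easy to show an auditor exactly which architectural claim changed the score.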

Three CVEs, re-scored

Re-scoring real CVEs is the fastest way to feel how the new metrics behave. We picked three well-known CVEs, two that drop from Critical to High and one that stays Critical, so you can calibrate expectations before you re-score your own backlog.

CVE-2024-3094 (XZ Utils backdoor)

CVSS 3.1: AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H        = 10.0 Critical
CVSS 4.0: AV:N/AC:L/AT:P/PR:N/UI:N/VC:H/VI:H/VA:H/
          SC:H/SI:H/SA:H/E:P                          = 8.9 High (BT)

The 3.1 vector flagged this as a perfect-10 because it ignored the very narrow pre-condition (target had to be in the brief window where the backdoor build was distributed). CVSS 4.0 captures that with AT:P (Attack Requirements: Present), which knocks the score into the High band. Both scores are defensible; the 4.0 vector is the more honest one for retrospective prioritisation.

CVE-2023-34362 (MOVEit Transfer SQLi)

CVSS 3.1: AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H        = 9.8 Critical
CVSS 4.0: AV:N/AC:L/AT:N/PR:N/UI:N/VC:H/VI:H/VA:H/
          SC:H/SI:H/SA:H/E:A                          = 10.0 Critical (BT)

Stays critical in both versions. The MOVEit chain had no special pre-conditions and produced strong subsequent-system impact across thousands of tenants — the score earns its severity band twice. With Exploit Maturity set to Attacked the 4.0 BT remains in Critical territory.

CVE-2022-22965 (Spring4Shell)

CVSS 3.1: AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H        = 9.8 Critical
CVSS 4.0: AV:N/AC:L/AT:P/PR:N/UI:N/VC:H/VI:H/VA:H/
          SC:L/SI:L/SA:L/E:A                          = 8.4 High (BT)

Spring4Shell required a specific combination of JDK 9+, packaged WAR, and Apache Tomcat, which is captured as AT:P. Most exploited deployments contained blast radius to the immediate JVM, so the Subsequent System triple drops to Low. Net result: the same bug that screamed 9.8 in 3.1 lands at 8.4 in 4.0. That matches how many organisations treated it in 2022: after the initial scramble, remediation moved from Critical into High lanes.
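What matters for SLAs is whether a re-score crosses a band boundary, and the qualitative bands are the same in 3.1 and 4.0. A small Python sketch that turns a pair of scores into a band transition; the function names are illustrative:

```python
def severity_band(score: float) -> str:
    """Map a CVSS score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"score out of range: {score}")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


def band_change(v31_score: float, v40_score: float) -> str:
    """Describe how a finding moves between bands when re-scored."""
    return f"{severity_band(v31_score)} -> {severity_band(v40_score)}"


print(band_change(10.0, 8.9))  # Critical -> High
print(band_change(9.8, 8.4))   # Critical -> High
```

Run over a whole backlog, the count of each transition is usually more persuasive to engineering than a list of decimal deltas.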

Supplemental metrics — context without score inflation

Supplemental metrics are the quiet upgrade. They never change the BTE number, but they let you encode the routing data that triage teams currently keep in spreadsheets:

Safety (S) — does exploitation risk human safety? Critical for ICS / OT, medical device, and automotive vendors. Values: Negligible, Present.
Automatable (AU) — can attackers automate exploitation across many targets? Maps cleanly to the CISA SSVC Automatable decision point.
Recovery (R) — Automatic, User, Irrecoverable. Tells responders whether they have a graceful failure mode.
Value Density (V) — Diffuse vs Concentrated. A single vulnerable identity store is Concentrated; a per-tenant misconfig is Diffuse.
Vulnerability Response Effort (RE) — Low, Moderate, High. Honest signalling so customers can plan change windows.
Provider Urgency (U) — Clear, Green, Amber, Red. Vendor's own opinion of urgency.

Together they carry several inputs of the CISA SSVC decision tree directly in the vector, which is the direction most government-aligned vulnerability programmes are heading. If you already enrich findings with EPSS and KEV membership (and you should), adding Automatable + Recovery completes a four-signal model that is meaningfully better than CVSS 3.1 alone.
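Because Supplemental metrics travel in the vector, routing rules can read them directly instead of consulting a spreadsheet. A minimal Python sketch, assuming a well-formed CVSS:4.0 vector; the tag names are hypothetical and the choice of which values trigger a tag is a policy decision, not something the specification prescribes:

```python
def routing_tags(vector: str) -> list[str]:
    """Derive triage routing tags from the Supplemental metrics of a vector."""
    metrics = dict(p.split(":", 1) for p in vector.split("/")[1:])
    tags = []
    if metrics.get("S") == "P":      # Safety: Present
        tags.append("safety-review")
    if metrics.get("AU") == "Y":     # Automatable: Yes
        tags.append("automatable")
    if metrics.get("R") == "I":      # Recovery: Irrecoverable
        tags.append("irrecoverable")
    if metrics.get("U") == "Red":    # Provider Urgency: Red
        tags.append("vendor-urgent")
    return tags


vector = (
    "CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:H/VI:H/VA:H/SC:N/SI:N/SA:N"
    "/S:P/AU:Y/R:I/U:Red"
)
print(routing_tags(vector))
# ['safety-review', 'automatable', 'irrecoverable', 'vendor-urgent']
```

The tags leave the numeric score untouched, which is the point: they change who sees the finding, not how severe it is rated.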

Migration plan for a typical AppSec programme

We use this plan across customer engagements at AxVeil, from VAPT quarterlies to year-long compliance programmes. The goal is to stop re-arguing severity quarter after quarter: pin the methodology once, score everything with both versions for one cycle, then cut over.

Inventory. Pull every open finding from the last two years. Tag with the CVSS 3.1 vector and severity band.
Parallel scoring. For one quarter, score every new finding with both 3.1 and 4.0. Use the FIRST official calculator or any reputable open-source port — and store both vectors in your finding model.
Distribution analysis. Compute the delta histogram. If 4.0 BTE is on average 0.6 lower than 3.1, the shift is systematic, so your remediation SLAs need new band thresholds rather than case-by-case score adjustments.
SLA rewrite. Pick band thresholds in 4.0 that produce roughly the same patching volume per band as 3.1 did. Communicate the change in writing to engineering, product, and your auditor.
Tooling cutover. Confirm your scanner, ticketing system, and SIEM enrichment can store and surface the CVSS:4.0/ vector. Update dashboards.
Decommission. After two clean quarters, retire CVSS 3.1 from new findings. Keep historical 3.1 vectors on closed records for forensics.
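The distribution-analysis step needs nothing more than the paired scores collected during parallel scoring. A short Python sketch, assuming each finding is stored as a (3.1 score, 4.0 BTE score) pair; the sample pairs are made up for illustration and are not measurements:

```python
from collections import Counter
from statistics import mean


def delta_report(pairs: list[tuple[float, float]]) -> tuple[float, dict[float, int]]:
    """Return (mean delta, histogram of deltas) for (v3.1, v4.0) score pairs."""
    # Round to one decimal: CVSS scores have one decimal place, and rounding
    # removes floating-point noise such as -1.4000000000000004.
    deltas = [round(v40 - v31, 1) for v31, v40 in pairs]
    histogram = dict(sorted(Counter(deltas).items()))
    return round(mean(deltas), 2), histogram


# Illustrative pairs only.
sample = [(9.8, 8.4), (7.5, 7.5), (10.0, 8.9), (8.0, 7.5)]
mean_delta, histogram = delta_report(sample)
print(mean_delta)
print(histogram)
```

A histogram clustered around one negative value suggests a uniform threshold shift; a wide or two-humped one suggests the change depends on finding type and the bands need a closer look.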

Two things to watch. First, your auditor probably still has SLAs written against 3.1 severity bands, so negotiate the rewrite before the cutover, not during the audit. Second, EPSS is still the strongest external signal for exploitation likelihood; pair the new CVSS-BTE + EPSS percentile + KEV membership in your prioritisation queue. Tools like the CVSS calculator and the pentest cost estimator can help you frame both severity and remediation budget for the same finding in one place.

FAQ

Is CVSS 4.0 backwards compatible with CVSS 3.1?

No. The vector strings are incompatible: CVSS 4.0 introduces new metrics (Attack Requirements, MSI, MSA, Safety), replaces Scope with separate Vulnerable System and Subsequent System impact metrics, and renames several others. Tools must parse the leading 'CVSS:4.0/' prefix and route to a separate calculator. You can keep historical 3.1 scores in your asset inventory, but new vulnerabilities should be scored against the 4.0 specification.

Does NVD publish CVSS 4.0 scores yet?

NVD started accepting CVSS 4.0 from CNAs in late 2023 and publishes both 3.1 and 4.0 scores when a CNA provides them. Coverage is still partial in 2026, so most vulnerability management tools display 3.1 as the primary score and 4.0 as a secondary score when available.

Will a critical CVSS 3.1 vulnerability stay critical in CVSS 4.0?

Not always. CVSS 4.0 differentiates Subsequent System impact from Vulnerable System impact, so a bug that only affects the local component (no downstream blast radius) can drop a severity band. Conversely, vulnerabilities with strong lateral impact can climb. Always re-score critical CVEs before basing patch SLAs on them.

Should we replace 3.1 in our SLAs immediately?

Run them in parallel for one quarter. Score every new finding with both calculators, compare distributions, then cut over with documented thresholds (for example, CVSS-BTE 9.0+ in 4.0 maps to your previous CVSS 3.1 9.5+ SLA). Communicate the change to engineering and to auditors before flipping the contract.
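One way to arrive at a documented threshold is volume matching: choose the 4.0 cut-off that puts the same number of findings into the SLA lane as the old 3.1 cut-off did. The Python sketch below assumes both lists come from the same parallel-scored findings; the sample numbers are invented for illustration.

```python
from typing import Optional


def matched_threshold(
    v31_scores: list[float], v40_scores: list[float], v31_threshold: float
) -> Optional[float]:
    """Find the 4.0 cut-off that keeps the SLA lane at its current volume.

    Counts how many findings met the 3.1 threshold, then returns the lowest
    4.0 score among the same number of top-ranked 4.0 findings.
    """
    volume = sum(1 for s in v31_scores if s >= v31_threshold)
    if volume == 0 or not v40_scores:
        return None
    ranked = sorted(v40_scores, reverse=True)
    return ranked[min(volume, len(ranked)) - 1]


# Invented sample: three findings met the old 9.5+ SLA.
v31 = [9.8, 9.8, 9.6, 8.1, 7.5, 5.3]
v40 = [9.3, 8.4, 9.1, 7.7, 7.5, 5.1]
print(matched_threshold(v31, v40, 9.5))  # 8.4
```

Ties at the cut-off can admit a few extra findings, so treat the result as a starting point and round it to a threshold people can remember.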

Where does EPSS fit alongside CVSS 4.0?

EPSS estimates the probability of exploitation within the next 30 days; CVSS estimates severity if the vulnerability is exploited. They are complementary. The combination CVSS 4.0 BTE + EPSS percentile + CISA KEV membership is the strongest prioritisation signal currently available and is the model we recommend in customer remediation plans.
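The three signals can be combined as a sort key for the remediation queue. The ordering below (KEV membership first, then CVSS-BTE, then EPSS percentile as tie-breaker) is one plausible policy chosen for this sketch, not a prescribed formula; the `Finding` fields and the placeholder identifiers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    finding_id: str
    bte: float              # CVSS 4.0 BTE score, 0.0 to 10.0
    epss_percentile: float  # 0.0 to 1.0
    in_kev: bool            # listed in the CISA KEV catalogue


def priority_key(f: Finding) -> tuple[bool, float, float]:
    """Sort key: KEV-listed first, then higher BTE, then higher EPSS percentile."""
    return (not f.in_kev, -f.bte, -f.epss_percentile)


queue = [
    Finding("F-1", bte=9.3, epss_percentile=0.40, in_kev=False),
    Finding("F-2", bte=7.1, epss_percentile=0.97, in_kev=True),
    Finding("F-3", bte=9.3, epss_percentile=0.85, in_kev=False),
]
for f in sorted(queue, key=priority_key):
    print(f.finding_id)
# F-2
# F-3
# F-1
```

Teams that trust EPSS more than severity can swap the second and third elements of the key; the important part is that the ordering is written down once and applied consistently.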