Incident Response

How incident response works: the six phases, why data incidents break standard IR frameworks, and what fast investigation actually requires.

What is Incident Response?

Incident response (IR) is the structured process an organisation uses to detect, contain, investigate, and recover from security incidents, while preserving evidence, meeting regulatory obligations, and reducing the total impact of the event.

Good incident response doesn't start when something goes wrong. It starts months before that, with preparation: documented playbooks, defined roles, tested communication trees, and the technical infrastructure that makes fast investigation possible. The organisations that contain data incidents in hours rather than weeks built the response capability before they needed it.

The six phases of incident response

The NIST framework for incident handling describes four phases. Many practitioners use a six-phase model that gives more operational clarity, particularly for data incidents with regulatory notification timelines.

Preparation

Defines who does what when an incident occurs, what tools and access they need, and what the escalation path looks like. This phase also includes building and testing playbooks for the most likely incident types: ransomware, data exfiltration, insider threat, account compromise. An untested playbook is an assumption. Preparation also means ensuring the technical infrastructure for fast investigation is in place before an incident forces its absence to matter.
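One way to make a playbook testable rather than an assumption is to keep it machine-readable and validate it automatically. The sketch below is illustrative only: the field names, roles, and steps are assumptions, not a standard schema or the Matters.AI product.

```python
# A minimal sketch of a machine-readable IR playbook entry.
# All field names, addresses, and steps are illustrative assumptions.
RANSOMWARE_PLAYBOOK = {
    "incident_type": "ransomware",
    "severity_default": "high",
    "roles": {
        "incident_commander": "security-lead@example.com",
        "communications": "comms-oncall@example.com",
    },
    "escalation_path": ["soc_analyst", "incident_commander", "ciso"],
    "steps": [
        "isolate affected hosts from the network",
        "preserve volatile evidence before reboot",
        "revoke credentials seen on affected hosts",
        "notify legal within the regulatory window",
    ],
}

def validate_playbook(pb: dict) -> list:
    """Return a list of problems; an empty list means the check passes."""
    problems = []
    for field in ("incident_type", "roles", "escalation_path", "steps"):
        if not pb.get(field):
            problems.append("missing or empty field: " + field)
    if "incident_commander" not in pb.get("roles", {}):
        problems.append("no incident commander assigned")
    return problems
```

A check like this can run in CI so a playbook with an unassigned role fails before an incident, not during one.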

Identification

An alert fires, or an anomaly is observed, or an external party reports a potential compromise. Identification is the phase of determining whether a genuine incident has occurred. Not every alert is an incident. Not every anomaly is malicious. Triage during the identification phase decides whether to escalate to full incident response procedures or resolve as a false positive at the analyst level.

Containment

Once an incident is confirmed, the immediate priority is preventing it from getting worse. Short-term containment isolates the affected systems or revokes the compromised credentials to stop active damage. Long-term containment maintains operational continuity while the full investigation proceeds, which may mean building a clean parallel environment rather than immediately restoring the affected systems.

Investigation and scoping

This is where most data incident responses actually break down, and where the gap between a fast and a slow IR programme is widest. Investigation determines what happened, who was responsible, what data was involved, and how far the incident has propagated. Scoping answers the blast radius question: how many systems, records, and individuals were affected?

For data incidents specifically, scoping requires knowing what sensitive data exists where, who accessed it, whether it left the organisation's control, and through which path. That information comes from data lineage, classification records, behavioural telemetry, and endpoint logs — all correlated across a specific time window. Manual correlation of those sources across multiple tools is the dominant time sink in enterprise data IR. It's why average breach dwell time remains measured in months rather than days.

Remediation and recovery

Once the scope is understood, remediation removes the threat: patching the vulnerability, eliminating the malware, revoking the compromised credential, fixing the misconfiguration. Recovery restores affected systems and data to normal operation, with verification that the threat has been fully eliminated before systems are returned to production.

Post-incident review

Documents what happened, what controls failed, what the response team did well, and what needs to change. This phase feeds directly back into Preparation: updated playbooks, patched vulnerabilities, additional monitoring coverage, and revised detection rules based on the actual attack pattern.

The two metrics that define IR programme quality

MTTD (Mean Time to Detect) and MTTR (Mean Time to Respond) are the standard KPIs for incident response performance. Every IR programme reports them. Most IR teams are optimising the wrong thing.

MTTD measures the time from the start of an incident to its detection. Reducing MTTD requires better detection tooling: lower false positive rates so real threats don't get buried in noise, and higher-confidence signals so analysts can confirm a genuine incident faster.

MTTR measures the time from detection to containment. Reducing MTTR requires faster investigation: getting to scope faster, making containment decisions faster, executing remediation faster.
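The arithmetic behind both metrics is simple once each incident carries a start, detection, and containment timestamp. A minimal sketch, using entirely hypothetical incident records:

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident records: (started, detected, contained) timestamps.
incidents = [
    (datetime(2026, 3, 1, 2, 0), datetime(2026, 3, 1, 14, 0), datetime(2026, 3, 1, 20, 0)),
    (datetime(2026, 3, 9, 9, 0), datetime(2026, 3, 10, 9, 0), datetime(2026, 3, 10, 15, 0)),
]

def mttd_hours(records):
    # Mean Time to Detect: incident start -> detection.
    return mean((d - s).total_seconds() / 3600 for s, d, _ in records)

def mttr_hours(records):
    # Mean Time to Respond: detection -> containment.
    return mean((c - d).total_seconds() / 3600 for _, d, c in records)

print(mttd_hours(incidents), mttr_hours(incidents))  # 18.0 6.0
```

For the sample data above, MTTD is 18 hours and MTTR is 6 hours, which illustrates the point that follows: a fast MTTR says nothing about whether the scope behind the containment decision was complete.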

The hidden factor that neither metric directly captures is investigation quality. An IR team can achieve fast MTTR by containing quickly but scoping narrowly, missing the downstream systems where the attacker also accessed data. That narrow scope looks like a fast response. It may produce an incomplete breach notification that attracts regulatory scrutiny later.

The real metric practitioners should track alongside MTTD and MTTR is investigation confidence: how certain is the team that the scope they've defined is complete? Organisations that can answer that question quickly, because they have continuous data lineage and classification available before the incident opens, resolve incidents differently from organisations that reconstruct scope manually under time pressure.

Where data incident response specifically breaks down

Standard IR frameworks were designed for infrastructure and malware incidents. Data incidents have a different structure and a different set of hard questions.

After a malware incident, the key questions are: which systems were compromised, what did the malware do, has it been fully removed? Those questions have relatively deterministic answers from endpoint telemetry and forensic analysis.

After a data incident, the key questions are: what data was involved at the semantic level, not just which systems were touched? How far did that data propagate after the initial access? How many individuals are affected? That propagation question is the one that regulatory frameworks care about, and it's the one that most IR teams struggle to answer quickly.

The investigation stalls not because the team is slow. It stalls because the information needed to answer the propagation question doesn't exist in one place. Database access logs show what was queried. Endpoint logs show what files were created. Network logs show what was transmitted. Cloud storage access logs show what was uploaded. None of these sources share a data model. Correlating them manually to trace the path from access to exfiltration takes days when regulatory notification windows are measured in hours.
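What "no shared data model" means in practice: each source has to be normalised into a common event shape before a timeline can even be assembled. A minimal sketch of that manual correlation step, with invented log shapes and names (the sources, fields, and user are all assumptions):

```python
from datetime import datetime, timedelta

# Hypothetical raw events from two sources that don't share a data model.
db_logs = [{"ts": "2026-03-01T10:00:00", "user": "svc_etl", "table": "customers"}]
endpoint_logs = [{"time": 1772359500, "host": "wks-14", "file": "export.csv", "user": "svc_etl"}]

def normalize_db(e):
    # ISO timestamp string -> common shape.
    return {"when": datetime.fromisoformat(e["ts"]), "who": e["user"],
            "what": "query:" + e["table"], "source": "database"}

def normalize_endpoint(e):
    # Unix epoch seconds -> common shape (naive UTC).
    return {"when": datetime(1970, 1, 1) + timedelta(seconds=e["time"]),
            "who": e["user"], "what": "file:" + e["file"], "source": "endpoint"}

def correlate(events, who, start, window):
    # All normalized events for one principal inside a time window,
    # ordered by time -- the step that dominates data IR today.
    hits = [e for e in events if e["who"] == who and start <= e["when"] <= start + window]
    return sorted(hits, key=lambda e: e["when"])

events = [normalize_db(e) for e in db_logs] + [normalize_endpoint(e) for e in endpoint_logs]
timeline = correlate(events, "svc_etl", datetime(2026, 3, 1, 10, 0), timedelta(hours=1))
```

Even this toy version needs one normaliser per source; multiply that across dozens of real systems with inconsistent identities and clocks, and the days-long timelines described above follow.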

That structural gap is what continuous data lineage addresses. When every step of sensitive data movement is tracked in real time before any incident occurs, the investigation phase starts with the propagation map already built. The scope question is a query against a live lineage graph, not a reconstruction from fragmented logs. The difference in investigation time is not marginal.
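Conceptually, "scope as a query against a live lineage graph" is reachability: everything downstream of the point of compromise is in scope. A toy sketch with invented node names (not a real product API):

```python
from collections import deque

# A toy lineage graph: edges record where sensitive data flowed next.
# All node names are illustrative assumptions.
lineage = {
    "prod_db.customers": ["etl_job_7"],
    "etl_job_7": ["s3://analytics-bucket/export.csv", "warehouse.cust_dim"],
    "warehouse.cust_dim": ["bi_dashboard"],
    "s3://analytics-bucket/export.csv": [],
    "bi_dashboard": [],
}

def blast_radius(graph, compromised):
    # Breadth-first walk: every node reachable from the point of
    # compromise is in scope for the investigation.
    seen, queue = {compromised}, deque([compromised])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

If the graph already exists when the incident opens, this walk is milliseconds of work; if it doesn't, the same answer has to be rebuilt edge by edge from the fragmented logs described above.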

IR for data incidents vs general incidents

Data incidents have specific characteristics that generic IR playbooks don't fully address.

Regulatory timelines compress everything. GDPR requires notification within 72 hours of becoming aware that a breach has occurred. India's DPDP Act has comparable requirements. The clock starts at awareness, not at investigation completion. A team that takes four days to determine scope has already missed its notification window regardless of how fast it contained the incident.
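The awareness-clock rule is easy to encode, which is worth doing because teams routinely anchor the deadline to the wrong event. A minimal sketch (the function names are illustrative):

```python
from datetime import datetime, timedelta

def notification_deadline(awareness: datetime, window_hours: int = 72) -> datetime:
    # The clock starts at awareness of the breach, not at the end of scoping.
    return awareness + timedelta(hours=window_hours)

def window_missed(awareness: datetime, scoping_done: datetime,
                  window_hours: int = 72) -> bool:
    return scoping_done > notification_deadline(awareness, window_hours)

# A team aware on Monday 09:00 that finishes scoping four days later
# has already blown a 72-hour window.
print(window_missed(datetime(2026, 5, 4, 9, 0), datetime(2026, 5, 8, 9, 0)))  # True
```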

Evidence standards are higher. Regulatory, legal, and insurance proceedings scrutinise the evidence trail produced during incident response. Logs assembled manually under time pressure, cross-referenced from systems not designed to correlate with each other, carry less evidentiary weight than immutable, contemporaneous records produced as a continuous byproduct of normal monitoring. What the evidence trail looks like after the incident is determined by what infrastructure was in place before it.

Scope ambiguity has direct cost implications. An organisation that notifies too broadly, telling regulators that 500,000 customers were potentially affected when the actual affected population was 50,000, creates unnecessary reputational damage and over-scoped remediation costs. An organisation that notifies too narrowly faces regulatory penalties for inadequate disclosure. The only way to scope accurately is to have the data provenance information available to produce a confident, specific answer.

Frequently asked questions

What is incident response?

Incident response is the structured process for detecting, containing, investigating, and recovering from security incidents. It encompasses preparation activities like playbook development and role definition, through to detection, containment, investigation, remediation, and post-incident review. For data incidents specifically, it includes the regulatory notification process triggered by data breaches.

What are the six phases of incident response?

Preparation (building playbooks, tools, and team readiness before incidents occur), identification (determining whether a genuine incident has occurred), containment (stopping active damage and isolating affected systems), investigation and scoping (understanding what happened and how far the incident has propagated), remediation and recovery (removing the threat and restoring normal operations), and post-incident review (documenting lessons learned and improving the programme).

What is MTTD and MTTR in incident response?

MTTD (Mean Time to Detect) measures the time from incident start to detection. MTTR (Mean Time to Respond) measures the time from detection to containment. Both are standard IR performance metrics. For data incidents, investigation confidence, the certainty that the scoped impact is complete rather than partial, is an equally important metric that neither MTTD nor MTTR directly captures.

What is the difference between incident response and incident management?

Incident management is the broader operational discipline that governs how all types of incidents (including IT service disruptions, not just security events) are tracked, communicated, and resolved. Incident response is specifically the security-focused process for handling threats to systems, data, and business continuity. The two overlap in larger organisations but are typically run by different teams with different tools and objectives.

What tools are used in incident response?

Core IR tooling includes SIEM for event aggregation and correlation, EDR for endpoint telemetry and containment, SOAR for workflow automation and playbook execution, forensic tools for evidence collection and analysis, and ticketing systems for case management. For data incidents specifically, data lineage tracking, classification systems, and database activity monitoring (DAM) provide the investigation context that general IR tools don't natively supply.

Published May 1, 2026