Data exfiltration in 2026 looks fundamentally different from what most security teams were trained to watch for. According to CyberArk’s 2025 Identity Security Landscape Report, which surveyed 2,600 cybersecurity decision-makers across twenty countries, machine identities now outnumber human users by 82 to 1 in the average enterprise.
Service accounts, API keys, OAuth-connected applications, and AI agents now account for the vast majority of activity inside corporate systems, and most of them operate with broader access, longer-lived credentials, and far less oversight than any human employee.
Classic exfiltration paths, such as attachments, uploads, and removable media, have not gone away. But alongside them, a newer category of identity has quietly become the largest unwatched surface in most enterprises, and a growing share of data movement now happens through it.
The Rise of Non-Human Identities in Modern Data Exfiltration
The category sounds abstract until you look at what actually lives inside a modern enterprise. A Salesforce OAuth integration set up in 2022 to sync data with a marketing tool. A reporting service account that pulls from three production databases every night. A vendor’s API token with read access to a customer table. A CI/CD bot with deploy permissions across the cloud environment. An AI agent built last quarter to summarize support tickets, with access to the entire ticketing system.
Each of these is a non-human identity. Most enterprises have thousands. Some have hundreds of thousands.
They were not created as part of a security strategy. They were created one at a time, by engineers who needed an integration, by product teams shipping an AI feature, by vendor onboarding flows that issued a token automatically. The volume came from a thousand small decisions that nobody coordinated.
They share three properties that change the exfiltration picture entirely. Their credentials are long-lived and sit outside the quarterly review cycle applied to human accounts. Their access is rarely audited. And they can act at machine speed, executing thousands of operations in the time it takes a human to read a single email.
When data leaves through one of them, it leaves through a channel that was already approved, with credentials that are still valid, in a pattern that looks like routine work.
Why Traditional Data Exfiltration Detection Tools Fall Short
Most of the security stack a modern enterprise runs was designed around a different assumption: that the entity moving data was a person.
Posture management answers where the data lives. Access controls answer who is allowed to reach it. Identity tools answer which account is authenticated. Together, they cover the question of permission. They do not cover the question of behavior.
A data security posture management (DSPM) scan run nightly will confirm that a customer table is still in the right database, classified correctly, and accessible only to authorized roles. It will not flag that a service account with one of those authorized roles read the entire table at 3 AM on a Tuesday and sent the contents to an integration endpoint that was approved in 2023. A data loss prevention (DLP) gateway will block an email that carries a credit card number in its body or in an attachment. It will not see the same data leaving through an authorized API call from a service account whose behavior has never been baselined.
These tools were not designed to fail. They were designed for a question that has since changed. The question used to be whether a person was allowed to access certain data. The question now is whether an identity with valid access is doing something it should not.
Why This Has Not Been Fixed Yet
The instinct, reading the above, is to ask why nobody has solved this. The answer is structural.
Revoking a machine identity carries real production risk. Service accounts and API tokens are often the connective tissue between systems that the business depends on every minute. Pulling the wrong one can break a billing pipeline, a customer onboarding flow, or a nightly report that an executive reads on Monday morning. Security teams know this, so they err on the side of leaving identities in place.
Most identity governance teams are also sized for the human population. They were built when the ratio was closer to 1 to 1. At 82 to 1, the same team is now expected to manage dozens of times the surface area with no proportional increase in headcount.
And there is no equivalent of an HR process for machines. When an employee leaves, HR triggers an offboarding workflow that revokes their access automatically. When a project ends, a vendor relationship terminates, or an engineer who set up an integration moves to another team, no system fires. The identity stays. The credentials stay valid. The access remains.
Why AI Data Exfiltration Is Making the Gap Wider
The pressure to ship new features and integrations is consistently higher than the pressure to govern access. Permissions are being granted faster than they are being reviewed. So the gap between the identities that exist inside an enterprise and the identities the security team actually has visibility into gets larger every quarter, not smaller.
AI adoption is compounding the problem. Every new agent deployed creates a new non-human identity, and AI data exfiltration has become one of the fastest-growing categories of data movement that most security teams cannot see. The ratio, already 82 to 1, is only going up.
How to Prevent Data Exfiltration for Non-Human Identities
The way to prevent data exfiltration for non-human identities is to watch what the identity does after it authenticates, not just whether it was allowed to authenticate. Permissions are checked once. Data moves continuously. The protection has to follow the activity, not stop at the point of authentication.
For a security team starting on this today, three concrete capabilities matter more than the rest.
The first is full inventory. Every non-human identity in the environment has to be visible, including the ones nobody on the current team created. Most enterprises do not have this list, and producing it manually is the kind of work that takes months and is out of date the day it is finished. Matters.AI discovers non-human identities continuously across cloud, SaaS, on-prem, and endpoint environments, and keeps the inventory current as new ones are created.
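To make the idea of an inventory concrete, here is a minimal Python sketch of what one inventory record and a basic review pass might look like. Everything in it is an assumption made for illustration: the field names, the identity names, and the 365-day and 90-day thresholds are hypothetical, and it does not describe how any particular product stores or evaluates identities.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class NonHumanIdentity:
    """One row of a hypothetical non-human identity inventory."""
    name: str
    kind: str                    # e.g. "service_account", "oauth_app", "api_token", "ai_agent"
    owner: Optional[str]         # accountable person or team; None if nobody is known
    credential_issued: date      # when the current credential was created
    last_used: Optional[date]    # None if no activity has ever been recorded


def review_flags(identity: NonHumanIdentity, today: date,
                 max_credential_age_days: int = 365,
                 max_idle_days: int = 90) -> List[str]:
    """Return the reasons this identity deserves a human look (thresholds are assumptions)."""
    flags = []
    if identity.owner is None:
        flags.append("no owner")
    if (today - identity.credential_issued).days > max_credential_age_days:
        flags.append("long-lived credential")
    if identity.last_used is None or (today - identity.last_used).days > max_idle_days:
        flags.append("idle")
    return flags


# Hypothetical inventory entries, loosely modeled on the examples earlier in this post.
today = date(2026, 1, 15)
inventory = [
    NonHumanIdentity("salesforce-marketing-sync", "oauth_app", None,
                     date(2022, 3, 1), date(2026, 1, 14)),
    NonHumanIdentity("ticket-summarizer", "ai_agent", "support-eng",
                     date(2025, 11, 1), date(2026, 1, 15)),
]

for identity in inventory:
    print(identity.name, review_flags(identity, today))
```

The hard part in practice is populating and refreshing the list, not evaluating it. The sketch only shows why ownership, credential age, and last use are worth recording for every entry.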
The second is behavioral baselining. A reporting service account that pulls a hundred records every night at 2 AM has a recognizable pattern. The same account suddenly pulling a million records at 11 PM is a behavioral event, even though the permission check passes. Matters.AI uses user and entity behavior analytics to build and maintain these baselines automatically for every identity in the environment. The AI does the work that would otherwise require a dedicated team setting thresholds one identity at a time.
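A baseline of this kind can be sketched in a few lines of Python. The example below is a simplified illustration, not a description of any product's analytics: it compares a new observation against the identity's own history using the median and the median absolute deviation, and checks the hour of activity on a 24-hour clock. The function names, the threshold of 6, and the sample numbers are all assumptions.

```python
from statistics import median
from typing import Sequence


def volume_is_anomalous(history: Sequence[float], observed: float,
                        threshold: float = 6.0) -> bool:
    """Flag `observed` if it sits far outside this identity's own history.

    Uses the median and median absolute deviation (MAD), which are less
    distorted by a single odd night than the mean and standard deviation.
    """
    if len(history) < 5:
        return False  # too little history to judge either way
    med = median(history)
    mad = median(abs(x - med) for x in history)
    # Floor the scale so a very steady history does not make tiny changes look extreme.
    scale = max(mad, 0.05 * med, 1.0)
    return abs(observed - med) / scale > threshold


def hour_is_unusual(history_hours: Sequence[int], observed_hour: int,
                    tolerance: int = 1) -> bool:
    """True if `observed_hour` is more than `tolerance` hours from every past run."""
    def clock_distance(a: int, b: int) -> int:
        return min((a - b) % 24, (b - a) % 24)
    return all(clock_distance(observed_hour, h) > tolerance for h in history_hours)


# Hypothetical history: about a hundred records a night, always at 2 AM.
nightly_reads = [100, 98, 103, 101, 99, 100, 102]
nightly_hours = [2, 2, 2, 2, 2, 2, 2]

print(volume_is_anomalous(nightly_reads, 104))        # False: within normal variation
print(volume_is_anomalous(nightly_reads, 1_000_000))  # True: far outside the baseline
print(hour_is_unusual(nightly_hours, 23))             # True: 11 PM is three hours from 2 AM
```

Note that neither check looks at permissions. Both the routine pull and the million-record pull would pass the same authorization check, which is why the baseline has to be kept per identity.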
The third is data movement visibility. The fact that an identity was allowed to read a database is less useful than the fact that the identity just exported the entire customer table to an external endpoint. The first is a permission check. The second is the actual exfiltration. Matters.AI watches the data itself, across every channel it moves through, and connects each movement event back to the identity that triggered it.
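The difference between a permission check and a movement record can be shown with a small Python sketch. It assumes a hypothetical event shape (identity, source, destination host, record count) and a hypothetical internal domain suffix, `.corp.example`; real telemetry will look different, and this is not a description of how any specific platform collects or correlates events.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class MovementEvent:
    """One observed transfer of data, tied to the identity that triggered it."""
    identity: str
    source: str
    destination_host: str
    records: int


# Assumption for illustration: hosts under this suffix count as internal.
INTERNAL_SUFFIXES = (".corp.example",)


def is_external(host: str) -> bool:
    return not host.endswith(INTERNAL_SUFFIXES)


def external_movement_by_identity(
    events: Iterable[MovementEvent],
) -> Dict[Tuple[str, str, str], int]:
    """Total records sent outside, keyed by (identity, source, destination host)."""
    totals: Dict[Tuple[str, str, str], int] = {}
    for event in events:
        if is_external(event.destination_host):
            key = (event.identity, event.source, event.destination_host)
            totals[key] = totals.get(key, 0) + event.records
    return totals


# Hypothetical events. Every one of them would have passed a permission check.
events = [
    MovementEvent("reporting-svc", "customers", "warehouse.corp.example", 100),
    MovementEvent("vendor-token", "customers", "api.vendor.example", 5000),
    MovementEvent("vendor-token", "customers", "api.vendor.example", 7000),
]

for key, total in external_movement_by_identity(events).items():
    print(key, total)
```

The output ties a volume of data to a specific identity, source, and destination, which is the information an investigator needs and which an access log alone does not provide.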
Watching the data, not just the identities allowed to touch it, is the difference between seeing what is happening in time to act and reconstructing what happened weeks later. That is what the Matters.AI platform was built to do.
See Your Own Non-Human Identity Inventory
Most enterprises do not know how many non-human identities are operating inside their environment, what those identities have access to, or what they are doing with that access right now. Matters.AI can show you this picture for your own environment in under an hour, not in three months.
What Is Coming Next
Non-human identities are one of several paths through which data leaves enterprises today. Others look nothing like an AI agent making API calls or a service account moving records between systems, and each requires a different approach to close.
The next post in this series will cover one of those paths.




