
Data Exfiltration

“Data exfiltration” is the unauthorized movement of data. It can also be referred to as data theft, data extrusion, data exportation, data leakage, and data exfil.

What is Data Exfiltration?

Data exfiltration is a technique used by malicious actors to target, copy, and transfer sensitive data. It can be carried out remotely or manually, and it can be extremely difficult to detect because it often resembles business-justified (or “normal”) network traffic. Common targets include financial records, customer information, and intellectual property or trade secrets.

Unfortunately, an attacker does not need particularly advanced tools to infiltrate a network, exfiltrate data, and avoid getting caught. This is true for both advanced persistent threat (APT) groups and less sophisticated threat actors, and especially true for malicious insiders.

Detecting Data Exfiltration

Monitor legitimate business tools

Attackers increasingly rely on tools that already exist within the environment, such as remote access tools (RATs). While many RATs can be used legitimately, these tools are often designed to actively bypass network controls, obscuring which parties are communicating, when, and how. As a result, security teams must now detect malicious intent that blends in with business-justified activity, a task that is both tedious and challenging for most analysts. This ability to fly under the radar is attractive to malicious insiders and outside attackers alike who want to exfiltrate data.

Monitor encrypted traffic

While the network has historically been a valuable source of insight that enabled effective detection and response, it has become increasingly opaque as more of the data on the network is encrypted. For security teams, this means losing visibility into this powerful data source, just as attackers use techniques like encryption to evade traditional detection methods.

Knowing is half the battle when it comes to identifying and stopping data exfiltration attempts. When the bulk of traffic is encrypted, TLS fingerprinting can be used to identify which applications might be on the network. TLS fingerprinting uses the unencrypted metadata in TLS traffic to tell security teams what kind of application may be the source of that traffic. This information is extremely useful in a data exfiltration investigation. For instance, a large upload from a browser may not be alarming, but the same upload from a Python script might well be.
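To make the idea concrete, here is a minimal sketch of one widely used fingerprinting scheme, JA3, which hashes five fields of the unencrypted TLS ClientHello. The text above does not say which fingerprinting method is meant, so treating it as JA3 is an assumption. The field values, the `is_suspicious_upload` helper, and its 50 MB threshold are all hypothetical and only illustrate the browser-versus-script example.

```python
import hashlib

# GREASE values (RFC 8701) are placeholder numbers clients insert at random.
# JA3 ignores them so the same client always yields the same fingerprint.
GREASE = {0x0A0A + 0x1010 * i for i in range(16)}


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Build the JA3 input string from ClientHello fields (decimal values)."""
    def join(values):
        return "-".join(str(v) for v in values if v not in GREASE)

    return ",".join(
        [str(version), join(ciphers), join(extensions), join(curves), join(point_formats)]
    )


def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """Return the JA3 fingerprint: the MD5 hex digest of the JA3 string."""
    text = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(text.encode("ascii")).hexdigest()


def is_suspicious_upload(fingerprint, bytes_out, known_browsers, threshold=50_000_000):
    """Hypothetical rule: a large upload from a client that is not a known browser."""
    return bytes_out >= threshold and fingerprint not in known_browsers


# Made-up ClientHello fields for two different clients (771 is 0x0303, TLS 1.2).
browser = ja3_hash(771, [0x0A0A, 4865, 4866], [0, 10, 11], [29, 23], [0])
script = ja3_hash(771, [4866, 4867], [0, 11, 10], [29], [0])

known_browsers = {browser}
print(is_suspicious_upload(browser, 200_000_000, known_browsers))  # known browser
print(is_suspicious_upload(script, 200_000_000, known_browsers))   # unknown client
```

In practice the fields would be parsed from captured handshakes, and the set of known fingerprints would come from the organization's own observations rather than a hard-coded value.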

Know who has authorized access to data and monitor activity

When data is exfiltrated by an authorized employee, it can be even more difficult to detect than if an outsider were responsible. To get ahead of potential insider data exfiltration attempts, security teams should maintain a complete, real-time understanding of who is authorized to access sensitive information, then closely monitor those accounts for changes in behavior. While the amount of data being exfiltrated may seem small and inconsistent, the activity may actually be persistent and unique, which elevates its risk score and prompts a closer look.
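One way to picture "persistent and unique" activity is a score that ignores volume entirely and looks only at how often an account contacts a destination and how few other accounts do the same. The sketch below is an illustrative assumption, not a description of any product's scoring: the event tuple layout, the `low_and_slow_scores` name, and the persistence-times-uniqueness formula are all hypothetical.

```python
from collections import defaultdict


def low_and_slow_scores(events, window_days):
    """Score each (account, destination) pair between 0 and 1.

    events: iterable of (account, day, destination, bytes_out) tuples.
    persistence = share of days in the window with activity to the destination.
    uniqueness  = 1 when no other account uses the destination, 0 when all do.
    Volume (bytes_out) is deliberately ignored, so small transfers still count.
    """
    active_days = defaultdict(set)
    accounts_per_dest = defaultdict(set)
    all_accounts = set()

    for account, day, dest, _bytes_out in events:
        active_days[(account, dest)].add(day)
        accounts_per_dest[dest].add(account)
        all_accounts.add(account)

    other_accounts = max(len(all_accounts) - 1, 1)
    scores = {}
    for (account, dest), days in active_days.items():
        persistence = len(days) / window_days
        uniqueness = 1 - (len(accounts_per_dest[dest]) - 1) / other_accounts
        scores[(account, dest)] = round(persistence * uniqueness, 3)
    return scores


# Hypothetical five-day window: everyone uses the mail host once,
# but only alice sends a few kilobytes to files.example.net every day.
events = [(user, 1, "mail.example.com", 40_000) for user in ("alice", "bob", "carol")]
events += [("alice", day, "files.example.net", 2_000) for day in range(1, 6)]

scores = low_and_slow_scores(events, window_days=5)
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, score)
```

A real deployment would weight such a score alongside data classification and authorization records; the point here is only that a small daily transfer can rank above a larger one-off upload.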

Common Attack Techniques

Social engineering/phishing attacks

A common data exfiltration tactic is to use deceptive, manipulative social engineering techniques to trick someone into opening a malicious script which then infects a company’s network.

Often, phishing emails are designed to look as though they were sent by a high-ranking company executive. Even targeted campaigns are rarely isolated to a single user, so once an email is delivered, a small number of users will typically be affected.

Downloads/Uploads

Another common exfiltration technique is to download the targeted data from a secure device, then upload it to an external device, such as a laptop, smartphone, tablet, camera, or thumb drive.

Human error

As the recent Equifax investigative report detailed, human and procedural failures, such as failing to maintain appropriate security certificates, can also make it easier for data exfiltration to occur because proper protections may not be in place.

Detecting Data Exfiltration with Awake

Situational awareness

Awake can automatically discover the entities on the network and allow the security team to annotate these based on their data classification. These annotations can then be used to customize threat detection for each organization.

Entity tracking

Awake combines several AI techniques to autonomously discover, profile, and classify every device, user and application on any network and cluster similar entities over time via behavioral fingerprints. This, in turn, allows attribution of the suspect or malicious behaviors back to an entity as opposed to an ephemeral IP address.

Threat detection

The Awake Security Platform automatically detects attacker tactics, techniques and procedures (TTPs) such as data exfiltration. Awake hunts down and focuses your attention on the most consequential threats by:

  • Building up evidence of malicious intent for each entity, correlating all behaviors over time and ultimately identifying the “smoking gun”.
  • Going beyond alerts and visualizing the entire incident kill chain across entities, protocols, and time. Security professionals can look for sequences of events over weeks or months while mapping them to a known attacker kill chain or framework such as the MITRE ATT&CK matrix.
  • Allowing custom detection for your unique risks without needing threat hunters or data scientists.

Information turned into quick action

As with most security threats, early detection is key for risk management. This is especially true of data exfiltration, because it can represent the final chance for defenders to minimize the impact of a breach. Awake autonomously understands the behaviors and attributes of entities and monitors for changes. Malicious intent is detected by a data science approach that avoids temporal baselining and instead performs behavioral analytics based on an understanding of the entities involved, the behaviors of similar clustered entities, and the behaviors prevalent across the enterprise. This avoids the need to retrain the system (as is needed with first-generation ML solutions) when behaviors legitimately change, and it eliminates the false negatives that are prevalent when systems are already compromised before training occurs. All of this means the value is seen quickly as the Awake Security Platform is integrated into the rest of the security infrastructure.
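The contrast between temporal baselining and comparison against similar entities can be illustrated with a small peer-group check. This is a generic sketch using a modified z-score (median and median absolute deviation), not Awake's actual method, which the text does not describe in detail. The `peer_outliers` function, the cluster labels, the upload figures, and the 3.5 threshold are all assumptions made for illustration.

```python
from statistics import median


def peer_outliers(uploads_by_entity, cluster_of, threshold=3.5):
    """Flag entities that upload far more than the peers in their own cluster.

    Each entity is compared with similar entities observed at the same time,
    not with its own history, so no per-entity training period is required.
    """
    clusters = {}
    for entity, cluster in cluster_of.items():
        clusters.setdefault(cluster, []).append(entity)

    flagged = []
    for members in clusters.values():
        values = [uploads_by_entity[m] for m in members]
        med = median(values)
        mad = median([abs(v - med) for v in values])
        if mad == 0:
            continue  # all peers behave identically; nothing to compare against
        for m in members:
            # Modified z-score; 0.6745 scales the MAD to match a standard deviation.
            score = 0.6745 * (uploads_by_entity[m] - med) / mad
            if score > threshold:
                flagged.append(m)
    return sorted(flagged)


# Hypothetical daily upload volumes in megabytes.
uploads = {"p1": 10, "p2": 12, "p3": 11, "p4": 900, "b1": 5000, "b2": 5200, "b3": 4900}
clusters = {"p1": "printers", "p2": "printers", "p3": "printers", "p4": "printers",
            "b1": "build-servers", "b2": "build-servers", "b3": "build-servers"}

print(peer_outliers(uploads, clusters))
```

Note that the flagged printer uploads less than any build server does; it stands out only relative to its own peer group, which is the kind of signal a single enterprise-wide threshold would miss.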
