Richard Noguera, CISO at Gap, Inc.
From operations to stores to ecommerce, our digital strategy is transforming our business. And security is foundational to our strategy. We are continuously looking at the latest techniques and technologies for rapid threat detection and response. Our partnership with the Awake team has allowed us to provide our feedback while engaging with world-class investigators and security professionals to help design and build their solution—a truly refreshing approach.
Awake in a nutshell
Answering questions that others cannot
Awake spans the gap between the low-level data analysts are forced to work with today and the high-level concepts they actually need to focus on. The Security Knowledge Graph data model shows, for example, when devices arrive and leave and how they move around the network, and constantly gathers data to infer their characteristics.
Awake also enables the analyst to move fluidly between low-level data and the high-level view of the related entities. If given an alert containing an IP address, EntityIQ allows the analyst to immediately identify the device and any notable behaviors it exhibited at the time of the alert. The analyst can also dive down and see detailed timelines of these behaviors with the actual network transactions using ActivityIQ. Importantly, Awake allows searches using any combination of the entity data from the Security Knowledge Graph and the low-level raw data points.
To give a concrete example, suppose an analyst has intelligence that devices running a particular operating system version are being targeted by a watering hole website. In Awake, a single query can produce, in seconds, a list of just those devices that run that OS version and have also visited the website. To do this, the system filters devices by OS version, then collates all historical network sessions from these devices that involved the website. Awake accounts for the fact that each such device may have had many different IP addresses over time, and that these IP addresses may also have been used by other, irrelevant devices at other times.
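The logic of this example query can be illustrated with a minimal Python sketch. It is not Awake's implementation or query language; the `Device`, `IpLease`, and `Session` types, the `devices_visiting` function, and all sample values are hypothetical names chosen for illustration. The sketch assumes the system already knows which device held which IP address during which time interval, and uses that to attribute each session to the right device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    device_id: str
    os_version: str


@dataclass(frozen=True)
class IpLease:
    """Hypothetical record: a device held an IP during [start, end)."""
    device_id: str
    ip: str
    start: int
    end: int


@dataclass(frozen=True)
class Session:
    src_ip: str
    timestamp: int
    host: str


def devices_visiting(devices, leases, sessions, os_version, host):
    """Return IDs of devices running os_version that contacted host."""
    # Step 1: filter devices by OS version.
    targets = {d.device_id for d in devices if d.os_version == os_version}
    hits = set()
    # Step 2: collate sessions to the host, attributing each session to
    # whichever device held the source IP at the time of the session.
    for s in sessions:
        if s.host != host:
            continue
        for lease in leases:
            if (lease.ip == s.src_ip
                    and lease.start <= s.timestamp < lease.end
                    and lease.device_id in targets):
                hits.add(lease.device_id)
    return sorted(hits)


devices = [
    Device("laptop-a", "10.12.3"),
    Device("laptop-b", "10.12.3"),
    Device("phone-c", "9.0"),
]
leases = [
    IpLease("laptop-a", "10.0.0.5", 0, 100),
    IpLease("phone-c", "10.0.0.5", 100, 200),  # same IP, later, other device
    IpLease("laptop-a", "10.0.0.9", 100, 200),
    IpLease("laptop-b", "10.0.0.7", 0, 200),
]
sessions = [
    Session("10.0.0.5", 150, "bad.example"),   # phone-c held this IP then
    Session("10.0.0.9", 160, "bad.example"),   # laptop-a
    Session("10.0.0.7", 50, "news.example"),   # laptop-b, unrelated site
]

result = devices_visiting(devices, leases, sessions, "10.12.3", "bad.example")
print(result)
```

Note that the session from `10.0.0.5` is not attributed to `laptop-a`, even though that device once held the address: at the time of the session the address belonged to a device running a different OS version. A naive match on IP address alone would have produced a false positive here.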
Awake Security™: Under the Hood
The Awake Advanced Security Analytics Solution has to ingest raw data at high speeds, extract signal from that data to identify and track the entities in the environment to build the Security Knowledge Graph, then store it all in a way that can be queried flexibly and quickly by analysts. Additionally, it must support analytics like EntityIQ and ActivityIQ that provide deeper insight into the entities’ behavior. The input data can arrive at volumes that exceed traditional solutions like SIEM by an order of magnitude.
Awake therefore developed a custom analytics stack, based on recent innovations in networking, machine learning and data science, that comprises a few key components:
In real time, this component processes incoming data, identifying and tracking entities. Thus, this builds the foundation of the Security Knowledge Graph. In addition, it performs pre-correlation, linking each data point with its associated entities, as it is ingested.
Awake’s continually running analytics (EntityIQ™ and ActivityIQ™ in our current version) derive views that integrate graph and pre-correlated bulk data by applying techniques from the fields of knowledge discovery and scalable unsupervised machine learning. Because Awake pre-correlates data with entities as it is ingested into the system, it is possible to build analytics, like EntityIQ and ActivityIQ, that look at all the data associated with given entities to draw broader or even organization-wide conclusions, which previous approaches could not achieve.
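The idea of pre-correlation at ingest time can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not a description of Awake's internals: the `EntityTracker` class, its methods, and the record fields are all hypothetical. The sketch assumes the tracker learns IP-to-device assignments as they happen and tags every incoming record with the device that holds the source IP at that moment.

```python
from collections import defaultdict


class EntityTracker:
    """Toy ingest pipeline that links each record to an entity on arrival."""

    def __init__(self):
        self.ip_owner = {}                    # ip -> device currently holding it
        self.by_device = defaultdict(list)    # device -> its tagged records

    def observe_assignment(self, ip, device_id):
        # Hypothetical hook: called when a device is seen taking an IP.
        self.ip_owner[ip] = device_id

    def ingest(self, record):
        # Resolve the entity now, while the IP-to-device mapping is current,
        # instead of reconstructing it later at query time.
        device_id = self.ip_owner.get(record["src_ip"], "unknown")
        tagged = {**record, "device_id": device_id}
        self.by_device[device_id].append(tagged)
        return tagged


tracker = EntityTracker()
tracker.observe_assignment("10.0.0.5", "laptop-a")
first = tracker.ingest({"src_ip": "10.0.0.5", "host": "a.example"})

# The same IP is later reassigned to a different device.
tracker.observe_assignment("10.0.0.5", "phone-c")
second = tracker.ingest({"src_ip": "10.0.0.5", "host": "b.example"})

print(first["device_id"], second["device_id"])
```

Because each record carries its entity from the moment it is stored, retrieving everything a device did becomes a direct lookup (`tracker.by_device["laptop-a"]`) rather than a join over IP addresses and time ranges.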
Existing solutions cannot integrate graph and structured sources, limiting the queries analysts can formulate. These solutions also only perform correlation among data points and the relevant entities at query time, which makes the queries slow and sometimes requires the analyst to create individual sub-queries. The entire process can take hours for queries that return in seconds with Awake.
Awake can do this, supporting a full range of queries with interactive response times, because of the custom-built analytics stack described above. The stack is tuned for high performance on a single node, and its architecture also supports scale-out across multiple nodes to accommodate future growth.