By David Pearson
In the original post of this series, I set the stage for why remote access capabilities should be analyzed more carefully. Given that these tools make the network perimeter more permeable than security teams may realize, it is important to know who is initiating the connection and where the communicating devices reside, especially if this isn’t the approved remote access solution. In our first foray into this subject, we learned about the types of techniques LogMeIn uses and how they can circumvent security controls. Now it’s time to learn what TeamViewer has up its sleeve.
TeamViewer has been part of the threat actor’s arsenal for years. In 2013, a report came out that discussed how a surveillance tool named TeamSpy had been using TeamViewer for roughly a decade. Since then, the Shade ransomware has taken advantage of TeamSpy, and more recently other spam campaigns have been using it. Additionally, a piece of malware named Skywyder, which I found on the Dark Web sometime in mid-2016, leverages TeamViewer. Each of these occurrences takes advantage of the fact that TeamViewer is a common remote support tool and can aggressively avoid attempts to detect or block it.
By observing the DNS names, we’re able to see that TeamViewer has easily identifiable domains and subdomains, and that its infrastructure spans several disparate IP address ranges.
In fact, at the time of this writing, there are 16 master servers (master1 – master16), hundreds of normal servers across numerous IP ranges, as well as a number of other publicly-reachable infrastructure devices. Some of these IP addresses change over time, while others seem to be mostly static. However, it’s important to know that most of these servers are not actually owned by TeamViewer.
When trying to keep track of what communications exist with these infrastructure devices, it’s clear that there is a lot going on.
There are several connections occurring simultaneously to seven different servers within the infrastructure. For example, Figure 3 shows that there are communications over plain TCP, as well as TLS.
As some of the earlier conversations continue or die off, yet more spin up.
There are some larger packets visible in this data, but–unlike LogMeIn–the data is unreadable.
Interestingly, much later in the capture we find communications occurring over UDP as well. However, again, there is no readable information.
At this point, we know very little, except that some sort of communication is occurring with TeamViewer infrastructure devices. In this capture, I also connected to a colleague’s system across the Internet, so where is that data?
First, a number of sessions (both over UDP and TCP) communicate on port 5938.
I began paying attention to communication occurring near these flows that I couldn’t explain away. It was around then that I noticed a UDP session that was attempting to communicate with an RFC1918 private IP address that did not exist on my local private network!
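The check that caught my eye here can be sketched in a few lines of Python. This is only an illustration, not part of any tool used in the analysis: the function names and the `local_networks` parameter are my own assumptions, and the local subnet in the usage note is a made-up example.

```python
import ipaddress
from typing import Iterable

# The three private ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address: str) -> bool:
    """Return True if the address falls inside one of the RFC 1918 ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)


def is_foreign_private(address: str, local_networks: Iterable[str]) -> bool:
    """Return True for a private address that is outside every known local subnet."""
    if not is_rfc1918(address):
        return False
    ip = ipaddress.ip_address(address)
    return not any(ip in ipaddress.ip_network(net) for net in local_networks)
```

For example, with a hypothetical local subnet of `192.168.1.0/24`, `is_foreign_private("192.168.50.7", ["192.168.1.0/24"])` is true, while a host inside that subnet or a public address is not flagged.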
Looking closer, I saw that there was yet another foreign (to my network) private IP address being contacted, followed by a public IP address. Interestingly, each of these three packets was identical in size, and all three used the same source and destination ports!
Upon a closer look at the payloads of each packet, I noticed that each had the destination IP address (as a string) and destination port (in little-endian form)–plus a handful of leading bytes–as the only non-zero content.
These leading bytes (0x03172447500005) struck me as odd, especially because a subset of them produced a human-readable string ($GP). Whenever I find fragments of data that seem to be repeated for no clearly discernible reason, I start to explore. By searching for $GP, I came across some of the earlier sessions, also via UDP, that were related to TeamViewer (see the payload in Figure 6). In those cases, the matching substring is shorter: only 0x031724475000.
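To make the observation concrete, here is a small Python sketch. It shows where the `$GP` string comes from and checks whether a payload echoes its own destination. I have not documented the exact field offsets, so the check deliberately searches the whole payload rather than assuming a layout. The function names are my own, and the payload in the usage note is hypothetical.

```python
import struct

# Leading bytes quoted in the text above.
LEADING_BYTES = bytes.fromhex("03172447500005")


def printable_run(data: bytes) -> str:
    """Keep only the printable ASCII characters found in data."""
    return "".join(chr(b) for b in data if 0x20 <= b < 0x7F)


def echoes_destination(payload: bytes, dst_ip: str, dst_port: int) -> bool:
    """Heuristic: does the payload carry dst_ip as text and dst_port as a
    little-endian 16-bit value? Offsets are not assumed, so this is a
    substring search and can produce false positives."""
    ip_present = dst_ip.encode("ascii") in payload
    port_present = struct.pack("<H", dst_port) in payload
    return ip_present and port_present
```

Calling `printable_run(LEADING_BYTES)` yields `$GP`, since 0x24, 0x47, and 0x50 are the only printable bytes. A made-up payload such as `LEADING_BYTES + b"192.168.50.7" + bytes(4) + struct.pack("<H", 51234)` would pass `echoes_destination` for that address and port.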
At this point, I went back to revisit the unintelligible TCP data sent to the TeamViewer infrastructure devices over TCP port 5938. As I pored over that data, I couldn’t help but notice that many data frames had the TCP flags PSH and ACK set.
With a filter in place to only show these packets, I realized that there were far fewer packets to explore. Starting from the top of the much-smaller data set, the first packet I came across had–you guessed it–a partial match!
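A filter along these lines can be sketched in Python. The flag bit values are the standard TCP ones; the packet representation (dictionaries with `src_port`, `dst_port`, and `flags` keys) is purely an assumption for illustration and does not correspond to any particular capture library.

```python
TCP_FLAG_PSH = 0x08
TCP_FLAG_ACK = 0x10
CLASSIC_FLAGS_MASK = 0x3F  # FIN, SYN, RST, PSH, ACK, URG
TEAMVIEWER_PORT = 5938     # port named in the text


def is_psh_ack(flags: int) -> bool:
    """True when PSH and ACK are the only classic TCP flags set."""
    return flags & CLASSIC_FLAGS_MASK == TCP_FLAG_PSH | TCP_FLAG_ACK


def select_data_frames(packets):
    """Keep PSH, ACK packets that involve the TeamViewer port."""
    return [
        p for p in packets
        if TEAMVIEWER_PORT in (p["src_port"], p["dst_port"])
        and is_psh_ack(p["flags"])
    ]
```

Masking with `0x3F` ignores the ECN-related bits, so a frame is still selected if those happen to be set alongside PSH and ACK.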
I continued to explore the packets and noticed that there were several beginning bytes possible for the payloads. However, 0x1724 and 0x1130 were by far the most common. Around this time I also discovered that some of these payloads did in fact contain readable data. There was certainly some form of structure here, and I had to figure it out.
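Tallying payloads by their first two bytes is a quick way to see which prefixes dominate. The sketch below only knows about the two prefixes named above; every other value is lumped into "other", and the function names are my own.

```python
from collections import Counter

# The two most common leading byte pairs mentioned in the text.
KNOWN_PREFIXES = {
    b"\x17\x24": "0x1724",
    b"\x11\x30": "0x1130",
}


def classify_prefix(payload: bytes) -> str:
    """Label a payload by its first two bytes."""
    return KNOWN_PREFIXES.get(payload[:2], "other")


def prefix_histogram(payloads) -> Counter:
    """Count how often each prefix label appears across payloads."""
    return Counter(classify_prefix(p) for p in payloads)
```

Running `prefix_histogram` over the PSH, ACK payloads from a capture gives a rough picture of how the traffic splits between the two families before any deeper parsing.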
In this case, searching for the above hex bytes and the name TeamViewer hit pay dirt.
The article above, by an Optiv engineer named Braden Thomas, provided an immense amount of information, much of which complemented what I had discovered. In addition, Braden had started to dig into the application itself, and had created a basic TeamViewer parser in Lua. Though his parser was written nearly four years prior to my beginning to explore the protocol, it turns out that the fundamentals of what he had created still worked. TeamViewer still uses a simple 1-bit rotation scheme to send and receive its data.
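For readers unfamiliar with bit rotation, here is what a 1-bit rotation looks like in Python. This is a generic illustration and not a reimplementation of Braden’s parser: I am assuming the rotation is applied to each byte independently, and I am not asserting which direction encodes and which decodes, so both directions are provided.

```python
def rotate_left_1(byte: int) -> int:
    """Rotate an 8-bit value left by one bit."""
    return ((byte << 1) & 0xFF) | (byte >> 7)


def rotate_right_1(byte: int) -> int:
    """Rotate an 8-bit value right by one bit."""
    return (byte >> 1) | ((byte & 1) << 7)


def rotate_bytes(data: bytes, direction: str = "right") -> bytes:
    """Apply a per-byte 1-bit rotation in the given direction.

    Per-byte granularity and the default direction are assumptions
    made for this sketch.
    """
    if direction == "right":
        return bytes(rotate_right_1(b) for b in data)
    if direction == "left":
        return bytes(rotate_left_1(b) for b in data)
    raise ValueError("direction must be 'left' or 'right'")
```

The two directions are inverses of each other, so rotating a buffer left and then right returns the original bytes. When experimenting against a capture, trying both directions and looking for readable strings is a reasonable first step.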
Key TCP Messages
Adding this parser to Wireshark as a plugin provided me with a glimpse into the traffic that I had not yet been able to understand. However, most of the communications that I cared about were either unexplored or only slightly parsed. As I continued my work, I kept updating the parser to handle new capabilities. The messages discussed below are those relevant to how TeamViewer communicates through network perimeters to bypass security tools.
This command is used to identify whether or not the device using TeamViewer is able to reach out to a system within the TeamViewer infrastructure. It contains a nine-digit identifier that we’ll discuss more later.
This command, CMD_MASTERCOMMAND, is used by a device to tell the TeamViewer master servers that it is ready to communicate. It contains many fields, including OS and version information, client application, function, language, license type, and its nine-digit identifier.
This is the response to the CMD_MASTERCOMMAND, and contains a configuration. Within that configuration information is a list of server names and their corresponding IP addresses, identifiers, and source and destination port pairs. Note that this is where the device learns which servers to communicate with later (both with or without DNS).
This is another message that a device uses to identify itself to the infrastructure. Again we see the nine-digit identifier associated with the TeamViewer application, but we also see other fields that provide the non-master parts of the infrastructure with additional information about the device.
A device communicates with some of the TeamViewer infrastructure devices with this message. While again we see our own TeamViewer application ID, we also discover the ID of an infrastructure server here.
This message, CMD_REQUESTCONNECT, is sent by a device that wishes to connect to another device. In the contents of this packet, we see the actual TeamViewer ID of the other device, a connection ID (which can be used for reference in other parts of the communication), and sometimes the requesting IP address.
This message, CMD_CONNECTTOWAITINGTHREAD, is received by a device when a connection to it is requested. It will generally contain the sending device’s TeamViewer ID, any proxy IP address used by the TeamViewer infrastructure, and the session ID.
This message allows a user to initiate a disconnection from the TeamViewer communication. It contains your TeamViewer ID when sent.
What about UDP?
In the section above, we discussed many of the messages integral to communication between a device and the TeamViewer infrastructure via TCP. However, what about all of the UDP traffic that we saw earlier? Interestingly, this traffic begins with one final TCP message called CMD_UDPFLOWCONTROL. This message starts the process of setting up a connection between devices, especially those that are on different private networks. To do so, the TeamViewer servers provide several quite informative pieces of data.
Looking at the above, we are able to learn:
- Destination IP address of the other side (public and one or more private IP addresses)
- Destination ports for the different communications
Moreover, we see the attempted outreach to each of those public and private IP addresses on the aforementioned ports immediately afterwards. However, you may have noticed that the UDP packets (in blue above) are not labeled TEAMVIEWER for their protocol. The reason is that UDP messages created by TeamViewer insert additional null bytes at the beginning of the content, thereby changing the header.
Once I modified the parser to account for this different behavior, it became clear that TeamViewer can function over UDP. The CMD_UDPPING messages (above) are used to verify whether a message using this IP address and port combination can actually reach the destination IP address. Since in this case we’re not on the same network as the device we’re trying to reach, we only receive a response from the public IP address.
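One way a parser could tolerate a shifted header is to look for the familiar leading bytes within the first few bytes of the payload instead of only at offset zero. The Python sketch below is my own illustration of that idea, not the change made to the Lua parser: the choice of 0x1724 as the marker comes from the TCP analysis earlier, and the `MAX_SKIP` limit of 8 bytes is an arbitrary assumption.

```python
MAGIC = b"\x17\x24"  # most common leading bytes seen in the TCP payloads
MAX_SKIP = 8         # hypothetical limit on how many prefixed bytes to tolerate


def find_header_offset(payload: bytes, magic: bytes = MAGIC,
                       max_skip: int = MAX_SKIP):
    """Return the offset of magic if it starts within the first
    max_skip bytes of payload, otherwise None."""
    offset = payload.find(magic, 0, max_skip + len(magic))
    return offset if offset != -1 else None
```

A parser could then hand `payload[offset:]` to the same routine it uses for TCP. Keeping `max_skip` small limits accidental matches deeper in the payload.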
Interestingly, in my analysis of this communication over the course of several months, I’ve never seen the UDP version of TeamViewer use ports lower than 50000. I suspect this is to help avoid any firewall issues.
What We Know
In several messages, we are able to learn the nine-digit identifier that uniquely identifies an installation of TeamViewer on a device. In addition, we can also learn the session ID that corresponds to the current communication occurring between two devices. The clearest place to learn this information in one shot is either a CMD_REQUESTCONNECT or a CMD_CONNECTTOWAITINGTHREAD message.
It is also possible to determine whether the TeamViewer application is actually being used or is simply installed. There are several methods to do so, but the easiest is to identify whether the CMD_MASTERCOMMAND calls any functions other than “Login”. Note that later CMD_MASTERCOMMAND messages may be encrypted, thereby requiring additional methods for determining communication.
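As a rough sketch of that heuristic in Python, assume the parser emits each message as a dictionary with a `command` key and, when readable, a `function` key. That record shape is my assumption for illustration, and `ExampleFunction` in the usage note is a placeholder rather than a real TeamViewer function name.

```python
def non_login_functions(messages):
    """Collect CMD_MASTERCOMMAND function names other than "Login"."""
    found = set()
    for message in messages:
        if message.get("command") != "CMD_MASTERCOMMAND":
            continue
        # A missing function is treated as unreadable (for example, encrypted).
        function = message.get("function")
        if function is not None and function != "Login":
            found.add(function)
    return found


def appears_in_use(messages) -> bool:
    """True if any readable CMD_MASTERCOMMAND calls something besides Login."""
    return bool(non_login_functions(messages))
```

A device that only ever sends `{"command": "CMD_MASTERCOMMAND", "function": "Login"}` would be treated as installed but idle, while one that also sends a record with `"function": "ExampleFunction"` would be flagged as in use. Unreadable messages are skipped, so a false result here is not proof of inactivity.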
The CMD_MASTERCOMMAND message also provides several pieces of information that are helpful for fingerprinting a device. In fact, the information learned just in this message helped me to identify a covert VPN in one customer environment, and a support tool that ran on top of (a quite out of date version of) TeamViewer in another!
By analyzing the traffic, we’re also able to discover who originated a TeamViewer session. The sender of a CMD_REQUESTCONNECT message has this role, while the receiver of a CMD_CONNECTTOWAITINGTHREAD message is the recipient.
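That rule translates directly into code. In the Python sketch below, each parsed message is assumed to be a dictionary with `command`, `src`, and `dst` keys; that shape, the function name, and the addresses in the usage note are illustrative assumptions.

```python
def infer_roles(messages):
    """Map device addresses to 'initiator' or 'recipient' based on
    who sends CMD_REQUESTCONNECT and who receives
    CMD_CONNECTTOWAITINGTHREAD."""
    roles = {}
    for message in messages:
        command = message["command"]
        if command == "CMD_REQUESTCONNECT":
            roles[message["src"]] = "initiator"
        elif command == "CMD_CONNECTTOWAITINGTHREAD":
            roles[message["dst"]] = "recipient"
    return roles
```

Given one CMD_REQUESTCONNECT sent from `192.168.1.20` and one CMD_CONNECTTOWAITINGTHREAD delivered to `192.168.50.7`, the result marks the first device as the initiator and the second as the recipient. In practice each capture point usually sees only one side, so a single vantage point typically yields one of the two roles.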
Once the devices attempt to begin communication via UDP–which then avoids the TeamViewer infrastructure altogether–we are able to learn both the public and private IP addresses of both sides of the communication, as well as the ports over which they’ll communicate. Additionally, we are able to discover whether the devices are on a local network or are remote based on which communications are successful.
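The local-versus-remote inference can be sketched as follows in Python. The inputs are assumptions for illustration: `candidates` is the list of peer addresses learned from the servers, and `responders` is the set of addresses that actually answered a UDP ping. The addresses in the usage note are made up.

```python
import ipaddress

# The three private ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address: str) -> bool:
    """Return True if the address falls inside one of the RFC 1918 ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)


def classify_peer(candidates, responders) -> str:
    """Return 'local' if a private candidate answered, 'remote' if only
    public candidates answered, and 'unreachable' if none did."""
    answered = [a for a in candidates if a in responders]
    if not answered:
        return "unreachable"
    if any(is_rfc1918(a) for a in answered):
        return "local"
    return "remote"
```

With candidates `["192.168.50.7", "10.20.30.40", "198.51.100.10"]` and a reply only from `198.51.100.10`, the peer is classified as remote, which matches the capture described above where only the public address responded.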
The final piece of information we can learn is when some major action happens. For example, it is possible to lock a screen remotely, or to switch the direction in which control is occurring–within the same communication. These behaviors seem to correlate temporally with the observation of a CMD_CARRIERSWITCH message.
Without knowing that all of these communications are part of TeamViewer, it’s easy to see how they would have been dismissed as background noise. Additionally, without careful analysis, it would be nearly impossible to discover the meaning of the private IP addresses that appear within the communications, as no context to their relation to TeamViewer would be known.
Using Awake’s platform exposes this information in a way that is not only searchable, but that allows rich context about the users, devices, domains, and applications associated with the activity. It’s even possible to search for arbitrary values observed a certain number of times, rather than for some specific value itself:
By sharing this analysis, I hope to empower others to continue digging deeper into TeamViewer and its complex messages, as work like this is essential to discover methods used in the wild today (as well as those that modify this behavior in the future). Be sure to check back soon for the next installment.