Some of you might have heard that I gave a talk on Cisco IOS security at the 25C3 this year. The talk was unique for me in many ways, starting with the fact that it covered content going all the way back to the beginnings of Phenoelit up to material that was developed within Recurity Labs. It could be said that the talk was a nexus of research efforts in different areas of my life.
The second unique aspect was the sheer amount of stuff to cover, which prevented more in-depth reflections on some of the issues. This begins with the question of who would actually take over Cisco routers. The short answer is of course: whoever can. But that needs to be taken apart in more detail. Let's focus on attacks that directly apply to the device in question and ignore for now that the easiest way to take over an entire network infrastructure is to attack the unpatched Sun servers running in the Network Operations Center.
Consider successful exploits a question of development cost. An exploit is in that respect no different from any other software: you find someone you think can actually pull it off, present your requirements and have them develop it for you. In almost all cases, this isn't going to be free, so they will give you a price tag for the work, which in most cases is a linear function of the cost of a work hour for them. This implies that the better they are, the fewer hours they will need to develop the exploit for you, which makes it cheaper. This is what made exploits against Windows desktops so cheap that attackers mostly relied on them for gaining access to networks from what the network owners considered the inside (an outdated but still widespread way of looking at it).
Our research concerning Cisco IOS security was always based on the assumption that there are entities in this world that have reached a reasonable development cost level for IOS exploits. But the publicly available knowledge on how to write IOS exploits didn't fit the bill, as the known techniques required jumping to a memory address that is specific to the IOS image running on the target. Assuming you don't know what image is running on that machine, you could still argue that the exploit writer included a list of all possible image-dependent addresses in the exploit and would try them out, one at a time. This would cause the router to reboot every time a wrong guess was tried.
In the presentation, I said that there are about 100,000 different IOS images in use. This is a very debatable number, as about 15,000 are supported by Cisco at any given time and good network administrators will only run one or two IOS images in their entire network, often investing several months to figure out which exact image they want. However, when we fire up the Cisco Feature Navigator and ask it for all IOS images that support IP Routing, which should be all of them, we get: “Showing 1-50 of 280715 results” at the bottom of the page. Wow, what a number. On page 5615 of the result listing (thank god they have a direct jump feature!), we see that this covers everything from IOS 12.4 down to 11.2. Therefore, the number doesn't take very old networks into account, in which you do occasionally see images below 11.2 running.
It should be clear that trying them all out is not an option, especially considering reboot times of 30 seconds and more per attempt. Your exploit would constantly reboot a router for 2339 hours or 97 days. And that doesn't take into account that you need to get and disassemble them all; estimated time with IDA: 5848 days or 16 years.
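The figures above can be reproduced with a few lines of arithmetic. The sketch below is only a back-of-the-envelope check: the image count and the 30 second reboot time come from the text, while the 30 minutes of IDA time per image is an assumption, chosen because it is what the 5848-day estimate implies. All function and constant names are made up for this illustration.

```python
import math

IMAGES = 280_715        # result count reported by the Feature Navigator
RESULTS_PER_PAGE = 50   # "Showing 1-50 of ..."
REBOOT_SECONDS = 30     # lower bound for one failed attempt
DISASM_MINUTES = 30     # ASSUMPTION: per-image IDA time implied by 5848 days


def last_result_page(results=IMAGES, per_page=RESULTS_PER_PAGE):
    """Number of the last page in the result listing."""
    return math.ceil(results / per_page)


def reboot_hours(images=IMAGES, seconds=REBOOT_SECONDS):
    """Hours spent rebooting if every image-specific address is tried once."""
    return images * seconds / 3600


def disassembly_days(images=IMAGES, minutes=DISASM_MINUTES):
    """Days of disassembly time if every image is processed back to back."""
    return images * minutes / (60 * 24)


if __name__ == "__main__":
    hours = reboot_hours()
    days = disassembly_days()
    print(f"last result page: {last_result_page()}")
    print(f"rebooting:        {hours:.0f} hours, about {hours / 24:.0f} days")
    print(f"disassembly:      {days:.0f} days, about {days / 365:.0f} years")
```

Both totals scale linearly with the number of candidates, which is why shrinking the list of things you have to know about the target matters far more than shaving seconds off a single attempt.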
Therefore, there is either no one exploiting IOS devices or they have found a better way to do it. Our Cisco Incident Response tool was developed in the hopes of finding people attacking IOS devices, successfully or not. But then again, it's hard to write a detection for the unknown, so we also had to look into finding ways to get stable code execution. The method presented at the 25C3 (and documented here, feel free to post questions in the discussion section) only reduces the number of things you have to know about your target; it doesn't eradicate the problem in general. Now, we only need to know the ROMMON version, and there are far fewer ROMMONs than IOS images out there.
For smaller machines, such as 2600s, updating ROMMON did not seem to be supported and the version depends on the shipping date. However, after closer inspection, here comes an erratum: Cisco does offer 6 updated ROMMONs for 2600 routers. For larger machines, e.g. 7206s, there are about 36 different versions known. That's a few orders of magnitude smaller than 280715. But it is also still far from the ultimate truth, as you still need to know and have that ROMMON, and you need to know a few things about the box, most importantly the hardware series. Some people like to include the hardware series in their router's DNS records or name the PTR records of the IP addresses bound to a router's interface after the interface itself, which allows an attacker to guess what type of metal it is.
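To illustrate how much such DNS names can give away, here is a minimal sketch that looks for a platform number in a PTR name. The naming scheme, the example host names and the `guess_series` helper are all hypothetical; real networks name their devices in many different ways, and the list of known platforms only contains the two models mentioned above.

```python
import re

# Platforms mentioned in the text; a real list would be much longer.
KNOWN_PLATFORMS = ("2600", "7206")

# A run of exactly four digits that is not part of a longer number.
FOUR_DIGITS = re.compile(r"(?<!\d)\d{4}(?!\d)")


def guess_series(ptr_name):
    """Return the first known platform number found in a DNS name, or None."""
    for label in ptr_name.lower().split("."):
        for token in FOUR_DIGITS.findall(label):
            if token in KNOWN_PLATFORMS:
                return token
    return None


if __name__ == "__main__":
    # Hypothetical PTR names following an "<interface>.<device>.<domain>" scheme.
    for name in ("fa0-0.c7206-core1.example.net",
                 "serial1-0.edge-2600.example.net",
                 "www.example.net"):
        print(name, "->", guess_series(name))
```

A hit is only a hint, not proof, but it is exactly the kind of hint that narrows down which ROMMON versions are worth considering.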
Knowing the hardware platform is actually more important for the first and second stage shellcode than it is for getting stable code execution, as the same ROMMON seems to be applicable to a number of subtypes of routers, while one subtype may have memory wired into different addresses than the other. But being lucky is also a valid option, which is what happened when we selected the memory area for a direct write: I assumed the memory at 0x80000000 on a 2600 is used for global IOS variable pointers, which is incorrect. So, erratum #2: I was made aware that this is of course where the exception vectors live once the MMU is turned on. Accordingly, this is a very good place to store two instructions.
There is still a lot to do and research when it comes to Cisco IOS and security. But the stable, image-independent code execution at least finally allows us to draw better assumptions about the attacks we should be looking for. It shows nicely that, even with CIR, we should not try to detect the exploitation while it happens, but focus on the shellcode functionality and the footprints it leaves. And the IP options vulnerability is a perfect example of why critical infrastructure should always dump the core files onto its own FLASH device, as dumping core over FTP doesn't really work too well when your “IP Input” process just got popped.