January 2009 Archives
Thu Jan 29 19:15:02 CET 2009
During a non-public security event, I saw a presentation by Olaf Kolkman about the new DNS server named Unbound. When he mentioned that the whole thing is written in C for performance reasons, I replied that we should simply stop developing production software in languages that produce unmanaged code. We got into quite some discussion with the whole audience after that, with many people stating that there isn't an alternative and everything else is just too slow. I'm not buying this for many obvious reasons, but that's for another post.
To put our abilities where my mouth is, we ended up performing a short source code audit for the Unbound developers. After all, Unbound is an effort to produce a reliable, validating and DNSSEC-ready name server, something we all want to have deployed on a larger scale. Sergio Alvarez, who by the way will be speaking at CanSecWest this year, looked at the code and found it surprisingly not riddled with remote code execution bugs. I was certainly happy about that finding, because it meant the next generation of DNS server deployments wouldn't be looking at a future comparable to ISC BIND's past.
That impression, however, was largely because Sergio compiled the code with all the ASSERT statements intact. Now, people running heavy-duty production DNS servers will most certainly try to make them as fast as possible, instructing the compiler to get rid of “debug” features like ASSERTs. That might not be a good idea. So here is another lesson learned: When building for production, you might want to keep those ASSERTs compiled in, since your server crashing on funny packets is probably better than sharing administrative control of the machine.
Other than that, I hope the Unbound team keeps up the good work, so people have one less excuse to not move to DNSSEC.
Thu Jan 29 16:55:44 CET 2009
What didn't fit into the talk.
Some of you might have heard that I gave a talk on Cisco IOS security at the 25C3 this year. The talk was unique for me in many ways, starting with the fact that it covered content going all the way back to the beginnings of Phenoelit up to material that was developed within Recurity Labs. It could be said that the talk was a nexus of research efforts in different areas of my life.
The second unique aspect was the sheer amount of stuff to cover, which prevented more in-depth reflections on some of the issues. This begins with the question of who would actually take over Cisco routers. The short answer is of course: whoever can. But that needs to be taken apart in more detail. Let's focus on attacks that directly apply to the device in question and ignore for now that the easiest way to take over an entire network infrastructure is to attack the unpatched Sun servers running in the Network Operations Center.
Consider successful exploits a question of development cost. An exploit is in that respect no different from any other software: you find someone you think can actually pull it off, present your requirements and have them develop it for you. In almost all cases, this isn't going to be free, so they will give you a price tag for the work, which in most cases is a linear function of the cost of a work hour for them. This implies that the better they are, the fewer hours they will need to develop the exploit for you, which makes it cheaper. This is what made exploits against Windows desktops so cheap that attackers mostly relied on them for gaining access to networks from what the network owners considered the inside (an outdated but still widespread way of looking at it).
Our research concerning Cisco IOS security was always based on the assumption that there are entities in this world that have reached a reasonable development cost level for IOS exploits. But the publicly available knowledge on how to write IOS exploits didn't fit the bill, as it required jumping to a memory address that is specific to the IOS image running on the target. Assuming you don't know what image is running on that machine, you could still argue that the exploit writer included a list of all possible image-dependent addresses in the exploit, which would try them out one at a time. This would cause the router to reboot every time a wrong guess was tried.
In the presentation, I said that there are about 100,000 different IOS images in use. This is a very debatable number, as about 15,000 are supported by Cisco at any given time and good network administrators will only run one or two IOS images in their entire network, oftentimes investing several months to figure out which exact image they want. However, when we fire up the Cisco Feature Navigator and ask it for all IOS images that support IP Routing, which should be all of them, we get: “Showing 1-50 of 280715 results” at the bottom of the page. Wow, what a number. On page 5615 of the result listing (thank god they have a direct jump feature!), we see that this covers everything from IOS 12.4 down to 11.2. Therefore, the number doesn't take very old networks into account, in which you do occasionally see images below 11.2 running.
It should be clear that trying them all out is not an option, especially considering reboot times of 30 seconds and more per attempt. Your exploit would constantly reboot a router for 2339 hours or 97 days. And that doesn't take into account that you need to get and disassemble them all first; estimated time with IDA: 5848 days or 16 years.
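The estimates above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the image count and the 30 second reboot time come from the text, while the 30 minutes of IDA time per image is an assumption inferred from the 16 year figure and is not stated anywhere in the post. The function and constant names are made up for illustration.

```python
# Back-of-the-envelope check of the brute-force estimates.
IMAGES = 280_715        # Feature Navigator result count quoted in the post
REBOOT_SECONDS = 30     # lower bound for one wrong guess (one router reboot)
DISASM_MINUTES = 30     # ASSUMPTION: implied by the 16 year figure, not stated


def brute_force_hours(images=IMAGES, reboot_seconds=REBOOT_SECONDS):
    """Hours of rebooting if every image-specific address is tried once."""
    return images * reboot_seconds / 3600


def disassembly_days(images=IMAGES, minutes_per_image=DISASM_MINUTES):
    """Days needed to disassemble every image at a fixed time per image."""
    return images * minutes_per_image / 60 / 24


if __name__ == "__main__":
    hours = brute_force_hours()
    days = disassembly_days()
    print(f"reboot time: {hours:.0f} hours, about {hours / 24:.0f} days")
    print(f"disassembly: {days:.0f} days, about {days / 365:.0f} years")
```

Both figures scale linearly with the number of candidates, which is why shrinking the list of things you need to know about the target matters so much more than speeding up a single attempt.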
Therefore, there is either no one exploiting IOS devices or they have found a better way to do it. Our Cisco Incident Response tool was developed in the hope of finding people attacking IOS devices, successfully or not. But then again, it's hard to write a detection for the unknown, so we also had to look into finding ways to make code execution stable. The method presented at the 25C3 (and documented here, feel free to post questions in the discussion section) only reduces the number of things you have to know about your target; it doesn't eradicate the problem in general. Now, we only need to know the ROMMON version, and there are far fewer ROMMONs than IOS images out there.
For smaller machines, such as 2600s, updating ROMMON did not seem to be supported and the version depends on the shipping date. However, after closer inspection, here comes an erratum: Cisco does offer 6 updated ROMMONs for 2600 routers. For larger machines, e.g. 7206s, there are about 36 different versions known. That's a few orders of magnitude smaller than 280715. But it is also still far from the ultimate truth, as you still need to know and have that ROMMON, as well as know a few things about the box, most importantly the hardware series. Some people like to include the hardware series in their router's DNS records or name the PTR records of the IP addresses bound to a router's interface after the interface itself, which allows one to guess what type of metal it is.
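To put a number on the size difference, here is a minimal sketch that compares the Feature Navigator image count with the roughly 36 known 7206 ROMMON versions. Both inputs are the figures quoted in the text; the helper name is made up for illustration.

```python
import math

IOS_IMAGES = 280_715    # Feature Navigator result count quoted in the post
ROMMONS_7206 = 36       # approximate number of known 7206 ROMMON versions


def orders_of_magnitude(larger, smaller):
    """Base-10 logarithm of the ratio between two candidate counts."""
    return math.log10(larger / smaller)


if __name__ == "__main__":
    ratio = IOS_IMAGES / ROMMONS_7206
    print(f"ratio: about {ratio:.0f}x")
    print(f"orders of magnitude: {orders_of_magnitude(IOS_IMAGES, ROMMONS_7206):.1f}")
```

With these inputs the gap comes out at just under four orders of magnitude.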
Knowing the hardware platform is actually more important for the first and second stage shellcode than it is for getting stable code execution, as the same ROMMON seems to be applicable to a number of subtypes of routers, while one subtype may have memory wired into different addresses than the other. But being lucky is also a valid option, which is what happened when we selected the memory area for a direct write: I assumed the memory at 0x80000000 on a 2600 is used for global IOS variable pointers, which is incorrect. So, erratum #2: I was made aware that this is of course where the exception vectors are located after the MMU is turned on. Accordingly, this is a very good place to store two instructions.
There is still a lot to do and research when it comes to Cisco IOS and security. But the stable, image-independent code execution at least finally allows us to make better assumptions about the attacks we should be looking for. It shows nicely that, even with CIR, we should not try to detect the exploitation while it happens, but focus on the shellcode functionality and the footprints it leaves. And the IP options vulnerability is a perfect example of why critical infrastructure should always dump the core files onto its own FLASH device, as dumping core over FTP doesn't really work too well when your “IP Input” process just got popped.
Posted by FX | Permanent link
Thu Jan 29 16:54:12 CET 2009
This year's Chaos Communication Congress, better known as 25C3, was an exceptional event in many ways. It begins with a program committee that attracted so many interesting people over the last years that they had ample material to select from, and they did a very good job of that too. Accordingly, the quality and spectrum of the presentations were significantly above those of many other conferences and we all need to thank the people who put the program together. And while I'm still not done watching the video recordings of all the presentations, there have been quite a number of highlights.
The hard-working organizers and Engel of the CCC apparently are by now so well trained in running a Congress that it almost appeared as stress-free routine to the casual observer. I've never been to a Congress with less shouting and less chaos in terms of organization, and despite the event's name, I think that's a good sign. They even somehow managed to handle the insane number of people showing up, which, as DEFCON attendees will surely know, is quite a challenge by itself.
And then of course there were the presentations, above all Alexander Sotirov and Jacob Appelbaum with their successful creation of a rogue SSL CA certificate. The work shows how the combination of academic research with the practical experience and dedication of world-class security professionals can achieve something that was considered a theoretical attack. It also shows how much of a pipe dream the perceived security of browser-based communication over SSL/TLS actually is. If all but one of the trusted CAs belong to the same publicly traded commercial entity, they don't actually need to fulfill their security promise anymore, because they have a monopoly.
The purpose of a publicly traded corporation is to maximize the profit for the shareholders*. And if selling certificate signatures generates enough revenue to get your stock rated as "buy", you did your job. If you need to revoke a large part of these certificates, because you failed to react to previously published research on vulnerabilities in them (MD5), this is similar to a recall of your product and would therefore hurt your reputation on the stock markets. If you however just ignore the problem as long as you can and then trust that very few people will actually understand the problem so it doesn't impact your sales, you can even offer a remedy at no additional cost and look good in the press. From a business point of view, that is a remarkable containment stunt. From a security point of view, it's devastating. Not only does it show that revocation simply does not work, but also that the one entity that must be extremely strict with revocation actually doesn't follow it at all.
Interestingly enough, this proves two points made by Dan Kaminsky. The first is about how much of a defense SSL actually is in the light of vulnerabilities like his DNS issue from summer 2008. Dan said in his presentation at BlackHat that SSL proved to be much less of a defense than we all thought it would be. The second point is actually less obvious: The much debated partial disclosure approach Dan followed had a very interesting positive side to it that nobody saw before. The big difference between Alex's and Jake's big-bang presentation and Dan's long process of informing selected people gradually over time is the learning effect they had on all the other people. I think after that summer, we security professionals will not very often hear that old argument of a vulnerability not being critical because the attacker would need to control DNS in order to exploit it. On the other hand, I don't see anyone reviewing their security perception of the trust model that so-called secure web sites are built on. Everyone is just happy that the issue got "fixed" so quickly. I for one had not realized before that making a big fuss about something enhances its long-term educational value, and I certainly thank Dan for teaching that lesson to me.
Speaking of thanks, my fellow Phenoelit members, above all Mumpi, need to be thanked too for the awesome party they put on. That also includes DJ Vela and CMOS for playing at that event and in CMOS' case for flying all the way into Berlin to do so. And last but not least, I would like to thank the audience of the 25C3, which again was one of the smartest I had the privilege to speak in front of. I apologize for the suboptimal delivery of the Cisco IOS presentation to everyone who saw it. If you found a stray "Erm" in what I said, you may keep it.
* You could argue that this is the case with any corporation, but working at Recurity Labs, I can tell you it isn't.
Sun Jan 25 19:38:42 CET 2009
Reconstruction in Progress
Unfortunately, a number of things recently broke, including the main (and only) hard drive of the machine running phenoelit.net and therefore this blog. We are recovering all the services but, as you might have noticed, switched blog engines so we no longer actually run any (potentially vulnerable) code when you hit this site.
Now we just need to port the old entries over in a consistent and permalink friendly way, which might require a few more days.
Thanks for not unsubscribing :)
Update: Things should be back to normal and the two delayed posts should be posted. Please contact me with any complaints if you find something missing or broken. Thanks.
Posted by FX | Permanent link