Sat Apr 1 18:08:27 2006

You are a software vendor

In computer security, the vendor of a software product is commonly seen as the entity responsible for fixing identified security issues with the product and proactively working towards better security in general. While some notable software vendors accept this challenge posed by their customers, many don't really care and handle vulnerability reports much the same way as they handle complaints about a wrong button color in the user interface.

While we are waiting for those vendors to take a bite of the apple from the tree of knowledge and get banned from their imagined paradise of unbreakable software, we have some time to think about this perception of the vendor's sole responsibility for security. This commonly accepted point of view is partially based on the idea that the vendor who receives the money for the software product must care about the product's quality, security included. While I fully embrace this concept and still hope that one day you can return software that just doesn't work as advertised [1], it is an insufficient argument when it comes to security. What about free open source software?

Intuitively, users of GPL software also hold the maker of the software responsible for its security, and there is nothing wrong with that. The user extended a certain level of trust to the maker(s) by using their software, and this trust gets hurt when the user gets owned. But the argument of responsibility based on a monetary transaction clearly doesn't hold in this case. There have been cases where free software makers simply refused to fix their code or admitted that they don't care. Others just mention that the user is an ungrateful beast and should run something else. After all, the software was free and there is no warranty.

We could try to change the definition of the responsible party to "owner of the source code". After all, you can only secure something if you have access to its building blocks, right? This definition would mean the same thing for commercial software vendors, since they do own the code to their products and are the only ones who can access it. If we extend the definition to mean "write access to the main source tree", it would also neatly describe the maintainers of free software. Case closed, or maybe not?

The problem with this approach is that nobody has complete control over the entire source code, and even the few who apparently have it don't. Software is made out of modules and every piece of software uses a wide range of modules: from kernel and system calls to statically and dynamically linked libraries, other software handling the events sent by the software and of course firmware and microcode on hardware devices. It is highly unrealistic to expect anyone or any organization to have complete control over all the components their software depends on. Big software companies are much like a collection of small companies that happen to work together on a single large project.

Today, the lack of central understanding and control leads to responsibility being resolved by the social or business equivalent of a call graph. If, for example, the vendor of a complex piece of server software faces a security vulnerability in the image parsing and handling code of said server, the vendor identifies the maker of the library in question. The library maker gets contacted and asked to fix the bug. The library maker in turn realizes that the issue is in a lower-level library and contacts its respective maker, and so forth, until hopefully one of the elements in this chain feels responsible and fixes the issue. Or this is how it should work.

In reality, it is often not so easy to identify whose code is actually responsible for the security issue. In the post buffer overflow era of software vulnerabilities, many application-specific issues arise from the interplay of components. Let's assume, to stay with the image library example above, that the image in question specifies a width and height of -3. The low-level library computes that -3 * -3 = 9 bytes of space will be required and provides this much. The upper-level library copies image data until a counter reaches 4294967293, the unsigned interpretation of -3 on a 32-bit machine. Neither part of the behavior is correct, but who is responsible for fixing the issue? What if they do not agree? That becomes an important point when the issue is larger than just a signed vs. unsigned integer.
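To make the arithmetic concrete, here is a minimal sketch of the mismatch in Python. The two functions stand in for the lower and upper level libraries; their names and the 32-bit mask are assumptions made for illustration and are not taken from any real image library.

```python
MASK32 = 0xFFFFFFFF  # width of a machine word on the assumed 32-bit system


def allocate(width, height):
    """Hypothetical lower-level library: sizes the buffer with signed arithmetic."""
    return bytearray(width * height)


def copy_limit(width):
    """Hypothetical upper-level library: reads the same field as unsigned 32-bit."""
    return width & MASK32


width = height = -3
buffer = allocate(width, height)   # -3 * -3 = 9 bytes
limit = copy_limit(width)          # unsigned view of -3

print(len(buffer))  # 9
print(limit)        # 4294967293
```

Each function is defensible in isolation; the problem only shows up when the copy limit of one is compared with the buffer size of the other.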

It is by now a commonly accepted fact that good design and architecture can prevent a lot of issues before they actually happen. After all, the term architecture comes from a profession that must plan for things not being perfect. If an essential part of a building does not hold the weight anymore and collapses, you can sue the vendor and demand a fix, but your building is already rubble and dust. Or you can just design the whole thing to not depend completely on a single element. Some software vendors have understood that and have started to build their products accordingly.

Design becomes even more important than it already is when it comes to computer systems, simply because it is the only way to handle all the complexity we are facing. Accordingly, it needs to be executed more precisely and validated by additional people with a different perspective on the subject. Also, changes dictated by reality, Murphy, management or customers need to get back-ported into the design documents. Companies that already work by those principles produce amazingly good, secure and easy to manage products, not only in the software world.

We already identified that there is no such thing as the software maker but rather a more or less designed and planned way of putting together components. In that respect, writing software is sometimes just linking already existing components by specifying their interworking in arcane grammars called programming languages. Therefore, the primary work is in selecting the right components for the task, designing their relations and how data is handed over from one to the other, and making the whole thing work.

But if we take this description, we can see that selecting an operating system and installing software on top of it is actually the same process. In fact, on a high abstraction level, there is no difference between writing code and designing and implementing an infrastructure solution like a company's email system. You have to know your components. You have to rely on third parties to tell you exactly what their components can and cannot fulfill and how they work. You have to take explicit and implicit requirements into account and finally design a solution. And you have to work on imperfect data, since almost all facts you take into the calculation may turn out to be false, just like the assumption about the security of this image parser.

A good design is based on the definition of your goals, so I gave this insight a try. The defined goal was an actively used Windows XP system that rots as little as possible over time. Most people are forced to reinstall Windows from scratch from time to time, just because they installed and uninstalled a lot of software and everything left a bit of waste lying around or installed additional components that are not removed when the software is already gone. And, much like with security issues, most people believe that, if you actively use a Windows system and install new stuff from time to time, a performance-degrading mess is inevitable, since you don't have the source code and you don't know what happens under the hood.

In my design, I therefore decided to not just install software all the time but rather to perform a minimal verification. This is done by installing it into a VMware Windows installation. What I'm looking at is:

  1. Does the software do what I want it to do? If not, there is no point in installing it in my production environment.
  2. How stable is the software and how well does it handle the data I trust it with? If I can't use the data afterwards in other software I'm already using, the candidate fails.
  3. What security implications does the software have at first sight? Things included here are open ports, shared memory, highly privileged processes or system services as well as kernel drivers.
  4. On a very high level, what modifications does the software do when being installed? I don't check every detail, just if it puts something in auto start registry keys or loads a process to display a tray bar icon.
  5. How much is left when the software gets uninstalled?
  6. Do I need the software just once or twice? If this is the case, I can keep it in VMware for that purpose and get rid of it afterwards.
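Check 5 can be partly automated. The following is only a sketch of one way to do it, not the procedure used here: it compares two file listings of the test system, one taken before the install and one after the uninstall. The function names and the example paths are hypothetical, and the registry is not covered.

```python
import os


def snapshot(root):
    """Return the set of file paths below root, relative to root."""
    found = set()
    for directory, _subdirs, files in os.walk(root):
        for name in files:
            full = os.path.join(directory, name)
            found.add(os.path.relpath(full, root))
    return found


def leftovers(before, after):
    """Files present after the uninstall that were not there before the install."""
    return sorted(after - before)


# Example with hand-written listings instead of a real directory walk.
before = {"system.ini"}
after = {"system.ini", "vendor/tray_helper.dll"}
print(leftovers(before, after))  # ['vendor/tray_helper.dll']
```

In practice, snapshot() would be run inside the virtual machine before the install and again after the uninstall, and the two results handed to leftovers().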

It's a very quick check. The result is two Windows XP systems that I have been running for over a year now, heavily used and permanently modified but still performing almost as well as when I installed them for the first time, without a single fatal failure that caused data loss.

I'm totally aware that this example is slightly off-topic and that it is common practice in every good IT operation, but it is useful to illustrate how important design and architecture are and how little difference there is between composing an application by code plus libraries and composing a set of software to work together by setup.exe and configuration menus. After all, seasoned UNIX programmers tend to be very good system administrators on the same platform.
For large system architectures as well as for my question of whether to install a piece of software on my production system, one thing holds true: you have to decide on imperfect information, but once you have decided, it's pretty hard to get rid of the consequences.

Coming back to the responsibility question, I tried to show that really good security, namely defense in depth, can only be achieved by a good design and architecture. While having a number of other merits, a well-reviewed design can be held accountable for security, despite all the imperfections of the components involved. The only entity that can finally be held responsible for the security of something is whoever designed it. Having an all-embracing system design, reviewed by experienced subject matter experts from different fields, yields a very good result and provides as much security as we can achieve today.

[1] I could also imagine getting reimbursed a percentage of the software's price for every crash dump I send to the vendor via the "please inform CorpX about this problem that just wasted half a day worth of your work" message box.


Posted by FX | File under: paradigms