Software Pluralism

Software Quality

Introduction

Parties on both sides of the open/proprietary software debate argue that one model of software development necessarily results in superior software. This article unpacks some of the claims and attempts to bring some clarity to the debate.

"Better" Software, What Is It?

We often hear people making claims like "this piece of software is better than that one." What do they mean by "better"? Evaluating software is really not all that different from evaluating the other mass-produced things in our world, such as automobiles. When we hear people compare cars, they speak of qualities such as safety, reliability, performance, and comfort. Some of these qualities are of course more objectively measurable than others. For instance, we can say that a Porsche 911 is faster than a Volkswagen Beetle, and we can easily verify that claim by way of a head-to-head race. On the other hand, it is not so clear that an Audi A8 is more comfortable than a Lincoln Town Car. Comfort seems like a more subjective criterion, and different drivers may have different opinions about what constitutes a comfortable ride.

The same basic criteria can be, and are, used to compare software systems, and the same basic issues arise. Performance has historically been a favorite basis for comparison, because it is generally easy to measure. However, for much end-user office software, it is unclear that performance really matters that much anymore, because most modern computers are more than fast enough to run word processors, email clients, and web browsers without difficulty. On the other hand, video gamers might still be very interested in the number of frames per second their video game system can produce, and a company running a large e-commerce website might care about the number of transactions per second its back-end database software can process.
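
As a concrete illustration of why throughput metrics are easy to collect, the minimal C sketch below times a batch of operations and reports operations per second. It is a toy: do_transaction() is a hypothetical stand-in for whatever a real benchmark would exercise (a database insert, a rendered frame), and clock() measures CPU time rather than wall-clock time.

    /* Minimal throughput sketch: time N operations, report ops/sec.
       do_transaction() is a hypothetical placeholder for real work. */
    #include <stdio.h>
    #include <time.h>

    static void do_transaction(void) {
        volatile int x = 0;              /* stand-in for a real transaction */
        for (int i = 0; i < 1000; i++) x += i;
    }

    int main(void) {
        const int n = 100000;
        clock_t start = clock();
        for (int i = 0; i < n; i++) do_transaction();
        double secs = (double)(clock() - start) / CLOCKS_PER_SEC;
        printf("%d operations in %.3f s (%.0f ops/sec)\n", n, secs, n / secs);
        return 0;
    }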

The computer software analog to "comfort" is "usability", often embodied in the design of the human-computer interface. Although the standards in this area tend to be somewhat subjective, usability is amenable to metrics of various sorts. For instance, some interfaces might be easier to learn (e.g. graphical user interfaces), whereas other interfaces might be more efficient for experienced users (e.g. text-oriented interfaces).

As with cars, people care about reliability. Quite simply, reliable software doesn't break. Measuring software reliability, however, is notoriously difficult. Over time, different metrics have been suggested, but none of them is entirely satisfactory. For example, we might measure defect density: the number of known defects per thousand lines of code. The problem with this metric is that a program with a very low defect density can still break all the time if its few defects happen to sit in regions of code that are executed frequently.
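
To see how a low defect density can coexist with an unreliable program, consider the hypothetical C sketch below. The module names, defect counts, and execution shares are invented for illustration: the module that looks "cleaner" by defects per KLOC also happens to be where the program spends most of its time, so its few defects are the ones users actually encounter.

    /* Defect density sketch: defects per thousand lines of code (KLOC).
       All numbers below are invented for illustration. */
    #include <stdio.h>

    struct module {
        const char *name;
        int defects;          /* known defects in this module          */
        int lines_of_code;    /* size of the module                    */
        double exec_share;    /* fraction of total runtime spent here  */
    };

    int main(void) {
        struct module mods[] = {
            { "parser",   3, 12000, 0.90 },  /* low density, hot path    */
            { "reporter", 9,  6000, 0.05 },  /* high density, rarely run */
        };
        for (int i = 0; i < 2; i++) {
            double density = 1000.0 * mods[i].defects / mods[i].lines_of_code;
            printf("%-8s %5.2f defects/KLOC, %3.0f%% of execution time\n",
                   mods[i].name, density, 100.0 * mods[i].exec_share);
        }
        return 0;
    }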

A final basis for comparison is software security, which can be thought of as a subclass of reliability. Many programs are plagued with defects that allow a malicious hacker to seize control of the program, thereby assuming the "identity" of the user who originally executed it. This sort of security defect is of real concern for institutions running server programs that can be compromised, providing a back door to data that should otherwise be protected from the outside world.

The Claims

Some proponents of open software argue that because the process of designing and constructing open source systems differs from that of proprietary systems, it necessarily yields software of higher quality. The argument has its historical roots in Eric Raymond's Cathedral v. Bazaar creation myth. In that story, open source systems are built in the Bazaar style, in which large teams of volunteer programmers, located all around the world but connected by the Internet, work together to build a piece of software. The Bazaar is seen as a self-organizing, anti-hierarchical, organic method of software development; it shuns project plans, schedules, and design documents. The Bazaar stands in stark contrast to the Cathedral, a formal, hierarchical, planned, command-and-control software development process. The Bazaar is the free market; the Cathedral is a planned economy. Raymond goes on to argue for the superiority of the Bazaar on the ground that it necessarily leads to the creation of better software. The idea is that because the code in the Bazaar is open to inspection by anyone, every user is also a potential tester and debugger, so defects should be spotted sooner, fixed more easily, and the end product improved as a result. In theory, the Bazaar generates higher quality code.

Raymond's story certainly makes good reading, but is it accurate? Is it really the case that the Cathedral model dominates all proprietary software development? Is it true that all open source projects follow the Bazaar model? Are we really stuck with a binary distinction between Cathedrals and Bazaars? Or might we see "process" more as a continuum, with different kinds of projects employing varying levels of formality and hierarchy depending on the maturity, architecture, and other features of the software product? These questions matter because, if open source development turns out not to be all that different from proprietary development, then we should not expect the end results to be all that different either.

The Results

Cathedrals

It is obviously difficult to get an accurate picture of proprietary development efforts, given that they take place behind closed doors. However, it seems unlikely that all proprietary efforts follow a pure Cathedral model of software development. The Cathedral model, in its extreme form, is really a creation of software engineering textbook theory. While it is taught, understood, and perhaps practiced in software engineering courses, it by no means dominates all proprietary software development. Textbooks generally describe how software ought to be developed, not how it really is developed. Anecdotal evidence tells us that proprietary efforts vary greatly in terms of formality. If any generalization can be made, it is that formality increases with code maturity. In some ways, this only makes sense. While software products are in the prototype stage, they are often developed by small teams of programmers who would only be slowed down by highly formal development processes. As the prototype is transformed into a commercial product with an installed user base, greater care must be taken to conform to standards, maintain compatibility with old versions, and manage the ever-growing team responsible for developing, testing, and debugging the codebase. And as the codebase grows, it is typically modularized into subsystems, with development proceeding in parallel across them. Individual subsystems become the responsibility of smallish teams, which may still employ relatively informal development processes internally.

Proprietary efforts also vary in the degree of formal testing they practice. Early-stage startups might not employ any testers, relying instead on the developers to test as they go. A mature software company, on the other hand, might employ scores of testers and develop highly sophisticated, formal test plans. And even mature companies are often criticized for releasing code that can at best be considered "beta," relying on their customers to test the software for them.

Bazaars

As we have seen, it is not at all clear that the Cathedral, in its textbook incarnation, is the only development model employed by proprietary software efforts. In reality, the Cathedral may be a little more Bazaar-like than we have been led to believe. We now ask the reverse question: might the Bazaar show itself to be a little like the Cathedral?

One paper, by Audris Mockus et al., compares and contrasts the development processes of two well-known open source projects, Apache and Mozilla, and along the way tests a number of hypotheses about open source development. Apache initially shows itself to be fairly Bazaar-like: it is a largely informal, decentralized development effort. On closer review, however, we can see hints of Cathedral-ism. First, there are divisions of labor, at least to the extent that labor is divided by code architecture. Any non-trivial piece of software is decomposed into separate modules, allowing different programmers to work on them in parallel, provided there are no dependencies between them. Second, development tools such as bug databases and version control systems are employed. Tools like these are really just technical implementations of traditional managerial functions; in this regard, every open source project is organized and regulated by the development tools it employs. And since these tools are often the same ones used by commercial developers, open source projects are managed by their technology to the same extent commercial projects are. Third, even though Apache does not embrace the formal notion of code ownership common to commercial efforts, it does draw a line around a small group (about 25 people) who may vote on code inclusion and have the right to commit code to the repository. Finally, an empirical analysis of the source tree shows that 15 developers were responsible for almost 90% of the code added. While Apache isn't exactly command-and-control, neither is it complete chaos.

The Mozilla project shows itself to be even more Cathedral-like than Apache. This is perhaps no surprise given its commercial origins; it is also significant, however, that Mozilla is larger than Apache by an order of magnitude. In Mozilla, divisions of labor are more explicit, code ownership is strict, and build and test processes are well documented. Interestingly, the authors note that at least some of the success of the Mozilla project (after a rocky start) is due to excellent documentation coupled with high quality, scalable development tools, including a bug tracking system, a code cross-referencing system, and a change presentation system. The lesson for us is that mechanizing management functions does not mean there is no management, just that it is not being performed by human beings. An empirical analysis of the Mozilla source shows a contribution pattern similar to Apache's: a small number of programmers are responsible for the bulk of the code.

Finally, Mockus et al. compare Apache and Mozilla to a number of proprietary projects on the basis of code quality (measured as defect density), responsiveness to bug reports, and productivity. Comparisons such as these are difficult to make because of differing attitudes towards release schedules, test plans, and so on. The authors do attempt to account for these process differences, and they find no significant differences between the open source and commercial projects in code quality, responsiveness, or productivity. The absence of any clear pattern at least undercuts the claim that the Bazaar model necessarily produces better software by these metrics. It may also indicate that open source and proprietary efforts are really not all that different from a process point of view: if two products are manufactured using similar processes, we would expect no great quality differences in the result.

Another set of papers focuses largely on architectural issues in open source projects. Schach et al. study coupling (the degree of interaction between modules) in the Linux kernel and show that coupling has increased exponentially over time. High degrees of coupling are associated with bugginess, because tightly coupled modules cannot be changed without risking unpredictable side effects in other modules. MacCormack et al. compare Linux, early (pre-redesign) Mozilla, and late (post-redesign) Mozilla; Mozilla was redesigned several months after its release as an open source project. They show that Linux is architecturally more modular than early Mozilla and less modular than late Mozilla. High degrees of modularity are associated with decreased change cost, because (assuming minimal coupling between modules) programmers can effectively work in parallel on adding functionality to the system.
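
To make "coupling" a little more concrete, the toy C sketch below counts cross-module references (fan-out) from an invented dependency matrix; the module names and counts are made up, and the studies cited above use far more sophisticated measures. The intuition is the same, though: the more a module reaches into other modules, the harder it is to change any one of them without side effects elsewhere.

    /* Crude coupling measure: for each module, count references it makes
       into other modules (fan-out). Matrix values are invented. */
    #include <stdio.h>

    #define N 4

    int main(void) {
        const char *names[N] = { "ui", "net", "storage", "util" };
        int dep[N][N] = {            /* dep[i][j]: calls from i into j */
            /* ui net sto util */
            {  0,  5,  2,  8 },      /* ui      */
            {  1,  0,  4,  6 },      /* net     */
            {  0,  3,  0,  7 },      /* storage */
            {  0,  0,  0,  0 },      /* util    */
        };
        for (int i = 0; i < N; i++) {
            int fan_out = 0;
            for (int j = 0; j < N; j++)
                if (j != i) fan_out += dep[i][j];
            printf("%-8s fan-out: %2d cross-module references\n",
                   names[i], fan_out);
        }
        return 0;
    }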

What are we to take away from these papers? It's hard to say. Schach et al. point out a potentially serious drawback of the Bazaar model of development: informality in the early stages of a project may result in code that is not well architected for maintainability and growth in the long run. Yet MacCormack et al. show similar problems appearing in commercial systems as well. Early Mozilla is essentially a proxy for commercial software, because it is the code that was initially "freed" by Netscape, and it is shown to be less modular than either Linux or late Mozilla, both the result of open source efforts. Perhaps this all just goes to show that careful, deliberative design work early in a software project can produce a better architecture, yielding great benefits in the long run. Of course, such design work is not without cost; it may result in slower time to market, leaving us with a well-designed software system that nobody buys. This may go a long way toward explaining the relatively "poor" architecture of early Mozilla: its designers traded away (consciously or not) architectural advantages in order to speed their product to market. Finally, it is perhaps ironic that a hallmark activity of the Cathedral is critical to the healthy functioning of the Bazaar: early, careful design work is required to architect a system that can support a geographically dispersed community of asynchronous contributors.

Software Security

We treat software security specially because it is such a hot-button issue. The popular media are filled with claims such as "GNU/Linux is more secure than Windows," "Viruses and worms don't affect GNU/Linux," or "I run Mac OS X and never get any viruses." These claims are generally backed up with empirical evidence showing much larger numbers of security incidents affecting Windows systems than GNU/Linux or other open-source-derived systems such as Mac OS X. From this data, open source proponents conclude that open source software is necessarily more secure than proprietary software. On the other side of the debate, advocates of proprietary software development insist that "security through obscurity" is the way to achieve high-security applications. They may also characterize open source software as hobbyist projects that cannot possibly be relied upon in security-sensitive applications.

As an initial matter, the shallow claims on both sides border on the ridiculous. Saying that the low number of attacks upon open source systems implies that open source software is necessarily more secure is as fallacious as saying that North Americans are somehow immune to bird flu because, so far, the only people who have died from it live in Southeast Asia. The fact that there are fewer high-profile security incidents involving GNU/Linux does not imply that GNU/Linux, or open source software more generally, is more secure. It probably only shows that, for some reason, proprietary software (in particular, that published by Microsoft Corporation) is a more appealing target to hackers. On the other side, pejoratively characterizing open source efforts adds nothing useful to the debate, because, as we have seen above, many open source projects are probably as sophisticated or "grown up" from a software development point of view as proprietary efforts.

Furthermore, there are good reasons to believe that the apparent differences in security quality may be illusory. First, as noted above, we can think of software security as a subclass of the more general software reliability problem. If we believe that open and proprietary development models are in reality not as different as they first appear, we would expect code produced by open and proprietary projects to contain a similar number of defects that could be exploited by an attacker. Second, and more specifically, a large class of software vulnerabilities (buffer overflows) results from common programming errors in the most commonly employed programming languages, C and C++. These vulnerabilities can be reduced or eliminated by careful inspection, thorough testing, or the use of more modern programming languages (such as Java) that do not allow such errors to occur. C and C++ are the favored languages for both open source and proprietary development, and until different languages are adopted, or more rigorous testing and coding practices are employed, there is reason to believe that code produced under both development models will be plagued by this particular vulnerability.
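
For readers unfamiliar with the buffer overflow, the toy C program below shows the pattern in its simplest form. It is not drawn from any of the systems discussed here: the unchecked strcpy() copies attacker-controlled input into a fixed-size stack buffer, and input longer than the buffer corrupts adjacent memory, which a carefully crafted argument can exploit to hijack the program. The commented-out snprintf() line shows a bounded alternative.

    /* Classic stack buffer overflow: strcpy() performs no bounds check,
       so input longer than 15 characters writes past the end of 'name'. */
    #include <stdio.h>
    #include <string.h>

    static void greet(const char *input) {
        char name[16];
        strcpy(name, input);                     /* UNSAFE: unbounded copy */
        /* snprintf(name, sizeof name, "%s", input);  bounded alternative  */
        printf("hello, %s\n", name);
    }

    int main(int argc, char **argv) {
        greet(argc > 1 ? argv[1] : "world");     /* argv[1] is attacker-controlled */
        return 0;
    }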

A look at some simple numbers shows that there is plenty of vulnerable code on both sides of the divide. Between 1998 and August 2005, Microsoft issued 449 security bulletins. Of these, 334 are classified as "critical": each details some sort of defect that could allow an attacker to gain complete control of the system. (See www.microsoft.com/technet/security/current.aspx, last visited August 17, 2005.) On the other side, Debian, a GNU/Linux distribution known for its focus on security, issued 926 Security Advisories during the same period. (See www.debian.org/security/, last visited August 17, 2005.) These numbers only tell us what they tell us: that we have examples of open and proprietary software that are similarly riddled with vulnerabilities. Any further conclusions would be speculative at best.

Perhaps the best argument on the open source side of the debate is that the transparency of open source software should lead to better security. As a general matter, software security experts have rejected the notion that security can be achieved through obscurity. Open source would seem to have an advantage because many user-testers can peruse the code; that is, there are many eyeballs looking for potential vulnerabilities. There should be no hidden problems in open source code, because there is nothing to hide and no ability to hide it if there were. Code is peer reviewed and thereby forced to withstand widespread scrutiny. This argument certainly has merit, and it probably makes the most sense within areas of focused inquiry, such as the design and implementation of particular cryptographic algorithms. Bug fixes that patch security holes may also become available more rapidly in the open source context, because open source projects do not have the ability to keep quiet about newly discovered security flaws. In theory, this process should lead to rapid, iterative refinement of open source codebases.

Conclusion

Advocates of open source software argue that because there is something fundamentally different about the way open source software is developed, it necessarily results in better, higher quality software. In reality, the answer is not nearly that simple. It is not at all clear that the processes employed by open and proprietary efforts are really all that different, and if the processes are not that different, then we have no reason to expect one model to yield consistently higher quality software than the other. Empirical studies applying various quality metrics to open and proprietary systems have shown little in the way of significant differences.
