Open Source and Agile Methodologies
During the last 5 years, a set of methodologies called agile methodologies has become popular. An agile methodology is, in general, one that emphasizes incremental development and small design steps guided by frequent interactions with customers. The customer and developers get together and agree on the next set of features and capabilities for the software. Ideally, the work should take at most a few weeks. The developers then make the additions and the software is released to the customers, who react to it, perhaps making corrective suggestions.
Agile methodologies and open source would seem, at first glance, to be radically different: Agile methodologies are thought of as being about small, collocated teams and open source as being about large, distributed ones. A company might expect that the benefits of one are pretty different from the benefits of the other. Agile methodologies arose, largely, from the ranks of paid consultants, whereas open source seems like a hippie phenomenon. A company might, therefore, believe there is a sharp choice to be made between them, but the choice has more to do with the conversations, the diversity of participants, and the transparency of the process to the outside world than it does with the philosophy of design and development: The two approaches share many principles and values.
Some agile methodologies have special practices that set them apart from others; for example, extreme programming uses pair programming and test-driven development. Pair programming is the practice of two people sitting at the same computer screen with one person typing and the other observing and commenting. Instead of one person sitting alone with his or her thoughts, pair programmers engage in a conversation while working, which serves as a real-time continuous design and code review. Test-driven development is the practice of defining and implementing testing code before the actual product code is implemented. The following are the agile development principles taken from the Agile Manifesto website.1 Most of these principles also apply to open source, except as noted.
- “Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.”
- Open source does not talk about the customer, but in general, open-source projects do nightly builds and frequent named releases, mostly for the purpose of in-situ testing.
- “Welcome changing requirements, even late in development. Agile processes harness change for the customer’s competitive advantage.”
- Open-source projects resist major changes as time goes on, but there is always the possibility of forking a project if such changes strike enough developers as worthwhile.
- “Deliver working software frequently, from a couple of weeks to a couple of months, with a preference to the shorter time scale.”
- Open source delivers working code every night, usually, and an open-source motto is “release early, release often.”
- “Business people and developers must work together daily throughout the project.”
- Open-source projects don’t have a concept of a businessperson with whom they work, but users who participate in the project serve the same role.
- “Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done.”
- All open-source projects do this, almost by definition. If there is no motivation to work on a project, a developer won’t. That is, open-source projects are purely voluntary, which means that motivation is guaranteed. Open-source projects use a set of agreed-on tools for version control, compilation, debugging, bug and issue tracking, and discussion.
- “The most efficient and effective method of conveying information to and within a development team is face-to-face conversation.”
- Open source differs most from agile methodologies here. Open-source projects value written communication over face-to-face communication. On the other hand, open-source projects can be widely distributed, and don’t require collocation.
- “Working software is the primary measure of progress.”
- This is in perfect agreement with open source.
- “Agile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely.”
- Although this uses vocabulary that open-source developers would not use, the spirit of the principle is embraced by open source.
- “Continuous attention to technical excellence and good design enhances agility.”
- Open source is predicated on technical excellence and good design.
- “Simplicity–the art of maximizing the amount of work not done–is essential.”
- Open-source developers would agree that simplicity is essential, but open-source projects also don’t have to worry quite as much about scarcity as agile projects do. There are rarely contractually committed people on open-source projects–certainly not the purely voluntary ones–so the amount of work to be done depends on the beliefs of the individual developers.
- “The best architectures, requirements, and designs emerge from self-organizing teams.”
- Possibly open-source developers would not state things this way, but the nature of open-source projects depends on this being true.
- “At regular intervals, the team reflects on how to become more effective, and then tunes and adjusts its behavior accordingly.”
- This is probably not done much in open-source projects, although as open-source projects mature, they tend to develop a richer set of governance mechanisms. For example, Apache started with a very simple governance structure similar to that of Linux and now there is the Apache Software Foundation with management, directors, and officers. This represents a sort of reflection, and almost all community projects evolve their mechanisms over time.
In short, both the agile and open-source methodologies embrace a number of principles and values, which share the ideas of trying to build software suited especially to a class of users, interacting with those users during the design and implementation phases, blending design and implementation, working in groups, respecting technical excellence, doing the job with motivated people, and generally engaging in continuous (re)design.
A good example of a company-related open-source project that embraces both open-source and agile values is the Visualization ToolKit (VTK), which is partly sponsored by GE. VTK is a software system for 3D computer graphics, image processing, and visualization, and portions of it are subject to patents held by GE and a smaller company called Kitware. As its website states:
VTK supports a wide variety of visualization algorithms including scalar, vector, tensor, texture, and volumetric methods; and advanced modeling techniques such as implicit modelling, polygon reduction, mesh smoothing, cutting, contouring, and Delaunay triangulation. In addition, dozens of imaging algorithms have been directly integrated to allow the user to mix 2D imaging/3D graphics algorithms and data. The design and implementation of the library has been strongly influenced by object oriented principles. VTK has been installed and tested on nearly every Unix-based platform, PCs (Windows 98/ME/NT/2000/XP), and Mac OS X Jaguar or later.2
The kit is substantial, encompassing over 600 C++ classes and around half a million lines of code. There are over 2000 people on the VTK mailing list. GE’s stance regarding VTK as a commercial advantage is summed up in the following statement: “We don’t sell VTK, we sell what we do with VTK.”3 GE has a number of internal and external customers of the toolkit–it is used in a variety of projects GE is involved with. Kitware provides professional services associated with VTK.
As an open-source project, VTK is a bit unusual, and this is the result of some of its principals being involved with GE, which is the prime supporter of a design and implementation methodology called six sigma. Six sigma refers to a statistical target under which manufactured artifacts are 99.99966% defect-free, and it also refers to a process in which factors important to the customers’ perception of quality are identified and systematically addressed during a design and implementation cycle whose steps are Define, Measure, Analyze, Improve, Control (DMAIC). Open source involves the possibility of diverse innovations and also provides opportunities for interacting with customers in a direct way, which is appealing to an organization focused on customers, but there is also the possibility of erratic results when there is not a strong, explicit emphasis on quality that can be enforced. Therefore, open source went only part of the way to satisfying GE’s goals for quality.
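The 99.99966% figure is easier to read as a defect rate. The following is a minimal sketch of that arithmetic in Python; the function name is chosen here for illustration and does not come from GE or from any six sigma tool.

```python
def defects_per_million(defect_free_percent: float) -> float:
    """Convert a defect-free percentage into defects per million opportunities."""
    defect_fraction = 1.0 - defect_free_percent / 100.0
    return defect_fraction * 1_000_000


# 99.99966% defect-free corresponds to roughly 3.4 defects per million.
print(round(defects_per_million(99.99966), 1))
```

In other words, the six sigma target tolerates only a handful of defects in every million opportunities, which is why erratic quality in an unmanaged open-source process would have been a concern.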
Moreover, the original VTK implementation team was small and dispersed within GE, and its members were admittedly not software engineers. The open-source component added to this the need to find a way to handle quality. The solution was to adopt some of the practices of Extreme Programming, which is one of the agile methodologies. Extreme Programming (or XP) emphasizes testing and advocates a practice called test-driven development, in which tests are written at the same time as, or before, the code is designed and written.4 Writing tests first has the effect of providing a sort of formal specification (the test code) as well as a set of tests to be used for regression and integration testing. XP calls for frequent (tested) releases, and VTK combines this with the open-source practice of “release early, release often” to do nightly, fully tested builds.
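To make the test-first idea concrete, here is a small sketch in Python. It is not taken from VTK or from any XP text; the `bounding_box` function and its tests are hypothetical, chosen only to show tests acting as a specification and then as a regression suite.

```python
# Step 1: write the tests first. They fail until bounding_box exists,
# and they double as an executable specification of its behavior.
def test_box_encloses_all_points():
    points = [(0.0, 2.0), (3.0, -1.0), (1.0, 1.0)]
    assert bounding_box(points) == ((0.0, -1.0), (3.0, 2.0))


def test_empty_input_is_rejected():
    try:
        bounding_box([])
    except ValueError:
        return
    raise AssertionError("expected ValueError for empty input")


# Step 2: write the product code, just enough to make the tests pass.
def bounding_box(points):
    """Return ((min_x, min_y), (max_x, max_y)) for a list of 2D points."""
    if not points:
        raise ValueError("bounding_box requires at least one point")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys)), (max(xs), max(ys))


# Step 3: keep the tests and rerun them on every change (regression testing).
for test in (test_box_encloses_all_points, test_empty_input_is_rejected):
    test()
print("all tests passed")
```

The same tests that specified the behavior on day one are the ones a nightly build would rerun, which is how test-first work feeds the regression corpus.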
The VTK developers implemented a regimen in which submitted code is tested overnight using a large corpus of regression tests, image regression tests (comparing program output to a gold standard), statistical performance comparisons, style checks, compilation, error log analyses, and memory leak and bounds-check analyses; the software’s documentation is automatically produced; and the result is a quality dashboard that is displayed every day on the website. The dashboard is similar to those produced by the Mozilla project,5 but considerably more detailed. The tests are run on around 50 different builds on a variety of platforms across the Internet, and distributions are made for all the platforms.
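An image regression test of the kind mentioned above can be sketched as follows. This is an illustration under stated assumptions, not VTK's actual test harness: images are modeled as plain lists of grayscale rows, and the function names, the mean-absolute-difference metric, and the threshold value are all hypothetical.

```python
def image_difference(candidate, baseline):
    """Return the mean absolute per-pixel difference between two grayscale images.

    Images are lists of rows of numbers; both must have the same shape.
    """
    if len(candidate) != len(baseline) or any(
        len(c_row) != len(b_row) for c_row, b_row in zip(candidate, baseline)
    ):
        raise ValueError("images must have the same dimensions")
    total = sum(
        abs(c - b)
        for c_row, b_row in zip(candidate, baseline)
        for c, b in zip(c_row, b_row)
    )
    pixel_count = sum(len(row) for row in baseline)
    return total / pixel_count if pixel_count else 0.0


def passes_image_regression(candidate, baseline, threshold=1.0):
    """Pass when the rendered image stays within `threshold` of the baseline."""
    return image_difference(candidate, baseline) <= threshold


gold = [[0, 128, 255], [0, 128, 255]]     # stored gold-standard image
tonight = [[0, 129, 255], [0, 128, 254]]  # small platform-dependent noise
broken = [[255, 128, 0], [255, 128, 0]]   # output changed by a bad commit

print(passes_image_regression(tonight, gold))  # True: within tolerance
print(passes_image_regression(broken, gold))   # False: far from the baseline
```

A tolerance is used instead of exact equality because renderings can differ slightly across platforms, and the nightly tests described here run on many different builds.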
The reasons for this approach, as stated by the original team, are as follows:
- To shorten the software engineering life cycle of design/implement/test to a granularity of 1 day.
- To make software that always works.
- To find and fix defects in hours not weeks by bringing quality assurance inside the development cycle and by breaking the cycle of letting users find bugs.
- To automate everything.
- To make all developers responsible for testing (developers are expected to fix their bugs immediately).
Among the values expressed by the original development team are the following:
- Test early and often; this is critical to high-quality software.
- Retain measurements to assess progress and measure productivity.
- Present results in concise informative ways.
- Know and show the status of the system at any time.
This is not all. The VTK website provides excellent documentation and a coding style guide with examples. Most of the mechanics of the code are spelled out in detail. Moreover, there are several textbooks available on VTK.
In short, the VTK open-source project has integrated open-source and extreme-programming practices to satisfy GE’s need to express to customers its commitment to quality, even in projects only partially controlled by GE. Furthermore, GE has tapped into a larger development community to assist its own small team, so that its customers get the benefits of a high-functionality, high-quality system infused with GE values.
Continuous (Re)design
The primary source of similarities between open source and the agile methodologies is their shared emphasis on continuous (re)design. Continuous design is the idea that design and building are intertwined and that changes to a design should be made as more is learned about the true requirements for the software. This is why both camps agree with the mantra, “release early, release often.”
Continuous design is an approach that is predicated on recognizing that it is rarely possible to design perfectly upfront. The realization is that design is often the result of slowly dawning insights rather than of knowing everything at the start of the project and that, like most projects, the activities are progressive and uncertain. Specifications of software function, usability, and structure, for example, cannot be fully known before software is designed and implemented. In continuous design, software source code, bug databases, and archived online discussions capture and track the preferences and realities of co-emerging software systems and their user/developer communities in a continuous cycle of innovation, change, and design. Explicit and formal specifications and formal design processes rarely exist: The code itself along with the archived discussions are the specification.
Some open-source projects, especially hybrid company/volunteer projects, use more formal processes and produce more formal artifacts such as specifications, but even these projects accept the idea that the design should change as the requirements are better understood. In fact, we could argue that even software produced using the current principles of software design, software engineering, and software evolution is often a discretized version of continuous design: it imposes the idea of formal design and specifications done largely upfront, but (unconsciously) allows the effect of continuous design over a series of infrequent major releases rather than through small, essentially daily ones.
The above is a post that I found very interesting. I took the liberty of posting it here in full so that everyone gets to read it; given as a link alone, its merit would have been lost.