July 2005 - Continuous Build Methodology
By Andy Bruce
Email: andy@softwareab.net
http://www.softwareab.net/
 
From Floundering to Professional in 10 Easy Steps
A quick overview of how a disciplined and detail-oriented
/process/ turned a floundering and badly written system into a commercial
reality. 
In 2002 I became an independent consultant and started working with an
ex-manager of mine on some thoroughly prosaic VB6 software extending the M$ Great
 Plains accounting system. When I came on board, the previous team
lead had discouraged even full compiles "since they just keep the
developers from moving forward on the deadlines." There was no process
defined, no technical design documents (other than my manager friend's work for
the then-single customer), and no thought given to integration, installation,
deployment, or maintenance. The system was hopelessly bottlenecked in a pure
reactive mode with no thought given past the current set of fires, and a very
unhappy client. 
Principles regarding a nightly build and continuous integration are ones I have
absorbed organically over my career. It took around 12 years, but after X number
of products and Y number of Friday 3am
bug fix marathons, the absurdity of the typical development process becomes
apparent. Even strong companies where I've worked, like EMC in Hopkinton, MA or
Landmark Systems Corporation in Tyson's Corner, VA, had an emphasis on keeping
testing separate from the coding. 
In a word--the single most important aspect of successful development is: Have
Respect For The Process. In our case, following the
process has led to a reliable product with very fast turnaround of bug patches
and easily maintained ongoing development. We have automated builds, regression
tests, integration between defect tracking and source control, quick
"state of the system" statuses, a well-defined set of developer and
customer documents, automated installations and program upgrades, and much
more. 
The Process is what allows you not only to develop reliable and well-tested
code in the beginning, but also allows you to respond to the inevitable fires
that occur after code is released and new development is fighting for resources
with maintenance patches. 
In my product's case, success came through following these points: 
 - Ensure that development can occur
     anywhere, at any time. The existing software environment assumed that
     everyone was on-site (a typical M$ SourceSafe drive-mapping-based system).
     Although in my case I used CVS, the important point is that your version
     control needs to free your developers from the tyranny of location. And by
     leveraging SSH as a poor-man's VPN, I was able to free us from the
     bottleneck of the corporate IT department (they trusted SSH well enough,
     and simply opened up port 22 on the firewall). By doing this, we were able
     to use all our development resources (source control, common programs,
     database tools, etc.) remotely.
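As a sketch of the "SSH as a poor-man's VPN" idea: a single local port forward can make a remote resource (the source-control server, the office database) look like a local port. The host names, ports, and user below are hypothetical placeholders, not the original project's setup.

```python
# Sketch: wrap an SSH local port-forward so a remote resource appears on a
# local port. All host names and ports are illustrative assumptions.
import subprocess

def tunnel_command(gateway, local_port, target_host, target_port, user="build"):
    """Build the ssh argument list for a local port forward (-L)."""
    forward = f"{local_port}:{target_host}:{target_port}"
    return ["ssh", "-N", "-L", forward, f"{user}@{gateway}"]

def open_tunnel(*args, **kwargs):
    """Launch the tunnel in the background; the caller must terminate it."""
    return subprocess.Popen(tunnel_command(*args, **kwargs))

# Example: expose the office database server on local port 15432:
# proc = open_tunnel("gateway.example.com", 15432, "dbserver", 5432)
```

Everything then rides over port 22, which is exactly why the corporate firewall only needed that one hole opened.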
 
 
 - Understand where you're coming from.
     In my case, no system documentation existed at all. My next step was to
     ensure that the major processes (build, deploy, maintain) were written
     down. Besides serving as a manual install guide, this laid the framework
     for the automated processes to follow.
 
 
 - Code standards matter. Another
     tough one. Many developers have the mistaken opinion that how code is
     formatted has little bearing on the end result. Others believe that
     following a standard is onerous and a drag on creativity. Nothing is
     farther from the truth. Just because one follows the strict form of a
     sonnet or a contrapuntal fugue does not mean that the sonnet or fugue is
     restricted (in the real sense of the word) at all. Instead, it just means
     that others can recognize the form and have a much better chance of
     understanding the progressions and applying changes. It is just so with
     software development. By following well-defined and concise formatting
     standards, one can reduce the chance of errors (as in, with the old C
     compilers, using the form "if( 0 == x )" rather than "if( x == 0 )",
     which turns the inevitable accidental assignment "if( x = 0 )" into a
     compile error rather than a silent bug). Moreover, it makes company code equally easy
     to read regardless of the initial author. While to many this is a
     difficult concept to grasp, the fact is that we all grow and move on. It's
     also inevitable that in any successful code the ongoing maintenance team
     is almost never the original development team (and, in many cases, simply
     not as good technically as the original development team). One must give
     the next person the best chance at understanding and working on code
     modules, unless one's goal is stagnation and a reduced role as "the
     permanent foobar maintainer".
 
 
 - Compile for complaints. Turn up
     your compiler warnings and errors to the highest possible level. If using
     C++ or C, get a good lint and apply a strict
     corporate-wide policy. Do not tolerate any messages from your compiler.
     Enable any type of memory and/or logic checking tools that can be compiled
     with your application (even the old M$ C++ compilers offer the ability to
     do simple memory verification during program runs; more advanced tools
     like BoundsChecker simply do a better job).
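The same zero-tolerance policy carries over to interpreted languages. As a minimal sketch, Python's standard `warnings` module can promote every warning to a hard error, the moral equivalent of compiling with warnings-as-errors:

```python
# Sketch: treat every warning as a fatal error -- the interpreted-language
# equivalent of a "no tolerated compiler messages" policy.
import warnings

warnings.simplefilter("error")  # any warning now raises an exception

def noisy():
    warnings.warn("deprecated code path", DeprecationWarning)

try:
    noisy()
    outcome = "warning ignored"
except DeprecationWarning:
    outcome = "warning promoted to error"
```

Run in the nightly build, a policy like this surfaces deprecations and sloppy usage the day they appear rather than months later.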
 
 
 - Assume that they're all out to get
     you. In your modules, assume that your input parameters are not only
     wrong, but wrong in strange ways. When dealing with database parameters,
     assume that every variable passed in is a mistaken NULL and that every
     data lookup fails and that every memory operation is wrong and that every
     OS call dies miserably. As to what happens when failures do occur: I'm of
     the school that you capture the error, log it, and keep going. Others are
     of the school that you fail quickly and fail hard. But in either
     case, the key point here is that everything does go wrong at some time or
     another. Your job is to be aware of that fact.
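A minimal sketch of the "log it and keep going" school, using an illustrative database-row accessor (the names are mine, not the original system's); a fail-fast shop would raise instead of returning a default:

```python
# Sketch: assume every input is wrong in strange ways. This version logs the
# problem and keeps going with a safe default. All names are illustrative.
import logging

log = logging.getLogger("orders")

def unit_price(row, default=0.0):
    """Extract a price from a database row, surviving NULLs and bad types."""
    if row is None:
        log.error("unit_price: row is NULL")
        return default
    value = row.get("price")
    if value is None:
        log.error("unit_price: price column is NULL in row %r", row)
        return default
    try:
        price = float(value)
    except (TypeError, ValueError):
        log.error("unit_price: unparseable price %r", value)
        return default
    if price < 0:
        log.error("unit_price: negative price %r", price)
        return default
    return price
```

Note that every exit path is explicit: the function never lets a bad input propagate silently, and every failure leaves a trail in the log.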
 
 
 - Software deployment and upgrades make
     or break the system. Even before anything else, one must consider the
     frightening fact that software, if successful, will be installed. It will
     be installed on many machines, and most users will not have admin privs. It's critical to take the time up-front to have
     a plan such that one's product can not only install itself once, but keep
     itself maintained automatically. In our case, the initial client install
     simply puts a stub on the machine. During the initial login to the server
     database is when all the important stuff happens. The client itself can be
     upgraded, new libraries can be installed, client extensions can be loaded
     dynamically, and menu configuration options can be set, all based on
     configuration files located on the server. The key point here is that we
     design the system to assume wildly successful sales--and as we all know,
     the only thing that can kill a project faster than failure to sell
     anything is the ability to sell a lot of things.
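The stub-client idea can be sketched as a version check at login: compare the installed component versions against a manifest served from the server, and upgrade whatever is stale. The manifest format and component names below are illustrative assumptions, not the original product's protocol.

```python
# Sketch of the stub-client upgrade check. The manifest format and component
# names are hypothetical.

def parse_version(text):
    """'1.4.10' -> (1, 4, 10), so comparison is numeric, not lexical."""
    return tuple(int(part) for part in text.split("."))

def stale_components(installed, manifest):
    """Return the component names whose server version is newer."""
    return [
        name
        for name, server_ver in manifest.items()
        if parse_version(server_ver) > parse_version(installed.get(name, "0"))
    ]

installed = {"client": "1.4.2", "reports": "1.0.0"}
manifest = {"client": "1.4.10", "reports": "1.0.0", "newmodule": "0.1.0"}
stale = stale_components(installed, manifest)
```

Parsing into numeric tuples matters: string comparison would wrongly rank "1.4.10" below "1.4.2". Components absent from the local install simply compare as version 0 and get pulled down.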
 
 
 - Go out of your way to automate.
     The key point here is: /Not all time is created equal./
     A script that automates a key process otherwise requiring alert and full
     attention is never a waste of time. So, take the time to get the product
     build automated using the tool of your choice. However, even an automated
     build is just the beginning. Example: When
     building software there may be a set of instructions between the
     "make build" and "make deploy" that need to be typed
     manually. All together, these instructions take less than ten minutes to
     run while automating the instructions may require several hours due to
     complexity. However, when one is working under high pressure with multiple
     deadlines all popping at once, one finds that /not having to think/ during
     repetitive but crucial processes is a life-saver. This concept generally
     raises programmers' hackles: after all, why can't you just be alert and
     focused when executing a well-defined set of commands? The only answer to
     that is hard experience and the sad realization
     that mistakes do happen; and mistakes occurring
     during the final "quick rebuild of the system for the last
     patch" ultimately occur to all of us. Those are the painful lessons.
 
 
 - Every line of code is a mistake.
     This is another controversial statement, but it means one simple thing: In
     many conditions, human beings simply cannot write very good computer code.
     Under pressure, humans make mistakes; we get tired and cranky, and in
     general we don't do repetitive and precise actions very well. The answer
     to the statement is: Minimize your
     mistakes. In other words, write less code. The best way to do this is
     to identify where in one's process one has well-defined sets of software
     modules that must be kept in synchronization with each other, and to
     automate that generation. In our case, this was the database interface as
     well as the database upgrade scripts. I ensured we had both commercial and
     in-house custom tools to generate our database layers and our database
     upgrade and installation scripts automatically. I can rest easy knowing
     that the code generation works equally well under all types of deadline
     pressure.
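A toy sketch of the idea: generate the data-access layer from a table description, so the interface can never drift out of sync with the schema. The schema format and the emitted class style are illustrative assumptions, not the original product's tooling.

```python
# Sketch: emit a row-wrapper class from a table description, so the database
# layer is generated rather than hand-written. Formats are illustrative.

def generate_accessor(table, columns):
    """Emit Python source for a simple row wrapper over `table`."""
    lines = [f"class {table.capitalize()}Row:"]
    lines.append(f"    TABLE = {table!r}")
    lines.append(f"    COLUMNS = {tuple(columns)!r}")
    lines.append("    def __init__(self, row):")
    for i, col in enumerate(columns):
        lines.append(f"        self.{col} = row[{i}]")
    return "\n".join(lines)

source = generate_accessor("customer", ["id", "name", "balance"])
# In the build, `source` would be written to a module file; here it could
# simply be exec()'d to produce a usable CustomerRow class.
```

When the schema changes, regenerating is a one-command operation; no human has to remember which accessors to touch at 3am.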
 
 
 
 - Build your regression tests early and
     make them complex. Regression tests are not a panacea by any means,
     but they are highly effective in ensuring a minimum level of reliability.
     But, developers really don't like to write them. And, some managers have
     the odd idea that, just because something works when it was first written,
     the same software should work months later. Sadly enough, that simply
     isn't the case. As systems evolve the underlying modules change their
     interfaces and assumptions in subtle ways. Regressions are simply the only
     way to ensure that modules that used to work,
     continue to work. And the more complex and detail-oriented the regressions
     are, the better (developers and managers despise such tests because of the
     initial flurry of false positives they create). In our project we have
     around 500 regressions that run with each build, simply because I insisted
     we build them. And while I do have trouble getting anyone to write any
     more of these tests, I do feel a sense of relief and satisfaction each
     time our existing test suite passes at 100%.
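The shape of such a regression can be sketched in unittest style. The business rule below is an illustrative stand-in; the point is that the test pins down values captured when the feature first shipped, so behavior "that used to work" keeps working.

```python
# Sketch: a tiny regression in unittest style. The rule under test is an
# illustrative stand-in, not the original product's logic.
import unittest

def apply_discount(total, percent):
    """Business rule under test (illustrative)."""
    if not 0 <= percent <= 100:
        raise ValueError("percent out of range")
    return round(total * (100 - percent) / 100, 2)

class DiscountRegression(unittest.TestCase):
    def test_known_good_values(self):
        # Values captured when the feature first shipped.
        self.assertEqual(apply_discount(100.0, 10), 90.0)
        self.assertEqual(apply_discount(19.99, 0), 19.99)

    def test_bad_input_still_rejected(self):
        with self.assertRaises(ValueError):
            apply_discount(100.0, 150)

# In the nightly build, something like:
#   python -m unittest discover -s tests
```

Multiply that by a few hundred tests wired into the automated build and you get the "sense of relief each time the suite passes at 100%" described above.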
 
 
 - The Development Install is the same as
     the Customer Install. I have worked on three different major products,
     all of which strictly segregated the developers' configuration setup from
     the customer's. This inevitably led to much wasted time and lots
     of finger pointing at the end. And in all three cases, there ended up
     being a full team of folks working on the product installation suite. In
     our product, I simply did not tolerate that approach. We have One Way to
     set up our system. This means that our installer is tested every day on
     numerous machines by our automated build process. When we get to the end
     of the cycle, we don't have to worry about whether we've updated the
     installer to handle new shared libraries or system registry entries; we've
     caught that low-hanging fruit very early in the process.
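Exercising the installer daily can be sketched as a smoke test in the build: run the install into a scratch directory and verify the payload landed. The file list and the installer body below are hypothetical stand-ins for a real product's setup.

```python
# Sketch: installer smoke test for the nightly build. The payload list and
# installer body are hypothetical stand-ins.
import tempfile
from pathlib import Path

EXPECTED_FILES = ["client.exe", "client.cfg", "lib/shared.dll"]

def install(target_dir):
    """Stand-in installer: lays down the product files under target_dir."""
    for rel in EXPECTED_FILES:
        path = Path(target_dir) / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("payload")

def verify_install(target_dir):
    """Return the list of expected files that are missing."""
    return [rel for rel in EXPECTED_FILES
            if not (Path(target_dir) / rel).is_file()]

with tempfile.TemporaryDirectory() as scratch:
    install(scratch)
    missing = verify_install(scratch)
# An empty `missing` list means the install smoke test passed.
```

Because the check runs on every build, a new shared library or registry entry that the installer forgot shows up the next morning, not at ship time.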
 
There
are lots more points to consider (automated
data loading, managing multiple ongoing development branches, code promotion
strategy ("Never Break The Build"), and so on), but this is enough for one
article!