Matt Kopala

Software Development, Technology, Travel

Building a Development Infrastructure

Getting a software development team started and then building up the infrastructure of tools & processes that they use involves a lot of work & problem solving, and is an evolutionary process. You typically don’t want to do everything right at the beginning, because this will cause delay in feature delivery, and it comes with a much greater risk of waste. This post attempts to outline the many different things that you might want to set up in your development organization, and some ideas on when to put them in place.

I focus on the different elements and the technical options. I don’t talk much about the human element, which involves selling and convincing people of the benefits and costs of these things, and which could be a whole other post. It is definitely very important that you have a great team. You won’t get very far very quickly with this stuff if you’re lacking that.

Figuring out how to get people to work together, or building a great team, isn’t addressed explicitly, although many items are really about communication, and can have a strong impact on team dynamics.

Implementing any of the items on this list requires learning and effort. It is of course much cheaper to set them up if you already have someone with experience.

Overview

Here is my list of items that you might consider for your development infrastructure. This is a list based on my own experience, and by no means a correct or complete list. Some of the sections are rather sparse on examples, either because my experience is limited, or because I had already spent too much time on this blog post by the time I got to that section. I deleted several sections because I didn’t have time to write them up.

The rest of this post will talk a little bit about what each of these is, and when you might want to address it.

  • Process & Project Management
  • Product Backlog
  • Bug, Issue, and Task tracking
  • Communication tools
  • Remote access & Connectivity
  • Shared filesystem & directories
  • Revision control
  • Isolated & Complete development environments
  • Packaging and Deployment
  • Automated Testing
  • Configuration Files
  • Code Coverage analysis
  • Continuous Integration
  • Documentation extraction
  • User Documentation
  • Developer software/tools
  • Code reviews
  • Logging
  • Branching & Versioning
  • Monitoring & Reporting
  • Application Framework
  • Customer & User Metrics

I thought about ordering this by priority, or grouping by some kind of category, but I skipped it.
I recommend taking the time to at least become familiar with everything on this list, and then you can do your own categorization or prioritization.

Yes, it’s a long list. It’s definitely longer than The Joel Test, though several items appear on both lists. However, like Joel argues, most top companies are doing all of these already, and I would argue that they are essential for long term efficiency, scalability, throughput, and team harmony.

A lot of these will be evolving. Some will develop organically, while others need to be planned – this probably depends on the experience of your developers & technical leadership. Remember it’s all about continuous improvement. So let’s get started …

Process & Project Management

Process is how everything gets done. We all follow a process, whether consciously or unconsciously. The lack of a defined process for your team just means individuals following their own process.

You should have some sort of process from the start. The most important thing is that everyone on the team knows it, buys in to it, and there is a mechanism to continuously improve it. Keep it simple to start. The less to remember the better.

Goals for a process include efficiency, transparency, improved communication, and better teammate relationships.

It’s good to be at least somewhat familiar with CMMI, and what it is.

Most software companies these days, and especially startups, use some sort of Agile methodology. I have quite a bit of experience with both Scrum and Kanban and highly recommend them. Although I have experience with waterfall, I haven’t had experience with a team that was effective using it, nor have I worked anywhere that I found it enjoyable. I don’t think strict Agile is the answer for all software development, and I think the best book to look at both points of view is Balancing Agility and Discipline: A Guide for the Perplexed.

A good process should ensure that critical things aren’t missed. It should have some way to minimize technical debt and decide when to refactor. You should have a Definition of Done and you should define Acceptance Criteria for each feature or task you do.

Overall, I think process has two main purposes: get things done and keep everyone happy.

Out of the items on this list, process is probably the most complex, has the most information written about it, and takes the most work to get right.

  • When: From the start

Product Backlog

The Product Backlog is simply a flat (non-hierarchical) and prioritized list of what needs to be done for the project. Recommendations for your backlog are to keep it DEEP and INVEST in your User Stories.

Pulling work off the top of the backlog and limiting the work-in-process helps reduce multi-tasking, which in many cases hurts efficiency.
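
As a rough illustration of pulling from the top of a prioritized list while capping work-in-process, here is a minimal Python sketch. The `Backlog` class, the `wip_limit` parameter, and the story names are all hypothetical and not taken from any particular tool.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Backlog:
    """A flat, prioritized list of work items with a work-in-process cap."""

    wip_limit: int
    items: List[str] = field(default_factory=list)        # index 0 = top priority
    in_progress: List[str] = field(default_factory=list)

    def pull(self) -> Optional[str]:
        """Start the top item, unless the WIP limit is reached or nothing is left."""
        if len(self.in_progress) >= self.wip_limit or not self.items:
            return None
        item = self.items.pop(0)
        self.in_progress.append(item)
        return item

    def finish(self, item: str) -> None:
        """Mark an item as done, which frees a WIP slot."""
        self.in_progress.remove(item)


# Hypothetical stories, already ordered by priority.
backlog = Backlog(wip_limit=2, items=["login page", "password reset", "audit log"])
first = backlog.pull()
second = backlog.pull()
blocked = backlog.pull()  # refused: two items are already in progress
backlog.finish(first)
third = backlog.pull()    # allowed again once a slot is free
```

The third call to `pull()` returns `None` because the limit is reached; nothing new starts until something is finished.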

  • When: From the start

Bug, Issue, and Task tracking

There are tons of options here. I posted about this a while back when looking to see if there was anything out there that I liked more than what I’d been using. Find a tool that has backlog functionality and also supports the other features that you want.

  • When: From the start

Communication Tools

Roughly speaking, these include email, instant messaging (IM), video conferencing, and phones (cell & landline).

All I can say about email is that Gmail rocks, and Outlook sucks.

For IM, I have primarily used Google Talk and Skype, but I’ve also set up an XMPP server using ejabberd. If you have a lot of people on your team, and are already using LDAP internally, finding a tool that populates your buddy list automatically for your entire team is huge. I know a couple of teams that use HipChat.

Be wary of those who don’t use, or are not willing to use, IM, especially if your team is split across offices or works remotely. The excuse that it distracts them is lame: they can switch to “Busy” if they’re unavailable. Developers use IM even if sitting next to each other: to copy & paste URLs, send log file snippets, etc. An IM message is also a good way to ping someone that might be busy (who can then respond when available), and has a much lower barrier for initiation than an email (or phone call, for most developers).

It’s much better to use an IM client that has voice as well. If you need to switch to voice from IM, it’s just a click away.

  • When: From the start

Remote Access & Connectivity

This is for groups that have an office or location where they primarily develop (either on physical or virtual machines), but there is a part or full-time need to access these machines from outside. If you need this, I recommend setting up a VPN. Microsoft shops can usually use a PPTP VPN by forwarding the right ports to a Windows Server. I’ve also had good experience with Cisco ASA firewalls and AnyConnect SSL VPN.

In my experience, at the full-time jobs that I’ve worked at, it made more sense to keep applications & windows open, and access them with Remote Desktop or VNC, than to try to set up a second machine at home and develop from that. Many companies want their data to stay on company (or company administrated) hardware as well, so to work productively from a remote location, a good remote access setup is needed.

One reason to set up remote access and have developers working off a shared environment is that there is overhead to each developer setting up their own environment. Developers like to customize, and should have Admin rights to their box, but if there is significant complexity to getting an environment set up, and you haven’t automated the process yet, then this might be an option.

You should consider whether it is better to have developers work on their own machines, work on machines on the network, or start with pre-configured disk or VM images.

  • When: As needed

Shared filesystem & directories

It’s incredibly useful to have a place where you can drag & drop files to share them. It’s less overhead than more formal version control for files that don’t need to be under version control. You should, however, have backups & snapshots in place for your shared filesystem.
Put all of the software that your developers use & need to install here.

If in a Linux or Mac environment, with your own network & hardware, you can use NFS. In a mixed or Windows environment, you can use Samba or CIFS. Many distributed teams use Dropbox.

  • When: As needed

Revision control

Revision control enables experimentation, parallel development, disaster recovery, and tracking of current state, but (IMO) is mainly a communication tool.

I highly recommend Git. If on Windows, Mercurial might be preferred, though I would still use Git via Cygwin. I have used RCS, CVS, SVN, and TFS in the past, and Git is the first VCS I’ve fallen in love with.

It is very easy to set up: all you need for one developer is to have git installed on the system, and then run git init in your source directory. For teams of more than one, you can use a shared file system, a server with SSH access, or an online service such as GitHub or Bitbucket.

Gaining expertise & proficiency with the tool takes much longer than just setting it up.

  • When: From the start

Isolated & complete development environments

You want a single developer to be able to change anything about the application without affecting other developers. Each developer should be able to develop independently, with their own copy of the code, and their own database. Sharing a front-end web server is OK. Ideally, you should be able to check out the source code and run one command to set up the development environment. This also makes setting up your Continuous Integration server much easier.
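
A one-command setup can start out very small. The Python sketch below is only an illustration under assumed details: the `setup_dev_env` function, the per-developer directory layout, the SQLite database, and the `schema_version` table are all hypothetical stand-ins for whatever your application actually needs.

```python
import sqlite3
from pathlib import Path


def setup_dev_env(root: Path, developer: str) -> Path:
    """Create a private working area and database for one developer.

    Safe to run repeatedly: existing directories and tables are left alone.
    """
    env_dir = root / developer
    env_dir.mkdir(parents=True, exist_ok=True)

    # Each developer gets their own database file, so schema or data
    # experiments never affect anyone else.
    db_path = env_dir / "dev.sqlite3"
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)"
        )
        conn.commit()
    finally:
        conn.close()
    return db_path
```

Calling `setup_dev_env(Path("dev-envs"), "alice")` a second time changes nothing, which is the property that lets a build server run the same command on every build.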

  • When: From the start

Packaging and Deployment

Once you’re ready to ship your software, you’ll want to package it up somehow, or have some other way of deploying it.

An interim solution that I’ve used and seen used by others is just to clone the source code to the production directory, and then do a git pull to grab new updates. This works great in many instances, and a simple git log in the directory will tell you which commit you’re at.

However, in many cases you don’t want to ship all of your source files around. Also, if your application is compiled or requires a build, the git pull solution may not work by itself.

The simplest way to package your code is a compressed archive (zip, tar.gz). You can easily name the file with a version number. For Linux systems, you may want to use RPM or .deb packages. For Windows, MSI installers are an option.

However you package your code, it should be done automatically during the build.
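
For the compressed-archive option, a build step along these lines could produce a versioned file. This is a minimal sketch using Python’s standard `tarfile` module; the `build_archive` function and the `name`/`version` arguments are hypothetical.

```python
import tarfile
from pathlib import Path


def build_archive(source_dir: Path, out_dir: Path, name: str, version: str) -> Path:
    """Pack source_dir into <out_dir>/<name>-<version>.tar.gz and return its path."""
    out_dir.mkdir(parents=True, exist_ok=True)
    archive_path = out_dir / f"{name}-{version}.tar.gz"
    top_level = f"{name}-{version}"
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname makes everything unpack under a single versioned directory.
        tar.add(source_dir, arcname=top_level)
    return archive_path
```

Keep `out_dir` outside `source_dir`, otherwise the archive being written ends up inside itself.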

To deploy, I’ve used rsync, scp + SSH, and Robocopy.

Heroku and Google App Engine have simple tools for deploying to their infrastructure if you’re using them to host your apps.

  • When: Once you’re getting ready to deploy; start simple, refine as needed

Automated Testing

Software Testing is a large topic. You’ll want a combination of automated and manual testing. I tend to favor as much automated testing as possible, and I’m a big fan of Test Driven Development.

Although you can get away without writing tests at the start, regressions & bugs can halt forward progress once your application is put in to production, if you’re not careful.
I find that automated testing is critical for reliability and speed of development. There is some good guidance here on this topic.
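
As a small illustration of what an automated test looks like, here is a sketch using Python’s built-in `unittest` module. The `apply_discount` function is a hypothetical piece of application code invented for the example.

```python
import unittest


def apply_discount(price: float, percent: float) -> float:
    """Return price reduced by percent; reject percentages outside 0-100."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


class ApplyDiscountTest(unittest.TestCase):
    def test_typical_discount(self):
        self.assertEqual(apply_discount(200.0, 25), 150.0)

    def test_zero_discount_leaves_price_unchanged(self):
        self.assertEqual(apply_discount(19.99, 0), 19.99)

    def test_invalid_percent_is_rejected(self):
        with self.assertRaises(ValueError):
            apply_discount(10.0, 120)
```

Saved in a file whose name starts with `test`, this runs with `python -m unittest`. Once a regression is caught by a test like this, it stays caught.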

  • When: From the start for critical & easily testable features; increase code coverage as needed

Configuration

Store your settings in configuration files. Don’t hard-code settings in to source code files. Configuration files should not be committed to source control. Instead, commit a template (named differently than the actual config file). Ideally, generate & update existing files automatically for developers and test/staging/prod environments as the template evolves.
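
One way to keep local files in step with an evolving template is to add whatever settings are missing and never touch values that are already set. The sketch below assumes INI files and uses Python’s standard `configparser`; the `sync_config` function and the file names in the usage note are hypothetical.

```python
import configparser
from pathlib import Path
from typing import List


def sync_config(template_path: Path, config_path: Path) -> List[str]:
    """Add settings that exist in the template but not in the local config.

    Existing local values are never overwritten. Returns the "section.key"
    names that were added.
    """
    # interpolation=None treats "%" as a plain character in values.
    template = configparser.ConfigParser(interpolation=None)
    template.read(template_path)
    local = configparser.ConfigParser(interpolation=None)
    local.read(config_path)  # a missing file is simply treated as empty

    added = []
    for section in template.sections():
        if not local.has_section(section):
            local.add_section(section)
        for key, value in template.items(section):
            if not local.has_option(section, key):
                local.set(section, key, value)
                added.append(f"{section}.{key}")

    with open(config_path, "w") as handle:
        local.write(handle)
    return added
```

For example, `sync_config(Path("app.ini.template"), Path("app.ini"))` would be run after each checkout. One limitation of this sketch: `configparser` drops comments when it rewrites the local file.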

Format choices are often tied closely to a language, but common formats include XML, YAML, INI, JSON.
It should be a plain text format for easy editing via terminal. Don’t build your own format.

You should start storing things in a configuration file as soon as you have any settings that will vary between developers or installations, and you don’t want to put that info in a database. Your database connection parameters will typically need to go in a configuration file anyway.

  • When: As soon as needed

Code Coverage analysis

A Code Coverage tool helps you see what is and isn’t tested, so that you can decide where risk is higher, or where to add tests. It doesn’t tell you that code is correct. Looking at a coverage report can confirm or refute developer assumptions about the existing tests. A graph of code coverage over time can be a nice metric for a dashboard, and some continuous integration servers will create one for you. Some teams set it up so that a build will fail if coverage falls below a certain threshold.
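
The threshold check itself is simple arithmetic. As a sketch, assuming the coverage tool can report how many executable lines exist and how many were run, a build gate might look like this in Python (`coverage_percent` and `coverage_gate` are hypothetical names):

```python
def coverage_percent(covered_lines: int, total_lines: int) -> float:
    """Percentage of executable lines that the tests exercised."""
    if total_lines == 0:
        return 100.0  # nothing to cover; treat as fully covered
    return 100.0 * covered_lines / total_lines


def coverage_gate(covered_lines: int, total_lines: int, threshold: float) -> int:
    """Return a process exit code: 0 if coverage meets the threshold, else 1."""
    percent = coverage_percent(covered_lines, total_lines)
    print(f"coverage: {percent:.1f}% (threshold {threshold:.1f}%)")
    return 0 if percent >= threshold else 1
```

With 45 of 60 lines covered, coverage is 75%, so a threshold of 80 fails the build and a threshold of 75 passes. Many coverage tools have this built in; coverage.py, for instance, offers a `--fail-under` option on its report command.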

  • When: As needed

Continuous Integration

A Continuous Integration build server is used to compile your code (if using a compiled language, and not an interpreted language) in to binaries or byte code, run all tests, and run all static code analysis.

It is basically a glorified cron job with history & metrics. Developers should be able to run everything the build does with a single command in their development environment. In fact, it’s probably best to have the build just call a single target which is wrapped up in a config file for the build tool of your choice.
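
The “single command” idea can be sketched as a runner that executes each build step in order and stops at the first failure. The example is in Python rather than a Makefile purely for illustration; `run_build` and the placeholder steps in `BUILD_STEPS` are hypothetical.

```python
import subprocess
import sys
from typing import List, Sequence


def run_build(steps: Sequence[Sequence[str]]) -> int:
    """Run each command in order; stop at the first failure and return its exit code."""
    for command in steps:
        print("running:", " ".join(command))
        result = subprocess.run(command)
        if result.returncode != 0:
            return result.returncode
    return 0


# Placeholder steps; a real project would list its compile, test, and
# static-analysis commands here.
BUILD_STEPS: List[List[str]] = [
    [sys.executable, "-c", "print('compile step')"],
    [sys.executable, "-c", "print('test step')"],
]
```

A developer and the CI server would both call `run_build(BUILD_STEPS)`, so a green build on a workstation means the same thing as a green build on the server.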

My tool of choice for a CI server is Jenkins. It is Java based, very easy to set up, very configurable, and has tons of plugins. I still use old-fashioned Makefiles for my build files & automation, rather than Ant, Maven, Phing, Rake etc.

  • When: As soon as tests take too long to run manually each time, or when bugs & regressions keep making it past Code Reviews, or when technical management wants metrics & history

Documentation Extraction

There are lots of tools for extracting code comments out to a separate file. They are easy to set up, but I have found I rarely use the documents produced. You should still include Javadoc or similar comments, but let your IDE consume them for code hinting, completion, and quick help instead.

  • When: As needed, if at all

User Documentation

The need for User Documentation depends on the complexity of the software and how it’s used. Try to design your product so documentation isn’t needed.

If you’re creating an iPhone app, you probably don’t need docs. If creating a developer API, you’ll need docs for sure.
For a web site, create as needed, but make sure you don’t have usability problems first, and that your documentation is for more advanced features and users, or those that want to learn your product more thoroughly.

  • When: As needed

Code Reviews

Code Reviews serve many purposes; the main one is learning. They are also used to improve code quality, catch bugs, keep code consistent (if lacking static analysis tools), and improve maintainability. A code review can be interactive, with two people looking at a screen, or asynchronous. You don’t fix issues during a code review; you just take notes for later. To learn more, I recommend the book Peer Reviews in Software: A Practical Guide.

You can do code reviews from the start with git using git log -p or git diff on a remote. For a nice visual diff, I recommend the git diffall tool, which will use your default difftool. I use KDiff3.

If you are finding code reviews involve a lot of comments on a lot of sections of code, I recommend setting up a tool such as ReviewBoard as soon as possible.

  • When: From the start; Specialized tool as needed

Coding Standards

How much you need here depends on your developers. Some (myself included) have such attention to detail that style inconsistencies throw them off when they are trying to read & understand the code. For others, inconsistencies may just be an annoyance. If something annoys you about another person’s coding style, bring it up & put it in writing. A consistent style improves efficiency. Standards can be detailed or more general; what matters most is that they are agreed upon. Compromise, but give weight to the judgment of experienced coders, and err on the side of more consistency. Use the accepted style of the community unless you have a very good reason not to.

  • When: As needed

Logging

When I diagnose problems on an unknown application, I look at the logs first. Being able to refer to logs is invaluable. Make sure your application has good logging; you will not regret the investment. It’s usually a good idea to have different logging levels.
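
To show what different levels look like in practice, here is a minimal sketch using Python’s standard `logging` module. The logger name `myapp` and the `charge_customer` function are hypothetical.

```python
import logging

logger = logging.getLogger("myapp")  # "myapp" is a hypothetical application name


def configure_logging(level: int = logging.INFO) -> None:
    """Send log records to stderr with a timestamp, level, and logger name."""
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )


def charge_customer(customer_id: int, amount: float) -> bool:
    """Hypothetical operation showing which level suits which event."""
    # DEBUG: detail that is only useful while diagnosing a problem.
    logger.debug("charge requested: customer=%s amount=%.2f", customer_id, amount)
    if amount <= 0:
        # WARNING: something unexpected that the application handled.
        logger.warning("rejected non-positive charge for customer %s", customer_id)
        return False
    # INFO: a normal, noteworthy event.
    logger.info("charged customer %s %.2f", customer_id, amount)
    return True
```

Calling `configure_logging(logging.DEBUG)` on a development machine shows everything, while the default of `logging.INFO` keeps routine production logs quieter without touching the code that emits the messages.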

You definitely want to have logging in place before putting your application in production.

  • When: As soon as possible; it’s annoying to add everywhere later

Branching & Versioning

As the project matures, and more developers work on it, and especially after the application goes in to production, you’ll need to have a good branching & versioning system.

Branches allow you to work on features in parallel that might not all be released together, to work on hotfixes, and merge everything back together while avoiding manual work and (ideally) not creating a mess.

A successful Git branching model is a good model for branching & versioning. The author packaged up his flow in to a tool called git flow.

For versioning, Semantic Versioning is quite popular. Jeff Atwood has a good blog post about version numbers.

It’s important to be able to see quickly what the version is of a particular deployed application. Nothing sucks worse than hunting for a bug that appears in production in the wrong version of the code.

  • When: From the start if possible; otherwise once released in to production

Monitoring & Reporting

You’ll probably want some method to make sure your servers and services are up and running.
This completely depends on the business. Uptime & reliability isn’t as important for some products & companies, so the investment isn’t justified, especially early on.
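
At its simplest, an “is it up?” check can be a script that tries to open a TCP connection to each service. This is a minimal sketch with Python’s standard `socket` module; `is_port_open` is a hypothetical helper, and a successful connection only shows that something is listening, not that the service is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

Run from cron against a short list of hosts and ports, something like this is often enough early on, before a dedicated monitoring tool is justified.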

  • When: As needed

Application Framework

For most projects, picking the programming language(s) is only the first step. It often makes sense to employ an Application Framework as well. Frameworks built on the MVC Pattern are numerous and quite popular.

If you’re experienced with a language, but not a particular framework, some of the frameworks can have a significant learning curve. It may slow you down at the beginning, but you’ll find the frameworks solve a lot of problems that you’d have to solve yourself if you didn’t use one.

Ruby on Rails is such a well-known platform that for a while, most times when someone was talking about Ruby, it was in the context of Rails.

  • When: From the start for any medium to large project

Customer & User Metrics

As has been popularized by the Lean Startup movement, it’s becoming more and more important to be able to generate meaningful metrics on customers’ use of your product.

Putting a system in place to track these kinds of metrics takes an investment up front. The risk of not doing it is building a lot of features the user doesn’t want or use.
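
Even a very simple metric can surface features that nobody uses. The Python sketch below assumes your application records (user, feature) events somewhere; the event format, `feature_usage`, and `unused_features` are all hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def feature_usage(events: Iterable[Tuple[str, str]]) -> Counter:
    """Count distinct users per feature from (user_id, feature) events."""
    seen = set(events)  # collapse repeat uses by the same user
    return Counter(feature for _user, feature in seen)


def unused_features(all_features: Iterable[str], usage: Counter) -> List[str]:
    """Features that no user touched: candidates for rework or removal."""
    return sorted(f for f in all_features if usage[f] == 0)
```

Counting distinct users rather than raw events keeps one heavy user from making a feature look popular.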

  • When: From the start, unless you know exactly what the customer should have
    • (no, you are not Steve Jobs)
