On using Subversion for web projects

Friday, September 21, 2007

Posted by James Ellis / Filed under Code, Web

Subversion, the open-source version control software, has changed our web development process.

At one point in time, I thought version control software was the stuff of super-nerds. I had imagined complex software running on complex servers doing something fancy to manage programming projects. I didn’t recognize it as something that applied to me. So, for the most part, I ignored it.

Then, as version control started to creep into the web development community, I began to take notice. I would see mention of it in blogs, in books, etc. The more I learned, the more I realized it was a cool idea with a lot of benefits.

The big idea:

  • Store (and safe-keep) your project in a repository on a remote server. Never worry about making local backups. Each time you commit changes to the repository, you are making a remote backup.
  • Allow multiple users to collaborate on the same code base at the same time. Collaboration from any number of users, from any machine, at any time.
  • Keep track of all changes made to a project over time. Subversion allows you to jump back in time and access any and all previous versions of a project. No more duplicating files as backups just in case you break something.

    (Sidenote: This aspect of version control seems very similar to Leopard’s Time Machine feature. I’m not sure how Apple implemented this, but perhaps it’s similar to how Subversion works.)

Eventually, I ended up working on a project with a colleague, and he already had the project in Subversion. He suggested that I get up to speed with Subversion so that we would remain in sync with one another. I gave it a shot, and by the end of the project, I had the hang of it. More importantly, I realized version control should be an integral part of the web development process. (thanks Tim!)

The basics of Subversion

Subversion stores your project in a repository. The repository usually lives on a remote server. As you make changes to your project, the repository remembers everything you do.

Once you have your repository in place, you check out a copy of the project to your local machine. As you make changes to your local working copy, you commit your changes to the repository. Also, you can update your local copy to pull down the latest changes made by other users. That’s all there is to it.

Where it gets interesting…

The lone web developer may ask, “Why should I bother? I’m a one-man team, a lone wolf. I can manage my own backups. I don’t collaborate with anyone so I don’t need this stuff.”

To me, one of the most important benefits of using Subversion for web development projects is that Subversion eliminates one of the classic web development tasks: using an FTP client to push files to your production server. With Subversion, your production server can run a working copy of your repository just like your local machine does. So, like your local machine, to update the copy of the project on the live server, you run an update, pulling down the latest changes from your Subversion repository.

Let’s say you’re working on a project and you need to push a large number of changes live to a busy website. Perhaps 30+ files have changed — lots of new code, you’ve added some new images, maybe some videos, fixed a few bugs, etc. — and you want to push these live.

If you’re relying on an FTP client:

  • You have to be very careful to make sure you upload all the latest files.
  • You have to wait for them to upload.
  • It can be very tricky to make all the changes happen at once.
  • If something goes wrong, it can be difficult to revert back to a previous version.

If you’re using Subversion:

  • You run one command, svn update, that pulls down all of the latest changes at once. Subversion knows exactly which files have changed, which files are new, which files need to be deleted, etc. If your repository is hosted on the same server, the update runs in a second or two. If you’re connecting to a repository on an external server, the transfer rate should still be very fast with most updates taking place in a matter of seconds.
  • If something goes wrong, you can have Subversion revert to the previous state where everything worked.
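As a rough sketch, the update and the rollback on a production working copy might look like this. The /var/www/mysite path and the revision number are placeholders rather than values from a real setup; the -r option tells svn update which revision to move the working copy to.

```shell
#!/bin/sh
# Sketch only: LIVE_DIR is a placeholder for the working copy that serves the site.
LIVE_DIR="/var/www/mysite"

# Pull the latest committed changes into the live working copy.
deploy_latest() {
    svn update "$LIVE_DIR"
}

# Move the live working copy back to an earlier revision that is known to work.
# Usage: roll_back_to 1234   (1234 is a placeholder revision number)
roll_back_to() {
    svn update -r "$1" "$LIVE_DIR"
}
```

Running deploy_latest after each round of commits, and keeping a note of the last revision that worked, is enough to make roll_back_to useful.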

Still not sold? Consider this…

I do most of my work at the Athletics office, but there are times when I need to take projects home, on the road, or send projects to other developers. It was always a pain trying to sync my desktop and laptop: I’d use little jump drives, post zip archives to our FTP server, or remember to bring my laptop to the office. With Subversion I don’t bother with any of that. To sync any computer, I just run an update and I’m done. To provide another developer with access, I just have them check out a working copy from the repository.

Update:

Henry Todd pointed out to me that running your production server as a working copy isn’t the smartest way to deploy critical web apps. Henry offers up a more solid solution.

The deployment process:

  1. Configure Apache to point the server’s document root to a symlink. The symlink will then point to whatever directory is currently being used as the live directory.
  2. Instead of running an update on the production server (where the live webroot is also a Subversion working copy), the site is generated by running an export of the project to a directory parallel to the current live directory.
  3. Then, to make the switch, you simply change the symlink to point to the new directory.

Benefits:

  • While the export process may seem a bit cumbersome, this method allows you to push changes to your production server all at once. Changing a symlink is like snapping your fingers, while running a big update can potentially take a while.
  • If something goes wrong, reverting back to a previous version is much easier, and again, much faster — you just change the symlink to point back to the old live directory.
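The export-and-symlink process above could be scripted roughly as follows. This is a sketch under assumptions, not Henry's actual setup: the repository URL, the /var/www paths, and the timestamped directory names are all made up for illustration, and the releases directory is assumed to exist already.

```shell
#!/bin/sh
# Sketch only: every path and URL below is a placeholder.
REPO_URL="svn+ssh://user@yourhost.com/path/to/repository_name"
RELEASES="/var/www/releases"   # existing directory holding one export per deploy
LIVE_LINK="/var/www/live"      # the symlink Apache's document root points at

# Repoint the live symlink at an existing release directory.
# -n makes ln replace the link itself instead of following it into the old directory.
point_live_at() {
    ln -sfn "$1" "$LIVE_LINK"
}

# Export a clean copy (no .svn directories) next to the current release,
# then switch the symlink only if the export succeeded.
deploy_release() {
    new_dir="$RELEASES/$(date +%Y%m%d%H%M%S)"
    svn export "$REPO_URL" "$new_dir" && point_live_at "$new_dir"
}
```

Rolling back is then a matter of calling point_live_at with the previous release directory, for example point_live_at /var/www/releases/20070921120000 (a made-up name).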

For most of our sites, running the production environment as a working copy is totally fine. But if it’s a critical website, this method is the way to go. Thanks Henry!

Getting started with Subversion

If you’re serious about getting started, you should read the Subversion book. It’s extensive and everything you need to know is in there. And it’s free. If you prefer more hand-holding, I recommend the book from the Pragmatic Programmers, Pragmatic Version Control. I enjoyed it.

Downloading the Subversion client

The Subversion website offers packages for just about every system on the downloads page. The OS X package is easy to use.

Setting up a repository

You’ll need to find a hosting provider that supports Subversion. Most of the good ones do. All of our projects run on Empowering Media’s managed VPSs, but we’ve used Media Temple and Joyent/TextDrive in the past.

To create a repository, you’ll need to SSH into your server (using Terminal on OS X), go to the directory where you would like to store the repository, and run:

svnadmin create REPOSITORY_NAME

This will create an empty repository. You don’t put anything into the repository until you start making commits. Keep reading…

(Read more on svnadmin create, or check out Media Temple’s KB doc on creating repositories.)

Initial checkout

Once you have your repository in place, you’ll want to check out a working copy to your local machine. You will need to have the Subversion client installed on your machine to do so.

Using Terminal, navigate to the folder where you would like to store your working copy. You will use the svn checkout command to check out a working copy to your local machine. The way in which you connect to your repository depends on how your host has Subversion configured. Many hosts require the svn+ssh method. It looks something like this:

svn checkout svn+ssh://user@yourhost.com/path/to/repository_name

Running this command will connect to your repository and “check out” the latest version of the project. Subversion will create all of its behind-the-scenes support files (see the bit about .svn folders below) and essentially “activate” your local working copy.

Working with your local copy

Once you have your local working copy in place, you can begin to add/modify files and folders.

First, you need to keep in mind that Subversion requires that you explicitly add files and folders to the repository — see svn add. Once files/folders have been added, Subversion will keep track of all changes made.

Working with your local copy will require discipline on your part. You will need to inform Subversion of certain changes that you make. For example, if you need to delete a folder from your working copy, you will need to run the svn delete command. This will instruct Subversion to delete the directory with the next commit you make. If you forget to do this and just delete the folder yourself, Subversion will get confused. It’s a similar situation for instances where you want to rename, copy or move folders. While this file system hand-holding can seem cumbersome at first, it eventually becomes second nature. In fact, I find that the added effort helps keep me more deliberate when making decisions regarding file/folder structures.
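To make that concrete, here is a small sketch of the commands involved. It assumes you are inside a working copy, and every file and folder name in it is made up for illustration.

```shell
#!/bin/sh
# Sketch only: run from inside a working copy. All file and folder names are made up.
restructure_example() {
    # Start tracking a new file.
    svn add images/logo.png
    # Schedule a folder for deletion with the next commit.
    svn delete old_templates
    # Rename a file through Subversion so its history follows it.
    svn move css/main.css css/site.css
    # Review what the next commit will send.
    svn status
}
```

In the svn status output, an A marks scheduled additions and a D marks scheduled deletions, which is a quick way to check that Subversion knows about everything you changed.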

Making Commits

After you have added or modified files in your local copy, you will want to make commits. Running a commit will instruct Subversion to send your changes and new files to the repository. Once the repository receives the commit, a new version of the project is recorded.

Subversion features atomic commits. In addition to sounding cool, atomic commits are quite important. When receiving a commit, your repository will not record the commit as a new version until it receives the entire thing. Thus, if you are making a very large commit, and in the middle of uploading all the files your internet goes down, your repository will disregard the entire commit. This keeps your repository from getting out of sync.

To make a commit, you will run the svn commit command.
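A minimal sketch of a commit, wrapped in a hypothetical helper so the message is always supplied. The -m option passes the log message on the command line; without it, svn asks for one in your configured editor.

```shell
#!/bin/sh
# Sketch only: commit_changes is a made-up helper name.
# Sends all local changes in the current working copy to the repository
# as a single new revision, with the given log message.
commit_changes() {
    svn commit -m "$1"
}
```

For example: commit_changes "Fix broken link in footer".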

Tools

SvnX

While you can get by with the command line (Terminal) alone, it can be helpful to have a GUI around. On OS X, my favorite is SvnX, an open source Subversion client.

I tend to use a combination of both SvnX and the command line. I like having SvnX available to help add and remove files/directories, and I occasionally use it for commits and updates, but for the most part, I prefer the command line.

And though I’ve never tried it, I understand that TortoiseSVN is an amazing Windows client.

TextMate

TextMate is my editor of choice, and it includes some very handy built-in Subversion support. I often run commits, adds, etc. within TextMate. It’s not as robust as SvnX, but it can handle most Subversion tasks.

SSHKeychain

Another important app on OS X is SSHKeychain.

If you’re SSH-ing into your server often, and especially if you are connecting to your repository over svn+ssh://, you are going to want to establish SSH key pairs instead of typing in your password a million times. SSHKeychain is an open source application that integrates your SSH keys with OS X’s keychain.

First, you’ll need to set up SSH key pairs. Check out this article from the TextMate site that goes through the process.

Important Notes

What’s with the .svn directories?

From the Subversion book:

Every directory in a working copy contains an administrative area, a subdirectory named .svn. Usually, directory listing commands won’t show this subdirectory, but it is nevertheless an important directory. Whatever you do, don’t delete or change anything in the administrative area! Subversion depends on it to manage your working copy.

You won’t find these .svn directories in OS X’s Finder, but they are there behind the scenes keeping track of your working copy. While they are out of sight, you should keep them in mind…

Let’s say you have two different projects going on, both in Subversion, in two separate local copies. Imagine that you want to copy a directory from project A into project B. If you duplicate a directory in project A and place it in project B, the hidden .svn directories will cause you problems. Subversion will see these .svn directories from project A and get confused.

The solution is to remove the .svn directories (often referred to as “taking the files/folders out of version control”) before you copy to project B, then run svn add to instruct Subversion to add the new directory.

As you may imagine, there are many instances where you need to take files out of version control. In these instances, you need a way to get rid of these .svn directories. Here are two solutions:

  • Run an export. Subversion’s export command is designed specifically to export files from either a working copy or a repository. The exported files will not be under version control (they won’t have the .svn directories).
  • Use a script to remove all .svn directories. It’s not as elegant, but this command can be super handy if you know what you’re doing. To strip the .svn directories from anything, open up a Terminal window, navigate to the folder in question, and run:

    find . -type d -name .svn -print0 | xargs -0 -t rm -Rf

    (Friends, keep in mind that any recursive rm command should be used with caution.)

Update:

Huge thanks to David Buxton for emailing me about the rm command I had originally posted. David spotted a flaw and hooked me up with the command above.

Finally, if you’re just copying files within the same working copy, you don’t need to take the files out of version control. Just use the svn copy command. When you use this command instead of copying a folder in Finder, Subversion copies the folder and manages the .svn support files appropriately.
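Both routes can be sketched as small helpers. The function names and the paths in the usage line are hypothetical; the first helper relies on svn export accepting a working copy path as its source, and the second on svn copy working between two paths in the same working copy.

```shell
#!/bin/sh
# Sketch only: helper names and all paths are placeholders.

# Copy a directory from one working copy into another without its .svn folders,
# then put the new directory under version control in the destination project.
copy_between_projects() {
    svn export "$1" "$2" && svn add "$2"
}

# Duplicate a directory inside a single working copy.
copy_within_project() {
    svn copy "$1" "$2"
}
```

For example: copy_between_projects project_a/widgets project_b/widgets. Either way, nothing reaches the repository until the next commit.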

Securing your server’s public-facing working directory

If your production server (live web server) is going to be using a working copy of a Subversion repository as the public-facing site (document root), you need to make sure that visitors can’t browse the hidden .svn directories that Subversion creates to keep track of your working copy. These files contain some sensitive information that you don’t want to share with the world.

For Apache, the easiest way to disallow access to this information is to include the following directive in your httpd.conf file:

<DirectoryMatch "^/.*/\.svn/">
   Order deny,allow
   Deny from all
</DirectoryMatch>

If you don’t have access to your host’s httpd.conf file, you can use an Apache mod_rewrite directive. The following code should be placed in an .htaccess file in your document root.

<IfModule mod_rewrite.c>
   RewriteEngine On
   # Prevent .svn directory browsing.
   RewriteRule ^(.*/)?\.svn(/.*)?$ http://www.yourdomain.com/error.html [L]
</IfModule>

For more on URL re-writing using Apache’s mod_rewrite, check out this article on Sitepoint.
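Whichever rule you use, it is worth checking from the outside that it works. The helper below is a sketch: its name is made up, it assumes curl is installed, and it assumes a file named entries exists inside the .svn directory at your document root.

```shell
#!/bin/sh
# Sketch only: prints the HTTP status code a visitor gets when requesting
# a file inside .svn. The base URL and the .svn/entries path are assumptions.
svn_dir_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1/.svn/entries"
}
```

For example: svn_dir_status http://www.yourdomain.com. With the httpd.conf directive in place, a 403 would be the expected answer. With the rewrite rule, the status depends on how the redirect to error.html is handled, so look at the response body as well and make sure it is not Subversion's data.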

Subversion Hosting

Lately I’ve noticed a bunch of new Subversion hosting providers. I haven’t tried any of them, but I’m fond of hosted/managed services, and I’ve been considering giving one of them a try. Most providers offer built-in integration with Trac, a web-based bug/issue tracking system. We’ve never worked with Trac, but we do use Basecamp on a daily basis, and I’d like to see more Subversion hosting services offer Basecamp integration. (Springloops is the only service I’ve found with Basecamp support.)

  • DevGuard
  • SVNRepository
  • wush.net
  • CVSDude
  • Springloops
    Springloops is particularly interesting as they offer Basecamp integration and their service is specifically geared for web development teams, whereas most of the other Subversion hosting services are designed for programming teams.
  • Google Code
    * Christian Wolf emailed to remind me that Google offers free open source project hosting using Subversion.

Since I like conclusions…

Subversion is now an integral part of our web development process. We enjoy the peace of mind knowing that we have a comprehensive repository (dare I say, time machine!) of each project sitting on heavy-duty remote web servers. We enjoy the ability to seamlessly work with developers all over the country. We enjoy pushing changes to sites with just one command.

But one thing remains… version control is still the stuff of nerds. By its very nature, Subversion creates another layer of complexity between you and your code. It took a little time for me to get comfortable with Subversion, but at this point, I can’t imagine working on a web project without it. If you’re a web nerd, I promise: Subversion is for you.


Comments? Contact James via email.