Seventeen entries in "Web"

Delicious API Integration

Tuesday July 1, 2008 - 3 weeks ago

Posted by James Ellis / Filed under Code, Web

Last week we announced the relaunch of the Thinking for a Living website. Today a bit about how it works.

Earlier this year Duane King (of BBDK) and I started discussing possibilities for a new TFAL website. What had started out as a brief booklet was evolving into a larger concept: an open-source design education resource. Duane wanted to grow a database of print and online resources, and find a way to publish this information effectively online. I recommended using Delicious, the Yahoo-owned online bookmarking service, as a back-end.

Delicious allows you to maintain a single repository of bookmarks, with all data hosted on Delicious servers instead of your local machine. The obvious benefit to this system is that your bookmarks are accessible from any computer on the web.

It’s easy to push new bookmarks to the system: You can use the Delicious Firefox extension or browser bookmarklets to quickly bookmark items as you browse. Each bookmark may include information such as title and description. And rather than organize bookmarks by placing them in specific folders, each Delicious bookmark can be associated with any number of tags.

Most important for our purposes, and where things get interesting: Delicious offers an API that allows external applications to communicate with the service. By tapping into the API, we can access the TFAL Delicious account bookmarks, and (re)publish this data/content on the TFAL website in whatever fashion we like.

This is interesting as most people view Delicious as a personal internet utility, not a content management system. It’s also interesting that we can take our bookmark data/content out of the highly normalized Delicious environment and integrate it within something of our own design.

It could be said that we’re simply reformatting information. However, I think there is more to it than that. The important point is that we’re able to involve our content in a larger context and give it greater meaning. And by presenting it in a format of our own design, we can make the content that much more uniquely ours. Given the standardization of the modern web, we think a little specialness goes a long way.

Working with the API

Athletics has been working with the Zend Framework for some time now. Given the investment we’ve made in our existing systems/infrastructure, ZF’s plug-n-play approach to framework integration suits us well. ZF should be accessible to anyone with a decent grasp of PHP.

ZF provides dozens of modules designed to address a variety of different tasks. For this project we made use of Zend_Service_Delicious, a module designed specifically for interfacing with the Delicious API.

An example of the code required to connect to the Delicious API and query for all posts:

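  <?php
  // Assumes the Zend Framework library directory is on the include_path
  require_once 'Zend/Service/Delicious.php';

  // Connect to the Delicious service
  $delicious = new Zend_Service_Delicious('username', 'password');

  // Query for all posts
  $all_posts = $delicious->getAllPosts();

  // Loop through all posts. Method calls are not interpolated inside
  // double-quoted strings, so concatenate their return values instead.
  foreach ($all_posts as $post) {
      echo 'Title: ' . $post->getTitle() . "\n";
      echo 'Url: ' . $post->getUrl() . "\n";
  }
  ?>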

It doesn’t get much easier than that. ZF abstracts all the complicated stuff; you never have to bother about XML/JSON/etc.

Caching

Delicious is strict about how often you hit the API, as they can’t have external apps slamming the service and draining bandwidth. From their API help page:

  • Please wait AT LEAST ONE SECOND between queries, or you are likely to get automatically throttled.
  • Please watch for 503 errors and back-off appropriately.

This means you must implement some sort of server-side caching to prevent your site from making too many calls to the Delicious API. With TFAL, we store the result of $delicious->getAllPosts() for about two hours. We do this using another handy ZF module: Zend_Cache.

In the following example we combine Zend_Service_Delicious with Zend_Cache to store the Delicious data for two hours:

  1. <?php
  2. $frontendOptions = array(
  3.     'lifetime' => 7200, // cache lifetime (60 * 60 * 2 = 2 hrs)
  4.     'automatic_serialization' => true
  5. );
  6.  
  7. $backendOptions = array(
  8.     'cache_dir' => 'path_to_cache_dir' // Directory where we put the cache files
  9. );
  10.  
  11. // Instantiate a Zend_Cache_Core object
  12. $cache = Zend_Cache::factory('Core', 'File', $frontendOptions, $backendOptions);
  13.  
  14. // Can we load the cache?
  15. if( !$delicious_data = $cache->load('AllPosts' ) ) {
  16.  
  17.     try {
  18.         // Connect to the Delicious service
  19.         $delicious = new Zend_Service_Delicious( 'username', 'password' );
  20.  
  21.         $delicious_data = $delicious->getAllPosts($tag=null);
  22.  
  23.         // Save the results to the Cache
  24.         $cache->save( $delicious_data, 'AllPosts');
  25.  
  26.     } catch ( Zend_Service_Delicious_Exception $e ) {
  27.         // If connection to Delicious failed, load old cache regardless of expiration
  28.         $delicious_data = $cache->load('AllPosts', $doNotTestCacheValidity=true );
  29.     }
  30. }
  31. ?>

In the code above we initialize a Zend_Cache object on lines 1-12. On line 15 we attempt to load the cached version of the Delicious data. If the cache has expired we continue to lines 17-29 where we first try to connect to the Delicious API and request all posts (lines 19-24). If this fails (the server couldn’t connect to the Delicious service for any reason), we catch the exception (line 26) and force load the last cached version of the Delicious data, regardless of whether it has passed its 2-hour expiration date.

We found this sort of caching essential as Delicious doesn’t hesitate to block your requests if you get too greedy. (We were blocked a couple times during development.)
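Caching covers the "don't ask too often" half of the Delicious guidelines. For the "watch for 503 errors and back off" half, a minimal sketch might look like the following. This is an illustration, not what TFAL runs: the function name fetchWithBackoff, the array(status, body) shape returned by $request, and the doubling delay are all assumptions, and none of it comes from Zend_Service_Delicious.

```php
<?php
// Hypothetical helper (not part of Zend Framework): retries a request while
// the service answers 503, doubling the wait between attempts.
// $request must return array($httpStatus, $body).
// $sleep is injectable so the wait can be stubbed out while testing.
function fetchWithBackoff(callable $request, $maxAttempts = 4, $sleep = 'sleep')
{
    $delay = 1; // Delicious asks for at least one second between queries

    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        list($status, $body) = $request();

        if ($status !== 503) {
            return $body;
        }

        // Throttled: wait, then try again with a longer delay
        if ($attempt < $maxAttempts) {
            call_user_func($sleep, $delay);
            $delay *= 2;
        }
    }

    return false; // give up; the caller can fall back to the stale cache
}
```

A false return here would play the same role as the caught exception in the caching example: a cue to serve the last cached copy rather than hammer the API.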

Additional caching

Storing $delicious->getAllPosts() is great and will definitely keep you from being booted off the Delicious API, but once you start dealing with large data sets (with 1200+ items, TFAL certainly qualifies) additional page-level caching can speed things up significantly. For a high-traffic page like the Home page, we store the system-rendered HTML page output for 10 minutes instead of repeatedly asking the system to spider all 1200+ items just to determine what constitutes the latest 3 resources. This makes a lot of sense given that the $delicious->getAllPosts() data-set only updates every 2 hours.

We have page-level caching functionality built into our general purpose web framework. The implementation is very similar to that of WP-Cache. However, I should mention that Zend_Cache can also be used for page-level output caching.
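Our own page cache isn't shown here, but the general idea can be sketched in a few lines of plain PHP using output buffering. Treat this as an assumption-laden illustration: the cachedPage function, the cache file path and the $render callback are hypothetical names, and a production version would also need to worry about cache keys per URL and concurrent writes.

```php
<?php
// Hypothetical page cache: returns the stored HTML while it is younger than
// $ttl seconds; otherwise runs $render, stores what it echoed, and returns that.
function cachedPage($cacheFile, $ttl, callable $render)
{
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttl) {
        return file_get_contents($cacheFile);
    }

    ob_start();
    $render();               // the expensive page build echoes its HTML
    $html = ob_get_clean();  // capture the output instead of sending it

    file_put_contents($cacheFile, $html, LOCK_EX);

    return $html;
}
```

For a 10-minute Home page cache you would call something like echo cachedPage('/path/to/cache/home.html', 600, $renderHome); where both the path and $renderHome are placeholders.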

The numbers:

Rendering the Home page without page caching requires 0.2 seconds and 6.24MB of memory. (This is, of course, using the cached result of $delicious->getAllPosts(). Making the full trip to the Delicious API takes much longer, usually a second or two.)

0.2 seconds might seem fast, but it’s actually a bit excessive. By using a full page-cache we can get this down to .01 seconds and 1.83MB of memory. Big improvement. This additional caching is particularly important the day your site gets linked up on a heavily trafficked website and an avalanche of traffic rolls in.

Additional notes:

The odd del.icio.us name/web address has always bothered me a bit. I’m all for waving the nerd flag, but I’ve never found domain hacks to be all that clever. Inevitably in conversation someone ends up saying “dell dot ick-eo dot you-ess”, which is ridiculous. After being acquired by Yahoo they picked up delicious.com, which bounces you to del.icio.us. I understand Yahoo intends to rebrand the service simply “Delicious” at some point. I’m looking forward to that.

Also, this post was an excuse to play with GeSHi, the PHP-based generic syntax highlighter class. It’s really quite handy and supports fifty-eleven languages. We’ll be using this for code highlighting from now on.


Questions/Comments: Contact James via email.

Thinking for a Living

Monday June 23, 2008 - 1 month ago

Posted by James Ellis / Filed under Design, Web

As designers we celebrate meaningful libraries of information. We love the information, but it’s not our only concern. We also take the design of the library itself – the system that organizes the information – very seriously.

Designers have a long history of library-making. Pre-internet, we compiled books (reference books, monographs, retrospectives, magazines, etc.) Now we find ourselves reinventing our libraries for an online world. We develop new systems to better manage and deliver our information, and to accommodate the scale and scope of the internet.

Unlike the finality of books, online libraries constantly change, and this dynamic nature makes modern library-making a unique challenge. To do it right, you need more than the designer and archivist; you need the information architect – the online system builder – as well.

Today we’re happy to announce the significant relaunch of a modern online resource. We partnered with our friend and colleague Duane King (the DK of BBDK) to develop a system that extends his immense library of online resources to the web. Please enjoy Thinking for a Living™:

http://thinkingforaliving.org/

Stats tracking using Mint and Google Analytics

Wednesday April 23, 2008 - 2 months ago

Posted by James Ellis / Filed under Code, Software, Web

Most every web project we push out the door uses both Mint and Google Analytics (GA) for site stats/tracking/analytics.

Analytics software has come a long way since the days of server log file analysis programs such as Webalizer. Modern jobs like Mint and GA take a different approach, using client-side Javascript to grab more in-depth user data. The client-side approach better differentiates between humans and robots, and captures the return visits of users browsing pages from their browser’s cache. And though it was once a concern, nowadays only weirdos are browsing without Javascript support. They don’t count.

Mint
http://www.haveamint.com

Unless you’re juggling metrics at an ad agency, Mint will generally get the job done, delivering a well organized and easy to digest overview of site activity. A single page/interface reveals your latest visits, referrers, pages, searches and more.

The user experience is very different from GA. Mint doesn’t dig particularly deep, but it’s fast and fun. The interface is chock full of Javascript/AJAX flippity-do’s.

Mint is a self-hosted app and requires a server running both PHP and MySQL — basic gear. Mint collects data via Javascript, then drops it into a MySQL database using a bit of PHP. Mint runs quick, provides real-time stats, and is easy to install.

Cost: $30 per site license (per domain). Cheap.

The Visits module:

Mint Visits Screenshot

The User Agents module displaying Screen Resolutions. Notice how many users are still at 1024×768, even on a site generally trafficked by design/tech folks.

The User Agents module displaying Browsers. Sadly, 10% are still using IE6.

The User Agents module displaying Flash Player installs. 94% on 9, amazing. Thanks Youtube…

Favorite feature: Mint provides an RSS feed of your site’s newest unique referrers. Plug this into Google Reader (or whatever you like) and you can keep up with every new website, blog, link-list, etc. that directs traffic to your site. Beyond the obvious utility, this tool can also reveal web theft in progress.

Mint Demo:
http://www.haveamint.com/about/demo

Mint Feature Highlights:
http://www.haveamint.com/about/feature_highlights

Google Analytics
http://www.google.com/analytics

Upon acquiring Urchin’s tracking software in 2005, Google re-released the software as Google Analytics, free of charge. Given Urchin’s popularity and the new non-price, GA was everywhere seemingly overnight.

GA does fifty-eleven things; it certainly out-features Mint. You’ll find charts, graphs, tables and maps everywhere, for every possible metric. It’s great for analyzing very specific trends. You could spend hours digging through it all.

You can manage any number (or at least a whole bunch) of domains within a single GA/Google account. Not everyone manages a pile of domains, but for those that do, this is a big deal.

If you’re prepping a stats report for a client, check out the export-to-PDF or XML feature. GA produces very attractive PDFs which you can then take into Illustrator and chop up as needed. This includes charts, and it’s all vector.

Unlike Mint, GA allows for the tracking of Flash events, and of file downloads (PDFs, mp3s, etc.) See here and here.

GA even offers a Content Overlay view, allowing you to browse your site with GA sprinkling stats all over the page. You can see which links are getting the most clicks. Check it:

(If you’re into this sort of data visualization, check out Crazy Egg. That’s their whole deal.)

Another interesting feature is the Map Overlay. By analyzing user IP addresses, GA is able to map which countries visitors are coming from.

Why run both?

The biggest difference between Mint and GA is that GA is always a day behind. GA only updates account data once every 24 hours. (Real-time analytics would introduce massive data-processing overhead for Google.) Mint, however, always displays up to the minute data.

Analyzing trends is great, but there are many instances where you want real-time stats. If you can’t ride the refresh button the day you get linked up on Digg, NYTimes, etc., you’ll have missed the fun.


Questions? Comments? Contact James via email.

Web Theft

Thursday April 10, 2008 - 3 months ago

Posted by James Ellis / Filed under Code, Random, Web

The web is pretty well open. You can view-source your way through most HTML, CSS and Javascript. That’s how most web workers learned their way around — by studying other websites. It’s one of the things we like about the web.

We certainly have no issue with anyone viewing our HTML, CSS, etc. But please don’t steal our design. And certainly don’t copy/paste our entire site HTML+CSS, change out the logo, post it behind your own domain and call it your own. Unfortunately, this happens on a somewhat regular basis.

Thanks to Mint’s newest unique referrers RSS feed, we can keep up with the latest URLs linking to the Athletics site. This feed lists the latest sites, blogs, link-lists, etc. directing traffic to our site.

And yesterday, upon clicking through to some of the latest referrers, we found this:

screenshot

Web theft, in progress. Here we have someone in the process of customizing our site to make it their own. They have changed out the logo, changed some copy, but otherwise you can see they are still using our graphics and copy.

We couldn’t find an email address on the site, but after doing a whois on the domain we found that it’s registered to someone in Ankara, Turkey. We did find an email address registered with the domain, but it bounced back our kindly worded please-remove-our-property-from-your-site email.

Then, after taking a closer look, we noticed that they were still linking directly to our images. We realized we had the ability to send the folks at Yenioyun (and other web-offenders that we may not be aware of) a message.

Using a bit of mod_rewrite code, we were able to reroute all external requests for images on our server to an altogether different image.

Click through to http://yenioyun.org/ to see the result. And as I’m sure they will be changing their site shortly, here’s a screenshot for posterity. For the full effect, see the ani-gif we are using.

Of course this is nothing new. Web admins have long employed this sort of tactic for dealing with users leeching bandwidth (hotlinking images within their MySpace pages, message boards, porn sites, etc.) Most recently, I particularly enjoyed the John McCain MySpace incident.

Please, have the code:

With the help of this article and the mod_rewrite manual, we put the following mod_rewrite rule into an .htaccess file and placed it in our images directory.

<IfModule mod_rewrite.c>
  RewriteEngine on
  # Only consider image requests
  RewriteCond  %{REQUEST_FILENAME}  \.(gif|jpe?g|png)$  [NC]
  # Let through requests that send no referrer at all
  RewriteCond  %{HTTP_REFERER}  !^$
  # Let through our own site and the readers/caches we want to allow
  RewriteCond  %{HTTP_REFERER}  !athleticsnyc\.com   [NC]
  RewriteCond  %{HTTP_REFERER}  !bloglines\.com   [NC]
  RewriteCond  %{HTTP_REFERER}  !google\.   [NC]
  RewriteCond  %{HTTP_REFERER}  !search\?q=cache   [NC]
  # Never redirect the replacement image itself (avoids a redirect loop)
  RewriteCond  %{REQUEST_URI} !^/images/stop_stealing\.gif
  RewriteRule  (.*)   http://athleticsnyc.com/images/stop_stealing.gif?id=$1   [R,NC,L]
</IfModule>

The first condition looks for all gif, jpeg and png files. The next few conditions let through requests that carry no referrer and define the domains allowed to display our images (we want Google Reader and Bloglines users to be able to view our images). The last condition disregards the rules if you’re requesting the replacement image (to keep from causing an infinite loop of redirects).
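If you find rewrite conditions hard to reason about, it can help to restate the same decision as ordinary code. The PHP below is only a rough mirror of the rule for thinking and testing, not something we run on the server; the shouldSwapImage function is hypothetical, and it uses plain substring checks where the real rule uses regular expressions.

```php
<?php
// Hypothetical mirror of the rewrite conditions above, for reasoning only.
// Returns true when a request would be redirected to the replacement image.
function shouldSwapImage($path, $referer)
{
    // Referrers we allow to display our images (substring match, case-insensitive)
    $allowed = array('athleticsnyc.com', 'bloglines.com', 'google.', 'search?q=cache');

    // Only image requests are considered
    if (!preg_match('/\.(gif|jpe?g|png)$/i', $path)) {
        return false;
    }

    // No referrer at all: direct visit, or the browser/proxy stripped it
    if ($referer === '') {
        return false;
    }

    // Never redirect the replacement image itself (avoids a redirect loop)
    if (strpos($path, '/images/stop_stealing.gif') === 0) {
        return false;
    }

    foreach ($allowed as $needle) {
        if (stripos($referer, $needle) !== false) {
            return false;
        }
    }

    return true;
}
```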


Questions? Comments? Contact James via email.

Future-stuff: Ubiquitous Internet

Thursday April 3, 2008 - 3 months ago

Posted by James Ellis / Filed under Web

I’m sure road-warrior PC folks are quite familiar with the latest mobile internet gadgets, but until recently, I was not.

I had the opportunity to spend a chunk of March on the road, eventually making my way down to Austin, TX for SxSW. Being a busy guy, it was necessary that I get work done while bumbling around.

I had heard tale of the little data cards that provide internet access over 3G cell networks, but I’d never seen one in action. After a bit of research I discovered that they now offer little PC/Mac-compatible USB sticks that get the job done, no card required. All of the big providers – AT&T, Verizon, Sprint, etc. – offer these things. I had read a nice thing or two about Sprint so I gave them a shot.

Here’s what the gadget looks like:
(I found some better photos here)

Setup was easy. The gadget actually comes with the OS X drivers built in (it has an internal flash drive), so the initial install went quickly. Once set up, getting online is easy: you plug in the stick, wait for it to power up, then click Connect using the standard OS X Internet Connect app.

They market these things as running at broadband speeds, but I figured this was likely a generous description. Thus, I was surprised to find that this thing was fast, especially in metropolitan areas. I was uploading/downloading big files, watching YouTube videos, downloading iTunes episodes of Lost, etc. It wasn’t cable/fiber fast, but fast nonetheless.

One thing I noticed: each time you made a web request after a period of inactivity, there would be a 1-2 second lag before data started moving. But once it got going, it was fast again.

Reception was great. Speed was good even in many remote areas. The only black-out spot that I experienced was in a remote stretch between Little Rock and Baton Rouge.

My only complaint: I didn’t like having a little USB stick dangling off the side of my computer. I was always worried about accidentally putting pressure on it and having it break. Sprint did a great job jamming a 3G cell phone into a little USB stick, but I couldn’t help but wish that it was already built into the machine, as WiFi is.

The Sprint deal

You pay $200 for the gadget and $60/month for unlimited access. Not a bad deal, but then they ruin it by requiring a 2-year contract, just as they do with cell phones. It’s not just Sprint – all of the major providers do this.

Fortunately, Sprint offers a 30-day trial period: sign up for the 2-year contract, use the service all you want for up to 30 days, and if for any reason you wish to discontinue the service, you can return the gadget for a full refund and cancel your contract, no questions asked, no penalty fee. You are only charged a prorated portion of the $60/month service fee.

Thus, I had Sprint mobile internet for the duration of my trip, and upon return, promptly cancelled the service. Total cost: $55.

Thoughts (or, what the MacBook Air is missing)

Having ubiquitous, wireless, broadband internet feels like the future. It just works. It makes bothering with (and often paying for) WiFi hotspots seem very lame.

Now that I know ubiquitous internet is possible, it’s turning into an expectation. John Gruber’s comment comes to mind:

After using my iPhone for a few months, it started feeling weird that my PowerBook doesn’t have ubiquitous wireless networking: Wi-Fi when available, and seamless, instant switchover to something else when it isn’t.

I’ve seen the MacBook Air up close. It’s rad. But not quite to-the-max, I’m afraid. I’m just not sweating it. My MacBook Pro gets the job done. I can deal with the two extra pounds.

I think a lot of the tech world feels the same way: the Air’s form is very impressive, but that’s where the innovation sorta putters out.

My feeling is that if you’re going to introduce a breakthrough ultra-portable, and call it Air, it should deliver on the wireless promise.

I would have liked to have seen Apple extend their partnership with AT&T to include built-in support for AT&T’s 3G data service. (Sony does this with their VAIO laptops. Most of them come with built-in support for Sprint’s high-speed service.) Given the Apple/AT&T iPhone relationship, Apple likely could have persuaded AT&T to offer an Air data plan without requiring a 2-year contract — perhaps a pay-as-you-go plan, or an optional add-on to your iPhone bill. Whatever. Just give me a fair plan. I’d pay for it.

Until Air incorporates some sort of mobile broadband, I’m not seduced. If I’m a mobile computer person, and I need an ultra-portable device, I want wireless. I don’t care about DVD/CD drives, a bunch of ports, or face-melting performance. I just want to be on the internet everywhere. And I definitely don’t want some USB modem-stick dangling off the side of my otherwise flawless machine.

Testing & debugging Flash apps in the browser

Friday March 28, 2008 - 3 months ago

Posted by James Ellis / Filed under Code, Software, Web

I’ve always found debugging Flash applications to be somewhat challenging. The Flash IDE offers a decent debugger complete with download simulation, bandwidth profiling, etc., but this is only available when you build and preview a SWF directly in the Flash IDE. If you want to do any debugging outside this environment, you’re going to need to look beyond the tools Adobe provides.

Given that web users generally consume Flash apps within the browser via the Flash Player plugin, it only makes sense to test in this environment.

Many Flash apps are totally dependent on dynamically generated XML (i.e. Flash apps tied to content management systems). In such scenarios – which the majority of our Flash projects fall into – you really need the ability to test & debug in the browser. By running Flash in the browser, we get to test on top of our full local development stack (Apache, MySQL and PHP on OS X).

Also, many Flash apps require some sort of javascript integration. This might just be having Flash content rendered using swfobject, or using SWF Address for deep-linking, or doing some more complicated communication between Flash and HTML page elements. Obviously, you’ll need a browser around to test this sort of gear.

There are a couple of tools we use to get under the hood while running Flash in a browser.

FlashTracer

This free Firefox extension created by Alessandro Crugnola allows you to view all output generated by Flash’s trace() function in a sidebar window.

An example screenshot:



FlashTracer does require the use of the Flash Debug Player, which you can download here:
http://www.adobe.com/support/flashplayer/downloads.html

FlashTracer works by displaying the Debug Player’s log. This log consists of trace() command output from any running Flash app. This means that FlashTracer will display output from Flash running in any browser, not just Firefox.

Strangely, on OS X the Debug Player seems to be overwritten by other applications from time to time. Perhaps application installers overwrite the Debug Player with the regular player — I’m not sure. For this reason, it’s a good idea to keep the Debug Player installer handy.

Also, keep in mind that the Debug Player does introduce some performance issues, especially when you’re tracing off fifty-eleven things per frame.

Charles Web Debugging Proxy

Charles does all sorts of things, but for me, it’s most useful as a bandwidth throttling tool. Charles acts as a man-in-the-middle and simulates various bandwidth situations by introducing network latency even when accessing content on local machines.

If you’ve ever developed a Flash application, you know that Flash behaves differently when (pre)loading content from a local machine vs. a real internet connection. It’s important to test how Flash behaves in real-world scenarios, including slower connections. Charles makes it easy to do so.

Of course, Charles does fifty-eleven other things as well. A more complete (and complicated-sounding) description from the Charles website:

Charles is an HTTP proxy / HTTP monitor / Reverse Proxy that enables a developer to view all of the HTTP traffic between their machine and the Internet. This includes requests, responses and the HTTP headers (which contain the cookies and caching information).

Charles simulates modem speeds by effectively throttling your bandwidth and introducing latency, so that you can experience an entire website as a modem user might (bandwidth simulator).

Charles is especially useful for Adobe Flash developers as you can view the contents of LoadVariables, LoadMovie and XML loads.

Charles is shareware. You can test it out for free. $50 for a single user license.

Other options

There are alternative approaches. Most hardcore AS folks will tell you to stay away from trace() altogether and use something like the LuminicBox.Log API (english translation). However, for quite a few developers, trace() gets the job done, especially since it’s possible to view its output within the browser.


Comments? Contact James via email.

Thoughts on bandwidth and Amazon S3

Tuesday February 19, 2008 - 5 months ago

Posted by James Ellis / Filed under Code, Software, Web

This past September we relaunched the Dangerbird Records website. Since then the site has experienced a massive increase in traffic, with the new videos section being particularly popular. In addition to regular web traffic, we’re noticing a significant number of users embedding Dangerbird videos on MySpace/blogs/etc. As a result, site bandwidth has been growing exponentially — a good problem, but a problem nonetheless.

Here’s an example video:

That’s a 27MB video file. If ten thousand people check it out, that’s roughly 270GB of bandwidth.

Here’s another example:

At 17 minutes in length, this video is a whopping 124MB. If 10k visitors consume it, we’re way up to 1.2TB of bandwidth. With high-performance bandwidth costing about $1/GB, you can see how costs can quickly get out of control for a popular website.
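The arithmetic behind those figures is simple enough to sketch. The snippet below is just a back-of-the-envelope calculator: transferGb is a made-up helper, it uses decimal units (1 GB = 1000 MB), and the $1/GB rate is the rough figure quoted above rather than anyone's actual price list.

```php
<?php
// Back-of-the-envelope transfer estimate, using decimal units (1 GB = 1000 MB).
function transferGb($fileMb, $views)
{
    return $fileMb * $views / 1000;
}

$shortVideoGb = transferGb(27, 10000);   // the 27MB video, ten thousand views
$longVideoGb  = transferGb(124, 10000);  // the 124MB video, ten thousand views
$ratePerGb    = 1.00;                    // assumed "about $1/GB" rate

printf("Short video: %d GB, about \$%d\n", $shortVideoGb, $shortVideoGb * $ratePerGb);
printf("Long video: %d GB, about \$%d\n", $longVideoGb, $longVideoGb * $ratePerGb);
```

That works out to 270GB for the first video and 1,240GB (about 1.2TB) for the second, so at roughly $1/GB a single popular video can run past a thousand dollars.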

Fortunately, we can move this high-demand content elsewhere. The services meeting this type of demand are generally referred to as content delivery networks.

Traditionally, CDNs were designed for performance and marketed to the enterprise crowd. For example, Apple uses Akamai to serve up images and videos for apple.com. Akamai hosts copies of Apple’s assets on high-performance servers all around the globe. When a user requests a file, they receive the file from whatever server is closest to their geographic location. Akamai helps Apple maintain global performance at web-scale demand, but it’s not exactly cheap.

There haven’t always been a lot of options in economy content delivery. (You might try to exploit Dreamhost’s ridiculous 5TB/month @ $6/month plan, but Dreamhost isn’t exactly performance hosting.) However, in the last few years we’ve seen a lot of activity in the economy content delivery space as web sites/hosts have struggled to keep pace with increasing demand for content such as videos/images. Surprisingly, what appears to be the best offering in this market comes from Amazon — best known as the world’s largest online retailer, not web infrastructure provider.

In March 2006, Amazon launched S3, or Simple Storage Service. S3 is a web service providing websites with unlimited storage and unlimited bandwidth. You simply pay for what you use, at minimal cost (Storage: $0.15/GB/month, Bandwidth/transfer: $0.10/GB).
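To put those rates in context, here is a rough estimate for the larger Dangerbird video. The s3MonthlyCost function is a hypothetical helper, the prices are simply the ones quoted above, and the sketch ignores any per-request fees, so read it as an approximation only.

```php
<?php
// Hypothetical estimate using the S3 prices quoted in this post.
function s3MonthlyCost($storedGb, $transferredGb)
{
    $storageRate  = 0.15; // dollars per GB stored per month
    $transferRate = 0.10; // dollars per GB transferred

    return $storedGb * $storageRate + $transferredGb * $transferRate;
}

// The 124MB video stored once (0.124 GB) and watched ten thousand times (1240 GB)
$cost = s3MonthlyCost(0.124, 1240);

printf("Roughly \$%.2f for the month\n", $cost);
```

Storage is a rounding error here; nearly all of the roughly $124 is transfer, against something like $1,240 for the same traffic at $1/GB.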

S3 is both an online storage service and an economy hosting/bandwidth provider. You could use S3 to back up your entire computer, or you could use S3 to deliver fifty-eleven-gazillion copies of a single animated GIF. Both tasks fall outside the scope of capability of a normal web host. Meaning, you can generally only host so much data, and any single server will eventually choke upon receiving too many concurrent requests.

Update: Our colleague Larry Ludwig of Empowering Media & HostCube – our primary hosting/IT provider – emailed to comment that Amazon isn’t currently set up as a proper content delivery network (See Wikipedia’s definition here), as S3 content is delivered from one of two locations – either out of Amazon’s D.C. data center, or from Europe – rather than being replicated across various nodes and serving users by geographic location. It’s important to make this distinction between CDNs and S3’s economy storage/bandwidth/hosting.

A few example use-cases for you:

  • An individual might use S3 to maintain a private, off-site backup of important documents, using minimal storage, and near-zero bandwidth, costing pennies per month.
  • Dangerbird Records can use S3 to deliver an infinite amount of video content at minimal cost.
  • A gigantic site like SmugMug (a photo site similar to Flickr) could use S3 to store a staggering amount of user image data (in fact, SmugMug does use S3, saving them roughly $1M/year, crazy!)

It’s all a very interesting application of the distributed, on-demand, grid/cloud-computing, redundant, failure-tolerant, scalable (and many other words) systems architecture that we’ve arrived at in the post-Google world. Amazon sorted out the fundamentals of S3 in developing their own infrastructure for amazon.com. Now in true software-culture form, they have opened up their otherwise proprietary infrastructure to the world at minimal cost.

Since moving all video content to S3, we’ve seen Dangerbird’s normal bandwidth stabilize. Also, requests and transfer of video content will no longer be tying up server resources; now the server can focus on rendering page requests. And Dangerbird will save money.

Getting started with S3

If you can use FTP software, you can handle S3. It’s quite simple. First, set up an account. Then connect to S3 using some sort of client. The latest version of Transmit now provides support for S3, and there’s even a Firefox plug-in interface. Also, you might want to look into JungleDisk if you’re interested in off-site backup.

Volumeone refresh.

Tuesday January 15, 2008 - 6 months ago

Posted by James Ellis / Filed under Design, Web

Matt Owens launches redesigned volumeone.com.

Various thoughts on Flash, past and present.

Friday January 4, 2008 - 6 months ago

Posted by James Ellis / Filed under Code, Software, Web

I’m not sure you can be a Flash designer/developer here in 2008 and not have mixed feelings about Flash websites. Understand: I have been working with Flash for ten years. I like Flash a lot. It does one million things. But I think everyone has realized that all-Flash, all-the-time, makes no sense. Specifically, I’m talking about “Flash-world” websites, or websites that load all site content into a single Flash shell.

Before web browsers were any good, before CSS worked well, before AJAX Javascript noodlery, before blogs and RSS hit, before Google ruled the earth, a lot of people thought Flash might be the answer to everything. During this time (1999-2005ish) the Flash-world approach dominated among designers and ad agencies. Unlike HTML, Flash offered designers exacting control over the visual experience. In particular, designers were lured by Flash’s ability to render fonts. Ad agencies were seduced simply by Flash’s ability to render the Photoshop “boner-boards” they sell to clients. (This is why ad agencies are still obsessed with Flash.)

However, even as early as 2000, Flash-world sites were receiving a lot of criticism, particularly regarding usability (See: Flash: 99% Bad). By giving designers total control, Flash-world sites broke nearly every rule of the web; each Flash-world site introduced new, idiosyncratic conventions for navigation, scrolling, etc. The mess of the whole slowed the emergence of many of today’s established web design conventions. Now, many realize that Flash is best used to solve specific problems, not provide complete site architecture.

Consider YouTube. The entire site is predicated on the delivery of video using Flash, yet the site itself is HTML/CSS. At this point, the silliness of rolling all of YouTube into one giant Flash-world should be obvious to all.

MTV.com is another example. In 2006 mtv.com went Flash-world. As a designer and developer, I found the site interesting and certainly an impressive technical achievement. However, the user experience was frustrating. MTV realized their error and within nine months ditched the Flash-world and returned to an HTML/CSS architecture. Flash is still used to deliver video content, but the site browsing experience is no longer hijacked by a Flash-world.

Flash Development

Beyond the issue of appropriateness, Flash applications (especially Flash-worlds) are challenging development projects. It’s not that Actionscript is complicated. (It’s pretty easy to pick up for those familiar with Java, Javascript or any OO language.) Flash development is challenging because it is ridiculously tedious. By providing designers and developers with a blank canvas and complete control, Flash development becomes an exercise in reinventing wheels. Want to go to another “page” in your Flash-world? You’re going to need to write Model, View and Controller classes. Want a form in Flash? You’ll need to instantiate a bunch of objects, add event listening, and tie the whole form to a bunch of logic for error checking, data handling, etc. Even the basic task of loading an image (a no-brainer with HTML in a browser) is a challenge in Flash: you’ll need to instantiate an image loading class, set up some logic, and bother with event handling. It’s all a lot of work.

Flash-world development is essentially the reinvention of the browser, inside a plugin, running in a browser. It’s silly. Browsers already do such a good job of managing state, rendering content and working with forms. Why start over?

To be profitable in the Flash-world business, you have to develop and maintain a library of frameworks and classes to deal with all the wheels you need to reinvent. Historically, Flash development had a closed, proprietary, arms-race quality to it, with studios maintaining proprietary arsenals of frameworks and classes. The closed nature of Flash development was due to a number of factors:

  • Unlike plain-text HTML and CSS, SWF files are compiled runtimes and there is no way to “View Source” or otherwise look under the hood of Flash apps. This keeps the code mysterious, prevents the development community (from beginner to pro) from examining and learning from the work of others, and generally excludes Flash from the web’s traditional culture of knowledge-sharing. Two notes: Flash decompilers do exist, but such tools have only been used by the extremely motivated, and again, one doesn’t need a decompiler to view HTML or CSS source. Second, View Source functionality is now possible in Flash, but this still remains a developer-elected option rather than a default.
  • Flash started as a designer’s tool. Designers, being a bit more guarded with their intellectual property, do not have the same culture of sharing that you find in the software world.
  • Actionscript did not become (in the eyes of developers) a real programming language until the introduction of Actionscript 2 with Flash 7 (2004). AS2, being styled directly from Java, was designed to appeal to Java developers. Having AS embraced by the software world injected a lot of software culture in Flash, and helped pull Flash out of its designer-centric origins.

There is now a growing open source Flash community, but you still don’t find anywhere near the scope of community that you do around big-timers like PHP or (relatively) new technologies like Ruby on Rails.

Flex

Adobe has tried to remedy Flash’s customization-over-convention problem with the introduction of Flex, a development framework for creating Flash apps, or what Adobe refers to as Rich Internet Applications (RIAs). Summary: it’s like HTML for Flash. There are certainly some very interesting things about Flex, but again, it all seems like a very complicated way of doing things that are already possible with HTML/CSS/Javascript in browsers.

Flex doesn’t seem to have much of a future, and it’s difficult to find anyone excited about it. These days the web is full of rich internet apps (e.g., Gmail, Basecamp, Flickr), but I couldn’t name a single app built using Flex. I went to Adobe’s Flex Showcase but didn’t find any particularly rich internet experiences, mostly the same confusing, idiosyncratic Flash interfaces quick to hijack the browser’s scrollbar and disable the scroll wheel (in OS X anyway).

More coming

Over the years we have certainly built our share of Flash-world sites, though we have been moving away from this type of work. Rather, we try to use Flash only when appropriate: video/audio players, widgets, games, slideshows, etc. But now, for the first time in nearly a year, we are tackling a proper Flash-world project: an experience-oriented site with a few particular demands that only a Flash-world can accommodate.

Flash-worlds have come a long way since our last contact. Developers have been working to bring Flash apps in line with standard web conventions, most notably deep-linking and browser back/forward-button support.

We are using Asual’s open-source library, SWFAddress, to provide our Flash-world with state management and deep-linking. SWFAddress plugs in nicely in front of our otherwise proprietary MVC architecture. It has been great to find a solid open-source, cross-browser solution that solves one of the most glaring Flash-world usability issues. We first discovered SWFAddress after visiting Burst Labs, a wholly appropriate Flash-world site designed by Gridplane and developed by David Knape. (Note: Knape has also released Bumpslide, an open-source library of useful AS2 classes.)

Given our thoughts on the Flash-world practice, this return to form has been an interesting challenge. In some ways frustrating, others rewarding. We are pleased to be using Flash for its intended purpose: constructing interactive applications combining both visual and technical challenges. Despite our criticism and reservations, the Flash-world architecture can still make for smart solutions given the appropriate context.

Look for more Flash-related thoughts over the next months as this current project progresses.

President's Cup Design Face-Off

Thursday December 13, 2007 - 7 months ago

Posted by James Ellis / Filed under Events, Web

Athletics’ Jason Gnewikow and Matt Owens to compete in tomorrow’s Layer Tennis challenge. Commentary will be provided by Joshua Allen.

For those unfamiliar, Layer Tennis is a series of live design events held on Friday afternoons. Two designers face-off, swapping a file back and forth, riffing off one another, sorta like dueling solos at a Guitar Center drum clinic, except on the internet. A new “volley” is posted every 15 minutes.

The folks at Coudal are behind it all.

Sleevage on Alternative to Love

Monday November 12, 2007 - 8 months ago

Posted by James Ellis / Filed under Design, Web

Recently discovered Sleevage, a blog on record packaging design. Sleevage is still relatively new, and it seems like they’re still working to find their voice. I find many of the reviews to be a little too easy — too brief, minimal commentary, or stories I already knew.

However, today’s interview with David Calderley of Graphic Therapy on the Brendan Benson – Alternative to Love packaging was great.

I enjoyed David’s words on how he eventually arrived at the final product. It’s always nice to learn the back story, see early concepts, and final art for the album and singles in one place.

For those unfamiliar with Brendan Benson, I recommend all three of his records, and that one he did with Jack White as well.

Link:
http://sleevage.com/brendan-benson-the-alternative-to-love/

Results from ALA's 2007 Web Design Survey

Wednesday October 17, 2007 - 9 months ago

Posted by James Ellis / Filed under Web

Yesterday A List Apart released the results from 2007’s Web Design Survey, conducted in the spring of 2007. 33,000 web professionals participated, myself included. ALA’s 81-page report is a fascinating read.

In the spring of 2007, I distinctly remember being excited by the prospect of the survey. I was surprised to discover that this marked the first meaningful public research of the profession.

In answering each of the 37 questions, I was amazed that we (the web industry/community/whatever) didn’t know this stuff already — Who’s doing the work? For what sort of companies? Where do people live? Age, sex, ethnicity? How much money do people make? Education? Do people enjoy the work?

That’s just the tip of the iceberg. The questions dig pretty deep: prevalence of blogging/personal publishing, tracking career history/future, perceptions of bias, interest/methods of continued education.

The results are extensive. Tons of graphs and charts. ALA did well to commission statisticians Alan Brickman and Larry Yu to make sense of it all. The report is a beautifully crafted document. ALA intends to conduct the survey annually.

Thoughts

It’s interesting to consider the number of participants. The 33,000 figure can seem both big and small. I’m impressed that ALA managed to get that many people to carve out the time necessary to complete the survey. It’s more than you can fit into Madison Square Garden (19,763), but 33k isn’t blowing my mind. The world is sort of big, with a lot of humans. Microsoft alone employs 78,000 people. Over a million people live in Rhode Island. The human head weighs eight pounds.

I realize the industry is bigger than 33k, and that the survey only represents the dedicated core — those down for the cause, blog-readers and general-purpose nerds — but, it reveals that this whole dot com thing is propelled forward by a relatively small group of people.

RSS for the unwashed masses

Monday October 15, 2007 - 9 months ago

Posted by James Ellis / Filed under Web

Many websites offer RSS feeds. If you have no idea why, this article is for you.

RSS stands for Really Simple Syndication. Any given feed is just a summary of content from an associated web site. Websites offer feeds for all sorts of content — blog posts, podcasts, news articles, or any sort of timely content.

An RSS feed is content wrapped in a generic data format called XML. Humans have no business reading this stuff. It’s for computers.

All that matters is that you can plug feeds into a feed reader. A feed reader like Google Reader allows you to keep track of hundreds of feeds in one place. Rather than discover new content by navigating to websites individually, a feed reader can instantly inform you of new content across hundreds of sources. For information junkies, feed readers offer increased consumption across countless content sources, at near real-time speed.
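For the curious, a feed is less mysterious than it sounds. The sketch below uses a made-up two-item feed (the titles and example.com URLs are placeholders, not a real site) and shows roughly what a feed reader does with one: parse the XML and pull out each item's title and link. It uses Python's standard library and is only an illustration, not a description of how Google Reader works internally.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up RSS 2.0 feed. Titles and URLs are placeholders.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>http://www.example.com/</link>
    <item>
      <title>Second post</title>
      <link>http://www.example.com/second-post</link>
    </item>
    <item>
      <title>First post</title>
      <link>http://www.example.com/first-post</link>
    </item>
  </channel>
</rss>"""


def list_items(feed_xml):
    """Return (title, link) pairs for every item in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append((item.findtext("title"), item.findtext("link")))
    return items


for title, link in list_items(SAMPLE_FEED):
    print(title, "->", link)
```

A real reader does the same thing on a schedule for every subscription, remembers which links it has already seen, and shows you only the new ones.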

Process

I wake up in the morning and open Google Reader, where I plow through a river of news. 154 subscriptions and growing, I’m following major media, design, politics, and all varieties of nerdstuff. I repeat at various points during the day. I can’t say the same for newspapers, magazines, television, etc.

Here’s a snapshot from today:

River of news

I have a bunch of feeds in one big pool called “favorites.” Here I step through a big pile of latest posts from the various feeds I designate as favorites. With this method of browsing, I’m able to move through an amazing amount of content very quickly.

New feeds

As I browse the greater web, I find other sites of interest and plug them into the Google Reader super-brain.

Firefox makes this easy with the little subscription icon:

Upon clicking “Subscribe to this page…”, Firefox will add the feed to whatever feed reader you like (Google Reader, Bloglines, My Yahoo, NewsGator, etc.).

Or, you can paste the feed directly into Google Reader:

The comprehensive sell

If you’re not excited by new forms of information consumption/aggregation, you might be reeled in by content. Here are some favorites.

Designerly

(Note: Interestingly Newstoday.com doesn’t offer a feed.)

Nerd-stuff

Other stuff

Other interesting cases

  • You can subscribe to any Flickr user’s public feed (Here’s ours).
  • Or subscribe to any del.icio.us user’s bookmark feed.
  • You can find fifty-eleven different feeds for every major media outlet (BBC, CNN, NY Times, AJC, etc.)
  • You can keep up with the latest on social news sites like Digg and Reddit.
  • Most web services (hosting providers, things like that) offer customer support feeds keeping customers informed of service issues.
  • Most message boards offer feeds.
  • Weather.com offers feeds for national and local weather.
  • If you run a website, the web analytics software Mint offers a feed for your website’s latest referrers.

Higher education doesn’t guarantee awareness

This new form of information consumption remains a mystery to many — even those with one or more master’s degrees. The insertion of a content aggregator (Google Reader) between users (you, me) and content creators (websites, blogs, etc.) is a level of abstraction that requires some explanation. Perhaps this article will help connect the dots for those interested in increasing their daily information throughput.


Questions? Comments? Contact James via email - .

Color management for web designers and developers

Sunday September 30, 2007 - 9 months ago

Posted by James Ellis / Filed under Design, Web

If you’re reading this, you likely arrived via a link posted somewhere out there in this great big internet. This article is no longer available because it needs some revision. You could say it expired. Adobe has changed the behavior of Photoshop a bit since Adobe CS1 and some of the issues that inspired this piece in the first place are no longer applicable.

On using dummy domains

Friday September 28, 2007 - 9 months ago

Posted by James Ellis / Filed under Web

Moving a website from one web hosting provider to another can be frustrating. Domains take a long time to propagate. How do you keep email from getting lost? How do you properly test a site before moving? A dummy domain can help.

Let’s say you have a new client. They have an existing website that your firm is going to redesign. Not only will you be redesigning the site, but you’ll be starting over with a new CMS and fancy front-end framework. Chances are that the client is currently hosting their site with some lame provider (e.g., out-of-date software, no Subversion support, no server-side spam software). You need to move the site to a provider you trust, and come launch time, the transition needs to be as smooth as possible.

After a bit of trial and error, we’ve come up with a system of using dummy domains to test projects in development, and ease the transition during domain name propagation.

The basics, step by step

  1. Go buy a dummy domain. We use a variety of dummy domains — athletics-transfer.com is a good example. You’ll notice that this was the dummy domain we used for the recently launched Dangerbird Records website. In a few weeks we’ll disable this and use it for some other project.
  2. Set up the new hosting environment. There are many great hosting providers out there. We like Empowering Media, but you can use whoever you like.
  3. Point your dummy domain to the new host. Set the dummy domain’s nameservers to point to your new hosting provider.
  4. Develop and test the new site. Develop and test the entire project on the new server using the dummy domain. To keep folks from peeping the unreleased site, we always lock things down behind a password, which we provide to the client.
  5. Set up email addresses. The client is already using a number of email addresses. You’ll need to set up these addresses on the new server. It doesn’t matter that the addresses are behind the dummy domain. So, for address@clients-domain.com, you’ll create address@dummy-domain.com.

Launch time

Once the site has been fully developed and tested, and the client is ready to go live, there are a few steps to ensure a smooth transition.

  1. Add a domain alias. In the new hosting environment, add the client’s live domain as an alias of the dummy domain — meaning, visiting either domain in your web browser should resolve to the same site.
    (Of course, nothing will happen until you point the live domain’s DNS to the new hosting provider, but we still have a few things to do before we get to that…)
  2. Wait two hours. Give the hosting provider a bit of time to add the client’s live domain to their internal nameservers.
  3. Forward all email. On the client’s old server (that you’re about to abandon), set up forwards on each email account forwarding all email to the corresponding address at the dummy domain. So, example@clients-domain.com should forward all email to example@dummy-domain.com.
  4. Test email forwards. Send a few tests to make sure that email is being forwarded correctly.
  5. Change DNS. Now you can finally change the client’s domain’s nameservers to point to the new hosting provider. A domain’s nameserver information gets cached and stored all over the internet — thousands of servers cache this information to keep the internet moving fast. It can take up to 72 hours for all of these servers to switch over and begin using the new nameservers.

So what just happened?

By forwarding email on the old server to corresponding addresses at the dummy domain, we made sure that email didn’t get lost during the domain name propagation period (again, this can last up to 72 hours). Any email that made its way to the old server was forwarded along to the new one.
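If the client has more than a handful of addresses, it helps to generate the list of forwards rather than type them out. Here is a minimal sketch of that mapping. The function name and the "info" address are made up for illustration; the domains are the placeholder ones used above. It only builds the list; you would still enter the forwards in the old host's control panel.

```python
def forwarding_rules(addresses, live_domain, dummy_domain):
    """Map each live-domain address to its twin at the dummy domain."""
    rules = {}
    for address in addresses:
        local_part, _, domain = address.partition("@")
        if domain.lower() != live_domain.lower():
            # Refuse anything that is not on the domain being moved.
            raise ValueError(f"{address} is not at {live_domain}")
        rules[address] = f"{local_part}@{dummy_domain}"
    return rules


rules = forwarding_rules(
    ["example@clients-domain.com", "info@clients-domain.com"],
    "clients-domain.com",
    "dummy-domain.com",
)
for source, target in rules.items():
    print(source, "->", target)
```

The same list doubles as a checklist for the "Test email forwards" step: send one message to each address on the left and confirm it arrives at the address on the right.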

Post-launch

Once the client’s domain has had time to propagate (7 days is pretty safe), you can remove the dummy domain.

Was all the fuss about email?

Pretty much. For most of our clients, losing email is a deal breaker. However, with an $8 dummy domain and a bit of jiggery-pokery, we can ensure that nothing gets lost.

Also, we like the professional touch. To clients, dummy domains feel more real than something like http://dev.athleticsnyc.com/client-name/. We explain to the client that the dummy domain is running directly on their new server. The client gets a sense of comfort knowing the dummy domain is their website, just behind a different name.

Why don’t you just set up an MX record to send email somewhere else?

True, that’s another solution. We love that Google Apps now offers the ability to pipe your domain’s email over to Google’s servers and use Gmail to manage your email at @your-domain.com instead of @gmail.com. It’s easy to set up, and free.

A major benefit of using an MX record to bounce all email along to a hosted service is that you don’t have to worry about losing email when switching web hosts. You’re cool as long as both the old and new hosting providers’ DNS include an MX record routing all email to the third party email service.

There are a number of professional solutions for managing email in this way. Just google “hosted email” and you’ll find a bunch of them. However, I find most of these services to be geared for the corporate crowd. Further, a third-party pro email service just isn’t necessary for most of our clients. Running email and web services on the same server works fine for most.

Worth the $8

I try to enjoy peace of mind whenever possible. The dummy-domain-email-bouncery trick helps with that.



Dangerbird Records Launches

Thursday September 27, 2007 - 9 months ago

Posted by James Ellis / Filed under Code, Design, Web

Yesterday we launched the wholly redesigned and developed website for Dangerbird Records, an independent record label located in Hollywood, CA and home to artists Silversun Pickups, Dappled Cities, La Rocca, among others.

One of the more info-heavy sites we’ve done, the Dangerbird site is packed with content and loaded with features. The site balances out all sorts of content: artists, releases, news, tours, downloads, store, videos, etc. Considering that each of these includes various levels of sub-content, the overall scale of the site made for an IA challenge.

Information Architecture

As we’re respectable web/info designers, we went through a full IA process. We tend to do this for any large-scale site, especially the info-heavy ones (jumping straight into Photoshop would just be asking for a world of pain).

We’re quite fond of the Flash-based prototyping tool we’ve developed specifically for the purpose of wireframing representative site states, and stitching the wireframes together in a click-able environment. For those of you familiar with making wireframes in Illustrator and packaging them up in PDFs, you’re sure to be sweating our little app.

In the past, we’ve worked on some globo-chem level corporate projects that required extensive wireframing. Trying to manage these projects with Illustrator & PDFs is too much. We’ve looked into pro software products like Axure’s RP Pro, but this sort of thing is just too far down the corporate-web-ditch-digging rabbit hole. And after taking a closer look at these prototyping tools, we realized that we could just lift a few of the basic concepts/features and roll our own app in Flash.

Basics of site prototyping

  • Wireframe each representative site state, focusing on organization of content, visual hierarchy, etc.
  • Link the states together in a straight-forward way, allowing the client to click from state-to-state and really get a sense of how the different pages work together.
  • Upon completion of the prototyping phase of the project, the visual designer should have a clear vision of the site’s structure, content and functionality, all before cracking Photoshop.

What does our prototyping app do?

By using Flash, we are able to design each wireframe in an environment similar to Illustrator. However, unlike Illustrator, we’re able to create reusable objects (Movieclips) for all sorts of repeated elements (site header, footer, sidebars, etc.). And the real trick is that we can use Actionscript to stitch the wireframes together in an intelligent way, allowing the user to click through to the wireframes.

With the Dangerbird site, it was important to prototype the Artists, Artist Detail, Releases and Release Detail states. By having the prototype link these states together, the user gets a feel for how the different states relate to one another. It’s just not the same if you’re stepping through a PDF.

Demo

Want to see what it looks like? Check out one of the later prototypes for the Dangerbird site. Various details were tweaked along the way, but you’ll find that the prototype seriously informed the final product.

In the end, all relevant parties (client, designer, developer) have a clear picture of what’s going to be built, how it will work, how it will feel, etc.

Framework & Content Management

The real nerd-thunder of the Dangerbird site is the underlying web framework. Like other Athletics web projects (DKNY Jeans, Oak, and the Athletics site itself), the Dangerbird site is powered by the Studio IV Adminkit CMS and front-end framework.

The Adminkit CMS software manages all aspects of site content (artists, releases, news, tours, downloads, store, videos, jukebox — everything). Adminkit is something we’ve been actively developing for more than three years now.

The front-end framework doesn’t have a catchy name, but perhaps it should. Very much informed by things like Ruby on Rails, CakePHP and CodeIgniter, our framework follows the MVC design pattern and defines common conventions for integrating HTML templates into a system powered by the Adminkit CMS. The framework is mostly geared towards increasing developer efficiency and creating reusable chunks of code. Also, the framework provides some nice features, including human-readable URLs and output/page caching.

Human-readable URLs

Human-readable URLs are important to us. Rather than crazy-looking query-string style URLs with a bunch of random-looking garbage thrown in, we like simple URLs that plainly describe the content they represent. For example:

Query-string style URL:
http://www.example-domain.com/news_article.php?cat=235&article_id=235266

Human-readable URL:
http://www.example-domain.com/news/title-of-article

Considering how extremely possible human-readable URLs are, we see no excuse for the former.
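To show how little is involved, here is a rough sketch of turning an article title into the last segment of a human-readable URL. This is not our framework's code; the function names are invented for illustration, and the handling is deliberately simple (anything that isn't a plain letter or digit becomes a hyphen, so accented characters are dropped).

```python
import re


def slugify(title):
    """Turn an article title into a lowercase, hyphen-separated URL slug."""
    slug = title.lower()
    # Collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", slug)
    return slug.strip("-")


def article_url(section, title, base="http://www.example-domain.com"):
    """Build a human-readable URL from a section name and an article title."""
    return f"{base}/{section}/{slugify(title)}"


print(article_url("news", "Title of Article"))
# prints http://www.example-domain.com/news/title-of-article
```

The other half of the job is mapping the slug back to a database row when a request comes in, which usually means storing the slug alongside the article and making sure it is unique within its section.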

Output/page caching

Caching is important if you intend to power a popular database-driven website. In the era of the Digg/Slashdot effect, your website needs to be prepared to deliver content to a lot of visitors in an extremely efficient manner.

Without having the page-caching system built into the framework, a sudden surge in site visits would overload the server’s database with requests for content. Fortunately, a good page caching system can eliminate this database bottleneck. The page caching system stores static copies of pages generated from database content, and upon future requests for the same content, the system delivers the static version instead of making the full roundtrip to the database. Clever indeed. Inspired by WP-Cache and the CodeIgniter implementation.

Web toys

The Dangerbird site was an opportunity to throw all sorts of different gadgets into the mix.

  • Jukebox. Over the years, we’ve done a ton of players, most of them being minor updates to code dating back to 2002 (that’s a lot of mileage out of the now-extinct RemyZero site). The Dangerbird player is a bit different. First, it’s powered by Adminkit, so Dangerbird can upload MP3s and compile playlists whenever they like. Second, we built two different embed-able options for the player — regular and the mini. Hopefully the MySpace/Facebook crowd will embed them like mad.
  • Home Promo Slideshow. These things are pretty common at this point, but I think it’s important to point out the home page’s promo slideshow. In the Adminkit CMS, Dangerbird can upload and assign any number of JPG or SWF files as slides within slideshows. They can toggle various slideshows, re-order slides, etc. It’s a pretty cool way for Dangerbird to immediately inform visitors of whatever the new-big-thing is. I’ve always liked the way Macromedia did slideshows on their site, and now Adobe continues the practice.
  • Search. This is cool… results are segmented by section, so you can get a sampling of results from every section (artists, releases, news, videos, etc.). Here’s an example if you searched for “silversun”.
  • Mailing List. The Dangerbird mailing list is managed by Campaign Monitor, our favorite email newsletter software. Campaign Monitor offers an API for integrating CM’s services directly within your site — rad.
  • Buzznet Integration. We’ve become familiar with Buzznet recently — we did a bit of profile integration on the Cartel site. With Dangerbird, we did something similar, but instead of using one of the packaged Flash/Javascript widgets provided by Buzznet, we rolled our own. Two problems with Buzznet’s packaged widgets: 1) they are generic looking, and 2) they are often slow. To get around this we parsed Dangerbird’s photo RSS feed and rendered the content to work within the overall site design direction.

    One thing: the Buzznet site is slow. Anytime the server tries to fetch the RSS feed, it takes forever. Rather than make every Dangerbird visitor suffer, we maintain our own server-side cache of the RSS feed and serve that up instead. Once every three hours someone has to wait for the Buzznet feed to be re-cached. You can check out the Buzznet call-out in the sidebar on the site home. (Thanks to the Zend Framework for the feed parsing and caching.)
  • Videos. The player is a modified version of the player used here on the Athletics site, but the embed player is new. An example:

Parting thoughts…

We like the folks over at Dangerbird and have really enjoyed working with them over the past few months. It’s been a pleasure to work with a client that is already informed about the web — they were right there with us on RSS feeds, embed-able players, blogs, and other web two-point-new terms.

And on a personal note: given the Pumpkins’ inability to come back, Silversun’s gonna have to really step it up and carry the torch.



On using Subversion for web projects

Friday September 21, 2007 - 9 months ago

Posted by James Ellis / Filed under Code, Web

Subversion, the open-source version control software, has changed our web development process.

At one point in time, I thought version control software was the stuff of super-nerds. I had imagined complex software running on complex servers doing something fancy to manage programming projects. I didn’t recognize it as something that applied to me. So, for the most part, I ignored it.

Then, as version control started to creep into the web development community, I began to take notice. I would see mention of it in blogs, in books, etc. The more I learned, the more I realized it was a cool idea with a lot of benefits.

The big idea:

  • Store (and safe-keep) your project in a repository on a remote server. Never worry about making local backups. Each time you commit changes to the repository, you are making a remote backup.
  • Allow multiple users to collaborate on the same code base at the same time. Collaboration from any number of users, from any machine, at any time.
  • Keep track of all changes made to a project over time. Subversion allows you to jump back in time and access any and all previous versions of a project. No more duplicating files as backups just in case you break something.

    (Sidenote: This aspect of version control seems very similar to Leopard’s Time Machine feature. I’m not sure how Apple implemented this, but perhaps it’s similar to how Subversion works.)

Eventually, I ended up working on a project with a colleague, and he already had the project in Subversion. He suggested that I get up to speed with Subversion so that we would remain in sync with one another. I gave it a shot, and by the end of the project, I had the hang of it. More importantly, I realized version control should be an integral part of the web development process. (thanks Tim!)

The basics of Subversion

Subversion stores your project in a repository. The repository usually lives on a remote server. As you make changes to your project, the repository remembers everything you do.

Once you have your repository in place, you check out a copy of the project to your local machine. As you make changes to your local working copy, you commit your changes to the repository. Also, you can update your local copy to pull down the latest changes made by other users. That’s all there is to it.

Where it gets interesting…

The lone web developer may ask, “Why should I bother? I’m a one-man team, a lone wolf. I can manage my own backups. I don’t collaborate with anyone so I don’t need this stuff.”

To me, one of the most important benefits of using Subversion for web development projects is that Subversion eliminates one of the classic web development tasks: using an FTP client to push files to your production server. With Subversion, your production server can run a working copy of your repository just like your local machine does. So, like your local machine, to update the copy of the project on the live server, you run an update, pulling down the latest changes from your Subversion repository.

Let’s say you’re working on a project and you need to push a large number of changes live to a busy website. Perhaps 30+ files have changed — lots of new code, you’ve added some new images, maybe some videos, fixed a few bugs, etc. — and you want to push these live.

If you’re relying on an FTP client:

  • You have to be very careful to make sure you upload all the latest files.
  • You have to wait for them to upload.
  • It can be very tricky to make all the changes happen at once.
  • If something goes wrong, it can be difficult to revert back to a previous version.

If you’re using Subversion:

  • You run one command, svn update, that pulls down all of the latest changes at once. Subversion knows exactly which files have changed, which files are new, which files need to be deleted, etc. If your repository is hosted on the same server, the update runs in a second or two. If you’re connecting to a repository on an external server, the transfer rate should still be very fast with most updates taking place in a matter of seconds.
  • If something goes wrong, you can have Subversion revert to the previous state where everything worked.
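One way to do that rollback is to update the live working copy to an earlier revision with svn update -r. The sketch below assumes the Subversion tools are installed and uses a throwaway local repository; the dev and live directories are hypothetical stand-ins for your machine and your production server.

```shell
#!/bin/sh
# Sketch: push a bad change live, then roll the live copy back one revision.
set -eu
command -v svnadmin >/dev/null 2>&1 || { echo "Subversion not installed; skipping demo"; exit 0; }

DEMO="$(mktemp -d)"
svnadmin create "$DEMO/repo"
REPO_URL="file://$DEMO/repo"

# Development copy: commit a good version (r1), then a bad one (r2)
svn checkout -q "$REPO_URL" "$DEMO/dev"
echo "works" > "$DEMO/dev/index.html"
svn add -q "$DEMO/dev/index.html"
svn commit -q -m "Working version" "$DEMO/dev"
echo "broken" > "$DEMO/dev/index.html"
svn commit -q -m "Buggy change" "$DEMO/dev"

# "Live" copy picks up the latest revision, bug included
svn checkout -q "$REPO_URL" "$DEMO/live"
cat "$DEMO/live/index.html"

# Something went wrong: take the live copy back to revision 1
svn update -q -r 1 "$DEMO/live"
cat "$DEMO/live/index.html"
```

Keep in mind that this only pins the live copy to the older revision. The buggy change is still the latest revision in the repository, so the next plain svn update will bring it back unless you commit a fix (or a reverse merge) first.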

Still not sold? Consider this…

I do most of my work at the Athletics office, but there are times when I need to take projects home, on the road, or send projects to other developers. It was always a pain trying to sync my desktop and laptop — I’d use little jump drives, post zip archives to our ftp, or remember to bring my laptop to the office. With Subversion I don’t bother with any of that. To sync any computer, I just run an update and I’m done. To provide another developer with access, I just have them check out a working copy from the repository.

Update:

Henry Todd pointed out to me that running your production server as a working copy isn’t the smartest way to deploy critical web apps. Henry offers up a more solid solution.

The deployment process:

  1. Configure Apache to point the server’s document root to a symlink. The symlink will then point to whatever directory is currently being used as the live directory.
  2. Instead of running an update on the production server (where the live webroot is also a Subversion working copy), the site is generated by running an export of the project to a directory parallel to the current live directory.
  3. Then, to make the switch, you simply change the symlink to point to the new directory.
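Here is a rough sketch of the symlink switch in those steps. It is not Henry’s actual setup: the releases/ and current names are my own assumptions, and the svn export step is stubbed out with a plain mkdir so the script runs anywhere. Apache’s document root would point at the current symlink.

```shell
#!/bin/sh
# Sketch: build each release in its own directory, then repoint a symlink.
set -eu

SITE="$(mktemp -d)"                 # stands in for the site's parent directory
mkdir -p "$SITE/releases"

# deploy NAME CONTENT: create a release next to the others, then switch to it
deploy() {
    release="$SITE/releases/$1"
    # On a real server this step would be an export, for example:
    #   svn export svn+ssh://user@yourhost.com/path/to/repository_name "$release"
    mkdir -p "$release"
    echo "$2" > "$release/index.html"
    # -n keeps ln from following the existing symlink into the old release
    ln -sfn "$release" "$SITE/current"
}

deploy r101 "version 101"
deploy r102 "version 102"
cat "$SITE/current/index.html"

# Rolling back is just pointing the symlink at the previous release again
ln -sfn "$SITE/releases/r101" "$SITE/current"
cat "$SITE/current/index.html"
```

Because old releases stay on disk, you will want to prune them now and then.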

Benefits:

  • While the export process may seem a bit cumbersome, this method allows you to push changes to your production server all at once. Changing a symlink is like snapping your fingers, while running a big update can potentially take a while.
  • If something goes wrong, reverting back to a previous version is much easier, and again, much faster — you just change the symlink to point back to the old live directory.

For most of our sites, running the production environment as a working copy is totally fine. But if it’s a critical website, this method is the way to go. Thanks Henry!

Getting started with Subversion

If you’re serious about getting started, you should read the Subversion book. It’s extensive and everything you need to know is in there. And it’s free. If you prefer more hand-holding, I recommend the book from the Pragmatic Programmers, Pragmatic Version Control. I enjoyed it.

Downloading the Subversion client

The Subversion website offers packages for just about every system on the downloads page. The OS X package is easy to use.

Setting up a repository

You’ll need to find a hosting provider that supports Subversion. Most of the good ones do. All of our projects run on Empowering Media’s managed VPSs, but we’ve used Media Temple and Joyent/TextDrive in the past.

To create a repository, you’ll need to SSH into your server (using Terminal on OS X), go to the directory where you would like to store the repository, and run:

svnadmin create REPOSITORY_NAME

This will create an empty repository. You don’t put anything into the repository until you start making commits. Keep reading…

(Read more on svnadmin create, or check out Media Temple’s KB doc on creating repositories.)

Initial checkout

Once you have your repository in place, you’ll want to check out a working copy to your local machine. You will need to have the Subversion client installed on your machine to do so.

Using Terminal, navigate to the folder where you would like to store your working copy. You will use the svn checkout command to check out a working copy to your local machine. The way in which you connect to your repository depends on how your host has Subversion configured. Many hosts require the svn+ssh method. It looks something like this:

svn checkout svn+ssh://user@yourhost.com/path/to/repository_name

Running this command will connect to your repository and “check out” the latest version of the project. Subversion will create all of its behind-the-scenes support files (see the bit about .svn folders below) and essentially “activate” your local working copy.

Working with your local copy

Once you have your local working copy in place, you can begin to add/modify files and folders.

First, you need to keep in mind that Subversion requires that you explicitly add files and folders to the repository — see svn add. Once files/folders have been added, Subversion will keep track of all changes made.

Working with your local copy will require discipline on your part. You will need to inform Subversion of certain changes that you make. For example, if you need to delete a folder from your working copy, you will need to run the svn delete command. This will instruct Subversion to delete the directory with the next commit you make. If you forget to do this and just delete the folder yourself, Subversion will get confused. It’s a similar situation for instances where you want to rename, copy or move folders. While this file system hand-holding can seem cumbersome at first, it eventually becomes second nature. In fact, I find that the added effort helps keep me more deliberate when making decisions regarding file/folder structures.
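As a quick illustration of that bookkeeping, this sketch renames one folder and deletes another through Subversion rather than through the Finder. It assumes the Subversion tools are installed and works against a throwaway local repository; the folder names (css, styles, old_images) are invented for the example.

```shell
#!/bin/sh
# Sketch: let Subversion handle a folder rename and a folder delete.
set -eu
command -v svnadmin >/dev/null 2>&1 || { echo "Subversion not installed; skipping demo"; exit 0; }

DEMO="$(mktemp -d)"
svnadmin create "$DEMO/repo"
WC="$DEMO/site"
svn checkout -q "file://$DEMO/repo" "$WC"

# Put two folders under version control
mkdir "$WC/css" "$WC/old_images"
echo "body {}" > "$WC/css/site.css"
echo "placeholder" > "$WC/old_images/banner.txt"
svn add -q "$WC/css" "$WC/old_images"
svn commit -q -m "Initial import" "$WC"
svn update -q "$WC"                 # bring the whole working copy to one revision

# Rename and delete via svn, so the next commit records both changes
svn move -q "$WC/css" "$WC/styles"
svn delete -q "$WC/old_images"
svn status "$WC"                    # review what is scheduled before committing
svn commit -q -m "Rename css to styles, remove old_images" "$WC"
```

Running svn status before the commit is a handy habit: it lists exactly what Subversion thinks you changed.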

Making Commits

After you have added or modified files in your local copy, you will want to make commits. Running a commit will instruct Subversion to send your changes and new files to the repository. Once the repository receives the commit, a new version of the project is recorded.

Subversion features atomic commits. In addition to sounding cool, atomic commits are quite important. When receiving a commit, your repository will not record the commit as a new version until it receives the entire thing. Thus, if you are making a very large commit, and in the middle of uploading all the files your internet goes down, your repository will disregard the entire commit. This keeps your repository from getting out of sync.

To make a commit, you will run the svn commit command.

Tools

SvnX

While you can get by with the command line (Terminal) alone, it can be helpful to have a GUI around. On OS X, my favorite is SvnX, an open source Subversion client. Here’s a screenshot:

I tend to use a combination of both SvnX and the command line. I like having SvnX available to help add and remove files/directories, and I occasionally use it for commits and updates, but for the most part, I prefer the command line.

And though I’ve never tried it, I understand that TortoiseSVN is an amazing Windows client.

TextMate

TextMate is my editor of choice. Also, it includes some very handy built-in Subversion support. I often run commits, adds, etc. within TextMate. It’s not as robust as SvnX, but it can handle most Subversion tasks. Here’s a shot of TextMate’s Subversion context menu:

SSHKeychain

Another important app on OS X is SSHKeychain.

If you’re SSH-ing into your server often, and especially if you are connecting to your repository over svn+ssh://, you are going to want to establish SSH key pairs instead of typing in your password a million times. SSHKeychain is an open source application that integrates your SSH keys with OS X’s keychain.

First, you’ll need to set up SSH key pairs. Check out this article from the TextMate site that goes through the process.

Important Notes

What’s with the .svn directories?

From the Subversion book:

Every directory in a working copy contains an administrative area, a subdirectory named .svn. Usually, directory listing commands won’t show this subdirectory, but it is nevertheless an important directory. Whatever you do, don’t delete or change anything in the administrative area! Subversion depends on it to manage your working copy.

You won’t find these .svn directories in OS X’s Finder, but they are there behind the scenes keeping track of your working copy. While they are out of sight, you should keep them in mind…

Let’s say you have two different projects going on, both in Subversion, in two separate local copies. Imagine that you want to copy a directory from project A into project B. If you duplicate a directory in project A and place it in project B, the hidden .svn directories will cause you problems. Subversion will see these .svn directories from project A and get confused.

The solution is to remove the .svn directories (often referred to as “taking the files/folders out of version control”) before you copy to project B, then run svn add to instruct Subversion to add the new directory.

As you may imagine, there are many instances where you need to take files out of version control. In these instances, you need a way to get rid of these .svn directories. Here are two solutions:

  • Run an export. Subversion’s export command is designed specifically to export files from either a working copy or a repository. The exported files will not be under version control (they won’t have the .svn directories).
  • Use a script to remove all .svn directories. It’s not as elegant, but this command can be super handy if you know what you’re doing. To strip the .svn directories from anything, open up a Terminal window, navigate to the folder in question, and run:

    find . -type d -name .svn -print0 | xargs -0 -t rm -Rf

    (Friends, keep in mind that any recursive rm command should be used with caution)
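If you want to be extra careful with the script approach, one option is to do a dry run first: list the .svn directories that would be removed, look the list over, and only then delete them. The sketch below tries this on a scratch folder, with empty .svn directories created by hand as stand-ins for the real thing (the file names are invented).

```shell
#!/bin/sh
# Sketch: preview the .svn directories, then strip them, on a scratch folder.
set -eu

COPY="$(mktemp -d)"                 # stands in for the folder you copied
mkdir -p "$COPY/.svn" "$COPY/css/.svn" "$COPY/images/.svn"
touch "$COPY/css/site.css" "$COPY/images/logo.png"

cd "$COPY"

# Dry run: only print what would be removed
find . -type d -name .svn -print

# Happy with the list? Strip them (same command as above)
find . -type d -name .svn -print0 | xargs -0 -t rm -Rf

# Confirm nothing named .svn is left; this should print nothing
find . -name .svn
```

The regular files are left alone; only directories named .svn are touched.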

Update:

Huge thanks to David Buxton for emailing me about the rm command I had originally posted. David spotted a flaw and hooked me up with the command above.

Finally, if you’re just copying files within the same working copy, you don’t need to take the files out of version control. Just use the