Serving static content from Cloud Files using Ruby

I recently moved my site skylinesaustralia.com from hosting in the States back to hosting in Australia. While it is reasonably affordable, one thing I can't afford is a burst in bandwidth. I average 600GB outbound per month and my server only comes with a 600GB allowance, so I decided to serve my gallery and post attachment images from Rackspace's Cloud Files service. Cloud Files is much like Amazon's S3 service, only faster in my experience and for a similar amount of money.

My issue with this was that my gallery on SAU is 100GB in size. Mounting the Cloud Files 'container' locally using FUSE or something similar is far too slow at that size, and without mounting it I cannot use rsync. Even if I did manage to mount it, I had no way of doing an 'immediate' sync when files were uploaded.

After some thinking and chatting with a very cluey sys admin at work, I looked at inotify. Basically, inotify (and inotify-tools) alerts you to changes to files and directories, which makes it possible to write scripts that react to changes in the file system. Awesome!

Rackspace provides a bunch of very nice API libraries for all sorts of languages. I used Ruby, but the PHP one is also great.

So, my approach was to monitor a directory for changes, capture each change, and use the captured file name and location to push the file via the Cloud Files API. Then, we can view the files using the Cloud Files CDN URL. This approach is the trickiest to get right, but it is also the easiest to set up and requires very little integration to work for any site.

The first thing I did was write a little Ruby script to upload files to a container on Cloud Files. I wanted this script to remain generic so that I could use it to push my database backups to Cloud Files also.

require 'rubygems'
require 'cloudfiles'

# All three arguments are required, so check the count before connecting
if ARGV.length < 3
    puts "Usage:"
    puts "pushToCloud.rb <container> <remotefile> <localfile>"
    puts "remote file MUST contain relative path under the container!"
    exit 1
end

container_name, remote_name, local_path = ARGV

# Log into the Cloud Files system
cf = CloudFiles::Connection.new("username", "APIKey")
container = cf.container(container_name)

if container.object_exists?(remote_name)
    # object (file) exists, nothing to do
else
    # object does not exist, so create it and then write the local file to it
    newfile = container.create_object(remote_name, true)
    newfile.load_from_filename(local_path)
end

This is very simple. It connects to Cloud Files and checks whether the object (file) exists; if it doesn't, it creates a new object and writes data to it. There are a few things to note here:

  1. An 'object' is the name of anything in Cloud Files; a file or a directory can be an 'object'.
  2. Objects don't have a path, they are named with their path, i.e. /var/log/my.log gets sent as '/var/log/my.log' and the path is part of the name. (At least this is how I understand it.)
  3. Creating an object does not write anything to it. You need to create an object and then write data to it (you can see this in the code above).
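
To make notes 2 and 3 a little more concrete, here is a minimal sketch that runs without a Rackspace account. FakeContainer, FakeObject, data_for and push_if_missing are made-up names for illustration only; they are in-memory stand-ins that mimic the three calls used in the script above (object_exists?, create_object and load_from_filename) and are not part of the cloudfiles gem.

```ruby
require 'tempfile'

# Hypothetical in-memory stand-in for a Cloud Files object (NOT the real class).
class FakeObject
  attr_reader :data

  def initialize
    @data = nil # created, but nothing has been written to it yet
  end

  # Mimics load_from_filename: read a local file and keep its bytes.
  def load_from_filename(path)
    @data = File.binread(path)
  end
end

# Hypothetical in-memory stand-in for a container (NOT the real class).
class FakeContainer
  def initialize
    @objects = {} # flat namespace: the full name, slashes included, is the key
  end

  def object_exists?(name)
    @objects.key?(name)
  end

  # The second argument is accepted only to match the call in the script.
  def create_object(name, _make_path = false)
    @objects[name] = FakeObject.new
  end

  # Illustration-only accessor so the stored bytes can be inspected.
  def data_for(name)
    @objects.fetch(name).data
  end
end

# Same flow as the script: skip names that already exist, otherwise create
# the object first and write to it second. Returns true if it uploaded.
def push_if_missing(container, remote_name, local_path)
  return false if container.object_exists?(remote_name)

  container.create_object(remote_name, true).load_from_filename(local_path)
  true
end

Tempfile.create('demo') do |f|
  f.write('hello')
  f.flush
  container = FakeContainer.new
  puts push_if_missing(container, 'forums/uploads/demo.txt', f.path) # uploads
  puts push_if_missing(container, 'forums/uploads/demo.txt', f.path) # skips
  puts container.data_for('forums/uploads/demo.txt')
end
```

Because an existing name is skipped, a second push of the same name never overwrites the first, which is also how the real script behaves.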

Now that I have my Ruby script, I will write a small bash script:

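#!/bin/bash
# Wait until the file exists, checking every 5 seconds. Give up after 12
# checks so that a path which never appears (a deleted file, for example)
# cannot block the loop forever.
function checkExists {
    local tries=0
    while [ ! -e "$1" ]
    do
        tries=$((tries + 1))
        if [ "$tries" -ge 12 ]
        then
            return 1
        fi
        sleep 5
    done
}

inotifywait -mr --timefmt '%d/%m/%y-%H:%M' --format '%T %w %f' -e modify,moved_to,create,delete /home/skylines/html/forums/uploads | while read -r date dir file; do

    # I only want everything after /home/skylines/html/ (the first 20 characters)
    cloudpath="${dir:20}${file}"
    localpath="${dir}${file}"
    checkExists "$localpath" || continue
    ruby /home/cbiggins/bin/pushToCloud.rb skylinesaustralia.com "$cloudpath" "$localpath"
    echo "${date} ruby /home/cbiggins/bin/pushToCloud.rb skylinesaustralia.com $cloudpath $localpath" >> /var/log/pushToCloud.log
done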

So, this very simple script uses inotifywait to monitor my uploads directory for changes (modify, moved_to, create and delete changes) and writes the date, path and file name of each change to stdout. My while loop then grabs that output and reads it into the date, dir and file variables. We build the Cloud Files path and the local path from those and pass them to my Ruby script. I am also a big advocate of logging everything, so I write my changes to a log file too. Note the checkExists function: if this bash script gets called before the file has completed uploading, then it's not available to be pushed to Cloud Files and we end up with errors, so this function just sleeps for 5 seconds if the file is not there and then tries again.
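
For anyone who would rather keep the whole thing in Ruby, the parsing and waiting steps described above could look roughly like the sketch below. It is only an illustration under a few assumptions: parse_event, wait_for_file, LOCAL_ROOT and the max_tries cap are names and choices made up here, and the input is assumed to be one line of inotifywait output in the '%T %w %f' format used above.

```ruby
# Hypothetical constant: the local prefix that is stripped to get the cloud path.
LOCAL_ROOT = "/home/skylines/html/"

# Split one line of inotifywait output ("%T %w %f") into date, dir and file,
# much like `read date dir file` does: the last field keeps embedded spaces.
# Returns nil for lines that do not fit or paths outside the root.
def parse_event(line, root = LOCAL_ROOT)
  date, dir, file = line.chomp.split(" ", 3)
  return nil if file.nil? || file.empty?

  localpath = File.join(dir, file) # copes with the trailing slash in %w
  return nil unless localpath.start_with?(root)

  { date: date, localpath: localpath, cloudpath: localpath[root.length..-1] }
end

# Poll until the file exists, sleeping `interval` seconds between checks.
# The cap on attempts is an assumption of this sketch, added so that a file
# which never appears cannot stall the loop. Returns true if the file showed up.
def wait_for_file(path, interval: 5, max_tries: 12)
  max_tries.times do
    return true if File.exist?(path)

    sleep interval
  end
  File.exist?(path)
end

sample = "10/01/10-12:00 /home/skylines/html/forums/uploads/ my photo.jpg"
event = parse_event(sample)
puts event[:cloudpath]
puts event[:localpath]
```

In a real loop, each parsed event would go through wait_for_file(event[:localpath]) before the push, and events that return nil or never show up would simply be skipped.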

As you can see, pushing files to the 'Cloud' is extremely easy and apps like inotify and inotify-tools make it super simple to monitor and perform actions based on file system changes.

If you have any suggestions or questions, please don't hesitate to leave a comment below.

Thanks!



  • Roollandy
    great post
  • Excellent post. Thanks for sharing such a valuable information with us
  • This is a good blog post on serving static content for a website, thanks for sharing...
  • Hi again Christian,
    Can I ask who you've moved to locally, and how you've found them in the last 8 months? Are you on VPS or dedi?

    I'm considering moving greenandgoldrugby.com on-shore, currently with linode and have found response time not too bad with nginx proxying to apache but would love to try it closer to home.

    I was also looking into a CDN for a while. I figure if the bulk of my content was here, then if I run out of bandwidth in a busy month I could just switch the CDN off and serve the rest from the States.
  • Hey Moses,

    I chose Digital Pacific for our local hosting. They're not too cheap but the support and service have been fantastic. We do ~700GB per month so have been tossing up moving media to Rackspace, but I don't have the bandwidth available to get it there!! Haha.

    I have 2 dedicated boxes, the primary with quad quad-cores and 16GB memory. The second server is just a replication target, used for downtime-less dumps of the live db, and holds an hourly rsync of all our files.

    I'm just using apache and PHP with no proxying but think I may give Nginx a go soon, Lighttpd has been a pain at work.

    C
  • Hey Christian,
    Just wanted to say thank you for this excellent script! It's very useful. I'm trying to figure out how to handle a Checksum confirmation so hopefully I can share that once I figure it out.
  • Yes please let me know, I want to build on this a little bit soon so I'll post here when I do.
  • thelen
    Why did you move hosting back to AUS? There are many reliable US hosts, and indeed EU hosts as well, surely they would be a better option?

    With the increase in international links, hosting in AU isn't much better than other countries now, in terms of response etc, and even many big US providers have direct peering with AUS, cutting 50ms off as well.
  • Hi thelen,
    Are you in Australia? Either way, response time between states in Australia is far better than response time between Australia and the States. Having hosted all my sites in the States for the best part of 10 years, I can tell you now that it's very, very different. 95% of my user base is in Australia and a few percent are in NZ. It'd be silly to continue to push the traffic overseas and make the browsing slower.
    Also, even if it were not for the better response time, I would prefer to have it in Australia to support local business and to have a local support contact in the same timezone as me.
  • thelen
    Yea I am in Aus, and I have a fairly large hosting business of my own, with 10s of servers spread across the globe ;)

    Sure it is good to host local, but that 600GB must be costing you at least a couple hundred, where even in the US with a premium host, it would be like 50 or less.

    Anyway, I wouldn't say the users would notice a better response, unless they use FTP, I use a bunch of US sites for hours a day and they are very snappy. Does come down to design, though, so I guess if SAU is a fairly large site then the lower response would help loading somewhat. Heh, even VNC to my EU servers is as fast as when I work remotely on AU customers' computers for my IT job ;)

    Each to his own I guess, just figured you could save some money ;)
  • Well, I'll soon be serving all static content from Cloud Files, so bandwidth cost will halve. Site load times have halved since bringing it back here, and it is a fairly large site. As I said, we've spent a fair few years on different boxes and with several hosts in the States, and it's never been as good as it is now.

    Thanks for the comment though. If I felt I could keep the speed up while keeping it in the states, I would. :)