How to scale a web site…very basic example using AWS.
July 11, 2008 – 3:26 pm
I was asked by a friend how I would set up a website in a scalable way using Amazon Web Services (AWS).
The website will have a service that takes file uploads from users, processes them and makes the results publicly available on the website. The website is expected to be very busy, and without proper planning it will run into bandwidth problems and long processing delays.
In its most basic form, the site will have three components, namely:
- load balancer
- web server
- processing server
It will also use the following AWS services:
- S3: http://aws.amazon.com/s3
- SQS: http://aws.amazon.com/sqs
- EC2: http://aws.amazon.com/ec2
To perform the load balancing function you can use an EC2 instance with a load balancing application installed on it. There are plenty of these publicly available under the GPL, such as Pound (http://www.apsis.ch/pound) or Crossroads (http://crossroads.e-tunity.com).
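To make the load balancer's job concrete, here is a minimal sketch of round-robin selection in Python. It only illustrates the idea of spreading requests across the web servers; it is not how Pound or Crossroads are implemented, and the server addresses are made up for the example. A real balancer also has to cope with back ends that stop responding.

```python
from itertools import cycle

# Hypothetical private addresses of the two web servers behind the balancer.
WEB_SERVERS = ["10.0.0.11:80", "10.0.0.12:80"]


def make_round_robin(backends):
    """Return a function that hands out the next backend on each call."""
    pool = cycle(backends)
    return lambda: next(pool)


next_backend = make_round_robin(WEB_SERVERS)

# Four incoming requests alternate between the two servers.
chosen = [next_backend() for _ in range(4)]
print(chosen)
```

Adding a third web server is then just a matter of adding its address to the list.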
Behind this you can add the bank of web servers, perhaps two at the start. The web servers would handle the following steps:
- allow the user to upload the file that needs to be processed
- save the file in S3 and create a log entry in the website database
- insert a message into SQS (listing the location of the video file)
The processing servers are set up to poll the SQS queue. Each one takes the next available job, processes it, saves the result in S3 and updates the MySQL database to say that the processing has been completed and the file is available for public viewing.
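The upload-and-process flow can be sketched roughly as follows. To keep the example runnable without an AWS account, S3, SQS and the database are replaced by in-memory stand-ins (a dict, a `queue.Queue` and another dict). All function names, key names and the `process` step are assumptions made for illustration; in a real deployment each stand-in would be swapped for calls to the corresponding service.

```python
import queue

# In-memory stand-ins (assumptions for illustration only):
s3 = {}               # plays the role of an S3 bucket: key -> bytes
jobs = queue.Queue()  # plays the role of the SQS queue
db = {}               # plays the role of the website database: upload id -> row


def handle_upload(upload_id, data):
    """Web server side: store the upload, log it, then queue a job."""
    key = f"uploads/{upload_id}"
    s3[key] = data
    db[upload_id] = {"status": "pending", "source": key, "result": None}
    jobs.put({"upload_id": upload_id, "source": key})


def process(data):
    """Placeholder for the real processing step (hypothetical)."""
    return data.upper()


def work_once():
    """Processing server side: take one job, if any, and complete it.

    Returns True if a job was processed, False if the queue was empty.
    """
    try:
        job = jobs.get_nowait()
    except queue.Empty:
        return False
    result_key = f"results/{job['upload_id']}"
    s3[result_key] = process(s3[job["source"]])
    db[job["upload_id"]].update(status="done", result=result_key)
    return True


handle_upload("42", b"hello")
while work_once():  # a real worker would loop forever and sleep when idle
    pass
print(db["42"])
```

The useful property of this design is that the web servers never wait for processing: they only write to storage and the queue, so the two banks of servers can be scaled independently. One thing the stand-in hides is that with SQS a worker should delete a message only after the job has succeeded; otherwise the message becomes visible again after its visibility timeout and another worker picks it up.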
The really low-tech way to scale either the web server bank or the processing bank is to get the individual servers to email you (using cron) when their load averages are high, so that you can start more instances by hand. The more tech-savvy way to do it is to use either Scalr (http://code.google.com/p/scalr) or your own auto-deploy tool/app.
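As a rough sketch of the low-tech option, the script below checks the one-minute load average and prints a warning when it is high. It relies on cron mailing whatever a job prints to the crontab owner (or to the `MAILTO` address), which only works if mail delivery is set up on the server. The threshold of 1.5 per CPU and the function names are arbitrary choices for the example, and `os.getloadavg()` is only available on Unix-like systems.

```python
import os


def load_is_high(load_1min, cpu_count, threshold_per_cpu=1.5):
    """Decide whether the load is high relative to the number of CPUs."""
    return load_1min > threshold_per_cpu * cpu_count


def check():
    """Return a warning message if this server is overloaded, else None."""
    load_1min, _, _ = os.getloadavg()  # Unix only
    cpus = os.cpu_count() or 1
    if load_is_high(load_1min, cpus):
        return f"High load on {os.uname().nodename}: {load_1min:.2f} ({cpus} CPU(s))"
    return None


if __name__ == "__main__":
    message = check()
    if message:
        # cron mails any output of the job, so printing is enough.
        print(message)
```

Run from cron every few minutes, this stays silent while the server is healthy and only produces an email when the threshold is crossed.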
This is not supposed to be an exhaustive study or example of how the site would be set up, but I hope that it serves as a starting point.