How to scale a web site…very basic example using AWS.

July 11, 2008 – 3:26 pm

I was asked by a friend how I would set up a website in a scalable way using Amazon Web Services (AWS).

The website will offer a service that takes file uploads from users, processes them and makes the results publicly available on the website. The site is expected to be very busy, and without proper planning it will suffer from bandwidth issues and long processing delays.

In its most basic form, the site will have three components, namely:

  1. load balancer
  2. web server
  3. processing server

It will also use the following AWS services:

  1. S3: http://aws.amazon.com/s3
  2. SQS: http://aws.amazon.com/sqs
  3. EC2: http://aws.amazon.com/ec2

To perform the load balancing function you can use an EC2 instance with a load balancing application installed on it. There are plenty of these publicly available under the GPL, such as Pound (http://www.apsis.ch/pound) or Crossroads (http://crossroads.e-tunity.com).
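To illustrate the basic idea of spreading requests across the web server bank, here is a minimal round-robin sketch in Python. It is not how Pound or Crossroads are configured or implemented; the server addresses and the `pick_backend` helper are made up for this example.

```python
import itertools

# Hypothetical private addresses for the two web servers in the bank.
web_servers = ["10.0.0.11", "10.0.0.12"]
rotation = itertools.cycle(web_servers)


def pick_backend():
    """Return the next web server in round-robin order."""
    return next(rotation)


# Four incoming requests are shared evenly between the two servers.
picks = [pick_backend() for _ in range(4)]
print(picks)
```

A real load balancer would also check that a server is healthy before sending traffic to it, which this sketch leaves out.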

Behind this you can add the bank of web servers, perhaps two at the start. Together, the web servers and the processing servers would handle the following steps:

  1. the web server allows the user to upload the file that needs to be processed
  2. the web server saves the file in S3 and creates a log entry in the website database
  3. the web server inserts a message into SQS (listing the location of the uploaded file)
  4. the processing servers poll the SQS queue, take the next available job, process it, save the result in S3 and update the MySQL database to say that the processing has been completed and the file is available for public viewing
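The steps above can be sketched roughly as follows. This is only an illustration that runs on one machine: a dictionary stands in for the S3 bucket, another for the website database, and Python's `queue.Queue` stands in for SQS. The function names and the `process` placeholder are assumptions made for the example; a real site would replace the stand-ins with calls to the AWS services.

```python
import queue

# In-memory stand-ins (assumptions for illustration only):
#   storage ~ the S3 bucket, database ~ the MySQL table, jobs ~ the SQS queue.
storage = {}
database = {}
jobs = queue.Queue()


def handle_upload(file_id, data):
    """Web server side: steps 1 to 3."""
    key = f"uploads/{file_id}"
    storage[key] = data                          # save the upload
    database[file_id] = {"status": "queued", "result_key": None}
    jobs.put({"file_id": file_id, "key": key})   # message lists the file location


def process(data):
    """Placeholder for the real processing work."""
    return data.upper()


def worker_poll_once():
    """Processing server side: step 4. Returns False when the queue is empty."""
    try:
        job = jobs.get_nowait()
    except queue.Empty:
        return False
    result_key = f"results/{job['file_id']}"
    storage[result_key] = process(storage[job["key"]])
    database[job["file_id"]] = {"status": "done", "result_key": result_key}
    return True


handle_upload("42", b"hello")
while worker_poll_once():
    pass
print(database["42"]["status"])
```

The useful property of this design is that the web servers and the processing servers only share the queue and the storage, so either bank can grow or shrink without the other needing to know.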

The really low-tech way to scale either the web server bank or the processing bank is to get the individual servers to email you (using cron) when the load averages are high. The more tech savvy way to do it is to use either Scalr (http://code.google.com/p/scalr) or your own auto-deploy tool/app.
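As a rough sketch of the low-tech approach, the script below could be run from cron on each server. The threshold of 1.5 per CPU, the email addresses and the function names are assumptions for the example, and the script only builds and prints the alert; sending it (for instance with `smtplib`) is left out.

```python
import os
import socket
from email.message import EmailMessage


def load_is_high(load_1min, cpu_count, per_cpu_threshold=1.5):
    """True when the 1-minute load average exceeds the per-CPU threshold."""
    return load_1min > cpu_count * per_cpu_threshold


def build_alert(hostname, load_1min):
    """Build (but do not send) the alert email."""
    msg = EmailMessage()
    msg["Subject"] = f"High load on {hostname}: {load_1min:.2f}"
    msg["From"] = "alerts@example.com"   # hypothetical address
    msg["To"] = "admin@example.com"      # hypothetical address
    msg.set_content("Consider adding another server to the bank.")
    return msg


if __name__ == "__main__":
    load_1min = os.getloadavg()[0]       # available on Unix-like systems only
    if load_is_high(load_1min, os.cpu_count() or 1):
        print(build_alert(socket.gethostname(), load_1min))
```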

This is not supposed to be an exhaustive study or example of how the site would be set up…but I hope that it acts as a starting point.

The AWS ecosystem…barrier to innovation?

June 30, 2008 – 5:14 pm

Stacey Higginbotham from GigaOM wrote an interesting article about AWS. I found that the article opened up a discussion about the future of innovation at AWS.

Stacey says:

Werner Vogels, CTO at Amazon said that they built AWS for the company’s internal developers, and as such, didn’t feel the need to wrap services such as dashboards and testing offerings around it.

She goes on to say:

Companies such as RightScale, Hyperic and Soasta depend on both the success of AWS and its shortcomings — the solutions to which they propose to offer. So I sat down with the online retailer’s CTO, Werner Vogels, to see how Amazon viewed this ecosystem. My takeaway? I think most of the these firms are safe.

The ecosystem listed above is very big and accounts for a large portion of the fees that Amazon earns from AWS. These solution providers help many large customers to use AWS in a very structured and safe way.

But what happens if the thousands of small developers and start-up companies who use AWS want a service or feature that is already provided as a premium service by one of the companies listed above? Would Amazon release a service that might see one of the solution providers lose market share? Does Amazon need their custom more than that of the thousands of small developers?

My gut feeling is that AWS would develop services for their clients…and that this development roadmap is dependent upon who shouts loudest. Let's see how things work out over the next 12 months.

More at:
http://gigaom.com/2008/06/27/hey-startups-amazon-gets-it

Speed of thought…Cloud Computing helps lower the time to market.

June 22, 2008 – 6:35 pm

My day job is managing the worldwide network of spam filtering servers used to provide the EmailCloud service. I have been managing email servers for almost ten years and have quite a bit of experience in the area. Recently I have been using several cloud computing providers to host parts of the network. This has helped me to add scalability, failover and geographical load balancing. These are great things, but one of the most exciting aspects of this new technology is the speed with which it allows me to design, prototype, test and deploy new technology.

In the past the process was quite a bit more time consuming. The steps were something like this:

  1. Find some hardware. I am lucky enough to have quite a bit of test hardware in the office…we have a suite of around five servers that are used for testing. The problem is that they are regularly dismantled and the parts used in lots of ways. Finding hardware is easy…but it takes time to build it.
  2. Install an OS. For the testing to be perfect it is a good idea to use the same OS as the deployment environment.
  3. Replicate the existing production environment. This step can take ages…installing all the correct OS modules, Perl modules and application software.
  4. Write your code and test it.
  5. Deploy….
  6. Tidy up and put the server back on the shelf.

The whole process can take days of work…but due to the demands of small business it can take weeks to get it done….breaking the line of thought.

Using AWS, and EC2 and S3 in particular, I have been able to lower the time from conception to deployment from months to perhaps hours. Now, firing up an instance and testing some crazy idea only takes minutes and costs peanuts. Once tested, it can be built into a custom image and then fired up in bulk for deployment….moving the IP addresses across from the old bank of now obsolete servers is the last step before shutting those servers down.

Anybody for lunch???

June 17, 2008 – 9:09 pm

I have been talking to several people (Alex Kavanagh, Codeworks Connect, members of the AWS user group and the Newcastle Open Source Network) over the past few weeks about trying to build some awareness of Linux, open source software and my favorite topic, cloud computing.

We have been playing with a few ideas, namely:

  1. A version of GoOpen for the North East
  2. A FOSSCamp event on the theme of the very popular BarCamp concept
  3. A ‘hackathon’ or ‘mashup’ type event

Some of us are meeting up for a sandwich at 13:00 on Thursday 26th June at Panis cafe in Newcastle-upon-Tyne.

I hope that we can get a few interested people together to throw around a few ideas. The whole concept is at a very early stage, so we would like to invite ANY interested parties to attend.

If you can't attend, please feel free to add your comments to this blog entry and I will raise them at the meeting.

zoomii.com, a cool user interface to Amazon.com

June 17, 2008 – 8:26 pm

I found out about Zoomii on the AWS Blog.

I really like the new user interface…check it out for yourself. Zoom in and out, pan around, and try clicking on the arrows in the headers of each book shelf. Check out the attention to detail as the site even displays book images in a manner that shows their size in relation to the books next to them!

When I saw this site it reminded me of the interface of the iPhone. While Zoomii is not as slick as an iPhone interface it is rather similar in the intuitive way it works.

The writer of the site is a guy called Chris Thiessen. In an interview with Mike from AWS he said:

“I’m pretty sure I wouldn’t even have tried to build Zoomii without Amazon EC2 and Amazon S3; it would have looked too expensive, too daunting. And, of course, Zoomii couldn’t exist without access to a dataset like that. It looks like Amazon is growing a good business with its web services, but it’s providing far more value than it captures–you changed the equation for a lot startups and other projects.”

The site is well worth the visit.

The various levels of cloud computing offerings

June 7, 2008 – 6:50 pm

The guys at RightScale came up with these really easy ways to define the various levels of cloud computing offerings; I have slightly rephrased their explanations:

  • Applications in the cloud: In my opinion this is not ‘cloud computing’…it is simply SaaS, Software as a Service, looking to join the ‘cloud’ craze. In this area we will find sites like Gmail, Yahoo Mail, Hotmail, the various search engines, Wikipedia, Encyclopaedia Britannica, etc. Put simply, these are services / websites that people can use (either commercially or free) without any concern about where, how, or by whom the compute cycles and storage bits are provided.
  • Platforms in the cloud: Mosso, Google App Engine, and Force.com have all released services in this area. Developers write their application to a more or less open specification and then upload their code into the cloud where the app is run magically somewhere, typically being able to scale up automagically as usage for the app grows.
  • Infrastructure in the cloud: This is where AWS sits. Developers and system administrators obtain general compute, storage, queueing, and other resources and run their applications with the fewest limitations. This is the most powerful type of cloud in that virtually any application and any configuration that is fit for the internet can be mapped to this type of service.

Looking at these different types of clouds it’s pretty clear that they are geared toward different purposes and that they all have a reason for being.

More here:
http://blog.rightscale.com/2008/05/26/define-cloud-computing/ 

Why are cloud computing services proving so popular?

June 4, 2008 – 12:16 pm

There are many entrants in the web services / cloud computing / utility computing arena, but at this time the market leader is AWS from Amazon.com. Web services are creating a buzz in the web development industry. If you are involved in the web development industry and you are not already using this sort of technology, I predict that you will be doing so in the not too distant future.

The reason why these technologies are proving so popular is that they let businesses and individuals “rent” computing power, data storage, and bandwidth on this vast network, and, best of all, you only pay for what you use.

AWS is gaining quite a bit of traction in the Facebook development area…you might be surprised to hear that several high-profile Facebook applications such as iLike and Family Tree are actually using AWS to serve their users on Facebook. These applications use AWS to serve hundreds of thousands of user hits per day. How is that possible?

I believe this is possible due to the following facts:

  • The various AWS offerings are very cheap to use and have no up-front costs. This allows you to experiment with the services without major commitments. More importantly, you can test their performance and behavior with simple prototypes before you build your final application.
  • Each service offering in the AWS range has a wealth of tools, libraries and code samples that are available for AWS users. Most of these have been written and contributed by the various developers on the forums. There is a large and active community of developers using the infrastructure services, and many of the available resources are open source.
  • The low cost allows you to rethink how you design your web applications. Things work differently in the cloud. This can be an advantage as it will allow you to be more ambitious, but at the same time you may have to adjust your expectations to build applications that are robust and massively scalable.
  • Design your application to take advantage of the strengths of AWS…both in service delivery and low cost…this will allow you to spread your application’s workload between many small components rather than centralising it into a single point of failure.
  • Embrace the concepts behind cloud computing….build applications from components that will quickly recover from transient errors, and that can be easily restarted or replaced if they fail completely.
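One common way to make a component recover from transient errors is to retry a failed call with an increasing delay. The sketch below is a generic illustration of that idea, not something specific to AWS; the `call_with_retries` helper, its default values and the `flaky` test operation are all made up for the example.

```python
import time


def call_with_retries(operation, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Run operation(); on failure wait base_delay * 2**n and try again."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise                      # out of retries: let the caller decide
            sleep(base_delay * 2 ** attempt)


# Example: an operation that fails twice before succeeding.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


# Record the delays instead of actually sleeping, so the example runs instantly.
delays = []
result = call_with_retries(flaky, sleep=delays.append)
print(result, delays)
```

If the operation still fails after the last attempt, the error is passed on, which is the point at which a component would be restarted or replaced.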

Scalr Presentation, AWS user group

April 25, 2008 – 11:16 am

I have already written about Scalr in the past, but at the AWS user group I gave a short presentation about Scalr and why I think it is an important development in the area of Cloud Computing for startup and early stage companies.

Here are the slides of the presentation:
[Slides 1 to 8 of the presentation]

You can download a ZIP file of the whole Scalr Presentation.

Explaining Cloud Computing to Less-Technical Friends

April 22, 2008 – 8:41 am

In a constant effort to explain cloud computing to my non-technical friends I try to use analogies and examples like the Slashdot story or the “what if I had more data” story. Here is another one….which I paraphrased from TechTarget.com.

A useful analogy for cloud computing is RAID, redundant arrays of inexpensive disks. When the first patents for this revolutionary concept were filed by IBM in 1977, the focus was on performance, not cost. Ten years later it became apparent that an array of consumer-grade “crap disks” could deliver better reliability and performance than standalone disks at dramatically lower costs. So much cheaper, in fact, that when enough parts failed, the array was “pushed out the back door” and dumped.

Google’s cloud operating system rests on a similar hardware architecture of throwaway components lashed together for maximum processing and minimum monitoring, all under the control of “single executive” software.

More here.

Amazon Web Services gets serious about enterprise

April 21, 2008 – 8:00 am

Cloud Computing was once looked upon by enterprise users as purely academic and unlikely to be deployed in mission-critical situations. It usually takes years for new technologies to be adopted by the enterprise….and often enterprises are criticised for prolonging the use of old but trusted and perhaps uneconomic technologies.

Cloud Computing and its brother, Utility Computing, are different….the cost savings and economies of scale are so vast that enterprises haven’t been able to ignore them…and now they don’t have to fight that conflict, as Amazon have announced commercial support for their range of services.

Cloud Computing has grown up with the announcement that Amazon Web Services will now provide support for users of its Simple Storage Service, Elastic Compute Cloud and Simple Queue Service products. Amazon, with its launch last week of persistent storage, was clearly wooing enterprise users, and the offer to provide support signals a formal courtship.

The newly announced support packages are:

  • Silver, starting at $100 per month, gives access to the new AWS Support Center, with personalized Web support, case tracking and guaranteed response times during US office hours (6am - 6pm Pacific) Monday to Friday.
  • Gold, starting at $400, includes all of the above plus personalized phone support and around the clock (24×7×365) coverage.

Some people have asked why now? Actually, why release this service? The online forums have been used very effectively to provide support to the tens of thousands of AWS users in the past. Well, if you read into the statement from Jeff Barr closely you will see the reason. He said:

“Increasingly, we see that organizations of all sizes are putting AWS to use in new, innovative, and mission-critical ways. These organizations have told us that they need a more direct and more discreet way to request assistance and to report problems.”

The hint is the word “discreet”.

On April 7th there was a major outage of the EC2 service from AWS. For some reason about 20% of the EC2 instances lost connectivity. A post was placed on the AWS forum and within around one hour the issue was resolved. At the time I was very impressed by the speed and efficiency of the support….but one thing stuck in my mind….the whole episode was very public. During the outage several developers posted information about their problems, all of it on a public forum and still visible on the internet….something that really does not concern me with my small business but would be completely unacceptable for an enterprise.

Well done, Amazon, for finally making Cloud Computing acceptable for the enterprise.