Monday, November 16, 2009

Amazon EC2, EBS RAID-0 & PostGIS build script

EC2's dirty secret
Javier's post was a great tutorial on building out a PostGIS database on Amazon EC2. We all know EC2, but it does have its drawbacks, and they are mainly related to disk IO. When using EC2 & EBS with large datasets you can easily run into IO bottlenecks. Individually these are not such a big deal, but when you are conducting global analyses, poor disk IO on EC2 & EBS can quickly become a problem.

Clean living?
To help alleviate this, there is a trend of people stringing together EBS volumes and creating their own software RAID-0 arrays to achieve higher read and write throughput.
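For anyone who has not done this before, here is a rough sketch of what striping a handful of EBS volumes looks like. It only prints the commands rather than running them, so it is safe to try anywhere. The device names (/dev/sdh and friends), the /dev/md0 array name, the /vol mount point, the ext3 filesystem and the raid0_commands helper are all assumptions for illustration, so swap in whatever your instance actually uses.

```shell
#!/usr/bin/env bash
# Sketch only: print the commands that would stripe EBS volumes into RAID-0.
# Nothing is executed against the disks. Array name, mount point, filesystem
# and device names are illustrative assumptions, not taken from a real setup.

raid0_commands() {
  local md_dev="$1" mount_point="$2"
  shift 2                      # remaining arguments are the attached volumes
  local count="$#"

  if [ "$count" -lt 2 ]; then
    echo "need at least two volumes for RAID-0" >&2
    return 1
  fi

  # Stripe all volumes into one software RAID-0 device.
  echo "mdadm --create $md_dev --level=0 --raid-devices=$count $*"
  # Put a filesystem on the array (ext3 is an assumption; use what you prefer).
  echo "mkfs.ext3 $md_dev"
  # Mount it where the database files would live.
  echo "mkdir -p $mount_point"
  echo "mount $md_dev $mount_point"
}

raid0_commands /dev/md0 /vol /dev/sdh /dev/sdi /dev/sdj /dev/sdk
```

Once the printed commands look right for your volumes, you could run them as root by hand. Keep in mind that RAID-0 has no redundancy, so losing one volume loses the whole array.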

Nope, a Bash script.
I pieced together bits and bobs to create a script that builds out a PostGIS database on an n-volume RAID array on EC2. It's pretty simple stuff, but it should mean that you can get your 20-volume RAID-0 PostGIS test rig up and running in minutes instead of hours.

You can grab it from GitHub:


Sergio Alvarez said...

cool post! thanks!

fak3r said...

This is great, I'm always sharing out "code" that I've just done in BASH - I say if it's helpful, what do I care what language it is? Raw can be helpful, and throwing it all out there is the only way to know. Looking forward to investigating this with Eucalyptus.


نلسون said...

thanks simon!

but I reckon postgresql repo is not working anymore.

adding 'deb jaunty-backports main universe multiverse restricted' to sources.list is also not solving the problem.

is there any other source for postgresql 8.4?

نلسون said...

had to hack it with:
- deb jaunty-backports main restricted universe multiverse
- deb jaunty-backports main restricted universe multiverse
- deb jaunty-backports main restricted universe multiverse

any other option?