EC2's dirty secret
Javier's post was a great tutorial on building out a PostGIS database on Amazon EC2. We all know EC2, but it does have its drawbacks, and they are mainly related to disk IO. When using EC2 and EBS with large datasets you can easily run into IO bottlenecks. Individually these are not such a big deal, but when you are conducting global analyses, poor disk IO on EC2 and EBS can quickly become a problem.
Clean living?
To help alleviate this, there is a trend of people stringing together EBS volumes and creating their own software RAID-0 arrays to achieve higher read and write throughput.
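As a rough illustration of the idea, a striped array over several attached EBS volumes is typically assembled with `mdadm`. The sketch below only prints the command rather than running it. The function name `raid0_create_cmd`, the device names (`/dev/sdh1` and so on) and the array name `/dev/md0` are assumptions made for the example, not values taken from the script discussed in this post.

```shell
#!/usr/bin/env bash
# Print (do not run) an mdadm command that stripes the given devices as RAID-0.
# The array name and device names passed in below are illustrative assumptions.
raid0_create_cmd() {
    local array="$1"
    shift
    local count="$#"
    # A stripe set only makes sense with two or more member devices.
    if [ "$count" -lt 2 ]; then
        echo "need at least two devices" >&2
        return 1
    fi
    echo "mdadm --create $array --level=0 --raid-devices=$count $*"
}

raid0_create_cmd /dev/md0 /dev/sdh1 /dev/sdh2 /dev/sdh3 /dev/sdh4
```

Running the printed command for real needs root and already-attached volumes, so it is worth checking the output by eye before piping it anywhere.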
Nope, a Bash script.
I pieced together bits and bobs to create a script that builds out a PostGIS database on an n-volume RAID array on EC2. It's pretty simple stuff, but it should mean that you can get your 20-volume RAID-0 PostGIS test rig up and running in minutes instead of hours.
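To give a feel for the kind of steps involved once the array exists, here is a dry-run outline that prints the commands for putting a filesystem on the array and mounting it where the database files would live. Everything in it is hypothetical: the function name `plan_data_volume`, the XFS filesystem choice, the array name `/dev/md0` and the mount point `/mnt/pgdata` are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Dry-run outline: print the steps for preparing a data volume on the array.
# Filesystem choice, array name and mount point are hypothetical.
plan_data_volume() {
    local array="$1" mount_point="$2"
    echo "mkfs.xfs $array"
    echo "mkdir -p $mount_point"
    echo "mount $array $mount_point"
}

plan_data_volume /dev/md0 /mnt/pgdata
```

This is only an outline of the shape of the job; the actual script does the real work.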
You can grab it from GitHub:
5 comments:
cool post! thanks!
This is great, I'm always sharing out "code" that I've just done in BASH - I say if it's helpful, what do I care what language it is? Raw can be helpful, and throwing it all out there is the only way to know. Looking forward to investigating this with Eucalyptus.
@fak3r
thanks simon!
but I reckon postgresql repo is not working anymore.
adding 'deb http://archive.ubuntu.com/ubuntu jaunty-backports main universe multiverse restricted' to sources.list is also not solving the problem.
is there any other source for postgresql 8.4?
had to hack it with:
- deb http://ec2-us-east-mirror1.rightscale.com/ubuntu jaunty-backports main restricted universe multiverse
- deb http://ec2-us-east-mirror2.rightscale.com/ubuntu jaunty-backports main restricted universe multiverse
- deb http://ec2-us-east-mirror3.rightscale.com/ubuntu jaunty-backports main restricted universe multiverse
any other option?