EC2's dirty secret
Javier's post was a great tutorial on building out a PostGIS database on Amazon EC2. We all know EC2, but it does have its drawbacks, and they are mainly related to disk IO. When using EC2 & EBS with large datasets, you can easily run into IO bottlenecks. Individually these are not such a big deal, but when you are conducting global analyses, poor disk IO on EC2 & EBS can quickly become a problem.
To help alleviate this, there is a trend of people stringing together EBS volumes and creating their own software RAID-0 arrays to achieve higher read and write throughput.
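As a rough illustration of what that striping involves, here is a minimal dry-run sketch in Bash. It only prints the commands for an n-volume array rather than running them, so it is safe to try anywhere. The device names (`/dev/xvdf` onwards, `/dev/md0`), the ext4 filesystem, the `/mnt/pgdata` mount point and the `build_raid_commands` function name are all assumptions made for the example, not taken from any particular script.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the commands for an n-volume software RAID-0 array.
# Nothing is executed against real disks.
# ASSUMPTIONS: EBS volumes are already attached as /dev/xvdf, /dev/xvdg, ...;
# the array is /dev/md0; ext4 and /mnt/pgdata are illustrative choices.
set -euo pipefail

build_raid_commands() {
  local n="$1"                     # number of attached EBS volumes
  local md_device="/dev/md0"       # assumed array device
  local mount_point="/mnt/pgdata"  # hypothetical PostgreSQL data directory
  local letters=(f g h i j k l m n o p q r s t u v w x y)
  local devices=()
  local i

  # RAID-0 needs at least two members; this sketch knows 20 device letters.
  if (( n < 2 || n > ${#letters[@]} )); then
    echo "volume count must be between 2 and ${#letters[@]}" >&2
    return 1
  fi

  for (( i = 0; i < n; i++ )); do
    devices+=("/dev/xvd${letters[i]}")
  done

  # Stripe the volumes together, then format and mount the resulting array.
  echo "mdadm --create ${md_device} --level=0 --raid-devices=${n} ${devices[*]}"
  echo "mkfs.ext4 ${md_device}"
  echo "mkdir -p ${mount_point}"
  echo "mount ${md_device} ${mount_point}"
}

build_raid_commands 4
```

Once the printed commands look right for your instance, you could run them as root by hand. Keep in mind that RAID-0 has no redundancy, so losing any one volume loses the whole array.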
Nope, a Bash script.
I pieced together bits and bobs to create a script that builds out a PostGIS database on an n-volume RAID array on EC2. It's pretty simple stuff, but it should mean you can get your 20-volume RAID-0 PostGIS test rig up and running in minutes instead of hours.
You can grab it from GitHub: