MapBox

Using the OSM Planet EBS

MapBox maintains a Public Data Set containing the complete collection of OpenStreetMap data, stored as a PostgreSQL data cluster. By creating an EBS volume based on this snapshot, you can get access to an enormous amount of geographic data in a matter of seconds. Traditionally, the Planet file was distributed bzip2-compressed: it expands from 7.4GB to 134GB, and then occupies ~80GB once loaded into a PostGIS database.

Preparing the data set

If you haven't done so already, start a MapBox EC2 instance.

ec2run *AMI* -k *KEYPAIR* -t m1.large

Make a note of the availability zone and the ID of the instance. You'll need them when you create and attach the EBS volume.

Create a new EBS volume based on the OSM snapshot.

ec2addvol --snapshot *SNAPSHOT* -z *ZONE*

Attach the volume to your EC2 instance. You may use any valid device name in place of /dev/sdx.

ec2attvol *VOLUME* -i *INSTANCE* -d /dev/sdx
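The volume ID passed to ec2attvol comes from the output of ec2addvol. The following is a minimal sketch for picking it out automatically. It assumes the result line starts with the word VOLUME followed by the volume ID, which may differ between versions of the EC2 tools; the function name volume_id_from_output is hypothetical, and SNAPSHOT, ZONE and INSTANCE stand for the same placeholders used above.

```shell
# Print the volume ID found in ec2addvol output read from stdin.
# Assumption: the result line looks like "VOLUME <volume-id> <size> ...".
volume_id_from_output() {
    awk '$1 == "VOLUME" { print $2; exit }'
}

# Example use (placeholders as in the commands above):
#   VOLUME=$(ec2addvol --snapshot "$SNAPSHOT" -z "$ZONE" | volume_id_from_output)
#   ec2attvol "$VOLUME" -i "$INSTANCE" -d /dev/sdx
```

If the function prints nothing, check the raw ec2addvol output by hand before attaching.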

Connect to your EC2 instance using SSH. Once you're logged in, create a mount point for your volume. I've selected /mnt/osm as my mount point, but you can call it whatever you like. Once you have a mount point, go ahead and mount your volume.

mkdir /mnt/osm
mount -t ext3 /dev/sdx /mnt/osm
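Before relying on the volume, it can help to confirm that the mount actually took effect. The sketch below assumes a Linux instance where /proc/mounts lists the device in the first column and the mount point in the second; the function name is_mounted is hypothetical.

```shell
# Succeed if the given directory is listed as a mount point in a mounts table.
# The optional second argument defaults to /proc/mounts (Linux), whose lines
# have the form: device, mount point, filesystem type, options, ...
is_mounted() {
    dir=$1
    table=${2:-/proc/mounts}
    awk -v d="$dir" '$2 == d { found = 1 } END { exit !found }' "$table"
}

# Example use:
#   [ -b /dev/sdx ] || echo "device is not attached yet" >&2
#   is_mounted /mnt/osm && ls /mnt/osm
```

The test for a block device with [ -b /dev/sdx ] is a quick way to tell an unattached volume apart from a failed mount.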

Your mount point should now contain the following directories:

data/
keepup/
lost+found/

The volume we just created contains a PostgreSQL data directory, and we'll need to create a new data cluster using this directory.

pg_createcluster -d /mnt/osm/data 8.3 osm

The output from this command will contain the port number that was assigned to the data cluster. Make a note of it because you'll need to specify the port number to connect to the server.

Now, restart the PostgreSQL server to start using the data.

/etc/init.d/postgresql-8.3 restart
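If the port number was not written down, it can usually be recovered from pg_lsclusters, which ships alongside pg_createcluster. This is a sketch that assumes the usual column order (version, cluster name, port, status, ...); the function name cluster_port is hypothetical, and the postgres user in the example is an assumption about how the snapshot's database is set up.

```shell
# Print the port of the named cluster from pg_lsclusters output on stdin.
# Assumption: columns are version, cluster name, port, status, owner, ...
cluster_port() {
    awk -v name="$1" '$2 == name { print $3; exit }'
}

# Example use, once the server has been restarted:
#   PORT=$(pg_lsclusters | cluster_port osm)
#   psql -p "$PORT" -U postgres -l    # list the databases in the cluster
```

Listing the databases is a cheap way to confirm that the server is reading the data directory on the mounted volume.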

Using the OSM Planet EBS with other AMIs

Using the OSM Planet EBS with AMIs from other sources is simple: the only requirement is PostgreSQL 8.3 or newer, which can be installed with a package manager on Linux or an installer package on Windows. The EBS contains the PostGIS extensions within the database dump, so it is not necessary to install PostGIS.