Since its start in 2006, Amazon Web Services has grown to over 40 different services. Amazon Simple Storage Service (S3), our object store and one of our first services, is now home to trillions of objects and is core to many enterprise applications. S3 stores many kinds of data, including geospatial, genomic, and video data, and facilitates parallel access to big data. Netflix considers S3 the “source of truth” for all its data warehousing.
The goal of this presentation is to illustrate best practices for open or shared geo-data in the cloud. To do so, it showcases a simple map tiling architecture that runs on top of data stored in S3 and uses CloudFront (CDN), Elastic Beanstalk (application management), and EC2 (compute) in combination with FOSS4G tools.
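A tile server that renders on demand has to translate each requested tile address into a geographic window to read from the source imagery. The following is a minimal sketch of that step, assuming the common XYZ ("slippy map") Web Mercator tiling scheme; the abstract does not say which scheme the demo uses, and the function name `tile_bounds` is illustrative only.

```python
import math
from typing import Tuple


def tile_bounds(x: int, y: int, z: int) -> Tuple[float, float, float, float]:
    """Return (west, south, east, north) in degrees for XYZ tile x/y at zoom z.

    Assumes the Web Mercator XYZ convention: the origin is the top-left
    corner of the map and y increases southward.
    """
    n = 2 ** z  # number of tiles along each axis at this zoom level
    if not (0 <= x < n and 0 <= y < n):
        raise ValueError(f"tile {z}/{x}/{y} is outside the grid for zoom {z}")

    def lon(col: int) -> float:
        # Longitude is linear in the tile column.
        return col / n * 360.0 - 180.0

    def lat(row: int) -> float:
        # Latitude follows the inverse Mercator projection of the tile row.
        return math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * row / n))))

    # The top edge of row y is the northern bound; row y + 1 is the southern one.
    return lon(x), lat(y + 1), lon(x + 1), lat(y)
```

A handler behind the CDN could call `tile_bounds(x, y, z)` for each request and read only the matching window from the GeoTIFFs, which is what makes pre-rendering unnecessary in this kind of design.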
The demo uses the USDA’s NAIP dataset (48 TB), plus other higher-resolution city data, to show how you can build global mapping services without pre-rendering tiles. Because the GeoTIFFs are stored in a requester-pays S3 bucket, anyone with an AWS account has immediate access to the source GeoTIFFs at the infrastructure level, which allows parallel access by other systems and, if necessary, bulk export. However, I will show that the cloud, because it offers compute that is both highly available and flexible, makes it unnecessary to move the data. This points to a new paradigm in which one set of GeoTIFFs can act as an authoritative source for any number of users.
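Reading from a requester-pays bucket requires the caller to state explicitly that they accept the request and transfer charges. The sketch below shows one way to do that with boto3's `get_object`, using its `RequestPayer` and `Range` parameters. The helper names, and any bucket or key passed to them, are placeholders and not the demo's actual configuration; boto3 and valid AWS credentials are assumed for the network call.

```python
from typing import Dict, Optional


def requester_pays_get_args(
    bucket: str, key: str, start: Optional[int] = None, end: Optional[int] = None
) -> Dict[str, str]:
    """Build GetObject arguments for a requester-pays bucket.

    If start and end are both given, only that inclusive byte range is requested.
    """
    # RequestPayer="requester" acknowledges that the caller pays for the request.
    args = {"Bucket": bucket, "Key": key, "RequestPayer": "requester"}
    if start is not None and end is not None:
        if start < 0 or end < start:
            raise ValueError("byte range must satisfy 0 <= start <= end")
        args["Range"] = f"bytes={start}-{end}"
    return args


def fetch_bytes(
    bucket: str, key: str, start: Optional[int] = None, end: Optional[int] = None
) -> bytes:
    """Fetch an object, or a byte range of it, from a requester-pays bucket."""
    import boto3  # third-party AWS SDK, imported lazily; needs AWS credentials

    s3 = boto3.client("s3")
    response = s3.get_object(**requester_pays_get_args(bucket, key, start, end))
    return response["Body"].read()
```

For example, `fetch_bytes("example-geotiff-bucket", "path/to/image.tif", 0, 1023)` (hypothetical names) would read only the first kilobyte of a GeoTIFF, enough to inspect the start of the file in place without copying it out of S3.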