The national geospatial foundation of Norway consists of multiple semi-structured and semi-standardized thematic data sets made available in a variety of formats. Storing, extracting and performing lightweight analyses across the different data sets adds value to the data, which is a prime motivation for releasing it freely to the public. Earlier approaches stored the data sets in a traditional relational manner, resulting in hundreds of PostgreSQL/PostGIS tables, some with dozens of attributes. Updating and querying the data sets becomes unnecessarily complicated and is often a tedious, manual task.
In an effort to deal with these issues, we have looked at other ways of storing and querying the data. A schemaless storage mechanism, such as that offered by NoSQL databases, suits the task well. However, NoSQL database implementations have major drawbacks in geometry handling when compared with PostGIS. We wanted the geometry handling of PostGIS combined with the schemaless storage mechanisms of a NoSQL database.
PostgreSQL offers exactly this combination, with PostGIS handling the geometry and HStore handling the key-value pairs. HStore is an extension that implements a binary data type in PostgreSQL that allows storing an arbitrary number of key-value pairs in a single column. In contrast to the JSON data type, HStore enables indexing of the key-value pairs. Combining PostGIS geometry with HStore's key-value storage for non-geometry attributes proved a very good match for storing the highly varying data sets. The gain in flexibility is tremendous, allowing our data developers to find new ways of combining the data sets and deriving value from them. Future work on the JSONB data type will combine the benefits of both the HStore and the JSON data types, enabling solutions that are even more advanced as well as bridging the gap between NoSQL databases and relational spatial databases.
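As an illustration of the storage pattern described above, the following minimal sketch shows what a single table combining a PostGIS geometry column with an HStore attribute column could look like, together with a small helper that serializes a Python dict into HStore's text representation. The table name `features`, its columns, the index names, the SRID 4326 and the psycopg-style `%(name)s` placeholders are all assumptions made for this example; they are not taken from the actual data model presented in the talk.

```python
# Hypothetical schema: one table for all thematic data sets, with the
# varying non-geometry attributes kept in a single hstore column.
CREATE_TABLE_SQL = """
CREATE EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS hstore;

CREATE TABLE features (
    id      bigserial PRIMARY KEY,
    dataset text NOT NULL,            -- which thematic data set the row belongs to
    attrs   hstore,                   -- arbitrary key-value attributes
    geom    geometry(Geometry, 4326)  -- SRID is an assumption for this sketch
);

-- GIN index supports hstore containment (@>) and key-existence (?) queries.
CREATE INDEX features_attrs_idx ON features USING gin (attrs);
-- GiST index supports spatial predicates such as ST_Intersects.
CREATE INDEX features_geom_idx ON features USING gist (geom);
"""

# Hypothetical extraction: filter on key-value pairs and a bounding box.
FIND_SQL = """
SELECT id, attrs -> 'name' AS name
FROM features
WHERE attrs @> %(filter)s::hstore
  AND ST_Intersects(
        geom,
        ST_MakeEnvelope(%(xmin)s, %(ymin)s, %(xmax)s, %(ymax)s, 4326));
"""


def to_hstore_literal(attrs):
    """Serialize a dict to hstore text form, e.g. "k"=>"v", "k2"=>NULL."""

    def quote(value):
        # Backslashes and double quotes must be escaped inside quoted items.
        text = str(value).replace("\\", "\\\\").replace('"', '\\"')
        return '"' + text + '"'

    pairs = []
    for key, value in attrs.items():
        # hstore values may be NULL (unquoted); keys may not.
        rendered = "NULL" if value is None else quote(value)
        pairs.append(f"{quote(key)}=>{rendered}")
    return ", ".join(pairs)
```

For example, `to_hstore_literal({"name": "Oslo", "type": None})` yields the string `"name"=>"Oslo", "type"=>NULL`, which could be passed as the `filter` parameter of the query above. In practice a database driver's own HStore adapter would normally take care of this conversion; the helper is only meant to make the key-value representation visible.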
This talk will present our success in combining geometry and key-value stores in PostgreSQL by using PostGIS and HStore, which led to a neatly structured geospatial data collection with excellent performance for extractions, both in materialized views and in real-time extractions and lightweight analyses used in production decision-making.