Provider's Guide to COG

Introduction

Cloud Optimized GeoTIFF (COG) is a new way to produce data, enabling less duplication and innovative cloud-based workflows. If you provide imagery and like what you see in Cloud Optimized GeoTIFF, this guide explains how to publish your data as COG. For additional background, see the Why COG? page.

COG Advantages for Imagery Providers

The main advantage of providing your data as Cloud Optimized GeoTIFFs is that it enables more cloud-centric workflows. Data on the cloud should not be copied many times; it should sit in one location where everyone can access it. COG makes this much more likely, because a number of common geospatial workflows on the cloud can make use of it.

Using COG should also save money. Data download is one of the larger distribution costs for geospatial data, because cloud-hosting providers charge egress fees for any data leaving their network. COG-aware clients can download just the portion of the data they need instead of whole files, so they will often consume less bandwidth. Furthermore, many users will be inclined to run their processing in the same cloud location as your data, since that gives the fastest access. Most cloud providers don't charge for bandwidth when data moves within the same zone, so you can save additional money there, especially if you encourage your users to run their processes and services in the same zone as your data.

Enabling Cloud Optimized GeoTIFFs

In all likelihood, GeoTIFF is already a data format that you make available to your users. Thankfully, you don't need to adopt an entirely new format; you just need to start writing your existing GeoTIFF data in a COG-compliant manner.

There are two main things to do to take advantage of Cloud Optimized GeoTIFF:

First, format your data according to the COG best practices. Most GeoTIFF data is created with GDAL, so you will likely only need to tweak the parameters of your file creation. There are full details on the GDAL wiki, but the quick version of how to format your data is:
```
gdal_translate in.tif out.tif -co TILED=YES -co COPY_SRC_OVERVIEWS=YES -co COMPRESS=DEFLATE
```
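This structures the file so that any COG-aware software can read it. Note that COPY_SRC_OVERVIEWS=YES only copies overviews that already exist in in.tif; if the source file has none, build them first, for example with gdaladdo.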
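
To make the effect of TILED=YES a little more concrete, here is a minimal Python sketch (not part of GDAL) that reads the first image file directory (IFD) of a classic TIFF and reports whether the TileWidth and TileLength tags are present. The function names are hypothetical, BigTIFF is not handled, and the check says nothing about overviews or byte layout, so treat it as a quick sanity check rather than a COG validator.

```python
import struct

TILE_WIDTH_TAG = 322   # TIFF tag "TileWidth"; only written for tiled images
TILE_LENGTH_TAG = 323  # TIFF tag "TileLength"


def first_ifd_tags(f):
    """Return the set of tag IDs in the first IFD of a classic TIFF.

    `f` is any seekable binary file object (hypothetical helper, stdlib only).
    """
    header = f.read(8)
    if len(header) < 8:
        raise ValueError("file too short to be a TIFF")
    if header[:2] == b"II":
        endian = "<"  # little-endian byte order
    elif header[:2] == b"MM":
        endian = ">"  # big-endian byte order
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", header[2:8])
    if magic != 42:
        raise ValueError("only classic TIFF (magic 42) is handled in this sketch")
    # The IFD starts with a 2-byte entry count, followed by 12-byte entries.
    f.seek(ifd_offset)
    (entry_count,) = struct.unpack(endian + "H", f.read(2))
    entries = f.read(12 * entry_count)
    # The tag ID is the first 2 bytes of each 12-byte entry.
    return {
        struct.unpack(endian + "H", entries[i:i + 2])[0]
        for i in range(0, len(entries) - 11, 12)
    }


def is_tiled(f):
    """True if the first IFD declares both tile dimensions."""
    return {TILE_WIDTH_TAG, TILE_LENGTH_TAG} <= first_ifd_tags(f)
```

With a file on disk, this could be called as `with open("out.tif", "rb") as f: print(is_tiled(f))`.
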
The other thing to do is to make your data available on a web server that supports HTTP range requests. By far the easiest way to do this is to distribute the data on Amazon S3, Google Cloud Storage, or Azure. Most modern web servers also work out of the box, so chances are this part will 'just work'. If you have trouble with it, get in touch. You can put almost any authorization protocol you want on top of it; see the GDAL VSI docs for details. We hope to provide a nicer guide to the options.
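
One way to confirm that a server honors range requests is to ask for the first few kilobytes of a file and check for a 206 Partial Content reply. The sketch below uses only the Python standard library; the function names, the 16 KiB default, and the example URL are assumptions for illustration, not part of any COG tooling.

```python
import urllib.request


def range_header(offset, length):
    """Build a Range header value for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive at both ends, hence the -1.
    return f"bytes={offset}-{offset + length - 1}"


def honors_range(status, content_range):
    """A range was honored if the reply is 206 and carries Content-Range."""
    return status == 206 and bool(content_range)


def check_url(url, length=16384):
    """Request the first `length` bytes of `url` and report range support."""
    request = urllib.request.Request(
        url, headers={"Range": range_header(0, length)}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return honors_range(
            response.status, response.headers.get("Content-Range")
        )


# Hypothetical usage (replace with the URL of one of your own files):
# print(check_url("https://example.com/imagery/out.tif"))
```

A server that ignores the Range header typically answers 200 and sends the whole file, in which case `check_url` returns False.
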