Microsoft Azure offers different replication modes for Azure Storage. Every mode approximately doubles the costs for a TB of data. During a great workshop with Patrick Heyde we talked about stamp copies and I asked myself which mode I really needed.
First I took one step back, to identify all requirements I typically have in my projects for a highly scalable, fault safe, redundant storage :
- When a hard-drive in a storage server brakes, my data still needs to be usable.
- When Microsoft has huge power outage on a whole datacenter, I want to bring my app up and running again in another datacenter.
- When I (or my customers) are removing data by accident, there needs to be an option to revert to a former snapshot.
So a review of the different replication modes compared to these requirements leads me to the following results:
When a hard-drive in a storage server brakes, my data still needs to be usable:
The Local Redundant Storage fulfils this requirement perfectly. Microsoft writes 3 different copies of every bit within one single data center. When a hard drive on a stamp or the whole stamp goes down, another one can take over and all data is available without any interruption.
When Microsoft has huge power outage on a whole datacenter, I want to bring my app up and running again in another datacenter:
Microsoft offers a geographical redundant storage mode that stores another 3 copies in another datacenter a hundred miles away. This helps a lot, because every application can use the secondary location to access to the data – but is it worth the price? The price for GRS is three times higher then for LRS.
An automated replication between two different LRS storages, hosted in a Azure WebJob might be a good solution the fulfill the requirement as well.
When I (or my customers) are removing data by accident, there needs to be an option to revert to a former snapshot:
Globally Redundant Storage is not helpful when it comes to removing data by accident. As soon as the data is removed from the primary storage, the system removes the data in the backup location as well, often within seconds.
But this requirement can also be fulfilled with a replication between two different LRS storages, as already described above. The whole application needs to be designed for this use cases.
So this review brings me to the conclusion that in my personal opinion GRS storage is not needed in most of the use cases. Normally several LRS storages and an application logic optimised on the specific data security requirements works well and preserves the budget.
What’s your opinion? Do you have use cases where GRS and Read-GRS are hard requirements? If you like, leave a short comment …