Why are there no out-of-the-box high-availability solutions for SQL Server in the cloud?

When selecting any cloud service, you would like to get some guarantees about the reliability (uptime) and durability (what is the chance that your data is lost) of the service. Now, let’s see what we can find on cloud relational database services, either as a PaaS (Platform as a Service), or installed as part of an IaaS (Infrastructure as a Service, i.e. virtual machines) configuration.

  • Windows Azure SQL Database gives an uptime SLA of 99.9%. The database is replicated treefold (within the same datacenter), but Microsoft gives no SLA for durability of the data. It indicates that to secure your data against widespread datacenter failure, you will need to export your complete database yourself (as a SQL script, or a BACPAC file).
  • Amazon RDS (Relation Database Service)  has no SLA at all for uptime or durability. In the past, RDS has turned out to be the most unreliable of Amazon’s services, because it is build on top of the EC2 services (virtual machines) and the EBS service (virtual drives). RDS uses SQL Server logfiles to be able to restore a complete server instance (not separate databases) to a maximum time of 5 minutes ago. It also maintains daily snapshots of the instance, which can be retained for a maximum of 35 days. RDS also offers live replication and failover to another datacenter (Multi-AZ), but this is not available for SQL Server.
  • SQL server running on virtual machines of Amazon (EC2) or Microsoft (Azure Virtual Machines) have no SLA. There is only an SLA for the underlying virtual machines from both Amazon and Microsoft for 99,95% uptime.

So, from this I have to conclude there’s really a lot of uncertainty about the reliability and durability of any database service in the cloud. It also means that for SaaS applications where you can not afford to a have a few hours (or more) downtime you have to look into separate high-availability solutions. What is available in this area, specifically for SQL Server?

  • Windows Azure SQL Database has no built-in high-availability solution. You can copy a database to a new database in the same datacenter, but to copy it to another datacenter you have to export/import all data to a new database. Failover to a database copy will have to be managed manually.
  • Amazon RDS has a Multi-Availability-Zone option (Multi-AZ), where a database replica is maintained in another datacenter in the same region. This option, however, is only available for Oracle and MySQL, not for SQL Server. For SQL server your only option is to make instance snapshots, but you can not copy RDS snapshots to another datacenter.
  • On Amazon or Microsoft Azure virtual machines, you can implement any of the standard high-availability solutions of SQL Server (SQL Server 2012 AlwaysOn, database mirroring, log shipping). But keep in mind the following:
    • SQL Server 2012 AlwaysOn, and asynchronous database mirroring,  are only available in the Enterprise Edition of SQL Server. You will have to buy your own license for this, which is extremely price ($7000 per core).

    All of these high-availability solutions that you implement yourself on top of cloud virtual machines require a lot of expertise, ongoing management, and more than double the cost of the solution.

Conclusion: In general, for SaaS suppliers durability of data is much more important than uptime. In most cases 99,9% uptime would be sufficient (less than 1 hour downtime each month), but data durability has to be “Nine Nines” (99,9999999%), because any permanent data loss is unacceptable. Currently, none of the cloud relational database suppliers can offer this reliabiliy “out of the box” for SQL Server. Amazon RDS’s Multi-AZ solution comes closest (at almost double the price), but it is not available for SQL Server. Suggested manual solutions (based on export/import of data) are in many cases completely unrealistic for large amounts of data, because it could take days to complete an import of terabytes of data.

With all this,  are cloud relational database services currently an option for business-critical applications? I think they are, but the situation with regards to high-availability needs to improve rapidly, so that high-availability is simply an option to check. In the mean time, especially when you need to keep costs in check, we will need to rely on old-fashioned methods like log-shipping and data-export to safeguard our own data.

One response

  1. Excellent blog post, thanks for sharing.

    But why are there no out-of-the-box high-availability solutions for SQL Server in the cloud. Does SQL Server operating at Cloud scale actually need a new file system?

    Accounting software a sweet spot for SQL Server needs to do collections for totals and double-entry transactions accountants crave, but can it do so for web accounting at cloud scale?

    The Dynamo White Paper was based on high-availability lessons from Amazon eCommerce — sporing the NoSQL movement, an architectural back to key value stores and a journey forward to specialist databases for high-availability solutions without SQL. http://www.read.seas.harvard.edu/~kohler/class/cs239-w08/decandia07dynamo.pdf

    The NoSQL movement — CouchDB, MongoDB, DynomoDB, Cassandra DB — gathered such momentum SQL Server is starting toward a Hadoop partnership with Hornton Works. A shared file system (notice Hbase on slide 12) http://www.slideshare.net/MichaelRys/microsofts-hadoop-story

    Architecturally the closest accounting software had come to disruptive innovation is back-fitting CRM. But extending row-based data tables and still using SQL server did not work out well [the GBKUMT case study].

    Focusing on CRM in the Cloud, Force.com architect Parker Harris chooses another approach based on dot-com lessons. Flickr adopted MySQL dropping SQL in order to shard the tables over many machines with external indexes but for pictures. Early Facebook adopted the similar for its site (not accounting use cases).

    Force.com needed to support collections and transactions (closer to accounting). Parker innovated an Oracle database-on-database, the pattern allows sharding with SQL transactions — very broadly — one database uses external reverse indexes, the second Oracle database performs traditional SQL. Architected more than 10 years ago, there’s news on street a project to use Apache HBase is underway (a tie in with @SQLServerMike’s slide 12?)

    SAP has chosen another way to scale it’s accounting centric applications, switching to a columnar database design in HANA, with a 300 percent improvement in performance over row based SQL — but then SAP is not multitennancy cloud scale. SAP seemingly prefers a new file system.

    Perhaps SQL Server needs architecting like Force.com until Microsoft innovates a new file system capable of high-availability at Cloud scale?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

SQL Server Blog

Thoughts about SaaS software architecture

Brent Ozar Unlimited®

Thoughts about SaaS software architecture

The Visual Studio Blog

Thoughts about SaaS software architecture

Microsoft Azure Blog

Thoughts about SaaS software architecture

AWS News Blog

Thoughts about SaaS software architecture

%d bloggers like this: