Azure SQL Hyperscale Tier – Next Milestone

A couple of months ago, we wrote an introductory article about the Hyperscale tier of Azure SQL Database. We have learned more about it since then and are excited to share what we found. As a quick recap, its architecture is easy to understand from the following diagram published by Microsoft.

Azure SQL Hyperscale Database Architecture, credit: Microsoft

A Hyperscale database is created with a starting size of 10 GB and then grows by 10 GB every 10 minutes until it reaches 40 GB, so the default size of a Hyperscale database is 40 GB. Each of these 10 GB chunks is created on a separate page server to provide more I/O parallelism.

Each data file grows in 10 GB increments. As the data grows, data files and their associated page servers are added, i.e. the database size grows automatically as you insert more data.
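
The growth arithmetic described above can be written down as a small Python sketch. It is only a simplified model of the figures quoted in this article (10 GB start, 10 GB every 10 minutes, 40 GB default, one page server per 10 GB chunk). The function names are hypothetical, and the real allocation is managed by the service and may differ.

```python
import math

# Figures taken from the article; the model itself is an illustrative assumption.
CHUNK_GB = 10        # growth increment
INITIAL_GB = 10      # size at creation
DEFAULT_GB = 40      # size reached by the automatic initial growth
STEP_MINUTES = 10    # interval between initial growth steps


def initial_size_gb(minutes_since_creation):
    """Allocated size during the initial growth phase (10 GB up to 40 GB)."""
    steps = int(minutes_since_creation // STEP_MINUTES)
    return min(INITIAL_GB + steps * CHUNK_GB, DEFAULT_GB)


def chunks_needed(data_gb):
    """Number of 10 GB chunks needed to hold data_gb.

    Never fewer than the chunks of the default 40 GB allocation.
    In the article's simplified picture, each chunk maps to one page server.
    """
    chunks = math.ceil(data_gb / CHUNK_GB)
    return max(chunks, DEFAULT_GB // CHUNK_GB)
```

For example, under this model `initial_size_gb(25)` gives 30 and `chunks_needed(95)` gives 10.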

Similar to other Azure SQL Database PaaS offerings, TempDB is sized proportionally to the compute size.

It supports PITR, i.e. point-in-time recovery. The RPO (Recovery Point Objective) is 0 minutes because the log is retained as it is, meaning there are no traditional backups; instead, regular snapshots of the data files are taken. The RTO (Recovery Time Objective) is generally 60 minutes regardless of the size of the database. However, if there was intensive write activity before the restore point, the restore may take longer.

Because there are no traditional database backups and the storage subsystem is separated from compute in the architecture, the snapshot backups of storage have no impact on the performance of the database.

Scaling typically takes 2 minutes regardless of the database size and can drop existing connections. Adding a secondary replica, however, causes no connection drop.
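
Since scaling operations other than adding a secondary replica can drop connections, client code usually benefits from a retry around its database calls. The following is a minimal Python sketch of retry with exponential backoff. `TransientConnectionError` and `run_with_retry` are hypothetical names used for illustration, not part of any Azure SDK or database driver; in practice you would catch whatever exception your driver raises for a dropped connection.

```python
import time


class TransientConnectionError(Exception):
    """Hypothetical stand-in for a driver's 'connection dropped' error."""


def run_with_retry(operation, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call operation(); on a transient error, wait and try again.

    The delay doubles after each failed attempt (1s, 2s, 4s, ...).
    The last failure is re-raised so the caller can handle it.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except TransientConnectionError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is injectable only so the helper can be exercised without real waiting; the retry count and delays are arbitrary defaults, not recommendations from Microsoft.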

High availability is driven by two factors, compute resiliency and storage resiliency, which follows naturally from the separation of compute and storage in the architecture. When a compute replica goes down, a new compute replica is created automatically.

PolyBase is not supported, which is no surprise because Azure SQL Database does not support it either and Hyperscale is just an extension of Azure SQL Database. R and Python are also not supported.

Geo-replication is not available, but geo-restore is supported.

Because storage is a separate component, it is important to look at data latency, which is typically in the tens of milliseconds but has no upper limit.
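
Because the latency has no upper bound, it is worth looking at the tail of the distribution rather than only the average. Below is a minimal Python sketch for doing so. It times an arbitrary callable, which in practice would wrap a query against your database, and reports the median, the 95th percentile (nearest-rank method) and the maximum. The function names are hypothetical and no particular driver is assumed.

```python
import statistics
import time


def measure_latency_ms(operation, samples=50, clock=time.perf_counter):
    """Run operation() repeatedly and return each duration in milliseconds."""
    durations = []
    for _ in range(samples):
        start = clock()
        operation()
        durations.append((clock() - start) * 1000.0)
    return durations


def summarize(durations_ms):
    """Return median, 95th percentile (nearest rank) and maximum."""
    if not durations_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(durations_ms)
    # Nearest-rank percentile: rank = ceil(0.95 * n), using integer arithmetic.
    rank = -(-95 * len(ordered) // 100)
    return {
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],
    }
```

A large gap between the median and the maximum is the kind of signal the "no upper limit" caveat points at.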