A Practical Guide to Active Geo Replication

Active Geo Replication is a database-level feature specific to Azure SQL Databases. It creates a secondary replica of your primary database which can be used –

  1. To offload the read operations from primary database
  2. As a failover in case of disaster recovery

Let’s first quickly look at some of the key points about this feature and then we will see a practical example of this along with real observations among the primary and secondary databases.

Key Points

  1. Secondary replica can be created in the same data center/region or another data center/region
  2. Though needless to mention but sometimes this is good to be explicit that you can not create the secondary database on same logical server as the database with same name would already be there
  3. As we mentioned above that this feature is specific to Azure SQL Database so don’t get mistaken by PaaS implementation. Active Geo-Replication is NOT supported by Azure SQL Managed Instance
  4. Active Geo-Replication is available for all database tiers EXCEPT Hyperscale
  5. Cost-wise, secondary databases are priced at 100% of the primary databases. The cost of replication traffic between the primary and secondary database is included in the cost of secondary database itself
  6. Up to four secondary replicas can be created, be it in the same region or in another region
  7. You can create secondary replicas of secondary databases i.e. a nested chain of replicas to bypass this limit but there would be a trade-off of lag on the leaf most secondary databases
  8. Internally, this is designed as a Business Continuity solution and leverages Always On Availability Group technology of database engine to replicate committed transactions on primary database to secondary database in an asynchronous way using snapshot isolation
  9. It replicates the changes to secondary database by streaming transaction log. So, don’t relate this to traditional replication method which replicates data/changes by executing DML changes through INSERT, UPDATE or DELETE commands
  10. It’s not just data which is replicated to secondary, other objects which are scoped at database level like stored procedures, views, contained users etc. are also replicated to secondary databases
  11. Failover to secondary database is done manually either by the application or by a user
  12. When failover is activated and if there were multiple secondary databases then all other secondary databases are automatically linked to this new primary
  13. At the end, would like to stress on the fact that this is a database-level feature so anything which was configured at server level on primary database like server-level firewall rules, changes to backup retention period etc. are not replicated to secondary database and hence must be configured separately on secondary database. However, any contained users being scoped at database level are automatically replicated so can start accessing the database without any additional configuration except that they need to refer to secondary server to connect to secondary database

Practical Observations

As soon as you select the secondary region/server and pricing tier for secondary database, the replication gets started and you see it as “Initializing” under the status as shown below

Then you will witness the actual replication progress as “Seeding” –

Once the replication is done which largely depends on the size of your primary database and the activities on it, secondary database becomes available immediately and the status starts showing as “Readable“. But don’t forget to add firewall rules on secondary logical server if you were using server level firewall rules for your primary database before attempting to connect to it.

Over the secondary database’s Geo-Replication menu on the portal, you will start seeing the options for manual failover and to stop the replication too if you wish to –

Another observation – if you had auditing enabled on primary database server then again being enabled at server level so is not cascaded to secondary server and must be enabled separately over secondary replica –

There is one more observation though it’s a database-level feature that Query Store is disabled on secondary database. Under the Automatic Tuning menu over the portal, you will find a notification as shown below which says that Query Store is disabled because the size has reached its limit.

Even if you try to purge the query store, you will get the following message –

After looking at the additional information in above dialog box you will understand why it’s disabled and why it’s not allowing you enable it. Simply because the secondary database is the read-only copy and you can not make any ALTERation to the database here. If you need to make a change at database level it needs to come from primary through replication which makes sense.

You can see this in object explorer too where Query Store is marked as Read-Only

Just before concluding this post, let’s touch upon another point whereby we mentioned that the replication between primary and secondary is asynchronous so at any time you can monitor this by querying the DMV as mentioned in the below query –

SELECT * FROM sys.dm_geo_replication_link_status

You can run this query on both primary and secondary databases. But running this on primary database will give you the statistics about the lag in the replication and the exact time when last replication/last commit was done.

There are several things that you observe when you actually do it from your own hands so get your hands dirty with it and have a wonderful learning experience if haven’t explored this so far.