One of the cool features of Mirroring and Always On Availability Groups is Automatic Page Repair. With Mirroring or Availability Groups, when SQL Server encounters certain types of corruption it will look to the Mirror/Replicas to see if they have a clean copy of the page. If they do, it will repair the page in the Principal/Primary and make your life easy. Obviously any occurrence of corruption requires further investigation, but this is a handy feature where SQL Server can bail you out of certain scenarios. In this post I will step through an example of what happens when you corrupt a page in a normal database and then what happens when you corrupt the same page in a database that is part of an Availability Group.

For this example I have two SQL Server 2012 servers (V-SQL1, V-SQL2) that live on a Windows Server Failover Cluster. They are both part of an Availability Group (AG1) that contains the BaseballData database. These replicas are in synchronous commit mode. V-SQL1 also has a copy of the BaseballData database called BaseballData2.

AutoPageRep_Instance

We will start by corrupting a page in BaseballData2 to see how SQL Server normally behaves when it encounters corruption, and then we will do the same thing with the AG database (BaseballData). Let’s begin by picking a random page to corrupt.

AutoPageRep_DBCCIND

AutoPageRep_DBCCPAGE

Here we have picked page 7071 slot 33 of file 1 which contains data about the 1942 Yankees.

AutoPageRep_Row

Next we are going to use a hex editor to corrupt the page we previously identified by writing some zeroes to the file. We will first need to take the database offline.

AutoPageRep_HexEdit

Now that it is corrupt, lets bring the database back online and try to retrieve some information about the 1942 Yankees.

AutoPageRep_Error1

As you can see, we encountered an invalid checksum. In this situation, our best bet is to go to a backup to get that data back. Let’s take a look at msdb..suspect_pages and sys.dm_hadr_auto_page_repair.

AutoPageRep_SusPages1

We can see our corrupt page in suspect_pages and we see nothing in dm_hadr_auto_page_repair. As you can probably guess, we will see something in dm_hadr_auto_page_repair later in this example. Let’s go ahead and step through the same process on the BaseballData database. To do this, I will stop the V-SQL1 instance (causing failover) and edit the data file the same way as we did for BaseballData2. I will then start the instance back up.

AutoPageRep_Shutdown

AutoPageRep_HexEdit

AutoPageRep_Startup

Now that our instance is running again, let’s fail back to V-SQL1 and then try to run our select statement.

AutoPageRep_Error2

It threw the same error, great example right? Let’s try again and see what happens.

AutoPageRep_Row

Success! It looks like Automatic Page Repair had our back. We can verify this by querying suspect_pages and dm_hadr_auto_page_repair.

AutoPageRep_SusPages2

Now we see our page listed in dm_hadr_auto_page_repair, because SQL Server repaired that page with Automatic Page Repair. Running CHECKDB returns no errors as our corruption has successfully been repaired.

Image 1017