You get a call about a database server that went down due to a power outage and now the instance won’t start up, and, for some reason, there are no backups. This outlines the steps required to attempt and recover the master database of an instance that won’t start due to the transaction log file being corrupted.
Symptoms
Usually, an instance that ends up with its master database’s transaction log getting corrupted will fail to start and record one of the following Application errors in the Windows Event Log:
“Event 9003, MSSQL$[Instance]
The log scan number (219:286:0) passed to log scan in database ‘master’ is not valid. This error may indicate data corruption or that the log file (.ldf) does not match the data file (.mdf). If this error occurred during replication, re-create the publication. Otherwise, restore from backup if the problem results in a failure during startup.”
“Event 9004, MSSQL$[Instance]
An error occurred while processing the log for database ‘master’. If possible, restore form backup. If a backup is not available, it might be necessary to rebuild the log.”
“Event 5173, MSSQL$[Instance]
One or more files do not match the primary file of the database. If you are attempting to attach a database, retry the operation with the correct files. If this is an existing database, the file may be corrupted and should be restored from a backup.”
You can also encounter the following warning message whenever you try to start the SQL Server service:
“You have until SQL Server (MSSQL$[Instance]) to logoff. If you have not logged off at this time your session will be disconnected, and any open files or devices you have open may lose data.”
This will also record a System error in the Windows Event Log under Event ID 7024.
Warning
The next steps make use of an undocumented option for CREATE DATABASE, messing around with the master database’s mdf file, and a lot of luck.
Ideally you’d want to restore the master database from a recent backup, this is just a last resort solution.
Fixing the corrupted transaction log
In order to attempt to fix this you’ll need another SQL Server instance running the same version (including service pack and/or CU level) as the affected instance. In my case, both the affected instance and the one used for recovery are running SQL Server build 15.0.2095.3.
Copy the data file to another server
Since the transaction log is the issue, I’ll just grab the .mdf file and copy it on another server to the directory where the user databases’ data files reside.
Renaming the file isn’t mandatory (unless your actual master database resides in the same directory), I just did it to avoid confusion.
Note that when attaching database files originating from other servers there might be some permission issues on said files, so I’ll first make sure that this instance’s service account has full permissions on the data file.
Failed attach attempt
Just out of curiosity, I first attempt to attach the database using the standard option to rebuild the transaction log.
1 2 3 4 5 6 | USE [master] GO CREATE DATABASE [_master] ON (FILENAME = N'E:\VSQL\Data\_master.mdf') FOR ATTACH_REBUILD_LOG GO |
The full error message:
“File activation failure. The physical file name “E:\SysFiles\MSSQL15.SQL19\MSSQL\DATA\mastlog.ldf” may be incorrect.
The log cannot be rebuilt because there were open transactions/users when the database was shutdown, no checkpoint occurred to the database, or the database was read-only. This error could occur if the transaction log file was manually deleted or lost due to a hardware or environment failure.
Msg 1813, Level 16, State 2, Line 8
Could not open new database ‘_master’. CREATE DATABASE is aborted.”
Introducing ATTACH_FORCE_REBUILD_LOG
The undocumented ATTACH_FORCE_REBUILD_LOG does what it says, it forces a transaction log rebuild regardless of what inconsistencies SQL Server might detect between the transaction log file and the data file.
1 2 3 4 5 6 | USE [master] GO CREATE DATABASE [_master] ON (FILENAME = N'E:\VSQL\Data\_master.mdf') FOR ATTACH_FORCE_REBUILD_LOG GO |
Even if the output still contains the file activation failure message, this is just because the “E:\SysFiles\MSSQL15.SQL19\MSSQL\DATA\” directory structure does not exist on this server. SQL Server did create the new transaction log file in this instance’s default t-log directory.
Checking the state of the database and the files to make sure everything looks ok.
Updating file location metadata
Since the master.mdf file stores some metadata pertaining to the location of the mastlog.ldf file, I’ll need to do a couple of changes so that that data matches the source.
This is mainly for consistency’s sake, SQL Server is usually able to find the mastlog.ldf file once back on the source instance.
First I’ll have to recreate the directory structure to match the one on the source VM “E:\SysFiles\MSSQL15.SQL19\MSSQL\DATA”.
At this point I have to update the location of the transaction log file in the catalog.
1 2 3 | ALTER DATABASE [_master] MODIFY FILE (NAME = mastlog, FILENAME = 'E:\SysFiles\MSSQL15.SQL19\MSSQL\DATA\mastlog.ldf') GO |
I stop the instance, move and rename the file, and grant inheritable full control to the service account on the E:\SysFiles directory so that SQL Server will be able to access it when I start the instance up again.
1 | icacls "E:\SysFiles" /grant "NT Service\MSSQL$VSQL19":(OI)(CI)F /T |
I then start the instance back up and make sure that the transaction log file is accessible in its new location.
Move the files back and start the instance
In order to move the mdf and newly generated ldf files back to the source instance, I first have to detach the _master database.
1 2 | EXEC master.dbo.sp_detach_db @dbname = N'_master' GO |
Rename the old files on the source server and copy the _master.mdf and the newly generated mastlog.mdf file to their original location on the source server.
I rename _master.mdf back to master.mdf and check permissions to make sure that the instance’s service account is able to access the files after they’ve been moved back from the recovery server.
I start the instance, connect to it and run a DBCC CHECKDB against the recovered master database.
If everything’s ok, proceed with a DBCC CHECKDB against all the other databases, and use this opportunity to set up backups for this instance.
Important
If DBCC CHECKDB against the master database fails, I strongly recommend you bail on the instance, build a new one and attach the user database files to that one. And hopefully set up a decent backup strategy for the new instance.