In this post I cover a script I’ve put together for measuring storage write speeds in SQL Server, namely against database data files.
This is meant to help get an idea of how the underlying storage performs when SQL Server is writing 1GB of data to a database.
At this point, you might be asking yourself: “Why not use CrystalDiskMark instead?”.
The answer is simple: you might not always be able to install/run additional software in an environment, even more so if you work with external customers or you’re a consultant. It’s a lot simpler to ask a customer to run a script and send you the output than it is to ask them to install and run some 3rd party software.
The script
Since the TestWriteSpeeds.sql script is too big to include here, I’ve added it in my SQL Server Scripts GitHub repository.
In short, the script does the following:
- Loads a table variable with 1GB of data.
- Creates a table named io_stats_writes to store IO stats related data.
- Creates the speed_test table in which the previously generated data will be written.
- Does 4 passes of inserting the 1GB worth of records using the TABLOCK hint.
- Captures info from the sys.dm_io_virtual_file_stats DMV into the io_stats_writes table for every pass of the insert into the speed_test table (a simplified sketch of this follows the list).
- Outputs information about the duration of the inserts, the IO write stall duration, and whether and by how much the data and/or transaction log file grew during the test.
- Drops the 2 tables created during its execution.
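To give a rough idea of what a single pass looks like under the hood, here’s a simplified sketch. It is not the actual script; the @data_1gb table variable, the #pre_stats temp table, and the column list are stand-ins for what the real code uses.

-- Snapshot the write-related file stats before the insert
SELECT [file_id], [num_of_writes], [num_of_bytes_written], [io_stall_write_ms]
INTO #pre_stats
FROM sys.dm_io_virtual_file_stats(DB_ID(), NULL);

-- Write the pre-generated 1GB of rows; TABLOCK allows minimal logging on a heap
-- under the simple or bulk-logged recovery models
INSERT INTO [speed_test] WITH (TABLOCK) ([payload])
SELECT [payload] FROM @data_1gb;

-- Snapshot again and store the per-file deltas for this pass
INSERT INTO [io_stats_writes] ([file_id], [num_of_writes], [num_of_bytes_written], [io_stall_write_ms])
SELECT [post].[file_id],
       [post].[num_of_writes] - [pre].[num_of_writes],
       [post].[num_of_bytes_written] - [pre].[num_of_bytes_written],
       [post].[io_stall_write_ms] - [pre].[io_stall_write_ms]
FROM sys.dm_io_virtual_file_stats(DB_ID(), NULL) AS [post]
JOIN #pre_stats AS [pre] ON [pre].[file_id] = [post].[file_id];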
Result sets explained
The first result set shows the following:
- The name of the database.
- Total number of passes.
- The file’s physical name.
- The file type.
- The average amount of MBs written to each file (data and transaction log).
- The average duration in milliseconds of a pass.
- The average time in milliseconds that SQL Server had to wait for the IO request (the IO stall time aka latency).
- Average writes per file.
- Average amount of MB by which the data and/or transaction log file increased in size during the test.
The second result set is fairly similar to the first one, but this one shows the totals for all 4 passes instead of averages.
The third result set shows the non-aggregated information for each of the 4 passes.
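Conceptually, the first two result sets are just AVG and SUM aggregations over the per-pass rows captured in io_stats_writes, along the lines of the sketch below (the column names are simplified stand-ins for what the script actually stores):

-- First result set: averages per file across the passes
SELECT [file_type],
       AVG([mb_written]) AS [avg_mb_written],
       AVG([duration_ms]) AS [avg_duration_ms],
       AVG([io_stall_write_ms]) AS [avg_io_stall_write_ms]
FROM [io_stats_writes]
GROUP BY [file_type];

-- Second result set: totals per file across the passes
SELECT [file_type],
       SUM([mb_written]) AS [total_mb_written],
       SUM([duration_ms]) AS [total_duration_ms],
       SUM([io_stall_write_ms]) AS [total_io_stall_write_ms]
FROM [io_stats_writes]
GROUP BY [file_type];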
Limitations
While the script uses a table variable to store the initial data, that data still gets written to tempdb. Although, in my tests, the table variable approach was faster than using a temp table, the read speed of the storage hosting tempdb can become a bottleneck if it’s slower than the write speed of the storage hosting the data files.
Since the insert duration is calculated as post-insert timestamp minus pre-insert timestamp, both the data and transaction log files have the same duration per pass. The only way to tell if your transaction log file is the bottleneck, rather than the data file, is by looking at the amount of data written to it, whether and by how much the file grew during a pass, and the IO stall time.
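In other words, the per-pass duration is wall-clock time measured around the insert, roughly like this (variable names are placeholders, not necessarily what the script uses):

DECLARE @pre_insert DATETIME2(3) = SYSDATETIME();
-- the INSERT INTO [speed_test] WITH (TABLOCK) ... from the earlier sketch runs here
DECLARE @post_insert DATETIME2(3) = SYSDATETIME();

-- One duration per pass, shared by the data file and transaction log file rows
SELECT DATEDIFF(MILLISECOND, @pre_insert, @post_insert) AS [duration_ms];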
Note
While working on another post, I ran into a situation where parallelism-related waits add to the write times, but only on databases using the FULL recovery model.
If you’re seeing suspiciously long times on databases that use the FULL recovery model, uncomment line 136 (the OPTION(MAXDOP 1) hint) and re-run the script.
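For reference, the hint on that line is the standard OPTION(MAXDOP 1) query hint; applied to a simplified version of the test insert it would look something like this:

INSERT INTO [speed_test] WITH (TABLOCK) ([payload])
SELECT [payload] FROM @data_1gb
OPTION (MAXDOP 1); -- force a serial plan so parallelism waits don't inflate the write timings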
How to run it
Ideally, you run this when the database isn’t being used, or you create a dedicated database on the same storage and with the same configuration that you want to test.
Test runs
All three of my test runs are done against databases created using variations of the following T-SQL.
CREATE DATABASE [SlowDB]
 ON PRIMARY
( NAME = N'SlowDB', FILENAME = N'E:\VSQL\Data\SlowDB.mdf', SIZE = 5120000KB, FILEGROWTH = 1048576KB )
 LOG ON
( NAME = N'SlowDB_log', FILENAME = N'E:\VSQL\TLog\SlowDB_log.ldf', SIZE = 819200KB, FILEGROWTH = 307200KB )
GO
ALTER DATABASE [SlowDB] SET RECOVERY SIMPLE
GO
So, the data file is 5000MB in size, with 1024MB growth increments. The transaction log file is 800MB with 300MB growth increments.
And the database is set to use the simple recovery model.
The only differences between them are the database and file names, and the locations of the files.
Slow storage
SlowDB, as the name implies, resides on the slowest drive in my machine, a SATA 3 Samsung 860 Evo SSD.
I’m reusing the earlier screenshot here because it shows the results of the speed test against this specific database.
The main info to keep from this is that the average time it takes for 1GB to be written to the data file is 2 seconds, with an average IO write stall time of 443 milliseconds. The total time required to write the 4GB is 8.21 seconds, of which 1.77 seconds is IO stall time.
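For a rough throughput figure, roughly 4,096MB written in 8.21 seconds works out to about 500 MB/s, which is in the ballpark of what a SATA 3 SSD can sustain for sequential writes.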
Since the database is set to use the Simple recovery model, there isn’t much data written to the transaction log: an average of 12.57MB per pass.
And none of the files go through autogrowth events since they’re both properly sized to accommodate the test.
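If you want to sanity-check the sizing of your own database before a run, sys.database_files has what you need; keep in mind it reports size and growth in 8KB pages unless growth is a percentage:

SELECT [name],
       [type_desc],
       [size] * 8 / 1024 AS [size_MB],
       CASE WHEN [is_percent_growth] = 1 THEN CONCAT([growth], ' %')
            ELSE CONCAT([growth] * 8 / 1024, ' MB')
       END AS [growth_increment]
FROM sys.database_files;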
Fast storage
Up next is FastDB, which resides on a Samsung 980 Pro SSD connected to a PCIe 3.0 M.2 slot.
The write speed improvement brought on by the faster storage is noticeable, with an average write time of 433 milliseconds per 1GB and an average IO stall time of 7 milliseconds.
Faster storage
Now for FasterDB, which also resides on a Samsung 980 Pro SSD, but this one is connected to a PCIe 4.0 M.2 slot.
On this drive the average write time went down to 390 milliseconds for 1GB, and the IO write stall for the data file is a consistent 1 millisecond on each pass.
Conclusion
This script should be a decent option when you’re limited on what you can use to benchmark storage write speeds.
For testing storage read speeds, check this blog post, which covers how to do that by backing up to the NUL device.
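The short version of that technique: a COPY_ONLY backup to the NUL device reads the data files at full speed while discarding the backup output, so the reported backup throughput reflects read speed. The linked post covers the details and caveats, but the core of it is something like:

BACKUP DATABASE [SlowDB] TO DISK = N'NUL'
WITH COPY_ONLY; -- COPY_ONLY so the test backup doesn't reset the differential base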
8 comments
Good work!
To minimize the influence of the transaction log, delayed durability is an option.
Thanks, Sab!
I plan to look into delayed durability in an upcoming post where I’ll cover how write speeds are impacted by file size and auto-growth increments, the full recovery model, and other configuration options.
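For anyone curious in the meantime, delayed durability is a database-level setting along these lines:

ALTER DATABASE [SlowDB] SET DELAYED_DURABILITY = FORCED; -- options are DISABLED (default), ALLOWED, FORCED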
Hi Vlad,
Thank you for your script. Any idea what might be acceptable on a virtual server? I tried it on some of our VMWare servers and the results are much slower than your tests on slow storage:
avg_duration_ms : 7836 (horrible) ; avg_io_stall_write_ms : 46/5 (ok)
Is this acceptable on virtual servers or should I check with the virtualisation team?
Hi Patamons,
7836 is a bit on the high side. Is your database using the full recovery model? Since the speed is bound to the slowest file, transaction log write speeds affect the overall results.
If that’s the case then either try uncommenting line 136 or creating a separate database that uses the simple recovery model.
If not, then I’d recommend running sp_Blitz and checking if it returns the “Slow Storage Writes on Drive …” finding. If you get the same finding there, then it might be time to talk to your system/storage admins.
Hi Vlad, thanks for making this available. It’s been helpful for trying to isolate a performance issue with a test instance of SQL Server I am running under Linux (or more accurately a Docker instance running under MacOS X). I am curious if you’ve tried SQL Server out on Linux in a Docker container and, if so, what kind of performance you got?
So far I am taking a hit on writes through Docker (up to 50%) and then another 50% through SQL Server. Plus if my database is fragmented, another 50-80% (so SQL write speeds to my lowly 800-1000MB/sec disk SSD came in as low as 30 MB/sec and at best 200MB/sec).
P.S. One suggested tweak to the code is to match the case of the column names as defined in the original DDL on line 160, for those who configured their database/server default with a case-sensitive collation. So [sw].[post_size_on_disk_MB] = [pi].[post_size_on_disk_MB]. Also, it’s easy enough for the user to compute the MB/second from all the information you include in the output, but it would be nice to add such a summary to the output in addition to all the timings and breakdowns.
Hi Ben,
Good catch on the case difference! I’ve fixed that now.
Unfortunately, I haven’t played around with SQL Server on Docker at all, so I can’t really offer any input from that perspective.
From what I’ve heard other data professionals say, SQL Server on Docker performance isn’t necessarily the best, but with no firsthand experience I can’t vouch for this.
I’ll add the MB/second info in the coming days since I was also planning on making a few more tweaks to the script.
Thanks Vlad and look forward to it!
On the MSSQL/Docker/MacOS X performance issues I should share that I’ve since learned that my test environment is not quite optimal. I assumed MS’s container for SQL Server was configured according to their specifications but have since learned that Docker containers aren’t completely insulated from the semantics of the Docker host and Docker doesn’t meet those requirements under MacOS X. Among other things MS recommends using XFS under Linux and not to use the Overlay filesystem. Unfortunately Docker Desktop for Mac is designed to use the Overlay filesystem while XFS doesn’t appear to be available. I am sure there’s a way to do this better on the Mac but that’s going to require learning a lot more about Docker and containers than I had planned…
In the meantime I am now surprised my environment works as well as it does. I’ve had no data loss and table scan queries are hitting 900MB/sec, which is pretty good given the limits of this hardware…