Boosting INSERT Speed by Generating Scalable Keys

Throughout history, similar ideas tend to surface at about the same time. Last week, at SQLBits 9, I did some “on stage” tuning of the Paul Randal INSERT challenge.

It turns out that at almost the same time, a lab run was being done that demonstrated, on a real world workload, a technique similar to the one I ended up using. You can find it at this excellent blog: Rick’s SQL Server Blog.

Now, to remind you of the Paul Randal challenge, it consists of doing as many INSERT statements as possible into a table of this format (the test does 160M inserts total)

,c3 CHAR (111) DEFAULT ‘a’

Last week, I was able to achieve  750K rows/sec (runtime: 213 seconds) on a SuperMicro, AMD 48 Core machine with 4 Fusion-io cards with this test fully tuned. I used 48 data files for best throughput, the subject of a future blog.

Now, I suspect Paul used NEWID() for key generation, to create as much I/O as possible with page splits. If the goal is purely to optimize throughput, why not ask: Is NEWID really the best way to generate a key?

Traditional SQL Server knowledge dictates that either IDENTITY(1,1) or NEWSEQUENTIALID are better key choices: they lay out the keys sequentially, at the end of the index. But one should not easily trust such recommendations without validating them. I did the above test on the 48 core box Fusion-io lend me, and here are the results:



This might shock you! The “Traditional, sequential layout” of the key is more than 100 times SLOWER than completely random inserts!

The CPU are NOT at 100% during this (they were when using NEWGUID). The problem shows itself when we dig into sys.dm_os_wait_stats. (by the way, below is my new: “this is how I think about it” diagram, let me know if you like this way of doing things and I will continue to use this format for illustrating troubleshooting mind processes)


We are waiting for a PAGELATCH on the last pages of the index. This wait dominates the run. The price of page splits is minor (when you have a fast I/O system) compared to the price of coordinating the last page memory access!

As Rick writes in his blog, there are ways to work around this problem by being smart about your key generation. Basically, you want a key that is NOT sequential. Rick flips the bits around in a Denali SEQUENCE objects – a very neat idea that uses a new SQL Server Denali feature.

For my test at SQLBits (which ran on 2008R2), I used a variant of the same idea: I took the @@SPID of the users session and added an increasing value (increments controlled by the app server) to each insert made. On reflecting, using the SPID is not a good idea, you should probably generate the key something like this:


[Unique App Server ID] * 10**9 + [Unique Number Generated at App Server]

I also tried the hash trick I have described here – to compare the techniques.

Below are the results of my test of Paul’s INSERT Challenge with different key strategies (I have taken out the sequential ones, since we have seen how poorly they perform)


The last column is a trick I will describe in more detail in a later blog. Suffice to say that I reached 1.1M INSERT/sec (runtime: 150 sec) using a key generation strategy very similar to Rick’s.


  • The UNIT in the introduction was originally wrong. it should of course be 750K not 750M as can be seen later in the post.

For Myth busting, see the site map:DW and Big Data