Throughout the years, I have become convinced that the default settings used in SQL Server are often wrong for systems that need scale or are properly managed. There is a series of settings I find myself consistently changing when I audit or install a new server. They are “my defaults”
I thought I would share those here. Bear in mind that these are setting that assume a certain level of rational behaviour in the organisation you find yourself in. If you are working in a bank, they may not apply to you.
Here is what I always do
|Grant Lock Pages in Memory||Paging the buffer pool is bad.|
|Set Max Memory||So you don’t take too many of above|
|Grant Perform Volume Maintenance tasks to the SQL Server account||Allow instant file initialisation. See: http://technet.microsoft.com/en-us/library/ms175935(v=sql.105).aspx|
|tempdb #files = #cores||I have shown that this is the most scalable configuration|
|model in simple mode and autogrowth off||Prevents new users from allocating a database that he has not consciously made good backup choices on|
|Enabled Async Stats in Model||I have yet to find a case where async stats are not better than the default|
|Named Pipes off, TCP/IP on||I never found a real use case for named pipes. Turning them off is one less surface area for malicious attackers to exploit|
|max worker threads||On a modern 2 socket, I generally set this to 8192. Typically, you can run more threads than the default configuration allows. Threads that are un-used only burn a bit of memory, which is cheap.|
|-T1118||Mixed extends are a joke. Off with them!|
|-T1117||If you are managing you data files right, you need them to grow at the same rate. Failure to do so means you are doing something wrong – and I don’t configure for people who do things wrong|
|-T4199||Always enable the latest optimiser fixes. There is enough bad stuff happening with the optimiser all the time (nature of the beast). Being on the latest version is much better than being stuck in the past|
|-T3226||Logging successful backups to the error log is a royal pain. That log is reserved for error conditions and other things you need to carefully manage. Logging of backup status belongs in the agent or in the backup system you run.|
|-T835||Enable Locked Pages on Standard Edition|
For a documentation of many of these trace flags, see this: http://technet.microsoft.com/en-us/library/ms188396.aspx
Database / Table Configuration
In addition to server configuration, there are a series of settings I automatically change on databases and tables
|Async stats||As described above. In my experience, always the superior choice. You need a good reason to have it off, not the other way around.|
|New filegroup: DATA, set it to default||I never put objects in the PRIMARY filegroup. PRIMARY contains the metadata, and needs extra protection (I normally put it on the TLOG drives). The additional filegroups allow me to freely reallocate new tables and move things around between disk|
|sp_autostats OFF on large tables||In nearly every case I have seen on large tables, auto stats are harmful. They always kick in at the most inconvenient times. And when they kick in, the cause the one thing that is worse than bad performance and data loss: Unpredictability.
I leave auto create stats enabled as this happens rarely enough to be useful and not disruptive.
|Table lock escalation OFF on large tables||Same problem as auto stats, when lock escalation kicks in it nearly always spells trouble. If you are running out of locks, you are doing something wrong that needs to be addressed. Lock escalation is not the solution – it typically makes the problem worse or hides the underlying issue.
See background details on the trade offs: http://technet.microsoft.com/en-us/library/ms184286(v=sql.105).aspx
|ALLOW_PAGE_LOCKS = OFF, ALLOW_ROW_LOCKS = ON for small, and sometimes large, tables||The lock manager often becomes the bottleneck at scale. For small tables, getting rid of page locks help. For large tables, make sure you can live with the the impact on table scans before turning it off|
|#files = #cores||Yes, even for user databases!|
On Important Queries and Hinting
I would like to wrap this post up with another place where I deviate from the standard configuration and “best practices” in the SQL Server world.
If I find a stored procedure or query that either
A) Gets run often… or
B) Is very important and needs to be consistent (“important” is nearly always the same as consistent)
… I simply don’t trust the optimiser with these types of queries. Mission critical queries are much too important to leave in the hands of an unpredictable process like that.
Once I have seen a good plan for the query (and validated that I can’t make a better one myself) I force the query using hints. I don’t wait for it to go wrong, I do it when I see the query the first time (even if the plan is good). This has already saved our production systems from trouble and I am teaching this technique to my colleagues.
However, I do restrict myself to these hints:
|OPTION (<join type> JOIN)||Pretty safe, even under changing index conditions. In OLTP system, if OPTION (LOOP JOIN) cant give you a good plan, you are likely missing an index (and it will be obvious which one when you see the forced plan)|
|LEFT/INNER <join type> JOIN||Allows me to force ordering and really control execution to harvest filters in the right order.|
|FORCESCAN||The optimiser often gets this wrong for reporting queries. I find that it is a little too fond of seeking when a scan is better. Scans are often “safer” in the sense that you know they are proportional with the size of the table scanning. Seeks gone wrong (especially if they cause table spools) can really wreck havoc.|
|FORCESEEK||While this hint can be risky (because it assumes the existence of an index) – there are rare cases where the optimiser needs a bit of manhandling. This is a hint of last resort if the join strategies don’t give me enough|
|CROSS APPLY||Yes, I use this as a hint. If I know that a result will return a specific number of rows (and the optimiser does not) I can do a CROSS APPLY to SELECT TOP 1. A good example of this is BETWEEN joins on dates|
|OPTIMIZE FOR UNKNOWN||My favourite hint. Most of the times, I don’t want a plan to assuming anything about the boundary values of stats in another table (my fondness for GUID helps here as they are more likely to be nicely spread out). This hint is great for stability under changing conditions|
|MAXDOP||Very few queries (except those tuned by master Machanic) takes advantage of more than 8-16 cores.|
|OPTION (HASH GROUP)||Sometimes, the optimiser will trigger a massive sort before going into an aggregate operation. If I know that the aggregate will have a small result (often the case) I force the hashing strategy instead to avoid the excessive memory usage of the sort.|
Hints I avoid:
|Index hints||I don’t want queries to break if I changes indexes. In my entire career, I think I have found only 1-2 cases where I needed an index hint and forcing of join orders wasn’t good enough. Index hints make queries very brittle – which I suspect is why query hints in general have such a bad rap.|
|FORCE ORDER||If I force order, I also force the join strategy (as above). So I just haven’t found a need for this hint.|
|OPTIMISE FOR (@var = x)||Making something dependent on the contents of a column is dangerous. Data changes faster than code.
However, Adam Machanic has some really cool ways to force some interesting side effects by using this hint together with a TOP (@var) (See his blog: http://sqlblog.com/blogs/adam_machanic/)