Archive

Archive for the ‘SQL Server’ Category

An Interesting Blog to Follow

2011-09-01 2 comments

I know some of you track Conor Cunningham’s blog, which I highly recommend. The information on query optimizers is rather sparse out there – and it is a real privilege when people who know what they are talking about share their information freely on a blog. There are just not that many of them.

Here is something to think about in that context: Even if you are only interested in SQL Server, there are good things to be learned from studying other database engines – they are after all very similar. One of the good blogs for that purpose is the Oracle optimizer blog: http://blogs.oracle.com/optimizer/. It is delightfully free from marketing, and packed with good stuff. Don’t worry about the fact that they like textual query plans over there in Oracle. You might be used to viewing plans graphically, but I am sure you agree that such minor difference in display preferences can be set aside in the name of reading good material.

My Speaking Schedule the rest of 2011

2011-08-23 2 comments

For those of you who would like to hear me throw my usual tantrums at conferences about data modeling, high scale tuning and optimal use of hardware. Here is my speaking schedule for the rest of 2011.

  • SQLUG Sweden (14th September, Gothenburg) – Just around the corner from my home. I will be presenting about Data Warehouse tuning there.
  • SQLBits – Query across the Mersey (29th September to 1st October, Liverpool) – large SQL Server conference in Europe. I really like going there: it is local, casual atmosphere, focused on tech level 400 and the place to have great discussions over beers at the conference. Last BITS I experimented with a different presentation style – this time I will go back to the old and proven style that has served me well for BITS before. I got access to some crazy hardware, so I may reveal some nasty tricks you can play with big machines.
  • PASS Summit 2011 (11-14 October, Seattle) – This event, because if its location in Seattle, makes it easy for you to meet the developers of SQL Server. There will be tons of sessions. Go catch your favourite developer on the hallways after their session.
  • PASS SQLRally Nordic (8-9 November, Aronsborg, Near Stockholm) – The big Nordic PASS conference. This is a new initiative hosted by the Nordic MVP and SQL community. I look  much forward to this event.

I am unfortunately locked down fully for speaking arrangements, but hopefully, I can get much more speaking done in 2012.

Categories: SQL Server, Travel Tags:

New blog about Analysis Services

2011-08-22 1 comment

This week, I welcome friend, coffee connoisseur, large cube runner and coder Pete Adshead to the blogosphere.

Pete is blogging at: http://peteadshead.wordpress.com/.  His first blog is about managing SSAS through AMO – and the “interesting” coding pattern you have to adopt.

Have fun reading

Categories: Analysis Services, SQL Server Tags: ,

Analysis Services Performance Guide in Final Draft

2011-08-14 Leave a comment

To the Analysis Services crowd out there: Last week I finished the last section of the Analysis Services 2008R2 Performance Guide. Furthermore, I reviewed my co-author Denny Lee’s sections. Denny is currently polishing off a few minor details and we expect to send the draft to our editor early the coming week. I would like the take the opportunity to give you some background on what happens to a whitepaper before you can read it on MSDN.

But before I describe the Microsoft process, I would like to acknowledge the big contributions made by the reviewers. We have had reviews from MVP, SSAS Maestros, partners, independent consultants and customers. Thank you everyone, without you this document would not have been as detailed and accurate. Your critical reviews helped us preserve clarity.

The Microsoft publishing process has a lot of checks and balances in place – which all aim to increase quality. The Analysis Services team has already reviewed the paper. Akshai Mirchandani, Ashvini Sharma and Edward Melomed are particular thorough reviewers and contributors, and there are always great arguments on our internal mailing lists about the exact way to state a point. Sometimes, we even go to the source code to validate a specific behavior and description.

The next thing to happen to the Performance Guide is that our editor (Beth Ingram) will go over the 100+ page document. She will check for both spelling errors, semantic clarity and style. For example, I aim to write the real English, as spoken by the Queen, Denny writes US English from the breakout colony – Beth will make it all US and correct my spelling errors and figures of speech (which I of course will claim are all correct).

Beth will also check for things that may offend or come across as politically incorrect – as you can infer from my blog, she has saved me more than once. Apart from the Forbidden Word List™, Microsoft also has style guides that restrict the language constructs you can use. This creates consistency in publications and also aids auto translators like Google Translate. The style guides also increase the likelihood that HTML and PDF rendering of the document is successful and neat. A lot of responsibility rests on the shoulder of our editor to validate conformity with the style guide  – and the primary purpose is to improve your reading experience.

When Beth has gone over the draft, we will typically have one or two iterations where the authors (Denny and yours sincerely) make sure that the edits have not change the intended semantics of our sentences, or where we discover that Beth’s edit was what we ACTUALLY meant. We will have the occasional stylistic argument – within the constraints of the Microsoft publishing guidelines.

If Beth does her job well, which she consistently does, your will think that both Denny and I have paid a great deal of attention to these details – and we end up with all the credit. This is therefore my my chance to give credit where it is due: to Beth. And now you know why there is often a delay in large publications like these: It is so Beth can work her magic and make the document look really well polished.

At the end of all the editorial process, a web portal (in Sharepoint) request is made for MSDN to publish the article. Two persons (not the authors) have to sign off that it is OK to push the material to the web. Once those signatures (electronic of course) are in place – the CAT team is free to announce it and you will start seeing the news on Twitter, Facebook, our blogs and other social media.

Related Blogs:

The Ascending Column Problem in Fact Tables –Part two: Stat Job!

2011-07-07 2 comments

In my last post I described a common problem with statistics in data warehouses. I also directed you to a trace flag that partially solves the problem. There is an issue with this trace flag: at the time of writing this blog, it does not work on a partitioned tables. Considering that large tables tend to be partitioned, this could be an issue.

Recall from my last post that old statistics will not contain information about the newly inserted rows until stats are updated. Queries asking for data that is outside the bounds of the histogram will estimate 1 row and typically end up with a bad, serial plan. The trace flag works, or doesn’t, by trying to intelligently guess how many rows are such an “out of bound” range instead of assuming the value of 1.

There is another approach that does not rely on the optimizer making guesses: You can intentionally deceive the optimizer, lie about the distribution. Recall that we want the “desired DW plan”. Alas, as with all telling of lies, this exposes you to a lot of risk.

Disclaimer: What I am about to show you is a: “Don’t do this at home” trick. We are going to see some features in SQL Server used in a ways they are not intended for. And we are going to take control of the optimizer in ways it is not designed to handle. If you go down this path, you are on your own! Don’t call support with your issues. Just because I showed it to you, doesn’t mean you can put this in production.

Consider this example an exploration and education of the internals of SQL Server. And think of in the context of what you might consider asking for, next time you file an MSConnect item.

On the issue of MSConnect: If you care about big (any TB sized) DW workloads: please, please file and vote on items that make your working day hard. It helps if you describe which company you work for and what the impact is on your business of not having the feature or “fix”, I know this is not always possible. I know it takes times, and I appreciate you are busy, but Microsoft does listen, and they listen more closely when you can make it clear why this feature is needed and how it hurts you (I call this the “blood on the table” approach). If your favourite Connect item is about DW functionality, shoot me a mail (include the ConnectID please) to let me know. I may well put up a big fight for items that align with my mission in MS. Please keep in mind: I don’t care about client tools at all, I fight for server side features. We have a lot of people pushing on client side features and they are very important too, I have just chosen to specialize.

And with that said, let us dig in:

Recall that our issue was that the histogram lacked rows/buckets containing information about a range of data. The histogram in SQL server has up to 200 rows, but even if we could add a single row to the histogram on every insert, we would soon run out on a large table.

But how about this idea: Fake a histogram were every year we ever want to store (say 100 of them) contains a lot of data. In fact, make sure that every date range or value we select contain a tremendous amount of rows. One way to create such a histogram is to grow the fact table very large and then capture the stats and make it smaller again. If your fact table is not born with the right proportions, growing fat to get the right histogram proportions and then slimming down is not a very nice approach is it?

Our Fact table histogram is not pretty as it is, but how about doing some precision surgery instead? For that, we need an implant. First, we create a table with the same index structure as Fact:

CREATE TABLE HackStats
(
  date_sk INT NOT NULL
  , payload CHAR(4000) NOT NULL DEFAULT ‘X’
)

CREATE CLUSTERED INDEX scan_me ON HackStats(date_sk)

Now, let us prepare the implant. We need something that has these characteristics:

  1. Enough histogram divider rows to cover all possible data values we could ask for.
  2. A very large value for RANGE_ROWS and AVG_RANGE_ROWS between in each histogram row.
  3. A value of 1 for the DISTINCT_RANGE_ROWS in each histogram row.
    Ad 1) Make sure that we do the implant once, and that I lasts for the lifetime Fact.
    Ad 2) Make sure that there is sufficient size and elasticity for any range of dates we might touch.

More about 3) later…

With a little reverse engineering (no, I did not look at the source), we can create a histogram that match our requirement. It does not take a lot of rows in HackStats. Here we go:

SET @y = 1950
WHILE @y < 2050 BEGIN
  /* make the stats dense (low distinct) within the year */
 SET @r = 0
 WHILE @r < 100 BEGIN
/* Make sure the equal rows in the histogram are never hit using invalid
    dates  */
    INSERT HackStats (date_sk) VALUES (@y * 10000)
    INSERT HackStats (date_sk) VALUES (@y * 10000 + 5000)
    SET @r = @r + 1
  END
  SET @y = @y + 1
END

You may have been wondering why I added a payload column to HackStats. For small tables (by page size), SQL Server will do a full sample, even if we ask for a small percent. And the engine is smart enough to know that is has seen all pages. A fully sampled table cannot be hacked and made artificially pretty – in ways you will soon observe. The HackStats table has to be small, but large enough (in our case 20K pages) that all rows will not be sampled. And that is why we had to to pad the rows a bit.

We can now hack like this:

 /* prepare the implant */

UPDATE STATISTICS HackStats WITH ROWCOUNT = 199000000
UPDATE STATISTICS HackStats WITH SAMPLE 99 PERCENT

The first statement above “fakes” a large table by telling SQL Server that there are a LOT of rows in the table. The second statement samples the distribution of those rows (which we have cunningly crafted to have just the right dimensions, so we need to sample almost all of them).

Let us check the quality of the implant:

DBCC SHOW_STATISTICS (HackStats, scan_me)
WITH STAT_HEADER, HISTOGRAM

Output (your results may vary a little, depending on which rows got sampled, rerun the sample if you don’t like the output)

image

And:

image

Notice the neat distribution. Any query asking for a year range will get a high estimate of the outcome of scanning the table. SQL Server is smart enough to boost the row estimates based on how many total rows it believes are in the table.

And now, for the dirty little trick (Brent Ozar, if you are reading this, just don’t say it Smile ):

Let us transplant the faked, big stats to the ugly Fact table. First, take the implant:

DBCC SHOW_STATISTICS (HackStats, scan_me)
WITH STATS_STREAM

This gives an output somewhat like this (your result will vary):

  0x01000000010000000000000000000000B1AB0052000000….

We don’t really have to worry what is in there, it is some binary representation of the stats. The feature to read the binary stats was put into SQL to allow database clones – and we are abusing it now. But hey, we are in the business of precision surgery: Copy the bit values to the clipboard, past and execute:

UPDATE STATISTICS Fact scan_me WITH
STATS_STREAM = 0x01000000010000000000000000000000B1AB0052000000….

And a finishing flourish, to make it fact table look really big:

/* Make the table expensive to access in the future */

UPDATE STATISTICS Fact scan_me WITH PAGECOUNT = 200000000

Validate the result:

/* Check the patient */

DBCC SHOW_STATISTICS (Fact, scan_me)
WITH HISTOGRAM

(output similar to HackStats)

And let us check our problem query again:

DBCC FREEPROCCACHE
SELECT C.shoe_size, SUM(Measure)
FROM Fact F
JOIN DimCustomer C ON F.customer_sk = C.customer_sk
WHERE F.date_sk = 20010111
  AND C.shoe_size = 43
GROUP BY C.shoe_size

Eureka: the “Desired DW Plan”:

image

 

Notice something interesting: the Estimated Operator Cost of access is about 736. That is a REALLY expensive access to the table. Just the way we like it actually, wouldn’t want the optimizer to get any ideas that this table is cheap to access. Our little trick with hacking the page size gave our Fact patient quite a bit of confidence.

I deferred talking about why we need a low number of distinct rows in the histogram. For now, I leave this as an exercise to the reader. Hint: it is related to the bitmap filters and the way Fact is accessed.

And remember folks, you do this at your own risk, you are on your own and in unsupported land here… Don’t come crying to support if you break something. Ask for a connect item instead.

The Ascending Key Problem in Fact Tables– Part one: Pain!

2011-07-01 3 comments

Time for another deep dive into SQL Server. In the last couple of days, I have been having coffee with Joe Chang in the afternoon, who some of you may know. He is visiting Sweden a lot by the way, so shoot him an email if you are interested in having him present. Joe and I ended up talking about an interesting problem in data warehousing: the ascending key problem. I think it is best illustrated by an example.

Let us say you have a pure star schema (sorry, Inmon people), with a large fact table. If you are following the guidance in the Top 10 Best Practices from SQLCAT (bullets 1,2 and 9), you typically partition the fact table by date, and you  cluster on date too. On of the reason you do this, is that it helps you do fast range scans.

Let us try an example star schema that follows this guidance:

CREATE PARTITION FUNCTION pf_daily(INT) AS RANGE RIGHT FOR VALUES
(20000101, 20000102
, 20000103, 20000104
, 20000105, 20000106
, 20000107, 20000108
, 20000109, 20000110
, 20000111, 20000112
, 20000113, 20000114
)
CREATE PARTITION SCHEME ps_daily AS PARTITION pf_daily ALL TO ([PRIMARY])

CREATE TABLE Fact
(
  date_sk INT NOT NULL
  , customer_sk INT NOT NULL
  , payload CHAR(80) NOT NULL DEFAULT REPLICATE(‘X’, 80)
  , measure MONEY NULL
)
/* Turn off stats (as per typical DW) */
EXEC sp_autostats ‘Fact’, ‘OFF’
CREATE CLUSTERED INDEX scan_me ON Fact(date_sk) ON ps_daily(date_sk)

CREATE TABLE DimCustomer
(
  customer_sk INT NOT NULL
  , shoe_size INT NOT NULL
)
CREATE UNIQUE CLUSTERED INDEX CIX ON DimCustomer(customer_sk)

 

If you have a very large fact table, you often don’t want to run stats in the middle of a query, so we have turned auto stats off. In this context, recall that stats in SQL Server are per table, not per partition – which can be a pretty expensive operation. We really want to control when stats are run for large tables.

To generate some test data, let us fill up 10 partitions with 50MB each. For the customer dimension, we will use 100K rows:

/* Generate test data */

DECLARE @i INT
DECLARE @customer_size INT
SET @customer_size = 100000
DECLARE @day_size_pages INT 
DECLARE @day_size_rows INT
SET @day_size_pages = 50001 / 8
/* last sum are row width plus uniquefier */

SET @day_size_rows = @day_size_pages * 8060 / (4 + 4 + 80 + 8) 

 

/* Create one day of data */
CREATE TABLE OneDay
(
  date_sk INT NOT NULL
  , customer_sk INT NOT NULL
  , payload CHAR(80) NOT NULL DEFAULT REPLICATE(‘X’, 80)
  , measure MONEY NULL
)
SET @i = 0
WHILE @i < @day_size_rows BEGIN
  INSERT OneDay (date_sk, customer_sk, measure)
 
VALUES (’20000101′, @customer_size * RAND(), RAND() * 1000)
  SET @i = @i + 1
END

/* create the customer dimension */
SET @i = 0
WHILE @i < @customer_size BEGIN
  INSERT DimCustomer (customer_sk, shoe_size)
  VALUES (@i, 36 + @i % 15)
  SET @i = @i + 1
END

/* Insert ten days of data in the fact */
SET @i = 0
WHILE @i < 10 BEGIN
  INSERT Fact (date_sk, customer_sk, measure)
  SELECT 20010101 + @i, customer_sk, measure
  FROM OneDay
  SET @i = @i + 1
END

Good, now we can get started on some testing. Let us first execute a typical, star schema query. But before that, we will update the statistics. To be nice to the query, we will even do a full scan.

UPDATE STATISTICS Fact WITH FULLSCAN
DBCC FREEPROCCACHE 

/* An ordinary star-schema query arrives */
SELECT C.shoe_size, SUM(Measure)
FROM Fact F
JOIN DimCustomer C ON F.customer_sk = C.customer_sk
WHERE F.date_sk = 20010101
  AND C.shoe_size = 43
GROUP BY C.shoe_size

On my 2-core laptop and a hot buffer pool, this query runs in a little over 100ms, touching over 500K rows. Life is good. The query plan is what I like to refer to as “the desired DW plan”:

image

All the neat stuff in the SQL Server optimizer that we like for DW workloads:

  • Hash Join to dimension tables
  • Bitmap index pushdown to the fact table (BOL link)
  • The right join order (build hash on dimension, use fact table to probe)
  • Parallelism
  • Cluster index seek into nice, fat range of facts

But warehouses are not always nice to us. During the night, some evil ETL developer comes along and loads a new day of data:

/* Nightly ETL run adds another day of data */
INSERT Fact (date_sk, customer_sk)
SELECT 20010111, customer_sk FROM OneDay

Alas, the ETL guy did not have time to run statistics – maybe he forgot. Now, let us run the same DW query again, but this time with the newly loaded data:

DBCC FREEPROCCACHE
SELECT C.shoe_size, SUM(Measure)
FROM Fact F
JOIN DimCustomer C ON F.customer_sk = C.customer_sk
WHERE F.date_sk =
20010111
  AND C.shoe_size = 43
GROUP BY C.shoe_size

over 2 seconds on my 2 core laptop, more than 20 times slower! This is the time when people typically complain that SQL server is broken and doesn’t scale. Instead, we adopt a less moaning stance and start a little bit of troubleshooting. As always, we begin with a look at the query plan:

image

Well, apart from the plan fitting the width of my blog, there is nothing good to say about it. What just happened?

This is the part where we up the geek bar just a little: SQL Server gives you the ability to dig into the statistics like this:

DBCC SHOW_STATISTICS (Fact, scan_me)
WITH HISTOGRAM

Output:

image

Oh no, the value 20010111 is not even in the histogram. The optimizer estimates zero rows coming out of the fact table. Zero rows are very cheap to access, so the overhead cost of all those nice, parallel and hash join optimizations are not deemed worth it. Of course, the problem here is that the histogram is lying, there are 500K rows in that date range! Loop joins are expensive in that case.

That hurt, what now?… Well, I guess you just have to remember to update your statistics won’t you? And for those of you that can see a problems with the that (and some will) – start the commenting, go vote on Connect, and of course: stay tuned…

… And by the way, one last last thing, until we meet again in the next post: Have a look at this: Ascending Keys and Auto Quick Corrected Statistics.

Have a great weekend everyone, and consider updating your stats if the warehouse loads data on Sunday…

Latch and Spinlock Papers Published on Microsoft

2011-07-01 Leave a comment

I am happy to announce that my team mates, Ewan Fairweather and Mike Ruthruff have published two excellent whitepapers on latch and spinlock diagnosis. You can find them here:

Pay special attention to the spinlock scripts, you will find an interesting trace flag in there that sheds more light on my intentionally vague dodging of the call stack subject in my blog entry here. I am sorry that I did not provide more details in the blog, but I did not want to give the plot away Smile

I am very proud to have Ewan take over the OLTP tuning on the SQLCAT  team here in EMEA. Buy this guy a drink next time you meet him at a conference and have a chat with him about scalability.

DL980 Configuration Guidance published

2011-06-10 Leave a comment

The Asian CAT team, together with Jimmy May (The Aspiring geek himself), the MS PFE SAP team and HP have published a very interesting blog about DL980 configuration:

If you plan to run on that machine, I highly recommend you read up on the great information in the blog.

Whenever you play around with an HP server, I generally recommend you download and skim through the QuickSpec. It gives you the details of how the machine is laid out – the PCI card speeds and placement come in quite handy when you configure the machine with FusionIO.

Another good resource for large machine tuning is to take a look at the full disclosure reports on TPC-E runs

Categories: OLTP, SQL Server Tags: , , ,

That Analysis Services 2008R2 Operations Guide is online

2011-06-01 9 comments

It is my pleasure to announce that the Operations Guide for SQL Server Analysis Services 2008R2 (and also 2005 and 2008) is now available on MSDN. Written by Denny Lee, John Sirmon (our new SSAS CAT member) and yours sincerely.

The guide describes how to configure, test and operate Analysis Services installations in a production environment. It is more than 100 pages of good information with contributions from a long list of MVP, SSAS specialists and the product group

Here it is: The Analysis Services 2008R2 Operations Guide

It was a pleasure working with you all to get this out there.

Diagnosing and fixing SOS_OBJECT_STORE spins for Singleton INSERTS

2011-05-30 2 comments

Following up on my previous post, my next target for “optimization”, while I am waiting for an even faster I/O system, is the SOS_OBJECT_STORE spin.

Recall that I am having high waits for WRITELOG, but still see 100% CPU, which indicates that spins may be our sinner. The big spin on the system is still LOGCACHE_ACCESS – but until we get hardware to work on that – we might as well stay greedy and learn a bit about SQL Server in the process. We just got rid of the OPT_IDX_STATS spin by running TF2330.

Unfortunately, the documentation available on our next spin: SOS_OBJECT_STORE is rather sparse. It is one of the SQLOS internal data structure used many places inside SQL Server. But there are ways, even for the public (which is why I can share it here), to get more information about what is going on. You can capture the call stacks of SQL Server when it does this spin and use publicly available symbols to lookup the function names inside the code.

One way to do this is to run an Xperf trace of the sqlservr.exe, another is with WinDbg. Anything that can collect and bucketize call stacks can help you. I will not get into more details here, but follow the links in this paragraph to learn more. I also have an article on SQLCAT that should help you get started on setting public symbol paths.

Suffice to say that I got hold of the sqlservr.pdb file (the publicly available symbols) and had a look at the call stacks that leads to SOS_OBJECT_STORE spins:

SpinlockBase::Sleep
LockManager::GetLockBlocks
lck_lockInternal
GetLock
PageContext::AcquireLock
IndexDataSetSession::AcquireLocksForInsert
IndexDataSetSession::InsertSmallRecord
IndexDataSetSession::InsertRowInternal
DatasetSession::InsertRow
RowsetNewSS::InsertRow

Aha! So this is related to the lock manager acquiring a lock on a page. Now, you may then ask: how can we influence this, surely we cannot control the way locks are acquired.

Well, as a matter of fact, we DO have a knob that gives us a tiny bit of influence. How about building the index like this:

CREATE CLUSTERED INDEX MyBigTable_cl
ON dbo.MyBigTable (c1)
WITH (ALLOW_PAGE_LOCKS = OFF)

That should get rid of one level of the lock hierarchy (ROW/PAGE/TABLE), restricting us to either table level locks or row locks. Since we are playing OLTP system here – who needs page locks anyway? Total leftover from old times Smile… (I am only half kidding here)

Interestingly, this was just what was needed, the SOS_OBJECT_STORE spin is now gone:

image

But throughput has not changed at all. This is not surprising, given the much larger amount of spins on LOGCACHE_ACCES. But we learned something new: Disabling PAGE level locks can save some CPU cycles by eliminating some of the code paths – we can speculate that this might lead to increased throughput once other bottlenecks are gone.

At this time, before I am moving to a faster transaction log, these are my waits:

image

Notice the high SOS_SCHEDULER_YIELD up there right after the WRITELOG? I have a feeling those spins are not yet done tormenting us….

Follow

Get every new post delivered to your Inbox.

Join 350 other followers