Archive

Posts Tagged ‘SQL’

Running Many Batch Statements in Parallel

2012-01-08 1 comment

Location: Somewhere over the Atlantic

imageWhen designing highly scalable architectures for modern machines, you will often need to do some form of manual parallelism control. Managing this is not always easy, but in this blog I will give you one piece of my toolbox to help you.

Let us walk through an example together, a tiny case study. This is a problem which many of you will be familiar with.

Let us say you have 16 files that you want to load into the same table in your database in an automated manner. The naïve approach will do something like this:

BULK INSERT MyTarget FROM ‘C:\temp\MyFile1′ WITH
  (FIELDTERMINATOR = ‘;’, ROWTERMINATOR = ‘\n’)

BULK INSERT MyTarget FROM ‘C:\temp\MyFile2′ WITH
  (FIELDTERMINATOR = ‘;’, ROWTERMINATOR = ‘\n’)

… etc…

BULK INSERT MyTarget FROM ‘C:\temp\MyFile16′ WITH
  (FIELDTERMINATOR = ‘;’, ROWTERMINATOR = ‘\n’)

Now here is the problem with this approach: it executes one statement at a time. Sequential execution is BAD, you need to stop thinking about the world like that if you want to scale on a modern architecture.

Lets assume we have enough hardware resources (in this case, it would take a blade server and a decent I/O system). What we really want is to run every one of these statements in parallel. Unfortunately, SQL server does not have a command to start up new connection from inside T-SQL… what to do?

Getting to the Command Line

Because you cannot execute more than one command on a single connection at a time, we will need multiple connections to SQL Server and this mean we have to go back to the command line. Let us start by creating a little batch file Worker.Cmd with this content:

REM Worker.Cmd File

CALL SQLCMD –S.\MyServer –q”BULK INSERT MyTarget FROM ‘C:\Temp\MyFile%1’ …EXIT

This allows us to invoke a bulk load for the first file by executing: Worker.Cmd 1

Unfortunately, we still cannot start multiple connections without manually firing up a lot of command prompts. The coders in the audience may at this point reach for their favorites programming language to write a little utility that can spawn multiple copies of an executable.

However, there is a problem with such a home made executable: you cannot generally rely on a server having the necessary runtime libraries. Typical comments might be:

 “No, we don’t have .NET 4.0 here, this is not yet certified by our infrastructure department. Could you recompile it for 1.1 please?”.

“Power Shell is much too fancy for us, what is wrong with running this on Windows 2000?”

Perhaps this customer is just skeptical about letting you run your executable on a server. This may sound silly, but I have seen this happen too many times to make assumptions.

Start to the Rescue

There is a very nice little utility for the good old command prompt that allows you to fire up new processes: START.EXE. This comes with all versions of Windows and it takes any command line executable as input, fires it up in a new thread and returns control back to the caller.

Using start.exe, we can write batch script that fire up multiple copies of the same executable. It looks like this:

REM SpawnMany.cmd

REM Author: Thomas Kejser
REM Purpose: Spawns many copies of the same executable. Useful for running many things in parallel

@ECHO OFF
ECHO Spawning %2 copies of %1

FOR /L %%i IN (1, 1, %2) DO (
    ECHO Spawning thread %%i
    START "Worker%%i" /Min %1 %%i %2
)

Each new process is started in a minimized window and we pass the thread number and the total number of threads to it. Using this little batch script, we can now do this:

  SpawnMany.exe Worker.Exe 16

This starts 16 workers, each with their own thread number assigned. Very useful for running stuff in parallel in a quick and dirty way. For example, I use this to run the TPC-H data generator dbgen.exe highly parallelized.

Notice that I added the EXIT command at the end of the worker.cmd batch. This makes sure that the window closes itself when done executing.

Summary

In this blog, I have shown you how to write a little batch script to fire up multiple threads, from the command line. each doing their own work in parallel. The script is “zero dependency” which makes it ideal for server use and for hacking together quick and dirty parallelism for test scenarios.

I mentioned that SQL server does not have a way to start up new connections from T-SQL. This is not strictly true. Sorry for leading you astray, but I wanted you to see how to do this from the command line first (and go through the pain Devil ). There is a way to hack SQL Server and implement a stored procedure I like to call sp_executesql_async. This will be the subject of a future blog, but since I am heading into a lab for a few weeks, you just have to wait for it.

Implementing MurmurHash and CRC for SQLCLR

2011-12-07 Leave a comment

As we saw in my previous post, the build in hash functions of SQL Server were either expensive with good distribution, or cheap, but with poor distribution. As a breath of fresh air, let us look at a useful magic quadrant:

 

image

We see that all of the hash functions exposed by HASHBYTES fall into low speed quadrants, caution is advised here. We also see that while CHECKSUM and BINARY_CHECKSUM are challengers, they do not have the spread to fully compete. We are left with modulo as the best possible hash function, but we believe that this algorithm may have trouble executing. The Modulo hash function has some problems (sorry, had to say “problem” instead of “challenge”, I couldn’t bear speaking management bollocks anymore):

  • It only works on integer types
  • If there is structure in the data (for example, if all values are equal) it will not spread the values equally
    The question we are faced with is:
    “Could we create a high speed, good spread, hash function that belongs in the leader quadrant and which does not have the limitation of the module functions”

It was this question I sought to answer when I started hacking away at SQLCLR functions. As you read the following, please don’t laugh too hard at me (a small snigger will suffice) – it has been a long time since I hacked away at code.

MurmurHash2 and Davy’s Bit Magic

Google has published a very interesting hash function, MurmurHash, claiming to have the properties I am looking for: It is both fast and has a good spread. The hash algorithm comes in different flavours: MurmurHash2 and MurmurHash3. Each flavour also exists in architecture and word size optimized versions. I decided to implement the x86/x64, 32-bit integer version in C#.

Fortunately, someone already beat me to it. Davy Landman has an implementation of MurmurHash2 – which he has kindly shared under GPL. I took his implementation and wrapped it in a SQLCLR user defined function.

While trying to understand Davy’s code and hacking it to work with SQLCLR, I had a few “aha moments” that I would like to share.

First of all, Davy uses this cute trick to turn 4 bytes from a BYTE[] array into a 32-bit INT:

UInt32 k = (UInt32)(data[currentIndex++] 
                  | data[currentIndex++] << 8
                  | data[currentIndex++] << 16
                  | data[currentIndex++] << 24
           
);

 

I thought that was a rather neat trick. To practice my illustration skills, here is what happens:

image

Please note that left shift (<<) and or (|) operators take INT32, not UINT32 as input. Hence the need to cast the final value to UINT32.

Speaking of integers, in my modification of Davy’s source code you will find:

unchecked
{
    return (SqlInt32)(Int32)h;
}

I need to get form the UINT which is returned by the reference implementation of MurmurHash to the signed integer (SqlInt32) that SQL Server understands. Doing the unchecked cast here is faster than Convert.ToInt32.

MurmurHash3

From the C++ source code of MurmurHash3, and using Davy’s bit trick, it took me only a few hours to get myself a nice implementation of MurmurHash3. I compared this with a C++ implementation done by my colleague Christian Martinez (based directly on the Google source), and we agreed on outputs that exercise all the branches in the code. So I am reasonably confident that my implementation is correct.

Except for the fact that it turns out to be important to use SqlBinary instead of SqlBytes (blogged here), there is not much more to say. Feel free to use the MurmurHash3 for your own implementations (my implementation is GPL)

The source code for the MurmurHash functions is too long to paste in this blog, so I created a new page on my site which you can find by following this link: C# Source code for MurmurHash in SQLCLR.

Before I show you the spread and speed results, let me just talk a little about two very old, but very commonly used, hash functions.

CRC16 and CRC32

The Cyclic Redundancy Check (CRC) family of hashes are, compared to MurmurHash, very simple to implement. They walk through each byte in the input and divide it with a specially selected polynomial. For implementation purposes, the polynomial is simply a constant in the code and the division is done with XOR and shifting (making this a very efficient operation). The remainder of the byte division is fed into the division on the next byte (giving the hash algorithm its name) and so on, until the end of the input .

WikiPedia contains a great reference implementation that I used for my tests. For readability, the source code is again available on a separate page of this blog: CRC16 and CRC32 in C#.

In the naïve approach to CRC, the polynomial division is implemented using an inner loop through all the bits of each byte:

rem = rem ^ data[i];

for (int j = 0; j < 8; j++)
{
  if ((rem & 0×00000001) == 0×00000001)
  {   // if rightmost (least significant) bit is set
     rem = (rem >> 1) ^ Polynomial32; /* Polynomial */
  }
  else
  {
    rem = (rem >> 1);
  }
}

An interesting optimization, shaving off a lot of cycles, is to create a table that allows you to look up the result of this loop for all 256 combinations of 8 bits. My source code contains such  tables, both for CRC32 and CRC16 implementations. Using these tables, I can replace the above code fragment with:

rem = (rem >> 8 ) ^ CRCTable32[(rem & 0xff) ^ data[i]];

…A major optimization

Speed Results

At this point, you may wonder if I did all of this to give you a chance to laugh at my rusty coding skills. What exactly was the point of implementing a new hash function in the first place?

Without further ado, let us look at the results compared to native SQL Server functions:

image

Isn’t that beautiful? If we don’t need the cryptographic properties of the HASHBYTES functions, we can beat SQL Servers hashes with our own SQLCLR implementation. Please note that I took out MD2 from the above to avoid skewing the results (I have previously shown that it is simply too inefficient).

Now, the bad news: the yellow line above. That represent a “hash function” that returns zero, like this:


public static SqlInt32 NoHash(SqlBinary data)
{      
  return (SqlInt32)0;
}

 

Notice that even such a simple user defined function that does nothing, consumes almost all the same number of CPU cycles as a full hash calculation. This is the overhead of the CLR and throwing data back and forth between managed and unmanaged code. So much for my optimizations in CRC.

Spread Results

We now know that there is at least hope of implementing a faster hash function than HASHBYTES. But how well do these functions spread the input. Running the chi-squared test from my previous blog entry we get These results:

image

Now THAT is pretty nice isn’t it? We see that good old CRC makes for an excellent hash function for our purpose. MurmurHash does very well too, comparable with SHA, but at under half the CPU cost.

Summary

In this blog, I have shown you how to implement hash functions that are both faster and have better spread than the built in hash functions exposed in SQL Server HASHBYTES. We have also seen that the overhead of SQLCLR is significant for this case, and hence, it would be preferable if non cryptographic hash function were exposed natively in SQL Server.

If you care about such new hash functions for SQL Server, I would suggest you file a Connect item to ask for them and state your purpose for wanting them. “It would be cool” or “Thomas says so” is not enough, provide a business justification. But at least, now you know what to ask for…

I owe Christian Martinez thanks for this blog for directing me to the MurmurHash functions, and for helping me validate the correctness of my C# implementation against his C++ version.

PS: As I was exploring this little implementation, I noticed that one of my heroes, Donald Knuth, has extended The Art of Computer Programming with a new volume: Volume 4, Fascicle 1: Bitwise Tricks & Techniques; Binary Decision Diagrams. That sounds like an interesting read and it is now on my birthday wish list.

The Ascending Column Problem in Fact Tables –Part two: Stat Job!

2011-07-07 2 comments

In my last post I described a common problem with statistics in data warehouses. I also directed you to a trace flag that partially solves the problem. There is an issue with this trace flag: at the time of writing this blog, it does not work on a partitioned tables. Considering that large tables tend to be partitioned, this could be an issue.

Recall from my last post that old statistics will not contain information about the newly inserted rows until stats are updated. Queries asking for data that is outside the bounds of the histogram will estimate 1 row and typically end up with a bad, serial plan. The trace flag works, or doesn’t, by trying to intelligently guess how many rows are such an “out of bound” range instead of assuming the value of 1.

There is another approach that does not rely on the optimizer making guesses: You can intentionally deceive the optimizer, lie about the distribution. Recall that we want the “desired DW plan”. Alas, as with all telling of lies, this exposes you to a lot of risk.

Disclaimer: What I am about to show you is a: “Don’t do this at home” trick. We are going to see some features in SQL Server used in a ways they are not intended for. And we are going to take control of the optimizer in ways it is not designed to handle. If you go down this path, you are on your own! Don’t call support with your issues. Just because I showed it to you, doesn’t mean you can put this in production.

Consider this example an exploration and education of the internals of SQL Server. And think of in the context of what you might consider asking for, next time you file an MSConnect item.

On the issue of MSConnect: If you care about big (any TB sized) DW workloads: please, please file and vote on items that make your working day hard. It helps if you describe which company you work for and what the impact is on your business of not having the feature or “fix”, I know this is not always possible. I know it takes times, and I appreciate you are busy, but Microsoft does listen, and they listen more closely when you can make it clear why this feature is needed and how it hurts you (I call this the “blood on the table” approach). If your favourite Connect item is about DW functionality, shoot me a mail (include the ConnectID please) to let me know. I may well put up a big fight for items that align with my mission in MS. Please keep in mind: I don’t care about client tools at all, I fight for server side features. We have a lot of people pushing on client side features and they are very important too, I have just chosen to specialize.

And with that said, let us dig in:

Recall that our issue was that the histogram lacked rows/buckets containing information about a range of data. The histogram in SQL server has up to 200 rows, but even if we could add a single row to the histogram on every insert, we would soon run out on a large table.

But how about this idea: Fake a histogram were every year we ever want to store (say 100 of them) contains a lot of data. In fact, make sure that every date range or value we select contain a tremendous amount of rows. One way to create such a histogram is to grow the fact table very large and then capture the stats and make it smaller again. If your fact table is not born with the right proportions, growing fat to get the right histogram proportions and then slimming down is not a very nice approach is it?

Our Fact table histogram is not pretty as it is, but how about doing some precision surgery instead? For that, we need an implant. First, we create a table with the same index structure as Fact:

CREATE TABLE HackStats
(
  date_sk INT NOT NULL
  , payload CHAR(4000) NOT NULL DEFAULT ‘X’
)

CREATE CLUSTERED INDEX scan_me ON HackStats(date_sk)

Now, let us prepare the implant. We need something that has these characteristics:

  1. Enough histogram divider rows to cover all possible data values we could ask for.
  2. A very large value for RANGE_ROWS and AVG_RANGE_ROWS between in each histogram row.
  3. A value of 1 for the DISTINCT_RANGE_ROWS in each histogram row.
    Ad 1) Make sure that we do the implant once, and that I lasts for the lifetime Fact.
    Ad 2) Make sure that there is sufficient size and elasticity for any range of dates we might touch.

More about 3) later…

With a little reverse engineering (no, I did not look at the source), we can create a histogram that match our requirement. It does not take a lot of rows in HackStats. Here we go:

SET @y = 1950
WHILE @y < 2050 BEGIN
  /* make the stats dense (low distinct) within the year */
 SET @r = 0
 WHILE @r < 100 BEGIN
/* Make sure the equal rows in the histogram are never hit using invalid
    dates  */
    INSERT HackStats (date_sk) VALUES (@y * 10000)
    INSERT HackStats (date_sk) VALUES (@y * 10000 + 5000)
    SET @r = @r + 1
  END
  SET @y = @y + 1
END

You may have been wondering why I added a payload column to HackStats. For small tables (by page size), SQL Server will do a full sample, even if we ask for a small percent. And the engine is smart enough to know that is has seen all pages. A fully sampled table cannot be hacked and made artificially pretty – in ways you will soon observe. The HackStats table has to be small, but large enough (in our case 20K pages) that all rows will not be sampled. And that is why we had to to pad the rows a bit.

We can now hack like this:

 /* prepare the implant */

UPDATE STATISTICS HackStats WITH ROWCOUNT = 199000000
UPDATE STATISTICS HackStats WITH SAMPLE 99 PERCENT

The first statement above “fakes” a large table by telling SQL Server that there are a LOT of rows in the table. The second statement samples the distribution of those rows (which we have cunningly crafted to have just the right dimensions, so we need to sample almost all of them).

Let us check the quality of the implant:

DBCC SHOW_STATISTICS (HackStats, scan_me)
WITH STAT_HEADER, HISTOGRAM

Output (your results may vary a little, depending on which rows got sampled, rerun the sample if you don’t like the output)

image

And:

image

Notice the neat distribution. Any query asking for a year range will get a high estimate of the outcome of scanning the table. SQL Server is smart enough to boost the row estimates based on how many total rows it believes are in the table.

And now, for the dirty little trick (Brent Ozar, if you are reading this, just don’t say it Smile ):

Let us transplant the faked, big stats to the ugly Fact table. First, take the implant:

DBCC SHOW_STATISTICS (HackStats, scan_me)
WITH STATS_STREAM

This gives an output somewhat like this (your result will vary):

  0x01000000010000000000000000000000B1AB0052000000….

We don’t really have to worry what is in there, it is some binary representation of the stats. The feature to read the binary stats was put into SQL to allow database clones – and we are abusing it now. But hey, we are in the business of precision surgery: Copy the bit values to the clipboard, past and execute:

UPDATE STATISTICS Fact scan_me WITH
STATS_STREAM = 0x01000000010000000000000000000000B1AB0052000000….

And a finishing flourish, to make it fact table look really big:

/* Make the table expensive to access in the future */

UPDATE STATISTICS Fact scan_me WITH PAGECOUNT = 200000000

Validate the result:

/* Check the patient */

DBCC SHOW_STATISTICS (Fact, scan_me)
WITH HISTOGRAM

(output similar to HackStats)

And let us check our problem query again:

DBCC FREEPROCCACHE
SELECT C.shoe_size, SUM(Measure)
FROM Fact F
JOIN DimCustomer C ON F.customer_sk = C.customer_sk
WHERE F.date_sk = 20010111
  AND C.shoe_size = 43
GROUP BY C.shoe_size

Eureka: the “Desired DW Plan”:

image

 

Notice something interesting: the Estimated Operator Cost of access is about 736. That is a REALLY expensive access to the table. Just the way we like it actually, wouldn’t want the optimizer to get any ideas that this table is cheap to access. Our little trick with hacking the page size gave our Fact patient quite a bit of confidence.

I deferred talking about why we need a low number of distinct rows in the histogram. For now, I leave this as an exercise to the reader. Hint: it is related to the bitmap filters and the way Fact is accessed.

And remember folks, you do this at your own risk, you are on your own and in unsupported land here… Don’t come crying to support if you break something. Ask for a connect item instead.

Latch and Spinlock Papers Published on Microsoft

2011-07-01 Leave a comment

I am happy to announce that my team mates, Ewan Fairweather and Mike Ruthruff have published two excellent whitepapers on latch and spinlock diagnosis. You can find them here:

Pay special attention to the spinlock scripts, you will find an interesting trace flag in there that sheds more light on my intentionally vague dodging of the call stack subject in my blog entry here. I am sorry that I did not provide more details in the blog, but I did not want to give the plot away Smile

I am very proud to have Ewan take over the OLTP tuning on the SQLCAT  team here in EMEA. Buy this guy a drink next time you meet him at a conference and have a chat with him about scalability.

Small differences between SQL Server and PostgreSQL

2011-05-19 10 comments

In my copious amount of spare time Smile, I am currently working with Gapminder to build a data warehouse. We are using PostgreSQL and Ruby Rails as the development platform.

As I learn PostgreSQL, I am running into some interesting differences. For the SQL Server/PostgreSQL people out there, I thought it might be interesting to describe some of them in case you are transitioning from one to the other.

 

SQL Server PostgreSQL
INSERT t VALUES (…) This syntax is not allowed. Allows:
INSERT INTO t VALUES (…)
BULK INSERT and BCP uses COPY instead
(which has the functionality of both BCP and BULK INSERT)
Management Studio pgAdmin
Bit type Boolean type (accepts values true and false)
IDENITTY Has sequencers (like Oracle)
default schema is dbo default schema is PostgreSQL
Default Listening on 1433 Default listening on 5432
datatype: varchar(max) datatype: text
Key is clustered by default key is not clustered by default (and it is enforced by a constraint and not an an index!)
User Defined Data Types Domains
user: sa user: postgres
No such thing NATURAL and USING joins
SELECT TOP 10 * FROM t SELECT * FROM t LIMIT 10
Query plans read from right to left Query plan read from left to right
Estimate Query Plan: CTRL+L Estimate Query Plan: F7
Categories: PostgreSQL, SQL Server Tags:

Grade of the Steel: Introduction

2011-05-01 Leave a comment

There are a lot of things I want to say on this blog, so I do apologize in advance for jumping a bit between subjects. I will shortly write the next installment of my data modeling series – but before that, I want to let you know about some exciting new tests that I am working on in my copious amount of spare time Smile.

FusionIO has kindly granted me access to what can only be described as a beast of a machine! Here is the configuration I am looking at in my terminal session:

  • An HP DL980G7 box with 8 Nehalem-EX CPU at 8 Cores each (16 in HT mode). Clock speed of 2.0 GHz.
  • 22 PCI Slots, 7 which are currently filled with FusionIO cards (with at least 2 ioDIMM in each slot)
  • A good pile of RAM (still not sure how much we will run on, but the box is capable of 2TB)
    Along with Tom Scribner of FusionIO (one of the guys who wrote the FIO Windows driver) – I will be doing some tests on this monster box. This is one of the largest scale-up systems out there – and the biggest scale-up x64 box that HP currently ships.
    We still don’t know exactly how many IOPS we can squeeze out of this, but it is looking promising. You know you are heading the right way with your tuning, when both tuners are looking at 100K Random IOPS, at 64K blocks, consistent 0.6ms latency – and say: “Nice, but not impressive”. This is basically a system with no bottlenecks.

Here are a few screenshots:

imageimage

Notice the little green light in the top right corner, with the number 22 next to it.That is the number of IODimms.

Some of you may have participated in the “Crappy Code Games” during SQL Bits and seen some pretty amazing results. But now, I want to up the bar and try “Great Code Game”. I want to put SQL Server to the test and discover the grade of the steel of basic operations that we all need to do when designing databases.

Here are some of the questions I plan to answer:

  1. How fast can we run the ETL World Record on this? (I hope to hit 4 minutes for 1TB if the CPUs will do it)
  2. How many singleton INSERT / sec can this box sustain?
  3. How many single row UPDATE statements / sec can we do?
  4. How fast can you table scan fact tables?
  5. How fast can you do backup?

Ad 1) Since I don’t have any blade servers to use for ETL clients, I will run BULK INSERT commands. This will be the (current) ultimate test of a scale-up ETL load

Ad 2,3) This will of course depend a lot on the table design and file layout. If you have seen me present at SQLBits, you will know that I have some unorthodox design techniques to boost scale. I plan to put them through extreme testing and write more about them. For all you OLTP people out there – this is going to be important if you design for scale.

Ad 4) I have no clue how fast this box will run for table scans. But I know that the PCI bus can do at least 36GB/sec.

Ad 5) Again, I have no clue what the result will be and what bottlenecks I might hit. Only testing will show.

I am quite confident that the standard “out of the box” designs you generally see recommended will not scale on this box. But I am equally confident that there are ways to break through the scale roof and shut up the voices that would claim that SQL Server does not scale. Be prepared for some surprising design hints.

The box also has Hyper Threading (HT). Time allowing, I will see how well the claim that the new HT actually boosts scale will measure up to actual testing. Also, I know a lot of people have asked me about NAND devices and what can be done with them. Feel free to suggest ideas for tests in the comment section.

Anyone else have a box like this out there? I challenge you to compete and explore with me.

Let the games begin!

Utility functions: fn_convert_to_base and fn_nums

2011-04-26 2 comments

I will often use code to illustrate my points in this blog. Because data generation is a big part of these examples, I will take the chance to introduce a few functions that I find useful for that. I will use these functions in my examples, so refer to the Utilities category on this blog to find the source. Most of these functions are adaption from other users of SQL Server and I will do my utmost to give credit where credit is due. If you feel you are the owner of the original idea – please send me an email so I can give you proper credit.

fn_convert_to_base is very useful for creating hex or binary representations of numbers. As we shall see, using the hex representation of a key can be good for illustrating some points. The version I use is based on the code by Leo Vindosola:

CREATE FUNCTION dbo.fn_convert_to_base
(
    @value AS BIGINT,
    @base AS INT = 16
) RETURNS VARCHAR(MAX) AS BEGIN  

    -- some variables  
    DECLARE @characters CHAR(36),
            @result VARCHAR(MAX);  

    -- the encoding string and the default result  
    SELECT @characters = '0123456789abcdefghijklmnopqrstuvwxyz',
           @result = '';  

    -- make sure it's something we can encode.  you can't have  
    -- base 1, but if we extended the length of our @character  
    -- string, we could have greater than base 36  
    IF @value < 0 OR @base < 2 OR @base > 36 RETURN NULL;  

    -- until the value is completely converted, get the modulus  
    -- of the value and prepend it to the result string.  then  
    -- divide the value by the base and truncate the remainder  
    WHILE @value > 0
        SELECT @result = SUBSTRING(@characters, @value % @base + 1, 1) + @result,
               @value = @value / @base;  

    -- return our results  
    RETURN @result;
END

Probably my all time favourite function is fn_nums that allows you to quickly generate a virtual table of integers. I use Itzik Ben Gan’s version:

CREATE FUNCTION [dbo].[fn_nums](@n AS BIGINT) RETURNS TABLE
AS

    RETURN
    WITH  L0 AS(
        SELECT 1 AS c
        UNION ALL
        SELECT 1)
        ,  L1 AS(
            SELECT 1 AS c FROM L0 AS A, L0 AS B)
        ,  L2 AS (SELECT 1 AS c FROM L1 AS A, L1 AS B)
        ,  L3 AS (SELECT 1 AS c FROM L2 AS A, L2 AS B)
        ,  L4 AS (SELECT 1 AS c FROM L3 AS A, L3 AS B)
        ,  L5 AS(SELECT 1 AS c FROM L4 AS A, L4 AS B)
        ,  Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY c) AS n FROM L5)
        SELECT n FROM Nums   WHERE n <= @n;

For pasting source code into the blog, I use the Code Snippet Paster from Omar.

I am not too happy about the syntax formatting though, doesn’t recognize all keywords and comments. Hope someone reading can recommend a better plug-in for live writer.

Categories: Utilities Tags:

Doing FIRST and LAST aggregates in SQL Server 2005

2006-07-10 Leave a comment

Users of Microsoft Access may be familiar with the aggregation functions FIRST and LAST. Basically, what you want from these aggregates is to scan the tables in a sorted order. The first or last value encountered in each group is send to the output – much like the existing SQL Server MIN/MAX function.

 

Let me illustrate with some pseudo code:

 

SELECT Col

  , FIRST(Value) AS FirstV

  , LAST(Value) AS LastV

  , SUM(Value) AS SumV

FROM Table

GROUP BY Col

 

On a table with these rows:

Col Value
A 3
A 2
A 6
B 5
B 1

…You want this output:

Col FirstV LastV SumV
A 3 6 11
B 5 1 12

 

Notice how there in an implicit assumption of a row ordering. You probably want some sort of identity column to order your rows. Implicitly – this is a very ISAM way of looking at the world. Now, how do we do this inside SQL Server?

 

Lets make some test data based on AdventureWorksDW:

 

SELECT

s.*, IDENTITY(int, 1,1) AS ID

INTO

#Agg

FROM

dbo.FactInternetSales s

CROSS

JOIN (SELECT TOP 30 * FROM sys.objects) scaleUpFactor

 

CREATE

UNIQUE INDEX IX_FirstLast ON #Agg (ProductKey, ID, OrderQuantity)

 

Well… My colleage Jesper Rasmussion came up with a brilliant answer (try this on the testdata from above):

 

SELECT

I.ProductKey

  , F.OrderQuantity AS FirstQuantity

  , L.OrderQuantity AS LastQuantity

  , I.SumQuantity AS SumQuantity

FROM

  (SELECT ProductKey

   , min(ID) AS FirstID

   , max(ID) AS LastID

   , sum(OrderQuantity) AS SumQuantity

   FROM #Agg

   GROUP BY ProductKey) I

INNER

JOIN #Agg F ON F.ID = I.FirstID AND F.ProductKey = I.ProductKey

INNER

JOIN #Agg L ON L.ID = I.LastID AND L.ProductKey = I.ProductKey

 

This query has some very interesting properties. If you check the executions statistics you will see that SQL Server only does very few I/O requests in the index created (not more than 10-20% more than it takes for a full scan – probably around log(size(#agg) ). Since all rows HAVE to be visited to answer the requiest – this means that the query is very efficient.

 

Now, if only we could make a first and last aggregate that does just one table scan… :-)

 

Categories: SQL Server Tags:
Follow

Get every new post delivered to your Inbox.

Join 271 other followers