Pages

Tuesday, 8 November 2011

T-SQL Tuesday #024 - Calculating Easter with a CLR Function

T-SQL Tuesday for November 2011 is about Prox n Funx (Procedures and Functions) and is hosted by Brad Schulz here: http://bradsruminations.blogspot.com/2011/10/invitation-for-t-sql-tuesday-024-prox-n.html

This is my first T-SQL Tuesday post (be gentle) and I thought I'd share a function I have recently converted from T-SQL into a CLR based function as it fits with this months topic.  I wont be going into what the CLR is or how you go about using it in SQL Server as there are plenty of posts and articles on the Internet already that explain it far better than I ever could.

A little while back I created a blog post on calculating the date of Easter when given a particular year, that post is here: http://blog.sqlconcepts.co.nz/2011/08/i-like-chocolate-eggswhen-is-it-easter.html

The function works fine from a logical point of view, however, due to the type of calculations going on it doesn't perform that well.  If you use to populate a date dimension (or other static list of dates) its more than adequate for the job but if you are calling it frequently you'll want something with less overhead and more speed.

For that reason I set about turning it into a C# CLR function.  Another reason for doing this was to get me back into the CLR and C# coding as I've only ever done a few pieces here and there.  I need the practice!


Before I present the C# code and a few instructions to get it into the database a bit of background on calculating the date of Easter.  In looking up how to calculate the date from a given year I discovered a few things I hadn't previously considered:
  • There is more than one way to calculate the date
  • Different parts of the world recognise the date differently
The first item wasn't too much of a surprise, that just became a case of picking the one I was most comfortable with (i.e. the one I could follow and implement!).  The second one did come as a bit of a surprise although probably shouldn't have. In the end I went with what is appropriate for New Zealand but if you're in another part of the world you may have to find an alternate method. A chance for a function upgrade in the future perhaps.

The method I eventually went with is called the "Anonymous Gregorian Method".  You'll find info on it, and also the algorithm, on Wikipedia: http://en.wikipedia.org/wiki/Computus#Anonymous_Gregorian_algorithm

The Code:

using System;
using System.Data.SqlTypes;

namespace SQLConceptsLtd
{
    public class SqlClrEaster
    {
        [Microsoft.SqlServer.Server.SqlFunction(IsDeterministic = true, IsPrecise = true)]
        public static SqlDateTime CalcEaster(SqlInt32 Year)
        {
            // Function to calculate the date of Easter for a supplied year.
            // If you want to return a wider range of dates to SQL Server you can use
            // the DateTime (.Net) datatype to return to a DateTime2 in SQL Server.  However,
            // that presents other problems notably the inability to set the return
            // value to Null.

            // This function uses the "Anonymous Gregorian algorithm" as described in
            // Wikipedia: http://en.wikipedia.org/wiki/Computus#Anonymous_Gregorian_algorithm


            // Return null if Year is null or out of DateTime range
            if ((Year.IsNull) || (Year < 1753) || (Year > 9999))
                return SqlDateTime.Null;

            // Cast SqlInt32 to int type as we reference it a few times in the calculation
            int year = (int)Year;

            // See the Wikipedia link above to understand the need for all these variables
            int a, b, c, d, e, f, g, h, i, k, l, m, month, day;

            // Here's the calcuation to get the month and day values
            a = year % 19;
            b = year / 100;
            c = year % 100;
            d = b / 4;
            e = b % 4;
            f = ((b + 8) / 25);
            g = ((b - f + 1) / 3);
            h = ((19 * a) + b - d - g + 15) % 30;
            i = c / 4;
            k = c % 4;
            l = (32 + (2 * e) + (2 * i) - h - k) % 7;
            m = (a + (11 * h) + (22 * l)) / 451;

            month = (h + l - (7 * m) + 114) / 31;
            day = ((h + l - (7 * m) + 114) % 31) + 1;

            // return the date!
            return new SqlDateTime(year, month, day);
        }
    }
}
There really isn't all that much to this function and if you've looked at the Wikipedia article with its description and algorithm you'll see I've literally just converted it to C# and wrapped a few .Net datatypes around it.  I did discover we don't (yet) have support for the SQL DATETIME2 data-type in .Net.  I found you could use the .Net DateTime datatype which would then map to the SQL DATETIME2 datatype, however, you cannot set the .Net DateTime datatype to NULL which could be a bit of a problem.  For this function I stuck to the SqlDateTime which can be set to NULL.

As I'm still (re)learning C# if I've done something stupid I'd be keen to get some feedback on that so comment away!

Once you compile the class above you'll get a .dll assembly we can load into SQL Server.  That's done with the following code:
create assembly CalcEaster
from 'C:\Users\Kent Chenery\Documents\Visual Studio 2010\Projects\SqlClrEaster\SqlClrEaster\bin\Release\SqlClrEaster.dll'
with permission_set = safe;
go

create function dbo.CLRCalcEaster(@Year int)
returns datetime
as
external name CalcEaster.[SQLConceptsLtd.SqlClrEaster].CalcEaster;
go

I imagine its obvious but you'll have to adjust the assembly source for yourselves and your environment.

You can now call the dbo.CLRCalcEaster function like you would any other scalar function.  So to find out the date of Easter in 2012 you simply run:
select dbo.CLRCalcEaster(2012);
The function will tell you its: 8th April 2012

I said at the beginning of this article that the T-SQL function was slow - thats perhaps not that fair, its not so slow as to be unusable but the CLR equivalent function results in a measurable drop in CPU usage and execution duration.  Here's some stats for you to consider:
# Calculations T-SQL Version CLR Version % Improvement
10000 0.233 0.159 31.8%
100000 1.598 0.998 37.5%
1000000 15.445 8.601 44.3%

The timings are in seconds and an average of 5 results for each set of tests.

As you can see, with a good number of calculations you get 40% savings in CPU and execution times. Without any further research/testing my guess for the increased efficiency with more calculations comes from the compilation time becoming a smaller and smaller portion of the overall execution time.

Conclusion

This is a trivial example of a CLR function but one that clearly demonstrates the advantages in performance that can be gained with the CLR.  I'm planning to explore the CLR further and hopefully post up some more functions and procedures that might be of use to people (other than myself).

1 comment:

  1. this good POST
    I like this.

    form
    http://bantalsilikon01.blogspot.com
    http://bumbupecel.esy.es
    http://www.facebook.com/bantalsilikongrosir
    http://bumbupecelbali.blogspot.com

    ReplyDelete