# Tuesday, 07 July 2015

Every now and then an error with only a GUID as part of the information comes up on a screen... The question is, does it live in my database and if so, where (and how) to find it? Today was one of those days and I quickly wrote a query to find out where the particular uniqueidentifier sits in de SQL Server database.

A quickly written query did the job, but I had this feeling it was suboptimal. The way I wrote it, required three scans or seeks on the source table. You’ll hardly notice the difference in AdventureWorks, but the faster (and resource-friendlier) the result is available, the better.

Typical query result for GUID in SQL database

DECLARE @guid uniqueidentifier
SELECT
@guid = 'E552F657-A9AF-4A7D-A645-C429D6E02491'
SELECT OBJECT_SCHEMA_NAME(c.object_id) AS schema_name
   , OBJECT_NAME(c.object_id) AS table_name
   , c.name AS column_name
   , @guid AS value
   , CAST(NULL AS bigint) AS value_count
   , CAST(NULL AS bigint) AS notnull_count
   , CAST(NULL AS bigint) AS row_count
   , CAST(NULL AS nvarchar(max)) AS preview_query
INTO ##hunt_for_guid
FROM sys.columns c
    INNER JOIN sys.types t ON c.system_type_id = t.system_type_id
    INNER JOIN sys.objects o ON c.object_id = o.object_id
WHERE t.name = 'uniqueidentifier' AND o.type = 'U'
WHILE (SELECT COUNT(*) FROM ##hunt_for_guid WHERE row_count IS NULL) > 0
BEGIN
DECLARE
@sql nvarchar(max)
   /*
-- Not optimal, requires reading the source table/index 3 times
SELECT @sql = 'UPDATE ##hunt_for_guid SET value_count = (SELECT COUNT(' + QUOTENAME(column_name) + ') FROM ' + QUOTENAME(schema_name) + '.' + QUOTENAME(table_name) + ' WHERE ' + QUOTENAME(column_name) + ' = ''' + CAST(@guid AS nvarchar(36)) + ''')
       , notnull_count = (SELECT COUNT(' + QUOTENAME(column_name) + ') FROM ' + QUOTENAME(schema_name) + '.' + QUOTENAME(table_name) + ')
       , row_count = (SELECT COUNT(*) FROM ' + QUOTENAME(schema_name) + '.' + QUOTENAME(table_name) + ')
       , preview_query = ''SELECT * FROM ' + QUOTENAME(schema_name) + '.' + QUOTENAME(table_name) + ' WHERE ' + QUOTENAME(column_name) + ' = ''''' + + CAST(@guid AS nvarchar(36)) + '''''''
       WHERE [schema_name] = ''' + schema_name + ''' AND [table_name] = ''' + table_name + ''' AND [column_name] = ''' + column_name + ''''
       FROM ##hunt_for_guid WHERE row_count IS NULL
*/
-- Optimized by scanning the source table/index only once.
SELECT @sql = 'UPDATE H
SET H.value_count = S.value_count, H.notnull_count = S.notnull_count, H.row_count = S.row_count
, H.preview_query = ''SELECT * FROM '
+ QUOTENAME(schema_name) + '.' + QUOTENAME(table_name) + ' WHERE ' + QUOTENAME(column_name) + ' = ''''' + CAST(@guid AS nvarchar(36)) + '''''''
FROM ##hunt_for_guid AS H INNER JOIN (SELECT
SUM(CASE WHEN '
+ column_name + ' = ''' + CAST(@guid AS nvarchar(36)) + ''' THEN 1 ELSE 0 END) AS value_count,
COUNT('
+ column_name + ') AS notnull_count, COUNT(*) AS row_count FROM ' + schema_name + '.' + table_name + ') AS S ON
H.schema_name = '''
+ schema_name + ''' AND H.table_name = ''' + table_name + ''' AND H.column_name = ''' + column_name + ''''
FROM ##hunt_for_guid WHERE row_count IS NULL
EXEC (@sql)
   END
SELECT
* FROM ##hunt_for_guid ORDER BY value_count DESC, schema_name, table_name, column_name
DROP TABLE ##hunt_for_guid

Tuesday, 07 July 2015 00:51:11 (W. Europe Daylight Time, UTC+02:00)
# Thursday, 26 March 2015

Quite often a DBA will get questions from an application administrator regarding the growth possibilities for databases or the possibilities for adding additional databases. In general, the administrators for applications like Lync, SharePoint, Microsoft Dynamics or System Center will have (and require) some higher level privileges on the SQL Server anyway. So chances are, they won’t even require additional permissions the query the information they need and on top of that, their applications may even have the permissions to create these databases without the intervention of a DBA. So here’s the query to let them answer the question themselves (provided they have enough permissions on the server);

/* Works for SQL Server 2008 R2 SP1 and above */
WITH volume_info
AS
(
SELECT vs.volume_mount_point, mf.type_desc
, COUNT(DISTINCT mf.database_id) AS Database_Count
, COUNT(mf.database_id) AS File_Count
, MAX(CAST(vs.Total_Bytes AS decimal(38,4))) AS Total_Bytes
, SUM(CAST(mf.size AS decimal(38,4))) * 8192 AS Database_Used_Bytes
, MIN(CAST(vs.available_bytes AS decimal(38,4))) AS Available_Bytes
FROM sys.master_files mf /* requires VIEW ANY DEFINITION (or CREATE DATABASE or ALTER ANY DATABASE) */
CROSS APPLY sys.dm_os_volume_stats(mf.database_id,mf.file_id) vs /* requires VIEW SERVER STATE */
GROUP BY mf.type_desc, vs.volume_mount_point
)
SELECT volume_mount_point, type_desc, Database_Count, File_Count
, CAST(Total_Bytes/(1024*1024*1024) AS decimal(38,4)) AS Total_Volume_GB
, CAST(Database_Used_Bytes/(1024*1024*1024) AS decimal(38,4)) AS Database_Used_GB
, CAST(Available_Bytes/(1024*1024*1024) AS decimal(38,4)) AS Available_GB
, CAST(100 * Database_Used_Bytes / Total_Bytes AS decimal(7,4)) AS Database_Used_Pct
, CAST(NULL AS decimal(7,4)) AS Available_Space_Pct
, CAST(NULL AS decimal(7,4)) AS Other_Usage_Pct
FROM volume_info
UNION
SELECT
volume_mount_point, NULL, NULL, SUM(File_Count)
,
MIN(CAST(Total_Bytes/(1024*1024*1024) AS decimal(38,4)))
,
SUM(CAST(Database_Used_Bytes/(1024*1024*1024) AS decimal(38,4)))
,
MIN(CAST(Available_Bytes/(1024*1024*1024) AS decimal(38,4)))
,
CAST(100 * SUM(Database_Used_Bytes) / MIN(Total_Bytes) AS decimal(7,4))
,
CAST(100 * MIN(Available_Bytes) / MIN(Total_Bytes) AS decimal(7,4))
,
CAST(100 * (MIN(Total_Bytes) - SUM(Database_Used_Bytes) - MIN(Available_Bytes)) / MIN(Total_Bytes) AS decimal(7,4))
FROM volume_info
GROUP BY volume_mount_point;

Required permissions at SQL Server level to execute this query.

As for the output, you will receive a list containing an aggregated line for each volume that hosts data and/or Tlog files for databases hosted by the SQL Server instance you query. Note that if you have multiple instances of SQL Server storing data on the same volumes, this query will only show the data consumed by the databases of this instance; the data consumed by other instances will be part of Other_Usage_Pct. The type_desc for the aggregated line will be NULL. If applicable you will also receive a row regarding data files (type_desc = ROWS) and log files (type_desc = LOG). This particular output is for a Fail over Cluster Instance with a 250 GB data volume and a 35 GB Tlog volume.

volume_mount_point type_desc Database_Count File_Count Total_Volume_GB Database_Used_GB Available_GB Database_Used_Pct Available_Space_Pct Other_Usage_Pct
E:\ NULL NULL 80 249.9980 214.0164 13.8435 85.6073 5.5374 8.8553
E:\ LOG 4 4 249.9980 0.0146 13.8435 0.0059 NULL NULL
E:\ ROWS 65 76 249.9980 214.0018 13.8435 85.6014 NULL NULL
E:\tlog_mount NULL NULL 61 34.9971 13.5620 21.0367 38.7517 60.1100 1.1383
E:\tlog_mount LOG 61 61 34.9971 13.5620 21.0367 38.7517 NULL NULL
Thursday, 26 March 2015 14:58:58 (W. Europe Standard Time, UTC+01:00)
# Friday, 10 January 2014

At one of my customers, the deployment of some new test-servers uncovered a nice chain of issues. Some Windows Server 2012 R2 machines were deployed but the application they were intended for, could not connect to some of the SQL Servers (on a cluster). However, when trying to reproduce the problem for a specific case made of this problem, the application did work as expected. Then after Windows updates were applied to the cluster, the case was reopened again and now reproduction succeeded (as in the application indeed failed to connect).

Troubleshooting at that stage narrowed it down to the fact that Windows 2012 R2 servers could not connect due to authentication issues, whereas Windows Server 2008 R2 machines with the same application had no issues. Must be Windows 2012 R2, or a policy setting that is different. It turned out to be the latter of the two and it was concluded that the policy setting “Network Security: Restrict NTLM: Outgoing NTLM traffic to remote servers” to Deny all, caused the behavior. Bottom line, Kerberos authentication didn’t work, NTLM did.

Now with Kerberos authentication to SQL Server being an issue, the case followed its way to me. Well, Kerberos authentication should work. All clustered SQL Servers were configured and tested for Kerberos. But a little more about this particular cluster;

  • 4 Nodes (Windows Server 2008 R2). Originally a 2-node cluster, extended about a year ago.
  • 8 Virtual SQL Servers (2008 R2). Install dates of those virtual servers varying between 3 years to 1 month ago.
  • Each virtual server is installed using a domain account for that particular server.
  • Install is pretty much default, so the installer determines the dynamic tcp-port.
  • After installation, the required SPNs are created.
  • The state for this cluster is TEST/ACCEPTANCE.

This particular case included two clustered SQL Servers having the problem;

  • CL2_SQL_INS2\INS2
  • CL2_SQL_INS5\INS5

To verify the situation, I connected to the node for INS2 and confirmed there were SPNs registered for the service accounts for INS2 and INS5 (SetSPN –L “domain_name\svc_acnt_name”) which returns results like

C:\Users\admin_075>SetSPN -L "organization\srv_sql_ins2"
Registered ServicePrincipalNames for CN=SRV_SQL_INS2,OU=CL2_SQL,OU=SQL_Clusters,OU=Servers,DC=organization,DC=local:
        MSSQLSvc/srv_sql_ins2:12345
        MSSQLSvc/srv_sql_ins2.organization.local:12345

In other words, it looked just like it should, the SPN for INS2 was 12345 and for INS5 was 54321. Next I opened the SQL Server Configuration Manager and retrieved the TCP Port for IPAll for both INS2 and INS5. INS5 was at TCP Dynamic Ports = 54321 (as expected), but INS2 was at TCP Port = 1433! Now this explains why INS2 isn’t working on Kerberos, it’s not registered for 1433. Of course there now is the question why INS2 is forced to listen on 1433 instead of 54321, but at that stage I was more concerned with why INS5 wasn’t authenticating on Kerberos despite the port configuration. Since the Windows Server I currently was looking at, was the active host for INS2 but not for INS5, I connected to the host for INS5 and checked the TCP Port settings. There is was in for yet another surprise: INS2 = tcp-port 12345 and INS5 = tcp-port 1433!

This is the situation that is not supposed to be!!! The active hosts for the respective virtual servers listen at 1433, but the passive hosts are correct (well, turned out INS2 had 2 hosts configured for 1433). I learned these were probably remnants from earlier troubleshooting activities. However I’m still surprised as in a cluster, when one edits the tcp-port number on the active host, it should propagate with a failover. So I edited the port on the active host for INS2 to be 12345 instead of 1433 and brought the resource offline and online again, port remained at 12345 (= OK). Next I did a failover to the other host that still had 1433 listed, expected that the port would become 12345. It didn’t, it remained at 1433. What is broken here (or has never worked correctly at all)?

The broken, or better said, missing part turned out to be checkpoints. Not that this is the first case of missing checkpoints SQL Server as it seems, the way to fix them is hidden inside the kb-article How to change SQL Server parameters in a clustered environment when SQL Server is not online. The title seems to suggest that this only applies when SQL is not online, but the described method 2 has no restriction on SQL Server being online or not. After checking with cluster . resource /checkpoints it turned out that none of the SQL Server Services had a checkpoint. And since the path HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL10_50.INS2\MSSQLServer\SuperSocketNetLib\Tcp\IPAll wasn’t covered by the another checkpoint, it is obvious why tcp-port numbers were inconsistent throughout the nodes. After adding a checkpoint for INS2 (from the host where INS2 was running on 1433) and editing the tcp-port number to become a static (not dynamic as on the other host) 12345. Next I conducted a failover to the previous host and verified that it too now had a static IPAll for port 12345. Issue of the missing checkpoints solved. The other 7 instances went the same way.

CAUTION: At step 4 for Method 2, for SQL Server 2008  the syntax is listed as cluster . resource "SQL Server (<InstanceName>)" /addcheckpoints:"HKEY_LOCAL_MACHINE\Software\Microsoft\Microsoft SQL Server\MSSQL.x\MSSQLSERVER". Be sure to leave out the HKEY_LOCAL_MACHINE\ part as this results in invalid syntax and take a good look how to replace the MSSQL.x as this would apply to SQL Server 2005, whereas MSSQL10.<InstanceName> would apply to SQL 2008 and MSSQL10_50.<InstanceName> to SQL Server 2008 R2. Hopefully this will be updated based on the feedback I submitted.

For more background on the checkpoints, I found the Windows 2008 (R2) documentation on clustering is rather limited to the “What’s new” stuff and to be honest, even the Windows 2003 documentation is not as clear on the subject as Exchange 2003 on Windows Cluster Architecture (sections Checkpoint Manager and Database Manager under Cluster Service Components).

Friday, 10 January 2014 22:51:14 (W. Europe Standard Time, UTC+01:00)
# Friday, 03 February 2012

When performing maintenance to a cluster, as in adding an SQL Server virtual to a node or removing it from a node, setup may fail for a number of reasons. And once something went wrong, things keep getting wrong, even if you verified the procedure and you corrected the cause of the error.

Well, the procedure didn't tell you everything. First of all, you should only be logged in to the node where you are running the setup. Logoff from all other nodes. The reason is that SQL Server setup will create a Windows Scheduled task on the other nodes of the cluster for you, under your account. Having an interactive session (you being logged in) and logging in as a batch job (the setup task) with the same account, may lead to unexpected events (and thus setup failing). Obviously Task Scheduler should be running to ;-). To be honest, Microsoft does tell you about the scheduler and the logging off in a KB article (let's be nice and add some community content).

Where it does get mean in the KB article, is when setup keeps failing (and it most likely will); restart all remote nodes!!! The most likely reason for the setup failing this time, is a running setup.exe from a previous failed attempt. This is what may cause the access denied for the scheduled task (or check that your account is local admin on every node and has the login as a service permission). Here is how to avoid having to reboot;

  • Exit the setup on the local node.
  • Check the scheduled tasks on the remote nodes. Some are running: stop them, then delete them. Some have failed, delete them
  • Next, check the processes on all nodes, there may be one or two setup.exe processes from SQL Setup active on the nodes, kill those processes.
  • Don’t forget to log off from the remote nodes.
Friday, 03 February 2012 12:38:29 (W. Europe Standard Time, UTC+01:00)
# Tuesday, 25 October 2011

Character based data types, especially the var types varchar and nvarchar allow to declare a really big amount and just use a tiny bit and it doesn’t even hurt in SQL Server. However, it may in the application. Today I was working with a customer for the implementation of some third party software that would be doing analysis on the data. It would load the really over-dimensioned (n)varchar types in something like a (n)char.... you get the picture, huge amounts of wasted space. So from the databases side, a bit of restructuring was desirable. To see how much we can actually save, I created query to report on the character usage;

CREATE TABLE ##char_columns (
[SCHEMA]
VARCHAR(128) NULL,
[TABLE]
VARCHAR(128) NULL,

[COLUMN]
VARCHAR(128) NULL,

[data_type]
VARCHAR(128) NULL,

[max_length]
INT NULL,

[is_replicated]
BIT NULL,

[actual_max_length]
INT NULL,

[row_count]
BIGINT NULL,

[not_null_count]
BIGINT NULL)

INSERT
##char_columns

SELECT
s.name AS 'schema'
,
t
.name AS
'table'
,
c
.name AS
'column'
,
y
.name AS
'data_type'
,
c
.max_length
,
c
.is_replicated
,
NULL
,
NULL
,
NULL

FROM
sys.tables t

INNER
JOIN sys.schemas s ON t.schema_id = s.schema_id

INNER
JOIN sys.columns c ON t.object_id = c.object_id

INNER
JOIN sys.types y ON c.system_type_id = y.system_type_id

WHERE
t.type = 'U'
AND y.name IN ('varchar','char','nchar','nvarchar')
WHILE
(SELECT COUNT(*) FROM ##char_columns WHERE actual_max_length IS NULL) > 0

BEGIN
TRY

DECLARE
@sql NVARCHAR(4000)
, @actual_max_length INT , @column VARCHAR(128) , @schema VARCHAR(128) , @table VARCHAR(128)
SELECT
TOP 1 @schema = [SCHEMA], @table = [TABLE], @column = [COLUMN]
FROM ##char_columns

WHERE
actual_max_length IS NULL

-- Set the actual_max_length to -1 if a column has only null values or no rows.

SELECT @sql = 'UPDATE ##char_columns
SET [actual_max_length] = (SELECT ISNULL(MAX(LEN([' + @column + '])),-1) FROM [' + @schema + '].[' + @table + '])
,
      [row_count] = (SELECT COUNT_BIG(*) FROM [' + @schema + '].[' + @table + '])
,
      [not_null_count] = (SELECT COUNT_BIG([' + @column + ']) FROM [' + @schema + '].[' + @table + ']) 
WHERE [schema] = ''' + @schema + '''
 
      AND [table] = ''' + @table + '''
 
      AND [column] = ''' + @column + ''''
EXEC
(@sql)

END
TRY

BEGIN
CATCH

DECLARE
@err_num NVARCHAR(20), @err_msg NVARCHAR(2048), @full_msg NVARCHAR(2048)

SELECT
@err_num = ERROR_NUMBER(), @err_msg = ERROR_MESSAGE()
SELECT @full_msg = 'ERROR: ' + @err_num + ' DESCR: ' + @err_msg + ' QUERY: ' + @sql

RAISERROR
(@full_msg,16,0)

BREAK

END
CATCH

SELECT
* FROM ##char_columns

DROP
TABLE ##char_columns

Be aware that varchar(max) and nvarchar(max) will give a max_length of –1 and that actual_max_length is –1 if no rows exist in the table or the column only contains NULLs (refer to the row_count and not_null_count to determine the situation).

Tuesday, 25 October 2011 22:37:16 (W. Europe Daylight Time, UTC+02:00)
# Friday, 15 April 2011

In a clustered environment, it may not always be obvious what node SQL Server has been running on. Most logging is done based on the network name of the resource group, or the instance name of the virtual server. Still it is logged once in the Windows Application Event Log and the SQL Server Error Log each, though the “memory” of those sources is likely limited by configuration; size and overwrite settings of the application log and number of error logs (maximum 99) for the SQL Server Error log.

In the Windows Application Event Log, the EventID 17664 is of interest (verify the source is an SQL Server instance). Event Log filtering is relative straight forward and turns up an event like this.

EventMessageNodeSQLServer

Figure Event Properties; 1 = Instance name, 2 = Network name of the resource group, 3 = Computer name of the cluster node, 4 = Binary code containing network name\instance name

In case of the SQL Server Error Logs, the thing of interest is the message;

The NETBIOS name of the local node that is running the server is 'computername'. This is an informational message only; no user action is required.

Parsing the files might be the easiest from SQL Server with the xp_readerrorlog. Unfortunately error handling surrounding eXtended stored Procedures is limited. Using the script below, the last result set shows the node where a clustered SQL Server was started on.


SET
NOCOUNT ON
DECLARE
@iteration
int
DECLARE
@log_events table ([LogDate] datetime, [ProcessInfo] nvarchar(20), [Text] nvarchar(max), [Iteration] int
)
SELECT @iteration = 0
-- Set to current error log
WHILE @iteration <
100
    BEGIN
    BEGIN
TRY
        -- EXECUTE the XP without INSERT to trip the error.
        -- XP error is handled poorly, but it is even worse
        -- when it occurs on the insert from executing an XP.
        EXEC xp_readerrorlog @p1 = @iteration, @p2 = 1, @p3 = 'NETBIOS name'
;
        INSERT @log_events ([LogDate], [ProcessInfo], [Text]
)
            EXEC xp_readerrorlog @p1 = @iteration, @p2 = 1, @p3 = 'NETBIOS name'
;
        UPDATE
@log_events
            SET [Iteration] =
@iteration
       
    WHERE [Iteration] IS
NULL;
        SELECT @iteration = @iteration + 1
;
   
END
TRY
    BEGIN
CATCH
        SELECT
@iteration = @iteration + 1
;
    END
CATCH
    END
SELECT
[Iteration],
[LogDate]
   
    , CAST(SUBSTRING(

           
SUBSTRING([Text]

            ,
LEN('The NETBIOS name of the local node that is running the server is ''') + 1
           
,16
-- Max length NETBIOS name is 16 char (15 actually).
   
    ), 1, CHARINDEX('''', SUBSTRING
(
            [Text]
           
,LEN('The NETBIOS name of the local node that is running the server is ''') +
1
           
,16
-- Max length NETBIOS name is 16 char (15 actually).
   
    )
    )
- 1) AS varchar(16)) AS
'NodeName'
FROM
@log_events
WHERE [Text] LIKE
'The NETBIOS name of the local node that is running the server is %'

Friday, 15 April 2011 12:56:11 (W. Europe Daylight Time, UTC+02:00)
# Tuesday, 30 November 2010

There is nothing fancy about this post… just a little frustration. I wanted to compress files after they were processed with SSIS 2005. And with compression, I mean NTFS-compression, not creating a .zip or .gz (because that is what I mostly found through the search engines).

My first path of research was to do it in a Script Task, but it turns out, invoking the compression attribute of files on NTFS requires C# and can’t be done with VB.NET. So it’s a no go in SSIS 2005.

During the search, somewhere the compact.exe program in Windows was suggested as an alternative. Very useful from an SSIS perspective, it means the Execute Process Task is required. To make it all happen, three things are needed;

  • A Foreach Loop container
  • A string variable (either in the scope of the Foreach Loop Container or in the Package-scope), named filename in this example.
  • An Execute Process Task in the Foreach Loop container

clip_image002

Configure (Collection Page) the Foreach Loop Container as Foreach File enumerator, point to the folder where the files reside and retrieve the file names as Fully qualified. Depending your requirements, you may also specify an additional mask and include subfolders.

clip_image004

Next visit the Variable Mappings tab and point to the previously created variable listed as User::filename.

Next is the Execute Process Task. But first a little on compact.exe, to compress a file, compact.exe should be called with the /C flag and the name of the file to be compressed. Something like C:\WINOWS\System32\compact.exe /C "D:\ftp\archive\dump_001208.csv"

In terms of what the Execute Process Task expects, the Executable should be set to compact.exe. The arguments should be /C "D:\ftp\archive\dump_001208.csv" and since the filename is supplied by the variable, an expression should be used. On the Expressions tab, create an expression for Arguments. With the expression designer, create the following expression

"/C \"" + @[User::filename] + "\""

The important part to recognize is that expressions are largely built on C-style syntax (download the PragmaticWorks white paper  SSIS Accelerator Series: Working with SSIS Expressions to learn more on SSIS expressions). The arguments expression should return a string, hence the opening and closing double-quotes. The filename, which is supplied by the @[User::filename] variable (drag and drop the variable in the expression designer) should be enclosed within double-quotes too. For SSIS to recognize that these double-quotes are part of the string it should return, the double-quotes need to be escaped (using the backslash). To confirm the expression is correct, use the Evaluate Expression button. The evaluated value should return /C "" (slash + C + space + double-quote + double-quote), once the filename variable is filled, it will appear between the two double-quotes.

clip_image006

Enclosing the filename in double quotes prevents files with spaces in the name from causing errors. There could be other errors though, like a file being locked. With current configuration, the package would stop and throw an error on the first occurrence of this condition. To prevent the package from failing in that event, set the FailTaskIfReturnCodeIsNotSuccessValue property of the Execute Process Task to false.

Tuesday, 30 November 2010 21:49:56 (W. Europe Standard Time, UTC+01:00)
# Tuesday, 19 October 2010

Some time ago I was working with performance data collected by a customer to look for I/O-patterns and make recommendations for the I/O-subsystem for a new SQL Server 2008 machine that would replace a couple of SQL Server 2000 servers. Unfortunately I ran into issues with the values for Avg. Disk sec/Read and Avg. Disk sec/Write, which were ridiculous with average values around 100-million seconds per read or write. Those values occurred on two virtualized Windows 2000 Servers.


Today, I was working for another customer. Physical Windows Server 2003 boxes (both nodes in the same cluster). I imported the .blg files into a database and was looking at very high values (up to 100 million) for Avg. Disk sec/Read and Avg. Disk sec/Write. I consulted the staff-DBA, who referred me to one off the Windows administrators. So the Windows administrator logged in, opened performance monitor and… got reasonable values for those counters (hovering at about 0.006, I’d even say those values are good). To prove I wasn’t making a fool of myself, we walked over to my desk and I showed the results I had… same servers, same counters, very different values. Next we tried using the account of the staff-DBA, which showed the “normal” values.


Simply eliminating, the Win-admin said the only difference is the account, so he went and checked whether the account I used had full administrator permissions. It had. Next he turned his attention to the user profile for the account I used (which happened to be quite big). The user profile belonging to the account I used was deleted and I logged in again, this time to be shown the correct values. RESOLVED!!!


Unfortunately the collected data still was useless. I started the counter log again, let it run for about 10 minutes and examined it… sky high values!!! In disbelief I switched to current activity, to be shown normal values!!! Next I created a new counter log definition, let it run for a couple of minutes and this time I got normal values.


Bottom line, to resolve insanely high values for Avg. Disk sec/Read and Avg. Disk sec/Write (it might apply for other counters of the type PERF_AVERAGE_TIMER too), use a fresh user profile and a fresh counter log definition.


Thanks Ruud.

Tuesday, 19 October 2010 21:44:49 (W. Europe Daylight Time, UTC+02:00)
# Thursday, 19 August 2010

I am working on an MS Access database that needs to be integrated in a CRM system. Since the backend of the CRM is SQL Server (why else would I be involved), I migrated the Access database to SQL Server using the SQL Server Migration Assistant for Access v4.2 (SSMA for Access). Next I turned my attention to translating attributes between the systems and unfortunately, a free text notes field in the Access database is used for a lot of things that need to be translated to attributes. Some things I know how to translate, but some things have to be determined by the administrators for the CRM System and Access application. So I built a query in SQL on the tables formerly stored in the Access database and used Excel (with MS Query) to create a report for the administrators, so they can sort out how customer statuses should be translated.

Extremely simplified, the query looked like this;

SELECT tlbCustomer.CUST_CODE
  , tlbCustomer.
PAY_REF
  , COALESCE(UPPER(REPLACE(tblNotes.NOTE_TXT,'.','')),'') AS
NOTE_TXT
FROM tlbCustomer LEFT OUTER JOIN
tblNotes
      ON tlbCustomer.CUST_ID = tblNotes.
CUST_ID
GROUP BY tlbCustomer.
CUST_CODE
  , tlbCustomer.
PAY_REF
  , UPPER(REPLACE(tblNotes.NOTE_TXT,'.',''
))
ORDER BY COUNT(*)
DESC

The query worked great on SQL, but did not return the text (NOTE_TXT) to Excel.

To cut the long story short, the original tblNotes.NOTE_TXT field in the Access database was of type MEMO. SSMA 2008 for Access converted this to nvarchar(max) and somehow MS Query (in Office 2007) does not return, nor throw an error on, large data types (varchar(max), nvarchar(max)). A quick test showed that MAX was not really needed;

SELECT MAX(LEN(tblNotes.NOTE_TXT)) FROM tblNotes

So modify the query to return NOTE_TXT as nvarchar(256) does the trick;

SELECT tlbCustomer.CUST_CODE
  , tlbCustomer.
PAY_REF
  , CAST
(
      COALESCE(UPPER(REPLACE(tblNotes.NOTE_TXT,'.','')),'')

    AS nvarchar(256)) AS
NOTE_TXT
FROM tlbCustomer LEFT OUTER JOIN
tblNotes
      ON tlbCustomer.CUST_ID = tblNotes.
CUST_ID
GROUP BY tlbCustomer.
CUST_CODE
  , tlbCustomer.
PAY_REF
  , UPPER(REPLACE(tblNotes.NOTE_TXT,'.',''
))
ORDER BY COUNT(*)
DESC
Thursday, 19 August 2010 12:26:16 (W. Europe Daylight Time, UTC+02:00)
# Saturday, 24 July 2010

Sometimes one is confronted with forgetting an option... It all started with a database that had a single data file of 1.8 GB and a transaction log more than 6 times the size of that. Usually that would be a database in full recovery model and no transaction log backups. This database was in simple recovery model though and according to the Disk Usage report, 99.99% of the transaction log was free. So a DBCC SHRINKFILE should solve the problem, however DBCC SHRINKFILE solved nothing, the log remained at about 11 GB.

A query on sys.databases learned that this particular database had ACTIVE_TRANSACTION for log_reuse_wait_desc. A peek in Activity Monitor then showed that there was an open transaction for the database; a DELETE task with a huge wait time and the BROKER_RECEIVE_WAITFOR wait type.

Then it occurred to me this is the database that used to have a deadlock problem. After implementing some procedure fixes, there haven’t been any deadlocks for over a half year and neither has the server (or service) been shutdown in between.

The stored procedure on the queue does a receive, but no timeout associated to it. Effectively this means that the receive will wait inside the transaction, until a messages comes into the queue. That hasn’t happened in the last six months. The resolution is simple, add the TIMEOUT option to the RECEIVE and the procedure will stop when there wasn’t any work to be done for the specified amount of time (in milliseconds). For the code, see the post “Deadlock Alerts Through Event Notification”, which has been updated with the TIMEOUT option.

After altering the procedure, kill the spid. Service Broker will restart, but now with the TIMEOUT option in the procedure and stop after the specified amount of time has elapsed without messages arriving in the queue.

Saturday, 24 July 2010 18:48:00 (W. Europe Daylight Time, UTC+02:00)
# Friday, 26 February 2010

When restoring a backup of a database, SQL Server recreates the files with the same sizes as listed in the backup, or at least tries to do so. The only option you have, is specifying the path for each file. Generally not a problem, but every now and then I encounter a database with a transaction log about 10 to 50 times the size of the data file(s) . . . yep, full recovery model and just daily full backups. I am the first to acknowledge transaction logs backups are key to prevent this situation, but the next best thing still is that recent valid database backup. . . and then you can’t restore it, because the 200 GB to write that big empty file called transaction log cannot be allocated.

Today was one of those days;

Msg 3257, Level 16, State 1, Line 1
There is insufficient free space on disk volume 'D:\' to create the database. The database requires 101197938688 additional free bytes, while only 19386400768 bytes are available.
Msg 3119, Level 16, State 4, Line 1
Problems were identified while planning for the RESTORE statement. Previous messages provide details.
Msg 3013, Level 16, State 1, Line 1
RESTORE DATABASE is terminating abnormally.

Surely I wouldn’t be the only one having problems with this and I found two others on connect;

I voted (as important) for both items, though the second one is already closed. Yet I was triggered by the comments from the MS-people to ask for a specific solution towards the transaction log;

RESTORE DATABASE ... WITH STUB_LOG

One of the habits of RESTORE DATABASE is that it wants to allocate the log size (and files) as found in the backup. Strange, as the size of those files is the result of past activity (not contained in the backup being restored). The only space needed in the transaction log, is that of the activity during the backup. Additional space requirements may exist for subsequent restores of differential backups and log backups (in norecovery or standby) or future activity (after recovery), but that isn’t an issue as the log files would grow again during these restores.

The only real issue I can think of is having a database with fixed size transaction log file(s), which means no auto grow. So for a WITH STUB_LOG option to work properly in these scenarios, it may need to imply that the log file(s) are always set to unrestricted auto grow (possibly only honoring the original increment).

It would at least prevent me from having to allocate 100GB for an empty transaction log on 20GB available.

Friday, 26 February 2010 21:26:02 (W. Europe Standard Time, UTC+01:00)
# Tuesday, 09 February 2010

If you're interested in expanding your skills in SQL Server, check out these events;

Happy learning.

Tuesday, 09 February 2010 10:43:22 (W. Europe Standard Time, UTC+01:00)
# Friday, 04 December 2009

Yesterday I visited the SQL Server day 2009, organized by the Belgian SQL Server User Group SQLUG.BE, in Mechelen. Congratulations on the event guys!

After the keynote by Microsoft Belgium (I wish they had talked a little bit more about SQL Azure), I visited the session by Henk van der Valk on world record ETL... now Henk has control over some top notch iron, but that doesn't mean his tips don't apply to a modest setup. Henk also mentioned he recently joined the blogsphere at www.henkvandervalk.com.

Next I sat (sorry I have to say so) horrible sponsored session by Quest... and this has nothing to do with the FogLight product. On another occasion (an afternoon session by Quest Netherlands) I witnessed the possibilities of Foglight (for .NET, SQL Server and VMware) and I must say it's a good looking product. However we got 30 minutes of boring listing of challenges and day to day problems (as if we weren't aware of them already) and in the end got some screenshots, which were completely out of context. I would have be completely lost in the presentation if I hadn't been to the session earlier by Quest NL.

After that, I meant to sit the session "Reporting Services a DBA's tool? YES!!!", but since the agenda card was a little confusing (or better said, I didn't pay enough attention) I walked into the session by Nico Jacobs on "Handling XML in SQL Server". Funny, as there was nothing new for me in the session but still I really enjoyed it... most important because as a trainer you rarely get an opportunity to see one of your colleagues at work on a familiar subject. Thanks Nico, I really enjoyed it.

The other session I attended was on "Policy Based Management", again by Nico Jacobs. I hoped it would go deeper into the possibilities of implementing it for SQL Server 2005 and SQL Server 2000. Unfortunately that was not the case, so I'll have to dive into the Enterprise Policy Management Framework and PowerShell without a quick start. But again, it was a joy listening to Nico.

Final session and closing keynote was by Chris Webb on PowerPivot (a.k.a. Gemini). It wasn't my first glance at Gemini, but it definitely is the kind of quick start I was looking for. Sitting a session like that saves a day of looking for stuff.

All-in-all, a day well spent.

Friday, 04 December 2009 14:35:00 (W. Europe Standard Time, UTC+01:00)
# Saturday, 21 November 2009

Creating linked servers between two instances of SQL Server may not always be as straight forward as hoped. To make it more complicated, interference may be caused by matters beyond the control of the database administrator; Active Directory. After creating a linked server, being able to successfully use it may be the real challenge, it may even work for one user and fail for another.

Service Account

To communicate successfully in a domain environment, the service account for your database engine is the first to check/configure. Possible choices for the service account are;

  • Domain User (recommended)
  • Local User
  • LocalSystem
  • LocalService
  • NetworkService (recommended)

From those options, a local user or LocalService for the service account will eventually stop you. In both cases a SQL Server has the permission set of user on the machine where SQL Server operates, with no identity on the network. In case the SQL Server has to connect to any remote resource, it will do so as an anonymous user (null session). Unless you want to be absolutely sure that SQL Server should never do anything beyond anonymous on the network, you can go this route... but hey, you're reading about linked servers, so just drop the thought of using a local user or LocalService.

Valid options to enable SQL Server to operate on the network are LocalSystem, NetworkService and Domain User. All have the potential to work well, but LocalSystem is not recommended from a security perspective. LocalSystem has the permission set of an administrator on the system where SQL Server operates, anyone obtaining control over your SQL Server has full access to the underlying operating system (think about the damage that could be done when xp_cmdshell is enabled). From a network perspective, LocalSystem and NetworkService are identical; when the SQL Server operates over the network, it will do so under the credentials of the Windows machine. With a domain user as service account, SQL Server has user permissions locally (as is the case with NetworkService, LocalService and a local user), on the network SQL Server will present the credentials of the domain user.

So two good options remain; NetworkService and a domain user. Which one you chose mainly depends on the advantages of one over the other;

  • NetworkService:
    • No account or password management in Active Directory
    • Service Principal Names are automatically generated
  • Domain User
    • Multiple instances on different machines can have the same identity which is required for clustering.
    • Transparent security for Database Mirroring. 
    • Usually SQL Agent will be the same account, this makes things easier with Log Shipping and Replication. ChangeServiceAccount
    • Improved access to Service Master Key (in case you don't have a backup of the key itself). -->

Personally, I favor using a domain user.

Should you find that you need to change the Service Account; use the "SQL Server Configuration Manager"

Configuring and using a SQL Server linked server for Windows Authentication

NewLinkedServer1There are two ways to go about the configuration; graphically and via stored procedure. The graphical approach brings up a dialog, where just a little configuration is needed.

On the General page, select the server type to be SQL Server (choosing SQL Server Native or OLEDB drivers will work too, but require extra configuration) and type the name of the target instance using the flat name or FQDN followed by a backslash and instance name in case of a named instance.

On the Security page, select the option to "Be made using the login's current security context", which effectively tells SQL Server to impersonate or delegate the login requesting access though the linked server.

On the Server Options page, verify that Data Access is set to True. To see the stored procedures involved, either use the Script button on top of the dialog instead of pressing ok or right-click a configured linked server and "Script Linked Server as >".

To avoid running into (later explained) problems when creating linked servers, create the linked server from a shell (interactive or remote desktop) on the Windows machine where the linked server is configured.

To use the linked server, issue the query

SELECT TOP 50 * FROM [HOST.NET.LOCAL\INSTANCE].AdventureWorks.Sales.SalesOrderDetail

Note that the linked server HOST.NET.LOCAL\INSTANCE is enclosed in braces, which is required when the server name is fully qualified and/or when it is a named instance. If you don't want to expose host and/or instance names, consider creating an alias in "SQL Server Configuration Manager" under client configuration and point the linked server to the alias.

Impersonation and delegation

During the configuration (Security page) impersonation and delegation were mentioned. Also to avoid problems while configuring a linked server, the best way to do so was on the server where the linked server was to be created. The reason to do so is to avoid delegation. Below are a summary of impersonation and delegation (and links to a more complete overview). Windows accounts will be impersonated if necessary, Windows accounts will not be delegated unless both account and delegating system are configured to do so.

Impersonation happens when a process connects to resources pretending to have the identity of it's caller. Like when a user Bobbie is connected to SQLINST1 and issues a query for a table via a linked server on SQLINST2. Depending on the configuration of the linked server, SQLINST1 will pretend to be Bobbie when connecting on Bobbies behalf to SQLINST2. For a more detailed overview of impersonation, check the Pluralsight-wiki on impersonation. Note that impersonation is limited to the  system where process making the call is active.

Delegation is much like impersonation, with the additional feature that delegation is not limited to the system of the process making the call. Because the process now has to forward the credentials to another system and prove to operate on behalf of someone else than the system itself, the Kerberos protocol is required (that or knowledge of the password of the delegated user). For a more detailed overview of delegation, check the Pluralsight-wiki on delegation.

UserDelegationPropertyFrom the perspective of Windows systems, Bobbie (DOMAIN\Bobbie) is logged in on a typical client system (Windows XP, Vista or 7) and uses an application that connects to SERVER1\SQLINST1. In SQLINST1 two linked servers are configured; SERVER1 (being the default instance on SERVER1) and SERVER2\SQLINST1. The context of the application is DOMAIN\Bobbie and thus an access token for DOMAIN\Bobbie is handed to SERVER1 (for logging in on the SQL Server service SERVER1\SQLINST1). When the application issues a query for an object on the default instance of SERVER1, one local process on SERVER1 has to impersonate DOMAIN\Bobbie to the other local process on SERVER1. This will succeed and can work both with NTLM and Kerberos, because SERVER1 trusts itself and the domain; DOMAIN\Bobbie was authenticated by the domain and is only used by SERVER1 locally.

Next the application calls for SERVER2 and now the challenge is on! Bobbie get's a

Msg 18456, Level 14, State 1, Line 1
Login failed for user 'NT AUTHORITY\ANONYMOUS LOGON'.

SystemDelegationPropertySERVER2 does not trust SERVER1 and therefor will not accept the claim of SERVER1 that it acts on Bobbies behalf, SERVER2 wants to see PROOF! One way of proof would be if SERVER1 had the password of Bobbie, indicating that Bobbie fully trusts SERVER1... but Lesson number 1; never give anyone your password. Lesson number 2; don't write it on a Post-It... Lesson number 3; don't trust computers that violate lessons 1 or 2. The other option is to turn to a mutually trusted third party: the domain, but now the configuration needs to be right for Kerberos.

For Kerberos to do it's trick, the user (Domain\Bobbie) and the machine delegating (SERVER1) the user have to be trusted for delegation. This is sufficient if the SQL Server services on SERVER1\SQLINST1 and SERVER2 run as NetworkService.

In case SERVER1\SQLINST1 is running under a domain user account, trust that user account for delegation, not the server.

Service Principal Name

Kerberos distinguishes three roles;

  • Key Distribution Center (KDC), in Active Directory this role is held by the Domain Controllers.
  • Client, these are the users.
  • Server, these are the computers in your network.

With this basic division of roles, things work well in a domain, until a service account like that of the SQL Server database engine is a domain user. If so, that particular domain user should be made a server too. To mark an account as a service, a Service Principal Name must be registered. With NetworkService, this was done by the server (because servers can write SPN's on their own behalf), for any other account, by default only Domain Administrators can write an SPN. Use the SetSPN tool to mark an account as Service Principal Name, SetSPN can be downloaded, is part of the Windows Server 2003 Support Tools and is installed on Windows Server 2008 with the AD DS role.

To register an SPN for a service account, use the following syntax:

SetSPN -a MsSqlSvc/server1.domain.local:1433 domain\user

SetSPN -a MsSqlSvc/server1:1433 domain\user

Substitute the blue parts in the syntax for the server names (FQDN and NetBIOS), the port where SQL Server listens (this could be different from 1433, especially when registering an SPN for additional instances) and the domain user account used as SQL Server service account.

When running more SQL Server instances under the same domain account, it may be useful to check the approach listed in Step 3 of How to use Kerberos authentication in SQL Server, so the AD-people have to be called upon only once for the service account, not for every instance installation.

Reference Environment

ReferenceEnvironment As reference environment (may be a layout for a test-network), the following machines, users and services are used in a single domain configuration;

  • Windows Server 2003 or 2008: DC1 (domain controller), SERVER1 (member), SERVER2 (member)
  • Windows XP, Vista or 7: CLIENT (member)
  • SQL Server 2005 or 2008: SERVER1 (default instance) tcp 1433 static, SERVER1\SQLINST1 tcp dynamic (named instance), SERVER2\SQLINST1 (named instance) tcp 1433 static

To make it work, user Domain\Bobbie has to be trusted for delegation in all four scenario's. Depending on the service accounts for SERVER1\SQLINST1 and SERVER2\SQLINST2, use the following settings

Service account for SERVER1\SQLINST1 Service account for SERVER2\SQLINST1 Users and or computers to be trusted for delegation SPN's to create
NetworkService NetworkService SERVER1
DOMAIN\Bobbie
<none>
NetworkService DOMAIN\SRV_SQL2 SERVER1
DOMAIN\Bobbie
MSSQLSvc/server2.domain.local:1433 DOMAIN\SRV_SQL2
DOMAIN\SRV_SQL1 NetworkService DOMAIN\SRV_SQL1
DOMAIN\Bobbie
<none>
DOMAIN\SRV_SQL1 DOMAIN\SRV_SQL2 DOMAIN\SRV_SQL1
DOMAIN\Bobbie
MSSQLSvc/server2.domain.local:1433 DOMAIN\SRV_SQL2

Things to consider

AdjustKerberosIntervals When working with Kerberos, tickets are granted for a longer period of time (typically 10 hours), so configuration changes usually take a longer time to apply. Use the KERBTRAY and/or KLIST resource kit utilities to view and purge tickets, in order to speed things up (avoids rebooting systems or logging off/on users). Alternatively, consider shortening the ticket lifetimes to the minimum; 10 minutes for services and 1 hour for users.

When configuring the linked servers, it was best to do so at the server where the linked server is configured from. Depending on your role you may be a domain admin, so should domain admins be delegated in order to make setting up linked servers easier? Rather not! Imagine someone installed malicious program or script on a system that is trusted for delegation and somehow a domain administrator executes this program or script. It could hurt any system or user in your network.

Saturday, 21 November 2009 20:56:50 (W. Europe Standard Time, UTC+01:00)
# Wednesday, 22 July 2009

Sometimes SQL Server gets it wrong, UNIQUE constraints and NULL is one of the parts where it is wrong. People call it a bug and request for it to be fixed, Microsoft calls requests to fix it common and offers workarounds and overall current SQL Server behavior does not adhere to the standard (ISO-9075).

To see the bug reports (and vote); 126533, 299229, 311223 (closed), 387273. Basically they are each others duplicates and from those, the request 299229 has most attention and the best discussion.

Available workarounds;

  • Filtered index (starting with SQL Server 2008)
  • Indexed view (starting with SQL Server 2000)
  • Triggers (not recommended as the check is after the modification is completed and if invalid has to perform a rollback)

... and don't say Microsoft didn't get it right because it's Microsoft... Access implements UNIQUE constraints correctly.

Wednesday, 22 July 2009 15:00:57 (W. Europe Daylight Time, UTC+02:00)
# Friday, 15 May 2009

Microsoft launched a new preview site for SQL Server 2008 R2. You can sign up for notifications about the releases of CTP's.

From an administrative viewpoint, focus seems at tools to manage, utilize and consolidate instances and machines.

Towards usability (mainly business intelligence), enhancements to integrate with Office 2010 (mainly Excel and SharePoint) are featured. Also further integration with Virtual Earth is shown.

Finally there is mention about something called Master Data Services and the Stewardship Portal. In concept a very powerful way to create and maintain a data warehouse bus.

Friday, 15 May 2009 13:30:57 (W. Europe Daylight Time, UTC+02:00)
# Tuesday, 24 March 2009

Noticed this one on Born to Learn... Microsoft has two sessions today/tomorrow on preparing for the exam 70-432

Tuesday, March 24  7:30 A.M. Pacific Time   (What time is this in my region? 2009-03-24 15:30 CET)

Tuesday, March 24  5:30 P.M. Pacific Time   (What time is this in my region? 2009-03-25 01:30 CET)

It slipped past me too, but since I can manage to squeeze the first session in my schedule, I'll attend. Follow up and link to the recording to be posted later.

My impressions of this sessions is that Rob Boek takes you through the main areas of interest for the exam, at least it aligned quite well with what I can remember from my beta-test. Good stuff if you need an impression of the most important topics of the exam to prepare for.

Updated 2009-04-02

Technorati tags: ,
Tuesday, 24 March 2009 09:46:13 (W. Europe Standard Time, UTC+01:00)
# Monday, 02 March 2009

Updated 2010-07-24; added timeout to the receive command in the stored procedure. For more info, see Service Broker can make your transaction log big.

Event Notifications are based on Service Broker. This brings an advantage and a disadvantage; enhanced scalability at the cost of increased complexity (and lack of a User Interface, so we have to take it to T-SQL code). Lets take it there step by step for SQL Server 2005 and 2008;

  1. Some code to cause deadlocks.
  2. See how the deadlocks look like using the Trace Flags.
  3. See how the deadlocks look like in Profiler.
  4. Use the Event Notification on the deadlocks.
  5. Sending out the mail.

The code shown as part of this article is also available at this link. To code to create the database for the deadlocks is available at this link. The code works with SQL Server 2005 SP2 and SQL Server 2008

1. Some code to cause deadlocks.

To get the code to cause the deadlocks, click this link. Run the code from SQL Server Management Studio (SSMS) to create the DeadLockSample database. Next create two additional connections (New Query) to the DeadLockSample database, execute the procedure procLockedTran1 on one connection and procLockedTran2 on the other connection. There is a 5 seconds delay in the two procedures between locking the first and second resource. So if you execute the two procedures within 5 seconds, you get your deadlock. You could also change the delay in the procedures if you think 5 seconds is to short/long.

Normally a process would be terminated when it is the deadlock victim. For the procedures created by the script, the error is handled , so the process that is the deadlock victim just selects it is the victim and the connection will not be terminated.

If you want to create a deadlock in any of the following scenarios, just run procLockedTran1 and procLockedTran2 on two separate connections simultaneously.

2. See how the deadlocks look like using the Trace Flags.

TraceFlagConfiguration

To get the deadlocks recorded in the SQL Server Error Log, either set the trace flag as startup parameter in SQL Server Configuration Manager (see picture) which will cause the trace flag to be active each time you start your SQL Server (until you remove the trace flag again), or issue a DBCC TRACEON (1222,-1) command which sets the trace flag until you stop SQL Server or issue a DBCC TRACEOFF (1222,-1).

With the trace flag active, you can now see the deadlock appear in you SQL Server Error Log.

3. See how the deadlocks look like in Profiler.

... or in SSMS for that matter. With the SQL Server tools it is possible to display deadlock graphs; XML documents (with .xdl extension) describing the deadlock information of one or more deadlocks. When the deadlock graph event (EventClass 148) with the TextData column is included in the trace definition, the XML to create the deadlock graph is captured (both through the Profiler graphical tool and the sp_trace procedures) and the graph will automatically be displayed when opened in Profiler. Also the graphical profiler has the option to write deadlock graphs directly to a file. Note that the graphical presentation of the deadlock is limited compared to the information in the .xdl document. The information in the .xdl document is the same information as the information written to the SQL Server Error Log with trace flag 1222. One extra comment if you filter your trace on database, you won't get the deadlock graph, as noticed by Kalen Delaney.

4. Use the Event Notification on the deadlocks.

Event Notification lets you capture events that can occur on server, database or broker queue, for a list of events, query sys.event_notification_event_types. For this purpose, the event type 1148, DEADLOCK_GRAPH is of interest. Also note that the DEADLOCK_GRAPH is a trace event, which can only be captured at server level.

The second thing needed is a broker SERVICE to send the event to. Since we'll be handling the events at server level, the SERVICE (and QUEUE) will be created in the msdb database. There is no need to create the MESSAGE TYPEs  or CONTRACTs as they already exist for Event Notification.

USE msdb
GO
-- Create a queue to receive messages.
CREATE QUEUE queDeadLock_Graph
;
GO
-- Create a service on the queue that references
-- the event notifications contract.
CREATE SERVICE svcDeadLock_Graph
ON QUEUE queDeadLock_Graph
([http://schemas.microsoft.com/SQL/Notifications/PostEventNotification]
);
GO
-- Now query the sys.databases for the service_broker_guid of the msdb database.

SELECT service_broker_guid FROM sys.databases WHERE name = 'msdb';

The result of the query for the Service Broker GUID for the msdb databases is needed to point to the right database for the service.

-- Create the event notification.
CREATE EVENT NOTIFICATION evnDeadLock_Graph
ON SERVER
FOR DEADLOCK_GRAPH
TO SERVICE
'svcDeadLock_Graph'
, '2A7C5300-7C64-43F5-AB10-5B079D07678E'; -- the GUID for msdb goes here.

You can look at the content of the queue through SELECT * FROM msdb.dbo.queDeadLock_Graph, as a queue in fact is a table. Next step will be processing the messages that get into the queue and more specifically get the deadlock details out of the message_body. The message body looks binary, but if you query it like this;

SELECT CAST(message_body AS xml) FROM msdb.dbo.queDeadLock_Graph

you'll notice that it is actually XML, where the textdata node holds the xdl structure of the deadlock graph.

5. Sending out the mail

Okay, we're in Service Broker land by now. Event Notification acts as the Initiator Service and the svcDeadLock_Graph is the Target Service. In this scenario, the only thing left to be done is specify how the queue should be processed. The processing shall be done by a stored procedure that will be tied to the queue as the activation program. Once something comes in to the queue, this stored procedure will be invoked... kind of like a trigger for insert on a normal table. The specification for the stored procedure will be:

  1. Extract the deadlock graph from the message_body and store it in a variable.
  2. Extract the datetime of the deadlock graph notification and store it in a variable.
  3. Store the datetime and deadlock graph in a table.
  4. Send the datetime as part of the message and deadlock graph as attachment in an email. This does require database mail to be set up, if you haven't, an error will be logged in the table.

The code to create the table to hold the deadlock information (and the reason why no mail was sent).

USE msdb
GO
CREATE
TABLE dbo.tblDeadLock_Log
(
DeadLock_ID int IDENTITY(1,1) CONSTRAINT pk_tblDeadLock_Log PRIMARY
KEY
, DeadLock_Detected
datetime
, DeadLock_Graph
xml
, NoMailReason nvarchar(2048
))
GO

The stored procedure to be used as the activation program on the queue, this is where the real work is done. Alter the @profile_name and @recipients in at sp_send_dbmail to match the setup of your dbmail setup.

CREATE PROCEDURE dbo.procReceiveDeadLock_Graph
AS
DECLARE
@conversation_handle
uniqueidentifier
DECLARE
@message_body
xml
DECLARE
@message_type_name nvarchar(128
)
DECLARE @deadlock_graph
xml
DECLARE
@event_datetime
datetime
DECLARE
@deadlock_id
int
BEGIN
TRY
BEGIN
TRAN
WAITFOR
(
RECEIVE TOP(1) @conversation_handle =
conversation_handle
, @message_body = CAST(message_body AS xml
)
,
@message_type_name = message_type_name
FROM dbo.queDeadLock_Graph)

, TIMEOUT 10000
-- Line added 2010-07-24;
-- /ssb/2010/07/24/ServiceBrokerCanMakeYourTransactionLogBig.aspx

-- Validate message
IF (@message_type_name = 'http://schemas.microsoft.com/SQL/Notifications/EventNotification'
AND
@message_body.exist('(/EVENT_INSTANCE/TextData/deadlock-list)') = 1
)
BEGIN
-- Extract the info from the message
SELECT @deadlock_graph = @message_body.query('(/EVENT_INSTANCE/TextData/deadlock-list)'
)
,
@event_datetime = @message_body.value('(/EVENT_INSTANCE/PostTime)[1]','datetime'
)
-- Put the info in the table
INSERT dbo.tblDeadLock_Log (DeadLock_Detected, DeadLock_Graph
)
VALUES (@event_datetime, @deadlock_graph
)
SELECT @deadlock_id = SCOPE_IDENTITY
()
-- Send deadlock alert mail.
-- Requires configured database mail, will log an error if not (or anything else goes wrong).
BEGIN
TRY
DECLARE @subj nvarchar(255), @bdy nvarchar(max), @qry nvarchar(max), @attfn nvarchar(255
)
SELECT @subj = 'A deadlock occurred on ' +
@@SERVERNAME
, @bdy = 'A deadlock occurred at ' + CONVERT(varchar(50),@event_datetime, 120) + ' on SQL Server: ' + @@SERVERNAME +
'. See attached xdl-file for deadlock details.'
, @qry = 'SET NOCOUNT ON; SELECT deadlock_graph FROM msdb.dbo.tblDeadLock_Log WITH (READUNCOMMITTED) WHERE DeadLock_ID = ' + CAST(@deadlock_id AS varchar(10
)) -- Locking hint is to prevent this dynamic query to be blocked by the lock held by the insert. The dynamic SQL will not come from inside this transaction.
, @attfn = @@SERVERNAME + '_' + CAST(@deadlock_id AS varchar(10)) +
'.xdl'
EXEC sp_send_dbmail @profile_name =
'default'
, @recipients =
'administrator@localhost'
, @subject = @subj
, @body = @bdy
, @query = @qry
, @attach_query_result_as_file = 1
, @query_attachment_filename = @attfn
-- http://support.microsoft.com/kb/924345
, @query_result_header = 0
, @query_result_width = 32767
, @query_no_truncate = 1
END
TRY
BEGIN
CATCH
UPDATE dbo.tblDeadLock_Log
SET NoMailReason = ERROR_MESSAGE
()
WHERE DeadLock_ID = @deadlock_id
END
CATCH
END
ELSE -- Not an event notification with deadlock-list
END
CONVERSATION @conversation_handle
COMMIT
TRAN
END
TRY
BEGIN
CATCH
ROLLBACK
TRAN
END
CATCH
GO

Finally to tie the stored procedure to the queue so incoming notifications will be processed.

ALTER QUEUE dbo.queDeadLock_Graph
WITH
STATUS
= ON
,
ACTIVATION
(
PROCEDURE_NAME = msdb.dbo.procReceiveDeadLock_Graph
,
STATUS = ON
,
MAX_QUEUE_READERS = 1
,
EXECUTE AS OWNER
)
GO

Now try the procedures procLockedTran1 and procLockedTran2 and see the results in the msdb.dbo.tblDeadLock_Log and in your mail (if configured).

Monday, 02 March 2009 12:30:01 (W. Europe Standard Time, UTC+01:00)
# Friday, 27 February 2009

Okay, here's the deal; deadlocks sometimes occur and you want to be notified as they happen.

So define an SQL Agent Alert on error 1205 to send you an e-mail notification... partially, but it won't work as error 1205 isn't logged.

How to get the error logged; either set the severity level to 19 or above when the message is created (does not apply to 1205) or flag the message to be logged to event log, which can be done using sp_altermessage... at least this was the way to do it with SQL Server 2000 but one of the changes for SQL Server 2005 says:

sp_altermessage cannot be used to change the logging behavior of system messages (messages with Message ID < 50000). To audit system messages, use SQL Trace and the User Error Message Event Class. For more information, see Introducing SQL Trace.

For SQL Server 2005 it used to be a dead end, but with as of Service Pack 3 it is again possible to alter sys.messages < 50000 so they are written to the Windows NT Application Event Log. Specifically for deadlocks you'd run:

EXEC sp_altermessage 1205, 'WITH_LOG', 'true'

but it would apply to other system events you want logged too.

Update 2009-04-08; issue fixed in Service Pack 1 for SQL Server 2008. For SQL Server 2008 (up until Cumulative Update 3) it is still impossible to alter sys.messages < 50000. Though it was mentioned (2008-07-28) at connect in FeedBackID=294122 that this issue would be fixed for SQL Server 2008 too, so far (2009-02-27) it isn't.

Besides the fact that the alert option is still impossible with SQL Server 2008, it can only tell you a deadlock occurred. Should you need more information on the deadlock however, your next option would be TRACE flags 1204 or 1222, the difference between the two of them being the amount of detail. Trace flag 1222 will be use during the post as it provides the most detail. Setting the trace flags writes the deadlock information to the SQL Server Error Log, but not to the Windows NT Application Event Log... so there's one two (and a half) option remaining:

Process the Error Log with the undocumented sp_readerrorlog as part of the alert response... but that won't work for SQL Server 2008 (yet) and you have a better option for SQL Server 2005. For more info see MSSQLTips.

SQL Server 2005 & 2008: Event Notification

SQL Server 2005 & 2008: WMI Alert, see MSDN

So next post will be on getting your deadlock info through Event Notification.

Friday, 27 February 2009 22:47:32 (W. Europe Standard Time, UTC+01:00)
# Thursday, 26 February 2009

As I blogged before, relog is quite useful. The syntax examples on the TechNet page however are pretty much useless if you want to go to SQL Server. So let's look at a few scenarios and the syntax to make them work, but before that, let's get the DSN and the counter data file.

Configure the Data Source Name to the SQL Server as System DSN based on the SQL Server driver (SQLSRV32.DLL)... Native Client does NOT work. The name of the DSN in the syntax samples will be PerfDB.

Next is the file with performance data. Relog will detect the format from its internal structure (if it is a valid counter log file), so you do not have to specify if your file is comma separated (.csv), tab separated (.tsv) or binary (.blg). Since binary is the most practical format for large amounts of data, the file for the syntax examples will be c:\my perflogs\p_log01.blg (and consecutive numbers for any next file).

One final comment before going to the scenarios; relog creates three tables in the SQL Server database targeted by the DSN (if they do not already exist). These tables are;relog_schema

  • dbo.CounterData (holds the actual values of the counter)
  • dbo.CounterDetails (holds the machine, object, counter and, if applicable, the instance)
  • dbo.DisplayToID (holds information on the perfmon data that was loaded)

Senario 1: Load all counters into the database

relog "c:\my perflogs\p_log01.blg" -f SQL -o SQL:PerfDB!1stRun

All clear, except for the blue 1stRun (and the exclamation mark preceding it). The reason is that it is possible to write multiple logs to the same database. Each time log information is written to the database, a new DisplayToID record is created. The name given in blue is the  DisplayString. If the DisplayString is used before, the data will be added under the same GUID.

Scenario 2: Load all counters between begin and end datetime into the database

relog "c:\my perflogs\p_log02.blg" -f SQL -o SQL:PerfDB!2ndRun -b 26-02-2009 10:00:00 -e 26-02-2009 10:30:00

TechNet says the format for relog date times should be M/d/yyyy hh:mm:ss, which is the internal format of the log files is M/d/yyyy hh:mm:ss.sss, minus the milliseconds. In reality, relog looks at the regional settings, including the customizations you did to the regional settings! The string in the sample is valid for the regional settings Dutch (Netherlands), so dd-mm-yyyy HH:mm:ss. Best way to find out what format relog expects is to run relog /?.

Together with the previous issue of the char(24) storage of a binary converted datetime string, this regional settings dependency is horrible handling of datetime. For globalization support it would be great if relog was given an extra switch to indicate that datetime strings are in ISO 8601 or ODBC format, independent of the regional setting.

Scenario 3: Load a limited set of counters into the database

relog "c:\my perflogs\p_log03.blg" -q -o c:\counters.txt

Edit the c:\counters.txt file to only include the counters to be written to the database.

relog "c:\my perflogs\p_log03.blg" -cf c:\counters.txt -f SQL -o SQL:PerfDB!"select set of counters from the 3rd run"

It is possible to combine scenarios 2 and 3 to load a limited set of counters between two datetimes. Also, if you want spaces in the DisplayString, it can be done with the double quotes as shown in this example.

Thursday, 26 February 2009 15:42:13 (W. Europe Standard Time, UTC+01:00)
# Saturday, 31 January 2009

A common question for students during/after attending a Microsoft training: "Where do we get the virtual PC images we used during course?"

The answer is, you don't... Microsoft provides these images for classroom use only by Certified Partners for Learning Solutions and Microsoft Certified Trainers. Quite understandable, as these images, contain a lot of software. However, the students question is valid too, for practice and exam preparation. And you can get about the same experience you had in class, based on Microsoft's "Run IT on a Virtual Hard Disk" program. Run IT on a Virtual Hard Disk allows you to download and use a fully installed evaluation version of an installed product. So here is how you can build your own VHD for the 2779 or 2780 courses.

  1. Your PC; I recommend you use a PC with at least 1.5 GB of RAM and Windows XP.
  2. Virtual PC; download and install Virtual PC 2007 (if you want, you can use Virtual Server 2005 R2 instead). For the download and more information see the Microsoft Virtual PC site.
  3. SQL Server VHD; download the 4 files image files for the SQL Server 2005 Evaluation VHD and unpack the VHD. SQL Server 2005 is currently not listed on the Run IT on a Virtual Hard Disk site.
  4. SQL Server installation media; download the Evaluation Edition of SQL Server 2005 (180-day version), requires Windows Live ID. Some labs/practices/demonstrations require multiple instances, these are installed on the 2779 and 2780 images, but not on the Evaluation VHD. So you may need to install the SQLINSTANCE2 and SQLINSTANCE3.
  5. SQL Service Pack; the VHD for SQL Server Evaluation has no Service Pack for SQL Server applied, whereas the MOC courses 2779 and 2780 are based on SQL Server 2005 SP1. Links to the SQL Server 2005 Service Packs:
    1. SQL Server 2005 Service Pack 1
    2. SQL Server 2005 Service Pack 2
    3. SQL Server 2005 Service Pack 3
  6. Create a Virtual machine in Virtual PC based on the downloaded VHD. Set the amount of memory to at least 1024 MB, also enable undo disks.
  7. Start and log in to the Virtual PC guest, you will need the administrator password Evaluation1. You will notice that the Windows Server 2003 operating system is not activated, therefor you only have a limited period for evaluation.
  8. From the student CD that came with your courseware, run Allfiles.exe. This will extract all files required by the practices, labs and demonstrations. Note that the setup will be different from what you were used to during the course. The files in the course were on a separate VHD for each module, which was mounted as D:\. After unpacking Allfiles.exe the whole course is in one folder tree (typically C:\Program Files\Microsoft Learning\27xx\). Note that you may have to compensate for paths and server names; so when you are in 2780 module 4, a path D:\democode\SalesCert.cer should be changes to C:\Program Files\Microsoft Learning\2780\Mod04\SalesCert.cer. Likewise the name of the server is different too, so MIAMI should be changed to WIN2K3R2EE.

Tips about downloading and file interaction between the Virtual PC host, the Guest and the Internet.

  1. When you have a ISO-file on your host, you can mount this ISO as CD/DVD in the guest. You can also instruct Virtual PC guest to use the CD/DVD drive from the host.
  2. When you have normal files on your host, you can use the Virtual PC Shared Folders feature; this exposes a folder on the host as a network drive on the guest.
  3. You can use the Networking feature of Virtual PC to use Shared networking (NAT) or your hosts Network adapter to allow access to the network and to Internet, so you can download files directly into your Virtual PC guest.

Main differences between the MOC and Eval VHD's

  MOC Eval
Server name MIAMI Win2k3R2EE
SQL Server Edition Developer SP1 Enterprise no SP
Instances [default]
SQLINSTANCE2
SQLINSTANCE3
[default]
SQL Service Account MIAMI\SQLServer [LocalSystem]
Password Pa$$w0rd Evaluation1
Course files One VHD per module All files in a folder tree, paths have to be checked/changes.
Saturday, 31 January 2009 18:53:43 (W. Europe Standard Time, UTC+01:00)
# Monday, 24 November 2008

For those of you who are not familiar with the tool Relog; it is part of Windows and allows you to reprocess System Monitor logs. This is quite useful, as logging counters to the binary format (.blg) is very space efficient. However to analyze those counters, you may want to use SQL Server. Relog allows to rewrite the log and also creates the tables for you on SQL Server, if not yet present (syntax samples). The unfortunate part is in the CounterDateTime column of the CounterData table, despite the column name, the date type is char(24). When trying to convert the char(24) to a datetime, this throws the error;

Msg 241, Level 16, State 1, Line 2
Conversion failed when converting datetime from character string.

I was getting a bit annoyed by this error, as a conversion from a char to a datetime should succeed if the string is valid... and say 2008-09-30 12:10:15.322 looked valid. I even checked Tibors ultimate datetime guide to see if I missed something, because I was unsure of the impact of the last byte (a valid string yyyy-mm-dd HH:mi:ss.sss actually is 23 characters). Also when trying to insert a string with a valid datetime in a variable, the conversion succeeds. Next I turned my attention to how the data was stored in the CounterDateTime column, by inserting a valid datetime string and retrieving one that was inserted by Relog (both returning the string (char(24)) and its binary representation varbinary(24)).

Source Character representation Binary representation
Relog insert 2008-09-30 12:10:15.322 0x323030382D30392D33302031323A31303A31352E33323200
Manual insert 2008-09-30 12:10:15.322 0x323030382D30392D33302031323A31303A31352E33323220

The difference is in the last byte of the binary representation, the datetime string written by relog isn't padded with spaces as one would expect to happen for the unused positions in a char. Instead it is zeroed as one would expect for unused bytes in a binary string. To get a datetime representation of the CounterDateTime a double conversion is needed;

CAST(SUBSTRING([CounterDateTime],1,23) AS datetime)

Now for the "nasty" in Relogs database schema:

  • Changing the CounterDateTime to datetime is a no-go. Relog does not want it.
  • Adding a computed column with the conversions above is a no-go. Relog does not have all columns bound.

What remains is creating a view on top of the CounterData table with this conversion included.

Another issue that may save you some time, the DSN you create to access the SQL Server should be the "SQL Server" provider, not the "SQL Native Client".

Monday, 24 November 2008 20:48:37 (W. Europe Standard Time, UTC+01:00)
# Wednesday, 08 October 2008

A while back, I wrote about using sp_ procedures in master to create a toolbox, which works fine for me... until recently I hit a strange problem. I developed a procedure that would read the definition of a table and would create a change_log table and trigger to populate that table. On my dev-environment (Windows Vista x64, SQL Server 2005 Dev Ed x64 SP2) this worked like I expected. But when I tested the procedure on a different server, it didn't!?! A very little sample that touches the essence of the problem I witnessed;

use master
go
if
object_id('sp_test','p') is not
null
drop procedure sp_test
go
create
procedure sp_test
as
select
db_name() + '.' + table_schema + '.' + table_name
from information_schema.
tables
order by table_schema, table_name
go
exec
sp_test
go
use
adventureworks
go
exec
sp_test
go
use
master
go
drop
procedure sp_test
go

Now on my dev machine, this once listed all tables in master and once all tables in AdventureWorks, as intended. But on other servers, it returned the tables from master on both executions of sp_test (the db_name function is executed correctly in both cases; once master, once AdventureWorks). For some reason, when referencing objects, the stored procedure stayed to master. I was puzzled, but before crying BUG (out loud) I tried the newsgroups and a prompt reply from SQL Server MVP Roy Harvey pointed me to the undocumented stored procedure sp_MS_marksystemobject. Simply execute;

exec sp_MS_marksystemobject sp_test

and you won't suffer from the inconsistent behavior I witnessed.

Technorati tags: ,
Wednesday, 08 October 2008 17:51:28 (W. Europe Daylight Time, UTC+02:00)

Just checked the Prometric site and the status for my 70-432 (71-432) and 70-448 (71-448) changed from tested to passed ;-).

Technorati tags: , ,
Wednesday, 08 October 2008 10:55:23 (W. Europe Daylight Time, UTC+02:00)
# Friday, 08 August 2008

Yesterday evening I got into a fight with the SSIS XML Source Data Flow Source. It actually was the first time I used this Data Flow Source. I had no expectations other than, point to your XML-file and get the data. Wrong... I did point to an XML-file, generate a schema and... no columns! I ended up with the error:

Validation error. Data Flow Task: XML Source [000]: A component does not have any inputs or outputs.

In cases like this, Google should be your friend... well I read a lot about SSIS in general, some even about the XML Source but nothing to provide me with answers or even help me out. It was after reading a post by Oren Eini that I decided I was on the wrong trouble shooting track and a good night sleep would be the best next step.

This morning I started with clear vision and an open mind. No answers through Google, nothing useful on Connect, so I tried if I could reproduce my problem with an other document. The document I created was of a very simple structure;

<people>
  <person>
    <firstname>Jan</firstname>
    <lastname>Aerts</lastname>
  </person>
  <person>
    <firstname>Anne</firstname>
    <lastname>Mulders</lastname>
  </person>
</people>

It worked! But now I had to find out why the document above worked, and the other one didn't. Again I read about SSIS in general and a little something about the XML Source. In particular (SSIS in general), I stumbled upon a post by Jamie Thomson, sounding familiar and one about the XML Source I wish I came across earlier: Using XML Source by Matt Masson. I could already agree with his opening comment, especially the various degrees of success. While reading Matt's article I had this feeling my XML document might actually be to simple... it occurred to me that the XML Source was not just going to read XML, it was trying to represent the XML as one or more tables.

A very simple representation of my original document is;

  <person id="1">
    <firstname>Jan</firstname>
    <lastname>Aerts</lastname>
  </person>
<row column1="value">
  <column2>value</column2>
  <column3>value</column3>
</row>

The simplest representations Matt used, are:

<rootgoo>
  <goo>
    <subgoo>value</subgoo>
    <moregoo>1</moregoo>
  </goo>
  <goo>
    <subgoo>value</subgoo>
    <moregoo>2</moregoo>
  </goo>
</rootgoo>
<table>
  <row>
    <column1>value</column1>
    <column2>value</column2>
  </row>
  <row>
    <column1>value</column1>
    <column2>value</column2>
  </row>
</table>

AND

<root>
  <row CustomerID="1" TerritoryID="1" AccountNumber="AW00000001" />
  <row CustomerID="2" TerritoryID="1" AccountNumber="AW00000002" />
</root>
<table>
  <row column1="value" column2="value" column3="value" />
  <row column1="value" column2="value" column3="value" />
</table>

So my document could never be translated to a table... to get back to Oren's post: If only SSIS had told me so with a clear error or even a dialog in the XML Source, that would have saved me a couple of hours!

Or better, since XML Source tries to get data from the XML, it could do a best effort as wrapping something that looks a single row into a table (and to take it one step simpler, represent a single value as a table with just one row and one column.). If you'd like to see some improvement here too, take a moment to vote on FeedbackID 361057.

On a version note, it happens with SQL Server 2005 (SP2) and SQL Server 2008 (PreRelease).

Friday, 08 August 2008 12:40:45 (W. Europe Daylight Time, UTC+02:00)
# Wednesday, 06 August 2008

REDMOND, Wash. — Aug. 6, 2008 — Microsoft Corp. today announced the release to manufacturing of Microsoft SQL Server 2008 -->

Wednesday, 06 August 2008 20:46:46 (W. Europe Daylight Time, UTC+02:00)
# Saturday, 02 August 2008

As most will know sp_ does not stand for stored procedure, it stands for system stored procedure. But calling your procedure sp_something doesn't make it a system procedure automatically, it just hints the server how to resolve the procedure.

When a procedure that starts with sp_ is called, first the master database is checked if it is a real system stored procedure. The books online shows this behavior by creating a procedure in AdventureWorks called dbo.sp_who. However, since sp_who is a real system stored procedure, the existence of AdventureWorks.dbo.sp_who is always ignored. If the procedure is not a real system stored procedure, the connected database is checked for the existence of the stored procedure. If it is in the database you're currently connected to, it gets executed. If it isn't in the database you're currently connected to, it is retrieved from master (or you receive an error if it isn't there either). You can verify this behavior based on the following code.

USE AdventureWorks
GO
CREATE PROCEDURE sp_sayhello
AS
SELECT 'Hello from AdventureWorks, you are connected to ' + DB_NAME() + '.'
GO
USE master
GO
CREATE PROCEDURE sp_sayhello
AS
SELECT 'Hello from master, you are connected to ' + DB_NAME() + '.'
GO

Now, when executing sp_sayhello while connected to AdventureWorks, it will return;

Hello from AdventureWorks, you are connected to AdventureWorks

With any other database, say msdb, you get the following result.

Hello from master, you are connected to msdb

So there are two reasons why starting your stored procedure name with sp_ isn't smart;

  • Performance; each time the procedure is called, a (futile) lookup is done against the master database.
  • Future; if you have a stored procedure in your database called sp_dosomething and Microsoft implements a system stored procedure sp_dosomething in SQL Server, your application is broken.

There is however one scenario where creating stored procedures with sp_ is smart: When you create it in master as part of your own standardized way of working. Creating your own toolbox so to say. With SQL Server 2005 and 2008 there is an automatic separation, your sp_ procedures are created in the dbo schema by default and the real system stored procedures reside in the sys schema (the actual system stored procedures are in the mssqlsystemresource database).

Your own sp_ procedures and schemas: DON'T!!! It does not work if the schema in master isn't dbo.

USE master
GO
CREATE SCHEMA
toolbox
GO
CREATE PROCEDURE
toolbox.sp_sayhello
AS
SELECT
'Hello from master. You are connected to ' + DB_NAME() + '.'
GO
USE
AdventureWorks
GO
EXEC
sp_sayhello -- Doesn't work
EXEC toolbox.sp_sayhello -- Doesn't work
EXEC master.toolbox.sp_sayhello -- Executes against master, not AdventureWorks.
GO
USE
master
GO
DROP PROCEDURE
toolbox.sp_sayhello
GO
DROP SCHEMA
toolbox
GO

Your own sp_ procedures and non-privileged users: Make sure the login has permissions to execute the procedure from master and that any needed permissions are held in the target database. To illustrate, a login, mapped to a user in AdventureWorks will execute a stored procedure named sp_maketable. To make this work, public (therefor any login through guest, which is appropriate for master) will receive execute permissions on the procedure and create table and alter schema permissions are granted to the user in AdventureWorks. The table is created in the default schema of the user.

USE master
GO
CREATE PROCEDURE sp_maketable
AS
CREATE TABLE tblTest (col1 int)
GO
GRANT EXECUTE ON dbo.sp_maketable TO public -- Make sure permissions allow the user to execute.
GO
CREATE LOGIN np_user WITH PASSWORD = 'secret', DEFAULT_DATABASE = AdventureWorks
GO
USE AdventureWorks
GO
CREATE USER np_user FOR LOGIN np_user WITH DEFAULT_SCHEMA = Sales
GO
GRANT CREATE TABLE TO np_user -- Make sure the user has proper permissions in the database.
GO
GRANT ALTER ON SCHEMA::Sales TO np_user -- Make sure the user has proper permissions in the schema.
GO
EXECUTE AS LOGIN = 'np_user'
GO
SELECT SUSER_SNAME(), USER_NAME() -- Verify it is executing as the user.
GO
EXEC sp_maketable
GO
REVERT

Important stuff when writing your own sp_ 's:

  • BACKUP DATABASE master just became even more important.
  • Double check on which of your own procedures you grant execute permissions.
  • Use a proper naming convention, like including your company name, to avoid naming collision with future Microsoft system stored procedures.
  • If a procedure exists with the same name in one of your databases and you are connected to that database, the local procedure gets executed, not the central one from master.
  • Document.
  • Mark your sp_ as system object with sp_MS_marksystemobject
Technorati tags: ,
Saturday, 02 August 2008 20:37:45 (W. Europe Daylight Time, UTC+02:00)
# Tuesday, 17 June 2008

After running through the prep-guide (looking through a pair of SQL Server 2005 glasses), I identified a couple of topics worth giving a closer look. The topics are derived from the prep-guide, my comments about the topic added in blue italics and the bulleted list refers to (mostly) BOL-resources. This post is based on the prep-guide for 70-432 with published date June 11, 2008

Installing and Configuring SQL Server 2008 (10 percent)

Configure additional SQL Server components.
This objective may include but is not limited to: SQL Server Integration Services (SSIS), SQL Server Analysis Services (SSAS), SQL Server Reporting Services (SSRS), replication. Not that I expect this to be really different from SQL Server 2005, but if your background is just DBA (MCTS/MCITP) it may be your first encounter with the BI-components.

Maintaining SQL Server Instances (13 percent)

Implement the declarative management framework (DMF).
This objective may include but is not limited to: create a policy; verify a policy; schedule a policy compliance check; enforce a policy; create a condition.

Back up a SQL Server environment.
This objective may include but is not limited to: operating system-level concepts. I don't expect a lot of fireworks, but the operating system-level concepts made me curious.

  • Planning for Disaster Recovery Actually, I'm still curious what is meant by operating system-level concepts. This link from BOL is actually my best shot at a document where some broader considerations are presented.

Managing SQL Server Security (15 percent)

Manage transparent data encryption.
This objective may include but is not limited to: impact of transparent data encryption on backups.

Maintaining a SQL Server Database (16 percent)

Back up databases.
This objective may include but is not limited to: full backups; differential backups; transaction log; compressed backups; file and filegroup backups; verifying backup. Only compressed backups is to be classified as new.

Performing Data Management Tasks (14 percent)

Implement data compression.
This objective may include but is not limited to: sparse columns; page/row.

Maintain indexes.
This objective may include but is not limited to: create spatial indexes; create partitioned indexes; clustered and non-clustered indexes; XML indexes; disable and enable indexes; filtered index on sparse columns; indexes with included columns; rebuilding/reorganizing indexes; online/offline. Spatial and filtered indexes on sparse columns are of interest here, along with "is not limited to" which could be indexes on hierarchyid columns.

Optimizing SQL Server Performance (10 percent)

Implement Resource Governor.

Use Performance Studio.

  • Data Collection Entry page, includes How-To
  • Again, Performance Studio, also an MS-Name-Game, what you're really looking for is Data Collection... and trying to get that confirmed, I found this webcast by Bill Ramos (62 minutes).

The rest, well it is all too familiar from SQL Server 2005. Sure, I'll look for some "What's new" resources, but I think the above pretty much covers what I need to familiarize my self with.

Technorati tags: , ,
Tuesday, 17 June 2008 18:13:20 (W. Europe Daylight Time, UTC+02:00)
# Tuesday, 10 June 2008

Release Candidate 0 is available for download (and downloading) and the MCTS exam 70-432 went into beta testing (and I registered). Since the beta is only running from June 9th through June 30th, I had to go for 27th as it was the only gap in my schedule. Let's see if I can find the time to blog about my preparations...

Technorati tags: , ,
Tuesday, 10 June 2008 20:02:47 (W. Europe Daylight Time, UTC+02:00)
# Friday, 06 June 2008
 #
 

Friday, 06 June 2008 06:52:38 (W. Europe Daylight Time, UTC+02:00)

Exam-stats: 180 minutes1, 61 question spread over 6 testlets (cases), passing score 700 points, only multiple choice questions, no simulations. I got 752, was lousy on SSAS ...and this actually was the first Microsoft exam where I really needed the time!!!  Compared to the other MCITP (70-443, 70-444 and 71-647) exams I sat, it was a lot more reading and fact-finding in the case-studies.

What surprised me on this exam, were a couple of questions targeted at the database engine. Think towards backup-requirements for filegroups (which are needed for partitioned tables), index optimization and transaction isolation levels (not mentioned in prep-guide). Unfortunately these topics aren't covered2 in the courses 2794 to 2797 (or in 2791 to 2793). From the topics that are covered in the prep-guide, I'd say the number of questions was pretty balanced, only four things were really sticking out:

  • Which data mining algorithm to apply in a certain scenario.
  • Storage strategy for SSAS cubes.
  • Slowly changing dimensions.
  • Designing dimensions and hierarchies.

Useful resources for preparation.

1Actually, you get about 3 minutes per question grouped per testlet. This means for a 9 question testlet you get about 27 minutes, time left on one testlet is not added to the next. The 180 minutes should be regarded as an indication for the maximum exam length.
2 At best superficially mentioned in 2796.
Friday, 06 June 2008 00:14:29 (W. Europe Daylight Time, UTC+02:00)
# Saturday, 31 May 2008

Just a few links where you can find more info about SQL Server 2008 Certification in general and about the separate certification tracks and exams.

Track alignment Database Administration Database Development Business Intelligence
Microsoft Certified Technology Specialist (MCTS)

MCTS: SQL Server 2008, Implementation and Maintenance

MCTS: SQL Server 2008, Database Development

MCTS: SQL Server 2008, Business Intelligence Development and Maintenance

MCTS requirements Pass: Exam 70-432 (expected availability of exam August 2008) Pass: Exam 70-433 (expected availability of exam October 2008) Pass: Exam 70-448 (expected availability of exam August 2008)
Microsoft Certified Information Technology Professional

MCITP: Database Administrator 2008

MCITP: Database Developer 2008

MCITP: Business Intelligence Developer 2008

MCITP requirements Hold above MCTS certification and pass Exam 70-450 (expected availability of exam November 2008) Hold above MCTS certification and pass Exam 70-451 (expected availability of exam January 2009) Hold above MCTS certification and pass Exam 70-452 (expected availability of exam November 2008)
Upgrade option existing MCITP for SQL Server 2005 Existing MCITP:Database Administrators can upgrade above MCTS and MCITP by passing Exam 70-453 (expected availability of preparation guide September 2008) Existing MCITP:Database Developers can upgrade above MCTS and MCITP by passing Exam 70-454 (expected availability of preparation guide September 2008) Existing MCITP:Business Intelligence Developers can upgrade above MCTS and MCITP by passing Exam 70-455 (expected availability of preparation guide September 2008)
No upgrade paths exist for MCTS for SQL Server 2005 to MCTS for SQL Server 2008. Thanks Trika, for the pointer and poster.
Saturday, 31 May 2008 17:19:20 (W. Europe Daylight Time, UTC+02:00)
# Wednesday, 28 May 2008

In other words, the system could not find the specified path... that was when I tried restoring through a substituted (subst R: G:\project02\SQLBackup) drive, and for sure R:\MyDB_20080521.bak did exist. Nastier was the fact that G:\project02\SQLBackup\MyDB_20080521.bak restored in an instant, so the problem was likely to be found in the interaction between subst and SQL Server. This was rather disappointing, as it would have been nice to separate the environment for different customers/projects and still be able to use generic scripts targeted at a drive-letter.

So a little quest started to find some solutions, if it's a bug?, should be a feature... I was surprised by the small number of pointers I came across on the Internet and only one useful. After establishing net use R: \\myserver\projdata\project02\SQLBackup suffered the same problem when it came to restoring, I widened the search an came across a post from the beta stage of SQL Server 2005. In there Erland Sommarskog hinted it might have something to do with profile setting, which would apply if the setting in question weren't a system setting. Now that made perfect sense, so let's see how the world looks like through xp_cmdshell.

So on my SQL Server 2005 instance I enabled xp_cmdshell and I substituted a path for a drive letter on Windows. Sure enough, I could see the substituted drive W: as pointing to D:\SQLData, but running EXEC xp_cmdshell 'subst' yielded NULL from the instance running under LocalSystem. Then, after running EXEC xp_cmdshell 'subst W: D:\SQLData' my SQL Server instance running under LocalSystem got the picture too. From SQL Server Management Studio I could "see" drive W: and also restore from and backup to my "W-drive". Safe to say it isn't a bug for SQL Server and if a feature request is to be made, it has to be addressed to the Windows team. Something like; enable system-wide setting of subst and net use commands by administrators through an extra switch (or make it a policy setting default on for administrators only).

In the end it can easily be solved with a little sqlcmd-script like:

:setvar path "G:\project02\SQLBackup"
:setvar drive "R"
EXEC xp_cmdshell 'subst $(drive): /d'
GO
EXEC
xp_cmdshell 'subst $(drive): $(path)'
GO

Wednesday, 28 May 2008 21:55:46 (W. Europe Daylight Time, UTC+02:00)
# Thursday, 17 April 2008

From my perspective, a very successful event. Congratulations to the organizing committee and the volunteers supporting them. Also may thanks to the speakers, especially Bob Ward (pre-conference and session), Remus Rusana, James Luetkehoelter, Chuck Hawkins, Klaus Aschenbrenner (GEOGRAPHY data type).

Also look out for James' follow ups on "our" questions, guess the audience was paranoid enough.

Technorati tags:
Thursday, 17 April 2008 22:58:33 (W. Europe Daylight Time, UTC+02:00)
# Monday, 11 February 2008

Today I sat the 70-445 exam and completed those 53 questions with the minimal required passing score of 700 points. Now how did I get to that meager (but satisfactory) result?

First of all, I didn't allocate enough time for my preparation, so out of the initial things I mentioned, I only completed the MOC courses 2791 to 2794 (on a hindsight, the time spent with 2794 was wasted towards this exam). I also purchased the MS Press Training Kit for the 70-445 exam, but didn't really get to using it. In total, I only read the chapters 9 and 17 and ran through all 209 MeasureUP questions once. The reason for picking only chapters 9 and 17, besides time constraints, is the amount of Data Mining in the exam versus the (lack of) coverage of the subject in the MOC2791. The reason for running through the MeasureUP questions in study-mode was to get some exam-focus on the subjects and cover them all (see if there were things I missed from the MOC's). I should add that many of MeasureUP tests do not resemble the exam, like providing non-existent options or asking for trivial look-up fact... actual Microsoft exams have better quality.

Having that extra bit of exam-focus really helped, the MOC's tend to strongly focus on the development part of SSIS, SSRS and SSAS. The actual exam is more balanced between development and administration. The training kit too, seems to be more geared towards the administration part and definitely has a more task-based (or hands-on) approach than the MOC's. So in total, I think you need both for a proper preparation (or be able to compensate development or administration with in-depth practical skills).

Luckily the score report includes those scoring bars that indicate a little on your relative score. What is my experience based on the 7 topics tested;

  • Managing SSAS
    • My relative score; 5th
    • Impression, 2791 definitely is shallow on this subject. And since I don't have to much real-world experience managing Analysis Services...
  • Developing SSAS Solutions by Using BIDS
    • My relative score; 3rd
    • Impression, 2791 gives you all the handles you need, for the hands-on I can recommend taking a look at the tutorials in SQL Server Books On-Line. Again, no real-world experience on for me here either. Some applied MDX, nothing shocking.
  • Implementing Data Mining by Using BIDS
    • My relative score; 6th
    • Impression, next to no coverage in 2791, you really need the 70-445 training kit here (mind, I just read it, didn't do the exercises). Very little basic DMX.
  • Managing SSRS
    • My relative score; 2nd
    • Impression, some help from real-world experience, though I wasn't prepared for dealing with farms. Also, be prepared to modify the RSReportServer.config.
  • Developing Reporting Solutions by Using SSRS
    • My relative score; 7th
    • Impression, 2793 gives you all the handles you need, but you should also to develop reports and care about how they look (questions included some beatifications of reports). Also, take a good look at URLs. If you only have this covered with the training kit, it seems to me (based on a quick glance), it's not going to be enough.
  • Developing Business Intelligence Solutions by Using SSIS
    • My relative score; 4th
    • Impression, pretty well covered from the 2792... which was pretty helpful to me, as I'm used to solving a lot of stuff in the database (using views and stored procedures). Especially focus on transactions, checkpoints and logical combinations between expressions and constraints.
  • Administering SSIS Packages
    • My relative score; 1st.
    • Impression, my real-world experience helped me out here, not the 2792. Focus on things like supplying configurations on run-time, securing parts of packages and the differences between storing in SQL Server vs. on the file system. Also be prepared for some dtutil and dtexec syntax.

As Ronald Kraijesteijn noted on his blog-entry (in Dutch) on the exam, it's pretty tool-oriented (like how you do something, even in which order). I felt this was particularly true on the developing with SSAS and SSRS. A couple of months experience are definitely going to prove advantageous. This was not the case for me, SSAS real-world is non-existent and SSRS already dates back a year. But hey, a 700-point pass still is a pass.

Technorati tags: ,
Monday, 11 February 2008 23:05:42 (W. Europe Standard Time, UTC+01:00)
# Wednesday, 06 February 2008

Fresh from the often delayed "SQL Server 2008 and Your Certifications", first session.

MCDBA will retire march 2009, no direct upgrade from MCDBA to a SQL Server 2008 certification.

70-446 will be superseded by 70-448 (~August 2008), 70-431 will be split in an Administration exam 70-432 (~August 2008) and 70-433 (~September 2008) for Development. So be ready for the following titles:

  • 70-432, MCTS: SQL Server 2008, Implementation and Maintenance
  • 70-433, MCTS: SQL Server 2008, Database Development
  • 70-448, MCTS: SQL Server 2008, Business Intelligence Development and Maintenance

MCITP's can probably upgrade the MCTS and MCITP in a single upgrade exam. No timelines on the professional level exams yet.

MCA Databases is available, targeted at OLTP... Business Intelligence is under consideration.

Blogs to watch for more info on SQL Server 2008 Certifications:

Wednesday, 06 February 2008 17:52:41 (W. Europe Standard Time, UTC+01:00)
# Thursday, 17 January 2008

If you're going to take the SQL Server class 2779B, there is a lot of XML in it. That is, in relation to SQL Server 2005. But do you know your XML? Well, the training implies you do, though it is not one of the published prerequisites. If you're blank on XML, or want to check on your skills, you may want to take a look at www.w3schools.com.

The available tutorials give you an overview of the general usage of XML-technologies, some of them (XML, XPath, XQuery, XSD) will return in 2779 (Modules 3 and 6) where they are applied on SQL Server 2005. A little study-guide to prepare you for the things to come.

Thursday, 17 January 2008 18:40:39 (W. Europe Standard Time, UTC+01:00)
# Tuesday, 15 January 2008

A while back I wrote about Overridable IDENTITY or the order of things, offering a solution to store data in natural order. This solution worked form me in a couple of projects, until I recently had to reload a table... so the basic procedure would be to;

  • Create a new table based on the same definition
  • Do an INSERT new_table SELECT ... FROM old_table
  • DROP TABLE old_table
  • sp_rename 'new_table', 'old_table'

But this does not produce the proper result, as the most recent inserted identity prior to the insert will be applied for the default.

The trick is to make every insert an independent action, so the IDENTITY_CURRENT is properly set for every row; make it run in a cursor. Yeah, I know.... performance, but that pain is largely taken away by making it a FAST_FORWARD. Besides, if anyone has a better suggestion, I'm all ears. So instead of a INSERT ... SELECT, the CURSOR is opened for the selection and the new table is inserted based on the fetched rows.

Altogether in the attached script;

  • Create the soon to be table.
  • Populate the table.
  • Create the new table, pay attention to the default.
  • Use the cursor to fill the new table.
  • Drop the original table and rename the new table to the name of the original table, don't forget to recreate the default.
Tuesday, 15 January 2008 21:53:29 (W. Europe Standard Time, UTC+01:00)
# Monday, 14 January 2008
 #
 

... or at least didn't feel things were important enough to post, at least that's the excuse for not writing here for well over a month.

In the mean time, beta-season is opened again and I registered for the 71-647. However, I won't go trough the same depth of preparation as I did for the 70-649... I'll just go in and try to make it on my Windows 2003 and 70-649 prep-knowledge ;-).

The other exam I registered for is the 70-445 and I'm planning to take the 70-446 later this year. Just to get myself started for the preparation of this exam, I collected some links to hold on to:

and I'll be using the Microsoft courseware for the courses 2791, 2792, 2793, 2794 and the MCTS Self-Paced Training Kit (Exam 70-445): Microsoft® SQL Server™ 2005 Business Intelligence—Implementation and Maintenance.

That should keep me busy for a while again...

Monday, 14 January 2008 20:49:23 (W. Europe Standard Time, UTC+01:00)
# Thursday, 06 December 2007

One of the things that surface in module 2 of the 2780B course are the predefined reports you can use to get more info on what your system is doing. Good thing is, you can run these directly from Management Studio without Reporting Services. Bad thing, you don’t know what the reports are showing exactly (as in, you don’t know the report definition). That is, we can find out what the reports show since Microsoft released the report definitions, so you can reproduce what the report is showing. Also in Service Pack 2, Microsoft introduced the option to create your own custom reports… and that is just what one of the Microsoft Support Engineers did (performance dashboard). One of the community members also noticed and blogged about this new option and created his own DBA_Dashboard full of useful metrics. Thanks Greg, the effort is much appreciated, do keep up the good work for the next version.

Technorati tags: ,
Thursday, 06 December 2007 10:56:58 (W. Europe Standard Time, UTC+01:00)
# Thursday, 22 November 2007

About half a year ago I attended a webcast about ICE, today I can update that post with a link to the white paper. Maybe more on it later... when I found the time to actually read it.

Technorati tags: ,
Thursday, 22 November 2007 23:46:47 (W. Europe Standard Time, UTC+01:00)
# Tuesday, 13 November 2007

Bah, for the forth time in a week I encounter a star in the select-list of a view. So if no-one told you yet; when creating views, don’t do a SELECT * FROM source, it can (and in time will) hurt you badly. Let’s show by creating a table and two views based on that table;

CREATE TABLE tblSource(
    Col1 nchar(10),
    Col2 nchar(10))
GO
CREATE VIEW vwBad
AS
SELECT * FROM tblSource
GO
CREATE VIEW vwGood
AS
SELECT Col1, Col2 FROM tblSource
GO
INSERT tblSource (Col1, Col2) VALUES ('Column 1', 'Column 2')
GO

So far, so good. Selecting from both views shows you the expected results:

SELECT 'vwGood' view_name, * FROM vwGood
GO
SELECT 'vwBad' view_name, * FROM vwBad
GO

Now add a column to the table, and add some data for that column;

ALTER TABLE tblSource
    ADD Col3 tinyint
GO
UPDATE tblSource
SET Col3 = 3
GO

SELECT 'vwGood' view_name, * FROM vwGood
GO
SELECT 'vwBad' view_name, * FROM vwBad
GO

vwGood still behaves as expected, but vwBad doesn’t return ALL columns as you may have expected based on the *. Now let's see what happens if a column is removed from the table;

ALTER TABLE tblSource
    DROP COLUMN Col2
GO

SELECT 'vwGood' view_name, * FROM vwGood
GO
SELECT 'vwBad' view_name, * FROM vwBad
GO

You will notice that vwGood throws an error (rightfully so, it is instructed to select from a non-existing column), but vwBad doesn’t… it pretends Col3 is Col2. Based on the data, we know better. However this might not be so obvious if based on this change the purchase price is shown as sales price due to changes in the data model and your company starts losing money.

I bet now you want to know which views contain * so you can code them properly, try this query if you're on SQL Server 2005:

SELECT v.name view_name
    , s.name schema_name
    , u.name user_name
    , v.create_date
    , v.modify_date
    , sc.text
FROM sys.views v
    INNER JOIN sys.syscomments sc ON v.object_id = sc.id
    INNER JOIN sys.schemas s ON v.schema_id = s.schema_id
    INNER JOIN sys.sysusers u ON s.principal_id = u.uid
WHERE sc.text LIKE '%*%'

And this one for SQL Server 2000:

SELECT so.name view_name
    , u.name user_name
    , so.crdate create_date
    , sc.text
FROM dbo.sysobjects so
    INNER JOIN dbo.syscomments sc ON so.uid = sc.uid
    INNER JOIN dbo.sysusers u ON so.uid = u.uid
WHERE sc.text LIKE '%*%' AND v.type = 'V'

And don't forget to clean up the example:

DROP VIEW vwGood
GO
DROP VIEW vwBad
GO
DROP TABLE tblSource
GO

Technorati tags: ,
Tuesday, 13 November 2007 12:35:23 (W. Europe Standard Time, UTC+01:00)
# Monday, 05 November 2007

If an SQL Server 2005 database has multiple snapshots, do the first changes get written to all snapshots or only to the most recent? Good question, let's make the answer:

The following scenario is based on this sql-script. It assumes you have the AdventureWorks and your database engine uses C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data. Be careful, the script changes some data in AdventureWorks, so some examples from books-online may behave unexpected after this demo.

The scenario is in four stages;

  1. The first snapshot is created and a query displays the sizes of the snapshot file on disk (BytesOnDisk) and the internal size (BytesWritten).
  2. Person.Contact is updated, the Firstname and LastName for all records are swapped, practically resulting in the whole table being written to the snapshot before the update. Right after this update, the second snapshot is created and the query again shows the information about the (now 2) snapshots. The first snapshot holding the Person.Contact table before the update, the second being still empty.
  3. Person.Address is updated, AddressLine1 and AddressLine2 are swapped (and NULLs for AddressLine2 are set to empty strings). The query following this update shows both snapshots have grown, proving all snapshots get written (and we have the answer!!!).
  4. To show a snapshot only processes the changes once since it was created, the FirstName and LastName columns in Person.Contact are swapped again. This time, only the second snapshot gets written to (you may notice one or two data pages (8192 bytes) being added to the first snapshot, an insignificant change compared to previous changes).
Monday, 05 November 2007 22:36:03 (W. Europe Standard Time, UTC+01:00)
# Monday, 29 October 2007

Today's question of the course (2780B), how do licensing options mix with multiple instances. The course points out the choices on installation;

  • Per processor
  • Per Server
    • Device Client Access License
    • User Client Access License

That's about it, but as always, MS-Licensing is more complicated than a single page in the text-book. If you want to know it all, there is an SQL Server licensing white-paper, covering;

  • Virtualization / Multi-instances
  • Multi-core processors
  • Fail-over technologies (Cluster, Database Mirroring, Log Shipping)
  • Multiplexing
  • Separation of components over multiple servers.
  • SQL vs Workgroup CAL's
  • Special programs for ISV's and hosting providers
  • Downgrading

For just Virtualization and Multi-instancing, read this section of the above white-paper. For more info on licenses and pricing, visit How to buy SQL Server. Keep in mind that pricing and licensing may vary based on your location.

Monday, 29 October 2007 18:23:37 (W. Europe Standard Time, UTC+01:00)
# Wednesday, 12 September 2007

The IDENTITY property on a column in SQL Server allows to easily create a new number for each row added (INSERT). As a side effect, the column with the IDENTITY property also shows the natural order of records... until something is forgotten and is INSERTed at a later stage. The natural order of things is now disturbed (or has become fragmented, if you like).

CREATE TABLE OrderOfThings_1(
   Ident int IDENTITY(1,1) NOT NULL,
   OrderedSteps varchar(50) NOT NULL)
GO
INSERT OrderOfThings_1 (OrderedSteps) VALUES ('The first step.')
INSERT OrderOfThings_1 (OrderedSteps) VALUES ('The second step.')
INSERT OrderOfThings_1 (OrderedSteps) VALUES ('The fourth step.') -- Notice the forgotten third step.
INSERT OrderOfThings_1 (OrderedSteps) VALUES ('The fifth step.')
GO
SELECT * FROM OrderOfThings_1 ORDER BY Ident
GO
INSERT OrderOfThings_1 (OrderedSteps) VALUES ('The third step.') -- The forgotten third step is added.
GO
SELECT * FROM OrderOfThings_1 ORDER BY Ident
GO
Ident OrderedSteps   Ident OrderedSteps
1 The first step.   1 The first step.
2 The second step.   2 The second step.
3 The fourth step.   3 The fourth step.
4 The fifth step.   4 The fifth step.
      5 The third step.

Naturally we could have anticipated this scenario and set IDENTITY(10,10). But still if the thid step is to be inserted in natural order, this can only be done with significantly more effort. It requires the use of SET IDENTIY_INSERT dbo.OrderOfThings_1 ON, after which the third step can be INSERTed "in order". As the front-end application likely isn't coded for this scenario, it will require the intervention of a dba.

So it would be nice to be able to specify the natural order if needed only, and the best thing in SQL Server to just allow that is the DEFAULT. But the DEFAULT is quite limited; it can neither hold references to columns in the table nor make use of user-defined functions. So any flexibility should come from system functions, literals and operations. By using the IDENTITY property on one column and deriving the natural order from the IDENTITY with a DEFAULT based on IDENT_CURRENT, a flexible and transparent solution is available;

CREATE TABLE OrderOfThings_2(
   Ident int IDENTITY(1,1) NOT NULL,
   OrderOfSteps bigint NOT NULL
     CONSTRAINT df_OveridebleIdentity
       DEFAULT (IDENT_CURRENT('OrderOfThings_2') * 10),
   OrderedSteps varchar(50) NOT NULL)
GO
INSERT OrderOfThings_2 (OrderedSteps) VALUES ('The first step.')
INSERT OrderOfThings_2 (OrderedSteps) VALUES ('The second step.')
INSERT OrderOfThings_2 (OrderedSteps) VALUES ('The fourth step.') -- Notice the forgotten third step.
INSERT OrderOfThings_2 (OrderedSteps) VALUES ('The fifth step.')
GO
SELECT * FROM OrderOfThings_2 ORDER BY Ident
GO
INSERT OrderOfThings_2 (OrderedSteps, OrderOfSteps) VALUES ('The third step.', 25) -- The forgotten third step is added.
GO
SELECT * FROM OrderOfThings_2 ORDER BY OrderOfSteps
GO
Ident OrderOfSteps OrderedSteps   Ident OrderOfSteps OrderedSteps
1 10 The first step.   1 10 The first step.
2 20 The second step.   2 20 The second step.
3 30 The fourth step.   3 25 The third step.
4 40 The fifth step.   4 30 The fourth step.
        5 40 The fifth step.

Note that the OrderOfSteps column is based on bigint, which can hold any int multiplied by 10.

Wednesday, 12 September 2007 19:05:57 (W. Europe Daylight Time, UTC+02:00)
# Friday, 24 August 2007

SQL Server Books Online states there are a number of statements you can't include in a stored procedure, including CREATE VIEW. In most cases that wouldn't be a problem, as you can create the view using a script, unless... I need an indexed view on a table that is created by a stored procedure. At the time the script runs to create the objects (views, functions and stored procedures), I can't create a schemabound view on a non-existing table. I have to create the view after the table, but if I try in a stored procedure; error messages!

CREATE PROCEDURE procCreateIndexedView
AS
IF EXISTS (SELECT *
            FROM INFORMATION_SCHEMA.TABLES
            WHERE TABLE_NAME = 'vwDemo1Only'
                AND TABLE_TYPE = 'VIEW'
                AND TABLE_SCHEMA = 'dbo')
DROP VIEW dbo.vwDemo1Only
CREATE VIEW dbo.vwDemo1Only
    WITH SCHEMABINDING
    AS
    SELECT Col1, Col2, Col3, Col5
      FROM dbo.tblDemo 
      WHERE Col4 = 1
CREATE UNIQUE CLUSTERED INDEX ix_Demo1_Col1Col2 ON dbo.vwDemo1Only(Col1, Col2)
CREATE INDEX ix_Demo1_Col5 ON dbo.vwDemo1Only(Col5)

Msg 156, Level 15, State 1, Procedure procCreateIndexedViews, Line 9
Incorrect syntax near the keyword 'VIEW'.
Msg 319, Level 15, State 1, Procedure procCreateIndexedViews, Line 10
Incorrect syntax near the keyword 'with'. If this statement is a common table expression or an xmlnamespaces clause, the previous statement must be terminated with a semicolon.

Fortunately, there is an escape: Dynamic SQL. Simply by encapsulating the creation of the view as dynamic SQL, the stored procedure is created without errors. Also, when executing the stored procedure the view is created as well (that is, if the table already exists ;-) )

CREATE PROCEDURE procCreateIndexedView
AS
IF EXISTS (SELECT *
            FROM INFORMATION_SCHEMA.TABLES
            WHERE TABLE_NAME = 'vwDemo1Only'
                AND TABLE_TYPE = 'VIEW'
                AND TABLE_SCHEMA = 'dbo')
DROP VIEW dbo.vwDemo1Only

--  Add next line to encapsulated view defenition as dynamic SQL.
EXEC(
'
CREATE VIEW dbo.vwDemo1Only
    WITH SCHEMABINDING
    AS
    SELECT Col1, Col2, Col3, Col5
      FROM dbo.tblDemo
      WHERE Col4 = 1
'
)
-- Add previous line to stop encapsulation.
CREATE UNIQUE CLUSTERED INDEX ix_Demo1_Col1Col2 ON dbo.vwDemo1Only(Col1, Col2)
CREATE INDEX ix_Demo1_Col5 ON dbo.vwDemo1Only(Col5)

Technorati tags: ,
Friday, 24 August 2007 22:44:23 (W. Europe Daylight Time, UTC+02:00)
# Saturday, 18 August 2007

What's on your sidebar? Well, nothing much on mine, until today. I just read a post on Rob Farley's blog, pointing to SSIS Junkie Jamie Thomson. Jamie and his colleague John Rayner developed a useful gadget, where you can connect to your favorite database and monitor file size and usage. Actually you see the sum of your data files sizes and the sum of your log files sizes (if your database contains more than one of each). Easy to use, just type the server name, database name and the polling interval and you're set (Jamie's original post has the instructions as screen capture).

You can have this gadget more than once on your sidebar with different connections, but I would like to suggest to Jamie and John to allow for a list of databases to be polled in a single gadget (you can still keep multiple gadgets on your sidebar, one for every server / instance).

Technorati tags:
Saturday, 18 August 2007 14:27:37 (W. Europe Daylight Time, UTC+02:00)
# Wednesday, 11 July 2007

From Mary-Jo's blog: SQL Server 2008 ("Katmai") will be released launched February 27th 2008, together with Windows Server 2008 ("Longhorn") and Visual Studio 2008 ("Orcas").

Read also Microsoft's Press release.

<Update date="2007-07-12">

Old title: SQL Server 2008 release date: 2007-02-27

Something very stupid happened to me in the title before the update! I put in the current year, shame on me!

But wait, that's not all. There is also something about the distinction between launch and release (aka RTM), thanks to Euan Garden for pointing that out on his blog. The release of SQL Server 2008 is set for somewhere in the second quarter of 2008 (source Michael J. Murphy). So mixing up the words launch and release was my second stupid thing.

<Update/>

Technorati tags: ,
Wednesday, 11 July 2007 09:05:38 (W. Europe Daylight Time, UTC+02:00)
# Tuesday, 10 July 2007

Clinic 7045: What's New in Microsoft® SQL Server™ 2008

While I was over at Arlindo's Blog for VRMCplus, I couldn't resist clicking SQL Server 2008 in his tag-cloud. Next I grabbed my passport (euh LiveID) and headed to the 1.5 to 2 hour eLearning module (content will be available on your LiveID for a year).

Little topic overview:

  • A lot on Database Engine and SSRS, less on SSIS and SSAS.
  • FILESTREAM and spatial datatypes, as well as the other datatype enhancements/additions.
  • Mentioning the new management capabilities and how this fits in with MS-initiatives like DSI.
  • Mentioning the development features and integration (think ORM, EDM and LINQ).
  • New to me (hadn't noticed it before); Security Auditing for Data Protection. This will be quite useful when you can't (due to vendor support restrictions) implement auditing based on altered table definitions, check constraints and triggers. Or now don't want with SQL2k8 because it will be faster to implement ;).
  • Improved integrations with Office 2007 (and SharePoint).
Technorati tags: ,
Tuesday, 10 July 2007 13:00:15 (W. Europe Daylight Time, UTC+02:00)
# Wednesday, 20 June 2007
ICE
 #
 

Yesterday I enjoyed the webcast by Hariharan Sethuraman and Chris Haslam, both from Microsoft. They talked about ICE, which stands for Information Security Consolidated Event Management System. ICE collects the in- and outbound e-mail traffic, login events and web browsing (web proxy and firewall logs) and stores it for 60 days, to provide an audit-trail in case of security events. The webcast was about how ICE 3.0 was designed and built on top of the infrastructure below.

Imagine the numbers; 40 TB designed -- 27 TB allocated (FibreChannel SAN-)storage, designed to load 60 GB/hour into staging tables -- currently receiving max 1.2 TB daily with 600 GB as daily average, table partitioning, 4 (multi-core?) x64 processors & 32 GB RAM for the Database Engine + 4 (multi-core?) x64 processors & 8 GB RAM for the Integration Services. Accessing all that data via Ad-hoc queries and Report Server reports.

Already ICE version 4.0 is envisioned, however again only for internal usage, ICE is not (yet?) planned as a commercial product. So with ICE not being planned as a product, I just had to ask "Sounds like a great reference project for SQL Server 2005 and is a very useful application, are you planning a white-paper?"... Turned out the webcast is a precursor to the white-paper!!!

I'm looking forward to update this post with the link to that white-paper. For now I can only point you to the on-demand webcast: How Microsoft IT Uses SQL Server 2005 to Power a Global Forensic Data Security Tool (Level 300) (~60 minutes)

/* UPDATE 2007-11-22 */ Link to the afore mentioned white-paper.

 

Technorati tags: , ,
Wednesday, 20 June 2007 13:51:23 (W. Europe Daylight Time, UTC+02:00)
# Tuesday, 05 June 2007

Past weeks haven't been boring, if you look at the amount of database related news from Microsoft. And plenty of stages to let the world know it's still moving... BI ConferenceMix'07. And now TechEd.

At BI Conference, the new version of SQL Server was officially announced. At Mix'07 besides Silverlight there also were Jasper and Astoria. To see the presentations go to sessions.visitmix.com and search on the session code DEV18 for Jasper and XD006 for Astoria.

The binding word for Jasper and Astoria and Katmai is Enity Data Model. You may find several usefull video's on Channel9;

Yep, a lot of developer stuff, but it will show the importance of EDM for data driven applications and thus access to the database. Also, judging from this press release, one of SQL Server 2008 main delivery goals is accessibility of data.

With TechEd started, there is also more and more info about Katmai, including this recorded webcast. Like expected, a lot of Entity Framework (includes EDM) here. Other very noteworthy additions; spatial data types (demo with Virtual Earth), policy based management, data  and backup compression, resource governor and a FILESTREAM data type. FILESTREAM is about storing metadata in SQL Server and the file itself on the file system, while maintaining a relation between them (including for BACKUP/RESTORE, I imagine something like with Full-Text Indexes on SQL Server 2005). Sounds like WinFS, doesn't it?

Also the first CTP's are out, based on the Developer Edition. Download the x86 and/or x64 versions here, after having registered. 

Tuesday, 05 June 2007 18:55:06 (W. Europe Daylight Time, UTC+02:00)
# Monday, 07 May 2007

Yep, the revisions are out. Today I checked the Microsoft Learning site for the 2779 and 2780. The 2779afinal.mspx page was redirected to the 2779bfinal.mspx. Same thing for the course 2780. Both courses are now 5 instead of 3 days!

  • What's changed on the 2779b;
    • Module 5 2779a is split in two, one about data integrity with constraints and one about data integrity based on triggers and XML schemas
    • Module 8 2779a is split in two, stored procedures and functions now have their own modules
  • New content on 2779b;
    • Module on transactions and locks
    • Module on Notification Services

What's changed on the 2780b? If you look at the syllabus, you would be tempted to say nothing. But when you look inside, content is rewritten (and improved) like I mentioned for Cryptography. Also, there are no extra modules, but since it needed over 4 days to properly teach the old course, there was no need for extra content anyway.

Also be advised that Microsoft Learning is working on an Instructor Led Training for 2778 (currently only available as eLearning). I'll keep you posted on that one as well.

When you compare the course offerings to the exam 70-431, I'm missing content on the HTTP-Endpoints in 2779b. Therefor I will add a module on HTTP-Endpoints for my deliveries of the 2779.

Monday, 07 May 2007 14:55:11 (W. Europe Daylight Time, UTC+02:00)
# Sunday, 29 April 2007

Some useful links with personal impressions, valuable resources or important information and their date posted. Don't forget, always check the preparation guide (last update 2007-03-20), both as a road map for your preparation and just prior to taking the exam (maybe things have changed since you started).

The links prior to 2007 don't include info on the simulation questions. The first time I heard about the simulation questions was in December 2006, it seems they were running beta into March. Starting March 2007 I heard and read about failing exams because of the simulation questions, which seem to make up a very big part of the overall score for the exam.

Must read is the discussion on simulations (2007-04-12) started by Zieglers after he passed his 70-431 (2007-01-23). Some very valuable links can be found in the discussion, including this simulation example.

Also check out the database certification newsgroup microsoft.public.cert.mcdba. This newsgroup is aimed at all SQL Server exams and certifications, not just 70-431.

Other links:

If you use the MCTS Self-Paced Training Kit (Exam 70-431): Microsoft® SQL Server™ 2005—Implementation and Maintenance, check out the comments and corrections.

Please post a comment if you have other good and free resources as well.

*UPDATED 2007-05-05*

Sunday, 29 April 2007 10:38:37 (W. Europe Daylight Time, UTC+02:00)
# Saturday, 28 April 2007

Initially I started writing this article because of the poor quality of Module 4, Lesson 4 in the original version of the course MOC 2780. What's wrong with those 6 pages? Technically nothing, but if you've never seen what cryptography does in security systems, you're lost. Fortunately Microsoft Learning has acknowledged the course and its timetable were "suboptimal" and will be releasing the B-revision soon. From a MOC2780B (revised module 4, lesson 4) perspective, this article can be considered additional reading.

In much the same way as in MOC2780A, the entire Module 4 of the MOC 2787 is affected as well. Therefor reading this article is recommended in preparation of the exams 70-443, 70-444 and 70-447.

To be honest, I don't know how relevant the encryption stuff is towards the exam 70-431. I don't recall it from the beta exam I sat, nor have I read reports that specifically included cryptography. The preparation guide provides little guidance on it, it only says Configure encryption under Installing and Configuring SQL Server 2005, Configure SQL Server security, but that may be aimed at protocol encryption. Should you want to play at safe while preparing based on the MS Press Self-Paced Training Kit "SQL Server 2005, Implementation and Maintenance" (ISBN-10: 0-7356-2271-X), this article provides the necessary background for Chapter 2, Lesson 6. If you prepare for the exam based on the Sybex MCTS Study Guide "Microsoft SQL Server 2005 Implementation and Maintenance" (ISBN-10: 0-470-02565-4) this article is a must read, the Sybex book doesn't cover SQL Servers cryptography.

----

First I'll describe the tools needed for cryptographic security;

  • Cryptographic hashes (are not mentioned in courseware, but included here for completeness)
  • Key based encryption 
    • Symmetric keys
    • Asymmetric key-pairs (private key & public key)
    • Certificates
  • Passwords/passphrases

and then continue to encryption hierarchy, combining different forms of cryptography and the double protection of keys and certificates. 

Cryptographic hashes

Cryptographic hashes are in security related documentation mostly referred to as just hash. However in a database context, you may also read about hashes in the terms like hash indices and hash joins. Hash indices and hash joins are related to the query processor and have nothing to do with the security subsystem. So when you encounter hash in SQL Server documentation, verify if it is meant cryptographic or related to the query processor. In the remainder of this article, the term hash will refer to a cryptographic hash.

Now for what a hash does, a hash is based on some input of any length. This input is divided in blocks and based on these blocks, calculations are performed. These combined calculations are called the algorithm. One of the characteristics of the algorithm is that it's output has a fixed length. In SQL Server the following algorithms for hashes can be used, with the output length in bits between parentheses; MD2(128), MD4(128), MD5(128), SHA(160) aka SHA0, SHA1(160). The output of the hashing operation may be referred to as hash or as Message Digest, the input of a hash operation may also be named Message. The most important characteristic of a hash is that it is one way, so you can calculate the Message Digest of a Message, but it isn't possible to retrieve the Message if you only have the Message Digest.

You can call a hash function from SQL Server directly, see HASHBYTES for more info.

The typical application for hashes is to verify the original content, without having to store the original content. For example with passwords, where systems typically store the password hash, not the password itself. When the user types the password, it is hashed and compared to the stored hash. If they match, the user has proven to know the password. You may also encounter hashes with downloads, where the (MD5) hash of the downloadable file is posted. After you downloaded the file, you can compute the hash of the file. When it matches the posted hash, the download was successful. Check out winMd5Sum as example.

Key based encryption

When encrypting information, the original information is referred to as plaintext and the encrypted information as ciphertext (or cyphertext). During the encryption operation a key is applied to the plaintext based on an algorithm (the keys' properties must match the algorithm), resulting in unintelligible ciphertext. The only way to read the ciphertext is by decrypting it, which again involves applying a key to the ciphertext based on the corresponding algorithm.

The words plaintext and ciphertext could be a little misleading, as you might think of it as text, but plaintext and ciphertext can be binary too. Specific for SQL Server 2005, the key based encryption functions can handle the data types; char, varchar, nchar, nvarchar, binary, varbinary. Other data types should be casted as one of the afore mentioned datatypes, also the datatypes are limited to 8000 bytes (actually the encrypted data is limited to 8000 bytes, which means the plaintext usually is shorter based on the algorithm used)!

Symmetric keys

With symmetric keys, the algorithm performing the operations uses the same (symmetric) key for both the encryption and decryption operation. Symmetric keys are considered fast for cryptographic operations in comparison to asymmetric keys, naturally the real speed of the encryption and decryption depends on the encryption algorithm, the length (in bits) of the symmetric key and the available processing power. With encryption, the algorithm is bound to the symmetric key, so during creation you must specify for which algorithm (DES, DESX, Triple DES, RC2, RC4, AES 128, AES 192, AES 256) the key is created. The second requirement when creating a symmetric key is that it is encrypted. It may sound strange, encrypting a key, but in reality the key is the most vulnerable part in cryptographic systems. By encrypting the symmetric key (that may have encrypted numerous fields in the database), we prevent someone who has no access to the decryption of the symmetric key, to access the data protected by this key. At the end of this article, you will see that you can build/use a hierarchy of keys that will enable you to keep numerous secrets by guarding only a few.

Creating a symmetric key is done via CREATE SYMMETRIC KEY. After creation, the key is stored in the database and information about the key can be retrieved from the system view sys.symmetric_keys. As you may see in this view, the key has a name and a GUID (you'll also see the columns for storing the algorithm). Both the name and the GUID are important when identifying the key in encryption and decryption operations.

To make use of a symmetric key, this symmetric key should be open. This has to be done because the symmetric key is stored in an encrypted state. So you can only make use of the symmetric key when you have access to the key or know the password that was used to encrypt the symmetric key. For syntax, see OPEN SYMMETRIC KEY.

The final step is using the symmetric key to encrypt and decrypt information. This can be done using the functions EncryptByKey and DecryptByKey. Note that you can include an "authenticator", which basically is a property of the record that you can encrypt along with the data.

To see how the symmetric key based encryption should be used, please take a look at the example by Laurentiu Cristofor.

Asymmetric keys

With asymmetric keys, the algorithm performing the cryptographic operations uses a key pair. The keys of this pair (let's call them Pub and Priv right away) match in such a way that one key can undo the operation of the other key. So if you have plaintext and encrypt this with the key Pub, the resulting ciphertext can only be decrypted with Priv. The other way around, when you use Priv to encrypt some plaintext, the only way to decrypt the resulting ciphertext is using Pub.

By calling the key-pair Priv (private key) and Pub (public key), the main area of use is very clear. While a person or system can share it's public key with the whole world, the private key is kept secret. Now let's say that two people, Alice and Bob, want to exchange information, without anyone else being able to learn about the exchanged information. Both Alice and Bob have a private key (which each of them keeps a secret) and a public key (known to the both of them and the rest of the world). Now Alice wants to leave Bob a message to.... (well, it's to remain unknown to the rest of the world), so Alice encrypts the message with Bob's public key (Pub-B). With Bob being the only one who has access to Bob's private key (Priv-B), Alice knows only Bob can decrypt the message and she can safely store the encrypted message for Bob to read. After decrypting the message, Bob wants to answer Alice in an equally secure manner, so he uses Pub-A to encrypt the message. Even with everybody being able to retrieve the encrypted message, only Alice can decrypt it, for she has Priv-A.

Alice encrypts and stores:

EncryptPub-B(Plaintext_1:I am Alice) = Ciphertext_1

Bob retrieves and decrypts:

DecryptPriv-B(Ciphertext_1) = Plaintext_1:I am Alice

-------------------------------------

Bob encrypts and stores:

EncryptPub-A(Plaintext_2:I am Bob) = Ciphertext_2

Alice retrieves and decrypts:

DecryptPriv-A(Ciphertext_2) = Plaintext_2:I am Bob

Besides preventing information disclosure, asymmetric keys can also provide authentication. In the previous information exchange, Alice and Bob wanted to keep the information undisclosed. But what if Oscar wants to trick Bob and pretend he is Alice. Oscar would only have to fetch Bob's public key, encrypt the message with Pub-B and state in the message that he is Alice.

Oscar encrypts and stores:

EncryptPub-B(Plaintext_1:I am Alice) = Ciphertext_1

Bob retrieves and decrypts:

DecryptPriv-B(Ciphertext_1) = Plaintext_1:I am Alice

To prevent Oscar (or anyone else) from manipulating the flow of information, Alice and Bob agree to encrypt the message with their private keys prior to storing. So Alice has a message; "plaintext" and encrypts it with Pub-B (she now knows only Bob can decrypt it). Next she encrypts the resulting ciphertext with Priv-A and stores that message. When Bob retrieves that message, he uses Pub-A to decrypt the first stage, this verified that the message was encrypted by Alice (as only she has access to Priv-A), next he decrypts the message with Priv-B and now has access to the plaintext stored by Alice.

Alice encrypts and stores:

EncryptPriv-A(EncryptPub-B(Plaintext_1:I am Alice)) = Ciphertext_1

Bob retrieves and decrypts:

DecryptPub-A(DecryptPriv-B(Ciphertext_1)) = Plaintext_1:I am Alice

-------------------------------------

Bob encrypts and stores:

EncryptPriv-B(EncryptPub-A(Plaintext_2:I am Bob)) = Ciphertext_2

Alice retrieves and decrypts:

DecryptPub-B(DecryptPriv-A(Ciphertext_2)) = Plaintext_2:I am Bob

So it doesn't matter that Oscar has access to all public keys and can retrieve the ciphertext, because Priv-A and Priv-B are a secret held by their respective owners, Oscar can't interfere with this process. The outer encryption, where the originator uses its private key, is also referred to as signing. Also note that a message does not have to be encrypted to be signed.

Alice signs and stores:

EncryptPriv-A(Plaintext_1:I am Alice) = Ciphertext_1

Anyone retrieves and verifies:

DecryptPub-A(Ciphertext_1) = Plaintext_1:I am Alice

Like with symmetric key, asymmetric keys can be created with a T-SQL CREATE statement and viewed through a security catalog view; CREATE ASYMMETRIC KEY and sys.asymmetric_keys. Asymmetric keys are based on the RSA-algorithm with keys of 512, 1024 or 2048 bits. As you learned above, the private key is to remain secret, so in order to protect it, the private key has to be stored in encrypted form. Unlike symmetric key based encryption and decryption, with asymmetric key based operations the keys do not need to be opened. However when performing an operation that requires access to the private key, the private key must be decrypted during the operation. The following functions are performed with the public keys; EncryptByAsmKey and VerifySignedByAsmKey. The operations performed with a private key are; DecryptByAsmKey and SignByAsmKey.

Certificates

Very close to the asymmetric keys are certificates. In fact, nothing changes on the side of the private key and the public key. The certificate is only used to store properties associated with X.509 v1 certificates and associate those with the public key, read Laurentiu's post for more info. Because certificates are named different from asymmetric keys, you'll get different syntax and another security catalog view; CREATE CERTIFICATE, sys.certificates, EncryptByCert, VerifySignedByCert, DecryptByCert, SignByCert. However certificates do have a major advantage over asymmetric keys; you can backup a certificate (certificates were designed with .CER files in mind); BACKUP CERTIFICATE.

Passwords and passphrases

As you may have noticed in the syntax for creating and opening symmetric keys and when creating or using the private key on asymmetric keys and certificates, there is the option to use a password (ENCRYPTION BY PASSWORD='@v3RyCo/\/\pl&xPa$suu0rD'). When a password is provided on creation of the key or certificate, that password is used as a sort of symmetric key to provide the necessary encryption. Also, passphrases can be used to encrypt data directly without the need for any keys and key handling. This can be done with the functions EncryptByPassPhrase and DecryptByPassPhrase. Passwords and passphrases are essentially the same kind of thing, though we see passwords as hard to guess character strings and passphrases as long but easy to remember phrases. Passphrases typically contain a lot of spaces, whereas passwords contain mixed case, numbers and symbols and are usually enforced by a password policy.

Encryption hierarchy

Encrypting keys and certificates with a password is one option of protecting that key or certificate, but it does involve a lot of password management. Another option is building an encryption hierarchy in SQL Server 2005. To be able to explain the hierarchy, two special symmetric keys must be introduced;

  • SERVICE MASTER KEY
  • DATABASE MASTER KEY

Service master key

The service master key is created when SQL Server 2005 is installed. The key is created based on and protected with the credentials of the Windows Account that is used as the SQL Server Service account. The service master key is used to encrypt password which are stored in the master database (like SQL logins and credentials for linked servers), also this key acts as the root for the SQL Server encryption hierarchy. The service master key can not be created or dropped, it can however be altered, backed up and restored. In fact, backing up the service master key is recommended right after installing the SQL Server instance. See ALTER SERVICE MASTER KEY for more information.

Database master key

The database master key can be created as a starting point for encryption in the database. This can be done with the statement CREATE MASTER KEY which must include the ENCRYPTION BY PASSWORD. This creates the master key in the database (encrypted by the password and by the service master key), also the database master key is stored in the master database where it is encrypted with the service master key. The database master key is stored in the master database to facilitate automatic decryption. It is possible to drop the copy from the master database, but then it is required to manually open a database master key prior to using it. Like the service master key, it is best practice to backup the database master key too. See CREATE MASTER KEY for more information.

Encrypting asymmetric keys and certificates

When the ENCRYPTION BY PASSWORD directive is omitted for an asymmetric key or certificate when it is created, the master key of the database where the asymmetric key or certificate will be stored is used to encrypt the private key. This way, the new key or certificate is automatically tied into the encryption hierarchy of SQL Server. Asymmetric keys and certificates can only be encrypted once, so when you alter them, you can switch between encryption by the database master key and encryption by a password, or in the later case change the encryption password. 

Multiple encryptions of symmetric keys

Normal symmetric keys and database master keys can be encrypted more than once. For the database master key, this is very convenient, as it is both possible to use cryptographic functionality transparently (based on the encryption by the service master key) and transfer the database to another instance while preserving the encryption hierarchy (based on the encryption by password). For normal symmetric keys this is very convenient as well, as multiple users may need access to the same encrypted data. The bulk of the data (think like thousands of records with an encrypted field) does not need to be encrypted multiple times and can be accessed through the same symmetric key. This symmetric key is, depending on the capabilities users/processes, accessible through one or more passwords, other symmetric keys, asymmetric keys and/or certificates.

Combining cryptographic operations

Typically symmetric and asymmetric key (or certificate) based encryptions are combined to achieve the desired security level, while still maintaining good performance. If you revisit the example, you'll notice that the data in the column is encrypted with the symmetric key (remember: symmetric key = fast). The table in the example only has 2 rows, but that same table could hold millions of rows, making the algorithm doing the encryption and decryption very important in terms of performance. Because all this data is encrypted with a single symmetric key, this key should be well protected. The example used a password to encrypt the key, but a very common approach is to secure the symmetric key with a certificate (or asymmetric keys, which essentially comes down to the same). As a non-SQL Server example, EFS takes the same route, using a symmetric key to encrypt a potentially big file and then encrypting that symmetric key with the public key for each user that should have access to the file.

Now for SQL Server 2005, Laurentiu has a great 2nd example where he uses the possibilities offered by the encryption hierarchy. A database master key is created and used to encrypt the certificates for all users (dbo and Charlie) participating in the example. The certificates are then used to encrypt the symmetric key and the symmetric key will be used to encrypt the (2, but potentially thousands) salaries in the t_employees table.

Interesting to note about the 2nd example is the function DecryptByKeyAutoCert. The main advantage of this function is that it utilizes the encryption hierarchy and transparently opens keys on demand (and closes them after the operation completed). Similar functionality is provided by the function DecryptByKeyAutoAsymKey, if the symmetric key is encrypted with an asymmetric key.

Permissions on keys and certificates

One important thing not yet mentioned about the keys and certificates, is that they are securables. So in order to use them, a principal should be granted the necessary privileges on the keys and certificates. If you look closer at the second example, you see it is no issue at first, as everything is done as dbo, so dbo automatically is owner (and in full control) of the keys and certificates created. But when Charlie makes his appearance, he must be granted the necessary privileges; naturally select on the views, but also VIEW DEFENITION on the symmetric key (to be able to use it) and CONTROL on the certificate. Alternately Charlie could have been made owner of the certificate when it was created through the AUTHORIZATION directive. Bottom line is, to use a key or certificate, you must be granted the proper access to the key or certificate and be able to decrypt it. For more information on permissions, see GRANT.

Cryptography and authentication

What this article doesn't cover is signing programmed modules and mapping users and logins to certificates. However, I will add those topics when covering impersonation (EXESUTE AS on the todo-list).

Recommended reading: Handbook of Applied Cryptography, Laurentiu Cristofor's blog, Wikipedia. Not exactly reading, but check out those 4 presentation from the 2006 PASS conference. 

Saturday, 28 April 2007 21:34:46 (W. Europe Daylight Time, UTC+02:00)
# Friday, 13 April 2007

If you're looking for the SQL Server 2000 perspective, go to Space Program.

Before venturing further, when was your last checkpoint? I learnt in my test setup, that if you work with a small number of operations, there is no data recoverable from the data file. It was just on the dirty pages and in the transaction log (which I deliberately crashed, to be able to advise on recovery of that event). This is an indicator that, if you lose the transaction log in a running database, you are very likely going to end up with data loss.

-----

...continued from Recovery when the transaction log is lost, if the database was detached or you want to connect the database to another instance.

Unfortunately, after the failed attach, there is no entry for the LogCrash_Demo database in the system tables anymore. So first you have to create a new database with the same name. Next take this database offline and replace the data file of the newly created database with the datafile containing the data you want back. Also, throw away the transaction log that was just created.

Now proceed to the next section, where the database is started in emergency mode.

-----

...continued from Recovery when the transaction log is lost, if you didn't detach the database.

Start the LogCrash_Demo database in emergency mode (switch to emergency mode has changed with SQL Server 2005, as it required direct updating to system tables, which are now hidden from the DBA).

ALTER DATABASE LogCrash_Demo SET EMERGENCY

Now you can read from the database, which is enough to script the definition of the objects (SSMS: right-click the database, Tasks --> Generate Scripts), create a new database based with those scripted objects, and transfer (BULK INSERT, SQL Server Integration Service, etc.) the data from the database in emergency mode, to the new database.

Be careful though about data loss. Data data file that was made available for select in emergency mode does not contain the dirty pages.

So should you be worried about crashing of the transaction log media, make sure you use at least disk mirroring for your transaction log. Because when you look at backup restore as in the 2780, you can rebuild your data files after a disk crash using the right backup and restore strategy without data loss, but you can't if your transaction log is the disk crash victim.

 

Technorati tags: , ,
Friday, 13 April 2007 14:43:23 (W. Europe Daylight Time, UTC+02:00)

When does SQL Server, under its default settings, write dirty pages to disk?

The answer depends on your recovery model, but let's assume Full Recovery. In that case, either when the lazy writer kicks in or when a checkpoint is issued.

-- (Relevant) Memory usage by SQL Server --

To simplify, SQL Server needs datapages in memory to work with them. To do so, SQL Server uses available memory from the system for a buffer pool. This buffer pool is filled with datapages, while SQL Server has the opportunity to do so (and when memory runs out on the system, SQL Server will release memory from its buffer pool). When one or more records are SELECTed from the datapage, the datapages containing these records are either already present in the buffer pool, or read from disk. When the datapages are modified (like in UPDATE, DELETE and INSERT), SQL Server carries these operations out on the pages in memory (and also writes the modification to the disk-bound transaction log). When the data page modification is completed, the data page stays in memory and is marked dirty (because the page in memory and the corresponding page in the datafile are different).

There are also times when the buffer pool needs to be cleaned up, either because it's full and more needs to be stuffed in (like my garage) or because you want to clean it out (no real-world example applies to me here).

-- Lazy Writer and Checkpoint --

In the first case, where more data pages need to be stuffed in, the lazy writer gathers (some of) the dirty pages and has them written to disk. That way the memory that was occupied by those dirty pages can be reused. In the second case, where you want to clean the dirty pages, you issue a checkpoint, thereby ordering SQL Server to flush the dirty pages to the data file(s) on disk.

This way issueing a checkpoint, has the positive side-effect of making memory in the buffer pool available for reuse. Side note; neither the lazy writer, nor checkpoint, free data pages. They just mark them safe for reuse.... but wait a minute, cleaning out the buffer pool (or the garage), isn't that a must do??? To be honnest, only when it's full!

So when does a checkpoint occur? When a database is closed (like when taking the database offline, stopping the service) or when a backup is made, as part of the opperation a checkpoint is issued. Besides those events, there is also the option to programatically/interactively issue a checkpoint, finally SQL Server issues checkpoints automatically, based on the recovery interval.

-- Recovery --

There is a more compelling reason (compared to buffer pool maintenance) to issue a checkpoint every now and then. It is recovery!

And why is recovery important? When a database starts up, recovery is one of the processes. During recovery, SQL Server looks up the last known checkpoint in the transaction log and then starts redoing the committed transaction from the transaction log. If the database was closed clean the last time, the checkpoint is the last thing in the transaction log, so recovery costs next to nothing. But if the database wasn't closed nicely (like when the power supply unit fails), then it may be hours (or longer) since last checkpoint. SQL Server then starts redoing from the transaction log, this may take some time, but at least you don't lose commited transaction.

BUT DIDN'T YOU .... ?

YES, I DID! That is, pull the memory stick with the transaction log out, to simulate a failure of the disk. And that's where it hurts as data pages only get flushed to disk if;

  • Both the buffer pool and system memory are full (or memory is capped with max server memory). Based on a few inserts and updates... don't think so.
  • The database is closed or backed up. Not the case when you pull the stick.
  • A manual CHECKPOINT is issued. Didn't do that.
  • Recovery interval kicks in. Waited just for that to happen...

However, the recovery interval of 0 does not mean SQL Server will issue a checkpoint automatically within a minute. It says, checkpoints do occur when the database engine estimates it can process the number of log records within the interval specified during the recovery process! In other words, it may take very long between checkpoints if a system is running very few transactions... like I noticed during my tests.

Technorati tags: , ,
Friday, 13 April 2007 14:41:49 (W. Europe Daylight Time, UTC+02:00)
# Monday, 09 April 2007

"Let's check and see..." and we came back with no license info on SQL Server. Yet the course certainly gives an overview of the various licensing options, so does the Microsoft website. However, when installing SQL Server, you're not asked about licensing. Also the queries SELECT SERVERPROPERTY('LicenseType') and SELECT SERVERPROPERTY('NumLicenses') both return DISABLED.

By design, the product SQL Server 2005 does nothing in terms of enforcing or even storing license information. Eric Burgess mentions this on the Microsoft SQL Server Support Blog, including how to store the license information in the Windows Registry, as was done with SQL Server 2000.

Okay, that for the technical detail, but what remains puzzling is Microsoft's current strong/aggressive campaigning on licensing versus no instruments in SQL Server.

Technorati tags: , , ,
Monday, 09 April 2007 13:57:52 (W. Europe Daylight Time, UTC+02:00)
# Saturday, 07 April 2007

Module 3 of the course 2780 is about backup and restore. When you combine this module with the database best-practice of separating the data files and transaction log files on different disk, you can even demonstrate how to recover from disk failure using the right backup and restore strategy without losing data... that is, recover if the failing disks contain the data files!!!

But what if the failing disk contains the transaction log files? Backup and restore are not going to help you to recover up to the moment of disaster. Sure, you can restore the database from last backup, but if last backup was on Sunday and the incident occurs Friday afternoon, about a weeks work is lost!

Yet we know all data is still sitting in the data file(s), the only sad part is, we can't bring the database up without the log file. So the question is, how to give the database its log file(s) back. Not knowing the answer from the top of my head, in which case I usually say "Let's see what happens if we stop the service and throw away a logfile", results were inconclusive and since (lack of) time is a major issue on the 2780, the subject of lost transaction log files landed on the todo-list.

/* BEGIN UPDATE 1 */

HALT, don't do anything yet if you have this problem in a production environment. Just read this post and the post Recovery continued, as there is no need to detach the failed database if your instance is still up and running.

/* END UPDATE 1 */

Now for the answer, which get's pretty complicated, depending on the state of your database. IF your database shutdown was clean:

Detach the damaged database. Then attach the data file(s) again, but remove the missing logfile from the list (see picture). This way, SQL Server recreates the logfile.

Depending on the state of the database SQL Server may even perform this operation at database startup. So you won't even notice it when you don't visit the log-files (Windows Application EventLog or SQL Server Activity log). Explains why results were inconclusive during class. This is easily tested (not on production databases, naturally) by stopping the database service, throwing away the logfile of a user database and starting the database service again. Either, you'll see the recreation of the transaction log in the Event log and Activity log, or you have to perform the detach and attach yourself.

Both for detaching and attaching, you may get error messages. However, these are warnings, the operations do complete successfully.

That was the easy part, but surely you've seen the big, fat IF! Databases don't shutdown clean if the log file crashes. To get a picture of what happens when a log-file is lost due to a disk crash, I created a database with the data file on disk (D:\SQL_DATA\LogCrash_Demo.mdf) and the transaction log on a memory stick (G:\SQL_DATA\LogCrash_Demo_Log.ldf). Next I added a table and started inserting new records. And... pull the stick out!

Msg 9001, Level 21, State 4, Line 1
The log for database 'LogCrash_Demo' is not available. Check the event log for related error messages. Resolve any errors and restart the database.
Msg 0, Level 20, State 0, Line 0
A severe error occurred on the current command. The results, if any, should be discarded.

After detaching the database and trying to attach (CREATE DATABASE ... FOR ATTACH_REBUILD_LOG) it again, the following error comes up;

File activation failure. The physical file name "G:\SQL_DATA\LogCrash_Demo_log.ldf" may be incorrect.
The log cannot be rebuilt because the database was not cleanly shut down.
Msg 1813, Level 16, State 2, Line 1
Could not open new database 'LogCrash_Demo'. CREATE DATABASE is aborted.

/* BEGIN UPDATE 2 */

I was going to look into what DBCC REBUILD_LOG could have done in this scenario and how that could be applied to SQL Server 2005... as of SQL Server 2005 DBCC REBUILD_LOG is no more and since DBCC REBUILD_LOG is an undocumented command, so the succession isn't documented either (if you care for an overview of DBCC changes, including "undocumented" stuff, look at Norb Technologies).

But just forget about DBCC REBUILD_LOG, Paul Randal is pretty clear about it (and all of a sudden it's gone in SQL Server 2005). Besides, the promised later post, how to get to the data in the data file is available too.

/* END UPDATE 2 */

UPDATED 2007-04-13

Technorati tags: , ,
Saturday, 07 April 2007 14:43:15 (W. Europe Daylight Time, UTC+02:00)
# Wednesday, 04 April 2007

Recently I finished teaching a couple of courses, to be more specific; the Microsoft courses 2779 and 2780. This resulted in a little todo-list for postings on the blog, either because time was lacking or because I needed to look up some details to properly answer the question.

In the next couple of days, I will post answers (if I can answer the question) on the following questions;

As the answer start appearing on the blog, I will update this post, linking to the answers. Another answer is also relevant in this context, the things that changed when doing full server recoveries.

Wednesday, 04 April 2007 19:35:02 (W. Europe Daylight Time, UTC+02:00)

Lately is was confronted with a couple of textfiles that wouldn't load into the database (SQL Server 2005). I was getting the errors 

Msg 4832, Level 16, State 1, Line 1
Bulk load: An unexpected end of file was encountered in the data file.
Msg 7399, Level 16, State 1, Line 1
The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 1
Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".

when trying to acquire the file through

BULK INSERT tblImport
FROM 'Z:\concat-file.txt'
WITH(TABLOCK, FORMATFILE = 'Z:\xmlformatfile.xml')

Turned out I had an arrow sitting at EOF (end of file) with the troublesome textfiles, which where concatenated using the copy-command;

COPY source*.txt concat-file.txt

The arrow representation was new to me, but after a little digging it boiled down to CTRL+Z and the difference of ASCII and binary copying... by default copy is ASCII and when merging files, so an end of file character gets inserted. The EOF character is omitted when doing a binary copy, so

COPY source*.txt /b concat-file.txt

resolved the error.

Curious about the arrow representation on Vista, I decided to verify with Windows Server 2003. Here the EOF story is (almost) the same, only the arrow is shown as a rectangle. Read more about copy here.

Wednesday, 04 April 2007 16:13:08 (W. Europe Daylight Time, UTC+02:00)
# Saturday, 24 March 2007

Does your backup and recovery strategy take the new SQL Server 2005 mssqlsystemresource database into account? What's different for SQL Server 2000 and 2005 from backup/restore perspective.

----

Compared to SQL Server 2000, SQL Server 2005 has a lot of nice features aimed at minimizing downtime during "unplanned events", most notably piecemeal restore and the inclusion of Full-Text indexes in the database backup and restore. However, before being able to do database restores, a functional database server is needed. Besides being a SQL Server instance, the database server also is at least a Windows operating system and hardware (or a virtual machine if you like). The later two usually are the domain of the system administrator who will also take care of the operating systems, including installed applications, being backed up. Likewise the DBA will take care of databases.

A typical system backup will exclude certain files and directories known to be always locked, like pagefile.sys and the SQL Server \Data folder. The other SQL Server folders and registry-setting (system state), are backed up. By restoring this backup for a SQL Server 2000, you get a database server with the Service Pack level of at time of the backup. The only thing that's missing for the database to start is the master database. In SQL Server 2005 only the master database isn't enough to get you started, you need mssqlsystemresource a.k.a. Resource Database as well.

What has changed that this is an important issue in the last step of getting a SQL Server instance running (quickly). The master database in SQL Server 2000 holds all information, the configuration related to the instance (like; logins, linked servers, databases) and the definition of system objects. When applying a service pack, typically maintenance on the system object is done, so the master database changes and is therefor Service Pack dependant. In SQL Server 2005 these two types of information are split, instance configuration still goes into the master database, but the system objects are held in the Resource Database. The advantage is that master is not depending on the Service Pack and since none but Microsoft should alter system objects, the Resource Database can remain read-only and will only be changed by Microsoft when applying the Service Pack.

When disaster strikes the (original) server, in many cases, a restore of the file system and system-state is performed on a more or less comparable machine configuration. This is the quick route, compared to installing Windows, a new instance and applying the Service Pack. The quick route has two extra advantages, no access to installation media is needed and other applications are restored too. But the quick restore route does not bring back a fully functional server, both sysadmin and DBA are aware of that. Therefor for this procedure to be successful, an offline backup of master (copy master.mdf and mastlog.ldf) was done after installation and application of a Service Pack. Now the last step is to copy master.mdf and mastlog.ldf to the SQL Server \Data folder. From here the DBA can start his/her database restore sequence (starting with the latest online backup of master (2000/2005)). This approach works on SQL Server 2000.
For SQL Server 2005, you have to add include the Resource Database in this strategy. And here is a little (positive) difference, because the Resource Database is accessed read-only, it only has read locks and thus can be copied while the instance is running. So you can either copy it after the installation or when the Service Pack is applied, or you can alter the file system backup to specifically include the files mssqlsystemresource.mdf and mssqlsystemresource.ldf.

----

What if disaster struck your SQL Server 2005 instance and you haven't cared for the Resource Database... well then you're off to the slow route of installing the instance after the restore and applying the Service Pack, before the database restore sequence can be started.

Saturday, 24 March 2007 14:29:51 (W. Europe Standard Time, UTC+01:00)
# Monday, 19 March 2007

Recently a student who was going through the Microsoft eLearning course 2778 asked for an explanation of "accent sensitive". Before going into why it matters to SQL Server, lets look into the role of accents. A lot of languages make use of accents to change the pronunciation of some letters. Like with the accents in the German language, which are called Umlaut, when translating the word Umlaut very directly, you'd get alternate tone.

Back to SQL Server and why it should be sensitive to accents, or for that matter case, kana or width (things we'll get to once we touch COLLATION). If we ask a question to our server, frequently we want the answer to be in order. And this order very much depends on data type.

Let's take a group of downhill skiing contestants. In preparation of a match, some overview is needed. The contestants have to be grouped in ages, so there is a need for sorting on birthdate (a datetime data type). During the match, there are results, so there also is a need to order by the time taken to get from start to finish (the number of seconds as decimal in descending order), so the winner is at first place. And probably before the match, there is the need for a list with all contestants ordered by lastname.

Where a programmer (VB, C#, etc) would see this lastname as a string, SQL Server actually has a lot of possible data types to represent and handle textual information.

Single Byte Double Byte
Fixed length char(x) nchar(x)
Variable length varchar(x) nvarchar(x)
CLOB (Character Large OBject) varchar(max) nvarchar(max)
Back-Compat CLOB text ntext

Though these data types all can store and handle textual data types, the collation has to be specified as well.

Collation

Collation is best thought of as the order of characters in a given character-set. In this sense, an alphabet is a character-set, but collation applies to other writing systems too. In SQL Server, the collation includes the character-set and the options;

  • binary
  • binary-code point
  • Case sensitive
  • Accent sensitive
  • Kana sensitive
  • Width sensitive

which are all explained at MSDN. These options are also referred to as sort order.

Collation exists at the levels of instance, database and column. If a collation is not defined when creating a column, it is inherited from the database. Likewise, if no collation is specified when creating the database, it inherits the collation from the instance. The collation of the instance is defined when it is installed. When defining the collation, the designer (see picture) can be used.

Note there is a choice for SQL Collations and Windows Collation. In SQL Server 2005, go for the Windows Collation (SQL Collation is mainly there for compatibility reasons).

Sorting and storing

The collation is used by SQL Server both when sorting data (SELECT ... FROM ... ORDER BY ...) and when storing textual data as part of an index (both clustered and non-clustered). It can happen that you need data that is stored in one collation, to be evaluated in another collation, like when comparing data from two different collations. However, two different collations can not be compared, so collations have to be made the same. This is where the COLLATE clause comes into play, so you can specify a collation in the query.

Example

To show some differences with sort order and collation, I've added a script for a demo database with two tables, both containing some character combinations with variations in case and accents. One table is stored in the Latin1_General_CS_AS collation, the other one in German_PhoneBook_BIN2. Both contain the same data;

RowNum
int
TextData
nchar(10)
1 ABC
2 ÄBC
3 abc
4 äbc
5 ABc
6 ÁBC
7 abC
8 ábc

When selecting from the Latin1_General_CS_AS collated table, ordered by TextData, you receive the order 3, 7, 5, 1, 8, 6, 4, 2. Whereas with the German_PhoneBook_BIN2 collation, the order is 1, 5, 7, 3, 6, 2, 8, 4. Now with the COLLATE clause, you can alter the collation in a query. So

SELECT RowNum, TextData FROM GerPB_BIN2 ORDER BY TextData COLLATE Latin1_General_CS_AS

though selected from the table in German_PhoneBook_BIN2 collation, is returned in the order of the Latin1_General_CS_AS collation. When comparing two from two different collations, like you do in a join, SQL Server will raise an error.

SELECT * FROM GerPB_BIN2 G INNER JOIN Latin1_CS_AS L ON G.TextData = L.TextData

Msg 468, Level 16, State 9, Line 1
Cannot resolve the collation conflict between "Latin1_General_CS_AS" and "German_PhoneBook_BIN2" in the equal to operation.

However, by using the COLLATE clause with one of the compared columns, they can be joined.

SELECT * FROM GerPB_BIN2 G INNER JOIN Latin1_CS_AS L ON G.TextData COLLATE Latin1_General_CS_AS = L.TextData

Guess this is the long answer, but not the last time I have to give it...

Monday, 19 March 2007 11:40:48 (W. Europe Standard Time, UTC+01:00)
# Thursday, 15 March 2007

To be more specific;

  • Microsoft Virtual Server 2005 R2
  • Microsoft SQL Server 2005 Reporting Services

James Kovacs and Virtual PC Guy provide the details per server, aggregated it becomes:

  • Internet Information Server
    • Web Management Tools
      • IIS 6 Management Compatibility
        • IIS 6 WMI Compatibility
        • IIS Metabase and IIS 6 configuration comparability
      • IIS Management Console
    • World Wide Web Service
      • Application Development
        • (.NET Extensibility comes with ASP.NET)
        • ASP.Net
        • CGI
        • ISAPI Extension (comes with ASP.NET)
        • ISAPI Filters (comes with ASP.NET)
      • Common HTTP Features
        • Default Document
        • Directory Browsing
        • HTTP Errors
        • HTTP Redirection
        • Static Content
      • Health and Diagnostics
        • HTTP Logging
        • Request Monitor
      • Performance Features
        • Static Content Compression
      • Security
        • (Request Filtering comes with ASP.NET)
        • Windows Authentication

Installing SQL Server 2005 (including SSRS)

Ignore any compatibility warnings, just proceed installation and apply SP2 after installing. During the install you also see a warning about a compatibility issue for Visual Studio (SQL Server Business Intelligence Development Studio is based on the Visual Studio 2005 Development Environment), you can solve this by installing the Visual Studio 2005 SP1 followed by the Update for Vista.

Now you may have heard about the re-release of SP2, but the only one you can download from Microsoft is the fixed one (available since March 5th 2007). If you're not sure, read KB-article 933508 how to verify.

Installing Virtual Server 2005 R2

Start the installation of Virtual Server by right-clicking the setup-exe and running it as administrator. If you don't, it pretends IIS is missing (also, it won't detect the IIS portnumber).

Thursday, 15 March 2007 10:40:03 (W. Europe Standard Time, UTC+01:00)
# Wednesday, 14 March 2007

Okay, you are an administrator, for you just successfully installed the SQL Server on your Vista box. Now you want to start SSMS. 

SQL Server Management Studio If you just click this, you're likely not able to logon with Windows Authentication although you are member of BUILTIN\administrators (how else did you install?)!!! This is User Account Control (UAC) trying to nag you, but there are five ways to be relieved of this behavior:

  1. Log on as sa in Mixed Authentication mode. Not recommended, besides you either had to have this configured during install, or have to do it now (go to point 2 to get started).
  2. Right-click SQL Server Management Studio and "Run as administrator" (if you don't want to do this every time you start SSMS, go to point 3).
  3. Add your user account, or an appropriate group (other than BUILTIN\administrators) where you're a member as a login with the sysadmin server role.
  4. If your fed up with UAC altogether, visit the How-To Geek
  5. Install SP2 and use the SQL Provisioning tool after the Service Pack is applied.
Wednesday, 14 March 2007 14:59:55 (W. Europe Standard Time, UTC+01:00)