# Friday, 08 August 2008

Yesterday evening I got into a fight with the SSIS XML Source Data Flow Source. It actually was the first time I used this Data Flow Source. I had no expectations other than, point to your XML-file and get the data. Wrong... I did point to an XML-file, generate a schema and... no columns! I ended up with the error:

Validation error. Data Flow Task: XML Source [000]: A component does not have any inputs or outputs.

In cases like this, Google should be your friend... well I read a lot about SSIS in general, some even about the XML Source but nothing to provide me with answers or even help me out. It was after reading a post by Oren Eini that I decided I was on the wrong trouble shooting track and a good night sleep would be the best next step.

This morning I started with clear vision and an open mind. No answers through Google, nothing useful on Connect, so I tried if I could reproduce my problem with an other document. The document I created was of a very simple structure;

<people>
  <person>
    <firstname>Jan</firstname>
    <lastname>Aerts</lastname>
  </person>
  <person>
    <firstname>Anne</firstname>
    <lastname>Mulders</lastname>
  </person>
</people>

It worked! But now I had to find out why the document above worked, and the other one didn't. Again I read about SSIS in general and a little something about the XML Source. In particular (SSIS in general), I stumbled upon a post by Jamie Thomson, sounding familiar and one about the XML Source I wish I came across earlier: Using XML Source by Matt Masson. I could already agree with his opening comment, especially the various degrees of success. While reading Matt's article I had this feeling my XML document might actually be to simple... it occurred to me that the XML Source was not just going to read XML, it was trying to represent the XML as one or more tables.

A very simple representation of my original document is;

  <person id="1">
    <firstname>Jan</firstname>
    <lastname>Aerts</lastname>
  </person>
<row column1="value">
  <column2>value</column2>
  <column3>value</column3>
</row>

The simplest representations Matt used, are:

<rootgoo>
  <goo>
    <subgoo>value</subgoo>
    <moregoo>1</moregoo>
  </goo>
  <goo>
    <subgoo>value</subgoo>
    <moregoo>2</moregoo>
  </goo>
</rootgoo>
<table>
  <row>
    <column1>value</column1>
    <column2>value</column2>
  </row>
  <row>
    <column1>value</column1>
    <column2>value</column2>
  </row>
</table>

AND

<root>
  <row CustomerID="1" TerritoryID="1" AccountNumber="AW00000001" />
  <row CustomerID="2" TerritoryID="1" AccountNumber="AW00000002" />
</root>
<table>
  <row column1="value" column2="value" column3="value" />
  <row column1="value" column2="value" column3="value" />
</table>

So my document could never be translated to a table... to get back to Oren's post: If only SSIS had told me so with a clear error or even a dialog in the XML Source, that would have saved me a couple of hours!

Or better, since XML Source tries to get data from the XML, it could do a best effort as wrapping something that looks a single row into a table (and to take it one step simpler, represent a single value as a table with just one row and one column.). If you'd like to see some improvement here too, take a moment to vote on FeedbackID 361057.

On a version note, it happens with SQL Server 2005 (SP2) and SQL Server 2008 (PreRelease).

Friday, 08 August 2008 12:40:45 (W. Europe Daylight Time, UTC+02:00)
# Wednesday, 06 August 2008

REDMOND, Wash. — Aug. 6, 2008 — Microsoft Corp. today announced the release to manufacturing of Microsoft SQL Server 2008 -->

Wednesday, 06 August 2008 20:46:46 (W. Europe Daylight Time, UTC+02:00)
# Saturday, 02 August 2008

As most will know sp_ does not stand for stored procedure, it stands for system stored procedure. But calling your procedure sp_something doesn't make it a system procedure automatically, it just hints the server how to resolve the procedure.

When a procedure that starts with sp_ is called, first the master database is checked if it is a real system stored procedure. The books online shows this behavior by creating a procedure in AdventureWorks called dbo.sp_who. However, since sp_who is a real system stored procedure, the existence of AdventureWorks.dbo.sp_who is always ignored. If the procedure is not a real system stored procedure, the connected database is checked for the existence of the stored procedure. If it is in the database you're currently connected to, it gets executed. If it isn't in the database you're currently connected to, it is retrieved from master (or you receive an error if it isn't there either). You can verify this behavior based on the following code.

USE AdventureWorks
GO
CREATE PROCEDURE sp_sayhello
AS
SELECT 'Hello from AdventureWorks, you are connected to ' + DB_NAME() + '.'
GO
USE master
GO
CREATE PROCEDURE sp_sayhello
AS
SELECT 'Hello from master, you are connected to ' + DB_NAME() + '.'
GO

Now, when executing sp_sayhello while connected to AdventureWorks, it will return;

Hello from AdventureWorks, you are connected to AdventureWorks

With any other database, say msdb, you get the following result.

Hello from master, you are connected to msdb

So there are two reasons why starting your stored procedure name with sp_ isn't smart;

  • Performance; each time the procedure is called, a (futile) lookup is done against the master database.
  • Future; if you have a stored procedure in your database called sp_dosomething and Microsoft implements a system stored procedure sp_dosomething in SQL Server, your application is broken.

There is however one scenario where creating stored procedures with sp_ is smart: When you create it in master as part of your own standardized way of working. Creating your own toolbox so to say. With SQL Server 2005 and 2008 there is an automatic separation, your sp_ procedures are created in the dbo schema by default and the real system stored procedures reside in the sys schema (the actual system stored procedures are in the mssqlsystemresource database).

Your own sp_ procedures and schemas: DON'T!!! It does not work if the schema in master isn't dbo.

USE master
GO
CREATE SCHEMA
toolbox
GO
CREATE PROCEDURE
toolbox.sp_sayhello
AS
SELECT
'Hello from master. You are connected to ' + DB_NAME() + '.'
GO
USE
AdventureWorks
GO
EXEC
sp_sayhello -- Doesn't work
EXEC toolbox.sp_sayhello -- Doesn't work
EXEC master.toolbox.sp_sayhello -- Executes against master, not AdventureWorks.
GO
USE
master
GO
DROP PROCEDURE
toolbox.sp_sayhello
GO
DROP SCHEMA
toolbox
GO

Your own sp_ procedures and non-privileged users: Make sure the login has permissions to execute the procedure from master and that any needed permissions are held in the target database. To illustrate, a login, mapped to a user in AdventureWorks will execute a stored procedure named sp_maketable. To make this work, public (therefor any login through guest, which is appropriate for master) will receive execute permissions on the procedure and create table and alter schema permissions are granted to the user in AdventureWorks. The table is created in the default schema of the user.

USE master
GO
CREATE PROCEDURE sp_maketable
AS
CREATE TABLE tblTest (col1 int)
GO
GRANT EXECUTE ON dbo.sp_maketable TO public -- Make sure permissions allow the user to execute.
GO
CREATE LOGIN np_user WITH PASSWORD = 'secret', DEFAULT_DATABASE = AdventureWorks
GO
USE AdventureWorks
GO
CREATE USER np_user FOR LOGIN np_user WITH DEFAULT_SCHEMA = Sales
GO
GRANT CREATE TABLE TO np_user -- Make sure the user has proper permissions in the database.
GO
GRANT ALTER ON SCHEMA::Sales TO np_user -- Make sure the user has proper permissions in the schema.
GO
EXECUTE AS LOGIN = 'np_user'
GO
SELECT SUSER_SNAME(), USER_NAME() -- Verify it is executing as the user.
GO
EXEC sp_maketable
GO
REVERT

Important stuff when writing your own sp_ 's:

  • BACKUP DATABASE master just became even more important.
  • Double check on which of your own procedures you grant execute permissions.
  • Use a proper naming convention, like including your company name, to avoid naming collision with future Microsoft system stored procedures.
  • If a procedure exists with the same name in one of your databases and you are connected to that database, the local procedure gets executed, not the central one from master.
  • Document.
  • Mark your sp_ as system object with sp_MS_marksystemobject
Technorati tags: ,
Saturday, 02 August 2008 20:37:45 (W. Europe Daylight Time, UTC+02:00)
# Tuesday, 17 June 2008

After running through the prep-guide (looking through a pair of SQL Server 2005 glasses), I identified a couple of topics worth giving a closer look. The topics are derived from the prep-guide, my comments about the topic added in blue italics and the bulleted list refers to (mostly) BOL-resources. This post is based on the prep-guide for 70-432 with published date June 11, 2008

Installing and Configuring SQL Server 2008 (10 percent)

Configure additional SQL Server components.
This objective may include but is not limited to: SQL Server Integration Services (SSIS), SQL Server Analysis Services (SSAS), SQL Server Reporting Services (SSRS), replication. Not that I expect this to be really different from SQL Server 2005, but if your background is just DBA (MCTS/MCITP) it may be your first encounter with the BI-components.

Maintaining SQL Server Instances (13 percent)

Implement the declarative management framework (DMF).
This objective may include but is not limited to: create a policy; verify a policy; schedule a policy compliance check; enforce a policy; create a condition.

Back up a SQL Server environment.
This objective may include but is not limited to: operating system-level concepts. I don't expect a lot of fireworks, but the operating system-level concepts made me curious.

  • Planning for Disaster Recovery Actually, I'm still curious what is meant by operating system-level concepts. This link from BOL is actually my best shot at a document where some broader considerations are presented.

Managing SQL Server Security (15 percent)

Manage transparent data encryption.
This objective may include but is not limited to: impact of transparent data encryption on backups.

Maintaining a SQL Server Database (16 percent)

Back up databases.
This objective may include but is not limited to: full backups; differential backups; transaction log; compressed backups; file and filegroup backups; verifying backup. Only compressed backups is to be classified as new.

Performing Data Management Tasks (14 percent)

Implement data compression.
This objective may include but is not limited to: sparse columns; page/row.

Maintain indexes.
This objective may include but is not limited to: create spatial indexes; create partitioned indexes; clustered and non-clustered indexes; XML indexes; disable and enable indexes; filtered index on sparse columns; indexes with included columns; rebuilding/reorganizing indexes; online/offline. Spatial and filtered indexes on sparse columns are of interest here, along with "is not limited to" which could be indexes on hierarchyid columns.

Optimizing SQL Server Performance (10 percent)

Implement Resource Governor.

Use Performance Studio.

  • Data Collection Entry page, includes How-To
  • Again, Performance Studio, also an MS-Name-Game, what you're really looking for is Data Collection... and trying to get that confirmed, I found this webcast by Bill Ramos (62 minutes).

The rest, well it is all too familiar from SQL Server 2005. Sure, I'll look for some "What's new" resources, but I think the above pretty much covers what I need to familiarize my self with.

Technorati tags: , ,
Tuesday, 17 June 2008 18:13:20 (W. Europe Daylight Time, UTC+02:00)
# Tuesday, 10 June 2008

Release Candidate 0 is available for download (and downloading) and the MCTS exam 70-432 went into beta testing (and I registered). Since the beta is only running from June 9th through June 30th, I had to go for 27th as it was the only gap in my schedule. Let's see if I can find the time to blog about my preparations...

Technorati tags: , ,
Tuesday, 10 June 2008 20:02:47 (W. Europe Daylight Time, UTC+02:00)
# Friday, 06 June 2008
 #
 

Friday, 06 June 2008 06:52:38 (W. Europe Daylight Time, UTC+02:00)

Exam-stats: 180 minutes1, 61 question spread over 6 testlets (cases), passing score 700 points, only multiple choice questions, no simulations. I got 752, was lousy on SSAS ...and this actually was the first Microsoft exam where I really needed the time!!!  Compared to the other MCITP (70-443, 70-444 and 71-647) exams I sat, it was a lot more reading and fact-finding in the case-studies.

What surprised me on this exam, were a couple of questions targeted at the database engine. Think towards backup-requirements for filegroups (which are needed for partitioned tables), index optimization and transaction isolation levels (not mentioned in prep-guide). Unfortunately these topics aren't covered2 in the courses 2794 to 2797 (or in 2791 to 2793). From the topics that are covered in the prep-guide, I'd say the number of questions was pretty balanced, only four things were really sticking out:

  • Which data mining algorithm to apply in a certain scenario.
  • Storage strategy for SSAS cubes.
  • Slowly changing dimensions.
  • Designing dimensions and hierarchies.

Useful resources for preparation.

1Actually, you get about 3 minutes per question grouped per testlet. This means for a 9 question testlet you get about 27 minutes, time left on one testlet is not added to the next. The 180 minutes should be regarded as an indication for the maximum exam length.
2 At best superficially mentioned in 2796.
Friday, 06 June 2008 00:14:29 (W. Europe Daylight Time, UTC+02:00)
# Saturday, 31 May 2008

Just a few links where you can find more info about SQL Server 2008 Certification in general and about the separate certification tracks and exams.

Track alignment Database Administration Database Development Business Intelligence
Microsoft Certified Technology Specialist (MCTS)

MCTS: SQL Server 2008, Implementation and Maintenance

MCTS: SQL Server 2008, Database Development

MCTS: SQL Server 2008, Business Intelligence Development and Maintenance

MCTS requirements Pass: Exam 70-432 (expected availability of exam August 2008) Pass: Exam 70-433 (expected availability of exam October 2008) Pass: Exam 70-448 (expected availability of exam August 2008)
Microsoft Certified Information Technology Professional

MCITP: Database Administrator 2008

MCITP: Database Developer 2008

MCITP: Business Intelligence Developer 2008

MCITP requirements Hold above MCTS certification and pass Exam 70-450 (expected availability of exam November 2008) Hold above MCTS certification and pass Exam 70-451 (expected availability of exam January 2009) Hold above MCTS certification and pass Exam 70-452 (expected availability of exam November 2008)
Upgrade option existing MCITP for SQL Server 2005 Existing MCITP:Database Administrators can upgrade above MCTS and MCITP by passing Exam 70-453 (expected availability of preparation guide September 2008) Existing MCITP:Database Developers can upgrade above MCTS and MCITP by passing Exam 70-454 (expected availability of preparation guide September 2008) Existing MCITP:Business Intelligence Developers can upgrade above MCTS and MCITP by passing Exam 70-455 (expected availability of preparation guide September 2008)
No upgrade paths exist for MCTS for SQL Server 2005 to MCTS for SQL Server 2008. Thanks Trika, for the pointer and poster.
Saturday, 31 May 2008 17:19:20 (W. Europe Daylight Time, UTC+02:00)
# Wednesday, 28 May 2008

In other words, the system could not find the specified path... that was when I tried restoring through a substituted (subst R: G:\project02\SQLBackup) drive, and for sure R:\MyDB_20080521.bak did exist. Nastier was the fact that G:\project02\SQLBackup\MyDB_20080521.bak restored in an instant, so the problem was likely to be found in the interaction between subst and SQL Server. This was rather disappointing, as it would have been nice to separate the environment for different customers/projects and still be able to use generic scripts targeted at a drive-letter.

So a little quest started to find some solutions, if it's a bug?, should be a feature... I was surprised by the small number of pointers I came across on the Internet and only one useful. After establishing net use R: \\myserver\projdata\project02\SQLBackup suffered the same problem when it came to restoring, I widened the search an came across a post from the beta stage of SQL Server 2005. In there Erland Sommarskog hinted it might have something to do with profile setting, which would apply if the setting in question weren't a system setting. Now that made perfect sense, so let's see how the world looks like through xp_cmdshell.

So on my SQL Server 2005 instance I enabled xp_cmdshell and I substituted a path for a drive letter on Windows. Sure enough, I could see the substituted drive W: as pointing to D:\SQLData, but running EXEC xp_cmdshell 'subst' yielded NULL from the instance running under LocalSystem. Then, after running EXEC xp_cmdshell 'subst W: D:\SQLData' my SQL Server instance running under LocalSystem got the picture too. From SQL Server Management Studio I could "see" drive W: and also restore from and backup to my "W-drive". Safe to say it isn't a bug for SQL Server and if a feature request is to be made, it has to be addressed to the Windows team. Something like; enable system-wide setting of subst and net use commands by administrators through an extra switch (or make it a policy setting default on for administrators only).

In the end it can easily be solved with a little sqlcmd-script like:

:setvar path "G:\project02\SQLBackup"
:setvar drive "R"
EXEC xp_cmdshell 'subst $(drive): /d'
GO
EXEC
xp_cmdshell 'subst $(drive): $(path)'
GO

Wednesday, 28 May 2008 21:55:46 (W. Europe Daylight Time, UTC+02:00)
# Thursday, 17 April 2008

From my perspective, a very successful event. Congratulations to the organizing committee and the volunteers supporting them. Also may thanks to the speakers, especially Bob Ward (pre-conference and session), Remus Rusana, James Luetkehoelter, Chuck Hawkins, Klaus Aschenbrenner (GEOGRAPHY data type).

Also look out for James' follow ups on "our" questions, guess the audience was paranoid enough.

Technorati tags:
Thursday, 17 April 2008 22:58:33 (W. Europe Daylight Time, UTC+02:00)