# Friday, 08 August 2008

Yesterday evening I got into a fight with the SSIS XML Source Data Flow Source. It actually was the first time I used this Data Flow Source. I had no expectations other than, point to your XML-file and get the data. Wrong... I did point to an XML-file, generate a schema and... no columns! I ended up with the error:

Validation error. Data Flow Task: XML Source [000]: A component does not have any inputs or outputs.

In cases like this, Google should be your friend... well I read a lot about SSIS in general, some even about the XML Source but nothing to provide me with answers or even help me out. It was after reading a post by Oren Eini that I decided I was on the wrong trouble shooting track and a good night sleep would be the best next step.

This morning I started with clear vision and an open mind. No answers through Google, nothing useful on Connect, so I tried if I could reproduce my problem with an other document. The document I created was of a very simple structure;

<people>
  <person>
    <firstname>Jan</firstname>
    <lastname>Aerts</lastname>
  </person>
  <person>
    <firstname>Anne</firstname>
    <lastname>Mulders</lastname>
  </person>
</people>

It worked! But now I had to find out why the document above worked, and the other one didn't. Again I read about SSIS in general and a little something about the XML Source. In particular (SSIS in general), I stumbled upon a post by Jamie Thomson, sounding familiar and one about the XML Source I wish I came across earlier: Using XML Source by Matt Masson. I could already agree with his opening comment, especially the various degrees of success. While reading Matt's article I had this feeling my XML document might actually be to simple... it occurred to me that the XML Source was not just going to read XML, it was trying to represent the XML as one or more tables.

A very simple representation of my original document is;

  <person id="1">
    <firstname>Jan</firstname>
    <lastname>Aerts</lastname>
  </person>
<row column1="value">
  <column2>value</column2>
  <column3>value</column3>
</row>

The simplest representations Matt used, are:

<rootgoo>
  <goo>
    <subgoo>value</subgoo>
    <moregoo>1</moregoo>
  </goo>
  <goo>
    <subgoo>value</subgoo>
    <moregoo>2</moregoo>
  </goo>
</rootgoo>
<table>
  <row>
    <column1>value</column1>
    <column2>value</column2>
  </row>
  <row>
    <column1>value</column1>
    <column2>value</column2>
  </row>
</table>

AND

<root>
  <row CustomerID="1" TerritoryID="1" AccountNumber="AW00000001" />
  <row CustomerID="2" TerritoryID="1" AccountNumber="AW00000002" />
</root>
<table>
  <row column1="value" column2="value" column3="value" />
  <row column1="value" column2="value" column3="value" />
</table>

So my document could never be translated to a table... to get back to Oren's post: If only SSIS had told me so with a clear error or even a dialog in the XML Source, that would have saved me a couple of hours!

Or better, since XML Source tries to get data from the XML, it could do a best effort as wrapping something that looks a single row into a table (and to take it one step simpler, represent a single value as a table with just one row and one column.). If you'd like to see some improvement here too, take a moment to vote on FeedbackID 361057.

On a version note, it happens with SQL Server 2005 (SP2) and SQL Server 2008 (PreRelease).

Friday, 08 August 2008 12:40:45 (W. Europe Daylight Time, UTC+02:00)
# Wednesday, 06 August 2008

REDMOND, Wash. — Aug. 6, 2008 — Microsoft Corp. today announced the release to manufacturing of Microsoft SQL Server 2008 -->

Wednesday, 06 August 2008 20:46:46 (W. Europe Daylight Time, UTC+02:00)
# Saturday, 02 August 2008

As most will know sp_ does not stand for stored procedure, it stands for system stored procedure. But calling your procedure sp_something doesn't make it a system procedure automatically, it just hints the server how to resolve the procedure.

When a procedure that starts with sp_ is called, first the master database is checked if it is a real system stored procedure. The books online shows this behavior by creating a procedure in AdventureWorks called dbo.sp_who. However, since sp_who is a real system stored procedure, the existence of AdventureWorks.dbo.sp_who is always ignored. If the procedure is not a real system stored procedure, the connected database is checked for the existence of the stored procedure. If it is in the database you're currently connected to, it gets executed. If it isn't in the database you're currently connected to, it is retrieved from master (or you receive an error if it isn't there either). You can verify this behavior based on the following code.

USE AdventureWorks
GO
CREATE PROCEDURE sp_sayhello
AS
SELECT 'Hello from AdventureWorks, you are connected to ' + DB_NAME() + '.'
GO
USE master
GO
CREATE PROCEDURE sp_sayhello
AS
SELECT 'Hello from master, you are connected to ' + DB_NAME() + '.'
GO

Now, when executing sp_sayhello while connected to AdventureWorks, it will return;

Hello from AdventureWorks, you are connected to AdventureWorks

With any other database, say msdb, you get the following result.

Hello from master, you are connected to msdb

So there are two reasons why starting your stored procedure name with sp_ isn't smart;

  • Performance; each time the procedure is called, a (futile) lookup is done against the master database.
  • Future; if you have a stored procedure in your database called sp_dosomething and Microsoft implements a system stored procedure sp_dosomething in SQL Server, your application is broken.

There is however one scenario where creating stored procedures with sp_ is smart: When you create it in master as part of your own standardized way of working. Creating your own toolbox so to say. With SQL Server 2005 and 2008 there is an automatic separation, your sp_ procedures are created in the dbo schema by default and the real system stored procedures reside in the sys schema (the actual system stored procedures are in the mssqlsystemresource database).

Your own sp_ procedures and schemas: DON'T!!! It does not work if the schema in master isn't dbo.

USE master
GO
CREATE SCHEMA
toolbox
GO
CREATE PROCEDURE
toolbox.sp_sayhello
AS
SELECT
'Hello from master. You are connected to ' + DB_NAME() + '.'
GO
USE
AdventureWorks
GO
EXEC
sp_sayhello -- Doesn't work
EXEC toolbox.sp_sayhello -- Doesn't work
EXEC master.toolbox.sp_sayhello -- Executes against master, not AdventureWorks.
GO
USE
master
GO
DROP PROCEDURE
toolbox.sp_sayhello
GO
DROP SCHEMA
toolbox
GO

Your own sp_ procedures and non-privileged users: Make sure the login has permissions to execute the procedure from master and that any needed permissions are held in the target database. To illustrate, a login, mapped to a user in AdventureWorks will execute a stored procedure named sp_maketable. To make this work, public (therefor any login through guest, which is appropriate for master) will receive execute permissions on the procedure and create table and alter schema permissions are granted to the user in AdventureWorks. The table is created in the default schema of the user.

USE master
GO
CREATE PROCEDURE sp_maketable
AS
CREATE TABLE tblTest (col1 int)
GO
GRANT EXECUTE ON dbo.sp_maketable TO public -- Make sure permissions allow the user to execute.
GO
CREATE LOGIN np_user WITH PASSWORD = 'secret', DEFAULT_DATABASE = AdventureWorks
GO
USE AdventureWorks
GO
CREATE USER np_user FOR LOGIN np_user WITH DEFAULT_SCHEMA = Sales
GO
GRANT CREATE TABLE TO np_user -- Make sure the user has proper permissions in the database.
GO
GRANT ALTER ON SCHEMA::Sales TO np_user -- Make sure the user has proper permissions in the schema.
GO
EXECUTE AS LOGIN = 'np_user'
GO
SELECT SUSER_SNAME(), USER_NAME() -- Verify it is executing as the user.
GO
EXEC sp_maketable
GO
REVERT

Important stuff when writing your own sp_ 's:

  • BACKUP DATABASE master just became even more important.
  • Double check on which of your own procedures you grant execute permissions.
  • Use a proper naming convention, like including your company name, to avoid naming collision with future Microsoft system stored procedures.
  • If a procedure exists with the same name in one of your databases and you are connected to that database, the local procedure gets executed, not the central one from master.
  • Document.
  • Mark your sp_ as system object with sp_MS_marksystemobject
Technorati tags: ,
Saturday, 02 August 2008 20:37:45 (W. Europe Daylight Time, UTC+02:00)