# Friday, 13 April 2007

If you're looking for the SQL Server 2000 perspective, go to Space Program.

Before venturing further, when was your last checkpoint? I learnt in my test setup that, when you only perform a small number of operations, none of the data is recoverable from the data file. It only existed on the dirty pages in memory and in the transaction log (which I deliberately crashed, to be able to advise on recovering from that event). This is an indicator that, if you lose the transaction log of a running database, you are very likely going to end up with data loss.

-----

...continued from Recovery when the transaction log is lost, if the database was detached or you want to connect the database to another instance.

Unfortunately, after the failed attach, there is no entry for the LogCrash_Demo database in the system tables anymore. So first you have to create a new database with the same name. Next, take this database offline and replace the data file of the newly created database with the data file containing the data you want back. Also, throw away the transaction log that was just created.
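A sketch of these steps in T-SQL (database name as in this post, file locations illustrative); note that the actual file swap happens at the operating system level while the database is offline:

```sql
-- Recreate a placeholder database under the original name:
CREATE DATABASE LogCrash_Demo;

-- Take it offline so its files can be replaced:
ALTER DATABASE LogCrash_Demo SET OFFLINE;

-- Now, at the OS level: overwrite the newly created .mdf with the data file
-- you want back, and delete the .ldf that was just created.
```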

Now proceed to the next section, where the database is started in emergency mode.

-----

...continued from Recovery when the transaction log is lost, if you didn't detach the database.

Start the LogCrash_Demo database in emergency mode (switching to emergency mode has changed with SQL Server 2005; it used to require directly updating system tables, which are now hidden from the DBA).

ALTER DATABASE LogCrash_Demo SET EMERGENCY

Now you can read from the database, which is enough to script the definitions of the objects (SSMS: right-click the database, Tasks --> Generate Scripts), create a new database with those scripted objects, and transfer (BULK INSERT, SQL Server Integration Services, etc.) the data from the database in emergency mode to the new database.
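If the new database lives on the same instance, the transfer can be as simple as an INSERT ... SELECT across databases. A sketch, where LogCrash_Rescue and dbo.DemoTable are hypothetical names standing in for your own:

```sql
-- Copy rows from the emergency-mode database into the freshly created one;
-- LogCrash_Rescue and dbo.DemoTable are made-up names for illustration:
INSERT INTO LogCrash_Rescue.dbo.DemoTable (RowNum, TextData)
SELECT RowNum, TextData
FROM LogCrash_Demo.dbo.DemoTable;
```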

Be careful though about data loss. The data file that was made available for SELECT in emergency mode does not contain the dirty pages.

So if you are worried about a crash of the transaction log media, make sure you use at least disk mirroring for your transaction log. Because when you look at backup and restore as in course 2780, you can rebuild your data files after a disk crash using the right backup and restore strategy without data loss, but you can't if your transaction log is the disk crash victim.

Friday, 13 April 2007 14:43:23 (W. Europe Daylight Time, UTC+02:00)

When does SQL Server, under its default settings, write dirty pages to disk?

The answer depends on your recovery model, but let's assume Full Recovery. In that case, either when the lazy writer kicks in or when a checkpoint is issued.

-- (Relevant) Memory usage by SQL Server --

To simplify: SQL Server needs data pages in memory to work with them. To do so, SQL Server uses available memory from the system for a buffer pool. This buffer pool is filled with data pages whenever SQL Server has the opportunity to do so (and when memory runs out on the system, SQL Server will release memory from its buffer pool). When one or more records are SELECTed, the data pages containing these records are either already present in the buffer pool, or read from disk. When data pages are modified (as with UPDATE, DELETE and INSERT), SQL Server carries these operations out on the pages in memory (and also writes the modification to the disk-bound transaction log). When the data page modification is completed, the data page stays in memory and is marked dirty (because the page in memory and the corresponding page in the data file are different).

There are also times when the buffer pool needs to be cleaned up, either because it's full and more needs to be stuffed in (like my garage) or because you want to clean it out (no real-world example applies to me here).

-- Lazy Writer and Checkpoint --

In the first case, where more data pages need to be stuffed in, the lazy writer gathers (some of) the dirty pages and has them written to disk. That way the memory that was occupied by those dirty pages can be reused. In the second case, where you want to clean the dirty pages, you issue a checkpoint, thereby ordering SQL Server to flush the dirty pages to the data file(s) on disk.

This way, issuing a checkpoint has the positive side effect of making memory in the buffer pool available for reuse. Side note: neither the lazy writer nor checkpoint frees data pages; they just mark them safe for reuse... but wait a minute, cleaning out the buffer pool (or the garage), isn't that a must-do??? To be honest, only when it's full!

So when does a checkpoint occur? When a database is closed (like when taking the database offline, or stopping the service) or when a backup is made, a checkpoint is issued as part of the operation. Besides those events, there is also the option to issue a checkpoint programmatically or interactively; finally, SQL Server issues checkpoints automatically, based on the recovery interval.
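The programmatic/interactive option is simply the CHECKPOINT statement, executed in the context of the database you want flushed. A minimal sketch (database name as used elsewhere in this post):

```sql
USE LogCrash_Demo;

-- Flush all dirty pages of the current database to the data file(s):
CHECKPOINT;
```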

-- Recovery --

There is a more compelling reason (compared to buffer pool maintenance) to issue a checkpoint every now and then. It is recovery!

And why is recovery important? When a database starts up, recovery is one of the processes. During recovery, SQL Server looks up the last known checkpoint in the transaction log and then starts redoing the committed transactions from the transaction log. If the database was closed cleanly the last time, the checkpoint is the last thing in the transaction log, so recovery costs next to nothing. But if the database wasn't closed nicely (like when the power supply unit fails), it may have been hours (or longer) since the last checkpoint. SQL Server then starts redoing from the transaction log; this may take some time, but at least you don't lose committed transactions.

BUT DIDN'T YOU .... ?

YES, I DID! That is, pulled out the memory stick holding the transaction log, to simulate a failure of the disk. And that's where it hurts, as data pages only get flushed to disk if;

  • Both the buffer pool and system memory are full (or memory is capped with max server memory). Based on a few inserts and updates... don't think so.
  • The database is closed or backed up. Not the case when you pull the stick.
  • A manual CHECKPOINT is issued. Didn't do that.
  • Recovery interval kicks in. Waited just for that to happen...

However, a recovery interval of 0 does not mean SQL Server will issue a checkpoint automatically within a minute. It means checkpoints occur when the database engine estimates that processing the accumulated log records during recovery would take longer than the specified interval! In other words, it may take very long between checkpoints if a system is running very few transactions... as I noticed during my tests.
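The recovery interval can be inspected and changed through sp_configure; it is an advanced option, so 'show advanced options' has to be enabled first. A sketch (the value is in minutes; 0 is the self-configuring default):

```sql
-- 'recovery interval' is an advanced option:
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Target at most 5 minutes of recovery work between automatic checkpoints:
EXEC sp_configure 'recovery interval', 5;
RECONFIGURE;
```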

Friday, 13 April 2007 14:41:49 (W. Europe Daylight Time, UTC+02:00)
# Thursday, 12 April 2007

This morning, as I wanted to start working, I noticed my server wasn't working... Outlook couldn't connect to Exchange and the ReSQueL website was down as well. The server still had power, but no response (not even the keyboard LED for NumLock). So a power-cycle was next, well, at least half of it: power down worked, power up didn't.

Absolutely nothing happened! Now I must admit, I've been there before with that type of system (not just my own). So I pulled the power cord and left the system for about 10 minutes, then tried again. This time I heard some ticks coming out of the power supply unit. And believe me, ticks from the PSU is not a good sign about the health of the thing.

However, I did have another unused ATX cabinet with PSU, so I tried transferring the internals of the old system to the replacement cabinet. Unfortunately, the TP123 motherboard has an extra power connector for the CPU core voltage. So putting power on the board worked, but (as expected) it didn't boot.

Not wanting to extend the downtime much further, I grabbed the car keys, drove to Office Centre (cash and carry beats a webshop when you need something NOW!!!) and bought the cheapest config with 1 GB internal memory. Back home, I tested if the system worked. It did. Next I added the old disk and an extra network card and booted the system from the old disk... BSODed and rebooted right away. But since the config is totally different, that shouldn't be too surprising. So I fetched the Windows Small Business Server 2003 R2 DVD and started the repair installation. Finally some time to sit down and write some... if all is well, this is on the blog (and the blog available) a little over 6 hours after I noticed the server down... always fun when you don't have the drivers for your new system (only the Vista drivers supplied)... NOT!

Hardware | ReSQueL | SBS
Thursday, 12 April 2007 14:18:32 (W. Europe Daylight Time, UTC+02:00)
# Monday, 09 April 2007

"Let's check and see..." and we came back with no license info in SQL Server. Yet the course certainly gives an overview of the various licensing options, and so does the Microsoft website. However, when installing SQL Server, you're not asked about licensing. Also, the queries SELECT SERVERPROPERTY('LicenseType') and SELECT SERVERPROPERTY('NumLicenses') both return DISABLED.

By design, the product SQL Server 2005 does nothing in terms of enforcing or even storing license information. Eric Burgess mentions this on the Microsoft SQL Server Support Blog, including how to store the license information in the Windows Registry, as was done with SQL Server 2000.

Okay, that covers the technical detail, but what remains puzzling is Microsoft's current strong/aggressive campaigning on licensing versus the absence of any licensing instruments in SQL Server.

Monday, 09 April 2007 13:57:52 (W. Europe Daylight Time, UTC+02:00)
# Saturday, 07 April 2007

Module 3 of the course 2780 is about backup and restore. When you combine this module with the database best practice of separating the data files and transaction log files on different disks, you can even demonstrate how to recover from disk failure using the right backup and restore strategy without losing data... that is, recover if the failing disk contains the data files!!!

But what if the failing disk contains the transaction log files? Backup and restore are not going to help you recover up to the moment of disaster. Sure, you can restore the database from the last backup, but if the last backup was on Sunday and the incident occurs Friday afternoon, about a week's work is lost!

Yet we know all the data is still sitting in the data file(s); the only sad part is, we can't bring the database up without the log file. So the question is how to give the database its log file(s) back. I didn't know the answer off the top of my head, in which case I usually say "Let's see what happens if we stop the service and throw away a logfile". The results were inconclusive, and since (lack of) time is a major issue on the 2780, the subject of lost transaction log files landed on the todo-list.

/* BEGIN UPDATE 1 */

HALT, don't do anything yet if you have this problem in a production environment. Just read this post and the post Recovery continued, as there is no need to detach the failed database if your instance is still up and running.

/* END UPDATE 1 */

Now for the answer, which gets pretty complicated, depending on the state of your database. IF your database shutdown was clean:

Detach the damaged database. Then attach the data file(s) again, but remove the missing log file from the list (see picture). This way, SQL Server recreates the log file.
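In T-SQL, the detach and attach look roughly like this (the file path matches the demo later in this post; whether the attach succeeds depends on the database having been shut down cleanly):

```sql
-- Detach the damaged database:
EXEC sp_detach_db 'LogCrash_Demo';

-- Attach the data file only; SQL Server rebuilds the missing log:
CREATE DATABASE LogCrash_Demo
ON (FILENAME = 'D:\SQL_DATA\LogCrash_Demo.mdf')
FOR ATTACH_REBUILD_LOG;
```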

Depending on the state of the database, SQL Server may even perform this operation at database startup. So you won't even notice it if you don't visit the log files (Windows Application EventLog or SQL Server Activity log). This explains why results were inconclusive during class. It is easily tested (not on production databases, naturally) by stopping the database service, throwing away the log file of a user database and starting the database service again. Either you'll see the recreation of the transaction log in the Event log and Activity log, or you have to perform the detach and attach yourself.

You may get error messages both when detaching and attaching. However, these are warnings; the operations do complete successfully.

That was the easy part, but surely you've seen the big, fat IF! Databases don't shut down clean if the log file crashes. To get a picture of what happens when a log file is lost due to a disk crash, I created a database with the data file on disk (D:\SQL_DATA\LogCrash_Demo.mdf) and the transaction log on a memory stick (G:\SQL_DATA\LogCrash_Demo_Log.ldf). Next I added a table and started inserting new records. And... pulled the stick out!

Msg 9001, Level 21, State 4, Line 1
The log for database 'LogCrash_Demo' is not available. Check the event log for related error messages. Resolve any errors and restart the database.
Msg 0, Level 20, State 0, Line 0
A severe error occurred on the current command. The results, if any, should be discarded.

After detaching the database and trying to attach (CREATE DATABASE ... FOR ATTACH_REBUILD_LOG) it again, the following error comes up;

File activation failure. The physical file name "G:\SQL_DATA\LogCrash_Demo_log.ldf" may be incorrect.
The log cannot be rebuilt because the database was not cleanly shut down.
Msg 1813, Level 16, State 2, Line 1
Could not open new database 'LogCrash_Demo'. CREATE DATABASE is aborted.

/* BEGIN UPDATE 2 */

I was going to look into what DBCC REBUILD_LOG could have done in this scenario and how that could be applied to SQL Server 2005... but as of SQL Server 2005, DBCC REBUILD_LOG is no more, and since DBCC REBUILD_LOG is an undocumented command, its succession isn't documented either (if you care for an overview of DBCC changes, including "undocumented" stuff, look at Norb Technologies).

But just forget about DBCC REBUILD_LOG; Paul Randal is pretty clear about it (and all of a sudden it's gone in SQL Server 2005). Besides, the promised later post, on how to get to the data in the data file, is available too.

/* END UPDATE 2 */

UPDATED 2007-04-13

Saturday, 07 April 2007 14:43:15 (W. Europe Daylight Time, UTC+02:00)
# Wednesday, 04 April 2007

Just read on Mary Jo's Microsoft wants Open XML to be an ISO standard... They have my support.

Sign the petition
Wednesday, 04 April 2007 20:11:07 (W. Europe Daylight Time, UTC+02:00)

Recently I finished teaching a couple of courses, to be more specific the Microsoft courses 2779 and 2780. This resulted in a little todo-list of postings for the blog, either because time was lacking or because I needed to look up some details to properly answer a question.

In the next couple of days, I will post answers (if I can answer the question) to the following questions;

As the answers start appearing on the blog, I will update this post, linking to them. Another answer is also relevant in this context: the things that changed when doing full server recoveries.

Wednesday, 04 April 2007 19:35:02 (W. Europe Daylight Time, UTC+02:00)

Lately I was confronted with a couple of text files that wouldn't load into the database (SQL Server 2005). I was getting the errors

Msg 4832, Level 16, State 1, Line 1
Bulk load: An unexpected end of file was encountered in the data file.
Msg 7399, Level 16, State 1, Line 1
The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 1
Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".

when trying to acquire the file through

BULK INSERT tblImport
FROM 'Z:\concat-file.txt'
WITH(TABLOCK, FORMATFILE = 'Z:\xmlformatfile.xml')

It turned out I had an arrow sitting at EOF (end of file) in the troublesome text files, which were concatenated using the copy command;

COPY source*.txt concat-file.txt

The arrow representation was new to me, but after a little digging it boiled down to CTRL+Z and the difference between ASCII and binary copying... by default, copy is ASCII, and when merging files an end-of-file character gets inserted. The EOF character is omitted when doing a binary copy, so

COPY source*.txt /b concat-file.txt

resolved the error.

Curious whether the arrow representation was specific to Vista, I decided to verify with Windows Server 2003. Here the EOF story is (almost) the same, only the arrow is shown as a rectangle. Read more about copy here.

Wednesday, 04 April 2007 16:13:08 (W. Europe Daylight Time, UTC+02:00)
# Saturday, 24 March 2007

Does your backup and recovery strategy take the new SQL Server 2005 mssqlsystemresource database into account? What's different between SQL Server 2000 and 2005 from a backup/restore perspective?

----

Compared to SQL Server 2000, SQL Server 2005 has a lot of nice features aimed at minimizing downtime during "unplanned events", most notably piecemeal restore and the inclusion of Full-Text indexes in the database backup and restore. However, before being able to do database restores, a functional database server is needed. Besides being a SQL Server instance, the database server is also at least a Windows operating system and hardware (or a virtual machine, if you like). The latter two usually are the domain of the system administrator, who will also take care of backing up the operating system, including installed applications. Likewise, the DBA will take care of the databases.

A typical system backup will exclude certain files and directories known to be always locked, like pagefile.sys and the SQL Server \Data folder. The other SQL Server folders and registry settings (system state) are backed up. By restoring this backup for SQL Server 2000, you get a database server at the Service Pack level of the time of the backup. The only thing that's missing for the database server to start is the master database. In SQL Server 2005, the master database alone isn't enough to get you started; you need mssqlsystemresource, a.k.a. the Resource Database, as well.

So what has changed to make this an important issue in the last step of getting a SQL Server instance running (quickly)? The master database in SQL Server 2000 holds all information: the configuration related to the instance (like logins, linked servers, databases) and the definition of system objects. When applying a service pack, maintenance is typically done on the system objects, so the master database changes and is therefore Service Pack dependent. In SQL Server 2005 these two types of information are split: instance configuration still goes into the master database, but the system objects are held in the Resource Database. The advantage is that master no longer depends on the Service Pack, and since none but Microsoft should alter system objects, the Resource Database can remain read-only and will only be changed by Microsoft when applying a Service Pack.
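As a quick check, SQL Server 2005 exposes the version of the Resource Database through SERVERPROPERTY, which shows the split between instance configuration and system objects in action:

```sql
-- Version and last update of the Resource Database (SQL Server 2005 and later):
SELECT SERVERPROPERTY('ResourceVersion')            AS ResourceVersion,
       SERVERPROPERTY('ResourceLastUpdateDateTime') AS ResourceLastUpdate;
```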

When disaster strikes the (original) server, in many cases a restore of the file system and system state is performed on a more or less comparable machine configuration. This is the quick route, compared to installing Windows and a new instance and applying the Service Pack. The quick route has two extra advantages: no access to installation media is needed, and other applications are restored too. But the quick restore route does not bring back a fully functional server; both sysadmin and DBA are aware of that. Therefore, for this procedure to be successful, an offline backup of master (copy master.mdf and mastlog.ldf) has to be made after installation and after applying a Service Pack. The last step is then to copy master.mdf and mastlog.ldf to the SQL Server \Data folder. From there the DBA can start his/her database restore sequence (starting with the latest online backup of master (2000/2005)). This approach works on SQL Server 2000.
For SQL Server 2005, you have to include the Resource Database in this strategy. And here is a little (positive) difference: because the Resource Database is accessed read-only, it only has read locks and thus can be copied while the instance is running. So you can either copy it after the installation or when the Service Pack is applied, or you can alter the file system backup to specifically include the files mssqlsystemresource.mdf and mssqlsystemresource.ldf.

----

What if disaster struck your SQL Server 2005 instance and you haven't cared for the Resource Database... well then you're off to the slow route of installing the instance after the restore and applying the Service Pack, before the database restore sequence can be started.

Saturday, 24 March 2007 14:29:51 (W. Europe Standard Time, UTC+01:00)
# Monday, 19 March 2007

Recently a student who was going through the Microsoft eLearning course 2778 asked for an explanation of "accent sensitive". Before going into why it matters to SQL Server, let's look into the role of accents. A lot of languages use accents to change the pronunciation of some letters, like the accents in the German language, which are called Umlaut; when translating the word Umlaut very directly, you'd get "alternate tone".

Back to SQL Server and why it should be sensitive to accents, or for that matter case, kana or width (things we'll get to once we touch COLLATION). If we ask a question to our server, frequently we want the answer to be in order. And this order very much depends on data type.

Let's take a group of downhill skiing contestants. In preparation for a match, some overview is needed. The contestants have to be grouped by age, so there is a need for sorting on birthdate (a datetime data type). During the match there are results, so there is also a need to order by the time taken to get from start to finish (the number of seconds as a decimal, in ascending order), so the winner is in first place. And probably before the match, there is the need for a list of all contestants ordered by last name.

Where a programmer (VB, C#, etc) would see this lastname as a string, SQL Server actually has a lot of possible data types to represent and handle textual information.

| | Single Byte | Double Byte |
| --- | --- | --- |
| Fixed length | char(x) | nchar(x) |
| Variable length | varchar(x) | nvarchar(x) |
| CLOB (Character Large OBject) | varchar(max) | nvarchar(max) |
| Back-Compat CLOB | text | ntext |

Though these data types can all store and handle textual data, the collation has to be specified as well.

Collation

Collation is best thought of as the order of characters in a given character-set. In this sense, an alphabet is a character-set, but collation applies to other writing systems too. In SQL Server, the collation includes the character-set and the options;

  • binary
  • binary-code point
  • Case sensitive
  • Accent sensitive
  • Kana sensitive
  • Width sensitive

which are all explained at MSDN. These options are also referred to as sort order.

Collation exists at the levels of instance, database and column. If a collation is not defined when creating a column, it is inherited from the database. Likewise, if no collation is specified when creating the database, it inherits the collation from the instance. The collation of the instance is defined when it is installed. When defining the collation, the designer (see picture) can be used.
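This inheritance can be overridden at each level with the COLLATE keyword. A sketch with illustrative names:

```sql
-- Database-level collation overrides the instance default:
CREATE DATABASE CollationDemo COLLATE Latin1_General_CS_AS;
GO

USE CollationDemo;

-- Column-level collation overrides the database default:
CREATE TABLE dbo.Contestants (
    LastName  nvarchar(50) COLLATE German_PhoneBook_BIN2,
    FirstName nvarchar(50)   -- inherits Latin1_General_CS_AS from the database
);
```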

Note there is a choice for SQL Collations and Windows Collation. In SQL Server 2005, go for the Windows Collation (SQL Collation is mainly there for compatibility reasons).

Sorting and storing

The collation is used by SQL Server both when sorting data (SELECT ... FROM ... ORDER BY ...) and when storing textual data as part of an index (both clustered and non-clustered). It can happen that you need data stored in one collation to be evaluated in another collation, like when comparing data from two different collations. However, two different collations cannot be compared directly, so the collations have to be made the same. This is where the COLLATE clause comes into play, allowing you to specify a collation in the query.

Example

To show some differences in sort order and collation, I've added a script for a demo database with two tables, both containing some character combinations with variations in case and accents. One table is stored in the Latin1_General_CS_AS collation, the other one in German_PhoneBook_BIN2. Both contain the same data;

| RowNum (int) | TextData (nchar(10)) |
| --- | --- |
| 1 | ABC |
| 2 | ÄBC |
| 3 | abc |
| 4 | äbc |
| 5 | ABc |
| 6 | ÁBC |
| 7 | abC |
| 8 | ábc |
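The script behind this demo boils down to something like the following sketch (table names as used in the queries below; the full script also creates the demo database and the remaining rows):

```sql
CREATE TABLE dbo.Latin1_CS_AS (
    RowNum   int,
    TextData nchar(10) COLLATE Latin1_General_CS_AS
);

CREATE TABLE dbo.GerPB_BIN2 (
    RowNum   int,
    TextData nchar(10) COLLATE German_PhoneBook_BIN2
);

INSERT INTO dbo.Latin1_CS_AS VALUES (1, N'ABC');
INSERT INTO dbo.Latin1_CS_AS VALUES (2, N'ÄBC');
-- ... and so on through row 8 (N'ábc'), then the same rows into dbo.GerPB_BIN2.
```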

When selecting from the Latin1_General_CS_AS collated table, ordered by TextData, you receive the order 3, 7, 5, 1, 8, 6, 4, 2. Whereas with the German_PhoneBook_BIN2 collation, the order is 1, 5, 7, 3, 6, 2, 8, 4. Now with the COLLATE clause, you can alter the collation in a query. So

SELECT RowNum, TextData FROM GerPB_BIN2 ORDER BY TextData COLLATE Latin1_General_CS_AS

though selected from the table in German_PhoneBook_BIN2 collation, is returned in the order of the Latin1_General_CS_AS collation. When comparing columns from two different collations, like you do in a join, SQL Server will raise an error.

SELECT * FROM GerPB_BIN2 G INNER JOIN Latin1_CS_AS L ON G.TextData = L.TextData

Msg 468, Level 16, State 9, Line 1
Cannot resolve the collation conflict between "Latin1_General_CS_AS" and "German_PhoneBook_BIN2" in the equal to operation.

However, by using the COLLATE clause with one of the compared columns, they can be joined.

SELECT * FROM GerPB_BIN2 G INNER JOIN Latin1_CS_AS L ON G.TextData COLLATE Latin1_General_CS_AS = L.TextData

Guess this is the long answer, but not the last time I have to give it...

Monday, 19 March 2007 11:40:48 (W. Europe Standard Time, UTC+01:00)