# Friday, 13 April 2007

When does SQL Server, under its default settings, write dirty pages to disk?

The answer depends on your recovery model, but let's assume Full Recovery. In that case, either when the lazy writer kicks in or when a checkpoint is issued.

-- (Relevant) Memory usage by SQL Server --

To simplify, SQL Server needs datapages in memory to work with them. To do so, SQL Server uses available memory from the system for a buffer pool. This buffer pool is filled with datapages, while SQL Server has the opportunity to do so (and when memory runs out on the system, SQL Server will release memory from its buffer pool). When one or more records are SELECTed from the datapage, the datapages containing these records are either already present in the buffer pool, or read from disk. When the datapages are modified (like in UPDATE, DELETE and INSERT), SQL Server carries these operations out on the pages in memory (and also writes the modification to the disk-bound transaction log). When the data page modification is completed, the data page stays in memory and is marked dirty (because the page in memory and the corresponding page in the datafile are different).

There are also times when the buffer pool needs to be cleaned up, either because it's full and more needs to be stuffed in (like my garage) or because you want to clean it out (no real-world example applies to me here).

-- Lazy Writer and Checkpoint --

In the first case, where more data pages need to be stuffed in, the lazy writer gathers (some of) the dirty pages and has them written to disk. That way the memory that was occupied by those dirty pages can be reused. In the second case, where you want to clean the dirty pages, you issue a checkpoint, thereby ordering SQL Server to flush the dirty pages to the data file(s) on disk.

This way issueing a checkpoint, has the positive side-effect of making memory in the buffer pool available for reuse. Side note; neither the lazy writer, nor checkpoint, free data pages. They just mark them safe for reuse.... but wait a minute, cleaning out the buffer pool (or the garage), isn't that a must do??? To be honnest, only when it's full!

So when does a checkpoint occur? When a database is closed (like when taking the database offline, stopping the service) or when a backup is made, as part of the opperation a checkpoint is issued. Besides those events, there is also the option to programatically/interactively issue a checkpoint, finally SQL Server issues checkpoints automatically, based on the recovery interval.

-- Recovery --

There is a more compelling reason (compared to buffer pool maintenance) to issue a checkpoint every now and then. It is recovery!

And why is recovery important? When a database starts up, recovery is one of the processes. During recovery, SQL Server looks up the last known checkpoint in the transaction log and then starts redoing the committed transaction from the transaction log. If the database was closed clean the last time, the checkpoint is the last thing in the transaction log, so recovery costs next to nothing. But if the database wasn't closed nicely (like when the power supply unit fails), then it may be hours (or longer) since last checkpoint. SQL Server then starts redoing from the transaction log, this may take some time, but at least you don't lose commited transaction.

BUT DIDN'T YOU .... ?

YES, I DID! That is, pull the memory stick with the transaction log out, to simulate a failure of the disk. And that's where it hurts as data pages only get flushed to disk if;

  • Both the buffer pool and system memory are full (or memory is capped with max server memory). Based on a few inserts and updates... don't think so.
  • The database is closed or backed up. Not the case when you pull the stick.
  • A manual CHECKPOINT is issued. Didn't do that.
  • Recovery interval kicks in. Waited just for that to happen...

However, the recovery interval of 0 does not mean SQL Server will issue a checkpoint automatically within a minute. It says, checkpoints do occur when the database engine estimates it can process the number of log records within the interval specified during the recovery process! In other words, it may take very long between checkpoints if a system is running very few transactions... like I noticed during my tests.

Technorati tags: , ,