Sunday, June 28, 2009

tablediff: howto compare all database tables

tablediff.exe is a command-line utility that you can use to compare tables from different databases / servers.
It is located in the ...\Program Files\Microsoft SQL Server\90\COM folder.

It is a handy tool that can compare tables at an amazingly high speed and create sync scripts along the way.
The downside is that it doesn't accept wildcards (like: compare all tables of a database);
you have to pass all object info to the tool on the command line.

The purpose of this T-SQL script is to build a full tablediff.exe command line for every object in the specified databases and print it in the messages window. You can then copy/paste it into a .bat file and execute it from a cmd.exe window.

Log files and sync scripts are placed in the C:\ folder with 'tablediff' + table name as the file name and .sql or .txt as the extension.

Tips:
Primary keys must be defined on all tables for the tool to work properly.
Syncing of blob data types (like varchar(max) or varbinary(max)) is -unfortunately- not supported;
you can use the bcp utility or a linked server to accomplish this (see the bcp sketch below).
The script presented here uses integrated security, so you have to log on with the right privileges
or use the SQL Server service account for the script to work.
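For the blob workaround, a minimal bcp sketch (database, table, file and server names are placeholders; -n = native format, -T = integrated security):

bcp CodeCatalog.dbo.Documents out C:\Documents.dat -n -T -S Server1
bcp CodeCatalog.dbo.Documents in C:\Documents.dat -n -T -S Server2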


USE your_db_here -- for selection of the objects from INFORMATION_SCHEMA.TABLES
go

DECLARE @compareschema NVARCHAR(100), @comparetable NVARCHAR(400)
DECLARE @sourceserver NVARCHAR(100), @destinationserver NVARCHAR(100)
DECLARE @sourcedatabase NVARCHAR(100), @destinationdatabase NVARCHAR(100)
DECLARE @tablediffpath NVARCHAR(512)
DECLARE @command NVARCHAR(4000)

SET @sourceserver = 'Server1'
SET @sourcedatabase = 'CodeCatalog' -- or db_name()
SET @destinationserver = 'Server2'
SET @destinationdatabase = 'CodeCatalog' -- or db_name(), if same name

-- set the right path here....
SET @tablediffpath= '"C:\Program Files\Microsoft SQL Server\90\COM\tablediff.exe"'

DECLARE tablenames_cursor CURSOR FAST_FORWARD LOCAL FOR
SELECT TABLE_SCHEMA,TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE = 'BASE TABLE'
ORDER BY TABLE_NAME

OPEN tablenames_cursor

FETCH NEXT FROM tablenames_cursor INTO @compareschema, @comparetable
WHILE (@@FETCH_STATUS <> -1)
BEGIN

SET @command = @tablediffpath + ' -sourceserver ' + @sourceserver + ' -sourcedatabase ' + @sourcedatabase + ' -sourceschema ' + @compareschema + ' -sourcetable ' +
@comparetable + ' -destinationserver ' + @destinationserver + ' -destinationdatabase ' + @destinationdatabase + ' -destinationschema ' + @compareschema +
' -destinationtable ' + @comparetable + ' -f C:\tablediff-' + @comparetable + '.sql' + ' -o C:\tablediff-' + @comparetable + '.txt'

PRINT @command -- output to messages window/tab

FETCH NEXT FROM tablenames_cursor INTO @compareschema, @comparetable
END

CLOSE tablenames_cursor
DEALLOCATE tablenames_cursor
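The printed output will look something like this, one line per table (using the sample values set above; 'Orders' is a hypothetical table name):

"C:\Program Files\Microsoft SQL Server\90\COM\tablediff.exe" -sourceserver Server1 -sourcedatabase CodeCatalog -sourceschema dbo -sourcetable Orders -destinationserver Server2 -destinationdatabase CodeCatalog -destinationschema dbo -destinationtable Orders -f C:\tablediff-Orders.sql -o C:\tablediff-Orders.txt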






Howto compare all database tables using T-SQL


By looping through the INFORMATION_SCHEMA.TABLES collection and constructing the T-SQL
along the way, you have a simple and easy method for comparing databases.

USE your_source_db_for_comparison_here -- (to select the right tables from INFORMATION_SCHEMA.TABLES)
GO

DECLARE @compareschema NVARCHAR(100)
DECLARE @comparetable1 NVARCHAR(400), @comparetable2 NVARCHAR(400)
DECLARE @dbtocompare NVARCHAR(200), @command NVARCHAR(4000)

SET @dbtocompare ='dbtocompare' -- db to compare
-- or set @dbtocompare = 'LINKEDSQLSERVER.dbtocompare' -- or db plus linked server if required

DECLARE tablenames_cursor CURSOR FAST_FORWARD LOCAL FOR
SELECT TABLE_SCHEMA,TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE = 'BASE TABLE'
ORDER BY TABLE_NAME

OPEN tablenames_cursor

FETCH NEXT FROM tablenames_cursor INTO @compareschema, @comparetable1
WHILE (@@FETCH_STATUS <> -1)
BEGIN
SET @comparetable1= @compareschema + '.' + @comparetable1
SET @comparetable2 = @dbtocompare + '.' + @comparetable1

PRINT @comparetable1 -- print tables in the messages window/tab
PRINT @comparetable2

SET @command = 'select ''' + @comparetable1 + ''' CompTable, * from
(select * from ' + @comparetable1 + '
except
select * from ' + @comparetable2 + ') as CompTable
union all
select ''' + @comparetable2 + ''' CompTable, * from
(select * from ' + @comparetable2 + '
except
select * from ' + @comparetable1 + ') as CompTable'

EXEC sp_executesql @command

FETCH NEXT FROM tablenames_cursor INTO @compareschema, @comparetable1
END

CLOSE tablenames_cursor
DEALLOCATE tablenames_cursor

Because of the EXCEPT, this construction works on SQL Server 2005 and higher only, and you can't use the -older- text, ntext and image data types in comparisons.
You have to convert them to the newer varchar(max), nvarchar(max), and varbinary(max) data types first; a conversion sketch follows below. (See my post on SQL Server 2000 post upgrade steps.)
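A minimal conversion sketch, assuming a hypothetical table dbo.Notes with a legacy text column named Remarks:

ALTER TABLE dbo.Notes ALTER COLUMN Remarks varchar(max) NULL
-- likewise: ntext becomes nvarchar(max), image becomes varbinary(max)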

This is the error you get when you do use the older data types:
Msg 421, Level 16, State 1, Line 1
The text data type cannot be selected as DISTINCT because it is not comparable.
Msg 402, Level 16, State 1, Line 1
The data types text and text are incompatible in the is operator.




Saturday, April 04, 2009

How to use SQL Server T-SQL cursors


The general advice within the SQL Server community is: do not use T-SQL cursors, at any time.
I must say that I have to agree, although if you have browsed my blog you've noticed that there are a couple of occasions where I do use cursors, but only when all of the following are true:

- All cursors are read-only / forward-only for optimum speed.
- They're not used in OLTP environments.
- Data sources are small sets, usually system objects, like a list of databases / tables.
- The impact of the cursor is many times smaller than that of the processing steps taken with the row results

(like a cursor based on sys.databases where for each row a full database backup is made).

Coding guidelines:
- Avoid using cursors and try to solve the problem using set-based operations.
- Try to base them on read-only tables or local (#temptable) temporary objects.
- Clean up! Always CLOSE and DEALLOCATE cursors.
- Specify the 'LOCAL' option.
- Specify the FAST_FORWARD clause on DECLARE CURSOR. This opens an optimized forward-only, read-only cursor.
- Never use cursors in OLTP environments.
- Never use them as the source of an update: UPDATE table SET col1 = 2 WHERE CURRENT OF the_cursor;
- Cursors are memory intensive; base them on smaller sets (fewer than 5,000-10,000 rows; your situation could differ: test!).

template:

    DECLARE @col1 INT, @col2 INT, @col3 INT -- match the types of your columns

    DECLARE the_cursor CURSOR FAST_FORWARD LOCAL FOR
        SELECT col1,col2,col3
        FROM dbo.mytable

    OPEN the_cursor

    FETCH NEXT FROM the_cursor INTO @col1,@col2,@col3
    WHILE (@@FETCH_STATUS <> -1)
    BEGIN
        -- process the fetched row here

        FETCH NEXT FROM the_cursor INTO @col1,@col2,@col3
    END

    CLOSE the_cursor
    DEALLOCATE the_cursor
 


TIP:
If you do need cursors in high-load OLTP environments because of some complex calculation that can't be done set-based, take the following approach:

Copy the required result set into a temporary object. Retrieve only the rows and columns you need, but do include all the fields of the primary key.

CREATE TABLE #temptable (CalcValue int, pkCol int)

INSERT INTO #temptable (CalcValue, pkCol)

SELECT 0, PrimaryKeyCol
FROM dbo.HighLoadOLTPtable
WHERE -- your where clause here

Base your cursor on the temp object.

Loop the cursor, perform your calculation and store the result in the temp object row (a sketch of the full loop follows below):

UPDATE #temptable SET CalcValue=complex_calculated_value
WHERE pkCol=pk_fields_as_fetched_by_cursor
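Putting those pieces together, a minimal sketch of the loop; dbo.fn_ComplexCalculation is a hypothetical placeholder for your own logic:

DECLARE @pkCol INT, @CalcValue INT

DECLARE calc_cursor CURSOR FAST_FORWARD LOCAL FOR
    SELECT pkCol FROM #temptable

OPEN calc_cursor
FETCH NEXT FROM calc_cursor INTO @pkCol
WHILE (@@FETCH_STATUS <> -1)
BEGIN
    SET @CalcValue = dbo.fn_ComplexCalculation(@pkCol) -- your calculation here

    UPDATE #temptable SET CalcValue = @CalcValue
    WHERE pkCol = @pkCol

    FETCH NEXT FROM calc_cursor INTO @pkCol
END

CLOSE calc_cursor
DEALLOCATE calc_cursor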

When done looping, close/deallocate the cursor and update the high-load OLTP source table(s) set-based by primary key; use a BEGIN TRANSACTION / COMMIT if required:

UPDATE dbo.HighLoadOLTPtable SET CalculatedValue = #temptable.CalcValue
FROM dbo.HighLoadOLTPtable, #temptable
WHERE dbo.HighLoadOLTPtable.PrimaryKeyCol = #temptable.pkCol


see my posts on
TempDB configuration and TempDB objects




Wednesday, April 01, 2009

Working with SQL Server TempDB objects


Try to avoid them in the first place by using derived tables or CTEs.

Table variables are easy and convenient, because you don't have to clean them up.
But only use them with small sets (fewer than 1,000 rows), as in the sketch below.
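A minimal table variable sketch; it goes out of scope automatically when the batch ends:

DECLARE @smallset TABLE (col1 int, col2 int)

INSERT INTO @smallset (col1, col2)
SELECT Col1, Col2 FROM dbo.table1
WHERE Col1 < 1000 -- your filter here; keep the set small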

Do not use: SELECT col1 INTO #temptable FROM dbo.table1

But do use a CREATE TABLE statement and an INSERT/SELECT:
CREATE TABLE #temptable (col1 int, col2 int)

INSERT INTO #temptable (col1,col2) SELECT Col1, Col2 FROM dbo.table1

A bit more work, but less blocking in environments with a higher load.

With larger sets you could consider using indexes; check with the query plan and STATISTICS IO to see if there is any benefit in using them.
Create these indexes after the table is filled with the set:

INSERT INTO #temptable (col1,col2) SELECT Col1,Col2 FROM dbo.table1
CREATE CLUSTERED INDEX CIX_#temptable ON #temptable (col1)

CREATE INDEX IX_#temptable_1 ON #temptable (Col2)

Create only the indexes you absolutely need.

Configure TempDB as suggested in my post configure TempDB

Can your TempDB database disks handle the current load? See my post on fn_virtualfilestats

Prefer locally (#temptable) over globally (##temptable) scoped definitions.

Store results of linked servers in TempDB objects instead of accessing the remote datasource multiple times. See my post on linked servers

Be a good citizen: at the end of your stored procedure or T-SQL code, always clean up after yourself and drop all temporary objects you have created.


    -- drop temp tables if they exist
    IF OBJECT_ID('tempdb..#temptable') IS NOT NULL
        DROP TABLE #temptable
 




Monday, March 30, 2009

How to configure AWE memory


AWE according to BOL:
SQL Server 2005 Enterprise Edition supports Address Windowing Extensions (AWE) allowing use of physical memory over 4 gigabytes (GB) on 32-bit versions of Microsoft Windows operating systems. Up to 64 GB of physical memory is supported.

My advice: Don't use AWE!

As mentioned above, it is only supported by the far more expensive SQL Server Enterprise Edition (32-bit of course), and it is only used by the core database engine, not by the other components.
SSIS -and others- won't benefit and are still limited to 32-bit (= 2GB max).
AWE also has a performance overhead, and it's always a hassle with the boot.ini switches:

If your system has <= 4 GB - use /3GB (none of the other options)
If your system has > 4 GB and < 16 GB - use /3GB and /PAE and configure AWE in SSMS
If your system has > 16 GB - use /PAE and configure AWE in SSMS

Go for SQL Server Standard Edition (which is more than enough for most shops) and go for a 64-bit environment with enough memory (start with 8GB) for SQL Server to use.
Far cheaper and far more scalable.
You can serve more databases on the same instance, leading to substantial cost savings.

See my post on
Tools for performance and How many instances on a SQL Server




Saturday, March 28, 2009

Always run SQLIO on new servers!


When your vendor brings a new type of server to the market, always check if the hardware is as good as they claim it is.

A while ago I did a test run and compared the IO results of a Dell server and an -almost- equally configured IBM server. Both were small configs with six internal drives. The tool I used was SQLIO.


SQLIO is a small .exe file (250KB) that mimics the SQL Server database engine. You don't have to install SQL Server for the tool to work; the Windows OS alone is enough, so you can also check the throughput of your SAN filesystem.
You can download SQLIO and some documentation here

The result? Guess what? The Dell server scored 10 (ten!) times better on IO performance than the more expensive IBM...

Never take your vendor's word for it, always test yourself. Because who gets the blame in the end?

The test file was 25MB, but results were roughly the same for bigger test files.

Results for: sqlio -kR -s360 -frandom -o8 -b64 -LS -Fparam.txt
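For reference, param.txt describes the test file(s); each line lists the test file path, number of threads, affinity mask and file size in MB. A minimal sketch matching the 25MB test file (the path and thread count are just example values):

c:\testfile.dat 4 0x0 25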

IBM server local drives:

throughput metrics:
IOs/sec: 1177.91
MBs/sec: 73.61
latency metrics:
Min_Latency(ms): 0
Avg_Latency(ms): 13
Max_Latency(ms): 919
histogram:
ms: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%: 16 11 4 5 8 8 6 5 4 3 3 2 2 2 1 1 1 1 1 1 1 1 1 1 13

Dell server local drives:

throughput metrics:
IOs/sec: 11,292.99
MBs/sec: 705.81
latency metrics:
Min_Latency(ms): 0
Avg_Latency(ms): 1
Max_Latency(ms): 394
histogram:
ms: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%: 0 96 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

See my post on disk configuration tips.




Friday, March 27, 2009

How many instances on a SQL Server?


In my opinion the easy and short answer is: install one (1) instance and use SQL Server Standard Edition 64-bit with enough memory (at least 8GB) for SQL Server to run comfortably.

Why? With multiple instances on the same server you have to make assumptions about how much memory to assign to each instance, leading to a less optimal configuration.
Also, multiple -housekeeping- tasks scheduled on the separate SQL Agents have to be configured and monitored separately.

On a dedicated SQL Server I usually opt for giving 1-2GB of memory (depending on the features installed, like SSIS or Reporting Services) to Windows and all other memory to SQL Server, and I definitely go for a 64-bit (32 = dinosaur) environment. If you then need additional memory to support more databases on the same instance, it can easily be installed without the hassle of boot.ini switches and the performance overhead of AWE (and AWE is only supported by the more expensive SQL Server Enterprise Edition).

Do not forget to set an upper limit on how much memory SQL Server can use, to leave enough room for other apps and services.

Sample: on a 16GB server you can set a maximum of 14GB for SQL Server to use, leaving 2GB for the OS and other stuff, as in the sketch below.
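A minimal sketch of that cap in T-SQL ('show advanced options' is needed to expose the setting):

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 14336; -- 14GB
RECONFIGURE;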




With the single-instance approach, together with Tools for performance to reduce the workload, you can host more databases on the same single-instance server, leading to significant cost savings.




Tuesday, March 17, 2009

Linked Server collation, performance and temp tables

When I create a linked server to a non-SQL Server platform, or to a SQL Server instance that's using another collation, I always use the following linked server options:

- Collation Compatible: False
- Use Remote Collation: False

In all other cases set Collation Compatible to True, so indexes on the remote SQL Server tables can be used, leading to a significant performance boost.

I usually set the RPC / RPC Out and Data Access properties to True; a T-SQL sketch follows below.
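A minimal sketch of setting these options with T-SQL instead of the GUI (SQLSERVER2 is the sample linked server name used further below):

EXEC sp_serveroption 'SQLSERVER2', 'rpc', 'true';
EXEC sp_serveroption 'SQLSERVER2', 'rpc out', 'true';
EXEC sp_serveroption 'SQLSERVER2', 'data access', 'true';
-- and, following the collation rules above:
EXEC sp_serveroption 'SQLSERVER2', 'collation compatible', 'true';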




When the linked server is a read-only data store I definitely prefer to use an OPENQUERY construction:

select * from openquery
(SQLSERVER2, 'SELECT col1,col2 FROM CodeCatalog.dbo.recommendations')

over the four-part name equivalent:
SELECT col1,col2 FROM SQLSERVER2.CodeCatalog.dbo.recommendations

Reason?
Far fewer distributed transaction errors (none so far!) generated by the OLE-DB driver.
Using OPENQUERY requires a bit more code, but it really, really pays off...

Performance.
Because linked servers are not the fastest objects on the planet, try to access them as little as you can. I usually access them just once, dump the result in a temporary table, and use that as input for further processing.

sample:
create table #t (id int)

insert into #t (id)
select * from openquery
(SQLSERVER2, 'SELECT id FROM CodeCatalog.dbo.recommendations')

...process results from #t here...

Don't forget to script your linked server definitions and store them somewhere safe.

Also, check out my post on MSDTC.




Monday, March 16, 2009

SQL Server MSDTC settings

During installation of a new Windows OS I set the following properties for the MSDTC security settings and transaction timeout.

These are required when you want to configure and use linked servers.

Security settings:



Transaction Timeout:


To configure these properties in Windows Component Services, complete the following actions:

1. From your Microsoft Windows desktop, click Start > Settings > Administrative Tools > Component Services, or start dcomcnfg.exe from a command prompt.

2. Expand the tree view to locate the computer you want to change the configuration for; for example, My Computer.

3. Display the context menu for the computer name, and then click Properties.

4. Click the 'Options' tab, and set the 'Transaction Timeout' to a length of time that suits your environment. The minimum setting I use is 180 seconds.

5. Click the 'MSDTC' tab, and then 'Security Configuration'. Make the changes as shown in the .jpg above and click OK to save.

The MSDTC service must be restarted for the changes to take effect. The OS will ask for confirmation.




Friday, March 13, 2009

Enable Instant File Initialization

In SQL Server 2005 (and higher versions), data files can be initialized instantaneously.
Instant file initialization reclaims used disk space without filling that space with zeros. Instead, disk content is overwritten as new data is written to the files.

This can lead to a significant performance boost for 'create database', restore and autogrow operations.
For example, when a thread executing a query becomes the victim of an autogrow, less disk activity needs to be done and the query can finish much quicker.

Unfortunately this option is not available for Log files.

How to assign ‘Perform volume maintenance tasks’ to your SQL Server instance:

Start the gpedit.msc utility, browse to 'User Rights Assignment' and add your SQL Server service account to the 'Perform volume maintenance tasks' policy.
Restart the SQL Server service (not the Windows server) for the changes to take effect.





Check the assigned privileges:

Open SQL Server Management Studio under the same account the SQL Server service is running under. Open a query window, set the results output to text, and paste/run the following code:

EXEC sp_configure 'xp_cmdshell', 1;
GO
RECONFIGURE WITH OVERRIDE;
GO

EXEC ('xp_cmdshell ''whoami /priv''');

EXEC sp_configure 'xp_cmdshell', 0;
GO
RECONFIGURE WITH OVERRIDE;
GO


The result in SSMS should contain SeManageVolumePrivilege with state Enabled.



Thursday, March 12, 2009

SQL Server Windows OS Settings

Favor 64-bit OSes.
32-bit = dinosaur.

Impact of other applications:
Limit the number of services / applications running to an absolute minimum.
Virus scanner: do not scan on-line but schedule it as a task during off-peak hours.
Exclude the SQL Server file extensions (.MDF, .NDF, .LDF, .BAK, .DIF and .TRN) from the scan, so these files are not accessed during the scanning process.


Pagefile:
Create a pagefile 1.5 - 2 times the amount of internal memory.
Don't put it on a busy database volume.

Memory:
Use the /3GB switch in boot.ini on dedicated 32-bit SQL Server OSes with 4GB of memory and no other programs like Reporting Services installed. This switch instructs the OS to assign a maximum of 3GB to processes instead of the 32-bit OS limit of 2GB, leaving a maximum of 1GB for the OS and other services.
Favor 64-bit over AWE.


Network properties:
Maximize Data Throughput for Network Applications


System properties:
Set the 'Application Response' setting to 'Background services'.
Change the memory allocation to favor 'Programs'.


Wednesday, March 11, 2009

Five methods converting rows to columns

A post with five methods for converting rows to columns.

From the classic ‘CASE’ construction to the newer PIVOT and ROW_NUMBER() OVER (ORDER BY) functions.

Be careful!
Some methods presented here can have a severe negative impact on performance. Especially the in-line function with a cursor should be used with the utmost precaution. Check the query plan and the STATISTICS IO (see this post) of your queries before putting the code into your production environment!

1. use a CASE statement


SELECT DATEPART(YEAR,orderdate),
    SUM(CASE WHEN DATEPART(q,orderdate) = 1 THEN orderamount ELSE 0 END) AS Qtr1,
    SUM(CASE WHEN DATEPART(q,orderdate) = 2 THEN orderamount ELSE 0 END) AS Qtr2,
    SUM(CASE WHEN DATEPART(q,orderdate) = 3 THEN orderamount ELSE 0 END) AS Qtr3,
    SUM(CASE WHEN DATEPART(q,orderdate) = 4 THEN orderamount ELSE 0 END) AS Qtr4,
    SUM(orderamount) AS Total
 FROM Orders
-- additional where clause goes here...
GROUP BY DATEPART(YEAR,orderdate)


2. use the COALESCE function
DECLARE @AllValues VARCHAR(4000)

SELECT @AllValues = COALESCE(@AllValues + ',', '') + HandlingCode
FROM OrdersDetails
WHERE OrderNumber = @OrderNumber -- @OrderNumber is declared and set elsewhere

SELECT @AllValues AS HandlingCodes -- one comma separated string

3. use ROW_NUMBER() OVER (ORDER BY)
SELECT OrderNumber, OrderDate,
--get the special handling codes and show them as columns, max of 3 (agreed by users)
    (SELECT HandlingCode
        FROM
        (
          SELECT ROW_NUMBER() OVER (ORDER BY HandlingCode) AS ROWNUMBER,HandlingCode
          FROM OrdersDetails
          WHERE OrdersDetails.OrderNumber = Orders.OrderNumber
        ) HandlingCode
        WHERE ROWNUMBER = 1) HandlingCode1,
    (SELECT HandlingCode
        FROM
        (
          SELECT ROW_NUMBER() OVER (ORDER BY HandlingCode) AS ROWNUMBER,HandlingCode
          FROM OrdersDetails
          WHERE OrdersDetails.OrderNumber = Orders.OrderNumber
        ) HandlingCode
        WHERE ROWNUMBER = 2) HandlingCode2,
    (SELECT HandlingCode
        FROM
        (
          SELECT ROW_NUMBER() OVER (ORDER BY HandlingCode) AS ROWNUMBER,HandlingCode
          FROM OrdersDetails
          WHERE OrdersDetails.OrderNumber = Orders.OrderNumber
        ) HandlingCode
        WHERE ROWNUMBER = 3) HandlingCode3
FROM Orders
WHERE OrderNumber = @OrderNumber


4. Use an inline function
SELECT OrderNumber, OrderDate, dbo.fn_GetHandlingCodes(OrderNumber) AS HandlingCodes
FROM Orders
WHERE OrderNumber = @OrderNumber

CREATE FUNCTION dbo.fn_GetHandlingCodes (@OrderNumber INT)
RETURNS VARCHAR(1200)
AS
BEGIN
   DECLARE @HandlingCode VARCHAR(20)
   DECLARE @ReturnValue  VARCHAR(4000)

-- use the fastest cursor method: local fast_forward
   DECLARE code_cursor CURSOR LOCAL fast_forward FOR
    SELECT HandlingCode
     FROM OrdersDetails
    WHERE OrderNumber = @OrderNumber  
      AND NOT HandlingCode IS NULL -- filled

   SET @ReturnValue = ''  -- set to non null

   OPEN code_cursor
   FETCH NEXT FROM code_cursor  INTO @HandlingCode
   WHILE (@@FETCH_STATUS = 0)
   BEGIN
       SET @ReturnValue = @ReturnValue + @HandlingCode + ', '      

       IF LEN (@ReturnValue) > 1000 BREAK -- avoid overflow

       FETCH NEXT FROM code_cursor INTO @HandlingCode
   END

   CLOSE code_cursor
   DEALLOCATE code_cursor

-- remove last delimiter
   IF LEN(@ReturnValue) > 1 SET @ReturnValue = SUBSTRING(@ReturnValue,1,LEN(@ReturnValue)-2)

   RETURN @ReturnValue
END



5. Use a pivot
USE AdventureWorks
GO

SELECT VendorID, [164] AS Emp1, [198] AS Emp2, [223] AS Emp3, [231] AS Emp4, [233] AS Emp5
FROM
(SELECT PurchaseOrderID, EmployeeID, VendorID
FROM Purchasing.PurchaseOrderHeader) p
PIVOT
(
COUNT (PurchaseOrderID)
FOR EmployeeID IN
( [164], [198], [223], [231], [233] )
) AS pvt
ORDER BY VendorID





Tuesday, March 10, 2009

SQL Server database settings

When creating a new SQL Server database I always use this checklist:

Name: use as few non-alphanumeric characters in the database name as possible (do not use the '-', it will mess up your scripts). Use pascal casing for readability (use OrdersDb, not Orders-db or Orders_db)

Set Page verify to Checksum

Pre-allocate data and log files at a reasonable size, do not rely on autogrowth.
Set a reasonable autogrow size in MB (not a percentage) for data and log files


Enabling auto create/update statistics is fine for most situations.

Set the right recovery model.
If you're using 'full' or 'bulk-logged' recovery, schedule regular log backups (see this post)

Try to create at least two -equally sized!- files for the data devices.
Not only does this reduce locking issues, but when a database is regularly under heavy load, files can easily be moved to additional physical volumes (see this post)

Create one file for the log

Use fn_virtualfilestats() to measure I/O load per device file (see this post )

Place the .MDF, .NDF and .LDF files in their own folder with the same name as the database. Don't store all your DBs in one folder

Reorganize indexes/statistics on a regular basis (see this post )

Run DBCC CHECKDB on a regular basis (see this post )

Do not schedule any shrink operations

Configure instant file initialization (see this post )

Check out my posts on auto index management and tools for performance, to set up more databases on the same instance and save costs. A CREATE DATABASE sketch that pulls several of these checklist items together follows below.
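A minimal sketch (names, sizes, growth steps and paths are just placeholders):

CREATE DATABASE OrdersDb
ON PRIMARY
    (NAME = OrdersDb_Data1, FILENAME = 'D:\OrdersDb\OrdersDb_Data1.mdf', SIZE = 512MB, FILEGROWTH = 128MB),
    (NAME = OrdersDb_Data2, FILENAME = 'E:\OrdersDb\OrdersDb_Data2.ndf', SIZE = 512MB, FILEGROWTH = 128MB)
LOG ON
    (NAME = OrdersDb_Log, FILENAME = 'F:\OrdersDb\OrdersDb_Log.ldf', SIZE = 256MB, FILEGROWTH = 64MB)
GO
ALTER DATABASE OrdersDb SET PAGE_VERIFY CHECKSUM
ALTER DATABASE OrdersDb SET RECOVERY FULL
GO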




Saturday, March 07, 2009

SQL Server 2000 post upgrade steps

A small post on additional steps to perform when upgrading an existing SQL Server 2000 database to SQL Server 2005 / 2008 (after you backed up, ran Upgrade Advisor and tested everything twice, right?!)

After a successful restore or attach database operation, execute the following commands in the new environment:

USE your_db_name
GO
DBCC UPDATEUSAGE ('your_db_name')
DBCC CHECKDB ('your_db_name') WITH DATA_PURITY
EXEC sp_updatestats


And:
- Set the database compatibility level to 'SQL Server 2005' or 'SQL Server 2008' (see the sketch below this list)
- Set Page verify option to Checksum
- Set 'Auto Create' and 'Auto Update' statistics to TRUE
- Reorganize all indexes
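A minimal sketch for the compatibility level step (on SQL Server 2005 use sp_dbcmptlevel; the ALTER DATABASE syntax is available from SQL Server 2008):

EXEC sp_dbcmptlevel 'your_db_name', 90                        -- SQL Server 2005
-- ALTER DATABASE your_db_name SET COMPATIBILITY_LEVEL = 100  -- SQL Server 2008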

Running 'DBCC UPDATEUSAGE' and 'DBCC CHECKDB WITH DATA_PURITY' once is enough.
The DBCC CHECKDB of the newer (2005/2008) SQL Server versions has it all combined in one statement.


Upgrade to the newer varchar(max), nvarchar(max), and varbinary(max) data types instead of the text, ntext, and image data types.
You can check which tables need some work by executing this T-SQL on your upgraded database on SQL Server 2005 / 2008:


USE your_db_name_here
GO

SELECT userobjects.name AS tablename, columns.name AS columnname, types.name AS columntype
FROM sys.all_columns columns
INNER JOIN sys.all_objects userobjects ON columns.object_id = userobjects.object_id
INNER JOIN sys.types types ON columns.system_type_id = types.system_type_id
WHERE userobjects.[type] = 'U'
AND types.name IN ('image','text','ntext')





Friday, March 06, 2009

SQL Server disk configuration tips

The most important metric for SQL Server is 'throughput': how many MB/s can the disks handle, and what is the load placed on the disks?

If you're buying a standard package, your vendor should know this.
Surprisingly however, in most situations this figure isn't mentioned anywhere in the hardware proposal (top-end server(s) / high-rpm disks, right?).
What you can do is ask your vendor to run the fn_virtualfilestats() function (see my fn_virtualfilestats post) on a similar production configuration of another customer (better: talk to them yourself!). This gives you insight into the I/O pattern (more read or more write activity) and the load on TempDB, to determine if it's a good candidate -which it usually is- to store on its own disk spindles.

Run the Microsoft SQLIO tool to determine a baseline of MB/s and latency per disk partition.
Store it for later reference; the Windows performance tool can also measure throughput, which gives you the possibility to compare the numbers and check if there might be a disk throughput bottleneck in your production system.







Disk configuration tips:

Use as many disk spindles as you can, and try spreading data / log and TempDB over different RAID arrays.
Stripe your RAID config at 256k instead of 64k / 128k. This will increase the performance of SQL Server read-aheads.
If you use Windows 2000 / 2003, align the disk partitions!!!
(See Microsoft Knowledge Base article KB300415 on how to use Diskpart; a short session sketch follows after this list.)
At the Diskpart command prompt, type: Create Partition Primary Align=X,
where X is either 32 or 64, depending on the recommendation from your storage vendor.
If your storage vendor does not have any specific recommendations, use 64.
This step is not needed in Windows 2008.

Full (not quick) format with a 64kb stripe size for disks holding data/log and backup files.

Full (not quick) format with a 4kb stripe size for OS and program/swap devices

Data: RAID 5, or RAID 10 when there's more write than read activity (check with fn_virtualfilestats).

Log: RAID 1, or RAID 10 under heavy load; separate backup disk (RAID 5).

Put TempDB on its own physical disks (RAID 1, RAID 10 under heavy load).
In a non-clustered environment, I normally use the local server storage for TempDB.

Use NTFS volumes

Do not use more than 80% of the volume

Defragment the disks on a regular basis
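The partition alignment step mentioned above, as a minimal Diskpart session sketch (the disk number is an example):

diskpart
list disk
select disk 1
create partition primary align=64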




How to configure SQL Server TempDB

As a general guideline, create one data file for each CPU on the server (see the sketch below).
Note that a dual-core CPU is considered to be two CPUs; logical processors (hyperthreading) do not count.
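A minimal sketch for a four-CPU box: tempdev already exists, so add three more equally sized data files (paths, sizes and growth steps are placeholders):

ALTER DATABASE tempdb ADD FILE (NAME = tempdev2, FILENAME = 'T:\TempDB\tempdev2.ndf', SIZE = 512MB, FILEGROWTH = 150MB);
ALTER DATABASE tempdb ADD FILE (NAME = tempdev3, FILENAME = 'T:\TempDB\tempdev3.ndf', SIZE = 512MB, FILEGROWTH = 150MB);
ALTER DATABASE tempdb ADD FILE (NAME = tempdev4, FILENAME = 'T:\TempDB\tempdev4.ndf', SIZE = 512MB, FILEGROWTH = 150MB);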

Only create one Log file

Do not use autoshrink on TempDB

If you're using SQL Server 2008, set Page verify to Checksum.

Investigate the possibility of switching off the 'Auto create' and 'Auto update' statistics properties; it can speed up the creation of objects in TempDB. Be careful however: measure the impact these changes can have on production! Change and measure only one parameter at a time.

After installation, move the TempDB database to its own disks, different from those used by user databases:

ALTER DATABASE TempDB
MODIFY FILE (NAME=tempdev, FILENAME= '\tempdb.mdf');
GO

ALTER DATABASE TempDB
MODIFY FILE (NAME=templog, FILENAME= '\TempLog.ldf');
GO


Restart the SQL Server service (not the Windows server) to make the changes permanent

Pre-allocate data and log device sizes (create the files with the same, equal size); do not rely on small autogrowth steps

If you use autogrowth, set it to a reasonable size in MB, not a percentage:

TempDB file size      Default FILEGROWTH increment
0 to 100 MB           10 MB
100 to 200 MB         20 MB
200 to 1000 MB        50 to 75 MB
1 GB or more          150 to 250 MB

Measure TempDB usage over time with fn_virtualfilestats.
New releases of your -vendor- software could have a different impact on the TempDB load.




Backup the mssqlsystemresource database

Since the introduction of SQL Server 2005, Microsoft implemented the mssqlsystemresource database. This is a read-only database that contains all the system objects that are included with SQL Server. It is stored as an mssqlsystemresource.mdf / .ldf file combination, installed in the same location as the master database, and it should stay there!

It is a SQL Server requirement that both the master and resource databases are in the same location. Due to the low I/O overhead it's in my opinion not worth the trouble moving both databases. They should however be stored on a redundant disk. Too risky to lose them!


Its content is specific to a version / patch level, and because it cannot be included in a regular SQL Server backup you should always make a manual copy of both the .mdf and .ldf files:

- After a SQL Server instance is installed

- Before and after a service pack or hot fix is applied

Copying can be done while the SQL Server instance is on-line, for example:
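A minimal sketch from a command prompt (the path is the default first instance of SQL Server 2005; adjust it and the target folder to your install):

copy "C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\mssqlsystemresource.mdf" "E:\SafeCopies\"
copy "C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\mssqlsystemresource.ldf" "E:\SafeCopies\"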




Thursday, March 05, 2009

Do you have a datetime T-SQL 'BETWEEN AND' bug??

Create a table with a datetime column:

create table #t (col1 datetime)

Add a couple of rows with date and time values:

insert into #t values ('1/mar/2009 08:00')
insert into #t values ('1/mar/2009 09:00')
insert into #t values ('1/mar/2009 10:00')


Get the rows:

select * from #t
where col1 between '1/mar/2009' and '1/mar/2009'


and guess what?
(0 row(s) affected)


Why? You're trying to select a date without a time, and that's treated as 00:00 midnight;
rows with that exact value do not exist in the table.

Three ways to solve this:

1.
Append a time value to the SQL string:

select * from #t
where col1 between '1/mar/2009 00:00:00' and '1/mar/2009 23:59:59'
(3 row(s) affected)


No go: too much work -especially with variables-, it could lead to (more) bugs, and 23:59:59 still silently misses any datetime values in the final second before midnight.


2.
select * from #t
where CONVERT(datetime,CONVERT(varchar(20), col1, 112),112) between '1/mar/2009' and '1/mar/2009'
(3 row(s) affected)

This uses a double CONVERT() T-SQL function to get rid of the time part and switch back to the datetime format.

No go: if you have an index defined on the column, it will not be used by the SQL Server query optimizer because of the CONVERT function. Bad performance!

3.
My absolute thumbs-up favorite:

select * from #t
where col1 >= '1/mar/2009' and col1 < dateadd(d,1,'1/mar/2009')
(3 row(s) affected)

Go: this construction has served me well over the years. Date/time values are handled correctly,
indexes are used and there's no hassle with CONVERT(), leading to good performance and maintainable T-SQL code!

Have fun!






Tuesday, March 03, 2009

SQL Server tools for performance

Since the introduction of SQL Server 2005, Microsoft implemented Dynamic Management Views (DMVs) in the database engine that enable us to peek inside it and retrieve the internal -performance- counters.

In this post I present a couple of queries, gathered and used over time, that have assisted me in pinning down the problem when having performance-related issues.

If you've inherited a system from an older / upgraded SQL Server version, I recommend running the 'unused indexes' query mentioned later on. It saved my life on more than one occasion.
In my experience, as tables grow bigger, indexes are created for specific needs and over time become obsolete, but no one bothers to remove them, leaving the system with a lot of unnecessary I/O overhead.

Explore sys.dm_exec_query_stats; it contains more fields than used here (last_physical_reads for instance), and also check out other DMVs for additional info you might be interested in.


Before starting any investigation, make sure that all indexes are defragmented and all statistics are up to date (see my post on index reorganization).

The queries described here, combined with my previous posts on fn_virtualfilestats, auto index management, STATISTICS IO, multiple files per database and index reorganization, should supply you with enough info to tackle most I/O related performance issues.

Beware: DMVs are reset when the SQL Server instance is restarted! So the numbers are only meaningful when the db engine has been running for a considerable amount of time.


First determine how long the instance has been running:

SELECT crdate AS Instance_Started_On
FROM sysdatabases
WHERE name = 'tempdb'
 



Queries with highest IO load:

SELECT TOP 100
        total_logical_reads, total_logical_writes, execution_count,
        total_logical_reads+total_logical_writes AS total_IO,
        st.TEXT AS query_text,
    st.dbid AS database_id,
        DB_NAME(st.dbid) AS database_name
    FROM sys.dm_exec_query_stats  qs
   CROSS APPLY sys.dm_exec_sql_text(sql_handle) st
   WHERE total_logical_reads+total_logical_writes > 0
   ORDER BY (total_logical_reads+total_logical_writes) DESC -- most i/o intensive
 


Queries using the most CPU:

SELECT TOP 100
    highest_cpu_queries.total_worker_time,
    sql_text.dbid,
    DB_NAME(sql_text.dbid) AS database_name,
    sql_text.[TEXT]
FROM  
    (SELECT TOP 100 PERCENT
        qs.plan_handle,  
        qs.total_worker_time
    FROM  
        sys.dm_exec_query_stats qs
    ORDER BY qs.total_worker_time DESC) AS highest_cpu_queries
    cross apply sys.dm_exec_sql_text(plan_handle) AS sql_text
ORDER BY highest_cpu_queries.total_worker_time DESC
 


Get the unused indexes:

USE your_db_name -- switch to your own db here before executing the DMV query below
go

SELECT TOP 100
OBJECT_NAME(i.OBJECT_ID) AS tablename,
i.name AS indexname,
s.user_updates,
s.user_seeks + s.user_scans + s.user_lookups AS totalreads,
s.user_updates AS totalwrites
FROM sys.indexes i
            LEFT join sys.dm_db_index_usage_stats s
ON s.OBJECT_ID = i.OBJECT_ID and  
                  i.index_id = s.index_id
WHERE OBJECTPROPERTY(i.OBJECT_ID, 'IsIndexable') = 1 and
(
-- index_usage_stats has no reference to this index (not being used)
s.index_id IS null or
-- index is being updated, but not used by seeks/scans/lookups
(s.user_updates > 0 and s.user_seeks = 0 and s.user_scans = 0 and s.user_lookups = 0)
)
ORDER BY s.user_updates DESC
go
 




Sunday, March 01, 2009

SET STATISTICS IO ON!

The graphical query plans introduced in the latest SQL Server versions give you great inside info on how your query executes. Downside: sometimes they are difficult to read and they take time to fully understand.

Enter: SET STATISTICS IO ON

It gives you an overview of how many I/Os your query executes on each accessed table, and by working together with the query optimizer you can absolutely minimize these numbers.

And by doing so optimize and increase the work load a SQL Server instance can handle.

For example: what is the impact of replacing a SELECT IN sub-query with an IF EXISTS () construction?

This feature is available since SQL Server 6.x, so you can still tune your old instances.

In my opinion this option should always be used with any new or changed query against a representative amount of data: a copy from, or simulation of, a production environment to check if the results are in-line with expectations.

Here's a simple example of a query and its result, shown in the Messages tab of the results pane:
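A minimal sketch (dbo.Orders is just an example table); the commented line shows the kind of output that appears in the Messages tab:

SET STATISTICS IO ON

SELECT COUNT(*) FROM dbo.Orders
-- Table 'Orders'. Scan count 1, logical reads 123, physical reads 0, ... (illustrative output)

SET STATISTICS IO OFF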
If you have a hosted SQL Server environment with tight security, it could well be that generating graphical query plans is prohibited, while this option could still be granted, giving you lots of meaningful information.

Before diving into it, check that all indexes are reorganized and the statistics are up to date
(see my post on index defragmentation).

Also check my post auto index management and let SQL Server assist you with your indexes!

Oh yeah, switch it off again when you are done tuning your queries; don't include it in your stored procedures.




Friday, February 20, 2009

Enable those alerts!


Since SQL Server 7.0, Microsoft has implemented so-called alerts in SQL Agent.
The main purpose of these alerts is that when specific events occur, the DBA automatically gets informed by mail / pager or other means and can take corrective action.

There are a couple of basic events that should be implemented on an instance by default, but I haven't seen many servers where they actually were installed, used and monitored.

Here's a T-SQL script that will create the right basic alerts; after setup they will pop up under the 'Alerts' folder in SQL Agent. It's an easy step to connect the alerts to an operator so they can be sent out. I'll leave that challenge to you ;-) (though there's a head start below the script).



USE [msdb]
GO
/****** Object:  Alert [019 - Fatal Error in Resource] ******/
EXEC msdb.dbo.sp_add_alert @name=N'019 - Fatal Error in Resource',
        @message_id=0,
        @severity=19,
        @enabled=1,
        @delay_between_responses=0,
        @include_event_description_in=1,
        @category_name=N'[Uncategorized]',
        @job_id=N'00000000-0000-0000-0000-000000000000'

USE [msdb]
GO
/****** Object:  Alert [020 - Fatal Error in Current Process] ******/
EXEC msdb.dbo.sp_add_alert @name=N'020 - Fatal Error in Current Process',
        @message_id=0,
        @severity=20,
        @enabled=1,
        @delay_between_responses=0,
        @include_event_description_in=1,
        @category_name=N'[Uncategorized]',
        @job_id=N'00000000-0000-0000-0000-000000000000'

USE [msdb]
GO
/****** Object:  Alert [021 - Fatal Error in Database Process] ******/
EXEC msdb.dbo.sp_add_alert @name=N'021 - Fatal Error in Database Process',
        @message_id=0,
        @severity=21,
        @enabled=1,
        @delay_between_responses=0,
        @include_event_description_in=1,
        @category_name=N'[Uncategorized]',
        @job_id=N'00000000-0000-0000-0000-000000000000'

USE [msdb]
GO
/****** Object:  Alert [022 - Fatal Error Table Integrity Suspect] ******/
EXEC msdb.dbo.sp_add_alert @name=N'022 - Fatal Error Table Integrity Suspect',
        @message_id=0,
        @severity=22,
        @enabled=1,
        @delay_between_responses=0,
        @include_event_description_in=1,
        @category_name=N'[Uncategorized]',
        @job_id=N'00000000-0000-0000-0000-000000000000'

USE [msdb]
GO
/****** Object:  Alert [023 - Fatal Error Database Integrity Suspect] ******/
EXEC msdb.dbo.sp_add_alert @name=N'023 - Fatal Error Database Integrity Suspect',
        @message_id=0,
        @severity=23,
        @enabled=1,
        @delay_between_responses=0,
        @include_event_description_in=1,
        @category_name=N'[Uncategorized]',
        @job_id=N'00000000-0000-0000-0000-000000000000'

USE [msdb]
GO
/****** Object:  Alert [024 - Fatal Error Hardware Error]  ******/
EXEC msdb.dbo.sp_add_alert @name=N'024 - Fatal Error Hardware Error',
        @message_id=0,
        @severity=24,
        @enabled=1,
        @delay_between_responses=0,
        @include_event_description_in=1,
        @category_name=N'[Uncategorized]',
        @job_id=N'00000000-0000-0000-0000-000000000000'


USE [msdb]
GO
/****** Object:  Alert [025 - Fatal Error] ******/
EXEC msdb.dbo.sp_add_alert @name=N'025 - Fatal Error',
        @message_id=0,
        @severity=25,
        @enabled=1,
        @delay_between_responses=0,
        @include_event_description_in=1,
        @category_name=N'[Uncategorized]',
        @job_id=N'00000000-0000-0000-0000-000000000000'
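As a head start: a minimal sketch that hooks one of the alerts above to an operator, assuming an operator named 'DBA Team' already exists (create one with sp_add_operator); notification method 1 = e-mail:

EXEC msdb.dbo.sp_add_notification
        @alert_name = N'019 - Fatal Error in Resource',
        @operator_name = N'DBA Team',
        @notification_method = 1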
 


