Troubleshoot a Stored Procedure

Hello SQL Folks – I have seen many times that a stored procedure in a SQL Server instance runs slowly and creates typical performance problems. It can feel hard to troubleshoot, and the situation gets worse when it is a nested stored procedure or the procedure itself contains thousands of lines of code.

To simplify this problem, I wrote the query below, which identifies only the time-consuming (problematic) part of the stored procedure together with its execution plan – and that is the execution plan for just the time-consuming part, not the full plan of the whole procedure. In the query output, the text_filters column shows the individual T-SQL statements/batches that run inside the stored procedure, along with the average, last, and minimum run time of each.

Using the script below will make your life easier: you can concentrate on just the problematic part instead of troubleshooting the full stored procedure.

 

SELECT TOP 50
      qs.plan_handle
    , st.text
    , SUBSTRING(st.text, (qs.statement_start_offset/2) + 1,
        ((CASE qs.statement_end_offset
            WHEN -1 THEN DATALENGTH(st.text)
            ELSE qs.statement_end_offset
          END - qs.statement_start_offset)/2) + 1) AS text_filters
    , (qs.total_worker_time)/1000.0 AS [Total CPU (ms)]
    , (qs.total_worker_time/qs.execution_count)/1000.0 AS [Avg CPU Time (ms)]
    , (qs.total_elapsed_time)/1000.0 AS [Total Duration (ms)]
    , (qs.total_elapsed_time/qs.execution_count)/1000.0 AS [Avg Duration (ms)]
    , (qs.last_elapsed_time)/1000.0 AS [Last Duration (ms)]
    , (qs.min_elapsed_time)/1000.0 AS [Min Duration (ms)]
    , (qs.max_elapsed_time)/1000.0 AS [Max Duration (ms)]
    , qs.last_execution_time AS [Last Execution]
    , qs.execution_count AS [#Executions]
    , CAST(qp.query_plan AS xml) AS Query_Plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_text_query_plan(qs.plan_handle, qs.statement_start_offset,
    qs.statement_end_offset) AS qp
WHERE st.text LIKE '%<Stored_Procedure_Name>%'
ORDER BY [Total CPU (ms)] DESC
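If you first want to decide which procedure to focus on, a complementary first pass (assuming SQL Server 2008 or later, where the sys.dm_exec_procedure_stats DMV is available) can rank whole procedures by CPU before you drill into statements with the query above:

-- Rank cached procedures by total CPU; feed the worst offender's name
-- into the <Stored_Procedure_Name> filter of the statement-level query above
SELECT TOP 10
      DB_NAME(ps.database_id) AS database_name
    , OBJECT_NAME(ps.object_id, ps.database_id) AS procedure_name
    , ps.total_worker_time/1000.0 AS [Total CPU (ms)]
    , ps.execution_count AS [#Executions]
FROM sys.dm_exec_procedure_stats AS ps
ORDER BY ps.total_worker_time DESC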

 

Hope it will help you. Stay tuned for more interesting solutions to complex SQL performance issues.

Cheers – Gaurav

Don’t know who dropped the tables in your database? Here you go

Hi Folks – Some time ago I ran into a situation where somebody dropped many tables from our database and we had a hard time finding the details. This article shows how to find out which tables were dropped and who dropped them. To illustrate, I'm using the example below:

/* Use an existing database or create a new one; I'm using an existing one */
USE Test_1
GO

/* Let's create a test table */
CREATE TABLE Test_Table (empID int, name nvarchar(20))
GO

/* Insert values into the table */
INSERT INTO Test_Table VALUES (100, 'Gaurav')
INSERT INTO Test_Table VALUES (101, 'Ashish')
INSERT INTO Test_Table VALUES (102, 'Vivek')

/* Check the data in the table */
SELECT * FROM Test_Table

/* Note down the object id of the table */
SELECT OBJECT_ID(N'Test_Table');  -- In my case the object id was 581577110
GO

/* Drop the test table */
DROP TABLE [Test_Table];
GO

Everything we do in a database is captured in the transaction log, and by using the command below we can find the transactions that dropped tables:

SELECT [Transaction Id], [Begin Time], SUSER_SNAME([Transaction SID]) AS [User]
FROM fn_dblog (NULL, NULL)
WHERE [Transaction Name] = N'DROPOBJ';
GO

This command will show output like the following:

[Screenshot: fn_dblog output showing Transaction Id, Begin Time, and User]

The output above tells us who dropped the tables, in the [User] column (in this case it was gshri). To find out which tables were dropped, pass the transaction id to the same function:

SELECT [Lock Information] FROM fn_dblog (NULL, NULL)
WHERE [Transaction Id] = '0000:00000343'
AND [Lock Information] LIKE '%SCH_M OBJECT%';

In my case the output was:

[Screenshot: Lock Information rows for transaction 0000:00000343]

Here 6 is the database id and 581577110 is the object id, which matches the one I captured above.
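If you want to translate those ids back to names, a quick check could look like the snippet below; note that OBJECT_NAME() returns NULL once the table is gone, which is exactly why noting down the object id before the drop matters:

-- 6 and 581577110 come from the Lock Information output above
SELECT DB_NAME(6) AS database_name
SELECT OBJECT_NAME(581577110) AS table_name  -- NULL after the drop; compare the id with the one captured earlier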

Hope it helps. Thanks.

How to Resolve Database Log File Corruption

Hi Folks – Some time ago we faced an issue where the log backup and cdc.<db_name>_capture jobs for a database were continuously failing. After checking the SQL Server error log we found the error message below:

2013-12-17 12:15:12.450 spid344 Backup detected log corruption in database Gaurav. Context is FirstSector. LogFile: 34 'K:\Mount12\Gaurav_Log\Gaurav_Log.ldf' VLF SeqNo: x18e44e VLFBase: x374000000 LogBlockOffset: x3777c4200 SectorStatus: 2 LogBlock.StartLsn.SeqNo: x3020001 LogBlock.StartLsn

2013-12-17 12:15:12.450 Backup       Error: 3041, Severity: 16, State: 1.

As the message clearly states, the log file for the database was corrupted. Since it was a 4 TB production database, we couldn't afford downtime. We implemented the steps below to remove the corruption from the log file (a T-SQL sketch of these steps follows the list).

  1. Change the recovery model of the affected database to simple, using the command ALTER DATABASE <db_name> SET RECOVERY SIMPLE.
  2. Shrink the corrupted log file (for example via the shrink wizard) as far as possible. This way SQL Server returns the corrupted, free-space portion of the file to the OS, so it will no longer write data to any bad sectors.
  3. Take a full backup of the database; this time it should complete without any log corruption error.
  4. Change the database back to the full recovery model and initiate another full backup, which should also succeed.
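A minimal T-SQL sketch of those four steps; the database, logical log file, and backup path names are placeholders, so check your actual logical file name first:

ALTER DATABASE <db_name> SET RECOVERY SIMPLE
GO
USE <db_name>
GO
-- Logical log file name assumed to be <db_name>_Log;
-- verify with: SELECT name, type_desc FROM sys.database_files
DBCC SHRINKFILE (N'<db_name>_Log', 1)
GO
BACKUP DATABASE <db_name> TO DISK = N'<backup_path>\<db_name>_full_1.bak'
GO
ALTER DATABASE <db_name> SET RECOVERY FULL
GO
BACKUP DATABASE <db_name> TO DISK = N'<backup_path>\<db_name>_full_2.bak'
GO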

Hope these steps will help you.

Cheers – Gaurav

Enable CDC in a Database Mirroring Environment

Hi SQL Server Folks – Recently I had a situation where database mirroring was set up in my production environment, but due to other needs I also had to enable the CDC (Change Data Capture) feature on my SQL Server 2008 R2 instance. If you want to know more about CDC and its usage, you can follow the link below:

http://technet.microsoft.com/en-us/library/bb522489(v=sql.105).aspx

To explain and simplify the situation in our environment, I'm using an example here:

1. We had Server A and Server B. Database mirroring for a database named “Test” was set up from Server A to Server B; Server A was the principal and Server B was acting as the mirror.

2. We enabled the CDC feature at Server A on the “Emp” table in database “Test”. As soon as you enable CDC, it creates two jobs at Server A with the following names:
cdc.<db_name>_capture
cdc.<db_name>_cleanup

It also creates a system table named cdc_jobs in the msdb system database that keeps the information about these two jobs.
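For example, you can confirm the job metadata on the principal with a quick look at that table:

SELECT * FROM msdb.dbo.cdc_jobs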

3. The problem is that these jobs and the system table are not created at Server B. So if you fail over database “Test” from Server A to Server B, the changes you make to the Emp table will not be captured in the CDC tables at Server B. To make CDC effective on both servers, we need to create the two CDC jobs at Server B as well, using the process below:

a) Stop and disable both CDC jobs at Server A (a sketch of this step follows step d below).

b) Fail over database mirroring for “Test” from Server A to Server B.

c) Connect to Server B – database “Test” is the principal there now – and fire the commands below:

USE Test
GO

EXEC sys.sp_cdc_add_job @job_type = N'capture';
GO

EXEC sys.sp_cdc_add_job
    @job_type = N'cleanup',
    @start_job = 0,
    @retention = 5760;
GO

d) Executing the statements above creates the two CDC jobs (CDC capture and CDC cleanup) and the cdc_jobs system table in the msdb database at Server B. Now any transaction against the Emp table in database “Test” at Server B will also be captured in the CDC tables and mirrored to Server A.
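For step a), a minimal sketch of stopping and disabling the jobs at Server A; the job names assume the database is called Test, so adjust them to your own cdc.<db_name>_capture and cdc.<db_name>_cleanup names:

-- Run at Server A; note sp_stop_job raises an error if the job is not currently running
EXEC msdb.dbo.sp_stop_job   @job_name = N'cdc.Test_capture';
EXEC msdb.dbo.sp_update_job @job_name = N'cdc.Test_capture', @enabled = 0;
EXEC msdb.dbo.sp_stop_job   @job_name = N'cdc.Test_cleanup';
EXEC msdb.dbo.sp_update_job @job_name = N'cdc.Test_cleanup', @enabled = 0;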

Hope this article helps. Thanks for reading, and stay tuned for upcoming posts.

Cheers,
Gaurav

How to enable RCSI (Read Committed Snapshot Isolation) level for a Database with Database Mirroring

Hi SQL Guys – By default our databases run under the read committed isolation level. Sometimes it is necessary to reduce locking at the database level, and RCSI (read committed snapshot isolation) can reduce the blocking and deadlocking issues caused by lock contention. Every time a row is modified by a transaction, the Database Engine stores a version of the previously committed image of the row in tempdb. The read committed snapshot isolation level provides the following benefits:

  1. SELECT statements do not lock data during a read operation. Read transactions do not block write transactions and vice versa.
  2. Since SELECT statements do not acquire locks in most cases, the number of locks required by a transaction is reduced, which reduces the system overhead required to manage locks.
  3. The possibility of blocking is significantly reduced.
  4. SELECT statements can access the last committed value of the row while other transactions are updating the row without getting blocked.
  5. The number of blocks and deadlocks is reduced.
  6. Fewer lock escalations occur.

But it is a bit tricky to enable RCSI on a database that is participating in database mirroring. Below are the steps that will make the task easier:

  • Disable the log backup job for the database on which you want to enable RCSI.
  • Disconnect all users from the database, e.g. by taking all web/extranet servers out of the HLB (when the application back end is the SQL database), or kill the connections to that database. If connections are still visible through sp_who2, disable the logins they come through. RCSI can be enabled without taking the database into single-user mode.
  • To enable RCSI you need to break database mirroring, but don't panic – you do not need to restore a full database backup here; the database engine will take care of it. To break the mirroring, fire the command below:

                    ALTER DATABASE <db_name> SET PARTNER OFF

  • Now you can enable RCSI using the command below:

                        USE <db_name>
                        GO
                        ALTER DATABASE <db_name> SET READ_COMMITTED_SNAPSHOT ON

  • The isolation level should be set now; verify it with the command below. The column is_read_committed_snapshot_on = 1 shows that RCSI is enabled.

          SELECT name, is_read_committed_snapshot_on FROM sys.databases WHERE name = '<db_name>'

  • Re-configure database mirroring; there is no need to restore a full database backup here, mirroring will only transfer the leftover transaction logs (see the sketch after this list).
  • Wait until database mirroring is synchronized for the database, then re-enable the transaction log backup.
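A minimal sketch of the re-configuration and the synchronization check; the TCP endpoint addresses are placeholders for your own partner URLs:

-- On the mirror server, point it back at the principal
ALTER DATABASE <db_name> SET PARTNER = 'TCP://<principal_server>:5022'
GO
-- Then on the principal server, point it at the mirror
ALTER DATABASE <db_name> SET PARTNER = 'TCP://<mirror_server>:5022'
GO
-- Check the session state; wait until it reports SYNCHRONIZED
SELECT DB_NAME(database_id) AS database_name, mirroring_state_desc
FROM sys.database_mirroring
WHERE database_id = DB_ID('<db_name>')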

Stay tuned for more stuff!

Implementing Resource Governor to Limit CPU Usage by Backup Compression

Hi SQL Folks – As we all know, backup compression is a resource-intensive operation. To get around this problem we can implement Resource Governor (RG) in our environment and let RG limit the CPU it uses. I'm showing this demo on a machine with multiple CPUs, using SQL Server 2008 R2 build 10.50.4266 on Windows 2008 R2 SP1.

1. If we run backup compression without RG, it consumes 90%+ of total CPU. I ran it, and below is the Perfmon output (you need to select two counter objects in Perfmon: SQLServer:Resource Pool Stats and SQLServer:Workload Group Stats). As you can see below, the blue line (CPU usage by the default group) is consuming most of the CPU.

[Perfmon screenshot: the default workload group's blue line consuming most of the CPU]

2. Now I'm going to create and configure Resource Governor using the scripts below:

-----------------------------------------------
-- Step 1: Create the resource pool
-----------------------------------------------
CREATE RESOURCE POOL pSlow
WITH (MIN_CPU_PERCENT = 0,
      MAX_CPU_PERCENT = 30,
      MIN_MEMORY_PERCENT = 0,
      MAX_MEMORY_PERCENT = 30)
GO

-----------------------------------------------
-- Step 2: Create the workload group
-----------------------------------------------
CREATE WORKLOAD GROUP gSlow
USING pSlow
GO

-----------------------------------------------
-- Step 3: Create the user-defined classifier function, which routes
-- the workload to the appropriate workload group (create it in master;
-- classifier functions live there)
-----------------------------------------------
/* In this example we check SUSER_SNAME() to decide the workload group.
   Other functions such as HOST_NAME(), APP_NAME(), IS_MEMBER() etc. can also be used. */
CREATE FUNCTION f1()
RETURNS SYSNAME WITH SCHEMABINDING
AS
BEGIN
    DECLARE @val sysname
    IF 'UserSlow' = SUSER_SNAME()
        SET @val = 'gSlow';
    ELSE
        SET @val = 'default';
    RETURN @val;
END
GO

-----------------------------------------------
-- Step 4: Create the login, assign the classifier function,
-- and make the changes effective
-----------------------------------------------
CREATE LOGIN UserSlow WITH PASSWORD = 'UserSlowPwd', CHECK_POLICY = OFF
GO
ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.f1)
GO
ALTER RESOURCE GOVERNOR RECONFIGURE
GO
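To confirm the pool and group registered, a quick sanity check against the Resource Governor DMVs (not part of the original setup) can be run:

-- Both queries should list the new pSlow / gSlow entries
-- alongside the built-in internal and default ones
SELECT pool_id, name, min_cpu_percent, max_cpu_percent
FROM sys.dm_resource_governor_resource_pools
SELECT group_id, name, pool_id
FROM sys.dm_resource_governor_workload_groups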

3. Resource Governor is now set up at the server level. Next I need a job that takes a backup with the compression option.

Before taking the backup, we need to add that login as a user in the particular database (the one we're backing up) and assign the permission:

USE <db_name>
GO
-- Add the login as a database user first, then grant backup permission
CREATE USER UserSlow FOR LOGIN UserSlow
GO
EXEC sp_addrolemember 'db_backupoperator', 'UserSlow'
GO

 

BACKUP DATABASE <db_name>
TO DISK = 'D:\abc.bak'
WITH COMPRESSION
GO

Save this script as, say, SQLQuery2.sql, create a SQL Agent job, and put the command below in the job to execute the script:

sqlcmd -S <Server_Name> -U UserSlow -P UserSlowPwd -i "I:\Perf Data By Gaurav\SQLQuery2.sql"

4. Since I assigned 30% CPU to this pool, whenever any query runs under this particular login (UserSlow), the engine will only assign it 30% of a single CPU.

The beauty of RG is that when there is no other workload on the system, it can make maximum use of the single CPU assigned to it (i.e. if there are 4 CPUs and 1 of them is assigned to the pool, it can use 90% of that single CPU; it will not touch the other 3). When other workload is present, it uses only the 30% of the single CPU assigned to it.

To prove this point I ran the backup compression job (now governed by the RG pool with the CPU limit) and created a small initial load on the system. As you can see in the Perfmon values below, the default load I created (highlighted by the blue line) is using 30% of the CPU, but since that load is light, my RG pool is consuming 70% of the total CPU.

[Perfmon screenshot: RG pool at ~70% CPU, light default load at ~30%]

5. Then I increased the load on the system by running the script below in a query window:

SET NOCOUNT ON
DECLARE @i int
DECLARE @s varchar(100)
SET @i = 100000000
WHILE @i > 0
BEGIN
    SELECT @s = @@VERSION;
    SET @i = @i - 1;
END

As you can see in the reading below, RG started to release CPU to the other queries. Now the RG pool (highlighted in yellow) is consuming less than 60% CPU (previously 70%) and the other load has started consuming 45% CPU (previously 30%).

[Perfmon screenshot: RG pool below 60% CPU, other load up to 45%]

6. To show that the RG workload is using a separate scheduler (CPU), I fired the query below:

SELECT r.session_id, CONVERT(nchar(20), wg.name) AS group_name, t.scheduler_id, r.status
FROM sys.dm_exec_requests r
JOIN sys.dm_os_tasks t ON r.task_address = t.task_address
JOIN sys.dm_resource_governor_workload_groups wg ON r.group_id = wg.group_id
WHERE r.session_id > 50

In the query output below you can see that the RG group "gSlow" is using a different scheduler (CPU no. 3) while the other workloads are using CPUs 0/1/2.

[Query output: gSlow session on scheduler 3; other sessions on schedulers 0, 1, and 2]

Hope it'll help someone. Thanks.

Restore Log Shipping without a full backup when a data/log file was added at the primary server

Hi SQL Server Folks – Some time ago I faced a scenario where Log Shipping was set up between two servers (say, a primary and a secondary server). For some reason we added one more file to a database at the primary server, and after that Log Shipping started breaking with the error message below:

[Screenshot: log shipping restore job failure]

In this scenario you do not need to restore a full backup from the primary; you can simply restore the particular transaction log file with the MOVE option. You will find all the required information in your log shipping restore job history, just as in the error above. I'll explain the errors and their meanings step by step:

Message 2013-01-20 03:00:01.38 *** Error: Could not apply log backup file 'H:\ProdDB_eWork_Copy\eWorkBackup\Tran\eWork_20130119080001.trn' to secondary database 'eWork_ProdCopy'.(Microsoft.SqlServer.Management.LogShipping) ***

  • This is the transaction log file that could not be applied at the secondary server, because we added a data file at the primary server but not at the secondary.

2013-01-20 03:00:01.38 *** Error: Directory lookup for the file 'h:\Mount16\eWork\CDCFileGroup.ndf' failed with the operating system error 3 (The system cannot find the path specified.).

  • This is the name and location of the data file that was added at the primary server but not at the secondary; we need to specify this file in the RESTORE LOG command. Don't create the file manually – the WITH MOVE option of RESTORE LOG takes care of it automatically.

File 'CDCFileGroup.ndf' cannot be restored to 'h:\Mount16\eWork\CDCFileGroup.ndf'. Use WITH MOVE to identify a valid location for the file.

  • This is the logical file name of the newly added data file at the primary server.

Problems were identified while planning for the RESTORE statement. Previous messages provide details. RESTORE LOG is terminating abnormally.(.Net SqlClient Data Provider) ***

Now open a query window and fire the query below.

RESTORE LOG eWork_ProdCopy
FROM DISK = 'H:\ProdDB_eWork_Copy\eWorkBackup\Tran\eWork_20130119080001.trn'
WITH MOVE 'CDCFiles' TO 'H:\MSSQL10_50.SSIS\MSSQL\eWork_ProdCopy\Data\CDCFiles',
-- this is the location on the secondary server that you need to define
STANDBY = 'H:\ProdDB_eWork_Copy\eWorkBackup\Tran\eWork_ProdCopy.tuf'

After the restore command completes, you can manually kick off the log shipping copy and restore jobs one by one at the secondary server; they will restore all remaining files.
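For example, kicking the jobs off from T-SQL could look like the sketch below; the job names are placeholders, so use the LSCopy/LSRestore names from your own log shipping setup:

-- Run at the secondary server; start the copy job first, then the restore job
EXEC msdb.dbo.sp_start_job @job_name = N'LSCopy_<primary_server>_<db_name>';
EXEC msdb.dbo.sp_start_job @job_name = N'LSRestore_<primary_server>_<db_name>';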

To see the log shipping sync status with the primary server, run the commands below (these tables live in msdb):

SELECT * FROM msdb.dbo.log_shipping_secondary_databases
SELECT * FROM msdb.dbo.log_shipping_monitor_secondary

 

Hope this info will be useful to you.