Monday, 14 October 2019

SQL server DB status runnable, sleeping, suspended, running, pending and background meanings

Some of the most seen status of SPID's in SQL Server and what do they mean:
RUNNING:
This status means session is running one or more batches. When Multiple Active Result Sets (MARS) is enabled, a session can run multiple batches. What this actually means is, the client connected to SQL Server using this session has already submitted a query for SQL Server to process and SQL Server is currently processing the query. The query could be anywhere between generating a parser tree to performing a join to sorting the data... and it is consuming the CPU (Processor) cycles currently.
SUSPENDED:
It means that the request currently is not active because it is waiting on a resource. The resource can be an I/O for reading a page, A WAIT it can be communication on the network, or it is waiting for lock or a latch. It will become active once the task it is waiting for is completed. For example, if the query the has posted a I/O request to read data of a complete table tblStudents then this task will be suspended till the I/O is complete. Once I/O is completed (Data for table tblStudents is available in the memory), query will move into RUNNABLE queue.
RUNNABLE:
The SPID is in the runnable queue of a scheduler and waiting for a quantum to run on the scheduler. This means that requests got a worker thread assigned but they are not getting CPU time.
The RUNNABLE queue can be likened to a grocery analogy where there are multiple check out lines. The register clerk is the CPU. There is just one customer checking out e.g. “RUNNING” at any given register. The time spent in the checkout line represents CPU pressure. So this SPID is waiting for that customer who is running (with register clerk) to get out so that it can start RUNNING. You can use the query SELECT wait_type,waiting_tasks_count,signal_wait_time_ms FROM sys.dm_os_wait_stats ORDER BY signal_wait_time_ms DESC to find out the difference between the time the waiting thread was signaled and when it started running. This difference is the time spent in RUNNABLE queue. Some of the waits on the top of the list can be safely ignored.
PENDING:
The request is waiting for a worker to pick it up. This means the request is ready to run but there are no worker threads available to execute the requests in CPU. This doesn't mean that you have to increase 'Max. Worker threads", you have to check what the currently executing threads are doing and why they are not yielding back. I personally have seen more SPID's with status PENDING on issues which ended up in "Non-yielding Scheduler" and "Scheduler deadlock".
BACKGROUND:
The request is a background thread such as Resource Monitor or Deadlock Monitor.
SLEEPING:
There is no work to be done.

Thursday, 28 March 2019

Database in Suspect Mode


Yesterday I had mail to check the database health after the recovery of servers from Storage crash issue. I found one particular database was in 'Suspect' mode. Performed steps to fix this. 

Database will go to Suspect mode usually if there is any issue with Server Reboot, Storage issue, Database Corruption due to Network issue etc. In this case Storage SAN drive crashed, restored the storage from the Snapshot. 

1. Check the database state
select state_desc, * from sys.databases where state_desc<>'online'

2. Found database was in Suspect Mode. 

First step is to Restore the database from the latest Backup file. and Run the DBCC Integrity command to check it. Made sure everything fine on that.          

As a part of troubleshoot on the issue database.
               
            Brought the database to Emergency Mode, and do the DBCC Integrity check.

Use Master
go
ALTER DATABASE DBNAME SET EMERGENCY

DBCC CHECKDB (DBNAME )

3. Based on the DBSS Result\suggestion performed the DBCC REPAIR_ALLOW_DATA_LOSS.  Depending on the DBCC Results and suggestions 

Use Master
go
ALTER DATABASE DBNAME SET SINGLE_USER WITH ROLLBACK IMMEDIATE

DBCC CHECKDB ('DBNAME', REPAIR_ALLOW_DATA_LOSS)
            go


4. I was not able to access the database into single user mode, transaction was ending in the Deadlock. I changed the Deadlock priority and later was able to take the database into Single user mode.

SET DEADLOCK_PRIORITY HIGH 
GO
ALTER DATABASE DBNAME SET  SINGLE_USER WITH ROLLBACK IMMEDIATE
GO

or 

SET DEADLOCK_PRIORITY 10
GO
ALTER DATABASE DBNAME SET  SINGLE_USER WITH ROLLBACK IMMEDIATE
GO


DBCC CHECKDB ('DBNAME', REPAIR_ALLOW_DATA_LOSS)
            go



5. Then performed the the DBCC REPAIR_ALLOW_DATA_LOSS. Database went to Recovery mode. DBCC Repair Allow data Loss should have fixed the issue by allowing some data loss. But it didn't fix the issue. Hence continued using the Restore database. Later I just dropped the database and renamed the Restored database to original name. 


EXEC msdb.dbo.sp_delete_database_backuphistory @database_name = N'DBNAME'
GO
USE [master]
GO
/****** Object:  Database [DBNAME]    Script Date: 3/27/2019 12:29:38 PM ******/
DROP DATABASE [DBNAME]
GO

Unable to Kill SPID

                  

                       Recently one user reported he is facing some issue while deleting the table records. When I checked it there was a blocking. Immediately I tried to kill the session which was causing the issue. I used  KILL spid to kill the session. Though it was executed successfully, session was still running with 'suspended' status by holding a blocking on sessions. 

In order to kill the session, I alter the database to Single User mode. 

USE MASTER 
GO
ALTER DATABASE (DBNAME) SET SINGLE_USER WITH ROLLBACK IMMEDIATE 

ALTER DATABASE (DBNAME) SET MULTI_USER WITH ROLLBACK IMMEDIATE 
GO


This resolved the issue by killing all the sessions. But it was on a NonProduction database. When doing it on Production database please be aware of affected users and the work to be redone.