Tuesday, 21 October 2025

                                           How to Manage Big tables in SQL server? 

Managing or Maintenances of Big tables need proper attention. Below are few of the scenarios when it required proper attention, other it can bring down the DB server itself. 

Case-1 : Alter table to modify the column data type
Case-2 : Data Purge
Case-3 : Adding Index


1. Check the Size of the Table, No of Rows exe SP_SpaceUsed 'Tablename' 

2. Check the Free space of Data and Log Files of the database of the Table

3. Check the Free space of Data and Log Files of TempDB database

4. Check the Data\Log and TempDB drives free space and ensure they have sufficient free space. 

5. Make sure Index creation is ONLINE=ON 


Practical experience : 
I had to change the data type of the column in a table but Primary Key was defined on that column. 
In that case, I had to drop the Primary Key, change the column data type and create the Primary back on the same table. 
I tried all three steps in single transaction but Creating of Primary Key took very long time. Table was around 34GB in size. During the Index creation process, Log file started growing rapidly. There was a point where Log drive had zero free space. Immediately, disabled the growth of existing Log file and created a new Log file on other drive. In this scenario, it would have affected the other databases in the server. Once the Index creation completed, shrunk the Log files and released some space on the drive. Enabled the growth on the old log file and removed the new Log file which was created. 



Thursday, 24 July 2025

 Change the Default Collation of SQL server, Database 

I had installed a SQL server with default collation SQL_Latin1_General_CP1_CI_AS but later realized it should be SQL_Latin1_General_CP850_CI_AS. I followed the below steps. 

1. Change SQL Server (Instance) Collation

⚠️ You cannot change the server collation without rebuilding the system databases, which means reinstalling SQL Server components.

Steps:

  • Backup all user databases and settings

  • Use setup.exe from installation media with REBUILDDATABASE flag

Example Command:

setup.exe /QUIET /ACTION=REBUILDDATABASE /INSTANCENAME=InstanceName /SQLSYSADMINACCOUNTS=domain\user /SAPWD=StrongPassword /SQLCOLLATION=New_Collation_Name

๐Ÿ” Example collation: SQL_Latin1_General_CP1_CI_AS

๐Ÿงท This will remove all system data, including logins and jobs. Use with caution.

INSTANCENAME= MSSQLSERVER for Default Instance

              ServerName\Instance for named instance


SQLSYSADMINACCOUNTS = Must provide windows account

SAPWD = Its going to be new password for 'SA' account.

SQLCOLLATION = New collation to be changed



2. Change Database Collation

You can change a database's default collation only if no tables exist, or you'll have to recreate all objects manually.

Example (for empty or new DB):

ALTER DATABASE YourDatabaseName COLLATE New_Collation_Name;


3. Change Column Collation

To change collation for specific columns:

ALTER TABLE YourTable ALTER COLUMN YourColumn VARCHAR(100) COLLATE New_Collation_Name;

⚠️ This can fail if the column is part of an index or constraint.


4. Specify Collation in Queries

For joining or comparing across different collations:


SELECT * FROM TableA JOIN TableB ON TableA.Name
COLLATE Latin1_General_CI_AS = TableB.Name COLLATE Latin1_General_CI_AS; 




Monday, 21 July 2025

Resolving Disconnected Secondary Replica or Database in Always On Availability Groups

It’s not uncommon to encounter situations where either the secondary database or the secondary replica enters a disconnected state in an Always On Availability Group (AG). This typically occurs during initial AG configurations or after events like OS/SQL patching, network issues, or failovers.

Recently, I faced a scenario where not only was the secondary database in a disconnected state, but the entire secondary replica was also marked as disconnected in the AG dashboard. As a result, the transaction log (T-log) started growing due to unsynchronized data movement.

Below are the steps I followed to troubleshoot and resolve the issue:

Steps to Resolve:

  1. Connected to both Primary and Secondary replicas via SSMS.

  2. On the Secondary node, navigated to:
    Always On High Availability > Availability Groups (AVG) > [Your AG Name]

  3. Under Availability Databases, right-clicked the affected database, then:
    Suspend Data Movement > Resume Data Movement

    Note: This did not resolve the issue.

  4. Restarted the HADR Endpoint on both replicas:


    SELECT * FROM sys.endpoints; ALTER ENDPOINT [Hadr_endpoint] STATE = STOPPED; ALTER ENDPOINT [Hadr_endpoint] STATE = STARTED;

    This also did not resolve the issue initially.

  5. Granted CONNECT permissions to the SQL Server service account (used by the secondary replica) on both nodes:


    GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [Domain\SQLServiceAccount];
  6. After applying the endpoint permission, I re-executed Step 4 (restarting endpoints).
    This time, the secondary replica successfully reconnected, and data synchronization resumed.



Monday, 14 October 2019

SQL server DB status runnable, sleeping, suspended, running, pending and background meanings

Some of the most seen status of SPID's in SQL Server and what do they mean:
RUNNING:
This status means session is running one or more batches. When Multiple Active Result Sets (MARS) is enabled, a session can run multiple batches. What this actually means is, the client connected to SQL Server using this session has already submitted a query for SQL Server to process and SQL Server is currently processing the query. The query could be anywhere between generating a parser tree to performing a join to sorting the data... and it is consuming the CPU (Processor) cycles currently.
SUSPENDED:
It means that the request currently is not active because it is waiting on a resource. The resource can be an I/O for reading a page, A WAIT it can be communication on the network, or it is waiting for lock or a latch. It will become active once the task it is waiting for is completed. For example, if the query the has posted a I/O request to read data of a complete table tblStudents then this task will be suspended till the I/O is complete. Once I/O is completed (Data for table tblStudents is available in the memory), query will move into RUNNABLE queue.
RUNNABLE:
The SPID is in the runnable queue of a scheduler and waiting for a quantum to run on the scheduler. This means that requests got a worker thread assigned but they are not getting CPU time.
The RUNNABLE queue can be likened to a grocery analogy where there are multiple check out lines. The register clerk is the CPU. There is just one customer checking out e.g. “RUNNING” at any given register. The time spent in the checkout line represents CPU pressure. So this SPID is waiting for that customer who is running (with register clerk) to get out so that it can start RUNNING. You can use the query SELECT wait_type,waiting_tasks_count,signal_wait_time_ms FROM sys.dm_os_wait_stats ORDER BY signal_wait_time_ms DESC to find out the difference between the time the waiting thread was signaled and when it started running. This difference is the time spent in RUNNABLE queue. Some of the waits on the top of the list can be safely ignored.
PENDING:
The request is waiting for a worker to pick it up. This means the request is ready to run but there are no worker threads available to execute the requests in CPU. This doesn't mean that you have to increase 'Max. Worker threads", you have to check what the currently executing threads are doing and why they are not yielding back. I personally have seen more SPID's with status PENDING on issues which ended up in "Non-yielding Scheduler" and "Scheduler deadlock".
BACKGROUND:
The request is a background thread such as Resource Monitor or Deadlock Monitor.
SLEEPING:
There is no work to be done.

Thursday, 28 March 2019

Database in Suspect Mode


Yesterday I had mail to check the database health after the recovery of servers from Storage crash issue. I found one particular database was in 'Suspect' mode. Performed steps to fix this. 

Database will go to Suspect mode usually if there is any issue with Server Reboot, Storage issue, Database Corruption due to Network issue etc. In this case Storage SAN drive crashed, restored the storage from the Snapshot. 

1. Check the database state
select state_desc, * from sys.databases where state_desc<>'online'

2. Found database was in Suspect Mode. 

First step is to Restore the database from the latest Backup file. and Run the DBCC Integrity command to check it. Made sure everything fine on that.          

As a part of troubleshoot on the issue database.
               
            Brought the database to Emergency Mode, and do the DBCC Integrity check.

Use Master
go
ALTER DATABASE DBNAME SET EMERGENCY

DBCC CHECKDB (DBNAME )

3. Based on the DBSS Result\suggestion performed the DBCC REPAIR_ALLOW_DATA_LOSS.  Depending on the DBCC Results and suggestions 

Use Master
go
ALTER DATABASE DBNAME SET SINGLE_USER WITH ROLLBACK IMMEDIATE

DBCC CHECKDB ('DBNAME', REPAIR_ALLOW_DATA_LOSS)
            go


4. I was not able to access the database into single user mode, transaction was ending in the Deadlock. I changed the Deadlock priority and later was able to take the database into Single user mode.

SET DEADLOCK_PRIORITY HIGH 
GO
ALTER DATABASE DBNAME SET  SINGLE_USER WITH ROLLBACK IMMEDIATE
GO

or 

SET DEADLOCK_PRIORITY 10
GO
ALTER DATABASE DBNAME SET  SINGLE_USER WITH ROLLBACK IMMEDIATE
GO


DBCC CHECKDB ('DBNAME', REPAIR_ALLOW_DATA_LOSS)
            go



5. Then performed the the DBCC REPAIR_ALLOW_DATA_LOSS. Database went to Recovery mode. DBCC Repair Allow data Loss should have fixed the issue by allowing some data loss. But it didn't fix the issue. Hence continued using the Restore database. Later I just dropped the database and renamed the Restored database to original name. 


EXEC msdb.dbo.sp_delete_database_backuphistory @database_name = N'DBNAME'
GO
USE [master]
GO
/****** Object:  Database [DBNAME]    Script Date: 3/27/2019 12:29:38 PM ******/
DROP DATABASE [DBNAME]
GO

Unable to Kill SPID

                  

                       Recently one user reported he is facing some issue while deleting the table records. When I checked it there was a blocking. Immediately I tried to kill the session which was causing the issue. I used  KILL spid to kill the session. Though it was executed successfully, session was still running with 'suspended' status by holding a blocking on sessions. 

In order to kill the session, I alter the database to Single User mode. 

USE MASTER 
GO
ALTER DATABASE (DBNAME) SET SINGLE_USER WITH ROLLBACK IMMEDIATE 

ALTER DATABASE (DBNAME) SET MULTI_USER WITH ROLLBACK IMMEDIATE 
GO


This resolved the issue by killing all the sessions. But it was on a NonProduction database. When doing it on Production database please be aware of affected users and the work to be redone. 

Thursday, 27 September 2018

Shrink TempDB database

Its very common issue on day to day DBA activities, where TempDB databases will go out of space. Sometime it is cumbersome to shrink the database files to release the space. 

Below are some on the Methods to release the space on TempDB database. 

1. Shrink Database files 
use tempdb
GO

CHECKPOINT
GO
dbcc shrinkfile ('tempdev') -- shrink db file
dbcc shrinkfile ('templog') -- shrink log file
2. Clean up the Cache to and shrink the database files. 
DBCC FREEPROCCACHE -- clean cache
DBCC DROPCLEANBUFFERS -- clean buffers
DBCC FREESYSTEMCACHE ('ALL') -- clean system cache
DBCC FREESESSIONCACHE -- clean session cache
DBCC SHRINKDATABASE(tempdb, 10); -- shrink tempdb
After this again Run the shrink commands
dbcc shrinkfile ('tempdev') -- shrink db file
dbcc shrinkfile ('templog') -- shrink log file
GO
3. Restart the SQL service
This is should be a last option.