8

We use an ASPState database to persist .NET Session state on a SQL Server 2005 cluster. We are seeing some strange behavior during peak periods

  • The DeleteExpiredSessions proc is run every minute via an agent job. Sometimes this job is taking many minutes to run and delete expired sessions

  • Requests from the application to the ASPState database are very slow. I believe this is because there are exclusive locks being held on the table by DeleteExpiredSessions procedure

Code:

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE PROCEDURE [dbo].[DeleteExpiredSessions]
AS
        SET NOCOUNT ON
        SET DEADLOCK_PRIORITY LOW 

        DECLARE @now datetime
        SET @now = GETUTCDATE() 

        DECLARE @tblExpiredSessions TABLE
        (
            SessionID nvarchar(88) NOT NULL PRIMARY KEY
        )

        INSERT INTO @tblExpiredSessions (SessionID)
            SELECT SessionID
            FROM [ASPState].dbo.ASPStateTempSessions WITH (READUNCOMMITTED)
            WHERE Expires < @now

        --EXPIRED SESSION LOGGING 
        /*
        DECLARE @ExpiredSessionCount Int;
        SELECT @ExpiredSessionCount = COUNT(SessionID) 
        FROM @tblExpiredSessions;
        */


        IF @@ROWCOUNT <> 0 
        BEGIN 
            DECLARE ExpiredSessionCursor CURSOR LOCAL FORWARD_ONLY READ_ONLY
            FOR SELECT SessionID FROM @tblExpiredSessions ORDER BY CHECKSUM(NEWID())

            DECLARE @SessionID nvarchar(88)

            OPEN ExpiredSessionCursor

            FETCH NEXT FROM ExpiredSessionCursor INTO @SessionID

            WHILE @@FETCH_STATUS = 0 
                BEGIN
                    DELETE FROM [ASPState].dbo.ASPStateTempSessions WHERE SessionID = @SessionID AND Expires < @now
                    FETCH NEXT FROM ExpiredSessionCursor INTO @SessionID
                END

            CLOSE ExpiredSessionCursor

            DEALLOCATE ExpiredSessionCursor

        END 

        --EXPIRED SESSION LOGGING
        /*
        BEGIN TRY
             INSERT INTO DeleteExpiredSessionLog(RunStart, RunEnd, ExpiredSessionsDeleted)
             VALUES (@now, GETUTCDATE(), @ExpiredSessionCount);
        END TRY
        BEGIN CATCH
             --SWALLOW ANY EXCEPTION
        END CATCH;
        */


    RETURN 0                          
  • The log file is filling up and in some cases forcing autogrowth, despite the db being in simple recovery

In addition to these, a trace reveals multiple requests for the same session coming in quick succession. For example exec dbo.TempResetTimeout @id=N'32gghltsuoesnvlzbehchp2m2014c0f1' 14 times in <1 second, so we're wondering about this as a cause or congestion, but not sure of the root of this behavior.

Any suggestions or explanation around this behavior would be appreciated.

marc_s
  • 8,613
  • 6
  • 43
  • 51
reticentKoala
  • 629
  • 1
  • 9
  • 23
  • Background to session state is here: http://msdn.microsoft.com/en-us/library/aa478952.aspx – reticentKoala Dec 14 '12 at 15:07
  • We have implemented the following recommendation: http://aspalliance.com/articleViewer.aspx?aId=1184&pId=-1 – reticentKoala Dec 14 '12 at 15:08
  • Ah, I see, this is a replacement. Did you replace the original because you were having a problem then, or because someone on the Internet said to do so? I would suggest trying to switch back to the original to see if the symptom gets better or worse. To reduce *potential* blocking, you could also do `SELECT 1; WHILE @@ROWCOUNT <> 0 BEGIN DELETE TOP (10) ... END` – Aaron Bertrand Dec 14 '12 at 15:30
  • Because the procedure runs for > 1 minute at busy times, there is the danger that 2 copies of the proc are trying to delete the same row, the ordering by NEWID reduces the chance of this. – reticentKoala Dec 14 '12 at 15:37
  • Why would 2 copies of the procedure ever be running? It's triggered from a job, right? Have you ever actually witnessed two copies of the procedure running simultaneously? (Also: random is no guarantee.) (Also: if two copies of the procedure are running at the same time, it is even *less* likely they'll conflict. Even if they started at exactly the same time, when processing in a predictable order, they'll both get to the first row, and one process will win, the other will have nothing to delete. Next.) – Aaron Bertrand Dec 14 '12 at 15:40
  • Hmm. fairdoos. The jobs won't overlap, and I don't think it get's called from anywhere else. But adding TOP 10 will just make blocking worse. At present the cursor ensures that only one record is deleted at a time, right? – reticentKoala Dec 14 '12 at 15:57
  • Yes, but that is SLOOOOOOW. You want to delete MORE rows in single operations. In my solution below I identify rows in chunks ordered by the clustered key. This will minimize the number of pages that have to be touched. You need to TRY these things, not make assumptions. – Aaron Bertrand Dec 14 '12 at 16:09
  • This is the latest supported version of the procedure. http://support.microsoft.com/kb/973849 The latest version is supposed to work on individual rows based on the clustered index key and intended to avoid blocking active sessions. The older version was prone to lock escalation and severe blocking. Are you actually observing blocking on this procedure? – Roji P Thomas Dec 14 '12 at 21:11
  • Blocking is being investigated as one possible cause of the intermittent poor performance. I'm really interested to understand why the same session would be getting updated so frequently. – reticentKoala Dec 17 '12 at 10:00
  • @reticentKoala perhaps it's an application problem, not a problem with the cleanup stored procedure. The procedure that updates a session does not get called on its own, your app does that... perhaps you're including session touching code in some js or jquery code that gets called a lot more often than you think. – Aaron Bertrand Dec 17 '12 at 14:59
  • 2
    I wrote a blog post covering some potential enhancements to make the impact of ASPState less noticeable: http://www.sqlperformance.com/2013/01/t-sql-queries/optimize-aspstate – Aaron Bertrand Jan 11 '13 at 15:03
  • View this reference: http://stackoverflow.com/questions/8806125/how-can-i-recycle-asp-net-sessions-stored-in-sql-server **which are best patterns and practices with SQL Server ASPState? deleted sessions?** – Kiquenet Oct 21 '16 at 20:29

1 Answers1

5

I suspect you implemented Greg's replacement procedure as pre-emptive optimization. The one-row-at-a-time approach restricts locking to a single row, sure, but it's going to take a lot longer - especially if you force SQL Server to attack rows in a random order.

My suggestion would be to revert to the original procedure:

ALTER PROCEDURE dbo.DeleteExpiredSessions
AS
BEGIN
  SET NOCOUNT ON;

  DECLARE @now DATETIME;
  SET @now = GETUTCDATE();

  DELETE ASPState..ASPStateTempSessions 
    WHERE Expires < @now;
END
GO

If you find that this becomes a performance issue because of the locks taken, you can split this out and even reduce impact on the logs using something like:

ALTER PROCEDURE dbo.DeleteExpiredSessions
AS
BEGIN
  SET NOCOUNT ON;

  DECLARE @now DATETIME, @c INT;
  SELECT @now = GETUTCDATE(), @c = 1;

  BEGIN TRANSACTION;

  WHILE @c <> 0
  BEGIN
    ;WITH x AS 
    (
      SELECT TOP (1000) SessionId
        FROM dbo.ASPStateTempSessions
        WHERE Expires < @now
        ORDER BY SessionId
    ) 
    DELETE x;

    SET @c = @@ROWCOUNT;

    IF @@TRANCOUNT = 1
    BEGIN
      COMMIT TRANSACTION;
      BEGIN TRANSACTION;
    END
  END

  IF @@TRANCOUNT = 1
  BEGIN
    COMMIT TRANSACTION;
  END
END
GO

This is going to be much better, IMHO, than using a randomized cursor to delete one row at a time. You can tweak the value in TOP (1000) based on actual observation.

A couple of other ideas that might help:

(a) set the ASPState database's recovery model to simple (it defaults to full).

(b) change ASPState_Job_DeleteExpiredSessions to run every 5 or 10 minutes instead of every minute. If this job is taking > 1 minute to run then it is going to always be running. Hopefully it isn't completely linear, meaning that waiting 5 minutes instead of 1 won't queue up more than 5 minutes of work instead of 1. Implementing my suggestions above should help with this.

Aaron Bertrand
  • 175,645
  • 25
  • 380
  • 589