ORMLockTimeoutException

Puser
User
Posts: 228
Joined: 20-Sep-2012
# Posted on: 10-Sep-2019 13:43:05   

LLBLGenPro 5.5 (5.5.1) RTM 28-12-2018

A customer suddenly started getting this exception in our software. Nothing in the software (or its configuration) has changed on our side. Maybe something is going on in their hardware/OS that I don't know about (it's not under my control). Do you know what this means and how it can be fixed?

Exception Info: SD.LLBLGen.Pro.ORMSupportClasses.ORMLockTimeoutException
   at SD.LLBLGen.Pro.ORMSupportClasses.TimedLock.Lock(System.Object, System.TimeSpan)
   at SD.LLBLGen.Pro.ORMSupportClasses.ResultsetCache.PurgeExpiredElements()
   at SD.LLBLGen.Pro.ORMSupportClasses.ResultsetCache._purgeTimer_Elapsed(System.Object)
   at System.Threading.TimerQueueTimer.CallCallbackInContext(System.Object)
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.TimerQueueTimer.CallCallback()
   at System.Threading.TimerQueueTimer.Fire()
   at System.Threading.TimerQueue.FireQueuedTimerCompletion(System.Object)
   at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback()

Otis
LLBLGen Pro Team
Posts: 39590
Joined: 17-Aug-2003
# Posted on: 10-Sep-2019 15:01:25   

It looks like a timeout in the lock that's guarding the data in the default cache implementation, which suggests a slow operation or deadlock.

The default lock timeout is 10 seconds. The purge interval is also 10 seconds. So every 10 seconds the timer elapses and calls the purge function, which then tries to lock the data and clear the expired elements. It will wait up to 10 seconds for that lock, and if that period passes without obtaining it, it throws a lock timeout exception, which is what you're seeing.

The cached data is simply a dictionary, and we're locking that both for storing and purging the data.

So if you cache a resultset, the same procedure is followed: it takes the same lock on the same data, stores the data, and then releases the lock.

What I can think of is this:
- You fetch a lot of data and want to cache the resultset. This operation takes more than 10 seconds, or at least longer than the purgeInterval specified when creating the ResultsetCache (the default is 10 seconds). This store operation grabs the lock to store the data.
- The purge timer elapses, and as the lock is already taken, it has to wait.
- It waits for 10 seconds. If the lock isn't released before that, you'll get the exception you're seeing.

So if caching a large resultset takes more than 10 seconds and the purge timer elapses, say, right after that operation starts, you get this exception, I think: the caching of the resultset (which holds the lock) takes longer than the purge timer is willing to wait for its turn on the lock, so the purge times out. The stacktrace shows exactly that.
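
To make the timing concrete, here's a rough, hypothetical sketch of that pattern. This is not the actual ResultsetCache code; the class, field names and the plain TimeoutException are placeholders, it just illustrates how a store that holds the lock too long makes the purge timer's wait run out:

using System;
using System.Collections.Generic;
using System.Threading;

class CacheSketch
{
    private readonly object _lockObject = new object();
    private readonly Dictionary<string, object> _store = new Dictionary<string, object>();
    private readonly TimeSpan _lockTimeout = TimeSpan.FromSeconds(10);
    private readonly Timer _purgeTimer;

    public CacheSketch(int purgeIntervalInSeconds)
    {
        // The purge timer elapses every purgeInterval seconds and tries to clean the store.
        var interval = TimeSpan.FromSeconds(purgeIntervalInSeconds);
        _purgeTimer = new Timer(_ => PurgeExpiredElements(), null, interval, interval);
    }

    public void Add(string key, object resultset)
    {
        // The store takes the same lock the purge uses. If this block runs longer than
        // _lockTimeout (e.g. a very large resultset), a purge tick that fires in the
        // meantime gives up waiting and throws.
        lock(_lockObject)
        {
            _store[key] = resultset;
        }
    }

    private void PurgeExpiredElements()
    {
        // Wait at most 10 seconds for the lock; this is where the timeout surfaces.
        if(!Monitor.TryEnter(_lockObject, _lockTimeout))
        {
            throw new TimeoutException("Could not obtain the cache lock within the timeout.");
        }
        try
        {
            // remove expired elements here
        }
        finally
        {
            Monitor.Exit(_lockObject);
        }
    }
}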

One workaround could be to look at which large resultset is being cached. That they start to see this now is likely because they imported or generated a lot of data for a resultset you're caching somewhere. To increase the purgeInterval to e.g. 1 minute, change the value you pass to the ResultsetCache constructor (if you're not passing anything, it defaults to 10 seconds). purgeInterval is in seconds.
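
For example (a sketch, assuming you register the cache the usual way via CacheController.RegisterCache; the connection string is a placeholder, and 60 is just the 1 minute mentioned above):

using SD.LLBLGen.Pro.ORMSupportClasses;

// Placeholder: the connection string your application uses for its fetches.
var connectionString = "...";

// The ResultsetCache constructor argument is the purgeInterval, in seconds. 60 instead of
// the default 10 gives a long store operation more room before a purge tick times out.
CacheController.RegisterCache(connectionString, new ResultsetCache(60));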

Frans Bouma | Lead developer LLBLGen Pro
Otis
LLBLGen Pro Team
Posts: 39590
Joined: 17-Aug-2003
# Posted on: 11-Sep-2019 10:21:45   

Investigating it a bit: the caching operation is a simple one. The resultset is collected into a cache item during the query, and if it has to be cached, it's stored in one go. This is in general very fast, as it's simply adding an item to a dictionary.

The retrieval side, where you read a cached resultset, is also very fast, as it simply obtains the item from the cache with a dictionary lookup. The cache is locked during that lookup as well.

So it's odd that there's a 10-second delay somewhere (if you left the purgeInterval at its default) that made the purge operation time out.

The purge operation itself first builds a list of elements to purge and then removes them from the dictionary. This removal is done by key, so it should also be very fast.
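
In pseudo form it's roughly this (hypothetical names, not the actual ResultsetCache internals):

using System;
using System.Collections.Generic;

// Hypothetical element shape with an absolute expiration timestamp.
class CachedItem
{
    public DateTime ExpirationTime { get; set; }
}

static class PurgeSketch
{
    // First collect the keys of expired elements, then remove them from the dictionary;
    // each removal is a cheap by-key operation.
    public static void PurgeExpired(Dictionary<string, CachedItem> store, DateTime now)
    {
        var toPurge = new List<string>();
        foreach(var pair in store)
        {
            if(pair.Value.ExpirationTime <= now)
            {
                toPurge.Add(pair.Key);
            }
        }
        foreach(var key in toPurge)
        {
            store.Remove(key);
        }
    }
}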

I also can't think of a scenario where a deadlock might occur, as there's no operation that takes the lock, then runs for a long time, and then tries to take the same lock again on the same thread.

Frans Bouma | Lead developer LLBLGen Pro
Puser
User
Posts: 228
Joined: 20-Sep-2012
# Posted on: 12-Sep-2019 12:00:29   

Thanks for the response, Frans.

I use durationInSeconds (with different settings depending on use) but not the purgeInterval. There seems to be no external trigger from a client, but I need to investigate that further. Next time I get an error report I'll check the last external trigger(s) and the time between that trigger and the exception. Maybe that will give me more input.

EDIT: furthermore, I invalidate caches by key when an object changes. Every object type that can be fetched with a cached resultset also has a key. When an object change is notified, CacheController.PurgeResultsets(cacheKey) is called. When there is no activity, this isn't called.
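
Roughly like this (the handler name and the string type of the key are just examples from my side; the PurgeResultsets call is the one mentioned above):

// Called from our change-notification handling; cacheKey identifies the object type whose
// cached resultsets should be dropped (e.g. "Customer"). Names here are examples only.
void OnObjectChanged(string cacheKey)
{
    CacheController.PurgeResultsets(cacheKey);
}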

Otis
LLBLGen Pro Team
Posts: 39590
Joined: 17-Aug-2003
# Posted on: 13-Sep-2019 09:17:39   

Hmm, this still shouldn't cause the deadlock, I think, as the purge action triggered from the outside is also quick.

These things are a pain to reproduce. If they can reproduce it often, it might be worth investigating, though. Otherwise you could copy the ResultsetCache class, make changes there, and use that copy (under a different name, of course) as your cache instead of the ResultsetCache instance.
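
I.e. something along these lines (a sketch; 'InstrumentedResultsetCache' is a placeholder name for your copy, which would implement the same cache interface ResultsetCache implements, with e.g. extra logging around the lock):

using SD.LLBLGen.Pro.ORMSupportClasses;

// Placeholder connection string, as before; InstrumentedResultsetCache is the renamed copy.
var connectionString = "...";
CacheController.RegisterCache(connectionString, new InstrumentedResultsetCache());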

Let us know if we have to look into things here...

Frans Bouma | Lead developer LLBLGen Pro
Puser
User
Posts: 228
Joined: 20-Sep-2012
# Posted on: 13-Sep-2019 09:59:20   

Thanks, I'll keep monitoring this and if this happens more, I'll get back. Cheers!