Analysis of an issue with Sitecore AccessResultCache running full and being cleared continuously despite large cache size
We recently had a situation where we quickly had to grab a virtual whip to tame the Sitecore AccessResultCache, which was going crazy: it reached its allocated size within seconds and was thus constantly being cleared by Sitecore.Caching.Generics.Cache.
What symptoms were we seeing?
Well, the ball started rolling with an incident reported by a client stating:
«the Sitecore backend is extremely slow»
Upon investigation we indeed observed long loading times, not permanently but quite regularly, when clicking or viewing items in the Sitecore backend, or when doing item operations such as switching an item's version or language.
Checking the Sitecore log for any irregular events, we quickly found the following log entry. That was easy, as such entries had basically been spamming the Sitecore log for hours:
AccessResultCache cache is cleared by Sitecore.Caching.Generics.Cache strategy. Cache running size was 9 MB.
It turned out that this message showed up, once for every clearing of this particular cache, some 60 to 80 thousand times within certain hours!
And yes, this is not a mistake, see the following excerpt from the log of the affected period:
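To check whether your own instance is affected in the same way, a small script can tally these entries per hour. This is only a sketch: the message text is taken from the log entry quoted above, but the assumption that every log line carries an HH:MM:SS timestamp before the message is ours, and the function name `clears_per_hour` is made up for this example.

```python
import re
from collections import Counter

# Assumption: each log line contains an HH:MM:SS timestamp somewhere in front of the message.
TIMESTAMP = re.compile(r"\b(\d{2}):\d{2}:\d{2}\b")
# Taken from the log entry quoted in this post.
MARKER = "AccessResultCache cache is cleared"


def clears_per_hour(lines):
    """Count AccessResultCache clear messages, grouped by hour of day."""
    counts = Counter()
    for line in lines:
        if MARKER not in line:
            continue
        match = TIMESTAMP.search(line)
        # Lines without a recognisable timestamp are still counted, under "unknown".
        hour = match.group(1) if match else "unknown"
        counts[hour] += 1
    return counts
```

Feeding it an open log file (for example `clears_per_hour(open(path, encoding="utf-8", errors="replace"))`) returns a counter keyed by hour, which makes spikes like the ones described here easy to spot.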
Soooo… what was going on behind the scenes?
The explanation for all those continuous log entries is quite simple:
Our caching strategy and cache sizing configuration allocated 9 megabytes as the maximum cache size for the AccessResultCache. It is common standard behaviour that once a cache reaches its allocated size, it is cleared and then rebuilt.
This is what happened here as well. Except that a cache is not supposed to reach its maximum within seconds and therefore be cleared again almost immediately.
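The clear-and-rebuild behaviour described above can be sketched with a toy model. This is not Sitecore's implementation, only an illustration of a cache that is emptied completely once its size budget would be exceeded. The class name, the helper `clears_for` and the entry size of 1 KB are assumptions made for the example.

```python
MB = 1024 * 1024


class ClearWhenFullCache:
    """Toy cache that is emptied completely once its size budget would be exceeded."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.entries = {}
        self.used_bytes = 0
        self.clear_count = 0

    def add(self, key, size_bytes):
        if key in self.entries:
            return  # already cached, nothing to do
        if self.used_bytes + size_bytes > self.max_bytes:
            # Budget exceeded: drop everything and start over.
            self.entries.clear()
            self.used_bytes = 0
            self.clear_count += 1
        self.entries[key] = size_bytes
        self.used_bytes += size_bytes


def clears_for(max_mb, new_entries, entry_bytes):
    """Return how often the cache is cleared while adding `new_entries` distinct entries."""
    cache = ClearWhenFullCache(max_mb * MB)
    for key in range(new_entries):
        cache.add(key, entry_bytes)
    return cache.clear_count


# Hypothetical workload: 100,000 distinct entries of 1 KB each.
print("9 MB budget:   ", clears_for(9, 100_000, 1024), "clears")
print("2048 MB budget:", clears_for(2048, 100_000, 1024), "clears")
```

The point of the model is that every clear throws away all entries, including the useful ones, so the faster new entries arrive, the less the cache is worth.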
What is the AccessResultCache & how is it being used in Sitecore?
First, of course, we wanted to understand why Sitecore needs this cache at all, in order to work out what might cause it to literally "explode" and outgrow the small size we had assigned to it. Worth mentioning: that setup had been running without any changes for quite some time.
We also wanted to understand why 9 MB was suddenly no longer enough.
The purpose of the AccessResultCache is described as follows, from a neat answer on Sitecore Stackexchange:
Every time when anybody accesses any item in Sitecore, result of resolved security right is put to AccessResultCache.
It is not related to content editing, but to content accessing.
Okay, got it. But we did not see an excessive increase in users accessing or working in the Sitecore backend…
While researching, we also learned from a blog post on sitecore saga by Deepak Bhat that the AccessResultCache can be disabled on Sitecore Content Delivery environments (meaning the webservers where visitors browse your site).
However, we were facing no issues with the AccessResultCache on the CD environments; we were affected on the Content Management instance. So this seemed to be a dead end, too.
Mitigating too many and too quick cache clearings of AccessResultCache
One of the first steps we took was to try to make our AccessResultCache, and thus the Sitecore instance, healthier again.
This was done by patching the cache size of the access result cache and allocating (much!) more space.
- We increased the Caching.AccessResultCacheSize from 9 MB to 100 MB.
Result: no impact, the cache was still being cleared continuously.
- Okay, again: now we increased the Caching.AccessResultCacheSize from 100 MB to 512 MB.
Result: the cache was still cleared regularly, but only about once per minute. However, this is still far from "healthy".
- Going full throttle: we checked with the infrastructure guys how many resources we still had available for cache size tuning. After getting a green light, we increased the Caching.AccessResultCacheSize from 512 MB to 2048 MB (2 GB).
Result: now we got something! The AccessResultCache stayed for much longer without being cleared. Is this the solution?
Even with 2 GB to allocate, the cache grew excessively with every (first) click on an item in the Sitecore backend, by between 5 and 10 MB per item click. It was not impacting the performance so badly anymore, at least on items that some user had already visited once, but it still did not feel like the solution.
Something must be wrong somewhere, causing those interferences with our AccessResultCache.
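A quick back-of-the-envelope calculation shows why even 2 GB felt wrong. It uses only the numbers from this post (5 to 10 MB of growth per first click, and the four cache sizes we tried) and ignores everything else that may end up in the cache, so treat it as a rough sketch. The function name is made up for the example.

```python
def first_clicks_until_full(cache_mb, mb_per_click):
    """How many first-time item clicks fit before the cache budget is used up."""
    return cache_mb // mb_per_click


for cache_mb in (9, 100, 512, 2048):
    fewest = first_clicks_until_full(cache_mb, 10)  # 10 MB per click
    most = first_clicks_until_full(cache_mb, 5)     # 5 MB per click
    print(f"{cache_mb:>5} MB: roughly {fewest} to {most} first clicks until full")
```

With 9 MB, a single first click can already fill the cache, and even 2 GB is used up after a few hundred first clicks spread across all editors.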
Root cause for the AccessResultCache on a Content Management environment being cleared excessively
Due to our confusion about the unexpected behaviour and its impact on the CM and the Sitecore users, we had opened a support ticket with Sitecore very early on. And once again, we were very happy we did so, because they pointed us in the right direction for the root cause:
The actual root cause for our AccessResultCache filling up so quickly and excessively, requiring a disproportional amount of resources to mitigate, came down to the fact that accessing items in the backend also had to load a whole tree of hundreds of other items.
Now it suddenly made sense: the AccessResultCache was doing exactly what it should do, namely storing access permissions for items.
But because Sitecore needed to read a lot of items on every click, this exceeded the healthy capacity of the cache (and in the end it did not give us any advantage feature-wise either).
We figured out that we had recently added a new DropLink item field, which was inherited by many content items via a shared section.
But the problematic piece of this new DropLink field was its datasource:
it contained a Sitecore query to dynamically grab a list of items from the content tree based on a specified template.
The issue with this, however, was that in order to produce these DropLink field entries, Sitecore actually had to iterate through a content tree with hundreds of containers and thousands of (sub-)items.
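To make the effect tangible, here is a toy model of such a datasource query. It is not Sitecore code: the tree shape (300 containers with 20 sub-items each), the template names and the idea that each visited item leaves exactly one access entry per user are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    name: str
    template: str
    children: list = field(default_factory=list)


def descendants(item):
    """Yield every item below `item`, depth first."""
    for child in item.children:
        yield child
        yield from descendants(child)


def query_by_template(root, template, access_cache, user):
    """Return all descendants using `template`; every visited item leaves an access entry."""
    matches = []
    for item in descendants(root):
        # Assumption: reading an item requires a permission check whose result gets cached.
        access_cache[(user, item.name)] = True
        if item.template == template:
            matches.append(item)
    return matches


# Hypothetical content tree: 300 containers, 20 sub-items each, one "Headline" per container.
root = Item("content", "Folder", [
    Item(f"container-{c}", "Container", [
        Item(f"item-{c}-{i}", "Headline" if i == 0 else "Page")
        for i in range(20)
    ])
    for c in range(300)
])

access_cache = {}
options = query_by_template(root, "Headline", access_cache, user="editor")
print(len(options), "dropdown options,", len(access_cache), "access entries")
```

In this model the editor sees only a few hundred dropdown options, but the query had to touch every single item in the tree to find them, and it has to do so again for each user opening an item that carries the field.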
The fix: how we finally tamed the out-of-control AccessResultCache size
We checked which feature we had introduced that field for and how it worked. It was simply there so that a user could select an existing element, which would then be displayed as a headline on the page. We checked back with the client and agreed that a simple "Single-line text" field for manually typing the desired value would offer the same benefits and be just as easy for the content managers to edit.
Byebye DropLink field, byebye expensive datasource query!
As a learning, keep in mind: don't use Sitecore item queries in fields' datasources if the query will iterate over a huge number of items.
Or at least be aware of the consequences, and the unexpected side effects, that could occur.