Tracking Esent Memory Usage

Dec 7, 2012 at 8:56 AM
Edited Dec 7, 2012 at 9:15 AM

I'm looking for a way to track where Esent is using memory when it's used as part of RavenDB.

At the moment I'm using this code to get the cache size and it works fine:

Api.JetGetSystemParameter(instance, pht.Session, JET_param.CacheSize, ref cacheSizeInPages, out unused, 1024);
Api.JetGetSystemParameter(instance, pht.Session, JET_param.DatabasePageSize, ref pageSize, out unused, 1024);
cacheSizeInBytes = ((long) cacheSizeInPages) * pageSize;

However, I can't find a way to find out how much memory is being used by the version pages. JET_param.MaxVerPages always gives the configured setting, not the current usage.

Basically, I'd like to be able to account for all the unmanaged memory that Esent is using across its caches. Is this possible in code?

Also, what heuristics does Esent use for freeing "cache" memory? Take JET_param.CacheSize, for example: after doing some inserts and reads I can see it go up (which I expect), but it doesn't seem to go down automatically over time. Does Esent assume that it can hold onto memory it has allocated? It seems like setting JET_param.CacheSize should force it back down; is this correct?

Developer
Dec 7, 2012 at 7:29 PM

Most of the data is exposed via perf counters. Look under 'Database' for things like 'Database Cache Memory Committed' and 'Version Buckets Allocated'. Yes, you can query perf counters from code.
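As a rough illustration of querying these counters from code, here is a minimal sketch using System.Diagnostics.PerformanceCounter. The category and counter names are the ones quoted above. The instance name is an assumption (ESENT counter instances are commonly named after the hosting process, but check Performance Monitor on your machine), and the helper class itself is hypothetical. It reads RawValue, which assumes the counter is an instantaneous value rather than a rate.

```csharp
using System;
using System.Diagnostics;

static class EsentPerfCounters
{
    // Category name as quoted in the post above.
    public const string CategoryName = "Database";

    // Returns null when the category, counter, or instance is not registered,
    // instead of throwing. Windows-only: PerformanceCounter is not supported elsewhere.
    public static long? TryReadRaw(string counterName, string instanceName)
    {
        if (!PerformanceCounterCategory.Exists(CategoryName) ||
            !PerformanceCounterCategory.CounterExists(counterName, CategoryName) ||
            !PerformanceCounterCategory.InstanceExists(instanceName, CategoryName))
        {
            return null;
        }

        // readOnly: true, since we only sample the counter.
        using (var counter = new PerformanceCounter(CategoryName, counterName, instanceName, true))
        {
            return counter.RawValue;
        }
    }

    public static void PrintCacheAndVersionStore()
    {
        // Assumption: the ESENT instance is published under the process name.
        string instance = Process.GetCurrentProcess().ProcessName;

        long? committed = TryReadRaw("Database Cache Memory Committed", instance);
        long? buckets = TryReadRaw("Version Buckets Allocated", instance);

        Console.WriteLine("Cache committed: " + (committed.HasValue ? committed.Value.ToString() : "n/a"));
        Console.WriteLine("Version buckets: " + (buckets.HasValue ? buckets.Value.ToString() : "n/a"));
    }
}
```

Sampling the counters on demand like this, rather than keeping a collector running, may keep the overhead low, though that is not measured here.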

The heuristics used are kind of advanced. :) It balances the paging of the current process with the paging of the whole system. It will shrink the cache if the rest of the system is doing work. But yes, you can override the Dynamic Buffer Allocation by explicitly setting the CacheSize parameters.

 

-martin

Dec 7, 2012 at 10:04 PM
I'd hoped to avoid querying the perf counters because of the overhead of having them running all the time, but if it's the only way, I'll use them.

Would the cache and version page size always account for the majority (say 95%) of Esent's memory usage, or is there something else?

Thanks very much for the info

Developer
Dec 7, 2012 at 10:10 PM

It all depends on the load. :) On Active Directory or Exchange servers, the database cache can easily be in the dozens of gigabytes.

Without knowing anything more, I would say that it is *very likely* that the database cache is using up the majority of the memory.

 

-martin

Dec 7, 2012 at 10:28 PM
Thanks for that, it's very useful to know

I'll put in the perf counters and see what they show

Dec 9, 2012 at 11:12 AM

Checking SystemParameters.CacheSize gives us about 10 MB (so I don't really believe that number). Is that going to be a different number from the actual cache size?

Note that we have about 1.5 GB memory used by the process, and a lot of that is probably Esent.

Developer
Dec 12, 2012 at 3:36 AM

Sorry for the delay.

Checking the perf counter is the most definitive.

SystemParameters.CacheSize returns the value in pages (not bytes or MB). But with a value of 10 million, even with 4k pages that would be way off (40 GB!). So I'll assume you knew that and did the multiplication for me. :) I asked around, and it should be pretty accurate (+/- a few percent; it shouldn't be off by orders of magnitude!).
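To make the pages-versus-bytes arithmetic concrete, here is a small sketch. The 4096-byte page size is taken from the "4k pages" example above and is not necessarily your database's page size; the helper name is hypothetical. Note the widening to long before multiplying, since 10 million pages times 4096 bytes does not fit in a 32-bit int.

```csharp
using System;

static class CacheMath
{
    // Converts a page count (what CacheSize reports) to bytes.
    // The cast to long happens before the multiplication to avoid int overflow.
    public static long PagesToBytes(int pages, int pageSizeInBytes)
    {
        return (long)pages * pageSizeInBytes;
    }

    // Converts a byte count back to whole pages (rounding down).
    public static long BytesToPages(long bytes, int pageSizeInBytes)
    {
        return bytes / pageSizeInBytes;
    }
}
```

With these, CacheMath.PagesToBytes(10000000, 4096) is 40,960,000,000 bytes (the "40 GB" above), while a 10 MB cache, CacheMath.BytesToPages(10L * 1024 * 1024, 4096), is only 2,560 pages.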

1.5 GB is indeed a lot! Do you have a repro? Can you attach cdb/windbg? If so, try the following commands:
!address -summary   // Breaks down the overall memory usage of the process. ESENT cache size and CLR Heap are both under <Unknown>
!heap -s                 // Dumps the CRT heaps. (Unlikely to be a problem in your case, but just making sure)
.loadby sos clr         // Loads the CLR debugger extension
!dumpheap -stat        // Dumps the CLR heap statistics. Is there a leak in managed objects?

Hope that helps.

-martin

Dec 12, 2012 at 9:38 AM

Can you just confirm the calculation to go from "Version Buckets Allocated" to a value in bytes? Is it #buckets * 65,536? According to this page, 1 bucket is 64 KB, or 65,536 bytes.

 

But this page says it's 16 KB and that you can query it, but I can't see that parameter (JET_paramVerPageSize) in ManagedEsent:

JET_paramMaxVerPages 9

This parameter reserves the requested number of version store pages for use by an instance. The version store holds a live record of all the different versions of each record or index entry in the database that can be seen by all active transactions. These versions are used to support the multi-versioned concurrency control in use by the database engine to support transactions using snapshot isolation. This setting will affect how many updates can be held in memory at a time. This in turn will affect either the maximum number of updates a single transaction can perform, the maximum duration a transaction can be held open, the maximum concurrent load of updating transactions on the system, or a combination of these.

Each version store page as configured by this parameter is 16KB in size.

Windows Vista and later:  The version store page size can be read and changed via JET_paramVerPageSize.

Windows 2000, Windows XP and Windows Server 2003:  Large values for this parameter will consume address space and may increase memory usage.

Developer
Dec 12, 2012 at 11:41 PM

Not everything has been exposed in ManagedEsent. We've just added them as necessary, so sometimes it's more that we haven't come across a need to expose, rather than purposefully hiding.

I just looked at the Win8 code for setting the version bucket size. It is... complicated.

Yes, you can set it to JET_paramVerPageSize. The default is 16k.
But it gets silently doubled when running on 64bit.
And it must be at least twice as big as the database page size.
And it caps out at 64k.

Isn't that fun? :|

-martin

Dec 13, 2012 at 5:39 AM

Hm,

Is there a good way for us to actually _get_ the actual version page size?

I am doing something like this:

 

			const int JET_paramVerPageSize = 128;
			int versionPageSize = 0;
			string paramString;
			Api.JetGetSystemParameter(JET_INSTANCE.Nil, JET_SESID.Nil, (JET_param) JET_paramVerPageSize, ref versionPageSize,
			                          out paramString, 0);
			
Would this give me the configured value or the actual value?

Developer
Dec 13, 2012 at 5:56 AM

That unfortunately gives you the configured value, not the actual value.

Maybe try looking through the perf counters to see if one of the values is in bytes and another is in buckets? Then you could do some math. This would be pretty cumbersome programmatically, but may be fine if you're just curious.

-martin

Dec 17, 2012 at 11:47 AM

Okay, I have a repro, and I got it to be 1.7 GB (!)

!address -summary

--- Usage Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
Free                                    275      7ff`5c06e000 (   7.997 Tb)           99.97%
<unknown>                              1255        0`938d9000 (   2.306 Gb)  89.99%    0.03%
Image                                  1146        0`0d1d3000 ( 209.824 Mb)   8.00%    0.00%
MemoryMappedFile                         75        0`0346d000 (  52.426 Mb)   2.00%    0.00%
TEB                                      52        0`00068000 ( 416.000 kb)   0.02%    0.00%
PEB                                       1        0`00001000 (   4.000 kb)   0.00%    0.00%

--- Type Summary (for busy) ------ RgnCount ----------- Total Size -------- %ofBusy %ofTotal
MEM_PRIVATE                            1232        0`92d31000 (   2.294 Gb)  89.54%    0.03%
MEM_IMAGE                              1222        0`0dde4000 ( 221.891 Mb)   8.46%    0.00%
MEM_MAPPED                               75        0`0346d000 (  52.426 Mb)   2.00%    0.00%

--- State Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
MEM_FREE                                275      7ff`5c06e000 (   7.997 Tb)           99.97%
MEM_COMMIT                             2131        0`79305000 (   1.894 Gb)  73.91%    0.02%
MEM_RESERVE                             398        0`2ac7d000 ( 684.488 Mb)  26.09%    0.01%

--- Protect Summary (for commit) - RgnCount ----------- Total Size -------- %ofBusy %ofTotal
PAGE_READWRITE                         1173        0`6a7ad000 (   1.664 Gb)  64.94%    0.02%
PAGE_EXECUTE_READ                       122        0`0a674000 ( 166.453 Mb)   6.34%    0.00%
PAGE_READONLY                           375        0`02799000 (  39.598 Mb)   1.51%    0.00%
PAGE_WRITECOPY                          267        0`016bc000 (  22.734 Mb)   0.87%    0.00%
PAGE_EXECUTE_READWRITE                   92        0`00412000 (   4.070 Mb)   0.16%    0.00%
PAGE_EXECUTE_WRITECOPY                   50        0`001df000 (   1.871 Mb)   0.07%    0.00%
PAGE_READWRITE|PAGE_GUARD                52        0`0009e000 ( 632.000 kb)   0.02%    0.00%

--- Largest Region by Usage ----------- Base Address -------- Region Size ----------
Free                                      0`b0000000      7fd`d72b0000 (   7.992 Tb)
<unknown>                                 0`3d270000        0`10000000 ( 256.000 Mb)
Image                                   7fe`d616a000        0`01338000 (  19.219 Mb)
MemoryMappedFile                          0`01263000        0`012dd000 (  18.863 Mb)
TEB                                     7ff`ffe98000        0`00002000 (   8.000 kb)
PEB                                     7ff`fffd4000        0`00001000 (   4.000 kb)

!dumpheap -stat 

 

000007fe87b8e108     8967       286944 Raven.Json.Linq.RavenJArray
000007fe87733dc0    12791       306984 NLog.Internal.SingleCallContinuation
000007fe87d28fe0    13810       331440 System.Collections.Generic.Dictionary`2+KeyCollection[[System.String, mscorlib],[Raven.Json.Linq.RavenJToken, Raven.Abstractions]]
000007fe87eb7f58       14       345024 System.Collections.Generic.Dictionary`2+Entry[[Lucene.Net.Index.Term, Lucene.Net],[Lucene.Net.Index.BufferedDeletes+Num, Lucene.Net]][]
000007fe87b8f818     8967       358680 System.Collections.Generic.List`1[[Raven.Json.Linq.RavenJToken, Raven.Abstractions]]
000007fe87766460    12791       409312 NLog.LoggerImpl+<>c__DisplayClass1
000007fee5b42250     5998       671776 System.Reflection.RuntimeMethodInfo
000007fe875d6000    26071       834272 Raven.Database.Storage.ReduceKeyAndBucket
000007fee5b4c7b8    36175       868200 System.Boolean
000007fee5b5ff60    38766       930384 System.Int64
000007fe874e5fb8     8953       931112 Raven.Abstractions.Data.JsonDocument
000007fe87c10810    33116      1324640 Raven.Storage.Esent.StorageActions.OptimizedIndexReader`1+Key[[System.Object, mscorlib]]
000007fe876baf18    12790      1534800 NLog.LogEventInfo
000007fe876ba570    25582      1637248 NLog.Common.AsyncContinuation
000007fe875a73c8    44164      1766560 Raven.Database.Indexing.MapReduceIndex+MapResultItem
000007fe87ebf918       95      2517120 System.Collections.Generic.Dictionary`2+Entry[[Lucene.Net.Index.Term, Lucene.Net],[System.Collections.Generic.LinkedListNode`1[[Lucene.Net.Util.Cache.SimpleLRUCache`2+ListValueEntry`2[[Lucene.Net.Index.Term, Lucene.Net],[Lucene.Net.Index.TermInfo, Lucene.Net],[Lucene.Net.Index.Term, Lucene.Net],[Lucene.Net.Index.TermInfo, Lucene.Net]], Lucene.Net]], System]][]
000007fee5b4c168     1156      3293564 System.Char[]
000007fee5b4f058    38338      3829996 System.Byte[]
000007fe875d6310    68635      4941720 Raven.Database.Storage.MappedResultInfo
000007fee5b2f1b8    34302      9235480 System.Object[]
000007fe874e6738   506555     20262200 Raven.Json.Linq.RavenJObject
000007fee5b4dc30   518101     21982304 System.Int32[]
000007fe87b8e5a8   513460     28753760 Raven.Json.Linq.DictionaryWithParentSnapshot
000007fe87b8f2b8   513460     41076800 System.Collections.Generic.Dictionary`2[[System.String, mscorlib],[Raven.Json.Linq.RavenJToken, Raven.Abstractions]]
000007fe87b8ec50  1285650     41140800 Raven.Json.Linq.RavenJValue
000007fe87c18dc0   513461     61112832 System.Collections.Generic.Dictionary`2+Entry[[System.String, mscorlib],[Raven.Json.Linq.RavenJToken, Raven.Abstractions]][]
000007fee5b4aee0  2334297    195257958 System.String
00000000000f6e00   100703    620911546      Free

Dec 17, 2012 at 12:06 PM

I am making the following assumptions.

  • .NET memory size: GC.GetTotalMemory(false)
  • Total process memory: Process.GetCurrentProcess().PrivateMemorySize64
  • Cache Page Size is equal to database page size.
  • Version page size is calculated using:
    // see discussion here: http://managedesent.codeplex.com/discussions/405939
    const int JET_paramVerPageSize = 128;
    int versionPageSize = 0;
    string paramString;
    Api.JetGetSystemParameter(JET_INSTANCE.Nil, JET_SESID.Nil, (JET_param)JET_paramVerPageSize, ref versionPageSize,
    						  out paramString, 0);
    
    versionPageSize = Math.Max(versionPageSize, SystemParameters.DatabasePageSize * 2);
    
    if (Environment.Is64BitProcess)
    {
    	versionPageSize *= 2;
    }
    return Math.Min(versionPageSize, 64 * 1024);

Using this code, I get the following numbers (after I forced a gc):

  • DatabaseCacheSizeInMB: 639.98
  • DatabaseTransactionVersionSizeInMB: 5.56
  • ManagedMemorySizeInMB: 238.82
  • TotalProcessMemorySizeInMB: 1388.78
  • MemoryThatIsNotAccountedFor: 504.42

As you can see, we have about 500 MB that is not accounted for, and I would really like to know what is going on.
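The accounting behind those numbers can be written out as a short sketch: the unaccounted figure is simply the total private memory minus the three known consumers. The class and method names here are hypothetical, and the inputs are the values listed above.

```csharp
using System;

static class MemoryAccounting
{
    // Everything in megabytes. Returns what is left after subtracting the
    // database cache, the version store, and the managed heap from the process total.
    public static double UnaccountedMB(
        double totalProcessMB,
        double databaseCacheMB,
        double versionStoreMB,
        double managedMB)
    {
        return totalProcessMB - (databaseCacheMB + versionStoreMB + managedMB);
    }
}
```

Plugging in the figures above, MemoryAccounting.UnaccountedMB(1388.78, 639.98, 5.56, 238.82) comes out at roughly 504.42. One caveat on the managed term: GC.GetTotalMemory is documented as returning the bytes currently thought to be allocated, so free gaps inside the managed heap may end up in the unaccounted bucket rather than in ManagedMemorySizeInMB.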

I should mention that I have a perfect repro, and that on my machine the memory actually peaks at about 2 GB before starting to drop. So _something_ funky is going on in here.

Developer
Dec 17, 2012 at 7:10 PM

First of all: Yes, Cache Page Size is equal to Database Page Size.

You have 195 MB of C# strings (is that expected? :), and 620 MB of 'Free' memory in the managed heap. This sounds pretty suspicious to me. I did a quick web search (http://www.bing.com/search?q=dumpheap+stat+free) and found the following:

http://blogs.msdn.com/b/tess/archive/2005/11/25/496973.aspx
'Another important thing to notice is the Free objects. These are not really objects but rather objects that have been marked as Free during a garbage collection but where the space is not yet compacted. If you have a lot of free space on the heap you may have a problem with pinning (a lot is a very vague term, but if more than 30% of the managed heap is Free you should definitely look into it). For some good explanations on this, please read Maoni’s blog http://blogs.msdn.com/maoni'

In order to track this down, you will likely need to explore more of the !sos commands related to garbage collection.

http://stackoverflow.com/questions/870723/is-my-heap-fragmented suggests that the !sosex extension is better able to track down this sort of problem.

ManagedEsent does do pinning during its calls. If you find that it's a problem with ManagedEsent, please let us know!

-martin