[GLASSFISH-15592] [STRESS] Slow Memory growth observed over 24x7. Created: 17/Jan/11 Updated: 01/Feb/11 Resolved: 01/Feb/11
|Remaining Estimate:||Not Specified|
|Time Spent:||Not Specified|
|Original Estimate:||Not Specified|
|File Attachments:||cpu-mem.jpg instance101-jmap-live.out-1 instance101-jmap-live.out-34 instance101-jmap-live.out-67|
|Participants:||Mahesh Kannan, Nazrul and varunrupela|
Details of the scenario are in the parent issue for this bug: http://java.net/jira/browse/GLASSFISH-15423
The re-run of this RichAccess 24x7 scenario with build 37 shows slow but evident memory growth. jmap (-histo:live) logs were taken every 20 minutes on one of the instances (instance101). Three of those logs are attached here, along with the CPU/memory plot. The jmap -dump and -histo:live logs will be sent directly to Mahesh.
The jmap -histo:live log files from the instance indicate a slow rise in the number/size of the following data structures:
The observed memory growth is slow, perhaps because of the low number of simultaneous users (100 per instance).
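For reference, the periodic collection described above can be sketched as a small shell loop. This is a hypothetical reconstruction, not the script actually used in the run; the process-discovery step and the output directory are assumptions.

```shell
# Hypothetical sketch: snapshot the live heap histogram of the
# instance101 JVM every 20 minutes. "ASMain" as the process name
# and /tmp/jmap-logs as the destination are assumptions.
PID=$(jps | awk '/ASMain/ {print $1}')
OUTDIR=/tmp/jmap-logs
mkdir -p "$OUTDIR"
i=0
while true; do
  jmap -histo:live "$PID" > "$OUTDIR/instance101-jmap-live.out-$i"
  i=$((i + 1))
  sleep 1200    # 20 minutes between snapshots
done
```

Note that -histo:live forces a full GC before counting, so snapshots themselves perturb the heap slightly; growth that survives repeated full GCs is the signal of interest here.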
|Comment by Mahesh Kannan [ 18/Jan/11 04:25 PM ]|
FYI, SimpleSessionKeys are used as keys for EJBs.
|Comment by varunrupela [ 18/Jan/11 08:39 PM ]|
Checked all the older runs; only one run, with the nightly build from Dec 21, showed a very slight hint of memory growth. No other runs show it.
This issue is being reported with build 37 and on Windows 2008. The other run with build 37 on OEL does not show a memory growth.
Logs location is being sent by e-mail to you.
|Comment by Nazrul [ 19/Jan/11 09:44 AM ]|
Any update on the memory leak?
|Comment by Mahesh Kannan [ 20/Jan/11 12:25 AM ]|
I looked into the two jmap dumps (using jhat -baseline <first map> <second map>), and it looks like the EJB references are slowly leaking. All of these reside in the ReplicaStore.
There are two reasons why they could be leaking in instance 1:
1. instance 1 could be acting as the replica instance for more keys than the other two instances (though this would imply a non-uniform key distribution)
2. The save commands could be arriving after the remove commands. One way to handle this is to maintain a set containing keys that were removed in the recent past (say, the last 5 minutes?). Then, if save commands arrive out of order, we could throw away those that arrive after the corresponding remove commands. Obviously, this can be implemented only in 3.2. The other option is to rely on the EJB idle processor to remove unused keys. Maybe running the longevity test with a lower removal-timeout-in-seconds might eliminate this slow growth.
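The "recently removed keys" idea in option 2 can be sketched as a tombstone map with time-based expiry. This is a minimal illustration only; the class and method names are hypothetical and not part of the GlassFish/Shoal API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: remember keys removed in the recent past so that
// a save command arriving out of order (after the remove) can be
// discarded instead of re-inserting a dead entry into the replica store.
public class RemovedKeyTracker<K> {

    private final long retentionMillis;                       // e.g. 5 minutes
    private final Map<K, Long> removedAt = new ConcurrentHashMap<>();

    public RemovedKeyTracker(long retentionMillis) {
        this.retentionMillis = retentionMillis;
    }

    // Record a tombstone when a remove command is processed.
    public void onRemove(K key) {
        removedAt.put(key, System.currentTimeMillis());
    }

    // Returns true if a save command for this key should be applied,
    // false if it arrived after a recent remove and must be dropped.
    public boolean shouldApplySave(K key) {
        Long removedTime = removedAt.get(key);
        if (removedTime == null) {
            return true;                                      // never removed
        }
        if (System.currentTimeMillis() - removedTime > retentionMillis) {
            removedAt.remove(key);                            // tombstone expired
            return true;
        }
        return false;                                         // out-of-order save
    }
}
```

The retention window trades memory for correctness: tombstones themselves occupy heap, so they must expire, which is why the comment suggests a bounded period such as 5 minutes.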
|Comment by Nazrul [ 27/Jan/11 11:11 AM ]|
Tracking bug at this point. Excluding from the un-scrubbed list.
|Comment by Mahesh Kannan [ 01/Feb/11 03:41 PM ]|
Elena was able to run richAccess for 4 days without any memory leaks. The JVM crashed, though; I believe there is a separate issue for that.
I am marking this issue as resolved. Please reopen if you see this issue again.
|Comment by Mahesh Kannan [ 01/Feb/11 03:43 PM ]|
If you see growth again, please rerun the app with the following system property: -Dorg.shoal.ha.cache.mbean.register=true
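One way to set that system property persistently on the server is via asadmin; a sketch, assuming a clustered setup where "c1-config" is a placeholder for the actual config target (exact quoting of the leading dash may vary by asadmin version):

```shell
# Hypothetical sketch: persist the diagnostic flag in the domain config.
# "c1-config" is a placeholder target, not from the original report.
asadmin create-jvm-options --target c1-config "-Dorg.shoal.ha.cache.mbean.register=true"

# Restart the instances afterwards so the new JVM option takes effect.
asadmin stop-cluster c1
asadmin start-cluster c1
```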