
Key: GLASSFISH-15592
Type: Bug
Status: Resolved
Resolution: Fixed
Priority: Critical
Assignee: Mahesh Kannan
Reporter: varunrupela
Votes: 0
Watchers: 1


[STRESS] Slow Memory growth observed over 24x7.

Created: 17/Jan/11 08:24 PM   Updated: 01/Feb/11 03:43 PM   Resolved: 01/Feb/11 03:41 PM
Component/s: failover
Affects Version/s: 3.1_b37
Fix Version/s: 3.1_b40

Time Tracking:
Not Specified

File Attachments: 1. File instance101-jmap-live.out-1 (375 kB) 17/Jan/11 08:34 PM - varunrupela
2. File instance101-jmap-live.out-34 (374 kB) 17/Jan/11 08:34 PM - varunrupela
3. File instance101-jmap-live.out-67 (375 kB) 17/Jan/11 08:34 PM - varunrupela

Image Attachments: 1. cpu-mem.jpg (267 kB)
Issue Links:

Tags: 3_1-review
Participants: Mahesh Kannan, Nazrul and varunrupela

Description

Details of the scenario are in the parent issue for this bug:

The re-run of this RichAccess 24x7 scenario with build 37 shows slow but evident memory growth. jmap -histo:live logs were taken every 20 minutes on one of the instances (instance101). Three of those logs are attached here, along with the CPU/Mem plot. The jmap -dump and -histo:live logs will be sent directly to Mahesh.

The jmap -histo:live log files from the instance indicate a slow rise in the number/size of the following data structures:

  • java.util.concurrent.ConcurrentHashMap$HashEntry
  • com.sun.ejb.base.sfsb.util.SimpleKeyGenerator$SimpleSessionKey

The observed memory growth is perhaps slow due to the low number of simultaneous users (100 per instance).
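One way to quantify this kind of growth is to diff the instance counts of two histogram snapshots. The following is only a minimal sketch, not the procedure used for this report: the class name HistoDiff and its methods are hypothetical, and it assumes the usual jmap -histo row layout of rank, #instances, #bytes, class name.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: compares two "jmap -histo:live" text outputs.
class HistoDiff {

    // Parses rows such as "   1:   1234   56789  java.lang.String"
    // into a map of class name -> instance count. Header, separator,
    // and "Total" lines are skipped because they do not match the row shape.
    static Map<String, Long> parse(List<String> lines) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String line : lines) {
            String[] f = line.trim().split("\\s+");
            if (f.length < 4 || !f[0].endsWith(":")) {
                continue;
            }
            try {
                counts.put(f[3], Long.parseLong(f[1]));
            } catch (NumberFormatException e) {
                // Not a data row after all; ignore it.
            }
        }
        return counts;
    }

    // Returns only the classes whose instance count increased,
    // mapped to the size of the increase.
    static Map<String, Long> growth(Map<String, Long> before, Map<String, Long> after) {
        Map<String, Long> delta = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : after.entrySet()) {
            long d = e.getValue() - before.getOrDefault(e.getKey(), 0L);
            if (d > 0) {
                delta.put(e.getKey(), d);
            }
        }
        return delta;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: java HistoDiff <earlier-histo> <later-histo>");
            return;
        }
        Map<String, Long> before = parse(Files.readAllLines(Paths.get(args[0])));
        Map<String, Long> after = parse(Files.readAllLines(Paths.get(args[1])));
        growth(before, after).forEach((cls, d) -> System.out.println(d + "\t" + cls));
    }
}
```

Run against the first and last attached logs, this would list the classes whose live instance count went up between the two snapshots.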

Mahesh Kannan added a comment - 18/Jan/11 04:25 PM

Did we see the growth on b36? There were no major changes in the replication module since b36.
Can you point me to the server logs? One possibility is that instance1 could be acting as replica for more keys than the other two instances.

FYI, SimpleSessionKeys are used as Keys for EJBs.

varunrupela added a comment - 18/Jan/11 08:39 PM

Checked all the older runs. Only one run, with the nightly build from Dec 21, showed a very small hint of memory growth. No other runs show it.

This issue is being reported with build 37 on Windows 2008. The other run with build 37 on OEL does not show memory growth.

Logs location is being sent by e-mail to you.

Nazrul added a comment - 19/Jan/11 09:44 AM

Any update on the memory leak?

Mahesh Kannan added a comment - 20/Jan/11 12:25 AM

I looked into the two jmap dumps (using jhat -baseline <first map> <second map>) and it looks like the EJB references are slowly leaking. All of these reside in the ReplicaStore.

There are two reasons why they could be leaking in instance 1:

1. instance 1 could be acting as replica instance for more keys than other two instances (though this means a non-uniform key distribution)

2. The save commands could be arriving after remove commands. One way to handle this is to maintain a set that contains keys that were removed in the recent past (say 5 minutes?). Then, if the save commands arrive out of order, we could throw away those that arrive after remove commands. Obviously, this can be implemented only in 3.2. The other option is to rely on the EJB idle processor to remove unused keys. Maybe running the longevity test with a lower removal-timeout-in-seconds might eliminate this slow growth.
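The "recently removed keys" idea in item 2 could look roughly like the sketch below. This is an illustration only, not the ReplicaStore code: the class RecentlyRemovedFilter and all of its methods are hypothetical, timestamps are passed in explicitly to keep the example deterministic, and the retention window stands in for the "say 5 minutes" suggested above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a store that drops save commands arriving
// shortly after a remove command for the same key.
class RecentlyRemovedFilter<K, V> {
    private final Map<K, V> store = new HashMap<>();
    private final Map<K, Long> removedAt = new HashMap<>();
    private final long retentionMillis;

    RecentlyRemovedFilter(long retentionMillis) {
        this.retentionMillis = retentionMillis;
    }

    // Records the removal time before deleting the entry, so that a
    // late save for this key can be recognized.
    synchronized void remove(K key, long nowMillis) {
        removedAt.put(key, nowMillis);
        store.remove(key);
    }

    // Returns false and stores nothing if the key was removed within
    // the retention window; otherwise stores the value and returns true.
    synchronized boolean save(K key, V value, long nowMillis) {
        Long t = removedAt.get(key);
        if (t != null && nowMillis - t < retentionMillis) {
            return false;
        }
        store.put(key, value);
        return true;
    }

    // Forgets removals older than the retention window so the
    // bookkeeping map does not itself grow without bound.
    synchronized void purge(long nowMillis) {
        removedAt.values().removeIf(t -> nowMillis - t >= retentionMillis);
    }

    synchronized int size() {
        return store.size();
    }

    synchronized int pendingRemovals() {
        return removedAt.size();
    }
}
```

A save that arrives after the retention window has passed would still be stored in this sketch, so something like the idle processor would remain necessary as a backstop.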

Nazrul added a comment - 27/Jan/11 11:11 AM

Tracking bug at this point. Excluding from the un-scrubbed list.

Mahesh Kannan added a comment - 01/Feb/11 03:41 PM

Elena was able to run richAccess for 4 days without any memory leaks. The JVM crashed, though; I believe there is a separate issue for that.

I am marking this issue as resolved. Please reopen if you see this issue again.

Mahesh Kannan added a comment - 01/Feb/11 03:43 PM

If you see growth, please rerun the app with the following system property: -Dorg.shoal.ha.cache.mbean.register=true