[GLASSFISH-15592] [STRESS] Slow Memory growth observed over 24x7. Created: 17/Jan/11  Updated: 19/Dec/16  Resolved: 01/Feb/11

Status: Resolved
Project: glassfish
Component/s: failover
Affects Version/s: 3.1_dev
Fix Version/s: 3.1_dev

Type: Bug Priority: Critical
Reporter: varunrupela Assignee: Mahesh Kannan
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: JPEG File cpu-mem.jpg     File instance101-jmap-live.out-1     File instance101-jmap-live.out-34     File instance101-jmap-live.out-67    
Issue Links:
blocks GLASSFISH-15423 [STRESS] [BigApps] [Umbrella-Issue] 2... Closed
Tags: 3_1-review


Details of the scenario are in the parent issue for this bug: http://java.net/jira/browse/GLASSFISH-15423

The re-run of this RichAccess 24x7 scenario with build 37 shows slow but evident memory growth. jmap (-histo:live) logs for one of the instances (instance101) were taken every 20 minutes. Three of those are attached here, along with the CPU/Mem plot. The jmap -dump and the -histo:live logs will be sent directly to Mahesh.

The jmap -histo:live log files from the instance indicate a slow rise in the number/size of the following data structures:

  • org.shoal.ha.cache.impl.store.DataStoreEntry
  • java.util.concurrent.ConcurrentHashMap$HashEntry
  • com.sun.ejb.base.sfsb.util.SimpleKeyGenerator$SimpleSessionKey

The observed memory growth is perhaps slow because of the low number of simultaneous users (100 per instance).

Comment by Mahesh Kannan [ 18/Jan/11 ]

Did we see the growth on b36? There were no major changes in the replication module since b36.
Can you point me to the server logs? One possibility is that instance1 could be acting as replica for more keys than the other two instances.

FYI, SimpleSessionKeys are used as Keys for EJBs.

Comment by varunrupela [ 18/Jan/11 ]

Checked all the older runs; only one run, with the nightly build from Dec 21, showed a very slight hint of memory growth. No other runs show it.

This issue is being reported with build 37 and on Windows 2008. The other run with build 37 on OEL does not show a memory growth.

Logs location is being sent by e-mail to you.

Comment by Nazrul [ 19/Jan/11 ]

Any update on the memory leak?

Comment by Mahesh Kannan [ 20/Jan/11 ]

I looked into the two jmap dumps (using jhat -baseline <first dump> <second dump>) and it looks like the EJB references are slowly leaking. All of these reside in the ReplicaStore.

There are two possible reasons why they could be leaking in instance 1:

1. instance 1 could be acting as the replica instance for more keys than the other two instances (though this would mean a non-uniform key distribution)

2. The save commands could be arriving after the remove commands. One way to handle this is to maintain a set containing the keys that were removed in the recent past (say, the last 5 minutes?). Then, if a save command arrives out of order for one of those keys, we could throw it away. Obviously, this can be implemented only in 3.2. The other option is to rely on the EJB idle-bean processor to remove unused keys. Maybe running the longevity test with a lower removal-timeout-in-seconds might eliminate this slow growth.
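To illustrate option 2, here is a minimal sketch (not the actual Shoal/ReplicaStore code; the class and method names are hypothetical) of a registry of recently removed keys that a replica could consult before applying a late-arriving save command:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: track keys removed from the replica store in the
// recent past, so that "save" commands arriving after the corresponding
// "remove" can be discarded instead of resurrecting the entry.
public class RecentlyRemovedKeys<K> {
    private final Map<K, Long> removedAt = new ConcurrentHashMap<>();
    private final long windowMillis;

    public RecentlyRemovedKeys(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    // Record that a key was removed from the replica store.
    public void noteRemoval(K key) {
        removedAt.put(key, System.currentTimeMillis());
    }

    // A save command for a key removed within the window is stale
    // and should be dropped rather than stored.
    public boolean shouldDropSave(K key) {
        Long t = removedAt.get(key);
        return t != null && System.currentTimeMillis() - t < windowMillis;
    }

    // Purge entries older than the window periodically, so this
    // bookkeeping set does not itself grow without bound.
    public void purgeExpired() {
        long cutoff = System.currentTimeMillis() - windowMillis;
        removedAt.values().removeIf(t -> t < cutoff);
    }
}
```

The window (5 minutes in the comment above) trades memory for safety: it only needs to be longer than the worst-case reordering delay between a remove and a straggling save.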

Comment by Nazrul [ 27/Jan/11 ]

Tracking bug at this point. Excluding from un-scrubbed list

Comment by Mahesh Kannan [ 01/Feb/11 ]

Elena was able to run richAccess for 4 days without any memory leaks, though the JVM crashed. I believe there is a separate issue for that.

I am marking this issue as resolved. Please reopen if you see this issue again.

Comment by Mahesh Kannan [ 01/Feb/11 ]

If you see growth again, please rerun the app with the following system property: -Dorg.shoal.ha.cache.mbean.register=true

Generated at Tue Jan 17 21:47:29 UTC 2017 using JIRA 6.2.3#6260-sha1:63ef1d6dac3f4f4d7db4c1effd405ba38ccdc558.