[GLASSFISH-16909] start-cluster/stop-cluster leaks threads Created: 24/Jun/11  Updated: 27/Jun/11  Resolved: 27/Jun/11

Status: Resolved
Project: glassfish
Component/s: admin
Affects Version/s: 3.1
Fix Version/s: 3.1.1_b10, 4.0_b13

Type: Bug Priority: Major
Reporter: Joe Di Pol Assignee: Joe Di Pol
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Tags: 3_1_1-approved


start-cluster and stop-cluster appear to leak threads. For a 3-instance cluster I'm seeing around 6 threads leaked when I start the cluster. This can be seen by running jstack on the DAS before and immediately after starting the cluster. The leaked threads look like:

"pool-11-thread-3" prio=3 tid=0x08f20800 nid=0x5f waiting on condition [0xcd845000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0xdc894060> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
        at java.lang.Thread.run(Thread.java:662)

It's possible these threads are reclaimed after some period of time (when garbage collection kicks in), but I have not verified that.
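
An in-process analogue of the before/after jstack comparison can be sketched in Java. This is only a minimal sketch and not part of GlassFish: the class PoolThreadCensus and its methods are hypothetical names, and it assumes the leaked pools use the default Executors thread naming (pool-N-thread-M) seen in the dump above. It only sees threads of the JVM it runs in, so it would have to run inside the DAS.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PoolThreadCensus {
    // Name pattern produced by Executors.defaultThreadFactory(): pool-N-thread-M
    private static final Pattern DEFAULT_NAME =
            Pattern.compile("(pool-\\d+)-thread-\\d+");

    // Groups thread names by pool and counts the threads in each pool.
    // Names that do not match the default pattern are ignored.
    static Map<String, Integer> countByPool(Iterable<String> threadNames) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : threadNames) {
            Matcher m = DEFAULT_NAME.matcher(name);
            if (m.matches()) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    // Takes a census of the live threads in the current JVM.
    static Map<String, Integer> snapshot() {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        return countByPool(names);
    }

    public static void main(String[] args) {
        System.out.println(snapshot());
    }
}
```

Taking one snapshot before start-cluster and one after, then comparing them, would show the same thing as the two jstack dumps: pool names that appear only in the second snapshot and never go away point at a leak.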

Comment by Joe Di Pol [ 24/Jun/11 ]

This is likely due to a missing threadPool.shutdown() call in ClusterCommandHelper.runCommand().

runCommand() creates a new ExecutorService for each execution of start-cluster/stop-cluster. It correctly waits for the tasks to complete, but it does not shut the thread pool down afterwards, so the pool's idle worker threads remain parked waiting for work.
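
The pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual ClusterCommandHelper code: the class PerCallPoolSketch, the helper runOnInstance(), and the use of a fixed-size pool are all hypothetical. It shows a pool created per invocation, the wait for task completion, and the shutdown() call whose absence leaves the workers parked in LinkedBlockingQueue.take().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerCallPoolSketch {

    // Hypothetical stand-in for the per-instance work; not GlassFish code.
    static String runOnInstance(String instanceName) {
        return instanceName + ": ok";
    }

    static List<String> runCommand(List<String> instanceNames) {
        // A new pool for every invocation, as described in the comment above.
        // The fixed-size pool is an assumption made for this sketch.
        ExecutorService threadPool =
                Executors.newFixedThreadPool(Math.max(1, instanceNames.size()));
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String name : instanceNames) {
                Callable<String> task = () -> runOnInstance(name);
                pending.add(threadPool.submit(task));
            }
            List<String> report = new ArrayList<>();
            for (Future<String> f : pending) {
                report.add(f.get()); // block until this task has completed
            }
            return report;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            // Without this call the idle workers of each per-call pool stay alive.
            threadPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runCommand(Arrays.asList("instance1", "instance2", "instance3")));
    }
}
```

The sketch places shutdown() in a finally block so the pool is also released when a task fails; that placement is a choice made for this illustration.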

Comment by scatari [ 24/Jun/11 ]

Pre-approving for 3.1.1 as this is a test/release blocker.

Comment by Joe Di Pol [ 24/Jun/11 ]
  • Why fix this issue in 3.1.1?

This bug is potentially causing unbounded resource growth in the DAS. We believe it could be impacting QA's testing on AIX where they are seeing DAS hangs.

  • Which is the targeted build of 3.1.1 for this fix?


  • Do regression tests exist for this issue?


  • Which tests should QA (re)run to verify the fix did not destabilize GlassFish?

Existing cluster lifecycle tests will verify the fix did not introduce a regression.

I have run the admin instance/cluster lifecycle devtests.

Comment by Joe Di Pol [ 24/Jun/11 ]

Fixed in 3.1.1 r47670:
Index: ClusterCommandHelper.java
--- ClusterCommandHelper.java (revision 47668)
+++ ClusterCommandHelper.java (working copy)
@@ -321,6 +321,7 @@

+ threadPool.shutdown();
return report;

Comment by Joe Di Pol [ 27/Jun/11 ]

Fixed in trunk:
Project: glassfish
Repository: svn
Revision: 47707
Author: jfdipol
Date: 2011-06-27 16:48:49 UTC
