
Key: GLASSFISH-13592
Type: Bug
Status: Open
Priority: Major
Assignee: marina vatkina
Reporter: Stephen DiMilla
Votes: 0
Watchers: 2


asadmin stop-local-instance does not stop instance with outstanding ejb invocation

Created: 23/Sep/10 12:56 PM   Updated: 08/Dec/10 03:36 PM
Component/s: ejb_container
Affects Version/s: 3.1
Fix Version/s: future release

Time Tracking:
Not Specified

File Attachments:
1. clientside_output.rtf (10 kB) 23/Sep/10 01:02 PM - Stephen DiMilla
2. pidmonitorscript_and_output.rtf (4 kB) 23/Sep/10 01:01 PM - Stephen DiMilla
3. shutdown_jstack.out (37 kB) 12/Oct/10 02:58 PM - Joe Fialli


Operating System: All
Platform: Linux

Issue Links:

Issuezilla Id: 13,592
Status Whiteboard:


Tags: 3_1-exclude
Participants: Joe Fialli, marina vatkina and Stephen DiMilla

Description

I have a distributed cluster (1 DAS and 3 instances), all on separate machines.
I deploy an EJB web app to the DAS and the cluster. I then send POST messages to
the DAS and each instance, which causes all the instances to start sending
messages to each other. After 15 seconds I issue an asadmin
stop-local-instance on the second instance. The command returns the following:

Waiting for the instance to stop
Timed out (60 seconds) waiting for the domain to stop.
Command stop-local-instance failed.

I then issue asadmin list-instances clustername and get back:
ins01 running
ins02 not running
ins03 running

I then issue asadmin get-health clustername and get back:
ins01 started since Thu Sep 23 11:29:58 PDT 2010
ins02 started since Thu Sep 23 11:29:58 PDT 2010
ins03 started since Thu Sep 23 11:29:57 PDT 2010

After this I try to connect to the HTTP port of ins02 (using a browser) and get:
Unable to connect.

While this situation was going on, I used a script to monitor the pids of all
the java processes on the machine hosting ins02. I saw the process for
stop-local-instance start and end, but the appserver process never terminated.
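
The kind of pid monitoring described above can be sketched in plain Java. This is only an illustration, not the attached script: the class name PidMonitor, the "executable is named java" heuristic, and the one-second polling interval are all assumptions, and it needs Java 9 or later for ProcessHandle.

```java
import java.time.LocalTime;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not the script attached to this issue.
public class PidMonitor {
    // Assumed heuristic: a process is a "java process" if its executable is named java.
    static boolean isJavaCommand(String command) {
        String c = command.replace('\\', '/').toLowerCase();
        return c.equals("java") || c.endsWith("/java") || c.endsWith("/java.exe");
    }

    // Snapshot of the Java processes visible to this user: pid -> executable path.
    static Map<Long, String> snapshot() {
        Map<Long, String> result = new TreeMap<>();
        ProcessHandle.allProcesses().forEach(p ->
            p.info().command()
                .filter(PidMonitor::isJavaCommand)
                .ifPresent(cmd -> result.put(p.pid(), cmd)));
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        // Number of one-second samples; defaults to 5 so the sketch terminates.
        int samples = args.length > 0 ? Integer.parseInt(args[0]) : 5;
        Map<Long, String> previous = Map.of();
        for (int i = 0; i < samples; i++) {
            Map<Long, String> current = snapshot();
            for (Map.Entry<Long, String> e : current.entrySet()) {
                if (!previous.containsKey(e.getKey())) {
                    System.out.println(LocalTime.now() + " started " + e.getKey() + " " + e.getValue());
                }
            }
            for (Long pid : previous.keySet()) {
                if (!current.containsKey(pid)) {
                    System.out.println(LocalTime.now() + " exited  " + pid);
                }
            }
            previous = current;
            Thread.sleep(1000);
        }
    }
}
```

With a monitor like this running, the symptom reported here would show up as a "started" and then "exited" line for the asadmin JVM, with no "exited" line for the instance JVM.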

Looking in the DAS server log, I was able to determine that the test running
on ins02 finished its work and sent a message to the DAS indicating such.

From all this information it appears that the appserver (ins02) was partially
shut down by stop-local-instance, but that the GMS process and the EJB
container were not stopped and continued to run, leaving the appserver in an
inconsistent state.

Debugging this issue is currently blocked as a result of the limited information
contained in the server log. See dependency bug

Stephen DiMilla added a comment - 23/Sep/10 01:01 PM

Created an attachment (id=4950)
script to monitor pids and the output it generated during test

Stephen DiMilla added a comment - 23/Sep/10 01:02 PM

Created an attachment (id=4951)
client side output with debug turned on for CLI commands

marina vatkina added a comment - 23/Sep/10 01:29 PM

I've noticed that even when stop-local-instance succeeds, some mq processes are
not stopped (no JMS was used by the test app)

Joe Fialli added a comment - 28/Sep/10 08:37 AM

Given that the test case uses stateless EJBs to send GMS messages under heavy
load, there is a chance that GMS under heavy load is interfering with asadmin's
attempt to shut down the server. Until GlassFish issue 13593 (logging stops at
the beginning of shutdown) is fixed, there is no easy way to investigate this.
GMSAdapter has an app server event listener registered. This listener handles
SERVER_READY and SERVER_SHUTDOWN. GMS leaves its group when the handler is
invoked with SERVER_SHUTDOWN. There is an info log message that would confirm
whether the event handler is being called properly.

Here is event handler in question from

glassfishEventListener =
    new org.glassfish.api.event.EventListener() {
        public void event(Event event) {
            if ( {
                logger.log(Level.INFO, "gmsservice.server_shutdown.received", ...);
                gms.shutdown(GMSConstants.shutdownType.INSTANCE_SHUTDOWN);
                events.unregister(glassfishEventListener);
            }

Joe Fialli added a comment - 04/Oct/10 09:19 AM

Progress is being made on dependent issue 13593.
Once that is fixed, we will be able to evaluate this issue.

Joe Fialli added a comment - 12/Oct/10 02:55 PM

We moved group-management-service shutdown from being invoked during SERVER_SHUTDOWN to
PREPARE_SHUTDOWN, and GMS is now properly shutting down for this test case in question.
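
A plain-Java model may help show why moving the handler could matter. It assumes, without confirmation from this report, that the server fires PREPARE_SHUTDOWN, then stops applications, then fires SERVER_SHUTDOWN, so a stalled application stop would keep SERVER_SHUTDOWN from ever being delivered. Every name below (ShutdownOrderSketch, register, stop) is hypothetical; none of this is GlassFish code.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Hypothetical model of shutdown event ordering; not the GlassFish implementation.
public class ShutdownOrderSketch {
    enum EventType { PREPARE_SHUTDOWN, SERVER_SHUTDOWN }

    private final Map<EventType, List<Runnable>> listeners = new EnumMap<>(EventType.class);
    final List<String> log = new ArrayList<>();

    void register(EventType type, Runnable listener) {
        listeners.computeIfAbsent(type, t -> new ArrayList<>()).add(listener);
    }

    private void fire(EventType type) {
        log.add("event " + type);
        listeners.getOrDefault(type, List.of()).forEach(Runnable::run);
    }

    // Assumed order: PREPARE_SHUTDOWN, stop applications, SERVER_SHUTDOWN.
    // stopApplications returning false models an application stop that never completes.
    void stop(BooleanSupplier stopApplications) {
        fire(EventType.PREPARE_SHUTDOWN);
        if (!stopApplications.getAsBoolean()) {
            log.add("application stop did not complete");
            return; // SERVER_SHUTDOWN is never fired in this case
        }
        fire(EventType.SERVER_SHUTDOWN);
    }

    public static void main(String[] args) {
        // Handler on SERVER_SHUTDOWN: never runs when the application stop stalls.
        ShutdownOrderSketch late = new ShutdownOrderSketch();
        late.register(EventType.SERVER_SHUTDOWN, () -> late.log.add("GMS leaves group"));
        late.stop(() -> false);
        System.out.println("SERVER_SHUTDOWN handler:  " + late.log);

        // Handler on PREPARE_SHUTDOWN: runs before the stalled application stop.
        ShutdownOrderSketch early = new ShutdownOrderSketch();
        early.register(EventType.PREPARE_SHUTDOWN, () -> early.log.add("GMS leaves group"));
        early.stop(() -> false);
        System.out.println("PREPARE_SHUTDOWN handler: " + early.log);
    }
}
```

In this model only the second configuration logs "GMS leaves group", which matches the observation that GMS shuts down properly after the move while the instance itself still does not exit.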

However, the test case uses a stateful EJB to run the test. One EJB invocation lasts for the duration of the
test (typically 4-5 minutes). In this case, we were shutting down one of the clustered
instances in the middle of the test. I will attach a complete jstack of the process after asadmin stop-instance
has timed out.

Here are two thread traces relevant to application server being in a half shutdown state.

"GlassFish Shutdown Hook" prio=10 tid=0x0000000057ca0800 nid=0x647f waiting on condition
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00002aaac3933e68> (a
at java.util.concurrent.locks.LockSupport.park(
at com.sun.ejb.containers.BaseContainer.doContainerCleanup(
at com.sun.ejb.containers.BaseContainer.onShutdown(
at org.glassfish.ejb.startup.EjbApplication.stop(
- locked <0x00002aaabfd48378> (a
at com.sun.enterprise.v3.server.ApplicationLifecycle.disable(
at com.sun.hk2.component.SingletonInhabitant.release(
- locked <0x00002aaabf553330> (a com.sun.hk2.component.SingletonInhabitant)
at com.sun.hk2.component.EventPublishingInhabitant.release(
at com.sun.hk2.component.LazyInhabitant.release(
- locked <0x00002aaabf5532e0> (a com.sun.hk2.component.LazyInhabitant)
at com.sun.enterprise.v3.server.AppServerStartup.stop(
- locked <0x00002aaabebe69a8> (a com.sun.enterprise.v3.server.AppServerStartup)
at com.sun.enterprise.glassfish.bootstrap.GlassFishImpl.stop(
- locked <0x00002aaabebe6980> (a com.sun.enterprise.glassfish.bootstrap.GlassFishImpl)
at com.sun.enterprise.glassfish.bootstrap.GlassFishMain$Launcher$

"http-thread-pool-18080(2)" daemon prio=10 tid=0x00002aaae5374800 nid=0x6443 waiting on
condition [0x00000000483bf000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at com.sun.shoal.test.SendBean.sleep(
at com.sun.shoal.test.SendBean.waitTillDone(
at com.sun.shoal.test.SendBean.sendMessage(
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(
at sun.reflect.DelegatingMethodAccessorImpl.invoke(
at java.lang.reflect.Method.invoke(
at com.sun.ejb.containers.BaseContainer.invokeBeanMethod(
at com.sun.ejb.EjbInvocation.invokeBeanMethod(
at com.sun.ejb.EjbInvocation.proceed(
at org.jboss.weld.ejb.SessionBeanInterceptor.aroundInvoke(
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(
at sun.reflect.DelegatingMethodAccessorImpl.invoke(
at java.lang.reflect.Method.invoke(
at com.sun.ejb.containers.interceptors.InterceptorManager.intercept(
at com.sun.ejb.containers.BaseContainer.__intercept(
at com.sun.ejb.containers.BaseContainer.intercept(
at com.sun.ejb.containers.EJBObjectInvocationHandler.invoke(
at $Proxy163.sendMessage(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(
at sun.reflect.DelegatingMethodAccessorImpl.invoke(
at java.lang.reflect.Method.invoke(
at com.sun.shoal.test._SendIF_Wrapper.sendMessage(com/sun/shoal/test/
at org.apache.jsp.test_jsp._jspService( from :133)
at org.apache.jasper.runtime.HttpJspBase.service(
at javax.servlet.http.HttpServlet.service(
at org.apache.jasper.servlet.JspServletWrapper.service(
at org.apache.jasper.servlet.JspServlet.serviceJspFile(
at org.apache.jasper.servlet.JspServlet.service(
at javax.servlet.http.HttpServlet.service(
at org.apache.catalina.core.StandardWrapper.service(
at org.apache.catalina.core.StandardWrapperValve.invoke(
at org.apache.catalina.core.StandardContextValve.invoke(
at org.apache.catalina.core.StandardPipeline.doInvoke(
at org.apache.catalina.core.StandardPipeline.invoke(
at com.sun.enterprise.web.WebPipeline.invoke(
at org.apache.catalina.core.StandardHostValve.invoke(
at org.apache.catalina.connector.CoyoteAdapter.doService(
at org.apache.catalina.connector.CoyoteAdapter.service(
at com.sun.grizzly.http.ProcessorTask.invokeAdapter(
at com.sun.grizzly.http.ProcessorTask.doProcess(
at com.sun.grizzly.http.ProcessorTask.process(
at com.sun.grizzly.http.DefaultProtocolFilter.execute(
at com.sun.grizzly.DefaultProtocolChain.executeProtocolFilter(
at com.sun.grizzly.DefaultProtocolChain.execute(
at com.sun.grizzly.DefaultProtocolChain.execute(
at com.sun.grizzly.http.HttpProtocolChain.execute(
at com.sun.grizzly.ProtocolChainContextTask.doCall(
at com.sun.grizzly.util.AbstractThreadPool$Worker.doWork(
at com.sun.grizzly.util.AbstractThreadPool$

One could simply recreate this as an EJB unit test by creating a stateful EJB with one method
that sleeps for 10 minutes. An asadmin stop-instance issued while the method is still being invoked
will time out. Should a wedged EJB invocation be able to block server shutdown?

We realize that the test EJB in question needs some work, but we did want to allow someone from
ejb-container to evaluate the impact of a pending EJB invocation on shutting down the server.
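
To make the question concrete, here is a minimal plain-Java sketch of one possible policy: shutdown waits a bounded time for in-flight invocations, then interrupts the stragglers instead of parking indefinitely. It is a model only. InvocationDrainSketch, invoke, and awaitDrained are hypothetical names, the sleeping Runnable stands in for the long-running bean method, and nothing here reflects how BaseContainer actually behaves.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical container model; not the GlassFish EJB container.
public class InvocationDrainSketch {
    private final Object lock = new Object();
    private int inFlight = 0;

    // Runs a bean method while tracking it as an in-flight invocation.
    void invoke(Runnable beanMethod) {
        synchronized (lock) { inFlight++; }
        try {
            beanMethod.run();
        } finally {
            synchronized (lock) {
                inFlight--;
                lock.notifyAll();
            }
        }
    }

    // Waits up to timeoutMillis for in-flight invocations to finish.
    // Returns true if none remain, false if the wait timed out.
    boolean awaitDrained(long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        synchronized (lock) {
            while (inFlight > 0) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    return false;
                }
                // wait(0) would block forever, so always wait at least 1 ms.
                lock.wait(Math.max(1, TimeUnit.NANOSECONDS.toMillis(remaining)));
            }
            return true;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        InvocationDrainSketch container = new InvocationDrainSketch();
        CountDownLatch started = new CountDownLatch(1);
        // Stand-in for the long-sleeping bean method described above.
        Thread worker = new Thread(() -> container.invoke(() -> {
            started.countDown();
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // Interrupted by shutdown: return so the invocation completes.
            }
        }));
        worker.setDaemon(true);
        worker.start();
        started.await();

        System.out.println("drained within timeout: " + container.awaitDrained(200));
        worker.interrupt();
        System.out.println("drained after interrupt: " + container.awaitDrained(5_000));
    }
}
```

The first wait should time out while the invocation is still sleeping, and the second should succeed once the worker has been interrupted. Whether interrupting a pending invocation is acceptable for a real container is exactly the policy decision this issue asks the ejb-container owners to make.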

Joe Fialli added a comment - 12/Oct/10 02:58 PM

Created an attachment (id=5131)
jstack of threads still running after asadmin stop-instance has timed out trying to stop an app server with an active EJB invocation

kenaiadmin made changes - 26/Nov/10 12:15 AM
Field Original Value New Value
issue.field.bugzillaimportkey 13592 45196
marina vatkina made changes - 08/Dec/10 03:36 PM
Fix Version/s future release [ 11148 ]
Fix Version/s 3.2 [ 10969 ]