Issue Details (XML | Word | Printable)

Key: GLASSFISH-13592
Type: Bug Bug
Status: Open Open
Priority: Major Major
Assignee: marina vatkina
Reporter: Stephen DiMilla
Votes: 0
Watchers: 2
Operations

If you were logged in you would be able to see more operations.
glassfish

asadmin stop-local-instance does not stop instance with outstanding ejb invocation

Created: 23/Sep/10 12:56 PM   Updated: 08/Dec/10 03:36 PM
Component/s: ejb_container
Affects Version/s: 3.1
Fix Version/s: future release

Time Tracking:
Not Specified

File Attachments: 1. Text File clientside_output.rtf (10 kB) 23/Sep/10 01:02 PM - Stephen DiMilla
2. Text File pidmonitorscript_and_output.rtf (4 kB) 23/Sep/10 01:01 PM - Stephen DiMilla
3. Text File shutdown_jstack.out (37 kB) 12/Oct/10 02:58 PM - Joe Fialli

Environment:

Operating System: All
Platform: Linux

Issue Links:
Dependency
 

Issuezilla Id: 13,592
Status Whiteboard:

3.1-exclude

Tags: 3_1-exclude
Participants: Joe Fialli, marina vatkina and Stephen DiMilla


 Description  « Hide

I have a distributed cluster (1 das and 3 instances) all on separate machines.
I deploy a ejb web app to the das and cluster. I then send post messages to the
das and each instance which initiates all the instances to start sending
messages between each other. After 15 seconds I issue a asadmin
stop-local-instance to the second instance. The command returns with the following:

Waiting for the instance to stop
..........................................................
Timed out (60 seconds) waiting for the domain to stop.
Command stop-local-instance failed.

I then issue asadmin list-instances clustername and get back:
ins01 running
ins02 not running
ins03 running

I then issue asadmin get-health clustername and get back:
ins01 started since Thu Sep 23 11:29:58 PDT 2010
ins02 started since Thu Sep 23 11:29:58 PDT 2010
ins03 started since Thu Sep 23 11:29:57 PDT 2010

After this I try to connect to the HTTP port of ins02 (using a browser) and get:
Unable able to connect.

The whole time while this situtation was going on I used a script to
monitor the pids of all the java processes on the machine for ins02 and what I
noticed was that I saw the process for the stop-local-instance start and end but
never did the appserver terminate.

Looking in the DAS server log, I was able to determine that the test running
on ins02 finished it's work and sent a message to the das indicating such.

From all this information it appears that the appserver (ins02) was partially
shutdown via the stop-local-instance but that the GMS process and the ejb
container were not stopped and continued to run leaving the appserver in a
inconsistent state.

Debugging this issue is currently blocked as a result of the limited information
contained in the server log. See dependency bug



Stephen DiMilla added a comment - 23/Sep/10 01:01 PM

Created an attachment (id=4950)
scirpt to monitor pids and the output it generated during test


Stephen DiMilla added a comment - 23/Sep/10 01:02 PM

Created an attachment (id=4951)
client side output with debug turned on for CLI commands


marina vatkina added a comment - 23/Sep/10 01:29 PM

I've noticed that even when stop-local-instance succeeds, some mq processes are
not stopped (no JMS was used by the test app)


Joe Fialli added a comment - 28/Sep/10 08:37 AM

Given that the test case is using stateless EJBs to send GMS messages in a heavy
load, there is a chance that GMS under a heavy load is interferring with asadmin
attempt to shutdown server. Until gf issue 13593, logging stopped at beginning
of shutdown, there is no easy way to investigate this. GMSAdapter has a app
server event listener registered. This listener handles SERVER_READY and
SERVER_SHUTDOWN. GMS leaves its group when the handler is invoked with
SERVER_SHUTDOWN. There is an info log message that would confirm if the
eventhandler is properly getting called.

Here is event handler in question from
v3/cluster/gms-adapter/GMSAdapterImpl.initializeGMS()

glassfishEventListener =
new org.glassfish.api.event.EventListener() {
@Override
public void event(Event event) {
...
if (event.is(EventTypes.SERVER_SHUTDOWN)) { logger.log(Level.INFO, "gmsservice.server_shutdown.received", ...); gms.shutdown(GMSConstants.shutdownType.INSTANCE_SHUTDOWN); events.unregister(glassfishEventListener); }


Joe Fialli added a comment - 04/Oct/10 09:19 AM

progress is being made on dependent issue 13593.
once that is fixed, we will be able to evaluate this issue.


Joe Fialli added a comment - 12/Oct/10 02:55 PM

We moved group-management-service shutdown from being invoked during SERVER_SHUTDOWN to
PREPARE_SHUTDOWN, and GMS is now properly shutting down for this test case in question.

However, the test case uses a stateful EJB to run the test. One EJB invocation last for the duration of the
test (which is 4 -5 minutes typically). However, in this case, we were shutting down one of the clustered
instances in the middle of the test. I will attach complete jstack of process after asadmin stop-instance
has timed out.

Here are two thread traces relevant to application server being in a half shutdown state.
"GlassFish Shutdown Hook" prio=10 tid=0x0000000057ca0800 nid=0x647f waiting on condition
[0x00000000491cf000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)

  • parking to wait for <0x00002aaac3933e68> (a
    java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
    at
    java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchroni
    zer.java:1925)
    at com.sun.corba.ee.impl.oa.poa.POAImpl$DestroyThread.completeDestruction(POAImpl.java:735)
    at com.sun.corba.ee.impl.oa.poa.POAImpl$DestroyThread.performDestroy(POAImpl.java:702)
    at com.sun.corba.ee.impl.oa.poa.POAImpl$DestroyThread.run(POAImpl.java:604)
    at com.sun.corba.ee.impl.oa.poa.POAImpl$DestroyThread.doIt(POAImpl.java:583)
    at com.sun.corba.ee.impl.oa.poa.POAImpl.destroy(POAImpl.java:1003)
    at
    com.sun.corba.ee.impl.oa.rfm.ReferenceFactoryManagerImpl.destroy(ReferenceFactoryManagerImpl.java:
    499)
    at com.sun.corba.ee.impl.oa.rfm.ReferenceFactoryImpl.destroy(ReferenceFactoryImpl.java:66)
    at
    org.glassfish.enterprise.iiop.impl.POARemoteReferenceFactory.destroy(POARemoteReferenceFactory.java
    :555)
    at com.sun.ejb.containers.BaseContainer.doContainerCleanup(BaseContainer.java:4353)
    at com.sun.ejb.containers.BaseContainer.onShutdown(BaseContainer.java:4234)
    at org.glassfish.ejb.startup.EjbApplication.stop(EjbApplication.java:311)
    at org.glassfish.internal.data.EngineRef.stop(EngineRef.java:169)
    at org.glassfish.internal.data.ModuleInfo.stop(ModuleInfo.java:267)
  • locked <0x00002aaabfd48378> (a org.glassfish.internal.data.ModuleInfo)
    at org.glassfish.internal.data.ApplicationInfo.stop(ApplicationInfo.java:277)
    at com.sun.enterprise.v3.server.ApplicationLifecycle.disable(ApplicationLifecycle.java:1824)
    at
    com.sun.enterprise.v3.server.ApplicationLoaderService.stopApplication(ApplicationLoaderService.java:42
    7)
    at
    com.sun.enterprise.v3.server.ApplicationLoaderService.preDestroy(ApplicationLoaderService.java:395)
    at
    com.sun.hk2.component.AbstractWombInhabitantImpl.dispose(AbstractWombInhabitantImpl.java:74)
    at com.sun.hk2.component.SingletonInhabitant.release(SingletonInhabitant.java:78)
  • locked <0x00002aaabf553330> (a com.sun.hk2.component.SingletonInhabitant)
    at com.sun.hk2.component.EventPublishingInhabitant.release(EventPublishingInhabitant.java:105)
    at com.sun.hk2.component.LazyInhabitant.release(LazyInhabitant.java:130)
  • locked <0x00002aaabf5532e0> (a com.sun.hk2.component.LazyInhabitant)
    at com.sun.enterprise.v3.server.AppServerStartup.stop(AppServerStartup.java:401)
  • locked <0x00002aaabebe69a8> (a com.sun.enterprise.v3.server.AppServerStartup)
    at com.sun.enterprise.glassfish.bootstrap.GlassFishImpl.stop(GlassFishImpl.java:88)
  • locked <0x00002aaabebe6980> (a com.sun.enterprise.glassfish.bootstrap.GlassFishImpl)
    at com.sun.enterprise.glassfish.bootstrap.GlassFishMain$Launcher$1.run(GlassFishMain.java:194)

"http-thread-pool-18080(2)" daemon prio=10 tid=0x00002aaae5374800 nid=0x6443 waiting on
condition [0x00000000483bf000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at com.sun.shoal.test.SendBean.sleep(SendBean.java:927)
at com.sun.shoal.test.SendBean.waitTillDone(SendBean.java:629)
at com.sun.shoal.test.SendBean.sendMessage(SendBean.java:171)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at
org.glassfish.ejb.security.application.EJBSecurityManager.runMethod(EJBSecurityManager.java:1052)
at org.glassfish.ejb.security.application.EJBSecurityManager.invoke(EJBSecurityManager.java:1124)
at com.sun.ejb.containers.BaseContainer.invokeBeanMethod(BaseContainer.java:5363)
at com.sun.ejb.EjbInvocation.invokeBeanMethod(EjbInvocation.java:619)
at
com.sun.ejb.containers.interceptors.AroundInvokeChainImpl.invokeNext(InterceptorManager.java:801)
at com.sun.ejb.EjbInvocation.proceed(EjbInvocation.java:571)
at org.jboss.weld.ejb.SessionBeanInterceptor.aroundInvoke(SessionBeanInterceptor.java:47)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at
com.sun.ejb.containers.interceptors.AroundInvokeInterceptor.intercept(InterceptorManager.java:862)
at
com.sun.ejb.containers.interceptors.AroundInvokeChainImpl.invokeNext(InterceptorManager.java:801)
at com.sun.ejb.containers.interceptors.InterceptorManager.intercept(InterceptorManager.java:371)
at com.sun.ejb.containers.BaseContainer.__intercept(BaseContainer.java:5335)
at com.sun.ejb.containers.BaseContainer.intercept(BaseContainer.java:5323)
at com.sun.ejb.containers.EJBObjectInvocationHandler.invoke(EJBObjectInvocationHandler.java:206)
at
com.sun.ejb.containers.EJBObjectInvocationHandlerDelegate.invoke(EJBObjectInvocationHandlerDelegate
.java:79)
at $Proxy163.sendMessage(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at
com.sun.corba.ee.impl.presentation.rmi.StubInvocationHandlerImpl.privateInvoke(StubInvocationHandle
rImpl.java:244)
at
com.sun.corba.ee.impl.presentation.rmi.StubInvocationHandlerImpl.invoke(StubInvocationHandlerImpl.j
ava:155)
at
com.sun.corba.ee.impl.presentation.rmi.codegen.CodegenStubBase.invoke(CodegenStubBase.java:229)
at
com.sun.shoal.test._SendIF_Remote_DynamicStub.sendMessage(com/sun/shoal/test/_SendIF_Remote
_DynamicStub.java)
at com.sun.shoal.test._SendIF_Wrapper.sendMessage(com/sun/shoal/test/_SendIF_Wrapper.java)
at org.apache.jsp.test_jsp._jspService(test_jsp.java from :133)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:109)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:401)
at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:483)
at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:373)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
at org.apache.catalina.core.StandardWrapper.service(StandardWrapper.java:1522)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:281)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:655)
at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:595)
at com.sun.enterprise.web.WebPipeline.invoke(WebPipeline.java:98)
at
com.sun.enterprise.web.PESessionLockingStandardPipeline.invoke(PESessionLockingStandardPipeline.jav
a:91)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:162)
at org.apache.catalina.connector.CoyoteAdapter.doService(CoyoteAdapter.java:326)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:227)
at com.sun.enterprise.v3.services.impl.ContainerMapper.service(ContainerMapper.java:170)
at com.sun.grizzly.http.ProcessorTask.invokeAdapter(ProcessorTask.java:824)
at com.sun.grizzly.http.ProcessorTask.doProcess(ProcessorTask.java:721)
at com.sun.grizzly.http.ProcessorTask.process(ProcessorTask.java:1014)
at com.sun.grizzly.http.DefaultProtocolFilter.execute(DefaultProtocolFilter.java:220)
at com.sun.grizzly.DefaultProtocolChain.executeProtocolFilter(DefaultProtocolChain.java:135)
at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:102)
at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:88)
at com.sun.grizzly.http.HttpProtocolChain.execute(HttpProtocolChain.java:76)
at com.sun.grizzly.ProtocolChainContextTask.doCall(ProtocolChainContextTask.java:53)
at com.sun.grizzly.SelectionKeyContextTask.call(SelectionKeyContextTask.java:57)
at com.sun.grizzly.ContextTask.run(ContextTask.java:69)
at com.sun.grizzly.util.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:530)
at com.sun.grizzly.util.AbstractThreadPool$Worker.run(AbstractThreadPool.java:511)
at java.lang.Thread.run(Thread.java:619)

One could simply recreate this as an ejb unit test by creating a stateful EJB with one method
that sleeps for 10 minutes. The asadmin stop-instance while the method is still being invoked
will time out. Should a wedged EJB invocation being stopping server shutdown?

We realize that the test EJB question needs some work, but we did want to allow someone from ejb-
container to evaluate impact of pending EJB invocation on shutting down the server.


Joe Fialli added a comment - 12/Oct/10 02:58 PM

Created an attachment (id=5131)
jstack of threads still running after asadmin stop-instance has timed out trying to stop an app server with an active EJB invocation


kenaiadmin made changes - 26/Nov/10 12:15 AM
Field Original Value New Value
issue.field.bugzillaimportkey 13592 45196
marina vatkina made changes - 08/Dec/10 03:36 PM
Fix Version/s future release [ 11148 ]
Fix Version/s 3.2 [ 10969 ]