[JIRA] Closed: (COHINC-48) Calling CacheFactory.ensureCluster() on CommandExecutor.stop() hangs

  • From: "brianoliver (JIRA)" < >
  • To:
  • Subject: [JIRA] Closed: (COHINC-48) Calling CacheFactory.ensureCluster() on CommandExecutor.stop() hangs
  • Date: Thu, 18 Apr 2013 13:37:53 +0000 (GMT+00:00)
  • Auto-submitted: auto-generated


     [ http://java.net/jira/browse/COHINC-48?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

brianoliver closed COHINC-48.
-----------------------------

    Resolution: Fixed

> Calling CacheFactory.ensureCluster() on CommandExecutor.stop() hangs
> --------------------------------------------------------------------
>
>                 Key: COHINC-48
>                 URL: http://java.net/jira/browse/COHINC-48
>             Project: Coherence Incubator
>          Issue Type: Bug
>          Components: Module: Command Pattern
>    Affects Versions: (Legacy) Incubator 10 Patch 6, 11.0.0
>         Environment: * Command Pattern 2.8.5
> * Cluster with three members (Tomcat 7 + Coherence 3.7). 
> * All of them running the command pattern distributed service with a single 
> thread.
>            Reporter: guillermo.garcia-ochoa
>            Assignee: brianoliver
>             Fix For: 11.0.1-SNAPSHOT, 12.0.0-SNAPSHOT
>
>
> To redeploy a cluster member correctly, you need to detach the member from 
> the cluster programmatically by calling _CacheFactory.shutdown()_ when the 
> application is undeployed. _CacheFactory.shutdown()_ then calls the 
> _stop()_ methods of the distributed services that run on the leaving 
> member. 
> Because the _stop()_ method is run by a service thread, no reentrant 
> service calls should be made inside _stop()_, to avoid deadlocks.
> *THE PROBLEM*
> _CommandExecutor.stop()_ calls _CacheFactory.ensureCluster()_, which is a 
> service call within a service call (thus, a reentrant call):
> {code}
> public void stop() {
>       if (Logger.isEnabled(Logger.DEBUG)) Logger.log(Logger.DEBUG, "Stopping CommandExecutor for %s", contextIdentifier);
>
>       //stop immediately
>       setState(State.Stopped);
>
>       //this CommandExecutor must not be available any further to other threads
>       CommandExecutorManager.removeCommandExecutor(this.getContextIdentifier());
>
>       //unregister JMX mbean for the CommandExecutor
>       Registry registry = CacheFactory.ensureCluster().getManagement(); // THIS IS THE SERVICE CALL
>       if (registry != null) {
>               if (Logger.isEnabled(Logger.DEBUG)) Logger.log(Logger.DEBUG, "Unregistering JMX management extensions for CommandExecutor %s", contextIdentifier);
>               registry.unregister(getMBeanName());
>       }
>
>       if (Logger.isEnabled(Logger.DEBUG)) Logger.log(Logger.DEBUG, "Stopped CommandExecutor for %s", contextIdentifier);
> }
> {code}
> If the distributed service used to support the command pattern is 
> configured with a single thread (as it is by default), this call produces 
> a deadlock, with a thread dump like this:
> {code}
> Thread[DistributedCache:DistributedCacheForCommandPattern|SERVICE_STOPPING,5,Cluster]
>       com.tangosol.net.CacheFactory.ensureCluster(CacheFactory.java:424)
>       com.oracle.coherence.patterns.command.internal.CommandExecutor.stop(CommandExecutor.java:671)
>         ...
> {code}
> *DIAGNOSIS (AND POTENTIAL SOLUTION)*
> I changed the _CommandExecutor.stop()_ method to use a non-blocking call 
> to obtain the _Cluster_:
> {code}
> Registry registry = CacheFactory.getCluster() != null ? CacheFactory.getCluster().getManagement() : null;
> {code}
> Because _CacheFactory.getCluster()_ is not a blocking service call, the 
> deadlock is avoided.
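
The hang described above can be reproduced in miniature with plain JDK classes. The sketch below is only an analogy, not Coherence code: a single-threaded ExecutorService stands in for the single-threaded distributed service, and the names ReentrantStopSketch, CLUSTER, blockingStopHangs and nonBlockingStopCompletes are hypothetical. The first method models a stop() that waits on work queued behind itself on the same thread; the second models the proposed fix, which only reads an already-available reference and null-checks it.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicReference;

public class ReentrantStopSketch {
    // Hypothetical stand-in for the already-running Cluster instance.
    static final AtomicReference<String> CLUSTER = new AtomicReference<>("cluster");

    /** Returns true if a stop() that blocks on its own service thread does not finish in time. */
    public static boolean blockingStopHangs() throws Exception {
        ExecutorService service = Executors.newSingleThreadExecutor();
        try {
            Callable<String> stopTask = () -> {
                // Reentrant call: this lookup is queued behind the task that waits for it.
                Callable<String> lookup = CLUSTER::get;
                Future<String> inner = service.submit(lookup);
                return inner.get(); // the only worker thread is busy here, so this never returns
            };
            Future<String> stop = service.submit(stopTask);
            try {
                stop.get(500, TimeUnit.MILLISECONDS);
                return false; // finished: no hang observed
            } catch (TimeoutException e) {
                return true;  // still stuck after the timeout
            }
        } finally {
            service.shutdownNow(); // interrupt the stuck task so the JVM can exit
        }
    }

    /** Returns true if a stop() that only reads existing state completes normally. */
    public static boolean nonBlockingStopCompletes() throws Exception {
        ExecutorService service = Executors.newSingleThreadExecutor();
        try {
            Callable<String> stopTask = () -> {
                String cluster = CLUSTER.get(); // plain read, nothing submitted to the service
                return cluster != null ? "unregistered" : "skipped";
            };
            Future<String> stop = service.submit(stopTask);
            return "unregistered".equals(stop.get(5, TimeUnit.SECONDS));
        } finally {
            service.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("blocking stop hangs: " + blockingStopHangs());
        System.out.println("non-blocking stop completes: " + nonBlockingStopCompletes());
    }
}
```

The timeout exists only so the demonstration terminates; in the reported case there is no timeout, so the service thread stays in SERVICE_STOPPING indefinitely.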

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://java.net/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        


[JIRA] Created: (COHINC-48) Calling CacheFactory.ensureCluster() on CommandExecutor.stop() hangs

guillermo.garcia-ochoa (JIRA) 04/17/2013

[JIRA] Commented: (COHINC-48) Calling CacheFactory.ensureCluster() on CommandExecutor.stop() hangs

brianoliver (JIRA) 04/18/2013

[JIRA] Work started: (COHINC-48) Calling CacheFactory.ensureCluster() on CommandExecutor.stop() hangs

brianoliver (JIRA) 04/18/2013

[JIRA] Updated: (COHINC-48) Calling CacheFactory.ensureCluster() on CommandExecutor.stop() hangs

brianoliver (JIRA) 04/18/2013

[JIRA] Commented: (COHINC-48) Calling CacheFactory.ensureCluster() on CommandExecutor.stop() hangs

guillermo.garcia-ochoa (JIRA) 04/18/2013

[JIRA] Closed: (COHINC-48) Calling CacheFactory.ensureCluster() on CommandExecutor.stop() hangs

brianoliver (JIRA) 04/18/2013