GLASSFISH-20462

DAS is slow to stop if JMX RMI bind URL is no longer reachable when stop-domain is run

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 4.0_b87_RC3
    • Fix Version/s: future release
    • Component/s: amx
    • Labels: None

      Description

      Occasionally, the asadmin stop-domain command fails to stop the server. It times out after 60 seconds waiting for the DAS to stop:

      $ asadmin stop-domain
      Waiting for the domain to stop ........................................................
      Timed out (60 seconds) waiting for the domain to stop.
      Command stop-domain failed.

      Here is some of the jstack output from the DAS after this occurs. The problem appears to be in the thread that is running the stop-domain command:

      "Thread-22" daemon prio=5 tid=0x00007fcbf92b2000 nid=0xd233 runnable [0x000000013d324000]
         java.lang.Thread.State: RUNNABLE
          at java.net.PlainSocketImpl.socketConnect(Native Method)
          at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
          - locked <0x000000012cd628d0> (a java.net.SocksSocketImpl)
          at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
          at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
          at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:391)
          at java.net.Socket.connect(Socket.java:579)
          at java.net.Socket.connect(Socket.java:528)
          at java.net.Socket.<init>(Socket.java:425)
          at java.net.Socket.<init>(Socket.java:208)
          at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40)
          at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:146)
          at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613)
          at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216)
          at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202)
          at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:340)
          at sun.rmi.registry.RegistryImpl_Stub.unbind(Unknown Source)
          at com.sun.jndi.rmi.registry.RegistryContext.unbind(RegistryContext.java:173)
          at com.sun.jndi.toolkit.url.GenericURLContext.unbind(GenericURLContext.java:272)
          at javax.naming.InitialContext.unbind(InitialContext.java:435)
          at javax.naming.InitialContext.unbind(InitialContext.java:435)
          at javax.management.remote.rmi.RMIConnectorServer.stop(RMIConnectorServer.java:561)
          at org.glassfish.admin.mbeanserver.ConnectorStarter.stop(ConnectorStarter.java:125)
          - locked <0x0000000118b93bf8> (a org.glassfish.admin.mbeanserver.RMIConnectorStarter)
          at org.glassfish.admin.mbeanserver.RMIConnectorStarter.stopAndUnexport(RMIConnectorStarter.java:310)
          at org.glassfish.admin.mbeanserver.JMXStartupService$JMXConnectorsStarterThread.shutdown(JMXStartupService.java:245)
          at org.glassfish.admin.mbeanserver.JMXStartupService.shutdown(JMXStartupService.java:193)
          - locked <0x00000001184d4ca0> (a org.glassfish.admin.mbeanserver.JMXStartupService)
          at org.glassfish.admin.mbeanserver.JMXStartupService.access$000(JMXStartupService.java:96)
          at org.glassfish.admin.mbeanserver.JMXStartupService$ShutdownListener.event(JMXStartupService.java:163)
          at org.glassfish.kernel.event.EventsImpl.send(EventsImpl.java:131)
          at com.sun.enterprise.v3.server.AppServerStartup.stop(AppServerStartup.java:478)
          - locked <0x0000000118773f38> (a com.sun.enterprise.v3.server.AppServerStartup)
          at com.sun.enterprise.glassfish.bootstrap.GlassFishImpl.stop(GlassFishImpl.java:88)
          - locked <0x00000001187835f8> (a com.sun.enterprise.glassfish.bootstrap.GlassFishImpl)
          at com.sun.enterprise.glassfish.bootstrap.GlassFishDecorator.stop(GlassFishDecorator.java:68)
          at com.sun.enterprise.glassfish.bootstrap.osgi.EmbeddedOSGiGlassFishImpl.stop(EmbeddedOSGiGlassFishImpl.java:82)
          at com.sun.enterprise.v3.admin.StopServer.doExecute(StopServer.java:79)
          at com.sun.enterprise.v3.admin.StopDomainCommand.execute(StopDomainCommand.java:96)
          at com.sun.enterprise.v3.admin.CommandRunnerImpl$2$1.run(CommandRunnerImpl.java:527)
          at com.sun.enterprise.v3.admin.CommandRunnerImpl$2$1.run(CommandRunnerImpl.java:523)
          at java.security.AccessController.doPrivileged(Native Method)
          at javax.security.auth.Subject.doAs(Subject.java:356)
          at com.sun.enterprise.v3.admin.CommandRunnerImpl$2.execute(CommandRunnerImpl.java:522)
          at org.glassfish.api.AsyncImpl$1$1.run(AsyncImpl.java:76)

      This problem seems to occur after the server has been up for a while (for example overnight).
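
The trace shows the thread running stop-domain blocked in a socket connect underneath RMIConnectorServer.stop(). As an illustration only, the sketch below shows one way a shutdown step like this could be prevented from stalling the rest of the stop sequence: run it on a daemon thread and wait for a bounded time. BoundedStop and runWithTimeout are hypothetical names, not GlassFish code, and this is not a fix adopted for this issue.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper (not part of GlassFish): bounds how long a shutdown step may block.
final class BoundedStop {

    /** Runs the step on a daemon thread; returns true only if it completed normally in time. */
    static boolean runWithTimeout(Runnable step, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-stop");
            t.setDaemon(true); // a step that is still blocked must not keep the JVM alive
            return t;
        });
        Future<?> future = executor.submit(step);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // Best effort only: an interrupt may not unblock a thread stuck in Socket.connect.
            future.cancel(true);
            return false;
        } catch (ExecutionException e) {
            return false; // the step itself threw
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

With a helper like this, a caller could give the JMX connector stop a few seconds and then continue shutting down, leaving the abandoned daemon thread to end when the JVM exits.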

        Activity

        Tom Mueller added a comment -

        In this situation, the DAS does eventually exit. Apparently, the Socket.connect call eventually times out and the server finishes exiting. So the mystery here is why javax.naming.InitialContext.unbind is making a socket connection during shutdown.
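
One likely explanation, consistent with the trace: when an RMIConnectorServer is started with a JNDI-form address such as service:jmx:rmi:///jndi/rmi://host:port/jmxrmi, start() binds the connector stub under that name and stop() unbinds it again. Unbinding from an rmi:// registry URL is a remote call to host:port, so it needs a socket. The sketch below demonstrates the bind and unbind against a throwaway local registry. It assumes the DAS connector uses an address of this form; UnbindOnStopDemo and the way the port is chosen are illustrative, not GlassFish code.

```java
import java.lang.management.ManagementFactory;
import java.net.ServerSocket;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;
import java.util.Arrays;
import javax.management.remote.JMXConnectorServer;
import javax.management.remote.JMXConnectorServerFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical demo class: shows that stopping a JNDI-bound connector server
// removes its registry entry, which requires contacting the registry.
final class UnbindOnStopDemo {

    /** Returns {bound after start(), bound after stop()} for the name "jmxrmi". */
    static boolean[] run() throws Exception {
        int port;
        try (ServerSocket free = new ServerSocket(0)) {
            port = free.getLocalPort(); // pick a free port for the throwaway registry
        }
        Registry registry = LocateRegistry.createRegistry(port);
        try {
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://localhost:" + port + "/jmxrmi");
            JMXConnectorServer server = JMXConnectorServerFactory.newJMXConnectorServer(
                    url, null, ManagementFactory.getPlatformMBeanServer());

            server.start(); // binds the stub under "jmxrmi" via JNDI
            boolean boundAfterStart = Arrays.asList(registry.list()).contains("jmxrmi");

            server.stop();  // unbinds "jmxrmi" through the rmi:// URL
            boolean boundAfterStop = Arrays.asList(registry.list()).contains("jmxrmi");

            return new boolean[] {boundAfterStart, boundAfterStop};
        } finally {
            UnicastRemoteObject.unexportObject(registry, true); // release the registry port
        }
    }
}
```

Here the registry is on localhost, so the unbind returns immediately. If the host in the URL is no longer reachable, the same call has to wait for the connect attempt to fail.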

        Tom Mueller added a comment -

        To reproduce this problem, start the server while connected to the VPN (so the server's hostname/IP address is the VPN-assigned one). While the server is running, disconnect from the VPN so that the IP address is no longer associated with the host. Then run stop-domain. (This reproduces the problem only some of the time.)

        The JMX unbind request is being done on a URL like this:

        "rmi://192.168.0.16:8686/jmxrmi"

        or sometimes, like this:

        "rmi://dhcp-adc-twvpn-3-vpnpool-10-154-105-156.vpn.oracle.com:8686/jmxrmi"

        In the first case, the URL uses the IP address of my host on the local LAN. In the second case, it uses the hostname assigned to the VPN address. If the JMX URL is of this latter kind and the host is disconnected from the VPN when stop-domain is run, the unbind request tries to connect to the URL and fails only after a long timeout. This makes the DAS take a long time to exit, so stop-domain times out.
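
To illustrate the difference between a long default connect timeout and a bounded one, the sketch below checks whether the registry endpoint named in an rmi:// URL accepts a TCP connection within a caller-supplied limit. RegistryProbe and isReachable are hypothetical names, not GlassFish or JDK APIs. A check like this could in principle be made before attempting the unbind, but that is an assumption, not the resolution of this issue.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.URI;

// Hypothetical helper: probes the host:port of an rmi:// URL with an explicit connect timeout.
final class RegistryProbe {

    private static final int DEFAULT_REGISTRY_PORT = 1099; // java.rmi.registry.Registry.REGISTRY_PORT

    /** Returns true if a TCP connection to the URL's endpoint succeeds within timeoutMillis. */
    static boolean isReachable(String rmiUrl, int timeoutMillis) {
        URI uri = URI.create(rmiUrl);
        String host = uri.getHost() != null ? uri.getHost() : "localhost";
        int port = uri.getPort() != -1 ? uri.getPort() : DEFAULT_REGISTRY_PORT;
        try (Socket socket = new Socket()) {
            // The timeout bounds the connect only; the name lookup performed when the
            // address is created is not covered by it.
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false; // refused, timed out, or host name could not be resolved
        }
    }
}
```

For example, RegistryProbe.isReachable("rmi://192.168.0.16:8686/jmxrmi", 2000) would give up after about two seconds if that address no longer routes anywhere, instead of waiting for the platform's default connect timeout.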

        Tom Mueller added a comment -

        Lowering the priority, deferring to a future release and changing the component to AMX.


          People

          • Assignee: prasads
          • Reporter: Tom Mueller
          • Votes: 0
          • Watchers: 1
