glassfish
  1. glassfish
  2. GLASSFISH-18178

[PERF] 65% regression in trade2 in-memory replication benchmark on Linux with high thread contention in grizzly OutboundConnectionCacheBlockingImpl

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: 3.1.2_b17
    • Fix Version/s: None
    • Component/s: performance
    • Labels:
      None
    • Environment:

      Linux

      Description

      Latest promoted build b17 shows high thread contention in grizzly code-path while running trade2 in-memory replication benchmark on Linux platform. This results into 65% regression. Below is the call stack showing contention,

      java.lang.Thread.State: BLOCKED (on object monitor)
      at com.sun.grizzly.connectioncache.impl.transport.OutboundConnectionCacheBlockingImpl.get(OutboundConnectionCacheBlockingImpl.java:422)

      • waiting to lock <0x00000006ee0499c8> (a com.sun.grizzly.connectioncache.impl.transport.OutboundConnectionCacheBlockingImpl)
        at com.sun.grizzly.connectioncache.client.CacheableConnectorHandler.doConnect(CacheableConnectorHandler.java:167)
        at com.sun.grizzly.connectioncache.client.CacheableConnectorHandler.connect(CacheableConnectorHandler.java:120)
        at com.sun.enterprise.mgmt.transport.grizzly.grizzly1_9.GrizzlyTCPConnectorWrapper.send(GrizzlyTCPConnectorWrapper.java:121)
        at com.sun.enterprise.mgmt.transport.grizzly.grizzly1_9.GrizzlyTCPConnectorWrapper.doSend(GrizzlyTCPConnectorWrapper.java:96)
        at com.sun.enterprise.mgmt.transport.AbstractMessageSender.send(AbstractMessageSender.java:74)
        at com.sun.enterprise.mgmt.transport.grizzly.GrizzlyNetworkManager.send(GrizzlyNetworkManager.java:288)
        at com.sun.enterprise.mgmt.ClusterManager.send(ClusterManager.java:395)
        at com.sun.enterprise.mgmt.ClusterManager.send(ClusterManager.java:419)
        at com.sun.enterprise.ee.cms.impl.base.GroupCommunicationProviderImpl.sendMessage(GroupCommunicationProviderImpl.java:353)
        at com.sun.enterprise.ee.cms.impl.base.GroupHandleImpl.sendMessage(GroupHandleImpl.java:142)
        at org.shoal.ha.group.gms.GroupServiceProvider.sendMessage(GroupServiceProvider.java:268)
        at org.shoal.ha.cache.impl.interceptor.TransmitInterceptor.onTransmit(TransmitInterceptor.java:83)
        at org.shoal.ha.cache.api.AbstractCommandInterceptor.onTransmit(AbstractCommandInterceptor.java:98)
        at org.shoal.ha.cache.impl.interceptor.ReplicationCommandTransmitterManager.onTransmit(ReplicationCommandTransmitterManager.java:86)
        at org.shoal.ha.cache.api.AbstractCommandInterceptor.onTransmit(AbstractCommandInterceptor.java:98)
        at org.shoal.ha.cache.impl.interceptor.CommandHandlerInterceptor.onTransmit(CommandHandlerInterceptor.java:74)
        at org.shoal.ha.cache.impl.command.CommandManager.executeCommand(CommandManager.java:122)
        at org.shoal.ha.cache.impl.command.CommandManager.execute(CommandManager.java:114)
        at org.shoal.ha.cache.impl.interceptor.ReplicationCommandTransmitterWithMap$BatchedCommandMapDataFrame.run(ReplicationCommandTransmitterWithMap.java:316)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)

      java.lang.Thread.State: TIMED_WAITING (on object monitor)
      at java.lang.Object.wait(Native Method)

      • waiting on <0x00000007a7a77518> (a java.lang.Object)
        at com.sun.grizzly.util.FutureImpl.get(FutureImpl.java:142)
      • locked <0x00000007a7a77518> (a java.lang.Object)
        at com.sun.grizzly.TCPConnectorHandler.connect(TCPConnectorHandler.java:188)
        at com.sun.grizzly.TCPConnectorHandler.connect(TCPConnectorHandler.java:92)
        at com.sun.grizzly.AbstractConnectorHandler.connect(AbstractConnectorHandler.java:228)
        at com.sun.grizzly.connectioncache.client.CacheableConnectorHandler$ConnectExecutor.invokeProtocolConnect(CacheableConnectorHandler.java:399)
        at com.sun.grizzly.connectioncache.client.CacheableConnectorHandler$CacheableConnectorInfo.createConnection(CacheableConnectorHandler.java:436)
        at com.sun.grizzly.connectioncache.client.CacheableConnectorHandler$CacheableConnectorInfo.createConnection(CacheableConnectorHandler.java:409)
        at com.sun.enterprise.mgmt.transport.grizzly.grizzly1_9.GrizzlyNetworkManager1_9$1.find(GrizzlyNetworkManager1_9.java:154)
        at com.sun.enterprise.mgmt.transport.grizzly.grizzly1_9.GrizzlyNetworkManager1_9$1.find(GrizzlyNetworkManager1_9.java:142)
        at com.sun.grizzly.connectioncache.impl.transport.OutboundConnectionCacheBlockingImpl.get(OutboundConnectionCacheBlockingImpl.java:446)
      • locked <0x00000006ee0499c8> (a com.sun.grizzly.connectioncache.impl.transport.OutboundConnectionCacheBlockingImpl)
        at com.sun.grizzly.connectioncache.client.CacheableConnectorHandler.doConnect(CacheableConnectorHandler.java:167)
        at com.sun.grizzly.connectioncache.client.CacheableConnectorHandler.connect(CacheableConnectorHandler.java:120)
        at com.sun.enterprise.mgmt.transport.grizzly.grizzly1_9.GrizzlyTCPConnectorWrapper.send(GrizzlyTCPConnectorWrapper.java:121)
        at com.sun.enterprise.mgmt.transport.grizzly.grizzly1_9.GrizzlyTCPConnectorWrapper.doSend(GrizzlyTCPConnectorWrapper.java:96)
        at com.sun.enterprise.mgmt.transport.AbstractMessageSender.send(AbstractMessageSender.java:74)
        at com.sun.enterprise.mgmt.transport.grizzly.GrizzlyNetworkManager.send(GrizzlyNetworkManager.java:288)
        at com.sun.enterprise.mgmt.ClusterManager.send(ClusterManager.java:395)
        at com.sun.enterprise.mgmt.ClusterManager.send(ClusterManager.java:419)
        at com.sun.enterprise.ee.cms.impl.base.GroupCommunicationProviderImpl.sendMessage(GroupCommunicationProviderImpl.java:353)
        at com.sun.enterprise.ee.cms.impl.base.GroupHandleImpl.sendMessage(GroupHandleImpl.java:142)
        at org.shoal.ha.group.gms.GroupServiceProvider.sendMessage(GroupServiceProvider.java:260)
        at org.shoal.ha.cache.impl.interceptor.TransmitInterceptor.onTransmit(TransmitInterceptor.java:83)
        at org.shoal.ha.cache.api.AbstractCommandInterceptor.onTransmit(AbstractCommandInterceptor.java:98)
        at org.shoal.ha.cache.impl.interceptor.ReplicationCommandTransmitterManager.onTransmit(ReplicationCommandTransmitterManager.java:86)
        at org.shoal.ha.cache.api.AbstractCommandInterceptor.onTransmit(AbstractCommandInterceptor.java:98)
        at org.shoal.ha.cache.impl.interceptor.CommandHandlerInterceptor.onTransmit(CommandHandlerInterceptor.java:74)
        at org.shoal.ha.cache.impl.command.CommandManager.executeCommand(CommandManager.java:122)
        at org.shoal.ha.cache.impl.command.CommandManager.execute(CommandManager.java:114)
        at org.shoal.ha.cache.impl.interceptor.ReplicationCommandTransmitterWithMap$BatchedCommandMapDataFrame.run(ReplicationCommandTransmitterWithMap.java:316)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
      1. server.log_2012-01-17T23-50-34
        1.95 MB
        amitagarwal

        Activity

        Hide
        Joe Fialli added a comment -

        Thanks for the server.log context with GMS1703 WARNING. There was not enough context.
        I have added a ShoalLogger.mcast FINE message that contains the stack trace when this happens
        to the attached shoal-gms-impl.jar. Just copy this jar into a glassfish build expecting shoal 1.6.17.
        The shoal-cache.jar needs to be the original one so we can track what is going wrong.

        My hunch is that there is a computation that is suppose to return a non-null ID value and it is
        returning null. If one sends a message to the null target, then it converts the sendMessage from
        a point to point message to a broadcast. So I believe a sendMessage of a session is occuring
        to a null destination.

        Show
        Joe Fialli added a comment - Thanks for the server.log context with GMS1703 WARNING. There was not enough context. I have added a ShoalLogger.mcast FINE message that contains the stack trace when this happens to the attached shoal-gms-impl.jar. Just copy this jar into a glassfish build expecting shoal 1.6.17. The shoal-cache.jar needs to be the original one so we can track what is going wrong. My hunch is that there is a computation that is suppose to return a non-null ID value and it is returning null. If one sends a message to the null target, then it converts the sendMessage from a point to point message to a broadcast. So I believe a sendMessage of a session is occuring to a null destination.
        Hide
        Joe Fialli added a comment - - edited

        copy this jar[1] into glassfish3/glassfish/modules in a glassfish build expecting shoal 1.6.17.
        add following logging line after cluster is created and before test starts to run.

        [note: replace YourClusterName with the application's cluster name.]

        % asadmin set-log-levels --target YourClusterName ShoalLogger.mcast=FINE

        When the above is set, we will get a stack trace with every GMS1073 message that
        identifies the context that is broadcasting a message that is too large.

        [1] - http://java.net/jira/secure/attachment/48986/shoal-gms-impl.jar

        Show
        Joe Fialli added a comment - - edited copy this jar [1] into glassfish3/glassfish/modules in a glassfish build expecting shoal 1.6.17. add following logging line after cluster is created and before test starts to run. [note: replace YourClusterName with the application's cluster name.] % asadmin set-log-levels --target YourClusterName ShoalLogger.mcast=FINE When the above is set, we will get a stack trace with every GMS1073 message that identifies the context that is broadcasting a message that is too large. [1] - http://java.net/jira/secure/attachment/48986/shoal-gms-impl.jar
        Hide
        amitagarwal added a comment -

        Configuring IPv4 address instead of IPv6 for GMS traffic resolved this. We no longer see contention reported. Hence closing this. Thanks Joe for debugging this.

        Show
        amitagarwal added a comment - Configuring IPv4 address instead of IPv6 for GMS traffic resolved this. We no longer see contention reported. Hence closing this. Thanks Joe for debugging this.
        Hide
        Joe Fialli added a comment -

        Shoal/GMS was incorrectly selecting a non-native virtual network interface that happened to be IPv6 only.
        (In this case, the configuration was introduced by XEN).

        Workaround is described at following link.
        http://java.net/jira/browse/GLASSFISH-18024?focusedCommentId=328155&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_328155.

        Show
        Joe Fialli added a comment - Shoal/GMS was incorrectly selecting a non-native virtual network interface that happened to be IPv6 only. (In this case, the configuration was introduced by XEN). Workaround is described at following link. http://java.net/jira/browse/GLASSFISH-18024?focusedCommentId=328155&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_328155 .
        Hide
        Joe Fialli added a comment -

        Here is output of shoal gms NetworkUtility diagnostics to capture the multiple network interfaces involved.
        (similar to ifconfig but using java.NetworkInterface API for output)

        $ /export/VMs/jdk1.6.0_29/bin/java -classpath shoal-gms-impl.jar com.sun.enterprise.mgmt.transport.NetworkUtility
        Java property java.net.preferIPv6Addresses=false
        AllLocalAddresses() = [/fe80:0:0:0:fcff:ffff:feff:ffff%8, /fe80:0:0:0:21e:68ff:fec5:9f42%9, /10.13X.185.17, /fe80:0:0:0:200:ff:fe00:0%7, /192.168.122.1]
        getFirstNetworkInterface() = name:vif0.0 (vif0.0)
        getFirstInetAddress(preferIPv6Addresses:false)=null
        getFirstInetAddress()=/fe80:0:0:0:fcff:ffff:feff:ffff%8
        getFirstInetAddress( true ) = /fe80:0:0:0:fcff:ffff:feff:ffff%8
        getFirstInetAddress( false ) = null
        getLocalHostAddress = XXXX-x4150-1/10.13X.185.17
        getFirstNetworkInteface() = name:vif0.0 (vif0.0)
        getNetworkInetAddress(firstNetworkInteface, true) = /fe80:0:0:0:fcff:ffff:feff:ffff%8
        getNetworkInetAddress(firstNetworkInteface, false) = null

        -------------------------------------------------------

        All Network Interfaces

        **************************************************
        Display name: vif0.0
        Name: vif0.0
        PreferIPv6Addresses: false
        InetAddress: /fe80:0:0:0:fcff:ffff:feff:ffff%8
        Up? true
        Loopback? false
        PointToPoint? false
        Supports multicast? false
        Virtual? false
        Hardware address: [-2, -1, -1, -1, -1, -1]
        MTU: 1500
        Network Inet Address (preferIPV6=false) null
        Network Inet Address (preferIPV6=true) /fe80:0:0:0:fcff:ffff:feff:ffff%8
        resolveBindInterfaceName(vif0.0)=fe80:0:0:0:fcff:ffff:feff:ffff%8

        **************************************************
        Display name: peth0
        Name: peth0
        PreferIPv6Addresses: false
        InetAddress: /fe80:0:0:0:fcff:ffff:feff:ffff%2
        Up? true
        Loopback? false
        PointToPoint? false
        Supports multicast? false
        Virtual? false
        Hardware address: [-2, -1, -1, -1, -1, -1]
        MTU: 1500
        Network Inet Address (preferIPV6=false) null
        Network Inet Address (preferIPV6=true) /fe80:0:0:0:fcff:ffff:feff:ffff%2
        resolveBindInterfaceName(peth0)=fe80:0:0:0:fcff:ffff:feff:ffff%2

        **************************************************
        Display name: eth0
        Name: eth0
        PreferIPv6Addresses: false
        InetAddress: /fe80:0:0:0:21e:68ff:fec5:9f42%9
        InetAddress: /10.133.185.17
        Up? true
        Loopback? false
        PointToPoint? false
        Supports multicast? true
        Virtual? false
        Hardware address: [0, 30, 104, -59, -97, 66]
        MTU: 1500
        Network Inet Address (preferIPV6=false) /10.133.185.17
        Network Inet Address (preferIPV6=true) /fe80:0:0:0:21e:68ff:fec5:9f42%9
        resolveBindInterfaceName(eth0)=127.0.0.1

        **************************************************
        Display name: virbr0
        Name: virbr0
        PreferIPv6Addresses: false
        InetAddress: /fe80:0:0:0:200:ff:fe00:0%7
        InetAddress: /192.168.122.1
        Up? true
        Loopback? false
        PointToPoint? false
        Supports multicast? true
        Virtual? false
        Hardware address: null
        MTU: 1500
        Network Inet Address (preferIPV6=false) /192.168.122.1
        Network Inet Address (preferIPV6=true) /fe80:0:0:0:200:ff:fe00:0%7
        resolveBindInterfaceName(virbr0)=192.168.122.1

        **************************************************
        Display name: lo
        Name: lo
        PreferIPv6Addresses: false
        InetAddress: /0:0:0:0:0:0:0:1%1
        InetAddress: /127.0.0.1
        Up? true
        Loopback? true
        PointToPoint? false
        Supports multicast? false
        Virtual? false
        Hardware address: null
        MTU: 16436
        Network Inet Address (preferIPV6=false) /127.0.0.1
        Network Inet Address (preferIPV6=true) /0:0:0:0:0:0:0:1%1
        resolveBindInterfaceName(lo)=127.0.0.1
        [jfialli@perf-x4150-1 modules]$ /export/VMs/jdk1.6.0_29/bin/java -classpath shoal-gms-impl.jar com.sun.enterprise.mgmt.transport.NetworkUtility
        Java property java.net.preferIPv6Addresses=false
        AllLocalAddresses() = [/fe80:0:0:0:fcff:ffff:feff:ffff%8, /fe80:0:0:0:21e:68ff:fec5:9f42%9, /10.133.185.17, /fe80:0:0:0:200:ff:fe00:0%7, /192.168.122.1]
        getFirstNetworkInterface() = name:vif0.0 (vif0.0)
        getFirstInetAddress(preferIPv6Addresses:false)=null
        getFirstInetAddress()=/fe80:0:0:0:fcff:ffff:feff:ffff%8
        getFirstInetAddress( true ) = /fe80:0:0:0:fcff:ffff:feff:ffff%8
        getFirstInetAddress( false ) = null
        getLocalHostAddress = perf-x4150-1/10.133.185.17
        getFirstNetworkInteface() = name:vif0.0 (vif0.0)
        getNetworkInetAddress(firstNetworkInteface, true) = /fe80:0:0:0:fcff:ffff:feff:ffff%8
        getNetworkInetAddress(firstNetworkInteface, false) = null

        Show
        Joe Fialli added a comment - Here is output of shoal gms NetworkUtility diagnostics to capture the multiple network interfaces involved. (similar to ifconfig but using java.NetworkInterface API for output) $ /export/VMs/jdk1.6.0_29/bin/java -classpath shoal-gms-impl.jar com.sun.enterprise.mgmt.transport.NetworkUtility Java property java.net.preferIPv6Addresses=false AllLocalAddresses() = [/fe80:0:0:0:fcff:ffff:feff:ffff%8, /fe80:0:0:0:21e:68ff:fec5:9f42%9, /10.13X.185.17, /fe80:0:0:0:200:ff:fe00:0%7, /192.168.122.1] getFirstNetworkInterface() = name:vif0.0 (vif0.0) getFirstInetAddress(preferIPv6Addresses:false)=null getFirstInetAddress()=/fe80:0:0:0:fcff:ffff:feff:ffff%8 getFirstInetAddress( true ) = /fe80:0:0:0:fcff:ffff:feff:ffff%8 getFirstInetAddress( false ) = null getLocalHostAddress = XXXX-x4150-1/10.13X.185.17 getFirstNetworkInteface() = name:vif0.0 (vif0.0) getNetworkInetAddress(firstNetworkInteface, true) = /fe80:0:0:0:fcff:ffff:feff:ffff%8 getNetworkInetAddress(firstNetworkInteface, false) = null ------------------------------------------------------- All Network Interfaces ************************************************** Display name: vif0.0 Name: vif0.0 PreferIPv6Addresses: false InetAddress: /fe80:0:0:0:fcff:ffff:feff:ffff%8 Up? true Loopback? false PointToPoint? false Supports multicast? false Virtual? false Hardware address: [-2, -1, -1, -1, -1, -1] MTU: 1500 Network Inet Address (preferIPV6=false) null Network Inet Address (preferIPV6=true) /fe80:0:0:0:fcff:ffff:feff:ffff%8 resolveBindInterfaceName(vif0.0)=fe80:0:0:0:fcff:ffff:feff:ffff%8 ************************************************** Display name: peth0 Name: peth0 PreferIPv6Addresses: false InetAddress: /fe80:0:0:0:fcff:ffff:feff:ffff%2 Up? true Loopback? false PointToPoint? false Supports multicast? false Virtual? false Hardware address: [-2, -1, -1, -1, -1, -1] MTU: 1500 Network Inet Address (preferIPV6=false) null Network Inet Address (preferIPV6=true) /fe80:0:0:0:fcff:ffff:feff:ffff%2 resolveBindInterfaceName(peth0)=fe80:0:0:0:fcff:ffff:feff:ffff%2 ************************************************** Display name: eth0 Name: eth0 PreferIPv6Addresses: false InetAddress: /fe80:0:0:0:21e:68ff:fec5:9f42%9 InetAddress: /10.133.185.17 Up? true Loopback? false PointToPoint? false Supports multicast? true Virtual? false Hardware address: [0, 30, 104, -59, -97, 66] MTU: 1500 Network Inet Address (preferIPV6=false) /10.133.185.17 Network Inet Address (preferIPV6=true) /fe80:0:0:0:21e:68ff:fec5:9f42%9 resolveBindInterfaceName(eth0)=127.0.0.1 ************************************************** Display name: virbr0 Name: virbr0 PreferIPv6Addresses: false InetAddress: /fe80:0:0:0:200:ff:fe00:0%7 InetAddress: /192.168.122.1 Up? true Loopback? false PointToPoint? false Supports multicast? true Virtual? false Hardware address: null MTU: 1500 Network Inet Address (preferIPV6=false) /192.168.122.1 Network Inet Address (preferIPV6=true) /fe80:0:0:0:200:ff:fe00:0%7 resolveBindInterfaceName(virbr0)=192.168.122.1 ************************************************** Display name: lo Name: lo PreferIPv6Addresses: false InetAddress: /0:0:0:0:0:0:0:1%1 InetAddress: /127.0.0.1 Up? true Loopback? true PointToPoint? false Supports multicast? false Virtual? false Hardware address: null MTU: 16436 Network Inet Address (preferIPV6=false) /127.0.0.1 Network Inet Address (preferIPV6=true) /0:0:0:0:0:0:0:1%1 resolveBindInterfaceName(lo)=127.0.0.1 [jfialli@perf-x4150-1 modules] $ /export/VMs/jdk1.6.0_29/bin/java -classpath shoal-gms-impl.jar com.sun.enterprise.mgmt.transport.NetworkUtility Java property java.net.preferIPv6Addresses=false AllLocalAddresses() = [/fe80:0:0:0:fcff:ffff:feff:ffff%8, /fe80:0:0:0:21e:68ff:fec5:9f42%9, /10.133.185.17, /fe80:0:0:0:200:ff:fe00:0%7, /192.168.122.1] getFirstNetworkInterface() = name:vif0.0 (vif0.0) getFirstInetAddress(preferIPv6Addresses:false)=null getFirstInetAddress()=/fe80:0:0:0:fcff:ffff:feff:ffff%8 getFirstInetAddress( true ) = /fe80:0:0:0:fcff:ffff:feff:ffff%8 getFirstInetAddress( false ) = null getLocalHostAddress = perf-x4150-1/10.133.185.17 getFirstNetworkInteface() = name:vif0.0 (vif0.0) getNetworkInetAddress(firstNetworkInteface, true) = /fe80:0:0:0:fcff:ffff:feff:ffff%8 getNetworkInetAddress(firstNetworkInteface, false) = null

          People

          • Assignee:
            Joe Fialli
            Reporter:
            amitagarwal
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved: