glassfish / GLASSFISH-17926

Elasticity auto-scale up test failed on Ubuntu laptop due to GMS failures when multiple network interfaces exist

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.2_dev, 4.0_dev
    • Fix Version/s: 3.1.2
    • Labels:
      None
    • Environment:

      Ubuntu 11.10 on Dell E6420 laptop
      java version "1.6.0_24"

      Description

      GF4.0 b12.

      I ran into inconsistent Elasticity behavior in GF4. The auto-scale test passed at home on a wireless network connection but failed intermittently at work on a wired Ethernet connection. The NetworkUtility test output when the failure was reproduced:
      $ ifconfig -a
      ...
      wlan0 Link encap:Ethernet HWaddr 08:11:96:0c:14:b0
      inet6 addr: fe80::a11:96ff:fe0c:14b0/64 Scope:Link
      UP BROADCAST MULTICAST MTU:1500 Metric:1
      RX packets:12 errors:0 dropped:0 overruns:0 frame:0
      TX packets:136 errors:0 dropped:0 overruns:0 carrier:0
      collisions:0 txqueuelen:1000
      RX bytes:1006 (1.0 KB) TX bytes:22446 (22.4 KB)

      $ java -classpath glassfish/modules/shoal-gms-impl.jar com.sun.enterprise.mgmt.transport.NetworkUtility
      AllLocalAddresses() = [/fe80:0:0:0:5e26:aff:fe7e:bef8%2, /10.132.179.12]
      getFirstNetworkInterface() = name:wlan0 (wlan0) index: 3 addresses:
      /fe80:0:0:0:a11:96ff:fe0c:14b0%3;

      Dec 7, 2011 1:36:09 PM com.sun.enterprise.mgmt.transport.NetworkUtility getNetworkInetAddress
      INFO: enter getFirstInetAddress networkInterface=name:wlan0 (wlan0) index: 3 addresses:
      /fe80:0:0:0:a11:96ff:fe0c:14b0%3;
      preferIPv6=true
      getFirstInetAddress( true ) = /fe80:0:0:0:a11:96ff:fe0c:14b0%3
      Dec 7, 2011 1:36:09 PM com.sun.enterprise.mgmt.transport.NetworkUtility getNetworkInetAddress
      INFO: enter getFirstInetAddress networkInterface=name:wlan0 (wlan0) index: 3 addresses:
      /fe80:0:0:0:a11:96ff:fe0c:14b0%3;
      preferIPv6=false
      getFirstInetAddress( false ) = null
      getLocalHostAddress = chicago/127.0.1.1
      getFirstNetworkInteface() = name:wlan0 (wlan0) index: 3 addresses:
      /fe80:0:0:0:a11:96ff:fe0c:14b0%3;

      getFirstInetAddress(firstNetworkInteface, true) = /fe80:0:0:0:a11:96ff:fe0c:14b0%3
      getFirstInetAddress(firstNetworkInteface, false) = null

      All Network Interfaces
      Display name: wlan0
      Name: wlan0
      InetAddress: /fe80:0:0:0:a11:96ff:fe0c:14b0%3
      Up? false
      Loopback? false
      PointToPoint? false
      Supports multicast? true
      Virtual? false
      Hardware address: [8, 17, -106, 12, 20, -80]
      MTU: 1500
      Dec 7, 2011 1:36:09 PM com.sun.enterprise.mgmt.transport.NetworkUtility getNetworkInetAddress
      INFO: enter getFirstInetAddress networkInterface=name:wlan0 (wlan0) index: 3 addresses:
      /fe80:0:0:0:a11:96ff:fe0c:14b0%3;
      preferIPv6=false
      Exception in thread "main" java.lang.NullPointerException
      at com.sun.enterprise.mgmt.transport.NetworkUtility.displayInterfaceInformation(NetworkUtility.java:695)
      at com.sun.enterprise.mgmt.transport.NetworkUtility.main(NetworkUtility.java:674)

      The original server.log stack trace:
      [#|2011-12-01T14:14:13.673-0800|WARNING|44.0|elasticity-logger|_ThreadID=22;_ThreadName=Thread-2;|Error during groupHandle.sendMessage(cloud-2, ConferencePlanner; size=1567)
      com.sun.enterprise.ee.cms.core.GMSException: java.io.IOException: failed to connect to fe80:0:0:0:a11:96ff:fe0c:14b0%3:9188:228.9.10.114:7100:ConferencePlanner:cloud-2
      at com.sun.enterprise.ee.cms.impl.base.GroupCommunicationProviderImpl.sendMessage(GroupCommunicationProviderImpl.java:380)
      at com.sun.enterprise.ee.cms.impl.base.GroupHandleImpl.sendMessage(GroupHandleImpl.java:142)
      at org.glassfish.elasticity.group.gms.GroupServiceProvider.sendMessage(GroupServiceProvider.java:276)
      at org.glassfish.elasticity.engine.message.MessageProcessor.sendMessage(MessageProcessor.java:151)
      at org.glassfish.elasticity.expression.ElasticExpressionEvaluator.evaluate(ElasticExpressionEvaluator.java:95)
      at org.glassfish.elasticity.expression.ElasticExpressionEvaluator.evaluate(ElasticExpressionEvaluator.java:50)
      at org.glassfish.elasticity.engine.util.ExpressionBasedAlert.execute(ExpressionBasedAlert.java:91)
      at org.glassfish.elasticity.engine.container.AlertContextImpl.run(AlertContextImpl.java:71)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
      at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
      at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
      at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
      at java.lang.Thread.run(Thread.java:662)
      Caused by: java.io.IOException: failed to connect to fe80:0:0:0:a11:96ff:fe0c:14b0%3:9188:228.9.10.114:7100:ConferencePlanner:cloud-2
      at com.sun.enterprise.mgmt.transport.grizzly.grizzly2.GrizzlyTCPMessageSender.send(GrizzlyTCPMessageSender.java:134)
      at com.sun.enterprise.mgmt.transport.grizzly.grizzly2.GrizzlyTCPMessageSender.doSend(GrizzlyTCPMessageSender.java:99)
      at com.sun.enterprise.mgmt.transport.AbstractMessageSender.send(AbstractMessageSender.java:74)
      at com.sun.enterprise.mgmt.transport.grizzly.GrizzlyNetworkManager.send(GrizzlyNetworkManager.java:285)

        Activity

        Joe Fialli added a comment -

        GMS selected the wrong network interface because the wireless network interface was in a slightly unusual and inconsistent state.

        From ifconfig -a:

        wlan0 Link encap:Ethernet HWaddr 08:11:96:0c:14:b0
        inet6 addr: fe80::a11:96ff:fe0c:14b0/64 Scope:Link
        UP BROADCAST MULTICAST MTU:1500 Metric:1

        Note that the interface is not running but it does have an IPv6 address assigned.

        The fix in Shoal GMS NetworkUtility was to check both that the network interface
        is up and that it has an IP address assigned. Before the fix, the code incorrectly
        assumed that an interface with an IP address assigned was up.
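
        The check described above can be sketched roughly as follows. This is not the Shoal source; the class and method names (UsableInterfaceSketch, isUsable, firstUsableInterface) are hypothetical, and skipping loopback interfaces is an added assumption. It relies only on the standard java.net.NetworkInterface API, where isUp() reports whether the interface is up and running.

```java
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.Collections;
import java.util.Enumeration;

public class UsableInterfaceSketch {

    // Hypothetical helper: an interface is usable only if it is up
    // AND has at least one address. Having an address alone is not enough,
    // which is the assumption that failed for wlan0 in this issue.
    static boolean isUsable(NetworkInterface ni) {
        if (ni == null) {
            return false;
        }
        try {
            return ni.isUp()
                    && !ni.isLoopback() // assumption: loopback is never wanted here
                    && ni.getInetAddresses().hasMoreElements();
        } catch (SocketException e) {
            // Treat an interface whose state cannot be queried as unusable.
            return false;
        }
    }

    // Returns the first usable interface, or null if none is found.
    static NetworkInterface firstUsableInterface() {
        try {
            Enumeration<NetworkInterface> all = NetworkInterface.getNetworkInterfaces();
            if (all == null) {
                return null;
            }
            for (NetworkInterface ni : Collections.list(all)) {
                if (isUsable(ni)) {
                    return ni;
                }
            }
        } catch (SocketException e) {
            // Fall through and report that nothing usable was found.
        }
        return null;
    }

    public static void main(String[] args) {
        NetworkInterface ni = firstUsableInterface();
        System.out.println(ni == null ? "no usable interface" : ni.getName());
    }
}
```

        With a check like this, an interface in the state shown for wlan0 (address assigned, "Up? false") would be skipped rather than returned as the first network interface.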

        Fix for this issue was confirmed to work on this system.

        Workaround for misconfigured network interface wlan0 was to explicitly
        run the following:

        % sudo ifconfig wlan0 down

        After running this, the inconsistent wlan0 network interface no longer
        had an IP address assigned when running "ifconfig -a wlan0".

        Joe Fialli added a comment -

        Fix committed to the Shoal 1.6 source code workspace.

        Will integrate into the 3.1.2 and 4.0 workspaces today.

        Joe Fialli added a comment -

        Shoal 1.6.15 contains this fix. It was integrated into the GF 3.1.2 and 4.0 workspaces today.


          People

          • Assignee:
            Joe Fialli
            Reporter:
            mzh777
          • Votes:
            0
            Watchers:
            0

            Dates

            • Created:
              Updated:
              Resolved: