GLASSFISH-16568

GMS can select incorrect network interface when a Virtual Machine created bridge n/w interface (virbr0) exists

    Details

    • Type: Bug
    • Status: In Progress
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.1.1_b04
    • Fix Version/s: None
    • Labels:
      None
    • Environment:

      Description

      Please see the parent bug http://java.net/jira/browse/GLASSFISH-15425 for scenario details.

      When the RichAccess Big App test is run, the instance logs fill with Grizzly and Shoal logger messages of the following type:

      ******
      [#|2011-05-06T11:26:53.439+0530|SEVERE|glassfish3.1|com.sun.grizzly.config.GrizzlyServiceListener|_ThreadID=29;_ThreadName=Thread-1;|Connection refused|#]

      [#|2011-05-06T11:26:53.445+0530|SEVERE|glassfish3.1|com.sun.grizzly.config.GrizzlyServiceListener|_ThreadID=30;_ThreadName=Thread-1;|Connection refused|#]

      [#|2011-05-06T11:26:53.447+0530|WARNING|glassfish3.1|ShoalLogger|_ThreadID=31;_ThreadName=Thread-1;|Error during groupHandle.sendMessage(instance103, /richAccess; size=30672|#]

      *******

      • No HTTP failures were observed on the client.
      • Two sets of logs are attached, one with the Shoal logger set to FINE and one without. Unzip them and look under "logs/st-cluster" for the instance logs.
      • This issue appears with both the Sun JDK and the JRockit JDK.

          Activity

          Joe Fialli added a comment -

          Please follow the documentation to configure GMS to bind to a specific network interface:

          http://download.oracle.com/docs/cd/E18930_01/html/821-2426/gjfnl.html#gjdlw

          I also recommend running "asadmin validate-multicast --bindaddress X.X.X.X" on all three machines
          to double-check that UDP multicast traffic works properly on whichever subnet you select.

          Joe Fialli added a comment -

          Removed "blocking" and "regression" from the subject line and changed it to match what the issue turned out to be.
          This was not a regression: the same issue would exist in 3.1 as in 3.1.1. No change made in 3.1.1 caused
          it. A change in the configured environment caused the issue to surface.

          A simple workaround is to disable or bring down the virbr0 network interface, which was not being used.

          The following error messages were repeated many times in the server.log file:

          [#|2011-05-06T11:26:53.445+0530|SEVERE|glassfish3.1|com.sun.grizzly.config.GrizzlyServiceListener|_ThreadID=30;_ThreadName=Thread-1;|Connection refused|#]

          [#|2011-05-06T11:26:53.447+0530|WARNING|glassfish3.1|ShoalLogger|_ThreadID=31;_ThreadName=Thread-1;|Error during groupHandle.sendMessage(instance103, /richAccess; size=30672|#]

          We will add the IP address that the send was attempting to reach to the GMS log message above, to assist in diagnosing this problem in the future.

          java.net.NetworkInterface.getNetworkInterfaces() was returning the virbr0 network interface as the first interface,
          and that resulted in this issue. To resolve it, GMS will initially default to the
          network interface associated with InetAddress.getLocalHost() (as long as that interface is multicast-enabled,
          UP, and not a loopback interface). This default would have avoided the reported issue.

          When a machine has multiple network interfaces and the default is not the one desired for GMS,
          follow this documentation to configure GMS to use a specific network interface on each machine:

          http://download.oracle.com/docs/cd/E18930_01/html/821-2426/gjfnl.html#gjdlw
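
The default described in this comment can be sketched in Java roughly as follows. This is only an illustration and not the actual Shoal/GMS code: the class name GmsInterfaceSketch, the helpers isUsable and chooseInterface, and the fallback to the first usable interface in enumeration order are all assumptions made for the example.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.net.UnknownHostException;
import java.util.Collections;
import java.util.Enumeration;

// Hypothetical sketch; names are illustrative, not taken from Shoal/GMS.
final class GmsInterfaceSketch {

    /** True when the interface is UP, multicast-enabled, and not loopback. */
    static boolean isUsable(NetworkInterface nif) {
        try {
            return nif != null && nif.isUp() && nif.supportsMulticast() && !nif.isLoopback();
        } catch (SocketException e) {
            return false; // flags could not be read; treat the interface as unusable
        }
    }

    /**
     * Prefers the interface that owns InetAddress.getLocalHost().
     * Falls back (an assumption) to the first usable interface in the
     * order returned by NetworkInterface.getNetworkInterfaces().
     * Returns null when nothing usable is found.
     */
    static NetworkInterface chooseInterface() {
        try {
            InetAddress local = InetAddress.getLocalHost();
            NetworkInterface preferred = NetworkInterface.getByInetAddress(local);
            if (isUsable(preferred)) {
                return preferred;
            }
        } catch (UnknownHostException | SocketException e) {
            // host name did not resolve or lookup failed; fall through to enumeration
        }
        try {
            Enumeration<NetworkInterface> all = NetworkInterface.getNetworkInterfaces();
            if (all != null) {
                for (NetworkInterface nif : Collections.list(all)) {
                    if (isUsable(nif)) {
                        return nif;
                    }
                }
            }
        } catch (SocketException e) {
            // enumeration failed; report no usable interface
        }
        return null;
    }

    public static void main(String[] args) {
        NetworkInterface nif = chooseInterface();
        System.out.println(nif == null ? "no usable interface found" : nif.getName());
    }
}
```

If the host name resolves to a loopback address, the preferred interface is rejected by isUsable and the sketch falls back to enumeration order, which is the behavior that exposed virbr0 in the first place. In that case, binding GMS explicitly as described in the documentation above remains the dependable option.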

          Joe Fialli added a comment -

          Also, we will add a configuration message that shows the localPeerID and the GMS system advertisement sent to other machines to dynamically form the GMS group (GlassFish cluster). This message will show the IP address at which GMS tells other cluster members to contact it.

          Joe Fialli added a comment -

          I was unable to identify the non-functional virtual network interface using any of the java.net.NetworkInterface
          methods. I recommend postponing any fix for this issue beyond the 3.1.1 time frame, since changing the algorithm
          that selects the first network address could introduce a regression for a previously
          working network configuration. There is no way to correct this issue without changing
          how the first network address is selected.

          A workaround exists for this issue: simply disable the virtual network interface that is not
          being used.
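
To see what the JVM reports for each interface on an affected machine, a small diagnostic along these lines may help. It is a sketch only: the class name ListInterfaces and the output format are made up for illustration. Note that NetworkInterface.isVirtual() refers to subinterfaces of a physical interface, so it is not expected to flag a bridge such as virbr0, which may be part of why these methods did not help here.

```java
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical diagnostic; not part of GlassFish or Shoal.
final class ListInterfaces {

    /** Returns one line per interface, in the order the JVM enumerates them. */
    static List<String> describeAll() {
        List<String> lines = new ArrayList<>();
        try {
            Enumeration<NetworkInterface> all = NetworkInterface.getNetworkInterfaces();
            if (all == null) {
                return lines; // no interfaces reported
            }
            for (NetworkInterface nif : Collections.list(all)) {
                lines.add(describe(nif));
            }
        } catch (SocketException e) {
            lines.add("enumeration failed: " + e.getMessage());
        }
        return lines;
    }

    /** Formats the flags that java.net.NetworkInterface exposes for one interface. */
    static String describe(NetworkInterface nif) {
        try {
            return String.format(
                "%s up=%b multicast=%b loopback=%b virtual=%b pointToPoint=%b addrs=%s",
                nif.getName(), nif.isUp(), nif.supportsMulticast(), nif.isLoopback(),
                nif.isVirtual(), nif.isPointToPoint(),
                Collections.list(nif.getInetAddresses()));
        } catch (SocketException e) {
            return nif.getName() + " (flags unavailable: " + e.getMessage() + ")";
        }
    }

    public static void main(String[] args) {
        describeAll().forEach(System.out::println);
    }
}
```

The first line printed corresponds to the first interface returned by getNetworkInterfaces(), so running this on each machine shows whether an unused bridge is ahead of the intended interface in enumeration order.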

          Joe Fialli added a comment -

          Lowered the priority to Minor since there is a workaround. Additionally, this problem
          is only the result of having a virtual network interface that was created by VirtualBox but
          was not being used. Simply disabling the unused virbr0 network interface fixed
          the problem. At this time, my recommendation is to document the issue and its workaround in the release notes.


            People

            • Assignee:
              Joe Fialli
              Reporter:
              varunrupela
            • Votes:
              0
              Watchers:
              1
