[GLASSFISH-17016] Inconsistency between validate-multicast and GMS picking interface for binding Created: 12/Jul/11  Updated: 07/Dec/11

Status: Open
Project: glassfish
Component/s: group_management_service
Affects Version/s: 3.1
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: arungupta Assignee: Joe Fialli
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

In Ubuntu 11.04, with eth0 disabled and no wireless connectivity, ifconfig reports:

arun@ArunUbuntu:~/tools/glassfish-web$ ifconfig
eth0 Link encap:Ethernet HWaddr 00:26:b9:f1:15:19
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:5698 errors:0 dropped:0 overruns:0 frame:0
TX packets:4575 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:4717594 (4.7 MB) TX bytes:1129576 (1.1 MB)
Interrupt:20 Memory:f6900000-f6920000

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:192770 errors:0 dropped:0 overruns:0 frame:0
TX packets:192770 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:197208585 (197.2 MB) TX bytes:197208585 (197.2 MB)

Explicitly enabled MULTICAST on lo as:

sudo ifconfig lo multicast

and then got:

arun@ArunUbuntu:~/tools/glassfish-web$ ifconfig
eth0 Link encap:Ethernet HWaddr 00:26:b9:f1:15:19
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:5698 errors:0 dropped:0 overruns:0 frame:0
TX packets:4575 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:4717594 (4.7 MB) TX bytes:1129576 (1.1 MB)
Interrupt:20 Memory:f6900000-f6920000

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MULTICAST MTU:16436 Metric:1
RX packets:192914 errors:0 dropped:0 overruns:0 frame:0
TX packets:192914 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:197220833 (197.2 MB) TX bytes:197220833 (197.2 MB)

Explicitly added route as:

sudo route add -net 224.0.0.0 netmask 240.0.0.0 dev lo

and then saw:

arun@ArunUbuntu:~/tools/glassfish-web$ route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
10.151.0.0 0.0.0.0 255.255.224.0 U 2 0 0 wlan0
169.254.0.0 0.0.0.0 255.255.0.0 U 1000 0 0 wlan0
224.0.0.0 0.0.0.0 240.0.0.0 U 0 0 0 lo
0.0.0.0 10.151.0.1 0.0.0.0 UG 0 0 0 wlan0

Running the validate-multicast command in two separate shells shows:

arun@ArunUbuntu:~/tools/glassfish-web$ ./glassfish3/bin/asadmin validate-multicast
Will use port 2048
Will use address 228.9.3.1
Will use bind interface null
Will use wait period 2,000 (in milliseconds)

Listening for data...
Sending message with content "ArunUbuntu" every 2,000 milliseconds
Received data from ArunUbuntu (loopback)
Received data from ArunUbuntu
Exiting after 20 seconds. To change this timeout, use the --timeout command line option.
Command validate-multicast executed successfully.
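
As a quick cross-check of the numbers above, the sketch below tests whether the address reported by validate-multicast (228.9.3.1) is covered by the 224.0.0.0/240.0.0.0 route that was added on lo. The class and method names are made up for illustration and are not part of GlassFish or GMS; the check is only the usual "address AND genmask equals destination" comparison.

```java
final class RouteMatchSketch {

    /** Packs a dotted-quad IPv4 literal such as "228.9.3.1" into an int. */
    static int toInt(String ipv4) {
        int value = 0;
        for (String part : ipv4.split("\\.")) {
            value = (value << 8) | Integer.parseInt(part);
        }
        return value;
    }

    /** True when the address, masked with genmask, equals the route destination. */
    static boolean matches(String address, String destination, String genmask) {
        return (toInt(address) & toInt(genmask)) == toInt(destination);
    }

    public static void main(String[] args) {
        // Multicast route added on lo in this report.
        System.out.println(matches("228.9.3.1", "224.0.0.0", "240.0.0.0"));
        // The wlan0 subnet route from the same routing table, for contrast.
        System.out.println(matches("228.9.3.1", "10.151.0.0", "255.255.224.0"));
    }
}
```

Under this check the multicast route on lo covers 228.9.3.1 and the wlan0 subnet route does not, which is consistent with validate-multicast receiving its own packets over loopback.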

Creating a cluster with 2 instances and starting it shows the following log message:

Caused by: com.sun.enterprise.ee.cms.core.GMSException: initialization failure
at com.sun.enterprise.mgmt.ClusterManager.<init>(ClusterManager.java:142)
at com.sun.enterprise.ee.cms.impl.base.GroupCommunicationProviderImpl.initializeGroupCommunicationProvider(GroupCommunicationProviderImpl.java:164)
at com.sun.enterprise.ee.cms.impl.base.GMSContextImpl.join(GMSContextImpl.java:176)
... 22 more
Caused by: java.io.IOException: can not find a first InetAddress
at com.sun.enterprise.mgmt.transport.grizzly.GrizzlyNetworkManager.start(GrizzlyNetworkManager.java:376)
at com.sun.enterprise.mgmt.ClusterManager.<init>(ClusterManager.java:140)
... 24 more

Even though validate-multicast is working, the instances are not able to join the cluster.

Here is what Joe mentioned in an email thread:

– cut here –
Validate-multicast is not using NetworkUtility.getFirstInetAddress(false).
validate-multicast is not specifying any IP address by default when creating the multicast socket.
Just to remind you, validate-multicast is only creating a MulticastSocket and only communicating
over UDP. While the getFirstInetAddress(false) is being used to compute the IP address that
another instance can communicate via TCP to an instance. That is totally different.
We are trying to use same IP address for both TCP and UDP in GMS. We need to revisit
this logic. We will need to remove the check for multicast enabled in selecting network interface
now since we are working on supporting non-multicast mode.
– cut here –
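
The difference Joe describes can be sketched roughly as follows. This is not the GMS implementation: the selection rule in pickAddress (interface must be up, multicast capable, hold an address and, in strict mode, not be loopback) is an assumption made for illustration, and Nic, pickAddress and localInterfaces are hypothetical names. Only the java.net.NetworkInterface calls are real JDK API. A socket that never specifies an address, as validate-multicast is described as doing, does not go through any such selection.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

final class InterfaceSelectionSketch {

    /** Hypothetical, simplified description of one interface (not a GMS type). */
    static final class Nic {
        final String name;
        final boolean up;
        final boolean loopback;
        final boolean multicast;
        final List<String> addresses;

        Nic(String name, boolean up, boolean loopback, boolean multicast, List<String> addresses) {
            this.name = name;
            this.up = up;
            this.loopback = loopback;
            this.multicast = multicast;
            this.addresses = addresses;
        }
    }

    /**
     * Assumed selection rule: first address on an interface that is up,
     * multicast capable and, unless allowLoopback is set, not loopback.
     * Returns null when no interface qualifies.
     */
    static String pickAddress(List<Nic> nics, boolean allowLoopback) {
        for (Nic nic : nics) {
            if (!nic.up || !nic.multicast) continue;
            if (nic.loopback && !allowLoopback) continue;
            if (!nic.addresses.isEmpty()) return nic.addresses.get(0);
        }
        return null;
    }

    /** Builds the same description from the interfaces of the local machine. */
    static List<Nic> localInterfaces() throws SocketException {
        List<Nic> result = new ArrayList<Nic>();
        Enumeration<NetworkInterface> all = NetworkInterface.getNetworkInterfaces();
        if (all == null) return result;
        for (NetworkInterface ni : Collections.list(all)) {
            List<String> addrs = new ArrayList<String>();
            for (InetAddress a : Collections.list(ni.getInetAddresses())) {
                addrs.add(a.getHostAddress());
            }
            result.add(new Nic(ni.getName(), ni.isUp(), ni.isLoopback(), ni.supportsMulticast(), addrs));
        }
        return result;
    }

    public static void main(String[] args) throws SocketException {
        // State taken from the ifconfig output in this report: eth0 carries the
        // UP flag but has no address; lo has 127.0.0.1 and MULTICAST enabled.
        List<Nic> reported = Arrays.asList(
            new Nic("eth0", true, false, true, Collections.<String>emptyList()),
            new Nic("lo", true, true, true, Arrays.asList("127.0.0.1", "::1")));
        System.out.println("strict rule:      " + pickAddress(reported, false));
        System.out.println("loopback allowed: " + pickAddress(reported, true));
        System.out.println("this machine:     " + pickAddress(localInterfaces(), false));
    }
}
```

With the interface state from this report, the strict rule finds no address at all, which would be one way to end up with a "can not find a first InetAddress" failure, while a rule that tolerates loopback returns 127.0.0.1, the same address the workaround sets explicitly.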

Explicitly setting the GMS_BIND_INTERFACE_ADDRESS-c1 property to "127.0.0.1" in each instance and in the DAS, and then restarting the DAS and the cluster, allows the instances to join the cluster.



 Comments   
Comment by Bobby Bissett [ 20/Oct/11 ]

Assigning to me.

Comment by Bobby Bissett [ 07/Dec/11 ]

Moving to Joe (hi) since I'm not on the GF project any more. The work for this is mostly done, and Joe knows what change to make in the mcast sender thread so it mirrors what GMS proper is doing.

Generated at Wed Apr 01 03:29:24 UTC 2015 using JIRA 6.2.3#6260-sha1:63ef1d6dac3f4f4d7db4c1effd405ba38ccdc558.