glassfish
  1. glassfish
  2. GLASSFISH-15428

[STRESS] NoClassDefFoundError from Shoal observed on an instance's log when another instance goes down

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Incomplete
    • Affects Version/s: 3.1_b35
    • Fix Version/s: None
    • Labels:
      None

      Description

      Please see parent issue http://java.net/jira/browse/GLASSFISH-15425 for details of the scenario that shows this bug.

      The following log messages were observed in instance 102 when instance103 went down:

      **********
      [#|2011-01-02T22:23:05.584+0530|INFO|glassfish3.1|ShoalLogger|_ThreadID=22;_ThreadName=Thread-3;|GMS1092: GMS View Change Received for group: st-cluster : Members in view for IN_DOUBT_EVENT(before change analysis) are :
      1: MemberId: instance101, MemberType: CORE, Address: 10.12.153.54:9195:228.9.37.37:31818:st-cluster:instance101
      2: MemberId: instance102, MemberType: CORE, Address: 10.12.153.52:9147:228.9.37.37:31818:st-cluster:instance102
      3: MemberId: instance103, MemberType: CORE, Address: 10.12.153.53:9110:228.9.37.37:31818:st-cluster:instance103
      4: MemberId: server, MemberType: SPECTATOR, Address: 10.12.153.54:9103:228.9.37.37:31818:st-cluster:server

      #]

      [#|2011-01-02T22:23:05.584+0530|INFO|glassfish3.1|ShoalLogger|_ThreadID=22;_ThreadName=Thread-3;|GMS1016: Analyzing new membership snapshot received as part of event: IN_DOUBT_EVENT for member: instance103 of group: st-cluster|#]

      [#|2011-01-02T22:23:05.585+0530|INFO|glassfish3.1|ShoalLogger|_ThreadID=22;_ThreadName=Thread-3;|GMS1007: Received FailureSuspectedEvent for member: instance103 of group: st-cluster|#]

      [#|2011-01-02T22:23:05.586+0530|WARNING|glassfish3.1|ShoalLogger|_ThreadID=22;_ThreadName=Thread-3;|GMS1090: handled exception processing event packet IN_DOUBT_EVENT from instance103|#]

      [#|2011-01-02T22:23:05.586+0530|WARNING|glassfish3.1|ShoalLogger|_ThreadID=22;_ThreadName=Thread-3;|stack trace
      java.lang.NoClassDefFoundError:
      at com.sun.enterprise.ee.cms.impl.base.ViewWindowImpl.addInDoubtMemberSignals(ViewWindowImpl.java:399)
      at com.sun.enterprise.ee.cms.impl.base.ViewWindowImpl.analyzeViewChange(ViewWindowImpl.java:287)
      at com.sun.enterprise.ee.cms.impl.base.ViewWindowImpl.newViewObserved(ViewWindowImpl.java:222)
      at com.sun.enterprise.ee.cms.impl.base.ViewWindowImpl.run(ViewWindowImpl.java:193)
      at java.lang.Thread.run(Thread.java:662)

      #]

      ***********

      The server log showing this message is attached

        Issue Links

          Activity

          Hide
          Joe Fialli added a comment -

          Here is the source code line where the NoClassDefFoundError occurred.

          signals.add(new FailureSuspectedSignalImpl(token, member.getGroupName(), member.getStartTime()));

          So the class that was not found was FailureSuspectedSignalImpl which is defined in shoal-gms-impl.jar.
          2975 Fri Dec 03 09:11:38 EST 2010 com/sun/enterprise/ee/cms/impl/common/FailureSuspectedSignalImpl.class

          There was no class not found error when reporting Failure 2 seconds later based on following log event from server.log.
          [#|2011-01-02T22:23:07.601+0530|INFO|glassfish3.1|ShoalLogger|_ThreadID=22;_ThreadName=Thread-3;|GMS1016: Analyzing new membership snapshot received as part of event: FAILURE_EVENT for member: instance103 of group: st-cluster|#]

          [#|2011-01-02T22:23:07.603+0530|INFO|glassfish3.1|ShoalLogger|_ThreadID=22;_ThreadName=Thread-3;|GMS1019: member: instance103 of group: st-cluster has failed|#]

          So we can assume that FailureSignalImpl was found in the same shoal-gms-impl.jar. (unless the ClassNotFoundError is in next server.log file).

          dhcp-burlington9-3rd-a-east-10-152-23-224:impl jf39279$ jar tvf target/shoal-gms-impl.jar | grep FailureNotificationSignalImpl
          3798 Fri Dec 03 09:11:38 EST 2010 com/sun/enterprise/ee/cms/impl/common/FailureNotificationSignalImpl.class

          This does not look like a bug in Shoal GMS but rather a higher level class loading issue.
          Potentially, one could see this type of behavior when there was a previous OutOfMemoryException
          thrown. Looking through the attached server.log, there was no previous history of OutOfMemory,
          but there was evidence of other class loading issues. Specifically, the following log event occurred
          5 times before the reported NoClassDefFoundError.

          [#|2011-01-02T20:28:35.674+0530|SEVERE|glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=22;_ThreadName=Thread-3;|Exception in thread "RMI RenewClean-[10.12.153.52:27688]" |#]

          [#|2011-01-02T20:28:35.675+0530|SEVERE|glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=22;_ThreadName=Thread-3;|java.lang.ClassFormatError: sun/reflect/GeneratedSerializationConstructorAccessor9015 : illegal JVM_CONSTANT_Methodref name: <init>
          at sun.misc.Unsafe.defineClass(Native Method)
          at sun.reflect.ClassDefiner.defineClass(ClassDefiner.java:45)
          at sun.reflect.MethodAccessorGenerator$1.run(MethodAccessorGenerator.java:381)
          at sun.reflect.MethodAccessorGenerator.generate(MethodAccessorGenerator.java:377)
          at sun.reflect.MethodAccessorGenerator.generateSerializationConstructor(MethodAccessorGenerator.java:95)
          at sun.reflect.ReflectionFactory.newConstructorForSerialization(ReflectionFactory.java:313)
          at java.io.ObjectStreamClass.getSerializableConstructor(ObjectStreamClass.java:1327)
          at java.io.ObjectStreamClass.access$1500(ObjectStreamClass.java:52)
          at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:437)
          at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:413)
          at java.io.ObjectStreamClass.lookup0(ObjectStreamClass.java:310)
          at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java)

          At this point, there is not enough info in this report to investigate this further.
          It would be helpful to note if there were any previous out of memory exceptions occuring in this instance's server.logs before this occurred. (The other instance is failing presumably due to out of memory, but it is only this instance that is of interest)
          Once there is an out of memory in server.log, all future failures are suspect and may be result of system dealing with out of memory issues.

          Show
          Joe Fialli added a comment - Here is the source code line where the NoClassDefFoundError occurred. signals.add(new FailureSuspectedSignalImpl(token, member.getGroupName(), member.getStartTime())); So the class that was not found was FailureSuspectedSignalImpl which is defined in shoal-gms-impl.jar. 2975 Fri Dec 03 09:11:38 EST 2010 com/sun/enterprise/ee/cms/impl/common/FailureSuspectedSignalImpl.class There was no class not found error when reporting Failure 2 seconds later based on following log event from server.log. [#|2011-01-02T22:23:07.601+0530|INFO|glassfish3.1|ShoalLogger|_ThreadID=22;_ThreadName=Thread-3;|GMS1016: Analyzing new membership snapshot received as part of event: FAILURE_EVENT for member: instance103 of group: st-cluster|#] [#|2011-01-02T22:23:07.603+0530|INFO|glassfish3.1|ShoalLogger|_ThreadID=22;_ThreadName=Thread-3;|GMS1019: member: instance103 of group: st-cluster has failed|#] So we can assume that FailureSignalImpl was found in the same shoal-gms-impl.jar. (unless the ClassNotFoundError is in next server.log file). dhcp-burlington9-3rd-a-east-10-152-23-224:impl jf39279$ jar tvf target/shoal-gms-impl.jar | grep FailureNotificationSignalImpl 3798 Fri Dec 03 09:11:38 EST 2010 com/sun/enterprise/ee/cms/impl/common/FailureNotificationSignalImpl.class This does not look like a bug in Shoal GMS but rather a higher level class loading issue. Potentially, one could see this type of behavior when there was a previous OutOfMemoryException thrown. Looking through the attached server.log, there was no previous history of OutOfMemory, but there was evidence of other class loading issues. Specifically, the following log event occurred 5 times before the reported NoClassDefFoundError. [#|2011-01-02T20:28:35.674+0530|SEVERE|glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=22;_ThreadName=Thread-3;|Exception in thread "RMI RenewClean- [10.12.153.52:27688] " |#] [#|2011-01-02T20:28:35.675+0530|SEVERE|glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=22;_ThreadName=Thread-3;|java.lang.ClassFormatError: sun/reflect/GeneratedSerializationConstructorAccessor9015 : illegal JVM_CONSTANT_Methodref name: <init> at sun.misc.Unsafe.defineClass(Native Method) at sun.reflect.ClassDefiner.defineClass(ClassDefiner.java:45) at sun.reflect.MethodAccessorGenerator$1.run(MethodAccessorGenerator.java:381) at sun.reflect.MethodAccessorGenerator.generate(MethodAccessorGenerator.java:377) at sun.reflect.MethodAccessorGenerator.generateSerializationConstructor(MethodAccessorGenerator.java:95) at sun.reflect.ReflectionFactory.newConstructorForSerialization(ReflectionFactory.java:313) at java.io.ObjectStreamClass.getSerializableConstructor(ObjectStreamClass.java:1327) at java.io.ObjectStreamClass.access$1500(ObjectStreamClass.java:52) at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:437) at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:413) at java.io.ObjectStreamClass.lookup0(ObjectStreamClass.java:310) at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java) At this point, there is not enough info in this report to investigate this further. It would be helpful to note if there were any previous out of memory exceptions occuring in this instance's server.logs before this occurred. (The other instance is failing presumably due to out of memory, but it is only this instance that is of interest) Once there is an out of memory in server.log, all future failures are suspect and may be result of system dealing with out of memory issues.
          Hide
          Joe Fialli added a comment -

          I researched if this could possibly be a missing osgi export.

          Export-Package below is from shoal-gms-impl.jar manifest. Both referencing class ViewWindowImpl and
          class not found FailureSuspectedSignalImp are in same jar: shoal-gms-impl.jar. The [1] rule below
          represents the dependency of com.sun.enterprise.ee.cms.impl.base.ViewWindowImpl
          on com.sun.enterprise.ee.cms.impl.common.FailureSuspectedSignalImpl.

          From shoal-gms-impl.jar/META-INF/MANIFEST.MF
          (added carriage returns below so each rule could be inspected.)

          Export-Package: com.sun.enterprise.ee.cms.impl.common;uses:="com.sun.e
          nterprise.ee.cms.core,com.sun.enterprise.ee.cms.spi,com.sun.enterpris
          e.ee.cms.impl.client,com.sun.enterprise.ee.cms.logging,com.sun.enterp
          rise.ee.cms.impl.base",

          com.sun.enterprise.ee.cms.impl.client;uses:="c
          om.sun.enterprise.ee.cms.core,com.sun.enterprise.ee.cms.logging",

          <!-- begin [1] -->
          com.sun.enterprise.ee.cms.impl.base;uses:="com.sun.enterprise.ee.cms.logg
          ing,com.sun.enterprise.ee.cms.core,com.sun.enterprise.ee.cms.impl.com
          mon,com.sun.enterprise.ee.cms.spi,com.sun.enterprise.mgmt,com.sun.ent
          erprise.mgmt.transport,com.sun.enterprise.mgmt.transport.grizzly",
          <!-- end [1] -->

          com .sun.enterprise.gms.tools,com.sun.enterprise.ee.cms.logging;uses:="su
          n.security.action",
          com.sun.enterprise.mgmt;uses:="com.sun.enterprise.
          ee.cms.impl.base,com.sun.enterprise.ee.cms.logging,com.sun.enterprise
          .ee.cms.core,com.sun.enterprise.mgmt.transport,com.sun.enterprise.ee.
          cms.impl.common,com.sun.enterprise.ee.cms.impl.client,com.sun.enterpr
          ise.ee.cms.spi",
          com.sun.enterprise.mgmt.transport;uses:="com.sun.ente
          rprise.ee.cms.impl.base,com.sun.enterprise.ee.cms.logging,com.sun.ent
          erprise.ee.cms.core,com.sun.enterprise.ee.cms.impl.common,com.sun.ent
          erprise.ee.cms.spi",

          com.sun.enterprise.mgmt.transport.grizzly;uses:="
          com.sun.grizzly,com.sun.grizzly.util,com.sun.grizzly.connectioncache.
          server,com.sun.enterprise.mgmt.transport,com.sun.grizzly.connectionca
          che.spi.transport,com.sun.enterprise.mgmt,com.sun.enterprise.ee.cms.i
          mpl.common,com.sun.enterprise.ee.cms.impl.base,com.sun.grizzly.connec
          tioncache.client,com.sun.grizzly.async,com.sun.grizzly.filter"

          Additional info is needed to continue any further investigation of this issue from group-management-service
          point of view.

          Show
          Joe Fialli added a comment - I researched if this could possibly be a missing osgi export. Export-Package below is from shoal-gms-impl.jar manifest. Both referencing class ViewWindowImpl and class not found FailureSuspectedSignalImp are in same jar: shoal-gms-impl.jar. The [1] rule below represents the dependency of com.sun.enterprise.ee.cms.impl.base.ViewWindowImpl on com.sun.enterprise.ee.cms.impl.common.FailureSuspectedSignalImpl. From shoal-gms-impl.jar/META-INF/MANIFEST.MF (added carriage returns below so each rule could be inspected.) Export-Package: com.sun.enterprise.ee.cms.impl.common;uses:="com.sun.e nterprise.ee.cms.core,com.sun.enterprise.ee.cms.spi,com.sun.enterpris e.ee.cms.impl.client,com.sun.enterprise.ee.cms.logging,com.sun.enterp rise.ee.cms.impl.base", com.sun.enterprise.ee.cms.impl.client;uses:="c om.sun.enterprise.ee.cms.core,com.sun.enterprise.ee.cms.logging", <!-- begin [1] --> com.sun.enterprise.ee.cms.impl.base;uses:="com.sun.enterprise.ee.cms.logg ing,com.sun.enterprise.ee.cms.core,com.sun.enterprise.ee.cms.impl.com mon,com.sun.enterprise.ee.cms.spi,com.sun.enterprise.mgmt,com.sun.ent erprise.mgmt.transport,com.sun.enterprise.mgmt.transport.grizzly", <!-- end [1] --> com .sun.enterprise.gms.tools,com.sun.enterprise.ee.cms.logging;uses:="su n.security.action", com.sun.enterprise.mgmt;uses:="com.sun.enterprise. ee.cms.impl.base,com.sun.enterprise.ee.cms.logging,com.sun.enterprise .ee.cms.core,com.sun.enterprise.mgmt.transport,com.sun.enterprise.ee. cms.impl.common,com.sun.enterprise.ee.cms.impl.client,com.sun.enterpr ise.ee.cms.spi", com.sun.enterprise.mgmt.transport;uses:="com.sun.ente rprise.ee.cms.impl.base,com.sun.enterprise.ee.cms.logging,com.sun.ent erprise.ee.cms.core,com.sun.enterprise.ee.cms.impl.common,com.sun.ent erprise.ee.cms.spi", com.sun.enterprise.mgmt.transport.grizzly;uses:=" com.sun.grizzly,com.sun.grizzly.util,com.sun.grizzly.connectioncache. server,com.sun.enterprise.mgmt.transport,com.sun.grizzly.connectionca che.spi.transport,com.sun.enterprise.mgmt,com.sun.enterprise.ee.cms.i mpl.common,com.sun.enterprise.ee.cms.impl.base,com.sun.grizzly.connec tioncache.client,com.sun.grizzly.async,com.sun.grizzly.filter" Additional info is needed to continue any further investigation of this issue from group-management-service point of view.
          Hide
          varunrupela added a comment -

          We have started a re-run of this scenario with the latest build to collect more information on issue http://java.net/jira/browse/GLASSFISH-15426. Will also check if this issue still exists

          Show
          varunrupela added a comment - We have started a re-run of this scenario with the latest build to collect more information on issue http://java.net/jira/browse/GLASSFISH-15426 . Will also check if this issue still exists
          Hide
          Joe Fialli added a comment -

          If the re-run has just restarted, it is possible to verify the Shoal GMS code path by doing the following.

          Select one of the clustered instances to be killed using "kill -9".
          Shoal GMS will generate the FailureSuspected notification 6 seconds later.
          If this fails as reported at beginning of run, the failure is not stress related,
          but would have something to do with running on Windows with Jrocket. (No GMS QE test
          have been run with jrocket or on Windows at this time.)

          Show
          Joe Fialli added a comment - If the re-run has just restarted, it is possible to verify the Shoal GMS code path by doing the following. Select one of the clustered instances to be killed using "kill -9". Shoal GMS will generate the FailureSuspected notification 6 seconds later. If this fails as reported at beginning of run, the failure is not stress related, but would have something to do with running on Windows with Jrocket. (No GMS QE test have been run with jrocket or on Windows at this time.)
          Hide
          Joe Fialli added a comment -

          I have verified running with jrockit-jdk1.6.0_20-R28.1.0-4.0.1 (from public website, do not know how to get the reported jrockit version of 1.6.0_22) on Windows XP 32 bit that the reported exception NoClassDefFoundError does not occur. I have attached complete server logs of the run.

          We are also performing a GMS QE run with Jrockit on OEL.

          Show
          Joe Fialli added a comment - I have verified running with jrockit-jdk1.6.0_20-R28.1.0-4.0.1 (from public website, do not know how to get the reported jrockit version of 1.6.0_22) on Windows XP 32 bit that the reported exception NoClassDefFoundError does not occur. I have attached complete server logs of the run. We are also performing a GMS QE run with Jrockit on OEL.
          Hide
          Joe Fialli added a comment -

          DAS and instance logs running on windows xp with jrockit 1.6_0_20.
          Illustrates that FailureSuspectedSignalImpl NoClassDefFound did not occur in simple
          Shoal GMS run. Thus, reported failure is suspected to be stress run impact on class loading.

          Show
          Joe Fialli added a comment - DAS and instance logs running on windows xp with jrockit 1.6_0_20. Illustrates that FailureSuspectedSignalImpl NoClassDefFound did not occur in simple Shoal GMS run. Thus, reported failure is suspected to be stress run impact on class loading.
          Hide
          Joe Fialli added a comment -

          Abbreviated server.log showing jrockit version and FailureSuspected being reported.

          Jan 5, 2011 9:53:10 AM com.sun.enterprise.admin.launcher.GFLauncherLogger info
          INFO: JVM invocation command line:
          C:\jrockit-jdk1.6.0_20-R28.1.0-4.0.1\bin\java.exe

          <deleted>
          [#|2011-01-05T09:54:17.625-0500|INFO|glassfish3.1|ShoalLogger|_ThreadID=16;_ThreadName=Thread-3;|GMS1092: GMS View Change Received for group: myCluster : Members in view for IN_DOUBT_EVENT(before change analysis) are :
          1: MemberId: instance01, MemberType: CORE, Address: 129.148.6.186:9121:228.9.1.3:2231:myCluster:instance01
          2: MemberId: instance02, MemberType: CORE, Address: 129.148.6.186:9130:228.9.1.3:2231:myCluster:instance02
          3: MemberId: server, MemberType: SPECTATOR, Address: 129.148.6.186:9173:228.9.1.3:2231:myCluster:server

          #]

          [#|2011-01-05T09:54:17.625-0500|INFO|glassfish3.1|ShoalLogger|_ThreadID=16;_ThreadName=Thread-3;|GMS1016: Analyzing new membership snapshot received as part of event: IN_DOUBT_EVENT for member: instance01 of group: myCluster|#]

          [#|2011-01-05T09:54:17.625-0500|INFO|glassfish3.1|ShoalLogger|_ThreadID=16;_ThreadName=Thread-3;|GMS1007: Received FailureSuspectedEvent for member: instance01 of group: myCluster|#]

          [#|2011-01-05T09:54:17.625-0500|INFO|glassfish3.1|ShoalLogger|_ThreadID=22;_ThreadName=Thread-3;|GMS1005: Sending FailureSuspectedSignals to registered Actions. member: instance01 ...|#]

          [#|2011-01-05T09:54:21.547-0500|INFO|glassfish3.1|ShoalLogger|_ThreadID=16;_ThreadName=Thread-3;|GMS1092: GMS View Change Received for group: myCluster : Members in view for FAILURE_EVENT(before change analysis) are :
          1: MemberId: instance02, MemberType: CORE, Address: 129.148.6.186:9130:228.9.1.3:2231:myCluster:instance02
          2: MemberId: server, MemberType: SPECTATOR, Address: 129.148.6.186:9173:228.9.1.3:2231:myCluster:server

          #]

          [#|2011-01-05T09:54:21.547-0500|INFO|glassfish3.1|ShoalLogger|_ThreadID=16;_ThreadName=Thread-3;|GMS1016: Analyzing new membership snapshot received as part of event: FAILURE_EVENT for member: instance01 of group: myCluster|#]

          [#|2011-01-05T09:54:21.547-0500|INFO|glassfish3.1|ShoalLogger|_ThreadID=16;_ThreadName=Thread-3;|GMS1019: member: instance01 of group: myCluster has failed|#]

          The above sequence of log events included a NoClassDefFound for FailureSuspectedSignalImpl in this report.
          So it is definitely working in non-stress case.

          Show
          Joe Fialli added a comment - Abbreviated server.log showing jrockit version and FailureSuspected being reported. Jan 5, 2011 9:53:10 AM com.sun.enterprise.admin.launcher.GFLauncherLogger info INFO: JVM invocation command line: C:\jrockit-jdk1.6.0_20-R28.1.0-4.0.1\bin\java.exe <deleted> [#|2011-01-05T09:54:17.625-0500|INFO|glassfish3.1|ShoalLogger|_ThreadID=16;_ThreadName=Thread-3;|GMS1092: GMS View Change Received for group: myCluster : Members in view for IN_DOUBT_EVENT(before change analysis) are : 1: MemberId: instance01, MemberType: CORE, Address: 129.148.6.186:9121:228.9.1.3:2231:myCluster:instance01 2: MemberId: instance02, MemberType: CORE, Address: 129.148.6.186:9130:228.9.1.3:2231:myCluster:instance02 3: MemberId: server, MemberType: SPECTATOR, Address: 129.148.6.186:9173:228.9.1.3:2231:myCluster:server #] [#|2011-01-05T09:54:17.625-0500|INFO|glassfish3.1|ShoalLogger|_ThreadID=16;_ThreadName=Thread-3;|GMS1016: Analyzing new membership snapshot received as part of event: IN_DOUBT_EVENT for member: instance01 of group: myCluster|#] [#|2011-01-05T09:54:17.625-0500|INFO|glassfish3.1|ShoalLogger|_ThreadID=16;_ThreadName=Thread-3;|GMS1007: Received FailureSuspectedEvent for member: instance01 of group: myCluster|#] [#|2011-01-05T09:54:17.625-0500|INFO|glassfish3.1|ShoalLogger|_ThreadID=22;_ThreadName=Thread-3;|GMS1005: Sending FailureSuspectedSignals to registered Actions. member: instance01 ...|#] [#|2011-01-05T09:54:21.547-0500|INFO|glassfish3.1|ShoalLogger|_ThreadID=16;_ThreadName=Thread-3;|GMS1092: GMS View Change Received for group: myCluster : Members in view for FAILURE_EVENT(before change analysis) are : 1: MemberId: instance02, MemberType: CORE, Address: 129.148.6.186:9130:228.9.1.3:2231:myCluster:instance02 2: MemberId: server, MemberType: SPECTATOR, Address: 129.148.6.186:9173:228.9.1.3:2231:myCluster:server #] [#|2011-01-05T09:54:21.547-0500|INFO|glassfish3.1|ShoalLogger|_ThreadID=16;_ThreadName=Thread-3;|GMS1016: Analyzing new membership snapshot received as part of event: FAILURE_EVENT for member: instance01 of group: myCluster|#] [#|2011-01-05T09:54:21.547-0500|INFO|glassfish3.1|ShoalLogger|_ThreadID=16;_ThreadName=Thread-3;|GMS1019: member: instance01 of group: myCluster has failed|#] The above sequence of log events included a NoClassDefFound for FailureSuspectedSignalImpl in this report. So it is definitely working in non-stress case.
          Hide
          Joe Fialli added a comment -

          Unable to proceed that this is a group-management-service issue with current information.

          Previous comments on this issue document my attempts to recreate the reported failure in a
          simplified Shoal GMS test on windows xp platform (32 bit) using jrockit 1.6_0_20. The
          simple test demonstrates that the package exports for Shoal GMS are correct and working.
          The attached logs illustrate that for this simple run the class FailureSuspectedSignalImpl
          was found and that NoClassDefFound did not occur. At this time, there is no evidence that
          this is a a GMS issue but it looks more like a class loading issue caused by the stress test.
          The issue may have been caused by previous out of memory exceptions or some other bad conditions.
          But in simple terms, Shoal GMS SuspectedFailure notification has been verified with jrockit on
          both windows and linux. We have successfully run all Shoal GMS QE test on OEL with jrockit 1.6_0_20
          at least one run of each test. We have started 5 iterations of each scenario just to make sure.

          Show
          Joe Fialli added a comment - Unable to proceed that this is a group-management-service issue with current information. Previous comments on this issue document my attempts to recreate the reported failure in a simplified Shoal GMS test on windows xp platform (32 bit) using jrockit 1.6_0_20. The simple test demonstrates that the package exports for Shoal GMS are correct and working. The attached logs illustrate that for this simple run the class FailureSuspectedSignalImpl was found and that NoClassDefFound did not occur. At this time, there is no evidence that this is a a GMS issue but it looks more like a class loading issue caused by the stress test. The issue may have been caused by previous out of memory exceptions or some other bad conditions. But in simple terms, Shoal GMS SuspectedFailure notification has been verified with jrockit on both windows and linux. We have successfully run all Shoal GMS QE test on OEL with jrockit 1.6_0_20 at least one run of each test. We have started 5 iterations of each scenario just to make sure.

            People

            • Assignee:
              Joe Fialli
              Reporter:
              varunrupela
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: