shoal / SHOAL-82

notifying cluster view event is not thread safe

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: current
    • Fix Version/s: 1.1
    • Component/s: GMS
    • Labels:
      None
    • Environment:

      Operating System: All
      Platform: Windows

    • Issuezilla Id:
      82

      Description

      ClusterViewManager.notifyListeners() can run on multiple threads at once when
      many members join the same group concurrently.

      Even though no member has failed, the following log appears.

      ------------------------------------
      2008. 11. 12 PM 5:44:00 com.sun.enterprise.shoal.jointest.SimpleJoinTest initializeGMS
      INFO: Initializing Shoal for member: 5d3280a2-a0c5-4ae2-8d41-d59b57400b8f group:TestGroup
      2008. 11. 12 PM 5:44:00 com.sun.enterprise.shoal.jointest.SimpleJoinTest runSimpleSample
      INFO: Registering for group event notifications
      2008. 11. 12 PM 5:44:00 com.sun.enterprise.shoal.jointest.SimpleJoinTest runSimpleSample
      INFO: Joining Group TestGroup
      2008. 11. 12 PM 5:44:07 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
      INFO: GMS View Change Received for group TestGroup (5d3280a2-a0c5-4ae2-8d41-d59b57400b8f) : Members in view for (before change analysis) are :
      1: MemberId: 5d3280a2-a0c5-4ae2-8d41-d59b57400b8f, MemberType: CORE, Address:
      urn:jxta:uuid-59616261646162614A78746150325033090183254F6D47E7B235BC8D656194FA03

      2008. 11. 12 PM 5:44:07 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
      INFO: Analyzing new membership snapshot received as part of event : MASTER_CHANGE_EVENT
      2008. 11. 12 PM 5:44:08 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
      INFO: GMS View Change Received for group TestGroup (aeea918f-571b-463b-bfa6-55c536df0d11) : Members in view for (before change analysis) are :
      (a)
      1: MemberId: 5d3280a2-a0c5-4ae2-8d41-d59b57400b8f, MemberType: CORE, Address:
      urn:jxta:uuid-59616261646162614A78746150325033090183254F6D47E7B235BC8D656194FA03
      2: MemberId: addb1dbe-06cf-43b8-8903-78605f29091f, MemberType: CORE, Address:
      urn:jxta:uuid-59616261646162614A787461503250336C047E2077544A5692C1EA21407A886303
      3: MemberId: aeea918f-571b-463b-bfa6-55c536df0d11, MemberType: CORE, Address:
      urn:jxta:uuid-59616261646162614A78746150325033DBAE9788614944F8A40ED352C8E7A03B03
      4: MemberId: fae1414d-702a-42fd-8c7d-6ffabe8b2e69, MemberType: CORE, Address:
      urn:jxta:uuid-59616261646162614A78746150325033EF69FCF215DE43038FD0C3AA0535A08203

      2008. 11. 12 PM 5:44:08 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
      INFO: Analyzing new membership snapshot received as part of event : ADD_EVENT
      2008. 11. 12 PM 5:44:17 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
      INFO: GMS View Change Received for group TestGroup (addb1dbe-06cf-43b8-8903-78605f29091f) : Members in view for (before change analysis) are :
      (b)
      1: MemberId: 5d3280a2-a0c5-4ae2-8d41-d59b57400b8f, MemberType: CORE, Address:
      urn:jxta:uuid-59616261646162614A78746150325033090183254F6D47E7B235BC8D656194FA03
      2: MemberId: addb1dbe-06cf-43b8-8903-78605f29091f, MemberType: CORE, Address:
      urn:jxta:uuid-59616261646162614A787461503250336C047E2077544A5692C1EA21407A886303

      2008. 11. 12 PM 5:44:17 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
      INFO: Analyzing new membership snapshot received as part of event : ADD_EVENT
      2008. 11. 12 PM 5:44:17 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
      INFO: GMS View Change Received for group TestGroup (fae1414d-702a-42fd-8c7d-6ffabe8b2e69) : Members in view for (before change analysis) are :
      (c)
      1: MemberId: 5d3280a2-a0c5-4ae2-8d41-d59b57400b8f, MemberType: CORE, Address:
      urn:jxta:uuid-59616261646162614A78746150325033090183254F6D47E7B235BC8D656194FA03
      2: MemberId: addb1dbe-06cf-43b8-8903-78605f29091f, MemberType: CORE, Address:
      urn:jxta:uuid-59616261646162614A787461503250336C047E2077544A5692C1EA21407A886303
      3: MemberId: fae1414d-702a-42fd-8c7d-6ffabe8b2e69, MemberType: CORE, Address:
      urn:jxta:uuid-59616261646162614A78746150325033EF69FCF215DE43038FD0C3AA0535A08203

      2008. 11. 12 PM 5:44:17 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
      INFO: Analyzing new membership snapshot received as part of event : ADD_EVENT
      2008. 11. 12 PM 5:44:20 com.sun.enterprise.ee.cms.impl.jxta.ViewWindow getMemberTokens
      INFO: GMS View Change Received for group TestGroup (42b22147-7683-481f-a9f4-85ba5a2b847f) : Members in view for (before change analysis) are :
      1: MemberId: 5d3280a2-a0c5-4ae2-8d41-d59b57400b8f, MemberType: CORE, Address:
      urn:jxta:uuid-59616261646162614A78746150325033090183254F6D47E7B235BC8D656194FA03
      2: MemberId: 42b22147-7683-481f-a9f4-85ba5a2b847f, MemberType: CORE, Address:
      urn:jxta:uuid-59616261646162614A787461503250334501FF701A644877A4B4C65068965F3403
      3: MemberId: addb1dbe-06cf-43b8-8903-78605f29091f, MemberType: CORE, Address:
      urn:jxta:uuid-59616261646162614A787461503250336C047E2077544A5692C1EA21407A886303
      4: MemberId: aeea918f-571b-463b-bfa6-55c536df0d11, MemberType: CORE, Address:
      urn:jxta:uuid-59616261646162614A78746150325033DBAE9788614944F8A40ED352C8E7A03B03
      5: MemberId: fae1414d-702a-42fd-8c7d-6ffabe8b2e69, MemberType: CORE, Address:
      urn:jxta:uuid-59616261646162614A78746150325033EF69FCF215DE43038FD0C3AA0535A08203

      ...
      ------------------------------------

      This log shows that five members joined "TestGroup":

      1: MemberId: 5d3280a2-a0c5-4ae2-8d41-d59b57400b8f, MemberType: CORE, Address:
      urn:jxta:uuid-59616261646162614A78746150325033090183254F6D47E7B235BC8D656194FA03
      2: MemberId: 42b22147-7683-481f-a9f4-85ba5a2b847f, MemberType: CORE, Address:
      urn:jxta:uuid-59616261646162614A787461503250334501FF701A644877A4B4C65068965F3403
      3: MemberId: addb1dbe-06cf-43b8-8903-78605f29091f, MemberType: CORE, Address:
      urn:jxta:uuid-59616261646162614A787461503250336C047E2077544A5692C1EA21407A886303
      4: MemberId: aeea918f-571b-463b-bfa6-55c536df0d11, MemberType: CORE, Address:
      urn:jxta:uuid-59616261646162614A78746150325033DBAE9788614944F8A40ED352C8E7A03B03
      5: MemberId: fae1414d-702a-42fd-8c7d-6ffabe8b2e69, MemberType: CORE, Address:
      urn:jxta:uuid-59616261646162614A78746150325033EF69FCF215DE43038FD0C3AA0535A08203

      ViewWindow prints this log from the viewQueue each time a new view is
      observed.

      In the log above, however, the order of (a), (b) and (c) is wrong: the views
      contain 4, 2 and 3 members respectively.

      Because there are no failures, the number of members should only grow
      (that is, size of (a) <= size of (b) <= size of (c)).
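The ordering expectation above can be written down as a small check. This is only an illustrative sketch: `ViewSizeCheck` and `firstRegression` are hypothetical names that do not exist in Shoal, and the sizes are read off the log above by hand (1 member, then (a), (b), (c), then 5 members).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Shoal: checks that view sizes never shrink.
final class ViewSizeCheck {

    /** Returns the index of the first view smaller than its predecessor, or -1 if none. */
    static int firstRegression(List<Integer> viewSizes) {
        for (int i = 1; i < viewSizes.size(); i++) {
            if (viewSizes.get(i) < viewSizes.get(i - 1)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // View sizes in the order ViewWindow logged them: 1, (a)=4, (b)=2, (c)=3, 5.
        List<Integer> observed = Arrays.asList(1, 4, 2, 3, 5);
        System.out.println(firstRegression(observed)); // prints 2: view (b) is smaller than (a)
    }
}
```

With no failures in the group, a check like this would be expected to return -1 for every run.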

      The following code is ClusterViewManager's notifyListeners() method.


      void notifyListeners(final ClusterViewEvent event) {
          Log.log(...);
          for (ClusterViewEventListener elem : cvListeners) {
              elem.clusterViewEvent(event, getLocalView());
          }
      }


      getLocalView() is thread safe because it takes viewLock, but the call to
      ClusterViewEventListener's clusterViewEvent() is not.

      The following code is GroupCommunicationProviderImpl's clusterViewEvent()
      method, which implements the ClusterViewEventListener interface.


      public void clusterViewEvent(final ClusterViewEvent clusterViewEvent,
                                   final ClusterView clusterView) {
          ...
          final EventPacket ePacket = new EventPacket(clusterViewEvent.getEvent(),
                  clusterViewEvent.getAdvertisement(), clusterView);
          final ArrayBlockingQueue<EventPacket> viewQueue = getGMSContext().getViewQueue();
          try {
              viewQueue.put(ePacket);
          } catch (InterruptedException e) {
              ...
          }
      }
      -----
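To make the interleaving concrete, here is a minimal stand-alone model of it. It is a sketch under assumptions, not Shoal code: `SnapshotRaceDemo`, `addMember` and `run` are made-up names, the view is modelled as a list of member IDs, and latches force the unlucky scheduling that happens only occasionally in practice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;

// Stand-alone model of the race; none of these names are Shoal classes.
final class SnapshotRaceDemo {
    private final Object viewLock = new Object();
    private final List<String> members = new ArrayList<>();
    final BlockingQueue<List<String>> viewQueue = new ArrayBlockingQueue<>(16);

    void addMember(String id) {
        synchronized (viewLock) { members.add(id); }
    }

    // Like getLocalView(): the copy itself is taken safely under viewLock.
    List<String> getLocalView() {
        synchronized (viewLock) { return new ArrayList<>(members); }
    }

    /** Forces the bad interleaving and returns the view sizes in queue order. */
    static List<Integer> run() {
        SnapshotRaceDemo demo = new SnapshotRaceDemo();
        demo.addMember("m1");
        CountDownLatch snapshotTaken = new CountDownLatch(1);
        CountDownLatch newerViewQueued = new CountDownLatch(1);

        Thread slow = new Thread(() -> {
            try {
                List<String> view = demo.getLocalView(); // snapshot with 1 member
                snapshotTaken.countDown();
                newerViewQueued.await();                  // gap between snapshot and put
                demo.viewQueue.put(view);                 // stale view is queued last
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        slow.start();

        try {
            snapshotTaken.await();
            demo.addMember("m2");
            demo.viewQueue.put(demo.getLocalView());      // snapshot with 2 members
            newerViewQueued.countDown();
            slow.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        List<Integer> sizes = new ArrayList<>();
        for (List<String> v : demo.viewQueue) {
            sizes.add(v.size());
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [2, 1]
    }
}
```

Each snapshot is consistent on its own. Only the order in which the snapshots reach the queue is wrong, which matches the (a), (b), (c) pattern in the log.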

      I think that taking the local view's snapshot (the return value of
      getLocalView()) and calling viewQueue.put() should be one atomic step, like this.
      -----
      void notifyListeners(final ClusterViewEvent event) {
          Log.log(...);
          for (ClusterViewEventListener elem : cvListeners) {
              synchronized (elem) {
                  elem.clusterViewEvent(event, getLocalView());
              }
          }
      }

      or

      public synchronized void clusterViewEvent(final ClusterViewEvent clusterViewEvent,
                                                final ClusterView clusterView) {
          ...
          final EventPacket ePacket = new EventPacket(clusterViewEvent.getEvent(),
                  clusterViewEvent.getAdvertisement(), clusterView);
          final ArrayBlockingQueue<EventPacket> viewQueue = getGMSContext().getViewQueue();
          try {
              viewQueue.put(ePacket);
          } catch (InterruptedException e) {
              ...
          }
      }

      (I think the former is better, because clusterViewEvent() can have many
      different implementations.)


      In other words,
      -------------------------------------------------------------------
      getLocalView() --> local view's snapshot --> (hole) --> insert into view queue
      -------------------------------------------------------------------

      As shown above, there is a hole between taking the snapshot and inserting the
      EventPacket into the view queue. We can close the hole with a synchronized
      block or a dedicated lock object.
      Once the hole is closed, I think ViewWindow will receive the local view
      snapshots from the queue in the correct order.
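The dedicated lock object variant might look roughly like the sketch below. It is a stand-alone model, not a patch against Shoal: `AtomicNotifyDemo`, `notifyLock`, `memberJoined` and `run` are hypothetical names, and the view is again modelled as a list of member IDs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Stand-alone model of the proposed fix; these are not Shoal classes.
final class AtomicNotifyDemo {
    private final Object viewLock = new Object();
    private final Object notifyLock = new Object(); // dedicated lock closing the hole
    private final List<String> members = new ArrayList<>();
    final BlockingQueue<List<String>> viewQueue;

    AtomicNotifyDemo(int capacity) {
        viewQueue = new ArrayBlockingQueue<>(capacity);
    }

    List<String> getLocalView() {
        synchronized (viewLock) { return new ArrayList<>(members); }
    }

    // Called concurrently, once per joining member.
    void memberJoined(String id) {
        synchronized (viewLock) { members.add(id); }
        synchronized (notifyLock) {
            try {
                // Snapshot and enqueue happen as one step under notifyLock.
                viewQueue.put(getLocalView());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Lets `joiners` threads join at once and returns the view sizes in queue order. */
    static List<Integer> run(int joiners) {
        AtomicNotifyDemo demo = new AtomicNotifyDemo(joiners);
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < joiners; i++) {
            final String id = "m" + i;
            Thread t = new Thread(() -> demo.memberJoined(id));
            threads.add(t);
            t.start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        List<Integer> sizes = new ArrayList<>();
        for (List<String> v : demo.viewQueue) {
            sizes.add(v.size());
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(run(5)); // sizes never decrease; the last one is 5
    }
}
```

Because members are only added, snapshots taken in lock order can never shrink, so the queue order matches the membership history. One trade-off to keep in mind: put() on a bounded queue can block while notifyLock is held, which would stall other notifying threads until the consumer drains the queue.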

        Activity

        shreedhar_ganapathy added a comment - ..

          People

          • Assignee:
            Joe Fialli
            Reporter:
            carryel
          • Votes:
            0
            Watchers:
            0

            Dates

            • Created:
              Updated: