sailfin / SAILFIN-1120

Some non-Persistent Servlet Timers do not fire after Upscale (some of those that are on SASs owned by the newly added instance)

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 1.0
    • Fix Version/s: milestone 1
    • Component/s: doc
    • Labels:
      None
    • Environment:

      Operating System: Linux
      Platform: Other

    • Issuezilla Id:
      1,120

      Description

      Sailfin build 47

      Setup: 5 instance cluster - instance101, instance102, instance103, instance104,
      instance105

      Scenario:

      1. Complete 15 INVITE calls (establish and end each one). As a result, 15 SASs
      should be created, each of which should now have a 2-minute Persistent and a
      non-Persistent Servlet Timer set on it.
      2. Add an additional instance (instance107 in the case of the attached logs)
      3. Wait 7 minutes (just waiting extra time here) for all the timers to fire.
      4. Complete another set of 15 INVITE calls. This is done only to determine the
      new home instance for each of the SASs, so as to know where the Persistent
      Timers would fire.
      [Note that the Persistent Timers would fire on the new home instances of the SAS
      while the non-Persistent Timers should fire on the instance on which they were
      created.]
      5. Check that both Persistent and non-Persistent Timers fired. The timer
      listener of the attached app prints the following messages for each type of
      timer "120 second Persistent Timer for SAS with key - <sas-key> Just Went Off"
      and "120 second NON - Persistent Timer for SAS with key - <sas-key> Just Went Off".
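The check in step 5 can be automated over the collected instance logs. The following is a minimal sketch, assuming the listener prints exactly the message format quoted above. The class name `TimerLogCheck`, the method `missing`, and the sample log lines in `main` are hypothetical and written only for illustration; the leading number is matched loosely so that other timer durations also match.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TimerLogCheck {
    // Message format printed by the test application's timer listener (quoted in step 5).
    // Group 1 is present only for non-persistent timers; group 2 is the SAS key.
    private static final Pattern FIRED = Pattern.compile(
        "\\d+ second (NON - )?Persistent Timer for SAS with key - (\\S+) Just Went Off");

    /** Returns the expected SAS keys for which no timer of the requested kind was logged. */
    static Set<String> missing(List<String> logLines, Set<String> expectedKeys, boolean persistent) {
        Set<String> missing = new TreeSet<>(expectedKeys);
        for (String line : logLines) {
            Matcher m = FIRED.matcher(line);
            while (m.find()) {
                boolean isPersistent = m.group(1) == null;
                if (isPersistent == persistent) {
                    missing.remove(m.group(2));
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical log lines, built from the message format above.
        List<String> lines = Arrays.asList(
            "120 second Persistent Timer for SAS with key - gddidlrsbcv Just Went Off",
            "120 second NON - Persistent Timer for SAS with key - gddidlrsbcv Just Went Off",
            "120 second Persistent Timer for SAS with key - xyeycsuis Just Went Off");
        Set<String> keys = new HashSet<>(Arrays.asList("gddidlrsbcv", "xyeycsuis"));
        System.out.println("non-persistent timers that did not fire: " + missing(lines, keys, false));
        System.out.println("persistent timers that did not fire: " + missing(lines, keys, true));
    }
}
```

In practice the log lines would be read from the server logs of every instance in the cluster, since a timer may fire on any of them.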

      Issue:

      • Some of the non-persistent timers that belong to SASs whose new home instance
        is the newly added instance (instance107 in the attached logs) do not fire.

      Attached:

      • SAS keys of the 15 calls
      • Serving Instances of the first set of the 15 calls
      • Home instances of the SAS keys after the Upscale (i.e. addition of instance107)
      • Instance logs showing that the non-Persistent Timer for the SAS with key
        "gddidlrsbcv" correctly fires on instance105, while the non-Persistent Timer
        for the SAS with key "xyeycsuis" does not fire at all. For both SASs the new
        home instance is instance107.

        Activity

        varunrupela added a comment -

        Created an attachment (id=620)
        Attached the application sources, SAR, SIPp scenario file, CSV file, and the other items mentioned in the issue description.

        varunrupela added a comment -

        Some more information:

        • The sas-keys were chosen at random.
        • To run the application, deploy it on a cluster and then run the following SIPp
          command (using the attached files):
          sipp -r 1 -inf receiver-instance.csv -p 7000 -m 15 -rp 1s -trace_msg
          -trace_screen -trace_err -trace_timeout -trace_stat -trace_rtt -trace_logs -nd
          -l 1000000 -timeout 180s -recv_timeout 60s -sf upscale-timer.xml
          <sailfin-host:sip-port>
        varunrupela added a comment -

        Created an attachment (id=626)
        Attaching logs for the test run on build 48. This time the non-persistent timers were set at 100 seconds, 20 seconds below the persistent timers. The issue exists with this new setting as well.

        lwhite added a comment -

        Since this issue is confined to non-persistent timers
        which are not replicated, I am assigning this to Peter
        and changing the subcomponent to sip_container.

        lwhite added a comment -

        After more thought I am concluding that this issue
        is invalid. Non-persistent timers by design do not
        survive beyond the instance on which they were created.

        So:
        a) if the instance upon which the timer is created fails
        or is stopped, you cannot count on that timer firing.
        b) and because of the consistent hash mapping, if a
        SAS and its associated timers are required to migrate,
        then here too the non-persistent timers will not migrate.

        I think this should be documented so that people who are
        expecting to have cluster shape changes of any sort will
        not expect non-persistent timers to work.
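To illustrate point b), here is a minimal sketch of a generic consistent-hash ring, showing why adding an instance moves some SAS keys to the new instance and leaves the rest where they were. It is not Sailfin's actual hashing code: the class `UpscaleRemapSketch`, the CRC32-based hash, the virtual-node count, and the sample keys `sas-0` to `sas-14` are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

final class UpscaleRemapSketch {
    // Each instance is placed on the ring at several points to spread the keys (assumed value).
    private static final int VIRTUAL_NODES = 100;
    private final TreeMap<Long, String> ring = new TreeMap<>();

    private static long hash(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    void addInstance(String name) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.put(hash(name + "#" + i), name);
        }
    }

    /** Home instance of a key: the first ring point at or after the key's hash, wrapping around. */
    String homeOf(String sasKey) {
        // Assumes at least one instance has been added.
        Map.Entry<Long, String> e = ring.ceilingEntry(hash(sasKey));
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    public static void main(String[] args) {
        UpscaleRemapSketch cluster = new UpscaleRemapSketch();
        for (int i = 101; i <= 105; i++) {
            cluster.addInstance("instance" + i);
        }
        // Record the home instance of 15 hypothetical SAS keys before the upscale.
        Map<String, String> before = new TreeMap<>();
        for (int i = 0; i < 15; i++) {
            String key = "sas-" + i;
            before.put(key, cluster.homeOf(key));
        }
        cluster.addInstance("instance107");
        // Print only the keys whose home changed; on such a ring they can only move to the new instance.
        for (Map.Entry<String, String> e : before.entrySet()) {
            String after = cluster.homeOf(e.getKey());
            if (!after.equals(e.getValue())) {
                System.out.println(e.getKey() + ": " + e.getValue() + " -> " + after);
            }
        }
    }
}
```

Under this model, no instance has to fail or stop for a SAS to change its home instance; an upscale alone is enough.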

        varunrupela added a comment -

        The cluster reshape here is an up-scale (i.e. a new instance was added). None of
        the instances failed or were stopped during the test.

        varunrupela added a comment -

        [Additional Note]

        Section 9.1 of the JSR 289 spec gives details only for persistent timers:

        "if (isPersistent is) true the ServletTimer should be reinstantiated if the
        server is shut down and subsequently restarted. During the restart, the
        container will call the TimerInterface for a timer that has expired during the
        shutdown. The SipApplicationSession associated with the ServletTimer should be
        persistent."

        This could be interpreted as saying that non-Persistent Timers should be
        supported unless the instance fails or is shut down.

        Gathering some more inputs on this subject. If appropriate, will move this issue
        to the docs category.

        strandp added a comment -

        The consensus among ssr team members is that this issue should be addressed by
        documenting the following topics (this may not be the complete or final list):

        • Document that Upscale leads to migrated sessions
        • Document that non-persistent timers do not survive migration
        • Document that the SAS is re-activated and migrates at the appointed time of
          the non-persistent timer, but the timer does not fire after the migration
        • Document that it is up to the application to manage non-persistent timers and
          that this can be done via the passivate/activate lifecycle methods
        • All of the above is valid only in a clustered environment
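
The application-managed approach from the list above could look roughly like this. It is a sketch of the bookkeeping only, using the standard library: a plain `Map` stands in for the SAS attributes, and the class name, method names, and attribute name are hypothetical. In a real SIP Servlet application the two methods would presumably be called from the `sessionWillPassivate` and `sessionDidActivate` callbacks of a `SipApplicationSessionActivationListener`, and the returned delay passed to `TimerService.createTimer` to create a fresh non-persistent timer; whether the container invokes these callbacks around every migration is an assumption based on the comment above.

```java
import java.util.HashMap;
import java.util.Map;

final class NonPersistentTimerRearmSketch {
    // Hypothetical attribute name under which the timer's due time travels with the session.
    static final String DEADLINE_ATTR = "nonPersistentTimerDeadlineMillis";

    /** Before passivation: store the absolute due time in session state, which does migrate. */
    static void beforePassivate(Map<String, Object> sasAttributes, long nowMillis, long remainingMillis) {
        sasAttributes.put(DEADLINE_ATTR, nowMillis + remainingMillis);
    }

    /**
     * After activation: returns the delay for a replacement timer, 0 if the timer is
     * already overdue, or -1 if no timer was recorded for this session.
     */
    static long afterActivate(Map<String, Object> sasAttributes, long nowMillis) {
        Object stored = sasAttributes.remove(DEADLINE_ATTR);
        if (stored == null) {
            return -1L;
        }
        long deadline = (Long) stored;
        return Math.max(0L, deadline - nowMillis);
    }

    public static void main(String[] args) {
        Map<String, Object> attributes = new HashMap<>();
        // A 120-second timer set at t=0 has 30 seconds left when the session passivates at t=90s.
        beforePassivate(attributes, 90_000L, 30_000L);
        // The session is activated on its new home instance 5 seconds later.
        long delay = afterActivate(attributes, 95_000L);
        System.out.println("re-create the timer with delay (ms): " + delay); // prints 25000
    }
}
```

Storing an absolute due time rather than the remaining delay keeps the result correct regardless of how long the migration takes, provided the clocks of the instances are reasonably synchronized.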

          People

          • Assignee:
            sanandal
            Reporter:
            varunrupela