sailfin
  1. sailfin
  2. SAILFIN-1989

High periodic response time observed for a minute after failure was introduced in the conference scenario.

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Critical Critical
    • Resolution: Fixed
    • Affects Version/s: 2.0
    • Fix Version/s: milestone 1
    • Component/s: session_replication
    • Labels:
      None
    • Environment:

      Operating System: All
      Platform: All

      Description

      Parent Issue: 1983
      Build: 31b 15th nightly

      SIP traffic shows up to 15 seconds periodic response time for a minute after
      instance105 was failed. Below is a snapshot of the periodic response time after
      instance105 was failed:

      Notice the last column below from time 22:39:43 to 22:40:46.

      CurrentTime;CallRate(P);ResponseTime1(P)
      2009-09-15 22:39:19:461 1253034559.461694;37.0123;00:00:00:005
      2009-09-15 22:39:22:461 1253034562.461279;37.0123;00:00:00:005
      2009-09-15 22:39:25:461 1253034565.461225;37.0123;00:00:00:004
      2009-09-15 22:39:28:461 1253034568.461000;37.0123;00:00:00:003
      2009-09-15 22:39:31:461 1253034571.461121;37.0123;00:00:00:005
      2009-09-15 22:39:34:461 1253034574.461068;37.0123;00:00:00:008
      2009-09-15 22:39:37:465 1253034577.465016;36.963;00:00:00:006
      2009-09-15 22:39:40:468 1253034580.468963;36.963;00:00:00:004
      2009-09-15 22:39:43:469 1253034583.469011;37.0123;00:00:00:092
      2009-09-15 22:39:46:468 1253034586.468942;37.0123;00:00:02:261
      2009-09-15 22:39:49:468 1253034589.468868;37.0123;00:00:01:127
      2009-09-15 22:39:52:472 1253034592.472784;36.963;00:00:00:574
      2009-09-15 22:39:55:472 1253034595.472762;37.0123;00:00:01:320
      2009-09-15 22:39:58:472 1253034598.472226;37.0123;00:00:01:008
      2009-09-15 22:40:01:472 1253034601.472658;37;00:00:03:118
      2009-09-15 22:40:04:472 1253034604.472722;37.0123;00:00:04:155
      2009-09-15 22:40:07:472 1253034607.472458;37.0123;00:00:06:023
      2009-09-15 22:40:10:480 1253034610.480396;36.9139;00:00:06:745
      2009-09-15 22:40:13:480 1253034613.480503;37.0123;00:00:08:344
      2009-09-15 22:40:16:480 1253034616.480427;37.0123;00:00:11:651
      2009-09-15 22:40:19:480 1253034619.480282;37.0123;00:00:10:388
      2009-09-15 22:40:22:481 1253034622.481014;37;00:00:07:273
      2009-09-15 22:40:25:484 1253034625.484184;37.3085;00:00:13:209
      2009-09-15 22:40:28:484 1253034628.484128;37.0123;00:00:00:000
      2009-09-15 22:40:31:484 1253034631.484180;37.0123;00:00:15:760
      2009-09-15 22:40:34:484 1253034634.484023;37.0123;00:00:11:694
      2009-09-15 22:40:37:487 1253034637.487959;36.963;00:00:04:536
      2009-09-15 22:40:40:488 1253034640.488028;37.0123;00:00:15:285
      2009-09-15 22:40:43:487 1253034643.487893;37.0123;00:00:08:490
      2009-09-15 22:40:46:491 1253034646.491824;36.963;00:00:01:742
      2009-09-15 22:40:49:491 1253034649.491775;37.0123;00:00:00:009
      2009-09-15 22:40:52:495 1253034652.495716;36.963;00:00:00:011
      2009-09-15 22:40:55:495 1253034655.495666;37.0123;00:00:00:004
      2009-09-15 22:40:58:495 1253034658.495677;37.0123;00:00:00:004
      2009-09-15 22:41:01:495 1253034661.495652;37.0123;00:00:00:014
      2009-09-15 22:41:04:495 1253034664.495540;37.0123;00:00:00:008
      2009-09-15 22:41:07:495 1253034667.495214;37.0123;00:00:00:006
      2009-09-15 22:41:10:495 1253034670.495041;37.0123;00:00:00:006

      HTTP traffic also notes a higher periodic response time for each transaction.
      Please see attached file between time 10:39:43 to 10:40:55 and watch for the
      value "Periodic Avg. transactions time (ms)".

        Issue Links

          Activity

          Hide
          varunrupela added a comment -

          updated keyword

          Show
          varunrupela added a comment - updated keyword
          Hide
          varunrupela added a comment -

          The response time information attached above was gathered by using the
          "-trace_stat" option when running SIPp. The columns above were cut out from the
          csv file thus generated.

          The complete SIPp stats file (for 1 of the clients) for the entire run is
          available at:
          sf-x2200-11:/space/sony/logs/2.0/b31b-15thnightly/conference-ssr/IEC1-SuSE/sipp/ejp-156x-127/st5_1_conference_uac_failure.xml_21977_.csv
          (temporarily moved to sf-x2200-17:/space/system-test/backup-from-sf-x2200-11/)

          The periodic response time information was cut out from that file and saved in
          21977-periodic.out. The information pasted above is a snippet from this file to
          show the problem areas.

          Show
          varunrupela added a comment - The response time information attached above was gathered by using the "-trace_stat" option when running SIPp. The columns above were cut out from the csv file thus generated. The complete SIPp stats file (for 1 of the clients) for the entire run is available at: sf-x2200-11:/space/sony/logs/2.0/b31b-15thnightly/conference-ssr/IEC1-SuSE/sipp/ejp-156x-127/st5_1_conference_uac_failure.xml_21977_.csv (temporarily moved to sf-x2200-17:/space/system-test/backup-from-sf-x2200-11/) The periodic response time information was cut out from that file and saved in 21977-periodic.out. The information pasted above is a snippet from this file to show the problem areas.
          Hide
          Stephen DiMilla added a comment -

          Using the following sailfin nightly build:
          sailfin-image-v2-b32-nightly-09_sep_2009.jar

          and the patch:
          deal_lock.jar (2009-09-23 07:17 dead_lock.jar)

          When I ran the test at 37cps I saw no response times that were greater than 1
          second.

          The kill occurred at 08:19

          Here's the new data:

          CurrentTime" "CallRate(P)" "ResponseTime1(P)"
          07:36:44:302 0 "00:00:00:000"
          07:37:44:299 36.98 "00:00:00:008"
          07:38:44:299 37.02 "00:00:00:006"
          07:39:44:298 36.98 "00:00:00:005"
          07:40:44:298 37.02 "00:00:00:005"
          07:41:44:299 36.98 "00:00:00:005"
          07:42:44:298 37.02 "00:00:00:005"
          07:43:44:299 36.98 "00:00:00:004"
          07:44:44:302 37.02 "00:00:00:004"
          07:45:44:304 37 "00:00:00:005"
          07:46:44:307 37 "00:00:00:004"
          07:47:44:310 37 "00:00:00:005"
          07:48:44:313 37 "00:00:00:004"
          07:49:44:317 37 "00:00:00:004"
          07:50:44:320 37 "00:00:00:004"
          07:51:44:323 37 "00:00:00:004"
          07:52:44:326 37 "00:00:00:004"
          07:53:44:325 37.02 "00:00:00:004"
          07:54:44:327 37 "00:00:00:005"
          07:55:44:328 37 "00:00:00:005"
          07:56:44:327 37 "00:00:00:005"
          07:57:44:327 37 "00:00:00:004"
          07:58:44:328 37 "00:00:00:004"
          07:59:44:329 37 "00:00:00:005"
          08:00:44:329 37 "00:00:00:004"
          08:01:44:333 37 "00:00:00:004"
          08:02:44:332 37 "00:00:00:004"
          08:03:44:335 37 "00:00:00:005"
          08:04:44:338 37 "00:00:00:004"
          08:05:44:342 37 "00:00:00:005"
          08:06:44:345 37 "00:00:00:004"
          08:07:44:348 37 "00:00:00:004"
          08:08:44:351 37 "00:00:00:004"
          08:09:44:354 37 "00:00:00:004"
          08:10:44:356 37.02 "00:00:00:004"
          08:11:44:356 37 "00:00:00:004"
          08:12:44:356 37 "00:00:00:005"
          08:13:44:356 37 "00:00:00:005"
          08:14:44:357 37 "00:00:00:005"
          08:15:44:356 37 "00:00:00:005"
          08:16:44:360 37 "00:00:00:005"
          08:17:44:363 37 "00:00:00:004"
          08:18:44:366 37 "00:00:00:005"
          08:19:44:369 37 "00:00:00:004"
          08:20:44:372 37 "00:00:00:125"
          08:21:44:375 37 "00:00:00:008"
          08:22:44:375 37 "00:00:00:007"
          08:23:44:375 37 "00:00:00:007"
          08:24:44:374 37 "00:00:00:007"
          08:25:44:378 37 "00:00:00:005"
          08:26:44:377 37 "00:00:00:004"
          08:27:44:381 37 "00:00:00:004"
          08:28:44:381 37 "00:00:00:004"
          08:29:44:381 37.02 "00:00:00:004"
          08:30:44:381 37 "00:00:00:004"
          08:31:44:382 37 "00:00:00:004"
          08:32:44:383 37 "00:00:00:003"
          08:33:44:386 37 "00:00:00:004"
          08:34:44:386 37 "00:00:00:004"
          08:35:44:389 37 "00:00:00:003"
          08:36:44:392 37 "00:00:00:004"
          08:37:44:391 37 "00:00:00:004"
          08:38:44:394 37 "00:00:00:003"
          08:39:44:397 37 "00:00:00:003"
          08:40:44:401 37 "00:00:00:003"
          08:41:44:404 37 "00:00:00:004"
          08:42:44:407 37 "00:00:00:004"
          08:43:44:409 37.02 "00:00:00:003"
          08:44:44:408 37 "00:00:00:003"
          08:45:44:412 37 "00:00:00:004"
          08:46:44:415 37 "00:00:00:003"
          08:47:44:414 37 "00:00:00:003"
          08:48:44:418 37 "00:00:00:003"
          08:49:44:420 37 "00:00:00:003"
          08:50:44:420 37 "00:00:00:003"
          08:51:44:423 37 "00:00:00:003"
          08:52:44:426 37 "00:00:00:004"
          08:53:44:429 37 "00:00:00:003"
          08:54:44:433 37 "00:00:00:004"
          08:55:44:434 37.02 "00:00:00:003"
          08:56:44:435 36.98 "00:00:00:003"
          08:57:44:435 37.02 "00:00:00:003"
          08:58:44:437 36.98 "00:00:00:004"
          08:59:44:440 37.01 "00:00:00:004"
          09:00:44:443 37 "00:00:00:004"
          09:01:44:446 37 "00:00:00:004"
          09:02:44:449 37 "00:00:00:004"
          09:03:44:450 37 "00:00:00:002"

          When the code is checked in (from the patch) this bug will be marked fixed

          Show
          Stephen DiMilla added a comment - Using the following sailfin nightly build: sailfin-image-v2-b32-nightly-09_sep_2009.jar and the patch: deal_lock.jar (2009-09-23 07:17 dead_lock.jar) When I ran the test at 37cps I saw no response times that were greater than 1 second. The kill occurred at 08:19 Here's the new data: CurrentTime" "CallRate(P)" "ResponseTime1(P)" 07:36:44:302 0 "00:00:00:000" 07:37:44:299 36.98 "00:00:00:008" 07:38:44:299 37.02 "00:00:00:006" 07:39:44:298 36.98 "00:00:00:005" 07:40:44:298 37.02 "00:00:00:005" 07:41:44:299 36.98 "00:00:00:005" 07:42:44:298 37.02 "00:00:00:005" 07:43:44:299 36.98 "00:00:00:004" 07:44:44:302 37.02 "00:00:00:004" 07:45:44:304 37 "00:00:00:005" 07:46:44:307 37 "00:00:00:004" 07:47:44:310 37 "00:00:00:005" 07:48:44:313 37 "00:00:00:004" 07:49:44:317 37 "00:00:00:004" 07:50:44:320 37 "00:00:00:004" 07:51:44:323 37 "00:00:00:004" 07:52:44:326 37 "00:00:00:004" 07:53:44:325 37.02 "00:00:00:004" 07:54:44:327 37 "00:00:00:005" 07:55:44:328 37 "00:00:00:005" 07:56:44:327 37 "00:00:00:005" 07:57:44:327 37 "00:00:00:004" 07:58:44:328 37 "00:00:00:004" 07:59:44:329 37 "00:00:00:005" 08:00:44:329 37 "00:00:00:004" 08:01:44:333 37 "00:00:00:004" 08:02:44:332 37 "00:00:00:004" 08:03:44:335 37 "00:00:00:005" 08:04:44:338 37 "00:00:00:004" 08:05:44:342 37 "00:00:00:005" 08:06:44:345 37 "00:00:00:004" 08:07:44:348 37 "00:00:00:004" 08:08:44:351 37 "00:00:00:004" 08:09:44:354 37 "00:00:00:004" 08:10:44:356 37.02 "00:00:00:004" 08:11:44:356 37 "00:00:00:004" 08:12:44:356 37 "00:00:00:005" 08:13:44:356 37 "00:00:00:005" 08:14:44:357 37 "00:00:00:005" 08:15:44:356 37 "00:00:00:005" 08:16:44:360 37 "00:00:00:005" 08:17:44:363 37 "00:00:00:004" 08:18:44:366 37 "00:00:00:005" 08:19:44:369 37 "00:00:00:004" 08:20:44:372 37 "00:00:00:125" 08:21:44:375 37 "00:00:00:008" 08:22:44:375 37 "00:00:00:007" 08:23:44:375 37 "00:00:00:007" 08:24:44:374 37 "00:00:00:007" 08:25:44:378 37 "00:00:00:005" 08:26:44:377 37 "00:00:00:004" 08:27:44:381 37 "00:00:00:004" 08:28:44:381 37 "00:00:00:004" 08:29:44:381 37.02 "00:00:00:004" 08:30:44:381 37 "00:00:00:004" 08:31:44:382 37 "00:00:00:004" 08:32:44:383 37 "00:00:00:003" 08:33:44:386 37 "00:00:00:004" 08:34:44:386 37 "00:00:00:004" 08:35:44:389 37 "00:00:00:003" 08:36:44:392 37 "00:00:00:004" 08:37:44:391 37 "00:00:00:004" 08:38:44:394 37 "00:00:00:003" 08:39:44:397 37 "00:00:00:003" 08:40:44:401 37 "00:00:00:003" 08:41:44:404 37 "00:00:00:004" 08:42:44:407 37 "00:00:00:004" 08:43:44:409 37.02 "00:00:00:003" 08:44:44:408 37 "00:00:00:003" 08:45:44:412 37 "00:00:00:004" 08:46:44:415 37 "00:00:00:003" 08:47:44:414 37 "00:00:00:003" 08:48:44:418 37 "00:00:00:003" 08:49:44:420 37 "00:00:00:003" 08:50:44:420 37 "00:00:00:003" 08:51:44:423 37 "00:00:00:003" 08:52:44:426 37 "00:00:00:004" 08:53:44:429 37 "00:00:00:003" 08:54:44:433 37 "00:00:00:004" 08:55:44:434 37.02 "00:00:00:003" 08:56:44:435 36.98 "00:00:00:003" 08:57:44:435 37.02 "00:00:00:003" 08:58:44:437 36.98 "00:00:00:004" 08:59:44:440 37.01 "00:00:00:004" 09:00:44:443 37 "00:00:00:004" 09:01:44:446 37 "00:00:00:004" 09:02:44:449 37 "00:00:00:004" 09:03:44:450 37 "00:00:00:002" When the code is checked in (from the patch) this bug will be marked fixed
          Hide
          varunrupela added a comment -

          You can use the "-fd" option with SIPp to control the frequency at which stats
          are dumped in the csv file. Default is once every 60 seconds. Since the issue
          was seen only for a minute after failure, with the default -fd value it will be
          hard to say whether the issue is fixed or not.

          Show
          varunrupela added a comment - You can use the "-fd" option with SIPp to control the frequency at which stats are dumped in the csv file. Default is once every 60 seconds. Since the issue was seen only for a minute after failure, with the default -fd value it will be hard to say whether the issue is fixed or not.
          Hide
          Mahesh Kannan added a comment -

          Fixed. One of the problems was that the thread was trying to do a unicast load
          from itself.

          Checking in ReplicationManagerBase.java;
          /cvs/glassfish/appserv-core-ee/http-session-persistence/src/java/com/sun/enterprise/ee/web/sessmgmt/ReplicationManagerBase.java,v
          <-- ReplicationManagerBase.java
          new revision: 1.31.2.88.2.1; previous revision: 1.31.2.88
          done
          Checking in ReplicationStore.java;
          /cvs/glassfish/appserv-core-ee/http-session-persistence/src/java/com/sun/enterprise/ee/web/sessmgmt/ReplicationStore.java,v
          <-- ReplicationStore.java
          new revision: 1.20.2.32.2.1; previous revision: 1.20.2.32
          done

          Show
          Mahesh Kannan added a comment - Fixed. One of the problems was that the thread was trying to do a unicast load from itself. Checking in ReplicationManagerBase.java; /cvs/glassfish/appserv-core-ee/http-session-persistence/src/java/com/sun/enterprise/ee/web/sessmgmt/ReplicationManagerBase.java,v <-- ReplicationManagerBase.java new revision: 1.31.2.88.2.1; previous revision: 1.31.2.88 done Checking in ReplicationStore.java; /cvs/glassfish/appserv-core-ee/http-session-persistence/src/java/com/sun/enterprise/ee/web/sessmgmt/ReplicationStore.java,v <-- ReplicationStore.java new revision: 1.20.2.32.2.1; previous revision: 1.20.2.32 done
          Hide
          varunrupela added a comment -

          Verified with Build 31c.
          Response Times was ~3 seconds for just one reading after instance failure (sipp
          option -fd was 3).

          Show
          varunrupela added a comment - Verified with Build 31c. Response Times was ~3 seconds for just one reading after instance failure (sipp option -fd was 3).

            People

            • Assignee:
              Mahesh Kannan
              Reporter:
              varunrupela
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: