
SAILFIN-1843: JXTA socket IO errors on Expat list calculation

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0
    • Fix Version/s: milestone 1
    • Component/s: session_replication
    • Labels:
      None
    • Environment:

      Operating System: Linux
      Platform: All

      Description

        ******************************************************************************************************
        * Template v0.1 ( 05/01/08 )
        * Sailfin Stress test issue
        ******************************************************************************************************
        Sailfin Build : 20
        Cluster size : 10
        Happens in a single instance (y/n) ? : NA
        Test id : st2_4_presence_subscribe-refresh
        Location of the test : as-telco-sqe/stress-ws/presence
        JDK version : 64 bit - 1.6.0_14
        CLB used : Yes
        HW LB used : No.
        SSR: Enabled
        =================================================
        Started 9 SIPp clients with the following load options: -r 300 -l 222222

      1) The first time, an instance (instance 5) was killed after about 2 hours of
      the run.
      2) I waited about 1 hour 30 minutes and restarted the instance.
      4) After that the memory was still stable, but according to presence-stats.txt
      on machine5 (where instance 5 was restarted), the load was not restored
      correctly. The load numbers jumped from about 40 cps to 600 cps. For the other
      instances the load was still stable. I waited about 6 hours, but the load for
      instance 5 was still unstable.

      Please see instance 5's presence-stats.txt after the restart at:
      http://agni-1.sfbay.sun.com/net/asqe-logs/export1/SailFin/Results/sfbuild20/sbrf_r2/inst5/presence-stats.txt
      I've copied all logs to:
      http://agni-1.sfbay.sun.com/net/asqe-logs/export1/SailFin/Results/sfbuild20/sbrf_r2

        Activity

        easarina added a comment -

        Created an attachment (id=1043)
        messages from inst5 and inst4 server.log files from 17:22 to 17:41

        easarina added a comment -

        I've attached log.txt, which contains the messages from the server.log files
        of instance5 (restarted) and instance4 after instance5 was restarted and all
        GMS messages were completed. The time range covered is 17:22 to 17:41.
        According to the presence-stats.txt of instance5, traffic through instance5
        started at 17:37:21.

        The server.log files of the instances that were not restarted contain a lot
        of messages like:
        -----------------------------------------------------------------------
        INFO|sun-glassfish-comms-server2.0|com.sun.enterprise.ee.web.sessmgmt.pipe|
        _ThreadID=37;_ThreadName=ReplicationUnicastSenderExecutor - 12;|failed to send
        message over unicast pipe for sessionid=819386 cmd=valveSave to instance n1c1m5|#]
        -------------------------------------------------------------------

        The same messages were seen in the other instances' server.log files.

        Correspondingly, instance5's server.log contained messages such as:

        "Cant find matching transaction - Terminating"

        So I believe there is an issue here on the server side, not a SIPp issue.
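
        For illustration only, here is a minimal sketch, with hypothetical class and
        field names rather than the actual SailFin sources, of the pattern behind
        these log entries: a replication sender task serializes a command over a
        cached per-instance pipe, and once the target instance has been restarted the
        cached pipe points at a dead endpoint, so every send fails and is logged
        until the pipe is re-established.

        -------------------------------------------------------------------
        import java.io.IOException;
        import java.io.ObjectOutputStream;
        import java.util.logging.Logger;

        // Hypothetical sketch -- not the actual com.sun.enterprise.ee.web.sessmgmt classes.
        class UnicastSenderTask implements Runnable {

            private static final Logger LOG =
                    Logger.getLogger("com.sun.enterprise.ee.web.sessmgmt.pipe");

            private final String sessionId;         // e.g. "819386"
            private final String command;           // e.g. "valveSave"
            private final String targetInstance;    // e.g. "n1c1m5"
            private final ObjectOutputStream pipe;  // cached stream to the target instance

            UnicastSenderTask(String sessionId, String command,
                              String targetInstance, ObjectOutputStream pipe) {
                this.sessionId = sessionId;
                this.command = command;
                this.targetInstance = targetInstance;
                this.pipe = pipe;
            }

            @Override
            public void run() {
                try {
                    // Serialize the replication command over the cached unicast pipe.
                    pipe.writeObject(new Object[] { sessionId, command });
                    pipe.flush();
                } catch (IOException e) {
                    // A restarted peer leaves the cached pipe pointing at a dead endpoint,
                    // so every send attempt ends up here until the pipe is re-created.
                    LOG.info("failed to send message over unicast pipe for sessionid="
                            + sessionId + " cmd=" + command + " to instance " + targetInstance);
                }
            }
        }
        -------------------------------------------------------------------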

        Scott Oaks added a comment -

        There are two things going on here. First, the logs show some errors in
        calculating expat lists:

        IOException: stream is closed
            at net.jxta.impl.util.pipe.reliable.ReliableOutputStream.write(ReliableOutputStream.java:460)
            at java.io.ObjectOutputStream$BlockDataOutputStream.drain(ObjectOutputStream.java:1838)
            at java.io.ObjectOutputStream$BlockDataOutputStream.writeByte(ObjectOutputStream.java:1876)
            at java.io.ObjectOutputStream.writeFatalException(ObjectOutputStream.java:1537)
            at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:329)
            at com.sun.enterprise.ee.web.sessmgmt.JxtaSocketChannel.sendMessage(JxtaSocketChannel.java:163)
            at com.sun.enterprise.ee.web.sessmgmt.JxtaSocketChannel$MessageSenderTask.run(JxtaSocketChannel.java:211)

        That is indeed an issue; Bhavani is working on a fix.
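
        To make the failure mode concrete, here is a small self-contained sketch
        using plain java.io rather than the JXTA classes: ObjectOutputStream.writeObject
        fails as soon as the underlying stream has been closed underneath it. In the
        trace above the closed stream is JXTA's ReliableOutputStream, which reports
        "stream is closed" instead of the "Pipe closed" seen here.

        -------------------------------------------------------------------
        import java.io.IOException;
        import java.io.ObjectOutputStream;
        import java.io.PipedInputStream;
        import java.io.PipedOutputStream;

        // Stand-in for the JXTA-backed channel: the writer side of a pipe is closed
        // underneath an ObjectOutputStream, as happens when the remote socket goes away.
        public class ClosedStreamDemo {
            public static void main(String[] args) throws IOException {
                PipedOutputStream sink = new PipedOutputStream();
                new PipedInputStream(sink);   // connect the pipe so the stream header can be written
                ObjectOutputStream out = new ObjectOutputStream(sink);

                sink.close();                 // the underlying stream is closed (peer gone)
                try {
                    out.writeObject("expat list payload");  // drain() -> write() hits the closed stream
                } catch (IOException e) {
                    // e.g. java.io.IOException: Pipe closed; the writeFatalException frame in the
                    // JXTA trace is ObjectOutputStream hitting the closed stream again while
                    // trying to record the failure.
                    System.err.println("send failed: " + e);
                }
            }
        }
        -------------------------------------------------------------------
        One plausible remedy (not necessarily the actual fix referenced below) is to
        detect the closed channel and re-create or drop it before the next send,
        rather than serializing into the stale stream.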

        Then there is the out-of-balance issue. It is likely that this error caused
        the first pause in processing the SIPp traffic. However, a pause in
        processing traffic is not necessarily a bug by itself, and as I've explained,
        once SIPp hits a pause, it will remain unbalanced forever. So I'm changing
        the synopsis to make clear that the first error is a bug, but the unbalanced
        traffic is not.

        Scott Oaks added a comment -

        Issue 1836 has been marked as a duplicate of this issue.

        Bhavanishankar added a comment -

        Fixed. The fix details are at:
        https://glassfish.dev.java.net/servlets/ReadMsg?list=cvs&msgNo=30922
        https://sailfin.dev.java.net/servlets/ReadMsg?list=cvs&msgNo=7272

          People

          • Assignee: Bhavanishankar
          • Reporter: easarina
          • Votes: 0
          • Watchers: 0
