glassfish
  1. glassfish
  2. GLASSFISH-15210

[PERF] Regression (Rel. v2.1) in IORFactories.makeIOR

    Details

    • Type: Bug Bug
    • Status: Open
    • Priority: Minor Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.1_b32
    • Fix Version/s: 4.0
    • Component/s: orb
    • Labels:
      None

      Description

      With the latest builds, the total regression in the orb copyObject is significantly reduced, but

      The current largest contributor to that is IORFactories.makeIOR(), which takes >2x longer in 3.1 than in 2.1 and accounts for 4.5% of total time in our testing. In both releases, EncapsulationFactoryBase.create() is the main component taking the time (and from the, code code appears to be somewhat recursive with a loop back to EncapsulationFactoryBase.create(), so the time attribution below that in current profiles can't be traced).

        Activity

        Hide
        Ken Cavanaugh added a comment -

        Most of the IOR code has been touched for years, certainly since well before v2.1.1. The only new bit is in
        the addition of the direct support for the ClusterInstanceInfo component, which is used for IIOP FOLB.
        The ORB has not changed much from 2.1.1 to 3.1 for FOLB support, but this is one change. It is possible that
        the difference is that this used to be read as a generic tagged component (that is, just a byte[] which is
        a GIOP encapsulation), but now it is fully unmarshaled. This may make a difference, because the
        tagged component may be getting completely unmarshaled all the time, instead of just when actually needed.

        I don't think I can go back to the old model at this point. Ideally I'd prefer to use a lazy unmarshaling model,
        where the data is held as a byte[] and only unmarshalled as needed.

        If this is the problem, I'd expect to see some activity in the profile in two classes:

        • com.sun.corba.ee.impl.folb.ClusterInstanceInfo
        • com.sun.corba.ee.impl.folb.SocketInfo

        The constructors for these two classes do several read_string and read_long calls to the InputStream,
        so it might be possible to see some activity in the profile showing this, even if the calls are at
        a relatively low frequency. ORB marshalling produces flat profiles, with a lot of calls that individually
        don't take much time to things like read_long and read_string.

        It is also possible that the underlying InputStream (CDRInputStream_1_0) has gotten slower, but I don't
        think that's the case here. Most of the code has seen little change, other than the ORB tracing facility,
        but the instrumented methods basically add just a couple of if (mm == null)

        { skip to method }

        kinds of calls, which should be << 1% of the execution team, even for small methods.

        If further analysis of the profile suggests that the ClusterInstanceInfo is the problem, I'll need to
        investigate making unmarshaling lazy, but I don't yet know how difficult that would be.

        Show
        Ken Cavanaugh added a comment - Most of the IOR code has been touched for years, certainly since well before v2.1.1. The only new bit is in the addition of the direct support for the ClusterInstanceInfo component, which is used for IIOP FOLB. The ORB has not changed much from 2.1.1 to 3.1 for FOLB support, but this is one change. It is possible that the difference is that this used to be read as a generic tagged component (that is, just a byte[] which is a GIOP encapsulation), but now it is fully unmarshaled. This may make a difference, because the tagged component may be getting completely unmarshaled all the time, instead of just when actually needed. I don't think I can go back to the old model at this point. Ideally I'd prefer to use a lazy unmarshaling model, where the data is held as a byte[] and only unmarshalled as needed. If this is the problem, I'd expect to see some activity in the profile in two classes: com.sun.corba.ee.impl.folb.ClusterInstanceInfo com.sun.corba.ee.impl.folb.SocketInfo The constructors for these two classes do several read_string and read_long calls to the InputStream, so it might be possible to see some activity in the profile showing this, even if the calls are at a relatively low frequency. ORB marshalling produces flat profiles, with a lot of calls that individually don't take much time to things like read_long and read_string. It is also possible that the underlying InputStream (CDRInputStream_1_0) has gotten slower, but I don't think that's the case here. Most of the code has seen little change, other than the ORB tracing facility, but the instrumented methods basically add just a couple of if (mm == null) { skip to method } kinds of calls, which should be << 1% of the execution team, even for small methods. If further analysis of the profile suggests that the ClusterInstanceInfo is the problem, I'll need to investigate making unmarshaling lazy, but I don't yet know how difficult that would be.
        Hide
        Ken Cavanaugh added a comment -

        If lazy unmarshaling would help, it can be done as follows:

        • Modify IIOPFactories.makeClusterInstanceInfoComponentFactory to pass the InputStream directly to the
          ClusterInstanceInfoComponentImpl constructor instead of unmarshaling it in the ClusterInstanceInfo
          constructor.
        • Add a constructor for ClusterInstanceInfoComponentImpl that takes an input stream. It stores the input
          stream in a private data member.
        • With appropriate synchronization, do the unmarshaling into the current clusterInstanceInfoValue field
          whenever necessary, which is basically when the field is null at the beginning of a call to
          writeContents, equals, hashCode, toString, or getClusterInstanceInfo.

        Let me know if you think this fix is worth trying.

        Show
        Ken Cavanaugh added a comment - If lazy unmarshaling would help, it can be done as follows: Modify IIOPFactories.makeClusterInstanceInfoComponentFactory to pass the InputStream directly to the ClusterInstanceInfoComponentImpl constructor instead of unmarshaling it in the ClusterInstanceInfo constructor. Add a constructor for ClusterInstanceInfoComponentImpl that takes an input stream. It stores the input stream in a private data member. With appropriate synchronization, do the unmarshaling into the current clusterInstanceInfoValue field whenever necessary, which is basically when the field is null at the beginning of a call to writeContents, equals, hashCode, toString, or getClusterInstanceInfo. Let me know if you think this fix is worth trying.
        Hide
        Ken Cavanaugh added a comment -

        After discussion with Scott on 12/20, it appears that the regression is fixed,
        so this fix is not needed in GF 3.1. Moving to 3.2/P4.

        Show
        Ken Cavanaugh added a comment - After discussion with Scott on 12/20, it appears that the regression is fixed, so this fix is not needed in GF 3.1. Moving to 3.2/P4.

          People

          • Assignee:
            Harshad Vilekar
            Reporter:
            Scott Oaks
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated: