glassfish
  1. glassfish
  2. GLASSFISH-18754

[PERF] JSP Cookie Handling performance has large regression

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 4.0_b37
    • Fix Version/s: 4.0_b52
    • Component/s: grizzly-kernel
    • Labels:
      None

      Description

      Continuing to track down the regressions in the web container benchmarks:

      The cookie processing in grizzly 2.x is very slow compared to 1.9.x: we are spending 3x the time in org.apache.catalina.connector.Request.parseSessionCookiesId(). This is by far the bulk of the time in CPU difference between GF 4.0 and 3.1.2 on the web container tests.

      The bulk of the time in parsing the cookies is:
      DataChunk.toString() (and really Charset.decode() – how did we convert this before; I don't see anything similar in the 3.1.2 profile)
      CookieParseUtils.parseClientCookies – that itself takes a fair amount of time, so maybe the parsing algorithm needs to be improved, plus lots of time to create all the data chunks when it initializes a LazyCookie() object. LazyCookie has a recycle() method, do we need to create them all the time?

      1. gf3.png
        103 kB
      2. gf4.png
        110 kB

        Issue Links

          Activity

          Hide
          Ryan Lubke added a comment -

          Integrated Grizzly 2.2.10 (r54548).

          Should be available in this week's nightly build. Please let us know of any changes in performance.

          Show
          Ryan Lubke added a comment - Integrated Grizzly 2.2.10 (r54548). Should be available in this week's nightly build. Please let us know of any changes in performance.
          Hide
          Scott Oaks added a comment -

          Testing on build 43 – the call to Request.parseSessionCookiesId() still takes 3x longer in GF 4.0 than in GF 3.1.2. The time attributed to parseClientCookies has significantly decreased, but over all we're spending 5.5% of time on a simple request just parsing cookies (of which there are only the ones added by the web container).

          However, I was looking at the wrong column in the profiler when I asserted that this is the bulk of the CPU time difference between 4.0 and 3.1.2; it is about 20% of the total regression.

          Show
          Scott Oaks added a comment - Testing on build 43 – the call to Request.parseSessionCookiesId() still takes 3x longer in GF 4.0 than in GF 3.1.2. The time attributed to parseClientCookies has significantly decreased, but over all we're spending 5.5% of time on a simple request just parsing cookies (of which there are only the ones added by the web container). However, I was looking at the wrong column in the profiler when I asserted that this is the bulk of the CPU time difference between 4.0 and 3.1.2; it is about 20% of the total regression.
          Hide
          Ryan Lubke added a comment -

          Ok, so some progress has been made. Will begin profiling the actual parsing to see what we can improve.

          Show
          Ryan Lubke added a comment - Ok, so some progress has been made. Will begin profiling the actual parsing to see what we can improve.
          Hide
          Ryan Lubke added a comment -

          Scott,

          Can you take some screenshots of the performance numbers for org.glassfish.grizzly.http.server.Request.getCookies() and org.glassfish.grizzly.http.util.CookieParserUtils and attach them to the issue?

          Show
          Ryan Lubke added a comment - Scott, Can you take some screenshots of the performance numbers for org.glassfish.grizzly.http.server.Request.getCookies() and org.glassfish.grizzly.http.util.CookieParserUtils and attach them to the issue?
          Hide
          Scott Oaks added a comment -

          There are two basic reasons for the remaining regression.

          The first is that the lazy initialization code is broken – the Cookie.initialize() method never resets the usingLazyCookieState value, so every getter call on
          the Cookie re-parses all the values.

          The second is the actual parsing. I am attaching two profiles of parseSessionCookiesId, wherein gfv3 we spend 1.8 seconds parsing 10K requests, and in gfv4 we spend 4.8 seconds to parse 10K requests (extrapolating from the actual figure, which is 2.4 secconds for 5K requests).

          Two things are going on here – the small overhead for working on a buffer instead of an array is having an effect on things like getTokenEnPosition, but the largest part is call to DataChunk.toString in v4 – as I mentioned, I don't see any corresponding encoding in the gfv3 case, so I'm not sure how it was done there. If we used the charset encoder, I suspect that part of the issue is the number of calls made – did we do the entire cookie string in one shot in GFv3 and in
          V4 we do each individual part of the cookie? It might be more efficient to work
          on a single array and convert it in one swoop.

          Show
          Scott Oaks added a comment - There are two basic reasons for the remaining regression. The first is that the lazy initialization code is broken – the Cookie.initialize() method never resets the usingLazyCookieState value, so every getter call on the Cookie re-parses all the values. The second is the actual parsing. I am attaching two profiles of parseSessionCookiesId, wherein gfv3 we spend 1.8 seconds parsing 10K requests, and in gfv4 we spend 4.8 seconds to parse 10K requests (extrapolating from the actual figure, which is 2.4 secconds for 5K requests). Two things are going on here – the small overhead for working on a buffer instead of an array is having an effect on things like getTokenEnPosition, but the largest part is call to DataChunk.toString in v4 – as I mentioned, I don't see any corresponding encoding in the gfv3 case, so I'm not sure how it was done there. If we used the charset encoder, I suspect that part of the issue is the number of calls made – did we do the entire cookie string in one shot in GFv3 and in V4 we do each individual part of the cookie? It might be more efficient to work on a single array and convert it in one swoop.
          Hide
          Ryan Lubke added a comment -

          Integrated Grizzly 2.2.13. Please give it a shot when an opportunity presents itself.

          Show
          Ryan Lubke added a comment - Integrated Grizzly 2.2.13. Please give it a shot when an opportunity presents itself.
          Hide
          Ryan Lubke added a comment -

          Issue confirmed as resolved by Scott some time back.

          Show
          Ryan Lubke added a comment - Issue confirmed as resolved by Scott some time back.

            People

            • Assignee:
              Ryan Lubke
              Reporter:
              Scott Oaks
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: