glassfish / GLASSFISH-7529

[UB][JDK ISSUE] IO Exception: Invalid Argument during longevity test

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: V3
    • Fix Version/s: 3.1
    • Component/s: grizzly-kernel
    • Labels:
      None
    • Environment:

      Operating System: Solaris
      Platform: Sun

    • Issuezilla Id:
      7,529

      Description

      Glassfish V3 B42
      Nilebookstore application
      Configuration:
      33 Users -> v2.1 LB + WS7.0 -> V3 B42

      During an HTTP longevity test, the following exception was seen about 42
      hours into the run. The instance and the application remain accessible during
      the run.

      [#|2009-04-05T17:41:26.537-0700|SEVERE|glassfish|javax.enterprise.system.core|_ThreadID=15;_ThreadName=Thread-1;|doSelect exception
      java.io.IOException: Invalid argument
      at sun.nio.ch.DevPollArrayWrapper.registerMultiple(Native Method)
      at sun.nio.ch.DevPollArrayWrapper.updateRegistrations(DevPollArrayWrapper.java:220)
      at sun.nio.ch.DevPollArrayWrapper.poll(DevPollArrayWrapper.java:163)
      at sun.nio.ch.DevPollSelectorImpl.doSelect(DevPollSelectorImpl.java:68)
      at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
      at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
      at com.sun.grizzly.TCPSelectorHandler.select(TCPSelectorHandler.java:476)
      at com.sun.grizzly.Controller.doSelect(Controller.java:350)
      at com.sun.grizzly.SelectorHandlerRunner.run(SelectorHandlerRunner.java:81)
      at com.sun.grizzly.Controller.startSelectorHandlerRunner(Controller.java:1144)
      at com.sun.grizzly.Controller.start(Controller.java:951)
      at com.sun.grizzly.http.SelectorThread.startListener(SelectorThread.java:1161)
      at com.sun.grizzly.http.SelectorThread.run(SelectorThread.java:1012)
      at com.sun.grizzly.http.SelectorThread.startEndpoint(SelectorThread.java:1088)
      at com.sun.enterprise.v3.services.impl.GrizzlyServiceListener.start(GrizzlyServiceListener.java:84)
      at com.sun.enterprise.v3.services.impl.GrizzlyProxy$1.run(GrizzlyProxy.java:211)

      #]
      1. netstat-summary.log
        5 kB
        meenap
      2. pfiles-summary.log
        2 kB
        meenap

        Activity

        jluehe added a comment -

        ...

        jfarcand added a comment -

        See: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6824477

        Make sure you have properly configured ulimit (see the bug description).

        jfarcand added a comment -

        Fixing the title. The workaround is to properly configure ulimit. Meena, can you
        monitor the file descriptors as well, since we may have a leak here? Leaving the
        issue open.

        meenap added a comment -

        I increased ulimit -n from 256 (the default) to 65536 on the appserver machine.
        I am still seeing the exception, this time about 18 hours into the run. There
        are about 6 client errors. Monitoring netstat shows an initial increase, but
        after some time "netstat -an | grep 8080 | wc" stays stable, roughly in the
        range 90-120. CLOSE_WAIT is always 0 and no increase is seen in FIN_WAIT_2.

        Monitoring pfiles with "pfiles pid | grep 8080 | wc -l" stays stable, roughly
        in the 60s to 70s.
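
The descriptor monitoring described above can also be approximated from inside a JVM. The following is a minimal sketch, not part of the original report: the class name FdMonitor and its method are hypothetical, and it assumes a JVM that exposes com.sun.management.UnixOperatingSystemMXBean (Sun/Oracle/OpenJDK on Unix-like systems). It reports counts for the JVM it runs in, so to observe the server it would have to run inside the server process or be adapted to read the same MXBean over JMX.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class FdMonitor {

    /**
     * Returns {open, max} file descriptor counts for the current JVM,
     * or {-1, -1} when the platform MXBean does not expose them.
     */
    static long[] fdCounts() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        // Assumption: non-Unix or non-Sun-derived JVMs land here.
        return new long[] { -1L, -1L };
    }

    public static void main(String[] args) throws InterruptedException {
        // Hypothetical arguments: number of samples and interval in milliseconds.
        int samples = args.length > 0 ? Integer.parseInt(args[0]) : 3;
        long intervalMs = args.length > 1 ? Long.parseLong(args[1]) : 1000L;
        for (int i = 0; i < samples; i++) {
            long[] counts = fdCounts();
            System.out.println(System.currentTimeMillis()
                    + " open=" + counts[0] + " max=" + counts[1]);
            if (i < samples - 1) {
                Thread.sleep(intervalMs);
            }
        }
    }
}
```

A steadily growing "open" value over a long run would point to a descriptor leak, while a stable value (as with the pfiles counts above) would suggest the limit itself is not being exhausted.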

        meenap added a comment -

        Created an attachment (id=2492)
        Netstat Output File

        meenap added a comment -

        Created an attachment (id=2493)
        Pfile Command Output File

        jfarcand added a comment -

        Hmm, not good that you still see the exception. Can you try with JDK b50 (sorry
        for asking) and see if this is fixed? If yes, then some backport to JDK 6 will
        be required.

        Thanks!!!

        jfarcand added a comment -

        I mean JDK 7 b50.

        meenap added a comment -

        On the same test bed, I ran the test over HTTPS and saw no exceptions and no
        client errors for 3 days. I then re-ran the test once more over HTTP and again
        saw the exception. It occurred once during the 3-day test and caused 16 client
        errors at the same time.

        jfarcand added a comment -

        You need to set xnet_skip_checks to 1 in your /etc/system file. When
        xnet_skip_checks is set, the code does not check whether the socket is
        connected. Note that this affects the whole system. Reboot the system for the
        setting to take effect, or issue the following command for immediate effect.
        The command does not persist across reboots, so it is better to set the value
        in /etc/system.

        # echo "xnet_skip_checks/W 1" | mdb -kw
        xnet_skip_checks: 0 = 0x1

        We are waiting for a Solaris fix, then JDK, then Grizzly (ouf).

        jfarcand added a comment -

        Another one that needs to be documented. Thanks Paul!

        Paul Davies added a comment -

        To be added to the v3 Release Notes

        Paul Davies added a comment -

        Reassigned to grisdal

        Gail Risdal added a comment -

        Added to v3 Release Notes.

        Gail Risdal added a comment -

        Closed prematurely. Reopened to enable code fix.

        kumara added a comment -

        Waiting for JDK fix.

        oleksiys added a comment -

        Fixed in JDK.


          People

          • Assignee:
            oleksiys
            Reporter:
            meenap
          • Votes:
            0
            Watchers:
            4
