
Key: GLASSFISH-7529
Type: Bug
Status: Resolved
Resolution: Fixed
Priority: Minor
Assignee: oleksiys
Reporter: meenap
Votes: 0
Watchers: 4

[UB][JDK ISSUE] IO Exception: Invalid Argument during longevity test

Created: 06/Apr/09 11:57 AM   Updated: 22/Nov/11 08:32 PM   Resolved: 22/Nov/11 08:32 PM
Component/s: grizzly-kernel
Affects Version/s: V3
Fix Version/s: 3.1

Time Tracking:
Not Specified

File Attachments: 1. netstat-summary.log (5 kB) 09/Apr/09 12:09 PM - meenap
2. pfiles-summary.log (2 kB) 09/Apr/09 12:10 PM - meenap

Environment:

Operating System: Solaris
Platform: Sun


Issuezilla Id: 7529
Tags:
Participants: Gail Risdal, jfarcand, jluehe, kumara, meenap, oleksiys and Paul Davies


Description

Glassfish V3 B42
Nilebookstore application
Configuration:
33 Users -> v2.1 LB + WS7.0 -> V3 B42

During an HTTP longevity test, the following exception was seen about 42
hours into the run. The instance and the application are still accessible during
the run.

[#|2009-04-05T17:41:26.537-0700|SEVERE|glassfish|javax.enterprise.system.core|_ThreadID=15;_ThreadName=Thread-1;|doSelect exception
java.io.IOException: Invalid argument
at sun.nio.ch.DevPollArrayWrapper.registerMultiple(Native Method)
at sun.nio.ch.DevPollArrayWrapper.updateRegistrations(DevPollArrayWrapper.java:220)
at sun.nio.ch.DevPollArrayWrapper.poll(DevPollArrayWrapper.java:163)
at sun.nio.ch.DevPollSelectorImpl.doSelect(DevPollSelectorImpl.java:68)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
at com.sun.grizzly.TCPSelectorHandler.select(TCPSelectorHandler.java:476)
at com.sun.grizzly.Controller.doSelect(Controller.java:350)
at com.sun.grizzly.SelectorHandlerRunner.run(SelectorHandlerRunner.java:81)
at com.sun.grizzly.Controller.startSelectorHandlerRunner(Controller.java:1144)
at com.sun.grizzly.Controller.start(Controller.java:951)
at com.sun.grizzly.http.SelectorThread.startListener(SelectorThread.java:1161)
at com.sun.grizzly.http.SelectorThread.run(SelectorThread.java:1012)
at com.sun.grizzly.http.SelectorThread.startEndpoint(SelectorThread.java:1088)
at com.sun.enterprise.v3.services.impl.GrizzlyServiceListener.start(GrizzlyServiceListener.java:84)
at com.sun.enterprise.v3.services.impl.GrizzlyProxy$1.run(GrizzlyProxy.java:211)

#]


jluehe added a comment - 06/Apr/09 12:08 PM

...


jfarcand added a comment - 06/Apr/09 02:17 PM

See: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6824477

Make sure you have properly configured ulimit (see the bug description).


jfarcand added a comment - 06/Apr/09 02:19 PM

Fixing the title. The workaround is to properly configure ulimit. Meena, can you
also monitor the file descriptors, as we may have a leak here? Leaving the
issue open.


meenap added a comment - 09/Apr/09 12:03 PM

I increased ulimit -n from 256 (the default) to 65536 on the appserver machine. I
am still seeing the exception, this time about 18 hours into the run.
There are about 6 client errors. Monitoring netstat shows an initial increase, but
after some time "netstat -an | grep 8080 | wc" stays stable, roughly in the range
90-120. CLOSE_WAIT is always 0 and no increase is seen in FIN_WAIT_2.

Monitoring pfiles, "pfiles pid | grep 8080 | wc -l" stays stable, roughly in the
60s to 70s.
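
For anyone wanting to track descriptor usage from inside the JVM rather than with pfiles, a minimal sketch follows. It is not part of the original test setup: the class name FdMonitor and its helper methods are hypothetical, and it assumes a Sun/Oracle-derived JVM on a Unix platform where the com.sun.management.UnixOperatingSystemMXBean interface is available. It reports the counts for the JVM it runs in, so it would have to be run inside the server process (or the same bean read over JMX) to say anything about the appserver.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper: samples this JVM's open and maximum file descriptor counts.
public class FdMonitor {

    // Fraction of the descriptor limit in use; 0.0 when the limit is unknown.
    static double usageRatio(long open, long max) {
        return max <= 0 ? 0.0 : (double) open / max;
    }

    // Returns {open, max}, or null when the JVM does not expose the counts.
    static long[] sample() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] counts = sample();
        if (counts == null) {
            System.out.println("File descriptor counts are not available on this JVM");
            return;
        }
        System.out.println("open=" + counts[0] + " max=" + counts[1]
                + " ratio=" + usageRatio(counts[0], counts[1]));
    }
}
```

Logging a sample at a fixed interval over the run would show whether the open count climbs steadily (suggesting a leak) or stays flat as the pfiles numbers above do.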


meenap added a comment - 09/Apr/09 12:09 PM

Created an attachment (id=2492)
Netstat Output File


meenap added a comment - 09/Apr/09 12:10 PM

Created an attachment (id=2493)
Pfile Command Output File


jfarcand added a comment - 09/Apr/09 01:25 PM

Hmm, not good that you still see the exception. Can you try with JDK b50 (sorry
for asking) and see whether this is fixed? If it is, then a backport to JDK 6
will be required.

Thanks!!!


jfarcand added a comment - 09/Apr/09 01:25 PM

I mean JDK 7 b50.


meenap added a comment - 20/Apr/09 05:07 PM

On the same test bed, I ran an https test and saw no exceptions and no client
errors for 3 days. I then re-ran the test with http and saw the exception
again: it happened once during the 3-day test and caused 16 client errors at
the same time.


jfarcand added a comment - 11/Jun/09 06:42 AM

You need to set xnet_skip_checks to 1 in your /etc/system file. When
xnet_skip_checks is set, the code does not check whether the socket is connected.
Note that this affects the whole system. Reboot the system for the setting to take
effect, or issue the following command on your system for immediate
effect. The command does not survive a reboot and would have to be issued again,
so it is better to set the value in /etc/system.

  # echo "xnet_skip_checks/W 1" | mdb -kw
  xnet_skip_checks: 0 = 0x1

We are waiting for a Solaris fix, then a JDK fix, then a Grizzly fix (ouf).


jfarcand added a comment - 11/Sep/09 11:29 AM

Another one that needs to be documented. Thanks Paul!


Paul Davies added a comment - 30/Sep/09 05:39 PM

To be added to the v3 Release Notes


Paul Davies added a comment - 22/Oct/09 06:24 PM

Reassigned to grisdal


Gail Risdal added a comment - 08/Dec/09 06:45 PM

Added to v3 Release Notes.


Gail Risdal added a comment - 09/Dec/09 03:14 PM

Closed prematurely. Reopened to enable code fix.


kumara added a comment - 10/Dec/09 02:08 AM

Waiting for JDK fix.


oleksiys added a comment - 22/Nov/11 08:32 PM

Fixed in JDK.