[GLASSFISH-7529] [UB][JDK ISSUE] IO Exception: Invalid Argument during longevity test Created: 06/Apr/09  Updated: 22/Nov/11  Resolved: 22/Nov/11

Status: Resolved
Project: glassfish
Component/s: grizzly-kernel
Affects Version/s: V3
Fix Version/s: 3.1

Type: Bug Priority: Minor
Reporter: meenap Assignee: oleksiys
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: Solaris
Platform: Sun


Attachments: netstat-summary.log, pfiles-summary.log
Issuezilla Id: 7,529

 Description   

Glassfish V3 B42
Nilebookstore application
Configuration:
33 Users -> v2.1 LB + WS7.0 -> V3 B42

During an HTTP longevity test, the following exception was seen about 42
hours into the run. The instance and application are still accessible during
the run.

[#|2009-04-05T17:41:26.537-0700|SEVERE|glassfish|javax.enterprise.system.core|_ThreadID=15;_ThreadName=Thread-1;|doSelect exception
java.io.IOException: Invalid argument
at sun.nio.ch.DevPollArrayWrapper.registerMultiple(Native Method)
at sun.nio.ch.DevPollArrayWrapper.updateRegistrations(DevPollArrayWrapper.java:220)
at sun.nio.ch.DevPollArrayWrapper.poll(DevPollArrayWrapper.java:163)
at sun.nio.ch.DevPollSelectorImpl.doSelect(DevPollSelectorImpl.java:68)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
at com.sun.grizzly.TCPSelectorHandler.select(TCPSelectorHandler.java:476)
at com.sun.grizzly.Controller.doSelect(Controller.java:350)
at com.sun.grizzly.SelectorHandlerRunner.run(SelectorHandlerRunner.java:81)
at com.sun.grizzly.Controller.startSelectorHandlerRunner(Controller.java:1144)
at com.sun.grizzly.Controller.start(Controller.java:951)
at com.sun.grizzly.http.SelectorThread.startListener(SelectorThread.java:1161)
at com.sun.grizzly.http.SelectorThread.run(SelectorThread.java:1012)
at com.sun.grizzly.http.SelectorThread.startEndpoint(SelectorThread.java:1088)
at com.sun.enterprise.v3.services.impl.GrizzlyServiceListener.start(GrizzlyServiceListener.java:84)
at com.sun.enterprise.v3.services.impl.GrizzlyProxy$1.run(GrizzlyProxy.java:211)

#]


 Comments   
Comment by jluehe [ 06/Apr/09 ]

...

Comment by jfarcand [ 06/Apr/09 ]

See: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6824477

Make sure you have properly configured ulimit (see the bug description).

Comment by jfarcand [ 06/Apr/09 ]

Fixing the title. The workaround is to properly configure ulimit. Meena, can you
monitor the file descriptors as well, since we may have a leak here? Leaving the
issue open.

Comment by meenap [ 09/Apr/09 ]

I increased ulimit -n from 256 (default) to 65536 on the appserver machine. I am
still seeing the exception, this time about 18 hours into the run. There are
about 6 client errors. Monitoring netstat shows an initial increase, but after
some time the count from "netstat -an | grep 8080 | wc" stays stable, roughly in
the range 90-120. CLOSE_WAIT is always 0 and no increase is seen in FIN_WAIT_2.

Monitoring pfiles, the count from "pfiles pid | grep 8080 | wc -l" stays stable,
roughly in the 60s to 70s.
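
As a complement to the netstat and pfiles checks above, the descriptor count can also be sampled from inside a JVM. The following is only a minimal sketch, not part of the test setup described here: the class name FdMonitor and its arguments are hypothetical, and it assumes a Sun/Oracle-derived JVM on a Unix platform where com.sun.management.UnixOperatingSystemMXBean is available. It reports the descriptors of the JVM it runs in, so to watch the application server it would have to run inside the server process or read the same MXBean over JMX.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper: samples the open and maximum file descriptor counts
// of the JVM it runs in (not of some other process).
public class FdMonitor {

    /**
     * Returns {open, max} descriptor counts, or null when the platform
     * MXBean does not expose them (for example on non-Unix JVMs).
     */
    public static long[] sample() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    // Usage: java FdMonitor [samples] [intervalMillis]
    public static void main(String[] args) throws InterruptedException {
        int samples = args.length > 0 ? Integer.parseInt(args[0]) : 3;
        long intervalMs = args.length > 1 ? Long.parseLong(args[1]) : 1000L;
        for (int i = 0; i < samples; i++) {
            long[] fd = sample();
            if (fd == null) {
                System.out.println("File descriptor counts are not available on this JVM");
                return;
            }
            System.out.println("open=" + fd[0] + " max=" + fd[1]);
            if (i < samples - 1) {
                Thread.sleep(intervalMs);
            }
        }
    }
}
```

A steadily growing open count over a long run would point to a descriptor leak, while a max value that still shows 256 would suggest the raised ulimit did not reach the server process.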

Comment by meenap [ 09/Apr/09 ]

Created an attachment (id=2492)
Netstat Output File

Comment by meenap [ 09/Apr/09 ]

Created an attachment (id=2493)
Pfile Command Output File

Comment by jfarcand [ 09/Apr/09 ]

Hmm, not good that you still see the exception. Can you try with JDK b50 (sorry
for asking) and see if this is fixed? If yes, then a backport to JDK 6 will be
required.

Thanks!!!

Comment by jfarcand [ 09/Apr/09 ]

I mean JDK 7 b50.

Comment by meenap [ 20/Apr/09 ]

On the same test bed, I ran the test over https and saw no exceptions and no
client errors for 3 days. Then I re-ran the test over http and again saw the
exception. It happened once during the 3-day test and caused 16 client errors
at the same time.

Comment by jfarcand [ 11/Jun/09 ]

You need to set xnet_skip_checks to 1 in your /etc/system file. When
xnet_skip_checks is set, the code does not check whether the socket is
connected. Note that this affects the whole system. Either reboot the system
for the /etc/system setting to take effect, or issue the following command for
immediate effect. A value set with the command is lost on reboot and has to be
reissued, so it is better to set it in /etc/system.

  # echo "xnet_skip_checks/W 1" | mdb -kw
  xnet_skip_checks: 0 = 0x1

We are waiting for a Solaris fix, then a JDK fix, then a Grizzly fix (ouf).

Comment by jfarcand [ 11/Sep/09 ]

Another one that needs to be documented. Thanks Paul!

Comment by Paul Davies [ 30/Sep/09 ]

To be added to the v3 Release Notes

Comment by Paul Davies [ 22/Oct/09 ]

Reassigned to grisdal

Comment by Gail Risdal [ 08/Dec/09 ]

Added to v3 Release Notes.

Comment by Gail Risdal [ 09/Dec/09 ]

Closed prematurely. Reopened to enable code fix.

Comment by kumara [ 10/Dec/09 ]

Waiting for JDK fix.

Comment by oleksiys [ 22/Nov/11 ]

fixed in JDK
