[GLASSFISH-8566] session failover not working when instance comes back Created: 20/Jun/09  Updated: 06/Mar/12

Status: Open
Project: glassfish
Component/s: failover
Affects Version/s: 9.0pe
Fix Version/s: not determined

Type: Bug Priority: Trivial
Reporter: jccosta Assignee: lwhite
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: Linux
Platform: Linux


Attachments: GZip Archive server-domain.tar.gz     GZip Archive server-node1.tar.gz     GZip Archive server-node2.tar.gz    
Issuezilla Id: 8,566

 Description   

Hello,

I'm experiencing the same problem that was reported in issue #5106.

I'm using GlassFish v2.1 on CentOS 5.3 x86 on VMware 3.5 ESXi with a
hardware load balancer. Simple roundrobin without persistence. (I need it
without persistence because most of the clients are behind a NATed router.)

Everything works with the clusterjsp app (and I can see the messages if I
configure the logger with FINE).

When I shut down one instance and then start it up again, the restarted
instance always creates a new session.

I can see that the restarted instance requests (broadcasts) the session id, but
it doesn't receive a response and creates a new session.

I can confirm that it broadcasts the request using tcpdump.



 Comments   
Comment by lwhite [ 20/Jun/09 ]

Several questions:

a) What version of GlassFish are you running? (please take a look at
these blogs - they may help with your issue).
b) how many instances are in your cluster? If the answer is two, then
again these blogs may be of help.
c) what do you mean by "Simple roundrobin without persistence"? If you mean
that there is no server affinity (also known as "stickiness"), this will not
be a good design for you. Essentially, every request will result in a session
load over the network, and that latency will hurt the performance of your system.
d) have you confirmed that your instances are part of the same subnet and that
multicast is working? This is a requirement for memory replication to work
correctly.
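
As a rough aid for question d), here is a minimal Python sketch of the addressing side of that check. It only verifies that two instance addresses fall in the same subnet and that a configured group address is in the multicast range; it does not prove that multicast packets are actually delivered between the hosts. The second instance address, the /24 prefix length and the group address 228.8.7.9 are hypothetical values chosen for illustration, not taken from the reporter's configuration.

```python
import ipaddress


def same_subnet(addr_a, addr_b, prefix_len):
    """Return True if addr_b lies in the network that contains addr_a.

    prefix_len is the subnet prefix length in bits (for example 24).
    """
    # strict=False lets us pass a host address instead of a network address.
    network = ipaddress.ip_network(f"{addr_a}/{prefix_len}", strict=False)
    return ipaddress.ip_address(addr_b) in network


def is_multicast_group(addr):
    """Return True if addr is in the multicast address range."""
    return ipaddress.ip_address(addr).is_multicast


if __name__ == "__main__":
    # Hypothetical instance addresses and prefix length.
    print(same_subnet("10.10.1.151", "10.10.1.152", 24))
    # Hypothetical multicast group address.
    print(is_multicast_group("228.8.7.9"))
```

If both checks pass and replication still fails, the next step would be to confirm delivery on the wire, for example with tcpdump on the receiving instance as the reporter did on the sending side.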

This one concerns issues encountered by users of 2 node clusters:
http://blogs.sun.com/memrep/entry/2_node_cluster_memory_replication

and this one is about issues for applications that involve multi-threaded access
to a single http session:
http://blogs.sun.com/memrep/entry/memory_replication_multi_threaded_concurrent

Comment by jccosta [ 20/Jun/09 ]

a) v2.1

b) Tomorrow I'll install v2.1.1-b19 and test.

c) Well, the problem is that the clients are NATed behind a router and show up
as only one IP address, so if I can disable stickiness, it would be good. I'm
really not worried about network latency. I'll just VLAN+Team the
interfaces.

d) Yeah, same subnet and tested multicast.

Thanks!

Comment by jccosta [ 21/Jun/09 ]

Created an attachment (id=2906)
server node1

Comment by jccosta [ 21/Jun/09 ]

Created an attachment (id=2907)
server node2

Comment by jccosta [ 21/Jun/09 ]

Created an attachment (id=2908)
server domain

Comment by jccosta [ 21/Jun/09 ]

I've installed v2.1.1-b19 and I see the same problem. I've attached the server
logs and the node agent logs.

Comment by lwhite [ 22/Jun/09 ]

There are many issues not involving replication in these log files.
Please resolve them first. A non-exhaustive search turned up the ones listed below.
Question: I see references to faces - is this a version of faces that has
AJAX-like connection patterns (e.g. IceFaces)? If so, please re-study one of the
blogs I sent before about the need for relaxVersionSemantics for those cases.

issues with Grizzly socket connections including port already in use problems:

[#|2009-06-21T20:52:04.202+0100|FINE|sun-appserver2.1|javax.enterprise.system.container.web|_ThreadID=27;_ThreadName=httpSSLWorkerThread-38080-0;ClassName=com.sun.enterprise.web.connector.grizzly.DefaultReadTask;MethodName=manageKeepAlive;_RequestID=32747810-8f0a-4786-b41a-33de83114764;|SocketChannel Read Exception:
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
at sun.nio.ch.IOUtil.read(IOUtil.java:206)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:236)
at com.sun.enterprise.web.connector.grizzly.DefaultReadTask.doTask(DefaultReadTask.java:241)
at com.sun.enterprise.web.connector.grizzly.DefaultReadTask.doTask(DefaultReadTask.java:214)
at com.sun.enterprise.web.connector.grizzly.TaskBase.run(TaskBase.java:265)
at com.sun.enterprise.web.connector.grizzly.ssl.SSLWorkerThread.run(SSLWorkerThread.java:106)

#]

and
[#|2009-06-21T20:25:55.560+0100|SEVERE|sun-appserver2.1|javax.enterprise.system.container.web|_ThreadID=10;_ThreadName=main;_RequestID=4faca3fc-a11f-4aa2-8c08-131836293e96;|WEB0701: Error initializing endpoint
java.net.BindException: Address already in use: 38181
at com.sun.enterprise.web.connector.grizzly.SelectorThread.initEndpoint(SelectorThread.java:763)
at com.sun.enterprise.web.connector.grizzly.GrizzlyHttpProtocol.init(GrizzlyHttpProtocol.java:226)
at org.apache.coyote.tomcat5.CoyoteConnector.initialize(CoyoteConnector.java:1627)
at com.sun.enterprise.web.connector.coyote.PECoyoteConnector.initialize(PECoyoteConnector.java:791)
at org.apache.catalina.startup.Embedded.start(Embedded.java:950)
at com.sun.enterprise.web.WebContainer.start(WebContainer.java:864)
at com.sun.enterprise.web.PEWebContainer.startInstance(PEWebContainer.java:793)
at com.sun.enterprise.web.PEWebContainerLifecycle.onStartup(PEWebContainerLifecycle.java:89)
at com.sun.enterprise.server.ApplicationServer.onStartup(ApplicationServer.java:446)
at com.sun.enterprise.server.ondemand.OnDemandServer.onStartup(OnDemandServer.java:134)
at com.sun.enterprise.server.PEMain.run(PEMain.java:409)
at com.sun.enterprise.server.PEMain.main(PEMain.java:336)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at com.sun.enterprise.server.PELaunch.main(PELaunch.java:415)

#]

invalid xml document issues like:
[#|2009-06-21T21:04:47.515+0100|SEVERE|sun-appserver2.1|org.apache.commons.digester.Digester|_ThreadID=30;_ThreadName=RMI TCP Connection(53)-10.10.1.151;_RequestID=63fa2baf-22aa-485d-b71a-4809b0e0a476;|Parse Error at line 7 column 118: Document root element "faces-config", must match DOCTYPE root "null".
org.xml.sax.SAXParseException: Document root element "faces-config", must match DOCTYPE root "null".
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:195)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.error(ErrorHandlerWrapper.java:131)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:384)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:318)
at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.rootElementSpecified(XMLDTDValidator.java:1621)
at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.handleStartElement(XMLDTDValidator.java:1900)
at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.startElement(XMLDTDValidator.java:764)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanStartElement(XMLDocumentFragmentScannerImpl.java:1359)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$ContentDriver.scanRootElementHook(XMLDocumentScannerImpl.java:1317)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:3095)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:922)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:807)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:107)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
at org.apache.commons.digester.Digester.parse(Digester.java:1745)
at org.apache.shale.tiger.config.Fac
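
To make the unrelated problems in the logs easier to triage, the following is a minimal Python sketch that splits a server log into records and counts the SEVERE and WARNING entries per logger. The record layout ("[#|timestamp|level|product|logger|key=value pairs|message ... #]") is an assumption inferred from the excerpts above, not a reference for the GlassFish log format; the function names are made up for this example. The sample record is a shortened copy of the BindException entry quoted earlier.

```python
import re
from collections import Counter

# Assumed record layout, inferred from the excerpts in this issue:
# [#|timestamp|level|product|logger|key=value;...|message ... #]
# The closing marker is matched with or without a leading pipe.
RECORD_RE = re.compile(r"\[#\|(.*?)\|?#\]", re.DOTALL)


def parse_records(text):
    """Return one dict per complete log record found in text."""
    records = []
    for match in RECORD_RE.finditer(text):
        fields = match.group(1).split("|", 5)
        if len(fields) < 6:
            continue  # incomplete record: skip it rather than guess
        timestamp, level, product, logger, _pairs, message = fields
        records.append({
            "timestamp": timestamp,
            "level": level,
            "product": product,
            "logger": logger,
            "message": message.strip(),
        })
    return records


def count_by_level_and_logger(records, levels=("SEVERE", "WARNING")):
    """Count records per (level, logger) for the given levels."""
    return Counter(
        (r["level"], r["logger"]) for r in records if r["level"] in levels
    )


# Shortened copy of the BindException record quoted in this issue.
SAMPLE = (
    "[#|2009-06-21T20:25:55.560+0100|SEVERE|sun-appserver2.1|"
    "javax.enterprise.system.container.web|"
    "_ThreadID=10;_ThreadName=main;"
    "_RequestID=4faca3fc-a11f-4aa2-8c08-131836293e96;|"
    "WEB0701: Error initializing endpoint\n"
    "java.net.BindException: Address already in use: 38181\n"
    "\n#]"
)

if __name__ == "__main__":
    counts = count_by_level_and_logger(parse_records(SAMPLE))
    for (level, logger), n in counts.items():
        print(level, logger, n)
```

Run against a full server.log (read the file into a string and pass it to parse_records), this gives a quick list of which loggers report errors, so the non-replication problems can be cleared before looking at the replication messages.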

Comment by Tom Mueller [ 06/Mar/12 ]

Bulk update to change fix version to "not determined" for all issues still open but with a fix version for a released version.

Generated at Sun Aug 30 07:55:02 UTC 2015 using JIRA 6.2.3#6260-sha1:63ef1d6dac3f4f4d7db4c1effd405ba38ccdc558.