[SAILFIN-1798] CLB unable to find sip session under load Created: 04/Jun/09  Updated: 25/Nov/10  Resolved: 09/Jul/09

Status: Resolved
Project: sailfin
Component/s: sip_container
Affects Version/s: 2.0
Fix Version/s: milestone 1

Type: Bug Priority: Major
Reporter: Scott Oaks Assignee: kshitiz_saxena
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


Attachments: File PresenceServlet.sar, XML File test.xml
Issue Links:
Dependency
blocks SAILFIN-1790 b2bua, ssr enabled a lot of error mes... Resolved
Issuezilla Id: 1,798

 Description   

Under load, I get this exception:

[#|2009-06-04T12:51:13.669-0700|SEVERE|sun-glassfish-comms-server1.5|javax.enterprise.system.container.clb|_ThreadID=24;_ThreadName=SipContainer-serversWorkerThread-5060-9;_RequestID=67bc53cc-cbaf-4c4c-979d-9b0664d606b0;|WorkerThreadImpl unexpected exception:
java.lang.NullPointerException
    at org.jvnet.glassfish.comms.clb.core.common.chr.StickyHashKeyExtractor.encodeHashKeyToBeKey(StickyHashKeyExtractor.java:245)
    at org.jvnet.glassfish.comms.clb.core.sip.SipLoadBalancerBackend.handleOutgoingRequest(SipLoadBalancerBackend.java:175)
    at org.jvnet.glassfish.comms.clb.core.sip.SipLoadBalancerManagerBackEnd.dispatch(SipLoadBalancerManagerBackEnd.java:248)
    at com.ericsson.ssa.sip.transaction.TransactionManager.dispatch(TransactionManager.java:456)
    at com.ericsson.ssa.sip.persistence.ReplicationManager.dispatch(ReplicationManager.java:163)
    at com.ericsson.ssa.sip.dns.ResolverManager.dispatch(ResolverManager.java:219)
    at com.ericsson.ssa.sip.DialogManager.dispatch(DialogManager.java:808)
    at com.ericsson.ssa.sip.LocalRouteManager.dispatch(LocalRouteManager.java:148)
    at com.ericsson.ssa.container.sim.ApplicationDispatcher.dispatchViaStatelessProxy(ApplicationDispatcher.java:626)
    at com.ericsson.ssa.container.sim.ApplicationDispatcher.dispatch(ApplicationDispatcher.java:177)
    at com.ericsson.ssa.sip.FSM$1.call(FSM.java:141)
    at com.sun.grizzly.util.WorkerThreadImpl.processTask(WorkerThreadImpl.java:325)
    at com.sun.grizzly.util.WorkerThreadImpl.run(WorkerThreadImpl.java:184)

#]

This happens in a subscribe_refresh test. I can reproduce it in various
circumstances, but the most reliable way is to create a large number of
subscriptions (e.g. 10000) with a short expiration time (e.g. 180 seconds). The
stack trace above is from the servlet attempting to send a NOTIFY for the
timeout, but it has apparently lost track of the session (hence the hash key is
null).

Note that this is load dependent; I only see it when there is a fairly large
number of sessions, and not all of them get the error (e.g., in my most recent
test, 2300 of the 7040 sessions that I established got the error). It is likely
machine dependent as well, since it is easier to reproduce on slower machines
(I'm running on two old SPARC machines in a simple two-instance cluster without
SSR).



 Comments   
Comment by Scott Oaks [ 04/Jun/09 ]

Created an attachment (id=1025)
This is the standard perf team/QE presence servlet

Comment by Scott Oaks [ 04/Jun/09 ]

Created an attachment (id=1026)
Usual sub_ref scenario modified to remove loop so that timeouts will occur

Comment by ehsroha [ 10/Jun/09 ]

Reassign to Sankar

Comment by sankara [ 07/Jul/09 ]

Reassigning the issue to Kshitiz.

Comment by kshitiz_saxena [ 09/Jul/09 ]

Introduced a null check to avoid this NullPointerException.

However, this does not address the root cause: why the SipSession/SAS is null
for the request.
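
For illustration only, a null guard of the kind described above might look
roughly like the sketch below. This is not the code checked in as revision 1.19:
the class HashKeyGuardSketch, the SessionStub type, the encodeHashKey method and
the Base64 encoding are all hypothetical stand-ins, assumed here only so the
example is self-contained.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical sketch; none of these names come from the SailFin sources.
class HashKeyGuardSketch {

    /** Hypothetical stand-in for a session object that carries a hash key. */
    static final class SessionStub {
        private final String hashKey;

        SessionStub(String hashKey) {
            this.hashKey = hashKey;
        }

        String getHashKey() {
            return hashKey;
        }
    }

    /**
     * Encodes the session's hash key, or returns null when the session or its
     * key is missing, instead of throwing a NullPointerException.
     * The Base64 encoding is an assumption made for this sketch.
     */
    static String encodeHashKey(SessionStub session) {
        if (session == null || session.getHashKey() == null) {
            // Guard only: the caller still has to decide what to do without a key.
            return null;
        }
        byte[] raw = session.getHashKey().getBytes(StandardCharsets.UTF_8);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
    }

    public static void main(String[] args) {
        System.out.println(encodeHashKey(new SessionStub("abc"))); // prints YWJj
        System.out.println(encodeHashKey(null));                   // prints null
    }
}
```

A guard like this only stops the worker thread from failing with the exception;
the request that arrived without a session still has no sticky key, which is
why the root cause remains open.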

Checkin logs:

Checking in
src/main/java/org/jvnet/glassfish/comms/clb/core/common/chr/StickyHashKeyExtractor.java;
/cvs/sailfin/clb/src/main/java/org/jvnet/glassfish/comms/clb/core/common/chr/StickyHashKeyExtractor.java,v
<-- StickyHashKeyExtractor.java
new revision: 1.19; previous revision: 1.18
done
