[GLASSFISH-15558] Caching JMS session in a session bean causes errors when invoked by a MDB when under load
Created: 13/Jan/11  Updated: 21/Sep/15

Status: Open
Project: glassfish
Component/s: jms
Affects Version/s: 3.1_b37
Fix Version/s: 4.1.1

Type: Bug
Priority: Major
Reporter: Nigel Deakin
Assignee: Nigel Deakin
Resolution: Unresolved
Votes: 3
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: ServerLog.odt (File), TransactionTests.zip (Zip Archive)
Tags: 3_1-exclude, 3_1-release-note-added, 3_1-release-notes, 3_1_1-scrubbed, 3_1_2-exclude


This issue was originally reported by a user on the GlassFish developer list. Here is the thread:

The issue can be summarised as follows: an MDB consumes a message from an inbound queue, updates a database, then invokes a session bean which sends a message to an outbound queue. If 40 messages are placed on the inbound queue then we see a variety of error messages in the server log (see that thread), including this one:

625+0000|SEVERE|glassfish3.1|javax.resourceadapter.mqjmsra.outbound.connection|_ThreadID=36;_ThreadName=Thread-1;|commitTransaction (XA) on JMSService:jmsdirect failed for connectionId:950872495901869318 and onePhase:false due to Unknown JMSService server error ERROR: com.sun.messaging.jmq.jmsserver.util.BrokerException: Bad transaction state transition. Cannot perform operation COMMIT_TRANSACTION(46) (XAFlag=0x0:TMNOFLAGS) on a transaction in state COMPLETE(4).|#]

The error occurs ONLY if the session bean caches its JMS connection, session and producer in a field of the bean. This is valid, though it is contrary to the conventional practice which is to create the connection, use it, and close the connection every time the session bean is invoked. If these objects are not cached then this bug is NOT seen. This bug therefore has a workaround.
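To illustrate the two styles contrasted above, here is a minimal sketch. So that it compiles with a plain JDK, `ToyConnectionFactory` and `ToyConnection` are hypothetical stand-ins; in a real bean they would be the javax.jms connection factory, connection, session and producer. The bean class names are likewise invented for illustration and are not taken from the attached test application.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a JMS connection factory; counts open connections. */
final class ToyConnectionFactory {
    int open;                                   // connections created but not yet closed
    final List<String> sent = new ArrayList<>();

    ToyConnection createConnection() {
        open++;
        return new ToyConnection(this);
    }
}

/** Hypothetical stand-in for a JMS connection together with its session and producer. */
final class ToyConnection {
    private final ToyConnectionFactory factory;
    private boolean closed;

    ToyConnection(ToyConnectionFactory factory) { this.factory = factory; }

    void send(String text) {
        if (closed) throw new IllegalStateException("connection is closed");
        factory.sent.add(text);
    }

    void close() {
        if (!closed) {
            closed = true;
            factory.open--;
        }
    }
}

/** The style that triggers this bug: the connection lives in a field across invocations. */
final class CachingSenderBean {
    private final ToyConnectionFactory factory;
    private ToyConnection cached;               // reused by whoever is handed this bean instance next

    CachingSenderBean(ToyConnectionFactory factory) { this.factory = factory; }

    void send(String text) {
        if (cached == null) cached = factory.createConnection();
        cached.send(text);
    }
}

/** The workaround: create, use and close within each business method invocation. */
final class PerCallSenderBean {
    private final ToyConnectionFactory factory;

    PerCallSenderBean(ToyConnectionFactory factory) { this.factory = factory; }

    void send(String text) {
        ToyConnection connection = factory.createConnection();
        try {
            connection.send(text);
        } finally {
            connection.close();                 // nothing survives the method call
        }
    }
}
```

In the per-call style no JMS object outlives the business method, so a later caller that is handed the same bean instance cannot pick up a session that still belongs to an uncommitted transaction.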

The issue can be reproduced using JMS only (i.e. not using a database), though to see exactly the same errors as the user reported it is necessary to force the use of two-phase commits.

A simple NetBeans application is attached which demonstrates the issue. This consists of an Enterprise Application "TransactionTests" which is composed of an EJB application "TransactionTests-ejb" and a web application "TransactionTests-war".

Steps to reproduce:

1. Install the latest version of GlassFish 3.1 (I used build 37)
2. Before starting GlassFish, edit domain.xml to set the JVM option -Dimq.jmsra.isSameRMAllowed=false. This is needed to force two-phase transactions to be used. (If this is not done, the application will still fail but you will get different errors.)
3. Use NetBeans to build the application (which is an ear containing an EJB and a web app) and deploy it in GlassFish.
4. Visit http://localhost:8080/TransactionTests-war/ and click on "Run MDB Test 1". This causes a servlet to send 40 messages to the inbound queue.
5. Inspect the server log for errors

Comment by Nigel Deakin [ 13/Jan/11 ]

The attached file ServerLog.odt is an extract from the server log, which includes logging in DirectXAResource and in the application.

Note particularly Thread 36 (highlighted in green) and Thread 51 (highlighted in red). This suggests that the session bean instance used by thread 36 was reused by thread 51 after the business method returned but before the MDB returned and the transaction was committed. This meant that the same JMS session object was being used by two threads at the same time, which caused the error.

(Full disclosure: to create this logging a modified version of MQ was used, with all uses of System.out in DirectXAResource changed to use JDK logging. This was necessary to ensure that such log messages were reported using the correct thread.)
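The interleaving inferred from the log can be sketched as a toy model. Everything here is hypothetical: the class names, the string transaction ids, and in particular the assumption that the cached resource remembers its transaction in an instance field. It is not the JMSRA or broker code, and the two "threads" are run as a fixed sequence so the outcome is deterministic.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Toy broker: tracks one state per transaction id and rejects a second commit. */
final class ToyBroker {
    private final Map<String, String> state = new HashMap<>();

    void start(String tx) { state.put(tx, "STARTED"); }

    void commit(String tx) {
        String current = state.get(tx);
        if (!"STARTED".equals(current)) {
            throw new IllegalStateException(
                    "Cannot COMMIT transaction " + tx + " in state " + current);
        }
        state.put(tx, "COMPLETE");
    }
}

/** Toy resource that remembers "its" transaction in the instance (an assumption). */
final class StatefulToyResource {
    private final ToyBroker broker;
    private String currentTx;                  // overwritten if the instance is reused too early

    StatefulToyResource(ToyBroker broker) { this.broker = broker; }

    void start(String tx) { currentTx = tx; broker.start(tx); }

    void commit() { broker.commit(currentTx); }
}

/** Toy session bean that caches its resource in a field. */
final class CachingToyBean {
    final StatefulToyResource cached;

    CachingToyBean(StatefulToyResource cached) { this.cached = cached; }
}

final class InterleavingDemo {
    static String run() {
        ToyBroker broker = new ToyBroker();
        Deque<CachingToyBean> pool = new ArrayDeque<>();
        pool.push(new CachingToyBean(new StatefulToyResource(broker)));

        // Caller A: the business method runs in tx-A, then the bean is
        // returned to the pool before tx-A has been committed.
        CachingToyBean beanA = pool.pop();
        StatefulToyResource resourceA = beanA.cached;
        resourceA.start("tx-A");
        pool.push(beanA);

        // Caller B: is handed the same bean instance and starts tx-B on it.
        CachingToyBean beanB = pool.pop();
        StatefulToyResource resourceB = beanB.cached;   // same object as resourceA
        resourceB.start("tx-B");
        pool.push(beanB);

        resourceB.commit();                    // caller B's MDB returns: tx-B commits
        try {
            resourceA.commit();                // meant for tx-A, but the resource now points at tx-B
            return "no error";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this model the second commit is rejected because the shared resource was re-pointed at the other transaction, which has the same shape as the "Bad transaction state transition" error in the log. The real failure may differ in detail.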

Comment by Nigel Deakin [ 14/Jan/11 ]

This behaviour is also seen in GlassFish 2.1.1, so this is not a regression. There's also a workaround (and the workaround is generally considered better practice than the problem case). This bug therefore doesn't need to be fixed now, so I am setting the 3_1-exclude tag.

Comment by Nigel Deakin [ 14/Jan/11 ]

Have created documentation bug
to record this in the release note for 3.1.

Comment by Paul Davies [ 14/Jan/11 ]

For the GlassFish 3.1 release notes add the following information:

A stateless session bean should not save JMS connections or sessions in fields of the bean. Applications that do so may encounter errors.

To avoid this issue, if a stateless session bean's business method requires the use of a JMS connection and session then the business method should create the JMS connection and session, use it to send or receive messages, and then close the connection and session before returning. This is GlassFish issue 15558.

Comment by theodor.richard [ 14/Jan/11 ]

I'm the user who initially reported this issue on the mailing list. A problem with not caching the connection is that the maximum number of connections is reached quickly. I'm seeing the following exceptions in the log when sending 50 messages in a for loop, i.e. the method that acquires and releases the JMS connection is invoked 50 times in a row:

com.sun.messaging.jms.JMSException: MQRA:DCF:allocation failure:createConnection:Error in allocating a connection. Cause: In-use connections equal max-pool-size and expired max-wait-time. Cannot allocate more connections.

My max connection pool size has the default size of 32.

Comment by Nigel Deakin [ 17/Jan/11 ]

@theodor.richard: If you believe that managed connections are not being returned correctly to the pool (and this isn't because your pool simply isn't big enough), then please log this as a separate issue or raise it on the user list. Please keep this issue for discussions of the effect of caching the connection, session and producer.

Comment by Nigel Deakin [ 25/Jan/11 ]

Analysis of the test case shows that the cause of the problem is that the container is reusing the session bean instance (and hence the connection's XAResource instance) after the business method has returned but before the transaction has been committed.

It is legal for the container to reuse the stateless session bean instance before the transaction has been committed: the EJB spec, section 4.7 "Stateless Session Beans" states that "the container may interleave requests from multiple transactions to the same instance".

However doing so causes errors in the JMSRA resource adapter, because it is designed on the basis that the same XAResource instance is used for start, end, prepare and commit and that the instance will not be reused until the transaction is committed or rolled back.

That is a breach of the JCA 1.5 spec, which states in section "Implementation" that "A transaction manager can use any XAResource instance (if it refers to the proper resource manager instance) to initiate transaction completion. The XAResource instance used during the transaction completion process need not be the one initially enlisted with the transaction manager for this transaction"
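One reading of the spec text quoted above is that per-transaction state should be owned by the resource manager and looked up by Xid, rather than held in the XAResource instance. The following is a minimal sketch under that assumption. It uses the real javax.transaction.xa interfaces from the JDK, but every `Toy*` class is hypothetical, and it does not describe how JMSRA works or how this bug would be fixed.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

enum ToyTxState { STARTED, ENDED, PREPARED, COMMITTED }

/** Hypothetical resource manager: owns all transaction state, keyed by Xid contents. */
final class ToyResourceManager {
    private final Map<String, ToyTxState> states = new ConcurrentHashMap<>();

    private static String key(Xid xid) {
        return xid.getFormatId() + "/" + Arrays.toString(xid.getGlobalTransactionId())
                + "/" + Arrays.toString(xid.getBranchQualifier());
    }

    void begin(Xid xid) throws XAException {
        if (states.putIfAbsent(key(xid), ToyTxState.STARTED) != null) {
            throw new XAException(XAException.XAER_DUPID);
        }
    }

    /** Atomically moves xid from one state to the next, or rejects the call. */
    void move(Xid xid, ToyTxState from, ToyTxState to) throws XAException {
        if (!states.replace(key(xid), from, to)) {
            throw new XAException(XAException.XAER_PROTO);
        }
    }

    void discard(Xid xid) { states.remove(key(xid)); }

    ToyTxState stateOf(Xid xid) { return states.get(key(xid)); }
}

/** XAResource with no per-transaction fields, so any instance can finish any transaction. */
final class StatelessToyXAResource implements XAResource {
    private final ToyResourceManager rm;

    StatelessToyXAResource(ToyResourceManager rm) { this.rm = rm; }

    @Override public void start(Xid xid, int flags) throws XAException { rm.begin(xid); }
    @Override public void end(Xid xid, int flags) throws XAException {
        rm.move(xid, ToyTxState.STARTED, ToyTxState.ENDED);
    }
    @Override public int prepare(Xid xid) throws XAException {
        rm.move(xid, ToyTxState.ENDED, ToyTxState.PREPARED);
        return XA_OK;
    }
    @Override public void commit(Xid xid, boolean onePhase) throws XAException {
        rm.move(xid, onePhase ? ToyTxState.ENDED : ToyTxState.PREPARED, ToyTxState.COMMITTED);
    }
    @Override public void rollback(Xid xid) { rm.discard(xid); }
    @Override public void forget(Xid xid) { rm.discard(xid); }
    @Override public Xid[] recover(int flag) { return new Xid[0]; }
    @Override public boolean isSameRM(XAResource other) {
        return other instanceof StatelessToyXAResource
                && ((StatelessToyXAResource) other).rm == rm;
    }
    @Override public int getTransactionTimeout() { return 0; }
    @Override public boolean setTransactionTimeout(int seconds) { return false; }
}

/** Minimal Xid for the demo; only the global transaction id varies. */
final class ToyXid implements Xid {
    private final byte[] gtrid;

    ToyXid(String id) { this.gtrid = id.getBytes(StandardCharsets.UTF_8); }

    @Override public int getFormatId() { return 1; }
    @Override public byte[] getGlobalTransactionId() { return gtrid.clone(); }
    @Override public byte[] getBranchQualifier() { return new byte[0]; }
}

final class AnyInstanceCompletionDemo {
    static ToyTxState run() {
        ToyResourceManager rm = new ToyResourceManager();
        XAResource enlisted = new StatelessToyXAResource(rm);
        XAResource other = new StatelessToyXAResource(rm);
        Xid xid = new ToyXid("tx-A");
        try {
            enlisted.start(xid, XAResource.TMNOFLAGS);
            enlisted.end(xid, XAResource.TMSUCCESS);
            other.prepare(xid);                // completion driven through a different instance
            other.commit(xid, false);
        } catch (XAException e) {
            throw new IllegalStateException("unexpected XA error " + e.errorCode, e);
        }
        return rm.stateOf(xid);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Because the resource instances carry no transaction state of their own, it does not matter which of them the transaction manager uses for prepare and commit, nor whether an instance has been reused for another transaction in the meantime.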

This has been logged as internal (bugs.sun.com) bug 7014537.

Comment by jthoennes [ 14/Apr/11 ]

Hello Nigel,

as we heavily use that kind of scenario, I would like to ask whether this issue will be fixed for 3.1.1
without raising a service request.

A quick answer is highly appreciated.

Thanks, Jörg

Comment by Nigel Deakin [ 14/Apr/11 ]

This is currently scheduled for 3.2, though, as always, I can't make commitments as to the contents of future releases.

If you have a support licence and this issue is causing a problem then please contact your support representative (and let me know you've done so) since this would definitely affect the priority we give to fixing it.


Comment by jthoennes [ 14/Apr/11 ]

In reply to comment #9:
> If you have a support licence and this issue is causing a problem them please
> contact your support representative (and let me know you've done so) since this
> would definitely affect the priority we give to fixing it.

Thanks, Nigel. Yes, we have a support contract. What do you need if I file a service request on My Oracle Support (MOS)?
Do you have access to the service requests submitted?

Cheers, Jörg

Comment by Scott Fordin [ 15/Apr/11 ]

Added issue to 3.1 Release Notes.

Comment by Nazrul [ 21/Apr/11 ]

It would be good to take a look at this issue for 3.1.1

Comment by Nigel Deakin [ 03/May/11 ]

@jthoennes - Yes, please file an issue with Oracle support as you suggest. There is a separate engineering team to resolve customer issues, so raising it with support increases the resources available to address this issue.

Comment by Nigel Deakin [ 03/May/11 ]

I have reviewed this bug for 3.1.1 and decided not to fix it in that version for the following reasons:

  • This bug is in older versions of GlassFish (including GlassFish 2.1.1) and so is not a regression
  • There is a workaround (see earlier comment)
  • The fix would require significant changes to the XAResource implementation classes in the JMSRA resource adapter. In addition to the work involved it would require a lot of testing to be sure that it does not introduce a regression. 3.2 will have much more testing than 3.1.1 and so, given that this is an old bug which has a workaround, I would like to defer fixing this bug until 3.2 so it can be properly tested.

Removing the 3_1_1-review tag.

@jthoennes - note that if you raise this issue with Oracle support this will still be reviewed by Oracle sustaining.

Comment by jthoennes [ 27/May/11 ]

Filed Oracle Service Request "SR 3-3705874175: Resolve GLASSFISH-15558 for Glassfish 3.1.1" for this issue.

Comment by marina vatkina [ 16/Nov/11 ]

Re EJB container behavior: In our current implementation, bean instances are returned to the pool at the end of method invocation. If we were to delay it till the termination of the tx, we would need more instances because a transaction can last much longer than a single method invocation.

Comment by Nigel Deakin [ 14/Dec/11 ]

Adding 3_1_2-exclude tag. Excluding from 3.1.2 for the same reason it was excluded from 3.1.1 (see my comment above).
