
Key: GLASSFISH-20335
Type: Bug
Status: Resolved
Resolution: Fixed
Priority: Major
Assignee: Mahesh Kannan
Reporter: blankema
Votes: 0
Watchers: 0
glassfish

[BATCH RI] PartitionedStepControllerImpl is one-off when retrieving work from parallelBatchWorkUnits

Created: 17/Apr/13 06:41 PM   Updated: 19/Apr/13 05:28 PM   Resolved: 19/Apr/13 05:28 PM
Component/s: batch
Affects Version/s: 4.0_b84_RC1
Fix Version/s: 4.0_b86_RC2

Time Tracking:
Not Specified

Environment:

3.5.0-24-generic #37-Ubuntu SMP Thu Feb 7 01:50:30 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux


Tags: 4_0-approved
Participants: blankema, Mahesh Kannan, ScottKurz, shreedhar_ganapathy and Tom Mueller


Description

Disclaimer: The bug was found on b25 of the BATCH-RI

Example: Partition plan with 4 partitions and 2 threads
On line 334:
// Start up to to the max num we are allowed from the num threads attribute
After this block completes, the variable numCurrentSubmitted has the value 2.

Starting from line 375:
if (readyToSubmitAnother) {
    numCurrentCompleted++;
    logger.fine("Ready to submit another (if there is another left to submit); numCurrentCompleted = " + numCurrentCompleted);
    if (numCurrentCompleted < numTotalForThisExcecution) {
        if (numCurrentSubmitted < numTotalForThisExcecution) {
            numCurrentSubmitted++;
            logger.fine("Submitting # " + numCurrentSubmitted + " out of " + numTotalForThisExcecution + " total for this execution");
            if (stepStatus.getStartCount() > 1) {
                batchKernel.startGeneratedJob(parallelBatchWorkUnits.get(numCurrentSubmitted));
            } else {
                batchKernel.restartGeneratedJob(parallelBatchWorkUnits.get(numCurrentSubmitted));
            }
            readyToSubmitAnother = false;

numCurrentSubmitted is incremented to 3 before it is used as the index, so the workUnit at index 3 is retrieved next and entry 2 of the array is skipped.
As a result, the third workUnit is never executed, and retrieving the last workUnit (index 4) throws an exception.

The exception was:
Wed Apr 17 19:37:41 CEST 2013 [com.ibm.jbatch.container.impl.BaseStepControllerImpl execute] WARNING: Caught exception executing step: java.lang.IndexOutOfBoundsException: Index: 4, Size: 4
at java.util.ArrayList.rangeCheck(ArrayList.java:604)
at java.util.ArrayList.get(ArrayList.java:382)
at com.ibm.jbatch.container.impl.PartitionedStepControllerImpl.executeAndWaitForCompletion(PartitionedStepControllerImpl.java:385)
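
The off-by-one described above can be reproduced in isolation. The following is a minimal sketch, not the actual PartitionedStepControllerImpl code: the class name PartitionSubmitDemo, the method submittedIndices and the incrementFirst flag are hypothetical, and job submission is replaced by simply recording which list index is fetched. It assumes each completed partition frees one slot for the next submission.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone model of the submission loop; not the RI source.
class PartitionSubmitDemo {

    /**
     * Returns the work-unit indices in the order they are fetched.
     * incrementFirst = true mimics the reported behaviour (the counter is
     * incremented before it is used as the list index); false uses the
     * counter as the index first and increments it afterwards.
     */
    static List<Integer> submittedIndices(int numPartitions, int numThreads, boolean incrementFirst) {
        List<String> workUnits = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            workUnits.add("unit-" + i);
        }

        List<Integer> fetched = new ArrayList<>();
        int numCurrentSubmitted = 0;
        int numCurrentCompleted = 0;

        // Initial burst: start up to numThreads work units.
        while (numCurrentSubmitted < numThreads && numCurrentSubmitted < numPartitions) {
            workUnits.get(numCurrentSubmitted);
            fetched.add(numCurrentSubmitted);
            numCurrentSubmitted++;
        }

        // Each completion frees a slot for one more submission.
        while (numCurrentCompleted < numPartitions) {
            numCurrentCompleted++;
            if (numCurrentCompleted < numPartitions && numCurrentSubmitted < numPartitions) {
                int index;
                if (incrementFirst) {
                    numCurrentSubmitted++;
                    index = numCurrentSubmitted;   // skips one entry
                } else {
                    index = numCurrentSubmitted;   // next unsubmitted entry
                    numCurrentSubmitted++;
                }
                workUnits.get(index);              // throws when index == size
                fetched.add(index);
            }
        }
        return fetched;
    }

    public static void main(String[] args) {
        System.out.println("index first, then increment: " + submittedIndices(4, 2, false));
        try {
            submittedIndices(4, 2, true);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("increment first: " + e.getClass().getSimpleName());
        }
    }
}
```

With 4 partitions and 2 threads, the increment-first variant fetches indices 0, 1 and 3 and then fails on index 4, which matches the exception above. Using the counter as the index before incrementing it fetches 0, 1, 2, 3.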



blankema added a comment - 17/Apr/13 06:43 PM

I have a test project (maven based) and a patch file but could not find where to upload it.

If needed, just contact me.


shreedhar_ganapathy added a comment - 17/Apr/13 07:51 PM

-> Mahesh for eval on Batch RI.


Mahesh Kannan added a comment - 18/Apr/13 04:03 AM

Just wanted to add that currently (RC1) GlassFish is using only b23 batch jars.
Anyways, Scott can comment on this more


ScottKurz added a comment - 18/Apr/13 04:24 PM

This is a bug. As we didn't test @threads in the TCK we ended up not testing it at all. I will fix in the next drop.


ScottKurz added a comment - 18/Apr/13 05:06 PM

Actually there's another bug here where we're neglecting to even honor the @threads attribute. I'm guessing this is the patch file mentioned since without this fixed I can't see how you could have created this in the first place. If you want to send it in case there's something else I missed.. you can send to ScottKurz@java.net, but I think I have this fixed now. Thanks for identifying this...

BTW, Mahesh... I think this is b25. The 84 driver was supposed to have been built on 10 Apr 2013 ... and we delivered b25 on 4/9.


Mahesh Kannan added a comment - 19/Apr/13 05:07 AM
  • What is the impact on the customer of the bug?
    Without this fix some workUnit is never executed
  • What is the cost/risk of fixing the bug?
    Medium. Just a few lines. But Scott has already fixed this.
  • Is there an impact on documentation or message strings?
    No
  • Which tests should QA (re)run to verify the fix did not destabilize GlassFish?
    All batch tests. Although I have run batch devtests and QL.
  • Which is the targeted build of 4.0 for this fix?
    The fix is ready. Most likely in RC2.
  • If this is an integration of a new version of a component from another project,
    what are the changes that are being brought in? This might be list of
    Jira issues from that project or a list of revision messages.
    This requires integration of b26 jars from IBM

Tom Mueller added a comment - 19/Apr/13 01:31 PM

Approved for 4.0, but in the future please fill out the change control template.


ScottKurz added a comment - 19/Apr/13 02:12 PM

Just confirming that this is indeed fixed in 1.0-b26.


Mahesh Kannan added a comment - 19/Apr/13 05:28 PM

Resolved in b26 jars.

Svn commit info

svn commit -m "Integrate b26 jars. Fix for 20335, 20264. QL and batch devtests passed. Approved by Tom"
Sending appserver/pom.xml
Transmitting file data ....
Committed revision 61563.