Bugzilla – Full Text Bug Listing
Summary: Metric values are 0 when batch runs in a partition
Severity: major
CC: issues, m_edgar, radcortez
Description radcortez 2013-12-31 15:53:28 UTC
When running a job with a partition, the Metric array from the StepExecution has all its values equal to 0. After removing the partition, the Metric array shows the expected values for the data processed.

You can check a couple of test cases that reproduce the problem here:
https://arungupta.ci.cloudbees.com/job/Java%20EE%207%20Samples%20on%20GlassFish-cb/76/testReport/org.javaee7.batch.sample.chunk.partition/BatchChunkPartitionTest/testBatchChunkPartition/
https://arungupta.ci.cloudbees.com/job/Java%20EE%207%20Samples%20on%20GlassFish-cb/76/testReport/org.javaee7.batch.sample.chunk.mapper/BatchChunkMapperTest/testBatchChunkMapper/

And here is the code:
https://github.com/radcortez/javaee7-samples/tree/master/batch/chunk-mapper
https://github.com/radcortez/javaee7-samples/tree/master/batch/chunk-partition
Comment 1 m_edgar 2013-12-31 21:26:28 UTC
Since I believe it to be related, it's worth noting that the persistent user data is also not available for partitioned steps. I'm guessing that a solution won't be possible with the 1.0 specification (since the persistent data can't be aggregated like the Metrics can be, possibly).
Comment 2 ScottKurz 2013-12-31 22:38:40 UTC
Hi, I had realized the Metrics weren't set up correctly for partitioned steps, thanks for opening the bug to track this. --- Question to m_edgar: what issue are you having with the persistent user data? I'm wondering if this is working as designed and your issue relates to the fact that each partition gets its own StepContext, and along with it its own persistent data, in addition to the top-level thread's StepContext and persistent data. Could you give some sample code and explain? Thanks
Comment 3 m_edgar 2014-01-02 01:04:31 UTC
(In reply to ScottKurz from comment #2) Scott - The persistent data is working as expected within the partitions as they execute. However, my issue is with accessing the data via the job operator and StepExecution list. Since only the execution representing the parent thread of the step is returned, none of the details of the partitions are available. The metrics missing from the parent of the step seem to be the same situation, since they are present on the child records as seen in the job repository. Would a modified runtime that returns both the parent and the child step executions for a particular step still be in compliance with the specification? It seems to me that it (the spec) doesn't necessarily indicate whether the StepExecution list returned by the job operator is for the parent, the children, or both. Thanks
Comment 4 ScottKurz 2014-02-13 11:36:05 UTC
While we could improve the behavior in the RI alone without a spec update, this isn't the first time we've touched on the need to consider, at the spec level, how to represent the partition-level equivalent of StepExecution. Marking as "SPEC".
Comment 5 ScottKurz 2014-11-01 03:36:33 UTC
I'm breaking off the idea of a partition-level StepExecution into Bug 6490, since this would require a new API. For this bug, 5675, we'll just fix the RI to aggregate the metrics. I started it but haven't finished yet.
Comment 6 ScottKurz 2014-11-05 19:46:05 UTC
Comment 7 ScottKurz 2014-11-05 19:47:52 UTC
Fixed in: https://github.com/WASdev/standards.jsr352.jbatch/commit/8a82576ae3f85aaf9d00c09045e63c8f1055b342 We aggregate the metrics (not mentioned in spec IIRC but probably non-controversial). We only do the summation on a successful execution. If we blow up before then you'll never see the metrics again (avoids the need to understand when they need updating).
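For readers following along, the summation described in this comment can be sketched in plain Java. This is only an illustration, not the code from the commit above: the class `PartitionMetricsSketch`, the method `aggregate`, and the local `MetricType` enum (a stand-in for a few of the constants in `javax.batch.runtime.Metric.MetricType`) are all hypothetical names chosen for this example.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: sums per-partition metric values into one top-level total.
class PartitionMetricsSketch {

    // Stand-in for a subset of javax.batch.runtime.Metric.MetricType (assumption for this example).
    enum MetricType { READ_COUNT, WRITE_COUNT, COMMIT_COUNT }

    // Adds up each metric type across all partitions; a type no partition reported stays at 0.
    static Map<MetricType, Long> aggregate(List<Map<MetricType, Long>> partitionMetrics) {
        Map<MetricType, Long> totals = new EnumMap<>(MetricType.class);
        for (MetricType type : MetricType.values()) {
            totals.put(type, 0L);
        }
        for (Map<MetricType, Long> partition : partitionMetrics) {
            for (Map.Entry<MetricType, Long> entry : partition.entrySet()) {
                totals.merge(entry.getKey(), entry.getValue(), Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Two made-up partitions, each with its own counters.
        Map<MetricType, Long> first = new EnumMap<>(MetricType.class);
        first.put(MetricType.READ_COUNT, 4L);
        first.put(MetricType.WRITE_COUNT, 4L);

        Map<MetricType, Long> second = new EnumMap<>(MetricType.class);
        second.put(MetricType.READ_COUNT, 3L);
        second.put(MetricType.WRITE_COUNT, 3L);

        // In the behavior described above, this step would run only after the
        // partitioned step has completed successfully.
        System.out.println(aggregate(Arrays.asList(first, second)));
    }
}
```

Under this sketch, the top-level step would report the per-type sums instead of zeros, which is the behavior the original report expected.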
Comment 8 ScottKurz 2014-11-07 11:26:22 UTC
Extended fix to case where partition runs from a split-flow in: https://github.com/WASdev/standards.jsr352.jbatch/commit/151a0060b89d0a4f5f81ed9a942c100525d79834
Comment 9 ScottKurz 2015-09-01 21:05:52 UTC
Marking as resolved, now that the RI 1.0.1 version has been released. I'll just note that the behavior of aggregating the metrics should probably be considered to be an RI-specific behavior at this time (not required by the standard). If someone feels a need to clarify this at the spec level, please raise a new issue. I think it's OK to leave this for now.