Bug 5675

Summary: Metric values are 0 when batch runs in a partition
Product: jbatch Reporter: radcortez
Component: RI Assignee: ScottKurz
Severity: major CC: issues, m_edgar, radcortez
Priority: P5    
Version: 1   
Target Milestone: ---   
Hardware: Macintosh   
OS: Mac OS   

Comment 1 m_edgar 2013-12-31 21:26:28 UTC
Since I believe it to be related, it's worth noting that the persistent user data is also not available for partitioned steps. I'm guessing that a solution won't be possible with the 1.0 specification, possibly because the persistent data can't be aggregated the way the Metrics can.
Comment 2 ScottKurz 2013-12-31 22:38:40 UTC
Hi, I had realized the Metrics weren't set up correctly for partitioned steps. Thanks for opening the bug to track this.


Question to m_edgar:  what issue are you having with the persistent user data?  

I'm wondering if this is working as designed and your issue relates to the fact that each partition gets its own StepContext, and along with it its own persistent data, in addition to the top-level thread's StepContext and persistent data.   

Could you give some sample code and explain?

Comment 3 m_edgar 2014-01-02 01:04:31 UTC
(In reply to ScottKurz from comment #2)

Scott - 
The persistent data is working as expected within the partitions as they execute. However, my issue is with accessing the data via the job operator and StepExecution list. Since only the execution representing the parent thread of the step is returned, none of the details of the partitions is available. 

The metrics missing from the step's parent execution seem to be the same situation, since they are present on the child records as seen in the job repository. 

Would a modified runtime which returns both the parent and the child step executions for a particular step still be in compliance with the specification? It seems to me that the spec doesn't necessarily indicate whether the StepExecution list returned by the job operator is for the parent, the children, or both.

Comment 4 ScottKurz 2014-02-13 11:36:05 UTC
While we could improve the behavior in the RI alone w/o a spec update, this isn't the first time we've run into the question of how the spec should treat a partition-level equivalent of StepExecution.  

Marking as "SPEC".
Comment 5 ScottKurz 2014-11-01 03:36:33 UTC
I'm breaking off the idea of a partition-level StepExecution into Bug 6490, since this would require a new API.

For this bug, 5675, we'll just fix the RI to aggregate the metrics.   I started it but haven't finished yet.
Comment 7 ScottKurz 2014-11-05 19:47:52 UTC
Fixed in:

We aggregate the metrics (not mentioned in spec IIRC but probably non-controversial).  We only do the summation on a successful execution.   If we blow up before then you'll never see the metrics again (avoids the need to understand when they need updating).
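
For illustration only, the aggregation described above could be sketched roughly as follows. This is not the RI code: the class PartitionMetricsSketch, the aggregate method and its parameters are hypothetical, and the MetricType enum is a local stand-in that only mirrors the constant names of javax.batch.runtime.Metric.MetricType so the example compiles without the batch API on the classpath. It assumes the top-level step simply sums each metric type across its partitions, and only when the execution was successful.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class PartitionMetricsSketch {

    // Local stand-in mirroring the constant names of
    // javax.batch.runtime.Metric.MetricType (assumption: kept local so the
    // sketch has no dependency on the batch API jar).
    public enum MetricType {
        READ_COUNT, WRITE_COUNT, COMMIT_COUNT, ROLLBACK_COUNT,
        READ_SKIP_COUNT, PROCESS_SKIP_COUNT, FILTER_COUNT, WRITE_SKIP_COUNT
    }

    // Hypothetical helper: sums per-partition counters into top-level totals.
    // If the execution was not successful, no summation happens and every
    // total stays at 0.
    public static Map<MetricType, Long> aggregate(
            boolean executionSucceeded,
            List<Map<MetricType, Long>> partitionMetrics) {
        Map<MetricType, Long> totals = new EnumMap<>(MetricType.class);
        for (MetricType type : MetricType.values()) {
            totals.put(type, 0L);
        }
        if (!executionSucceeded) {
            return totals;
        }
        for (Map<MetricType, Long> partition : partitionMetrics) {
            for (Map.Entry<MetricType, Long> e : partition.entrySet()) {
                // merge adds the partition's value to the running total
                totals.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<MetricType, Long> p0 = new EnumMap<>(MetricType.class);
        p0.put(MetricType.READ_COUNT, 10L);
        p0.put(MetricType.WRITE_COUNT, 10L);

        Map<MetricType, Long> p1 = new EnumMap<>(MetricType.class);
        p1.put(MetricType.READ_COUNT, 15L);
        p1.put(MetricType.WRITE_COUNT, 12L);

        Map<MetricType, Long> totals = aggregate(true, Arrays.asList(p0, p1));
        System.out.println("READ_COUNT = " + totals.get(MetricType.READ_COUNT));
        System.out.println("WRITE_COUNT = " + totals.get(MetricType.WRITE_COUNT));
    }
}
```

A sketch like this keeps the top-level values at 0 for a failed execution, which matches the trade-off noted above: there is no need to track when partial totals would need updating.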
Comment 8 ScottKurz 2014-11-07 11:26:22 UTC
Extended fix to case where partition runs from a split-flow in:

Comment 9 ScottKurz 2015-09-01 21:05:52 UTC
Marking as resolved, now that the RI 1.0.1 version has been released.

I'll just note that the behavior of aggregating the metrics should probably be considered to be an RI-specific behavior at this time (not required by the standard).   If someone feels a need to clarify this at the spec level, please raise a new issue.  I think it's OK to leave this for now.