Bug 5911 - Clarify partition restart processing, PartitionPlan properties, and persistent user data for partitioned steps.
Product: jbatch
Classification: Unclassified
Component: SPEC
Hardware: PC Windows
Importance: P5 enhancement
Target Milestone: ---
Assigned To: ScottKurz
Reported: 2014-03-28 14:17 UTC by ScottKurz
Modified: 2015-09-03 16:25 UTC

Description ScottKurz 2014-03-28 14:17:35 UTC

Comment 1 ScottKurz 2014-03-31 16:41:11 UTC
Made these six updates:

1) For the Javadoc of PartitionPlan, updated class-level doc to:

 * A PartitionPlan contains: 
 * <ol>
 * <li>number of partition instances </li>
 * <li>number of threads on which to execute the partitions</li>
 * <li>substitution properties for each Partition (which can be
 * referenced using the <b><i>#{partitionPlan['propertyName']}</i></b>
 * syntax).</li>
 * </ol> 

and changed getPartitionProperties() Javadoc to:

 * Gets array of Partition Properties objects for Partitions.
 * <p>
 * These can be used in Job XML substitution using  
 * substitution expressions with the syntax:
 *   <b><i>#{partitionPlan['propertyName']}</i></b>
 * <p>
 * Each element of the Properties array returned can
 * be used to resolve substitutions for a single partition.
 * In the typical use case, each Properties element will
 * have a similar set of property names, with a 
 * substitution potentially resolving to the corresponding
 * value for each partition.
 * @return Partition Properties object array


2) Enhanced example at the end of Section to:

E.g. Given job, job1: 
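<job id="job1">
	<step id="step1">
		<chunk>
			<reader ref="MyReader">
				<properties>
					<property name="infile.name" value="file#{partitionPlan['myPartitionNumber']}.txt"/>
					<property name="outfile.name" value="#{partitionPlan['outFile']}"/>
				</properties>
			</reader>
			<writer ref="MyWriter"/>
		</chunk>
		<partition><mapper ref="MyMapper"/></partition>
	</step>
</job>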

and MyMapper implementation: 

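import java.util.Properties;
import javax.batch.api.partition.PartitionMapper;
import javax.batch.api.partition.PartitionPlan;
import javax.batch.api.partition.PartitionPlanImpl;

public class MyMapper implements PartitionMapper {
	public PartitionPlan mapPartitions() {
		PartitionPlanImpl pp= new PartitionPlanImpl();
		pp.setPartitions(2);

		// Substitution properties for partition 0
		Properties p0= new Properties();
		p0.setProperty("myPartitionNumber", "0");
		p0.setProperty("outFile", "outFileA.txt");

		// Substitution properties for partition 1
		Properties p1= new Properties();
		p1.setProperty("myPartitionNumber", "1");
		p1.setProperty("outFile", "outFileB.txt");

		// One Properties element per partition; attach the array to the plan
		Properties[] partitionProperties= new Properties[2];
		partitionProperties[0]= p0;
		partitionProperties[1]= p1;
		pp.setPartitionProperties(partitionProperties);
		return pp;
	}
}
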
The step1 chunk would run as two partitions, with the itemReader property "infile.name" resolved to "file0.txt" and "file1.txt" for partitions 0 and 1, respectively.  Also, itemReader property "outfile.name" would resolve to "outFileA.txt" and "outFileB.txt" for partitions 0 and 1, respectively.
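
The per-partition resolution described above can be illustrated with a small stand-alone sketch. PartitionPlanSubstitution is a hypothetical helper, not part of the javax.batch API and not how any particular runtime implements substitution. The template string "file#{partitionPlan['myPartitionNumber']}.txt" is an assumed attribute value chosen to match the stated result, and resolving an unknown property to the empty string is also an assumption.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of javax.batch): imitates resolving
// #{partitionPlan['name']} expressions against ONE partition's Properties.
final class PartitionPlanSubstitution {

    // Matches #{partitionPlan['propertyName']} and captures propertyName.
    private static final Pattern EXPR =
            Pattern.compile("#\\{partitionPlan\\['([^']+)'\\]\\}");

    static String resolve(String template, Properties partitionProps) {
        Matcher m = EXPR.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Assumption: a property missing from this partition resolves to "".
            String value = partitionProps.getProperty(m.group(1), "");
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Properties p0 = new Properties();
        p0.setProperty("myPartitionNumber", "0");
        p0.setProperty("outFile", "outFileA.txt");

        Properties p1 = new Properties();
        p1.setProperty("myPartitionNumber", "1");
        p1.setProperty("outFile", "outFileB.txt");

        // Assumed attribute values for infile.name and outfile.name.
        String inTemplate = "file#{partitionPlan['myPartitionNumber']}.txt";
        String outTemplate = "#{partitionPlan['outFile']}";

        // Each partition resolves the same expressions against its own element.
        for (Properties p : new Properties[] { p0, p1 }) {
            System.out.println(resolve(inTemplate, p) + " -> " + resolve(outTemplate, p));
        }
    }
}
```

The same expression yields a different value per partition only because each partition is resolved against its own element of the Properties array.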


3) At the end of Section, added: 

Note the specification does not attempt to guarantee order of partition execution with respect to the order within a statically or dynamically-defined plan.
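
One practical consequence is that application code should not rely on partition 0 finishing (or even starting) before partition 1. The following is only a sketch under that assumption: UnorderedPartitions is a hypothetical class that stands in for a runtime with a plain thread pool, and it keys results by the "myPartitionNumber" property instead of by arrival order.

```java
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a batch runtime; not part of javax.batch.
final class UnorderedPartitions {

    // Runs one task per plan element and returns outputs keyed by partition
    // number, so the result does not depend on which partition ran first.
    static Map<String, String> runAll(Properties[] plan, int threads) {
        Map<String, String> outputs = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (Properties p : plan) {
            pool.execute(() -> {
                String key = p.getProperty("myPartitionNumber", "?");
                outputs.put(key, p.getProperty("outFile", ""));
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("partitions did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return outputs;
    }
}
```

Keying by a property carried in the plan keeps the result stable regardless of scheduling.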


4) Changed Javadoc for StepExecution#getPersistentUserData to:

	 * Get persistent user data.
	 * <p>
	 * For a partitioned step, this returns
	 * the persistent user data of the 
	 * <code>StepContext</code> of the "top-level"
	 * or main thread (the one the <code>PartitionAnalyzer</code>, etc.
	 * execute on).   It does not return the persistent user
	 * data of the partition threads. 
	 * @return persistent data 
         public Serializable getPersistentUserData();


5) Added clarification at the end of Section  It now ends with:

"For a partitioned step, there is one StepContext for the parent step/thread;  there is a distinct StepContext for each sub-thread and each StepContext has its own distinct persistent user data for each sub-thread."

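The relationship described in updates 4 and 5 can be modeled roughly as follows. SketchStepContext and SketchPartitionedStep are hypothetical classes written for illustration; only the getPersistentUserData/setPersistentUserData accessor names mirror the real StepContext, and nothing here reflects how a runtime actually stores the data.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for StepContext; models persistent user data only.
final class SketchStepContext {
    private Serializable persistentUserData;

    Serializable getPersistentUserData() {
        return persistentUserData;
    }

    void setPersistentUserData(Serializable data) {
        this.persistentUserData = data;
    }
}

// Hypothetical model of a partitioned step: one top-level context plus one
// distinct context per partition sub-thread.
final class SketchPartitionedStep {
    final SketchStepContext topLevel = new SketchStepContext();
    final List<SketchStepContext> partitions = new ArrayList<>();

    SketchPartitionedStep(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new SketchStepContext());
        }
    }

    // Models StepExecution#getPersistentUserData for a partitioned step:
    // only the top-level context's data is returned, never a partition's.
    Serializable stepExecutionPersistentUserData() {
        return topLevel.getPersistentUserData();
    }
}
```

In this model, data set on a partition's context is visible only through that partition's context; the step-level accessor sees only what the top-level thread stored.
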

6) Added clarification to 2nd paragraph in Sec. 10.8.4, Rule 3.c.

It now reads:

"Note if the step is a partitioned step, only the partitions that did not complete previously are restarted.  This behavior may be overridden through the PartitionPlan, as specified in section 10.9.4 PartitionPlan.  Note for a partitioned step, the checkpoints and persistent user data are loaded from the persistent store on a per-partition basis (this is not a new rule, but a fact implied by the discussion of checkpoints in Section 8.2.6 and the Step Context in Section, which is summarized here for convenience)."
Comment 2 ScottKurz 2015-09-03 16:25:22 UTC
We kept Bug 5919 open for TCK updates, and neither of the TODOs here is particularly clear-cut, so we can close this one.