Bug 5911

Summary: Clarify partition restart processing, PartitionPlan properties, and persistent user data for partitioned steps.
Product: jbatch
Reporter: ScottKurz
Component: SPEC
Assignee: ScottKurz
Status: NEW ---
Severity: enhancement
CC: issues
Priority: P5    
Version: 1   
Target Milestone: ---   
Hardware: PC   
OS: Windows   
Whiteboard: 1.0_mr_pending

Description ScottKurz 2014-03-28 14:17:35 UTC

    
Comment 1 ScottKurz 2014-03-31 16:41:11 UTC
Made these six updates:

1) For the Javadoc of PartitionPlan, updated class-level doc to:

 * A PartitionPlan contains: 
 * <ol>
 * <li>number of partition instances </li>
 * <li>number of threads on which to execute the partitions</li>
 * <li>substitution properties for each Partition (which can be
 * referenced using the <b><i>#{partitionPlan['propertyName']}</i></b> 
 * syntax).</li>
 * </ol> 

and changed getPartitionProperties() Javadoc to:

/**
 * Gets array of Partition Properties objects for Partitions.
 * <p>
 * These can be used in Job XML substitution using  
 * substitution expressions with the syntax:
 *   <b><i>#{partitionPlan['propertyName']}</i></b>
 * <p>
 * Each element of the Properties array returned can  
 * be used to resolve substitutions for a single partition.
 * In the typical use case, each Properties element will
 * have a similar set of property names, with a 
 * substitution potentially resolving to the corresponding
 * value for each partition.
 * 
 * @return Partition Properties object array
 */

----

2) Enhanced example at the end of Section 8.8.1.4 to:

E.g. Given job, job1: 
<job id="job1">
	<step id="step1">
		<chunk>
			<reader ref="MyReader">
				<properties>
					<property name="infile.name" 
						   value="file#{partitionPlan['myPartitionNumber']}.txt"/>
					<property name="outfile.name" 
						   value="#{partitionPlan['outFile']}"/>
				</properties>
			</reader>
			<writer ref="MyWriter"/>
		</chunk>
		<partition>
			<mapper ref="MyMapper"/>
		</partition>
	</step>
</job>

and MyMapper implementation: 

public class MyMapper implements PartitionMapper { 
	public PartitionPlan mapPartitions() { 
		PartitionPlanImpl pp= new PartitionPlanImpl();
		pp.setPartitions(2);
		Properties p0= new Properties();
		p0.setProperty("myPartitionNumber", "0");
		p0.setProperty("outFile", "outFileA.txt");

		Properties p1= new Properties();
		p1.setProperty("myPartitionNumber", "1");
		p1.setProperty("outFile", "outFileB.txt");
		Properties[] partitionProperties= new Properties[2];
		partitionProperties[0]= p0;
		partitionProperties[1]= p1;
		pp.setPartitionProperties(partitionProperties);
		return pp;
	}
}
The step1 chunk would run as two partitions, with the itemReader property "infile.name" resolved to "file0.txt" and "file1.txt" for partitions 0 and 1, respectively.  Also, the itemReader property "outfile.name" would resolve to "outFileA.txt" and "outFileB.txt" for partitions 0 and 1, respectively.
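
For anyone who wants to check the resolution by hand, the per-partition substitution in this example can be modeled with a small standalone sketch. This is not the batch runtime's resolver: the class name PartitionPlanSubstitution and the method resolve are hypothetical, only the #{partitionPlan['name']} operator is handled, and resolving a missing property to an empty string is an assumption made for this sketch.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical model of partitionPlan substitution; not part of the batch API.
public class PartitionPlanSubstitution {

    // Matches #{partitionPlan['propertyName']} and captures propertyName.
    private static final Pattern EXPR =
            Pattern.compile("#\\{partitionPlan\\['([^']+)'\\]\\}");

    // Resolves every partitionPlan expression in value against one
    // partition's Properties element.
    static String resolve(String value, Properties partitionProps) {
        Matcher m = EXPR.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Assumption of this sketch: a missing property becomes "".
            String replacement = partitionProps.getProperty(m.group(1), "");
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Same property values as the MyMapper example.
        Properties p0 = new Properties();
        p0.setProperty("myPartitionNumber", "0");
        p0.setProperty("outFile", "outFileA.txt");

        Properties p1 = new Properties();
        p1.setProperty("myPartitionNumber", "1");
        p1.setProperty("outFile", "outFileB.txt");

        Properties[] partitionProperties = { p0, p1 };
        for (int i = 0; i < partitionProperties.length; i++) {
            String in = resolve(
                    "file#{partitionPlan['myPartitionNumber']}.txt",
                    partitionProperties[i]);
            String outName = resolve(
                    "#{partitionPlan['outFile']}", partitionProperties[i]);
            System.out.println("partition " + i
                    + ": infile.name=" + in + ", outfile.name=" + outName);
        }
    }
}
```

Each Properties element is applied to the same property value template, which is why the elements are expected to share a similar set of property names.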

----

3) At the end of Section 8.2.6.1, added: 

Note the specification does not attempt to guarantee order of partition execution with respect to the order within a statically or dynamically-defined plan.

----

4) Changed Javadoc for StepExecution#getPersistentUserData to:

	/**
	 * Get persistent user data.
	 * <p>
	 * For a partitioned step, this returns
	 * the persistent user data of the 
	 * <code>StepContext</code> of the "top-level"
	 * or main thread (the one the <code>PartitionAnalyzer</code>, etc.
	 * execute on).   It does not return the persistent user
	 * data of the partition threads. 
	 * @return persistent data 
	 */	
	public Serializable getPersistentUserData();
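
The distinction this Javadoc draws can be pictured with a toy model that has one data slot for the top-level thread and one per partition thread. This is only a sketch of the semantics: PartitionedStepContexts, Ctx, and stepExecutionPersistentUserData are hypothetical names standing in for the runtime's StepContext instances and for StepExecution#getPersistentUserData, and no real persistence is involved.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of persistent user data in a partitioned step.
public class PartitionedStepContexts {

    // Stand-in for the persistent-user-data part of a StepContext.
    static final class Ctx {
        private Serializable persistentUserData;

        void setPersistentUserData(Serializable data) {
            this.persistentUserData = data;
        }

        Serializable getPersistentUserData() {
            return persistentUserData;
        }
    }

    // Context of the top-level (main) thread of the step.
    final Ctx topLevel = new Ctx();

    // One separate context per partition thread.
    final List<Ctx> partitions = new ArrayList<>();

    PartitionedStepContexts(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new Ctx());
        }
    }

    // Models the documented behavior: only the top-level data is returned,
    // never the data set on a partition thread.
    Serializable stepExecutionPersistentUserData() {
        return topLevel.getPersistentUserData();
    }

    public static void main(String[] args) {
        PartitionedStepContexts step = new PartitionedStepContexts(2);
        step.topLevel.setPersistentUserData("set on main thread");
        step.partitions.get(0).setPersistentUserData("set on partition 0");

        System.out.println("step execution view: "
                + step.stepExecutionPersistentUserData());
        System.out.println("partition 0 view: "
                + step.partitions.get(0).getPersistentUserData());
        System.out.println("partition 1 view: "
                + step.partitions.get(1).getPersistentUserData());
    }
}
```

In this model, data set on partition 0 is visible neither to partition 1 nor through the step-execution view.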

----

5) Added clarification at the end of Section 9.4.1.1.  It now ends with:

"For a partitioned step, there is one StepContext for the parent step/thread;  there is a distinct StepContext for each sub-thread and each StepContext has its own distinct persistent user data for each sub-thread."

----

6) Added clarification to 2nd paragraph in Sec. 10.8.4, Rule 3.c.

It now reads:

"Note if the step is a partitioned step, only the partitions that did not complete previously are restarted.  This behavior may be overridden through the PartitionPlan, as specified in section 10.9.4 PartitionPlan.  Note for a partitioned step, the checkpoints and persistent user data are loaded from the persistent store on a per-partition basis (this is not a new rule, but a fact implied by the discussion of checkpoints in Section 8.2.6 and the Step Context in Section 9.4.1.1, which is summarized here for convenience)."