Bugzilla – Bug 4531
remove time-limit from chunk configuration
Last modified: 2013-01-24 20:18:10 UTC
In section 5.2.1 on page 18, the time-limit attribute on a chunk should be removed since the only two checkpoint policy options are item or custom.
When policy=item, item-count specifies the (desired maximum) number of items per chunk, i.e. after item-count items have been read, take a checkpoint. So it's like Spring Batch's commit-interval. The time-limit is a qualifier for policy=item that lets the job developer configure a time limit on the chunk to help it avoid unexpected transaction timeouts. The item-count and time-limit always work together when policy=item.
The effect of item-count and time-limit is that you can optionally end the chunk and checkpoint when either: item-count items have been processed or time-limit is reached.
With item-count=10 alone, the result is that chunk size is consistently 10.
With item-count=10 and time-limit=30, the result is that chunk size varies up to 10, because a checkpoint is taken after every 10 items or 30 seconds, whichever comes first.
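To make the "whichever comes first" behavior concrete, here is a minimal sketch of the decision logic. The class and method names are hypothetical and are not part of the spec or of Spring Batch; treating a time limit of 0 as "no limit" is also an assumption made for illustration.

```java
import java.util.function.LongSupplier;

// Illustrative sketch only; class and method names are hypothetical,
// not part of the batch spec or of Spring Batch.
class ItemPolicySketch {
    private final int itemCount;             // items per chunk; the thread gives 10 as the default
    private final long timeLimitSeconds;     // assumption: 0 means "no time limit"
    private final LongSupplier clockSeconds; // injectable clock, so the logic is testable
    private int itemsInChunk = 0;
    private long chunkStart;

    ItemPolicySketch(int itemCount, long timeLimitSeconds, LongSupplier clockSeconds) {
        this.itemCount = itemCount;
        this.timeLimitSeconds = timeLimitSeconds;
        this.clockSeconds = clockSeconds;
        this.chunkStart = clockSeconds.getAsLong();
    }

    // Call once after each item is processed.
    // Returns true when the chunk should end and a checkpoint be taken.
    boolean itemProcessed() {
        itemsInChunk++;
        long now = clockSeconds.getAsLong();
        boolean countReached = itemsInChunk >= itemCount;
        boolean timeReached = timeLimitSeconds > 0 && now - chunkStart >= timeLimitSeconds;
        if (countReached || timeReached) {
            // Start a fresh chunk: reset both the item counter and the timer.
            itemsInChunk = 0;
            chunkStart = now;
            return true;
        }
        return false;
    }
}
```

Constructed with (10, 0, clock), every chunk in this sketch ends after exactly 10 items; with (10, 30, clock), a chunk ends early whenever 30 seconds pass before the tenth item.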
I think avoiding transaction timeout via deterministic configuration is better than trial and error and hoping for no surprises during production runs. I think not having to implement a custom policy for this is a nice feature.
Now one might claim variable chunk size is a bad idea. Or that time-limit overreaches and concepts like that should be relegated exclusively to custom policies. Or perhaps there are other concerns not yet raised.
Where do you side?
Instead of having another configuration facet like this, I'd rather handle that with composition (that's the way we handle it in SB). If you want to add a timeout to your chunk completion policy, you can create a composite custom policy that handles both of these scenarios.
As for not having to implement this being a nice feature, there is nothing to preclude an implementer from providing a composite policy and a timeout implementation with their distribution as well.
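For comparison, the composition approach might look roughly like the sketch below. The interface and class names are hypothetical stand-ins invented for illustration; they are not the spec's checkpoint API nor Spring Batch's actual classes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.LongSupplier;

// Illustrative sketch only; all names here are hypothetical.
class CompositePolicySketch {

    // Minimal policy contract assumed for this example.
    interface CompletionPolicy {
        void chunkStarted();
        void itemProcessed();
        boolean isComplete();
    }

    // Completes after a fixed number of items.
    static final class CountPolicy implements CompletionPolicy {
        private final int maxItems;
        private int count;
        CountPolicy(int maxItems) { this.maxItems = maxItems; }
        public void chunkStarted() { count = 0; }
        public void itemProcessed() { count++; }
        public boolean isComplete() { return count >= maxItems; }
    }

    // Completes once the given number of seconds has elapsed since chunk start.
    static final class TimeoutPolicy implements CompletionPolicy {
        private final long limitSeconds;
        private final LongSupplier clockSeconds;
        private long start;
        TimeoutPolicy(long limitSeconds, LongSupplier clockSeconds) {
            this.limitSeconds = limitSeconds;
            this.clockSeconds = clockSeconds;
        }
        public void chunkStarted() { start = clockSeconds.getAsLong(); }
        public void itemProcessed() { /* time-based; nothing to track per item */ }
        public boolean isComplete() { return clockSeconds.getAsLong() - start >= limitSeconds; }
    }

    // Composite: the chunk is complete as soon as any delegate says so.
    static final class AnyOf implements CompletionPolicy {
        private final List<CompletionPolicy> delegates;
        AnyOf(CompletionPolicy... delegates) { this.delegates = Arrays.asList(delegates); }
        public void chunkStarted() { for (CompletionPolicy d : delegates) d.chunkStarted(); }
        public void itemProcessed() { for (CompletionPolicy d : delegates) d.itemProcessed(); }
        public boolean isComplete() {
            for (CompletionPolicy d : delegates) {
                if (d.isComplete()) return true;
            }
            return false;
        }
    }
}
```

The trade-off under discussion is visible here: composition keeps each policy small and reusable, but the job developer (or the implementer's distribution) has to supply and wire these classes, whereas a time-limit attribute would make the same behavior available through configuration alone.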
What you say is true. But why shouldn't the spec include time-limit? I think the use case it covers is pervasive enough to warrant it being an intrinsic behavior.
With this attribute being re-introduced, does that mean that leaving item-count off and configuring time-limit=30 is acceptable?
(In reply to comment #4)
> With this attribute being re-introduced, does that mean that leaving item-count
> off and configuring time-limit=30 is acceptable?
It is valid syntax. Remember the default for item-count is 10. So specifying time-limit=30 alone would mean checkpoint the chunk every 10 items or 30 seconds, whichever comes first.