Bugzilla – Full Text Bug Listing
Summary: remove time-limit from chunk configuration
Description mminella 2013-01-17 15:46:05 UTC
In section 5.2.1 on page 18, the time-limit attribute on a chunk should be removed since the only two checkpoint policy options are item or custom.
Comment 1 cvignola 2013-01-17 16:07:01 UTC
When policy=item, item-count specifies the (desired maximum) number of items per chunk; i.e., after item-count items have been read, take a checkpoint. So it's like Spring Batch's commit-interval.

The time-limit is a qualifier for policy=item that lets the job developer configure a time limit on the chunk to help it avoid unexpected transaction timeouts. The item-count and time-limit always work together when policy=item. Their combined effect is that the chunk ends and a checkpoint is taken when either item-count items have been processed or time-limit is reached.

Defaults:
item-count=10 time-limit=0 (unlimited)
Result is that chunk size is consistently 10.

An example:
item-count=10 time-limit=30
Result is that chunk size varies up to 10, because a checkpoint is taken after every 10 items or 30 seconds, whichever comes first.

I think avoiding transaction timeouts via deterministic configuration is better than trial and error and hoping for no surprises during production runs. I think not having to implement a custom policy for this is a nice feature.

Now one might claim variable chunk size is a bad idea. Or that time-limit overreaches and that concepts like that should be relegated exclusively to custom policies. Or perhaps there are other concerns not yet raised. Where do you side?
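To make the "whichever comes first" behavior concrete, here is a minimal sketch in Python. It is not taken from the spec or from any implementation: the function name `chunk_items`, its parameters, and the injectable `clock` are assumptions made for illustration, and the time limit is assumed to be in seconds as in the example above.

```python
import time
from typing import Callable, Iterable, Iterator, List


def chunk_items(
    items: Iterable,
    item_count: int = 10,
    time_limit: float = 0,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[List]:
    """Yield chunks, ending each at item_count items or time_limit seconds.

    time_limit=0 is treated as unlimited, mirroring the default described
    in the comment. All names here are hypothetical.
    """
    chunk: List = []
    started = clock()  # start of the current chunk
    for item in items:
        chunk.append(item)
        # The time check only applies when a limit is configured.
        timed_out = time_limit > 0 and clock() - started >= time_limit
        if len(chunk) >= item_count or timed_out:
            yield chunk  # a checkpoint would be taken here
            chunk = []
            started = clock()
    if chunk:
        yield chunk  # final partial chunk at end of input
```

With the defaults, 25 items come out as chunks of 10, 10, and 5. With a time limit set, a slow reader produces smaller chunks, since the elapsed-time check closes the chunk before item-count is reached.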
Comment 2 mminella 2013-01-17 17:32:35 UTC
Instead of having another configuration facet like this, I'd rather handle that with composition (that's the way we handle it in SB). If you want to add a timeout to your chunk completion policy, you can create a composite custom policy that handles both of these scenarios. As for the claim that not having to implement this is a nice feature: nothing precludes an implementer from providing a composite policy and a timeout implementation with their distribution.
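The composition approach can be sketched roughly as follows. This is an illustrative Python sketch only: the class names `ItemCountPolicy`, `TimeoutPolicy`, and `CompositePolicy` and their methods are hypothetical and are not the Spring Batch or JSR API.

```python
import time
from typing import Callable, Iterable


class ItemCountPolicy:
    """Complete the chunk after a fixed number of items (hypothetical)."""

    def __init__(self, item_count: int) -> None:
        self.item_count = item_count
        self.seen = 0

    def begin_chunk(self) -> None:
        self.seen = 0

    def item_processed(self) -> None:
        self.seen += 1

    def is_complete(self) -> bool:
        return self.seen >= self.item_count


class TimeoutPolicy:
    """Complete the chunk once a number of seconds has elapsed (hypothetical)."""

    def __init__(self, seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.seconds = seconds
        self.clock = clock
        self.started = clock()

    def begin_chunk(self) -> None:
        self.started = self.clock()

    def item_processed(self) -> None:
        pass  # elapsed time does not depend on the item count

    def is_complete(self) -> bool:
        return self.clock() - self.started >= self.seconds


class CompositePolicy:
    """Complete the chunk as soon as any member policy says so."""

    def __init__(self, policies: Iterable) -> None:
        self.policies = list(policies)

    def begin_chunk(self) -> None:
        for policy in self.policies:
            policy.begin_chunk()

    def item_processed(self) -> None:
        for policy in self.policies:
            policy.item_processed()

    def is_complete(self) -> bool:
        return any(policy.is_complete() for policy in self.policies)
```

Under this design, `CompositePolicy([ItemCountPolicy(10), TimeoutPolicy(30)])` would give the same "10 items or 30 seconds" behavior without a dedicated time-limit attribute, at the cost of requiring a custom policy to be configured.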
Comment 3 cvignola 2013-01-17 18:09:20 UTC
What you say is true. But why shouldn't the spec include time-limit? I think the use case it covers is pervasive enough to warrant it being an intrinsic behavior.
Comment 4 mminella 2013-01-17 19:29:36 UTC
With this attribute being re-introduced, does that mean that leaving item-count off and configuring time-limit=30 is acceptable?
Comment 5 cvignola 2013-01-24 20:18:10 UTC
(In reply to comment #4)
> With this attribute being re-introduced, does that mean that leaving item-count
> off and configuring time-limit=30 is acceptable?

It is valid syntax. Remember the default for item-count is 10. So specifying time-limit=30 alone would mean checkpointing the chunk every 10 items or 30 seconds, whichever comes first.
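The way the defaults fill in for an omitted attribute can be sketched as below. This is a hedged Python illustration, not spec text: `resolve_chunk_settings` and the dictionary representation of the chunk attributes are assumptions; only the default values (item-count=10, time-limit=0) come from this thread.

```python
from typing import Mapping, Tuple

DEFAULT_ITEM_COUNT = 10  # default stated in comment 1
DEFAULT_TIME_LIMIT = 0   # 0 means unlimited


def resolve_chunk_settings(attrs: Mapping[str, str]) -> Tuple[int, int]:
    """Return (item_count, time_limit), applying defaults for omitted attributes."""
    item_count = int(attrs.get("item-count", DEFAULT_ITEM_COUNT))
    time_limit = int(attrs.get("time-limit", DEFAULT_TIME_LIMIT))
    return item_count, time_limit
```

Calling `resolve_chunk_settings({"time-limit": "30"})` returns `(10, 30)`, matching the "every 10 items or 30 seconds" reading above.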