On 12/11/12 11:37 AM, Scott Ferguson wrote:
OK, at least I understand it, which is progress!
On 12/11/12 10:25 AM, Danny Coward wrote:
OK, I think I understand. So the idea is to allow
implementations to send messages in batches, to get a big
performance gain for applications that send many messages in
a short amount of time, and to give developers an explicit
way to take advantage of that when the batching optimization
is in the implementation.
Yes I agree that requiring flush is not a good solution.
And I think with the flush() method, we would have allowed
containers that choose to do batching under the existing model,
without the extra setBatching()/setAutoFlush() idea?
Only if we always require a flush. We could do that. That's the
equivalent of auto-flush=false always, and since it's how
BufferedOutputStream works, it's an existing programming model.
If a developer forgets a flush, the message might never get sent.
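For comparison, the BufferedOutputStream model looks like this (a
minimal sketch using plain java.io, nothing WebSocket-specific):

    import java.io.BufferedOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;
    import java.net.Socket;
    import java.nio.charset.StandardCharsets;

    public class FlushExample {
        // Writes go into the buffer; nothing is guaranteed to reach
        // the wire until flush() is called.
        static void send(Socket socket) throws IOException {
            OutputStream raw = socket.getOutputStream();
            BufferedOutputStream out = new BufferedOutputStream(raw);
            out.write("message one\n".getBytes(StandardCharsets.UTF_8));
            out.write("message two\n".getBytes(StandardCharsets.UTF_8));
            out.flush(); // forget this and the bytes can sit in the buffer indefinitely
        }
    }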
I'm a bit wary of that definition, because some implementations
won't bother with buffering, and lazy programmers will forget the
flush but their code will work anyway on those implementations,
and the spec will eventually revert to auto-flush after the lazy
programmers complain about messages that never get sent.
Right, but our APIs are only expressed in terms of the data
objects, not the channels. Does it really lock you into using
AsynchronousChannel with its one-write-at-a-time rule?
I think that sort of approach already fits under the async
model we have: the async send operations allow implementations
to make their own choice about when to send the message after
the async send has been called. i.e.
sendString/sendBytes - send the message now (no batching)
sendStringByFuture() - send the message when the container
decides to (possibly batching if it chooses to)
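A sketch of the distinction, using the draft method names from this
thread (the RemoteEndpoint interface below is a stand-in so the example
compiles; the real draft signatures may differ):

    import java.io.IOException;
    import java.util.concurrent.Future;

    public class SendStyles {
        // Stand-in for the draft RemoteEndpoint discussed here.
        interface RemoteEndpoint {
            void sendString(String text) throws IOException; // synchronous: send now
            Future<Void> sendStringByFuture(String text);    // async: container decides when
        }

        static void demo(RemoteEndpoint remote) throws Exception {
            remote.sendString("now");                            // no batching: goes out immediately
            Future<Void> f = remote.sendStringByFuture("later"); // container may defer or batch
            f.get();                                             // completes once the container has sent it
        }
    }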
That doesn't work, but the reason is a bit complicated (see
below). (Secondarily, the *ByFuture is a high-overhead API, which
doesn't work well for high-performance messaging.)
It doesn't work because the *ByFuture and *ByCompletion are
single-item queues. You can't batch or queue more than one item
with those APIs. If you look at
java.nio.channels.AsynchronousChannel: "implementations may
support concurrent reading and writing, but may not allow more
than one read and one write operation to be outstanding at any
given time."
Since it's only a single-item queue, the implementation can't
batch the items -- there's only one item.
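Concretely, with AsynchronousSocketChannel each write has to be chained
from the previous write's completion (a sketch; the queue-draining
structure here is illustrative, not from the spec):

    import java.nio.ByteBuffer;
    import java.nio.channels.AsynchronousSocketChannel;
    import java.nio.channels.CompletionHandler;
    import java.util.Queue;

    public class OneWriteAtATime {
        // Starting a second write while one is outstanding throws
        // WritePendingException, so each write must be chained from the
        // previous completion -- a one-item pipeline, not a multi-item batch.
        static void writeAll(AsynchronousSocketChannel ch, Queue<ByteBuffer> pending) {
            ByteBuffer next = pending.poll();
            if (next == null) {
                return;
            }
            ch.write(next, null, new CompletionHandler<Integer, Void>() {
                public void completed(Integer written, Void ignored) {
                    if (next.hasRemaining()) {
                        ch.write(next, null, this); // finish this buffer first
                    } else {
                        writeAll(ch, pending);      // only now may the next write start
                    }
                }
                public void failed(Throwable exc, Void ignored) {
                    exc.printStackTrace();
                }
            });
        }
    }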
I'm probably being really stupid, but can't an implementation use a
BlockingQueue under our APIs, and determine for itself, based on
knowledge of its own implementation environment, when to send a batch
of messages?
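For instance, a container could do something like this internally (an
implementation-side sketch; BatchingSender and writeBatchToSocket are
made-up names, not spec API):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    public class BatchingSender {
        private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

        // Called from the application's send methods; returns immediately.
        public void enqueue(String message) throws InterruptedException {
            queue.put(message);
        }

        // Sender thread: take one message, then drain whatever else has
        // piled up and write the whole batch in one go.
        public void runSenderLoop() throws InterruptedException {
            List<String> batch = new ArrayList<>();
            while (true) {
                batch.add(queue.take());   // block until at least one message
                queue.drainTo(batch);      // grab any backlog without blocking
                writeBatchToSocket(batch); // hypothetical: one socket write per batch
                batch.clear();
            }
        }

        private void writeBatchToSocket(List<String> batch) {
            // hypothetical network write, omitted
        }
    }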
And it's a single-item queue because multiple-item queues require
more API methods, like in BlockingQueue, and a longer spec
definition to describe the queue behavior, e.g. what happens when
the queue is full or even what "full" means.
It's a bit tricky imposing a different development model on top of
what we have, especially because I'll bet there will be some
implementations that will not support batching. I have some ideas on
a subtype of RemoteEndpoint which might separate out the batching
model better than the flags and the flush(), but let's see.
I'm flagging this in the spec for v10 because the spec has not
resolved this yet.
On 11/29/12 12:11 PM, Scott Ferguson wrote:
On 11/29/12 11:34 AM, Danny Coward wrote:
My apologies Scott, I must have
missed your original request - I've logged this as an issue.
So auto-flush true would require that the implementation never
keep anything in a send buffer, and false would allow it?
Not quite. It's more like auto-flush false means "I'm batching
messages; don't bother sending if you don't have to." I don't
think the wording should be "never", because of things like
mux, or other server heuristics. It's more like "start the
process of sending."
setBatching(true) might be a better name, if that's clearer.
When setBatching(false) [autoFlush=true] -- the default --
and an app calls sendString(), the message will be delivered
(with possible buffering, delays, mux, optimizations, etc.,
depending on the implementation), but it will be delivered
without further intervention from the app.
When setBatching(true) [autoFlush=false], and an app calls
sendString(), the message might sit in the buffer forever
until the application calls flush().
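In application code that would look something like this (a sketch;
setBatching()/flush() are the proposed names under discussion, and the
RemoteEndpoint interface below is just a stand-in so the example
compiles):

    import java.io.IOException;
    import java.util.List;

    public class BatchBurst {
        // Proposed-API sketch; not settled spec API.
        interface RemoteEndpoint {
            void setBatching(boolean batching);
            void sendString(String text) throws IOException;
            void flush() throws IOException;
        }

        static void sendBurst(RemoteEndpoint remote, List<String> messages) throws IOException {
            remote.setBatching(true);  // hint: don't bother sending until flush()
            for (String m : messages) {
                remote.sendString(m);  // might sit in the buffer until flush()
            }
            remote.flush();            // now everything must go out
            remote.setBatching(false); // back to the default auto-flush behavior
        }
    }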
sendPartialString would be unaffected by the flag; the WS
implementation is free to do whatever it wants with partial
messages.
Basically, it's a hint: setBatching(true) [autoFlush=false]
means "I'm batching a bunch of messages, so don't bother
sending the data if you don't need to until I call flush."
Does that make sense? I don't want to over-constrain
implementations with either option. Maybe
"batching" is the better name, to avoid confusion. (But even
batching=true doesn't require buffering. Implementations can
still send fragments early if they want, or even ignore the
hint entirely.)
It seems like a reasonable request - do you think the
autoflush property is a per-peer setting / per logical
endpoint / per container setting? I'm wondering if
typically developers will want to set this once per
application rather than keep setting it per peer.
I think it's on the RemoteEndpoint, like setAutoCommit for
JDBC. It's easy to set in @WebSocketOpen, and the application
might want to start and stop batching mode while processing.
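For example (a sketch with stand-in types, since the draft annotation
and interface names may still change):

    import java.lang.annotation.ElementType;
    import java.lang.annotation.Retention;
    import java.lang.annotation.RetentionPolicy;
    import java.lang.annotation.Target;

    public class BatchedEndpoint {
        // Stand-ins for the draft API so the sketch compiles on its own;
        // the real @WebSocketOpen / Session / RemoteEndpoint come from the spec.
        @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD)
        @interface WebSocketOpen {}
        interface RemoteEndpoint { void setBatching(boolean batching); }
        interface Session { RemoteEndpoint getRemote(); }

        @WebSocketOpen
        public void onOpen(Session session) {
            // Like JDBC's setAutoCommit: choose the mode once when the peer
            // connects; the app can still toggle it later while processing.
            session.getRemote().setBatching(true);
        }
    }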
On 11/28/12 3:28 PM, Scott Ferguson wrote:
I'd like a setAutoFlush() and flush() on RemoteEndpoint
for high-performance messaging. Defaults to true, which is
the current behavior.
The performance difference is on the order of 5-7 times as
many messages in some early micro-benchmarks. It's a big
improvement and puts us near the high-speed messaging systems.