When this bug manifests, the end user incorrectly receives an additional
CHARACTERS event containing the end markup of the CDATA section itself
(i.e. "]]>").
This bug appears to occur only when a specific size of CDATA content is used in
combination with InputStream.read(buff, offset, length) returning fewer bytes
than 'length'.
A test case has been attached with the specific cdataSize that causes a problem.
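For illustration only, the short-read condition described above can be
reproduced with a small wrapper stream. This is a minimal sketch, not the
attached test case: the class name ShortReadInputStream and the maxChunk
parameter are hypothetical, and the sketch caps every read at a fixed size
rather than splitting at one specific offset.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/**
 * Hypothetical helper (not part of the attached test case): wraps another
 * stream and never returns more than maxChunk bytes from a single read call,
 * so callers see read(buf, off, len) return fewer bytes than 'len'.
 */
class ShortReadInputStream extends FilterInputStream {
    private final int maxChunk;

    ShortReadInputStream(InputStream in, int maxChunk) {
        super(in);
        if (maxChunk <= 0) {
            throw new IllegalArgumentException("maxChunk must be positive");
        }
        this.maxChunk = maxChunk;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        // Ask the underlying stream for at most maxChunk bytes. This is legal
        // under the InputStream contract, which only promises "up to len" bytes.
        return super.read(buf, off, Math.min(len, maxChunk));
    }
}
```

Wrapping the document in such a stream, for example
new ShortReadInputStream(new ByteArrayInputStream(xmlBytes), 5), forces the
parser to refill its buffer in small steps, which is the kind of input that
should expose a boundary problem if one exists.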
It is likely that an internal buffer limit is being reached just at the point
where the input stream chunks the CDATA section. Tracing through the code, we
can see that XMLNSDocumentScannerImpl.scanCDATASection() incorrectly returns
true (to signify that CDATA parsing is complete) when there are still some
remaining CDATA characters. As a result, the parser treats the next event as
ordinary content and returns the remainder of the CDATA section (including
the markup) as a CHARACTERS event.
The test case attached has a ReplayInputStream that mimics an InputStream
chunking the data before the closing tags of a CDATA section for a given
cdataSize. An IllegalStateException is thrown to indicate invalid XML.
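The kind of check such a test performs can be sketched as follows. This is an
assumption about the general approach, not the attached code: the class
CdataLeakCheck and the method leaksCdataEnd are hypothetical names, and the
sketch assumes the input contains no escaped "]]&gt;" in ordinary text, since
that would legitimately appear as "]]>" in a CHARACTERS event. It also does
not catch end markup that is split across two separate text events.

```java
import java.io.InputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical check, not the attached test case. */
class CdataLeakCheck {

    /**
     * Reads the whole document with StAX and reports whether any text event
     * contains the literal CDATA end markup "]]>". A correct parser never
     * reports that markup as text for well-formed input that has no
     * escaped "]]&gt;" sequences.
     */
    static boolean leaksCdataEnd(InputStream in) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(in);
            try {
                while (reader.hasNext()) {
                    int event = reader.next();
                    boolean isText = event == XMLStreamConstants.CHARACTERS
                            || event == XMLStreamConstants.CDATA;
                    if (isText && reader.getText().contains("]]>")) {
                        return true;
                    }
                }
                return false;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            // Mirrors the report's use of IllegalStateException for invalid XML.
            throw new IllegalStateException("invalid XML", e);
        }
    }
}
```

Feeding this method a stream that chunks the data just before the closing
markup, for the problematic cdataSize, is expected to return true on an
affected parser and false on an unaffected one.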
The test does not fail when used with BEA's MXParser or the SJSXP implementation
bundled with Sun 1.6.0_13, but it does fail with SJSXP 1.0.1.