[JAXP-74] Data corruption in SAXParser, chars outside XML passed to DefaultHandler.characters()
Created: 28/Nov/12
Updated: 17/Dec/12

Status: Open
Project: jaxp
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major
Reporter: Christian d'Heureuse
Assignee: Joe Wang
Resolution: Unresolved
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: SaxParserError.java (Java Source File), SaxParserError.xml (XML File)
Tags: SAXParser, Xerces


In 2008, I isolated and documented a serious bug in SAXParser. The bug is easy to reproduce on multiple platforms and leads to data corruption, e.g. when importing a database dump from an XML file.


This bug seems to be fixed in newer Xerces versions, but JDK 7 still includes Xerces 2.7.1, which dates from 2005.
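
For readers without the attachments, the following is a minimal sketch of how this kind of corruption could be detected: collect everything passed to DefaultHandler.characters() and compare it with the text known to be in the document. It is not the attached SaxParserError.java. The class name CharactersCheck, the helper collectText, and the inline sample document are assumptions made for illustration. This small document is not expected to trigger the reported problem; the attached SaxParserError.xml would be needed for that.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the attachments.
public class CharactersCheck {

    // Parses the given XML text and returns the concatenation of all
    // character data delivered through DefaultHandler.characters().
    static String collectText(String xml) throws Exception {
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // Only the range [start, start + length) belongs to this event.
                text.append(ch, start, length);
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        parser.parse(new ByteArrayInputStream(bytes), handler);
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Illustrative XML 1.1 document with known text content.
        String xml = "<?xml version=\"1.1\" encoding=\"UTF-8\"?>"
                   + "<root><item>abc</item><item>def</item></root>";
        String expected = "abcdef";
        String actual = collectText(xml);
        System.out.println(actual.equals(expected)
                ? "text matches"
                : "MISMATCH, parser delivered: " + actual);
    }
}
```

With a real data file, the expected text would have to come from an independent source (for example the original database), since the parser under test cannot be used to verify itself.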

Comment by Joe Wang [ 17/Dec/12 ]

See the JAXP release notes for 1.4.4 through 1.4.6: JDK 7 has been partially updated to Xerces 2.10. Since the update is not complete, the version number has not been changed. I wish we could have done a complete update, but we were constrained by resources.

As for the issue you reported, do you happen to know a particular patch in the newer Xerces that would fix your problem?

Comment by Christian d'Heureuse [ 17/Dec/12 ]

I don't know of a patch. The error does not occur with any of the Apache Xerces versions I tested, but it occurs with all the JDK versions I have tested. I guess the problem is in the JDK implementation.

I have tested with old binary Xerces JAR files from http://search.maven.org/#search%7Cgav%7C1%7Cg%3A%22xerces%22%20AND%20a%3A%22xercesImpl%22

The first Xerces version that supports XML 1.1 is 2.4.0. When I copy xercesImpl-2.4.0.jar into the lib/endorsed directory of the JRE, the error does not occur.
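
When swapping in a Xerces JAR like this, it helps to confirm which parser implementation JAXP actually loads. The sketch below prints the class name of the SAXParser returned by the factory. The class name WhichParser and the method parserClassName are hypothetical names chosen for illustration. As far as I know, the JDK's bundled parser lives in the com.sun.org.apache.xerces.internal package, while Apache Xerces uses org.apache.xerces.

```java
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

// Hypothetical helper class for checking the active JAXP implementation.
public class WhichParser {

    // Returns the fully qualified class name of the SAXParser that the
    // default SAXParserFactory hands out in the current runtime.
    static String parserClassName() throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        return parser.getClass().getName();
    }

    public static void main(String[] args) throws Exception {
        // A name starting with "org.apache.xerces" suggests the external JAR
        // is in use; "com.sun.org.apache.xerces.internal" suggests the
        // JDK-bundled parser.
        System.out.println(parserClassName());
    }
}
```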

Also, when I change the XML version in the first line of the data file from "1.1" to "1.0", the error does not occur.
