I have an XML file that I'm trying to parse, and it is UTF-16 encoded. I'd like to convert it to UTF-8 so that I can load it into a Document.
When I open the file in vi, I see something like <^@t^@a^@g^@>^@
This is the code I thought would work:
InputStream in = _context.openFileInput(_fileName);
InputSource is = new InputSource(new InputStreamReader(in, "UTF-8"));
doc = builder.parse(is);
This isn't working correctly: the unrecognized characters are still there after the stream is read into a string.
Also, the error I get when trying to parse the document is:
org.xml.sax.SAXParseException: name expected (position:START_TAG <null>@1:1 in [email protected])
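The charset you pass to InputStreamReader must be the encoding the file is actually stored in, not the one you want to end up with. Once the bytes have been decoded, the original encoding no longer matters: the text is held as ordinary Java strings, and an encoding only comes into play again when you write the text back out (or convert it to bytes). So something like this should work: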
InputSource is = new InputSource(new InputStreamReader(in, "UTF-16"));
You don't need to specify any other encoding until you save the data.
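For a fuller picture, here is a minimal, self-contained sketch of the same idea using the standard JAXP classes. The class name Utf16XmlDemo and the helpers parseUtf16 and toUtf8Bytes are made up for illustration, and the in-memory byte array stands in for the file returned by openFileInput. It decodes UTF-16 bytes into a Document and only names UTF-8 at the point where the document is written out again.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class Utf16XmlDemo {

    // Hypothetical helper: decode the stream as UTF-16 and parse the characters.
    static Document parseUtf16(InputStream in) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // The reader turns UTF-16 bytes into chars; the parser never sees raw bytes.
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_16)) {
            return builder.parse(new InputSource(reader));
        }
    }

    // Hypothetical helper: the target encoding is chosen only when serializing.
    static byte[] toUtf8Bytes(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the real file: a small document encoded as UTF-16 (with BOM).
        byte[] utf16Bytes = "<tag>hello</tag>".getBytes(StandardCharsets.UTF_16);

        Document doc = parseUtf16(new ByteArrayInputStream(utf16Bytes));
        System.out.println(doc.getDocumentElement().getTagName());

        byte[] utf8Bytes = toUtf8Bytes(doc);
        System.out.println(new String(utf8Bytes, StandardCharsets.UTF_8));
    }
}
```

One caveat: Java's "UTF-16" charset uses the byte order mark to pick the byte order and falls back to big-endian when there is none. If the file has no BOM and is little-endian (which the `<^@t^@` pattern in vi may suggest), "UTF-16LE" would be the charset to try instead.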