Are there any characters that are encoded in HTML but not XML, or vice versa?
Are all the encodings the same between them? Like &gt; for the greater-than symbol?
XML does predefine a handful of character entities. See section 4.6 of the XML 1.1 spec:
In particular, XML defines &lt;, &gt;, &amp;, &apos;, and &quot; ("All XML processors MUST recognize these entities whether they are declared or not"). Any other entity must be written as a numeric character reference, as Brian states, or declared with an <!ENTITY ...> construct in the document itself or in a referenced DTD.
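All of these entities are defined in HTML as well, with one exception: &apos; is not part of HTML 4. It was added in XHTML 1.0 and HTML5, so older HTML user agents may not recognize it.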
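To make the XML side concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree parser. The element name r and the choice of &nbsp; as the HTML-only example are just assumptions for illustration; other XML parsers may word the error differently.

```python
import xml.etree.ElementTree as ET

# The five predefined entities parse without any declaration.
predefined = ET.fromstring("<r>&lt;&gt;&amp;&apos;&quot;</r>").text

# An HTML-only name such as &nbsp; is undefined in plain XML.
try:
    ET.fromstring("<r>&nbsp;</r>")
    nbsp_error = None
except ET.ParseError as exc:
    nbsp_error = exc

# A numeric character reference needs no declaration.
numeric = ET.fromstring("<r>&#160;</r>").text

# Declaring the entity in an internal DTD subset makes the name usable.
declared = ET.fromstring(
    '<!DOCTYPE r [<!ENTITY nbsp "&#160;">]><r>&nbsp;</r>'
).text
```

In this sketch the undeclared &nbsp; is rejected, while the numeric reference and the declared entity both yield the same no-break space character.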
Yes. HTML4 defines a number of named entities which aren't present by default in XML. You can see the list on the w3.org website.
&gt; is one such encoded entity, and it is available in both HTML and XML. Likewise, &lt; is the named entity for <, but you can also write it as a numeric reference, like so: &#60;. As far as I know, you can use the numbered version freely in both HTML and XML. See the w3.org link for how to define your own entities in XML documents.
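As a rough illustration of the named versus numbered forms, the following sketch uses Python's standard html module. The helper to_numeric is a hypothetical name introduced here, and name2codepoint is Python's table of HTML 4 entity names, not something taken from the w3.org page itself.

```python
import html
from html.entities import name2codepoint

# Named, decimal, and hexadecimal forms all decode to the same character.
forms = [html.unescape(s) for s in ("&lt;", "&#60;", "&#x3C;")]

# HTML 4 names that are not among XML's five predefined entities.
xml_predefined = {"lt", "gt", "amp", "apos", "quot"}
html_only = sorted(set(name2codepoint) - xml_predefined)


def to_numeric(name: str) -> str:
    """Turn an HTML 4 entity name into a decimal reference usable in XML too."""
    return f"&#{name2codepoint[name]};"
```

For example, to_numeric("nbsp") gives a reference that needs no declaration in either language, which is a simple way to carry HTML-only names over into an XML document.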