Hive-XML-SerDe

XML Serializer/Deserializer for Apache Hive

For more information and usage examples, see https://github.com/dvasilen/Hive-XML-SerDe/wiki/XML-data-sources

See also

  1. https://github.com/dvasilen/Hive-XML-SerDe-VTD for a VTD-XML based processor. Used together with Hive XML SerDe, it can provide significant performance gains.

  2. An Empirical Study on XML Schema Idiosyncrasies in Big Data Processing

  3. Efficient Processing of XML Documents in Hadoop Map Reduce