XPack: A high-performance Web document encoding Conference Paper

abstract

  • XML is an increasingly popular data storage and exchange format whose popularity can be attributed to its self-describing syntax, acceptance as a data transmission and archival standard, strong internationalization support, and a plethora of supporting tools and technologies. However, XML's verbose, repetitive, text-oriented document specification syntax is a liability for many emerging applications such as mobile computing and distributed document dissemination. This paper presents XPack, an efficient XML document compression system that exploits information inherent in the document structure to enhance compression quality. Additionally, the utilization of XML structure features in XPack's design should provide valuable support for structure-aware queries over compressed documents. Taken together, the techniques employed in the XPack compression scheme provide a foundation for efficiently storing, transmitting, and operating over Web documents. Initial experimental results demonstrate that XPack can reduce the storage requirements for Web documents by up to 20% over previous XML compression techniques. More significantly, XPack can simultaneously support operations over the documents, providing up to two orders of magnitude performance improvement for certain document operations when compared to equivalent operations on unencoded XML documents.

author list (cited authors)

  • Rocco, D., Caverlee, J., & Liu, L.

editor list (cited editors)

  • Cordeiro, J., Pedrosa, V., Encarnação, B., & Filipe, J.

publication date

  • December 2005