A content-constrained spatial (CCS) model for layout analysis of mathematical expressions Conference Paper

abstract

  • © 2017 IEEE. This paper proposes a content-constrained spatial (CCS) model to recover the mathematical layout (M-layout, or MLme) of a mathematical expression (ME) from its font setting layout (F-layout, or FLme). The M-layout can be used for content analysis applications such as ME-based indexing and retrieval of documents. The first step of the two-step process divides a compound ME into blocks based on explicit mathematical structure primitives such as fraction lines, radical signs, and fences. Subscripts and superscripts within a block are then resolved by probabilistic inference of their likelihood based on a global optimization model. The features that capture the relative position between sibling blocks as super/subscripts have dual-peak distributions, which call for a sampling-based non-parametric probability distribution estimation method to resolve their ambiguity. The notion of spatial constraint indicators is proposed to reduce the search space while improving the prediction performance. The proposed scheme is tested on the InftyCDB data set and achieves an F1 score of 0.98.
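
As a rough illustration of why a dual-peak feature distribution suggests a non-parametric estimate, the sketch below classifies a block as superscript, subscript, or inline using a Gaussian kernel density estimate per class. It is not the authors' CCS model. The feature (a normalized vertical offset between sibling blocks), the sample values, the bandwidth, the equal class priors, and all function names are assumptions made for this example only.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function estimated from samples with a Gaussian kernel."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical training samples: vertical offset of a block relative to its
# left sibling, normalized by the sibling's height (positive = raised).
# The values are invented; the two script classes each have two peaks,
# which a single Gaussian per class would fit poorly.
TRAINING_OFFSETS = {
    "superscript": [0.35, 0.40, 0.42, 0.38, 0.70, 0.75, 0.72, 0.68],
    "subscript": [-0.30, -0.35, -0.32, -0.28, -0.60, -0.65, -0.62, -0.58],
    "inline": [-0.03, 0.00, 0.02, 0.04, -0.02, 0.01],
}

BANDWIDTH = 0.05  # assumed kernel width, not tuned on real data

# One class-conditional density per label, estimated directly from samples.
DENSITIES = {
    label: gaussian_kde(offsets, BANDWIDTH)
    for label, offsets in TRAINING_OFFSETS.items()
}


def classify(offset):
    """Pick the label with the highest class-conditional density (equal priors)."""
    return max(DENSITIES, key=lambda label: DENSITIES[label](offset))
```

Calling `classify(0.72)` evaluates all three densities at that offset and returns the label whose samples lie closest in the kernel-weighted sense. A full system along the lines of the abstract would combine such likelihoods across all blocks in a global optimization rather than deciding each block in isolation.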

name of conference

  • 2017 Twelfth International Conference on Digital Information Management (ICDIM)

published proceedings

  • 2017 Twelfth International Conference on Digital Information Management (ICDIM)

author list (cited authors)

  • Wang, X., & Liu, J.

citation count

  • 2

complete list of authors

  • Wang, Xing; Liu, Jyh-Charn

publication date

  • January 2017

publisher