Integrating geometrical and linguistic analysis for e-mail signature block parsing

Hao Chen, Jianying Hu, and Richard Sproat

The signature block is a common structured component of e-mail messages. Accurate identification and analysis of signature blocks are important in many multimedia messaging and information retrieval applications, such as e-mail text-to-speech rendering, automatic construction of personal address databases, and interactive message retrieval. The task is also very challenging, because signature blocks often appear in complex 2-dimensional layouts that are guided only by loose conventions. Traditional text analysis methods designed for sequential text cannot handle 2-dimensional structures, while the highly unconstrained nature of signature blocks makes 2-dimensional grammars very difficult to apply. In this paper we describe an algorithm for signature block analysis that combines 2-dimensional structural segmentation with 1-dimensional grammatical constraints. The information obtained from layout analysis and from linguistic analysis is integrated in the form of weighted finite-state transducers. The algorithm is currently implemented as a component of a preprocessing system for e-mail text-to-speech rendering.
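To make the combination of layout and linguistic cues concrete, the following is a minimal sketch and not the algorithm described in the paper. It assumes that 2-dimensional segmentation can be approximated by splitting the block at vertical gutters of whitespace, and it substitutes simple regular expressions for the weighted finite-state transducers. The names `split_columns`, `label_field`, `analyze`, `PATTERNS`, and `min_gap` are hypothetical and chosen only for illustration.

```python
import re

def split_columns(lines, min_gap=2):
    """Geometrical step (assumed): split a text block at vertical gutters,
    i.e. runs of at least min_gap character columns blank in every line."""
    width = max((len(l) for l in lines), default=0)
    padded = [l.ljust(width) for l in lines]
    blank = [all(p[c] == " " for p in padded) for c in range(width)]
    spans, start, c = [], None, 0
    while c < width:
        if blank[c]:
            run = c
            while run < width and blank[run]:
                run += 1
            # A wide enough blank run closes the current column.
            if run - c >= min_gap and start is not None:
                spans.append((start, c))
                start = None
            c = run
        else:
            if start is None:
                start = c
            c += 1
    if start is not None:
        spans.append((start, width))
    return [[p[a:b].strip() for p in padded] for a, b in spans]

# Linguistic step (assumed): illustrative regular expressions standing in
# for the grammatical constraints; they are not taken from the paper.
PATTERNS = [
    ("email", re.compile(r"[\w.+-]+@[\w-]+(\.[\w-]+)+")),
    ("url", re.compile(r"(https?://|www\.)\S+", re.I)),
    ("phone", re.compile(r"\+?\(?\d[\d\s().-]{6,}\d")),
]

def label_field(text):
    """Return the first matching field type, or 'other' if none matches."""
    for name, pattern in PATTERNS:
        if pattern.search(text):
            return name
    return "other"

def analyze(block):
    """Segment the block into columns, then label each non-empty field."""
    columns = split_columns(block.splitlines())
    return [[(label_field(f), f) for f in col if f] for col in columns]

# A two-column signature block built with ljust so the columns align.
sig = "\n".join([
    "Jane Doe".ljust(20) + "Phone: (555) 010-1234",
    "Example Corp.".ljust(20) + "jane.doe@example.com",
])
for column in analyze(sig):
    print(column)
```

Reading the same block strictly line by line would interleave the name with the phone number and the company with the e-mail address. Segmenting by layout first lets each column be analyzed as ordinary sequential text, which is the intuition behind pairing 2-dimensional segmentation with 1-dimensional constraints.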