Scale and Orientation Invariant Text Segmentation for Born-Digital Compound Images

Many recent applications require text segmentation for born-digital compound images. To this end, we propose a coarse-to-fine framework for segmenting text of arbitrary scales and orientations in such images. In the coarse stage, a local image activity measure is designed based on the variation distribution of characters to highlight the difference between textual and pictorial regions. This stage outputs a coarse textual layer that includes the textual regions as well as a few pictorial regions with high activity. In the fine stage, a textual connected component (TCC) based refinement is proposed to eliminate the surviving pictorial regions. In particular, a scale and orientation invariant grouping algorithm is proposed to adaptively generate TCCs with uniform statistical features. The minimum average distance and morphological operations are employed to assist the formation of candidate TCCs. Then, three string-level features (i.e., shapeness, color similarity, and mean activity level) are designed to distinguish the true TCCs from the false positives formed by connecting high-activity pictorial components. Extensive experiments show that the proposed framework can segment textual regions precisely from born-digital compound images, while preserving the integrity of text with varied scales and orientations and avoiding over-connection of textual regions.
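
As a rough illustration of the coarse stage only, the sketch below marks image blocks whose local activity exceeds a threshold. It is not the activity measure proposed here: the abstract does not give its definition, so the sketch substitutes a simple stand-in (the mean absolute difference between adjacent pixels in a block). The names `block_activity` and `coarse_text_mask`, the block size, and the threshold value are all assumptions made for illustration.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of intensities in [0, 255]


def block_activity(img: Image, top: int, left: int, size: int) -> float:
    """Stand-in activity measure (assumption): mean absolute difference
    between horizontally and vertically adjacent pixels inside one block."""
    bottom = min(top + size, len(img))
    right = min(left + size, len(img[0]))
    total, count = 0, 0
    for y in range(top, bottom):
        for x in range(left, right):
            if x + 1 < right:  # horizontal neighbour inside the block
                total += abs(img[y][x] - img[y][x + 1])
                count += 1
            if y + 1 < bottom:  # vertical neighbour inside the block
                total += abs(img[y][x] - img[y + 1][x])
                count += 1
    return total / count if count else 0.0


def coarse_text_mask(img: Image, size: int = 4,
                     threshold: float = 30.0) -> List[List[bool]]:
    """Return one boolean per size x size block: True if the block's
    activity exceeds the (hypothetical) threshold."""
    rows = range(0, len(img), size)
    cols = range(0, len(img[0]), size)
    return [[block_activity(img, t, l, size) > threshold for l in cols]
            for t in rows]


# Toy input: a flat 4x4 block next to a 4x4 checkerboard block.
img = [[128] * 4 + [255 * ((x + y) % 2) for x in range(4)] for y in range(4)]
mask = coarse_text_mask(img)
```

A threshold of this kind cannot tell sharp-edged text from a highly textured picture region, since both produce high activity. That limitation is consistent with the coarse layer retaining a few high-activity pictorial regions, which the TCC-based fine stage is then meant to remove.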