IMAGE SET-I

These document images show the effect of different types of degradation. Some of them (image00 to image00d) are computer-generated documents. This set contains the images for which a ground truth image is available, so our method can be evaluated with ground-truth-based evaluation measures. F-measure, PSNR and NRM are established evaluation measures (see footnote 1). A higher-quality binarized image has a higher F-measure and PSNR but a lower NRM.
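As a rough illustration of why higher F-measure and PSNR but lower NRM indicate a better result, the sketch below computes the three measures using the definitions commonly given for DIBCO 2009. It is not the evaluation code used for this set. The function names, the nested-list image encoding (1 = black text pixel, 0 = background), the F-measure expressed as a fraction rather than a percentage, and the PSNR peak value of 1 are all assumptions of this sketch.

```python
import math

def confusion_counts(ground_truth, binarized):
    """Count TP, FP, FN, TN over two same-sized 0/1 images (1 = black pixel)."""
    tp = fp = fn = tn = 0
    for gt_row, bi_row in zip(ground_truth, binarized):
        for gt, bi in zip(gt_row, bi_row):
            if bi and gt:
                tp += 1
            elif bi and not gt:
                fp += 1
            elif gt and not bi:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn

def f_measure(tp, fp, fn):
    """Harmonic mean of recall and precision, as a fraction in [0, 1]."""
    if tp == 0:
        return 0.0  # sketch's choice: no overlap scores zero
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    return 2 * recall * precision / (recall + precision)

def nrm(tp, fp, fn, tn):
    """Negative Rate Metric: mean of the false negative and false positive rates.

    Assumes the ground truth contains both black and white pixels.
    """
    nr_fn = fn / (fn + tp)
    nr_fp = fp / (fp + tn)
    return (nr_fn + nr_fp) / 2

def psnr(fp, fn, total_pixels):
    """PSNR for 0/1 images: each mismatched pixel contributes 1 to the squared error."""
    mse = (fp + fn) / total_pixels
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(1 / mse)

# Hypothetical 2x2 example: one of the two text pixels is recovered, one background pixel is noise.
gt = [[1, 1], [0, 0]]
bi = [[1, 0], [1, 0]]
tp, fp, fn, tn = confusion_counts(gt, bi)
scores = (f_measure(tp, fp, fn), nrm(tp, fp, fn, tn), psnr(fp, fn, 4))
```

Fewer mismatched pixels raise F-measure and PSNR and lower NRM, which matches the direction stated above.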

Information to Noise Difference (IND)

We have designed a method to test the quality of the binarized image based on information and noise.
IND = Ivalue - Nvalue, where Ivalue = TP/NBGT and Nvalue = FP/NBBI
TP is the number of true positives, FP is the number of false positives, and NBGT and NBBI are the number of black pixels in the ground truth and in the binarized image respectively. Here Ivalue signifies the information preserved in the binarized image and Nvalue represents the noise in the binarized image. The value of IND ranges from -1 to +1, where +1 means the binarized image is an exact copy of the ground truth, while -1 signifies that the binarized image is the inverse of the ground truth.
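The IND formula can be sketched in a few lines of Python. This is a minimal illustration of the definition above, not our actual implementation. The function name ind_score, the nested-list image encoding (1 = black pixel, 0 = white pixel) and the error raised when either image has no black pixels are assumptions of this sketch.

```python
def ind_score(ground_truth, binarized):
    """Information to Noise Difference for two same-sized 0/1 images.

    IND = Ivalue - Nvalue, with Ivalue = TP/NBGT and Nvalue = FP/NBBI.
    """
    tp = fp = nb_gt = nb_bi = 0
    for gt_row, bi_row in zip(ground_truth, binarized):
        for gt, bi in zip(gt_row, bi_row):
            nb_gt += gt          # black pixels in the ground truth
            nb_bi += bi          # black pixels in the binarized image
            if bi and gt:
                tp += 1          # black in both: information preserved
            elif bi and not gt:
                fp += 1          # black only in the binarized image: noise
    if nb_gt == 0 or nb_bi == 0:
        # The text does not define this case; raising is this sketch's choice.
        raise ValueError("IND is undefined when either image has no black pixels")
    i_value = tp / nb_gt
    n_value = fp / nb_bi
    return i_value - n_value

# Hypothetical 2x2 examples covering both ends of the range.
gt = [[1, 0], [0, 1]]
inverted = [[0, 1], [1, 0]]
half_right = [[1, 1], [0, 0]]
results = (ind_score(gt, gt), ind_score(gt, inverted), ind_score(gt, half_right))
```

Here the exact copy scores +1, the inverted image scores -1, and the image with one correct and one spurious black pixel scores 0 (Ivalue = 0.5, Nvalue = 0.5).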

1 DIBCO 2009, B. Gatos, K. Ntirogiannis and I. Pratikakis


IMAGES


1) set1/image00

Image showing the worst-case scenario for a binarization technique. It has many degradations, including variable background and shadows, non-uniform illumination, ink bleed-through and blur caused by humidity.

Image size (in pixels): 640×427


2) set1/image00a

A non-uniformly illuminated image with areas of variable contrast.

Image size (in pixels): 800×444


3) set1/image00b

Image with variable background and shadows. It also contains patches and paper fold marks.

Image size (in pixels): 800×444


4) set1/image00c

Sometimes ink leaks through to the other side of the paper. This is known as strike-through and occurs when the paper has insufficient absorption capacity for the density of the ink, usually because of light coated paper.

Image size (in pixels): 800×444


5) set1/image00d

Some paper swells in the presence of moisture, whether from water-based ink or from the uncoated paper itself, and tends to absorb the ink. The colorants then spread around the text areas, resulting in a blurred image.

Image size (in pixels): 800×444


6) set1/paper1a

Image size (in pixels): 1336×1320


7) set1/paper1b

Image size (in pixels): 1336×1320


8) set1/paper2

Image size (in pixels): 664×1596


9) set1/paper3a

Image size (in pixels): 1336×1712


10) set1/paper3b

Image size (in pixels): 1336×1712