Leptonica  1.83.1
Image processing and image analysis suite
baseline.c File Reference
#include <math.h>
#include "allheaders.h"


Functions

NUMA * pixFindBaselines (PIX *pixs, PTA **ppta, PIXA *pixadb)
 
PIX * pixDeskewLocal (PIX *pixs, l_int32 nslices, l_int32 redsweep, l_int32 redsearch, l_float32 sweeprange, l_float32 sweepdelta, l_float32 minbsdelta)
 
l_ok pixGetLocalSkewTransform (PIX *pixs, l_int32 nslices, l_int32 redsweep, l_int32 redsearch, l_float32 sweeprange, l_float32 sweepdelta, l_float32 minbsdelta, PTA **pptas, PTA **pptad)
 
NUMA * pixGetLocalSkewAngles (PIX *pixs, l_int32 nslices, l_int32 redsweep, l_int32 redsearch, l_float32 sweeprange, l_float32 sweepdelta, l_float32 minbsdelta, l_float32 *pa, l_float32 *pb, l_int32 debug)
 

Variables

static const l_int32 MinDistInPeak = 35
 
static const l_int32 PeakThresholdRatio = 20
 
static const l_int32 ZeroThresholdRatio = 100
 
static const l_int32 DefaultSlices = 10
 
static const l_int32 DefaultSweepReduction = 2
 
static const l_int32 DefaultBsReduction = 1
 
static const l_float32 DefaultSweepRange = 5.
 
static const l_float32 DefaultSweepDelta = 1.
 
static const l_float32 DefaultMinbsDelta = 0.01
 
static const l_float32 OverlapFraction = 0.5
 
static const l_float32 MinAllowedConfidence = 3.0
 

Detailed Description


     Locate text baselines in an image
          NUMA     *pixFindBaselines()

     Projective transform to remove local skew
          PIX      *pixDeskewLocal()

     Determine local skew
          l_ok      pixGetLocalSkewTransform()
          NUMA     *pixGetLocalSkewAngles()

 We have two apparently different functions here:
   ~ finding baselines
   ~ finding a projective transform to remove keystone warping
 The function pixGetLocalSkewAngles() returns an array of angles,
 one for each raster line, and the baselines of the text lines
 should intersect the left edge of the image with that angle.

Definition in file baseline.c.

Function Documentation

◆ pixDeskewLocal()

PIX* pixDeskewLocal ( PIX *  pixs,
l_int32  nslices,
l_int32  redsweep,
l_int32  redsearch,
l_float32  sweeprange,
l_float32  sweepdelta,
l_float32  minbsdelta 
)

pixDeskewLocal()

Parameters
[in]  pixs        1 bpp
[in]  nslices     the number of horizontal overlapping slices; must be larger than 1 and not exceed 20; use 0 for default
[in]  redsweep    sweep reduction factor: 1, 2, 4 or 8; use 0 for default value
[in]  redsearch   search reduction factor: 1, 2, 4 or 8, and not larger than redsweep; use 0 for default value
[in]  sweeprange  half the full range, assumed about 0; in degrees; use 0.0 for default value
[in]  sweepdelta  angle increment of sweep; in degrees; use 0.0 for default value
[in]  minbsdelta  min binary search increment angle; in degrees; use 0.0 for default value
Returns
pixd, or NULL on error
Notes:
     (1) This function allows deskew of a page whose skew changes
         approximately linearly with vertical position.  It uses
         a projective transform that in effect does a differential
         shear about the LHS of the page, and makes all text lines
         horizontal.
     (2) The origin of the keystoning can be either a cheap document
         feeder that rotates the page as it is passed through, or a
         camera image taken from either the left or right side
         of the vertical.
     (3) The image transformation is a projective warping,
         not a rotation.  Independently of this function, the text
         lines must be properly aligned vertically with respect to
         each other.  This can be done by pre-processing the page;
         e.g., by rotating or horizontally shearing it.
         Typically, this can be achieved by vertically aligning
         the page edge.
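
The parameter rules above (0 selects the default, plus the documented ranges) can be sketched as a small stand-alone helper. This is only an illustration, not the library's code: the struct SkewParams and the function resolve_skew_params() are hypothetical names, the default values are copied from the Variables section of this page, and the pairing of DefaultBsReduction with redsearch is an assumption based on the name. The library itself may treat out-of-range values differently.

```c
/* Defaults copied from the Variables section of this page. */
static const int   DefaultSlices = 10;
static const int   DefaultSweepReduction = 2;
static const int   DefaultBsReduction = 1;   /* assumed to apply to redsearch */
static const float DefaultSweepRange = 5.0f;
static const float DefaultSweepDelta = 1.0f;
static const float DefaultMinbsDelta = 0.01f;

/* Hypothetical container for the six tuning parameters. */
typedef struct {
    int   nslices, redsweep, redsearch;
    float sweeprange, sweepdelta, minbsdelta;
} SkewParams;

/* Returns 1 if r is one of the documented reduction factors. */
static int is_reduction(int r)
{
    return r == 1 || r == 2 || r == 4 || r == 8;
}

/* Replace zeros by the defaults, then check the documented constraints.
 * Returns 0 if the resolved values are acceptable, 1 otherwise. */
int resolve_skew_params(SkewParams *p)
{
    if (p->nslices == 0)       p->nslices = DefaultSlices;
    if (p->redsweep == 0)      p->redsweep = DefaultSweepReduction;
    if (p->redsearch == 0)     p->redsearch = DefaultBsReduction;
    if (p->sweeprange == 0.0f) p->sweeprange = DefaultSweepRange;
    if (p->sweepdelta == 0.0f) p->sweepdelta = DefaultSweepDelta;
    if (p->minbsdelta == 0.0f) p->minbsdelta = DefaultMinbsDelta;

    if (p->nslices < 2 || p->nslices > 20) return 1;
    if (!is_reduction(p->redsweep) || !is_reduction(p->redsearch)) return 1;
    if (p->redsearch > p->redsweep) return 1;
    return 0;
}
```

Passing a struct of all zeros yields nslices = 10, redsweep = 2, redsearch = 1, and the default sweep angles. The same six values are accepted by pixGetLocalSkewTransform() and pixGetLocalSkewAngles().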

Definition at line 322 of file baseline.c.

◆ pixFindBaselines()

NUMA* pixFindBaselines ( PIX *  pixs,
PTA **  ppta,
PIXA *  pixadb 
)

pixFindBaselines()

Parameters
[in]   pixs    1 bpp, 300 ppi
[out]  ppta    [optional] pairs of pts corresponding to approx. ends of each text line
[in]   pixadb  for debug output; use NULL to skip
Returns
na of baseline y values, or NULL on error
Notes:
     (1) Input binary image must have text lines already aligned
         horizontally.  This can be done by either rotating the
         image with pixDeskew(), or, if a projective transform
         is required, by doing pixDeskewLocal() first.
     (2) Pass NULL for ppta if you don't want the end points returned.
         The pta will come in pairs of points (left and right end
         of each baseline).
     (3) Caution: this will not work properly on text with multiple
         columns, where the lines are not aligned between columns.
         If there are multiple columns, they should be extracted
         separately before finding the baselines.
     (4) This function constructs different types of output
         for baselines; namely, a set of raster line values and
         a set of end points of each baseline.
     (5) This function was designed to handle short and long text lines
         without using dangerous thresholds on the peak heights.  It does
         this by combining the differential signal with a morphological
         analysis of the locations of the text lines.  One can also
         combine this data to normalize the peak heights, by weighting
         the differential signal in the region of each baseline
         by the inverse of the width of the text line found there.
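
As a rough illustration of the "differential signal" mentioned in note (5), the sketch below takes the number of foreground pixels in each raster line and looks for sharp drops from one line to the next, which is where the bottom of a text line sits. It is a simplified stand-alone example, not the library's algorithm: the function find_baseline_rows() is hypothetical, the definition of the differential signal as a difference of adjacent row counts is an assumption, and the morphological analysis and the other thresholds listed under Variables are left out. Only PeakThresholdRatio is taken from this page.

```c
#include <stddef.h>

static const int PeakThresholdRatio = 20;  /* from the Variables section */

/* counts[i] is the number of foreground pixels in raster line i.
 * Writes up to maxout candidate baseline rows into out and returns
 * the number written. */
size_t find_baseline_rows(const int *counts, size_t h,
                          size_t *out, size_t maxout)
{
    size_t i, nfound = 0, besti = 0;
    int maxdiff = 0, thresh, best = 0, inpeak = 0;

    if (h < 2) return 0;

    /* Differential signal: drop in pixel count from line i to line i + 1. */
    for (i = 0; i + 1 < h; i++) {
        int d = counts[i] - counts[i + 1];
        if (d > maxdiff) maxdiff = d;
    }
    if (maxdiff <= 0) return 0;

    /* Accept peaks that reach a fixed fraction of the largest drop. */
    thresh = maxdiff / PeakThresholdRatio;
    if (thresh < 1) thresh = 1;

    /* Within each run of lines above threshold, keep the largest drop. */
    for (i = 0; i + 1 < h; i++) {
        int d = counts[i] - counts[i + 1];
        if (d >= thresh) {
            if (!inpeak || d > best) { best = d; besti = i; }
            inpeak = 1;
        } else if (inpeak) {
            if (nfound < maxout) out[nfound++] = besti;
            inpeak = 0;
        }
    }
    if (inpeak && nfound < maxout) out[nfound++] = besti;
    return nfound;
}
```

Because the threshold is relative to the largest drop, short text lines with small peaks can still be detected, which is the motivation given in note (5) for avoiding fixed thresholds on peak height.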

Definition at line 117 of file baseline.c.

◆ pixGetLocalSkewAngles()

NUMA* pixGetLocalSkewAngles ( PIX *  pixs,
l_int32  nslices,
l_int32  redsweep,
l_int32  redsearch,
l_float32  sweeprange,
l_float32  sweepdelta,
l_float32  minbsdelta,
l_float32 *  pa,
l_float32 *  pb,
l_int32  debug 
)

pixGetLocalSkewAngles()

Parameters
[in]   pixs        1 bpp
[in]   nslices     the number of horizontal overlapping slices; must be larger than 1 and not exceed 20; use 0 for default
[in]   redsweep    sweep reduction factor: 1, 2, 4 or 8; use 0 for default value
[in]   redsearch   search reduction factor: 1, 2, 4 or 8, and not larger than redsweep; use 0 for default value
[in]   sweeprange  half the full range, assumed about 0; in degrees; use 0.0 for default value
[in]   sweepdelta  angle increment of sweep; in degrees; use 0.0 for default value
[in]   minbsdelta  min binary search increment angle; in degrees; use 0.0 for default value
[out]  pa          [optional] slope of skew as a function of y
[out]  pb          [optional] intercept at y = 0 of skew as a function of y
[in]   debug       1 for generating plot of skew angle vs. y; 0 otherwise
Returns
naskew, or NULL on error
  Notes:
       (1) The local skew is measured in a set of overlapping strips.
           We then do a least squares linear fit to get
           the slope and intercept parameters a and b in
               skew-angle = a * y + b  (degrees)
           for the local skew as a function of raster line y.
           This is then used to make naskew, which can be interpreted
           as the computed skew angle (in degrees) at the left edge
           of each raster line.
       (2) naskew can then be used to find the baselines of text, because
           each text line has a baseline that should intersect
           the left edge of the image with the angle given by this
           array, evaluated at the raster line of intersection.
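
The fit described in note (1) can be sketched in plain C. The example below is a minimal illustration under stated assumptions, not the library's implementation: fit_skew_line() and eval_skew() are hypothetical helpers, the inputs are taken to be one (y, angle) sample per strip, with y the vertical center of the strip, and plain float arrays stand in for the NUMA type.

```c
#include <stddef.h>

/* Least squares fit of angle = a * y + b to n (y, angle) samples.
 * Returns 0 on success, 1 if the fit is undetermined. */
int fit_skew_line(const float *y, const float *angle, size_t n,
                  float *pa, float *pb)
{
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0, det;
    size_t i;

    if (n < 2) return 1;
    for (i = 0; i < n; i++) {
        sx  += y[i];
        sy  += angle[i];
        sxx += (double)y[i] * y[i];
        sxy += (double)y[i] * angle[i];
    }
    det = (double)n * sxx - sx * sx;
    if (det == 0.0) return 1;          /* all samples at the same y */

    *pa = (float)(((double)n * sxy - sx * sy) / det);   /* slope */
    *pb = (float)((sxx * sy - sx * sxy) / det);         /* intercept */
    return 0;
}

/* Fill skew[0..h-1] with the fitted angle (degrees) for each raster line. */
void eval_skew(float a, float b, float *skew, size_t h)
{
    size_t i;
    for (i = 0; i < h; i++)
        skew[i] = a * (float)i + b;
}
```

For example, strips centered at y = 50, 150 and 250 with measured skews of 0.5, 1.5 and 2.5 degrees give a = 0.01 and b = 0, so the fitted skew at raster line 100 is 1.0 degree.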
  

Definition at line 508 of file baseline.c.

◆ pixGetLocalSkewTransform()

l_ok pixGetLocalSkewTransform ( PIX *  pixs,
l_int32  nslices,
l_int32  redsweep,
l_int32  redsearch,
l_float32  sweeprange,
l_float32  sweepdelta,
l_float32  minbsdelta,
PTA **  pptas,
PTA **  pptad 
)

pixGetLocalSkewTransform()

Parameters
[in]   pixs
[in]   nslices     the number of horizontal overlapping slices; must be larger than 1 and not exceed 20; use 0 for default
[in]   redsweep    sweep reduction factor: 1, 2, 4 or 8; use 0 for default value
[in]   redsearch   search reduction factor: 1, 2, 4 or 8, and not larger than redsweep; use 0 for default value
[in]   sweeprange  half the full range, assumed about 0; in degrees; use 0.0 for default value
[in]   sweepdelta  angle increment of sweep; in degrees; use 0.0 for default value
[in]   minbsdelta  min binary search increment angle; in degrees; use 0.0 for default value
[out]  pptas       4 points in the source
[out]  pptad       the corresponding 4 pts in the dest
Returns
0 if OK, 1 on error
Notes:
     (1) This generates two pairs of points in the src, each pair
         corresponding to a pair of points that would lie along
         the same raster line in a transformed (dewarped) image.
     (2) The sets of 4 src and 4 dest points returned by this function
         can then be used, in a projective or bilinear transform,
         to remove keystoning in the src.

Definition at line 389 of file baseline.c.