Leptonica
1.83.1
Image processing and image analysis suite
Functions
static l_int32 | recogTemplatesAreOK (L_RECOG *recog, l_int32 minsize, l_float32 minfract, l_int32 *pok) |
static SARRAY * | recogAddMissingClassStrings (L_RECOG *recog) |
static l_int32 | recogCharsetAvailable (l_int32 type) |
static PIX * | pixDisplayOutliers (PIXA *pixas, NUMA *nas) |
static PIX * | recogDisplayOutlier (L_RECOG *recog, l_int32 iclass, l_int32 jsamp, l_int32 maxclass, l_float32 maxscore) |
l_ok | recogTrainLabeled (L_RECOG *recog, PIX *pixs, BOX *box, char *text, l_int32 debug) |
l_ok | recogProcessLabeled (L_RECOG *recog, PIX *pixs, BOX *box, char *text, PIX **ppix) |
l_ok | recogAddSample (L_RECOG *recog, PIX *pix, l_int32 debug) |
PIX * | recogModifyTemplate (L_RECOG *recog, PIX *pixs) |
l_int32 | recogAverageSamples (L_RECOG *recog, l_int32 debug) |
l_int32 | pixaAccumulateSamples (PIXA *pixa, PTA *pta, PIX **ppixd, l_float32 *px, l_float32 *py) |
l_ok | recogTrainingFinished (L_RECOG **precog, l_int32 modifyflag, l_int32 minsize, l_float32 minfract) |
PIXA * | recogFilterPixaBySize (PIXA *pixas, l_int32 setsize, l_int32 maxkeep, l_float32 max_ht_ratio, NUMA **pna) |
PIXAA * | recogSortPixaByClass (PIXA *pixa, l_int32 setsize) |
l_ok | recogRemoveOutliers1 (L_RECOG **precog, l_float32 minscore, l_int32 mintarget, l_int32 minsize, PIX **ppixsave, PIX **ppixrem) |
PIXA * | pixaRemoveOutliers1 (PIXA *pixas, l_float32 minscore, l_int32 mintarget, l_int32 minsize, PIX **ppixsave, PIX **ppixrem) |
l_ok | recogRemoveOutliers2 (L_RECOG **precog, l_float32 minscore, l_int32 minsize, PIX **ppixsave, PIX **ppixrem) |
PIXA * | pixaRemoveOutliers2 (PIXA *pixas, l_float32 minscore, l_int32 minsize, PIX **ppixsave, PIX **ppixrem) |
PIXA * | recogTrainFromBoot (L_RECOG *recogboot, PIXA *pixas, l_float32 minscore, l_int32 threshold, l_int32 debug) |
l_ok | recogPadDigitTrainingSet (L_RECOG **precog, l_int32 scaleh, l_int32 linew) |
l_int32 | recogIsPaddingNeeded (L_RECOG *recog, SARRAY **psa) |
PIXA * | recogAddDigitPadTemplates (L_RECOG *recog, SARRAY *sa) |
L_RECOG * | recogMakeBootDigitRecog (l_int32 nsamp, l_int32 scaleh, l_int32 linew, l_int32 maxyshift, l_int32 debug) |
PIXA * | recogMakeBootDigitTemplates (l_int32 nsamp, l_int32 debug) |
l_ok | recogShowContent (FILE *fp, L_RECOG *recog, l_int32 index, l_int32 display) |
l_ok | recogDebugAverages (L_RECOG *recog, l_int32 debug) |
l_int32 | recogShowAverageTemplates (L_RECOG *recog) |
l_ok | recogShowMatchesInRange (L_RECOG *recog, PIXA *pixa, l_float32 minscore, l_float32 maxscore, l_int32 display) |
PIX * | recogShowMatch (L_RECOG *recog, PIX *pix1, PIX *pix2, BOX *box, l_int32 index, l_float32 score) |
Detailed Description

Training on labeled data
    l_int32 recogTrainLabeled()
    l_ok recogProcessLabeled()
    l_int32 recogAddSample()
    PIX *recogModifyTemplate()
    l_int32 recogAverageSamples()
    l_int32 pixaAccumulateSamples()
    l_int32 recogTrainingFinished()
    static l_int32 recogTemplatesAreOK()
    PIXA *recogFilterPixaBySize()
    PIXAA *recogSortPixaByClass()
    l_int32 recogRemoveOutliers1()
    PIXA *pixaRemoveOutliers1()
    l_int32 recogRemoveOutliers2()
    PIXA *pixaRemoveOutliers2()

Training on unlabeled data
    PIXA *recogTrainFromBoot()

Padding the digit training set
    l_int32 recogPadDigitTrainingSet()
    l_int32 recogIsPaddingNeeded()
    static SARRAY *recogAddMissingClassStrings()
    PIXA *recogAddDigitPadTemplates()
    static l_int32 recogCharsetAvailable()

Making a boot digit recognizer
    L_RECOG *recogMakeBootDigitRecog()
    PIXA *recogMakeBootDigitTemplates()

Debugging
    l_int32 recogShowContent()
    l_int32 recogDebugAverages()
    l_int32 recogShowAverageTemplates()
    static PIX *pixDisplayOutliers()
    static PIX *recogDisplayOutlier()
    l_ok recogShowMatchesInRange()
    PIX *recogShowMatch()

These abbreviations are for the type of template to be used:
    * SI (for the scanned images)
    * WNL (for width-normalized lines, formed by first skeletonizing the scanned images, and then dilating to a fixed width)

These abbreviations are for the type of recognizer:
    * BAR (book-adapted recognizer; the best type; can do identification with unscaled images and separation of touching characters)
    * BSR (bootstrap recognizer; used if more labeled templates are required for a BAR, either for finding more templates from the book, or making a hybrid BAR/BSR)

The recog struct typically holds two versions of the input templates (e.g. from a pixa) that were used to generate it. One version is the unscaled input templates. The other version is the one that will be used by the recog to identify unlabeled data. That version depends on the input parameters when the recog is created.

The choices for the latter version, and their suggested use, are:
    (1) unscaled SI: typical for BAR, generated from book images
    (2) unscaled WNL: ditto
    (3) scaled SI: typical for recognizers containing template images from sources other than the book to be recognized
    (4) scaled WNL: ditto

For cases (3) and (4), we recommend scaling to fixed height; e.g., scalew = 0, scaleh = 40. When using WNL, we recommend using a width of 5 in the template and 4 in the unlabeled data. It appears that better results for a BAR are usually obtained using SI than WNL, but more experimentation is needed.

This utility is designed to build recognizers that are specifically adapted from a large amount of material, such as a book. These use labeled templates taken from the material, and not scaled. In addition, two special recognizers are useful:
    (1) Bootstrap recognizer (BSR). This uses height-scaled templates that have been extended with several repetitions in one of two ways:
        (a) anisotropic width scaling (for either SI or WNL)
        (b) iterative erosions/dilations (for SI).
    (2) Outlier removal. This uses height-scaled templates. It can be implemented without using templates that are aligned averages of all templates in a class.

Recognizers are inexpensive to generate, for example, from a pixa of labeled templates. The general process of building a BAR is to start with labeled templates, e.g., in a pixa, make a BAR, and analyze new samples from the book to augment the BAR until it has enough samples for each character class. Along the way, samples from a BSR may be added for help in training. If not enough samples are available for the BAR, it can finally be augmented with BSR samples, in which case the resulting hybrid BAR/BSR recognizer must work on scaled images.

Here are the steps in doing recog training:
    A. Generate a BAR from any existing labeled templates
        (1) Create a recog and add the templates, using recogAddSample(). This stores the unscaled templates.
            [Note: this can be done in one step if the labeled templates are put into a pixa:
                 L_RECOG *rec = recogCreateFromPixa(pixa, ...); ]
        (2) Call recogTrainingFinished() to generate the (sometimes modified) templates to be used for correlation.
        (3) Optionally, remove outliers.
        If there are sufficient samples in the classes, we're done. Otherwise,
    B. Try to get more samples from the book to pad the BAR.
        (1) Save the unscaled, labeled templates from the BAR.
        (2) Supplement the BAR with bootstrap templates to make a hybrid BAR/BSR.
        (3) Do recognition on more unlabeled images, scaled to a fixed height.
        (4) Add the unscaled, labeled images to the saved set.
        (5) Optionally, remove outliers.
        If there are sufficient samples in the classes, we're done. Otherwise,
    C. For classes without a sufficient number of templates, we can supplement the BAR with templates from a BSR (a hybrid BAR/BSR), and do recognition scaled to a fixed height.

Here are several methods that can be used for identifying outliers:
    (1) Compute average templates for each class and remove a candidate that is poorly correlated with the average. This is the simplest method. recogRemoveOutliers1() uses this, supplemented with a second threshold and a target number of templates to be saved.
    (2) Compute average templates for each class and remove a candidate that is more highly correlated with the average of some other class. This does not require setting a threshold for the correlation. recogRemoveOutliers2() uses this method, supplemented with a minimum correlation score.
    (3) For each candidate, find the average correlation with other members of its class, and remove those that have a relatively low average correlation. This is similar to (1) and gives comparable results, but because it does not use average templates, it requires a bit more computation.
Definition in file recogtrain.c.
l_int32 pixaAccumulateSamples | ( | PIXA * | pixa, |
PTA * | pta, | ||
PIX ** | ppixd, | ||
l_float32 * | px, | ||
l_float32 * | py | ||
) |
[in] | pixa | of samples from the same class, 1 bpp |
[in] | pta | [optional] of centroids of the samples |
[out] | ppixd | accumulated samples, 8 bpp |
[out] | px | [optional] average x coordinate of centroids |
[out] | py | [optional] average y coordinate of centroids |
Notes: (1) This generates an aligned (by centroid) sum of the input pix. (2) We use only the first 256 samples; that's plenty. (3) If pta is not input, we generate two tables, and discard after use. If this is called many times, it is better to precompute the pta.
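The centroid alignment described in note (1) can be illustrated without Leptonica. The following is a minimal standalone sketch, not the library's implementation: the fixed 8 x 8 image size, the byte-per-pixel layout, and the names `iround`, `centroid` and `accumulate_aligned` are all assumptions made for illustration. Unlike the library routine, this sketch clips pixels that are shifted outside the frame and skips empty images.

```c
#include <string.h>

#define W 8   /* hypothetical fixed image size for this sketch */
#define H 8

/* Round to the nearest integer without needing <math.h>. */
static int iround(float v)
{
    return (v >= 0.0f) ? (int)(v + 0.5f) : -(int)(-v + 0.5f);
}

/* Centroid of the foreground (nonzero) pixels; returns 0 for an empty image. */
static int centroid(unsigned char img[H][W], float *cx, float *cy)
{
    int x, y, n = 0, sx = 0, sy = 0;
    for (y = 0; y < H; y++)
        for (x = 0; x < W; x++)
            if (img[y][x]) { sx += x; sy += y; n++; }
    if (n == 0) return 0;
    *cx = (float)sx / n;
    *cy = (float)sy / n;
    return 1;
}

/*
 * Sum up to 256 binary images into acc, shifting each one so that its
 * centroid lands on the average centroid (*px, *py).  Empty images are
 * skipped.  Returns the number of images accumulated.
 */
int accumulate_aligned(unsigned char imgs[][H][W], int nimg,
                       unsigned char acc[H][W], float *px, float *py)
{
    float cx[256], cy[256], ax = 0.0f, ay = 0.0f;
    int ok[256], i, x, y, nused = 0;
    int n = (nimg > 256) ? 256 : nimg;   /* only the first 256 samples */

    memset(acc, 0, H * W);
    *px = *py = 0.0f;
    for (i = 0; i < n; i++) {
        ok[i] = centroid(imgs[i], &cx[i], &cy[i]);
        if (ok[i]) { ax += cx[i]; ay += cy[i]; nused++; }
    }
    if (nused == 0) return 0;
    *px = ax / nused;
    *py = ay / nused;
    for (i = 0; i < n; i++) {
        int dx, dy;
        if (!ok[i]) continue;
        dx = iround(*px - cx[i]);
        dy = iround(*py - cy[i]);
        for (y = 0; y < H; y++)
            for (x = 0; x < W; x++) {
                int xd = x + dx, yd = y + dy;
                if (!imgs[i][y][x]) continue;
                if (xd < 0 || xd >= W || yd < 0 || yd >= H) continue;
                if (acc[yd][xd] < 255) acc[yd][xd]++;  /* saturate at 8 bits */
            }
    }
    return nused;
}
```

With two single-pixel images at (1,1) and (3,5), both pixels land on the average centroid (2,3), so that accumulator cell holds the value 2.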
Definition at line 652 of file recogtrain.c.
References L_CLONE, makePixelCentroidTab8(), makePixelSumTab8(), PIX_SRC, pixAccumulate(), pixaGetCount(), pixaGetPix(), pixaSizeRange(), pixCentroid(), pixClearAll(), pixCreate(), pixInitAccumulate(), pixRasterop(), ptaAddPt(), ptaClone(), ptaCreate(), ptaGetCount(), and ptaGetPt().
Referenced by recogAverageSamples().
PIXA* pixaRemoveOutliers1 | ( | PIXA * | pixas, |
l_float32 | minscore, | ||
l_int32 | mintarget, | ||
l_int32 | minsize, | ||
PIX ** | ppixsave, | ||
PIX ** | ppixrem | ||
) |
[in] | pixas | unscaled labeled templates |
[in] | minscore | keep everything with at least this score; use -1.0 for default. |
[in] | mintarget | minimum desired number to retain if possible; use -1 for default. |
[in] | minsize | minimum number of samples required for a class; use -1 for default. |
[out] | ppixsave | [optional debug] saved templates, with scores |
[out] | ppixrem | [optional debug] removed templates, with scores |
Notes: (1) Removing outliers is particularly important when recognition goes against all the samples in the training set, as opposed to the averages for each class. The reason is that we get an identification error if a mislabeled template is a best match for an input sample. (2) Because the score values depend strongly on the quality of the character images, to avoid losing too many samples we supplement a minimum score for retention with a score necessary to acquire the minimum target number of templates. To do this we are willing to use a lower threshold, LowerScoreThreshold, on the score. Consequently, with poor quality templates, we may keep samples with a score less than minscore, but never less than LowerScoreThreshold. And if the number of samples is less than minsize, we do not use any. (3) This is meant to be used on a BAR, where the templates all come from the same book; use minscore ~0.75. (4) Method: make a scaled recog from the input pixas. Then, for each class: generate the averages, match each scaled template against the average, and save unscaled templates that had a sufficiently good match.
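One way to read the retention rule in note (2) is as a per-class threshold computation. The sketch below is a standalone illustration under stated assumptions, not the code in recogtrain.c: the function name `class_threshold` is hypothetical, and the floor `lower` stands in for LowerScoreThreshold, whose actual value is not given here and is passed in by the caller.

```c
#include <stdlib.h>

/* Sort helper: descending order of float scores. */
static int cmp_desc(const void *a, const void *b)
{
    float fa = *(const float *)a, fb = *(const float *)b;
    return (fa < fb) - (fa > fb);
}

/*
 * Return the score threshold to apply to one class, or -1.0f if the class
 * has fewer than minsize samples and should not be used at all.
 * `lower` plays the role of LowerScoreThreshold (value assumed by caller).
 * `scores` is sorted in place, in descending order.
 */
float class_threshold(float *scores, int n, float minscore,
                      int mintarget, int minsize, float lower)
{
    int i, idx, nkeep = 0;
    float t;

    if (n < minsize) return -1.0f;
    qsort(scores, n, sizeof(float), cmp_desc);
    for (i = 0; i < n; i++)
        if (scores[i] >= minscore) nkeep++;
    if (nkeep >= mintarget || nkeep == n)
        return minscore;             /* enough good samples already */
    /* Relax toward the score of the mintarget-th best sample ... */
    idx = ((mintarget <= n) ? mintarget : n) - 1;
    t = scores[idx];
    /* ... but never go below the floor. */
    return (t > lower) ? t : lower;
}
```

A template is then kept when its score is at least the returned threshold. With poor-quality templates the returned value can fall below minscore, but never below `lower`.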
Definition at line 1135 of file recogtrain.c.
Referenced by recogRemoveOutliers1().
PIXA* pixaRemoveOutliers2 | ( | PIXA * | pixas, |
l_float32 | minscore, | ||
l_int32 | minsize, | ||
PIX ** | ppixsave, | ||
PIX ** | ppixrem | ||
) |
[in] | pixas | unscaled labeled templates |
[in] | minscore | keep everything with at least this score; use -1.0 for default. |
[in] | minsize | minimum number of samples required for a class; use -1 for default. |
[out] | ppixsave | [optional debug] saved templates, with scores |
[out] | ppixrem | [optional debug] removed templates, with scores |
Notes: (1) Removing outliers is particularly important when recognition goes against all the samples in the training set, as opposed to the averages for each class. The reason is that we get an identification error if a mislabeled template is a best match for an input sample. (2) This method compares each template against the average templates of each class, and discards any template that has a higher correlation to a class different from its own. It also sets a lower bound on correlation scores with its class average. (3) This is meant to be used on a BAR, where the templates all come from the same book; use minscore ~0.75.
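The decision described in note (2) reduces to a small per-template test once the correlations with every class average are known. This is a hedged standalone sketch: the function name `keep_template` and the flat score row are assumptions for illustration, and ties with another class are resolved in favor of keeping the template, which the notes do not specify.

```c
/*
 * scores[i] is the correlation of one template with the average template
 * of class i, for i in [0, nclass).  `own` is the template's labeled class.
 * Returns 1 to keep the template, 0 to discard it.
 */
int keep_template(const float *scores, int nclass, int own, float minscore)
{
    int i;

    if (own < 0 || own >= nclass) return 0;
    /* Discard if some other class average correlates more strongly. */
    for (i = 0; i < nclass; i++)
        if (i != own && scores[i] > scores[own]) return 0;
    /* Also require a minimum correlation with its own class average. */
    return scores[own] >= minscore;
}
```

For example, a template labeled as class 0 with scores {0.8, 0.6, 0.3} is kept at minscore 0.75, while the same row labeled as class 1 is discarded.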
Definition at line 1336 of file recogtrain.c.
Referenced by recogRemoveOutliers2().
static PIX * pixDisplayOutliers (PIXA *pixas, NUMA *nas)
[in] | pixas | unscaled labeled templates |
[in] | nas | scores of templates (against class averages) |
Notes: (1) This debug routine is called from recogRemoveOutliers2(), and takes the saved templates and their scores as input.
Definition at line 2170 of file recogtrain.c.
References L_CLONE, L_GET_WHITE_VAL, L_INSERT, numaGetCount(), numaGetFValue(), pixaAddPix(), pixaCreate(), pixAddBlackOrWhiteBorder(), pixaDestroy(), pixaDisplayTiledWithText(), pixaGetCount(), pixaGetPix(), pixDestroy(), pixGetText(), and pixSetText().
PIXA * recogAddDigitPadTemplates (L_RECOG *recog, SARRAY *sa)
[in] | recog | trained |
[in] | sa | set of text strings that need to be padded |
Notes: (1) Call recogIsPaddingNeeded() first, which returns sa of template text strings for classes where more templates are needed.
Definition at line 1732 of file recogtrain.c.
References L_Recog::charset_type, L_CLONE, L_COPY, L_NOCOPY, pixaAddPix(), pixaDestroy(), pixaGetCount(), pixaGetPix(), pixDestroy(), pixGetText(), recogCharsetAvailable(), recogExtractPixa(), recogMakeBootDigitTemplates(), sarrayGetCount(), and sarrayGetString().
Referenced by recogPadDigitTrainingSet().
static SARRAY * recogAddMissingClassStrings (L_RECOG *recog)
[in] | recog | trained |
Notes: (1) This returns an empty sa if there is at least one template in each class in recog.
Definition at line 1675 of file recogtrain.c.
References L_Recog::charset_size, L_Recog::charset_type, L_COPY, L_NOCOPY, numaAddNumber(), numaCreate(), numaDestroy(), numaGetIValue(), numaSetValue(), L_Recog::pixaa_u, pixaaGetCount(), L_Recog::sa_text, sarrayAddString(), sarrayCreate(), and sarrayGetString().
Referenced by recogIsPaddingNeeded().
l_ok recogAddSample (L_RECOG *recog, PIX *pix, l_int32 debug)
[in] | recog | |
[in] | pix | a single character, 1 bpp |
[in] | debug |
Notes: (1) The pix is 1 bpp, with the character string label embedded. (2) The pixaa_u array of the recog is initialized to accept up to 256 different classes. When training is finished, the arrays are truncated to the actual number of classes. To pad an existing recog from the boot recognizers, training is started again; if samples from a new class are added, the pixaa_u array is extended by adding a pixa to hold them.
Definition at line 353 of file recogtrain.c.
Referenced by recogTrainLabeled().
l_int32 recogAverageSamples | ( | L_RECOG * | recog, |
l_int32 | debug | ||
) |
[in] | recog | addr of existing recog |
[in] | debug |
Notes: (1) This is only called in two situations: (a) When splitting characters using either the DID method recogDecode() or the greedy splitter recogCorrelationBestRow() (b) By a special recognizer that is used to remove outliers. Both unscaled and scaled inputs are averaged. (2) If the data in any class is nonexistent (no samples), or very bad (no fg pixels in the average), or if the ratio of max/min average unscaled class template heights is greater than max_ht_ratio, this function fails. The caller must check the return value, and destroy the recog on failure. (3) Set debug = 1 to view the resulting templates and their centroids.
Definition at line 484 of file recogtrain.c.
References L_Recog::ave_done, boxDestroy(), boxGetGeometry(), L_CLONE, L_INSERT, L_Recog::max_ht_ratio, L_Recog::max_splith, L_Recog::maxheight_u, L_Recog::maxwidth, L_Recog::maxwidth_u, L_Recog::min_splitw, L_Recog::minheight_u, L_Recog::minwidth, L_Recog::minwidth_u, L_Recog::nasum, L_Recog::nasum_u, numaAddNumber(), numaCreate(), numaDestroy(), L_Recog::pixa, L_Recog::pixa_u, L_Recog::pixaa, L_Recog::pixaa_u, pixaAccumulateSamples(), pixaAddPix(), pixaaGetPixa(), pixaCreate(), pixaDestroy(), pixaGetCount(), pixaSizeRange(), pixClipToForeground(), pixCountPixels(), pixDestroy(), pixInvert(), pixThresholdToBinary(), L_Recog::pta, L_Recog::pta_u, L_Recog::ptaa, L_Recog::ptaa_u, ptaAddPt(), ptaaGetPta(), ptaCreate(), ptaDestroy(), recogShowAverageTemplates(), L_Recog::setsize, and L_Recog::sumtab.
Referenced by recogDebugAverages().
static l_int32 recogCharsetAvailable (l_int32 type)
[in] | type | of charset for padding |
Definition at line 1781 of file recogtrain.c.
References L_ARABIC_NUMERALS, L_LC_ALPHA, L_LC_ROMAN_NUMERALS, L_UC_ALPHA, and L_UC_ROMAN_NUMERALS.
Referenced by recogAddDigitPadTemplates().
l_ok recogDebugAverages | ( | L_RECOG * | recog, |
l_int32 | debug | ||
) |
[in] | recog | addr of recog |
[in] | debug | 0 no output; 1 for images; 2 for text; 3 for both |
Notes: (1) Generates an image that pairs each of the input images used in training with the average template that it is best correlated to. This is written into the recog. (2) It also generates pixa_tr of all the input training images, which can be used, e.g., in recogShowMatchesInRange(). (3) Returns an error if the averaging function finds bad classes.
Definition at line 2022 of file recogtrain.c.
References L_CLONE, L_INSERT, lept_mkdir(), lept_stderr(), L_Recog::pixa_tr, L_Recog::pixaa, pixaaAddPixa(), pixaaCreate(), pixaAddPix(), pixaaDisplayByPixa(), pixaaFlattenToPixa(), pixaaGetCount(), pixaaGetPix(), pixaaGetPixa(), pixaCreate(), pixAddBorder(), pixaDestroy(), pixaGetCount(), L_Recog::pixdb_ave, pixDestroy(), L_Recog::rch, rchExtract(), recogAverageSamples(), and recogIdentifyPix().
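static PIX * recogDisplayOutlier (L_RECOG *recog, l_int32 iclass, l_int32 jsamp, l_int32 maxclass, l_float32 maxscore)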
[in] | recog | |
[in] | iclass | sample is in this class |
[in] | jsamp | index of sample in class iclass |
[in] | maxclass | index of class with closest average to sample |
[in] | maxscore | score of sample with average of class maxclass |
Notes: (1) This shows three templates, side-by-side:
Definition at line 2224 of file recogtrain.c.
References L_Recog::bmf, L_ADD_BELOW, L_CLONE, L_INSERT, L_Recog::pixa, L_Recog::pixaa, pixaAddPix(), pixaaGetPix(), pixaCreate(), pixAddSingleTextblock(), pixaDestroy(), pixaDisplayTiledInRows(), pixaGetPix(), and pixDestroy().
PIXA* recogFilterPixaBySize | ( | PIXA * | pixas, |
l_int32 | setsize, | ||
l_int32 | maxkeep, | ||
l_float32 | max_ht_ratio, | ||
NUMA ** | pna | ||
) |
[in] | pixas | labeled templates |
[in] | setsize | size of character set (number of classes) |
[in] | maxkeep | max number of templates to keep in a class |
[in] | max_ht_ratio | max allowed height ratio (see below) |
[out] | pna | [optional] debug output, giving the number in each class after filtering; use NULL to skip |
Notes: (1) The basic assumption is that the most common and larger templates in each class are more likely to represent the characters we are interested in. For example, larger digits are more likely to represent page numbers, and smaller digits could be data in tables. Therefore, we bias the first stage of filtering toward the larger characters by removing very small ones, and select based on proximity of the remaining characters to median height. (2) For each of the setsize classes, order the templates increasingly by height. Take the rank 0.9 height. Eliminate all templates that are shorter by more than max_ht_ratio. Of the remaining ones, select up to maxkeep that are closest in rank order height to the median template.
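The selection in note (2) can be sketched on a plain array of template heights for one class. This is an illustrative standalone version, not the library code: the function name `filter_heights` is hypothetical, and the exact index used for the rank 0.9 height and the centering of the kept window on the median are assumptions.

```c
#include <stdlib.h>

/* Sort helper: increasing order of ints. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/*
 * Filter the template heights of one class.  Sorts `h` in place
 * (increasing), then reports the kept templates as the index range
 * [*first, *first + count) of the sorted array.  Returns count.
 */
int filter_heights(int *h, int n, int maxkeep, float max_ht_ratio, int *first)
{
    int start, m, h90;

    *first = 0;
    if (n <= 0 || maxkeep <= 0) return 0;
    qsort(h, n, sizeof(int), cmp_int);
    h90 = h[(int)(0.9 * (n - 1))];   /* rank 0.9 height (index choice assumed) */
    /* Too-short templates form a prefix of the sorted array; skip them. */
    for (start = 0; start < n; start++)
        if ((float)h90 <= max_ht_ratio * (float)h[start]) break;
    m = n - start;
    if (m <= maxkeep) {              /* few enough remain: keep them all */
        *first = start;
        return m;
    }
    /* Otherwise keep a window of maxkeep centered on the median. */
    *first = start + (m - maxkeep) / 2;
    return maxkeep;
}
```

For heights {10, 40, 42, 38, 41, 39, 12, 43, 40, 44} with max_ht_ratio = 1.5 and maxkeep = 4, the two small outliers (10 and 12) are dropped and the four heights nearest the median of the rest (40, 40, 41, 42) are kept.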
Definition at line 952 of file recogtrain.c.
References L_CLONE, L_COPY, L_INSERT, L_SORT_BY_HEIGHT, L_SORT_INCREASING, numaAddNumber(), numaCreate(), pixaAddPix(), pixaaDestroy(), pixaaGetCount(), pixaaGetPixa(), pixaCopy(), pixaCreate(), pixaDestroy(), pixaGetCount(), pixaGetPix(), pixaGetPixDimensions(), pixaJoin(), pixaSelectRange(), pixaSort(), and recogSortPixaByClass().
l_int32 recogIsPaddingNeeded (L_RECOG *recog, SARRAY **psa)
[in] | recog | trained |
[out] | psa | addr of returned string containing text value |
Notes: (1) This returns a string array in &sa containing character values for which extra templates are needed; this sarray is used by recogAddDigitPadTemplates(). It returns NULL if no padding templates are needed.
Definition at line 1618 of file recogtrain.c.
References L_Recog::charset_size, L_COPY, L_INSERT, L_Recog::min_nopad, numaDestroy(), numaGetIValue(), numaGetMin(), L_Recog::pixaa_u, pixaaGetCount(), recogAddMissingClassStrings(), L_Recog::sa_text, sarrayAddString(), and sarrayGetString().
Referenced by recogPadDigitTrainingSet().
L_RECOG* recogMakeBootDigitRecog | ( | l_int32 | nsamp, |
l_int32 | scaleh, | ||
l_int32 | linew, | ||
l_int32 | maxyshift, | ||
l_int32 | debug | ||
) |
[in] | nsamp | number of samples of each digit; or 0 |
[in] | scaleh | scale all heights to this; typ. use 40 |
[in] | linew | normalized line width; typ. use 5; 0 to skip |
[in] | maxyshift | from nominal centroid alignment; typically 0 or 1 |
[in] | debug | 1 for showing templates; 0 otherwise |
Notes: (1) This takes a set of pre-computed, labeled pixa of single digits, and generates a recognizer from them. The templates used in the recognizer can be modified by:
Definition at line 1840 of file recogtrain.c.
References pixaDestroy(), recogCreateFromPixa(), recogMakeBootDigitTemplates(), and recogShowContent().
PIXA* recogMakeBootDigitTemplates | ( | l_int32 | nsamp, |
l_int32 | debug | ||
) |
[in] | nsamp | number of samples of each digit; or 0 |
[in] | debug | 1 for display of templates |
Notes: (1) See recogMakeBootDigitRecog().
Definition at line 1877 of file recogtrain.c.
References l_bootnum_gen1(), l_bootnum_gen2(), l_bootnum_gen4(), numaAddNumber(), numaCreate(), pixaDestroy(), pixaDisplayTiledWithText(), pixaExtendByScaling(), pixaJoin(), and pixDestroy().
Referenced by recogAddDigitPadTemplates(), and recogMakeBootDigitRecog().
PIX * recogModifyTemplate (L_RECOG *recog, PIX *pixs)
[in] | recog | |
[in] | pixs | 1 bpp, to be optionally scaled and turned into strokes of fixed width |
Definition at line 416 of file recogtrain.c.
References L_Recog::linew, pixClone(), pixCopy(), pixDestroy(), pixGetDimensions(), pixScaleToSize(), pixSetStrokeWidth(), pixZero(), L_Recog::scaleh, and L_Recog::scalew.
Referenced by recogTrainingFinished().
l_ok recogPadDigitTrainingSet | ( | L_RECOG ** | precog, |
l_int32 | scaleh, | ||
l_int32 | linew | ||
) |
[in,out] | precog | trained; if padding is needed, it is replaced by a new padded recog |
[in] | scaleh | must be > 0; suggest ~40. |
[in] | linew | use 0 for original scanned images |
Notes: (1) This is a no-op if padding is not needed. However, if it is, this replaces the input recog with a new recog, padded appropriately with templates from a boot recognizer, and set up with correlation templates derived from scaleh and linew.
Definition at line 1562 of file recogtrain.c.
References L_Recog::maxyshift, pixaDestroy(), recogAddDigitPadTemplates(), recogCreateFromPixa(), recogDestroy(), recogIsPaddingNeeded(), sarrayDestroy(), and L_Recog::threshold.
l_ok recogProcessLabeled (L_RECOG *recog, PIX *pixs, BOX *box, char *text, PIX **ppix)
[in] | recog | in training mode |
[in] | pixs | if depth > 1, will be thresholded to 1 bpp |
[in] | box | [optional] cropping box |
[in] | text | [optional] if null, use text field in pix |
[out] | ppix | addr of pix, 1 bpp, labeled |
Notes: (1) This crops and binarizes the input image, generating a pix of one character where the charval is inserted into the pix.
Definition at line 264 of file recogtrain.c.
References L_Recog::num_samples, pixClipRectangle(), pixClone(), and Pix::text.
Referenced by recogTrainLabeled().
l_ok recogRemoveOutliers1 | ( | L_RECOG ** | precog, |
l_float32 | minscore, | ||
l_int32 | mintarget, | ||
l_int32 | minsize, | ||
PIX ** | ppixsave, | ||
PIX ** | ppixrem | ||
) |
[in] | precog | addr of recog with unscaled labeled templates |
[in] | minscore | keep everything with at least this score |
[in] | mintarget | minimum desired number to retain if possible |
[in] | minsize | minimum number of samples required for a class |
[out] | ppixsave | [optional debug] saved templates, with scores |
[out] | ppixrem | [optional debug] removed templates, with scores |
Notes: (1) This is a convenience wrapper when using default parameters for the recog. See pixaRemoveOutliers1() for details. (2) If this succeeds, the new recog replaces the input recog; if it fails, the input recog is destroyed.
Definition at line 1059 of file recogtrain.c.
References pixaDestroy(), pixaRemoveOutliers1(), recogCreateFromPixa(), recogDestroy(), and recogExtractPixa().
l_ok recogRemoveOutliers2 | ( | L_RECOG ** | precog, |
l_float32 | minscore, | ||
l_int32 | minsize, | ||
PIX ** | ppixsave, | ||
PIX ** | ppixrem | ||
) |
[in] | precog | addr of recog with unscaled labeled templates |
[in] | minscore | keep everything with at least this score |
[in] | minsize | minimum number of samples required for a class |
[out] | ppixsave | [optional debug] saved templates, with scores |
[out] | ppixrem | [optional debug] removed templates, with scores |
Notes: (1) This is a convenience wrapper when using default parameters for the recog. See pixaRemoveOutliers2() for details. (2) If this succeeds, the new recog replaces the input recog; if it fails, the input recog is destroyed.
Definition at line 1274 of file recogtrain.c.
References pixaDestroy(), pixaRemoveOutliers2(), recogCreateFromPixa(), recogDestroy(), and recogExtractPixa().
l_int32 recogShowAverageTemplates | ( | L_RECOG * | recog | ) |
[in] | recog |
Notes: (1) This debug routine generates a display of the averaged templates, both scaled and unscaled, with the centroid visible in red.
Definition at line 2094 of file recogtrain.c.
References L_CLONE, L_INSERT, lept_stderr(), L_Recog::max_splith, L_Recog::maxheight_u, L_Recog::maxwidth_u, L_Recog::min_splitw, L_Recog::minheight_u, L_Recog::minwidth_u, PIX_SRC, L_Recog::pixa, L_Recog::pixa_u, pixaAddPix(), pixaCreate(), L_Recog::pixadb_ave, pixaDestroy(), pixaDisplayTiledInRows(), pixaGetPix(), pixConvertTo32(), pixCreate(), pixDestroy(), pixRasterop(), pixSetAllArbitrary(), L_Recog::pta, L_Recog::pta_u, ptaGetPt(), and L_Recog::setsize.
Referenced by recogAverageSamples().
l_ok recogShowContent | ( | FILE * | fp, |
L_RECOG * | recog, | ||
l_int32 | index, | ||
l_int32 | display | ||
) |
[in] | fp | file stream |
[in] | recog | |
[in] | index | for naming of output files of template images |
[in] | display | 1 for showing template images; 0 otherwise |
Definition at line 1941 of file recogtrain.c.
References L_Recog::dna_tochar, l_dnaGetIValue(), lept_mkdir(), L_Recog::linew, L_Recog::maxyshift, numaDestroy(), numaGetIValue(), L_Recog::pixaa_u, pixaaDisplayByPixa(), pixaaGetCount(), L_Recog::scaleh, L_Recog::scalew, L_Recog::setsize, and L_Recog::threshold.
Referenced by recogMakeBootDigitRecog().
PIX* recogShowMatch | ( | L_RECOG * | recog, |
PIX * | pix1, | ||
PIX * | pix2, | ||
BOX * | box, | ||
l_int32 | index, | ||
l_float32 | score | ||
) |
[in] | recog | |
[in] | pix1 | input pix; several possibilities |
[in] | pix2 | [optional] matching template |
[in] | box | [optional] region in pix1 for which pix2 matches |
[in] | index | index of matching template; use -1 to disable printing |
[in] | score | score of match |
Notes: (1) pix1 can be one of these: (a) The input pix alone, which can be either a single character (box == NULL) or several characters that need to be segmented. If more than one character is present, the box region is displayed with an outline. (b) Both the input pix and the matching template. In this case, pix2 and box will both be null. (2) If the bmf has been made (by a call to recogMakeBmf()) and the index >= 0, the text field, match score and index will be rendered; otherwise their values will be ignored.
Definition at line 2369 of file recogtrain.c.
References L_Recog::bmf, L_ADD_BELOW, L_CLONE, pixaAddPix(), pixaCreate(), pixAddBorderGeneral(), pixAddSingleTextblock(), pixaDestroy(), pixaDisplayTiledInRows(), pixClone(), pixConvertTo32(), pixCopy(), pixDestroy(), pixRenderBoxArb(), and recogGetClassString().
Referenced by recogIdentifyPixa(), and recogShowMatchesInRange().
l_ok recogShowMatchesInRange | ( | L_RECOG * | recog, |
PIXA * | pixa, | ||
l_float32 | minscore, | ||
l_float32 | maxscore, | ||
l_int32 | display | ||
) |
[in] | recog | |
[in] | pixa | of 1 bpp images to match |
[in] | minscore | min score to include output |
[in] | maxscore | max score to include output |
[in] | display | 1 to display the result |
Notes: (1) This gives a visual output of the best matches for a given range of scores. Each pair of images can optionally be labeled with the index of the best match and the correlation. (2) To use this, save a set of 1 bpp images (labeled or unlabeled) that can be given to a recognizer in a pixa. Then call this function with the pixa and parameters to filter a range of scores.
Definition at line 2277 of file recogtrain.c.
References L_CLONE, L_INSERT, numaAddNumber(), numaCreate(), numaGetFValue(), numaGetIValue(), pixaAddPix(), pixaCreate(), pixaGetCount(), pixaGetPix(), pixDestroy(), L_Recog::rch, rchExtract(), recogIdentifyPix(), and recogShowMatch().
PIXAA * recogSortPixaByClass (PIXA *pixa, l_int32 setsize)
[in] | pixa | labeled templates |
[in] | setsize | size of character set (number of classes) |
Definition at line 1021 of file recogtrain.c.
References L_Recog::pixaa_u, recogCreateFromPixaNoFinish(), and recogDestroy().
Referenced by recogFilterPixaBySize().
static l_int32 recogTemplatesAreOK (L_RECOG *recog, l_int32 minsize, l_float32 minfract, l_int32 *pok)
[in] | recog | |
[in] | minsize | set to -1 for default |
[in] | minfract | set to -1.0 for default |
[out] | pok | set to 1 if template set is valid; 0 otherwise |
Notes: (1) This is called by recogTrainingFinished(). A return value of 0 will cause recogTrainingFinished() to destroy the recog. (2) minsize is the minimum number of samples required for the class; -1 uses the default (3) minfract is the minimum fraction of classes required for the recog to be usable; -1.0 uses the default
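The validity test in notes (2) and (3) amounts to counting the classes that have enough samples and comparing that fraction with minfract. The following standalone sketch shows the idea; the function name `templates_ok` is hypothetical, defaults for negative arguments are not modeled, and taking the fraction relative to the charset size (rather than the number of classes seen so far) is an assumption.

```c
/*
 * nsamples[i] is the number of templates in class i.
 * Returns 1 if the fraction of classes holding at least minsize samples,
 * relative to charset_size, is at least minfract; 0 otherwise.
 */
int templates_ok(const int *nsamples, int nclasses, int charset_size,
                 int minsize, float minfract)
{
    int i, valid = 0;

    if (charset_size <= 0) return 0;
    for (i = 0; i < nclasses; i++)
        if (nsamples[i] >= minsize) valid++;
    return ((float)valid / (float)charset_size) >= minfract;
}
```

For class sizes {5, 0, 3, 1} in a 4-character set, requiring one sample per class gives 3 valid classes out of 4, which passes with minfract = 0.75; requiring two samples leaves only 2 valid classes, which fails.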
Definition at line 892 of file recogtrain.c.
Referenced by recogTrainingFinished().
PIXA* recogTrainFromBoot | ( | L_RECOG * | recogboot, |
PIXA * | pixas, | ||
l_float32 | minscore, | ||
l_int32 | threshold, | ||
l_int32 | debug | ||
) |
[in] | recogboot | labeled boot recognizer |
[in] | pixas | set of unlabeled input characters |
[in] | minscore | min score for accepting the example; e.g., 0.75 |
[in] | threshold | for binarization, if needed |
[in] | debug | 1 for debug output saved to recogboot; 0 otherwise |
Notes: (1) This takes pixas of unscaled single characters and recogboot, a bootstrap recognizer (BSR) that has been set up with parameters:
    * scaleh: scale all templates to this height
    * linew: width of normalized strokes, or 0 if using the input image
It modifies the pix in pixas accordingly and correlates with the templates in the BSR. It returns those input images in pixas whose best correlation with the BSR is at or above minscore. The returned pix have added text labels for the text string of the class to which the best correlated template belongs. (2) Identification occurs in scaled mode (typically with h = 40), optionally using width-normalized line images derived from those in pixas.
Definition at line 1460 of file recogtrain.c.
References L_CLONE, L_COPY, L_INSERT, L_Recog::linew, pixaAddPix(), pixaCopy(), pixaCreate(), L_Recog::pixadb_boot, pixaDestroy(), pixaGetCount(), pixaGetPix(), pixaSetStrokeWidth(), pixaVerifyDepth(), pixConvertTo1(), pixDestroy(), pixScaleToSize(), pixSetText(), L_Recog::rch, rchExtract(), recogIdentifyPix(), and L_Recog::scaleh.
l_ok recogTrainingFinished | ( | L_RECOG ** | precog, |
l_int32 | modifyflag, | ||
l_int32 | minsize, | ||
l_float32 | minfract | ||
) |
[in] | precog | addr of recog |
[in] | modifyflag | 1 to use recogModifyTemplate(); 0 otherwise |
[in] | minsize | set to -1 for default |
[in] | minfract | set to -1.0 for default |
Notes: (1) This must be called after all training samples have been added. (2) If the templates are not good enough, the recog input is destroyed. (3) Usually, modifyflag == 1, because we want to apply recogModifyTemplate() to generate the actual templates that will be used. The one exception is when reading a serialized recog: there we want to put the same set of templates in both the unscaled and modified pixaa. See recogReadStream() to see why we do this. (4) See recogTemplatesAreOK() for minsize and minfract usage. (5) The following things are done here: (a) Allocate (or reallocate) storage for (possibly) modified bitmaps, centroids, and fg areas. (b) Generate the (possibly) modified bitmaps. (c) Compute centroid and fg area data for both unscaled and modified bitmaps. (d) Truncate the pixaa, ptaa and numaa arrays down from 256 to the actual size. (6) Putting these operations here makes it simple to recompute the recog with different modifications on the bitmaps. (7) Call recogShowContent() to display the templates, both unscaled and modified.
Definition at line 769 of file recogtrain.c.
References L_Recog::centtab, L_CLONE, L_INSERT, L_Recog::maxarraysize, L_Recog::naasum, L_Recog::naasum_u, numaaAddNumber(), numaaCreateFull(), numaaDestroy(), numaaTruncate(), L_Recog::pixaa, L_Recog::pixaa_u, pixaaAddPix(), pixaaCreate(), pixaaDestroy(), pixaaGetPixa(), pixaaInitFull(), pixaaTruncate(), pixaCreate(), pixaDestroy(), pixaGetCount(), pixaGetPix(), pixCentroid(), pixClone(), pixCountPixels(), pixDestroy(), L_Recog::ptaa, L_Recog::ptaa_u, ptaaAddPt(), ptaaCreate(), ptaaDestroy(), ptaaInitFull(), ptaaTruncate(), ptaCreate(), ptaDestroy(), recogDestroy(), recogModifyTemplate(), recogTemplatesAreOK(), L_Recog::setsize, L_Recog::sumtab, and L_Recog::train_done.
Referenced by recogAddAllSamples(), and recogCreateFromPixa().
l_ok recogTrainLabeled (L_RECOG *recog, PIX *pixs, BOX *box, char *text, l_int32 debug)
[in] | recog | in training mode |
[in] | pixs | if depth > 1, will be thresholded to 1 bpp |
[in] | box | [optional] cropping box |
[in] | text | [optional] if null, use text field in pix |
[in] | debug | 1 to display images of samples not captured |
Notes: (1) Training is restricted to the addition of a single character in an arbitrary (e.g., UTF8) charset. (2) If box != null, it should represent the location in pixs of the character image.
Definition at line 217 of file recogtrain.c.
References pixDestroy(), recogAddSample(), and recogProcessLabeled().
Referenced by recogCreateFromPixaNoFinish().