TODO
----

Character segmentation is poor and needs improving

Line segmentation is poor and needs improving

Coding style is inconsistent and confusing, because the code has been
merged from three projects.

GUI

Other languages: need training data and testing data

Heaps and heaps more!

