Software

The same software that works with the DIY Book Scanner should work with BookLiberator Beta. This includes:

  • ScanTailor for processing page images: page splitting, deskewing, adding/removing borders, etc. It turns raw scans into images ready to be printed or assembled into a PDF or DJVU file. However, ScanTailor does not perform optical character recognition (OCR) – that is, it does not convert images to text.
  • Book Scan Wizard does similar things to ScanTailor; it also does not convert images to text.
  • Tesseract is open source OCR software – this is what converts images to text.
  • OCRopus is another open source OCR system.

We try to track the software listed at the DIY Book Scanner's software page. We're not sure yet whether it makes sense to just point to that wiki instead of maintaining a list of resources here. For the moment (as of October 2014), we're watching the situation and trying to avoid gratuitously duplicating information, so please check those pages too.

Remember that OCR software only gets better over time! Any page images you generate now can be re-processed as often as you like in the future, generating better text each time. In other words, you should buy a BookLiberator Beta now, and start getting your pages digitized right away; the OCR stage can happen whenever you are ready.
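
As a rough illustration of re-processing saved page images, here is a minimal Python sketch that runs the Tesseract command-line program over a directory of images and writes one text file per page. It assumes the tesseract executable is installed and on your PATH, and that the English language data ("eng") is available. The function names and the list of image suffixes are made up for this example; they are not part of Tesseract or of any BookLiberator tooling.

```python
import shutil
import subprocess
from pathlib import Path

# Assumed set of page-image file types; adjust to match your scans.
IMAGE_SUFFIXES = {".tif", ".tiff", ".png", ".jpg", ".jpeg"}


def find_page_images(image_dir):
    """Return the page images in image_dir, sorted by file name."""
    return sorted(
        p for p in Path(image_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def tesseract_command(image_path, out_dir, lang="eng"):
    """Build the command line: tesseract IMAGE OUTPUTBASE -l LANG.

    Tesseract adds the ".txt" extension to OUTPUTBASE itself.
    """
    out_base = Path(out_dir) / Path(image_path).stem
    return ["tesseract", str(image_path), str(out_base), "-l", lang]


def ocr_directory(image_dir, out_dir, lang="eng"):
    """Run Tesseract on every page image, one text file per page."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for image in find_page_images(image_dir):
        # check=True stops the run if Tesseract reports an error.
        subprocess.run(tesseract_command(image, out_dir, lang), check=True)
```

For example, calling ocr_directory("pages", "text-2014") today and ocr_directory("pages", "text-later") after a future OCR upgrade would leave both sets of text side by side for comparison (both directory names are placeholders). The page images themselves are never modified, which is what makes repeated re-processing safe.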

software.txt · Last modified: 2014/10/06 21:15 by kfogel