[ale] Document Imaging under Linux

John Wells jb at sourceillustrated.com
Sun Sep 14 20:25:20 EDT 2003


I have a family member who'd like me to help design/develop or integrate a
document imaging system under Linux.  He has a large number of documents
he'd like to scan in, store, and be able to retrieve easily for his

I'm very new to document imaging, so I'm not convinced I have a handle on
everything that goes into it, but my layman's understanding is that it is
simply converting paper docs into storable electronic docs.

So, first of all, is there anything out there already?  I'd hate to
reinvent the wheel, and I'm sure this has been done before many times.

If there isn't anything out there, then what formats are available?  We're
talking potentially hundreds of thousands of documents here.

My first thought was to store them as JPEGs in the filesystem and then
store their ids in a database, allowing the filesystem to handle the
loading and storing of the files.  Course, JPEG is just the smallest
fairly good-quality format I'm aware of, and I'm sure I'm overlooking some
better ones.
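
To make that idea concrete, here is a rough sketch of the
files-on-disk-plus-database-index layout.  It is only an illustration
under assumptions: Python and SQLite are stand-ins, nothing has been
chosen yet, and the names open_index, store_document and load_document
are hypothetical helpers made up for this example.

```python
import shutil
import sqlite3
from pathlib import Path


def open_index(db_path):
    """Open (or create) a SQLite index mapping document ids to file paths."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " title TEXT NOT NULL,"
        " rel_path TEXT NOT NULL)"
    )
    return conn


def store_document(conn, store_root, scan_path, title):
    """Copy a scanned image into the store and record it in the index."""
    scan_path = Path(scan_path)
    # Insert first so the database hands out the id used in the file name.
    cur = conn.execute(
        "INSERT INTO documents (title, rel_path) VALUES (?, '')", (title,)
    )
    doc_id = cur.lastrowid
    # Assumption: bucket files 1000 per directory so that no single
    # directory ends up holding hundreds of thousands of entries.
    rel_path = Path(f"{doc_id // 1000:06d}") / f"{doc_id}{scan_path.suffix}"
    dest = Path(store_root) / rel_path
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(scan_path, dest)
    conn.execute(
        "UPDATE documents SET rel_path = ? WHERE id = ?",
        (rel_path.as_posix(), doc_id),
    )
    conn.commit()
    return doc_id


def load_document(conn, store_root, doc_id):
    """Look up a document id in the index and return the file's bytes."""
    row = conn.execute(
        "SELECT rel_path FROM documents WHERE id = ?", (doc_id,)
    ).fetchone()
    if row is None:
        raise KeyError(doc_id)
    return (Path(store_root) / row[0]).read_bytes()
```

The database only holds the id, a title and a relative path; the
filesystem does the actual loading and storing, so the image format can
change later without touching the index.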

If you were approaching this project, what would you do? :-)

Thanks very much!

