Many users handle large volumes of incoming documents. For instance, attorneys work with hundreds or even thousands of document pages for each case and each client. These case documents often arrive almost simultaneously or in rapid succession, and copies need to be made quickly so that work can start.
How can you quickly create a convenient archive from a broad assortment of document images, skipping recognition for now while still allowing easy text extraction later on?
With ABBYY FineReader 12, you can do this in three simple steps.
1. Selecting the appropriate image processing settings — this only needs to be done once.
Deselect the option for automatic page processing:
If the documents are photographed, automatic detection of page edges as well as background whitening may be activated (options in the green box).
Then, under the Save / PDF tab, you must select:
2. Scanning or downloading photos from a camera.
To obtain images of documents quickly, use a scanner with an automatic feeder or a digital camera.
- All documents are scanned into FineReader using the button.
- If the documents were photographed with a camera, press the button for opening image files, select all files with photographed documents in the window that opens, and confirm. Several seconds later, they will appear in the FineReader window.
3. Saving documents as separate PDF files
Select the first page of the first document by clicking on its icon in the left window:
If you want to name the resulting files using some text from the document, such as its title or subheader, then now is a good time to copy it to the clipboard using the quick fragment copy function:
- Highlight the desired phrase using your mouse and click the Copy button (the leftmost one in the line of buttons that appear above the highlighted area):
- Now, scroll through the pages to find the last page of the first document and click on it while holding down the Shift key, thus highlighting all pages of the first document.
- Now press the save button (if you saved to a different format last time, there will be one additional step: selecting PDF in the drop-down list).
In the opened file-saving dialog box, paste the file name from the clipboard (Ctrl+V):
Select a destination folder to save to and press Save.
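A title copied from a document may contain characters that Windows does not accept in file names, such as a colon or a slash. The following is a minimal sketch, not part of FineReader, of how such a title could be cleaned up before pasting it into the save dialog. The helper name `safe_pdf_name` and the 100-character limit are assumptions made for illustration, and reserved device names such as CON are not handled.

```python
import re

# Characters that Windows does not allow in file names, plus control characters.
_FORBIDDEN = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def safe_pdf_name(title: str, max_length: int = 100) -> str:
    """Turn a copied document title into a file name ending in .pdf.

    Hypothetical helper for illustration; FineReader does not provide it.
    """
    # Replace forbidden characters with a space, then collapse whitespace runs.
    cleaned = _FORBIDDEN.sub(" ", title)
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    # Windows also rejects names that end with a dot or a space.
    cleaned = cleaned[:max_length].rstrip(". ")
    # Fall back to a placeholder if nothing usable is left.
    return (cleaned or "untitled") + ".pdf"


print(safe_pdf_name("Smith v. Jones: Motion to Dismiss (draft 2/3)"))
# prints: Smith v. Jones Motion to Dismiss (draft 2 3).pdf
```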
Now that the pages of the first document have been processed, remove them from the FineReader window to prevent clutter: press Del on the keyboard and confirm the removal of highlighted pages in the opened dialog box.
Repeat step 3 for all documents in the package. The description may read long, but since many of the operations are executed by quick keystrokes, the processing in fact goes by quickly.
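The repetition in step 3 amounts to cutting one flat sequence of pages into consecutive groups, one group per document. As a rough illustration of that bookkeeping only (FineReader does this through the interface, and the function and file names below are hypothetical), the grouping can be sketched like this, assuming you know the position of the last page of each document:

```python
from typing import List, Sequence


def split_into_documents(pages: Sequence[str], last_pages: Sequence[int]) -> List[List[str]]:
    """Group a flat page list into documents.

    `last_pages` holds the 1-based position of the final page of each
    document, in ascending order (an assumed input for this sketch).
    """
    documents = []
    start = 0
    for last in last_pages:
        # Each document must contain at least one page and stay within the batch.
        if last <= start or last > len(pages):
            raise ValueError(f"invalid last-page position: {last}")
        documents.append(list(pages[start:last]))
        start = last
    if start != len(pages):
        raise ValueError("some pages at the end were not assigned to a document")
    return documents


# Seven scanned pages forming three documents that end on pages 3, 5 and 7.
pages = [f"scan_{n:03d}.jpg" for n in range(1, 8)]
for number, doc in enumerate(split_into_documents(pages, [3, 5, 7]), start=1):
    print(number, doc[0], "to", doc[-1], f"({len(doc)} pages)")
```

Each group corresponds to one pass through step 3: select the first page, Shift-click the last page, save, and delete.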
Later, if you need to quote from any of the created PDF files, you can extract the text from the document and save it in a different format: an editable one for quoting, or an e-book format for detailed review on your e-book reader or tablet.
- You can use the automated processing provided by ABBYY HotFolder (available with the FineReader Corporate version) instead of working with the interface manually – it is faster and more convenient. However, you will only obtain each multi-page document as an individual file automatically if your scanner’s or multifunction device’s software supports distributing pages into individual multi-page documents.
- You may optimize the software interface to streamline your work with images. To do this, press Ctrl+F5 (hide the Zoom window) and minimize (but do not hide completely!) the text window by grabbing the border between the windows and dragging it to the right. These windows are not needed during processing without recognition, and an expanded Image window speeds up step 3.
- Experiment with the PDF quality settings to find the optimal balance between compact size and sufficient quality (for reading, for text extraction, and for possible conversion to other formats).
- Well-organized archive in a short space of time
- Optimal size
- Documents are ready for subsequent text extraction
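When experimenting with the PDF quality settings, it helps to compare the resulting file sizes side by side. Here is a minimal sketch, assuming the test files were saved into a single folder; the function name `pdf_sizes` is hypothetical and the script is independent of FineReader.

```python
from pathlib import Path
from typing import List, Tuple


def pdf_sizes(folder: str) -> List[Tuple[str, float]]:
    """Return (file name, size in KiB) for each PDF in `folder`, largest first."""
    entries = [
        (path.name, path.stat().st_size / 1024)
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() == ".pdf"
    ]
    return sorted(entries, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # List the PDFs in the current folder; pass your archive folder instead.
    for name, size_kib in pdf_sizes("."):
        print(f"{size_kib:10.1f} KiB  {name}")
```

Saving the same document with two or three different settings and comparing the listed sizes against how readable each file looks gives a quick basis for choosing one.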