Digitization – Not just scanning

Written By – Priya Amaresh

The process of converting information from analogue format into a digital or electronic format is called Digitization.

Multi-function devices (MFDs) are commonly used for scanning flat documents. Optical Character Recognition (OCR) converts the scanned text so that the digital document can be searched; the downside is that it is of little use when the document is handwritten or laid out with images and columns. An MFD is fine for a quick job, but for vital business documents to be searchable, accessible and useful, the approach has to be professional.

Points to consider when digitizing documents:

  1. Accuracy of the whole document

It is important to have hardware that is capable of scanning a range of sizes, not just conventional A4 or smaller, as this ensures the accuracy of the scanned image. By accuracy, I mean no loss of text at the edges and no blurring of text either. Large-format and other specialized scanners are used to achieve the desired results.

  2. Character recognition

Documents contain more than typed text: there may be handwritten pages, photographs and graphs, all of which need to be included in the scan to give a consolidated and true version of the hard copy. OCR handles the typed text, Intelligent Character Recognition (ICR) recognizes handwritten characters, and Optical Mark Recognition (OMR) detects marks entered on forms, such as ticked boxes. In this way the whole content of the document is captured in the digitization process.
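To make the OMR idea a little more concrete, here is a minimal sketch in Python of how a ticked box might be detected: count how much of the checkbox area is dark. The function names, the 5x5 pixel grids and both thresholds are my own assumptions for illustration; commercial OMR software is far more robust than this.

```python
from typing import List

# A checkbox region as rows of grayscale pixels: 0 = black, 255 = white.
Region = List[List[int]]


def dark_fraction(region: Region, dark_below: int = 128) -> float:
    """Return the share of pixels darker than the (assumed) threshold."""
    pixels = [p for row in region for p in row]
    if not pixels:
        return 0.0
    return sum(1 for p in pixels if p < dark_below) / len(pixels)


def is_ticked(region: Region, min_dark: float = 0.2) -> bool:
    """Treat the box as ticked when enough of its interior is dark."""
    return dark_fraction(region) >= min_dark


# Hypothetical sample data: an empty box and a box with a cross in it.
empty_box = [[255] * 5 for _ in range(5)]
ticked_box = [
    [0, 255, 255, 255, 0],
    [255, 0, 255, 0, 255],
    [255, 255, 0, 255, 255],
    [255, 0, 255, 0, 255],
    [0, 255, 255, 255, 0],
]

print(is_ticked(empty_box))   # False
print(is_ticked(ticked_box))  # True
```

In practice the thresholds would need tuning to the scanner and the form, which is one reason this work benefits from experience.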

  3. Usable formats

The digitization output, particularly for medical documents and engineering drawings, can be saved in a variety of formats such as PDF, JPEG, TIFF and GIF. This enables the documents to be repurposed as high-definition images for presentations, high-quality printing and more.

  4. Redaction to ensure privacy

Privacy is important and should be maintained throughout the digitization process. It is therefore advisable to remove sensitive or personal information from documents as they are digitized. This protects the privacy of the individual while still making important information available, e.g. medical test results that can be used for research purposes. Redaction requires specialist software. Such software can also bring together dissimilar documents into a single accessible document.
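As a rough illustration of text redaction, the Python sketch below replaces email addresses and dates in OCR output with a placeholder. The two patterns, the placeholder wording and the sample sentence are assumptions of mine, not a complete rule set; a real redaction tool must also remove the information from the scanned image itself, not just from the text layer.

```python
import re

# Illustrative patterns only; real rules depend on the documents involved.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "DATE": re.compile(r"\b\d{2}/\d{2}/\d{4}\b"),
}


def redact(text: str) -> str:
    """Replace every match of each pattern with a labelled placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text


# Hypothetical OCR output from a medical test report.
sample = "Patient contact: jane.doe@example.com, born 04/11/1975. Glucose: 5.4 mmol/L."
print(redact(sample))
# Patient contact: [EMAIL REDACTED], born [DATE REDACTED]. Glucose: 5.4 mmol/L.
```

The test result stays available for research while the details that identify the individual are removed.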

Conclusion

Therefore, I conclude that digitization is not just scanning. Simply scanning documents and storing them on a computer drive somewhere gives you a good backup of the hard copy, but what use is it to the organization? The documents must have metadata and keywords applied, and must be filed in an appropriate structure. This ensures documents can be located either by the metadata applied to the digital document or by the words and phrases found in the document, which is made possible by OCR, ICR and OMR. The classification and indexing of documents requires knowledge and experience to get right.
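The two ways of locating a document, by metadata or by the words inside it, can be sketched in a few lines of Python. Everything here (the `Document` and `Archive` classes, the field names, the sample records) is hypothetical and only meant to show the idea; a real document management system does much more.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Document:
    doc_id: str
    text: str  # text recovered by OCR/ICR
    metadata: Dict[str, str] = field(default_factory=dict)


class Archive:
    def __init__(self) -> None:
        self.docs: Dict[str, Document] = {}
        # Maps each word to the ids of the documents that contain it.
        self.words: Dict[str, Set[str]] = defaultdict(set)

    def add(self, doc: Document) -> None:
        """Store the document and index every word of its text."""
        self.docs[doc.doc_id] = doc
        for word in re.findall(r"[a-z0-9]+", doc.text.lower()):
            self.words[word].add(doc.doc_id)

    def by_word(self, word: str) -> List[str]:
        """Find documents by a word found in their text."""
        return sorted(self.words.get(word.lower(), set()))

    def by_metadata(self, key: str, value: str) -> List[str]:
        """Find documents by a metadata field applied at filing time."""
        return sorted(
            d.doc_id for d in self.docs.values() if d.metadata.get(key) == value
        )


archive = Archive()
archive.add(Document("inv-001", "Invoice for scanner maintenance",
                     {"type": "invoice", "year": "2021"}))
archive.add(Document("rep-007", "Annual maintenance report",
                     {"type": "report", "year": "2021"}))

print(archive.by_word("maintenance"))         # ['inv-001', 'rep-007']
print(archive.by_metadata("type", "invoice"))  # ['inv-001']
```

Without the metadata and the word index, the same two files would just be images sitting on a drive.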

My advice to all of you: if your business documents are not being digitized with all these components in place, you are not getting the most out of the documents you hold. Together, the features above enable documents to become data and, ultimately, knowledge.

 
