Enterprise Content Management

Step 1: Document Capture

Document capture and data capture are not the same: document capture converts a paper document into an electronic image of that document, whereas data capture extracts data from a business form.

Documents can be captured by:
      • Scanning
      • Importing electronic documents (Word files, video, spreadsheets, etc.) for sharing or archiving
      • Converting existing electronic documents into unalterable images

What kind of documents can be captured?
The types of document to be scanned determine what kind of scanner is needed. Photos or glossy items, colour documents and documents on coloured paper, highlighted documents, handwritten correspondence, etc. need extra handling to ingest. Documents such as invoices, cheques, claims, applications, or forms are similar to one another and generally arrive in higher volumes, which allows the scanning speed to be optimised. For each document type you will also need to determine: size, single- or double-sided, colour or not, average number of pages, quality, and volumes (by day, week, or month, including any seasonal or business-related variations).

Document Preparation:
Paper documents have to be manually prepared for scanning: repair torn pages, sort documents, and remove paper clips, staples, and sticky notes (or secure the notes to the page so they can be scanned too). This work is time-consuming and too often underestimated, and poor document preparation will slow down document throughput.

Step 2: Bulk Uploading by Scanning

Transforming documents into electronic images can also mean digitising microfilm.

Electronic images can also be captured by:
     • Fax – software can read from the fax server. Be aware that image quality will be lower, which may negatively affect recognition accuracy.
     • Multi-function device (or peripheral) – network-connected MFDs can suffice for low-volume imaging needs.
     • Camera phone – higher-resolution cameras in mobile phones, and software designed to run on them, allow documents to be captured and converted on the go, whether a conference whiteboard or a restaurant menu.

Scanner Categories: Document throughput in the real world will be slower than the scanner's rated speed (ppm = pages per minute), so plan accordingly.
    • Workgroup: 10 – 25 ppm
    • Departmental: 26 – 40 ppm
    • Mid-volume production: 40 – 120 ppm
    • Production: 120 ppm and above
    • Large format: for over-sized documents and engineering drawings
    • Cheque scanners: read the account number on a cheque, speeding up processing
    • Microfilm: for digitising film-based documents
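
The speed categories and the throughput caveat above can be turned into a rough planning calculation. The following is a minimal sketch, not a sizing tool: the function names and the 0.6 efficiency factor are assumptions made for illustration (the real figure depends on document preparation, jams, and operator pauses), and the 40 ppm boundary, which the list places in two categories, is assigned to Departmental here.

```python
# Rated-speed bands taken from the category list above (pages per minute).
# 40 ppm appears in two bands in the list; this sketch assigns it to Departmental.
SPEED_BANDS = [
    (10, 25, "Workgroup"),
    (26, 40, "Departmental"),
    (41, 120, "Mid-volume production"),
]


def scanner_category(rated_ppm: int) -> str:
    """Map a rated speed to one of the speed-based categories."""
    if rated_ppm > 120:
        return "Production"
    for low, high, name in SPEED_BANDS:
        if low <= rated_ppm <= high:
            return name
    return "Below workgroup range"


def scanning_hours(pages: int, rated_ppm: int, efficiency: float = 0.6) -> float:
    """Estimate the hours needed to scan `pages`, derating the rated speed.

    `efficiency` is an assumed fraction of the rated speed achieved in practice.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    effective_ppm = rated_ppm * efficiency
    return pages / effective_ppm / 60


if __name__ == "__main__":
    # A 60 ppm scanner and a monthly volume of 36,000 pages.
    print(scanner_category(60))
    print(round(scanning_hours(pages=36_000, rated_ppm=60), 2))
```

Measuring the efficiency factor on a pilot batch of your own documents gives a far better estimate than any default value.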

Document Imaging: Document images can be saved as one of a number of file formats, including:

    • TIFF (Tagged Image File Format) – Generally used for monochrome office documents.
    • JPEG – Often used for colour documents.
    • PDF – De facto standard; a replica of the document. With OCR text, it can be full-text indexed and searched.
    • PDF/A – ISO standard (ISO 19005) for long-term archival storage.
    • GIF (Graphics Interchange Format) – Exchange and display of simple graphics; its 256-colour palette makes it a poor fit for high-quality colour images.

Step 3: Image Clean-up

Many products include image-enhancement features that improve the quality of scanned documents: de-skew, de-speckle, crop, rotate, blank-page detection, double-feed detection, etc. This step can be performed by software or by an image-processing board in the scanner.
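
To make two of these clean-up operations concrete, here is a minimal pure-Python sketch of blank-page detection and de-speckling on a bitonal image. It is an illustration under stated assumptions, not how any particular product works: the image is modelled as a list of rows of 0/1 pixels, the 0.5% ink threshold is an arbitrary choice, and "speckle" is simplified to a dark pixel with no dark neighbour.

```python
from typing import List

Image = List[List[int]]  # 1 = dark (ink) pixel, 0 = white pixel


def is_blank(page: Image, max_ink_ratio: float = 0.005) -> bool:
    """Treat a page as blank when almost no pixels are dark.

    The 0.5% default threshold is an assumption for illustration.
    """
    total = sum(len(row) for row in page)
    if total == 0:
        return True
    ink = sum(sum(row) for row in page)
    return ink / total <= max_ink_ratio


def despeckle(page: Image) -> Image:
    """Return a copy with isolated dark pixels removed.

    A pixel counts as isolated when none of its 8 neighbours is dark.
    """
    height = len(page)
    cleaned = [row[:] for row in page]
    for y in range(height):
        for x in range(len(page[y])):
            if page[y][x] != 1:
                continue
            has_dark_neighbour = False
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < len(page[ny]):
                        if page[ny][nx] == 1:
                            has_dark_neighbour = True
            if not has_dark_neighbour:
                cleaned[y][x] = 0
    return cleaned


if __name__ == "__main__":
    page = [
        [0, 0, 0, 0],
        [0, 1, 0, 0],  # isolated speck
        [0, 0, 0, 0],
        [1, 1, 0, 0],  # two connected pixels, kept
    ]
    print(despeckle(page))
    print(is_blank(page))
```

Running de-speckle before the blank-page check helps a dusty but empty page fall under the ink threshold.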

Step 4.a: Forms Processing

When forms are scanned, either the data or the entire form can be captured, depending on your business requirements. Data captured from a form can be entered directly into the appropriate database and linked to other enterprise applications such as ERP.

Step 4.b: Recognition

  Recognition is valuable for indexing each image.

     • OCR (Optical Character Recognition): recognises machine-printed characters.
     • Zonal OCR: used where only specific fields on a form are required.
     • Full-text OCR: free-form document conversion that allows searching on every word in the document.
     • ICR (Intelligent Character Recognition): recognises hand-printed characters.
     • OMR (Optical Mark Recognition): recognises check boxes, filled-in bubbles, etc.
     • Bar codes: read and extract information from a pre-printed bar code.
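
The difference between zonal and full-text recognition can be sketched in a few lines. The example below assumes a hypothetical OCR engine has already returned each word with its bounding box; the `Word` type, the zone coordinates, and the field names are all invented for illustration. Zonal extraction then keeps only the words that fall inside a named region of the form.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass
class Word:
    """One recognised word, as a hypothetical OCR engine might report it."""
    text: str
    box: Box


def inside(inner: Box, outer: Box) -> bool:
    """True when `inner` lies completely within `outer`."""
    return (
        inner[0] >= outer[0]
        and inner[1] >= outer[1]
        and inner[2] <= outer[2]
        and inner[3] <= outer[3]
    )


def extract_zones(words: List[Word], zones: Dict[str, Box]) -> Dict[str, str]:
    """Collect the text found in each named zone of the form."""
    result: Dict[str, str] = {}
    for field_name, zone in zones.items():
        hits = [w for w in words if inside(w.box, zone)]
        # Naive reading order: top-to-bottom, then left-to-right.
        hits.sort(key=lambda w: (w.box[1], w.box[0]))
        result[field_name] = " ".join(w.text for w in hits)
    return result


if __name__ == "__main__":
    words = [
        Word("Invoice", (10, 10, 80, 30)),
        Word("INV-1001", (400, 10, 480, 30)),
        Word("Acme", (10, 100, 60, 120)),
        Word("Ltd", (65, 100, 90, 120)),
    ]
    zones = {
        "invoice_number": (380, 0, 500, 40),
        "customer_name": (0, 90, 200, 130),
    }
    print(extract_zones(words, zones))
```

Full-text recognition would instead keep every word on the page, which is what makes searching on all words possible.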

Step 5: Indexing for Searching & Retrieving

  Indexing is NOT optional. There is no other way to find and manage documents.

     • Keying of index fields (document type, customer name, etc.) – a data entry operator manually indexes documents.
     • Zone OCR – an automatic way of capturing index values.
     • Ingest from other applications such as email, word processing, etc. (metadata for the document, e.g. subject line, sender, etc., becomes the index fields).
     • Bar codes allow auto-indexing – by storing form information in a bar code before a batch of documents is scanned, certain index values can be populated automatically.
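
As an illustration of ingesting from another application, the sketch below uses Python's standard `email` package to turn message headers into index fields. The choice of fields and the function name are assumptions; a real capture product would define its own mapping.

```python
from email import message_from_string
from email.utils import parseaddr
from typing import Dict


def index_fields_from_email(raw_message: str) -> Dict[str, str]:
    """Derive index fields from the headers of an RFC 5322 message."""
    msg = message_from_string(raw_message)
    sender_name, sender_address = parseaddr(msg.get("From", ""))
    return {
        "subject": msg.get("Subject", ""),
        "sender": sender_address,
        "sender_name": sender_name,
        "date": msg.get("Date", ""),
    }


if __name__ == "__main__":
    raw = (
        "From: Jane Doe <jane@example.com>\n"
        "Subject: Claim 4711\n"
        "Date: Mon, 01 Jan 2024 10:00:00 +0000\n"
        "\n"
        "Please find the claim form attached.\n"
    )
    print(index_fields_from_email(raw))
```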

The index can be either key fields or full text; a combination is generally best.
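
A combined index can be sketched as key fields stored per document plus an inverted index over the recognised text. The class below is a minimal in-memory illustration with invented names (`DocumentIndex`, `add`, `search`); a production system would use a database or search engine instead.

```python
import re
from collections import defaultdict
from typing import Dict, Optional, Set


class DocumentIndex:
    """Toy index combining key fields with a full-text inverted index."""

    def __init__(self) -> None:
        self.key_fields: Dict[str, Dict[str, str]] = {}
        self.full_text: Dict[str, Set[str]] = defaultdict(set)

    def add(self, doc_id: str, fields: Dict[str, str], text: str) -> None:
        """Store the key fields and index every word of the text."""
        self.key_fields[doc_id] = dict(fields)
        for word in re.findall(r"\w+", text.lower()):
            self.full_text[word].add(doc_id)

    def search(self, word: Optional[str] = None, **fields: str) -> Set[str]:
        """Return ids matching the word (if given) and all key-field values."""
        hits = set(self.key_fields)
        if word is not None:
            hits &= self.full_text.get(word.lower(), set())
        for name, value in fields.items():
            hits = {d for d in hits if self.key_fields[d].get(name) == value}
        return hits


if __name__ == "__main__":
    index = DocumentIndex()
    index.add(
        "doc1",
        {"document_type": "invoice", "customer_name": "Acme"},
        "Payment due in 30 days",
    )
    index.add(
        "doc2",
        {"document_type": "claim", "customer_name": "Acme"},
        "Water damage claim, payment requested",
    )
    # Full text narrows by content; key fields narrow by document type.
    print(sorted(index.search(word="payment")))
    print(sorted(index.search(word="payment", document_type="claim")))
```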

Step 6: Quality Control/Assurance

Electronic images must be double-checked. Data can be validated by a second operator or via automated processes like database look-ups. Bad images are flagged and re-scanned.
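
Both validation routes can be sketched briefly. In the example below, the customer table, the field names, and the reading of "second operator" as double-key comparison are assumptions for illustration; the in-memory dictionary stands in for a real database look-up.

```python
from typing import Dict, List

# Hypothetical reference data standing in for a customer database.
KNOWN_CUSTOMERS = {"C-1001": "Acme Ltd", "C-1002": "Globex"}


def validate_record(record: Dict[str, str]) -> List[str]:
    """Check captured data against the look-up table; return any problems."""
    problems: List[str] = []
    customer_id = record.get("customer_id", "")
    expected_name = KNOWN_CUSTOMERS.get(customer_id)
    if expected_name is None:
        problems.append(f"unknown customer_id: {customer_id!r}")
    elif record.get("customer_name") != expected_name:
        problems.append("customer_name does not match database")
    return problems


def double_key_mismatches(first: Dict[str, str], second: Dict[str, str]) -> List[str]:
    """List the fields where two operators' entries disagree."""
    all_fields = first.keys() | second.keys()
    return sorted(f for f in all_fields if first.get(f) != second.get(f))


if __name__ == "__main__":
    captured = {"customer_id": "C-1001", "customer_name": "Acme Ltd."}
    rekeyed = {"customer_id": "C-1001", "customer_name": "Acme Ltd"}
    print(validate_record(captured))
    print(double_key_mismatches(captured, rekeyed))
```

Records that return a non-empty list would be routed to an operator for correction or re-scanning rather than released.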

Output: After going through the capture process, electronic content is released to:

              • Storage
              • Document filing/management system
              • Records management/Retention plan
              • Print
              • Email
              • Workflow/Business process: for example, a customer claim could launch an insurance business process.
              • Enterprise Content Management system: access and collaboration across the enterprise.

Step 7: Backup & Log

Both metadata and images can be backed up to any location. A user activity log captures all activities performed by users, and its entries cannot be edited.
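
One common way to protect a log against editing is to chain entries with hashes, so that any later change becomes detectable. The sketch below shows that technique as an assumption about how such a log could work; it is not taken from any specific product, and it makes edits detectable rather than impossible. Timestamps are left out only to keep the example short.

```python
import hashlib
import json
from typing import List


class ActivityLog:
    """Append-only log in which each entry includes the previous entry's hash."""

    GENESIS_HASH = "0" * 64

    def __init__(self) -> None:
        self._entries: List[dict] = []

    @staticmethod
    def _digest(entry: dict) -> str:
        """Hash the entry's content together with the previous hash."""
        content = {k: entry[k] for k in ("user", "action", "previous_hash")}
        payload = json.dumps(content, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def append(self, user: str, action: str) -> None:
        """Add an entry linked to the one before it."""
        previous_hash = (
            self._entries[-1]["hash"] if self._entries else self.GENESIS_HASH
        )
        entry = {"user": user, "action": action, "previous_hash": previous_hash}
        entry["hash"] = self._digest(entry)
        self._entries.append(entry)

    def verify(self) -> bool:
        """Recompute the chain; False means an entry was altered or removed."""
        previous_hash = self.GENESIS_HASH
        for entry in self._entries:
            if entry["previous_hash"] != previous_hash:
                return False
            if entry["hash"] != self._digest(entry):
                return False
            previous_hash = entry["hash"]
        return True


if __name__ == "__main__":
    log = ActivityLog()
    log.append("operator1", "scanned batch 42")
    log.append("operator2", "indexed batch 42")
    print(log.verify())
    # Simulate tampering with a stored entry.
    log._entries[0]["action"] = "deleted batch 42"
    print(log.verify())
```

Storing the latest hash somewhere outside the log, for example with the backup, lets an auditor confirm that the chain has not been rebuilt from scratch.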