Opening and Saving image files

Opening Image files

API to open image files

The ClearImage API offers several methods to open images from an image file or from an image stream.

  • CiImage.Open (string filename, int pageNumber) to open a specific image page in an image file.
  • CiImage.OpenFromStream (Stream stream, int pageNumber) to open an image page from a stream that holds the contents of an image file.
  • ImageIO.Open (String filename, int pageNumber) returns a .NET Bitmap object. This method is recommended ONLY if the application needs to use the Bitmap object outside of the ClearImage API. Reading barcodes or processing images through this method is inefficient.
    For those purposes, use the BarcodeReader and ImageEditor methods listed below to process the image files.

The ClearImageNet namespace offers methods to read barcodes and process the whole image file or a specific image page:

  • BarcodeReader.Read (string filename), BarcodeReader.Read (string filename, int pageNumber) and BarcodeReader.Read (Stream stream) read barcodes from the specified image file or stream.
  • ImageEditor.Image.Open (string filename, int pageNumber) opens a specific image page in an image file for editing. To read barcodes from an image page opened and pre-processed in ImageEditor, use the BarcodeReader.Read (ImageEditor editor) method.
  • ImageEditor.Edit (string filename, ImageEditor.EditPageEventHandler handler, string outputFile, ImageFileFormat format, bool overwrite) uses a handler method to process each page in an image file and saves the result in an output file.
  • ImageEditor.Edit (Stream stream, ImageEditor.EditPageEventHandler handler, ImageFileFormat format) uses a handler method to process each page in a stream and returns the result in a stream.

Additional methods are described in the API Help to open an image page from Windows Bitmap, Windows DIB, .NET Bitmap object, Clipboard and from uncompressed memory.

Recommended practices

  • To discover the number of pages in a multi-page TIFF or PDF file:
    • COM API: Use oImage.Open (filename, 1) to open the first page, then get the number of pages from the oImage.PageCount property, where oImage is a CiImage object.
    • .NET API: Use ImageInfo info = io.Info (filename); and read info.PageCount, where io is an ImageIO object.
  • To obtain other properties, such as bits-per-pixel, resolution and image size:
    • COM API: Use oImage.Open (filename, pageNumber) and read the oImage properties, where oImage is a CiImage object.
    • .NET API: Use ImageInfo info = io.Info (filename, pageNumber); and read the info properties, where io is an ImageIO object.
  • The most common configurations for obtaining images from a PDF file through the CiImage.Pdf properties are:
    • Automatic (the default): analyze the content of each PDF page and obtain the image through image extraction or rasterization.
      • To set this default behavior explicitly:
        CiImage.Pdf.readMode = epemAuto, CiImage.Pdf.minImageWidth = -98765, CiImage.Pdf.minImageHeight = -98765
      • Configure the CiPdf rasterization properties if behaviors other than the default are desired.
    • Extraction: EXTRACT all the images in the PDF file (as opposed to rasterizing the page). Each image found increments CiImage.PageCount, that is, each image is treated as its own page.
      • To use this mode set: CiImage.Pdf.readMode = epemImage, CiImage.Pdf.minImageWidth = 0, CiImage.Pdf.minImageHeight = 0
      • The image resolution and bits-per-pixel properties for image PDF and MRC (see below) are specified inside the PDF file.
    • Rasterization: rasterize all pages at the specified bits-per-pixel and DPI.
      • To use this mode set: CiImage.Pdf.readMode = epemRaster. Explicitly configure the specific CiPdf rasterization properties if values other than the default are desired.
      • CiImage.Pdf.rasterColorMode controls the color scheme (bits-per-pixel) of the rasterized image: bitonal (bw, 1 bpp), grayscale (gs, 8 bpp) or color (rgb, 24 bpp). The default value is eprmAuto, which analyzes the color content of the page to find the lowest suitable bits-per-pixel value.
      • CiImage.Pdf.dpiRasterBw, CiImage.Pdf.dpiRasterGs and CiImage.Pdf.dpiRasterRgb define the rasterization resolution for each bpp value. The default value of each is 300 dpi.
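
The three PDF configurations can be summarized in a small Python model. This is only a sketch: it does not call ClearImage. The class PdfSettings and the helpers extract_all_images, rasterize_all_pages, raster_dpi and raster_size_pixels are hypothetical names introduced for illustration; only the property names and values in the comments come from the text above. The sketch also assumes that raster mode leaves minImageWidth and minImageHeight at their defaults, which the text does not state.

```python
from dataclasses import dataclass


@dataclass
class PdfSettings:
    """Plain-Python stand-in for the CiImage.Pdf properties (not the real object)."""
    read_mode: str = "epemAuto"          # epemAuto, epemImage or epemRaster
    min_image_width: int = -98765        # value given in the text for the default mode
    min_image_height: int = -98765
    raster_color_mode: str = "eprmAuto"  # default: lowest suitable bits-per-pixel
    dpi_raster_bw: int = 300             # documented default for each dpi property
    dpi_raster_gs: int = 300
    dpi_raster_rgb: int = 300


def extract_all_images() -> PdfSettings:
    """Extraction mode: every embedded image counts as its own page."""
    return PdfSettings(read_mode="epemImage", min_image_width=0, min_image_height=0)


def rasterize_all_pages(**overrides) -> PdfSettings:
    """Raster mode; pass dpi or color overrides only when defaults are not wanted."""
    return PdfSettings(read_mode="epemRaster", **overrides)


def raster_dpi(settings: PdfSettings, color: str) -> int:
    """Resolution that applies to a page rasterized as 'bw', 'gs' or 'rgb'."""
    return {
        "bw": settings.dpi_raster_bw,
        "gs": settings.dpi_raster_gs,
        "rgb": settings.dpi_raster_rgb,
    }[color]


def raster_size_pixels(width_in: float, height_in: float, dpi: int) -> tuple:
    """Pixel size of a rasterized page: inches multiplied by dots per inch."""
    return round(width_in * dpi), round(height_in * dpi)


settings = rasterize_all_pages(dpi_raster_bw=200)
print(raster_dpi(settings, "bw"), raster_size_pixels(8.5, 11, raster_dpi(settings, "rgb")))
```

At the default 300 dpi an 8.5 x 11 inch page rasterizes to 2550 x 3300 pixels, which is useful when estimating memory use before choosing a resolution.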

Working with PDF files

The PDF format is ubiquitous in document imaging. In many cases it has replaced TIFF as the way to store multiple images in a single file. Depending on the application that produced the PDF file, the data can be stored in several ways:

  • Simple image PDF is produced by many scanning devices. Each page of the original document is stored as a single page-size image, compressed with Group 4 or JPEG, in a single page of the PDF file.
  • Multiple raster content (MRC) is produced by newer enterprise scanners and software. The scanned image is analyzed and segmented into multiple images, and each segment is compressed in a way optimized for its contents. E.g. text is recognized as such and may be stored as a bitonal image using TIFF compression, while a picture may be stored as a color JPEG. Some portions can be stored as searchable text or as graphic (non-image) elements. MRC pages must be reconstructed to recover a viewable image.
  • Data PDF is produced by PDF generator utilities or by a PDF printer driver, which convert reports to PDF format for delivery to users or for printing. These PDF files contain primarily text and graphics (such as logos, lines or shapes), and may contain images. Most often the barcodes in these documents are specified by a short text string that references a barcode font; they must be rendered in order to produce the image of the barcode.
    • Data PDF files are converted to an image through rasterization, where the properties of the rasterized image are set through the CiImage.Pdf.rasterColorMode, CiImage.Pdf.dpiRasterBw, CiImage.Pdf.dpiRasterGs and CiImage.Pdf.dpiRasterRgb values.

ClearImage is designed to address ALL the processing challenges of complex PDF documents.

Saving image files

API to save and append image files

The ClearImage API offers several methods to save images to a file or to a stream. The format of the output file and the compression algorithm are specified by the parameters of the method and the object’s properties:

  • CiImage.SaveAs (string filename, EFileFormat format) and CiImage.Append (string filename, EFileFormat format) methods are available in the COM API and the Inlite.ClearImage namespace.
    The properties controlling the compression methods are: CiImage.pComprBitonal, CiImage.pComprColor, CiImage.JpegQuality
  • CiImage.SaveToStream (EFileFormat format) and CiImage.Append (string filename, EFileFormat format) methods are available in the Inlite.ClearImage namespace.
    The properties controlling the compression methods are: CiImage.pComprBitonal, CiImage.pComprColor, CiImage.JpegQuality
  • ImageIO.SaveAs (Bitmap bmp, String filename) and ImageIO.Append (Bitmap bmp, String filename) methods are available in the Inlite.ClearImageNet namespace.
    The properties controlling the compression methods are: ImageIO.compressionBitonalEx, ImageIO.compressionColorEx, ImageIO.jpegQuality
  • ImageEditor.Edit (string inputFile, ImageEditor.EditPageEventHandler handler, string outputFile, ImageFileFormat format, bool overwrite) and
    ImageEditor.Edit (Stream stream, ImageEditor.EditPageEventHandler handler, ImageFileFormat format) are available in the Inlite.ClearImageNet namespace.
    The properties controlling the compression methods are: ImageEditor.Image.pComprBitonal, ImageEditor.Image.pComprColor, ImageEditor.Image.JpegQuality

Additional methods described in API Help are available to save an image page to Windows Bitmap, Windows DIB, .NET Bitmap object, Clipboard and to uncompressed memory.

Recommended practices

  • Use an output file name extension that corresponds to the image file format, e.g. .pdf for PDF files, .tif or .tiff for TIFF files, .jpg for JPEG files, etc.
  • To store bitonal (black-and-white) images, use TIFF G4 compression.
  • For color or grayscale images, use LZW compression. If the application requires a higher level of compression, use JPEG compression with an appropriate JpegQuality setting. JPEG is a lossy algorithm: saving in JPEG format may produce a noticeable "blocking effect", that is, fuzziness where one solid color (e.g. black) meets another (e.g. white). This fuzziness adversely affects barcode recognition as well as OCR. A higher JpegQuality value gives better image quality and a larger file; a lower value gives greater fuzziness. Each time a JPEG-compressed file is opened and saved again, more distortion is added and image quality drops further.
  • To save images in a multi-page file, use the PDF or TIFF image file formats.
  • Set CiImage.pComprBitonal, CiImage.pComprColor, ImageIO.compressionBitonalEx, ImageIO.compressionColorEx explicitly before each save or append operation, if values other than the default are desired.
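
As a rough illustration, the practices above can be turned into a pre-save check. The sketch below is plain Python and does not call ClearImage. The function check_save_plan, its parameters and the labels "G4", "LZW" and "JPEG" are hypothetical names chosen for this example; the extension table contains only the extensions named in the text.

```python
import os

# Extensions the text names for each format; other formats are outside this sketch.
RECOMMENDED_EXTENSIONS = {
    "PDF": {".pdf"},
    "TIFF": {".tif", ".tiff"},
    "JPEG": {".jpg"},
}
MULTI_PAGE_FORMATS = {"PDF", "TIFF"}


def check_save_plan(filename, file_format, bits_per_pixel, compression,
                    page_count=1, needs_high_compression=False):
    """Return a list of warnings; an empty list means the plan follows the practices."""
    warnings = []

    # Practice 1: the file name extension should correspond to the file format.
    ext = os.path.splitext(filename)[1].lower()
    expected = RECOMMENDED_EXTENSIONS.get(file_format)
    if expected is not None and ext not in expected:
        warnings.append(f"extension {ext!r} does not match {file_format}")

    if file_format == "TIFF":
        # Practice 2: bitonal images should use TIFF G4.
        if bits_per_pixel == 1 and compression != "G4":
            warnings.append("bitonal image: use TIFF G4 compression")
        # Practice 3: LZW for color/grayscale unless higher compression is required.
        if bits_per_pixel > 1 and compression == "JPEG" and not needs_high_compression:
            warnings.append("color or grayscale image: prefer LZW; JPEG is lossy")

    # Practice 4: only PDF and TIFF hold multiple pages.
    if page_count > 1 and file_format not in MULTI_PAGE_FORMATS:
        warnings.append("multi-page output: use PDF or TIFF")

    return warnings


print(check_save_plan("scan.jpg", "TIFF", 1, "LZW"))
```

A check like this could run before the save or append call so that a mismatched extension or a lossy setting is caught before any file is written.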

File format and image compression

These rules describe how the file format and image compression are controlled by the method parameters and object properties:

  • Image file format:
    • A file with the extension .pdf is always saved in PDF format.
    • The methods CiImage.SaveAs, CiImage.Append and CiImage.SaveToStream save a file according to:
      • The EFileFormat format parameter
      • For EFileFormat.ciEXT, the format is defined by the output file name extension. TIFF format is used by default if:
        • Output file name extension is not specified (e.g. "somename")
        • Extension is not a well known image file extension (e.g. "somename.abc")
        • CiImage.SaveToStream call is used
    • For the ImageIO.SaveAs and ImageIO.Append methods, the format is defined by the output file name extension. TIFF format is used by default if:
      • Output file name extension is not specified (e.g. "somename")
      • Extension is not a well known image file extension (e.g. "somename.abc")
    • For the ImageEditor.Edit methods, the file format is specified:
      • By the ImageFileFormat format parameter
      • For ImageFileFormat.outputFileExtension, the format is defined by the output file name extension. TIFF format is used by default if:
        • Output file name extension is not specified (e.g. "somename")
        • Extension is not a well known image file extension (e.g. "somename.abc")
        • ImageEditor.Edit (Stream stream, …) call is used.
  • Image compression:
    • If the output image file format is PDF:
      • Bitonal images are compressed as TIFF G4
      • Color and grayscale images are compressed as JPEG
    • If the output image file format is TIFF:
      • For the CiImage.SaveAs, CiImage.Append and CiImage.SaveToStream methods:
        • If the format parameter is EFileFormat.ciTIFF then the image is stored uncompressed
        • If the format is EFileFormat.ciTIFF_nnnn (e.g. ciTIFF_G3_1D) then the TIFF compression is defined by nnnn (e.g. TIFF G3)
        • If the format is EFileFormat.ciEXT
          • Bitonal image compression is defined by the CiImage.pComprBitonal property. The default value specifies TIFF G4 compression.
          • Color/Grayscale image compression is defined by the CiImage.pComprColor property. The default value specifies LZW compression.
        • If JPEG compression is specified, then the CiImage.JpegQuality property determines the compression quality and file size
      • For ImageIO.SaveAs and ImageIO.Append methods:
        • Bitonal image compression is defined by the ImageIO.compressionBitonalEx property. The default value specifies TIFF G4 compression.
        • Color/Grayscale image compression is defined by the ImageIO.compressionColorEx property. The default value specifies LZW compression.
      • For the ImageEditor.Edit methods:
        • Bitonal image compression is defined by the ImageEditor.Image.pComprBitonal property. The default value specifies TIFF G4 compression.
        • Color/Grayscale image compression is defined by the ImageEditor.Image.pComprColor property. The default value specifies LZW compression.
        • If JPEG compression is specified, then the ImageEditor.Image.JpegQuality property determines the compression quality and file size
    • If the image file format is other than TIFF or PDF, then the compression is defined by that file format standard.
  • Only TIFF or PDF file formats can be used to create multi-page image files with CiImage.Append, ImageIO.Append or ImageEditor.Edit methods.
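
The rules above for the CiImage save methods amount to a small decision procedure, sketched below in plain Python. The sketch does not call ClearImage. The functions format_from_extension, resolve_format and resolve_compression, the extension table and the returned labels ("G4", "LZW", "uncompressed", "format-defined") are hypothetical names for this example; only the EFileFormat names ciEXT, ciTIFF and ciTIFF_G3_1D and the default compressions come from the text. Other EFileFormat values are deliberately not modeled.

```python
import os

# Only the extensions named in this document; the real list of well-known
# extensions is longer and is not reproduced here.
KNOWN_EXTENSIONS = {".pdf": "PDF", ".tif": "TIFF", ".tiff": "TIFF", ".jpg": "JPEG"}


def format_from_extension(filename):
    """Format taken from the file name, falling back to TIFF."""
    if filename is None:                      # stream output has no file name
        return "TIFF"
    ext = os.path.splitext(filename)[1].lower()
    return KNOWN_EXTENSIONS.get(ext, "TIFF")  # missing or unknown extension


def resolve_format(filename, requested="ciEXT"):
    """File format for a save call, given the EFileFormat name requested."""
    if filename is not None and filename.lower().endswith(".pdf"):
        return "PDF"                          # .pdf is always saved as PDF
    if requested == "ciEXT":
        return format_from_extension(filename)
    if requested.startswith("ciTIFF"):
        return "TIFF"
    raise ValueError(f"{requested} is not modeled in this sketch")


def resolve_compression(filename, requested, bits_per_pixel,
                        compr_bitonal="G4", compr_color="LZW"):
    """Compression label; the keyword defaults mirror the documented defaults."""
    fmt = resolve_format(filename, requested)
    bitonal = bits_per_pixel == 1
    if fmt == "PDF":
        return "G4" if bitonal else "JPEG"
    if fmt != "TIFF":
        return "format-defined"               # set by that file format's standard
    if requested == "ciTIFF":
        return "uncompressed"
    if requested.startswith("ciTIFF_"):
        return requested[len("ciTIFF_"):]     # e.g. ciTIFF_G3_1D gives G3_1D
    return compr_bitonal if bitonal else compr_color


print(resolve_format("somename.abc"), resolve_compression("scan.tif", "ciEXT", 8))
```

The same fallback to TIFF applies to ImageIO.SaveAs, ImageIO.Append and ImageEditor.Edit when the format is taken from the file name, so format_from_extension models those cases as well.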