Saturday, January 9, 2010

PDF - Yesterday, Today and Tomorrow

Back again... I seriously did not have the time until now, but now that I do, I want to present my thoughts on PDF.

With so many electronic content formats available around the globe, PDF has been compared with every one of them since the 1990s. Many people have posted the view that PDF is not a format that is sustainable in the long run. From the very start PDF has been compared with the likes of Softbook and OeB, and currently with ePUB. People fail to realise that PDF is not just an eBook format.

People also fail to realise that no other format can replicate the paper version of any content the way PDF can. PDF is not just a format, it is a standard by itself. I would rather not compare it with the other eFormats available for electronic distribution.

PDF for me is a universal format. I have been asked numerous times what types of PDFs are available or can be created.

For me there are two major types of PDF files in terms of structure, each further divided into sub types:
  1. Scan PDF
  2. Text PDF
Scan PDF or Bitmap PDF: This is where the PDF is created out of image files. These could be scanned images of paper or digitally photographed content. In this PDF the content on the pages is not searchable. This type is further sub divided into
  • Printable (POD PDF)
    This PDF is used to create the print version of the content. It conforms to a particular standard and is used mainly for commercial production of content. The images used in this PDF are high resolution for good quality reproduction on paper.
  • Non-Printable Scan PDF
    This PDF is generally used only to retain a digital copy of the content in image format. It cannot be used for commercial printing as the quality of reproduction on paper is not good enough. However, it can still be printed at a quality that would not be commercially saleable.
  • On-Line Scan PDF
    This PDF is typically created for online viewing or easy downloads from the internet. These PDF files have very low resolution images, which PCs and other devices can render easily and quickly. The sole purpose of this PDF is to make content available online or in a format that is easily distributable, but not printable.
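
The difference between the three Scan PDF sub types comes down largely to image resolution. As a rough sketch, the effective resolution of a scanned page can be worked out from the image width in pixels and the page width, since PDF page sizes are measured in points at 72 points per inch. The threshold values and the function names below are my own assumptions for illustration only; they are not taken from any standard.

```python
# Hypothetical thresholds, chosen only for illustration.
PRINT_DPI = 300   # assumed minimum for commercial print
SCREEN_DPI = 150  # assumed minimum for an archival (non-commercial) copy

POINTS_PER_INCH = 72.0  # default PDF page unit


def effective_dpi(pixel_width: int, page_width_pt: float) -> float:
    """Resolution of an image that fills the full width of a page."""
    page_width_in = page_width_pt / POINTS_PER_INCH
    return pixel_width / page_width_in


def scan_pdf_kind(dpi: float) -> str:
    """Map a resolution to one of the Scan PDF sub types listed above."""
    if dpi >= PRINT_DPI:
        return "Printable (POD PDF)"
    if dpi >= SCREEN_DPI:
        return "Non-Printable Scan PDF"
    return "On-Line Scan PDF"


if __name__ == "__main__":
    # A US Letter page is 612 points (8.5 inches) wide.
    for width_px in (2550, 1700, 816):
        dpi = effective_dpi(width_px, 612)
        print(width_px, round(dpi), scan_pdf_kind(dpi))
```

For example, a scan that is 2550 pixels wide on an 8.5 inch page works out to 300 dpi, while the same page at 816 pixels wide is only 96 dpi.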

Text PDF: In this PDF file the text is searchable, although the images still are not. The content is usually highly structured or styled, but nothing stops the user from creating a Text PDF from unstructured text.
  • Printable (POD PDF and Traditional Print)
    This PDF is used for commercially viable printing. Hence it is generally used for the creation of Books, Magazines, Journals, Newspapers etc. which can be distributed in print. The content in this PDF is highly structured and styled, mainly to appeal to the reader and make it look good.
  • Non-Printable Text PDF
    This PDF cannot be used for commercial printing, but it is very good for distributing searchable content. The structure or style used in this PDF does not carry much importance, as its reach is very low.
  • On-Line Text PDF
    This PDF is typically created for online viewing or easy downloads from the internet. These PDF files have very low resolution images, which PCs and other devices can render easily and quickly. The sole purpose of this PDF is to make content available online or in a format that is easily distributable, but not printable. In this PDF the structure and style may have relatively high value, as these can be the online versions of commercially printed content such as Books, Magazines, Journals, Newspapers etc.
There is another form of PDF with a very different purpose: the Text Under/Over Image PDF. Here the main content is captured as page images, and a text layer is introduced either under or over the image of each page. The purpose is to retain the structure and style of the content exactly as is while also making it searchable. I see one major reason why this version of the PDF is created: the structure and style of the original content need to be retained, and recreating them with Text PDF creation methods is far more expensive than creating a Scan PDF with a layer of text under or over it.
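
The three structural types described so far can be summed up as a small decision rule. This is only a sketch of the logic: the `PageFeatures` class and its two flags are hypothetical names, and I am assuming that some PDF tool has already worked out whether a page carries a text layer and whether it is covered by a page-sized image.

```python
from dataclasses import dataclass


@dataclass
class PageFeatures:
    """Hypothetical summary of one page, filled in by some PDF tool."""
    has_text_layer: bool       # page contains extractable text
    has_full_page_image: bool  # page is covered by a single page image


def structural_type(page: PageFeatures) -> str:
    """Apply the classification from this post to one page."""
    if page.has_full_page_image and page.has_text_layer:
        return "Text Under/Over Image PDF"
    if page.has_full_page_image:
        return "Scan PDF"
    if page.has_text_layer:
        return "Text PDF"
    return "Unknown"


if __name__ == "__main__":
    print(structural_type(PageFeatures(False, True)))  # scanned page only
    print(structural_type(PageFeatures(True, False)))  # typeset text
    print(structural_type(PageFeatures(True, True)))   # scan plus text layer
```

Note that a Text PDF may still contain ordinary pictures; the flag here is only about an image that stands in for the whole page.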

Secure PDF
Content security carries the utmost importance. Even content that is free to distribute carries rights. A Secure PDF can be created either at the time the PDF itself is created or by using third party DRM servers. Each level of security has its own purpose and carries a certain amount of importance.

PDF as Archive Standard
PDF/A is an ISO standard (ISO 19005) which defines the requirements for creating PDF files for long term archiving. Its first part, PDF/A-1, defines two conformance levels:
  • PDF/A-1a - Level A compliance
  • PDF/A-1b - Level B compliance
More detailed information about these standards is available at http://en.wikipedia.org/wiki/PDF/A
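
A PDF/A file declares its part and conformance level in its XMP metadata, through the pdfaid:part and pdfaid:conformance properties. The sketch below reads those two values from an XMP packet that has already been extracted as text; how the packet is pulled out of the PDF is left out, and the function name `pdfa_level` is my own. It is a simple regular expression match, not a validator, so it only reports what the file claims about itself.

```python
import re
from typing import Optional


def _xmp_property(xmp: str, name: str) -> Optional[str]:
    """Find a pdfaid property written either as an attribute or an element."""
    as_attribute = re.search(
        r'pdfaid:%s\s*=\s*["\']([^"\']+)["\']' % name, xmp)
    if as_attribute:
        return as_attribute.group(1)
    as_element = re.search(
        r'<pdfaid:%s>\s*([^<\s]+)\s*</pdfaid:%s>' % (name, name), xmp)
    if as_element:
        return as_element.group(1)
    return None


def pdfa_level(xmp: str) -> Optional[str]:
    """Return e.g. 'PDF/A-1b' if the XMP packet declares a PDF/A level."""
    part = _xmp_property(xmp, "part")
    conformance = _xmp_property(xmp, "conformance")
    if part is None or conformance is None:
        return None
    return "PDF/A-%s%s" % (part, conformance.lower())


if __name__ == "__main__":
    sample = (
        '<rdf:Description rdf:about="" '
        'xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">'
        '<pdfaid:part>1</pdfaid:part>'
        '<pdfaid:conformance>B</pdfaid:conformance>'
        '</rdf:Description>'
    )
    print(pdfa_level(sample))
```

A file that passes this check still needs a proper PDF/A validator to confirm that it really meets the level it declares.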