What is PDF/A?

PDF/A (Portable Document Format Archival) is a format designed for the preservation of digital records, particularly documents. It can also be used for scanned documents. It is an international standard and a subset of the PDF format. One of the great values of the PDF formats is that they are open standards, used widely across the world, and designed to record both images and machine-readable text in one document.
Uses of PDF/A

PDF/A can be used to store many types of records, but it is most valuable as a format for storing long-term copies of digital textual documents, such as Microsoft Word files. When you convert such a file into a PDF/A, the resulting file retains the look and feel of the original document. Each page of the original document appears as a single page in the preservation file, the same fonts are used in both documents, and you can search the text of the PDF/A just as you could in the original. If the document is in color, the color is preserved as well. For these reasons, PDF/A is a good format in cases where the appearance of the document matters to interpreting and understanding it.

Other digital files may also be converted to PDF/A, including regular PDFs, email, digital images, and spreadsheets. You could even convert a sequence of digital images into one PDF/A. Any digital file that can be printed can be converted to a PDF/A, though this format is better for some documents than others. The format works best for static files that do not change. It is not appropriate for files that are always in flux, such as databases.

Paper documents can also be converted to PDF/A during scanning, but if you do so, it is best to also use optical character recognition (OCR) software to convert the images of letters in the document into electronic text. Whenever you OCR a document, however, there will be some errors in the converted text. (See the State Archives’ 2013 Digital Imaging Guidelines for guidance on scanning and OCR’ing textual documents.)

Advantages of PDF/A

PDF/A has many advantages as a file format for the storage of records with long or permanent retention periods. If you are considering other digital file formats as options for long-term or permanent storage, compare their advantages to those of PDF/A. The advantages of PDF/A given below will serve as a checklist of features necessary in any preservation format.
Note that you will find file formats that have one or even a few of these advantages, but it is the accumulation of them that makes PDF/A a good preservation format. Microsoft Word, for instance, is ubiquitous and long-lived, but it is missing other essential features that would make it a candidate for long-term storage of records.

Not platform-dependent

Since its inception, the PDF format has been accessible across computing platforms, and the PDF/A format has this same advantage. This means that a PDF/A created in a Windows environment will be perfectly readable and usable in a Mac environment, and vice versa.

Ubiquitous

Something ubiquitous is something that you can find everywhere, and the PDF and PDF/A formats are used by hundreds of millions of people across the world every day. The value of this universal use is that PDF is unlikely to die out as a format anytime soon. Also, since PDF/A is merely a subset of PDF, any software product that can read a PDF can read a PDF/A. Adobe distributes its Adobe Reader software for free, allowing everyone to read a PDF/A at no additional cost in software or hardware. (This is downloadable at http://get.adobe.com/reader/.)

Long-lived

The PDF format has been around since 1993, so this format is unlikely to disappear soon. Again, so long as PDF is around, PDF/As will be easy to read and use.

Supporting metadata

To understand a digital file, you often need good metadata to give context to the file. This metadata can include many pieces of information, such as the name of the author and the date of a file. For digital files, metadata is often stored within the file itself, so it is important to be able to save this metadata (and even add to it) when converting one digital file to another. PDF/A is specifically designed to support rich metadata.
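As one illustration of metadata stored inside the file itself, a PDF/A records which version of the standard it claims to follow in its embedded XMP metadata, using the properties pdfaid:part and pdfaid:conformance. The Python sketch below is a minimal example under stated assumptions, not a validator: it assumes the XMP packet is stored uncompressed and is therefore visible in the raw bytes of the file, and the function name pdfa_claim is a hypothetical name chosen for this example. A production tool would use a PDF library and a proper XML parser.

```python
import re

# Matches pdfaid:part / pdfaid:conformance written either as XML attributes
# (pdfaid:part="1") or as elements (<pdfaid:part>1</pdfaid:part>).
_PDFAID = re.compile(
    rb'pdfaid:(part|conformance)\s*(?:=\s*["\']|>)\s*([0-9A-Za-z]+)'
)


def pdfa_claim(data: bytes):
    """Return the (part, conformance) pair claimed in the XMP metadata.

    Returns None when no pdfaid:part property is found. Assumption: the
    XMP packet is readable in the raw bytes (not compressed or encrypted).
    """
    found = {}
    for key, value in _PDFAID.findall(data):
        # Keep the first occurrence of each property.
        found.setdefault(key.decode("ascii"), value.decode("ascii"))
    if "part" not in found:
        return None
    return found["part"], found.get("conformance", "").upper()
```

To try it on a real file, read the file as bytes and pass them in, for example pdfa_claim(open("record.pdf", "rb").read()), where record.pdf is a placeholder name. The result is only what the file claims about itself; confirming that the file actually conforms still requires a validation tool.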
Supporting perfect conversion

The goal of any conversion program, even microfilming and scanning, is to create a new record that is as much like the original as possible. PDF/A is designed to do much in this area: It saves the look and searchability of the original file, and it requires that the original fonts, colors, and layout be preserved in the PDF/A you produce. The PDF/A format does this by being self-contained, meaning that it saves within the file itself all of the information it needs to display the document. (This includes the fonts and the color definitions, which are not always saved in other file formats.)

Open

An open file format is one in which the specifications are available to anyone and where anyone can use those specifications to develop a software product to create and read the file format. PDF/A has been a published ISO standard since its initial release in 2005, so it clearly meets this criterion.

Supports authenticity

In the digital world, even more so than in the analog world, it is important to ensure that records retain their authenticity: that they are not modified after their creation and that they do not come to hold information different from what they originally held. No file format alone can ensure authenticity, but PDF/A supports authenticity by being difficult (though not impossible) to modify and by providing document security (such as digital signatures).

Extensible

Extensible simply means that the readability of a digital file will extend into the future, that a file will not become unreadable as software changes. The PDF/A standard is designed so that the earliest PDF/A will always be readable in the most current PDF viewer. This is assured by the fact that every version of PDF/A is a subset of the one that comes after it, meaning that the PDF/A-3 standard supports all the characteristics of the original PDF/A-1, along with a few extra features.
Disadvantages of PDF/A

Although PDF/A has many advantages, it has disadvantages as well. Not all digital files can be converted to PDF/A. Sometimes this is because the files have features that are forbidden in PDF/A, since there is no known way to preserve those features over time. Such files include documents with audio and video data or JavaScript. PDF/A is also a complex text and image format, and its complexity might become a liability in the future. Finally, some digital files or records are simply not appropriate for conversion to PDF/A. For instance, it is possible to save a website as a PDF/A, but the resulting file would be cumbersome and difficult to use. Since PDF/As require the embedding of any fonts used in the file, they can also be larger than regular PDFs. Despite these drawbacks, PDF/A is overall a good preservation format for most digital documents.

Versions of PDF/A

Since PDF/A is designed to be a format that extends its features over time, there are already a number of different versions of PDF/A (PDF/A-1, -2, and -3). Beyond this, each generation of the format has different conformance levels, which indicate the degree to which a file meets the highest goals of PDF/A.

Defining characteristics of PDF/A

All versions of PDF/A share a certain subset of supported features, which can primarily be boiled down to one idea: each PDF/A file has to be self-contained, holding within itself all the information needed for it to be read as a complete file.

It might seem that all digital files are self-contained, that each one carries everything needed to make it readable as it was intended to be read, but this is not the case. For instance, if you work on a Microsoft Word file at work and then open it at home, it may look much different: If you don’t have the same font at home that you have at work, then Word will substitute the closest font it can find on your computer.
A Word file does not have to store the fonts it uses within itself; instead, it stores only information about the font it uses and then searches for that font in whatever computing environment it is in. A PDF/A, however, must embed all of its fonts within itself, so that it never has to search for the fonts it needs to display itself fully to a user. To save space, the file will store only the subset of the font it needs, so if the file does not have a capital X within it, the information to show that character is not stored in the file.

PDF/As also need to have unlimited legal use of any embedded fonts, because otherwise they cannot be viewed accurately in the future. Some fonts have metadata within them that does not allow them to be used in a PDF or that limits the timeframe in which the font may be legally used. If such fonts are in a document you are trying to convert to a PDF/A, then you will not be able to produce a PDF/A from it.

Besides embedded fonts, a PDF/A also needs device-independent color, which means that the display of color in a file cannot depend on the computing device you use to view it. A PDF/A has to use one of two kinds of color encoding to ensure device independence. These two requirements, embedded fonts and device-independent color, are part of a larger rule that a PDF/A file cannot have any reference to outside content.

Also essential to the definition of a PDF/A are the metadata requirements. Because PDF/As are archival files, they must include metadata describing the file, and the file must identify itself as a PDF/A of a certain version. Since the file extension for a PDF/A is the same as for any kind of PDF (they are all .pdf), the file must store metadata within itself that identifies precisely what version of PDF/A it is.

PDF/A-1 (2005)

ISO Standard 19005-1:2005

PDF/A-1 supports the fewest features of any version of PDF/A.
It does not support transparency (a feature that supports the creation of text shadowing), since the means of supporting transparency over the long term had not yet been worked out. This version also does not support JPEG2000 compression or embedded files, which are supported in all subsequent versions.

PDF/A-1a conformance level
PDF/A-1b conformance level

PDF/A-2 (2011)

ISO Standard 19005-2:2011

PDF/A-2a conformance level
PDF/A-2b conformance level
PDF/A-2u conformance level

PDF/A-3 (2012)

ISO Standard 19005-3:2012

PDF/A-3a conformance level
PDF/A-3b conformance level
PDF/A-3u conformance level

Choosing a version of PDF/A to use

A number of considerations come into play when deciding what version of PDF/A to use, but to some degree any version is fine. If you only have software that will produce a PDF/A-1b and that supports all the features you need, then that is a good choice, and a permanent one. Remember, given the extensibility of the PDF/A series, a file that complies with the first version of PDF/A also complies with all later versions, and there is never a reason to convert a PDF/A to a more recent version of the PDF/A format.

There are a couple of basic rules you can follow in making your choices. The first is that the best conformance level to use is always level a, which will always produce the most accessible file. Barring that, you should choose level u, for its Unicode encoding, but keep in mind that the basic level (b) will almost always be sufficient for your needs. It also makes sense to use the latest version of the series that you can produce, because doing so will allow you to support the greatest number of features.

What might be a more important consideration is your color encoding. If you will need to print out high-quality copies of a document, then you should choose CMYK encoding (which stands for Cyan, Magenta, Yellow, and Black). But if you expect only to be reading your files on a computer screen, then RGB color (for Red, Green, and Blue) is your better choice.

Creating PDF/As

To create a PDF/A, you need a product that can produce PDF/As. One of the most commonly used products is Adobe Acrobat Professional, versions 8 and later. Keep in mind, however, that there are many other software products you can use as well, and some of them have different features that you might find useful.
(For a list of some of these products, see “Appendix A: PDF/A Tools.”) Also, a number of general products, such as the Microsoft Office suite, now include tools that create PDF/As, so you may not need to purchase any new software at all, depending on your needs. If you need to create many PDF/As all at once, though, you’ll need to purchase a product focused on the creation of PDF/As, because those support batch processing, which allows you to convert multiple documents at once.

Conversion Practices

The process of converting a digital file into a preservation file is technically called normalization. In this process, the target format (in this case, PDF/A) has to be one that meets the requirements of a preservation format, so it has to be a format that is not expected to disappear or become unusable in the near future.

Before converting any files, you need to ensure that you have the necessary fonts installed on the computer you are using for the normalization. Without the necessary fonts, you will not be able to create a PDF/A. Of course, this is not an issue when converting a scanned image into a PDF/A.

When to create a PDF/A

You have a choice of when to create a PDF/A, and you may choose to create PDF/As at different points in a record’s life cycle based on your business processes for different records.

At the point of creation
At the point of recordation
At the point of archiving

Scanning from paper

When scanning from paper, you have to set your scanner to create a PDF/A-compliant file. You then scan the document, keeping all pages of the document in one PDF/A, and run OCR text recognition, if needed, to convert the text within the document into searchable digital text.

Converting existing scanned images

If you have existing digital images of text documents to convert to PDF/A, you can use PDF software to conduct OCR text recognition and save the file in your chosen version of PDF/A. Only conformance levels b and u are possible when scanning records, and level u is preferred.

Using the Distiller engine

One method of converting a file to a PDF/A is available only in Adobe Acrobat, and that is the Distiller engine. Distiller is part of Adobe Acrobat, but it runs separately from it. It is usually accessible on the taskbar of your computer. To create a PDF/A with Distiller, you choose the appropriate PDF setting and then save or export the file. The Distiller engine may be a little more convenient sometimes, but it has no other advantages, and it cannot produce a fully accessible file (meaning one that meets conformance level a).

Converting from within proprietary software products

You can also create a PDF/A from within many software products that do much more than create PDFs. These include word-processing, spreadsheet, and page layout software. You can usually create PDF/As by “printing” or saving files to PDF/A, but you must be sure to change the PDF settings to your PDF/A preference. You can also set the software’s default to your preferred settings, for ease of use later on.

Converting from regular PDFs

Many people have stores of regular PDFs that they want to convert to PDF/As for preservation purposes. To do this, you might first have to remove any features that are prohibited in PDF/A, or you can run the conversion and see if any errors occur during the conversion.
If you are using Adobe Acrobat, you will have to use its Preflight function to convert a regular PDF to a PDF/A. Since PDF-to-PDF/A conversions are notoriously unsuccessful, you might want to purchase a product designed for such conversions. The product 3-Heights PDF to PDF/A analyzes files in more detail to afford you a higher success rate in conversion. Still, no product will always be able to produce a PDF/A from regular PDFs.

Quality Control Practices

Any form of reprographics (such as microfilming, imaging, or preservation photocopying) must include a quality control step to ensure that an accurate copy of the original has been produced. The same is true of the process of normalization. There are two basic steps to the quality control of a PDF/A.

First, you must visually inspect the document to ensure that the new file looks just like the old file. If the conversion has somehow gone awry, you should be able to see this in the file and then repeat your conversion process, after rechecking your settings and methodology.

The second step in quality control is to validate the created files’ conformance to the version of the PDF/A standard you are using. To do this, you can use any of a number of validation tools, including Adobe Acrobat’s Preflight function. For a list of such products, see “Appendix B: PDF/A Validation Tools.”

Ensuring Preservation

The preservation of records involves much more than simply creating PDF/As. It requires much work over time, and constant vigilance. You must develop solid conversion procedures, followed by good quality control practices. You will have to create and maintain metadata on the files to make them accessible and usable. You will need to ensure that your environmental controls are good for the storage of electronic files and that your data management controls (especially backup procedures) are sensible and consistent.
And you’ll have to ensure one other fact: that your chosen file format for storage remains a valid preservation format. Right now, PDF/A is a good format for the long-term storage of documents, particularly digital textual documents, but that might not be the case ten years from now.

Appendix A: PDF/A Tools

Adobe Acrobat
Apago
Callas
Compart
PDFlib
PDF Tools AG
Luratech

Appendix B: PDF/A Validation Tools

Adobe Acrobat Preflight Function
Callas Software’s pdfaPilot
PDF Tools AG’s 3-Heights PDF Validator

Appendix C: Additional Resources

General PDF Resources

Issued 8/08/2013