MIT - Academic Computing

Information technology evolves rapidly. While some of the information on these pages may still be valid, some of it is also outdated and will not be revised. For assistance with a computing problem, please contact the Computing Help Desk.

PDF FAQ

PDF (Portable Document Format) is a file format developed by Adobe Systems for distributing documents cross-platform, with original formatting intact and independent of the application software used to create them. Reading PDF files requires a viewing program, which can be set up for use within a graphical web browser (on Athena, the Adobe Acrobat Reader is part of the standard Netscape configuration and is also available as a stand-alone application).

While PDF is a highly convenient and popular format, individual PDF files can be problematic if not created carefully. The information below is intended to help you create faithful reproductions of your original documents as PDF files which behave properly when printed or viewed on other systems.

Common Problems

Background information

Creating PDF

Using PDF effectively on the web



Q. What is PDF? Why use it?  

A. PDF (Portable Document Format) is a cross-platform file format developed by Adobe Systems for delivering documents electronically in a device- and resolution-independent manner, with original formatting intact. That is, the PDF version of a document should look just like the original, but it can be viewed or printed without access to the application or system used to create it.

Some of the key features of PDF can be highlighted by comparing it to two common cross-platform delivery mechanisms: HTML and PostScript.

How PDF differs from HTML:

How PDF differs from PostScript:

The above is meant to indicate only some basic characteristics of PDF; for further details and technical specifications, see the information on the web about PDF.

Q. What is Adobe Acrobat? Where can I get it?  

A. Acrobat is the Adobe software suite for creating and viewing PDF files (for complete product information, see Adobe's Acrobat page). The basic components of the suite are:

For non-Athena machines, I/S provides a step-by-step guide to downloading and installing Acrobat Reader. The full suite can be purchased at an MIT discount price through govconnection.

On Athena, the full Acrobat suite is available on Suns, and the Reader is available on SGIs (Adobe does not currently provide SGI versions of the rest of the suite). Acrobat Reader is configured as part of the standard Netscape setup (i.e., if you click on a PDF link in a web page, it will launch automatically in the browser window). To launch it separately:

     athena% add acro; acroread filename.pdf
or leave off the file name, and use File-->Open to locate the file once the Reader is up.

The other programs on Athena Suns are distill and acroexch (Acrobat Distiller and Exchange, respectively). For more information:

     athena% add acro
     athena% more /mit/acro/README.athena
     athena% distill -help
     athena% acroexch -help


Q. How can I create PDF files?  

A. On Athena, you can convert PostScript files to PDF files with Acrobat Distiller, or you can use the free Ghostscript PS-to-PDF converter; if you have TeX/LaTeX files, you can now convert them directly to PDF with pdftex. If you purchase the full Acrobat suite for Windows or Macintosh, you can use either Distiller or PDFWriter to create PDF files (see the answer on differences between PDFWriter and Distiller).

If you are using Adobe FrameMaker on Athena, you can save the document directly as a PDF file.

Keep in mind that while it is very easy to create a PDF file, it takes a little care to make sure it looks good and won't cause problems when posted on the web. Please see the guidelines below, and feel free to contact the Faculty Liaisons for advice relating to course pages, or CWIS for general help.

On Athena, Distiller is available on Suns only (Adobe does not provide an SGI version). To use it, type:

     athena% add acro
     athena% distill filename.ps
where "filename.ps" is your PostScript file. This will create a PDF file named "filename.pdf", which you can then link from your web page as you would any other file. For help on distill options, type:
     athena% distill -help

Note: if you aren't sure how to create a PostScript file from a specific application, try using its print command, but specify "to file" rather than "to printer" and supply a filename ending in ".ps".

Ghostscript has a free PS-to-PDF converter, which is available on Athena in the gnu locker (the version currently installed is distributed under the GNU General Public License). To use it:

     athena% add gnu; ps2pdf filename.ps

For more information, see the ps2pdf man page, READMEs in the gnu locker, or the Ghostscript home page.
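
If you have several PostScript files to convert, the ps2pdf step can be wrapped in a small shell loop. The following is only a sketch: the function names pdf_name_for and convert_all are made up for illustration, and it assumes ps2pdf is already on your path (on Athena, after "add gnu").

```shell
#!/bin/sh
# Sketch: convert every .ps file in a directory to PDF with ps2pdf.
# Assumption: ps2pdf is on the PATH (on Athena, run "add gnu" first).

# pdf_name_for FILE.ps: print the matching .pdf name (hypothetical helper).
pdf_name_for() {
    printf '%s\n' "${1%.ps}.pdf"
}

# convert_all DIR: run ps2pdf on each DIR/*.ps (hypothetical helper).
convert_all() {
    for ps in "$1"/*.ps; do
        [ -e "$ps" ] || continue      # the directory has no .ps files
        pdf=$(pdf_name_for "$ps")
        ps2pdf "$ps" "$pdf" || echo "conversion failed: $ps" >&2
    done
}

# Example use (not run automatically):
#   convert_all "$HOME/www/handouts"
```

Each PDF is written next to its PostScript source, so the results can be linked from a web page in the same way as a single converted file.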

To save PDF files on Athena using FrameMaker, launch it:

     athena% add frame; maker &

Go to the File menu and select Save As. In the format menu, select PDF from the drop-down list, then click Save.



Q. What is the difference between using PDFWriter and Distiller? 

A. Acrobat Distiller is Adobe's full-feature PostScript-to-PDF converter. Distiller allows you to control options for the PDF you generate, but requires that you start with a PS file. PDFWriter is a printer driver which allows you to use the print command within an application (for example, a word processor, spreadsheet, or PowerPoint) to create a PDF file directly from your original; it is available only for Macintosh and Windows. While convenient for direct conversion of simple documents, PDFWriter lacks some features of Distiller and doesn't work well for some types of documents.

In general, if you don't get satisfactory results with PDFWriter, try Distiller instead. In particular, PDFWriter may not always successfully process documents that contain Encapsulated PostScript (EPS) artwork or images, or documents that use features available only with PostScript printers. (With EPS it uses the bitmap preview image that accompanies the EPS file instead of the EPS graphic itself.)



Q. Before I get started, are there any caveats about posting PDF on the web? 

A. Two important things to think about are printing issues (even if you don't intend for what you post to be printed) and the accessibility of your materials to the disabled, particularly the visually impaired.

Printing issues

1. Put it on a web page and it will be printed. While it may be more convenient to put up files electronically than to distribute paper copies to students, you should anticipate that anything you make available on the web will be printed at some point by some portion of the class.

2. PDF files may cause printing problems if not properly generated. When you print a PDF file, Acrobat converts it into PostScript and sends the converted file to the printer. Depending on the choices made at each step of the process (creation of the original document, conversion to PDF, and finally the print command), the PostScript file is sometimes generated very inefficiently and becomes far larger than it needs to be. When such a file is sent to the printer, it can drastically slow down printing and make server resources unavailable to other users.

3. Athena printers are a shared resource. When students print very long documents, excessively large files, or multiple copies of the same file, it can block printing for everyone else in a cluster (or on the same print server) for a significant period of time, and may require I/S staff to intervene before normal service is restored. Also keep in mind that printing costs money (for supplies, printer upkeep, server resources, and staff support); for the Institute as a whole, printing is always more expensive per page than photocopying.

As a general rule of thumb, you should provide hardcopy (either by distributing photocopies in class or through CopyTech or your dept. offices) of any document longer than a few pages, or for use by a class of more than 10 students. Problems specific to PDF files can be avoided by following the Guidelines for generating PDF documents and by posting a note on your web page(s) to explain that: 

Accessibility for the visually impaired

To access information on the web, blind and visually impaired users rely on tools such as speech synthesis software which can read text (including HTML) out loud to them; because such software processes text but not graphics or other formats, simple HTML is obviously the best choice for web accessibility. If you do choose to use PDF or another format, you can make your web materials accessible by also providing the content in either HTML or plain text form.

Adobe offers a tool (Access.Adobe.Com) which converts PDF into HTML or text for this purpose. Note that this is not the same as a regular PDF-to-HTML converter, which would aim to preserve the visual appearance of the document as far as possible; Adobe Access focuses on stripping out graphics elements, and arranging the text content into an approximate "reading order". For documents with complex visual layout or graphics, any conversion tool may provide poor results.

There are several places on the web where you can find more information about designing accessible web materials.




Q. What guidelines should I follow when generating PDF documents?

A. Providing good quality PDF files depends upon two things: the characteristics of your originals, and the options you use when converting them to PDF.

Optimizing original documents

The importance of fonts

When Acrobat Reader opens a PDF file, it renders the characters it contains in one of three ways: from fonts installed on the system, from fonts embedded in the PDF file, or by creating substitute fonts. (To see exactly how each font is handled, you can view the PDF file's font information in Acrobat Reader). The type of font you use can affect how text in your PDF file appears and prints. (It can also affect whether the text is searchable using the Acrobat viewer's search feature, but note that this is independent of searching PDF content with an internet search engine.)

If the characters in a PDF file display poorly, it is usually due to the choice of fonts in the original. In particular, if Acrobat Reader can't find the font installed on the system where the file is being displayed, and the font isn't embedded in the PDF file itself, it uses a substitute font, which may be a poor approximation of the original. (Installed and embedded fonts, by contrast, should match the appearance of the original on whatever system they are displayed.) When creating PDF files, you can avoid font problems and make sure your document's appearance is preserved cross-platform by taking a few simple steps.

TeX/LaTeX originals

There are two options (as of the summer 2000 Athena 8.4 release):

The following information applies if you are using a machine which has not been updated to Athena 8.4.

BaKoMa fonts are installed on Athena and can be used by creating a ~/.dvipsrc file consisting of the following line (or adding it to your existing file):

     p +/afs/athena.mit.edu/contrib/tex-contrib/BaKoMa/fontmap.map
If you don't wish to use these substitutions every time you use DVIPS, you can either move aside the .dvipsrc file before running it, or leave the above line out of .dvipsrc and instead create a file named ~/.config.bakoma consisting of the same line as above and then use the syntax:

     athena% dvips -P bakoma foo.dvi
when you want DVIPS to use the BaKoMa fonts. 
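
If you would rather not edit the file by hand, the map line can be appended from the shell. This is a minimal sketch: the function name add_bakoma_line is made up for illustration, and the only thing taken from this page is the map line itself.

```shell
#!/bin/sh
# Sketch: add the BaKoMa map line to a dvips config file exactly once.

# The map line quoted in this FAQ.
BAKOMA_LINE='p +/afs/athena.mit.edu/contrib/tex-contrib/BaKoMa/fontmap.map'

# add_bakoma_line FILE: append the line unless FILE already contains it
# (hypothetical helper name).
add_bakoma_line() {
    touch "$1"
    grep -qxF -- "$BAKOMA_LINE" "$1" || printf '%s\n' "$BAKOMA_LINE" >> "$1"
}

# Example use (not run automatically):
#   add_bakoma_line "$HOME/.dvipsrc"          # use BaKoMa every time
#   add_bakoma_line "$HOME/.config.bakoma"    # use only with "dvips -P bakoma"
```

Running the function a second time leaves the file unchanged, so it is safe to reuse in a login script.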

For more information about posting TeX/LaTeX files on the web, see:

Originals from a word processor or other application (including spreadsheets and PowerPoint slides)

Acrobat Reader includes a set of standard fonts (called the "Base 13" Type 1 fonts) on all platforms: 

These fonts are always available, and should display properly on other systems. Unless you have a need for a particular font (for example, for special symbols), sticking to the Base 13 gives best results. Keep in mind that any other fonts you use may be rendered on other systems by "substitute" fonts which may display or print poorly unless you're able to embed them when you convert to a PDF file (see converting to PDF, below).

Links to more information about fonts and PDF.

Scanned documents

If you are starting with a scanned document, take a look at the tips in Kai's Power Tip #4: Clean up faxes & scans.

Converting to PDF

Note: it is best to retain your original documents for future changes; while the full Acrobat suite has some features for modifying PDF files, for complete flexibility you would generally want to edit your originals and then convert them to new PDF files. To conserve space, we advise discarding any intermediate files used in the PDF creation process (e.g., TeX users should discard .dvi and .ps files, and retain only the original .tex file and final PDF version).
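
For TeX users, the conversion and cleanup described in the note above might be scripted roughly as follows. This is a sketch, not a tested Athena recipe: the function name tex_to_pdf is made up, and it assumes latex, dvips, and ps2pdf are on your path and that you run it from the directory containing the .tex file.

```shell
#!/bin/sh
# Sketch: LaTeX -> DVI -> PostScript -> PDF, then discard the intermediates.
# Assumptions: latex, dvips and ps2pdf are on the PATH, and the .tex file
# is in the current directory.

# tex_to_pdf FILE.tex (hypothetical helper name)
tex_to_pdf() {
    base=$(basename "$1" .tex)
    latex "$base.tex" &&
    dvips -o "$base.ps" "$base.dvi" &&
    ps2pdf "$base.ps" "$base.pdf" &&
    rm -f "$base.dvi" "$base.ps"    # keep only the .tex original and the PDF
}

# Example use (not run automatically):
#   tex_to_pdf handout.tex
```

The intermediate files are removed only if every step succeeds, so a failed run leaves the .dvi or .ps file in place for inspection.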

File size 

There are two issues here: number of pages in document, and the actual size of the file which Acrobat will send to the printer.

  1. We recommend that individual PDF files be limited to about 10 pages in length. While Acrobat Exchange can be used to chop up an existing PDF file, it's usually easier to split the original document into smaller chunks before converting it to PDF. If it is a complex document, there are table-of-contents and linking features that can be used for easy navigation.

  2. Ideally, you should not only keep the number of pages small, but also check whether your PDF files are suitable for Athena printing (i.e., how big the PostScript file that Acrobat sends to the printer will be). If this sounds too complicated or time-consuming, we're happy to take a look at what you plan to post and advise you of any problems we find. Otherwise:
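
One rough way to estimate the print size yourself is to convert the PDF to PostScript in /tmp and look at how large the result is. The sketch below uses Ghostscript's pdf2ps, so the number is only a proxy for what Acrobat would actually send to the printer. The function names and the 5120 KB default limit are assumptions made for illustration; this page does not give an official size limit.

```shell
#!/bin/sh
# Sketch: estimate how much PostScript a PDF turns into when printed.
# Assumption: pdf2ps (Ghostscript) is on the PATH; its output only
# approximates what Acrobat would generate.

# size_kb FILE: print the size of FILE in kilobytes, rounded up.
size_kb() {
    bytes=$(wc -c < "$1")
    echo $(( ($bytes + 1023) / 1024 ))
}

# check_print_size FILE.pdf [LIMIT_KB] (hypothetical helper name)
# Returns 1 if the PostScript is larger than LIMIT_KB, 2 if conversion fails.
check_print_size() {
    limit=${2:-5120}              # hypothetical default, not an Athena rule
    ps=/tmp/printcheck.$$.ps      # temporary file, outside your quota
    pdf2ps "$1" "$ps" || return 2
    kb=$(size_kb "$ps")
    rm -f "$ps"
    if [ "$kb" -gt "$limit" ]; then
        echo "$1: $kb KB of PostScript, over the $limit KB limit"
        return 1
    fi
    echo "$1: $kb KB of PostScript"
}

# Example use (not run automatically):
#   check_print_size lecture3.pdf
```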

Acrobat compatibility setting 

The newer versions of Acrobat Distiller offer different "compatibility options": selecting 2.1-compatibility will allow your PDF files to be viewed by people who are using an Acrobat 2.1 or earlier viewer; selecting 3.0-compatibility will optimize your PDF files, but they won't be readable from viewers older than 3.0. 

Dealing with fonts 

There are three options for handling each font in the file: 

1. Embed the font.  This "packs" the font into the PDF file itself, so the PDF Reader can recreate it on other systems for display faithful to the original.  This makes the PDF file larger, and is not necessary for fonts which are already installed on the system (e.g., the Base 13 Type 1 fonts, which are installed by Acrobat Reader on every system).  

2. Subset the font.  This embeds just the characters you're using in the PDF file, rather than the entire font.  If you are only using a limited number of characters from a nonstandard font, this will take up less space than embedding. Distiller and PDFWriter are configured with a threshold value for subsetting, and will automatically embed the entire font if you specify subsetting when more than the threshold (usually 35%) of the font's characters are used. 

3. Don't embed at all.  If a font is not embedded, but is installed on the system where the PDF is viewed, it should appear correctly.  If it's not installed, it will be rendered by a substitute font, which may look poor on other systems or when printed.

Note that it is not always legal to embed/subset a font (like any software, fonts may be freeware, shareware, or commercially licensed and have restrictions on redistribution). However, it is permissible to embed most Type 1 fonts.

Checking fonts used in a PDF file 

To view information on how fonts are being used in a PDF file:

  1. Open the PDF file in Acrobat Reader. To do this on Athena, type:

    athena% add acro; acroread filename.pdf
  2. Go to the File menu, select Document Properties, and then Fonts.
This will show you information about the original fonts and any embedded or substitute fonts. Note that it lists only the fonts seen so far in the document; click on "List All Fonts" to see information for the entire document.

How to embed or subset fonts

Athena 
The system-wide default is to subset fonts, but if you have customized settings and need to specify it manually:

     athena% add acro
     athena% distill -subsetfonts on filename.ps
For options:
     athena% distill -help
     athena% distill -help fonts

Mac
Distiller:  Distiller-->Job Options, Font Embedding tab. (For drag-and-drop, hold down the Command key to bring up the dialog box for specifying settings.) 

PDFWriter: File-->Page Setup, click the Fonts button.  Older versions may require you to hold down the Ctrl key while choosing File-->Page Setup to bring up Fonts.

Windows
Distiller: Distiller-->Settings, Job Options, Fonts tab. (For drag-and-drop, hold down the Ctrl key to bring up the dialog box for specifying settings.)
 
By default, Distiller and PDFWriter treat fonts as follows:

adds description (doesn't embed):
Type 1 fonts, ISO Latin characters (alphabet, numbers, punctuation) 
 
embeds:
Type 1 fonts, non-ISO Latin characters (symbols, ligatures) 
Type 3 fonts
Mac OS bitmap fonts (after converting them to Type 3 bitmaps)   

converts to bitmap images: (will be viewed as graphics, not text)
Windows outline fonts 
PCL fonts

For more information about fonts and PDF, see:

Available Font List Is Incomplete in PDF Writer 2.1 and Later (info. on base 13 fonts)
Adobe Document no. 314618
Note: if any of these links do not take you to the named document, go to Adobe's main page, hit the Search button, and type in the document number shown here.

Q. What should I tell my students about viewing/printing PDF files? 

A. Viewing and navigation tips: 

Magnification: When viewing a PDF file, the first thing you can do to improve readability is to set the page view.  Use the choices at the top of the View menu; be sure to try the "Fit Width" and "Fit Visible" settings as well as specific magnifications.  The three "page" icons on the right side of the toolbar at the top of the viewer can also be used to switch between 100%, Fit Page, and Fit Width, respectively.

Navigation: You can use the right scroll bar to move around a page or between pages, one page at a time.  To jump to a particular page number, use Documents-->Go to Page or click on the box containing the page number located at the bottom left hand side; you can type the page number in there.  

Printing guidelines: Please advise students to follow the general Athena printing guidelines, as well as the specific Acrobat Reader tips below.

Summary of Athena printing guidelines (for full details, see the posters in clusters or the web page Tips on Printer Etiquette):

  1. Make single, not multiple copies, of what you print.
  2. Print only the pages that you need.
  3. Preview before you print.
  4. Use "lpq" to check the print queue.
  5. Know how to kill your print job.
  6. Print PDF files using "Level 2 Only" and "Download Fonts Once".

Tips for Printing from Acrobat Reader on Athena

When printing from Acrobat Reader on Athena, you should always check the following settings in the Print dialog box:

Printing from the command line 

To print a PDF file from the command line, first convert it to PS with Acrobat Exchange or the Ghostscript pdf2ps translator, then print the PS file with standard lpr commands.  Remember to delete the PS file after printing, to save space; note that when converting, you can save the PS file directly to /tmp (temporary storage) rather than your account, so it won't use up your quota. 

  1. Convert from PDF to PS:

       athena% add acro
       athena% acroexch -toPostScript -level2 -fast filename.pdf

Or to specify a different name/path:

       athena% acroexch -toPostScript -level2 -fast -pairs filename.pdf /tmp/newname.ps
Note: the -level2 and -fast flags correspond to the "Level2 PostScript Only" and "Download Fonts Once" Acrobat Reader settings explained above.

Alternatively, use the Ghostscript translator:

       athena% add gnu
       athena% pdf2ps filename.pdf /tmp/newname.ps
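
The whole convert, print, and clean up sequence can also be wrapped in one shell function. This is a sketch only: the name print_pdf is made up, it uses the Ghostscript pdf2ps route rather than acroexch, and it prints to your default printer with plain lpr.

```shell
#!/bin/sh
# Sketch: print a PDF from the command line via a temporary PostScript file.
# Assumptions: pdf2ps (Ghostscript) and lpr are on the PATH, and the
# default printer is the one you want.

# print_pdf FILE.pdf (hypothetical helper name)
print_pdf() {
    ps=/tmp/$(basename "$1" .pdf).$$.ps   # in /tmp, so it does not use quota
    pdf2ps "$1" "$ps" || return 1
    lpr "$ps"
    status=$?
    rm -f "$ps"                           # delete the PS file after printing
    return $status
}

# Example use (not run automatically):
#   print_pdf filename.pdf
```

The temporary file is removed whether or not lpr succeeds, and the function returns lpr's exit status so a failed print job can be detected.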
       



Q. Are PDF files searchable? 

A. Your PDF files should be searchable by standard internet search engines as long as you keep them in binary format, which is the default for PDF files generated on Athena using distill. The alternative "asciipdf" format is 7-bit ASCII, which is not supported by all search engines (in addition, asciipdf files are larger than binary pdf).

Whether all of your text is searchable from the search feature within Acrobat programs depends on your choice of fonts. The short answer is that Type 1 fonts are always searchable. For more details, see:

How Acrobat Distiller and PDF Writer Handle Fonts, section on "How Font Types Affect Text in PDF Files"
Adobe doc. 319266
Note: if this link does not take you to the named document, go to Adobe's main page, hit the Search button, and type in the document number shown here.

Q. Where can I find more information about PDF and related topics? 

A. More information is available from the following sites.

Adobe Systems is the developer of PS, PDF, and the Acrobat software:

PDFzone.COM is an independent company which provides information for PDF users and developers:

Ghostscript is an interpreter for PS and PDF; free software based on Ghostscript can be used to view and print both formats, but availability and quality vary by platform.

Information on making web materials accessible to those with disabilities:

Last modified: May 2003