PDF (Portable Document Format) is a file format developed by Adobe Systems for distributing documents cross-platform, with original formatting intact and independent of the application software used to create them. Reading PDF files requires a viewing program, which can be set up for use within a graphical web browser (on Athena, the Adobe Acrobat Reader is part of the standard Netscape configuration and is also available as a stand-alone application).
While PDF is a highly convenient and popular format, individual PDF files can be problematic if not created carefully. The information below is intended to help you create faithful reproductions of your original documents as PDF files which behave properly when printed or viewed on other systems.
A. PDF (Portable Document Format) is a cross-platform file format developed by Adobe Systems for delivering documents electronically in a device- and resolution-independent manner, with original formatting intact. That is, the PDF version of a document should look just like the original, but it can be viewed or printed without access to the application or system used to create it.
Some of the key features of PDF can be highlighted by comparing it to two common cross-platform delivery mechanisms: HTML and PostScript.
How PDF differs from HTML:
How PDF differs from PostScript:
A. Acrobat is the Adobe software suite for creating and viewing PDF files (for complete product information, see Adobe's Acrobat page). The basic components of the suite are:
On Athena, the full Acrobat suite is available on Suns, and the Reader is available on SGIs (Adobe does not currently provide SGI versions of the rest of the suite). Acrobat Reader is configured as part of the standard Netscape setup (i.e., if you click on a PDF link in a web page, it will launch automatically in the browser window). To launch it separately:
athena% add acro; acroread filename.pdf
or leave off the file name, and use File-->Open to locate the file once the Reader is up.
The other programs on Athena Suns are distill and acroexch (Acrobat Distiller and Exchange, respectively). For more information:
athena% add acro
athena% more /mit/acro/README.athena
athena% distill -help
athena% acroexch -help
A. On Athena, you can convert PostScript files to PDF files with Acrobat Distiller, or you can use the free Ghostscript PS-to-PDF converter; if you have TeX/LaTeX files, you can now convert them directly to PDF with pdftex. If you purchase the full Acrobat suite for Windows or Macintosh, you can use either Distiller or PDFWriter to create PDF files (see the answer on differences between PDFWriter and Distiller).
If you are using Adobe FrameMaker on Athena, you can save the document directly as a PDF file.
Keep in mind that while it is very easy to create a PDF file, it takes a little care to make sure it looks good and won't cause problems when posted on the web. Please see the guidelines below, and feel free to contact the Faculty Liaisons for advice relating to course pages, or CWIS for general help.
On Athena, Distiller is available on Suns only (Adobe does not provide an SGI version). To use it, type:
athena% add acro
athena% distill filename.ps
where "filename.ps" is your PostScript file. This will create a PDF file named "filename.pdf", which you can then link from your web page as you would any other file. For help on distill options, type:
athena% distill -help
Note: if you aren't sure how to create a PostScript file from a specific application, try using its print command, but specify "to file" rather than "to printer" and supply a filename ending in ".ps".
Ghostscript has a free PS-to-PDF converter, which is available on Athena in the gnu locker (the version currently installed is distributed under the GNU General Public License). To use it:
athena% add gnu; ps2pdf filename.ps
For more information, see the ps2pdf man page, READMEs in the gnu locker, or the Ghostscript home page.
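The two Athena conversion routes described above can be wrapped in a small shell function. This is only a sketch: the names pdf_name_for and ps_to_pdf are hypothetical, and it assumes you have already run "add acro" or "add gnu" so that distill or ps2pdf is on your path.

```shell
# Hypothetical helpers, not part of Athena, Acrobat, or Ghostscript.

# Derive the PDF name for a PostScript file: filename.ps -> filename.pdf
pdf_name_for() {
    printf '%s.pdf\n' "${1%.ps}"
}

# Convert with whichever converter is on the path, preferring distill.
ps_to_pdf() {
    if command -v distill >/dev/null 2>&1; then
        distill "$1"
    elif command -v ps2pdf >/dev/null 2>&1; then
        ps2pdf "$1" "$(pdf_name_for "$1")"
    else
        echo "no PS-to-PDF converter found; try 'add acro' or 'add gnu'" >&2
        return 1
    fi
}
```

With either converter available, "ps_to_pdf filename.ps" should leave filename.pdf next to the PostScript file.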
To save PDF files on Athena using FrameMaker, launch it:
athena% add frame; maker &
Go to File and select Save As. In the format menu, select PDF from the drop-down list. Click on Save.
A. Acrobat Distiller is Adobe's full-feature PostScript-to-PDF converter. Distiller allows you to control options for the PDF you generate, but requires that you start with a PS file. PDFWriter is a printer driver which allows you to use the print command within an application (for example, a word processor, spreadsheet, or PowerPoint) to create a PDF file directly from your original; it is available only for Macintosh and Windows. While convenient for direct conversion of simple documents, PDFWriter lacks some features of Distiller and doesn't work well for some types of documents.
In general, if you don't get satisfactory results with PDFWriter, try Distiller instead. In particular, PDFWriter may not always successfully process documents that contain Encapsulated PostScript (EPS) artwork or images, or documents that use features available only with PostScript printers. (With EPS it uses the bitmap preview image that accompanies the EPS file instead of the EPS graphic itself.)
A. Two important things to think about are printing issues (even if you don't intend for what you post to be printed) and the accessibility of your materials to the disabled, particularly the visually impaired.
1. Put it on a web page and it will be printed. While it may be more convenient to put up files electronically than to distribute paper copies to students, you should anticipate that anything you make available on the web will be printed at some point by some portion of the class.
2. PDF files may cause printing problems if not properly generated. When you print a PDF file, Acrobat converts it into PostScript and sends the converted file to the printer. Depending on the choices made at each step of the process (creation of the original doc., conversion to PDF, and finally the print command), the PS file is sometimes generated very inefficiently, and becomes much, much larger than it needs to be. When such a file is sent to the printer, it can drastically slow down printing and make server resources unavailable to other users.
3. Athena printers are a shared resource. When students print very long documents, excessively large files, or multiple copies of the same file, it can block printing for everyone else in a cluster (or on the same print server) for a significant period of time, and may require I/S staff to intervene before normal service is restored. Also keep in mind that printing costs money (for supplies, printer upkeep, server resources, and staff support); for the Institute as a whole, printing is always more expensive per page than photocopying.
As a general rule of thumb, you should provide hardcopy (either by distributing photocopies in class or through CopyTech or your dept. offices) of any document longer than a few pages, or for use by a class of more than 10 students. Problems specific to PDF files can be avoided by following the Guidelines for generating PDF documents and by posting a note on your web page(s) to explain that:
Accessibility for the visually impaired:
To access information on the web, blind and visually impaired users rely on tools such as speech synthesis software which can read text (including HTML) out loud to them; because such software processes text but not graphics or other formats, simple HTML is obviously the best choice for web accessibility. If you do choose to use PDF or another format, you can make your web materials accessible by also providing the content in either HTML or plain text form.
Adobe offers a tool (Access.Adobe.Com) which converts PDF into HTML or text for this purpose. Note that this is not the same as a regular PDF-to-HTML converter, which would aim to preserve the visual appearance of the document as far as possible; Adobe Access focuses on stripping out graphics elements, and arranging the text content into an approximate "reading order". For documents with complex visual layout or graphics, any conversion tool may provide poor results.
There are several places on the web where you can find more information about designing accessible web materials.
A. Providing good quality PDF files depends upon two things: the characteristics of your originals, and the options you use when converting them to PDF.
When Acrobat Reader opens a PDF file, it renders the characters it contains in one of three ways: from fonts installed on the system, from fonts embedded in the PDF file, or by creating substitute fonts. (To see exactly how each font is handled, you can view the PDF file's font information in Acrobat Reader). The type of font you use can affect how text in your PDF file appears and prints. (It can also affect whether the text is searchable using the Acrobat viewer's search feature, but note that this is independent of searching PDF content with an internet search engine.)
If the characters in a PDF file display poorly, it is usually due to the choice of fonts in the original. In particular, if Acrobat Reader can't find the font installed on the system where the file is being displayed, and the font isn't embedded in the PDF file itself, it uses a substitute font which may be a poor approximation of the original (unlike installed and embedded fonts, which should match the appearance of the original on whatever system they are displayed). When creating PDF files, you can avoid font problems and make sure your document's appearance is preserved cross-platform by taking a few simple steps.
athena% pdflatex filename.tex
athena% pdftex filename.tex
This produces filename.pdf in the current directory.
Note that included PostScript files aren't currently supported (pdflatex will just omit them); for such files, use the method below instead. Formats which will work with pdflatex for graphics are jpeg, pdf, png, or tiff, using
athena% dvips -Ppdf filename.dvi
athena% add acro; distill filename.ps
The first line generates filename.ps in the current directory, from which the second line generates filename.pdf.
Note: This supersedes previous instructions for editing your own ~/.dvipsrc file. If you try the old method on a current Athena machine, you are likely to get errors such as the following:
</afs/athena.mit.edu/contrib/tex-contrib/BaKoMa/pfb/cmex10.pfb> First number not found ERROR in encoding vector in </afs/athena.mit.edu/contrib/tex-contrib/BaKoMa/pfb/cmex10.pfb>
The following information applies if you are using a machine which has not been updated to Athena 8.4.
BaKoMa fonts are installed on Athena and can be used by creating a ~/.dvipsrc file consisting of the following line (or adding it to your existing file):
p +/afs/athena.mit.edu/contrib/tex-contrib/BaKoMa/fontmap.map
If you don't wish to use these substitutions every time you use DVIPS, you can either move aside the .dvipsrc file before running it, or leave the above line out of .dvipsrc and instead create a file named ~/.config.bakoma consisting of the same line as above and then use the syntax:
athena% dvips -P bakoma foo.dvi
when you want DVIPS to use the BaKoMa fonts.
For more information about posting TeX/LaTeX files on the web, see:
Links to more information about fonts and PDF.
There are two issues here: the number of pages in the document, and the actual size of the file which Acrobat will send to the printer.
Acrobat compatibility setting
The newer versions of Acrobat Distiller offer different "compatibility options": selecting 2.1-compatibility will allow your PDF files to be viewed by people who are using an Acrobat 2.1 or earlier viewer; selecting 3.0-compatibility will optimize your PDF files, but they won't be readable from viewers older than 3.0.
Dealing with fonts
There are three options for handling each font in the file:
1. Embed the font. This "packs" the font into the PDF file itself, so the PDF Reader can recreate it on other systems for display faithful to the original. This makes the PDF file larger, and is not necessary for fonts which are already installed on the system (e.g., the Base 13 Type 1 fonts, which are installed by Acrobat Reader on every system).
2. Subset the font. This embeds just the characters you're using in the PDF file, rather than the entire font. If you are only using a limited number of characters from a nonstandard font, this will take up less space than embedding. Distiller and PDFWriter are configured with a threshold value for subsetting, and will automatically embed the entire font if you specify subsetting when more than the threshold (usually 35%) of the font's characters are used.
3. Don't embed at all. If a font is not embedded, but is installed on the system where the PDF is viewed, it should appear correctly. If it's not installed, it will be rendered by a substitute font, which may look poor on other systems or when printed.
Note that it is not always legal to embed/subset a font (like any software, fonts may be freeware, shareware, or commercially licensed and have restrictions on redistribution). However, it is permissible to embed most Type 1 fonts.
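The subsetting threshold described in option 2 comes down to a simple percentage comparison. The sketch below only illustrates that arithmetic; it is not how Distiller or PDFWriter is actually implemented. The function name font_action and the character counts are made up for the example, and 35% is the usual default mentioned above.

```shell
# Hypothetical helper illustrating the subsetting threshold.
# Usage: font_action USED TOTAL [THRESHOLD_PERCENT]
font_action() {
    used=$1
    total=$2
    threshold=${3:-35}   # assumed default of 35%
    # Compare used/total against threshold/100 with integer arithmetic.
    if [ $((used * 100)) -gt $((total * threshold)) ]; then
        echo "embed entire font"
    else
        echo "subset"
    fi
}

font_action 40 228    # about 18% of the characters used: prints "subset"
font_action 120 228   # about 53% of the characters used: prints "embed entire font"
```

A document that uses only a handful of symbols from a nonstandard font therefore stays well under the threshold and gets the smaller, subsetted form.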
Checking fonts used in a PDF file
To view information on how fonts are being used in a PDF file:
athena% add acro; acroread filename.pdf
How to embed or subset fonts
System-wide default is set to subset, but if you have customized settings and need to specify it manually:
athena% add acro
athena% distill -subsetfonts on filename.ps
For options:
athena% distill -help
athena% distill -help fonts
Distiller: Distiller-->Job Options, Font Embedding tab. (For drag-and-drop, hold down the Command key to bring up the dialog box for specifying settings.)
PDFWriter: File-->Page Setup, click fonts button. Older versions may require you to hold down the Ctrl key while choosing File-->Page Setup to bring up Fonts.
Distiller: Distiller-->Settings, Job Options, Fonts tab. (For drag-and-drop, hold down the Command key to bring up the dialog box for specifying settings.)
By default, Distiller and PDFWriter treat fonts as follows:
adds description (doesn't embed):
Type 1 fonts, ISO Latin characters (alphabet, numbers, punctuation)
Type 1 fonts, non-ISO Latin characters (symbols, ligatures)
Type 3 fonts
Mac OS bitmap fonts (after converting them to Type 3 bitmaps)
converts to bitmap images: (will be viewed as graphics, not text)
Windows outline fonts
For more information about fonts and PDF, see:
Available Font List Is Incomplete in PDF Writer 2.1 and Later (info. on base 13 fonts)
Note: if any of these links do not take you to the named document, go to Adobe's main page, hit the Search button, and type in the document number shown here.
Adobe Document no. 314618
A. Viewing and navigation tips:
Magnification: When viewing a PDF file, the first thing you can do to improve readability is to set the page view. Use the choices at the top of the View menu; be sure to try the "Fit Width" and "Fit Visible" settings as well as specific magnifications. The three "page" icons on the right side of the toolbar at the top of the viewer can also be used to switch between 100%, Fit Page, and Fit Width, respectively.
Navigation: You can use the right scroll bar to move around a page or between pages, one page at a time. To jump to a particular page number, use Documents-->Go to Page or click on the box containing the page number located at the bottom left hand side; you can type the page number in there.
Printing guidelines: Please advise students to follow the general Athena printing guidelines, as well as the specific Acrobat Reader tips below.
Summary of Athena printing guidelines (for full details, see the posters in clusters or the web page Tips on Printer Etiquette)
These settings can substantially reduce the size of the PS files Acrobat produces for printing, speed printer processing, and reduce demand on print servers. (Level 2 PostScript uses newer features which allocate resources more efficiently, and is supported on all Athena cluster printers. If "download fonts once" is not selected, any font downloading happens at the start of each page, rather than once at the beginning of the file, which can result in a much larger PS file and longer transmission time.)
Newer versions of Acrobat Reader will make some of this unnecessary, but for now you may set level 2 as the default by editing your ~/.acrorc file to read:
*PSLevel: 2
rather than:
Printing from the command line
To print a PDF file from the command line, first convert it to PS with Acrobat Exchange or the Ghostscript pdf2ps translator, then print the PS file with standard lpr commands. Remember to delete the PS file after printing, to save space; note that when converting, you can save the PS file directly to /tmp (temporary storage) rather than your account, so it won't use up your quota.
athena% add acro
athena% acroexch -toPostScript -level2 -fast filename.pdf
Or to specify a different name/path:
athena% acroexch -toPostScript -level2 -fast -pairs filename.pdf /tmp/newname.ps
Note: the -level2 and -fast flags correspond to the "Level2 PostScript Only" and "Download Fonts Once" Acrobat Reader settings explained above.
athena% add gnu
athena% pdf2ps filename.pdf /tmp/newname.ps
athena% lpr filename.ps [-Pprintername]
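The convert, print, and clean-up steps above can be combined into one function, so the temporary PS file is never left behind. This is a sketch only, using the Ghostscript route: the function names print_pdf and run and the DRY_RUN variable are hypothetical, and it assumes "add gnu" has already been run.

```shell
# Hypothetical wrapper; set DRY_RUN=1 to show the commands without running them.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Usage: print_pdf FILE.pdf [PRINTER]
print_pdf() {
    pdf=$1
    printer=${2:-}
    ps="/tmp/$(basename "$pdf" .pdf).ps"   # temporary file, outside your quota

    run pdf2ps "$pdf" "$ps" || return 1
    if [ -n "$printer" ]; then
        run lpr "-P$printer" "$ps"
    else
        run lpr "$ps"
    fi
    status=$?
    run rm -f "$ps"                        # delete the PS file after printing
    return $status
}
```

Running it with DRY_RUN=1 first is a cheap way to confirm the file names and printer before sending anything to a shared printer.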
A. Your PDF files should be searchable by standard internet search engines as long as you keep them in binary format, which is the default for PDF files generated on Athena using distill. The alternative "asciipdf" format is 7-bit ASCII, which is not supported by all search engines (in addition, asciipdf files are larger than binary pdf).
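If you are not sure which form an existing file is in, you can check whether it contains any bytes outside the 7-bit ASCII range. This is a rough heuristic rather than an official test: the function name has_8bit_bytes is hypothetical, and it relies only on the standard tr and wc utilities.

```shell
# Hypothetical check: succeeds if FILE contains at least one byte above 127,
# i.e. it is not pure 7-bit ASCII.
has_8bit_bytes() {
    # Delete every 7-bit byte, then count what is left.
    count=$(LC_ALL=C tr -d '\000-\177' < "$1" | wc -c | tr -d ' ')
    [ "$count" -gt 0 ]
}

# Example use (filename.pdf is a placeholder):
# if has_8bit_bytes filename.pdf; then echo "binary"; else echo "7-bit ASCII"; fi
```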
Whether all of your text is searchable from the search feature within Acrobat programs depends on your choice of fonts. The short answer is that Type 1 fonts are always searchable. For more details, see:
How Acrobat Distiller and PDF Writer Handle Fonts, section on "How Font Types Affect Text in PDF Files"
Note: if this link does not take you to the named document, go to Adobe's main page, hit the Search button, and type in the document number shown here.
Adobe doc. 319266
A. More information is available from the following sites.
Adobe Systems is the developer of PS, PDF, and the Acrobat software: