Frequently Asked Questions: process-comments

What is process-comments?

process-comments is a Perl script that converts specially formatted text into tab-delimited records. It is intended to facilitate the processing of information received using web.mit.edu forms support. Information sent using forms on web.mit.edu is received as email, and some work is necessary to prepare this data for inclusion into a database or spreadsheet. This script will simplify the translation process.


How do I use it?

First, create your comments form (see the documentation).

Edit the .txt template file so that process-comments can easily identify the information you want it to extract. There are three markers you can insert:

  • {begin-record} should appear immediately before the first data field you want extracted.
  • {field-separator} should be placed between fields.
  • {end-record} should follow the last field.

These markers are inserted automatically when you run translateform to convert the HTML form into a text template file for cgiemail.

A sample file that simply logs comments might look like this:

From: [email]
To: achmed
Subject: [required-subject]
Errors-To: achmed@MIT.EDU
{begin-record}
[email]{field-separator}
[required-subject]{field-separator}
[required-body]{field-separator}
{end-record}
----
NOTE: This message was sent using a WWW form. The address [email]
was typed manually, and may easily be incorrect.
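
To make the marker mechanics concrete, here is a minimal Python sketch of the kind of extraction described above. It is not the Perl script itself. The function name extract_record, the sample address user@example.com, the trimming of whitespace around each field, and the dropping of the empty piece after the final {field-separator} are all assumptions made for illustration. Only the three markers and the default conversion of returns and tabs to spaces come from this page.

```python
BEGIN, SEP, END = "{begin-record}", "{field-separator}", "{end-record}"


def extract_record(message: str) -> str:
    """Build one tab-delimited line from the marked region of a message."""
    start = message.index(BEGIN) + len(BEGIN)
    stop = message.index(END, start)
    fields = message[start:stop].split(SEP)
    # Assumption: the separator after the last field does not add an empty field.
    if fields and fields[-1].strip() == "":
        fields.pop()
    cleaned = []
    for field in fields:
        field = field.strip()  # assumption: surrounding whitespace is dropped
        # Defaults described on this page: returns and tabs become spaces.
        field = field.replace("\r\n", "\n").replace("\n", " ").replace("\t", " ")
        cleaned.append(field)
    return "\t".join(cleaned)


# A hypothetical message, as it might arrive after the template is filled in.
sample = """From: user@example.com
To: achmed
Subject: Hello
{begin-record}
user@example.com{field-separator}
Hello{field-separator}
First line
second line{field-separator}
{end-record}
----
"""

record = extract_record(sample)
print(record.split("\t"))
```

In this sketch the two-line message body ends up as a single field, which is why return conversion matters for one-record-per-line output.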
        

You may find it useful to create a mail folder on Athena expressly for holding these submissions (see the mh man pages or the Athena Mail Documentation). To combine all mail files in a directory into one file for easy processing, change to that directory and type "cat * > output.txt", replacing output.txt with the name you want for the file. If you expect to receive a large amount of mail to process, it may be advisable to set up a discuss archive for that purpose; for more information about setting up a discuss archive, please contact WCS. Finally, it is possible to use a mail program such as Eudora on your PC or Mac to filter messages. Please see our page on using Eudora with process-comments.

When you have some files you would like to process, get a copy of the Perl script:

add cwis; cp /mit/cwis/process/process-comments.pl [destination]

Then type the following at your UNIX prompt:

process-comments.pl [files to process]

The files will be processed and the results appended to a file named process.out.

Example: If you had a mail folder called "survey", you might type:

process-comments.pl ~/Mail/survey/*

How can I use the output with my favorite spreadsheet or database?

Many spreadsheets and databases have an "Import" command or can open a text file directly. After you've used process-comments to process your files, simply point your database or spreadsheet to the output file for import. If you have a choice of formats, choose tab-delimited text. If your spreadsheet does not have an Import command, consult the appropriate manuals.
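
If you would rather inspect the output with a short program before importing it, the following Python sketch reads tab-delimited records with the standard csv module. The function name read_records and the sample record are hypothetical. The sketch assumes the default tab separator and one record per line.

```python
import csv
import io


def read_records(stream):
    """Return a list of records, each a list of field strings."""
    # QUOTE_NONE: treat quote characters in the text as ordinary data.
    return list(csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE))


# Stand-in for the contents of an output file (hypothetical data).
sample_output = io.StringIO("user@example.com\tHello\tFirst line second line\n")
rows = read_records(sample_output)
print(len(rows), "record(s);", len(rows[0]), "field(s) in the first")
```

For a real run, pass an open file instead of the StringIO object, for example the result of open("process.out", newline="").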


How can I add unique identifiers to each output record?

Sometimes you will want a unique identifier for each record you process. You can tell process-comments to do this with the +n switch. The command line would look something like:

process-comments.pl +n [files to process]

This tells process-comments to add a unique number (starting with 1) as the first field of each record. By default, this option is OFF.
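
As an illustration of what the numbered output would look like, this Python sketch puts a counter in front of each record. The function name number_records is hypothetical, and the sketch only models a single run; whether the real script continues the count when appending to an existing output file is not stated here.

```python
def number_records(records, start=1):
    """Put a running number in front of each tab-delimited record."""
    return [f"{n}\t{record}" for n, record in enumerate(records, start)]


numbered = number_records(["a@example.com\tHello", "b@example.com\tGoodbye"])
for line in numbered:
    print(line)
```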


How can I change the name of the output file?

Use the -o [output filename] option:

process-comments.pl -o myoutputfile.out [files to process]

How can I replace the output file rather than appending?

Use the -a option:

process-comments.pl -a [files to process]

If the output file is not empty, you will be asked whether you are sure you want to replace it.

Note: the +a option turns appending on, which is the default behavior.


How can I turn off those annoying progress messages?

Use the -v option:

process-comments.pl -v [files to process]

Note: the +v option puts the script in verbose mode.


How can I change the way returns are converted?

By default, returns (newlines) are converted into spaces, but you can change this behavior using the +r [replacement] option. To break lines of poetry with slashes you might use:

process-comments.pl +r '/ ' [files to process]

The -r option turns off return conversion entirely. Be careful using this option, since many programs will deal with this in a strange manner (treating each user-typed line as a separate record).
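
The effect of the return options on a single field can be sketched in Python as follows. The function name convert_returns is hypothetical, and passing None is simply this sketch's way of modeling the -r (no conversion) case.

```python
def convert_returns(field, replacement=" "):
    """Replace each return in a field; None leaves the field untouched."""
    if replacement is None:  # models -r: return conversion turned off
        return field
    # Normalize CRLF first so each line break is replaced exactly once.
    return field.replace("\r\n", "\n").replace("\n", replacement)


poem = "Roses are red\nViolets are blue"
print(convert_returns(poem))         # default: returns become spaces
print(convert_returns(poem, "/ "))   # models +r '/ '
print(convert_returns(poem, None).count("\n"))  # 1: the line break survives
```

With conversion off, the embedded line break remains in the output, which is what leads importing programs to see two records instead of one.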


What if I don't want the output to be tab-delimited?

You can change the separator character using the +s [separator] option. To use commas to separate fields you could use:

process-comments.pl +s ',' [files to process]

What happens if a tab (or other separator) appears in the text I'm processing?

By default it is converted into a space. You can change this behavior using the +t [replacement] option. To convert each tab to three spaces you could use:

process-comments.pl +t '   '  [files to process]

To convert commas in comma-delimited text to semicolons you could use:

process-comments.pl +s ',' +t ';'  [files to process]

The -t option turns off separator replacement entirely. Be careful using this option, since many programs will deal with this in a strange manner (treating each user-typed separator character as the beginning of a new field).
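
The following Python sketch shows why separator replacement matters when fields are joined. The function name join_fields is hypothetical, and None models the -t case in which no replacement is made.

```python
def join_fields(fields, separator="\t", replacement=" "):
    """Join fields into one record, replacing separators found inside fields."""
    if replacement is not None:  # None models -t: replacement turned off
        fields = [f.replace(separator, replacement) for f in fields]
    return separator.join(fields)


fields = ["Smith, Jane", "Nice page"]
safe = join_fields(fields, separator=",", replacement=";")    # models +s ',' +t ';'
unsafe = join_fields(fields, separator=",", replacement=None)  # models +s ',' -t
print(safe.split(","))    # two fields, as intended
print(unsafe.split(","))  # three fields: the comma in the name splits it
```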


I want to change the record separators!

You can change the record separators using the -br, -fs, and -er options. -br controls the beginning-of-record marker (default "{begin-record}"). -fs controls the field-separator marker (default "{field-separator}"). -er controls the end-of-record marker (default "{end-record}"). However, this is not recommended. If you need this much control, you might be better off learning Perl.

If you do decide to change the record separators, be sure to use separators that you are sure do not appear anywhere in the text.

 
