Package org.apache.hadoop.mapred

A system for scalable, fault-tolerant, distributed computation over large data collections.


Interface Summary
InputFormat An input data format.
InputSplit The description of the data for a single map task.
JobConfigurable That which may be configured.
JobHistory.Listener Callback interface for reading back log events from JobHistory.
JobSubmissionProtocol Protocol that a JobClient and the central JobTracker use to communicate.
Mapper Maps input key/value pairs to a set of intermediate key/value pairs.
MapRunnable Expert: Permits greater control of map processing.
OutputCollector Passed to Mapper and Reducer implementations to collect output data.
OutputFormat An output data format.
Partitioner Partitions the key space.
RecordReader Reads key/value pairs from a FileSplit of an input file.
RecordWriter Writes key/value pairs to an output file.
Reducer Reduces a set of intermediate values which share a key to a smaller set of values.
Reporter Passed to application code to permit alteration of status.
RunningJob Includes details on a running MapReduce job.
SequenceFileInputFilter.Filter Filter interface.
 

Class Summary
ClusterStatus Summarizes the size and current state of the cluster.
Counters A set of named counters.
Counters.Group Represents a group of counters, comprising the counters from a particular counter enum class.
DefaultJobHistoryParser Default parser for job history files.
FileInputFormat A base class for InputFormat.
FileSplit A section of an input file.
InputFormatBase Deprecated. Replaced by FileInputFormat.
IsolationRunner  
JobClient JobClient interacts with the JobTracker network interface.
JobConf A map/reduce job configuration.
JobEndNotifier  
JobHistory Provides methods for writing to and reading from job history.
JobHistory.HistoryCleaner Delete history files older than one month.
JobHistory.JobInfo Helper class for logging or reading back events related to job start, finish or failure.
JobHistory.MapAttempt Helper class for logging or reading back events related to start, finish or failure of a Map Attempt on a node.
JobHistory.ReduceAttempt Helper class for logging or reading back events related to start, finish or failure of a Reduce Attempt on a node.
JobHistory.Task Helper class for logging or reading back events related to a Task's start, finish or failure.
JobHistory.TaskAttempt Base class for Map and Reduce TaskAttempts.
JobProfile A JobProfile is a MapReduce primitive.
JobStatus Describes the current status of a job.
JobTracker JobTracker is the central location for submitting and tracking MR jobs in a network environment.
KeyValueLineRecordReader This class treats a line in the input as a key/value pair separated by a separator character.
KeyValueTextInputFormat An InputFormat for plain text files.
LineRecordReader Treats the key as the offset in the file and the value as the line.
MapFileOutputFormat An OutputFormat that writes MapFiles.
MapReduceBase Base class for Mapper and Reducer implementations.
MapRunner Default MapRunnable implementation.
OutputFormatBase A base class for OutputFormat.
PhasedFileSystem Deprecated. PhasedFileSystem is no longer used during speculative execution of tasks.
SequenceFileAsTextInputFormat This class is similar to SequenceFileInputFormat, except that it generates a SequenceFileAsTextRecordReader, which converts the input keys and values to their String forms by calling their toString() method.
SequenceFileAsTextRecordReader This class converts the input keys and values to their String forms by calling their toString() method.
SequenceFileInputFilter A class that allows a map/reduce job to work on a sample of sequence files.
SequenceFileInputFilter.FilterBase Base class for Filters.
SequenceFileInputFilter.MD5Filter This class returns a set of records by examining the MD5 digest of its key against a filtering frequency f.
SequenceFileInputFilter.PercentFilter This class returns a percentage of records. The percentage is determined by a filtering frequency f using the criteria record# % f == 0.
SequenceFileInputFilter.RegexFilter Filters records by matching the key against a regex.
SequenceFileInputFormat An InputFormat for SequenceFiles.
SequenceFileOutputFormat An OutputFormat that writes SequenceFiles.
SequenceFileRecordReader A RecordReader for SequenceFiles.
StatusHttpServer Create a Jetty embedded server to answer http requests.
StatusHttpServer.StackServlet A very simple servlet to serve up a text representation of the current stack traces.
TaskCompletionEvent Used to track task completion events on the JobTracker.
TaskLogAppender A simple log4j-appender for the task child's map-reduce system logs.
TaskReport A report on the state of a task.
TaskTracker TaskTracker is a process that starts and tracks MR Tasks in a networked environment.
TaskTracker.Child The main() for child processes.
TaskTracker.MapOutputServlet This class is used in TaskTracker's Jetty to serve the map outputs to other nodes.
TextInputFormat An InputFormat for plain text files.
TextOutputFormat An OutputFormat that writes plain text files.
TextOutputFormat.LineRecordWriter  
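
The selection rule quoted above for SequenceFileInputFilter.PercentFilter, record# % f == 0, can be illustrated with a small standalone sketch. The class and method names below are hypothetical and are not part of this package. The sketch also assumes that record numbers start at 0, which the summary does not state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PercentFilterSketch {

    /**
     * Keeps every record whose index satisfies index % frequency == 0.
     * Assumption: indices start at 0 (not confirmed by the package summary).
     */
    public static <T> List<T> sample(List<T> records, int frequency) {
        if (frequency <= 0) {
            throw new IllegalArgumentException("frequency must be positive");
        }
        List<T> kept = new ArrayList<>();
        for (int i = 0; i < records.size(); i++) {
            if (i % frequency == 0) {
                kept.add(records.get(i));
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8", "r9");
        // With f = 4, indices 0, 4 and 8 pass the filter.
        System.out.println(sample(records, 4)); // prints [r0, r4, r8]
    }
}
```

Under this rule a frequency of f keeps roughly one record in every f, so a larger f yields a smaller sample.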
 

Enum Summary
JobClient.TaskStatusFilter  
JobHistory.Keys Job history files contain key="value" pairs, where keys belong to this enum.
JobHistory.RecordTypes Record types are identifiers for each line of log in history files.
JobHistory.Values This enum contains some of the values commonly used by history log events.
TaskCompletionEvent.Status  
 

Exception Summary
FileAlreadyExistsException Used when target file already exists for any operation and is not configured to be overwritten.
InvalidFileTypeException Used when file type differs from the desired file type.
InvalidInputException This class wraps a list of problems with the input, so that the user can get a list of problems together instead of finding and fixing them one by one.
InvalidJobConfException This exception is thrown when the JobConf is missing some mandatory attributes or the value of some attribute is invalid.
 

Package org.apache.hadoop.mapred Description

A system for scalable, fault-tolerant, distributed computation over large data collections.

Applications implement Mapper and Reducer interfaces. These are submitted as a JobConf and are applied to data stored in a FileSystem.
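
The flow described above can be sketched in plain Java with no Hadoop dependency. This is an illustration only: SketchMapper, SketchReducer and MapReduceSketch are hypothetical stand-ins, not the Mapper, Reducer or JobConf types of this package, and the in-memory grouping step merely imitates what the framework does between the map and reduce phases.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;

public class MapReduceSketch {

    /** Hypothetical stand-in for a mapper: one input record in, zero or more (key, value) pairs out. */
    interface SketchMapper {
        void map(String line, BiConsumer<String, Integer> output);
    }

    /** Hypothetical stand-in for a reducer: all values sharing a key in, one value out. */
    interface SketchReducer {
        int reduce(String key, List<Integer> values);
    }

    /** Runs the map phase, groups intermediate pairs by key, then runs the reduce phase. */
    public static Map<String, Integer> run(List<String> input, SketchMapper mapper, SketchReducer reducer) {
        // Map phase: collect intermediate values, grouped and sorted by key.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : input) {
            mapper.map(line, (key, value) ->
                grouped.computeIfAbsent(key, k -> new ArrayList<>()).add(value));
        }
        // Reduce phase: collapse the values of each key into a single result.
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            result.put(entry.getKey(), reducer.reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    /** Word count expressed with the two stand-in interfaces. */
    public static Map<String, Integer> wordCount(List<String> lines) {
        SketchMapper mapper = (line, out) -> {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    out.accept(word, 1); // emit (word, 1) for every occurrence
                }
            }
        };
        SketchReducer reducer = (key, values) -> {
            int sum = 0;
            for (int v : values) {
                sum += v;
            }
            return sum;
        };
        return run(lines, mapper, reducer);
    }

    public static void main(String[] args) {
        System.out.println(wordCount(Arrays.asList("a b a", "b c"))); // prints {a=2, b=2, c=1}
    }
}
```

In a real job the application would supply only the map and reduce logic. Splitting the input, grouping by key, and distributing the work across nodes are the framework's responsibility, which the run method above simulates on a single machine.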

See Google's original Map/Reduce paper for background information.



Copyright © 2006 The Apache Software Foundation