org.apache.hadoop.mapred
Class TaskTracker

java.lang.Object
  extended by org.apache.hadoop.mapred.TaskTracker
All Implemented Interfaces:
Runnable

public class TaskTracker
extends Object
implements Runnable

TaskTracker is a process that starts and tracks MapReduce (MR) tasks in a networked environment. It contacts the JobTracker to receive task assignments and to report results.

Author:
Mike Cafarella

Nested Class Summary
static class TaskTracker.Child
          The main() for child processes.
static class TaskTracker.MapOutputServlet
          This class is used in TaskTracker's Jetty to serve the map outputs to other nodes.
 
Field Summary
static int FILE_NOT_FOUND
           
static long HEARTBEAT_INTERVAL
           
static org.apache.commons.logging.Log LOG
           
static String MAP_OUTPUT_LENGTH
          The custom http header used for the map output length.
static float MAX_INMEM_FILESIZE_FRACTION
          Constant denoting the maximum size (as a fraction of the total size of the filesystem) of a map output file that we will try to keep in memory.
static float MAX_INMEM_FILESYS_USE
          Constant denoting when a merge of in-memory files will be triggered.
static int SUCCESS
           
static long versionID
          Changed the version to 2, since we have a new method getMapOutputs
 
Constructor Summary
TaskTracker(JobConf conf)
          Start with the local machine name, and the default JobTracker
 
Method Summary
 void close()
          Close down the TaskTracker and all its components.
 void done(String taskid)
          The task is done.
 void fsError(String taskId, String message)
          A child task had a local filesystem error.
 FileSystem getFileSystem()
          Return the DFS filesystem
 org.apache.hadoop.mapred.InterTrackerProtocol getJobClient()
          The connection to the JobTracker, used by the TaskRunner for locating remote files.
 TaskCompletionEvent[] getMapCompletionEvents(String jobId, int fromEventId, int maxLocs)
          Called by a reduce task to get the map output locations for finished maps.
 long getProtocolVersion(String protocol, long clientVersion)
          Return protocol version corresponding to protocol interface.
 org.apache.hadoop.mapred.Task getTask(String taskid)
          Called upon startup by the child process, to fetch Task data.
 boolean isIdle()
          Is this task tracker idle?
static void main(String[] argv)
          Start the TaskTracker and point it toward the indicated JobTracker.
 void mapOutputLost(String taskid, String errorMsg)
          A completed map task's output has been lost.
 boolean ping(String taskid)
          Child checking to see if we're alive.
 void progress(String taskid, float progress, String state, TaskStatus.Phase phase, Counters counters)
          Called periodically to report Task progress, from 0.0 to 1.0.
 void reportDiagnosticInfo(String taskid, String info)
          Called when the task dies before completion and we want to report back diagnostic information.
 void run()
          The server retry loop.
 void shutdown()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

public static final org.apache.commons.logging.Log LOG

HEARTBEAT_INTERVAL

public static final long HEARTBEAT_INTERVAL
See Also:
Constant Field Values

MAX_INMEM_FILESYS_USE

public static final float MAX_INMEM_FILESYS_USE
Constant denoting when a merge of in-memory files will be triggered.

See Also:
Constant Field Values

MAX_INMEM_FILESIZE_FRACTION

public static final float MAX_INMEM_FILESIZE_FRACTION
Constant denoting the maximum size (as a fraction of the total size of the filesystem) of a map output file that we will try to keep in memory. Ideally, this should be a factor of MAX_INMEM_FILESYS_USE.

See Also:
Constant Field Values
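
One way to read the two in-memory constants together is sketched below. This is an illustration only: the class name `InMemThresholds`, the methods `fitsInMemory` and `shouldMerge`, and both numeric values are assumptions made for the example (the real values are listed under Constant Field Values), and the sketch assumes that a merge is triggered once usage reaches the MAX_INMEM_FILESYS_USE fraction.

```java
// Illustrative sketch only; not the TaskTracker implementation.
class InMemThresholds {
    // Hypothetical values chosen for the example. 0.25 is a factor of 0.5,
    // matching the advice that one constant should be a factor of the other.
    static final float MAX_INMEM_FILESYS_USE = 0.5f;
    static final float MAX_INMEM_FILESIZE_FRACTION = 0.25f;

    /** True if a single map output is small enough to be kept in memory. */
    static boolean fitsInMemory(long fileSize, long totalSize) {
        long limit = (long) (totalSize * (double) MAX_INMEM_FILESIZE_FRACTION);
        return fileSize <= limit;
    }

    /** True if usage has reached the assumed merge trigger point. */
    static boolean shouldMerge(long usedBytes, long totalSize) {
        long trigger = (long) (totalSize * (double) MAX_INMEM_FILESYS_USE);
        return usedBytes >= trigger;
    }
}
```

With these example values and a total size of 1000 bytes, a 200-byte output would be kept in memory, and a merge would be triggered once 500 bytes are in use.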

SUCCESS

public static final int SUCCESS
See Also:
Constant Field Values

FILE_NOT_FOUND

public static final int FILE_NOT_FOUND
See Also:
Constant Field Values

MAP_OUTPUT_LENGTH

public static final String MAP_OUTPUT_LENGTH
The custom http header used for the map output length.

See Also:
Constant Field Values
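
A client fetching a map output could use this header to learn how many bytes to expect. The sketch below is a hedged illustration: the header name `"Map-Output-Length"` is an assumption (the real string is listed under Constant Field Values), and `MapOutputHeader` and `parseLength` are hypothetical names that are not part of this class.

```java
import java.util.Map;

// Illustrative sketch only; not part of the TaskTracker API.
class MapOutputHeader {
    // Assumed header name, used here only for the example.
    static final String MAP_OUTPUT_LENGTH = "Map-Output-Length";

    /**
     * Returns the advertised map output length in bytes,
     * or -1 if the header is missing, negative, or not a number.
     */
    static long parseLength(Map<String, String> headers) {
        String raw = headers.get(MAP_OUTPUT_LENGTH);
        if (raw == null) {
            return -1L;
        }
        try {
            long length = Long.parseLong(raw.trim());
            return length >= 0 ? length : -1L;
        } catch (NumberFormatException e) {
            return -1L;
        }
    }
}
```

Note that the lookup in this sketch is case-sensitive because it uses a plain `Map`, whereas HTTP header names are case-insensitive on the wire.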

versionID

public static final long versionID
Changed the version to 2, since we have a new method getMapOutputs

See Also:
Constant Field Values
Constructor Detail

TaskTracker

public TaskTracker(JobConf conf)
            throws IOException
Start with the local machine name, and the default JobTracker

Throws:
IOException
Method Detail

getProtocolVersion

public long getProtocolVersion(String protocol,
                               long clientVersion)
                        throws IOException
Description copied from interface: VersionedProtocol
Return protocol version corresponding to protocol interface.

Parameters:
protocol - The classname of the protocol interface
clientVersion - The version of the protocol that the client speaks
Returns:
the version that the server will speak
Throws:
IOException
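
The version handshake described above can be sketched as follows. This is a minimal standalone illustration, not the Hadoop RPC code: `VersionCheck`, `PROTOCOL`, `SERVER_VERSION`, and `isCompatible` are hypothetical names, and rejecting an unknown protocol with an IOException is an assumption of the sketch.

```java
import java.io.IOException;

// Illustrative sketch only; not the Hadoop RPC implementation.
class VersionCheck {
    // Hypothetical protocol name and server version for the example.
    static final String PROTOCOL = "example.TaskProtocol";
    static final long SERVER_VERSION = 2L;

    /**
     * Server side: report the version spoken for the named protocol.
     * clientVersion mirrors the documented signature; this sketch ignores it.
     */
    static long getProtocolVersion(String protocol, long clientVersion)
            throws IOException {
        if (!PROTOCOL.equals(protocol)) {
            throw new IOException("Unknown protocol: " + protocol);
        }
        return SERVER_VERSION;
    }

    /** Client side: treat the server as compatible only on an exact version match. */
    static boolean isCompatible(long clientVersion) throws IOException {
        return getProtocolVersion(PROTOCOL, clientVersion) == clientVersion;
    }
}
```

In this sketch a client speaking version 2 is accepted and a client speaking version 1 is not; how a real caller reacts to a mismatch is outside what this page documents.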

shutdown

public void shutdown()
              throws IOException
Throws:
IOException

close

public void close()
           throws IOException
Close down the TaskTracker and all its components. We must also shutdown any running tasks or threads, and cleanup disk space. A new TaskTracker within the same process space might be restarted, so everything must be clean.

Throws:
IOException

getJobClient

public org.apache.hadoop.mapred.InterTrackerProtocol getJobClient()
The connection to the JobTracker, used by the TaskRunner for locating remote files.


getFileSystem

public FileSystem getFileSystem()
Return the DFS filesystem


run

public void run()
The server retry loop. This while-loop attempts to connect to the JobTracker. It only loops when the old TaskTracker has gone bad (its state is stale somehow) and we need to reinitialize everything.

Specified by:
run in interface Runnable
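
The shape of such a retry loop can be sketched as below. This is a simplified illustration under stated assumptions, not the actual run() body: `RetryLoop`, `State`, and `runWithRestarts` are hypothetical names, and the `maxRestarts` bound is added only so that the example always terminates.

```java
import java.util.function.Supplier;

// Illustrative sketch only; not the TaskTracker.run() implementation.
class RetryLoop {
    /** Outcome of one service session (hypothetical). */
    enum State { NORMAL, STALE }

    /**
     * Starts sessions until one ends normally or the restart budget is used up.
     * Returns the number of sessions that were started.
     */
    static int runWithRestarts(Supplier<State> session, int maxRestarts) {
        int started = 0;
        while (true) {
            started++;                    // (re)initialize and start a fresh session
            State result = session.get(); // returns when the session ends
            if (result != State.STALE || started > maxRestarts) {
                return started;           // clean exit, or no restarts left
            }
            // Stale state: loop again and reinitialize everything.
        }
    }
}
```

The loop body runs once in the normal case; it repeats only when a session reports stale state, which matches the description above.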

getTask

public org.apache.hadoop.mapred.Task getTask(String taskid)
                                      throws IOException
Called upon startup by the child process, to fetch Task data.

Throws:
IOException

progress

public void progress(String taskid,
                     float progress,
                     String state,
                     TaskStatus.Phase phase,
                     Counters counters)
              throws IOException
Called periodically to report Task progress, from 0.0 to 1.0.

Parameters:
taskid - the id of the task
progress - value between zero and one
state - description of task's current state
phase - current phase of the task.
counters - the counters for this task.
Throws:
IOException
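
A caller needs to produce a progress value inside the documented range before reporting it. The helper below is one possible sketch; `ProgressReport`, `clamp`, and `fraction` are hypothetical names, and mapping NaN or an unknown total to 0.0 is an assumption of the example rather than documented behavior.

```java
// Illustrative sketch only; not part of the TaskTracker API.
class ProgressReport {
    /** Clamps a raw value into the range [0.0, 1.0]; NaN maps to 0.0. */
    static float clamp(float raw) {
        if (Float.isNaN(raw)) {
            return 0.0f;
        }
        return Math.max(0.0f, Math.min(1.0f, raw));
    }

    /** Fraction of work done, for example records processed out of an expected total. */
    static float fraction(long done, long total) {
        if (total <= 0) {
            return 0.0f; // unknown or empty total: report no progress
        }
        return clamp((float) done / (float) total);
    }
}
```

A task that has processed 50 of 100 records would report `ProgressReport.fraction(50, 100)`, which is 0.5.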

reportDiagnosticInfo

public void reportDiagnosticInfo(String taskid,
                                 String info)
                          throws IOException
Called when the task dies before completion and we want to report back diagnostic information.

Parameters:
taskid - the id of the task involved
info - the text to report
Throws:
IOException

ping

public boolean ping(String taskid)
             throws IOException
Child checking to see if we're alive. Normally does nothing.

Returns:
True if the task is known
Throws:
IOException

done

public void done(String taskid)
          throws IOException
The task is done.

Throws:
IOException

fsError

public void fsError(String taskId,
                    String message)
             throws IOException
A child task had a local filesystem error. Kill the task.

Throws:
IOException

getMapCompletionEvents

public TaskCompletionEvent[] getMapCompletionEvents(String jobId,
                                                    int fromEventId,
                                                    int maxLocs)
                                             throws IOException
Called by a reduce task to get the map output locations for finished maps.

Parameters:
jobId - the id of the job
fromEventId - the index starting from which the locations should be fetched
maxLocs - the max number of locations to fetch
Returns:
an array of TaskCompletionEvent
Throws:
IOException
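
The `fromEventId` and `maxLocs` parameters describe a windowed read over an ordered list of events. The sketch below shows that pattern in isolation, using a plain `List` in place of TaskCompletionEvent objects; `EventWindow`, `slice`, and `fetchAll` are hypothetical names, and the handling of out-of-range arguments is an assumption of the example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not the TaskTracker implementation.
class EventWindow {
    /** Returns at most maxLocs events, starting at index fromEventId. */
    static <T> List<T> slice(List<T> events, int fromEventId, int maxLocs) {
        if (fromEventId < 0 || maxLocs <= 0 || fromEventId >= events.size()) {
            return new ArrayList<>();
        }
        // Use long arithmetic so a very large maxLocs cannot overflow.
        int end = (int) Math.min((long) events.size(), (long) fromEventId + maxLocs);
        return new ArrayList<>(events.subList(fromEventId, end));
    }

    /** Polls in windows of maxLocs until a poll returns no new events. */
    static <T> List<T> fetchAll(List<T> events, int maxLocs) {
        List<T> seen = new ArrayList<>();
        int from = 0;
        while (true) {
            List<T> batch = slice(events, from, maxLocs);
            if (batch.isEmpty()) {
                return seen;
            }
            seen.addAll(batch);
            from += batch.size(); // the next poll resumes after the last event seen
        }
    }
}
```

A caller that remembers how many events it has already received can pass that count as `fromEventId` on the next call, so no event is fetched twice.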

mapOutputLost

public void mapOutputLost(String taskid,
                          String errorMsg)
                   throws IOException
A completed map task's output has been lost.

Throws:
IOException

isIdle

public boolean isIdle()
Is this task tracker idle?

Returns:
true if this task tracker has finished and cleaned up all of its tasks

main

public static void main(String[] argv)
                 throws Exception
Start the TaskTracker and point it toward the indicated JobTracker.

Throws:
Exception


Copyright © 2006 The Apache Software Foundation