java.lang.Object
  org.apache.hadoop.contrib.utils.join.JobBase
      org.apache.hadoop.contrib.utils.join.DataJoinMapperBase
public abstract class DataJoinMapperBase
This abstract class serves as the base class for the mapper class of a data join job. It expects its subclasses to implement methods that:

1. Compute the source tag of input values
2. Compute the map output value object
3. Compute the map output key object

The reducer uses the source tag to determine which source (which table, in SQL terminology) a value comes from. Computing the map output value object corresponds to the projection and filtering work of a SQL statement (the select and where clauses). Computing the map output key corresponds to choosing the join key. This class provides the plugin points where user-defined subclasses implement that logic.
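
To illustrate why values carry a source tag, here is a minimal, self-contained sketch of what a reducer-side join can do with tagged values that share one join key. It uses plain Java types only: `TaggedJoinSketch`, `Tagged`, and `joinGroup` are hypothetical names invented for this illustration and are not part of the Hadoop data join package.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java stand-in; none of these types come from Hadoop.
class TaggedJoinSketch {

    // Hypothetical stand-in for a tagged map output: a source tag plus a payload.
    static final class Tagged {
        final String tag;
        final String payload;

        Tagged(String tag, String payload) {
            this.tag = tag;
            this.payload = payload;
        }
    }

    // Splits the values of one join key by source tag, then pairs every
    // value from the left source with every value from the right source.
    static List<String> joinGroup(String key, List<Tagged> values,
                                  String leftTag, String rightTag) {
        Map<String, List<String>> bySource = new TreeMap<>();
        for (Tagged t : values) {
            bySource.computeIfAbsent(t.tag, k -> new ArrayList<>()).add(t.payload);
        }
        List<String> joined = new ArrayList<>();
        for (String left : bySource.getOrDefault(leftTag, Collections.<String>emptyList())) {
            for (String right : bySource.getOrDefault(rightTag, Collections.<String>emptyList())) {
                joined.add(key + "\t" + left + "\t" + right);
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        List<Tagged> group = Arrays.asList(
                new Tagged("customers", "Alice"),
                new Tagged("orders", "book"),
                new Tagged("orders", "pen"));
        for (String row : joinGroup("c1", group, "customers", "orders")) {
            System.out.println(row);
        }
    }
}
```

Without the tag, the reducer would see three values for key `c1` and have no way to tell the customer row from the order rows.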
Field Summary | |
---|---|
protected String | inputFile |
protected Text | inputTag |
protected JobConf | job |
protected Reporter | reporter |
Fields inherited from class org.apache.hadoop.contrib.utils.join.JobBase |
---|
LOG |
Constructor Summary |
---|
DataJoinMapperBase() |
Method Summary | |
---|---|
void | close(): Called after the last call to any other method on this object to free and/or flush resources. |
void | configure(JobConf job): Initializes a new instance from a JobConf. |
protected abstract Text | generateGroupKey(TaggedMapOutput aRecord): Generate a map output key. |
protected abstract Text | generateInputTag(String inputFile): Determine the source tag based on the input file name. |
protected abstract TaggedMapOutput | generateTaggedMapOutput(Writable value): Generate a tagged map output value. |
void | map(WritableComparable key, Writable value, OutputCollector output, Reporter reporter): Maps a single input key/value pair into intermediate key/value pairs. |
void | reduce(WritableComparable arg0, Iterator arg1, OutputCollector arg2, Reporter arg3): Combines values for a given key. |
Methods inherited from class org.apache.hadoop.contrib.utils.join.JobBase |
---|
addDoubleValue, addLongValue, getDoubleValue, getLongValue, getReport, report, setDoubleValue, setLongValue |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected String inputFile
protected JobConf job
protected Text inputTag
protected Reporter reporter
Constructor Detail |
---|
public DataJoinMapperBase()
Method Detail |
---|
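public void configure(JobConf job)
Description copied from class: JobBase
Initializes a new instance from a JobConf.
Specified by: configure in interface JobConfigurable
Overrides: configure in class JobBase
Parameters:
job - the configuration

protected abstract Text generateInputTag(String inputFile)
Determine the source tag based on the input file name.
Parameters:
inputFile -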
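
As a rough illustration of the kind of rule `generateInputTag` might implement, the sketch below derives a tag from a file name by dropping the directory and the extension. The rule itself, the class name `InputTagSketch`, the helper `tagFor`, and the sample path are all assumptions made for this example; the page does not prescribe how a tag is computed. A real subclass would return a `Text`, for example by wrapping the string with `new Text(...)`.

```java
// Plain-Java stand-in; String is used where the real method returns Text.
class InputTagSketch {

    // Hypothetical rule: the tag is the file's base name, without the
    // directory part and without anything after the first dot.
    static String tagFor(String inputFile) {
        int slash = inputFile.lastIndexOf('/');
        String name = inputFile.substring(slash + 1);
        int dot = name.indexOf('.');
        return dot > 0 ? name.substring(0, dot) : name;
    }

    public static void main(String[] args) {
        // The path below is made up for the example.
        System.out.println(tagFor("/data/join/orders.csv"));
    }
}
```

Any rule works as long as every input file of one source maps to the same tag and different sources map to different tags.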

protected abstract TaggedMapOutput generateTaggedMapOutput(Writable value)
Generate a tagged map output value.
Parameters:
value -

protected abstract Text generateGroupKey(TaggedMapOutput aRecord)
Generate a map output key.
Parameters:
aRecord -
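
The class description compares producing the map output value to a SQL select/where and producing the key to choosing the join key. The sketch below shows that idea on a comma-separated line, using plain Java rather than `Writable`, `TaggedMapOutput`, or `Text`. The three-column layout (`id,name,status`), the `active` filter, and the names `ProjectAndKeySketch`, `Row`, `project`, and `groupKey` are assumptions invented for the example.

```java
import java.util.Optional;

// Plain-Java stand-in for the value and key hooks; not Hadoop code.
class ProjectAndKeySketch {

    // Hypothetical projected record: only the columns the join needs.
    static final class Row {
        final String id;
        final String name;

        Row(String id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Roughly "select id, name ... where status = 'active'" on one input line.
    // Malformed or filtered-out lines yield an empty Optional.
    static Optional<Row> project(String line) {
        String[] fields = line.split(",", -1);
        if (fields.length < 3 || !"active".equals(fields[2].trim())) {
            return Optional.empty();
        }
        return Optional.of(new Row(fields[0].trim(), fields[1].trim()));
    }

    // Choosing the join key: here, simply the id column.
    static String groupKey(Row row) {
        return row.id;
    }

    public static void main(String[] args) {
        project("42,Alice,active")
                .ifPresent(r -> System.out.println(groupKey(r) + "\t" + r.name));
    }
}
```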
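
public void map(WritableComparable key, Writable value, OutputCollector output, Reporter reporter) throws IOException
Description copied from interface: Mapper
Maps a single input key/value pair into intermediate key/value pairs. Output pairs are collected with calls to OutputCollector.collect(WritableComparable,Writable).
Parameters:
key - the key
value - the value
output - collects mapped keys and values
Throws:
IOException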
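
This page does not spell out how `map` uses the three abstract methods. One plausible arrangement, shown here purely as an assumption, is a template method: compute the tag once per input file, then for each value build the output record, derive its key, and collect the pair. The sketch uses `String` in place of `Text` and `TaggedMapOutput` and a list in place of `OutputCollector`; `MapFlowSketch`, `Demo`, and the skip-on-null behavior are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of a mapper with three plugin points; not Hadoop code.
abstract class MapFlowSketch {

    // Hooks named after the abstract methods on this page, with String types.
    protected abstract String generateInputTag(String inputFile);
    protected abstract String generateTaggedMapOutput(String value);
    protected abstract String generateGroupKey(String record);

    // Stand-in for an output collector: each entry is {key, tag, record}.
    final List<String[]> collected = new ArrayList<>();
    private String inputTag;

    // Assumed setup step: compute the source tag once per input file.
    void configure(String inputFile) {
        inputTag = generateInputTag(inputFile);
    }

    // Assumed flow: value -> record -> key -> collect; null means "skip".
    void map(String value) {
        String record = generateTaggedMapOutput(value);
        if (record == null) {
            return;
        }
        String key = generateGroupKey(record);
        if (key == null) {
            return;
        }
        collected.add(new String[] {key, inputTag, record});
    }

    // Tiny concrete subclass used only to exercise the flow.
    static final class Demo extends MapFlowSketch {
        @Override
        protected String generateInputTag(String inputFile) {
            return inputFile;
        }

        @Override
        protected String generateTaggedMapOutput(String value) {
            return value.isEmpty() ? null : value;
        }

        @Override
        protected String generateGroupKey(String record) {
            return record.split(",")[0];
        }
    }

    public static void main(String[] args) {
        MapFlowSketch mapper = new Demo();
        mapper.configure("orders.csv");
        mapper.map("7,book");
        mapper.map("");
        for (String[] row : mapper.collected) {
            System.out.println(String.join("\t", row));
        }
    }
}
```

Keeping the flow in the base class means a subclass only has to supply the three data-specific decisions.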
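
public void close() throws IOException
Description copied from interface: Closeable
Called after the last call to any other method on this object to free and/or flush resources.
Throws:
IOException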

public void reduce(WritableComparable arg0, Iterator arg1, OutputCollector arg2, Reporter arg3) throws IOException
Description copied from interface: Reducer
Combines values for a given key. Output pairs are collected with calls to OutputCollector.collect(WritableComparable,Writable).
Parameters:
arg0 - the key
arg1 - the values to combine
arg2 - to collect combined values
Throws:
IOException