cascading.flow.hadoop.planner
Class HadoopPlanner

java.lang.Object
  extended by cascading.flow.planner.FlowPlanner<HadoopFlow,JobConf>
      extended by cascading.flow.hadoop.planner.HadoopPlanner

public class HadoopPlanner
extends FlowPlanner<HadoopFlow,JobConf>

Class HadoopPlanner is the core Hadoop MapReduce planner.

Notes:

Custom JobConf properties
Values from a custom JobConf instance can be passed to this planner by calling copyJobConf(java.util.Map, org.apache.hadoop.mapred.JobConf) to copy them into the properties map before constructing a new HadoopFlowConnector.

A better practice is to set Hadoop properties directly on the properties map handed to the FlowConnector. All values in the map are passed to a new default JobConf instance and used as defaults for all resulting Flow instances.

For example, properties.put("mapred.child.java.opts","-Xmx512m"); would tell Hadoop to spawn all child JVMs with a maximum heap of 512MB.
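
The practice described above can be sketched as follows. This is a minimal illustration, not part of the Cascading API: the class name PlannerPropertiesSketch and the method buildProperties are hypothetical, and the block uses only java.util.Properties so that it runs without Cascading or Hadoop on the classpath. The hand-off to the connector is shown as a comment and assumes the HadoopFlowConnector constructor that accepts a properties map.

```java
import java.util.Properties;

// Hypothetical helper class, named here only for illustration.
public class PlannerPropertiesSketch {

    // Builds the properties map that would be handed to the FlowConnector.
    public static Properties buildProperties() {
        Properties properties = new Properties();
        // Properties is a Map<Object,Object>, so entries are added with put(...).
        properties.put("mapred.child.java.opts", "-Xmx512m");
        return properties;
    }

    public static void main(String[] args) {
        Properties properties = buildProperties();
        System.out.println(properties.getProperty("mapred.child.java.opts"));

        // With Cascading and Hadoop on the classpath, the map would then be
        // passed to the connector (assumed usage, not executed here):
        // FlowConnector flowConnector = new HadoopFlowConnector(properties);
    }
}
```

Setting the value on the map, rather than on a JobConf that is later copied, keeps all defaults in one place for every Flow the connector plans.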


Field Summary
 
Fields inherited from class cascading.flow.planner.FlowPlanner
assertionLevel, checkpointRootPath, debugLevel, properties
 
Constructor Summary
HadoopPlanner()
           
 
Method Summary
 HadoopFlow buildFlow(FlowDef flowDef)
          Method buildFlow renders the actual Flow instance.
static void copyJobConf(Map<Object,Object> properties, JobConf jobConf)
          Method copyJobConf adds the given JobConf values to the given properties object.
static void copyProperties(JobConf jobConf, Map<Object,Object> properties)
          Method copyProperties adds the given Map values to the given JobConf object.
protected  HadoopFlow createFlow(FlowDef flowDef)
           
static JobConf createJobConf(Map<Object,Object> properties)
          Method createJobConf returns a new JobConf instance using the values in the given properties argument.
 JobConf getConfig()
           
static boolean getNormalizeHeterogeneousSources(Map<Object,Object> properties)
          Method getNormalizeHeterogeneousSources returns whether this planner will normalize heterogeneous input sources.
 PlatformInfo getPlatformInfo()
           
 void initialize(FlowConnector flowConnector, Map<Object,Object> properties)
           
protected  Tap makeTempTap(String prefix, String name)
           
static void setNormalizeHeterogeneousSources(Map<Object,Object> properties, boolean doNormalize)
          Method setNormalizeHeterogeneousSources adds the given doNormalize boolean to the given properties object.
 
Methods inherited from class cascading.flow.planner.FlowPlanner
createElementGraph, failOnGroupEverySplit, failOnLoneGroupAssertion, failOnMissingGroup, failOnMisusedBuffer, getProperties, handleExceptionDuringPlanning, handleJobPartitioning, handleJoins, handleNonSafeOperations, insertTempTapAfter, makeTempTap, resolveAssemblyPlanners, resolveTails, verifyAllTaps, verifyAssembly, verifyCheckpoints, verifyPipeAssemblyEndPoints, verifySourceNotSinks, verifyTaps, verifyTraps
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

HadoopPlanner

public HadoopPlanner()

Method Detail

copyJobConf

public static void copyJobConf(Map<Object,Object> properties,
                               JobConf jobConf)
Method copyJobConf adds the given JobConf values to the given properties object. Use this method to pass custom default Hadoop JobConf properties to Hadoop.

Parameters:
properties - of type Map
jobConf - of type JobConf

createJobConf

public static JobConf createJobConf(Map<Object,Object> properties)
Method createJobConf returns a new JobConf instance using the values in the given properties argument.

Parameters:
properties - of type Map
Returns:
a JobConf instance

copyProperties

public static void copyProperties(JobConf jobConf,
                                  Map<Object,Object> properties)
Method copyProperties adds the given Map values to the given JobConf object.

Parameters:
jobConf - of type JobConf
properties - of type Map

setNormalizeHeterogeneousSources

public static void setNormalizeHeterogeneousSources(Map<Object,Object> properties,
                                                    boolean doNormalize)
Method setNormalizeHeterogeneousSources adds the given doNormalize boolean to the given properties object. Use this method if additional jobs should be planned in to handle incompatible InputFormat classes.

Normalization is off by default and should only be enabled by advanced users. Enabling it typically decreases application performance.

Parameters:
properties - of type Map
doNormalize - of type boolean

getNormalizeHeterogeneousSources

public static boolean getNormalizeHeterogeneousSources(Map<Object,Object> properties)
Method getNormalizeHeterogeneousSources returns whether this planner will normalize heterogeneous input sources.

Parameters:
properties - of type Map
Returns:
a boolean

getConfig

public JobConf getConfig()
Specified by:
getConfig in class FlowPlanner<HadoopFlow,JobConf>

getPlatformInfo

public PlatformInfo getPlatformInfo()
Specified by:
getPlatformInfo in class FlowPlanner<HadoopFlow,JobConf>

initialize

public void initialize(FlowConnector flowConnector,
                       Map<Object,Object> properties)
Overrides:
initialize in class FlowPlanner<HadoopFlow,JobConf>

createFlow

protected HadoopFlow createFlow(FlowDef flowDef)
Specified by:
createFlow in class FlowPlanner<HadoopFlow,JobConf>

buildFlow

public HadoopFlow buildFlow(FlowDef flowDef)
Description copied from class: FlowPlanner
Method buildFlow renders the actual Flow instance.

Specified by:
buildFlow in class FlowPlanner<HadoopFlow,JobConf>
Returns:
Flow

makeTempTap

protected Tap makeTempTap(String prefix,
                          String name)
Specified by:
makeTempTap in class FlowPlanner<HadoopFlow,JobConf>


Copyright © 2007-2013 Concurrent, Inc. All Rights Reserved.