cascading.tap.local
Class PartitionTap

java.lang.Object
  extended by cascading.tap.Tap<Config,Input,Output>
      extended by cascading.tap.partition.BasePartitionTap<Properties,InputStream,OutputStream>
          extended by cascading.tap.local.PartitionTap
All Implemented Interfaces:
cascading.flow.FlowElement, Serializable

public class PartitionTap
extends cascading.tap.partition.BasePartitionTap<Properties,InputStream,OutputStream>

Class PartitionTap can be used to write tuple streams out to files and sub-directories based on the values in the current Tuple instance.

The constructor takes a FileTap Tap and a Partition implementation. This allows Tuple values at given positions to be used as directory names.
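
As a rough illustration of how Tuple values at given positions can become directory names, the sketch below joins the selected values with a path delimiter. PartitionPathSketch and toPartition are hypothetical names introduced for this example. They are not part of Cascading, and a real Partition implementation may behave differently.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper, not part of Cascading: joins the values found at the
// chosen tuple positions into a relative directory path.
final class PartitionPathSketch {
    static String toPartition(List<String> tuple, int[] positions, String delimiter) {
        StringJoiner path = new StringJoiner(delimiter);
        for (int position : positions) {
            path.add(tuple.get(position));
        }
        return path.toString();
    }

    public static void main(String[] args) {
        List<String> tuple = Arrays.asList("2012", "June", "some payload");
        // Positions 0 and 1 select the year and month values.
        System.out.println(toPartition(tuple, new int[] {0, 1}, "/")); // prints 2012/June
    }
}
```

With a parent FileTap rooted at some base path, a partition such as 2012/June would name the sub-directory that receives the remaining tuple values.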

openWritesThreshold limits the number of files that may be open for writing at once. The default is 300 files. Each time the threshold is exceeded, the least recently used 10% of the open files are closed.
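
The eviction behaviour described above can be sketched with an access-ordered LinkedHashMap. This is a simplified model under stated assumptions, not the Cascading implementation: OpenWritesSketch and its methods are hypothetical, paths stand in for open files, and rounding 10% down with a minimum of one file is an assumption.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the threshold policy; strings stand in for open files.
final class OpenWritesSketch {
    private final int openWritesThreshold;
    // accessOrder = true keeps the least recently used path first in iteration order.
    private final Map<String, Boolean> open = new LinkedHashMap<>(16, 0.75f, true);
    private final List<String> closed = new ArrayList<>();

    OpenWritesSketch(int openWritesThreshold) {
        this.openWritesThreshold = openWritesThreshold;
    }

    void write(String path) {
        if (open.get(path) != null) {
            return; // already open; get() marked it as recently used
        }
        open.put(path, Boolean.TRUE);
        if (open.size() > openWritesThreshold) {
            purge();
        }
    }

    // Closes roughly 10% of the threshold, least recently used first (assumed rounding).
    private void purge() {
        int toClose = Math.max(1, openWritesThreshold / 10);
        Iterator<String> leastRecentlyUsed = open.keySet().iterator();
        for (int i = 0; i < toClose && leastRecentlyUsed.hasNext(); i++) {
            closed.add(leastRecentlyUsed.next());
            leastRecentlyUsed.remove();
        }
    }

    int openCount() {
        return open.size();
    }

    List<String> closedPaths() {
        return closed;
    }
}
```

With a threshold of 10, writing to an eleventh distinct path closes the single path that was used least recently. A higher threshold trades more open file handles for fewer close and reopen cycles.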

PartitionTap does not normalize the case of the values used to build a partition. Thus the paths 2012/June/ and 2012/june/ are treated as separate partitions and will likely result in two open files writing into the same location. Forcing consistent case with a custom Partition implementation or an upstream cascading.operation.Function is recommended; see cascading.operation.expression.ExpressionFunction.
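
The effect of inconsistent case can be shown with plain strings. The sketch below is illustrative only: PartitionCaseSketch is a hypothetical class, and lower-casing with Locale.ROOT is one possible normalization, not something PartitionTap does itself.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration: counts distinct partition strings with and
// without case normalization applied upstream.
final class PartitionCaseSketch {
    static String normalize(String partition) {
        // Locale.ROOT avoids locale-specific case mappings.
        return partition.toLowerCase(Locale.ROOT);
    }

    static Set<String> distinctPartitions(List<String> partitions, boolean normalizeCase) {
        Set<String> distinct = new LinkedHashSet<>();
        for (String partition : partitions) {
            distinct.add(normalizeCase ? normalize(partition) : partition);
        }
        return distinct;
    }

    public static void main(String[] args) {
        List<String> seen = Arrays.asList("2012/June", "2012/june");
        System.out.println(distinctPartitions(seen, false)); // prints [2012/June, 2012/june]
        System.out.println(distinctPartitions(seen, true));  // prints [2012/june]
    }
}
```

Applying the same kind of normalization in a custom Partition or an upstream Function keeps every tuple for a given month in a single partition.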

See Also:
Serialized Form

Nested Class Summary
 
Nested classes/interfaces inherited from class cascading.tap.partition.BasePartitionTap
cascading.tap.partition.BasePartitionTap.Counters, cascading.tap.partition.BasePartitionTap.PartitionScheme<Config,Input,Output>
 
Field Summary
 
Fields inherited from class cascading.tap.partition.BasePartitionTap
keepParentOnDelete, OPEN_WRITES_THRESHOLD_DEFAULT, openWritesThreshold, parent, partition
 
Constructor Summary
PartitionTap(FileTap parent, cascading.tap.partition.Partition partition)
          Constructor PartitionTap creates a new PartitionTap instance using the given parent FileTap Tap as the base path and default Scheme, and the partition.
PartitionTap(FileTap parent, cascading.tap.partition.Partition partition, int openWritesThreshold)
          Constructor PartitionTap creates a new PartitionTap instance using the given parent FileTap Tap as the base path and default Scheme, and the partition.
PartitionTap(FileTap parent, cascading.tap.partition.Partition partition, cascading.tap.SinkMode sinkMode)
          Constructor PartitionTap creates a new PartitionTap instance using the given parent FileTap Tap as the base path and default Scheme, and the partition.
PartitionTap(FileTap parent, cascading.tap.partition.Partition partition, cascading.tap.SinkMode sinkMode, boolean keepParentOnDelete)
          Constructor PartitionTap creates a new PartitionTap instance using the given parent FileTap Tap as the base path and default Scheme, and the partition.
PartitionTap(FileTap parent, cascading.tap.partition.Partition partition, cascading.tap.SinkMode sinkMode, boolean keepParentOnDelete, int openWritesThreshold)
          Constructor PartitionTap creates a new PartitionTap instance using the given parent FileTap Tap as the base path and default Scheme, and the partition.
 
Method Summary
protected  cascading.tuple.TupleEntrySchemeCollector createTupleEntrySchemeCollector(cascading.flow.FlowProcess<Properties> flowProcess, cascading.tap.Tap parent, String path, long sequence)
           
protected  cascading.tuple.TupleEntrySchemeIterator createTupleEntrySchemeIterator(cascading.flow.FlowProcess<Properties> flowProcess, cascading.tap.Tap parent, String path, InputStream input)
           
 boolean deleteResource(Properties conf)
           
protected  String getCurrentIdentifier(cascading.flow.FlowProcess<Properties> flowProcess)
           
 
Methods inherited from class cascading.tap.partition.BasePartitionTap
commitResource, createResource, equals, getChildPartitionIdentifiers, getIdentifier, getModifiedTime, getOpenWritesThreshold, getParent, getPartition, hashCode, openForRead, openForWrite, resourceExists, rollbackResource, toString
 
Methods inherited from class cascading.tap.Tap
createResource, deleteResource, flowConfInit, getConfigDef, getFullIdentifier, getFullIdentifier, getModifiedTime, getScheme, getSinkFields, getSinkMode, getSourceFields, getStepConfigDef, getTrace, hasConfigDef, hasStepConfigDef, id, isEquivalentTo, isKeep, isReplace, isSink, isSource, isTemporary, isUpdate, openForRead, openForWrite, outgoingScopeFor, presentSinkFields, presentSourceFields, resolveIncomingOperationArgumentFields, resolveIncomingOperationPassThroughFields, resourceExists, retrieveSinkFields, retrieveSourceFields, setScheme, sinkConfInit, sourceConfInit, taps
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Constructor Detail

PartitionTap

@ConstructorProperties(value={"parent","partition"})
public PartitionTap(FileTap parent,
                    cascading.tap.partition.Partition partition)
Constructor PartitionTap creates a new PartitionTap instance using the given parent FileTap Tap as the base path and default Scheme, and the partition.

Parameters:
parent - of type FileTap
partition - of type Partition

PartitionTap

@ConstructorProperties(value={"parent","partition","openWritesThreshold"})
public PartitionTap(FileTap parent,
                    cascading.tap.partition.Partition partition,
                    int openWritesThreshold)
Constructor PartitionTap creates a new PartitionTap instance using the given parent FileTap Tap as the base path and default Scheme, and the partition.

openWritesThreshold limits the number of files that may be open for writing at once.

Parameters:
parent - of type FileTap
partition - of type Partition
openWritesThreshold - of type int

PartitionTap

@ConstructorProperties(value={"parent","partition","sinkMode"})
public PartitionTap(FileTap parent,
                    cascading.tap.partition.Partition partition,
                    cascading.tap.SinkMode sinkMode)
Constructor PartitionTap creates a new PartitionTap instance using the given parent FileTap Tap as the base path and default Scheme, and the partition.

Parameters:
parent - of type FileTap
partition - of type Partition
sinkMode - of type SinkMode

PartitionTap

@ConstructorProperties(value={"parent","partition","sinkMode","keepParentOnDelete"})
public PartitionTap(FileTap parent,
                    cascading.tap.partition.Partition partition,
                    cascading.tap.SinkMode sinkMode,
                    boolean keepParentOnDelete)
Constructor PartitionTap creates a new PartitionTap instance using the given parent FileTap Tap as the base path and default Scheme, and the partition.

keepParentOnDelete, when set to true, prevents the parent Tap from being deleted when BasePartitionTap.deleteResource(Object) is called, typically an issue when used inside a Cascade.

Parameters:
parent - of type FileTap
partition - of type Partition
sinkMode - of type SinkMode
keepParentOnDelete - of type boolean
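
The following sketch illustrates the idea behind keepParentOnDelete using java.nio.file only. It assumes that deleting the resource removes every partition directory beneath the parent path. KeepParentSketch and its deleteResource method are hypothetical and are not the Cascading implementation.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration: removes everything under a parent directory and
// optionally keeps the parent directory itself.
final class KeepParentSketch {
    static void deleteResource(Path parent, boolean keepParentOnDelete) throws IOException {
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(parent)) {
            // Reverse order so files and sub-directories come before their parents.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path path : paths) {
            if (keepParentOnDelete && path.equals(parent)) {
                continue; // leave the parent directory in place
            }
            Files.delete(path);
        }
    }

    public static void main(String[] args) throws IOException {
        Path parent = Files.createTempDirectory("partition-sketch");
        Path partition = Files.createDirectories(parent.resolve("2012").resolve("june"));
        Files.createFile(partition.resolve("part-00000"));

        deleteResource(parent, true);
        System.out.println(Files.exists(parent)); // prints true
        deleteResource(parent, false);
        System.out.println(Files.exists(parent)); // prints false
    }
}
```

Keeping the parent can matter when something else, such as another flow in the same Cascade, expects the base directory to remain present.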

PartitionTap

@ConstructorProperties(value={"parent","partition","sinkMode","keepParentOnDelete","openWritesThreshold"})
public PartitionTap(FileTap parent,
                    cascading.tap.partition.Partition partition,
                    cascading.tap.SinkMode sinkMode,
                    boolean keepParentOnDelete,
                    int openWritesThreshold)
Constructor PartitionTap creates a new PartitionTap instance using the given parent FileTap Tap as the base path and default Scheme, and the partition.

keepParentOnDelete, when set to true, prevents the parent Tap from being deleted when BasePartitionTap.deleteResource(Object) is called, typically an issue when used inside a Cascade.

openWritesThreshold limits the number of files that may be open for writing at once.

Parameters:
parent - of type FileTap
partition - of type Partition
sinkMode - of type SinkMode
keepParentOnDelete - of type boolean
openWritesThreshold - of type int

Method Detail

getCurrentIdentifier

protected String getCurrentIdentifier(cascading.flow.FlowProcess<Properties> flowProcess)
Specified by:
getCurrentIdentifier in class cascading.tap.partition.BasePartitionTap<Properties,InputStream,OutputStream>

deleteResource

public boolean deleteResource(Properties conf)
                       throws IOException
Overrides:
deleteResource in class cascading.tap.partition.BasePartitionTap<Properties,InputStream,OutputStream>
Throws:
IOException

createTupleEntrySchemeCollector

protected cascading.tuple.TupleEntrySchemeCollector createTupleEntrySchemeCollector(cascading.flow.FlowProcess<Properties> flowProcess,
                                                                                    cascading.tap.Tap parent,
                                                                                    String path,
                                                                                    long sequence)
                                                                             throws IOException
Specified by:
createTupleEntrySchemeCollector in class cascading.tap.partition.BasePartitionTap<Properties,InputStream,OutputStream>
Throws:
IOException

createTupleEntrySchemeIterator

protected cascading.tuple.TupleEntrySchemeIterator createTupleEntrySchemeIterator(cascading.flow.FlowProcess<Properties> flowProcess,
                                                                                  cascading.tap.Tap parent,
                                                                                  String path,
                                                                                  InputStream input)
                                                                           throws FileNotFoundException
Specified by:
createTupleEntrySchemeIterator in class cascading.tap.partition.BasePartitionTap<Properties,InputStream,OutputStream>
Throws:
FileNotFoundException


Copyright © 2007-2013 Concurrent, Inc. All Rights Reserved.