Name | Description | Type | Package | Framework |
AliasableEvalFunc | Makes implementing and using UDFs easier by enabling named parameters. | Class | datafu.pig.util | DataFu |
|
AppendToBag | Appends a tuple to a bag. | Class | datafu.pig.bags | DataFu |
|
Assert | Assert has been renamed to AssertUDF. | Class | datafu.pig.util | DataFu |
|
AssertUDF | Filter function which asserts that a value is true. | Class | datafu.pig.util | DataFu |
|
BagConcat | Unions all input bags to produce a single bag containing all tuples. | Class | datafu.pig.bags | DataFu |
|
BagGroup | Performs an in-memory group operation on a bag. | Class | datafu.pig.bags | DataFu |
|
BagLeftOuterJoin | Performs an in-memory left outer join across multiple bags. | Class | datafu.pig.bags | DataFu |
|
BagSplit | Splits a bag of tuples into a bag of bags, where the inner bags collectively contain the tuples from the original bag. | Class | datafu.pig.bags | DataFu |
|
BoolToInt | UDF which converts a Boolean to an Integer. | Class | datafu.pig.util | DataFu |
|
Coalesce | Returns the first non-null value from a tuple, just like COALESCE in SQL. | Class | datafu.pig.util | DataFu |
|
ContextualEvalFunc | An abstract class which enables UDFs to store instance properties on the front end which will be available on the back end. | Class | datafu.pig.util | DataFu |
|
CountEach | Generates a count of the number of times each distinct tuple appears in a bag. | Class | datafu.pig.bags | DataFu |
|
DataFuException | Exception class used by DataFu. | Class | datafu.pig.util | DataFu |
|
DistinctBy | Get distinct elements in a bag by a given set of field positions. | Class | datafu.pig.bags | DataFu |
|
EmptyBagToNull | Returns null if the input is an empty bag; otherwise, returns the input bag unchanged. | Class | datafu.pig.bags | DataFu |
|
EmptyBagToNullFields | For an empty bag, inserts a tuple having null values for all fields; otherwise, the input bag is returned unchanged. | Class | datafu.pig.bags | DataFu |
|
Enumerate | Enumerate a bag, appending to each tuple its index within the bag. | Class | datafu.pig.bags | DataFu |
|
FieldNotFound | Thrown by AliasableEvalFunc when attempting to access an unknown field by name. | Class | datafu.pig.util | DataFu |
|
FirstTupleFromBag | Returns the first tuple from a bag. | Class | datafu.pig.bags | DataFu |
|
HaversineDistInMiles | Computes the distance (in miles) between two latitude-longitude pairs using the Haversine formula. | Class | datafu.pig.geo | DataFu |
|
HyperLogLogPlusPlus | A UDF that applies the HyperLogLog++ cardinality estimation algorithm. | Class | datafu.pig.stats | DataFu |
|
In | In has been renamed to InUDF. | Class | datafu.pig.util | DataFu |
|
IntToBool | UDF which converts an Integer to a Boolean. | Class | datafu.pig.util | DataFu |
|
InUDF | Similar to the SQL IN function, this function provides a convenient way to filter using a logical disjunction over many values. | Class | datafu.pig.util | DataFu |
|
MarkovPairs | Accepts a bag of tuples, with user supplied ordering, and generates pairs that can be used for a Markov chain analysis. | Class | datafu.pig.stats | DataFu |
|
MD5 | Computes the MD5 value of a string and outputs it in hex (by default). | Class | datafu.pig.hash | DataFu |
|
Median | Computes the median for a sorted input bag, using type R-2 estimation. | Class | datafu.pig.stats | DataFu |
|
NullToEmptyBag | Returns an empty bag if the input is null; otherwise, returns the input bag unchanged. | Class | datafu.pig.bags | DataFu |
|
PageRank | A UDF which implements PageRank. | Class | datafu.pig.linkanalysis | DataFu |
|
PageRankImpl | An implementation of PageRank, used by the PageRank UDF. | Class | datafu.pig.linkanalysis | DataFu |
|
PrependToBag | Prepends a tuple to a bag. | Class | datafu.pig.bags | DataFu |
|
Quantile | Computes quantiles for a sorted input bag, using type R-2 estimation. | Class | datafu.pig.stats | DataFu |
|
QuantileUtil | Methods used by Quantile. | Class | datafu.pig.stats | DataFu |
|
RandInt | Generates a uniformly distributed integer between two bounds. | Class | datafu.pig.random | DataFu |
|
ReservoirSample | Performs a simple random sample using an in-memory reservoir to produce a uniformly random sample of a given size. | Class | datafu.pig.sampling | DataFu |
|
ReverseEnumerate | Enumerate a bag, appending to each tuple its index within the bag, with indices being produced in descending order. For example: {(A),(B),(C),(D)} => {(A,3),(B,2),(C,1),(D,0)} | Class | datafu.pig.bags | DataFu |
|
SampleByKey | Provides a way of sampling tuples based on certain fields. | Class | datafu.pig.sampling | DataFu |
|
SessionCount | Performs a count of events, ignoring events which occur within the same time window. This is useful for tasks such as counting the number of page views per user. | Class | datafu.pig.sessions | DataFu |
|
Sessionize | Sessionizes an input stream, appending a session ID to each tuple. | Class | datafu.pig.sessions | DataFu |
|
SetDifference | Computes the set difference of two or more bags. | Class | datafu.pig.sets | DataFu |
|
SetIntersect | Computes the set intersection of two or more bags. | Class | datafu.pig.sets | DataFu |
|
SetUnion | Computes the set union of two or more bags. | Class | datafu.pig.sets | DataFu |
|
SHA | Computes the SHA value of a string and outputs it in hex. | Class | datafu.pig.hash | DataFu |
|
SimpleEvalFunc | Uses reflection to make writing simple wrapper Pig UDFs easier. | Class | datafu.pig.util | DataFu |
|
SimpleRandomSample | Scalable simple random sampling. | Class | datafu.pig.sampling | DataFu |
|
SimpleRandomSampleWithReplacementElect | Select the candidate with the smallest score for each position from the candidates proposed by SimpleRandomSampleWithReplacementVote. | Class | datafu.pig.sampling | DataFu |
|
SimpleRandomSampleWithReplacementVote | Scalable simple random sampling with replacement (ScaSRSWR). | Class | datafu.pig.sampling | DataFu |
|
StreamingMedian | Computes the approximate median for a (not necessarily sorted) input bag, using the Munro-Paterson algorithm. | Class | datafu.pig.stats | DataFu |
|
StreamingQuantile | Computes approximate quantiles for a (not necessarily sorted) input bag, using the Munro-Paterson algorithm. | Class | datafu.pig.stats | DataFu |
|
TransposeTupleToBag | Performs a transpose on a tuple, resulting in a bag of key, value fields where the key is the column name and the value is the value of that column in the tuple. | Class | datafu.pig.util | DataFu |
|
UnorderedPairs | Generates pairs of all items in a bag. | Class | datafu.pig.bags | DataFu |
|
UserAgentClassify | Given a user agent string, this UDF classifies clients as 'mobile' or 'desktop'. | Class | datafu.pig.urls | DataFu |
|
VAR | Generates the variance of a set of values. | Class | datafu.pig.stats | DataFu |
|
WeightedSample | Performs weighted Bernoulli sampling on a bag. | Class | datafu.pig.sampling | DataFu |
|
WilsonBinConf | Computes the Wilsonian binomial proportion confidence interval. The constructor requires the confidence interval (alpha) parameter. | Class | datafu.pig.stats | DataFu |
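
The ReservoirSample entry above describes drawing a uniformly random sample of a given size using an in-memory reservoir. As a rough illustration of that general technique, here is a minimal Python sketch of classic reservoir sampling (often called Algorithm R). It is not DataFu's implementation, and the function name `reservoir_sample` and its parameters are hypothetical, chosen only for this example.

```python
import random


def reservoir_sample(items, k, rng=None):
    """Return up to k items drawn uniformly at random from an iterable.

    Illustrative sketch of Algorithm R; not the DataFu ReservoirSample code.
    """
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(items):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Pick a slot in [0, i]; the new item replaces a reservoir
            # entry only when the slot falls inside the reservoir, which
            # happens with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Sample 5 values from a stream of 100 without materializing the stream.
    print(reservoir_sample(range(100), 5))
```

Only the reservoir itself is held in memory, so the input can be arbitrarily long. When the input has fewer than `k` items, the sketch simply returns all of them.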