| Name | Description | Type | Package | Framework |
| --- | --- | --- | --- | --- |
| AliasableEvalFunc | Makes implementing and using UDFs easier by enabling named parameters. | Class | datafu.pig.util | DataFu |
| AppendToBag | Appends a tuple to a bag. | Class | datafu.pig.bags | DataFu |
| Assert | Assert has been renamed to AssertUDF. | Class | datafu.pig.util | DataFu |
| AssertUDF | Filter function which asserts that a value is true. | Class | datafu.pig.util | DataFu |
| BagConcat | Unions all input bags to produce a single bag containing all tuples. | Class | datafu.pig.bags | DataFu |
| BagGroup | Performs an in-memory group operation on a bag. | Class | datafu.pig.bags | DataFu |
| BagLeftOuterJoin | Performs an in-memory left outer join across multiple bags. | Class | datafu.pig.bags | DataFu |
| BagSplit | Splits a bag of tuples into a bag of bags, where the inner bags collectively contain the tuples from the original bag. | Class | datafu.pig.bags | DataFu |
| BoolToInt | UDF which converts a Boolean to an Integer. | Class | datafu.pig.util | DataFu |
| Coalesce | Returns the first non-null value from a tuple, just like COALESCE in SQL. | Class | datafu.pig.util | DataFu |
| ContextualEvalFunc | An abstract class which enables UDFs to store instance properties on the front end which will be available on the back end. | Class | datafu.pig.util | DataFu |
| CountEach | Generates a count of the number of times each distinct tuple appears in a bag. | Class | datafu.pig.bags | DataFu |
| DataFuException | Exception class used by DataFu. | Class | datafu.pig.util | DataFu |
| DistinctBy | Get distinct elements in a bag by a given set of field positions. | Class | datafu.pig.bags | DataFu |
| EmptyBagToNull | Returns null if the input is an empty bag; otherwise, returns the input bag unchanged. | Class | datafu.pig.bags | DataFu |
| EmptyBagToNullFields | For an empty bag, inserts a tuple having null values for all fields; otherwise, the input bag is returned unchanged. | Class | datafu.pig.bags | DataFu |
| Enumerate | Enumerate a bag, appending to each tuple its index within the bag. | Class | datafu.pig.bags | DataFu |
| FieldNotFound | Thrown by AliasableEvalFunc when attempting to access an unknown field by name. | Class | datafu.pig.util | DataFu |
| FirstTupleFromBag | Returns the first tuple from a bag. | Class | datafu.pig.bags | DataFu |
| HaversineDistInMiles | Computes the distance (in miles) between two latitude-longitude pairs using the Haversine formula. | Class | datafu.pig.geo | DataFu |
| HyperLogLogPlusPlus | A UDF that applies the HyperLogLog++ cardinality estimation algorithm. | Class | datafu.pig.stats | DataFu |
| In | In has been renamed to InUDF. | Class | datafu.pig.util | DataFu |
| IntToBool | UDF which converts an Integer to a Boolean. | Class | datafu.pig.util | DataFu |
| InUDF | Similar to the SQL IN function, this function provides a convenient way to filter using a logical disjunction over many values. | Class | datafu.pig.util | DataFu |
| MarkovPairs | Accepts a bag of tuples, with user supplied ordering, and generates pairs that can be used for a Markov chain analysis. | Class | datafu.pig.stats | DataFu |
| MD5 | Computes the MD5 value of a string and outputs it in hex (by default). | Class | datafu.pig.hash | DataFu |
| Median | Computes the median for a sorted input bag, using type R-2 estimation. | Class | datafu.pig.stats | DataFu |
| NullToEmptyBag | Returns an empty bag if the input is null; otherwise, returns the input bag unchanged. | Class | datafu.pig.bags | DataFu |
| PageRank | A UDF which implements PageRank. | Class | datafu.pig.linkanalysis | DataFu |
| PageRankImpl | An implementation of PageRank, used by the PageRank UDF. | Class | datafu.pig.linkanalysis | DataFu |
| PrependToBag | Prepends a tuple to a bag. | Class | datafu.pig.bags | DataFu |
| Quantile | Computes quantiles for a sorted input bag, using type R-2 estimation. | Class | datafu.pig.stats | DataFu |
| QuantileUtil | Methods used by Quantile. | Class | datafu.pig.stats | DataFu |
| RandInt | Generates a uniformly distributed integer between two bounds. | Class | datafu.pig.random | DataFu |
| ReservoirSample | Performs a simple random sample using an in-memory reservoir to produce a uniformly random sample of a given size. | Class | datafu.pig.sampling | DataFu |
| ReverseEnumerate | Enumerate a bag, appending to each tuple its index within the bag, with indices being produced in descending order. For example: {(A),(B),(C),(D)} => {(A,3),(B,2),(C,1),(D,0)} | Class | datafu.pig.bags | DataFu |
| SampleByKey | Provides a way of sampling tuples based on certain fields. | Class | datafu.pig.sampling | DataFu |
| SessionCount | Performs a count of events, ignoring events which occur within the same time window. This is useful for tasks such as counting the number of page views per user. | Class | datafu.pig.sessions | DataFu |
| Sessionize | Sessionizes an input stream, appending a session ID to each tuple. | Class | datafu.pig.sessions | DataFu |
| SetDifference | Computes the set difference of two or more bags. | Class | datafu.pig.sets | DataFu |
| SetIntersect | Computes the set intersection of two or more bags. | Class | datafu.pig.sets | DataFu |
| SetUnion | Computes the set union of two or more bags. | Class | datafu.pig.sets | DataFu |
| SHA | Computes the SHA hash of a string and outputs it in hex. | Class | datafu.pig.hash | DataFu |
| SimpleEvalFunc | Uses reflection to make writing simple wrapper Pig UDFs easier. | Class | datafu.pig.util | DataFu |
| SimpleRandomSample | Scalable simple random sampling. | Class | datafu.pig.sampling | DataFu |
| SimpleRandomSampleWithReplacementElect | Select the candidate with the smallest score for each position from the candidates proposed by SimpleRandomSampleWithReplacementVote. | Class | datafu.pig.sampling | DataFu |
| SimpleRandomSampleWithReplacementVote | Scalable simple random sampling with replacement (ScaSRSWR). | Class | datafu.pig.sampling | DataFu |
| StreamingMedian | Computes the approximate median for a (not necessarily sorted) input bag, using the Munro-Paterson algorithm. | Class | datafu.pig.stats | DataFu |
| StreamingQuantile | Computes approximate quantiles for a (not necessarily sorted) input bag, using the Munro-Paterson algorithm. | Class | datafu.pig.stats | DataFu |
| TransposeTupleToBag | Performs a transpose on a tuple, resulting in a bag of key, value fields where the key is the column name and the value is the value of that column in the tuple. | Class | datafu.pig.util | DataFu |
| UnorderedPairs | Generates pairs of all items in a bag. | Class | datafu.pig.bags | DataFu |
| UserAgentClassify | Given a user agent string, this UDF classifies clients as 'mobile' or 'desktop'. | Class | datafu.pig.urls | DataFu |
| VAR | Computes the variance of a set of values. | Class | datafu.pig.stats | DataFu |
| WeightedSample | Performs weighted Bernoulli sampling on a bag. | Class | datafu.pig.sampling | DataFu |
| WilsonBinConf | Computes the Wilsonian binomial proportion confidence interval. The constructor requires the confidence interval (alpha) parameter. | Class | datafu.pig.stats | DataFu |
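
Several of the bag and utility entries above describe behavior precisely enough to sketch. The following is a minimal illustration in plain Python of what Enumerate, ReverseEnumerate, Coalesce, EmptyBagToNull, and NullToEmptyBag are described as doing. It is not DataFu code and does not use the DataFu or Pig APIs: the function names are hypothetical, a bag is modeled as a list of tuples, and the 0-based starting index for Enumerate is an assumption.

```python
from typing import Any, List, Optional, Tuple

# Assumption: a Pig bag is modeled here as a plain list of tuples.
Bag = List[Tuple[Any, ...]]


def enumerate_bag(bag: Bag, start: int = 0) -> Bag:
    """Append to each tuple its index within the bag (0-based start is assumed)."""
    return [t + (i,) for i, t in enumerate(bag, start)]


def reverse_enumerate_bag(bag: Bag) -> Bag:
    """Append to each tuple its index, with indices in descending order down to 0."""
    n = len(bag)
    return [t + (n - 1 - i,) for i, t in enumerate(bag)]


def coalesce(*values: Any) -> Any:
    """Return the first non-null value, like COALESCE in SQL; None if all are null."""
    return next((v for v in values if v is not None), None)


def empty_bag_to_null(bag: Optional[Bag]) -> Optional[Bag]:
    """Return None for an empty bag; otherwise return the input unchanged."""
    if bag is not None and len(bag) == 0:
        return None
    return bag


def null_to_empty_bag(bag: Optional[Bag]) -> Bag:
    """Return an empty bag for None; otherwise return the input unchanged."""
    return [] if bag is None else bag
```

Under these assumptions, `reverse_enumerate_bag([("A",), ("B",), ("C",), ("D",)])` returns `[("A", 3), ("B", 2), ("C", 1), ("D", 0)]`, matching the example in the ReverseEnumerate row, and `coalesce(None, 5, 7)` returns `5`.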