Mathematical or statistical transformations apply a mathematical or statistical operation to an existing RDD and generate a new RDD from the result. Sampling is a good example of this and is used often in Spark programs.
Examples of such transformations are:
- sampleByKey
- randomSplit
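
To make the two sampling transformations listed above more concrete, here is a plain-Python sketch of their semantics on ordinary lists. It is not Spark code and does not claim to mirror Spark's internal implementation. In Spark itself the corresponding calls are `rdd.sampleByKey(withReplacement, fractions, seed)` on a pair RDD and `rdd.randomSplit(weights, seed)`. The helper names `sample_by_key` and `random_split` are hypothetical, and the sketch assumes sampling without replacement.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def sample_by_key(pairs, fractions, seed=None):
    """Stratified sample: keep each (key, value) pair with probability fractions[key].

    Hypothetical helper that mimics the idea behind sampleByKey without
    replacement; every key in `pairs` must have an entry in `fractions`.
    """
    rng = random.Random(seed)
    return [(k, v) for k, v in pairs if rng.random() < fractions[k]]


def random_split(items, weights, seed=None):
    """Split items into len(weights) disjoint lists, sized roughly by weight.

    Hypothetical helper that mimics the idea behind randomSplit: weights are
    normalized to sum to 1, and each item is assigned to exactly one split.
    """
    total = sum(weights)
    # Cumulative upper bounds of each split's share of [0, 1).
    bounds = list(accumulate(w / total for w in weights))
    rng = random.Random(seed)
    splits = [[] for _ in weights]
    for item in items:
        # Clamp guards against floating-point rounding in the last bound.
        idx = min(bisect_right(bounds, rng.random()), len(weights) - 1)
        splits[idx].append(item)
    return splits


# Usage: sample about 50% of key "a" and about 10% of key "b".
pairs = [("a", i) for i in range(100)] + [("b", i) for i in range(100)]
sampled = sample_by_key(pairs, {"a": 0.5, "b": 0.1}, seed=7)

# Usage: an approximate 70/30 split, e.g. for training and test data.
train, test = random_split(range(1000), [0.7, 0.3], seed=7)
```

Both helpers produce new collections and leave their input untouched, which matches the way transformations derive new RDDs from existing ones. The resulting sizes are only approximately proportional to the requested fractions and weights, because each element is selected independently at random.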