Standard deviation is the square root of the variance (covered earlier).
The stddev API comes in several variants, listed below. Which one to use depends on whether you want the sample or the population standard deviation, and on whether you pass a column name or a Column:
def stddev(columnName: String): Column
Aggregate function: alias for stddev_samp.
def stddev(e: Column): Column
Aggregate function: alias for stddev_samp.
def stddev_pop(columnName: String): Column
Aggregate function: returns the population standard deviation of the expression in a group.
def stddev_pop(e: Column): Column
Aggregate function: returns the population standard deviation of the expression in a group.
def stddev_samp(columnName: String): Column
Aggregate function: returns the sample standard deviation of the expression in a group.
def stddev_samp(e: Column): Column
Aggregate function: returns the sample standard deviation of the expression in a group.
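The population and sample variants differ only in the divisor applied to the sum of squared deviations: n for the population form, n - 1 for the sample form. The following is a minimal sketch of that difference in plain Scala, without Spark. The object name StddevSketch, its helper methods, and the toy data are all hypothetical and are not part of the Spark API.

```scala
// Plain Scala, no Spark required: the two formulas that stddev_pop and
// stddev_samp correspond to. All names here are illustrative only.
object StddevSketch {
  // Sum of squared deviations from the mean.
  private def sumSquaredDev(xs: Seq[Double]): Double = {
    val mean = xs.sum / xs.size
    xs.map(x => math.pow(x - mean, 2)).sum
  }

  // Population standard deviation: divide by n.
  def stddevPop(xs: Seq[Double]): Double =
    math.sqrt(sumSquaredDev(xs) / xs.size)

  // Sample standard deviation: divide by n - 1.
  def stddevSamp(xs: Seq[Double]): Double =
    math.sqrt(sumSquaredDev(xs) / (xs.size - 1))

  def main(args: Array[String]): Unit = {
    // Hypothetical toy data: mean 5, sum of squared deviations 32.
    val xs = Seq(2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0)
    println(stddevPop(xs))  // prints 2.0, the square root of 32 / 8
    println(stddevSamp(xs)) // the square root of 32 / 7, slightly larger
  }
}
```

For the same data, the sample value is always somewhat larger than the population value, and the gap shrinks as the number of rows grows.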
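Let's look at an example that invokes stddev on the DataFrame to print the standard deviation of the Population column. Because stddev is an alias for stddev_samp, the output column is named stddev_samp(Population):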
scala> import org.apache.spark.sql.functions._
scala> statesPopulationDF.select(stddev("Population")).show
+-----------------------+
|stddev_samp(Population)|
+-----------------------+
|      7044528.191173398|
+-----------------------+
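To see how the three functions relate, they can be computed side by side in a single aggregation. The sketch below assumes a local SparkSession and uses a small hypothetical DataFrame with a single value column rather than statesPopulationDF; in spark-shell, the spark session already exists, so the builder line can be skipped.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{stddev, stddev_pop, stddev_samp}

// Local session for illustration only; spark-shell provides `spark` already.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("stddev-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical toy data, not the states population data used in the text.
val df = Seq(2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0).toDF("value")

// Compute all three aggregates in one pass over the data.
val row = df.agg(
  stddev("value").as("stddev"),
  stddev_samp("value").as("stddev_samp"),
  stddev_pop("value").as("stddev_pop")
).head()

val sd   = row.getDouble(0) // same as stddev_samp, since stddev is an alias
val samp = row.getDouble(1) // sample standard deviation (divides by n - 1)
val pop  = row.getDouble(2) // population standard deviation (divides by n)

println(s"stddev=$sd stddev_samp=$samp stddev_pop=$pop")
```

The stddev and stddev_samp values should match, while stddev_pop should come out slightly smaller because it divides by n rather than n - 1.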