The average of a set of values is calculated by adding the values together and dividing the sum by the number of values.
For example, the average of 1, 2, and 3 is (1 + 2 + 3) / 3 = 6 / 3 = 2.
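The same arithmetic can be reproduced in plain Scala, without Spark. This is only a minimal sketch; the names values and average are illustrative and do not come from the avg API described in this section.

```scala
// Plain Scala, no Spark required.
val values = Seq(1, 2, 3)

// Convert the sum to Double before dividing; Int / Int would truncate
// any fractional part of the result.
val average = values.sum.toDouble / values.size

println(average) // prints 2.0
```

The conversion to Double matters for inputs such as 1, 2, 4, where integer division would give 2 rather than 2.333.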
The avg API has two overloads, shown below. Which one you use depends on whether you refer to the column by name or pass a Column object:
def avg(columnName: String): Column
Aggregate function: returns the average of the values in a group.
def avg(e: Column): Column
Aggregate function: returns the average of the values in a group.
Let's look at an example of invoking avg on the DataFrame to print the average population:
import org.apache.spark.sql.functions._
scala> statesPopulationDF.select(avg("Population")).show
+-----------------+
|  avg(Population)|
+-----------------+
|6253399.371428572|
+-----------------+
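Both overloads can be tried on a small in-memory DataFrame. The following is a sketch under stated assumptions: the sample rows, the State column, and the AvgPopulation alias are hypothetical and are not taken from statesPopulationDF. It is written as a script, so it can be pasted into spark-shell or run anywhere Spark SQL is on the classpath.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{avg, col}

// Local session for experimentation; in spark-shell, getOrCreate()
// returns the session that already exists.
val spark = SparkSession.builder()
  .appName("AvgExample")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data, not the real state population dataset.
val df = Seq(
  ("A", 10L),
  ("A", 20L),
  ("B", 30L)
).toDF("State", "Population")

// Overload 1: refer to the column by name.
val byName = df.select(avg("Population")).first().getDouble(0)

// Overload 2: pass a Column object.
val byColumn = df.select(avg(col("Population"))).first().getDouble(0)

// As an aggregate function, avg can also be applied per group.
val perState = df.groupBy("State")
  .agg(avg(col("Population")).as("AvgPopulation"))

println(byName)
println(byColumn)
perState.show()
```

The two overloads should return the same value; the Column form is mainly useful when the input is an expression rather than a plain column name.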