The sample() method

The sample() method draws a random sample from a DataFrame or Series. Its parameters support sampling along either axis, with or without replacement. Specify either the number of records to sample (n) or the fraction of records to sample (frac), but not both; if neither is given, a single record is returned.
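
As a minimal sketch of the two ways to set the sample size, the following uses a small made-up DataFrame. The name demo_df and its contents are assumptions for illustration, not the sample_df used in this section:

```python
import pandas as pd

# Hypothetical stand-in data; not the book's sample_df.
demo_df = pd.DataFrame({
    "Movie": ["A", "B", "C", "D", "E", "F"],
    "Rating": [7.1, 8.3, 6.5, 9.0, 5.8, 7.7],
})

by_count = demo_df.sample(n=2)          # exactly two rows
by_fraction = demo_df.sample(frac=0.5)  # half of the six rows
default_one = demo_df.sample()          # neither n nor frac: one row

# Passing both n and frac is rejected with a ValueError.
try:
    demo_df.sample(n=2, frac=0.5)
    both_allowed = True
except ValueError:
    both_allowed = False

print(len(by_count), len(by_fraction), len(default_one), both_allowed)
```

Which rows are returned changes from run to run, but the row counts do not.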

Let's sample three records from the Movie series:

sample_df["Movie"].sample(3)

The following is the output:

Sample function for series

Now, let's sample 50% of the columns from the DataFrame:

sample_df.sample(frac=0.5, axis=1)

The following is the output:

Column sampling with the fraction parameter

The replace parameter can be set to True or False to sample with or without replacement; the default is False. The random_state parameter sets the seed of the random number generator, which makes a sample reproducible; it relies on the random module of the NumPy package.
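
The effect of these two parameters can be sketched as follows. The DataFrame demo_df and the seed values are assumptions chosen for illustration, not part of the original example:

```python
import pandas as pd

# Hypothetical stand-in data; not the book's sample_df.
demo_df = pd.DataFrame({
    "Movie": ["A", "B", "C", "D"],
    "Rating": [7.1, 8.3, 6.5, 9.0],
})

# With replacement, more rows can be drawn than the DataFrame holds,
# so some rows necessarily appear more than once.
oversampled = demo_df.sample(n=10, replace=True, random_state=42)

# Without replacement (the default), the same request fails.
try:
    demo_df.sample(n=10)
    too_many_allowed = True
except ValueError:
    too_many_allowed = False

# The same seed reproduces the same sample.
first = demo_df.sample(n=2, random_state=0)
second = demo_df.sample(n=2, random_state=0)
same_sample = first.equals(second)

print(len(oversampled), too_many_allowed, same_sample)
```

Fixing random_state is mainly useful when a result has to be repeatable, for example in a test or a shared notebook.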
