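Keras includes a make_sampling_table function that, combined with the skipgrams function, allows us to create a training set of word pairs, each a target word with either a true context word or a randomly drawn noise word, with corresponding labels and with words sampled according to their corpus frequencies.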
The result is 27 million positive and negative examples of context and target pairs.
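To make the sampling step concrete, the following is a minimal pure-Python sketch of the idea, not the Keras implementation itself. The function names `sampling_table` and `skipgram_pairs` and the toy sequence are hypothetical, chosen for illustration. The sketch assumes that word ids are frequency ranks (1 is the most common word), that id 0 is reserved for padding, and that word frequencies roughly follow Zipf's law.

```python
import math
import random


def sampling_table(size, sampling_factor=1e-5):
    """Keep-probability per word rank, assuming Zipf-distributed frequencies.

    Frequent words (low rank) get a low probability, so they are
    subsampled more aggressively than rare words.
    """
    gamma = 0.577  # Euler-Mascheroni constant
    table = []
    for rank in range(size):
        r = max(rank, 1)  # treat rank 0 like rank 1 to avoid log(0)
        inv_freq = r * (math.log(r) + gamma) + 0.5 - 1.0 / (12.0 * r)
        f = sampling_factor * inv_freq
        table.append(min(1.0, math.sqrt(f)))
    return table


def skipgram_pairs(sequence, vocab_size, window_size=2,
                   negative_samples=1.0, table=None, seed=0):
    """Return (pairs, labels): label 1 for context pairs, 0 for noise pairs."""
    rng = random.Random(seed)
    pairs, labels = [], []
    for i, target in enumerate(sequence):
        if target == 0:
            continue  # skip padding
        if table is not None and table[target] < rng.random():
            continue  # subsample frequent target words
        start = max(0, i - window_size)
        end = min(len(sequence), i + window_size + 1)
        for j in range(start, end):
            if j != i and sequence[j] != 0:
                pairs.append((target, sequence[j]))
                labels.append(1)

    # Negative examples: pair existing targets with random noise words.
    targets = [p[0] for p in pairs]
    n_negative = int(len(targets) * negative_samples)
    for k in range(n_negative):
        noise = rng.randint(1, vocab_size - 1)
        pairs.append((targets[k % len(targets)], noise))
        labels.append(0)
    return pairs, labels


# Toy usage: a four-word "document" over a vocabulary of ten ids.
pairs, labels = skipgram_pairs([1, 2, 3, 4], vocab_size=10, window_size=1)
```

With `negative_samples=1.0`, the sketch draws one noise pair per context pair, so the positive and negative classes are balanced. Noise words are drawn uniformly here for simplicity, and a noise word may occasionally coincide with a true context word.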