Stanford Datasets (https://nlp.stanford.edu/data/)—Pretrained word2vec and GloVe models, multilingual language models and datasets, multilingual dictionaries, lexica, and corpora.
nlpia (https://github.com/totalgood/nlpia)—Python package with data loaders (nlpia.loaders) and preprocessors for all the NLP data you’ll ever need... until you finish this book ;).