One of the best ways to gain a deep understanding of a topic is to repeat the experiments of researchers and then modify them in some way. That’s how the best professors and mentors “teach” their students: by encouraging them to duplicate the results of other researchers they’re interested in. You can’t help but tweak an approach if you spend enough time trying to get it to work for you.
Vector space models and semantic search
Semantic Vector Encoding and Similarity Search Using Fulltext Search Engines (https://arxiv.org/pdf/1706.00957.pdf)—Jan Rygl et al. were able to use a conventional inverted index to implement efficient semantic search for all of Wikipedia.
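The core trick in that paper, quantizing dense vectors into discrete tokens that a conventional inverted index can store, can be sketched in a few lines of Python. This is a toy illustration with made-up document vectors and a crude thresholding scheme, not the paper's actual encoding:

```python
from collections import defaultdict

# Hypothetical mini-corpus of dense document vectors (e.g. topic weights).
doc_vectors = {
    "doc1": [0.9, 0.1, 0.0],
    "doc2": [0.2, 0.8, 0.1],
    "doc3": [0.7, 0.3, 0.2],
}

def to_tokens(vector, threshold=0.5):
    """Quantize a dense vector into discrete tokens ("dim0", "dim1", ...)
    so a conventional fulltext engine can index it like words."""
    return {f"dim{i}" for i, weight in enumerate(vector) if weight >= threshold}

# Build the inverted index: token -> set of document ids.
index = defaultdict(set)
for doc_id, vec in doc_vectors.items():
    for token in to_tokens(vec):
        index[token].add(doc_id)

def semantic_search(query_vector):
    """Retrieve documents that share at least one quantized dimension."""
    hits = set()
    for token in to_tokens(query_vector):
        hits |= index[token]
    return sorted(hits)

print(semantic_search([0.8, 0.0, 0.0]))  # documents strong in dimension 0
```

A production system would rank the hits by cosine similarity to the query vector; the inverted index only narrows the candidate set cheaply.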
Learning Low-Dimensional Metrics (https://papers.nips.cc/paper/7002-learning-low-dimensional-metrics.pdf)—Lalit Jain et al. were able to incorporate human judgement into pairwise distance metrics, which can be used for better decision-making and unsupervised clustering of word
vectors and topic vectors. For example, recruiters can use this to steer a content-based recommendation engine that matches
resumes with job descriptions.
RAND-WALK: A latent variable model approach to word embeddings (https://arxiv.org/pdf/1502.03520.pdf) by Sanjeev Arora, Yuanzhi Li, Yingyu Liang, Tengyu Ma, and Andrej Risteski—Explains the latest (2016) understanding of the
“vector-oriented reasoning” of Word2vec and other word vector space models, particularly analogy questions
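The “vector-oriented reasoning” Arora et al. analyze is the familiar analogy arithmetic. Here is a minimal sketch with made-up toy vectors (real Word2vec vectors have hundreds of dimensions and are trained on billions of tokens):

```python
import numpy as np

# Toy word vectors, invented for illustration only.
vectors = {
    "king":   np.array([0.9, 0.8, 0.1]),
    "queen":  np.array([0.9, 0.1, 0.8]),
    "prince": np.array([0.9, 0.8, 0.5]),
    "man":    np.array([0.1, 0.9, 0.1]),
    "woman":  np.array([0.1, 0.1, 0.9]),
}

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def analogy(a, b, c):
    """Solve "a is to b as c is to ?" with vector arithmetic: b - a + c."""
    target = vectors[b] - vectors[a] + vectors[c]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

print(analogy("man", "king", "woman"))  # → queen
```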
Efficient Estimation of Word Representations in Vector Space (https://arxiv.org/pdf/1301.3781.pdf) by Tomas Mikolov, Greg Corrado, Kai Chen, and Jeffrey Dean at Google, Sep 2013—First publication of the Word2vec model,
including an implementation in C++ and pretrained models using a Google News corpus
From Distributional to Semantic Similarity (https://www.era.lib.ed.ac.uk/bitstream/handle/1842/563/IP030023.pdf), 2003 Ph.D. thesis by James Richard Curran—Lots of classic information retrieval (full-text search) research, including TF-IDF normalization and PageRank techniques for web search
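TF-IDF, one of the classic weighting schemes Curran surveys, fits in a few lines of standard-library Python. This is a bare-bones sketch on a toy corpus; real implementations add smoothing and sublinear term-frequency scaling:

```python
import math

# Tiny hypothetical corpus, one tokenized document per list.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "cats and dogs".split(),
]

def tf_idf(term, doc, corpus):
    """Term frequency times inverse document frequency."""
    tf = doc.count(term) / len(doc)
    df = sum(1 for d in corpus if term in d)   # documents containing the term
    idf = math.log(len(corpus) / df)           # rare terms get a boost
    return tf * idf

# "the" appears in 2 of 3 documents, so its weight is discounted;
# "cat" appears in only 1, so it scores higher despite a lower raw count.
print(round(tf_idf("cat", docs[0], docs), 3))
print(round(tf_idf("the", docs[0], docs), 3))
```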
Building a Quantitative Trading Strategy to Beat the S&P 500 (https://www.youtube.com/watch?v=ll6Tq-wTXXw)—At PyCon 2016, Karen Rubin explained how she discovered that female CEOs are predictive of rising stock prices, though not
as strongly as she initially thought.
Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation (https://arxiv.org/pdf/1406.1078.pdf) by Kyunghyun Cho et al., 2014—Paper that first introduced gated recurrent units (GRUs), a simplification of LSTM units that makes recurrent networks more efficient for many NLP tasks
LSTMs and RNNs
We had a lot of difficulty understanding the terminology and architecture of LSTMs. This is a gathering of the most-cited references, so you can let the authors “vote” on the right way to talk about LSTMs. The state of the Wikipedia page (and Talk page discussion) on LSTMs is a pretty good indication of the lack of consensus about what “LSTM” means:
Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation (https://arxiv.org/pdf/1406.1078.pdf) by Cho et al.—Explains how the contents of the memory cells in an LSTM layer can be used as an embedding that encodes a variable-length sequence and then decodes it to a new sequence of potentially different length, translating or transcoding one sequence into another.
Reinforcement Learning with Long Short-Term Memory (https://papers.nips.cc/paper/1953-reinforcement-learning-with-long-short-term-memory.pdf) by Bram Bakker—Application of LSTMs to planning and anticipation, with demonstrations of a network that can solve the T-maze navigation task and an advanced pole-balancing (inverted pendulum) problem.
Supervised Sequence Labelling with Recurrent Neural Networks (https://mediatum.ub.tum.de/doc/673554/file.pdf)—Thesis by Alex Graves with advisor B. Brugge; a detailed explanation of the mathematics for the exact gradient for LSTMs
as first proposed by Hochreiter and Schmidhuber in 1997. But Graves fails to define terms like CEC or LSTM block/cell rigorously.
Theano LSTM documentation (http://deeplearning.net/tutorial/lstm.html) by Pierre Luc Carrier and Kyunghyun Cho—Diagram and discussion explaining the LSTM implementation in Theano and Keras.
Learning to Forget: Continual Prediction with LSTM (http://mng.bz/4v5V) by Felix A. Gers, Jurgen Schmidhuber, and Fred Cummins—Uses nonstandard notation for layer inputs (yin) and outputs (yout) and internal hidden state (h). All math and diagrams are “vectorized.”
Long Short-Term Memory (http://www.bioinf.jku.at/publications/older/2604.pdf) by Sepp Hochreiter and Jurgen Schmidhuber, 1997—Original paper on LSTMs with outdated terminology and inefficient implementation,
but detailed mathematical derivation.
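For readers who want the gist of the derivation without the 1997 notation, here is a minimal numpy sketch of one forward step of an LSTM cell in modern notation. Note that this includes the forget gate, which Gers et al. added after the original paper; the weight shapes and data are made up for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One forward step of an LSTM cell (modern notation, with forget gate).
    The four gates' weights are stacked row-wise in W, U, and b."""
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b      # pre-activations for all four gates
    i = sigmoid(z[0 * n:1 * n])     # input gate
    f = sigmoid(z[1 * n:2 * n])     # forget gate
    o = sigmoid(z[2 * n:3 * n])     # output gate
    g = np.tanh(z[3 * n:4 * n])     # candidate cell update
    c = f * c_prev + i * g          # new cell state (the "CEC" memory)
    h = o * np.tanh(c)              # new hidden state / layer output
    return h, c

# Tiny example: 3-dim input, 2-dim hidden state, random weights.
rng = np.random.default_rng(0)
x = rng.normal(size=3)
h, c = np.zeros(2), np.zeros(2)
W = rng.normal(size=(8, 3))         # 4 gates * hidden size 2
U = rng.normal(size=(8, 2))
b = np.zeros(8)
h, c = lstm_step(x, h, c, W, U, b)
print(h.shape, c.shape)             # (2,) (2,)
```

The cell state `c` carries information across time steps through purely additive updates, which is what lets gradients survive long sequences.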