Paper References from "Deep Learning School 2016"

Written on October 29, 2019
[ deep-learning  easi  ]

I’ve been watching through the video lectures of the “Deep Learning School 2016” playlist on Lex Fridman’s YouTube account. While doing so, I found it useful to collect and collate all the references in each lecture (or as many as I could distinguish and find).

2. Deep Learning for Computer Vision (Andrej Karpathy, OpenAI)

  • 1998: LeCun et al: Gradient-Based Learning Applied to Document Recognition: http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf
  • 2012: Krizhevsky et al: ImageNet Classification with Deep Convolutional Neural Networks (AlexNet): http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
  • 2013: Donahue et al: DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition: http://proceedings.mlr.press/v32/donahue14.pdf
  • 2013: Zeiler & Fergus: Stochastic Pooling for Regularization of Deep Convolutional Neural Networks: https://arxiv.org/pdf/1301.3557.pdf
  • 2014: Razavian et al: CNN Features off-the-shelf: an Astounding Baseline for Recognition: https://www.cv-foundation.org/openaccess/content_cvpr_workshops_2014/W15/papers/Razavian_CNN_Features_Off-the-Shelf_2014_CVPR_paper.pdf
  • 2014: Cadieu et al: Deep Neural Networks Rival the Representation of Primate IT Cortex for Core Visual Object Recognition: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003963
  • 2014: Simonyan & Zisserman: Very Deep Convolutional Networks for Large-Scale Image Recognition (VGGNet): https://arxiv.org/pdf/1409.1556.pdf
  • 2015: Szegedy et al: Going Deeper with Convolutions (Inception): https://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Szegedy_Going_Deeper_With_2015_CVPR_paper.pdf
  • 2015: Yosinski et al: Understanding Neural Networks Through Deep Visualization: https://arxiv.org/pdf/1506.06579.pdf
  • 2015: He et al: Deep Residual Learning for Image Recognition (ResNet): https://arxiv.org/abs/1512.03385
  • 2016: He et al: Identity Mappings in Deep Residual Networks: https://arxiv.org/pdf/1603.05027.pdf
  • 2016: Huang et al: Deep Networks with Stochastic Depth: https://arxiv.org/pdf/1603.09382.pdf
  • 2016: van den Oord et al: WaveNet: A Generative Model for Raw Audio: https://arxiv.org/pdf/1609.03499.pdf
  • 2016: Targ et al: ResNet in ResNet: Generalizing Residual Architectures: https://arxiv.org/pdf/1603.08029.pdf
  • 2016: Wang et al: Deeply-Fused Nets: https://arxiv.org/pdf/1605.07716.pdf
  • 2016: Shen et al: Weighted Residuals for Very Deep Networks: https://arxiv.org/pdf/1605.08831.pdf
  • 2016: Zhang et al: Residual Networks of Residual Networks: Multilevel Residual Networks: https://arxiv.org/pdf/1608.02908.pdf
  • 2016: Redmon et al: You Only Look Once: Unified, Real-Time Object Detection: https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Redmon_You_Only_Look_CVPR_2016_paper.pdf
  • 2016: Singh et al: Swapout: Learning an ensemble of deep architectures: https://papers.nips.cc/paper/6205-swapout-learning-an-ensemble-of-deep-architectures.pdf
  • 2016: Johnson et al: DenseCap: Fully Convolutional Localization Networks for Dense Captioning: http://openaccess.thecvf.com/content_cvpr_2016/papers/Johnson_DenseCap_Fully_Convolutional_CVPR_2016_paper.pdf
  • 2017: Zagoruyko & Komodakis: Wide Residual Networks: https://arxiv.org/pdf/1605.07146.pdf
  • 2017: Huang et al: Densely Connected Convolutional Networks: http://openaccess.thecvf.com/content_cvpr_2017/papers/Huang_Densely_Connected_Convolutional_CVPR_2017_paper.pdf
  • 2017: Larsson et al: FractalNet: Ultra-Deep Neural Networks without Residuals: https://arxiv.org/pdf/1605.07648.pdf

3. Deep Learning for Natural Language Processing (Richard Socher, Salesforce)

  • 1997: Hochreiter & Schmidhuber: Long Short-Term Memory: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.676.4320&rep=rep1&type=pdf
  • 2010: Mikolov et al: Recurrent Neural Network Based Language Model: https://www.isca-speech.org/archive/archive_papers/interspeech_2010/i10_1045.pdf
  • 2013: Mikolov et al (word2vec ref): Efficient Estimation of Word Representations in Vector Space: https://arxiv.org/pdf/1301.3781.pdf
  • 2013: Mikolov et al (word2vec ref): Distributed Representations of Words and Phrases and their Compositionality: https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf
  • 2013: Mikolov et al (word2vec ref): Linguistic Regularities in Continuous Space Word Representations: https://www.microsoft.com/en-us/research/publication/linguistic-regularities-in-continuous-space-word-representations/?from=http%3A%2F%2Fresearch.microsoft.com%2Fpubs%2F189726%2Frvecs.pdf
  • 2013: Mikolov et al: Exploiting Similarities among Languages for Machine Translation: https://arxiv.org/pdf/1309.4168.pdf
  • 2013: Socher et al: Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank: https://www.aclweb.org/anthology/D13-1170/
  • 2014: Cho et al: On the Properties of Neural Machine Translation: Encoder-Decoder Approaches: https://arxiv.org/abs/1409.1259
  • 2014: Chung et al: Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling: https://arxiv.org/abs/1412.3555
  • 2014: Graves et al: Neural Turing Machines: https://arxiv.org/abs/1410.5401
  • 2014: Irsoy & Cardie: Opinion Mining with Deep Recurrent Neural Networks: https://www.aclweb.org/anthology/D14-1080/
  • 2014: Irsoy & Cardie: Deep Recursive Neural Networks for Compositionality in Language: http://papers.nips.cc/paper/5551-deep-recursive-neural-networks-for-compositionality-in-language
  • 2014: Kalchbrenner et al: A Convolutional Neural Network for Modelling Sentences: https://arxiv.org/abs/1404.2188
  • 2014: Kim: Convolutional Neural Networks for Sentence Classification: https://arxiv.org/abs/1408.5882
  • 2014: Le & Mikolov: Distributed Representations of Sentences and Documents: http://proceedings.mlr.press/v32/le14.pdf
  • 2014: Pennington et al (GloVe ref): Glove: Global Vectors for Word Representation: https://www.aclweb.org/anthology/D14-1162/
  • 2014: Sutskever et al: Sequence to Sequence Learning with Neural Networks: http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks
  • 2014: Weston et al: Memory Networks: https://arxiv.org/abs/1410.3916
  • 2014: Zaremba et al: Recurrent Neural Network Regularization: https://arxiv.org/abs/1409.2329
  • 2014: Zaremba & Sutskever: Learning to Execute: https://arxiv.org/abs/1410.4615
  • 2015: Antol et al: VQA: Visual Question Answering: http://openaccess.thecvf.com/content_iccv_2015/html/Antol_VQA_Visual_Question_ICCV_2015_paper.html
  • 2015: Gal & Ghahramani: A Theoretically Grounded Application of Dropout in Recurrent Neural Networks: http://papers.nips.cc/paper/6241-a-theoretically-grounded-application-of-dropout-in-recurren
  • 2015: Grefenstette et al: Learning to Transduce with Unbounded Memory: http://papers.nips.cc/paper/5648-learning-to-transduce-with-unbounded-memory
  • 2015: Hermann et al: Teaching Machines to Read and Comprehend: http://papers.nips.cc/paper/5945-teaching-machines-to-read-and-comprehend
  • 2015: Huang et al: Bidirectional LSTM-CRF Models for Sequence Tagging: https://arxiv.org/abs/1508.01991
  • 2015: Sukhbaatar et al: End-To-End Memory Networks: http://papers.nips.cc/paper/5846-end-to-end-memorynetworks
  • 2015: Tai et al: Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks: https://arxiv.org/abs/1503.00075
  • 2015: Weston et al: Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks: https://arxiv.org/abs/1502.05698
  • 2015: Zhang et al: Structured Memory for Neural Turing Machines: https://arxiv.org/abs/1510.03931
  • 2015: Zhou et al: Simple Baseline for Visual Question Answering: https://arxiv.org/abs/1512.02167
  • 2016: Andreas et al: Neural Module Networks: http://openaccess.thecvf.com/content_cvpr_2016/html/Andreas_Neural_Module_Networks_CVPR_2016_paper.html
  • 2016: Andreas et al: Learning to Compose Neural Networks for Question Answering: https://arxiv.org/abs/1601.01705
  • 2016: Kumar et al: Ask Me Anything: Dynamic Memory Networks for Natural Language Processing: http://proceedings.mlr.press/v48/kumar16.pdf
  • 2016: Merity et al: Pointer Sentinel Mixture Models: https://arxiv.org/abs/1609.07843
  • 2016: Noh et al: Image Question Answering Using Convolutional Neural Network With Dynamic Parameter Prediction: http://openaccess.thecvf.com/content_cvpr_2016/html/Noh_Image_Question_Answering_CVPR_2016_paper.html
  • 2016: Yang et al: Stacked Attention Networks for Image Question Answering: http://openaccess.thecvf.com/content_cvpr_2016/html/Yang_Stacked_Attention_Networks_CVPR_2016_paper.html
  • 2017: Zilly et al: Recurrent Highway Networks: https://arxiv.org/pdf/1607.03474.pdf

4. TensorFlow Tutorial (Sherry Moore, Google Brain)

NOTE: this video is old and TF has changed A LOT since it was recorded. Some of the code in Sherry’s TF tutorial (link below) will likely trigger deprecation warnings, and some of it may not run at all. A good exercise would be to port it to TF 2.x; a rough sketch of what that kind of rewrite looks like is included just below.
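
To give a feel for the exercise, here is a minimal TF 2.x sketch of the kind of model the old TF 1.x tutorials build (an MNIST softmax classifier via tf.keras instead of sessions and placeholders). This is NOT Sherry’s code; the optimizer, batch size, and epoch count are my own choices for illustration.

# Minimal TF 2.x sketch (not Sherry's code): MNIST softmax classifier with tf.keras,
# replacing the TF 1.x Session/placeholder style.
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixel values to [0, 1]

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),    # 28x28 image -> 784-dim vector
    tf.keras.layers.Dense(10, activation="softmax"),  # linear softmax classifier
])
model.compile(optimizer="sgd",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, batch_size=100)
model.evaluate(x_test, y_test)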

  • Sherry’s TF Tutorial: https://github.com/sherrym/tf-tutorial/

  • Train your own image classifier with Inception in TensorFlow: https://ai.googleblog.com/2016/03/train-your-own-image-classifier-with.html

  • Models on TensorFlow (NOTE - many links in video no longer work; UPDATED links below):

    • Models Page: https://github.com/tensorflow/models
    • Inception: https://github.com/tensorflow/models/tree/master/research/inception
    • A Neural Image Caption Generator: https://github.com/tensorflow/models/tree/master/research/im2txt
    • Language Model (1B words): https://github.com/tensorflow/models/tree/master/research/lm_1b
    • SyntaxNet: https://github.com/tensorflow/models/tree/master/research/syntaxnet
    • ResNet: https://github.com/tensorflow/models/tree/master/official/resnet
    • Seq2Seq w/ Attention for Text Summarization: https://github.com/tensorflow/models/tree/master/research/textsum
    • Image Compression: https://github.com/tensorflow/models/tree/master/research/compression
    • Autoencoder: https://github.com/tensorflow/models/tree/master/research/autoencoder
    • Spatial Transformer Network: https://github.com/tensorflow/models/tree/master/research/transformer

5. Foundations of Unsupervised Deep Learning (Ruslan Salakhutdinov, CMU)

  • 1995: Hinton et al: The wake-sleep algorithm for unsupervised neural networks: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.51.215&rep=rep1&type=pdf
  • 1996: Olshausen & Field: Natural image statistics and efficient coding: https://pdfs.semanticscholar.org/4435/2b35791ceaad3439b8ccf165cc9b4978d801.pdf
  • 1996: Olshausen & Field: Sparse coding of natural images produces localized, oriented, bandpass receptive fields: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.38.6079&rep=rep1&type=pdf
  • 2002: Hinton: Training Products of Experts by Minimizing Contrastive Divergence: http://www.cs.utoronto.ca/~hinton/absps/nccd.pdf
  • 2006: Hinton et al: A Fast Learning Algorithm for Deep Belief Nets: https://www.mitpressjournals.org/doi/pdfplus/10.1162/neco.2006.18.7.1527
  • 2006: Hinton & Salakhutdinov: Reducing the dimensionality of data with neural networks: https://www.semanticscholar.org/paper/Reducing-the-dimensionality-of-data-with-neural-Hinton-Salakhutdinov/46eb79e5eec8a4e2b2f5652b66441e8a4c921c3e
  • 2006: Lee et al: Efficient sparse coding algorithms: http://papers.nips.cc/paper/2979-efficient-sparse-coding-algorithms.pdf
  • 2007: Bengio et al: Greedy Layer-Wise Training of Deep Networks: http://papers.nips.cc/paper/3048-greedy-layer-wise-training-of-deep-networks.pdf
  • 2007: Salakhutdinov et al: Restricted Boltzmann Machines for Collaborative Filtering: http://www.utstat.toronto.edu/~rsalakhu/papers/rbmcf.pdf
  • 2008: Salakhutdinov: Learning and Evaluating Boltzmann Machines: http://www.cs.toronto.edu/~rsalakhu/papers/bm.pdf
  • 2008: Torralba et al: Small Codes and Large Image Databases for Recognition: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.229.3256&rep=rep1&type=pdf
  • 2009: Bengio: Learning Deep Architectures for AI: http://axon.cs.byu.edu/~martinez/classes/678/Papers/ftml.pdf
  • 2009: Kavukcuoglu et al: Learning Invariant Features through Topographic Filter Maps: http://yann.lecun.org/exdb/publis/pdf/koray-cvpr-09.pdf
  • 2009: Kulis & Darrell: Learning to Hash with Binary Reconstructive Embeddings: http://papers.nips.cc/paper/3667-learning-to-hash-with-binary-reconstructive-embeddings.pdf
  • 2009: Larochelle et al: Exploring Strategies for Training Deep Neural Networks: http://www.jmlr.org/papers/volume10/larochelle09a/larochelle09a.pdf
  • 2009: Weiss et al: Spectral Hashing: https://papers.nips.cc/paper/3383-spectral-hashing.pdf
  • 2009: Salakhutdinov & Hinton: Deep Boltzmann Machines: http://proceedings.mlr.press/v5/salakhutdinov09a/salakhutdinov09a.pdf
  • 2010: Salakhutdinov & Larochelle: Efficient Learning of Deep Boltzmann Machines: http://proceedings.mlr.press/v9/salakhutdinov10a/salakhutdinov10a.pdf
  • 2011: Larochelle et al: The Neural Autoregressive Distribution Estimator (NADE): http://proceedings.mlr.press/v15/larochelle11a/larochelle11a.pdf
  • 2012: Hinton & Salakhutdinov: A Better Way to Pretrain Deep Boltzmann Machines: http://papers.nips.cc/paper/4610-a-better-way-to-pretrain-deep-boltzmann-machines
  • 2012: Srivastava & Salakhutdinov: Multimodal Learning with Deep Boltzmann Machines: https://papers.nips.cc/paper/4683-multimodal-learning-with-deep-boltzmann-machines.pdf
  • 2012: Srivastava & Salakhutdinov: Learning Representations for Multimodal Data with Deep Belief Nets: https://pdfs.semanticscholar.org/5555/b28607cada5474bca772e1cc553b624415c9.pdf
  • 2013: Tang & Salakhutdinov: Learning Stochastic Feedforward Neural Networks: http://papers.nips.cc/paper/5026-learning-stochastic-feedforward-neural-networks
  • 2013: Uria et al: RNADE: The real-valued neural autoregressive density-estimator: http://papers.nips.cc/paper/5060-rnade-the-real-valued-neural-autoregressive-density-estimator
  • 2014: Bornschein & Bengio: Reweighted Wake-Sleep: https://arxiv.org/abs/1406.2751
  • 2014: Goodfellow et al: Generative Adversarial Nets: http://papers.nips.cc/paper/5423-generative-adversarial-nets
  • 2014: Kingma & Welling: Stochastic Gradient VB and the Variational Auto-Encoder (“reparameterization trick”): https://pdfs.semanticscholar.org/eaa6/bf5334bc647153518d0205dca2f73aea971e.pdf
  • 2014: Kiros et al: Multimodal Neural Language Models: http://proceedings.mlr.press/v32/kiros14.pdf
  • 2014: Kiros et al: Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models: https://arxiv.org/abs/1411.2539
  • 2014: Mnih & Gregor: Neural Variational Inference and Learning in Belief Networks: https://arxiv.org/abs/1402.0030
  • 2014: Rezende et al: Stochastic Backpropagation and Approximate Inference in Deep Generative Models: https://arxiv.org/abs/1401.4082
  • 2014: Uria et al: A Deep and Tractable Density Estimator: http://proceedings.mlr.press/v32/uria14.pdf
  • 2015: Burda et al: Importance Weighted Autoencoders: https://arxiv.org/abs/1509.00519
  • 2015: Denton et al: Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks (LaPGAN): http://papers.nips.cc/paper/5773-deep-generative-image-models-using-a-5
  • 2015: Gregor et al: DRAW: A Recurrent Neural Network For Image Generation: https://arxiv.org/abs/1502.04623
  • 2015: Lake et al: Human-Level Concept Learning through Probabilistic Program Induction: https://www.sas.upenn.edu/~astocker/lab/teaching-files/PSYC739-2016/Lake_etal2015.pdf
  • 2015: Mansimov et al: Generating Images from Captions with Attention: https://arxiv.org/abs/1511.02793
  • 2015: Radford et al: Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks: https://arxiv.org/abs/1511.06434
  • 2016: Salimans et al: Improved Techniques for Training GANs: http://papers.nips.cc/paper/6124-improved-techniques-for-training-gans
  • 2016: van den Oord et al: Conditional Image Generation with PixelCNN Decoders: http://papers.nips.cc/paper/6527-conditional-image-generation-with-pixelcnn-decoders
  • 2016: van den Oord et al: Pixel Recurrent Neural Networks: https://arxiv.org/abs/1601.06759

6. Nuts and Bolts of Applying Deep Learning (Andrew Ng)

Video Lecture

Ng talks about a typical, decent ML workflow, but notes one can do a better job at error analysis (i.e., understanding how model bias and model variance are affecting your results).

          *---------------------------------------------------------------*
          v                                                               |
[Training Error High?] ---YES---> [Bigger Model | Train Longer | New Model Architecture]
          |        ^
          NO        \_______
          |                 *------*
          V                         \
[Dev Set Error High?] ---YES---> [More Data | Regularization | New Model Architecture ]
          |
          NO
          |
          V
         DONE

In his version, the arrow out of the dev-set remedies looped back to the question of whether the dev set error is high; I redirected it to the question of whether the training error is high. That makes more sense, especially if you change the model architecture!

He then showed you can do better than this workflow, and this is where I learned a new trick. Basically, in addition to looking at the training and dev errors, you should also look at the error on a “training-dev” set, i.e., data held out from the training distribution (a short code sketch of this analysis follows the bullet list below). Hopefully this next graphic helps that make more sense:

Human Error Rate: 1%
Training Set Error: 10%        
Training-Dev Set Error:  10.1%
Dev Set Error: 10.1%
Test Set Error: 10.2%
  • An intuitive sense of model bias comes from comparing the human error rate to the training set error: here, model bias is large.
  • Model variance can be read off from the gap between the training and training-dev errors (or between the training and dev errors, as is usually done).
  • Ng frames training/dev data mismatch as the gap between the training-dev and dev errors.
  • Finally, the gap between the dev and test errors is a sign of overfitting the dev set.
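
As a quick sanity check on my own understanding, here is a small Python sketch of that decomposition. The function name and the dictionary labels are my own; the error figures are the ones from the example above, and the remedies in the comments come from the workflow charts.

# Hedged sketch of Ng's error decomposition (naming is mine, not Ng's).
def decompose_errors(human, train, train_dev, dev, test):
    return {
        "avoidable bias":      train - human,      # high => bigger model / train longer
        "variance":            train_dev - train,  # high => more data / regularization
        "train/dev mismatch":  dev - train_dev,    # high => more representative data / synthesis
        "dev-set overfitting": test - dev,         # high => bigger dev set
    }

# Using the numbers above: bias (~9%) dominates; everything else is negligible.
print(decompose_errors(human=0.01, train=0.10, train_dev=0.101, dev=0.101, test=0.102))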

I have to give some of this more thought… But anyway, this analysis then informs Ng’s updated ML workflow:

# NOTE: oftentimes, a change made anywhere in the workflow indicates to start back
#   at the beginning (especially if an architectural change is made).

          *---------------------------------------------------------------*
          v                                                               |
[Training Error High?] ---YES---> [Bigger Model | Train Longer | New Model Architecture]
          |        ^
          NO        \_______
          |                 *------*
          V                         \
[Train-Dev Set Error High?] ---YES---> [More Training Data | Regularization | New Model Architecture]
          |
          NO
          |
          V
[Dev Set Error High?] ---YES---> [More Training and Dev Data | Data Synthesis | New Model Architecture]
          |
          NO
          |
          V
[Test Set Error High?] ---YES---> [More Dev Data]
          |
          NO
          |
          V
         DONE
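
Just to make the flow concrete, here is the same decision chain as a hedged Python sketch. The ordering of the checks and the suggested remedies follow the chart above; the function name and the threshold value are my own inventions.

# Rough sketch of the updated workflow as code (threshold eps is arbitrary).
def next_action(human, train, train_dev, dev, test, eps=0.02):
    if train - human > eps:      # training error high
        return "bigger model / train longer / new model architecture"
    if train_dev - train > eps:  # train-dev error high
        return "more training data / regularization / new model architecture"
    if dev - train_dev > eps:    # dev error high
        return "more training and dev data / data synthesis / new model architecture"
    if test - dev > eps:         # test error high
        return "more dev data"
    return "done"

print(next_action(human=0.01, train=0.10, train_dev=0.101, dev=0.101, test=0.102))
# -> bigger model / train longer / new model architecture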

Ng finds that progress in an area tends to be rapid until the human error rate is surpassed, and then things get tricky. This is partly because there is a floor on how low the error can go, called the optimal error rate (or Bayes rate), and because humans are actually pretty good at many of the tasks we try to automate with ML/DL (i.e., the human error rate is often already fairly close to the optimal rate). Also, once you surpass the human error rate you run into some fundamental issues, e.g., are humans the ones labeling your data?

Ng brings up a toy medical example in which the typical human error rate is 3%, the typical doctor’s error rate is 1%, an expert doctor’s error rate is 0.7%, and the error rate of a team of expert doctors is 0.5%. The point is: which of these should you treat as the “human error rate” for this problem? Or, more importantly: what kind of data should you be using? Disregarding data collection costs and complications, the answer should be obvious: you want to train your model on labels from the team of expert doctors. This is one way to improve model performance: identify and/or insist upon high-quality data.

If you are looking to make rapid progress on something, Ng’s advice is to identify an area in which ML/DL has not yet surpassed the human error rate. This need not be something entirely new: for example, in speech recognition, performance on certain accents still needs improvement.

How to get good? Simple: read a lot of papers, replicate results, and get comfortable with all the dirty work. In other words, get serious, set expectations consistent with reality, and put in the time!

7. Deep Reinforcement Learning (John Schulman, OpenAI)

  • 1994: Jaakkola et al: Convergence of Stochastic Iterative Dynamic Programming Algorithms: http://papers.nips.cc/paper/764-convergence-of-stochastic-iterative-dynamic-programming-algorithms.pdf
  • 2002: Kakade: A Natural Policy Gradient: http://papers.nips.cc/paper/2073-a-natural-policy-gradient.pdf
  • 2003: Bagnell & Schneider: Covariant Policy Search: https://kilthub.cmu.edu/articles/Covariant_Policy_Search/6552458
  • 2005: Riedmiller: Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method: https://link.springer.com/content/pdf/10.1007/11564096_32.pdf
  • 2007: Powell: Approximate Dynamic Programming: Solving the Curse of Dimensionality: «could not find a shareable link, so a later review paper by the same author on the same topic is included below»
  • 2008: Peters et al: Natural Actor Critic: http://www.cs.cmu.edu/~nickr/nips_workshop/jpeters.abstract.pdf
  • 2009: Daume et al: Search-Based Structured Prediction: https://link.springer.com/content/pdf/10.1007%2Fs10994-009-5106-x.pdf
  • 2009: Powell: What you should know about Approximate Dynamic Programming: https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.150.1854&rep=rep1&type=pdf
  • 2010: Jie & Abbeel: On a Connection between Importance Sampling and the Likelihood Ratio Policy Gradient: http://papers.nips.cc/paper/3922-on-a-connection-between-importance-sampling-and-the-likelihood-ratio-policy-gradient
  • 2011: Ross et al: A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning: http://proceedings.mlr.press/v15/ross11a/ross11a.pdf
  • 2013: Mnih et al: Playing Atari with Deep Reinforcement Learning: https://arxiv.org/abs/1312.5602
  • 2014: Guo et al: Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning: http://papers.nips.cc/paper/5421-deep-learning-for-real-time-atari-game-play-using-offline-monte-carlo-tree-search-planning
  • 2014: Mnih et al: Recurrent Models of Visual Attention: http://papers.nips.cc/paper/5542-recurrent-models-of-visual-attention
  • 2014: Silver et al: Deterministic Policy Gradient Algorithms: http://proceedings.mlr.press/v32/silver14.pdf
  • 2015: Hausknecht & Stone: Deep Recurrent Q-Learning for Partially Observable MDPs: https://www.aaai.org/ocs/index.php/FSS/FSS15/paper/viewPaper/11673
  • 2015: Heess et al: Learning Continuous Control Policies by Stochastic Value Gradients: http://papers.nips.cc/paper/5796-learning-continuous-control-policies-by-stochastic-value-gradients
  • 2015: Ranzato et al: Sequence Level Training with Recurrent Neural Networks: https://arxiv.org/abs/1511.06732
  • 2015: Schaul et al: Prioritized Experience Replay: https://arxiv.org/abs/1511.05952
  • 2015: Schulman et al: High-Dimensional Continuous Control Using Generalized Advantage Estimation: https://arxiv.org/abs/1506.02438
  • 2015: Schulman et al: Trust Region Policy Optimization: http://proceedings.mlr.press/v37/schulman15.pdf
  • 2016: Levine et al: End-to-End Training of Deep Visuomotor Policies: http://www.jmlr.org/papers/volume17/15-522/15-522.pdf
  • 2016: Mnih et al: Asynchronous Methods for Deep Reinforcement Learning: http://proceedings.mlr.press/v48/mniha16.pdf
  • 2016: Silver et al: Mastering the game of Go with deep neural networks and tree search: http://web.iitd.ac.in/~sumeet/Silver16.pdf
  • 2016: van Hasselt et al: Deep Reinforcement Learning with Double Q-Learning: https://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/viewPaper/12389
  • 2016: Wang et al: Dueling Network Architectures for Deep Reinforcement Learning: https://arxiv.org/abs/1511.06581

8. Theano Tutorial

Honestly, I already use TensorFlow/Keras, so I don’t have any interest in learning Theano at the moment – SKIP!

9. Deep Learning for Speech Recognition (Adam Coates, Baidu)

11. Sequence to Sequence Deep Learning (Quoc Le, Google)


12. Foundations and Challenges of Deep Learning (Yoshua Bengio)