Background
Ilya Sutskever's paper reading list: 30u30
Ilya Sutskever, a student of Geoffrey Hinton, is a co-founder and chief scientist of OpenAI.
Below is his recommended reading list; he believes that after working through it, you will understand 90% of what matters in the field of AI.
Main Content
Core Neural Network Innovations
- Recurrent Neural Network Regularization - A dropout-based enhancement to LSTM units for better overfitting prevention.
- Pointer Networks - A novel architecture for solving problems whose outputs are discrete tokens drawn from the input sequence.
- Deep Residual Learning for Image Recognition - Makes very deep networks trainable through residual learning.
- Identity Mappings in Deep Residual Networks - Improves deep residual networks through identity skip connections.
- Neural Turing Machines - Couples neural networks with external memory resources to tackle algorithmic tasks.
- Attention Is All You Need - Introduces the Transformer architecture, built solely on attention mechanisms.
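To make the last entry concrete: the core operation of the Transformer is scaled dot-product attention. The following is a minimal illustrative sketch in NumPy (not code from the paper itself); the shapes and the toy inputs are my own assumptions.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention as described in "Attention Is All You Need".

    Q, K: (seq_len, d_k); V: (seq_len, d_v).
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V                               # weighted sum of values

# Toy example: 3 positions, dimension 4 (hypothetical sizes)
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4)
```

A full Transformer layer adds learned projections, multiple heads, residual connections, and feed-forward sublayers on top of this primitive.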
Specialized Neural Network Applications
- Multi-Scale Context Aggregation by Dilated Convolutions - A convolutional network module for better semantic segmentation.
- Neural Machine Translation by Jointly Learning to Align and Translate - A model that improves translation by learning to align and translate jointly.
- Neural Message Passing for Quantum Chemistry - A framework for learning on molecular graphs for quantum chemistry.
- Relational RNNs - Extends standard memory architectures with relational reasoning capabilities.
- Deep Speech 2: End-to-End Speech Recognition in English and Mandarin - An end-to-end deep learning system for speech recognition.
- ImageNet Classification with Deep CNNs - A convolutional neural network for classifying large-scale image data.
- Variational Lossy Autoencoder - Combines VAEs with autoregressive models for improved image synthesis.
- A Simple NN Module for Relational Reasoning - A neural module designed to improve relational reasoning in AI tasks.
Theoretical Insights and Principled Approaches
- Order Matters: Sequence to sequence for sets - Investigates the impact of data ordering on model performance.
- Scaling Laws for Neural LMs - An empirical study of how language model performance scales with model size, data, and compute.
- A Tutorial Introduction to the Minimum Description Length Principle - A tutorial on the MDL principle in model selection and inference.
- Keeping Neural Networks Simple by Minimizing the Description Length of the Weights - A method to improve neural network generalization by minimizing the description length of the weights.
- Machine Super Intelligence (dissertation) - A study of the optimal behavior of agents in computable environments.
- Kolmogorov Complexity (page 434 onwards) - A comprehensive exploration of Kolmogorov complexity, discussing its mathematical foundations and its implications for fields like information theory and computational complexity.
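The scaling-laws paper models loss as a power law in model size, L(N) = a * N^(-alpha), fit in log-log space. Here is a small illustrative sketch of such a fit on synthetic numbers of my own invention (not data from the paper):

```python
import math

# Hypothetical (model_size, loss) pairs constructed to follow an exact power law
# L(N) = a * N^(-alpha), the functional form studied in the scaling-laws paper.
# Each 10x increase in N multiplies the loss by 0.8, so alpha = -ln(0.8)/ln(10) ~ 0.097.
data = [(1e6, 4.0), (1e7, 3.2), (1e8, 2.56), (1e9, 2.048)]

# Ordinary least squares in log-log space: log L = log a - alpha * log N
xs = [math.log(n) for n, _ in data]
ys = [math.log(l) for _, l in data]
x_mean = sum(xs) / len(xs)
y_mean = sum(ys) / len(ys)
slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / \
        sum((x - x_mean) ** 2 for x in xs)
alpha = -slope
a = math.exp(y_mean - slope * x_mean)
print(f"alpha = {alpha:.3f}")
```

The actual paper fits jointly over model size, dataset size, and compute; this sketch only shows the one-variable case.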
Interdisciplinary and Conceptual Studies
- Quantifying the Rise and Fall of Complexity in Closed Systems: The Coffee Automaton - A study of complexity in closed systems using cellular automata.
Efficiency and Scalability Techniques
- GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism - A method for efficiently training large-scale neural networks.
Textbooks and Tutorials
- CS231n: Convolutional Neural Networks for Visual Recognition - Stanford University's course on CNNs for visual recognition.
- The Annotated Transformer - An annotated, line-by-line implementation of the Transformer paper, with accompanying code.
- The First Law of Complexodynamics - A blog post discussing how to measure system complexity in computational terms.
- The Unreasonable Effectiveness of RNNs - A blog post demonstrating the versatility of RNNs.
- Understanding LSTM Networks - A blog post providing a detailed explanation of LSTM networks.
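The gate equations walked through in "Understanding LSTM Networks" fit in a few lines. Below is a minimal single-step sketch in NumPy (my own illustrative code, not the blog post's; the weight layout and toy dimensions are assumptions):

```python
import numpy as np

def lstm_step(x, h_prev, c_prev, W, b):
    """One step of a vanilla LSTM cell, following the standard gate equations.

    W stacks all four gates: shape (4*hidden, input_dim + hidden).
    """
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b       # all four gate pre-activations at once
    i = 1 / (1 + np.exp(-z[:hidden]))             # input gate
    f = 1 / (1 + np.exp(-z[hidden:2*hidden]))     # forget gate
    o = 1 / (1 + np.exp(-z[2*hidden:3*hidden]))   # output gate
    g = np.tanh(z[3*hidden:])                     # candidate cell state
    c = f * c_prev + i * g                        # new cell state: forget old, add new
    h = o * np.tanh(c)                            # new hidden state
    return h, c

# Toy run over a 4-step random sequence (hypothetical sizes)
rng = np.random.default_rng(0)
input_dim, hidden = 3, 5
W = rng.standard_normal((4 * hidden, input_dim + hidden)) * 0.1
b = np.zeros(4 * hidden)
h, c = np.zeros(hidden), np.zeros(hidden)
for x in rng.standard_normal((4, input_dim)):
    h, c = lstm_step(x, h, c, W, b)
print(h.shape)  # (5,)
```

Because h = o * tanh(c) with o in (0, 1), each hidden-state component stays bounded in (-1, 1), which is part of what makes LSTMs stable over long sequences.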