Full Paper Content
1. Introduction
Deep convolutional neural networks have led to a series of breakthroughs for image classification. Many other visual recognition tasks have also greatly benefited from very deep models.
2. Deep Residual Learning
Let us consider H(x) as an underlying mapping to be fit by a few stacked layers, with x denoting the inputs to the first of these layers. If one hypothesizes that multiple nonlinear layers can asymptotically approximate complicated functions, then it is equivalent to hypothesize that they can asymptotically approximate the residual functions.
3. Network Architectures
We have tested various plain/residual nets, and observed consistent phenomena. To provide instances for discussion, we describe two models for ImageNet as follows.
4. Experiments
We evaluate our method on ImageNet 2012 classification dataset that consists of 1000 classes. The models are trained on the 1.28 million training images, and evaluated on the 50k validation images.
5. Conclusion
This work presented residual learning frameworks to ease the training of deep networks. Our methods are based on the idea of learning residual functions with reference to layer inputs.