Network in Network Notes

M. Lin et al., Network In Network, 2013

Conventional convolution layer

  • Input --> Linear filter --> Feature map (sketched in code after this list)
  • Linear filter
    • Generalized linear model (GLM)
      • Low level of abstraction
      • Abstraction = invariance to variants of the same concept
        • => Can the GLM be replaced with a more non-linear function approximator?
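
A minimal sketch of this conventional pipeline in PyTorch (channel counts and kernel size are illustrative, not the paper's):

```python
import torch
import torch.nn as nn

# Conventional convolution layer: a linear filter (the GLM) applied to each
# local patch, followed by a pointwise nonlinearity (ReLU here).
linear_filter = nn.Conv2d(in_channels=3, out_channels=16,
                          kernel_size=5, padding=2)

x = torch.randn(1, 3, 32, 32)                # a batch of input images
feature_map = torch.relu(linear_filter(x))   # Input -> linear filter -> feature map
print(feature_map.shape)                     # torch.Size([1, 16, 32, 32])
```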

Linearity

  • H(x) = y
    • Homogeneity: H(kx) = ky
    • Additivity: H(x_1 + x_2) = H(x_1) + H(x_2) = y_1 + y_2
    • Both properties are checked in the sketch after this list
  • Linearly separable
    • Points in n dimensions can be separated by an (n-1)-dimensional hyperplane
  • Conventional CNNs compensate for the GLM's low abstraction by increasing the feature dimension
    • i.e., by using an "over-complete set of filters" to cover all variants of a concept
    • This puts an extra burden on the next layer
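
A quick numerical check of the two linearity properties, using a bias-free convolution as H (a toy sketch; shapes are arbitrary):

```python
import torch
import torch.nn.functional as F

w = torch.randn(8, 3, 5, 5)      # filter bank; no bias, so H is purely linear
x1 = torch.randn(1, 3, 32, 32)
x2 = torch.randn(1, 3, 32, 32)

def H(x):
    return F.conv2d(x, w)        # convolution without bias or activation

# Homogeneity: H(kx) = k * H(x)
print(torch.allclose(H(3.0 * x1), 3.0 * H(x1), atol=1e-5))   # True

# Additivity: H(x1 + x2) = H(x1) + H(x2)
print(torch.allclose(H(x1 + x2), H(x1) + H(x2), atol=1e-5))  # True
```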

MLP convolution layers

  • Add a "micro network" (an MLP) after each convolution
  • Why an MLP?
    • An MLP is trained with back-propagation, like the rest of the network
    • An MLP can be a deep model itself
    • -> Acts as cross feature map (cross-channel) parametric pooling
    • -> Equivalent to a stack of 1x1 convolution layers (see the sketch after this list)
  • Comparison to maxout layers
    • An MLP can model any function (maxout: only convex functions)
    • i.e., the MLP is a universal function approximator
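
A minimal mlpconv block in PyTorch: one spatial convolution followed by 1x1 convolutions, which apply the same micro MLP at every location (channel counts here are illustrative, not the paper's):

```python
import torch
import torch.nn as nn

# mlpconv block: a 5x5 linear filter followed by a per-pixel two-layer MLP,
# implemented as 1x1 convolutions (cross feature map parametric pooling).
mlpconv = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=5, padding=2), nn.ReLU(),
    nn.Conv2d(96, 96, kernel_size=1), nn.ReLU(),   # micro-MLP layer 1
    nn.Conv2d(96, 96, kernel_size=1), nn.ReLU(),   # micro-MLP layer 2
)

x = torch.randn(1, 3, 32, 32)
print(mlpconv(x).shape)   # torch.Size([1, 96, 32, 32])
```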

Global Average Pooling

  • Fully connected classifier layers are prone to overfitting
  • Global average pooling has no parameters, so nothing in it can overfit
  • Each feature map is averaged to a single value and the resulting vector is fed directly to softmax (sketched below)
    • This enforces a correspondence between feature maps and categories
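
A sketch of global average pooling feeding softmax directly (the class count is illustrative):

```python
import torch

num_classes = 10
feature_maps = torch.randn(1, num_classes, 8, 8)  # last mlpconv: one map per class

# Global average pooling: average each map over its spatial dimensions.
# There are no weights here, so this layer cannot overfit.
logits = feature_maps.mean(dim=(2, 3))            # shape (1, num_classes)
probs = torch.softmax(logits, dim=1)
print(probs.shape)                                # torch.Size([1, 10])
```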

Layer calculations

  • Linear convolution layer
    • f_{i,j,k} = max(w_k^T x_{i,j}, 0)
    • (x_{i,j}: input patch centered at pixel (i, j); k: feature map channel index)
  • MLPconv layer (layer 1 takes the patch x_{i,j}; each later layer takes the previous layer's output)
    • f^1_{i,j,k_1} = max((w^1_{k_1})^T x_{i,j} + b_{k_1}, 0)
    • ...
    • f^n_{i,j,k_n} = max((w^n_{k_n})^T f^{n-1}_{i,j} + b_{k_n}, 0)
    • This recursion is checked against 1x1 convolutions in the sketch below
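
The MLPconv recursion is the same computation as a stack of 1x1 convolutions: at each pixel (i, j), the same small MLP is applied to the channel vector. A toy check (all shapes and weights here are arbitrary):

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 8, 8)                           # x_{i,j}: 3-channel vector at (i, j)
w1 = torch.randn(16, 3, 1, 1); b1 = torch.randn(16)   # micro-MLP layer 1 weights
w2 = torch.randn(10, 16, 1, 1); b2 = torch.randn(10)  # micro-MLP layer 2 weights

# mlpconv computed as 1x1 convolutions over the whole feature map
f1 = F.relu(F.conv2d(x, w1, b1))
f2 = F.relu(F.conv2d(f1, w2, b2))

# the same values computed as a per-pixel MLP at one location (i, j) = (4, 5)
v = x[0, :, 4, 5]                                     # x_{i,j}
h1 = F.relu(w1.view(16, 3) @ v + b1)                  # f^1_{i,j}
h2 = F.relu(w2.view(10, 16) @ h1 + b2)                # f^2_{i,j}
print(torch.allclose(f2[0, :, 4, 5], h2, atol=1e-5))  # True
```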

Reference: https://arxiv.org/pdf/1312.4400.pdf