INQ (Incremental Network Quantization)
Summary of "Incremental Network Qunatization: Towards Lossless CNNs with Low-Precision Weights"
1. Problem
- Deep CNNs place heavy burdens on memory and other computational resources
- Reducing these burdens typically causes performance loss (accuracy loss)
2. Solution
- INQ (Incremental Network Quantization): Converts "Any Pre-Trained Full-Precision CNN" into a "Low-Precision Version" (weights are constrained to be either powers of two or zero)
3. How
- Three interdependent operations:
1) Weight Partition
- Divide the weights in each layer of a pre-trained CNN model into two disjoint groups
- Weights in the first group
+ Responsible for forming a low-precision base
+ Thus, Quantized by a variable-length encoding method
- Weights in the second group
+ Responsible for compensating for the accuracy loss caused by quantization
+ Thus, Retrained (a code sketch of this partition-and-quantize step follows below)
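A minimal sketch of what one such step could look like in NumPy. The function name `partition_and_quantize`, the magnitude-based (pruning-inspired) split, and the nearest-candidate rounding are illustrative assumptions; the paper's exact rounding rule and its alternative (random) partition strategy differ in detail.

```python
import numpy as np

def partition_and_quantize(weights, quantize_ratio, n1, n2):
    """Hypothetical sketch of one INQ step for a single layer.

    The largest-magnitude fraction `quantize_ratio` of the weights is snapped
    to the set P = {0, +/-2^n2, ..., +/-2^n1}; the rest stays full precision.
    Returns the new weights and a binary mask T (1 = trainable, 0 = frozen).
    """
    flat_mag = np.abs(weights).ravel()
    k = max(1, int(np.ceil(quantize_ratio * flat_mag.size)))
    threshold = np.partition(flat_mag, -k)[-k]        # k-th largest magnitude

    # Candidate power-of-two values plus zero, assuming n2 <= n1.
    exps = np.arange(n2, n1 + 1)
    candidates = np.concatenate(([0.0], 2.0 ** exps, -(2.0 ** exps)))

    quantized = weights.copy()
    mask = np.ones_like(weights)                      # 1 -> still full precision
    group1 = np.abs(weights) >= threshold             # group to quantize now
    # Snap each selected weight to its nearest candidate (an approximation
    # of the paper's rounding rule).
    nearest = np.argmin(np.abs(weights[group1][:, None] - candidates[None, :]), axis=1)
    quantized[group1] = candidates[nearest]
    mask[group1] = 0.0                                # freeze quantized weights
    return quantized, mask
```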
2) Group-Wise Quantization
- The above two operations can be formulated as the following joint optimization problem:
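The equation itself is missing here (it was presumably an image in the original post); reconstructed from the INQ paper, it is roughly the constrained loss below, where $E$ is the network loss plus a regularizer, $T_l$ is a binary mask over layer $l$ (0 = quantized and frozen, 1 = re-trainable), and $\mathbf{P}_l$ is the admissible set of powers of two and zero:

$$
\min_{\mathbf{W}_l} \; E(\mathbf{W}_l) = L(\mathbf{W}_l) + \lambda R(\mathbf{W}_l)
\quad \text{s.t.} \quad W_l(i,j) \in \mathbf{P}_l \ \ \text{if} \ \ T_l(i,j) = 0, \quad 1 \le l \le L
$$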
- By employing the SGD (Stochastic Gradient Descent) method, the update scheme is:
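Again reconstructing the missing formula from the paper, the masked update for layer $l$ is roughly the following, where $\gamma$ is the learning rate and the factor $T_l(i,j)$ zeroes the update for already-quantized weights:

$$
W_l(i,j) \leftarrow W_l(i,j) - \gamma \, \frac{\partial E}{\partial W_l(i,j)} \, T_l(i,j)
$$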
3) Re-Training
- The above operations are repeated until all weights are converted into low-precision ones
- Overall Procedure of INQ
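A rough end-to-end sketch of the incremental loop, reusing the hypothetical `partition_and_quantize` helper above; the accumulated-portion schedule and the `retrain` callback (which must multiply each layer's gradient by its mask, as in the update rule above) are assumptions for illustration:

```python
def incremental_network_quantization(weights_by_layer, n1, n2, retrain,
                                     schedule=(0.5, 0.75, 0.875, 1.0)):
    """Quantize an ever-growing portion of each layer's weights, re-training
    the remaining full-precision weights after every step."""
    masks = {}
    for portion in schedule:            # accumulated fraction of quantized weights
        for name, w in weights_by_layer.items():
            q, t = partition_and_quantize(w, portion, n1, n2)
            weights_by_layer[name] = q
            masks[name] = t
        # Re-train only where mask == 1 to compensate for the accuracy loss
        # introduced by this quantization step.
        retrain(weights_by_layer, masks)
    return weights_by_layer             # all weights low-precision after the last step
```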
4. Performance
- 5-bit quantization achieves higher accuracy than the 32-bit floating-point reference
+ Variable-Length Encoding
+ 1 bit for representing the zero value
+ 4 bits for at most 16 different power-of-two values
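A small illustration of this bit budget. The choice of the top exponent here follows what I recall as the paper's rule, $n_1 = \lfloor \log_2(4s/3) \rfloor$ with $s$ the largest absolute weight; treat the helper and its name as a sketch rather than the exact procedure:

```python
import numpy as np

def candidate_values(weights, bits=5):
    """Enumerate the values a `bits`-bit code can hold under the variable-length
    encoding: 1 bit marks zero, the remaining bits index the signed powers of two."""
    n_exponents = 2 ** (bits - 1) // 2                 # 8 exponents for 5 bits
    s = np.abs(weights).max()
    n1 = int(np.floor(np.log2(4 * s / 3)))             # top exponent
    n2 = n1 - n_exponents + 1                          # lowest exponent
    exps = np.arange(n2, n1 + 1)
    return np.concatenate(([0.0], 2.0 ** exps, -(2.0 ** exps)))

# Example: 5 bits -> zero plus 16 signed power-of-two values.
print(len(candidate_values(np.array([0.3, -0.07, 0.9]), bits=5)))  # 17
```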
- Combination of network pruning and INQ shows impressive results