Quantization reduces the size of a model by storing its constant weight values as 8-bit integers instead of 32-bit floats, roughly a 4x reduction. In most cases the model does not suffer a significant loss of accuracy, but this has to be verified on a case-by-case basis.
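The 32-bit-to-8-bit mapping can be illustrated with a minimal sketch of affine (scale and zero-point) quantization, the scheme commonly used for 8-bit weights. The function names here are hypothetical; this is an illustration of the idea, not any particular framework's implementation:

```python
import numpy as np

def quantize_int8(weights):
    """Affine-quantize a float32 array to int8 using a scale/zero-point pair."""
    w_min, w_max = float(weights.min()), float(weights.max())
    # Extend the range to include 0 so that 0.0 maps exactly to an integer.
    w_min, w_max = min(w_min, 0.0), max(w_max, 0.0)
    scale = (w_max - w_min) / 255.0 or 1.0  # 255 = number of int8 steps
    zero_point = int(round(-128 - w_min / scale))
    q = np.clip(np.round(weights / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover an approximate float32 array from the quantized values."""
    return (q.astype(np.float32) - zero_point) * scale

w = np.linspace(-1.0, 1.0, 9).astype(np.float32)
q, scale, zp = quantize_int8(w)
w_restored = dequantize(q, scale, zp)
```

Each weight is stored in one byte instead of four, and dequantizing introduces a rounding error of at most half the scale, which is the source of the (usually small) accuracy loss mentioned above.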