References

Reference materials

GKD+21

Amir Gholami, Sehoon Kim, Zhen Dong, Zhewei Yao, Michael W. Mahoney, and Kurt Keutzer. A survey of quantization methods for efficient neural network inference. 2021. arXiv:2103.13630.

Kri18

Raghuraman Krishnamoorthi. Quantizing deep convolutional networks for efficient inference: a whitepaper. 2018. arXiv:1806.08342.

WJZ+20

Hao Wu, Patrick Judd, Xiaojie Zhang, Mikhail Isaev, and Paulius Micikevicius. Integer quantization for deep learning inference: principles and empirical evaluation. 2020. arXiv:2004.09602.