A question about quantizing TensorFlow Lite models

tfers-migration · March 31, 2020, 3:52pm

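TensorFlow already provides a tool for quantizing models, graph_transform. Quantizing floats to 8-bit integers shrinks the model to roughly 1/4 of its original size, which helps keep the APK size under control when deploying on mobile.
TensorFlow Lite, however, has no tool that quantizes a model directly. The officially recommended approach is to call create_training_graph() and create_eval_graph() during training to insert FakeQuant nodes into the Graph (see here), and then have toco use the information recorded in those FakeQuant nodes to quantize the model while converting the pb file to a tflite file. Every time, I fail at that last step, converting the pb to tflite, with the error "xxx is lacking min/max data, which is necessary for quantization". Has anyone on the forum successfully quantized a TensorFlow Lite model? If so, please share your experience.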
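To make the role of the FakeQuant nodes concrete, here is a minimal pure-Python sketch of the idea: record the min/max of a tensor during training, then round values onto an 8-bit grid inside that range and map them back to float. This is not the TensorFlow implementation; `fake_quant` and `RangeTracker` are hypothetical names, the activations are made up, and real implementations typically also adjust the range so that 0.0 is exactly representable, which is omitted here.

```python
# Illustrative sketch only; not the TensorFlow FakeQuant implementation.

def fake_quant(x, range_min, range_max, num_bits=8):
    """Quantize x onto a num_bits grid over [range_min, range_max], then dequantize."""
    levels = 2 ** num_bits - 1                  # 255 steps for 8 bits
    scale = (range_max - range_min) / levels    # float width of one step
    clamped = min(max(x, range_min), range_max)
    q = round((clamped - range_min) / scale)    # integer in [0, levels]
    return range_min + q * scale                # back to float


class RangeTracker:
    """Hypothetical helper: remembers the min/max values seen so far."""

    def __init__(self):
        self.min = float("inf")
        self.max = float("-inf")

    def observe(self, values):
        self.min = min(self.min, min(values))
        self.max = max(self.max, max(values))


# Made-up activations standing in for one tensor observed during training.
activations = [-0.8, 0.1, 0.45, 2.3]

tracker = RangeTracker()
tracker.observe(activations)

# The recorded range is the "min/max data" the converter needs later.
simulated = [fake_quant(v, tracker.min, tracker.max) for v in activations]
step = (tracker.max - tracker.min) / 255
```

If a tensor has no recorded range, the converter has nothing to derive the 8-bit grid from, which is what the "lacking min/max data" error is complaining about.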

Asked by 程志超, 2018-6-1 17:06:12

tfers-migration · March 31, 2020, 3:52pm

Have a look at the API at www.tensorfly.cn

Posted by ZMikkelsen, 2018-6-1 21:41:24

tfers-migration · March 31, 2020, 3:53pm

tensorflow.google.cn has the latest version.

舟 3332, 2018-6-1 22:38

tfers-migration · March 31, 2020, 3:54pm

Do you perhaps need to pass an extra parameter when converting?

Posted by 舟 3332, 2018-6-8 21:39:33

tfers-migration · March 31, 2020, 3:56pm

Pass the following parameters when converting:

--default_ranges_min=-1.0 --default_ranges_max=1.0
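As a rough illustration of what a default range of [-1.0, 1.0] implies, the sketch below derives the scale and zero point of an 8-bit affine mapping from a (min, max) pair. It is plain Python and does not call toco; `quant_params`, `quantize`, and `dequantize` are hypothetical helper names, and the unsigned 0..255 grid is an assumption made for the example.

```python
def quant_params(range_min, range_max, num_bits=8):
    """Return (scale, zero_point) for an unsigned affine mapping over the range."""
    qmax = 2 ** num_bits - 1                     # 255 for 8 bits
    scale = (range_max - range_min) / qmax       # float value of one integer step
    zero_point = round(-range_min / scale)       # integer that represents 0.0
    return scale, zero_point


def quantize(x, scale, zero_point, num_bits=8):
    """Map a float to an integer in [0, 2**num_bits - 1], saturating at the ends."""
    q = round(x / scale) + zero_point
    return min(max(q, 0), 2 ** num_bits - 1)


def dequantize(q, scale, zero_point):
    """Map a quantized integer back to its float value."""
    return (q - zero_point) * scale


# The range from the flags above, applied to every array without its own range.
scale, zero_point = quant_params(-1.0, 1.0)

inside = dequantize(quantize(0.5, scale, zero_point), scale, zero_point)
outside = dequantize(quantize(5.0, scale, zero_point), scale, zero_point)
```

Values inside the range survive the round trip to within about one step, while anything outside it saturates near the range boundary.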

Posted by Yanbo, 2018-7-3 10:48:47

tfers-migration · March 31, 2020, 3:57pm

Pass the following parameters when converting:

--default_ranges_min=-1.0 --default_ranges_max=1.0

Posted by 衣农, 2018-7-3 11:03:56

tfers-migration · March 31, 2020, 3:59pm

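Have you managed to quantize it by now? Also, where did you learn that quantizing a lite file requires adding FakeQuant nodes with create_training_graph() and create_eval_graph()? The other docs I've read don't seem to say that.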

Posted by 快到碗里来, 2018-10-16

tfers-migration · March 31, 2020, 3:59pm

Post-training quantization seems to be supported now as well. Worth a try.

Posted by 舟 3332, 2018-10-17 00:33:32

tfers-migration · March 31, 2020, 4:00pm

This method isn't real quantization. Inference still uses the float kernels. Link: https://www.tensorflow.org/performance/post_training_quantization
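A small sketch of the distinction being drawn here, under the assumption that weights are stored as 8-bit integers with one float scale and converted back to float before the arithmetic runs. This is plain Python, not TensorFlow Lite code; the function names, the symmetric int8 scheme, and the weights are all made up for illustration.

```python
def quantize_weights(weights):
    """Store weights as integers in [-127, 127] plus a single float scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize_weights(q_weights, scale):
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in q_weights]


def dense_float(inputs, weights):
    """A float dot product, standing in for a 'float kernel'."""
    return sum(x * w for x, w in zip(inputs, weights))


weights = [0.52, -1.27, 0.03, 0.88]              # made-up float weights
q_weights, scale = quantize_weights(weights)     # what would be stored on disk

# At inference the integers are turned back into floats, and the math is float.
restored = dequantize_weights(q_weights, scale)
output = dense_float([1.0, 2.0, 3.0, 4.0], restored)
reference = dense_float([1.0, 2.0, 3.0, 4.0], weights)
```

Under this scheme the file gets smaller because the weights are stored in 8 bits, but the computation itself does not become integer arithmetic.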

Zongjun, 2018-10-27 01:08

tfers-migration · March 31, 2020, 4:01pm

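I'm working on this too and have a similar problem. I've asked around everywhere and haven't found an answer... After training, when I convert the model, it says it doesn't know the data type of FakeQuantWithMinMaxVars. I suspect my code is missing something somewhere. Does anyone know what causes this? Could the original poster share the training code for reference? I don't know how to add the eval part... Many thanks!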

Posted by 九幽, 2018-11-14 11:40:48

tfers-migration · March 31, 2020, 4:01pm

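I've run into the original poster's problem too. If you open your .tflite and .pb files in Netron, you'll see that fake quantization nodes are missing in some places where they should have been inserted. That's because create_training_graph does sometimes miss ops that need a fake quantization node. The possible fixes are: remove those ops; replace them with other ops; insert fake quantization nodes manually; or use default ranges. The error message itself mentions the last option:
lacking min/max data, which is necessary for quantization. Either target a non-quantized output format, or change the input graph to contain min/max information, or pass --default_ranges_min= and --default_ranges_max= if you do not care about the accuracy of results.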

But note the documentation for these flags: --default_ranges_min, --default_ranges_max. Type: floating-point. Default value for the (min, max) range values used for all arrays without a specified range. Allows user to proceed with quantization of non-quantized or incorrectly-quantized input files. These flags produce models with low accuracy. They are intended for easy experimentation with quantization via "dummy quantization".
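To see why a dummy range costs accuracy, here is a minimal sketch comparing the round-trip error of 8-bit quantization with a tensor's real range against a blanket range of [-1.0, 1.0]. It is plain Python with made-up values; `quantize_dequantize` is a hypothetical helper, not part of any converter.

```python
def quantize_dequantize(x, range_min, range_max, num_bits=8):
    """Round x onto a num_bits grid over the given range and map it back to float."""
    levels = 2 ** num_bits - 1
    scale = (range_max - range_min) / levels
    clamped = min(max(x, range_min), range_max)   # out-of-range values saturate
    return range_min + round((clamped - range_min) / scale) * scale


# Made-up activations whose true range is [-3.0, 4.0].
values = [-3.0, -0.5, 0.25, 4.0]

with_true_range = [quantize_dequantize(v, min(values), max(values)) for v in values]
with_dummy_range = [quantize_dequantize(v, -1.0, 1.0) for v in values]

# Worst-case absolute error under each choice of range.
err_true = max(abs(v - q) for v, q in zip(values, with_true_range))
err_dummy = max(abs(v - q) for v, q in zip(values, with_dummy_range))
```

With the true range the error stays within half a quantization step. With the dummy range, every value outside [-1.0, 1.0] is clipped to the boundary, so the error grows with how far the real values stray from the assumed range.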

Posted by Zongjun, 2018-11-15 07:11:11