webnn: Build infrastructure to support float16 data type
The original design was to serialize all operands to TFLite tensors in a
loop before serializing the operations. That approach doesn't work for
float16 models: only some operations (concat, cast and reshape) support
float16 input, the rest don't, and the operation kind isn't known yet at
the time the operands are serialized.

The current design serializes the input and output operands of an
operation while serializing the operation itself. For an input operand
(graph input, constant or intermediate operand), a TFLite dequantize
operator is inserted to convert fp16 to fp32 when the current operation
doesn't support float16 inference. For an output operand, a TFLite cast
operator is inserted to convert fp32 back to fp16 when the operand is a
graph output and the current operation doesn't support float16
inference.
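
Below is a minimal, self-contained sketch of the per-operation
serialization described above. All names (SerializeOperation,
EmitDequantizeFp16ToFp32, EmitCastFp32ToFp16, etc.) are hypothetical
stand-ins, not the actual converter API.

  // Hypothetical sketch of the per-operation serialization; not the
  // actual Chromium WebNN -> TFLite converter API.
  #include <cstddef>
  #include <cstdint>
  #include <string>
  #include <vector>

  enum class DataType { kFloat16, kFloat32 };

  struct Operand {
    DataType type = DataType::kFloat32;
    bool is_graph_output = false;
  };

  struct Operation {
    std::string kind;  // e.g. "concat", "cast", "reshape", "conv2d", ...
    std::vector<const Operand*> inputs;
    std::vector<const Operand*> outputs;
  };

  // Only a few operations run float16 natively in this sketch.
  bool SupportsFloat16Inference(const Operation& op) {
    return op.kind == "concat" || op.kind == "cast" || op.kind == "reshape";
  }

  // Stubs standing in for the code that emits TFLite tensors and
  // operators into the flatbuffer being built.
  int32_t SerializeTensor(const Operand&) {
    static int32_t next_index = 0;
    return next_index++;
  }
  void EmitDequantizeFp16ToFp32(int32_t /*tensor_index*/) {}
  void EmitCastFp32ToFp16(int32_t /*tensor_index*/) {}
  void EmitOperator(const Operation&, const std::vector<int32_t>&,
                    const std::vector<int32_t>&) {}

  // Serialize one operation together with its operands, inserting the
  // fp16 <-> fp32 conversions described above.
  void SerializeOperation(const Operation& op) {
    const bool fp16_ok = SupportsFloat16Inference(op);

    std::vector<int32_t> input_indices;
    for (const Operand* input : op.inputs) {
      const int32_t index = SerializeTensor(*input);
      if (input->type == DataType::kFloat16 && !fp16_ok) {
        // Dequantize fp16 graph inputs, constants and intermediates to fp32.
        EmitDequantizeFp16ToFp32(index);
      }
      input_indices.push_back(index);
    }

    std::vector<int32_t> output_indices;
    for (const Operand* output : op.outputs) {
      output_indices.push_back(SerializeTensor(*output));
    }

    EmitOperator(op, input_indices, output_indices);

    for (std::size_t i = 0; i < op.outputs.size(); ++i) {
      if (op.outputs[i]->is_graph_output &&
          op.outputs[i]->type == DataType::kFloat16 && !fp16_ok) {
        // Cast the fp32 result back to fp16 for graph outputs.
        EmitCastFp32ToFp16(output_indices[i]);
      }
    }
  }
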
Bug: 339654398
Change-Id: Icd03e7a94874bbb1bf915a7485509badf2e59026
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5807389
Reviewed-by: ningxin hu <ningxin.hu@intel.com>
Reviewed-by: Reilly Grant <reillyg@chromium.org>
Commit-Queue: Junwei Fu <junwei.fu@intel.com>
Cr-Commit-Position: refs/heads/main@{#1358484}