TensorFlow Serving using TensorFlow 2.5.0-rc2
Major Features and Improvements
- Upgrade to CUDA 11.2. (commit: 1975e3e)
- Update TF Text to v2.4.3. (commit: ccfb606)
- Experimental support for XLA/CPU models. (commit: 3c1b2b3)
- Add metrics to the Prometheus API (commit: dfb41f1)
- Support URL reserved characters for REST API (#1726) (commit: dd9c467)
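
For the REST API item above, a client would typically percent-encode reserved characters before placing a model name in the request path. The following is a minimal sketch, assuming the REST path form `/v1/models/<name>:predict`; the host, port, model name, and the `predict_url` helper are hypothetical, and these notes do not state which characters the server accepts.

```python
from urllib.parse import quote, unquote


def predict_url(host: str, model_name: str) -> str:
    """Build a predict URL, percent-encoding the model name.

    safe="" makes quote() encode every reserved character,
    including "/", which it would otherwise leave untouched.
    """
    encoded = quote(model_name, safe="")
    return f"http://{host}/v1/models/{encoded}:predict"


# Hypothetical model name containing a space and a slash.
url = predict_url("localhost:8501", "my model/v2")
print(url)

# Decoding the path segment recovers the original name.
segment = url.split("/v1/models/")[1].rsplit(":predict", 1)[0]
print(unquote(segment))
```

Encoding with `safe=""` keeps a slash inside the name from being read as a path separator.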
Breaking Changes
- No breaking changes
Bug Fixes and Other Changes
- Fix typo in REQUIRED_PACKAGES for grpcio (commit: b9ed0f8)
- update resnet_k8s.yaml file (commit: e7b7b33)
- Fix a compile warning thrown by gcc-9 (commit: 38a017d)
- Update json_tensor.cc (commit: a0a9d14)
- Add TfLiteInterpreterPool to make concurrent use of TfliteSession better (commit: d9efa43)
- Enable download of TF Serving sources at arbitrary commit for CPU docker image. (commit: de1ab9e)
- Updated tests to newer API (commit: 30dd2fe)
- When GRPC messages come to the TF Serving GRPC server, the server will create a new thread to handle each message. (commit: ac0eb73)
- Add dedicated aliases field to ModelServerConfig. (commit: 358f7d1)
- Update docker command line to work with GPUs (Fixes #1768). (commit: b41a28b)
- option to disable grpc over http (commit: f087290)
- use RecordRuntimeLatency (commit: d353931)
- Improve error message for files that do not exist. (commit: 78d47f7)
- Transition TensorFlow Serving to TensorFlow's new WORKSPACE protocol. (commit: 50a7ef3)
- fix http_rest_api_handler_test (commit: 53da4f8)
- add status label in request_latency (commit: 0483edb)
- fix timing of request_latency (commit: e20c91e)
- fix http_rest_api_handler_test (commit: 6254411)
- Clarifying object values in REST requests to include B64 encoding and similar key:value pair objects. (commit: 0536678)
- Register custom TfLite ParseExample and add benchmark (commit: 20fe3ca)
- Update docker.md (commit: 7ebcd15)
- Pre-allocate memory for certain vectors where the size is known. (commit: e208b6e)
- Updating serving_basic to adjust the serving_basic.md file and bring it up to date with TF2.x - including: (commit: cea306a)
- Fixing MKL builds due to missing 'build_with_openmp' option (commit: 0ed23df)
- Implement batch parallelism for tflite sessions (commit: fec1d5d)
- Use LOG_EVERY_N_SEC instead of LOG with local static time variable. (commit: c6b3936)
- Fix TensorFlow Serving build with MKL+OpenMP (commit: ddad074)
- add cors headers (commit: d65914b)
- Remove extra build options for TF Serving (commit: 63acc95)
- Remove hashtable custom op dependencies (commit: bb51722)
- allow http OPTIONS request and add default OPTIONS request handling (commit: 6287cb4)
- Enable aspired version which failed to load to attempt reload. (commit: 2530a33)
- Fixed a compilation error in aspired_versions_manager.cc (commit: 4ca9a4b)
- handle options requests and check if Origin header exists (commit: c512141)
- Add "_r" root event annotation to ProcessBatch events. (commit: e5c3aec)
- Bump minimum bazel version to 3.7.2. (commit: 5edcd13)
- Update TensorFlow serving documentation with instructions to bind a host volume. The profiler in the docker will write to this volume, and on the host side TensorBoard will load the profile from the location. (commit: 7c21b22)
- Fix package build due to config move in: (commit: 18dd766)
- Fix memory leak from allocating input tensors (commit: 2f9b6a0)
- Allowing lossy floating point conversions for JSON inputs since JSON has only a single numeric type. This is in line with textproto conversion (and even C++ implicit double -> float conversion). (commit: 57dac6c)
- Adding enable_profiler command line flag. (commit: 7e8720d)
- Add logging in ServerCore. (commit: 623da67)
- Fix zlib. (commit: bc24390)
- Fix broken GPU build by adding TF cuda options: (commit: 05377a9)
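
The JSON conversion item above (commit: 57dac6c) refers to narrowing a double-precision JSON number to a 32-bit float. As an illustration only, the sketch below shows what such a narrowing loses, using Python's `struct` module to round-trip a value through IEEE 754 single precision. The `to_float32` helper is hypothetical and is not part of TF Serving.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


value = 0.1                    # parsed from JSON as a double
narrowed = to_float32(value)   # what a float32 tensor element would hold

# 0.1 has no exact binary representation, so narrowing changes it slightly.
print(narrowed == value)
print(abs(narrowed - value))

# Values exactly representable in float32 survive unchanged.
print(to_float32(0.5) == 0.5)
```

The difference is small, but a strict converter would have to reject such inputs; allowing the lossy conversion accepts them.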
Thanks to our Contributors
This release contains contributions from many people at Google, as well as:
Abhinav Pundir, Abolfazl Shahbazi, Aurélien Geron, Bairen Yi, gbaned, handong, Hao Ziyu, Junqin Zhang, kiddos, Oliver Sampson, OniB, Runzhen Wang, skawasak, zou000