What’s new in TensorFlow 2.21

Google has officially launched LiteRT, the successor to TFLite, which offers significantly faster GPU and NPU acceleration alongside seamless support for PyTorch and JAX. The update also introduces lower-precision data type support for increased efficiency and a commitment to more frequent security and dependency updates across the TensorFlow ecosystem. This transition solidifies LiteRT as Google’s primary high-performance framework for deploying GenAI and advanced on-device inference.

TensorFlow 2.21 has been released! You can find a complete list of all changes in the full release notes on GitHub.

What’s new in the LiteRT stack?

At Google I/O ‘25, we shared a preview of the evolution to LiteRT: a high-performance runtime designed specifically for advanced hardware acceleration. Today, we are excited to announce that these advanced acceleration capabilities have fully graduated into the LiteRT production stack, available now for all developers.

This milestone solidifies LiteRT as the universal on-device inference framework for the AI era, a significant leap over TFLite. Compared with TFLite, LiteRT is:

  • Faster: delivers 1.4x faster GPU performance than TFLite, and introduces new, state-of-the-art NPU acceleration.
  • Simpler: provides a unified, streamlined workflow for GPU and NPU acceleration across edge platforms.
  • Powerful: supports superior cross-platform GenAI deployment for popular open models like Gemma.
  • Flexible: offers first-class PyTorch/JAX support via seamless model conversion.

All of this is delivered while maintaining the same reliable, cross-platform deployment you've trusted since TFLite.

Read the full announcement and get started.

tf.lite

  • Several operators now support lower-precision data types for better performance and efficiency: the SQRT operator gains int8 and int16x8 support, and comparison operators gain int16x8 support.
  • Support for smaller data types has been extended across multiple operators: tfl.cast now supports conversions involving INT2 and INT4, tfl.slice adds INT4 support, and tfl.fully_connected adds INT2 support.
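To make these lower-precision types concrete, here is a minimal, dependency-free sketch of the symmetric quantization scheme behind formats like int16x8 (int16 activations, int8 weights, zero-point fixed at 0). The `quantize`, `dequantize`, and `max_error` helpers are purely illustrative, not part of the TFLite API:

```python
def quantize(values, bits):
    """Symmetric quantization: map floats onto signed ints of `bits` width.

    Returns (quantized ints, scale). The zero-point is fixed at 0, as in
    symmetric schemes such as int16 activations with int8/int4 weights.
    """
    qmax = 2 ** (bits - 1) - 1                 # 32767, 127, 7 for 16/8/4 bits
    scale = max(abs(v) for v in values) / qmax or 1.0
    return [max(-qmax, min(qmax, round(v / scale))) for v in values], scale

def dequantize(quantized, scale):
    return [q * scale for q in quantized]

def max_error(values, bits):
    """Worst-case round-trip error at a given bit width."""
    q, s = quantize(values, bits)
    return max(abs(a - b) for a, b in zip(values, dequantize(q, s)))

activations = [0.5, -1.2, 3.3, 0.0]
err16 = max_error(activations, 16)   # int16: fine-grained, 2 bytes per value
err8 = max_error(activations, 8)     # int8: coarser, 1 byte per value
err4 = max_error(activations, 4)     # int4: coarsest, packs 2 values per byte
assert err16 < err8 < err4           # precision falls as the bit width shrinks
```

The trade-off behind the bullet points above falls out directly: int16 keeps far finer resolution than int8 at twice the storage, while int8 and the new INT4/INT2 paths shrink memory and bandwidth further at the cost of precision.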

Community updates

We’ve also heard from the community about the need to fix bugs quickly and provide more timely dependency updates, so we are devoting more resources to these efforts. Going forward, we will focus exclusively on:

  • Security and bug fixes: We are increasing our efforts to quickly address security vulnerabilities and critical bugs, releasing minor and patch versions as required.
  • Dependency updates: We will release minor versions as required to support dependency updates, including new Python releases.
  • Community contributions: We will continue to review and accept critical bug fixes as relevant from the open source community.

These commitments will apply to tf.data, TensorFlow Serving, TFX, TensorFlow Data Validation, TensorFlow Transform, TensorFlow Model Analysis, TensorFlow Recommenders, TensorFlow Text, TensorBoard, and TensorFlow Quantum.

Note: The TF Lite project has been renamed to LiteRT and is in active development separately.

While TensorFlow continues to provide stability for production, we recommend exploring our latest updates to Keras 3, JAX, and PyTorch for new Generative AI work.
