News

Deep Netts Now Powered by GPU for Lightning-Fast AI Inference

Introducing GPU Support in Deep Netts 3.1.1: Unleashing Unmatched Performance for AI Workloads

With the release of Deep Netts 3.1.1, we are excited to introduce GPU support for inference, bringing a new level of performance to deep learning models. This update enables significant performance improvements for large-scale models and use cases that require the power of GPUs, particularly for demanding AI tasks such as computer vision.

Massive Performance Gains with GPU Support

For models requiring intensive computation, the performance boost is nothing short of extraordinary. With GPU support, Deep Netts now delivers up to 1000x faster inference than the previous pure-Java CPU implementation. For example, inference time for a large convolutional network architecture such as VGG drops from 19,000 milliseconds on the CPU to just 19 milliseconds on the GPU.

Powered by jCuda and NVIDIA CUDA Libraries

The GPU functionality is powered by the jCuda library, which provides Java Native Interface (JNI) bindings to the CUDA API. This means that Deep Netts can seamlessly offload heavy computational tasks to GPU hardware, significantly improving inference speeds. However, running models on GPU requires the native NVIDIA CUDA libraries (v11.2 or higher) to be available on your runtime platform.
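
As a quick sanity check before deploying, you can verify from the JVM that the CUDA driver and an NVIDIA GPU are visible. The sketch below uses only the jCuda driver bindings (not the Deep Netts API) and assumes the org.jcuda:jcuda artifact is on the classpath and the native CUDA 11.2+ libraries are installed:

    import jcuda.driver.CUdevice;
    import jcuda.driver.JCudaDriver;

    /**
     * Illustrative check (not part of the Deep Netts API) that a CUDA-capable
     * NVIDIA GPU and the native CUDA driver are visible from the JVM.
     */
    public class CudaCheck {

        public static void main(String[] args) {
            // Throw exceptions instead of returning error codes, for readable failures.
            JCudaDriver.setExceptionsEnabled(true);

            // Initialize the CUDA driver API.
            JCudaDriver.cuInit(0);

            // Report the installed driver version (e.g. 11020 for CUDA 11.2).
            int[] version = new int[1];
            JCudaDriver.cuDriverGetVersion(version);
            System.out.println("CUDA driver version: " + version[0]);

            // Count and list the available CUDA devices.
            int[] deviceCount = new int[1];
            JCudaDriver.cuDeviceGetCount(deviceCount);
            System.out.println("CUDA devices found: " + deviceCount[0]);

            for (int i = 0; i < deviceCount[0]; i++) {
                CUdevice device = new CUdevice();
                JCudaDriver.cuDeviceGet(device, i);

                byte[] name = new byte[256];
                JCudaDriver.cuDeviceGetName(name, name.length, device);
                System.out.println("Device " + i + ": " + new String(name).trim());
            }
        }
    }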

Developer-Friendly Integration

One of the key advantages of GPU support in Deep Netts is its developer-friendly architecture. The complexities of GPU implementation and usage are completely abstracted away, allowing developers to focus on building and running their AI models without worrying about the intricacies of GPU processing. The architecture is also designed to be flexible and scalable, so it can evolve as future hardware and APIs become available.
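
To illustrate what this abstraction looks like in practice, here is a minimal inference sketch. The class and method names used below (FileIO.createFromFile, Tensor, setInput, getOutput) follow the general pattern of the Deep Netts API but are shown as illustrative assumptions; consult the 3.1.1 documentation for the exact calls. The point to notice is that the code contains no GPU-specific statements, since the engine decides where the computation runs:

    import deepnetts.net.ConvolutionalNetwork;
    import deepnetts.util.FileIO;
    import deepnetts.util.Tensor;

    import java.io.File;
    import java.util.Arrays;

    /**
     * Illustrative sketch only: names and signatures are assumptions based on
     * the Deep Netts API and may differ in your version.
     * Nothing here refers to the GPU; when the native CUDA libraries are
     * present, the heavy computation is offloaded transparently.
     */
    public class GpuTransparentInference {

        public static void main(String[] args) throws Exception {
            // Load a previously trained convolutional network from file (hypothetical file name).
            ConvolutionalNetwork network =
                    FileIO.createFromFile(new File("trainedNetwork.dnet"), ConvolutionalNetwork.class);

            // Prepare an input tensor, e.g. a preprocessed 64x64 RGB image (shape is model-specific).
            Tensor input = new Tensor(64, 64, 3);

            // Run inference exactly as on CPU-only versions of Deep Netts.
            network.setInput(input);
            float[] output = network.getOutput();

            System.out.println("Predicted class probabilities: " + Arrays.toString(output));
        }
    }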

Current Limitations

While this release introduces powerful new capabilities, there are a few current limitations to be aware of:

  • NVIDIA GPUs only: GPU support is currently available exclusively for NVIDIA GPUs using the CUDA API.
  • Inference only: GPU acceleration is currently available only for inference tasks and not for model training.

Future Development: Expanding GPU and Accelerator Support

The Deep Netts team is actively developing and testing GPU support for training. Future releases will also focus on expanding support to other hardware accelerators, leveraging tools such as the Foreign Memory API from OpenJDK's Project Panama and accelerator frameworks like TornadoVM to enable compatibility with FPGAs and other hardware accelerators.

When CPU Still Shines

It’s important to note that in some cases, particularly for certain types of workloads, CPU-based execution may still be more efficient than using a GPU. This is especially true when the nature of the task doesn’t benefit from the GPU’s batched processing, for example low-latency inference on single inputs or small models, where the overhead of transferring data to and from the GPU can outweigh the speedup.


With this latest release, Deep Netts continues to push the boundaries of performance and usability for machine learning in Java environments. The introduction of GPU support marks a significant milestone, enabling developers to run AI workloads faster and more efficiently than ever before.

Download Deep Netts 3.1.1 today and experience the power of GPU acceleration!