TensorRT

From eLinux.org

NVIDIA TensorRT™ is a platform for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for deep learning inference applications. TensorRT-based applications perform up to 40x faster than CPU-only platforms during inference. With TensorRT, you can optimize neural network models trained in all major frameworks, calibrate for lower precision with high accuracy, and finally deploy to hyperscale data centers, embedded, or automotive product platforms.
 
== Introduction ==

[https://developer.nvidia.com/tensorrt TensorRT Download]<br>
[https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html TensorRT Developer Guide]

== FAQ ==

=== How to check the TensorRT version? ===
There are two methods to check the TensorRT version:
* Symbols from the library
<pre>
$ nm -D /usr/lib/aarch64-linux-gnu/libnvinfer.so | grep "tensorrt"
0000000007849eb0 B tensorrt_build_svc_tensorrt_20181028_25152976
0000000007849eb4 B tensorrt_version_5_0_3_2
</pre>
NOTE: 20181028 is the build date, 25152976 is the top changelist, and 5_0_3_2 is the version information (i.e. TensorRT 5.0.3.2).<br>

* Macros from the header file
<pre>
$ grep "define NV_TENSORRT" /usr/include/aarch64-linux-gnu/NvInfer.h
#define NV_TENSORRT_MAJOR 5 //!< TensorRT major version.
#define NV_TENSORRT_MINOR 0 //!< TensorRT minor version.
#define NV_TENSORRT_PATCH 3 //!< TensorRT patch version.
#define NV_TENSORRT_BUILD 2 //!< TensorRT build number.

#define NV_TENSORRT_SONAME_MAJOR 5 //!< Shared object library major version number.
#define NV_TENSORRT_SONAME_MINOR 0 //!< Shared object library minor version number.
#define NV_TENSORRT_SONAME_PATCH 3 //!< Shared object library patch version number.
</pre>
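As an illustrative aside (an editor's sketch, not part of the original page), the macro lines above can be combined into a dotted version string. The snippet parses sample header text rather than reading NvInfer.h directly, since the header path varies by platform:

```python
import re

# Sample "#define NV_TENSORRT_*" lines as printed above; on a real system this
# text would come from reading NvInfer.h (the path varies by platform).
header_text = """
#define NV_TENSORRT_MAJOR 5
#define NV_TENSORRT_MINOR 0
#define NV_TENSORRT_PATCH 3
#define NV_TENSORRT_BUILD 2
"""

def tensorrt_version(text):
    """Collect NV_TENSORRT_{MAJOR,MINOR,PATCH,BUILD} into a dotted version string."""
    macros = dict(re.findall(r"#define NV_TENSORRT_(\w+)\s+(\d+)", text))
    return ".".join(macros[k] for k in ("MAJOR", "MINOR", "PATCH", "BUILD"))

print(tensorrt_version(header_text))  # 5.0.3.2
```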

''Latest revision as of 03:36, 29 February 2024''

----

=== Official FAQ ===
[https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#troubleshooting TensorRT Developer Guide#FAQs]<br>



----

=== Common FAQ ===
You can find answers to some common questions about using TRT here.<br>
Refer to the page [https://elinux.org/TensorRT/CommonFAQ TensorRT/CommonFAQ]<br>



----

=== TRT Accuracy FAQ ===
If your FP16 or INT8 results are not as expected, the page below may help you fix the accuracy issues.<br>
Refer to the page [https://elinux.org/TensorRT/AccuracyIssues TensorRT/AccuracyIssues]<br>
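As a generic illustration (an editor's sketch, not content from the linked FAQ), the basic sanity check behind many accuracy investigations is comparing a reduced-precision output against an FP32 reference with a relative-error metric. Here the "engine output" is simulated by round-tripping values through IEEE half precision:

```python
import struct

def to_fp16(x):
    """Round-trip a float through IEEE half precision (struct 'e' format)."""
    return struct.unpack('e', struct.pack('e', x))[0]

def max_rel_error(ref, test, eps=1e-6):
    """Maximum elementwise relative error of `test` against a reference."""
    return max(abs(t - r) / (abs(r) + eps) for r, t in zip(ref, test))

# Hypothetical outputs: an FP32 reference vs. the same values through FP16,
# mimicking what a half-precision engine might produce.
ref = [0.1 + i * 0.01 for i in range(1000)]
fp16_out = [to_fp16(x) for x in ref]

print(max_rel_error(ref, fp16_out))  # on the order of FP16 epsilon (~5e-4)
```

If the relative error of a real engine is far above the precision's machine epsilon, the divergence usually comes from a specific layer, which the linked page helps you isolate.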



----

=== TRT Performance FAQ ===
If inference performance with TRT is not as expected, the page below may help you optimize it.<br>
Refer to the page [https://elinux.org/TensorRT/PerfIssues TensorRT/PerfIssues]<br>
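Before diving into TRT-specific tuning, it helps to measure latency consistently. The sketch below is framework-agnostic and hypothetical: the `infer` callable is a stand-in workload, where a real benchmark would invoke the TensorRT execution context (or simply use `trtexec`):

```python
import time
import statistics

def measure_latency_ms(infer, warmup=10, iters=100):
    """Time repeated calls to `infer`; report mean and p99 latency in milliseconds."""
    for _ in range(warmup):                 # warm-up iterations are excluded
        infer()
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        infer()
        samples.append((time.perf_counter() - t0) * 1e3)
    samples.sort()
    return statistics.mean(samples), samples[int(0.99 * (iters - 1))]

# Stand-in workload; a real benchmark would call the TRT execution context.
mean_ms, p99_ms = measure_latency_ms(lambda: sum(i * i for i in range(10_000)))
print(f"mean={mean_ms:.3f} ms  p99={p99_ms:.3f} ms")
```

Reporting a tail percentile alongside the mean matters because clock throttling or contention shows up first in p99.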



----

=== TRT Int8 Calibration FAQ ===
The page below presents some FAQs about TRT INT8 calibration.<br>
Refer to the page [https://elinux.org/TensorRT/Int8CFAQ TensorRT/Int8CFAQ]<br>
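To make the topic concrete, here is an editor's sketch of the arithmetic that calibration feeds: TRT-style symmetric INT8 quantization maps a calibrated dynamic range |x| &le; amax onto [-127, 127] via a single scale. The activation data and the chosen `amax` below are made up for illustration:

```python
def quantize_int8(values, amax):
    """Symmetric INT8 quantization: calibrated range |x| <= amax maps to [-127, 127]."""
    scale = amax / 127.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

# Hypothetical activations; in TRT, amax would come from calibration statistics.
acts = [x / 100.0 for x in range(-300, 301)]      # values in [-3.0, 3.0]
amax = 2.5                                        # pretend calibration chose this range
q, scale = quantize_int8(acts, amax)

# For values inside the range, quantization error is bounded by half a step;
# values outside the range are saturated to +/- amax.
errs = [abs(d - max(-amax, min(amax, a))) for d, a in zip(dequantize(q, scale), acts)]
print(max(errs) <= scale / 2 + 1e-9)  # True
```

The trade-off calibration resolves is exactly this one: a smaller `amax` gives finer steps inside the range but saturates more outliers.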



----

=== TRT Plugin FAQ ===
The page below presents some FAQs about TRT plugins.<br>
Refer to the page [https://elinux.org/TensorRT/PluginFAQ TensorRT/PluginFAQ]<br>



----

=== How to fix some common errors ===
If you hit errors while using TRT, the page below may have the answer.<br>
Refer to the page [https://elinux.org/TensorRT/CommonErrorFix TensorRT/CommonErrorFix]<br>



----

=== How to debug or analyze ===
The page below will help you debug your inference in several ways.<br>
Refer to the page [https://elinux.org/TensorRT/How2Debug TensorRT/How2Debug]<br>



----

=== TRT & YoloV3 FAQ ===
Refer to the page [https://elinux.org/TensorRT/YoloV3 TensorRT/YoloV3]<br>



----

=== TRT & YoloV4 FAQ ===
Refer to the page [https://elinux.org/TensorRT/YoloV4 TensorRT/YoloV4]<br>



----

=== TRT ONNXParser FAQ ===
If you have questions about ONNX dynamic shapes or ONNX parsing issues, this page might be helpful.<br>
Refer to the page [https://elinux.org/TensorRT/ONNX TensorRT/ONNX]<br>



----

=== The Usage of Polygraphy ===
Polygraphy is a very useful debugging toolkit for TensorRT.<br>
Refer to the page [https://elinux.org/TensorRT/Polygraphy_Usage TensorRT/Polygraphy_Usage]<br>