NVIDIA’s “Chat With RTX” Turns Your GPU Into A Personal AI Chatbot

Low Boon Shen

Everyone knows ChatGPT – the AI chatbot capable of many things. With the next challenge for tech companies being to run large language models (LLMs) locally, NVIDIA has presented its answer in the form of “Chat with RTX”.

The advantage of NVIDIA’s solution is that it runs locally, meaning no internet connection is required to operate, with the added bonus that sensitive information never leaves your system to be sold to the highest bidder. The application taps into the Tensor Cores of NVIDIA GPUs, which serve various functions – mainly DLSS in gaming workloads. Over time, Tensor Cores have seen increased use in non-gaming scenarios, such as webcam auto-framing, background blur, and more.

NVIDIA's "Chat With RTX" Turns Your GPU Into A Personal AI Chatbot 6

NVIDIA's "Chat With RTX" Turns Your GPU Into A Personal AI Chatbot 6

Those in the loop will know it takes a lot of computing to build something like ChatGPT, as the parameters number in the billions, making calculations extremely compute-intensive. So, how does Chat with RTX work? Simply put, the locally-run version utilizes TensorRT-LLM and Retrieval-Augmented Generation (RAG) software, with a downsized version of the language model capable of running on a single GPU.
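To make the RAG side of that concrete, here is a minimal Python sketch of the idea – an illustration, not NVIDIA’s actual code. Every name here is hypothetical: local_llm() stands in for the TensorRT-LLM inference call, and the crude word-overlap score() is a placeholder for the vector-embedding search a real pipeline would use.

```python
def local_llm(prompt: str) -> str:
    # Stand-in for the local TensorRT-LLM inference call; returns a
    # placeholder so the sketch runs end to end without a real model.
    return "(model output would appear here)"

def score(chunk: str, question: str) -> float:
    """Crude relevance score: fraction of question words found in the chunk.
    A real RAG pipeline would compare vector embeddings instead."""
    q_words = set(question.lower().split())
    c_words = set(chunk.lower().split())
    return len(q_words & c_words) / max(len(q_words), 1)

def retrieve(chunks: list[str], question: str, k: int = 3) -> list[str]:
    """Return the k chunks most relevant to the question."""
    return sorted(chunks, key=lambda c: score(c, question), reverse=True)[:k]

def answer(chunks: list[str], question: str) -> str:
    """Build an augmented prompt so the model answers *from* the retrieved
    text; this grounding is what keeps responses tied to your own files."""
    context = "\n\n".join(retrieve(chunks, question))
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    return local_llm(prompt)
```

The key point is that the language model never has to memorize your documents: retrieval injects the relevant passages into the prompt at question time, which is why a downsized model running on a single GPU can still answer accurately about them.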

The premise of Chat with RTX is that you can use it to sift through files and content on your system. The process looks like this: feed the AI your files, then ask the chatbot for the information you’re looking for. The application will provide an answer drawn from the contents provided, saving time in scenarios where there are tons of complex documents or content unfamiliar to the end user. It also works with online content, such as YouTube videos – as demonstrated below.
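Continuing the sketch above, the “feed the AI your files” step might look like the following – again purely illustrative. The folder name, chunk size, and .txt-only filter are assumptions for brevity (the real application accepts several document formats), and this reuses the answer() helper from the previous sketch.

```python
from pathlib import Path

def load_chunks(folder: str, chunk_words: int = 200) -> list[str]:
    """Split every .txt file in `folder` into ~200-word chunks so the
    retriever can pull only the relevant passages into the prompt."""
    chunks = []
    for path in Path(folder).glob("*.txt"):
        words = path.read_text(encoding="utf-8", errors="ignore").split()
        for i in range(0, len(words), chunk_words):
            chunks.append(" ".join(words[i:i + chunk_words]))
    return chunks

# Usage: index a folder once, then ask questions against it.
chunks = load_chunks("my_documents")  # hypothetical folder name
print(answer(chunks, "What does the report say about Q3 revenue?"))
```

Swapping in real embeddings and a TensorRT-LLM backend would turn this outline into something closer to what the application actually does.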

The catch? You’ll need an RTX 30 or 40 Series GPU with at least 8GB of VRAM. All desktop GPUs in these two families will work, with the exception of the RTX 3050 6GB; on laptops, you’ll need an RTX 3070 or RTX 4060 and above. The application also requires GPU driver version 535.11 or newer, along with 16GB or more of system RAM. If you have the hardware and want to try it out for yourself, click here to download the demo (it’s pretty large, at 35.1GB).

Pokdepinion: This might be the one AI implementation that I find a lot more useful than others. 
