Run Your Own Chatbot Using Nvidia Chat With RTX
- By Paul Mah
- February 14, 2024
Nvidia this week unveiled a generative AI chatbot that runs completely on a consumer-grade Windows PC.
As its name suggests, Chat with RTX works with Nvidia RTX GPUs and comes with features that let users personalize the chatbot with their content.
Under the hood, Chat with RTX uses retrieval-augmented generation (RAG), the open-source Nvidia TensorRT-LLM software, and Nvidia RTX acceleration to power its generative AI capabilities.
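To illustrate the RAG pattern in the abstract, the sketch below retrieves the local snippet most relevant to a question and prepends it to the prompt a language model would receive. This is a toy illustration of the general technique, not Nvidia's implementation: the word-overlap scoring and all function names are assumptions for demonstration purposes.

```python
# Toy sketch of retrieval-augmented generation (RAG):
# 1) retrieve the most relevant local snippet for a question,
# 2) prepend it as context to the prompt given to the model.
# Real systems (like Chat with RTX) use embeddings, not word overlap.

def tokenize(text):
    """Split text into a set of lowercase words."""
    return set(text.lower().split())

def retrieve(question, documents):
    """Return the document sharing the most words with the question."""
    q = tokenize(question)
    return max(documents, key=lambda d: len(q & tokenize(d)))

def build_prompt(question, documents):
    """Assemble an augmented prompt: retrieved context plus the question."""
    context = retrieve(question, documents)
    return f"Context: {context}\nQuestion: {question}\nAnswer:"

docs = [
    "The team meeting notes from Monday morning.",
    "My partner recommended the steakhouse on the Strip in Las Vegas.",
]
print(build_prompt("Which restaurant was recommended in Las Vegas?", docs))
```

The point of the retrieval step is that the model never needs the user's whole corpus in its prompt: only the snippet judged relevant is sent, which is what keeps local, private data practical to query.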
Chat with RTX
Chat with RTX offers several benefits, chief among them the ability to access a large language model (LLM) without an Internet connection.
“Since Chat with RTX runs locally on Windows RTX PCs and workstations, the provided results are fast – and the user’s data stays on the device,” explained Nvidia in a blog announcing the release of Chat with RTX.
“Rather than relying on cloud-based LLM services, Chat with RTX lets users process sensitive data on a local PC without the need to share it with a third party or have an internet connection.”
The tool can also load local files, including .txt, .pdf, .docx, and .xml, by pointing the app at a folder containing them. Users can also include information from YouTube videos and playlists by simply adding a video URL.
The feature makes it possible for users to quickly access relevant information using typed queries. As described by Nvidia, one could ask, “What was the restaurant my partner recommended while in Las Vegas?” and Chat with RTX will scan local files the user points it to and provide the answer with context.
Works with other LLMs
Chat with RTX defaults to AI startup Mistral’s open-source model but supports other text-based models, including Meta’s Llama 2, which is also open-source.
Unlike OpenAI’s ChatGPT, Chat with RTX doesn’t remember the context of prompts. Asking Chat with RTX for examples of birds in one prompt and then asking for a description of “the birds” in the next will draw a blank – users need to spell everything out explicitly each time.
Chat with RTX runs on a Windows 10 or Windows 11 PC with an RTX 30 Series GPU or newer, at least 8GB of VRAM, and 16GB of RAM. You can download the beta here – the download is 35GB.
Image credit: iStockphoto/Manfort Okolie
Paul Mah
Paul Mah is the editor of DSAITrends, where he reports on the latest developments in data science and AI. A former system administrator, programmer, and IT lecturer, he enjoys writing both code and prose.