Nvidia Chat with RTX AI chatbot heralds the coming of local AI

Nvidia has released an AI chatbot that works on your computer without connecting to the web. The catch? It needs a recent Nvidia graphics card – from its RTX series – to work.

Chat with RTX delves through your personal or business files to find useful information and answer questions, but your data stays safely on your PC.

Nvidia released a free demo version of the local AI app so anyone can try it out. The app can analyse YouTube videos as well as files and documents stored on your computer.

Ask what restaurant a colleague recommended, and it will search through the transcripts of past meetings to find the answer. Ask when your new job starts, and it will search through the file detailing your contract. When responding, Chat with RTX also shows you the file that contained the answer, so you can check for yourself.

Want to check out YouTube videos without actually watching them? Paste in the URL of the video and you can either generate a summary or ask specific questions about the clip’s contents.

Because your files never leave your computer, you don't risk sending personal or business-sensitive data to cloud-based generative AI systems such as ChatGPT, Microsoft 365 Copilot, and Google Bard.

Engineers at Samsung were caught out when they asked ChatGPT to check over source code, uploading the code to the cloud and potentially adding proprietary data into the training dataset that informs answers to other users. Samsung subsequently banned employees from using generative AI.

Local AI is coming

Local AI is set to be the next big step in AI's world domination (or at least its domination of our computers). There are various local LLMs available already, but most are open source and not particularly user-friendly.

Research firm IDC predicts shipments of so-called AI PCs could explode from nearly 50 million units this year to 167 million by 2027.

Security isn't the only advantage of keeping your files to yourself. Personalised on-device generative AI can draw on the various kinds of data your device holds, analysing not just your files and photos but also readings from your smartphone's sensors, such as location and health stats.

And on mobile devices, local AI has the added benefit of reducing the need for an Internet connection to send off prompts and receive an answer.

Indeed, the major Android phone manufacturers are keen to make on-device AI more accessible: new smartphones such as the Samsung Galaxy S24 and Google Pixel 8 already use local AI for some tasks.

Chat with RTX requirements

Nvidia’s Chat with RTX uses retrieval-augmented generation (RAG) and open-source large language models Mistral and Llama 2. It works with file formats including .txt, .pdf, .doc/.docx and .xml. 
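For the technically curious, the RAG pattern behind the app is simple enough to sketch in a few lines of Python. The example below is purely illustrative, not Nvidia's code: the folder path and the keyword-overlap scoring are stand-ins for the vector-embedding search a real system such as Chat with RTX would use.

    from pathlib import Path

    # Illustrative-only sketch of retrieval-augmented generation (RAG):
    # split local files into chunks, pull back the chunks most relevant
    # to a question, and wrap them into a prompt for the LLM.

    def load_chunks(folder: str, chunk_size: int = 500) -> list[str]:
        """Split every .txt file in a folder into fixed-size chunks."""
        chunks: list[str] = []
        for path in Path(folder).glob("*.txt"):
            text = path.read_text(encoding="utf-8")
            chunks += [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
        return chunks

    def retrieve(question: str, chunks: list[str], top_k: int = 3) -> list[str]:
        """Rank chunks by naive word overlap with the question.
        A real system would use vector embeddings instead."""
        q_words = set(question.lower().split())
        ranked = sorted(chunks, key=lambda c: len(q_words & set(c.lower().split())), reverse=True)
        return ranked[:top_k]

    def build_prompt(question: str, context: list[str]) -> str:
        """Combine the retrieved context and the question into one prompt."""
        return "Answer using only this context:\n" + "\n---\n".join(context) + f"\n\nQuestion: {question}"

    # "./my_documents" is a hypothetical folder of meeting transcripts etc.
    question = "What restaurant did my colleague recommend?"
    prompt = build_prompt(question, retrieve(question, load_chunks("./my_documents")))

The prompt produced this way would then be handed to a locally running model such as Mistral or Llama 2, keeping both your documents and your question on your own machine.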

To use the chatbot you need a Windows 10 or 11 computer with an RTX 30- or 40-series Nvidia GPU and at least 8GB of VRAM. Once installed, the app runs in your browser. It's an early demo, so it isn't completely polished: for example, it doesn't remember previous prompts, so you can't ask follow-up questions or refine answers it has already given.
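If you're not sure what card you have, the nvidia-smi tool that ships with the Nvidia driver will report your GPU's name and total VRAM. Here's one quick way to call it, assuming the driver is installed and nvidia-smi is on your PATH:

    import subprocess

    # Query the GPU name and total memory via nvidia-smi, which ships
    # with the Nvidia driver. Chat with RTX needs an RTX 30- or 40-series
    # card with at least 8GB (8,192 MiB) of VRAM.
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    print(result.stdout.strip())  # e.g. "NVIDIA GeForce RTX 4070, 12282 MiB"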

As with all generative AI, it’s worth fact-checking the information it gives you and double-checking any references it suggests.

Nvidia is showing off Chat with RTX as an example of what's possible when LLMs are combined with RTX GPUs. It's built from the TensorRT-LLM RAG developer reference project, which is available on GitHub. Nvidia encourages developers to build their own apps and plug-ins and enter them in the company's competition to win prizes.


Richard Trenholm

Richard is a former CNET writer who had a ringside seat at the very first iPhone announcement, but soon found himself steeped in the world of cinema. He's now part of a two-person content agency, Rockstar Copy, and covers technology with a cinematic angle for TechFinitive.com.
