Can Salesforce Einstein detoxify AI?

Salesforce claims it will rid the company’s Einstein AI of hallucinations and “toxicity” by running generated content through a “trust layer”.

Salesforce used this week’s Dreamforce conference in San Francisco to unveil new AI products, including Einstein Copilot Studio, a tool that lets customers tailor AI tools to their particular brand or to individual divisions within a company.

In his keynote speech at Dreamforce, Salesforce CEO Marc Benioff tackled some of the fears companies have about deploying AI tools, not least losing control of their data or receiving harmful, inaccurate results.

“We have a very different narrative around artificial intelligence,” said Benioff. “We look at things very differently. And we are not afraid to say things that others are afraid to say.”

Benioff claimed that a lot of companies, especially in the B2C market, are exploiting their customers’ data. “I don’t have to tell you, I do not have to explain to anybody here, it is a well-known secret at this point that they’re using your data to make money,” he said. “That is not what we do at Salesforce. Your data isn’t our product.”

Einstein trust layer

Salesforce hopes to overcome some of these problems by ensuring generative AI results are passed through a “trust layer”. Instead of prompts going straight into the large language model (LLM), they will be fed through a filter – both on the way in and the way out – to ensure nothing inappropriate gets through. At least, that’s the theory.

On the way in, the trust layer will look for things such as confidential customer data, preventing it from being sent to the LLM. “We’re not going to send PII information, we’re not going to send credit card information,” said Salesforce’s chief technology officer, Parker Harris. “We’re going to make sure you remain compliant. All that data is going to get masked, that is data that you don’t want to go there.”

On the way back out, the trust layer will attempt to ensure that customers aren’t fed nonsense or harmful content. When “some really amazing AI-generated email or an automated flow comes back, we’re going to make sure that we check it for toxicity, meaning was there a hallucination? Is it toxic? Is there some ethical issue in there? We’re going to scan it and then we’re going to keep an audit trail of all the use of your AI.”
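Salesforce hasn’t disclosed how the trust layer is implemented, but the two-way filtering it describes can be sketched in outline. The snippet below is purely illustrative: the patterns, blocklist and function names are our own assumptions, and a real system would rely on far more sophisticated PII detection and a trained toxicity classifier rather than regexes and keyword lists.

```python
import re

# Illustrative only: crude pattern-based masking of obvious PII before a
# prompt reaches the LLM. These patterns and names are assumptions, not
# Salesforce's actual implementation.
PII_PATTERNS = {
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

# Stand-in for a real toxicity classifier on the way back out.
BLOCKLIST = {"idiot", "stupid"}

def mask_prompt(prompt: str) -> str:
    """Replace detected PII with placeholder tokens on the way in."""
    for label, pattern in PII_PATTERNS.items():
        prompt = pattern.sub(f"<{label.upper()}_MASKED>", prompt)
    return prompt

def check_response(response: str) -> tuple[str, bool]:
    """Flag responses containing blocklisted terms on the way out."""
    flagged = any(word in response.lower() for word in BLOCKLIST)
    return response, flagged

masked = mask_prompt("Email jane@example.com about card 4111 1111 1111 1111")
print(masked)  # the email address and card number are replaced with placeholders
```

Even this toy version shows why the output side is the hard part: masking a credit card number is a pattern-matching problem, whereas deciding whether a fluent paragraph is a hallucination is not.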

Quite how the trust layer will decide if content is toxic or purely invented by the AI isn’t clear. Hallucinations are one of the great pitfalls of generative AI systems, with current models seemingly unable to prevent themselves from presenting misinformation as fact.

Only real-world usage of the tools will determine whether Salesforce can really filter out AI’s fibs, or whether the company is hallucinating itself.

Read our guide to Microsoft 365 Copilot to see what one of Salesforce’s great rivals hopes to achieve with generative AI.

Barry Collins

Barry has 20 years of experience working on national newspapers, websites and magazines. He was editor of PC Pro and is co-editor and co-owner of He has published a number of articles on TechFinitive covering data, innovation and cybersecurity.