As the need for automation increases, are chatbots living up to their potential?
Kaveer Beharee, CEO at Ubiquity AI, looks at how to evaluate chatbots and intelligent conversational agents for commercial use, and why communication, as much as technical expertise, will be key to wider adoption.
One of the most important frameworks that developers can use to measure the efficacy of their bots is PARADISE (PARAdigm for DIalogue System Evaluation). This is a general framework for evaluating spoken dialogue agents, which has been discussed and explored in many theses and post-doctoral studies, such as one from AT&T Labs titled PARADISE: A Framework for Evaluating Spoken Dialogue Agents.
The PARADISE framework has two broad metrics:
• The first part seeks to objectively measure task efficacy – maximising task success relative to dialogue costs.
• The second part focuses on subjective user ratings around ease of use, friendliness of the chatbot, how natural the chatbot is to engage with, content clarity and conversation continuity, and the user’s propensity to use the chatbot again.
The first part of the framework, the objective measure, probes the business case supporting the need (or lack thereof) for a chatbot. The fundamental question here is: does the chatbot fulfil tasks more effectively and more cheaply than our current processes?
In a high proportion of instances, well-designed chatbots will be more cost-effective when compared to other direct engagement channels. Chatbot costs refer to both efficiency costs, including the system costs of successfully executing a task, and qualitative costs, relating to aspects such as incorrect responses, re-prompts and reputational costs.
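To make "task success relative to dialogue costs" concrete, the PARADISE paper combines the two into a single performance score: a weighted, normalised task-success measure minus a weighted sum of normalised cost measures. The sketch below is a minimal illustration of that idea, not the authors' implementation. The function names, the per-dialogue success scores, the two cost measures (turns and re-prompts) and the weights are all hypothetical; in the paper, task success is measured with the Kappa coefficient and the weights are estimated by regressing against user satisfaction ratings rather than set by hand.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Normalise values to zero mean and unit standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values identical: nothing distinguishes the dialogues.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def paradise_performance(task_success, costs, alpha, weights):
    """Score each dialogue as alpha * N(success) - sum_i w_i * N(cost_i).

    task_success: one success score per dialogue (assumed input).
    costs: dict mapping a cost name to one value per dialogue.
    alpha, weights: hand-picked here purely for illustration.
    """
    n_success = z_scores(task_success)
    n_costs = {name: z_scores(vals) for name, vals in costs.items()}
    scores = []
    for d in range(len(task_success)):
        penalty = sum(weights[name] * n_costs[name][d] for name in costs)
        scores.append(alpha * n_success[d] - penalty)
    return scores


# Hypothetical data for three dialogues.
task_success = [0.9, 0.6, 0.3]
costs = {"turns": [6, 10, 14], "reprompts": [0, 2, 4]}
scores = paradise_performance(
    task_success, costs, alpha=0.5, weights={"turns": 0.3, "reprompts": 0.2}
)
```

With these made-up numbers, the first dialogue (highest success, fewest turns and re-prompts) receives the highest score and the third the lowest. Normalising every measure first is what allows success and costs recorded on different scales to be combined in one figure.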
However, a chatbot’s ability to effectively fulfil tasks is a complex issue and arguably the main reason companies are not rushing their chatbots out to market. During lockdown, two of the service providers that I use – my mobile carrier and my health insurer – both rolled out chatbots.
After one use of each, I can say that I won’t be using their chatbots again any time soon. First, neither bot could fulfil the reasonably standard queries for which they were supposedly designed. Second, it was much easier and less frustrating to simply pick up the phone and speak to an agent.
So, what makes a good chatbot a good chatbot? There is no doubt that task efficacy and cost are the main drivers for commerce, but objective factors are only part of what makes a successful chatbot for the commercial market.