Long before ChatGPT came along, governments were keen to use chatbots to automate their services and advice.
Those early chatbots “tended to be simpler, with limited conversational abilities,” says Colin van Noordt, a researcher on the use of AI in government, based in the Netherlands.
But the emergence of generative AI in the last two years has revived a vision of more efficient public service, where human-like advisors can work all hours, replying to questions about benefits, taxes and other areas where the government interacts with the public.
Generative AI is sophisticated enough to give human-like responses, and if trained on enough quality data, in theory it could deal with all sorts of questions about government services.
But generative AI has become well known for making mistakes or even giving nonsensical answers – so-called hallucinations.
In the UK, the Government Digital Service (GDS) has carried out tests on a ChatGPT-based chatbot called GOV.UK Chat, which would answer citizens’ questions on a range of issues concerning government services.
In a blog post about their early findings, the agency noted that almost 70% of those involved in the trial found the responses useful.
However, there were problems with “a few” cases of the system generating incorrect information and presenting it as fact.
The blog also raised concern that there might be misplaced confidence in a system that could be wrong some of the time.
“Overall, answers did not reach the highest level of accuracy demanded for a site like GOV.UK, where factual accuracy is crucial. We’re rapidly iterating this experiment to address the issues of accuracy and reliability.”
Other countries are also experimenting with systems based on generative AI.
Portugal released the Justice Practical Guide in 2023, a chatbot devised to answer basic questions on simple subjects such as marriage and divorce. The chatbot has been developed with funds from the European Union’s Recovery and Resilience Facility (RRF).
The €1.3m ($1.4m; £1.1m) project is based on OpenAI’s GPT 4.0 language model. As well as covering marriage and divorce, it also provides information on setting up a company.
According to data from the Portuguese Ministry of Justice, 28,608 questions were posed through the guide in the project’s first 14 months.
When I asked it the basic question: “How can I set up a company?”, it performed well.
But when I asked something trickier: “Can I set up a company if I am younger than 18, but married?”, it apologised for not having the information to answer that question.
A ministry source admits that the chatbot still falls short in terms of trustworthiness, even though wrong replies are rare.
“We hope these limitations will be overcome with a decisive increase in the answers’ level of confidence”, the source tells me.
Such flaws mean that many experts are advising caution – including Colin van Noordt. “It goes wrong when the chatbot is deployed as a way to replace people and reduce costs.”
A more sensible approach, he adds, is to see chatbots as “an additional service, a quick way to find information”.
Sven Nyholm, professor of the ethics of artificial intelligence at Munich’s Ludwig Maximilians University, highlights the problem of accountability.
“A chatbot is not interchangeable with a civil servant,” he says. “A human being can be accountable and morally responsible for their actions.
“AI chatbots cannot be accountable for what they do. Public administration requires accountability, and so therefore it requires human beings.”
Mr Nyholm also highlights the problem of reliability.
“Newer types of chatbots create the illusion of being intelligent and creative in a way that older types of chatbots didn’t used to do.
“Every now and then these new and more impressive forms of chatbots make silly and stupid mistakes – this can sometimes be humorous, but it can potentially also be dangerous, if people rely on their recommendations.”
If ChatGPT and other Large Language Models (LLMs) are not ready to give out important advice, then perhaps we could look to Estonia for an alternative.
When it comes to digitising public services, Estonia has been one of the leaders. Since the early 1990s it has been building digital services, and in 2002 introduced a digital ID card that allows citizens to access state services.
So it’s not surprising that Estonia is at the forefront of introducing chatbots.
The nation is currently developing a suite of chatbots for state services under the name of Bürokratt.
However, Estonia’s chatbots are not based on Large Language Models (LLMs) like ChatGPT or Google’s Gemini.
Instead they use Natural Language Processing (NLP), a technology which preceded the latest wave of AI.
Estonia’s NLP algorithms break down a request into small segments, identify key words, and from that infer what the user wants.
At Bürokratt, departments use their data to train chatbots and check their answers.
“If Bürokratt does not know the answer, the chat will be handed over to customer support agent, who will take over the chat and will answer manually,” says Kai Kallas, head of the Personal Services Department at Estonia’s Information System Authority.
It is a system of more limited potential than one based on ChatGPT, as NLP models are limited in their ability to imitate human speech and to detect hints of nuance in language.
However, they are unlikely to give wrong or misleading answers.
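To make the idea concrete, here is a minimal sketch in Python of the general approach described above: split a request into words, match them against key words, infer what the user wants, and hand over to a human when nothing matches well enough. It is not Bürokratt's actual code. The intents, key words, answers, threshold and function names are all invented for illustration.

```python
import re

# Hypothetical intents and key words, invented for illustration only.
INTENT_KEYWORDS = {
    "renew_id_card": {"renew", "id", "card"},
    "register_company": {"register", "company", "business"},
}

# Hypothetical pre-approved answers that a department would have checked.
ANSWERS = {
    "renew_id_card": "Here is how to renew an ID card.",
    "register_company": "Here is how to register a company.",
}

HANDOFF = "handoff_to_agent"


def tokenize(request):
    """Break a request into small segments (lower-case words)."""
    return set(re.findall(r"[a-z]+", request.lower()))


def infer_intent(request, min_hits=2):
    """Pick the intent sharing the most key words with the request.

    If no intent reaches min_hits matching key words, return HANDOFF
    so a human agent can take over instead of the bot guessing.
    """
    tokens = tokenize(request)
    best_intent, best_hits = None, 0
    for intent, keywords in INTENT_KEYWORDS.items():
        hits = len(tokens & keywords)
        if hits > best_hits:
            best_intent, best_hits = intent, hits
    if best_hits < min_hits:
        return HANDOFF
    return best_intent


def reply(request):
    """Return a vetted answer, or pass the chat to a human agent."""
    intent = infer_intent(request)
    if intent == HANDOFF:
        return "Connecting you to a customer support agent."
    return ANSWERS[intent]


if __name__ == "__main__":
    print(reply("How do I renew my ID card?"))
    print(reply("Can I get married abroad?"))
```

Because every reply comes from a fixed, checked list, a system built this way gives the same answer to the same question, and anything it does not recognise goes to a person rather than being improvised.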
“Some early chatbots forced citizens into choosing options for questions. At the same time, it allowed for greater control and transparency of how the chatbot operates and answers”, explains Colin van Noordt.
“LLM-based chatbots often have much more conversational quality and can provide more nuanced answers.
“However, it comes at a cost of less control of the system, and it can also provide different answers to the same question,” he adds.