At Google I/O, Google demonstrated Google Duplex, an AI-powered voice assistant that can make phone calls for you to perform tasks like making a restaurant reservation or booking a hair salon appointment. After the event, a whole Google Duplex truther movement sprang up, made up of people who simply couldn’t believe technology could do anything even remotely like this, and who accused Google and its CEO Sundar Pichai of lying on stage.
Today, a whole slew of media outlets have published articles about how they were invited to an event at a real restaurant, where the journalists themselves got to talk to Google Duplex. The journalists took on the role of restaurant workers taking reservations requested by Google Duplex. The results? It works exactly as advertised – better, even. Here’s Ars Technica’s Ron Amadeo:
Duplex patiently waited for me to awkwardly stumble through my first ever table reservation while I sloppily wrote down the time and fumbled through a basic back and forth about Google’s reservation for four people at 7pm on Thursday. Today’s Google Assistant requires authoritative, direct, perfect speech in order to process a command. But Duplex handled my clumsy, distracted communication with the casual disinterest of a real person. It waited for me to write down its reservation requirements, and when I asked Duplex to repeat things I didn’t catch the first time (“A reservation at what time?”), it did so without incident. When I told this robocaller the initial time it wanted wasn’t available, it started negotiating times; it offered an acceptable time range and asked for a reservation somewhere in that time slot. I offered seven o’clock and Google accepted.
From the human end, Duplex’s voice is absolutely stunning over the phone. It sounds real most of the time, nailing most of the prosodic features of human speech during normal talking. The bot “ums” and “uhs” when it has to recall something a human might have to think about for a minute. It gives affirmative “mmhmms” if you tell it to hold on a minute. Everything flows together smoothly, making it sound like something a generation better than the current Google Assistant voice.
One of the strangest (and most impressive) parts of Duplex is that there isn’t a single “Duplex voice.” For every call, Duplex would put on a new, distinct personality. Sometimes Duplex comes across as male; sometimes female. Some voices were higher and younger sounding; some were nasally, and some even sounded cute.
Duplex conveyed politeness in the demos we saw. It paused with a little “mmhmm” when the called human asked it to wait, a pragmatic tactic Huffman called “conversational acknowledgement”. It showed that Duplex was still on the line and listening, but would wait for the human to continue speaking.
It handled a bunch of interruptions, out of order questions, and even weird discursive statements pretty well. When a human sounded confused or flustered, Duplex took a tone that was almost apologetic. It really seems to be designed to be a super considerate and non-confrontational customer on the phone.
All calls started with Duplex identifying itself as an automated service that would also record the calls, giving the person on the receiving end of the line the opportunity to object. Such objections are handled gracefully, with the call being handed over to a human operator at Google on an unrecorded line. The human fallback is a crucial element of the system, according to Google, because regardless of permission, not every call will go smoothly.
Google Duplex will roll out in limited testing over the coming weeks and months.