The evolution of in-car voice control

February 15, 2019 12:42 pm

The Amazon Echo Auto is here…nearly. Originally planned for 2018, its release has been pushed back, but it should be available sometime in 2019. The Echo Auto will let us take Alexa on the road, fulfilling similar functions to the in-home Echo family and making our car a home from home.

You’ll be able to play music through your car speakers from your streaming service (Apple Music has been promised but is not yet available), make calls, set reminders, manage your shopping lists, and edit your calendar. These are the classic Skills, already available on the in-home Echo. Beyond these, the car version will also help you find local restaurants or coffee shops on the go and guide you to them using supported apps such as Google Maps, transforming it into a virtual co-pilot and concierge.

Although this in-car support might feel like a leap into the future, voice control has been available in cars in some shape or form for nearly 15 years. Some of these in-car systems used voice control fantastically, limited only by the technology available at the time, but others treated it more as a side feature, likely to cause more harm than good.

Let’s take a walk through the history of voice control in cars, looking at how it has evolved over the last decade and a half, starting with generation 1:

2005 Honda Acura

This was the first car of its kind to come to market. It was the product of a collaboration between IBM and Honda and blazed a trail for voice control in cars, ushering in a new era. Well, that might be true in a parallel universe. Despite its groundbreaking nature, the Acura voice control arrived with less of a bang and more of a whimper. Drivers who reviewed the car regarded the voice functionality as an afterthought. One reviewer talked lovingly of the Acura’s new drive system and the elegance of the cabin, among other features, before mentioning the voice capabilities in a throwaway line.

Interior shot of the Honda Acura. The voice control system could take over the functions of many of the buttons

Unlike Alexa and other voice interfaces, the Acura didn’t have an ‘on word’. The driver primed the system by pushing a button on the steering wheel, which put it into listening mode so that it responded to voice commands. The system had a remarkable level of sophistication for its time. The driver could use voice commands to control the temperature, make calls, use the DVD entertainment system, navigate, and access Acura-Link, a service that provided weather information, traffic reports and more. In all, quite a comprehensive service.

The system did come with limitations. The microphone was mounted in the ceiling and struggled with accents. Drivers had to “speak with clear, natural voice and reduce background noise” by closing windows and ensuring air conditioning vents did not blow near the microphone. It had a restricted vocabulary, so the driver had to learn the pre-set commands before being able to take advantage of the system. Mobile internet access was still in its infancy as well, so recommendations for local coffee shops came from a pre-loaded database, which would quickly go out of date.

The video below gives a glimpse of it in action:

Gen 2

2007 Ford Sync

Ford Sync system installed in the car dashboard

Ford were the third manufacturer to introduce voice control in cars. (We’ve skipped Lexus for the moment. They introduced a voice control system in 2006, but we’ll discuss them below in Gen 3.)

Ford collaborated with Microsoft to bring Sync to market, powered by Microsoft Auto OS. It allowed you to interact with some mobile devices to send texts and make calls, and also to control music from a USB-connected MP3 player. The system used voice guidance, and drivers had to navigate down through a deep menu structure to complete commands. This approach has implications for usability that we will discuss below.

2013 Skoda Octavia Voice Control

Six years on from the Ford Sync, a Skoda Octavia was released with its own form of voice control. Surely in those six years great technological strides had been made, leading to natural and usable voice control? Sadly not, according to the reviews of the time.

Reviewers were scathing and the system was the subject of much vitriol on the forums. Most reviewers said they abandoned it in favor of a physical control called the maxidot. This was a turnable button that allowed you to cycle through the menu and adjust settings. Despite the dexterity needed, it proved more usable than the voice control!

"Say what you see" - complex menus and commands shown on screen as prompts

Both the Ford Sync and the Skoda system were difficult and time-consuming for the driver to use. They both had a restricted vocabulary: for example, they would recognize “ring Kevin” but not “call Kevin”. They also had a deep command structure that required the user to work through options step by step. Rather than saying ‘Take me to the nearest pizza restaurant’, you would need to move through the different levels of menu to find navigation, using the appropriate language at each level. To overcome the learnability problems this imposed on drivers, the systems used voice guidance that would repeat commands and suggest next steps. This was matched with a “say what you see” approach, showing the available options on the screen so the driver could read and then speak.

The complexity of the menus, the number of options and the “say what you see” approach not only drew drivers’ eyes from the road while they worked out what they needed to say, but also obscured the GPS navigation maps, meaning they might miss a critical turning or navigation instruction.
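To make the cost of a restricted vocabulary and a deep command structure concrete, here is a minimal Python sketch. It is purely illustrative: the menu tree, the phrases and the action names are invented for this example and are not taken from the Ford or Skoda systems. It assumes each menu level accepts only its exact pre-set phrases and that any unrecognised phrase ends the attempt.

```python
# Hypothetical menu tree (invented for illustration, not a real system's menu).
# Each level accepts only its exact pre-set phrases.
MENU = {
    "navigation": {
        "points of interest": {
            "restaurants": "LIST_RESTAURANTS",
        },
    },
    "phone": {
        "ring": "DIAL_CONTACT",
    },
}


def step(node, utterance):
    """Move one level down the tree; return None if the phrase is not a preset."""
    if not isinstance(node, dict):
        return None
    return node.get(utterance.strip().lower())


def run_dialogue(utterances):
    """Feed utterances in order; return (action_or_None, turns_used)."""
    node = MENU
    turns = 0
    for utterance in utterances:
        turns += 1
        node = step(node, utterance)
        if node is None:
            return None, turns  # unrecognised phrase: the driver must start again
        if isinstance(node, str):
            return node, turns  # reached a leaf action
    return None, turns  # ran out of utterances before reaching an action
```

In this sketch, finding restaurants takes three correctly worded turns (“navigation”, “points of interest”, “restaurants”), while a single natural request such as “take me to the nearest pizza restaurant” fails on the first turn. Likewise “phone” then “ring” succeeds, but “phone” then “call” does not.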

You can see the system in action here:

Gen 3

2017 Lexus RX350

Lexus “say as you see” category menu

The first version of the Lexus system came out in 2006, a year after the Acura, but at the time it offered less functionality. It was limited to controlling a phone tethered to the car through Bluetooth.

Over the next decade Lexus continued to develop their in-car voice systems, gradually adding more functionality. Their 2017 model still uses a structured menu approach and “voice guidance” to access functions. This makes interactions slow and unwieldy, and also relies upon drivers looking at the screen to read and select options to move forward.

In the screenshot at the top of this section, you can see screen prompts for the driver to select a category. In the screenshot below, the driver has asked for banks nearby and the system has provided a list. The driver now has to read the options and reply with the number of the bank they want directions to.

To select a bank from the list you read out “number 4”

You can watch this interaction below

Working through the command structure takes a long time, imposes a heavy cognitive load and takes the driver’s eyes off the road, with obvious implications for usability and for safety.

Ultimately, this is a central usability issue, because drivers will abandon a task if it is too difficult. A study by the AA suggested that up to 25 percent of voice-controlled tasks were abandoned by drivers.

Gen 4

In our whistle-stop history of voice control in cars, we have seen an evolution in what these systems can offer.

The Honda Acura was the first generation of voice control in cars. It offered control of all the car systems but could not be updated as technology moved on. The original Ford Sync and the original Lexus systems introduced the second generation, with the ability to link to phones and music players. They were not able to control in-car functions, and were also hampered by deep command structures and slow voice guidance interactions.

We saw that in Gen 3 functionality increased, and drivers were able to both control the car and connect to devices, as seen in the 2017 Lexus system. But these systems remain constrained by the same issues that affected previous generations, issues perhaps magnified by the increased functionality.

So what can we expect for Gen 4?

In 2019 we are used to more natural voice control in the home, with assistants that control our home devices, help with our daily chores and entertain us. Amazon are promising to bring that full experience to the car.

The Echo doesn’t need a button to be activated and it doesn’t require drivers to read a screen and select options. It will also connect you to your home controls and won’t be restricted to any particular car. This is a leap forward in the functionality available to the driver, but what of the usability? The “say what you see” distraction has been removed, but will the more natural voice interaction overcome the problems with learnability of commands? And what of the cognitive load of being able to do so much more in the car while we should be driving? We can’t wait for the release so we can start testing it out.

No article is written in isolation; the ideas, research and crafting are all the result of teamwork. For this one, special thanks are due to Kevin Mok and Lucy Buykx for their contributions.