The Voice Assistant Wars

On September 24th, 2019, the Voice Interoperability Initiative was launched. Amazon, Microsoft and over 30 other companies announced their support for the initiative.

This means that multiple voice bots can run simultaneously on the same voice device, each accessed via its own wake word.

Concurrency, by contrast, means that the device supports more than one assistant but can only be configured to use one assistant at a time.
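The distinction between the two modes can be sketched in a few lines of Python. This is only an illustration of the definitions above, not how any real device works: the `Device` class, its `mode` and `active` fields, and the wake-word table are all hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Device:
    # Hypothetical table mapping a wake word to the assistant that answers to it.
    assistants: Dict[str, str]
    # "simultaneous": every installed assistant answers to its wake word.
    # "concurrent": several are installed, but only the configured one answers.
    mode: str = "simultaneous"
    active: Optional[str] = None  # only consulted in concurrent mode

    def route(self, utterance: str) -> Optional[str]:
        """Return the assistant that should handle the utterance, if any."""
        text = utterance.lower()
        for wake_word, assistant in self.assistants.items():
            if text.startswith(wake_word):
                if self.mode == "simultaneous" or assistant == self.active:
                    return assistant
                return None  # installed, but not the configured assistant
        return None  # no wake word recognized
```

In simultaneous mode, "Alexa, ..." and "Cortana, ..." both reach their assistants on the same device. In concurrent mode with Alexa configured as active, the Cortana request is simply ignored.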

This was the first salvo of the voice assistant wars, which we predict Google will win.

Some of the largest companies, such as Google, have so far refused to join the initiative. Google allows the bots to be concurrent but not simultaneous.

It makes sense that Google might resist such an initiative. Google is positioned to be the dominant force in voice assistants, and if consumers are eventually forced to choose only one bot, Google would likely be that choice.

Google has a competitive advantage

Not only does it have by far the greatest access to data, it also has the world’s most advanced AI. Most importantly, however, it has the world’s most powerful digital panopticon for monetizing data.

Voice assistants can listen to customers all day long and Google is well positioned to monetize what it hears with advertisers. And this is in addition to the information that Google learns from interactions with the voice assistant.

Google can therefore afford to deploy the most resources to its voice assistant initiative because it has a clear technological and data advantage, and because it will make the most revenue by far out of the data.

It is therefore clear why Amazon and others are keen to commoditize the voice assistant market. They recognize Google’s advantage and do not want to make this a competition where a single chatbot wins, as they are likely to lose that competition.

Amazon clearly sees voice assistants as a way of reducing friction in the customer value chain around purchasing goods and to some extent doesn’t care who controls the channel. The initiative is therefore less of a gamble for them than for others.

Even if regulatory intervention (which may come in the future) restricts listening on voice devices and similar practices, Google should emerge as the beneficiary of the data. This holds even if consumers end up using multiple devices and multiple voice bots.

This is because Google has a monopoly on monetizing data and all device makers will be incentivized to monetize their data through Google (via ads either on the chatbot or on other platforms).

There is another way that Google can win in the multiple voice assistant scenario.

In the multiple-assistant scenario, several bots listen simultaneously, each for its own hotword. This means the hotword is roughly analogous to a domain name or phone number.

If the majority of companies begin to allow consumers to access information and services over the voice assistants, the consumers will need a way of finding the official company bot on the device.

Google is perfectly positioned to provide this service of connecting consumers to companies, as it already does on the internet. Consumers normally search for the company on the internet and then access the company information through Google, often without even going to the company website.

Since Google already has the information for most companies in the world, it can very easily add a voice assistant connection.

Google’s strategy, of course, is to disintermediate companies from the consumer and standardize the information. This is clearly its strategy on the web. The same strategy could easily be applied to voice assistants.

Even if it is possible to access multiple bots simultaneously, the Google assistant would be able to access the bots of other companies and complete the action at hand without the customer ever engaging with the company directly.

It will take Google more time to commoditize services offered by companies on voice assistants as, unlike for general information, in order to aggregate services it has to design the interfaces for the service in question.

Google, however, has a head start on this because it is already aggregating services on the internet in many areas, such as flight booking and ride-hailing.


There is also more potential for Google as an aggregator on voice assistants because of the nature of the interaction. It is very inefficient to filter and search on a voice device and therefore it is likely that consumers will leave most of this filtering and searching to the assistant (provided they can trust it). For example, they might say, “I want to order a pizza” to the assistant and based on limited criteria (such as rating and speed of delivery), the assistant will organize the purchase and delivery based on its own (not the consumer’s) preference. This is an extremely powerful and valuable position for a company to be in.
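To make the pizza example concrete, here is a minimal sketch of an assistant choosing among offers on the two criteria mentioned above, rating and speed of delivery. Everything in it is assumed for illustration: the `Offer` fields, the `pick_offer` function, and the weights are hypothetical. The point is that the weights belong to the assistant, not to the consumer.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Offer:
    vendor: str
    rating: float           # e.g. average review score out of 5
    delivery_minutes: int   # estimated time to delivery


def pick_offer(offers: List[Offer],
               rating_weight: float = 1.0,
               speed_weight: float = 0.05) -> Offer:
    """Return the single offer the assistant would order from.

    The score rewards a high rating and penalizes slow delivery.
    The weights are set by whoever operates the assistant.
    """
    def score(offer: Offer) -> float:
        return rating_weight * offer.rating - speed_weight * offer.delivery_minutes

    return max(offers, key=score)
```

Changing `speed_weight` alone changes which vendor gets the order, without the consumer saying anything different. That is the leverage the aggregator holds.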

This commoditization is good for the consumer because it drives down prices, but it makes it much more difficult for the companies being aggregated to make profits. In many cases, they will create strategies to resist commoditization, such as introducing highly customized, long-term contracts, as the mobile operators have done.

Standardization of voice bots is a necessary step for real productivity to be achieved. It is therefore inevitable that Google will join this initiative.

What about Apple?

Apple has also not yet joined the initiative. Historically it has pursued a strategy of being a closed ecosystem, which provided it with outsized profits on a relatively small market share, until it went mainstream by innovating breakthrough products faster than its competitors.

It is likely that Apple will want to control the point of access to the assistants and therefore force developers of assistant functionality to use its point of access. This is like forcing app developers to build separately for iOS and Android rather than to a common standard. Apple will therefore be able to monetize the access this way.

Just as with apps and browsers, the size of Apple’s user base will force every assistant vendor to make its software available on Apple devices, so users of Apple products won’t lose out. The problem for Apple is that Google’s functionality may become so good that Apple has to concede the first point of entry for consumers to Google, otherwise its customers will become frustrated.

What are the implications for consumers of voice assistants in the future?

As we have described, individual companies will have their own hotwords, which are the equivalent of URLs or phone numbers, and consumers will likely use Google to access them. These hotwords can either access the company itself, for example for customer service or to purchase a product, or they can access a voice app.

In time Google will develop a business model (like YouTube’s) that shares the monetization of access to the voice assistant with companies that provide services and with the device makers as appropriate.

In time, the initiative announced yesterday by Amazon and others will seem quaint. Consumers will need to be able to access millions of bots through unique hotwords, just as they access companies through telephone numbers or domain names. There will therefore come a time when a hotword market opens up, just as has happened for domain names. An agreement on assistants and hotwords between 30 or so players will seem small.
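The domain-name analogy suggests what a hotword registry might look like. The sketch below is purely hypothetical, since no such registry exists in the initiative as described here. `HotwordRegistry`, `register` and `resolve` are invented names, and the "endpoint" is just a label standing in for a company bot or voice app.

```python
from typing import Dict, Optional


class HotwordRegistry:
    """Toy registry: each hotword maps to exactly one endpoint, like a domain name."""

    def __init__(self) -> None:
        self._endpoints: Dict[str, str] = {}

    @staticmethod
    def _normalize(hotword: str) -> str:
        # Treat case and extra spaces as insignificant, so "Hey Acme"
        # and "hey  acme" name the same hotword.
        return " ".join(hotword.lower().split())

    def register(self, hotword: str, endpoint: str) -> None:
        key = self._normalize(hotword)
        if key in self._endpoints:
            # Uniqueness is what would make a hotword scarce and tradable.
            raise ValueError(f"hotword already taken: {key}")
        self._endpoints[key] = endpoint

    def resolve(self, hotword: str) -> Optional[str]:
        """Return the endpoint registered for the hotword, or None."""
        return self._endpoints.get(self._normalize(hotword))
```

The first-come, first-served rule in `register` is what would turn hotwords into something that can be bought and sold, as domain names are.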

The other player, Facebook

One confounding player with regard to Google’s strategy is Facebook. Facebook owns the main messaging platforms and has made great efforts to provide users with direct access to companies through them. This means that Facebook has an advantage when it comes to text-based assistants. The question is whether users will prefer text-based assistants or voice assistants. In the longer term, it must be voice assistants, as it is much faster to speak than to write, especially on a mobile phone. That doesn’t mean, however, that Facebook can’t build voice bots into its messaging platforms. There may be an advantage in voice bots having a chat or graphical screen associated with them (as has been a trend with voice assistants created by Amazon and Facebook).

While Facebook also has a great advantage in having an advertising network it can leverage for the data it obtains through bots, Google has the advantage in terms of scale and search capabilities.

To some extent, this agreement will be bad for device makers who don’t have other sources of competitive advantage, as they are set to be commoditized and will therefore need to compete on brand, quality and features.

This development will be good for chatbot platforms such as Botpress because it makes the market much more open and makes it very clear to companies that they can no longer ignore this channel.

Let the voice assistant wars begin.