Meet Madison and Jackson, AI narrators or “digital voices” soon to read some audiobooks on Apple Books. They are nothing like Siri or Alexa, or the voices telling you about unexpected items in the bagging area of your supermarket checkout. They sound warm, natural, animated. They seem real.
With this level of realism, there’s a real chance that listeners to Apple’s new AI voices will be unaware of their artificiality. Even the phrase used in Apple’s catalog of digitally narrated audiobooks – “This is an Apple Books audiobook narrated by a digital voice based on a human narrator” – is ambiguous. It does not make clear who or what is narrating the story.
This ambiguity means that it’s possible for you to download an audiobook voiced by Jackson, start listening, and think (if you’re thinking about it at all) that the voice you’re hearing belongs to a voice actor. But does it matter?
If listeners are not fully aware that the narrator is digital, ethical questions arise (such as those of consent), because they do not know they are interacting with an AI-powered technology rather than a person.
However, a more complicated – and more interesting – problem arises when we are at once aware and unaware of their artificiality. When you hear an AI narrator, you may know you’re interacting with an artificially intelligent entity. But, as many of us already do with chatbots, many listeners will partially suspend this awareness and extend notions of personhood to digital voices, as we do to the fictional characters in these books.
Most worryingly, Apple’s marketing language is engaged in a pretense of its own, presenting “digital voice” technology as harmless. The Apple Books for Authors audiobook information page emphasizes the technology’s potential to democratize audiobook production and plays down the impact on human artists. Indeed, the website clearly puts the technology on the little guy’s side – Apple claims to “empower indie authors and small publishers”.
This pretense ultimately operates by capitalizing on the multiple meanings of the word “heard.” Apple claims that “only a fraction of books are converted to audio – leaving millions of readers who prefer audiobooks out of reach, whether by choice or need”. Apple’s statement that “every book deserves to be heard” is a pointed choice of words, given the association of being “heard” with democratic representation and inclusiveness.
Apple did not respond to our request for comment before publication.
It is certainly true that digital narration spares authors the financial cost and the time burden of narrating the books themselves. And, indeed, that means more people can create audiobooks.
But by potentially destroying the livelihood of another type of small operator (the voice artist), the new digital narration technology does not so much stand up for the little guy as set two different little guys’ interests against each other.
In a further twist, the dataset used to train Apple’s digital voice has been reported to include, in some cases, the work of existing voice actors, much to their chagrin.
In presenting itself as disrupting “big audiobook” and favoring smaller players, Apple’s marketing follows a recognizable trope. The trope involves a technological arrangement in which individual operators gain the ability to participate in previously closed areas of business activity, without sharing in the corporate profits earned through such “inclusion”.
What is perhaps unsettling about this new technology is not the unfamiliarity of its powers but the familiar ring of “platform capitalism” – when large companies provide the technology for others to operate.
The oft-sued Uber and the oft-banned Airbnb have now lost their luster as engines of accessibility. However, their early identity was built on democratic rhetoric, from Uber telling potential drivers “you’re in charge” to Airbnb’s claim to be founded in “connection and belonging”.
So the use of pseudo-altruistic language by tech disruptors is nothing new. What is new is the window onto this seductive fantasy that encounters with AI narrators offer. After all, assuming your narrator is human involves a self-deception parallel, in many ways, to the one needed to believe that Apple’s digital voice technology is a benevolent development.
It is necessary to reflect on the relationship between these acts of imagination because, often, it is easy to just believe. It’s easy to believe that your Uber driver is in it for the flexibility, that your Airbnb host is more of a neighbor than the owner of a portfolio of properties.
It is easy to believe, but not always easy to identify and understand the dynamics of this belief. The experience of listening to an artificially intelligent narrator, along with the act of buying AI-narrated audiobooks, can help us catch our own brains in the act of self-deception, a deception we perform because marketing websites tell us it’s the democratic thing to do.