Trees 003 - Minimal seasons, a series by AiMuse made with gen AI
TL;DR: Text-to-speech AI is set to transform the thriving audiobook market extensively, but widespread adoption will be gradual. While synthetic voice quality is rapidly improving, several factors will likely temper the pace of change.
In recent years, digital formats in trade book publishing have followed divergent paths. E-book sales have either stagnated or shown modest growth in most countries. In contrast, the audiobook market has seen rapid expansion, reaching an estimated $8 billion globally (Statista), or potentially up to $20 billion according to other sources. Notably:
US: as shown by the chart below, audiobook publishers’ revenue has almost matched e-book sales, more than doubling in recent years, with a steady rise in audiobook listeners since 2015. 1
Audiobook VS e-book publishers’ revenue in the US - Statista
UK: the chart below shows that audiobook publishers’ revenue surged from £12 million in 2013 to £206 million in 2023, making up over 40% of digital sales in consumer book publishing. 2
Audiobook publishers’ revenue in the UK - Statista
Nordic countries: in Northern Europe, audiobook sales have far outpaced e-books. In Sweden, they have even surpassed print in terms of volume! 3
Continental Europe: Germany has a well-established audiobook market, while Italy, France, and Spain remain smaller but show strong growth. 4
Spotify’s recent entry into audiobooks, with a much wider customer base than platforms like Audible, Storytel, and BookBeat, is set to expand the market further by reaching listeners who may be new to audiobooks.
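To put the UK figures cited above in perspective, the jump from £12 million in 2013 to £206 million in 2023 can be converted into an implied average annual growth rate. The snippet below is a minimal sketch of that arithmetic. The function name is illustrative, and it assumes smooth compounding between the two endpoints, which real revenue rarely follows.

```python
def implied_annual_growth(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by two endpoint values."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# UK audiobook publishers' revenue in millions of pounds, as cited above.
uk_growth = implied_annual_growth(12, 206, 2023 - 2013)
print(f"Implied UK growth: {uk_growth:.1%} per year")
```

Under that assumption, the UK figures work out to roughly 33% growth per year over the decade.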
So, how does AI impact this thriving industry? Recent developments in AI voice synthesis have produced natural and expressive AI voices that increasingly mimic human intonation, pacing, and emotional nuance. Given that an audiobook can last over 20 hours, the quality and expressiveness of the voice must be excellent. While a 30-second commercial might work with a “good enough” synthetic voice, extended listening requires a much higher standard. Most publicly available text-to-speech software has not yet reached this level, but it is likely to get there within one to two years. Once AI can serve as a perfect narrator, the impact on the audiobook industry will be enormous. Let’s examine this impact across various aspects:
Cost reduction: AI narration can significantly lower production costs by reducing the need for human narrators, studio time, and extensive post-production editing. Currently, producing an audiobook costs thousands of euros, with long audiobooks featuring famous narrators costing upwards of €10,000. AI will cut these costs dramatically.
Speed and efficiency: AI can produce audiobooks far faster than human narrators, completing in minutes a production that would otherwise take several weeks.
Multilingual and localization capabilities: AI can easily produce audiobooks in multiple languages, facilitating entry into new markets.
Accessibility for independent publishers and self-published authors: indie publishers, who often lack the resources to produce audiobooks, could use AI to enter this market and make their titles available in audio on a larger scale. The same applies to self-published authors.
Customization: listeners may soon be able to adjust narration styles, accents, age, gender, and language to suit their preferences, much like customizing font and brightness in e-books.
Voices from the past: listeners will have the opportunity to enjoy audiobooks narrated by iconic actors from the past. In July 2024, ElevenLabs introduced “Iconic Voices,” a feature that lets articles, PDFs, ePubs, and newsletters be read in the voices of late legends like Judy Garland, James Dean, John Wayne, and Sir Laurence Olivier, thanks to rights acquired from their estates.
ElevenLabs listeners can now enjoy their favorite content narrated by iconic voices from the past, including Judy Garland, James Dean, John Wayne, and Sir Laurence Olivier.
Multiple voice options: some platforms already let users switch between human and various AI voices. For example, Storytel, a leading audiobook subscription platform, partnered with ElevenLabs to launch “Voice Switcher,” which lets listeners change narrators within an audiobook. Storytel introduced the feature in Poland at the end of 2023 and in Sweden in early 2024 (Storytel Press).
Expanded content availability: as noted, producing audiobooks is expensive, and large publishers may have tens of thousands of titles in their catalogs. Today it’s commercially unviable to convert entire catalogs to audio, particularly in countries with smaller audiobook markets. AI will allow publishers to create audio versions of all their long-tail titles, expanding the catalog and making niche books available in audio. Most likely, the growth of audiobook availability will further expand the market.
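The long-tail argument comes down to a break-even test: a title is worth converting to audio only if its expected audio revenue exceeds the cost of producing it. The sketch below illustrates that reasoning. Every figure in it is a hypothetical placeholder, not market data: the human production cost is merely consistent with the “thousands of euros” mentioned above, and the AI cost and per-title revenues are invented for illustration.

```python
def titles_worth_converting(expected_revenues: dict[str, float], production_cost: float) -> list[str]:
    """Return the titles whose expected audio revenue exceeds the production cost."""
    return [
        title
        for title, revenue in expected_revenues.items()
        if revenue > production_cost
    ]


# Hypothetical expected lifetime audio revenue per title, in euros.
catalog = {
    "bestseller": 40_000,
    "midlist": 4_000,
    "backlist_a": 900,
    "backlist_b": 300,
}

HUMAN_COST = 5_000  # assumption: in line with "thousands of euros" per audiobook
AI_COST = 200       # assumption: purely illustrative, not a quoted price

human_viable = titles_worth_converting(catalog, HUMAN_COST)
ai_viable = titles_worth_converting(catalog, AI_COST)

print("Viable with human narration:", human_viable)
print("Viable with AI narration:", ai_viable)
```

With these placeholder numbers, only the bestseller clears the bar for human narration, while all four titles do once the production cost drops. The same logic applies to a catalog of tens of thousands of titles.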
4. RISKS FOR PUBLISHERS AND ETHICAL CONSIDERATIONS
The role of publishers in an AI-driven audio world: major platforms like Audible, Apple, Google, and Spotify will develop (and strongly promote) proprietary AI for audiobook production. If publishers permit platforms to produce audio using their own AI, this could lead to:
Quality issues: publishers may lose control over the quality of their audiobooks, potentially resulting in varying quality across platforms.
Publisher disintermediation: if the publishers’ role in production diminishes, literary agents and authors might negotiate directly with platforms, bypassing traditional publishers.
Piracy: new apps, such as the one below, launched in the summer of 2024 by ElevenLabs (yes, them again), enable readers to convert ePub files, PDFs, and other content into audiobooks using high-quality voices. As a result, consumers might stop purchasing audiobooks or subscribing to services like Audible, potentially undermining this flourishing market. This could become a significant issue: publishers and authors will need to ensure these services block the conversion of copyrighted materials, an effort that will be anything but simple.
An app that enables users to upload content and create their own audiobooks could pose significant copyright challenges.
Most publishers, aiming to retain control of their product, will likely choose the AI software themselves so that they can produce and distribute consistent, high-quality AI-narrated audiobooks across all platforms. However, smaller publishers with less negotiating leverage and AI know-how may face greater risks in this scenario.
Job displacement: the rise of AI narration poses a risk to voice acting and narration jobs. Indeed, it’s very likely that AI will take over some jobs from humans, as pointed out by Wired.
Intellectual property rights: using AI voices trained on human narrators’ recordings raises legal questions about consent, compensation, and potential evolution in royalty models for AI-narrated content.
Both platforms and publishers are approaching this transformation cautiously for several reasons:
Product quality: while AI narration quality is very good, it is not yet perfect (though it most likely will be soon).
Intellectual property uncertainty: there’s considerable ambiguity surrounding IP rights for AI voices and other AI-generated assets.
Technological adaptation: established companies tend to move slowly with new technologies, and publishers need to build AI expertise within their audio departments.
Resistance from authors and agents: some authors and agents will oppose AI narration, potentially restricting its adoption.
Consumer sentiment: many listeners currently prefer human-narrated audiobooks, both out of concern about job displacement for narrators and because AI narration quality is still perceived as inferior. Such perceptions are slow to change, even as the technology improves.
For example, the main global player in the audiobook field, Audible, is experimenting cautiously. When it launched the beta for Narrator Voice Replicas in September 2024, Amazon emphasized a “commitment to thoughtfully balancing the interests of authors, narrators, publishers, and listeners” (The Verge).
September 2024 | Audible allows narrators to clone themselves with AI - The Verge
Audible US now features over 50,000 titles with virtual voices, though these audiobooks are not heavily promoted on the platform. To date, the majority of these titles are self-published works, often in the erotic or other niche categories. Below is the first page of over 50,000 titles on Audible.com sorted by popularity:
November 4, 2024 | Over 50 thousand audiobooks on Audible US narrated by virtual voices - sorted by popularity - Narrators
Publishers are equally careful. HarperCollins, the first major publisher to publicly work with AI for audio, partnered with ElevenLabs in April 2024. Their announcement highlights AI as a “complementary tool” for backlist titles in non-English markets while affirming continued investment in traditional narration (ElevenLabs Blog).
April 2024 | HarperCollins Partners with ElevenLabs on AI Audiobooks - ElevenLabs Blog
In summary, AI’s impact on the audiobook market will be significant but will not disrupt the market overnight. In the next few years, premium titles and literary fiction will likely continue to feature high-quality human narrators, while AI will narrate backlist and underserved language titles, expanding audiobook availability. With the expansion of the catalog, the market’s value could continue to grow significantly. However, significant new risks have emerged: as discussed, key copyright issues must be addressed to prevent substantial harm to the industry.
In the US, e-book sales remained steady between 2017 and 2022, while audiobook sales surged by 145%, according to Statista (Statista). By 2023, audiobook revenue almost equaled that of e-books, continuing its growth trend, as reported by the Association of American Publishers (AAP) (Publishers Weekly). Pew Research Center’s annual surveys show steady growth in audiobook listeners since 2015 (Pew Research Center). The value of the US audiobook market is estimated at just under $900 million by the AAP (AAP News), $2 billion by the Audio Publishers Association (Audio Publishers Association), and $3.4 billion by Statista (Statista). Further growth is expected, especially following Spotify’s recent entry into the market (The New York Times).
In the UK, audiobooks have moved from a niche to a substantial segment within publishing (The Guardian, Statista). The audio market was valued at £206 million in 2023 by the UK Publishers Association (Publishers Association, Report PDF). However, recent analyses suggest the UK audiobook market could be worth around £1 billion, as the Publishers Association’s figures only account for publishers’ revenue, excluding the considerable value retained by platforms, as well as audio not produced by traditional publishers or foreign-language audiobooks (The Bookseller).
In Sweden, audiobooks account for 30% of total book revenue and 60% in terms of copies, according to the Swedish Publishing Association (Swedish Publishing Association). Audiobook growth has also been significant in Norway (Statista), Denmark (Statista), and Finland (Statista).
In Germany, digital audiobook sales have risen significantly, with audiobook streaming up 191% and downloads up 65% from 2019 to 2023, while e-book sales have leveled off (Börsenverein, Presentation PDF). The overall growth rate for audiobooks in Germany is lower, at +39% over five years, when accounting for the decline of physical CDs, yet the market generates over €320 million in sales (Statista). Other markets such as Italy, France, and Spain remain smaller but have shown substantial growth in the past five years as major platforms like Audible have only recently entered, following a period of negligible audiobook adoption (Statista Italy, Statista France, Statista Spain). In these markets, Spotify’s entrance could greatly expand the market, potentially reaching millions of listeners new to audiobooks, and all projections indicate strong growth ahead (Frankfurt Book Fair White Paper).
🛠️ Best gen AI tools for videos /1
⚠️ Note: this list of tools by AI Muse is not an invitation to purchase nor a comprehensive list. Check the pricing and always read the terms and conditions.
Runway – Best for generative AI video creation: Creates videos from text prompts or still images. Suitable for experimental or creative projects.
Pricing: Free trial available; paid plans vary based on usage
Kling AI – Best for cinematic-quality videos: Text-to-video and image-to-video with lifelike motion and visuals.
Key Features: Custom character models, creativity sliders, free daily credits.
Pricing: Free plan (66 daily credits); advanced features require subscription.
Pictory – Best for repurposing long-form content: Converts blog posts, scripts, and webinars into shareable videos with captions, transitions, and stock footage.
Key Features: Automatic script-to-video conversion, customizable branding, and background music.
Pricing: Free trial; paid plans start at $19/month
InVideo – Best for faceless videos: Generates videos from text prompts with stock footage, voiceovers, and transitions. It supports 21 languages.
Key Features: Script-to-video workflows, customizable templates, and real-time editing via prompts.
Pricing: Free plan available; paid plans start at $15/month.
Veed.io – Best for social media content: Creates short-form videos with captions, emojis, and voiceovers. Ideal for platforms like TikTok and Instagram.
Key Features: Auto-transcription, color-changing captions, and stock media integration.
Pricing: Free for videos under 10 minutes; paid plans start at $18/month
HubSpot Clip Creator – Best for beginners on a budget: Simplifies video creation with templates and AI-generated scripts. Great for small businesses or marketers.
Key Features: Intuitive interface, free templates, and background music integration.