Audio-ization of newspaper articles for ESH Médias

Creation of an innovative portal that lets you consume newspaper articles in a different way, by listening to them.

Introduction

At the end of 2022, ESH Médias, the publishing group behind several regional newspapers, approached us to build one of their ideas, which was rather innovative at the time: letting readers listen to articles as well as read them.

The idea itself is excellent, as it lets users on the move listen to their articles, much like a podcast or a radio programme.

That’s all it took for us to roll up our sleeves and get to work on one of the most fun projects I’ve worked on this year.

The design

The web application had to be extremely simple and clear, so that both younger and older people could understand and use the service in just a few clicks.

To achieve this, the entire public Web interface is grouped into 2 pages:

  1. Article selection
  2. Listening to the playlist

Article selection

Article selection takes the form of a stack of cards to be sorted. An audio summary of the current article plays automatically, and the user can swipe left or right (à la Tinder, for younger users) or click the sorting buttons (for older users) to add the article to the listening playlist or ignore it. This way, building the playlist takes only one action per article.

Listening to the playlist

Once all the day's articles have been sorted and added to the playlist, all that's left to do is listen. This second page takes the form of a Spotify- or iTunes-style player that plays the audio version of the day's newspaper articles. The interface has to be minimalist so that users can focus on what they're hearing. However, “minimalist” does not mean “empty”: the usual controls (play and pause, fast forward and rewind, next and previous track, playback speed) are all there to keep listening comfortable.

The audio transformation

The transformation of written articles into audio recordings is based on 2 distinct elements:

  1. A Symfony-based back-office
  2. Microsoft’s text-to-speech API

A Symfony-based back-office

We created a Symfony-based back-office, which retrieves all the articles written by the group’s various journalists, and allows administrators to choose which articles to audio-ize.

These administrators can then configure the different voices and define a phonetic replacement dictionary: words, names and expressions that the Microsoft service would mispronounce are replaced by a phonetic spelling closer to what is expected. For example, if the Microsoft service pronounces Roger Federer as “Rogé Fédéré”, we can force the correct pronunciation by replacing the text with “Rogeur Fédère” just before sending it to the text-to-speech service.
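To make the idea concrete, here is a minimal sketch of such a replacement step. It is written in Python for brevity, whereas the real back-office is built on Symfony; the function name `apply_phonetic_dictionary` and the in-memory `PHONETIC` dictionary are assumptions made for illustration, not the project's actual code.

```python
import re

def apply_phonetic_dictionary(text: str, replacements: dict[str, str]) -> str:
    """Replace every dictionary entry found in `text` with its phonetic spelling.

    Hypothetical helper: matching is case-sensitive and limited to whole
    words, so an entry for "Roger" leaves "Rogers" untouched.
    """
    if not replacements:
        return text
    # Longest entries first, so "Roger Federer" wins over a shorter "Roger".
    keys = sorted(replacements, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(key) for key in keys) + r")(?!\w)"
    )
    return pattern.sub(lambda match: replacements[match.group(1)], text)

# Illustrative dictionary, using the example from the article.
PHONETIC = {"Roger Federer": "Rogeur Fédère"}

print(apply_phonetic_dictionary("Roger Federer a gagné.", PHONETIC))
# prints: Rogeur Fédère a gagné.
```

Running the substitution on the plain text just before building the request keeps the stored article untouched: only the text sent to the speech service is altered.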

Articles can be relatively long, and it's a pain to listen to the whole thing over and over when you only want to adjust a pronunciation in the very last paragraph. So, in addition to playing the whole article, we added the ability to select just part of the text and listen to it, which makes it easy to focus on a particular sentence or paragraph.

Finally, when administrators have finished fine-tuning their article and are happy with the audio version, two final MP3 versions are generated and stored on the servers: the summary version, which is played automatically on the playlist creation screen, and the long version, to be listened to peacefully on the playlist listening screen.

Microsoft’s text-to-speech API

Microsoft’s text-to-speech API takes documents written in SSML (Speech Synthesis Markup Language), an XML-based markup language dedicated to turning text into audio. We therefore had to develop a service that takes an HTML article as input, cleans it up, reformats it, excludes certain parts, detects the different sections and generates an SSML document containing only the relevant elements, with the right voice associated with each section.
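The general shape of such a conversion can be sketched as follows. This is a simplified illustration in Python, not the production service (which lives in the Symfony back-office): the skipped tags, the section mapping and the voice names are all assumptions chosen for the example, and the real cleaning rules were considerably more involved.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# All three tables below are illustrative assumptions.
SKIPPED_TAGS = {"script", "style", "figure", "aside"}        # parts to exclude
SECTION_TAGS = {"h1": "title", "h2": "title", "p": "body"}   # section detection
VOICES = {"title": "fr-FR-HenriNeural", "body": "fr-FR-DeniseNeural"}

class ArticleExtractor(HTMLParser):
    """Collects (section kind, text) pairs and ignores excluded parts."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.sections = []
        self._skip_depth = 0   # > 0 while inside an excluded element
        self._kind = None      # kind of the section being read, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag in SECTION_TAGS and self._skip_depth == 0:
            self._kind = SECTION_TAGS[tag]
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in SECTION_TAGS and self._kind is not None:
            # Collapse runs of whitespace left over from the HTML layout.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.sections.append((self._kind, text))
            self._kind = None

    def handle_data(self, data):
        if self._kind is not None and self._skip_depth == 0:
            self._buffer.append(data)

def html_to_ssml(html: str, lang: str = "fr-FR") -> str:
    """Build one SSML document with one <voice> element per detected section."""
    parser = ArticleExtractor()
    parser.feed(html)
    parser.close()
    parts = [
        f"<voice name={quoteattr(VOICES[kind])}><p>{escape(text)}</p></voice>"
        for kind, text in parser.sections
    ]
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>" + "".join(parts) + "</speak>"
    )

article = "<h1>Titre</h1><figure><p>Légende</p></figure><p>Tom &amp; Jerry</p>"
print(html_to_ssml(article))
```

In this sketch the figure caption is dropped, the title and the body get different voices, and the text is escaped again on output so that the resulting SSML stays well-formed XML.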

This was the most complicated part of the project: the articles are not written to be listened to, so we had to iterate many times throughout the project to refine the cleaning and splitting rules step by step.

Nevertheless, it was clearly worth the effort!

Conclusion

This project was fun and interesting all the way through, and I’m proud of the result, which works perfectly!

You can find this article-to-audio service on one of ESH Médias’ newspapers: Le Nouvelliste.