If you have ever tried to edit an audio file, you know how infuriating the process can get. It generally involves shifting and cutting tiny portions of the clip while making sure the transitions are seamless and the rest of the recording doesn’t come off as unnatural. What if, though, you could edit audio files as if they were Word documents? Sounds too good to be true? Well then, let me introduce you to Descript.
Descript is an automated audio transcriber that uses machine learning to let you edit clips as text. The app is powered by Google’s speech technology as well as a team of humans, and it can process your files in a few seconds or minutes depending on their size. Of course, this also means it uploads the recordings to a cloud server and does not function without an internet connection.
Alright, here’s how it works. You begin by dragging and dropping an audio file into a new project. Descript then takes a few moments to do what it does best and presents you with a text editor. Yes, a text editor with the transcription as its content and a timeline, which Descript calls the “WordBar”, at the bottom. WordBar is indeed an apt term, since it maps the waveform to the individual words and pauses, which you can mess around with however you like.
The text editor, though, is the highlight of Descript. You can simply delete words from the script and the change is reflected in the audio file immediately. It also features a handy resync option that realigns everything in case you went a little too far with the editing. Moreover, if it’s an interview, you can even tag different speakers to sentences and paragraphs, which is nice. Another nifty perk is the ability to collaborate on Descript. You can share the project with your peers, letting them listen to the edits and even leave comments.
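If you’re curious how deleting a word could translate into an audio edit, here’s a rough sketch of the general idea. To be clear, this is my own guess at the concept and not Descript’s actual implementation: I’m assuming every transcribed word carries a start and end timestamp, and the names Word and keep_ranges are made up for illustration. Deleting a word then simply means skipping its time range when the audio is stitched back together.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """A transcribed word with its position in the recording (hypothetical model)."""
    text: str
    start: float  # seconds from the beginning of the clip
    end: float    # seconds from the beginning of the clip


def keep_ranges(words, deleted_indices):
    """Return the (start, end) audio ranges to keep after deleting words.

    Consecutive kept words are merged into one range, so the natural
    pauses between them survive. A deleted word splits the range.
    """
    deleted = set(deleted_indices)
    ranges = []
    prev_kept = None  # index of the last word we kept
    for i, word in enumerate(words):
        if i in deleted:
            continue
        if prev_kept is not None and prev_kept == i - 1:
            # Neighbouring word was kept too: extend the current range.
            ranges[-1] = (ranges[-1][0], word.end)
        else:
            # First kept word, or a deletion happened in between: new range.
            ranges.append((word.start, word.end))
        prev_kept = i
    return ranges


transcript = [
    Word("so", 0.0, 0.3),
    Word("um", 0.4, 0.7),
    Word("hello", 0.8, 1.2),
    Word("there", 1.3, 1.6),
]

# Delete the filler word "um" (index 1) from the script.
print(keep_ranges(transcript, [1]))  # [(0.0, 0.3), (0.8, 1.6)]
```

An editor built this way would only need to play or export those ranges back to back, which would explain why the change can show up in the audio right away.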
Once done, Descript allows you to export the project in a bunch of formats including Microsoft Word, subtitles, and more. The transcription is usually quite accurate, but there were a couple of instances when it failed to recognize the speech. The developers also need to improve how it handles punctuation and long sentences, as the current version just doesn’t seem to be properly optimized for them.
Descript is trying to simplify one of the more complicated jobs for creators with the help of machine learning, and in a lot of ways, it seems to have succeeded. It’s truly remarkable how well the app performs in spite of background noise or music. Descript is currently available only for Mac, but the company says it’s bringing it to the web in the coming months as well. And of course, it’s not free. On signing up, you’ll get 30 complimentary minutes, and after that, you’ll have to shell out at least 15 cents per minute (or 8 cents per minute if you subscribe to a monthly plan), which I feel isn’t much if you’re a professional.
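To put those numbers in perspective, here’s a quick back-of-the-envelope calculator. It only uses the rates mentioned above; the function name is hypothetical, the monthly subscription fee itself is not included since I don’t have that figure, and I’m assuming the 30 free minutes are simply deducted before billing starts.

```python
FREE_MINUTES = 30          # complimentary minutes on sign-up
PAY_AS_YOU_GO_CENTS = 15   # per-minute rate without a plan
SUBSCRIBER_CENTS = 8       # per-minute rate on a monthly plan


def transcription_cost_cents(minutes, subscriber=False, free_minutes_left=FREE_MINUTES):
    """Estimate the per-minute charges in cents (hypothetical helper).

    Assumes free minutes are used up first and that the subscription
    fee itself is billed separately (it is not part of this estimate).
    """
    billable = max(0, minutes - free_minutes_left)
    rate = SUBSCRIBER_CENTS if subscriber else PAY_AS_YOU_GO_CENTS
    return billable * rate


# A 90-minute interview on a brand-new account: 60 billable minutes.
pay_as_you_go = transcription_cost_cents(90)
on_a_plan = transcription_cost_cents(90, subscriber=True)
print(f"${pay_as_you_go / 100:.2f}")  # $9.00
print(f"${on_a_plan / 100:.2f}")      # $4.80
```

Working in whole cents keeps the arithmetic exact and avoids the usual floating-point rounding surprises with money.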