Automatic Transcription with DaVinci Resolve’s Speech to Text Function and Text-based Video Editing

Examining this useful new feature for video post-production workflows and comparing it with that of Adobe Premiere Pro

Author: Filip Milovanovic, Post-production expert, ELEMENTS
Category: Workflow

With DaVinci Resolve version 18.5, Blackmagic Design has followed in Adobe’s footsteps and added automated transcription of video and audio files. This post examines Resolve’s new transcription feature and compares its capabilities to those of Adobe Premiere Pro.

How It Works

To create a transcription, navigate to the Edit page, select the video or audio files and click the new Transcribe Audio button. The transcription starts automatically. Once the process completes, the results are displayed in the new Transcription window.

Select any portion of the transcript and open the editing tools to edit, delete, undelete or copy the selected text. Deleted parts are displayed with a strikethrough. The Transcription window also offers a search function that can quickly replace search results, and it can export the transcription to a text file. Options to change the font size and background color are available as well.

Speed

In our transcription test of a 10-minute audio clip on an M1 Max Mac Studio, Resolve transcribed at about eight times real-time speed, finishing in 1 minute and 15 seconds. Premiere Pro was faster and transcribed the sequence in 55 seconds.

However, when we repeated the test on an i7 Intel MacBook Pro, Resolve’s transcription speed dropped to about four times real-time, and the job took 2 minutes and 39 seconds. Premiere Pro, on the other hand, transcribed the sequence in about the same time as before: 1 minute and 2 seconds.
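The real-time factors behind these timings can be checked with a few lines of Python. This is a minimal sketch that assumes the test clip is exactly 10 minutes long and takes the first Resolve result as 1 minute and 15 seconds; the function name `realtime_factor` is ours, chosen for illustration.

```python
def realtime_factor(clip_seconds: float, processing_seconds: float) -> float:
    """Return how many seconds of media are processed per second of wall-clock time."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return clip_seconds / processing_seconds


CLIP_SECONDS = 10 * 60  # assumption: the clip is exactly 10 minutes long

# Timings from the tests described above, converted to seconds.
timings = {
    ("Resolve", "M1 Max Mac Studio"): 1 * 60 + 15,
    ("Premiere Pro", "M1 Max Mac Studio"): 55,
    ("Resolve", "i7 MacBook Pro"): 2 * 60 + 39,
    ("Premiere Pro", "i7 MacBook Pro"): 1 * 60 + 2,
}

factors = {key: realtime_factor(CLIP_SECONDS, secs) for key, secs in timings.items()}

for (app, machine), factor in factors.items():
    print(f"{app} on {machine}: {factor:.1f}x real time")
```

The same function can be used to estimate how long a batch of longer clips would take on a given machine by dividing the total clip duration by the measured factor.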

This speed difference may not matter much if you only transcribe a clip occasionally. When transcribing several longer clips, however, it can add up to a significant amount of time.

Text-based Editing

Building on automatic transcription, both NLEs have implemented a text-based editing workflow. Here, the user constructs the story from the transcript, keeping only the required parts of the content, with no need for endless timeline scrubbing and playback to find what’s needed.

When a text range is selected in the Transcription window, the corresponding range is marked in the source monitor. The selected video portion can then be added to the timeline as usual or by using the Insert or Append buttons in the lower-right corner of the Transcription window. A subclip of the selected range can be created quickly using the Create Sub Clip button in the lower-left corner of the Transcription window.
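Conceptually, text-based editing relies on every transcribed word carrying its own timestamps, so a text selection can be translated into in and out points. The sketch below illustrates that idea only; it is not Resolve’s or Premiere Pro’s actual implementation, and the `Word` type, the `selection_to_range` function and the sample transcript are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the clip
    end: float    # seconds from the start of the clip


def selection_to_range(words, first, last):
    """Map an inclusive word selection to (in_point, out_point) in seconds."""
    if not 0 <= first <= last < len(words):
        raise IndexError("selection is outside the transcript")
    return words[first].start, words[last].end


# Hypothetical transcript with made-up timings.
transcript = [
    Word("We", 0.0, 0.2),
    Word("shot", 0.2, 0.6),
    Word("this", 0.6, 0.8),
    Word("in", 0.8, 0.9),
    Word("Munich", 0.9, 1.5),
]

# Selecting "shot this" marks the matching range in the source clip.
in_point, out_point = selection_to_range(transcript, 1, 2)
print(f"in={in_point}s out={out_point}s")
```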

Automatic Subtitles Creation

This update also adds an automatic subtitle feature to Resolve. Accessible from the Cut and Edit pages, it transcribes the audio of the selected timeline into a subtitle track. To use the feature, open your timeline and click “Create Subtitles from Audio” in the Timeline menu. After you choose the language and how the subtitles should be formatted, the captions are created and placed in the timeline on a new subtitle track. Captions can be modified in the Inspector window.

Currently, 15 languages are supported, including Danish, Dutch, English, French, German, Italian, Japanese, Mandarin (Traditional), Mandarin (Simplified), Norwegian, Portuguese, Russian, Spanish, and Swedish.

One minor drawback of subtitle creation in DaVinci Resolve compared to Premiere Pro is that the transcription is redone even if the clips in the timeline have already been transcribed. In contrast, Premiere Pro created subtitles from an existing transcript in around 10 seconds for our 10-minute video.

Other Differences

Another difference is how the two NLEs display the transcript. While Premiere Pro splits the transcript into parts of approximately 20 seconds, Resolve splits it only when a longer pause is detected, which can result in long blocks of text that are a little harder to read.
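The two splitting behaviors can be illustrated with a short Python sketch. Neither vendor documents its segmentation logic, so this is only an approximation of what we observed: the function names, the 20-second limit, the 1.5-second pause threshold and the synthetic word timings are all assumptions.

```python
def split_by_duration(words, max_seconds=20.0):
    """Start a new segment once the current one would run longer than max_seconds."""
    segments, current = [], []
    for word in words:
        # word is (text, start, end); compare its end with the segment's start.
        if current and word[2] - current[0][1] > max_seconds:
            segments.append(current)
            current = []
        current.append(word)
    if current:
        segments.append(current)
    return segments


def split_on_pause(words, min_pause=1.5):
    """Start a new segment only when the silence before a word is at least min_pause."""
    segments, current = [], []
    for word in words:
        if current and word[1] - current[-1][2] >= min_pause:
            segments.append(current)
            current = []
        current.append(word)
    if current:
        segments.append(current)
    return segments


# Synthetic data: 12 "words" of 4 seconds each, with one 3-second pause after the ninth.
words, t = [], 0.0
for i in range(12):
    words.append((f"w{i}", t, t + 4.0))
    t += 4.0
    if i == 8:
        t += 3.0

by_duration = split_by_duration(words)
by_pause = split_on_pause(words)
print([len(s) for s in by_duration])
print([len(s) for s in by_pause])
```

With continuous speech and few pauses, the pause-based approach yields fewer but longer segments, which matches the denser transcript layout described above.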

Conclusion

The automatic transcription feature is one of the most exciting additions to the functionality of the modern NLE. We love the quick pace of DaVinci Resolve’s development and Blackmagic Design’s determination not to let other NLEs gain the upper hand. The feature is well implemented, and the accuracy seems to be on par with that of Premiere Pro. Given that Premiere Pro’s transcription feature improved after its initial release, we will likely see the same happen in upcoming Resolve versions.
