Transcribe Videos in Microsoft Word for the Web

I don’t know about you, but this is one of those features I have been waiting on for a long, long time. I’ve sampled some third-party tools but was never sold on their capabilities. If you are increasingly capturing interviews and meetings online and then painstakingly transcribing part or all of those recordings into print form, you can now generate transcripts directly in Microsoft Word for the web.

Announced on the Microsoft blog in August, the Transcribe feature not only converts your audio or video recordings into usable text, but it also detects different speakers so that you can easily follow the flow of the transcript. After you convert your video, you can easily revisit parts of the recording by playing back the time-stamped audio, and edit the transcript if you see something amiss.

I shared this tip in the October 2020 Microsoft 365 Productivity Tips webinar. You can find the webinar on the blog or on the CollabTalk YouTube channel, or jump straight to this specific tip in the video.

Creating a transcription

There’s not a lot of detail to share in this tip, but it’s worth experimenting with and getting to know the features. Start by opening Word for the web and logging in, then open a new document, select Dictate > Transcribe, and select your audio or video file to begin processing.

Transcribing video in Microsoft Word for the Web

Instead of uploading a recording, you can also record yourself speaking, or capture a small group meeting or conversation, creating a new transcription in real time.

Processing time depends on the length of your recording. Just to set expectations: I had some trouble uploading videos that were just over an hour in length with two speakers. While I was eventually able to generate transcriptions, they came only after a number of failed attempts that returned FAIL messages without much detail, forcing me to restart the lengthy process a few times. For shorter videos (20 to 30 minutes), I did not experience any issues, but the processing can be slow.

Once processed, the transcription appears to the right of your Word document, broken up by speaker and by natural breaks in speech patterns. If you are interviewing someone, you can select and rename Speaker 1 and Speaker 2, and the change will be reflected throughout the entire transcript, making it easier to differentiate text blocks.

Transcription is not a perfect science, for sure. There will be a lot of spelling and grammar mistakes, reflecting the AI’s inability to decipher some of our speech patterns, slang terms, acronyms and business jargon, not to mention the cross-chatter if there are two or more speakers. But as you go through the transcript, you can make edits and then insert the “scrubbed” text, or listen to that portion of the transcript to clarify what was said.

The other thing I love about this tool is that you can search for specific quotes and easily add them to your Word document one at a time — or add the entire transcript in one click!

Love love love this feature, even though there is room for improvement. Check it out today!

Christian Buckley

Christian is a Microsoft Regional Director and M365 Apps & Services MVP, and an award-winning product marketer and technology evangelist, based in Silicon Slopes (Lehi), Utah. He is the Director of North American Partner Management for leading ISV Rencore, leads content strategy for TekkiGurus, and is an advisor for both revealit.TV and WellnessWits. He hosts the monthly #CollabTalk TweetJam, the weekly #CollabTalk Podcast, and the Microsoft 365 Ask-Me-Anything (#M365AMA) series.