PRODUCTHEAD: How to transcribe and analyse your user research recordings
PRODUCTHEAD is a regular newsletter of product management goodness,
curated by Jock Busuttil.
any product can play guitar #
Thematic analysis identifies the main themes emerging from qualitative data, such as interview transcripts
It can be a great activity to do with your team for establishing empathy with the users and their context
Good research means being intentional, conscientious, and ethical at every step
In last week’s edition of PRODUCTHEAD, I shared some tips for how to record your user research interviews.
This week we’re looking at how to transcribe your user research interviews and generate highlight reels, and how to then draw out the themes emerging from your research data.
I make no claims of expertise in user research, yet I and many of the product people I’ve worked with have found ourselves needing to do user research in the absence of a dedicated expert.
Update: I received some valuable feedback from user researchers in response to the original article. I had omitted to write about the importance of non-verbal cues in the analysis of the interview recordings, particularly when video is also available. Accordingly I’ve made a couple of updates to improve the advice.
I’ll be mentioning a few different products that you may find helpful. I’m not affiliated with any of these vendors, and I’m not receiving any payment for recommending their products to you.
Transcribing a recording #
In the past, if I was feeling cheap and wanted to practise my touch-typing, I would pore for hours over a recording and painstakingly type the transcript myself. That got old very quickly.
So instead I would send my recordings off to a human-powered transcription service. (I know, how quaint!) While the cost back then was perfectly acceptable versus the time it saved me, this has become a relatively expensive option in comparison with the AI (artificial intelligence) transcription services now proliferating.
Extracting audio from a video file #
While most transcription services will happily cope with a video file, occasionally you may first need to extract the audio from the video recording as a separate file. You may also wish to do this to save on upload time, as compressed audio files such as MP3 or AAC (Advanced Audio Coding, the intended successor to the MP3 format) will be much smaller than the original video.
This is pretty easy to do, and the audio track by itself will also be easier to manage and store than the audio and video combined.
To extract the audio from a video file, you can load most video files straight into Audacity (a free audio editor for Windows PCs and Macs), then export the audio as an MP3.
Another method for the more technical among you is to use a command-line tool called ffmpeg, which is especially quick. MP4 video files often have the audio tracks embedded within them in AAC format, so you can usually just pull this out as a separate audio file.
ffmpeg -i inputvideo.mp4 -vn -acodec copy outputaudio.aac
# ffmpeg: The command
# -i inputvideo.mp4: The source video file
# -vn: Do not record (do not consider) video data
# -acodec copy: Copy the audio source as-is
# outputaudio.aac: The output filename
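If you have a whole folder of recordings to get through, the same ffmpeg step could be wrapped in a few lines of Python. This is only a minimal sketch: the function names are made up for illustration, and the MP3 option assumes your ffmpeg build includes the libmp3lame encoder, which is worth checking before you rely on it.

```python
import subprocess


def build_extract_command(video_path, audio_path, copy_stream=True):
    """Build the ffmpeg argument list for pulling the audio out of a video.

    copy_stream=True copies the existing audio track as-is (-acodec copy).
    copy_stream=False re-encodes to MP3 instead (assumes libmp3lame is available).
    """
    codec = "copy" if copy_stream else "libmp3lame"
    return ["ffmpeg", "-i", video_path, "-vn", "-acodec", codec, audio_path]


def extract_audio(video_path, audio_path, copy_stream=True):
    """Run ffmpeg and raise CalledProcessError if it reports a failure."""
    command = build_extract_command(video_path, audio_path, copy_stream)
    subprocess.run(command, check=True)
```

Calling extract_audio("inputvideo.mp4", "outputaudio.aac") runs the same command as above. Passing copy_stream=False with an output name ending in .mp3 is one way to cope with videos whose audio track is not already AAC.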
Doing transcription on the cheap #
Last time I suggested using your smartphone as your recording device, either on its own or with a dedicated microphone. On more recent smartphones, you can take advantage of automatic transcription right there on the phone itself. On Android, Google’s own Recorder app is free and does a passable job of transcribing the recording. It can also do the processing offline on the phone itself, so you don’t even need a good signal or wifi while you’re recording.
Although I don’t have an iPhone to test it out, Smart Voice Recorder – Offline looks like a good clone of the Google app. The free version is limited to 4-minute recordings, while the in-app purchase unlocks longer recordings and lets you import files for transcription.
There are many other hacks for generating free transcriptions, and they generally need a bit of technical tomfoolery to get working. If you’re happy to roll up your sleeves and do things manually, then these may work for you. If not, skip ahead to the more convenient (and spendy) options below.
Google Docs: You can play the recording you want to transcribe into Google Docs’s voice typing feature, using VB-Audio’s virtual cable to connect the playback into the microphone. This works, but the transcription is often poor quality and tends to stop after a while, and you have to keep the focus on the document or it stops transcribing altogether. I have only ever used this in a pinch, and there are much better quality alternatives these days.
YouTube Studio: If you have a video file, or can turn your audio file into a basic video file, you can upload it to a free account on YouTube Studio and publish it privately. If you wait a couple of days, YouTube will auto-generate captions for the video. You can then download the transcript.
I’ve only used this with English language videos, but this does support many languages. The transcription quality is quite passable (a fair bit better than Google Docs). However it’s still a bit of an effort, and you have to wait a couple of days for the transcript to auto-generate.
Native voice dictation: Both Windows PCs (10 and later versions) and Macs have built-in voice dictation. Again using VB-Audio’s virtual cable (or similar) to connect audio output to microphone input, you can play your recording and sit back while it transcribes into the writing app of your choice. With the exception of more recent Apple hardware (M1 and later processors), these dictation features will need internet access as they use cloud services to transcribe.
Whisper: If you’re comfortable with a very slightly more techie approach, OpenAI (of ChatGPT fame) has a free and open source audio transcription model called Whisper. I’ve run through a couple of average quality recordings with fair amounts of background noise, and I’m really pleased with the results.
It generates transcripts with punctuation and actual sentences (unlike some of the other options), and will generate timestamped transcripts if you need them. There’s the odd bit of transcription confusion, but for the most part I reckon it will save you a load of time. Because I’m a cheapskate, this has become my default method for transcription.
To get up and running in no more than 15 minutes, I followed a YouTube tutorial by Kevin Stratvert, for which you only need a Google Drive account with some free storage space. Again, if you’re not particularly technical, the setup will look a bit intimidating, but Kevin walks you through the process.
As you have to upload the file for transcription, extracting the audio first and uploading that rather than the video file (= much smaller file size) can save you a lot of waiting around. The transcription itself (for English, on the ‘medium’ model) runs at about 3x speed, so about 10 minutes to transcribe a 30 minute audio file.
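For the curious, here is roughly what a timestamped Whisper transcript could look like if you ran the model yourself rather than through the tutorial route. Treat it as a sketch under assumptions: it assumes the open source openai-whisper Python package and ffmpeg are installed on your machine, and the helper names are invented for illustration.

```python
def format_timestamp(seconds):
    """Turn a number of seconds into HH:MM:SS (fractions are dropped)."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def segments_to_lines(segments):
    """Format Whisper-style segments (dicts with start, end, text) as lines."""
    lines = []
    for seg in segments:
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"[{start} - {end}] {seg['text'].strip()}")
    return lines


def transcribe(audio_path, model_name="medium"):
    """Transcribe one audio file and return timestamped lines."""
    import whisper  # imported here so the helpers above work without it

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    return segments_to_lines(result["segments"])
```

Something like print("\n".join(transcribe("outputaudio.aac"))) would then print one line per segment. The first run downloads the model, so expect a wait before anything happens.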
More spendy options for transcription #
If you want to step up your game a little, or you just want a simpler setup, you could consider buying a subscription to an AI-powered transcription service. You can feed it your user interview recording, then it will transcribe it, usually recognising different speakers.
Otter and Sembly let you invite their transcription services to your Zoom, Google Meet or Microsoft Teams video call. They listen in and handle the recording and transcription for you. This is overall a much simpler option than having to record and transcribe video calls for yourself.
Sembly has also partnered up with Philips on a Bluetooth meeting microphone that sends the audio of a meeting for transcription via your smartphone or laptop. It uses an array of four microphones pointing in different directions, so I would imagine it would be better suited to recording office meetings with minimal background noise, rather than interviews in noisy coffee shops or similar. If however you’re recording both in-person and online interviews, this may be a good all-in-one solution.
Analysing the transcripts #
Update: Once you’ve done the hard work of running your user interviews, you’re going to need to work through the recordings to draw out any recurring themes and other useful insights. Whether you have video or just audio recordings, remember to pay attention to the non-verbal communication, such as facial expressions, pauses, and other cues.
Whenever you’re going through your transcripts, have the corresponding audio or video playing alongside to help with this. This will also help you pick up any weirdness in the transcriptions — any AI-driven transcription service can and will make transcription errors or simply skip out passages. Some of the tools I mention later on will help you with this by synchronising your recording with the corresponding transcription.
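If your transcript comes with timestamps, one rough way to find places worth re-listening to is to look for long stretches where nothing was transcribed. The sketch below assumes each segment is a dictionary with start and end times in seconds (the shape Whisper happens to produce). The function name and the 10-second threshold are arbitrary choices for illustration.

```python
def find_gaps(segments, min_gap_seconds=10.0):
    """Return (gap_start, gap_end) pairs where no text was transcribed.

    segments: dicts with 'start' and 'end' times in seconds, in time order.
    """
    gaps = []
    for previous, current in zip(segments, segments[1:]):
        if current["start"] - previous["end"] >= min_gap_seconds:
            gaps.append((previous["end"], current["start"]))
    return gaps
```

A gap is not proof of an error. It may simply be a pause, which is itself one of the non-verbal cues worth noting, so treat each one as a prompt to go back to the recording.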
Drawing out insights on the cheap #
There’s no two ways about this: it will involve you, and hopefully others on your delivery team, going through the transcribed recordings line by line and tagging the themes you encounter.
While there are collaborative tools to help do this, sometimes it’s just easier and more worthwhile to do this manually. Nielsen Norman Group has a neat video demonstrating this manual method, and a thorough article on thematic analysis for you to follow up with.
There are typically two stages to thematic analysis:
1. Going through your transcribed recordings to find and code text fragments that relate to your research goals; and then
2. Grouping those fragments into thematic groupings. This second stage is often called affinity mapping or diagramming.
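To make the two stages a little more concrete, here is a minimal sketch of how coded fragments might be recorded and then grouped. Everything in it is invented for illustration: the quotes, the codes, the themes and the function names. The actual coding and grouping decisions still have to be made by people reading the transcripts.

```python
from collections import defaultdict

# Stage 1: fragments coded by hand (invented example data).
coded_fragments = [
    {"participant": "P1", "text": "I never know where my order is.", "code": "order tracking"},
    {"participant": "P2", "text": "The delivery email came a day late.", "code": "late notifications"},
    {"participant": "P2", "text": "I gave up at the payment page.", "code": "checkout friction"},
    {"participant": "P3", "text": "Tracking just says 'in transit'.", "code": "order tracking"},
]

# Stage 2: the team's agreed mapping from codes to broader themes (also invented).
code_to_theme = {
    "order tracking": "Visibility after purchase",
    "late notifications": "Visibility after purchase",
    "checkout friction": "Barriers to buying",
}


def group_by_theme(fragments, mapping, fallback="unsorted"):
    """Group coded fragments under their theme; unmapped codes go to fallback."""
    themes = defaultdict(list)
    for fragment in fragments:
        themes[mapping.get(fragment["code"], fallback)].append(fragment)
    return dict(themes)


def participants_per_theme(themes):
    """Count how many distinct participants contributed to each theme."""
    return {
        theme: len({fragment["participant"] for fragment in fragments})
        for theme, fragments in themes.items()
    }
```

Counting distinct participants rather than fragments is a deliberate choice here, so that one talkative interviewee does not make a theme look more widespread than it is.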
Doing these activities with your delivery team and perhaps even your stakeholders, rather than by yourself, has a few additional benefits.
In practical terms, having lots of people helping with the coding and affinity mapping will help to get through the process perhaps more quickly, but certainly more enjoyably.
Because everyone is going to be reading the research transcripts, it also gives you a way to familiarise your team and stakeholders with your raw research data in a more immediate way than if you simply relayed the findings after doing the analysis solo.
And don’t forget the benefit of having other people’s perspectives on the research data: they may well spot things that you would not on your own.
While these activities are traditionally done in person with lots of sticky notes and a vast wall space, it is also possible to run these activities remotely. The UX Design Institute has a short video that describes how you can run an affinity diagram workshop remotely using Miro (or another collaborative whiteboard tool of your choice).
Regardless of whether you do thematic analysis manually or with the help of a tool, you’ll benefit from understanding what you’re trying to achieve and why.
More spendy options for drawing out insights #
In addition to transcribing, both Otter and Sembly will use AI to generate summaries of the main points discussed in the recorded session. Again, I’ve not personally used this feature in either product, and so I can’t tell you how effective this is in practice. As they’re optimised for recording and transcribing more general meetings, my assumption would be that the summaries they both generate will probably not be geared towards thematic analysis. The full transcripts remain helpful, though.
Descript is more of a general video and audio editing tool for podcasters, but does let you create highlight reels based on fragments of your transcript.
Like many of the other options, once you’ve recorded your video call in Descript, or fed it your audio or video file, it will generate a transcription which is linked back to the media, identifying different speakers. It then lets you edit the audio or video using just the text. The transcription engine is usually fine, though it occasionally gives up and misses out chunks of text if it’s struggling with accents or a particularly noisy recording.
In addition, Descript has a highlighting tool that lets you select bits of the text in different colours. You can use different colours for different themes or topics, and then Descript lets you pull out all the clips (video, audio and text) for each colour as a highlight reel.
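If you would rather stay on the cheap side, the same idea of pulling out clips per theme can be roughed out with ffmpeg and a timestamped transcript. This is a sketch, not a description of how Descript works: the highlight data, file naming and function names are all hypothetical. It only builds the ffmpeg commands, so you can inspect them before running anything.

```python
def build_clip_command(source_path, start, end, clip_path):
    """Build an ffmpeg command that copies the span from start to end (seconds)."""
    duration = end - start
    return [
        "ffmpeg",
        "-ss", str(start),     # seek to the start of the highlight
        "-i", source_path,
        "-t", str(duration),   # keep this many seconds
        "-c", "copy",          # copy streams without re-encoding
        clip_path,
    ]


def plan_highlight_clips(source_path, highlights):
    """Return one (theme, command) pair per highlighted fragment."""
    plan = []
    for index, item in enumerate(highlights, start=1):
        safe_theme = item["theme"].replace(" ", "_")
        clip_path = f"{safe_theme}_{index:02d}.mp4"
        command = build_clip_command(source_path, item["start"], item["end"], clip_path)
        plan.append((item["theme"], command))
    return plan
```

Because the streams are copied rather than re-encoded, cuts land on the nearest keyframes, so the clip edges may be a little off. That is usually fine for a rough reel to share with your team.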
Dedicated user research tools such as Dovetail, Grain and Condens each provide similar capabilities to Descript by allowing you to select parts of the transcript they generate from your video or audio for a highlight reel. Unlike Descript, however, they are set up primarily for analysing, sharing and storing user research findings, which you may find gives them an edge over more general transcription tools.
Getting your recordings into your chosen user research tool varies a bit. Here’s the tl;dr (correct as at the time of writing):
Grain can be invited to record Zoom meetings as a virtual participant.
Dovetail, Grain and Condens can pull in your recordings from some or all of Zoom, Google Meet and Microsoft Teams.
Lookback effectively takes the place of Zoom, Meet or Teams by providing its own video call functionality, from which Dovetail can then import the recordings.
All the options provide the ability to import recordings for transcription.
Final thoughts #
Thanks to the various free and open source tools available, you no longer need to spend days and weeks getting your user research interviews ready for analysis, even on a tight budget. If time and budget were obstacles to gathering valuable user research in your organisation, now there’s no excuse :-)
With more dedicated user research tools, what you gain in efficiency for gathering, analysing and sharing your qualitative research findings, you may lose in usefulness in more general use cases.
If you are on a modest budget, you may not necessarily be able to fire up subscriptions with every single provider. You may benefit more from a more general tool which you can use in more situations, rather than lots of specialised tools.
As always, try to figure out whatever activity you’re doing more of and invest in a tool that will help you primarily with that.
Speak to you soon,
what to think about this week
Uncovering themes in qualitative data can be daunting and difficult. Summarizing a quantitative study is relatively clear: you scored 25% better than the competition, let’s say. But how do you summarize a collection of qualitative observations?
[Maria Rosala / Nielsen Norman Group]
Recent discussions have been swirling around the phrase “democratization of research” concerning who should participate in what kind of research in design (product/service/technology/non-profit etc.) organizations.
There has been a lot of yelling about gatekeeping and handwringing about the potential for low-quality research. The discussion gets shouty when you don’t stop to define your terms and clarify what type of research, to what standard, and for what purpose. Without that, everything devolves into a territory battle.
[Erika Hall / Mule Design]
In UX workshops, it can be challenging to engage the team, and create order among diverse ideas and facts.
One method that helps teams collaboratively analyze research findings as well as ideas from ideation sessions is affinity diagramming. Often used in UX, affinity diagramming is adapted from the KJ diagramming method (named after author Kawakita Jiro).
[Kara Pernice / Nielsen Norman Group]
Affinity diagramming is a core skill of a good UX designer. This guide will take you through the key steps.
[UX Design Institute]
Years ago, someone once told me that “perception is reality” when it comes to reputation at work. Of all the lessons I’ve learned in my career, this has been by far one of the hardest.
[I Manage Products]
You talk about doing user research directly with users – does it matter that the Operations and Process tracks are telling me what their users want instead?
[I Manage Products]
Product managers of software and hardware platforms face unique challenges that PMs of ‘regular’ products do not.
In this panel discussion, Hans-Bernd Kittlaus discusses platform product management with Samira Negm, Peter Stadlinger and Jock Busuttil.
[I Manage Products]
can we help you?
Product People is a product management services company. We can help you through consultancy, training and coaching. Just contact us if you need our help!
Helping people build better products, more successfully, since 2012.
PRODUCTHEAD is a newsletter for product people of all varieties, and is lovingly crafted from [muffled voices] … uhhhh, could you speak up? [more muffled noises].