Easily Integrate Live Captioning into Your App — Ant Media Server

Ant Media Server
Sep 20, 2022


Live captioning automatically captions speech on your device. You can use it on media such as live streams, videos, podcasts, phone calls, video calls, and audio messages. In some usage scenarios, live captioning is even mandatory.

Live captioning is an accessibility feature for the more than 466 million people around the world who are deaf or hard of hearing. It is also useful in everyday situations: it lets you follow a video or podcast while on a noisy journey, for example, or while trying not to wake a baby.

This article will help you understand how to implement live captioning for a live video stream using the Ant Media WebRTC server and Amazon Transcribe. You can download the demo project source code from here.

Prerequisite live captioning setup

1. Create AWS Key and Secret

Please follow the instructions provided here to generate the AWS key and secret.

2. Create AWS Transcribe Key

Please follow the instructions provided here to generate the AWS Transcribe key.

3. Set up AWS Transcribe with WebSocket

Set up the Transcribe streaming permission in your AWS account using the sample policy below. To learn more about configuring WebSocket streaming with the AWS Transcribe service, please follow this AWS documentation.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "transcribestreaming",
            "Effect": "Allow",
            "Action": "transcribe:StartStreamTranscriptionWebSocket",
            "Resource": "*"
        }
    ]
}

Clone the sample project, setup, and run

1. Clone the project from the Git repository:



2. Install dependencies in the root folder

Run the command below to install dependencies in the root folder:

npm i

3. Install the AWS SDK for JavaScript in the root folder

npm install aws-sdk

4. Add AWS credentials

Navigate to the file src\main\webapp\lib\main.js and add your AWS Transcribe key and secret in the createPresignedUrl function, as shown below:

function createPresignedUrl() {
    let endpoint = "transcribestreaming." + region + ".amazonaws.com:8443";

    // get a preauthenticated URL that we can use to establish our WebSocket
    return v4.createPresignedURL(
        'GET',
        endpoint,
        '/stream-transcription-websocket',
        'transcribe',
        crypto.createHash('sha256').update('', 'utf8').digest('hex'),
        {
            'key': 'AWS Access KeyId',
            'secret': 'AWS Secret Key',
            'protocol': 'wss',
            'expires': 15,
            'region': region,
            'query': "language-code=" + languageCode + "&media-encoding=pcm&sample-rate=" + sampleRate + "&show-speaker-label=true"
        }
    );
}
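The query option in createPresignedUrl is assembled by string concatenation. As a minimal sketch, assuming a runtime that provides URLSearchParams (modern browsers and Node.js), the same four parameters could be built from an object so that values are URL-encoded consistently. The helper name buildTranscribeQuery is hypothetical and is not part of the demo project.

```javascript
// Hypothetical helper (not part of the demo project): builds the string that
// createPresignedUrl passes in its 'query' option.
function buildTranscribeQuery(languageCode, sampleRate) {
  const params = new URLSearchParams({
    'language-code': languageCode,
    'media-encoding': 'pcm',
    'sample-rate': String(sampleRate),
    'show-speaker-label': 'true',
  });
  // Parameters are serialized in insertion order, joined with '&'.
  return params.toString();
}

console.log(buildTranscribeQuery('en-US', 44100));
```

The result could then be passed as the 'query' value in place of the concatenated string.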

5. Add the WebSocket URL

Navigate to the following files and replace the value of the websocketURL variable with your WebSocket URL, e.g. wss://server.com:5443/WebRTCAppEE/websocket

src\main\webapp\index.html
src\main\webapp\play.html

var websocketURL = "web socket URL";
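The example URL above follows the pattern wss://host:port/application/websocket. As a small sketch, assuming your server follows the same pattern, the value could be composed from its parts rather than typed by hand. The helper name buildWebSocketUrl is hypothetical, and the host, port, and application name are the example values from this article.

```javascript
// Hypothetical helper: composes a WebSocket URL in the form used in this article,
// wss://<host>:<port>/<application>/websocket
function buildWebSocketUrl(host, port, appName) {
  return `wss://${host}:${port}/${encodeURIComponent(appName)}/websocket`;
}

// Example values taken from the article; replace them with your own server details.
var websocketURL = buildWebSocketUrl('server.com', 5443, 'WebRTCAppEE');
console.log(websocketURL);
```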

6. Generate the transcribe build and run it

npm run-script build

Example live captioning screenshots

Take a look at some of the live captions captured:


An embedded player with the live caption

main.js file description

This file is located at src\main\webapp\lib\main.js.

Its functionality is described below:

1. Start / Stop Publishing and Captioning

Start Publishing Button

When you click the Start Publishing button on the publishing demo page, it starts streaming and also starts speech-to-text conversion with AWS Transcribe. The click triggers the #start_publish_button handler in main.js:

$('#start_publish_button').click(function () {
    $('#error').hide(); // hide any existing errors
    toggleStartStop(true); // disable start and enable stop button

    // set the language and region from the dropdowns
    setLanguage();
    setRegion();

    const video = document.getElementById('localVideo');
    const astream = video.captureStream();
    streamAudioToWebSocket(astream)
});
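The presigned URL in createPresignedUrl requests media-encoding=pcm, so audio taken from the captured stream has to be converted to PCM before it is sent. The article does not show how streamAudioToWebSocket does this. The following is a minimal sketch of one common approach, assuming the audio arrives as Float32Array samples in the range [-1, 1] (as the Web Audio API provides them) and is encoded as signed 16-bit little-endian PCM. The function name floatTo16BitPcm is hypothetical.

```javascript
// Hypothetical helper (not taken from the demo's main.js): converts floating-point
// audio samples to signed 16-bit little-endian PCM.
function floatTo16BitPcm(samples) {
  const buffer = new ArrayBuffer(samples.length * 2); // 2 bytes per sample
  const view = new DataView(buffer);
  for (let i = 0; i < samples.length; i++) {
    // Clamp to [-1, 1] so out-of-range input cannot overflow the 16-bit range.
    const s = Math.max(-1, Math.min(1, samples[i]));
    // Negative values scale to -32768, positive values to 32767.
    const value = s < 0 ? s * 0x8000 : s * 0x7fff;
    view.setInt16(i * 2, Math.round(value), true); // true = little-endian
  }
  return buffer;
}

const pcm = floatTo16BitPcm(new Float32Array([0, 0.5, -0.5]));
console.log(pcm.byteLength); // 3 samples * 2 bytes each
```

This covers only the sample-format conversion. Any resampling to the configured sampleRate and the message framing for the WebSocket are separate steps that this sketch does not attempt.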

Stop Publishing Button

The Stop Publishing button is shown as disabled on the publishing demo page. It becomes enabled after you click the Start Publishing button. Clicking it stops the video stream and also stops AWS from transcribing speech to text.

$('#stop_publish_button').click(function () { closeSocket(); });

2. Set Language

Set your language in the main.js file. Find the setLanguage function and set languageCode according to your caption's language. By default, it is set to en-US (English, US).

function setLanguage() {
    //languageCode = $('#language').find(':selected').val();
    languageCode = 'en-US';
    if (languageCode == "en-US" || languageCode == "es-US")
        sampleRate = 44100;
    else
        sampleRate = 8000;
}
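setLanguage picks the sample rate from the language code: 44100 for en-US and es-US, and 8000 otherwise. As a minimal sketch, the same rule could be expressed as a lookup table, which keeps the mapping in one place if you add more languages. The names SAMPLE_RATES, DEFAULT_SAMPLE_RATE, and sampleRateFor are hypothetical, and the values are only those that appear in setLanguage.

```javascript
// Hypothetical lookup table mirroring the rule in setLanguage.
const SAMPLE_RATES = { 'en-US': 44100, 'es-US': 44100 };
const DEFAULT_SAMPLE_RATE = 8000;

function sampleRateFor(languageCode) {
  // Use hasOwnProperty so inherited keys such as 'toString' are never matched.
  return Object.prototype.hasOwnProperty.call(SAMPLE_RATES, languageCode)
    ? SAMPLE_RATES[languageCode]
    : DEFAULT_SAMPLE_RATE;
}

console.log(sampleRateFor('en-US'));
console.log(sampleRateFor('fr-FR'));
```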

How to integrate live captioning within your project?

There are many ways to integrate this example into your current project, but here we explain the iframe approach.

Integration with iframe:

Publisher Page:

To integrate the Publisher page into your project, add an iframe tag to the page and set its src attribute to the URL of the index.html file, as shown below.

<iframe src="http://{{server.com}}/stream/Live-Captioning-demo-with-Ant-Media-and-AWS-transcribe/src/main/webapp/" title="captions demo" style="width:100%;height:100%;"></iframe>

Player Page:

To integrate the Player page into your project, add an iframe tag with an empty src to the page and set the src dynamically, as shown below:

<iframe id="streamIdurl" src="" title="captions player demo" style="width:100%;height:100%;"></iframe>
<script>
    document.getElementById("streamIdurl").src = "http://{{server.com}}/stream/Live-Captioning-demo-with-Ant-Media-and-AWS-transcribe/src/main/webapp/play.html" + window.location.search;
</script>

src/main/webapp/iframe-player.html?id=stream122
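The player snippet works by appending the host page's query string (for example ?id=stream122) to the play.html URL, so the stream id travels from the outer page into the iframe. As a minimal sketch of that mechanism, assuming a runtime with URLSearchParams, the two helpers below build the iframe src and read the id back out. The names buildPlayerSrc and getStreamId and the example.com URL are hypothetical.

```javascript
// Hypothetical helpers illustrating how the stream id passes through the query string.

// Appends the host page's query string (window.location.search in a browser)
// to the player URL, as the script in the player snippet does.
function buildPlayerSrc(playerUrl, search) {
  return playerUrl + search;
}

// Reads the stream id from a query string; returns null when no id is present.
function getStreamId(search) {
  return new URLSearchParams(search).get('id');
}

const src = buildPlayerSrc('http://example.com/play.html', '?id=stream122');
console.log(src);
console.log(getStreamId('?id=stream122'));
```

Checking getStreamId before setting the iframe src is one way to avoid loading the player when the page is opened without an id.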

That's how easy it is to add live captioning to your project. Besides making life easier for more than 466 million people, this feature provides a competitive advantage, especially to companies in the streaming sector. As mentioned at the beginning, live captioning has become mandatory in some usage scenarios.

Originally published at https://antmedia.io on September 20, 2022.



Ant Media Server

Ant Media Server is open source software that supports publishing live streams with WebRTC and RTMP. It also supports HLS (HTTP Live Streaming) and MP4.