Transcripts
Extract transcripts from media files
You can transcribe YouTube, Vimeo, and DailyMotion videos, as well as private media files (.mp4 and .mp3). More formats will be added soon.
Get transcript
GET
https://videohighlight.com/api-v1/transcripts
Path variables
- videoId (string, required): The video ID for the file.
  - For YouTube videos, this is the last part of the URL (the value after v=). For https://www.youtube.com/watch?v=65_PmYipnpk, the videoId is 65_PmYipnpk.
  - For Vimeo videos, this is the last part of the URL. For https://vimeo.com/318971562, the videoId is 318971562.
  - For DailyMotion videos, this is the last part of the URL. For https://www.dailymotion.com/video/x8w5pyg, the videoId is x8w5pyg.
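The videoId rules above can be sketched as a small helper. This is a minimal illustration, not part of the API: the function name `extract_video_id` is hypothetical, and it only handles the three URL shapes shown in the examples (short links, embeds, and other URL variants are not covered).

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url: str) -> tuple[str, str]:
    """Return (type, videoId) for a YouTube, Vimeo, or DailyMotion URL.

    Hypothetical helper: it covers only the URL shapes shown in the examples.
    """
    parsed = urlparse(url)
    host = parsed.netloc.lower().removeprefix("www.")
    # Last non-empty path segment, e.g. "318971562" or "x8w5pyg".
    last_segment = parsed.path.rstrip("/").rsplit("/", 1)[-1]

    if host == "youtube.com":
        # Watch URLs carry the ID in the "v" query parameter.
        values = parse_qs(parsed.query).get("v")
        if not values:
            raise ValueError(f"no 'v' parameter in {url!r}")
        return "youtube", values[0]
    if host == "vimeo.com":
        return "vimeo", last_segment
    if host == "dailymotion.com":
        return "dailymotion", last_segment
    raise ValueError(f"unsupported host: {host!r}")
```

The first element of the returned pair matches the values accepted by the type parameter, so both can be passed on together.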
Request parameters
- type (string, required): The media type of the video. It can be youtube, vimeo, dailymotion, or private.
- languageCode (string, required): The language code of the input audio in ISO-639-1 format, like en or en-US.
- aiFormatted (boolean, required): If true, the transcript will be formatted by the AI. This applies to auto-generated transcripts from video platforms like YouTube. It does not apply to private videos.
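As a rough sketch, the request URL could be assembled as follows. Several things here are assumptions rather than documented behavior: the endpoint URL shown has no placeholder for videoId, so this sketch sends it as a query parameter alongside the others; booleans are serialized as lowercase true/false; and authentication (implied by the 401 response) is not covered because the scheme is not described in this section. The function name `build_transcript_url` is hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://videohighlight.com/api-v1/transcripts"
ALLOWED_TYPES = {"youtube", "vimeo", "dailymotion", "private"}


def build_transcript_url(video_id: str, media_type: str,
                         language_code: str, ai_formatted: bool) -> str:
    """Build the GET URL for the transcript endpoint (hypothetical helper)."""
    if media_type not in ALLOWED_TYPES:
        raise ValueError(f"type must be one of {sorted(ALLOWED_TYPES)}")
    params = {
        # ASSUMPTION: videoId is sent as a query parameter.
        "videoId": video_id,
        "type": media_type,
        "languageCode": language_code,
        # ASSUMPTION: booleans are serialized as lowercase strings.
        "aiFormatted": "true" if ai_formatted else "false",
    }
    return f"{BASE_URL}?{urlencode(params)}"
```

The function only builds the URL; sending it with credentials is left to whichever HTTP client and authentication scheme the account uses.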
Response
Transcript object
{
  "transcript_video_id": "92195d82-bff5-4b3b-bc7b-9ae684fa34ba",
  "video_id": "65_PmYipnpk",
  "type": "youtube",
  "language_code": "en",
  "video_title": "Steve Jobs - Courage",
  "video_duration": 214,
  "transcript": {
    "transcript_segments": [
      {
        "id": "95cffeb9-52ed-4e6c-a3af-7bcf6b3bd96a",
        "start": "0:00:01",
        "segment": "Apple is a company that has doesn't have the most resources of everybody in the world and the way we've succeeded is by",
        "start_time": 1.34,
        "end_time": 15.24,
        "segment_length": 119
      },
      {
        "id": "ec56262e-7216-4919-a952-271ee445e34b",
        "start": "0:00:13",
        "segment": "choosing what horses to ride really carefully technically we try to look for these technical vectors that that have a",
        "start_time": 12.99,
        "end_time": 23.34,
        "segment_length": 117
      },
      {
        "id": "12e6c87a-bb03-47e4-afeb-1a85c03b7e22",
        "start": "0:00:20",
        "segment": "...",
        "start_time": 19.89,
        "end_time": 31.949,
        "segment_length": 110
      }
    ]
  }
}
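A common next step is to turn the Transcript object into plain text. The sketch below relies only on the fields shown in the sample response (transcript.transcript_segments, segment, start_time). The function name `transcript_text` is hypothetical, and sorting by start_time is a precaution, since the section does not state whether segments always arrive in order.

```python
def transcript_text(transcript_object: dict) -> str:
    """Join all segments of a Transcript object into one string."""
    segments = transcript_object["transcript"]["transcript_segments"]
    # Sort defensively; the ordering guarantee is not documented here.
    ordered = sorted(segments, key=lambda s: s["start_time"])
    return " ".join(s["segment"].strip() for s in ordered)
```

Note that the time ranges of neighboring segments overlap in the sample (the second segment starts at 12.99 while the first ends at 15.24), so start_time and end_time should not be treated as a strict partition of the timeline.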
Error responses
- 200 OK - The transcript has been successfully retrieved.
- 204 NO_CONTENT - The video has not been transcribed yet.
- 400 BAD_REQUEST - The request is invalid.
- 401 UNAUTHORIZED - The request is unauthorized.
- 500 INTERNAL_SERVER_ERROR - The server has encountered an error.
If the video has not been transcribed yet, the response will be 204 with no content. In that case, call the Stream transcript endpoint to generate the transcript.
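The status codes above suggest a simple decision step on the client side. The sketch below is one possible reading: only the 200 and 204 branches follow directly from this section, while the suggested reactions to 400, 401, and 500 are assumptions about sensible client behavior. `NextStep` and `next_step` are hypothetical names.

```python
from enum import Enum


class NextStep(Enum):
    USE_TRANSCRIPT = "use_transcript"        # 200: body holds the transcript
    START_STREAM = "start_stream"            # 204: call the Stream endpoint
    FIX_REQUEST = "fix_request"              # 400: ASSUMED client reaction
    CHECK_CREDENTIALS = "check_credentials"  # 401: ASSUMED client reaction
    RETRY_LATER = "retry_later"              # 500: ASSUMED client reaction


_STEPS = {
    200: NextStep.USE_TRANSCRIPT,
    204: NextStep.START_STREAM,
    400: NextStep.FIX_REQUEST,
    401: NextStep.CHECK_CREDENTIALS,
    500: NextStep.RETRY_LATER,
}


def next_step(status_code: int) -> NextStep:
    """Map a documented status code to a suggested client action."""
    try:
        return _STEPS[status_code]
    except KeyError:
        raise ValueError(f"undocumented status code: {status_code}") from None
```

A client would typically call the GET endpoint first and fall back to the Stream endpoint only when `next_step` returns `NextStep.START_STREAM`.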
Stream transcript
POST
https://videohighlight.com/api-v1/transcripts/stream
Request body
- video_id (string, required)
- type (string, required): The media type of the video. It can be youtube, vimeo, dailymotion, or private.
- language_code (string, required): The language code of the input audio in ISO-639-1 format, like en or fr.
- ai_formatted (boolean, required): If true, the transcript will be formatted by the AI. This applies to auto-generated transcripts from video platforms like YouTube. It does not apply to private videos.
- use_sse (boolean, required)
- regenerate (boolean, required)
{
  "video_id": "65_PmYipnpk",
  "type": "youtube",
  "language_code": "en",
  "ai_formatted": false,
  "use_sse": true,
  "regenerate": false
}
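The request body can be modeled as a small data class so that field names stay consistent with the snake_case keys above. This is a sketch: `StreamRequest` is a hypothetical name, and the default values simply mirror the sample body. All six fields are required by the API, so the defaults are a client-side convenience, not documented server defaults.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class StreamRequest:
    """Body for POST /api-v1/transcripts/stream (hypothetical wrapper)."""
    video_id: str
    type: str
    language_code: str
    # ASSUMPTION: defaults copied from the sample body, not from the API.
    ai_formatted: bool = False
    use_sse: bool = True
    regenerate: bool = False

    def to_json(self) -> str:
        # Every field is always serialized, since the API requires all six.
        return json.dumps(asdict(self))
```

For example, `StreamRequest("65_PmYipnpk", "youtube", "en").to_json()` produces a body with the same keys and values as the sample above.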
Responses
Transcript Messages
{
  "type": "download_progress",
  "info": {
    "progress": 0.5,
    "speed": "1.5x",
    "eta": "1m 30s"
  }
}
{
  "type": "error",
  "info": {
    "type": "[ERROR-FILE-DOWNLOAD]",
    "message": "This video can not be downloaded."
  }
}
Error Message Types
- [ERROR] - Generic error message
- [ERROR-FILE-DOWNLOAD] - The video can not be downloaded
- [ERROR-FILE-DOWNLOAD-TO-ASR] - The video can not be downloaded by the Automatic Speech Recognition (ASR) service
- [ERROR-VIDEO-PRIVATE] - The video is private
- [ERROR-VIDEO-GEO-RESTRICTION] - The video is not available in your region
- [ERROR-VIDEO-COPYRIGHT] - The video is copyrighted
- [ERROR-VIDEO-UNAVAILABLE] - The video is not available
- [ERROR-GENERIC-STREAMING-EXCEPTION] - Generic error message
- [ERROR-REMOTE-ASR-CONNECTION] - The connection to the Automatic Speech Recognition (ASR) service has failed
- [ERROR-REMOTE-ASR-TIMEOUT] - The connection to the Automatic Speech Recognition (ASR) service has timed out
- [ERROR-UNEXPECTED] - An unexpected error has occurred
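One way to surface these error messages in client code is to convert them into exceptions. The sketch below copies the error types and descriptions from the list above; the names `TranscriptStreamError` and `raise_for_error` are hypothetical, and falling back to the listed description when a message carries no text is an assumption about how a client might want to behave.

```python
ERROR_DESCRIPTIONS = {
    "[ERROR]": "Generic error message",
    "[ERROR-FILE-DOWNLOAD]": "The video can not be downloaded",
    "[ERROR-FILE-DOWNLOAD-TO-ASR]": "The video can not be downloaded by the ASR service",
    "[ERROR-VIDEO-PRIVATE]": "The video is private",
    "[ERROR-VIDEO-GEO-RESTRICTION]": "The video is not available in your region",
    "[ERROR-VIDEO-COPYRIGHT]": "The video is copyrighted",
    "[ERROR-VIDEO-UNAVAILABLE]": "The video is not available",
    "[ERROR-GENERIC-STREAMING-EXCEPTION]": "Generic error message",
    "[ERROR-REMOTE-ASR-CONNECTION]": "The connection to the ASR service has failed",
    "[ERROR-REMOTE-ASR-TIMEOUT]": "The connection to the ASR service has timed out",
    "[ERROR-UNEXPECTED]": "An unexpected error has occurred",
}


class TranscriptStreamError(Exception):
    """Raised when the stream delivers a message of type "error"."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(f"{code} {message}")
        self.code = code
        self.message = message


def raise_for_error(message: dict) -> None:
    """Raise TranscriptStreamError for error messages; ignore all others."""
    if message.get("type") != "error":
        return
    info = message["info"]
    code = info["type"]
    # Prefer the server's text; fall back to the listed description.
    text = info.get("message") or ERROR_DESCRIPTIONS.get(code, "Unknown error")
    raise TranscriptStreamError(code, text)
```

Callers can then branch on `exc.code`, for example to tell a private video apart from an ASR timeout.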
{
  "type": "segment",
  "info": {
    "id": "b44f3bba-91ee-4c47-b2c4-c5a132c9e730",
    "start_time": 1.959,
    "end_time": 15.36,
    "text": "apple is a company that has doesn't have the most resources of everybody in the world and um the way we've succeeded is",
    "text_length": 119,
    "timestamp": "0:00:02",
    "asr": true,
    "language_code": "en",
    "progress": 0.2
  }
}
{
  "type": "semaphore",
  "info": {
    "status": "[DONE]"
  }
}
Semaphore Types
- [DONE] - The transcript has been successfully generated
- [INSUFFICIENT_DATA] - The transcript has been processed, but there is not sufficient data to complete it (e.g., the video is too short)
{
  "type": "transcript_info",
  "info": {
    "transcript_video_id": "95cffeb9-52ed-4e6c-a3af-7bcf6b3bd96a",
    "video_id": "65_PmYipnpk",
    "type": "youtube",
    "video_title": "Steve Jobs - Courage",
    "video_duration": 214,
    "language_code": "en",
    "asr": true
  }
}
{
  "type": "transcript_progress",
  "info": {
    "progress": 0.5,
    "language_code": "en",
    "asr": true
  }
}
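Taken together, the message types can be handled by a single dispatch loop. The sketch below works on already-decoded message dictionaries and makes a few assumptions that this section does not confirm: that a semaphore message marks the end of the stream, that messages may arrive in any order before it, and that unknown message types can be skipped safely. `StreamState` and `consume` are hypothetical names, and decoding the raw stream into dictionaries is left out because the wire format is not described here.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional


@dataclass
class StreamState:
    segments: list = field(default_factory=list)  # "info" of each segment
    info: dict = field(default_factory=dict)      # last transcript_info
    progress: float = 0.0                         # last reported progress
    status: Optional[str] = None                  # semaphore status, if seen


def consume(messages: Iterable[dict]) -> StreamState:
    """Fold a sequence of stream messages into a StreamState."""
    state = StreamState()
    for message in messages:
        kind, info = message["type"], message["info"]
        if kind == "segment":
            state.segments.append(info)
            state.progress = info.get("progress", state.progress)
        elif kind == "transcript_progress":
            state.progress = info["progress"]
        elif kind == "transcript_info":
            state.info = info
        elif kind == "error":
            raise RuntimeError(f"{info['type']} {info.get('message', '')}")
        elif kind == "semaphore":
            # ASSUMPTION: a semaphore is the final message of the stream.
            state.status = info["status"]
            break
        # download_progress and unknown types are ignored in this sketch.
    return state
```

After the loop, a status of [DONE] indicates a complete transcript, [INSUFFICIENT_DATA] indicates a processed but incomplete one, and a status of None means the input ended without a semaphore.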