Transcripts

Extract transcripts from media files

You can transcribe YouTube, Vimeo, and Dailymotion videos, as well as private media files (.mp4 and .mp3). More formats will be supported soon.

Get transcript

GET https://videohighlight.com/api-v1/transcripts

Path variables

Request parameters

  • type string required

    • The media type of the video. It can be youtube, vimeo, dailymotion, or private
  • languageCode string required

    • The language code of the input audio, either an ISO 639-1 code such as en or a code with a region subtag such as en-US
  • aiFormatted boolean required

    • If true, the transcript is formatted by AI. This applies to auto-generated transcripts from video platforms such as YouTube. It does not apply to private videos.
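
As an illustration of how the query parameters above fit together, here is a minimal Python sketch that builds the request URL. The helper name `build_transcript_url` and the `extra_params` argument are hypothetical. Two things are assumptions rather than documented behavior: that booleans are sent as lowercase `true`/`false`, and that anything else the endpoint needs (such as a video identifier or credentials) is passed separately, since this page does not name it.

```python
from urllib.parse import urlencode

BASE_URL = "https://videohighlight.com/api-v1/transcripts"
MEDIA_TYPES = {"youtube", "vimeo", "dailymotion", "private"}


def build_transcript_url(media_type, language_code, ai_formatted, extra_params=None):
    """Return the GET URL carrying the three documented query parameters."""
    if media_type not in MEDIA_TYPES:
        raise ValueError(f"unsupported media type: {media_type!r}")
    params = {
        "type": media_type,
        "languageCode": language_code,
        # Assumption: booleans are serialized as lowercase "true"/"false".
        "aiFormatted": "true" if ai_formatted else "false",
    }
    # Assumption: any further parameter (for example a video identifier)
    # is supplied by the caller; this page does not document its name.
    params.update(extra_params or {})
    return f"{BASE_URL}?{urlencode(params)}"


url = build_transcript_url("youtube", "en", False)
print(url)
```

The resulting URL can be passed to any HTTP client. Validating `type` locally avoids a round trip that would likely end in a 400 response.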

Response

Transcript object

{
    "transcript_video_id": "92195d82-bff5-4b3b-bc7b-9ae684fa34ba",
    "video_id": "65_PmYipnpk",
    "type": "youtube",
    "language_code": "en",
    "video_title": "Steve Jobs - Courage",
    "video_duration": 214,
    "transcript": {
        "transcript_segments": [
            {
                "id": "95cffeb9-52ed-4e6c-a3af-7bcf6b3bd96a",
                "start": "0:00:01",
                "segment": "Apple is a company that has doesn't have the most resources of everybody in the world and the way we've succeeded is by",
                "start_time": 1.34,
                "end_time": 15.24,
                "segment_length": 119
            },
            {
                "id": "ec56262e-7216-4919-a952-271ee445e34b",
                "start": "0:00:13",
                "segment": "choosing what horses to ride really carefully technically we try to look for these technical vectors that that have a",
                "start_time": 12.99,
                "end_time": 23.34,
                "segment_length": 117
            },
            {
                "id": "12e6c87a-bb03-47e4-afeb-1a85c03b7e22",
                "start": "0:00:20",
                "segment": "...",
                "start_time": 19.89,
                "end_time": 31.949,
                "segment_length": 110
            }
        ]
    }
}
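
In the sample above, consecutive segments overlap in time (the second segment starts at 12.99 while the first ends at 15.24). The following Python sketch shows one way to turn a transcript object into plain text and to list such overlaps. The function names are hypothetical, and the `sample` dictionary is an abbreviated stand-in for a real response, not actual API output.

```python
def _ordered_segments(transcript_object):
    """Return the segments sorted by start_time."""
    segments = transcript_object["transcript"]["transcript_segments"]
    return sorted(segments, key=lambda s: s["start_time"])


def segments_to_text(transcript_object):
    """Join the text of all segments in chronological order."""
    return " ".join(s["segment"].strip() for s in _ordered_segments(transcript_object))


def overlapping_pairs(transcript_object):
    """Return (earlier_id, later_id) for consecutive segments whose times overlap."""
    ordered = _ordered_segments(transcript_object)
    return [
        (a["id"], b["id"])
        for a, b in zip(ordered, ordered[1:])
        if b["start_time"] < a["end_time"]
    ]


# Abbreviated, hand-written stand-in for a transcript object.
sample = {
    "transcript": {
        "transcript_segments": [
            {"id": "a", "segment": "Apple is a company", "start_time": 1.34, "end_time": 15.24},
            {"id": "b", "segment": "choosing what horses to ride", "start_time": 12.99, "end_time": 23.34},
        ]
    }
}
print(segments_to_text(sample))
print(overlapping_pairs(sample))
```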

Response codes

  • 200 - OK - The transcript has been successfully retrieved.
  • 204 - NO_CONTENT - The video has not been transcribed yet.
  • 400 - BAD_REQUEST - The request is invalid.
  • 401 - UNAUTHORIZED - The request is unauthorized.
  • 500 - INTERNAL_SERVER_ERROR - The server has encountered an error.
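
A client will usually branch on these status codes. The Python sketch below maps each documented code to a suggested action. The action labels are hypothetical, and treating 204 as a cue to request a new transcript through the streaming endpoint is an assumption, not something this page states.

```python
def interpret_status(status_code):
    """Map a documented status code to a suggested client action (labels are illustrative)."""
    if status_code == 200:
        return "ready"            # body contains the transcript object
    if status_code == 204:
        # Assumption: no stored transcript yet; a client might generate
        # one via the streaming endpoint and retry later.
        return "not_transcribed"
    if status_code in (400, 401):
        return "fix_request"      # invalid parameters or missing credentials
    if status_code == 500:
        return "server_error"
    return "unknown"              # code not listed on this page


for code in (200, 204, 400, 401, 500):
    print(code, interpret_status(code))
```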

Stream transcript

POST https://videohighlight.com/api-v1/transcripts/stream

Request body

  • video_id string required

  • type string required

    • The media type of the video. It can be youtube, vimeo, dailymotion, or private
  • language_code string required

    • The language code of the input audio in ISO 639-1 format, like en or fr
  • ai_formatted boolean required

    • If true, the transcript is formatted by AI. This applies to auto-generated transcripts from video platforms such as YouTube. It does not apply to private videos.
  • use_sse boolean required

  • regenerate boolean required

Example request body
{
    "video_id": "65_PmYipnpk",
    "type": "youtube",
    "language_code": "en",
    "ai_formatted": false,
    "use_sse": true,
    "regenerate": false
}
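
Because every field in the request body is required, it can help to build the payload through a small function that rejects missing or mistyped values before anything is sent. This is a minimal Python sketch; `build_stream_payload` is a hypothetical helper, and how the body is transmitted (headers, authentication) is not covered on this page.

```python
import json

MEDIA_TYPES = {"youtube", "vimeo", "dailymotion", "private"}


def build_stream_payload(*, video_id, media_type, language_code,
                         ai_formatted, use_sse, regenerate):
    """Return the JSON request body as a string; all six fields are required."""
    if media_type not in MEDIA_TYPES:
        raise ValueError(f"unsupported media type: {media_type!r}")
    flags = {"ai_formatted": ai_formatted, "use_sse": use_sse, "regenerate": regenerate}
    for name, value in flags.items():
        # Reject 0/1 or "true" strings so the JSON carries real booleans.
        if not isinstance(value, bool):
            raise TypeError(f"{name} must be a bool, got {type(value).__name__}")
    payload = {
        "video_id": video_id,
        "type": media_type,
        "language_code": language_code,
        **flags,
    }
    return json.dumps(payload)


body = build_stream_payload(
    video_id="65_PmYipnpk", media_type="youtube", language_code="en",
    ai_formatted=False, use_sse=True, regenerate=False,
)
print(body)
```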

Responses

Transcript Messages

Download Progress Message
{
    "type": "download_progress",
    "info": {
        "progress": 0.5,
        "speed": "1.5x",
        "eta": "1m 30s"
    }
}
Error Message
{
    "type": "error",
    "info": {
        "type": "[ERROR-FILE-DOWNLOAD]",
        "message": "This video can not be downloaded."
    }
}

Error Message Types

  • [ERROR] - Generic error message
  • [ERROR-FILE-DOWNLOAD] - The video cannot be downloaded
  • [ERROR-FILE-DOWNLOAD-TO-ASR] - The video cannot be downloaded by the Automatic Speech Recognition (ASR) service
  • [ERROR-VIDEO-PRIVATE] - The video is private
  • [ERROR-VIDEO-GEO-RESTRICTION] - The video is not available in your region
  • [ERROR-VIDEO-COPYRIGHT] - The video is copyrighted
  • [ERROR-VIDEO-UNAVAILABLE] - The video is not available
  • [ERROR-GENERIC-STREAMING-EXCEPTION] - Generic error raised during streaming
  • [ERROR-REMOTE-ASR-CONNECTION] - The connection to the Automatic Speech Recognition (ASR) service has failed
  • [ERROR-REMOTE-ASR-TIMEOUT] - The connection to the Automatic Speech Recognition (ASR) service has timed out
  • [ERROR-UNEXPECTED] - An unexpected error has occurred
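
One way a client might use these error types is to group them before deciding how to react. The Python sketch below does that. The grouping itself is an assumption made for illustration (for example, that ASR connection and timeout errors are worth retrying); this page does not say which errors are transient, and `classify_error` is a hypothetical helper.

```python
# Assumption: these two look like temporary infrastructure failures.
TRANSIENT = {
    "[ERROR-REMOTE-ASR-CONNECTION]",
    "[ERROR-REMOTE-ASR-TIMEOUT]",
}
# Assumption: these describe the video itself, so a retry is unlikely to help.
VIDEO_ACCESS = {
    "[ERROR-VIDEO-PRIVATE]",
    "[ERROR-VIDEO-GEO-RESTRICTION]",
    "[ERROR-VIDEO-COPYRIGHT]",
    "[ERROR-VIDEO-UNAVAILABLE]",
}


def classify_error(message):
    """Return 'transient', 'video_access', or 'other' for an error message dict."""
    if message.get("type") != "error":
        raise ValueError("not an error message")
    error_type = message["info"]["type"]
    if error_type in TRANSIENT:
        return "transient"
    if error_type in VIDEO_ACCESS:
        return "video_access"
    return "other"


example = {
    "type": "error",
    "info": {"type": "[ERROR-FILE-DOWNLOAD]", "message": "This video can not be downloaded."},
}
print(classify_error(example))
```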
Transcript Segment Message
{
    "type": "segment",
    "info": {
        "id": "b44f3bba-91ee-4c47-b2c4-c5a132c9e730",
        "start_time": 1.959,
        "end_time": 15.36,
        "text": "apple is a company that has doesn't have the most resources of everybody in the world and um the way we've succeeded is",
        "text_length": 119,
        "timestamp": "0:00:02",
        "asr": true,
        "language_code": "en",
        "progress": 0.2
    }
}
Semaphore Message
{
    "type": "semaphore",
    "info": {
        "status": "[DONE]"
    }
}

Semaphore Types

  • [DONE] - The transcript has been successfully generated
  • [INSUFFICIENT_DATA] - The transcript has been processed, but there is not sufficient data to complete it (e.g., the video is too short)
Transcript Info Message
{
    "type": "transcript_info",
    "info": {
        "transcript_video_id": "95cffeb9-52ed-4e6c-a3af-7bcf6b3bd96a",
        "video_id": "65_PmYipnpk",
        "type": "youtube",
        "video_title": "Steve Jobs - Courage",
        "video_duration": 214,
        "language_code": "en",
        "asr": true
    }
}
Transcript Progress Message
{
    "type": "transcript_progress",
    "info": {
        "progress": 0.5,
        "language_code": "en",
        "asr": true
    }
}
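
Taken together, the message types above suggest a simple dispatch loop on the client. The Python sketch below consumes already-extracted JSON strings and accumulates state. It assumes that each message arrives as one JSON document (with `use_sse` set, presumably the payload of an SSE `data:` line), and that a semaphore or error message ends the stream. Neither assumption is confirmed on this page, and `consume_messages` is a hypothetical helper.

```python
import json


def consume_messages(raw_messages):
    """Fold a sequence of JSON message strings into a summary dict."""
    state = {"info": None, "segments": [], "progress": 0.0, "status": None, "error": None}
    for raw in raw_messages:
        message = json.loads(raw)
        kind, info = message["type"], message["info"]
        if kind == "transcript_info":
            state["info"] = info
        elif kind == "segment":
            state["segments"].append(info["text"])
            state["progress"] = info["progress"]
        elif kind == "transcript_progress":
            state["progress"] = info["progress"]
        elif kind == "semaphore":
            # Assumption: a semaphore message is the last one in the stream.
            state["status"] = info["status"]
            break
        elif kind == "error":
            # Assumption: nothing useful follows an error message.
            state["error"] = info
            break
        # download_progress and unrecognized types are ignored in this sketch.
    return state


stream = [
    '{"type": "transcript_info", "info": {"video_id": "65_PmYipnpk", "asr": true}}',
    '{"type": "download_progress", "info": {"progress": 0.5, "speed": "1.5x", "eta": "1m 30s"}}',
    '{"type": "segment", "info": {"text": "apple is a company", "progress": 0.2}}',
    '{"type": "transcript_progress", "info": {"progress": 0.5}}',
    '{"type": "semaphore", "info": {"status": "[DONE]"}}',
]
result = consume_messages(stream)
print(result["status"], result["segments"])
```

Keeping the loop free of network code makes it easy to test against recorded messages before wiring it to a real connection.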