Transcription Contents
Version 1
The transcript of an audio recording. Part of a Transcription.
Fields
| Name | Type | Notes |
|---|---|---|
| version | | Version of this transcription object. Different versions can have a different structure. |
| language | | Detected language of the audio. |
| text | | The full transcript of the audio, without regard for timestamps or speakers. |
| languageProbability | | Confidence in the detected language. Between 0 (low) and 1 (high). |
| segments | | List of all segments. |
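
As a rough illustration of how these top-level fields might be read, the sketch below parses a hand-written payload and flags an uncertain language detection. It assumes the object arrives as JSON; the sample values, the `summarize` helper, and the 0.5 threshold are all hypothetical and not part of this reference.

```python
import json

# Hypothetical sample payload; every value here is invented for illustration.
SAMPLE = """
{
  "version": 1,
  "language": "en",
  "text": "Hello there. Hi.",
  "languageProbability": 0.98,
  "segments": []
}
"""


def summarize(contents: dict, min_language_probability: float = 0.5) -> str:
    """Return a one-line summary, marking low-confidence language detection."""
    language = contents["language"]
    # languageProbability ranges from 0 (low) to 1 (high).
    if contents["languageProbability"] < min_language_probability:
        language += " (uncertain)"
    return f"[{language}] {len(contents['segments'])} segment(s): {contents['text']}"


contents = json.loads(SAMPLE)
summary = summarize(contents)
print(summary)
```

Because different versions can have a different structure, a consumer would probably want to check `version` before reading the other fields.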
TranscriptionSegment
A segment is a contiguous portion of the audio, typically corresponding to a short phrase or sentence, along with its associated start and end timestamps.
| Name | Type | Notes |
|---|---|---|
| speakerId | | ID of the speaker. |
| startMs | | Start time of this segment, in milliseconds. |
| endMs | | End time of this segment, in milliseconds. |
| text | | Text of this segment. |
| word | | List of words. |
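
One possible use of the segment fields is rendering a speaker-labelled, timestamped transcript. The sketch below assumes `startMs` and `endMs` are integers and that `speakerId` can be printed directly; `format_ms`, `render_segments`, and the sample data are hypothetical names and values chosen for illustration.

```python
def format_ms(ms: int) -> str:
    """Format a millisecond offset as MM:SS.mmm."""
    minutes, rest = divmod(ms, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{minutes:02d}:{seconds:02d}.{millis:03d}"


def render_segments(segments: list) -> list:
    """Build one display line per segment, in the order given."""
    lines = []
    for seg in segments:
        start = format_ms(seg["startMs"])
        end = format_ms(seg["endMs"])
        lines.append(f"[{start} - {end}] {seg['speakerId']}: {seg['text']}")
    return lines


# Hypothetical segments; the speaker IDs and text are invented.
SEGMENTS = [
    {"speakerId": "speaker_0", "startMs": 0, "endMs": 1500,
     "text": "Hello there.", "word": []},
    {"speakerId": "speaker_1", "startMs": 61500, "endMs": 62250,
     "text": "Hi.", "word": []},
]

rendered = render_segments(SEGMENTS)
for line in rendered:
    print(line)
```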
TranscriptionWord
A word is a finer-grained unit within a segment. Each word represents a single transcribed word from the audio, along with its own start and end timestamps (denoting when that word was spoken in the audio).
| Name | Type | Notes |
|---|---|---|
| startMs | | Start time of this word, in milliseconds. |
| endMs | | End time of this word, in milliseconds. |
| text | | Text of this word. |
| probability | | Confidence in the detected text. Between 0 (low) and 1 (high). |
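
The per-word `probability` can be used to surface words that may need manual review. This is a minimal sketch under assumptions: the 0.6 threshold is arbitrary, `low_confidence_words` is a hypothetical helper, and the sample segment is invented. It reads the word list from the `word` field as named in the segment table above.

```python
def low_confidence_words(segment: dict, threshold: float = 0.6) -> list:
    """Return the words in a segment whose probability is below the threshold."""
    return [w for w in segment["word"] if w["probability"] < threshold]


# Hypothetical segment; all values are invented for illustration.
SEGMENT = {
    "speakerId": "speaker_0",
    "startMs": 0,
    "endMs": 1200,
    "text": "Hello their",
    "word": [
        {"startMs": 0, "endMs": 500, "text": "Hello", "probability": 0.97},
        {"startMs": 600, "endMs": 1200, "text": "their", "probability": 0.41},
    ],
}

flagged = low_confidence_words(SEGMENT)
for w in flagged:
    print(f"{w['text']} ({w['startMs']}-{w['endMs']} ms, p={w['probability']})")
```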