Transcription Contents
Version 1
The transcript of an audio recording. Part of a Transcription.
Fields
Name | Type | Notes
---|---|---
version | | Version of this transcription object. Different versions can have a different structure.
language | | Detected language of the audio.
text | | The full transcript of the audio, without regard for timestamps or speakers.
languageProbability | | Confidence in the detected language. Between 0 (low) and 1 (high).
segments | | List of all segments. See TranscriptionSegment.
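As a rough illustration of how a client might read these top-level fields, here is a minimal sketch. It assumes the object arrives as JSON, that `version` is the integer 1 for this version of the structure, and that `language` is a short code such as `"en"`; none of these are confirmed by the table above. The payload values and the `load_contents` helper are hypothetical.

```python
import json

# Hypothetical payload: the field names follow the table above,
# but every value (and the value types) is an assumption for illustration.
raw = """
{
  "version": 1,
  "language": "en",
  "text": "Hello there. Hi.",
  "languageProbability": 0.97,
  "segments": []
}
"""

SUPPORTED_VERSION = 1  # assumption: Version 1 is encoded as the integer 1


def load_contents(payload: str) -> dict:
    """Parse a transcription contents object and run basic sanity checks."""
    contents = json.loads(payload)

    # Different versions can have a different structure, so refuse to guess.
    if contents["version"] != SUPPORTED_VERSION:
        raise ValueError(f"unsupported version: {contents['version']}")

    # languageProbability is documented as lying between 0 (low) and 1 (high).
    probability = contents["languageProbability"]
    if not 0 <= probability <= 1:
        raise ValueError(f"languageProbability out of range: {probability}")

    return contents


contents = load_contents(raw)
print(contents["language"], contents["languageProbability"])
```

Checking `version` before touching any other field keeps the rest of the parsing code from silently misreading a future structure.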
TranscriptionSegment
A segment is a contiguous portion of the audio, typically corresponding to a short phrase or sentence, along with its associated start and end timestamps.
Name | Type | Notes
---|---|---
speakerId | | ID of the speaker.
startMs | | Start time of this segment, in milliseconds.
endMs | | End time of this segment, in milliseconds.
text | | Text of this segment.
word | | List of words. See TranscriptionWord.
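The segment timestamps make it straightforward to derive per-speaker statistics. The sketch below totals speaking time per `speakerId`. It assumes segments are available as plain dictionaries, that `speakerId` is a string, and that `startMs` and `endMs` are integers; the sample segments and the `speaking_time_ms` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical segments: field names follow the table above, values are made up.
segments = [
    {"speakerId": "A", "startMs": 0, "endMs": 1500, "text": "Hello there."},
    {"speakerId": "B", "startMs": 1600, "endMs": 2100, "text": "Hi."},
    {"speakerId": "A", "startMs": 2200, "endMs": 3000, "text": "How are you?"},
]


def speaking_time_ms(segments: list[dict]) -> dict:
    """Sum the duration (endMs - startMs) of all segments for each speaker."""
    totals = defaultdict(int)
    for segment in segments:
        totals[segment["speakerId"]] += segment["endMs"] - segment["startMs"]
    return dict(totals)


totals = speaking_time_ms(segments)
for speaker, total in totals.items():
    print(f"speaker {speaker}: {total} ms")
```

Summing durations this way counts overlapping speech once per speaker, which is usually what is wanted for a talk-time breakdown.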
TranscriptionWord
A word is a finer-grained unit within a segment. Each word represents a single transcribed word from the audio, along with its own start and end timestamps (denoting when that word was spoken in the audio).
Name | Type | Notes
---|---|---
startMs | | Start time of this word, in milliseconds.
endMs | | End time of this word, in milliseconds.
text | | Text of this word.
probability | | Confidence in the detected text. Between 0 (low) and 1 (high).
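Word-level `probability` values can be used to flag parts of a transcript for manual review. The following is a minimal sketch, assuming words are available as plain dictionaries with numeric fields; the sample words, the `low_confidence_words` helper, and the 0.5 threshold are all hypothetical choices, not part of the documented object.

```python
# Hypothetical words: field names follow the table above, values are made up.
words = [
    {"startMs": 0, "endMs": 400, "text": "Hello", "probability": 0.98},
    {"startMs": 450, "endMs": 900, "text": "their", "probability": 0.41},
    {"startMs": 950, "endMs": 1500, "text": "friend", "probability": 0.93},
]


def low_confidence_words(words: list[dict], threshold: float = 0.5) -> list[dict]:
    """Return the words whose probability falls below the given threshold."""
    return [word for word in words if word["probability"] < threshold]


flagged = low_confidence_words(words)
for word in flagged:
    # Report the time range so a reviewer can jump to that point in the audio.
    print(f'{word["startMs"]}-{word["endMs"]} ms: {word["text"]} ({word["probability"]:.2f})')
```

A suitable threshold depends on the audio quality and on how costly a missed error is, so it is worth tuning against a few manually checked transcripts.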