Speech & Audio — DTPR Taxonomy

AI systems that convert between spoken words and other formats, or that classify non-speech sound. Includes speech-to-text (transcription), text-to-speech (synthetic voice), keyword spotting, and audio event detection. Voice-as-identity (recognizing who is speaking) belongs in Biometric Recognition.

Raw JSON

Live API response from GET /schemas/ai@2026-05-06-beta/elements/speech_audio.

{
  "id": "speech_audio",
  "category_id": "processing",
  "title": [
    {
      "locale": "en",
      "value": "Speech & Audio"
    }
  ],
  "description": [
    {
      "locale": "en",
      "value": "AI systems that convert between spoken words and other formats, or that classify non-speech sound. Includes speech-to-text (transcription), text-to-speech (synthetic voice), keyword spotting, and audio event detection. Voice-as-identity (recognizing who is speaking) belongs in Biometric Recognition."
    }
  ],
  "authoring_guidance": [],
  "examples": [],
  "sources": [],
  "symbol_id": "processing_text-to-speech",
  "variables": [
    {
      "id": "additional_description",
      "label": [
        {
          "locale": "en",
          "value": "Description"
        }
      ],
      "required": false
    }
  ],
  "shape": "circle",
  "icon_variants": [
    "default",
    "dark"
  ]
}
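
The payload above stores human-readable text as lists of locale/value pairs, so a consumer has to pick a locale before displaying anything. The following is a minimal Python sketch of that lookup, assuming the response has already been fetched and is available as a JSON string. The helper names `localized` and `summarize` are hypothetical and not part of the DTPR API, and the embedded payload is a trimmed copy of the response shown above.

```python
import json

# Trimmed copy of the element payload shown above; fields not used here are omitted.
ELEMENT_JSON = """
{
  "id": "speech_audio",
  "category_id": "processing",
  "title": [{"locale": "en", "value": "Speech & Audio"}],
  "symbol_id": "processing_text-to-speech",
  "variables": [
    {
      "id": "additional_description",
      "label": [{"locale": "en", "value": "Description"}],
      "required": false
    }
  ],
  "icon_variants": ["default", "dark"]
}
"""


def localized(entries, locale="en"):
    """Return the value for `locale` from a list of {"locale", "value"} dicts, or None."""
    for entry in entries:
        if entry.get("locale") == locale:
            return entry.get("value")
    return None


def summarize(element, locale="en"):
    """Collect the fields a renderer would most likely need (hypothetical helper)."""
    return {
        "id": element["id"],
        "title": localized(element.get("title", []), locale),
        # Map each variable id to whether it is required.
        "variables": {
            v["id"]: v.get("required", False) for v in element.get("variables", [])
        },
        "icon_variants": element.get("icon_variants", []),
    }


summary = summarize(json.loads(ELEMENT_JSON))
print(summary)
```

Returning `None` for a missing locale, rather than raising, lets the caller decide whether to fall back to another locale; that choice is an assumption of this sketch, not documented behavior.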