IBM Speech to Text

info

Available only in PAM version R24.0201 (Python 3.7) and earlier versions.




Primary Features

This plugin calls the IBM Watson Speech to Text service.
It runs AI-based speech-to-text (STT) transcription by sending the audio file together with the other required parameters, such as the compression type and the language of choice.
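As a rough illustration of what such a call involves, the sketch below builds (but does not send) an HTTP request for the IBM Watson Speech to Text `/v1/recognize` endpoint using only the Python standard library. It is not the plugin's actual implementation, which is not shown on this page. The helper name `build_recognize_request`, the service URL, the API key, and the audio bytes are all hypothetical placeholders; the default model and content type are assumptions chosen for the example.

```python
import base64
import urllib.parse
import urllib.request


def build_recognize_request(service_url, api_key, audio_bytes,
                            content_type="audio/flac",
                            model="en-US_BroadbandModel"):
    """Build (but do not send) a POST request for the /v1/recognize endpoint."""
    # The language is selected through the "model" query parameter.
    query = urllib.parse.urlencode({"model": model})
    url = "{}/v1/recognize?{}".format(service_url.rstrip("/"), query)
    # IBM Cloud accepts HTTP basic auth with the literal user name "apikey".
    credentials = "apikey:{}".format(api_key).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    headers = {
        "Authorization": "Basic " + token,
        # The content type tells the service how the audio is compressed.
        "Content-Type": content_type,
    }
    return urllib.request.Request(url, data=audio_bytes,
                                  headers=headers, method="POST")


# Hypothetical values for illustration only; replace with your own.
request = build_recognize_request(
    "https://api.us-south.speech-to-text.watson.cloud.ibm.com/instances/EXAMPLE",
    "EXAMPLE_API_KEY",
    b"\x00\x01",
)
print(request.get_method(), request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send the audio to the service; the example stops short of that so it can run without credentials or network access.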



Need help?

For technical support, contact tech@argos-labs.com



CAUTION

API keys created under IBM Cloud resource regions other than “Dallas” may cause an authentication error (error code 403).

IMPORTANT NOTES

1. This is a commercial API. The end user will be charged by the supplier of this API after a certain amount of free usage.
2. The user license contract must be entered into directly between the supplier of this API and the end user.
3. ARGOS LABS will not be responsible for any consequences, tangible or intangible, that result from usage of this API.







How to set the parameters





Advanced Settings


  1. When checked, the plugin returns a confidence measure in the range of 0.0 to 1.0 for each word. When unchecked, no word confidence measures are returned. 

  2. When checked, the plugin converts dates, times, series of digits and numbers, phone numbers, currency values, and internet addresses into more readable, conventional representations. For US English, the plugin also converts certain keyword strings to punctuation symbols. This applies to US English, Japanese, and Spanish transcription only.

  3. When checked, you can specify the duration of the pause interval at which the service ends the processing. Silence indicates a point at which the speaker pauses between spoken words or phrases. Specify a value for the pause interval in the range of 0.0 to 120.0 seconds. The default pause interval for most languages is 0.8 seconds. The default for Chinese is 0.6 seconds.

  4. Use this parameter to suppress side conversations or background noise. Specify a value between 0.0 and 1.0:
  • 0.0 (the default) provides no suppression (background audio suppression is disabled).
  • 0.5 provides a reasonable level of audio suppression for general usage.
  • 1.0 suppresses all audio (no audio is transcribed).
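The four settings above can be sketched as a small validation helper. This is a minimal illustration, not the plugin's code: the function name `build_advanced_params` is hypothetical, and the mapping of each checkbox to the IBM Speech to Text query parameters `word_confidence`, `smart_formatting`, `end_of_phrase_silence_time`, and `background_audio_suppression` is an assumption about how the plugin forwards these values.

```python
def build_advanced_params(word_confidence=False, smart_formatting=False,
                          end_of_phrase_silence_time=None,
                          background_audio_suppression=0.0):
    """Validate the four advanced settings and return them as query parameters."""
    # Setting 3: the pause interval must be within 0.0 to 120.0 seconds.
    if end_of_phrase_silence_time is not None:
        if not 0.0 <= end_of_phrase_silence_time <= 120.0:
            raise ValueError("pause interval must be between 0.0 and 120.0")
    # Setting 4: 0.0 disables suppression, 1.0 suppresses all audio.
    if not 0.0 <= background_audio_suppression <= 1.0:
        raise ValueError("audio suppression must be between 0.0 and 1.0")

    params = {
        # Settings 1 and 2 are simple on/off switches.
        "word_confidence": str(bool(word_confidence)).lower(),
        "smart_formatting": str(bool(smart_formatting)).lower(),
        "background_audio_suppression": background_audio_suppression,
    }
    # Leave the pause interval out so the service applies its own default.
    if end_of_phrase_silence_time is not None:
        params["end_of_phrase_silence_time"] = end_of_phrase_silence_time
    return params


print(build_advanced_params(word_confidence=True,
                            end_of_phrase_silence_time=1.5,
                            background_audio_suppression=0.5))
```

Omitting the pause interval when it is unset, rather than sending 0.8, lets the language-specific default (for example 0.6 seconds for Chinese) apply.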




How to obtain an IBM API key

Step 1

  • Visit https://cloud.ibm.com
  • Sign up and log in to your IBM account.
  • Click the CREATE button in Resource summary to create a new API service.


Step 2

  • Select the AI menu first, then select “Speech to Text”.
  • Alternatively, search for “Speech to Text” using the search icon at the top.


Step 3

  • Open Manage from the top menu and then choose Access (IAM).


Step 4

  • Click the “Create an IBM Cloud API key” button.


Step 5

  • Download the API key and you are done.
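Once the key is downloaded, it can be read from the file before being entered into the plugin. The sketch below assumes the download is a JSON file with an `apikey` field (IBM Cloud typically names it `apikey.json`); that layout is an assumption here, so check your own file. The helper name `load_api_key` and the sample key value are hypothetical.

```python
import json
import os
import tempfile


def load_api_key(path):
    """Return the value of the "apikey" field from a downloaded key file."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    key = str(data.get("apikey", "")).strip()
    if not key:
        raise ValueError("no 'apikey' field found in {}".format(path))
    return key


# Demonstration with a throwaway file that mimics the assumed layout.
with tempfile.TemporaryDirectory() as folder:
    sample_path = os.path.join(folder, "apikey.json")
    with open(sample_path, "w", encoding="utf-8") as handle:
        json.dump({"name": "demo", "apikey": "EXAMPLE_API_KEY"}, handle)
    loaded_key = load_api_key(sample_path)

print(loaded_key)
```

Keep the downloaded file private; anyone who holds the key can use the service on your account.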








