Writing a Speech Recognition App. in Carbon

Speech objects

The Speech Recognition Manager is object oriented in design. The parent class of the Speech Recognition Manager is SRSpeechObject. An instance of this class, a speech object, may be an SRRecognitionSystem object, an SRRecognizer object, an SRSpeechSource object, or an SRLanguageObject object.

SRRecognitionSystem object

We will create an SRRecognitionSystem object when we initialize our speech recognition application. This object is used to open and close the speech recognition system we will be using.
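The open/close pairing described above can be sketched as follows. This is only an illustration, not the article's library code: the helper names OpenSpeechSystem and CloseSpeechSystem are hypothetical, and the small SR* stand-ins at the top exist only so the sketch compiles without the Carbon headers. In a real Carbon project you would include the Carbon headers and delete the stand-ins.

```c
#include <stddef.h>

/* Stand-ins (assumption): minimal substitutes for the Carbon declarations,
   present only so this sketch compiles on its own. */
typedef short OSErr;
typedef int OSStatus;
typedef unsigned int OSType;
typedef struct OpaqueSRSpeechObject { int isOpen; } *SRSpeechObject;
typedef SRSpeechObject SRRecognitionSystem;
enum { noErr = 0 };
enum { kSRDefaultRecognitionSystemID = 0 };

static struct OpaqueSRSpeechObject gStubSystem;

static OSErr SROpenRecognitionSystem(SRRecognitionSystem *system, OSType systemID)
{
    (void)systemID;              /* the stand-in ignores the system ID */
    gStubSystem.isOpen = 1;
    *system = &gStubSystem;
    return noErr;
}

static OSErr SRCloseRecognitionSystem(SRRecognitionSystem system)
{
    system->isOpen = 0;
    return noErr;
}

/* Application side: one recognition system object for the whole program. */
static SRRecognitionSystem objSRsystem = NULL;

/* Hypothetical helper: open the default recognition system at startup. */
static OSStatus OpenSpeechSystem(void)
{
    return SROpenRecognitionSystem(&objSRsystem, kSRDefaultRecognitionSystemID);
}

/* Hypothetical helper: close the system at shutdown, if it was opened. */
static OSStatus CloseSpeechSystem(void)
{
    OSStatus err = noErr;
    if (objSRsystem != NULL) {
        err = SRCloseRecognitionSystem(objSRsystem);
        objSRsystem = NULL;      /* guard against closing twice */
    }
    return err;
}
```

Resetting the variable to NULL after closing makes a second call to the close helper harmless, which is convenient when several exit paths all perform cleanup.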

SRRecognizer object

The SRRecognizer object is the attentive component of our speech application. When a user speaks into the microphone, this object listens for an utterance. The utterance is processed and the SRRecognizer determines if it is recognizable or gibberish. The result is processed and then sent to our application.

SRLanguageObject object

Our speech application must be told which words, phrases, and complex phrases to listen for. The list of items we wish to listen for is maintained by an SRLanguageObject instance. The items in this list are instances of its subclasses: SRWord for a word, SRPhrase for a phrase, and SRLanguageModel for a complex phrase.
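To picture how these items nest, here is a toy model in plain C. It is not the Speech Recognition Manager API: every type and function name below (ToyKind, ToyLanguageObject, ToyCountWords, ToyExampleWordCount) is hypothetical and exists only to illustrate the idea that a language model holds words and phrases, and a phrase holds words.

```c
#include <stddef.h>

/* Toy model only: these types are hypothetical and are not the SR API. */
typedef enum { kToyWord, kToyPhrase, kToyLanguageModel } ToyKind;

typedef struct ToyLanguageObject {
    ToyKind kind;
    const char *text;                                 /* spelling, words only */
    const struct ToyLanguageObject *const *children;  /* sub-items, NULL for words */
    size_t childCount;
} ToyLanguageObject;

/* Counts the individual words reachable from any language object. */
static size_t ToyCountWords(const ToyLanguageObject *obj)
{
    size_t i, total = 0;
    if (obj == NULL)
        return 0;
    if (obj->kind == kToyWord)
        return 1;
    for (i = 0; i < obj->childCount; i++)
        total += ToyCountWords(obj->children[i]);
    return total;
}

/* Example: a model that listens for "open" and the phrase "close window". */
static size_t ToyExampleWordCount(void)
{
    static const ToyLanguageObject openWord   = { kToyWord, "open",   NULL, 0 };
    static const ToyLanguageObject closeWord  = { kToyWord, "close",  NULL, 0 };
    static const ToyLanguageObject windowWord = { kToyWord, "window", NULL, 0 };
    static const ToyLanguageObject *const phraseWords[] = { &closeWord, &windowWord };
    static const ToyLanguageObject closeWindow = { kToyPhrase, NULL, phraseWords, 2 };
    static const ToyLanguageObject *const modelItems[] = { &openWord, &closeWindow };
    static const ToyLanguageObject model = { kToyLanguageModel, NULL, modelItems, 2 };
    return ToyCountWords(&model);
}
```

The example model contains one stand-alone word and one two-word phrase, so walking it visits three words in total.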

Using the SpeechRecognitionManager API

The example code provided is intended to be portable to Mac OS X and uses the Carbon Events calling conventions. You can still develop code for Carbon on Mac OS 9 with the Carbon SDK. Using the CarbonStub9 library, Carbon applications can run under Mac OS 9. For simplicity, I merely made a copy of the BasicCarbEvents project, found in the Sample Code folder, and inserted my own code. The Speech Recognition Manager code is built around this example. Figure 3 shows what my project looks like in Metrowerks CodeWarrior 5.

Figure 3. Using the BasicCarbEvents project.

Initializing the Speech Manager

The Speech Manager API, as it is currently available for Mac OS 9, is portable to Mac OS X. Preliminary documentation on the Speech Manager API for Carbon is available in HTML format. The initialization of the SRRecognitionSystem and SRRecognizer objects is therefore much the same as it appears in the original pre-Carbon Speech Manager PDF document I referenced earlier. To create a SpeechObject instance in our application, we have the following code:

OSStatus InitSpeech (void)
{
  OSStatus err = kBadSRMVersion;
  long currVersion;
  short feedback = kSRHasFeedbackHasListenModes;

  /* check for valid Speech Recognition Manager version */
  err = Gestalt(gestaltSpeechRecognitionVersion, &currVersion);
  if (!err)
    if (currVersion < kMinSRVersion)
      return kBadSRMVersion;

  /* instantiate the SR system object */
  if (!err)
    err = SROpenRecognitionSystem(&objSRsystem, kSRDefaultRecognitionSystemID);

  /* use standard feedback window and listening modes */
  if (!err)
    err = SRSetProperty(objSRsystem, kSRFeedbackAndListeningModes,
                        &feedback, sizeof(feedback));

  /* instantiate a speech recognizer object */
  if (!err)
    err = SRNewRecognizer(objSRsystem, &objSRrecognizer,
                          kSRDefaultSpeechSource);

  return err;
}

The InitSpeech function checks that a valid version of the Speech Recognition Manager is installed in the operating system. If the version is 1.5 or greater, the code then creates a SpeechObject instance with a call to SROpenRecognitionSystem. My code uses the variable objSRsystem for the SRRecognitionSystem object and objSRrecognizer for the SRRecognizer object. Refer to the SpeechRecLib.c file for the other variable declarations.
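The version test itself is a plain integer comparison. The sketch below shows one way it could work, assuming the Gestalt response packs the version as hexadecimal digits, so that version 1.5 is reported as 0x0150. That encoding, the value given to kMinSRVersion here, and the helper names are assumptions for illustration; check SpeechRecLib.c and the Speech Recognition Manager documentation for the actual values.

```c
/* Assumption: version 1.5 is reported as 0x0150 (major in the high byte,
   minor in the next hex digit). */
enum { kMinSRVersion = 0x0150 };

/* Hypothetical helper: 1 if the reported version meets the minimum, else 0. */
static int SRVersionIsSupported(long reportedVersion)
{
    return reportedVersion >= kMinSRVersion;
}

/* Hypothetical helpers: pull the major and minor parts out of the value,
   under the same encoding assumption. */
static long SRVersionMajor(long reportedVersion)
{
    return (reportedVersion >> 8) & 0xFF;
}

static long SRVersionMinor(long reportedVersion)
{
    return (reportedVersion >> 4) & 0x0F;
}
```

Under this assumption a reported value of 0x0150 passes the check, while 0x0140 (version 1.4) does not.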

The header file SpeechRecLib.h contains no variable declarations, only the prototypes for the functions exported by the library, which is a simple way to protect the variables. When our application calls InitSpeech, the call to SRSetProperty tells the Speech Recognition Manager which listening mode we want and that we will be using a feedback window. This is all we need to do to initialize the Speech Recognition Manager toolbox. Now we'll need to teach our SpeechObject which words to listen for.
