Writing Python Macros
Last Modified: October 14, 2000
(Editor's Note: Since I originally wrote this topic, I have prepared a PowerPoint presentation and additional sample programs which should also help explain how to use NatLink to write command and control macros. See NatLinkTalk.ppt for more details.)
The initial reason for developing the NatLink system was to create a more powerful command and control subsystem for Dragon NaturallySpeaking. By using Python as the macro language (instead of the built-in Basic subset), and by exposing more grammar constructs, it would be possible to implement more flexible voice commands for any version of Dragon NaturallySpeaking.
To use Python as a macro language for Dragon NaturallySpeaking, you have to enable it using the EnableNL program. Once enabled, the NatLink subsystem will be automatically loaded whenever Dragon NaturallySpeaking starts.
NatLink uses grammar files, written in Python, which tell NatSpeak what to listen for and what to do when those grammars are recognized. I included three sample grammar files with the NatLink distribution in the SampleMacros subdirectory (sleeping.py is not really a grammar file since it has no grammars).
How It Works
- When Dragon NaturallySpeaking starts, it will load natlink.dll. This is done because natlink.dll is registered as a COM server and its GUID is listed in the Dragon NaturallySpeaking registry section.
- When the natlink.dll is loaded, it launches the Python interpreter and loads the natlinkmain.py Python file. From this point on, most of the work is done in natlinkmain.
- Natlinkmain is designed to load grammar files automatically according to the following rules:
  - When NatSpeak starts, any Python files which begin with an underscore in the same directory as natlinkmain are loaded. These are global grammars.
  - Also when NatSpeak starts, any Python files which begin with an underscore and are stored in the same directory as the user's speech files are loaded. These are user-specific global grammars.
  - Whenever you turn on the microphone, natlinkmain loads any Python module with the same name as the currently active Windows module. It looks both in the natlinkmain directory and in the directory where the user's speech files are stored. These are application-specific grammars.
  - Whenever the current user changes, any previously loaded user-specific grammars (global and application-specific) are unloaded. Then the global and application-specific grammars from the new user's directory are loaded.
- Unlike the built-in command and control subsystem, natlinkmain will detect when any grammar file it is currently using changes on disk. This test is made when the microphone is turned on and every time you begin speaking. If the Python grammar file on disk changes, it is reloaded.
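The loading rules above can be sketched as a small file filter. This is only an illustration, not the code in natlinkmain.py: the function name select_grammar_files, the sample file names, and the case-insensitive comparison of module names are all assumptions.

```python
import os


def select_grammar_files(filenames, active_module):
    """Split candidate files into global and application-specific grammars.

    filenames: names found in one grammar directory (hypothetical input).
    active_module: base name of the active Windows module, e.g. "notepad".
    """
    global_files = []
    app_files = []
    for name in sorted(filenames):
        base, ext = os.path.splitext(name)
        if ext.lower() != ".py":
            continue  # only Python files are considered
        if base.startswith("_"):
            # Leading underscore: loaded when NatSpeak starts.
            global_files.append(name)
        elif base.lower() == active_module.lower():
            # Same name as the active module: loaded when the mic turns on.
            app_files.append(name)
    return global_files, app_files


# Made-up directory listing, used only for this example.
files = ["_general.py", "_dictation.py", "notepad.py", "winword.py", "readme.txt"]
global_files, app_files = select_grammar_files(files, "notepad")
print(global_files)
print(app_files)
```

The same filter would be applied twice in practice, once to the natlinkmain directory and once to the directory holding the current user's speech files.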
The actual work is done in each grammar file itself. Natlinkmain does not assume that the grammar file does anything except to expose the required unload function (which is called when that grammar file is unloaded). However, I have come up with a standard grammar file based on a base class defined in natlinkutils.py. Using this standard base class, grammar files work as follows.
- When a grammar file is loaded, it creates an instance of a class derived from GrammarBase (defined in natlinkutils.py).
- The grammar file then calls a member function called initialize(). In initialize(), the grammar class compiles and loads the grammar specification into Dragon NaturallySpeaking. (By convention, the grammar specification is defined as a string in the grammar class.) Then the grammar class enables the grammar for recognition and returns.
- When recognition occurs, Dragon NaturallySpeaking calls back into code inside NatLink. This, in turn, causes Python code to be executed for the grammar which was recognized. The GrammarBase class gets the callback and decodes the results.
- GrammarBase will call member functions of your grammar class by name, based on the rules which were recognized. So if you define a rule such as "<micoff> = microphone off", a member function called gotResults_micoff() is called when that rule is recognized.
- Your grammar's callback function can then take any action it wants. Common actions involve calling natlink.playString, which types keystrokes, and natlink.execScript, which runs arbitrary NatSpeak scripting language commands.
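The life cycle described in the list above can be imitated without NatSpeak installed. The sketch below uses MiniGrammarBase, a hypothetical stand-in written for this example. It is not the GrammarBase class from natlinkutils.py, and its dispatch() method and one-argument callbacks are assumptions made to keep the example short. The grammar string uses the simplified notation from this page and is stored, not parsed.

```python
class MiniGrammarBase:
    """Hypothetical stand-in for GrammarBase; not the natlinkutils class."""

    def __init__(self):
        self.loaded = False
        self.gram_spec = None

    def load(self, gram_spec):
        # The real class compiles the specification and hands it to NatSpeak.
        self.gram_spec = gram_spec
        self.loaded = True

    def unload(self):
        self.loaded = False

    def dispatch(self, rule_name, words):
        # Find the handler by name, mirroring the gotResults_XXX convention.
        handler = getattr(self, "gotResults_" + rule_name, None)
        if handler is not None:
            handler(words)


class MicOffGrammar(MiniGrammarBase):
    # By convention, the grammar specification is a string in the class.
    gramSpec = "<micoff> = microphone off"

    def __init__(self):
        super().__init__()
        self.actions = []

    def initialize(self):
        self.load(self.gramSpec)

    def gotResults_micoff(self, words):
        # A real macro might call natlink.playString or natlink.execScript here.
        self.actions.append("mic off after: " + " ".join(words))


# Module-level code runs when the grammar file is loaded.
thisGrammar = MicOffGrammar()
thisGrammar.initialize()


def unload():
    """The one function the loader requires from every grammar file."""
    global thisGrammar
    if thisGrammar is not None:
        thisGrammar.unload()
    thisGrammar = None


# Simulate NatSpeak recognizing the <micoff> rule.
thisGrammar.dispatch("micoff", ["microphone", "off"])
print(thisGrammar.actions)
```

Looking up handlers by name means that adding a rule to the grammar only requires adding one matching method to the class.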
About Grammars and Results
Most of the important points are documented in the various Python files included with NatLink. Look in natlink.txt for the list of functions available in the NatLink module itself. Natlinkutils.py describes how the GrammarBase class works and what additional helper functions are defined. Finally, grammarparser.py documents the syntax of grammars.
Grammars in NatLink talk directly to the low-level SAPI interfaces. Thus, they can be much more expressive than the grammars you can write using the built-in command and control subsystem of Dragon NaturallySpeaking. For example, you can define rules which reference other rules, rules which include optional words, and rules which include repeated phrases.
When a grammar is recognized, Dragon NaturallySpeaking returns information about which words were recognized and which rules those words came from. The GrammarBase base class uses this information to decide what functions to call.
Warning: the callback mechanism is somewhat limited. You will not be told every rule which was recognized, only the innermost rule for each word. As an example, assume the following grammar:
<start> = <rule1> <rule2>
<rule1> = this is
<rule2> = a test
If the rule named "start" is activated and the utterance "this is a test" is recognized, NatLink will indicate that it heard the two words "this is" from rule1 and the two words "a test" from rule2. There will be no indication that the enclosing rule (start) was involved. This is a minor inconvenience but it means that you may have to be clever about your rule design.
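A toy model can make this limitation concrete. The sketch below is not how NatLink parses grammars. The helper attribute_words is hypothetical and handles only plain word sequences and rule references. It tags each word with the innermost rule that contains it.

```python
def attribute_words(grammar, rule):
    """Return (word, innermost rule) pairs for a sequence-only grammar."""
    pairs = []
    for element in grammar[rule]:
        if element.startswith("<") and element.endswith(">"):
            # A rule reference: the words belong to the referenced rule.
            pairs.extend(attribute_words(grammar, element[1:-1]))
        else:
            # A literal word: it belongs to the rule that spells it out.
            pairs.append((element, rule))
    return pairs


# The grammar from the text, written as a dictionary for this example.
grammar = {
    "start": ["<rule1>", "<rule2>"],
    "rule1": ["this", "is"],
    "rule2": ["a", "test"],
}
pairs = attribute_words(grammar, "start")
rules_seen = {rule for _, rule in pairs}
print(pairs)
print("start" in rules_seen)
```

The enclosing rule "start" never appears in the result because it contributes no words of its own. One way around this is to give an outer rule at least one literal word.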
GrammarBase makes the following callbacks when a recognition occurs:
- gotResultsInit(). This function is passed all the information about the recognition. It is a good place to initialize any variables you might use when parsing the words.
- gotResults_XXX() where XXX is a rule name. This function is called for every word involved in the recognition. However, to be more efficient, when there is a contiguous sequence of words from the same rule, only one call is made to gotResults_XXX() for that rule. Consider the following example:
<start> = <rule1> <rule2>
<rule1> = this test
<rule2> = and <rule1> works
If the rule "start" is activated and the utterance "this test and this test works" is recognized then the following callbacks are made:
gotResultsInit( ['this','test','and','this','test','works'] )
gotResults_rule1( ['this','test'] )
gotResults_rule2( ['and'] )
gotResults_rule1( ['this','test'] )
gotResults_rule2( ['works'] )
gotResults( ['this','test','and','this','test','works'] )
- gotResults(). This function is called at the end and is passed every word recognized in the recognition.
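The grouping of contiguous words can be reproduced with a short simulation. It illustrates the calling order only and assumes the recognition arrives as (word, rule) pairs. The helper callback_sequence is hypothetical, and the real callbacks in GrammarBase may receive additional arguments.

```python
from itertools import groupby


def callback_sequence(pairs):
    """List the callbacks, in order, for one recognition.

    pairs: (word, innermost rule) tuples in the order the words were spoken.
    """
    all_words = [word for word, _ in pairs]
    calls = [("gotResultsInit", all_words)]
    # groupby merges only adjacent items, so a rule that is heard twice with
    # another rule in between produces two separate callbacks.
    for rule, group in groupby(pairs, key=lambda pair: pair[1]):
        calls.append(("gotResults_" + rule, [word for word, _ in group]))
    calls.append(("gotResults", all_words))
    return calls


# "this test and this test works", tagged as in the example above.
pairs = [
    ("this", "rule1"), ("test", "rule1"),
    ("and", "rule2"),
    ("this", "rule1"), ("test", "rule1"),
    ("works", "rule2"),
]
for name, words in callback_sequence(pairs):
    print(name, words)
```

Because rule2 is split by the embedded rule1, its callback fires twice, so a handler for rule2 should not assume it sees all of its words in one call.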
Dragon NaturallySpeaking actually creates a results object for every recognition. This results object is turned into a Python instance of the ResObj class. Normally you can ignore this class instance in your grammar files. However, it is possible to keep this results object and query it later for the entire choice list or use it for training. (See trainUser.py for examples of this.)
Writing Grammar Files
Copy one of the sample grammar files into the MacroSystem directory and see how it works. (Note: although natlinkmain will reload any grammar file which changes while NatSpeak is running, it will only see new grammar files when NatSpeak first starts.)
Now modify the grammar file to get a feel for what types of grammars you can define and what kinds of callback you get when they are recognized.
There is a hidden window in the NatLink subsystem which can be used for debugging. This window will automatically appear in the upper left-hand corner of the screen if Python writes to either stdout or stderr. This means that any errors in your grammar files will not crash NatLink, but will be reported in this display window. You can also use print statements in your Python macros to help development. The printed text will be shown in the display window.
Remember, if you change a grammar file on disk, that grammar file will be automatically reloaded by NatLink when you toggle the microphone on and off. This is extremely useful for testing new grammar files.
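One plausible way to implement this kind of reload check is to compare file modification times. The sketch below assumes that approach. GrammarWatcher is a hypothetical class and does not reproduce the logic in natlinkmain.py.

```python
import os
import tempfile


class GrammarWatcher:
    """Hypothetical helper that remembers each file's modification time."""

    def __init__(self):
        self._seen = {}

    def changed(self, path):
        """Return True if path was seen before and has been modified since."""
        mtime = os.stat(path).st_mtime_ns
        previous = self._seen.get(path)
        self._seen[path] = mtime
        return previous is not None and previous != mtime


def demo():
    """Create a throwaway grammar file, touch it, and report each check."""
    results = []
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "_demo.py")
        with open(path, "w") as handle:
            handle.write("# placeholder grammar file\n")
        watcher = GrammarWatcher()
        results.append(watcher.changed(path))  # first sighting
        results.append(watcher.changed(path))  # nothing changed
        info = os.stat(path)
        # Push the modification time forward to simulate an edit on disk.
        os.utime(path, ns=(info.st_atime_ns, info.st_mtime_ns + 2_000_000_000))
        results.append(watcher.changed(path))  # modified since last check
    return results


print(demo())
```

Only the last check reports a change. A loader built this way would run the check when the microphone is turned on and at the start of each utterance, then reload any file for which it returns True.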
This web page (http://www.synapseadaptive.com/joel/WritingPythonMacros.html) was last modified on October 14, 2000.
The contents of this page are (c) Copyright 1998-1999 by Joel Gould. All Rights Reserved.
See Copyright Information for more details.