Monday, 28 May 2012

Android Voice Recognition in Appcelerator Titanium

November 19th, 2011 // 11:32 am @ matt

One of my most recent tasks at Food on the Table was to implement speech recognition for our meal planning app's grocery list on Android. Now that the first version of the feature is released, the fully featured product allows users to quickly add items just by speaking into their phone. Not only that, but the app knows to separate items whenever you say "and" in between (e.g. "bacon and eggs and cheese" gets added to your list as the separate items "bacon", "eggs", and "cheese"), determines which department each item should be categorized into, and automatically appends sale data based on whether or not the item is on sale where you shop.
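As a rough illustration of the "and" splitting described above, here is a minimal sketch in plain Java. It is not the app's actual code: the class name GroceryPhraseSplitter and the method splitItems are hypothetical, and it assumes that splitting on the standalone word "and" is all that is needed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only (not part of the app or the module).
final class GroceryPhraseSplitter {

    // Splits a spoken phrase on the standalone word "and" (case-insensitive).
    // Requiring whitespace on both sides keeps words such as "candy" intact.
    static List<String> splitItems(String phrase) {
        List<String> items = new ArrayList<>();
        if (phrase == null) {
            return items;
        }
        for (String part : phrase.trim().split("(?i)\\s+and\\s+")) {
            String item = part.trim();
            if (!item.isEmpty()) {
                items.add(item);
            }
        }
        return items;
    }

    public static void main(String[] args) {
        // prints [bacon, eggs, cheese]
        System.out.println(splitItems("bacon and eggs and cheese"));
    }
}
```

A real implementation would likely need more care (items such as "mac and cheese" are a single product), so treat this only as a starting point.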

Most of the backend for this was already in place from our earlier work on other features. The one new piece needed to get this working was the voice recognition itself. Fortunately, Android 2.1+ comes with speech recognition capabilities via the RecognizerIntent.

The basic logic is as follows:

  1. Create an Intent that calls the speech recognizer action,
  2. Pass various parameters, such as the user prompt string and max number of desired results,
  3. Accept the response in the form of an array of strings, where the first array item is most likely to correctly match what the user said.

Because we're using Appcelerator Titanium for our mobile development, all our code is built using JavaScript with wrappers and proxies that act as interfaces to the raw Java Android code. This also means that some of the work is simplified, such as a callback method being directly passed to an intent that returns a result, as opposed to defining a special method that handles all results as is typically done on Android.

Unfortunately, at times it becomes clear that working with Titanium is going to take a bit of extra work compared to coding natively for Android.

As I mentioned above, when Android's speech recognition Intent is successful, it returns an array of text strings that are most likely to match what the user said. To access this array of strings, there is a method on the Intent class named getStringArrayListExtra. The problem is that Titanium's IntentProxy does not provide a wrapper for this method. As such, the temporary solution until it is included in Titanium Mobile core is to create a custom module that acts as a middleman between the JavaScript and Android layers.

Implementing this took some time, as it was my first time working with Titanium modules, but it turned out to be relatively simple. Since extending Titanium's core implementation of IntentProxy into a subclass won't work, I had to essentially create a helper class that accepts the IntentProxy and the parameters required for the method, calling the actual method on the Intent given those parameters.

    @Kroll.method
    public String[] getStringArrayListExtra(final IntentProxy intentProxy, final String name) {
        if (DBG) {
            Log.d(LCAT, "getStringArrayListExtra called with name:" + name);
        }

        final ArrayList<String> list = intentProxy.getIntent().getStringArrayListExtra(name);
        return list != null ? list.toArray(new String[list.size()]) : null;
    }

Let's break down what's happening here. First, the @Kroll.method is a simple annotation used by Titanium to say "this method should be available when accessed through the JavaScript." If the annotation weren't included, then the method would only be accessible by other Java classes.

Next, the DBG and Log.d lines simply create an entry in the Android log for debugging purposes. This doesn't affect the outcome of the function at all; it is only there to help anyone debugging their code. Removing these lines would have no effect on the core functionality.

On the next line, where I define list, I first get the actual Android Intent object from the IntentProxy. Once I have the Intent object, I can call the method I've been trying to reach in the first place, getStringArrayListExtra, passing the name parameter provided to the proxy. This method returns an ArrayList of String objects.

Finally, we want to return a simple array of String objects from our new method. The reason I chose a string array is that Titanium already has the built-in logic to convert a Java array into an array object that JavaScript understands. To return a string array, we need to convert the ArrayList into an array. Fortunately, this can easily be done by calling the toArray method on the ArrayList object that we already have. (If the list is null, then we simply return null instead.)
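The conversion step can be tried in isolation, without Titanium or Android on the classpath. The sketch below uses a hypothetical class name (ListToArrayExample) and a hypothetical helper (toStringArray) that mirrors only the last line of the module method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical standalone demo of the ArrayList-to-array step used in the module.
final class ListToArrayExample {

    // Returns the list's contents as a String[], or null when the list is null,
    // matching the null handling in getStringArrayListExtra.
    static String[] toStringArray(List<String> list) {
        return list != null ? list.toArray(new String[list.size()]) : null;
    }

    public static void main(String[] args) {
        List<String> results = new ArrayList<>(Arrays.asList("bacon", "eggs", "cheese"));
        // prints [bacon, eggs, cheese]
        System.out.println(Arrays.toString(toStringArray(results)));
        // prints null
        System.out.println(Arrays.toString(toStringArray(null)));
    }
}
```

Passing a typed array to toArray is what makes the result a String[] rather than an Object[].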

Great! So now it's time to compile, implement, and test out our module. Throwing a little speech recognition code together, it quickly becomes clear that there's another problem. As it turns out, Titanium does not properly handle putExtra for integer values. As such, we'll need to add a new method called putIntExtra to our custom Intent module to perform this task.

    @Kroll.method
    public void putIntExtra(final IntentProxy intentProxy, final String key, final Object value) {
        if (DBG) {
            Log.d(LCAT, "putIntExtra called with key:\"" + key + "\", value:" + value);
        }

        if (value instanceof Integer) {
            // Already an Integer (e.g. when called from Java); unboxing handles the rest.
            intentProxy.getIntent().putExtra(key, (Integer) value);
        } else {
            // Values from JavaScript may arrive as "1.0", so parse as a double and truncate.
            intentProxy.getIntent().putExtra(key, (int) Double.parseDouble(value.toString()));
        }
    }

Breaking this down, you can see that it is similar in structure to the first module method I described. First, we accept the IntentProxy object along with the desired key and value for the extra data. Note that we accept the value as an Object, since JavaScript is loosely typed.

Next, we provide the same basic debug logging as mentioned before.

Now we get to the actual heart of what's happening. Even though I don't do so myself, this code can technically be called from Java itself. As such, I perform a preliminary check to see if the value is already an Integer instance. If so, then we don't perform any magic, and simply call putExtra on the intent. (Java's automatic unboxing handles the Integer to int conversion.) However, if the object is not already an Integer, then we need to convert it to one.

At first, I tried to use the Integer.parseInt method to convert the string representation of the object. However, it turns out that the object was being passed in as "1.0", and the decimal point was causing an exception to be thrown when this method was called. As such, the solution was to parse the string as a double, then cast that to an int. The cast simply discards anything after the decimal point, meaning that a value of 1.9999 would become 1. I decided this is acceptable since the method is only intended to receive integer values in the first place.
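The parsing behavior described above is easy to confirm in plain Java. In this sketch, the class IntCoercionExample and its methods toInt and parseIntAccepts are hypothetical names introduced for illustration; toInt mirrors the conversion logic inside putIntExtra without the Intent.

```java
// Hypothetical standalone demo of the number handling in putIntExtra.
final class IntCoercionExample {

    // Mirrors the module logic: use an Integer as-is, otherwise parse the
    // string form as a double and truncate toward zero with an int cast.
    static int toInt(Object value) {
        if (value instanceof Integer) {
            return (Integer) value; // automatic unboxing
        }
        return (int) Double.parseDouble(value.toString());
    }

    // Reports whether Integer.parseInt accepts the text without throwing.
    static boolean parseIntAccepts(String text) {
        try {
            Integer.parseInt(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseIntAccepts("1.0")); // prints false
        System.out.println(toInt("1.0"));           // prints 1
        System.out.println(toInt(1.9999));          // prints 1
        System.out.println(toInt(5));               // prints 5
    }
}
```

If rounding were preferred over truncation, Math.round could be swapped in, but for values that are meant to be whole numbers the cast is enough.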

Finally, after recompiling, reimplementing, and retesting the code, we find that speech recognition now works. The Titanium JavaScript code receives an array of strings when our RecognizerIntent finishes successfully, and we can use these strings to populate our grocery list items for the user.

Here's an example of the Titanium code used to perform the speech recognition through our new module.

    var intentModule = require("com.foodonthetable.intent");

    var intent = Ti.Android.createIntent({
      action: "android.speech.action.RECOGNIZE_SPEECH"
    });

    intent.putExtra("calling_package", "com.foodonthetable.mobile");
    intent.putExtra("android.speech.extra.PROMPT", "Start speaking...");
    intent.putExtra("android.speech.extra.LANGUAGE_MODEL", "free_form");
    // Integer extras go through the custom module
    intentModule.putIntExtra(intent, "android.speech.extra.MAX_RESULTS", 5);

    // For retrieving the array list
    Ti.Android.currentActivity.startActivityForResult(intent, function(event) {
      if (event.resultCode == Ti.Android.RESULT_OK) {
        var intent = event.intent;
        var results = intentModule.getStringArrayListExtra(intent, "android.speech.extra.RESULTS");

        if (results && results.length) {
          console.log("There were " + results.length + " results");
          // Indexed loop: for...in would also visit any non-index properties
          for (var i = 0; i < results.length; i++) {
            console.log("Result " + i + " = " + results[i]);
          }
        } else {
          console.log("No results");
        }
      }
    });

Feel free to view or modify the full source code on Github. If you have any suggestions for improvement or actual code changes, feel free to contact me or fork my code and submit a pull request. Also, check out the Android meal planning app that's using this feature for a full demonstration.
