If you want to know something, just ask

Written by: Jag Padala (CTO – Tagnos)

Our interactions with computer systems have evolved considerably over the years. Starting with punch cards and keyboards and moving on to touch and chat, it has become easier and more intuitive to interact with systems.

The next frontier for the interface may well be voice. Although a number of implementations have sprung up over the last few years, the one with arguably the friendliest API and the easiest path to implementation is Amazon Alexa.

When we were discussing operating room scenarios with our clients, there were several instances where staff were required to interact with a screen or keyboard to enter vital information about a procedure, or sometimes just status updates that would later help them track important timelines within the procedure, such as when the first cut was made or when they started to close after a procedure.

The life of an ER nurse is similarly crammed with activity, and time spent on busy computer screens is a distraction when patients are waiting for critical, time-sensitive care.

These kinds of conversations first motivated us to test the possibility of integrating Alexa or a similar system into our software. Although plenty of voice recognition software was available, there were no easy APIs that we could plug into our system. When we started looking at Alexa, the methodology seemed pretty straightforward.

In the startup world, where you are cash-strapped and always struggling to meet schedules, proof-of-concept projects like these are always a challenge. After a few discussions we decided it would be worthwhile to spend a few days exploring the Alexa interface.

The implementation of the API turned out to be pretty simple. Alexa supports Node.js, Python and Java, and since we live and breathe Java, that was our choice. There were a few kinks: the skill's back end had to be registered in the US East region of AWS, and the Alexa trigger was not visible in other regions. A few Google searches later we were all configured to test out the hello world sample.
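
For a flavor of what that looks like, here is a minimal hello-world style handler, sketched with the Alexa Skills Kit SDK for Java; the class and intent names follow Amazon's sample rather than our own code.

```java
import com.amazon.ask.dispatcher.request.handler.HandlerInput;
import com.amazon.ask.dispatcher.request.handler.RequestHandler;
import com.amazon.ask.model.Response;
import java.util.Optional;
import static com.amazon.ask.request.Predicates.intentName;

// Responds to the HelloWorldIntent with a short spoken phrase.
public class HelloWorldIntentHandler implements RequestHandler {

    @Override
    public boolean canHandle(HandlerInput input) {
        // Claim the request only when Alexa resolved the utterance to this intent.
        return input.matches(intentName("HelloWorldIntent"));
    }

    @Override
    public Optional<Response> handle(HandlerInput input) {
        String speech = "Hello world";
        return input.getResponseBuilder()
                .withSpeech(speech)
                .withSimpleCard("HelloWorld", speech)
                .build();
    }
}
```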

Once we got past the basic testing, we started looking at the use cases we wanted in the proof of concept: someone wanting to locate a tag, or to locate a specific piece of equipment. There could be more use cases in the future, such as an ER director wanting to check the load in the various sections of the ER, but once the basic structure is in place it is easy to set up the more involved cases (see the sketch below).
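
Structurally, each use case becomes its own intent handler, and a single Lambda entry point wires them together. A rough sketch of that wiring, with hypothetical handler class names for the two use cases above:

```java
import com.amazon.ask.Skill;
import com.amazon.ask.SkillStreamHandler;
import com.amazon.ask.Skills;

// AWS Lambda entry point that assembles the individual intent handlers into one skill.
public class TagnosVoiceStreamHandler extends SkillStreamHandler {

    private static Skill buildSkill() {
        return Skills.standard()
                .addRequestHandlers(
                        new LocateTagIntentHandler(),        // "Find tag {Tag}"
                        new LocateEquipmentIntentHandler())  // equipment lookup (hypothetical)
                .build();
    }

    public TagnosVoiceStreamHandler() {
        super(buildSkill());
    }
}
```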

A trip to Fry's later, I was equipped with an Amazon Echo to test out the basics.

Amazon Alexa lets you set up a variety of sample utterances for a query. For example, looking up a tag could be “Find tag {Tag}”, “Locate tag {Tag}” or “Can you find tag {Tag}”. We came up with close to thirty variations of these. Once they were coded in, we turned on the Echo and, to our pleasant surprise, got a response from the system that the tags were in the Pre-Op zone. Alexa comes with an app that shows the exact words Alexa heard and responded to; sometimes “tag” was heard as “tad”, so we added cases like “Find tad {Tag}”. Once those were coded, we had our end-to-end tests completed.
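
For the curious, this is roughly what the tag-lookup handler looks like in the Java SDK. The intent name LocateTagIntent, the slot name Tag and the findZone stub are illustrative stand-ins, not our production code:

```java
import com.amazon.ask.dispatcher.request.handler.HandlerInput;
import com.amazon.ask.dispatcher.request.handler.RequestHandler;
import com.amazon.ask.model.IntentRequest;
import com.amazon.ask.model.Response;
import com.amazon.ask.model.Slot;
import java.util.Optional;
import static com.amazon.ask.request.Predicates.intentName;

// Handles utterances like "Find tag {Tag}" and answers with the tag's zone.
public class LocateTagIntentHandler implements RequestHandler {

    @Override
    public boolean canHandle(HandlerInput input) {
        return input.matches(intentName("LocateTagIntent"));
    }

    @Override
    public Optional<Response> handle(HandlerInput input) {
        IntentRequest request = (IntentRequest) input.getRequestEnvelope().getRequest();

        // The value Alexa heard for the {Tag} slot, e.g. "42" from "Find tag 42".
        Slot tagSlot = request.getIntent().getSlots().get("Tag");
        String tagId = (tagSlot != null) ? tagSlot.getValue() : null;

        String speech = (tagId != null)
                ? "Tag " + tagId + " is in the " + findZone(tagId) + " zone."
                : "Sorry, I could not find that tag.";

        return input.getResponseBuilder()
                .withSpeech(speech)
                .withSimpleCard("Tag Locator", speech)
                .build();
    }

    // Stand-in for the real query against our tracking system.
    private String findZone(String tagId) {
        return "Pre-Op";
    }
}
```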

The more I play around with this, the more I am convinced that this will be the favored means of interaction with computer systems. These are early days, and there are a few kinks: security is not built in, so it needs to be added by the developers.
