OSCAR, the hangout robot

A Robot in a Hangout

I have been having fun recently playing around with hangouts and I thought it would be nifty to create a robot that can join and be controlled from within a hangout. Rather than wait for someone else to do it, I took an existing project for controlling a Roomba that I had already (based on the Maya Telepresence Robot / Johnny Lee robot) and implemented a simple way for controlling the robot from the hangout. This post will go over the basics of how I did this.

Feel free to download the Roomba robot web server / controller and hangout XML if you want to hack on the code I have already written.

How it all fits together

Rather than start looking at the pieces, let’s take a look at how the whole thing works together. The idea is extremely simple:

  1. A web server is running on the robot that takes commands (http://baseURL/command) that will control the robot
  2. A hangout extension is created to control the robot
  3. Participants join the hangout which manages queries to the robot web server
  4. The robot is joined to the hangout so you can see what the robot sees

I made a simple diagram to outline this:

As you can see, there’s nothing too complex going on. Most of the difficult components (a controller for the robot, a simple web server, and provisions for the hangout such as shared state) are easy to create with great software and services that already exist. Let’s take a closer look at the various components that make up the robot and the hangout extension.

OSCAR, the robot

The robot, a Roomba, is conveniently controllable from a computer through a serial port. Others have written .NET wrappers that make it super easy to control the robot, and I naturally started from Johnny Lee’s version. On top of the wrappers, I built a very simple HTTP server. The following code summarizes how the web server works and gives you a brief overview of how I’m connecting the web server to the robot.

Before we can handle any requests, we must configure the robot controller as well as add basic handlers for sensor updates.  The following code shows how I use the .NET wrapper to do this:

            private void OnWheelDropChanged(object sender, EventArgs e)
            {
                if (!robot.prevSensorState.WheelDropLeft && robot.sensorState.WheelDropLeft)
                    Console.WriteLine("wheel left dropped!");
                if (!robot.prevSensorState.WheelDropRight && robot.sensorState.WheelDropRight)
                    Console.WriteLine("wheel right dropped!");
                if (!robot.prevSensorState.WheelDropCaster && robot.sensorState.WheelDropCaster)
                    Console.WriteLine("wheel caster dropped!");
            }

            private void OnCliffDetectChanged(object sender, EventArgs e)
            {
                if (!robot.prevSensorState.CliffLeft && robot.sensorState.CliffLeft)
                    Console.WriteLine("cliff left!");
                if (!robot.prevSensorState.CliffFrontLeft && robot.sensorState.CliffFrontLeft)
                    Console.WriteLine("cliff front left!");
                if (!robot.prevSensorState.CliffFrontRight && robot.sensorState.CliffFrontRight)
                    Console.WriteLine("cliff front right!");
                if (!robot.prevSensorState.CliffRight && robot.sensorState.CliffRight)
                    Console.WriteLine("cliff right!");
            }

            private void OnBumperChanged(object sender, EventArgs e)
            {
                if (!robot.prevSensorState.BumpLeft && robot.sensorState.BumpLeft)
                    Console.WriteLine("bump left!");
                if (!robot.prevSensorState.BumpRight && robot.sensorState.BumpRight)
                    Console.WriteLine("bump right!");

            }

            private void OnSensorUpdate(object sender, EventArgs e)
            {
                string sensortemp = "";
                try
                {

                    sensortemp += "<hr>Bumpers: " + (robot.sensorState.BumpLeft ? 0 : 1) + " " + (robot.sensorState.BumpRight ? 0 : 1);
                    sensortemp += "<hr>WheelDrop: " + (robot.sensorState.WheelDropLeft ? 0 : 1) + " " + (robot.sensorState.WheelDropCaster ? 0 : 1) + " " + (robot.sensorState.WheelDropRight ? 0 : 1);
                    sensortemp += "<hr>Cliff: " + (robot.sensorState.CliffLeft ? 0 : 1) + " " + (robot.sensorState.CliffFrontLeft ? 0 : 1) + " " + (robot.sensorState.CliffFrontRight ? 0 : 1) + " " + (robot.sensorState.CliffRight ? 0 : 1);
                    string state = "Idle";
                    if (robot.sensorState.ChargingState == 1)
                        state = "Reconditioning";
                    if (robot.sensorState.ChargingState == 2)
                        state = "Charging";
                    if (robot.sensorState.ChargingState == 3)
                        state = "Trickle";
                    if (robot.sensorState.ChargingState == 4)
                        state = "Waiting";
                    if (robot.sensorState.ChargingState == 5)
                        state = "Fault";

                    int batteryCharge = (robot.sensorState.BatteryCharge * 100 / robot.sensorState.BatteryCapacity);

                    sensortemp += "<hr>iRobot Battery: " + state + " " + batteryCharge + "% (" + ((robot.sensorState.Current > 0) ? "+" : "") + robot.sensorState.Current / 1000.0 + "A " + robot.sensorState.Voltage / 1000.0 + "V " + robot.sensorState.BatteryTempurature + "C)";
                }
                catch (Exception x)
                {
                    Console.Out.WriteLine(x);
                }
                sensorState = sensortemp;
            }

            public void setupRobot()
            {
                robot = new iRobotCreate();
                robot.OnSensorUpdateRecieved += new iRobotCreate.SensorUpdateHandler(OnSensorUpdate);
                robot.OnBumperChanged += new iRobotCreate.BumperChangedHandler(OnBumperChanged);
                robot.OnCliffDetectChanged += new iRobotCreate.CliffDetectChangedHandler(OnCliffDetectChanged);
                robot.OnWheelDropChanged += new iRobotCreate.WheelDropChangedHandler(OnWheelDropChanged);
            }

            public bool connectRobot(string thePort)
            {                
                if (!robot.connected)
                {
                    if (robot.Connect(thePort))
                    {
                        Console.Out.WriteLine("Connected to iRobot Create");
                    }
                    else
                    {
                        Console.Out.WriteLine("Could not Connect to iRobot Create on port:" + thePort);

                        Console.Out.WriteLine("Please enter another port: ");

                        thePort = Console.ReadLine();
                        // Return here so the start-up calls below run only once.
                        return connectRobot(thePort);
                    }
                }
                robot.StartInFullMode();
                robot.StartSensorStreaming();
                return robot.connected;
            }

Next, we configure a simple web server:

            public void startServer()
            {
                UPnP.OpenFirewallPort("iRobot", UPnP.Protocol.TCP, 8080);

                string[] prefixes = { "http://+:8080/" };
                // URI prefixes are required,
                // for example "http://contoso.com:8080/index/".
                if (prefixes == null || prefixes.Length == 0)
                    throw new ArgumentException("prefixes");

                // Create a listener.
                listener = new HttpListener();

                // Add the prefixes.
                foreach (string s in prefixes)
                {
                    listener.Prefixes.Add(s);
                }
                listener.Start();
                bool keepListening = true;
                while (keepListening)
                {
                    keepListening = handleRequest(listener);
                    // TODO: display power status
                    // Console.Out.WriteLine();
                }
                listener.Stop();
            }

Finally, we handle requests to the listener that will perform moves based on the path passed to the server:

            public bool handleRequest(HttpListener listener)
            {
                Console.WriteLine("Listening...");
                // Note: The GetContext method blocks while waiting for a request. 
                HttpListenerContext context = listener.GetContext();
                HttpListenerRequest request = context.Request;

                string requestedURL = request.RawUrl.ToString();

                switch (requestedURL)
                {
                    case "/forward":
                        System.Console.WriteLine("Going forward...");
                        robot.DriveDirect(100, 100);
                        Thread.Sleep(1000);
                        robot.DriveDirect(0, 0);
                        break;
                    case "/left":
                        System.Console.WriteLine("Turning left...");
                        robot.DriveDirect(50, -50);
                        Thread.Sleep(1000);
                        robot.DriveDirect(0, 0);
                        break;
                    case "/right":
                        System.Console.WriteLine("Turning right...");
                        robot.DriveDirect(-50, 50);
                        Thread.Sleep(1000);
                        robot.DriveDirect(0, 0);
                        break;
                    case "/back":
                        System.Console.WriteLine("Backing up...");
                        robot.DriveDirect(-100, -100);
                        Thread.Sleep(1000);
                        robot.DriveDirect(0, 0);
                        break;
                    default:
                        System.Console.WriteLine("Unrecognized command: " + requestedURL);
                        break;
                }

                // Obtain a response object.
                HttpListenerResponse response = context.Response;
                // Construct a response.
                string responseString = "<HTML><BODY> ROBOT CONTROL INTERFACE: <br>" + sensorState + "</BODY></HTML>";
                byte[] buffer = System.Text.Encoding.UTF8.GetBytes(responseString);
                // Get a response stream and write the response to it.
                response.ContentLength64 = buffer.Length;
                System.IO.Stream output = response.OutputStream;
                output.Write(buffer, 0, buffer.Length);
                // You must close the output stream.
                output.Close();

                return true;
            }
        }

The actual app that runs to control the robot  is just a console app that configures the robot and then starts the web server:

        static void Main(string[] args)
        {            
            if (!HttpListener.IsSupported)
            {
                Console.WriteLine ("Windows XP SP2 or Server 2003 is required to use the HttpListener class.");
                return;
            }

            welcomeMessage();
            TheServer theServer = new TheServer();
            theServer.startServer();

        }

        static void welcomeMessage()
        {
            System.Console.WriteLine("Firing up the messenger pigeons...");

            //get ip address
            IPAddress[] localIPs = Dns.GetHostAddresses(Dns.GetHostName());
            for (int i = 0; i < localIPs.Length; i++)
            {
                if (localIPs[i].AddressFamily == AddressFamily.InterNetwork)
                {
                    Console.Out.WriteLine("Local IP: " + localIPs[i]);                    
                    break;
                }
            }

            // TODO: display power status
            // Console.Out.WriteLine();
        }

Once it starts successfully, the web server exposes one endpoint per command, and any browser can query them. For example, http://robotservername.com/forward will send the robot forward. Later on, I’ll go into the quick hack that I use to tunnel the robot’s web server to the web.
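As a rough illustration of how a base URL and a command combine into an endpoint, here is a minimal JavaScript sketch. The helper name buildCommandUrl and the COMMANDS list are assumptions made for this example and are not part of the downloadable code; the four command names come from the switch statement in handleRequest.

```javascript
// Hypothetical helper; the names here are illustrative only.
// The four commands match the cases handled by the C# server.
const COMMANDS = ["forward", "back", "left", "right"];

function buildCommandUrl(baseURI, command) {
  if (!COMMANDS.includes(command)) {
    throw new Error("Unknown command: " + command);
  }
  // Drop trailing slashes so the request path is exactly "/forward",
  // which is the string the server's switch statement compares against.
  const trimmed = baseURI.replace(/\/+$/, "");
  return trimmed + "/" + command;
}

console.log(buildCommandUrl("http://robotservername.com/", "forward"));
```

Trimming the trailing slash matters because the server matches the raw path exactly, so a path like "//forward" would fall through to the unrecognized command case.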

The Google+ hangout extension

The hangout extension, written in old school fashion with just a couple of fields and JavaScript, sends queries to the server controlling the robot. It uses getElementById to read the configured base URI, joins it with a known endpoint, and queries the robot by loading the resulting URL in an iframe.

The following XML is based entirely on the hangout boilerplate and just configures the extension:

<?xml version="1.0" encoding="UTF-8" ?>
<Module>
<!-- Licensed under the Apache License, Version 2.0 (the "License"); you may not
 * use this file except in compliance with the License. You may obtain a copy of
 * the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *      
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
 * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
 * License for the specific language governing permissions and limitations under
 * the License
-->
<ModulePrefs title="The classy robot hangout">
  <Require feature="rpc"/>
  <Require feature="views"/>
</ModulePrefs>
<Content type="html"><![CDATA[

The following payload in the XML contains the controller for the robot that will be rendered to the frame for the hangout extension:

    <script src="//hangoutsapi.talkgadget.google.com/hangouts/_/api/hangout.js?v=1.1"></script>

    <script type="text/javascript" language="javascript"> 
    var baseURI = "http://3wp.localtunnel.com:8080";
    var state_ = null;
    var metadata = null;

    function navigate(direction){
      var fullPath = baseURI + "/" + direction;
      document.getElementById("response").src=fullPath;
    }

    function updateLocalBaseURI(){
      console.log("Setting state data");
      baseURI = document.getElementById("baseURI").value;
      gapi.hangout.data.setValue("baseURI", baseURI);
      console.log("State data set");
    }
    </script>

    <div id="sharedEndpoint" style="display:none">
      <center>
      Endpoint base URI: <input type="text" id="baseURI" size="32" value="http://3wap.localtunnel.com"><br/>
      <button onclick="updateLocalBaseURI();">Configure BASE URI</button>
      </center>
    </div>
    <hr/>
    <p>
      <table align="center">
        <tr>
        <th colspan=3>Robot controls</th>
        </tr>
        <tr>
          <td></td><td><center><button onclick='navigate("forward")'>Forward</button></center></td><td></td>
        </tr>
        <tr>
          <td><center><button onclick='navigate("left")'>Left</button></center></td><td></td><td><center><button onclick='navigate("right")'>Right</button></center></td>
        </tr>
        <tr>
          <td></td><td><center><button onclick='navigate("back")'>Back</button></center></td><td></td>
        </tr>
      </table>
    </p>
    <hr/>
    <center>
      <iframe border="0" id="response">
      </iframe>
    </center>
    <script>

    /** Handle messages for data sharing and more!
     * @param {MessageReceivedEvent} event An event.
     */
    function updateBaseURI(event) {
      try {
        console.log("URL changed across the share");
        var newURI = gapi.hangout.data.getValue("baseURI");
        baseURI = newURI;

        // hide the input section after being set initially
        document.getElementById('sharedEndpoint').style.display='none';
      } catch (e) {
      }
    }

    function init() {
      // When API is ready...                                                         
      gapi.hangout.onApiReady.add(
        function(eventObj) {
          if (eventObj.isApiReady) {
            var newURI = gapi.hangout.data.getValue("baseURI");

            if (newURI != null){
              baseURI = newURI;
            }else{
              // show the endpoint thing
              document.getElementById('sharedEndpoint').style.display='block';
            }

            // set the onStateChanged callback to be used for data sharing
            gapi.hangout.data.onStateChanged.add(updateBaseURI);
          }
        });
    }
    gadgets.util.registerOnLoadHandler(init);

    // Keyboard shortcut handler
    document.onkeypress=function(e){
        var e=window.event || e;
        if (e.altKey && e.ctrlKey && e.shiftKey){
          // TODO: This is a workaround for a UI need, let you toggle the URL area after it's set.
          // u character (URL)
          if (e.charCode === 21){
            console.log("reshowing URL area");
            document.getElementById('sharedEndpoint').style.display='block';
          }

          // WASD controls when you hold CTRL+ALT+SHIFT
          // w character
          if (e.charCode === 23){
            console.log("WASD - forward");
            navigate("forward");
          }
          // a character
          if (e.charCode === 1){
            console.log("WASD - left");
            navigate("left");
          }
          // s character
          if (e.charCode === 19){
            console.log("WASD - back");
            navigate("back");
          }
          // d character
          if (e.charCode === 4){
            console.log("WASD - right");
            navigate("right");
          }
        }
    }
    </script>

Of note to hangouts is the following code:

    /** Handle messages for data sharing and more!
     * @param {MessageReceivedEvent} event An event.
     */
    function updateBaseURI(event) {
      try {
        console.log("URL changed across the share");
        var newURI = gapi.hangout.data.getValue("baseURI");
        baseURI = newURI;

        // hide the input section after being set initially
        document.getElementById('sharedEndpoint').style.display='none';
      } catch (e) {
      }
    }

    function init() {
      // When API is ready...                                                         
      gapi.hangout.onApiReady.add(
        function(eventObj) {
          if (eventObj.isApiReady) {
            var newURI = gapi.hangout.data.getValue("baseURI");

            if (newURI != null){
              baseURI = newURI;
            }else{
              // show the endpoint thing
              document.getElementById('sharedEndpoint').style.display='block';
            }

            // set the onStateChanged callback to be used for data sharing
            gapi.hangout.data.onStateChanged.add(updateBaseURI);
          }
        });
    }
    gadgets.util.registerOnLoadHandler(init);

This code uses the Google+ hangout APIs to keep state in sync within the hangout. The event handler, updateBaseURI, reads the configured endpoint from a shared key/value mapping that anyone in the hangout can set, and copies it into the local baseURI variable. The following code runs when a participant submits a new endpoint URL and writes it to the shared state:

    function updateLocalBaseURI(){
      console.log("Setting state data");
      baseURI = document.getElementById("baseURI").value;
      gapi.hangout.data.setValue("baseURI", baseURI);
      console.log("State data set");
    }
    </script>
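To see how the setter and the change handler interact without a running hangout, here is a small sketch that swaps in an in-memory stand-in for the shared state. createSharedState and createParticipant are hypothetical names invented for this example, and the stand-in is not the Hangouts API; it only mimics the three calls used above (getValue, setValue, and onStateChanged.add), with a simplified event object.

```javascript
// In-memory stand-in for the hangout's shared key/value state.
// This is NOT the Hangouts API; it only mimics the calls used in the post.
function createSharedState() {
  const values = {};
  const listeners = [];
  return {
    getValue: (key) => values[key],
    setValue: (key, value) => {
      values[key] = value;
      // Notify every registered listener; the event shape is made up.
      listeners.forEach((listener) => listener({ key: key }));
    },
    onStateChanged: { add: (listener) => listeners.push(listener) },
  };
}

// Each participant keeps a local baseURI and refreshes it on every change,
// which is the role updateBaseURI plays in the extension.
function createParticipant(shared) {
  const participant = { baseURI: null };
  shared.onStateChanged.add(() => {
    participant.baseURI = shared.getValue("baseURI");
  });
  return participant;
}

const shared = createSharedState();
const alice = createParticipant(shared);
const bob = createParticipant(shared);

// One participant configures the endpoint; the other picks it up
// through the change notification.
shared.setValue("baseURI", "http://example.localtunnel.com");
console.log(bob.baseURI);
```

The point of the pattern is that only one person has to type the tunnel address; everyone else receives it through the change notification.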

Bringing it all together

There are still a few issues with how we get this working.  For one, how do we get clients from the global internet to be able to easily connect to the robot?  To solve this, I use a nifty Ruby script, localtunnel, that creates a proxy from the web to the server.  When you run localtunnel, it will proxy a globally accessible web address to the port passed to the script. It works surprisingly well!

Next, how do we map everything from the hangout to that address (which is dynamic)?  This is done using the shared state capabilities of the Google+ hangout.  If you look carefully at the two sections of code I call out at the end of the previous section, you will notice one that sets the shared baseURI and another that updates the locally stored baseURI when the shared value changes.  I also hide the URI field once it’s set to discourage others from accidentally updating it afterwards.

For kicks, I added WASD controls to the robot. For example, holding Ctrl+Alt+Shift and then pressing W moves the robot forward.
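The numbers compared against charCode in the keyboard handler are control codes: with Ctrl held, a browser may report a letter as its position in the alphabet (a is 1, d is 4, s is 19, u is 21, w is 23). Browsers differ in how they report such key combinations, so treat the following as a sketch of the mapping rather than a portable handler. The names controlCode, KEY_COMMANDS, and commandForCharCode are invented for this example, and the commands follow the W/A/S/D labels in the handler's log messages.

```javascript
// Control code for a letter: its position in the alphabet (a = 1).
function controlCode(letter) {
  return letter.toLowerCase().charCodeAt(0) - 96;
}

// Hypothetical lookup table that could replace the chain of if statements.
const KEY_COMMANDS = {
  [controlCode("w")]: "forward",
  [controlCode("a")]: "left",
  [controlCode("s")]: "back",
  [controlCode("d")]: "right",
};

// Returns the robot command for a charCode, or null if the key is unmapped.
function commandForCharCode(charCode) {
  return KEY_COMMANDS[charCode] || null;
}

console.log(controlCode("w"), commandForCharCode(23));
```

A table like this keeps each key and its command on one line, which makes a mismatch between the two easier to spot.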

Improving the system

So that’s all there is to it: a simple web server, a client connected to the hangout, and a basic hangout app that maps the hangout to the robot’s endpoints. As it’s built now, the robot requires some rather expensive components, namely a PC and a Roomba.  Also, the robot needs to be on WiFi in order to function, which limits where it can go.

What would be awesome would be if you were to implement the web server and robot controller on a device (such as a phone) that would be cheaper than the components I reused for this. If you wanted to dive into it, you could start from the serial protocol for the Roomba.  Alternatively, you could approach this from the ground up and then implement the entire robot around such a device.

From a different perspective, what if you wanted to create a new robot that would work similarly with the hangout app? This could be easily done by creating a robot with the same endpoints (baseURL + back/forward/left/right) and then pointing the baseURL of the hangout to those endpoints.  What would be even cooler would be if you implemented a flying robot and then added endpoints for up and down to make the robot go higher or lower.
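As a sketch of what "the same endpoints" could look like on another robot, the following JavaScript maps request paths to wheel speeds. Everything here is an assumption for illustration: the MOVES table, the dispatch function, and the drive(left, right) interface are invented names, and the speed values simply mirror the DriveDirect calls in the C# server.

```javascript
// Hypothetical command table for a robot that exposes the same endpoints.
// Each entry is a pair of wheel speeds, in the same order the C# server
// passes them to DriveDirect.
const MOVES = {
  "/forward": [100, 100],
  "/left": [50, -50],
  "/right": [-50, 50],
  "/back": [-100, -100],
};

// `robot` is any object with a drive(a, b) method (an assumed interface).
// Returns true if the path was a known command, false otherwise.
function dispatch(path, robot) {
  if (!Object.prototype.hasOwnProperty.call(MOVES, path)) {
    return false; // unrecognized command, like the default case in the server
  }
  const move = MOVES[path];
  robot.drive(move[0], move[1]);
  return true;
}

// A fake robot that records what it was asked to do.
const log = [];
const fakeRobot = { drive: (a, b) => log.push([a, b]) };

console.log(dispatch("/forward", fakeRobot), dispatch("/up", fakeRobot), log);
```

A real implementation would also stop the wheels after a delay, as the C# server does with Thread.Sleep, and a flying robot could add "/up" and "/down" entries to the same table.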

I’m not a security guru by any measure, but also of note is that the system is very insecure.  If you wanted to secure the system, you could add a component to the hangout that would ensure that only queries from the hangout are able to make requests to the web server running on the robot.  You could implement this using OAuth 2.0, basic HTTP passwords, or even a shared secret. By using a common security protocol between the hangout and the robot’s web server, you could definitely make it harder for an attacker to do malicious things on the robot.
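To make the shared secret idea concrete, here is a minimal sketch of signing and verifying a command with an HMAC. It assumes Node.js and its built-in crypto module purely for illustration; the robot's server is written in C# and would need the equivalent logic there. The names SECRET, signCommand, and verifyCommand, and the "command:timestamp" message format, are all invented for this example.

```javascript
const crypto = require("crypto");

// Hypothetical shared secret known to the hangout app and the robot server.
const SECRET = "replace-with-a-long-random-value";

// Sign "<command>:<timestamp>" so a captured request cannot be replayed forever.
function signCommand(command, timestamp, secret) {
  return crypto
    .createHmac("sha256", secret)
    .update(command + ":" + timestamp)
    .digest("hex");
}

// Server side: reject stale requests, then recompute the signature and
// compare it with the one supplied, using a constant-time comparison.
function verifyCommand(command, timestamp, signature, secret, now, maxAgeMs) {
  if (Math.abs(now - timestamp) > maxAgeMs) {
    return false; // too old, or timestamped in the future
  }
  const expected = Buffer.from(signCommand(command, timestamp, secret), "hex");
  const given = Buffer.from(String(signature), "hex");
  return expected.length === given.length && crypto.timingSafeEqual(expected, given);
}

const sentAt = 1000;
const sig = signCommand("forward", sentAt, SECRET);
console.log(verifyCommand("forward", sentAt, sig, SECRET, 1500, 5000)); // prints true
console.log(verifyCommand("back", sentAt, sig, SECRET, 1500, 5000)); // prints false
```

Keep in mind that a secret shipped in the extension's JavaScript is visible to everyone in the hangout, so a scheme like this only keeps out people who are not participants.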

Finally, what else could we do with Google+ and the robot? I think it might be cool to add sensors to the robot that would then create moments in Google+ history.  For example, what if the robot wrote moments to everybody’s history, with GPS coordinates of where the robot is included in the response HTML, or took pictures and added them as moments?  This is where the potential for Hangouts and History becomes very interesting.

Please let me know in the comments if you plan on doing anything cool that’s similar, want any further clarification on my components, or have another robot implementation.

If you’re interested in creating your own, read Building and running the OSCAR code.
