In this tutorial we're going to look at how to use OpenCV, a real time computer vision library, with Processing, Arduino, a webcam and a pan/tilt bracket to create a video that will keep a persons face in the middle of the frame as they walk around the room. A video of the final product illustrates the concept a little better than I can explain it.
There are several pieces for this project that need to be assembled. Start by putting the Pan/Tilt Bracket together using the assembly guide from the product page. This will show you how to put the bracket together and install the servos for controlling the bracket's orientation.
Once the Pan/Tilt Bracket has been assembled we need to find a way to mount the webcam onto the bracket. I'm using a Logitech Webcam that we lying around the office. It came on this little mounting swivel so that you can hang it from the top of a monitor.
I figured that the camera had to be mounted onto it's bracket somehow, so if I could take it out of the current bracket it might make it easier to mount to the pan/tilt bracket. After taking out a handful of screws, and pulling apart some rather reluctant plastic, I was lucky to find that there were some small metal swivels that I could mount the camera with.
The only thing left was to figure out how to mount the swivels onto the bracket in a fashion that would allow the camera to be put back onto the swivels. Since there are a ton of cut-outs in the pan/tilt bracket it wasn't too difficult. I found some small screws to mount the swivel to the bracket. Unfortunately the screws were just a tad too big to fit through the holes in the swivel, so I enlarged them a bit with a drill press. After making the holes bigger I just mounted both swivels to one edge of the bracket, and then with a little bit of wiggling I got the webcam back onto the swivels.
If you don't have the same webcam you'll have to find your own way to mount the webcam to the pan/tilt bracket. Don't be afraid of a little Duct Tape! One thing to check before mounting the camera to the bracket, though, is the range of motion of the pan/tilt. I wanted the camera to have the widest range of motion possible when tracking, so I found the mid-point of the pan and tilt angles, and then mounted my camera so that the lens was facing forward. To find the mid-point of the pan/tilt range of motion just manually move the bracket from side to side, and up and down and approximate the center position for each axis.
Now that the camera jig is set up we need to assemble the electronics. Since the Arduino is capable of providing enough power for two servos, there's not much to the assemble. All we need is an Arduino, a breadboard, a couple 3 pin Male Header strips, and some wire. The fritzing diagram below illustrates how to connect the entire setup. Just try to make sure that you plug the wire for the pan servo (left and right movement) into pin 2 of Arduino, and plug the wire for the tilt servo (up and down movement) into pin 3.
Everything should now be assembled and ready to go. Things are a bit messy, but they'll work.
Before we can get started with the code for this project there are a couple of software programs that need to be installed. First and foremost, if you don't have Arduino installed you'll need to go get it. There are great instructions for installing the software on any OS right on the Arduino download page. You also need to install Processing, which is very similar to Arduino, but made for writing programs for a computer (like, a real computer) rather than for Arduino. Processing also has great installation instructions for every operating system.
The brains of this project, though, come from OpenCV. OpenCV (Open Computer Vision) is a library for real-time image processing. It actually allows you to do all sorts of cool things, but we're going to use it for detecting faces. If you read the OpenCV webpage you'll notice that the only things that are really supported are C, C++ and Python. However a quick google for "OpenCV Processing" found a project from a group that has created an OpenCV library for Processing and Java.
In order to use OpenCV with Processing we'll need to install the OpenCV Framework and the OpenCV Processing Library. We'll also download the OpenCV Processing Examples so we can get a head start on writing the code. The OpenCV for Processing and Java site has good instructions on how to get these things installed. I copied these instructions from them.
Start by installing the OpenCV Framework to your computer. If you're using Windows, download the installer from the sourceforge page for the project. Make sure you download OpenCV_1.0.exe, not the zip file. And yes, there are newer version of the OpenCV Framework, but they don't work with OpenCV for Processing, so make sure to download version 1.0. When you run the installer, make sure you agree to have the OpenCV directory added to your system path. You'll need to reboot after installing in order for the system changes to take affect. If you're using a Mac, just download the OpenCV image and follow the installer instructions. Finally, if you're a Linux user, download/compile/install the source files for OpenCV.
Once the OpenCV framework is installed, the OpenCV Processing library still needs to be installed in order to use OpenCV from Processing. Just download the OpenCV Processing Library and move it to your Processing libraries folder. The libraries folder resides inside the Processing sketchbook folder. You can find the sketchbook folder by opening Processing, then going to File->Preferences. Make sure to extract the OpenCV library folder into the libraries folder, don't just put the zip file into the folder.
Finally the OpenCV Processing examples need to be downloaded. Either create a folder inside the OpenCV Library folder called 'examples' or just put the examples into the Processing sketchbook. Again, make sure the zip file is unzipped into one of these folders.
Once the OpenCV Processing examples are installed, and Arduino, Processing, OpenCV Framework and the OpenCV Processing library have all been installed, we've got everything we need to finish the project.
By now you've got the Pan/Tilt Bracket with a Webcam mounted onto it, and all of the required software has been installed. If you watched the video at the beginning of the tutorial you should have a general understanding of what this project will do. Basically the servos are going to move the webcam and keep the subjects face in the center of the video feed. Pragmatically speaking, Processing takes the video input from the webcam and uses the OpenCV library to analyze the video. If a face is detected in the video, the OpenCV library will give the Processing sketch the coordinates of the face. The processing sketch will determine where the face is located in the frame, relative to the center of the frame, and send this data through a serial connection to an Arduino. The Arduino will use the data from the Processing sketch to move the servos connected to the Pan/Tilt bracket. Let's start by taking a look at the Arduino sketch, which is called SerialServoControl.
If you haven't already grabbed the Arduino firmware for this project, go ahead and download the SerialServoControl sketch and save it to your sketchbook. Once you've downloaded it, open the sketch in Arduino, start by reading the comment section at the top. Basically this sketch will analyze a serial input for commands and set the servo positions accordingly. The command structure for the serial commands is simple. A command consists of two bytes: a servo ID and a servo position. If the Arduino receives a servo ID, then it waits for another serial byte and then assigns the received position value to the servo identified by the servo ID.
The Arduino Servo library is used to easily control the pan and tilt servos. There aren't that many variables; a couple are used to keep track of the servo ID values for each servo, and then an object (or instance) is created for each servo. Finally there's a character variable that will be used to keep track of the characters that come in on the Serial port.
In the setup section, we tell the Arduino which pins have the servos attached to them. In this case, we're telling the Arduino that the tilt servo is attached to pin 2 and the pan servo is attached to pin 3. Make sure that this reflects how the hardware is actually connected in your setup. If you get it wrong, it's easy to fix. Just relocate the wires. After telling the Arduino where the servos are connected, we set the initial position of the servos to be 90 degrees; this is just so that the setup goes back to a good starting point every time the Arduino powers up. Finally, in order to use the serial port we set up the connection with the Serial.begin command; this sketch will be using a 57600bps baud rate.
The loop section of the SerialServoControl sketch is also pretty short. Basically we wait for a character to come in on the serial port, and we only act if the character is an ID for one of the servos. Because of the command structure, the next byte to come from the serial port should be the servo position for the previously sent servo ID. So the sketch waits for another character from the serial port, and when it arrives the servo position is updated to the value read from the serial port. Simple!
The OpenCV framework is very powerful, and powerful frameworks tend to be a bit intimidating to work with. Luckily the OpenCV Processing Library comes with couple example sketches showing us how to work with the library. Make sure you have the OpenCV Processing Library installed, as well as the OpenCV Processing Examples. Open the example named "face_detection." When I was first exploring the examples I didn't know what "face_detection" actually did, but it sounded promising.
Make sure you have a webcam plugged in before running the sketch. Once your webcam is plugged in go ahead and run the sketch. What happens? You should see just a streaming video from the webcam. Nothing special. Bummer. But wait! If you check out the program output in the processing window, the top line has an error message. The error message reads "The haar classifier cascade file 'haarcascade_frontalface_alt.xml' can not be found in folders, you must specify the full path instead." Ok, so I don't know what a haar classifier cascade is, but I do know that this error message is indicating that the sketch folder is missing a file. To correct this error, you need to find the 'haarcascade_frontalface_alt.xml' file in the OpenCV Framework directory. The file should be in '.../OpenCV/data/haarcascades.' Just copy the XML file named 'haarcascade_frontalface_alt.xml' and paste it into the "face_detection" sketch folder in your Processing Sketchbook libraries folder.
Go back to Processing and run the sketch again (close the sketch if it's still open). Now, if you're face is in the webcam's field of view, you should see a colored rectangle outlining your face. Awesome! You might also notice that if you click and drag inside the window the brightness and contrast will change. Let's check out the code to see what information we can get that might help us control a servo. Keep in mind, I'm no expert on OpenCV, so we're learning this together.
In the initialization and setup sections there doesn't seem to be anything too complex going on. In the initialization section a couple of libraries are included, and a couple of variables are declared. Then in the setup section, a window is created, and the opencv instance is configured. Basically after the instance of the OpenCV class is created, the class is configured to capture a video that same width and height of our window, and the detection shape is set to detect the image defined by the CASCADE_FRONTALFACE_ALT file. (Try out some of the different haarcascade xml files to see what they do! Just copy the XML files like you did before, and rename the file in this line of the sketch). After configuring the openCV instance, some instructions are printed to the user on how to manipulate the brightness and contrast.
This part of the sketch is surprisingly short! The OpenCV instance reads a frame from the Webcam, converts it to grayscale, then sets the contrast and brightness. The only weird part of the sketch is the opencv.detect(...) stuff, I'm not sure exactly what happens there. But the result is that there is an array of rectangles which represent the coordinates of the detected faces, which is exactly what we were hoping for! After saving the detection algorithm, the image captured from the webcam is displayed in the window. Finally, there's a loop that draws a rectangle around each of the 'faces' that were detected using the coordinates in the faces array. This last part was the key in figuring out how to control a servo with data from the OpenCV library. Since we are given the x and y coordinates of the faces, we can use these coordinates to direct the servo! Let's move onto the final sketch, where we use data from the OpenCV algorithm to move the pan/tilt camera and keep a face in the center of the picture.
We now have two sketches that look like they can implement a face tracking application. The Arduino sketch will allow us to set the pan/tilt angles of the webcam by sending serial strings from a computer to the Arduino, and the Processing sketch will give us x and y coordinates of a face in a frame. The plan is to try and keep the x and y coordinates of the face in the center area of the screen by moving the webcam left/right and up/down based on the current x and y coordinates. In order to get this working, though, we still need to change the processing sketch so that it can send serial commands to the Arduino. If you haven't downloaded the PanTiltFaceTracking Processing sketch, do so now and extract it to your Processing sketchbook. Once you're ready open the sketch in Processing. If you want to skip the description and get straight to the demonstration make sure a webcam is plugged into the computer and that the Arduino is plugged into the computer with a USB cable, and also provide an external 9V DC power supply to the Arduino. Once the hardware is setup just press run.
There are a handful of changes that need to be made to control the Arduino sketch from the Processing sketch. Mostly, though, all we need to do is add a way to talk using the serial port, and also a way to keep track of the current position of the pan/tilt servos.
The setup and initialization section of the sketch look a lot longer, but it's mostly comments that were added to make reading the sketch a bit easier. Some variables were added for the serial port, to keep track of the pan/tilt servo positions, and to help analyze the face coordinates vs. the middle of the screen. In the setup section of the sketch some new code was added to create a serial connection and to initialize the position of the pan/tilt servos. If you're getting an error that highlights the serial connection line of the sketch, the selected serial port probably needs to be changed. The serial port number is located insider the  brackets, in this case the sketch is selecting the first port in the list with . Even if the sketch is showing an error, you can find the proper port in the sketch output. Here's what the output looks like:
Since I want the sketch to connect to COM3, I put 0 into the  brackets since that's the item number of the serial port list. After connecting to the serial port and displaying the instructions to the user, the initial servo positions are sent to the Arduino sketch using the port.write() commands.
There are quite a few additions to the draw() code from the initial face_detection sketch. All of the changes are after the rectangle is drawn around the face area. First we find out if there are any objects that were placed in the faces array with the faces.length argument. If there are some faces then we make some calculations to determine the coordinates of the center of the face. Notice that the only face coordinates that are used are the coordinates in the first array position. This means that if there is more than one face in the frame, the sketch will only track the first face it sees. This might seem perfect, but the OpenCV library doesn't always put the same face in the same array location; so the sketch will only work properly if there is only one face in the frame.
After finding out where the middle of the face is, this position is compared to the center of the screen. If the center of the face is above the center of the screen, the tilt position variable is changed to raise the webcam. The change in the webcam angle is small, but it still tries to get the subjects face closer to the center of the screen. The same general principle is applied if the face is below the center of the screen, or to the left or right of the center.
And that's it! Now you have a webcam that will follow you as you move around the room. I built a little mount for mine using wood blocks and some clamps so that I could set it on a flat surface. For testing I just held the pan servo in my hand while I moved my head around. Kinda creepy, but I think it's fun. If you want to play around with it, start by changing the stepSize and midScreenWindow variables to get a feel for how these values affect the devices behavior. If you can figure out how to have the webcam track a specific person in the room while there are multiple people in the frame, please let us know and post in the comments. Have fun!