A while back, I wrote a blog post about using computer vision techniques to play an automated game of pong. This was accomplished by scanning the screen for a white pixel and moving the mouse to its position.
In that post, I didn’t go deep into the optimizations that are possible when using Java for computer vision. So this time, I’d like to try some techniques and benchmark them to find out how to shorten the execution time.
The benchmark is prepared as follows:
An image is loaded from disk
The benchmark timer starts
The image is converted to grayscale one hundred times
The benchmark timer stops
The image is written back to disk to manually verify the results
For each benchmark, a couple of variables are prepared:
imageO containing the original image as a BufferedImage object
imageN containing a blank target image as a BufferedImage object
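The procedure above can be sketched roughly as follows. This is a minimal sketch, not the exact harness used for the measurements: the class name GrayscaleBenchmark, the method names, the PNG output format and the simple averaging converter are all assumptions made for illustration. Only imageO, imageN and the one hundred runs come from the description above.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.function.BiConsumer;
import javax.imageio.ImageIO;

// Hypothetical harness: names and structure are illustrative assumptions.
final class GrayscaleBenchmark {

    /** Illustrative converter: per-pixel getRGB/setRGB, plain average of R, G and B. */
    static void toGrayscale(BufferedImage imageO, BufferedImage imageN) {
        for (int y = 0; y < imageO.getHeight(); y++) {
            for (int x = 0; x < imageO.getWidth(); x++) {
                int rgb = imageO.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                int gray = (r + g + b) / 3;
                imageN.setRGB(x, y, 0xFF000000 | (gray << 16) | (gray << 8) | gray);
            }
        }
    }

    /** Runs the converter `runs` times and returns the elapsed time in milliseconds. */
    static long benchmarkMillis(BufferedImage imageO, BufferedImage imageN,
                                BiConsumer<BufferedImage, BufferedImage> converter, int runs) {
        long start = System.nanoTime();          // the benchmark timer starts
        for (int i = 0; i < runs; i++) {
            converter.accept(imageO, imageN);
        }
        return (System.nanoTime() - start) / 1_000_000; // the benchmark timer stops
    }

    /** Loads the image, times one hundred conversions, writes the result for manual checking. */
    static long run(File input, File output) throws IOException {
        BufferedImage imageO = ImageIO.read(input);
        if (imageO == null) {
            throw new IOException("No image reader found for " + input);
        }
        BufferedImage imageN = new BufferedImage(
                imageO.getWidth(), imageO.getHeight(), BufferedImage.TYPE_INT_RGB);
        long millis = benchmarkMillis(imageO, imageN, GrayscaleBenchmark::toGrayscale, 100);
        ImageIO.write(imageN, "png", output);    // output format is an assumption
        return millis;
    }
}
```

Passing the converter in as a parameter means each technique can be timed with the same loop, so the differences in the results come from the conversion itself and not from the harness.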
Let’s try some techniques, starting with a worst-case scenario:
My starting point would be the same: Have a camera monitor the phone’s screen, and a computer to control the phone’s keypad.
The difference I wanted to make was sending the key-presses over an RS232 or Bluetooth serial connection. This eliminates the need for external hardware to press the buttons, and the need to alter the cellphone by hot-wiring them. This was a very simple task: all I needed was the AT Commands Online Reference. The phone I used (Sony Ericsson T610) supports the AT+CKPD command (Control Keypad), which let me emulate the entire keypad over a serial connection.
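To give an idea of how little is involved, here is a minimal sketch of sending such a key-press. It assumes the command takes the form AT+CKPD="keys" terminated by a carriage return, and that the serial or Bluetooth port is already open and exposed as an OutputStream. The class KeypadCommands and its methods are hypothetical names, and the exact key codes the phone accepts should be checked against the AT command reference.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds and sends AT+CKPD commands.
final class KeypadCommands {

    /** Builds the command line for the given key characters, e.g. "1" or "123". */
    static String keypadCommand(String keys) {
        // Assumed syntax: AT+CKPD="<keys>" followed by a carriage return.
        return "AT+CKPD=\"" + keys + "\"\r";
    }

    /** Writes the command to an already opened serial (RS232 or Bluetooth) stream. */
    static void pressKeys(OutputStream port, String keys) throws IOException {
        port.write(keypadCommand(keys).getBytes(StandardCharsets.US_ASCII));
        port.flush();
    }
}
```

How the OutputStream is obtained depends on the serial library in use, so that part is deliberately left out.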
The second difference is the graphical user interface. At first I thought it would be best to control the phone using the PC’s keyboard, but while developing I found it works better to click the phone’s actual buttons with the mouse. I rewrote the key-press part so that when a mouse-click on the video stream is registered, it translates mouseX and mouseY to a certain button on the phone and sends the corresponding key-press over the Bluetooth connection.
The end result: a live video stream of the phone on your computer, in which you can click the phone’s buttons with your mouse. A demonstration:
Note: The original cellphone button pusher is also an amazing piece of art. I get why the over-engineered method was chosen, and it is quite astonishing how it all worked out. This article is purely to demonstrate how it can be done in a simpler way.
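The translation from a click position to a phone button can be sketched as a simple hit test. This is only an illustration of the idea: the class ButtonMap, its methods and the rectangle coordinates are made up, and the real regions depend on where the phone sits in the camera image.

```java
import java.awt.Rectangle;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical lookup table from screen regions in the video stream to phone keys.
final class ButtonMap {
    private final Map<String, Rectangle> regions = new LinkedHashMap<>();

    /** Registers the area of the video stream that shows the given key. */
    void add(String key, int x, int y, int width, int height) {
        regions.put(key, new Rectangle(x, y, width, height));
    }

    /** Returns the key under the click, or null when the click hit no button. */
    String keyAt(int mouseX, int mouseY) {
        for (Map.Entry<String, Rectangle> entry : regions.entrySet()) {
            if (entry.getValue().contains(mouseX, mouseY)) {
                return entry.getKey();
            }
        }
        return null;
    }
}
```

A mouse listener would then call keyAt(mouseX, mouseY) and, when the result is not null, send that key to the phone.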
Now that pong is fully automated, it is time to optimize.
First I switched to the game Doeo. The goal of the game is to move your mouse over the Doeos as quickly as possible. A fun game to use as an example for automation.
I also made a small performance enhancement (10 to 15 percent) by not calling getRGB() on the BufferedImage for every pixel, but reading directly from an array. It does take some time to get everything into an integer array, but once that’s done, the for-loop is a lot faster.
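A minimal sketch of the array approach, assuming the pixels are copied out with the bulk variant of getRGB (the original code may have obtained the array differently, for instance from the image’s data buffer). The class ArrayScan and the method findWhite are hypothetical names, and the white-pixel search is borrowed from the pong example purely as something to loop over.

```java
import java.awt.Point;
import java.awt.image.BufferedImage;

// Hypothetical example: scan an int array once, not getRGB(x, y) per pixel.
final class ArrayScan {

    /** Returns the position of the first pure white pixel, or null if there is none. */
    static Point findWhite(BufferedImage image) {
        int width = image.getWidth();
        int height = image.getHeight();

        // One bulk copy into an integer array (this is the part that costs some time).
        int[] pixels = image.getRGB(0, 0, width, height, null, 0, width);

        // The loop itself now only touches the array.
        for (int i = 0; i < pixels.length; i++) {
            if ((pixels[i] & 0xFFFFFF) == 0xFFFFFF) {   // ignore alpha, compare RGB
                return new Point(i % width, i / width);
            }
        }
        return null;
    }
}
```

The array is stored row by row, so index i maps back to x = i % width and y = i / width.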