Tuesday, April 14, 2015

Another Application of the Four Point Algorithm


I covered the mathematical underpinnings of the Four Point Algorithm in my previous post, with an example showing tiles in the NY subway under projective distortion and the same tiles with the distortion removed.

Another application of this algorithm shows up on stadium sidelines. You'll often see ads that appear "straight" to viewers at home, even though the camera sits high up on one side of the field and should therefore introduce major distortion.

Using the Four Point Algorithm, one can pre-distort the ads so that the viewer sees them as undistorted.

Here's an example:
I snapped a picture of a blank piece of paper, at an angle such that there is visible projective distortion.

I calculated the homography by clicking the 4 corners of the trapezoid above and putting them in correspondence with the 4 corners of a rectangle whose dimensions match the original piece of paper. Anything you put on the piece of paper undergoes that homography transformation as well - hence the alien look.

I printed that picture out and put it back in the same setting, with roughly the same camera pose. The camera's projective distortion is the inverse of the homography that produced the alien look... so we end up with no distortion.
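Here's a minimal sketch of that step with OpenCV and NumPy. The corner coordinates, file names, and letter-size paper assumption below are placeholders for illustration; the real trapezoid corners come from clicking in the photo.

import cv2
import numpy as np

# Content as it should appear in the final photo: a photo-sized canvas with
# the desired picture pasted over the paper's trapezoid region (hypothetical
# file, prepared separately).
desired = cv2.imread("desired_view.png")

# 4 clicked corners of the paper in the photo (hypothetical pixel values),
# and the 4 corners of the printable page, assumed 8.5x11 inches at 72 dpi.
trapezoid = np.float32([[212, 153], [868, 190], [903, 655], [167, 610]])
page_w, page_h = int(8.5 * 72), int(11 * 72)
rect = np.float32([[0, 0], [page_w, 0], [page_w, page_h], [0, page_h]])

# Homography from the photo's trapezoid to the page rectangle. Warping the
# desired view through it produces the "alien" looking picture to print.
H = cv2.getPerspectiveTransform(trapezoid, rect)
alien = cv2.warpPerspective(desired, H, (page_w, page_h))
cv2.imwrite("print_me.png", alien)

# Photographed again from the same pose, the camera applies (roughly) the
# inverse of H, so the printed page shows the desired view undistorted.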

Friday, April 10, 2015

The 4 Point Algorithm


The Four Point Algorithm allows you to remove projective distortion for planar points. One application is those sideline ads in stadiums: they appear perfectly straight to the TV viewer, but if you were to examine them up close they'd look fairly distorted.

This algorithm is a special case of two-view (epipolar) geometry, which relates the projections of points from one camera to another; here, the restriction is that the points under consideration are planar. The resulting transformation is called a homography.

Suppose we have Camera 1 and Camera 2, and \textbf{X}_i is a point with homogeneous coordinates (x_i, y_i, z_i) in camera i: \begin{equation} \textbf{X}_2 = H \textbf{X}_1 \\ \begin{bmatrix} x_2\\ y_2\\ z_2 \end{bmatrix} = \begin{bmatrix} H_{11} & H_{12} & H_{13}\\ H_{21} & H_{22} & H_{23}\\ H_{31} & H_{32} & H_{33} \end{bmatrix} \begin{bmatrix} x_1\\ y_1\\ z_1 \end{bmatrix} \end{equation}
The goal is to determine H. There are 9 elements but only 8 degrees of freedom, since H is defined up to a scale factor. Keep in mind that you can write the equations in inhomogeneous form (dividing x_2 and y_2 by z_2 to produce x'_2 and y'_2). You end up with 2 equations, linear in the H_{ij}, for each point correspondence. Thus all the elements of H are determined from 8/2 = 4 points in correspondence between cameras 1 and 2.

\begin{equation} (H_{31} x_1 + H_{32}y_1 + H_{33} z_1) x'_2 = H_{11}x_1 + H_{12}y_1 + H_{13} z_1 \\ (H_{31} x_1 + H_{32}y_1 + H_{33} z_1) y'_2 = H_{21}x_1 + H_{22}y_1 + H_{23} z_1 \end{equation}
From here, stack the two equations from each correspondence into a matrix A and use SVD to solve A h = 0 for the coefficients of H: the solution is the right singular vector associated with the smallest singular value.
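As a concrete illustration, here is a minimal NumPy sketch of that solve (the direct linear transform), assuming the four correspondences are given as inhomogeneous pixel coordinates (z = 1); the example points at the end are made up.

import numpy as np

def four_point_homography(pts1, pts2):
    """pts1, pts2: arrays of shape (4, 2) with corresponding (x, y) points."""
    A = []
    for (x1, y1), (x2, y2) in zip(pts1, pts2):
        # Two rows per correspondence, rearranged from the equations above
        # so that they read A h = 0 with h = (H11, ..., H33).
        A.append([x1, y1, 1, 0, 0, 0, -x2 * x1, -x2 * y1, -x2])
        A.append([0, 0, 0, x1, y1, 1, -y2 * x1, -y2 * y1, -y2])
    A = np.asarray(A, dtype=float)

    # The right singular vector with the smallest singular value (last row
    # of Vt) spans the null space of A, fixing H up to scale.
    _, _, Vt = np.linalg.svd(A)
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]  # normalize so H33 = 1 (assumes H33 != 0)

# Example: four corners of a trapezoid mapped to a fronto-parallel rectangle.
src = np.array([[120, 90], [420, 110], [440, 380], [100, 360]])
dst = np.array([[0, 0], [300, 0], [300, 200], [0, 200]])
H = four_point_homography(src, dst)

In practice, OpenCV's cv2.findHomography does the same job, with normalization and optional outlier handling on top.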
H allows us to "fix" projective distortion for points that are planar. Here's an example, where the selected 4 points were the 4 corners of the Clinton-Washington Av sign (and the 4 points in correspondence are the 4 corners of a rectangle which I ballparked to have the same aspect ratio):




One requirement is that all the points lie on the same plane: the homography only corrects points on the plane of the sign. That's the reason my arm is severely distorted, and the same goes for the beams supporting the ceiling.

Monday, April 29, 2013

Gesture Controlled Pitch Bend

Here's a demo of a project I recently completed as part of my Cognitive Video class: Gesture Controlled Pitch Bend.


The motivation behind this project stems from my interest in the variety of sounds a guitar can produce, although I am mainly a keyboard player.
Among the expressions I have always wanted to reproduce on a keyboard are bends and vibratos. The effects are subtle, but they definitely add something to licks - especially when a nice long bend slowly, asymptotically passes the blue note and lands on the 5th.

There are ways to do that with the pitch wheel featured on some keyboards, but it's a bit awkward - like using a whammy bar to bend a note. So I thought: why not make the keyboard a 2-dimensional device?
We already use one dimension to run through all the notes - what about extracting the position of the player's hands along the perpendicular axis?

In the setup described in the video, a camera continuously tracks the position of the player's hands and sends displacements along one dimension to an Arduino, which handles the MIDI communication. The result is fairly intuitive: slide your hand away and the bend goes up, slide it towards your body and the bend goes down.
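To make the mapping concrete, here's a small illustrative sketch - not the project's actual code - of how a one-dimensional displacement can be turned into a standard 3-byte MIDI pitch-bend message; the displacement range is a made-up constant.

def pitch_bend_message(displacement, max_displacement=200, channel=0):
    """Map a displacement in pixels (away from the body = positive) onto
    a 14-bit MIDI pitch-bend value and return the 3-byte message."""
    # Clamp and scale to [-1, 1].
    d = max(-max_displacement, min(max_displacement, displacement))
    normalized = d / float(max_displacement)

    # 14-bit pitch bend: values 0..16383, with 8192 meaning "no bend".
    bend = int(8192 + normalized * 8191)

    status = 0xE0 | (channel & 0x0F)   # pitch-bend status byte
    lsb = bend & 0x7F                  # low 7 bits
    msb = (bend >> 7) & 0x7F           # high 7 bits
    return bytes([status, lsb, msb])

# Hand slides 120 pixels away from the body -> the bend goes up.
print(pitch_bend_message(120).hex())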

Improvements to come:
1) I had a new idea for quick tracking of the hand
2) Make this a standalone device by using a Raspberry Pi and doing the image processing on the RPi's GPU
3) Automatic calibration to register displacements occurring over the keyboard region only.

Wednesday, February 27, 2013

Work at Tandent Vision Science


I started working at Tandent Vision Science about a month ago. I absolutely love the work, the people, and the office! The company focuses on computer vision, and recently released a product called Lightbrush. The software is amazing: in a nutshell, you can separate out all the shadows in an image with one click:


I think this is incredible: so many computer vision tasks are limited by illumination variations. Lightbrush provides a fundamental first step by making the image illumination-invariant. As such, it can be used as a natural preprocessing step in all computer vision pipelines, and makes most recognition applications much simpler. Anyone with experience dealing with images for the purpose of the 3 R's (Recognition, Reconstruction, Registration) will see the incredible benefit of this software.
Besides, I am a big fan of Photoshop, and I know a feature such as this one would be revolutionary! I can't count how many times I've seen awful photoshopped images with incoherent lighting.
Apparently Lightbrush gained huge attention from texture artists and high-level graphics artists (notably at Pixar), so the product is geared towards a very specialized crowd. Too bad - I personally think the average user would have a blast playing with it.
I've been working on improving the machinery - can't talk about it! - and it still blows my mind that this is possible with minimal user input... There are many aspects I have to take into account in my tasks, such as keeping user interaction simple and keeping things fast (texture artists work with gigantic images).

I'm working with 2 very friendly Carnegie Mellon alumni. We each have our own office (with a REAL DOOR - something I missed at other workplaces), and the place has a very cozy, homey feel, much different from the usual functional cubicle. I'm definitely loving the work here.


Automated caller

It's been a while since I updated anything here. My schedule for this new semester is fairly busy: 4 classes, research and an internship barely leave enough time to cook.
The script below is something I wrote back in December, when contact information from a certain religious hate group was hacked and published online. It successively and continuously calls all the numbers in a text file (in the version below, the numbers are just stored in an array). I used mechanize to fill out forms on websites such as findmyphone.com and dial the numbers.
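The mechanize part boils down to something like the sketch below; the URL, form index, and field name are placeholders rather than the real site's layout.

import time
import mechanize

numbers = ["5551230001", "5551230002"]  # in the real script, read from a file

br = mechanize.Browser()
br.set_handle_robots(False)  # some web dialer sites disallow robots

while True:                   # loop forever: successive, continuous calls
    for number in numbers:
        br.open("http://example-web-dialer.com/")  # placeholder URL
        br.select_form(nr=0)                       # first form on the page
        br["phone_number"] = number                # placeholder field name
        br.submit()
        time.sleep(30)        # give the call time to go through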


I was debugging the script using my number and had to leave in the middle of it running... I came back to 57 missed calls and voice messages. 

Saturday, December 15, 2012

HMMs

Here's a really good explanation, through concrete examples, of Hidden Markov Models.
[Credits: Professor Moore, Carnegie Mellon SCS].


Saturday, December 8, 2012

New university webpage

Crunch mode over. I just created my personal webpage on Carnegie Mellon's ECE servers.
Links to a project paper on automating the Horizontal Gaze Nystagmus test (part of the field sobriety tests performed by law enforcement in the US) and the final paper on using Twitter to predict users' political affiliations are on there.

My girlfriend thinks I really should learn HTML5 - that might be my Christmas break project.