Have you ever wondered how facial recognition technology works on platforms like Snapchat, Facebook or your digital camera? The exact implementation varies between platforms, and machine learning has greatly expanded the technology in recent years, but the basics are actually pretty simple.
In many programming tasks, the best way to approach the problem is to ask: do humans already do this and, if so, how? Here the answer is a resounding yes, as humans are amazing at seeing faces. We even see things like the electric plugs in the wall as faces. Due to an aspect of our psychology called pareidolia, we see patterns even where none exist. A face generally has a pattern: two eyes and a mouth, and electric plugs fall close enough to that pattern to be recognizable. Therein lies the key to facial recognition technology: patterns.
The question then becomes: how do we make computers see patterns? The solution is comparing the relative brightness of pixel clusters. By figuring out which points on a face create the most distinctive “mask”, we can create a basic template for the computer to look for. Consider the examples below, where I have taken two selfies and made it easier for us to see them the way a computer would. They are converted to grayscale so that only brightness and contrast remain, and then the pixels are grouped into clusters, each showing its average lightness.
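To make the preprocessing concrete, here is a minimal sketch of those two steps in plain Python. It is not the exact process used for the selfies: the function names are made up for illustration, the image is assumed to be a list of rows of (r, g, b) tuples, and the grayscale weights are the common ITU-R BT.601 luma weights, which is only one of several ways to convert to grayscale.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples to rows of 0-255 brightness values.

    Assumption: uses the ITU-R BT.601 luma weights.
    """
    return [
        [0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
        for row in rgb_image
    ]


def cluster_average(gray, block):
    """Replace each block x block tile of pixels with its average brightness."""
    height, width = len(gray), len(gray[0])
    clusters = []
    for top in range(0, height, block):
        cluster_row = []
        for left in range(0, width, block):
            # Collect the pixels in this tile, clipping at the image edges.
            tile = [
                gray[y][x]
                for y in range(top, min(top + block, height))
                for x in range(left, min(left + block, width))
            ]
            cluster_row.append(sum(tile) / len(tile))
        clusters.append(cluster_row)
    return clusters


# A tiny 4x4 checkerboard of black and white 2x2 tiles.
black, white = (0, 0, 0), (255, 255, 255)
image = [
    [black, black, white, white],
    [black, black, white, white],
    [white, white, black, black],
    [white, white, black, black],
]
clusters = cluster_average(to_grayscale(image), block=2)
```

Each entry in `clusters` now stands for one tile of the picture, so a larger `block` gives a coarser, more "computer-like" view of the face.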
In particular, notice a few features:
- The eyes are significantly darker than the surrounding area of the face because they sit back in the eye sockets. This holds even for a side of the face that is in shadow.
- The nose, slightly below the eyes, contains the lightest pixels because it sticks out furthest from the face. The area beneath it is darker, both from the shadow cast by the nose and cheeks and, in the cases above, from our mustaches.
- The mouth area is darker due to lips and shadows from cheekbones. If either of us were smiling, the teeth would be much brighter than the surrounding area.
These points allow us to make a map that the computer will look for when trying to find faces or track facial features. As a fun example of how this works, I built a very basic program and dropped a photo of some of our team members from our last Fun Friday into it. The program first looked for paired dark spots (eye check), then for very bright clusters below the paired dark spots with high contrast to the surrounding area (teeth check), and lastly compared pixels in the general surrounding area (nose and mouth/cheekbone shadow check).
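The three checks described above can be sketched roughly as follows. This is not the program I ran on the team photo, just a simplified illustration: it assumes the image has already been reduced to a grid of average brightness values (0 to 255), it looks for a face exactly three clusters wide and three tall, and the function name and the `dark`, `bright` and `contrast` thresholds are made-up values rather than tuned ones.

```python
def find_face_candidates(clusters, dark=80, bright=200, contrast=60):
    """Return the (row, col) of the top-left cluster of each 3x3 face-like patch.

    Assumed layout of a patch (hypothetical, for illustration only):
        eye      bridge   eye
        cheek    nose     cheek
        shadow   teeth    shadow
    """
    rows, cols = len(clusters), len(clusters[0])
    hits = []
    for r in range(rows - 2):
        for c in range(cols - 2):
            left_eye, right_eye = clusters[r][c], clusters[r][c + 2]
            nose = clusters[r + 1][c + 1]
            teeth = clusters[r + 2][c + 1]
            mouth_sides = (clusters[r + 2][c], clusters[r + 2][c + 2])

            # Eye check: a pair of dark clusters on the same row.
            if not (left_eye < dark and right_eye < dark):
                continue
            # Teeth check: a very bright cluster below the eyes that
            # stands out strongly from the clusters on either side of it.
            if not (teeth > bright and all(teeth - s >= contrast for s in mouth_sides)):
                continue
            # Nose and shadow check: the nose is lighter than both the
            # eyes and the shadowed areas beside the mouth.
            if not (nose > max(left_eye, right_eye) and nose > max(mouth_sides)):
                continue
            hits.append((r, c))
    return hits


# A smiling "face" placed in a plain mid-gray wall.
wall = 128
grid = [
    [wall, wall, wall, wall, wall],
    [wall, wall, 40, 150, 40],
    [wall, wall, 120, 180, 120],
    [wall, wall, 90, 230, 90],
]
print(find_face_candidates(grid))  # [(1, 2)]
```

A real detector would also need to search at multiple scales and tolerate tilted heads. With fixed thresholds like these, anything that happens to have two dark patches above a bright one will pass, which is exactly how false positives sneak in.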
For a quick script, I was surprised by how many team members it actually caught. Excluding about 15 results from Cindy’s camo and Jana’s skirt, the only false result was on the wall, which looks pretty face-like to me. Although it would be fun to iterate on and improve my little program, many people have been specializing in this for years, giving us access to this technology for very little cost.
As good as platforms like Snapchat and Facebook are, methods are still being developed to improve facial recognition. These include 3D mapping, which attempts to determine the actual shape of faces, and skin texture analysis, which attempts to create a more accurate comparison of facial features. But, for now, you’re still able to laugh at the accidental face swap with inanimate objects every now and then.