How does facial recognition work?

Dec 12, 2020

Facial recognition has been around for decades but is only now getting widely adopted, and more likely than not, your face is in a database somewhere!

What is facial recognition? 💻

Facial recognition is software that can identify a person by their unique facial features. It does this by comparing those features to previously stored data about the person, known as a “dataset”.

One popular algorithm used in facial recognition systems is YOLO, or You Only Look Once. The main concept is to have numerous pictures of each subject in a dataset and then, given only one new image, match it to the correct subject.

With the facial recognition industry reaching 4 billion dollars and expected to double in the next 4 years, we can only imagine the advancements to come!

How does it really work? 🤔

#1 Is there even a face?

We first have to understand the difference between detection and recognition. Detection simply tells you whether there is a face or not, a binary yes or no. It is commonly used on its own in public spaces, since keeping a dataset on every individual would be impractical.

Facial recognition also relies on detection, though. The first step is to determine whether the image contains a face at all, since comparing a lamp to datasets of faces is inefficient. Once we are sure there is a face to recognize, we can proceed.
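The detect-first, recognize-second order can be sketched in a few lines of Python. This is only an illustration: `detect_face` and `identify` are hypothetical stand-ins for a real detector and a real recognizer, not functions from any library, and the "images" here are plain dictionaries.

```python
from typing import Callable, Optional


def recognize(image: object,
              detect_face: Callable[[object], bool],
              identify: Callable[[object], str]) -> Optional[str]:
    """Run the expensive identification step only if a face was detected."""
    if not detect_face(image):
        # No face (for example, a lamp): skip the dataset comparison entirely.
        return None
    return identify(image)


# Toy stand-ins (assumptions for the demo): an "image" is just a dict here.
def toy_detector(image):
    return image.get("has_face", False)


def toy_identifier(image):
    return image["person"]


print(recognize({"has_face": False}, toy_detector, toy_identifier))
print(recognize({"has_face": True, "person": "Aleem"}, toy_detector, toy_identifier))
```

The point of the gate is simply that the costly comparison never runs on images without a face.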

#2 Comparing the picture

By analyzing the 80+ distinguishable points on the human face, we can create a unique code for that specific person. Think of it as an ID! Some devices, like flagship smartphones, add a dedicated depth-sensing component such as Lidar (Light Detection And Ranging). Depth sensing gives you many more data points than a traditional camera alone. Because it measures depth, the system is much harder to fool than camera-only systems, which can be tricked by a printed picture of the owner!

These points range from the distance between your eyes to minor details like the angle of your ears. The higher the image's quality, the better the result, because small differences between faces are easier to pick out of the pixels.
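As a rough sketch of how facial points could become a "unique code", the snippet below turns a handful of named landmarks into a list of distances. It is an assumption-heavy toy: real systems use far more points and learned features, and the landmark names, coordinates, and the choice to divide by the eye distance (so the code does not change when the face is closer to the camera) are all made up for illustration.

```python
import math


def face_signature(landmarks):
    """Turn named (x, y) landmarks into a scale-independent list of distances.

    Every pairwise distance is divided by the distance between the eyes,
    so the same face photographed at a different size gives the same numbers.
    """
    names = sorted(landmarks)  # fixed order so signatures are comparable
    eye_gap = math.dist(landmarks["left_eye"], landmarks["right_eye"])
    signature = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            distance = math.dist(landmarks[first], landmarks[second])
            signature.append(distance / eye_gap)
    return signature


# Hypothetical pixel coordinates for four landmarks.
face = {
    "left_eye": (30, 40),
    "right_eye": (70, 40),
    "nose": (50, 60),
    "chin": (50, 100),
}
print(face_signature(face))
```

Four landmarks give six pairwise distances; with 80+ points the signature becomes much more specific to one person.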

#3 Making the match

We now try to match the subject to the correct entry in the database. It used to take hours, or even days, to go through all the different datasets of people, but with recent breakthroughs the process is down to a matter of seconds! The industry standard is now 25 images per second, making real-time video a serious option.
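Here is a minimal sketch of the matching step, assuming each face has already been reduced to a list of numbers (the "unique code" from step #2). It uses a plain nearest-neighbour search, which is one simple way to do it and not necessarily what production systems use. The names, the numbers, and the `max_distance` cut-off are hypothetical.

```python
import math


def best_match(probe, database, max_distance=0.6):
    """Return (name, distance) of the closest stored code, or (None, distance).

    `database` maps a person's name to a list of stored codes for that person.
    `max_distance` is an assumed cut-off: anything farther away is "unknown".
    """
    best_name, best_distance = None, float("inf")
    for name, codes in database.items():
        for code in codes:
            distance = math.dist(probe, code)
            if distance < best_distance:
                best_name, best_distance = name, distance
    if best_distance > max_distance:
        return None, best_distance
    return best_name, best_distance


# Toy database: several stored codes per person, as described above.
people = {
    "alice": [[1.0, 0.70, 1.50], [1.0, 0.72, 1.48]],
    "bob": [[1.0, 0.90, 1.20]],
}
print(best_match([1.0, 0.71, 1.49], people))   # close to alice's stored codes
print(best_match([5.0, 5.0, 5.0], people))     # far from everyone
```

Looping over every stored code is fine for a toy database; the speed-ups mentioned above come from smarter search and faster hardware.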

YOLO ✌️

What is YOLO?

As mentioned earlier, YOLO is an acronym for You Only Look Once. It is an Artificial Intelligence-based object detection algorithm that can be applied to faces, and it works completely differently from the approaches that came before it! It is commonly used from the very popular programming language Python, so it has a low barrier to entry, and I will even be creating my own software with it!

Why YOLO is the most popular

YOLO is one of the most popular and widely used tools for object detection and recognition. It's fairly straightforward to set up, but the actual process is pretty complex and interesting 🤯

YOLO is significantly faster than traditional algorithms. In the early days of facial recognition development, the person's aspect ratio had to be the same in every image! Now we can change angles and sizes as we please, because YOLO uses a single neural network to process the whole image in one pass.

The complexity behind it

The first thing YOLO does is divide and conquer. It splits the input image into a grid of cells, and how large each cell is depends on the size of the input. For example, in an OpenCV experiment with a 416x416 input, the image was split into a 13x13 grid, so each of the 169 cells came out as 32x32 pixels. The scale can fluctuate with the input, but the cells stay square. YOLO then predicts bounding boxes along with a confidence score for each one, which is the model's estimate of how likely the box is to be right. Typically, if the confidence is under 50%, the box is not shown, which keeps the results consistent.
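The grid arithmetic and the 50% cut-off are easy to check in plain Python. This is a sketch of just those two ideas, not of YOLO itself: the helper names and the sample boxes are hypothetical, and a real model produces the boxes and confidence scores from a neural network.

```python
def cell_size(input_size, grid=13):
    """Side length in pixels of one grid cell for a square input."""
    return input_size // grid


def cell_for_point(x, y, input_size=416, grid=13):
    """Return the (row, column) of the grid cell that contains pixel (x, y)."""
    size = input_size / grid
    column = min(int(x // size), grid - 1)  # clamp points on the far edge
    row = min(int(y // size), grid - 1)
    return row, column


def keep_confident(boxes, threshold=0.5):
    """Drop every predicted box whose confidence is under the threshold."""
    return [box for box in boxes if box["confidence"] >= threshold]


print(13 * 13)                   # 169 cells in the grid
print(cell_size(416))            # 32, matching the 416x416 example
print(cell_for_point(208, 100))  # (3, 6)

# Made-up predictions: only the first one clears the 50% bar.
predictions = [
    {"label": "face", "confidence": 0.91},
    {"label": "face", "confidence": 0.32},
]
print(keep_confident(predictions))
```

Raising the threshold shows fewer but more reliable boxes, while lowering it shows more boxes at the cost of more false alarms.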

Conclusions👋

Facial recognition is a very powerful application of AI, and it will only become a bigger part of everyday life.

I will continue to write about Artificial Intelligence, specifically YOLO, on my journey to create facial recognition software :)

-Aleem Rehmtulla