Hope everyone is having a good holiday season! We wanted to take a break from the festivities to share a new demo for face landmark detection. This technique serves a variety of purposes, from lighthearted chats among friends to very serious applications in legal proceedings.
Modern face landmark detection algorithms work best when they’re provided with crisp video, but they’re also quite resilient to the noise and blur that creep in when lighting conditions get bad. Still, they have their limits!
We decided to test these limits with three cameras simultaneously: a standard laptop webcam, a premium low-light webcam, and a SPAD camera with Ubicept processing. Our setup involved the type of lighting you might encounter with typical video calls: a single LED lamp (with a lampshade) and the laptop screen itself (with the GUI in dark mode). For the algorithm itself, we piped the video feed of all three cameras into Google’s MediaPipe Studio (you can try it in your browser!).
Here are some results from the standard laptop webcam:
You can see that the clip starts with the LED lamp at about 85% and the laptop screen at 100%. The face landmark detection algorithm starts off working quite well, capturing eye and mouth movements. However, it loses a lot of precision as we lower the lamp to 15% and the laptop screen to its minimum setting.
Now, let’s take a look at what happens when we keep the lighting constant and switch over to the low-light webcam:
Great results here! But what if we continue darkening the room by shutting off the LED lamp entirely?
Not so great results here. To be fair, we could have smoothed it out a bit by applying noise reduction and/or increasing the exposure time, but doing so would have resulted in subtle movements being lost.
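To make that tradeoff concrete, here is a minimal sketch in plain Python. It is not the processing used by any of the cameras in this demo. It assumes a hypothetical "eyelid openness" signal with a three-frame blink and treats a longer exposure (or temporal noise reduction) as a simple trailing moving average over frames. The function names, noise level, and window size are all made up for illustration.

```python
import random
import statistics


def moving_average(signal, window):
    """Trailing moving average: each output averages up to the last `window` samples."""
    out = []
    for i in range(len(signal)):
        start = max(0, i - window + 1)
        chunk = signal[start:i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def make_blink(n_frames=200, blink_start=100, blink_len=3):
    """Hypothetical eyelid trace: 1.0 = open, 0.0 = closed during a short blink."""
    return [
        0.0 if blink_start <= i < blink_start + blink_len else 1.0
        for i in range(n_frames)
    ]


random.seed(0)  # fixed seed so the illustration is repeatable

clean = make_blink()
# Assumed low-light sensor noise: zero-mean Gaussian added to every frame.
noisy = [x + random.gauss(0.0, 0.3) for x in clean]

window = 15  # stand-in for a 15x longer effective exposure
smoothed = moving_average(noisy, window)
smoothed_clean = moving_average(clean, window)

# Noise level measured on frames well away from the blink.
noise_before = statistics.pstdev(noisy[20:90])
noise_after = statistics.pstdev(smoothed[20:90])

# How deep the blink looks without and with averaging (noise-free trace).
blink_depth_before = 1.0 - min(clean)
blink_depth_after = 1.0 - min(smoothed_clean)

print(f"noise: {noise_before:.3f} -> {noise_after:.3f}")
print(f"blink depth: {blink_depth_before:.2f} -> {blink_depth_after:.2f}")
```

In this toy model the averaging does cut the noise, but the blink that fully closed the eye now only dips by a fraction of its original depth, which is the kind of subtle movement a landmark detector would no longer see.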
Finally, let’s switch over to the Ubicept solution, which combines a SPAD camera and our “photon fusion” algorithm to produce motion-compensated live video:
Much better! Take a look at our full results and comparisons in the video below, and be sure to stay tuned until the end to see how the face landmark detection output translates to an animated avatar. There’s sound in the video too if you’d like to see how the expressions match the speech:
Yes, our video editor had a bit of fun with pitch correction, but we think it works! Anyway, please feel free to reach out to us if you’re interested in learning more!