In our last blog post, What Can SPADs Do Alone?, we showed how Ubicept technologies can unlock the enormous potential of SPAD sensors. If you haven’t seen it yet, please check it out!
That post was a milestone for us: it featured early results from our new full-color 1 MP SPAD development kit, and it was also the first time we benchmarked against a commercial AI-based video denoiser. Many folks have asked us about this comparison, which makes sense given the massive advances in AI over the past few years.
It probably didn’t surprise anyone that, here on the Ubicept blog, we showed that our technique came out ahead. But we also acknowledged that it wasn’t exactly a fair fight—AI-based video denoisers are trained on conventional sensor data, so it’s reasonable to expect them to struggle with SPAD sensor data, which has very different noise characteristics.
So, in the interest of fairness, we decided to flip the script and run a new set of tests using conventional sensors. Will Ubicept Photon Fusion still come out ahead on their home turf? Let's find out!
Motion is one of the biggest challenges in imaging. Whether it’s the camera or the scene that’s moving, fast motion can ruin detail and make things harder for downstream perception systems. It’s a problem across the board: on planes, trains, automobiles, robots, manufacturing lines, surveillance systems, and more.
At the core of this is the balance between shorter exposures, which freeze motion but increase noise, and longer exposures, which lower noise but blur motion. For example, take a look at this video we shot using a head-mounted camera:
This was captured at 240 fps with 1/240 s exposures using a Sony IMX287-based Lucid Vision Labs PHX004S-CS. The left side shows half of the raw frames played back at 60 fps, while the right side averages 8-frame windows to simulate the motion blur you'd get from a 30 fps camera with 1/30 s exposures and a 360° shutter angle.¹
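If you want to reproduce that comparison yourself, a minimal sketch of the averaging step looks like the following (an illustration with hypothetical names, not our processing pipeline): averaging 8 consecutive 1/240 s frames collects the same total light as a single 1/30 s exposure, cutting shot noise by roughly √8 while smearing anything that moved during that window.

```python
import numpy as np

def simulate_long_exposure(frames_240fps: np.ndarray, window: int = 8) -> np.ndarray:
    """Average consecutive short-exposure frames to emulate a longer exposure.

    frames_240fps: stack of raw 1/240 s frames, shape (N, H, W) or (N, H, W, C).
    window: frames per average; 8 x (1/240 s) = 1/30 s, i.e. a 30 fps camera
            with a 360-degree shutter angle.
    """
    n = (frames_240fps.shape[0] // window) * window
    grouped = frames_240fps[:n].reshape(n // window, window, *frames_240fps.shape[1:])
    # Averaging lowers shot noise by ~sqrt(window), but anything that moved
    # during the 1/30 s window gets smeared into motion blur.
    return grouped.mean(axis=1).round().astype(frames_240fps.dtype)

# Hypothetical usage: "capture.npy" holds the raw 240 fps frame stack.
# frames = np.load("capture.npy")
# blurred_30fps = simulate_long_exposure(frames)  # 1/8 as many frames, 1/30 s of blur each
```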
Using high frame rates preserves temporal precision and helps minimize motion blur. As with our SPAD-based systems, the goal here isn't to hide noise through long exposures or heavy filtering. Instead, we aim to capture as much real, instantaneous data as possible so the system can reconstruct something physically accurate.
To the human eye, the scene is easy to follow and nothing looks obviously wrong. But that apparent clarity doesn’t hold up under the hood. For downstream computer vision tasks that rely on frame-level precision—such as optical flow, SLAM, or feature tracking—blur and noise in individual frames can cause serious issues.
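To make that concrete, here's a rough, hypothetical sketch (not part of the tests in this post) that runs OpenCV's ORB feature detector on a synthetic frame with increasing amounts of motion blur. The number of usable keypoints, which optical flow, SLAM, and feature tracking all lean on, falls off quickly as the blur grows; noise causes similar trouble by producing spurious, unrepeatable features.

```python
# Illustration only: how motion blur erodes the features that downstream
# perception pipelines depend on. Uses a synthetic textured frame as a
# stand-in for real footage.
import cv2
import numpy as np

rng = np.random.default_rng(0)

# Synthetic textured frame: random pattern with mild smoothing so it has
# corner-like structure, roughly like a sharp, well-exposed capture.
sharp = cv2.GaussianBlur(
    rng.integers(0, 256, (480, 640), dtype=np.uint8), (0, 0), sigmaX=1.2
)

orb = cv2.ORB_create(nfeatures=5000)

for blur_px in (1, 9, 21, 41):
    # A horizontal box kernel approximates motion blur over `blur_px` pixels.
    kernel = np.ones((1, blur_px), np.float32) / blur_px
    frame = cv2.filter2D(sharp, -1, kernel)
    keypoints = orb.detect(frame, None)
    print(f"motion blur {blur_px:>2}px -> {len(keypoints)} ORB keypoints")
```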
To show this, here’s a frame (without nasty compression artifacts!) from early in the video:
On the left is a raw 240 fps frame. It’s quite noisy! On the right, the 30 fps equivalent is much smoother, but exhibits significant motion blur. Neither of these frames is great if you care about accurate perception.
Now, let’s apply the two competing approaches: an AI-based video denoiser on the left, and Ubicept Photon Fusion on the right:
So, we’ll just go ahead and say what you’re probably thinking—the AI-based video denoiser result looks better here. The entire frame is sharp and free of noise. Ubicept Photon Fusion also performs well, but we have to give credit where it’s due.
Of course, there’s a catch! As anyone who’s used AI-based systems knows, they often hallucinate details that aren’t actually present. So, it’s natural to wonder how much we can trust what we’re seeing from the AI-based video denoiser.
To answer that question, let’s look at a more challenging frame later in the video. Again, we start by looking at the noisy 240 fps frame (left) and the blurrier 30 fps frame (right):
There are a few things worth noting before we continue: the seam on the cabinet, the cable running across the carpet, and a mystery object sitting in the scene. Keep an eye on all three.
Now, let’s look at some results. On the left is the output from the AI-based video denoiser, and on the right is what we get from Ubicept Photon Fusion:
As before, the video denoiser produces a smooth, clean frame—but this time, there are some blatant deficiencies. The seam on the cabinet is distorted and broken. The cable seems to fade into the carpet. And the mystery object? It might as well be a stain or shadow!
Ubicept Photon Fusion, on the other hand, reveals all three of these details: a sharp seam, a distinct cable, and the object, which you might recognize as a ColorChecker.
Let’s zoom in for some gratuitous pixel-peeping:
And here’s the comparison video. The disappearing objects start about halfway through:
For downstream perception applications, the impact of errors like these can vary significantly. In consumer AR/VR scenarios, the obliteration of fine details—like what we see from the AI-based video denoiser—might lead to a degraded user experience. That’s not good, but the outcome is probably no worse than frustration or nausea. In an automotive context, however, mistaking obstacles or pedestrians for a smooth, empty road could have fatal consequences.
This example drives home a core insight: when it comes to perception, appearance means nothing without accuracy. Ubicept Photon Fusion is designed with that principle in mind. It can improve performance for both SPAD and conventional sensors by delivering results that are physics-based and reliable, rather than polished but misleading.
If you’ve spent any time on our site, you probably know we’re big believers that SPADs are the future. So why are we writing about this? To put it simply, conventional sensors are the present. If our technology can help our partners tackle their immediate imaging challenges, it delivers value sooner and offers a smoother path for them to adopt SPADs as they become more accessible for cost-sensitive applications. That’s the approach we’ve been pursuing with partners across a range of industries, such as Continental in the automotive space. We’re looking forward to sharing a lot more soon, so stay tuned!