This paper presents Bayesian-inspired Space-Time Superpixels (BIST): a fast, state-of-the-art method to compute space-time superpixels. BIST is a novel extension of a single-image Bayesian method named BASS, and it is inspired by hill-climbing to a local mode of a Dirichlet-Process Gaussian Mixture Model (DP-GMM). The method is only Bayesian-inspired, rather than actually Bayesian, because it includes heuristic modifications to the theoretically correct sampler. Similar to existing methods, BIST can adapt the number of superpixels to an individual frame using split-merge steps. A key novelty is a new temporal coherence term in the split step, which reduces the chance of splitting propagated superpixels. This term enforces temporal coherence in propagated regions while allowing unconstrained adaptation in disoccluded regions. A hyperparameter controls the strength of this new term and does not require per-video tuning to return consistent results across multiple videos. In wall-clock time, BIST runs over twice as fast as BASS and over 30 times faster than the next fastest space-time superpixel method with open-source code.
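The temporal coherence term can be illustrated with a small sketch: a split is accepted with a probability that depends on the model-fit gain, and propagated superpixels pay an extra penalty before splitting. The sigmoid form, the function names, and the penalty shape here are illustrative assumptions, not BIST's exact sampler.

```python
import math

def split_probability(log_likelihood_gain, is_propagated, lam=1.0):
    """Probability of accepting a superpixel split.

    log_likelihood_gain: model-fit improvement from splitting (hypothetical quantity).
    is_propagated: True if the superpixel was propagated from the previous frame.
    lam: strength of the temporal-coherence penalty (standing in for the paper's
         hyperparameter; this functional form is an assumption for illustration).
    """
    # disoccluded (non-propagated) regions adapt freely: no penalty
    penalty = lam if is_propagated else 0.0
    logit = log_likelihood_gain - penalty
    return 1.0 / (1.0 + math.exp(-logit))

# A propagated superpixel needs a larger likelihood gain before it splits:
p_new_region = split_probability(0.5, is_propagated=False)
p_propagated = split_probability(0.5, is_propagated=True, lam=2.0)
assert p_propagated < p_new_region
```

The one-sided penalty biases the sampler toward keeping propagated regions intact while leaving newly revealed regions free to re-segment.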
Soft Superpixel Neighborhood Attention
Kent Gauen and Stanley Chan
Advances in Neural Information Processing Systems, 2024
Images contain objects with deformable boundaries, such as the contours of a human face, yet attention operators act on square windows. This mixes features from perceptually unrelated regions, which can degrade the quality of a denoiser. This paper proposes using superpixel probabilities to re-weight the local attention map. If images are modeled with latent superpixel probabilities, we show that our re-weighted attention module matches the theoretically optimal denoiser. The left image shows that NA mixes information from the unrelated blue region, Hard-SNA improperly rejects pixels from the adjacent orange regions, and SNA correctly selects all the orange pixels and rejects the blue pixels.
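The re-weighting can be sketched as follows: each attention weight is multiplied by the probability that the corresponding pixel shares the query's superpixel, then the map is renormalized. The function name and the stand-in inputs are assumptions for illustration, not the paper's implementation.

```python
import math

def soft_sna_weights(scores, same_superpixel_prob):
    """Re-weight a local attention map by superpixel membership probabilities.

    scores: raw attention logits for one query over its local window.
    same_superpixel_prob: p_j = probability that pixel j lies in the same
    superpixel as the query (both lists are illustrative stand-ins).
    """
    # soft re-weighting: scale exp(score_j) by p_j, then renormalize;
    # a hard variant would instead threshold p_j to {0, 1}
    weighted = [math.exp(s) * p for s, p in zip(scores, same_superpixel_prob)]
    z = sum(weighted)
    return [w / z for w in weighted]

# pixels with near-zero membership probability get near-zero attention
w = soft_sna_weights([1.0, 1.0, 1.0], [0.9, 0.9, 0.01])
```

Because the probabilities enter multiplicatively before normalization, unrelated pixels are softly suppressed rather than discarded outright, unlike the hard-thresholded variant.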
Computing attention maps for videos is challenging due to the motion of objects between frames. Small spatial inaccuracies significantly degrade the attention module's quality. Recent works propose using a deep network to correct these small inaccuracies. In this project, we efficiently implement a space-time grid search that outperforms existing deep neural network alternatives. The image on the left shows a no-shift search, a search using a deep network from related works, and our proposed shifted non-local search.
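The idea behind a shifted search can be sketched in a few lines: center the search window at an optical-flow-shifted location in the next frame, then grid-search a small spatial neighborhood around it for the lowest matching cost. The function names, the 1x1 L2 cost, and the toy frames below are assumptions for illustration, not the project's implementation.

```python
def shifted_nonlocal_search(frame0, frame1, y, x, flow, radius=1):
    """Grid search for the best match of pixel (y, x) of frame0 in frame1.

    The search window is centered at the flow-shifted location, so the
    grid search only needs to correct small inaccuracies in the flow.
    """
    cy, cx = y + flow[0], x + flow[1]  # shift the window center by the flow
    best, best_cost = (cy, cx), float("inf")
    h, w = len(frame1), len(frame1[0])
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            ny, nx = cy + dy, cx + dx
            if not (0 <= ny < h and 0 <= nx < w):
                continue  # skip candidates outside the frame
            cost = (frame0[y][x] - frame1[ny][nx]) ** 2  # 1x1 patch L2 cost
            if cost < best_cost:
                best, best_cost = (ny, nx), cost
    return best

# toy frames: the bright pixel moves one step right; even with a zero
# flow guess, a radius-1 search recovers the true location (1, 2)
f0 = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
f1 = [[0, 0, 0], [0, 0, 9], [0, 0, 0]]
```

A no-shift search corresponds to `flow = (0, 0)` with a window large enough to cover the motion; shifting the center lets a much smaller, cheaper window succeed.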