Two Things I Care About
Two Things I Care About
Two ongoing threads, in plain language and one picture each.
🎨Semantic Segmentation across Domains
“Teaching a model that the road in Boston is still the road in Beijing.”
I work on unsupervised domain adaptation for semantic segmentation — the model is trained on one city (or one LiDAR sensor, or one microscope), and is then expected to just work on another. The picture is rarely that kind.
My approach: instead of trusting confident pseudo-labels blindly, I look for structural priors — long-range dependencies in 3D point clouds, density patterns in LiDAR sweeps, anatomical regularities in medical slides — and use them as a quiet, geometry-aware supervisor.
💭Hallucinations in Large Vision-Language Models
“When the model sees a cat that isn't there.”
Large vision-language models (LVLMs) are eloquent storytellers, but they sometimes describe objects that the image never contained.
My recent work proposes attention contrastive decoding — a decoding-time intervention that quietly suppresses the heads hallucinating from the language prior, while keeping the heads that genuinely look at the image. No retraining required; the surgery is at inference.