Two Things I Care About

Research, in plain language · 02

Two ongoing research threads, each in plain language with one picture.

🎨 Semantic Segmentation across Domains

“Teaching a model that the road in Boston is still the road in Beijing.”

Figure 1. Domain shift between source and target; the prediction stays consistent across both.

I work on unsupervised domain adaptation for semantic segmentation: the model is trained on labeled data from one city (or one LiDAR sensor, or one microscope) and is then expected to just work on another, with no labels from the new domain. Reality is rarely that kind.

My approach: instead of trusting confident pseudo-labels blindly, I look for structural priors — long-range dependencies in 3D point clouds, density patterns in LiDAR sweeps, anatomical regularities in medical slides — and use them as a quiet, geometry-aware supervisor.
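To make the idea of a structural check on pseudo-labels concrete, here is a toy sketch in Python. It is not the method described above: the function name `filter_pseudo_labels`, the neighbor lists, and both thresholds are hypothetical, and simple neighbor agreement stands in for the richer priors (long-range dependencies, density patterns, anatomical regularities). A pseudo-label is kept only if the model is confident and enough of the point's neighbors predict the same class.

```python
def filter_pseudo_labels(probs, neighbors, conf_threshold=0.9, agree_threshold=0.5):
    """Return one pseudo-label per point, or None where the label is rejected.

    probs: per-class probabilities, one list per point.
    neighbors: neighbor indices, one list per point. This is a hypothetical
        stand-in for a structural prior such as spatial adjacency.
    """
    # Hard prediction for every point: index of the most probable class.
    hard = [max(range(len(p)), key=p.__getitem__) for p in probs]

    labels = []
    for i, p in enumerate(probs):
        # Step 1: the usual confidence test.
        if p[hard[i]] < conf_threshold:
            labels.append(None)
            continue
        # Step 2: the structural test. Reject a confident label when too few
        # neighbors agree with it (assumed rule, for illustration only).
        nbrs = neighbors[i]
        if nbrs:
            agreement = sum(1 for j in nbrs if hard[j] == hard[i]) / len(nbrs)
            if agreement < agree_threshold:
                labels.append(None)
                continue
        labels.append(hard[i])
    return labels


# Four points, two classes. Point 2 is confident but disagrees with all of
# its neighbors; point 3 is simply not confident enough.
probs = [[0.95, 0.05], [0.92, 0.08], [0.03, 0.97], [0.60, 0.40]]
neighbors = [[1, 2], [0], [0, 1], [0, 1]]
labels = filter_pseudo_labels(probs, neighbors)
```

In this toy input, a confidence-only filter would accept point 2; the neighbor check is what rejects it.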

💭 Hallucinations in Large Vision-Language Models

“When the model sees a cat that isn't there.”

Figure 2. A hallucinated object disappears once decoding is corrected.

Large vision-language models (LVLMs) are eloquent storytellers, but they sometimes describe objects that the image never contained.

My recent work proposes attention contrastive decoding, a decoding-time intervention that quietly suppresses the attention heads that hallucinate from the language prior while keeping the heads that genuinely look at the image. No retraining is required; the surgery happens at inference.
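For readers who want the general flavor of contrastive decoding, here is a toy sketch in Python. It is an assumption-laden simplification, not the head-level method described above: it contrasts two hypothetical vectors of next-token logits, one from a pass that attends to the image (`grounded`) and one from a pass dominated by the language prior (`prior_only`), using the generic rule `(1 + alpha) * grounded - alpha * prior_only`. The vocabulary, the numbers, and `alpha` are made up for illustration.

```python
def contrastive_logits(grounded, prior_only, alpha=1.0):
    """Boost tokens the image supports and penalize tokens the prior favors.

    grounded: next-token logits from a pass that uses the image.
    prior_only: next-token logits from a pass driven by the language prior.
    alpha: contrast strength (hypothetical value; 0 disables the contrast).
    """
    return [(1 + alpha) * g - alpha * p for g, p in zip(grounded, prior_only)]


def argmax(values):
    """Index of the largest value, i.e. the greedy token choice."""
    return max(range(len(values)), key=values.__getitem__)


# Made-up example: the image shows a dog, but the language prior likes "cat".
vocab = ["dog", "cat", "grass"]
grounded = [2.0, 2.2, 1.0]    # "cat" narrowly wins without any correction
prior_only = [0.5, 2.5, 0.2]  # the prior alone strongly prefers "cat"

plain_choice = vocab[argmax(grounded)]
corrected_choice = vocab[argmax(contrastive_logits(grounded, prior_only))]
```

With these numbers, plain greedy decoding picks "cat", while the contrasted logits pick "dog", because most of the support for "cat" came from the prior rather than from the image. Nothing is retrained; only the scores used at decoding time change.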