Yujia Chen
Personal academic site

Two Things I Care About

Research, in plain language · 02

Two ongoing threads, in plain language and one picture each.


🎨 Semantic Segmentation across Domains

Teaching a model that the road in Boston is still the road in Beijing.

[Figure 1 panels: Source domain (labelled · daytime · City A) | Target domain (unlabelled · dusk · City B) | Adapted prediction (my model · domain-invariant)]
Figure 1. Domain shift between source and target; the prediction stays consistent across both.

I work on unsupervised domain adaptation for semantic segmentation: the model is trained on labelled data from one city (or one LiDAR sensor, or one microscope) and is then expected to just work on another, where no labels are available. Reality is rarely that kind.

My approach: instead of trusting confident pseudo-labels blindly, I look for structural priors — long-range dependencies in 3D point clouds, density patterns in LiDAR sweeps, anatomical regularities in medical slides — and use them as a quiet, geometry-aware supervisor.
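
To make the contrast concrete, here is a toy sketch rather than the actual method. It compares plain confidence thresholding with a version that also asks each prediction to agree with its neighbours. The neighbour-majority vote is only an assumed stand-in for a structural prior, and every name in it (`confident_pseudo_labels`, `prior_checked_pseudo_labels`, `neighbors`, the 0.9 threshold) is hypothetical.

```python
from collections import Counter

IGNORE = -1  # label for points left out of self-training


def confident_pseudo_labels(preds, confs, threshold=0.9):
    """Baseline: keep every prediction whose confidence clears the threshold."""
    return [p if c >= threshold else IGNORE for p, c in zip(preds, confs)]


def prior_checked_pseudo_labels(preds, confs, neighbors, threshold=0.9):
    """Keep a confident prediction only if it matches the majority class
    among its neighbours (an assumed, simplified structural prior)."""
    labels = []
    for i, (p, c) in enumerate(zip(preds, confs)):
        votes = Counter(preds[j] for j in neighbors[i])
        majority = votes.most_common(1)[0][0] if votes else p
        labels.append(p if c >= threshold and p == majority else IGNORE)
    return labels


# Five target-domain points: point 2 is confidently predicted as class 1
# while everything around it is class 0; point 4 is simply unconfident.
preds = [0, 0, 1, 0, 0]
confs = [0.95, 0.97, 0.96, 0.93, 0.50]
neighbors = [[1, 2, 3], [0, 2, 3], [0, 1, 3], [1, 2, 4], [1, 2, 3]]

baseline = confident_pseudo_labels(preds, confs)
checked = prior_checked_pseudo_labels(preds, confs, neighbors)
print(baseline)  # [0, 0, 1, 0, -1]
print(checked)   # [0, 0, -1, 0, -1]
```

The baseline accepts the isolated class-1 prediction because it is confident; the prior-checked version drops it because it disagrees with its surroundings. A real structural prior would be far richer than a neighbour vote, but the filtering role is the same.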


💭 Hallucinations in Large Vision-Language Models

When the model sees a cat that isn't there.

[Figure 2 panels: input image → LVLM | Vanilla LVLM caption: “A brown dog stands on grass next to a small white cat.” (the cat is marked as hallucinated) | With Attention Contrastive Decoding: “A brown dog stands on grass under a sunny sky.”]
Figure 2. A hallucinated object disappears once decoding is corrected.

Large vision-language models (LVLMs) are eloquent storytellers, but they sometimes describe objects that the image never contained.

My recent work proposes attention contrastive decoding — a decoding-time intervention that quietly suppresses the heads hallucinating from the language prior, while keeping the heads that genuinely look at the image. No retraining required; the surgery is at inference.
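
For readers who like to see the arithmetic, below is a minimal sketch of generic contrastive decoding on a single token step. It is not the head-level method described above. It assumes two sets of next-token logits are already available: `logits_full` from the normal forward pass, and `logits_prior` from a hypothetical pass in which the image-grounded signal has been removed, so that mostly the language prior speaks. The function names, the toy vocabulary, the logit values and `alpha` are all illustrative assumptions.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def contrastive_scores(logits_full, logits_prior, alpha=1.0):
    """Boost tokens the full model prefers and penalise tokens that the
    language-prior-only pass already likes: (1 + alpha) * full - alpha * prior."""
    lp_full = log_softmax(logits_full)
    lp_prior = log_softmax(logits_prior)
    return [(1 + alpha) * f - alpha * p for f, p in zip(lp_full, lp_prior)]


def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])


# Toy next-token step after "A brown dog stands on grass next to a small white ..."
vocab = ["grass", "cat", "sky"]
logits_full = [2.0, 2.2, 1.8]    # normal pass: "cat" narrowly wins
logits_prior = [0.5, 3.0, 0.2]   # prior-only pass: "cat" wins by a wide margin

scores = contrastive_scores(logits_full, logits_prior, alpha=1.0)
vanilla_pick = vocab[argmax(logits_full)]
contrastive_pick = vocab[argmax(scores)]
print(vanilla_pick, contrastive_pick)  # cat grass
```

The token that the prior-only pass favours most strongly is the one pushed down hardest, which is why "cat" loses its narrow lead once the two passes are contrasted. Everything happens at decoding time; no weights are updated.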