Research, in plain language · 02

Two things I care about.

Plain-language notes on what I actually work on. One picture, one paragraph, no hedging.

By Yujia Chen · 2 threads · Updated continuously
🎨 Research thread

Semantic Segmentation across Domains

Teaching a model that the road in Boston is still the road in Beijing.

[Figure panels: Source domain (labelled · daytime · CityA) | Target domain (unlabelled · dusk · CityB) | Adapted prediction (my model · domain-invariant)]
Fig. 1

Domain shift between source and target; the prediction stays consistent across both.

I work on unsupervised domain adaptation for semantic segmentation: the model is trained on one city (or one LiDAR sensor, or one microscope) and is then expected to just work on another. The target data is rarely that kind.

My approach: instead of trusting confident pseudo-labels blindly, I look for structural priors — long-range dependencies in 3D point clouds, density patterns in LiDAR sweeps, anatomical regularities in medical slides — and use them as a quiet, geometry-aware supervisor.
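To make the contrast with confidence-only pseudo-labelling concrete, here is a minimal sketch in plain Python. It is not the method described above: the "structural prior" is replaced by the simplest possible stand-in, agreement with the 4 neighbouring pixels, and the function name, thresholds, and `ignore_index` value are all assumptions chosen for illustration.

```python
def select_pseudo_labels(probs, conf_thresh=0.9, agree_thresh=0.5, ignore_index=255):
    """Keep a pseudo-label only if it is confident AND structurally consistent.

    probs[c][y][x] holds per-class softmax scores for one unlabelled target image.
    The consistency check (4-neighbour agreement) is a hypothetical stand-in
    for a real structural prior.
    """
    num_classes, height, width = len(probs), len(probs[0]), len(probs[0][0])

    # Hard prediction and its confidence at every pixel.
    labels = [[max(range(num_classes), key=lambda c: probs[c][y][x])
               for x in range(width)] for y in range(height)]
    conf = [[probs[labels[y][x]][y][x] for x in range(width)]
            for y in range(height)]

    selected = [[ignore_index] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # In-bounds 4-neighbours of the current pixel.
            nbrs = [(y + dy, x + dx)
                    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1))
                    if 0 <= y + dy < height and 0 <= x + dx < width]
            # Fraction of neighbours predicting the same class.
            agree = (sum(labels[ny][nx] == labels[y][x] for ny, nx in nbrs) / len(nbrs)
                     if nbrs else 1.0)
            if conf[y][x] >= conf_thresh and agree >= agree_thresh:
                selected[y][x] = labels[y][x]
    return selected
```

A lone pixel that is confidently "class 1" in a sea of "class 0" passes a pure confidence filter but is dropped here, because none of its neighbours agree. Pixels marked with `ignore_index` would simply be skipped by the training loss; 255 is a common convention for that in segmentation pipelines.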

§
💭 Research thread

Hallucinations in Large Vision-Language Models

When the model sees a cat that isn't there.

[Figure panels: input image → LVLM. Vanilla LVLM caption: “A brown dog stands on grass next to a small white cat.” (marked as hallucinated). With Attention Contrastive Decoding: “A brown dog stands on grass under a sunny sky.”]
Fig. 2

A hallucinated object disappears once decoding is corrected.

Large vision-language models (LVLMs) are eloquent storytellers, but they sometimes describe objects that the image never contained.

My recent work proposes attention contrastive decoding: a decoding-time intervention that quietly suppresses the heads hallucinating from the language prior, while keeping the heads that genuinely look at the image. No retraining required; the surgery happens at inference time.
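The general idea of contrastive decoding can be sketched in a few lines. This is a generic version under stated assumptions, not the attention-head mechanism itself: it assumes two sets of next-token logits are already available, one from the full model and one from a branch that leans on the language prior. How that second branch is built from attention heads is not specified here, and the function name and the `alpha` and `beta` parameters are illustrative.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - lse for v in logits]

def contrastive_next_token(full_logits, prior_logits, alpha=1.0, beta=0.1):
    """Pick the next token by contrasting the full model with a prior-heavy branch.

    full_logits:  next-token logits from the model looking at the image.
    prior_logits: logits from a (hypothetical) branch dominated by the language prior.
    alpha:        contrast strength; alpha=0 falls back to plain greedy decoding.
    beta:         plausibility cut-off relative to the full model's top token.
    """
    full_lp = log_softmax(full_logits)
    prior_lp = log_softmax(prior_logits)

    # Only tokens the full model already finds reasonably likely stay eligible,
    # so the contrast cannot promote implausible tokens.
    cutoff = max(full_lp) + math.log(beta)
    scores = [(1 + alpha) * f - alpha * p if f >= cutoff else -math.inf
              for f, p in zip(full_lp, prior_lp)]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy vocabulary: the prior strongly expects "cat" after "dog ... next to a".
vocab = ["cat", "sky", "the"]
print(vocab[contrastive_next_token([2.0, 1.8, -3.0], [3.0, 0.0, -3.0])])  # sky
```

In the toy example, plain greedy decoding on the full logits would pick "cat". Subtracting the prior-heavy branch penalises exactly the token the language prior is pushing, so "sky" wins instead.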

§