Publications

(2023). Siamese DETR. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.

PDF Code

(2022). ST-Adapter: Parameter-efficient Image-to-Video Transfer Learning. Advances in Neural Information Processing Systems (NeurIPS), 2022.

PDF Code

(2022). Bamboo: Building Mega-Scale Vision Dataset Continually with Human-Machine Synergy. CoRR.

PDF Cite Code Project

(2022). X-Learner: Learning Cross Sources and Tasks for Universal Visual Representation. European Conference on Computer Vision (ECCV), 2022.

PDF Cite

(2022). Benchmarking Omni-Vision Representation through the Lens of Visual Realms. European Conference on Computer Vision (ECCV), 2022.

PDF Cite

(2022). RePre: Improving Self-Supervised Vision Transformer with Reconstructive Pre-training. International Joint Conference on Artificial Intelligence (IJCAI), 2022.

PDF Cite DOI

(2022). 1st Place Solutions for RxR-Habitat Vision-and-Language Navigation Competition (CVPR 2022). CoRR.

PDF Cite

(2022). Prompt for Extraction? PAIE: Prompting Argument Interaction for Event Argument Extraction. Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (ACL Long Papers), 2022.

PDF Cite Code DOI

(2022). MMEKG: Multi-modal Event Knowledge Graph towards Universal Representation across Modalities. Annual Meeting of the Association for Computational Linguistics: System Demonstrations (ACL Demo), 2022.

PDF Cite DOI

(2022). ERGO: Event Relational Graph Transformer for Document-level Event Causality Identification. CoRR.

PDF Cite

(2022). Few-shot Forgery Detection via Guided Adversarial Interpolation. CoRR.

PDF Cite

(2022). Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm. International Conference on Learning Representations (ICLR), 2022.

PDF Cite

(2022). Task-Balanced Distillation for Object Detection. CoRR.

Cite

(2022). ST-Adapter: Parameter-Efficient Image-to-Video Transfer Learning for Action Recognition. CoRR.

PDF Cite

(2022). SNCSE: Contrastive Learning for Unsupervised Sentence Embedding with Soft Negative Samples. CoRR.

Cite

(2022). Robust Face Anti-Spoofing with Dual Probabilistic Modeling. CoRR.

Cite

(2021). ForgeryNet - Face Forgery Analysis Challenge 2021: Methods and Results. CoRR.

PDF Cite

(2021). A Simple Long-Tailed Recognition Baseline via Vision-Language Model. CoRR.

PDF Cite Code

(2021). BlockQNN: Efficient Block-Wise Neural Network Architecture Generation. IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE T-PAMI), 2021.

Cite DOI

(2021). ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Oral Presentation, 2021.

PDF Cite Dataset DOI

(2021). Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021.

PDF Cite Code DOI

(2021). Few-Shot Domain Expansion for Face Anti-Spoofing. CoRR.

Cite

(2020). Thinking in Frequency: Face Forgery Detection by Mining Frequency-Aware Clues. European Conference on Computer Vision (ECCV), 2020.

PDF Cite DOI

(2020). Powering One-Shot Topological NAS with Stabilized Share-Parameter Proxy. European Conference on Computer Vision (ECCV), 2020.

PDF Cite DOI

(2020). Learning Connectivity of Neural Networks from a Topological Perspective. European Conference on Computer Vision (ECCV), 2020.

PDF Cite DOI

(2020). CelebA-Spoof: Large-Scale Face Anti-spoofing Dataset with Rich Annotations. European Conference on Computer Vision (ECCV), 2020.

PDF Cite Dataset Video DOI

(2020). 1st place solution for AVA-Kinetics Crossover in ActivityNet Challenge 2020. CoRR.

PDF Cite

(2020). High-Quality Video Generation from Static Structural Annotations. International Journal of Computer Vision (IJCV), 2020.

PDF Cite Code DOI

(2020). Morphing and Sampling Network for Dense Point Cloud Completion. AAAI Conference on Artificial Intelligence (AAAI), 2020.

PDF Cite Code Dataset DOI

(2020). PV-NAS: Practical Neural Architecture Search for Video Recognition. CoRR.

Cite

(2019). Learning to Predict Layout-to-image Conditional Convolutions for Semantic Image Synthesis. Advances in Neural Information Processing Systems (NeurIPS), 2019.

PDF Cite Code Slides

(2019). CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval. IEEE/CVF International Conference on Computer Vision (ICCV), 2019.

PDF Cite Code DOI

(2019). Video Generation From Single Semantic Label Map. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019.

PDF Cite Code DOI

(2019). Semantics Disentangling for Text-To-Image Generation. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Oral Presentation, 2019.

PDF Cite DOI

(2019). Improving Referring Expression Grounding With Cross-Modal Attention-Guided Erasing. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019.

PDF Cite Code DOI

(2019). Context and Attribute Grounded Dense Captioning. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019.

PDF Cite DOI

(2019). Unsupervised Bi-directional Flow-based Video Generation from one Snapshot. CoRR.

Cite

(2018). Multi-Label Image Classification via Knowledge Distillation from Weakly-Supervised Detection. ACM International Conference on Multimedia (MM), 2018.

PDF Cite Code DOI

(2018). Show, Tell and Discriminate: Image Captioning by Self-retrieval with Partially Labeled Data. European Conference on Computer Vision (ECCV), 2018.

PDF Cite DOI

(2018). Transductive Centroid Projection for Semi-supervised Large-Scale Recognition. European Conference on Computer Vision (ECCV), 2018.

PDF Cite DOI

(2018). Localization Guided Learning for Pedestrian Attribute Recognition. British Machine Vision Conference (BMVC), 2018.

PDF Cite

(2018). Practical Block-Wise Neural Network Architecture Generation. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Oral Presentation, 2018.

Cite DOI

(2018). Exploring Disentangled Feature Representation Beyond Face Identification. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018.

PDF Cite DOI

(2018). Avatar-Net: Multi-scale Zero-Shot Style Transfer by Feature Decoration. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018.

PDF Cite Code Project DOI

(2018). Zoom-Net: Mining Deep Feature Interactions for Visual Relationship Recognition. European Conference on Computer Vision (ECCV), 2018.

PDF Cite DOI

(2018). Improving Deep Visual Representation for Person Re-identification by Global and Local Image-language Association. European Conference on Computer Vision (ECCV), 2018.

PDF Cite DOI

(2017). Orientation Invariant Feature Embedding and Spatial Temporal Regularization for Vehicle Re-identification. IEEE International Conference on Computer Vision (ICCV), 2017.

PDF Cite DOI

(2017). HydraPlus-Net: Attentive Deep Features for Pedestrian Analysis. IEEE International Conference on Computer Vision (ICCV), 2017.

PDF Cite Code DOI

(2017). Spindle Net: Person Re-identification with Human Body Region Guided Feature Decomposition and Fusion. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

PDF Cite Code Dataset DOI

(2017). Learning Scene-Independent Group Descriptors for Crowd Understanding. IEEE Transactions on Circuits and Systems for Video Technology (IEEE T-CSVT), 2017.

Cite DOI

(2017). Crowded Scene Understanding by Deeply Learned Volumetric Slices. IEEE Transactions on Circuits and Systems for Video Technology (IEEE T-CSVT), 2017.

Cite DOI

(2016). Slicing Convolutional Neural Network for Crowd Video Understanding. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Spotlight, 2016.

Cite DOI

(2015). Deeply Learned Attributes for Crowded Scene Understanding. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Oral Presentation, 2015.

PDF Cite DOI

(2014). Scene-Independent Group Profiling in Crowd. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Oral Presentation, 2014.

PDF Cite DOI