Computer Vision and Pattern Recognition

Authors and titles for June 2025

Total of 3129 entries: 1-50 51-100 101-150 151-200 201-250 251-300 301-350 ... 3101-3129


[151] arXiv:2506.01783 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: FaceCoT: A Benchmark Dataset for Face Anti-Spoofing with Chain-of-Thought Reasoning

Honglu Zhang, Zhiqin Fang, Ningning Zhao, Saihui Hou, Long Ma, Renwang Pei, Zhaofeng He

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[152] arXiv:2506.01795 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: R2SM: Referring and Reasoning for Selective Masks

Yu-Lin Shih, Wei-En Tai, Cheng Sun, Yu-Chiang Frank Wang, Hwann-Tzong Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[153] arXiv:2506.01799 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: WorldExplorer: Towards Generating Fully Navigable 3D Scenes

Manuel-Andreas Schneider, Lukas Höllein, Matthias Nießner

Comments: Project page: see https://the-world-explorer.github.io/, video: see https://youtu.be/c1lBnwJWNmE

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[154] arXiv:2506.01801 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: OmniV2V: Versatile Video Generation and Editing via Dynamic Content Manipulation

Sen Liang, Zhentao Yu, Zhengguang Zhou, Teng Hu, Hongmei Wang, Yi Chen, Qin Lin, Yuan Zhou, Xin Li, Qinglin Lu, Zhibo Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[155] arXiv:2506.01802 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: UMA: Ultra-detailed Human Avatars via Multi-level Surface Alignment

Heming Zhu, Guoxing Sun, Christian Theobalt, Marc Habermann

Comments: For video results, see https://youtu.be/XMNCy7J2tuc

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[156] arXiv:2506.01806 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Ridgeformer: Mutli-Stage Contrastive Training For Fine-grained Cross-Domain Fingerprint Recognition

Shubham Pandey, Bhavin Jawade, Srirangaraj Setlur

Comments: Accepted for publication at the IEEE International Conference on Image Processing 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[157] arXiv:2506.01822 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: GSCodec Studio: A Modular Framework for Gaussian Splat Compression

Sicheng Li, Chengzhen Wu, Hao Li, Xiang Gao, Yiyi Liao, Lu Yu

Comments: Project repository: https://github.com/JasonLSC/GSCodec_Studio

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[158] arXiv:2506.01850 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: MoDA: Modulation Adapter for Fine-Grained Visual Grounding in Instructional MLLMs

Wayner Barrios, Andrés Villa, Juan León Alcázar, SouYoung Jin, Bernard Ghanem

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM)
[159] arXiv:2506.01853 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: ShapeLLM-Omni: A Native Multimodal LLM for 3D Generation and Understanding

Junliang Ye, Zhengyi Wang, Ruowen Zhao, Shenghao Xie, Jun Zhu

Comments: Project page: https://github.com/JAMESYJL/ShapeLLM-Omni

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[160] arXiv:2506.01902 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Enhancing Biomedical Multi-modal Representation Learning with Multi-scale Pre-training and Perturbed Report Discrimination

Xinliu Zhong, Kayhan Batmanghelich, Li Sun

Comments: 6 pages, 1 figure, accepted by the 2024 IEEE Conference on Artificial Intelligence (CAI)

Journal reference: 2024 IEEE Conference on Artificial Intelligence (CAI), 2024, pp. 480-485

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[161] arXiv:2506.01908 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Reinforcement Learning Tuning for VideoLLMs: Reward Design and Data Efficiency

Hongyu Li, Songhao Han, Yue Liao, Junfeng Luo, Jialin Gao, Shuicheng Yan, Si Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[162] arXiv:2506.01912 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Elucidating the representation of images within an unconditional diffusion model denoiser

Zahra Kadkhodaie, Stéphane Mallat, Eero Simoncelli

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[163] arXiv:2506.01921 (cross-listed from cs.CV) [Chinese pdf, pdf, other]

Title: MedEBench: Revisiting Text-instructed Image Editing on Medical Domain

Minghao Liu, Zhitao He, Zhiyuan Fan, Qingyun Wang, Yi R. Fung

Comments: Project website: https://mliuby.github.io/MedEBench_Website/

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[164] arXiv:2506.01923 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: TaxaDiffusion: Progressively Trained Diffusion Model for Fine-Grained Species Generation

Amin Karimi Monsefi, Mridul Khurana, Rajiv Ramnath, Anuj Karpatne, Wei-Lun Chao, Cheng Zhang

Comments: Accepted to ICCV 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[165] arXiv:2506.01933 (cross-listed from cs.CV) [Chinese pdf, pdf, other]

Title: E3D-Bench: A Benchmark for End-to-End 3D Geometric Foundation Models

Wenyan Cong, Yiqing Liang, Yancheng Zhang, Ziyi Yang, Yan Wang, Boris Ivanovic, Marco Pavone, Chen Chen, Zhangyang Wang, Zhiwen Fan

Comments: Project page: https://e3dbench.github.io/

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[166] arXiv:2506.01935 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Low-Rank Head Avatar Personalization with Registers

Sai Tanmay Reddy Chakkera, Aggelina Chatziagapi, Md Moniruzzaman, Chen-Ping Yu, Yi-Hsuan Tsai, Dimitris Samaras

Comments: 23 pages, 16 figures. Project page: https://starc52.github.io/publications/2025-05-28-LoRAvatar/

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[167] arXiv:2506.01940 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Fast and Robust Rotation Averaging with Anisotropic Coordinate Descent

Yaroslava Lochman, Carl Olsson, Christopher Zach

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[168] arXiv:2506.01942 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: OD3: Optimization-free Dataset Distillation for Object Detection

Salwa K. Al Khatib (1), Ahmed ElHagry (1), Shitong Shao (2 and 1), Zhiqiang Shen (1) ((1) Mohamed Bin Zayed University of Artificial Intelligence (MBZUAI), (2) Hong Kong University of Science and Technology (Guangzhou))

Comments: The first three authors contributed equally

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[169] arXiv:2506.01943 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control

Xiao Fu, Xintao Wang, Xian Liu, Jianhong Bai, Runsen Xu, Pengfei Wan, Di Zhang, Dahua Lin

Comments: Project page: https://fuxiao0719.github.io/projects/robomaster/ Code: https://github.com/KwaiVGI/RoboMaster

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[170] arXiv:2506.01946 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: MLLMs Need 3D-Aware Representation Supervision for Scene Understanding

Xiaohu Huang, Jingjing Wu, Qunyi Xie, Kai Han

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[171] arXiv:2506.01949 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: IMAGHarmony: Controllable Image Editing with Consistent Object Quantity and Layout

Fei Shen, Xiaoyu Du, Yutong Gao, Jian Yu, Yushe Cao, Xing Lei, Jinhui Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[172] arXiv:2506.01955 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Dual-Process Image Generation

Grace Luo, Jonathan Granskog, Aleksander Holynski, Trevor Darrell

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG)
[173] arXiv:2506.02010 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: CNVSRC 2024: The Second Chinese Continuous Visual Speech Recognition Challenge

Zehua Liu, Xiaolou Li, Chen Chen, Lantian Li, Dong Wang

Comments: To appear at INTERSPEECH 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[174] arXiv:2506.02011 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: OASIS: Online Sample Selection for Continual Visual Instruction Tuning

Minjae Lee, Minhyuk Seo, Tingyu Qu, Tinne Tuytelaars, Jonghyun Choi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[175] arXiv:2506.02012 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Leveraging Large Language Models in Visual Speech Recognition: Model Scaling, Context-Aware Decoding, and Iterative Polishing

Zehua Liu, Xiaolou Li, Li Guo, Lantian Li, Dong Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[176] arXiv:2506.02014 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Research on Driving Scenario Technology Based on Multimodal Large Lauguage Model Optimization

Wang Mengjie, Zhu Huiping, Li Jian, Shi Wenxiu, Zhang Song

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[177] arXiv:2506.02015 (cross-listed from cs.CV) [Chinese pdf, pdf, other]

Title: Object-centric Self-improving Preference Optimization for Text-to-Image Generation

Yoonjin Oh, Yongjin Kim, Hyomin Kim, Donghwan Chi, Sungwoong Kim

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[178] arXiv:2506.02016 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Are classical deep neural networks weakly adversarially robust?

Nuolin Sun, Linyuan Wang, Dongyang Li, Bin Yan, Lei Li

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[179] arXiv:2506.02017 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Fairness through Feedback: Addressing Algorithmic Misgendering in Automatic Gender Recognition

Camilla Quaresmini, Giacomo Zanotti

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[180] arXiv:2506.02020 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Improve Multi-Modal Embedding Learning via Explicit Hard Negative Gradient Amplifying

Youze Xue, Dian Li, Gang Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[181] arXiv:2506.02021 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Dynamic-Aware Video Distillation: Optimizing Temporal Resolution Based on Video Semantics

Yinjie Zhao, Heng Zhao, Bihan Wen, Yew-Soon Ong, Joey Tianyi Zhou

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[182] arXiv:2506.02022 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Do You See Me : A Multidimensional Benchmark for Evaluating Visual Perception in Multimodal LLMs

Aditya Kanade, Tanuja Ganu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[183] arXiv:2506.02095 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences

Hyojin Bahng, Caroline Chan, Fredo Durand, Phillip Isola

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[184] arXiv:2506.02112 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: SAB3R: Semantic-Augmented Backbone in 3D Reconstruction

Xuweiyi Chen, Tian Xia, Sihan Xu, Jianing Yang, Joyce Chai, Zezhou Cheng

Comments: 3D-LLM/VLA @ CVPR2025 | Project page: https://uva-computer-vision-lab.github.io/sab3r/

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[185] arXiv:2506.02150 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Implicit Deformable Medical Image Registration with Learnable Kernels

Stefano Fogarollo, Gregor Laimer, Reto Bale, Matthias Harders

Comments: Pre-accepted

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[186] arXiv:2506.02161 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: TIIF-Bench: How Does Your T2I Model Follow Your Instructions?

Xinyu Wei, Jinrui Zhang, Zeqing Wang, Hongyang Wei, Zhen Guo, Lei Zhang

Comments: 23 pages, 12 figures, 11 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[187] arXiv:2506.02164 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Quantifying task-relevant representational similarity using decision variable correlation

Yu (Eric)Qian, Wilson S. Geisler, Xue-Xin Wei

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Neurons and Cognition (q-bio.NC); Quantitative Methods (q-bio.QM)
[188] arXiv:2506.02167 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Fire360: A Benchmark for Robust Perception and Episodic Memory in Degraded 360-Degree Firefighting Videos

Aditi Tiwari, Farzaneh Masoud, Dac Trong Nguyen, Jill Kraft, Heng Ji, Klara Nahrstedt

Comments: 20 pages, 9 figures, 6 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[189] arXiv:2506.02221 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Diff2Flow: Training Flow Matching Models via Diffusion Model Alignment

Johannes Schusterbauer, Ming Gui, Frank Fundel, Björn Ommer

Comments: Accepted to CVPR 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[190] arXiv:2506.02229 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: VLCD: Vision-Language Contrastive Distillation for Accurate and Efficient Automatic Placenta Analysis

Manas Mehta, Yimu Pan, Kelly Gallagher, Alison D. Gernand, Jeffery A. Goldstein, Delia Mwinyelle, Leena Mithal, James Z. Wang

Comments: Proceedings of the 9th International Workshop on Health Intelligence, held in conjunction with the Annual AAAI Conference on Artificial Intelligence, Philadelphia, Pennsylvania, March 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[191] arXiv:2506.02244 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Motion aware video generative model

Bowen Xue, Giuseppe Claudio Guarnera, Shuang Zhao, Zahra Montazeri

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[192] arXiv:2506.02247 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: PAIR-Net: Enhancing Egocentric Speaker Detection via Pretrained Audio-Visual Fusion and Alignment Loss

Yu Wang, Juhyung Ha, David J. Crandall

Comments: 4 pages, 1 figure, 1 table

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[193] arXiv:2506.02265 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Rig3R: Rig-Aware Conditioning for Learned 3D Reconstruction

Samuel Li, Pujith Kachana, Prajwal Chidananda, Saurabh Nair, Yasutaka Furukawa, Matthew Brown

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[194] arXiv:2506.02291 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Entity Image and Mixed-Modal Image Retrieval Datasets

Cristian-Ioan Blaga, Paul Suganthan, Sahil Dua, Krishna Srinivasan, Enrique Alfonseca, Peter Dornbach, Tom Duerig, Imed Zitouni, Zhe Dong

Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR)
[195] arXiv:2506.02294 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Improving Knowledge Distillation Under Unknown Covariate Shift Through Confidence-Guided Data Augmentation

Niclas Popp, Kevin Alexander Laube, Matthias Hein, Lukas Schott

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[196] arXiv:2506.02295 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: QARI-OCR: High-Fidelity Arabic Text Recognition through Multimodal Large Language Model Adaptation

Ahmed Wasfy, Omer Nacar, Abdelakreem Elkhateb, Mahmoud Reda, Omar Elshehy, Adel Ammar, Wadii Boulila

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[197] arXiv:2506.02327 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Medical World Model: Generative Simulation of Tumor Evolution for Treatment Planning

Yijun Yang, Zhao-Yang Wang, Qiuping Liu, Shuwen Sun, Kang Wang, Rama Chellappa, Zongwei Zhou, Alan Yuille, Lei Zhu, Yu-Dong Zhang, Jieneng Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[198] arXiv:2506.02334 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Generalized Category Discovery via Reciprocal Learning and Class-Wise Distribution Regularization

Duo Liu, Zhiquan Tan, Linglan Zhao, Zhongqiang Zhang, Xiangzhong Fang, Weiran Huang

Comments: ICML2025 poster

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[199] arXiv:2506.02354 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: RATE-Nav: Region-Aware Termination Enhancement for Zero-shot Object Navigation with Vision-Language Models

Junjie Li, Nan Zhang, Xiaoyang Qu, Kai Lu, Guokuan Li, Jiguang Wan, Jianzong Wang

Comments: Accepted by the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[200] arXiv:2506.02356 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: InterRVOS: Interaction-aware Referring Video Object Segmentation

Woojeong Jin, Seongchan Kim, Seungryong Kim

Subjects: Computer Vision and Pattern Recognition (cs.CV)
