Computer Vision and Pattern Recognition

Authors and titles for June 2025

Total of 3129 entries

[126] arXiv:2506.01480 (cross-listed from cs.CV)

Title: Unlocking Aha Moments via Reinforcement Learning: Advancing Collaborative Visual Comprehension and Generation

Kaihang Pan, Yang Wu, Wendong Bu, Kai Shen, Juncheng Li, Yingting Wang, Yunfei Li, Siliang Tang, Jun Xiao, Fei Wu, Hang Zhao, Yueting Zhuang

Comments: 21 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[127] arXiv:2506.01487 (cross-listed from cs.CV)

Title: FDSG: Forecasting Dynamic Scene Graphs

Yi Yang, Yuren Cong, Hao Cheng, Bodo Rosenhahn, Michael Ying Yang

Comments: 21 pages, 9 figures, 15 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[128] arXiv:2506.01493 (cross-listed from cs.CV)

Title: Efficiency without Compromise: CLIP-aided Text-to-Image GANs with Increased Diversity

Yuya Kobayashi, Yuhta Takida, Takashi Shibuya, Yuki Mitsufuji

Comments: Accepted at IJCNN 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[129] arXiv:2506.01511 (cross-listed from cs.CV)

Title: Enhancing Diffusion-based Unrestricted Adversarial Attacks via Adversary Preferences Alignment

Kaixun Jiang, Zhaoyu Chen, Haijing Guo, Jinglun Li, Jiyuan Fu, Pinxue Guo, Hao Tang, Bo Li, Wenqiang Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[130] arXiv:2506.01519 (cross-listed from cs.CV)

Title: Speed-up of Vision Transformer Models by Attention-aware Token Filtering

Takahiro Naruko, Hiroaki Akutsu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[131] arXiv:2506.01532 (cross-listed from cs.CV)

Title: Balancing Beyond Discrete Categories: Continuous Demographic Labels for Fair Face Recognition

Pedro C. Neto, Naser Damer, Jaime S. Cardoso, Ana F. Sequeira

Comments: Under review

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[132] arXiv:2506.01539 (cross-listed from cs.CV)

Title: G4Seg: Generation for Inexact Segmentation Refinement with Diffusion Models

Tianjiao Zhang, Fei Zhang, Jiangchao Yao, Ya Zhang, Yanfeng Wang

Comments: 16 pages, 12 figures, IEEE International Conference on Multimedia and Expo 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[133] arXiv:2506.01546 (cross-listed from cs.CV)

Title: LongDWM: Cross-Granularity Distillation for Building a Long-Term Driving World Model

Xiaodong Wang, Zhirong Wu, Peixi Peng

Comments: Project page: https://wang-xiaodong1899.github.io/longdwm/

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[134] arXiv:2506.01551 (cross-listed from cs.CV)

Title: EvolveNav: Self-Improving Embodied Reasoning for LLM-Based Vision-Language Navigation

Bingqian Lin, Yunshuang Nie, Khun Loun Zai, Ziming Wei, Mingfei Han, Rongtao Xu, Minzhe Niu, Jianhua Han, Liang Lin, Cewu Lu, Xiaodan Liang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[135] arXiv:2506.01558 (cross-listed from cs.CV)

Title: SAM2-LOVE: Segment Anything Model 2 in Language-aided Audio-Visual Scenes

Yuji Wang, Haoran Xu, Yong Liu, Jiaze Li, Yansong Tang

Comments: CVPR 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[136] arXiv:2506.01579 (cross-listed from cs.CV)

Title: HOSIG: Full-Body Human-Object-Scene Interaction Generation with Hierarchical Scene Perception

Wei Yao, Yunlian Sun, Hongwen Zhang, Yebin Liu, Jinhui Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[137] arXiv:2506.01586 (cross-listed from cs.CV)

Title: Multi-Modal Dataset Distillation in the Wild

Zhuohang Dang, Minnan Luo, Chengyou Jia, Hangwei Qian, Xiaojun Chang, Ivor W. Tsang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[138] arXiv:2506.01608 (cross-listed from cs.CV)

Title: EPFL-Smart-Kitchen-30: Densely annotated cooking dataset with 3D kinematics to challenge video and language models

Andy Bonnetto, Haozhe Qi, Franklin Leong, Matea Tashkovska, Mahdi Rad, Solaiman Shokur, Friedhelm Hummel, Silvestro Micera, Marc Pollefeys, Alexander Mathis

Comments: Code and data at: https://github.com/amathislab/EPFL-Smart-Kitchen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Other Quantitative Biology (q-bio.OT)
[139] arXiv:2506.01636 (cross-listed from cs.CV)

Title: Visual Explanation via Similar Feature Activation for Metric Learning

Yi Liao, Ugochukwu Ejike Akpudo, Jue Zhang, Yongsheng Gao, Jun Zhou, Wenyi Zeng, Weichuan Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[140] arXiv:2506.01663 (cross-listed from cs.CV)

Title: Zoom-Refine: Boosting High-Resolution Multimodal Understanding via Localized Zoom and Self-Refinement

Xuan Yu, Dayan Guan, Michael Ying Yang, Yanfeng Gu

Comments: Code is available at https://github.com/xavier-yu114/Zoom-Refine

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[141] arXiv:2506.01667 (cross-listed from cs.CV)

Title: EarthMind: Towards Multi-Granular and Multi-Sensor Earth Observation with Large Multimodal Models

Yan Shu, Bin Ren, Zhitong Xiong, Danda Pani Paudel, Luc Van Gool, Begum Demir, Nicu Sebe, Paolo Rota

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[142] arXiv:2506.01674 (cross-listed from cs.CV)

Title: MotionSight: Boosting Fine-Grained Motion Understanding in Multimodal LLMs

Yipeng Du, Tiehan Fan, Kepan Nan, Rui Xie, Penghao Zhou, Xiang Li, Jian Yang, Zhenheng Yang, Ying Tai

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[143] arXiv:2506.01691 (cross-listed from cs.CV)

Title: SteerPose: Simultaneous Extrinsic Camera Calibration and Matching from Articulation

Sang-Eun Lee, Ko Nishino, Shohei Nobuhara

Comments: 13 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[144] arXiv:2506.01701 (cross-listed from cs.CV)

Title: Data Pruning by Information Maximization

Haoru Tan, Sitong Wu, Wei Huang, Shizhen Zhao, Xiaojuan Qi

Comments: ICLR 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[145] arXiv:2506.01724 (cross-listed from cs.CV)

Title: Active Learning via Vision-Language Model Adaptation with Open Data

Tong Wang, Jiaqi Wang, Shu Kong

Comments: Project webpage: https://leowangtong.github.io/ALOR/

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[146] arXiv:2506.01725 (cross-listed from cs.CV)

Title: VideoCap-R1: Enhancing MLLMs for Video Captioning via Structured Thinking

Desen Meng, Rui Huang, Zhilin Dai, Xinhao Li, Yifan Xu, Jun Zhang, Zhenpeng Huang, Meng Zhang, Lingshu Zhang, Yi Liu, Limin Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[147] arXiv:2506.01738 (cross-listed from cs.CV)

Title: STORM: Benchmarking Visual Rating of MLLMs with a Comprehensive Ordinal Regression Dataset

Jinhong Wang, Shuo Tong, Jian liu, Dongqi Tang, Jintai Chen, Haochao Ying, Hongxia Xu, Danny Chen, Jian Wu

Comments: Under review at the NIPS 2025 D&B track

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[148] arXiv:2506.01757 (cross-listed from cs.CV)

Title: Efficient Egocentric Action Recognition with Multimodal Data

Marco Calzavara, Ard Kastrati, Matteo Macchini, Dushan Vasilevski, Roger Wattenhofer

Comments: Accepted as an extended abstract at the Second Egocentric Vision (EgoVis) Workshop, 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[149] arXiv:2506.01758 (cross-listed from cs.CV)

Title: Many-for-Many: Unify the Training of Multiple Video and Image Generation and Manipulation Tasks

Tao Yang, Ruibin Li, Yangming Shi, Yuqi Zhang, Qide Dong, Haoran Cheng, Weiguo Feng, Shilei Wen, Bingyue Peng, Lei Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[150] arXiv:2506.01778 (cross-listed from cs.CV)

Title: unMORE: Unsupervised Multi-Object Segmentation via Center-Boundary Reasoning

Yafei Yang, Zihui Zhang, Bo Yang

Comments: ICML 2025. Code and data are available at https://github.com/vLAR-group/unMORE

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
