Computer Vision and Pattern Recognition

Authors and titles for recent submissions

Total of 740 entries (showing 1-25)

[1] arXiv:2507.02863 (cross-list from cs.CV) [Chinese pdf, pdf, html, other]

Title: Point3R: Streaming 3D Reconstruction with Explicit Spatial Pointer Memory

Yuqi Wu, Wenzhao Zheng, Jie Zhou, Jiwen Lu

Comments: Code is available at https://github.com/YkiWu/Point3R

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[2] arXiv:2507.02862 (cross-list from cs.CV) [Chinese pdf, pdf, html, other]

Title: RefTok: Reference-Based Tokenization for Video Generation

Xiang Fan, Xiaohang Sun, Kushan Thakkar, Zhu Liu, Vimal Bhat, Ranjay Krishna, Xiang Hao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[3] arXiv:2507.02861 (cross-list from cs.CV) [Chinese pdf, pdf, html, other]

Title: LiteReality: Graphics-Ready 3D Scene Reconstruction from RGB-D Scans

Zhening Huang, Xiaoyang Wu, Fangcheng Zhong, Hengshuang Zhao, Matthias Nießner, Joan Lasenby

Comments: Project page: https://litereality.github.io; video: https://www.youtube.com/watch?v=ecK9m3LXg2c&feature=youtu.be

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
[4] arXiv:2507.02860 (cross-list from cs.CV) [Chinese pdf, pdf, html, other]

Title: Less is Enough: Training-Free Video Diffusion Acceleration via Runtime-Adaptive Caching

Xin Zhou, Dingkang Liang, Kaijin Chen, Tianrui Feng, Xiwu Chen, Hongkai Lin, Yikang Ding, Feiyang Tan, Hengshuang Zhao, Xiang Bai

Comments: Code is available at https://github.com/H-EmbodVis/EasyCache. Project page: https://h-embodvis.github.io/EasyCache/

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[5] arXiv:2507.02859 (cross-list from cs.CV) [Chinese pdf, pdf, html, other]

Title: Bootstrapping Grounded Chain-of-Thought in Multimodal LLMs for Data-Efficient Model Adaptation

Jiaer Xia, Bingkui Tong, Yuhang Zang, Rui Shao, Kaiyang Zhou

Comments: Accepted by ICCV 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[6] arXiv:2507.02857 (cross-list from cs.CV) [Chinese pdf, pdf, html, other]

Title: AnyI2V: Animating Any Conditional Image with Motion Control

Ziye Li, Hao Luo, Xincheng Shuai, Henghui Ding

Comments: ICCV 2025, project page: https://henghuiding.com/AnyI2V/

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[7] arXiv:2507.02844 (cross-list from cs.CV) [Chinese pdf, pdf, html, other]

Title: Visual Contextual Attack: Jailbreaking MLLMs with Image-Driven Context Injection

Ziqi Miao, Yi Ding, Lijun Li, Jing Shao

Comments: 16 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Cryptography and Security (cs.CR)
[8] arXiv:2507.02827 (cross-list from cs.CV) [Chinese pdf, pdf, html, other]

Title: USAD: An Unsupervised Data Augmentation Spatio-Temporal Attention Diffusion Network

Ying Yu, Hang Xiao, Siyao Li, Jiarui Li, Haotian Tang, Hanyu Liu, Chao Li

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[9] arXiv:2507.02826 (cross-list from cs.CV) [Chinese pdf, pdf, html, other]

Title: Confidence-driven Gradient Modulation for Multimodal Human Activity Recognition: A Dynamic Contrastive Dual-Path Learning Approach

Panpan Ji, Junni Song, Hang Xiao, Hanyu Liu, Chao Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[10] arXiv:2507.02813 (cross-list from cs.CV) [Chinese pdf, pdf, html, other]

Title: LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion

Fangfu Liu, Hao Li, Jiawei Chi, Hanyang Wang, Minghui Yang, Fudong Wang, Yueqi Duan

Comments: Project page: https://liuff19.github.io/LangScene-X

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[11] arXiv:2507.02803 (cross-list from cs.CV) [Chinese pdf, pdf, html, other]

Title: HyperGaussians: High-Dimensional Gaussian Splatting for High-Fidelity Animatable Face Avatars

Gent Serifi, Marcel C. Bühler

Comments: Project page: https://gserifi.github.io/HyperGaussians

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[12] arXiv:2507.02798 (cross-list from cs.CV) [Chinese pdf, pdf, html, other]

Title: No time to train! Training-Free Reference-Based Instance Segmentation

Miguel Espinosa, Chenhongyi Yang, Linus Ericsson, Steven McDonagh, Elliot J. Crowley

Comments: Preprint

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[13] arXiv:2507.02792 (cross-list from cs.CV) [Chinese pdf, pdf, other]

Title: RichControl: Structure- and Appearance-Rich Training-Free Spatial Control for Text-to-Image Generation

Liheng Zhang, Lexi Pang, Hang Ye, Xiaoxuan Ma, Yizhou Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[14] arXiv:2507.02790 (cross-list from cs.CV) [Chinese pdf, pdf, html, other]

Title: From Long Videos to Engaging Clips: A Human-Inspired Video Editing Framework with Multimodal Narrative Understanding

Xiangfeng Wang, Xiao Li, Yadong Wei, Xueyu Song, Yang Song, Xiaoqiang Xia, Fangrui Zeng, Zaiyi Chen, Liu Liu, Gu Xu, Tong Xu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[15] arXiv:2507.02781 (cross-list from cs.CV) [Chinese pdf, pdf, other]

Title: From Pixels to Damage Severity: Estimating Earthquake Impacts Using Semantic Segmentation of Social Media Images

Danrong Zhang, Huili Huang, N. Simrill Smith, Nimisha Roy, J. David Frost

Subjects: Computer Vision and Pattern Recognition (cs.CV); Social and Information Networks (cs.SI)
[16] arXiv:2507.02751 (cross-list from cs.CV) [Chinese pdf, pdf, html, other]

Title: Partial Weakly-Supervised Oriented Object Detection

Mingxin Liu, Peiyuan Zhang, Yuan Liu, Wei Zhang, Yue Zhou, Ning Liao, Ziyang Gong, Junwei Luo, Zhirui Wang, Yi Yu, Xue Yang

Comments: 10 pages, 5 figures, 4 tables, source code: https://github.com/VisionXLab/PWOOD

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[17] arXiv:2507.02748 (cross-list from cs.CV) [Chinese pdf, pdf, html, other]

Title: Linear Attention with Global Context: A Multipole Attention Mechanism for Vision and Physics

Alex Colagrande, Paul Caillon, Eva Feillet, Alexandre Allauzen

Comments: Accepted to the ECLR Workshop at ICCV 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[18] arXiv:2507.02747 (cross-list from cs.CV) [Chinese pdf, pdf, html, other]

Title: DexVLG: Dexterous Vision-Language-Grasp Model at Scale

Jiawei He, Danshi Li, Xinqiang Yu, Zekun Qi, Wenyao Zhang, Jiayi Chen, Zhaoxiang Zhang, Zhizheng Zhang, Li Yi, He Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[19] arXiv:2507.02743 (cross-list from cs.CV) [Chinese pdf, pdf, html, other]

Title: Prompt learning with bounding box constraints for medical image segmentation

Mélanie Gaillochet, Mehrdad Noori, Sahar Dastani, Christian Desrosiers, Hervé Lombaert

Comments: Accepted to IEEE Transactions on Biomedical Engineering (TMBE), 14 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[20] arXiv:2507.02714 (cross-list from cs.CV) [Chinese pdf, pdf, html, other]

Title: FairHuman: Boosting Hand and Face Quality in Human Image Generation with Minimum Potential Delay Fairness in Diffusion Models

Yuxuan Wang, Tianwei Cao, Huayu Zhang, Zhongjiang He, Kongming Liang, Zhanyu Ma

Comments: ICCV 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[21] arXiv:2507.02713 (cross-list from cs.CV) [Chinese pdf, pdf, html, other]

Title: UniMC: Taming Diffusion Transformer for Unified Keypoint-Guided Multi-Class Image Generation

Qin Guo, Ailing Zeng, Dongxu Yue, Ceyuan Yang, Yang Cao, Hanzhong Guo, Fei Shen, Wei Liu, Xihui Liu, Dan Xu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[22] arXiv:2507.02705 (cross-list from cs.CV) [Chinese pdf, pdf, html, other]

Title: SIU3R: Simultaneous Scene Understanding and 3D Reconstruction Beyond Feature Alignment

Qi Xu, Dongxu Wei, Lingzhe Zhao, Wenpu Li, Zhangchi Huang, Shunping Ji, Peidong Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[23] arXiv:2507.02691 (cross-list from cs.CV) [Chinese pdf, pdf, html, other]

Title: CanonSwap: High-Fidelity and Consistent Video Face Swapping via Canonical Space Modulation

Xiangyang Luo, Ye Zhu, Yunfei Liu, Lijian Lin, Cong Wan, Zijian Cai, Shao-Lun Huang, Yu Li

Comments: Accepted by ICCV

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[24] arXiv:2507.02687 (cross-list from cs.CV) [Chinese pdf, pdf, html, other]

Title: APT: Adaptive Personalized Training for Diffusion Models with Limited Data

JungWoo Chae, Jiyoon Kim, JaeWoong Choi, Kyungyul Kim, Sangheum Hwang

Comments: CVPR 2025 final version. Project page: https://lgcnsai.github.io/apt

Journal reference: Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), 2025, pp. 28619-28628

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[25] arXiv:2507.02686 (cross-list from cs.CV) [Chinese pdf, pdf, html, other]

Title: Learning few-step posterior samplers by unfolding and distillation of diffusion models

Charlesquin Kemajou Mbakam, Jonathan Spence, Marcelo Pereyra

Comments: 28 pages, 16 figures, 10 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Friday, 4 July 2025 (showing first 25 of 99 entries)
