Computer Vision and Pattern Recognition

Authors and titles for June 2025

Total of 3129 entries : 1-50 51-100 101-150 151-200 ... 3101-3129

[1] arXiv:2506.00101 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: EgoVIS@CVPR: What Changed and What Could Have Changed? State-Change Counterfactuals for Procedure-Aware Video Representation Learning

Chi-Hsi Kung, Frangil Ramirez, Juhyung Ha, Yi-Ting Chen, David Crandall, Yi-Hsuan Tsai

Comments: 4 pages, 1 figure, 4 tables. Full paper is available at arXiv:2503.21055.

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[2] arXiv:2506.00123 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces

Gen Luo, Ganlin Yang, Ziyang Gong, Guanzhou Chen, Haonan Duan, Erfei Cui, Ronglei Tong, Zhi Hou, Tianyi Zhang, Zhe Chen, Shenglong Ye, Lewei Lu, Jingbo Wang, Wenhai Wang, Jifeng Dai, Yu Qiao, Rongrong Ji, Xizhou Zhu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[3] arXiv:2506.00129 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Geo-Sign: Hyperbolic Contrastive Regularisation for Geometrically Aware Sign Language Translation

Edward Fish, Richard Bowden

Comments: Under review

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[4] arXiv:2506.00154 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Detection of Endangered Deer Species Using UAV Imagery: A Comparative Study Between Efficient Deep Learning Approaches

Agustín Roca, Gastón Castro, Gabriel Torre, Leonardo J. Colombo, Ignacio Mas, Javier Pereira, Juan I. Giribet

Journal reference: 2025 International Conference on Unmanned Aircraft Systems (ICUAS), Charlotte, NC, USA, 2025, pp. 83-90

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[5] arXiv:2506.00164 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Efficient Endangered Deer Species Monitoring with UAV Aerial Imagery and Deep Learning

Agustín Roca, Gabriel Torre, Juan I. Giribet, Gastón Castro, Leonardo Colombo, Ignacio Mas, Javier Pereira

Journal reference: 2024 IEEE Biennial Congress of Argentina (ARGENCON), San Nicolás de los Arroyos, Argentina, 2024, pp. 1-8

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[6] arXiv:2506.00208 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: FastCAR: Fast Classification And Regression for Task Consolidation in Multi-Task Learning to Model a Continuous Property Variable of Detected Object Class

Anoop Kini, Andreas Jansche, Timo Bernthaler, Gerhard Schneider

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[7] arXiv:2506.00227 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Ctrl-Crash: Controllable Diffusion for Realistic Car Crashes

Anthony Gosselin, Ge Ya Luo, Luis Lara, Florian Golemo, Derek Nowrouzezahrai, Liam Paull, Alexia Jolicoeur-Martineau, Christopher Pal

Comments: Under review

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[8] arXiv:2506.00238 (cross-listed from cs.CV) [Chinese pdf, pdf, other]

Title: ZeShot-VQA: Zero-Shot Visual Question Answering Framework with Answer Mapping for Natural Disaster Damage Assessment

Ehsan Karimi, Maryam Rahnemoonfar

Comments: Accepted by the 2025 IEEE International Geoscience and Remote Sensing Symposium (IGARSS 2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Information Retrieval (cs.IR); Machine Learning (cs.LG)
[9] arXiv:2506.00318 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Chain-of-Frames: Advancing Video Understanding in Multimodal LLMs via Frame-Aware Reasoning

Sara Ghazanfari, Francesco Croce, Nicolas Flammarion, Prashanth Krishnamurthy, Farshad Khorrami, Siddharth Garg

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[10] arXiv:2506.00324 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Improving Optical Flow and Stereo Depth Estimation by Leveraging Uncertainty-Based Learning Difficulties

Jisoo Jeong, Hong Cai, Jamie Menjay Lin, Fatih Porikli

Comments: CVPRW 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[11] arXiv:2506.00325 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Towards Effective and Efficient Adversarial Defense with Diffusion Models for Robust Visual Tracking

Long Xu, Peng Gao, Wen-Jia Tang, Fei Wang, Ru-Yue Yuan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[12] arXiv:2506.00327 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Latent Guidance in Diffusion Models for Perceptual Evaluations

Shreshth Saini, Ru-Ling Liao, Yan Ye, Alan C. Bovik

Comments: 24 pages, 7 figures, 10 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[13] arXiv:2506.00333 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Test-time Vocabulary Adaptation for Language-driven Object Detection

Mingxuan Liu, Tyler L. Hayes, Massimiliano Mancini, Elisa Ricci, Riccardo Volpi, Gabriela Csurka

Comments: Accepted as a conference paper at ICIP 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[14] arXiv:2506.00365 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Feature Fusion and Knowledge-Distilled Multi-Modal Multi-Target Detection

Ngoc Tuyen Do, Tri Nhu Do

Subjects: Computer Vision and Pattern Recognition (cs.CV); Signal Processing (eess.SP)
[15] arXiv:2506.00394 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Sequence-Based Identification of First-Person Camera Wearers in Third-Person Views

Ziwei Zhao, Xizi Wang, Yuchen Wang, Feng Cheng, David Crandall

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[16] arXiv:2506.00406 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: iDPA: Instance Decoupled Prompt Attention for Incremental Medical Object Detection

Huahui Yi, Wei Xu, Ziyuan Qin, Xi Chen, Xiaohu Wu, Kang Li, Qicheng Lao

Comments: Accepted by ICML 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[17] arXiv:2506.00433 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Latent Wavelet Diffusion: Enabling 4K Image Synthesis for Free

Luigi Sigillo, Shengfeng He, Danilo Comminiello

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[18] arXiv:2506.00447 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Performance Analysis of Few-Shot Learning Approaches for Bangla Handwritten Character and Digit Recognition

Mehedi Ahamed, Radib Bin Kabir, Tawsif Tashwar Dipto, Mueeze Al Mushabbir, Sabbir Ahmed, Md. Hasanul Kabir

Journal reference: 2024 6th International Conference on Sustainable Technologies for Industry 5.0 (STI)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[19] arXiv:2506.00475 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: BAGNet: A Boundary-Aware Graph Attention Network for 3D Point Cloud Semantic Segmentation

Wei Tao, Xiaoyang Qu, Kai Lu, Jiguang Wan, Shenglin He, Jianzong Wang

Comments: Accepted by the 2025 International Joint Conference on Neural Networks (IJCNN 2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[20] arXiv:2506.00513 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: SSAM: Self-Supervised Association Modeling for Test-Time Adaption

Yaxiong Wang, Zhenqiang Zhang, Lechao Cheng, Zhun Zhong, Dan Guo, Meng Wang

Comments: 10 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[21] arXiv:2506.00523 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: SenseFlow: Scaling Distribution Matching for Flow-based Text-to-Image Distillation

Xingtong Ge, Xin Zhang, Tongda Xu, Yi Zhang, Xinjie Zhang, Yan Wang, Jun Zhang

Comments: Under review

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[22] arXiv:2506.00541 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: 3D Trajectory Reconstruction of Moving Points Based on Asynchronous Cameras

Huayu Huang, Banglei Guan, Yang Shang, Qifeng Yu

Comments: This paper has been accepted by Acta Mechanica Sinica.

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[23] arXiv:2506.00558 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: ViVo: A Dataset for Volumetric Video Reconstruction and Compression

Adrian Azzarelli, Ge Gao, Ho Man Kwan, Fan Zhang, Nantheera Anantrasirichai, Ollie Moolan-Feroze, David Bull

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[24] arXiv:2506.00562 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: SEED: A Benchmark Dataset for Sequential Facial Attribute Editing with Diffusion Models

Yule Zhu, Ping Liu, Zhedong Zheng, Wei Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[25] arXiv:2506.00568 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: CReFT-CAD: Boosting Orthographic Projection Reasoning for CAD via Reinforcement Fine-Tuning

Ke Niu, Zhuofan Chen, Haiyang Yu, Yuwen Chen, Teng Fu, Mengyang Zhao, Bin Li, Xiangyang Xue

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[26] arXiv:2506.00578 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Event-based multi-view photogrammetry for high-dynamic, high-velocity target measurement

Taihang Lei, Banglei Guan, Minzu Liang, Xiangyu Li, Jianbing Liu, Jing Tao, Yang Shang, Qifeng Yu

Comments: 9 pages, 9 figures, 1 table. This paper has been accepted by Acta Mechanica Sinica (date: 30 May 2025).

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[27] arXiv:2506.00596 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Seg2Any: Open-set Segmentation-Mask-to-Image Generation with Precise Shape and Semantic Control

Danfeng li, Hui Zhang, Sheng Wang, Jiacheng Li, Zuxuan Wu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[28] arXiv:2506.00599 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: XYZ-IBD: A High-precision Bin-picking Dataset for Object 6D Pose Estimation Capturing Real-world Industrial Complexity

Junwen Huang, Jizhong Liang, Jiaqi Hu, Martin Sundermeyer, Peter KT Yu, Nassir Navab, Benjamin Busam

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[29] arXiv:2506.00600 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: SatDreamer360: Geometry Consistent Street-View Video Generation from Satellite Imagery

Xianghui Ze, Beiyi Zhu, Zhenbo Song, Jianfeng Lu, Yujiao Shi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[30] arXiv:2506.00607 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Parallel Rescaling: Rebalancing Consistency Guidance for Personalized Diffusion Models

JungWoo Chae, Jiyoon Kim, Sangheum Hwang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[31] arXiv:2506.00625 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Long-Tailed Visual Recognition via Permutation-Invariant Head-to-Tail Feature Fusion

Mengke Li, Zhikai Hu, Yang Lu, Weichao Lan, Yiu-ming Cheung, Hui Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[32] arXiv:2506.00633 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Text-to-CT Generation via 3D Latent Diffusion Model with Contrastive Vision-Language Pretraining

Daniele Molino, Camillo Maria Caruso, Filippo Ruffini, Paolo Soda, Valerio Guarrasi

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[33] arXiv:2506.00652 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Video Signature: In-generation Watermarking for Latent Video Diffusion Models

Yu Huang, Junhao Chen, Qi Zheng, Hanqian Li, Shuliang Liu, Xuming Hu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
[34] arXiv:2506.00661 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Poster: Adapting Pretrained Vision Transformers with LoRA Against Attack Vectors

Richard E. Neddo, Sean Willis, Zander Blasingame, Chen Liu

Comments: Presenting at IEEE MOST 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[35] arXiv:2506.00667 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Scene Detection Policies and Keyframe Extraction Strategies for Large-Scale Video Analysis

Vasilii Korolkov

Comments: 24 pages, 8 figures, submitted as a preprint. arXiv preprint only; not yet submitted to a journal.

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[36] arXiv:2506.00698 (cross-listed from cs.CV) [Chinese pdf, pdf, other]

Title: Concept-Centric Token Interpretation for Vector-Quantized Generative Models

Tianze Yang, Yucheng Shi, Mengnan Du, Xuansheng Wu, Qiaoyu Tan, Jin Sun, Ninghao Liu

Comments: 17 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[37] arXiv:2506.00716 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Fovea Stacking: Imaging with Dynamic Localized Aberration Correction

Shi Mao, Yogeshwar Mishra, Wolfgang Heidrich

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[38] arXiv:2506.00718 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: From Local Cues to Global Percepts: Emergent Gestalt Organization in Self-Supervised Vision Models

Tianqin Li, Ziqi Wen, Leiran Song, Jun Liu, Zhi Jing, Tai Sing Lee

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[39] arXiv:2506.00721 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Common Inpainted Objects In-N-Out of Context

Tianze Yang, Tyson Jordan, Ninghao Liu, Jin Sun

Comments: 12 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[40] arXiv:2506.00735 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Involution-Infused DenseNet with Two-Step Compression for Resource-Efficient Plant Disease Classification

T. Ahmed, S. Jannat, Md. F. Islam, J. Noor

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[41] arXiv:2506.00742 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: ArtiScene: Language-Driven Artistic 3D Scene Generation Through Image Intermediary

Zeqi Gu, Yin Cui, Zhaoshuo Li, Fangyin Wei, Yunhao Ge, Jinwei Gu, Ming-Yu Liu, Abe Davis, Yifan Ding

Comments: Accepted by CVPR

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[42] arXiv:2506.00754 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: EcoLens: Leveraging Multi-Objective Bayesian Optimization for Energy-Efficient Video Processing on Edge Devices

Benjamin Civjan, Bo Chen, Ruixiao Zhang, Klara Nahrstedt

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[43] arXiv:2506.00774 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Depth-Aware Scoring and Hierarchical Alignment for Multiple Object Tracking

Milad Khanchi, Maria Amer, Charalambos Poullis

Comments: IEEE International Conference on Image Processing 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[44] arXiv:2506.00786 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Aiding Medical Diagnosis through Image Synthesis and Classification

Kanishk Choudhary

Comments: 8 pages, 6 figures. Submitted and under review.

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[45] arXiv:2506.00805 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: HSCR: Hierarchical Self-Contrastive Rewarding for Aligning Medical Vision Language Models

Songtao Jiang, Yan Zhang, Yeying Jin, Zhihang Tang, Yangyang Wu, Yang Feng, Jian Wu, Zuozhu Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[46] arXiv:2506.00813 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: TIME: TabPFN-Integrated Multimodal Engine for Robust Tabular-Image Learning

Jiaqi Luo, Yuan Yuan, Shixin Xu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[47] arXiv:2506.00816 (cross-listed from cs.CV) [Chinese pdf, pdf, other]

Title: L3A: Label-Augmented Analytic Adaptation for Multi-Label Class Incremental Learning

Xiang Zhang, Run He, Jiao Chen, Di Fang, Ming Li, Ziqian Zeng, Cen Chen, Huiping Zhuang

Comments: Accepted by ICML 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[48] arXiv:2506.00820 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: QuantFace: Low-Bit Post-Training Quantization for One-Step Diffusion Face Restoration

Jiatong Li, Libo Zhu, Haotong Qin, Jingkai Wang, Linghe Kong, Guihai Chen, Yulun Zhang, Xiaokang Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[49] arXiv:2506.00827 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: Improving Keystep Recognition in Ego-Video via Dexterous Focus

Zachary Chavis, Stephen J. Guy, Hyun Soo Park

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[50] arXiv:2506.00830 (cross-listed from cs.CV) [Chinese pdf, pdf, html, other]

Title: SkyReels-Audio: Omni Audio-Conditioned Talking Portraits in Video Diffusion Transformers

Zhengcong Fei, Hao Jiang, Di Qiu, Baoxuan Gu, Youqiang Zhang, Jiahua Wang, Jialin Bai, Debang Li, Mingyuan Fan, Guibin Chen, Yahui Zhou

Subjects: Computer Vision and Pattern Recognition (cs.CV)
