Computer Vision and Pattern Recognition

Authors and titles for May 2025

Total of 3183 entries : 1-50 51-100 101-150 151-200 ... 3151-3183

[1] arXiv:2505.00044 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Learning to Borrow Features for Improved Detection of Small Objects in Single-Shot Detectors

Richard Schmit

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Optimization and Control (math.OC)
[2] arXiv:2505.00134 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Investigating Zero-Shot Diagnostic Pathology in Vision-Language Models with Efficient Prompt Design

Vasudev Sharma, Ahmed Alagha, Abdelhakim Khellaf, Vincent Quoc-Huy Trinh, Mahdi S. Hosseini

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[3] arXiv:2505.00135 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Eye2Eye: A Simple Approach for Monocular-to-Stereo Video Synthesis

Michal Geyer, Omer Tov, Linyi Jin, Richard Tucker, Inbar Mosseri, Tali Dekel, Noah Snavely

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[4] arXiv:2505.00150 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Detecting and Mitigating Hateful Content in Multimodal Memes with Vision-Language Models

Minh-Hao Van, Xintao Wu

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI) ; Computation and Language (cs.CL)
[5] arXiv:2505.00156 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: V3LMA: Visual 3D-enhanced Language Model for Autonomous Driving

Jannik Lübberstedt, Esteban Rivera, Nico Uhlemann, Markus Lienkamp

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[6] arXiv:2505.00209 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Direct Motion Models for Assessing Generated Videos

Kelsey Allen, Carl Doersch, Guangyao Zhou, Mohammed Suhail, Danny Driess, Ignacio Rocco, Yulia Rubanova, Thomas Kipf, Mehdi S. M. Sajjadi, Kevin Murphy, Joao Carreira, Sjoerd van Steenkiste

Comments: Project page: http://trajan-paper.github.io

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Machine Learning (cs.LG)
[7] arXiv:2505.00220 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Towards Robust and Generalizable Gerchberg Saxton based Physics Inspired Neural Networks for Computer Generated Holography: A Sensitivity Analysis Framework

Ankit Amrutkar, Björn Kampa, Volkmar Schulz, Johannes Stegmaier, Markus Rothermel, Dorit Merhof

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Optics (physics.optics)
[8] arXiv:2505.00254 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Empowering Agentic Video Analytics Systems with Video Language Models

Yuxuan Yan, Shiqi Jiang, Ting Cao, Yifan Yang, Qianqian Yang, Yuanchao Shu, Yuqing Yang, Lili Qiu

Comments: 15 pages, AVAS, add latency breakdown

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI)
[9] arXiv:2505.00259 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Pack-PTQ: Advancing Post-training Quantization of Neural Networks by Pack-wise Reconstruction

Changjun Li, Runqing Jiang, Zhuo Song, Pengpeng Yu, Ye Zhang, Yulan Guo

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI)
[10] arXiv:2505.00275 (cross-list from cs.CV) [cn-pdf, pdf, other]: Title: AdCare-VLM: Leveraging Large Vision Language Model (LVLM) to Monitor Long-Term Medication Adherence and Care

Md Asaduzzaman Jabin, Hanqi Jiang, Yiwei Li, Patrick Kaggwa, Eugene Douglass, Juliet N. Sekandi, Tianming Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[11] arXiv:2505.00295 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Fine-grained spatial-temporal perception for gas leak segmentation

Xinlong Zhao, Shan Du

Comments: 6 pages, 4 figures, ICIP 2025 Conference

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI)
[12] arXiv:2505.00308 (cross-list from cs.CV) [cn-pdf, pdf, other]: Title: AI-Assisted Decision-Making for Clinical Assessment of Auto-Segmented Contour Quality

Biling Wang, Austen Maniscalco, Ti Bai, Siqiu Wang, Michael Dohopolski, Mu-Han Lin, Chenyang Shen, Dan Nguyen, Junzhou Huang, Steve Jiang, Xinlei Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI) ; Applications (stat.AP)
[13] arXiv:2505.00312 (cross-list from cs.CV) [cn-pdf, pdf, other]: Title: AWARE-NET: Adaptive Weighted Averaging for Robust Ensemble Network in Deepfake Detection

Muhammad Salman, Iqra Tariq, Mishal Zulfiqar, Muqadas Jalal, Sami Aujla, Sumbal Fatima

Journal-ref: IET Conference Proceedings CP917, Volume 2025, Issue 3, Pages 526-533, The Institution of Engineering and Technology, 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[14] arXiv:2505.00334 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Quaternion Wavelet-Conditioned Diffusion Models for Image Super-Resolution

Luigi Sigillo, Christian Bianchi, Aurelio Uncini, Danilo Comminiello

Comments: Accepted for presentation at IJCNN 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Machine Learning (cs.LG)
[15] arXiv:2505.00335 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Efficient Neural Video Representation with Temporally Coherent Modulation

Seungjun Shin, Suji Kim, Dokwan Oh

Comments: ECCV 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI)
[16] arXiv:2505.00369 (cross-list from cs.CV) [cn-pdf, pdf, other]: Title: Automated segmentation of pediatric neuroblastoma on multi-modal MRI: Results of the SPPIN challenge at MICCAI 2023

M.A.D. Buser, D.C. Simons, M. Fitski, M.H.W.A. Wijnen, A.S. Littooij, A.H. ter Brugge, I.N. Vos, M.H.A. Janse, M. de Boer, R. ter Maat, J. Sato, S. Kido, S. Kondo, S. Kasai, M. Wodzinski, H. Muller, J. Ye, J. He, Y. Kirchhoff, M.R. Rokkus, G. Haokai, S. Zitong, M. Fernández-Patón, D. Veiga-Canuto, D.G. Ellis, M.R. Aizenberg, B.H.M. van der Velden, H. Kuijf, A. De Luca, A.F.W. van der Steeg

Comments: 23 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[17] arXiv:2505.00378 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Cues3D: Unleashing the Power of Sole NeRF for Consistent and Unique Instances in Open-Vocabulary 3D Panoptic Segmentation

Feng Xue, Wenzhuang Xu, Guofeng Zhong, Anlong Ming, Nicu Sebe

Comments: Accepted by Information Fusion

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[18] arXiv:2505.00380 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: The Invisible Threat: Evaluating the Vulnerability of Cross-Spectral Face Recognition to Presentation Attacks

Anjith George, Sebastien Marcel

Comments: 10 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[19] arXiv:2505.00394 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: SOTA: Spike-Navigated Optimal TrAnsport Saliency Region Detection in Composite-bias Videos

Wenxuan Liu, Yao Deng, Kang Chen, Xian Zhong, Zhaofei Yu, Tiejun Huang

Comments: Accepted to IJCAI 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[20] arXiv:2505.00421 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Real-Time Animatable 2DGS-Avatars with Detail Enhancement from Monocular Videos

Xia Yuan, Hai Yuan, Wenyi Ge, Ying Fu, Xi Wu, Guanyu Xing

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[21] arXiv:2505.00426 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Leveraging Pretrained Diffusion Models for Zero-Shot Part Assembly

Ruiyuan Zhang, Qi Wang, Jiaxiang Liu, Yu Zhang, Yuchi Huo, Chao Wu

Comments: 10 pages, 12 figures, Accepted by IJCAI-2025

Journal-ref: IJCAI 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[22] arXiv:2505.00452 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: ClearLines - Camera Calibration from Straight Lines

Gregory Schroeder, Mohamed Sabry, Cristina Olaverri-Monreal

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[23] arXiv:2505.00482 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: JointDiT: Enhancing RGB-Depth Joint Modeling with Diffusion Transformers

Kwon Byung-Ki, Qi Dai, Lee Hyoseok, Chong Luo, Tae-Hyun Oh

Comments: Accepted to IEEE/CVF International Conference on Computer Vision (ICCV) 2025. Project page: https://byungki-k.github.io/JointDiT/ Code: https://github.com/kaist-ami/JointDiT

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI)
[24] arXiv:2505.00497 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: KeySync: A Robust Approach for Leakage-free Lip Synchronization in High Resolution

Antoni Bigata, Rodrigo Mira, Stella Bounareli, Michał Stypułkowski, Konstantinos Vougioukas, Stavros Petridis, Maja Pantic

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[25] arXiv:2505.00502 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Towards Scalable Human-aligned Benchmark for Text-guided Image Editing

Suho Ryu, Kihyun Kim, Eugene Baek, Dongsoo Shin, Joonseok Lee

Comments: Accepted to CVPR 2025 (highlight)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[26] arXiv:2505.00507 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: HeAL3D: Heuristical-enhanced Active Learning for 3D Object Detection

Esteban Rivera, Surya Prabhakaran, Markus Lienkamp

Comments: Accepted at CVPRW 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[27] arXiv:2505.00511 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Inconsistency-based Active Learning for LiDAR Object Detection

Esteban Rivera, Loic Stratil, Markus Lienkamp

Comments: Accepted at IV 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[28] arXiv:2505.00512 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: InterLoc: LiDAR-based Intersection Localization using Road Segmentation with Automated Evaluation Method

Nguyen Hoang Khoi Tran, Julie Stephany Berrio, Mao Shan, Zhenxing Ming, Stewart Worrall

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Robotics (cs.RO)
[29] arXiv:2505.00534 (cross-list from cs.CV) [cn-pdf, pdf, other]: Title: A Robust Deep Networks based Multi-Object Multi-Camera Tracking System for City Scale Traffic

Muhammad Imran Zaman, Usama Ijaz Bajwa, Gulshan Saleem, Rana Hammad Raza

Journal-ref: Zaman, Muhammad Imran, et al. "A robust deep networks based multi-object multi-camera tracking system for city scale traffic." Multimedia Tools and Applications 83.6 (2024): 17163-17181

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[30] arXiv:2505.00564 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: X-ray illicit object detection using hybrid CNN-transformer neural network architectures

Jorgen Cani, Christos Diou, Spyridon Evangelatos, Panagiotis Radoglou-Grammatikis, Vasileios Argyriou, Panagiotis Sarigiannidis, Iraklis Varlamis, Georgios Th. Papadopoulos

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[31] arXiv:2505.00568 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Multimodal Masked Autoencoder Pre-training for 3D MRI-Based Brain Tumor Analysis with Missing Modalities

Lucas Robinet, Ahmad Berjaoui, Elizabeth Cohen-Jonathan Moyal

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI)
[32] arXiv:2505.00569 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: AnimalMotionCLIP: Embedding motion in CLIP for Animal Behavior Analysis

Enmin Zhong, Carlos R. del-Blanco, Daniel Berjón, Fernando Jaureguizar, Narciso García

Comments: 6 pages, 3 figures. Accepted for the poster session at the CV4Animals workshop (Computer Vision for Animal Behavior Tracking and Modeling), in conjunction with Computer Vision and Pattern Recognition 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[33] arXiv:2505.00584 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Synthesizing and Identifying Noise Levels in Autonomous Vehicle Camera Radar Datasets

Mathis Morales, Golnaz Habibi

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI) ; Image and Video Processing (eess.IV) ; Signal Processing (eess.SP)
[34] arXiv:2505.00592 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Uncertainty-Aware Multi-Expert Knowledge Distillation for Imbalanced Disease Grading

Shuo Tong, Shangde Gao, Ke Liu, Zihang Huang, Hongxia Xu, Haochao Ying, Jian Wu

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Machine Learning (cs.LG)
[35] arXiv:2505.00599 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Visual Trajectory Prediction of Vessels for Inland Navigation

Alexander Puzicha, Konstantin Wüstefeld, Kathrin Wilms, Frank Weichert

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[36] arXiv:2505.00606 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Dietary Intake Estimation via Continuous 3D Reconstruction of Food

Wallace Lee, YuHao Chen

Comments: 2025 CVPR MetaFood Workshop

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Machine Learning (cs.LG)
[37] arXiv:2505.00615 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Pixel3DMM: Versatile Screen-Space Priors for Single-Image 3D Face Reconstruction

Simon Giebenhain, Tobias Kirschstein, Martin Rünz, Lourdes Agapito, Matthias Nießner

Comments: Project Website: https://simongiebenhain.github.io/pixel3dmm/ ; Video: https://www.youtube.com/watch?v=BwxwEXJwUDc

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI)
[38] arXiv:2505.00619 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Diverse Semantics-Guided Feature Alignment and Decoupling for Visible-Infrared Person Re-Identification

Neng Dong, Shuanglin Yan, Liyan Zhang, Jinhui Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[39] arXiv:2505.00627 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Brain Foundation Models with Hypergraph Dynamic Adapter for Brain Disease Analysis

Zhongying Deng, Haoyu Wang, Ziyan Huang, Lipei Zhang, Angelica I. Aviles-Rivero, Chaoyu Liu, Junjun He, Zoe Kourtzi, Carola-Bibiane Schönlieb

Comments: 35 pages, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[40] arXiv:2505.00630 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Vision Mamba in Remote Sensing: A Comprehensive Survey of Techniques, Applications and Outlook

Muyi Bao, Shuchang Lyu, Zhaoyang Xu, Huiyu Zhou, Jinchang Ren, Shiming Xiang, Xiangtai Li, Guangliang Cheng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[41] arXiv:2505.00668 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Deep Reinforcement Learning for Urban Air Quality Management: Multi-Objective Optimization of Pollution Mitigation Booth Placement in Metropolitan Environments

Kirtan Rajesh, Suvidha Rupesh Kumar

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI) ; Machine Learning (cs.LG)
[42] arXiv:2505.00684 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Visual Test-time Scaling for GUI Agent Grounding

Tiange Luo, Lajanugen Logeswaran, Justin Johnson, Honglak Lee

Comments: ICCV 2025, https://github.com/tiangeluo/RegionFocus

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI) ; Machine Learning (cs.LG)
[43] arXiv:2505.00690 (cross-list from cs.CV) [cn-pdf, pdf, other]: Title: Towards Autonomous Micromobility through Scalable Urban Simulation

Wayne Wu, Honglin He, Chaoyuan Zhang, Jack He, Seth Z. Zhao, Ran Gong, Quanyi Li, Bolei Zhou

Comments: CVPR 2025 Highlight. Project page: https://metadriverse.github.io/urban-sim/

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI) ; Robotics (cs.RO)
[44] arXiv:2505.00702 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: RayZer: A Self-supervised Large View Synthesis Model

Hanwen Jiang, Hao Tan, Peng Wang, Haian Jin, Yue Zhao, Sai Bi, Kai Zhang, Fujun Luan, Kalyan Sunkavalli, Qixing Huang, Georgios Pavlakos

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[45] arXiv:2505.00703 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT

Dongzhi Jiang, Ziyu Guo, Renrui Zhang, Zhuofan Zong, Hao Li, Le Zhuo, Shilin Yan, Pheng-Ann Heng, Hongsheng Li

Comments: Project Page: https://github.com/CaraJ7/T2I-R1

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI) ; Computation and Language (cs.CL) ; Machine Learning (cs.LG)
[46] arXiv:2505.00734 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Unconstrained Large-scale 3D Reconstruction and Rendering across Altitudes

Neil Joshi, Joshua Carney, Nathanael Kuo, Homer Li, Cheng Peng, Myron Brown

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Image and Video Processing (eess.IV)
[47] arXiv:2505.00739 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: MoSAM: Motion-Guided Segment Anything Model with Spatial-Temporal Memory Selection

Qiushi Yang, Yuan Yao, Miaomiao Cui, Liefeng Bo

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Image and Video Processing (eess.IV)
[48] arXiv:2505.00740 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Fast2comm: Collaborative perception combined with prior knowledge

Zhengbin Zhang, Yan Wu, Hongkun Zhang

Comments: 8 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Multiagent Systems (cs.MA)
[49] arXiv:2505.00741 (cross-list from cs.CV) [cn-pdf, pdf, other]: Title: Detection and Classification of Diseases in Multi-Crop Leaves using LSTM and CNN Models

Srinivas Kanakala, Sneha Ningappa

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Machine Learning (cs.LG)
[50] arXiv:2505.00742 (cross-list from cs.CV) [cn-pdf, pdf, html, other]: Title: Zoomer: Adaptive Image Focus Optimization for Black-box MLLM

Jiaxu Qian, Chendong Wang, Yifan Yang, Chaoyun Zhang, Huiqiang Jiang, Xufang Luo, Yu Kang, Qingwei Lin, Anlan Zhang, Shiqi Jiang, Ting Cao, Tianjun Mao, Suman Banerjee, Guyue Liu, Saravan Rajmohan, Dongmei Zhang, Yuqing Yang, Qi Zhang, Lili Qiu

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI) ; Image and Video Processing (eess.IV)
