Computer Vision and Pattern Recognition

Authors and titles for May 2025

Total of 3183 entries : 1-50 51-100 101-150 151-200 ... 3151-3183
[1] arXiv:2505.00044 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Learning to Borrow Features for Improved Detection of Small Objects in Single-Shot Detectors
Title: 学习借用特征以改进单次检测器对小物体的检测
Richard Schmit
Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Optimization and Control (math.OC)
[2] arXiv:2505.00134 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Investigating Zero-Shot Diagnostic Pathology in Vision-Language Models with Efficient Prompt Design
Title: 探究高效提示设计下视觉-语言模型在零样本诊断病理学中的应用
Vasudev Sharma, Ahmed Alagha, Abdelhakim Khellaf, Vincent Quoc-Huy Trinh, Mahdi S. Hosseini
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[3] arXiv:2505.00135 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Eye2Eye: A Simple Approach for Monocular-to-Stereo Video Synthesis
Title: Eye2Eye:一种简单的单目视频到立体视频合成方法
Michal Geyer, Omer Tov, Linyi Jin, Richard Tucker, Inbar Mosseri, Tali Dekel, Noah Snavely
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[4] arXiv:2505.00150 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Detecting and Mitigating Hateful Content in Multimodal Memes with Vision-Language Models
Title: 检测和缓解多模态表情包中的仇恨内容与视觉-语言模型
Minh-Hao Van, Xintao Wu
Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI) ; Computation and Language (cs.CL)
[5] arXiv:2505.00156 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: V3LMA: Visual 3D-enhanced Language Model for Autonomous Driving
Title: V3LMA:用于自动驾驶的视觉三维增强语言模型
Jannik Lübberstedt, Esteban Rivera, Nico Uhlemann, Markus Lienkamp
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[6] arXiv:2505.00209 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Direct Motion Models for Assessing Generated Videos
Title: 直接运动模型用于评估生成的视频
Kelsey Allen, Carl Doersch, Guangyao Zhou, Mohammed Suhail, Danny Driess, Ignacio Rocco, Yulia Rubanova, Thomas Kipf, Mehdi S. M. Sajjadi, Kevin Murphy, Joao Carreira, Sjoerd van Steenkiste
Comments: Project page: http://trajan-paper.github.io
Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Machine Learning (cs.LG)
[7] arXiv:2505.00220 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Towards Robust and Generalizable Gerchberg Saxton based Physics Inspired Neural Networks for Computer Generated Holography: A Sensitivity Analysis Framework
Title: 面向鲁棒性和泛化的基于Gerchberg Saxton的物理启发神经网络在计算机生成全息术中的应用:敏感性分析框架
Ankit Amrutkar, Björn Kampa, Volkmar Schulz, Johannes Stegmaier, Markus Rothermel, Dorit Merhof
Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Optics (physics.optics)
[8] arXiv:2505.00254 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Empowering Agentic Video Analytics Systems with Video Language Models
Title: 赋予自主视频分析系统以视频语言模型的力量
Yuxuan Yan, Shiqi Jiang, Ting Cao, Yifan Yang, Qianqian Yang, Yuanchao Shu, Yuqing Yang, Lili Qiu
Comments: 15 pages, AVAS, add latency breakdown
Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI)
[9] arXiv:2505.00259 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Pack-PTQ: Advancing Post-training Quantization of Neural Networks by Pack-wise Reconstruction
Title: Pack-PTQ:通过打包式重构推进神经网络的后训练量化
Changjun Li, Runqing Jiang, Zhuo Song, Pengpeng Yu, Ye Zhang, Yulan Guo
Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI)
[10] arXiv:2505.00275 (cross-list from cs.CV) [cn-pdf, pdf, other]
Title: AdCare-VLM: Leveraging Large Vision Language Model (LVLM) to Monitor Long-Term Medication Adherence and Care
Title: AdCare-VLM:利用大型视觉语言模型(LVLM)监测长期药物依从性和护理
Md Asaduzzaman Jabin, Hanqi Jiang, Yiwei Li, Patrick Kaggwa, Eugene Douglass, Juliet N. Sekandi, Tianming Liu
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[11] arXiv:2505.00295 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Fine-grained spatial-temporal perception for gas leak segmentation
Title: 精细的空间-时间感知用于气体泄漏分割
Xinlong Zhao, Shan Du
Comments: 6 pages, 4 figures, ICIP 2025 Conference
Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI)
[12] arXiv:2505.00308 (cross-list from cs.CV) [cn-pdf, pdf, other]
Title: AI-Assisted Decision-Making for Clinical Assessment of Auto-Segmented Contour Quality
Title: 基于人工智能辅助决策的自分割轮廓质量临床评估
Biling Wang, Austen Maniscalco, Ti Bai, Siqiu Wang, Michael Dohopolski, Mu-Han Lin, Chenyang Shen, Dan Nguyen, Junzhou Huang, Steve Jiang, Xinlei Wang
Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI) ; Applications (stat.AP)
[13] arXiv:2505.00312 (cross-list from cs.CV) [cn-pdf, pdf, other]
Title: AWARE-NET: Adaptive Weighted Averaging for Robust Ensemble Network in Deepfake Detection
Title: AWARE-NET: 深度伪造检测中的自适应加权平均用于鲁棒集成网络
Muhammad Salman, Iqra Tariq, Mishal Zulfiqar, Muqadas Jalal, Sami Aujla, Sumbal Fatima
Journal-ref: IET Conference Proceedings CP917, Volume 2025, Issue 3, Pages 526-533, The Institution of Engineering and Technology, 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[14] arXiv:2505.00334 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Quaternion Wavelet-Conditioned Diffusion Models for Image Super-Resolution
Title: 四元数小波条件扩散模型用于图像超分辨率
Luigi Sigillo, Christian Bianchi, Aurelio Uncini, Danilo Comminiello
Comments: Accepted for presentation at IJCNN 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Machine Learning (cs.LG)
[15] arXiv:2505.00335 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Efficient Neural Video Representation with Temporally Coherent Modulation
Title: 高效具有时间相干调制的神经视频表示
Seungjun Shin, Suji Kim, Dokwan Oh
Comments: ECCV 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI)
[16] arXiv:2505.00369 (cross-list from cs.CV) [cn-pdf, pdf, other]
Title: Automated segmentation of pediatric neuroblastoma on multi-modal MRI: Results of the SPPIN challenge at MICCAI 2023
Title: 多模态MRI中的儿童神经母细胞瘤自动分割:MICCAI 2023 SPPIN挑战赛结果
M.A.D. Buser, D.C. Simons, M. Fitski, M.H.W.A. Wijnen, A.S. Littooij, A.H. ter Brugge, I.N. Vos, M.H.A. Janse, M. de Boer, R. ter Maat, J. Sato, S. Kido, S. Kondo, S. Kasai, M. Wodzinski, H. Muller, J. Ye, J. He, Y. Kirchhoff, M.R. Rokkus, G. Haokai, S. Zitong, M. Fernández-Patón, D. Veiga-Canuto, D.G. Ellis, M.R. Aizenberg, B.H.M. van der Velden, H. Kuijf, A. De Luca, A.F.W. van der Steeg
Comments: 23 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[17] arXiv:2505.00378 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Cues3D: Unleashing the Power of Sole NeRF for Consistent and Unique Instances in Open-Vocabulary 3D Panoptic Segmentation
Title: Cues3D:释放单一NeRF的潜力以实现开放词汇3D全景分割中一致且独特的实例
Feng Xue, Wenzhuang Xu, Guofeng Zhong, Anlong Ming, Nicu Sebe
Comments: Accepted by Information Fusion
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[18] arXiv:2505.00380 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: The Invisible Threat: Evaluating the Vulnerability of Cross-Spectral Face Recognition to Presentation Attacks
Title: 隐形威胁:评估跨谱面人脸识别对呈现攻击的脆弱性
Anjith George, Sebastien Marcel
Comments: 10 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[19] arXiv:2505.00394 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: SOTA: Spike-Navigated Optimal TrAnsport Saliency Region Detection in Composite-bias Videos
Title: SOTA:复合偏置视频中的 Spike 导向最优传输显著区域检测
Wenxuan Liu, Yao Deng, Kang Chen, Xian Zhong, Zhaofei Yu, Tiejun Huang
Comments: Accepted to IJCAI 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[20] arXiv:2505.00421 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Real-Time Animatable 2DGS-Avatars with Detail Enhancement from Monocular Videos
Title: 基于单目视频细节增强的实时可动画2DGS虚拟形象
Xia Yuan, Hai Yuan, Wenyi Ge, Ying Fu, Xi Wu, Guanyu Xing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[21] arXiv:2505.00426 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Leveraging Pretrained Diffusion Models for Zero-Shot Part Assembly
Title: 利用预训练扩散模型进行零样本部件组装
Ruiyuan Zhang, Qi Wang, Jiaxiang Liu, Yu Zhang, Yuchi Huo, Chao Wu
Comments: 10 pages, 12 figures, Accepted by IJCAI-2025
Journal-ref: IJCAI 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[22] arXiv:2505.00452 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: ClearLines - Camera Calibration from Straight Lines
Title: ClearLines - 直线用于相机标定
Gregory Schroeder, Mohamed Sabry, Cristina Olaverri-Monreal
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[23] arXiv:2505.00482 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: JointDiT: Enhancing RGB-Depth Joint Modeling with Diffusion Transformers
Title: JointDiT:利用扩散变换器增强RGB-深度联合建模
Kwon Byung-Ki, Qi Dai, Lee Hyoseok, Chong Luo, Tae-Hyun Oh
Comments: Accepted to IEEE/CVF International Conference on Computer Vision (ICCV) 2025. Project page: https://byungki-k.github.io/JointDiT/ Code: https://github.com/kaist-ami/JointDiT
Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI)
[24] arXiv:2505.00497 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: KeySync: A Robust Approach for Leakage-free Lip Synchronization in High Resolution
Title: KeySync:一种无泄漏的高分辨率唇同步鲁棒方法
Antoni Bigata, Rodrigo Mira, Stella Bounareli, Michał Stypułkowski, Konstantinos Vougioukas, Stavros Petridis, Maja Pantic
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[25] arXiv:2505.00502 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Towards Scalable Human-aligned Benchmark for Text-guided Image Editing
Title: 面向可扩展的人类对齐基准的文本引导图像编辑
Suho Ryu, Kihyun Kim, Eugene Baek, Dongsoo Shin, Joonseok Lee
Comments: Accepted to CVPR 2025 (highlight)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[26] arXiv:2505.00507 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: HeAL3D: Heuristical-enhanced Active Learning for 3D Object Detection
Title: HeAL3D:启发式增强的主动学习在三维物体检测中的应用
Esteban Rivera, Surya Prabhakaran, Markus Lienkamp
Comments: Accepted in CVPRw2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[27] arXiv:2505.00511 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Inconsistency-based Active Learning for LiDAR Object Detection
Title: 基于不一致性的主动学习在LiDAR目标检测中的应用
Esteban Rivera, Loic Stratil, Markus Lienkamp
Comments: Accepted in IV2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[28] arXiv:2505.00512 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: InterLoc: LiDAR-based Intersection Localization using Road Segmentation with Automated Evaluation Method
Title: InterLoc:基于LiDAR的交叉口定位,使用道路分割的自动评估方法
Nguyen Hoang Khoi Tran, Julie Stephany Berrio, Mao Shan, Zhenxing Ming, Stewart Worrall
Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Robotics (cs.RO)
[29] arXiv:2505.00534 (cross-list from cs.CV) [cn-pdf, pdf, other]
Title: A Robust Deep Networks based Multi-Object MultiCamera Tracking System for City Scale Traffic
Title: 基于深度网络的城市规模多目标多摄像机跟踪系统
Muhammad Imran Zaman, Usama Ijaz Bajwa, Gulshan Saleem, Rana Hammad Raza
Journal-ref: Zaman, Muhammad Imran, et al. "A robust deep networks based multi-object multi-camera tracking system for city scale traffic." Multimedia Tools and Applications 83.6 (2024): 17163-17181
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[30] arXiv:2505.00564 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: X-ray illicit object detection using hybrid CNN-transformer neural network architectures
Title: 基于混合CNN-Transformer神经网络架构的X射线非法物品检测
Jorgen Cani, Christos Diou, Spyridon Evangelatos, Panagiotis Radoglou-Grammatikis, Vasileios Argyriou, Panagiotis Sarigiannidis, Iraklis Varlamis, Georgios Th. Papadopoulos
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[31] arXiv:2505.00568 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Multimodal Masked Autoencoder Pre-training for 3D MRI-Based Brain Tumor Analysis with Missing Modalities
Title: 多模态掩码自编码器预训练用于缺失模态的3D MRI脑肿瘤分析
Lucas Robinet, Ahmad Berjaoui, Elizabeth Cohen-Jonathan Moyal
Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI)
[32] arXiv:2505.00569 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: AnimalMotionCLIP: Embedding motion in CLIP for Animal Behavior Analysis
Title: AnimalMotionCLIP: 将运动嵌入CLIP用于动物行为分析
Enmin Zhong, Carlos R. del-Blanco, Daniel Berjón, Fernando Jaureguizar, Narciso García
Comments: 6 pages, 3 figures, Accepted for the poster session at the CV4Animals workshop: Computer Vision for Animal Behavior Tracking and Modeling, in conjunction with Computer Vision and Pattern Recognition 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[33] arXiv:2505.00584 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Synthesizing and Identifying Noise Levels in Autonomous Vehicle Camera Radar Datasets
Title: 合成与识别自动驾驶汽车摄像雷达数据集中的噪声水平
Mathis Morales, Golnaz Habibi
Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI) ; Image and Video Processing (eess.IV) ; Signal Processing (eess.SP)
[34] arXiv:2505.00592 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Uncertainty-Aware Multi-Expert Knowledge Distillation for Imbalanced Disease Grading
Title: 不确定性感知的多专家知识蒸馏用于疾病分级不平衡问题
Shuo Tong, Shangde Gao, Ke Liu, Zihang Huang, Hongxia Xu, Haochao Ying, Jian Wu
Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Machine Learning (cs.LG)
[35] arXiv:2505.00599 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Visual Trajectory Prediction of Vessels for Inland Navigation
Title: 内河航行船舶的视觉轨迹预测
Alexander Puzicha, Konstantin Wüstefeld, Kathrin Wilms, Frank Weichert
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[36] arXiv:2505.00606 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Dietary Intake Estimation via Continuous 3D Reconstruction of Food
Title: 膳食摄入估算通过食物的连续三维重建
Wallace Lee, YuHao Chen
Comments: 2025 CVPR MetaFood Workshop
Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Machine Learning (cs.LG)
[37] arXiv:2505.00615 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Pixel3DMM: Versatile Screen-Space Priors for Single-Image 3D Face Reconstruction
Title: Pixel3DMM:用于单图像3D人脸重建的多功能屏幕空间先验
Simon Giebenhain, Tobias Kirschstein, Martin Rünz, Lourdes Agapito, Matthias Nießner
Comments: Project Website: https://simongiebenhain.github.io/pixel3dmm/ ; Video: https://www.youtube.com/watch?v=BwxwEXJwUDc
Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI)
[38] arXiv:2505.00619 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Diverse Semantics-Guided Feature Alignment and Decoupling for Visible-Infrared Person Re-Identification
Title: 多样的语义引导的特征对齐和解耦用于可见光-红外人物再识别
Neng Dong, Shuanglin Yan, Liyan Zhang, Jinhui Tang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[39] arXiv:2505.00627 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Brain Foundation Models with Hypergraph Dynamic Adapter for Brain Disease Analysis
Title: 基于超图动态适配器的脑基础模型在脑疾病分析中的应用
Zhongying Deng, Haoyu Wang, Ziyan Huang, Lipei Zhang, Angelica I. Aviles-Rivero, Chaoyu Liu, Junjun He, Zoe Kourtzi, Carola-Bibiane Schönlieb
Comments: 35 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[40] arXiv:2505.00630 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Vision Mamba in Remote Sensing: A Comprehensive Survey of Techniques, Applications and Outlook
Title: 遥感中的 Vision Mamba:技术、应用和展望全面调研
Muyi Bao, Shuchang Lyu, Zhaoyang Xu, Huiyu Zhou, Jinchang Ren, Shiming Xiang, Xiangtai Li, Guangliang Cheng
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[41] arXiv:2505.00668 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Deep Reinforcement Learning for Urban Air Quality Management: Multi-Objective Optimization of Pollution Mitigation Booth Placement in Metropolitan Environments
Title: 城市空气质量管理系统中的深度强化学习:大都市环境中污染缓解亭布局的多目标优化
Kirtan Rajesh, Suvidha Rupesh Kumar
Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI) ; Machine Learning (cs.LG)
[42] arXiv:2505.00684 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Visual Test-time Scaling for GUI Agent Grounding
Title: 视觉测试时缩放用于GUI代理定位
Tiange Luo, Lajanugen Logeswaran, Justin Johnson, Honglak Lee
Comments: ICCV2025, https://github.com/tiangeluo/RegionFocus
Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI) ; Machine Learning (cs.LG)
[43] arXiv:2505.00690 (cross-list from cs.CV) [cn-pdf, pdf, other]
Title: Towards Autonomous Micromobility through Scalable Urban Simulation
Title: 迈向自主微移动的规模化城市仿真
Wayne Wu, Honglin He, Chaoyuan Zhang, Jack He, Seth Z. Zhao, Ran Gong, Quanyi Li, Bolei Zhou
Comments: CVPR 2025 Highlight. Project page: https://metadriverse.github.io/urban-sim/
Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI) ; Robotics (cs.RO)
[44] arXiv:2505.00702 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: RayZer: A Self-supervised Large View Synthesis Model
Title: RayZer:一种自监督的大视场合成模型
Hanwen Jiang, Hao Tan, Peng Wang, Haian Jin, Yue Zhao, Sai Bi, Kai Zhang, Fujun Luan, Kalyan Sunkavalli, Qixing Huang, Georgios Pavlakos
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[45] arXiv:2505.00703 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
Title: T2I-R1:通过协同语义级和标记级思维链增强图像生成
Dongzhi Jiang, Ziyu Guo, Renrui Zhang, Zhuofan Zong, Hao Li, Le Zhuo, Shilin Yan, Pheng-Ann Heng, Hongsheng Li
Comments: Project Page: https://github.com/CaraJ7/T2I-R1
Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI) ; Computation and Language (cs.CL) ; Machine Learning (cs.LG)
[46] arXiv:2505.00734 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Unconstrained Large-scale 3D Reconstruction and Rendering across Altitudes
Title: 无约束的大规模三维跨海拔重建与渲染
Neil Joshi, Joshua Carney, Nathanael Kuo, Homer Li, Cheng Peng, Myron Brown
Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Image and Video Processing (eess.IV)
[47] arXiv:2505.00739 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: MoSAM: Motion-Guided Segment Anything Model with Spatial-Temporal Memory Selection
Title: MoSAM:基于空间-时间记忆选择的运动引导Segment Anything模型
Qiushi Yang, Yuan Yao, Miaomiao Cui, Liefeng Bo
Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Image and Video Processing (eess.IV)
[48] arXiv:2505.00740 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Fast2comm: Collaborative perception combined with prior knowledge
Title: Fast2comm:结合先验知识的协同感知
Zhengbin Zhang, Yan Wu, Hongkun Zhang
Comments: 8 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Multiagent Systems (cs.MA)
[49] arXiv:2505.00741 (cross-list from cs.CV) [cn-pdf, pdf, other]
Title: Detection and Classification of Diseases in Multi-Crop Leaves using LSTM and CNN Models
Title: 基于LSTM和CNN模型的多作物叶片疾病检测与分类
Srinivas Kanakala, Sneha Ningappa
Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Machine Learning (cs.LG)
[50] arXiv:2505.00742 (cross-list from cs.CV) [cn-pdf, pdf, html, other]
Title: Zoomer: Adaptive Image Focus Optimization for Black-box MLLM
Title: Zoomer:针对黑盒MLLM的自适应图像焦点优化
Jiaxu Qian, Chendong Wang, Yifan Yang, Chaoyun Zhang, Huiqiang Jiang, Xufang Luo, Yu Kang, Qingwei Lin, Anlan Zhang, Shiqi Jiang, Ting Cao, Tianjun Mao, Suman Banerjee, Guyue Liu, Saravan Rajmohan, Dongmei Zhang, Yuqing Yang, Qi Zhang, Lili Qiu
Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI) ; Image and Video Processing (eess.IV)