Computer Vision and Pattern Recognition

Authors and titles for February 2025

Total of 2199 entries

[1] arXiv:2502.00051 (cross-list from cs.CV): Title: A two-stage dual-task learning strategy for early prediction of pathological complete response to neoadjuvant chemotherapy for breast cancer using dynamic contrast-enhanced magnetic resonance images

Bowen Jing (1), Jing Wang (1) ((1) Department of Radiation Oncology, University of Texas Southwestern Medical Center)

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Medical Physics (physics.med-ph)
[2] arXiv:2502.00074 (cross-list from cs.CV): Title: SpikingRTNH: Spiking Neural Network for 4D Radar Object Detection

Dong-Hee Paek, Seung-Hyun Kong

Comments: arxiv preprint

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI) ; Neural and Evolutionary Computing (cs.NE)
[3] arXiv:2502.00076 (cross-list from cs.CV): Title: Influence of color correction on pathology detection in Capsule Endoscopy

Bidossessi Emmanuel Agossou, Marius Pedersen, Kiran Raja, Anuja Vats, Pål Anders Floor

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI) ; Machine Learning (cs.LG)
[4] arXiv:2502.00083 (cross-list from cs.CV): Title: CerraData-4MM: A multimodal benchmark dataset on Cerrado for land use and land cover classification

Mateus de Souza Miranda, Ronny Hänsch, Valdivino Alexandre de Santiago Júnior, Thales Sehn Körting, Erison Carlos dos Santos Monteiro

Comments: 9 pages, 13 Figures, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Image and Video Processing (eess.IV)
[5] arXiv:2502.00094 (cross-list from cs.CV): Title: AIN: The Arabic INclusive Large Multimodal Model

Ahmed Heakl, Sara Ghaboura, Omkar Thawkar, Fahad Shahbaz Khan, Hisham Cholakkal, Rao Muhammad Anwer, Salman Khan

Comments: 20 pages, 16 figures, ACL

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI) ; Computation and Language (cs.CL) ; Human-Computer Interaction (cs.HC) ; Machine Learning (cs.LG)
[6] arXiv:2502.00129 (cross-list from cs.CV): Title: ProtoSnap: Prototype Alignment for Cuneiform Signs

Rachel Mikulinsky, Morris Alper, Shai Gordin, Enrique Jiménez, Yoram Cohen, Hadar Averbuch-Elor

Comments: Accepted to ICLR 2025. Project page: https://tau-vailab.github.io/ProtoSnap/

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Machine Learning (cs.LG)
[7] arXiv:2502.00133 (cross-list from cs.CV): Title: Exploring Transfer Learning for Deep Learning Polyp Detection in Colonoscopy Images Using YOLOv8

Fabian Vazquez, Jose Angel Nuñez, Xiaoyan Fu, Pengfei Gu, Bin Fu

Comments: 10 pages, 3 figures, 6 tables, SPIE conference

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI)
[8] arXiv:2502.00156 (cross-list from cs.CV): Title: ALBAR: Adversarial Learning approach to mitigate Biases in Action Recognition

Joseph Fioresi, Ishan Rajendrakumar Dave, Mubarak Shah

Comments: Accepted to ICLR 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Cryptography and Security (cs.CR)
[9] arXiv:2502.00173 (cross-list from cs.CV): Title: Lifting by Gaussians: A Simple, Fast and Flexible Method for 3D Instance Segmentation

Rohan Chacko, Nicolai Haeni, Eldar Khaliullin, Lin Sun, Douglas Lee

Comments: Accepted to WACV 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[10] arXiv:2502.00196 (cross-list from cs.CV): Title: DermaSynth: Rich Synthetic Image-Text Pairs Using Open Access Dermatology Datasets

Abdurrahim Yilmaz, Furkan Yuceyalcin, Ece Gokyayla, Donghee Choi, Ozan Erdem, Ali Anil Demircali, Rahmetullah Varol, Ufuk Gorkem Kirabali, Gulsum Gencoglan, Joram M. Posma, Burak Temelkuran

Comments: 12 pages, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI) ; Computation and Language (cs.CL)
[11] arXiv:2502.00205 (cross-list from cs.CV): Title: EcoWeedNet: A Lightweight and Automated Weed Detection Method for Sustainable Next-Generation Agricultural Consumer Electronics

Omar H. Khater, Abdul Jabbar Siddiqui, M. Shamim Hossain, Aiman El-Maleh

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI)
[12] arXiv:2502.00232 (cross-list from cs.CV): Title: A Hybrid Random Forest and CNN Framework for Tile-Wise Oil-Water Classification in Hyperspectral Images

Mehdi Nickzamir, Seyed Mohammad Sheikh Ahamdi Gandab

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI)
[13] arXiv:2502.00250 (cross-list from cs.CV): Title: Transformer-Based Vector Font Classification Using Different Font Formats: TrueType versus PostScript

Takumu Fujioka (1), Gouhei Tanaka (1 and 2) ((1) Nagoya Institute of Technology, (2) The University of Tokyo)

Comments: 8 pages, 8 figures, 4 tables, Submitted to IJCNN 2025. Code available at https://github.com/fjktkm/truetype-vs-postscript-transformer

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[14] arXiv:2502.00262 (cross-list from cs.CV): Title: INSIGHT: Enhancing Autonomous Driving Safety through Vision-Language Models on Context-Aware Hazard Detection and Edge Case Evaluation

Dianwei Chen, Zifan Zhang, Yuchen Liu, Xianfeng Terry Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI)
[15] arXiv:2502.00266 (cross-list from cs.CV): Title: MCM: Multi-layer Concept Map for Efficient Concept Learning from Masked Images

Yuwei Sun, Lu Mi, Ippei Fujisawa, Ryota Kanai

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Machine Learning (cs.LG)
[16] arXiv:2502.00307 (cross-list from cs.CV): Title: A Diffusion Model Translator for Efficient Image-to-Image Translation

Mengfei Xia, Yu Zhou, Ran Yi, Yong-Jin Liu, Wenping Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[17] arXiv:2502.00315 (cross-list from cs.CV): Title: MonoDINO-DETR: Depth-Enhanced Monocular 3D Object Detection Using a Vision Foundation Model

Jihyeok Kim, Seongwoo Moon, Sungwon Nah, David Hyunchul Shim

Comments: 8 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[18] arXiv:2502.00333 (cross-list from cs.CV): Title: BiMaCoSR: Binary One-Step Diffusion Model Leveraging Flexible Matrix Compression for Real Super-Resolution

Kai Liu, Kaicheng Yang, Zheng Chen, Zhiteng Li, Yong Guo, Wenbo Li, Linghe Kong, Yulun Zhang

Comments: 10 pages, 5 figures. The code and models will be available at https://github.com/Kai-Liu001/BiMaCoSR

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[19] arXiv:2502.00342 (cross-list from cs.CV): Title: Embodied Intelligence for 3D Understanding: A Survey on 3D Scene Question Answering

Zechuan Li, Hongshan Yu, Yihao Ding, Yan Li, Yong He, Naveed Akhtar

Comments: This is a submitted version of a paper accepted by Information Fusion

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[20] arXiv:2502.00360 (cross-list from cs.CV): Title: Shape from Semantics: 3D Shape Generation from Multi-View Semantics

Liangchen Li, Caoliwen Wang, Yuqi Zhou, Bailin Deng, Juyong Zhang

Comments: Project page: https://shapefromsemantics.github.io

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Graphics (cs.GR)
[21] arXiv:2502.00372 (cross-list from cs.CV): Title: NAVER: A Neuro-Symbolic Compositional Automaton for Visual Grounding with Explicit Logic Reasoning

Zhixi Cai, Fucai Ke, Simindokht Jahangard, Maria Garcia de la Banda, Reza Haffari, Peter J. Stuckey, Hamid Rezatofighi

Comments: ICCV 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[22] arXiv:2502.00375 (cross-list from cs.CV): Title: Scalable Framework for Classifying AI-Generated Content Across Modalities

Anh-Kiet Duong, Petra Gomez-Krämer

Comments: Defactify4 @ AAAI 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[23] arXiv:2502.00379 (cross-list from cs.CV): Title: Latent Action Learning Requires Supervision in the Presence of Distractors

Alexander Nikulin, Ilya Zisman, Denis Tarasov, Nikita Lyubaykin, Andrei Polubarov, Igor Kiselev, Vladislav Kurenkov

Comments: ICML 2025, Poster, Project Page: https://laom.dunnolab.ai/, Source code: https://github.com/dunnolab/laom

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI) ; Machine Learning (cs.LG)
[24] arXiv:2502.00382 (cross-list from cs.CV): Title: Masked Generative Nested Transformers with Decode Time Scaling

Sahil Goyal, Debapriya Tula, Gagan Jain, Pradeep Shenoy, Prateek Jain, Sujoy Paul

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI) ; Machine Learning (cs.LG)
[25] arXiv:2502.00386 (cross-list from cs.CV): Title: Efficient Adaptive Label Refinement for Label Noise Learning

Wenzhen Zhang, Debo Cheng, Guangquan Lu, Bo Zhou, Jiaye Li, Shichao Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[26] arXiv:2502.00392 (cross-list from cs.CV): Title: RefDrone: A Challenging Benchmark for Referring Expression Comprehension in Drone Scenes

Zhichao Sun, Yepeng Liu, Huachao Zhu, Yuliang Gu, Yuda Zou, Zelong Liu, Gui-Song Xia, Bo Du, Yongchao Xu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[27] arXiv:2502.00397 (cross-list from cs.CV): Title: Minimalistic Video Saliency Prediction via Efficient Decoder & Spatio Temporal Action Cues

Rohit Girmaji, Siddharth Jain, Bhav Beri, Sarthak Bansal, Vineet Gandhi

Comments: Accepted at 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[28] arXiv:2502.00402 (cross-list from cs.CV): Title: Enhancing Highway Safety: Accident Detection on the A9 Test Stretch Using Roadside Sensors

Walter Zimmer, Ross Greer, Xingcheng Zhou, Rui Song, Marc Pavel, Daniel Lehmberg, Ahmed Ghita, Akshay Gopalkrishnan, Mohan Trivedi, Alois Knoll

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[29] arXiv:2502.00404 (cross-list from cs.CV): Title: Exploring Linear Attention Alternative for Single Image Super-Resolution

Rongchang Lu, Changyu Li, Donghang Li, Guojing Zhang, Jianqiang Huang, Xilai Li

Comments: This paper has been published to IEEE International Joint Conference on Neural Networks 2025 as the final camera ready version. Contact at nomodeset@qq.com

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Image and Video Processing (eess.IV)
[30] arXiv:2502.00412 (cross-list from cs.CV): Title: TROI: Cross-Subject Pretraining with Sparse Voxel Selection for Enhanced fMRI Visual Decoding

Ziyu Wang, Tengyu Pan, Zhenyu Li, Ji Wu, Xiuxing Li, Jianyong Wang

Comments: ICASSP 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[31] arXiv:2502.00418 (cross-list from cs.CV): Title: Parameter Efficient Fine-Tuning of Segment Anything Model for Biomedical Imaging

Carolin Teuber, Anwai Archit, Constantin Pape

Comments: Published in MIDL 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[32] arXiv:2502.00425 (cross-list from cs.CV): Title: MQuant: Unleashing the Inference Potential of Multimodal Large Language Models via Full Static Quantization

JiangYong Yu, Sifan Zhou, Dawei Yang, Shuo Wang, Shuoyu Li, Xing Hu, Chen Xu, Zukang Xu, Changyong Shu, Zhihang Yuan

Comments: Accepted by ACM MM 2025. First PTQ solution for Multimodal large language models applicable to 5 mainstream MLLMs

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI)
[33] arXiv:2502.00426 (cross-list from cs.CV): Title: TEST-V: TEst-time Support-set Tuning for Zero-shot Video Classification

Rui Yan, Jin Wang, Hongyu Qu, Xiaoyu Du, Dong Zhang, Jinhui Tang, Tieniu Tan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[34] arXiv:2502.00433 (cross-list from cs.CV): Title: CAT Pruning: Cluster-Aware Token Pruning For Text-to-Image Diffusion Models

Xinle Cheng, Zhuoming Chen, Zhihao Jia

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[35] arXiv:2502.00435 (cross-list from cs.CV): Title: SatMamba: Development of Foundation Models for Remote Sensing Imagery Using State Space Models

Chuc Man Duc, Hiromichi Fukui

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[36] arXiv:2502.00462 (cross-list from cs.CV): Title: MambaGlue: Fast and Robust Local Feature Matching With Mamba

Kihwan Ryoo, Hyungtae Lim, Hyun Myung

Comments: Proc. IEEE Int'l Conf. Robotics and Automation (ICRA) 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Robotics (cs.RO)
[37] arXiv:2502.00464 (cross-list from cs.CV): Title: Evaluation of End-to-End Continuous Spanish Lipreading in Different Data Conditions

David Gimeno-Gómez, Carlos-D. Martínez-Hinarejos

Comments: Accepted in the "Language Resources and Evaluation" journal, Springer Nature

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[38] arXiv:2502.00474 (cross-list from cs.CV): Title: A framework for river connectivity classification using temporal image processing and attention based neural networks

Timothy James Becker, Derin Gezgin, Jun Yi He Wu, Mary Becker

Comments: 15 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Machine Learning (cs.LG) ; Image and Video Processing (eess.IV)
[39] arXiv:2502.00500 (cross-list from cs.CV): Title: Video Latent Flow Matching: Optimal Polynomial Projections for Video Interpolation and Extrapolation

Yang Cao, Zhao Song, Chiwun Yang

Comments: 39 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI) ; Machine Learning (cs.LG)
[40] arXiv:2502.00528 (cross-list from cs.CV): Title: Vision-Language Modeling in PET/CT for Visual Grounding of Positive Findings

Zachary Huemann, Samuel Church, Joshua D. Warner, Daniel Tran, Xin Tie, Alan B McMillan, Junjie Hu, Steve Y. Cho, Meghan Lubner, Tyler J. Bradshaw

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Computation and Language (cs.CL)
[41] arXiv:2502.00535 (cross-list from cs.CV): Title: Work-Efficient Parallel Non-Maximum Suppression Kernels

David Oro, Carles Fernández, Xavier Martorell, Javier Hernando

Comments: Code: https://github.com/hertasecurity/gpu-nms

Journal-ref: The Computer Journal, Volume 65, Issue 4, April 2022, Pages 773-787

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Distributed, Parallel, and Cluster Computing (cs.DC)
[42] arXiv:2502.00536 (cross-list from cs.CV): Title: CAD: Confidence-Aware Adaptive Displacement for Semi-Supervised Medical Image Segmentation

Wenbo Xiao, Zhihao Xu, Guiping Liang, Yangjun Deng, Yi Xiao

Comments: 9 pages, 3 figures, 4 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Machine Learning (cs.LG)
[43] arXiv:2502.00547 (cross-list from cs.CV): Title: Milmer: a Framework for Multiple Instance Learning based Multimodal Emotion Recognition

Zaitian Wang, Jian He, Yu Liang, Xiyuan Hu, Tianhao Peng, Kaixin Wang, Jiakai Wang, Chenlong Zhang, Weili Zhang, Shuang Niu, Xiaoyang Xie

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI) ; Human-Computer Interaction (cs.HC)
[44] arXiv:2502.00563 (cross-list from cs.CV): Title: Complex Wavelet Mutual Information Loss: A Multi-Scale Loss Function for Semantic Segmentation

Renhao Lu

Comments: Accepted at ICML 2025. This version corresponds to the official camera-ready submission

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Image and Video Processing (eess.IV)
[45] arXiv:2502.00568 (cross-list from cs.CV): Title: Generating crossmodal gene expression from cancer histopathology improves multimodal AI predictions

Samiran Dey, Christopher R.S. Banerji, Partha Basuchowdhuri, Sanjoy K. Saha, Deepak Parashar, Tapabrata Chakraborti

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI) ; Machine Learning (cs.LG)
[46] arXiv:2502.00571 (cross-list from cs.CV): Title: Contrastive Forward-Forward: A Training Algorithm of Vision Transformer

Hossein Aghagolzadeh, Mehdi Ezoji

Comments: 22 pages, 8 figures, under review

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Machine Learning (cs.LG)
[47] arXiv:2502.00594 (cross-list from cs.CV): Title: Fast Vision Mamba: Pooling Spatial Dimensions for Accelerated Processing

Saarthak Kapse, Robin Betz, Srinivasan Sivanandan

Comments: 20 pages, 15 figures, https://github.com/insitro/FastVim

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI)
[48] arXiv:2502.00618 (cross-list from cs.CV): Title: DesCLIP: Robust Continual Learning via General Attribute Descriptions for VLM-Based Visual Recognition

Chiyuan He, Zihuan Qiu, Fanman Meng, Linfeng Xu, Qingbo Wu, Hongliang Li

Comments: Accepted by IEEE Transactions on Multimedia

Subjects: Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI)
[49] arXiv:2502.00630 (cross-list from cs.CV): Title: Self-Prompt SAM: Medical Image Segmentation via Automatic Prompt SAM Adaptation

Bin Xie, Hao Tang, Dawen Cai, Yan Yan, Gady Agam

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[50] arXiv:2502.00631 (cross-list from cs.CV): Title: MedConv: Convolutions Beat Transformers on Long-Tailed Bone Density Prediction

Xuyin Qi, Zeyu Zhang, Huazhan Zheng, Mingxi Chen, Numan Kutaiba, Ruth Lim, Cherie Chiang, Zi En Tham, Xuan Ren, Wenxin Zhang, Lei Zhang, Hao Zhang, Wenbing Lv, Guangzhen Yao, Renda Han, Kangsheng Wang, Mingyuan Li, Hongtao Mao, Yu Li, Zhibin Liao, Yang Zhao, Minh-Son To

Comments: Accepted to IJCNN 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
