
Signal Processing (eess.SP)


Showing new listings for Tuesday, 29 July 2025

Total of 53 entries

New submissions (showing 23 of 23 entries)

[1] arXiv:2507.19546 [Chinese pdf, pdf, other]
Title: Multipath Interference Suppression in Indirect Time-of-Flight Imaging via a Novel Compressed Sensing Framework
Yansong Du, Yutong Deng, Yuting Zhou, Feiyu Jiao, Bangyao Wang, Zhancong Xu, Zhaoxiang Jiang, Xun Guan
Comments: 15 pages, 10 figures
Subjects: Signal Processing (eess.SP); Computer Vision and Pattern Recognition (cs.CV)

We propose a novel compressed sensing method to improve the depth reconstruction accuracy and multi-target separation capability of indirect Time-of-Flight (iToF) systems. Unlike traditional approaches that rely on hardware modifications, complex modulation, or cumbersome data-driven reconstruction, our method operates with a single modulation frequency and constructs the sensing matrix using multiple phase shifts and narrow-duty-cycle continuous waves. During matrix construction, we further account for pixel-wise range variation caused by lens distortion, making the sensing matrix better aligned with actual modulation response characteristics. To enhance sparse recovery, we apply K-Means clustering to the distance response dictionary and constrain atom selection within each cluster during the OMP process, which effectively reduces the search space and improves solution stability. Experimental results demonstrate that the proposed method outperforms traditional approaches in both reconstruction accuracy and robustness, without requiring any additional hardware changes.
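
A minimal numpy sketch of the cluster-constrained OMP idea described above may help make the recovery step concrete. The cluster-selection rule (pick the cluster with the largest aggregate correlation, then the best atom inside it), the matrix sizes, and the sparsity level are illustrative assumptions, not the paper's exact algorithm:

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_constrained_omp(A, y, n_clusters=8, sparsity=2):
    """A: (m, n) sensing matrix of distance responses, y: (m,) phase-shift measurements."""
    n = A.shape[1]
    # cluster the dictionary atoms (for an iToF distance dictionary, clusters
    # roughly correspond to distance ranges)
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(A.T)
    residual, support = y.copy(), []
    for _ in range(sparsity):
        corr = np.abs(A.T @ residual)                     # correlation with every atom
        best_cluster = np.argmax([corr[labels == c].sum() for c in range(n_clusters)])
        candidates = np.flatnonzero(labels == best_cluster)       # restrict the search to that cluster
        support.append(candidates[np.argmax(corr[candidates])])
        x_s, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)   # least-squares refit on the support
        residual = y - A[:, support] @ x_s
    x = np.zeros(n)
    x[support] = x_s
    return x

# toy usage: 16 phase-shift measurements, 64 candidate distances, 2 returns
rng = np.random.default_rng(0)
A = rng.standard_normal((16, 64))
x_true = np.zeros(64); x_true[[10, 40]] = [1.0, 0.6]
print(np.flatnonzero(cluster_constrained_omp(A, A @ x_true)))     # recovered return positions
```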

[2] arXiv:2507.19763 [Chinese pdf, pdf, html, other]
Title: Coverage Probability and Average Rate Analysis of Hybrid Cellular and Cell-free Network
Zhuoyin Dai, Jingran Xu, Xiaoli Xu, Ruoguang Li, Yong Zeng, Jiangbin Lyu
Subjects: Signal Processing (eess.SP)

Cell-free wireless networks deploy distributed access points (APs) to simultaneously serve user equipments (UEs) across the service region and are regarded as one of the most promising network architectural paradigms. Despite recent advances in the performance analysis and optimization of cell-free wireless networks, it remains an open question whether large-scale deployment of APs in existing wireless networks can cost-effectively achieve communication capacity growth. Besides, the realization of a cell-free network is considered to be a gradual long-term evolutionary process in which cell-free APs will be incrementally introduced into existing cellular networks, and form a hybrid communication network with the existing cellular base stations (BSs). Such a collaboration will bridge the gap between the established cellular network and the innovative cell-free network. Therefore, hybrid cellular and cell-free networks (HCCNs) emerge as a practical and feasible solution for advancing cell-free network development, and it is worthwhile to further explore their performance limits. This paper presents a stochastic geometry-based hybrid cellular and cell-free network model to analyze the distributions of signal and interference and reveal their mutual coupling. Specifically, in order for the UEs to benefit from both the cellular BSs and the cell-free APs, a conjugate beamforming design is employed, and the aggregated signal is analyzed using moment matching. Then, the coverage probability of the hybrid network is characterized by deriving the Laplace transforms of the interference components and their higher-order derivatives. Furthermore, the average achievable rate of the hybrid network over channel fading is derived based on the interference coupling analysis.
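
For orientation, the reason Laplace transforms (and their derivatives) appear is already visible in the classical single-tier downlink analysis under Rayleigh fading, which is a much simpler setting than the HCCN model above:

$$
P_{\mathrm{cov}}(\theta)=\Pr\!\left[\frac{P h r^{-\alpha}}{\sigma^{2}+I}>\theta\right]
=\mathbb{E}_{r}\!\left[e^{-\theta r^{\alpha}\sigma^{2}/P}\,\mathcal{L}_{I}\!\left(\frac{\theta r^{\alpha}}{P}\right)\right],
\qquad
\mathcal{L}_{I}(s)=\mathbb{E}\!\left[e^{-sI}\right].
$$

When the desired signal power is instead moment-matched to a Gamma random variable with integer shape $N$, its tail probability $e^{-x}\sum_{k=0}^{N-1}x^{k}/k!$ brings terms of the form $\mathbb{E}[I^{k}e^{-sI}]=(-1)^{k}\mathcal{L}_{I}^{(k)}(s)$ into the computation, which is why higher-order derivatives of the interference Laplace transforms are needed in the characterization mentioned above.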

[3] arXiv:2507.19785 [Chinese pdf, pdf, html, other]
Title: Radar and Acoustic Sensor Fusion using a Transformer Encoder for Robust Drone Detection and Classification
Gevindu Ganganath, Pasindu Sankalpa, Samal Punsara, Demitha Pasindu, Chamira U. S. Edussooriya, Ranga Rodrigo, Udaya S. K. P. Miriya Thanthrige
Comments: Submitted to IEEE Sensors Letters
Subjects: Signal Processing (eess.SP)

The use of drones in a wide range of applications is steadily increasing. However, this has also raised critical security concerns such as unauthorized drone intrusions into restricted zones. Therefore, robust and accurate drone detection and classification mechanisms are required despite significant challenges due to the small size of drones, low-altitude flight, and environmental noise. In this letter, we propose a multi-modal approach combining radar and acoustic sensing for detecting and classifying drones. We employ radar due to its long-range capabilities and robustness to different weather conditions. We utilize raw acoustic signals without converting them to other domains such as spectrograms or Mel-frequency cepstral coefficients. This enables us to use fewer parameters compared to state-of-the-art approaches. Furthermore, we explore the effectiveness of the transformer encoder architecture in fusing these sensors. Experimental results obtained in outdoor settings verify the superior performance of the proposed approach compared to the state-of-the-art methods.
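
The fusion stage described above can be pictured with a short PyTorch sketch: each modality is embedded into one token and a transformer encoder attends across the two tokens. The embedding networks, layer sizes, and the three-class head are placeholders, not the letter's actual architecture:

```python
import torch
import torch.nn as nn

class RadarAcousticFusion(nn.Module):
    def __init__(self, radar_dim=256, d_model=128, n_classes=3):
        super().__init__()
        self.radar_embed = nn.Linear(radar_dim, d_model)           # radar feature vector -> token
        self.audio_embed = nn.Sequential(                          # raw waveform -> token (no spectrogram/MFCC)
            nn.Conv1d(1, 32, kernel_size=64, stride=16), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(32, d_model))
        enc_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=2)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, radar, audio):
        tokens = torch.stack([self.radar_embed(radar),
                              self.audio_embed(audio.unsqueeze(1))], dim=1)   # (B, 2, d_model)
        fused = self.encoder(tokens).mean(dim=1)                   # attention across the two modalities
        return self.head(fused)

model = RadarAcousticFusion()
logits = model(torch.randn(4, 256), torch.randn(4, 16000))         # radar features, raw audio
print(logits.shape)                                                # torch.Size([4, 3])
```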

[4] arXiv:2507.19812 [Chinese pdf, pdf, html, other]
Title: Channel Estimation in Massive MIMO Systems with Orthogonal Delay-Doppler Division Multiplexing
Dezhi Wang, Chongwen Huang, Xiaojun Yuan, Sami Muhaidat, Lei Liu, Xiaoming Chen, Zhaoyang Zhang, Chau Yuen, Mérouane Debbah
Subjects: Signal Processing (eess.SP)

Orthogonal delay-Doppler division multiplexing (ODDM) modulation has recently been regarded as a promising technology to provide reliable communications in high-mobility situations. Accurate and low-complexity channel estimation is one of the most critical challenges for massive multiple input multiple output (MIMO) ODDM systems, mainly due to the extremely large antenna arrays and high-mobility environments. To overcome these challenges, this paper addresses the issue of channel estimation in downlink massive MIMO-ODDM systems and proposes a low-complexity algorithm based on memory approximate message passing (MAMP) to estimate the channel state information (CSI). Specifically, we first establish the effective channel model of the massive MIMO-ODDM systems, where the magnitudes of the elements in the equivalent channel vector follow a Bernoulli-Gaussian distribution. Further, as the number of antennas grows, the elements in the equivalent coefficient matrix tend to become completely random. Leveraging these characteristics, we utilize the MAMP method to determine the gains, delays, and Doppler effects of the multi-path channel, while the channel angles are estimated through the discrete Fourier transform method. Finally, numerical results show that the proposed channel estimation algorithm approaches the Bayesian optimal results when the number of antennas tends to infinity and improves the channel estimation accuracy by about 30% compared with the existing algorithms in terms of the normalized mean square error.

[5] arXiv:2507.19837 [Chinese pdf, pdf, html, other]
Title: Feature Engineering for Wireless Communications and Networking: Concepts, Methodologies, and Applications
Jiacheng Wang, Changyuan Zhao, Zehui Xiong, Tao Xiang, Dusit Niyato, Xianbin Wang, Shiwen Mao, Dong In Kim
Comments: 7 pages, 5 figures
Subjects: Signal Processing (eess.SP)

AI-enabled wireless communications have attracted tremendous research interest in recent years, particularly with the rise of novel paradigms such as low-altitude integrated sensing and communication (ISAC) networks. Within these systems, feature engineering plays a pivotal role by transforming raw wireless data into structured representations suitable for AI models. Hence, this paper offers a comprehensive investigation of feature engineering techniques in AI-driven wireless communications. Specifically, we begin with a detailed analysis of fundamental principles and methodologies of feature engineering. Next, we present its applications in wireless communication systems, with special emphasis on ISAC networks. Finally, we introduce a generative AI-based framework, which can reconstruct signal feature spectrum under malicious attacks in low-altitude ISAC networks. The case study shows that it can effectively reconstruct the signal spectrum, achieving an average structural similarity index improvement of 4%, thereby supporting downstream sensing and communication applications.

[6] arXiv:2507.19910 [Chinese pdf, pdf, html, other]
Title: Toward Dual-Functional LAWN: Control-Aware System Design for Aerodynamics-Aided UAV Formations
Jun Wu, Weijie Yuan, Qingqing Cheng, Haijia Jin
Subjects: Signal Processing (eess.SP)

Integrated sensing and communication (ISAC) has emerged as a pivotal technology for advancing low-altitude wireless networks (LAWNs), serving as a critical enabler for next-generation communication systems. This paper investigates the system design for energy-saving unmanned aerial vehicle (UAV) formations in dual-functional LAWNs, where a ground base station (GBS) simultaneously wirelessly controls multiple UAV formations and performs sensing tasks. To enhance flight endurance, we exploit the aerodynamic upwash effects and propose a distributed energy-saving formation framework based on the adapt-then-combine (ATC) diffusion least mean square (LMS) algorithm. Specifically, each UAV updates the local position estimate by invoking the LMS algorithm, followed by refining it through cooperative information exchange with neighbors. This enables an optimized aerodynamic structure that minimizes the formation's overall energy consumption. To ensure control stability and fairness, we formulate a maximum linear quadratic regulator (LQR) minimization problem, which is subject to both the available power budget and the required sensing beam pattern gain. To address this non-convex problem, we develop a two-step approach by first deriving a closed-form expression of LQR as a function of arbitrary beamformers. Subsequently, an efficient iterative algorithm that integrates successive convex approximation (SCA) and semidefinite relaxation (SDR) techniques is proposed to obtain a sub-optimal dual-functional beamforming solution. Extensive simulation results confirm that the 'V'-shaped formation is the most energy-efficient configuration and demonstrate the superiority of our proposed design over benchmark schemes in improving control performance.
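
The adapt-then-combine (ATC) diffusion LMS update itself is compact enough to sketch directly; the toy network below uses uniform combination weights and synthetic data rather than the UAV position-estimation setup of the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
N, dim, mu = 5, 3, 0.05                       # nodes (UAVs), parameter dimension, step size
w_true = rng.standard_normal(dim)
A = np.full((N, N), 1.0 / N)                  # combination weights a_{lk}; each column sums to 1
w = np.zeros((N, dim))

for _ in range(2000):
    psi = np.empty_like(w)
    for k in range(N):                        # adapt: local LMS step at node k
        x_k = rng.standard_normal(dim)
        d_k = x_k @ w_true + 0.01 * rng.standard_normal()
        psi[k] = w[k] + mu * x_k * (d_k - x_k @ w[k])
    for k in range(N):                        # combine: average the neighbors' intermediate estimates
        w[k] = A[:, k] @ psi
print(np.linalg.norm(w - w_true, axis=1))     # per-node estimation error after diffusion
```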

[7] arXiv:2507.19936 [Chinese pdf, pdf, html, other]
Title: Deep Learning Based Joint Channel Estimation and Positioning for Sparse XL-MIMO OFDM Systems
Zhongnian Li, Chao Zheng, Jian Xiao, Ji Wang, Gongpu Wang, Ming Zeng, Octavia A. Dobre
Comments: 5 pages, 8 figures
Subjects: Signal Processing (eess.SP); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

This paper investigates joint channel estimation and positioning in near-field sparse extra-large multiple-input multiple-output (XL-MIMO) orthogonal frequency division multiplexing (OFDM) systems. To achieve cooperative gains between channel estimation and positioning, we propose a deep learning-based two-stage framework comprising positioning and channel estimation. In the positioning stage, the user's coordinates are predicted and utilized in the channel estimation stage, thereby enhancing the accuracy of channel estimation. Within this framework, we propose a U-shaped Mamba architecture for channel estimation and positioning, termed as CP-Mamba. This network integrates the strengths of the Mamba model with the structural advantages of U-shaped convolutional networks, enabling effective capture of local spatial features and long-range temporal dependencies of the channel. Numerical simulation results demonstrate that the proposed two-stage approach with CP-Mamba architecture outperforms existing baseline methods. Moreover, sparse arrays (SA) exhibit significantly superior performance in both channel estimation and positioning accuracy compared to conventional compact arrays.

[8] arXiv:2507.19984 [Chinese pdf, pdf, html, other]
Title: Dependability Theory-based Statistical QoS Provisioning of Fluid Antenna Systems
Irfan Muhammad, Priyadarshi Mukherjee, Wee Kiat New, Hirley Alves, Ioannis Krikidis, Kai-Kit Wong
Subjects: Signal Processing (eess.SP)

Fluid antenna systems (FAS) have recently emerged as a promising technology for next-generation wireless networks, offering real-time spatial reconfiguration to enhance reliability, throughput, and energy efficiency. Nevertheless, existing studies often overlook the temporal dynamics of channel fading and their implications for mission-critical operations. In this paper, we propose a dependability-theoretic framework for statistical quality-of-service (QoS) provisioning of FAS under finite blocklength (FBL) constraints. Specifically, we derive new closed-form expressions for the level-crossing rate (LCR) and average fade duration (AFD) of an $N$-port FAS over Nakagami-$m$ fading channels. Leveraging these second-order statistics, we define two key dependability metrics, namely mission reliability and mean time-to-first-failure (MTTFF), to quantify the probability of uninterrupted operation over a defined mission duration. We further extend the classical effective capacity (EC) concept to incorporate mission reliability in the FBL regime, yielding a mission EC (mEC). To capture energy efficiency under bursty traffic and latency constraints, we also develop the mission effective energy efficiency (mEEE) metric and formulate its maximization as a non-convex fractional optimization problem. This problem is then solved via a modified Dinkelbach's method with an embedded line search. Extensive simulations uncover critical trade-offs among port count, QoS exponent, signal-to-noise ratio, and mission duration, offering insights for the design of ultra-reliable, low-latency, and energy-efficient industrial internet-of-things (IIoT) systems.
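
For context, the classical single-branch Nakagami-$m$ second-order statistics (which the paper generalizes to an $N$-port FAS) are

$$
N_R(\rho)=\sqrt{2\pi}\,f_d\,\frac{m^{\,m-\frac{1}{2}}\,\rho^{\,2m-1}}{\Gamma(m)}\,e^{-m\rho^{2}},
\qquad
\mathrm{AFD}(\rho)=\frac{\Pr[R\le\rho]}{N_R(\rho)}=\frac{\gamma\!\left(m,\,m\rho^{2}\right)}{\Gamma(m)\,N_R(\rho)},
$$

where $f_d$ is the maximum Doppler frequency, $\rho$ is the threshold normalized to the root-mean-square envelope level, and $\gamma(\cdot,\cdot)$ is the lower incomplete gamma function; the mission-reliability and MTTFF metrics above are then built on top of such level-crossing statistics.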

[9] arXiv:2507.19996 [Chinese pdf, pdf, html, other]
Title: DOA Estimation via Optimal Weighted Low-Rank Matrix Completion
Saeed Razavikia, Mohammad Bokaei, Arash Amini, Stefano Rini, Carlo Fischione
Subjects: Signal Processing (eess.SP)

This paper presents a novel method for estimating the direction of arrival (DOA) for a non-uniform and sparse linear sensor array using the weighted lifted structure low-rank matrix completion. The proposed method uses a single snapshot sample in which a single array of data is observed. The method is rooted in a weighted lifted-structured low-rank matrix recovery framework. The method involves four key steps: (i) lifting the antenna samples to form a low-rank structure, then (ii) designing left and right weight matrices to reflect the sample informativeness, (iii) estimating a noise-free uniform array output through completion of the weighted lifted samples, and (iv) obtaining the DOAs from the restored uniform linear array samples. We study the complexity of steps (i) to (iii) above, where we analyze the number of samples required for the array interpolation of step (iii) for DOA estimation. We demonstrate that the proposed choice of weight matrices achieves a near-optimal sample complexity. This complexity aligns with the problem's degree of freedom, equivalent to the number of DOAs adjusted for logarithmic factors. Numerical evaluations show the proposed method's superiority against the non-weighted counterpart and atomic norm minimization-based methods. Notably, our proposed method significantly improves performance, with approximately a 10 dB reduction in normalized mean-squared error over the non-weighted method under low-noise conditions.
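
The lifting step (i) is the part that creates the low-rank object; a small noiseless sketch (array length, source angles, and window size chosen arbitrarily) shows that the lifted Hankel matrix of a single ULA snapshot has rank equal to the number of DOAs, which is what steps (ii)-(iii) exploit for completion:

```python
import numpy as np

M, K = 32, 2                                        # full ULA length, number of sources
theta = np.deg2rad([-10.0, 25.0])
n = np.arange(M)
x = sum(np.exp(2j * np.pi * 0.5 * n * np.sin(t)) for t in theta)   # single noiseless snapshot

L = M // 2
lifted = np.array([x[i:i + L] for i in range(M - L + 1)])          # Hankel (lifted) matrix
print(np.linalg.matrix_rank(lifted, tol=1e-8) == K)                # True: rank equals the number of DOAs
```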

[10] arXiv:2507.20189 [Chinese pdf, pdf, html, other]
Title: NeuroCLIP: A Multimodal Contrastive Learning Method for rTMS-treated Methamphetamine Addiction Analysis
Chengkai Wang, Di Wu, Yunsheng Liao, Wenyao Zheng, Ziyi Zeng, Xurong Gao, Hemmings Wu, Zhoule Zhu, Jie Yang, Lihua Zhong, Weiwei Cheng, Yun-Hsuan Chen, Mohamad Sawan
Subjects: Signal Processing (eess.SP); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Neurons and Cognition (q-bio.NC)

Methamphetamine dependence poses a significant global health challenge, yet its assessment and the evaluation of treatments like repetitive transcranial magnetic stimulation (rTMS) frequently depend on subjective self-reports, which may introduce uncertainties. While objective neuroimaging modalities such as electroencephalography (EEG) and functional near-infrared spectroscopy (fNIRS) offer alternatives, their individual limitations and the reliance on conventional, often hand-crafted, feature extraction can compromise the reliability of derived biomarkers. To overcome these limitations, we propose NeuroCLIP, a novel deep learning framework integrating simultaneously recorded EEG and fNIRS data through a progressive learning strategy. This approach offers a robust and trustworthy biomarker for methamphetamine addiction. Validation experiments show that NeuroCLIP significantly improves the discrimination between methamphetamine-dependent individuals and healthy controls compared to models using either EEG or fNIRS alone. Furthermore, the proposed framework facilitates objective, brain-based evaluation of rTMS treatment efficacy, demonstrating measurable shifts in neural patterns towards healthy control profiles after treatment. Critically, we establish the trustworthiness of the multimodal data-driven biomarker by showing its strong correlation with psychometrically validated craving scores. These findings suggest that the biomarker derived from EEG-fNIRS data via NeuroCLIP offers enhanced robustness and reliability over single-modality approaches, providing a valuable tool for addiction neuroscience research and potentially improving clinical assessments.
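
The alignment objective at the heart of a CLIP-style multimodal model can be sketched in a few lines of PyTorch; the encoders, temperature, and the progressive-learning schedule of NeuroCLIP are not reproduced here:

```python
import torch
import torch.nn.functional as F

def clip_style_loss(eeg_emb, fnirs_emb, temperature=0.07):
    """eeg_emb, fnirs_emb: (B, D) embeddings of simultaneously recorded trials."""
    eeg = F.normalize(eeg_emb, dim=-1)
    fnirs = F.normalize(fnirs_emb, dim=-1)
    logits = eeg @ fnirs.t() / temperature            # (B, B) similarity matrix
    targets = torch.arange(eeg.size(0), device=eeg.device)
    # matched EEG/fNIRS pairs lie on the diagonal; both directions are treated symmetrically
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

loss = clip_style_loss(torch.randn(8, 64), torch.randn(8, 64))
print(loss.item())
```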

[11] arXiv:2507.20283 [Chinese pdf, pdf, html, other]
Title: Information-Preserving CSI Feedback: Invertible Networks with Endogenous Quantization and Channel Error Mitigation
Haotian Tian, Lixiang Lian, Jiaqi Cao, Sijie Ji
Subjects: Signal Processing (eess.SP)

Deep learning has emerged as a promising solution for efficient channel state information (CSI) feedback in frequency division duplex (FDD) massive MIMO systems. Conventional deep learning-based methods typically rely on a deep autoencoder to compress the CSI, which leads to irreversible information loss and degrades reconstruction accuracy. This paper introduces InvCSINet, an information-preserving CSI feedback framework based on invertible neural networks (INNs). By leveraging the bijective nature of INNs, the model ensures information-preserving compression and reconstruction with shared model parameters. To address practical challenges such as quantization and channel-induced errors, we endogenously integrate an adaptive quantization module, a differentiable bit-channel distortion module, and an information compensation module into the INN architecture. This design enables the network to learn and compensate for the information loss during CSI compression, quantization, and noisy transmission, thereby preserving the CSI integrity throughout the feedback process. Simulation results validate the effectiveness of the proposed scheme, demonstrating superior CSI recovery performance and robustness to practical impairments with a lightweight architecture.
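
The information-preserving property comes from the bijective building blocks of an INN; a toy affine coupling layer (not InvCSINet itself) shows how the same parameters give an exact inverse:

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim // 2, 64), nn.ReLU(),
                                 nn.Linear(64, dim))        # predicts per-element log-scale and shift

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=-1)
        log_s, t = self.net(x1).chunk(2, dim=-1)
        return torch.cat([x1, x2 * torch.exp(log_s) + t], dim=-1)

    def inverse(self, y):
        y1, y2 = y.chunk(2, dim=-1)
        log_s, t = self.net(y1).chunk(2, dim=-1)            # same parameters as the forward pass
        return torch.cat([y1, (y2 - t) * torch.exp(-log_s)], dim=-1)

layer = AffineCoupling(dim=32)
x = torch.randn(4, 32)
print(torch.allclose(layer.inverse(layer(x)), x, atol=1e-5))  # True: lossless round trip
```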

[12] arXiv:2507.20392 [Chinese pdf, pdf, html, other]
Title: Reliability of Wi-Fi, LTE, and 5G-Based UAV RC Links in ISM Bands: Uplink Interference Asymmetry Analysis and HARQ Design
Donggu Lee, Sung Joon Maeng, Ozgur Ozdemir, Mani Bharathi Pandian, Ismail Guvenc
Subjects: Signal Processing (eess.SP)

Command and control of uncrewed aerial vehicles (UAVs) is often realized through air-to-ground (A2G) remote control (RC) links that operate in ISM bands. While wireless fidelity (Wi-Fi) technology is commonly used for UAV RC links, ISM-based long-term evolution (LTE) and fifth-generation (5G) technologies have also been recently considered for the same purpose. A major problem for UAV RC links in the ISM bands is that other types of interference sources, such as legacy Wi-Fi and Bluetooth transmissions, may degrade the link quality. Such interference problems are a higher concern for the UAV in the air than the RC unit on the ground due to the UAV being in line-of-sight (LoS) with a larger number of interference sources. To obtain empirical evidence of the asymmetric interference conditions in downlink (DL) and uplink (UL), we first conducted a measurement campaign using a helikite platform in urban and rural areas at NC State University. The results from this measurement campaign show that the aggregate interference at altitudes up to 170 m can be up to 16.66 dB higher than the interference observed at a ground receiver. As a result of this asymmetric UL interference, lost hybrid automatic repeat request (HARQ) indicators (ACK/NACK) in the UL may degrade the DL throughput. To investigate this, we study various HARQ mechanisms, including HARQ Type-I with no combining, HARQ Type-I with chase combining, HARQ Type-III with incremental redundancy, and burst transmission with chase combining. To evaluate the impact of asymmetric UL interference on throughput performance, we consider a three-step evaluation process: 1) standalone physical DL shared channel (PDSCH) throughput evaluation with perfect ACK/NACK assumption; 2) standalone physical UL control channel (PUCCH) decoding reliability evaluation; and 3) PDSCH DL throughput evaluation with asymmetric UL ACK/NACK transmission.
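
Among the studied mechanisms, chase combining has the simplest receiver-side interpretation: LLRs of repeated transmissions of the same word are accumulated, which effectively raises the SNR each round. A toy numpy sketch with uncoded BPSK over AWGN (parameters are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(2)
n_bits, sigma, max_tx = 1000, 1.2, 4
bits = rng.integers(0, 2, n_bits)
symbols = 1 - 2 * bits                                   # BPSK: 0 -> +1, 1 -> -1

llr_acc = np.zeros(n_bits)
for tx in range(1, max_tx + 1):
    y = symbols + sigma * rng.standard_normal(n_bits)    # retransmission of the same word
    llr_acc += 2 * y / sigma**2                          # chase combining = LLR accumulation
    ber = np.mean((llr_acc < 0) != (bits == 1))
    print(f"after {tx} transmission(s): BER = {ber:.4f}")
```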

[13] arXiv:2507.20408 [Chinese pdf, pdf, html, other]
Title: A Multi-Stage Hybrid CNN-Transformer Network for Automated Pediatric Lung Sound Classification
Samiul Based Shuvo, Taufiq Hasan
Subjects: Signal Processing (eess.SP); Artificial Intelligence (cs.AI)

Automated analysis of lung sound auscultation is essential for monitoring respiratory health, especially in regions facing a shortage of skilled healthcare workers. While respiratory sound classification has been widely studied in adults, its application in pediatric populations, particularly in children aged <6 years, remains an underexplored area. The developmental changes in pediatric lungs considerably alter the acoustic properties of respiratory sounds, necessitating specialized classification approaches tailored to this age group. To address this, we propose a multistage hybrid CNN-Transformer framework that combines CNN-extracted features with an attention-based architecture to classify pediatric respiratory diseases using scalogram images from both full recordings and individual breath events. Our model achieved an overall score of 0.9039 in binary event classification and 0.8448 in multiclass event classification by employing class-wise focal loss to address data imbalance. At the recording level, the model attained scores of 0.720 for ternary and 0.571 for multiclass classification. These scores outperform the previous best models by 3.81% and 5.94%, respectively. This approach offers a promising solution for scalable pediatric respiratory disease diagnosis, especially in resource-limited settings.
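
The class-wise focal loss used to counter data imbalance can be written compactly; the per-class weights and the focusing parameter below are placeholders rather than the paper's settings:

```python
import torch
import torch.nn.functional as F

def class_wise_focal_loss(logits, targets, alpha, gamma=2.0):
    """logits: (B, C), targets: (B,), alpha: (C,) per-class weights."""
    log_p = F.log_softmax(logits, dim=-1)
    log_pt = log_p.gather(1, targets.unsqueeze(1)).squeeze(1)   # log-probability of the true class
    pt = log_pt.exp()
    focal = alpha[targets] * (1.0 - pt) ** gamma * (-log_pt)    # down-weight easy, well-classified samples
    return focal.mean()

logits = torch.randn(8, 4)
targets = torch.randint(0, 4, (8,))
alpha = torch.tensor([0.5, 1.0, 2.0, 2.0])                      # up-weight the rarer classes
print(class_wise_focal_loss(logits, targets, alpha))
```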

[14] arXiv:2507.20489 [Chinese pdf, pdf, html, other]
Title: Energy-Efficient Secure Communications via Joint Optimization of UAV Trajectory and Movable-Antenna Array Beamforming
Sanghyeok Kim, Jinu Gong, Joonhyuk Kang
Comments: 5 pages, 2 figures
Subjects: Signal Processing (eess.SP)

This paper investigates the potential of unmanned aerial vehicles (UAVs) equipped with movable-antenna (MA) arrays to strengthen security in wireless communication systems. We propose a novel framework that jointly optimizes the UAV trajectory and the reconfigurable beamforming of the MA array to maximize secrecy energy efficiency, while ensuring reliable communication with legitimate users. By exploiting the spatial degrees of freedom enabled by the MA array, the system can form highly directional beams and deep nulls, thereby significantly improving physical layer security. Numerical results demonstrate that the proposed approach achieves superior secrecy energy efficiency, attributed to the enhanced spatial flexibility provided by the movable antenna architecture.

[15] arXiv:2507.20587 [Chinese pdf, pdf, other]
Title: Real-Time Distributed Optical Fiber Vibration Recognition via Extreme Lightweight Model and Cross-Domain Distillation
Zhongyao Luo, Hao Wu, Zhao Ge, Ming Tang
Comments: 12 pages, 8 figures
Subjects: Signal Processing (eess.SP); Systems and Control (eess.SY)

Distributed optical fiber vibration sensing (DVS) systems offer a promising solution for large-scale monitoring and intrusion event recognition. However, their practical deployment remains hindered by two major challenges: degradation of recognition accuracy in dynamic conditions, and the computational bottleneck of real-time processing for mass sensing data. This paper presents a new solution to these challenges, through an FPGA-accelerated extreme lightweight model along with a newly proposed knowledge distillation framework. The proposed three-layer depthwise separable convolution network contains only 4141 parameters, which is the most compact architecture in this field to date, and achieves a processing time of only 0.019 ms for each sample covering a 12.5 m fiber length over 0.256 s. This performance corresponds to real-time processing capabilities for sensing fibers extending up to 168.68 km. To improve generalizability under changing environments, the proposed cross-domain distillation framework guided by physical priors is used here to embed frequency-domain insights into the time-domain model. This allows for time-frequency representation learning without increasing complexity and boosts recognition accuracy from 51.93% to 95.72% under unseen environmental conditions. The proposed methodology provides key advancements, including a framework that combines interpretable signal processing techniques with deep learning, and a reference architecture for real-time processing and edge computing in DVS systems and the broader distributed optical fiber sensing (DOFS) area. It mitigates the trade-off between sensing range and real-time capability, bridging the gap between theoretical capabilities and practical deployment requirements. Furthermore, this work reveals a new direction for building more efficient, robust and explainable artificial intelligence systems for DOFS technologies.
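
The extreme parameter budget rests on depthwise separable convolutions: a per-channel (depthwise) filter followed by a 1x1 (pointwise) mixing convolution. The channel counts below are illustrative, not the exact 4141-parameter network:

```python
import torch
import torch.nn as nn

def depthwise_separable(in_ch, out_ch, k=7):
    return nn.Sequential(
        nn.Conv1d(in_ch, in_ch, kernel_size=k, padding=k // 2, groups=in_ch),  # depthwise
        nn.Conv1d(in_ch, out_ch, kernel_size=1),                               # pointwise
        nn.BatchNorm1d(out_ch), nn.ReLU())

block = depthwise_separable(8, 16)
n_params = sum(p.numel() for p in block.parameters())
std_params = sum(p.numel() for p in nn.Conv1d(8, 16, 7, padding=3).parameters())
print(n_params, "parameters vs", std_params, "for a standard convolution")
```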

[16] arXiv:2507.20648 [Chinese pdf, pdf, other]
Title: RFI and Jamming Detection in Antenna Arrays with an LSTM Autoencoder
Christos Ntemkas, Antonios Argyriou
Journal-ref: 2025 IEEE Radar Conference (RadarConf25)
Subjects: Signal Processing (eess.SP)

Radio frequency interference (RFI) and malicious jammers are a significant problem in our wireless world. Detecting RFI or jamming is typically performed with model-based statistical detection or AI-empowered algorithms that use input baseband data or time-frequency representations like spectrograms. In this work we depart from these previous approaches and leverage data from antenna array systems. We use Fourier imaging to spatially localize the signal sources and then deploy a deep LSTM autoencoder that detects RFI and jamming as anomalies. Our results for different power levels of the RFI/jamming sources, and the signal of interest, reveal that our detector offers high performance without needing any pre-existing knowledge regarding the RFI or jamming signal.
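
The anomaly-detection stage can be sketched as a plain LSTM autoencoder whose reconstruction error is thresholded; the feature dimension, sequence length, and threshold are placeholders, and the Fourier-imaging front end is omitted:

```python
import torch
import torch.nn as nn

class LSTMAutoencoder(nn.Module):
    def __init__(self, n_features=16, latent=8):
        super().__init__()
        self.encoder = nn.LSTM(n_features, latent, batch_first=True)
        self.decoder = nn.LSTM(latent, n_features, batch_first=True)

    def forward(self, x):                       # x: (B, T, n_features)
        z, _ = self.encoder(x)
        x_hat, _ = self.decoder(z)
        return x_hat

model = LSTMAutoencoder()
x = torch.randn(2, 50, 16)                      # e.g., a sequence of spatial (Fourier-imaged) snapshots
score = ((model(x) - x) ** 2).mean(dim=(1, 2))  # per-sample reconstruction error
print(score)                                    # RFI/jamming is flagged when the score exceeds a threshold
```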

[17] arXiv:2507.20651 [Chinese pdf, pdf, other]
Title: Angle-distance decomposition based on deep learning for active sonar detection
Jichao Zhang, Xiao-Lei Zhang, Kunde Yang
Subjects: Signal Processing (eess.SP)

Underwater target detection using active sonar constitutes a critical research area in marine sciences and engineering. However, traditional signal processing methods face significant challenges in complex underwater environments due to noise, reverberation, and interference. To address these issues, this paper presents a deep learning-based active sonar target detection method that decomposes the detection process into separate angle and distance estimation tasks. Active sonar target detection employs deep learning models to predict target distance and angle, with the final target position determined by integrating these estimates. Limited underwater acoustic data hinders effective model training, but transfer learning and simulation offer practical solutions to this challenge. Experimental results verify that the method achieves effective and robust performance under challenging conditions.

[18] arXiv:2507.20657 [Chinese pdf, pdf, other]
Title: The micro-Doppler Attack Against AI-based Human Activity Classification from Wireless Signals
Margarita Loupa, Antonios Argyriou, Yanwei Liu
Journal-ref: 2025 IEEE Radar Conference (RadarConf25)
Subjects: Signal Processing (eess.SP)

A subset of Human Activity Classification (HAC) systems is based on AI algorithms that use passively collected wireless signals. This paper presents the micro-Doppler attack targeting HAC from wireless orthogonal frequency division multiplexing (OFDM) signals. The attack is executed by inserting artificial variations in a transmitted OFDM waveform to alter its micro-Doppler signature when it reflects off a human target. We investigate two variants of our scheme that manipulate the waveform at different time scales, resulting in altered receiver spectrograms. HAC accuracy with a deep convolutional neural network (CNN) can be reduced to less than 10%.

[19] arXiv:2507.20664 [Chinese pdf, pdf, other]
Title: A Nonlinear Spectral Approach for Radar-Based Heartbeat Estimation via Autocorrelation of Higher Harmonics
Kohei Shimomura, Chi-Hsuan Lee, Takuya Sakamoto
Comments: 4 pages, 4 figures, 3 tables. This work will be submitted to the IEEE for possible publication
Subjects: Signal Processing (eess.SP)

This study presents a nonlinear signal processing method for accurate radar-based heartbeat interval estimation by exploiting the periodicity of higher-order harmonics inherent in heartbeat signals. Unlike conventional approaches that employ selective frequency filtering or track individual harmonics, the proposed method enhances the global periodic structure of the spectrum via nonlinear correlation processing. Specifically, smoothing and second-derivative operations are first applied to the radar displacement signal to suppress noise and accentuate higher-order heartbeat harmonics. Rather than isolating specific frequency components, we compute localized autocorrelations of the Fourier spectrum around the harmonic frequencies. The incoherent summation of these autocorrelations yields a pseudo-spectrum in which the fundamental heartbeat periodicity is distinctly emphasized. This nonlinear approach mitigates the effects of respiratory harmonics and noise, enabling robust interbeat interval estimation. Experiments with radar measurements from five participants demonstrate that the proposed method reduces root-mean-square error by 20% and improves the correlation coefficient by 0.20 relative to conventional techniques.

[20] arXiv:2507.20789 [Chinese pdf, pdf, other]
Title: DT-Aided Resource Management in Spectrum Sharing Integrated Satellite-Terrestrial Networks
Hung Nguyen-Kha, Vu Nguyen Ha, Ti Ti Nguyen, Eva Lagunas, Symeon Chatzinotas, Joel Grotz
Subjects: Signal Processing (eess.SP)

The integrated satellite-terrestrial networks (ISTNs) through spectrum sharing have emerged as a promising solution to improve spectral efficiency and meet increasing wireless demand. However, this coexistence introduces significant challenges, including inter-system interference (ISI) and the low Earth orbit satellite (LSat) movements. To capture the actual environment for resource management, we propose a time-varying digital twin (DT)-aided framework for ISTNs incorporating a 3D map that enables joint optimization of bandwidth (BW) allocation, traffic steering, and resource allocation, and aims to minimize congestion. The problem is formulated as a mixed-integer nonlinear program (MINLP), addressed through a two-phase algorithm based on successive convex approximation (SCA) and compressed sensing approaches. Numerical results demonstrate the proposed method's superior performance in queue length minimization compared to benchmarks.

[21] arXiv:2507.20825 [Chinese pdf, pdf, other]
Title: Chirp-Permuted AFDM: A New Degree of Freedom for Next-Generation Versatile Waveform Design
Hyeon Seok Rou, Giuseppe Thadeu Freitas de Abreu
Subjects: Signal Processing (eess.SP)

We present a novel multicarrier waveform, termed chirp-permuted affine frequency division multiplexing (CP-AFDM), which introduces a unique chirp-permutation domain on top of the chirp subcarriers of the conventional AFDM. Rigorous analysis of the signal model and waveform properties, supported by numerical simulations, demonstrates that the proposed CP-AFDM preserves all core characteristics of affine frequency division multiplexing (AFDM) - including robustness to doubly-dispersive channels, peak-to-average power ratio (PAPR), and full delay-Doppler representation - while further enhancing ambiguity function resolution and peak-to-sidelobe ratio (PSLR) in the Doppler domain. These improvements establish CP-AFDM as a highly attractive candidate for emerging sixth generation (6G) use cases demanding both reliability and sensing-awareness. Moreover, by exploiting the vast degrees of freedom in the chirp-permutation domain, two exemplary multifunctional applications are introduced: an index modulation (IM) technique over the permutation domain which achieves significant spectral efficiency gains, and a physical-layer security scheme that ensures practically perfect security through permutation-based keying, without requiring additional transmit energy or signaling overhead.

[22] arXiv:2507.20942 [Chinese pdf, pdf, html, other]
Title: Interference Analysis and Successive Interference Cancellation for Multistatic OFDM-based ISAC Systems
Taewon Jeong, Lucas Giroto, Umut Utku Erdem, Christian Karle, Jiyeon Choi, Thomas Zwick, Benjamin Nuss
Comments: This work has been submitted to the IEEE for possible publication
Subjects: Signal Processing (eess.SP)

Multistatic integrated sensing and communications (ISAC) systems, which use distributed transmitters and receivers, offer enhanced spatial coverage and sensing accuracy compared to stand-alone ISAC configurations. However, these systems face challenges due to interference between co-existing ISAC nodes, especially during simultaneous operation. In this paper, we analyze the impact of this mutual interference arising from the co-existence in a multistatic ISAC scenario, where a mono- and a bistatic ISAC system share the same spectral resources. We first classify different types of interference in the power domain. Then, we discuss how the interference can affect both sensing and communications in terms of bit error rate (BER), error vector magnitude (EVM), and radar image under varied transmit power and RCS configurations through simulations. Along with the interference analysis, we propose a low-complexity successive interference cancellation method that adaptively cancels either the monostatic reflection or the bistatic line-of-sight signal based on the monostatic radar image signal-to-interference-plus-noise ratio (SINR). The proposed framework is evaluated with both simulations and proof-of-concept measurements using an ISAC testbed with a radar echo generator for object emulation. The results show that the proposed method reduces BER and improves EVM as well as radar image SINR across a wide range of SINR conditions. These results demonstrate that accurate component-wise cancellation can be achieved with low computational overhead, making the method suitable for practical applications.
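
The cancellation idea can be illustrated with a toy two-component example: fit and subtract the stronger (e.g., monostatic) component first, then recover the weaker (e.g., bistatic line-of-sight) one from the residual. The templates and amplitudes are stand-ins, not the paper's signal model:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 512
s1 = np.exp(2j * np.pi * 0.11 * np.arange(n))          # strong component (known template)
s2 = np.exp(2j * np.pi * 0.23 * np.arange(n))          # weak component (known template)
y = 1.0 * s1 + 0.1 * s2 + 0.01 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))

a1_hat = (s1.conj() @ y) / (s1.conj() @ s1)            # least-squares fit of the strong part
residual = y - a1_hat * s1                             # cancel it from the received signal
a2_hat = (s2.conj() @ residual) / (s2.conj() @ s2)     # the weak part is now recoverable
print(abs(a1_hat), abs(a2_hat))                        # close to 1.0 and 0.1
```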

[23] arXiv:2507.20952 [Chinese pdf, pdf, html, other]
Title: Analytical Modeling of Batteryless IoT Sensors Powered by Ambient Energy Harvesting
Jimmy Fernandez Landivar, Andrea Zanella, Ihsane Gryech, Sofie Pollin, Hazem Sallouha
Comments: 6 pages, 6 figures, 1 table. Accepted for publication at the 36th IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC 2025), Istanbul, Turkey
Subjects: Signal Processing (eess.SP)

This paper presents a comprehensive mathematical model to characterize the energy dynamics of batteryless IoT sensor nodes powered entirely by ambient energy harvesting. The model captures both the energy harvesting and consumption phases, explicitly incorporating power management tasks to enable precise estimation of device behavior across diverse environmental conditions. The proposed model is applicable to a wide range of IoT devices and supports intelligent power management units designed to maximize harvested energy under fluctuating environmental conditions. We validated our model against a prototype batteryless IoT node, conducting experiments under three distinct illumination scenarios. Results show a strong correlation between analytical and measured supercapacitor voltage profiles, confirming the proposed model's accuracy.
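
The kind of energy bookkeeping such a model formalizes can be sketched numerically: a supercapacitor integrates harvested minus consumed power while the node cycles through power-management tasks. All constants below are illustrative, not the paper's measured values:

```python
import numpy as np

C = 0.47                                      # supercapacitor capacitance [F]
V = 2.0                                       # initial voltage [V]
dt = 0.1                                      # time step [s]
P_harvest = 0.004                             # harvested power [W] (e.g., indoor light)
tasks = {"sleep": 0.0002, "sense": 0.003, "transmit": 0.08}      # consumption per task [W]
schedule = ["sleep"] * 580 + ["sense"] * 15 + ["transmit"] * 5   # 60 s duty cycle

trace = []
for step in range(6000):                      # simulate 10 minutes
    P_load = tasks[schedule[step % len(schedule)]]
    E = 0.5 * C * V**2 + (P_harvest - P_load) * dt               # energy balance on the capacitor
    V = np.sqrt(max(2 * E / C, 0.0))
    trace.append(V)
print(f"supercapacitor voltage over the run: {min(trace):.3f} V to {max(trace):.3f} V")
```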

Cross submissions (showing 14 of 14 entries)

[24] arXiv:2507.19608 (cross-list from cs.AI) [Chinese pdf, pdf, html, other]
Title: DeltaLLM: A Training-Free Framework Exploiting Temporal Sparsity for Efficient Edge LLM Inference
Jiawen Qi, Chang Gao, Zhaochun Ren, Qinyu Chen
Subjects: Artificial Intelligence (cs.AI); Signal Processing (eess.SP)

Deploying Large Language Models (LLMs) on edge devices remains challenging because their computation grows quadratically with the sequence length. Existing studies for dynamic attention pruning are designed for hardware with massively parallel computation capabilities, such as GPUs or TPUs, and aim at long context lengths (e.g., 64K), making them unsuitable for edge scenarios. We present DeltaLLM, a training-free framework that exploits temporal sparsity in attention patterns to enable efficient LLM inference across both the prefilling and decoding stages on resource-constrained edge devices. DeltaLLM introduces an accuracy- and memory-aware delta matrix construction strategy that introduces temporal sparsity, and a context-aware hybrid attention mechanism that combines full attention in a local context window with delta approximation outside it to increase accuracy. We evaluate our framework on the edge-device-friendly BitNet-b1.58-2B-4T model and the Llama3.2-1B-Instruct model across diverse language tasks. The results show that on BitNet, our framework increases the attention sparsity from 0% to 60% during the prefilling stage with a slight accuracy improvement on the WG task, and from 0% to 57% across both the prefilling and decoding stages, with the F1 score even improving from 29.63 to 30.97 on the SQuAD-v2 task. On the Llama model, it can also achieve up to 60% sparsity during the prefilling stage and around 57% across both stages with a negligible accuracy drop. These results demonstrate that DeltaLLM offers a promising solution for efficient edge deployment, requiring no fine-tuning and seamlessly integrating with existing inference pipelines.
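
The underlying "delta" intuition is that consecutive steps change the attention-related matrices only slightly, so one can reuse the previous matrix and apply a thresholded sparse update. The toy numpy sketch below illustrates only this intuition with made-up matrices and thresholds; it is not the DeltaLLM algorithm:

```python
import numpy as np

rng = np.random.default_rng(6)
prev = rng.standard_normal((64, 64))                   # matrix from the previous step
curr = prev + 0.01 * rng.standard_normal((64, 64))     # most entries barely change
curr[:, 60:] += rng.standard_normal((64, 4))           # a few columns change a lot

delta = curr - prev
mask = np.abs(delta) > 0.05                            # keep only the significant changes
approx = prev + delta * mask                           # sparse delta update instead of a full recompute
print(f"delta sparsity: {1.0 - mask.mean():.0%}, max approximation error: {np.abs(approx - curr).max():.3f}")
```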

[25] arXiv:2507.19736 (cross-list from cs.HC) [Chinese pdf, pdf, html, other]
Title: LowKeyEMG: Electromyographic typing with a reduced keyset
Johannes Y. Lee, Derek Xiao, Shreyas Kaasyap, Nima R. Hadidi, John L. Zhou, Jacob Cunningham, Rakshith R. Gore, Deniz O. Eren, Jonathan C. Kao
Comments: 11+3 pages, 5 main figures, 2 supplementary tables, 4 supplementary figures
Subjects: Human-Computer Interaction (cs.HC); Signal Processing (eess.SP)

We introduce LowKeyEMG, a real-time human-computer interface that enables efficient text entry using only 7 gesture classes decoded from surface electromyography (sEMG). Prior work has attempted full-alphabet decoding from sEMG, but decoding large character sets remains unreliable, especially for individuals with motor impairments. Instead, LowKeyEMG reduces the English alphabet to 4 gesture keys, with 3 more for space and system interaction, to reliably translate simple one-handed gestures into text, leveraging the recurrent transformer-based language model RWKV for efficient computation. In real-time experiments, participants achieved average one-handed keyboardless typing speeds of 23.3 words per minute with LowKeyEMG, and improved gesture efficiency by 17% (relative to typed phrase length). When typing with only 7 keys, LowKeyEMG can achieve 98.2% top-3 word accuracy, demonstrating that this low-key typing paradigm can maintain practical communication rates. Our results have implications for assistive technologies and any interface where input bandwidth is constrained.

[26] arXiv:2507.19822 (cross-list from cs.LG) [Chinese pdf, pdf, html, other]
Title: Debunking Optimization Myths in Federated Learning for Medical Image Classification
Youngjoon Lee, Hyukjoon Lee, Jinu Gong, Yang Cao, Joonhyuk Kang
Comments: Accepted to the Efficient Medical AI Workshop - MICCAI 2025
Subjects: Machine Learning (cs.LG); Image and Video Processing (eess.IV); Signal Processing (eess.SP)

Federated Learning (FL) is a collaborative learning method that enables decentralized model training while preserving data privacy. Despite its promise in medical imaging, recent FL methods are often sensitive to local factors such as optimizers and learning rates, limiting their robustness in practical deployments. In this work, we revisit vanilla FL to clarify the impact of edge device configurations, benchmarking recent FL methods on colorectal pathology and blood cell classification tasks. We numerically show that the choice of local optimizer and learning rate has a greater effect on performance than the specific FL method. Moreover, we find that increasing local training epochs can either enhance or impair convergence, depending on the FL method. These findings indicate that appropriate edge-specific configuration is more crucial than algorithmic complexity for achieving effective FL.

[27] arXiv:2507.19941 (cross-list from cs.IT) [Chinese pdf, pdf, html, other]
Title: Adaptive Learned Belief Propagation for Decoding Error-Correcting Codes
Alireza Tasdighi, Mansoor Yousefi
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)

Weighted belief propagation (WBP) for the decoding of linear block codes is considered. In WBP, the Tanner graph of the code is unrolled with respect to the iterations of the belief propagation decoder. Then, weights are assigned to the edges of the resulting recurrent network and optimized offline using a training dataset. The main contribution of this paper is an adaptive WBP where the weights of the decoder are determined for each received word. Two variants of this decoder are investigated. In the parallel WBP decoders, the weights take values in a discrete set. A number of WBP decoders are run in parallel to search for the best sequence of weights in real time. In the two-stage decoder, a small neural network is used to dynamically determine the weights of the WBP decoder for each received word. The proposed adaptive decoders demonstrate significant improvements over the static counterparts in two applications. In the first application, Bose-Chaudhuri-Hocquenghem, polar and quasi-cyclic low-density parity-check (QC-LDPC) codes are used over an additive white Gaussian noise channel. The results indicate that the adaptive WBP achieves bit error rates (BERs) up to an order of magnitude less than the BERs of the static WBP at about the same decoding complexity, depending on the code, its rate, and the signal-to-noise ratio. The second application is a concatenated code designed for a long-haul nonlinear optical fiber channel where the inner code is a QC-LDPC code and the outer code is a spatially coupled LDPC code. In this case, the inner code is decoded using an adaptive WBP, while the outer code is decoded using the sliding window decoder and static belief propagation. The results show that the adaptive WBP provides a coding gain of 0.8 dB compared to the neural normalized min-sum decoder, with about the same computational complexity and decoding latency.
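
One common flavour of weighting in BP-style decoders is a scaled ("normalized") min-sum rule, which conveys how weights enter the message updates; the per-edge, per-iteration learned weights of WBP and their adaptive selection are not reproduced in this small numpy sketch:

```python
import numpy as np

H = np.array([[1, 1, 0, 1, 0, 0],             # toy parity-check matrix
              [0, 1, 1, 0, 1, 0],
              [1, 0, 1, 0, 0, 1]])

def weighted_min_sum(llr, H, weight=0.8, n_iter=10):
    m, n = H.shape
    v2c = np.tile(llr, (m, 1)) * H            # variable-to-check messages
    for _ in range(n_iter):
        c2v = np.zeros_like(v2c, dtype=float)
        for c in range(m):                    # check-node update (weighted min-sum)
            idx = np.flatnonzero(H[c])
            for v in idx:
                others = idx[idx != v]
                sign = np.prod(np.sign(v2c[c, others]))
                c2v[c, v] = weight * sign * np.min(np.abs(v2c[c, others]))
        total = llr + c2v.sum(axis=0)
        v2c = (np.tile(total, (m, 1)) - c2v) * H   # variable-node update (extrinsic)
    return (total < 0).astype(int)            # hard decision

llr = np.array([2.1, -0.3, 1.5, 1.9, 0.4, 1.2])    # noisy LLRs for the all-zero codeword
print(weighted_min_sum(llr, H))                    # decodes back to all zeros
```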

[28] arXiv:2507.20023 (cross-list from eess.AS) [Chinese pdf, pdf, html, other]
Title: Binaural Speech Enhancement Using Complex Convolutional Recurrent Networks
Vikas Tokala, Eric Grinstein, Mike Brookes, Simon Doclo, Jesper Jensen, Patrick A. Naylor
Subjects: Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)

From hearing aids to augmented and virtual reality devices, binaural speech enhancement algorithms have been established as state-of-the-art techniques to improve speech intelligibility and listening comfort. In this paper, we present an end-to-end binaural speech enhancement method using a complex recurrent convolutional network with an encoder-decoder architecture and a complex LSTM recurrent block placed between the encoder and decoder. A loss function that focuses on the preservation of spatial information in addition to speech intelligibility improvement and noise reduction is introduced. The network estimates individual complex ratio masks for the left and right-ear channels of a binaural hearing device in the time-frequency domain. We show that, compared to other baseline algorithms, the proposed method significantly improves the estimated speech intelligibility and reduces the noise while preserving the spatial information of the binaural signals in acoustic situations with a single target speaker and isotropic noise of various types.

[29] arXiv:2507.20027 (cross-list from eess.AS) [Chinese pdf, pdf, html, other]
Title: Binaural Localization Model for Speech in Noise
Vikas Tokala, Eric Grinstein, Rory Brooks, Mike Brookes, Simon Doclo, Jesper Jensen, Patrick A. Naylor
Subjects: Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)

Binaural acoustic source localization is important to human listeners for spatial awareness, communication and safety. In this paper, an end-to-end binaural localization model for speech in noise is presented. A lightweight convolutional recurrent network that localizes sound in the frontal azimuthal plane for noisy reverberant binaural signals is introduced. The model incorporates additive internal ear noise to represent the frequency-dependent hearing threshold of a typical listener. The localization performance of the model is compared with the steered response power algorithm, and the use of the model as a measure of interaural cue preservation for binaural speech enhancement methods is studied. A listening test was performed to compare the performance of the model with human localization of speech in noisy conditions.

[30] arXiv:2507.20268 (交叉列表自 cs.LG) [中文pdf, pdf, html, 其他]
标题: 通过交叉验证的数据高效预测驱动校准
标题: Data-Efficient Prediction-Powered Calibration via Cross-Validation
Seonghoon Yoo, Houssem Sifaou, Sangwoo Park, Joonhyuk Kang, Osvaldo Simeone
主题: 机器学习 (cs.LG) ; 信号处理 (eess.SP) ; 机器学习 (stat.ML)

校准数据对于正式量化现有人工智能(AI)模型产生的决策的不确定性是必要的。 为了克服校准数据稀缺的常见问题,一种有前途的方法是使用由(通常不同的)预测模型生成的合成标签。 然而,在感兴趣的推理任务上微调生成标签的预测器,以及估计合成标签的残差偏差,需要额外的数据,这可能会加剧校准数据稀缺的问题。 本文介绍了一种新方法,该方法高效利用有限的校准数据,同时微调预测器并估计合成标签的偏差。 所提出的方法为AI生成的决策提供了具有严格覆盖保证的预测集。 在室内定位问题上的实验结果验证了我们解决方案的有效性和性能提升。

Calibration data are necessary to formally quantify the uncertainty of the decisions produced by an existing artificial intelligence (AI) model. To overcome the common issue of scarce calibration data, a promising approach is to employ synthetic labels produced by a (generally different) predictive model. However, fine-tuning the label-generating predictor on the inference task of interest, as well as estimating the residual bias of the synthetic labels, demand additional data, potentially exacerbating the calibration data scarcity problem. This paper introduces a novel approach that efficiently utilizes limited calibration data to simultaneously fine-tune a predictor and estimate the bias of the synthetic labels. The proposed method yields prediction sets with rigorous coverage guarantees for AI-generated decisions. Experimental results on an indoor localization problem validate the effectiveness and performance gains of our solution.
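As background for how calibration data yield prediction sets with coverage guarantees, here is a minimal split conformal prediction sketch; it is the standard building block rather than the paper's prediction-powered, cross-validated scheme, and the predictor and data are placeholders.

```python
# Split conformal prediction set from scarce calibration data (generic sketch).
import numpy as np

rng = np.random.default_rng(0)
predict = lambda x: 2.0 * x                      # placeholder point predictor
x_cal = rng.uniform(0, 1, 50)                    # scarce calibration data
y_cal = 2.0 * x_cal + 0.1 * rng.standard_normal(50)

alpha = 0.1                                      # target miscoverage level
scores = np.abs(y_cal - predict(x_cal))          # nonconformity scores
k = int(np.ceil((len(scores) + 1) * (1 - alpha)))
q = np.sort(scores)[min(k, len(scores)) - 1]     # conformal quantile

x_test = 0.7
print("prediction set:", (predict(x_test) - q, predict(x_test) + q))
```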

[31] arXiv:2507.20399 (交叉列表自 eess.SY) [中文pdf, pdf, html, 其他]
标题: ACCESS-AV:智能工厂中可持续自主车辆定位的自适应通信-计算协同设计
标题: ACCESS-AV: Adaptive Communication-Computation Codesign for Sustainable Autonomous Vehicle Localization in Smart Factories
Rajat Bhattacharjya, Arnab Sarkar, Ish Kool, Sabur Baidya, Nikil Dutt
评论: 28页,9图
主题: 系统与控制 (eess.SY) ; 硬件架构 (cs.AR) ; 网络与互联网架构 (cs.NI) ; 机器人技术 (cs.RO) ; 信号处理 (eess.SP)

自主配送车辆(ADVs)在5G网络支持的智能工厂中越来越多地用于运输货物,其中计算密集型的定位模块为优化提供了重要机会。 我们提出了ACCESS-AV,一种节能的车对基础设施(V2I)定位框架,该框架利用智能工厂环境中的现有5G基础设施。 通过机会性地访问定期广播的5G同步信号块(SSBs)进行定位,ACCESS-AV无需专用的路边单元(RSUs)或额外的车载传感器即可实现节能和成本降低。 我们实现了一种基于到达角(AoA)的估计方法,使用多信号分类(MUSIC)算法,并通过自适应通信-计算策略对资源受限的ADVs平台进行优化,该策略根据环境条件(如信噪比(SNR)和车辆速度)动态平衡能耗与定位精度。 实验结果表明,与采用AoA算法(如原始MUSIC、ESPRIT和Root-MUSIC)的非自适应系统相比,ACCESS-AV平均能耗降低了43.09%。 它在保持30厘米以下定位精度的同时,还显著降低了基础设施和运营成本,证明了其在可持续智能工厂环境中的可行性。

Autonomous Delivery Vehicles (ADVs) are increasingly used for transporting goods in 5G network-enabled smart factories, with the compute-intensive localization module presenting a significant opportunity for optimization. We propose ACCESS-AV, an energy-efficient Vehicle-to-Infrastructure (V2I) localization framework that leverages existing 5G infrastructure in smart factory environments. By opportunistically accessing the periodically broadcast 5G Synchronization Signal Blocks (SSBs) for localization, ACCESS-AV obviates the need for dedicated Roadside Units (RSUs) or additional onboard sensors to achieve energy efficiency as well as cost reduction. We implement an Angle-of-Arrival (AoA)-based estimation method using the Multiple Signal Classification (MUSIC) algorithm, optimized for resource-constrained ADV platforms through an adaptive communication-computation strategy that dynamically balances energy consumption with localization accuracy based on environmental conditions such as Signal-to-Noise Ratio (SNR) and vehicle velocity. Experimental results demonstrate that ACCESS-AV achieves an average energy reduction of 43.09% compared to non-adaptive systems employing AoA algorithms such as vanilla MUSIC, ESPRIT, and Root-MUSIC. It maintains sub-30 cm localization accuracy while also delivering substantial reductions in infrastructure and operational costs, establishing its viability for sustainable smart factory environments.
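To make the AoA step concrete, the sketch below runs textbook MUSIC on a uniform linear array with synthetic snapshots; the array size, noise level, and snapshot count are illustrative assumptions, and the adaptive communication-computation control of ACCESS-AV is not modeled.

```python
# Textbook MUSIC angle-of-arrival estimation on synthetic snapshots (illustrative assumptions).
import numpy as np

rng = np.random.default_rng(1)
M, N, K = 8, 200, 1                   # antennas, snapshots, sources
theta_true = 25.0                     # degrees
steer = lambda th: np.exp(-1j * np.pi * np.arange(M) * np.sin(np.deg2rad(th)))

s = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
noise = 0.1 * (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N)))
X = np.outer(steer(theta_true), s) + noise       # received snapshots

R = X @ X.conj().T / N                           # sample covariance
eigval, eigvec = np.linalg.eigh(R)
En = eigvec[:, :M - K]                           # noise subspace (smallest eigenvalues)

grid = np.arange(-90, 90.1, 0.1)
spectrum = [1.0 / np.linalg.norm(En.conj().T @ steer(th))**2 for th in grid]
print("estimated AoA:", grid[int(np.argmax(spectrum))], "deg")
```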

[32] arXiv:2507.20426 (交叉列表自 cs.LG) [中文pdf, pdf, html, 其他]
标题: ResCap-DBP:一种轻量级残差胶囊网络,用于使用全局ProteinBERT嵌入的准确DNA结合蛋白预测
标题: ResCap-DBP: A Lightweight Residual-Capsule Network for Accurate DNA-Binding Protein Prediction Using Global ProteinBERT Embeddings
Samiul Based Shuvo, Tasnia Binte Mamun, U Rajendra Acharya
主题: 机器学习 (cs.LG) ; 人工智能 (cs.AI) ; 信号处理 (eess.SP) ; 生物大分子 (q-bio.BM)

DNA结合蛋白(DBPs)在基因调控和细胞过程中起着关键作用,因此准确识别它们对于理解生物功能和疾病机制至关重要。 用于DBP识别的实验方法耗时且成本高,这推动了高效计算预测技术的需求。 在本研究中,我们提出了一种新的深度学习框架ResCap-DBP,该框架结合了基于残差学习的编码器和一维胶囊网络(1D-CapsNet),直接从原始蛋白质序列中预测DBPs。 我们的架构在残差块中引入扩张卷积以缓解梯度消失问题并提取丰富的序列特征,同时具有动态路由的胶囊层捕获学习特征空间内的层次和空间关系。 我们进行了全面的消融研究,比较了ProteinBERT的全局和局部嵌入与传统one-hot编码。 结果表明, ProteinBERT嵌入在大型数据集上显著优于其他表示方式。 尽管one-hot编码在较小的数据集如PDB186上表现出微弱优势,但其难以有效扩展。 在四对公开可用的基准数据集上的广泛评估表明,我们的模型始终优于当前最先进的方法。 它在PDB14189和PDB1075上的AUC分数分别为98.0%和89.5%。 在独立测试集PDB2272和PDB186上,模型达到了83.2%和83.3%的最高AUC,同时在较大的数据集如 PDB20000上保持了有竞争力的性能。 值得注意的是,该模型在不同数据集上保持了良好的灵敏度和特异性。 这些结果证明了将全局蛋白质表示与先进的深度学习架构相结合在多样化的基因组环境中进行可靠和可扩展的DBP预测的有效性和通用性。

DNA-binding proteins (DBPs) are integral to gene regulation and cellular processes, making their accurate identification essential for understanding biological functions and disease mechanisms. Experimental methods for DBP identification are time-consuming and costly, driving the need for efficient computational prediction techniques. In this study, we propose a novel deep learning framework, ResCap-DBP, that combines a residual learning-based encoder with a one-dimensional Capsule Network (1D-CapsNet) to predict DBPs directly from raw protein sequences. Our architecture incorporates dilated convolutions within residual blocks to mitigate vanishing gradient issues and extract rich sequence features, while capsule layers with dynamic routing capture hierarchical and spatial relationships within the learned feature space. We conducted comprehensive ablation studies comparing global and local embeddings from ProteinBERT and conventional one-hot encoding. Results show that ProteinBERT embeddings substantially outperform other representations on large datasets. Although one-hot encoding showed marginal advantages on smaller datasets, such as PDB186, it struggled to scale effectively. Extensive evaluations on four pairs of publicly available benchmark datasets demonstrate that our model consistently outperforms current state-of-the-art methods. It achieved AUC scores of 98.0% and 89.5% on PDB14189 and PDB1075, respectively. On independent test sets PDB2272 and PDB186, the model attained top AUCs of 83.2% and 83.3%, while maintaining competitive performance on larger datasets such as PDB20000. Notably, the model maintains well-balanced sensitivity and specificity across datasets. These results demonstrate the efficacy and generalizability of integrating global protein representations with advanced deep learning architectures for reliable and scalable DBP prediction in diverse genomic contexts.

[33] arXiv:2507.20477 (交叉列表自 cs.IT) [中文pdf, pdf, html, 其他]
标题: 重新思考语义域中的多用户通信:基于洗牌正交化和扩散去噪的增强型OMDMA
标题: Rethinking Multi-User Communication in Semantic Domain: Enhanced OMDMA by Shuffle-Based Orthogonalization and Diffusion Denoising
Maojun Zhang, Guangxu Zhu, Xiaoming Chen, Kaibin Huang, Zhaoyang Zhang
评论: 16页
主题: 信息论 (cs.IT) ; 信号处理 (eess.SP)

用户间干扰仍然是无线通信系统中的关键瓶颈,特别是在语义通信(SemCom)这一新兴范式中。 与传统系统相比,SemCom中的用户间干扰会严重破坏关键的语义信息,通常在相同功率水平下表现比高斯噪声更差。 为解决这一挑战,我们受到最近提出的正交模型划分多址接入(OMDMA)概念的启发,该概念利用基于个性化联合源信道编码(JSCC)模型的语义正交性来区分用户,我们提出了一种新颖且可扩展的框架,不再需要像原始OMDMA那样使用用户特定的JSCC模型。 我们的关键创新在于基于打乱的正交化,其中随机打乱JSCC特征向量的位置将用户间干扰转化为类似高斯噪声。 通过为每个用户分配唯一的打乱模式,干扰被当作信道噪声处理,从而可以使用扩散模型(DMs)进行有效缓解。 这种方法不仅通过只需要一个通用的JSCC模型简化了系统设计,还增强了隐私性,因为打乱模式充当隐式的私钥。 此外,我们将该框架扩展到涉及语义相关数据的场景。 通过基于语义相似性对用户进行分组,引入了一种协作波束成形策略,以利用相关数据中的冗余,进一步提高系统性能。 大量仿真结果表明,所提出的方法优于最先进的多用户SemCom框架,在不增加额外训练开销的情况下,实现了更高的语义保真度、抗干扰能力和可扩展性。

Inter-user interference remains a critical bottleneck in wireless communication systems, particularly in the emerging paradigm of semantic communication (SemCom). Compared to traditional systems, inter-user interference in SemCom severely degrades key semantic information, often causing worse performance than Gaussian noise under the same power level. To address this challenge, inspired by the recently proposed concept of Orthogonal Model Division Multiple Access (OMDMA) that leverages semantic orthogonality rooted in personalized joint source and channel coding (JSCC) models to distinguish users, we propose a novel, scalable framework that eliminates the need for user-specific JSCC models as required in the original OMDMA. Our key innovation lies in shuffle-based orthogonalization, where randomly permuting the positions of JSCC feature vectors transforms inter-user interference into Gaussian-like noise. By assigning each user a unique shuffling pattern, the interference is treated as channel noise, enabling effective mitigation using diffusion models (DMs). This approach not only simplifies system design by requiring a single universal JSCC model but also enhances privacy, as shuffling patterns act as implicit private keys. Additionally, we extend the framework to scenarios involving semantically correlated data. By grouping users based on semantic similarity, a cooperative beamforming strategy is introduced to exploit redundancy in correlated data, further improving system performance. Extensive simulations demonstrate that the proposed method outperforms state-of-the-art multi-user SemCom frameworks, achieving superior semantic fidelity, robustness to interference, and scalability, all without requiring additional training overhead.
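The core shuffle-based orthogonalization idea can be illustrated in a few lines: each user permutes its feature vector with a private pattern, and after the receiver's inverse permutation the other user's contribution appears as unstructured, noise-like interference. The dimensions and Gaussian feature vectors below are toy assumptions, with no fading or diffusion denoising.

```python
# Shuffle-based orthogonalization toy sketch (illustrative assumption, not the authors' JSCC system).
import numpy as np

rng = np.random.default_rng(0)
dim = 1024
feat_user1 = rng.standard_normal(dim)            # stand-in for a JSCC feature vector
feat_user2 = rng.standard_normal(dim)

perm1 = rng.permutation(dim)                     # each user's private shuffling pattern
perm2 = rng.permutation(dim)

tx = feat_user1[perm1] + feat_user2[perm2]       # superposition on the channel (toy, no fading)

inv1 = np.argsort(perm1)                         # user 1 un-shuffles with its own key
rx_user1 = tx[inv1]                              # recovers feat_user1 plus scrambled interference
interference = rx_user1 - feat_user1             # user 2's features seen through perm2 then inv1
print(f"interference mean={interference.mean():.3f}, std={interference.std():.3f}")
```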

[34] arXiv:2507.20792 (交叉列表自 eess.SY) [中文pdf, pdf, 其他]
标题: 基于无人机的数字雷达系统用于相干多基地SAR成像
标题: UAV-Borne Digital Radar System for Coherent Multistatic SAR Imaging
Julian Kanz, Christian Gesell, Christina Bonfert, David Werbunat, Alexander Grathwohl, Julian Aguilar, Martin Vossiek, Christian Waldschmidt
主题: 系统与控制 (eess.SY) ; 信号处理 (eess.SP)

模拟到数字转换器(ADC)技术的进步使得更高的采样率成为可能,这使得采用直接采样射频(RF)信号的数字雷达架构成为可行,从而消除了对模拟下变频的需求。 这种数字方法在波形设计和信号处理方面提供了更大的灵活性,特别是通过正交频分复用(OFDM)等数字调制方案。 本文介绍了一种安装在无人飞行器(UAV)上的数字雷达系统,该系统采用OFDM波形进行L波段相干多站合成孔径雷达(SAR)成像。 雷达设置包括一个负责信号发射和单基地数据采集的主要UAV节点,以及在仅接收模式下运行的次级节点。 这些次级节点捕获从场景反射的雷达信号以及直接的侧链信号。 雷达和侧链路径的射频信号被采样并离线处理。 为了高效管理数据存储,采用了一个触发机制来仅记录雷达信号的相关部分。 该系统在快时间域和慢时间域中保持相干性,这对于多站SAR成像至关重要。 由于次级节点是被动的,该系统可以轻松扩展以适应更大的UAV群。 本文详细描述了单基地和多站SAR图像形成的完整信号处理流程,包括对由于节点未耦合操作而产生的同步误差的分析和校正。 所提出的相干处理方法通过静态雷达测量进行了验证,证明了该概念实现了相干性。 此外,基于UAV的双基地SAR实验通过生成高分辨率的单基地、双基地和组合多站SAR图像展示了系统的性能。

Advancements in analog-to-digital converter (ADC) technology have enabled higher sampling rates, making it feasible to adopt digital radar architectures that directly sample the radio-frequency (RF) signal, eliminating the need for analog downconversion. This digital approach supports greater flexibility in waveform design and signal processing, particularly through digital modulation schemes like orthogonal frequency division multiplexing (OFDM). This paper presents a digital radar system mounted on an uncrewed aerial vehicle (UAV), which employs OFDM waveforms for coherent multistatic synthetic aperture radar (SAR) imaging in the L-band. The radar setup features a primary UAV node responsible for signal transmission and monostatic data acquisition, alongside secondary nodes that operate in a receive-only mode. These secondary nodes capture the radar signal reflected from the scene as well as a direct sidelink signal. RF signals from both the radar and sidelink paths are sampled and processed offline. To manage data storage efficiently, a trigger mechanism is employed to record only the relevant portions of the radar signal. The system maintains coherency in both fast-time and slow-time domains, which is essential for multistatic SAR imaging. Because the secondary nodes are passive, the system can be easily scaled to accommodate a larger swarm of UAVs. The paper details the full signal processing workflow for both monostatic and multistatic SAR image formation, including an analysis and correction of synchronization errors that arise from the uncoupled operation of the nodes. The proposed coherent processing method is validated through static radar measurements, demonstrating coherency achieved by the concept. Additionally, a UAV-based bistatic SAR experiment demonstrates the system's performance by producing high-resolution monostatic, bistatic, and combined multistatic SAR images.
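As a pointer to how OFDM waveforms yield range information, the sketch below applies standard symbol-division processing for OFDM radar (divide the received subcarrier symbols by the transmitted ones, then take an IFFT); the subcarrier count, spacing, and single point target are assumptions, and the UAV system's multistatic synchronization is not modeled.

```python
# OFDM radar range profile via symbol division (generic sketch, assumed parameters).
import numpy as np

rng = np.random.default_rng(0)
N_sc = 1024                                       # subcarriers
delta_f = 120e3                                   # subcarrier spacing, Hz (assumed)
tx = np.exp(1j * np.pi / 2 * rng.integers(0, 4, N_sc))   # QPSK symbols

c = 3e8
target_range = 150.0                              # metres, toy point target
tau = 2 * target_range / c                        # round-trip delay
rx = tx * np.exp(-2j * np.pi * np.arange(N_sc) * delta_f * tau)   # delay appears as a phase ramp

range_profile = np.fft.ifft(rx / tx)              # element-wise division, then IFFT
range_axis = np.arange(N_sc) * c / (2 * N_sc * delta_f)
print("estimated range:", range_axis[int(np.argmax(np.abs(range_profile)))], "m")
```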

[35] arXiv:2507.20846 (交叉列表自 astro-ph.IM) [中文pdf, pdf, 其他]
标题: 亚赫兹频率下的精确谱估计:闭合形式后验和贝叶斯噪声投影
标题: Precision spectral estimation at sub-Hz frequencies: closed-form posteriors and Bayesian noise projection
Lorenzo Sala, Stefano Vitale
评论: 此工作已提交给IEEE以可能发表
主题: 天体物理学的仪器与方法 (astro-ph.IM) ; 信号处理 (eess.SP) ; 应用 (stat.AP)

我们提出了一种贝叶斯方法,用于估计多变量高斯时间序列中的谱量。该方法基于周期图和Wishart统计量,在任何给定频率下,为各个功率谱密度的边缘后验分布、成对相干性、多重相干性以及完整的交叉谱密度矩阵的联合后验分布提供了闭合形式的表达式。在噪声投影的背景下——其中某一序列被建模为其他序列的滤波版本的线性组合加上一个背景成分——该方法还为灵敏度(即滤波器传递函数)和背景的功率谱密度提供了闭合形式的后验分布。该方法最初是为分析欧洲空间局LISA路径探测器任务的数据而开发的,特别适用于非常低频的数据,在这种情况下,长时间的观测时间使得无法通过对大量周期图进行平均来处理,而这些周期图本来可以被视为近似正态分布。

We present a Bayesian method for estimating spectral quantities in multivariate Gaussian time series. The approach, based on periodograms and Wishart statistics, yields closed-form expressions at any given frequency for the marginal posterior distributions of the individual power spectral densities, the pairwise coherence, and the multiple coherence, as well as for the joint posterior distribution of the full cross-spectral density matrix. In the context of noise projection - where one series is modeled as a linear combination of filtered versions of the others, plus a background component - the method also provides closed-form posteriors for both the susceptibilities, i.e., the filter transfer functions, and the power spectral density of the background. Originally developed for the analysis of the data from the European Space Agency's LISA Pathfinder mission, the method is particularly well-suited to very-low-frequency data, where long observation times preclude averaging over large sets of periodograms, which would otherwise allow these to be treated as approximately normally distributed.

[36] arXiv:2507.20966 (交叉列表自 cs.IT) [中文pdf, pdf, html, 其他]
标题: 基于DRL的用户中心无蜂窝大规模MIMO网络切换设计
标题: Handoff Design in User-Centric Cell-Free Massive MIMO Networks Using DRL
Hussein A. Ammar, Raviraj Adve, Shahram Shahbazpanahi, Gary Boudreau, Israfil Bahceci
评论: 发表于IEEE通信汇刊(IEEE TCOM)
主题: 信息论 (cs.IT) ; 人工智能 (cs.AI) ; 机器学习 (cs.LG) ; 网络与互联网架构 (cs.NI) ; 信号处理 (eess.SP)

在以用户为中心的无蜂窝大规模MIMO(UC-mMIMO)网络方案中,用户移动性需要更新服务接入点的集合,以维持以用户为中心的聚类。 此类更新通常通过切换(HO)操作执行;然而,频繁的HO会导致与资源分配和释放相关的开销。 本文提出了一种基于深度强化学习(DRL)的解决方案,用于预测和管理移动用户的连接。 我们的解决方案采用软演员评论家算法,具有连续动作空间表示,以训练深度神经网络作为HO策略。 我们提出了一个新颖的奖励函数提案,其中集成了HO惩罚,以平衡可达到的速率和与HO相关的开销。 我们开发了系统的两种变体;第一种使用基于用户运动模式的移动方向辅助(DA)观测,而第二种使用基于大尺度衰落(LSF)历史的历史辅助(HA)观测。 仿真结果表明,我们的基于DRL的连续动作空间方法比离散空间方法更具可扩展性,并且我们推导的HO策略能够自动学习在特定时间槽中聚集HO,以最小化启动HO的开销。 我们的解决方案还可以实时运行,响应时间小于0.4 ms。

In the user-centric cell-free massive MIMO (UC-mMIMO) network scheme, user mobility necessitates updating the set of serving access points to maintain the user-centric clustering. Such updates are typically performed through handoff (HO) operations; however, frequent HOs lead to overheads associated with the allocation and release of resources. This paper presents a deep reinforcement learning (DRL)-based solution to predict and manage these connections for mobile users. Our solution employs the Soft Actor-Critic algorithm, with continuous action space representation, to train a deep neural network to serve as the HO policy. We present a novel proposition for a reward function that integrates a HO penalty in order to balance the attainable rate and the associated overhead related to HOs. We develop two variants of our system; the first one uses mobility direction-assisted (DA) observations that are based on the user movement pattern, while the second one uses history-assisted (HA) observations that are based on the history of the large-scale fading (LSF). Simulation results show that our DRL-based continuous action space approach is more scalable than its discrete-space counterpart, and that our derived HO policy automatically learns to gather HOs in specific time slots to minimize the overhead of initiating HOs. Our solution can also operate in real time with a response time of less than 0.4 ms.

[37] arXiv:2507.21023 (交叉列表自 cs.LG) [中文pdf, pdf, html, 其他]
标题: 使用Shapley值进行异常定位:统计研究
标题: On Using the Shapley Value for Anomaly Localization: A Statistical Investigation
Rick S. Blum, Franziska Freytag
主题: 机器学习 (cs.LG) ; 信号处理 (eess.SP)

最近的出版物建议使用Shapley值用于传感器数据系统的异常定位。 使用合理的数学异常模型进行完全控制,实验表明,在Shapley值计算中使用单个固定项可实现较低复杂度的异常定位测试,与使用所有测试案例的Shapley值的测试具有相同的错误概率。 一个证明表明这些结论对于所有独立观测情况必须为真。 对于依赖观测情况,没有可用的证明。

Recent publications have suggested using the Shapley value for anomaly localization in sensor data systems. Using a reasonable mathematical anomaly model for full control, experiments indicate that using a single fixed term in the Shapley value calculation yields a lower-complexity anomaly localization test with the same probability of error as a test using the full Shapley value, for all cases tested. A proof demonstrates that these conclusions must be true for all independent observation cases. For dependent observation cases, no proof is available.
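For reference, this toy sketch computes exact Shapley values that attribute an anomaly score to individual sensors; the additive squared-deviation characteristic function and the four-sensor reading vector are assumptions for illustration, not the paper's anomaly model.

```python
# Exact Shapley attribution of an anomaly score over a few sensors (toy characteristic function).
import numpy as np
from itertools import combinations
from math import factorial

x = np.array([0.1, 3.2, -0.2, 0.05])     # sensor readings; sensor 1 is anomalous

def v(subset):
    """Anomaly score of a subset of sensors (sum of squared deviations from the nominal 0)."""
    return float(np.sum(x[list(subset)] ** 2)) if subset else 0.0

n = len(x)
shapley = np.zeros(n)
for i in range(n):
    others = [j for j in range(n) if j != i]
    for k in range(n):
        for S in combinations(others, k):
            w = factorial(k) * factorial(n - k - 1) / factorial(n)
            shapley[i] += w * (v(S + (i,)) - v(S))

print("Shapley attributions:", shapley.round(3))   # sensor 1 dominates
```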

替换提交 (展示 16 之 16 条目 )

[38] arXiv:2312.09450 (替换) [中文pdf, pdf, 其他]
标题: 基于性能的优化2D钢筋混凝土框架通过推覆分析和ABC优化算法
标题: Performance-Based Optimization of 2D Reinforced Concrete Moment Frames through Pushover Analysis and ABC Optimization Algorithm
Saba Faghirnejad
评论: 发表于《地震与结构》,第27卷第4期,2024年
期刊参考: 地震与结构 27(4):285-302, 2024
主题: 信号处理 (eess.SP)

进行非线性推覆分析通常需要复杂且资源密集的计算尝试,并涉及一个高度迭代的过程,这对于满足基于性能设计中定义的设计要求和规范要求是必要的。 本研究提出了一种基于计算机的技术,用于钢筋混凝土(RC)建筑,结合优化数值方法、最优性准则技术和推覆分析,以自动实现推覆位移性能的抗震设计。 使用人工蜂群优化算法,提出了基于混凝土梁、柱和剪力墙在混凝土框架中的性能的最优设计。 该设计应用于三个框架,如4层、8层和12层。 这些结构旨在最小化整体重量,同时满足包括生命安全(L-S)、防止倒塌(C-P)和立即使用(I-O)在内的性能水平。 为了达到这个目标,进行了三个主要步骤。 在第一步中,在MATLAB软件中实现了优化代码,并使用OpenSees软件对结构进行非线性静力分析。 通过求解优化问题,为每个框架和剪力墙获得几个最佳设计。 考虑到FEMA356规范的非线性规定中的相对位移和塑性铰旋转约束,进行推覆分析以达到每个性能水平。 随后,为每个框架绘制收敛、推覆和位移历史曲线,并最终选择每个框架的最佳设计。 结果表明,该算法在实现选择最佳设计和降低重量的结构方面表现良好。

Conducting nonlinear pushover analysis typically demands intricate and resource-intensive computation, and it involves a highly iterative process that is necessary to satisfy both design-defined requirements and code requirements in performance-based design. This study presents a computer-based technique for reinforced concrete (RC) buildings that combines numerical optimization approaches, optimality-criteria techniques, and pushover analysis to automate seismic design for pushover drift performance. The optimal design, based on the performance of concrete beams, columns, and shear walls in concrete moment frames, is obtained using the artificial bee colony optimization algorithm. The design is applied to three frames: a 4-story, an 8-story, and a 12-story frame. These structures are designed to minimize overall weight while satisfying the performance levels of Life Safety (L-S), Collapse Prevention (C-P), and Immediate Occupancy (I-O). To achieve this goal, three main steps are performed. In the first step, the optimization code is implemented in MATLAB, and OpenSees is used for nonlinear static analysis of the structure. By solving the optimization problem, several top designs are obtained for each frame and shear wall. Pushover analysis is then performed, considering constraints on relative displacement and plastic hinge rotation based on the nonlinear provisions of the FEMA356 code, to reach each performance level. Following this, convergence, pushover, and drift history curves are plotted for each frame, and the best design for each frame is ultimately selected. The results demonstrate that the algorithm performs well in selecting the best design and achieving lower structural weight.

[39] arXiv:2409.08839 (替换) [中文pdf, pdf, 其他]
标题: RF 挑战:数据驱动的射频信号分离挑战
标题: RF Challenge: The Data-Driven Radio Frequency Signal Separation Challenge
Alejandro Lancho, Amir Weiss, Gary C.F. Lee, Tejas Jayashankar, Binoy Kurien, Yury Polyanskiy, Gregory W. Wornell
评论: 17页,16图。增加了关于测试集泄漏的脚注
期刊参考: IEEE 通信学会开放期刊,第6卷,第4083-4100页,2025
主题: 信号处理 (eess.SP) ; 机器学习 (cs.LG)

我们采用数据驱动的方法来解决射频(RF)信号中的干扰抑制这一关键问题。本文的主要贡献是引入了RF Challenge,这是一个公开可用的多样化射频信号数据集,用于数据驱动的射频信号问题分析。具体而言,我们采用了一个简化的信号模型来开发和分析干扰抑制算法。对于这个信号模型,我们引入了一组精心选择的深度学习架构,结合关键领域知识的修改以及传统基准解决方案,以建立该复杂且普遍问题的基线性能指标。通过涉及八种不同信号混合类型的广泛模拟,我们展示了如UNet和WaveNet等架构在传统方法如匹配滤波和线性最小均方误差估计方面的优越性能(在某些情况下,性能提高了两个数量级)。我们的研究结果表明,数据驱动的方法可以产生可扩展的解决方案,即相同的架构可能被类似地训练和部署用于不同类型的信号。此外,这些发现进一步证实了深度学习算法在增强通信系统方面的潜在前景,特别是通过干扰抑制。本工作还包括基于RF Challenge的开放竞赛结果,该竞赛在2024年IEEE国际声学、语音与信号处理会议(ICASSP'24)上举办。

We address the critical problem of interference rejection in radio-frequency (RF) signals using a data-driven approach that leverages deep-learning methods. A primary contribution of this paper is the introduction of the RF Challenge, which is a publicly available, diverse RF signal dataset for data-driven analyses of RF signal problems. Specifically, we adopt a simplified signal model for developing and analyzing interference rejection algorithms. For this signal model, we introduce a set of carefully chosen deep learning architectures, incorporating key domain-informed modifications alongside traditional benchmark solutions to establish baseline performance metrics for this intricate, ubiquitous problem. Through extensive simulations involving eight different signal mixture types, we demonstrate the superior performance (in some cases, by two orders of magnitude) of architectures such as UNet and WaveNet over traditional methods like matched filtering and linear minimum mean square error estimation. Our findings suggest that the data-driven approach can yield scalable solutions, in the sense that the same architectures may be similarly trained and deployed for different types of signals. Moreover, these findings further corroborate the promising potential of deep learning algorithms for enhancing communication systems, particularly via interference mitigation. This work also includes results from an open competition based on the RF Challenge, hosted at the 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'24).

[40] arXiv:2502.18118 (替换) [中文pdf, pdf, html, 其他]
标题: 生成式人工智能赋能的无线通信用于稳健的低空经济网络
标题: Generative AI-enabled Wireless Communications for Robust Low-Altitude Economy Networking
Changyuan Zhao, Jiacheng Wang, Ruichen Zhang, Dusit Niyato, Geng Sun, Hongyang Du, Dong In Kim, Abbas Jamalipour
评论: 9页,4图
主题: 信号处理 (eess.SP)

低空经济网络(LAENets)已成为社会活动的重要推动者,提供低空服务,如包裹、杂货和医疗用品的运输。 由于其控制机制和不断变化的操作因素,LAENets本质上比传统地面网络更复杂,并且更容易受到安全威胁。 随着LAENet应用的持续扩展,这些系统的鲁棒性变得至关重要。 在本文中,我们提出了一种生成式人工智能(GenAI)优化框架,以解决LAENets中的鲁棒性挑战。 我们对LAENets的鲁棒性需求进行了系统分析,并从无线物理层的角度全面回顾了鲁棒服务质量(QoS)指标。 然后,我们研究了现有的基于GenAI的鲁棒性增强方法。 这导致我们提出了一个基于扩散的优化框架,该框架具有专家混合(MoE)-transformer执行器网络。 在鲁棒波束成形的案例研究中,所提出的框架通过在不确定性下优化波束成形展示了其有效性,在最坏情况下的可实现保密速率上,超过了四个学习基线,提高了超过15%。 这些发现突显了GenAI在增强LAENet鲁棒性方面的巨大潜力。

Low-Altitude Economy Networks (LAENets) have emerged as significant enablers of social activities, offering low-altitude services such as the transportation of packages, groceries, and medical supplies. Owing to their control mechanisms and ever-changing operational factors, LAENets are inherently more complex and vulnerable to security threats than traditional terrestrial networks. As applications of LAENet continue to expand, the robustness of these systems becomes crucial. In this paper, we propose a generative artificial intelligence (GenAI) optimization framework that tackles robustness challenges in LAENets. We conduct a systematic analysis of robustness requirements for LAENets, complemented by a comprehensive review of robust Quality of Service (QoS) metrics from the wireless physical layer perspective. We then investigate existing GenAI-enabled approaches for robustness enhancement. This leads to our proposal of a novel diffusion-based optimization framework with a Mixture of Experts (MoE)-transformer actor network. In the robust beamforming case study, the proposed framework demonstrates its effectiveness by optimizing beamforming under uncertainties, achieving a more than 15% increase over four learning baselines in the worst-case achievable secrecy rate. These findings highlight the significant potential of GenAI in strengthening LAENet robustness.

[41] arXiv:2503.09922 (替换) [中文pdf, pdf, html, 其他]
标题: 基于RIS的分数约束分数规划联合感知与通信
标题: RIS-Assisted Joint Sensing and Communications via Fractionally Constrained Fractional Programming
Yiming Liu, Kareem M. Attiah, Wei Yu
评论: 论文已被接受发表于IEEE无线通信汇刊
主题: 信号处理 (eess.SP)

本文研究了一个由可重构智能表面(RIS)辅助的上行双功能感知和通信系统,其反射模式被最优配置以权衡感知和通信功能。 具体而言,在确保通信用户信干噪比约束的前提下,最小化估计感知用户方位角的贝叶斯克拉默-拉奥下界(BCRLB)。 我们表明,这个问题可以被表述为一个新颖的分数约束分数规划(FCFP)问题。 为了处理这个高度非平凡的问题,我们将一种二次变换技术进行了扩展,该技术最初用于处理仅在目标中包含分数结构的优化问题,现在扩展到约束中也包含比值的情况。 首先,我们考虑衰落系数已知的情况。 使用二次变换,FCFP问题可以转化为一系列子问题,这些子问题除了常模约束外都是凸的,而常模约束可以通过基于惩罚的方法来处理。 为了进一步降低计算复杂度,我们利用了常模条件,并提出了一种新的线性变换。 这种新变换使得FCFP问题可以转化为一系列线性规划(LP)子问题,这些子问题可以在反射元件维度上以线性复杂度求解。 然后,我们考虑衰落系数未知的情况。 使用修改后的BCRLB使问题更具可处理性,并使用所提出的基于二次变换的算法来解决该问题。 数值结果揭示了由RIS合成的非平凡且有效的反射模式,这些模式可以促进通信和感知功能。

This paper studies an uplink dual-functional sensing and communication system aided by a reconfigurable intelligent surface (RIS), whose reflection pattern is optimally configured to trade off sensing and communication functionalities. Specifically, the Bayesian Cramér-Rao lower bound (BCRLB) for estimating the azimuth angle of a sensing user is minimized while ensuring the signal-to-interference-plus-noise ratio constraints for communication users. We show that this problem can be formulated as a novel fractionally constrained fractional programming (FCFP) problem. To deal with this highly nontrivial problem, we extend a quadratic transform technique, originally proposed to handle optimization problems containing fractional structures only in objectives, to the scenario where the constraints also include ratios. First, we consider the case where the fading coefficient is known. Using the quadratic transform, the FCFP problem can be turned into a sequence of subproblems that are convex except for the constant-modulus constraints, which can be tackled using a penalty-based approach. To further reduce the computational complexity, we leverage the constant-modulus conditions and propose a novel linear transform. This new transform enables the FCFP problem to be turned into a sequence of linear programming (LP) subproblems, which can be solved with linear complexity in the dimension of reflecting elements. Then, we consider the case where the fading coefficient is unknown. A modified BCRLB is used to make the problem more tractable, and the proposed quadratic transform based algorithm is used to solve the problem. Numerical results unveil nontrivial and effective reflection patterns that can be synthesized by the RIS to facilitate both communication and sensing functionalities.

[42] arXiv:2504.07720 (替换) [中文pdf, pdf, 其他]
标题: 通过拓扑透镜进行过滤:时间频率平面上点过程的同调性
标题: Filtering through a topological lens: homology for point processes on the time-frequency plane
Juan Manuel Miramont, Kin Aun Tan, Soumendu Sundar Mukherjee, Rémi Bardenet, Subhroshekhar Ghosh
主题: 信号处理 (eess.SP) ; 代数拓扑 (math.AT)

我们从拓扑数据分析(TDA)的角度,介绍了一种分析来自噪声测量的信号的通用方法。 虽然TDA已发展成为一种强大的数据分析工具,用于具有明显拓扑结构的数据,但在这里我们展示了其在一般信号处理问题中的适用性,而无需任何先验几何特征。 我们的方法适用于不同科学领域中的各种时间依赖信号,其中声学信号是一个特别重要的应用。 我们使用这些信号的时间频率表示,重点关注它们的零点,这些零点由于其稳定性特性,正在成为信号处理工具中越来越重要的部分。 利用最先进的拓扑概念,如稳定体积和最小体积,我们开发了一整套基于TDA的方法,以探索这些零点的微妙随机几何结构,根据它们对这种刚性、超均匀空间结构造成的破坏来捕捉信号。 与经典的空间数据工具不同,TDA能够捕捉零点随机几何的全部谱,从而产生由有原则的统计基础支持的强大推断结果。 这体现在我们应用的功率和多样性上,包括在处理各种音频信号(特别是低信噪比环境)中的竞争性能,有效检测和重建引力波信号(一种具有非高斯噪声的著名信号处理挑战),以及来自脑电图的医学时间序列数据,表明本文介绍的方法和方法具有广阔的应用前景。

We introduce a very general approach to the analysis of signals from their noisy measurements from the perspective of Topological Data Analysis (TDA). While TDA has emerged as a powerful analytical tool for data with pronounced topological structures, here we demonstrate its applicability for general problems of signal processing, without any a-priori geometric feature. Our methods are well-suited to a wide array of time-dependent signals in different scientific domains, with acoustic signals being a particularly important application. We invoke time-frequency representations of such signals, focusing on their zeros, which are gaining salience as a signal processing tool in view of their stability properties. Leveraging state-of-the-art topological concepts, such as stable and minimal volumes, we develop a complete suite of TDA-based methods to explore the delicate stochastic geometry of these zeros, capturing signals based on the disruption they cause to this rigid, hyperuniform spatial structure. Unlike classical spatial data tools, TDA is able to capture the full spectrum of the stochastic geometry of the zeros, thereby leading to powerful inferential outcomes that are underpinned by a principled statistical foundation. This is reflected in the power and versatility of our applications, which include competitive performance in processing a wide variety of audio signals (especially in low SNR regimes), effective detection and reconstruction of gravitational wave signals (a reputed signal processing challenge with non-Gaussian noise), and medical time series data from EEGs, indicating a wide horizon for the approach and methods introduced in this paper.

[43] arXiv:2506.22456 (替换) [中文pdf, pdf, html, 其他]
标题: 基于变分自编码器的自动化仓库中人工智能驱动的无线电传播预测
标题: AI-Driven Radio Propagation Prediction in Automated Warehouses using Variational Autoencoders
Rahul Gulia, Amlan Ganguly, Andres Kwasinski, Michael E. Kuhl, Ehsan Rashedi, Clark Hochgraf
主题: 信号处理 (eess.SP) ; 图像与视频处理 (eess.IV)

接下来的十年将带来无线通信的深刻变革,这由对数据密集型应用日益增长的需求以及新兴技术的快速采用所推动。 为了充分发挥5G及更先进技术的潜力,需要在信号处理技术、创新网络架构和高效频谱利用策略方面取得重大进展。 这些进展促进了新兴技术的无缝集成,推动了工业数字化转型和连接性。 本文介绍了一种基于变分自编码器(VAE)的框架,名为WISVA(使用VAE的智能仓库无线基础设施),旨在准确地对自动化工业4.0环境中的室内无线电传播进行建模,例如在5G无线频段内运行的仓库和工厂地板。 本研究深入探讨了训练数据张量的精心创建,捕捉受各种障碍物影响的复杂电磁(EM)波行为,并概述了所提出的VAE模型的架构和训练方法。 通过其在不同场景下预测信噪干扰比(SINR)热图的能力,包括去噪任务、验证数据集、对未见过的配置进行外推以及之前未遇到的仓库布局,展示了该模型的鲁棒性和适应性。 本文展示了引人注目的重建误差热图,突显了WISVA相比传统自编码器模型的优越准确性。 本文还分析了该模型在处理复杂智能仓库环境中的性能,证明了其作为优化工业4.0无线基础设施的关键推动者的潜力。

The next decade will usher in a profound transformation of wireless communication, driven by the ever-increasing demand for data-intensive applications and the rapid adoption of emerging technologies. To fully unlock the potential of 5G and beyond, substantial advancements are required in signal processing techniques, innovative network architectures, and efficient spectrum utilization strategies. These advancements facilitate seamless integration of emerging technologies, driving industrial digital transformation and connectivity. This paper introduces a novel Variational Autoencoder (VAE)-based framework, Wireless Infrastructure for Smart Warehouses using VAE (WISVA), designed for accurate indoor radio propagation modeling in automated Industry 4.0 environments such as warehouses and factory floors operating within 5G wireless bands. The research delves into the meticulous creation of training data tensors, capturing complex electromagnetic (EM) wave behaviors influenced by diverse obstacles, and outlines the architecture and training methodology of the proposed VAE model. The model's robustness and adaptability are showcased through its ability to predict signal-to-interference-plus-noise ratio (SINR) heatmaps across various scenarios, including denoising tasks, validation datasets, extrapolation to unseen configurations, and previously unencountered warehouse layouts. Compelling reconstruction error heatmaps are presented, highlighting the superior accuracy of WISVA compared to traditional autoencoder models. The paper also analyzes the model's performance in handling complex smart warehouse environments, demonstrating its potential as a key enabler for optimizing wireless infrastructure in Industry 4.0.

[44] arXiv:2506.22495 (替换) [中文pdf, pdf, html, 其他]
标题: 掩码自编码器感受心脏:揭示心电图分析中的简单性偏差
标题: Masked Autoencoders that Feel the Heart: Unveiling Simplicity Bias for ECG Analyses
He-Yang Xu, Hongxiang Gao, Yuwen Li, Xiu-Shen Wei, Chengyu Liu
评论: 修订版4
主题: 信号处理 (eess.SP) ; 人工智能 (cs.AI) ; 机器学习 (cs.LG)

心电图(ECG)的诊断价值在于其动态特性,从节律波动到随时间域和频率域演变的细微波形变形。然而,监督ECG模型往往过度拟合主导且重复的模式,忽视了细粒度但临床上关键的提示,这种现象称为简单性偏差(SB),其中模型更倾向于学习容易获取的信号而非细微但有信息量的信号。在本工作中,我们首先通过实证证明了ECG分析中SB的存在及其对诊断性能的负面影响,同时发现自监督学习(SSL)可以缓解这一问题,为解决偏差提供了一个有前景的方向。遵循SSL范式,我们提出了一种新方法,包含两个关键组件:1)时间-频率感知滤波器,用于捕捉反映ECG信号动态特性的时频特征,以及2)在此基础上,构建多粒度原型重构,以在双域中进行粗粒度和细粒度表示学习,进一步减轻SB。为了推进ECG分析中的SSL,我们整理了一个大规模多中心ECG数据集,包含来自300多个临床中心的153万条记录。在六个ECG数据集上的三个下游任务实验表明,我们的方法有效减少了SB,并实现了最先进性能。

The diagnostic value of electrocardiogram (ECG) lies in its dynamic characteristics, ranging from rhythm fluctuations to subtle waveform deformations that evolve across time and frequency domains. However, supervised ECG models tend to overfit dominant and repetitive patterns, overlooking fine-grained but clinically critical cues, a phenomenon known as Simplicity Bias (SB), where models favor easily learnable signals over subtle but informative ones. In this work, we first empirically demonstrate the presence of SB in ECG analyses and its negative impact on diagnostic performance, while simultaneously discovering that self-supervised learning (SSL) can alleviate it, providing a promising direction for tackling the bias. Following the SSL paradigm, we propose a novel method comprising two key components: 1) Temporal-Frequency aware Filters to capture temporal-frequency features reflecting the dynamic characteristics of ECG signals, and 2) building on this, Multi-Grained Prototype Reconstruction for coarse and fine representation learning across dual domains, further mitigating SB. To advance SSL in ECG analyses, we curate a large-scale multi-site ECG dataset with 1.53 million recordings from over 300 clinical centers. Experiments on three downstream tasks across six ECG datasets demonstrate that our method effectively reduces SB and achieves state-of-the-art performance.

[45] arXiv:2507.06020 (替换) [中文pdf, pdf, 其他]
标题: 一种具有邻域变异的微分进化算法用于DOA估计
标题: A Differential Evolution Algorithm with Neighbor-hood Mutation for DOA Estimation
Bo Zhou, Kaijie Xu, Yinghui Quan, Mengdao Xing
主题: 信号处理 (eess.SP) ; 神经与进化计算 (cs.NE)

二维(2D)多重信号分类算法是阵列信号处理中一种强大的高分辨率到达方向(DOA)估计技术。 然而,在二维角度域中的全面搜索会导致计算成本高昂,限制了其在实时场景中的应用。 在本工作中,我们将峰值查找过程重新表述为多模态优化问题,并提出了一种具有邻域变异的差分进化算法(DE-NM),以高效地定位多个谱峰,而无需密集的网格采样。 仿真结果表明,所提出的方法在估计精度上与传统的网格搜索方法相当,同时显著减少了计算时间。 这种策略为实际应用中的实时高分辨率DOA估计提供了一个有前景的解决方案。 实现代码可在 https://github.com/zzb-nice/DOA_multimodel_optimize 获取。

The two-dimensional (2D) Multiple Signal Classification algorithm is a powerful technique for high-resolution direction-of-arrival (DOA) estimation in array signal processing. However, the exhaustive search over the 2D angular domain leads to high computational cost, limiting its applicability in real-time scenarios. In this work, we reformulate the peak-finding process as a multimodal optimization problem, and propose a Differential Evolution algorithm with Neighborhood Mutation (DE-NM) to efficiently locate multiple spectral peaks without requiring dense grid sampling. Simulation results demonstrate that the proposed method achieves comparable estimation accuracy to the traditional grid search, while significantly reducing computation time. This strategy presents a promising solution for real-time, high-resolution DOA estimation in practical applications. The implementation code is available at https://github.com/zzb-nice/DOA_multimodel_optimize.
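To illustrate the idea of replacing a dense 2D grid search with an evolutionary search, the sketch below runs SciPy's plain differential evolution on a toy two-peak pseudo-spectrum; it finds only the strongest peak, whereas the paper's neighborhood-mutation variant (DE-NM) maintains niches so that multiple peaks are located in one run.

```python
# Evolutionary peak search over a toy 2D pseudo-spectrum (plain DE, not the paper's DE-NM).
import numpy as np
from scipy.optimize import differential_evolution

def pseudo_spectrum(angles):
    """Toy 2D spectrum with peaks at (20, -30) and (-40, 10) degrees."""
    az, el = angles
    p1 = np.exp(-((az - 20) ** 2 + (el + 30) ** 2) / 50.0)
    p2 = np.exp(-((az + 40) ** 2 + (el - 10) ** 2) / 50.0)
    return -(p1 + 0.8 * p2)          # negated: differential_evolution minimizes

result = differential_evolution(pseudo_spectrum, bounds=[(-90, 90), (-90, 90)], seed=0)
print("strongest peak near:", result.x.round(1))   # expected close to (20, -30)
```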

[46] arXiv:2507.14144 (替换) [中文pdf, pdf, html, 其他]
标题: 递归KalmanNet:基于卡尔曼滤波引导的递归神经网络泛化能力分析
标题: Recursive KalmanNet: Analyse des capacités de généralisation d'un réseau de neurones récurrent guidé par un filtre de Kalman
Cyril Falcon, Hassan Mortada, Mathéo Clavaud, Jean-Philippe Michel
评论: 4页,法语语言。4个图表。已被接受发表于GRETSI 2025论文集
主题: 信号处理 (eess.SP) ; 机器学习 (cs.LG)

递归卡尔曼网络,由作者最近引入,是一种由卡尔曼滤波引导的循环神经网络,能够在不事先了解噪声特性的情况下,从噪声测量中估计随机动态系统的状态变量和误差协方差。 本文探讨了其在分布外场景中的泛化能力,其中测试测量的时间动态与训练期间遇到的不同。 Le Recursive KalmanNet, récemment introduit par les auteurs, est un réseau de neurones récurrent guidé par un filtre de Kalman, capable d'estimer les variables d'état et la covariance des erreurs des systèmes dynamiques stochastiques à partir de mesures bruitées, sans connaissance préalable des caractéristiques des bruits. Cet article explore ses capacités de généralisation dans des scénarios hors distribution, où les dynamiques temporelles des mesures de test diffèrent de celles rencontrées à l'entraînement.

The Recursive KalmanNet, recently introduced by the authors, is a recurrent neural network guided by a Kalman filter, capable of estimating the state variables and error covariance of stochastic dynamic systems from noisy measurements, without prior knowledge of the noise characteristics. This paper explores its generalization capabilities in out-of-distribution scenarios, where the temporal dynamics of the test measurements differ from those encountered during training. Le Recursive KalmanNet, récemment introduit par les auteurs, est un réseau de neurones récurrent guidé par un filtre de Kalman, capable d'estimer les variables d'état et la covariance des erreurs des systèmes dynamiques stochastiques à partir de mesures bruitées, sans connaissance préalable des caractéristiques des bruits. Cet article explore ses capacités de généralisation dans des scénarios hors distribution, où les dynamiques temporelles des mesures de test diffèrent de celles rencontrées à l'entraînement.

[47] arXiv:2407.06868 (替换) [中文pdf, pdf, 其他]
标题: DRL-AdaPart:DRL驱动的自适应STAR-RIS划分以实现公平和节约的资源利用
标题: DRL-AdaPart: DRL-Driven Adaptive STAR-RIS Partitioning for Fair and Frugal Resource Utilization
Ashok S. Kumar, Nancy Nayak, Sheetal Kalyani, Himal A. Suraweera
主题: 信息论 (cs.IT) ; 机器学习 (cs.LG) ; 信号处理 (eess.SP)

在本工作中,我们提出了一种方法,用于高效利用同时发射和反射可重构智能表面(STAR-RIS)元件,以确保公平且高速的数据速率。我们引入了一个子表面分配变量,该变量决定了分配给每个用户的STAR-RIS元件数量,并通过使用适当调整的深度强化学习(DRL)算法联合优化STAR-RIS的相位偏移和子表面分配变量来最大化数据速率之和。所提出的DRL方法还与Dinkelbach算法和设计的混合DRL方法进行了比较。在DRL模型中引入了一个惩罚项,通过智能地在不需要时停用STAR-RIS元件来提高资源利用率。通过广泛的仿真,所提出的DRL方法可以在确保高效资源利用的同时,为静态和移动用户实现公平且高速的数据速率。使用所提出的DRL方法,在静态和移动场景中,最多可以分别停用27%和21%的STAR-RIS元件,而不会影响性能。

In this work, we propose a method for efficient resource utilization of simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) elements to ensure fair and high data rates. We introduce a subsurface assignment variable that determines the number of STAR-RIS elements allocated to each user and maximizes the sum of the data rates by jointly optimizing the phase shifts of the STAR-RIS and the subsurface assignment variables using an appropriately tailored deep reinforcement learning (DRL) algorithm. The proposed DRL method is also compared with a Dinkelbach algorithm and the designed hybrid DRL approach. A penalty term is incorporated into the DRL model to enhance resource utilization by intelligently deactivating STAR-RIS elements when not required. The proposed DRL method can achieve fair and high data rates for static and mobile users while ensuring efficient resource utilization through extensive simulations. Using the proposed DRL method, up to 27% and 21% of STAR-RIS elements can be deactivated in static and mobile scenarios, respectively, without affecting performance.

[48] arXiv:2407.19299 (替换) [中文pdf, pdf, 其他]
标题: LoRA适配器在计算和数据约束下对临床文本分类中大型语言模型的影响
标题: The Impact of LoRA Adapters on LLMs for Clinical Text Classification Under Computational and Data Constraints
Thanh-Dung Le, Ti Ti Nguyen, Vu Nguyen Ha, Symeon Chatzinotas, Philippe Jouvet, Rita Noumeir
评论: 已接受发表于IEEE Access
主题: 计算与语言 (cs.CL) ; 信号处理 (eess.SP)

微调大型语言模型(LLMs)用于临床自然语言处理(NLP)由于领域差距、数据有限和严格的硬件限制而面临重大挑战。在本研究中,我们在现实世界、资源受限的条件下评估了四种适配器技术——Adapter、Lightweight、TinyAttention 和门控残差网络(GRN)——等同于低秩适配(LoRA),用于临床笔记分类。所有实验均在单个NVIDIA Quadro P620 GPU(2 GB VRAM,512 CUDA核心,1.386 TFLOPS FP32)上进行,限制批量大小小于8个序列,最大序列长度为256个标记。我们的临床语料库仅包含580000个标记,比标准LLM预训练数据集小几个数量级。我们微调了三种生物医学预训练LLM(CamemBERT-bio,AliBERT,DrBERT)和两个从头开始训练的轻量级Transformer模型。结果表明,1)在这些约束下微调生物医学LLM时,适配器结构没有提供一致的增益,2)参数数量最少且训练时间在六小时以内的简单Transformer优于需要超过1000个GPU小时的适配器增强LLM。在适配器中,GRN达到了最佳指标(准确率、精确率、召回率、F1 = 0.88)。这些发现表明,在数据和计算资源有限的低资源临床环境中,从头开始训练的轻量级Transformer比大型LLM提供了更实用和高效的解决方案,而当需要最小适应时,GRN仍然是一个可行的适配器选择。

Fine-tuning Large Language Models (LLMs) for clinical Natural Language Processing (NLP) poses significant challenges due to domain gap, limited data, and stringent hardware constraints. In this study, we evaluate four adapter techniques - Adapter, Lightweight, TinyAttention, and Gated Residual Network (GRN) - equivalent to Low-Rank Adaptation (LoRA), for clinical note classification under real-world, resource-constrained conditions. All experiments were conducted on a single NVIDIA Quadro P620 GPU (2 GB VRAM, 512 CUDA cores, 1.386 TFLOPS FP32), limiting batch sizes to <8 sequences and maximum sequence length to 256 tokens. Our clinical corpus comprises only 580 000 tokens, several orders of magnitude smaller than standard LLM pre-training datasets. We fine-tuned three biomedical pre-trained LLMs (CamemBERT-bio, AliBERT, DrBERT) and two lightweight Transformer models trained from scratch. Results show that 1) adapter structures provide no consistent gains when fine-tuning biomedical LLMs under these constraints, and 2) simpler Transformers, with minimal parameter counts and training times under six hours, outperform adapter-augmented LLMs, which required over 1000 GPU-hours. Among adapters, GRN achieved the best metrics (accuracy, precision, recall, F1 = 0.88). These findings demonstrate that, in low-resource clinical settings with limited data and compute, lightweight Transformers trained from scratch offer a more practical and efficient solution than large LLMs, while GRN remains a viable adapter choice when minimal adaptation is needed.
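For readers unfamiliar with the adapter mechanism being benchmarked, here is a generic low-rank adapter sketch in PyTorch: a frozen linear layer is augmented with a trainable low-rank update, the LoRA-style mechanism the compared techniques are equivalent to; the rank, scaling, and layer sizes are arbitrary assumptions.

```python
# Generic LoRA-style low-rank adapter around a frozen linear layer (illustrative sketch).
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():          # freeze the pretrained weights
            p.requires_grad = False
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T

layer = LoRALinear(nn.Linear(768, 768))
out = layer(torch.randn(4, 768))                  # only A and B receive gradients
print(out.shape, sum(p.numel() for p in layer.parameters() if p.requires_grad))
```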

[49] arXiv:2412.06336 (替换) [中文pdf, pdf, html, 其他]
标题: 一种结合信道方法解码颅内 EEG 信号:通过空间信息整合提高准确性
标题: A Combined Channel Approach for Decoding Intracranial EEG Signals: Enhancing Accuracy through Spatial Information Integration
Maryam Ostadsharif Memar, Navid Ziaei, Behzad Nazari
主题: 人机交互 (cs.HC) ; 信号处理 (eess.SP)

颅内脑电图(iEEG)记录具有高空间和时间分辨率以及优越的信噪比(SNR),使开发用于神经解码的精确脑机接口(BCI)系统成为可能。 然而,该过程的侵入性显著限制了iEEG数据集在参与人数和记录会话时长方面的可用性。 为解决这一限制,我们提出了一种针对iEEG信号解码优化的单参与者机器学习模型。 该模型采用18个关键特征,并在两种模式下运行:最佳通道模式和组合通道模式。 组合通道模式整合了多个脑区的空间信息,从而实现了更优的分类性能。 在三个数据集——音乐重建、视听和AJILE12——上的评估表明,组合通道模式在所有分类器上始终优于最佳通道模式。 在表现最佳的情况下,随机森林在音乐重建数据集中获得了0.81 +/- 0.05的F1分数,在视听数据集中获得了0.82 +/- 0.10的F1分数,而XGBoost在AJILE12数据集中获得了0.84 +/- 0.08的F1分数。 此外,对组合通道模式中脑区贡献的分析表明,该模型识别出与每个任务的生理预期一致的相关脑区,并有效结合这些区域中的电极数据以实现高性能。 这些发现突显了整合跨脑区空间信息以提高任务解码的潜力,为推进BCI系统和神经技术应用提供了新途径。

Intracranial EEG (iEEG) recording, characterized by high spatial and temporal resolution and superior signal-to-noise ratio (SNR), enables the development of precise brain-computer interface (BCI) systems for neural decoding. However, the invasive nature of the procedure significantly limits the availability of iEEG datasets in terms of both the number of participants and the duration of recorded sessions. To address this limitation, we propose a single-participant machine learning model optimized for decoding iEEG signals. The model employs 18 key features and operates in two modes: best channel and combined channel. The combined channel mode integrates spatial information from multiple brain regions, leading to superior classification performance. Evaluations across three datasets -- Music Reconstruction, Audio Visual, and AJILE12 -- demonstrate that the combined channel mode consistently outperforms the best channel mode across all classifiers. In the best-performing cases, Random Forest achieved an F1 score of 0.81 +/- 0.05 in the Music Reconstruction dataset and 0.82 +/- 0.10 in the Audio Visual dataset, while XGBoost achieved an F1 score of 0.84 +/- 0.08 in the AJILE12 dataset. Furthermore, the analysis of brain region contributions in the combined channel mode revealed that the model identifies relevant brain regions aligned with physiological expectations for each task and effectively combines data from electrodes in these regions to achieve high performance. These findings highlight the potential of integrating spatial information across brain regions to improve task decoding, offering new avenues for advancing BCI systems and neurotechnological applications.

[50] arXiv:2502.01189 (替换) [中文pdf, pdf, html, 其他]
标题: 带有去噪扩散代码本模型的压缩图像生成
标题: Compressed Image Generation with Denoising Diffusion Codebook Models
Guy Ohayon, Hila Manor, Tomer Michaeli, Michael Elad
评论: 发表于国际机器学习会议(ICML)2025。代码和演示可在 https://ddcm-2025.github.io/ 获取。
主题: 图像与视频处理 (eess.IV) ; 人工智能 (cs.AI) ; 计算机视觉与模式识别 (cs.CV) ; 信息论 (cs.IT) ; 信号处理 (eess.SP)

我们提出了一种基于去噪扩散模型(DDMs)的新型生成方法,该方法能够生成高质量的图像样本及其无损压缩的比特流表示。这是通过将反向扩散中的标准高斯噪声采样替换为从预定义的固定独立同分布高斯向量代码本中选择的噪声样本实现的。令人惊讶的是,我们发现我们的方法,称为去噪扩散代码本模型(DDCM),即使在使用极小的代码本时,也能保持标准DDMs的样本质量和多样性。我们利用DDCM并选择与给定图像最匹配的代码本中的噪声,将我们的生成模型转换为一种高效的有损图像编解码器,实现了最先进的感知图像压缩结果。更一般地,通过设置其他噪声选择规则,我们将我们的压缩方法扩展到任何条件图像生成任务(例如,图像修复),其中生成的图像与其压缩的比特流表示一起生成。我们的工作伴随着对所提出的压缩条件生成方案的数学解释,建立了与所考虑任务的后验采样器的基于分数的近似之间的联系。

We present a novel generative approach based on Denoising Diffusion Models (DDMs), which produces high-quality image samples along with their losslessly compressed bit-stream representations. This is obtained by replacing the standard Gaussian noise sampling in the reverse diffusion with a selection of noise samples from pre-defined codebooks of fixed iid Gaussian vectors. Surprisingly, we find that our method, termed Denoising Diffusion Codebook Model (DDCM), retains the sample quality and diversity of standard DDMs, even for extremely small codebooks. We leverage DDCM and pick the noises from the codebooks that best match a given image, converting our generative model into a highly effective lossy image codec achieving state-of-the-art perceptual image compression results. More generally, by setting other noise selection rules, we extend our compression method to any conditional image generation task (e.g., image restoration), where the generated images are produced jointly with their condensed bit-stream representations. Our work is accompanied by a mathematical interpretation of the proposed compressed conditional generation schemes, establishing a connection with score-based approximations of posterior samplers for the tasks considered.
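A minimal sketch of the codebook replacement idea follows: instead of drawing fresh Gaussian noise at a reverse-diffusion step, the nearest entry of a fixed iid Gaussian codebook is selected and its index becomes part of the compressed bit-stream; the codebook size, dimension, and the "ideal noise" stand-in are assumptions, not the released DDCM implementation.

```python
# Selecting reverse-diffusion noise from a fixed Gaussian codebook (toy sketch).
import numpy as np

rng = np.random.default_rng(0)
codebook = rng.standard_normal((256, 64))        # fixed iid Gaussian codebook (K entries x dim)
ideal_noise = rng.standard_normal(64)            # stand-in for the noise that best steers toward the target image

index = int(np.argmin(np.linalg.norm(codebook - ideal_noise, axis=1)))
chosen_noise = codebook[index]                   # used in place of fresh Gaussian noise at this step
bits = index.to_bytes(1, "big")                  # 8 bits appended to the bit-stream (K = 256)
print(index, len(bits) * 8, "bits")
```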

[51] arXiv:2502.02889 (替换) [中文pdf, pdf, html, 其他]
标题: 从DeepSense到Open RAN:动态频谱感知中的AI/ML进展及其应用
标题: From DeepSense to Open RAN: AI/ML Advancements in Dynamic Spectrum Sensing and Their Applications
Ryan Barker
评论: 6页,9图
主题: 网络与互联网架构 (cs.NI) ; 信号处理 (eess.SP)

人工智能(AI)和机器学习(ML)在下一代无线通信系统中的集成已成为推动智能、自适应和可扩展网络的基石。 本阅读报告审视了动态频谱感知(DSS)的关键创新,从基础的DeepSense框架开始,该框架使用卷积神经网络(CNN)和基于频谱图的分析进行实时宽带频谱监控。 在此基础上,它突出了DeepSweep和宽带信号拼接等进展,这些进展通过并行处理、语义分割和稳健的数据增强策略来解决可扩展性、延迟和数据集多样性方面的挑战。 报告随后探讨了开放无线接入网络(ORAN),重点在于无人机实验的AI/ML驱动增强、基于数字孪生的优化、网络切片和自愈xApp开发。 通过将基于AI的DSS方法与ORAN的开放、无供应商限制的架构相结合,这些研究强调了软件定义的智能基础设施在为5G/6G生态系统实现高效、弹性且自我优化的网络方面的潜力。 通过这一综合分析,报告突出了AI在塑造无线通信和自主系统未来中的变革作用。

The integration of Artificial Intelligence (AI) and Machine Learning (ML) in next-generation wireless communication systems has become a cornerstone for advancing intelligent, adaptive, and scalable networks. This reading report examines key innovations in dynamic spectrum sensing (DSS), beginning with the foundational DeepSense framework, which uses convolutional neural networks (CNNs) and spectrogram-based analysis for real-time wideband spectrum monitoring. Building on this groundwork, it highlights advancements such as DeepSweep and Wideband Signal Stitching, which address the challenges of scalability, latency, and dataset diversity through parallel processing, semantic segmentation, and robust data augmentation strategies. The report then explores Open Radio Access Networks (ORAN), focusing on AI/ML-driven enhancements for UAV experimentation, digital twin-based optimization, network slicing, and self-healing xApp development. By bridging AI-based DSS methodologies with ORAN's open, vendor-neutral architecture, these studies underscore the potential of software-defined, intelligent infrastructures in enabling efficient, resilient, and self-optimizing networks for 5G/6G ecosystems. Through this synthesis, the report highlights AI's transformative role in shaping the future of wireless communication and autonomous systems.

[52] arXiv:2506.06732 (替换) [中文pdf, pdf, 其他]
标题: 神经频谱带生成用于音频编码
标题: Neural Spectral Band Generation for Audio Coding
Woongjib Choi, Byeong Hyeon Kim, Hyungseob Lim, Inseon Jang, Hong-Goo Kang
评论: 被接受至2025年国际语音通信会议
主题: 音频与语音处理 (eess.AS) ; 人工智能 (cs.AI) ; 信号处理 (eess.SP)

频谱带复制(SBR)通过从低频带生成高频带实现高效编码。然而,它仅在子带信号复制的基础上利用粗略的频谱特征,限制了对各种声学信号的适应性。在本文中,我们探讨了一种基于深度神经网络(DNN)的生成方法在编码高频带中的有效性,我们称之为神经频谱带生成(n-SBG)。具体而言,我们提出了一种基于DNN的编码器-解码器结构,以提取和量化与高频分量相关的辅助信息,并在给定辅助信息和解码的核心带信号的情况下生成这些分量。整个编码流程通过生成对抗标准进行优化,以实现感知上合理的声音生成。通过使用AAC作为核心编解码器的实验,我们表明所提出的方法在使用更少辅助信息的情况下,比HE-AAC-v1具有更好的感知质量。

Spectral band replication (SBR) enables bit-efficient coding by generating high-frequency bands from the low-frequency ones. However, it only utilizes coarse spectral features upon a subband-wise signal replication, limiting adaptability to diverse acoustic signals. In this paper, we explore the efficacy of a deep neural network (DNN)-based generative approach for coding the high-frequency bands, which we call neural spectral band generation (n-SBG). Specifically, we propose a DNN-based encoder-decoder structure to extract and quantize the side information related to the high-frequency components and generate the components given both the side information and the decoded core-band signals. The whole coding pipeline is optimized with generative adversarial criteria to enable the generation of perceptually plausible sound. From experiments using AAC as the core codec, we show that the proposed method achieves a better perceptual quality than HE-AAC-v1 with much less side information.

[53] arXiv:2507.10808 (替换) [中文pdf, pdf, html, 其他]
标题: 对比-KAN:一种用于网络安全的半监督入侵检测框架,适用于标记数据稀缺的情况
标题: Contrastive-KAN: A Semi-Supervised Intrusion Detection Framework for Cybersecurity with scarce Labeled Data
Mohammad Alikhani, Reza Kazemi
主题: 密码学与安全 (cs.CR) ; 信号处理 (eess.SP) ; 系统与控制 (eess.SY)

在第四次工业革命时代,网络安全和入侵检测系统对于物联网和工业物联网环境的安全可靠运行至关重要。 该领域的一个关键挑战是标记的网络攻击数据稀缺,因为大多数工业系统在正常条件下运行。 这种数据不平衡与标注的高成本相结合,阻碍了机器学习模型的有效训练。 此外,在关键基础设施中,快速检测攻击至关重要,以防止大规模中断。 为了解决这些挑战,我们提出了一种基于半监督对比学习框架的实时入侵检测系统,使用科莫戈罗夫-阿诺德网络(KAN)。 我们的方法利用丰富的未标记数据来有效区分正常行为和攻击行为。 我们在三个基准数据集UNSW-NB15、BoT-IoT和Gas Pipeline上验证了我们的方法,分别仅使用2.20%、1.28%和8%的标记样本,以模拟现实条件。 实验结果表明,我们的方法优于现有的基于对比学习的方法。 我们进一步将KAN与传统的多层感知机(MLP)进行比较,证明了KAN在有限监督下的检测准确性和鲁棒性方面表现更优。 KAN建模复杂关系的能力以及可学习的激活函数也得到了探索和可视化,提供了可解释性并具有规则提取的潜力。 该方法支持多类分类,并在安全、关键环境中证明了其有效性,其中可靠性至关重要。

In the era of the Fourth Industrial Revolution, cybersecurity and intrusion detection systems are vital for the secure and reliable operation of IoT and IIoT environments. A key challenge in this domain is the scarcity of labeled cyberattack data, as most industrial systems operate under normal conditions. This data imbalance, combined with the high cost of annotation, hinders the effective training of machine learning models. Moreover, the rapid detection of attacks is essential, especially in critical infrastructure, to prevent large-scale disruptions. To address these challenges, we propose a real-time intrusion detection system based on a semi-supervised contrastive learning framework using the Kolmogorov-Arnold Network (KAN). Our method leverages abundant unlabeled data to effectively distinguish between normal and attack behaviors. We validate our approach on three benchmark datasets, UNSW-NB15, BoT-IoT, and Gas Pipeline, using only 2.20%, 1.28%, and 8% of labeled samples, respectively, to simulate real-world conditions. Experimental results show that our method outperforms existing contrastive learning-based approaches. We further compare KAN with a traditional multilayer perceptron (MLP), demonstrating KAN's superior performance in both detection accuracy and robustness under limited supervision. KAN's ability to model complex relationships, along with its learnable activation functions, is also explored and visualized, offering interpretability and the potential for rule extraction. The method supports multi-class classification and proves effective in safety-critical environments where reliability is paramount.
