CenXiv.org

Data Structures and Algorithms

  • New submissions
  • Cross-lists
  • Replacements


Showing new listings for Thursday, 25 September 2025

Total of 19 entries

New submissions (showing 9 of 9 entries)

[1] arXiv:2509.19528 [cn-pdf, pdf, html, other]
Title: A Note on Fine-Grained Quantum Reductions for Linear Algebraic Problems
Kyle Doney, Cameron Musco
Subjects: Data Structures and Algorithms (cs.DS) ; Computational Complexity (cs.CC) ; Numerical Analysis (math.NA)

We observe that any $T(n)$ time algorithm (quantum or classical) for several central linear algebraic problems, such as computing $\det(A)$, $tr(A^3)$, or $tr(A^{-1})$ for an $n \times n$ integer matrix $A$, yields an $O(T(n)) + \tilde O(n^2)$ time \textit{quantum algorithm} for $n \times n$ matrix-matrix multiplication. That is, on quantum computers, the complexity of these problems is essentially equivalent to that of matrix multiplication. Our results follow by first observing that the Bernstein-Vazirani algorithm gives a direct quantum reduction from matrix multiplication to computing $tr(ABC)$ for $n \times n$ inputs $A,B,C$. We can then reduce $tr(ABC)$ to each of our problems of interest. For the above problems, and many others in linear algebra, their fastest known algorithms require $\Theta(n^\omega)$ time, where $\omega \approx 2.37$ is the current exponent of fast matrix multiplication. Our finding shows that any improvements beyond this barrier would lead to faster quantum algorithms for matrix multiplication. Our results complement existing reductions from matrix multiplication in algebraic circuits [BCS13], and reductions that work for standard classical algorithms, but are not tight -- i.e., which roughly show that an $O(n^{3-\delta})$ time algorithm for the problem yields an $O(n^{3-\delta/3})$ time matrix multiplication algorithm [WW10].

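The classical heart of this reduction is the identity $(AB)_{ij} = tr(ABC)$ for $C = e_j e_i^T$, which the Bernstein-Vazirani step then queries in superposition. A minimal pure-Python sketch of the identity (illustrative only; the quantum part is not shown):

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

A = [[1, 2, 0], [3, -1, 4], [0, 5, 2]]
B = [[2, 0, 1], [1, 3, -2], [4, 1, 0]]
n, (i, j) = 3, (0, 2)

# C = e_j e_i^T: a single trace query tr(ABC) reveals entry (i, j) of AB
C = [[1 if (r == j and c == i) else 0 for c in range(n)] for r in range(n)]
assert trace(matmul(matmul(A, B), C)) == matmul(A, B)[i][j]
```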

[2] arXiv:2509.19655 [cn-pdf, pdf, html, other]
Title: A Better-Than-$5/4$-Approximation for Two-Edge Connectivity
Felix Hommelsheim, Alexander Lindermayr, Zhenwei Liu
Subjects: Data Structures and Algorithms (cs.DS)

The 2-Edge-Connected Spanning Subgraph Problem (2ECSS) is a fundamental problem in survivable network design. Given an undirected $2$-edge-connected graph, the goal is to find a $2$-edge-connected spanning subgraph with the minimum number of edges; a graph is 2-edge-connected if it is connected after the removal of any single edge. 2ECSS is APX-hard and has been extensively studied in the context of approximation algorithms. Very recently, Bosch-Calvo, Garg, Grandoni, Hommelsheim, Jabal Ameli, and Lindermayr showed the currently best-known approximation ratio of $\frac{5}{4}$ [STOC 2025]. This factor is tight for many of their techniques and arguments, and it was not clear whether $\frac{5}{4}$ can be improved. We break this natural barrier and present a $(\frac{5}{4} - \eta)$-approximation algorithm, for some constant $\eta \geq 10^{-6}$. On a high level, we follow the approach of previous works: take a triangle-free $2$-edge cover and transform it into a 2-edge-connected spanning subgraph by adding only a few additional edges. For $\geq \frac{5}{4}$-approximations, one can heavily exploit that a $4$-cycle in the 2-edge cover can ``buy'' one additional edge. This enables simple and nice techniques, but immediately fails for our improved approximation ratio. To overcome this, we design two complementary algorithms that perform well for different scenarios: one for few $4$-cycles and one for many $4$-cycles. Besides this, there appear more obstructions when breaching $\frac54$, which we surpass via new techniques such as colorful bridge covering, rich vertices, and branching gluing paths.

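For intuition, the defining property itself can be checked naively by deleting each edge and re-testing connectivity; a small definitional sketch (unrelated to the approximation algorithm above):

```python
def connected(n, edges):
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {0}, [0]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n

def two_edge_connected(n, edges):
    # connected, and no single edge removal disconnects the graph
    return connected(n, edges) and all(
        connected(n, [e for e in edges if e != f]) for f in edges)

cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]  # a cycle survives any deletion
path3 = [(0, 1), (1, 2)]                   # every path edge is a bridge
assert two_edge_connected(4, cycle4)
assert not two_edge_connected(3, path3)
```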

[3] arXiv:2509.19662 [cn-pdf, pdf, html, other]
Title: Non-Clairvoyant Scheduling with Progress Bars
Ziyad Benomar, Romain Cosson, Alexander Lindermayr, Jens Schlöter
Comments: NeurIPS 2025
Subjects: Data Structures and Algorithms (cs.DS)

In non-clairvoyant scheduling, the goal is to minimize the total job completion time without prior knowledge of individual job processing times. This classical online optimization problem has recently gained attention through the framework of learning-augmented algorithms. We introduce a natural setting in which the scheduler receives continuous feedback in the form of progress bars: estimates of the fraction of each job completed over time. We design new algorithms for both adversarial and stochastic progress bars and prove strong competitive bounds. Our results in the adversarial case surprisingly induce improved guarantees for learning-augmented scheduling with job size predictions. We also introduce a general method for combining scheduling algorithms, yielding further insights in scheduling with predictions. Finally, we propose a stochastic model of progress bars as a more optimistic alternative to conventional worst-case models, and present an asymptotically optimal scheduling algorithm in this setting.

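As background, the classical non-clairvoyant baseline is Round-Robin, which processes all unfinished jobs at equal rates and is 2-competitive for total completion time. The sketch below simulates it against the clairvoyant SPT optimum (an illustration of the setting, not of the paper's progress-bar algorithms):

```python
def round_robin_completions(sizes):
    # process all alive jobs at equal rates; return total completion time
    remaining = dict(enumerate(sizes))
    t, done = 0.0, {}
    while remaining:
        k = len(remaining)
        j = min(remaining, key=remaining.get)  # next job to finish
        dt = remaining[j] * k                  # wall-clock time until then
        t += dt
        for i in remaining:
            remaining[i] -= dt / k
        done[j] = t
        del remaining[j]
    return sum(done.values())

def opt_total_completion(sizes):
    # clairvoyant optimum: shortest processing time first
    t, total = 0.0, 0.0
    for s in sorted(sizes):
        t += s
        total += t
    return total

sizes = [3.0, 1.0, 2.0]
rr, opt = round_robin_completions(sizes), opt_total_completion(sizes)
assert opt <= rr <= 2 * opt  # Round-Robin is 2-competitive
```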

[4] arXiv:2509.19703 [cn-pdf, pdf, html, other]
Title: SS-GUMAP, SL-GUMAP, SSSL-GUMAP: Fast UMAP Algorithms for Large Graph Drawing
Amyra Meidiana, Seok-Hee Hong
Subjects: Data Structures and Algorithms (cs.DS)

UMAP is a popular neighborhood-preserving dimension reduction (DR) algorithm. However, its application to graph drawing has not been evaluated. Moreover, a naive application of UMAP to graph drawing would include O(nm) time all-pairs shortest path computation, which is not scalable to visualizing large graphs. In this paper, we present fast UMAP-based algorithms for graph drawing. Specifically, we present three such algorithms: (1) The SS-GUMAP algorithm utilizes spectral sparsification to compute a subgraph G' preserving important properties of a graph G, reducing the O(nm) component of the runtime to O(n^2 log n); (2) The SL-GUMAP algorithm reduces the kNN (k-Nearest Neighbors) graph computation from $O(n \log n)$ time to linear time using partial BFS (Breadth First Search), and the cost optimization runtime from O(n) time to sublinear time using edge sampling; (3) The SSSL-GUMAP algorithm combines both approaches, for an overall O(n) runtime. Experiments demonstrate that SS-GUMAP runs 28% faster than GUMAP, a naive application of UMAP to graph drawing, with similar quality metrics, while SL-GUMAP and SSSL-GUMAP run over 80% faster than GUMAP with less than 15% difference on average across all quality metrics. We also present a comparison of GUMAP with tsNET, a graph layout algorithm based on the popular DR algorithm t-SNE. GUMAP runs 90% faster than tsNET with similar neighborhood preservation and, on average, 10% better quality on metrics such as stress, edge crossing, and shape-based metrics, validating the effectiveness of UMAP for graph drawing.

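The partial-BFS idea for the kNN graph can be sketched directly: stop the traversal from each source once k neighbors have been collected, so the per-source work is O(k) rather than a full shortest-path pass (a hypothetical minimal version, not the authors' implementation):

```python
from collections import deque

def knn_partial_bfs(adj, src, k):
    # truncated BFS: stop as soon as k neighbors have been collected
    seen, order, q = {src}, [], deque([src])
    while q and len(order) < k:
        for w in adj[q.popleft()]:
            if w not in seen:
                seen.add(w)
                order.append(w)
                q.append(w)
                if len(order) == k:
                    break
    return order

adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
assert knn_partial_bfs(adj, 0, 3) == [1, 2, 3]
```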

[5] arXiv:2509.19740 [cn-pdf, pdf, other]
Title: Geometric Interpretation of 3-SAT and Phase Transition
Frederic Gillet
Subjects: Data Structures and Algorithms (cs.DS)

Interpretation of 3-SAT as a volume filling problem, and its use to explore the SAT/UNSAT phase transition.

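The phase transition in question: random 3-SAT formulas flip from almost-surely satisfiable to almost-surely unsatisfiable as the clause-to-variable ratio crosses roughly 4.27. A brute-force sketch that exhibits the effect at small n (illustrative of the phenomenon only, not of the paper's geometric interpretation):

```python
import random
from itertools import product

def random_3sat(n, m, rng):
    # m clauses, each over 3 distinct variables with random signs
    return [tuple(rng.choice([-1, 1]) * v
                  for v in rng.sample(range(1, n + 1), 3))
            for _ in range(m)]

def satisfiable(n, clauses):
    return any(all(any((lit > 0) == assign[abs(lit) - 1] for lit in c)
                   for c in clauses)
               for assign in product([False, True], repeat=n))

rng = random.Random(0)
n, trials = 10, 20

def sat_rate(ratio):
    return sum(satisfiable(n, random_3sat(n, int(ratio * n), rng))
               for _ in range(trials)) / trials

below, above = sat_rate(2.0), sat_rate(6.0)
assert below > above  # mostly SAT below the threshold, mostly UNSAT above
```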

[6] arXiv:2509.19785 [cn-pdf, pdf, html, other]
Title: BH-tsNET, FIt-tsNET, L-tsNET: Fast tsNET Algorithms for Large Graph Drawing
Amyra Meidiana, Seok-Hee Hong, Kwan-Liu Ma
Subjects: Data Structures and Algorithms (cs.DS)

The tsNET algorithm utilizes t-SNE to compute high-quality graph drawings, preserving the neighborhood and clustering structure. We present three fast algorithms for reducing the time complexity of tsNET algorithm from O(nm) time to O(n log n) time and O(n) time. To reduce the runtime of tsNET, there are three components that need to be reduced: (C0) computation of high-dimensional probabilities, (C1) computation of KL divergence gradient, and (C2) entropy computation. Specifically, we reduce the overall runtime of tsNET, integrating our new fast approaches for C0 and C2 with fast t-SNE algorithms for C1. We first present O(n log n)-time BH-tsNET, based on (C0) new O(n)-time partial BFS-based high-dimensional probability computation and (C2) new O(n log n)-time quadtree-based entropy computation, integrated with (C1) O(n log n)-time quadtree-based KL divergence computation of BH-SNE. We next present faster O(n log n)-time FIt-tsNET, using (C0) O(n)-time partial BFS-based high-dimensional probability computation and (C2) quadtree-based O(n log n)-time entropy computation, integrated with (C1) O(n)-time interpolation-based KL divergence computation of FIt-SNE. Finally, we present the O(n)-time L-tsNET, integrating (C2) new O(n)-time FFT-accelerated interpolation-based entropy computation with (C0) O(n)-time partial BFS-based high-dimensional probability computation, and (C1) O(n)-time interpolation-based KL divergence computation of FIt-SNE. Extensive experiments using benchmark data sets confirm that BH-tsNET, FIt-tsNET, and L-tsNET outperform tsNET, running 93.5%, 96%, and 98.6% faster while computing similar quality drawings in terms of quality metrics (neighborhood preservation, stress, edge crossing, and shape-based metrics) and visual comparison. We also present a comparison between our algorithms and DRGraph, another dimension reduction-based graph drawing algorithm.

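The two quantities being accelerated can be written down directly: C1 involves the KL divergence between high- and low-dimensional neighborhood distributions, and C2 is the entropy term in the tsNET objective. A minimal sketch of the quantities themselves (not of the fast approximations):

```python
from math import log

def kl_divergence(p, q):
    # KL(P || Q): the mismatch t-SNE/tsNET minimizes between
    # high-dimensional (p) and low-dimensional (q) affinities
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def entropy(p):
    # the entropy term appearing in the tsNET objective
    return -sum(pi * log(pi) for pi in p if pi > 0)

p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
assert kl_divergence(p, p) == 0.0
assert kl_divergence(p, q) > 0
assert abs(entropy([0.25] * 4) - log(4)) < 1e-12  # uniform maximizes entropy
```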

[7] arXiv:2509.19914 [cn-pdf, pdf, html, other]
Title: Stealing From the Dragon's Hoard: Online Unbounded Knapsack With Removal
Matthias Gehnen, Moritz Stocker
Subjects: Data Structures and Algorithms (cs.DS)

We introduce the Online Unbounded Knapsack Problem with Removal, a variation of the well-known Online Knapsack Problem. Items, each with a weight and value, arrive online and an algorithm must decide whether or not to pack them into a knapsack with a fixed weight limit. An item may be packed an arbitrary number of times and items may be removed from the knapsack at any time without cost. The goal is to maximize the total value of items packed while respecting the weight limit. We show that this is one of the very few natural online knapsack variants that allow for competitive deterministic algorithms in the general setting, by providing an algorithm with a competitive ratio of 1.6911. We complement this with a lower bound of 1.5877. We also analyze the proportional setting, where the weight and value of any single item agree, and show that deterministic algorithms can be exactly 3/2-competitive. Lastly, we give lower and upper bounds of 6/5 and 4/3 on the competitive ratio of randomized algorithms in this setting.

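Since items may be repacked at any time without cost, the natural offline benchmark after each arrival is an unbounded knapsack over the items seen so far; in the proportional setting (value = weight) it looks as follows (a benchmark sketch, not the 1.6911-competitive algorithm):

```python
def best_packing(limit, weights):
    # offline optimum in the proportional setting: unbounded knapsack
    # where each item's value equals its weight
    best = [0] * (limit + 1)
    for cap in range(1, limit + 1):
        best[cap] = max([best[cap - w] + w for w in weights if w <= cap],
                        default=0)
    return best[limit]

# with removal, an online algorithm may repack after every arrival;
# the offline optimum over the items seen so far is the benchmark
assert best_packing(10, [7]) == 7
assert best_packing(10, [7, 3]) == 10  # 7 + 3
assert best_packing(10, [4]) == 8      # two copies of the same item
```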

[8] arXiv:2509.20304 [cn-pdf, pdf, html, other]
Title: Ads that Stick: Near-Optimal Ad Optimization through Psychological Behavior Models
Kailash Gopal Darmasubramanian, Akash Pareek, Arindam Khan, Arpit Agarwal
Subjects: Data Structures and Algorithms (cs.DS) ; Machine Learning (cs.LG)

Optimizing the timing and frequency of ads is a central problem in digital advertising, with significant economic consequences. Existing scheduling policies rely on simple heuristics, such as uniform spacing and frequency caps, that overlook long-term user interest. However, it is well-known that users' long-term interest and engagement result from the interplay of several psychological effects (Curmei, Haupt, Recht, Hadfield-Menell, ACM CRS, 2022). In this work, we model change in user interest upon showing ads based on three key psychological principles: mere exposure, hedonic adaptation, and operant conditioning. The first two effects are modeled using a concave function of user interest with repeated exposure, while the third effect is modeled using a temporal decay function, which explains the decline in user interest due to overexposure. Under our psychological behavior model, we ask the following question: Given a continuous time interval $T$, how many ads should be shown, and at what times, to maximize user interest in the ads? Towards answering this question, we first show that, if the number of displayed ads is fixed, then the optimal ad-schedule only depends on the operant conditioning function. Our main result is a quasi-linear time algorithm that outputs a near-optimal ad-schedule, i.e., the difference in the performance of our schedule and the optimal schedule is exponentially small. Our algorithm leads to significant insights about optimal ad placement and shows that simple heuristics such as uniform spacing are sub-optimal under many natural settings. The optimal number of ads to display, which also depends on the mere exposure and hedonic adaptation functions, can then be found through a simple linear search. We further support our findings with experimental results, demonstrating that our strategy outperforms various baselines.

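To make the model concrete, here is one hypothetical instantiation: each ad's contribution fades exponentially over time (an operant-conditioning-style decay), and accumulated interest is a concave function of the total effective exposure (capturing mere exposure and hedonic adaptation). All functional forms below are assumptions for illustration, not the paper's:

```python
from math import exp, sqrt

def interest(times, decay=1.0, horizon=10.0):
    # effective exposure at the horizon: each ad's impact decays
    # exponentially with the time elapsed since it was shown ...
    exposure = sum(exp(-decay * (horizon - t)) for t in times)
    # ... and accumulated interest is a concave function of exposure
    return sqrt(exposure)

uniform = [2.5, 5.0, 7.5, 10.0]
late = [8.5, 9.0, 9.5, 10.0]
# under strong decay, uniformly spaced ads waste early impressions;
# schedules concentrated near the evaluation time can do better
assert interest(late) > interest(uniform)
```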

[9] arXiv:2509.20351 [cn-pdf, pdf, html, other]
Title: Testable algorithms for approximately counting edges and triangles in sublinear time and space
Talya Eden, Ronitt Rubinfeld, Arsen Vasilyan
Subjects: Data Structures and Algorithms (cs.DS)

We consider the fundamental problems of approximately counting the numbers of edges and triangles in a graph in sublinear time. Previous algorithms for these tasks are significantly more efficient under a promise that the arboricity of the graph is bounded by some parameter $\overline{\alpha}$. However, when this promise is violated, the estimates given by these algorithms are no longer guaranteed to be correct. For the triangle counting task, we give an algorithm that requires no promise on the input graph $G$, and computes a $(1\pm \epsilon)$-approximation for the number of triangles $t$ in $G$ in time $O^*\left( \frac{m\cdot \alpha(G)}{t} + \frac{m}{t^{2/3}} \right)$, where $\alpha(G)$ is the arboricity of the graph. The algorithm can be used on any graph $G$ (no prior knowledge of the arboricity $\alpha(G)$ is required), and the algorithm adapts its run-time on the fly based on the graph $G$. We accomplish this by trying a sequence of candidate values $\tilde{\alpha}$ for $\alpha(G)$ and using a novel algorithm in the framework of testable algorithms. This ensures that wrong candidates $\tilde{\alpha}$ cannot lead to incorrect estimates: as long as the advice is incorrect, the algorithm detects it and continues with a new candidate. Once the algorithm accepts the candidate, its output is guaranteed to be correct with high probability. We prove that this approach preserves, up to an additive overhead, the dramatic efficiency gains obtainable when good arboricity bounds are known in advance, while ensuring robustness against misleading advice. We further complement this result with a lower bound, showing that such an overhead is unavoidable whenever the advice may be faulty. We further demonstrate implications of our results for triangle counting in the streaming model.

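The $m\cdot \alpha(G)$ term mirrors the classic exact-counting bound: orient edges along a (proxy) degeneracy order so each vertex has small out-degree, and count each triangle exactly once. A sketch of that classical idea (not of the paper's sublinear-time algorithm):

```python
def triangles_via_degeneracy(adj):
    # orient each edge from lower to higher degree (a proxy for a
    # degeneracy ordering); out-degrees are then bounded in terms of
    # the arboricity, and every triangle is counted exactly once
    rank = {v: (len(adj[v]), v) for v in adj}
    out = {v: {w for w in adj[v] if rank[w] > rank[v]} for v in adj}
    return sum(len(out[u] & out[w]) for u in adj for w in out[u])

# K4 has 4 triangles; the 5-cycle has none
k4 = {i: [j for j in range(4) if j != i] for i in range(4)}
c5 = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
assert triangles_via_degeneracy(k4) == 4
assert triangles_via_degeneracy(c5) == 0
```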

Cross submissions (showing 4 of 4 entries)

[10] arXiv:2509.19598 (cross-list from cs.IT) [cn-pdf, pdf, html, other]
Title: Efficient $\varepsilon$-approximate minimum-entropy couplings
Spencer Compton
Subjects: Information Theory (cs.IT) ; Data Structures and Algorithms (cs.DS)

Given $m \ge 2$ discrete probability distributions over $n$ states each, the minimum-entropy coupling is the minimum-entropy joint distribution whose marginals are the same as the input distributions. Computing the minimum-entropy coupling is NP-hard, but there has been significant progress in designing approximation algorithms; prior to this work, the best known polynomial-time algorithms attain guarantees of the form $H(\operatorname{ALG}) \le H(\operatorname{OPT}) + c$, where $c \approx 0.53$ for $m=2$, and $c \approx 1.22$ for general $m$ [CKQGK '23]. A main open question is whether this task is APX-hard, or whether there exists a polynomial-time approximation scheme (PTAS). In this work, we design an algorithm that produces a coupling with entropy $H(\operatorname{ALG}) \le H(\operatorname{OPT}) + \varepsilon$ in running time $n^{O(\operatorname{poly}(1/\varepsilon) \cdot \operatorname{exp}(m) )}$: showing a PTAS exists for constant $m$.

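The object itself can be made concrete with the classic greedy heuristic for $m=2$: repeatedly pair the largest remaining masses of the two marginals. Prior work gives additive guarantees for this heuristic; it is shown here only to illustrate what a low-entropy coupling looks like, not the paper's PTAS:

```python
from math import log2

def greedy_coupling(p, q, eps=1e-12):
    # repeatedly assign as much mass as possible to the pair of
    # currently largest marginal entries
    p, q = list(p), list(q)
    joint = []
    while max(p) > eps:
        i = max(range(len(p)), key=p.__getitem__)
        j = max(range(len(q)), key=q.__getitem__)
        m = min(p[i], q[j])
        joint.append(((i, j), m))
        p[i] -= m
        q[j] -= m
    return joint

def entropy(masses):
    return -sum(m * log2(m) for m in masses if m > 0)

joint = greedy_coupling([0.5, 0.5], [0.5, 0.25, 0.25])
masses = [m for _, m in joint]
assert abs(sum(masses) - 1.0) < 1e-9      # a valid joint distribution
assert abs(entropy(masses) - 1.5) < 1e-9  # hits max(H(p), H(q)) here
```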

[11] arXiv:2509.19718 (cross-list from math.OC) [cn-pdf, pdf, html, other]
Title: ALNS for Tugboat Scheduling in Inland Waterway
Zihang Ma
Subjects: Optimization and Control (math.OC) ; Data Structures and Algorithms (cs.DS)

This paper focuses on the barge shipping problem, also known as the tugboat scheduling problem, in a scenario where a single tugboat can tow multiple barges and conduct multiple trips in a drop-and-pull mode during a daily work shift. The problem is formalized as mixed-integer programming models. To tackle real-world-sized problem instances, an adaptive large neighborhood search (ALNS) algorithm integrated with a decoding mathematical model is proposed. On large-scale instances, the ALNS algorithm outperforms the strengthened mathematical model.


[12] arXiv:2509.19966 (cross-list from quant-ph) [cn-pdf, pdf, html, other]
Title: No Quantum Advantage in Decoded Quantum Interferometry for MaxCut
Ojas Parekh
Comments: 12 pages
Subjects: Quantum Physics (quant-ph) ; Data Structures and Algorithms (cs.DS)

Decoded Quantum Interferometry (DQI) is a framework for approximating special kinds of discrete optimization problems that relies on problem structure in a way that sets it apart from other classical or quantum approaches. We show that the instances of MaxCut on which DQI attains a nontrivial asymptotic approximation guarantee are solvable exactly in classical polynomial time. We include a streamlined exposition of DQI tailored for MaxCut that relies on elementary graph theory instead of coding theory to motivate and explain the algorithm.


[13] arXiv:2509.20183 (cross-list from quant-ph) [cn-pdf, pdf, other]
Title: Dequantization and Hardness of Spectral Sum Estimation
Roman Edenhofer, Atsuya Hasegawa, François Le Gall
Subjects: Quantum Physics (quant-ph) ; Computational Complexity (cs.CC) ; Data Structures and Algorithms (cs.DS)

We give new dequantization and hardness results for estimating spectral sums of matrices, such as the log-determinant. Recent quantum algorithms have demonstrated that the logarithm of the determinant of sparse, well-conditioned, positive matrices can be approximated to $\varepsilon$-relative accuracy in time polylogarithmic in the dimension $N$, specifically in time $\mathrm{poly}(\mathrm{log}(N), s, \kappa, 1/\varepsilon)$, where $s$ is the sparsity and $\kappa$ the condition number of the input matrix. We provide a simple dequantization of these techniques that preserves the polylogarithmic dependence on the dimension. Our classical algorithm runs in time $\mathrm{polylog}(N)\cdot s^{O(\sqrt{\kappa}\log \kappa/\varepsilon)}$ which constitutes an exponential improvement over previous classical algorithms in certain parameter regimes. We complement our classical upper bound with $\mathsf{DQC1}$-completeness results for estimating specific spectral sums such as the trace of the inverse and the trace of matrix powers for log-local Hamiltonians, with parameter scalings analogous to those of known quantum algorithms. Assuming $\mathsf{BPP}\subsetneq\mathsf{DQC1}$, this rules out classical algorithms with the same scalings. It also resolves a main open problem of Cade and Montanaro (TQC 2018) concerning the complexity of Schatten-$p$ norm estimation. We further analyze a block-encoding input model, where instead of a classical description of a sparse matrix, we are given a block-encoding of it. We show $\mathsf{DQC}1$-completeness in a very general way in this model for estimating $\mathrm{tr}[f(A)]$ whenever $f$ and $f^{-1}$ are sufficiently smooth. We conclude our work with $\mathsf{BQP}$-hardness and $\mathsf{PP}$-completeness results for high-accuracy log-determinant estimation.

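The flavor of such spectral-sum estimators: for a positive matrix with spectrum bounded away from 0 and 2, $\log\det A = tr(\log A)$ expands as $-\sum_{k\ge 1} tr((I-A)^k)/k$, and good conditioning controls how many terms are needed. A pure-Python sketch of this truncated expansion (a toy illustration of the setting, not the paper's algorithm):

```python
from math import log

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

def logdet_series(A, terms=60):
    # log det A = tr log A = -sum_k tr((I - A)^k) / k, for ||I - A|| < 1
    n = len(A)
    E = [[(1 if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)]
    P = [row[:] for row in E]
    total = 0.0
    for k in range(1, terms + 1):
        total -= trace(P) / k
        P = matmul(P, E)
    return total

A = [[0.8, 0.1], [0.1, 0.9]]  # well-conditioned: eigenvalues near 1
approx = logdet_series(A)
assert abs(approx - log(0.8 * 0.9 - 0.1 * 0.1)) < 1e-9  # log det(A)
```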

Replacement submissions (showing 6 of 6 entries)

[14] arXiv:2503.06464 (replaced) [cn-pdf, pdf, other]
Title: Detecting Correlation Efficiently in Stochastic Block Models: Breaking Otter's Threshold in the Entire Supercritical Regime
Guanyi Chen, Jian Ding, Shuyang Gong, Zhangsong Li
Comments: This substantially improves the result in an earlier version. 98 pages, 13 figures
Subjects: Data Structures and Algorithms (cs.DS) ; Probability (math.PR) ; Statistics Theory (math.ST)

Consider a pair of sparse correlated stochastic block models $\mathcal S(n,\tfrac{\lambda}{n},\epsilon;s)$ subsampled from a common parent stochastic block model with two symmetric communities, average degree $\lambda=O(1)$, divergence parameter $\epsilon\in (0,1)$ and subsampling probability $s$. For all $\epsilon\in(0,1)$ and $\Delta>0$, we construct a statistic based on the combination of two low-degree polynomials and show that there exists a sufficiently small constant $\delta=\delta(\epsilon,\lambda,\Delta)>0$ such that if $\epsilon^2 \lambda s>1+\Delta$ and $s>\sqrt{\alpha}-\delta$ where $\alpha\approx 0.338$ is Otter's constant, this statistic can distinguish this model and a pair of independent stochastic block models $\mathcal S(n,\tfrac{\lambda s}{n},\epsilon)$ with probability $1-o(1)$. We also provide an efficient algorithm that approximates this statistic in polynomial time. The crux of our statistic's construction lies in a carefully curated family of multigraphs called \emph{decorated trees}, which enables effective aggregation of the community signal and graph correlation by leveraging the counts of the same decorated tree while suppressing the undesirable correlations among counts of different decorated trees. We believe such construction may be of independent interest.


[15] arXiv:2506.23906 (replaced) [cn-pdf, pdf, html, other]
Title: Segmented Operations using Matrix Multiplications
Aleksandros Sobczyk, Giuseppe Sorrentino, Anastasios Zouzias
Subjects: Data Structures and Algorithms (cs.DS) ; Computational Complexity (cs.CC) ; Distributed, Parallel, and Cluster Computing (cs.DC)

Specialized computational units that perform small matrix multiplications as primitive operations are typically present in modern AI accelerators. However, these Matrix Multiplication Units (MMUs) are often underutilized for many fundamental deep learning operations besides dense matrix multiplications. At the same time, the lack of a rigorous theoretical model of computation for such architectures obstructs algorithmic design. In this work, we propose MMV-RAM, a computational model which judiciously extends the Vector-RAM model with an additional MMU. We provide a detailed theoretical analysis and carefully balance the computational power between the matrix and vector units, guided by the circuit complexity lower bound that parity is not in $\mathsf{AC}^0$. Given MMV-RAM, we proceed to algorithm design, starting with two fundamental parallel operations: segmented scan and sum. By expressing them as compositions of elementary parallel primitives (e.g., seg. sum reduces to: scan, compress, and vector differentiation), we can exploit MMUs to perform speculative blocked computations, ultimately leading to provable theoretical speed-ups over vector-only approaches. These results extend to other ubiquitous AI kernels, including dense matrix product, and sparse matrix-vector product. As a case study, we implemented the proposed algorithms on the Ascend 910B AI accelerator, which contains matrix and vector cores. We evaluate these implementations on synthetic and real-world datasets from various applications, including Large Language Models.

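The decomposition mentioned for segmented sum can be written out directly: an inclusive scan, a compress that keeps the scan value at the end of each segment, and adjacent differences (a scalar reference version; the paper's contribution is mapping these primitives onto the MMU):

```python
def segmented_sum(values, flags):
    # flags[i] == 1 marks the start of a new segment
    # step 1: inclusive scan (prefix sums)
    scan, total = [], 0
    for v in values:
        total += v
        scan.append(total)
    # step 2: compress, keeping the scan value at each segment's end
    ends = [i - 1 for i, f in enumerate(flags) if f == 1 and i > 0]
    ends.append(len(values) - 1)
    picked = [scan[i] for i in ends]
    # step 3: adjacent differences recover per-segment sums
    return [picked[0]] + [picked[k] - picked[k - 1]
                          for k in range(1, len(picked))]

values = [1, 2, 3, 4, 5, 6]
flags = [1, 0, 0, 1, 0, 1]  # segments: [1,2,3], [4,5], [6]
assert segmented_sum(values, flags) == [6, 9, 6]
```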

[16] arXiv:2210.06140 (replaced) [cn-pdf, pdf, html, other]
Title: Differentially Private Bootstrap: New Privacy Analysis and Inference Strategies
Zhanyu Wang, Guang Cheng, Jordan Awan
Comments: 22 pages before appendices and references. 50 pages total
Subjects: Machine Learning (stat.ML) ; Cryptography and Security (cs.CR) ; Data Structures and Algorithms (cs.DS) ; Machine Learning (cs.LG)

Differentially private (DP) mechanisms protect individual-level information by introducing randomness into the statistical analysis procedure. Despite the availability of numerous DP tools, there remains a lack of general techniques for conducting statistical inference under DP. We examine a DP bootstrap procedure that releases multiple private bootstrap estimates to infer the sampling distribution and construct confidence intervals (CIs). Our privacy analysis presents new results on the privacy cost of a single DP bootstrap estimate, applicable to any DP mechanism, and identifies some misapplications of the bootstrap in the existing literature. For the composition of the DP bootstrap, we present a numerical method to compute the exact privacy cost of releasing multiple DP bootstrap estimates, and using the Gaussian-DP (GDP) framework (Dong et al., 2022), we show that the release of $B$ DP bootstrap estimates from mechanisms satisfying $(\mu/\sqrt{(2-2/\mathrm{e})B})$-GDP asymptotically satisfies $\mu$-GDP as $B$ goes to infinity. Then, we perform private statistical inference by post-processing the DP bootstrap estimates. We prove that our point estimates are consistent, our standard CIs are asymptotically valid, and both enjoy optimal convergence rates. To further improve the finite-sample performance, we use deconvolution with the DP bootstrap estimates to accurately infer the sampling distribution. We derive CIs for tasks such as population mean estimation, logistic regression, and quantile regression, and we compare them to existing methods using simulations and real-world experiments on 2016 Canada Census data. Our private CIs achieve the nominal coverage level and offer the first approach to private inference for quantile regression.
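The asymptotic composition result above translates directly into a per-estimate budget calculator: to target $\mu$-GDP overall across $B$ bootstrap releases, each release runs at $\mu/\sqrt{(2-2/\mathrm{e})B}$-GDP. A minimal sketch (the function name is ours):

```python
import math

def per_bootstrap_gdp(mu_total, B):
    """Per-estimate GDP parameter so that releasing B DP bootstrap
    estimates asymptotically satisfies mu_total-GDP, per the
    (mu / sqrt((2 - 2/e) * B))-GDP composition result."""
    return mu_total / math.sqrt((2 - 2 / math.e) * B)

# Example: target 1-GDP overall with B = 1000 bootstrap estimates.
mu0 = per_bootstrap_gdp(1.0, 1000)
print(f"per-estimate budget: {mu0:.5f}")
```

Note the $1/\sqrt{B}$ scaling: quadrupling the number of bootstrap estimates halves the per-estimate budget, exactly as in standard Gaussian composition.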


[17] arXiv:2410.09296 (replaced) [cn-pdf, pdf, html, other]
Title: The 2020 United States Decennial Census Is More Private Than You (Might) Think
Buxin Su, Weijie J. Su, Chendi Wang
Subjects: Cryptography and Security (cs.CR) ; Data Structures and Algorithms (cs.DS) ; Applications (stat.AP) ; Machine Learning (stat.ML)

The U.S. Decennial Census serves as the foundation for many high-profile policy decision-making processes, including federal funding allocation and redistricting. In 2020, the Census Bureau adopted differential privacy to protect the confidentiality of individual responses through a disclosure avoidance system that injects noise into census data tabulations. The Bureau subsequently posed an open question: Could stronger privacy guarantees be obtained for the 2020 U.S. Census compared to their published guarantees, or equivalently, had the privacy budgets been fully utilized? In this paper, we address this question affirmatively by demonstrating that the 2020 U.S. Census provides significantly stronger privacy protections than its nominal guarantees suggest at each of the eight geographical levels, from the national level down to the block level. This finding is enabled by our precise tracking of privacy losses using $f$-differential privacy, applied to the composition of private queries across these geographical levels. Our analysis reveals that the Census Bureau introduced unnecessarily high levels of noise to meet the specified privacy guarantees for the 2020 Census. Consequently, we show that noise variances could be reduced by $15.08\%$ to $24.82\%$ while maintaining nearly the same level of privacy protection for each geographical level, thereby improving the accuracy of privatized census statistics. We empirically demonstrate that reducing noise injection into census statistics mitigates distortion caused by privacy constraints in downstream applications of private census data, illustrated through a study examining the relationship between earnings and education.
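To see how a tighter privacy accounting permits less noise, recall that for a sensitivity-1 Gaussian mechanism, $\mu$-GDP requires noise $\sigma = 1/\mu$, so variance scales as $1/\mu^2$. The sketch below (our illustration, not the paper's accounting; the Census DAS uses discrete Gaussian noise and a far more involved budget allocation) shows the resulting fractional variance reduction when accounting shows a larger budget $\mu$ can be spent for the same actual protection:

```python
def gaussian_variance_reduction(mu_nominal, mu_allowed):
    """Fractional variance reduction for a sensitivity-1 Gaussian
    mechanism when tighter accounting shows the budget mu_allowed
    (> mu_nominal) gives the same actual protection: since
    variance = 1/mu^2, the reduction is 1 - (mu_nominal/mu_allowed)^2."""
    return 1 - (mu_nominal / mu_allowed) ** 2

# e.g. being able to scale mu up by 10% cuts noise variance by ~17%
print(f"{gaussian_variance_reduction(1.0, 1.1):.4f}")
```

This back-of-the-envelope relation is why even modest slack in the privacy accounting translates into the double-digit percentage variance reductions reported in the abstract.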


[18] arXiv:2503.20438 (replaced) [cn-pdf, pdf, html, other]
Title: Factorised Representations of Join Queries: Tight Bounds and a New Dichotomy
Christoph Berkholz, Harry Vinall-Smeeth
Comments: 28 pages, 1 figure; v.2. improved presentation and extended discussion on related work
Subjects: Databases (cs.DB) ; Data Structures and Algorithms (cs.DS) ; Logic in Computer Science (cs.LO)

A common theme in factorised databases and knowledge compilation is the representation of solution sets in a useful yet succinct data structure. In this paper, we study the representation of the result of join queries (or, equivalently, the set of homomorphisms between two relational structures). We focus on the very general format of $\{\cup, \times\}$-circuits -- also known as d-representations or DNNF circuits -- and aim to find the limits of this approach. In prior work, it has been shown that there always exists a $\{\cup, \times\}$-circuit of size $N^{O(subw)}$ representing the query result, where $N$ is the size of the database and $subw$ the submodular width of the query. If the arity of all relations is bounded by a constant, then $subw$ is linear in the treewidth $tw$ of the query. In this setting, the authors of this paper proved a lower bound of $N^{\Omega(tw^{\varepsilon})}$ on the circuit size (ICALP 2023), where $\varepsilon>0$ depends on the excluded grid theorem. Our first main contribution is to improve this lower bound to $N^{\Omega(tw)}$, which is tight up to a constant factor in the exponent. Our second contribution is an $N^{\Omega(subw^{1/4})}$ lower bound on the circuit size for join queries over relations of unbounded arity. Both lower bounds are unconditional lower bounds on the circuit size for well-chosen database instances. Their proofs use a combination of structural (hyper)graph theory with communication complexity in a simple yet novel way. While the second lower bound is asymptotically equivalent to Marx's conditional bound on the decision complexity (JACM 2013), our $N^{\Theta(tw)}$ bound in the bounded-arity setting is tight, while the best conditional bound on the decision complexity is $N^{\Omega(tw/\log tw)}$. Note that removing this logarithmic factor in the decision setting is a major open problem.
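To make the $\{\cup, \times\}$-circuit format concrete, here is a toy evaluator (our illustration): a circuit is a tree of leaf, union, and product nodes, and product nodes let the circuit represent a large cross product without materialising it.

```python
from itertools import product

# Toy {∪, ×}-circuit (d-representation): a node is
# ('leaf', tuples), ('union', children), or ('product', children).
def enumerate_tuples(node):
    kind, payload = node
    if kind == 'leaf':
        return set(payload)
    if kind == 'union':
        return set().union(*(enumerate_tuples(c) for c in payload))
    # 'product': concatenate one tuple from each child, left to right
    return {sum(parts, ())
            for parts in product(*(enumerate_tuples(c) for c in payload))}

# (R × S) ∪ (R' × S'): 8 result tuples represented with only 8 leaf tuples,
# since the × nodes encode the cross products implicitly.
circuit = ('union', [
    ('product', [('leaf', [(1,), (2,)]), ('leaf', [(10,), (20,)])]),
    ('product', [('leaf', [(3,), (4,)]), ('leaf', [(30,), (40,)])]),
])
print(sorted(enumerate_tuples(circuit)))
```

The lower bounds in the paper say that for suitable join queries and databases, no circuit of this kind can be substantially smaller than $N^{tw}$ (bounded arity) or $N^{subw^{1/4}}$ (unbounded arity).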


[19] arXiv:2509.13966 (replaced) [cn-pdf, pdf, other]
Title: Smaller Circuits for Bit Addition
Mikhail Goncharov, Alexander S. Kulikov, Georgie Levtsov
Subjects: Computational Complexity (cs.CC) ; Discrete Mathematics (cs.DM) ; Data Structures and Algorithms (cs.DS) ; Logic in Computer Science (cs.LO)

Bit addition arises virtually everywhere in digital circuits: arithmetic operations, increment/decrement operators, computing addresses and table indices, and so on. Since bit addition is such a basic task in Boolean circuit synthesis, a lot of research has been done on constructing efficient circuits for various special cases of it. A vast majority of these results are devoted to optimizing the circuit depth (also known as delay). In this paper, we investigate the circuit size (also known as area) of bit addition over the full binary basis. Though most of the known circuits are built from Half Adders and Full Adders, we show that, in many interesting scenarios, these circuits have suboptimal size. Namely, we improve an upper bound of $5n-3m$ to $4.5n-2m$, where $n$ is the number of input bits and $m$ is the number of output bits. In the regimes where $m$ is small compared to $n$ (for example, for computing the sum of $n$ bits or multiplying two $n$-bit integers), this leads to a $10\%$ improvement. We complement our theoretical result by an open-source implementation of generators producing circuits for bit addition and multiplication. The generators allow one to produce the corresponding circuits in two lines of code and to compare them to existing designs.
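The two upper bounds are simple enough to compare numerically. The sketch below (function names are ours) evaluates both for the regime the abstract highlights, summing $n$ bits, where the result fits in $m = \lceil \log_2(n+1) \rceil$ output bits:

```python
def classical_bound(n, m):
    """Classical Half/Full-Adder-based upper bound on circuit size."""
    return 5 * n - 3 * m

def improved_bound(n, m):
    """Improved upper bound from the paper."""
    return 4.5 * n - 2 * m

# Summing n = 1024 bits: the result fits in m = 11 output bits.
n, m = 1024, 11
old, new = classical_bound(n, m), improved_bound(n, m)
print(old, new, f"{100 * (1 - new / old):.1f}% smaller")
```

Since $m = O(\log n)$ here, the ratio of the two bounds approaches $4.5/5 = 0.9$ as $n$ grows, which is the roughly $10\%$ saving claimed in the abstract.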

