[Submitted on 1 Feb 2025]

Title: Transformer-Based Vector Font Classification Using Different Font Formats: TrueType versus PostScript


Authors:Takumu Fujioka (1), Gouhei Tanaka (1 and 2) ((1) Nagoya Institute of Technology, (2) The University of Tokyo)
Abstract: Modern fonts adopt vector-based formats, which ensure scalability without loss of quality. While many deep learning studies on fonts focus on bitmap formats, deep learning for vector fonts remains underexplored. In studies involving deep learning for vector fonts, the choice of font representation has often been made by convention. However, the font representation format is one of the factors that can influence the computational performance of machine learning models in font-related tasks. Here we show that font representations based on PostScript outlines outperform those based on TrueType outlines in Transformer-based vector font classification. TrueType outlines represent character shapes as sequences of points and their associated flags, whereas PostScript outlines represent them as sequences of drawing commands. In previous research, PostScript outlines have predominantly been used when fonts are treated as part of vector graphics, while TrueType outlines have mainly been employed when focusing on fonts alone. The choice between PostScript and TrueType outlines has largely been determined by file format specifications and precedents set in previous studies, rather than by performance considerations. To date, few studies have compared which outline format provides better embedding representations. Our findings suggest that information aggregation is crucial in Transformer-based deep learning for vector graphics, much as tokenization is in language models and patch division is in bitmap-based image recognition models. This insight provides valuable guidance for selecting outline formats in future research on vector graphics.
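To make the difference between the two outline representations concrete, the following is a minimal sketch (not the authors' implementation, which is available in the linked repository) that extracts both forms for a single glyph using the fontTools Python library. The font path "SomeFont.ttf" and the glyph name "A" are hypothetical placeholders; note that drawing a TrueType glyph through a pen yields quadratic qCurveTo commands, whereas a CFF/PostScript-flavored font yields cubic curveTo commands.

    # Minimal sketch: dump TrueType-style (points + flags) and PostScript-style
    # (command sequence) representations of one glyph with fontTools.
    # "SomeFont.ttf" and glyph name "A" are placeholders, not from the paper.
    from fontTools.ttLib import TTFont
    from fontTools.pens.recordingPen import RecordingPen

    font = TTFont("SomeFont.ttf")

    # TrueType outline: a sequence of (x, y) points with on-/off-curve flags.
    glyf = font["glyf"]  # present only in fonts with TrueType (glyf) outlines
    coords, end_points, flags = glyf["A"].getCoordinates(glyf)
    point_sequence = [(x, y, flag & 0x01) for (x, y), flag in zip(coords, flags)]
    print(point_sequence[:5])  # e.g. [(x, y, on_curve_bit), ...]

    # PostScript-style outline: a sequence of drawing commands recorded by a pen.
    pen = RecordingPen()
    font.getGlyphSet()["A"].draw(pen)
    print(pen.value[:5])  # e.g. [('moveTo', ((x, y),)), ('qCurveTo', ...), ...]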
Comments: 8 pages, 8 figures, 4 tables, Submitted to IJCNN 2025. Code available at https://github.com/fjktkm/truetype-vs-postscript-transformer
Subjects: Computer Vision and Pattern Recognition (cs.CV)
ACM classes: I.5.1; I.4.7
Cite as: arXiv:2502.00250 [cs.CV]
  (or arXiv:2502.00250v1 [cs.CV] for this version)
  https://doi.org/10.48550/arXiv.2502.00250
arXiv-issued DOI via DataCite

Submission history

From: Takumu Fujioka
[v1] Sat, 1 Feb 2025 01:16:27 UTC (282 KB)