Computer Science > Machine Learning

arXiv:2107.09883 (cs)
[Submitted on 21 Jul 2021]

Title: MG-NET: Leveraging Pseudo-Imaging for Multi-Modal Metagenome Analysis

Authors: Sathyanarayanan N. Aakur, Sai Narayanan, Vineela Indla, Arunkumar Bagavathi, Vishalini Laguduva Ramnath, Akhilesh Ramachandran
Abstract: The emergence of novel pathogens and zoonotic diseases such as SARS-CoV-2 has underlined the need for developing novel diagnosis and intervention pipelines that can learn rapidly from small amounts of labeled data. Combined with technological advances in next-generation sequencing, metagenome-based diagnostic tools hold much promise to revolutionize rapid point-of-care diagnosis. However, there are significant challenges in developing such an approach, chief among which is learning self-supervised representations that can help detect novel pathogen signatures with very little labeled data. This is a particularly difficult task given that closely related pathogens can share more than 90% of their genome structure. In this work, we address these challenges by proposing MG-Net, a self-supervised representation learning framework that leverages multi-modal context using pseudo-imaging data derived from clinical metagenome sequences. We show that the proposed framework can learn robust representations from unlabeled data that can be used for downstream tasks such as metagenome sequence classification with limited access to labeled data. Extensive experiments show that the learned features outperform current baseline metagenome representations, given only 1000 samples per class.
Comments: To appear in MICCAI 2021
Subjects: Machine Learning (cs.LG); Genomics (q-bio.GN)
Cite as: arXiv:2107.09883 [cs.LG]
  (or arXiv:2107.09883v1 [cs.LG] for this version)
  https://doi.org/10.48550/arXiv.2107.09883
arXiv-issued DOI via DataCite

Submission history

From: Sathyanarayanan Aakur
[v1] Wed, 21 Jul 2021 05:53:01 UTC (776 KB)