arXiv:2506.06037v1 (cs)
[Submitted on 6 Jun 2025]

Title: SVD: Spatial Video Dataset

Authors:M. H. Izadimehr, Milad Ghanbari, Guodong Chen, Wei Zhou, Xiaoshuai Hao, Mallesham Dasari, Christian Timmerer, Hadi Amirpour
Abstract: Stereoscopic video has long been the subject of research due to its capacity to deliver immersive three-dimensional content across a wide range of applications, from virtual and augmented reality to advanced human-computer interaction. The dual-view format inherently provides binocular disparity cues that enhance depth perception and realism, making it indispensable for fields such as telepresence, 3D mapping, and robotic vision. Until recently, however, end-to-end pipelines for capturing, encoding, and viewing high-quality 3D video were neither widely accessible nor optimized for consumer-grade devices. Today's smartphones, such as the iPhone Pro, and modern Head-Mounted Displays (HMDs) offer built-in support for stereoscopic video capture and hardware-accelerated encoding, with seamless playback on devices like the Apple Vision Pro (AVP) and Meta Quest 3, requiring minimal user intervention. Apple refers to this streamlined workflow as spatial video. Making the full stereoscopic video pipeline available to everyone has enabled new applications. Despite these advances, there remains a notable absence of publicly available datasets that cover the complete spatial video pipeline. In this paper, we introduce SVD, a spatial video dataset comprising 300 five-second video sequences, 150 captured using an iPhone Pro and 150 with an AVP. Additionally, 10 longer videos with a minimum duration of 2 minutes have been recorded. The SVD dataset is publicly released under an open-access license to facilitate research in codec performance evaluation, subjective and objective quality of experience (QoE) assessment, depth-based computer vision, stereoscopic video streaming, and other emerging 3D applications such as neural rendering and volumetric capture. Link to the dataset: https://cd-athena.github.io/SVD/
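The binocular disparity cue mentioned in the abstract can be made concrete with a toy example: for a rectified stereo pair, the horizontal offset between corresponding pixels encodes depth. The sketch below is a naive block-matching (sum-of-absolute-differences) disparity estimator in pure NumPy, written for illustration only; it is not the dataset's pipeline, and `disparity_map` is a hypothetical helper, not an API from the paper.

```python
import numpy as np

def disparity_map(left, right, max_disp=16, block=5):
    """Naive block-matching stereo: for each pixel in the left image,
    find the horizontal shift into the right image that minimizes the
    sum of absolute differences (SAD) over a small square patch."""
    h, w = left.shape
    half = block // 2
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(half, h - half):
        for x in range(half, w - half):
            patch = left[y - half:y + half + 1, x - half:x + half + 1]
            best_cost, best_d = np.inf, 0
            # Only consider shifts that keep the candidate patch in bounds.
            for d in range(min(max_disp, x - half) + 1):
                cand = right[y - half:y + half + 1,
                             x - d - half:x - d + half + 1]
                cost = np.abs(patch - cand).sum()
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d
    return disp

# Synthetic rectified pair: the right view is the left view shifted
# 4 px, so the ground-truth disparity is 4 wherever it is observable.
rng = np.random.default_rng(0)
left = rng.integers(0, 255, size=(32, 48))
right = np.roll(left, -4, axis=1)
d = disparity_map(left, right, max_disp=8, block=5)
print(int(np.median(d[8:24, 8:36])))  # median over interior pixels
```

On this synthetic pair the SAD is exactly zero at the true shift, so interior pixels recover a disparity of 4; real stereo footage additionally requires rectification, occlusion handling, and sub-pixel refinement, which production estimators provide.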
Subjects: Multimedia (cs.MM)
Cite as: arXiv:2506.06037 [cs.MM]
  (or arXiv:2506.06037v1 [cs.MM] for this version)
  https://doi.org/10.48550/arXiv.2506.06037
arXiv-issued DOI via DataCite

Submission history

From: Hadi Amirpour
[v1] Fri, 6 Jun 2025 12:38:01 UTC (1,203 KB)