From patterned response dependency to structured covariate dependency: categorical-pattern-matching

Fushing, Hsieh; Liu, Shan-Yu; Hsieh, Yin-Chen; McCowan, Brenda

doi:10.1371/journal.pone.0198253

arXiv:1706.00103 (stat)

[Submitted on 31 May 2017]

Title: From patterned response dependency to structured covariate dependency: categorical-pattern-matching

Authors: Hsieh Fushing, Shan-Yu Liu, Yin-Chen Hsieh, Brenda McCowan

Abstract: Data generated from a system of interest typically consists of measurements from an ensemble of subjects across multiple response and covariate features, and is naturally represented by one response-matrix against one covariate-matrix. Likely each of these two matrices simultaneously embraces heterogeneous data types: continuous, discrete and categorical. Here a matrix is used as a practical platform to ideally keep hidden dependency among/between subjects and features intact on its lattice. Response and covariate dependency is individually computed and expressed through multiscale blocks via a newly developed computing paradigm named Data Mechanics. We propose a categorical pattern matching approach to establish causal linkages in a form of information flows from patterned response dependency to structured covariate dependency. The strength of an information flow is evaluated by applying the combinatorial information theory. This unified platform for system knowledge discovery is illustrated through five data sets. In each illustrative case, an information flow is demonstrated as an organization of discovered knowledge loci via emergent visible and readable heterogeneity. This unified approach fundamentally resolves many long-standing issues in data analysis, including statistical modeling, multiple response, renormalization and feature selection, but without involving man-made structures and distribution assumptions. The results reported here enhance the idea that linking patterns of response dependency to structures of covariate dependency is the true philosophical foundation underlying data-driven computing and learning in sciences.

Comments:	32 pages, 10 figures, 3 illustrations
Subjects:	Methodology (stat.ME)
Cite as:	arXiv:1706.00103 [stat.ME]
	(or arXiv:1706.00103v1 [stat.ME] for this version)
	https://doi.org/10.48550/arXiv.1706.00103
Related DOI:	https://doi.org/10.1371/journal.pone.0198253

Submission history

From: Shan-Yu Liu
[v1] Wed, 31 May 2017 21:43:36 UTC (879 KB)
