作者投稿和查稿 主编审稿 专家审稿 编委审稿 远程编辑

计算机工程 ›› 2023, Vol. 49 ›› Issue (8): 46-53. doi: 10.19678/j.issn.1000-3428.0064749

• 人工智能与模式识别 • 上一篇    下一篇

基于正交约束的广义可分离非负矩阵分解算法

陈君航1,2, 杨祖元1, 刘名扬1,3, 李陵江1   

  1. 1. 广东工业大学 自动化学院 广东省物联网信息技术重点实验室, 广州 510006
    2. 粤港澳离散制造智能化联合实验室, 广州 510006
    3. 智能检测与制造物联教育部重点实验室, 广州 510006
  • 收稿日期:2022-05-19 出版日期:2023-08-15 发布日期:2022-10-11
  • 作者简介:

    陈君航(1994—),男,博士研究生,主研方向为模式识别

    杨祖元,教授、博士、博士生导师

    刘名扬,硕士研究生

    李陵江,博士研究生

  • 基金资助:
    广东省重点领域研发计划(2019B010154002)

Generalized Separable Nonnegative Matrix Factorization Algorithm Based on Orthogonal Constraints

Junhang CHEN1,2, Zuyuan YANG1, Mingyang LIU1,3, Lingjiang LI1   

  1. 1. Guangdong Key Laboratory of IoT Information Technology, School of Automation, Guangdong University of Technology, Guangzhou 510006, China
    2. Guangdong-Hong Kong-Macao Joint Laboratory for Smart Discrete Manufacturing, Guangzhou 510006, China
    3. Key Laboratory of Intelligence Detection and Manufacturing Internet of Things, Ministry of Education, Guangzhou 510006, China
  • Received:2022-05-19 Online:2023-08-15 Published:2022-10-11

摘要:

可分离非负矩阵分解(NMF)是通过抽取数据集中的部分样本或关键主题来表示整个数据集的一种特殊NMF方法。广义可分离非负矩阵分解(GSNMF)算法是由可分离NMF扩展的算法,可以同时得到数据集中的关键样本和关键主题两类特征,使分解结果更具有可解释性,但在处理某些数据集时由于选择方法存在的缺陷,GSNMF算法只能单独选择行或列的特征,从而失去可解释性的优点。为此,引入正交约束来修正GSNMF算法的选取结果,提出一种基于正交约束的广义可分离非负矩阵分解(OGSNMF)算法,利用非负特性及正交约束的特点,限制迭代过程中关于行和列的迭代矩阵,确保得到行和列的特征,并获取更加精确的分解结果。在此基础上,引入相对近似误差作为实验指标,结合分解结果的秩在行与列上的分配作为实验评判标准。实验结果表明,与原有算法相比,OGSNMF算法在处理数据集时,相对近似误差提高了1~3个百分点,说明在分解过程中损失的信息更少,确保能够获取到行和列的特征,得到更具有可解释性的分解结果。

关键词: 降维, 非负矩阵分解, 广义可分离非负矩阵分解, 正交约束, 数据表示

Abstract:

Separable Nonnegative Matrix Factorization(NMF) is a special NMF method used to represent an entire dataset by extracting partial samples or key topics from the dataset.Generalized Separable Nonnegative Matrix Factorization(GSNMF) is an extended algorithm of separable NMF that can simultaneously obtain two types of features of key samples and key topics in the dataset to enable the decomposition results to more interpretable.However, it is shown that when GSNMF processes certain datasets, because of its defective selection algorithm, the decomposition process select only the characteristics of the row or column, thus losing the advantage of interpretability. The orthogonal constraints are introduced to correct the selection results of GSNMF and a algorithm of GSNMF with Orthogonal constraints(OGSNMF) is derived. OGSNMF uses the characteristics of nonnegative and orthogonal constraints to limit the iteration matrix of rows and columns in the iterative process. This ensures that the features of rows and columns are obtained simultaneously, thus yielding more accurate and thorough decomposition results. To verify the effectiveness of the algorithm, the relative approximation error is introduced as the experimental index, and the distribution of rank on rows and columns of the decomposition results is used as the experimental evaluation standard. A comparison of the experimental results shows that the OGSNMF improves the relative approximation error by 1 to 3 percentage points over that of the original algorithm, indicating that the loss of information under the decomposition process is less.The algorithm also ensures that the characteristics of rows and columns can be obtained, making the decomposition results more interpretable.

Key words: dimension reduction, Nonnegative Matrix Factorization(NMF), Generalized Separable Nonnegative Matrix Factorization(GSNMF), orthogonal constraint, data representation