Problem generating loading scores for principal components in R

I have run a PCA in R using the example code from StatQuest (https://github.com/StatQuest/pca_demo/blob/master/pca_demo.R), substituting my own data: a data frame called "PCA4", with the PCA result stored as PCstats.

Dataset

dput(PCA4)
structure(list(X40.45cm = c(0L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L), X50.55cm = c(0L, 0L, 0L, 0L, 0L, 
3L, 0L, 0L, 0L, 0L, 32L, 0L, 64L, 96L, 0L, 0L), X60.65cm = c(0L, 
3L, 1L, 64L, 3L, 3L, 0L, 0L, 128L, 0L, 0L, 0L, 352L, 512L, 160L, 
0L), X70.75cm = c(1L, 7L, 0L, 32L, 33L, 7L, 1L, 0L, 256L, 32L, 
0L, 0L, 352L, 544L, 320L, 0L), X80.85cm = c(109L, 1L, 2L, 11L, 
164L, 34L, 2L, 64L, 480L, 32L, 160L, 96L, 352L, 1184L, 224L, 
32L)), .Names = c("X40.45cm", "X50.55cm", "X60.65cm", "X70.75cm", 
"X80.85cm"), class = "data.frame", row.names = c(NA, -16L))

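My data has samples in the columns and one species per row, with each cell giving the number of that species found in that sample. To run the PCA I had to remove the species names, but I think this may now be preventing me from generating the loading scores. The code I am trying to use is: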

loading_scores <- PCstats$rotation[,1]
species_scores <- abs(loading_scores)
species_score_ranked <- sort(species_scores, decreasing=TRUE)
top_10_species <- names(species_score_ranked[1:10])

top_10_species 
PCstats$rotation[top_10_species,1]

However, when I run this, it just returns numeric(0) and then NULL.

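I apologize if this is poorly explained; I have never asked a question here before. If you need more information or clarification, please let me know. Any help is much appreciated!

Many thanks!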

Comments
  • Hood
    Hood replied:

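    The code in the original post identifies the columns (measures) that contribute most to the first principal component. To see which rows contribute most to the first principal component, we have to combine the factor scores with the original data and sort by the absolute value of PC1.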

    data <-structure(list(X40.45cm = c(0L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 
                    0L, 0L, 0L, 0L, 0L, 0L, 0L), X50.55cm = c(0L, 0L, 0L, 0L, 0L, 
                   3L, 0L, 0L, 0L, 0L, 32L, 0L, 64L, 96L, 0L, 0L), 
                   X60.65cm = c(0L, 
                   3L, 1L, 64L, 3L, 3L, 0L, 0L, 128L, 0L, 0L, 0L, 352L, 512L, 160L, 
                   0L), X70.75cm = c(1L, 7L, 0L, 32L, 33L, 7L, 1L, 0L, 256L, 32L, 
                   0L, 0L, 352L, 544L, 320L, 0L), X80.85cm = c(109L, 1L, 2L, 11L, 
                   164L, 34L, 2L, 64L, 480L, 32L, 160L, 96L, 352L, 1184L, 224L, 
              32L)), .Names = c("X40.45cm", "X50.55cm", "X60.65cm", "X70.75cm", 
              "X80.85cm"), class = "data.frame", row.names = c(NA, -16L))
    
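    # Add a column of placeholder species names
    Species <- paste("Species", LETTERS[1:16])
    data <- cbind(Species, data)
    # Run the PCA on the numeric columns only (column 1 holds the names);
    # the result is called pca to avoid reusing the name of stats::princomp
    pca <- prcomp(data[, 2:ncol(data)])
    pca_summary <- summary(pca)
    # Extract the factor scores for the first principal component, labeled by species
    x_result <- pca$x[, "PC1"]
    names(x_result) <- Species
    # Show the 10 species with the largest absolute PC1 scores
    x_result[order(abs(x_result),decreasing = TRUE)][1:10]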
    

    ...and the output:

    > x_result[order(abs(x_result),decreasing = TRUE)][1:10]
    Species N Species M Species I Species C Species G Species B Species P Species F 
    1177.2555  357.0957  326.8597 -220.7754 -220.7253 -217.7296 -196.9533 -190.8999 
    Species J Species D 
    -182.8966 -174.9421 
    > 
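
    The row/column distinction above can be sketched on a small example. This is only an illustration: the counts and the names S1 to S3 are made up, and the transposed call assumes PCstats came from something like prcomp(t(PCA4), scale = TRUE), as in the StatQuest demo, which the question does not actually show. If species names are kept as row names, the loadings of the transposed PCA carry those names, so names() has something to return.

    ```r
    # Toy counts: 4 species (rows) x 3 samples (columns); values are illustrative only.
    counts <- data.frame(
      S1 = c(0, 3, 64, 512),
      S2 = c(1, 7, 32, 544),
      S3 = c(109, 1, 11, 1184),
      row.names = paste("Species", LETTERS[1:4])  # placeholder species names
    )

    # Species as observations: loadings describe samples, scores describe species.
    pca_rows <- prcomp(counts)
    rownames(pca_rows$rotation)  # sample names
    rownames(pca_rows$x)         # species names

    # Species as variables (transposed input, assumed to mirror the original call):
    # the loadings now have one named row per species.
    pca_cols <- prcomp(t(counts), scale = TRUE)
    loading_scores <- pca_cols$rotation[, 1]
    top_species <- names(sort(abs(loading_scores), decreasing = TRUE))[1:3]
    pca_cols$rotation[top_species, 1]
    ```

    With the real data, the equivalent step would be to set rownames(PCA4) to the species names before calling prcomp, rather than dropping the names entirely.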
    
  • ham
    ham replied:

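    names() returns NULL for top_10_species because species_score_ranked is an unnamed vector (the rotation matrix has no row names once the species names are removed). Try this instead, which looks up row positions: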

    loading_scores <- PCstats$rotation[,1]
    species_scores <- abs(loading_scores)
    species_score_ranked <- sort(species_scores, decreasing=TRUE)
    top_10_species <- match(species_score_ranked[1:10],species_scores)
    
    top_10_species
    PCstats$rotation[top_10_species,]
    

    I hope this is what you are looking for.

    Output:

    > PCstats$rotation[top_10_species,]
                PC1         PC2         PC3         PC4         PC5
     [1,] 0.2938804 -0.06715619  0.06862443  0.12857870  0.11781917
     [2,] 0.2932723 -0.08432870  0.02670414 -0.08688880  0.22867875
     [3,] 0.2869432  0.13535231  0.05421256  0.08940930 -0.36614492
     [4,] 0.2851476  0.12618316 -0.13957539 -0.17127002 -0.04163248
     [5,] 0.2849976 -0.14928654 -0.03024884  0.17614136  0.30544938
     [6,] 0.2843594 -0.15278602 -0.03551544  0.17841931 -0.06085204
     [7,] 0.2843594 -0.15278602 -0.03551544  0.17841931 -0.06085204
     [8,] 0.2843594 -0.15278602 -0.03551544  0.17841931 -0.06085204
     [9,] 0.2841340  0.05627504  0.25523009  0.03263271  0.00697634
    [10,] 0.2710251 -0.20911054 -0.01754100 -0.56694280  0.01517564
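
    One caveat about the match() lookup: match() returns only the first position of each value, so species with tied absolute loadings all resolve to the same row index. A minimal sketch of an alternative using order(), which returns distinct positions; the vector below is a made-up stand-in for species_scores, not values taken from PCstats:

    ```r
    # Hypothetical absolute loadings with a tie between the 2nd and 3rd entries
    species_scores <- c(0.2, 0.5, 0.5, 0.1)

    # match() looks up values, so both tied entries resolve to the first position
    ranked <- sort(species_scores, decreasing = TRUE)
    idx_match <- match(ranked[1:3], species_scores)

    # order() ranks positions directly, so every row appears at most once
    idx_order <- order(species_scores, decreasing = TRUE)[1:3]

    print(idx_match)
    print(idx_order)
    ```

    Applied to the original objects, this would read PCstats$rotation[order(species_scores, decreasing = TRUE)[1:10], ].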