r - Sorting column names by two numbers

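I recently got this amazing answer from JBGruber for sorting string column names that contain two numeric values. It works for both datasets at the bottom of this post: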

library(magrittr)
order_cols <- function(dat) {
  
  # look for words to order by
  s_ordered <- stringi::stri_extract_all_regex(colnames(dat), "[[:alpha:]]+") %>% 
    unlist() %>% 
    unique() %>% 
    sort()
  
  if (length(s_ordered) > 1) {
    # replace words with their alphabetical index
    cnames <- stringi::stri_replace_all_fixed(colnames(dat), s_ordered, seq_along(s_ordered), vectorise_all = FALSE)
  } else {
    cnames <- colnames(dat)
  }
  
  cnames %>% 
    stringi::stri_extract_all_regex("\\d+") %>% # extract all numbers (including the alphabetical index numbers)
    lapply(as.numeric) %>% 
    lapply(sum) %>% 
    unlist() %>% 
    order()
  
}

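However, I noticed that it does not quite work for the following data, because it relies on the assumption that ordering by the sum of the numbers in each name reproduces the intended column order: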

dat_I <- structure(list(`[25,250)`=3L, `[0,25)` = 5L, `[100,250)` = 43L, `[100,500)` = 0L, 
    `[1000,1000000]` = 20L, `[1000,1500)` = 0L, `[1500,3000)` = 0L, 
    `[25,100)` = 38L, `[25,50)` = 0L, `[250,500)` = 27L, `[3000,1000000]` = 0L, 
    `[50,100)` = 0L, `[500,1000)` = 44L, `[500,1000000]` = 0L), row.names = "Type_A", class = "data.frame")

colnames(dat_I )[order_cols(dat_I)]
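
To make the failure concrete, here is a minimal base-R sketch on five of the column names above. The variable names (`nms`, `nums`, `sums`, `sum_order`) are only illustrative and are not part of the original function. It shows that `[25,250)` has a larger sum than `[50,100)`, so it is placed after it even though its lower bound is smaller.

```r
# A handful of column names taken from dat_I above
nms <- c("[25,250)", "[0,25)", "[50,100)", "[25,100)", "[100,250)")

# Extract the numbers from each name (base R only, no stringi)
nums <- regmatches(nms, gregexpr("[0-9]+", nms))

# Sum the numbers per name, as the original order_cols() does
sums <- vapply(nums, function(x) sum(as.numeric(x)), numeric(1))
setNames(sums, nms)

# Ordering by the sum puts "[25,250)" (275) after "[50,100)" (150)
sum_order <- nms[order(sums)]
sum_order
```
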

Is there a way to sort by the first element first, and then by the second?

Old data

dat_I <- structure(list(`[0,25)` = 5L, `[100,250)` = 43L, `[100,500)` = 0L, 
    `[1000,1000000]` = 20L, `[1000,1500)` = 0L, `[1500,3000)` = 0L, 
    `[25,100)` = 38L, `[25,50)` = 0L, `[250,500)` = 27L, `[3000,1000000]` = 0L, 
    `[50,100)` = 0L, `[500,1000)` = 44L, `[500,1000000]` = 0L), row.names = "Type_A", class = "data.frame")

dat_II <- structure(list(`[0,25) east` = c(1269L, 85L), `[0,25) north` = c(364L, 
21L), `[0,25) south` = c(1172L, 97L), `[0,25) west` = c(549L, 
49L), `[100,250) east` = c(441L, 149L), `[100,250) north` = c(224L, 
45L), `[100,250) south` = c(521L, 247L), `[100,250) west` = c(770L, 
124L), `[100,500) east` = c(0L, 0L), `[100,500) north` = c(0L, 
0L), `[100,500) south` = c(0L, 0L), `[100,500) west` = c(0L, 
0L), `[1000,1000000] east` = c(53L, 0L), `[1000,1000000] north` = c(82L, 
0L), `[1000,1000000] south` = c(23L, 0L), `[1000,1000000] west` = c(63L, 
0L), `[1000,1500) east` = c(0L, 0L), `[1000,1500) north` = c(0L, 
0L), `[1000,1500) south` = c(0L, 0L), `[1000,1500) west` = c(0L, 
0L), `[1500,3000) east` = c(0L, 0L), `[1500,3000) north` = c(0L, 
0L), `[1500,3000) south` = c(0L, 0L), `[1500,3000) west` = c(0L, 
0L), `[25,100) east` = c(579L, 220L), `[25,100) north` = c(406L, 
58L), `[25,100) south` = c(1048L, 316L), `[25,100) west` = c(764L, 
131L), `[25,50) east` = c(0L, 0L), `[25,50) north` = c(0L, 0L
), `[25,50) south` = c(0L, 0L), `[25,50) west` = c(0L, 0L), `[250,500) east` = c(232L, 
172L), `[250,500) north` = c(207L, 40L), `[250,500) south` = c(202L, 
148L), `[250,500) west` = c(457L, 153L), `[3000,1000000] east` = c(0L, 
0L), `[3000,1000000] north` = c(0L, 0L), `[3000,1000000] south` = c(0L, 
0L), `[3000,1000000] west` = c(0L, 0L), `[50,100) east` = c(0L, 
0L), `[50,100) north` = c(0L, 0L), `[50,100) south` = c(0L, 0L
), `[50,100) west` = c(0L, 0L), `[500,1000) east` = c(103L, 0L
), `[500,1000) north` = c(185L, 0L), `[500,1000) south` = c(66L, 
0L), `[500,1000) west` = c(200L, 0L), `[500,1000000] east` = c(0L, 
288L), `[500,1000000] north` = c(0L, 120L), `[500,1000000] south` = c(0L, 
229L), `[500,1000000] west` = c(0L, 175L)), row.names = c("A", 
"B"), class = "data.frame")

Best answer

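I modified the last three lines of the function so that the ordering is now based on each extracted number in turn (the first number, then the second, then the alphabetical index of any word), rather than on their sum.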

order_cols <- function(dat) {
  
  # look for words to order by
  s_ordered <- stringi::stri_extract_all_regex(colnames(dat), "[[:alpha:]]+") %>% 
    unlist() %>% 
    unique() %>% 
    sort()
  
  if (length(s_ordered) > 1) {
    # replace words with their alphabetical index
    cnames <- stringi::stri_replace_all_fixed(colnames(dat), s_ordered, seq_along(s_ordered), vectorise_all = FALSE)
  } else {
    cnames <- colnames(dat)
  }
  
  cnames %>% 
    stringi::stri_extract_all_regex("\\d+") %>% # extract all numbers (including the alphabetical index numbers)
    lapply(as.numeric) %>% 
    do.call(rbind, .) %>%    # bind list items to a matrix
    as.data.frame %>%        # change the matrix to a data.frame (i.e. a list)
    do.call(order, .)        # use the list for ordering
}
colnames(dat_II)[order_cols(dat_II)]
# [1] "[0,25) east"          "[0,25) north"         "[0,25) south"        
# [4] "[0,25) west"          "[25,50) east"         "[25,50) north"       
# [7] "[25,50) south"        "[25,50) west"         "[25,100) east"       
# [10] "[25,100) north"       "[25,100) south"       "[25,100) west"       
# [13] "[50,100) east"        "[50,100) north"       "[50,100) south"      
# [16] "[50,100) west"        "[100,250) east"       "[100,250) north"     
# [19] "[100,250) south"      "[100,250) west"       "[100,500) east"      
# [22] "[100,500) north"      "[100,500) south"      "[100,500) west"      
# [25] "[250,500) east"       "[250,500) north"      "[250,500) south"     
# [28] "[250,500) west"       "[500,1000) east"      "[500,1000) north"    
# [31] "[500,1000) south"     "[500,1000) west"      "[500,1000000] east"  
# [34] "[500,1000000] north"  "[500,1000000] south"  "[500,1000000] west"  
# [37] "[1000,1500) east"     "[1000,1500) north"    "[1000,1500) south"   
# [40] "[1000,1500) west"     "[1000,1000000] east"  "[1000,1000000] north"
# [43] "[1000,1000000] south" "[1000,1000000] west"  "[1500,3000) east"    
# [46] "[1500,3000) north"    "[1500,3000) south"    "[1500,3000) west"    
# [49] "[3000,1000000] east"  "[3000,1000000] north" "[3000,1000000] south"
# [52] "[3000,1000000] west"
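
The key step can be looked at in isolation. The following is a minimal base-R sketch, not the function above: it assumes every name contains the same count of numbers (otherwise `rbind` would recycle values), and the names `nms`, `m`, `idx`, and `idx2` are illustrative. `order()` accepts several keys and uses later keys only to break ties in earlier ones, which is what passing the data frame's columns through `do.call` achieves.

```r
# A handful of interval-style column names, deliberately out of order
nms <- c("[25,250)", "[0,25)", "[50,100)", "[25,100)", "[100,250)")

# One row per name, one column per extracted number
nums <- regmatches(nms, gregexpr("[0-9]+", nms))
m <- do.call(rbind, lapply(nums, as.numeric))

# Explicit two-key ordering: first number, then second number
idx <- order(m[, 1], m[, 2])

# Equivalent form for any number of columns: each data.frame column
# becomes one argument (one sort key) to order()
idx2 <- do.call(order, as.data.frame(m))

two_key_order <- nms[idx]
two_key_order
```

Using `do.call` rather than naming the columns by hand means the same code also handles a third key, such as the alphabetical index that replaces `east`, `north`, `south`, and `west` in `dat_II`.
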

https://stackoverflow.com/questions/72908061/
