楚新元 | All in R

Batch-converting data file formats with R

楚新元 / 2021-09-03


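This post uses converting .txt text files to .xlsx Excel files as the example; other formats work the same way. The code is given directly below for readers to try out.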

Load the required R packages

library(dplyr)
library(purrr)
library(openxlsx)
library(rio)
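
Before running the rest of the post, it may help to confirm the environment. The sketch below is not part of the original workflow. It assumes the five packages the post calls (including xfun, which is used via `xfun::` without being attached) and checks two version requirements: the `\(x)` lambda shorthand needs R 4.1.0 or later, and `list_rbind()` was added in purrr 1.0.0. The variable names are illustrative.

```r
# Packages used in this post; xfun is called via xfun:: but never attached.
pkgs <- c("dplyr", "purrr", "openxlsx", "rio", "xfun")

# Names of packages that are not installed (empty if all are present).
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]
if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
}

# The \(x) lambda shorthand needs R >= 4.1.0.
lambda_ok <- getRversion() >= "4.1.0"

# list_rbind() was added in purrr 1.0.0.
purrr_ok <- !("purrr" %in% missing_pkgs) &&
  utils::packageVersion("purrr") >= "1.0.0"
```

If `lambda_ok` is `FALSE`, each `\(x) ...` in the code can be written as `function(x) ...` instead.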

Generate the test files in batch

# Generate the txt test files in batch
xfun::dir_create("data")
dest_file = paste0("./data/", levels(iris$Species), ".txt")
iris %>% 
  group_split(Species) %>% 
  # split(.$Species) %>% 
  walk2(
    ., dest_file, 
    \(x, y) write.table(x, y, row.names = FALSE)
  )
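
A quick way to check the generated files is to read them back and count rows. The helper below, `count_txt_rows()`, is a hypothetical name introduced here for illustration. It is a minimal base R sketch that assumes the files were written by `write.table()` with a header row, as in the code above.

```r
# Hypothetical helper: read every .txt file in `dir` back and count its rows.
count_txt_rows <- function(dir = "data") {
  files <- list.files(dir, pattern = "\\.txt$", full.names = TRUE)
  counts <- vapply(
    files,
    function(f) nrow(read.table(f, header = TRUE)),
    integer(1)
  )
  names(counts) <- basename(files)
  counts
}

# Only runs if the data directory from the generation step exists.
if (dir.exists("data")) print(count_txt_rows("data"))
```

Since iris has 50 rows per species, each of the three files should report 50 rows.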

The test files can also be generated with the following code

# Generate the txt test files in batch
xfun::dir_create("data")
iris %>% 
  group_nest(Species, keep = TRUE) %>% 
  mutate(dest_file = paste0("./data/", Species, ".txt")) %>% 
  # pwalk(\(...) write.table(..2, ..3, row.names = FALSE)) %>% 
  pwalk(
    \(data, dest_file, ...) {
      write.table(data, dest_file, row.names = FALSE)
    } 
  )
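
The `pwalk()` call above relies on columns being matched to function arguments by name, with unmatched columns absorbed by `...`. The following is a minimal sketch of that behaviour on a toy data frame; the `jobs` data and the `seen` accumulator are invented for illustration and write no files.

```r
library(purrr)

# Toy data frame, invented for illustration; column names mirror the nested
# tibble above (minus the list-column).
jobs <- data.frame(
  Species   = c("a", "b"),
  dest_file = c("a.txt", "b.txt"),
  stringsAsFactors = FALSE
)

# pwalk() calls the function once per row, matching columns to arguments by
# name. Species has no matching argument, so it is absorbed by `...`.
seen <- character(0)
pwalk(jobs, function(dest_file, ...) {
  seen <<- c(seen, dest_file)
})

seen
# → "a.txt" "b.txt"
```

Matching by name keeps the call working if columns are reordered, whereas the positional `..2` and `..3` form depends on column order.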

Method 1: read in batch, combine, nest by source, write

# Read the txt files in batch, combine them, nest by source file, then write each group to its own Excel file
xfun::dir_create("result")
path = "./data"
path %>% 
  list.files(
    pattern = "\\.txt$",
    full.names = TRUE
  ) %>% 
  set_names(.) %>%
  map(\(x) read.table(x, header = TRUE)) %>% 
  list_rbind(names_to = "src") %>% 
  group_nest(src) %>% 
  mutate(
    dest_file = paste0(
      "./result/",
      gsub(".*/(.*?)\\.txt$", "\\1", src),
      ".xlsx"
    )
  ) %>% 
  pwalk(\(...) write.xlsx(..2, ..3))
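
The pipeline above tracks which file each row came from by combining `set_names()` with the `names_to` argument of `list_rbind()`. Here is a minimal sketch of that mechanism with made-up paths and a stand-in for `read.table()`, so no files are needed. It assumes purrr 1.0.0 or later, and the final step uses `basename()` and `tools::file_path_sans_ext()` as an alternative to the regular expression for extracting the file stem.

```r
library(purrr)

# Made-up paths for illustration; nothing is read from disk.
paths <- c("./data/setosa.txt", "./data/virginica.txt")

# set_names(x) with no second argument names each element after itself.
named <- set_names(paths)

# Stand-in for read.table(): one tiny data frame per path (invented values).
tables <- map(named, function(p) data.frame(n = nchar(p)))

# list_rbind() stacks the data frames; names_to stores each list name in `src`.
combined <- list_rbind(tables, names_to = "src")

# File stem without directory or extension.
stems <- tools::file_path_sans_ext(basename(combined$src))
stems
# → "setosa" "virginica"
```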

Method 2: convert file by file in batch

xfun::dir_create("result")
path = "./data"
path %>% 
  list.files(pattern = "\\.txt$") %>% 
  walk(
    \(x) convert(  # convert() comes from the rio package
      file.path(path, x),
      # Escape the dot and anchor the match so only the extension is replaced
      file.path("result", sub("\\.txt$", ".xlsx", x))
    )
  )
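
The introduction notes that other formats work the same way. As a rough sketch of that idea in base R only, the hypothetical helper `convert_dir()` below takes the reader and writer as arguments, so the same loop can serve different format pairs. The function name and its parameters are assumptions made for illustration, not part of rio or any other package, and the extensions are assumed to be plain alphanumeric strings.

```r
# Hypothetical helper: convert every file in `in_dir` ending in `in_ext`
# to `out_ext`, using caller-supplied reader and writer functions.
# Assumes `in_ext` and `out_ext` contain no regex metacharacters.
convert_dir <- function(in_dir, out_dir, in_ext, out_ext, reader, writer) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  pattern <- paste0("\\.", in_ext, "$")
  in_files <- list.files(in_dir, pattern = pattern, full.names = TRUE)
  out_names <- sub(pattern, paste0(".", out_ext), basename(in_files))
  out_files <- file.path(out_dir, out_names)
  for (i in seq_along(in_files)) {
    writer(reader(in_files[i]), out_files[i])
  }
  invisible(out_files)
}

# Example: .txt (as written by write.table) to .csv.
# Only runs if the data directory from the generation step exists.
if (dir.exists("data")) {
  convert_dir(
    "data", "result", "txt", "csv",
    reader = function(f) read.table(f, header = TRUE),
    writer = function(x, f) write.csv(x, f, row.names = FALSE)
  )
}
```

Passing the reader and writer as functions keeps the file-matching logic in one place; switching the target to Excel would only mean swapping the writer for one based on `openxlsx::write.xlsx()`.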