
Setting Up An SQLite Database In R From A CSV File That Cannot Be Read Into Memory

I have a large *.csv file that I would like to query using SQLite and dplyr verbs in R. How can I connect it to SQLite without reading it into memory?

Solution 1:

I am not sure that SQLite is a good fit for something that large. SQLite stores the entire database in a single file, and you might be better off using a client-server database that distributes the data across multiple files.

In any case, ignoring the above, the following will create an empty SQLite database mydb.sqlite and read myfile.csv into a table mytable that it creates in that database. It does so without the data ever passing through R -- the work happens entirely on the SQLite side. You might need additional arguments depending on the file. You will get a warning but can ignore it. Test it out on a subset of rows first; if that works, delete the database it created and rerun with the entire file.

library(sqldf)

# Attach (and thereby create) mydb.sqlite, then load myfile.csv into
# a new table mytable -- the data never passes through R
read.csv.sql("myfile.csv", sql = c("attach 'mydb.sqlite' as new",
          "create table new.mytable as select * from file"))

# test that it is there
sqldf("select * from mytable limit 3", dbname = "mydb.sqlite")
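Since the question asks about dplyr verbs, once the table exists you can point dplyr at the database file and have the verbs translated to SQL that runs inside SQLite, so only the final result is pulled into R. A minimal sketch, assuming the DBI, RSQLite, dplyr, and dbplyr packages are installed; it uses an in-memory database with a tiny stand-in table for illustration -- in the real case you would pass "mydb.sqlite" to dbConnect() and use the mytable created above:

```r
library(DBI)
library(dplyr)   # verb translation also requires the dbplyr package

# Illustration only: in the real case, connect with
# dbConnect(RSQLite::SQLite(), "mydb.sqlite") instead of ":memory:"
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "mytable", data.frame(x = 1:5, y = letters[1:5]))

# tbl() is lazy: filter() and summarise() become a single SQL query
# executed by SQLite; no full-table copy is ever made in R
res <- tbl(con, "mytable") %>%
  filter(x > 2) %>%
  summarise(n = n()) %>%
  collect()        # only the one-row result is brought into R

print(res$n)       # 3

dbDisconnect(con)
```

collect() is the point where rows actually cross into R, so keep it at the end of the pipeline after the data has been reduced.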

Alternatively, SQLite has a command-line shell that can do the same thing. See https://sqlite.org/cli.html
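For the command-line route, the sqlite3 shell's .import command bulk-loads a CSV without involving R at all. A sketch, assuming the sqlite3 binary is on your PATH; it writes a tiny stand-in CSV first so the example is self-contained -- with the real file you would skip that step and import myfile.csv directly:

```shell
# Stand-in for the real large CSV (illustration only)
printf 'id,value\n1,a\n2,b\n3,c\n' > myfile.csv

# .mode csv switches the shell to CSV input; .import then creates
# mytable using the header row for column names (when the table
# does not already exist) and bulk-loads the remaining rows
sqlite3 mydb.sqlite <<'SQL'
.mode csv
.import myfile.csv mytable
SELECT COUNT(*) FROM mytable;
SQL
```

Note that if mytable already exists, .import inserts every line, including the header row, as data, so import into a fresh table.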
