R – import from Excel with French characters and export to csv

It seems like dealing with character encoding is a fairly common problem in R (see this rather old article, for example). For some reason I only ran into it today. When importing data containing French text from Excel into RStudio with read.xlsx(), I realized that:

  1. RStudio cannot display French characters properly when one tries to view the objects’ content,
  2. and R doesn’t export the objects with the proper encoding when using write.csv().

I mostly use R to work on raw data before visualizing it with D3.js, so point 1 doesn’t bother me too much. Point 2, on the other hand, is a real problem.

After trying (to no avail) to add the fileEncoding option to both read.xlsx() and write.csv(), and to set R’s locale to French explicitly using Sys.setlocale(), I finally discovered a fix: use only ASCII characters in the first row of your Excel table (I’m not 100% sure the requirement is strictly ASCII, but at least avoid characters like é, è, à, ç, etc.).

For instance, this table in Excel

Catégorie Salaire Espèce
A 100 Colibri
B 200 Sirène

will be output as this csv table:
"CatÃ.gorie","Salaire","EspÃ.ce"
"A",100,"Colibri"
"B",200,"Sirène"

While this Excel table

Categorie Salaire Espece
A 100 Colibri
B 200 Sirène

will give you this csv table:
"Categorie","Salaire","Espece"
"A",100,"Colibri"
"B",200,"Sirène"
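When editing the Excel file by hand is not practical, the same idea can be tried from within R by replacing accented letters in the column names after import and before export. This is only a minimal sketch under stated assumptions: the helper to_ascii_names() and its small replacement table are hypothetical, the data frame is built by hand to stand in for the result of read.xlsx(), and it is not confirmed that renaming after import avoids the problem the same way that fixing the header in Excel does.

```r
# Hypothetical helper: swap a few accented letters for ASCII equivalents.
# The mapping is illustrative, not exhaustive.
to_ascii_names <- function(x) {
  accented <- "\u00e9\u00e8\u00ea\u00e0\u00e2\u00e7\u00f9\u00ee\u00f4"  # é è ê à â ç ù î ô
  plain    <- "eeeaacuio"
  chartr(accented, plain, x)
}

# Hand-built stand-in for the data frame returned by read.xlsx()
df <- data.frame(
  a = c("A", "B"),
  b = c(100, 200),
  c = c("Colibri", "Sir\u00e8ne"),
  stringsAsFactors = FALSE
)
names(df) <- c("Cat\u00e9gorie", "Salaire", "Esp\u00e8ce")

# Only the header is changed; the cell values keep their accents
names(df) <- to_ascii_names(names(df))

# Write to a temporary location (assumption: any writable path would do)
out_file <- file.path(tempdir(), "output.csv")
write.csv(df, out_file, row.names = FALSE)
```

Only the column names are touched, since in the example above the accented cell value (Sirène) survives the export while the accented headers do not.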

I have no clue whether this is a bug or a (rather weird) feature.
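One way to look into it is to compare how R has marked the encoding of the column names and of the cell values. The sketch below uses base R's Encoding(), validUTF8() and enc2utf8() on hand-built strings; with real data one would inspect names(df) and the text columns of the imported data frame instead. It only illustrates the inspection step, and the last part shows one plausible mechanism for this kind of garbling (UTF-8 bytes being read as latin1), which is an assumption rather than a confirmed diagnosis of the behaviour described above.

```r
# Hand-built strings standing in for a column name and a cell value
header <- "Cat\u00e9gorie"
cell   <- "Sir\u00e8ne"

# Encoding() reports the declared mark: "UTF-8", "latin1" or "unknown"
Encoding(c(header, cell))

# validUTF8() checks whether the underlying bytes form valid UTF-8
validUTF8(c(header, cell))

# Assumed mechanism: take the UTF-8 bytes of the header and
# (wrongly) declare them to be latin1
bytes   <- charToRaw(enc2utf8(header))
garbled <- rawToChar(bytes)
Encoding(garbled) <- "latin1"

# After conversion, the single é has become two characters (Ã followed by ©)
garbled_utf8 <- enc2utf8(garbled)
nchar(header)
nchar(garbled_utf8)
```

If the column names and the cell values of an imported data frame report different marks from Encoding(), that would at least narrow down where the header goes wrong.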
