It seems like dealing with character encoding is a fairly common problem in R (see this pretty old article, for example). For some reason I only ran into it today. When importing data containing French text from Excel into RStudio with read.xlsx(), I realized that:
1. RStudio cannot display French characters properly when one tries to view the objects’ content,
2. and R doesn’t export the objects with the proper encoding when using write.csv().
I use R mostly to work on raw data before I visualize it with D3.js, so point 1 doesn’t bother me too much. Point 2, on the other hand, is a real problem.
After trying (to no avail) to add the fileEncoding option both to read.xlsx() and to write.csv(), and to set R’s locale to French explicitly using Sys.setlocale(), I finally discovered a fix: use only ASCII characters in the first row of your Excel table (I’m not 100% sure it has to be ASCII only, but at least avoid characters like é, è, à, ç, etc.).
For instance, this table in Excel
will be output as this CSV table: "CatÃ.gorie","Salaire","EspÃ.ce"
While this Excel table
will give you this CSV table: "Categorie","Salaire","Espece"
I have no clue whether this is a bug or a (rather weird) feature.
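Here is a minimal sketch of one possible explanation, written in Python rather than R. It is not a diagnosis of what read.xlsx() or write.csv() actually do internally: it only shows that text encoded as UTF-8 and then read back as Latin-1 produces garbling that resembles the header above. The header values come from the example; the function names are made up for the illustration.

```python
import unicodedata

# Header row from the example above.
header = ["Catégorie", "Salaire", "Espèce"]


def misread_utf8_as_latin1(text: str) -> str:
    """Encode as UTF-8, then (wrongly) decode the bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def repair_latin1_misread(text: str) -> str:
    """Undo the misread above by reversing the two steps."""
    return text.encode("latin-1").decode("utf-8")


def to_ascii(text: str) -> str:
    """Strip accents so that only ASCII characters remain (the workaround)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


garbled = [misread_utf8_as_latin1(h) for h in header]
print(garbled)                                       # ['CatÃ©gorie', 'Salaire', 'EspÃ¨ce']
print([repair_latin1_misread(g) for g in garbled])   # ['Catégorie', 'Salaire', 'Espèce']
print([to_ascii(h) for h in header])                 # ['Categorie', 'Salaire', 'Espece']
```

Pure ASCII strings such as "Salaire" come out unchanged because ASCII characters have the same byte values in both encodings, which would be consistent with the workaround of keeping the first row ASCII-only.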
I ran into this real-life visualization this week by Cambio, a car-sharing service that is quite popular in Belgium.
The message on the banner reads something like “1 Cambio means 12 fewer cars in the city”. Granted, it’s heavier to load than even a million nodes in a d3.js force layout. And their design isn’t very responsive either (I’m thinking narrow streets in a Greek village, for instance). But still, it’s pretty effective!
I just stumbled upon these photos by (Catalan?) photographer Xavi Bou, thanks to an article in The Guardian. He revisited what his predecessors Étienne-Jules Marey and Eadweard Muybridge first did in the late 19th century when they invented chronophotography, but his goal was to “move away from [their] scientific approach” by portraying the scene in a non-invasive way. I think the result is beautiful.
I’m currently studying the new version of ggplot2, and I’m having a lot of fun! For those interested, the updated version of the book written by Hadley Wickham (the author of this library) is available here.
Despite my age (I was born about 22 years after Dylan published his first LP), I like Bob Dylan’s music a lot. While browsing BobDylan.com to find lyrics, I noticed that there was some data about the songs: how many times they’d been played live, and when they were played for the first and the last time.
Data + Dylan = exciting project!
So I thought I’d try to visualize it and see if I could make something interesting.
It seems like somebody made a slopegraph during WWII to show how the exports from different French industries had evolved between 1913 and 1935. The subtitle of the chart most likely aims at convincing the readers that the French economy would be better off after the war. The journal L’Oeuvre was apparently collaborationist during the war. The “terroristes” in Haute-Savoie below the header were certainly members of the French Resistance who had been recently massacred by the Nazis.
So that would make the slopegraph 39 years older than the one published by Tufte in 1983. I bet there are even older examples out there…
Visualization is a very powerful tool to share knowledge with people, when used properly. Last week there was some noise in the Belgian and French news about a map of the Muslim population living in Belgium. The map was apparently first published in SudPresse, as a printed map to which I did not have access. A different, interactive map was also made by De Standaard in Flanders (the Dutch-speaking part of Belgium). There were many reactions to the publication of these maps, and I heard about it via this article on lemonde.fr.
First of all, there seem to be many flaws in the research that led to the publication of this news, which has apparently been criticized by other sociologists in Belgium. This is the most important problem here: if the data is of dubious origin or of poor quality, your visualization will necessarily be wrong and misleading.
The interactive map that was made based on this research also has a few important problems, in my opinion:
The color scale uses a linear color gradient (from grey to dark green) to display non-linear intervals: from 0 to 1%, 1 to 2%, 2 to 5%, 5 to 10%, 10 to 20%. This results in a map where the majority of the country is colored, even though the Muslim population seems to be concentrated in a few areas (Brussels, Antwerp, Ghent, for instance).
Apart from the colored areas, the map is empty. There are no cities displayed on the map. The result is once again an impression that Muslims are everywhere in Belgium, with some blurry areas where their proportion is higher. I’m assuming that there must be quite a bit more to the data than this!
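To make the first point concrete, here is a rough Python sketch. It assumes that each of the five classes simply gets an evenly spaced step on the grey-to-green ramp, which is how I read the legend, not something I checked in the map’s code. The class boundaries are the ones quoted above; how values that fall exactly on a boundary are assigned is my own guess, and the function names are made up for the illustration.

```python
from bisect import bisect_right

# Internal class boundaries quoted above, in percent of the population.
BREAKS = [1, 2, 5, 10]
# Upper bound of the last class (10 to 20 %).
MAX_SHARE = 20


def stepped_position(share: float) -> float:
    """Position on the ramp (0 = grey, 1 = dark green) with one even step per class."""
    class_index = bisect_right(BREAKS, share)  # 0 for the first class, 4 for the last
    return class_index / len(BREAKS)


def linear_position(share: float) -> float:
    """Position on the ramp if colour were proportional to the share itself."""
    return min(share, MAX_SHARE) / MAX_SHARE


for share in (0.5, 1.5, 3, 7, 15):
    print(f"{share:>4} %  stepped={stepped_position(share):.2f}  "
          f"linear={linear_position(share):.3f}")
```

Under these assumptions, a municipality at 1.5% already sits a quarter of the way up the ramp, while a scale proportional to the share would put it at 0.075. That gap is one way to see why most of the country ends up looking colored.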
The resulting map in my opinion is unclear enough to be used by anybody, including for xenophobic purposes, even though that doesn’t seem to have been the goal of the author. I believe that any author of visualizations bears a responsibility towards her readers and towards the people who produced the data. This is especially true about such sensitive topics as Islam and immigration in Europe nowadays.
If you want to read more about the dangers of visualizing data without properly understanding or checking it, I recommend “Unknown unknowns”, a short article by Alberto Cairo about the dangers of “data ignorance”.
In case you didn’t know, constructive criticism of visualizations is a very popular sport in our domain. Take a look at Junk Charts or at this post by Alberto Cairo if you need to be convinced!
Using graphics to communicate is not exactly a recent idea of our species. It actually dates back (at least) to the paintings our ancestors made in caves, as mentioned by Nigel Holmes in his Infographia poster. That means roughly between 20,000 and 40,000 years ago, for caves like Chauvet, Cussac or Lascaux in France.
While most of us have seen the paintings of animals and humans, few of us know that a great many (non-figurative) signs were also drawn in these caves. This is the topic of Genevieve von Petzinger’s research. In a TED talk last year, she explained that after studying these signs in several caves across Europe, she came to the conclusion that many of them were found in different places and at different times. While she thinks that there are too few signs to make them a language, she does think that they had a meaning, and that these signs and their accompanying paintings are the first known forms of graphic communication. That was way before d3.js, and though they lacked animation, they certainly stood the test of time.
Oh, and while we’re into caves, Nature just published an article about a new breakthrough in this field: it seems like Neanderthals were already making sculptures or building things in caves as early as 176,500 years ago…
One thing I enjoy about data visualization is that it can range from pure science to art. And my favorite pieces are generally those that are well aware of both ends of the spectrum: they should be accurate and tell a truthful story, while remaining visually appealing.
There has been quite some noise lately about a recent work by Nicolas Rougeux entitled “Interchange Choreography”. Granted, it leans more toward the art end of the data visualization spectrum, but I like the result a lot.