Statistician’s Notebook: R User Conference

July 25, 2016

CONTRIBUTORS

Amos Thairu

I participated in the tenth annual R User Conference, held June 27-30 at Stanford University in California. R is a free software environment for statistical computing and graphics. This was an exciting experience for me, as I learned about new techniques and developments in statistical computing and data visualization from various members of the R community.

I presented a poster on the Urban Health Statistics platform, a new data visualization platform that APHRC has developed and will launch soon. The platform, developed mainly using R, is an interactive web application that helps advocates, researchers, policymakers and development partners understand urban African health indicators. A number of people at the conference were interested in learning about the project and sharing ideas.

There were 800 participants, ranging from speakers and presenters to those who simply wanted to listen, learn and interact. They came from a diverse range of disciplines in which R is applied, including genomics, epidemiology, climate change, analysis of presidential polls and even fish spawning patterns.

I also had the opportunity to interact with other statisticians and programmers working across the globe. For example, I chatted with two researchers from the University of Washington who are part of a working group that has developed a statistical framework for population projections for all countries. Their work has been fruitful; on World Population Day the UN issued official population projections for all countries using their new statistical methods. Following their analysis of UN population data, they developed an interactive data exploration tool that displays projections of various population indicators. There are many synergies between their work and APHRC’s Urban Health Statistics platform, and I look forward to seeing which of their techniques we can learn from to enhance our work.

It felt great to be at the epicenter of technological creativity in Silicon Valley, home to Yahoo!, Google, Hewlett-Packard, and many other cutting-edge tech companies that were started by and continue to be led by Stanford alumni and faculty. I found Stanford University quite beautiful and was fascinated to learn that it is so huge that it has its own zip code.

One thing I noted from the conference is that institutions in Africa have not adopted open-source technologies for statistical computing, particularly R, to the same extent as their counterparts in industrialized countries. Most of the institutions represented at the conference use R for data analysis. Although these institutions are based in countries that are economically stronger than lower-income African nations, they are much more likely to use free software than the expensive proprietary software used here at APHRC and elsewhere in Kenya. This is a great takeaway and an illustration of one of the great benefits of globalization: the availability of inexpensive or even free solutions to challenges that confront us all.

I was exposed to R while studying in Sweden and went on to use it as part of the KEMRI-Wellcome Trust Research Programme, where I met researchers and analysts who were also R enthusiasts. Many are of the opinion that the learning curve for R is somewhat steeper than for other statistical software such as Stata and SPSS, but I think R is powerful software that is worth learning and adopting.

One way I would recommend to learn R is through the free online courses offered on Coursera and edX. The beauty of such courses is that people can take them on their own time and at their own pace, using a variety of learning tools such as videos, quizzes and projects facilitated by top universities and experienced instructors.

In time, I hope more programmers and researchers from across sub-Saharan Africa will join the R community and explore its benefits for performing analytics. And I look forward to demonstrating the functionality of the R software when APHRC launches the Urban Health Statistics platform in September.