Loading...

Data Analytics Languages: R

Rohith Perumalla | 2/9/2018

A quick look at R.

Summary:

Data analytics, sometimes known as analysis of data or data analysis, is the process of inspecting, cleansing, transforming, and modeling data with the objective of finding useful information, suggesting conclusions, and supporting decision-making. These technologies are often used to analyze large amounts of data (Big Data Analytics) in which the findings are used to make data driven decisions. There are various languages and libraries used to analyze data each have their own benefits and drawbacks. Some of the popular programming languages used to develop and build solutions to analyze data include: R, Python, Julia, Java, Hadoop/Hive, Scala, Kafka/Storm, Matlab, Octave, and Go.

Analysis:

R is a a open source free language that has been around since the 1990’s; since then it has been used as an alternative for the expensive statistical packages such as Matlab and SAS. In recent years, R has grown in popularity and has been used by large organizations like Google, Facebook, Bank of America, and many others. R’s main selling point comes from, its ability to quickly and efficiently sift through data sets, manipulate the data with advanced modeling tools and create clean graphics with only a few lines of code - it is similar to a “hyperactive version of Excel.”

R also has a thriving community in which developers are adding to R’s functionality and improving it regularly. It is also the most popular data analytics language being used by around 61% of data scientists. R’s ability to efficiently traverse through data and quickly generate models is an extremely useful feature which allows less time to be spent on creating models and more time to actually make decisions from the data. R has also grown in popularity in the banking industry regarding financial modeling. Niall O’Connor, vice president at Bank of America says, “R makes our mundane tables stand out”.

R is primarily being used to create models, it has become the “go-to language for data modeling”; however it is important to remember that R becomes less and less effective as a organization or company needs to build a tool than can handle large scale functions. “R is more about sketching, and not building,” says Michael Driscoll, CEO of Metamarkets. “You won’t find R at the core of Google’s pagerank or Facebook’s friend suggestion algorithms. Engineers will prototype in R, then hand off the model to be written in Java or Python.”

Considering all the benefits R has from its efficiency and speed regarding going through data sets, effectiveness regarding designing and creating data models, and its “vibrant ecosystem” it is important that it is only effective regarding those functions and that it does not scale well and can not be used to build products. R’s speedy modeling capabilities are especially effective especially when creating models for presenting data. But the most notable characteristic is R being open source making it an always growing and improving tool; the thousands of developers that use to tool are always sharing and developing patches and libraries to make the language as useful as possible. R is a stable language.

Sources

Hayes, Tyler. “The 9 Best Languages For Crunching Data.” Fast Company, Fast Company, 3 Apr. 2015

Images

http://datascience.uci.edu/wp-content/uploads/sites/2/2014/09/r-project-logo.jpg