Sunday, December 22, 2024

R: A Data-Analysis And Statistical-Computing Tool

R is a statistical-computing environment that consists of a language along with a run-time environment with graphics, a debugger, access to certain system functions and the ability to run programs stored in script files. The implementation of R is heavily influenced by two programming languages: S and Scheme. R has inherited strong object-oriented features from the S language, while its underlying implementation and semantics are inspired by Scheme.

R provides a wide range of statistical techniques such as linear and non-linear modelling, classical statistical tests, time-series analysis, classification, clustering and more. Communication engineers use R for signal-processing tasks such as filter design and processing of electrocardiogram (ECG) signals.

The software was initially created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, in 1993, and is currently being developed by the R Development Core Team.

Why you should go for R
With so many software packages available for data analysis and statistical computing, one might question the relevance of yet another package for the same purpose.

Fig. 1: Reproducing ECG signals from raw data (Courtesy: http://biostatmatt.com)

Benefits of R being a language. R is not an easy language to learn, but once mastered it has clear advantages over other data-analysis tools. Being highly interactive, it lets the programmer experiment and explore new areas and functionalities, which would not be possible if the data-analysis tool were not a language. Because an analysis is written as a script, it can be re-run at any time, on any machine that has R installed.

Cutting-edge analytics. Powerful analytics software should be capable of accepting data in various formats, manipulating it and fitting both traditional and modern statistical models. R does all of these. Manipulations such as transforms, merges and aggregations are carried out on the imported data, and statistical models such as regression and tree models are then fitted. These techniques allow academicians and researchers around the world to develop the latest methods in statistics, machine learning and predictive modelling. Thousands of packages across domains extend the capabilities of R to various applications, and their number keeps growing.
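The aggregation and regression steps described above can be sketched in a few lines. The following is a minimal illustration in plain Python (standard library only) rather than R, and it does not reproduce any R function; the helper names `aggregate_mean` and `fit_line`, the column names and the toy numbers are all made up for this example.

```python
from collections import defaultdict

def aggregate_mean(rows, key, value):
    """Group rows by `key` and return the mean of `value` in each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {name: sum(vals) / len(vals) for name, vals in groups.items()}

def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Toy data, invented purely for illustration.
rows = [
    {"site": "a", "reading": 1.0},
    {"site": "a", "reading": 3.0},
    {"site": "b", "reading": 10.0},
]
site_means = aggregate_mean(rows, "site", "reading")

# Points that lie exactly on y = 1 + 2x, so the fit recovers that line.
intercept, slope = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(site_means, intercept, slope)
```

In a statistical environment the same two steps are usually one-line calls; the point here is only what the operations compute.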

Fig. 2: Sample R code (Courtesy: www.joyofdata.de)

Faster yet reliable results. With the ability to mix and match models, an R programmer can reach accurate results faster. The code can be automated and the analysis reproduced, which supports further research.

Challenges faced while coding in R
Engineers who are new to R face certain challenges.

Mastering the language. R is not an easy-to-learn language. A more user-friendly documentation style would likely have won the language a larger user base.

Memory and speed issues. R is designed to be generic, and programmers adapt it to suit their applications. To date, over 2000 applications have been developed that extend the functionalities of R. R is not the best fit for every application, however, and memory and speed issues can arise when the code is extended to certain applications.

Syntax of the program. To a professional programmer, the syntax of an R script might look somewhat untidy compared to languages like Python.

How a communication engineer can make use of R
As discussed earlier, R is designed to be a generic tool and not an electronic design automation tool. With the help of functions ported from other open source packages, R handles signal-processing tasks such as filter design and ECG signal analysis well, and researchers across the globe use it for these applications.

As a signal-processing tool. “Bulk of R’s basic signal-processing capability comes from the signal package that was ported over from the open source project Octave,” points out Joseph Rickert in his blog titled ‘R and Signal Processing.’ The ported signal package can be used for filtering, filter generation, resampling, interpolation and visualisation of filter models. These functions are quite similar to the ones in MATLAB, so anyone who has mastered the latter can easily switch over to this open source alternative.

When it comes to statistical analysis, time-series capabilities of R are superior to those of proprietary software like MATLAB or SAS. There are wrappers for FFTW, the fast Fourier transform library developed at MIT, and there is support for dynamic linear modelling, including Kalman filtering based on singular value decomposition, with maximum-likelihood and Bayesian estimation of dynamic linear models.
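The Kalman filtering mentioned above can be illustrated with the simplest dynamic linear model, the local level model (a random-walk state observed with noise). This is a minimal sketch in plain Python (standard library only), not R, and it is not the SVD-based implementation the paragraph refers to; the function name, the parameter names and the sample values are hypothetical choices for illustration.

```python
def local_level_kalman(observations, obs_var, state_var, m0=0.0, c0=1e6):
    """Kalman filter for the local level model.

    Observation: y_t     = theta_t + v_t,      v_t ~ N(0, obs_var)
    State:       theta_t = theta_{t-1} + w_t,  w_t ~ N(0, state_var)

    m0 and c0 are the prior mean and variance of the state; a large c0
    expresses a vague prior. Returns the filtered means and variances.
    """
    m, c = m0, c0
    means, variances = [], []
    for y in observations:
        # Prediction step: the state is a random walk, so the predicted
        # mean is unchanged and the variance grows by state_var.
        a = m
        r = c + state_var
        # Update step: blend prediction and observation using the gain.
        gain = r / (r + obs_var)
        m = a + gain * (y - a)
        c = (1.0 - gain) * r
        means.append(m)
        variances.append(c)
    return means, variances

# A constant level of 5.0, used only to show that the filter locks onto it.
means, variances = local_level_kalman([5.0] * 50, obs_var=1.0, state_var=0.1)
print(means[-1], variances[-1])
```

With real data the two variances would themselves be estimated, for example by maximum likelihood, which is the part the sketch leaves out.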

R gives users the freedom to explore wavelets. They can make use of filters, transforms and multi-resolution analysis from the wavelets package. For de-convolution of noisy signals, users can utilise the WaveD transform. waveslim and wavethresh are other advanced wavelet signal-processing packages available for R.
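To make the idea of a wavelet transform and multi-resolution analysis concrete, here is a minimal sketch using the Haar wavelet, the simplest wavelet there is. It is written in plain Python (standard library only), not R, and it does not mirror the interface of any of the packages named above; all function names and the sample signal are invented for illustration.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """Invert haar_step, rebuilding the signal one level finer."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out

def multiresolution(signal, levels):
    """Apply haar_step repeatedly; details are returned finest level first."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details

def reconstruct(approx, details):
    """Rebuild the original signal from the coarsest approximation."""
    for detail in reversed(details):
        approx = haar_inverse_step(approx, detail)
    return approx

# Made-up eight-sample signal, decomposed over three levels.
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
coarse, details = multiresolution(signal, 3)
rebuilt = reconstruct(coarse, details)
print(coarse, rebuilt)
```

Wavelet denoising works on the `details` lists: small coefficients are shrunk or zeroed before reconstruction.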

Fig. 3: Spectrogram and FFT of original and filtered signal (Courtesy: www.joyofdata.de)

For biomedical signal processing. Biomedical engineers have made effective use of the tool for various biomedical signal-processing tasks. Matt Shotwell has developed a reproducible R script for analysis of ECG signals using a windowed (Blackman) sinc low-pass filter. In the first stage, a low-pass filter is applied to the signals to eliminate high-frequency noise above 30Hz. To eliminate the slow wave that corresponds to respiration, a high-pass filter with a cut-off frequency of 1Hz is then applied. The ECG signals reproduced from raw data are shown in Fig. 1.
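The two filtering stages can be sketched as follows. This is an illustration in plain Python (standard library only), not Shotwell's R script: the 250Hz sampling rate, the filter lengths, the synthetic test signal and all function names are assumptions made for this example, and the high-pass stage is built here by spectral inversion of a windowed-sinc low-pass, which may differ from the original.

```python
import math

def blackman_sinc_lowpass(cutoff_hz, fs_hz, num_taps):
    """FIR coefficients of a Blackman-windowed sinc low-pass filter."""
    if num_taps % 2 == 0:
        raise ValueError("num_taps must be odd so there is a centre tap")
    fc = cutoff_hz / fs_hz              # cutoff in cycles per sample
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        # Ideal low-pass impulse response (a sinc), truncated to num_taps.
        ideal = 2.0 * fc if k == 0 else math.sin(2.0 * math.pi * fc * k) / (math.pi * k)
        # Blackman window tapers the truncated sinc to reduce ripple.
        phase = 2.0 * math.pi * n / (num_taps - 1)
        window = 0.42 - 0.5 * math.cos(phase) + 0.08 * math.cos(2.0 * phase)
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]    # normalise to unity gain at 0 Hz

def spectral_invert(lowpass_taps):
    """Turn a unity-gain low-pass into the complementary high-pass."""
    highpass = [-t for t in lowpass_taps]
    highpass[(len(highpass) - 1) // 2] += 1.0
    return highpass

def fir_filter(signal, taps):
    """Convolve signal with taps; same length out, zero-padded edges."""
    mid = (len(taps) - 1) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, t in enumerate(taps):
            idx = i + mid - j
            if 0 <= idx < len(signal):
                acc += t * signal[idx]
        out.append(acc)
    return out

fs = 250.0                                               # assumed sampling rate, Hz
lp30 = blackman_sinc_lowpass(30.0, fs, 101)              # removes noise above 30 Hz
hp1 = spectral_invert(blackman_sinc_lowpass(1.0, fs, 501))  # removes baseline below 1 Hz

# Synthetic stand-in for an ECG trace: an 8 Hz component to keep,
# 60 Hz interference, and a constant baseline offset.
t = [n / fs for n in range(1000)]
raw = [math.sin(2 * math.pi * 8 * x) + 0.5 * math.sin(2 * math.pi * 60 * x) + 2.0 for x in t]
clean = fir_filter(fir_filter(raw, lp30), hp1)
```

Away from the edges of the record, `clean` should closely follow the 8Hz component alone; the first and last few hundred samples are distorted by the zero padding.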

A language with unlimited possibilities
With more than 7000 additional packages extending its functionalities to 2000 applications, R has emerged as a statistical-computing environment with unlimited possibilities. Users can adapt code developed by others in the open source community to their own applications. Hopefully, more communications and signal-processing applications will be actively developed on R in the near future.

The author is an assistant professor in the department of ECE at SETCEM, Thrissur
