Most engineers use Matlab (or its open source alternative Octave) to solve their signal processing problems. Languages such as R or Python are not immediately thought of for signal processing, although this has changed a bit in the last few years with the development of the R signal package and Python scipy.signal.
The package gsignal
aims to further stimulate the use of
R for signal processing tasks. It is ported from the Octave signal package, version
1.4.1 (2019-02-08). The package contains a variety of signal processing
tools, such as signal generation and measurement, correlation and
convolution, filtering, FIR and IIR filter design, filter analysis and
conversion, power spectrum analysis, system identification, decimation
and sample rate change, and windowing.
This vignette provides a brief and general overview of some of
gsignal
’s functions.
The function pulstran
can be used to generate trains of
pulses based on samples of a continuous function (which can be
user-defined). The following figures show a periodic rectangular pulse,
an asymmetric sawtooth pulse, a periodic Gaussian waveform, and a custom
pulse train.
op <- par(mfrow = c(2, 2))
## periodic rectangular pulse
t <- seq(0, 60, 1/1e3)
d <- cbind(seq(0, 60, 2), sin(2 * pi * 0.05 * seq(0, 60, 2)))
y <- pulstran(t, d, 'rectpuls')
plot(t, y, type = "l", xlab = "", ylab = "",
main = "Periodic rectangular pulse")
## assymetric sawtooth waveform
fs <- 1e3
t <- seq(0, 1, 1/fs)
d <- seq(0, 1, 1/3)
x <- tripuls(t, 0.2, -1)
y <- pulstran(t, d, x, fs)
plot(t, y, type = "l", xlab = "", ylab = "",
main = "Asymmetric sawtooth ")
## Periodic Gaussian waveform
fs <- 1e7
tc <- 0.00025
t <- seq(-tc, tc, 1/fs)
x <- gauspuls(t, 10e3, 0.5)
ts <- seq(0, 0.025, 1/50e3)
d <- cbind(seq(0, 0.025, 1/1e3), sin(2 * pi * 0.1 * (0:25)))
y <- pulstran(ts, d, x, fs)
plot(ts, y, type = "l", xlab = "", ylab = "",
main = "Gaussian pulse")
## Custom pulse trains
fnx <- function(x, fn) sin(2 * pi * fn * x) * exp(-fn * abs(x))
ffs <- 1000
tp <- seq(0, 1, 1 / ffs)
pp <- fnx(tp, 30)
fs <- 2e3
t <- seq(0, 1.2, 1 / fs)
d <- seq(0, 1, 1/3)
dd <- cbind(d, 4^-d)
z <- pulstran(t, dd, pp, ffs)
plot(t, z, type = "l", xlab = "", ylab = "",
main = "Custom pulse")
A number of waveform generating functions are available, such as
chirp
, cmorwavf
, diric
,
gauspuls
, gmonopuls
, mexihat
,
meyeraux
, morlet
, rectpuls
,
sawtooth
, square
, and
tripuls
.
The function findpeaks
can be used to determine (local)
minima and maxima in a signal, as the following figures show.
t <- 2 * pi * seq(0, 1,length = 1024)
y <- sin(3.14 * t) + 0.5 * cos(6.09 * t) +
0.1 * sin(10.11 * t + 1 / 6) + 0.1 * sin(15.3 * t + 1 / 3)
data1 <- abs(y) # Positive values
peaks1 <- findpeaks(data1)
data2 <- y # Double-sided
peaks2 <- findpeaks(data2, DoubleSided = TRUE)
peaks3 <- findpeaks (data2, DoubleSided = TRUE, MinPeakHeight = 0.5)
op <- par(mfrow=c(1,2))
plot(t, data1, type="l", xlab="", ylab="")
points(t[peaks1$loc], peaks1$pks, col = "red", pch = 1)
plot(t, data2, type = "l", xlab = "", ylab = "")
points(t[peaks2$loc], peaks2$pks, col = "red", pch = 1)
points(t[peaks3$loc], peaks3$pks, col = "red", pch = 4)
par (op)
title("Finding the peaks of smooth data is not a big deal")
t <- 2 * pi * seq(0, 1, length = 1024)
y <- sin(3.14 * t) + 0.5 * cos(6.09 * t) + 0.1 *
sin(10.11 * t + 1 / 6) + 0.1 * sin(15.3 * t + 1 / 3)
data <- abs(y + 0.1*rnorm(length(y),1)) # Positive values + noise
peaks1 <- findpeaks(data, MinPeakHeight=1)
dt <- t[2]-t[1]
peaks2 <- findpeaks(data, MinPeakHeight=1, MinPeakDistance=round(0.5/dt))
op <- par(mfrow=c(1,2))
plot(t, data, type="l", xlab="", ylab="")
points (t[peaks1$loc],peaks1$pks,col="red", pch=1)
plot(t, data, type="l", xlab="", ylab="")
points (t[peaks2$loc],peaks2$pks,col="red", pch=1)
par (op)
title(paste("Noisy data may need tuning of the parameters.\n",
"In the 2nd example, MinPeakDistance is used\n",
"as a smoother of the peaks"))
The gsignal
package contains functions for designing
lowpass, highpass, bandpass, and bandstop filters. Both Finite Impulse
Response (FIR) and Infinite Impulse Response (IIR) filters can be
designed. The freqz
function displays the frequency
response’s magnitude and phase of the filter.
A FIR filter is a filter whose impulse response settles to zero in finite time. This is in contrast to IIR filters, which have internal feedback causing them to have an infinitely long impulse response (although usually decaying).
For causal discrete-time FIR filters the output is a weighted sum of the most recent input values. Compared to IIR filters, advantages of FIR filters are they are inherently stable (because there is no feedback that propagates indefinitely), and that they have linear phase (constant across frequencies). The main disadvantage is that they require more computation time to obtain sharp transition bands.
The package gsignal
contains various methods to design
digital FIR filters. The functions fir1
, fir2
,
and kaiserord
use the windowing
method, in which a window is applied to the truncated inverse
Fourier transform of the filter’s frequency response. The function
firls
is an extension of the fir1
and
fir2
functions that uses a least-squares approach to
minimize errors between the specified and the actual frequency response
over sub-bands of the frequency range. The Parks-McClellan
method using the Remez exchange
algorithm for finding an optimal equiripple set of filter
coefficients is used by the remez
function. The
cl2bp
function allows designing FIR filters without
explicitly defining the transition bands for the magnitude response.
Below are some examples of FIR filter design. The magnitude and the
phase of the filter’s frequency response are plotted by the function
freqz
.
## FIR filter design by windowing
# low-pass filter 10 Hz
fs = 256
h <- fir1(40, 10/ (fs / 2), "low")
freqz(h, fs = fs)
# fir2 allows specifying arbitrary frequency responses
f <- c(0, 0.3, 0.3, 0.6, 0.6, 1)
m <- c(0, 0, 1, 1/2, 0, 0)
fh <- freqz(fir2(100, f, m))
op <- par(mfrow = c(1, 2))
plot(f, m, type = "b", ylab = "magnitude", xlab = "Frequency")
lines(fh$w / pi, abs(fh$h), col = "blue")
legend("topright", legend = c("specified", "actual"), lty = 1,
pch = c(1, NA), col = c("black", "blue"))
# plot in dB:
plot(f, 20*log10(m+1e-5), type = "b", ylab = "dB", xlab = "Frequency")
lines(fh$w / pi, 20*log10(abs(fh$h)), col = "blue")
par(op)
title("specify arbitrary frequency responses with fir2")
## 50 Hz notch filter with remez
fs <- 200
nyquist <- fs / 2
f <- c(0, 48.5 / nyquist, 49.5 / nyquist, 50.5 / nyquist, 51.5 / nyquist, 1)
a <- c(1, 1, 0, 0, 1, 1)
h <- remez(200, f, a)
freqz(h, fs = fs)
Filtering causes a delay because weighted samples in the past are
used. FIR filters have a linear phase, so in the time domain this delay
is constant, namely \(N / 2\), where
\(N\) is the filter length (or ‘number
of taps’). The function grpdelay
can be used to calculate
the group
delay; see Figure (a) below. Because phase is linear, it is easy to
to compensate for the filter delay as shown in Figure (b) below.
op <- par(mfrow = c(2, 1))
# design the filter
fs = 256
h <- fir1(40, 30/ (fs / 2), "low")
# group delay is constant at N/2
gd <- grpdelay(h)
plot(gd, ylim = c(0, 40),
main = paste("(a) Group delay for FIR filters is constant\n",
"(here 40 / 2 = 20)"))
# filter electrocardiogram data with added noise
data(signals, package = "gsignal")
npts <- nrow(signals)
ecg <- signals$ecg + 1000 * runif(npts)
time <- seq(0, 10, length.out = npts)
plot(time, ecg, type = "l", main = "(b) Example ECG signal",
xlab = "Time", ylab = "", xlim = c(0,2))
title(ylab = expression(paste("Amplitude (", mu, "V)")), line = 2)
f1 <- gsignal::filter(h, ecg)
lines(time, f1, col = "red", lwd = 2)
delay <- mean(gd$gd)
f2 <- c(f1[(delay + 1):npts], rep(NA, delay))
lines(time, f2, col = "blue", lwd = 2)
legend("topright", legend = c("Original", "Filtered", "Corrected"),
lty = 1, lwd = c(1, 2, 2), col = c("black", "red", "blue"))
Infinite Impulse Response, or recursive, filters are an efficient way of achieving a long impulse response by not only using past input samples, but also past output samples. Hence, an element of feedback (recursion) is used. IIR filters are specified by a set of feedback coefficients (usually termed \(a\)), in addition to feedforward coefficients (\(b\)) as used in FIR filters.
Advantages of IIR filters compared to FIR filters are related to
their efficiency in implementation. IIR filters usually require (much)
fewer filter coefficients, implying a correspondingly fewer number of
calculations. On the other hand, the impulse response of IIR filters
does not always decay to zero, which may result in filter instability
(see the example below). In addition, the phase of IIR filters is not
linear but frequency dependent. Forward and reverse filtering
(filtfilt
) results in zero phase at the expense of
additional computing time (there is no free lunch).
Some important types of IIR filters are:
butter
);cheby1
), or a stopband ripple (Type II - function
cheby2
);ellip
);The following figure compare the frequency responses of (a) 5th order Butterworth and Chebyshev filters, (b) 5th order Butterworth and elliptic filters, and (c) type I and type II Chebyshev filters.
op <- par(mfrow = c(3,1))
# compare Butterworth and Chebyshev filters.
bfr <- freqz(butter(5, 0.1))
cfr <- freqz(cheby1(5, .5, 0.1))
plot(bfr$w, 20 * log10(abs(bfr$h)), type = "l", ylim = c(-80, 0),
xlab = "Frequency (rad)", ylab = c("dB"),
main = "(a) Butterworth and Chebyshev")
lines(cfr$w, 20 * log10(abs(cfr$h)), col = "red")
legend("topright", legend = c("5th order Butterworth", "5th order Chebyshev"),
lty = 1, col = c("black", "red"))
# compare Butterworth and elliptic filters.
efr <- freqz(ellip(5, 3, 40, 0.1))
plot(bfr$w, 20 * log10(abs(bfr$h)), type = "l", ylim = c(-80, 0),
xlab = "Frequency (rad)", ylab = c("dB"),
main = "(b) Butterworth and Elliptic")
lines(efr$w, 20 * log10(abs(efr$h)), col = "red")
legend ("topright", legend = c("5th order Butterworh", "5th order Elliptic"),
lty = 1, col = c("black", "red"))
# compare type I and type II Chebyshev filters.
c1fr <- freqz(cheby1(5, .5, 0.1))
c2fr <- freqz(cheby2(5, 20, 0.1))
plot(c1fr$w, 20 * log10(abs(c1fr$h)), type = "l", ylim = c(-80, 0),
xlab = "Frequency (rad)", ylab = c("dB"),
main = "(c) Type I and II Chebyshev")
lines(c2fr$w, 20 * log10(abs(c2fr$h)), col = "red")
legend ("topright", legend = c("5th order Type I", "5th order Type II"),
lty = 1, col = c("black", "red"))
Using IIR filter coefficients \(b,
a\) can cause numerical problems. Therefore, IIR filter design
functions in gsignal
have an output
parameter,
allowing the filter coefficients to be returned in one of three
forms:
list
containing the moving average polynomial
(feedforward) coefficients \(b\), and
the autoregressive (recursive, feedback) coefficients \(a\);list
containing the coefficients in
zero-pole-gain formlist
of series second order sections (biquads)Although the Arma
form is default for compatibility
reasons, the use of Sos
and the accompanying filtering
function sosfilt
is generally preferred.
A second issue that may occur when using IIR filters, is instability.
This may be the result of numerical rounding errors or because too many
filter coefficients were used. Pole-Zero
Analysis can be useful here. A filter is stable if its impulse
response \(h(n)\) decays to 0 when
\(n\) increases. In terms of poles and
zeros, this is true if all of the filter’s poles are inside the unit
circle in the \(z\)-plane (Smith,
J.O. (2012)). The package gsignal
offers the function
zplane
that displays a filter’s poles and zeros in the
complex \(z\)-plane, as the following
figure illustrates. In the figure the ’0’s represent the zeros, and the
’X’s the poles.
op <- par(no.readonly = TRUE)
n <- layout(matrix(c(1, 2, 3, 3), nrow = 2, byrow = TRUE))
stable <- butter(3, 0.2, "low", output = "Zpg")
# artificially adapt pole
instable <- stable
instable$p[2] <- instable$p[2] - 2
zplane(stable, main = "Stable")
zplane(instable, main = "Instable")
t <- seq(0, 1, len = 100)
x <- sin(2* pi * t * 2.3) + 0.5 * rnorm(length(t))
z1 <- filter(stable, x)
z2 <- filter(instable, x)
plot(t, x, type = "l", xlab = "", ylab = "")
lines(t, z1, col = "green", lwd = 2)
lines(t, z2, col = "red")
legend("bottomleft", legend = c("Original", "Stable", "Instable"),
lty = 1, col = c("black", "green", "red"), ncol = 3)
Because the phase of the frequency response of IIR filters is not linear, the filter delay cannot be easily compensated for as in the FIR case. Recall that the 40-tap 30 Hz low-pass FIR filter used above for filtering the ECG signal had a linear phase and a constant delay of 20 samples. If a 5th order elliptic low-pass filter at 30 Hz is used, it can easily be seen that its phase is not linear (Figure (a) below), and hence the filter delay is dependent of frequency (Figure (b) below). Note that the group delay is defined to be the negative first derivative of the filter’s phase response.
op <- par(mfrow = c(2, 1))
ell <- ellip(5, 0.1, 60, 30/(fs/2), "low")
ellf <- freqz(ell, fs = fs)
argh <- Arg(ellf$h)
argh[which(is.na(argh))] <- 0
phase <- unwrap(argh)
plot(ellf$w, phase, type = "l", xlab = "Frequency (Hz)", ylab = "Phase",
main = paste("30 Hz 5th order elliptical low-pass IIR filter\n",
"phase response is not linear"))
gd <- grpdelay(ell, fs = fs)
#> Warning in grpdelay.default(filt$b, filt$a, ...): setting group delay to 0 at
#> singularity
plot(gd, main = paste("group delay depends on frequency\n",
"mean:", round(mean(gd$gd), 1), "samples"))
This means that the filter delay cannot be compensated for in the
same way as for the FIR filter. An alternative is to use the function
filtfilt
, which applies forward and backward filtering and
thus compensates for the delay, as shown in the figure below.
f <- filter(ell, ecg)
ff <- filtfilt(ell, ecg)
plot(time, ecg, type = "l", xlab = "Time", ylab = "", xlim = c(0,2))
title(ylab = expression(paste("Amplitude (", mu, "V)")), line = 2)
lines(time, f, col = "red", lwd = 2)
lines(time, ff, col = "blue", lwd = 2)
legend("topright", legend = c("Original", "filter()", "filtfilt()"),
lty = 1, lwd = c(1, 2, 2), col = c("black", "red", "blue"))
The most straightforward way to implement a digital filter is by convolving the
input signal with the filter’s impulse response. The package
gsignal
contains several functions for convolution. The
function conv
returns the 1-D convolution of two vectors
a
and b
in 3 ‘shapes’; “full”, for which the
output vector has a length equal to
length(a) + length(b) - 1
; “valid”, which only returns the
central part of the convolution with an output length of
length(a)
; or “valid”, which returns only those parts of
the convolution that are computed without the zero-padded edges (output
length max(length(a) - length(b) + 1, 0)
). For example:
u <- rep(1, 3)
v <- c(1, 1, 0, 0, 0, 1, 1)
conv(u, v, "full")
#> [1] 1 2 2 1 0 1 2 2 1
conv(u, v, "same")
#> [1] 1 0 1
conv(u, v, "valid")
#> NULL
conv(v, u, "valid")
#> [1] 2 1 0 1 2
Two-dimensional convolution of two matrices can be computed by the
conv2
function. In this case, the size of the output matrix
is nrow(A) + nrow(B) - 1
by
ncol(A) + ncol(B) - 1
for “full” convolution,
nrow(A)
by ncol(A)
for “same”, and
max(nrow(A) - nrow(B) + 1, 0)
by
max(ncol(A) - ncol(B) + 1, 0)
for “valid”. The function
conv2
is implemented in C++ for speed.
For long series convolution may be sped up by making use of the fact
that convolution in the time domain is equivalent to multiplication in
the frequency domain. Thus, the two series may be padded to the same
length, converted to the frequency domain by FFT, multiplied point-wise,
and transformed back to the time domain. The function cconv
uses this approach. However, if one series is much longer than the other
(as in typical filtering operations), zero-padding the shorter series to
the length of the longer series may not be the most efficient method. In
such cases, even faster methods like the overlap-add method used by the
function fftconv
may be useful. That’s the theory at
least…
short <- runif(20L)
long <- runif(1000L)
# convolve two long series
ll <- microbenchmark::microbenchmark(conv(long, long),
cconv(long, long),
fftconv(long, long))
plot1 <- ggplot2::autoplot(ll)
# convolve a short and a long series
sl <- microbenchmark::microbenchmark(conv(short, long),
cconv(short, long),
fftconv(short, long))
plot2 <- ggplot2::autoplot(sl)
gridExtra::grid.arrange(plot1, plot2, nrow = 2, ncol = 1)
One-dimensional Filtering in gsignal
is performed by the
function filter
. It is a direct form II transposed
implementation in C++ of the standard linear time-invariant difference
equation \[\sum_{k=0}^{N} a(k+1) y(n-k) +
\sum_{k=0}^{M} b(k+1) x(n-k) = 0; 1 \le n \le length(x)\] If a
matrix is passed to filter
, its columns are filtered.
Two-dimensional filtering can be done using filter2
. The
function fftfilt
uses the overlap-add method and FFT
convolution for speed (but your mileage may vary), and
filtfilt
uses forward and backward filtering to avoid
filter delay, as used above. If numerical stability of filter
coefficients is an issue, filter design using series second order
sections and filtering with the function sosfilt
may be
used.
The functions filter
and sosfilt
can retain
the filter state in order to process long data series in chunks. The
following piece of code shows how this is done. A long series is split
into two parts, which are processed sequentially. In the first call to
filter, the final conditions zf
are asked to be returned
after filtering the first part, which are then passed to the second call
to filter as the initial conditions zi
for the second part
of the series. The two filtered parts can then be concatenated for form
the entire filtered series without discontinuities.
Initial conditions can also be used to set the initial state of the
filter so that the output starts at the same value as the first element
of the signal to be filtered. The initial conditions for the filter can
be computed using the functions filter_zi
, or
filtic
, as shown in the following example.
t <- seq(-1, 1, length.out = 201)
x <- (sin(2 * pi * 0.75 * t * (1 - t) + 2.1)
+ 0.1 * sin(2 * pi * 1.25 * t + 1)
+ 0.18 * cos(2 * pi * 3.85 * t))
h <- butter(3, 0.05)
zi <- filter_zi(h)
## alternatively, use:
## lab <- max(length(h$b), length(h$a)) - 1
## zi <- filtic(h, rep(1, lab), rep(1, lab))
z1 <- filter(h, x)
z2 <- filter(h, x, zi * x[1])
plot(t, x, type = "l", xlab ="", ylab = "")
lines(t, z1, col = "red")
lines(t, z2$y, col = "green")
legend("bottomright", legend = c("Original signal",
"Filtered without initial conditions",
"Filtered with initial conditions"),
lty = 1, col = c("black", "red", "green"))
The power
spectral density (PSD) of a time series describes the distribution
of power (variance) into frequency components composing that signal. The
Fourier
Transform is a nonparametric method of decomposing a signal into its
frequency spectrum. The functions fft
and ifft
compute the Discrete Fourier Transform with a fast algorithm, the FFT. A
parametric alternative for autoregressive (AR) models is available
through the functions ar_psd
, pburg
, or
pyulear
. The following figure shows how both FFT- and
AR-based methods can discover the periodicities of 5 and 12 Hz in a
noisy signal.
op <- par(mfrow = c(3, 1))
fs <- 200
nsecs <- 10
lx <- fs * nsecs
t <- seq(0, nsecs, length.out = lx)
# signal of 5 Hz + 12 Hz + noise
x <- (sin(2 * pi * 5 * t)
+ sin(2 * pi * 12 * t)
+ runif(lx))
plot(t, x, type = "l", xlab = "Time (s)", ylab = "", main = "Original signal")
pw <- pwelch(x, window = lx, fs = fs, detrend = "none")
plot(pw, xlim = c(0, 20), main = "PSD estimate using FFT")
py <- pyulear(x, 30, fs = fs)
plot(py, xlim = c(0, 20), main = "PSD estimate using Yule-Walker")
Welch (1967) proposed a method to estimate the power spectrum that reduces the variance of the spectrum (at the expense of decreasing frequency resolution - remember, there is no free lunch) by splitting the signal into (usually) overlapping segments and windowing each segment, for instance by a Hamming window. The periodogram is then computed for each segment, and the squared magnitude is computed, which is then averaged for all segments. The spectral density is the mean of the modified periodograms, scaled so that area under the spectrum is the same as the mean square of the data. In case of multivariate signals, cross-spectral density, phase, and coherence are also returned. The input data can be demeaned or detrended, overall or for each segment separately.
The following figure shows two signals, a sine and a cosine of 5 Hz with noise added. Sines and cosines are the same