Peter Dordal, Loyola University CS Department

A sine wave is described by

y = A sin(2𝜋ft)

where t is the time (eg in seconds) and f is the frequency in cycles/sec

One way to think of this is to imagine a frequency of 1000 Hz. That means we should complete one full cycle in 1/1000 sec, which means that when t=1/1000, we have 2𝜋ft = 2𝜋, and at sin(2𝜋) this is indeed the completion of one full cycle.

Sometimes we denote by ω the "angular frequency" (radians per second) 2𝜋f, so that y = A sin(ωt).
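As a quick sanity check on these formulas, here is a minimal Java sketch (class and method names my own) evaluating y = A sin(2𝜋ft); note that at t = 1/f the wave has completed exactly one cycle, and at t = 1/4f it is at its peak:

```java
public class SineDemo {
    // y = A sin(2*pi*f*t): amplitude A, frequency f in Hz, time t in seconds
    static double sine(double A, double f, double t) {
        return A * Math.sin(2 * Math.PI * f * t);
    }

    public static void main(String[] args) {
        double f = 1000.0;                                // 1000 Hz
        // at t = 1/1000 sec the argument is 2*pi: one full cycle, back to 0
        System.out.println(sine(1.0, f, 1.0 / f));        // ~0.0
        // at t = 1/4000 sec the argument is pi/2: quarter-cycle peak
        System.out.println(sine(1.0, f, 1.0 / (4 * f)));  // ~1.0
    }
}
```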

Using cos() instead of sin(), or combining cos() and sin() waves of the same frequency, can always be reduced to the form A sin(2𝜋ft + 𝛿). The quantity 𝛿 here represents the phase: a horizontal shift of the wave.

Electromagnetic signals are often sine waves. They can be transmitted via

- guided media (wire/fiber)
- unguided ("wireless")

When we write A sin(2𝜋ft) we are working in the so-called time domain: the horizontal axis represents time

- typical sine waves
- square waves, nonperiodic signals
- discrete v continuous
- periodic signals
- wavelength (v frequency)
- phase

For any periodic signal with period P, we can express it as the sum of sine waves of periods P, P/2, P/3, P/4, ..... We do also need cosine waves, or else use sin(2𝜋ft+𝛿) (the 𝛿 here represents a horizontal "phase" shift).

This is a deep fact, called Fourier's theorem. Some animated illustrations:

- bl.ocks.org/jinroh/7524988
- toxicdump.org/stuff/FourierToy.swf
- https://commons.wikimedia.org/wiki/File:Fourier_series_square_wave_circles_animation.gif
- https://jackschaedler.github.io/circles-sines-signals/dft_introduction.html

This means any periodic signal can be expressed in the

- **time domain**: as a function s(t)
- **frequency domain**: as the sum of the sine frequencies composing it

- any periodic signal is a superposition of sine waves
- Actually, any signal is, but you have more control of the frequencies with periodic signals
- construction of a square wave with sines: fourier.xls
- discrete v continuous
- deep question: why sin?
- if we use sin/cos as basis, a square wave has infinite bandwidth
- if we use square waves as basis, each individual one has zero bandwidth
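The fourier.xls construction of a square wave can be sketched in Java (class name my own): sum the odd harmonics sin(2𝜋ft)/1 + sin(2𝜋(3f)t)/3 + sin(2𝜋(5f)t)/5 + ...; with more terms, the sum flattens out toward a square wave of height 𝜋/4:

```java
public class SquareWave {
    // partial Fourier sum for a square wave: sum over odd k of sin(2*pi*k*f*t)/k
    static double partialSum(double f, double t, int terms) {
        double sum = 0.0;
        for (int i = 0; i < terms; i++) {
            int k = 2 * i + 1;                       // odd harmonics 1, 3, 5, ...
            sum += Math.sin(2 * Math.PI * k * f * t) / k;
        }
        return sum;
    }

    public static void main(String[] args) {
        // at the middle of the positive half-cycle (t = 1/4f, with f = 1)
        // the sum converges to pi/4 ~ 0.785
        System.out.println(partialSum(1.0, 0.25, 3));    // 1 - 1/3 + 1/5 = 0.8666...
        System.out.println(partialSum(1.0, 0.25, 1000)); // close to pi/4
    }
}
```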

Fact: physical transmission media have natural action on sine waves, due to things like "capacitance". This action is different at different frequencies (eg high-frequency attenuation or distortion).

Physical vibrations (such as in our ears) generally involve natural combinations of sine waves.

Discussion of the benefits of Fourier analysis and the frequency-domain view. What if distortion is linear, but:

- higher-frequency sine waves transmit faster/slower
- higher-frequency sine waves get damped more

Then, working in the frequency domain, we can:

- separate out the individual sine-wave frequencies
- throw away the out-of-band frequencies
- recombine what's left

There are two sources of error: the sampling rate, and the fact that each individual A[i] is rounded off slightly to the nearest integer; this latter is called quantization error (or quantization noise).

This strategy is traditionally known as PCM (Pulse Code Modulation) encoding.

For wav files, we can specify the sampling rate; in the telephony world R=8000 is standard and for CD sound R=44,100.

For telephony, we generally use 8-bit values for the A[n]'s, from +127 to -127, appropriately scaled. Suppose s(t) is the signal level, on a -1≤s(t)≤1 scale, and let t[n] = n/R be the time of the nth sample; then A[n] is 127*s(t[n]), rounded to the nearest integer.

One issue with sound is the "ceiling" on the signal s(t): if we require -1≤s(t)≤1 for all t, then 1.0 represents the absolute maximum volume. We have to scale down the intensity level to conform to this. Suppose the maximum peaks in a music file are 100 times the typical level; then typical samples are scaled down to a small fraction of the available range, and lose precision accordingly.
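A minimal sketch of the 8-bit quantization step (class and method names my own), assuming a -1≤s≤1 signal as above; the round-off is exactly the quantization error:

```java
public class Quantize {
    // map s in [-1,1] to a signed 8-bit value in [-127,127]
    static byte quantize(double s) {
        return (byte) Math.round(127 * s);
    }

    // the level the receiver reconstructs from the 8-bit value
    static double dequantize(byte b) {
        return b / 127.0;
    }

    public static void main(String[] args) {
        double s = 0.5004;
        byte q = quantize(s);                  // rounds to 64
        System.out.println(q);
        // quantization error: original level minus reconstructed level
        System.out.println(s - dequantize(q));
    }
}
```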

+-------------+
| "RIFF" (4)  |   RIFF chunk: first 12 bytes of the file
+-------------+
| pkglen (4)  |
+-------------+
| "WAVE" (4)  |
+-------------+

+-------------+
| "fmt " (4)  |   FORMAT chunk: 24 bytes (the chunk id is "fmt" plus a space)
+-------------+
| 0x10 (4)    |   subchunk size: 16 (= 0x10) for PCM
+-------------+
| 0x01 (2)    |   audio format: 1 for linear (PCM) quantization
+-------------+
| 1 or 2 (2)  |   number of channels: 1 for mono, 2 for stereo
+-------------+
| rate (4)    |   sample rate, in Hertz, eg 8000 or 44,100
+-------------+
| B/sec (4)   |   bytes per second, = rate * (bytes/sample)
+-------------+
| B/samp (2)  |   bytes per sample (eg 2 for 16-bit mono)
+-------------+
| b/samp (2)  |   bits per individual single-channel sample
+-------------+

+-------------+
| "data" (4)  |   DATA chunk
+-------------+
| len (4)     |   length of data, in bytes; SOMETIMES WRONG!!
+-------------+

My file WavReader.java reads *.wav files into arrays, and writes arrays out into .wav files.
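WavReader.java itself is not reproduced here; as a sketch only (class and method names are my own, not the file's), the 44-byte header laid out above could be written like this, with all multi-byte fields little-endian:

```java
import java.io.*;

public class WavWriteSketch {
    // write a little-endian 32-bit int
    static void le32(OutputStream o, int v) throws IOException {
        o.write(v & 0xff); o.write((v >> 8) & 0xff);
        o.write((v >> 16) & 0xff); o.write((v >> 24) & 0xff);
    }
    // write a little-endian 16-bit int
    static void le16(OutputStream o, int v) throws IOException {
        o.write(v & 0xff); o.write((v >> 8) & 0xff);
    }

    // header for 16-bit mono PCM; dataLen = length of the sample data in bytes
    static void writeHeader(OutputStream o, int rate, int dataLen) throws IOException {
        o.write("RIFF".getBytes());
        le32(o, 36 + dataLen);          // pkglen: rest of the file
        o.write("WAVE".getBytes());
        o.write("fmt ".getBytes());     // note the trailing space
        le32(o, 16);                    // 0x10 for PCM
        le16(o, 1);                     // 1 = linear (PCM) quantization
        le16(o, 1);                     // mono
        le32(o, rate);                  // sample rate, Hz
        le32(o, rate * 2);              // bytes/sec = rate * bytes/sample
        le16(o, 2);                     // bytes per sample (16-bit mono)
        le16(o, 16);                    // bits per sample
        o.write("data".getBytes());
        le32(o, dataLen);               // data length; SOMETIMES WRONG in the wild
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream b = new ByteArrayOutputStream();
        writeHeader(b, 8000, 16000);    // one second at R=8000, 16-bit mono
        System.out.println(b.size());   // the header is 44 bytes
    }
}
```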

Some 16-bit digitized waves:

440 Hz A: text wav

~440 Hz A: text wav (in the text file the period is rounded off to 18 samples; 8000/440 is really 18.18)

294 Hz D: text wav

~294 Hz D: text wav (in the text file the period is rounded off to 27 samples from 27.2)

- WavReader library (uses objects for reading, but writeWav() is a static method)
- reader.java
- osc_skel.java, mixer_skel.java

To create a sine wave, we can do this:

sox -n A4.wav synth 3 sine 440

sox -n sqr.wav synth 3 square 440

sox -n saw.wav synth 3 saw 440

We can look at these with Audacity.

Now let's make a chord: D4 = 293.66 Hz

Once we have A4.wav and D4.wav, let's mix them:

sox --combine merge A4.wav D4.wav AD.wav; play AD.wav

How about CEGC? Frequencies: 262, 330, 392, 523.

sox -n C.wav synth 3 sine 262

...

sox --combine merge C.wav E.wav G.wav C5.wav CEGC.wav

The end result is a 4-channel sound file. If we want only one channel, we can use

sox CEGC.wav CEGC1.wav remix -m 1-4

This sounds just a bit louder.

How about beat frequencies:

sox -n A441.wav synth 3 sine 441

sox --combine merge A4.wav A441.wav beat.wav remix 1-2

We can get information about sound files by using the soxi command (or, equivalently, sox --i):

- 440.wav, 440-442beat.wav
- cantdo.wav
- note the various filtering options

Examples:

- square wave
- triangle wave
- others

Most voice signals are not periodic at larger scales: people say different words.

What is useful, though, is the idea that over small-but-not-too-small intervals, say 10ms, voice does indeed look like a set of periodic waves. At this point, the frequency-domain view becomes very useful, because it is just a set of discrete frequencies.

In other words, "sound is made up of vibrations".

G.729 voice encoding (compresses each 160-byte, 10 ms block of samples into 10 bytes, ie 8 kbit/s): Step 1 is, for each block, to try to match the "best fit" pure frequency.

Maybe even more importantly, most digital data signals similarly look "almost periodic", often over much larger intervals than voice. So the same principle applies: translation to the frequency domain is actually useful.

When we start looking at the details of highpass, lowpass, and bandpass filters (eg for audio), we'll see that they involve averaging over a time interval. If that time interval is too large, the signal may no longer look very periodic and we may get misleading or distorted results. If the time interval is too small, the filter may not be "sharp" enough.

For many purposes, we just encode the sound and leave it at that until it reaches the receiving end. But sometimes we want to "tweak" the sound; the classic example is to run it through a highpass, lowpass, or bandpass filter, to eliminate frequencies that are too low, too high, or not in the middle. Another example is to try to filter out some of the noise (this works best if the noise is not purely random).

One way to think of the Fourier series is as a set of averaging operations: the sequence A[n] is "averaged against" a sine (or cosine) wave sin(2𝜋Fn), where F is the "frequency" in units of the sampling rate. (More specifically, if the frequency in Hz is f, then F = f/R, where R is the sampling rate; equivalently, f = FR. For a 400-Hz tone at R = 8000, F = 1/20, meaning that 20 samples make up one full sine wave.)

The idea behind "averaging against" is that we're taking a weighted average of the A[n], eg 1/4 A[1] + 2/3 A[2] + 1/12 A[3]. However, with sin(n) involved, some of the weights can be negative.

Averaging two waveforms, in two arrays A and B, amounts to getting the following sum:

double sum = 0.0;

for (int i = 0; i < whatever; i++) sum += A[i]*B[i];

"whatever" might be A.length, or max(A.length,B.length). We might also "shift" one of the arrays, so the sum is

sum += A[i]*B[shift+i]

In some cases it is natural to reverse the direction in B:

sum += A[i]*B[shift−i]

If we're averaging against a sine wave sin(2𝜋fn) of frequency f, we might have something like this:

for (n=0; n<len; n++) sum+=A[n]*sin(2*Pi*f*n);

avg = sum / len

If you're familiar with calculus, this more-or-less corresponds to ∫ A(x) sin(2𝜋fx) dx

Digitized sound makes these sums easy to do, though; no calculus is needed. Sometimes it's unclear what you divide by (eg len, above; sometimes it's actually the time in seconds), but if we divide by the wrong thing we're usually just off by a constant.

Example 1: two sine waves of different frequency average to zero; phase does not matter

Two of the same frequency average to, well, maybe 1/2 (the average of sin²(x) over a full period is 1/2); here the relative phase does matter.
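Both cases can be checked numerically; a sketch (class and method names my own), using whole numbers of cycles per sample window so the sums come out essentially exact:

```java
public class SineAverage {
    // average of sin(2*pi*f1*n/len) * sin(2*pi*f2*n/len) over len samples,
    // ie f1 and f2 whole cycles respectively per window
    static double average(int f1, int f2, int len) {
        double sum = 0.0;
        for (int n = 0; n < len; n++)
            sum += Math.sin(2 * Math.PI * f1 * n / len)
                 * Math.sin(2 * Math.PI * f2 * n / len);
        return sum / len;
    }

    public static void main(String[] args) {
        System.out.println(average(3, 5, 1000));  // different frequencies: ~0
        System.out.println(average(3, 3, 1000));  // same frequency: ~0.5
    }
}
```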

Assume the signal g(x) is periodic, and is composed of sines, and has base frequency 1 (or 1/2𝜋). We know g(x) must have the following form; we just need to find A1, A2, etc:

g(x) = A1 sin(x) + A2 sin(2x) + A3 sin(3x) + ...

To do this, we start by averaging both sides with sin(x):

Average(g(x), sin(x)) = A1 * average(sin(x), sin(x)) + A2 * average(sin(2x), sin(x)) + A3 * average(sin(3x), sin(x)) + ...

= A1 * 1/2 (by Example 1, all the cross terms average to zero)

From this we can readily calculate A1, and by averaging with sin(2x) get A2, etc.
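A numerical sketch of this procedure (the class name and the test signal g(x) = 4 sin(x) + 3 sin(2x) are my own choices): average g against sin(kx) and multiply by 2, since the average of sin(kx) against itself is 1/2:

```java
public class FourierCoeff {
    // a sample signal whose coefficients we pretend not to know
    static double g(double x) { return 4 * Math.sin(x) + 3 * Math.sin(2 * x); }

    // Ak = 2 * average of g(x)*sin(k*x) over one period, sampled at len points
    static double coeff(int k, int len) {
        double sum = 0.0;
        for (int n = 0; n < len; n++) {
            double x = 2 * Math.PI * n / len;
            sum += g(x) * Math.sin(k * x);
        }
        return 2 * sum / len;
    }

    public static void main(String[] args) {
        System.out.println(coeff(1, 1000));  // recovers A1 = ~4.0
        System.out.println(coeff(2, 1000));  // recovers A2 = ~3.0
        System.out.println(coeff(3, 1000));  // A3 = ~0.0: not present in g
    }
}
```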

In the above averages, we calculate one fixed sum (eg for (i=0;i<max;i++) sum+=g(i)*sin(i)). There is another kind of averaging, where we repeatedly "shift" one of the two functions involved, to get a new sequence / waveform / function. This is also called moving averages. Things don't necessarily have to be periodic here.

Here's a simple example, defining a new sequence B[n] from existing sequence A[n]. If A[n] is defined for n from 0 to 1000, inclusive, then B[n] will be defined for n from 1 to 999.

B[n] = (A[n-1] + A[n] + A[n+1])/3

We can think of this as a weighted average of A[] with (1/3, 1/3, 1/3). The point here is that we get a new sequence, B[n], instead of a single value like A1 above. However, the actual calculation is the same: find the sum of a bunch of A[n]*f(n) products.
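The (1/3, 1/3, 1/3) moving average as a minimal sketch (class and method names my own):

```java
public class MovingAverage {
    // B[n] = (A[n-1] + A[n] + A[n+1]) / 3, for interior indices of A;
    // the result is two entries shorter than the input
    static double[] smooth(double[] A) {
        double[] B = new double[A.length - 2];
        for (int n = 1; n < A.length - 1; n++)
            B[n - 1] = (A[n - 1] + A[n] + A[n + 1]) / 3;
        return B;
    }

    public static void main(String[] args) {
        double[] A = {0, 3, 6, 9, 12};
        double[] B = smooth(A);
        for (double b : B) System.out.print(b + " ");  // 3.0 6.0 9.0
    }
}
```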

Any linear transformation of the input signal can be expressed this way, though some may need lots and lots of coefficients and finding the coefficients might in some cases be difficult.

However, there are some shortcuts to the find-the-coefficients problem. If we know what we want in the frequency domain, we can apply some of the math above to find the corresponding coefficients for a corresponding moving average in the time domain. For example, suppose we want to create a "lowpass filter", to cut off the higher frequency signals. This is an easy point-by-point thing to express, in the frequency domain:

The filtering sequence:

(1, 1, 1, ...., 1, 0, 0, 0, ...)
                ^cutoff

Apply this to a sequence of frequency amplitudes (a1, a2, a3, a4, a5, a6, a7, a8) (representing respective amplitudes, say, of sin(x), sin(2x), ..., sin(8x)). Assume the cutoff is at position 4, and multiply corresponding values. We get

(1*a1, 1*a2, 1*a3, 1*a4, 0*a5, 0*a6, 0*a7, 0*a8)

= (a1, a2, a3, a4, 0, 0, 0, 0)

Point-by-point multiplication in the frequency domain corresponds to a specific moving average in the time domain, and we can calculate that in a relatively straightforward way (although it does involve some calculus).

In this particular case of a frequency-domain cutoff, above, it turns out that the time-domain moving average we need is the "sinc" function (with a "c", pronounced "sink"):

sinc(x) = sin(x)/x (sinc(0) = 1)

Moving averages of this sort are sometimes called convolution. The function you are averaging against (eg sinc(x)) is sometimes called a "kernel", not to be confused with the OS use of that term. The new signal after filtering is array B[], where

B[n] = ( sum of A[n+t]*sinc(2𝜋f*t), for -M ≤ t ≤ M ) / 2𝜋f

This is defined for M ≤ n ≤ A.length − 1 − M (computing B[n] needs A[n−M] through A[n+M]).

Other common kernels:

- Identity ("delta" function)
- windowed sin(x): gives estimate of local frequency
- high-pass (Identity - sinc)
- bandpass: combination of lowpass and highpass

- echo
- reverb

Associativity: bandpass kernel: applying a lowpass filter L and then a highpass filter H is equivalent to convolving L and H together (L*H) and then applying that: (Signal*L)*H = Signal*(L*H)

Note that the sinc() kernel extends infinitely far in both directions. In practical terms, we have to truncate, generally at a point where sinc(x)=0 (that's what M was, above), and we should also taper off as we approach ±M. The other practical problem is that convolution with sinc(x) involves some knowledge of the "future" sound: to get B[n] we need to know A up to A[n+M]. If n represents the time, then we need to know values of A[i] M units into the future. One solution is to just accept that B is delayed by M units.
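A sketch along these lines (class and method names my own): a truncated-sinc lowpass with cutoff f in cycles per sample and half-width M. No taper window is applied, though as noted above one should be; and the kernel is normalized by the sum of its weights rather than by 2𝜋f, which per the remark earlier just changes things by a constant (and makes a constant signal pass through unchanged):

```java
public class SincFilter {
    static double sinc(double x) { return x == 0 ? 1.0 : Math.sin(x) / x; }

    // lowpass A with cutoff f (cycles per sample), kernel half-width M;
    // output B is delayed by M samples and 2M samples shorter than A
    static double[] lowpass(double[] A, double f, int M) {
        double[] k = new double[2 * M + 1];
        double total = 0.0;
        for (int t = -M; t <= M; t++) {
            k[t + M] = sinc(2 * Math.PI * f * t);   // truncated sinc kernel
            total += k[t + M];
        }
        double[] B = new double[A.length - 2 * M];
        for (int n = M; n < A.length - M; n++) {
            double sum = 0.0;
            for (int t = -M; t <= M; t++) sum += A[n + t] * k[t + M];
            B[n - M] = sum / total;   // normalize so a constant passes unchanged
        }
        return B;
    }

    public static void main(String[] args) {
        // slow wave (1/100 cycles/sample) plus fast wiggle (1/4 cycles/sample)
        double[] A = new double[400];
        for (int n = 0; n < 400; n++)
            A[n] = Math.sin(2 * Math.PI * n / 100) + 0.5 * Math.sin(2 * Math.PI * n / 4);
        double[] B = lowpass(A, 0.05, 40);  // cutoff 1/20 cycles/sample
        // B now tracks the slow component; the fast wiggle is largely gone
        System.out.println(B[0]);
    }
}
```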

Demo: sincfilter.ods: This is a spreadsheet where we average sinc(f1x) against sine waves sin(f2x) of various frequencies.

In the spreadsheet, we average just for x>0. Symmetric averaging about the origin with sin(x) is always exactly 0; cos(x) is the "worst case". Any sine wave in the real world has a phase, referring to its relative starting position d: Asin(fx + d). This can also be expressed as Bsin(fx) + C cos(fx) (it's a trig identities thing) and, as noted, all we have to worry about is the cos(fx) part.

To use the spreadsheet, you supply a filter frequency f1 and the frequency f2 of the sine wave being filtered.

Note that the frequencies f1 and f2 can be fractional.

Demo: play cantdo.wav subject to various highpass and lowpass filters.

Demo: look at the spectrograms for some .wav files

spectrum: range of frequencies, from f1 to f2, present in the signal

bandwidth: width of the spectrum, = f2 − f1

effective bandwidth: frequencies you actually need

DC component: the component of frequency 0

major point: in Frequency-division multiplexing, each channel is assigned a very specific "band-width"

Note: standard electronic circuits can extract "bands", hence the practical importance of this notion

Note that this is a different use of "band-width" from "data rate"

There is some relation, however: the narrower the analog band-width, the less room there is for multiple frequencies and modulation, and the less data capacity.


Example on data rate ("bandwidth" in the digital sense) versus bandwidth

Suppose we want to transmit a square wave, alternating from -1 to +1. We have a band-width of 4 MHz.

Case 1: we use sin(2𝜋ft) + (1/3)sin(2𝜋(3f)t) + (1/5)sin(2𝜋(5f)t) (that is, three terms of the Fourier series), where f = 1 MHz.

Look at this with fourier.xls: does it look squarish?

The frequencies are 1 MHz, 3 MHz and 5 MHz. The band-width is 5−1 = 4 MHz.

The data rate is 2 Mbps (sort of; we are sending 1 M 1-bits and 1 M 0-bits per second)

Note dependence on notion of what waveform is "good enough"

Case 2: If we double all the frequencies to 2 MHz, 6 MHz, 10 MHz, we get band-width 8 MHz, data rate 4 Mbps. This is the same as above with f = 2 MHz.

Case 3: We decide in the second case that we do not need the 10 MHz component, due to a more accurate receiver. The base frequency is f = 2MHz, frequencies 2MHz, 6MHz, band-width is now 6MHz - 2MHz = 4MHz, data rate: 4Mbps

Look at this with fourier.xls.

Note that we're really not carrying data in a meaningful sense; we can't send an arbitrary sequence of 0's and 1's this way. However, that's done mostly to simplify things.

Note also the implicit dependence on bandwidth of the fact that we're decomposing into sinusoidal waves.

Voice transmission: frequency band-width ~ 3-4kHz (eg 300 Hz to 3300 Hz)

64 kbps encoding (8 bits sampled 8000 times a second)

Modems have the reverse job: given the band-width of ~3 kHz, they have to send at 56 kbps!

Note: we're looking at DISCRETE frequency spectrum (periodic signals). CONTINUOUS frequency spectrum also makes mathematical sense, but is kind of technical

Note that the frequency-domain notion depends on the fundamental theorem of Fourier analysis: every periodic function can be expressed as a sum of sines & cosines (all with frequency an integral multiple of the original).

Bandwidth of voice: <=4 kHz

quite different meaning from digitization: 64kbps

But if we wanted to encode the digitized voice back into a voice bandwidth, we'd have to encode 16 bits per cycle (Hertz), which is a little tricky.

Amplitude modulation & band-width

Note that AM modulation (ALL modulation, in fact) requires a "band-width"; ie range of frequencies. This will be very important for cellular.

AM: signal = [1 + data(t)]*sin(2𝜋ft)

f is the high "carrier" frequency, eg 100,000 Hz

Suppose data(t) = sin(2𝜋gt), with g a much lower frequency (eg 1000 Hz)

Then sin(2𝜋ft)*sin(2𝜋gt) = 0.5 cos(2𝜋(f-g)t) - 0.5 cos(2𝜋(f+g)t)
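The product identity can be spot-checked numerically (class name my own; the frequencies follow the example above):

```java
public class AMSidebands {
    // difference between the product sin(2*pi*f*t)*sin(2*pi*g*t) and its
    // expansion 0.5*cos(2*pi*(f-g)*t) - 0.5*cos(2*pi*(f+g)*t); should be ~0
    static double diff(double f, double g, double t) {
        double lhs = Math.sin(2 * Math.PI * f * t) * Math.sin(2 * Math.PI * g * t);
        double rhs = 0.5 * Math.cos(2 * Math.PI * (f - g) * t)
                   - 0.5 * Math.cos(2 * Math.PI * (f + g) * t);
        return lhs - rhs;
    }

    public static void main(String[] args) {
        double f = 100000, g = 1000;       // carrier and data frequencies, Hz
        for (int i = 0; i < 5; i++) {
            double t = i * 1.23e-6;        // a few arbitrary sample times
            System.out.println(diff(f, g, t));   // ~0 at every t
        }
    }
}
```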

band of frequencies: (f-g) to (f+g)

band-width: 2g