A goal for 2023--I will blame my tardiness on the pandemic? was to start down the difficult path of teaching myself Digital Signal Processing.
Audio is a good vehicle for this pursuit: there're a lot of good programming examples out there, and every DJ loves delay.
Question: what platform to use for experiments?
After research I chose Windows and Python. Windows machines are everywhere; Python is free and easy, and perhaps the path of least resistance.
Before going further: fair warning: Like the last post about learning C++, this post forms my trail of breadcrumbs through my learning process--instead of writing this information in a notebook for later review it's in this blog.
So: mostly useful to me, maybe not a lot of others--I expect like 5 views on this post.
More hardware next time.
Let's Get Started....
My dev platform was simple as possible: a Windows 10 Pro computer and Python 3.11.
The goal was to create Python audio effects and waveforms to create a mono audio signal. If I could get some sound to come out of the crappy speaker I figured it was a good start.
First up: I started by recreating the code found in the video here.
The video, as straightforward as it was, still covered math and terminology unfamiliar to me--as with a lot of DSP concepts, I soon realized that learning DSP could (and probably has) made many sane techs run screaming.
Nevertheless I tried to unpack it....
Oh no--right away imaginary numbers and complex planes reared their ugly heads.
I didn't cover these in school but maybe after decades I am getting the hang of it....it is centered around a seemingly impossible number: the square root of -1.
How can you square a real number and get -1? Turns out it's like online dating: You can only do it if you imagine you can do it....
A square root of -1 makes as much sense as a negative number--you can't have a basket with -3 apples in it--but -3 apples might mean you owe your neighbor 3 apples--negative numbers exist only in our minds.
I had to remember: negative numbers are a human, and not a physical construct; imaginary numbers are the same.
Like negative numbers, imaginary numbers help us solve otherwise unsolvable problems (like x*x + 1 = 0)
Interesting Introductory video to imaginary numbers is here.
I saw imaginary numbers everywhere in the DSP literature I was reading.
Why?
Turns out, convenience: It takes a lot less ink to solve math found in DSP using imaginary numbers vs. real numbers--therefore it's what DSP pros use. A good article about this is here.
S plane, Z plane: more complex number stuff--an X-Y grid of real (X-axis) and imaginary (Y-axis) numbers. S plane is analog; Z plane is discrete. A to D conversions are accomplished with math: "Z plane transform"--just as ADC's chop up analog signals into digital chunks, the transform equation in this context does the same.
I also heard a lot about "Zeros and poles" found in these planes. Hello? A video covering this that I could follow for the most part is here.
And on and on forever....tons and many kilogram tons more.
DSP means math, and not the math you learned in 10th grade. DSP is difficult, unfamiliar math. I am curious how far I can go with this.
But enough reading. The rest of the weekend was spent coding.
PROGRAMMING:
First thing I did with Python DSP was follow the example here to create an audio filter. It worked. Cool!
For the example I needed to include and import the Python numpy and sounddevice libraries--numpy created arrays of samples and sounddevice played them back. Here's the code:
import numpy as np
import sounddevice as sd
# create numpy array filled with random numbers
sampling_rate = 48000
duration_in_seconds = 5
highpass = False # False for lowpass
amplitude = 0.3 # scale output for your PC
duration_in_samples = int(duration_in_seconds * sampling_rate)
#watch the line wrap....
white_noise = np.random.default_rng().uniform(-1,1,duration_in_samples)
#next the creator sets up an "all pass filter"
#which uses phase tricks to knock
#out audio frequencies creating a 12db Q=0 type filter:
cutoff_frequency = np.geomspace(20000, 20, input_signal.shape[0])
#array for output processing.
#We create a np array with all zeros w the length of the output.
allpass_output = np.zeros_like(input_signal)
#create all pass filter
#First, create inner buffer of all pass filter.
dn_1 = 0
#process each sample
for n in range(input_signal.shape[0]):
break_frequency = cutoff_frequency[n]
#calculate output coefficient
tan = np.tan(np.pi * break_frequency/sampling_rate)
a1 = (tan - 1 )/(tan + 1)
#all pass filter difference equation
allpass_output[n] = a1 * input_signal[n] + dn_1
#store value in buffer for next iteration
dn_1 = input_signal[n] - a1 * allpass_output[n]
# we now need to set up the feedback loop based on our filter
if highpass:
allpass_output *= -1
filter_output = input_signal + allpass_output
#scale amplitude of loop
filter_output *=0.5
#played out using the sounddevices library....
filter_output *= amplitude
#play out. soundlibrary
sd.play(filter_output, sampling_rate)
sd.wait()
Ha! This worked. This was an excellent introductory video, and the content creator goes over the math and the code but still keeps it all sane.
And at the end of it--I wrote my first DSP code (well, not really--I just copied the code to my PC with a few minor changes and ran it)
Joy!
What next?
If I could play back and filter white noise numpy arrays, how about doing the same with a ramp wave?
It was not too hard to create a ramp wave using some numpy trickery. For instance in this fragment linspace creates a numpy array from x to y with step z. while tile repeats the elements of numpy array x y times. Viola: Ramp wave.
sample_rate = 44100
grain_size = 100
repeat_grain = 4800
wave_audio = np.linspace(-.9999, .9999, grain_size)
ramp_wave = np.tile(wave_audio, repeat_grain)
Or a x% pulse wave? That was not too bad either, but I got stuck...I also found out (not documented?) that sounddevice as sd won't play a sample if it's greater than or equal to 1 or less than or equal to -1. I couldn't figure out why until I started messing around with floats less than 1 for the max and min values in the array. When I did that suddenly I heard a raspy pulse wave coming from my speaker.
#numpy array for square/Pulse
#enter 1-100 below, PW as percent
PW = 5
PW_as_percent = PW * .01
PW_high = int(grain_size* PW_as_percent)
PW_low = int(grain_size - PW_high)
x = np.full(PW_high,.99)
y = np.full(PW_low,-.99)
sq_wave_single_grain = np.concatenate((x,y))
sq_wave = np.tile(sq_wave_single_grain,repeat_grain)
Finally, a numpy array where the PW varies for each "grain", in this case, PW modulated by a ramp wave.
This one made me look up some numpy techniques but I got it to work.
For this fragement I used "simpleaudio", a library similar to sounddevices, but it uses 16 bit values between 32767 and -32768 instead of floats between -.9999 and .9999--in general, there are a lot of audio libraries for python, a good video summarizing some I messed around with is here.
Here's the PWM code:
sq_wave_pwm_np = np.array([], dtype=np.int16)
x = range(20,99,1) #new PWM value each grain
for n in x:
PW_as_percent = n * .01
PW_high = int(grain_size * PW_as_percent)
PW_low = int(grain_size - PW_high)
pwm_x = np.full(PW_high, 32767)
pwm_y = np.full(PW_low, -32768)
sq_wave_pwm_grain = np.concatenate((pwm_x, pwm_y))
#combine new grain and existing array.
sq_wave_pwm_np = np.append(sq_wave_pwm_grain,sq_wave_pwm_np)
pwm = np.tile(sq_wave_pwm_np,10)
New IDE--VSCODE
in the VSCODE terminal, typed this
From there, it all worked.
What's Next?
I could not get sounddevices' buffering too work, too bad, because it'd be cool to send changes to numpy arrays to a buffer in real time. This would get me a lot closer to what I'd be doing with an embedded system.
I am not sure I have time to get that working, and using numpy arrays sent to sounddevices buffers isn't well documented. But I felt I could have figured it out with more time and effort.
For now it's back to hardware for a bit.
I find myself more and more drawn to the software side of DiWHY. DSP of course is a huge challenge, but I might be up for it--maybe? Not just for audio--for all of it. This could become a big problem.