What is Audio?
The simplified version of “what is sound or audio” ….
Audio, or sound, is basically a very fast moving “pressure wave”, changing in height (volume) and width (pitch).
Such a wave is generated by the vibration of an object. Take the speaker of your sound system: if you’ve looked at it closely, you can actually see the speaker moving back and forth, causing this “pressure wave”. The same goes for a needle you drop on a tile floor.
These waves are caught by your ear drum which moves in and out as well, because of the pressure waves, and gives us the ability to hear.
Amplitude and Wavelength
Sound broken down in Amplitude (volume) and Wavelength (pitch)
The height (or: the amount up, or down) of a wave is called “Amplitude” (A) and determines volume.
The further away the wave goes from the zero line (the horizontal line in the drawing), the louder it will be.
This goes for both directions – a large negative swing will be loud as well!
The width of the wave, called wavelength (B), determines the sound pitch.
The wavelength is the distance over which the wave repeats itself. In the simplistic example above: where I marked the drawing “B” you will see the sine go up, then down past the zero line, and up again until it hits the zero line again.
This “S” shape (sine) is the repeating shape.
Note that the frequency of real sound is a lot higher than the drawing might suggest.
The average person can hear between 20 and 20,000 Hz, where 1 Hz (Hertz) matches 1 of these “S” shapes per second.
I found this really nice animation illustrating that – I did slow down the original (source).
At a frequency of 1.0 Hz you will see one full “S” shape, at 2.0 Hz you will see 2 full “S” shapes, and so on …
Hertz – Wave frequency
The wider the waves, the lower the frequency, since fewer waves fit in a second. This will sound as a low-pitched sound (bass).
The more narrow a wave becomes, the higher the frequency will be, sounding much higher pitched (treble).
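To put some numbers on this, here is a small Python sketch that converts a frequency into the time one wave takes (the period) and the physical length of one wave in air. The speed of sound (343 m/s, roughly air at room temperature) and the function names are my own assumptions for this illustration.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumption: air at about 20 degrees Celsius

def period_seconds(frequency_hz):
    """Time one full "S" shape takes, in seconds."""
    return 1.0 / frequency_hz

def wavelength_meters(frequency_hz, speed=SPEED_OF_SOUND_M_PER_S):
    """Physical length of one full "S" shape in air, in meters."""
    return speed / frequency_hz

# Lowest audible tone, a mid-range tone, and the highest audible tone.
for freq in (20, 440, 20_000):
    print(f"{freq:>6} Hz: period {period_seconds(freq) * 1000:.3f} ms, "
          f"wavelength {wavelength_meters(freq):.4f} m")
```

A 20 Hz bass tone comes out at roughly 17 meters per wave, while a 20,000 Hz tone is under 2 centimeters: wide waves, low pitch; narrow waves, high pitch.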
Audio and your computer or MP3 player
We all know the rumor that computers are pretty stupid – they can only count 0 and 1.
You might or might not know that your MP3 player or cellphone are in reality just tiny computers, so they are stupid on a smaller scale.
Sound is obviously more complex than just 0 and 1.
It is a continuous change of “height” and “width” at any given time and cannot be quantified by a fixed set of numbers – the variations are limitless, which is what makes sound waves look smooth.
Think of it as a circle, which has an unlimited number of “sides” or points. This is referred to as analog.
The dumb computer can only work with a fixed set of numbers, which is called digital.
In the analogy of the circle, the computer can only work with rectangles having 4 fixed sides (4 points).
So how does your computer record or play sound?
Well, for that purpose some very smart people created AD and DA converters. They convert from Analog to Digital (ADC) and Digital to Analog (DAC).
We just learned that a human can hear between 20 and 20,000 Hz.
So our ears can practically only detect sound waves that repeat 20 to 20,000 times per second (some people can hear outside of this range).
The computer now tries to measure the amplitude just often enough per second that we humans cannot hear the difference, which gives the computer the chance to convert what it measured into a fixed number range.
Think of the amplitude as a voltage – the higher the amplitude the higher the voltage. The computer “reads” the voltage and converts it to a number.
Now I hear you say: but you said that the computer was stupid and can only count 0 and 1? How is that going to be of any use?
Yep that’s true, but it’s still useful!
Several of these “1” and “0” values (bits) combined can represent a larger value.
This is called the binary numbering system.
For example: A computer can combine 8 bits to represent an integer value between 0 and 255 – this is called a BYTE.
When you combine 16 bits (a WORD or CHOMP) a value between 0 and 65,535 can be represented – which is what an Audio CD uses, the so-called 16-bit sampling.
We humans typically count from 0 to 9.
When we combine several of these digits, we can get to higher numbers as well.
We call that the decimal numbering system.
If we combine up to 8 of the digits 0 to 9, then we can represent a number between 0 and 99,999,999.
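The comparison between binary and decimal can be checked with a few lines of Python. This is just a sketch; the function names are made up for this example.

```python
def max_binary_value(bits):
    """Largest value that `bits` binary digits (0 or 1) can represent."""
    return 2 ** bits - 1

def max_decimal_value(digits):
    """Largest value that `digits` decimal digits (0 to 9) can represent."""
    return 10 ** digits - 1

print(max_binary_value(8))     # 255       (a byte)
print(max_binary_value(16))    # 65535     (16 bits, as on an Audio CD)
print(max_decimal_value(8))    # 99999999  (8 decimal digits)

# The same number written both ways:
print(format(200, "08b"))      # 11001000
print(int("11001000", 2))      # 200
```

Both systems work the same way; binary simply has 2 symbols per position instead of 10, so it needs more positions to reach the same value.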
So basically the computer converts a smooth curve to little stairs as you can see in the image below.
Audio – Analog (red) to Digital (blue)
Each “step” represents a value between 0 and 65,535 in the case of an Audio CD.
This is called sampling.
Note: In the illustration we’re not using a 16 bit sampling but used a much rougher scale to make the “stairs” more visible.
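Here is a rough Python sketch of the “stairs” idea: it measures a sine wave a few times and converts each measurement to a whole number, once on a coarse 3 bit scale (like the illustration) and once on a 16 bit scale. Mapping the range -1 to +1 onto 0 to 65,535 follows the simplified picture used here; real audio formats commonly store signed values instead, and the function names are only for illustration.

```python
import math

def quantize(value, bits):
    """Map a value in [-1.0, 1.0] to a whole-number step in [0, 2**bits - 1]."""
    levels = 2 ** bits - 1
    clamped = max(-1.0, min(1.0, value))   # keep the input inside the range
    return round((clamped + 1.0) / 2.0 * levels)

def sample_sine(frequency_hz, sample_rate, count, bits):
    """Take `count` samples of a sine wave and quantize each one."""
    return [
        quantize(math.sin(2 * math.pi * frequency_hz * n / sample_rate), bits)
        for n in range(count)
    ]

# One full wave of a 1,000 Hz tone, measured 8,000 times per second.
coarse = sample_sine(1000, 8000, 8, bits=3)    # only 8 possible steps
fine = sample_sine(1000, 8000, 8, bits=16)     # 65,536 possible steps
print(coarse)
print(fine)
```

The coarse list only ever contains values from 0 to 7, so neighbouring measurements get lumped onto the same “stair”; the 16 bit list follows the curve far more closely.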
Now the next important aspect of sampling is how often the computer takes samples – wouldn’t make much sense if the computer does this only once a minute, right?
With an Audio CD as an example, this will be 44,100 times per second (roughly 44,000), which is called a sample rate of 44,100 Hz (another common sample rate is 48,000 Hz).
Why 44,100? Well, there is a theorem (the Nyquist–Shannon sampling theorem) that basically says that the sample rate should be more than twice as high as the highest frequency you want to capture – which is 20,000 Hz for the human ear – so times 2 and a little extra margin = 44,100/sec.
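To see why the sample rate has to be more than twice the highest frequency, here is a small Python sketch using the Audio CD rate of 44,100 samples per second. It samples a 30,000 Hz tone (too high for this sample rate) and a 14,100 Hz tone, and compares the measurements. The chosen frequencies and names are just my example values.

```python
import math

SAMPLE_RATE = 44_100  # Audio CD sample rate, in samples per second

def sample_cosine(frequency_hz, count, sample_rate=SAMPLE_RATE):
    """Return `count` samples of a cosine wave at the given frequency."""
    return [
        math.cos(2 * math.pi * frequency_hz * n / sample_rate)
        for n in range(count)
    ]

too_high = sample_cosine(30_000, 10)             # above half the sample rate
alias = sample_cosine(SAMPLE_RATE - 30_000, 10)  # 14,100 Hz

# Compare the two lists, allowing for tiny floating point differences.
same = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(too_high, alias))
print(same)
```

The two lists come out the same: once sampled, the computer cannot tell the 30,000 Hz tone from a 14,100 Hz tone. Anything above half the sample rate gets “folded back” into the audible range, which is why the sample rate needs to stay above twice the highest frequency you care about.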
This means that when we are talking about stereo sound (2 channels, one left and one right), the computer will be taking 88,200 samples per second at the Audio CD rate of 44,100 samples per second per channel – which results in a lot of data very quickly.
For a 3.5-minute song in stereo, this means:
2 channels x 44,100 samples x 2 bytes per sample x 3.5 minutes x 60 seconds = 37,044,000 bytes … or roughly 37 megabytes.
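The same calculation as a small Python sketch, so you can try other song lengths. The function name is made up; the default of 44,100 is the exact Audio CD sample rate, and passing `sample_rate=44_000` gives the rounded result of 36,960,000 bytes instead.

```python
def pcm_size_bytes(seconds, sample_rate=44_100, channels=2, bytes_per_sample=2):
    """Size of uncompressed audio: one sample per channel, per tick."""
    return int(seconds * sample_rate * channels * bytes_per_sample)

song_seconds = 3.5 * 60          # a 3.5-minute song
size = pcm_size_bytes(song_seconds)
print(size)                      # 37044000
print(size / 1_000_000)          # 37.044 (megabytes, counting 1,000,000 bytes each)
```

Either way you end up at roughly 37 megabytes for a single song, which is why compressed formats like MP3 became so popular.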
One of the most common problems we see or hear with audio is sound distortion caused by clipping. It is typically not a good sign, and more often than not it is not good for your equipment either. It most often occurs when the volume is set too high and your speakers can’t handle it, or when you’re trying to push your amplifier too far and its power supply simply cannot provide enough juice to go that far.
Clipping doesn’t only happen with speakers. It happens basically any time you feed a device (a recording device or a speaker for example) too much “power”. This means the nice sine curves we saw before will no longer be nice and smooth: the sine of the source audio goes outside of the range that the receiving device can handle.
The illustration below demonstrates what happens.
The first wave looks nice and smooth, let’s say this is what your audio source is providing.
The second wave shows what this wave looks like for the receiving device: it goes outside of the range (the 2 light gray lines) of what your recorder or speaker can handle. The device simply cuts off anything outside of its range … and that’s what causes the distortion you’re hearing when clipping occurs. In most of these cases it also means that your speaker doesn’t get to move smoothly, and instead has to move in weird ways.
Audio Clipping …
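In numbers, clipping looks something like the Python sketch below. The range of -1.0 to +1.0 and the “1.5 times too loud” source are assumptions I picked for the example; a real device has its own limits.

```python
import math

def clip(sample, limit=1.0):
    """Cut off anything outside the range [-limit, +limit]."""
    return max(-limit, min(limit, sample))

# One wave of a sine that is 1.5 times louder than the device can handle.
source = [1.5 * math.sin(2 * math.pi * n / 16) for n in range(16)]
received = [clip(s) for s in source]

for original, clipped in zip(source, received):
    marker = "  <- clipped" if original != clipped else ""
    print(f"{original:+.3f} -> {clipped:+.3f}{marker}")
```

Everything between -1.0 and +1.0 passes through untouched, but the tops and bottoms of the wave are flattened, which is exactly the “cut off” shape in the illustration.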
Digital back to Analog
When the computer, or your MP3 player, is used to play back the digital audio, a Digital to Analog Converter (DAC) converts it back to a fluctuating voltage. The cool part is that the voltage is analog and moves to the next voltage “step” with a slight curve, which evens out the “steps” that we saw in the previous “stairs” illustration, making the difference in sound between an analog source and a digital source (almost) negligible.
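To get a feel for how the “stairs” get evened out, here is a very rough Python sketch. It is not how a real DAC works internally (real converters use a proper smoothing filter on the analog side); I am using a simple moving average as a stand-in, and the sample values are made up.

```python
def hold(samples, repeat):
    """Turn each sample into a flat "stair" of `repeat` equal values."""
    return [s for s in samples for _ in range(repeat)]

def smooth(values, window):
    """Moving average: each output is the mean of the last `window` values."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def biggest_jump(values):
    """Largest difference between two neighbouring values."""
    return max(abs(b - a) for a, b in zip(values, values[1:]))

stairs = hold([0, 4, 7, 4, 0], repeat=4)
smoothed = smooth(stairs, window=4)
print(stairs)
print([round(v, 2) for v in smoothed])
print(biggest_jump(stairs), biggest_jump(smoothed))
```

The raw stairs jump by several units at once, while the smoothed version moves in much smaller increments from one level to the next.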
Hearing the Audio through a LoudSpeaker
The “voltage” that represents your sound is then sent (through all kinds of amplifiers) to a speaker, where it triggers movement of the diaphragm or cone, which causes … pressure waves that we can hear.
A speaker is basically an electromagnet (C), surrounded by a regular magnet (B), with a paper or plastic disc (A – the diaphragm or cone) attached to it. When power is applied to the electromagnet it will pull or push the diaphragm back or forth as illustrated below. Depending on the polarity of the provided power/voltage, the magnetic polarity of the electromagnet will change, and it is either pulled towards the regular magnet or pushed away from it.
How a loud speaker works …
Headphones work like speakers – in fact they hold 1 or 2 tiny loudspeakers, which is perfect for that purpose as the headphones are pretty much glued to your ears and are not being used to fill a living room with music.
A classical microphone works in a similar fashion. However, instead of being fed power, it generates power as the sound pressure wave moves the diaphragm, pushing a magnet in and out of an “electromagnet”. The basics can (roughly) be compared to the workings of a dynamo or generator … just a little bit different.