What Is MP3?
MPEG (pronounced EM-peg--an acronym for Moving Picture Experts Group) is a set
of standards for compressing and storing digital audio and video. MP3 is short
for "MPEG Audio Layer 3," and it identifies a way to store digital audio files.
MP3 files give you near-CD-quality sound in a file format that requires roughly
1 megabyte for every minute of sound. (CDs and WAV files, by contrast, require
about 10MB per minute.) This means that a single song or track in MP3 format
usually takes up between 3 and 5 megabytes, a reasonable download even at 28.8
Kbps. Because of this, a profusion of MP3 websites, newsgroups, and FTP sites
has sprouted up across the Internet. This also means that it is possible to
create a DVD containing over 80 hours of music.
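The figures above come from simple arithmetic. Here is a rough sketch of it, assuming a 128 Kbps MP3 (the bit rate that matches "1 megabyte per minute"), a 4.7 GB single-layer DVD, decimal megabytes, and an ideal modem with no overhead. The function names are just for illustration.

```python
def megabytes(bit_rate_kbps: float, seconds: float) -> float:
    """Size in megabytes (10**6 bytes) of audio at a constant bit rate."""
    return bit_rate_kbps * 1000 * seconds / 8 / 1_000_000


def download_minutes(size_mb: float, modem_kbps: float) -> float:
    """Ideal transfer time in minutes, ignoring protocol overhead."""
    return size_mb * 1_000_000 * 8 / (modem_kbps * 1000) / 60


def hours_on_disc(disc_gb: float, bit_rate_kbps: float) -> float:
    """Hours of audio that fit on a disc (10**9 bytes per GB assumed)."""
    bytes_per_second = bit_rate_kbps * 1000 / 8
    return disc_gb * 1_000_000_000 / bytes_per_second / 3600


if __name__ == "__main__":
    song = megabytes(128, 4 * 60)  # a four-minute track at an assumed 128 Kbps
    print(f"MP3 per minute:        {megabytes(128, 60):.2f} MB")
    print(f"CD audio per minute:   {megabytes(1411.2, 60):.2f} MB")
    print(f"Four-minute MP3:       {song:.2f} MB")
    print(f"Download at 28.8 Kbps: {download_minutes(song, 28.8):.1f} min")
    print(f"Hours on a 4.7 GB DVD: {hours_on_disc(4.7, 128):.1f}")
```

Under these assumptions a four-minute song lands just under 4 MB, and the DVD figure comes out a little above 80 hours.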
Regardless of where a sound comes from, what you actually hear is analog sound.
Computers, however, translate and store this information as digital sound. This
is done through sampling--the process of taking a snapshot of the sound many
times per second. CDs store information in a digital audio format known as
CD-DA, which is very similar to the standard WAV format and samples 44,100
times per second.
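To make "sampling" concrete, here is a minimal sketch that takes 44,100 snapshots per second of a pure sine tone and packs them into a WAV file in memory, using Python's standard wave module. The tone, the mono layout, and the function names are assumptions chosen to keep the example small; a real CD track is stereo.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44_100   # CD-DA snapshots per second, per channel
AMPLITUDE = 32_767     # largest value of a signed 16-bit sample


def sample_tone(freq_hz: float, seconds: float) -> list[int]:
    """Take SAMPLE_RATE snapshots per second of a sine wave, as 16-bit integers."""
    count = int(SAMPLE_RATE * seconds)
    return [
        round(AMPLITUDE * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE))
        for n in range(count)
    ]


def to_wav_bytes(samples: list[int]) -> bytes:
    """Pack mono 16-bit samples into an in-memory WAV file."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)            # mono, to keep the example small
        wav.setsampwidth(2)            # 2 bytes = 16 bits per sample
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buffer.getvalue()


if __name__ == "__main__":
    one_second = sample_tone(440, 1.0)
    print(len(one_second), "samples,", len(to_wav_bytes(one_second)), "bytes")
```

One second of mono 16-bit audio is already about 88 KB, which is why uncompressed tracks get so large.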
MP3 files are based on psychoacoustics--the study of how the human brain
perceives sound. This science has determined that not all of the sound that
reaches our ears is perceived by the brain. To create an MP3 file, an MP3
encoder reads a WAV file and then strips out the parts that you won't miss
hearing.
For example, most people can't hear sounds above 16 kHz, so the encoder strips
out any sounds above a preset frequency threshold. Loud sounds will mask
quieter sounds at or near the same frequency; the encoder removes the masked
sounds, too. By whittling away the parts you don't hear, the encoder creates a
file that sounds almost the same but is dramatically smaller.
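Here is a toy sketch of those two ideas, not how a real MP3 encoder works. A real encoder analyzes the signal in frequency bands and spends fewer bits on what you can't hear, rather than deleting whole tones. The 16 kHz cutoff comes from the text; the 100 Hz masking window, the 20 dB gap, and all the names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tone:
    freq_hz: float
    level_db: float


CUTOFF_HZ = 16_000      # from the text: most people can't hear above this
MASK_WINDOW_HZ = 100    # assumed: how close in frequency counts as "near"
MASK_GAP_DB = 20        # assumed: how much louder the masking tone must be


def is_masked(tone: Tone, others: list[Tone]) -> bool:
    """True if a much louder tone sits at or near the same frequency.

    A tone can never be MASK_GAP_DB louder than itself, so it is safe
    for `others` to include `tone`.
    """
    return any(
        abs(other.freq_hz - tone.freq_hz) <= MASK_WINDOW_HZ
        and other.level_db - tone.level_db >= MASK_GAP_DB
        for other in others
    )


def prune(tones: list[Tone]) -> list[Tone]:
    """Keep tones at or below the cutoff that no louder neighbour masks."""
    audible = [t for t in tones if t.freq_hz <= CUTOFF_HZ]
    return [t for t in audible if not is_masked(t, audible)]


if __name__ == "__main__":
    mix = [Tone(440, 70), Tone(480, 40), Tone(5_000, 55), Tone(18_000, 60)]
    for tone in prune(mix):
        print(tone)
```

In this made-up mix, the quiet 480 Hz tone is dropped because the loud 440 Hz tone sits right next to it, and the 18 kHz tone is dropped because it is above the cutoff.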
An MP3 file can also contain information about the file itself in a tag. The
tag can contain things like the artist's name, a graphic (usually the CD cover
art), a URL for more information, another URL where you can buy the CD, the
song's lyrics, the genre, and more.
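As one concrete case, the older ID3v1 tag is a fixed 128-byte block at the very end of the file. The richer fields mentioned above (cover art, URLs, lyrics) belong to the later ID3v2 format, which this sketch does not try to parse. The function name is illustrative, and the Latin-1 decoding is an assumption about how the text was written.

```python
import struct
from typing import Optional

ID3V1_SIZE = 128
# marker, title, artist, album, year, comment, genre index
ID3V1_LAYOUT = "<3s30s30s30s4s30sB"


def read_id3v1(data: bytes) -> Optional[dict]:
    """Return the ID3v1 fields from the last 128 bytes, or None if there is no tag."""
    if len(data) < ID3V1_SIZE:
        return None
    marker, title, artist, album, year, comment, genre = struct.unpack(
        ID3V1_LAYOUT, data[-ID3V1_SIZE:]
    )
    if marker != b"TAG":
        return None

    def text(raw: bytes) -> str:
        # Fields are padded with NUL bytes or spaces; Latin-1 is assumed here.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(title),
        "artist": text(artist),
        "album": text(album),
        "year": text(year),
        "comment": text(comment),
        "genre_index": genre,   # an index into the ID3v1 genre list
    }
```

To try it on a real file, read the file in binary mode and pass the bytes in, for example read_id3v1(open(path, "rb").read()), where path is whatever MP3 you have on hand.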
Understanding digital audio formats...or: MP3 & WMA = AUDIO MAGIC!
Digital sound is rapidly becoming a mainstream way of listening to music, and
MP3 is still in the top ten most-searched terms on the Web today.
There are several competing formats for digital audio, but as it relates to
P/PCs, the only two we care about are MP3 (MPEG Audio Layer-3) and WMA (Windows
Media Audio). The two formats have a lot of misconceptions surrounding them, so
we're going to take a look at what these file formats really are and how they
work.
In addition, so you can see where I'm headed, there are a couple of steps
involved in taking a CD recording and making a file that is P/PC ready. They
are:
1. Create a WAV file, or find one that you can download from the Web.
2. Convert the WAV file into either an MP3 or Windows Media (WMA) file, ready
   for use on your P/PC.
This article is really part of a three-part discussion on using and creating
digital audio files for your Palm-size PC.
In this article I'll explain how to decide whether to use an MP3 or WMA file,
and also provide you some useful background information to aid you in choosing
the one you want to use.
The next article, "Making WAV files," teaches you how to convert your CD music
into WAV files.
Once you've got a WAV file, either one that you created or one you downloaded
from the Web, the third article shows you how to convert it into either a
Windows Media file or an MP3 so that you can use the audio on your P/PC.
The mysterious shrinking file
As we discuss in the Making WAV files from CDs How-To, when you've ripped a
track from a CD you have a massive WAV file, usually 40 to 60 MB. So how does
the file shrink from 40 MB to 4 MB? This is the magic of MP3 and WMA encoding:
psycho-acoustic data storage. Just as the JPEG image format discards data that
is beyond the human eye's ability to perceive, MP3 discards data that is beyond
the ear's:
As the MP3now.com site explains, "MPEG Layer-3 is a perceptual audio coding
scheme that analyses the audio signal and applies a psycho-acoustic model using
the properties of the human ear trying to maintain the original sound quality
as far as possible."
So what does this mean in practical terms? Well, the digital signal contains
sounds beyond the range of human hearing (extreme highs and lows), and these
can easily be stripped away without changing the sound of the audio file. In
the same way, a strong signal overpowers the weaker signals around it.
In terms of a song, a loud snare drum might overpower the weaker guitar. In the
WAV file, all the data is maintained, but when you convert it to MP3 or WMA,
these extraneous sounds are stripped away. Hence, the algorithm attempts to
duplicate the way the human ear perceives the audio, and tosses everything else
out. There is a more detailed explanation of these concepts here.
It's all about the bit rate
The potential severity of this data loss is measured in terms of "bit rate." A
song on a CD is recorded at a bit rate of 1411 Kbps, while an extremely high
quality MP3 is 192 Kbps or even 320 Kbps. Most MP3s are recorded at 128 Kbps,
which is high enough audio quality that most people wouldn't notice the
difference between that and the original CD format.
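The 1411 Kbps figure for CDs isn't arbitrary: it falls out of the CD format's sample rate, sample size, and channel count. Here is a quick sketch of that arithmetic and of how much smaller the common MP3 bit rates are by comparison. The function names are just for illustration.

```python
def pcm_bit_rate_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> float:
    """Bit rate of uncompressed PCM audio in Kbps (1 Kbps = 1000 bits per second)."""
    return sample_rate_hz * bits_per_sample * channels / 1000


# CD audio: 44,100 samples per second, 16 bits per sample, 2 channels (stereo).
CD_KBPS = pcm_bit_rate_kbps(44_100, 16, 2)


def compression_ratio(target_kbps: float) -> float:
    """How many times smaller a stream at target_kbps is than CD audio."""
    return CD_KBPS / target_kbps


if __name__ == "__main__":
    print(f"CD audio: {CD_KBPS:.1f} Kbps")
    for kbps in (320, 192, 128):
        print(f"{kbps} Kbps is about {compression_ratio(kbps):.1f} times smaller")
```

At 128 Kbps the encoder is keeping roughly one bit in eleven, which is why the choice of bit rate matters so much.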
I prefer to record my MP3s at 192 Kbps, however, because when I'm listening
with headphones I can hear the data loss in a 128 Kbps file (specifically in
the high-end "swish" of cymbals on drums). Different people have different
perceptions of the data loss, so it's very much an individual decision as to
which bit rate sounds the "best." It depends on how sensitive your hearing is.
To conserve space on portable devices, some people will even go down to 64 Kbps
with their files. At a bit rate this low, data loss becomes extremely evident
as both the low-end and high-end frequencies are lost. Bass response disappears
and high-frequency sounds have an artificial tone.
However, the WMA format really shines at these lower bit rates. While MP3 is
better for high-bit-rate encoding, the WMA format does a remarkable job of
making 64 Kbps WMA sound decent. The data loss isn't as apparent, although you
won't get great sounding bass at 64 Kbps in any format. Even at ultra-low bit
rates (like 16 Kbps) WMA does a much better job than MP3.
To illustrate this point, I've extracted a 30-second clip from a song I enjoy
called "I Want to Know You (In the Secret)" by Sonic Flood. It has a good mix
of strong bass, edgy guitar, crisp vocals, and a breakdown with only vocals for
judging clarity -- all good tests for benchmarking audio. Here are my comments
on each clip, too. Take a listen:
128 kbps WMA (475 KB) - Great, full-bodied sound.
128 kbps MP3 (470 KB) - Great, full-bodied sound.
64 kbps WMA (240 KB) - Good sound, good bass response.
64 kbps MP3 (235 KB) - Good sound, weaker bass, some high-end problems.
16 kbps WMA (62 KB) - Very digital-sounding, worse than AM radio.
16 kbps MP3 (59 KB) - Horrible sounding, similar to 28.8 Kbps streaming
RealAudio.
In the 128 Kbps and higher realm, MP3 is the best choice, but for 64 Kbps
encoding, WMA is the way to go. But take a listen and decide for yourself!
CBR vs. VBR
The final spice in the mix is the esoteric-sounding "variable bit rate
encoding" (VBR). Constant bit rate (CBR) is the regular method of encoding -- a
single bit rate throughout the whole file.
VBR is what makes MP3 even more efficient. During the parts of a song where
it's quiet or less sound is going on, the encoder drops the bit rate as low as
it needs to go -- often down to 32 Kbps or lower. Then in parts where there is
a great deal of signal, it increases the bit rate up to whatever maximum you
specify -- usually 192 Kbps or higher. The benefit of VBR is higher overall
sound quality at a smaller file size. If your encoder gives you the option of
using VBR, use it!
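Here is a toy sketch of the VBR idea, not a real encoder. Each frame gets a made-up "complexity" score between 0 and 1, and the bit rate for that frame is picked from the list of standard MP3 bit rates, up to the maximum you allow. The complexity scores, the straight-line mapping, and the function names are all assumptions for illustration; a real encoder makes this decision with its psycho-acoustic model.

```python
# Standard MPEG-1 Layer III bit rates, in Kbps.
MP3_BIT_RATES_KBPS = [32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320]
FRAME_SECONDS = 1152 / 44_100   # one MP3 frame holds 1152 samples at 44.1 kHz


def pick_bit_rate(complexity: float, max_kbps: int = 192) -> int:
    """Map a 0..1 complexity score onto the allowed bit rates up to max_kbps."""
    ladder = [rate for rate in MP3_BIT_RATES_KBPS if rate <= max_kbps]
    clamped = min(max(complexity, 0.0), 1.0)
    return ladder[round(clamped * (len(ladder) - 1))]


def vbr_summary(complexities: list[float], max_kbps: int = 192) -> tuple[float, float]:
    """Return (average Kbps, total kilobytes) for the per-frame bit rates chosen."""
    rates = [pick_bit_rate(c, max_kbps) for c in complexities]
    total_bits = sum(rate * 1000 * FRAME_SECONDS for rate in rates)
    return sum(rates) / len(rates), total_bits / 8 / 1000


if __name__ == "__main__":
    # Hypothetical song: a quiet intro followed by a busy chorus.
    frames = [0.1] * 300 + [0.9] * 300
    average, kilobytes = vbr_summary(frames)
    print(f"Average bit rate: {average:.0f} Kbps, size: {kilobytes:.0f} KB")
```

The busy frames still get a high bit rate, but the quiet ones pull the average (and the file size) down, which is the whole appeal of VBR.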
Now that you understand what these formats are and how they work, let's look at
the first step in creating them: recording the WAV file. Also, you may want to
check out the MP3 files that are readily available online.