There are several ways to compress sound. Most people can't hear frequencies
above 10 kHz, and voices are still very recognizable when you remove the high
and low frequencies. Analog telephone systems only passed the frequencies from
200 Hz to 3400 Hz. To transmit a certain frequency you have to sample at at
least twice that frequency, so ISDN at 8000 samples per second can transmit
frequencies up to 4 kHz. The signals that used to be transmitted over ISDN
lines in the early days were only human voice and low-speed modem signals of
up to, say, 1200 bit/s. A characteristic of these signals is that they never
contain two unrelated frequencies at the same time, and the noise may be
removed. Based on this knowledge Adaptive Differential Pulse Code Modulation
(ADPCM*) was developed, which can compress an ISDN voice channel by a factor
of 2, to 4 bits * 8000 samples/s = 32 kbit/s. However, it may not compress
DTMF tones, music or faster modem signals well. My experience is that some
kinds of music can be compressed very well and others, for example those with
important drum (= noise) parts, cannot.
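To show the differential idea behind ADPCM, here is a heavily simplified sketch in C. It is not the real IMA or G.726 algorithm; the state layout, step limits and adaptation rule are made up for illustration. Only the quantized difference between each sample and a running prediction is stored, and the step size adapts to how large the differences are.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy ADPCM-like encoder: store only a 4-bit code per 16-bit sample.
 * This is a simplified illustration, not the real IMA/G.726 algorithm. */

typedef struct { int predicted; int step; } adpcm_state;

static int clamp16(int v) { return v > 32767 ? 32767 : (v < -32768 ? -32768 : v); }

/* Encode one sample into a 4-bit code (sign bit + 3-bit magnitude). */
static uint8_t adpcm_encode(adpcm_state *st, int16_t sample)
{
    int diff = sample - st->predicted;
    int sign = diff < 0 ? 8 : 0;
    int mag  = abs(diff) / st->step;
    if (mag > 7) mag = 7;

    /* Reconstruct the value the decoder will see and keep it as the
     * next prediction, so encoder and decoder stay in sync. */
    int delta = mag * st->step;
    st->predicted = clamp16(st->predicted + (sign ? -delta : delta));

    /* Adapt the step size: big differences -> larger steps, and vice versa. */
    if (mag >= 6)      st->step *= 2;
    else if (mag <= 1) st->step /= 2;
    if (st->step < 1)    st->step = 1;
    if (st->step > 8192) st->step = 8192;

    return (uint8_t)(sign | mag);
}

int main(void)
{
    adpcm_state st = { 0, 16 };
    int16_t wave[] = { 0, 500, 1500, 3000, 2500, 1000, -500, -2000 };
    for (size_t i = 0; i < sizeof wave / sizeof wave[0]; i++) {
        uint8_t code = adpcm_encode(&st, wave[i]);
        printf("sample %6d -> code 0x%X (predictor now %d)\n",
               wave[i], (unsigned)code, st.predicted);
    }
    return 0;
}
```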
By the way, the standard coding of ISDN, PCM*, already uses a form of
logarithmic compression (A-law or µ-law companding). The whole 16-bit sound
range is mapped logarithmically into an 8-bit range, whereby sounds with a
lower volume get relatively more quantization steps. This is in accordance
with how the human ear perceives sound.
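As a rough illustration of this kind of logarithmic companding, here is a sketch in C of a simplified µ-law-style curve. It is not the exact ITU-T G.711 table; the scaling and function names are my own. A 16-bit sample is squeezed into 8 bits and expanded again, keeping relatively more resolution for quiet samples.

```c
#include <math.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified mu-law style companding: map a 16-bit linear sample
 * into 8 bits logarithmically and back. Quiet samples keep
 * relatively more resolution than loud ones. */
#define MU 255.0

static uint8_t compress_sample(int16_t s)
{
    double x = s / 32768.0;                       /* -1.0 .. 1.0 */
    double y = copysign(log(1.0 + MU * fabs(x)) / log(1.0 + MU), x);
    return (uint8_t)lround((y + 1.0) * 127.5);    /* 0 .. 255 */
}

static int16_t expand_sample(uint8_t c)
{
    double y = c / 127.5 - 1.0;                   /* -1.0 .. 1.0 */
    double x = copysign((pow(1.0 + MU, fabs(y)) - 1.0) / MU, y);
    return (int16_t)lround(x * 32767.0);
}

int main(void)
{
    int16_t samples[] = { 100, 1000, 10000, 30000, -100, -10000 };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        uint8_t c = compress_sample(samples[i]);
        printf("%6d -> %3d -> %6d\n", samples[i], c, expand_sample(c));
    }
    return 0;
}
```

Note how small sample values come back almost exactly while large ones lose some precision, which is exactly the point of the logarithmic curve.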
Depending on the kind of signal it is of course also possible to reduce the
sample rate or the sample size. Almost all current sound cards in PCs,
together with Windows, can sample and replay via all of the methods mentioned
above, so you can experiment on your PC.
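A crude sketch of reducing both the sample rate and the sample size (the pair-averaging and the simple bit shift are simplifications; real converters use proper anti-aliasing filters and dithering):

```c
#include <stdint.h>
#include <stdio.h>

/* Halve the sample rate by averaging pairs of samples (a very simple
 * low-pass filter) and drop 16-bit samples down to 8 bits. */

static size_t halve_rate(const int16_t *in, size_t n, int16_t *out)
{
    size_t m = 0;
    for (size_t i = 0; i + 1 < n; i += 2)
        out[m++] = (int16_t)(((int)in[i] + in[i + 1]) / 2);
    return m;
}

static void to_8bit(const int16_t *in, size_t n, int8_t *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (int8_t)(in[i] >> 8);   /* keep only the top 8 bits */
}

int main(void)
{
    int16_t src[] = { 1000, 1200, -300, -500, 20000, 21000, -8000, -7500 };
    int16_t half[4];
    int8_t  small[4];
    size_t  m = halve_rate(src, 8, half);
    to_8bit(half, m, small);
    for (size_t i = 0; i < m; i++)
        printf("%6d -> %4d\n", half[i], small[i]);
    return 0;
}
```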
A method to save on ADC and DAC hardware, used in earlier cheap sound
hardware, was called Frequency Modulation (FM*), but it has little to do with
the FM* you know from radio broadcasting. It consists of a pure bitstream at
a certain rate: at a 0 the signal goes down and at a 1 the signal goes up.
An output for such a signal can be made with a logic output port (0 and 5 V),
a resistor and a capacitor. The voltage across the capacitor will be
equivalent to the original sound level, so you can send it directly to
the amplifier. (Probably a little filtering is required.)
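This scheme is essentially what is elsewhere called delta modulation, and it is easy to simulate. The sketch below (step size and signal values are arbitrary) compares the input with the voltage the RC network would roughly reach, sends a 1 to push it up or a 0 to let it fall, and the same integrator reconstructs the signal.

```c
#include <stdio.h>

/* Toy delta-modulation encoder/decoder: one bit per sample.
 * A '1' drives the integrator (the RC capacitor) up by one step,
 * a '0' drives it down. The decoder is just the same integrator. */
#define STEP 0.05   /* arbitrary step size, a fraction of full scale */

int main(void)
{
    double input[] = { 0.0, 0.10, 0.20, 0.25, 0.20, 0.05, -0.10, -0.20 };
    double tracker = 0.0;    /* the capacitor voltage, roughly */

    for (int i = 0; i < 8; i++) {
        int bit = input[i] > tracker;       /* go up or down? */
        tracker += bit ? STEP : -STEP;
        printf("in=%6.2f  bit=%d  reconstructed=%6.2f\n",
               input[i], bit, tracker);
    }
    return 0;
}
```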
In some cases it is possible to reduce the bit stream by filtering out quiet
moments. Although real pauses are usually rare in a telephone conversation,
people tend to speak in turns, so about half of each direction's signal could
be filtered out. With modern packet-based telephony methods it is of course
better to just compress the back channel much harder when it is almost silent;
there may be background sounds that still have to be transmitted, for example.
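A minimal sketch of detecting those quiet moments (the frame size and threshold are arbitrary assumptions): compute the RMS level of each frame and flag frames below a threshold, so the transmitter can compress them much harder or send only a comfort-noise marker.

```c
#include <math.h>
#include <stdint.h>
#include <stdio.h>

/* Simple energy-based silence detector: if the RMS level of a frame
 * is below a threshold, mark it as "quiet" so it can be compressed
 * much harder (or replaced by a comfort-noise flag). */
#define FRAME 160                 /* 20 ms at 8000 samples/s */
#define QUIET_THRESHOLD 200.0     /* tune to the actual noise floor */

static int frame_is_quiet(const int16_t *frame, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += (double)frame[i] * frame[i];
    return sqrt(sum / n) < QUIET_THRESHOLD;   /* RMS level */
}

int main(void)
{
    int16_t loud[FRAME], quiet[FRAME];
    for (int i = 0; i < FRAME; i++) {
        loud[i]  = (int16_t)(3000 * sin(i * 0.3));   /* fake speech */
        quiet[i] = (int16_t)(50 * sin(i * 0.3));     /* faint background */
    }
    printf("loud frame quiet?  %d\n", frame_is_quiet(loud, FRAME));
    printf("faint frame quiet? %d\n", frame_is_quiet(quiet, FRAME));
    return 0;
}
```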
More modern ways of compressing sound are the methods used by MP3 and by
streaming media players such as RealPlayer from Real Networks and Windows
Media Player from Microsoft.
When you're building small, cheap systems that only have to generate
prerecorded sound, please consider that it may be economical to spend
a lot of money, time, sophisticated equipment and software on recording and
compressing the sounds (and even adapting them to strange hardware), so that
you can make the replay systems as cheap as possible. For example, if you use
a 4-bit resistor network as a cheap DAC, it may be possible to recalculate the
samples to compensate for the non-linearity of that cheap DAC.
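A sketch of that recalculation, assuming you have measured the 16 actual output voltages of the 4-bit resistor network once (the voltages below are invented for the example): each ideal sample level is simply mapped to whichever DAC code produces the closest measured voltage.

```c
#include <math.h>
#include <stdint.h>
#include <stdio.h>

/* Pre-correcting samples for a non-linear 4-bit DAC: measure the 16
 * real output voltages of the resistor network once, then map each
 * ideal sample level to the code whose measured voltage is closest.
 * The measured values below are made up for the example. */
static const double measured_volts[16] = {
    0.00, 0.28, 0.61, 0.95, 1.22, 1.58, 1.86, 2.20,
    2.47, 2.83, 3.10, 3.41, 3.68, 4.05, 4.36, 4.70
};

/* Find the DAC code whose measured output is nearest to 'target'. */
static uint8_t best_code(double target)
{
    uint8_t best = 0;
    double best_err = fabs(measured_volts[0] - target);
    for (uint8_t c = 1; c < 16; c++) {
        double err = fabs(measured_volts[c] - target);
        if (err < best_err) { best_err = err; best = c; }
    }
    return best;
}

int main(void)
{
    /* Convert 16-bit samples to pre-corrected 4-bit codes,
       assuming full scale corresponds to 0..4.70 V. */
    int16_t samples[] = { -32768, -16000, 0, 12000, 32767 };
    for (size_t i = 0; i < 5; i++) {
        double target = (samples[i] + 32768.0) / 65535.0 * measured_volts[15];
        printf("sample %6d -> code %2u\n", samples[i],
               (unsigned)best_code(target));
    }
    return 0;
}
```

The table lookup can of course be precomputed once for all 65536 (or fewer) input values, so the replay system itself only needs the corrected 4-bit data.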
By the way, there is a fundamental difference between synthesizers (as
integrated in every modern PC) and electric pianos: in an electric piano
every tone is generated all the time (using a couple of frequency-divider
chips), and the keys the player presses determine which tones are sent to the
amplifier. In a synthesizer or PC, the tone is generated after the key has
been pressed. Early synthesizers were limited to only one tone at a time,
but after a while multichannel synthesizers appeared. The number of channels
determines how many notes can be played at the same time. For a piano played
by one person with 10 fingers, always playing one key per finger, a maximum
of 10 channels is needed. In the early synthesizer days these channels had to
be built in hardware and were therefore expensive, but nowadays it's usually
done in software, so it's cheap and you can synthesize complete symphonies
with a single machine.
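As a small illustration of how cheap software channels have become, here is a sketch that mixes a few sine-wave channels into one buffer, the way a PC now does in software what early synthesizers needed one hardware channel per note for. The sample rate, frequencies and scaling are arbitrary choices.

```c
#include <math.h>
#include <stdint.h>
#include <stdio.h>

/* Minimal software "multi-channel synthesizer": mix a few sine-wave
 * channels into one 16-bit buffer and write it as raw audio. */
#define PI       3.14159265358979323846
#define RATE     8000      /* samples per second */
#define CHANNELS 3
#define SAMPLES  8000      /* one second */

int main(void)
{
    const double freq[CHANNELS] = { 261.63, 329.63, 392.00 };  /* C-E-G chord */
    static int16_t buf[SAMPLES];

    for (int i = 0; i < SAMPLES; i++) {
        double mix = 0.0;
        for (int c = 0; c < CHANNELS; c++)
            mix += sin(2.0 * PI * freq[c] * i / RATE);
        buf[i] = (int16_t)(mix / CHANNELS * 30000);   /* scale to 16 bits */
    }

    /* Raw signed 16-bit mono samples on stdout; play with any raw-PCM tool. */
    fwrite(buf, sizeof buf[0], SAMPLES, stdout);
    return 0;
}
```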
The synthesizer chips on PC sound cards are usually designed by Yamaha, which
is also one of the main synthesizer makers.
Famous sound/effect generators are the AY-3-8910 and AY-3-8912, which were
often used in pinball machines and video arcade games in the 1980s.
I don't know much about pure voice synthesis.
I have heard about chips that try to emulate the complete human vocal system.
Most of what we think is voice synthesis is of course just voice replay at
the level of single words or sound fragments (phonemes or allophones).
Already around 1982 Radio Shack sold an optional box for its TRS-80,
with the famous SP0256-AL2 voice synthesizer chip.
voicegen.htm | Lots more information.
voicercg.htm | Lots more information.
www.asiansources.com/honsitak.co | Consumer ICs: UMC, REALTEK, PTC, HMC, MOSEL, HOLTEK, WINBOND, API and other brands available. Melody IC* series, sound effect ICs, voice/speech ICs.
index.htm | Index page for sound chips
melody.htm | About melody generating chips |
speechge.htm | About speech/voice generation/synthesis |
speechrc.htm | About speech recording |
voice.htm | About voice/speech generation/synthesis |
../../oth/voicerec.txt | FAQ about voice recognition processors |