Transmit data through sound between 2 computers (very close distance)

Question

I'm writing an example on transmitting data through sound between 2 computers. Some requirements:

The distance is very close, i.e. the 2 computers are basically adjacent to each other
Very little noise (I do not think my teacher would turn on a rock song as a noise source)
Error is acceptable: For example if I send "Radio communication" then if the other computer receives "RadiQ communEcation" it's alright as well.
If possible: No header, flag, checksum,.... since I just want a very basic example demonstrating the basics of transmitting data through sound. No need to be fancy.

I tried using Audio Frequency Shift Keying according to this link:

Lab 5 APRS (Automatic Packet Reporting System)

and got some results: My Github page

but it is not enough. I don't know how to do clock recovery, synchronization,... (the link has a Phase Locked Loop as timing recovery mechanism, but it was apparently not enough).

So I think I should find a simpler approach. Found a link here:

Data to audio and back. Modulation / demodulation with source code

but the OP did not implement the method suggested in the answer, so I'm afraid it might be very complex. Also I do not clearly understand the decoding method suggested in the answer:

The decoder is a bit more complicated but here's an outline:

Optionally band-pass filter the sampled signal around 11 kHz. This will improve performance in a noisy environment. FIR filters are pretty simple and there are a few online design applets that will generate the filter for you.

Threshold the signal. Every value above 1/2 maximum amplitude is 1 every value below is 0. This assumes you have sampled the entire signal. If this is in real time you either pick a fixed threshold or do some sort of automatic gain control where you track the maximum signal level over some time.

Scan for start of dot or dash. You probably want to see at least a certain number of 1's in your dot period to consider the samples a dot. Then keep scanning to see if this is a dash. Don't expect a perfect signal - you'll see a few 0's in the middle of your 1's and a few 1's in the middle of your 0's. If there's little noise then differentiating the "on" periods from the "off" periods should be fairly easy.

Then reverse the above process. If you see dash push a 1 bit to your buffer, if a dot push a zero.

I do not understand how many 1's before classifying it as a dot,... So there are many things that I do not understand right now. Please suggest to me a simple method to transmit data through sound so that I can understand the process. Thank you very much :)
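For reference, the quoted outline (threshold, then measure how long each "on" period lasts) can be sketched in a few lines. This is only a minimal sketch under assumptions that are not in the quoted answer: the dot length in samples (`dot_len`), a dash being three dots long, the majority-vote window, and the 0.5x / 2x limits used to classify a run are all illustrative choices.

```matlab
% Sketch only: dot_len, the 3x dash length and the 0.5x/2x limits are assumptions.
dot_len = 100;                          % assumed samples per dot
% Synthetic thresholded signal: dot, gap, dash, gap, dot
b = [zeros(1,50) ones(1,dot_len) zeros(1,dot_len) ones(1,3*dot_len) ...
     zeros(1,dot_len) ones(1,dot_len) zeros(1,50)];
b(70) = 0;                              % a stray 0 inside an "on" period
b(20) = 1;                              % a stray 1 inside an "off" period

% Majority vote over a short window removes isolated flipped samples.
win = 11;
smoothed = conv(b, ones(1,win)/win, 'same') > 0.5;

% Find where each run of 1s starts and how long it lasts.
d = diff([0 smoothed 0]);
run_start = find(d == 1);
run_len   = find(d == -1) - run_start;

bits = [];
for k = 1:numel(run_len)
    if run_len(k) < 0.5*dot_len
        continue;                       % too short: treat as noise
    elseif run_len(k) < 2*dot_len
        bits(end+1) = 0;                % dot  -> 0
    else
        bits(end+1) = 1;                % dash -> 1
    end
end
disp(bits)
```

With these assumed limits, "how many 1's make a dot" becomes "any run of at least half a dot length", which leaves room for timing error on both sides.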

UPDATE:

I have written some MATLAB code which appears to be (somewhat) operational. I first modulate the signal using amplitude shift keying (sampling frequency 48000 Hz, F_on = 5000 Hz, bit rate = 10 bits/s), with a header placed before the data and an end sequence after it (these are modulated as well, of course). The header and the end sequence were chosen on an ad-hoc basis (yeah, it was a hack):

header = [0 0 1 0 1 1 1 1   1 0 0 0 0 0 0 1   1 0 0 0 0 0 0 1   1 0 1 1 0 1 0 1];  
end_seq = [1 1 1 1 1 0 1 0 1  0 1 0 1 0 1 0 1   0 1 0 1 0 1 0 1     0 1 0 1 0 1 0 1    0 1 0 1 0 1 0 1   0 1 0 1 0 1 0 1  1 0 0 1 0 0 0 1];

Then I transmit them through sound and record the result with my smartphone. I send the recorded audio back to my computer and use another piece of code to read it. Then I correlate the received signal (not yet demodulated) with the modulated header and end sequence to find the beginning and the end. After that I keep only the relevant signal (from the beginning to the end, as found in the correlation step). Then I demodulate and sample to recover the digital data. Here are 3 audio files:

"DigitalCommunication_ask": Link here. It sends the text "Digital communication". It is relatively noise-free, although you can hear some background noise at the beginning and the end. However, the result showed only "Digital Commincatio".
"HelloWorld_ask": Link here. It sends the text "Hello world". It is noise-free like "DigitalCommunication_ask", and the result for this one was correct.
"HelloWorld_noise_ask": Link here. It sends the text "Hello world", but with some noise that I made myself (I just said some random stuff, "A, B, C, D, E, ...", during the transmission). Unfortunately this one failed.

Here is the code for the sender (sender.m):

clear
fs = 48000;
F_on = 5000;
bit_rate = 10;

% header = [0 0 1 0 1 1 1 1  1 1 1 1 1 1 1 1   1 1 1 1 1 1 1 1   1 1 1 1 1 1 1 1   1 1 1 1 1 1 1 1     1 1 1 1 1 1 1 1      1 1 1 1 1 1 1 1    1 1 1 1 1 1 1 1     1 1 1 1 1 1 1 1    1 1 1 1 1 1 1 1  1 1 1 1 1 1 1 1 ];
% header = [0 0 1 0 1 1 1 1  1 0 0 0 0 0 0 1   1 0 0 0 0 0 1   1 0 0 0 0 0 0 1   1 0 0 0 0 0 0 1     1 0 0 0 0 0 0 1      1 0 0 0 0 0 0 1    1 0 0 0 0 0 0 1  1 0 0 0 0 0 0 1    1 0 0 0 0 0 0 1  1 1 1 1 1 1 1 1 ];
header = [0 0 1 0 1 1 1 1   1 0 0 0 0 0 0 1   1 0 0 0 0 0 0 1   1 0 1 1 0 1 0 1];  

% end_seq = [1 0 0 1 0 1 0 0  1 0 1 1 0 0 0 1  0 0 0 0 1 0 0 1  1 0 0 0 1 0 0 1];
% end_seq = [1 0 0 1 0 1 0 0  1 0 1 1 0 0 0 1  0 0 0 0 1 0 0 1  1 0 0 0 1 0 0 1   0 1 0 0 1  1 0 0   1 1 0 1 1 0 0 1  ];
% end_seq = [0 0 0 1 0 0 0 1  0 0 0 0 0 0 0 0    0 0 0 0 0 0 0 0   1 1 0 0 1 1 0 0];
end_seq = [1 1 1 1 1 0 1 0 1  0 1 0 1 0 1 0 1   0 1 0 1 0 1 0 1     0 1 0 1 0 1 0 1    0 1 0 1 0 1 0 1   0 1 0 1 0 1 0 1  1 0 0 1 0 0 0 1];


num_of_samples_per_bit = round(fs / bit_rate);
modulated_header = ask_modulate(header, fs, F_on, bit_rate);
modulated_end_seq = ask_modulate(end_seq, fs, F_on, bit_rate);
% input_str = 'Ah';
input_str = 'Hello world';
ascii_list = double(input_str); % https://www.mathworks.com/matlabcentral/answers/298215-how-to-get-ascii-value-of-characters-stored-in-an-array
bit_stream = [];
for i = 1:numel(ascii_list)
    bit = de2bi(ascii_list(i), 8, 'left-msb');
    bit_stream = [bit_stream bit];
end
bit_stream = [header bit_stream  end_seq];
num_of_bits = numel(bit_stream);
bandlimited_and_modulated_signal = ask_modulate(bit_stream, fs, F_on, bit_rate);
sound(bandlimited_and_modulated_signal, fs);

For the receiver (receiver.m):

clear
fs = 48000;
F_on = 5000;
bit_rate = 10;

% header = [0 0 1 0 1 1 1 1  1 1 1 1 1 1 1 1   1 1 1 1 1 1 1 1   1 1 1 1 1 1 1 1   1 1 1 1 1 1 1 1     1 1 1 1 1 1 1 1      1 1 1 1 1 1 1 1    1 1 1 1 1 1 1 1     1 1 1 1 1 1 1 1    1 1 1 1 1 1 1 1  1 1 1 1 1 1 1 1 ];
% header = [0 0 1 0 1 1 1 1  1 0 0 0 0 0 0 1   1 0 0 0 0 0 1   1 0 0 0 0 0 0 1   1 0 0 0 0 0 0 1     1 0 0 0 0 0 0 1      1 0 0 0 0 0 0 1    1 0 0 0 0 0 0 1  1 0 0 0 0 0 0 1    1 0 0 0 0 0 0 1  1 1 1 1 1 1 1 1 ];
header = [0 0 1 0 1 1 1 1   1 0 0 0 0 0 0 1   1 0 0 0 0 0 0 1   1 0 1 1 0 1 0 1];  

% end_seq = [1 0 0 1 0 1 0 0  1 0 1 1 0 0 0 1  0 0 0 0 1 0 0 1  1 0 0 0 1 0 0 1];
% end_seq = [1 0 0 1 0 1 0 0  1 0 1 1 0 0 0 1  0 0 0 0 1 0 0 1  1 0 0 0 1 0 0 1   0 1 0 0 1  1 0 0   1 1 0 1 1 0 0 1  ];
% end_seq = [0 0 0 1 0 0 0 1  0 0 0 0 0 0 0 0    0 0 0 0 0 0 0 0   1 1 0 0 1 1 0 0];
end_seq = [1 1 1 1 1 0 1 0 1  0 1 0 1 0 1 0 1   0 1 0 1 0 1 0 1     0 1 0 1 0 1 0 1    0 1 0 1 0 1 0 1   0 1 0 1 0 1 0 1  1 0 0 1 0 0 0 1];


modulated_header = ask_modulate(header, fs, F_on, bit_rate);
modulated_end_seq = ask_modulate(end_seq, fs, F_on, bit_rate);

% recObj = audiorecorder(fs,8,1);
% time_to_record = 10; % In seconds
% recordblocking(recObj, time_to_record);
% received_signal = getaudiodata(recObj);

% [received_signal, fs] = audioread('SounddataTruong_Ask.m4a');
% [received_signal, fs] = audioread('HelloWorld_noise_ask.m4a');
% [received_signal, fs] = audioread('HelloWorld_ask.m4a');
[received_signal, fs] = audioread('DigitalCommunication_ask.m4a');
received_signal = received_signal(:)'; % force a row vector
num_of_samples_per_bit = round(fs / bit_rate);

modulated_header = ask_modulate(header, fs, F_on, bit_rate);
modulated_end_seq = ask_modulate(end_seq, fs, F_on, bit_rate);

y= xcorr(modulated_header, received_signal); % do cross correlation
[m,ind]=max(y); % location of largest correlation
headstart=length(received_signal)-ind+1;

z = xcorr(modulated_end_seq, received_signal);
[m,ind]=max(z); % location of largest correlation
end_index=length(received_signal)-ind+1; 

relevant_signal = received_signal(headstart + num_of_samples_per_bit * numel(header) : end_index - 1);
% relevant_signal = received_signal(headstart + num_of_samples_per_bit * numel(header): end);
demodulated_signal = ask_demodulate(relevant_signal, fs, F_on, bit_rate);
sampled_points_in_demodulated_signal = demodulated_signal(round(num_of_samples_per_bit / 2) :  num_of_samples_per_bit :end);
digital_output = (sampled_points_in_demodulated_signal > (max(sampled_points_in_demodulated_signal(:)) / 2));
% digital_output = (sampled_points_in_demodulated_signal > 0.05);

% Convert to characters 
total_num_of_bits = numel(digital_output);
total_num_of_characters = total_num_of_bits / 8;
first_idx = 0;
last_idx = 0;
output_str = '';
for i = 1:total_num_of_characters
    first_idx = last_idx + 1;
    last_idx = first_idx + 7;
    binary_repr = digital_output(first_idx:last_idx); 
    ascii_value = bi2de(binary_repr(:)', 'left-msb');  
    character = char(ascii_value);
    output_str = [output_str character];    
end
output_str

ASK modulation code (ask_modulate):

function [bandlimited_and_modulated_signal] = ask_modulate(bit_stream, fs, F_on, bit_rate)
% Amplitude shift keying: Modulation
% Dang Manh Truong (dangmanhtruong@gmail.com)
num_of_bits = numel(bit_stream);
num_of_samples_per_bit = round(fs / bit_rate);
alpha = 0;
d_alpha = 2 * pi * F_on / fs;
A = 3;
analog_signal = [];
for i = 1 : num_of_bits
    bit = bit_stream(i);
    switch bit
        case 1
            for j = 1 : num_of_samples_per_bit
                analog_signal = [analog_signal A * cos(alpha)];
                alpha = alpha + d_alpha;

            end
        case 0
            for j = 1 : num_of_samples_per_bit
                analog_signal = [analog_signal 0];
                alpha = alpha + d_alpha;                
            end
    end    
end
filter_order = 15;
LP_filter = fir1(filter_order, (2*6000)/fs, 'low');
bandlimited_analog_signal = conv(analog_signal, LP_filter,'same');
% plot(abs(fft(bandlimited_analog_signal)))
% plot(bandlimited_analog_signal)
bandlimited_and_modulated_signal = bandlimited_analog_signal;

end

ASK demodulation (ask_demodulate.m) (Basically it is just envelope detection, for which I used the Hilbert transform)

function [demodulated_signal] = ask_demodulate(received_signal, fs, F_on, bit_rate)
% Amplitude shift keying: Demodulation
% Dang Manh Truong (dangmanhtruong@gmail.com)

demodulated_signal = abs(hilbert(received_signal));

end

Please tell me why it is not working. Thank you very much

In theory (in a noise-free environment), this would be trivial to implement but in practice this is much more difficult. Still, it depends on the type of information you're trying to send. Text would be extremely difficult to transmit reliably because even the smallest noise would make the text unrecognizable. — dsp_user, Dec 05 '17 at 13:42
@dsp_user I'm trying to send text. I can live with some error (like "Audio" -> "Apdio") :) Also I do not really understand that, for Amplitude Shift Keying for example, when you have 1 then you send a sine wave, 0 then nothing but how do you know the first 0 ? I mean in a noise-free environment, but before the first 1 there would be a lot of 0 right? Then how do you know it? — Dang Manh Truong, Dec 05 '17 at 14:17
I suggest that you look at something like an old-fashioned 14.4 modem for ideas. — , Dec 06 '17 at 15:41
@StanleyPawlukiewicz I have made some progress. Please check the update. Thank you very much. — Dang Manh Truong, Dec 09 '17 at 11:34
There is a lot to comment on. You might want to look at Barker sequences for your preamble, given that you’re using preambles — , Dec 09 '17 at 17:09
@dsp_user Please check my answer and see if there is any room for improvements. Thank you very much — Dang Manh Truong, Jan 02 '18 at 04:52
@StanleyPawlukiewicz Please check my answer and see if there is any room for improvements. Thank you very much — Dang Manh Truong, Jan 02 '18 at 04:52

MBaz · Answer 1 · 2017-12-05T19:17:06.650

8

As you have realized, the hard part of doing digital communications is carrier, symbol and frame synchronization, and channel estimation/equalization.

The bad news is that you can't get around these problems. The good news is that implementing these is not that hard, as long as you limit yourself to narrowband BPSK. I know, because I have done this myself, and so have my (undergrad) students (see http://ieeexplore.ieee.org/document/5739249/)

One simple suggestion to get around the problem of carrier synchronization is to use AM DSB-LC to upconvert your baseband signal. Then, you can use an envelope detector without carrier and phase synchronization. This will cost you in power efficiency, but that's not a priority in your case.
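A minimal sketch of this idea, assuming a two-level (+1/-1) baseband signal with rectangular pulses, a 5 kHz carrier, 100 symbols/s and a modulation index of 0.5. None of these values come from the answer, and the rectify-and-average envelope detector is just one simple choice (a band-limited pulse would be preferable to a rectangular one in practice).

```matlab
% Sketch only: fc, sps, m and the rectangular pulse are assumptions.
fs  = 48000;            % sampling rate (Hz)
fc  = 5000;             % assumed carrier frequency (Hz)
sps = 480;              % samples per symbol -> 100 symbols/s
m   = 0.5;              % modulation index, must stay below 1

bits     = [1 0 1 1 0 0 1 0];
symbols  = 2*bits - 1;                          % map 0/1 to -1/+1
baseband = kron(symbols, ones(1, sps));         % rectangular pulses for brevity

% AM DSB-LC: the added constant keeps the envelope positive.
n  = 0:numel(baseband)-1;
tx = (1 + m*baseband) .* cos(2*pi*fc*n/fs);

% Envelope detector: rectify, then average over whole carrier periods.
% 48 samples is exactly 5 periods of a 5 kHz carrier at 48 kHz.
N   = 48;
env = (pi/2) * filter(ones(1,N)/N, 1, abs(tx)); % mean of |cos| is about 2/pi

% Remove the carrier offset and sample each symbol at its centre.
recovered = (env - 1) / m;
centres   = sps/2 : sps : numel(recovered);
rx_bits   = recovered(centres) > 0;
disp(double(rx_bits))
```

Because the receiver only looks at the envelope, it never needs to know the carrier phase. The price is the extra power spent on the constant carrier term.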

Another simple suggestion is to do "batch processing" instead of "real-time processing"; what that means is, store the entire received signal and process it afterwards. This is much easier to implement than stream or real-time processing.

My more substantial suggestion is to read this book: Johnson, Sethares and Klein, "Software receiver design", Cambridge. It explains in very clear terms every single piece of the receiver, and has lots of example Matlab code. There is a similar book by Steven Tretter, on implementing a communications system on a DSP (I can't recall the exact title right now).

Good luck; and please ask new, more specific questions if you have them.

edited Dec 05 '17 at 19:17

answered Dec 05 '17 at 14:21


I've read your paper. Keep up the good work! One question: In the paper, you talked about several methods used by students to find the channel response (using impulse, sine waves,..). Would I need to find the channel response too? :) – Dang Manh Truong Dec 05 '17 at 14:51
Thanks for your kind words :) The thing is that you want to make sure you transmit over a frequency band where the channel response is flat; otherwise, you'll need an equalizer in the receiver. If you don't want to estimate the channel response, what you can do is use a very low data rate (say, 100 b/s) on a frequency that all audio equipment should be comfortable with (say, 5000 Hz). – MBaz Dec 05 '17 at 16:03
@DangManhTruong One more thing: be sure to use bandwidth-limited pulses such as square-root raised cosine, not square pulses that have a large bandwidth and will very likely suffer distortion. – MBaz Dec 05 '17 at 19:15
I have read the book Software receiver design as you suggested (actually I skimmed through most of it and concentrated on Chapter 8: Bits to Symbols to Signals). So I have some questions. You said something about pulses, but in the book's example they used a Hamming window as a pulse; is it alright if I do so? And is my understanding correct: first you modulate the signal using, say, ASK, then you use pulse shaping. Then on the receiver, you first correlate with the pulse signal to receive the modulated signal. Then you demodulate. Is it correct? – Dang Manh Truong Dec 06 '17 at 16:55
And if I wish to send data in a packet form, with a header at the beginning and the end, say 1 1 1 1 1 1 1 1, so I should append it with the data, then modulate it, then pulse shape it. On the receiver, I would correlate the received signal with the pulse shape (square-root raised cosine,..) then I have to demodulate the signal, after that correlate with the header . Is my understanding correct? – Dang Manh Truong Dec 06 '17 at 16:57
Sounds correct. – MBaz Dec 06 '17 at 17:56
Wait a minute, in the book they say that, for example: you convert binary signals into alphabets, for example: 01 -> -1, 00 -> -3, 10 ->1, 11 -> 3 (voltage). Then you multiply it with a rectangular pulse p(t) to get the analog signal. So with AM-DSB LC, p(t) is already the sine wave, because you only have 1 or zero, 1 then you transmit the sine wave, 0 then nothing. So instead of a sine wave I have to transfer a square-root raised cosine? Please correct me if I'm wrong thank you very much – Dang Manh Truong Dec 07 '17 at 02:37
Also, suppose I transmit the signal through sound using, say: sound(y,Fs) (Matlab) with Fs being the sampling rate. According to the Nyquist criterion, if the signal is transmitted at F you have to sample it at twice the highest frequency. So in the receiver the sampling rate is 2 * Fs? Is it correct? – Dang Manh Truong Dec 07 '17 at 02:39
Another thing is in the chapter Bits to Symbols to Signals, the code for undoing pulse shaping z=y(N*M:M:2*N*M-1)/(pow(ps)*M); % downsample to symbol rate and normalize , where N is the number of characters. So in order to receive the signal we have to know how many characters we are going to receive? But how is it possible? – Dang Manh Truong Dec 07 '17 at 02:41
Lots of questions :) Feel free to ask new questions too. In brief: I recommend sticking to BPSK, where $p(t)$ is a raised cosine and you transmit either $+p(t)$ or $-p(t)$. I suggest fixing the sampling rate at 48,000, since it is high enough for any audio and compatible with all sound cards. You know how many characters you'll receive because you know the frame format. – MBaz Dec 07 '17 at 03:32
BTW, feel free to accept the answer too if it was useful to you :) – MBaz Dec 07 '17 at 03:32
Wait a minute, in the answer you suggested to use AM-DSB LC, but now you're saying I should use BPSK, but then wouldn't I lose the strength of AM-DSB LC that is no need for phase synchronization,... ? You also said that you have implemented data transmission via sound before? Can you tell me which method you used? :) – Dang Manh Truong Dec 07 '17 at 11:06
You first create a baseband BPSK signal -- a sequence of raised cosine pulses with two amplitudes. Then, you use AM DSB LC to upconvert that baseband signal to the passband. In the receiver, you use an envelope detector to recover the baseband signal. – MBaz Dec 07 '17 at 14:07
I have made some progress, please check the update. However, I did not implement what you suggested, but instead did only the modulation and demodulation. The result was really terrible. Can you please help me explain the results? Why is the result the way it is? Is it because my code was wrong, or is it because my method was too simplistic to work? Please help me, thank you very much :) – Dang Manh Truong Dec 09 '17 at 10:29
I have posted my answer. In the end, I used DTMF and it worked fine (it could even work when I used a rap song as the noise source, although it must not be too close to the microphone). But thanks anyway :) – Dang Manh Truong Jan 01 '18 at 16:07
@DangManhTruong It's awesome that you got it working, congratulations, and thanks for posting the details in your answer. I have never tried implementing DTMF myself, now I have a new project :) – MBaz Jan 12 '18 at 16:30
Well can you share your implementation of AM-DSB Lc please ? I'm really curious :) – Dang Manh Truong Jan 13 '18 at 16:18

Dang Manh Truong · Accepted Answer · 2018-01-25T02:49:34.717

In the end, I used DTMF (Dual Tone Multi Frequency signaling). The original DTMF has 16 signals, each using a combination of 2 frequencies. Here I only used "1" (697 Hz and 1209 Hz) and "0" (941 Hz and 1336 Hz).

An outline of how the code works:

The sender converts text to binary, then transmits "0" / "1" DTMF signals (here the timing is 0.3 s for the tone duration and 0.1 s for the silence period between tones). The transmission code is taken from: https://sites.google.com/a/nd.edu/adsp-nik-kleber/home/advanced-digital-signal-processing/project-3-touch-tone . Apparently the author used a marginally stable IIR filter to implement a digital oscillator.
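A minimal sketch of such a sender, using the tone pairs and timing given above. The sampling rate `Fs`, the amplitude, the example message and the direct cosine synthesis are assumptions made for illustration; the linked project generates its tones with an IIR oscillator instead.

```matlab
% Sketch only: Fs, the amplitude and direct cosine synthesis are assumptions.
Fs       = 8000;        % assumed sampling rate (Hz)
tone_dur = 0.3;         % tone duration (s), as in the outline
gap_dur  = 0.1;         % silence between tones (s)

msg  = 'Hi';
bits = reshape((dec2bin(double(msg), 8) - '0').', 1, []);   % 8 bits per char, MSB first

t       = (0:round(tone_dur*Fs)-1) / Fs;
one     = 0.5*(cos(2*pi*697*t) + cos(2*pi*1209*t));   % DTMF "1"
zero    = 0.5*(cos(2*pi*941*t) + cos(2*pi*1336*t));   % DTMF "0"
silence = zeros(1, round(gap_dur*Fs));

signal = [];
for k = 1:numel(bits)
    if bits(k)
        signal = [signal one silence];
    else
        signal = [signal zero silence];
    end
end
% sound(signal, Fs);    % uncomment to play the result
```

At 0.4 s per bit this sends 2.5 bits per second, so it trades speed for robustness.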

The receiver side first uses 2 ridiculously-high-ordered-and-ridiculously-narrow bandpass filters to extract the "0" and "1" frequency components, respectively:

filter_order = 1000;
% Band edges are normalized to the Nyquist frequency (2*f/Fs).
one_band = [[((2*696)/Fs) ((2*698)/Fs)] [((2*1208)/Fs) ((2*1210)/Fs)]];
one_dtmf_filter = fir1(filter_order, one_band);
zero_band = [[((2*940)/Fs) ((2*942)/Fs)] [((2*1335)/Fs) ((2*1337)/Fs)]];
zero_dtmf_filter = fir1(filter_order, zero_band);

Once this is done, we find the beginning and end of each "1" and "0" signal. The code is from https://github.com/codyaray/dtmf-signaling . Basically, it looks for silence periods of at least 10 ms and tone periods longer than 100 ms:

(From top to bottom: Zero signal, signal after moving average filter, difference of signal after removing those below threshold, signal after thresholding)

First, the result from the previous step is normalized and then passed through a moving average filter (with a filter length of 10 ms * Fs samples). If we plot the result, the shapes of the "0" and "1" tones can clearly be seen, so I think it works more or less as an envelope detector in this case.
Then everything below a certain threshold is cut off (I chose 0.1).
Finally, find all intervals above the threshold that last longer than 100 ms (note that the image is not reproducible from the code; you will have to dig around to make it).
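The three steps above can be sketched roughly as follows. This is not the code from the linked repository: the sampling rate, the synthetic two-tone input standing in for the band-pass filter output, and the rectification with `abs()` before averaging are all assumptions made to keep the example self-contained.

```matlab
% Sketch only: Fs, the synthetic input and the abs() step are assumptions.
Fs   = 8000;
t    = (0:round(0.3*Fs)-1) / Fs;
tone = cos(2*pi*941*t) + cos(2*pi*1336*t);
gap  = zeros(1, round(0.1*Fs));
x    = [gap tone gap tone gap];                 % stand-in for the filter output

x   = x / max(abs(x));                          % normalize to peak 1
win = round(0.010 * Fs);                        % 10 ms moving average
env = filter(ones(1,win)/win, 1, abs(x));       % rectify, then smooth

above = env > 0.1;                              % threshold from the answer
d     = diff([0 above 0]);
start_idx = find(d == 1);                       % first sample of each interval
stop_idx  = find(d == -1) - 1;                  % last sample of each interval

min_len = round(0.100 * Fs);                    % keep tones longer than 100 ms
keep    = (stop_idx - start_idx + 1) > min_len;
start_idx = start_idx(keep);
stop_idx  = stop_idx(keep);
fprintf('%d tone(s) found\n', numel(start_idx));
```

Running the same detection on the "1" filter output and the "0" filter output, then sorting the detected intervals by start time, gives the bit sequence.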

Then we assemble the bits and convert back into text :)
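A small sketch of that last step, assuming the bits arrive 8 per character with the most significant bit first (the same convention as `de2bi(..., 'left-msb')` in the question). The example bit vector is made up for illustration.

```matlab
% Sketch only: the example bits and the MSB-first convention are assumptions.
bits = [0 1 0 0 1 0 0 0  0 1 1 0 1 0 0 1];     % 'H' and 'i', MSB first
n_chars = floor(numel(bits) / 8);               % ignore a trailing partial byte
bytes   = reshape(bits(1:8*n_chars), 8, []).';  % one row per character
weights = 2.^(7:-1:0).';                        % MSB-first bit weights
msg     = char(bytes * weights).';              % ASCII codes to a row string
disp(msg)
```

This version only needs base MATLAB functions, so it also works without `bi2de`.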

Video demo: https://www.youtube.com/watch?v=vwQVmNnWa4s , where I send the text "Xin chao" between my laptop and my brother's PC :)

P/S: Originally I did this because my Digital Communication teacher said that whoever did this would get an A without having to do the final exam, but I was only able to do this after the exam. So here goes all my efforts :(

P/S2: I got a C+ :(

Perhaps late now (2021) but when I read this felt compelled to comment: don't ever be discouraged by marks. What you learn from experiments and projects far outweighs a textbook-only or exam-focused approach, no matter your mark. Industrial work and academic work both rely heavily on interactions between what you try, what you observe, and what you learn after that. If the mark is low because the work was late, that's reality too, but reflects more on organizational demands rather than passion and expertise. You'll do very well. — P2000, May 28 '21 at 18:03

score 0 · Answer 3 · answered Jan 02 '18 at 19:07

If you want an open source library with very good synchronization, I recommend https://github.com/jgaeddert/liquid-dsp which uses m-sequences to align, then does equalization and demodulates the payload. I made an audio modem that runs on top of it and it works quite well, so if nothing else, liquid's methods should be of some help.
