Show HN: I made a tool to communicate data using the PC speaker (github.com/ggerganov)
189 points by ggerganov on April 21, 2021 | 71 comments
Hi HN,

r2t2 is a command-line tool for transmitting data through sound using the PC speaker on the motherboard. The name of the tool is a reference to the R2-D2 robot from Star Wars :)

In short, you type some message and it gets FSK modulated and transmitted via sound through the PC speaker. Note that this is the speaker/buzzer that you connect to the motherboard and not the regular speakers that you connect to the sound card.
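
As a rough illustration of the FSK idea, here is a minimal sketch that maps each bit of a message to one of two tones. The frequencies, the symbol length, and the function names are assumptions made up for this example; they are not taken from r2t2, which may use a different scheme and more than two frequencies.

```python
# Hypothetical parameters for illustration only (not r2t2's actual values).
F0_HZ = 1000.0      # tone played for a 0 bit
F1_HZ = 2000.0      # tone played for a 1 bit
SYMBOL_SEC = 0.05   # how long each tone is held


def bytes_to_bits(data):
    """Expand bytes into a list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def fsk_schedule(message):
    """Return one (frequency_hz, duration_sec) pair per bit of the message."""
    bits = bytes_to_bits(message.encode("utf-8"))
    return [(F1_HZ if bit else F0_HZ, SYMBOL_SEC) for bit in bits]


schedule = fsk_schedule("Hi")
for freq_hz, duration_sec in schedule[:4]:
    print(f"{freq_hz:.0f} Hz for {duration_sec} s")
```

The two-character message "Hi" becomes 16 tone entries, one per bit. A buzzer driver would then play each entry in turn.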

I also made a simple web page that listens to the sound emitted by r2t2 and decodes the received messages. The page can be used by simply opening it on your phone and placing the phone near a computer/device that emits data with r2t2.
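
On the receiving side, a decoder has to decide which tone is present in each chunk of audio. One common way to do that is the Goertzel algorithm, sketched below. This is not necessarily what the web page does; the sample rate, the two frequencies, and the helper names are assumptions for this example.

```python
import math

SAMPLE_RATE = 8000  # Hz, assumed for this sketch


def goertzel_power(samples, freq_hz, sample_rate=SAMPLE_RATE):
    """Estimate the signal power near freq_hz using the Goertzel algorithm."""
    n = len(samples)
    k = round(n * freq_hz / sample_rate)        # nearest DFT bin
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2


def detect_bit(samples, f0_hz=1000.0, f1_hz=2000.0):
    """Return 1 if the f1 tone dominates the chunk, otherwise 0."""
    return 1 if goertzel_power(samples, f1_hz) > goertzel_power(samples, f0_hz) else 0


def tone(freq_hz, duration_sec=0.05):
    """Generate a clean sine tone, standing in for microphone input."""
    n = int(SAMPLE_RATE * duration_sec)
    return [math.sin(2.0 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]


print(detect_bit(tone(2000.0)), detect_bit(tone(1000.0)))
```

Real microphone input would add noise, echo, and timing drift, which is where error correction becomes important.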

I made this tool mostly for fun, but I think it might have some useful applications too. The advantage of this type of communication is that the hardware is very cheap (~$1/speaker), does not require a sound card and the software is very simple and does not use any 3rd-party audio libraries.



As part of my undergraduate course we had to build an embedded system that did this, at a high baud rate, with a transmitter and receiver. It was the most ridiculously difficult project I have ever undertaken. Finished the transmitter but there was something wrong with part of our analogue circuit and it gave the signal an artefact. I wrote the assembly in a day (oh yeah we used assembler and burned an EPROM I think) but the hardware assignment was Walt Disney level fantasy.

I believe zero people on the course got it fully working.

I spent almost every spare hour I had in that lab for ten weeks. And despite it being insanely difficult, I learnt something that semester. I learnt that I would never, ever, work in hardware.


Most people in software have no concept of just how difficult hardware can be. Add to that the fact that most physical products today are really hybrid hardware + software, with the added feature that this sometimes means FPGA + one or more MCU's + analog + high speed controlled impedance + matched propagation delay + signal and power integrity. Yeah, hardware is, well, hard. This is particularly true at scale. Anyone can make one or a few of something. Scale it up and things get interesting very quickly.

I am currently working on one such project. One of the modules will be manufactured at an initial rate of about 5K units per month and is predicted to eventually reach as high as 50K per month. With software --and particularly with web-based products-- you can literally have millions of customers, make mistakes and fix things overnight.

Imagine shipping hundreds of thousands of physical products and discovering a problem with the design. Not the same world at all. Heck, in most cases you can't even update the software after shipping. Which means you have to get all of it right before it goes to manufacturing.

Among other things, I think this is one of the reasons for which VC's aren't all that interested in hardware. It's difficult, capital intensive and the risk vs. reward simply isn't on the same scale.

That said, I enjoy hardware development. Yes, I do software just as much and, believe me, it's a walk in the park compared to hardware, but I would not trade one for the other. No pun intended, either you are wired to like hardware or you are not. That's the way I see it.


While I agree with you overall, the easy vs difficult comparison doesn't make much sense to me. Putting aside the fact that people are skilled at different things, if you find writing a piece of software easy then all your competitors do as well. The fact that you can get instant feedback and push updates now means that you have to iterate your product overnight instead of on a 3 year cycle (because your competitors are definitely doing it as well). There can definitely be just as much risk, complexity and frustration in software.


> Among other things, I think this is one of the reasons for which VC's aren't all that interested in hardware. It's difficult, capital intensive and the risk vs. reward simply isn't on the same scale.

Hardware, it seems, is about to become a winner takes all market. And at that point you can't be a successful software producer unless you control the hardware as well.


The problem with hardware is that it becomes more and more difficult for a startup to compete in a market segment as time goes on due to general technological progress. But this same progress makes writing software easier.

Back in the 70s you probably had a shot at starting a computer company or silicon manufacturer with a small team and a good amount of funding. Today that is wildly impossible.

Software, on the other hand, is the opposite. A billion+ dollar research project from 20 years ago is an npm import away today. Two people with a laptop are more and more equipped to take on the goliaths of the industry (or get acquired by them) than they were in the past.


> Two people with a laptop are more and more equipped to take on the goliaths of the industry (or get acquired by them) than they were in the past.

This is how it works now. But if hardware manufacturers start working more like gatekeepers (see app stores) then the two-person-disruptive-startup becomes less of a possibility.


I can relate, my Computer Architecture course at UT Austin was taught by an adjunct professor who was a technical fellow at Intel. He gave exactly zero fucks, and assigned us a level of work and expectation that at a minimum consumed 20-30 hours of my week. I failed chemistry to get an A in this class, and despite coming to the same realization that I wanted nothing to do with hardware, I left understanding what it meant to actually work hard.


Lol, I had a similar experience as a teenager, however it only increased my desire to work in hardware!


This isn't too difficult if you can write the algorithm in software (ie. matlab).

Do a basic OFDM with bins say 5Hz wide, with QAM data. Align timing, phase shift and amplitude by brute force looking for known marker bins. Choose such narrow bins and such long symbols to try to hide the effects of echo, reverb etc.

Deal with noise and channel deficiencies with lots of erasure coding.


IIRC we had to put together a significant analogue circuit, without any real guidance as to how we'd achieve the high baudrate. The software was running on a minimal CPU that we wired up by hand, the address bus, the databus, the memory chips and everything else... hundreds of cables on breadboards, just for the digital part, and the analog electronics took up more boards. A loose connection caused by moving the boards could cost you hours when restarting.

It was definitely too difficult _for me_.


> This isn't too difficult if you can write the algorithm in software (ie. matlab).

Back in the day, I don't think matlab had any 'export' logic like I think modern copies do. IIRC matlab can emit C code + matlab library calls maybe?



My google-fu is failing me because all I'm getting is blogspam, but I could have sworn that some higher-end washing machines had a feature where when they broke down, they would emit some kind of diagnostic noise. Not just the standard series of beeps, but complex digital data. When you called in for warranty support or repair, they would guide you through pushing some buttons on the machine, the machine makes the noise, it goes through the phone, and the call center person on the other end could decode it. Or a repair tech on-prem could presumably decode it with an app on their phone.

Also, ham radio operators are doing some amazing things with audio-encoded data. You can send a message halfway around the world with just a few watts and a person with a computer on the other end can receive and decode it, even if the signal itself is _below the noise floor_. Of course, the tradeoff is time: it takes a few minutes to transmit the small amount of data required to make a contact.


You're probably thinking of LG Smart Diagnosis https://www.lg.com/us/support/help-library/smart-diagnosis-f...

Video of what it looks like: https://youtu.be/Q_fTfGJFK2A?t=68


Wow it sounds like RTTY (radioteletype):

https://m.youtube.com/watch?v=wzkAeopX7P0

I suppose there are only so many ways to modulate a data signal.



I remember years ago reading about this being a black hat method for exfiltrating data from computers not connected to a network. Load the "data to sound" program from an infected USB drive onto the target, then use a separate device to collect the audio without ever having to touch the network directly. There was even a theory about using laser-based acoustic sensing to pull the audio from any window in the room so you wouldn't even have to be physically in the building. Interesting stuff.


TEMPEST has many iterations.

Wiki: https://en.m.wikipedia.org/wiki/Tempest_(codename)


Is this a full circle reimplementation of old telephone modems?


The data-over-sound library that I implemented (ggwave) and use in this example implements a modified DTMF protocol. Instead of only 2 tones, I use a variable number of tones (usually 6) + error correction.

In r2t2, I use a single-tone encoding, since the buzzer can emit just a single frequency.
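
To illustrate the difference between the two approaches, here is a rough sketch. It assumes each tone carries 4 bits by selecting one of 16 frequencies, and that in the multi-tone case every tone slot gets its own band. The base frequency, spacing, and function names are invented for this example and are not ggwave's or r2t2's real parameters; error correction is left out.

```python
# Hypothetical parameters for illustration only.
BASE_HZ = 1000.0       # lowest frequency used
STEP_HZ = 50.0         # spacing between adjacent frequencies
TONES_PER_FRAME = 6    # tones played at the same time in the multi-tone case


def nibbles(data):
    """Split bytes into 4-bit values, high nibble first."""
    out = []
    for byte in data:
        out.append(byte >> 4)
        out.append(byte & 0x0F)
    return out


def multi_tone_frames(data):
    """Group nibbles into frames; each frame lists tones to play simultaneously.

    Tone slot i uses its own band of 16 frequencies so the tones do not collide.
    """
    nibs = nibbles(data)
    frames = []
    for start in range(0, len(nibs), TONES_PER_FRAME):
        chunk = nibs[start:start + TONES_PER_FRAME]
        frames.append([BASE_HZ + (16 * slot + value) * STEP_HZ
                       for slot, value in enumerate(chunk)])
    return frames


def single_tone_sequence(data):
    """Buzzer-style variant: one frequency at a time, played in order."""
    return [BASE_HZ + value * STEP_HZ for value in nibbles(data)]


print(multi_tone_frames(b"abc"))
print(single_tone_sequence(b"a"))
```

With these assumptions, three bytes fit in one multi-tone frame, while the single-tone variant needs six consecutive tones for the same data, which is one reason a single-frequency buzzer is slower.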


> In r2t2, I use a single-tone encoding, since the buzzer can emit just a single frequency.

That's not wrong, but also not the whole truth.

https://en.m.wikipedia.org/wiki/PC_speaker#Pulse-width_modul...
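
The pulse-width modulation trick can be sketched roughly like this: the speaker is only ever on or off, but the fraction of time it spends on within each short sample period approximates an amplitude. The function name and slot count are assumptions for this example, and real implementations depend on precise hardware timing that is not modeled here.

```python
def pwm_bits(samples, slots_per_sample=8):
    """Turn amplitudes in [0.0, 1.0] into a 1-bit on/off stream.

    Each sample becomes `slots_per_sample` time slots; the number of "on"
    slots is proportional to the amplitude (the duty cycle).
    """
    stream = []
    for amplitude in samples:
        amplitude = min(1.0, max(0.0, amplitude))   # clamp out-of-range input
        high = round(amplitude * slots_per_sample)
        stream.extend([1] * high + [0] * (slots_per_sample - high))
    return stream


print(pwm_bits([0.0, 0.5, 1.0], slots_per_sample=4))
```

More slots per sample give finer amplitude resolution but require toggling the speaker faster.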


From the Wikipedia article's citations: http://www.oldskool.org/sound/pc/#digitized

I lived through the PC speaker, and never heard anything like this ever emitted from one. It's crazy how good those waveforms sound, if they're representative of the hardware…

I remember how amazing it was when we got our first soundcard installed and working, too. Games were much more interesting with real sound-effects.


there was some name for it when it first started showing up in games, "puresound" or something.... yes, it would make jaws drop when the speaker stopped squeeping and started rendering digital audio.

i think eventually windows added a sound driver that did pwm on the speaker.

it was one of those really cool hacks that blew your mind when you first saw them. (like people playing music with disk drive stepper motors, i guess exactly like that heh)

edit: found it: https://en.wikipedia.org/wiki/RealSound


There was a driver for Windows 3.x (third party, I think) which made it so Windows programs could play samples. But the computer was unresponsive during playback.

It was still semi-useful, you got the little Windows sounds playing, and if I recall correctly, there was a Windows Zeppelin game which was more fun with sporadic sound samples playing.

I upgraded my mother's 80286 PC clone with a separate ISA 8-bit MFM harddisk interface. I put in an 80486 (SX?) motherboard, which of course had IDE disk connectors. But, partly because I didn't have any spare IDE disk, and partly to make it a plug-in upgrade, I just moved over the ISA card to the 80486 motherboard and called it a day.

It was actually a very productive setup, Word 6 was really fast on that machine and the little sounds from the PC speaker driver were a nice novelty.


Neat project! thanks for sharing

An example of when sonic communication is useful is to bootstrap other communication.

During setup, Amazon Dash buttons would listen for data sent over ultrasonic to get network information.

https://www.cnet.com/home/smart-home/appliance-science-how-t...

I think that's just super cool.

(Disclaimer: I work at A->n but not on this in particular, I have only public, consumer knowledge about how this works)


Haha, I wasn't aware of Amazon's Dash buttons!

The funny part is I already did a similar application using ggwave and talking buttons. I mean, it does not order stuff, but you can easily create a button that triggers any kind of action via audio:

https://github.com/ggerganov/ggwave/discussions/27


Well you should try "fldigi"- this is a tool with many modulation and demodulation schemes used for amateur radio- they end up going to the transceiver via audio so it works just fine between PCs using PC speaker and microphone. You can have many simultaneous users on different channels within the audio bandwidth.

https://sourceforge.net/projects/fldigi/files/fldigi/

One of the more entertaining modes is Hellschreiber: basically a radio fax:

https://en.wikipedia.org/wiki/Hellschreiber


I've heard about fldigi, but I didn't know it can communicate through the motherboard's PC speaker. Cool!


Well I should say through the sound system (through your laptop's speaker and microphone). I don't think it would work with the old original IBM PC motherboard speaker.


I remember there being something called Chirp (I think) that had a neat demo where you have two phones download the app, put the phones into aeroplane mode, and then you can still make a box on the other phone move etc -- the implementation was ultrasonic sound I believe.

I tried to find them again but it looks like they were bought out or equivalent by Sonos?

[1] https://medium.com/chirp-io/data-over-sound-and-other-data-c...

[2] https://medium.com/chirp-io


I made a send-only Python client, back in the day. https://github.com/moreati/chirppy


Reminds me of some of the ham radio digital modes. Here’s a fun link of examples:

https://www.sigidwiki.com/wiki/Category:Amateur_Radio


I haven't tried it, but DOSBox emulates the piezo speaker and can record to a .WAV file, you toggle it on/off with <ctrl><f6>.

So, perhaps a way to try this without finding an old pc.


Next, one for using light via the screen. More ways to connect out from air-gapped computers!


Maybe one could perform accesses on mechanical hard drives in such a way you could encode a signal with it. Maybe reading a couple bytes from the inner tracks of the disk for "0" and outer tracks for "1", assuming they produce different but consistent access sounds.

Reading certain areas of a floppy disk would probably be even louder and more consistent.

There could be a "setup program" that listens to the computer's microphone and performs various accesses, choosing two that meet the criteria for being distinguishable enough for another system to tell apart. Lower frequencies for 0, higher for 1.


Here's that being done and also with fan noise: https://www.securityweek.com/hard-drive-noise-allows-data-th...

180 bits per minute, not too shabby!



I also thought of this, but from the movie "WarGames".


This is very cool! I briefly browsed the code, and didn't seem to see any error correction. Do you find that ambient noise might "garble" the data?


I use Reed-Solomon error correction:

https://github.com/ggerganov/ggwave/tree/master/examples/r2t...

The communication should be robust towards noise, but I still haven't investigated thoroughly how noise affects the performance.


Have you ever heard of LISNR?

https://www.youtube.com/watch?v=gvtrnpydHlU


I've seen the project, but I was disappointed to not be able to find any way to try it out easily.



Receive Resources Through Tones?


It's an interesting project. I couldn't think of a use case for it, but the one in the README was a good example.


This is a modem. (modulator/demodulator)

Once upon a time transferring digital data over analog channels brought us fax machines and dial-up internet.


Also, packet radio, and semi-related, the good old IrDA ports in computers and PDA's in the late 90's/early 00's.


This took me back to loading programs from cassette tape in the eighties.

What bit rate do you get? In those days about 1300 a second was possible (on a good cassette with Dolby)

https://www.youtube.com/watch?v=faEYry_MyZM


I used to record programs off the radio for my C64. We had a radio station that had a computer show and would broadcast software. :)

Edit: The software that was broadcast was usually made by listeners or shareware. I.e. it was not illegal software.


That is so awesome.


I wonder what data rate is actually possible on a cassette tape, using modern modulation and error correction techniques? A blog commenter claimed to get 13.8 kbps per channel using Digital Radio Mondiale with OFDM+QAM, but I didn't see an obvious way to reproduce this:

http://www.windytan.com/2012/08/vintage-bits-on-cassettes.ht...

There are various data modes for ham radio (e.g. JT65), but tapes are better in some ways (bandwidth) and worse in others (wow and flutter), so those codecs seem ill-suited.


I am not sure of the magnitude of wow and flutter on tapes, but the amateur modes definitely account for some frequency shift -- for HF, the ionosphere can shift the frequency randomly (I kind of thought I was making this up, but found some papers on it: https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/200...), and radio local oscillators are not perfect (usually temperature dependent).

(For other modes like airplane scatter, meteor scatter, EME, satellite, etc. there is even more doppler shift.)


Perhaps one could record this audio on something like some kind of audio device and then replay it later to a receiver. In the spirit of the old school idea, perhaps like an old cassette tape system.


I heard that (some of the) Japanese subway used to use ultra-sonic to broadcast train information


It would be cool if there was an exploit on this protocol to give RCE over sound.


Very cool! What are some of the useful applications you thought of for this concept?


I imagine a use case where you have headless devices (servers, raspberry pis, etc.) that are not connected to the internet. These devices can be measuring something through a sensor and periodically reporting the measured value using sound through the speaker. When you go near the device of interest, you take your phone and "listen" to the emitted value to see what was the last measurement.

The advantage of this is that it is a very cheap way of adding such communication channel to your device.


at one point the iPod bootloader was extracted by playing it aloud: http://www.ipodlinux.org/stories/piezo/


What is the transfer rate?


You can add a command-line argument to change the Tx protocol between 3 types as explained in the README. Here are the transfer rates for the 3 protocols:

- [R2T2] Normal : 16 bytes/8.5 sec

- [R2T2] Fast : 16 bytes/5.7 sec

- [R2T2] Fastest : 16 bytes/2.9 sec

The faster the communication - the less reliable it is. Currently, data is transferred in 16-byte batches.
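
For comparison with other figures in the thread, the batch timings above can be converted to bytes and bits per second. This is only arithmetic on the numbers listed; the function and variable names are made up for this example.

```python
BATCH_BYTES = 16  # data is sent in 16-byte batches

# Seconds needed to send one batch, as listed above.
SECONDS_PER_BATCH = {"Normal": 8.5, "Fast": 5.7, "Fastest": 2.9}


def rates(seconds_per_batch):
    """Return (bytes_per_second, bits_per_second) for one protocol."""
    bytes_per_sec = BATCH_BYTES / seconds_per_batch
    return bytes_per_sec, bytes_per_sec * 8


for name, seconds in SECONDS_PER_BATCH.items():
    bytes_per_sec, bits_per_sec = rates(seconds)
    print(f"{name}: {bytes_per_sec:.2f} bytes/s, {bits_per_sec:.1f} bit/s")
```

That works out to roughly 15, 22, and 44 bits per second for the three protocols.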


Reminds me of slow scan tv from the HAM radio days! Cool project.


This is so cool! and I love the name! Thank you for sharing.


Well done, great demo!


Now do capslock,scroll lock, and numlock in excel.


Amazing work!


why does it require sudo?


Same reason you need `sudo` to change screen brightness with the /dev/ files; unless you have special group permissions set up, normal users can't directly access hardware devices. It's a Linux convention.


> It's a Linux convention.

It's just basic security. Direct access (ie no trusted intermediate software layer) to physical hardware is a minefield.


Well, yeah, that's the reason for the convention – but it isn't quite that simple. Control of the PC speaker should really be available to all users in `audio` by default, because in most setups it has exactly the same attack surface; running software as `sudo` instead is worse from a basic security perspective.

In this case, the driver is a trusted intermediary layer – at least, I think it is.


cat /path/to/file >/dev/audio ?


I think the hard part is: cat /dev/audio > /path/to/file. :)


[deleted]



