If you’re grasping for the deeper meaning of an essay or article, consider the possibility that it may not be in the words themselves, but hidden in the shape of the letters. It really could be the case, now that researchers from Columbia University have developed a method called FontCode, which plants data in text through tiny changes in how the letters are shaped.
The method is a steganographic technique, meaning it hides secret information in plain sight such that only its intended recipient knows where to look for it and how to extract it. FontCode can be applied to hundreds of common fonts, like Helvetica or Times New Roman, and works in word processors like Microsoft Word. Data encoded with FontCode can also endure across any image-preserving digital format, like PDF or PNG. The secret data won’t persist, though, after copying and pasting FontCode text between text editors.
The most significant format conversion FontCode messages can transcend, though, is digital to physical and back.
“Many modern steganographic techniques are always in digital files, but you can argue that the world is much larger than just digital formats,” says Changxi Zheng, a computer scientist at Columbia University who worked on the FontCode research. “So the question was how can we design a common physical object to convey digital information without compromising its existing functionality. I call it a hyperlink between a physical object and digital information.”
The text perturbations FontCode uses to embed a message involve slightly changing curvatures, widths, and heights, but crucially the changes are all imperceptible to the naked eye. You can intuit that some letters, like capital “I”s or “J”s, don’t have a lot of complexity in which to hide subtle variations. But lowercase “a”s and “g”s, for example, have lots of edges and curves that can be elongated or shortened and bulked up or pared down.
The only easy way to extract the hidden information in all those tiny tweaks is with the research team’s decoding algorithm. A recipient of a FontCode message could use their smartphone to take a picture of text manipulated with FontCode, then run the photo through a dedicated mobile app that decrypts the code to pull out the hidden message. It would also be possible to set up decoding schemes that use a webcam, a scanner, or any other image digitization method. You can see how it works in the video below.
Though FontCode is just begging to be used in spy movies or by White House staffers, the researchers also imagine that it could be used in place of everyone’s least-favorite hyperlink method, QR codes, or as a watermarking feature. FontCode messages could convey information about trademarks, patents, or other intellectual property protections in a document, or could even act as an anti-tampering feature to ensure that a document hasn’t been manipulated.
One difficulty for the FontCode team was maximizing the amount of information they could hide in text while keeping the method robust enough that the complete message can still be extracted when physical conditions erode fidelity, for example if a shadow distorts some letters or someone drips coffee on a printout that contains a FontCode message.
“The main challenge we were facing was how to encode as much information as possible—because if it only works with a tiny bit of information it’s useless—while making it robust enough so if you have bad lighting or ink stains on the text it will still work,” Zheng says.
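The trade-off Zheng describes, capacity versus robustness, can be illustrated with a toy example. The Python sketch below uses a simple repetition code with majority voting. This is a stand-in chosen for clarity, not the coding scheme the FontCode researchers use, and the function names are hypothetical. Each hidden bit is stored several times, so a smudged or shadowed letter (marked None here) or a single misread can be outvoted, at the cost of carrying only a third as much information.

```python
from collections import Counter

def encode_repetition(bits, copies=3):
    """Repeat each bit `copies` times so isolated read errors can be outvoted."""
    return [b for b in bits for _ in range(copies)]

def decode_repetition(received, copies=3):
    """Majority-vote each group of copies; None marks an unreadable letter."""
    decoded = []
    for start in range(0, len(received), copies):
        # Ignore copies that could not be read at all (stain, shadow).
        readable = [b for b in received[start:start + copies] if b is not None]
        if not readable:
            raise ValueError("every copy of a bit was unreadable")
        # Pick the most common surviving value for this bit.
        decoded.append(Counter(readable).most_common(1)[0][0])
    return decoded

message = [1, 0, 1, 1]
printed = encode_repetition(message)

# Simulate damage: one letter misread, another hidden by a coffee stain.
damaged = list(printed)
damaged[0] = 0
damaged[4] = None

recovered = decode_repetition(damaged)
```

In this run the damaged copy still decodes to the original four bits. Raising the number of copies tolerates more damage but shrinks how much a given passage of text can carry, which is the tension the researchers had to balance.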
Steganographic techniques have been around for millennia, but in recent years cybersecurity researchers have noticed hackers adopting them in malicious attacks, and developing new variations to make their hacks more successful. These types of attacks are difficult for network defenders to detect, since they often hide malicious data in things like image files that don’t have any set standard to check them against. FontCode could be more difficult to weaponize, since anyone could potentially use machine learning algorithms—like those that generate FontCode tweaks—to check text documents against font standards.
“This technique is likely easily detected by machine learning, so it is not suitable for sending secret messages where there are people on the lookout for secret messages being sent,” says Owen Campbell-Moore, a security researcher who created the Chrome extension Secretbook to hide messages in Facebook photos. “This is in contrast to other steganography techniques such as JPEG steganography, in which decades of research has been done in order to make it undetectable even when there are people actively scanning for it.”
But FontCode does have the advantage of moving between digital and physical mediums, which could have specific applications in high-stakes espionage. And the researchers note that FontCode messages can be additionally encrypted if the sender and receiver agree on a key for reordering whatever data is embedded in a given text. Plus, the technique’s more mainstream uses wouldn’t be affected if a scanner could detect the presence of the data, since it wouldn’t be a secret in the first place that more information was stored there.
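The idea of a shared key for reordering the embedded data can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the researchers’ implementation: the function names are hypothetical, and Python’s built-in random module is used only to make the shuffle repeatable. It is not a cryptographically secure generator, so a real system would need a stronger construction.

```python
import hashlib
import random

def _permutation(key, n):
    """Derive a repeatable ordering of n positions from a shared key."""
    seed = int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest(), "big")
    order = list(range(n))
    random.Random(seed).shuffle(order)  # illustrative only, not secure
    return order

def scramble(bits, key):
    """Reorder the hidden bits before they are embedded in the text."""
    order = _permutation(key, len(bits))
    return [bits[i] for i in order]

def unscramble(scrambled, key):
    """Undo the reordering; requires the same key the sender used."""
    order = _permutation(key, len(scrambled))
    original = [None] * len(scrambled)
    for position, source_index in enumerate(order):
        original[source_index] = scrambled[position]
    return original

secret = [1, 0, 0, 1, 1, 1, 0, 0, 1, 0]
shared_key = "agreed in advance"  # hypothetical key
embedded = scramble(secret, shared_key)
restored = unscramble(embedded, shared_key)
```

Someone who photographs the text and runs the decoder would recover only the scrambled sequence. Putting the bits back in order requires the key that the sender and receiver agreed on.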
“It’s exciting to see new techniques for steganography being invented,” Campbell-Moore says. “I think this technique has a lot of appeal.”