I received a cryptic message!

An incoming text, in a Nokia N79. Part of the message is corrupted.

I have already written about the evils of plain text and how it is one of the worst inventions of the computing field. But as if I needed a tangible example for my readership, yesterday I received a cryptic text (I mean an SMS) on my mobile phone, which ran as follows:

Tried my best; knew your life depends on it:
http://arstechnica.com/te捨⵰潬楣礯㈰ㄴ⼰㘯慰灥慬猭捯畲琭瑨牯睳ⵯ畴ⴳ㐰〰〭潮汩湥⵬楢敬⵲畬楮术⍰㌍ਊ慲獴散桮楣愮捯洯獥捵物瑹⼲〱㐯〶⽵湤敲ⵤ摯猭晥敤汹ⵢ畣歬敳ⵢ畴ⵤ敦楥猭慴瑡捫敲猭數瑯牴楯渭摥浡湤猯

It was a very tough situation: my life depended on a corrupted text. But fortunately, I had a Windows computer at hand and I could fix it.

Reversing the corruption

Using Nokia Suite 3.8, I imported the message into the computer and within seconds, it was in the clipboard.

Next, I opened Notepad, pasted the message into it and, after deleting the English part, saved it to the Desktop as RAW.txt in big-endian Unicode format.

Screenshot of Save As dialog box in Notepad. Please note its Encoding field that is set to "Unicode big endian".

Then, I opened Command Prompt and issued the following commands:

cd Desktop
type RAW.txt
Screenshot of Command Prompt in Windows 7: A type command has displayed the contents of RAW.txt on the screen.

And the result is as follows:

■ ch-policy/2014/06/appeals-court-throws-out-340000-online-libel-ruling/#p3

arstechnica.com/security/2014/06/under-ddos-feedly-buckles-but-defies-attackers-extortion-demands/

Ignoring the first two characters, I copied and pasted the message into Notepad again and added back the English portion from the SMS message to it:

Tried my best; knew your life depends on it:
http://arstechnica.com/tech-policy/2014/06/appeals-court-throws-out-340000-online-libel-ruling/#p3
arstechnica.com/security/2014/06/under-ddos-feedly-buckles-but-defies-attackers-extortion-demands/

What is the meaning of what I did?

As soon as I saw the message, I guessed that it was English text misinterpreted as some East Asian script. So, all I had to do was typecasting: I had to tell my computer to interpret the same bytes in the same order, not as 16-bit Unicode but as ASCII. To do so, I told Notepad to save the corrupted part as “Unicode big endian”, thus retaining the order of bits and bytes of the original transmission. Then, I told Command Prompt to dump the contents of the file on the screen, knowing that Command Prompt knows no such thing as Unicode. It only knows the default code page, in this case, extended ASCII.
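As a rough illustration of this typecast, here is a minimal Python sketch. It is not what was actually run (that was Notepad and Command Prompt); the function names garble and repair are made up for the example, and it assumes the corruption amounts to reading pairs of ASCII bytes as big-endian 16-bit Unicode (UTF-16).

```python
def garble(text: str) -> str:
    """Simulate the corruption: read pairs of ASCII bytes as UTF-16 BE code units."""
    raw = text.encode("ascii")
    if len(raw) % 2:
        raise ValueError("need an even number of bytes to pair them up")
    return raw.decode("utf-16-be")


def repair(garbled: str) -> str:
    """Undo it: write the characters back as big-endian bytes, then read them as ASCII."""
    return garbled.encode("utf-16-be").decode("ascii")


sample = "ch-policy/"
corrupted = garble(sample)

# Show the code points instead of the glyphs, so the output does not
# depend on the console being able to display them.
print(" ".join(f"U+{ord(c):04X}" for c in corrupted))
print(repair(corrupted))
```

Each corrupted character carries two English letters: U+6368 is the byte 63 ("c") followed by the byte 68 ("h"). Saving as big-endian keeps those two bytes in that order, which is why the choice of “Unicode big endian” matters.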

But what are the first two characters that I ignored? They were not part of the original message, of course; Notepad added them. Together they form a byte order mark (BOM), which indicates whether the Unicode file is big-endian or little-endian. But Command Prompt, which is Unicode-illiterate, renders the two bytes of the BOM as ordinary extended-ASCII characters.
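To see where the two stray characters come from, the following Python sketch decodes a BOM-prefixed byte string the way a code-page-only console would. The code page is an assumption: 437 (the classic US MS-DOS code page) is used here, and other code pages may show different glyphs for the same two bytes. The helper name strip_bom is also just for illustration.

```python
import codecs

# Assumption: the console code page is 437; this is not stated in the post.
ASSUMED_CODE_PAGE = "cp437"


def strip_bom(data: bytes) -> bytes:
    """Drop a leading UTF-16 big-endian byte order mark, if present."""
    if data.startswith(codecs.BOM_UTF16_BE):
        return data[len(codecs.BOM_UTF16_BE):]
    return data


# What the start of a BOM-prefixed file looks like as raw bytes.
file_bytes = codecs.BOM_UTF16_BE + b"ch-policy/"

# Decode every byte through the code page, BOM included.
shown = file_bytes.decode(ASSUMED_CODE_PAGE)
print([f"U+{ord(c):04X}" for c in shown[:2]])  # the two stray characters

# With the BOM removed, only the English text remains.
print(strip_bom(file_bytes).decode("ascii"))
```

Under code page 437 the BOM bytes FE and FF come out as a black square (U+25A0) and a non-breaking space (U+00A0), which is consistent with the "■ " at the start of the output above.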

Hack vs. standardized approach

What I did is called a “hack”: I applied a non-standard approach based on quirks of the programs on my computer. It may break the next time Microsoft releases a version of Windows with a Unicode-compliant Command Prompt. But it is a solution that nevertheless works on existing versions of Windows, and anyone can carry it out with Windows itself.

Instead of using Command Prompt, I could have opened RAW.txt with Microsoft Word, which allows me to choose the encoding arbitrarily. I could tell Word to use the MS-DOS encoding and get the same result. I could also open RAW.txt with a hex editor and strip away the BOM; after that, Notepad could re-open the file and show the English contents in it. Or I could write a PowerShell script to read the gibberish, delete its BOM and cast it to ASCII. But the first two need third-party software and the last needs knowledge of PowerShell and the .NET Framework.
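For comparison, the scripted route could look something like the sketch below. It is written in Python rather than PowerShell, so it is only a stand-in for the script mentioned above, and Python is itself an extra install on Windows. The function name recover_file is hypothetical, and the temporary RAW.txt written here merely imitates Notepad's save step with the first five corrupted characters of the message.

```python
import codecs
import os
import tempfile


def recover_file(path: str) -> str:
    """Read a 'Unicode big endian' file as raw bytes and decode them as ASCII."""
    with open(path, "rb") as handle:  # binary mode: no decoding happens yet
        data = handle.read()
    if data.startswith(codecs.BOM_UTF16_BE):
        data = data[len(codecs.BOM_UTF16_BE):]  # delete the two-byte BOM
    return data.decode("ascii")  # the "cast to ASCII" step


# Stand-in for Notepad's save: BOM followed by big-endian UTF-16 bytes.
garbled = "\u6368\u2d70\u6f6c\u6963\u792f"  # first five corrupted characters
path = os.path.join(tempfile.mkdtemp(), "RAW.txt")
with open(path, "wb") as handle:
    handle.write(codecs.BOM_UTF16_BE + garbled.encode("utf-16-be"))

print(recover_file(path))
```

Unlike the Command Prompt trick, this does not depend on the console's code page, but it will raise an error if the file contains any byte outside the ASCII range.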

Disclaimer

How come my life depends on some Ars Technica links? Well, it does not. I doctored the original message, replacing the URLs to protect the privacy of the people involved, since I have no means of stopping less-than-benevolent people from reading this blog.

Posted on 20 June 2014, in Computers and Internet.
