There was a challenge today on Didier Stevens’s blog. It’s a PDF puzzle: the goal is to find the passphrase hidden inside the file.

Opening the file with a PDF reader, you’ll see the text:
“The passphrase is XXXXXXXXXXXXXXXXXXX”.
The passphrase is certainly not a sequence of ‘X’s. So how do we find it?

Didier gave us a little hint: “There’s a very simple solution just requiring Notepad”. Opening the file with Notepad reveals the complete structure of the PDF file. The passphrase does not appear in plain text inside the file, but after a closer look I noticed these lines:
5 0 obj
...
/Filter /ASCII85Decode
...
stream
6<#'\7PQ#@1a#b...

This is the definition of an object and, as you can see, its stream is encoded using ASCII85. With a decoder it’s pretty easy to retrieve the required passphrase: “Incremental Updates”.
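If you don’t have a decoder at hand, a few lines of Python will do. This is only a minimal sketch, not the tool I used: it assumes you have already copied the bytes between “stream” and “endstream” out of the file, the function name decode_ascii85_stream is my own, and the sample input is made up for illustration (it is not the stream from the challenge).

```python
import base64

def decode_ascii85_stream(raw: bytes) -> bytes:
    """Decode the body of a PDF stream that uses /ASCII85Decode."""
    data = raw.strip()
    # PDF marks the end of ASCII85 data with "~>"; drop it and anything after it.
    end = data.find(b"~>")
    if end != -1:
        data = data[:end]
    # Some tools also emit a leading "<~"; tolerate it.
    if data.startswith(b"<~"):
        data = data[2:]
    # a85decode ignores whitespace (including line breaks) by default.
    return base64.a85decode(data)

# Made-up sample: a tiny text-showing snippet, encoded and then decoded again.
sample = base64.a85encode(b"BT (Hello) Tj ET") + b"~>"
print(decode_ascii85_stream(sample))
```

Paste the real stream bytes in place of the sample and the decoded content stream, including the text shown on the page, comes out.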

Is it really necessary to use an ASCII85 decoder?
In fact there are two suspicious snippets inside the file; the first is the one shown above, and the other one is:
5 0 obj
...
/Filter /ASCII85Decode
...
stream
6<#'\7PQ#@1a#b...

The two objects are almost identical: only a few bytes differ in the encoded strings. The first and last parts of the encoded strings are the same, which means they contain the same operators; i.e. if the object is used to display a text string, both versions can use the same coordinates.
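The comparison can also be done mechanically instead of by eye. Here is a rough sketch that counts how many leading and trailing bytes two encoded strings share; common_affixes is a hypothetical helper name, and the two inputs are placeholders, not the real streams.

```python
def common_affixes(a: bytes, b: bytes):
    """Return (shared prefix length, shared suffix length) of two byte strings."""
    limit = min(len(a), len(b))
    prefix = 0
    while prefix < limit and a[prefix] == b[prefix]:
        prefix += 1
    suffix = 0
    # Never let the suffix overlap the prefix that was already matched.
    while suffix < limit - prefix and a[-1 - suffix] == b[-1 - suffix]:
        suffix += 1
    return prefix, suffix

# Placeholder strings standing in for the two encoded streams.
first = b"HEAD-secret-TAIL"
second = b"HEAD-XXXXXX-TAIL"
print(common_affixes(first, second))
```

Whatever lies between the shared prefix and the shared suffix is the part that changed, which in this puzzle is the text string itself.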

OK, I have two streams, but only one will be shown. Who decides what gets displayed?
A PDF file contains a cross-reference table, which records the byte offset of each object inside the file. A table looks something like this:

xref
0 7
0000000000 65535 f
0000000012 00000 n
0000000089 00000 n
0000000145 00000 n
0000000214 00000 n
0000000419 00000 n
0000000594 00000 n

The table has 7 entries, one per object number. Checking each object’s offset (the number in the first column), you’ll find that only one of the two streams is referenced. The other one is not listed in this table because there’s another cross-reference table at the end of the file:

xref
0 1
0000000000 65535 f
5 1
0000000935 00000 n
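To make the lookup concrete, here is a small sketch that parses the two tables above and applies the second on top of the first, the way a reader resolves an incremental update. It is a simplification: a real reader finds the tables through “startxref” and the trailer’s /Prev entry, which I skip here. parse_xref is a name I made up, and the offsets are the ones shown in this post.

```python
def parse_xref(text: str) -> dict:
    """Map object number -> byte offset for the in-use ('n') entries of one xref section."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if lines[0] != "xref":
        raise ValueError("not an xref section")
    offsets = {}
    i = 1
    while i < len(lines):
        # Subsection header: first object number, then the number of entries.
        first, count = map(int, lines[i].split())
        i += 1
        for n in range(count):
            offset, _generation, kind = lines[i + n].split()
            if kind == "n":  # 'f' marks a free entry, which has no object
                offsets[first + n] = int(offset)
        i += count
    return offsets

original = """xref
0 7
0000000000 65535 f
0000000012 00000 n
0000000089 00000 n
0000000145 00000 n
0000000214 00000 n
0000000419 00000 n
0000000594 00000 n
"""

update = """xref
0 1
0000000000 65535 f
5 1
0000000935 00000 n
"""

objects = parse_xref(original)
objects.update(parse_xref(update))  # later entries win
print(objects[5])
```

With these tables, object 5 starts out at offset 419 and ends up at offset 935 once the update is applied, while every other object keeps its original offset.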

It’s pretty obvious now: the second stream (the text with the X’s) overrides the first one (the text with the passphrase), because the newer table points object 5 at the second stream.
To see the right text I removed some bytes from the end of the file. You can remove all the bytes after the first “%%EOF” occurrence.
Now you can see the hidden passphrase without using an ASCII85 decoder. Nice challenge!
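I did the cut by hand, but the same thing can be scripted. A minimal sketch, assuming the file’s bytes are already loaded in memory; strip_incremental_updates is a hypothetical name and the demo bytes are a toy stand-in, not a valid PDF.

```python
def strip_incremental_updates(data: bytes) -> bytes:
    """Keep everything up to and including the first %%EOF marker."""
    marker = b"%%EOF"
    end = data.find(marker)
    if end == -1:
        raise ValueError("no %%EOF marker found")
    return data[: end + len(marker)] + b"\n"

# Toy stand-in for a file with one incremental update appended.
demo = b"%PDF-1.1\noriginal body\n%%EOF\nupdate body\n%%EOF\n"
print(strip_incremental_updates(demo))
```

For a real file, read it in binary mode, pass the bytes through the function and write the result to a new file, so the original stays untouched.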

Lunch break ends now…