Skip to content

Objective 11b: Naughty/Nice List with Blockchain Investigation Part 2


The SHA256 of Jack's altered block is: 58a3b9335a6ceb0234c12d35a0564c4ef0e90152d0eb2ce2082383b38028a90f. If you're clever, you can recreate the original version of that block by changing the values of only 4 bytes. Once you've recreated the original block, what is the SHA256 of that block?

TL;DR - Answer


Qwerty Petabyte is giving a talk about blockchain tomfoolery!

The idea that Jack could somehow change the data in a block without invalidating the whole chain just collides with the concept of hashes and blockchains. While there's no way it could happen, maybe if you look at the block that seems like it got changed, it might help.

Apparently Jack was able to change just 4 bytes in the block to completely change everything about it. It's like some sort of evil game to him.

A blockchain works by "chaining" blocks together - each new block includes a hash of the previous block. That previous hash value is included in the data that is hashed - and that hash value will be in the next block. So there's no way that Jack could change an existing block without it messing up the chain...

If Jack was somehow able to change the contents of the block AND the document without changing the hash... that would require a very UNIque hash COLLision.

Shinny Upatree swears that he doesn't remember writing the contents of the document found in that block. Maybe looking closely at the documents, you might find something interesting.

First Two Bytes

Let's find the block that Jack altered:

>>> import hashlib
>>> import naughty_nice
>>> chain = naughty_nice.Chain(load=True, filename="blockchain.dat")
>>> jack_hash = "58a3b9335a6ceb0234c12d35a0564c4ef0e90152d0eb2ce2082383b38028a90f"
>>> for i, b in enumerate(chain.blocks):
...     if hashlib.sha256(b.block_data_signed()).hexdigest() == jack_hash:
...         print(i, b.index)
1010 129459

If we print the block at index 1010, we start to see some suspicious things about it:

>>> chain.blocks[1010]
Chain Index: 129459
    Nonce: a9447e5771c704f4
    PID: 0000000000012fd1
    RID: 000000000000020f
    Document Count: 2
    Score: ffffffff (4294967295)
    Sign: 1 (Nice)
    Data item: 1
        Data Type: ff (Binary blob)
        Data Length: 0000006c
        Data: <-snip->
    Data item: 2
        Data Type: 05 (PDF)
        Data Length: 00009f57
        Data: <-snip->
    Date: 03/24
    Time: 13:21:41
    PreviousHash: 4a91947439046c2dbaa96db38e924665
    Data Hash to Sign: 347979fece8d403e06f89f8633b5231a
    Signature: <-snip->

First, the block shows Jack receiving a very large nice score - something inconsistent with everything we've seen thus far in the game. Second, there is a strange binary blob document that has been added. Third, the attached PDF is larger than most of the other PDFs in the chain. Let's start by extracting the two documents, and dumping the block to a file so we can view it outside of the chain.

>>> chain.blocks[1010].dump_doc(1)
Document dumped as: 129459.bin
>>> chain.blocks[1010].dump_doc(2)
Document dumped as: 129459.pdf
>>> chain.save_a_block(1010, filename="block129459.dat")
The extracted files can downloaded here: 129459.bin, 129459.pdf, block129459.dat

If we try to view the PDF in a reader such as Adobe Acrobat Reader, we get an error stating The root object is missing or invalid. Let's try another PDF reader, such as Atril Document Viewer which is the default PDF viewer in modern versions of Kali Linux.


A version of the nice report that has been fixed to be readable in Adobe Reader is available here.

The document contains glowing praise from Mother Nature, the Tooth Fairy, Rudolph, and the Abominable Snowman about how great a guy Jack is. Shinny Upatree (the elf who supposedly wrote the report) says he did not write this document, so something is clearly going on with this PDF. peepdf will allow us to view the structure of the PDF to see what's going on. Since the PDF appears to be malformed, we need to use the -f flag in peepdf to tell it to ignore errors; -i allows us to enter interactive mode.

$ peepdf 129459.pdf -if

File: 129459.pdf
MD5: 448ac151b73a6b6da84cccec3345089a
SHA1: a5ff25f4695d17940af6d031f954694932e06d5a
SHA256: 140a330080fe6683d3ac00bddb8eeaeedf4e1d8ad71213ce822cb905b9f5d2f2
Size: 40791 bytes
Version: 1.3
Binary: False
Linearized: False
Encrypted: False
Updates: 0
Objects: 23
Streams: 8
URIs: 0
Comments: 0
Errors: 1

Version 0:
        Catalog: 1
        Info: No
        Objects (23): [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]
                Errors (8): [4, 9, 10, 13, 14, 16, 21, 22]
        Streams (8): [4, 9, 10, 13, 14, 16, 21, 22]
                Encoded (8): [4, 9, 10, 13, 14, 16, 21, 22]
                Decoding errors (8): [4, 9, 10, 13, 14, 16, 21, 22]

PPDF> tree

/Catalog (1)
        /Pages (2)
                /Page (23)
                        stream (16)
                        dictionary (17)
                                /F1 (18)
                                        /Font (19)
                                                /FontDescriptor (20)
                                                        stream (21)
                                                stream (22)
                        /Pages (2)
/Pages (3)
        /Page (15)
                stream (4)
                dictionary (5)
                        dictionary (6)
                                /Font (7)
                                        /FontDescriptor (8)
                                                stream (9)
                                        stream (10)
                                /Font (11)
                                        /FontDescriptor (12)
                                                stream (13)
                                        stream (14)
                /Pages (3)

There appears to be two independent trees in this document: the one rooted at the Catalog that references object 2 (a Page), and one rooted at object 3 (also a Page). Let's try modifying the page reference in the Catalog from 2 to 3 in a hex editor (file offset 0x3f).

After this modification, the contents of the document changes to describe Jack traveling to Australia to repeatedly kick a wombat! What kind of sick person does such as thing? This seems more in character with the Jack we know.


A version of the naughty report that has been fixed to be readable in Adobe Reader is available here.

How did Jack insert a PDF with this alternate structure? The report contains clues: apparently Shinny Upatree (the reporting elf) was having issues connecting to WiFi when filing the report (probably being DEAUTHed by Jack), so he used Jack's laptop. Also, Shiny had to rush home due to a water leak before filing the report (probably also Jack), giving Jack time to modify the PDF. When it was submitted, everything appeared as expected - Jack had a huge naughty score and the report detailed his wombat kicking episode. Jack then modified the chain, changing just four bytes without changing the hash of the block. To accomplish this he would have needed to know the nonce that was going to be used with this block, but we know from Objective 11a this is possible.

While in the hex editor we also notice that after the Page reference in the Catalog is a bunch of random-looking bytes. This is strange because if you're familar with the PDF specifications, these bytes are occuring in a dictionary value. Ordinarily non-ASCII bytes are seen in a stream, not in a dictionary declaration. This is likely what leads to the malformed errors with Reader, but fixing this is probably not the goal of the challenge as it would require changing more than four bytes.

At this point, we believe we have identified two bytes that were likely modified: the naughty-nice byte, and the Page reference in the PDF. To find the other two bytes we need to spend some time reviewing the hinted materials about MD5 collisions.

Last Two Bytes

In Ange Albertini's slide deck he talks about an identical prefix collision attack against MD5 known as UniColl. UniColl takes a variable length (must be a multiple of 4) prefix, and returns two sets of 64-byte blocks that start with the prefix and have the same MD5 hash. Due to the internal structure of MD5, data can be appended to either of these sets of blocks and the resulting MD5 hash of all of the blocks will be the same. Using the two blocks generated in the slides, this is demonstrated via CyberChef below:


There is a newline added to visualize 64-byte blocks. The CyberChef recipe From Hex will ignore the extra whitespace

Example using blocks from slides

The two sets of blocks differ by just two bytes that are predictably modified:

  • The 10th byte of the last block of the prefix is incremented by 1.
  • The 10th byte of the collision block is decremented by 1

Let's view block129459.dat in a hex editor, and see where the bytes we think Jack modified fall. Starting with the naughty-nice byte since that occurs first:

First Modified Byte

Interesting, the byte that Jack would need to modify (0x31 at offset 0x49) is the 10th byte of the second 64-byte block of the file. Based on our knowledge of the UniColl attack, the 10th byte of the last block of the prefix is incremented by 1. This is exactly what Jack needed - increment 0x30 (naughty) to 0x31 (nice). Notice the random binary blob as the first document of the block - this is the rest of the prefix block and the collision block from UniColl! So bytes 0 through 0x53 were used as the prefix, and the remainder of the bytes from 0x54 to 0xBF were the resulting collision block. If the byte at 0x49 was incremented, based on the deterministic characteristics of the UniColl attack, the byte at 0x89 (0xD6) must have originally been 0xD7. We can verify this in CyberChef just like we did with the example

Hash Collision for Naughty/Nice byte

Nice! This confirms two of the four bytes that were modified.

Now let's look at the second byte we believe Jack modified - the Page reference in the PDF.

Second Modified Byte

Conveniently, this byte is also the 10th byte after a 64-byte block boundary. Also, the random bytes we noted while analyzing the PDF happen to occur immediately after the Page reference. This appears to be the result of a second collision generated by UniColl. For this collision, the prefix was chosen to be bytes 0 through 0x10F or 0x113.


The exact prefix that was used doesn't really matter in this case because anything after the 'R' (at offset 0x10D) is being ignored up until the end of the dictionary (>>) by the more "forgiving" PDF viewers

In this case, Jack wanted the 10th byte of the last block of the prefix to decrement (from 0x33 to 0x32), rather than increment as before. As a result, the prefix that Jack used would have had 0x32 at offset 0x109. This would result in 0x33 at offset 0x109, and 0x1B at offset 0x149. Now, since these blocks are interchangable without modifying the hash of the overall file, Jack would have submitted the block containing 0x32 instead of the original block which contained 0x33. By knowing that 0x33 was the value for the Naughty report, we know that the value at 0x149 would be 0x1B. We can again confirm this with CyberChef:

Hash Collision for PDF byte

We now have the four bytes that Jack modified, and the value that they contained in the original naughty report.

Offset in Block Offset in blockchain.dat "Nice" Value "Naughty" Value
0x49 0x163075 0x31 0x30
0x89 0x1630B5 0xD6 0xD7
0x109 0x163135 0x32 0x33
0x149 0x163175 0x1C 0x1B

We can now modify blockchain.dat at these four offsets, verify the chain to check that we have the right bytes, and hash the block for the final answer.

>>> from naughty_nice import *
>>> from Crypto.PublicKey import RSA
>>> with open('official_public.pem', 'rb') as fh:
...     official_public_key = RSA.importKey(
>>> c2 = Chain(load=True, filename='blockchain_fixed.dat')
>>> c2.verify_chain(official_public_key, c2.blocks[0].previous_hash)
>>> import hashlib
>>> hashlib.sha256(c2.blocks[1010].block_data_signed()).hexdigest()