Converting binaries into images

Nutjobs like me actually LIKE looking at such stuff

Had this crazy idea for a tool/set of scripts that automatically converts any binary into a bitmap and compares these bitmaps to generate visualizations like heat maps and the like. Might come in handy for DFIR and related tasks…

…and thus, after a couple of hours tinkering around, I have the first component ready! So here’s just a scratchpad post to document the main “results” first: how to convert a given binary into a bitmap.

The Netpbm formats and Netpbm tools are what I use here. I’ve read about them in the past, but I never expected to be using them this way. Then again, considering that I use CLI scripting and text processing tricks extensively, the plain-text Netpbm formats are probably the most natural thing for me to work with here.

Since the Netpbm tools generally interoperate across the six different formats, there’s not much need to convert between them before producing the final image. The example in the Netpbm wiki page works, but it’s now possible (and simpler) to just call the second command directly instead.

# instead of doing this
$ pgmtoppm "#FFFFFF" somepic.pbm | ppmtobmp > somepic.bmp

# this likely would work anyway
$ ppmtobmp somepic.pbm > somepic.bmp

Ok, some of the fundamental commands and text processing involved:

$ dd if=/dev/urandom bs=32 count=1 > random.bin
1+0 records in
1+0 records out
32 bytes (32 B) copied, 8.0457e-05 s, 398 kB/s

$ xxd random.bin
0000000: 9272 f11f c1bd f512 89d1 7474 479d 973b  .r........ttG..;
0000010: 7935 8f6b eb22 d42a 3fee 9876 6677 52ef  y5.k.".*?..vfwR.

$ xxd -b -c16 random.bin
0000000: 10010010 01110010 11110001 00011111 11000001 10111101 11110101 00010010 10001001 11010001 01110100 01110100 01000111 10011101 10010111 00111011  .r........ttG..;
0000010: 01111001 00110101 10001111 01101011 11101011 00100010 11010100 00101010 00111111 11101110 10011000 01110110 01100110 01110111 01010010 11101111  y5.k.".*?..vfwR.

$ xxd -b -c16 random.bin | sed -r 's/^[^ ]+ (([10]+ ){1,16}).+/\1/'
10010010 01110010 11110001 00011111 11000001 10111101 11110101 00010010 10001001 11010001 01110100 01110100 01000111 10011101 10010111 00111011
01111001 00110101 10001111 01101011 11101011 00100010 11010100 00101010 00111111 11101110 10011000 01110110 01100110 01110111 01010010 11101111

And now…let’s try generating a bitmap for a larger sample file to see what it looks like:

$ dd if=/dev/urandom bs=1024 count=1 > random.bin
1+0 records in
1+0 records out
1024 bytes (1.0 kB) copied, 0.000177957 s, 5.8 MB/s

$ xxd -b -c16 random.bin | sed -r 's/^[^ ]+ (([10]+ ){1,16}).+/\1/' > random.bitchar

$ echo -e "P1\n128 $(wc -l random.bitchar | cut -d' ' -f1)\n$(cat random.bitchar)" | ppmtobmp > random-bitmap.bmp
ppmtobmp: analyzing colors...
ppmtobmp: 2 colors found
ppmtobmp: Writing 1 bits per pixel with a color pallette

(The width of 128 in the header comes from 16 bytes per row * 8 bits per byte = 128 bits, the width of the bitmap representation of the binary here.)

You might have noticed that the height of the image needs to be specified in the PNM header, which we can (nicely) generate using a chain of the wc and cut commands. The last command shown above only works if the binary’s size is a multiple of 16 bytes (128 bits), which is usually not the case. Otherwise the last row comes up short, and Netpbm fails with an error when generating the bmp file. To stop it complaining, we need to pad the output, which is shown in the graymap example below (it’s a pain to pad 128 bits of 0s, so…)
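
One way around the padding pain for the bitmap case is to work out the height from the file size and append only the missing zero bits. What follows is a minimal sketch, assuming bash and the same 128-bit width. The function name bin_to_pbm is made up for illustration, and od plus shell arithmetic stands in for the xxd -b / sed pipeline used above:

```shell
#!/usr/bin/env bash
# bin_to_pbm (hypothetical helper): write a plain PBM (P1) for a file of any size.
# Usage: bin_to_pbm INPUT [WIDTH_IN_BITS] > output.pbm
bin_to_pbm() {
  local in=$1 width=${2:-128}
  local size bits rows pad byte b
  size=$(wc -c < "$in")                     # file size in bytes
  bits=$(( size * 8 ))
  rows=$(( (bits + width - 1) / width ))    # round up to whole rows
  pad=$(( rows * width - bits ))            # zero bits needed to fill the last row

  printf 'P1\n%d %d\n' "$width" "$rows"
  # od prints each byte as an unsigned decimal; emit its 8 bits, MSB first
  for byte in $(od -An -v -tu1 "$in"); do
    for (( b = 7; b >= 0; b-- )); do
      printf '%d ' $(( (byte >> b) & 1 ))
    done
    printf '\n'
  done
  # fill the rest of the final row with 0s (white in P1)
  for (( b = 0; b < pad; b++ )); do printf '0 '; done
  printf '\n'
}
```

Something like `bin_to_pbm random.bin | ppmtobmp > random-bitmap.bmp` should then work regardless of the input size. The per-bit loop is slow, so this is only meant for small samples.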

Now, let’s try generating a graymap from the binary. We usually count file sizes in bytes, with 8 bits to a byte. Conveniently, a byte (8 bits) is also a common way to represent a graymap pixel (it defines the value in the HSV coordinates for you wannabe image processing geeks).

Converting the hex to decimal automatically in CLI is simple (thanks Google), thankfully…

$ xxd -c16 -g1 random.bin | sed -r 's/^[^ ]+ (([0-9a-f]+ ){1,16}).+/\1/' | head -n2
e5 71 e1 56 85 6e 96 71 93 3a 8a 04 dc 1e aa 6e
4e 50 32 23 84 24 e8 ab b7 ae 7f 51 cd f4 54 4e

$ xxd -c16 -g1 random.bin | sed -r 's/^[^ ]+ (([0-9a-f]+ ){1,16}).+/\1/' > random.hex

$ for i in $(cat random.hex); do echo -n "$((0x$i)) "; done
229 113 225 86 133 110 150 113 147 58 138 4 220 30 170 110 78 80 50 35 132 36 232 171 183 174 127 81 205 244 84 78 230 183 10 63 100 59 238 156 226 87 248 43 96 126 148 75 185 228 120 184 153 77 71 66 0 176 208 178 142 145 108 30 45 180 35 168 230 1 241 171 114 117 214 123 4 148 86 210 49 25 242 35 105 195 186 89 220 77 172 25 83 90 23 168 181 18 18 33 97 240 28 111 224 126 187 33 116 171 205 6 74 227 53 180 184 138 188 21 29 ...
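
The hex round trip can also be skipped entirely, since od can print every byte as an unsigned decimal directly. This is a minimal sketch, assuming a POSIX od: -An drops the offset column, -v stops repeated lines from being collapsed, and -tu1 selects one-byte unsigned decimals. sample.bin is a throwaway file created just for this illustration, holding the first four bytes of the dump above:

```shell
# create a tiny sample: the bytes 0xe5 0x71 0xe1 0x56 (written as octal escapes)
printf '\345\161\341\126' > sample.bin

# print every byte as a decimal number, squeezing all whitespace into single spaces
od -An -v -tu1 sample.bin | tr -s ' \n' ' '
```

This should print 229 113 225 86, the same leading values as the loop output above.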

$ echo -e "P2\n16 $(wc -l random.hex | cut -d' ' -f1)\n255" > random.pgm && for i in $(cat random.hex); do echo -n "$((0x$i)) "; done >> random.pgm && echo "0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0" >> random.pgm

$ ppmtobmp random.pgm > random.bmp
ppmtobmp: analyzing colors...
ppmtobmp: 252 colors found
ppmtobmp: Writing 8 bits per pixel with a color pallette

And the result…

One nice thing about the plain PNM formats is that the data section does not need to be broken into lines that match the width of the image. The dimensions are taken only from the header, so the data section can be one big chunk.

Thanks to the row of zeros appended at the end, this set of commands also works with binaries of arbitrary size:

$ dd if=/dev/urandom bs=300 count=1 > random.bin
1+0 records in
1+0 records out
300 bytes (300 B) copied, 0.000105041 s, 2.9 MB/s

$ xxd -c16 -g1 random.bin | sed -r 's/^[^ ]+ (([0-9a-f]+ ){1,16}).+/\1/' > random.hex

$ echo -e "P2\n16 $(wc -l random.hex | cut -d' ' -f1)\n255" > random.pgm && for i in $(cat random.hex); do echo -n "$((0x$i)) "; done >> random.pgm && echo "0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0" >> random.pgm

$ ppmtobmp random.pgm > random-variedsize.bmp
ppmtobmp: analyzing colors...
ppmtobmp: 179 colors found
ppmtobmp: Writing 8 bits per pixel with a color pallette

The resulting image shows the padding (black) after the data part ends:

Now that we know how to generate a bitmap or graymap from a binary, it shouldn’t be too big a jump to generate these images for a set of binaries…
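
As a rough idea of what that jump could look like, here is a minimal sketch, assuming bash. The function name bin_to_pgm and the samples directory are placeholders of mine. It uses od for the decimal conversion and pads only the pixels missing from the last row, rather than always appending a full row of zeros:

```shell
#!/usr/bin/env bash
# bin_to_pgm (hypothetical helper): write a plain PGM (P2) graymap for one file.
# Usage: bin_to_pgm INPUT [WIDTH_IN_BYTES] > output.pgm
bin_to_pgm() {
  local in=$1 width=${2:-16}
  local size rows pad i
  size=$(wc -c < "$in")                    # file size in bytes
  rows=$(( (size + width - 1) / width ))   # round up to whole rows
  pad=$(( rows * width - size ))           # only the missing pixels

  printf 'P2\n%d %d\n255\n' "$width" "$rows"
  od -An -v -tu1 "$in"                     # one decimal value (0-255) per byte
  for (( i = 0; i < pad; i++ )); do printf '0 '; done
  printf '\n'
}

# Convert every non-empty regular file in a directory ("samples" is a placeholder).
for f in samples/*; do
  [ -f "$f" ] && [ -s "$f" ] || continue   # skip directories and empty files
  case $f in *.pgm) continue ;; esac       # skip graymaps from an earlier run
  bin_to_pgm "$f" > "$f.pgm"
done
```

Each resulting .pgm can then be fed to ppmtobmp exactly as in the single-file examples.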

Lastly, one quirk I noticed is that in P1 mode (bitmap), 0 denotes white whereas 1 denotes black. It’s the opposite in P2 mode (graymap), where the lowest value (0) denotes black and the highest value (255 for our hex conversions) denotes white. Nice and confusing, eh? 🙂
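
If the bitmap should follow the graymap convention instead (set bits bright, cleared bits dark), the bit characters can simply be swapped before the header is added. A small sketch using tr on part of one row from the earlier output:

```shell
# swap every 0 and 1 so that 1 bits come out white in P1
echo "10010010 01110010" | tr '01' '10'
# prints: 01101101 10001101
```

The same tr call could be dropped into the pipeline right after the sed step that produces random.bitchar. Netpbm also ships pnminvert for inverting an image that already exists.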

The next component of this crazy DFIR tool idea is to munch on these generated images to assist in analysis.  Let’s see how far this goes…
