Well, I have worked the stuff from my previous post into a set of scripts, just to share. For more on visualizing binary samples and why/how it works, refer to that post.
You can download the code here. You need to have gnuplot installed (the scripts also call xxd). And yes, this runs on Linux (perhaps it could run in Cygwin too, but YMMV).
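A quick way to confirm the required tools are on your PATH before running anything. This is only a rough sketch: `missing_tools` is a made-up helper name and is not part of the download.

```shell
# missing_tools CMD...: prints the name of every listed command not found on PATH
# (hypothetical helper, not part of the original scripts)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# gnuplot draws the plots; xxd is used by chompdata.sh to dump the bytes
missing=$(missing_tools gnuplot xxd)
if [ -n "$missing" ]; then
    echo "missing required tools:" $missing 1>&2
else
    echo "all required tools found"
fi
```

If anything is reported missing, install it with your distribution's package manager first.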

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
How to use (the output folder defaults to ./output/; run it from the folder containing the scripts, since it calls ./chompdata.sh):
./generate_scatterplots.sh <binaries_folder> [<output_folder>]
HTH.
$ ./generate_scatterplots.sh test/
chomping input files...
test//test10.bin test//test1.bin test//test2.bin test//test3.bin test//test4.bin test//test5.bin test//test6.bin test//test7.bin test//test8.bin test//test9.bin
generating scatterplots, total of 20381 points to plot...
0005000.png 0010000.png 0015000.png 0020000.png 0025000.png
done! generated PNG files are in ./output/ folder

$ ./generate_scatterplots.sh test/ output2
chomping input files...
test//test10.bin test//test1.bin test//test2.bin test//test3.bin test//test4.bin test//test5.bin test//test6.bin test//test7.bin test//test8.bin test//test9.bin
generating scatterplots, total of 20381 points to plot...
0005000.png 0010000.png 0015000.png 0020000.png 0025000.png
done! generated PNG files are in output2 folder
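If you have no samples at hand and just want to try the scripts, you could generate a folder of throwaway binaries first. A minimal sketch, assuming `/dev/urandom` and `head -c` are available (they are on typical Linux systems); `make_samples` and the `./test-random` folder are hypothetical names, not part of the download.

```shell
# make_samples DIR COUNT SIZE: writes COUNT files of SIZE random bytes each into DIR
# (hypothetical helper, not part of the original scripts)
make_samples() {
    mkdir -p "$1"
    n=1
    while [ "$n" -le "$2" ]; do
        head -c "$3" /dev/urandom > "$1/test$n.bin"
        n=$((n + 1))
    done
}

# ten files of 2048 random bytes each, named test1.bin .. test10.bin
make_samples ./test-random 10 2048
```

The folder can then be passed as the first parameter, e.g. ./generate_scatterplots.sh ./test-random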
The sources:
#!/usr/bin/env bash
# This work by Ray Foo is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
# http://creativecommons.org/licenses/by-sa/3.0/deed.en_US
# generate_scatterplots.sh
# Processes all files in a given folder (first parameter), and generates scatterplots showing
# the spread of the observed byte values (converted into decimal) at each offset.
# Outputs to "./output/" folder by default unless specified with the second parameter.
# Release 1
if [ -z "$1" ]; then echo "I need something to work on!" 1>&2; exit 1; fi
output=${2:-"./output/"}
tmpfolder="./tmp-528cf3248f302b4b3c8b94d8afba716f/"
ptspergraph=5000
# create folders
mkdir -p "$output"
if [ -e "$tmpfolder" ]; then rm -rf "$tmpfolder"; fi
mkdir "$tmpfolder"
# generate points
echo -e "\nchomping input files..."
for i in "$1"/*; do
    echo -n "$i "
    ./chompdata.sh "$i" >> "$tmpfolder"/tmp1
done
sort -u "$tmpfolder"/tmp1 | sort -n > "$tmpfolder"/tmp2
echo -e "\n"
# plot
lines=$(wc -l "$tmpfolder"/tmp2 | cut -d' ' -f1)
echo "generating scatterplots, total of $lines points to plot..."
counter=0
while ((counter < lines)); do
    start=$((counter + 1))
    let "counter+=ptspergraph"
    echo -n "$(printf "%07d" $counter).png "
    # take only lines start..counter, so the final (shorter) chunk does not repeat points from the previous plot
    sed -n "${start},${counter}p" "$tmpfolder"/tmp2 > input.dat
    gnuplot scatterplot_values_vs_offsets.gnuplot > "$output"/"$(printf "%07d" $counter)".png
done
echo -e "\ndone! generated PNG files are in $output folder\n"
# cleanup
rm input.dat
rm -rf "$tmpfolder"
#!/usr/bin/env bash
# This work by Ray Foo is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
# http://creativecommons.org/licenses/by-sa/3.0/deed.en_US
# chompdata.sh
# takes a file (or STDIN by default) and outputs to STDOUT every offset and its binary value (in decimal) per line
# Release 1
tmpfile="tmpfile.tmp"
input=${1:-"-"}
xxd -c1 -ps "$input" > "$tmpfile"
counter=0
for i in $(cat "$tmpfile"); do
    echo "$counter $((0x$i))"
    let "counter++"
done
rm "$tmpfile"
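As an aside, the same offset/value listing can be produced in a single pipeline without the temporary file. This is a sketch of an alternative, not what chompdata.sh does: it swaps xxd for `od` (using the POSIX options `-An -v -tu1`) plus `tr` and `awk`, and `bytes_to_points` is a hypothetical name.

```shell
# bytes_to_points: reads bytes on STDIN, prints "<offset> <value in decimal>" per byte
# (hypothetical helper; an od-based alternative to the xxd loop in chompdata.sh)
bytes_to_points() {
    # od: no address column (-An), no collapsing of repeated lines (-v),
    #     each byte as an unsigned decimal (-tu1)
    # tr: put every value on its own line
    # awk: skip empty lines and number the values from offset 0
    od -An -v -tu1 | tr -s ' ' '\n' | awk 'NF { print n++, $1 }'
}

printf 'AB\n' | bytes_to_points
# prints:
# 0 65
# 1 66
# 2 10
```

It would be called as bytes_to_points < somefile.bin, and avoids starting one arithmetic expansion per byte in a shell loop.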
# This work by Ray Foo is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
# http://creativecommons.org/licenses/by-sa/3.0/deed.en_US
# scatterplot_values_vs_offsets.gnuplot
# gnuplot commands for creating scatterplot, outputs PNG to STDOUT
# Release 1
set terminal png size 1280,800
set yrange [-1:256]
set ytics 16
set key off
set grid xtics ytics
set title "Scatterplot of binary values across offsets"
set xlabel "Offset (decimal)"
set ylabel "Value (decimal)"
plot "input.dat" using 1:2 with points pt 5 ps 0.5




