I have a favorite visualization that I apply to conversations. Each participant is represented by a string. Each character of the string is one hour. Activity in that hour creates a blip. Inactivity leaves dots (or comma at midnight). Here are five days of Dorkbot project blabber visualized. I share my methods in the remainder of this post.
I prepared this chart in two steps: first get the data in a file and then plot it. Let's work backwards. Here is the plot routine:
while (<>) { ($d, $t) = split /\t|\n/; $p{$t} = ("." x 23 . ",") x 5 unless $p{$t}; $h = int($d*24); substr($p{$t},$h,1) = "|"; } print map ",$p{$_} $_\n", sort keys %p;
This perl code reads date $d and title $t from a file. It creates a plot $p for each title and marks the hour $h with a "|". Here are the first few lines of the data file input to this program:
0.4740 Aaron Eiche 0.4819 Donald Delmar Davis 0.4879 Jared Boone 0.4957 Brian Collins 0.4974 Aaron Eiche 0.5131 Scott Dixon 0.5174 Donald Delmar Davis 0.5439 Aaron Eiche 0.8517 Russell Senior 1.0052 Greg Peek
I collected this data by emailing the whole thread to my self using Apple's mail reader. This gave me a single text file with 80 mime attachements, one for each post. I used this loop to find the "From" and "Date" lines in the headers:
while(<>) { report() if /^--Apple-Mail-126-72263117/; $b++ if /^$/; next if $b>=2; chomp; $t{$1}=$_ if /^(Date|From):/; }
The crazy Apple-Mail-126-72263117 line is the mime separator. I found it by inspection. I also found that all the email headers were before the second blank line of each attachment. Blank lines are significant in email headers. The report() subroutine wrote each line of the data file and then clears the blank line count $b:
sub report { $d = sprintf "%0.4f", `node -e 'console.log (( Date.parse("$t{Date}") - Date.parse("31 Jan 2012 0:00:00 -0800"))/3600000/24)'`; $f = $1 if $t{From} =~ /From: "?(.*?)"? </; print "$d\t$f\n"; $b = 0; }
Email programs vary greatly in how they format names and dates. I tried to load some proven date parser from CPAN but finally gave up and just shell'd out to Node.js to do the date conversion.
These programs would be much longer if I tried to write them in some general way. I find if I write them over and over then I keep my coding skills sharp and move quickly on to other things.
(Regretfully line breaks have been squeezed out of the code and data samples in this post. I'll see if I can't fix that when I get a minute. Until then you must use some imagination.)