Sonntag, 19. Juli 2015

Four Shades of Devel::NYTProf

Sometimes a question or remark helps us solving another problem, which has absolutely nothing to do with the question.

The question was about the colors in the reports of Devel::NYTProf. The colors are green, yellow, orange and red, where green means good and red means bad.

These colors appear in the columns "Statements" and "Time on line" of the reports. To illustrate the coloring I found a nice piece of code where nearly all combinations appear:

Line State
ments
Time
on line
Calls Time
in subs
Code
5112µsfor my $j ($bmin..$bmax) {
5263µs  $bj = $b->[$j];
5363µs  unless (defined $positions->{$bj}) {
5421µs    $Vs->[$j] = $S;
5528µs    next;
56  }
5741µs  $y = $positions->{$bj};
5841µs  $u = $S & $y; # [Hyy04]
5943µs  $S = ($S + $u) | ($S - $u); # [Hyy04]
6043µs  $Vs->[$j] = $S;
61}

The different colors are determined by the difference of the values to the median, measured in median absolute deviations. The colors in the column "Statements" are calculated independent from the column "Time on line". This explains, why the first column can be red while the second is green, and why the first column can be green and the second red.

In the above report we see that the body of the for loop is executed 6 times. The lines in the unless body is executed 2 times, and the part outside 4 times.

The unless condition including the body needs 12µs. The intention of the unless construction is an optimization. So let us estimate how much time this safes. The second part (lines 57-60) needs 8µs for 4 executions, that's 2µs per execution. If we can avoid lines 53-55 we need 2 more executions of lines 57-60. That would theoretically mean minus 12µs plus 4µs.

Of course we need to care for the undefined case. And we do not longer need the caching in line 52.

We get now the following report:

Line State
ments
Time
on line
Calls Time
in subs
Code
521700nsfor my $j ($bmin..$bmax) {
5362µs  $y = $positions->{$b->[$j]} // 0;
546900ns  $u = $S & $y; # [Hyy04]
5561µs  $S = ($S + $u) | ($S - $u); # [Hyy04]
5663µs  $Vs->[$j] = $S;
57}

This looks a lot shorter. We eliminated 5 lines of 11 lines. The total execution time is now below 8µs. This is less than one third of the 25µs before the tuning.

The code is from LCS::BV which now is another bit faster than after Tuning Algorithm::Diff.