When I implement models with discrete variables (which happens more often than one might think), I always end up estimating this value:
V = log(∑_i e^{b_i})
Why? This usually appears in the denominator of a Bayes formula, for example. I try to work with log-probabilities throughout, so that I don't have to deal with very small numbers and can do additions instead of multiplications. Incidentally, I was looking at the timing and latency of floating-point instructions on recent processors (the Intel Core i7, for example), and it turns out that even in 2010, additions are still faster than multiplications (even with SSEx and the like).
Therefore, use log
In this expression, the b_i are log-probabilities, so the e^{b_i} can be very small or very large, sometimes causing underflow or overflow. A scaling trick keeps the numbers in a better range, with no loss of accuracy and for a small extra cost, as follows:
log(∑_i e^{b_i}) = log(∑_i e^{b_i} e^{-B} e^{B})
                 = log((∑_i e^{b_i - B}) e^{B})
                 = log(∑_i e^{b_i - B}) + B
And that's it. For the value of B, take for instance B = max_i b_i.
So the extra cost is finding the maximum value and doing one subtraction per term.
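The trick above can be sketched in a few lines of Python (a minimal illustration, not a production implementation; the function name `logsumexp` is my own choice):

```python
import math

def logsumexp(b):
    """Compute log(sum_i exp(b_i)) using the scaling trick."""
    B = max(b)  # scaling constant: the largest log-probability
    # After shifting, every exponent is <= 0, so exp() cannot overflow;
    # the largest term becomes exp(0) = 1, keeping the sum well-scaled.
    s = sum(math.exp(x - B) for x in b)
    return math.log(s) + B

# Example with very negative log-probabilities: a naive sum of exp(b_i)
# would underflow to 0.0 and log(0) would fail, but the scaled version works.
print(logsumexp([-1000.0, -999.0, -1001.0]))  # ≈ -998.59
```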
3 comments:
Very interesting, but ...
let's assume that b_i takes huge values (e.g., 800, 500, 700)
in this case,
how can you avoid computational problems due to machine precision?
Juan
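For what it's worth, Juan's own numbers can be checked directly: with B = max, the shifted exponents are 0, -300, and -100, all of which are representable in double precision, whereas the naive exp(800) overflows (a small sketch, assuming Python doubles):

```python
import math

b = [800.0, 500.0, 700.0]  # Juan's huge values

# Naive evaluation: math.exp(800.0) overflows a double.
try:
    math.exp(b[0])
    naive_ok = True
except OverflowError:
    naive_ok = False

# Scaled evaluation: exponents become 0, -300, -100, all safely representable
# (only terms below roughly exp(-745) would underflow to 0.0, and such terms
# are negligible in the sum anyway, so accuracy is preserved).
B = max(b)
val = math.log(sum(math.exp(x - B) for x in b)) + B
print(naive_ok, val)  # the naive way fails; the scaled result is ≈ 800.0
```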
Do you know who invented the technique? I would like to cite the author.
Istvan
I'm not sure exactly who invented it. I think the first time I saw it was in Kevin Murphy's notes on Naive Bayes, but I guess it's just a common trick that is well known (or maybe Kevin discovered it?).