Reading the excellent book Statistical Inference by George Casella and Roger L. Berger, I found a reference to an interesting 1986 paper by Lawrence M. Leemis that's worth reading. The paper was updated in 2008 and presents a nice graphical model (though not really in the sense we all imagine now: just a graph) of univariate probability distributions and what it takes to go from one to another, such as generalizations or specializations. The graph in this paper is really amazing!
Just for fun, the Graph:
- Do you like our owl?
- It's artificial?
- Of course it is.
Tuesday, November 9, 2010
Monday, August 16, 2010
Linear Algebra in C++
What? Linear Algebra in C++, especially in Machine Learning, where almost everybody programs in Matlab? Yes, because C++ is most of the time the language used in industry. And it's not a bad one, to be honest (please, no language war here; I know many other languages are good too). So, I'm the new manager of a Linear Algebra library called uBLAS. Yes, you're reading that right, the one in the famous Boost libraries.
Boost is made of many libraries, focusing on template programming and higher-order programming. One of them, uBLAS, is devoted to Linear Algebra. I created a companion website for it. It's one of the fastest libraries, but it still lacks several things, and we're working hard to improve that:
- documentation was poor: I made a new one and I'm still improving it;
- no small fast SSE vectors and no GPU support: we are already working on this;
- I will change products from prod(m1, m2) to a nicer m1*m2 directly (a sketch follows this list);
- a new impressive assignment operator: it's now easier and more powerful to fill in your matrices than in Matlab!
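To give an idea of what this looks like today, here is a minimal sketch using the current uBLAS interface; the explicit prod() call is the part that the nicer operator syntax will replace (the commented m1*m2 line is only the planned syntax and does not compile yet):

#include <boost/numeric/ublas/matrix.hpp>
#include <boost/numeric/ublas/io.hpp>
#include <iostream>

int main() {
    using namespace boost::numeric::ublas;

    matrix<double> m1(2, 2), m2(2, 2);
    // fill element by element (m2 is the identity)
    for (unsigned i = 0; i < 2; ++i)
        for (unsigned j = 0; j < 2; ++j) {
            m1(i, j) = 1.0 + i + j;
            m2(i, j) = (i == j) ? 1.0 : 0.0;
        }

    matrix<double> m3 = prod(m1, m2);   // today: explicit call to prod()
    // matrix<double> m3 = m1 * m2;     // planned: natural operator syntax

    std::cout << m3 << std::endl;       // io.hpp provides stream output for matrices
    return 0;
}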
The new version 1.44 is out now, with a lot of improvements and it's just the beginning of the story. I invite everyone interested in linear algebra in C++ to join the project.
Tuesday, June 29, 2010
log-sum-exp trick
When I implement models with discrete variables (which actually happens more often than one might think), I always end up estimating this value:
\[ V = \log \left( \sum_i e^{b_i} \right) \]
Why? This typically appears in the denominator of Bayes' formula, for example. I try to keep \(\log\)-probabilities all the time, so as not to have to deal with very small numbers, and to do additions instead of multiplications. By the way, I was looking at the latency and throughput of floating-point instructions in the latest processors (the Intel Core i7, for example), and I realized that, still in 2010, additions are faster than multiplications (even with SSEx and the like).
Therefore, use \(\log\)-probabilities whenever you can.
In this expression, the \(b_i\) are the log-probabilities, so the \(e^{b_i}\) are very small or very large, sometimes leading to underflow or overflow. A scaling trick keeps the numbers in a better range, without loss of accuracy and at a small extra cost, as follows:
\[ \begin{array}{rcl} \log \left( \sum_i e^{b_i} \right)&=& \log \left( \sum_i e^{b_i}e^{-B}e^{B} \right)\\ ~ &=& \log \left( \left( \sum_i e^{b_i - B }\right)e^{B} \right)\\ ~ &=& \log \left( \sum_{i} e^{b_i - B} \right) + B \end{array} \]
And that's it. For the value of \(B\), take for instance \(B=\max_i b_i\).
So the extra cost is just finding the maximum value and doing one subtraction per term.
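Here is a minimal sketch of the trick in C++ (the helper name log_sum_exp is just for illustration, not from any library):

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Computes log( sum_i exp(b[i]) ) using the scaling B = max_i b[i].
// Assumes b is non-empty.
double log_sum_exp(const std::vector<double>& b) {
    const double B = *std::max_element(b.begin(), b.end());
    double sum = 0.0;
    for (std::size_t i = 0; i < b.size(); ++i)
        sum += std::exp(b[i] - B);   // every exponent is <= 0, so no overflow
    return std::log(sum) + B;
}

For example, with \(b = (-1000, -1001)\) the naive computation underflows to \(\log(0)\), while the trick returns about \(-999.69\).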
Monday, June 28, 2010
Just for those of you who want to know how to put formulas in Blogger, I used this link: http://watchmath.com/vlog/?p=438
Pretty straightforward. It uses a public LaTeX server to render the formulas. Very pretty!
This is my first post on this blog. And to be honest, this is the first time I'm gonna try to blog my thoughts. So, I'll do it on what I like these days: Artificial Intelligence and Machine Learning.
The idea is to post thoughts, tricks, ideas, etc., in the hope that people will read and comment too.
And, oh yes, I just installed a function to include math formulas. I don't know if it works, so let's try it now with a simple version of Bayes' formula:
\[ P(A|B) = \frac{P(B|A)\,P(A)}{P(B)} \]