Mutual information on stocks dataset

Sat, Jul 11, 2020 1-minute read

For stock i, let $$R_{i,t}$$ and $$P_{i,t}$$ denote the **log-return on day t** and the **closing price on day t**, related by: ‘‘$! R_{i,t} = \ln{\frac{P_{i,t}}{P_{i, t-1}}} $!’’ For stock i, we take the log-return range $$[\min R_i, \max R_i]$$ and divide it into k sub-intervals. The frequency with which returns fall into each sub-interval is then computed and used as an estimate of the **probability**, and the mutual information between two such quantities is calculated under the assumption that they are discrete variables.
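The procedure above can be sketched in a few lines of NumPy. This is only a minimal illustration, assuming equal-width sub-intervals, natural logarithms (so the result is in nats), and two return series of equal length; the function names `log_returns`, `discretize`, and `mutual_information` are hypothetical and not taken from any library.

```python
import numpy as np

def log_returns(prices):
    """R_t = ln(P_t / P_{t-1}) for a 1-D sequence of closing prices."""
    prices = np.asarray(prices, dtype=float)
    return np.log(prices[1:] / prices[:-1])

def discretize(returns, k):
    """Assign each return a bin index in 0..k-1 over [min R, max R] (assumes k >= 2)."""
    returns = np.asarray(returns, dtype=float)
    edges = np.linspace(returns.min(), returns.max(), k + 1)
    # Only interior edges are passed, so the maximum lands in the last bin.
    return np.digitize(returns, edges[1:-1])

def mutual_information(x_bins, y_bins, k):
    """I(X;Y) in nats, with probabilities estimated by joint frequencies."""
    joint = np.zeros((k, k))
    np.add.at(joint, (x_bins, y_bins), 1)   # count co-occurrences of bin pairs
    joint /= joint.sum()                    # frequencies -> estimated probabilities
    px = joint.sum(axis=1, keepdims=True)   # marginal of X, shape (k, 1)
    py = joint.sum(axis=0, keepdims=True)   # marginal of Y, shape (1, k)
    mask = joint > 0                        # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[mask] * np.log(joint[mask] / (px @ py)[mask])))
```

For two price series `p_a` and `p_b`, one would call `mutual_information(discretize(log_returns(p_a), k), discretize(log_returns(p_b), k), k)`. The estimate depends on the choice of k, which the text leaves open.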

Given two variables, we can calculate Pearson’s correlation coefficient $$\rho_{X,Y}$$ and use it to define a metric: ‘‘$! d_\rho (X, Y) = \sqrt{2(1 - \rho_{X,Y})} $!’’ Similarly, given the mutual information between these variables, we can define:

‘‘$! d_M(X,Y) = H(X) + H(Y) - 2I(X,Y) $!’’

This function is a valid metric, so the set of variables equipped with it forms a metric space.
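Both distances can be computed directly from data. The sketch below assumes the variables have already been discretized into integer bin labels in 0..k-1 for the information-based distance, and uses raw values for the correlation-based one; `pearson_distance`, `entropy`, and `information_distance` are hypothetical names chosen for illustration.

```python
import numpy as np

def pearson_distance(x, y):
    """d_rho(X, Y) = sqrt(2 * (1 - rho_{X,Y}))."""
    rho = np.corrcoef(x, y)[0, 1]
    # Clamp at zero to guard against rho slightly above 1 from rounding.
    return float(np.sqrt(max(0.0, 2.0 * (1.0 - rho))))

def entropy(p):
    """Shannon entropy in nats of a probability array (zero entries ignored)."""
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def information_distance(x_bins, y_bins, k):
    """d_M(X, Y) = H(X) + H(Y) - 2 I(X, Y) from empirical frequencies."""
    joint = np.zeros((k, k))
    np.add.at(joint, (x_bins, y_bins), 1)
    joint /= joint.sum()
    h_x = entropy(joint.sum(axis=1))
    h_y = entropy(joint.sum(axis=0))
    h_xy = entropy(joint.ravel())
    mi = h_x + h_y - h_xy               # I(X, Y) = H(X) + H(Y) - H(X, Y)
    return h_x + h_y - 2.0 * mi
```

Perfectly correlated inputs give a Pearson distance of 0 and perfectly anti-correlated inputs give 2, while `information_distance` is 0 when the two label sequences determine each other.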

This metric also has a normalized version:

‘‘$! D(X,Y) = 1 - \frac{I(X,Y)}{H(X,Y)} $!’’
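Since $$H(X,Y) = H(X) + H(Y) - I(X,Y)$$, the normalized form equals $$d_M(X,Y) / H(X,Y)$$ and therefore lies between 0 and 1. A minimal sketch follows, again assuming integer bin labels in 0..k-1 and frequency-based probability estimates; `normalized_distance` is a hypothetical name, and returning 0 when the joint entropy is zero is a convention chosen here, not something stated in the text.

```python
import numpy as np

def normalized_distance(x_bins, y_bins, k):
    """D(X, Y) = 1 - I(X, Y) / H(X, Y), estimated from joint frequencies."""
    joint = np.zeros((k, k))
    np.add.at(joint, (x_bins, y_bins), 1)
    joint /= joint.sum()

    def entropy(p):
        p = p[p > 0]
        return float(-np.sum(p * np.log(p)))

    h_x = entropy(joint.sum(axis=1))
    h_y = entropy(joint.sum(axis=0))
    h_xy = entropy(joint.ravel())
    if h_xy == 0.0:
        # Both variables are constant; treat them as identical (assumption).
        return 0.0
    mi = h_x + h_y - h_xy
    return 1.0 - mi / h_xy
```

With this estimator, identical label sequences give D = 0 and empirically independent ones give D = 1.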