Biweight midcorrelation


In statistics, biweight midcorrelation (also called bicor) is a measure of similarity between samples. It is median-based rather than mean-based, and is therefore less sensitive to outliers; it can serve as a robust alternative to other similarity measures, such as Pearson correlation or mutual information.[1]

Derivation

Here we find the biweight midcorrelation of two vectors $x$ and $y$, with $i = 1, 2, \ldots, m$ items, representing each item in the vector as $x_1, x_2, \ldots, x_m$ and $y_1, y_2, \ldots, y_m$. First, we define $\operatorname{med}(x)$ as the median of a vector $x$ and $\operatorname{mad}(x)$ as the median absolute deviation (MAD), then define $u_i$ and $v_i$ as,

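$$u_i = \frac{x_i - \operatorname{med}(x)}{9 \operatorname{mad}(x)}, \quad v_i = \frac{y_i - \operatorname{med}(y)}{9 \operatorname{mad}(y)}.$$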

Now we define the weights $w_i^{(x)}$ and $w_i^{(y)}$ as,

$$w_i^{(x)} = \left(1 - u_i^2\right)^2 I\left(1 - |u_i|\right), \quad w_i^{(y)} = \left(1 - v_i^2\right)^2 I\left(1 - |v_i|\right)$$

where $I$ is the indicator function,

$$I(x) = \begin{cases} 1, & \text{if } x > 0 \\ 0, & \text{otherwise} \end{cases}$$
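
The weighting step can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not a reference implementation: the function name `biweight_weights` and the sample data are hypothetical, and raising an error when the MAD is zero is a choice made here (the definitions above do not say how to handle that case, and existing packages may treat it differently).

```python
from statistics import median


def biweight_weights(x):
    """Return the biweight weights w_i for a sequence of numbers x.

    Hypothetical helper written for illustration only.
    """
    med = median(x)
    # Median absolute deviation (MAD) about the median
    mad = median(abs(xi - med) for xi in x)
    if mad == 0:
        # Assumption of this sketch: refuse to divide by zero
        raise ValueError("MAD is zero; the weights are undefined")
    weights = []
    for xi in x:
        u = (xi - med) / (9 * mad)
        # Indicator I(1 - |u|): the weight is zero whenever |u| >= 1
        weights.append((1 - u**2) ** 2 if abs(u) < 1 else 0.0)
    return weights


# Made-up data with one extreme value at the end
sample = [1.0, 2.0, 3.0, 4.0, 100.0]
sample_weights = biweight_weights(sample)
print(sample_weights)
```

In this sample the median is 3 and the MAD is 1, so the value 100 lies more than 9 MADs from the median and receives weight 0, while the value at the median receives weight 1.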

Then we normalize the weighted deviations so that each resulting vector has unit Euclidean norm (its squared entries sum to 1):

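$$\tilde{x}_i = \frac{\left(x_i - \operatorname{med}(x)\right) w_i^{(x)}}{\sqrt{\sum_{j=1}^{m} \left[\left(x_j - \operatorname{med}(x)\right) w_j^{(x)}\right]^2}}, \quad \tilde{y}_i = \frac{\left(y_i - \operatorname{med}(y)\right) w_i^{(y)}}{\sqrt{\sum_{j=1}^{m} \left[\left(y_j - \operatorname{med}(y)\right) w_j^{(y)}\right]^2}}.$$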

Finally, we define biweight midcorrelation as,

$$\operatorname{bicor}(x, y) = \sum_{i=1}^{m} \tilde{x}_i \tilde{y}_i$$

Applications

Biweight midcorrelation has been shown to be more robust in evaluating similarity in gene expression networks,[2] and is often used for weighted correlation network analysis.

Implementations

Biweight midcorrelation has been implemented in the R statistical programming language as the function bicor, part of the WGCNA package.[3]

It has also been implemented in the Raku programming language as the function bi_cor_coef, part of the Statistics module.[4]

References

  1. Wilcox, Rand (January 12, 2012). Introduction to Robust Estimation and Hypothesis Testing (3rd ed.). Academic Press. p. 455. ISBN 978-0123869838. 
  2. Song, Lin (9 December 2012). "Comparison of co-expression measures: mutual information, correlation, and model based indices". BMC Bioinformatics 13 (328): 328. doi:10.1186/1471-2105-13-328. PMID 23217028. 
  3. Langfelder, Peter. "WGCNA: Weighted Correlation Network Analysis (an R package)". https://cran.r-project.org/package=WGCNA. Retrieved 2018-04-06. 
  4. Khanal, Suman. "Statistics: Raku module for doing statistics". https://github.com/sumanstats/Statistics. Retrieved 2022-03-11.