A recently published paper by Hart et al presented a genome-wide CRISPR screening to identify fitness genes (a superset of essential genes) in five cell lines. The paper is quite impressive and shows the potentiality of CRISPR to generate large scale knockouts and to characterize the importance and function of genes in different conditions.
In the discussion the authors propose that fitness genes are more likely to be more conserved across species. However they do not follow-up on this hypothesis, probably for lack of space. They can’t be blamed as they already present a lot of results in the paper.
This post presents a follow-up analysis on the hypothesis that fitness genes are more conserved than non-essential genes. I’ll take the original data from the paper, get the conservation scores from bioconductor data packages, and do a Wilcoxon test to compare the two distribution. The full code is available as a github repository, and please feel free to contribute if you want to do some free R/Bioconductor analysis.