The #SciFund challenge begins today!

#SciFund is an experiment in ‘crowdfunding’ for researchers. The idea behind #SciFund is to push scientists toward the world of crowdfunding: to teach them how to propose their research projects on a crowdfunding website and get people to contribute to funding them.

So, today #SciFund starts its first iteration. If you have an idea for a research project and think you can convince other people to fund it, you have about two weeks to prepare a draft proposal and post it to the #SciFund site. If you need more information, I suggest you sign up on their website and join their mailing list.

It is important to note that the funding will not be collected on their website, but on a popular crowdfunding website, RocketHub. The purpose of #SciFund is not to collect the funding itself, but to help researchers prepare their applications for other crowdfunding websites. #SciFund will be a web 2.0 website where researchers compile their applications in an online, collaborative manner.

Personally, I think that crowdfunding is a good idea for research. Of course, researchers will never be able to raise the 70 million dollars needed to test a new antibiotic, or the money needed to support a wet laboratory; but it may be a good resource for bioinformatics. Moreover, even if you are not interested in submitting a research proposal there, their website is a good resource for learning: have a look at their blog and at all the useful tips they present.

Ten Simple Rules paper published!

The paper ‘Ten Simple Rules for Getting Help from Online Scientific Communities’ has just come out in the latest issue of PLoS Computational Biology!

It has been a pleasure and an exciting experience to coordinate the writing of my first Open Collaborative Paper. It has been fun, a different approach toward preparing and publishing an academic paper, and I hope I will be able to promote other similar initiatives in the future.

I am also preparing a blog post listing some tips and notes about coordinating this type of initiative. I have learned a lot from this experience, and I would like to share a few thoughts on how best to organize one. But let’s see if I find the time to finish writing it 🙂

This paper is dedicated to all the people who enjoy discussing science on the Internet. In particular, I would like to say a huge thanks to the people from the Molecularlab Community and the Biostar Community, and also to Robert Hoffman from WikiGenes for his support.

And if you ever pass by my lab here in Barcelona, I will offer you a chocolate candy 🙂

Dall’olio GM, Marino J, Schubert M, Keys KL, Stefan MI, Gillespie CS, Poulain P, Shameer K, Sugar R, Invergo BM, Jensen LJ, Bertranpetit J, & Laayouni H (2011). Ten simple rules for getting help from online scientific communities. PLoS computational biology, 7 (9) PMID: 21980280

a script to scrape Uniprot

I wrote a rudimentary script to scrape the Uniprot website and extract some information about a list of Uniprot entries. This may be useful as an example of how to query Uniprot (since I couldn’t find any public API or library), or to get information about a list of genes you are interested in.

NOTE: the correct way to do this is with the Retrieve tool on the Uniprot page. The script presented in this post is just an example of how to use the Python library Mechanize.

The code is available on Bitbucket. Usage is simple: edit the files to enter your email address and the list of IDs you are interested in, and run it as a Python script.

Enjoy!

Favorite command of the day: parallel from moreutils

Yesterday I discovered a nice Unix tool for launching commands in parallel. It is called ‘parallel’ and it is very easy to use. I think it is the easiest way to parallelize things on a multi-core computer. On Linux you can install it from the ‘moreutils’ package. Note that GNU parallel (http://www.gnu.org/s/parallel/) is a separate tool with the same name; its syntax differs from the one shown in this post.

The basic usage is:

$: parallel <interpreter> <command> -- <list of arguments>

For example:

$: parallel bash -c "echo hola" -- 1 2 3
hola
hola
hola

This example launches the “echo hola” command three times in parallel, once for each argument after the ‘--’.
You can use the “htop” command to monitor CPU usage.

Thanks to this command, it is very easy to launch a great number of jobs in parallel. For example, if I want to run 1000 simulations:

$: parallel perl launch_a_single_simulation.pl -- {1..1000}

This will run the 1000 simulations, keeping as many running at the same time as there are processors available.
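To make the idea concrete, here is a minimal Python sketch of the same pattern: run one command once per argument, with at most one job per processor at a time. It is only an approximation and not how ‘parallel’ is implemented; the function name run_in_parallel and the stand-in job are made up for this example.

```python
import os
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_in_parallel(command, arguments, max_jobs=None):
    """Run `command + [argument]` once per argument, max_jobs at a time.

    Returns the exit codes in the same order as `arguments`.
    """
    if max_jobs is None:
        # Assumption: one job per available processor is a sensible default.
        max_jobs = os.cpu_count() or 1

    def run_one(argument):
        # Each job is a separate process; the thread only waits for it.
        completed = subprocess.run(command + [str(argument)])
        return completed.returncode

    with ThreadPoolExecutor(max_workers=max_jobs) as pool:
        return list(pool.map(run_one, arguments))


if __name__ == "__main__":
    # Hypothetical stand-in for "perl launch_a_single_simulation.pl":
    # a Python one-liner that prints the argument it received.
    job = [sys.executable, "-c", "import sys; print('simulation', sys.argv[1])"]
    exit_codes = run_in_parallel(job, range(1, 11))
    print(exit_codes)
```

Threads are enough here because the real work happens in the child processes; the lines printed by the jobs may appear in any order, as with ‘parallel’ itself.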

By using the -i option, it is also possible to pass the values of the arguments after the ‘--’ to the script.

$: parallel -i bash -c "echo hola {}" -- Johannes Marc Pierre Manu
hola Marc
hola Johannes
hola Manu
hola Pierre

When using the -i option, the symbol ‘{}’ is replaced by the argument.

For example, if we want to run a job on all chromosomes, we can just say:

$: parallel -i python calculate_test_on_chromosome.py {} -- {1..22} X Y

Or, if we want to execute a script for many genes, we can say:

$: parallel -i python get_plot_by_gene --gene {} -- ALG12 MGAT3 DOLPP1
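If the ‘{}’ substitution is not clear, the following Python sketch shows roughly what the -i option does to the command line before each job is started. It only builds and prints the commands, it does not run them; the helper name expand_placeholder is invented for this example, and the real tool may differ in details.

```python
import shlex


def expand_placeholder(command, arguments, placeholder="{}"):
    """Return one argv list per argument, with the placeholder substituted."""
    return [
        [word.replace(placeholder, str(argument)) for word in command]
        for argument in arguments
    ]


if __name__ == "__main__":
    names = ["Johannes", "Marc", "Pierre", "Manu"]
    for argv in expand_placeholder(["bash", "-c", "echo hola {}"], names):
        print(shlex.join(argv))

    # The shell expands {1..22} to the numbers 1 to 22 before 'parallel'
    # sees them; in Python the same list has to be built explicitly.
    chromosomes = [str(n) for n in range(1, 23)] + ["X", "Y"]
    command = ["python", "calculate_test_on_chromosome.py", "{}"]
    for argv in expand_placeholder(command, chromosomes):
        print(shlex.join(argv))
```

Each printed line is one job, so this is also a cheap way to check what would be launched before running anything for real.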

Have fun 🙂

Ten Simple Rules initiative entering the final phase

The initiative for the collaborative writing of a candidate ‘Ten Simple Rules’ paper, launched from this blog two weeks ago, has been very successful. So successful that after only two weeks the manuscript is almost ready, in a state where further modifications may do more harm than good.

For this reason, we are planning to close the editing phase early. The initial deadline was May 28th; but since it does not make sense to keep working on the manuscript, we will probably leave it editable for a few more days and then close it.

So, if you want to participate, hurry up! Join the mailing list and add your contribution!

new version of the collaborative Post-GWAS article published

There is some recent news about the collaborative article on Post-GWAS analysis launched last December[1]. It seems that a new version of the manuscript was published on Nature Precedings (link) earlier this month.

Well, in the end, with the exception of one figure, they included almost nothing of what was contributed on the wiki (I still have to check carefully). They thank the contributors in the acknowledgments section and leave a link to the wiki page, but say that the contributions were not included for reasons of space.

update on the status of the ‘Ten Simple Rules’ initiative after the first 2 weeks

This is an update on the status of the ‘Ten Simple Rules for getting help from Mailing Lists and Online Scientific Communities’ initiative after two weeks. I am posting it here, but if you want to follow the initiative you had better subscribe to the dedicated mailing list.

First of all, I would like to thank the people who have participated. Honestly, I didn’t expect this initiative to proceed so fast, and I am very happy to have seen so many contributions and so much feedback :-).

It seems that the collaborative open approach has paid off this time!

Deadlines and submission date

The manuscript is already almost complete. The original deadline was the end of May; however, I think we could probably finish and submit it earlier. The manuscript is in a state where each further modification may do more harm than good.

contribute to a candidate ‘Ten Simple Rules’ article

A few months ago I had the idea of writing an article in the style of the PLoS Computational Biology ‘Ten Simple Rules’ series, explaining how to use mailing lists and web forums to solve technical problems related to bioinformatics. Something in the style of ‘How To Ask Questions The Smart Way’ by Eric Raymond [3], but adapted to bioinformaticians and gentler.

However, it does not make sense to submit a paper on the best rules for getting help from online communities without first achieving some sort of community consensus. There are so many online communities, and so many different approaches and best practices, that a single person cannot represent all the different opinions on this.

So, I am launching an open collaborative draft for a paper in the style of the PLoS ‘Ten Simple Rules’ series, entitled ‘Ten Simple Rules for Getting Help from Mailing Lists and Online Communities’. Here is the main page of the project:

Public Invitation to the Candidate for ‘Ten Simple Rules for Getting Help from Mailing Lists and Online Communities’

The document will be hosted on the WikiGenes wiki, where everybody will be able to make contributions (after registering on the site). The WikiGenes engine will keep track of the individual contributions and acknowledge the authors of the bigger ones. Two months from now (on May 28th), I will close the document and invite the authors of the biggest contributions to sign it as authors; the manuscript will then be sent to PLoS Computational Biology, where it will eventually be published, provided it passes the editorial review process.

about my research: gene position and selective constraints

It is time I introduced a bit of the research I am doing for my PhD, here at the Pompeu Fabra-CSIC university 🙂

The main aim of our research is to study whether there is a correlation between the position of a gene within a biological pathway and the strength of the selective constraints acting on it. For example, we ask whether genes involved in a high number of interactions and functions tend to be more conserved (that is, show fewer changes) among species. This can be better explained with this figure from a terrible poster I presented at the workshop on Evolutionary Systems Biology last year:

In this hypothetical biological pathway, genes in upstream positions or with a high number of interactions are more functionally constrained than the others, and therefore their sequences should be more conserved.

The figure represents an idealized pathway of genes, like the ones annotated in KEGG, MetaCyc or Reactome. All the nodes are genes, and the edges represent any kind of interaction between two genes: for the general discussion it is not necessary to specify whether they are metabolic, physical or other kinds of interactions.

The intensity of the colors in the figure represents the strength of the selective constraints we expect to find on each node. The gene in the most upstream position should be the one with the strongest selective constraints because, if a mutation introduces a loss of function there, all the downstream interactions will be compromised. Similar reasoning applies to genes with a high number of interactions, which should also be strongly conserved.
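The two properties discussed above, the number of interactions of a gene and how far upstream it sits, can be computed from a pathway graph in a few lines. The sketch below does this in Python for a toy pathway; the gene names and edges are invented for illustration and are not taken from KEGG, MetaCyc or Reactome, and counting downstream genes is just one possible way to quantify ‘upstream position’.

```python
from collections import deque

# Hypothetical pathway, invented for illustration: "A": ["B", "C"] means
# that gene A acts directly upstream of genes B and C.
PATHWAY = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": ["F"],
    "E": [],
    "F": [],
}


def interaction_counts(pathway):
    """Number of interactions (incoming plus outgoing edges) per gene."""
    counts = {gene: len(targets) for gene, targets in pathway.items()}
    for targets in pathway.values():
        for target in targets:
            counts[target] += 1
    return counts


def downstream_counts(pathway):
    """Number of genes reachable downstream of each gene."""
    result = {}
    for gene in pathway:
        seen = set()
        queue = deque(pathway[gene])
        while queue:
            current = queue.popleft()
            if current not in seen:
                seen.add(current)
                queue.extend(pathway[current])
        result[gene] = len(seen)
    return result


if __name__ == "__main__":
    interactions = interaction_counts(PATHWAY)
    downstream = downstream_counts(PATHWAY)
    for gene in PATHWAY:
        print(gene, interactions[gene], downstream[gene])
```

Under the hypothesis described in this post, genes with high values in either column would be the ones expected to show the strongest selective constraints.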

my Twitter account!

I have just created a twitter account. You can now follow me at: http://twitter.com/#!/dalloliogm

I tried to resist joining Twitter for a long time, but now I need it to participate in a spare-time project of mine. I recognize that Twitter can be a very useful tool for a researcher, but I am worried that it could be too intrusive and distract me too much.

Do you have any suggestions for a new twitter user? Which software (on Ubuntu) do you use to check the feeds? Which groups would you recommend to a bioinformatician?
