Notes from a “Write it clearly” course

I recently took a course on improving English writing skills for researchers. These are my notes, organized as a series of “Do and Do not” lists, plus a separate list for each section of a research paper.

Feel free to have a look at them and make use of them. If you have any comments, you can add them here or to the table. Have a happy paper-writing day!

click to access the notes.

I wrote a videogame for the Wii

I wrote a small web game for the “Week of Science 2012” (Semana de la Ciencia), a science outreach initiative organized in Spain. I took part as a member of the Institut de Biologia Evolutiva in Barcelona, the institution to which I belong. The game is in Spanish, but I think anybody can understand it without translation. Click on the image to play with it:

The “Phylogenetic Tree” for the “Semana de la Ciencia” in Barcelona.

If the game is not shown correctly, click on this link: IBE phylogenetic game sc2012

In short, we had 15 minutes to explain to a class of school students (from 12 to 18 years old) how to make phylogenetic trees. This is how we organized the time:

  • In the first five minutes, we gave a short presentation explaining that we all come from a common ancestor, and that our work as evolutionary biologists is to reconstruct the tree of life. We also explained what a phylogenetic tree is, and how we reconstruct it.
  • In the next three minutes, we played the first game. This game was quite easy, and was meant to check whether the students understood how phylogenetic trees are constructed. During this first game, one volunteer student had to decide where to put a mammal, a bird and a jellyfish in a phylogenetic tree.
  • In the next minutes, we played the second game, which was a bit of a trick. Students had to reconstruct the phylogenetic tree of four protists. Have a look at the “Juego 2: protists” to see it. This game was tricky because there is no way to come up with the correct solution. In fact, after letting the students play for a while, we showed them that the only way to know the real phylogenetic tree was to use the DNA sequences. Then we had a few more slides explaining how mutations in DNA sequences can be used to reconstruct the history of changes in evolution.
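To give a feel for why DNA sequences settle the question, here is a toy Python sketch (it is not part of the game, which is written in JavaScript). It counts the positions at which two aligned sequences differ and reports the pair with the fewest differences, which a simple distance-based method would group together first. The sequences and the names protist_A to protist_D are invented for illustration, and real phylogenetic methods rely on proper alignments and substitution models rather than raw counts.

```python
from itertools import combinations


def hamming_distance(seq_a, seq_b):
    """Count the positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def closest_pair(sequences):
    """Return the two names whose sequences differ at the fewest positions."""
    return min(
        combinations(sorted(sequences), 2),
        key=lambda pair: hamming_distance(sequences[pair[0]], sequences[pair[1]]),
    )


# Hypothetical toy data: four short, already aligned sequences.
toy_sequences = {
    "protist_A": "ACGTACGTAC",
    "protist_B": "ACGTACGTTC",
    "protist_C": "ACGAACCTTC",
    "protist_D": "TCGAACCTTG",
}

# Print every pairwise distance, then the pair that would be joined first.
for name_1, name_2 in combinations(sorted(toy_sequences), 2):
    distance = hamming_distance(toy_sequences[name_1], toy_sequences[name_2])
    print(name_1, name_2, distance)
print("closest pair:", closest_pair(toy_sequences))
```

In this toy data set protist_A and protist_B differ at a single position, so they come out as the closest pair, even though nothing about their appearance was used.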

To make things a bit more entertaining, we also connected a Wii remote to the computer, so the student who played the game had to use it as a mouse. This was fun to set up, and I think I will use a Wii remote in my next talk :-).

Fitting the activity into 15 minutes was a bit tight, but I think that more or less all the students understood the basic concept. At least, some asked questions, and in general, they seemed to like the game. I hope they will at least remember that DNA can be used to study how species have evolved :-).

If you want to customize the page, the code is available on bitbucket:

This was the first time I programmed something in JavaScript, so the code is a terrible mess. There is a lot of code duplication, and a lot of patchy fixes. But as agile programmers say, “Code first and refactor later”. I think I will work on cleaning this code for next year, so if you have any suggestions on how to make it better, please join the repository on bitbucket.

“Programming for Evolutionary Biology” course – suggestions for the applicants that have not been accepted

The selection phase for the participants in the “Programming for Evolutionary Biology” course in Leipzig has finished. Congratulations to all the accepted applicants!

I am very sorry for the people who have not been accepted, but we received many more applications than there were places available, and the selection process had to be very strict. As Katja Nowick, the organizer of the course, said, this is a sign of how much introductory programming courses for researchers are needed. Hopefully we will be able to repeat the course, or other people will organize similar courses in the future.

In case you have not been selected, I would like to give you a few suggestions on how to start to learn Unix/R/Perl skills.

– Are there any other programming courses oriented to biologists?

I think that the “Unix and Perl Primer for Biologists” course is a very good resource for researchers wishing to learn the basics of the Bash shell and Perl. Their material is easy to read, and explains everything step by step. The authors also wrote a book (which I haven’t read yet), and released some good material on this website.

Another good course that should not be missed is “Software Carpentry for Biologists”. This course covers a wider range of topics than the other and, more importantly, devotes a good deal of effort to explaining what the “good practices” for a bioinformatician should be. Maybe the contents are a bit more advanced than the “Unix and Perl Primer”, although there are classes on the shell. In any case, once you feel a bit confident in your programming skills, you should definitely read all the materials on Software Carpentry, and make sure you have understood everything before starting a research project. There is also an “Advanced Software Carpentry for Bioinformaticians”, by Titus Brown, focused on Python programming.

– Are there other courses on Programming for Evolutionary Biologists, or on Next Generation Sequencing?

Thanks to reddit/bioinformatics, I have found two other courses similar to ours: a “Workshop on Molecular Evolution” from the University of Texas, and a “Computational molecular evolution” course from EMBO in Greece. Both courses look very good, although I don’t have any direct experience with them.

A good course on Next Generation Sequencing is the Angus course, by Titus Brown. Titus Brown is a skilled bioinformatician and programmer, who developed, among other things, libraries such as Pygr and parts of nosetests. The website of the course is full of good documentation and examples; it should be a good place to start.

– Where can I get help?

The Internet is a good place to ask for help with programming-related questions. The StackOverflow network is the most active community for anything related to programming and Unix in general. For next generation sequencing analysis, a good place is SeqAnswers. And, for general bioinformatics questions, biostar is of course a nice resource 🙂

Programming for Evolutionary Biology Course – Leipzig 2012

This year I will teach in a two-week introductory course on programming and bioinformatics, aimed at PhD students and post-docs working in Evolutionary Biology. The course is designed for researchers who have little or no experience with programming, and it will teach them the basics of Bash, Perl and R programming, along with popular tools used in Evolutionary Biology.

The deadline for application is January 31st 2012, and the course will be held in Leipzig (Germany) in the last two weeks of March 2012. Please check the home page of the course for details on how to apply:

We tried to keep the cost of the course as low as possible, and thanks to a contribution from the Volkswagen Foundation we have been able to keep it at only 300 euros per person. Plus, we have some fellowships available.

Here is the programme, which you can also find on the home page of the course:

  • Introduction to Linux (Giovanni Marco Dall’Olio, University Pompeu Fabra, Barcelona, Spain)
  • Introduction to R (Katja Nowick, University Leipzig, Germany)
  • Analysis of next generation sequencing data (Tomas Marques-Bonet, University Pompeu Fabra, Barcelona, Spain)
  • Analysis of structural variants (Tomas Marques-Bonet, University Pompeu Fabra, Barcelona, Spain)
  • Analysis of expression data (Katja Nowick, University Leipzig, Germany)
  • Promoter evolution (Annalisa Marsico, Max-Planck-Institute for Molecular Genetics, Berlin, Germany)
  • Statistics & Inference (Stuart Baird, University Porto, Portugal)
  • Introduction to Perl (Sofia Robb, University of California Riverside, USA)
  • Phylogenomics (Rui Faria, University Porto, Portugal)
  • Ensembl API (Bert Overduin, EMBL – European Bioinformatics Institute, Hinxton, Cambridge, UK)
  • Introduction to databases (Jan Aerts, Leuven University, Belgium)
  • Visualization of scientific data (Jan Aerts, Leuven University, Belgium)

Invited speakers:

  • Evolution of behavior (Sarah London, University of Chicago, USA)
  • Evolutionary ecology (Claudia Acquisti, Westfälische Wilhelms-University Münster, Germany)
  • Phylogenomics (Rasmus Nielsen, University of California, Berkeley, USA)

Tweeting from the X CRG Symposium on “Computational Biology of Molecular Sequences”

Today and tomorrow I will be attending a symposium organized here in Barcelona, about bioinformatics analysis of molecular sequences. Many well-known bioinformaticians will participate, including Temple Smith (of the Smith & Waterman algorithm), Amos Bairoch from Expasy and Tim Hubbard from the EBI. Check the programme here, or the Streaming Video here.

The organization of this Symposium has been innovative in a “web 2.0 way”, as the participants have been able to interact in advance with the speakers through an online web forum. For example, we have been able to ask Tim Hubbard to explain how the concept of a reference genome will evolve in the 1000genomes era, and, seeing that he has changed the title of his presentation, it seems that he is going to talk about it.

So, if everything goes well, I will be tweeting from there… This is the first time I have used Twitter during a conference, so be kind to me :-).

Scifund projects online

The Scifund initiative has reached its final phase. Now all the projects are publicly visible online on RocketHub.

click on the logo to go to the list of projects uploaded

I am surprised to see how many projects have been presented! Crowdfunding seems to be a good way to do science, especially in these times of crisis. I will keep it in mind for when I finish my PhD and start looking for a post-doc. If I don’t find a position soon, it seems to be a good way to obtain funding for a short research project, and survive a bit longer :-).

The blog of the initiative is very interesting. Here are some of my favorite posts so far:

  • the story of a successful case of crowdfunding for a research project
  • metaphors in science: how to use metaphors to describe complex scientific things to lay people. For example, electrophoresis can be explained as a thin forest that small and big animals have to cross.

And some projects I find interesting:

check the PRBB programmers’ blog

The Parc de Recerca Biomèdica de Barcelona (PRBB) is the building where I work. It is a big research center built about 5 years ago, hosting about 2000 scientists of different nationalities working in different fields.

Here, we organize many different activities related to programming. For example, we have a Python Programmers Meetup Group, which used to meet once a month, and a series of technical seminars about programming-related topics. Plus, we hold a short meeting every Tuesday to discuss geekish things.

To coordinate all these events, I am setting up a WordPress blog. Check it out:

If you work in the PRBB, you can check this blog to find out what is going on, and whether some of the programming-related activities may interest you. If you live in Barcelona, you can still check it out, because most of these activities are open to the public. Even if you don’t live in Catalunya, that blog may still be interesting for you, as it gives an ‘inside view’ of what we do here and what tools and programming languages we use. It may also give you some inspiration on the kind of activities that interest a group of bioinformaticians in a big research center, and can be helpful if you want to emulate the initiative :-).

recipe for a home-made (limitless) dropbox

Today, bitbucket has enabled support for git repositories!! This means that, with a little hack and thanks to the sparkleshare project, we can make a home-made dropbox with no limit on disk space.

Bitbucket is a repository hosting service, like github, sourceforge, and many others. The good thing about bitbucket is that there are no limits on disk space or on private repositories. I use it for almost all my personal projects and for backups.

Sparkleshare is free software designed to automatically synchronize a folder with a remote git repository. In principle, you can choose one or multiple folders in your filesystem, and sparkleshare will automatically synchronize them with a remote repository, such as one on github, bitbucket or another server. Every time you move or change a file, Sparkleshare will automatically create a commit and push it to the remote server. In short, it is like Dropbox, but it can be used on multiple folders, and you have to find the remote repository to host the files yourself.

So, since bitbucket now supports git repositories, we can use it with sparkleshare. The process is quite simple:

  • Get sparkleshare and install it on your computer
  • I had difficulty configuring sparkleshare to work directly with bitbucket, so I created a repo on github first and then changed the url in the config file.
  • Create an account on github, upload the SSH key that sparkleshare has created in ~/Sparkleshare, and create a repo there;
  • Go to bitbucket, create a private repo, and upload the same SSH key there;
  • start sparkleshare, select the ‘github’ option, and tell it to synchronize the folder with your github repository;
  • open the folder, and edit the hidden file .git/config; replace the github url with the bitbucket url;
  • that’s it; enjoy 🙂
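The url replacement in .git/config from the steps above can also be scripted. The following is a minimal Python sketch, assuming the file contains the github url exactly once as plain text; the function names, the example-user/example-repo urls and the folder path are placeholders invented for illustration, not values taken from sparkleshare.

```python
from pathlib import Path


def replace_remote_url(config_text, old_url, new_url):
    """Return the config text with old_url swapped for new_url."""
    if old_url not in config_text:
        raise ValueError("old url not found in the config text")
    return config_text.replace(old_url, new_url)


def switch_remote(repo_dir, old_url, new_url):
    """Rewrite .git/config inside repo_dir so that it points to new_url."""
    config_path = Path(repo_dir) / ".git" / "config"
    updated = replace_remote_url(config_path.read_text(), old_url, new_url)
    config_path.write_text(updated)


# Placeholder urls: replace them with the ones of your own repositories.
OLD_URL = "git@github.com:example-user/example-repo.git"
NEW_URL = "git@bitbucket.org:example-user/example-repo.git"

# Example of what the relevant part of .git/config may look like.
sample_config = '[remote "origin"]\n\turl = ' + OLD_URL + "\n"
print(replace_remote_url(sample_config, OLD_URL, NEW_URL))

# To apply it to a real synchronized folder (the path is hypothetical):
# switch_remote("/path/to/synchronized/folder", OLD_URL, NEW_URL)
```

Running `git remote set-url origin <new-url>` inside the folder makes the same change without touching the file by hand.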

Having a limitless home-made dropbox is cool; however, there are many reasons why I don’t recommend abusing this system.

First of all, git is slow when handling big files. If you try to synchronize big files such as movies etc (please do not use this for illegal stuff), you will waste a lot of bandwidth, and the synchronization will be very slow.

Second: although there seems to be nothing against this in bitbucket’s Terms of Use, it is not nice to abuse them. I have opened a bug in Sparkleshare’s repo and one at bitbucket to see what their authors think about this. In any case, I don’t think you want to risk putting all your backup files on bitbucket through this system, and then see bitbucket remove them because you have abused the Terms of Use.

Third, and last, there is no real need for this. There are already a lot of alternatives that provide cloud hosting for your backups. I will list a few:

  • dropbox is free and gives you 2 GB, plus 250 MB for each invitation (note: this link is an invitation; if you register through it, you and I will get 250 MB extra).
  • Ubuntu One already gives you 5 GB for free, and is great at synchronizing preferences and configurations. Now they have also created a Windows client, so there is no excuse not to use it. I only hope that they will fix the http proxy issue soon.
  • gives you up to 10 GB of remote file storage, although it is not synchronized automatically as with the other systems (note: this link is an invitation; if you register through it, you and I will get 1 GB extra).

The #SciFund challenge begins today!

#Scifund is an experiment in ‘crowdfunding’ for researchers. The idea behind #Scifund is to push scientists toward the world of crowdfunding, to teach them how they can propose their research projects on a crowdfunding website, and to get people to help fund them.

So, today #Scifund starts its first iteration. If you have an idea for a research project and you think that you can convince other people to fund it, you have about two weeks to prepare a draft proposal and post it to the #Scifund site. If you need more information, I suggest you sign up on their website and to their mailing list.

It is important to note that the funding will not be collected on their website, but on a popular crowdfunding website, RocketHub. The aim of #Scifund is not to collect the funding itself, but to help researchers prepare their applications for other crowdfunding websites. #Scifund will be a web 2.0 website where researchers will prepare applications collaboratively online.

Personally, I think that crowdfunding is a good idea for research. Of course, it will never raise the 70 million dollars needed to test a new antibiotic, or the money needed to support a wet laboratory; but it may be a good resource for bioinformatics. Moreover, even if you are not interested in submitting a research proposal there, their website is a good resource for learning: have a look at their blog, and at all the useful tips they present.

Ten Simple Rules paper published!

The Ten Simple Rules for Getting Help from Online Scientific Communities paper just came out in the latest issue of PLoS Computational Biology!

It has been a pleasure and an exciting experience to coordinate the writing of my first Open Collaborative Paper. It has been fun, a different approach toward preparing and publishing an academic paper, and I hope I will be able to promote other similar initiatives in the future.

I am also preparing a blog post where I list some tips and notes about coordinating this type of initiative. I have learned a lot from this experience, and I would like to share a few thoughts on how to organize them as well as possible. But let’s see if I find the time to finish writing it 🙂

This paper is dedicated to all the people who enjoy discussing science on the Internet. In particular, I would like to give a huge thanks to the people from the Molecularlab Community, and the Biostar Community; and also a huge thanks to Robert Hoffman from WikiGenes for the support.

And if you pass by my lab here in Barcelona, I will offer you a chocolate candy 🙂

Dall’olio GM, Marino J, Schubert M, Keys KL, Stefan MI, Gillespie CS, Poulain P, Shameer K, Sugar R, Invergo BM, Jensen LJ, Bertranpetit J, & Laayouni H (2011). Ten simple rules for getting help from online scientific communities. PLoS computational biology, 7 (9) PMID: 21980280