book club on “Origins of Evolutionary Innovations” by A. Wagner

I am organizing a discussion club on the book “Origins of Evolutionary Innovations” by A. Wagner for my group.

Well, I don’t promise anything, but since I will be making the effort of producing some presentations anyway, I will also publish all the slides here on this blog.

This book describes how new phenotypes are discovered in evolution. The first chapter starts by describing some examples of notable phenotypes that have appeared, such as the urea cycle and the ability to use glucose as a carbon source. In general, though, the book is about how any novel phenotype appears in evolution.

It also explains the concepts of genotype space and genotype network, and how much variability a population of organisms can withstand without changes in a given phenotype. For example, there are far more possible mRNA sequences than observed protein structures, so it seems that any given protein can be produced by more than one mRNA. This means that an organism can withstand many changes to its DNA without suffering changes to the structure of the protein. What is the role of this variability in evolution?
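As a toy illustration of this many-to-one mapping (my own sketch, not an example from the book): the standard genetic code already shows it at the smallest scale, since 61 sense codons encode only 20 amino acids, so the number of DNA sequences encoding a given peptide grows multiplicatively with its length:

```python
from collections import Counter
from itertools import product

# Standard genetic code: the amino acid (one-letter code, '*' = stop) for each
# codon, ordered with bases T, C, A, G varying in the 1st, 2nd and 3rd position.
CODE = ("FFLLSSSSYY**CC*W"   # first base T
        "LLLLPPPPHHQQRRRR"   # first base C
        "IIIMTTTTNNKKSSRR"   # first base A
        "VVVVAAAADDEEGGGG")  # first base G

codons = ["".join(c) for c in product("TCAG", repeat=3)]
codon_to_aa = dict(zip(codons, CODE))

# How many codons encode each amino acid (stop codons excluded)
degeneracy = Counter(aa for aa in CODE if aa != "*")

def num_encodings(peptide):
    """Number of distinct coding sequences for a peptide."""
    n = 1
    for aa in peptide:
        n *= degeneracy[aa]
    return n

print(degeneracy["L"])        # leucine is encoded by 6 different codons
print(num_encodings("MLSR"))  # 1 * 6 * 6 * 6 = 216 sequences for a 4-residue peptide
```

For a real protein of a few hundred residues the number of possible coding sequences is astronomically large, which gives an intuition of why genotype networks can be so extensive.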

There is also a nice paper published on the topic today, in Science: Meyer JR et al., Repeatability and Contingency in the Evolution of a Key Innovation in Phage Lambda, Science 2012.

The book club will take place only in my lab, but if you are interested, you can follow the slides and comment on this blog. (Or would it be better to discuss it on Twitter? Let’s use the #evol_innov_book hashtag.) Enjoy!

Programming for Evolutionary Biology Course – Leipzig 2012

This year I will teach in a two-week introductory course on programming and bioinformatics, aimed at PhD students and post-docs working in evolutionary biology. The course is designed for researchers who have little or no experience with programming, and it will teach them the basics of Bash, Perl and R programming, along with popular tools used in evolutionary biology.

The deadline for applications is January 31st 2012, and the course will be held in Leipzig (Germany) in the last two weeks of March 2012. Please check the home page of the course for details on how to apply:

We tried to keep the cost of the course as low as possible and, thanks to a contribution from the Volkswagen Foundation, we have been able to keep it at only 300 euros per person. Plus, we have some fellowships available.

Here is the programme, which you can also find on the home page of the course:

  • Introduction to Linux (Giovanni Marco Dall’Olio, University Pompeu Fabra, Barcelona, Spain)
  • Introduction to R (Katja Nowick, University Leipzig, Germany)
  • Analysis of next generation sequencing data (Tomas Marques-Bonet, University Pompeu Fabra, Barcelona, Spain)
  • Analysis of structural variants (Tomas Marques-Bonet, University Pompeu Fabra, Barcelona, Spain)
  • Analysis of expression data (Katja Nowick, University Leipzig, Germany)
  • Promoter evolution (Annalisa Marsico, Max-Planck-Institute for Molecular Genetics, Berlin, Germany)
  • Statistics & Inference (Stuart Baird, University Porto, Portugal)
  • Introduction to Perl (Sofia Robb, University of California Riverside, USA)
  • Phylogenomics (Rui Faria, University Porto, Portugal)
  • Ensembl API (Bert Overduin, EMBL – European Bioinformatics Institute, Hinxton, Cambridge, UK)
  • Introduction to databases (Jan Aerts, Leuven University, Belgium)
  • Visualization of scientific data (Jan Aerts, Leuven University, Belgium)

Invited speakers:

  • Evolution of behavior (Sarah London, University of Chicago, USA)
  • Evolutionary ecology (Claudia Acquisti, Westfälische Wilhelms-University Münster, Germany)
  • Phylogenomics (Rasmus Nielsen, University of California, Berkeley, USA)

a script to fetch images from the UCSC browser

The UCSC browser is a nice, useful but “mammoth-ish” bioinformatics tool that, despite its web 1.0 look, can be a very powerful ally for any bioinformatician or biologist.

I have to admit that for many years I avoided using the UCSC browser, dismissing it because of its very old-fashioned look. It was silly of me to think that way, but its interface is objectively dated: for example, the user is forced to reload the whole page to update the visualization, and the fonts are not anti-aliased and look ugly. To me, it didn’t seem “professional” to use a pre-Ajax website for doing research.

Recently, however, I have changed my mind, as I discovered that this tool can be very powerful for integrating data from different sources and for doing “mash-ups”. A local UCSC browser instance can be installed on a computer and used as a central repository for all the annotations produced in a research unit: for example, sequencing data, results from experiments, results from statistical genome-wide tests, and so on. If all the custom annotations produced in a lab are available in a local UCSC browser instance (either as custom tracks or as tables), it is possible to compare them with each other, and also against publicly available annotations, such as the positions of genes, non-coding regions and much more. The real strength of this tool is that, if you have a workflow to automate the retrieval of data from it, you can compare your results with virtually anything that is known about a genome.

So, let’s get to the point: I wrote a script to automatically fetch screenshots from a UCSC browser instance. It is available at this page:

The first difficulty I faced when writing this script was that there are many different options for defining a region and how to visualize it. So, I made the script require three different configuration files: one for the regions to be visualized, one for the tracks to be shown, and one for the connection parameters. Here is how you would call it:

Have a look at this pdf that I created with the script. If you continue reading the post, I will also describe the different configuration files.

example of report created by this script. Click on the image to see the full pdf.
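Independently of my script, here is a minimal Python sketch of the general idea of fetching a browser image programmatically, using UCSC’s hgRenderTracks CGI. Treat the endpoint and parameter names (db, position, track visibilities) as my assumption that hgRenderTracks accepts the usual hgTracks parameters; this is not how my script is actually invoked.

```python
from urllib.parse import urlencode

# Public UCSC instance; a local mirror would use its own host name.
UCSC_CGI = "http://genome.ucsc.edu/cgi-bin/hgRenderTracks"

def ucsc_image_url(db, chrom, start, end, tracks=None):
    """Build the URL of a track image for one region.

    `tracks` maps track names to visibility modes, e.g. {"refGene": "pack"}.
    """
    params = {"db": db, "position": "%s:%d-%d" % (chrom, start, end)}
    params.update(tracks or {})
    return UCSC_CGI + "?" + urlencode(params)

url = ucsc_image_url("hg19", "chr2", 136545000, 136617000, {"refGene": "pack"})
print(url)

# To actually save the image (needs network access):
# from urllib.request import urlretrieve
# urlretrieve(url, "region.png")
```

My script does essentially this in a loop over the regions listed in the configuration files, and then bundles the images into a pdf report.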


New ways to explore your academic impact

It seems that today, by a strange series of coincidences, is a good day if you want new tools to explore your academic impact.

First, Google Scholar Citations has finally been opened to all. Everybody can now create a profile on Google Scholar to keep track of their articles and citations. I like Google Scholar because it finds articles and books that are not indexed on Scopus but are interesting nevertheless. Plus, it is free to use. However, our paper on recombination rates has recently been cited in a Nature Genetics paper, and Google Scholar didn’t pick it up.

Second, the finalists for the PLoS/Mendeley Binary Battle have been selected. Check the list here. The PLoS/Mendeley Binary Battle is an initiative proposed by these two organizations to encourage the writing of applications that make use of the PLoS and Mendeley APIs to retrieve information on papers and readers. This initiative has produced some very good web applications for exploring academic impact and playing with citations and papers, and here I will describe some of my favourites.

I like two tools for seeing the impact of research articles on the Internet: Total Impact and ReaderMeter. Both let you see how many times your articles have been read on Mendeley, cited, referenced on Twitter and Facebook, bookmarked on CiteULike, and much more. The nice thing about Total Impact is that it also indexes my presentations on Slideshare: for example, one of my presentations on Python is actually more popular than any of my papers. However, one of our papers is not recognized correctly, because of a duplicated entry in Mendeley. ReaderMeter, on the other hand, lets you see the geographical distribution of readers and provides more statistics. It would be nice to be able to embed one of these reports in a web page, for example in the About page of a blog or an academic home page.

My TotalImpact report. Click on it to see the full report. Check also my ReaderMeter report if you like.

Another tool I liked is PaperCritic. It is a repository of commentaries on published papers. The idea is not entirely new: PLoS and other journals already allow commenting on papers, but unfortunately not all publishing houses provide this option. Moreover, having a central repository of comments on papers makes them easier to browse and select. I only wonder how redundant this tool is with ResearchBlogging, and whether the commentaries posted on the site are communicated to the authors of the paper even if they are not registered on PaperCritic.

So, these tools provide new ways to play with academic impact indicators, and to see whether our work is actually useful to anyone. I’ve played with them this morning, but now I had better get back to work, to improve their results 🙂


Tweeting from the X CRG Symposium on “Computational Biology of Molecular Sequences”

Today and tomorrow I will be attending a symposium organized here in Barcelona about the bioinformatics analysis of molecular sequences. Many well-known bioinformaticians will participate, including Temple Smith (of the Smith–Waterman algorithm), Amos Bairoch from ExPASy and Tim Hubbard from the EBI. Check the programme here, or the streaming video here.

The organization of this symposium has been innovative in a “web 2.0 way”, as the participants have been able to interact with the speakers in advance, through an online forum. For example, we were able to ask Tim Hubbard to explain how the concept of a reference genome will evolve in the 1000 Genomes era, and, seeing that he has changed the title of his presentation, it seems he is going to talk about it.

So, if everything goes well, I will be tweeting from there… This is the first time I have used Twitter during a conference, so be kind to me :-).

SciFund projects online

The SciFund initiative has reached its final phase. All the projects are now publicly visible online on RocketHub.

click on the logo to go to the list of projects uploaded

I am surprised to see how many projects have been presented! Crowd-funding seems to be a good way to fund science, especially in these times of crisis. I will keep it in mind for when I finish my PhD and start looking for a post-doc. If I don’t find a position soon, it seems to be a good way to obtain funding for a short research project, and survive a bit longer :-).

The blog of the initiative is very interesting. Here are some of my favorite posts so far:

  • the story of a successful case of crowdfunding for a research project
  • metaphors in science: how to use metaphors to describe complex scientific concepts to lay people. For example, an electrophoresis gel can be explained as a thin forest that small and big animals have to cross.

And some projects I find interesting:


Biostar paper published!

The Biostar paper has just been published in PLoS Computational Biology. Hurrah! 🙂

Biostar is a question-and-answer community for bioinformatics-related queries. It is a good resource to visit if you are a bioinformatician, or if you have a question to ask one. Browse the site to get an idea of the topics discussed: there is a bit of everything, from ‘What tools/libraries do you use to visualize genomic feature data?’ to ‘Where to advertise or find bioinformatics jobs?’, and much more.

First of all, I would like to thank all the Biostar users. I am very happy about this publication, because this kind of activity (participating in an online technical forum) is very difficult to get acknowledged in the academic world.

Participating in an online forum and helping others is something every bioinformatician should do, and it improves the overall quality of scientific research. The discussions on Biostar have helped hundreds of researchers, and saved time and money for many research projects. However, these types of contributions are very rarely considered by universities when evaluating a curriculum. Writing a 50-upvote post on Biostar won’t help you at all in getting a faculty position at your university, or in getting a grant, despite the time you may have spent writing it.

So, let’s hope that the publication of this article will make it easier for other resources of this type to be acknowledged in the academic world. Between Biostar and other active forums for discussing research topics, such as Protocol Online (focused on wet-lab techniques) and SeqAnswers (focused on next-generation sequencing), a lot of researchers are benefiting from this kind of resource. The fact that the Ten Simple Rules article on Getting Help from Scientific Communities that we published a month ago has already received almost 5000 views corroborates this. Let’s see if the universities and the academic world will learn that contributions to online forums should be encouraged and acknowledged.

check the PRBB programmers’ blog

The Parc de Recerca Biomèdica de Barcelona (PRBB) is the building where I work. It is a big research center, built about 5 years ago, hosting about 2000 scientists of different nationalities working in different fields.

Here, we organize many different activities related to programming. For example, we have a Python Programmers Meetup Group, which used to meet once a month, and a series of technical seminars about programming-related topics. Plus, we hold a short meeting every Tuesday to discuss geeky things.

To coordinate all these events, I am setting up a WordPress blog. Check it out:

If you work at the PRBB, you can check this blog to know what is going on, and whether any of the programming-related activities might interest you. If you live in Barcelona, you can still check it out, because most of these activities are open to the public. Even if you don’t live in Catalunya, the blog may still be interesting for you, as it gives an ‘inside view’ of what we do here and which tools and programming languages we use. It may also give you some inspiration about the kinds of activities that interest a group of bioinformaticians in a big research center, and can be helpful if you want to emulate the initiative :-).

recipe for a home-made (limitless) dropbox

Today, Bitbucket has enabled support for Git repositories! This means that, with a little hack, and thanks to the SparkleShare project, we can make a home-made Dropbox without disk space limits.

Bitbucket is a repository hosting service, like GitHub, SourceForge and many others. The good thing about Bitbucket is that there are no limits on space or on the number of private repositories. I use it for almost all my personal projects and for backups.

SparkleShare is a free software tool designed to automatically synchronize a folder with a remote Git repository. You can choose one or more folders in your filesystem, and SparkleShare will automatically synchronize them with a remote repository, such as one on GitHub, Bitbucket or another server. Every time you move or change a file, SparkleShare will automatically create a commit and push it to the remote server. In short, it is like Dropbox, but it can be used on multiple folders, and you have to provide the remote repository to host the files.

So, since Bitbucket now supports Git repositories, we can use it with SparkleShare. The process is quite simple:

  • Get SparkleShare and install it on your computer. (I had difficulties configuring SparkleShare to work directly with Bitbucket, so I created a repo on GitHub first and then changed the URL in the config file.)
  • Create an account on GitHub, upload the SSH key that SparkleShare has created in ~/SparkleShare, and create a repo there;
  • go to Bitbucket, create a private repo, and upload the same SSH key there;
  • start SparkleShare, select the ‘github’ option, and tell it to synchronize the folder with your GitHub repository;
  • open the folder and edit the hidden file .git/config, replacing the GitHub URL with the Bitbucket URL;
  • that’s it; enjoy 🙂
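The .git/config edit in the last step can also be done with `git remote set-url origin <bitbucket-url>`. Just for fun, here is a small Python sketch that does the same text substitution; the folder path and URLs in the example are placeholders, not real repositories:

```python
import re
from pathlib import Path

def point_remote_at(repo_dir, new_url):
    """Replace the first 'url = ...' line in a repo's .git/config with new_url."""
    config = Path(repo_dir).expanduser() / ".git" / "config"
    text = config.read_text()
    new_text, n_subs = re.subn(r"(?m)^(\s*url\s*=\s*).*$",
                               lambda m: m.group(1) + new_url,
                               text, count=1)
    if n_subs != 1:
        raise ValueError("no 'url =' line found in %s" % config)
    config.write_text(new_text)

# Example (placeholder path and URL):
# point_remote_at("~/SparkleShare/myfolder",
#                 "git@bitbucket.org:myuser/myfolder.git")
```

The lambda replacement avoids problems with backslashes or group references that a plain replacement string could hit in unusual URLs.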

Having a limitless home-made Dropbox is cool; however, there are many reasons why I don’t recommend abusing this system.

First of all, Git is slow when handling big files. If you try to synchronize big files such as movies (please do not use this for illegal stuff), you will waste a lot of bandwidth, and the synchronization will be very slow.

Second: although there doesn’t seem to be anything against this in Bitbucket’s Terms of Use, it is not nice to abuse the service. I have opened an issue in SparkleShare’s repo and one at Bitbucket to ask the authors’ opinion about this. In any case, you don’t want to risk putting all your backup files on Bitbucket through this system, only to see Bitbucket remove them because you have abused the Terms of Use.

Third, and last, there is no real need for this. There are plenty of alternatives that already provide cloud hosting for your backups. I will list a few:

  • Dropbox is free and gives you 2 GB, plus 250 MB for each invitation (note: this link is an invitation; if you register through it, you and I will both get 250 MB extra).
  • Ubuntu One gives you 5 GB for free, and is great at synchronizing preferences and configurations. Now they have also created a Windows client, so there is no excuse not to use it. I only wish they would fix the http proxy issue soon.
  • gives you up to 10 GB of remote file storage, although it doesn’t get synchronized automatically like the other services (note: this link is an invitation; if you register through it, you and I will both get 1 GB extra).

The #SciFund challenge begins today!

#SciFund is an experiment in ‘crowdfunding’ for researchers. The idea behind #SciFund is to push scientists toward the world of crowdfunding, teaching them how to propose their research projects on a crowdfunding website and get people to contribute to funding them.

So, today #SciFund starts its first iteration. If you have an idea for a research project and you think you can convince other people to fund it, you have about two weeks to prepare a draft proposal and post it on the #SciFund site. If you need more information, I suggest you sign up on their website and their mailing list.

It is important to note that the funding will not be collected on their website, but on a popular crowdfunding website, RocketHub. The aim of #SciFund is not to collect the funding itself, but to help researchers prepare their applications for crowdfunding websites. #SciFund will be a web 2.0 site where researchers compile applications in an online, collaborative manner.

Personally, I think that crowdfunding is a good idea for research. Of course, it will never raise the 70 million dollars needed to test a new antibiotic, or the money needed to support a wet laboratory; but it may be a good resource for bioinformatics. Moreover, even if you are not interested in submitting a research proposal there, their website is a good resource for learning: have a look at their blog, and at all the useful tips they present.