Tuesday, January 6, 2015

Bioinformatics in the Lab: Putting a Yearly Plan Together

"Plans are worthless, but planning is everything" Dwight D. Eisenhower

Making a yearly software development plan for a wet lab is at best a frightening moment, commonly a nightmare, and very often skipped altogether.
From the makers' perspective, buried in the trenches and already overwhelmed by an avalanche of reactive tasks, making plans for the year to come can seem a boring process, if not a total waste of time. From upper management's perspective, yearly plans are a cornerstone, a transmission belt between strategy and tactical realization. Middle managers therefore have to organize the planning process... before making it happen.
The situation we'll consider in this post is a common one in a research environment, where bioinformatics plans have to be drafted to support a wet lab effort. The process described here is designed for a medium-sized entity, possibly part of a larger organization: for example, a lab of about thirty scientists and technicians run by five group leaders. This lab is supported by a group of five computer folks, bioinformaticians or computational biologists. Let's also assume that the whole group has already been working together for a long time, which removes the business analysis barrier. In other words, people are used to talking to each other and they know what they are talking about.
Drafting plans cannot be a top-down process. It must take into account multiple perspectives: the strategic ones, relayed by lab managers, and the practical ones, voiced by the makers, the bioinformaticians. Moreover, everyone must be listened to (hardly achievable in large, passionate meetings). The plan must be adopted by everyone, and the more fun tasks must be mixed with less exciting ones.
We propose a four-step process:
  • gather the tasks,
  • estimate the effort,
  • individually prioritize,
  • make the One Plan.
The overall planning is facilitated by a single person, navigating among the different contributors. This orchestrator should have a fair understanding of the scientific domain but should come from the computational side, with a better sense of the technical complexity. Two global meetings are needed: a short one up front to explain the process and a longer one for the last step.

Looking for a path, in Polar Bear area - Spitzbergen 2001

Saturday, October 18, 2014

Embracing Technology Change In Scientific Software

“πάντα χωρεῖ καὶ οὐδὲν μένει” [Everything changes and nothing stands still]  - Heraclitus of Ephesus.
True in 500 BC, this statement certainly still holds in today's software world. Whether at the micro scale, when replacing a third-party library or continuously refactoring, or at the macro scale, when adopting a new language or framework or making a major paradigm shift such as going parallel, evolution is part of the development cycle.
Change can be planned, as when a prototype paves the way for production software, or unexpected, as when a technology breakthrough emerges during the lifespan of a project.
Although every domain is affected by this phenomenon, scientific software has a couple of peculiarities. Because it often serves research, the purpose of the code itself will pivot in unforeseen directions. Meanwhile, developers, as talented as they might be, rarely come from a software engineering background and sometimes (that might be an understatement) do not consider the software building process a relevant priority.
Embracing evolution raises challenges in multiple dimensions. We will try to cover both the technical perspective and the leadership one.

Contents:  Build To Change - Where (and When) To Go? - Foster Evolution in Your Team.

Kayaking through the ice, to reach our glacier - Patagonia, 2000

Saturday, May 10, 2014

Viewing graphs in the browser


Entities linked through relations are everywhere. Social networks, biological entities, publications, any database tables joined by a foreign key: the examples are countless, but they are often not represented as graphs. Many tools exist to visualize them, and this post introduces my five favorites for interacting with a graph in a modern web browser: graphviz, neo4j, cytoscape.js, sigma.js and d3.js.
We will by no means cover all the cool features of these tools (hey, this is a simple blog post, not a book!) but rather give a short introduction to each of them. We will focus on small graphs with force layout methods, as these are the most general-purpose ones.
Kite drying, after a Greenland ice cap crossing, 2002

Monday, August 5, 2013

Scala: 6 silver bullets

Bye bye Java, Hello Scala

In the JVM world, Scala is certainly the rising star. Created at EPFL in 2001, it is rapidly gaining in popularity. Depending on the index, it now ranks as a "serious" language, reaching far beyond the academic world and adopted by mainstream companies (Twitter backend, eBay research, Netflix, FourSquare etc.).
For data scientists, this language is a breeze. Rising above the religious war between functional and object-oriented believers, it has succeeded by merging the best of both worlds, with a strong drive toward "let's be practical."
If Grails/Groovy was a big step forward in productivity on the JVM, Scala goes even further, mixing static typing (thus efficiency) with many improvements in language structure, collection handling and concurrency, backed by solid frameworks and a very active community.
In this post, I have picked six major (and subjective) improvements, to show my hardcore Java colleagues how jumping on this train promises a great journey.

Sunday, January 20, 2013

Sparse bitset in scala, benchmarking 5 implementations

Recently, in a Scala project, I hit a problem with sparse bit sets. Typically, I was moving around sets of 30 to 50 integers between 0 and 2047, with operations such as xor, and, circular shift etc. These basic operations, repeated extensively, soon became a bottleneck in my application. The first implementation used Scala BigInt (mainly because of its straightforward shift call), but it later appeared that other solutions were better suited to the problem.
In this post, I will profile basic operations on 5 alternative implementations: Scala BigInt, mutable and immutable BitSet, versus Java BigInteger and BitSet (Java types can of course be used inside a Scala application).
Source code is available on GitHub.
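
To make the operations concrete, here is a minimal sketch of the idea of packing such a set into a single arbitrary-precision integer, in the spirit of the BigInt approach. It is not the benchmarked code and it is written in Python rather than Scala, purely for illustration. The names WIDTH, from_positions, to_positions and rotate_left are invented for this example.

```python
# Illustrative sketch only: a set of positions in 0..2047 stored as one big integer.
# All names here are hypothetical and not taken from the benchmarked project.

WIDTH = 2048                 # number of addressable bit positions (0..2047)
MASK = (1 << WIDTH) - 1      # keeps results inside the 2048-bit window


def from_positions(positions):
    """Build the integer whose set bits are the given positions."""
    bits = 0
    for p in positions:
        if not 0 <= p < WIDTH:
            raise ValueError("position out of range: %d" % p)
        bits |= 1 << p
    return bits


def to_positions(bits):
    """Return the sorted list of positions whose bit is set."""
    positions = []
    p = 0
    while bits:
        if bits & 1:
            positions.append(p)
        bits >>= 1
        p += 1
    return positions


def rotate_left(bits, shift):
    """Circular left shift inside the WIDTH-bit window."""
    shift %= WIDTH
    return ((bits << shift) | (bits >> (WIDTH - shift))) & MASK


a = from_positions([1, 5, 2047])
b = from_positions([5, 7])
print(to_positions(a ^ b))             # xor: positions present in exactly one set
print(to_positions(a & b))             # and: positions present in both sets
print(to_positions(rotate_left(a, 1)))  # position 2047 wraps around to 0
```

With this representation, xor and and are single integer operations, while the circular shift needs an explicit mask because the integer has no fixed width.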

Tuesday, December 11, 2012

Sustainable JavaScript

JavaScript is certainly one of the most popular programming languages [1] [2] [3], and the ever-increasing popularity of complex web applications will not reverse that trend anytime soon.
Despite some intrinsic flaws of the language and its environment, trends have emerged to turn JavaScript into a clearer, more structured, deployable and testable environment. In this post, I will go through some of them. The selection is subjective: it reflects techniques we use daily in large-scale projects that compute and display rich information in bioinformatics.

I will mainly give a short introduction to underscore.js, require.js and jasmine, the tools that recently made my days shine again. I will skip discussions about jQuery, Chrome developer tools, HTML5 and backbone, which are also invaluable, among many others.

Example code, usable as is or in an Eclipse environment (Aptana Studio is perfect), is available on Google Drive.
Part of a large scale tech jobs ad campaign in the Bay area

Saturday, August 25, 2012

Continuous Deployment in Perl: Code & Folks

Pre-reviewed version, published in a Perl issue of the Software Developer's Journal (May 2012 issue)
This article was written together with my friend Pierre-Antoine Queloz (paqueloz@gmail.com)

Continuous Integration is the practice of decreasing the latency between the implementation of a new piece of code and its integration into the overall project. It is the backbone of Continuous Deployment, which is often defined as releasing software very frequently in order to satisfy customer needs and get their feedback as soon as possible. Both have shown their benefits and play an important role in the success of the current Agile software development trend.