Machine Learning for Biology

This weekly seminar brings together theoreticians and biologists on the topic of machine learning for biological data. In recent years, biologists have collected increasingly complex and massive data sets whose interpretation is ever more challenging. In parallel, machine learning techniques have developed at an unprecedented pace and are now transforming the way scientists and engineers work with data. The seminar has a double objective: (i) to present machine learning techniques at an elementary level (with lectures and tutorials in Python) and (ii) to help theoreticians understand which kinds of data are relevant to analyze and what their specific features are.

The sessions review the existing machine learning techniques and algorithms, and apply them to data provided by the biologists in the seminar.

For more information or session dates and times, don’t hesitate to contact us (you should have our emails already)!

News: If you want the slides, send an email to jonathan . touboul at college-de-france . fr

The ML groupmates:

People                            Data
Valentin + Marie                  Probes signals – brain recordings
David + Hadhemi                   3D brain cell motion
Jérôme & Guillaume + Marie        Neural activity clustering
Guillaume + Jérôme                Functional imaging data
Mirjana & Christophe + Philippe   Smart online microscopic acquisition
Gaeta & Alberto + Damien          Classification – extracellular matrix
Jonathan & Alberto + Benjamin     Clustering of neural activity
Jonathan + Emmanuel               3D ultrafast ultrasound brain imaging

Organization:

  • Tuesday, July 12th: Machine Learning Party!
  • Tuesday, July 5th:
    Classification Theory (2/2):
    Trees and Forests (Bagging), Boosting, Artificial Neural Networks…
    Check out Chap. 9, 10, 11, 15 of [T2]
  • Tuesday, June 28th: Project session in the library room
  • Thursday, June 23rd, in the library room:
    Tutorial on classification in Python with pandas and sklearn.
    Come with your computer and Python!
    Classify a mixture of Gaussians, or vowel sounds!
    Here is the IPython notebook file (with all solutions!) or the HTML version
  • Tuesday, June 14th:
    Classification Theory (1/2):
    Logistic Regression, LDA, Perceptron, KNN, Prototype methods, SVM…
    Check out Chap. 4, 12, 13 of [T2]
  • Tuesday, June 7th: Project session in B1
  • [!! Modified !!] Thursday, June 2nd, in the library room:
    Tutorial on linear regression in Python with pandas and sklearn.
    Don’t forget to install Python beforehand (see below)
    Linear and Bayesian regression methods
    Subset selection and shrinkage (Chap. 3 of [T2]).
    Here is the IPython notebook file (with all solutions!) or the HTML version
  • Tuesday, May 24th:
    Regression theory (Linear Methods, Bayesian regression, Bias-Variance tradeoff, … ). You can check out Chap. 3 of [T1,T2].
    Formation of the biologist + theoretician groups for data analysis
    Theoretical session on penalization and oracle inequalities
  • Tuesday, May 12th, 2pm:
    Getting started: what machine learning is and what this book club is about
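
To give a flavour of the classification tutorial (mixture of Gaussians, with logistic regression and KNN from the theory session), here is a minimal sketch. It is not the notebook's solution: the cluster centres, sample sizes, the choice of these two classifiers and the helper names make_gaussian_mixture and compare_classifiers are all illustrative assumptions, and it assumes a scikit-learn version that provides sklearn.model_selection.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def make_gaussian_mixture(n_per_class=200, seed=0):
    """Draw a toy 2D data set: one unit-variance Gaussian cloud per class.

    The centres (-2, 0) and (2, 0) are arbitrary choices for illustration.
    """
    rng = np.random.RandomState(seed)
    x0 = rng.normal(loc=[-2.0, 0.0], scale=1.0, size=(n_per_class, 2))
    x1 = rng.normal(loc=[2.0, 0.0], scale=1.0, size=(n_per_class, 2))
    X = np.vstack([x0, x1])
    y = np.concatenate([np.zeros(n_per_class, dtype=int),
                        np.ones(n_per_class, dtype=int)])
    return X, y


def compare_classifiers(seed=0):
    """Fit two classifiers on a train split; return their test accuracies."""
    X, y = make_gaussian_mixture(seed=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y)
    models = {
        "logistic regression": LogisticRegression(),
        "5-nearest neighbours": KNeighborsClassifier(n_neighbors=5),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        # score() returns the fraction of correctly classified test points
        scores[name] = model.score(X_test, y_test)
    return scores


if __name__ == "__main__":
    for name, accuracy in sorted(compare_classifiers().items()):
        print("%s: test accuracy = %.3f" % (name, accuracy))
```

Accuracy is measured on a held-out split rather than on the training points, since the training score alone would be overly optimistic, especially for KNN.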

Resources:

Installing Python: we recommend installing Python 2.7 with Anaconda (download it here). We will assume that this installation is available during the tutorial sessions.
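
To check your install before a tutorial, something like the following sketch may help. The list of packages (numpy, pandas, sklearn) is our guess at what the tutorials need, and check_packages is a hypothetical helper, not part of Anaconda; it only uses the standard library and should run under Python 2.7 or 3.

```python
import importlib

# Packages we assume the tutorials rely on (adjust as needed).
REQUIRED = ("numpy", "pandas", "sklearn")


def check_packages(names=REQUIRED):
    """Map each package name to its version string, or None if it is missing."""
    status = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            status[name] = None
        else:
            # Some modules do not expose __version__; report that explicitly.
            status[name] = getattr(module, "__version__", "unknown version")
    return status


if __name__ == "__main__":
    for name, version in sorted(check_packages().items()):
        if version is None:
            print("%s: MISSING" % name)
        else:
            print("%s: %s" % (name, version))
```

Any package reported as MISSING can usually be added with Anaconda's package manager before the session.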

References and further readings:
For those into theory:
[T1]. Pattern Recognition and Machine Learning, C. Bishop, Springer
[T2]. The Elements of Statistical Learning, T. Hastie, R. Tibshirani, J. Friedman, Springer. The PDF is available on the authors’ webpage.
For those into MOOCs:
1. Stanford course by Andrew Ng on YouTube or Coursera
2. Caltech course by Yaser Abu-Mostafa on YouTube

References to the studied articles (will be posted here as well).

And for anything else you’re looking for, follow this link!