Computational pan-genomics: status, promises and challenges


SNP calling from RNA-seq data without a reference genome: identification, quantification, differential analysis and impact on the protein sequence

  • Authors: Hélène Lopez-Maestre Lilia Brinza Camille Marchet Janice Kielbassa Sylvère Bastien Mathilde Boutigny David Monnin Adil El Filali Claudia Marcia Carareto Cristina Vieira Franck Picard Natacha Kremer Fabrice Vavre Marie-France Sagot Vincent Lacroix
  • Details: Nucl Acids Res (2016) 44 (19): e148.

A Resource-frugal Probabilistic Dictionary and Applications in (Meta)Genomics

Accurate self-correction of errors in long reads using de Bruijn graphs

  • Authors: L Salmela, R Walve, E Rivals, E Ukkonen
  • Details: Bioinformatics, 2016

Superstring Graph: a new approach for genome assembly

Read mapping on de Bruijn graphs

  • Authors: A. Limasset, B. Cazaux, E. Rivals, P. Peterlongo
  • Details: BMC Bioinformatics doi:10.1186/s12859-016-1103-9, 17:237, 2016.

A linear time algorithm for shortest cyclic cover of strings

Shortest DNA cyclic cover in compressed space

  • Authors: B. Cazaux, R. Canovas, E. Rivals
  • Details: Data Compression Conference (DCC), Snowbird, Utah, 27th March – 1 April, IEEE Computer Society Press, 536 – 545, 2016.
    DOI 10.1109/DCC.2016.79

Colib’read on galaxy: a tools suite dedicated to biological information extraction from raw NGS reads

  • Authors: Yvan Le Bras, Olivier Collin, Cyril Monjeaud, Vincent Lacroix, Éric Rivals, Claire Lemaitre, Vincent Miele, Gustavo Sacomoto, Camille Marchet, Bastien Cazaux, Amal Zine El Aabidine, Leena Salmela, Susete Alves-Carvalho, Alexan Andrieux, Raluca Uricaru and Pierre Peterlongo
  • Details: GigaScience, doi:10.1186/s13742-015-0105-2, 2016.

SNP discovery and genetic mapping using genotyping by sequencing of whole genome genomic DNA from a pea RIL population

  • Authors: Gilles Boutet, Susete Alves Carvalho, Matthieu Falque, Pierre Peterlongo, Emeline Lhuillier, Olivier Bouchez, Clément Lavaud, Marie-Laure Pilet-Nayel, Nathalie Rivière and Alain Baranger
  • Details: BMC Genomics, doi:10.1186/s12864-016-2447-2, 2016.




Reference-free compression of high throughput sequencing data with a probabilistic de Bruijn graph.

  • Authors: Gaëtan Benoit, Claire Lemaitre, Dominique Lavenier, Erwan Drezen, Thibault Dayris, Raluca Uricaru and Guillaume Rizk
  • Details: BMC Bioinformatics, doi:10.1186/s12859-015-0709-7, 16:288, 2015.
  • Talk at WABI’2015: pdf

Construction of a de Bruijn Graph for Assembly from a Truncated Suffix Tree

  • Authors: Cazaux, Bastien, Lecroq, Thierry, & Rivals, Eric
  • Details: In International Conference on Language and Automata Theory and Applications

YOC, a new strategy for pairwise alignment of collinear genomes

  • Authors: R. Uricaru, C. Michotey, H. Chiapello, E. Rivals
  • Details: BMC Bioinformatics doi:10.1186/s12859-015-0530-3, 16:111, 2015.


Reference-free detection of isolated SNPs.

  • Authors: Raluca Uricaru, Guillaume Rizk, Vincent Lacroix, Elsa Quillery, Olivier Plantard, Rayan Chikhi, Claire Lemaitre, Pierre Peterlongo
  • Details: Nucleic Acids Research (2014), doi:10.1093/nar/gku1187

Amortized Õ(|V|)-Delay Algorithm for Listing Chordless Cycles in Undirected Graphs

  • Authors: Rui Ferreira, Roberto Grossi, Romeo Rizzi, Gustavo Sacomoto, Marie-France Sagot
  • Details: In European Symposium on Algorithms. Springer Berlin Heidelberg.

Using cascading Bloom filters to improve the memory usage for de Brujin graph

  • Authors: Salikhov, K., Sacomoto, G., & Kucherov, G.
  • Details: Algorithms for Molecular Biology, 9, 2.

Navigating in a Sea of Repeats in RNA-seq without Drowning

  • Authors: Sacomoto, G., Sinaimeri, B., Marchet, C., Miele, V., Sagot, M.-F., & Lacroix, V.
  • Details: CoRR, abs/1406.1022.

Efficient Algorithms for de novo Assembly of Alternative Splicing Events from RNA-seq Data

  • Authors: Sacomoto, G
  • Details: PhD thesis

MindTheGap : integrated detection and assembly of short and long insertions.

  • Authors: Guillaume Rizk, Anaïs Gouin, Rayan Chikhi and Claire Lemaitre
  • Details: Bioinformatics, 2014 30(24):3451-3457.
  • Poster at ECCB’2014: pdf

LoRDEC: accurate and efficient long read error correction

  • Authors: Leena Salmela, Eric Rivals
  • Details: Bioinformatics, online, doi:10.1093/bioinformatics/btu538, August 26, 2014.

From Indexing Data Structures to de Bruijn Graphs

Efficiently listing bounded length st-paths

  • Authors: R Rizzi, G Sacomoto, MF Sagot
  • Details: International Workshop on Combinatorial Algorithms

Integrating a new visualization tool in Galaxy

  • Authors: Alexan Andrieux, Pierre Peterlongo, Yvan Le Bras, Cyril Monjeaud, Charles Deltel
  • Details: Galaxy Community Conference 2014.

Approximation of greedy algorithms for Max-ATSP, Maximal Compression, Maximal Cycle Cover, and Shortest Cyclic Cover of Strings

Reverse Engineering of Compact Suffix Trees and Links: a Novel Algorithm

Reference-free high-throughput SNP detection in pea: an example of discoSnp usage for a non-model complex genome

  • Authors: Susete Alves Carvalho, Raluca Uricaru, Jorge Duarte, Claire Lemaitre, Nathalie Rivière, Gilles Boutet, Alain Baranger and Pierre Peterlongo
  • Details: ECCB 2014 poster, track Sequencing and sequence analysis for genomics

Commet: Comparing and combining multiple metagenomic datasets

  • Authors: Nicolas Maillet, Guillaume Collet, Thomas Vannier, Dominique Lavenier, Pierre Peterlongo
  • Details: In Bioinformatics and Biomedicine (BIBM), 2014 IEEE International Conference.

Mapping-free and Assembly-free Discovery of Inversion Breakpoints from Raw NGS Reads. 

  • Authors: Claire Lemaitre, Liviu Ciortuz and Pierre Peterlongo.
  • Details: Proceedings of AlCoB 2014, in Algorithms for Computational Biology, LNCS vol. 8542, pp. 119–130.

2013 – These publications are related to the topic, but were not produced under the project

A Polynomial Delay Algorithm for the Enumeration of Bubbles with Length Constraints in Directed Graphs and Its Application to the Detection of Alternative Splicing in RNA-seq Data

  • Authors: Sacomoto, G., Lacroix, V., & Sagot, M.-F.
  • Details: WABI 2013

CRAC: an integrated approach to the analysis of RNA-seq reads

  • Authors: Nicolas Philippe, Mikaël Salson, Thérèse Commes, Eric Rivals
  • Details: Genome Biology, 14:R30, 2013.

Scalable and Versatile k-mer Indexing for High-Throughput Sequencing Data

Development of genomic resources for the tick Ixodes ricinus: isolation and characterization of Single Nucleotide Polymorphisms.

2012 – These publications are related to the topic, but were not produced under the project

KISSPLICE: de-novo calling alternative splicing events from RNA-seq data

  • Authors: Gustavo AT Sacomoto, Janice Kielbassa, Rayan Chikhi, Raluca Uricaru, Pavlos Antoniou, Marie-France Sagot, Pierre Peterlongo and Vincent Lacroix
  • Details: BMC Bioinformatics 2012, 13(Suppl 6):S5

Efficient bubble enumeration in directed graphs

  • Authors: Etienne Birmele, Pierluigi Crescenzi, Rui Ferreira, Roberto Grossi, Vincent Lacroix, Andrea Marino, Nadia Pisanti, Gustavo AT Sacomoto and Marie-France Sagot
  • Details: Proceedings of 19th Conference on String Processing and Information Retrieval (SPIRE) 2012, Springer LNCS 7608, pages 118-129, 2012.

Mapsembler, targeted and micro assembly of large NGS datasets on a desktop computer

  • Authors: Pierre Peterlongo and Rayan Chikhi
  • Details: BMC Bioinformatics 2012, 13(1), 48. doi:10.1186/1471-2105-13-48.

Space-efficient and exact de Bruijn graph representation based on a Bloom filter

  • Authors: Rayan Chikhi and Guillaume Rizk
  • Details: BMC Bioinformatics 2012, 13(Suppl 6):S5


Querying large read collections in main memory: a versatile data structure

  • Authors: Nicolas Philippe, Mikaël Salson, Thierry Lecroq, Martine Léonard
  • Details: BMC Bioinformatics 2011


Identifying SNPs without a reference genome by comparing raw reads.

  • Authors: Pierre Peterlongo, Nicolas Schnel, Nadia Pisanti, Marie-France Sagot, Vincent Lacroix.
  • Details: 17th International Symposium, SPIRE 2010, Los Cabos, Mexico, October 11-13, 2010, Proceedings