Graph Sequences Viewer User's Guide


Overview

Graph Sequences Viewer (GSV) is a tool based on cytoscape.js that gives a graphical representation of mapped reads as a graph. Each read is a node, and each overlap between two reads is an edge. This representation makes it quick to see every possible way of extending a sequence. The input can be a simple JSON file describing the graph (nodes and edges), but it is recommended to use the output of the Mapsembler tool. Mapsembler takes one or more reference sequences and tries to extend each of them with sets of reads. Mapsembler offers several output formats, including a JSON graph output. This output uses a specific JSON structure that gives access to the full capabilities of GSV.

GSV lets you explore the graph and helps you find elements of particular interest through a vizmapper. The vizmapper applies a style (shape, color, size) to graph elements according to their data properties (sequence length, coverage, ...). Every element of the graph is clickable, and clicking one shows its information in a retractable panel. This panel offers several functions for working with the displayed data, especially for nodes: sequence concatenation, comments and highlighting. All sequences displayed in this panel (node sequences and concatenated sequences) can also be exported to text files.

Compatibility table

- Firefox 4.0+
- Google Chrome 5.0+
- Safari 5.1.7+
- Internet Explorer 9.0+
- Opera 11.6+

The compatibility table is valid on Mac OS X, Windows and Linux.
However, tests have only been run on:
- Windows 7 with IE 10, IE 11, Firefox 26, Chrome 31 and Opera 12.16
- Mac OS X Mountain Lion with Safari 6.0.2, Firefox 26, Firefox 27, Chrome 30, Chrome 31 and Opera 12.16
- Mac OS X Mavericks with Safari 7.0, Firefox 26, Firefox 27 and Chrome 31
- Fedora 17 with Firefox 15 and Chrome 22
Some problems occur with Chrome >= 31 on Mac OS X; please use another browser. For other compatibility problems or questions, please contact alexan.andrieux@inria.fr.

1. Start Page

1.1. Load a File

The start page lets you load a graph file (.json) or a session file (.sjson). If the application runs standalone, any compatible file saved on your hard drive can be opened. If the application runs online, only files present on the application's server can be opened. With an online instance, files must therefore first be uploaded to a server directory with suitable access rights (for example, a json directory in the root directory of the application).

a. Graph File

The graph file can be the output of Mapsembler, or a simple JSON file. The minimal structure of a compatible JSON file is:

{
  "nodes": [
    {"data": {"id": "0", "sequence": "ATGC"}},
    {"data": {"id": "1", "sequence": "ATGC"}},
    {"data": {"id": "2", "sequence": "ATGC"}},
    {"data": {"id": "3", "sequence": "ATGC"}},
    {"data": {"id": "4", "sequence": "ATGC"}}
  ],
  "edges": [
    {"data": {"id": "e0", "source": "0", "target": "1", "direction": "FF"}},
    {"data": {"id": "e1", "source": "1", "target": "0", "direction": "RR"}},
    {"data": {"id": "e2", "source": "0", "target": "3", "direction": "FF"}},
    {"data": {"id": "e3", "source": "3", "target": "0", "direction": "RR"}},
    {"data": {"id": "e4", "source": "1", "target": "2", "direction": "FF"}},
    {"data": {"id": "e5", "source": "2", "target": "1", "direction": "RR"}},
    {"data": {"id": "e6", "source": "3", "target": "4", "direction": "FF"}},
    {"data": {"id": "e7", "source": "4", "target": "3", "direction": "RR"}}
  ]
}
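
A file with this structure can also be produced by a script. The Python sketch below is only an illustration: the function name build_minimal_graph and its input format (a list of sequences plus a list of index pairs) are assumptions made for this example and are not part of GSV or Mapsembler. For each linked pair it writes one FF edge and the reciprocal RR edge, as in the sample above.

```python
import json


def build_minimal_graph(sequences, links):
    """Build a dict that follows the minimal structure shown above.

    sequences: list of sequence strings; node ids are their positions ("0", "1", ...).
    links: list of (source_index, target_index) pairs; each pair yields an "FF"
           edge and the reciprocal "RR" edge, mirroring the sample file.
    """
    nodes = [{"data": {"id": str(i), "sequence": seq}}
             for i, seq in enumerate(sequences)]
    edges = []
    for source, target in links:
        edges.append({"data": {"id": f"e{len(edges)}", "source": str(source),
                               "target": str(target), "direction": "FF"}})
        edges.append({"data": {"id": f"e{len(edges)}", "source": str(target),
                               "target": str(source), "direction": "RR"}})
    return {"nodes": nodes, "edges": edges}


# Same topology as the sample: 0-1, 0-3, 1-2, 3-4.
graph = build_minimal_graph(["ATGC"] * 5, [(0, 1), (0, 3), (1, 2), (3, 4)])
graph_json = json.dumps(graph, indent=1)  # text that can be saved as a .json file
```

The resulting text can then be written to a .json file and loaded from the start page.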

For better performance, the property "length", which gives the size of the sequence, can be added to the node data. The application then does not have to compute the sequence sizes, which saves time, especially with large graphs.


{"data": {"id": "0", "length": 4, "sequence": "ATGC"}},

Node and edge data can also contain the property "coverage". Coverage is an average that can come from several files, so the coverage property is an array holding, for each file, its identifier and its average coverage value:


{"data": {"id": "0", "length": 4, "sequence": "ATGC",
  "coverage": [
    {"id": "file_1", "avg_coverage": 106.47},
    {"id": "file_2", "avg_coverage": 106.47}
  ]
}},
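
To illustrate how such a coverage array can be read, here is a small Python sketch. The helper name average_coverage is hypothetical, and the second coverage value (98.2) is invented for the example. Falling back to the first entry when no file is named mirrors the default described later in this guide (the first coverage file is used); how GSV reads the array internally is not documented here.

```python
def average_coverage(element, file_id=None):
    """Return the avg_coverage of a node or edge for one coverage file.

    When file_id is None, the first entry of the array is used.
    Returns None when the element has no coverage or the file is unknown.
    """
    entries = element["data"].get("coverage", [])
    if not entries:
        return None
    if file_id is None:
        return entries[0]["avg_coverage"]
    for entry in entries:
        if entry["id"] == file_id:
            return entry["avg_coverage"]
    return None


# Illustrative node; the value for file_2 is made up for this example.
node = {"data": {"id": "0", "length": 4, "sequence": "ATGC",
                 "coverage": [{"id": "file_1", "avg_coverage": 106.47},
                              {"id": "file_2", "avg_coverage": 98.2}]}}
default_value = average_coverage(node)
second_value = average_coverage(node, "file_2")
```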

To use the concatenation function, the property "k", which gives the overlap between the sequences of linked nodes, must be added to the node data.


{"data": {"id": "0", "length": 4, "sequence": "ATGC", "k": 3}},
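
The optional properties "length" and "k" can be added to an existing minimal graph with a few lines of Python. This is a sketch under assumptions: the function name annotate_nodes is hypothetical, and it assumes the same overlap k applies to every node, which may not hold for your data.

```python
def annotate_nodes(graph, k=None):
    """Add "length" to every node, and "k" when an overlap value is given."""
    for node in graph["nodes"]:
        data = node["data"]
        data["length"] = len(data["sequence"])  # size of the sequence
        if k is not None:
            data["k"] = k  # assumption: one overlap value for all nodes
    return graph


graph = {"nodes": [{"data": {"id": "0", "sequence": "ATGC"}},
                   {"data": {"id": "1", "sequence": "TGCAA"}}],
         "edges": []}
annotate_nodes(graph, k=3)
```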

When a minimal JSON file is loaded, the graph viewer is displayed automatically.
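
Before loading a hand-written file, it can help to check it with a script. The following Python sketch is not part of GSV: the function name check_minimal_graph is hypothetical, and the list of required keys is an assumption derived from the minimal example above.

```python
import json

# Assumption: these keys are taken from the minimal example in this guide.
REQUIRED_NODE_KEYS = ("id", "sequence")
REQUIRED_EDGE_KEYS = ("id", "source", "target", "direction")


def check_minimal_graph(text):
    """Return a list of problems found in a minimal graph JSON text."""
    graph = json.loads(text)
    problems = []
    node_ids = set()
    for node in graph.get("nodes", []):
        data = node.get("data", {})
        missing = [key for key in REQUIRED_NODE_KEYS if key not in data]
        if missing:
            problems.append(f"node {data.get('id', '?')}: missing {', '.join(missing)}")
        if "id" in data:
            node_ids.add(data["id"])
    for edge in graph.get("edges", []):
        data = edge.get("data", {})
        missing = [key for key in REQUIRED_EDGE_KEYS if key not in data]
        if missing:
            problems.append(f"edge {data.get('id', '?')}: missing {', '.join(missing)}")
        # Every edge end must point to a declared node.
        for end in ("source", "target"):
            if end in data and data[end] not in node_ids:
                problems.append(f"edge {data.get('id', '?')}: unknown {end} {data[end]}")
    return problems
```

An empty list means that no problem was detected by these checks; it does not guarantee that GSV will accept the file.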

The Mapsembler JSON output has a more complex structure and can include several graphs. With this file, you choose which starter, substarter and coverage file to use for the graph you want. Each starter has at least one substarter (the substarter can be the starter itself), and each substarter has one or two extensions. The Mapsembler output looks like:

{
  "Starter_0": {
    "id": "S0",
    "sequence": "ATGC",
    "substarters": [
      {"data": {"id": "s0", "sequence": "ATGC", "length": 4, "extremGraphRight": "k0", "extremGraphLeft": "none"}}
    ],
    "extremGraphs": [
      {"data": {
        "id": "k0", "sequence": "ATGC", "direction": "RIGHT", "firstNodeId": "n0",
        "nodes": [
          {"data": {"id": "0", "sequence": "ATGC"}},
          {"data": {"id": "1", "sequence": "ATGC"}},
          {"data": {"id": "2", "sequence": "ATGC"}},
          {"data": {"id": "3", "sequence": "ATGC"}},
          {"data": {"id": "4", "sequence": "ATGC"}}
        ],
        "edges": [
          {"data": {"id": "e0", "source": "0", "target": "1", "direction": "FF"}},
          {"data": {"id": "e1", "source": "1", "target": "0", "direction": "RR"}},
          {"data": {"id": "e2", "source": "0", "target": "3", "direction": "FF"}},
          {"data": {"id": "e3", "source": "3", "target": "0", "direction": "RR"}},
          {"data": {"id": "e4", "source": "1", "target": "2", "direction": "FF"}},
          {"data": {"id": "e5", "source": "2", "target": "1", "direction": "RR"}},
          {"data": {"id": "e6", "source": "3", "target": "4", "direction": "FF"}},
          {"data": {"id": "e7", "source": "4", "target": "3", "direction": "RR"}}
        ]
      }}
    ]
  }
}

As with the minimal JSON, the property "length" can be present in the node data and "coverage" in the node and edge data. The property "k", however, is calculated automatically with this JSON structure.
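
The relation between starters, substarters and extension graphs can be followed with a short script. The Python sketch below relies only on the field names visible in the example above (substarters, extremGraphs, extremGraphLeft, extremGraphRight); the function name extension_graphs is hypothetical, and real Mapsembler output may contain additional fields or cases that this sketch does not handle.

```python
def extension_graphs(output, starter_key, substarter_id):
    """Return the extension graphs referenced by one substarter (left, then right)."""
    starter = output[starter_key]
    graphs = {g["data"]["id"]: g["data"] for g in starter["extremGraphs"]}
    for substarter in starter["substarters"]:
        data = substarter["data"]
        if data["id"] == substarter_id:
            refs = [data.get("extremGraphLeft", "none"),
                    data.get("extremGraphRight", "none")]
            # "none" marks a side without extension, as in the example.
            return [graphs[ref] for ref in refs if ref != "none"]
    raise KeyError(substarter_id)


# Reduced copy of the example above (node and edge lists left empty).
output = {"Starter_0": {
    "id": "S0", "sequence": "ATGC",
    "substarters": [{"data": {"id": "s0", "sequence": "ATGC", "length": 4,
                              "extremGraphRight": "k0", "extremGraphLeft": "none"}}],
    "extremGraphs": [{"data": {"id": "k0", "sequence": "ATGC", "direction": "RIGHT",
                               "firstNodeId": "n0", "nodes": [], "edges": []}}]}}
found = extension_graphs(output, "Starter_0", "s0")
```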

b. Session File

A session file (.sjson) can also be loaded. Session files contain the node positions and the vizmapper properties previously defined for nodes and edges. Loading a session file displays the graph viewer automatically.

1.2. Data tables of the loaded graph

a. Starters

When a Mapsembler JSON output is loaded, a table of starters is displayed.

Clicking a starter selects it and displays the substarters table in a new internal tab.

b. Sub-starters

Clicking a substarter selects it and displays the graph viewer in a new browser tab.

c. Coverage Files

The coverage file must be set before a substarter is selected. By default, the first coverage file is used.

2. Graph viewer

2.1. Data tables

In the graph viewer, the left part contains the nodes data table and the edges data table in two tabs. The data of the graph elements are displayed in this area.

Selecting or unselecting an element in a table selects or unselects it in the graph. Multiple selection is possible with the hot key "Shift+Left Click". The search function finds an element by its ID, or finds nodes whose sequence contains a specific motif.
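
The same kind of search can be reproduced offline on a graph file. The Python sketch below is an assumption about what such a search does (exact ID match, case-insensitive substring match for the motif); the function name find_nodes is hypothetical and the actual search implemented in GSV may behave differently.

```python
def find_nodes(nodes, node_id=None, motif=None):
    """Return the ids of nodes matching an exact id and/or containing a motif."""
    matches = []
    for node in nodes:
        data = node["data"]
        if node_id is not None and data["id"] != node_id:
            continue
        if motif is not None and motif.upper() not in data["sequence"].upper():
            continue
        matches.append(data["id"])
    return matches


nodes = [{"data": {"id": "0", "sequence": "ATGCGT"}},
         {"data": {"id": "1", "sequence": "TTTTAA"}},
         {"data": {"id": "2", "sequence": "CCGCGA"}}]
with_motif = find_nodes(nodes, motif="gcg")
by_id = find_nodes(nodes, node_id="1")
```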


2.2. Vizmapper

The vizmapper defines styles according to the properties length and average coverage. For nodes, shape and size can be defined according to length. If average coverage is present, an edges tab appears, and color can be defined according to average coverage for both nodes and edges. Each cursor defines a point in the distribution of node lengths or of node/edge average coverage. By default, the style of the root is kept.

a. Length property

Points in the distribution of node lengths define points in the distribution of shapes (discrete) and in the distribution of sizes (gradient). By default, the shape is round for all nodes except the root, and the size distribution contains three points: 20px (minimum length), 30px (medium length) and 40px (maximum length). For example, with the default size values, nodes whose sequence length lies between 0 and the medium value get a size between 20px and 30px. Size grows gradually with sequence length.
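
The gradual growth of size between two distribution points can be pictured as a piecewise linear interpolation. The Python sketch below is an interpretation, not GSV's actual code: the function name size_for_length is hypothetical, and the lengths 0, 100 and 200 used for the three default points are invented for the example.

```python
def size_for_length(points, length):
    """Interpolate a size (px) for a sequence length.

    points: list of (length, size) pairs, one per cursor.
    Lengths outside the range get the size of the nearest end point.
    """
    points = sorted(points)
    if length <= points[0][0]:
        return points[0][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if length <= x1:
            return y0 + (y1 - y0) * (length - x0) / (x1 - x0)
    return points[-1][1]


# Default sizes 20px, 30px, 40px placed at illustrative lengths 0, 100, 200.
default_points = [(0, 20), (100, 30), (200, 40)]
size_at_50 = size_for_length(default_points, 50)    # halfway between 20px and 30px
size_at_150 = size_for_length(default_points, 150)  # halfway between 30px and 40px
```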

Clicking on the distribution preview adds a new point (and a new cursor) to the distribution.

Clicking on a cursor displays a selector for setting the properties (shape, size) of that distribution point.

This action also displays a frame with a cross around the cursor. Clicking on the cross deletes the distribution point (and its cursor).

b. Coverage property

If coverage is present, points in the distribution of average coverage define points in the distribution of colors (gradient). By default, the color distribution contains three points: red (poor coverage), orange (medium coverage) and green (good coverage).

As with the length property, points can be added, removed and styled (here, with a color). For example, with the default color values, nodes whose average coverage lies between 0 and the medium value get a color between red and orange, with a gradient transition between the two.
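
A color gradient of this kind can be sketched by interpolating each RGB channel between two neighboring points. The Python example below is an assumption about how such a gradient may be computed: the function name color_for_coverage is hypothetical, the RGB triplets are the usual CSS values for red, orange and green (the exact colors used by GSV are not documented here), and the coverage values 0, 50 and 100 are invented.

```python
def color_for_coverage(points, coverage):
    """Interpolate an (r, g, b) color for an average coverage value.

    points: list of (coverage, (r, g, b)) pairs, one per cursor.
    """
    points = sorted(points)
    if coverage <= points[0][0]:
        return points[0][1]
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if coverage <= x1:
            t = (coverage - x0) / (x1 - x0)
            # Blend each channel separately and round to an integer.
            return tuple(round(a + (b - a) * t) for a, b in zip(c0, c1))
    return points[-1][1]


RED, ORANGE, GREEN = (255, 0, 0), (255, 165, 0), (0, 128, 0)
# Illustrative cursor positions: poor = 0, medium = 50, good = 100.
default_colors = [(0, RED), (50, ORANGE), (100, GREEN)]
low_color = color_for_coverage(default_colors, 10)   # between red and orange
high_color = color_for_coverage(default_colors, 60)  # between orange and green
```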

c. Reset and table of coverage files

The reset button resets the vizmapper to its default state for nodes or for edges (not both at the same time).

The table of coverage files sets the coverage file used for the graph. The selected file is shown in orange. Clicking on another file selects it, and the data tables (nodes and edges) and the vizmapper are updated with the new average coverage values.

2.3. Graph panel

The graph panel provides the following functions:

a. Load file

The load file button returns to the start page, where you can set the starter and substarter or load a previous session file.

b. Save file

The save file button saves the current session to a file. The session file must be saved in the same directory as the loaded JSON file; otherwise the session cannot be restored later.

c. Style

The style button shows or hides the labels of graph elements (nodes and edges).

d. Layout

The layout button recalculates the current layout (reset) and sets the layout type. Nine types of layout are available.

2.4. Data viewer

Selecting nodes displays a bottom panel that contains the properties of the selected elements. The panel gives:

For edges: ID, source, target and average coverage.

For nodes: ID, sequence length, average coverage and sequence. The panel also has a menu and displays the selection interval for the current node.

The panel closes when the elements are unselected. The hold button keeps the panel open after the elements are unselected.

a. Sequence format

The format of the displayed sequence(s) can be set (Set button). Four formats are available:

- FASTA
- CODATA
- PRIDE
- RAW
b. Annotation and highlight

Parts of sequences can be marked with a highlight or an annotation (Add button).

Clicking on an annotation rectangle displays a color selector with text areas for the annotation name and a comment.

Once the selector is hidden, the name and comment are shown when the mouse is over the annotation rectangle.

Highlights and annotations are persistent: they are not lost when nodes are unselected. For now, it is only possible to remove all displayed highlights and/or all displayed annotations at once (Remove button).

c. Export

The export function generates and downloads a text file containing all sequences displayed in the bottom panel, in the selected format.

d. Concatenation

The concatenation function concatenates the sequences of two or more nodes (available only for Mapsembler JSON output, or for minimal JSON with k in the node data). Some constraints determine whether sequences can be concatenated. It is important to know that if two nodes are linked by an edge, their two sequences share n similar characters (the overlap).

When concatenating two nodes, the order of the clicks is very important, because only one direction is shown on the graph while in some cases another direction also exists. (This can be checked in the edges table.)

For example, if you click on node n9 first and then on n8, the direction followed is FF.


But if you click on n8 before n9, the direction followed is RR.
So the result of the concatenation will be different.

Be careful: when nodes are unselected, the concatenation disappears and all annotations and highlights on the concatenated sequence are lost. If two (or more) nodes cannot be concatenated, for example two nodes that are not linked, an error message is displayed in the right corner of the application.
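
The effect of the click order can be illustrated with a Python sketch. It rests on assumptions that this guide does not confirm: FF is read as "both sequences as stored", RR as "both sequences reverse-complemented", and the merge keeps the k overlapping characters only once. The function names and the sequences of n9 and n8 are invented for the example, and other direction values are not handled.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    return sequence.translate(COMPLEMENT)[::-1]


def concatenate(first, second, edges, k):
    """Merge two node sequences in click order; returns (direction, sequence)."""
    direction = None
    for edge in edges:
        data = edge["data"]
        if data["source"] == first["id"] and data["target"] == second["id"]:
            direction = data["direction"]
            break
    if direction is None:
        raise ValueError("nodes are not linked by an edge")
    a, b = first["sequence"], second["sequence"]
    if direction == "RR":  # assumption: RR reads both sequences on the reverse strand
        a, b = reverse_complement(a), reverse_complement(b)
    elif direction != "FF":
        raise ValueError("direction not handled in this sketch: " + direction)
    if k <= 0 or a[-k:] != b[:k]:
        raise ValueError("sequences do not overlap on k characters")
    return direction, a + b[k:]  # keep the overlap only once


# Invented sequences sharing an overlap of k = 3 characters ("CGT").
n9 = {"id": "n9", "sequence": "AACGT"}
n8 = {"id": "n8", "sequence": "CGTTA"}
edges = [{"data": {"id": "e0", "source": "n9", "target": "n8", "direction": "FF"}},
         {"data": {"id": "e1", "source": "n8", "target": "n9", "direction": "RR"}}]
forward = concatenate(n9, n8, edges, 3)   # click n9, then n8
backward = concatenate(n8, n9, edges, 3)  # click n8, then n9
```

Under these assumptions, the two click orders give sequences that are reverse complements of each other, which is one way to read the statement that the results differ.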