Pro-origami User Guide

1. Introduction

Pro-origami is a system for automatically generating protein structure cartoons. The cartoons are intended to make protein structure easy to interpret by laying out the secondary and super-secondary structure in two dimensions in a manner that makes the structure clear.

The cartoons are drawn in a style similar to those that might be drawn manually using a tool such as Charles Bond's TopDraw program, rather than the purely topological style such as those drawn by TOPS or PTGL, or the detailed hydrogen bonding diagrams drawn by HERA (PROMOTIF), from which the cartoons available in PDBsum are derived.

2. Pre-generated Pro-origami cartoons

The Pro-origami server contains a database with pre-generated cartoons built from the contents of the ASTRAL SCOP 1.75 PDB-style genetic domain 95% sequence identity filtered subset and from the PDB, automatically updated on a weekly basis.

To access these, visit the Retrieve Cartoon page and enter a PDB or SCOP identifier and select the format that you would like the cartoon in.

If you just want to see the cartoon, the easiest way is to view the interactive online SVG. This opens a canvas in the browser that allows you to explore the cartoon. See Section 4 for more information about this.

The raw SVG file (which is the base file used to generate all the other formats) can be downloaded if you want to edit the cartoon; see Section 6.

You can also choose to download a static PNG image or PDF version of the cartoon. If you are planning to use the image in a report or paper, PDF is preferable since it uses vector graphics, which remain sharp at all sizes and resolutions.

Throughout the Pro-origami web interface you will find help links ("← Help") that, when hovered over with the mouse, pop up contextual help for particular settings or options. An example of this is shown to the right.

If the protein you are trying to view is made up of several domains, you will be prompted to choose which one you would like to view. If you would like to see the entire protein structure in a single diagram, you can have Pro-origami generate the diagram for that protein; see Section 3.

3. Generating custom Pro-origami cartoons

When generating a cartoon, in addition to being able to select the protein by specifying the PDB or SCOP identifier, you can upload a PDB file that will then be used to generate the cartoon. PDB file uploads are limited to 10MB in size.

When generating cartoons you are able to select whether the entire structure of the protein will be shown in a single diagram, or whether it will be decomposed into individual domains that can be viewed separately (the default for pre-generated cartoons).

You can also select which program will be used to determine the secondary structure for the protein.

You can use any of a number of advanced options on this screen to control the properties of the cartoon. Information about all these options can be found in the "← Help" links to the right of each option.

The output options for the generated cartoons are the same as for the pre-generated cartoons, as discussed in Section 2.

4. Interactive viewing of Pro-origami cartoons

Upon loading the interactive online SVG you will see a screen like the one below. At the top is the SVG Pro-origami cartoon drawn in a canvas that allows panning and zooming. Below that is a set of controls and explanatory text. You can resize the browser window, but the Tableau Search options might not be visible if the window is made too narrow.

When the page loads, the diagram will be zoomed to fit in the window. If you resize the browser window, you can click the "Zoom to fit view" button to return to a comfortable zoom level that shows the entire diagram. You can click and drag an empty point on the canvas to pan the diagram around. You can scroll up or down on the canvas at a particular point to zoom in or out on the diagram.

Hovering your mouse over a connector will highlight that line by widening it. A tooltip will show information about that object. Clicking on an SSE (strand or helix) will select or deselect it for substructure searches; see Section 5. Selected SSEs are drawn with a red border highlight.

There is a button that will open the protein structure in 3D in a new window. This visualisation is a Jmol Java applet. This can be useful when you want to compare the 2D cartoon representation against a 3D representation of the secondary structure. Pro-origami uses similar colour coding to Jmol to make the comparisons easier.

Once the Jmol Java applet has loaded, you can click on the 3D structure and drag the protein to rotate it in 3D and allow greater understanding of the structure.

5. Substructure search

From the interactive online view of Pro-origami cartoons you can perform substructure searches to find and display proteins containing similar structures. To do this you first select some number of SSEs (helices or strands) by clicking on them with the mouse. SSEs may be selected either one at a time, or whole sheets or helix clusters can be selected if the "Select sheets or helix clusters" mode is enabled. Selecting the "N" or "C" terminus on a chain will select the entire chain. When selected, SSEs will be highlighted with a red border, as shown in the second image in Section 4. You then select the type of substructure search you wish to perform. Some of these searches may take some time, and you may have to enable pop-up windows in your browser to see search results.

The SA Tableau Search button will run SA Tableau Search with 4096 restarts, for accurate but slower searches. The Fast SA Tableau Search button will run SA Tableau Search with 128 restarts, for faster but less accurate searches. The QP Tableau Search button will run the older QP Tableau Search program, also for slower but reasonably accurate searches.

SA Tableau Search and QP Tableau Search will queue a batch job and return a link that can be checked for the results; this could take many hours depending on the query and the load on the server. The Fast SA Tableau Search will run the query immediately and return the results directly to your browser; this should only take a few minutes at most.

There are options controlling selection of SSEs that can be adjusted so that entire sheets or helix clusters are selected or deselected by clicking on one of the member strands or helices.

6. Editing Pro-origami cartoons

If you want to edit properties or styles of the generated SVG files, such as changing the colours of objects, or increasing the width of connectors, this would best be done in an SVG editor. A good option for this is Inkscape, which is an open source SVG editor that runs on Microsoft Windows, Linux and Mac OS X. Inkscape is available for free download from the Inkscape website.
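
For simple, repetitive style changes such as widening every connector, a short script can be an alternative to an interactive editor. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. It assumes connectors are stored as standard SVG path or line elements and that stroke width is set through the stroke-width attribute rather than an inline style attribute; neither assumption has been checked against actual Pro-origami output. The function name widen_strokes is hypothetical. Note also that ElementTree may rename non-SVG namespace prefixes when writing the file back out, so keep a copy of the original if you intend to reload it into another tool.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def widen_strokes(svg_text, tags=("path", "line"), width="3"):
    """Return a copy of svg_text with stroke-width set on the given element types.

    Hypothetical helper: the choice of tags is an assumption about how
    connectors are represented in the SVG file.
    """
    # Serialise SVG elements without an "ns0:" style prefix.
    ET.register_namespace("", SVG_NS)
    root = ET.fromstring(svg_text)
    for tag in tags:
        # Elements in a namespaced document are matched as "{namespace}tag".
        for elem in root.iter("{%s}%s" % (SVG_NS, tag)):
            elem.set("stroke-width", width)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Tiny stand-in document, not real Pro-origami output.
    sample = (
        '<svg xmlns="http://www.w3.org/2000/svg">'
        '<path d="M0 0 L10 10"/><rect width="5" height="5"/>'
        "</svg>"
    )
    print(widen_strokes(sample, width="5"))
```

Only the element types listed in tags are touched, so shapes such as rect are left unchanged.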

If you want to change the layout of the generated Pro-origami cartoons, then you should edit the SVG file in Dunnart, a constraint-based diagram editor that is used by the Pro-origami webserver to automatically lay out the generated SVG files. You can download a copy of Dunnart for Microsoft Windows, Linux or Mac OS X for free from the Dunnart website. Pro-origami generated SVG files can be reloaded into Dunnart and edited interactively. The benefit of this is that Dunnart understands the structure of the strands and sheets, and constrains them to be positioned relative to each other, and it knows connectors are attached to objects and automatically reroutes the connectors when objects are moved. When you open a Pro-origami SVG file in Dunnart you will see a window like the following.

Note there are extra blue indicators shown that are not part of the diagram itself. These represent constraints such as alignment between objects, or an equal distribution of some object positions, such as the x-positions of strands in the sheet pictured here. Dunnart maintains these relationships as you modify the diagram, thus preserving the visual aspects of the secondary structure of the protein.

If what you want to do is just change the overall layout while keeping the important aspects of the protein's secondary structure enforced, then you can just do things like drag the helix cluster at the bottom of the diagram over to the right (by dragging in the cluster's boundary but not on a child object). After moving a couple of other objects around you can easily end up with the diagram pictured below. Dunnart understands that you are editing the layout and will not let you change the structure of the diagram, i.e., you can't disconnect a connector from one shape and attach it to another. Note, the page can be set to exactly contain the diagram by right-clicking on the canvas and selecting "Fit page to entire diagram".

If you want to edit the structural layout to a greater extent, this is possible in Dunnart but you will then likely lose the benefits of representing the secondary structure in cartoon form, unless you carefully follow the layout conventions yourself. Dunnart allows you to edit the constraints, such as adding or removing a shape from an alignment relationship by holding the ALT key as you drag a shape. You can also click on a distribution relationship indicator and two handles will appear that will allow you to set the spacing between objects in the distribution. There is also an option in the layout settings where you can turn off non-overlap constraints if you would really like overlapping objects in your final diagram.

Once you have finished laying out the diagram, you can save the SVG file and use Inkscape or another SVG editor to convert the diagram to a PDF or PNG file.
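
If you have many diagrams to convert, the conversion can also be scripted instead of done through the editor's menus. The sketch below builds an Inkscape command line from Python. It assumes Inkscape 1.0 or later, where the --export-filename option selects the output file and the format is inferred from its extension; older Inkscape releases used different options. The function names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def inkscape_export_command(svg_path: str, fmt: str = "pdf") -> List[str]:
    """Build an Inkscape (1.0+) command that exports svg_path to PDF or PNG."""
    if fmt not in ("pdf", "png"):
        raise ValueError("fmt must be 'pdf' or 'png'")
    # Write the output next to the input, changing only the extension.
    out_path = Path(svg_path).with_suffix("." + fmt)
    return ["inkscape", "--export-filename=" + str(out_path), str(svg_path)]


def export_cartoon(svg_path: str, fmt: str = "pdf") -> None:
    """Run the export; requires the inkscape executable to be on the PATH."""
    if shutil.which("inkscape") is None:
        raise RuntimeError("inkscape was not found on the PATH")
    subprocess.run(inkscape_export_command(svg_path, fmt), check=True)


if __name__ == "__main__":
    # Show the command without running it; "cartoon.svg" is a placeholder name.
    print(inkscape_export_command("cartoon.svg", "png"))
```

Building the command separately from running it makes it easy to inspect what will be executed before converting a whole directory of files.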

7. Troubleshooting

This section lists some common problems encountered when using the Pro-origami server, and how to fix them.

When trying to generate a cartoon, it takes a long time and eventually I get an error message such as:

Diagram layout failed (35072)

This is due to the server time limit being reached on one of the diagram layout processes. Generally this only happens on very large structures, especially if they have not been split into domains. The first thing to try is to ensure that the structure is decomposed into domains by selecting the "Decompose into domains" option under "2) Choose how to handle domains:". Also ensure that the DDOMAIN domain decomposition method is used rather than Cath CDF by selecting "DDOMAIN" for Domain decomposition in the advanced options (check the "Show advanced options" box to show these options). This is because the structure might not have an entry in the Cath CDF, so will not be decomposed into domains with that option, while the DDOMAIN program can decompose any structure into domains. Note however that the DDOMAIN program itself might take a very long time for some structures.

Note that you can still have multiple domains displayed on a single cartoon when domain decomposition is used (rather than each domain being a separately selected cartoon). This is done by selecting the "multidomain cartoon" option in the first option box (just under Domain decomposition) in the advanced options. The difference between this and not doing domain decomposition at all is that when this option is used and domain decomposition is performed, the diagram for each domain is laid out separately, and then the domains are reassembled onto a single cartoon, rather than the layout algorithms having to operate on the entire structure. It is the latter operation that is more likely to lead to a timeout on very large structures.

If this still does not solve the problem, you can download the Pro-origami software and install it on your own computer, where you can run it without any limits on the amount of time or memory it uses.

Some lines (connectors) are too close together when they are running parallel to each other, for example the red and green connectors in the 1BAG structure.
Regenerate the cartoon (see Section 3) with an increased value in the "connector path separation ('nudge') distance" option under the "Connector routing options" section. You must first select the "Show advanced options" checkbox on the "Generate cartoon" form to reveal these extra options. This will force parallel lines further apart. This option has a default value of 4 and can be increased up to 10.

I want to change the colours used in the cartoon, without having to edit the SVG in an external editor.
Options for setting the colours are revealed by selecting the "Show advanced options" checkbox on the "Generate cartoon" form (see Section 3). As well as a number of automated SSE colouring schemes, colours can be set manually for each SSE by selecting the "color element types in order" radio button, which will reveal a list of colours to set for each SSE type. Similar options are provided to set connector (line) colours.

There used to be lots of options for controlling domain decomposition, cartoon layout, colours, helix clustering and other options. Where did they all go?
These options are still available, but in order to avoid excessive clutter in the "Generate cartoon" form, they are now only revealed when the "Show advanced options" checkbox is selected.

The structure I enter into the "Retrieve cartoon" form to retrieve a pre-generated cartoon from the database results in an "XML Parsing Error: prefix not bound to a namespace" error (SVG image format) or a blank screen (PNG image format).
This can happen due to an error in the cartoon generation process in building (or updating) the database, or if the weekly automatic PDB update just happens to be updating that cartoon in the database at the time you are trying to retrieve it. The solution is to use the Generate cartoon form to generate a new cartoon for that structure. This process will use the PDB (or ASTRAL) file on the server to generate a new cartoon for the structure, completely separately from the cartoon database.

I want to generate a cartoon for a newly solved structure, but don't feel comfortable with uploading the co-ordinates to this server before the PDB is released.
I can assure you that we are not in the business of extracting data from uploaded PDB format files, but you don't have to trust us! You can download the Pro-origami software and install it on your own computer, although this is likely to be a lot more effort than just using this webserver.

I'm trying to get a cartoon for a recently released PDB structure and I get a "no cartoon found" message when trying to retrieve a cartoon or a "Could not find a PDB file" message when trying to generate a cartoon.
The Pro-origami server updates its local copy of the PDB weekly, so PDB files released within the last week might not be available on the Pro-origami server yet. Rather than waiting for new structures to be available on the Pro-origami server, you can download the PDB file from the RCSB Protein Data Bank yourself, and generate a cartoon for it by using the "Upload a PDB file" option on the generate cartoon form on the Pro-origami server.
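
Fetching the PDB file yourself can be scripted. The sketch below downloads a file from the RCSB and checks it against the 10MB upload limit mentioned in Section 3. The download URL pattern is an assumption about the RCSB file service and may change, the limit is read here as 10,000,000 bytes (the exact cutoff used by the server is not stated), and the function names are hypothetical.

```python
import os
import urllib.request

# Assumed interpretation of the "10MB" upload limit from Section 3.
MAX_UPLOAD_BYTES = 10 * 1000 * 1000


def rcsb_pdb_url(pdb_id: str) -> str:
    """Return the assumed RCSB download URL for a four-character PDB identifier."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError("expected a four-character PDB identifier")
    return "https://files.rcsb.org/download/%s.pdb" % pdb_id


def fetch_pdb(pdb_id: str, dest_dir: str = ".") -> str:
    """Download the PDB file and confirm it is small enough to upload."""
    url = rcsb_pdb_url(pdb_id)
    dest = os.path.join(dest_dir, pdb_id.strip().upper() + ".pdb")
    urllib.request.urlretrieve(url, dest)
    if os.path.getsize(dest) > MAX_UPLOAD_BYTES:
        raise RuntimeError("%s exceeds the assumed upload limit" % dest)
    return dest


if __name__ == "__main__":
    # Only prints the URL; call fetch_pdb("1BAG") to actually download.
    print(rcsb_pdb_url("1bag"))
```

The resulting file can then be submitted through the "Upload a PDB file" option on the generate cartoon form.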

8. Further information

Further information on Pro-origami and the other programs that it uses can be found in the About Pro-origami page and the references listed therein. This page also contains links to download the Pro-origami software to run on your own machine. Further details on the background and internals of Pro-origami can be found in Chapter 7 of Stivala's PhD thesis.