Commit

Update help.md
sridevan authored Sep 15, 2024
1 parent 35b33e5 commit c6f4f70
Showing 1 changed file with 11 additions and 11 deletions.
22 changes: 11 additions & 11 deletions help.md
@@ -31,7 +31,7 @@ A range of residue numbers can be provided, separating the lower and upper numbe
##### Multiple ranges of residues
This option consists of entering multiple single ranges of nucleotide numbers separated by commas. This is especially helpful when the motif is an internal loop, a junction loop, or a long-range interaction motif. The general format is “**start1:end1,start2:end2**” and repeat for as many ranges as are needed. For example, to get both strands of the E. coli SSU decoding loop, one would type 1405:1409,1491:1496. [Link to example of multiple ranges of residue numbers](http://rna.bgsu.edu/correspondence/comparison?selection=1405:1409,1491:1496&pdb=5J7L&chain=AA&exp_method=all&resolution=3.0&scope=EC&input_form=True).
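
The multiple-range format can be illustrated with a small parser. This is a minimal sketch, not part of R3DMCS: the function name `parse_selection` is hypothetical, and it assumes plain integer residue numbers (no insertion codes). A single number without a colon is treated as a one-residue range.

```python
def parse_selection(selection: str) -> list[tuple[int, int]]:
    """Split a selection such as 'start1:end1,start2:end2' into (start, end) pairs."""
    ranges = []
    for part in selection.split(","):
        # partition returns ('1405', ':', '1409') for a range
        # and ('1405', '', '') for a single residue number.
        start, _, end = part.strip().partition(":")
        ranges.append((int(start), int(end or start)))
    return ranges


# Both strands of the E. coli SSU decoding loop from the example above.
print(parse_selection("1405:1409,1491:1496"))
# → [(1405, 1409), (1491, 1496)]
```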

#### Loop ID
#### Loop identifier
Each week, the BGSU data processing pipeline extracts hairpin, internal, and junction loops from RNA-containing 3D structure files using the FR3D software. Once the loops are extracted, we label them with unique and stable identifiers. These “loop ids” contain the following three fields, separated by underscores:
- Field 1: Loop type prefix: “HL” for hairpin loops, “IL” for internal loops, “J3” for three-way junctions
- Field 2: PDB ID
@@ -41,7 +41,7 @@ Users can view the loop_ids for a particular RNA structure by exploring the RNA

To specify a R3DMCS query using a loop id, type the loop id in the Selection box, and leave the PDB id and Chain id boxes empty. [Link to example of using loop id to specify a query](http://rna.bgsu.edu/correspondence/comparison?selection=J3_5TBW_003&exp_method=all&resolution=3.0&scope=EC&input_form=True).
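
As a rough illustration of the three underscore-separated fields, a loop id can be split as follows. This is a sketch only: `LoopId` and `parse_loop_id` are hypothetical names, and because the third field is not described in this excerpt, it is kept as an opaque string.

```python
from typing import NamedTuple


class LoopId(NamedTuple):
    loop_type: str    # Field 1: loop type prefix, e.g. "HL", "IL", "J3"
    pdb_id: str       # Field 2: PDB ID
    third_field: str  # Field 3: kept as-is; its meaning is not covered here


def parse_loop_id(loop_id: str) -> LoopId:
    """Split a loop id into its three underscore-separated fields."""
    parts = loop_id.split("_")
    if len(parts) != 3:
        raise ValueError(f"expected three fields in loop id, got {loop_id!r}")
    return LoopId(*parts)


print(parse_loop_id("J3_5TBW_003"))
# → LoopId(loop_type='J3', pdb_id='5TBW', third_field='003')
```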

#### Unit ID
#### Unit identifier
Unit ids allow full ability to specify a particular residue. Unit ids are described on the [unit_id page](https://www.bgsu.edu/research/rna/help/rna-3d-hub-help/unit-ids.html). For example, 5J7L|1|AA|G|1405 and 5J7L|1|AA|C|1496 are unit ids for one of the flanking pairs in the E. coli SSU decoding loop. Each unit id tells the PDB id, the model number, the chain, the sequence of the residue, and the residue number, and can optionally also tell the insertion code, alternate id, and symmetry operator. [Link to example of using unit ids to specify a query](http://rna.bgsu.edu/correspondence/comparison?selection=8GLP%7C1%7CL5%7CG%7C1561,8GLP%7C1%7CL5%7CG%7C1562,8GLP%7C1%7CL5%7CA%7C1563,8GLP%7C1%7CL5%7CA%7C1564,8GLP%7C1%7CL5%7CA%7C1565,8GLP%7C1%7CL5%7CC%7C1566&input_form=True).
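
The sketch below shows one way to take a unit id apart and to build the `selection` value for a URL from several unit ids. The helper names are hypothetical. Only the five leading fields named above are interpreted; any optional trailing fields are returned unparsed, since their order is defined on the unit_id page rather than here.

```python
from urllib.parse import quote


def parse_unit_id(unit_id: str) -> dict:
    """Split a unit id such as '5J7L|1|AA|G|1405' into its leading fields."""
    fields = unit_id.split("|")
    if len(fields) < 5:
        raise ValueError(f"expected at least five fields, got {unit_id!r}")
    pdb_id, model, chain, sequence, number = fields[:5]
    return {
        "pdb_id": pdb_id,
        "model": int(model),
        "chain": chain,
        "sequence": sequence,
        "number": int(number),
        "optional": fields[5:],  # left unparsed on purpose
    }


def selection_from_unit_ids(unit_ids: list[str]) -> str:
    """Join unit ids with commas and percent-encode '|' as %7C, as in the link above."""
    return quote(",".join(unit_ids), safe=",")


print(parse_unit_id("5J7L|1|AA|G|1405")["number"])
# → 1405
print(selection_from_unit_ids(["5J7L|1|AA|G|1405", "5J7L|1|AA|C|1496"]))
# → 5J7L%7C1%7CAA%7CG%7C1405,5J7L%7C1%7CAA%7CC%7C1496
```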

#### Special cases
@@ -99,7 +99,7 @@ For multiple ranges of nucleotides plus an individual nucleotide:

Note that we are including the resolution threshold in these examples, to provide smaller and faster queries.

### Scope and Depth
### Scope and depth
To specify the scope, use the "scope" key with the value **EC** for **retrievals across the equivalence class of same molecule, same species**, or **Rfam** for **retrievals across species**.
- [http://rna.bgsu.edu/correspondence/comparison?selection=HL_5TBW_007&scope=EC](http://rna.bgsu.edu/correspondence/comparison?selection=HL_5TBW_007&scope=EC)
- [http://rna.bgsu.edu/correspondence/comparison?selection=HL_5TBW_007&scope=Rfam](http://rna.bgsu.edu/correspondence/comparison?selection=HL_5TBW_007&scope=Rfam)
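
The two URLs above can also be assembled programmatically. The following is a minimal sketch using only the Python standard library. The function name `comparison_url` is hypothetical, and the scope check reflects only the two values documented here.

```python
from urllib.parse import urlencode

BASE_URL = "http://rna.bgsu.edu/correspondence/comparison"
VALID_SCOPES = {"EC", "Rfam"}  # the two scope values described above


def comparison_url(selection: str, scope: str, **extra) -> str:
    """Build an R3DMCS comparison URL from a selection, a scope, and optional keys."""
    if scope not in VALID_SCOPES:
        raise ValueError(f"scope must be one of {sorted(VALID_SCOPES)}, got {scope!r}")
    params = {"selection": selection, "scope": scope, **extra}
    return f"{BASE_URL}?{urlencode(params)}"


print(comparison_url("HL_5TBW_007", "EC"))
# → http://rna.bgsu.edu/correspondence/comparison?selection=HL_5TBW_007&scope=EC
print(comparison_url("HL_5TBW_007", "Rfam", depth=1))
# → http://rna.bgsu.edu/correspondence/comparison?selection=HL_5TBW_007&scope=Rfam&depth=1
```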
@@ -121,13 +121,13 @@ To specify the experimental technique, use the exp_method key and values all, x-

Note that when exp_method is nmr, then the resolution will be set to all, so that NMR structures will be found and returned.

### PDB IDs to exclude
### PDB identifiers to exclude
To exclude specific PDB ids, for example because they have an unusual feature that distracts from the main data analysis, use the exclude key and list PDB ids separated by commas:
- [https://rna.bgsu.edu/correspondence/comparison?pdb=5J7L&chain=AA&selection=1405%3A1409&exp_method=all&resolution=3.0&scope=EC&exclude=8GHU,8G2U](https://rna.bgsu.edu/correspondence/comparison?pdb=5J7L&chain=AA&selection=1405%3A1409&exp_method=all&resolution=3.0&scope=EC&exclude=8GHU,8G2U)

In the URL above, we modify the earlier example of nucleotides 1405 to 1409 from chain AA of 5J7L to exclude 8GHU and 8G2U. The instance from 8GHU has the modified nucleotide ZIV in position 1405 and so no discrepancy was calculated. The instance from 8G2U had a large discrepancy with nearly every other nucleotide, causing the color range to be mostly blue between all other instances. Excluding those instances makes it possible to better discern the structure of the remaining instances; the maximum discrepancy dropped from 1.26 to 0.39.

Another good example is obtained from the E. coli SSU basepair 1405 with 1496; in the query above, two instances have the C modeled in syn, which are real outliers compared to the other instances. Excluding 4V9O and 4V9P together with 8G2U and 8GHU allows us to focus on the other instances.
Another good example is obtained from the *E. coli* SSU basepair 1405 with 1496; in the query above, two instances have the C modeled in syn, which are real outliers compared to the other instances. Excluding 4V9O and 4V9P together with 8G2U and 8GHU allows us to focus on the other instances.
- [https://rna.bgsu.edu/correspondence/comparison?pdb=5J7L&chain=AA&selection=1405%2C1496&exp_method=all&resolution=3.0&scope=EC&exclude=4V9P,4V9O,8G2U,8GHU](https://rna.bgsu.edu/correspondence/comparison?pdb=5J7L&chain=AA&selection=1405%2C1496&exp_method=all&resolution=3.0&scope=EC&exclude=4V9P,4V9O,8G2U,8GHU)
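
When outliers are found one at a time, it can be convenient to append PDB ids to the exclude key of an existing query URL rather than retyping it. The sketch below shows one way to do that with the standard library. The function name `with_exclusions` is hypothetical, and the sketch assumes each query key appears at most once in the URL.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_exclusions(url: str, pdb_ids: list[str]) -> str:
    """Return url with pdb_ids merged into its comma-separated exclude key."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))
    # Keep ids that are already excluded, then append new ones without duplicates.
    merged = [p for p in params.get("exclude", "").split(",") if p]
    for pdb_id in pdb_ids:
        if pdb_id not in merged:
            merged.append(pdb_id)
    params["exclude"] = ",".join(merged)
    # safe="," keeps the commas readable, matching the URLs shown above.
    return urlunsplit(parts._replace(query=urlencode(params, safe=",")))


base = ("https://rna.bgsu.edu/correspondence/comparison?pdb=5J7L&chain=AA"
        "&selection=1405%3A1409&exp_method=all&resolution=3.0&scope=EC")
print(with_exclusions(base, ["8GHU", "8G2U"]))
# → https://rna.bgsu.edu/correspondence/comparison?pdb=5J7L&chain=AA&selection=1405%3A1409&exp_method=all&resolution=3.0&scope=EC&exclude=8GHU,8G2U
```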

### Pre-filled input page
@@ -144,15 +144,15 @@ After loading the input page, it takes 2 seconds for the Submit button to change
The R3DMCS output page provides query information, a table of instances, a coordinate window, an interactive heat map, and a listing of nearby chains. Each row of the table lists one instance, and shows the PDB id, model number, chain, resolution, nearby chains, nucleotide numbers, and annotated pairwise interactions. The instances are ordered by geometric similarity so that instances that are more similar to each other are placed near one another in the table. The same ordering is used in the heatmap. The heatmap is interactive; clicking the heatmap selects instances, which are then marked in the table and are shown in the coordinate window. These features of the output page are explained in detail in the context of Example 1 below.

### Example 1: *E. coli* small decoding loop
This example illustrates the dynamic nature of the decoding loop. During translation, the decoding loop in helix 44 of the small subunit ribosomal RNA makes contact with the mRNA to promote fidelity of translation. The contact is made by two adenine bases, often numbered 1492 and 1493, flipping out of the internal loop. When the mRNA is not present, the adenine bases typically stack inside the internal loop. We can see several different conformations of the internal loop with R3DMCS. We use internal loop IL_5J7L_060 from E. coli as the query. For this illustration, we use resolution threshold 3.0Å and retrieve corresponding instances across the equivalence class of E. coli small subunit ribosomal RNA 3D structures. See the [URL to produce the input page for Example 1](http://rna.bgsu.edu/correspondence/comparison?selection=IL_5J7L_060&resolution=3.0&scope=EC&input_form=True).
This example illustrates the dynamic nature of the decoding loop. During translation, the decoding loop in helix 44 of the small subunit ribosomal RNA makes contact with the mRNA to promote fidelity of translation. The contact is made by two adenine bases, often numbered 1492 and 1493, flipping out of the internal loop. When the mRNA is not present, the adenine bases typically stack inside the internal loop. We can see several different conformations of the internal loop with R3DMCS. We use internal loop IL_5J7L_060 from *E. coli* as the query. For this illustration, we use resolution threshold 3.0Å and retrieve corresponding instances across the equivalence class of *E. coli* small subunit ribosomal RNA 3D structures. See the [URL to produce the input page for Example 1](http://rna.bgsu.edu/correspondence/comparison?selection=IL_5J7L_060&resolution=3.0&scope=EC&input_form=True).

#### Query information panel
The upper left panel of the output page, shown below, shows basic information about the query and the corresponding instances. The query nucleotides come from PDB id 5J7L, model 1, chain AA. The standardized name of that chain is the small subunit ribosomal RNA, SSU for short. The query nucleotides are listed; note that residue 1407 is a modified C. Concatenating the PDB|Model|Chain with the query nucleotide sequence and number would give the full unit id, for example, 5J7L|1|AA|G|1405 for the first nucleotide. The Query Organism identifies the species of the PDB chain the query nucleotides are from. Since we chose to retrieve instances from across the equivalence class, the equivalence class identifier NR_3.0_56726.109 is shown; this indicates that the resolution threshold is 3.0Å and that the equivalence class with handle 56726 is on version 119, meaning that since the inception of this equivalence class, the membership has changed 119 times. This query has retrieved 134 instances, all of which are from E. coli small subunit ribosomal RNA 3D structures. In the all-against-all geometric comparison, the largest geometric discrepancy is 1.40, indicating a moderate level of geometric similarity even between the most dissimilar instances.
The upper left panel of the output page, shown below, shows basic information about the query and the corresponding instances. The query nucleotides come from PDB id 5J7L, model 1, chain AA. The standardized name of that chain is the small subunit ribosomal RNA, SSU for short. The query nucleotides are listed; note that residue 1407 is a modified C. Concatenating the PDB|Model|Chain with the query nucleotide sequence and number would give the full unit id, for example, 5J7L|1|AA|G|1405 for the first nucleotide. The Query Organism identifies the species of the PDB chain the query nucleotides are from. Since we chose to retrieve instances from across the equivalence class, the equivalence class identifier NR_3.0_56726.109 is shown; this indicates that the resolution threshold is 3.0Å and that the equivalence class with handle 56726 is on version 119, meaning that since the inception of this equivalence class, the membership has changed 119 times. This query has retrieved 134 instances, all of which are from *E. coli* small subunit ribosomal RNA 3D structures. In the all-against-all geometric comparison, the largest geometric discrepancy is 1.40, indicating a moderate level of geometric similarity even between the most dissimilar instances.

![Query information panel](/assets/query_panel.png)

#### Table of instances
The table of instances in the center of the output page lists all 134 instances. In the image below, we show two rows of the table. The query instance is in row 4 and indicates the PDB, model, and chain to be 5J7L, 1, AA. The structure 5J7L was solved at 3.0Å resolution. The columns numbered 1, 2, 3, indicate the query nucleotides, starting with G|1405. The column labeled "Neighboring Protein/NA Chains" indicates chains which have at least one residue within 10Å of one of the nucleotides in the instance on that row. In 5J7L, that includes the LSU rRNA and the SSU protein uS12. The instance in row 52 of the table is from PDB structure 7M5D, solved at 2.8Å. Numbering in E. coli 3D structures is quite consistent from one structure to the next, so it is no surprise that the nucleotide numbers in the columns are the same. What differs the most is the nearby chains; in 7M5D three additional chains are nearby, namely Peptide chain release factor RF-1, a tRNA, and an mRNA; their chain identifiers are indicated at the beginning of each line. As we will explain below, the user can visualize residues from these chains in the coordinate window by clicking "Show neighborhood".
The table of instances in the center of the output page lists all 134 instances. In the image below, we show two rows of the table. The query instance is in row 4 and indicates the PDB, model, and chain to be 5J7L, 1, AA. The structure 5J7L was solved at 3.0Å resolution. The columns numbered 1, 2, 3, indicate the query nucleotides, starting with G|1405. The column labeled "Neighboring Protein/NA Chains" indicates chains which have at least one residue within 10Å of one of the nucleotides in the instance on that row. In 5J7L, that includes the LSU rRNA and the SSU protein uS12. The instance in row 52 of the table is from PDB structure 7M5D, solved at 2.8Å. Numbering in *E. coli* 3D structures is quite consistent from one structure to the next, so it is no surprise that the nucleotide numbers in the columns are the same. What differs the most is the nearby chains; in 7M5D three additional chains are nearby, namely Peptide chain release factor RF-1, a tRNA, and an mRNA; their chain identifiers are indicated at the beginning of each line. As we will explain below, the user can visualize residues from these chains in the coordinate window by clicking "Show neighborhood".

![Table of instances](/assets/image1.png)

@@ -212,20 +212,20 @@ The upper right corner of the output page lists all unique names of nearby chain
![Nearby chains listing](/assets/chains_count.png)

### Example 2: *E. coli* SSU h27 internal loop
This [example](http://rna.bgsu.edu/correspondence/comparison?selection=IL_5AJ3_023&resolution=4.0&scope=Rfam&depth=1&input_form=true) studies an internal loop from the small subunit ribosomal RNA helix 27. The core of the loop is the same as the sarcin-ricin internal loop in Helix 95 of the large subunit ribosomal RNA, consisting of a GUA base triple. This recurrent internal loop motif is also called a G-bulge. This example compares instances of the loop across different species whose SSU chains map to Rfam family RF00177. The query loop is IL_5AJ3_023, which comes from chain A of PDB id 5AJ3, which is a small subunit ribosomal RNA from the mitochondrion of Sus scrofa. As it happens, there are other 3D structures of the same molecule from the same species, and 5AJ3|1|A is not the representative of the equivalence class of 3D structures, as we illustrate below by showing the table entry in the [Representative Set page](http://rna.bgsu.edu/rna3dhub/nrlist/release/3.332/4.0A) that contains 5AJ3:
This [example](http://rna.bgsu.edu/correspondence/comparison?selection=IL_5AJ3_023&resolution=4.0&scope=Rfam&depth=1&input_form=true) studies an internal loop from the small subunit ribosomal RNA helix 27. The core of the loop is the same as the sarcin-ricin internal loop in Helix 95 of the large subunit ribosomal RNA, consisting of a GUA base triple. This recurrent internal loop motif is also called a G-bulge. This example compares instances of the loop across different species whose SSU chains map to Rfam family RF00177. The query loop is IL_5AJ3_023, which comes from chain A of PDB id 5AJ3, which is a small subunit ribosomal RNA from the mitochondrion of *Sus scrofa*. As it happens, there are other 3D structures of the same molecule from the same species, and 5AJ3|1|A is not the representative of the equivalence class of 3D structures, as we illustrate below by showing the table entry in the [Representative Set page](http://rna.bgsu.edu/rna3dhub/nrlist/release/3.332/4.0A) that contains 5AJ3:

![Representative entry](/assets/ec_example.png)

Note that 6GAZ is the representative structure. The chains in this equivalence class map to Rfam family RF00177, which Rfam labels as bacterial SSU but which mitochondrial and chloroplast ribosomes also match well, because the ribosomes in those organelles originated from bacteria.

Using 5AJ3 as a starting point in the query, R3DMCS maps its 18 nucleotides to other 3D structures which also map to [Rfam family RF00177](https://rfam.org/family/SSU_rRNA_bacteria). The query has depth=1, so only one structure, the representative structure, from each equivalence class is returned. Thus an instance from 6GAZ appears in the output page, not the query from 5AJ3.

This loop is particularly interesting, because the heat map shows four structures that are quite distinct from the rest, see below where we have selected the instance from 6GAZ and the instance from 5J7L, which is from E. coli as in previous examples. The key difference is that the four structures in the lower right of the heat map are all mitochondrial ribosomes, in which position 4 in the sequence is C, whereas the other structures all have G in that position. This example shows that when the G in the base triple in the G-bulge changes to C, the base triple is lost, and the C bulges out of the motif. Apparently that is not a problem in some mitochondria, but all bacteria in the 3D structure database have G in that position, and the G participates in the base triple.
This loop is particularly interesting, because the heat map shows four structures that are quite distinct from the rest, see below where we have selected the instance from 6GAZ and the instance from 5J7L, which is from *E. coli* as in previous examples. The key difference is that the four structures in the lower right of the heat map are all mitochondrial ribosomes, in which position 4 in the sequence is C, whereas the other structures all have G in that position. This example shows that when the G in the base triple in the G-bulge changes to C, the base triple is lost, and the C bulges out of the motif. Apparently that is not a problem in some mitochondria, but all bacteria in the 3D structure database have G in that position, and the G participates in the base triple.

![Example2](/assets/example2.png)

### Example 3: *E. coli* LSU H34 hairpin loop
[Example 3](http://rna.bgsu.edu/correspondence/comparison?selection=HL_8GLP_022&exp_method=all&resolution=4.0&depth=3&scope=Rfam&input_form=True) illustrates the GNRA hairpin loop from Helix 34 of the large subunit ribosomal RNA, compared across different species in the associated Rfam family solved at resolution 4.0A or better, with up to 3 instances from each species. This example illustrates how some structures (in this case, the models of Triticum aestivum) model the top adenine base of the GNRA in syn while others model that base in anti. Other variability is also evident. The image below shows the query instance from 8GLP (human) and one instance with the top A modeled in syn; the syn/anti superposition makes a characteristically symmetric image which can be spotted relatively easily. Modeling differences such as this often explain the difference between clusters of instances.
[Example 3](http://rna.bgsu.edu/correspondence/comparison?selection=HL_8GLP_022&exp_method=all&resolution=4.0&depth=3&scope=Rfam&input_form=True) illustrates the GNRA hairpin loop from Helix 34 of the large subunit ribosomal RNA, compared across different species in the associated Rfam family solved at resolution 4.0A or better, with up to 3 instances from each species. This example illustrates how some structures (in this case, the models of *Triticum aestivum*) model the top adenine base of the GNRA in syn while others model that base in anti. Other variability is also evident. The image below shows the query instance from 8GLP (human) and one instance with the top A modeled in syn; the syn/anti superposition makes a characteristically symmetric image which can be spotted relatively easily. Modeling differences such as this often explain the difference between clusters of instances.

![Example3](/assets/example3.png)
