Skip to content

Commit

Permalink
Merge pull request #22 from Goosang-Yu/develop
Browse files Browse the repository at this point in the history
🔖 Release v0.8.1
  • Loading branch information
Goosang-Yu authored Aug 9, 2023
2 parents 3eb1e1c + 30a376b commit 7781bac
Show file tree
Hide file tree
Showing 3 changed files with 33 additions and 26 deletions.
55 changes: 31 additions & 24 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -95,11 +95,11 @@ df_out = spcas.predict(list_target)

>>> df_out
```
| | Sequence | SpCas9 |
| - | ------------------------------ | -------- |
| 0 | TCACCTTCGTTTTTTTCCTTCTGCAGGAGG | 2.801168 |
| 1 | CCTTCGTTTTTTTCCTTCTGCAGGAGGACA | 2.253283 |
| 2 | CTTTCAAGAACTCTTCCACCTCCATGGTGT | 53.43183 |
| | Target | Spacer | SpCas9 |
| ------ | ------------------------------ | -------------------- | -------- |
| 0 | TCACCTTCGTTTTTTTCCTTCTGCAGGAGG | CTTCGTTTTTTTCCTTCTGC | 2.801172 |
| 1 | CCTTCGTTTTTTTCCTTCTGCAGGAGGACA | CGTTTTTTTCCTTCTGCAGG | 2.253288 |
| 2 | CTTTCAAGAACTCTTCCACCTCCATGGTGT | CAAGAACTCTTCCACCTCCA | 53.43182 |


## Tutorial 2: Predict SpCas9variants activity (by DeepSpCas9variants)
Expand All @@ -125,11 +125,11 @@ df_out = cas_ng.predict(list_target30)

>>> df_out
```
| | Sequence | SpCas9-NG |
| - | ------------------------------ | --------- |
| 0 | TCACCTTCGTTTTTTTCCTTCTGCAGGAGG | 0.618299 |
| 1 | CCTTCGTTTTTTTCCTTCTGCAGGAGGACA | 1.134845 |
| 2 | CTTTCAAGAACTCTTCCACCTCCATGGTGT | 36.74358 |
| | Target | Spacer | SpCas9-NG |
| - | ------------------------------ | -------------------- | --------- |
| 0 | TCACCTTCGTTTTTTTCCTTCTGCAGGAGG | CTTCGTTTTTTTCCTTCTGC | 0.618299 |
| 1 | CCTTCGTTTTTTTCCTTCTGCAGGAGGACA | CGTTTTTTTCCTTCTGCAGG | 1.134845 |
| 2 | CTTTCAAGAACTCTTCCACCTCCATGGTGT | CAAGAACTCTTCCACCTCCA | 36.74358 |

## Tutorial 3: Predict Prime editing efficiency (by DeepPrime and DeepPrime-FT)
![](docs/images/Fig3_DeepPrime_architecture.svg)
Expand All @@ -149,26 +149,33 @@ alt_type = 'sub1'
df_pe = prd.pe_score(seq_wt, seq_ed, alt_type)
df_pe.head()
```
output:
| | ID | WT74_On | Edited74_On | PBSlen | RTlen | RT-PBSlen | Edit_pos | Edit_len | RHA_len | type_sub | type_ins | type_del | Tm1 | Tm2 | Tm2new | Tm3 | Tm4 | TmD | nGCcnt1 | nGCcnt2 | nGCcnt3 | fGCcont1 | fGCcont2 | fGCcont3 | MFE3 | MFE4 | DeepSpCas9_score | PE2max_score |
|---:|:-------|:---------------------------------------------------------------------------|:---------------------------------------------------------------------------|---------:|--------:|------------:|-----------:|-----------:|----------:|-----------:|-----------:|-----------:|--------:|--------:|---------:|---------:|--------:|---------:|----------:|----------:|----------:|-----------:|-----------:|-----------:|-------:|-------:|-------------------:|------------------:|
| 0 | Sample | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGAAGAACTATAACCTGCAAATG | xxxxxxxxxxxxxxCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGACGxxxxxxxxxxxxxxxxxx | 7 | 35 | 42 | 34 | 1 | 1 | 1 | 0 | 0 | 16.191 | 62.1654 | 62.1654 | -277.939 | 58.2253 | -340.105 | 5 | 16 | 21 | 71.4286 | 45.7143 | 50 | -10.4 | -0.6 | 45.9675 | 0.0202249 |
| 1 | Sample | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGAAGAACTATAACCTGCAAATG | xxxxxxxxxxxxxCCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGACGxxxxxxxxxxxxxxxxxx | 8 | 35 | 43 | 34 | 1 | 1 | 1 | 0 | 0 | 30.1995 | 62.1654 | 62.1654 | -277.939 | 58.2253 | -340.105 | 6 | 16 | 22 | 75 | 45.7143 | 51.1628 | -10.4 | -0.6 | 45.9675 | 0.0541608 |
| 2 | Sample | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGAAGAACTATAACCTGCAAATG | xxxxxxxxxxxxACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGACGxxxxxxxxxxxxxxxxxx | 9 | 35 | 44 | 34 | 1 | 1 | 1 | 0 | 0 | 33.7839 | 62.1654 | 62.1654 | -277.939 | 58.2253 | -340.105 | 6 | 16 | 22 | 66.6667 | 45.7143 | 50 | -10.4 | -0.6 | 45.9675 | 0.051455 |
| 3 | Sample | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGAAGAACTATAACCTGCAAATG | xxxxxxxxxxxCACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGACGxxxxxxxxxxxxxxxxxx | 10 | 35 | 45 | 34 | 1 | 1 | 1 | 0 | 0 | 38.5141 | 62.1654 | 62.1654 | -277.939 | 58.2253 | -340.105 | 7 | 16 | 23 | 70 | 45.7143 | 51.1111 | -10.4 | -0.6 | 45.9675 | 0.0826205 |
| 4 | Sample | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGAAGAACTATAACCTGCAAATG | xxxxxxxxxxACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGACGxxxxxxxxxxxxxxxxxx | 11 | 35 | 46 | 34 | 1 | 1 | 1 | 0 | 0 | 40.8741 | 62.1654 | 62.1654 | -277.939 | 58.2253 | -340.105 | 7 | 16 | 23 | 63.6364 | 45.7143 | 50 | -10.4 | -0.6 | 45.9675 | 0.0910506 |
| | Target | Spacer | RT-PBS | PBSlen | RTlen | RT-PBSlen | Edit_pos | Edit_len | RHA_len | PE2max_score |
| - | ------------------------------------------------- | ------------------------------ | ---------------------------------------------- | ------ | ----- | --------- | -------- | -------- | ------- | ------------ |
| 0 | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGA... | ATAAAAGACAACACCCTTGCCTTGTGGAGT | CGTCTCAGTTTCTGGGAGCTTTGAAAACTCCACAAGGCAAGG | 7 | 35 | 42 | 34 | 1 | 1 | 0.904907 |
| 1 | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGA... | ATAAAAGACAACACCCTTGCCTTGTGGAGT | CGTCTCAGTTTCTGGGAGCTTTGAAAACTCCACAAGGCAAGGG | 8 | 35 | 43 | 34 | 1 | 1 | 2.377118 |
| 2 | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGA... | ATAAAAGACAACACCCTTGCCTTGTGGAGT | CGTCTCAGTTTCTGGGAGCTTTGAAAACTCCACAAGGCAAGGGT | 9 | 35 | 44 | 34 | 1 | 1 | 2.613841 |
| 3 | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGA... | ATAAAAGACAACACCCTTGCCTTGTGGAGT | CGTCTCAGTTTCTGGGAGCTTTGAAAACTCCACAAGGCAAGGGTG | 10 | 35 | 45 | 34 | 1 | 1 | 3.643573 |
| 4 | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGA... | ATAAAAGACAACACCCTTGCCTTGTGGAGT | CGTCTCAGTTTCTGGGAGCTTTGAAAACTCCACAAGGCAAGGGTGT | 11 | 35 | 46 | 34 | 1 | 1 | 3.770234 |



It is also possible to predict other cell lines (A549, DLD1...) and PE systems (PE2max, PE4max...).
#### if you wanna see biofeatures

```python
from genet import predict as prd
df_pe = prd.pe_score(seq_wt, seq_ed, alt_type, show_features=True)
df_pe.head()
```
| | ID | WT74_On | Edited74_On | PBSlen | RTlen | RT-PBSlen | Edit_pos | Edit_len | RHA_len | type_sub | type_ins | type_del | Tm1 | Tm2 | Tm2new | Tm3 | Tm4 | TmD | nGCcnt1 | nGCcnt2 | nGCcnt3 | fGCcont1 | fGCcont2 | fGCcont3 | MFE3 | MFE4 | DeepSpCas9_score | PE2max_score |
| - | ------ | -------------------------------------------------------------------------- | -------------------------------------------------------------------------- | ------ | ----- | --------- | -------- | -------- | ------- | -------- | -------- | -------- | -------- | ------- | ------- | --------- | -------- | --------- | ------- | ------- | ------- | -------- | -------- | -------- | ------ | ----- | ---------------- | ------------ |
| 0 | Sample | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGAAGAACTATAACCTGCAAATG | xxxxxxxxxxxxxxCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGACGxxxxxxxxxxxxxxxxxx | 7 | 35 | 42 | 34 | 1 | 1 | 1 | 0 | 0 | 16.19097 | 62.1654 | 62.1654 | \-277.939 | 58.22525 | \-340.105 | 5 | 16 | 21 | 71.42857 | 45.71429 | 50 | \-10.4 | \-0.6 | 45.96754 | 0.904907 |
| 1 | Sample | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGAAGAACTATAACCTGCAAATG | xxxxxxxxxxxxxCCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGACGxxxxxxxxxxxxxxxxxx | 8 | 35 | 43 | 34 | 1 | 1 | 1 | 0 | 0 | 30.19954 | 62.1654 | 62.1654 | \-277.939 | 58.22525 | \-340.105 | 6 | 16 | 22 | 75 | 45.71429 | 51.16279 | \-10.4 | \-0.6 | 45.96754 | 2.377118 |
| 2 | Sample | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGAAGAACTATAACCTGCAAATG | xxxxxxxxxxxxACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGACGxxxxxxxxxxxxxxxxxx | 9 | 35 | 44 | 34 | 1 | 1 | 1 | 0 | 0 | 33.78395 | 62.1654 | 62.1654 | \-277.939 | 58.22525 | \-340.105 | 6 | 16 | 22 | 66.66667 | 45.71429 | 50 | \-10.4 | \-0.6 | 45.96754 | 2.613841 |
| 3 | Sample | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGAAGAACTATAACCTGCAAATG | xxxxxxxxxxxCACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGACGxxxxxxxxxxxxxxxxxx | 10 | 35 | 45 | 34 | 1 | 1 | 1 | 0 | 0 | 38.51415 | 62.1654 | 62.1654 | \-277.939 | 58.22525 | \-340.105 | 7 | 16 | 23 | 70 | 45.71429 | 51.11111 | \-10.4 | \-0.6 | 45.96754 | 3.643573 |
| 4 | Sample | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGAAGAACTATAACCTGCAAATG | xxxxxxxxxxACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGACGxxxxxxxxxxxxxxxxxx | 11 | 35 | 46 | 34 | 1 | 1 | 1 | 0 | 0 | 40.87411 | 62.1654 | 62.1654 | \-277.939 | 58.22525 | \-340.105 | 7 | 16 | 23 | 63.63636 | 45.71429 | 50 | \-10.4 | \-0.6 | 45.96754 | 3.770234 |


seq_wt = 'ATGACAATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGAAGAACTATAACCTGCAAATGTCAACTGAAACCTTAAAGTGAGTATTTAATTGAGCTGAAGT'
seq_ed = 'ATGACAATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGACGAACTATAACCTGCAAATGTCAACTGAAACCTTAAAGTGAGTATTTAATTGAGCTGAAGT'
alt_type = 'sub1'
#### It is also possible to predict other cell lines (A549, DLD1...) and PE systems (PE2max, PE4max...).

```python
df_pe = prd.pe_score(seq_wt, seq_ed, alt_type, sID='MyGene', pe_system='PE4max', cell_type='A549')
```

Expand Down
2 changes: 1 addition & 1 deletion genet/predict/functional.py
Original file line number Diff line number Diff line change
Expand Up @@ -1022,8 +1022,8 @@ def get_extension(masked_seq:str):
df['Target'] = df_all['WT74_On']
df['Spacer'] = list_Guide30
df['RT-PBS'] = df_all['Edited74_On'].apply(get_extension)
df['%s_score' % pe_system] = df_all['%s_score' % pe_system]
df = pd.concat([df,df_all.iloc[:, 3:9]],axis=1)
df['%s_score' % pe_system] = df_all['%s_score' % pe_system]

return df

Expand Down
2 changes: 1 addition & 1 deletion genet/utils/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@
# python setup.py bdist_wheel
# twine upload dist/genet-0.5.2-py3-none-any.whl

version_ = '0.8.0'
version_ = '0.8.1'

__version__ = version_

0 comments on commit 7781bac

Please sign in to comment.