Commit

add functional group descriptions to data
Aariq committed Aug 13, 2024
1 parent 1342f27 commit d6976c7
Showing 4 changed files with 62 additions and 39 deletions.
12 changes: 12 additions & 0 deletions R/data.R
@@ -7,8 +7,20 @@
#' \describe{
#' \item{method}{Either "simpol1" for functional groups only used with the SIMPOL.1 method, or "meredith" for additional groups used in the Meredith et al. method.}
#' \item{functional_group}{These correspond to matching column names in the results of [get_fx_groups()].}
#' \item{description}{Functional group description from Table 5 of Pankow & Asher (2008)}
#' \item{smarts}{SMARTS strings used to capture groups, when applicable}
#' \item{fun}{The function used to capture the functional group. When `smarts` is not `NA`, this is always "[ChemmineR::smartsSearchOB]". Other groups are captured with other `ChemmineR` functions or as calculations using other functional groups.}
#' \item{notes}{Notes including how any functional group counts are corrected when there is overlap. E.g. when one SMARTS pattern is a subset of another pattern, but the two groups are counted separately without overlap in the SIMPOL.1 method.}
#' }
#'
#' @references
#' Meredith L, Ledford S, Riemer K, Geffre P, Graves K, Honeker L, LeBauer D,
#' Tfaily M, Krechmer J. 2023. Automating methods for estimating metabolite
#' volatility. Frontiers in Microbiology. \doi{10.3389/fmicb.2023.1267234}
#'
#' Pankow, J.F., Asher, W.E. 2008. SIMPOL.1: a simple group
#' contribution method for predicting vapor pressures and enthalpies of
#' vaporization of multifunctional organic compounds. Atmos. Chem. Phys.
#' \doi{10.5194/acp-8-2773-2008}
#'
"smarts_simpol1"
78 changes: 39 additions & 39 deletions data-raw/smarts_simpol1.csv
@@ -1,39 +1,39 @@
method,functional_group,smarts,fun,notes
simpol1,carbons_asa,NA,NA,Number of carbons on the acid side of an amide—not possible to capture with SMARTS
simpol1,rings_aromatic,NA,ChemmineR::rings,
simpol1,rings_total,NA,ChemmineR::rings,
simpol1,rings_aliphatic,NA,rings_total - rings_aromatic,
simpol1,carbon_dbl_bonds_aliphatic,C=C,ChemmineR::smartsSearchOB,
simpol1,CCCO_aliphatic_ring,C(C=C[AR1])(=O)[AR1],ChemmineR::smartsSearchOB,Matches C=C-C=O in a non-aromatic ring
simpol1,hydroxyl_total,NA,ChemmineR::groups,
simpol1,hydroxyl_aromatic,[OX2H]c,ChemmineR::smartsSearchOB,"This pattern also captures nitrophenols, so the number of nitrophenols is subtracted"
simpol1,hydroxyl_aliphatic,NA,hydroxyl_total - hydroxyl_aromatic,
simpol1,aldehydes,NA,ChemmineR::groups,
simpol1,ketones,NA,ChemmineR::groups,
simpol1,carbox_acids,NA,ChemmineR::groups,
simpol1,ester,NA,ChemmineR::groups,"This also captures carbonylperoxynitrates and nitroesters, so the number of carbonylperoxynitrates and nitroesters are subtracted"
simpol1,ether_total,NA,ChemmineR::groups,
simpol1,ether_alkyl,NA,ether_total - ether_alicyclic - ether_aromatic,
simpol1,ether_alicyclic,[OD2]([C!R0])[C!R0],ChemmineR::smartsSearchOB,
simpol1,ether_aromatic,"O(c)[C,c]",ChemmineR::smartsSearchOB,Only one of the carbons has to be aromatic
simpol1,nitrate,"[$([NX3](=[OX1])(=[OX1])O),$([NX3+]([OX1-])(=[OX1])O)]",ChemmineR::smartsSearchOB,"This pattern also captures carbonylperoxynitrates, so the number of carbonylperoxynitrates is subtracted"
simpol1,nitro,"[$([NX3](=O)=O),$([NX3+](=O)[O-])][!#8]",ChemmineR::smartsSearchOB,
simpol1,amine_primary,[NX3;H2;!$(NC=[!#6]);!$(NC#[!#6])][#6X4],ChemmineR::smartsSearchOB,
simpol1,amine_secondary,[NX3H1!$(NC=[!#6])!$(NC#[!#6])]([#6X4])[#6X4],ChemmineR::smartsSearchOB,
simpol1,amine_tertiary,[NX3H0!$(NC=[!#6])!$(NC#[!#6])]([#6X4])([#6X4])[#6X4],ChemmineR::smartsSearchOB,
simpol1,amine_aromatic,[NX3;!$(NO)]c,ChemmineR::smartsSearchOB,
simpol1,amide_primary,"[CX3;$([R0][#6]),$([H1R0])](=[OX1])[#7X3H2]",ChemmineR::smartsSearchOB,
simpol1,amide_secondary,"[CX3;$([R0][#6]),$([H1R0])](=[OX1])[#7X3H1][#6;!$(C=[O,N,S])]",ChemmineR::smartsSearchOB,
simpol1,amide_tertiary,"[CX3;$([R0][#6]),$([H1R0])](=[OX1])[#7X3H0]([#6;!$(C=[O,N,S])])[#6;!$(C=[O,N,S])]",ChemmineR::smartsSearchOB,
simpol1,carbonylperoxynitrate,*C(=O)OO[N+1](=O)[O-1],ChemmineR::smartsSearchOB,
simpol1,peroxide,[OX2D2][OX2D2],ChemmineR::smartsSearchOB,"This pattern also captures carbonylperoxynitrates, so the number of carbonylperoxynitrates is subtracted"
simpol1,hydroperoxide,"[OX2][OX2H,OX1-]",ChemmineR::smartsSearchOB,"This pattern also captures peroxyacids, so the number of carbonylperoxyacids is subtracted"
simpol1,carbonylperoxyacid,"[CX3;$([R0][#6]),$([H1R0])](=[OX1])[OX2][$([OX2H]),$([OX1-])]",ChemmineR::smartsSearchOB,
simpol1,nitrophenol,"[OX2H][$(c1ccccc1[$([NX3](=O)=O),$([NX3+](=O)[O-])]),$(c1cccc(c1)[$([NX3](=O)=O),$([NX3+](=O)[O-])]),$(c1ccc(cc1)[$([NX3](=O)=O),$([NX3+](=O)[O-])])]",ChemmineR::smartsSearchOB,
simpol1,nitroester,"C(=O)(OC)C~[NX3](-,=[OX1])-,=[OX1]",ChemmineR::smartsSearchOB,"This pattern captures OH groups on a ring that also has a nitro group (para, ortho, or meta)"
meredith,phosphoric_acids,"[$(P(=[OX1])([$([OX2H]),$([OX1-]),$([OX2]P)])([$([OX2H]),$([OX1-]),$([OX2]P)])[$([OX2H]),$([OX1-]),$([OX2]P)]),$([P+]([OX1-])([$([OX2H]),$([OX1-]),$([OX2]P)])([$([OX2H]),$([OX1-]),$([OX2]P)])[$([OX2H]),$([OX1-]),$([OX2]P)])]",ChemmineR::smartsSearchOB,"This pattern also captures phosphoric esters, so the number of phosphoric esters is subtracted"
meredith,phosphoric_esters,"[$(P(=[OX1])([OX2][#6])([$([OX2H]),$([OX1-]),$([OX2][#6])])[$([OX2H]),$([OX1-]),$([OX2][#6]),$([OX2]P)]),$([P+]([OX1-])([OX2][#6])([$([OX2H]),$([OX1-]),$([OX2][#6])])[$([OX2H]),$([OX1-]),$([OX2][#6]),$([OX2]P)])]",ChemmineR::smartsSearchOB,
meredith,sulfates,"[$([#16X4](=[OX1])(=[OX1])([OX2H,OX1H0-])[OX2][#6]),$([#16X4+2]([OX1-])([OX1-])([OX2H,OX1H0-])[OX2][#6])]",ChemmineR::smartsSearchOB,
meredith,sulfonates,"[#16X4](=[OX1])(=[OX1])([#6])[*$([O-1]),*$([OH1]),*$([OX2H0])]",ChemmineR::smartsSearchOB,This pattern captures sulfonate ions and their conjugate acids (sulfonic acids)
meredith,thiols,[#16X2H],ChemmineR::smartsSearchOB,
meredith,carbothioesters,S([#6])[CX3](=O)[#6],ChemmineR::smartsSearchOB,
method,functional_group,description,smarts,fun,notes
simpol1,carbons_asa,carbon number on the acid-side of an amide,NA,NA,Not possible to capture with SMARTS
simpol1,rings_aromatic,aromatic ring,NA,ChemmineR::rings,
simpol1,rings_total,,NA,ChemmineR::rings,
simpol1,rings_aliphatic,non-aromatic ring,NA,rings_total - rings_aromatic,
simpol1,carbon_dbl_bonds_aliphatic,C=C (non-aromatic),C=C,ChemmineR::smartsSearchOB,
simpol1,CCCO_aliphatic_ring,C=C-C=O in non-aromatic ring,C(C=C[AR1])(=O)[AR1],ChemmineR::smartsSearchOB,
simpol1,hydroxyl_total,,NA,ChemmineR::groups,
simpol1,hydroxyl_aromatic,"aromatic hydroxyl (e.g., phenol)",[OX2H]c,ChemmineR::smartsSearchOB,"This pattern also captures nitrophenols, so the number of nitrophenols is subtracted"
simpol1,hydroxyl_aliphatic,hydroxyl (alkyl),NA,hydroxyl_total - hydroxyl_aromatic,
simpol1,aldehydes,aldehyde,NA,ChemmineR::groups,
simpol1,ketones,ketone,NA,ChemmineR::groups,
simpol1,carbox_acids,carboxylic acid,NA,ChemmineR::groups,
simpol1,ester,ester,NA,ChemmineR::groups,"This also captures carbonylperoxynitrates and nitroesters, so the number of carbonylperoxynitrates and nitroesters are subtracted"
simpol1,ether_total,,NA,ChemmineR::groups,
simpol1,ether_alkyl,ether,NA,ether_total - ether_alicyclic - ether_aromatic,
simpol1,ether_alicyclic,ether (alicyclic),[OD2]([C!R0])[C!R0],ChemmineR::smartsSearchOB,
simpol1,ether_aromatic,"ether, aromatic","O(c)[C,c]",ChemmineR::smartsSearchOB,Only one of the carbons has to be aromatic
simpol1,nitrate,nitrate,"[$([NX3](=[OX1])(=[OX1])O),$([NX3+]([OX1-])(=[OX1])O)]",ChemmineR::smartsSearchOB,"This pattern also captures carbonylperoxynitrates, so the number of carbonylperoxynitrates is subtracted"
simpol1,nitro,nitro,"[$([NX3](=O)=O),$([NX3+](=O)[O-])][!#8]",ChemmineR::smartsSearchOB,
simpol1,amine_primary,"amine, primary",[NX3;H2;!$(NC=[!#6]);!$(NC#[!#6])][#6X4],ChemmineR::smartsSearchOB,
simpol1,amine_secondary,"amine, secondary",[NX3H1!$(NC=[!#6])!$(NC#[!#6])]([#6X4])[#6X4],ChemmineR::smartsSearchOB,
simpol1,amine_tertiary,"amine, tertiary",[NX3H0!$(NC=[!#6])!$(NC#[!#6])]([#6X4])([#6X4])[#6X4],ChemmineR::smartsSearchOB,
simpol1,amine_aromatic,"amine, aromatic",[NX3;!$(NO)]c,ChemmineR::smartsSearchOB,
simpol1,amide_primary,"amide, primary","[CX3;$([R0][#6]),$([H1R0])](=[OX1])[#7X3H2]",ChemmineR::smartsSearchOB,
simpol1,amide_secondary,"amide, secondary","[CX3;$([R0][#6]),$([H1R0])](=[OX1])[#7X3H1][#6;!$(C=[O,N,S])]",ChemmineR::smartsSearchOB,
simpol1,amide_tertiary,"amide, tertiary","[CX3;$([R0][#6]),$([H1R0])](=[OX1])[#7X3H0]([#6;!$(C=[O,N,S])])[#6;!$(C=[O,N,S])]",ChemmineR::smartsSearchOB,
simpol1,carbonylperoxynitrate,carbonylperoxynitrate,*C(=O)OO[N+1](=O)[O-1],ChemmineR::smartsSearchOB,
simpol1,peroxide,peroxide,[OX2D2][OX2D2],ChemmineR::smartsSearchOB,"This pattern also captures carbonylperoxynitrates, so the number of carbonylperoxynitrates is subtracted"
simpol1,hydroperoxide,hydroperoxide,"[OX2][OX2H,OX1-]",ChemmineR::smartsSearchOB,"This pattern also captures peroxyacids, so the number of carbonylperoxyacids is subtracted"
simpol1,carbonylperoxyacid,carbonylperoxyacid,"[CX3;$([R0][#6]),$([H1R0])](=[OX1])[OX2][$([OX2H]),$([OX1-])]",ChemmineR::smartsSearchOB,
simpol1,nitrophenol,nitrophenol,"[OX2H][$(c1ccccc1[$([NX3](=O)=O),$([NX3+](=O)[O-])]),$(c1cccc(c1)[$([NX3](=O)=O),$([NX3+](=O)[O-])]),$(c1ccc(cc1)[$([NX3](=O)=O),$([NX3+](=O)[O-])])]",ChemmineR::smartsSearchOB,
simpol1,nitroester,nitroester,"C(=O)(OC)C~[NX3](-,=[OX1])-,=[OX1]",ChemmineR::smartsSearchOB,"This pattern captures OH groups on a ring that also has a nitro group (para, ortho, or meta)"
meredith,phosphoric_acids,,"[$(P(=[OX1])([$([OX2H]),$([OX1-]),$([OX2]P)])([$([OX2H]),$([OX1-]),$([OX2]P)])[$([OX2H]),$([OX1-]),$([OX2]P)]),$([P+]([OX1-])([$([OX2H]),$([OX1-]),$([OX2]P)])([$([OX2H]),$([OX1-]),$([OX2]P)])[$([OX2H]),$([OX1-]),$([OX2]P)])]",ChemmineR::smartsSearchOB,"This pattern also captures phosphoric esters, so the number of phosphoric esters is subtracted"
meredith,phosphoric_esters,,"[$(P(=[OX1])([OX2][#6])([$([OX2H]),$([OX1-]),$([OX2][#6])])[$([OX2H]),$([OX1-]),$([OX2][#6]),$([OX2]P)]),$([P+]([OX1-])([OX2][#6])([$([OX2H]),$([OX1-]),$([OX2][#6])])[$([OX2H]),$([OX1-]),$([OX2][#6]),$([OX2]P)])]",ChemmineR::smartsSearchOB,
meredith,sulfates,,"[$([#16X4](=[OX1])(=[OX1])([OX2H,OX1H0-])[OX2][#6]),$([#16X4+2]([OX1-])([OX1-])([OX2H,OX1H0-])[OX2][#6])]",ChemmineR::smartsSearchOB,
meredith,sulfonates,,"[#16X4](=[OX1])(=[OX1])([#6])[*$([O-1]),*$([OH1]),*$([OX2H0])]",ChemmineR::smartsSearchOB,This pattern captures sulfonate ions and their conjugate acids (sulfonic acids)
meredith,thiols,,[#16X2H],ChemmineR::smartsSearchOB,
meredith,carbothioesters,,S([#6])[CX3](=O)[#6],ChemmineR::smartsSearchOB,
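
The table above can be explored with base R alone. The sketch below copies a few rows from the new CSV layout, checks the documented rule that `fun` is `ChemmineR::smartsSearchOB` whenever `smarts` is present, and then evaluates the rows whose `fun` is an arithmetic expression over other groups. The `counts` values are made-up numbers for illustration, and evaluating `fun` with `eval(parse())` is an assumption for this example, not necessarily how the package computes these groups.

```r
# A few rows copied from data-raw/smarts_simpol1.csv (layout with `description`).
csv_text <- 'method,functional_group,description,smarts,fun,notes
simpol1,rings_aromatic,aromatic ring,NA,ChemmineR::rings,
simpol1,rings_total,,NA,ChemmineR::rings,
simpol1,rings_aliphatic,non-aromatic ring,NA,rings_total - rings_aromatic,
simpol1,ether_total,,NA,ChemmineR::groups,
simpol1,ether_alkyl,ether,NA,ether_total - ether_alicyclic - ether_aromatic,
simpol1,ether_alicyclic,ether (alicyclic),[OD2]([C!R0])[C!R0],ChemmineR::smartsSearchOB,
simpol1,ether_aromatic,"ether, aromatic","O(c)[C,c]",ChemmineR::smartsSearchOB,Only one of the carbons has to be aromatic
meredith,thiols,,[#16X2H],ChemmineR::smartsSearchOB,
'
# read.csv() turns the literal "NA" in the smarts column into a missing value.
fx <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Documented rule: whenever `smarts` is present, `fun` is ChemmineR::smartsSearchOB.
has_smarts <- !is.na(fx$smarts)
rule_holds <- all(fx$fun[has_smarts] == "ChemmineR::smartsSearchOB")

# Hypothetical raw counts for a single molecule (illustrative only).
counts <- list(
  rings_total = 2, rings_aromatic = 1,
  ether_total = 3, ether_alicyclic = 1, ether_aromatic = 1
)

# Rows whose `fun` is a subtraction over other functional groups.
is_derived <- grepl(" - ", fx$fun, fixed = TRUE)
derived <- vapply(
  fx$fun[is_derived],
  function(expr) eval(parse(text = expr), envir = counts),
  numeric(1)
)
names(derived) <- fx$functional_group[is_derived]

print(rule_holds)
print(derived)
```

With these made-up counts, `rings_aliphatic` and `ether_alkyl` are each obtained by subtracting the more specific groups from the totals, which is the same kind of overlap correction the `notes` column describes for the SMARTS-based groups.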
Binary file modified data/smarts_simpol1.rda
Binary file not shown.
11 changes: 11 additions & 0 deletions man/smarts_simpol1.Rd

Some generated files are not rendered by default.
