Releases: dan2097/opsin

v2.8.0

29 Oct 18:06
  • Support for undecahectane/undecadictane (previously only hendeca was supported)
  • Support for dicarboximido
  • Improved support for lysergic acid derivatives
  • Added a few more sugars e.g. digitalose
  • Added borodeuteride and hydro contractions of pharmaceutical salts e.g. hydromethanesulfonate
  • Support substitution on glyceric acid
  • Corrected interpretation of imidazolium, trioxane and phthalhydrazide

v2.7.0

15 Aug 23:54
  • Improved coverage of flavonoid parent structures
  • Support for apiofuranosyl, added 5 locant to apiose
  • Improved support for n-amyl
  • Superscripted numbers in poly spiro systems are now intelligently determined if the input lacks superscript indication
  • Support for annulynes
  • Fixed issues where amino acid salts were being interpreted as functionalisation of the amino acid
  • Fixed bug where annulene parsing was case sensitive
  • Chalcone, in accordance with current IUPAC recommendations, is now interpreted as specifically the trans isomer
  • Minor dependency updates

v2.6.0

21 Dec 12:34
  • OPSIN now requires Java 8 (or higher)
  • OPSIN command-line functionality moved to opsin-cli module
  • OPSIN standalone jars are now built with mvn package
  • Updated from InChI 1.03 to InChI 1.06
  • Support for capturing relative/racemic stereochemistry (output via CxSmiles) [contributed by John Mayfield]
  • Support for deaza/dethia
  • Support nitrile as a suffix on amino acids [contributed by John Mayfield]
  • Support more glycero-n-phospho substituents
  • Support for chloroxime and other haloximes
  • Support cis/trans on rings where a stereocenter has two non-hydrogen substituents, using Cahn-Ingold-Prelog rules to determine which are relative
  • Multiple improvements to implicit bracketing logic
  • Corrected interpretation of methylselenopyruvate
  • Added group 1/2 nitrides e.g. magnesium nitride
  • Added molecular diatomics e.g. molecular hydrogen (or dihydrogen)
  • Fixed out of memory error if a fusion bracket referenced an interior atom instead of a peripheral atom
  • Fixed out of memory error while parsing very long ambiguous input, by switching parsing algorithm from breadth-first to depth-first
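
The switch from breadth-first to depth-first parsing in the last item can be illustrated with a toy example. The sketch below is not OPSIN's parser: the class name, the two-token vocabulary and both methods are hypothetical, chosen only to show why a breadth-first search over an ambiguous input has to hold many partial parses in memory at once, while a depth-first search holds a single path.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class ParseSearchSketch {
    // Hypothetical toy vocabulary: "a" and "aa" make any run of 'a' highly ambiguous.
    static final String[] TOKENS = {"a", "aa"};

    /** Breadth-first search: returns the peak number of partial parses queued at once. */
    static int bfsPeakFrontier(String input) {
        // Each queue entry is the input position reached by one partial parse.
        Deque<Integer> frontier = new ArrayDeque<>();
        frontier.add(0);
        int peak = 1;
        while (!frontier.isEmpty()) {
            int pos = frontier.poll();
            for (String token : TOKENS) {
                if (input.startsWith(token, pos)) {
                    int next = pos + token.length();
                    if (next < input.length()) {
                        frontier.add(next); // unfinished parse stays in memory
                    }
                }
            }
            peak = Math.max(peak, frontier.size());
        }
        return peak;
    }

    /** Depth-first search: returns the peak recursion depth (tokens on the one live path). */
    static int dfsPeakDepth(String input, int pos, int depth) {
        if (pos == input.length()) {
            return depth;
        }
        int peak = depth;
        for (String token : TOKENS) {
            if (input.startsWith(token, pos)) {
                peak = Math.max(peak, dfsPeakDepth(input, pos + token.length(), depth + 1));
            }
        }
        return peak;
    }

    public static void main(String[] args) {
        String input = "aaaaaaaaaaaaaaaaaaaa"; // 20 characters
        System.out.println("BFS peak frontier: " + bfsPeakFrontier(input));
        System.out.println("DFS peak depth:    " + dfsPeakDepth(input, 0, 0));
    }
}
```

In this toy setting the depth-first depth is bounded by the input length, whereas the breadth-first frontier grows with the number of alternative segmentations.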

Dependency changes:

  • Updated logging from Log4J v1.2.17 to the latest Log4J2 (v2.17.0). Neither OPSIN 2.5.0 nor 2.6.0 is vulnerable to Log4Shell. The logging implementation is only included in the opsin-cli module
  • opsin-inchi now uses JNA-InChI rather than JNI-InChI. This supports the latest version of InChI and also supports new Macs with ARM64 processors
  • Woodstox now uses groupid com.fasterxml.woodstox (the groupid change did not signify a break in API compatibility)
  • dk.brics.automaton now uses groupid dk.brics (the groupid change did not signify a break in API compatibility)
  • commons-cli is only used by the opsin-cli module

v2.5.0

04 Oct 22:31
  • OPSIN now requires Java 7 (or higher)
  • Support for traditional oxidation state names e.g. ferric
  • Added support for defining the stereochemistry of phosphines/arsines
  • Added newly discovered elements
  • Improved algorithm for correctly interpreting ester names with a missing space e.g. 3-aminophenyl-4-aminobenzenesulfonate
  • Fixed structure of canavanine
  • Corrected interpretation of silver oxide
  • Vocabulary improvements
  • Minor improvements/bug fixes

Internal XML Changes:

  • tokenList files now all use the same schema (tokenLists.dtd)

v2.4.0

10 May 18:17
  • OPSIN is now licensed under the MIT License
  • Locant labels included in extended SMILES output
  • Command-line now has a name flag to include the input name in SMILES/InChI output (tab delimited)
  • Added support for carotenoids
  • Added support for Vitamin B-6 related compounds
  • Added support for more fused ring system bridge prefixes
  • Added support for anilide as a functional replacement group
  • Allow heteroatom replacement as a detachable prefix e.g. 3,6,9-triaza-2-(4-phenylbutyl)undecanoic acid
  • Support Boughton system isotopic suffixes for 13C/14C/15N/17O/18O
  • Support salts of acids in CAS inverted names
  • Improved support for implicitly positively charged purine nucleosides/nucleotides
  • Added various biochemical groups/substituents
  • Improved logic for determining intended substitution in names with too few brackets
  • Incorrectly capitalized locants can now be used to reference ring fusion atoms
  • Some names no longer allow substitution e.g. water, hydrochloride
  • Many minor precision/recall improvements

v2.3.1

10 May 18:18
  • Fixed fused ring numbering algorithm incorrectly numbering some ortho- and peri-fused systems involving 7-membered rings
  • Support P-thio to indicate thiophosphate linkage
  • Count of isotopic replacements no longer required if locants given
  • Fixed bug where CIP algorithm could assign priorities to identical substituents
  • Fixed "DL" before a substituent not assigning the substituted alpha-carbon as racemic stereo
  • L-stereochemistry no longer assumed on semi-systematic glycine derivatives e.g. phenylglycine
  • Fixed some cases where substituents like carbonyl should have been part of an implicitly bracketed section
  • Fixed interpretation of leucinic acid and 3/4/5-pyrazolone

v2.3.0

10 May 18:18
  • D/L stereochemistry can now be assigned algorithmically e.g. L-2-aminobutyric acid
  • Other minor improvements to amino acid support e.g. homoproline added
  • Extended SMILES added to command-line interface
  • Names intended to include the triiodide/tribromide anion no longer erroneously have three monohalides
  • Ambiguity detected when applying unlocanted subtractive prefixes
  • Better support for adjacent multipliers e.g. ditrifluoroacetic acid
  • deoxynucleosides are now implicitly 2'-deoxynucleosides
  • Added support for <number> as a syntax for a superscripted number
  • Added support for amidrazones
  • Aluminium hydrides/chlorides/bromides/iodides are now covalently bonded
  • Fixed names with isotopes less than 10 not being supported
  • Fixed interpretation of some trivial names that clash with systematic names

v2.2.0

10 May 18:19
  • Added support for IUPAC system for isotope specification e.g. (3-14C,2,2-2H2)butane
  • Added support for specifying deuteration using the Boughton system e.g. butane-2,2-d2
  • Added support for multiplied bridges e.g. 1,2:3,4-diepoxy
  • Front locants after a von Baeyer descriptor are now supported e.g. bicyclo[2.2.2]-7-octene
  • onosyl substituents now supported e.g. glucuronosyl
  • More sugar substituents e.g. glucosaminyl
  • Improved support for malformed polycyclic spiro names
  • Support for oximino as a suffix
  • Added method [NameToStructure.getVersion()] to retrieve OPSIN version number
  • Allowed bridges to be used as detachable prefixes
  • Allow odd numbers of hydro to be added e.g. trihydro
  • Added support for unbracketed R stereochemistry (but not S, for the moment, due to the ambiguity with sulfur locants)
  • Various minor bug fixes e.g. stereochemistry was incorrect for isovaline
  • Minor vocabulary improvements

v2.1.0

10 May 18:19
  • Added support for fractional multipliers e.g. hemihydrochloride
  • Added support for abbreviated common salts e.g. HCl
  • Added support for sandwich compounds e.g. ferrocene
  • Improved recognition of names missing the last 'e' (common in German)
  • Support for E/Z directly before double bond indication e.g. 2Z-ylidene, 2Z-ene
  • Improved support for functional class ethers e.g. "glycerol triglycidyl ether"
  • Added general support for names involving an ester formed from an alcohol and an ate group
  • Grignard reagents and certain compounds (e.g. uranium hexafluoride) are now treated as covalent rather than ionic
  • Added experimental support for outputting extended SMILES. Polymers and attachment points are annotated explicitly
  • Polymers when output as SMILES now have atom classes to indicate which end of the repeat unit is which
  • Support * as a superscript indicator e.g. *6* to mean superscript 6
  • Improved recognition of racemic stereochemistry terms
  • Added general support for names like "beta-alanine N,N-diacetic acid"
  • Allowed "one" and "ol" suffixes to be used in more cases where another suffix is also present
  • "ic acid halide" is now interpreted the same as "ic halide"
  • Fixed some cases where ambiguous operations were not considered ambiguous e.g. monosubstituted phenyl
  • Improvements/bug fixes to heuristics for detecting when spaces are omitted from ether/ester names
  • Improved support for stereochemistry in older CAS index names
  • Many precision improvements e.g. cyclotriphosphazene, thiazoline, TBDMS/TBDPS protecting groups, S-substituted-methionine
  • Various minor bug fixes e.g. names containing "SULPH" not recognized
  • Minor vocabulary improvements

Internal XML Changes:

  • Synonyms of the same concept are now or-ed rather than being separate entities e.g. <token>tertiary|tert-|t-</token>
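
To make the or-ed token form concrete, here is a minimal sketch of how such a token's text could be expanded into individual synonyms. It is illustrative only and does not reflect how OPSIN reads its token lists: the class and method names are hypothetical, and treating the first alternative as the canonical form is an assumption made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OrTokenSketch {
    /**
     * Splits an or-ed token such as "tertiary|tert-|t-" into its alternatives
     * and maps each one to the first alternative (assumed canonical here).
     */
    static Map<String, String> expand(String tokenText) {
        Map<String, String> synonymToCanonical = new LinkedHashMap<>();
        // "|" is a regex metacharacter, so it must be escaped for split().
        String[] alternatives = tokenText.split("\\|");
        for (String alternative : alternatives) {
            synonymToCanonical.put(alternative, alternatives[0]);
        }
        return synonymToCanonical;
    }

    public static void main(String[] args) {
        // Prints each synonym alongside the form it is mapped to.
        expand("tertiary|tert-|t-").forEach(
                (synonym, canonical) -> System.out.println(synonym + " -> " + canonical));
    }
}
```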

v2.0.0

10 May 18:20

MAJOR CHANGES

  • Requires Java 1.6 or higher
  • CML (Chemical Markup Language) is now returned as a String rather than a XOM Element
  • OPSIN now attempts to identify if a chemical name is ambiguous. Names that appear ambiguous return with a status of WARNING with the structure provided being one interpretation of the name
  • Added support for "alcohol esters" e.g. phenol acetate [meaning phenyl acetate]
  • Multiplied unlocanted substitution is now more intelligent e.g. all substituents must connect to same group, and degeneracy of atom environments is taken into account
  • The ester interpretation is now preferred in more cases where a name does not contain a space but the parent is methanoate/ethanoate/formate/acetate/carbamate
  • Inorganic oxides are now interpreted, yielding structures with [O-2] ions
  • Added more trivial names of simple molecules
  • Support for nitrolic acids
  • Fixed parsing issue where a directly substituted acetal was not interpretable
  • Fixed certain groups e.g. phenethyl, not having their suffix attached to a specific location
  • Corrected interpretation of xanthyl, and various trivial names that look systematic
  • Name to structure is now ~20% faster
  • Initialisation time reduced by a third
  • InChI generation is now ~20% faster
  • XML processing dependency changed from XOM to Woodstox
  • Significant internal refactoring
  • Utility functions designed for internal use are no longer on the public API
  • Various minor bug fixes

Internal XML Changes:

  • Groups lacking a labels attribute now have no locants (previously had ascending numeric locants)
  • Syntax for addGroup/addHeteroAtom/addBond attributes changed to be easier to parse and allow specification of whether the name is ambiguous if a locant is not provided