Friday 2 December 2016

Does IUPAC nomenclature have the ability to name all organic compounds?


We all know about IUPAC nomenclature. There are rules to name straight chain compounds, cyclic ones, polycyclic ones, ones with functional groups, and so on. But I have come across several examples of compounds which I simply cannot name, most of them being pharmaceutical drugs. Take the heme group complex in hemoglobin:


[Image: heme group structure]



I'm not sure, but systematic nomenclature does not appear suitable for naming this structure. A popular antacid, Zantac, also known as ranitidine, has this structure:


[Image: Zantac structure]


Can compounds like these even be named systematically using IUPAC rules?



Answer



Definitely not.


You got yourself into trouble by specifying all organic compounds, because there is a truly immense, mind-boggling number of possible compounds. No one even knows how to accurately determine such a quantity. A very rough estimate, making some incredible simplifications such as the use of only carbon, hydrogen, oxygen, nitrogen and sulfur atoms, is that there are some $10^{60}$ "reasonable" unique structures for compounds with a molecular weight below $\rm{500\ g\ mol^{-1}}$. Thanks to combinatorics, chemical space is enormous. We (or all sapient species in our observable volume, for that matter) will never come close to scratching its surface.


Furthermore, IUPAC nomenclature is largely created a posteriori. That is, though there are many rules trying to cover as many bases as possible (in the process becoming quite unwieldy at times), eventually some unexpected compound with unusual connectivity is discovered and becomes of wide interest. Standardising its nomenclature, and that of closely related structures, then becomes fundamental to allow communication between scientists. A recent example of this occurred with fullerenes, which quickly jumped to prominence after 1985. IUPAC had to create an entirely new section of nomenclature for this class of compounds, and such additions are not at all uncommon.


The closest thing to an absolute method of describing a compound's structure is to have a table of positional data (XYZ coordinates) giving the relative positions between the atoms determined from X-ray/neutron diffraction. Any attempt at simplifying this data will be lossy, whether the structures are drawn (not too lossy) or named (very lossy).
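To make the point about lossiness concrete, here is a minimal Python sketch. The coordinates are idealised values for water chosen for illustration (an assumption, not measured diffraction data), and the helper names `distance` and `angle_deg` are hypothetical. A name such as "water" tells you the connectivity, but the bond length and angle only come out of the coordinate table.

```python
import math

# Illustrative, idealised coordinates for water in angstroms.
# These are an assumption for this sketch, not experimental data.
water_xyz = [
    ("O", (0.0000, 0.0000, 0.0000)),
    ("H", (0.9572, 0.0000, 0.0000)),
    ("H", (-0.2400, 0.9266, 0.0000)),
]


def distance(a, b):
    """Euclidean distance between two points given as (x, y, z) tuples."""
    return math.dist(a, b)


def angle_deg(a, vertex, c):
    """Angle a-vertex-c in degrees."""
    v1 = [p - q for p, q in zip(a, vertex)]
    v2 = [p - q for p, q in zip(c, vertex)]
    dot = sum(p * q for p, q in zip(v1, v2))
    cos_theta = dot / (math.hypot(*v1) * math.hypot(*v2))
    return math.degrees(math.acos(cos_theta))


# Unpack the three atoms, discarding the element symbols.
(_, o), (_, h1), (_, h2) = water_xyz

oh_length = distance(o, h1)       # O-H bond length in angstroms
hoh_angle = angle_deg(h1, o, h2)  # H-O-H angle in degrees

print(f"O-H length: {oh_length:.4f} A, H-O-H angle: {hoh_angle:.1f} deg")
```

Neither number can be recovered from the name alone, which is the sense in which naming is "very lossy" compared with a coordinate table.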


The structures you show have comparatively simple IUPAC names, in fact. Heme is a type of porphyrin, which is a widely occurring framework in biomolecules. The positions of the central framework can be numbered and the substituent at each one read off separately. Regarding Zantac, the Wikipedia page for the compound states its IUPAC name in the "Identifiers" section in the box at the right, namely N-(2-[(5-[(dimethylamino)methyl]furan-2-yl)methylthio]ethyl)-N'-methyl-2-nitroethene-1,1-diamine.


As some interesting examples of the relations between available nomenclature and chemical space, consider the following:





  • 1,1,1,2,2,2-Hexaphenylethane : A molecule with a simple structure and a simple IUPAC name, but which likely cannot exist under reasonable conditions.




  • Maitotoxin : an awe-inspiring biomolecule with a rather large structure but containing fairly simple connectivity between atoms, whose IUPAC name exists but is quite complex: disodium (2S,3R,4R,4aS,5aR,6aS,7aS,8R,9R,10R,11aR,12R,12aR,13aS,14aR)-10-[(2R,3R,4R,4aS,6S,7R,8aS)-6-[(1R,3R)-4-[(2S,3R,4R,4aS,6R,7R,8aS)-6-[(1R,3S,5R,7S,9R,10R,12R,13S,14S,16R,19S,21R,23S,25S,28R,30S)-25-[(1S,3R,5S,7R,9S,11S,14R,16S,18R,20S,21Z,24R,26S,28R,30S,32R,34R,35R,37S,39R,42S,44R)-11-[(1S,2R,4R,5S)-1,2-dihydroxy-4,5-dimethyloct-7-en-1-yl]-35-hydroxy-14,16,18,32,34,39,42,44-octamethyl-2,6,10,15,19,25,29,33,38,43-decaoxadecacyclo[22.21.0.0³,²⁰.0⁵,¹⁸.0⁷,¹⁶.0⁹,¹⁴.0²⁶,⁴⁴.0²⁸,⁴².0³⁰,³⁹.0³²,³⁷]pentatetracont-21-en-34-yl]-9,13-dihydroxy-3,7,14,19,30-pentamethyl-2,6,11,15,20,24,29-heptaoxaheptacyclo[17.12.0.0³,¹⁶.0⁵,¹⁴.0⁷,¹².0²¹,³⁰.0²³,²⁸]hentriacontan-10-yl]-3,4,7-trihydroxy-octahydropyrano[3,2-b]pyran-2-yl]-1,3-dihydroxybutyl]-3,4,7-trihydroxy-octahydropyrano[3,2-b]pyran-2-yl]-2-[(2S,3R)-2,3-dihydroxy-3-[(1S,3R,5S,6S,7R,8S,10R,11R,13S,15R,17S,19R,21R,22S,24S,25S,26R)-6,7,11,21,25-pentahydroxy-13,17-dimethyl-8-[(2R,3R,4R,7S,8R,9R,11R,13E)-3,8,11,15-tetrahydroxy-4,9,13-trimethyl-12-methylidene-7-(sulfonatooxy)pentadec-13-en-2-yl]-4,9,14,18,23,27-hexaoxahexacyclo[13.12.0.0³,¹³.0⁵,¹⁰.0¹⁷,²⁶.0¹⁹,²⁴]heptacosan-22-yl]propyl]-4,8,9,12-tetrahydroxy-hexadecahydro-2H-1,5,7,11,13-pentaoxapentacen-3-yl sulfate




  • This hydrocarbon : A molecule which likely exists, with a seemingly very simple structure, but with slightly quirky connectivity which makes naming it a challenge. Add a few more bridges and I'm sure you can break any existing nomenclature rules.




