The molmatch module provides functions to compare chemical structures.
BIT_FP_AND(fingerprint,fingerprint)
BIT_FP_AND() operates on two fingerprints (bit patterns) of equal length and performs the logical AND operation on each pair of corresponding bits. In each pair, the result is 1 if both bits are 1. Otherwise, the result is 0. If the two fingerprints do not have the same length, the function returns NULL.
mysql> SELECT BIT_FP_AND(fingerprint1,fingerprint2);
    -> binary fingerprint
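The bitwise behaviour described above can be modelled outside MySQL. The following Python sketch is not Mychem's implementation: it assumes a fingerprint can be represented as a plain bytes object, and the helper name bit_fp_and is hypothetical. None stands in for SQL NULL.

```python
from typing import Optional


def bit_fp_and(fp1: bytes, fp2: bytes) -> Optional[bytes]:
    """Model of BIT_FP_AND(): AND each pair of corresponding bits.

    Returns None (standing in for SQL NULL) when the lengths differ.
    """
    if len(fp1) != len(fp2):
        return None
    # ANDing byte by byte is equivalent to ANDing bit by bit.
    return bytes(a & b for a, b in zip(fp1, fp2))


if __name__ == "__main__":
    # 0x0f & 0x3c keeps only the bits set in both operands (0x0c).
    print(bit_fp_and(b"\x0f\xf0", b"\x3c\x3c").hex())
    # Fingerprints of different lengths give None.
    print(bit_fp_and(b"\x0f", b"\x0f\xf0"))
```

The same pattern with the | operator in place of & would model BIT_FP_OR().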
The BIT_FP_AND() function is very useful when working with structure fingerprints. For example, if a molecule (with fingerprint fp1) is a substructure of another molecule (with fingerprint fp2), the following property is observed:
mysql> SELECT TANIMOTO(BIT_FP_AND(fp1,fp2),fp1);
    -> 1
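Assuming the usual definition of the Tanimoto coefficient, the value above is 1 exactly when ANDing fp1 with fp2 leaves fp1 unchanged, that is, when every bit set in fp1 is also set in fp2 (for a fp1 with at least one bit set). A minimal Python sketch of that check follows; the bytes representation and both helper names are assumptions made for illustration, not part of Mychem.

```python
def bit_fp_and(fp1: bytes, fp2: bytes) -> bytes:
    """AND two equal-length fingerprints bit by bit."""
    if len(fp1) != len(fp2):
        raise ValueError("fingerprints must have the same length")
    return bytes(a & b for a, b in zip(fp1, fp2))


def passes_substructure_screen(fp1: bytes, fp2: bytes) -> bool:
    """True when every bit set in fp1 is also set in fp2."""
    return bit_fp_and(fp1, fp2) == fp1


if __name__ == "__main__":
    query = b"\x05\x80"       # hypothetical query fingerprint
    candidate = b"\x0f\xc1"   # contains all bits of the query
    other = b"\x06\x80"       # lacks one bit of the query
    print(passes_substructure_screen(query, candidate))
    print(passes_substructure_screen(query, other))
```

A fingerprint check of this kind is generally used only as a fast pre-screen: two different structures can set the same bits, so candidates that pass would still need an exact test such as MATCH_SUBSTRUCT().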
BIT_FP_COUNT(fingerprint)
BIT_FP_COUNT() returns the number of bits that are set in the binary representation of the fingerprint.
mysql> SELECT BIT_FP_COUNT(fp_col) FROM tbl_name
    -> WHERE name='1H-indole';
    -> 23
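Counting set bits can be illustrated in a few lines of Python. This is a sketch of the idea only, assuming the fingerprint is available as a bytes object; the function name bit_fp_count is hypothetical.

```python
def bit_fp_count(fp: bytes) -> int:
    """Model of BIT_FP_COUNT(): number of bits set to 1 in the fingerprint."""
    # bin(byte) yields a string such as '0b1011'; count its '1' characters.
    return sum(bin(byte).count("1") for byte in fp)


if __name__ == "__main__":
    # 0xff has eight bits set, 0x01 has one, 0x00 has none.
    print(bit_fp_count(b"\xff\x01\x00"))
```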
BIT_FP_OR(fingerprint,fingerprint)
BIT_FP_OR() operates on two fingerprints (bit patterns) of equal length and performs the logical OR operation on each pair of corresponding bits. In each pair, if the first bit is 1 or the second bit is 1 (or both), the result is 1. Otherwise, the result is 0. If the two fingerprints do not have the same length, the function returns NULL.
mysql> SELECT BIT_FP_OR(fingerprint1,fingerprint2);
    -> binary fingerprint
MATCH_SUBSTRUCT(query_smarts,reference_obmol)
MATCH_SUBSTRUCT() checks whether a query_smarts fragment is a substructure of a reference_obmol molecule. The first argument is a SMARTS string, whereas the second argument is a serialized OBMol object. The second argument type is generated by the MOLECULE_TO_SERIALIZEDOBMOL() function. If query_smarts is a substructure of reference_obmol, the function returns 1; otherwise, it returns 0.
mysql> SELECT MATCH_SUBSTRUCT('C=O',serializedobmol_col)
    -> FROM tbl_name WHERE name='glycine';
    -> 1
SUBSTRUCT_ATOM_IDS(query_smarts,reference_obmol)
SUBSTRUCT_ATOM_IDS() returns the atom ids of a reference_obmol molecule that are contained in substructures matching a query_smarts fragment. The first argument is a SMARTS string, whereas the second argument is a serialized OBMol object. The second argument type is generated by the MOLECULE_TO_SERIALIZEDOBMOL() function. If a reference_obmol molecule contains several fragments matching a query_smarts fragment, a list of items is returned. Each item contains a fragment's atom ids and is separated from the next item by a semicolon character.
mysql> SELECT SUBSTRUCT_ATOM_IDS('C(=O)',serializedobmol_col)
    -> FROM tbl_name WHERE name='glycine';
    -> 2 3 ;
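On the client side, the returned string has to be split back into one list of atom ids per match. The Python sketch below assumes the format described above (space-separated ids, one semicolon after each item); the function name parse_atom_ids and the two-match input string are hypothetical illustrations, not Mychem output.

```python
from typing import List


def parse_atom_ids(result: str) -> List[List[int]]:
    """Split a SUBSTRUCT_ATOM_IDS() result into one id list per match."""
    matches = []
    for item in result.split(";"):
        ids = [int(token) for token in item.split()]
        if ids:  # skip the empty piece after the trailing semicolon
            matches.append(ids)
    return matches


if __name__ == "__main__":
    # Result string from the glycine example above.
    print(parse_atom_ids("2 3 ;"))
    # Hypothetical result for a molecule with two matching fragments.
    print(parse_atom_ids("2 3 ;5 6 ;"))
```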
SUBSTRUCT_COUNT(query_smarts,reference_obmol)
SUBSTRUCT_COUNT() returns the number of query_smarts fragments found in a reference_obmol molecule. The first argument is a SMARTS string, whereas the second argument is a serialized OBMol object. The second argument type is generated by the MOLECULE_TO_SERIALIZEDOBMOL() function.
mysql> SELECT SUBSTRUCT_COUNT('C(=O)',serializedobmol_col)
    -> FROM tbl_name WHERE name='glycine';
    -> 2
TANIMOTO(first_fingerprint,second_fingerprint)
TANIMOTO() returns the Tanimoto coefficient between two fingerprints. Fingerprints are bit patterns and can be generated with the FINGERPRINT() function. The return value lies between 0 and 1. The closer the Tanimoto coefficient is to 1, the more similar the molecules are.
mysql> SELECT TANIMOTO(molecule_fp,fp_col) FROM tbl_name
    -> WHERE name='glycine';
    -> 0.8934
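For bit fingerprints, the Tanimoto coefficient is commonly defined as c / (a + b - c), where a and b are the numbers of bits set in each fingerprint and c is the number of bits set in both. The Python sketch below illustrates that standard formula; it is not Mychem's code. The bytes representation, the helper names, and the value returned for two all-zero fingerprints are assumptions, since the source does not state how Mychem handles that case.

```python
def popcount(fp: bytes) -> int:
    """Number of bits set to 1 in a fingerprint."""
    return sum(bin(byte).count("1") for byte in fp)


def tanimoto(fp1: bytes, fp2: bytes) -> float:
    """Tanimoto coefficient c / (a + b - c) of two equal-length fingerprints."""
    if len(fp1) != len(fp2):
        raise ValueError("fingerprints must have the same length")
    a = popcount(fp1)
    b = popcount(fp2)
    c = popcount(bytes(x & y for x, y in zip(fp1, fp2)))
    union = a + b - c
    # Assumption: two all-zero fingerprints are given a similarity of 0.0.
    return c / union if union else 0.0


if __name__ == "__main__":
    print(tanimoto(b"\xff", b"\xff"))  # identical fingerprints
    print(tanimoto(b"\xf0", b"\x0f"))  # no bits in common
    print(tanimoto(b"\x0f", b"\x3c"))  # 2 common bits out of 6 set overall
```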
Calling other Mychem functions (such as FINGERPRINT() or FINGERPRINT2()) inside TANIMOTO() makes the query slower. For best performance, compute the query fingerprint once and store it in a user variable with the MySQL SET statement:
mysql> SET @fp = (SELECT FINGERPRINT2(
    -> SMILES_TO_MOLECULE('C(C(=O)O)N')));
mysql> SELECT id FROM tbl_name
    -> WHERE TANIMOTO(@fp,fp_col) > 0.7;
    -> ids