sebcroft/Sorting-molecules

Sorting_molecules_script.ipynb provides a walkthrough of the task, with brief background on the data and the objective of this workflow.

The data.xlsx file contains conjugated polymer data. Only the SMILES strings are used for this task (other columns can be ignored).

The remaining .py modules provide helper functions, which are used in Sorting_molecules_script.ipynb.

About

A hierarchical clustering algorithm used to organise a data set of conjugated polymer repeating unit structures. The repository demonstrates fingerprinting (ECFP), Tanimoto distance matrices, Butina clustering, and general handling of molecules with RDKit.
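
As a rough illustration of the Tanimoto distance and Butina clustering steps named above, here is a minimal pure-Python sketch. It is not the code from this repository's helper modules. The function names (`tanimoto_distance`, `distance_matrix`, `butina_cluster`) and the toy fingerprints are hypothetical, and each fingerprint is assumed to be a set of "on" bit indices such as an ECFP would produce. The sketch recounts neighbours among the still-unassigned points on every round, which is one variant of the Butina procedure. In practice RDKit's own implementation (`rdkit.ML.Cluster.Butina.ClusterData`) would normally be used instead.

```python
from typing import List, Set, Tuple


def tanimoto_distance(a: Set[int], b: Set[int]) -> float:
    """Return 1 - |a & b| / |a | b| for two sets of on-bit indices."""
    union = len(a | b)
    if union == 0:
        # Assumption: two empty fingerprints are treated as identical.
        return 0.0
    return 1.0 - len(a & b) / union


def distance_matrix(fps: List[Set[int]]) -> List[List[float]]:
    """Build the full symmetric Tanimoto distance matrix."""
    n = len(fps)
    return [[tanimoto_distance(fps[i], fps[j]) for j in range(n)] for i in range(n)]


def butina_cluster(dist: List[List[float]], cutoff: float) -> List[Tuple[int, ...]]:
    """Butina-style clustering: the first index of each tuple is the centroid."""
    n = len(dist)
    # Neighbours of i are all other points within the distance cutoff.
    neighbours = [
        {j for j in range(n) if j != i and dist[i][j] <= cutoff} for i in range(n)
    ]
    unassigned = set(range(n))
    clusters: List[Tuple[int, ...]] = []
    while unassigned:
        # Pick the unassigned point with the most unassigned neighbours;
        # iterating in sorted order breaks ties towards the lowest index.
        centroid = max(
            sorted(unassigned), key=lambda i: len(neighbours[i] & unassigned)
        )
        members = sorted(neighbours[centroid] & unassigned)
        clusters.append((centroid, *members))
        unassigned -= {centroid, *members}
    return clusters


# Hypothetical toy fingerprints (sets of on-bit indices), not real ECFPs.
toy_fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 3, 4, 5}, {10, 11, 12}, {10, 11, 13}, {20}]
toy_dist = distance_matrix(toy_fps)
tight_clusters = butina_cluster(toy_dist, cutoff=0.3)
loose_clusters = butina_cluster(toy_dist, cutoff=0.6)
print(tight_clusters)
print(loose_clusters)
```

A smaller cutoff gives tighter, more numerous clusters, while a larger one merges more structures into each group, so the cutoff is the main parameter to tune for a given data set.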
