Chemistry meets AI

Accelerating chemical innovation through predictive modelling, automated workflows, and chemoinformatics.

Capabilities

QSAR/QSRR Modelling

Delivering predictive models for biological activity, toxicity, and molecular properties. Mapping molecular structure to property via mathematical descriptors.

Process Optimization

Analyzing production data for yield optimization and failure reduction using advanced machine learning models.

Bridging Lab and Code

Leveraging the coding capabilities of LLMs to make it easier to integrate laboratory workflows with computational tools.

Implemented Solutions

QSAR Model Development

Building reproducible cheminformatics pipelines for QSAR modelling and model validation.

Technical Details

Implementation Specs

  • Descriptor Calculation: AlvaScience software is used for molecular dataset cleaning and standardization, and for calculating molecular descriptors and fingerprints.
  • Feature Selection: filter methods are combined with an embedded method to reduce the dataset to the features most important for the endpoint.
  • Models: several models are tested and compared for the classification task, ranging from LDA and QDA to more complex tree-ensemble algorithms such as Random Forest.
  • Validation: cross-validation and external test set validation.
  • Deployment: the resulting models are used in QSAR software.
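
The steps above can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the production pipeline: the descriptor matrix is synthetic (generated with `make_classification`) rather than computed from real molecules, the filter step is a simple variance filter, the embedded step uses Random Forest importances, and all thresholds and hyperparameters are placeholder choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel, VarianceThreshold
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.pipeline import Pipeline

# Synthetic stand-in for a descriptor matrix (rows = molecules, columns = descriptors).
X, y = make_classification(
    n_samples=300, n_features=50, n_informative=8, n_redundant=5, random_state=0
)
# Append two constant columns to mimic uninformative descriptors.
X = np.hstack([X, np.zeros((X.shape[0], 2))])

# Hold out an external test set that is never used during model building.
X_train, X_ext, y_train, y_ext = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

pipeline = Pipeline([
    # Filter step: drop constant descriptors.
    ("filter", VarianceThreshold(threshold=0.0)),
    # Embedded step: keep descriptors whose Random Forest importance
    # is at or above the median importance.
    ("embedded", SelectFromModel(
        RandomForestClassifier(n_estimators=100, random_state=0),
        threshold="median",
    )),
    ("model", RandomForestClassifier(n_estimators=100, random_state=0)),
])

# Cross-validation on the training set only; feature selection is refit in each fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
cv_scores = cross_val_score(
    pipeline, X_train, y_train, cv=cv, scoring="balanced_accuracy"
)

# Final fit on the full training set, then external test set validation.
pipeline.fit(X_train, y_train)
external_score = pipeline.score(X_ext, y_ext)  # plain accuracy
n_selected = int(pipeline.named_steps["embedded"].get_support().sum())

print(f"CV balanced accuracy: {cv_scores.mean():.2f} +/- {cv_scores.std():.2f}")
print(f"External test accuracy: {external_score:.2f}")
print(f"Descriptors kept: {n_selected} of {X.shape[1]}")
```

Wrapping the selection steps and the classifier in one `Pipeline` keeps feature selection inside each cross-validation fold, so the held-out fold does not influence which descriptors are retained.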

Manufacturing Optimization

Monitoring failure rates and identifying critical trends in production data via chemometric analysis.

Technical Details

Implementation Specs

  • Data collection and cleaning: standardization of the data and handling of missing values, incorrect values, etc.
  • EDA pipeline: data analysis using basic statistics and multivariate analysis, such as PCA, for batch evolution modelling.
  • Root Cause Analysis: identification of the critical process parameters that drive out-of-specification values of a variable in API synthesis.
  • Results: data-driven decisions on the next experiments and the steps needed to solve the problem.
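
As a rough illustration of the cleaning and PCA steps, the sketch below uses NumPy only. Everything in it is assumed for the example: the batch data are simulated, the drift in the last ten batches is injected artificially, and median imputation is just one simple way of handling missing values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical production data: 40 batches x 5 process variables.
n_batches, n_vars = 40, 5
X = rng.normal(size=(n_batches, n_vars))
X[30:, 0] += 3.0                          # simulated drift in the last 10 batches
X[:, 1] = 0.8 * X[:, 0] + 0.2 * X[:, 1]   # second variable correlated with the first
X[5, 2] = np.nan                          # one missing reading

# Cleaning: impute missing values with the column median.
col_median = np.nanmedian(X, axis=0)
X_clean = np.where(np.isnan(X), col_median, X)

# Autoscale (mean-centre, unit variance) so variables with different units are comparable.
X_scaled = (X_clean - X_clean.mean(axis=0)) / X_clean.std(axis=0, ddof=1)

# PCA via singular value decomposition.
U, S, Vt = np.linalg.svd(X_scaled, full_matrices=False)
scores = U * S                     # batch coordinates on each principal component
loadings = Vt.T                    # weight of each variable in each component
explained = S**2 / np.sum(S**2)    # fraction of variance per component

print("Explained variance ratio:", np.round(explained, 2))
# PC1 scores in batch order show how the process evolves over time.
print("Mean PC1 score, batches 1-30 :", round(float(scores[:30, 0].mean()), 2))
print("Mean PC1 score, batches 31-40:", round(float(scores[30:, 0].mean()), 2))
```

Plotting the scores in batch order, and inspecting the loadings of the component that separates early from late batches, gives a first indication of which variables to examine in the root cause analysis.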

QSRR for Chromatography

Building ML pipelines for retention time prediction using standardized chemical structures.

Technical Details

Implementation Specs

  • Dataset Curation: building and curation of datasets of molecules, chromatography and mass spectrometry data, and molecular descriptors.
  • Feature Selection: development of a feature selection pipeline that combines filter and embedded techniques to ensure reliable and meaningful results.
  • Algorithm: several algorithms, including ensemble learning (XGBoost), optimized for non-linear chromatographic relationships.
  • Results: better understanding of the retention mechanism and fewer false candidates in MS analyses of unknowns.
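
One way predicted retention times can reduce false candidates is sketched below. The candidate names, retention times, and tolerance window are all hypothetical; in practice the predictions would come from a trained QSRR model and the tolerance would be derived from its validated prediction error.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    predicted_rt_min: float  # retention time predicted by a QSRR model, in minutes


def filter_candidates(candidates, observed_rt_min, tolerance_min):
    """Keep candidates whose predicted retention time is within the tolerance window."""
    return [
        c for c in candidates
        if abs(c.predicted_rt_min - observed_rt_min) <= tolerance_min
    ]


# Hypothetical candidate structures that match the same accurate mass.
candidates = [
    Candidate("candidate_A", 4.2),
    Candidate("candidate_B", 7.9),
    Candidate("candidate_C", 8.4),
    Candidate("candidate_D", 12.1),
]

kept = filter_candidates(candidates, observed_rt_min=8.0, tolerance_min=1.0)
print([c.name for c in kept])  # → ['candidate_B', 'candidate_C']
```

Candidates that cannot be separated by mass alone are discarded when their predicted retention time is incompatible with the observed peak, which shortens the list that needs manual confirmation.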

Vision

ChemCoreAI translates chemical data into reliable, actionable intelligence for innovation.

Founder & Principal Scientist

Elena Bandini, PhD

Chemist with a foundation in analytical chemistry and chemoinformatics, specialized in transforming laboratory logic into scalable computational solutions. Holds a PhD from Ghent University focused on ML pipelines for retention time prediction and chromatographic behavior, bridging the gap between raw chemical data and actionable structure-property insights.

Get in touch