Phytotherapy Discovery — Ontology Work Toolkit

This repository accompanies the analysis for “Phytotherapy Discovery through Ontology-Guided Machine Learning.”
It is intended to be used with the upstream phytotherapy ontology maintained in RomanoLab/poppy (the source of truth for the ontology itself):

Upstream ontology + primary project context: https://github.com/RomanoLab/poppy

This repository also includes a modified CTAPred pipeline used to predict protein–phytochemical interactions, supporting downstream mechanism inference and repurposing analyses.

Background

Researchers in natural-product drug discovery often work with ontologies linking plants → chemicals → targets → therapeutic roles. In this project, the ontology is maintained upstream in RomanoLab/poppy, while this repository contains analysis companion materials and the CTAPred workflow used in the ontology-guided ML portion of the study.

At a high level, the workflow supports:

Starting from an ontology-derived phytochemical set (from poppy)
Predicting likely protein targets for phytochemicals (CTAPred)
Using predictions for downstream analysis (e.g., candidate prioritization, mechanism inference)

Repository Layout

.
├── README.md                    # you are here
├── requirements.txt             # repo-level environment dependencies (if applicable)
└── CTAPred/                     # modified CTAPred pipeline for protein–phytochemical predictions
    ├── predict_targets.py
    ├── SharedFunc.py
    ├── requirements.txt
    ├── QueryList1_smiles.csv
    └── README.md                # CTAPred usage instructions


| File / Dir             | Purpose                                                                      |
| ---------------------- | -----------------------------------------------------------------------------|
| `CTAPred/`             | Modified CTAPred pipeline used to predict protein–phytochemical interactions |
| `CTAPred/README.md`    | Canonical usage instructions for running CTAPred in this repo                |
| `requirements.txt`     | Repo-level dependencies used by analysis utilities                           |

Quick Start

1. Clone

git clone https://github.com/RomanoLab/phytotherapy-discovery.git
cd phytotherapy-discovery

2. (Recommended) Create a virtual environment

python -m venv .venv
source .venv/bin/activate      # Windows: .venv\Scripts\activate

3. Install dependencies

If you are running CTAPred, install CTAPred’s pinned dependencies:

pip install -r CTAPred/requirements.txt

If you are running other repo-level analysis utilities, you may also install the root requirements:

pip install -r requirements.txt

4. Run CTAPred

Follow the instructions in CTAPred/README.md. To see the CLI options:

python CTAPred/predict_targets.py --help

CTAPred/QueryList1_smiles.csv is provided as an example input file containing the phytochemical SMILES used for predictions.
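
Before running predictions, it can help to check what the input file actually contains. The sketch below uses only the Python standard library and makes no assumption about the column names in the CSV. The helper name `preview_csv` is hypothetical and is not part of CTAPred.

```python
import csv
from pathlib import Path


def preview_csv(path, n_rows=5):
    """Return (header, first n_rows data rows) of a CSV file.

    Hypothetical helper, not part of CTAPred. It does not assume any
    particular column names; it only reports what the file contains.
    """
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # empty list if the file has no lines
        rows = [row for _, row in zip(range(n_rows), reader)]
    return header, rows


if __name__ == "__main__":
    example = Path("CTAPred") / "QueryList1_smiles.csv"
    if example.exists():  # only meaningful when run from the repository root
        header, rows = preview_csv(example)
        print("columns:", header)
        for row in rows:
            print(row)
```

Comparing the printed header against your own CSV is a quick way to confirm that a new input file follows the same layout as the provided example.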

Command-line Reference

Because this repository’s executable workflow is centered on CTAPred, the authoritative CLI reference is:

python CTAPred/predict_targets.py --help

For usage examples, input expectations, and output files, see:

CTAPred/README.md

Workflow Details

A typical end-to-end workflow looks like this:

1. Consult / generate ontology artifacts upstream: use RomanoLab/poppy to access the ontology and (if needed) derive a phytochemical set for ML analysis.
2. Prepare CTAPred inputs: provide a CSV of phytochemical SMILES (use CTAPred/QueryList1_smiles.csv as a template).
3. Run CTAPred predictions: execute CTAPred/predict_targets.py to generate predicted protein targets / ranked interactions.
4. Downstream analysis / integration: use predictions for prioritization, hypothesis generation, or reintegration into ontology/KG analyses (depending on your study design and tooling).
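
If you want to drive the prediction step from another Python script, one option is a thin wrapper around the command line. This is a minimal sketch, not part of the repository: `build_command` and `run_ctapred` are hypothetical names, and the only option used is `--help`, since the real options should be read from the script's own help output and CTAPred/README.md.

```python
import subprocess
import sys
from pathlib import Path

# Path taken from the repository layout; assumes the repository root as cwd.
SCRIPT = Path("CTAPred") / "predict_targets.py"


def build_command(extra_args=()):
    """Build the argument list for invoking predict_targets.py.

    Hypothetical helper. `extra_args` should come from the script's own
    --help output; no option names are assumed here.
    """
    return [sys.executable, str(SCRIPT), *extra_args]


def run_ctapred(extra_args=()):
    """Run the script and raise CalledProcessError on a non-zero exit."""
    return subprocess.run(build_command(extra_args), check=True)


if __name__ == "__main__":
    if SCRIPT.exists():
        run_ctapred(["--help"])
    else:
        print(f"{SCRIPT} not found; run this from the repository root.")
```

Using `sys.executable` keeps the child process in the same interpreter (and therefore the same virtual environment) as the calling script.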

Extending the Toolkit

Ideas that keep this repository aligned with its current scope:

Modularise – split the pipeline into smaller modules and expose a proper Python API.
Unit tests – add pytest and GitHub Actions for continuous integration.
Visualisation – integrate network diagrams or heatmaps (e.g., via matplotlib).
Docker – package the pipeline for reproducible runs on any machine.
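
As a starting point for the unit-test idea, here is a minimal sketch of a pytest-style check on the example input file. It verifies only that every data row has as many fields as the header, and it assumes nothing about column names. The helper `ragged_rows` and the test name are hypothetical, not existing code in this repository.

```python
import csv
from pathlib import Path

# Path taken from the repository layout; assumes the repository root as cwd.
QUERY_CSV = Path("CTAPred") / "QueryList1_smiles.csv"


def ragged_rows(path):
    """Return file line numbers of rows whose field count differs from the header."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, None)
        if header is None:  # empty file: nothing to compare
            return []
        return [reader.line_num for row in reader if len(row) != len(header)]


def test_query_list_is_rectangular():
    # pytest collects any function named test_*; no pytest import is needed.
    if not QUERY_CSV.exists():
        return  # nothing to check when run outside the repository root
    assert ragged_rows(QUERY_CSV) == []
```

Saved under a name such as `tests/test_inputs.py` (a hypothetical path), this would be picked up by running `pytest` from the repository root.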

Contributing

Fork → create a feature branch → commit + push → open a Pull Request.
Keep changes scoped and well-described (what/why/how).
If you change model inputs/outputs, update CTAPred/README.md and this README accordingly.

Citation

If you use this repository in research, please cite:

@software{hewryk_ontology_work_2025,
  author       = {Hewryk, Oresta S. I. and Pan, Ian Tong and Romano, Joseph D.},
  title        = {{Ontology Work}: A Python toolkit for phytotherapy ontology enrichment},
  year         = {2025},
  publisher    = {GitHub},
  url          = {https://github.com/RomanoLab/ontology-work}
}
