NMReDATA to NMR-ML Converter

NMReDATA is a standard format for representing NMR spectroscopy data, including chemical structures, peak assignments, and raw spectral data. NMR-ML is an XML-based format for representing NMR data in a standardized way. This tool converts between the two formats in both directions, enabling interoperability between different NMR data processing systems. Its features include:

Download:  

nmrmedata.tar.gz

Installation:

Prerequisites

The following dependencies are required:

Setup

  1. Download the nmrML repository and the related repositories:
    git clone https://github.com/tsajed/nmr-pred
    git clone magmet-repository-url
  2. Install Python dependencies:
    pip install rdkit numpy matplotlib
  3. Ensure the directory structure is maintained as expected by the scripts:
    parent-directory/
    ├── nmredata/           # This repository
    ├── nmr-pred/           # For NMR prediction
    ├── magmet/             # For processing raw NMR data
    │   ├── setting
    │   ├── settings_update.txt
    │   └── solvent_lib/
    └── nmr_ml/
        ├── nmrml_creator.py
        └── nmrml_extract.py
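
Because the scripts rely on this layout, it can help to check it before running a conversion. The following is a minimal sketch, not part of the repository: the function name `missing_paths` and the list `EXPECTED_PATHS` are hypothetical, and the list only includes entries that appear in full in the tree above.

```python
from pathlib import Path

# Paths copied from the directory tree above, relative to parent-directory/.
EXPECTED_PATHS = [
    "nmredata",
    "nmr-pred",
    "magmet/solvent_lib",
    "magmet/settings_update.txt",
    "nmr_ml/nmrml_creator.py",
    "nmr_ml/nmrml_extract.py",
]


def missing_paths(parent_dir):
    """Return the expected paths that do not exist under parent_dir."""
    parent = Path(parent_dir)
    return [rel for rel in EXPECTED_PATHS if not (parent / rel).exists()]
```

Calling `missing_paths("..")` from inside the `nmredata/` checkout returns an empty list when every listed entry is in place.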

Usage:

Converting NMReDATA SDF to NMR-ML

To convert a single NMReDATA SDF file to NMR-ML:

python nmredata_to_nmrml.py <input_sdf> [output_dir [output_file_basename]]

Example:

python nmredata_to_nmrml.py compound.nmredata.sdf output_folder compound
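
To convert many files, the command above can be wrapped in a small loop. This is a sketch under assumptions: `build_command` and `convert_all` are hypothetical helpers, the script is assumed to be in the current working directory, and the basename is taken as everything before the first dot in the file name.

```python
import subprocess
import sys
from pathlib import Path


def build_command(sdf_path, output_dir):
    """Build the documented command line for one NMReDATA SDF file."""
    sdf_path = Path(sdf_path)
    # compound.nmredata.sdf -> "compound" (text before the first dot).
    basename = sdf_path.name.split(".")[0]
    return [sys.executable, "nmredata_to_nmrml.py",
            str(sdf_path), str(output_dir), basename]


def convert_all(input_dir, output_dir):
    """Run the converter once for each *.sdf file found in input_dir."""
    for sdf_path in sorted(Path(input_dir).glob("*.sdf")):
        # check=True stops the batch at the first failing conversion.
        subprocess.run(build_command(sdf_path, output_dir), check=True)
```

File names that contain extra dots before the extension would need a different basename rule.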

Converting NMReDATA ZIP to NMR-ML

To convert a zipped NMReDATA file (containing SDF and raw data) to NMR-ML:

python nmredata_zip_to_nmrml.py <input_zip> [extract_dir output_dir output_basename]

Example:

python nmredata_zip_to_nmrml.py compound.zip extract_folder output_folder compound

Converting NMR-ML to NMReDATA

To convert an NMR-ML file to NMReDATA format:

python nmrml_to_nmredata.py --input <nmrml_file> [--tmp <tmp_dir>] [--base <basename>] [--output <output_zip>]

Example:

python nmrml_to_nmredata.py --input compound.nmrML --tmp temp_folder --base compound --output compound_nmredata.zip

Algorithm Details:

NMReDATA to NMR-ML Conversion

  1. Reading NMReDATA: Parse the SDF file using RDKit; extract the molecule structure, assignments, peak information, and raw data if available.
  2. Extracting Spin System: Build a spin system matrix from the assignments; extract chemical shifts and J couplings.
  3. Processing Raw Data: Detect the data format (Bruker, Varian); process with magmet to get spectra and peak lists.
  4. Generating NMR-ML: Create one NMR-ML file per spectrum type, including structure, assignments, and data.
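
Steps 1 and 2 can be illustrated with a simplified, standard-library-only sketch. It is not the tool's implementation (which uses RDKit), and it rests on several assumptions: assignment lines look like `label, shift, atom indices`, coupling lines look like `label1, label2, J`, lines end with a backslash, and the matrix stores chemical shifts on the diagonal and J couplings off the diagonal. The function names and the sample values are illustrative only.

```python
def read_sdf_tags(sdf_text):
    """Collect SDF data items as {tag_name: [data lines]}."""
    tags, current = {}, None
    for line in sdf_text.splitlines():
        if line.startswith(">") and "<" in line and ">" in line[1:]:
            # Data header such as ">  <NMREDATA_ASSIGNMENT>".
            current = line[line.index("<") + 1 : line.rindex(">")]
            tags[current] = []
        elif current is not None:
            if not line.strip() or line.startswith("$$$$"):
                current = None  # a blank line ends the data item
            else:
                # Drop the trailing backslash assumed on NMReDATA lines.
                tags[current].append(line.rstrip().rstrip("\\").strip())
    return tags


def build_spin_matrix(assignment_lines, j_lines):
    """Return (labels, matrix): shifts on the diagonal, J off-diagonal."""
    labels, shifts = [], []
    for line in assignment_lines:
        fields = [f.strip() for f in line.split(",")]
        labels.append(fields[0])
        shifts.append(float(fields[1]))
    n = len(labels)
    matrix = [[0.0] * n for _ in range(n)]
    for i, shift in enumerate(shifts):
        matrix[i][i] = shift
    for line in j_lines:
        a, b, j = [f.strip() for f in line.split(",")][:3]
        i, k = labels.index(a), labels.index(b)
        matrix[i][k] = matrix[k][i] = float(j)  # couplings are symmetric
    return labels, matrix


# Illustrative values only, not measured data.
SAMPLE_SDF_TAGS = r"""
>  <NMREDATA_ASSIGNMENT>
Ha, 1.19, 7, 8, 9\
Hb, 3.65, 4, 5\

>  <NMREDATA_J>
Ha, Hb, 7.0\

$$$$
"""

tags = read_sdf_tags(SAMPLE_SDF_TAGS)
labels, matrix = build_spin_matrix(tags["NMREDATA_ASSIGNMENT"], tags["NMREDATA_J"])
```

For the sample above, `labels` holds the two spin labels and `matrix` is a symmetric 2x2 list of lists with the two shifts on the diagonal.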

NMR-ML to NMReDATA Conversion

  1. Extracting from NMR-ML: Parse the file to extract structure, peaks, and metadata, and generate JSONs.
  2. Creating NMReDATA: Build the SDF with NMReDATA tags and package it with the spectral data.
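
Step 2 can be pictured as appending data items to a molblock and zipping the result with the spectra. The sketch below uses only the standard library and is not the tool's own code: `format_tag` and `write_nmredata_zip` are hypothetical names, the archive layout (`<basename>.nmredata.sdf` next to the spectra folder) is an assumption, and the trailing backslash on each data line follows the convention seen in NMReDATA examples.

```python
import zipfile
from pathlib import Path


def format_tag(name, lines):
    """Format one SDF data item, ending each data line with a backslash."""
    body = "\n".join(f"{line}\\" for line in lines)
    return f">  <{name}>\n{body}\n\n"


def write_nmredata_zip(zip_path, basename, molblock, tags, spectra_dir=None):
    """Write <basename>.nmredata.sdf plus optional spectral files to a ZIP."""
    # The molblock is expected to end with its "M  END" line.
    sdf = molblock.rstrip("\n") + "\n"
    sdf += "".join(format_tag(name, lines) for name, lines in tags.items())
    sdf += "$$$$\n"  # SDF record terminator
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(f"{basename}.nmredata.sdf", sdf)
        if spectra_dir is not None:
            root = Path(spectra_dir)
            for path in sorted(root.rglob("*")):
                if path.is_file():
                    # Keep the spectra folder name as the top-level entry.
                    archive.write(path, path.relative_to(root.parent).as_posix())
    return sdf
```

A call such as `write_nmredata_zip("compound_nmredata.zip", "compound", molblock, {"NMREDATA_VERSION": ["1.1"]}, "spectra")` would produce an archive holding the SDF and a copy of the `spectra` folder.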

Supported Spectrum Types:

File Formats:

NMReDATA

NMR-ML

Example:

The example directory contains a sample implementation for ethanol:

cd example
python construct_example_nmredata.py

This creates:

Troubleshooting:

References:

License:

This project is licensed under the MIT License - see the LICENSE file for details.