NMReDATA is a standard format for representing NMR spectroscopy data, including chemical structures, peak assignments, and raw spectral data. NMR-ML is an XML-based format for representing NMR data in a standardized way. This tool converts between the two formats, enabling interoperability between different NMR data processing systems. Its features include:
Convert NMReDATA files (SDF format) to NMR-ML format
Convert NMReDATA zip archives containing SDF files and raw spectral data to NMR-ML
Convert NMR-ML files back to NMReDATA format
Process raw NMR data from various vendors (Bruker, Varian/Agilent)
Extract peak lists and assignments from NMReDATA files
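As a rough illustration of what extracting assignments from an NMReDATA file involves, here is a minimal standard-library sketch that reads the data items of an SDF record and parses the assignment tag. It is not the tool's own code (which uses RDKit); the function names `read_sdf_tags` and `parse_assignment` are hypothetical, and the sample assumes assignment lines of the form `label, shift, atom indices` ending in a backslash.

```python
def read_sdf_tags(sdf_text):
    """Return {tag_name: value} for the data items of the first SDF record."""
    tags = {}
    name = None
    value_lines = []
    for line in sdf_text.splitlines():
        if line.startswith("$$$$"):            # end of the first record
            break
        if line.startswith(">") and "<" in line:
            # Data item header, e.g. "> <NMREDATA_ASSIGNMENT>"
            name = line[line.index("<") + 1 : line.rindex(">")]
            value_lines = []
        elif name is not None:
            if line.strip() == "":             # a blank line closes the item
                tags[name] = "\n".join(value_lines)
                name = None
            else:
                value_lines.append(line)
    if name is not None:                       # item not closed by a blank line
        tags[name] = "\n".join(value_lines)
    return tags


def parse_assignment(value):
    """Parse lines assumed to look like 'label, shift, atom, atom, ...\\'."""
    rows = []
    for line in value.splitlines():
        line = line.rstrip().rstrip("\\").strip()
        if not line:
            continue
        fields = [field.strip() for field in line.split(",")]
        rows.append({
            "label": fields[0],
            "shift": float(fields[1]),
            "atoms": [int(atom) for atom in fields[2:]],
        })
    return rows


# Illustrative record: the mol block is an empty placeholder and the
# shifts and atom indices are made-up example values.
SAMPLE = """ethanol
  example

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <NMREDATA_VERSION>
1.1\\

> <NMREDATA_ASSIGNMENT>
a, 1.19, 4, 5, 6\\
b, 3.65, 7, 8\\

$$$$
"""

tags = read_sdf_tags(SAMPLE)
assignments = parse_assignment(tags["NMREDATA_ASSIGNMENT"])
```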
NMReDATA to NMR-ML Conversion
Reading NMReDATA: Parse the SDF file using RDKit and extract the molecular structure, assignments, peak information, and raw data if available
Extracting Spin System: Build a spin system matrix from the assignments and extract chemical shifts and J couplings
Processing Raw Data: Detect the data format (Bruker, Varian) and process with magmet to get spectra and peak lists
Generating NMR-ML: Create one NMR-ML file per spectrum type, including structure, assignments, and data
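The spin system step can be pictured with a small sketch. The layout below, with chemical shifts (ppm) on the diagonal and J couplings (Hz) in the symmetric off-diagonal entries, is one common convention and only an assumption about what this tool builds; `build_spin_system` is a hypothetical name and the numbers are illustrative.

```python
def build_spin_system(shifts, couplings):
    """Build a symmetric spin system matrix.

    shifts:    {label: chemical shift in ppm}
    couplings: {(label_a, label_b): J coupling in Hz}
    Returns (labels, matrix) where matrix[i][i] is the shift of labels[i]
    and matrix[i][k] == matrix[k][i] is the coupling between labels[i] and
    labels[k] (0.0 when no coupling is listed).
    """
    labels = sorted(shifts)
    index = {label: i for i, label in enumerate(labels)}
    size = len(labels)
    matrix = [[0.0] * size for _ in range(size)]
    for label, ppm in shifts.items():
        matrix[index[label]][index[label]] = ppm
    for (a, b), j_hz in couplings.items():
        i, k = index[a], index[b]
        matrix[i][k] = j_hz
        matrix[k][i] = j_hz
    return labels, matrix


# Two spin groups with a single coupling between them (example values only).
labels, matrix = build_spin_system(
    shifts={"a": 1.19, "b": 3.65},
    couplings={("a", "b"): 7.0},
)
```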
NMR-ML to NMReDATA Conversion
Extracting from NMR-ML: Parse the file to extract structure, peaks, and metadata, and generate JSON files
Creating NMReDATA: Build the SDF file with NMReDATA tags and package it with the spectral data
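Building the SDF with tags amounts to appending data items to a mol block. The sketch below shows that step with plain string handling, assuming each tag line ends in a backslash as in NMReDATA files; `format_sdf_tag` and `build_sdf_record` are hypothetical helpers, not functions of this tool.

```python
def format_sdf_tag(name, lines):
    """Format one SDF data item; each value line gets a trailing backslash."""
    body = "\n".join(line + "\\" for line in lines)
    return "> <" + name + ">\n" + body + "\n\n"


def build_sdf_record(mol_block, tags):
    """Append data items to a mol block and close the record with $$$$."""
    parts = [mol_block.rstrip("\n") + "\n"]
    for name, lines in tags.items():
        parts.append(format_sdf_tag(name, lines))
    parts.append("$$$$\n")
    return "".join(parts)


# The mol block here is a placeholder; a real one would come from the
# structure stored in the NMR-ML file.
record = build_sdf_record(
    "ethanol\n\n\nM  END",
    {
        "NMREDATA_VERSION": ["1.1"],
        "NMREDATA_ASSIGNMENT": ["a, 1.19, 4, 5, 6", "b, 3.65, 7, 8"],
    },
)
```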
Supported Spectrum Types:
1D-1H NMR
1D-13C NMR
2D-1H-13C-HSQC
2D-1H-13C-HMBC
2D-1H-1H-COSY
2D-1H-1H-TOCSY
2D-1H-1H-ROESY/NOESY
File Formats:
NMReDATA
SDF file containing molecular structure and NMR tags
Optional raw spectral data files
All files can be zipped together
NMR-ML
XML-based format with:
Molecular structure
Acquisition parameters
Processing parameters
Peak lists and assignments
Spectral data (optional)
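For the NMReDATA layout described above (an SDF file plus optional raw data, zipped together), packaging can be sketched with Python's `zipfile` module. The function name `package_nmredata` and the archive member names are assumptions made for illustration; the tool may arrange its archives differently.

```python
import io
import zipfile


def package_nmredata(sdf_text, raw_files, sdf_name="nmredata.sdf"):
    """Return the bytes of a ZIP archive holding the SDF and raw data files.

    raw_files maps archive paths (e.g. "1/fid") to their byte contents.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(sdf_name, sdf_text)
        for archive_path, payload in raw_files.items():
            archive.writestr(archive_path, payload)
    return buffer.getvalue()


# Placeholder contents; real raw data would be read from the vendor folder.
archive_bytes = package_nmredata(
    "ethanol\n\n\nM  END\n$$$$\n",
    {"1/fid": b"\x00\x00\x00\x00"},
)
member_names = zipfile.ZipFile(io.BytesIO(archive_bytes)).namelist()
```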
Example:
The example directory contains a sample implementation for ethanol:
cd example
python construct_example_nmredata.py
This creates:
nmredata.sdf: NMReDATA file for ethanol
Ethanol_nmredata.zip: ZIP archive with SDF and simulated data
Troubleshooting:
Missing dependencies: Ensure all required repositories are cloned and placed where the tool expects them
Path issues: Verify that the raw data paths exist and point to the correct directories
Format detection: If automatic detection fails, set the data format manually
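When checking why format detection fails, it can help to look for the files each vendor format normally contains. The sketch below relies on the usual conventions (a Bruker experiment folder holds `acqus` plus `fid` or `ser`; a Varian/Agilent folder holds `procpar` plus `fid`). It is not the detection logic of this tool, and `detect_vendor` is a hypothetical helper.

```python
from pathlib import Path


def detect_vendor(data_dir):
    """Guess the vendor format of a raw data folder from its file names.

    Returns "bruker", "varian", or None when neither pattern matches.
    """
    names = {path.name for path in Path(data_dir).iterdir() if path.is_file()}
    if "acqus" in names and names & {"fid", "ser"}:
        return "bruker"
    if "procpar" in names and "fid" in names:
        return "varian"
    return None
```

A `None` result means the folder does not match either pattern, which is the case where the format has to be set by hand.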