
Photonic Data Science

Research group Prof. Dr. Thomas Bocklitz
Fig. 3 (graphic: IPHT)

Head of the group

Univ.-Prof. Dr. Thomas Bocklitz
Professorship of Photonic Data Science
JenTower, Leutragraben 1
07743 Jena

Scientific Profile

We explore the entire life cycle of photonic data, from generation through data analysis to archiving. Following a holistic approach, we investigate procedures for experiment and sample-size planning as well as data pretreatment, and combine these with chemometric procedures, model transfer methods and artificial intelligence methods in a data pipeline. In this way, data from various photonic processes can be used for analysis, diagnostics and therapy in medicine, the life sciences, the environmental sciences and pharmacy. The data pipelines are implemented as software components and tested directly in the application environment, e.g. in clinical studies. Further focal points are the fusion of heterogeneous data sources, the simulation of different measurement procedures in order to optimize correction procedures, methods for interpreting analysis models, and the construction of data infrastructures for photonic measurement data that comply with the FAIR principles.

Research Topics

  • Machine learning for photonic image data

    Fig. 1 (graphic: IPHT)
  • Chemometrics / machine learning for spectral data
  • Correlation of different measurement methods and data fusion

Areas of application

  • Biomedical diagnostics using spectral measurement methods and imaging techniques
  • Extraction of higher-level information from photonic measurement data
  • Simulation- and data-driven correction of photonic data
  • Ensuring the FAIR principles for photonic data

Publications (74)

  1. Siamese networks in Raman spectroscopy: Towards a better performance against replicate variability

    Published in: Talanta: the international journal of pure and applied analytical chemistry. S. Guo, T. Bocklitz
    The power of Raman spectroscopy is largely enhanced by machine learning and chemometrics, which extract and translate spectral features into high-level biological or clinical knowledge by constructing classical or deep learning models. The generalizability of such models, however, is often degraded by large variations between the training data and the data to be predicted. Model transfer shows great potential in this regard: it improves the prediction on test data without re-building a model from scratch. We developed a method based on a Siamese neural network (SNet) and compared it with two baseline models as well as two model transfer methods, score movement (MS) and extended multiplicative scatter correction (EMSC). The performance was systematically verified on a Raman spectral dataset measured from four bacterial species, each consisting of nine biological replicates. Its generalizability was further tested on a second Raman dataset from mouse tissue samples. The Siamese network was demonstrated to outperform MS and EMSC, especially given large training datasets. Its demand for training data, however, is substantially lower than that of conventional networks and can be slightly reduced further when the variability between training and test data is properly incorporated into the loss function. More importantly, unlike MS and EMSC, the Siamese network does not require information about the test data for model adjustment or data-space adaptation, which makes it more advantageous in practice.
    University Bibliography Jena:
    fsu_mods_00030099
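The EMSC baseline discussed above can be illustrated with a short sketch. This is a generic, minimal numpy implementation of (extended) multiplicative scatter correction, not the code used in the paper; the function name, the quadratic baseline model and the toy data are assumptions for illustration:

```python
import numpy as np

def emsc(spectra, reference, poly_order=2):
    """Extended multiplicative scatter correction.

    Each spectrum s is modeled as s ≈ b * reference + polynomial baseline;
    the corrected spectrum is (s - baseline) / b.
    """
    n_points = reference.shape[0]
    x = np.linspace(-1.0, 1.0, n_points)
    # Design matrix: the reference spectrum plus polynomial baseline terms.
    design = np.column_stack([reference] + [x**k for k in range(poly_order + 1)])
    coeffs, *_ = np.linalg.lstsq(design, spectra.T, rcond=None)
    b = coeffs[0]                          # multiplicative scatter per spectrum
    baseline = design[:, 1:] @ coeffs[1:]  # additive baseline per spectrum
    return ((spectra.T - baseline) / b).T

# Toy check: scaled, offset copies of the reference are mapped back onto it.
ref = np.sin(np.linspace(0, 3 * np.pi, 200)) + 2.0
distorted = np.vstack([1.7 * ref + 0.4, 0.6 * ref - 0.2])
corrected = emsc(distorted, ref)
print(np.allclose(corrected, ref, atol=1e-6))  # → True
```

In practice the reference is usually the mean training spectrum, which is one reason such corrections need adjustment when the instrument, and hence the mean spectrum, changes.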
  2. A comparative study of robustness to noise and interpretability in U-Net-based denoising of Raman spectra

    Published in: Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy. A. Mokari, S. Eiserloh, O. Ryabchykov, U. Neugebauer, T. Bocklitz
  3. Automatic optimization of flat-field corrections by evaluation and enhancement (EVEN) in multimodal optical microscopy

    Published in: Nature Communications. E. Corbetta, M. Calvarese, P. Then, H. Bae, T. Meyer-Zedler, B. Messerschmidt, O. Guntinas-Lichius, M. Schmitt, C. Eggeling, J. Popp, T. Bocklitz
    Uneven illumination affects all images acquired by optical microscopes, especially large, multicolour and nonlinear measurements. Although removal is possible with various algorithms, evaluating raw and processed images is challenging due to the lack of established workflows for image quality assessment. This manuscript describes a machine learning-based method, EVEN (Evaluation and Enhancement), to assess and optimise corrections in optical microscopy. EVEN integrates quantitative image metrics into a Linear Discriminant Analysis model to detect and predict image quality, automatically optimising corrections. The method can be integrated into the optical microscopy pipeline to simplify further processing and analysis. Here, we show the implementation and application of EVEN in different processing scenarios, including multimodal nonlinear imaging of human head and neck tissue slices and multichannel fluorescence measurements of stained cells, demonstrating its capability to automatically optimise image quality by assessing single-channel corrections.
    University Bibliography Jena:
    fsu_mods_00029890
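EVEN scores flat-field corrections rather than prescribing one; as background, the correction step itself, dividing an image by a smoothed illumination estimate, can be sketched as follows. The box-blur illumination estimate and all names are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def box_blur(img, size):
    """Separable centered moving average via cumulative sums; `size` must be odd."""
    half = size // 2
    def blur_axis(a, axis):
        pad = [(0, 0)] * a.ndim
        pad[axis] = (half, half)
        ap = np.pad(a, pad, mode="edge")
        c = np.cumsum(ap, axis=axis)
        zeros = np.zeros_like(np.take(c, [0], axis=axis))
        c = np.concatenate([zeros, c], axis=axis)   # c[k] = sum of ap[0..k-1]
        n = a.shape[axis]
        lead = np.take(c, np.arange(size, size + n), axis=axis)
        lag = np.take(c, np.arange(n), axis=axis)
        return (lead - lag) / size                  # centered window sums
    return blur_axis(blur_axis(img, 0), 1)

def flat_field_correct(img, size=51):
    """Divide out a smoothed illumination estimate, preserving mean intensity."""
    illumination = box_blur(img, size)
    return img / np.maximum(illumination, 1e-12) * illumination.mean()

# Toy check: a flat sample under a linear illumination gradient becomes flat
# (away from the border, where the smoothing relies on padded values).
illum = np.outer(np.linspace(0.5, 1.5, 120), np.ones(120))
corrected = flat_field_correct(illum, size=31)
inner = corrected[20:-20, 20:-20]
print(float(inner.std()))  # close to 0
```

The point of EVEN-style evaluation is precisely that the choice of `size` (or of the correction algorithm altogether) changes the result, so the output quality needs to be scored objectively.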
  4. Bridging Spectral Gaps: Cross-Device Model Generalization in Blood-Based Infrared Spectroscopy

    Published in: Analytical Chemistry. F. Nemeth, N. Leopold-Kerschbaumer, D. Debreceni, F. Fleischmann, K. Borbely, D. Mazurencu-Marinescu-Pele, T. Bocklitz, M. Žigman, K. Kepesidis
    This paper presents a solution to the challenge of cross-device model generalization in blood-based infrared spectroscopy. As infrared spectroscopy becomes increasingly popular for analyzing human blood, ensuring that machine learning models trained on one device can be effectively transferred to others is essential. However, variations in device characteristics often reduce model performance when applied across different devices. To address this issue, we propose a straightforward domain adaptation method based on data augmentation incorporating device-specific differences. By expanding the training data to include a broader range of nuances, our approach enhances the model’s ability to adapt to the unique characteristics of various devices. We validate the effectiveness of our method through experimental testing on two Fourier-Transform Infrared (FTIR) spectroscopy devices from different research laboratories, demonstrating improved prediction accuracy and reliability.
    University Bibliography Jena:
    fsu_mods_00024831
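The augmentation idea above, expanding the training set with copies that carry device-specific nuances, can be sketched generically. The particular perturbations (gain, baseline tilt, offset, axis shift) and all names are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def augment_spectra(spectra, n_aug=5, rng=None):
    """Simulate device-to-device variation: random gain, baseline tilt,
    constant offset, and a small shift along the spectral axis.

    Returns the original spectra stacked with `n_aug` perturbed copies of each.
    """
    rng = np.random.default_rng(rng)
    n, p = spectra.shape
    x = np.linspace(-1.0, 1.0, p)
    out = [spectra]
    for _ in range(n_aug):
        gain = rng.uniform(0.9, 1.1, size=(n, 1))        # detector sensitivity
        tilt = rng.uniform(-0.05, 0.05, size=(n, 1)) * x  # sloped baseline
        offset = rng.uniform(-0.02, 0.02, size=(n, 1))    # constant offset
        shift = rng.integers(-2, 3, size=n)               # small axis misalignment
        perturbed = gain * spectra + tilt + offset
        perturbed = np.stack([np.roll(row, s) for row, s in zip(perturbed, shift)])
        out.append(perturbed)
    return np.concatenate(out, axis=0)

spectra = np.abs(np.random.default_rng(0).normal(size=(4, 300)))
augmented = augment_spectra(spectra, n_aug=5, rng=0)
print(augmented.shape)  # (24, 300)
```

A model trained on the augmented set has seen a range of plausible instrument responses, which is the mechanism by which such augmentation improves cross-device generalization.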
  5. Explainable artificial intelligence for spectroscopy data: a review

    Published in: Pflügers Archiv: European Journal of Physiology. J. Contreras, T. Bocklitz
    Explainable artificial intelligence (XAI) has gained significant attention in various domains, including natural and medical image analysis. However, its application in spectroscopy remains relatively unexplored. This systematic review aims to fill this gap by providing a comprehensive overview of the current landscape of XAI in spectroscopy and identifying potential benefits and challenges associated with its implementation. Following the PRISMA 2020 guideline, we conducted a systematic search across major journal databases, resulting in 259 initial search results. After removing duplicates and applying inclusion and exclusion criteria, 21 scientific studies were included in this review. Notably, most of the studies focused on using XAI methods for spectral data analysis, emphasizing the identification of significant spectral bands rather than specific intensity peaks. Among the most utilized AI techniques were SHapley Additive exPlanations (SHAP), masking methods inspired by Local Interpretable Model-agnostic Explanations (LIME), and Class Activation Mapping (CAM). These methods were favored due to their model-agnostic nature and ease of use, which enable interpretable explanations without modifying the original models. Future research should propose new methods and explore the adaptation of XAI methods employed in other domains to better suit the unique characteristics of spectroscopic data.
    University Bibliography Jena:
    fsu_mods_00016079
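The masking methods highlighted in the review can be illustrated with a minimal occlusion-style sketch that scores whole spectral bands rather than single intensity peaks; the function, the band width and the toy model are illustrative assumptions:

```python
import numpy as np

def band_importance(model, spectrum, band_width=20, baseline=0.0):
    """Occlusion-style importance: mask one spectral band at a time and record
    how much the model's score changes (larger change = more important band)."""
    p = spectrum.shape[0]
    ref = model(spectrum)
    scores = []
    for start in range(0, p, band_width):
        masked = spectrum.copy()
        masked[start:start + band_width] = baseline  # occlude this band
        scores.append(abs(model(masked) - ref))
    return np.array(scores)

# Toy model that only responds to the band covering indices 100-119.
toy_model = lambda s: s[100:120].sum()
spec = np.ones(200)
imp = band_importance(toy_model, spec, band_width=20)
print(imp.argmax())  # band index 5, i.e. positions 100-119
```

Because the procedure only needs model outputs, it is model-agnostic in the same sense as the LIME-inspired masking methods discussed in the review.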
  6. Enhancing prediction stability and performance in LIBS analysis using custom CNN architectures

    Published in: Talanta: the international journal of pure and applied analytical chemistry. P. Dehbozorgi, L. Duponchel, V. Motto-Ros, T. Bocklitz
    LIBS-based analysis has attracted ever-increasing interest in recent years as a technique well suited for chemical analysis tasks that rely on elemental fingerprinting. The method stands out for its ability to offer rapid, simultaneous multi-element analysis with the advantage of portability. In this research, our aim is to bridge the gap between the analysis of simulated and real data to better account for variations in plasma temperature and electron density, which are typically not considered in LIBS analysis. To achieve this, we employ two distinct methodologies, partial least squares (PLS) regression and convolutional neural networks (CNNs), to construct models for predicting the concentration of the 24 elements within each LIBS spectrum. The initial phase of our investigation concentrates on training and testing these models on simulated LIBS data, with results evaluated through RMSEP values. The IQR and median RMSEP values for all elements demonstrate that the CNNs consistently achieved values below 0.01, while PLS results ranged from 0.01 to 0.05, highlighting the superior stability and predictive accuracy of the CNN model. In the next phase, we applied the pre-trained models to real LIBS spectra, consistently identifying aluminum (Al), iron (Fe), and silicon (Si) as having the highest predicted concentrations. The overall predicted values were approximately 0.5 for Al, 0.6 for Si, and 0.04 for Fe. In the third phase, deliberate adjustments are made to the training parameters and architecture of the proposed CNN model to force the network to emphasize specific elements, prioritizing them over the other components present in each real LIBS spectrum. Generating three modified versions of the initially proposed CNN allows us to explore the impact of regularization, sample weighting, and a customized loss function on the prediction outcomes. Some elements emerge more clearly during this prediction phase, with calcium (Ca), magnesium (Mg), zinc (Zn), titanium (Ti), and gallium (Ga) exhibiting more pronounced patterns.
    University Bibliography Jena:
    fsu_mods_00018242
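The effect of a weighted, customized loss, one of the mechanisms explored in the third phase, can be demonstrated on a toy model with a single shared parameter, where up-weighting an element visibly shifts the optimum. Everything here is an illustrative assumption, not the paper's CNN:

```python
import numpy as np

def fit_shared_scale(X, Y, weights, lr=0.05, steps=2000):
    """Toy model pred = w * X with one parameter w shared across all outputs,
    trained by gradient descent on an element-weighted squared loss.
    The optimum moves toward whichever element carries the larger weight."""
    w = 0.0
    for _ in range(steps):
        pred = w * X
        grad = np.mean(2 * weights * (pred - Y) * X)  # d/dw of weighted MSE
        w -= lr * grad
    return w

X = np.ones((50, 2))
Y = np.tile([1.0, 3.0], (50, 1))  # element 0 prefers w=1, element 1 prefers w=3
w_balanced = fit_shared_scale(X, Y, np.array([1.0, 1.0]))
w_emphasis = fit_shared_scale(X, Y, np.array([9.0, 1.0]))  # emphasize element 0
print(round(w_balanced, 2), round(w_emphasis, 2))  # 2.0 1.2
```

With independent per-element regressors the weights would change nothing; it is the shared parameters, as in a CNN trunk, that let element weighting steer the whole model toward selected elements.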
  7. Machine Learning-Based Estimation of Experimental Artifacts and Image Quality in Fluorescence Microscopy

    Published in: Advanced Intelligent Systems. E. Corbetta, T. Bocklitz
    Reliable characterization of image data is fundamental for imaging applications, FAIR data management, and an objective evaluation of image acquisition, processing, and analysis steps in an image-based investigation of biological samples. Image quality assessment (IQA) often relies on human visual perception, which is not objective, or reference ground truth images, which are not often available. This study presents a method for a comprehensive IQA of microscopic images, which solves these issues by employing a set of reference-free metrics that estimate the presence of experimental artifacts. The metrics are jointly validated on a semisynthetic dataset and are tested on experimental images. Finally, the metrics are employed in a machine learning model, demonstrating their effectiveness for automatic artifact classification through multimarker IQA. This work provides a reliable reference-free method for IQA in optical microscopy, which can be integrated into the experimental workflow and tuned to address specific artifact detection tasks.
    University Bibliography Jena:
    fsu_mods_00018351
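Reference-free quality markers of the kind used here can be illustrated with two classic examples, a Laplacian-variance sharpness score and a local-deviation noise estimate; these specific metrics and names are illustrative assumptions, not the paper's metric set:

```python
import numpy as np

def iqa_metrics(img):
    """Two simple reference-free image quality markers:
    - sharpness: variance of the discrete Laplacian (drops for blurred images)
    - noise: median absolute deviation from the 4-neighbour mean
    """
    lap = (-4 * img[1:-1, 1:-1]
           + img[:-2, 1:-1] + img[2:, 1:-1]
           + img[1:-1, :-2] + img[1:-1, 2:])
    neigh_mean = (img[:-2, 1:-1] + img[2:, 1:-1]
                  + img[1:-1, :-2] + img[1:-1, 2:]) / 4
    return {"sharpness": float(lap.var()),
            "noise": float(np.median(np.abs(img[1:-1, 1:-1] - neigh_mean)))}

# Toy check: blurring an image lowers the sharpness marker.
rng = np.random.default_rng(1)
sharp = rng.random((64, 64))
blurred = (sharp + np.roll(sharp, 1, 0) + np.roll(sharp, 1, 1)) / 3
print(iqa_metrics(sharp)["sharpness"] > iqa_metrics(blurred)["sharpness"])  # True
```

A vector of such markers per image is exactly the kind of input a downstream classifier can use for automatic artifact detection.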
  8. Multimodal Integration Enhances Tissue Image Information Content: A Deep Feature Perspective

    Published in: Bioengineering. F. Darzi, T. Bocklitz
    Multimodal imaging techniques have the potential to enhance the interpretation of histology by offering additional molecular and structural information beyond that accessible through hematoxylin and eosin (H&E) staining alone. Here, we present a quantitative approach for comparing the information content of different image modalities, such as H&E and multimodal imaging. We used a combination of deep learning and radiomics-based feature extraction with different information markers, implemented in Python 3.12, to compare the information content of the H&E stain, multimodal imaging, and the combined dataset. We also compared the information content of individual channels in the multimodal image and of different Coherent Anti-Stokes Raman Scattering (CARS) microscopy spectral channels. The quantitative measurements of information that we utilized were Shannon entropy, inverse area under the curve (1-AUC), the number of principal components describing 95% of the variance (PC95), and inverse power law fitting. For example, the combined dataset achieved an entropy value of 0.5740, compared to 0.5310 for H&E and 0.5385 for the multimodal dataset using MobileNetV2 features. The number of principal components required to explain 95 percent of the variance was also highest for the combined dataset, with 62 components, compared to 33 for H&E and 47 for the multimodal dataset. These measurements consistently showed that the combined datasets provide more information. These observations highlight the potential of multimodal combinations to enhance image-based analyses and provide a reproducible framework for comparing imaging approaches in digital pathology and biomedical image analysis.
    University Bibliography Jena:
    fsu_mods_00027232
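Two of the information measures named in the abstract, Shannon entropy and PC95, are straightforward to compute on a feature matrix. This is a minimal numpy sketch; the histogram binning and all names are assumptions about details the abstract does not specify:

```python
import numpy as np

def shannon_entropy(values, bins=64):
    """Shannon entropy (bits) of a histogram over the feature values."""
    hist, _ = np.histogram(values, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def pc95(features):
    """Number of principal components explaining 95% of the variance."""
    centered = features - features.mean(axis=0)
    # Squared singular values are proportional to the PCA eigenvalues.
    s = np.linalg.svd(centered, compute_uv=False)
    explained = np.cumsum(s**2) / np.sum(s**2)
    return int(np.searchsorted(explained, 0.95) + 1)

# Toy check: low-rank features need far fewer components than full-rank ones.
rng = np.random.default_rng(0)
low_rank = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 50))
full_rank = rng.normal(size=(200, 50))
print(pc95(low_rank), pc95(full_rank))
```

Higher entropy and a higher PC95 both indicate that the feature set spreads its variance over more independent directions, which is the sense in which the combined dataset carries "more information".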
  9. Advances in physics-informed deep learning for imaging data: a review of methods and applications

    Published in: JPhys Photonics. Y. Yogita, T. Bocklitz
    Deep learning (DL) has transformed numerous application domains owing to its ability to automatically extract features from data. However, training DL models typically requires large datasets, which are often unavailable in scientific research. In recent years, the integration of physics with DL, known as physics-informed DL (PIDL), has emerged as a promising approach that enables models to learn from limited data. This survey provides an overview of recent advancements in PIDL methods, summarizing the various incorporation techniques and physical priors used in inverse imaging applications. The review highlights the strengths of PIDL, including improved interpretability, data efficiency, robustness, and generalization. It also discusses shortcomings, such as the lack of formulated physics representations, the need for domain-specific knowledge, and high computational costs. Although PIDL is a relatively new methodology, it has significant potential for creating resilient, efficient, precise, and adaptable models for real-world applications. This survey offers insights into the fundamentals of PIDL in imaging and emphasizes its growing importance in bridging the gap between data-driven approaches and physics-based modeling in scientific research. As the field progresses, PIDL is likely to play an increasingly crucial role in advancing scientific understanding and real-world applications.
    University Bibliography Jena:
    fsu_mods_00028564
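The core PIDL idea, combining a data-fit term with a physics residual so that a model can learn from very little data, can be shown on a toy inverse problem: recovering a full decay curve from three observations by penalizing violations of the ODE y' + k·y = 0 at every grid point. The discretization, weighting and names are illustrative assumptions:

```python
import numpy as np

def physics_informed_fit(t, obs_idx, y_obs, k, lam=10.0):
    """Recover a curve y(t) on a grid from sparse observations by solving a
    least-squares system stacking a data term with the ODE residual
    y'(t) + k*y(t) = 0 (forward differences, weighted by `lam`)."""
    n = t.size
    dt = t[1] - t[0]
    # Forward-difference operator D with (D @ c)[i] ≈ y'(t_i)
    D = (np.eye(n, k=1) - np.eye(n))[:-1] / dt
    physics = D + k * np.eye(n)[:-1]           # rows of (y' + k*y)
    data = np.zeros((len(obs_idx), n))
    data[np.arange(len(obs_idx)), obs_idx] = 1.0
    A = np.vstack([data, lam * physics])
    b = np.concatenate([y_obs, np.zeros(n - 1)])
    c, *_ = np.linalg.lstsq(A, b, rcond=None)
    return c

t = np.linspace(0.0, 2.0, 101)
truth = 3.0 * np.exp(-1.5 * t)
obs_idx = np.array([0, 50, 100])               # only three observed points
c = physics_informed_fit(t, obs_idx, truth[obs_idx], k=1.5)
print(np.max(np.abs(c - truth)) < 0.05)  # True: the ODE fills in the gaps
```

With only the data term this system would be hopelessly underdetermined; the physics prior supplies the missing constraints, which is the data-efficiency argument the review makes for PIDL in general.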