Nothing but the Ground Truth
One year ago, in October 2023, our project partner NEO installed a hyperspectral imaging (HSI) system in Leoben and trained us to generate a substantial dataset of over 150 GB, aimed at developing a classification model to support LIBS measurements for sorting our spent refractories.
This HSI system consists of a high-resolution optical sensor operating in the 400 to 2500 nm range. It can be used to detect and differentiate materials based on their unique spectral signatures, which are influenced by mineralogy and bonding systems. The HSI sensor supports the LIBS unit in two primary ways:
- Identifying Key Regions: It locates areas on the sample that are significant for classification, such as regions that are free from contamination.
- Material Classification: It classifies spent refractories, thereby improving accuracy and validating classification decisions.
To achieve effective classification, the system must be trained using samples with known properties, referred to as ground truth data. This data serves as a benchmark for training multivariate and machine learning models. In the context of optical sensors, these samples are typically prepared under controlled conditions, ensuring that their spectral properties are well-documented and reliably reproducible. Ground truth data is crucial for training algorithms, as it provides a reference for the model to learn how to recognize patterns. However, as you might recall from previous blog posts, determining the exact material properties of spent refractories poses certain challenges. Fortunately, we recognized the importance of sampling early in the project and gained valuable insights into our feedstock, enabling us to provide well-characterized samples for our initial training.
For our sample preparation, we selected spent refractory samples from various customers and aggregates corresponding to specific sorting classes, including defined common contaminants. We identified these classes either visually through well-known optical features or by chemically analyzing the bricks for confirmation. We then crushed the samples manually to preserve their integrity, ultimately providing images and data for about 2,000 pieces that represent the sample shape to be placed on the conveyor belt in the sorting equipment.
Once prepared, these samples were analyzed using HSI. For data acquisition, each sample was illuminated, and the resulting spectral data was recorded as a reflectance spectrum. Consistency in data collection and meticulous documentation were crucial, so in total, we stored the aforementioned 150 GB of spectral data, along with an additional 4.5 GB of images documenting the sample origins, setup, and measurement settings.
After gathering the spectral data, we shared it with NEO to extract relevant features for classification. This process may involve statistical analyses or dimensionality reduction techniques, such as Principal Component Analysis (PCA), to pinpoint the most informative aspects of the spectral data. To aid in this, we also provided spectral data for our most common raw materials, helping to link spectral features to material classifications.
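To make the idea of dimensionality reduction more concrete, here is a minimal PCA sketch in Python with NumPy. It is an illustration only, not the feature extraction NEO actually performed: the spectra are synthetic, and the helper names (`pca_scores`, `make_class`), the band count, and the absorption positions are assumptions chosen for the example.

```python
import numpy as np


def pca_scores(spectra, n_components):
    """Project spectra (n_samples x n_bands) onto their top principal components.

    Returns the component scores and the fraction of variance each
    retained component explains.
    """
    centered = spectra - spectra.mean(axis=0)
    # SVD of the mean-centered data: the rows of vt are the principal axes.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    scores = centered @ vt[:n_components].T
    return scores, explained[:n_components]


# Synthetic stand-in data (hypothetical): 200 bands spanning 400 to 2500 nm.
rng = np.random.default_rng(0)
bands = np.linspace(400, 2500, 200)


def make_class(center_nm, n):
    """Generate n noisy reflectance spectra with one absorption dip."""
    base = 1.0 - 0.5 * np.exp(-(((bands - center_nm) / 80.0) ** 2))
    return base + rng.normal(0.0, 0.01, size=(n, bands.size))


# Two invented material classes with dips at different wavelengths.
spectra = np.vstack([make_class(1400, 30), make_class(2200, 30)])
scores, explained = pca_scores(spectra, n_components=3)
print("explained variance ratio:", np.round(explained, 3))
```

In this toy case the 200 bands collapse to a handful of scores, and the first component alone already separates the two invented classes. Real spent-refractory spectra are far less clean, so the number of useful components has to be determined from the data.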
With the ground truth data and extracted features, we began the actual model training. By applying machine learning algorithms, such as Support Vector Machines (SVM), the model learns to associate specific spectral patterns with their corresponding material classifications. The final step involves validating and testing the model to ensure its effectiveness. This validation process requires a separate set of samples not included in the training phase, which assesses the model’s ability to generalize to new, unseen data—an essential measure of its practical applicability.
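The train-then-validate workflow described above can be sketched with scikit-learn as follows. This is a hedged illustration, not our production pipeline: the features are synthetic, and the three classes, the class means, the RBF kernel, and the 70/30 split are all assumptions made for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for extracted spectral features (hypothetical values):
# three sorting classes, 100 pieces each, 5 features per piece.
rng = np.random.default_rng(42)
n_per_class, n_features = 100, 5
class_means = np.array(
    [
        [0.0, 0.0, 0.0, 0.0, 0.0],
        [3.0, 3.0, 0.0, 0.0, 0.0],
        [0.0, 3.0, 3.0, 0.0, 0.0],
    ]
)
X = np.vstack(
    [mean + rng.normal(0.0, 0.5, size=(n_per_class, n_features)) for mean in class_means]
)
y = np.repeat(np.arange(len(class_means)), n_per_class)

# Hold back 30% of the pieces; the model never sees them during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Scale the features, then fit a support vector classifier.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
model.fit(X_train, y_train)

# Accuracy on the held-out pieces estimates how well the model generalizes.
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

The important design choice is that accuracy is computed only on the held-out pieces. Scoring the model on its own training samples would overstate how well it handles material it has never seen.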
A year after the installation of the equipment and continuous exchange with NEO, we are pleased to report that we can classify certain sorting classes with an accuracy exceeding 95%. While fine-tuning remains necessary and the upscaling to actual equipment is still in progress, we are excited about the advancements made in our material classification capabilities. By focusing on chemical and mineralogical properties rather than visual features, we are not only minimizing subjectivity in our sorting process but we are also able to customize our sorting classes to further valorize our circular raw materials.
Author Portraits
Simone Neuhold
Dr. Simone Neuhold currently works for RHI Magnesita. Before joining the company, she worked at Pilkington Deutschland AG/NSG Group. Simone studied Chemistry and Advanced Materials Science at TU Graz, and Waste Management and Waste Processing Technologies at Montanuniversität Leoben. Her research interests are recycling of mineral wastes, materials science and ecodesign.
Julio Hernandez
Julio Hernandez, MSc, is a senior research scientist at Norsk Elektro Optikk AS with over 15 years of experience in the field of hyperspectral imaging. Julio has worked on developing scientific-grade hyperspectral cameras and data acquisition systems for a variety of applications within remote sensing, defense, industry and biomedical research. He is currently Manager of the Hyperspectral Applications department at HySpex, focused on developing customized solutions for end-users and promoting the adoption of hyperspectral technologies in new markets. Julio studied Physics at the National Autonomous University of Mexico (Mexico) and Nanotechnology at Chalmers University of Technology (Sweden) with specialization in quantum information systems.