Preprocessing
Rampy offers a handful of functions to preprocess your spectra. Do not hesitate to propose or ask for new functionality!
Below you will find the documentation of the relevant functions. They are often used in the Example notebooks.
Flip the X axis
Some spectra come with decreasing X values. Rampy offers a simple function to flip them, which can be necessary, for example, before resampling (interpolation algorithms usually require increasing X values). The function also handles X values in arbitrary order: it returns the array sorted so that the X axis increases.
- class rampy.spectranization.flipsp(sp: ndarray)
Sorts or flips a spectrum along its row dimension based on x-axis values.
- Parameters:
sp (np.ndarray) – A 2D array where the first column contains x-axis values and subsequent columns contain y-values.
- Returns:
The input array sorted in ascending order based on the first column.
- Return type:
np.ndarray
Notes
Uses np.argsort to ensure sorting regardless of initial order.
Example
>>> import numpy as np
>>> import rampy as rp
>>> sp = np.array([[300, 30], [100, 10], [200, 20]])
>>> sorted_sp = rp.flipsp(sp)
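The sorting behaviour described in the Notes can be sketched with plain NumPy. This is only an illustration of the np.argsort idea, not Rampy's source code; the helper name flip_spectrum is hypothetical.

```python
import numpy as np

def flip_spectrum(sp):
    """Hypothetical stand-in for rp.flipsp: sort rows by the first column."""
    order = np.argsort(sp[:, 0])  # indices that put the x values in ascending order
    return sp[order, :]           # reorder every column with the same indices

sp = np.array([[300, 30], [100, 10], [200, 20]])
sorted_sp = flip_spectrum(sp)
print(sorted_sp)
```

Because the rows are reordered as a whole, each y value stays attached to its x value, whatever the initial order.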
Shift the X axis
You can shift the X axis by a given value using the following function:
- class rampy.spectranization.shiftsp(sp: ndarray, shift: float)
Shifts the x-axis values of a spectrum by a given amount.
- Parameters:
sp (np.ndarray) – A 2D array where the first column contains x-axis values (e.g., frequency or wavenumber) and subsequent columns contain y-values.
shift (float) – The amount by which to shift the x-axis values. Negative values shift left; positive values shift right.
- Returns:
The input array with shifted x-axis values.
- Return type:
np.ndarray
Example
>>> import numpy as np
>>> import rampy as rp
>>> sp = np.array([[100, 10], [200, 20], [300, 30]])
>>> shifted_sp = rp.shiftsp(sp, shift=50)
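For reference, the operation amounts to adding a constant to the first column. The NumPy sketch below assumes exactly that and uses a hypothetical helper name, shift_x; it is not Rampy's implementation.

```python
import numpy as np

def shift_x(sp, shift):
    """Hypothetical stand-in for rp.shiftsp: add `shift` to the x column."""
    out = np.array(sp, dtype=float)  # work on a copy so the input is untouched
    out[:, 0] += shift               # only the x-axis column changes
    return out

sp = np.array([[100, 10], [200, 20], [300, 30]])
shifted_sp = shift_x(sp, shift=50)
print(shifted_sp[:, 0])
```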
Extract a portion / portions of a signal
You can use the function rampy.extract_signal() to do that (old version: rampy.get_portion_interest).
- class rampy.baseline.extract_signal(x: ndarray, y: ndarray, roi)
Extracts the signal from specified regions of interest (ROI) in the x-y data.
This function selects and extracts portions of the input x-y data based on the specified regions of interest (ROI) provided in roi. Each region is defined by a lower and upper bound along the x-axis.
- Parameters:
x (ndarray) – The x-axis values (e.g., time, wavelength, or other independent variables).
y (ndarray) – The y-axis values corresponding to the x-axis values (e.g., signal intensity).
roi (ndarray or list of lists) –
Regions of interest (ROI) where the signal should be extracted. Must be an n x 2 array or a list of lists, where n is the number of regions to extract. Each sublist or row should contain two elements:
The lower bound of the region (inclusive).
The upper bound of the region (inclusive).
Example
Array: np.array([[10, 20], [50, 70]])
List: [[10, 20], [50, 70]]
- Returns:
- A 2-column array containing the extracted x-y signals from the specified regions.
The first column contains the x values, and the second column contains the corresponding y values.
- Return type:
ndarray
- Raises:
ValueError – If roi is not a valid n x 2 array or list of lists, or if any region in roi falls outside the range of x.
Notes
Overlapping regions in roi are not merged; they are processed as separate regions.
If no valid regions are found within roi, an empty array is returned.
Examples
Extracting signal from two regions in an x-y dataset:
>>> import numpy as np
>>> import rampy as rp
>>> x = np.linspace(0, 100, 101)
>>> y = np.sin(x / 10) + np.random.normal(0, 0.1, size=x.size)
>>> roi = [[10, 20], [50, 70]]
>>> extracted_signal = rp.extract_signal(x, y, roi)
>>> print(extracted_signal)
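To make the inclusive bounds concrete, here is a minimal NumPy sketch of the selection logic, assuming each region is applied as a boolean mask and the pieces are stacked in order. The helper name extract_regions is hypothetical and the range checks of the real function are omitted.

```python
import numpy as np

def extract_regions(x, y, roi):
    """Hypothetical sketch: keep the (x, y) points that fall inside each ROI."""
    roi = np.asarray(roi, dtype=float)
    if roi.ndim != 2 or roi.shape[1] != 2:
        raise ValueError("roi must be an n x 2 array or list of lists")
    pieces = []
    for low, high in roi:
        mask = (x >= low) & (x <= high)  # both bounds are inclusive
        pieces.append(np.column_stack((x[mask], y[mask])))
    return np.vstack(pieces)             # regions are stacked, not merged

x = np.linspace(0, 100, 101)
y = np.sin(x / 10)
extracted = extract_regions(x, y, [[10, 20], [50, 70]])
print(extracted.shape)  # (32, 2): 11 points in [10, 20] plus 21 in [50, 70]
```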
Remove spikes
Spikes can be removed via the rampy.despiking() function. It takes as input the X and Y values of a spectrum, a number of neighbouring points (neigh) and a threshold. The threshold is the multiple of the noise level, estimated as the root mean square error (RMSE) of the local residuals, that a point must exceed to be considered a spike. For instance, with a threshold of 3, a point is flagged as a spike if its residual is more than 3 times the RMSE. The function then replaces the spike with the mean of the neighbouring points before and after it.
- class rampy.spectranization.despiking(x: ndarray, y: ndarray, neigh: int = 4, threshold: int = 3)
Removes spikes from a 1D signal using a threshold-based approach.
This function identifies spikes in a signal by comparing local residuals to a threshold based on the root mean square error (RMSE). Spikes are replaced with the mean of neighboring points.
- Parameters:
x (np.ndarray) – A 1D array containing the x-axis values of the signal.
y (np.ndarray) – A 1D array containing the y-axis values of the signal to despike.
neigh (int) – The number of neighboring points to use for calculating average values during despiking and for smoothing. Default is 4.
threshold (int) – The multiplier of RMSE used to identify spikes. Default is 3.
- Returns:
A 1D array of the despiked signal.
- Return type:
np.ndarray
Example
>>> import numpy as np
>>> import rampy as rp
>>> x = np.linspace(0, 10, 100)
>>> y = rp.gaussian(x, 10., 5., 2.0)  # peak centred at x = 5, inside the x range
>>> y[30] += 50.  # add an artificial spike to remove
>>> y_despiked = rp.despiking(x, y)
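The threshold idea can be illustrated with a simplified NumPy sketch. It is an assumption-laden stand-in, not Rampy's algorithm: here the residual of each point is taken against the mean of its neigh neighbours on each side, and the helper name despike_sketch is hypothetical.

```python
import numpy as np

def despike_sketch(y, neigh=4, threshold=3.0):
    """Simplified, hypothetical despiking: flag points far from their neighbours' mean."""
    y = np.asarray(y, dtype=float)
    n = y.size
    local_mean = np.empty(n)
    for i in range(n):
        lo, hi = max(0, i - neigh), min(n, i + neigh + 1)
        window = np.concatenate((y[lo:i], y[i + 1:hi]))  # neighbours, excluding point i
        local_mean[i] = window.mean()
    residuals = y - local_mean
    rmse = np.sqrt(np.mean(residuals ** 2))              # overall noise estimate
    spikes = np.abs(residuals) > threshold * rmse        # threshold is a multiple of the RMSE
    y_out = y.copy()
    y_out[spikes] = local_mean[spikes]                   # replace spikes by the neighbours' mean
    return y_out

x = np.linspace(0, 10, 100)
y = 10.0 * np.exp(-np.log(2) * ((x - 5.0) / 2.0) ** 2)  # smooth synthetic peak
y_spiked = y.copy()
y_spiked[30] += 50.0                                     # artificial spike
y_clean = despike_sketch(y_spiked)
```

In this sketch only the flagged points are modified, so the rest of the signal is returned unchanged.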
Resampling a spectrum
We sometimes need to resample a spectrum onto a new X axis. rampy.resample() offers this ability. For instance, suppose we have a spectrum with an X axis from 400 to 1300 cm-1, with a point every 0.9 cm-1, and we want the same spectrum with a point every 1 cm-1. The following function does this:
- class rampy.spectranization.resample(x: ndarray, y: ndarray, x_new: ndarray, **kwargs)
Resamples a y signal along new x-axis values using interpolation.
- Parameters:
x (np.ndarray) – Original x-axis values.
y (np.ndarray) – Original y-axis values corresponding to x.
x_new (np.ndarray) – New x-axis values for resampling.
**kwargs –
Additional arguments passed to scipy.interpolate.interp1d.
kind (str or int): Type of interpolation (‘linear’, ‘cubic’, etc.). Default is ‘linear’.
bounds_error (bool): If True, raises an error when extrapolation is required. Default is False.
fill_value (float or str): Value used for out-of-bounds points. Default is NaN or “extrapolate”.
- Returns:
Resampled y-values corresponding to x_new.
- Return type:
np.ndarray
Example
>>> import numpy as np
>>> import rampy as rp
>>> original_x = np.array([100, 200, 300])
>>> original_y = np.array([10, 20, 30])
>>> new_x = np.linspace(100, 300, 5)
>>> resampled_y = rp.resample(original_x, original_y, new_x)
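The 400 to 1300 cm-1 case mentioned in the introduction of this section can be sketched as follows. To keep the snippet free of dependencies other than NumPy, np.interp is used as a stand-in for linear interpolation; the synthetic peak is an assumption made for illustration, and the equivalent Rampy call is shown as a comment.

```python
import numpy as np

# Original axis: one point every 0.9 cm-1 between 400 and 1300 cm-1
x = np.arange(400.0, 1300.0, 0.9)
y = np.exp(-np.log(2) * ((x - 800.0) / 50.0) ** 2)  # synthetic peak at 800 cm-1

# New axis: one point every 1 cm-1, kept inside the original range
x_new = np.arange(400.0, 1300.0, 1.0)

# Linear interpolation onto the new axis (NumPy stand-in)
y_new = np.interp(x_new, x, y)

# With Rampy, the corresponding call would be:
# y_new = rp.resample(x, y, x_new)
```

Keeping x_new within the range of x avoids extrapolation, which would otherwise be governed by the bounds_error and fill_value arguments.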
Normalisation
Rampy provides the rampy.normalise() function to normalise the Y values of a spectrum to:
- the maximum intensity
- the trapezoidal area under the curve
- the min-max values of intensities
- class rampy.spectranization.normalise(y: ndarray, x: ndarray = None, method: str = 'intensity')
Normalizes the y signal(s) using specified methods.
This function normalizes the input y signal(s) based on the chosen method: by area under the curve, maximum intensity, or min-max scaling.
- Parameters:
y (np.ndarray) – A 2D array of shape (m values, n samples) containing the y values to normalize.
x (np.ndarray, optional) – A 2D array of shape (m values, n samples) containing the x values corresponding to y. Required for area normalization. Default is None.
method (str) – The normalization method to use. Options are:
‘area’: Normalize by the area under the curve.
‘intensity’: Normalize by the maximum intensity.
‘minmax’: Normalize using min-max scaling.
- Returns:
A 2D array of normalized y signals with the same shape as the input y.
- Return type:
np.ndarray
- Raises:
ValueError – If x is not provided when using the ‘area’ normalization method.
NotImplementedError – If an invalid normalization method is specified.
Example
>>> import numpy as np
>>> import rampy as rp
>>> x = np.linspace(0, 10, 100)
>>> y = rp.gaussian(x, 10., 5., 2.0)  # peak centred at x = 5, inside the x range
>>> y_norm = rp.normalise(y, x=x, method="area")
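The three methods correspond to simple arithmetic, sketched below with NumPy on a single synthetic spectrum. This is an illustration of the definitions, not Rampy's code; the trapezoidal area is computed by hand to avoid depending on a particular NumPy version.

```python
import numpy as np

x = np.linspace(0, 10, 100)
y = 10.0 * np.exp(-np.log(2) * ((x - 5.0) / 2.0) ** 2) + 1.0  # peak on a constant offset

# 'intensity': divide by the maximum intensity
y_intensity = y / np.max(y)

# 'area': divide by the trapezoidal area under the curve (this is why x is required)
area = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))
y_area = y / area

# 'minmax': rescale so that the minimum is 0 and the maximum is 1
y_minmax = (y - np.min(y)) / (np.max(y) - np.min(y))
```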
Temperature and excitation line effects
Raman spectra may need correction for temperature and excitation line effects. See the review of Brooker et al. (1988) for details. Rampy offers several ways to do so with the rampy.tlcorrection() function.
- class rampy.tlcorrection.tlcorrection(x: ndarray, y: ndarray, temperature: float, wavelength: float, **kwargs)
Corrects Raman spectra for temperature and excitation line effects.
This function applies corrections to Raman spectra to account for temperature and laser excitation wavelength effects. It supports multiple correction equations and normalization methods, making it suitable for a variety of materials and experimental conditions.
- Parameters:
x (np.ndarray) – Raman shifts in cm⁻¹.
y (np.ndarray) – Intensity values (e.g., counts).
temperature (float) – Temperature in °C.
wavelength (float) – Wavelength of the laser that excited the sample, in nm.
correction (str, optional) – The correction equation to use. Default is ‘long’. Options are:
‘long’: Default equation from Galeener and Sen (1978) with a \(v_0^3\) coefficient correction.
‘galeener’: Original equation from Galeener and Sen (1978), based on Shuker and Gammon (1970).
‘hehlen’: Equation from Hehlen et al. (2010), preserving the Boson peak signal.
normalisation (str, optional) – Normalization method for the corrected data. Default is ‘area’. Options are:
‘intensity’: Normalize by maximum intensity.
‘area’: Normalize by total area under the curve.
‘no’: No normalization.
density (float, optional) – Density of the studied material in kg/m³, used only with the ‘hehlen’ equation. Default is 2210.0 (density of silica).
- Returns:
x (np.ndarray): Raman shift values after correction.
ycorr (np.ndarray): Corrected intensity values.
ese_corr (np.ndarray): Propagated errors calculated as \(\sqrt{y}\) on raw intensities.
- Return type:
tuple[np.ndarray, np.ndarray, np.ndarray]
- Raises:
ValueError – If an invalid correction or normalization method is specified.
Notes
The ‘galeener’ equation is a modification of Shuker and Gammon’s formula to account for the \((v_0 - v)^4\) dependence of Raman intensity.
The ‘long’ equation includes a \(v_0^3\) coefficient to remove cubic meter dimensions, as used in several studies like Mysen et al. (1982).
The ‘hehlen’ equation avoids signal suppression below 500 cm⁻¹, preserving features like the Boson peak in glasses.
References
Galeener, F.L., & Sen, P.N. (1978). Theory of the first-order vibrational spectra of disordered solids. Physical Review B, 17(4), 1928–1933.
Hehlen, B. (2010). Inter-tetrahedra bond angle of permanently densified silicas extracted from their Raman spectra. Journal of Physics: Condensed Matter, 22(2), 025401.
Brooker, M.H., Nielsen, O.F., & Praestgaard, E. (1988). Assessment of correction procedures for reduction of Raman spectra. Journal of Raman Spectroscopy, 19(2), 71–78.
Mysen, B.O., Finger, L.W., Virgo, D., & Seifert, F.A. (1982). Curve-fitting of Raman spectra of silicate glasses. American Mineralogist, 67(7-8), 686–695.
Neuville, D.R., & Mysen, B.O. (1996). Role of aluminium in the silicate network: In situ high-temperature study of glasses and melts on the join SiO₂-NaAlO₂. Geochimica et Cosmochimica Acta, 60(9), 1727–1737.
Le Losq, C., Neuville, D.R., Moretti, R., & Roux, J. (2012). Determination of water content in silicate glasses using Raman spectrometry: Implications for the study of explosive volcanism. American Mineralogist, 97(5-6), 779–790.
Shuker, R., & Gammon, R.W. (1970). Raman-scattering selection-rule breaking and the density of states in amorphous materials. Physical Review Letters, 25(4), 222.
Examples
Correct a simple spectrum using default parameters:
>>> import numpy as np
>>> import rampy as rp
>>> x = np.array([100, 200, 300])  # Raman shifts in cm⁻¹
>>> y = np.array([10, 20, 30])  # Intensity values
>>> temperature = 25.0  # Temperature in °C
>>> wavelength = 532.0  # Wavelength in nm
>>> x_corr, y_corr, ese_corr = rp.tlcorrection(x, y, temperature, wavelength)
Use a specific correction equation and normalization method:
>>> x_corr, y_corr, ese_corr = rp.tlcorrection(
...     x=x, y=y, temperature=25, wavelength=532,
...     correction='hehlen', normalisation='intensity', density=2500
... )
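As a rough guide to what the ‘long’ option computes, the sketch below applies the commonly quoted form of that correction, I_corr = I_obs · ν0³ · ν / (ν0 − ν)⁴ · [1 − exp(−hcν / kT)], without any normalisation. It is written under that assumption and is not taken from Rampy's source; the function name long_correction_sketch is hypothetical.

```python
import numpy as np

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e10    # speed of light in cm/s, so wavenumbers can stay in cm-1
K_B = 1.380649e-23   # Boltzmann constant, J/K

def long_correction_sketch(x, y, temperature_c, wavelength_nm):
    """Hypothetical, unnormalised sketch of the 'long' correction."""
    nu0 = 1.0e7 / wavelength_nm                  # laser wavenumber in cm-1
    t_kelvin = temperature_c + 273.15            # temperature is given in degrees C
    frequency = nu0 ** 3 * x / (nu0 - x) ** 4    # frequency factor (dimensionless ratio)
    boltzmann = 1.0 - np.exp(-H * C * x / (K_B * t_kelvin))  # thermal population factor
    return y * frequency * boltzmann

x = np.array([100.0, 200.0, 300.0])  # Raman shifts in cm-1
y = np.array([10.0, 20.0, 30.0])     # intensities
y_long = long_correction_sketch(x, y, temperature_c=25.0, wavelength_nm=532.0)
```

The values returned by rampy.tlcorrection() will differ in scale from this sketch, since the function normalises by area by default.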
Wavelength-wavenumber conversion
The convert_x_units() function converts your X values from nm to inverse cm, or the opposite! Do not hesitate to propose new ways to enrich it!
- class rampy.spectranization.convert_x_units(x: ndarray, laser_nm: float = 532.0, way: str = 'nm_to_cm-1')
Converts between nanometers and inverse centimeters for Raman spectroscopy.
- Parameters:
x (np.ndarray) – Array of x-axis values to convert.
laser_nm (float) – Wavelength of the excitation laser in nanometers. Default is 532.0 nm.
way (str) – Conversion direction. Options are:
“nm_to_cm-1”: Convert from nanometers to inverse centimeters.
“cm-1_to_nm”: Convert from inverse centimeters to nanometers.
- Returns:
Converted x-axis values.
- Return type:
np.ndarray
- Raises:
ValueError – If an invalid conversion direction is specified.
Example
Convert from nanometers to inverse centimeters:
>>> import numpy as np
>>> import rampy as rp
>>> x_nm = np.array([600.0])
>>> x_cm_1 = rp.convert_x_units(x_nm)
Convert from inverse centimeters to nanometers:
>>> x_cm_1 = np.array([1000.0])
>>> x_nm = rp.convert_x_units(x_cm_1, way="cm-1_to_nm")
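The presence of the laser_nm parameter suggests that the converted values are Raman shifts relative to the excitation line. Under that assumption, the usual relation is shift (cm-1) = 10⁷ / laser_nm − 10⁷ / x_nm, which the NumPy sketch below applies in both directions. The helper names are hypothetical and this is not Rampy's code.

```python
import numpy as np

def nm_to_raman_shift(x_nm, laser_nm=532.0):
    """Hypothetical helper: absolute wavelength (nm) to Raman shift (cm-1)."""
    return 1.0e7 / laser_nm - 1.0e7 / x_nm

def raman_shift_to_nm(shift_cm1, laser_nm=532.0):
    """Hypothetical helper: Raman shift (cm-1) back to absolute wavelength (nm)."""
    return 1.0e7 / (1.0e7 / laser_nm - shift_cm1)

x_nm = np.array([600.0])
shift = nm_to_raman_shift(x_nm)    # about 2130.3 cm-1 for a 532 nm laser
back = raman_shift_to_nm(shift)    # round trip returns 600 nm
```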