Peak fitting

Introduction

Spectra allows you to fit a spectrum with a sum of peaks of different shapes. For that, we rely on the peak shape functions gaussian, lorentzian, pseudovoigt and pearson7. Using those, you can generate a signal given x values and peak parameters. To generate a signal composed of multiple contributions from different peaks, use the create_peaks function. It is useful, for instance, to create "fake" signals and test the peak fitting function, fit_peaks.
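
For example, a synthetic two-peak signal can be generated in a few lines (a minimal sketch following the create_peaks API documented below; the peak values are arbitrary):

using Spectra

# Describe each peak with a Dict, then sum them with create_peaks
x = collect(0:0.1:10)
peak_infos = [
    Dict(:type => :gaussian, :amplitude => 1.0, :center => 4.0, :hwhm => 0.5),
    Dict(:type => :lorentzian, :amplitude => 0.6, :center => 7.0, :hwhm => 0.8),
]
peaks, total = create_peaks(x, peak_infos)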

fit_peaks uses either Optim.jl or a custom quasi-Newton algorithm to fit the sum of the peaks $y_{calc}$ to an observed signal $y$ affected by errors $\sigma_y$. The regression also takes into account a priori errors $\sigma_{m_{prior}}$ on the prior model parameters $m_{prior}$: we assume a Gaussian prior on the model parameters, with mean $m_{prior}$ and a covariance matrix $C_M$ whose diagonal contains $\sigma_{m_{prior}}^2$.

Given the forward calculation of $y_{calc}$ as

\[y_{calc} = g(m)\]

with $m$ the model parameters and g the forward model, the misfit function $S$ is (Tarantola 2005, chapter 3, eq. 3.46):

\[S(m) = \frac{1}{2}[(y - y_{calc})^{t} C_D^{-1}(y - y_{calc}) + (m - m_{prior})^{t}C_M^{-1}(m-m_{prior})]\]

where $C_D^{-1}$ is the inverse of the data covariance matrix, whose diagonal contains $\sigma_y^2$. The misfit function is related to the posterior probability density in the model space through $K \exp (-S(m))$, with $K$ a constant. Here, we are dealing with a non-linear problem, so this posterior probability density is not Gaussian. However, we assume that it can be linearized near $m_{prior}$. Therefore, starting close to or at $m_{prior}$, we can use the following quasi-Newton algorithm to find a suitable solution (Tarantola 2005, eq. 3.51):
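
For illustration, the misfit can be written in a few lines of Julia (a sketch with generic names, not the Spectra internals; ICD and ICM stand for $C_D^{-1}$ and $C_M^{-1}$):

# Misfit S(m) of eq. 3.46, with g the forward model
function misfit(g, m, y, m_prior, ICD, ICM)
    r_d = y - g(m)        # data residual y - y_calc
    r_m = m - m_prior     # deviation from the prior model
    return 0.5 * (r_d' * ICD * r_d + r_m' * ICM * r_m)
end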

\[m_{n+1} = m_{n} - \mu_n(G_n^tC_D^{-1}G_n + C_M^{-1})^{-1}(G_n^tC_D^{-1}(y_{calc, n} - y) + C_M^{-1}(m_n - m_{prior}))\]

where $y_{calc, n}$ is the model output at iteration $n$ with the set of parameters $m_n$, $(G_n)^i_\alpha = (\frac{\partial g^i}{\partial m^\alpha})_{m_n}$ is the matrix of partial derivatives, and $\mu_n$ is a step size typically less than or equal to 1. In fit_peaks, the two parameters maxiter and relax control the maximum number of iterations $n$ and the step size $\mu_n$, with $\mu_n = \frac{1}{\text{relax}}$.
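
As an illustration, one damped update step could be sketched as follows (this is not the Spectra implementation; it uses ForwardDiff for the Jacobian and the same ICD/ICM notation as above):

using ForwardDiff

# One quasi-Newton step (eq. 3.51); mu plays the role of 1/relax
function qn_step(g, m_n, m_prior, y, ICD, ICM; mu=0.2)
    G = ForwardDiff.jacobian(g, m_n)                        # partial derivatives G_n
    H = G' * ICD * G + ICM                                  # curvature term
    grad = G' * ICD * (g(m_n) - y) + ICM * (m_n - m_prior)  # gradient of S at m_n
    return m_n - mu * (H \ grad)
end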

The other available method is the Interior Point Newton algorithm from Optim.jl (:Optim backend). This method implements an interior-point primal-dual Newton algorithm for solving the nonlinear, constrained optimization problem. In other words, it allows the use of constraints, such as parameter boundaries; this is the main difference with the quasi-Newton method. The misfit function that is minimized is the $S(m)$ function provided above.

The two methods usually converge similarly, with the quasi-Newton method being slightly faster. However, contrary to the Interior Point Newton method, the quasi-Newton method does not enforce boundaries on the parameter values, so problems may arise, for instance if the intensity of a peak gets close to 0.

Fitting procedure

A short example probably is better than many words. We are going to fit a synthetic signal that we create ourselves with the following code:

using Plots
using Spectra
using Statistics

# The X axis
x = collect(0:0.2:100)

# The "perfect" Y signal
y_perfect = (
    gaussian(x, [10.0, 35.0, 10.0]) +
    lorentzian(x, [15.0, 55.0, 3.0]) +
    pearson7(x, [20.0, 45.0, 2.0, 0.4])
)
# Of course, in real world we have noise, here it is Gaussian
noise = randn(length(x))*0.3

# This is what we observe and want to fit
y_obs = y_perfect + noise

# Let's visualize it!
p1 = plot(x, [y_perfect, y_obs]; labels=["The true signal" "Observations"])

First, you define a vector of tuples, one per peak, each containing the peak type, $m_{prior}$, $\sigma_{m_{prior}}$, and the lower and upper boundaries. For instance, after a visual review of the signal above, you would declare a vector of peak information like this:

# (peak_type, m_prior, sigma_m_prior, lower_bounds, upper_bounds)
peaks_info = [
    (
        :gaussian, # peak_type
        [10.5, 30.0, 11.0], # m_prior (intensity, position, hwhm)
        [5.0, 5.0, 3.0], # sigma_m_prior
        [0.0, 0.0, 0.0], # lower_bounds
        [Inf, Inf, 50.0]), # upper_bounds
    (
        :lorentzian, # peak_type
        [17.5, 54.0, 3.1], # m_prior (intensity, position, hwhm)
        [5.0, 3.0, 1.0], # sigma_m_prior
        [0.0, 0.0, 0.0], # lower_bounds
        [Inf, Inf, Inf], # upper_bounds
    ),
    (
        :pearson7, # peak_type
        [21.5, 44.0, 3.0, 0.4], # m_prior (intensity, position, hwhm, shape exponent)
        [3.0, 2.0, 5.0, 0.02], # sigma_m_prior
        [0.0, 0.0, 0.0, 0.0], # lower_bounds
        [100.0, 100.0, 50.0, Inf], # upper_bounds
    ),
]
3-element Vector{Tuple{Symbol, Vararg{Vector{Float64}, 4}}}:
 (:gaussian, [10.5, 30.0, 11.0], [5.0, 5.0, 3.0], [0.0, 0.0, 0.0], [Inf, Inf, 50.0])
 (:lorentzian, [17.5, 54.0, 3.1], [5.0, 3.0, 1.0], [0.0, 0.0, 0.0], [Inf, Inf, Inf])
 (:pearson7, [21.5, 44.0, 3.0, 0.4], [3.0, 2.0, 5.0, 0.02], [0.0, 0.0, 0.0, 0.0], [100.0, 100.0, 50.0, Inf])
Note

Upper and lower boundaries for model parameters are always required when constructing this peak information vector. However, remember that they are not used by the quasi-Newton method.

We then pass the data and this vector of peak information to prepare_context, which stores everything in a Julia object. In addition to peaks_info, you will need the data vectors x and y_obs, and the errors on y_obs. Usually we do not have a precise idea of those errors. A good approximation is the difference between the observed signal and a smoothed version of it. We adopt this approach here and check whether the estimated error makes sense:

y_smo = smooth(x, y_obs, method="gcvspline");
estimated_mean_error = sqrt(mean((y_obs .- y_smo).^2))
println("The estimated mean standard error on y_obs is $(round(estimated_mean_error,digits=2))")
The estimated mean standard error on y_obs is 0.25

OK, the result seems not too bad. We will place ourselves in a "real world" situation and use those errors. For convenience we create a vector of errors:

estimated_error = estimated_mean_error * ones(size(x));
501-element Vector{Float64}:
 0.2541910639909907
 0.2541910639909907
 ⋮
 0.2541910639909907
We can now pass the data and associated errors to prepare_context:

ctx = prepare_context(x, y_obs, peaks_info, estimated_error)
FitContext([0.0, 0.2, 0.4, 0.6, 0.8, 1.0  …  99.0, 99.2, 99.4, 99.6, 99.8, 100.0], [1.6088686840357067, 1.1768098635632713  …  0.3432224544143415, 0.6039496097487248], Tuple[(:gaussian, [10.5, 30.0, 11.0], [5.0, 5.0, 3.0], [0.0, 0.0, 0.0], [Inf, Inf, 50.0]), (:lorentzian, [17.5, 54.0, 3.1], [5.0, 3.0, 1.0], [0.0, 0.0, 0.0], [Inf, Inf, Inf]), (:pearson7, [21.5, 44.0, 3.0, 0.4], [3.0, 2.0, 5.0, 0.02], [0.0, 0.0, 0.0, 0.0], [100.0, 100.0, 50.0, Inf])], …)
Note

estimated_error is a positional argument that defaults to a vector of ones if you do not pass it. It is advised to pass proper errors so that the misfit function is properly scaled.

From there, it is good practice to check that your prior model is not too far from a good fit. The algorithms we use are local optimization methods: they do not aim at finding a global minimum, only local solutions. If you set $m_{prior}$ to values far from the solution, the fit will likely fail. To check your starting parameters, you can plot the prior model together with the data using:

p = plot_fit(ctx, title="Prior model")
No result provided, plotting the prior...

Modify your starting parameters until the model starts to make sense, and then perform a fit calling fit_peaks with your favorite backend. For instance, if you want to use IPNewton, call:

result = fit_peaks(ctx, backend=:Optim)

or if you want to use the quasi-Newton method described above, call:

result = fit_peaks(ctx, backend=:qGN, maxiter=100, relax=5)

result is an object containing:

  • context::FitContext: Fit context
  • peak_results::Vector: Peak results with uncertainties
  • params::Vector{Float64}: Peak parameters
  • covariance::Matrix{Float64}: Covariance matrix
  • errors::Vector{Float64}: 1-sigma standard errors on parameters
  • y_calc::Vector{Float64}: Model predictions
  • residuals::Vector{Float64}: Residuals

In result.peak_results, numbers are now Measurements, as we use the Measurements.jl library to automatically propagate the fitting errors, for instance when calculating peak areas.
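
For example, you can extract individual values and uncertainties with Measurements.jl (a sketch; the field names amplitude and area are assumed from the print_params output shown below and may differ in your version of Spectra):

using Measurements

g1 = result.peak_results[1]   # first peak of the fit
println("Peak 1 amplitude: ", Measurements.value(g1.amplitude),
        " ± ", Measurements.uncertainty(g1.amplitude))
println("Peak 1 area: ", g1.area)   # a Measurement, printed as value ± error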

After a fit, you can print the parameters and peak areas using print_params and plot the fit using plot_fit. Let's do that below:

print_params(result.peak_results)

plot_fit(ctx, result.peak_results)
Peak 1 (gaussian):
  amplitude: 10.024 ± 0.053
  center: 35.026 ± 0.084
  width: 9.977 ± 0.087
  area: 212.9 ± 2.4
Peak 2 (lorentzian):
  amplitude: 14.891 ± 0.078
  center: 55.005 ± 0.017
  width: 3.0 ± 0.027
  area: 140.3 ± 1.2
Peak 3 (pearson7):
  amplitude: 20.1 ± 0.15
  center: 45.007 ± 0.01
  width: 1.973 ± 0.037
  exponent: 0.3955 ± 0.0035
  area: -147.1 ± 9.2

Errors on peak parameters

Errors provided by fit_peaks

The errors provided by fit_peaks come from the evaluation of the Hessian matrix at the optimal point. In the quasi-Newton algorithm, we directly calculate the posterior model covariance matrix as (Tarantola, 2005, eq. 3.53):

\[C_{M, post} = C_{M} - C_{M}G^{t}(G C_{M} G^{t} + C_D)^{-1}G C_{M}\]

and retrieve the standard errors on the model parameters from the square root of the diagonal of $C_{M, post}$. In the IPNewton case, we use ForwardDiff.hessian() to calculate the Hessian matrix at the optimal point and obtain $C_{M, post}$.
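
For reference, eq. 3.53 can be reproduced by hand from the fields of the fit context and result (a sketch; it assumes that ctx.model maps the parameter vector to the calculated signal and is compatible with ForwardDiff):

using LinearAlgebra, ForwardDiff

G = ForwardDiff.jacobian(ctx.model, result.params)    # Jacobian at the optimum
CM_post = ctx.CM - ctx.CM * G' * ((G * ctx.CM * G' + ctx.CD) \ (G * ctx.CM))
param_errors = sqrt.(diag(CM_post))                   # compare with result.errors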

Note

Do not neglect the off-diagonal terms in $C_{M, post}$. Peak parameters often are strongly correlated, and neglecting the off-diagonal terms may lead to reporting wrong errors on quantities that depend on several peak parameters, such as peak areas. This is why, for instance, we automatically propagate errors using $C_{M, post}$ and Measurements.jl when calculating peak areas in fit_peaks.
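
As an illustration, the proper first-order propagation of the full covariance to a derived quantity looks like this (a sketch: here f is the area of the first, Gaussian, peak, built from the amplitude and hwhm located at indices 1 and 3 of the flattened parameter vector of this example):

using ForwardDiff

f(m) = m[1] * m[3] * sqrt(pi / log(2))          # Gaussian area = amplitude * hwhm * sqrt(pi/ln 2)
J = ForwardDiff.gradient(f, result.params)      # sensitivity of the area to all parameters
sigma_area = sqrt(J' * result.covariance * J)   # off-diagonal terms of C_M_post included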

Checking errors with bootstrapping

$C_{M, post}$ may not necessarily contain valid parameter uncertainties, particularly if the problem is strongly non-linear.

To check the parameter errors, one option is to use bootstrapping. bootstrap allows us to resample a spectrum and refit the model on the subsamples. You obtain a collection of models, each adjusted on slightly different data. Using these model samples, we can check whether the errors calculated from the Hessian are valid; if they are not, we can use the errors calculated from the bootstrap samples instead.

The interface is similar to that of fit_peaks, but the fit context is defined internally so you don't even have to worry about it. For instance, we can do:

boot_params, boot_results = bootstrap(x, y_obs, estimated_error, peaks_info, nb_boot = 50, backend=:Optim);
Tip

The quasi-Newton algorithm is the quickest, so you may prefer it for bootstrapping. If so, you can also try setting maxiter to a value as small as possible without affecting fit convergence. Another tip is to provide a new peaks_info vector with $m_{prior}$ set to the values of a good fit that you previously obtained.

Tip

Here we only used 50 bootstraps for the sake of generating the documentation in a reasonable time. In practice, you should use more bootstraps (e.g. 1000) to get a good estimate of the errors.

The bootstrap function returns:

  • a matrix of size (nb_params, nb_boot) with the fitted parameters (here boot_params);
  • a peak_results object with values tied to their errors thanks to Measurements.jl (here called boot_results).
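
If you prefer, you can also summarize the bootstrap samples yourself (a sketch; boot_params has one column per bootstrap fit):

using Statistics

boot_medians = median(boot_params, dims=2)   # central values of each parameter
boot_errors = std(boot_params, dims=2)       # spread across the bootstrap fits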

We can now print the bootstrapped results and compare the errors with those previously calculated from the Hessian:

print_params(boot_results)
Peak 1 (gaussian):
  amplitude: 10.039 ± 0.059
  center: 35.045 ± 0.098
  width: 9.99 ± 0.11
  area: 213.4 ± 3.0
Peak 2 (lorentzian):
  amplitude: 14.91 ± 0.11
  center: 55.009 ± 0.019
  width: 2.994 ± 0.041
  area: 140.2 ± 1.5
Peak 3 (pearson7):
  amplitude: 20.09 ± 0.15
  center: 45.007 ± 0.012
  width: 1.968 ± 0.044
  exponent: 0.3948 ± 0.0037
  area: -145.3 ± 10.0

OK, for this example we see that the errors from the bootstrap analysis are close to those calculated from the Hessian matrix. Everything thus seems OK.

Of course, a final quick visual inspection is always nice. This can be done by passing the bootstrapped peak results to the plot_fit function:

plot_fit(ctx, boot_results)

Bayesian MCMC fit with Turing.jl

The same problem can be tackled using Turing.jl and the peak shape functions from Spectra, as follows. This offers another way to check that the estimated errors are sound, or to use different types of prior probability distributions on the model parameters (the quasi-Newton method assumes Gaussian priors).

You will need to install Turing.jl, which is not a dependency of Spectra. The code below runs well but may not be fully optimized; it is just for the sake of example.

Run it on your own computer! If you have suggestions, do not hesitate!

using Turing
using LinearAlgebra # provides the identity matrix I used in MvNormal below

# Define a Bayesian model with priors
@model function bayesian_peaks(x, y)
    # Define priors based on peak_types

    # PEAK 1
    amplitude ~ truncated(Normal(10.016, 0.5), 0.0, Inf)
    center ~ Normal(34.92, 0.5)
    width ~ truncated(Normal(10.0, 0.5), 0.0, Inf)

    μ = gaussian(x, [amplitude, center, width])
    
    # PEAK 2
    amplitude2 ~ truncated(Normal(14.9, 0.5), 0.0, Inf)
    center2 ~ Normal(55.0, 0.5)
    width2 ~ truncated(Normal(3.0, 0.5), 0.0, Inf)
    
    μ2 = lorentzian(x, [amplitude2, center2, width2])
    
    # PEAK 3
    amplitude3 ~ truncated(Normal(25.5, 0.5), 0.0, Inf)
    center3 ~ Normal(43.0, 10.0)
    width3 ~ truncated(Normal(2.0, 0.5), 0.0, Inf)
    lr ~ truncated(Normal(0.39, 0.03), 0.0, 1.0)
    
    # Calculate model prediction
    μ3 = pseudovoigt(x, [amplitude3, center3, width3, lr])
    
    # Likelihood
    σ ~ truncated(Normal(0.2, 0.03), 0.001, Inf)
    y ~ MvNormal(μ + μ2 + μ3, σ^2 * I)
end

chain = sample(bayesian_peaks(x, y_obs), NUTS(), 2000, nchains=3, progress=true)
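
Once sampling is done, you can inspect the posterior, for example as follows (a sketch; chain is an MCMCChains.Chains object returned by sample):

using Statistics

describe(chain)                    # summary statistics for each parameter
width_draws = vec(chain[:width])   # posterior samples of the first peak width
println("width: ", mean(width_draws), " ± ", std(width_draws))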

Final remarks

Done! Please check the Tutorials section for further peak fitting examples. Below you will find the full API of the various functions, including peak shapes, area calculations, and fitting algorithms.

Functions API

Peak fitting

Spectra.prepare_contextFunction
prepare_context(x, y, peaks_info, sigma=ones(length(x)))

Create a precomputed context for peak fitting operations.

Arguments

  • x::Vector{Float64}: Independent variable values
  • y::Vector{Float64}: Observations
  • peaks_info::Vector{Tuple}: Peak specifications containing:
    • Symbol: Peak type (:gaussian, :lorentzian, :pseudovoigt, :pearson7)
    • Vector{Float64}: Initial parameters
    • Vector{Float64}: Parameter uncertainties
    • Vector{Float64}: Lower bounds
    • Vector{Float64}: Upper bounds
  • sigma::Vector{Float64}: Data uncertainties (default: ones)

Returns

  • FitContext: Precomputed structure containing matrices, indices, and model function

Examples

peaks_info = [
    (:gaussian, [1.0, 0.0, 0.5], [0.1, 0.05, 0.1], [-Inf, -1.0, 0.1], [Inf, 1.0, 2.0]),
    (:lorentzian, [0.5, 0.2, 0.3], [0.1, 0.05, 0.1], [0.0, -0.5, 0.1], [2.0, 0.5, 1.0])
]
ctx = prepare_context(x, y, peaks_info, sigma)

source
Spectra.fit_peaksFunction
fit_peaks(ctx::FitContext; backend=:qGN, relax=4, maxiter=100)

Perform peak fitting using specified optimization backend.

Arguments

  • ctx::FitContext: Precomputed context from prepare_context
  • backend::Symbol: Optimization method (:Optim or :qGN)
  • relax::Real: Step relaxation factor (qGN only)
  • maxiter::Int: Maximum iterations (qGN only)

Returns

  • FitResult: Structured results containing:
    • context::FitContext: Fit context
    • peak_results::Vector: Peak results with uncertainties
    • params::Vector{Float64}: Peak parameters
    • covariance::Matrix{Float64}: Covariance matrix
    • errors::Vector{Float64}: 1-sigma standard errors on parameters
    • y_calc::Vector{Float64}: Model predictions
    • residuals::Vector{Float64}: Residuals

Examples

x = collect(1:1.0:100)
y = gaussian(x, 1., 50.0, 5.0) .+ 0.1 * randn(length(x))
peaks_info = [
    (:gaussian, [1.0, 50.0, 5.0], [0.1, 0.05, 0.1], [0.0, 40.0, 0.1], [2.0, 60.0, 10.0])
]
ctx = prepare_context(x, y, peaks_info)
result = fit_peaks(ctx; backend=:qGN, relax=4, maxiter=100)
plot_fit(ctx, result.peak_results)
source
Spectra.fit_OptimFunction
fit_Optim(ctx::FitContext)

Perform a box constrained fit using the Optim package with the LBFGS algorithm.

The loss function combines a loss on data and on model prior (eq. 3.46 in Tarantola 2005)

Arguments

  • ctx: context created by prepare_context

Returns

  • fitted_params: fitted parameters
  • CMpost: covariance matrix of the fitted parameters
  • sqrt.(diag(CMpost)): errors of the fitted parameters
source
Spectra.fit_qNewtonFunction
fit_qNewton(ctx::FitContext; maxiter=100, relax=5)

Perform a quasi-Newton fit (Tarantola 2005, eq. 3.51)

Arguments

  • ctx: context created by prepare_context
  • maxiter: maximum number of iterations
  • relax: relaxation factor for the step size

Returns

  • mcurrent: fitted parameters
  • CMpost: covariance matrix of the fitted parameters
  • sqrt.(diag(CMpost)): errors of the fitted parameters
source
Spectra.plot_fitFunction
plot_fit(ctx, peak_results=nothing; xlabel="X", ylabel="Y", title="Model adjustement")

Return a plot of the data and the fit, given a ctx context created by prepare_context and peak results generated by fit_peaks or get_fit_results. If peak_results is not provided, the prior is plotted.

source
Spectra.print_paramsFunction
print_params(peak_results)

Print the parameters after the fit. peak_results is a vector of Named Tuples (generated by get_peak_results) with the results of the fit.

source
Spectra.get_peak_resultsFunction
get_peak_results(ctx, params, CMpost)

Returns a vector of Named Tuples with the results of the fit. Values are tied to their errors via the covariance matrix CMpost.

source
Spectra.bootstrapFunction
bootstrap(x, y, sigma, peaks_info; nb_boot = 100, backend=:qGN, relax=5., maxiter=100)

Perform a bootstrap on the fit.

Arguments

  • x: x-axis data
  • y: y-axis data
  • sigma: data noise
  • peaks_info: vector of tuples with the peak type and parameters (peak_type, initial_params, uncertainties, lower_bounds, upper_bounds)
  • nb_boot: number of bootstrap samples
  • backend: optimization backend, either :Optim or :qGN
  • relax: relaxation factor for the step size (only used with :qGN)
  • maxiter: maximum number of iterations (only used with :qGN)

Returns

  • a matrix of size (number of parameters, number of bootstrap samples) with the fitted parameters
  • a vector of Named Tuples with the results of the fit. Values are tied to their errors via the covariance matrix CMpost.
source
Spectra.FitContextType

FitContext type containing all fitting context:

  • x::Vector{Float64}: x data vector.
  • y::Vector{Float64}: y data vector.
  • peaks_info::Vector{Tuple}: contains the information on the peaks of the model.
  • sigma::Vector{Float64}: error on y.
  • CD::Matrix{Float64}: data covariance matrix.
  • ICD::Matrix{Float64}: inverse of the data covariance matrix.
  • mprior::Vector{Float64}: prior model parameters.
  • mprior_sigma::Vector{Float64}: uncertainties on prior model parameters.
  • CM::Matrix{Float64}: model covariance matrix (prior).
  • ICM::Matrix{Float64}: inverse of the model covariance matrix (prior).
  • all_lower_bounds::Vector{Float64}: lower boundaries for parameters.
  • all_upper_bounds::Vector{Float64}: upper boundaries for parameters.
  • model::Function

source
Spectra.FitResultType

FitResult type containing all fitting results and metadata:

  • context::FitContext: Fit context.
  • peak_results::Vector: Vector of Named Tuples containing parameters and areas for each peak, with errors.
  • params::Vector{Float64}: Vector of parameters.
  • covariance::Matrix{Float64}: Covariance matrix at the optimal point, $C_{M, post}$.
  • errors::Vector{Float64}: Uncertainties on parameters, calculated as sqrt.(diag(covariance)).
  • y_calc::Vector{Float64}: Calculated y values, using params.
  • residuals::Vector{Float64}: Difference between calculated and input y values.

source

Peak shapes

Spectra.create_peaksFunction
create_peaks(x::Vector{Float64}, peak_infos::Vector{Dict{Symbol,Any}})

Generate multiple peaks and their sum from a collection of peak descriptions.

Arguments

  • x::Vector{Float64}: X-axis values where peaks will be evaluated.
  • peak_infos::Vector{Dict}: List of dictionaries describing peaks. Each dictionary should contain:
    • :type: Peak type (:gaussian, :lorentzian, :pseudovoigt, :pearson7)
    • Required parameters for the specified peak type:
      • Gaussian: :amplitude, :center, :hwhm
      • Lorentzian: :amplitude, :center, :hwhm
      • PseudoVoigt: :amplitude, :center, :hwhm, :lorentzian_fraction
      • Pearson7: :amplitude, :center, :hwhm, :exponent

Returns

  • peaks::Matrix{Float64}: Matrix where each column represents an individual peak
  • total_spectrum::Vector{Float64}: Sum of all peaks

Examples

x = collect(0:0.1:10)
peak_infos = [
    Dict(:type => :gaussian, :amplitude => 1.0, :center => 5.0, :hwhm => 0.5),
    Dict(:type => :lorentzian, :amplitude => 0.8, :center => 7.0, :hwhm => 1.2),
    Dict(:type => :pearson7, :amplitude => 0.8, :center => 3.0, :hwhm => 0.2, :exponent => 1.9)
]
peaks, total = create_peaks(x, peak_infos)
source
Spectra.gaussianFunction
gaussian(x, amplitude, center, hwhm)
gaussian(x, p)

Gaussian function with parameters amplitude, center, hwhm or a vector p = [amplitude, center, hwhm].

Notes:

  • the half-width at half maximum (hwhm) of the gaussian peak is related to the standard deviation sigma by: hwhm = sqrt(2*log(2)) * sigma
  • gaussian(x, p) is a shorthand for gaussian(x, p[1], p[2], p[3])
source
Spectra.lorentzianFunction
lorentzian(x, amplitude, center, hwhm)
lorentzian(x, p)

Lorentzian function with parameters amplitude, center, hwhm or a vector p = [amplitude, center, hwhm].

Notes:

  • hwhm: half-width at half maximum
  • lorentzian(x, p) is a shorthand for lorentzian(x, p[1], p[2], p[3])
source
Spectra.pseudovoigtFunction
pseudovoigt(x, amplitude, center, hwhm, lorentzian_fraction)
pseudovoigt(x, p)

Pseudovoigt function with parameters amplitude, center, hwhm, lorentzian_fraction or a vector p = [amplitude, center, hwhm, lorentzian_fraction].

Notes:

  • hwhm: half-width at half maximum.
  • pseudovoigt(x, p) is a shorthand for pseudovoigt(x, p[1], p[2], p[3], p[4]).
  • calculated as lorentzian_fraction*lorentzian + (1 - lorentzian_fraction)*gaussian.
  • lorentzian_fraction is a value between 0 and 1 that controls the mixing between Gaussian and Lorentzian.
source
Spectra.pearson7Function
pearson7(x, amplitude, center, hwhm, exponent)
pearson7(x, p)

Pearson 7 function with parameters amplitude, center, hwhm, exponent or a vector p = [amplitude, center, hwhm, exponent].

Notes:

  • Equation is amplitude / (1 + ((x - center)/hwhm)^2 * (2^(1/exponent) - 1))^exponent.
  • pearson7(x, p) is a shorthand for pearson7(x, p[1], p[2], p[3], p[4]).
source

Utilities

Spectra.area_peaksFunction
area_peaks(peak_type::Symbol, amplitude::Float64, hwhm::Float64; lorentzian_fraction=nothing, exponent=nothing)

Calculate the area under Gaussian, Lorentzian, Pseudo-Voigt, or Pearson VII peaks based on their parameters. Areas are calculated using analytical formulas.

Arguments

  • peak_type::Symbol: The type of peak. Supported types are:
    • :gaussian: Gaussian peak.
    • :lorentzian: Lorentzian peak.
    • :pseudovoigt: Pseudo-Voigt peak (weighted combination of Gaussian and Lorentzian).
    • :pearson7: Pearson VII peak.
  • amplitude::Float64: Amplitude of the peak (maximum height).
  • hwhm::Float64: Half-width at half-maximum of the peak.
  • lorentzian_fraction::Union{Float64, Nothing}: Lorentzian fraction (for Pseudo-Voigt peaks). Must be in [0, 1]. Default is nothing.
  • exponent::Union{Float64, Nothing}: Shape parameter for Pearson VII peaks. Default is nothing.

Returns

  • area::Float64: The calculated area under the specified peak.

Examples

Gaussian Peak

A = 2.0
hwhm = 0.5
area_gaussian = area_peaks(:gaussian, A, hwhm)

Lorentzian Peak

A = 2.0
hwhm = 0.5
area_lorentzian = area_peaks(:lorentzian, A, hwhm)

Pseudo-Voigt Peak

A = 2.0
hwhm = 0.5
lorentzian_fraction = 0.5
area_pseudovoigt = area_peaks(:pseudovoigt, A, hwhm; lorentzian_fraction=lorentzian_fraction)

Pearson VII Peak

A = 2.0
hwhm = 0.5
exponent = 2.0
area_pearson7 = area_peaks(:pearson7, A, hwhm; exponent=exponent)

Notes

  1. Gaussian Formula: $\text{Area} = A \cdot \text{hwhm} \cdot \sqrt{\frac{\pi}{\ln 2}}$
  2. Lorentzian Formula: $\text{Area} = \pi \cdot A \cdot \text{hwhm}$
  3. Pseudo-Voigt Formula: $\text{Area} = \eta \cdot (\pi \cdot A \cdot \text{hwhm}) + (1-\eta) \cdot (A \cdot \text{hwhm} \cdot \sqrt{\frac{\pi}{\ln 2}})$
  4. Pearson VII Formula: $\text{Area} = A \cdot \text{hwhm} \cdot \sqrt{\frac{\pi}{2^{1/exponent} - 1}} \cdot \frac{\Gamma(exponent - 0.5)}{\Gamma(exponent)}$
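
As a quick sanity check of formula 1, the analytical Gaussian area can be compared with a numerical integration (a sketch; QuadGK is not a Spectra dependency):

using QuadGK

A, hwhm = 2.0, 0.5
g(x) = A * exp(-log(2) * (x / hwhm)^2)      # Gaussian written directly in terms of hwhm
numeric, _ = quadgk(g, -Inf, Inf)           # numerical area
analytic = A * hwhm * sqrt(pi / log(2))     # ≈ 2.13, matches `numeric`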

Errors

  • Throws an error if an unsupported peak_type is provided.
  • Throws an error if required parameters for a specific peak type are not provided.
  • Throws an error if lorentzian_fraction is not in the [0, 1] interval.
source
Spectra.bootsampleFunction
bootsample(x, y; boottype::String="np", ese=nothing)

Inputs

  • x: the x axis. It can have multiple columns.
  • y: the y axis. It can have multiple columns.

Options

  • boottype::String = "np": type of bootstrapping
    • "np": non-parametric bootstrapping. Data resampled with replacement.
    • "p": parametric bootstrapping. Data resampled from the Normal(y, ese) distribution.
  • ese = nothing: standard errors on y.

Outputs

  • b_x_f: bootstrapped x sample
  • b_y_f: bootstrapped y sample

The bootsample function can be embedded in a for loop, and will produce a different dataset each time. Performing the bootstrapping K times and fitting the model each time allows estimating the error distribution of the peak parameters. This technique has the advantage of making no prior assumption on the probability distribution functions of the parameter errors. However, it is more time consuming than using the covariance matrix.
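
A possible loop, reusing the variables of the fitting example above, could look like this (a sketch, assuming the resampled x and y keep the length of the originals):

using Statistics

nb_boot = 100
n_params = sum(length(p[2]) for p in peaks_info)       # total number of model parameters
boot_params = Matrix{Float64}(undef, n_params, nb_boot)
for k in 1:nb_boot
    b_x, b_y = bootsample(x, y_obs; boottype="np")     # resample the data
    ctx_k = prepare_context(vec(b_x), vec(b_y), peaks_info, estimated_error)
    boot_params[:, k] = fit_peaks(ctx_k; backend=:qGN).params
end
boot_errors = std(boot_params, dims=2)                 # spread of each parameter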

References

  • Efron, B. 1979. “Bootstrap Methods: Another Look at the Jackknife.” The Annals of Statistics 7 (1): 1–26.
  • Efron, Bradley. 1981. “Nonparametric Estimates of Standard Error: The Jackknife, the Bootstrap and Other Methods.” Biometrika 68 (3): 589–99. doi:10.1093/biomet/68.3.589.
  • Efron, B., and Tibshirani, R. 1994. An Introduction to the Bootstrap. CRC press.
source