Least-Squares Fitting of Isotropic EPR spectra by Simulations
Source:R/eval_sim_EPR_isoFit.R
eval_sim_EPR_isoFit.Rd
Fitting of the simulated spectrum onto the experimental one represents an important step in the analysis
of EPR spectra. Parameters of the simulated spectrum like \(g_{\text{iso}}\); coupling constants
(in MHz
) \(A_{\text{iso}}\) for each group of equivalent nuclei; linewidth
(either \(\Delta B_{\text{pp}}\) or \(FWHM\) depending on the lineSpecs.form
argument);
spectral baseline (see the baseline.correct
) and finally the intensity (multiplication coefficient)
are optimized by the methods listed in optim_for_EPR_fitness
.
The lineG.content
corresponding parameter is the only one,
which needs to be varied "manually". In order to control the optimization/fitting process,
by similar interactive way like in EasySpin,
a Shiny app
and/or gganimate visualization is under development.
Usage
eval_sim_EPR_isoFit(
data.spectr.expr,
Intensity.expr = "dIepr_over_dB",
Intensity.sim = "dIeprSim_over_dB",
nu.GHz,
B.unit = "G",
nuclear.system.noA = NULL,
baseline.correct = "constant",
lineG.content = 0.5,
lineSpecs.form = "derivative",
optim.method = "neldermead",
optim.params.init,
optim.params.lower = NULL,
optim.params.upper = NULL,
Nmax.evals = 512,
tol.step = 5e-07,
pswarm.size = NULL,
pswarm.diameter = NULL,
pswarm.type = NULL,
check.fit.plot = TRUE,
output.list.final = FALSE,
...
)
Arguments
- data.spectr.expr
Data frame object/table, containing the experimental spectral data the with magnetic flux density (
"B_mT"
or"B_G"
) and the intensity (see theIntensity.expr
argument) columns.- Intensity.expr
Character string, pointing to column name of the experimental EPR intensity within the original
data.spectr.expr
. Default:dIepr_over_dB
.- Intensity.sim
Character string, pointing to column name of the simulated EPR intensity within the related output data frame. Default:
Intensity.sim = "dIeprSim_over_dB"
.- nu.GHz
Numeric value, microwave frequency in
GHz
.- B.unit
Character string, denoting the magnetic flux density unit e.g.
B.unit = "G"
(gauss, default) orB.unit = "mT"
/"T"
(millitesla/tesla).- nuclear.system.noA
List or nested list without estimated hyperfine coupling constant values, such as
list("14N",1)
orlist(list("14N", 2),list("1H", 4),list("1H", 12))
. The \(A\)-values are already defined as elements of theoptim.params.init
argument/vector. If the EPR spectrum does not display any hyperfine splitting, the argument definition readsnuclear.system.noA = NULL
(default).- baseline.correct
Character string, referring to baseline correction of the simulated/fitted spectrum. Corrections like
"constant"
(default),"linear"
or"quadratic"
can be applied.- lineG.content
Numeric value between
0
and1
, referring to content of the Gaussian line form. IflineG.content = 1
(default) it corresponds to "pure" Gaussian line form and iflineG.content = 0
it corresponds to Lorentzian one. The value from (0,1) (e.g.lineG.content = 0.5
) represents the linear combination (for the example above, with the coefficients 0.5 and 0.5) of both line forms => so called pseudo-Voigt.- lineSpecs.form
Character string, describing either
"derivative"
(default) or"integrated"
(i.e."absorption"
which can be used as well) line form of the analyzed EPR spectrum/data.- optim.method
Character string (vector), setting the optimization method(s) gathered within the
optim_for_EPR_fitness
. Default:optim.method = "neldermead"
. Additionally, several consecutive methods can be defined likeoptim.method = c("levenmarq","neldermead")
, where the best fit parameters from the previous method are used as input for the next one. In such case, the output islist
with the elements/vectors from each method, in order to see the progress of the optimization.- optim.params.init
Numeric vector with the initial parameter guess (elements) where the first five elements are immutable
g-value (g-factor)
Gaussian linewidth
Lorentzian linewidth
baseline constant (intercept)
intensity multiplication constant
baseline slope (only if
baseline.correct = "linear"
orbaseline.correct = "quadratic"
), ifbaseline.correct = "constant"
it corresponds to the first HFCC (\(A_1\))baseline quadratic coefficient (only if
baseline.correct = "quadratic"
), ifbaseline.correct = "constant"
it corresponds to the second HFCC (\(A_2\)), ifbaseline.correct = "linear"
it corresponds to the first HFCC (\(A_1\))additional HFCC (\(A_3\)) if
baseline.correct = "constant"
or ifbaseline.correct = "linear"
(\(A_2\)), ifbaseline.correct = "quadratic"
it corresponds to the first HFCC (\(A_1\))...additional HFCCs (\(A_k...\), each vector element is reserved only for one \(A\))
DO NOT PUT ANY OF THESE PARAMETERS to
NULL
. If the lineshape is expected to be pure Lorentzian or pure Gaussian then put the corresponding vector element to0
.- optim.params.lower
Numeric vector (with the same element order like
optim.params.init
) with the lower bound constraints. Default:optim.params.lower = NULL
which actually equals to \(g_{\text{init}} - 0.001\), \(0.8\,\Delta B_{\text{G,init}}\), \(0.8\,\Delta B_{\text{L,init}}\), baseline intercept initial constant \(- 0.001\), intensity multiplication initial constant \(= 0.8\,\text{init}\), baseline initial slope \(- 5\) (in case thebaseline.correct
is set either to"linear"
or"quadratic"
) and finally, the baseline initial quadratic coefficient \(- 5\) (in case thebaseline.correct
is set to"quadratic"
). Lower limits of all hyperfine coupling constant (HFCCs) are set to \(0.875\,A_{\text{init}}\).- optim.params.upper
Numeric vector (with the same element order like
optim.params.init
) with the upper bound constraints. Default:optim.params.upper = NULL
which actually equals to \(g_{\text{init}} + 0.001\), \(1.2\,\Delta B_{\text{G,init}}\), \(1.2\,\Delta B_{\text{L,init}}\), baseline intercept initial constant \(+ 0.001\), intensity multiplication initial constant \(= 1.2\,\text{init}\), baseline initial slope \(+ 5\) (in case thebaseline.correct
is set either to"linear"
or"quadratic"
) and finally, the baseline initial quadratic coefficient \(+ 5\) (in case thebaseline.correct
is set to"quadratic"
). Upper limits of all HFCCs are set to \(1.125\,A_{\text{init}}\).- Nmax.evals
Numeric value, maximum number of function evaluations and/or iterations. The only one method, limited by this argument, is
nls.lm
, whereNmax.evals = 1024
. HigherNmax.evals
may extremely extend the optimization time, therefore the default value readsNmax.evals = 512
. However, the"pswarm"
method requires at least the default or even higher values.- tol.step
Numeric value, describing the smallest optimization step (tolerance) to stop the optimization. Default:
tol.step = 5e-7
.- pswarm.size
Numeric value, which equals to particle swarm size (i.e. number of particles), if
method = "pswarm"
. The default value (pswarm.size = NULL
) actually corresponds tofloor(10+2*sqrt(length(x.0)))
(forSPSO2007
, see thepswarm.type
argument), e.g. to optimize 8 parameters, number of particles = 15. For theSPSO2011
the default number of particles equals to40
.- pswarm.diameter
Numeric value, corresponding to diameter of the particle swarm search space (in case
method = "pswarm"
). The default value (pswarm.diameter = NULL
) refers to the Euclidean distance, defined as: $$\sqrt{\sum_k\,(\text{optim.params.upper}[k] - \text{optim.params.lower}[k])^2}$$- pswarm.type
Character string, setting the type/version of particle swarm algorithm if
method = "pswarm"
. There are two types available:pswarm.type = "SPSO2007"
andpswarm.type = "SPSO2011"
. The latter introduced an adaptive random topology, which allows the swarm to dynamically adjust its communication structure. This helps in maintaining diversity in the swarm and improves the algorithm's ability to escape local optima. This type generally offers better performance on larger multidimensional spaces than thepswarm.type = "SPSO2007"
, which uses a more static topology. Details may be found in theReferences
of theoptim_for_EPR_fitness
. Default:pswarm.type = NULL
(actually corresponding to"SPSO2007"
, that performs slightly better on smaller scales such as simulations of EPR spectra).- check.fit.plot
Logical, whether to return overlay plot with the initial simulation + the best simulation fit + experimental spectrum (including residuals in the lower part of the plot,
check.fit.plot = TRUE
, default) or with the following three spectra (check.fit.plot = FALSE
): 1. experimental, 2. the best simulated one with the baseline fit and 3. the best simulated spectrum with the baseline fit subtracted. The latter two are offset for clarity, within the plot.- output.list.final
Logical. If
TRUE
,list
with the following components will be exclusively returned: 1. optimized parameters from the best fit (together with the minimum sum of residual squares) and 2. data frame of the final best simulated spectrum with and without the baseline fit (seeValue
). Such output will be applied for the more complex optimization/fitting (which is currently under development), as stated in theDescription
, therefore, the default value readsoutput.list.final = FALSE
.- ...
additional arguments specified (see also
optim_for_EPR_fitness
).
Value
Optimization/Fitting procedure results in vector or data frame or list depending on the check.fit.plot
and output...
arguments.
If
check.fit.plot = TRUE
orcheck.fit.plot = FALSE
, the result corresponds to list with the following components:- plot
Visualization of the experimental as well as the best fitted EPR simulated spectra. If
check.fit.plot = TRUE
, the overlay plot consists of the initial simulation + the best simulation fit + experimental spectrum, including residuals in the plot lower part. Whereas, ifcheck.fit.plot = FALSE
, following three spectra are available: 1. experimental, 2. the best simulated one with the baseline fit and 3. the best simulated spectrum with the baseline fit subtracted. The latter two are offset for clarity.- best.fit.params
Vector of the best (final) fitting (optimized) parameters, for each corresponding
optim.method
, to simulate the experimental EPR spectrum, see also description of theoptim.params.init
.- df
Tidy data frame (table) with the magnetic flux density and intensities of the experimental, the best simulated/fitted, as well as the initially simulated EPR spectrum and residuals (if
check.fit.plot = TRUE
), or wide data frame with the following variables / columns (forcheck.fit.plot = FALSE
): magnetic flux density, intensity of the experimental spectrum, intensity of the best simulated one (including the baseline fit), residual intensity and finally, the best simulated spectrum intensity without the baseline fit.- sum.LSQ.min
The minimum sum of the residual square vector after the least-square procedure.
- N.evals
Number of iterations/function evaluations completed before termination. If the
pswarm
optimization algorithm is included inoptim.method
, theN.evals
equals to vector with the following elements: number of function evaluations, number of iterations (per one particle) and the number of restarts.- N.converg
Vector or simple integer code indicating the successful completion of the optimization/fit. In the case of
"levenmarq"
method, the vector elements coincide with the sum of residual squares at each iteration. If theoptim.method = "pswarm"
is applied, one of the following codes can be returned:0
: algorithm terminated by reaching the absolute tolerance,1
: maximum number of function evaluations reached,2
: maximum number of iterations reached,3
: maximum number of restarts reached,4
: maximum number of iterations without improvement reached. For all the other remaining methods (coming from{nloptr}
package), the integers have to be positive to indicate the successful convergence.
If
output.list.final = TRUE
, the function exclusively returns list with the two components, which will be applied for the more complex optimization/fitting (which is currently under development).- params
Vector, corresponding to the best fitting (optimized) parameters (related to the
optim.params.init
argument, see also list above) + minimum sum of the residual squares, after the (final)optim.method
procedure.- df
Data frame including the final best simulated spectrum with and without the baseline fit.
Note
In order to guess the intensity multiplication constant (please, refer to the optim.params.init
argument), one might compare the intensities of the experimental (expr
) and simulated (sim
)
EPR spectrum by one of the interactive or static plot functions (e.g. plot_EPR_Specs
or plot_EPR_Specs2D_interact
) as well as by the eval_sim_EPR_iso
. Accordingly,
the initial intensity multiplication constant can be estimated as the ratio
max(expr
intensity)/max(sim
intensity).
See also
Other Simulations and Optimization:
eval_sim_EPR_iso()
,
eval_sim_EPR_iso_combo()
,
optim_for_EPR_fitness()
,
quantify_EPR_Sim_series()
,
smooth_EPR_Spec_by_npreg()
Examples
## loading built-in example dataset which is simple
## EPR spectrum of the aminoxyl radical:
aminoxyl.data.path <-
load_data_example(file = "Aminoxyl_radical_a.txt")
aminoxyl.data <-
readEPR_Exp_Specs(aminoxyl.data.path,
qValue = 2100)
#
## EPR spectrum simulation fit with "Nelder-Mead"
## optimization method with `check.fit.plot = FALSE`:
tempo.test.sim.fit.a <-
eval_sim_EPR_isoFit(data.spectr.expr = aminoxyl.data,
nu.GHz = 9.806769,
lineG.content = 0.5,
optim.method = "neldermead",
nuclear.system.noA = list("14N",1),
baseline.correct = "linear",
optim.params.init =
c(2.006, # g-value
4.8, # G Delta Bpp
4.8, # L Delta Bpp
0, # intercept (constant) lin. baseline
0.016, # Sim. intensity multiply
1e-6, # slope lin. baseline
49), # A in MHz
check.fit.plot = FALSE
)
## OUTPUTS:
## best fit parameters:
tempo.test.sim.fit.a$best.fit.params
#> [[1]]
#> [1] 2.0058005e+00 5.3018790e+00 5.7508704e+00 -5.9891781e-06 1.6724428e-02
#> [6] 1.6964173e-09 5.1142528e+01
#>
#
## spectrum plot with experimental spectrum,
## simulated one with the linear baseline fit
## and simulated one with the linear baseline
## fit subtracted:
tempo.test.sim.fit.a$plot
#
## minimum sum of squared residuals:
tempo.test.sim.fit.a$sum.LSQ.min
#> [[1]]
#> [1] 2.2997381e-07
#>
#
## number of evaluations / iterations:
tempo.test.sim.fit.a$N.evals
#> [[1]]
#> [1] 512
#>
#
## convergence, in this case it is represented
## by the integer code indicating the successful
## completion (it must be > 0):
tempo.test.sim.fit.a$N.converg
#> [[1]]
#> [1] 5
#>
#
## preview of data frame including all EPR spectra:
head(tempo.test.sim.fit.a$df)
#> B_G Experiment Simulation Residuals Simulation_NoBasLin
#> <num> <num> <num> <num> <num>
#> 1: 3332.7000 -1.7738564e-06 -3.2882168e-07 -1.4450347e-06 6.7065691e-09
#> 2: 3332.9005 1.2209461e-07 -3.2845643e-07 4.5055104e-07 6.7325324e-09
#> 3: 3333.1009 -1.8403788e-07 -3.2808922e-07 1.4405134e-07 6.7587618e-09
#> 4: 3333.3014 -7.1641412e-08 -3.2772370e-07 2.5608229e-07 6.7849972e-09
#> 5: 3333.5019 -1.7361444e-06 -3.2735622e-07 -1.4087882e-06 6.8115018e-09
#> 6: 3333.7023 -5.1531169e-07 -3.2699042e-07 -1.8832127e-07 6.8380128e-09
#
## similar EPR spectrum simulation fit with "particle swarm"
## optimization algorithm and `check.fit.plot = TRUE` option
## as well as user defined bound constraints:
tempo.test.sim.fit.b <-
eval_sim_EPR_isoFit(data.spectr.expr = aminoxyl.data,
nu.GHz = 9.806769,
lineG.content = 0.4,
optim.method = "pswarm",
nuclear.system.noA = list("14N",1),
baseline.correct = "constant",
optim.params.init = c(2.006,4.8,4.8,0,1.1e-2,49),
optim.params.lower = c(2.0048,4.4,4.4,-1e-4,9e-3,45),
optim.params.upper = c(2.0072,5.2,5.2,1e-4,1.5e-2,53),
check.fit.plot = TRUE
)
## OUTPUTS:
## best fit parameters:
tempo.test.sim.fit.b$best.fit.params
#> [[1]]
#> [1] 2.0054356e+00 4.6948967e+00 4.4000000e+00 3.3271591e-07 1.4952839e-02
#> [6] 5.2989415e+01
#>
#
## quick simulation check by plotting the both
## simulated and the experimental EPR spectra
## together with the initial simulation
## and the residuals (differences between the
## experiment and the best fit)
tempo.test.sim.fit.b$plot
#
## fitting of the aminoxyl EPR spectrum
## by the combination of the 1. "Levenberg-Marquardt"
## and 2. "Nelder-Mead" algorithms
tempo.test.sim.fit.c <-
eval_sim_EPR_isoFit(aminoxyl.data,
nu.GHz = 9.86769,
lineG.content = 0.5,
optim.method = c("levenmarq",
"neldermead"),
nuclear.system.noA = list("14N",1),
baseline.correct = "constant",
optim.params.init = c(2.0060,
4.8,
4.8,
0,
7e-3,
49),
check.fit.plot = FALSE
)
## OUTPUTS:
## best fit parameters for both procedures within a list:
tempo.test.sim.fit.c$best.fit.params
#> [[1]]
#> [1] 2.0060000e+00 5.7600000e+00 5.7600000e+00 -9.6564591e-10 8.4000000e-03
#> [6] 4.9000000e+01
#>
#> [[2]]
#> [1] 2.0069962e+00 4.7457426e+00 3.8400000e+00 -6.0710550e-10 8.4000000e-03
#> [6] 5.3445202e+01
#>
#
## compare the results with the example in the `readMAT_params_file`,
## corresponding to the best fit from `Easyspin`
#
## `N.converg` also consists of two components
## each corresponding to result of the individual
## optimization method where the "levenmarq" returns
## the sum of squares at each iteration, therefore the 1st
## component is vector and the 2nd one is integer code
## as already stated above:
tempo.test.sim.fit.c$N.converg
#> [[1]]
#> [1] 1.6573191e-06 1.5747107e-06 1.5676320e-06 1.5676320e-06
#>
#> [[2]]
#> [1] 5
#>