Using SpectraFit¶
This guide covers both basic and advanced usage patterns for SpectraFit, including configuration options, input file formats, and special features.
Standard Usage¶
Basic Command
For the standard usage of SpectraFit, the following steps are necessary:
- Having a structured text file available, with or without a header. Preferred file formats are `.csv` or `.txt`; however, every file with a consistent separator is supported, for example:
-1.600000000000000089e+00 0.000000000000000000e+00
-1.583333333333333481e+00 3.891050583657595843e-03
-1.566666666666666874e+00 3.973071404922200699e-03
-1.550000000000000266e+00 4.057709648331829684e-03
-1.533333333333333659e+00 4.145077720204749357e-03
-1.516666666666667052e+00 4.235294117647067993e-03
- A wide range of pre-defined separators can be chosen: `,`, `;`, `:`, `|`, `\t`, `s+`.
- Having an input file available as JSON, TOML, or YAML. The input file must contain at least the initial parameters for the peaks. Further options are not necessary thanks to the default settings, but they can be activated by extending the objects in the input file.
- More command line arguments can be seen by activating the `-h` or `--help` flag.
- The attributes `minimizer` and `optimizer` must also be defined, because they control the optimization algorithm of lmfit. For more information see lmfit.minimizer. The input file has to contain the following lines:
"parameters": {
"minimizer": { "nan_policy": "propagate", "calc_covar": true },
"optimizer": { "max_nfev": 1000, "method": "leastsq" }
}
- Starting SpectraFit via:
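A typical call combines the data file and the input file from the previous steps; the file names below are placeholders for your own files:

spectrafit data_file.txt -i input.json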
Peak definition in the input file
In the input file, the peaks must be defined as nested objects, as shown below:
"peaks": {
"1": {
"pseudovoigt": {
"amplitude": {
"max": 2,
"min": 0,
"vary": true,
"value": 1
},
"center": {
"max": 2,
"min": -2,
"vary": true,
"value": 0
},
"fwhmg": {
"max": 0.1,
"min": 0.02,
"vary": true,
"value": 0.01
},
"fwhml": {
"max": 0.1,
"min": 0.01,
"vary": true,
"value": 0.01
}
}
}
}
First, the peaks have to be declared in the input file. Every peak is identified by a number given as a string, like "1" for peak #1. Next, the type of peak has to be defined in the input. The following peak types are available:
- Gaussian
- Lorentzian (also known as Cauchy distribution)
- Voigt
- Pseudo Voigt
- Exponential
- Powerlaw (also known as Log-parabola or just Power)
- Linear
- Constant
- Error Function
- Arcus Tangens
- Logarithmic
For more information about the models, please see the Models Documentation. For every model, the attributes have to be defined. In case of the `pseudovoigt` model, the attributes are:
- Amplitude: The height or maximum value of the peak function
- Center: The position of the maximum along the x-axis
- fwhmg: The full width at half maximum of the Gaussian distribution
- fwhml: The full width at half maximum of the Lorentzian distribution
Each attribute has sub-attributes, which are always:
- max: the maximum allowed value of the attribute
- min: the minimum allowed value of the attribute
- vary: if the attribute should be varied during the fit
- value: the initial value of the attribute
At the moment, no default attributes are defined for the models, but this will come in a future release.
Advanced Usage¶
For the advanced usage of SpectraFit, the following features are available:
Redefine the SpectraFit Settings¶
Define the settings in the input file as shown below:
{
"settings": {
"column": [0, 1],
"decimal": ".",
"energy_start": 0,
"energy_stop": 8,
"header": null,
"infile": "spectrafit/test/rixs_fecl4.txt",
"outfile": "fit_results",
"oversampling": false,
"separator": "\t",
"shift": 0,
"smooth": 0,
"verbose": 1,
"noplot": false
  }
}
If the settings are pre-defined in the input file, the corresponding command line arguments will automatically be replaced with them; if they are not defined, the command line arguments or their default values will be used. This allows for faster execution of SpectraFit and also ensures consistency of the fitting procedure in larger studies. For the detailed mechanism of overwriting the settings, please see the API documentation of the Command Line Module.
Datatype of columns for pandas.read_csv
According to the documentation of [pandas.read_csv][17], the datatype of the `column` entries can be either `int` or `str`; `int` is the default. In case of using a header, `str` is mandatory.
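For illustration, a settings fragment that reads a file with a header row and addresses the columns by name could look like this (the column names are placeholders; compare the global-fitting settings further below):

"settings": {
  "header": 0,
  "column": ["energy", "intensity"]
}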
Define Project Details¶
Another advanced feature of SpectraFit is to define the fit as a project, which can become very useful for versioning fitting projects. To use SpectraFit as a project, the project details have to be defined as attributes. The attributes are `project_name`, `project_details`, and `keywords`, as shown in the snippet below:
"fitting": {
"description": {
"project_name": "Template",
"project_details": "Template for testing",
"keywords": [
"2D-Spectra",
"fitting",
"curve-fitting",
"peak-fitting",
"spectrum"
]
  }
}
The `project_name` and `project_details` are strings; the `project_name` should ideally be a single name with no spaces, while the `project_details` can be a longer text. The `keywords` should be a list of strings for tagging in a database.
Tuning Minimizer and Optimizer and Activating Confidence Intervals¶
The input file can be extended with more parameters, which are optional, such as the confidence intervals. In general, the keywords of the lmfit `minimizer` function are supported. For more information please check the module lmfit.minimizer. The attributes have to be initialized with the keyword `parameters`, as shown below:
"parameters": {
"minimizer": { "nan_policy": "propagate", "calc_covar": true },
"optimizer": { "max_nfev": 1000, "method": "leastsq" },
"report": { "min_correl": 0.0 },
"conf_interval": {
"p_names": null,
"sigmas": null,
"trace": true,
"maxiter": 200,
"verbose": 1,
"prob_func": null
}
}
About confidence interval calculations
The calculation of the confidence intervals depends on the number of features and on `maxiter`. Consequently, the confidence interval calculations should only be used for the final fit to keep the calculation time low.
Using mathematical expressions¶
The input file can be further extended by `expressions`, which are evaluated during the fitting process. The `expressions` have to be defined as attributes of the `fitting` object in the input file. They can only contain mathematical constraints or dependencies between different `peaks`; please compare the docs of lmfit.eval and the Expression Documentation. The attributes are defined by the keyword `expr` followed by a string, which can contain any mathematical expression supported by Python.
About the importance of expressions
Using the `expr` attribute, the amplitude of peak 2 can be defined as ⅓ of the amplitude of peak 1, as shown in the example below. In general, this expression mode is very useful for fitting relative dependencies, for example in L-edge X-ray Absorption Spectra (L-XAS), where the relative dependencies between the L3 and L2 edge have to be defined.
"peaks": {
"1": {
"pseudovoigt": {
"amplitude": {
"max": 2,
"min": 0,
"vary": true,
"value": 1
},
"center": {
"max": 2,
"min": -2,
"vary": true,
"value": 0
},
"fwhmg": {
"max": 0.74,
"min": 0.02,
"vary": true,
"value": 0.21
},
"fwhml": {
"max": 0.74,
"min": 0.01,
"vary": true,
"value": 0.21
}
}
},
"2": {
"pseudovoigt": {
"amplitude": {
"expr": "pseudovoigt_amplitude_1 / 3"
},
"center": {
"expr": "pseudovoigt_center_1 + 1.73"
},
"fwhmg": {
"max": 0.5,
"min": 0.02,
"vary": true,
"value": 0.01
},
"fwhml": {
"max": 0.5,
"min": 0.01,
"vary": true,
"value": 0.01
}
}
},
"3": {
"constant": {
"amplitude": {
"max": 2,
"min": 0.01,
"vary": true,
"value": 1
}
}
},
"4": {
"gaussian": {
"amplitude": {
"max": 2,
"min": 0,
"vary": true,
"value": 1
},
"center": {
"expr": "pseudovoigt_center_2 + 0.35"
},
"fwhmg": {
"max": 0.4,
"min": 0.02,
"vary": true,
"value": 0.01
}
}
}
}
Activating Global Fitting¶
Both the input file and the command line interface can activate the `global fitting` mode. The `global fitting` mode is useful when fitting several spectra with the same initial model. When using the `global fitting` mode `= 1`, the `fitting` object has to be defined in the same way as for the local fitting model; either via the input file
{
"settings": {
"column": ["energy"],
"decimal": ".",
"header": 0,
"infile": "data_global.csv",
"outfile": "example_6",
"oversampling": false,
"separator": ",",
"shift": 0.2,
"smooth": false,
"verbose": 1,
"noplot": false,
"global": 1,
"autopeak": false
}
}
or via Command Line.
For more info please see the Global Fitting Example.
Correct Data Format for Global Fits
For a correct global fit, the data file has to contain only spectral data, i.e. an `energy` column and `intensity` columns. No other columns are allowed!
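A sketch of a compatible file layout, with purely illustrative values and column names, could look like this:

energy,intensity_1,intensity_2,intensity_3
0.00,1.21,1.19,1.22
0.05,1.35,1.30,1.34
0.10,1.52,1.47,1.50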
Activating Automatic Peak Detection for Fitting¶
The input file can be further extended by `autopeak`, which is used to automatically find the peaks in the data. The `autopeak` option has to be defined as an attribute of the `settings` object in the input file, or activated directly via the command line.
The default peak model is `gaussian`, but the `settings` section in the input file allows switching to other supported peak models, for example Lorentzian, Voigt, or Pseudo-Voigt.
Furthermore, the peak-finding attributes of `autopeak` are identical to SciPy's find_peaks. The `autopeak` attributes have to be defined as follows:
"autopeak": {
"modeltype": "gaussian",
"height": [0.0, 10],
"threshold": [0.0, 10],
"distance": 2,
"prominence": [0.0, 1.0],
"width": [0.0, 10],
"wlen": 2,
"rel_height": 1,
"plateau_size": 0.5
}
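To make the correspondence explicit, the same values passed straight to `scipy.signal.find_peaks` would look like the sketch below; the synthetic `intensity` array merely stands in for the y-values of a real spectrum, and the `modeltype` key is SpectraFit-specific and is not passed to SciPy:

import numpy as np
from scipy.signal import find_peaks

# Synthetic stand-in for the measured intensities of a spectrum.
x = np.linspace(-2, 2, 400)
intensity = np.exp(-((x - 0.5) ** 2) / 0.02) + 0.5 * np.exp(-((x + 0.5) ** 2) / 0.02)

# The keyword arguments mirror the autopeak attributes shown above.
peaks, properties = find_peaks(
    intensity,
    height=(0.0, 10),
    threshold=(0.0, 10),
    distance=2,
    prominence=(0.0, 1.0),
    width=(0.0, 10),
    wlen=2,
    rel_height=1,
    plateau_size=0.5,
)
print(x[peaks])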
Configurations¶
The configuration of SpectraFit depends on the lmfit package; most of the features provided by `lmfit` can be used. The configurations can be set via the attributes of `optimizer` and `minimizer`, as shown in Standard Usage step 5. For individual configuration, please use the keywords of the `lmfit` minimizer module and also check SpectraFit's fitting routine.
Input Files¶
The input files of SpectraFit are dictionary-like objects. The input file can be one of these three types:
- JSON: JavaScript Object Notation, a lightweight data-interchange format
- TOML: Tom's Obvious, Minimal Language, a config file format for humans
- YAML: YAML Ain't Markup Language, a human-friendly data serialization standard
Especially the `toml` and `yaml` files are very useful for the configuration of SpectraFit due to their structure and simplicity. ConvertSimple allows converting easily between these three file types.
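To illustrate the simpler structure, the `settings` block of the JSON reference below could be transcribed into YAML roughly as follows (a sketch derived directly from the JSON; it adds no extra options):

settings:
  column: [0, 1]
  decimal: "."
  energy_start: 0
  energy_stop: 8
  header: null
  infile: "spectrafit/test/rixs_fecl4.txt"
  outfile: "fit_results"
  oversampling: false
  noplot: false
  separator: "\t"
  shift: 0
  smooth: 0
  verbose: 1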
Reference Input in JSON
{
"settings": {
"column": [0, 1],
"decimal": ".",
"energy_start": 0,
"energy_stop": 8,
"header": null,
"infile": "spectrafit/test/rixs_fecl4.txt",
"outfile": "fit_results",
"oversampling": false,
"noplot": false,
"separator": "\t",
"shift": 0,
"smooth": 0,
"verbose": 1
},
"fitting": {
"description": {
"project_name": "Template",
"project_details": "Template for testing",
"keywords": [
"2D-Spectra",
"fitting",
"curve-fitting",
"peak-fitting",
"spectrum"
]
},
"parameters": {
"minimizer": { "nan_policy": "propagate", "calc_covar": true },
"optimizer": { "max_nfev": 1000, "method": "leastsq" },
"report": { "min_correl": 0.0 },
"conf_interval": {
"p_names": null,
"sigmas": null,
"trace": true,
"maxiter": 200,
"verbose": 1,
"prob_func": null
}
},
"peaks": {
"1": {
"pseudovoigt": {
"amplitude": {
"max": 2,
"min": 0,
"vary": true,
"value": 1
},
"center": {
"max": 2,
"min": -2,
"vary": true,
"value": 0
},
"fwhmg": {
"max": 0.1,
"min": 0.02,
"vary": true,
"value": 0.01
},
"fwhml": {
"max": 0.1,
"min": 0.01,
"vary": true,
"value": 0.01
}
}
},
"2": {
"pseudovoigt": {
"amplitude": {
"max": 2,
"min": 0,
"vary": true,
"value": 1
},
"center": {
"max": 2,
"min": -2,
"vary": true,
"value": 0
},
"fwhmg": {
"max": 0.1,
"min": 0.02,
"vary": true,
"value": 0.01
},
"fwhml": {
"max": 0.1,
"min": 0.01,
"vary": true,
"value": 0.01
}
}
},
"3": {
"constant": {
"amplitude": {
"max": 2,
"min": 0.01,
"vary": true,
"value": 1
}
}
},
"4": {
"gaussian": {
"amplitude": {
"max": 2,
"min": 0,
"vary": true,
"value": 1
},
"center": {
"max": 2,
"min": -2,
"vary": true,
"value": 0
},
"fwhmg": {
"max": 0.1,
"min": 0.02,
"vary": true,
"value": 0.01
}
}
}
}
}
}
Jupyter Notebook Interface¶
SpectraFit can be used in a Jupyter Notebook. First, the plugin has to be installed as described in Installation. To use SpectraFit in the Jupyter Notebook, it has to be imported as follows:
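Based on the module path and class name used later in this section, the import and data preparation could look like the following sketch (the pandas loading step and the file name are illustrative assumptions):

import pandas as pd

from spectrafit.plugins.notebook import SpectraFitNotebook

# Illustrative assumption: the spectrum is loaded into a DataFrame that
# provides the columns "Energy" and "Noisy" used in the example below.
df = pd.read_csv("noisy_spectrum.csv")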
Next, the peak definition has to be provided as follows:
initial_model = [
{
"pseudovoigt": {
"amplitude": {"max": 2, "min": 0, "vary": True, "value": 1},
"center": {"max": 2, "min": -2, "vary": True, "value": 0},
"fwhmg": {"max": 0.1, "min": 0.02, "vary": True, "value": 0.01},
"fwhml": {"max": 0.1, "min": 0.01, "vary": True, "value": 0.01},
}
},
{
"gaussian": {
"amplitude": {"max": 2, "min": 0, "vary": True, "value": 1},
"center": {"max": 2, "min": -2, "vary": True, "value": 0},
"fwhmg": {"max": 0.1, "min": 0.02, "vary": True, "value": 0.01},
}
},
]
For generating the first fit via `spectrafit.plugins.notebook`, the following code has to be used:
spf = SpectraFitNotebook(df=df, x_column="Energy", y_column="Noisy")
spf.solver_model(
initial_model,
)
and to save the results as a `toml` file, just save the `spf` object as follows:
About the Jupyter Interface
The Jupyter interface is still under development and not yet fully functional. Furthermore, the peak definition changes a little bit: from a dictionary of dictionaries to a list of dictionaries. As a consequence, the peak number is not needed anymore.
Also the _global fitting routine_ is not yet implemented for the Jupyter interface.
More information about the Jupyter Notebook interface can be found in the Jupyter Example and the plugin documentation for Jupyter-SpectraFit-Interface of SpectraFit.
Jupyter Lab for SpectraFit can also be started via the command line:
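The exact console script depends on the installed SpectraFit version; assuming the plugin registers an entry point named `spectrafit-jupyter` (an assumption, please check your installation), the call would simply be:

spectrafit-jupyter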