PKL-Converter and Visualizer
About the Pkl Converter and Visualizer
With the spectrafit-pkl-converter command line tool, you can convert pkl files that contain nested dictionaries, lists, or numpy arrays into a list of dictionaries with numpy arrays. This is useful for further processing with other tools.
In general, the pickle files can be very complex and contain nested dictionaries and lists, as shown in the following example:
stateDiagram
[*] --> pkl
pkl --> list
pkl --> np.array
pkl --> dict
pkl --> else
dict --> dict
dict --> list
dict --> np.array
list --> list
list --> np.array
np.array --> np.array
np.array --> list
np.array --> dict
dict --> list_of_dicts
list_of_dicts --> [*]
For the visualization of the pkl files, the spectrafit-pkl-visualizer command line tool can be used. It creates a graph of the pkl file and exports it as a graph file.
PKL Converter
The spectrafit-pkl-converter command line tool can be used like this:
➜ spectrafit-pkl-converter -h
usage: spectrafit-pkl-converter [-h] [-f {utf-16,utf-8,latin1,utf-32}] [-e {pkl.gz,pkl,npy,npz}] infile
Converter for 'SpectraFit' from pkl files to CSV files.
positional arguments:
infile Filename of the pkl file to convert.
options:
-h, --help show this help message and exit
-f {latin1,utf-16,utf-8,utf-32}, --file-format {latin1,utf-16,utf-8,utf-32}
File format for the optional encoding of the pickle file. Default is 'latin1'.
-e {pkl.gz,pkl,npy,npz}, --export-format {pkl.gz,pkl,npy,npz}
File format for export of the output file. Default is 'pkl'.
The following export formats are available:
- pkl: Pickle file, either as a plain pkl file or as a compressed pkl.gz file.
- npy: NumPy array, either as an npy file or as a compressed npz file.
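As a rough illustration of what the two pickle export variants amount to, the following sketch writes an object either as a plain pkl file or as a gzip-compressed pkl.gz file using only the standard library. The helper name export_pickle is hypothetical and not part of SpectraFit; the npy and npz variants would rely on NumPy and are left out here.

```python
import gzip
import pickle
from pathlib import Path
from typing import Any


def export_pickle(data: Any, fname: Path, export_format: str = "pkl") -> Path:
    """Write data as 'pkl' or gzip-compressed 'pkl.gz' (hypothetical helper)."""
    # Append the export format as the file extension, e.g. out -> out.pkl.gz
    target = fname.parent / f"{fname.name}.{export_format}"
    if export_format == "pkl":
        with open(target, "wb") as handle:
            pickle.dump(data, handle)
    elif export_format == "pkl.gz":
        with gzip.open(target, "wb") as handle:
            pickle.dump(data, handle)
    else:
        # Mirrors the documented behaviour of rejecting unknown export formats.
        raise ValueError(f"Export format '{export_format}' is not supported.")
    return target
```

A call such as export_pickle(data, Path("result"), "pkl.gz") would then create result.pkl.gz next to the working directory.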
If the pickle file was written with a different encoding, the spectrafit-pkl-converter supports the following encodings:
- utf-8: UTF-8 encoded file.
- utf-16: UTF-16 encoded file.
- utf-32: UTF-32 encoded file.
- latin1: Latin-1 encoded file.
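The encoding matters mostly for pickle files written by Python 2, where byte strings have to be decoded on load. A minimal sketch, assuming the chosen value ends up in the encoding argument of pickle.load (an assumption about the tool that this page does not confirm); the helper name load_pickle is hypothetical:

```python
import pickle
from pathlib import Path
from typing import Any


def load_pickle(infile: Path, encoding: str = "latin1") -> Any:
    """Load a pickle file, decoding Python 2 byte strings with `encoding`."""
    with open(infile, "rb") as handle:
        # `encoding` only affects 8-bit strings pickled by Python 2.
        return pickle.load(handle, encoding=encoding)
```

The default of latin1 is a common choice for legacy pickles because it maps every byte to a character, which is why arrays stored as byte strings survive the round trip.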
All keys up to the first key-value pair whose value is a numpy.ndarray or list are merged into a single string, which is used as the new filename. A list will be converted to a numpy.ndarray with the shape (len(list),).
graph LR
.pkl --> dict_1
.pkl --> dict_2
.pkl --> dict_3
.pkl --> dict_4
dict_1 --> dict_1.pkl
dict_2 --> dict_2.pkl
dict_3 --> dict_3.pkl
dict_4 --> dict_4.pkl
Using the spectrafit-pkl-converter as a Python module
The spectrafit-pkl-converter can also be used as a Python module. The following example calls the static method convert(infile, file_format) listed in the API reference below:
from pathlib import Path

from spectrafit.plugins.pkl_converter import PklConverter

# file_format is assumed to take the same encoding values as the -f option.
data = PklConverter.convert(infile=Path("test.pkl"), file_format="latin1")
The data variable contains the converted data as a dictionary, which can be nested.
See also:
Bases: Converter
Convert pkl data to CSV files.
General information
The pkl data is converted to a CSV file. The CSV file is saved in the same directory as the input file. The name of the CSV file is the name of the input file with the suffix .csv, prefixed with the names of the 'major' keys in the pkl file. Furthermore, a graph of the data can optionally be saved as a PDF file to provide a visual representation of the data structure.
Supported file formats
Currently supported file formats:
- [x] pkl
- [x] pkl.gz
- [x] ...
Source code in spectrafit/plugins/pkl_converter.py
__call__()
Run the converter.
convert(infile, file_format)
staticmethod
Convert the input file to the output file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
infile | Path | The input file as a Path object. | required |
file_format | str | The output file format. | required |
Returns:
Type | Description |
---|---|
Dict[str, Any] | The data as a dictionary, which can be a nested dictionary. |
get_args()
Get the arguments from the command line.
Returns:
Type | Description |
---|---|
Dict[str, Any] | The input file arguments as a dictionary, without additional information beyond the command line arguments. |
save(data, fname, export_format)
Save the converted pickle data to a file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | Any | The converted nested dictionary of the pkl data. | required |
fname | Path | The filename of the output file. | required |
export_format | str | The file format of the output file. | required |
Raises:
Type | Description |
---|---|
ValueError | If the export format is not supported. |
PKL Visualizer
The spectrafit-pkl-visualizer is intended for the visualization of pkl files. It creates a graph of the pkl file and exports it as a graph file.
The spectrafit-pkl-visualizer command line tool can be used like this:
➜ spectrafit-pkl-visualizer -h
usage: spectrafit-pkl-visualizer [-h] [-f {utf-32,utf-16,latin1,utf-8}] [-e {jpg,pdf,jpeg,png}] infile
Converter for 'SpectraFit' from pkl files to a graph.
positional arguments:
infile Filename of the pkl file to convert to graph.
options:
-h, --help show this help message and exit
-f {latin1,utf-16,utf-8,utf-32}, --file-format {latin1,utf-16,utf-8,utf-32}
File format for the optional encoding of the pickle file. Default is 'latin1'.
-e {jpg,pdf,jpeg,png}, --export-format {jpg,pdf,jpeg,png}
File extension for the graph export.
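To give an idea of what a graph of a pkl file contains, the sketch below walks a nested dictionary and emits parent-to-child edges in Graphviz DOT syntax. It is only an illustration: the function names, the "pkl" root label, and the DOT output are assumptions, and this page does not state how spectrafit-pkl-visualizer builds or renders its graph.

```python
from typing import Any, Iterator, List, Tuple


def iter_edges(data: Any, parent: str = "pkl") -> Iterator[Tuple[str, str]]:
    """Yield (parent, child) edges for every key of a nested dictionary."""
    if isinstance(data, dict):
        for key, value in data.items():
            # Use the full path as node name so equal keys stay distinct.
            child = f"{parent}/{key}"
            yield parent, child
            yield from iter_edges(value, child)


def to_dot(edges: List[Tuple[str, str]]) -> str:
    """Render the edges as a Graphviz DOT digraph."""
    lines = ["digraph pkl {"]
    lines += [f'    "{src}" -> "{dst}";' for src, dst in edges]
    lines.append("}")
    return "\n".join(lines)
```

The resulting DOT text could be rendered to pdf or png with an external Graphviz installation; lists and arrays are treated as leaves here, so only dictionary keys become nodes.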
Furthermore, the spectrafit-pkl-visualizer can export the structure of the pkl file as a JSON file, which stores the attributes together with their types and, for arrays, their shapes. The following example shows the structure of the JSON file:
Example of the JSON file
{
"file_1": {
"attribute_1": "<class 'list'>",
"attribute_2": "<class 'str'>",
"attribute_3": "<class 'numpy.ndarray'> of shape (201,)",
"attribute_4": "<class 'numpy.ndarray'> of shape (199,)",
"attribute_5": "<class 'numpy.ndarray'> of shape (10, 201, 10000)",
"attribute_6": "<class 'numpy.ndarray'> of shape (10, 201, 10000)",
"attribute_7": "<class 'numpy.ndarray'> of shape (10, 201, 10000)",
"attribute_8": "<class 'numpy.ndarray'> of shape (10, 199, 10000)",
"attribute_9": "<class 'numpy.ndarray'> of shape (10, 199, 10000)",
"attribute_10": "<class 'numpy.ndarray'> of shape (10, 199, 10000)",
"attribute_11": "<class 'numpy.ndarray'> of shape (10, 199, 10000)",
"attribute_12": "<class 'numpy.ndarray'> of shape (10000,)",
"attribute_13": "<class 'list'>",
"attribute_14": "<class 'numpy.ndarray'> of shape (10, 201)",
"attribute_16": "<class 'int'>",
"attribute_17": "<class 'str'>",
"attribute_19": "<class 'str'>"
},
"file_2": {
"attribute_1": "<class 'list'>",
"attribute_2": "<class 'str'>",
"attribute_3": "<class 'numpy.ndarray'> of shape (201,)",
"attribute_4": "<class 'numpy.ndarray'> of shape (199,)",
"attribute_5": "<class 'numpy.ndarray'> of shape (10, 201, 10000)",
"attribute_6": "<class 'numpy.ndarray'> of shape (10, 201, 10000)",
"attribute_7": "<class 'numpy.ndarray'> of shape (10, 201, 10000)",
"attribute_8": "<class 'numpy.ndarray'> of shape (10, 199, 10000)",
"attribute_9": "<class 'numpy.ndarray'> of shape (10, 199, 10000)",
"attribute_10": "<class 'numpy.ndarray'> of shape (10, 199, 10000)",
"attribute_11": "<class 'numpy.ndarray'> of shape (10, 199, 10000)",
"attribute_12": "<class 'numpy.ndarray'> of shape (10000,)",
"attribute_13": "<class 'list'>",
"attribute_14": "<class 'numpy.ndarray'> of shape (10, 201)",
"attribute_16": "<class 'int'>",
"attribute_17": "<class 'str'>",
"attribute_19": "<class 'str'>"
}
}
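A structure summary of this kind can be approximated with a few lines of Python. The sketch below is an assumption about how such a summary could be produced, not the tool's code: describe and summarize are hypothetical names, and the shape is read through duck typing so that the functions also run without NumPy installed.

```python
import json
from typing import Any, Dict


def describe(value: Any) -> str:
    """Return the type of a value, plus its shape if it has one."""
    text = str(type(value))
    shape = getattr(value, "shape", None)
    if shape is not None:
        # Objects such as numpy arrays expose their dimensions via .shape.
        text += f" of shape {tuple(shape)}"
    return text


def summarize(files: Dict[str, Dict[str, Any]]) -> str:
    """Summarize the attributes of each file as a JSON string."""
    summary = {
        name: {attr: describe(value) for attr, value in attrs.items()}
        for name, attrs in files.items()
    }
    return json.dumps(summary, indent=2)
```

Because only the type and shape are recorded, the summary stays small even when the arrays themselves hold millions of values.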
Example of the graph
The resulting graph looks like this: