# Readme
Pysotope uses a hill-climbing algorithm to unmix spike and sample isotope compositions by calculating mass-dependent instrumental and natural isotope fractionation from a known standard. This yields the sample isotope composition and the concentration of the element of interest.
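To illustrate the idea, here is a minimal, self-contained sketch of a hill climb applied to a toy double-spike problem. This is not pysotope's actual inversion routine: the ratios, masses, and the simplified exponential fractionation law below are made-up assumptions for demonstration only.

```python
# Toy hill-climbing unmixing sketch (NOT pysotope's implementation).
# All numbers below are illustrative assumptions.
import random

STD   = [0.05, 1.00, 0.30]     # assumed standard isotope ratios
SPIKE = [0.90, 0.02, 1.50]     # assumed spike isotope ratios
MASS  = [49/52, 50/52, 53/52]  # assumed mass ratios for the fractionation law

def model(p, alpha, beta):
    """Predict measured ratios for spike fraction p, natural
    fractionation alpha, and instrumental fractionation beta."""
    mix = [(1 - p) * s * m**alpha + p * k
           for s, k, m in zip(STD, SPIKE, MASS)]
    return [r * m**beta for r, m in zip(mix, MASS)]

def misfit(params, measured):
    """Sum of squared residuals between model and measurement."""
    pred = model(*params)
    return sum((a - b) ** 2 for a, b in zip(pred, measured))

def hill_climb(measured, start=(0.5, 0.0, 0.0), step=0.1, n_iter=20000):
    """Randomly perturb the parameters, keep only improvements,
    and slowly tighten the search radius."""
    best = list(start)
    best_err = misfit(best, measured)
    for _ in range(n_iter):
        cand = [x + random.gauss(0, step) for x in best]
        cand[0] = min(max(cand[0], 1e-6), 1 - 1e-6)  # spike fraction in (0, 1)
        err = misfit(cand, measured)
        if err < best_err:
            best, best_err = cand, err
        else:
            step *= 0.9995
    return best, best_err

# Synthetic "measurement" with known parameters, then recover them.
truth = (0.4, 0.2, -1.5)
measured = model(*truth)
estimate, err = hill_climb(measured)
print('estimated (p, alpha, beta):', estimate, 'misfit:', err)
```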
## Installing
Get a working installation of Python, e.g. via the Anaconda Python distribution (used for testing on Windows 10, macOS Mojave, and Manjaro GNU/Linux).
Install the package by downloading/cloning this repository, navigating to the cloned folder, and running the commands below.
To clone (requires git; the `$` prompt indicates commands run as a non-privileged user):

```
$ git clone https://github.com/tarting/pysotope
$ cd pysotope
$ python setup.py install --user
```
This will install the packages listed in the `requirements.txt` file as well as the pysotope library and command.
Check that pysotope is installed by importing the pysotope library in Python: run `$ python` on the command line to launch the Python shell, followed by `>>> import pysotope` (the `>>>` prompt indicates a Python shell). If no errors are reported, pysotope is installed correctly. Type `>>> exit()` to exit the Python shell.
To get a list of available commands run:

```
$ pysotope --help
```

Subcommands also yield help texts, e.g.:

```
$ pysotope init --help
```
## Using the command-line tools
These tools are made specifically for the instruments, naming conventions, and workflow at the UCPH labs.
Pysotope requires a specific directory structure, and works best if a separate folder is used per sample set or project. The `data_root` directory must contain one appropriate `.json` specification file and a folder containing the data as `.xls` files or `.raw` folders from the IsoWorks software.
```
project_root/
    Cd_data_root/
        Cd_spec_file.json
        data_dir/
            run1 2189.xls
            run2 2190.xls
            ...
    Cr_data_root/
        Cr_spec_file.json
        data_dir/
            run1.raw/
            run1 2191.xls
            run2.raw/
            run2 2192.xls
            ...
```
Running pysotope is then a matter of opening a console, e.g. the Anaconda Prompt on Windows, or a terminal on macOS and Linux. Navigate to the `data_root` directory using the `cd` command and run the pysotope commands in the following order.
```
$ pysotope init 'data_dir'
```
This command generates a list of data files saved as `external_variables.xlsx`. Now modify this file to include sample weight, spike weight, and spike concentration. The existing columns can be edited to exclude parts of the recorded data, e.g. in case of missing signal, or to exclude entire runs without signal. Any data in columns added to the right of the initial columns will be carried over to the final results file. Any modifications made to the list persist through reruns of the init command. In this way you can add more data files by simply dropping them into the data directory and re-running the pysotope init command.
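The file can also be filled in programmatically. A minimal sketch, assuming pandas is installed; the `spike_conc` and `note` column names are illustrative placeholders, not pysotope's actual schema (check the headers of your own `external_variables.xlsx`):

```python
# Hypothetical sketch: bulk-edit external_variables.xlsx with pandas.
# Column names and values below are assumptions for illustration only.
import pandas as pd

ext = pd.read_excel('external_variables.xlsx')

# Set the same spike concentration for every run (illustrative value).
ext['spike_conc'] = 0.512

# Extra columns added to the right are carried through to the results file.
ext['note'] = 'batch 2019-09'

ext.to_excel('external_variables.xlsx', index=False)
```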
```
$ pysotope invert 'external_variables.xlsx'
```
This command uses the list generated by the init command and the specification file to invert the double-spike data, and produces a `results.xlsx` file with summarized data for each run and a `results_cycles.csv` file containing every collected cycle for each run.
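Both outputs are plain tabular files, so they can be loaded for further processing; a minimal sketch assuming pandas:

```python
# Load the invert outputs for further processing.
import pandas as pd

summary = pd.read_excel('results.xlsx')      # one row per run
cycles = pd.read_csv('results_cycles.csv')   # one row per collected cycle
print(summary.head())
```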
```
$ pysotope plot results.xlsx
```
This command creates summary diagrams for each bead, and a collected summary for all runs. These are saved as `.png` files in a GFX folder. The summary diagrams display each run in a separate color, cycles excluded from the mean calculation as red crosses, and 2-standard-deviation and 2-standard-error fields, both for individual runs and summarized across a bead run.
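Similar per-run statistics can be recomputed from the cycle-level output. A minimal sketch assuming pandas; the `run` and `delta` column names are assumptions for illustration (check `results_cycles.csv` for the actual headers):

```python
# Hypothetical sketch: recompute per-run 2SD and 2SE from the cycle file.
# The 'run' and 'delta' column names are assumptions, not pysotope's schema.
import pandas as pd

cycles = pd.read_csv('results_cycles.csv')
stats = cycles.groupby('run')['delta'].agg(['mean', 'std', 'count'])
stats['2SD'] = 2 * stats['std']
stats['2SE'] = 2 * stats['std'] / stats['count'] ** 0.5
print(stats)
```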
## Using as a python module
Import pysotope:

```python
>>> import pysotope as pst
```
You still need to provide a specification file (described in the previous section) to supply data external to the measurement.
```python
>>> spec = pst.read_json('pysotope/spec/Cr-reduction-scheme-data_only.json')
```
Read the data such that each cycle is represented as a list, numpy.array, or pd.Series of float values in a dictionary containing the key `'CYCLES'`. Given an excel sheet, csv, or pandas DataFrame of the format:
| Index | m49 | m50 | m51 | m52 | m53 | m54 | m56 |
|---|---|---|---|---|---|---|---|
| 0 | 0.0 | 0.2 | 0.0 | 0.5 | 0.3 | 0.2 | 0.0 |
| 1 | 0.0 | 0.3 | 0.0 | 0.6 | 0.4 | 0.3 | 0.0 |
| 2 | 0.0 | 0.2 | 0.0 | 0.5 | 0.3 | 0.2 | 0.0 |
| … | … | … | … | … | … | … | … |
| 120 | 0.0 | 0.2 | 0.0 | 0.5 | 0.3 | 0.2 | 0.0 |
where the index of the data columns is specified in the spec file; in this case, reading with `pd.read_[filetype](<file>, index_col=0)`, the index of m49 is 0 and that of m56 is 6. There can be any number of columns in the data file as long as it is indexed correctly.
```python
>>> import pandas as pd
>>> # filename convention: sample-ID note date-ISO no-serial number
>>> df = pd.read_excel(
...     'datadir/subdir/SPLID-015 500mV 2019-09-01 01-9999.xls',
...     index_col=0,
... )
```
Invert the data:

```python
>>> reduced_cycles = pst.invert_data(df.values, spec)
```
Here `reduced_cycles` is an OrderedDict with string keys and `numpy.ndarray[float]` values.
And summarise:

```python
>>> summary_statistics = pst.summarise_data(reduced_cycles, spec)
```
Here `summary_statistics` is an OrderedDict with string keys and float values. These can, for example, be joined into a pandas DataFrame or written to a csv file.
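For instance, a minimal sketch, assuming several runs have been inverted in a loop (the `summaries` dict and the output filename are illustrative):

```python
>>> # Hypothetical: collect per-run summaries into one DataFrame and save.
>>> import pandas as pd
>>> summaries = {'run1': summary_statistics}  # add one entry per run
>>> results = pd.DataFrame(summaries).T       # runs as rows, stats as columns
>>> results.to_csv('summary.csv')
```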