2. acquisition – data acquisition
The acquisition package is responsible for fetching data from an experimental database and returning pyfusion data objects. Base classes as well as datasystem-specific sub-packages are provided.
Two classes are involved in obtaining data. An acquisition class (subclass of BaseAcquisition) provides the basic interface to the data source, setting up any connections required. A fetcher class (subclass of BaseDataFetcher) is used to get data from a specified channel and shot number. In general usage, a fetcher class is not handled directly, but via the getdata() method. For example:
>>> import pyfusion
>>> h1 = pyfusion.getDevice('H1')
>>> mirnov_data = h1.acq.getdata(58123, 'H1_mirnov_array_1_coil_1')
Here, h1 is an instance of H1 (the subclass of Device specified in the [Device:H1] section in the configuration file). When instantiated, the device class checks the configuration file for an acquisition class specification and attaches an instance of the specified acquisition class, here h1.acq (which is a synonym of h1.acquisition). The getdata() method checks for a configuration section (here, a section named [Diagnostic:H1_mirnov_array_1_coil_1]) with information about the diagnostic, including which data fetcher class to use. The data fetcher is then called to fetch and return the data.
2.1. Base classes
class pyfusion.acquisition.base.BaseAcquisition(config_name=None, **kwargs)
Base class for datasystem specific acquisition classes.
Parameters: config_name – name of acquisition as specified in configuration file.
On instantiation, the pyfusion configuration is searched for a [Acquisition:config_name] section. The contents of the configuration section are loaded into the object namespace. For example, a configuration section:
[Acquisition:my_custom_acq]
acq_class = pyfusion.acquisition.base.BaseAcquisition
server = my.dataserver.com
will result in the following behaviour:
>>> from pyfusion.acquisition.base import BaseAcquisition
>>> my_acq = BaseAcquisition('my_custom_acq')
>>> print(my_acq.server)
my.dataserver.com
The configuration entries can be overridden with keyword arguments:
>>> my_other_acq = BaseAcquisition('my_custom_acq', server='your.data.net')
>>> print(my_other_acq.server)
your.data.net
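The loading behaviour described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not pyfusion's implementation: the SketchAcquisition class and the in-memory configuration text are hypothetical, and configparser stands in for pyfusion's own configuration object.

```python
import configparser

# Hypothetical in-memory stand-in for the pyfusion configuration file.
CONFIG_TEXT = """
[Acquisition:my_custom_acq]
acq_class = pyfusion.acquisition.base.BaseAcquisition
server = my.dataserver.com
"""

config = configparser.ConfigParser()
config.read_string(CONFIG_TEXT)


class SketchAcquisition:
    """Illustrative only: copies a configuration section into attributes."""

    def __init__(self, config_name=None, **kwargs):
        if config_name is not None:
            section = "Acquisition:%s" % config_name
            # Every entry in the section becomes an instance attribute.
            self.__dict__.update(config.items(section))
        # Keyword arguments are applied last, so they override the file.
        self.__dict__.update(kwargs)


my_acq = SketchAcquisition("my_custom_acq")
my_other_acq = SketchAcquisition("my_custom_acq", server="your.data.net")
print(my_acq.server)        # my.dataserver.com
print(my_other_acq.server)  # your.data.net
```

Applying the keyword arguments after the configuration entries is what gives them precedence; no separate merge logic is needed.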
getdata(shot, config_name=None, **kwargs)
Get the data and return the prescribed subclass of BaseData.
Parameters:
- shot – shot number
- config_name – name of a diagnostic configuration section, [Diagnostic:config_name], which specifies the fetcher class
Returns: an instance of a subclass of BaseData or BaseDataSet
This method needs to know which data fetcher class to use. If a config_name argument is supplied, then the [Diagnostic:config_name] section must exist in the configuration file and contain a data_fetcher class specification, for example:
[Diagnostic:H1_mirnov_array_1_coil_1]
data_fetcher = pyfusion.acquisition.H1.fetch.H1DataFetcher
mds_path = \h1data::top.operations.mirnov:a14_14:input_1
coords_cylindrical = 1.114, 0.7732, 0.355
coord_transform = H1_mirnov
If a data_fetcher keyword argument is supplied, it overrides the configuration file specification.
The fetcher class is instantiated, including any supplied keyword arguments, and the result of the fetch method of the fetcher class is returned.
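The way a dotted class path from the configuration can be turned into a class, with a keyword argument taking precedence, might look like the sketch below. The helper names import_from_path and resolve_fetcher are hypothetical, the data_fetcher keyword is assumed to be a dotted path string, and standard-library classes stand in for real fetcher classes so that the example runs without pyfusion installed.

```python
import configparser
import importlib

# Hypothetical configuration; the fetcher path points at a standard-library
# class purely so the sketch is runnable on its own.
CONFIG_TEXT = """
[Diagnostic:demo_diag]
data_fetcher = collections.OrderedDict
"""

config = configparser.ConfigParser()
config.read_string(CONFIG_TEXT)


def import_from_path(dotted_path):
    """Import 'package.module.ClassName' and return the class object."""
    module_name, class_name = dotted_path.rsplit(".", 1)
    return getattr(importlib.import_module(module_name), class_name)


def resolve_fetcher(config_name=None, **kwargs):
    """Return the fetcher class; a data_fetcher keyword wins over the file."""
    if "data_fetcher" in kwargs:
        return import_from_path(kwargs["data_fetcher"])
    section = "Diagnostic:%s" % config_name
    return import_from_path(config.get(section, "data_fetcher"))


from_config = resolve_fetcher("demo_diag")
from_kwarg = resolve_fetcher("demo_diag", data_fetcher="fractions.Fraction")
print(from_config.__name__)  # OrderedDict
print(from_kwarg.__name__)   # Fraction
```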
class pyfusion.acquisition.base.BaseDataFetcher(acq, shot, config_name=None, **kwargs)
Base class providing interface for fetching data from an experimental database.
Parameters:
- acq – an instance of a subclass of BaseAcquisition
- shot – shot number
- config_name – name of a Diagnostic configuration section.
It is expected that subclasses of BaseDataFetcher will be called via the getdata() method, which calls the data fetcher’s fetch() method.
do_fetch()
Actually fetches the data, using the environment set up by setup().
Returns: an instance of a subclass of BaseData or BaseDataSet
Although BaseDataFetcher.do_fetch() does not return any data object itself, it is expected that a do_fetch() method on a subclass of BaseDataFetcher will.
error_info(step=None)
Return specific information about an error to aid interpretation, e.g. the path for MDS. The dummy return value should be replaced in the specific routines.
fetch()
Always use this to fetch the data, so that setup() and pulldown() are used to set up and pull down the environment used by do_fetch().
Returns: the instance of a subclass of BaseData or BaseDataSet returned by do_fetch()
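The relationship between fetch(), setup(), do_fetch() and pulldown() follows the template method pattern, sketched below. The classes SketchFetcher and ConstantFetcher are hypothetical, and calling pulldown() in a finally clause (so that the environment is torn down even when do_fetch() raises) is an assumption rather than documented pyfusion behaviour.

```python
class SketchFetcher:
    """Illustrative base class: fetch() wraps do_fetch() with setup/pulldown."""

    def __init__(self):
        self.calls = []  # records the call order, for demonstration only

    def setup(self):
        self.calls.append("setup")

    def do_fetch(self):
        self.calls.append("do_fetch")
        return None  # the base class returns no data object

    def pulldown(self):
        self.calls.append("pulldown")

    def fetch(self):
        self.setup()
        try:
            return self.do_fetch()
        finally:
            # Assumption: tear down the environment even if do_fetch() fails.
            self.pulldown()


class ConstantFetcher(SketchFetcher):
    """Hypothetical subclass: only do_fetch() is overridden."""

    def do_fetch(self):
        self.calls.append("do_fetch")
        return [1.0, 1.0, 1.0]  # placeholder for a BaseData instance


fetcher = ConstantFetcher()
data = fetcher.fetch()
print(fetcher.calls)  # ['setup', 'do_fetch', 'pulldown']
print(data)           # [1.0, 1.0, 1.0]
```

Subclasses only need to supply do_fetch(); callers that go through fetch() get the setup and teardown steps without having to remember them.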
class pyfusion.acquisition.base.MultiChannelFetcher(acq, shot, config_name=None, **kwargs)
Fetch data from a diagnostic with multiple timeseries channels.
This fetcher requires a multichannel configuration section such as:
[Diagnostic:H1_mirnov_array_1]
data_fetcher = pyfusion.acquisition.base.MultiChannelFetcher
channel_1 = H1_mirnov_array_1_coil_1
channel_2 = H1_mirnov_array_1_coil_2
channel_3 = H1_mirnov_array_1_coil_3
channel_4 = H1_mirnov_array_1_coil_4
The channel names must be channel_ followed by an integer, and the channel values must correspond to other configuration sections (for example [Diagnostic:H1_mirnov_array_1_coil_1], [Diagnostic:H1_mirnov_array_1_coil_2], etc.) which each return a single channel instance of TimeseriesData.
fetch()
Fetch each channel and combine into a multichannel instance of TimeseriesData.
Return type: TimeseriesData
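The channel_ naming rule can be illustrated with a short sketch that collects the channel entries from a configuration section. The helper ordered_channels is hypothetical, configparser stands in for the pyfusion configuration, and ordering the channels by their integer suffix is an assumption about how they would be combined.

```python
import configparser
import re

# Entries are deliberately listed out of order to show the sorting step.
CONFIG_TEXT = """
[Diagnostic:H1_mirnov_array_1]
data_fetcher = pyfusion.acquisition.base.MultiChannelFetcher
channel_3 = H1_mirnov_array_1_coil_3
channel_1 = H1_mirnov_array_1_coil_1
channel_4 = H1_mirnov_array_1_coil_4
channel_2 = H1_mirnov_array_1_coil_2
"""

config = configparser.ConfigParser()
config.read_string(CONFIG_TEXT)

# A channel key is 'channel_' followed by an integer and nothing else.
CHANNEL_KEY = re.compile(r"^channel_(\d+)$")


def ordered_channels(section):
    """Return diagnostic names for channel_N entries, sorted by integer N."""
    found = []
    for key, value in config.items(section):
        match = CHANNEL_KEY.match(key)
        if match:
            found.append((int(match.group(1)), value))
    return [name for _, name in sorted(found)]


channels = ordered_channels("Diagnostic:H1_mirnov_array_1")
print(channels)
```

Sorting on the parsed integer rather than the key string keeps channel_10 after channel_9 when a diagnostic has ten or more channels.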
2.2. Sub-packages for specific data sources
Custom subclasses of BaseAcquisition and BaseDataFetcher are contained in dedicated sub-packages. Each sub-package has the structure:
subpkg/
    __init__.py
    acq.py
    fetch.py
with acq.py containing a subclass of BaseAcquisition and fetch.py containing a subclass of BaseDataFetcher.
2.2.1. MDSPlus
Interface for MDSplus data acquisition and storage.
This package depends on the MDSplus python package, available from http://www.mdsplus.org/binaries/python/
Pyfusion supports four modes for accessing MDSplus data:
- local
- thick client
- thin client
- HTTP via an H1DS MDSplus web service
The data access mode used is determined by the mds path and server variables in the configuration file (or supplied to the acquisition class via keyword arguments):
[Acquisition:my_data]
acq_class = pyfusion.acquisition.MDSPlus.acq.MDSPlusAcquisition
mydata_path = ...
server = my.mdsdataserver.net
The full MDSplus node path is stored in a diagnostic configuration section:
[Diagnostic:my_probe]
data_fetcher = pyfusion.acquisition.MDSPlus.fetch.MDSPlusDataFetcher
mds_node_path = \mydata::top.probe_signal
# Note that changing data sources (fetchers) is easier with :ref:`substitutions`
2.2.1.1. Local data access
The ‘local’ mode is used when a tree path definition refers to the local file system rather than an MDSplus server on the network. The mydata_path entry in the above example would look something like:
mydata_path = /path/to/my/data
2.2.1.2. Thick client access
The ‘thick client’ mode uses an MDSplus data server to retrieve the raw data files, but the client is responsible for evaluating expressions and decompressing the data. The server tree definitions are used, and the server for a given MDSplus tree is specified by the tree path in the format:
mydata_path = my.mdsdataserver.net::
or, if a port other than the default (8000) is used:
mydata_path = my.mdsdataserver.net:port_number::
2.2.1.3. Thin client access
The ‘thin client’ mode maintains a connection to an MDSplus data server. Expressions are evaluated and data decompressed on the server, requiring greater amounts of data to be transferred over the network. Because the thin client mode uses the tree paths defined on the server, no path variable is required. Instead, the server entry is used:
server = my.mdsdataserver.net
or, if a port other than the default (8000) is used:
server = my.mdsdataserver.net:port_number
2.2.1.4. HTTP web service access
The HTTP web service mode uses standard HTTP queries via the H1DS RESTful API to access the MDSplus data. The server is responsible for evaluating the data and transmits quantisation-compressed data to the client over port 80. This is especially useful if the MDSplus data is behind a firewall. The server attribute will be used for web service access if it begins with http://, for example:
server = http://h1svr.anu.edu.au/mdsplus/
The server attribute must be the URL component up to the MDSplus tree name. In this example, the URL for the MDSplus path \h1data::top.operations.mirnov:a14_14:input_1 and shot 58063 corresponds to
http://h1svr.anu.edu.au/mdsplus/h1data/58063/top/operations/mirnov/a14_14/input_1/
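The mapping from a node path and shot number to a URL can be sketched as follows. The function h1ds_url is hypothetical and the rule it applies (tree name, then shot, then the node names with '.' and ':' both treated as separators) is inferred from the single example above; it is not taken from the H1DS API itself.

```python
import re


def h1ds_url(server, mds_path, shot):
    """Build a web service URL from a full MDSplus node path and a shot."""
    # Split '\tree::node.path' into the tree name and the node path.
    tree, node_path = mds_path.lstrip("\\").split("::", 1)
    # Assumption: both '.' and ':' separate node names and map to '/'.
    nodes = [name for name in re.split(r"[.:]", node_path) if name]
    parts = [tree, str(shot)] + nodes
    return server.rstrip("/") + "/" + "/".join(parts) + "/"


url = h1ds_url(
    "http://h1svr.anu.edu.au/mdsplus/",
    r"\h1data::top.operations.mirnov:a14_14:input_1",
    58063,
)
print(url)
# http://h1svr.anu.edu.au/mdsplus/h1data/58063/top/operations/mirnov/a14_14/input_1/
```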
2.2.1.5. How Pyfusion chooses the access mode
If an acquisition configuration section contains a server entry (which does not start with http://), then MDSPlusAcquisition will set up a connection to the mdsip server when it is instantiated. Additionally, any tree path definitions (local and thick client) are loaded into the runtime environment at this time. When a call to the data fetcher is made (via getdata()), the data fetcher uses the full node path (including tree name) from the configuration file. If a matching (tree name)_path variable is defined for the acquisition module, then the corresponding local or thick client mode will be used. If no tree path is defined and the server variable is defined, pyfusion will attempt to use either the web services mode (if server begins with http://) or the thin client mode (if server does not begin with http://).
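The decision rules in this section can be condensed into a small function. This is a sketch of the described logic, not the code pyfusion runs: choose_access_mode is a hypothetical name, the acquisition settings are passed as a plain dictionary, and telling thick client from local paths by the presence of '::' in the tree path is an assumption based on the path formats shown earlier.

```python
def choose_access_mode(node_path, acq_config):
    """Return the access mode implied by a node path and acquisition settings."""
    # The tree name is the part of '\tree::node.path' before '::'.
    tree = node_path.lstrip("\\").split("::", 1)[0]
    tree_path = acq_config.get(tree + "_path")
    if tree_path is not None:
        # Assumption: 'host::' style paths mean thick client, others local.
        return "thick client" if "::" in tree_path else "local"
    server = acq_config.get("server")
    if server is None:
        raise ValueError("no %s_path or server entry defined" % tree)
    return "http" if server.startswith("http://") else "thin client"


node = r"\mydata::top.probe_signal"
modes = [
    choose_access_mode(node, {"mydata_path": "/path/to/my/data"}),
    choose_access_mode(node, {"mydata_path": "my.mdsdataserver.net::"}),
    choose_access_mode(node, {"server": "my.mdsdataserver.net"}),
    choose_access_mode(node, {"server": "http://h1svr.anu.edu.au/mdsplus/"}),
]
print(modes)  # ['local', 'thick client', 'thin client', 'http']
```

A tree path is checked before the server entry, so a configuration that defines both uses the local or thick client mode for that tree.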
2.2.1.6. Classes
class pyfusion.acquisition.MDSPlus.acq.MDSPlusAcquisition(*args, **kwargs)
Acquisition class for MDSplus data systems.
If a ‘server’ configuration parameter (not starting with ‘http’) is provided, a connection for thin client access will be set up. Also, any configuration parameters which end with ‘_path’ will be loaded into the environment.
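Loading the ‘_path’ parameters into the environment could look like the sketch below. The function load_tree_paths is hypothetical; it writes to a caller-supplied mapping so the example does not modify the real process environment, whereas in practice the target would presumably be os.environ.

```python
import os


def load_tree_paths(acq_config, environ=None):
    """Copy every '*_path' entry of acq_config into the environment mapping."""
    if environ is None:
        environ = os.environ
    loaded = {}
    for key, value in acq_config.items():
        if key.endswith("_path"):
            environ[key] = value
            loaded[key] = value
    return loaded


# A plain dict stands in for os.environ in this demonstration.
demo_environ = {}
loaded = load_tree_paths(
    {"mydata_path": "/path/to/my/data", "server": "my.mdsdataserver.net"},
    environ=demo_environ,
)
print(demo_environ)  # {'mydata_path': '/path/to/my/data'}
```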
2.2.2. H1
The H1 data acquisition package.
This subpackage contains a subclass of the MDSplus data fetcher which gets additional H1 specific metadata.
2.2.2.1. Classes
2.2.3. LHD
Data acquisition for LHD.
2.2.3.1. Classes
class pyfusion.acquisition.LHD.fetch.LHDTimeseriesDataFetcher(acq, shot, config_name=None, **kwargs)
Requires:
export Retrieve=~/retrieve/bin/ # (maybe not)
export INDEXSERVERNAME=DasIndex.LHD.nifs.ac.jp/LHD
Debugging
Off-site in pyfusion:
# set the config to use LHD fetcher
pyfusion.config.set('DEFAULT','LHDfetcher','pyfusion.acquisition.LHD.fetch.LHDTimeseriesDataFetcher')
# choose a shot that doesn't exist locally
run pyfusion/examples/plot_signals.py shot_number=999 diag_name='VSL_6' dev_name='LHD'
On-site test lines for exes:
retrieve SX8O 74181 1 33
retrieve Magnetics3lab1 74181 1 33
2015: retrieve_t seems to only work on FMD
retrieve_t FMD 117242 1 33
different error messages on Magnetics3lab1
Using retrieve_t:
Don't know when it is needed - always trying it first? If it gives an error, calculate according to .prm
timeit fmd=retriever.retrieve('Magnetics3lab1',105396,1,[33],False)
142ms without retrieve_t, 224 with, including failure (set True in above)
2.2.4. DSV
Acquisition module for data in a delimiter-separated value (DSV) format.
parameter | description |
---|---|
filename | Name of data file, with (shot) substitution string, e.g. /data/(shot).dat -> /data/12345.dat for shot 12345. (required) |
delimiter | Delimiter character for values, e.g. , for comma separated value (CSV) format. (optional, default is whitespace) |
This module provides support for reading data from a plain text file via numpy’s genfromtxt function. The only required configuration parameter is filename, which can include a shot number substitution string (shot). As an example, consider the following datafile for a 2-channel timeseries signal for shot number 12345:
# timebase channel 1 channel 2
3.000000e+00 -1.201389e-01 3.177084e-01
3.000002e+00 6.437500e-01 -4.461806e-01
3.000004e+00 5.347222e-02 -1.684028e-01
3.000006e+00 1.923611e-01 -2.951390e-02
3.000008e+00 4.006945e-01 -5.156250e-01
3.000010e+00 -8.840278e-01 1.012153e+00
3.000012e+00 2.618056e-01 -2.031250e-01
3.000014e+00 -1.597222e-02 -1.336806e-01
3.000016e+00 -1.597222e-02 1.788194e-01
3.000018e+00 5.743055e-01 -7.586806e-01
If the datafile is saved at /data/mirnov_data_12345.txt, we could use the following configuration file:
[Acquisition:my_text_data]
acq_class = pyfusion.acquisition.DSV.acq.DSVAcquisition
[Diagnostic:mirnov_data]
data_fetcher = pyfusion.acquisition.DSV.fetch.DSVMultiChannelTimeseriesFetcher
filename = /data/mirnov_data_(shot).txt
And access the data with pyfusion:
>>> import pyfusion as pf
>>> acq = pf.getAcquisition("my_text_data")
>>> data = acq.getdata(12345, "mirnov_data")
>>> data.timebase
Timebase([ 3. , 3.000002, 3.000004, 3.000006, 3.000008, 3.00001 ,
3.000012, 3.000014, 3.000016, 3.000018])
>>> data.signal[0]
Signal([-0.1201389 , 0.64375 , 0.05347222, 0.1923611 , 0.4006945 ,
-0.8840278 , 0.2618056 , -0.01597222, -0.01597222, 0.5743055 ])
>>> data.signal[1]
Signal([ 0.3177084, -0.4461806, -0.1684028, -0.0295139, -0.515625 ,
1.012153 , -0.203125 , -0.1336806, 0.1788194, -0.7586806])
By default, pyfusion expects values to be delimited by whitespace characters. The delimiting character can also be set in the configuration file. For example, the following datafile and configuration give the same result as the above example:
# timebase, channel 1, channel 2
3.000000e+00, -1.201389e-01, 3.177084e-01
3.000002e+00, 6.437500e-01, -4.461806e-01
3.000004e+00, 5.347222e-02, -1.684028e-01
3.000006e+00, 1.923611e-01, -2.951390e-02
3.000008e+00, 4.006945e-01, -5.156250e-01
3.000010e+00, -8.840278e-01, 1.012153e+00
3.000012e+00, 2.618056e-01, -2.031250e-01
3.000014e+00, -1.597222e-02, -1.336806e-01
3.000016e+00, -1.597222e-02, 1.788194e-01
3.000018e+00, 5.743055e-01, -7.586806e-01
where the configuration is:
[Acquisition:my_text_data]
acq_class = pyfusion.acquisition.DSV.acq.DSVAcquisition
[Diagnostic:mirnov_data]
data_fetcher = pyfusion.acquisition.DSV.fetch.DSVMultiChannelTimeseriesFetcher
filename = /data/mirnov_data_(shot).txt
delimiter = ,
Note that whitespace is stripped from configuration file values - if you want to use whitespace delimited data, as in the first example, simply omit the delimiter setting in your configuration.
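The two examples can be reproduced outside pyfusion with numpy's genfromtxt directly. The sketch below is not the fetcher's implementation: the helpers expand_filename and read_dsv are hypothetical, only the first three rows of the datafile are used, and the data is read from in-memory strings instead of /data/mirnov_data_12345.txt. Treating the first column as the timebase and the remaining columns as channels follows the output shown above.

```python
import io

import numpy as np

WHITESPACE_DATA = """# timebase channel 1 channel 2
3.000000e+00 -1.201389e-01 3.177084e-01
3.000002e+00 6.437500e-01 -4.461806e-01
3.000004e+00 5.347222e-02 -1.684028e-01
"""

COMMA_DATA = """# timebase, channel 1, channel 2
3.000000e+00, -1.201389e-01, 3.177084e-01
3.000002e+00, 6.437500e-01, -4.461806e-01
3.000004e+00, 5.347222e-02, -1.684028e-01
"""


def expand_filename(template, shot):
    """Replace the (shot) substitution string with the shot number."""
    return template.replace("(shot)", str(shot))


def read_dsv(source, delimiter=None):
    """Return (timebase, signals); delimiter=None means any whitespace."""
    table = np.genfromtxt(source, delimiter=delimiter)  # '#' lines are skipped
    timebase = table[:, 0]   # first column
    signals = table[:, 1:].T  # one row per channel
    return timebase, signals


filename = expand_filename("/data/mirnov_data_(shot).txt", 12345)
t_ws, sig_ws = read_dsv(io.StringIO(WHITESPACE_DATA))
t_csv, sig_csv = read_dsv(io.StringIO(COMMA_DATA), delimiter=",")
print(filename)                       # /data/mirnov_data_12345.txt
print(np.allclose(sig_ws, sig_csv))   # True
```

Leaving the delimiter unset maps naturally onto genfromtxt's default of splitting on any run of whitespace, which is why no delimiter entry is needed for the first datafile.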
2.2.4.1. Classes
class pyfusion.acquisition.DSV.fetch.DSVMultiChannelTimeseriesFetcher(acq, shot, config_name=None, **kwargs)
Fetch DSV data from specified filename.
This data fetcher uses two configuration parameters, filename (required) and delimiter (optional).
The filename parameter can include a substitution string (shot) which will be replaced with the shot number.
By default, whitespace is used for the delimiter character (if the delimiter parameter is not provided).
2.2.5. FakeData
Acquisition module for generating fake timeseries data for testing purposes.
At present, only a single channel sine wave generator is provided. Available configuration parameters are:
parameter | description |
---|---|
t0 | Starting time of signal timebase. |
n_samples | Number of samples. |
sample_freq | Sample frequency (Hz). |
frequency | Frequency of test sine-wave signal (Hz). |
amplitude | Amplitude of test sine-wave signal. |
All parameters are required.
For example, with the following configuration:
[Acquisition:fake_acq]
acq_class = pyfusion.acquisition.FakeData.acq.FakeDataAcquisition
[Diagnostic:fake_data]
data_fetcher = pyfusion.acquisition.FakeData.fetch.SingleChannelSineFetcher
t0 = 0.0
n_samples = 1024
sample_freq = 1.e6
frequency = 2.e4
amplitude = 2.5
we can generate a 20 kHz sine wave:
>>> import pyfusion as pf
>>> shot = 12345
>>> acq = pf.getAcquisition("fake_acq")
>>> data = acq.getdata(shot, "fake_data")
>>> data.timebase
Timebase([ 0.00000000e+00, 1.00000000e-06, 2.00000000e-06, ...,
1.02100000e-03, 1.02200000e-03, 1.02300000e-03])
>>> data.signal
Signal([ 0. , 0.31333308, 0.62172472, ..., 1.20438419,
0.92031138, 0.62172472])
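The values above are consistent with a signal of the form amplitude * sin(2 * pi * frequency * t) sampled at t = t0 + i / sample_freq. The sketch below reproduces them with the standard library only; the function sine_wave is hypothetical and the formula is inferred from the printed output, not taken from the SingleChannelSineFetcher source.

```python
import math


def sine_wave(t0, n_samples, sample_freq, frequency, amplitude):
    """Return (timebase, signal) lists for a single-channel sine wave."""
    timebase = [t0 + i / sample_freq for i in range(n_samples)]
    signal = [amplitude * math.sin(2 * math.pi * frequency * t) for t in timebase]
    return timebase, signal


# Same parameters as the [Diagnostic:fake_data] section above.
timebase, signal = sine_wave(
    t0=0.0, n_samples=1024, sample_freq=1.0e6, frequency=2.0e4, amplitude=2.5
)
print(len(timebase))        # 1024
print(round(signal[1], 6))  # 0.313333
```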