Direct xarray views over the MeerKAT archive

This package presents an xarray view over observations in the MeerKAT archive.

Required Reading

You'll need some familiarity with xarray.

Opening a dataset

Use xarray.open_datatree with engine="xarray-kat" (or let xarray infer the engine automatically from the URL). The call returns an xarray.DataTree whose children are the individual scans of the observation.

import xarray_kat  # registers the "xarray-kat" engine
import xarray

token = "eyFILLMEIN"
capture_block_id = 1234567890
url = (
    f"https://archive-gw-1.kat.ac.za/{capture_block_id}"
    f"/{capture_block_id}_sdp_l0.full.rdb?token={token}"
)

dt = xarray.open_datatree(url, engine="xarray-kat", chunked_array_type="xarray-kat", chunks={})
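
The URL above is assembled by hand from the capture block ID, the stream name and the token. As a minimal sketch, the same string can be produced by a small helper that also escapes the token. build_rdb_url is a hypothetical name and not part of xarray-kat, and the assumption that other streams follow the same "<capture_block_id>_<stream_name>.full.rdb" file naming is not confirmed by this README.

```python
from urllib.parse import quote, urlencode

# Host taken from the example above.
ARCHIVE_HOST = "https://archive-gw-1.kat.ac.za"


def build_rdb_url(capture_block_id, token, stream_name="sdp_l0"):
    """Hypothetical helper (not part of xarray-kat): assemble the RDB URL.

    Assumes every stream uses the "<cbid>_<stream>.full.rdb" naming
    shown in the example above.
    """
    cbid = quote(str(capture_block_id), safe="")
    path = f"/{cbid}/{cbid}_{stream_name}.full.rdb"
    # urlencode escapes token characters that are unsafe in a query string.
    return f"{ARCHIVE_HOST}{path}?{urlencode({'token': token})}"


url = build_rdb_url(1234567890, "eyFILLMEIN")
print(url)
```

The resulting url can be passed to xarray.open_datatree exactly as in the snippet above.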

Each child node is keyed "<capture_block_id>_<stream_name>/<scan_index>" and exposes the MSv4-style data variables:

VISIBILITY — complex correlator data, shape (time, baseline_id, frequency, polarization)
WEIGHT — per-sample weights, same shape as VISIBILITY
FLAG — per-sample flags, same shape as VISIBILITY
UVW — baseline UVW coordinates, shape (time, baseline_id, uvw)
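
The node key format described above can be built and taken apart with plain string handling, which is handy when looping over scans. This is only a sketch: make_scan_key and parse_scan_key are hypothetical helpers, and the parser assumes the capture block ID itself contains no underscore (as in the numeric example ID), while the stream name may.

```python
from typing import NamedTuple


class ScanKey(NamedTuple):
    capture_block_id: str
    stream_name: str
    scan_index: int


def make_scan_key(capture_block_id, stream_name, scan_index):
    """Build "<capture_block_id>_<stream_name>/<scan_index>"."""
    return f"{capture_block_id}_{stream_name}/{scan_index}"


def parse_scan_key(key):
    """Split a node key into its parts.

    Assumes the capture block ID has no underscore, so the first "_"
    separates it from the stream name (which may contain underscores).
    """
    prefix, _, scan = key.strip("/").rpartition("/")
    cbid, _, stream = prefix.partition("_")
    return ScanKey(cbid, stream, int(scan))


key = make_scan_key(1234567890, "sdp_l0", 0)
print(key, parse_scan_key(key))
```

With an open tree, dt[key].ds would then give the dataset for that scan, as in the usage examples further down.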

Parameters

All keyword arguments below are passed through xarray.open_datatree.

applycal : str or list of str, default "": Calibration products to apply. Use "all" to apply every available product, an explicit list such as ["l1.G", "l1.B"] to apply specific products, or "" to skip calibration entirely.
scan_states : iterable of str, default ("scan", "track"): Only scans whose activity-sensor state appears in this collection are included in the tree. Pass e.g. ("track",) to keep only tracking scans.
uvw_sign_convention : "casa" or "fourier", default "casa": Sign convention for UVW coordinates. "casa" follows the antenna2 - antenna1 convention used by CASA and most radio-astronomy software. "fourier" uses the opposite sign.
van_vleck : "off" or "autocorr", default "off": Van Vleck correction for the MeerKAT F-engine quantisation distortion. "autocorr" corrects autocorrelation amplitudes using the built-in lookup table; "off" leaves the data unchanged.
preferred_chunks : dict, optional: Preferred chunk sizes along named dimensions, e.g. {"time": 4, "frequency": 512}. These are hints; the engine uses the natural archive chunking where possible.
capture_block_id : str, optional: Override the capture-block ID inferred from the RDB file. Rarely needed in normal use; useful when the ID embedded in the file differs from the one in the URL.
stream_name : str, optional: Override the data-stream name inferred from the RDB file (e.g. "sdp_l0"). Useful when an observation contains multiple streams and you want to open a specific one.
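
Because these options are plain keyword arguments, it can help to collect and check them in one place before opening the archive. The sketch below validates only the choices documented above and returns a dict to splat into xarray.open_datatree. make_open_kwargs is a hypothetical helper, not part of xarray-kat, and the engine may perform its own validation differently.

```python
# Choices listed in the parameter descriptions above.
ALLOWED = {
    "uvw_sign_convention": {"casa", "fourier"},
    "van_vleck": {"off", "autocorr"},
}


def make_open_kwargs(
    applycal="",
    scan_states=("scan", "track"),
    uvw_sign_convention="casa",
    van_vleck="off",
):
    """Hypothetical helper: check documented choices, return a kwargs dict."""
    chosen = {
        "uvw_sign_convention": uvw_sign_convention,
        "van_vleck": van_vleck,
    }
    for name, value in chosen.items():
        if value not in ALLOWED[name]:
            raise ValueError(
                f"{name} must be one of {sorted(ALLOWED[name])}, got {value!r}"
            )
    # applycal is either a single string ("all", "") or a list of products.
    if not isinstance(applycal, str):
        applycal = list(applycal)
    return {"applycal": applycal, "scan_states": tuple(scan_states), **chosen}


kwargs = make_open_kwargs(applycal=["l1.G", "l1.B"], scan_states=("track",))
print(kwargs)
# dt = xarray.open_datatree(url, engine="xarray-kat", chunks={}, **kwargs)
```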

Example Usage

Load a small observation entirely into memory:

import xarray_kat
import xarray

token = "eyFILLMEIN"
capture_block_id = 1234567890
url = (
    f"https://archive-gw-1.kat.ac.za/{capture_block_id}"
    f"/{capture_block_id}_sdp_l0.full.rdb?token={token}"
)

dt = xarray.open_datatree(url, chunked_array_type="xarray-kat", chunks={})
dt.load()

Select a subset of the data before loading:

ds = dt[f"{capture_block_id}_sdp_l0/0"].ds
ds = ds.isel(
    time=slice(10, 20),
    baseline_id=[1, 20, 30, 31, 32, 50],
    frequency=slice(256, 768),
)
ds.load()
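
Before calling load() on a selection it is worth estimating how much memory it will need. The sketch below works out the shape that the isel call above would produce and the size of the resulting VISIBILITY array. The full dimension sizes are hypothetical placeholders (real values come from ds.sizes), and the 8-byte item size assumes complex64 visibilities, which this README does not state.

```python
import math


def selected_length(indexer, dim_size):
    """Number of elements an isel-style indexer picks from one dimension."""
    if isinstance(indexer, slice):
        return len(range(*indexer.indices(dim_size)))
    return len(indexer)  # a list of integer positions


# Hypothetical full sizes of one scan; read the real ones from ds.sizes.
full_sizes = {"time": 100, "baseline_id": 2016, "frequency": 4096, "polarization": 4}

# Same indexers as the isel call above; unlisted dimensions are kept whole.
indexers = {
    "time": slice(10, 20),
    "baseline_id": [1, 20, 30, 31, 32, 50],
    "frequency": slice(256, 768),
}

shape = tuple(
    selected_length(indexers.get(dim, slice(None)), size)
    for dim, size in full_sizes.items()
)
itemsize = 8  # assumption: complex64 visibilities
nbytes = math.prod(shape) * itemsize
print(shape, f"{nbytes / 2**20:.2f} MiB")
```

WEIGHT and FLAG share the VISIBILITY shape, so their footprint scales the same way with their own item sizes.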

Apply calibration solutions and Van Vleck correction:

dt = xarray.open_datatree(
    url,
    chunked_array_type="xarray-kat",
    chunks={},
    applycal="all",
    van_vleck="autocorr",
)

If dask is installed, request dask-backed chunks along specific dimensions:

# Natural (archive) chunking — most efficient
dt = xarray.open_datatree(url, chunks={})
dt = dt.compute()

# Custom chunking — may cause repeated archive requests for overlapping chunks;
# prefer rechunking on top of natural chunks, or use a cache pool
dt = xarray.open_datatree(
    url, chunks={"time": 20, "baseline_id": 155, "frequency": 256}
)
dt = dt.compute()
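
To see why custom chunk sizes can trigger repeated archive requests, consider a simplified one-dimensional model: each requested chunk fetches every natural chunk it overlaps, and nothing is cached. The natural chunk size of 256 used here is a made-up figure for illustration, not the archive's actual chunking, and archive_reads is a hypothetical helper.

```python
def chunk_bounds(dim_size, chunk_size):
    """(start, stop) pairs covering a dimension in steps of chunk_size."""
    return [
        (start, min(start + chunk_size, dim_size))
        for start in range(0, dim_size, chunk_size)
    ]


def archive_reads(dim_size, natural, requested):
    """Natural chunks touched, summed over every requested chunk.

    Simplified model: no caching, so a natural chunk that straddles two
    requested chunks is counted (fetched) twice.
    """
    reads = 0
    for start, stop in chunk_bounds(dim_size, requested):
        first = start // natural
        last = (stop - 1) // natural
        reads += last - first + 1
    return reads


dim_size, natural = 1024, 256  # hypothetical frequency axis and chunking
for requested in (256, 512, 384, 100):
    print(requested, archive_reads(dim_size, natural, requested))
```

In this model, requested sizes that are whole multiples of the natural size read each natural chunk once, while misaligned sizes read some of them more than once. That is the motivation for rechunking on top of natural chunks rather than requesting arbitrary sizes.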
