Lazy Loading

This page demonstrates the lazy-loading utilities defined in nbs/01_lazy_loading.py.

Array Proxy

import numpy as np

from mertisreader.lazy_loading import LazyArray

base = np.arange(12, dtype=np.float64).reshape(3, 4)
lazy = LazyArray(base, header={"SIMPLE": True})

print(lazy)
print(lazy.shape)
print(lazy.dtype)
print(lazy.mean())
print(lazy.materialize())

Output:

LazyArray(shape=(3, 4), dtype=float64, status=lazy (memmap))
(3, 4)
float64
5.5
[[ 0.  1.  2.  3.]
 [ 4.  5.  6.  7.]
 [ 8.  9. 10. 11.]]
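
The status string above mentions a memmap. This page does not show how LazyArray works internally, so the following is only a minimal sketch of the general memory-mapping technique using plain NumPy. The file name and variable names are hypothetical and are not part of mertisreader.

```python
import tempfile
from pathlib import Path

import numpy as np

# Write a small array to disk as raw bytes (hypothetical file, for illustration only).
tmp_dir = Path(tempfile.mkdtemp())
raw_path = tmp_dir / "example.dat"
np.arange(12, dtype=np.float64).reshape(3, 4).tofile(raw_path)

# Open the file as a memory map: array data is read only when elements are accessed.
mapped = np.memmap(raw_path, dtype=np.float64, mode="r", shape=(3, 4))

# Shape and dtype come from the arguments above, not from reading the data.
shape, dtype = mapped.shape, mapped.dtype

# Slicing touches only the part of the file that backs the requested row.
second_row = np.asarray(mapped[1])

# Copying into a regular ndarray is the explicit "materialize" step.
in_memory = np.array(mapped)
```

Metadata stays available without reading the data, and the full copy happens only at the last step.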

CSV Proxy

from pathlib import Path

from mertisreader.lazy_loading import LazyCSVLoader

sample_csv = Path("/tmp/mertisreader_lazy_loading_sample.csv")
sample_csv.write_text("a,b,c\n1,2,3\n4,5,6\n", encoding="utf-8")

loader = LazyCSVLoader(sample_csv)

print(loader)
print(loader.columns)
print(loader.materialize())

Output:

LazyCSVLoader(path=/tmp/mertisreader_lazy_loading_sample.csv, status=lazy (query plan))
['a', 'b', 'c']
   a  b  c
0  1  2  3
1  4  5  6

Running this example also emits a PerformanceWarning from mertisreader/lazy_loading.py, line 211, apparently triggered by the loader.columns access:

PerformanceWarning: Determining the column names of a LazyFrame requires resolving its schema, which is a potentially expensive operation. Use `LazyFrame.collect_schema().names()` to get the column names without this warning.

Notes

LazyArray keeps FITS-backed data accessible without forcing an immediate load, while LazyCSVLoader defers CSV reads until the data is explicitly collected.
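
The deferred-read behaviour described here can be illustrated with a small standard-library sketch. DeferredCSV is a hypothetical stand-in written for this page. It is not LazyCSVLoader, and it returns a list of dictionaries rather than a data frame.

```python
import csv
import tempfile
from pathlib import Path


class DeferredCSV:
    """Hypothetical stand-in: remembers a path and parses rows only on request."""

    def __init__(self, path):
        self.path = Path(path)
        self._rows = None  # nothing has been read yet

    @property
    def status(self):
        return "lazy" if self._rows is None else "loaded"

    @property
    def columns(self):
        # Read only the header line, not the whole file.
        with self.path.open(newline="", encoding="utf-8") as handle:
            return next(csv.reader(handle))

    def materialize(self):
        # The first call parses the file; later calls reuse the cached rows.
        if self._rows is None:
            with self.path.open(newline="", encoding="utf-8") as handle:
                self._rows = list(csv.DictReader(handle))
        return self._rows


sample = Path(tempfile.mkdtemp()) / "sample.csv"
sample.write_text("a,b,c\n1,2,3\n4,5,6\n", encoding="utf-8")

deferred = DeferredCSV(sample)
status_before = deferred.status   # still lazy: no rows parsed
columns = deferred.columns        # header only
rows = deferred.materialize()     # full read happens here
status_after = deferred.status
```

Reading the header separately keeps column lookups cheap, and caching the parsed rows means repeated materialize() calls do not re-read the file.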