# `latpy.io`: I/O for NDArray

Read and write NDArray objects to CSV, JSON, and plain text files.

```python
from latpy.io import save_csv, load_csv, save_json, load_json, save_text, load_text
```

Each format serves a different purpose:

| Format | Best for |
|---|---|
| **CSV** | Spreadsheet interchange, pandas/numpy interop, human inspection in Excel |
| **JSON** | Full round-trip preservation (dtype, shape, axes), metadata-rich storage |
| **Text** | Quick debugging dumps, simple logs, piping to command-line tools |

---

## CSV

```python
save_csv(path, arr, delimiter=",", fmt="%.18g")
load_csv(path, dtype=None, delimiter=",")
```

CSV (comma-separated values) is the most universally supported tabular format. The latpy CSV writer includes a **header comment line** that records the original shape and dtype so that `load_csv` can reconstruct arrays exactly, including 1D and 3D+ shapes that CSV alone cannot represent.

Header format:

```
# shape=(rows, cols), dtype=<dtype>
```

| Feature | Behavior |
|---|---|
| 1D arrays | Written as one value per line; header preserves original 1D shape |
| 2D arrays | Written as-is, one row per line |
| 3D+ arrays | Leading dims flattened into rows (last dim as columns); header preserves original shape |
| Header | `# shape=(...), dtype=...`; used on load to reconstruct the array |
| No header | Load infers 2D shape from row/column count (1-row data → 1D) |
| dtype inference | From header; if absent, inferred from values (int → I64, float → F64) |
| Custom delimiter | `delimiter="\|"` for pipe-delimited, `delimiter="\t"` for TSV |

### Complete examples

**2D array round-trip:**

```python
from latpy.latmath.array import array
from latpy.io import save_csv, load_csv

a = array([[1, 2, 3], [4, 5, 6]], dtype="i64")
save_csv("out.csv", a)
```

File contents (`out.csv`):

```
# shape=(2, 3), dtype=i64
1,2,3
4,5,6
```

Loading it back:

```python
b = load_csv("out.csv")
b.tolist()    # [[1, 2, 3], [4, 5, 6]]
b.shape       # (2, 3)
b.dtype.name  # "i64"
```

**1D array: single value
per line, header preserves 1D shape:**

```python
a = array([1.5, 2.5, 3.5], dtype="f64")
save_csv("out.csv", a)
```

File contents:

```
# shape=(3,), dtype=f64
1.5
2.5
3.5
```

Loading:

```python
b = load_csv("out.csv")
b.tolist()  # [1.5, 2.5, 3.5]
b.shape     # (3,)
```

**3D+ array: leading dimensions flattened into rows:**

```python
a = array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])  # shape (2, 2, 2)
save_csv("out.csv", a)
```

File contents:

```
# shape=(2, 2, 2), dtype=i64
1,2
3,4
5,6
7,8
```

The CSV file has 4 data rows (the product of the leading dimensions, `2 × 2`) and 2 columns (the last dimension). The header records the original 3D shape, so loading restores it exactly:

```python
b = load_csv("out.csv")
b.shape     # (2, 2, 2)
b.tolist()  # [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
```

**Custom delimiter (pipe):**

```python
a = array([[1, 2], [3, 4]])
save_csv("out.csv", a, delimiter="|")
```

File contents:

```
# shape=(2, 2), dtype=i64
1|2
3|4
```

Loading:

```python
b = load_csv("out.csv", delimiter="|")
b.tolist()  # [[1, 2], [3, 4]]
```

### Edge cases

**Empty array:** CSV cannot round-trip an empty array, because it produces zero data rows (only the header). `load_csv` raises `ShapeError`.

```python
a = array([])              # shape (0,)
save_csv("empty.csv", a)   # writes header only
# File content:
# # shape=(0,), dtype=i64
load_csv("empty.csv")      # ShapeError: load_csv: no data rows found
```

**1D with no header (e.g. hand-written file):**

```python
with open("no_header.csv", "w") as f:
    f.write("1,2,3\n")

b = load_csv("no_header.csv")
b.tolist()  # [1, 2, 3], inferred as 1D (single row)
b.shape     # (3,)
```

**Ragged rows (different column counts on different lines):**

```python
with open("ragged.csv", "w") as f:
    f.write("1,2\n3,4,5\n")

load_csv("ragged.csv")  # ShapeError: load_csv: ragged rows — expected 2 cols, got 3
```

**dtype mismatch between header and actual data:** The header dtype is treated as a hint; if the parsed values are of a different type, promotion occurs.
If `dtype=` is passed explicitly to `load_csv`, it overrides the header.

**NaN/Inf values in CSV:** These are written as the string `nan` or `inf` by the `%.18g` format. On load, they parse as float, triggering F64 dtype.

```python
from latpy.latmath.array import array

a = array([[1.0, float("nan")], [float("inf"), 4.0]], dtype="f64")
save_csv("nan.csv", a)
```

File contents:

```
# shape=(2, 2), dtype=f64
1,nan
inf,4
```

Loading:

```python
b = load_csv("nan.csv")
b.tolist()    # [[1.0, nan], [inf, 4.0]]
b.dtype.name  # "f64"
```

### Error handling

| Scenario | Error |
|---|---|
| File not found | `FileNotFoundError` (from `open`) |
| Permission denied | `PermissionError` (from `open`) |
| Empty file (no header, no data) | `ShapeError: load_csv: empty file` |
| No data rows after header | `ShapeError: load_csv: no data rows found` |
| Ragged rows | `ShapeError: load_csv: ragged rows — expected N cols, got M` |
| Non-NDArray passed to save | `DTypeError: save_csv: arr must be an NDArray` |

---

## JSON

```python
save_json(path, arr, indent=2)
load_json(path)
```

JSON stores the full array representation: data, dtype, shape, and axes names. This enables **perfect round-trip** for any NDArray regardless of dimensionality or axes metadata.

JSON structure:

```json
{
  "data": [[1, 2], [3, 4]],
  "dtype": "i64",
  "shape": [2, 2],
  "axes": ["row", "col"]
}
```

All keys are present on save. On load, `dtype`, `shape`, and `axes` are optional; if missing they fall back to defaults (`dtype` → `i64`, `shape` → inferred from data, `axes` → `("x0", "x1", ...)`).

Supports all dtypes: I64, F64, B1.
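The load-time fallbacks described above can be illustrated with a small standard-library sketch. This is not latpy's implementation: `infer_shape` and `read_payload` are hypothetical helper names, the result is a plain dict rather than an NDArray, and `ValueError` stands in for latpy's `ShapeError`.

```python
import json


def infer_shape(data):
    """Infer a shape tuple by following the first element at each nesting level."""
    shape = []
    while isinstance(data, list):
        shape.append(len(data))
        if not data:
            break  # an empty list ends the walk, giving e.g. (0,)
        data = data[0]
    return tuple(shape)


def read_payload(text):
    """Parse a JSON document and fill in the documented defaults (hypothetical helper)."""
    obj = json.loads(text)
    if not isinstance(obj, dict):
        raise ValueError("root must be a JSON object")  # stand-in for ShapeError
    if "data" not in obj:
        raise ValueError("missing 'data' key")  # stand-in for ShapeError
    shape = tuple(obj["shape"]) if "shape" in obj else infer_shape(obj["data"])
    if "axes" in obj:
        axes = tuple(obj["axes"])
    else:
        axes = tuple(f"x{i}" for i in range(len(shape)))  # default names x0, x1, ...
    return {"data": obj["data"], "dtype": obj.get("dtype", "i64"), "shape": shape, "axes": axes}


minimal = read_payload('{"data": [[1, 2], [3, 4]]}')
print(minimal["dtype"])  # i64
print(minimal["shape"])  # (2, 2)
print(minimal["axes"])   # ('x0', 'x1')
```

A file containing only a `data` key is therefore still loadable under these rules, which is convenient for hand-written fixtures.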
### Complete examples

**2D round-trip with axes:**

```python
from latpy.latmath.array import array
from latpy.io import save_json, load_json

a = array([[1, 2], [3, 4]], dtype="i64", axes=("row", "col"))
save_json("out.json", a)
```

File contents (`out.json`):

```json
{
  "data": [[1, 2], [3, 4]],
  "dtype": "i64",
  "shape": [2, 2],
  "axes": ["row", "col"]
}
```

Loading:

```python
b = load_json("out.json")
b.tolist()    # [[1, 2], [3, 4]]
b.shape       # (2, 2)
b.dtype.name  # "i64"
b.axes        # ("row", "col")
```

**1D and 3D (JSON handles any dimensionality):**

```python
a1 = array([1, 2, 3], axes=("x",))
save_json("out.json", a1)
```

File:

```json
{
  "data": [1, 2, 3],
  "dtype": "i64",
  "shape": [3],
  "axes": ["x"]
}
```

**Boolean arrays:**

```python
a = array([True, False, True], dtype="b1", axes=("flags",))
save_json("out.json", a)
```

File:

```json
{
  "data": [true, false, true],
  "dtype": "b1",
  "shape": [3],
  "axes": ["flags"]
}
```

**Custom indentation:**

```python
save_json("out.json", a, indent=4)  # 4-space indentation
```

### Edge cases

**Empty array round-trip:** JSON handles empty arrays correctly because `data: []` is valid JSON and the `shape` key records the original shape (e.g. `[0]` for 1D empty, `[0, 3]` for 2D empty).

```python
a = array([], dtype="i64")
save_json("empty.json", a)

b = load_json("empty.json")
b.tolist()  # []
b.shape     # (0,)
```

**None axes:** If an NDArray has default axes (e.g. `("x0", "x1")`), these are still written to JSON and restored on load. Axes are never `None` in an NDArray; they are always a tuple.

**Missing `data` key in loaded JSON:**

```python
import json

with open("bad.json", "w") as f:
    json.dump({"not_data": 1}, f)

load_json("bad.json")  # ShapeError: load_json: missing 'data' key
```

**Malformed JSON:**

```python
# assuming malformed.json holds invalid JSON, e.g. a truncated `{"data": [1, 2`
load_json("malformed.json")  # json.JSONDecodeError: ...
```

### Error handling

| Scenario | Error |
|---|---|
| File not found | `FileNotFoundError` (from `open`) |
| Malformed JSON | `json.JSONDecodeError` |
| Not a JSON object | `ShapeError: load_json: root must be a JSON object` |
| Missing `data` key | `ShapeError: load_json: missing 'data' key` |
| Non-NDArray passed to save | `DTypeError: save_json: arr must be an NDArray` |

---

## Text

```python
save_text(path, arr, fmt="%.18g")
load_text(path, dtype=None)
```

Whitespace-delimited text with **no header line**. This is the simplest format, suited to quick debugging, piping to Unix tools, or human reading. Because there is no header, multi-dimensional arrays are flattened to 2D **irreversibly**.

| Feature | Behavior |
|---|---|
| 1D arrays | One value per line (single column) |
| 2D arrays | One row per line, space-separated |
| 3D+ arrays | Flattened to 2D (lossy, no header to restore shape) |
| Load | Infers shape from consistent line lengths |

### Complete examples

**1D array (one value per line):**

```python
from latpy.latmath.array import array
from latpy.io import save_text, load_text

a = array([1.5, 2.5, 3.5], dtype="f64")
save_text("out.txt", a)
```

File contents (`out.txt`):

```
1.5
2.5
3.5
```

Loading:

```python
b = load_text("out.txt")
b.tolist()  # [1.5, 2.5, 3.5]
b.shape     # (3,)
```

**2D array (one row per line, space-separated):**

```python
a = array([[1, 2, 3], [4, 5, 6]], dtype="i64")
save_text("out.txt", a)
```

File contents:

```
1 2 3
4 5 6
```

Loading:

```python
b = load_text("out.txt")
b.tolist()  # [[1, 2, 3], [4, 5, 6]]
b.shape     # (2, 3)
```

**3D array (flattened to 2D, original shape lost):**

```python
a = array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])  # shape (2, 2, 2)
save_text("out.txt", a)
```

File contents (same as a 2D (4, 2) array):

```
1 2
3 4
5 6
7 8
```

Loading:

```python
b = load_text("out.txt")
b.shape     # (4, 2), NOT (2, 2, 2)
b.tolist()  # [[1, 2], [3, 4], [5, 6], [7, 8]]
```

**Custom format string:**

```python
a = array([[1.23456789,
2.3456789]], dtype="f64")
save_text("out.txt", a, fmt="%.2f")
```

File contents:

```
1.23 2.35
```

### Edge cases

**Empty array:** Text cannot round-trip empty arrays. `save_text` with a 1D empty array produces an empty file; `load_text` raises `ShapeError`.

```python
a = array([])
save_text("empty.txt", a)  # writes nothing
load_text("empty.txt")     # ShapeError: load_text: empty file
```

**Single-row data (load infers 1D):**

```python
with open("single.txt", "w") as f:
    f.write("1 2 3\n")

b = load_text("single.txt")
b.tolist()  # [1, 2, 3]; single row → 1D array
b.shape     # (3,)
```

**Ragged rows (inconsistent column counts):**

```python
with open("ragged.txt", "w") as f:
    f.write("1 2\n3 4 5\n")

load_text("ragged.txt")  # ShapeError: load_text: ragged rows — expected 2 cols, got 3
```

### Error handling

| Scenario | Error |
|---|---|
| File not found | `FileNotFoundError` (from `open`) |
| Empty file | `ShapeError: load_text: empty file` |
| No numeric data found | `ShapeError: load_text: no numeric data found` |
| Ragged rows | `ShapeError: load_text: ragged rows — expected N cols, got M` |
| Non-NDArray passed to save | `DTypeError: save_text: arr must be an NDArray` |

---

## Format comparison

| Feature | CSV | JSON | Text |
|---|---|---|---|
| Human-readable in Excel | Yes | No (but readable in editor) | Yes |
| Preserves dtype | Via header | Explicit | No (inferred) |
| Preserves shape (1D, 3D+) | Via header | Explicit | No (flattened) |
| Preserves axes names | No | Yes | No |
| Empty array round-trip | No | Yes | No |
| NaN / Inf support | Yes (text nan/inf) | Yes (JSON null → NaN) | Yes (text nan/inf) |
| Custom delimiter | Yes (any string) | N/A | N/A |
| File size (for same data) | Small | Largest (verbose) | Smallest |
| Dependencies | stdlib | stdlib (`json`) | stdlib |

**When to use each:**

- **CSV**: When you need to open the file in a spreadsheet, share with colleagues who use Excel/R/pandas, or pipe into data processing pipelines that expect CSV.
  Readers with comment support can skip the header line (pandas `read_csv(..., comment="#")`; `numpy.loadtxt` skips `#` lines by default), so the file remains compatible; spreadsheet applications show it as an ordinary first row.
- **JSON**: When you need faithful storage of latpy-specific metadata (axes names, exact dtype, arbitrary dimensionality). Use for checkpoints and long-term storage where round-trip fidelity matters.
- **Text**: When you want a quick glance at array contents, need to feed data into a shell pipeline, or are debugging and want minimal overhead. Accept the loss of dimensionality information.
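
The CSV recommendation above depends on other tools coping with the header comment line. As a minimal sketch, assuming the `# shape=(...), dtype=...` layout shown in the CSV section, the following standard-library reader skips comment lines before handing data to `csv.reader` and recovers the recorded shape and dtype name. `read_latpy_style_csv` and `HEADER_RE` are hypothetical names for illustration, not part of latpy, and every cell is parsed as a float for simplicity.

```python
import csv
import io
import re

# Sample text in the layout shown in the CSV section (3D array, shape (2, 2, 2)).
SAMPLE = "# shape=(2, 2, 2), dtype=i64\n1,2\n3,4\n5,6\n7,8\n"

# Assumed header layout: "# shape=(d0, d1, ...), dtype=name"
HEADER_RE = re.compile(r"#\s*shape=\(([^)]*)\),\s*dtype=(\w+)")


def read_latpy_style_csv(stream, delimiter=","):
    """Return (rows, shape, dtype_name); shape and dtype_name are None without a header."""
    shape, dtype_name, data_lines = None, None, []
    for line in stream:
        if line.startswith("#"):
            match = HEADER_RE.match(line)
            if match:
                dims = [d for d in match.group(1).split(",") if d.strip()]
                shape = tuple(int(d) for d in dims)
                dtype_name = match.group(2)
            continue  # comment lines never reach the CSV parser
        if line.strip():
            data_lines.append(line)
    reader = csv.reader(data_lines, delimiter=delimiter)
    rows = [[float(cell) for cell in row] for row in reader]
    return rows, shape, dtype_name


rows, shape, dtype_name = read_latpy_style_csv(io.StringIO(SAMPLE))
print(rows)        # [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
print(shape)       # (2, 2, 2)
print(dtype_name)  # i64
```

The rows come back in the flattened 2D layout; a consumer that needs the original dimensionality can reshape them using the recovered `shape`.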