# latpy API Overview

**Version:** 0.0.8 | **Python:** >=3.10 | **License:** MIT

## Package Tree

```
latpy/
├── latmath/                  # Mathematics
│   ├── array/                # N-dimensional arrays (core)
│   │   ├── ufuncs/           # Element-wise ops, comparisons
│   │   ├── reduce/           # Reductions (sum/min/max)
│   │   ├── broadcast/        # Broadcast-to view
│   │   └── sort/             # Sort, argsort, clip, etc.
│   ├── core/                 # Integers, rationals, bits, checks, errors
│   ├── stats/                # Statistics
│   ├── random/               # PRNG and sampling
│   ├── scalar/               # Constants, approximate comparison, error measures
│   └── optimize/             # Root-finding, gradient descent, discrete search
├── latdata/                  # Tabular data
├── io/                       # CSV/JSON/text I/O
├── ml/                       # K-Means, LinearRegression, metrics, SOV
├── viz/                      # Pure-SVG line/scatter/bar/hist plots, graph drawing
├── latx/                     # Symbolic expression compiler
└── torch/                    # Autograd wrapper (Tensor, backward, SGD)
```

## Design Philosophy per Module

### `latmath.array` — The numerical foundation

This is what nearly everything else builds on. NDArray provides a dense, strided, n-dimensional array backed by `array.array`, the standard library's typed sequence for compact numeric storage without NumPy. The module exists to answer the question: *how do you do vectorized numerical computing with zero dependencies?* It handles broadcasting, dtype promotion, strided views, reductions, and linear algebra, all within the stdlib. The module is deliberately split into `ufuncs`, `reduce`, `broadcast`, `sort`, `index`, and `linalg`, each solving one well-defined sub-problem so they can be tested and optimized independently.

### `latmath.core` — Exact arithmetic and foundational primitives

Pure-Python number theory and bitwise operations that NDArray and other modules depend on. `Rational` provides exact rational arithmetic (no floating-point drift) and `FixedPoint` provides a fixed-point representation; both are critical for lattice-based cryptography, equilibrium logic, and torsion physics, where IEEE 754 rounding is unacceptable.
The `checks` sub-module (`require_shape`, `require_dtype`, etc.) provides uniform validation throughout the library.

### `latmath.stats` — Descriptive statistics from scratch

Provides moment-based descriptive statistics (`describe`, `skew`, `kurtosis`) and probability distribution functions (`norm_pdf/cdf`, `poisson_pmf`, `uniform_pdf`). Everything is computed directly from NDArray reductions, with no external statistics package. The module exists because statistical description is a universal need, and implementing it atop latpy's own arrays demonstrates the framework's self-sufficiency.

### `latmath.random` — Seedable PRNG

Wraps Python's `random.Random` with a seed-management layer and adds array-oriented sampling (`randn` via Box–Muller, `randint`, `uniform`, `choice`, `shuffle`). Every seeded operation is deterministic: the underlying Mersenne Twister has no platform-specific behavior, so a given seed yields the same sequence on every platform. Note that Python itself guarantees a stable sequence across versions only for `random()`; the algorithms behind other `random.Random` methods may change between releases. Used by ML algorithms (K-Means initialization), random projections (SOV models), and testing infrastructure.

### `latdata` — Tabular data with named axes

Built directly on NDArray, `latdata` (`Axis`, `Table`, `GroupBy`) adds semantic column/row labeling so you can index data by name (`table["column_name"]`) rather than by integer position. This closes the ergonomic gap between raw arrays and spreadsheet-like usage. `GroupBy` provides split-apply-combine aggregation, demonstrating that latpy arrays are a sufficient backend for data-frame operations.

### `latpy.io` — Serialization round-trip

Saves and loads arrays and tables in CSV, JSON, and plain-text formats using nothing but stdlib modules (`csv`, `json`, `pathlib`). The CSV format embeds shape/dtype metadata in a comment header, enabling lossless round-trip for multi-dimensional data. This module exists because a library that can compute but cannot persist its data is incomplete.

### `latpy.ml` — From-scratch machine learning

K-Means, LinearRegression, classification/regression metrics, and SOV (State-Observation-Vector) models.
Everything is implemented on NDArray primitives — matrix multiplication is just loops over arrays; linear regression is the normal equations solved via Gaussian elimination. The SOV sub-package implements latent-state-space models using random projections and power iteration, directly leveraging `latmath.array.linalg`.

### `latpy.viz` — Pure-SVG visualization

Generates SVG figures with zero rendering dependencies — no Cairo, no matplotlib, no browser. The pipeline converts data coordinates to pixel coordinates via `LinearScale` / `LogScale` / `BandScale`, builds SVG elements via an XML builder (`SVG` / `Element`), and composes them into a `Figure`. This module exists because in lattice/embedded/restricted environments you often cannot install a rendering backend, but you can always produce an SVG string that any browser will display.

### `latmath.scalar` — Scalar utilities

Provides mathematical constants (`pi`, `e`, `tau`, `inf`, `nan`), error measures (`relative_error`, `absolute_error`), and tolerance-based comparison (`isclose`, `allclose`, `approx_eq`/`ne`/`lt`/`le`/`gt`/`ge`). This module exists because floating-point arithmetic is inherently imprecise, and naive `==` / `<` comparisons fail for computed values (e.g., `sqrt(2)**2 == 2` is `False`). The approximate comparison functions provide a consistent, configurable solution for numerical code.

### `latmath.optimize` — Optimization

Root-finding (`bisection`, `newton`), gradient-based optimization (`gradient_descent`), and discrete search (`grid_search`, `random_search`). All operate on scalar functions without any external solver dependency. This module exists because optimization is a universal need in numerical computing — from fitting models to tuning hyperparameters — and implementing it on latpy's own primitives demonstrates self-sufficiency.
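
To make the root-finding and tolerance-comparison ideas above concrete, here is a minimal stdlib-only sketch of interval bisection. The name `bisection_sketch`, its parameters, and its defaults are hypothetical and chosen for illustration; they are not latpy's actual `bisection` signature. The tolerance check uses `math.isclose` from the standard library as a stand-in for `latmath.scalar.isclose`.

```python
import math
from typing import Callable


def bisection_sketch(f: Callable[[float], float], lo: float, hi: float,
                     tol: float = 1e-12, max_iter: int = 200) -> float:
    """Find a root of f in [lo, hi] by repeatedly halving the interval.

    Hypothetical helper for illustration; not latpy's API.
    """
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0.0:
        return lo
    if f_hi == 0.0:
        return hi
    if f_lo * f_hi > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        f_mid = f(mid)
        # Stop on an exact root or once the bracket is narrow enough.
        if f_mid == 0.0 or (hi - lo) / 2.0 < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid                 # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid    # root lies in the upper half
    return (lo + hi) / 2.0


root = bisection_sketch(lambda x: x * x - 2.0, 0.0, 2.0)

# Exact equality is unreliable for computed floats, so compare with a tolerance.
exact_match = math.sqrt(2) ** 2 == 2.0
close_match = math.isclose(root, math.sqrt(2), rel_tol=1e-9)
```

Bisection only needs a sign change on the bracket, which makes it a robust fallback when derivative-based methods such as Newton's are not applicable.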
### `latx` — Symbolic expression compiler

Define symbolic expressions using operator overloading (`x * x + 2 * x + 1`) and transcendental functions (`Sin`, `Cos`, `Exp`, `Log`), then compile them to callable Python functions. This bridges the gap between mathematical model specification and numerical evaluation — useful for defining loss functions, physical models, or any computation that benefits from a separation between "what to compute" and "compute it."

### `torch` — Autograd wrapper

Lightweight reverse-mode automatic differentiation on latpy's NDArray. `Tensor` wraps NDArray and tracks a dynamic computation graph; calling `.backward()` computes gradients via the chain rule. `SGD` provides basic optimization. This module exists to support gradient-based learning without importing PyTorch or JAX — keeping latpy's zero-dependency promise while enabling differentiable programming.

## Module Interrelationships

```
latmath.core ────────────────────────► latmath.array ──► latdata
 (dtypes, errors)                         │      │
                                          │      │       (NDArray-
                                          │      │        backed
                                          ▼      ▼        Table)
                   ┌───────────────── latmath.stats ──► latpy.io
                   │                  latmath.random
                   │                  latmath.scalar
                   │                  latmath.optimize
                   │                        │
                   ▼                        ▼
                 latx ──────► compiler ──► callable fns
                   │
                   ▼
                 torch (autograd, Tensor, SGD)
                   │
                   ▼
                 latpy.ml ──────────► latpy.viz
                   │                      │
                (models)             (SVG plots)
                                          ▼
                                     stdout/file
```

- `latmath.core` provides the dtype system (`DType`, `I64`, `F64`, `B1`), error types (`ShapeError`, `DTypeError`, `DomainError`), and scalar primitives (`Rational`, `FixedPoint`) used by all other modules.
- `latmath.array` is the computational engine. Every other module that operates on numerical data imports NDArray from here.
- `latmath.stats`, `latmath.random`, `latmath.scalar`, and `latmath.optimize` consume NDArray and return NDArray or scalar results.
- `latx` is a standalone symbolic expression compiler (no NDArray dependency); its compiled functions can accept NDArray or plain float values.
- `torch` wraps NDArray with autograd; it depends on `latmath.array` for all operations and `latmath.array.numpy_compat` for transcendental functions.
- `latdata` is built on NDArray but adds a label-based indexing layer on top.
- `latpy.ml` depends on `latmath.array` for all matrix math, `latmath.random` for initialization, and `latmath.core.checks` for input validation.
- `latpy.viz` uses its own `LinearScale`/`LogScale`/`BandScale` from `scale.py` (not from `latmath.scalar`), though it consumes NDArray data for plotting. The separation is intentional: scale transforms are data-to-pixel mappings, not mathematical scalars.
- `latpy.io` reads NDArray and latdata structures from disk and reconstructs them with full dtype/axes fidelity.

## Conventions — in depth

### "Pure stdlib" in practice

Every `import` statement in the library draws exclusively from Python's standard library. The backbone types are:

| Need | Stdlib solution |
|---|---|
| Compact numeric buffer | `array.array` (typecodes `'q'` for I64, `'d'` for F64, `'b'` for B1) |
| PRNG | `random.Random` (seeded, deterministic) |
| SVG output | `xml.etree.ElementTree` |
| CSV parsing | `csv.reader` / `csv.writer` |
| JSON | `json.load` / `json.dump` |
| Math functions | `math.sqrt`, `math.sin`, etc. |
| Data classes | `dataclasses.dataclass` |
| Path handling | `pathlib.Path` |

This means latpy works on any Python ≥ 3.10 installation that ships the full standard library (Alpine Linux, Windows, a restricted CI runner, an embedded runtime) **without a single `pip install` beyond latpy itself**. There are no C extensions, no SIMD intrinsics, no compiled wrappers.

### Why named axes matter

Every array carries an `axes` tuple like `("row", "col")` or `("batch", "channel", "height", "width")`. Named axes serve three purposes:

1. **Self-documenting code.** `arr.axes` tells you what each dimension *means* without needing to remember positional conventions.
2. **Semantic indexing.** Future versions will support `arr.index(axis="col", key=...)`; the infrastructure for dimension-by-name access is already in place.
3. **Lattice-readiness.** In lattice-based substrates, dimensions correspond to physical directions (e.g., `("x", "y", "z")` for a 3D lattice or `("site", "spin")` for a spin lattice). Named axes make it possible to write generic lattice operations that refer to axes by physical meaning rather than position.

When you don't supply axes, the defaults are `("a0", "a1", ...)`.

### Lattice-readiness

"Lattice-ready" means the array architecture was designed from the ground up to support:

- **Strided views over regular grids** — the same stride mechanism that enables broadcasting also models lattice translations.
- **Named dimension labels** — axes like `("site", "neighbor")` map naturally onto adjacency structures.
- **Exact integer arithmetic** — `I64` and `Rational` avoid floating-point drift in lattice calculations (energy sums, partition functions).
- **Deterministic PRNG** — lattice Monte Carlo simulations require reproducible random sequences across platforms.

Future lattice modules (equilibrium logic solvers, torsion field propagators) will slot into the existing array infrastructure without changing NDArray internals.

### Shapes and strides

- Shapes are tuples of `int`, e.g., `(3, 4)` for a 3×4 matrix.
- Strides are tuples of `int` measured in **elements** (not bytes), e.g., `(4, 1)` for a C-contiguous 3×4 array.
- Row-major (C-order) is the default; strides are always computed via `c_strides()` for new arrays.
- Views share the underlying `_buf` (`array.array`) and differ only in `shape`, `strides`, `offset`, and `axes`. There is no copy-on-write; call `.copy()` explicitly to materialize an independent array.
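
The stride rules above can be sketched in a few lines of stdlib Python. This is an illustration under assumptions, not latpy's implementation: `c_strides_sketch` and `flat_index` are hypothetical helpers, and the "view" is modeled as nothing more than a different `(shape, strides)` pair over the same `array.array` buffer.

```python
from array import array


def c_strides_sketch(shape: tuple[int, ...]) -> tuple[int, ...]:
    """Row-major strides in elements: the last axis varies fastest."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def flat_index(index: tuple[int, ...], strides: tuple[int, ...],
               offset: int = 0) -> int:
    """Map a multi-dimensional index to a position in the flat buffer."""
    return offset + sum(i * s for i, s in zip(index, strides))


buf = array("d", range(12))          # 3x4 matrix stored as 0.0 .. 11.0
shape = (3, 4)
strides = c_strides_sketch(shape)    # element strides for C order

# A transposed "view": same buffer, reversed shape and strides, no copy.
t_shape, t_strides = shape[::-1], strides[::-1]

value = buf[flat_index((1, 2), strides)]       # row 1, column 2
t_value = buf[flat_index((2, 1), t_strides)]   # same element through the view
```

Because both index computations land on the same buffer position, a write through one would be visible through the other, which is why an explicit copy is needed to get an independent array.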
## Module Quick-Links

| Module | Status | Key Exports | Rationale |
|---|---|---|---|
| [`latmath.array`](latmath.array.md) | **Stable** | `NDArray`, `zeros`, `array`, `add`, `mul`, `sum`, `min`, `max`, `where`, `sub`, `ssd`, `argmin`, `qr`, `eig`, `solve`, `DType`, `I64`, `F64`, `B1`, `linalg`, `numpy_compat` | The foundational array type and all vectorized operations. Start here. |
| [`latmath.core`](latmath.core.md) | **Stable** | `gcd`, `egcd`, `lcm`, `modinv`, `isqrt`, `Rational`, `FixedPoint`, `bits`, `checks`, errors | Exact integer/rational arithmetic and validation utilities. Needed when floating-point is not acceptable. |
| [`latmath.stats`](latmath.stats.md) | **Stable** | `describe`, `histogram`, `skew`, `kurtosis`, `norm_pdf/cdf`, `poisson_pmf`, `uniform_pdf` | Descriptive statistics and probability distributions built on NDArray reductions. |
| [`latmath.random`](latmath.random.md) | **Stable** | `seed`, `randn`, `randint`, `uniform`, `rand`, `choice`, `shuffle` | Seedable, deterministic PRNG with array-oriented sampling. Used by ML and testing. |
| [`latdata`](latdata.md) | **Stable** | `Axis`, `Table`, `GroupBy` | Label-based tabular data on top of NDArray. For spreadsheet-like workflows. |
| [`latmath.scalar`](latmath.scalar.md) | **Stable** | `pi`, `e`, `tau`, `inf`, `nan`, `isclose`, `allclose`, `relative_error`, `absolute_error`, `approx_eq/ne/lt/le/gt/ge` | Constants, error measures, and tolerance-based comparison for scalar values. |
| [`latmath.optimize`](latmath.optimize.md) | **Stable** | `bisection`, `newton`, `gradient_descent`, `grid_search`, `random_search` | Root-finding, gradient descent, and discrete/black-box optimization. |
| [`latpy.io`](latpy.io.md) | **Stable** | `save_csv`, `load_csv`, `save_json`, `load_json`, `save_text`, `load_text` | Zero-dependency serialization with full metadata round-trip. |
| [`latpy.ml`](latpy.ml.md) | **Stable** | `kmeans`, `LinearRegression`, `accuracy`, `precision`, `recall`, `f1_score`, `confusion_matrix`, `mse`, `mae`, `r2`, `SOVRegression`, `SOVClassifier`, `SOVDynamics` | From-scratch ML: clustering, regression, classification, latent-state-space models. |
| [`latpy.viz`](latpy.viz.md) | **Stable** | `SVG`, `Figure`, `plot`, `scatter`, `bar`, `hist`, `draw_graph`, `LinearScale`, `LogScale`, `BandScale` | Pure-SVG visualization with no rendering dependencies. |
| [`latx`](latx.md) | **Stable** | `Var`, `Const`, `Expr`, `Add`, `Sub`, `Mul`, `Div`, `Pow`, `Neg`, `Sin`, `Cos`, `Exp`, `Log`, `compile` | Symbolic expression compiler — define models algebraically, compile to callable functions. |
| [`torch`](torch.md) | **Stable** | `Tensor`, `tensor`, `add`, `mul`, `sub`, `div`, `neg`, `pow`, `sin`, `cos`, `exp`, `log`, `sum`, `mean`, `matmul`, `SGD` | Autograd wrapper — reverse-mode AD on NDArray with SGD optimizer. |

## When to Use Each Module

| Task | Module |
|---|---|
| Creating and manipulating numerical arrays | `latmath.array` |
| Matrix math (QR, solve, eig) | `latmath.array.linalg` |
| Exact rational arithmetic (no float drift) | `latmath.core` (`Rational`, `FixedPoint`) |
| Bitwise operations, GCD, modular inverse | `latmath.core` |
| Descriptive statistics of array data | `latmath.stats` |
| Generating random numbers or shuffling arrays | `latmath.random` |
| Labeled column/row data with group-by aggregation | `latdata` |
| Reading/writing CSV, JSON, or text files | `latpy.io` |
| K-Means clustering or linear regression | `latpy.ml` |
| Classification metrics (precision, recall, F1) | `latpy.ml` |
| Latent-state-space models (SOV) | `latpy.ml.sov` |
| Generating SVG plots or graphs | `latpy.viz` |
| Mathematical constants (π, e, τ) or tolerance-based comparison | `latmath.scalar` |
| Root-finding, gradient descent, or discrete optimization | `latmath.optimize` |
| Defining symbolic mathematical models that compile to functions | `latx` |
| Gradient computation via automatic differentiation | `torch` |
| Migrating existing NumPy code to latpy | `latmath.array.numpy_compat` |

## Architecture Decisions

### Why no NumPy dependency

NumPy is a compiled C extension with platform-specific wheels, SIMD dispatch, and a large installation footprint (~200 MB on Alpine). For latpy's target environments — air-gapped systems, embedded Python runtimes, `python:alpine` containers, educational settings — requiring NumPy would defeat the purpose of a sovereign library. By using only `array.array` and stdlib primitives, latpy installs in <1 second, works on every platform that supports Python ≥ 3.10, and produces identical results everywhere.

**Trade-off:** Operations are ~10–100× slower than NumPy on large arrays. latpy is designed for correctness, transparency, and portability, not peak throughput.

### Why named axes

Most array libraries (NumPy, PyTorch, JAX) use positional dimensions — you remember that axis 0 is rows, axis 1 is columns. Named axes make this explicit and machine-checkable. When you write `data.axes = ("batch", "channel", "height", "width")`, every subsequent operation can refer to these by name. The primary motivation is lattice-readiness: in a 4D spin lattice, dimensions correspond to physical directions, and naming them avoids the axis-ordering bugs that plague positional indexing in complex geometries.

### Why pure-SVG for viz (no Cairo, no matplotlib)

Rendering backends like Cairo require compiled C libraries; matplotlib pulls in a tree of third-party dependencies, NumPy and Pillow among them. Neither is acceptable in latpy's target environments. SVG is a human-readable XML format that can be produced entirely with `xml.etree.ElementTree` — a stdlib module. The generated SVG can be viewed in any browser, embedded in Jupyter notebooks (via `IPython.display.SVG`), or converted to PDF/PNG by external tools if needed.
This makes viz work in environments where no display server exists and, because SVG is a vector format, plots stay sharp at any zoom level.

## Conventions

- All array operations are **pure stdlib** (no NumPy dependency). See "[Pure stdlib in practice](#pure-stdlib-in-practice)" above.
- Integer arithmetic is exact; floating-point is IEEE 754 double.
- Shapes are tuples of `int`; strides are tuples of `int` (elements, not bytes).
- Axes are optional string tuples for named dimension labeling.
- `latpy.__version__` provides the runtime version string.
- All public APIs raise `ShapeError`, `DTypeError`, or `DomainError` (from `latmath.core.errors`) on invalid input; errors are never silently swallowed.
- Functions that mutate state (e.g., `sort(axis=-1)`) do so in place; functions that return new arrays leave the original unchanged.
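
The exact-versus-floating-point convention in the list above can be illustrated with the standard library alone. This sketch uses `fractions.Fraction` as a stand-in for exact rational arithmetic; it is an assumption for illustration and says nothing about how latpy's own `Rational` type is implemented.

```python
from fractions import Fraction

# IEEE 754 doubles cannot represent 0.1, 0.2, or 0.3 exactly,
# so the rounded sum differs from the rounded literal.
float_equal = (0.1 + 0.2 == 0.3)

# The same identity holds exactly with rational arithmetic.
exact_equal = (Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))

# Integer arithmetic in Python is exact as well.
int_equal = (1 + 2 == 3)
```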