Getting Started with latpy

latpy is a pure-Python array and data library built from the ground up for clarity, portability, and zero compiled dependencies. It provides a NumPy-like experience — arrays, broadcasting, linear algebra, stats, ML, labeled tables, and SVG visualisation — entirely in Python, with no C extensions or Fortran libraries.

This guide walks you through every feature with practical examples, explains why things work the way they do, and points out edge cases you’ll encounter in real use.


Installation

latpy is installed from its Git repository. The -e (editable) flag means changes to the source take effect immediately — useful if you’re developing or debugging.

git clone https://gitlab.com/tyarc-lab/latpy.git
cd latpy
pip install -e .

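Alternatively, add the source directory to your PYTHONPATH. This works when you want to point an existing environment at a specific checkout without re-running pip. The command below is for the Windows command prompt; the Linux/macOS equivalent is listed under Troubleshooting.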

set PYTHONPATH=C:\path\to\latpy\src;%PYTHONPATH%

Edge case — fresh environment: If you see ModuleNotFoundError: No module named 'latpy', double-check that (a) the src/ directory is on PYTHONPATH, or (b) pip install -e . completed without error. latpy has no required runtime dependencies, so a missing module is almost always a path issue.


Your First Array

array() and zeros() are your primary constructors. They live in the latmath.array namespace and return NDArray objects — the core data structure.

from latpy.latmath.array import array, zeros

# From a Python list
a = array([1, 2, 3, 4, 5])
print(a)           # NDArray([1, 2, 3, 4, 5])
print(a.shape)     # (5,)
print(a.dtype)     # DType(name='i64', code='q', size=8)

Why I64? When you pass integers, latpy chooses I64 (signed 64-bit) — the widest platform-safe integer type. This avoids overflow on most common operations and mimics NumPy’s default int64 on 64-bit platforms.

# 2D array
b = array([[1, 2], [3, 4], [5, 6]])
print(b.shape)     # (3, 2)

Shapes are always tuples: (N,) for 1-D, (M, N) for 2-D, etc. A scalar value extracted via a[0] is a plain Python int or float, not a 0-D array.

# Zero-filled
c = zeros((2, 3))
print(c.tolist())  # [[0, 0, 0], [0, 0, 0]]

zeros() returns an I64 array by default. Use dtype="f64" to get floats.

# Float array
d = array([1.5, 2.5], dtype="f64")

Edge cases:

  • Empty list: array([]) creates an array of shape (0,). Most reductions (.sum(), .mean()) return 0 or 0.0 on empty arrays; min() / max() raise ValueError.

  • Mixed int/float: array([1, 2.0]) promotes to F64 (float has higher priority). See the type promotion rules in the “Data Types” section.

  • Ragged nesting: array([[1, 2], [3]]) raises ValueError — all sub-lists must have the same length.


Data Types

Three built-in dtypes cover the vast majority of use cases. There are no unsigned or half-precision types.

from latpy.latmath.array.dtypes import I64, F64, B1, parse_dtype

# Three built-in dtypes:
#   I64 — signed 64-bit integer
#   F64 — double-precision float
#   B1  — boolean (0/1)

Why only three? latpy is designed for teaching, prototyping, and data analysis — not system programming. Restricting to I64, F64, and B1 eliminates the “which int size?” confusion that beginners face in NumPy, while covering every operation in this guide.

# Parse from string name:
dt = parse_dtype("f64")   # F64
dt = parse_dtype("b1")    # B1
dt = parse_dtype(None)    # I64 (default)

parse_dtype(None) returns I64, which is the fallback when no dtype is specified.

# DType properties:
print(F64.name)   # "f64"
print(F64.kind)   # "f"  — "i" for int, "f" for float, "b" for bool
print(F64.size)   # 8 (bytes)

Type promotion rules: When two dtypes meet in an operation:

  • F64 wins over everything (float + int → float, float + bool → float).

  • I64 wins over B1 (int + bool → int, treating True as 1 and False as 0).

  • B1 with B1 stays B1.

This mirrors NumPy’s “safe” promotion rules but with a much smaller type set.
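The three rules above amount to "the higher-priority dtype wins". The following is a minimal sketch of that idea in plain Python, not latpy's actual implementation; the PRIORITY table and the promote helper are hypothetical names used only for illustration.

```python
# Hypothetical illustration of the promotion rules; this is not latpy code.
# A higher number means the dtype wins when two dtypes meet.
PRIORITY = {"b1": 0, "i64": 1, "f64": 2}


def promote(left: str, right: str) -> str:
    """Return the dtype name that results from combining two dtypes."""
    return left if PRIORITY[left] >= PRIORITY[right] else right


print(promote("i64", "f64"))  # f64
print(promote("b1", "i64"))   # i64
print(promote("b1", "b1"))    # b1
```

Because the rule is a simple maximum over a fixed ordering, the result does not depend on operand order.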


Indexing

Indexing supports scalars, slices, None (newaxis), boolean masks, and integer arrays (fancy indexing). Understanding the copy vs. view distinction is critical.

a = array([10, 20, 30, 40, 50])

# Scalar
print(a[0])       # 10

Scalar indexing returns a plain Python int or float.

# Slice (returns view)
print(a[1:4])     # NDArray([20, 30, 40])

Why views? Slices return a view (not a copy) — they share memory with the original. This makes slicing cheap (O(1)) and avoids data duplication. Changes to the slice affect the original array, and vice versa. If you need an independent copy, call .copy() explicitly.
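To make the view/copy distinction concrete without relying on latpy itself, the sketch below uses only the standard library: a memoryview slice shares memory with its buffer, while slicing an array.array produces an independent copy. It illustrates the semantics described above and makes no claim about how NDArray is implemented internally.

```python
from array import array as pyarray

buf = pyarray("q", [10, 20, 30, 40, 50])  # "q" = signed 64-bit integers

view = memoryview(buf)[1:4]    # shares memory with buf
independent = buf[1:4]         # slicing an array.array copies the data

view[0] = 99                   # writes through to buf[1]
independent[1] = -1            # leaves buf untouched

print(buf.tolist())            # [10, 99, 30, 40, 50]
print(independent.tolist())    # [20, -1, 40]

view.release()                 # drop the exported buffer when done
```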

# Newaxis (inserts dimension of size 1)
print(a[None].shape)       # (1, 5)
print(a[:, None].shape)    # (5, 1)

None (or np.newaxis) inserts a dimension of size 1 at that position. This is primarily used for broadcasting — for example, a[:, None] - a[None, :] builds a pairwise-difference matrix.

# Boolean mask
mask = a > 25
print(mask)                 # NDArray([0, 0, 1, 1, 1])
print(a[mask])              # NDArray([30, 40, 50])

Boolean indexing always copies. Because the selected elements may not occupy a contiguous memory region, latpy returns a fresh array.

# Fancy indexing
idx = array([0, 2, 4])
print(a[idx])              # NDArray([10, 30, 50])

Fancy indexing (using integer arrays) also always copies. This matches NumPy’s contract: when you index with an array of positions, the result is guaranteed to be contiguous and independent of the original.

# 2D fancy indexing (row selection + paired indices)
A = array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])
print(A[array([0, 2])])           # NDArray([[10, 20, 30], [70, 80, 90]])
print(A[array([0, 2]), array([1, 2])])  # NDArray([20, 90])
print(A[array([0, 1]), 1])        # NDArray([20, 50])

When you pass two index arrays (A[[i0, i1], [j0, j1]]), they are paired element-wise: you get A[i0, j0], A[i1, j1]. Mixing a 1-D array with a scalar broadcasts the scalar.
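The pairing rule can be sketched with nested lists. This is an illustration of the semantics only; paired_index is a hypothetical helper, not part of latpy, and it assumes at least one of the two indices is a list.

```python
A = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]


def paired_index(matrix, rows, cols):
    """Pick matrix[r][c] for each (r, c) pair; a scalar index is repeated."""
    if isinstance(rows, int):
        rows = [rows] * len(cols)
    if isinstance(cols, int):
        cols = [cols] * len(rows)
    return [matrix[r][c] for r, c in zip(rows, cols)]


print(paired_index(A, [0, 2], [1, 2]))  # [20, 90]
print(paired_index(A, [0, 1], 1))       # [20, 50]
```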

Edge cases:

  • Out-of-bounds scalar: a[100] raises IndexError. latpy validates bounds eagerly.

  • Out-of-bounds slice: a[3:100] does not raise — it silently returns whatever elements overlap (like Python’s own list slicing).

  • Boolean mask size mismatch: a[array([True, False])] on a length-5 array raises IndexError, because the mask must match the axis size.

  • Empty slice: a[2:2] returns an empty array of shape (0,).


Operations

Arithmetic, comparison, and reduction operations follow NumPy broadcasting semantics and type promotion rules.

a = array([1, 2, 3])
b = array([4, 5, 6])

# Arithmetic
print(a + b)       # NDArray([5, 7, 9])
print(a - b)       # NDArray([-3, -3, -3])
print(a * b)       # NDArray([4, 10, 18])
print(a / b)       # NDArray([0.25, 0.4, 0.5])
print(a ** 2)      # NDArray([1, 4, 9])

Why NumPy broadcasting? Broadcasting allows arrays of different shapes to be combined without explicit looping or replication. latpy follows the same rules:

  1. Right-align shapes: (3,) vs () becomes (3,) vs (1,).

  2. Dimensions of size 1 stretch to match the other.

  3. Mismatched sizes raise ValueError.

So a ** 2 works because 2 (shape ()) broadcasts to (3,). Likewise, a + array([[1], [2]]) broadcasts (3,) against (2, 1) and produces a result of shape (2, 3).
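The three broadcasting rules can be written as a short shape calculation. The sketch below is a stand-alone illustration in plain Python (broadcast_shape is a hypothetical helper, not a latpy function): it right-aligns the two shapes, stretches dimensions of size 1, and raises ValueError on a mismatch.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Compute the broadcast result shape of two shape tuples."""
    result = []
    # Walk both shapes from the right; missing leading dims count as 1.
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x == y or y == 1:
            result.append(x)
        elif x == 1:
            result.append(y)
        else:
            raise ValueError(f"cannot broadcast {a} with {b}")
    return tuple(reversed(result))


print(broadcast_shape((3,), ()))      # (3,)
print(broadcast_shape((3,), (2, 1)))  # (2, 3)
```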

Why division returns float: a / b with integer entries produces F64 results. This avoids integer-truncation surprises. All other operations preserve dtype unless promotion is needed (e.g., F64 + I64 → F64).

# Comparisons (returns B1 mask)
print(a > 1)       # NDArray([0, 1, 1])  (B1 dtype)
print(a == 2)      # NDArray([0, 1, 0])

Comparisons always return a B1 (0/1) array, even when comparing floats or mixed types.

# Reductions
print(a.sum())     # 6
print(a.mean())    # 2.0

Why reductions support an axis parameter: When you call a.sum(axis=0) on a 2-D array, you collapse that dimension — useful for row-wise or column-wise aggregation. Without axis, reductions flatten the array. The axis parameter exists so you can control which dimension to eliminate.
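As a rough picture of what "collapsing a dimension" means, here is the same idea on a nested list of shape (2, 3), using only built-ins. It is an illustration of the semantics, not latpy's reduction code.

```python
rows = [[1, 2, 3], [4, 5, 6]]  # shape (2, 3)

# axis=0 collapses the row dimension: one sum per column, shape (3,)
sum_axis0 = [sum(col) for col in zip(*rows)]

# axis=1 collapses the column dimension: one sum per row, shape (2,)
sum_axis1 = [sum(row) for row in rows]

# no axis: everything is flattened into a single scalar
total = sum(sum(row) for row in rows)

print(sum_axis0)  # [5, 7, 9]
print(sum_axis1)  # [6, 15]
print(total)      # 21
```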

Edge cases:

  • Empty array: array([]).sum() returns 0 (the identity). array([]).mean() returns 0.0. array([]).min() raises ValueError — there is no minimum of nothing.

  • Integer overflow: latpy does not check for overflow on I64. A result outside the signed 64-bit range, such as array([2**62, 2**62]).sum() (true value 2**63), will silently wrap on CPython. Use F64 for large accumulations.

  • Division by zero: array([1, 0]) / array([0, 0]) produces F64 values of inf and nan — not an exception.
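To see what "silently wrap" means for a signed 64-bit integer, the sketch below reproduces two's-complement wrapping with ordinary Python arithmetic. wrap_i64 is a hypothetical helper written for this guide; it models the behaviour described in the overflow bullet rather than calling into latpy.

```python
def wrap_i64(value: int) -> int:
    """Reduce an arbitrary Python int to the signed 64-bit range."""
    return (value + 2**63) % 2**64 - 2**63


print(wrap_i64(2**62 + 2**62))  # -9223372036854775808
print(wrap_i64(2**63 - 1))      # 9223372036854775807
```

The first call shows a mathematically positive sum turning into the most negative representable value, which is why large accumulations are safer in F64.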


Linear Algebra

Linear algebra routines live in latmath.array.linalg. They operate on 1-D and 2-D NDArray objects and are pure-Python implementations (not LAPACK wrappers).

from latpy.latmath.array.linalg import sub, ssd, argmin, qr, eig, solve

a = array([1, 2, 3, 4])
b = array([4, 3, 2, 1])

print(sub(a, b))    # NDArray([-3, -1, 1, 3])
print(ssd(a))       # 5.0  (sum of squared diffs from mean)
print(argmin(a))    # NDArray([0])  (index of minimum)

sub computes element-wise subtraction (equivalent to a - b but explicit). ssd computes the sum of squared deviations from the mean — a building block for variance. argmin returns the index of the minimum as an array (so you can use it for fancy indexing).

# QR decomposition
A = array([[3.0, 2.0], [1.0, 4.0]], dtype="f64")
Q, R = qr(A)

QR decomposition factors A = Q @ R where Q is orthogonal and R is upper-triangular. It is used internally for least-squares and eigenvalue computation.

# Dominant eigenvalue
lam, v = eig(A)
print(lam)          # ~5.0 (the eigenvalues of A are 5 and 2)

eig computes the dominant eigenvalue (largest magnitude) and its corresponding eigenvector via power iteration; it does not return all eigenvalues. For the full spectrum, use np.linalg.eigvals from the NumPy compatibility layer, which is a latpy object that delegates to latpy's own linalg routines rather than to real NumPy.
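For readers who have not met power iteration before, here is a minimal stand-alone sketch on plain lists. It is not latpy's eig; the start vector, the fixed iteration count, and the use of a Rayleigh quotient for the final estimate are choices made for this illustration.

```python
import math


def power_iteration(matrix, iterations=100):
    """Estimate the dominant eigenvalue and eigenvector of a square matrix."""
    n = len(matrix)
    v = [1.0] + [0.0] * (n - 1)  # arbitrary non-zero start vector
    for _ in range(iterations):
        # Multiply by the matrix, then renormalise to unit length.
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v . (A v) with |v| = 1 gives the eigenvalue estimate.
    av = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    lam = sum(v[i] * av[i] for i in range(n))
    return lam, v


lam, v = power_iteration([[3.0, 2.0], [1.0, 4.0]])
print(round(lam, 6))  # 5.0
```

The matrix used here has eigenvalues 5 and 2, so the error shrinks by a factor of 2/5 per step. When two eigenvalues share the largest magnitude that ratio is 1 and the iteration has nothing to converge to, which is the situation flagged in the edge cases below.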

# Linear solve
x = solve(A, array([5.0, 6.0], dtype="f64"))
print(x.tolist())   # [0.8, 1.3]

solve solves Ax = b using Gaussian elimination. It requires A to be square and invertible.
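A compact sketch of Gaussian elimination with partial pivoting, written on plain lists, shows what such a solver does step by step. This is an illustration and not latpy's solve; in particular it raises ValueError for a singular matrix, whereas the guide states that latpy raises LinAlgError.

```python
def gauss_solve(A, b):
    """Solve A x = b for a square A by elimination and back-substitution."""
    n = len(A)
    # Augmented matrix [A | b], copied so the inputs are left untouched.
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular (or nearly so)")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


x = gauss_solve([[3.0, 2.0], [1.0, 4.0]], [5.0, 6.0])
print([round(v, 6) for v in x])  # [0.8, 1.3]
```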

Edge cases:

  • Non-square matrices for solve: Raises ValueError — only square systems are supported.

  • Singular matrix for solve: Raises LinAlgError — no solution exists (or infinite solutions).

  • QR on rectangular matrices: Supported, but Q and R may not be the “thin” form you expect from LAPACK.

  • Power iteration convergence: eig on a matrix with two eigenvalues of equal magnitude may not converge to a single result.


NumPy Compatibility Layer

The np singleton wraps a subset of the NumPy API so that you can write latpy code that looks like NumPy — useful for migration or for users familiar with the NumPy ecosystem.

from latpy.latmath.array.numpy_compat import np

a = np.array([1, 2, 3, 4, 5])
b = np.zeros((2, 3), dtype=np.float64)
c = np.eye(3)
d = np.arange(0, 10, 2)          # NDArray([0, 2, 4, 6, 8])
e = np.linspace(0, 1, 5)         # NDArray([0., 0.25, 0.5, 0.75, 1.])
f = np.concatenate([a, np.array([6, 7])])
g = np.dot(a, a)                 # 55 (dot product)

Why a compatibility layer? If you already know NumPy, np removes the need to learn a new API. It also makes it easy to swap real NumPy in later if performance demands it — just change the import. Note that np here is not the real NumPy; it’s a latpy object that returns NDArray.

# Random
np.random.seed(42)
r = np.random.randn(3, 3)

Seeding and random generation are forwarded to latpy’s own random module, not numpy.random.

# Linear algebra
M = np.array([[1, 2], [3, 4]])
print(np.linalg.det(M))          # -2.0
print(np.linalg.inv(M))
print(np.linalg.qr(M))

np.linalg provides det, inv, qr, eigvals, solve, and more. Each delegates to the corresponding latpy linalg module.

Edge cases:

  • np.float64 vs np.float32: Only np.float64 is defined; np.float32 raises AttributeError. Use F64 or "f64" for precision control.

  • np.random.rand vs np.random.randn: Both exist but return latpy arrays, not NumPy arrays.

  • Missing functions: np.fft, np.linalg.svd, np.unique are not provided. Check dir(np) for available names.


Statistics

The stats module provides descriptive statistics, histogramming, moment calculations, and probability density functions — no SciPy required.

from latpy.latmath.stats import describe, histogram, skew, kurtosis, norm_pdf

a = array([1, 2, 3, 4, 5])

# Descriptive summary
desc = describe(a)
print(desc["mean"])    # 3.0
print(desc["std"])     # 1.414...

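describe returns a dict with min, max, mean, median, std, q1, q3. The std shown above is the population standard deviation (denominator n): for [1, 2, 3, 4, 5] that is sqrt(2) ≈ 1.414, whereas the sample form (denominator n-1) would give about 1.581.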

# Histogram
counts, edges = histogram(a, bins=3, range_=(1, 5))
print(counts.tolist()) # [2, 1, 2]  (bins [1, 2.33), [2.33, 3.67), [3.67, 5])

histogram computes bin counts and bin edges. The range_ parameter constrains the domain (data outside is ignored). Note the trailing underscore to avoid shadowing Python’s range.
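The binning itself is easy to reproduce by hand. The sketch below assumes equal-width bins with a right-inclusive last bin (the NumPy convention); whether latpy treats the final edge the same way is an assumption here, and simple_histogram is a hypothetical helper.

```python
def simple_histogram(data, bins, lo, hi):
    """Count values into equal-width bins over [lo, hi]; the last bin includes hi."""
    if bins <= 0 or hi <= lo:
        raise ValueError("need bins > 0 and hi > lo")
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        if x < lo or x > hi:
            continue  # values outside the range are ignored
        idx = min(int((x - lo) / width), bins - 1)  # clamp x == hi into the last bin
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return counts, edges


counts, edges = simple_histogram([1, 2, 3, 4, 5], bins=3, lo=1, hi=5)
print(counts)  # [2, 1, 2]
```

With three bins over (1, 5) each bin is 4/3 wide, so 1 and 2 fall in the first bin, 3 in the second, and 4 and 5 in the third.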

# Moments
print(skew(a))         # 0.0
print(kurtosis(a))     # -1.3

Skewness measures asymmetry (0 = symmetric). Kurtosis here is excess kurtosis (Fisher definition, normal = 0). A uniform distribution’s excess kurtosis is -1.2, so -1.3 for [1,2,3,4,5] is expected.
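Both numbers follow directly from central moments. The sketch below computes them with population moments (dividing by n), which is what reproduces the values quoted above; the helper names are hypothetical and the code is independent of latpy.

```python
def central_moment(data, k):
    """k-th central moment with denominator n."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)


def skewness(data):
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def excess_kurtosis(data):
    # Fisher definition: subtract 3 so that a normal distribution scores 0.
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3.0


values = [1, 2, 3, 4, 5]
print(skewness(values))                   # 0.0
print(round(excess_kurtosis(values), 6))  # -1.3
```

For this data the second moment is 2 and the fourth is 6.8, so the kurtosis is 6.8 / 4 - 3 = -1.3. Note that this sketch raises ZeroDivisionError on zero-variance input instead of returning nan.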

# Normal PDF
print(norm_pdf(0.0))   # 0.3989...

norm_pdf(x) returns the standard Normal PDF at x: exp(-x²/2) / sqrt(2π).

Edge cases:

  • Single-element array: describe([5]) works, but std will be 0 (single value, zero variance).

  • Zero-variance data: skew([5, 5, 5]) returns nan — skewness is undefined when there is no spread.

  • Histogram with zero bins or invalid range: Raises ValueError.

  • Out-of-range norm_pdf: Handled gracefully — norm_pdf(1e10) returns 0.0 (underflow to zero is mathematically correct).


Random Numbers

The random module provides a self-contained pseudo-random number generator (Mersenne Twister, independently implemented). It does not depend on Python’s random or NumPy’s.

from latpy.latmath.random import seed, rand, randn, randint, uniform, choice, shuffle

seed(42)  # deterministic reproducibility

Why seed(42)? Setting a seed makes random output deterministic and reproducible. This is essential for tests, tutorials, and debugging. Any integer seed works; 42 is simply a convention.

# Continuous
print(randn(3).tolist())       # 3 standard normal samples
print(uniform(0, 10, size=4)) # 4 samples in [0, 10)
print(rand(2, 2).tolist())    # 2x2 uniform[0,1)

  • randn(n) samples from N(0, 1) using the Box-Muller transform.

  • uniform(low, high, size=n) samples uniformly in [low, high).

  • rand(m, n) is shorthand for uniform(0, 1, size=(m, n)).
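The Box-Muller transform mentioned above turns two uniform samples into two independent standard normal samples. The sketch below is a generic version of the transform; it draws its uniforms from Python's built-in random module purely for convenience, whereas latpy is described as using its own generator.

```python
import math
import random


def box_muller(rng):
    """Return two independent N(0, 1) samples from two uniform samples."""
    u1 = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so log(u1) is finite
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


rng = random.Random(42)  # seeded, so the run is reproducible
samples = [z for _ in range(5000) for z in box_muller(rng)]
mean = sum(samples) / len(samples)
var = sum((z - mean) ** 2 for z in samples) / len(samples)
print(round(mean, 2), round(var, 2))  # should be close to 0 and 1
```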

# Discrete
print(randint(0, 10, size=5)) # 5 integers 0..9

randint(low, high, size) samples uniformly from {low, low+1, ..., high-1} — the upper bound is exclusive, matching NumPy’s convention.

# Sampling
deck = array([1, 2, 3, 4, 5])
print(choice(deck, size=3))   # 3 draws with replacement
shuffle(deck)                 # in-place shuffle

choice draws with replacement by default (elements can repeat). shuffle permutes the array in place and returns None.

Edge cases:

  • Seed repeatability: Two seed(42) calls in the same session reset the generator to the same state — you’ll get the same sequence again.

  • Empty array in choice: Raises ValueError.

  • size=0 in randint or uniform: Returns an empty array of shape (0,).

  • shuffle on empty or 1-element array: No-op (no error).


Working with Labeled Data

latpy’s latdata module adds named axes (like pandas Index) on top of NDArray, giving you labeled tables.

from latpy.latdata import Axis, Table

# Named axis
rows = Axis("row", ["a", "b", "c"])
cols = Axis("col", ["x", "y"])

# Table from nested lists
t = Table.from_list(
    [[1, 2], [3, 4], [5, 6]],
    row_labels=["a", "b", "c"],
    col_labels=["x", "y"],
)

Table.from_list infers the shape from the data and creates named axes. The underlying data is an NDArray stored at t.data.

# Label-based indexing
print(t["a", "x"])     # 1
print(t["a":"c", :])   # Table (3 rows, 2 cols)

How label indexing works: The first index labels rows, the second labels columns. Slices use label strings (not positions) — "a":"c" selects rows from "a" up to and including "c", unlike Python slices which exclude the end. This matches pandas’ label-slice behavior.
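The inclusive end of a label slice is the main difference from positional slicing. A minimal sketch of the lookup, using a plain list of labels and a hypothetical label_slice helper (not the Table implementation), might look like this:

```python
def label_slice(labels, start, stop):
    """Return the positions from label `start` through label `stop`, inclusive."""
    if start not in labels or stop not in labels:
        raise KeyError(f"label not found: {start!r} or {stop!r}")
    first = labels.index(start)
    last = labels.index(stop)
    return list(range(first, last + 1))  # +1 makes the end label inclusive


row_labels = ["a", "b", "c"]
print(label_slice(row_labels, "a", "c"))  # [0, 1, 2]
print(label_slice(row_labels, "a", "b"))  # [0, 1]
```

Because list.index returns the first match, this sketch also shows why duplicate labels make label slices ambiguous.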

Edge cases:

  • Key not found: t["z", :] raises KeyError.

  • Duplicate labels: Not prohibited. Label-based slicing on duplicate labels may skip intervening rows.

  • Slice with non-existent label: t["a":"z", :] raises KeyError — the end label must exist.

  • Empty table: Table.from_list([[]]) is allowed; shape will reflect the empty dimension.


GroupBy

GroupBy partitions a table’s rows (or columns) by matching label values — like SQL’s GROUP BY or pandas’ groupby.

from latpy.latdata import GroupBy

t = Table.from_list(
    [[1, 2], [3, 4], [5, 6]],
    row_labels=["a", "b", "a"],
    col_labels=["x", "y"],
)

# Group rows by label
gb = GroupBy(t, "row")
print(gb.sum().data.tolist())   # [[6, 8], [3, 4]]
print(gb.mean().data.tolist())  # [[3.0, 4.0], [3.0, 4.0]]
print(gb.count().data.tolist()) # [[2], [1]]

Rows "a" (indices 0 and 2) are aggregated together; row "b" (index 1) forms its own group. sum, mean, and count each reduce the grouped axis. The count result has shape (2, 1) because count returns a single value per group (not per column).
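The grouping logic can be mimicked with a dictionary, which in Python preserves insertion order and therefore yields groups in order of first appearance. This is a sketch of the idea with a hypothetical group_sum helper, not the GroupBy class itself.

```python
def group_sum(rows, labels):
    """Sum rows that share a label; groups come out in first-appearance order."""
    groups = {}
    for label, row in zip(labels, rows):
        if label not in groups:
            groups[label] = [0] * len(row)
        groups[label] = [acc + v for acc, v in zip(groups[label], row)]
    return list(groups.keys()), list(groups.values())


keys, sums = group_sum([[1, 2], [3, 4], [5, 6]], ["a", "b", "a"])
print(keys)  # ['a', 'b']
print(sums)  # [[6, 8], [3, 4]]
```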

Edge cases:

  • No matching groups: A label value with no rows is simply absent from the result.

  • Unordered labels: Groups are returned in order of first appearance, not sorted.

  • Single group: If all labels are identical, gb.sum() returns a single-row table.


I/O

latpy reads and writes arrays and tables in CSV and JSON formats. CSV is portable (works with Excel, pandas); JSON preserves dtype, shape, and axis metadata.

from latpy.io import save_csv, load_csv, save_json, load_json

a = array([[1, 2, 3], [4, 5, 6]])

# CSV with automatic header
save_csv("data.csv", a)
b = load_csv("data.csv")

CSV output includes a header row by default. Data is comma-separated with each row on its own line. On load, latpy infers dtype from the CSV content. If you use integer CSV data, it loads as I64; if a column contains decimal points, it becomes F64.

# JSON with full metadata (dtype, shape, axes)
from latpy.latdata import Axis, Table
save_json("data.json", a)
c = load_json("data.json")

JSON output stores the array as a flat list alongside dtype, shape, and optionally axis labels. This means save_json / load_json round-trip perfectly — even for labeled Table objects.
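To illustrate why storing a flat list together with dtype and shape is enough to round-trip a 2-D array, here is a sketch using the standard json module. The field names ("dtype", "shape", "data") and the encode/decode helpers are invented for this example; the actual layout of latpy's JSON files may differ.

```python
import json


def encode(nested, dtype):
    """Serialise a 2-D nested list as a flat list plus metadata (hypothetical schema)."""
    rows, cols = len(nested), len(nested[0])
    flat = [v for row in nested for v in row]
    return json.dumps({"dtype": dtype, "shape": [rows, cols], "data": flat})


def decode(text):
    """Rebuild the nested list from the flat data and the stored shape."""
    obj = json.loads(text)
    rows, cols = obj["shape"]
    data = obj["data"]
    nested = [data[r * cols:(r + 1) * cols] for r in range(rows)]
    return nested, obj["dtype"]


original = [[1, 2, 3], [4, 5, 6]]
restored, dtype = decode(encode(original, "i64"))
print(restored == original, dtype)  # True i64
```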

Edge cases:

  • Empty array to CSV: Writes a file containing only the header row. On load, you get an array of shape (0,).

  • Missing file: load_csv / load_json raise FileNotFoundError.

  • Corrupt JSON: Raises json.JSONDecodeError.

  • Non-ASCII data: CSV is written with UTF-8 encoding; JSON uses ASCII-safe escaped Unicode by default.


Machine Learning

The ml module provides simple, pure-Python implementations of common ML algorithms — k-means, linear regression, and classification metrics. These are not production-grade (no GPU, no regularization paths), but are suitable for learning, prototyping, and small datasets.

from latpy.ml import kmeans, LinearRegression, accuracy, f1_score, confusion_matrix

# K-Means clustering
X = array([[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [8.0, 8.0], [1.0, 0.6], [9.0, 11.0]])
centroids, labels, inertia = kmeans(X, k=2)
print(labels.tolist())   # cluster assignments

kmeans uses random initialisation (not k-means++), so results vary between runs unless you seed() first. centroids is the final cluster centers (shape (k, n_features)), labels is the assignment per point, and inertia is the sum of squared distances to the nearest centroid.

Seed dependence: kmeans is particularly sensitive to the random seed. Calling seed(42) before kmeans guarantees reproducible cluster assignments.
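The effect of the seed is easiest to see in a small version of the algorithm. The sketch below is a generic Lloyd's-algorithm k-means with random initialisation on plain lists; it is not latpy's kmeans, and the seed parameter and the rule for empty clusters are choices made for this illustration.

```python
import random


def lloyd_kmeans(points, k, seed=None, max_iter=100):
    """Generic k-means: random initial centroids, then assign/update until stable."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)

    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break
        labels = new_labels
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, inertia


X = [[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [8.0, 8.0], [1.0, 0.6], [9.0, 11.0]]
run_a = lloyd_kmeans(X, k=2, seed=42)
run_b = lloyd_kmeans(X, k=2, seed=42)
print(run_a[1] == run_b[1])  # True: same seed, same assignments
```

Different seeds select different starting centroids, so the label numbering and sometimes the clustering itself can change between runs.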

# Linear regression
X = array([[1.0], [2.0], [3.0], [4.0]])
y = array([2.0, 4.0, 6.0, 8.0])
lr = LinearRegression()
lr.fit(X, y)
print(lr.predict(array([[5.0]])))  # ~10.0
print(lr.score(X, y))              # R² = 1.0

LinearRegression fits an ordinary least-squares model. The score method returns R² (coefficient of determination). Perfect fit gives 1.0.
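For a single feature, ordinary least squares has a closed form, which makes the example above easy to check by hand. The sketch below is independent of latpy's LinearRegression class; fit_line and r_squared are hypothetical helper names.

```python
def fit_line(xs, ys):
    """Closed-form OLS for one feature: returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


xs, ys = [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)                     # 2.0 0.0
print(r_squared(xs, ys, slope, intercept))  # 1.0
```

With slope 2 and intercept 0 the prediction at x = 5 is 10, matching the output shown above.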

# Classification metrics
y_true = array([1, 0, 1, 1, 0])
y_pred = array([1, 0, 0, 1, 0])
print(accuracy(y_true, y_pred))    # 0.8
print(f1_score(y_true, y_pred))    # 0.8
print(confusion_matrix(y_true, y_pred).tolist())

Metrics compare predicted labels against ground truth. confusion_matrix returns a 2-D array where row i, column j counts the number of times true class i was predicted as class j.
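The numbers in the example can be verified from the confusion matrix alone. The following sketch recomputes them with plain lists; it assumes binary labels 0 and 1 with 1 as the positive class, and the helper names are hypothetical rather than latpy functions.

```python
def confusion(y_true, y_pred, n_classes=2):
    """matrix[i][j] counts samples of true class i predicted as class j."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def binary_f1(matrix):
    """F1 for the positive class (label 1): 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = matrix[1][1], matrix[0][1], matrix[1][0]
    return 2 * tp / (2 * tp + fp + fn)


y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]
cm = confusion(y_true, y_pred)
accuracy = sum(cm[i][i] for i in range(2)) / len(y_true)
print(cm)             # [[2, 0], [1, 2]]
print(accuracy)       # 0.8
print(binary_f1(cm))  # 0.8
```

Here TP = 2, FP = 0, and FN = 1, so precision is 1, recall is 2/3, and F1 is 0.8.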

Edge cases:

  • k=1 for kmeans: Returns a single centroid at the data mean; inertia is then the sum of squared distances from every point to that mean (the total sum of squares, not the variance).

  • k > n_samples: Raises ValueError — cannot have more clusters than data points.

  • LinearRegression with singular X: Raises LinAlgError if the normal equations matrix is not invertible.

  • All-zero predictions in f1_score: Raises ZeroDivisionError — F1 is undefined when both precision and recall are zero.

  • Mismatched y_true / y_pred lengths: Raises ValueError.


SOV Models (State-Observation-Vector)

SOV is a lightweight state-space / linear dynamical system framework built on latpy arrays. It models hidden states that evolve linearly and emit observations.

from latpy.ml.sov import SOVRegression, SOVClassifier, SOVDynamics

# SOV Regression
X = array([[1.0], [2.0], [3.0]])
y = array([2.0, 4.0, 6.0])
sov = SOVRegression(n_states=2)
sov.fit(X, y)
print(sov.score(X, y))

SOVRegression maps observations X to outputs y through a latent state of dimension n_states. The internal state captures temporal or latent structure that a direct regression might miss.

# SOV dynamics simulation
dyn = SOVDynamics(n_states=2, n_obs=3)
dyn.fit_random(seed=42)
states, obs = dyn.simulate(n_steps=10)
print(states.shape)  # (11, 2)
print(dyn.equilibrium().tolist())  # stable equilibrium state

simulate runs the dynamics forward for n_steps timesteps, returning both hidden states (shape (n_steps+1, n_states)) and observations (shape (n_steps, n_obs)). The extra +1 on states is the initial state. equilibrium() computes the fixed point S = A @ S (if stable).

Edge cases:

  • n_states larger than data rank: The fit may underdetermine the state.

  • Unstable dynamics: equilibrium() may diverge if the transition matrix has eigenvalues > 1 in magnitude.


Visualization

latpy’s viz module renders plots as SVG — a vector format viewable in any browser. No GUI, no matplotlib, no JavaScript.

from latpy.viz import plot, scatter, bar, hist, Figure

# Line plot (auto-scales, returns SVG)
fig, line_el = plot([1, 2, 3, 4, 5], [2, 4, 1, 8, 6])
fig.save("line_plot.svg")

plot(x, y) automatically determines axis ranges to fit all data points. It returns a Figure object and the SVG line element. The figure is not displayed automatically — you must call fig.save(filename) to write the SVG file.

# Scatter plot
fig, dots = scatter([1, 2, 3, 4], [2, 5, 3, 7], r=5)
fig.save("scatter.svg")

The r=5 argument controls the radius of each plotted circle.

# Bar chart
fig, bars = bar(["A", "B", "C", "D"], [3, 7, 2, 5])
fig.save("bar.svg")

Labels are strings; numeric axes are auto-scaled.

# Histogram
fig, bins_ = hist([1, 1, 2, 3, 3, 3, 4, 5], bins=4)
fig.save("hist.svg")

hist(data, bins=n) bins the data, computes frequencies, and draws rectangles.

# Graph visualization
from latpy.viz import draw_graph
svg = draw_graph(["A", "B", "C", "D"],
                 [("A", "B"), ("B", "C"), ("C", "D"), ("A", "D")],
                 width=400, height=300)
with open("graph.svg", "w") as f:
    f.write(svg)

draw_graph returns a raw SVG string (not a Figure) that you write to a file yourself. It places nodes in a simple layout and draws edges with lines.

Why SVG? SVG is pure text (XML), not a binary format. You can view it in any browser, embed it in web pages, and include it in Jupyter Notebooks (the browser renders it inline). The downside is that SVG files can be larger than PNG for the same data.
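Because SVG is plain XML, a basic line plot needs nothing more than string formatting. The sketch below builds a bare polyline by hand and checks that the result parses as XML; it is not how latpy's viz module is implemented, and polyline_svg with its width, height, and pad parameters is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def polyline_svg(xs, ys, width=200, height=100, pad=10):
    """Scale data into the drawing area and emit a minimal SVG polyline."""
    x_min, y_min = min(xs), min(ys)
    x_span = (max(xs) - x_min) or 1.0  # avoid dividing by zero for flat data
    y_span = (max(ys) - y_min) or 1.0
    points = []
    for x, y in zip(xs, ys):
        px = pad + (x - x_min) / x_span * (width - 2 * pad)
        # SVG's y axis grows downward, so flip the vertical coordinate.
        py = height - pad - (y - y_min) / y_span * (height - 2 * pad)
        points.append(f"{px:.1f},{py:.1f}")
    joined = " ".join(points)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        f'<polyline fill="none" stroke="black" points="{joined}"/>'
        "</svg>"
    )


svg = polyline_svg([1, 2, 3, 4, 5], [2, 4, 1, 8, 6])
root = ET.fromstring(svg)  # raises ParseError if the markup is malformed
print(root.tag.endswith("svg"), len(root))  # True 1
```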

Edge cases:

  • Empty data for plot: Raises ValueError — at least two points are needed.

  • Single bar for bar: Works fine; a lone bar is drawn.

  • All identical values for plot: The y-range defaults to [value-1, value+1] to avoid a zero-height plot.

  • SVG showing as text in browser: This happens if you open the .svg file in a text editor or serve it with the wrong MIME type. Save with a .svg extension and open in a browser, or configure your server to serve .svg files as image/svg+xml.


Performance Notes

latpy is a pure-Python library with no C, Cython, or Fortran extensions. This means:

  • Slower than NumPy: For most operations (addition, multiplication, reductions), latpy is 10–50× slower than NumPy, because the tight loops run in CPython rather than compiled C. This is acceptable for teaching, prototyping, and datasets under ~10⁵ elements.

  • Comparable to native Python lists: For small arrays (under ~1,000 elements), latpy overhead is small — on par with manual list comprehensions.

  • No parallelism: latpy does not use threads, multiprocessing, or SIMD. All operations are single-threaded Python.

Big-O complexity of common operations:

Operation                      Complexity                        Notes
sum(), mean(), min(), max()    O(n)                              Single pass over data
a + b, a * b (element-wise)    O(n)                              Scalar operations are O(n)
dot(a, b)                      O(n) for 1-D, O(mnk) for matmul
solve(A, b)                    O(n³)                             Gaussian elimination
eig(A)                         O(k·n²)                           Power iteration, k iterations
qr(A)                          O(m·n²)                           Gram-Schmidt
sort()                         O(n log n)                        Uses Python’s TimSort
kmeans()                       O(k·n·d·iter) per run             k clusters, n points, d dimensions
histogram(data, bins=b)        O(n + b)                          Count in bins

Migration path: If you outgrow latpy’s performance, switch to NumPy by:

  1. Using np = latpy.latmath.array.numpy_compat.np — the API is similar.

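  2. Converting latpy arrays to real NumPy arrays, for example with numpy.array(ndarray_obj.tolist()) (requires real NumPy installed; note that np in step 1 is the latpy compatibility object, not NumPy).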

  3. Replacing from latpy.ml import ... with sklearn equivalents.


Troubleshooting

“Why is my array I64 when I wanted floats?”

latpy picks the dtype that fits all input values: a list of plain integers becomes I64, and a single float value promotes the whole array to F64. For example:

a = array([1, 2, 3])      # all ints → I64
b = array([1, 2, 3.0])    # has a float → F64

If you explicitly want floats, pass dtype="f64" or include a decimal value. See the “Type promotion rules” in the Data Types section.

“Why did my indexing return a copy instead of a view?”

Only slices (and None) return views. Everything else returns a copy:

  • a[1:4] → view (contiguous, shared memory)

  • a[[0, 2, 4]] → copy (fancy indexing, non-contiguous)

  • a[a > 25] → copy (boolean mask, non-contiguous)

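An identity check will not tell you which one you have: arr[1:4] is arr is False even for views, because a view is a separate Python object that merely shares the underlying data. What distinguishes a view is that modifications to the slice affect the original. If you need a guaranteed copy, call .copy().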

“Why did kmeans give different results each time?”

kmeans initialises centroids randomly. Without a fixed seed, each run picks different starting points, which can converge to different local minima. For reproducible results:

from latpy.latmath.random import seed
seed(42)          # or any integer
centroids, labels, inertia = kmeans(X, k=2)

For a deterministic run, also consider that the data order affects tie-breaking in label assignment.

“Why is my SVG showing as text?”

You’re likely viewing the .svg file in a text editor or terminal. SVG is plain XML, so it looks like <?xml ...<svg ...>...</svg>. To see the rendered graphic:

  • Save to a .svg file, then open that file in a web browser (Chrome, Firefox, Edge).

  • Serve over HTTP with the correct MIME type: your web server must serve .svg files as image/svg+xml. If you see XML in the browser, the MIME type is wrong.

  • In Jupyter Notebook, calling fig._repr_svg_() (if available) or using IPython.display.SVG(fig.svg) will render inline.

“ModuleNotFoundError: No module named 'latpy'”

Python cannot find the latpy package. Solutions:

  1. Did you install? Run pip install -e . from the latpy/ directory (the one containing pyproject.toml or setup.py).

  2. Check PYTHONPATH: The src/ directory inside the repository must be discoverable. Either install with pip, or set:

    • Windows: set PYTHONPATH=C:\path\to\latpy\src;%PYTHONPATH%

    • Linux/macOS: export PYTHONPATH=/path/to/latpy/src:$PYTHONPATH

  3. Virtual environment: If you’re using a virtual environment, activate it before running pip install. A library installed globally won’t be visible inside a virtual environment (and vice versa).

  4. Spelling: The package name is latpy (all lowercase, no hyphen). import latpy works; import latPy does not.
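When it is unclear which of the causes above applies, a short diagnostic using only the standard library can help. The sketch below reports which interpreter is running and whether a package can be located on the current search path; diagnose is a hypothetical helper written for this guide.

```python
import importlib.util
import sys


def diagnose(package: str) -> dict:
    """Report where (or whether) Python can find a top-level package."""
    spec = importlib.util.find_spec(package)  # None if the package is not found
    return {
        "interpreter": sys.executable,
        "found": spec is not None,
        "location": spec.origin if spec is not None else None,
        "search_path": list(sys.path),
    }


report = diagnose("latpy")
print(report["interpreter"])                 # which Python is actually running
print(report["found"], report["location"])   # False None means a path problem
```

If the interpreter path is not the one inside your virtual environment, activate the environment first; if found is False, check that the src/ directory appears in search_path.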


Next Steps

  • Browse the API documentation for detailed reference

  • Run tests: python -m pytest tests/

  • Read the CHANGELOG for version history