Getting Started with latpy
latpy is a pure-Python array and data library built from the ground up for clarity, portability, and zero compiled dependencies. It provides a NumPy-like experience — arrays, broadcasting, linear algebra, stats, ML, labeled tables, and SVG visualisation — entirely in Python, with no C extensions or Fortran libraries.
This guide walks you through every feature with practical examples, explains why things work the way they do, and points out edge cases you’ll encounter in real use.
Installation
latpy is installed from its Git repository. The -e (editable) flag means changes to the source take effect immediately — useful if you’re developing or debugging.
git clone https://gitlab.com/tyarc-lab/latpy.git
cd latpy
pip install -e .
Alternatively, you can add the source directory to your PYTHONPATH. This works when you want to point an existing environment at a specific checkout without re-running pip.
set PYTHONPATH=C:\path\to\latpy\src;%PYTHONPATH%
Edge case — fresh environment: If you see ModuleNotFoundError: No module named 'latpy', double-check that (a) the src/ directory is on PYTHONPATH, or (b) pip install -e . completed without error. latpy has no required runtime dependencies, so a missing module is almost always a path issue.
Your First Array
array() and zeros() are your primary constructors. They live in the latmath.array namespace and return NDArray objects — the core data structure.
from latpy.latmath.array import array, zeros
# From a Python list
a = array([1, 2, 3, 4, 5])
print(a) # NDArray([1, 2, 3, 4, 5])
print(a.shape) # (5,)
print(a.dtype) # DType(name='i64', code='q', size=8)
Why I64? When you pass integers, latpy chooses I64 (signed 64-bit) — the widest platform-safe integer type. This avoids overflow on most common operations and mimics NumPy’s default int64 on 64-bit platforms.
# 2D array
b = array([[1, 2], [3, 4], [5, 6]])
print(b.shape) # (3, 2)
Shapes are always tuples: (N,) for 1-D, (M, N) for 2-D, etc. A scalar value extracted via a[0] is a plain Python int or float, not a 0-D array.
# Zero-filled
c = zeros((2, 3))
print(c.tolist()) # [[0, 0, 0], [0, 0, 0]]
zeros() returns an I64 array by default. Use dtype="f64" to get floats.
# Float array
d = array([1.5, 2.5], dtype="f64")
Edge cases:
- Empty list: array([]) creates an array of shape (0,). Most reductions (.sum(), .mean()) return 0 or 0.0 on empty arrays; min()/max() raise ValueError.
- Mixed int/float: array([1, 2.0]) promotes to F64 (float has higher priority). See the type promotion rules in “Data Types”.
- Ragged nesting: array([[1, 2], [3]]) raises ValueError because all sub-lists must have the same length.
Data Types
Three built-in dtypes cover the vast majority of use cases. There are no unsigned or half-precision types.
from latpy.latmath.array.dtypes import I64, F64, B1, parse_dtype
# Three built-in dtypes:
# I64 — signed 64-bit integer
# F64 — double-precision float
# B1 — boolean (0/1)
Why only three? latpy is designed for teaching, prototyping, and data analysis — not system programming. Restricting to I64, F64, and B1 eliminates the “which int size?” confusion that beginners face in NumPy, while covering every operation in this guide.
# Parse from string name:
dt = parse_dtype("f64") # F64
dt = parse_dtype("b1") # B1
dt = parse_dtype(None) # I64 (default)
parse_dtype(None) returns I64, which is the fallback when no dtype is specified.
# DType properties:
print(F64.name) # "f64"
print(F64.kind) # "f" — "i" for int, "f" for float, "b" for bool
print(F64.size) # 8 (bytes)
Type promotion rules: When two dtypes meet in an operation:
- F64 wins over everything (float + int → float, float + bool → float).
- I64 wins over B1 (int + bool → int, treating True as 1 and False as 0).
- B1 with B1 stays B1.
This mirrors NumPy’s “safe” promotion rules but with a much smaller type set.
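The promotion rules boil down to "the higher-ranked dtype wins". Here is a minimal sketch in plain Python, independent of latpy; the names promote and _RANK are hypothetical and only illustrate the ordering, not how latpy implements it.

```python
# Illustrative only: rank the three dtype names and pick the higher one.
# `promote` and `_RANK` are hypothetical names, not part of latpy.
_RANK = {"b1": 0, "i64": 1, "f64": 2}

def promote(left: str, right: str) -> str:
    """Return the dtype name that wins when `left` meets `right`."""
    return left if _RANK[left] >= _RANK[right] else right

print(promote("f64", "i64"))  # f64
print(promote("i64", "b1"))   # i64
print(promote("b1", "b1"))    # b1
```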
Indexing
Indexing supports scalars, slices, None (newaxis), boolean masks, and integer arrays (fancy indexing). Understanding the copy vs. view distinction is critical.
a = array([10, 20, 30, 40, 50])
# Scalar
print(a[0]) # 10
Scalar indexing returns a plain Python int or float.
# Slice (returns view)
print(a[1:4]) # NDArray([20, 30, 40])
Why views? Slices return a view (not a copy) — they share memory with the original. This makes slicing cheap (O(1)) and avoids data duplication. Changes to the slice affect the original array, and vice versa. If you need an independent copy, call .copy() explicitly.
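If the view/copy distinction is new to you, the standard library offers a close analogy. The sketch below uses array.array and memoryview from the standard library (not latpy) to show what "shares memory" means in practice; latpy's own internals may differ.

```python
from array import array as typed_array  # stdlib typed array, not latpy

buf = typed_array("q", [10, 20, 30, 40, 50])  # "q" = signed 64-bit

view = memoryview(buf)[1:4]   # shares memory with buf
view[0] = 99                  # write through the view
print(buf.tolist())           # [10, 99, 30, 40, 50]

copied = buf[1:4]             # slicing a typed array makes a copy
copied[0] = -1                # does not touch buf
print(buf.tolist())           # [10, 99, 30, 40, 50]
```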
# Newaxis (inserts dimension of size 1)
print(a[None].shape) # (1, 5)
print(a[:, None].shape) # (5, 1)
None (or np.newaxis) inserts a dimension of size 1 at that position. This is primarily used for broadcasting — for example, a[:, None] - a[None, :] builds a pairwise-difference matrix.
# Boolean mask
mask = a > 25
print(mask) # NDArray([0, 0, 1, 1, 1])
print(a[mask]) # NDArray([30, 40, 50])
Boolean indexing always copies. Because the selected elements may not occupy a contiguous memory region, latpy returns a fresh array.
# Fancy indexing
idx = array([0, 2, 4])
print(a[idx]) # NDArray([10, 30, 50])
Fancy indexing (using integer arrays) also always copies. This matches NumPy’s contract: when you index with an array of positions, the result is guaranteed to be contiguous and independent of the original.
# 2D fancy indexing (row selection + paired indices)
A = array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])
print(A[array([0, 2])]) # NDArray([[10, 20, 30], [70, 80, 90]])
print(A[array([0, 2]), array([1, 2])]) # NDArray([20, 90])
print(A[array([0, 1]), 1]) # NDArray([20, 50])
When you pass two index arrays (A[[i0, i1], [j0, j1]]), they are paired element-wise: you get A[i0, j0], A[i1, j1]. Mixing a 1-D array with a scalar broadcasts the scalar.
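The pairing rule is easiest to see with plain nested lists. This sketch reproduces the two results above with zip; it illustrates the semantics only and does not use latpy.

```python
A = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]  # plain nested list

# Two index lists are paired element-wise: (0, 1) and (2, 2).
rows, cols = [0, 2], [1, 2]
paired = [A[i][j] for i, j in zip(rows, cols)]
print(paired)  # [20, 90]

# A scalar column index behaves as if repeated for every row index.
col = 1
with_scalar = [A[i][col] for i in [0, 1]]
print(with_scalar)  # [20, 50]
```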
Edge cases:
- Out-of-bounds scalar: a[100] raises IndexError. latpy validates bounds eagerly.
- Out-of-bounds slice: a[3:100] does not raise; it silently returns whatever elements overlap (like Python’s own list slicing).
- Boolean mask size mismatch: a[np.array([True, False])] on a length-5 array raises IndexError because the mask must match the axis size.
- Empty slice: a[2:2] returns an empty array of shape (0,).
Operations
Arithmetic, comparison, and reduction operations follow NumPy broadcasting semantics and type promotion rules.
a = array([1, 2, 3])
b = array([4, 5, 6])
# Arithmetic
print(a + b) # NDArray([5, 7, 9])
print(a - b) # NDArray([-3, -3, -3])
print(a * b) # NDArray([4, 10, 18])
print(a / b) # NDArray([0.25, 0.4, 0.5])
print(a ** 2) # NDArray([1, 4, 9])
Why NumPy broadcasting? Broadcasting allows arrays of different shapes to be combined without explicit looping or replication. latpy follows the same rules:
- Right-align shapes: (3,) vs () becomes (3,) vs (1,).
- Dimensions of size 1 stretch to match the other.
- Mismatched sizes raise ValueError.

So a ** 2 works because 2 (shape ()) broadcasts to (3,). Similarly, a + array([[1], [2]]) combines shapes (3,) and (2, 1), which broadcast to (2, 3).
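To check a broadcast by hand, you can compute the result shape directly. The helper below is a sketch of the NumPy-style rules in plain Python; broadcast_shape is a hypothetical name, not a latpy function.

```python
from itertools import zip_longest

def broadcast_shape(a: tuple, b: tuple) -> tuple:
    """Compute the broadcast result shape of two shapes (NumPy-style rules)."""
    out = []
    # Walk both shapes from the right; missing dimensions count as 1.
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x == y or y == 1:
            out.append(x)
        elif x == 1:
            out.append(y)
        else:
            raise ValueError(f"cannot broadcast {a} with {b}")
    return tuple(reversed(out))

print(broadcast_shape((3,), ()))      # (3,)
print(broadcast_shape((3,), (2, 1)))  # (2, 3)
```

Shapes such as (3,) and (4,) have no size-1 dimension to stretch, so the helper raises ValueError for them.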
Why division returns float: a / b with integer entries produces F64 results. This avoids integer-truncation surprises. All other operations preserve dtype unless promotion is needed (e.g., F64 + I64 → F64).
# Comparisons (returns B1 mask)
print(a > 1) # NDArray([0, 1, 1]) (B1 dtype)
print(a == 2) # NDArray([0, 1, 0])
Comparisons always return a B1 (0/1) array, even when comparing floats or mixed types.
# Reductions
print(a.sum()) # 6
print(a.mean()) # 2.0
Why reductions support an axis parameter: When you call a.sum(axis=0) on a 2-D array, you collapse that dimension — useful for row-wise or column-wise aggregation. Without axis, reductions flatten the array. The axis parameter exists so you can control which dimension to eliminate.
Edge cases:
- Empty array: array([]).sum() returns 0 (the identity). array([]).mean() returns 0.0. array([]).min() raises ValueError because there is no minimum of nothing.
- Integer overflow: latpy does not check for overflow on I64. A result that exceeds 2**63 - 1, such as array([2**62, 2**62]).sum(), will silently wrap on CPython. Use F64 for large accumulations.
- Division by zero: array([1, 0]) / array([0, 0]) produces the F64 values inf and nan, not an exception.
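To see what "wrapping" means for a signed 64-bit integer, the sketch below applies two's-complement wrap-around to ordinary Python ints. wrap_i64 is a hypothetical helper for illustration; it is not how latpy stores values.

```python
def wrap_i64(value: int) -> int:
    """Reduce a Python int to the signed 64-bit range by two's-complement wrap."""
    return (value + 2**63) % 2**64 - 2**63

print(wrap_i64(2**62))          # 4611686018427387904 (fits, unchanged)
print(wrap_i64(2**62 + 2**62))  # -9223372036854775808 (one past the maximum, wrapped)
```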
Linear Algebra
Linear algebra routines live in latmath.array.linalg. They operate on 1-D and 2-D NDArray objects and are pure-Python implementations (not LAPACK wrappers).
from latpy.latmath.array.linalg import sub, ssd, argmin, qr, eig, solve
a = array([1, 2, 3, 4])
b = array([4, 3, 2, 1])
print(sub(a, b)) # NDArray([-3, -1, 1, 3])
print(ssd(a)) # 5.0 (sum of squared diffs from mean)
print(argmin(a)) # NDArray([0]) (index of minimum)
sub computes element-wise subtraction (equivalent to a - b but explicit). ssd computes the sum of squared deviations from the mean — a building block for variance. argmin returns the index of the minimum as an array (so you can use it for fancy indexing).
# QR decomposition
A = array([[3.0, 2.0], [1.0, 4.0]], dtype="f64")
Q, R = qr(A)
QR decomposition factors A = Q @ R where Q is orthogonal and R is upper-triangular. It is used internally for least-squares and eigenvalue computation.
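For intuition, here is a short Gram-Schmidt sketch on plain nested lists. It is an illustration under the assumption that the columns are linearly independent; qr_gram_schmidt is a hypothetical name, and latpy's qr may be organised differently.

```python
import math

def qr_gram_schmidt(A):
    """QR of a square or tall matrix (list of rows) with independent columns."""
    m, n = len(A), len(A[0])
    cols = [[A[i][j] for i in range(m)] for j in range(n)]  # column vectors
    Q_cols, R = [], [[0.0] * n for _ in range(n)]
    for j, v in enumerate(cols):
        w = list(v)
        for i, q in enumerate(Q_cols):
            R[i][j] = sum(qk * wk for qk, wk in zip(q, w))  # projection coefficient
            w = [wk - R[i][j] * qk for qk, wk in zip(q, w)]  # remove that component
        R[j][j] = math.sqrt(sum(wk * wk for wk in w))
        Q_cols.append([wk / R[j][j] for wk in w])
    Q = [[Q_cols[j][i] for j in range(n)] for i in range(m)]
    return Q, R

A = [[3.0, 2.0], [1.0, 4.0]]
Q, R = qr_gram_schmidt(A)
# Multiply back to confirm that Q @ R reproduces A (up to rounding).
QR = [[sum(Q[i][k] * R[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
print(QR)
```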
# Dominant eigenvalue
lam, v = eig(A)
print(lam) # ~5.0 (A has eigenvalues 5 and 2)
eig computes the dominant eigenvalue (largest magnitude) and its corresponding eigenvector via power iteration; it does not return all eigenvalues. For the full spectrum, use np.linalg.eigvals from the NumPy compat layer, which delegates to latpy’s own linalg module rather than to real NumPy.
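Power iteration itself fits in a few lines. The sketch below runs it on the same matrix using plain lists; power_iteration is a hypothetical name and the fixed iteration count is an assumption, so treat it as a picture of the idea rather than latpy's code.

```python
import math

def power_iteration(A, iters=200):
    """Estimate the dominant eigenvalue and eigenvector of a square matrix."""
    n = len(A)
    v = [1.0] + [0.0] * (n - 1)  # arbitrary non-zero start vector
    lam = 0.0
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]  # A @ v
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]  # renormalise to unit length
        # Rayleigh quotient v^T A v gives the eigenvalue estimate.
        Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(vi * avi for vi, avi in zip(v, Av))
    return lam, v

lam, v = power_iteration([[3.0, 2.0], [1.0, 4.0]])
print(round(lam, 6))  # 5.0
```

The error shrinks by roughly the ratio of the second-largest to the largest eigenvalue magnitude per step (here 2/5), which is why equal magnitudes prevent convergence.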
# Linear solve
x = solve(A, array([5.0, 6.0], dtype="f64"))
print(x.tolist()) # [0.8, 1.3]
solve solves Ax = b using Gaussian elimination. It requires A to be square and invertible.
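Here is what Gaussian elimination looks like on plain lists. This sketch adds partial pivoting, which is an assumption on my part (the guide does not say whether latpy pivots), and gauss_solve is a hypothetical name.

```python
def gauss_solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

x = gauss_solve([[3.0, 2.0], [1.0, 4.0]], [5.0, 6.0])
print([round(v, 6) for v in x])  # [0.8, 1.3]
```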
Edge cases:
- Non-square matrices for solve: Raises ValueError; only square systems are supported.
- Singular matrix for solve: Raises LinAlgError because no unique solution exists (there are either none or infinitely many).
- QR on rectangular matrices: Supported, but Q and R may not be the “thin” form you expect from LAPACK.
- Power iteration convergence: eig on a matrix with two eigenvalues of equal magnitude may not converge to a single result.
NumPy Compatibility Layer
The np singleton wraps a subset of the NumPy API so that you can write latpy code that looks like NumPy — useful for migration or for users familiar with the NumPy ecosystem.
from latpy.latmath.array.numpy_compat import np
a = np.array([1, 2, 3, 4, 5])
b = np.zeros((2, 3), dtype=np.float64)
c = np.eye(3)
d = np.arange(0, 10, 2) # NDArray([0, 2, 4, 6, 8])
e = np.linspace(0, 1, 5) # NDArray([0., 0.25, 0.5, 0.75, 1.])
f = np.concatenate([a, np.array([6, 7])])
g = np.dot(a, a) # 55 (dot product)
Why a compatibility layer? If you already know NumPy, np removes the need to learn a new API. It also makes it easy to swap real NumPy in later if performance demands it — just change the import. Note that np here is not the real NumPy; it’s a latpy object that returns NDArray.
# Random
np.random.seed(42)
r = np.random.randn(3, 3)
Seeding and random generation are forwarded to latpy’s own random module, not numpy.random.
# Linear algebra
M = np.array([[1, 2], [3, 4]])
print(np.linalg.det(M)) # -2.0
print(np.linalg.inv(M))
print(np.linalg.qr(M))
np.linalg provides det, inv, qr, eigvals, solve, and more. Each delegates to the corresponding latpy linalg module.
Edge cases:
- np.float64 vs np.float32: Only np.float64 is defined; np.float32 raises AttributeError. Use F64 or "f64" for precision control.
- np.random.rand vs np.random.randn: Both exist but return latpy arrays, not NumPy arrays.
- Missing functions: np.fft, np.linalg.svd, np.unique are not provided. Check dir(np) for available names.
Statistics
The stats module provides descriptive statistics, histogramming, moment calculations, and probability density functions — no SciPy required.
from latpy.latmath.stats import describe, histogram, skew, kurtosis, norm_pdf
a = array([1, 2, 3, 4, 5])
# Summary statistics
desc = describe(a)
print(desc["mean"]) # 3.0
print(desc["std"]) # 1.581... (sample std, n-1 denominator)
describe returns a dict with min, max, mean, median, std, q1, q3. It computes sample standard deviation (denominator n-1).
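The choice of denominator matters for small samples. The standard library's statistics module (not latpy) shows both conventions side by side for the same data:

```python
import statistics

data = [1, 2, 3, 4, 5]
print(statistics.stdev(data))   # sample std (n-1 denominator): about 1.5811
print(statistics.pstdev(data))  # population std (n denominator): about 1.4142
```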
# Histogram
counts, edges = histogram(a, bins=3, range_=(1, 5))
print(counts.tolist()) # [2, 1, 2]
histogram computes bin counts and bin edges. The range_ parameter constrains the domain (data outside is ignored). Note the trailing underscore to avoid shadowing Python’s range.
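Equal-width binning can be sketched in plain Python as follows. The function name is hypothetical, and the convention that the last bin includes its right edge is borrowed from NumPy as an assumption; latpy's exact edge handling is not stated in this guide.

```python
def equal_width_histogram(data, bins, lo, hi):
    """Count values in `bins` equal-width bins over [lo, hi]; last bin is closed."""
    width = (hi - lo) / bins
    edges = [lo + i * width for i in range(bins)] + [hi]
    counts = [0] * bins
    for x in data:
        if lo <= x <= hi:  # values outside the range are ignored
            idx = min(int((x - lo) / width), bins - 1)  # x == hi lands in the last bin
            counts[idx] += 1
    return counts, edges

counts, edges = equal_width_histogram([1, 2, 3, 4, 5], bins=3, lo=1, hi=5)
print(counts)  # [2, 1, 2]
```

With edges at 1, 2.33, 3.67 and 5, the values 1 and 2 fall in the first bin, 3 in the second, and 4 and 5 in the third.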
# Moments
print(skew(a)) # 0.0
print(kurtosis(a)) # -1.3
Skewness measures asymmetry (0 = symmetric). Kurtosis here is excess kurtosis (Fisher definition, normal = 0). A uniform distribution’s excess kurtosis is -1.2, so -1.3 for [1,2,3,4,5] is expected.
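Both values can be verified by hand from central moments. The sketch below assumes the population (biased) moment definitions; whether latpy applies a bias correction is not stated here, and the function names are hypothetical.

```python
def central_moment(data, k):
    """k-th central moment with an n denominator."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)

def skewness(data):
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3.0

data = [1, 2, 3, 4, 5]
print(skewness(data))                   # 0.0
print(round(excess_kurtosis(data), 6))  # -1.3
```

Here the second moment is 2 and the fourth is 6.8, so the excess kurtosis is 6.8 / 4 - 3 = -1.3.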
# Normal PDF
print(norm_pdf(0.0)) # 0.3989...
norm_pdf(x) returns the standard Normal PDF at x: exp(-x²/2) / sqrt(2π).
Edge cases:
- Single-element array: describe([5]) works, but std will be 0 (single value, zero variance).
- Zero-variance data: skew([5, 5, 5]) returns nan because skewness is undefined when there is no spread.
- Histogram with zero bins or invalid range: Raises ValueError.
- Out-of-range norm_pdf: Handled gracefully; norm_pdf(1e10) returns 0.0 (the true value is far below the smallest representable float, so underflow to zero is the expected result).
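The standard Normal PDF and its underflow behaviour are easy to reproduce with the math module. standard_normal_pdf is a hypothetical stand-in written for this example, not latpy's norm_pdf.

```python
import math

def standard_normal_pdf(x: float) -> float:
    """exp(-x^2 / 2) / sqrt(2 * pi)"""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

print(round(standard_normal_pdf(0.0), 4))  # 0.3989
print(standard_normal_pdf(1e10))           # 0.0 (exp underflows to zero)
```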
Random Numbers
The random module provides a self-contained pseudo-random number generator (Mersenne Twister, independently implemented). It does not depend on Python’s random or NumPy’s.
from latpy.latmath.random import seed, rand, randn, randint, uniform, choice, shuffle
seed(42) # deterministic reproducibility
Why seed(42)? Setting a seed makes random output deterministic and reproducible. This is essential for tests, tutorials, and debugging. Any integer seed works; 42 is simply a convention.
# Continuous
print(randn(3).tolist()) # 3 standard normal samples
print(uniform(0, 10, size=4)) # 4 samples in [0, 10)
print(rand(2, 2).tolist()) # 2x2 uniform[0,1)
- randn(n) samples from N(0, 1) using the Box-Muller transform.
- uniform(low, high, size=n) samples uniformly in [low, high).
- rand(m, n) is shorthand for uniform(0, 1, size=(m, n)).
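If you are curious how Box-Muller turns uniform samples into normal ones, here is a sketch. It draws its uniforms from Python's random module purely for convenience (latpy uses its own generator), and box_muller is a hypothetical name.

```python
import math
import random

def box_muller(n: int, rng: random.Random) -> list:
    """Draw n standard-normal samples from pairs of uniform samples."""
    out = []
    while len(out) < n:
        u1 = 1.0 - rng.random()  # in (0, 1], so log(u1) is defined
        u2 = rng.random()
        radius = math.sqrt(-2.0 * math.log(u1))
        # Each uniform pair yields two independent normal samples.
        out.append(radius * math.cos(2.0 * math.pi * u2))
        out.append(radius * math.sin(2.0 * math.pi * u2))
    return out[:n]

samples = box_muller(3, random.Random(42))
print(samples)  # three floats; identical on every run with the same seed
```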
# Discrete
print(randint(0, 10, size=5)) # 5 integers 0..9
randint(low, high, size) samples uniformly from {low, low+1, ..., high-1} — the upper bound is exclusive, matching NumPy’s convention.
# Sampling
deck = array([1, 2, 3, 4, 5])
print(choice(deck, size=3)) # 3 draws with replacement
shuffle(deck) # in-place shuffle
choice draws with replacement by default (elements can repeat). shuffle permutes the array in place and returns None.
Edge cases:
- Seed repeatability: Two seed(42) calls in the same session reset the generator to the same state, so you’ll get the same sequence again.
- Empty array in choice: Raises ValueError.
- size=0 in randint or uniform: Returns an empty array of shape (0,).
- shuffle on empty or 1-element array: No-op (no error).
Working with Labeled Data
latpy’s latdata module adds named axes (like pandas Index) on top of NDArray, giving you labeled tables.
from latpy.latdata import Axis, Table
# Named axis
rows = Axis("row", ["a", "b", "c"])
cols = Axis("col", ["x", "y"])
# Table from nested lists
t = Table.from_list(
    [[1, 2], [3, 4], [5, 6]],
    row_labels=["a", "b", "c"],
    col_labels=["x", "y"],
)
Table.from_list infers the shape from the data and creates named axes. The underlying data is an NDArray stored at t.data.
# Label-based indexing
print(t["a", "x"]) # 1
print(t["a":"c", :]) # Table (3 rows, 2 cols)
How label indexing works: The first index labels rows, the second labels columns. Slices use label strings (not positions) — "a":"c" selects rows from "a" up to and including "c", unlike Python slices which exclude the end. This matches pandas’ label-slice behavior.
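The inclusive end point is the part that trips people up. This plain-Python sketch translates a label range into an ordinary positional slice; the helper name is hypothetical and it assumes unique labels.

```python
def label_slice_to_positions(labels, start, stop):
    """Translate an inclusive label range into a half-open positional slice."""
    lo = labels.index(start)   # list.index raises ValueError if the label is missing
    hi = labels.index(stop)
    return slice(lo, hi + 1)   # +1 because the end label is included

labels = ["a", "b", "c"]
rows = [[1, 2], [3, 4], [5, 6]]
print(rows[label_slice_to_positions(labels, "a", "b")])  # [[1, 2], [3, 4]]
print(rows[label_slice_to_positions(labels, "a", "c")])  # [[1, 2], [3, 4], [5, 6]]
```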
Edge cases:
- Key not found: t["z", :] raises KeyError.
- Duplicate labels: Not prohibited. Label-based slicing on duplicate labels may skip intervening rows.
- Slice with non-existent label: t["a":"z", :] raises KeyError because the end label must exist.
- Empty table: Table.from_list([[]]) is allowed; shape will reflect the empty dimension.
GroupBy
GroupBy partitions a table’s rows (or columns) by matching label values — like SQL’s GROUP BY or pandas’ groupby.
from latpy.latdata import GroupBy
t = Table.from_list(
    [[1, 2], [3, 4], [5, 6]],
    row_labels=["a", "b", "a"],
    col_labels=["x", "y"],
)
# Group rows by label
gb = GroupBy(t, "row")
print(gb.sum().data.tolist()) # [[6, 8], [3, 4]]
print(gb.mean().data.tolist()) # [[3.0, 4.0], [3.0, 4.0]]
print(gb.count().data.tolist()) # [[2], [1]]
Rows "a" (indices 0 and 2) are aggregated together; row "b" (index 1) forms its own group. sum, mean, and count each reduce the grouped axis. The count result has shape (2, 1) because count returns a single value per group (not per column).
Edge cases:
- No matching groups: A label value with no rows is simply absent from the result.
- Unordered labels: Groups are returned in order of first appearance, not sorted.
- Single group: If all labels are identical, gb.sum() returns a single-row table.
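The grouping logic, including the first-appearance ordering, can be pictured with a plain dict. This is a sketch of the idea only; group_sum is a hypothetical name and not latpy's implementation.

```python
def group_sum(rows, labels):
    """Sum rows that share a label; groups keep first-appearance order."""
    groups = {}  # dicts preserve insertion order in Python 3.7+
    for label, row in zip(labels, rows):
        if label not in groups:
            groups[label] = list(row)
        else:
            groups[label] = [acc + x for acc, x in zip(groups[label], row)]
    return groups

result = group_sum([[1, 2], [3, 4], [5, 6]], ["a", "b", "a"])
print(result)  # {'a': [6, 8], 'b': [3, 4]}
```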
I/O
latpy reads and writes arrays and tables in CSV and JSON formats. CSV is portable (works with Excel, pandas); JSON preserves dtype, shape, and axis metadata.
from latpy.io import save_csv, load_csv, save_json, load_json
a = array([[1, 2, 3], [4, 5, 6]])
# CSV with automatic header
save_csv("data.csv", a)
b = load_csv("data.csv")
CSV output includes a header row by default. Data is comma-separated with each row on its own line. On load, latpy infers dtype from the CSV content. If you use integer CSV data, it loads as I64; if a column contains decimal points, it becomes F64.
# JSON with full metadata (dtype, shape, axes)
from latpy.latdata import Axis, Table
save_json("data.json", a)
c = load_json("data.json")
JSON output stores the array as a flat list alongside dtype, shape, and optionally axis labels. This means save_json / load_json round-trip perfectly — even for labeled Table objects.
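To see why a flat list plus metadata is enough for a lossless round trip, consider this sketch using the standard json module. The key names "dtype", "shape" and "data" are assumptions made for the example; latpy's actual file layout may use different names.

```python
import json

def to_json(flat, shape, dtype):
    # Key names here are illustrative assumptions, not latpy's schema.
    return json.dumps({"dtype": dtype, "shape": list(shape), "data": flat})

def from_json(text):
    doc = json.loads(text)
    return doc["data"], tuple(doc["shape"]), doc["dtype"]

text = to_json([1, 2, 3, 4, 5, 6], (2, 3), "i64")
flat, shape, dtype = from_json(text)
print(flat, shape, dtype)  # [1, 2, 3, 4, 5, 6] (2, 3) i64
```

CSV, by contrast, carries only the values, which is why dtype has to be inferred on load.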
Edge cases:
- Empty array to CSV: Writes a file containing only the header row. On load, you get an array of shape (0,).
- Missing file: load_csv / load_json raise FileNotFoundError.
- Corrupt JSON: Raises json.JSONDecodeError.
- Non-ASCII data: CSV is written with UTF-8 encoding; JSON uses ASCII-safe escaped Unicode by default.
Machine Learning
The ml module provides simple, pure-Python implementations of common ML algorithms — k-means, linear regression, and classification metrics. These are not production-grade (no GPU, no regularization paths), but are suitable for learning, prototyping, and small datasets.
from latpy.ml import kmeans, LinearRegression, accuracy, f1_score, confusion_matrix
# K-Means clustering
X = array([[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [8.0, 8.0], [1.0, 0.6], [9.0, 11.0]])
centroids, labels, inertia = kmeans(X, k=2)
print(labels.tolist()) # cluster assignments
kmeans uses random initialisation (not k-means++), so results vary between runs unless you seed() first. centroids is the final cluster centers (shape (k, n_features)), labels is the assignment per point, and inertia is the sum of squared distances to the nearest centroid.
Seed dependence: kmeans is particularly sensitive to the random seed. Calling seed(42) before kmeans guarantees reproducible cluster assignments.
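The effect of seeding is easiest to see in a small stand-alone version of the algorithm. The sketch below is plain Lloyd's algorithm with random initial centroids, written against Python's random module; lloyd_kmeans is a hypothetical name, and details such as the empty-cluster handling are assumptions rather than latpy's behaviour.

```python
import random

def lloyd_kmeans(points, k, rng, iters=100):
    """Plain Lloyd's algorithm with randomly chosen initial centroids."""
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    inertia = sum(
        sum((a - b) ** 2 for a, b in zip(p, centroids[lab]))
        for p, lab in zip(points, labels)
    )
    return centroids, labels, inertia

X = [[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [8.0, 8.0], [1.0, 0.6], [9.0, 11.0]]
run1 = lloyd_kmeans(X, 2, random.Random(42))
run2 = lloyd_kmeans(X, 2, random.Random(42))
print(run1 == run2)  # True: same seed, same starting centroids, same result
```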
# Linear regression
X = array([[1.0], [2.0], [3.0], [4.0]])
y = array([2.0, 4.0, 6.0, 8.0])
lr = LinearRegression()
lr.fit(X, y)
print(lr.predict(array([[5.0]]))) # ~10.0
print(lr.score(X, y)) # R² = 1.0
LinearRegression fits an ordinary least-squares model. The score method returns R² (coefficient of determination). Perfect fit gives 1.0.
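For a single feature, ordinary least squares and R² have simple closed forms. This sketch reproduces the numbers above in plain Python; fit_line and r_squared are hypothetical helpers, not part of latpy.

```python
def fit_line(xs, ys):
    """Closed-form least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(xs, ys, slope, intercept):
    """1 - (residual sum of squares) / (total sum of squares)."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

xs, ys = [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)                     # 2.0 0.0
print(r_squared(xs, ys, slope, intercept))  # 1.0
print(slope * 5.0 + intercept)              # 10.0
```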
# Classification metrics
y_true = array([1, 0, 1, 1, 0])
y_pred = array([1, 0, 0, 1, 0])
print(accuracy(y_true, y_pred)) # 0.8
print(f1_score(y_true, y_pred)) # 0.8
print(confusion_matrix(y_true, y_pred).tolist())
Metrics compare predicted labels against ground truth. confusion_matrix returns a 2-D array where row i, column j counts the number of times true class i was predicted as class j.
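All three metrics follow from the four confusion-matrix counts. The sketch below computes them for binary labels in plain Python; binary_metrics is a hypothetical helper that assumes labels are 0 or 1.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, F1 and 2x2 confusion matrix for labels in {0, 1}."""
    matrix = [[0, 0], [0, 0]]  # matrix[true][predicted]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    tn, fp, fn, tp = matrix[0][0], matrix[0][1], matrix[1][0], matrix[1][1]
    accuracy = (tp + tn) / len(y_true)
    f1 = 2 * tp / (2 * tp + fp + fn)  # algebraically equal to 2PR / (P + R)
    return accuracy, f1, matrix

acc, f1, cm = binary_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 0])
print(acc)  # 0.8
print(f1)   # 0.8
print(cm)   # [[2, 0], [1, 2]]
```

Here precision is 2/2 and recall is 2/3, which gives F1 = 0.8; it only coincidentally matches the accuracy.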
Edge cases:
- k=1 for kmeans: Returns a single centroid at the data mean; inertia is the total sum of squared distances from that mean.
- k > n_samples: Raises ValueError because you cannot have more clusters than data points.
- LinearRegression with singular X: Raises LinAlgError if the normal equations matrix is not invertible.
- All-zero predictions in f1_score: Raises ZeroDivisionError because F1 is undefined when both precision and recall are zero.
- Mismatched y_true / y_pred lengths: Raises ValueError.
SOV Models (State-Observation-Vector)
SOV is a lightweight state-space / linear dynamical system framework built on latpy arrays. It models hidden states that evolve linearly and emit observations.
from latpy.ml.sov import SOVRegression, SOVClassifier, SOVDynamics
# SOV Regression
X = array([[1.0], [2.0], [3.0]])
y = array([2.0, 4.0, 6.0])
sov = SOVRegression(n_states=2)
sov.fit(X, y)
print(sov.score(X, y))
SOVRegression maps observations X to outputs y through a latent state of dimension n_states. The internal state captures temporal or latent structure that a direct regression might miss.
# SOV dynamics simulation
dyn = SOVDynamics(n_states=2, n_obs=3)
dyn.fit_random(seed=42)
states, obs = dyn.simulate(n_steps=10)
print(states.shape) # (11, 2)
print(dyn.equilibrium().tolist()) # stable equilibrium state
simulate runs the dynamics forward for n_steps timesteps, returning both hidden states (shape (n_steps+1, n_states)) and observations (shape (n_steps, n_obs)). The extra +1 on states is the initial state. equilibrium() computes the fixed point S = A @ S (if stable).
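The shape bookkeeping is easier to follow in a bare-bones simulation. The sketch below assumes a noise-free model of the form s_{t+1} = A s_t with one observation o = C s emitted per step; that model form, the matrices, and the function names are all assumptions for illustration and are not taken from SOVDynamics.

```python
def matvec(M, v):
    """Matrix-vector product on plain lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def simulate(A, C, s0, n_steps):
    """Roll the state forward n_steps times, emitting one observation per step."""
    states, obs = [list(s0)], []  # states starts with the initial state
    for _ in range(n_steps):
        states.append(matvec(A, states[-1]))
        obs.append(matvec(C, states[-1]))
    return states, obs

A = [[0.5, 0.0], [0.0, 0.25]]             # stable: eigenvalues 0.5 and 0.25
C = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 observations from 2 states
states, obs = simulate(A, C, [1.0, 1.0], n_steps=10)
print(len(states), len(obs))  # 11 10
print(states[-1])             # close to the fixed point [0, 0]
```

Because every eigenvalue of this A has magnitude below 1, the state shrinks toward the fixed point at each step; with a magnitude above 1 it would grow without bound instead.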
Edge cases:
- n_states larger than data rank: The fit may underdetermine the state.
- Unstable dynamics: equilibrium() may diverge if the transition matrix has eigenvalues > 1 in magnitude.
Visualization
latpy’s viz module renders plots as SVG — a vector format viewable in any browser. No GUI, no matplotlib, no JavaScript.
from latpy.viz import plot, scatter, bar, hist, Figure
# Line plot (auto-scales, returns SVG)
fig, line_el = plot([1, 2, 3, 4, 5], [2, 4, 1, 8, 6])
fig.save("line_plot.svg")
plot(x, y) automatically determines axis ranges to fit all data points. It returns a Figure object and the SVG line element. The figure is not displayed automatically — you must call fig.save(filename) to write the SVG file.
# Scatter plot
fig, dots = scatter([1, 2, 3, 4], [2, 5, 3, 7], r=5)
fig.save("scatter.svg")
The r=5 argument controls the radius of each plotted circle.
# Bar chart
fig, bars = bar(["A", "B", "C", "D"], [3, 7, 2, 5])
fig.save("bar.svg")
Labels are strings; numeric axes are auto-scaled.
# Histogram
fig, bins_ = hist([1, 1, 2, 3, 3, 3, 4, 5], bins=4)
fig.save("hist.svg")
hist(data, bins=n) bins the data, computes frequencies, and draws rectangles.
# Graph visualization
from latpy.viz import draw_graph
svg = draw_graph(["A", "B", "C", "D"],
                 [("A", "B"), ("B", "C"), ("C", "D"), ("A", "D")],
                 width=400, height=300)
with open("graph.svg", "w") as f:
    f.write(svg)
draw_graph returns a raw SVG string (not a Figure) that you write to a file yourself. It places nodes in a simple layout and draws edges with lines.
Why SVG? SVG is pure text (XML), not a binary format. You can view it in any browser, embed it in web pages, and include it in Jupyter Notebooks (the browser renders it inline). The downside is that SVG files can be larger than PNG for the same data.
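Because SVG is just text, a line plot can be produced with string formatting alone. The sketch below builds a minimal SVG polyline by hand; polyline_svg is a hypothetical helper written for this example and is far simpler than what latpy's viz module does (no axes, ticks, or margins).

```python
def polyline_svg(xs, ys, width=200, height=100):
    """Build a minimal SVG document containing one polyline.

    Assumes at least two distinct x values and two distinct y values.
    """
    x_min, x_max, y_min, y_max = min(xs), max(xs), min(ys), max(ys)
    pts = []
    for x, y in zip(xs, ys):
        px = (x - x_min) / (x_max - x_min) * width
        py = height - (y - y_min) / (y_max - y_min) * height  # SVG y grows downward
        pts.append(f"{px:.1f},{py:.1f}")
    points = " ".join(pts)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        f'<polyline fill="none" stroke="black" points="{points}"/>'
        "</svg>"
    )

svg = polyline_svg([1, 2, 3, 4, 5], [2, 4, 1, 8, 6])
print(svg[:40])  # the file starts with an ordinary XML tag
```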
Edge cases:
- Empty data for plot: Raises ValueError; at least two points are needed.
- Single bar for bar: Works fine; a lone bar is drawn.
- All identical values for plot: The y-range defaults to [value-1, value+1] to avoid a zero-height plot.
- SVG showing as text: This happens if you open the .svg file in a text editor or serve it with the wrong MIME type. Save with a .svg extension and open in a browser, or configure your server to serve .svg files as image/svg+xml.
Performance Notes
latpy is a pure-Python library with no C, Cython, or Fortran extensions. This means:
Slower than NumPy: For most operations (addition, multiplication, reductions), latpy is 10–50× slower than NumPy, because the tight loops run in CPython rather than compiled C. This is acceptable for teaching, prototyping, and datasets under ~10⁵ elements.
Comparable to native Python lists: For small arrays (under ~1,000 elements), latpy overhead is small — on par with manual list comprehensions.
No parallelism: latpy does not use threads, multiprocessing, or SIMD. All operations are single-threaded Python.
Big-O complexity of common operations:
| Operation | Complexity | Notes |
|---|---|---|
| Reductions | O(n) | Single pass over data |
| Element-wise arithmetic | O(n) | Scalar operations are O(n) |
| Dot product / matmul | O(n) for 1-D, O(mnk) for matmul | |
| Linear solve (solve) | O(n³) | Gaussian elimination |
| Dominant eigenvalue (eig) | O(k·n²) | Power iteration, k iterations |
| QR decomposition (qr) | O(m·n²) | Gram-Schmidt |
| Sorting | O(n log n) | Uses Python’s TimSort |
| K-means (kmeans) | O(k·n·d·iter) per run | k clusters, n points, d dimensions |
| Histogram (histogram) | O(n + b) | Count in bins |
Migration path: If you outgrow latpy’s performance, switch to NumPy by:
- Writing your code against np = latpy.latmath.array.numpy_compat.np first; the API is similar, so moving to real NumPy is mostly a change of import.
- Converting latpy arrays to NumPy with np.array(ndarray_obj) (requires real NumPy installed).
- Replacing from latpy.ml import ... with sklearn equivalents.
Troubleshooting
“Why is my array I64 when I passed floats?”
latpy picks the dtype that fits all input values, so a list of whole numbers becomes I64 and a single float promotes the whole array to F64:
a = array([1, 2, 3]) # all ints → I64
b = array([1, 2, 3.0]) # has a float → F64
If you explicitly want floats, pass dtype="f64" or include a decimal value. See the “Type promotion rules” in the Data Types section.
“Why did my indexing return a copy instead of a view?”
Only slices (and None) return views. Everything else returns a copy:
- a[1:4] → view (contiguous, shared memory)
- a[[0, 2, 4]] → copy (fancy indexing, non-contiguous)
- a[a > 25] → copy (boolean mask, non-contiguous)

An identity check cannot tell you whether you have a view: arr[1:4] is arr is False even for views, because Python’s is compares object identity, not the underlying memory. Modifications to the slice will still affect the original. If you need a guaranteed copy, call .copy().
“Why did kmeans give different results each time?”
kmeans initialises centroids randomly. Without a fixed seed, each run picks different starting points, which can converge to different local minima. For reproducible results:
from latpy.latmath.random import seed
seed(42) # or any integer
centroids, labels, inertia = kmeans(X, k=2)
For a deterministic run, also consider that the data order affects tie-breaking in label assignment.
“Why is my SVG showing as text?”
You’re likely viewing the .svg file in a text editor or terminal. SVG is plain XML, so it looks like <?xml ...<svg ...>...</svg>. To see the rendered graphic:
- Save to a .svg file, then open that file in a web browser (Chrome, Firefox, Edge).
- Serve over HTTP with the correct MIME type: your web server must serve .svg files as image/svg+xml. If you see XML in the browser, the MIME type is wrong.
- In Jupyter Notebook, calling fig._repr_svg_() (if available) or using IPython.display.SVG(fig.svg) will render inline.
“ImportError: No module named ‘latpy’”
Python cannot find the latpy package. Solutions:
- Did you install? Run pip install -e . from the latpy/ directory (the one containing pyproject.toml or setup.py).
- Check PYTHONPATH: The src/ directory inside the repository must be discoverable. Either install with pip, or set:
  - Windows: set PYTHONPATH=C:\path\to\latpy\src;%PYTHONPATH%
  - Linux/macOS: export PYTHONPATH=/path/to/latpy/src:$PYTHONPATH
- Virtual environment: If you’re using a virtual environment, activate it before running pip install. A library installed globally won’t be visible inside a virtual environment (and vice versa).
- Spelling: The package name is latpy (all lowercase, no hyphen). import latpy works; import latPy does not.
Next Steps
- Browse the API documentation for detailed reference
- Run tests: python -m pytest tests/
- Read the CHANGELOG for version history