torch — Autograd for NDArray
Lightweight automatic differentiation built on latpy’s NDArray. Reverse-mode autograd with a PyTorch-like API. Pure stdlib, zero dependencies.
Why it exists: Gradient computation is essential for optimization and ML. This module tracks operations on Tensor objects, builds a computation graph, and computes gradients via reverse-mode automatic differentiation (backpropagation). The API mirrors PyTorch so users familiar with it can start immediately.
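To make the idea of tracking operations and replaying them in reverse concrete, here is a minimal scalar sketch in plain Python. The `Value` class and everything in it are hypothetical names written for illustration only; this is not latpy's implementation, which works on NDArray-backed Tensor objects.

```python
class Value:
    """Hypothetical scalar node; illustrates graph recording, not latpy's Tensor."""

    def __init__(self, data, parents=(), backward_fn=None):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents          # nodes this value was computed from
        self._backward_fn = backward_fn  # pushes self.grad to the parents

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def backward_fn():
            # d(a + b)/da = 1, d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward_fn = backward_fn
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def backward_fn():
            # d(a * b)/da = b, d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward_fn = backward_fn
        return out

    def backward(self):
        # Topologically order the graph so each node's gradient is complete
        # before it is propagated to its parents.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0  # seed: d(output)/d(output)
        for node in reversed(order):
            if node._backward_fn is not None:
                node._backward_fn()


x = Value(3.0)
z = x * x + x      # z = x² + x
z.backward()
print(x.grad)      # 7.0, since dz/dx = 2x + 1
```

The forward pass only records parents and a small closure per operation; all derivative work happens in `backward()`, which is the pattern a dynamic graph relies on.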
Tensor
from latpy.torch import Tensor, tensor, zeros_like
| Signature | Description |
|---|---|
| `tensor(...)` | Create a new tensor from lists, scalars, or NDArray |
| `Tensor(...)` | Low-level constructor |
| `zeros_like(...)` | Zero-filled tensor with same shape |
from latpy.torch import tensor
x = tensor([1.0, 2.0, 3.0], requires_grad=True)
print(x.tolist()) # [1.0, 2.0, 3.0]
print(x.shape) # (3,)
Properties:
- `.data`: underlying NDArray
- `.grad`: accumulated gradient NDArray (or `None` before `backward()`)
- `.requires_grad`: whether this tensor tracks gradients
- `.shape`, `.ndim`: shape convenience properties
Differentiable Operations
from latpy.torch import (
    add, mul, sub, div, neg, pow,
    sin, cos, exp, log,
    sum, mean, matmul,
)
All operations support broadcasting and return new Tensor instances that track the computation graph.
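Broadcasting has a consequence for the backward pass: when a smaller operand is stretched to match a larger one, its gradient has to be summed back down to the operand's original shape. The sketch below shows this for a (2, 3) array plus a (3,) vector using nested lists. The helper names are hypothetical, and the source does not state how latpy performs this reduction, so treat it as an illustration of the general technique.

```python
def add_broadcast(a, b):
    """Forward: a has shape (rows, cols), b has shape (cols,)."""
    return [[a_ij + b_j for a_ij, b_j in zip(row, b)] for row in a]


def add_broadcast_backward(grad_out):
    """Backward for the sum above, given an upstream gradient of shape (rows, cols)."""
    grad_a = [row[:] for row in grad_out]           # same shape as a, since d/da = 1
    grad_b = [sum(col) for col in zip(*grad_out)]   # b was reused for every row, so sum over rows
    return grad_a, grad_b


a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
b = [10.0, 20.0, 30.0]
out = add_broadcast(a, b)

# An upstream gradient of ones corresponds to differentiating sum(out).
grad_out = [[1.0] * 3 for _ in range(2)]
grad_a, grad_b = add_broadcast_backward(grad_out)
print(out)     # [[11.0, 22.0, 33.0], [14.0, 25.0, 36.0]]
print(grad_b)  # [2.0, 2.0, 2.0]
```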
| Operation | Forward | Gradient |
|---|---|---|
| `add` | a + b | ∂/∂a = 1, ∂/∂b = 1 |
| `mul` | a * b | ∂/∂a = b, ∂/∂b = a |
| `sub` | a − b | ∂/∂a = 1, ∂/∂b = −1 |
| `div` | a / b | ∂/∂a = 1/b, ∂/∂b = −a/b² |
| `neg` | −a | ∂/∂a = −1 |
| `pow` | aᵇ | ∂/∂a = b·aᵇ⁻¹, ∂/∂b = aᵇ·ln(a) |
| `sin` | sin(a) | ∂/∂a = cos(a) |
| `cos` | cos(a) | ∂/∂a = −sin(a) |
| `exp` | eᵃ | ∂/∂a = eᵃ |
| `log` | ln(a) | ∂/∂a = 1/a |
| `sum` | Σa | ∂/∂a = 1 |
| `mean` | Σa / n | ∂/∂a = 1/n |
| `matmul` | a @ b | ∂/∂a = grad @ bᵀ, ∂/∂b = aᵀ @ grad |
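The gradient formulas in the table can be sanity-checked numerically. The following sketch compares several of them against central finite differences using only the `math` module; the helper `central_diff` is a hypothetical name, and nothing here calls latpy itself.

```python
import math


def central_diff(f, x, h=1e-6):
    """Numerical derivative of a scalar function f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)


a, b = 1.5, 2.5

# Each entry: (label, analytic gradient from the table, numerical estimate)
checks = [
    ("div d/da", 1 / b, central_diff(lambda t: t / b, a)),
    ("div d/db", -a / b**2, central_diff(lambda t: a / t, b)),
    ("pow d/da", b * a ** (b - 1), central_diff(lambda t: t ** b, a)),
    ("pow d/db", a ** b * math.log(a), central_diff(lambda t: a ** t, b)),
    ("sin d/da", math.cos(a), central_diff(math.sin, a)),
    ("log d/da", 1 / a, central_diff(math.log, a)),
]

for label, analytic, numeric in checks:
    assert math.isclose(analytic, numeric, rel_tol=1e-6), label
    print(f"{label}: analytic={analytic:.6f} numeric={numeric:.6f}")
```

The same comparison is a useful test pattern when adding a new differentiable operation: implement the forward function, derive the gradient, and confirm the two agree at a few points.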
Automatic Differentiation
from latpy.torch import tensor, add, mul
x = tensor([3.0], requires_grad=True)
y = mul(x, x) # y = x²
z = add(y, x) # z = x² + x
z.backward() # dz/dx = 2x + 1 = 7
print(x.grad.tolist()) # [7.0]
Chain rule composes naturally:
from latpy.torch import tensor, pow, sin
x = tensor([2.0], requires_grad=True)
y = pow(x, 2.0) # y = x²
z = sin(y) # z = sin(x²)
z.backward() # dz/dx = cos(x²) · 2x
# ≈ cos(4.0) * 4.0 = -2.614...
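For readers who want to see the chain rule applied step by step, the same derivative can be worked out by hand with plain floats. This sketch uses only the `math` module and mirrors what a backward pass would do for z = sin(x²); the variable names are illustrative.

```python
import math

x = 2.0

# Forward pass: record each intermediate value.
y = x ** 2          # y = x²
z = math.sin(y)     # z = sin(y)

# Backward pass: start from dz/dz = 1 and apply local derivatives in reverse.
dz_dz = 1.0
dz_dy = dz_dz * math.cos(y)   # sin rule: ∂/∂a sin(a) = cos(a)
dz_dx = dz_dy * 2 * x         # pow rule with b = 2: ∂/∂a aᵇ = b·aᵇ⁻¹

print(f"{dz_dx:.4f}")         # -2.6146
```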
Optimizer
from latpy.torch import SGD
| Signature | Description |
|---|---|
| `SGD(...)` | Stochastic gradient descent |
| `.step()` | Update all parameters: p ← p − lr · p.grad |
| `.zero_grad()` | Clear gradients from all parameters |
from latpy.torch import tensor, pow, SGD
x = tensor([5.0], requires_grad=True)
opt = SGD([x], lr=0.1)
# Training loop
for _ in range(50):
    loss = pow(x, 2.0)  # minimize x²
    loss.backward()
    opt.step()
    opt.zero_grad()
print(x.tolist()[0]) # ≈ 0.0
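The roles of `step()` and `zero_grad()` in the loop can be illustrated with a plain-float stand-in. This sketch assumes that `backward()` adds into the stored gradient and that `step()` applies p ← p − lr · p.grad; the function `run` is hypothetical and does not use latpy.

```python
def run(steps, lr, clear_grad):
    """Minimize x² from x = 5.0 with hand-written gradient bookkeeping."""
    x, grad = 5.0, 0.0
    for _ in range(steps):
        grad += 2.0 * x      # backward(): accumulate d(x²)/dx into the stored gradient
        x -= lr * grad       # step(): p ← p − lr · p.grad
        if clear_grad:
            grad = 0.0       # zero_grad(): reset before the next iteration
    return x


with_clear = run(50, 0.1, clear_grad=True)
without_clear = run(50, 0.1, clear_grad=False)

# With clearing, each step multiplies x by (1 − 2·lr) = 0.8.
print(abs(with_clear - 5.0 * 0.8 ** 50) < 1e-12)   # True
print(abs(with_clear) < 1e-3)                      # True
print(abs(without_clear) < 1e-3)                   # False: stale gradients keep x oscillating
```

With the gradient cleared each iteration, x shrinks geometrically toward 0. Without clearing, earlier gradients keep contributing to every update and x does not settle.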
Design Notes
- The computation graph is dynamic: it is rebuilt on every forward pass.
- Gradients accumulate across multiple `backward()` calls (call `zero_grad()` between iterations).
- `SGD` is a minimal optimizer. Extend the pattern for momentum, Adam, etc.
- All tensors are float64 (`F64`); there is no mixed-precision support yet.
- The `ndarray.T` property works for transposition in the `matmul` backward pass.
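The transposes in the matmul backward pass can be checked on a small example. The sketch below uses nested lists and hypothetical helpers (`matmul`, `transpose`, `total`) rather than latpy's NDArray, and cross-checks one gradient entry against a central finite difference.

```python
def matmul(a, b):
    """Matrix product of nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns of a nested list."""
    return [list(col) for col in zip(*m)]


def total(m):
    """Sum of all entries; plays the role of a scalar loss."""
    return sum(sum(row) for row in m)


a = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]   # shape (3, 2)
b = [[0.5, -1.0], [2.0, 1.5]]              # shape (2, 2)

# Upstream gradient for loss = total(a @ b) is all ones, shape (3, 2).
grad = [[1.0, 1.0] for _ in range(3)]
grad_a = matmul(grad, transpose(b))        # ∂/∂a = grad @ bᵀ, shape (3, 2)
grad_b = matmul(transpose(a), grad)        # ∂/∂b = aᵀ @ grad, shape (2, 2)

# Cross-check one entry of grad_a with a central difference.
h = 1e-6
a_plus = [row[:] for row in a]
a_minus = [row[:] for row in a]
a_plus[1][0] += h
a_minus[1][0] -= h
numeric = (total(matmul(a_plus, b)) - total(matmul(a_minus, b))) / (2 * h)

print(grad_a)  # [[-0.5, 3.5], [-0.5, 3.5], [-0.5, 3.5]]
print(grad_b)  # [[9.0, 9.0], [12.0, 12.0]]
```

Note that each gradient has the same shape as the operand it belongs to, which is a quick way to confirm the transposes are on the correct side.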