
TSL — Tensor Separation Learning

TSL is a glass-box regression model: it fits accurately and lets you read its learned structure directly off one-dimensional plots — with no post-hoc surrogate. A fitted model is a small sum of stages, each a separable product of per-feature curves, so every feature's effect is recoverable exactly from a partial-dependence curve.

Everything below is produced by TSL itself (via the tensorsl.plot helpers) — these figures are the model, not an approximation of it.

What TSL shows you

Where a stage is active — the spatial backbone

California spatial backbone and per-stage 2D partial dependence
California housing — two boosting stages, each shown as two panels. Top row: the 2D backbone product \(b_\mathrm{lon}(x)\cdot b_\mathrm{lat}(y)\), a non-negative gate whose magnitude encodes where the stage is active (darker blue = larger). Bottom row: the 2D partial dependence \(\hat{m}_+ - \hat{m}_-\), the signed prediction each stage contributes at that location (warm orange = positive, cool blue = negative). Stage 1 captures a broad coastal gradient; Stage 2 refines it with a sharper Bay Area and LA correction.
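As a rough illustration of the top-row panels, the 2D backbone product is just the outer product of two non-negative one-dimensional curves. The sketch below uses made-up Gaussian bumps for `b_lon` and `b_lat`; the grids, curve shapes, and variable names are assumptions for illustration and are not taken from a fitted TSL model.

```python
import numpy as np

# Hypothetical grids and 1-D backbone curves (illustrative values only).
lon_grid = np.linspace(-124.0, -114.0, 50)
lat_grid = np.linspace(32.0, 42.0, 40)
b_lon = np.exp(-0.5 * ((lon_grid + 122.0) / 1.0) ** 2)  # bump near longitude -122
b_lat = np.exp(-0.5 * ((lat_grid - 37.5) / 1.5) ** 2)   # bump near latitude 37.5

# 2-D gate: rows index latitude, columns index longitude.
gate = np.outer(b_lat, b_lon)

# A product of non-negative curves is non-negative and has rank 1.
rank = np.linalg.matrix_rank(gate)
peak_lat_idx, peak_lon_idx = np.unravel_index(gate.argmax(), gate.shape)
print(gate.shape, rank, lat_grid[peak_lat_idx], lon_grid[peak_lon_idx])
```

Because the gate is rank 1, the whole heatmap is determined by its two marginal curves, which is why the one-dimensional plots carry the full spatial structure of a stage.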

Faithful 1-D partial dependence

1D partial dependence on latitude and longitude
TSL's first-order partial dependence for Latitude (left) and Longitude (right) on California housing, overlaid against EBM, XGBoost, and SepALS baselines. TSL (dark blue) shows sharp localized peaks — a spike near latitude 37–38 (San Francisco / Bay Area) and a concentrated coastal band near longitude −122 — that additive-marginalization baselines smooth into nearly monotone slopes. This faithfulness follows from separability: for a product-form model the 1D PD curve recovers the exact factor shape rather than a marginal main effect.
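The separability argument can be checked numerically. The sketch below builds a toy product-form function `f(x) = g1(x1) * g2(x2)` (both factors are invented for this example, not TSL output) and computes a brute-force partial dependence curve for the first feature. The curve equals `g1` times a constant, namely the sample mean of `g2`, so the factor's shape is recovered exactly up to scale.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(2000, 2))

def g1(x):
    # Hypothetical factor with a sharp localized peak at x = 0.3.
    return 1.0 + np.exp(-50.0 * (x - 0.3) ** 2)

def g2(x):
    # Hypothetical strictly positive second factor.
    return 2.0 + np.sin(3.0 * x)

def f(X):
    # Toy separable (product-form) model.
    return g1(X[:, 0]) * g2(X[:, 1])

def partial_dependence(model, X, j, grid):
    """Brute-force PD: fix feature j at each grid value, average over the data."""
    pd_curve = np.empty(len(grid))
    for k, v in enumerate(grid):
        Xv = X.copy()
        Xv[:, j] = v
        pd_curve[k] = model(Xv).mean()
    return pd_curve

grid = np.linspace(-1.0, 1.0, 41)
pd1 = partial_dependence(f, X, 0, grid)

# For a product model the PD curve is g1 scaled by the average of g2.
scale = g2(X[:, 1]).mean()
print(np.allclose(pd1, scale * g1(grid)))
```

The sharp peak in `g1` survives averaging intact because the other factor only contributes a multiplicative constant.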

Feature importance — magnitude vs. direction

Per-stage backbone and tilt feature importance
Six-panel feature importance report on California housing. Top row (heatmaps): per-stage backbone importance \(\mathrm{Var}[\log b_j]\) (left) and tilt importance \(\mathrm{Var}[d_j]\) (right) — rows are stages, columns are features, colour encodes magnitude. Bottom row (bars): global tilt importance (left), combined score \(I_j = I_j^b + \gamma I_j^d\) (center), global backbone importance (right), with Stage 1 dominating the energy-weight panel. Longitude and Latitude lead in backbone (they gate the spatial stages on/off); Latitude and MedInc lead in tilt (they drive the signed price direction within each stage).
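The importance scores in the caption can be sketched directly from their formulas. The example below computes per-stage \(\mathrm{Var}[\log b_j]\) and \(\mathrm{Var}[d_j]\) from synthetic factor values and combines them as \(I_j = I_j^b + \gamma I_j^d\). The arrays, the stage weights used to aggregate across stages, and the value of `gamma` are all assumptions made for illustration; the library's own aggregation may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
n_stages, n_obs, n_feat = 2, 500, 3

# Hypothetical factor values per stage, observation, and feature.
# b > 0 plays the role of the backbone factor, d the signed tilt.
b = np.exp(rng.normal(scale=[0.8, 0.3, 0.05], size=(n_stages, n_obs, n_feat)))
d = rng.normal(scale=[0.1, 0.6, 0.2], size=(n_stages, n_obs, n_feat))

# Per-stage importance: variance over observations, shape (n_stages, n_feat).
backbone_imp = np.log(b).var(axis=1)
tilt_imp = d.var(axis=1)

# Assumption: aggregate stages with fixed weights (not the library's rule).
stage_w = np.array([0.8, 0.2])
gamma = 1.0  # assumed trade-off between backbone and tilt importance

I_b = stage_w @ backbone_imp
I_d = stage_w @ tilt_imp
I = I_b + gamma * I_d

ranking = np.argsort(-I)  # feature indices, most important first
print(I_b.round(3), I_d.round(3), ranking)
```

A feature can rank high on backbone importance and low on tilt importance (or the reverse), which is the magnitude-versus-direction distinction the six panels are meant to show.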

Local, per-observation explanations

Local explanation for a coastal point
Local explanation for a coastal home in the San Francisco Bay area — Longitude −122.41, Latitude 37.70, MedInc 2.41, HouseAge 23, TotalRooms 1817, TotalBedrooms 400, Population 1376, Households 382. Left card: per-stage contribution bars summing to the total prediction. Center cards: per-stage backbone share, showing which features gate each stage on (Latitude and Longitude dominate Stage 1). Right card: signed tilt \(d_j\) bars indicating the direction of each feature's effect. The 10-stage blackbox TSL predicts $173,675, with Stage 1 contributing +$172,967 (the coastal premium) and later stages making small corrections.

Local explanation for a desert point
Local explanation for an inland (desert) home near Palm Springs — Longitude −116.50, Latitude 33.81, MedInc 2.54, HouseAge 26, TotalRooms 5032, TotalBedrooms 1229, Population 3086, Households 1183 — shown for contrast with the coastal point above. Stage 1 again dominates (+$118,103), but the backbone shares and tilt signs differ: at this inland longitude/latitude the spatial gate is weaker, and Longitude and Latitude contribute a negative tilt rather than the large positive coastal premium. The 10-stage blackbox TSL predicts $111,364.
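Since the per-stage bars sum to the total prediction, the combined contribution of the later stages follows by subtraction from the figures quoted in the two captions. The short calculation below assumes that any baseline or intercept is already folded into the stage bars; the dictionary layout is only for illustration.

```python
# Figures quoted in the two captions above (US dollars).
cases = {
    "coastal": {"prediction": 173_675, "stage1": 172_967},
    "desert": {"prediction": 111_364, "stage1": 118_103},
}

for name, c in cases.items():
    # Whatever Stage 1 does not explain is the net effect of the later stages.
    c["later_stages"] = c["prediction"] - c["stage1"]
    c["stage1_share"] = c["stage1"] / c["prediction"]
    print(f"{name}: later stages {c['later_stages']:+,d}, "
          f"Stage 1 share {c['stage1_share']:.1%}")
```

Under that assumption the later stages add a net $708 for the coastal home and remove a net $6,739 for the desert home, so Stage 1 overshoots the final prediction in the inland case.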

Read this first — it is short, and it is the point

These plots only mean something once you know what a backbone, a tilt, and a stage are. We strongly recommend reading the Under the hood section before relying on the figures: start with The model, then Partial dependence. That background is what makes TSL interpretable rather than just another regressor.

Examples

The figures above come from runnable scripts in tsl-py/examples/, which reproduce the paper's plots via the tensorsl.plot helpers:

```shell
python tsl-py/examples/california.py
python tsl-py/examples/bike_sharing.py
python tsl-py/examples/synthetic.py
python tsl-py/examples/synthetic2.py
```

| Script | What it shows |
| --- | --- |
| california.py | spatial backbone, 1D PD faithfulness, feature importance, local explanations |
| bike_sharing.py | 2D hour × workingday interaction PD |
| synthetic.py | the masked interaction: signed PD ≈ 0 for every model, yet the backbone recovers the effect |
| synthetic2.py | bagging diagnostics (backbone bimodality + similarity filtering) |
| sepals_synthetic.py | small synthetic factor-value / SepALS comparison |

Each script accepts --data-root, --out, --refit, and --variant; the pretrained models in examples/models/ are used by default. See the examples README for the full per-script output list and flags.
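To see how a signed partial dependence can vanish while a real effect is present (the situation the synthetic.py row describes), consider a toy function whose sign is flipped by a second, symmetric feature. This is a minimal sketch under invented assumptions: it is not the data-generating process used in synthetic.py, and the mean absolute value below is only a stand-in for a magnitude summary, not TSL's backbone estimator.

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.uniform(0.1, 1.0, size=1000)
x2 = np.concatenate([u, -u])               # exactly symmetric about 0
x1 = rng.uniform(-1.0, 1.0, size=x2.size)
X = np.column_stack([x1, x2])

def g(x):
    # Hypothetical effect of x1; always at least 1.
    return 1.0 + x ** 2

def f(X):
    # The sign of x2 masks the effect of x1 in any signed average.
    return np.sign(X[:, 1]) * g(X[:, 0])

grid = np.linspace(-1.0, 1.0, 21)
signed_pd = np.empty(grid.size)
magnitude = np.empty(grid.size)
for k, v in enumerate(grid):
    Xv = X.copy()
    Xv[:, 0] = v
    fv = f(Xv)
    signed_pd[k] = fv.mean()           # positive and negative halves cancel
    magnitude[k] = np.abs(fv).mean()   # magnitude still tracks g

print(np.abs(signed_pd).max(), np.allclose(magnitude, g(grid)))
```

The signed curve is flat at zero for every grid value, while the magnitude curve reproduces `g`, which is the kind of effect a signed PD alone cannot reveal.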

How the codebase is organized

The model is a three-level hierarchy, and the src/ module tree mirrors it exactly:

| Level | Type | What it is | Docs |
| --- | --- | --- | --- |
| 1 | GridTensor | one fitted separable component (backbone/tilt + \(\lambda_\pm\)) | GridTensor |
| 2 | StagePredictor | one boosting stage: a bag of GridTensors + OLS scaling | StagePredictor |
| 3 | TSL | the boosted model: a Vec<StagePredictor> summed | TSL |
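The three levels can be pictured with a toy Python mirror of the hierarchy. This is a simplified sketch, not the tensorsl API or the Rust types: the class fields, the choice to average the bag, and the `scale`/`intercept` names are assumptions, and the backbone/tilt split and \(\lambda_\pm\) are left out entirely.

```python
from dataclasses import dataclass
from typing import Callable, List

import numpy as np

Curve = Callable[[np.ndarray], np.ndarray]

@dataclass
class GridTensor:
    """Level 1 (toy): a separable product of one 1-D curve per feature."""
    curves: List[Curve]

    def predict(self, X: np.ndarray) -> np.ndarray:
        out = np.ones(X.shape[0])
        for j, curve in enumerate(self.curves):
            out *= curve(X[:, j])
        return out

@dataclass
class StagePredictor:
    """Level 2 (toy): a bag of components plus a linear rescaling."""
    bag: List[GridTensor]
    scale: float = 1.0       # assumed stand-in for the OLS scaling
    intercept: float = 0.0   # assumed; may not exist in the real type

    def predict(self, X: np.ndarray) -> np.ndarray:
        # Assumption: the bag is combined by a plain average.
        avg = np.mean([g.predict(X) for g in self.bag], axis=0)
        return self.intercept + self.scale * avg

@dataclass
class TSL:
    """Level 3 (toy): the boosted model is the sum of its stages."""
    stages: List[StagePredictor]

    def predict(self, X: np.ndarray) -> np.ndarray:
        return sum(stage.predict(X) for stage in self.stages)

X = np.array([[1.0, 2.0], [3.0, 4.0]])
component = GridTensor([lambda x: x, lambda x: 2.0 * x])
stage = StagePredictor(bag=[component, component], scale=0.5, intercept=1.0)
model = TSL(stages=[stage, stage])
print(model.predict(X))
```

Each level only calls `predict` on the level below it, which is the same containment the table describes.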

The core is the Rust crate tsl_rust (library name tsl). tsl-py/ wraps it for Python with a scikit-learn API (see Python API). tsl-r/ (tensorsl) wraps it for R with an S3 fit/predict interface and a ggplot2 interpretability layer (see R API). tsl-split-evolution-dashboard/ (tslviz) visualizes how a fit was built (see Visualization dashboard).

Where to start

Before anything else, read the Under the hood material, starting with Notation and The model, to understand what a backbone, a tilt, and a stage are. That understanding is what makes TSL interpretable rather than just another regressor, and it is worth five minutes before you fit your first model. From there, Fitting and Partial dependence round out the theory.