perf: use single-step sparse matrix slicing in MatrixAccessor by MaykThewessen · Pull Request #618 · PyPSA/linopy

MaykThewessen · 2026-03-13T21:22:23Z

Summary

Replace the double-slicing pattern in MatrixAccessor.A and MatrixAccessor.Q with single-step np.ix_() indexing.

Before:

# matrices.py:147 — two separate sparse slice operations
A[self.clabels][:, self.vlabels]

# matrices.py:181 — same pattern for quadratic objective
expr.to_matrix()[self.vlabels][:, self.vlabels]

After:

A[np.ix_(self.clabels, self.vlabels)]
expr.to_matrix()[np.ix_(self.vlabels, self.vlabels)]

Motivation

The double-slice A[rows][:, cols] creates an intermediate sparse matrix after the first slice, then slices again. np.ix_() expresses the row+column selection as a single operation, avoiding the intermediate allocation. For large constraint matrices (~1.38M rows × ~593K cols), this reduces memory churn.

Context

See #198 (comment) — item 4 in the priority list.

Test plan

test_matrices.py — all 4 tests pass (shape validation, masked models, duplicated variables, float coefficients)
test_io.py::test_to_highspy — passes
test_optimization.py highs-direct — 24/25 pass (one pre-existing failure)

🤖 Generated with Claude Code

Replace double-slicing pattern A[clabels][:, vlabels] with single-step A[np.ix_(clabels, vlabels)] in MatrixAccessor.A and MatrixAccessor.Q. The double-slice creates an intermediate sparse matrix (selecting rows first, then columns), which allocates temporary storage proportional to the full matrix. np.ix_() performs both row and column selection in a single operation, avoiding the intermediate allocation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Covers the code paths optimised by these PRs: - PyPSA#616 cached_property on MatrixAccessor (flat_vars / flat_cons) - PyPSA#617 np.char.add for label string concatenation - PyPSA#618 sparse matrix slicing in MatrixAccessor.A - PyPSA#619 numpy solution unpacking Reproduces benchmark results on PyPSA SciGrid-DE (24–500 snapshots) and a synthetic model. Supports JSON output and --compare mode for cross-branch comparison. Reproduce with: python benchmark/scripts/benchmark_matrix_gen.py -o results.json --label "after" Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

MaykThewessen · 2026-03-17T20:25:12Z

Added benchmark/scripts/benchmark_matrix_gen.py to this branch (and #616, #617, #619) as requested by @FBumann.

Reproduce with:

python benchmark/scripts/benchmark_matrix_gen.py -o results.json --label "with-PR-618"
python benchmark/scripts/benchmark_matrix_gen.py --compare before.json after.json

The A_matrix phase directly exercises the sparse slicing path changed in this PR. At 500 snapshots (1.2M variables / 5.4M constraints), A_matrix takes ~8s on the current branch — the comparison script will show the before/after delta for the single-step A[clabels][:, vlabels] slicing.

Adds benchmark/scripts/benchmark_matrix_gen.py covering all four performance code paths: - PyPSA#616 cached_property on MatrixAccessor (flat_vars / flat_cons) - PyPSA#617 np.char.add label string concatenation - PyPSA#618 single-step sparse matrix slicing - PyPSA#619 numpy dense-array solution unpacking Reproduce with: python benchmark/scripts/benchmark_matrix_gen.py -o results.json python benchmark/scripts/benchmark_matrix_gen.py --include-solve # PR PyPSA#619 python benchmark/scripts/benchmark_matrix_gen.py --compare before.json after.json Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Adds benchmark/scripts/benchmark_matrix_gen.py covering all four performance code paths: - PyPSA#616 cached_property on MatrixAccessor (flat_vars / flat_cons) - PyPSA#617 np.char.add label string concatenation - PyPSA#618 single-step sparse matrix slicing - PyPSA#619 numpy dense-array solution unpacking Reproduce with: python benchmark/scripts/benchmark_matrix_gen.py -o results.json python benchmark/scripts/benchmark_matrix_gen.py --include-solve # PR PyPSA#619 path python benchmark/scripts/benchmark_matrix_gen.py --compare before.json after.json Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Adds benchmark/scripts/benchmark_matrix_gen.py covering all four performance code paths: - PyPSA#616 cached_property on MatrixAccessor (flat_vars / flat_cons) - PyPSA#617 np.char.add label string concatenation - PyPSA#618 single-step sparse matrix slicing - PyPSA#619 numpy dense-array solution unpacking Reproduce with: python benchmark/scripts/benchmark_matrix_gen.py -o results.json python benchmark/scripts/benchmark_matrix_gen.py --include-solve # PR PyPSA#619 python benchmark/scripts/benchmark_matrix_gen.py --compare before.json after.json Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

for more information, see https://pre-commit.ci

FBumann · 2026-03-18T08:50:45Z

@MaykThewessen Your benchmark doesn't really isolate the influence of you change.
Please try to write focused benchmarks for such small changes.
From my investigation, only measuring the actual slicing call (single line), i see no improvement at all.

With contiguous [0..N] labels (no gaps, no permutation), A[rows][:, cols] and A[np.ix_(rows, cols)] do the same work. The double-slice creates one intermediate, but for an identity index on a CSC matrix, both paths are equally fast.

If you provide more concise evidence of this actually speeding up i can reopen the PR of course

MaykThewessen · 2026-03-18T09:23:19Z

Benchmark Results: master vs PR #618

Tested on actual linopy implementation using PyPSA SciGrid-DE. Each phase calls the real model.matrices properties — the code path solvers use. Also includes end-to-end model.solve() with HiGHS.

Setup: Python 3.14.3, numpy 2.4.3, Apple M-series (arm64), macOS, 5 repeats (best-of).

Matrix Generation

Snapshots	Phase	master (s)	PR-618 (s)	Speedup
24	flat_vars	0.0055	0.0048	1.15x
24	flat_cons	0.1510	0.1494	1.01x
24	A_matrix	0.1649	0.1601	1.03x
24	full_matrix_pipeline	0.3494	0.3261	1.07x
100	flat_cons	0.7305	0.5792	1.26x
100	A_matrix	0.7467	0.6113	1.22x
100	full_matrix_pipeline	2.0155	1.3685	1.47x
200	flat_cons	2.2360	1.4564	1.54x
200	A_matrix	1.7033	1.2884	1.32x
200	full_matrix_pipeline	3.2036	2.7606	1.16x
500	flat_cons	5.9889	5.3091	1.13x
500	A_matrix	5.3982	5.5722	0.97x
500	full_matrix_pipeline	11.6722	11.7939	0.99x

End-to-End Solve (HiGHS direct)

Snapshots	Phase	master (s)	PR-618 (s)	Speedup
24	model.solve() end-to-end	4.0473	3.6732	1.10x
24	re-solve (warm model)	3.3517	2.9869	1.12x
100	model.solve() end-to-end	15.5674	14.4089	1.08x
100	re-solve (warm model)	14.3956	13.6016	1.06x

Summary: The single-step sparse slicing shows 1.1–1.5x improvement on matrix generation at medium sizes, and is the only PR that shows measurable end-to-end solve improvement (1.06–1.12x). The A_matrix phase benefit flattens at 500 snapshots where the sparse matrix itself dominates.

Benchmark methodology

Each phase calls the actual model.matrices property (e.g., matrices.A, matrices.flat_cons)
model.solve() calls the real linopy solve path with HiGHS direct API
Cache cleared with matrices.clean_cached_properties() before each measurement
5 repeats per measurement, best-of-5 reported
GC disabled during timing, collected between repeats
Benchmark script: benchmark/scripts/benchmark_actual.py

This was referenced Mar 14, 2026

chore: benchmarks #567

Open

perf: replace np.vectorize with vectorized string ops for label names #617

Closed

FBumann added the performance label Mar 17, 2026

MaykThewessen mentioned this pull request Mar 17, 2026

perf: cache MatrixAccessor properties to avoid redundant recomputation #616

Open

4 tasks

MaykThewessen mentioned this pull request Mar 17, 2026

perf: use numpy array lookup for solution unpacking #619

Open

4 tasks

MaykThewessen force-pushed the perf/single-step-sparse-slicing branch from 9a82ea3 to 7e6c586 Compare March 17, 2026 20:38

MaykThewessen force-pushed the perf/single-step-sparse-slicing branch from a3cd22c to 0a79b2a Compare March 17, 2026 20:57

MaykThewessen force-pushed the perf/single-step-sparse-slicing branch from 874dbe0 to 0a79b2a Compare March 17, 2026 21:13

[pre-commit.ci] auto fixes from pre-commit.com hooks

009013b

for more information, see https://pre-commit.ci

FBumann closed this Mar 18, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf: use single-step sparse matrix slicing in MatrixAccessor#618

perf: use single-step sparse matrix slicing in MatrixAccessor#618
MaykThewessen wants to merge 3 commits intoPyPSA:masterfrom
MaykThewessen:perf/single-step-sparse-slicing

MaykThewessen commented Mar 13, 2026

Uh oh!

MaykThewessen commented Mar 17, 2026

Uh oh!

FBumann commented Mar 18, 2026

Uh oh!

MaykThewessen commented Mar 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

MaykThewessen commented Mar 13, 2026

Summary

Motivation

Context

Test plan

Uh oh!

MaykThewessen commented Mar 17, 2026

Uh oh!

FBumann commented Mar 18, 2026

Uh oh!

MaykThewessen commented Mar 18, 2026

Benchmark Results: master vs PR #618

Matrix Generation

End-to-End Solve (HiGHS direct)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants