pyepo.data.dataset

optDataset class based on PyTorch Dataset

Attributes

Classes

optDataset

PyTorch Dataset for predict-then-optimize problems.

optDatasetKNN

PyTorch Dataset for the kNN-robust decision-focused loss.

optDatasetConstrs

PyTorch Dataset for the CaVE cone-aligned loss.

Functions

collate_tight_constraints(batch)

Collate function for optDatasetConstrs batches.

Module Contents

pyepo.data.dataset.optMpaxModel = None
pyepo.data.dataset.logger
class pyepo.data.dataset.optDataset(model: pyepo.model.opt.optModel, feats: numpy.ndarray | torch.Tensor, costs: numpy.ndarray | torch.Tensor)

Bases: torch.utils.data.Dataset

PyTorch Dataset for predict-then-optimize problems.

At construction time it solves the optimization problem for every cost vector and caches the optimal solution \(\mathbf{w}^*(\mathbf{c})\) and objective value \(z^*(\mathbf{c})\). Because these labels are precomputed, the training loop never has to call the solver to obtain them, which makes optDataset the standard input format for end-to-end training in PyEPO. When labels are already available from another source, optDataset can be skipped and batches can be fed directly to pyepo.func modules.
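The precompute-and-cache pattern can be illustrated with a minimal sketch. This is not PyEPO's implementation: `ToyCachedDataset` and `solve_pick_one` are hypothetical names, the toy problem (pick the single cheapest item) stands in for a real optModel, and NumPy arrays stand in for tensors.

```python
import numpy as np


def solve_pick_one(cost):
    """Toy solver (hypothetical): choose the single cheapest item.

    Returns the one-hot optimal solution w*(c) and the objective z*(c).
    """
    sol = np.zeros_like(cost, dtype=float)
    sol[np.argmin(cost)] = 1.0
    return sol, float(cost @ sol)


class ToyCachedDataset:
    """Map-style dataset that solves every instance once, at construction."""

    def __init__(self, feats, costs):
        self.feats = np.asarray(feats, dtype=float)
        self.costs = np.asarray(costs, dtype=float)
        # Solve each instance up front so the labels are ready before training.
        solved = [solve_pick_one(c) for c in self.costs]
        self.sols = np.stack([s for s, _ in solved])
        self.objs = np.array([[z] for _, z in solved])

    def __len__(self):
        return len(self.costs)

    def __getitem__(self, idx):
        # (x, c, w, z) ordering: features, costs, cached solution, cached objective.
        return self.feats[idx], self.costs[idx], self.sols[idx], self.objs[idx]


feats = np.array([[0.1, 0.2], [0.3, 0.4]])
costs = np.array([[3.0, 1.0, 2.0], [0.5, 4.0, 6.0]])
dataset = ToyCachedDataset(feats, costs)
x, c, w, z = dataset[0]
```

Indexing the dataset only reads cached arrays, so the cost of solving is paid once per instance rather than once per epoch.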

Variables:
model
feats
costs
sols
objs
class pyepo.data.dataset.optDatasetKNN(model: pyepo.model.opt.optModel, feats: numpy.ndarray | torch.Tensor, costs: numpy.ndarray | torch.Tensor, k: int = 10, weight: float = 0.5)

Bases: optDataset

PyTorch Dataset for the kNN-robust decision-focused loss.

For each instance the cost vector is replaced with a convex combination of its own costs and those of its k nearest neighbours in feature space, where weight is the share given to the instance's own costs, and the optimization problem is solved on the smoothed costs. The mean kNN solutions and objective values are cached for training, providing a robust supervision signal under noisy or out-of-distribution feature observations.

Reference: Schutte et al. (2023) https://arxiv.org/abs/2310.04328
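The smoothing step can be sketched in NumPy as follows. Several details are assumptions rather than facts from this page: that neighbours are found by Euclidean distance, that an instance may count as its own nearest neighbour, and that each neighbour j yields one smoothed vector of the form weight * c_i + (1 - weight) * c_j. The helper name `knn_smoothed_costs` is hypothetical.

```python
import numpy as np


def knn_smoothed_costs(feats, costs, k=10, weight=0.5):
    """Return an array of shape (n, k, d): one smoothed cost vector per neighbour.

    Assumed form (not confirmed by the source):
        smoothed[i, j] = weight * costs[i] + (1 - weight) * costs[neighbour_j(i)]
    """
    feats = np.asarray(feats, dtype=float)
    costs = np.asarray(costs, dtype=float)
    # Pairwise Euclidean distances between feature vectors, shape (n, n).
    diffs = feats[:, None, :] - feats[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Indices of the k closest instances (distance 0 puts each instance first).
    neighbours = np.argsort(dists, axis=1, kind="stable")[:, :k]
    return weight * costs[:, None, :] + (1.0 - weight) * costs[neighbours]


feats = np.array([[0.0], [1.0], [10.0]])
costs = np.array([[1.0, 2.0], [3.0, 4.0], [100.0, 200.0]])
smoothed = knn_smoothed_costs(feats, costs, k=2, weight=0.5)
```

With weight=1.0 every smoothed vector equals the original cost vector, which matches the "1.0 = no smoothing" note on the weight parameter. Under these assumptions, solving each smoothed vector and averaging over the k results would give the cached mean solutions and objective values.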

Variables:
  • model (optModel) – Optimization model

  • k (int) – number of nearest neighbours selected

  • weight (float) – self-weight in the kNN convex combination (1.0 = no smoothing)

  • feats (torch.Tensor) – Data features

  • costs (torch.Tensor) – kNN-smoothed cost vectors

  • sols (torch.Tensor) – Mean kNN optimal solutions

  • objs (torch.Tensor) – Mean kNN optimal objective values

model
k = 10
weight = 0.5
feats
costs
sols
objs
class pyepo.data.dataset.optDatasetConstrs(model: pyepo.model.opt.optModel, feats: numpy.ndarray | torch.Tensor, costs: numpy.ndarray | torch.Tensor, skip_infeas: bool = False)

Bases: optDataset

PyTorch Dataset for the CaVE cone-aligned loss.

Stores features and cost coefficients, solves each instance with a fresh copy of the Gurobi model, and extracts the normals of the binding constraints at the optimal vertex in canonical <= orientation. These normals span the polyhedral cone onto which coneAlignedCosine projects the predicted cost vector during training.

CaVE is defined for binary linear programs only, so the optimal vertex must be binary; instances that are infeasible or have non-binary optima raise an error (or are skipped when skip_infeas=True). Binding-constraint extraction uses Gurobi’s sparse-matrix API, which is why this dataset currently requires a Gurobi-backed optModel.

Per-instance row counts differ (different constraints bind at different vertices), so batches must be assembled with collate_tight_constraints.

Reference: Tang & Khalil (2024) https://link.springer.com/chapter/10.1007/978-3-031-60599-4_12
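To make "binding constraints at the optimal vertex" concrete, the sketch below selects the rows of a dense system A w <= b that hold with equality at a given binary vertex. It is a NumPy illustration only, not the Gurobi sparse-matrix extraction this class uses; the helper `tight_constraint_normals` and its tolerance `tol` are hypothetical.

```python
import numpy as np


def tight_constraint_normals(A, b, w, tol=1e-7):
    """Return the rows of A whose constraints A @ w <= b are binding at w."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    w = np.asarray(w, dtype=float)
    slack = b - A @ w
    if np.any(slack < -tol):
        raise ValueError("w violates at least one constraint")
    # A constraint is binding when its slack is (numerically) zero.
    return A[np.abs(slack) <= tol]


# Unit square 0 <= w <= 1 written in <= form: w <= 1 and -w <= 0.
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
b = np.array([1.0, 1.0, 0.0, 0.0])
w_star = np.array([1.0, 0.0])  # a binary vertex

normals = tight_constraint_normals(A, b, w_star)
```

At w_star the binding rows are w_1 <= 1 and -w_2 <= 0, so `normals` holds those two rows of A. In general the number of binding rows varies from one instance to the next, which is why the per-instance matrices are ragged.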

Variables:
model
skip_infeas = False
feats
costs
sols
objs
ctrs
pyepo.data.dataset.collate_tight_constraints(batch)

Collate function for optDatasetConstrs batches.

Stacks the standard (x, c, w, z) tensors and zero-pads the ragged per-instance binding-constraint matrices to a common row count so they can be assembled into a single (batch, max_rows, num_cost) tensor for coneAlignedCosine.
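The zero-padding step can be sketched as below. This is a NumPy illustration of the idea rather than the library's code: `pad_tight_constraints` is a hypothetical helper, and the real collate function also stacks the (x, c, w, z) tensors and returns torch tensors.

```python
import numpy as np


def pad_tight_constraints(ctrs):
    """Zero-pad a list of (rows_i, num_cost) matrices to (batch, max_rows, num_cost)."""
    max_rows = max(c.shape[0] for c in ctrs)
    num_cost = ctrs[0].shape[1]
    padded = np.zeros((len(ctrs), max_rows, num_cost))
    for i, c in enumerate(ctrs):
        # Copy the real rows; any remaining rows stay zero.
        padded[i, : c.shape[0]] = c
    return padded


ctrs = [
    np.array([[1.0, 0.0, 0.0]]),                    # 1 binding row
    np.array([[0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]),  # 2 binding rows
]
batch_ctrs = pad_tight_constraints(ctrs)
```

Here `batch_ctrs` has shape (2, 2, 3), and the first instance gains one all-zero row of padding.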