Two-Stage Method

The two-stage approach trains a regression model \(\hat{\mathbf{c}} = g(\mathbf{x}; \boldsymbol{\theta})\) by minimizing a prediction error such as the mean squared error \(l_{MSE}(\hat{\mathbf{c}}, \mathbf{c}) = \frac{1}{n} \sum_{i=1}^{n} \| \hat{\mathbf{c}}_i - \mathbf{c}_i \|^2\), without reference to the downstream optimization problem. At inference time, the model first predicts the costs \(\hat{\mathbf{c}}\) from the features \(\mathbf{x}\), and the predicted costs are then used to solve the optimization problem.
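As a minimal sketch of the two stages, the following pure-Python example fits one least-squares line per cost component on a single feature (stage one), then solves a toy optimization problem, picking the cheapest of three items, with the predicted costs (stage two). The helper names (`fit_line`, `mse`, `solve`, `predict`) and the toy data are assumptions made for illustration and are not part of PyEPO.

```python
def fit_line(xs, ys):
    """Closed-form least squares for y ~ a * x + b on a single feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def mse(c_hat, c):
    """Mean over samples of the squared Euclidean error between cost vectors."""
    total = sum(
        sum((p - t) ** 2 for p, t in zip(pred, true))
        for pred, true in zip(c_hat, c)
    )
    return total / len(c)


def solve(cost):
    """Toy optimization problem: choose the single item with minimum cost."""
    return min(range(len(cost)), key=cost.__getitem__)


# toy data (assumed): one feature, three cost components linear in x
xs = [0.0, 1.0, 2.0, 3.0]
cs = [[1.0 + 2.0 * x, 4.0 - x, 2.5] for x in xs]

# stage 1: fit one regressor per cost component by minimizing squared error
params = [fit_line(xs, [c[j] for c in cs]) for j in range(3)]


def predict(x):
    """Predicted cost vector for feature value x."""
    return [a * x + b for a, b in params]


train_error = mse([predict(x) for x in xs], cs)

# stage 2: plug the predicted costs into the optimization problem
x_new = 2.0
c_hat = predict(x_new)
decision = solve(c_hat)
```

Note that the regression in stage one never sees the optimization problem: the solver is only called once the costs have been predicted.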

pyepo.twostage.sklearnPred(pmodel: BaseEstimator) → MultiOutputRegressor

Wrap a scikit-learn estimator into a multi-output regressor for two-stage baselines.

The two-stage approach trains a regression model to minimize prediction error (e.g. MSE on cost coefficients) and only afterwards plugs the predicted costs into the optimization solver. This helper turns any single-output scikit-learn estimator (LinearRegression, RandomForestRegressor, MLPRegressor, …) into a multi-output regressor suitable for predicting the full cost vector.

Parameters:

pmodel – a scikit-learn single-output regression estimator

Returns:

scikit-learn multi-output regression wrapper

Return type:

MultiOutputRegressor
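To illustrate what a multi-output wrapper does: scikit-learn's MultiOutputRegressor fits one independent copy of the base estimator per target column. The sketch below mimics that idea in plain Python with a hypothetical `MeanRegressor` and a hypothetical `PerTargetWrapper`. It is a conceptual illustration under those assumptions, not the implementation of `sklearnPred` or of scikit-learn.

```python
import copy


class MeanRegressor:
    """Hypothetical single-output estimator: always predicts the training mean."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class PerTargetWrapper:
    """Hypothetical wrapper: fit one independent copy of `base` per target column."""

    def __init__(self, base):
        self.base = base

    def fit(self, X, Y):
        n_targets = len(Y[0])
        # one deep copy of the base estimator per cost component
        self.estimators_ = [
            copy.deepcopy(self.base).fit(X, [row[j] for row in Y])
            for j in range(n_targets)
        ]
        return self

    def predict(self, X):
        # each estimator predicts one column; transpose back to one row per sample
        cols = [est.predict(X) for est in self.estimators_]
        return [list(row) for row in zip(*cols)]


# toy data (assumed): three samples, cost vectors with two components
X = [[0.0], [1.0], [2.0]]
C = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]

wrapped = PerTargetWrapper(MeanRegressor()).fit(X, C)
c_pred = wrapped.predict(X)
```

Because each cost component gets its own estimator, the components are predicted independently of one another.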

The following example wraps a LinearRegression estimator with pyepo.twostage.sklearnPred, fits it on synthetic shortest-path data, and predicts the cost vectors:

import pyepo

# model for shortest path
grid = (5,5) # grid size
model = pyepo.model.grb.shortestPathModel(grid)

# generate data
num_data = 1000 # number of samples
num_feat = 5 # number of features
deg = 4 # polynomial degree
noise_width = 0 # noise width
x, c = pyepo.data.shortestpath.genData(num_data, num_feat, grid, deg, noise_width, seed=135)

# sklearn regressor
from sklearn.linear_model import LinearRegression
reg = LinearRegression() # linear regression

# build model
twostage_model = pyepo.twostage.sklearnPred(reg)

# training
twostage_model.fit(x, c)

# prediction
c_pred = twostage_model.predict(x)