API Docs
deeprenewal.deeprenewal._estimator.DeepRenewalEstimator
Construct a DeepRenewal estimator.
This implements an RNN-based model, close to the one described in https://arxiv.org/abs/1911.10416.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
freq | str | Frequency of the data to train on and predict. | required |
prediction_length | int | Length of the prediction horizon. | required |
trainer | Optional[Trainer] | Trainer object to be used. | Trainer() |
context_length | Optional[int] | Number of steps to unroll the RNN for before computing predictions. If None, context_length = prediction_length. | None |
num_layers | Optional[int] | Number of RNN layers. | 2 |
num_cells | Optional[int] | Number of RNN cells for each layer. | 40 |
cell_type | Optional[str] | Type of recurrent cells to use (available: 'lstm' or 'gru'). | "lstm" |
dropout_rate | Optional[float] | Dropout regularization parameter. | 0.1 |
use_feat_dynamic_real | Optional[bool] | Whether to use the | required |
use_feat_static_cat | Optional[bool] | Whether to use the | required |
use_feat_static_real | Optional[bool] | Whether to use the | required |
cardinality | Optional[List[int]] | Number of values of each categorical feature. This must be set if | required |
embedding_dimension | Optional[List[int]] | Dimension of the embeddings for categorical features. If None, [min(50, (cat+1)//2) for cat in cardinality] is used. | None |
scaling | Optional[bool] | Whether to automatically scale the target values. | True |
lags_seq | Optional[List[int]] | Indices of the lagged target values to use as inputs of the RNN. If None, these are determined automatically based on freq. | None |
time_features | Optional[List[TimeFeature]] | Time features to use as inputs of the RNN. If None, these are determined automatically based on freq. | None |
num_parallel_samples | Optional[int] | Number of evaluation samples per time series to increase parallelism during inference. This is a model optimization that does not affect the accuracy. | 100 |
forecast_type | Optional[str] | The model outputs M and Q; this option determines how those parameters are translated into a regular time series forecast. flat: a flat forecast of M/Q over the prediction length. exact: uses M and Q to produce a forecast of Q - 1 zeros followed by M, repeated. hybrid: uses M and Q to create flat forecasts, but M/Q changes with each subsequent prediction. Example: the model is trained with prediction length = 5 and outputs M = 22, 33, 12 and Q = 2, 1, 4. Then flat gives 11, 11, 11, 11, 11; exact gives 0, 22, 33, 0, 0; hybrid gives 11, 11, 33, 3, 3 (22/2, 22/2, 33/1, 12/4, 12/4). | "flat" |
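The forecast_type translation can be illustrated with a small standalone sketch. The helper `decode_renewal` is hypothetical and is not the library's implementation; it only reproduces the worked example for M = 22, 33, 12 and Q = 2, 1, 4, assuming that M holds demand sizes, Q holds inter-demand intervals, the flat forecast uses the first (M, Q) pair, and anything beyond the prediction length is dropped.

```python
import numpy as np


def decode_renewal(m, q, prediction_length, forecast_type="flat"):
    """Hypothetical helper: turn (M, Q) pairs into a per-period forecast."""
    if forecast_type == "flat":
        # Assumption: one constant rate taken from the first (M, Q) pair.
        out = [m[0] / q[0]] * prediction_length
    elif forecast_type == "exact":
        # Q - 1 empty periods followed by a demand of size M, repeated.
        out = []
        for m_i, q_i in zip(m, q):
            out.extend([0.0] * (q_i - 1) + [float(m_i)])
    elif forecast_type == "hybrid":
        # Each pair contributes Q periods at the rate M / Q.
        out = []
        for m_i, q_i in zip(m, q):
            out.extend([m_i / q_i] * q_i)
    else:
        raise ValueError(f"unknown forecast_type: {forecast_type}")
    # Assumption: values past the horizon are simply truncated.
    return np.asarray(out[:prediction_length], dtype=float)


M, Q = [22, 33, 12], [2, 1, 4]
for kind in ("flat", "exact", "hybrid"):
    print(kind, decode_renewal(M, Q, 5, kind))
```

The three printed rows should match the flat, exact and hybrid sequences given in the forecast_type description.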
create_predictor(self, transformation, trained_network)
Create and return a predictor object.
Returns
Predictor A predictor wrapping a HybridBlock used for inference.
Source code in deeprenewal/deeprenewal/_estimator.py
def create_predictor(
    self, transformation: Transformation, trained_network: HybridBlock
) -> Predictor:
    prediction_network = DeepRenewalPredictionNetwork(
        num_parallel_samples=self.num_parallel_samples,
        num_layers=self.num_layers,
        num_cells=self.num_cells,
        cell_type=self.cell_type,
        history_length=self.history_length,
        context_length=self.context_length,
        prediction_length=self.prediction_length,
        distr_output_m=self.distr_output_m,
        distr_output_q=self.distr_output_q,
        dropout_rate=self.dropout_rate,
        cardinality=self.cardinality,
        embedding_dimension=self.embedding_dimension,
        lags_seq=self.lags_seq,
        scaling=self.scaling,
        dtype=self.dtype,
    )
    copy_parameters(trained_network, prediction_network)
    return RepresentableBlockPredictor(
        input_transform=transformation,
        prediction_net=prediction_network,
        batch_size=self.trainer.batch_size,
        freq=self.freq,
        prediction_length=self.prediction_length,
        ctx=self.trainer.ctx,
        dtype=self.dtype,
        forecast_generator=IntermittentSampleForecastGenerator(
            prediction_length=self.prediction_length,
            forecast_type=self.forecast_type,
        ),
    )
create_training_network(self)
Create and return the network used for training (i.e., computing the loss).
Returns
HybridBlock The network that computes the loss given input data.
Source code in deeprenewal/deeprenewal/_estimator.py
def create_training_network(self) -> DeepRenewalTrainingNetwork:
    return DeepRenewalTrainingNetwork(
        num_layers=self.num_layers,
        num_cells=self.num_cells,
        cell_type=self.cell_type,
        history_length=self.history_length,
        context_length=self.context_length,
        prediction_length=self.prediction_length,
        distr_output_m=self.distr_output_m,
        distr_output_q=self.distr_output_q,
        dropout_rate=self.dropout_rate,
        cardinality=self.cardinality,
        embedding_dimension=self.embedding_dimension,
        lags_seq=self.lags_seq,
        scaling=self.scaling,
        dtype=self.dtype,
    )
create_transformation(self)
Create and return the transformation needed for training and inference.
Returns
Transformation The transformation that will be applied entry-wise to datasets, at training and inference time.
Source code in deeprenewal/deeprenewal/_estimator.py
def create_transformation(self) -> Transformation:
    remove_field_names = [FieldName.FEAT_DYNAMIC_CAT]
    if not self.use_feat_static_real:
        remove_field_names.append(FieldName.FEAT_STATIC_REAL)
    if not self.use_feat_dynamic_real:
        remove_field_names.append(FieldName.FEAT_DYNAMIC_REAL)
    return Chain(
        [RemoveFields(field_names=remove_field_names)]
        + (
            [SetField(output_field=FieldName.FEAT_STATIC_CAT, value=[0.0])]
            if not self.use_feat_static_cat
            else []
        )
        + (
            [SetField(output_field=FieldName.FEAT_STATIC_REAL, value=[0.0])]
            if not self.use_feat_static_real
            else []
        )
        + [
            AsNumpyArray(
                field=FieldName.FEAT_STATIC_CAT,
                expected_ndim=1,
                dtype=self.dtype,
            ),
            AsNumpyArray(
                field=FieldName.FEAT_STATIC_REAL,
                expected_ndim=1,
                dtype=self.dtype,
            ),
            AsNumpyArray(
                field=FieldName.TARGET,
                # in the following line, we add 1 for the time dimension
                expected_ndim=1 + len(self.distr_output_m.event_shape),
                dtype=self.dtype,
            ),
            AddObservedValuesIndicator(
                target_field=FieldName.TARGET,
                output_field=FieldName.OBSERVED_VALUES,
                dtype=self.dtype,
            ),
            AddTimeFeatures(
                start_field=FieldName.START,
                target_field=FieldName.TARGET,
                output_field=FieldName.FEAT_TIME,
                time_features=self.time_features,
                pred_length=self.prediction_length,
            ),
            AddInterDemandPeriodFeature(
                start_field=FieldName.START,
                target_field=FieldName.TARGET,
                output_field=FieldName.TARGET,  # FieldName.FEAT_TIME FieldName.TARGET #if we want to append to feat time,
                pred_length=self.prediction_length,
            ),
            AddAgeFeature(
                target_field=FieldName.TARGET,
                output_field=FieldName.FEAT_AGE,
                pred_length=self.prediction_length,
                log_scale=True,
                dtype=self.dtype,
            ),
            VstackFeatures(
                output_field=FieldName.FEAT_TIME,
                input_fields=[FieldName.FEAT_TIME, FieldName.FEAT_AGE]
                + (
                    [FieldName.FEAT_DYNAMIC_REAL]
                    if self.use_feat_dynamic_real
                    else []
                ),
            ),
            # DropNonZeroTarget(
            #     input_fields=[FieldName.FEAT_TIME, FieldName.OBSERVED_VALUES],
            #     target_field=FieldName.TARGET,
            #     pred_length=self.prediction_length,
            # ),
            RenewalInstanceSplitter(
                target_field=FieldName.TARGET,
                is_pad_field=FieldName.IS_PAD,
                start_field=FieldName.START,
                forecast_start_field=FieldName.FORECAST_START,
                train_sampler=ExpectedNumInstanceSampler(num_instances=1),
                past_length=self.history_length,
                future_length=self.prediction_length,
                time_series_fields=[FieldName.FEAT_TIME, FieldName.OBSERVED_VALUES],
                # dummy_value=self.distr_output_m.value_in_support,
                # pick_incomplete=False
            ),
        ]
    )
deeprenewal._evaluator.IntermittentEvaluator
An Evaluator that implements metrics better suited to intermittent demand patterns.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
quantiles |  | List of strings of the form 'p10' or floats in [0, 1] with the quantile levels. | required |
seasonality |  | Seasonality to use for seasonal_error. If nothing is passed, uses the default seasonality for the given series frequency as returned by | required |
alpha |  | Parameter of the MSIS metric from the M4 competition that defines the confidence interval. For alpha=0.05 (default), the 95% confidence interval is considered in the metric. See https://www.m4.unic.ac.cy/wp-content/uploads/2018/03/M4-Competitors-Guide.pdf for more detail on MSIS. | 0.05 |
calculate_owa |  | Determines whether the OWA metric should also be calculated. It is computationally expensive to evaluate and slows down the evaluation process considerably. | False |
calculate_spec |  | Determines whether the SPEC metric should also be calculated. It is computationally expensive to evaluate and slows down the evaluation process considerably. | False |
median |  | Determines whether to use the median (rather than the mean) for point estimation. | True |
round_integer |  | Determines whether to round the forecasts to the nearest integer before evaluating them. | True |
num_workers |  | The number of multiprocessing workers that will be used to process the data in parallel. Setting it to 0 means no multiprocessing. | multiprocessing.cpu_count() |
chunk_size |  | Controls the approximate chunk size each worker handles at a time. | 32 |
cum_error(target, forecast)
staticmethod
Cumulative Error
Parameters:
Name | Type | Description | Default |
---|---|---|---|
target | Union[pandas.core.series.Series, numpy.ndarray] | Ground Truth | required |
forecast | Union[pandas.core.series.Series, numpy.ndarray] | Forecast | required |
Returns:
Type | Description |
---|---|
float | cumulative error |
Source code in deeprenewal/_evaluator.py
@staticmethod
def cum_error(target: Union[pd.Series, np.ndarray], forecast: Union[pd.Series, np.ndarray]):
    """Cumulative Error
    Args:
        target (Union[pd.Series, np.ndarray]): Ground Truth
        forecast (Union[pd.Series, np.ndarray]): Forecast
    Returns:
        float: cumulative error
    """
    return np.cumsum(target - forecast)
maape(target, forecast)
staticmethod
Mean Arctangent Absolute Percent Error is a variant of MAPE which is not affected by division by zero. https://www.sciencedirect.com/science/article/pii/S0169207016000121
Parameters:
Name | Type | Description | Default |
---|---|---|---|
target | Union[pandas.core.series.Series, numpy.ndarray] | Ground Truth | required |
forecast | Union[pandas.core.series.Series, numpy.ndarray] | Forecast | required |
Returns:
Type | Description |
---|---|
float | MAAPE |
Source code in deeprenewal/_evaluator.py
@staticmethod
def maape(target: Union[pd.Series, np.ndarray], forecast: Union[pd.Series, np.ndarray]):
    """Mean Arctangent Absolute Percent Error is a variant of MAPE which is not affected by division by zero.
    https://www.sciencedirect.com/science/article/pii/S0169207016000121
    Args:
        target (Union[pd.Series, np.ndarray]): Ground Truth
        forecast (Union[pd.Series, np.ndarray]): Forecast
    Returns:
        float: MAAPE
    """
    ape = np.zeros_like(target, dtype="float")
    mask = np.logical_not((target == 0) & (forecast == 0))
    ape = np.divide(np.abs(target - forecast), target, out=ape, where=mask)
    return np.mean(np.arctan(ape))
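To see why the arctangent keeps the metric bounded, here is a minimal standalone sketch. `maape_sketch` is a hypothetical name, not part of the library; unlike the listing it sets the unbounded case to infinity explicitly instead of relying on NumPy's division behaviour, and it uses the absolute value of the target in the denominator, which is identical for non-negative demand.

```python
import numpy as np


def maape_sketch(target, forecast):
    """Hypothetical re-statement of MAAPE for illustration only."""
    target = np.asarray(target, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    abs_err = np.abs(target - forecast)
    ape = np.zeros_like(target)  # both zero: the error stays 0
    nonzero = target != 0
    ape[nonzero] = abs_err[nonzero] / np.abs(target[nonzero])
    # Zero demand with a non-zero forecast: the percentage error is unbounded,
    # but arctan maps it to pi / 2.
    ape[(~nonzero) & (abs_err > 0)] = np.inf
    return float(np.mean(np.arctan(ape)))


value = maape_sketch([0, 2, 4, 0], [0, 1, 6, 3])
print(value)
```

Each period contributes at most pi / 2, so a single zero-demand period cannot dominate the average the way it does with plain MAPE.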
mae(target, forecast)
staticmethod
Calculates the Mean Absolute Error (mean(abs(ground_truth - forecast)))
Parameters:
Name | Type | Description | Default |
---|---|---|---|
target | Union[pandas.core.series.Series, numpy.ndarray] | Ground Truth | required |
forecast | Union[pandas.core.series.Series, numpy.ndarray] | Forecast | required |
Returns:
Type | Description |
---|---|
float | MAE |
Source code in deeprenewal/_evaluator.py
@staticmethod
def mae(target: Union[pd.Series, np.ndarray], forecast: Union[pd.Series, np.ndarray]):
    """Calculates the Mean Absolute Error (mean(abs(ground_truth - forecast)))
    Args:
        target (Union[pd.Series, np.ndarray]): Ground Truth
        forecast (Union[pd.Series, np.ndarray]): Forecast
    Returns:
        float: MAE
    """
    return np.mean(np.abs(target - forecast))
mean_relative_abs_error(target, forecast, naive_fc)
staticmethod
Mean Relative Absolute Error = mean(Absolute Error / Abs(Ground Truth - ReferenceFC))
Parameters:
Name | Type | Description | Default |
---|---|---|---|
target | Union[pandas.core.series.Series, numpy.ndarray] | Ground Truth | required |
forecast | Union[pandas.core.series.Series, numpy.ndarray] | Forecast | required |
naive_fc | Union[pandas.core.series.Series, numpy.ndarray] | Reference forecast, most commonly naive forecast | required |
Returns:
Type | Description |
---|---|
float | MRAE |
Source code in deeprenewal/_evaluator.py
@staticmethod
def mean_relative_abs_error(target: Union[pd.Series, np.ndarray], forecast: Union[pd.Series, np.ndarray], naive_fc: Union[pd.Series, np.ndarray]):
    """Mean Relative Absolute Error = mean(Absolute Error / Abs(Ground Truth - ReferenceFC))
    Args:
        target (Union[pd.Series, np.ndarray]): Ground Truth
        forecast (Union[pd.Series, np.ndarray]): Forecast
        naive_fc (Union[pd.Series, np.ndarray]): reference forecast, most commonly naive forecast
    Returns:
        float: MRAE
    """
    return np.mean(np.abs(target - forecast) / np.abs(target - naive_fc))
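As a quick illustration of the ratio, the following sketch recomputes the metric by hand on made-up numbers (the arrays are illustrative assumptions, not data from the library). Values below 1 mean the forecast's absolute errors are, on average, smaller than those of the reference forecast.

```python
import numpy as np

# Illustrative data only.
target = np.array([0.0, 4.0, 0.0, 2.0])
forecast = np.array([1.0, 3.0, 1.0, 1.0])
naive = np.array([1.0, 1.0, 1.0, 1.0])

abs_error = np.abs(target - forecast)      # [1, 1, 1, 1]
reference_error = np.abs(target - naive)   # [1, 3, 1, 1]
mrae = np.mean(abs_error / reference_error)
print(mrae)
```

Note that any period in which the reference forecast equals the target makes the denominator zero, so the ratio for that period is undefined or infinite.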
naive_fc(time_series, forecast)
staticmethod
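Builds a naive forecast by extending the last observed value before the forecast start over the forecast index.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
time_series | Union[pandas.core.series.Series, pandas.core.frame.DataFrame] | Observed series; its index must contain the forecast index | required |
forecast | Forecast | Forecast whose index defines the prediction dates | required |
Returns:
Type | Description |
---|---|
ndarray | Last observed value before the forecast start, repeated for each forecast date |
Source code in deeprenewal/_evaluator.py
@staticmethod
def naive_fc(
    time_series: Union[pd.Series, pd.DataFrame], forecast: Forecast
) -> np.ndarray:
    """
    Parameters:
    ----------
    time_series: observed series; its index must contain the forecast index
    forecast: forecast whose index defines the prediction dates
    Returns:
    -------
    np.ndarray
        last observed value before the forecast start, repeated for each forecast date
    """
    assert forecast.index.intersection(time_series.index).equals(forecast.index), (
        "Index of forecast is outside the index of target\n"
        f"Index of forecast: {forecast.index}\n Index of target: {time_series.index}"
    )
    # Extending the last demand to the prediction length
    date_before_forecast = forecast.index[0] - forecast.index[0].freq
    naive_fc = time_series.loc[date_before_forecast]
    if isinstance(naive_fc, pd.Series):
        naive_fc = naive_fc.values.item()
    return np.atleast_1d(
        np.squeeze(pd.Series(index=forecast.index, data=naive_fc).transpose())
    )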
nos_p(target, forecast)
staticmethod
Calculates the percentage of periods that are stockouts: the number of periods where CFE > 0 and Ground Truth > 0, divided by the total number of periods.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
target | Union[pandas.core.series.Series, numpy.ndarray] | Ground Truth | required |
forecast | Union[pandas.core.series.Series, numpy.ndarray] | Forecast | required |
Returns:
Type | Description |
---|---|
float | Percentage of periods that are stockouts |
Source code in deeprenewal/_evaluator.py
@staticmethod
def nos_p(target: Union[pd.Series, np.ndarray], forecast: Union[pd.Series, np.ndarray]):
    """Calculates the percent of periods which are stock out
    # of periods where CFE > 0 and Ground Truth >0 upon total
    # number of periods
    Args:
        target (Union[pd.Series, np.ndarray]): Ground Truth
        forecast (Union[pd.Series, np.ndarray]): Forecast
    Returns:
        float: Percentage of periods which is stockout
    """
    c = np.cumsum(target - forecast)
    mask = target > 0
    return np.sum(c[mask] > 0) / np.sum(mask)
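A small worked example may help in following the computation step by step. The arrays are made up for illustration, and the steps mirror the listing above, where the denominator is the count of periods with positive demand.

```python
import numpy as np

# Illustrative data only.
target = np.array([0, 2, 0, 4])
forecast = np.array([1, 1, 1, 2])

cfe = np.cumsum(target - forecast)            # cumulative forecast error
demand_periods = target > 0                   # periods with actual demand
stockouts = np.sum(cfe[demand_periods] > 0)   # demand periods with CFE > 0
nos_p = stockouts / np.sum(demand_periods)
print(cfe, nos_p)
```

Here the second demand period ends with more cumulative demand than cumulative forecast, so one of the two demand periods counts as a stockout.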
percent_better(target, forecast, naive_fc)
staticmethod
Percent Better finds the percent of instances where the MAE is better than reference MAE
Parameters:
Name | Type | Description | Default |
---|---|---|---|
target | Union[pandas.core.series.Series, numpy.ndarray] | Ground Truth | required |
forecast | Union[pandas.core.series.Series, numpy.ndarray] | Forecast | required |
naive_fc | Union[pandas.core.series.Series, numpy.ndarray] | Reference forecast, most commonly naive forecast | required |
Returns:
Type | Description |
---|---|
float | PBMAE |
Source code in deeprenewal/_evaluator.py
@staticmethod
def percent_better(target: Union[pd.Series, np.ndarray], forecast: Union[pd.Series, np.ndarray], naive_fc: Union[pd.Series, np.ndarray]):
    """Percent Better finds the percent of instances where the MAE is better than reference MAE
    Args:
        target (Union[pd.Series, np.ndarray]): Ground Truth
        forecast (Union[pd.Series, np.ndarray]): Forecast
        naive_fc (Union[pd.Series, np.ndarray]): Reference forecast, most commonly naive forecast
    Returns:
        float: PBMAE
    """
    mae = np.abs(target - forecast)
    mae_star = np.abs(target - naive_fc)
    pb = mae > mae_star
    return np.sum(pb) / len(pb)
pis(target, forecast)
staticmethod
Measures the total number of periods the forecasted item spends in stock or number of stock out periods. It can also be calculated as the cumulative sum of the CFE
Parameters:
Name | Type | Description | Default |
---|---|---|---|
target | Union[pandas.core.series.Series, numpy.ndarray] | Ground Truth | required |
forecast | Union[pandas.core.series.Series, numpy.ndarray] | Forecast | required |
Returns:
Type | Description |
---|---|
float | PIS |
Source code in deeprenewal/_evaluator.py
@staticmethod
def pis(target: Union[pd.Series, np.ndarray], forecast: Union[pd.Series, np.ndarray]):
    """Measures the total number of periods the forecasted item spends in
    stock or number of stock out periods. It can also be calculated as
    the cumulative sum of the CFE
    Args:
        target (Union[pd.Series, np.ndarray]): Ground Truth
        forecast (Union[pd.Series, np.ndarray]): Forecast
    Returns:
        float: PIS
    """
    cfe_t = np.cumsum(target - forecast)
    return np.sum(-1 * cfe_t)
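The sign convention is easiest to see on two toy cases. The sketch below uses a hypothetical helper `pis_sketch` that repeats the computation from the listing on made-up arrays: a forecast that runs ahead of demand yields a positive value (units spending periods in stock), while a forecast that misses demand yields a negative one (stockouts).

```python
import numpy as np


def pis_sketch(target, forecast):
    """Hypothetical helper: negative sum of the cumulative forecast error."""
    cfe = np.cumsum(np.asarray(target) - np.asarray(forecast))
    return int(np.sum(-1 * cfe))


# Illustrative data only.
target = [0, 2, 0, 4]
over = pis_sketch(target, [1, 1, 1, 2])    # CFE = [-1, 0, -1, 1]
under = pis_sketch(target, [0, 0, 0, 0])   # CFE = [0, 2, 2, 6]
print(over, under)
```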
signed_error(target, forecast)
staticmethod
Calculates the signed error (Ground Truth - Forecast)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
target | Union[pandas.core.series.Series, numpy.ndarray] | Ground Truth | required |
forecast | Union[pandas.core.series.Series, numpy.ndarray] | Forecast | required |
Returns:
Type | Description |
---|---|
float | The total signed error |
Source code in deeprenewal/_evaluator.py
@staticmethod
def signed_error(target: Union[pd.Series, np.ndarray], forecast: Union[pd.Series, np.ndarray]):
    """Calculates the signed error (Ground Truth - Forecast)
    Args:
        target (Union[pd.Series, np.ndarray]): Ground Truth
        forecast (Union[pd.Series, np.ndarray]): Forecast
    Returns:
        float: The total signed error
    """
    return np.sum(target - forecast)
spec(y_true, y_pred, a1=0.75, a2=0.25)
staticmethod
Stock-keeping-oriented Prediction Error Costs (SPEC)
Read more in https://arxiv.org/abs/2004.10537.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
y_true |  | array-like of shape (n_samples,) Ground truth (correct) target values. | required |
y_pred |  | array-like of shape (n_samples,) Estimated target values. | required |
a1 |  | opportunity costs weighting parameter a1 ∈ [0, ∞]. Default value is 0.75. | 0.75 |
a2 |  | stock-keeping costs weighting parameter a2 ∈ [0, ∞]. Default value is 0.25. | 0.25 |
Returns:
Type | Description |
---|---|
float | SPEC output is non-negative floating point. The best value is 0.0. |
Examples:
>>> from spec_metric import spec
>>> y_true = [0, 0, 5, 6, 0, 5, 0, 0, 0, 8, 0, 0, 6, 0]
>>> y_pred = [0, 0, 5, 6, 0, 5, 0, 0, 8, 0, 0, 0, 6, 0]
>>> spec(y_true, y_pred)
0.1428...
>>> spec(y_true, y_pred, a1=0.1, a2=0.9)
0.5142...
Source code in deeprenewal/_evaluator.py
@staticmethod
# https://github.com/DominikMartin/spec_metric/blob/master/spec_metric/_metric.py
def spec(y_true, y_pred, a1=0.75, a2=0.25):
    """Stock-keeping-oriented Prediction Error Costs (SPEC)
    Read more in the :ref:`https://arxiv.org/abs/2004.10537`.
    Parameters:
        y_true : array-like of shape (n_samples,)
            Ground truth (correct) target values.
        y_pred : array-like of shape (n_samples,)
            Estimated target values.
        a1 : opportunity costs weighting parameter
            a1 ∈ [0, ∞]. Default value is 0.75.
        a2 : stock-keeping costs weighting parameter
            a2 ∈ [0, ∞]. Default value is 0.25.
    Returns:
    -------
        loss : float
            SPEC output is non-negative floating point. The best value is 0.0.
    Examples:
    --------
    >>> from spec_metric import spec
    >>> y_true = [0, 0, 5, 6, 0, 5, 0, 0, 0, 8, 0, 0, 6, 0]
    >>> y_pred = [0, 0, 5, 6, 0, 5, 0, 0, 8, 0, 0, 0, 6, 0]
    >>> spec(y_true, y_pred)
    0.1428...
    >>> spec(y_true, y_pred, a1=0.1, a2=0.9)
    0.5142...
    """
    assert len(y_true) > 0 and len(y_pred) > 0
    assert len(y_true) == len(y_pred)
    sum_n = 0
    for t in range(1, len(y_true) + 1):
        sum_t = 0
        for i in range(1, t + 1):
            delta1 = np.sum([y_k for y_k in y_true[:i]]) - np.sum([f_j for f_j in y_pred[:t]])
            delta2 = np.sum([f_k for f_k in y_pred[:i]]) - np.sum([y_j for y_j in y_true[:t]])
            sum_t = sum_t + np.max([0, a1 * np.min([y_true[i - 1], delta1]), a2 * np.min([y_pred[i - 1], delta2])]) * (
                t - i + 1)
        sum_n = sum_n + sum_t
    return sum_n / len(y_true)
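The double loop above recomputes prefix sums on every iteration. As a sketch of the same formula, the hypothetical `spec_sketch` below precomputes the cumulative sums once; it is an illustrative restatement under the assumption of 1-D numeric inputs, not the library's code. In the docstring example the forecast places the demand of 8 one period early, so the only cost is one period of stock-keeping: a2 * 8 / 14.

```python
import numpy as np


def spec_sketch(y_true, y_pred, a1=0.75, a2=0.25):
    """Hypothetical restatement of SPEC using precomputed cumulative sums."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    assert len(y_true) > 0 and len(y_true) == len(y_pred)
    cum_true = np.cumsum(y_true)
    cum_pred = np.cumsum(y_pred)
    n = len(y_true)
    total = 0.0
    for t in range(n):
        for i in range(t + 1):
            # Demand up to i not yet covered by forecasts up to t.
            opportunity = a1 * min(y_true[i], cum_true[i] - cum_pred[t])
            # Forecast up to i not yet consumed by demand up to t.
            stock = a2 * min(y_pred[i], cum_pred[i] - cum_true[t])
            # Costs are weighted by how many periods they have persisted.
            total += max(0.0, opportunity, stock) * (t - i + 1)
    return total / n


y_true = [0, 0, 5, 6, 0, 5, 0, 0, 0, 8, 0, 0, 6, 0]
y_pred = [0, 0, 5, 6, 0, 5, 0, 0, 8, 0, 0, 0, 6, 0]
print(spec_sketch(y_true, y_pred))
print(spec_sketch(y_true, y_pred, a1=0.1, a2=0.9))
```

Both calls should agree with the values quoted in the docstring example (0.1428... and 0.5142...).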
deeprenewal._datasets.get_dataset(dataset_name, path=None, regenerate=False)
Get the repository dataset. Currently only the Retail Dataset is available.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset_name | str | Name of the dataset, for instance "retail". | required |
regenerate | bool | Whether to regenerate the dataset even if a local file is present. If this flag is False and the file is present, the dataset will not be downloaded again. | False |
path | Optional[pathlib.Path] | Where the dataset should be saved. | None |
Returns:
Type | Description |
---|---|
TrainDatasets | Dataset obtained by either downloading or reloading from a local file. |
Source code in deeprenewal/_datasets.py
def get_dataset(
    dataset_name: str,
    path: Optional[Path] = None,
    regenerate: bool = False,
) -> TrainDatasets:
    """
    Get the repository dataset.
    Currently only [Retail Dataset](https://archive.ics.uci.edu/ml/datasets/online+retail) is available
    Parameters:
        dataset_name:
            name of the dataset, for instance "retail"
        regenerate:
            whether to regenerate the dataset even if a local file is present.
            If this flag is False and the file is present, the dataset will not
            be downloaded again.
        path:
            where the dataset should be saved
    Returns:
        dataset obtained by either downloading or reloading from local file.
    """
    if path is None:
        path = default_dataset_path
    dataset_path = materialize_dataset(dataset_name, path, regenerate)
    return load_datasets(
        metadata=dataset_path,
        train=dataset_path / "train",
        test=dataset_path / "test",
    )