from sklearn.datasets import load_iris
from poniard import PoniardClassifier
Plot factory
PoniardPlotFactory is a series of plots meant to enhance the PoniardBaseEstimator.get_results experience. Like the rest of Poniard, wherever possible it will try to use cross validation (for example, PoniardPlotFactory.confusion_matrix uses predictions from scikit-learn’s cross_val_predict).
PoniardPlotFactory
PoniardPlotFactory (template:str='plotly_white', discrete_colors:List[str]=['rgb(127, 60, 141)', 'rgb(17, 165, 121)', 'rgb(57, 105, 172)', 'rgb(242, 183, 1)', 'rgb(231, 63, 116)', 'rgb(128, 186, 90)', 'rgb(230, 131, 16)', 'rgb(0, 134, 149)', 'rgb(207, 28, 144)', 'rgb(249, 123, 114)', 'rgb(165, 170, 153)'], font_family:str='Helvetica', font_color:str='#8C8C8C')
Helper class that handles plotting for Poniard Estimators.
It has access to the Poniard estimator instance through the _poniard attribute.
Name | Type | Default | Details
---|---|---|---
template | str | plotly_white | Plotly template. Default “plotly_white”.
discrete_colors | typing.List[str] | [‘rgb(127, 60, 141)’, ‘rgb(17, 165, 121)’, ‘rgb(57, 105, 172)’, ‘rgb(242, 183, 1)’, ‘rgb(231, 63, 116)’, ‘rgb(128, 186, 90)’, ‘rgb(230, 131, 16)’, ‘rgb(0, 134, 149)’, ‘rgb(207, 28, 144)’, ‘rgb(249, 123, 114)’, ‘rgb(165, 170, 153)’] | A list of colors, default Bold. See the Plotly reference.
font_family | str | Helvetica | See the Plotly reference.
font_color | str | #8C8C8C | See the Plotly reference.
General plots
PoniardPlotFactory.metrics
PoniardPlotFactory.metrics (kind:str='strip', facet:str='col', metrics:Union[str,Sequence[str]]=None, only_test:bool=True, exclude_dummy:bool=True, show_means:bool=True, **kwargs)
Plot metrics obtained by running PoniardBaseEstimator.fit.
Name | Type | Default | Details
---|---|---|---
kind | str | strip | Either “strip” or “bar”. Default “strip”.
facet | str | col | Either “col” or “row”. Default “col”.
metrics | typing.Union[str, typing.Sequence[str]] | None | String or list of strings. This must follow the names passed to the Poniard constructor. For example, if during init a dict of metrics was passed, its keys can be passed here. Default None, which plots every estimator metric available.
only_test | bool | True | Whether to plot only test scores. Default True.
exclude_dummy | bool | True | Whether to exclude dummy estimators. Default True.
show_means | bool | True | Whether to plot means along with fold scores. Default True.
kwargs | | |
Returns | Figure | | Plotly strip or bar plot.
X, y = load_iris(return_X_y=True, as_frame=True)
pnd = PoniardClassifier().setup(X, y, show_info=False).fit()
pnd.plot.metrics(metrics="f1_macro")
pnd.plot.metrics(metrics=["roc_auc", "accuracy"], facet="col", kind="bar")
PoniardPlotFactory.overfitness
PoniardPlotFactory.overfitness (metric:Optional[str]=None, exclude_dummy:bool=True)
Plot the ratio of test scores to train scores for every estimator.
Name | Type | Default | Details
---|---|---|---
metric | typing.Optional[str] | None | String representing a metric. This must follow the names passed to the Poniard constructor. For example, if during init a dict of metrics was passed, one of its keys can be passed here. Default None, which plots the first metric.
exclude_dummy | bool | True | Whether to exclude dummy estimators. Default True.
Returns | Figure | | Plotly strip plot.
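To make the test-to-train ratio concrete, here is a minimal sketch that uses plain scikit-learn rather than Poniard. The estimator, the dataset, the 5-fold split, and the choice to divide the mean test score by the mean train score are all assumptions made for illustration; Poniard's own aggregation may differ.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

X, y = load_iris(return_X_y=True)

# return_train_score=True makes cross_validate report train scores per fold
cv_results = cross_validate(
    LogisticRegression(max_iter=1000),
    X,
    y,
    cv=5,
    scoring="accuracy",
    return_train_score=True,
)

# Assumption: ratio of mean test score to mean train score.
# Values well below 1 suggest the estimator fits the training folds
# much better than the held-out folds.
overfitness = cv_results["test_score"].mean() / cv_results["train_score"].mean()
print(f"test/train ratio: {overfitness:.3f}")
```

A ratio close to 1 means test and train scores are similar for that metric.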
Classification plots
Poniard offers additional plots based on cross validated predictions:
PoniardPlotFactory.roc_curve
PoniardPlotFactory.roc_curve (estimator_names:Optional[Sequence[str]]=None, response_method:str='auto', **kwargs)
Plot ROC curve with cross validated predictions for multiple estimators.
Name | Type | Default | Details
---|---|---|---
estimator_names | typing.Optional[typing.Sequence[str]] | None | Estimators to include. If None, all estimators are used.
response_method | str | auto | Either “auto”, “predict_proba” or “decision_function”. “auto” will try to use predict_proba if all estimators have it, otherwise it will try decision_function. If there is no common response_method, it will raise an error.
kwargs | | | Passed to sklearn.metrics.roc_curve().
Returns | Figure | | Plotly line plot.
For now, PoniardPlotFactory.roc_curve only works with binary classification tasks (you can track this issue here), so we’ll need a different dataset, like the well-known German credit dataset.
from sklearn.datasets import fetch_openml
from sklearn.preprocessing import LabelEncoder
from poniard import PoniardClassifier
X, y = fetch_openml(data_id=31, return_X_y=True, as_frame=True)
categorical_cols = X.select_dtypes(include="category").columns
X[categorical_cols] = X[categorical_cols].astype(str)
y = LabelEncoder().fit_transform(y)
pnd = PoniardClassifier().setup(X, y, show_info=False).fit()
estimators = ["LogisticRegression", "XGBClassifier", "SVC"]
pnd.plot.roc_curve(estimator_names=estimators)
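As a rough illustration of what a ROC curve built from cross validated predictions involves, the sketch below uses only scikit-learn. The bundled breast cancer dataset, the pipeline, and the 5-fold split are assumptions chosen so the example runs offline; this is not Poniard's implementation.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Each sample is scored by a model that did not see it during training.
# Column 1 holds the predicted probability of the positive class.
proba = cross_val_predict(model, X, y, cv=5, method="predict_proba")[:, 1]

# False positive rate and true positive rate at every score threshold
fpr, tpr, thresholds = roc_curve(y, proba)
auc = roc_auc_score(y, proba)
print(f"cross validated ROC AUC: {auc:.3f}")
```

For estimators without predict_proba, the same pattern works with method="decision_function", which mirrors the response_method choice described above.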
PoniardPlotFactory.confusion_matrix
PoniardPlotFactory.confusion_matrix (estimator_name:str, **kwargs)
Plot confusion matrix with cross validated predictions for a single estimator.
Name | Type | Details
---|---|---
estimator_name | str | Estimator to include.
kwargs | | Passed to sklearn.metrics.confusion_matrix().
Returns | Figure | Plotly image plot.
Like roc_curve, confusion_matrix accepts kwargs that are passed on to the appropriate sklearn functions.
pnd.plot.confusion_matrix(estimator_name="LogisticRegression", normalize="all")
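The combination of cross_val_predict and sklearn.metrics.confusion_matrix can be sketched directly in scikit-learn. The dataset, estimator, and fold count below are assumptions for illustration only, and the snippet does not reproduce Poniard's plotting.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_predict

X, y = load_iris(return_X_y=True)

# Out-of-fold class predictions: every sample is predicted by a model
# trained on the other folds.
y_pred = cross_val_predict(LogisticRegression(max_iter=1000), X, y, cv=5)

# normalize="all" divides each cell by the total number of samples,
# so the whole matrix sums to 1.
cm = confusion_matrix(y, y_pred, normalize="all")
print(cm.round(3))
```

Other normalize values accepted by scikit-learn are "true" (rows sum to 1) and "pred" (columns sum to 1).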
Regression plots
PoniardPlotFactory.residuals
PoniardPlotFactory.residuals (estimator_names:List[str])
Plot regression residuals vs predictions for a list of estimators.
Name | Type | Details
---|---|---
estimator_names | typing.List[str] | Estimators to include.
Returns | Figure | Residuals plot.
PoniardPlotFactory.residuals_histogram
PoniardPlotFactory.residuals_histogram (estimator_names:List[str])
Plot a histogram of regression residuals for a list of estimators.
Name | Type | Details
---|---|---
estimator_names | typing.List[str] | Estimators to include.
Returns | Figure | Residuals histogram plot.
from sklearn.datasets import fetch_california_housing
from sklearn.linear_model import LinearRegression
from xgboost import XGBRegressor
from poniard import PoniardRegressor
X, y = fetch_california_housing(return_X_y=True, as_frame=True)
pnd = PoniardRegressor(estimators=[LinearRegression(), XGBRegressor()]).setup(
    X, y, show_info=False
)
pnd.fit()
PoniardRegressor(estimators=[LinearRegression(), XGBRegressor(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=0, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
validate_parameters=None, verbosity=0)])
pnd.plot.residuals(estimator_names=["LinearRegression", "XGBRegressor"])
pnd.plot.residuals_histogram(estimator_names=["LinearRegression", "XGBRegressor"])
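For readers who want to see the quantities behind both residual plots, here is a small scikit-learn sketch. It assumes residuals are defined as observed minus cross validated prediction and uses the bundled diabetes dataset so it runs offline; the sign convention and the data Poniard uses internally may differ.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict

X, y = load_diabetes(return_X_y=True)

# Out-of-fold predictions for every sample
y_pred = cross_val_predict(LinearRegression(), X, y, cv=5)

# Assumption: residual = observed value - predicted value
residuals = y - y_pred

# A residuals-vs-predictions plot uses (y_pred, residuals) as coordinates;
# a residuals histogram bins the same residuals.
counts, bin_edges = np.histogram(residuals, bins=30)
print(f"mean residual: {residuals.mean():.2f}, std: {residuals.std():.2f}")
```

Residuals scattered evenly around zero across the prediction range suggest no obvious systematic error.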
Feature plots
These plots help in understanding how features interact with models.
PoniardPlotFactory.permutation_importance
PoniardPlotFactory.permutation_importance (estimator_name:str, n_repeats:int=10, kind:str='bar', **kwargs)
Plot permutation importances for an estimator.
This shuffles features randomly one at a time and measures the change in the estimator’s performance. If the feature is important for the model, the estimator’s performance should decrease (represented by positive values in the plot). See the scikit-learn guide.
Name | Type | Default | Details
---|---|---|---
estimator_name | str | | Estimator to include.
n_repeats | int | 10 | How many times to repeat random permutations of a single feature. Default 10.
kind | str | bar | Either “bar” or “strip”. Default “bar”. “strip” plots each permutation repetition as well as the mean; “bar” plots only the mean.
kwargs | | | Passed to sklearn.inspection.permutation_importance().
Returns | Figure | | Plotly bar or strip plot.
pnd.plot.permutation_importance(estimator_name="XGBRegressor")
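The underlying scikit-learn call can be tried on its own. The sketch below assumes a simple train/test split, a linear model, and the bundled diabetes dataset; which data Poniard evaluates the permutations on is not stated here, so treat this only as an illustration of sklearn.inspection.permutation_importance.

```python
from sklearn.datasets import load_diabetes
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LinearRegression().fit(X_train, y_train)

# Shuffle each feature n_repeats times and record the drop in the score
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
)

# importances has one row per feature and one column per repetition
# (what a "strip" plot would show); importances_mean is the per-feature
# average (what a "bar" plot would show).
for idx in result.importances_mean.argsort()[::-1][:3]:
    print(f"feature {idx}: {result.importances_mean[idx]:.3f}")
```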
PoniardPlotFactory.partial_dependence
PoniardPlotFactory.partial_dependence (estimator_name:str, feature:Union[str,int], **kwargs)
Plot partial dependence for a single feature of a single estimator.
In essence, visualize how the target changes within the feature’s range.
Only plots average partial dependence for all samples and not individual samples (ICE).
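The average partial dependence described above can be computed with scikit-learn directly. This is a hedged sketch: the dataset, the linear model, the feature index, and the grid resolution are assumptions, and the lookup of the grid key accounts for the rename between scikit-learn versions.

```python
from sklearn.datasets import load_diabetes
from sklearn.inspection import partial_dependence
from sklearn.linear_model import LinearRegression

X, y = load_diabetes(return_X_y=True)
model = LinearRegression().fit(X, y)

# Feature index 2 is "bmi" in the diabetes dataset.
# kind="average" returns the mean prediction over all samples for each
# grid value, not the per-sample (ICE) curves.
result = partial_dependence(
    model, X, features=[2], kind="average", grid_resolution=20
)

average = result["average"][0]
# Newer scikit-learn versions expose the grid as "grid_values";
# older ones used "values".
grid_key = "grid_values" if "grid_values" in result else "values"
grid = result[grid_key][0]

for value, prediction in zip(grid[:3], average[:3]):
    print(f"bmi={value:.3f} -> average prediction {prediction:.1f}")
```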
Name | Type | Details
---|---|---
estimator_name | str | Estimator to include.
feature | typing.Union[str, int] | Feature for which to plot partial dependence. Can be a pandas column name or index.
kwargs | | Passed to sklearn.inspection.partial_dependence().
Returns | Figure | Plotly line plot.
pnd.plot.partial_dependence("XGBRegressor", "AveOccup")