Explainable AI (XAI)
As ML models make consequential decisions — approving loans, flagging fraud, recommending medical treatments — the question “why did the model decide this?” becomes as important as “how accurate is it?” Explainable AI provides tools to answer that question at the model level, the prediction level, and in terms regulators and business stakeholders can understand.
Two Levels of Explanation
Global explanations: How does the model behave in general? Which features matter most across all predictions?
Local explanations: Why did the model make this specific prediction for this specific input?
Some tools, such as SHAP, provide both levels; others cover only one (LIME is local, partial dependence plots are global). Both levels are necessary in practice.
SHAP (SHapley Additive exPlanations)
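SHAP is one of the most widely used approaches to ML interpretability. It is based on Shapley values from cooperative game theory: each feature is treated as a player, and its attribution is its average marginal contribution to the prediction over all feature orderings. The attributions for one prediction sum to the difference between that prediction and the model's average output. The same definition applies to any model type, although exact computation is only cheap for certain model classes such as tree ensembles.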
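To make the Shapley idea concrete, here is a minimal pure-Python sketch that computes exact Shapley values by brute force for a three-feature toy function. The function `toy_model`, the feature names, and the single `baseline` point are assumptions made for illustration; real SHAP explainers average over a background dataset and use much faster algorithms.

```python
from itertools import permutations

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one input, averaged over all feature orderings.

    Features that have not been "added" yet keep their value from `baseline`
    (a single reference point, a simplification of averaging over a dataset).
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        prev = predict(current)
        for i in order:
            current[i] = x[i]          # switch feature i from baseline to actual value
            new = predict(current)
            phi[i] += new - prev       # marginal contribution of feature i in this ordering
            prev = new
    return [p / len(orderings) for p in phi]

# Hypothetical scoring function with an interaction between income and debt
def toy_model(v):
    income, debt, age = v
    return 2.0 * income - 1.5 * debt + 0.5 * income * debt + 0.1 * age

x = [3.0, 2.0, 40.0]
baseline = [1.0, 1.0, 30.0]

phi = shapley_values(toy_model, x, baseline)
for name, value in zip(["income", "debt", "age"], phi):
    print(f"{name}: {value:+.3f}")

# Additivity check: the attributions sum to f(x) - f(baseline)
print("sum of attributions:", round(sum(phi), 6))
print("f(x) - f(baseline): ", round(toy_model(x) - toy_model(baseline), 6))
```

The brute-force loop visits every ordering, so its cost grows factorially with the number of features. That cost is why libraries rely on model-specific shortcuts or sampling.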
import shap
import xgboost as xgb
import matplotlib.pyplot as plt

# Assumes X_train and X_test are NumPy arrays and feature_names lists the column names
model = xgb.XGBClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

# Create explainer (TreeExplainer is fast for tree models)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Global: Bar plot of mean |SHAP| per feature
shap.summary_plot(shap_values, X_test, feature_names=feature_names, plot_type='bar')

# Global: Beeswarm plot showing feature impact distribution
shap.summary_plot(shap_values, X_test, feature_names=feature_names)

# Local: Waterfall plot for a single prediction (row 42)
shap.plots.waterfall(shap.Explanation(
    values=shap_values[42],
    base_values=explainer.expected_value,
    data=X_test[42],
    feature_names=feature_names
))

# Local: Force plot (interactive)
shap.force_plot(explainer.expected_value, shap_values[42], X_test[42], feature_names=feature_names)

SHAP Interaction Analysis
# Dependence plot: how one feature's SHAP value changes across its range
# Colored by a second feature to show interactions
shap.dependence_plot('income', shap_values, X_test,
                     feature_names=feature_names,
                     interaction_index='age')  # Color by age

# Compute interaction values (expensive, but shows pairwise feature interactions)
shap_interaction = explainer.shap_interaction_values(X_test[:100])
shap.summary_plot(shap_interaction, X_test[:100], feature_names=feature_names)

LIME (Local Interpretable Model-Agnostic Explanations)
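LIME explains an individual prediction by perturbing the input, querying the model on the perturbed samples, and fitting a simple interpretable surrogate (a weighted linear model) to the model's outputs in the local neighborhood of that prediction.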
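The local-surrogate idea can be sketched in plain Python: sample points near the instance, weight them by proximity, and solve a weighted least-squares problem. This is a simplified illustration, not the lime package's exact procedure; `local_surrogate`, `black_box`, and the `scale` and `kernel_width` settings are hypothetical names and values.

```python
import math
import random

def solve_linear_system(a, b):
    """Solve a small dense system a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - s) / m[i][i]
    return w

def local_surrogate(predict, x, n_samples=500, scale=0.1, kernel_width=0.25, seed=0):
    """Fit a proximity-weighted linear model around x; returns (intercept, slopes)."""
    rng = random.Random(seed)
    rows, targets, weights = [], [], []
    for _ in range(n_samples):
        z = [xi + rng.gauss(0.0, scale) for xi in x]            # perturbed sample near x
        dist_sq = sum((zi - xi) ** 2 for zi, xi in zip(z, x))
        weights.append(math.exp(-dist_sq / kernel_width ** 2))  # closer samples count more
        rows.append([1.0] + [zi - xi for zi, xi in zip(z, x)])  # intercept + centered features
        targets.append(predict(z))
    # Weighted normal equations: (X^T W X) w = X^T W y
    k = len(x) + 1
    xtwx = [[sum(wt * r[i] * r[j] for wt, r in zip(weights, rows)) for j in range(k)]
            for i in range(k)]
    xtwy = [sum(wt * r[i] * y for wt, r, y in zip(weights, rows, targets)) for i in range(k)]
    coef = solve_linear_system(xtwx, xtwy)
    return coef[0], coef[1:]

# Hypothetical nonlinear "black box" standing in for a trained classifier
def black_box(v):
    return 1.0 / (1.0 + math.exp(-(v[0] ** 2 - 3.0 * v[1])))

x = [2.0, 1.0]
intercept, slopes = local_surrogate(black_box, x)
print("prediction at x:", round(black_box(x), 3))
print("local effect per feature:", [round(s, 3) for s in slopes])
```

The slopes describe the model only near `x`; a different instance can yield different signs and magnitudes. The lime package builds on this basic idea with discretization, feature selection, and sampling based on training-data statistics, as used in the library snippet that follows.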
import lime
import lime.lime_tabular

# Create LIME explainer (training_data must be a NumPy array)
lime_explainer = lime.lime_tabular.LimeTabularExplainer(
    training_data=X_train,
    feature_names=feature_names,
    class_names=['Normal', 'Fraud'],
    mode='classification',
    discretize_continuous=True
)

# Explain a single prediction
idx = 42
explanation = lime_explainer.explain_instance(
    data_row=X_test[idx],
    predict_fn=model.predict_proba,
    num_features=10
)

# Show explanation
explanation.show_in_notebook(show_table=True)
print(explanation.as_list())  # [(feature_condition, weight), ...]

Partial Dependence Plots (PDP)
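A partial dependence plot shows the marginal effect of one or two features on the model's predictions, averaged over the observed values of all other features. Because of that averaging, the plot can mislead when the plotted feature is strongly correlated with others.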
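The averaging behind a partial dependence curve is short enough to write by hand. The sketch below assumes a hypothetical two-feature `toy_model` and a tiny dataset; `partial_dependence_1d` is an illustrative helper, not a library function.

```python
def partial_dependence_1d(predict, rows, feature_index, grid):
    """For each grid value, overwrite one feature in every row and average the predictions."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature_index] = value   # force the feature to the grid value
            total += predict(modified)
        curve.append(total / len(rows))
    return curve

# Hypothetical model: income (in $1000s) scaled by a credit-score factor
def toy_model(row):
    income, credit_score = row
    return income * (credit_score / 1000.0)

rows = [[30, 600], [50, 700], [80, 800]]   # illustrative data, not from a real dataset
grid = [20, 40, 60]                        # income values to evaluate

curve = partial_dependence_1d(toy_model, rows, feature_index=0, grid=grid)
for value, avg in zip(grid, curve):
    print(f"income={value}: average prediction {avg:.2f}")
```

Each point on the curve is an average over every row's credit score, so row-level differences are hidden. scikit-learn's `PartialDependenceDisplay` performs the same averaging and handles the plotting.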
from sklearn.inspection import PartialDependenceDisplay
import matplotlib.pyplot as plt

# Single feature PDP
fig, ax = plt.subplots(figsize=(8, 5))
PartialDependenceDisplay.from_estimator(
    model, X_train,
    features=['income'],
    feature_names=feature_names,
    ax=ax
)
ax.set_title("Partial Dependence: Income")

# Two-feature interaction PDP (heatmap)
PartialDependenceDisplay.from_estimator(
    model, X_train,
    features=[('income', 'credit_score')],  # Tuple for 2D PDP
    feature_names=feature_names
)
plt.show()

Counterfactual Explanations
A counterfactual explanation answers the question: “What would have to change for the model to flip its decision?”
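As a rough illustration of the idea, the sketch below searches one feature at a time for the value closest to the current one that flips a decision. The `approve` rule, its weights and threshold, and the candidate ranges are hypothetical; dedicated libraries search over several features jointly.

```python
def single_feature_counterfactuals(predict, x, candidates):
    """For each feature, find the candidate value nearest to the current one that flips the decision."""
    original = predict(x)
    results = {}
    for name, values in candidates.items():
        # Try values in order of increasing distance from the current value
        for value in sorted(values, key=lambda v: abs(v - x[name])):
            trial = dict(x)
            trial[name] = value
            if predict(trial) != original:
                results[name] = value
                break
    return results

# Hypothetical approval rule (integer arithmetic keeps the threshold exact)
def approve(applicant):
    score = 10 * applicant["income"] + 2 * applicant["credit_score"]
    return 1 if score >= 1920 else 0

applicant = {"income": 38, "credit_score": 650}   # income in $1000s
candidates = {
    "income": range(20, 151),              # $20k to $150k in $1k steps
    "credit_score": range(300, 851, 10),   # 300 to 850 in steps of 10
}

print("current decision:", approve(applicant))
cfs = single_feature_counterfactuals(approve, applicant, candidates)
for name, value in cfs.items():
    print(f"If {name} were {value} instead of {applicant[name]}, the decision would flip.")
```

A brute-force search like this ignores whether a change is realistic or actionable for the applicant. Libraries such as DiCE search over several features at once and trade off proximity against diversity.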
import pandas as pd
import dice_ml
from dice_ml import Dice

# Define data and model
# df_train must contain the feature columns plus the 'target' outcome column
data = dice_ml.Data(dataframe=df_train, continuous_features=numeric_cols, outcome_name='target')
m = dice_ml.Model(model=model, backend="sklearn")

# Generate counterfactuals
exp = Dice(data, m)
# DiCE expects the query as a DataFrame of feature columns (no outcome column)
query_instance = pd.DataFrame(X_test[42:43], columns=feature_names)
cf = exp.generate_counterfactuals(
    query_instance,
    total_CFs=3,
    desired_class="opposite",
    proximity_weight=0.2,
    diversity_weight=1.0
)
cf.visualize_as_dataframe()
# Example reading: "If income had been $62,000 instead of $38,000, the model would have approved."

Model-Specific Interpretability
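# Decision tree: visualize the tree (top three levels)
import matplotlib.pyplot as plt
from sklearn.tree import plot_tree, export_text

plt.figure(figsize=(14, 8))
plot_tree(dt_model, feature_names=feature_names, class_names=['No', 'Yes'],
          filled=True, rounded=True, max_depth=3)
# Text version of the same rules, convenient for logs and documentation
print(export_text(dt_model, feature_names=list(feature_names), max_depth=3))

# Linear models: plot coefficients
# Coefficients are comparable only if the features are on the same scale
import pandas as pd
coef = pd.Series(lr_model.coef_[0], index=feature_names).sort_values()
coef.plot(kind='barh', figsize=(10, 8))
plt.title('Logistic Regression Coefficients')
plt.show()

Regulatory Context (2026)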
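Several regulations and supervisory guidelines now require or strongly encourage AI explainability:
- EU AI Act (in force since August 2024, with obligations phasing in from 2025): High-risk AI systems require transparency and human oversight
- GDPR Article 22: Restricts solely automated decisions with legal or similarly significant effects; together with Articles 13-15 it entitles individuals to meaningful information about the logic involved, often summarized as a "right to explanation"
- US Executive Order 14110 on AI (2023): Set transparency expectations for government AI; it was rescinded in January 2025, so current federal requirements should be checked separately
- Financial Services: US model risk management guidance (SR 11-7) expects credit models to be documented, validated, and understood well enough to be challenged, which in practice calls for explainability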
In practice, this means:
- Document which features drive decisions
- Be able to explain individual decisions in plain language
- Monitor for feature drift that changes model behavior
- Detect and document bias across demographic groups
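SHAP values are widely used to support these requirements: they are additive and consistent, and they can be computed automatically inside production monitoring pipelines. They describe what the model does, not whether a decision is fair, so they complement bias testing and documentation rather than replace them.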
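One way such monitoring could look is sketched below. It assumes per-row attribution matrices (for example SHAP values) have already been computed for a reference window and a current window. The helper names, the 0.10 threshold, and the numbers are hypothetical.

```python
def mean_abs_attribution(attributions):
    """Global importance: mean absolute attribution per feature over a batch of rows."""
    n_rows = len(attributions)
    n_features = len(attributions[0])
    return [sum(abs(row[j]) for row in attributions) / n_rows for j in range(n_features)]

def attribution_share_drift(reference, current, feature_names, threshold=0.10):
    """Flag features whose share of total importance moved by more than `threshold`."""
    ref = mean_abs_attribution(reference)
    cur = mean_abs_attribution(current)
    ref_total, cur_total = sum(ref), sum(cur)
    flagged = {}
    for name, r, c in zip(feature_names, ref, cur):
        shift = c / cur_total - r / ref_total   # change in importance share
        if abs(shift) > threshold:
            flagged[name] = round(shift, 3)
    return flagged

feature_names = ["income", "credit_score", "zip_code"]

# Illustrative attribution matrices (rows = predictions, columns = features)
reference_window = [[0.6, -0.3, 0.1], [-0.6, 0.3, -0.1]]
current_window = [[0.3, -0.3, 0.4], [-0.3, 0.3, -0.4]]

flagged = attribution_share_drift(reference_window, current_window, feature_names)
for name, shift in flagged.items():
    print(f"{name}: importance share changed by {shift:+.3f}")
```

In this made-up data the model leans less on income and more on zip code. A shift toward a feature that can act as a demographic proxy is the kind of change that would warrant a bias review.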