Robustness to Input Noise

Diagnostics

Procedure

This procedure evaluates a model's robustness by measuring how its performance varies when Gaussian noise is added to the input features. It helps identify whether the model is overly sensitive to small perturbations in the data, which could indicate overfitting.

Define Parameters for Gaussian Noise
- What to do: Select the parameters for the Gaussian noise to be added to the input features.
  - Choose a mean (e.g., 0) and standard deviation (e.g., a fraction of the feature scale) for the Gaussian noise.
  - Ensure that the noise level is small enough to simulate realistic perturbations while still testing robustness.
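
One way to make "a fraction of the feature scale" concrete is to derive a separate noise standard deviation for each feature from that feature's own spread. The sketch below is only an illustration: the helper name `noise_scale_per_feature` and the default `fraction=0.05` are assumptions, not values prescribed by this procedure.

```python
import numpy as np

def noise_scale_per_feature(X, fraction=0.05):
    """Return one noise standard deviation per feature column.

    `fraction` is an assumed, illustrative value: each feature's noise std
    is that fraction of the feature's own (population) standard deviation.
    """
    X = np.asarray(X, dtype=float)
    return fraction * X.std(axis=0)

# Two features on very different scales (made-up illustrative values).
X_demo = np.array([[1.0, 100.0],
                   [2.0, 300.0],
                   [3.0, 500.0]])
sigma = noise_scale_per_feature(X_demo, fraction=0.1)
print(sigma)  # one noise std per feature, proportional to its spread
```

Scaling the noise per feature keeps the perturbation comparably small for every column, whereas a single global standard deviation could swamp small-scale features and barely touch large-scale ones.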
Add Gaussian Noise to Input Features
- What to do: Create multiple noisy versions of the test dataset.
  - Apply Gaussian noise to each feature in the test set using the defined parameters.
  - Generate at least 5 noisy datasets, and preferably 10 or more, so that the variance estimate is representative.
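
The noisy copies can be produced reproducibly with a seeded NumPy random generator. This is a minimal sketch under stated assumptions: the helper name `make_noisy_copies`, the seed, and the demo values are illustrative, and `sigma` may be either a single number or one value per feature.

```python
import numpy as np

def make_noisy_copies(X, sigma, n_copies=10, seed=0):
    """Return `n_copies` arrays, each equal to X plus zero-mean Gaussian noise.

    `sigma` is the noise standard deviation: a scalar, or one value per
    feature column (it is broadcast across the rows of X).
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)  # seeded so the copies are reproducible
    return [
        X + rng.normal(loc=0.0, scale=sigma, size=X.shape)
        for _ in range(n_copies)
    ]

# Made-up test set: 4 samples, 2 features, with a different noise std per feature.
X_demo = np.zeros((4, 2))
copies = make_noisy_copies(X_demo, sigma=[0.1, 1.0], n_copies=5, seed=42)
print(len(copies), copies[0].shape)
```

Fixing the seed means a later rerun evaluates the model on exactly the same perturbed datasets, which makes results comparable across model versions.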
Evaluate the Model on Noisy Datasets
- What to do: Test the model on each noisy dataset.
  - Use the same test set structure and performance metric as the original evaluation.
  - Keep the model and all configurations constant across evaluations to isolate the effect of noise.
Record Performance Metrics
- What to do: Capture the performance metrics for the model on each noisy dataset.
  - Organize the results in a structured format, such as a table or spreadsheet, to facilitate analysis.
  - Ensure that metrics are consistently recorded for each noise level.
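
The evaluation and recording steps might be sketched as follows, with one row per noise level and run written to a CSV table. Everything here is illustrative: `ThresholdModel` is a hypothetical stand-in for a trained model, the data is synthetic, and the column names are assumptions.

```python
import csv
import io
import numpy as np

class ThresholdModel:
    """Hypothetical stand-in for a trained model: predicts 1 when feature 0 > 0."""
    def predict(self, X):
        return (np.asarray(X)[:, 0] > 0).astype(int)

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

rng = np.random.default_rng(0)
X_test = rng.normal(size=(200, 3))           # synthetic test features
y_test = (X_test[:, 0] > 0).astype(int)      # labels the stand-in model fits exactly
model = ThresholdModel()

# Same model, same metric, same test labels for every run; only the noise changes.
rows = []
for noise_std in [0.0, 0.1, 0.5]:
    for run in range(5):
        noisy_X = X_test + rng.normal(0.0, noise_std, size=X_test.shape)
        score = accuracy(y_test, model.predict(noisy_X))
        rows.append({"noise_std": noise_std, "run": run, "accuracy": score})

# Write the records as a CSV table (kept in memory here; a file works the same way).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["noise_std", "run", "accuracy"])
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Keeping one row per run, not just a per-level summary, preserves the raw scores so the variance can be recomputed or inspected later.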
Calculate Variance in Performance
- What to do: Compute the variance (or standard deviation) of the performance metrics across all noisy datasets.
  - Use statistical tools or libraries (e.g., Pandas, NumPy) to calculate the variance.
  - Assess whether the performance remains stable (low variance) or fluctuates significantly (high variance).
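
When computing the variance with NumPy, note that `np.var` and `np.std` default to the population formula (`ddof=0`). With only a handful of noisy datasets, the sample formula (`ddof=1`) may be the more appropriate estimate. The scores below are made-up values used purely for illustration.

```python
import numpy as np

# Hypothetical accuracy scores from five noisy datasets (illustrative values).
scores = np.array([0.90, 0.88, 0.91, 0.89, 0.92])

pop_var = np.var(scores)             # population variance, divides by n
sample_var = np.var(scores, ddof=1)  # sample variance, divides by n - 1
sample_std = np.std(scores, ddof=1)  # same units as the metric itself

print(f"population variance: {pop_var:.6f}")
print(f"sample variance:     {sample_var:.6f}")
print(f"sample std dev:      {sample_std:.6f}")
```

The standard deviation is often easier to interpret than the variance because it is expressed in the same units as the metric.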
Interpret Variance Results
- What to do: Analyze the variance in the context of robustness.
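  - Low variance, together with mean performance close to the noise-free baseline, indicates the model is robust to small perturbations in input features. Low variance alone is not sufficient, because performance can degrade consistently across all noisy datasets.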
  - High variance suggests the model is sensitive to noise, which may indicate overfitting or instability in generalization.
Report the Findings
- What to do: Summarize the results and their implications.
  - Present the computed variance alongside the individual performance metrics for each noisy dataset.
  - Highlight any trends, such as significant drops in performance, and recommend next steps (e.g., improving regularization, collecting more diverse training data, or adjusting model complexity).

Code Example

This Python function evaluates whether a model demonstrates robustness by analyzing the variance in performance metrics when Gaussian noise is added to input features during evaluation.

import numpy as np
from sklearn.metrics import accuracy_score

def robustness_with_noise_test(X, y, model, metric, noise_levels, n_runs):
    """
    Evaluate a model's robustness by analyzing performance variance with Gaussian noise added to input features.

    Parameters:
        X (np.ndarray): Features of the test dataset.
        y (np.ndarray): Target variable of the test dataset.
        model: Pre-trained machine learning model to evaluate.
        metric (callable): Performance metric function, called as metric(y_true, y_pred).
        noise_levels (list): List of standard deviations for Gaussian noise to test.
        n_runs (int): Number of noisy datasets to generate for each noise level.

    Returns:
        list of dict: One dictionary per noise level, containing the noise level, variance, mean performance, and interpretation.
    """
    results = []

    for noise_std in noise_levels:
        performance_scores = []

        for _ in range(n_runs):
            # Add Gaussian noise to the input features
            noisy_X = X + np.random.normal(0, noise_std, X.shape)

            # Predict and evaluate the model
            y_pred = model.predict(noisy_X)
            score = metric(y, y_pred)
            performance_scores.append(score)

        # Calculate variance and mean performance
        variance = np.var(performance_scores)
        mean_performance = np.mean(performance_scores)

        # Interpret the results using fixed variance thresholds (heuristic; suited to metrics on a 0-1 scale)
        if variance < 0.01:
            interpretation = "High robustness: very low performance variance at this noise level."
        elif variance < 0.05:
            interpretation = "Moderate robustness: acceptable performance variance at this noise level."
        else:
            interpretation = "Low robustness: high sensitivity to noise at this level."

        results.append({
            "Noise Level (Std Dev)": noise_std,
            "Variance": variance,
            "Mean Performance": mean_performance,
            "Interpretation": interpretation,
        })

    return results

# Demo Usage
if __name__ == "__main__":
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    # Generate synthetic dataset
    X, y = make_classification(
        n_samples=500, n_features=10, n_informative=5, random_state=42
    )

    # Split into train and test sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    # Train the model
    model = RandomForestClassifier(n_estimators=10, random_state=42)
    model.fit(X_train, y_train)

    # Perform robustness test with Gaussian noise
    results = robustness_with_noise_test(X_test, y_test, model,
        accuracy_score, [0.0, 0.01, 0.05, 0.1, 1.0], 30)

    # Print results
    print("Robustness Test Results with Gaussian Noise:")
    for res in results:
        for key, value in res.items():
            print(f"{key}: {value}")
        print()

Example Output

Robustness Test Results with Gaussian Noise:
Noise Level (Std Dev): 0.0
Variance: 1.232595164407831e-32
Mean Performance: 0.8900000000000001
Interpretation: High robustness: very low performance variance at this noise level.

Noise Level (Std Dev): 0.01
Variance: 4.488888888888897e-05
Mean Performance: 0.9013333333333333
Interpretation: High robustness: very low performance variance at this noise level.

Noise Level (Std Dev): 0.05
Variance: 0.0001765555555555559
Mean Performance: 0.9003333333333333
Interpretation: High robustness: very low performance variance at this noise level.

Noise Level (Std Dev): 0.1
Variance: 0.00023155555555555595
Mean Performance: 0.8986666666666667
Interpretation: High robustness: very low performance variance at this noise level.

Noise Level (Std Dev): 1.0
Variance: 0.0015805555555555568
Mean Performance: 0.7783333333333332
Interpretation: High robustness: very low performance variance at this noise level.
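
In the output above, every noise level is labelled "High robustness" because the interpretation looks only at variance, yet mean accuracy at a standard deviation of 1.0 is about 0.78, compared with 0.89 without noise. A possible complement is to compare each level's mean with the noise-free baseline. The sketch below reuses the rounded means and variances from the output; the tolerance of 0.05 is an assumed, illustrative threshold, not part of the original function.

```python
# Mean accuracy and variance per noise level, rounded from the example output.
results = [
    {"noise_std": 0.0,  "mean": 0.8900, "variance": 1.23e-32},
    {"noise_std": 0.01, "mean": 0.9013, "variance": 4.49e-05},
    {"noise_std": 0.05, "mean": 0.9003, "variance": 1.77e-04},
    {"noise_std": 0.1,  "mean": 0.8987, "variance": 2.32e-04},
    {"noise_std": 1.0,  "mean": 0.7783, "variance": 1.58e-03},
]

TOLERANCE = 0.05  # assumed acceptable drop in mean accuracy (illustrative)

baseline = results[0]["mean"]  # the noise-free run serves as the reference
flagged = []
for r in results:
    r["drop"] = baseline - r["mean"]   # positive means worse than baseline
    r["std"] = r["variance"] ** 0.5    # spread in the metric's own units
    if r["drop"] > TOLERANCE:
        flagged.append(r["noise_std"])
    print(f"std={r['noise_std']:<5} drop={r['drop']:+.4f} spread={r['std']:.4f}")

print("Noise levels exceeding the tolerance:", flagged)
```

Reporting the drop from baseline next to the variance separates two failure modes: unstable performance (high variance) and consistently degraded performance (large drop with low variance).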

Robustness to Target Noise