Day 3: Advanced Hyperparameter Tuning with Bayesian Optimization

Welcome to Day 3. Grid Search is brutally slow. Random Search is fast, but it is completely blind—it doesn't learn from its mistakes.

If Random Search tests a learning_rate of 0.9 and gets a terrible score, it might completely randomly test 0.85 on the very next attempt, completely wasting CPU time.

Enter Bayesian Optimization. It utilizes a probabilistic "Surrogate Model" (a miniature AI algorithm) that watches the tests happen. If it sees that high learning rates cause the score to drop, it stops testing high learning rates. It intelligently narrows in on the absolute mathematical peak of accuracy!

The Optuna Library

The absolute best library in the world for Bayesian Optimization is Optuna. We simply write an objective function, and Optuna handles the hyper-intelligent guessing!

Look at day3_ex.py. We load XGBoost and set up Optuna!

# day3_ex.py
from xgboost import XGBClassifier
from sklearn.metrics import accuracy_score
import optuna

# 1. Define the mathematical sandbox
def objective(trial):
    # Optuna will dynamically inject random numbers within these ranges!
    params = {
        'n_estimators': trial.suggest_int('n_estimators', 50, 500),
        'max_depth': trial.suggest_int('max_depth', 3, 100),
        'learning_rate': trial.suggest_float('learning_rate', 0.01, 0.3),
        'colsample_bytree': trial.suggest_float('colsample_bytree', 0.6, 1.0)
    }

    # Train XGBoost with the newly suggested params
    model = XGBClassifier(eval_metric='logloss', random_state=42, **params)
    model.fit(X_train, y_train)

    # Return the accuracy score so Optuna can learn!
    return accuracy_score(y_test, model.predict(X_test))

# 2. Let the AI tournament begin!
study = optuna.create_study(direction="maximize")

# Allow the AI to test 50 different models
study.optimize(objective, n_trials=50)

print("Best Hyperparameters:", study.best_params)
print("Best Accuracy: ", study.best_value)

Explorations vs. Exploitation

Optuna is so successful because it perfectly balances: * Exploration: Looking at wild, random areas of the parameter grid just to see what happens. * Exploitation: Once it finds a mathematical "hotspot" (like a learning rate of 0.02), it relentlessly mines that hotspot, testing 0.021 and 0.019 to squeeze out maximum accuracy.

Wrapping Up Day 3

If you are doing competitive Machine Learning, Optuna paired with XGBoost or LightGBM is the dominant pipeline. It yields world-class models with very little manual intervention.

Optimization isn't just about tweaking Hyperparameter Dials. We also optimize models mathematically. Tomorrow on Day 4: Regularization Techniques, we revisit Linear models to see how \(L1\) and \(L2\) penalties keep our algorithms balanced.