Hyperparameter Tuning: GridSearchCV, RandomizedSearchCV, Optuna, Cross-Validation Strategies, and Practical Tuning Workflows

You built a Random Forest with default parameters. It gets 82% accuracy. Is that good? Could it be 92% with different settings? You change n_estimators from 100 to 500 — accuracy jumps to 86%. You change max_depth from None to 10 — it drops to 79%. Which combination of settings gives the best result?

This is the hyperparameter tuning problem. Model parameters (like weights in linear regression) are learned from data automatically. Hyperparameters (like number of trees, learning rate, max depth) are settings YOU choose BEFORE training. The wrong combination wastes compute and produces mediocre models. The right combination unlocks the model’s full potential.

Think of it like tuning a guitar. The strings (parameters) find their pitch through playing (training). But the tuning pegs (hyperparameters) must be set by the musician BEFORE playing. Turn them too tight — the string snaps (overfitting). Too loose — it sounds flat (underfitting). The sweet spot requires systematic experimentation.

This post covers the main hyperparameter tuning techniques, from brute-force GridSearch to Bayesian optimization with Optuna, with Python code, comparison tables, cross-validation strategies, and a practical workflow you can apply to any model.

Parameters vs Hyperparameters
Why Default Hyperparameters Are Rarely Optimal
Cross-Validation: The Foundation
K-Fold Cross-Validation
Stratified K-Fold
Other CV Strategies
GridSearchCV (Exhaustive Search)
How GridSearch Works
GridSearchCV in Python
GridSearch Limitations
RandomizedSearchCV (Random Sampling)
How RandomizedSearch Works
RandomizedSearchCV in Python
GridSearch vs RandomizedSearch
Bayesian Optimization with Optuna
How Bayesian Optimization Works
Optuna in Python
Optuna for XGBoost
Optuna Visualization
Key Hyperparameters by Model
Random Forest Hyperparameters
XGBoost Hyperparameters
Logistic Regression Hyperparameters
Practical Tuning Workflow
Overfitting vs Underfitting During Tuning
Common Mistakes
Interview Questions
Wrapping Up

Parameters vs Hyperparameters

Aspect	Parameters	Hyperparameters
Set by	The model (learned during training)	You (set before training)
Examples	Weights, coefficients, split thresholds	Learning rate, max_depth, n_estimators, C
Change during training?	Yes (updated with each iteration)	No (fixed for the entire training run)
Stored in model?	Yes (model.coef_, tree splits)	Yes (model.get_params())
Analogy	The notes a musician plays (learned)	The tuning pegs (set before playing)

Key insight: You cannot tune hyperparameters by looking at training accuracy alone. A model with max_depth=50 might get 99% training accuracy but 60% test accuracy (overfitting). You need cross-validation to evaluate hyperparameter choices on unseen data.
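
To make this concrete, here is a minimal sketch on synthetic data. The dataset from `make_classification`, the 20% label noise, and the `X_demo` / `y_demo` names are assumptions for illustration, not figures from this post. An unrestricted decision tree memorizes the training rows, while cross-validation shows how it does on rows it has not seen.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic, noisy data (illustrative assumption): flip_y randomly reassigns 20% of labels
X_demo, y_demo = make_classification(
    n_samples=500, n_features=20, n_informative=5,
    flip_y=0.2, random_state=42
)

# max_depth=None lets the tree grow until every leaf is pure
deep_tree = DecisionTreeClassifier(max_depth=None, random_state=42)

# Training accuracy: fit and score on the same rows
train_acc = deep_tree.fit(X_demo, y_demo).score(X_demo, y_demo)

# Cross-validated accuracy: each fold is scored on rows the tree never saw
cv_acc = cross_val_score(deep_tree, X_demo, y_demo, cv=5, scoring='accuracy').mean()

print(f"Train accuracy: {train_acc:.3f}")
print(f"CV accuracy:    {cv_acc:.3f}")
```

The exact numbers depend on the data, but the training score should sit well above the cross-validated one, which is the gap that training accuracy alone hides.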

Why Default Hyperparameters Are Rarely Optimal

Library defaults are chosen to work “reasonably” across many datasets. But your data is unique. Defaults are like buying medium-sized clothes for everyone — they fit nobody perfectly.

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Default Random Forest
rf_default = RandomForestClassifier(random_state=42)
default_scores = cross_val_score(rf_default, X, y, cv=5, scoring='accuracy')
print(f"Default: {default_scores.mean():.4f}")  # 0.8240

# After tuning
rf_tuned = RandomForestClassifier(
    n_estimators=300, max_depth=12, min_samples_split=5,
    min_samples_leaf=2, max_features='sqrt', random_state=42
)
tuned_scores = cross_val_score(rf_tuned, X, y, cv=5, scoring='accuracy')
print(f"Tuned:   {tuned_scores.mean():.4f}")  # 0.9120

# +8.8 percentage points of accuracy from hyperparameter tuning alone: no new data, no new features

Cross-Validation: The Foundation

Before learning any tuning technique, you must understand cross-validation — it is the evaluation method that all tuning techniques use internally.

The problem with train/test split: If you tune hyperparameters based on test set performance, you are indirectly “training” on the test set. The model optimizes for THAT specific test split, and your reported accuracy is overly optimistic.

Cross-validation solves this by evaluating on multiple different train/test splits and averaging the results.

K-Fold Cross-Validation

5-Fold Cross-Validation:

Fold 1: [TEST] [Train] [Train] [Train] [Train]  → Score: 0.82
Fold 2: [Train] [TEST] [Train] [Train] [Train]  → Score: 0.85
Fold 3: [Train] [Train] [TEST] [Train] [Train]  → Score: 0.79
Fold 4: [Train] [Train] [Train] [TEST] [Train]  → Score: 0.84
Fold 5: [Train] [Train] [Train] [Train] [TEST]  → Score: 0.81

Average Score: 0.822 ± 0.02

Every data point is in the test set exactly once.
Every data point is in the training set exactly 4 times.
The average score is a more reliable estimate than any single split.
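
The three statements above can be checked directly. This is a small sketch, assuming ten toy rows (`X_demo` is a made-up array); it counts how often each row lands in the test and training sets, then computes the mean and standard deviation of the five fold scores from the diagram.

```python
import numpy as np
from sklearn.model_selection import KFold

X_demo = np.arange(10).reshape(-1, 1)   # 10 toy samples (illustrative)
kf = KFold(n_splits=5, shuffle=True, random_state=42)

test_counts = np.zeros(len(X_demo), dtype=int)
train_counts = np.zeros(len(X_demo), dtype=int)

for fold, (train_idx, test_idx) in enumerate(kf.split(X_demo), start=1):
    train_counts[train_idx] += 1
    test_counts[test_idx] += 1
    print(f"Fold {fold}: test rows {test_idx.tolist()}")

print("Times in test set: ", test_counts.tolist())
print("Times in train set:", train_counts.tolist())

# Mean and spread of the fold scores shown in the diagram
fold_scores = np.array([0.82, 0.85, 0.79, 0.84, 0.81])
print(f"Mean: {fold_scores.mean():.3f}, Std: {fold_scores.std():.3f}")
```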

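from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, KFold

# Any scikit-learn estimator works here; a Random Forest is used as an example
model = RandomForestClassifier(random_state=42)

# Basic K-Fold
kf = KFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(model, X, y, cv=kf, scoring='accuracy')
print(f"Mean: {scores.mean():.4f}, Std: {scores.std():.4f}")
# Example output for the fold scores above: Mean: 0.8220, Std: 0.0214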

Stratified K-Fold

For classification problems, Stratified K-Fold preserves the class distribution in each fold. If 30% of your data is “fraud” (class 1), each fold will also have approximately 30% fraud. This prevents a fold from having zero fraud cases (which would give a misleading score).

from sklearn.model_selection import StratifiedKFold

# Stratified K-Fold (what scikit-learn uses for classifiers when cv is an integer, but without shuffling)
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(model, X, y, cv=skf, scoring='f1')
print(f"Stratified Mean F1: {scores.mean():.4f}")

# For imbalanced datasets (e.g., 5% fraud), Stratified is ESSENTIAL
# Regular K-Fold might create a fold with 0% fraud → meaningless evaluation
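
The difference is easy to see on a toy label vector. The sketch below assumes 100 rows sorted so that the 5 positives come last (an illustrative worst case, not data from this post) and counts positives in the test part of each fold.

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

# Toy labels (illustrative): 95 negatives followed by 5 positives, i.e. 5% positive
y_demo = np.array([0] * 95 + [1] * 5)
X_demo = np.zeros((len(y_demo), 1))   # feature values do not affect the splitting

def positives_per_fold(splitter):
    """Count positive labels in the test part of each fold."""
    return [int(y_demo[test_idx].sum())
            for _, test_idx in splitter.split(X_demo, y_demo)]

plain = positives_per_fold(KFold(n_splits=5))            # no shuffling, rows kept in order
strat = positives_per_fold(StratifiedKFold(n_splits=5))  # class ratio preserved per fold

print("KFold positives per fold:          ", plain)
print("StratifiedKFold positives per fold:", strat)
```

On this sorted data, plain KFold puts all five positives in the last fold and none in the other four, while StratifiedKFold gives each fold one positive. Shuffling makes the plain K-Fold case less extreme but does not guarantee balanced folds.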

Other CV Strategies

Strategy	Use When	Code
K-Fold	Regression, balanced classification	`KFold(n_splits=5)`
Stratified K-Fold	Imbalanced classification	`StratifiedKFold(n_splits=5)`
Leave-One-Out (LOO)	Very small datasets (<100 rows)	`LeaveOneOut()`
Repeated K-Fold	When you need very stable estimates	`RepeatedKFold(n_splits=5, n_repeats=3)`
Time Series Split	Temporal data (no future data leakage)	`TimeSeriesSplit(n_splits=5)`
Group K-Fold	When certain groups must stay together	`GroupKFold(n_splits=5)`

Time Series Split deserves special attention for data engineers — it ensures training data always comes BEFORE test data chronologically, preventing future data leakage:

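Time Series Split (n_splits=5, data divided into 6 chronological blocks):
  Fold 1: [Train] [TEST]  ----  ----  ----  ----
  Fold 2: [Train] [Train] [TEST] ---- ---- ----
  Fold 3: [Train] [Train] [Train] [TEST] ---- ----
  Fold 4: [Train] [Train] [Train] [Train] [TEST] ----
  Fold 5: [Train] [Train] [Train] [Train] [Train] [TEST]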

Training set grows with each fold. Test is always the NEXT period.
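
A short sketch of the same idea with scikit-learn's `TimeSeriesSplit`, assuming 12 time-ordered rows whose index stands in for the timestamp (toy data for illustration). It checks that every training index precedes every test index and records how the training window grows.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# 12 time-ordered observations (illustrative); the row index doubles as the timestamp
X_demo = np.arange(12).reshape(-1, 1)

tscv = TimeSeriesSplit(n_splits=5)
train_sizes = []
for fold, (train_idx, test_idx) in enumerate(tscv.split(X_demo), start=1):
    # Every training index must come before every test index
    assert train_idx.max() < test_idx.min()
    train_sizes.append(len(train_idx))
    print(f"Fold {fold}: train {train_idx.tolist()} -> test {test_idx.tolist()}")

print("Training set sizes:", train_sizes)
```

Do not shuffle rows before using this splitter; it relies on the data already being in chronological order.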

GridSearchCV (Exhaustive Search)

GridSearch tries EVERY combination of hyperparameter values you specify. If you give it 3 options for parameter A and 4 options for parameter B, it tries all 3 × 4 = 12 combinations.

Real-life analogy: You are trying to find the best coffee recipe. You have 3 grind sizes (coarse, medium, fine) and 4 brew times (2, 3, 4, 5 minutes). GridSearch brews ALL 12 combinations and picks the best. Thorough, but it takes 12 cups of coffee.

How GridSearch Works

Parameter Grid:
  n_estimators: [100, 200, 300]
  max_depth: [5, 10, 15, 20]
  min_samples_split: [2, 5]

Total combinations: 3 × 4 × 2 = 24
With 5-fold CV: 24 × 5 = 120 model fits

GridSearch fits ALL 120, records every score, and returns the best combination.
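
The counting above can be reproduced with the standard library. This sketch enumerates the same grid with `itertools.product`; it only illustrates the arithmetic and does not train any model.

```python
from itertools import product

param_grid = {
    'n_estimators': [100, 200, 300],
    'max_depth': [5, 10, 15, 20],
    'min_samples_split': [2, 5],
}
cv_folds = 5

# Cartesian product of all value lists = every combination a grid search would try
names = list(param_grid)
combinations = [dict(zip(names, values)) for values in product(*param_grid.values())]

n_fits = len(combinations) * cv_folds
print(f"Combinations: {len(combinations)}")
print(f"Model fits with {cv_folds}-fold CV: {n_fits}")
print("First combination:", combinations[0])
```

With `refit=True` (the default), GridSearchCV also refits the winning combination once on the full training data, so the real count is one fit higher.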

GridSearchCV in Python

import pandas as pd
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier

# Define the model
rf = RandomForestClassifier(random_state=42)

# Define the parameter grid
param_grid = {
    'n_estimators': [100, 200, 300, 500],
    'max_depth': [5, 10, 15, 20, None],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4],
    'max_features': ['sqrt', 'log2']
}
# Total: 4 × 5 × 3 × 3 × 2 = 360 combinations × 5 folds = 1,800 fits!

# Run GridSearch
grid_search = GridSearchCV(
    estimator=rf,
    param_grid=param_grid,
    cv=5,                     # 5-fold cross-validation
    scoring='accuracy',        # metric to optimize
    n_jobs=-1,                 # use all CPU cores
    verbose=2,                 # show progress
    return_train_score=True    # also record training scores
)
grid_search.fit(X_train, y_train)

# Best results
print(f"Best Score: {grid_search.best_score_:.4f}")
print(f"Best Params: {grid_search.best_params_}")

# Use the best model directly
best_model = grid_search.best_estimator_
test_score = best_model.score(X_test, y_test)
print(f"Test Score: {test_score:.4f}")

# View all results as a DataFrame
results = pd.DataFrame(grid_search.cv_results_)
print(results[['params', 'mean_test_score', 'std_test_score', 'rank_test_score']]
      .sort_values('rank_test_score').head(10))

GridSearch Limitations

Computationally explosive — 5 parameters with 5 values each = 5⁵ = 3,125 combinations × 5 folds = 15,625 fits
Wasteful — many combinations are obviously bad but still evaluated
Discrete grid — the optimal value might be between your grid points (e.g., best learning_rate is 0.07 but you only tried 0.01, 0.05, 0.1)

When to use GridSearch: Small parameter spaces (<100 combinations), important final tuning, when compute time is not a concern.

RandomizedSearchCV (Random Sampling)

Instead of trying EVERY combination, RandomizedSearch samples a fixed number of random combinations from the parameter space. You control how many combinations to try with n_iter.

Real-life analogy: Instead of tasting every dish at a buffet (GridSearch), you randomly pick 20 dishes (RandomizedSearch). You are unlikely to find the absolute best dish, but you will find a great one — in a fraction of the time.

How RandomizedSearch Works

Parameter Space (same as GridSearch):
  n_estimators: [100, 200, 300, 500]
  max_depth: [5, 10, 15, 20, None]
  min_samples_split: [2, 5, 10]

GridSearch: tries ALL 4 × 5 × 3 = 60 combinations
RandomizedSearch (n_iter=20): tries 20 RANDOM combinations

Key advantage: you can use CONTINUOUS distributions instead of fixed lists:
  learning_rate: uniform(0.001, 0.299)   → samples any value from 0.001 to 0.3 (scipy takes loc and scale, not low and high)
  max_depth: randint(3, 30)              → samples any integer from 3 to 29 (upper bound is exclusive)

This explores the space more efficiently than a fixed grid.
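
The scipy distributions have two conventions that are easy to get wrong, so here is a small sketch of how they behave. The ranges are the illustrative ones used in this section; `loguniform` is an addition that is often a reasonable choice for scale-like parameters such as learning rate.

```python
from scipy.stats import loguniform, randint, uniform

# uniform(loc, scale) covers [loc, loc + scale], so this spans 0.001 to 0.3
lr_uniform = uniform(loc=0.001, scale=0.299)

# loguniform(a, b) spreads samples evenly across orders of magnitude between a and b
lr_log = loguniform(0.001, 0.3)

# randint(low, high) excludes high, so this yields integers 3 through 29
depth = randint(3, 30)

lr_u = lr_uniform.rvs(size=1000, random_state=42)
lr_l = lr_log.rvs(size=1000, random_state=42)
depths = depth.rvs(size=1000, random_state=42)

print(f"uniform:    min={lr_u.min():.4f}, max={lr_u.max():.4f}")
print(f"uniform:    share below 0.01 = {(lr_u < 0.01).mean():.2f}")
print(f"loguniform: share below 0.01 = {(lr_l < 0.01).mean():.2f}")
print(f"randint:    min={depths.min()}, max={depths.max()}")
```

A plain uniform distribution spends almost all of its samples on the larger values, while the log-uniform one gives small learning rates a fair share of trials.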

RandomizedSearchCV in Python

from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import randint
from sklearn.ensemble import RandomForestClassifier

# Define distributions (not just lists!)
# Note: scipy's randint(low, high) excludes the upper bound
param_distributions = {
    'n_estimators': randint(100, 1001),       # any integer 100-1000
    'max_depth': randint(3, 31),              # any integer 3-30
    'min_samples_split': randint(2, 21),      # any integer 2-20
    'min_samples_leaf': randint(1, 11),       # any integer 1-10
    'max_features': ['sqrt', 'log2', None],   # categorical
    'bootstrap': [True, False]                # boolean
}

# Run RandomizedSearch
random_search = RandomizedSearchCV(
    estimator=RandomForestClassifier(random_state=42),
    param_distributions=param_distributions,
    n_iter=100,               # try 100 random combinations
    cv=5,
    scoring='accuracy',
    n_jobs=-1,
    verbose=1,
    random_state=42,
    return_train_score=True
)
random_search.fit(X_train, y_train)

# Results
print(f"Best Score: {random_search.best_score_:.4f}")
print(f"Best Params: {random_search.best_params_}")
print(f"Test Score: {random_search.best_estimator_.score(X_test, y_test):.4f}")

GridSearch vs RandomizedSearch

Feature	GridSearchCV	RandomizedSearchCV
Search strategy	Every combination	Random sample of combinations
Compute cost	Grows exponentially with the number of parameters	Fixed (you set n_iter)
Finds the best?	Guaranteed (within the grid)	Not guaranteed but usually close
Continuous params?	No (discrete grid only)	Yes (sample from distributions)
When to use	Small grids, final fine-tuning	Large spaces, first-pass exploration
Number of candidates	All (fixed by the grid)	Typically 50-200 (set by n_iter)

Best practice: Use RandomizedSearch first (broad exploration, n_iter=100), then GridSearch (narrow fine-tuning around the best region found).

Bayesian Optimization with Optuna

GridSearch and RandomizedSearch are “uninformed” — each trial is independent. Bayesian optimization is informed — it learns from previous trials. If trial #5 showed that max_depth=10 with learning_rate=0.05 scored well, trial #6 explores nearby values rather than jumping to a random corner of the search space.

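Optuna is a widely used hyperparameter optimization library for Python. Its default sampler, TPE (Tree-structured Parzen Estimator), is a Bayesian-style method that uses earlier trials to choose the next one. For large or complex hyperparameter spaces, it usually reaches a good configuration in fewer trials than GridSearch or RandomizedSearch.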

Real-life analogy: GridSearch is like searching for a restaurant by visiting every block in the city. RandomizedSearch is like picking random blocks. Optuna is like asking locals: “The Italian place on 5th was great” → “Try the one on 6th, it is similar but better.” Each trial is informed by what worked before.

How Bayesian Optimization Works

1. Try a few random combinations (exploration phase)
2. Build a probability model of "which hyperparameters → which scores"
3. Use the model to predict the MOST PROMISING next combination
4. Try that combination, observe the score
5. Update the probability model
6. Repeat steps 3-5 for N trials

The model balances:
  EXPLORATION: try unexplored regions (maybe something better is out there)
  EXPLOITATION: try near the current best (refine what is already working)

Result: often finds near-optimal hyperparameters in 50-100 trials, where an exhaustive grid might need 1000+

Optuna in Python

import optuna
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Define the objective function
def objective(trial):
    # Optuna suggests values from defined ranges
    params = {
        'n_estimators': trial.suggest_int('n_estimators', 100, 1000),
        'max_depth': trial.suggest_int('max_depth', 3, 30),
        'min_samples_split': trial.suggest_int('min_samples_split', 2, 20),
        'min_samples_leaf': trial.suggest_int('min_samples_leaf', 1, 10),
        'max_features': trial.suggest_categorical('max_features', ['sqrt', 'log2']),
        'bootstrap': trial.suggest_categorical('bootstrap', [True, False]),
    }

    model = RandomForestClassifier(**params, random_state=42)
    scores = cross_val_score(model, X_train, y_train, cv=5, scoring='accuracy')
    return scores.mean()

# Create study and optimize
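# Seed the sampler so the search is reproducible (TPE is Optuna's default sampler)
sampler = optuna.samplers.TPESampler(seed=42)
study = optuna.create_study(direction='maximize', sampler=sampler)   # maximize accuracy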
study.optimize(objective, n_trials=100, show_progress_bar=True)

# Results
print(f"Best Score: {study.best_value:.4f}")
print(f"Best Params: {study.best_params}")
print(f"Trials completed: {len(study.trials)}")

# Train final model with best params
best_model = RandomForestClassifier(**study.best_params, random_state=42)
best_model.fit(X_train, y_train)
print(f"Test Score: {best_model.score(X_test, y_test):.4f}")

Optuna for XGBoost

import xgboost as xgb

def xgb_objective(trial):
    params = {
        'n_estimators': trial.suggest_int('n_estimators', 100, 1000),
        'max_depth': trial.suggest_int('max_depth', 3, 12),
        'learning_rate': trial.suggest_float('learning_rate', 0.001, 0.3, log=True),
        'subsample': trial.suggest_float('subsample', 0.5, 1.0),
        'colsample_bytree': trial.suggest_float('colsample_bytree', 0.5, 1.0),
        'min_child_weight': trial.suggest_int('min_child_weight', 1, 10),
        'gamma': trial.suggest_float('gamma', 0.0, 5.0),
        'reg_alpha': trial.suggest_float('reg_alpha', 1e-8, 10.0, log=True),
        'reg_lambda': trial.suggest_float('reg_lambda', 1e-8, 10.0, log=True),
    }

    # use_label_encoder is deprecated in recent XGBoost releases, so it is omitted here
    model = xgb.XGBClassifier(**params, eval_metric='logloss', random_state=42)
    scores = cross_val_score(model, X_train, y_train, cv=5, scoring='f1')
    return scores.mean()

study = optuna.create_study(direction='maximize')
study.optimize(xgb_objective, n_trials=100)

Optuna Visualization

# Optuna has built-in visualization
from optuna.visualization import (
    plot_optimization_history,
    plot_param_importances,
    plot_parallel_coordinate,
    plot_slice
)

# 1. How the score improved over trials
fig = plot_optimization_history(study)
fig.show()

# 2. Which hyperparameters matter most
fig = plot_param_importances(study)
fig.show()

# 3. Parallel coordinate plot (relationships between params)
fig = plot_parallel_coordinate(study)
fig.show()

# 4. Slice plot (each parameter vs score)
fig = plot_slice(study)
fig.show()

# The param_importances plot is gold for interviews:
# "learning_rate and max_depth had the highest impact on model performance.
#  min_child_weight barely mattered."

Key Hyperparameters by Model

Random Forest Hyperparameters

Hyperparameter	What It Controls	Default	Typical Range
`n_estimators`	Number of trees	100	100-1000
`max_depth`	Maximum tree depth	None (unlimited)	3-30
`min_samples_split`	Min samples to split a node	2	2-20
`min_samples_leaf`	Min samples in a leaf node	1	1-10
`max_features`	Features considered per split	sqrt	sqrt, log2, None
`bootstrap`	Sample with replacement?	True	True, False
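
Defaults can change between scikit-learn releases, so it is worth checking them against the installed version instead of trusting a table. A small sketch using `get_params()`:

```python
from sklearn.ensemble import RandomForestClassifier

tuned_names = ['n_estimators', 'max_depth', 'min_samples_split',
               'min_samples_leaf', 'max_features', 'bootstrap']

# get_params() returns every hyperparameter of the estimator, including untouched defaults
defaults = RandomForestClassifier().get_params()
for name in tuned_names:
    print(f"{name:>18}: {defaults[name]!r}")
```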

XGBoost Hyperparameters

Hyperparameter	What It Controls	Default	Typical Range
`learning_rate (eta)`	Step size per tree (lower = more trees needed)	0.3	0.001-0.3
`n_estimators`	Number of boosting rounds	100	100-1000
`max_depth`	Maximum tree depth	6	3-12
`subsample`	Row sampling ratio per tree	1.0	0.5-1.0
`colsample_bytree`	Column sampling ratio per tree	1.0	0.5-1.0
`min_child_weight`	Min sum of instance weight in a leaf	1	1-10
`gamma`	Min loss reduction to make a split	0	0-5
`reg_alpha (L1)`	L1 regularization	0	1e-8 to 10
`reg_lambda (L2)`	L2 regularization	1	1e-8 to 10

Logistic Regression Hyperparameters

Hyperparameter	What It Controls	Default	Typical Range
`C`	Inverse regularization strength (lower = stronger)	1.0	0.001-100
`penalty`	Regularization type	l2	l1, l2, elasticnet
`solver`	Optimization algorithm	lbfgs	lbfgs, liblinear, saga
`max_iter`	Maximum iterations	100	100-1000
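
Not every solver supports every penalty: `lbfgs` handles only L2 (or no penalty), `liblinear` handles L1 and L2, and `saga` handles all three. A single flat grid would therefore contain invalid combinations. One way around this, sketched here on a small synthetic dataset (an assumption for illustration), is to pass GridSearchCV a list of grids so each solver is paired only with penalties it supports.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Small synthetic dataset (illustrative assumption)
X_demo, y_demo = make_classification(n_samples=300, n_features=10, random_state=42)

C_values = [0.01, 0.1, 1, 10]

# A list of dicts keeps each solver paired only with penalties it supports
logreg_grid = [
    {'solver': ['lbfgs'], 'penalty': ['l2'], 'C': C_values},
    {'solver': ['liblinear'], 'penalty': ['l1', 'l2'], 'C': C_values},
]

logreg_search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    logreg_grid, cv=5, scoring='accuracy'
)
logreg_search.fit(X_demo, y_demo)

print(f"Candidates tried: {len(logreg_search.cv_results_['params'])}")
print(f"Best params: {logreg_search.best_params_}")
print(f"Best CV accuracy: {logreg_search.best_score_:.4f}")
```

The first grid contributes 4 candidates and the second 8, so the search evaluates 12 in total.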

Practical Tuning Workflow

Step 1: BASELINE
  Train model with default hyperparameters.
  Record baseline cross-validation score.
  This is your "bar to beat."

Step 2: COARSE SEARCH (RandomizedSearch with n_iter=50-100, or Optuna with n_trials=50-100)
  Define wide parameter ranges.
  Find the general "good region" of hyperparameter space.

Step 3: FINE SEARCH (GridSearch, narrow grid around Step 2 best)
  Narrow ranges to ±20% of Step 2 best values.
  Example: Step 2 found max_depth=12 → Grid: [10, 11, 12, 13, 14]

Step 4: EVALUATE ON HELD-OUT TEST SET
  Train final model with best params on ALL training data.
  Evaluate ONCE on the test set.
  This is your final reported score.

Step 5: CHECK FOR OVERFITTING
  Compare train score vs CV score.
  If train=0.99, CV=0.85 → overfitting → increase regularization.
  If train=0.80, CV=0.78 → healthy gap.

# Complete tuning workflow in code
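import optuna
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
# `objective` is the Optuna objective function defined in the "Optuna in Python" section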

# Hold out a final test set FIRST (never used during tuning)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Step 1: Baseline
baseline = RandomForestClassifier(random_state=42)
baseline_scores = cross_val_score(baseline, X_train, y_train, cv=5)
print(f"Baseline: {baseline_scores.mean():.4f}")

# Step 2: Coarse search with Optuna
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=100)
print(f"Optuna best: {study.best_value:.4f}")

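# Step 3: Fine-tune with GridSearch around Optuna best
best = study.best_params
fine_grid = {
    'n_estimators': [best['n_estimators'] - 50, best['n_estimators'], best['n_estimators'] + 50],
    'max_depth': [best['max_depth'] - 1, best['max_depth'], best['max_depth'] + 1],
    # A set removes the duplicate that appears when the best value is already 2
    'min_samples_split': sorted({max(2, best['min_samples_split'] - 1), best['min_samples_split'],
                                 best['min_samples_split'] + 1}),
}
# Keep the remaining Optuna values (min_samples_leaf, max_features, bootstrap) fixed;
# otherwise the grid would silently fall back to the library defaults for them
fixed = {k: v for k, v in best.items() if k not in fine_grid}
grid = GridSearchCV(RandomForestClassifier(**fixed, random_state=42), fine_grid, cv=5, n_jobs=-1)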
grid.fit(X_train, y_train)
print(f"Fine-tuned: {grid.best_score_:.4f}")

# Step 4: Final evaluation
final_model = grid.best_estimator_
print(f"Test Score: {final_model.score(X_test, y_test):.4f}")

Overfitting vs Underfitting During Tuning

Signal	Problem	Fix
Train=0.99, CV=0.75	Overfitting	Reduce max_depth, increase min_samples_split, add regularization
Train=0.70, CV=0.68	Underfitting	Increase n_estimators, increase max_depth, add more features
Train=0.88, CV=0.85	Good fit	No action needed; a 3-point gap is healthy
CV varies widely (0.60-0.90)	High variance	More data, simpler model, or more CV folds

# Check for overfitting: compare train vs CV scores
from sklearn.model_selection import cross_validate

results = cross_validate(best_model, X_train, y_train, cv=5,
                          scoring='accuracy', return_train_score=True)
print(f"Train: {results['train_score'].mean():.4f}")
print(f"CV:    {results['test_score'].mean():.4f}")
print(f"Gap:   {results['train_score'].mean() - results['test_score'].mean():.4f}")
# Rule of thumb: a gap above 0.10 suggests overfitting
# A gap below 0.03 suggests good generalization

Common Mistakes

Tuning on the test set — if you use the test set to compare hyperparameter combinations, you are leaking test information into training. Use cross-validation on the training set for tuning. Evaluate on the test set ONCE at the very end.
Starting with GridSearch on a large space: a grid with 6 parameters and 5 values each has 5⁶ = 15,625 combinations, which means 78,125 fits with 5 folds. Start with RandomizedSearch or Optuna to narrow the space first, then fine-tune with a small GridSearch.
Ignoring cross-validation variance — a model with CV scores [0.60, 0.90, 0.70, 0.85, 0.65] (mean=0.74, std=0.12) is unreliable despite a decent mean. High variance means the model is inconsistent. Look at both mean AND standard deviation.
Using regular K-Fold for imbalanced data — if 5% of your data is the positive class, a fold might have 0% positives. Use Stratified K-Fold for classification to preserve class ratios in each fold.
Not setting random_state — without it, results change every run and you cannot reproduce your best model. Always set random_state=42 (or any fixed number) in the model, CV splits, and train_test_split.
Tuning hyperparameters before fixing data quality — no amount of tuning compensates for missing values, leaky features, or incorrect labels. Clean and feature-engineer your data FIRST, then tune.
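
The variance example in the list above is quick to verify with the standard library. This sketch recomputes the mean and standard deviation of those five fold scores and prints their range.

```python
from statistics import mean, pstdev

cv_scores = [0.60, 0.90, 0.70, 0.85, 0.65]   # fold scores from the variance example

score_mean = mean(cv_scores)
score_std = pstdev(cv_scores)   # population std, the same convention as NumPy's default .std()

print(f"Mean: {score_mean:.2f}")
print(f"Std:  {score_std:.2f}")
print(f"Range: {min(cv_scores):.2f} to {max(cv_scores):.2f}")
```

Reporting the spread next to the mean makes it obvious when a decent average hides folds that performed poorly.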

Interview Questions

Q: What is the difference between GridSearchCV and RandomizedSearchCV? A: GridSearchCV tries every combination in the parameter grid (exhaustive but slow). RandomizedSearchCV samples a fixed number of random combinations (faster, supports continuous distributions). Use RandomizedSearch for initial exploration and GridSearch for final fine-tuning around the best region.

Q: What is cross-validation and why is it necessary for hyperparameter tuning? A: Cross-validation splits training data into K folds, trains on K-1 folds, and evaluates on the remaining fold — repeating K times. It gives a more reliable performance estimate than a single train/test split. Without it, you might overfit your hyperparameters to one specific data split.

Q: How does Bayesian optimization differ from grid/random search? A: GridSearch and RandomizedSearch try combinations independently — each trial ignores results of previous trials. Bayesian optimization (Optuna) learns from previous trials, building a probability model of which hyperparameter regions produce good scores. It intelligently focuses on promising regions, finding near-optimal values in far fewer trials.

Q: When would you use Stratified K-Fold instead of regular K-Fold? A: For classification tasks with imbalanced classes. Stratified K-Fold preserves the class distribution in each fold. If 5% of data is fraud, each fold has approximately 5% fraud. Regular K-Fold might create a fold with 0% fraud, giving a misleading evaluation score.

Q: How do you know if your model is overfitting during tuning? A: Compare the training score with the cross-validation score. If training accuracy is 99% but CV accuracy is 80%, the 19-point gap indicates overfitting. Fix by increasing regularization (lower max_depth, higher min_samples_split) or adding more training data. A healthy gap is typically under 5 points.

Q: Describe a practical hyperparameter tuning workflow. A: Step 1: Establish a baseline with default hyperparameters. Step 2: Coarse search using RandomizedSearch or Optuna (around 100 iterations or trials) with wide parameter ranges. Step 3: Fine-tune using GridSearch with narrow ranges around the best values found. Step 4: Evaluate the final model ONCE on the held-out test set. Step 5: Check for overfitting by comparing train vs CV scores.

Wrapping Up

Hyperparameter tuning is where models go from “good enough” to “production-grade.” Moving from default to tuned hyperparameters can add several percentage points of accuracy without any new data or features, though the size of the gain depends on the dataset and the model.

Start with RandomizedSearch or Optuna to explore the space broadly. Fine-tune with GridSearch. Always use cross-validation. And remember: the best hyperparameters in the world cannot fix bad data — clean and feature-engineer first, then tune.

Optuna is the modern choice for serious tuning — it is smarter (Bayesian), faster (pruning bad trials), and gives you built-in visualization of what matters most. If you are still using only GridSearch, try Optuna on your next project. You will never go back.

Naveen Vuppula is a Senior Data Engineering Consultant and app developer based in Ontario, Canada. He writes about Python, SQL, AWS, Azure, and everything data engineering at DriveDataScience.com.