Beginners Tutorial on XGBoost and Parameter Tuning in R

Author
Manish Saraswat
December 20, 2016

Introduction

Last week, we learned about the Random Forest algorithm. Now we know it helps us reduce a model's variance by building models on resampled data, thereby increasing its generalization capability. Good!

Now, you might be wondering what to do next to increase a model's prediction accuracy. After all, an ideal model is one which is good at both generalization and prediction accuracy. This brings us to Boosting Algorithms.

Developed in 1989, the family of boosting algorithms has been improved over the years. In this article, we'll learn about the XGBoost algorithm.

XGBoost is one of the most popular machine learning algorithms these days. Regardless of the task (regression or classification), it is well known to provide strong solutions, often better than other ML algorithms on structured data. In fact, since its inception (early 2014), it has become the "true love" of Kaggle users dealing with structured data. So, if you are planning to compete on Kaggle, xgboost is one algorithm you need to master.

In this article, you'll learn about core concepts of the XGBoost algorithm. In addition, we'll look into its practical side, i.e., improving the xgboost model using parameter tuning in R.

Table of Contents

  1. What is XGBoost? Why is it so good?
  2. How does XGBoost work?
  3. Understanding XGBoost Tuning Parameters
  4. Practical - Tuning XGBoost using R

What is XGBoost? Why is it so good?

XGBoost (Extreme Gradient Boosting) is an optimized distributed gradient boosting library. Yes, it uses the gradient boosting (GBM) framework at its core, yet it does better than the GBM framework alone. XGBoost was created by Tianqi Chen, a PhD student at the University of Washington. It is used for supervised ML problems. Let's look at what makes it so good:

  1. Parallel Computing: It is enabled with parallel processing (using OpenMP); i.e., when you run xgboost, by default, it would use all the cores of your laptop/machine.
  2. Regularization: I believe this is the biggest advantage of xgboost. GBM has no provision for regularization. Regularization is a technique used to avoid overfitting in linear and tree-based models.
  3. Enabled Cross Validation: In R, we usually use external packages such as caret and mlr to obtain CV results. But, xgboost is enabled with internal CV function (we'll see below).
  4. Missing Values: XGBoost is designed to handle missing values internally. The missing values are treated in such a manner that if there exists any trend in missing values, it is captured by the model.
  5. Flexibility: In addition to regression, classification, and ranking problems, it supports user-defined objective functions also. An objective function is used to measure the performance of the model given a certain set of parameters. Furthermore, it supports user defined evaluation metrics as well.
  6. Availability: Currently, it is available for programming languages such as R, Python, Java, Julia, and Scala.
  7. Save and Reload: XGBoost gives us a feature to save our data matrix and model and reload it later. Suppose, we have a large data set, we can simply save the model and use it in future instead of wasting time redoing the computation.
  8. Tree Pruning: Unlike GBM, where tree pruning stops once a negative loss is encountered, XGBoost grows the tree up to max_depth and then prunes backward until the improvement in the loss function is below a threshold.

I'm sure you are now excited to master this algorithm. But remember, with great power come great difficulties too. You might learn to use this algorithm in a few minutes, but optimizing it is a challenge. Don't worry, we shall look into it in the following sections.

How does XGBoost work?

XGBoost belongs to a family of boosting algorithms that convert weak learners into strong learners. A weak learner is one which is slightly better than random guessing. Let's understand boosting first (in general).

Boosting is a sequential process; i.e., trees are grown using the information from a previously grown tree one after the other. This process slowly learns from data and tries to improve its prediction in subsequent iterations. Let's look at a classic classification example:

[Figure: four boxes illustrating boosting, with three weak classifiers (Boxes 1-3) and their combination (Box 4)]

Four classifiers (in 4 boxes), shown above, are trying hard to classify + and - classes as homogeneously as possible. Let's understand this picture well.

  1. Box 1: The first classifier creates a vertical line (split) at D1. It says anything to the left of D1 is + and anything to the right of D1 is -. However, this classifier misclassifies three + points.
  2. Box 2: The next classifier says, "Don't worry, I will correct your mistakes." Therefore, it gives more weight to the three misclassified + points (see the bigger size of +) and creates a vertical line at D2. Again it says anything to the right of D2 is - and anything to the left is +. Still, it makes mistakes by incorrectly classifying three - points.
  3. Box 3: The next classifier continues to bestow support. Again, it gives more weight to the three misclassified - points and creates a horizontal line at D3. Still, this classifier fails to classify the points (in circle) correctly.
  4. Remember that each of these classifiers has a misclassification error associated with it.
  5. Boxes 1, 2, and 3 are weak classifiers. These classifiers will now be used to create a strong classifier, Box 4.
  6. Box 4: It is a weighted combination of the weak classifiers. As you can see, it does a good job of classifying all the points correctly.

That's the basic idea behind boosting algorithms. The very next model capitalizes on the misclassification/error of previous model and tries to reduce it. Now, let's come to XGBoost.
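
To make the sequential idea concrete, here is a minimal sketch in base R. It is a regression analogue of the picture above, not XGBoost's actual implementation: each round fits a one-split "stump" to the errors (residuals) left by the previous rounds. The data, the helper names fit_stump and predict_stump, and the learning rate of 0.3 are all assumptions made for illustration.

```r
# Toy boosting for regression: every round fits a one-split stump to the
# residuals of the current ensemble (illustrative only, not XGBoost itself).
set.seed(1)
x <- runif(200, 0, 10)
y <- sin(x) + rnorm(200, sd = 0.2)

# Find the single split on x that minimises squared error for target r
fit_stump <- function(x, r) {
  candidates <- sort(unique(x))
  best <- list(sse = Inf)
  for (s in candidates[-length(candidates)]) {  # drop max so both sides are non-empty
    left <- r[x <= s]
    right <- r[x > s]
    sse <- sum((left - mean(left))^2) + sum((right - mean(right))^2)
    if (sse < best$sse) {
      best <- list(split = s, left = mean(left), right = mean(right), sse = sse)
    }
  }
  best
}

predict_stump <- function(stump, x) {
  ifelse(x <= stump$split, stump$left, stump$right)
}

eta <- 0.3                        # learning rate (shrinkage), chosen by hand
pred <- rep(mean(y), length(y))   # start from a constant prediction
mse <- numeric(50)
for (m in 1:50) {
  residual <- y - pred                          # what is still unexplained
  stump <- fit_stump(x, residual)               # weak learner on the residuals
  pred <- pred + eta * predict_stump(stump, x)  # add a shrunken correction
  mse[m] <- mean((y - pred)^2)
}
round(mse[c(1, 10, 50)], 4)   # training error after 1, 10 and 50 rounds
```

No single stump fits the curve well, but the training error keeps falling as corrections accumulate, which is the weak-to-strong behaviour described above.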

As we know, XGBoost can be used to solve both regression and classification problems. It is enabled with separate methods to solve the respective problems. Let's see:

Classification Problems: To solve such problems, it typically uses the booster = gbtree parameter; i.e., trees are grown one after another, and each attempts to reduce the misclassification rate in subsequent iterations. The intuition is the same as in the picture above, with one difference: XGBoost does not literally reweight misclassified points. Each new tree is fitted to the gradient of the loss, so the points the current model predicts badly have the largest influence on the next tree.

Regression Problems: To solve such problems, we have two methods: booster = gbtree and booster = gblinear. You already know gbtree. With gblinear, it builds a generalized linear model and optimizes it using regularization (L1, L2) and gradient descent. Here, the subsequent models are built on the residuals (actual - predicted) generated by previous iterations. Are you wondering what gradient descent is? Understanding gradient descent requires math; however, let me try to explain it in simple words:

  • Gradient Descent: It is a method which starts from a vector of weights (or coefficients), calculates the partial derivative of the loss function with respect to each weight, and then moves the weights a small step in the opposite direction. Repeating this drives the weights toward a minimum of the loss function (here RSS, which is convex in nature, so its local minimum is also the global minimum). In simple words, gradient descent tries to optimize the loss function by adjusting the values of the coefficients to minimize the error.
[Figure: gradient descent on a convex function]
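
As a rough illustration of the gradient descent idea, the sketch below fits a straight line by repeatedly stepping against the gradient of the mean squared error and then compares the result with lm(). The simulated data, the step size of 0.1, and the 500 iterations are assumptions chosen for this toy case, not values used by xgboost.

```r
# Gradient descent for a linear model y = w0 + w1 * x, minimising mean squared error
set.seed(42)
n <- 100
x <- rnorm(n)
y <- 2 + 3 * x + rnorm(n, sd = 0.5)
X <- cbind(1, x)            # design matrix with an intercept column

w <- c(0, 0)                # initial weights
step <- 0.1                 # step size (learning rate), chosen by hand
for (i in 1:500) {
  residual <- y - X %*% w
  grad <- -2 * t(X) %*% residual / n   # partial derivatives of the loss w.r.t. each weight
  w <- w - step * grad                 # move against the gradient
}
w <- as.numeric(w)

# Compare with the closed-form least squares solution
rbind(gradient_descent = w, lm = unname(coef(lm(y ~ x))))
```

Because the loss is convex, the iterations settle at the same coefficients that least squares finds directly.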

Hopefully, by now you have developed a basic intuition about how boosting and xgboost work. Let's proceed to understand its parameters. After all, using xgboost without parameter tuning is like driving a car without changing its gears; you can never up your speed.

Note: In R, xgboost package uses a matrix of input data instead of a data frame.

Understanding XGBoost Tuning Parameters

Every parameter has a significant role to play in the model's performance. Before tuning, let's first understand these parameters and their importance. In this article, I've only explained the most frequently used and tunable parameters. To look at all the parameters, you can refer to the official documentation.

XGBoost parameters can be divided into three categories (as suggested by its authors):
  • General Parameters: Control the booster type in the model, which eventually drives overall functioning
  • Booster Parameters: Control the performance of the selected booster
  • Learning Task Parameters: Set and evaluate the learning process of the booster from the given data

  1. General Parameters
    1. booster[default=gbtree]
      • Sets the booster type (gbtree, gblinear or dart) to use. Tree boosters (gbtree, dart) are the usual choice for classification, but all three can be used for both classification and regression.
    2. nthread[default=maximum cores available]
      • Activates parallel computation. Generally, people don't change it as using maximum cores leads to the fastest computation.
    3. silent[default=0]
      • If you set it to 1, running messages are suppressed; with the default of 0, they are printed to your R console. Better not to change it.

  2. Booster Parameters

    As mentioned above, parameters for tree and linear boosters are different. Let's understand each one of them:

    Parameters for Tree Booster

    1. nrounds[default=100]
      • It controls the maximum number of iterations. For classification, it is similar to the number of trees to grow.
      • Should be tuned using CV
    2. eta[default=0.3][range: [0,1]]
      • It controls the learning rate, i.e., the rate at which our model learns patterns in data. After every round, it shrinks the feature weights to reach the best optimum.
      • Lower eta leads to slower computation. It must be supported by an increase in nrounds.
      • Typically, it lies between 0.01 and 0.3
    3. gamma[default=0][range: [0,Inf)]
      • It controls regularization (or prevents overfitting) by setting the minimum loss reduction required to split a node further. The optimal value of gamma depends on the data set and other parameter values.
      • Higher the value, higher the regularization, because fewer candidate splits clear the threshold. default = 0 means no such penalty on splits.
      • Tune trick: Start with 0 and check the CV error rate. If you see train error <<< test error (i.e., the model is overfitting), bring gamma into action. Higher the gamma, lower the difference between train and test CV error. If you have no clue what value to use, use gamma=5 and see the performance. Remember that gamma brings improvement when you want to use shallow (low max_depth) trees.
    4. max_depth[default=6][range: (0,Inf)]
      • It controls the depth of the tree.
      • Larger the depth, more complex the model; higher chances of overfitting. There is no standard value for max_depth. Larger data sets require deep trees to learn the rules from data.
      • Should be tuned using CV
    5. min_child_weight[default=1][range:(0,Inf)]
      • In regression, it refers to the minimum number of instances required in a child node. In classification, if the leaf node has a minimum sum of instance weight (calculated by second order partial derivative) lower than min_child_weight, the tree splitting stops.
      • In simple words, it blocks the potential feature interactions to prevent overfitting. Should be tuned using CV.
    6. subsample[default=1][range: (0,1]]
      • It controls the fraction of samples (observations) supplied to a tree.
      • Typically, its values lie between 0.5 and 0.8
    7. colsample_bytree[default=1][range: (0,1]]
      • It controls the fraction of features (variables) supplied to a tree.
      • Typically, its values lie between 0.5 and 0.9
    8. lambda[default=1]
      • It controls L2 regularization (equivalent to Ridge regression) on weights. It is used to avoid overfitting.
    9. alpha[default=0]
      • It controls L1 regularization (equivalent to Lasso regression) on weights. In addition to shrinkage, enabling alpha also results in feature selection. Hence, it's more useful on high dimensional data sets.

    Parameters for Linear Booster

    The linear booster has relatively fewer parameters to tune, hence it computes much faster than the gbtree booster.
    1. nrounds[default=100]
      • It controls the maximum number of iterations (steps) required for gradient descent to converge.
      • Should be tuned using CV
    2. lambda[default=0]
      • It enables Ridge Regression. Same as above
    3. alpha[default=0]
      • It enables Lasso Regression. Same as above

  3. Learning Task Parameters

    These parameters specify methods for the loss function and model evaluation. In addition to the parameters listed below, you are free to use a customized objective / evaluation function.

    1. objective[default=reg:linear]
      • reg:linear - for linear regression (renamed reg:squarederror in later xgboost releases)
      • binary:logistic - logistic regression for binary classification. It returns class probabilities
      • multi:softmax - multiclassification using softmax objective. It returns predicted class labels. It requires setting num_class parameter denoting number of unique prediction classes.
      • multi:softprob - multiclassification using softmax objective. It returns predicted class probabilities.
    2. eval_metric [no default, depends on objective selected]
      • These metrics are used to evaluate a model's accuracy on validation data. For regression, default metric is RMSE. For classification, default metric is error.
      • Available error functions are as follows:
        • mae - Mean Absolute Error (used in regression)
        • logloss - Negative loglikelihood (used in classification)
        • auc - Area under curve (used in classification)
        • rmse - Root mean square error (used in regression)
        • error - Binary classification error rate [#wrong cases/#all cases]
        • mlogloss - multiclass logloss (used in classification)

We've looked at how xgboost works, the significance of each of its tuning parameters, and how each affects the model's performance. Let's bolster our newly acquired knowledge by solving a practical problem in R.

Practical - Tuning XGBoost in R

In this practical section, we'll learn to tune xgboost in two ways: using the xgboost package and MLR package. I don't see the xgboost R package having any inbuilt feature for doing grid/random search. To overcome this bottleneck, we'll use MLR to perform the extensive parametric search and try to obtain optimal accuracy.

I'll use the adult data set from my previous random forest tutorial. This data set poses a classification problem where our job is to predict if the given user will have a salary <=50K or >50K.

Using random forest, we achieved an accuracy of 85.8%. Theoretically, xgboost should be able to surpass random forest's accuracy. Let's see if we can do it. I'll follow the most common but effective steps in parameter tuning:

  1. First, you build the xgboost model using default parameters. You might be surprised to see that default parameters sometimes give impressive accuracy.
  2. If you get a depressing model accuracy, do this: fix eta = 0.1, leave the rest of the parameters at their default values, and use the xgb.cv function to get the best nrounds. Now, build a model with these parameters and check the accuracy.
  3. If the accuracy is still not satisfactory, perform a grid search on the rest of the parameters (max_depth, gamma, subsample, colsample_bytree etc.) while keeping eta and nrounds fixed. Note: If using gbtree, don't introduce gamma until you see a significant difference in your train and test error.
  4. Using the best parameters from grid search, tune the regularization parameters (alpha, lambda) if required.
  5. At last, increase/decrease eta and repeat the procedure. But remember, excessively low eta values would allow the model to learn deep interactions in the data and, in the process, it might capture noise. So be careful!

This process might sound a bit complicated, but it's quite easy to code in R. Don't worry, I've demonstrated all the steps below. Let's get into action now and quickly prepare our data for modeling (if you don't understand any line of code, ask me in the comments):

# set working directory
path <- "~/December 2016/XGBoost_Tutorial"
setwd(path)

# load libraries
library(data.table)
library(mlr)
library(xgboost)  # needed for xgb.DMatrix, xgb.cv and xgb.train below

# set variable names
setcol <- c("age",
            "workclass",
            "fnlwgt",
            "education",
            "education-num",
            "marital-status",
            "occupation",
            "relationship",
            "race",
            "sex",
            "capital-gain",
            "capital-loss",
            "hours-per-week",
            "native-country",
            "target")

# load data
train <- read.table("adultdata.txt", header = FALSE, sep = ",",
                    col.names = setcol, na.strings = c(" ?"),
                    stringsAsFactors = FALSE)
test <- read.table("adulttest.txt", header = FALSE, sep = ",",
                   col.names = setcol, skip = 1,
                   na.strings = c(" ?"), stringsAsFactors = FALSE)

# convert data frame to data table
setDT(train)
setDT(test)

# check missing values
table(is.na(train))
sapply(train, function(x) sum(is.na(x)) / length(x)) * 100
table(is.na(test))
sapply(test, function(x) sum(is.na(x)) / length(x)) * 100

# quick data cleaning
# remove extra character from target variable
library(stringr)
test[, target := substr(target, start = 1, stop = nchar(target) - 1)]

# remove leading whitespaces
char_col <- colnames(train)[sapply(train, is.character)]
for (i in char_col) set(train, j = i, value = str_trim(train[[i]], side = "left"))
for (i in char_col) set(test, j = i, value = str_trim(test[[i]], side = "left"))

# set all missing value as "Missing"
train[is.na(train)] <- "Missing"
test[is.na(test)] <- "Missing"

Up to this point, we dealt with basic data cleaning and data inconsistencies. To use xgboost package, keep these things in mind:

  1. Convert the categorical variables into numeric using one hot encoding
  2. For classification, if the dependent variable belongs to class factor, convert it to numeric

R's base function model.matrix is quick enough to implement one hot encoding. In the code below, ~.+0 leads to encoding of all categorical variables without producing an intercept. Alternatively, you can use the dummies package to accomplish the same task. Since xgboost package accepts target variable separately, we'll do the encoding keeping this in mind:

# using one hot encoding
labels <- train$target
ts_label <- test$target
new_tr <- model.matrix(~.+0, data = train[, -c("target"), with = FALSE])
new_ts <- model.matrix(~.+0, data = test[, -c("target"), with = FALSE])

# convert the target to numeric 0/1; it was read as character, so go through
# factor codes first (as.numeric() on the raw strings would return NA)
labels <- as.numeric(as.factor(labels)) - 1
ts_label <- as.numeric(as.factor(ts_label)) - 1

For xgboost, we'll use xgb.DMatrix to convert the encoded matrices into xgboost's own data structure (most recommended):

# preparing matrix
dtrain <- xgb.DMatrix(data = new_tr, label = labels)
dtest <- xgb.DMatrix(data = new_ts, label = ts_label)
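
To see what the model.matrix step with ~.+0 actually produces, here is a small self-contained sketch on a made-up data frame (the values are invented; only the column names echo the adult data). One detail worth checking on your own data: without an intercept, base R gives the first factor one column per level, while later factors get one column fewer.

```r
# Toy data frame with one numeric and two categorical columns
toy <- data.frame(
  age = c(25, 40, 31),
  sex = factor(c("Male", "Female", "Male")),
  workclass = factor(c("Private", "State-gov", "Private"))
)

# ~ . + 0 encodes all columns and drops the intercept
encoded <- model.matrix(~ . + 0, data = toy)
colnames(encoded)
# "age" "sexFemale" "sexMale" "workclassState-gov"
encoded
```

The numeric column passes through unchanged, and the result is a plain numeric matrix, which is the form xgb.DMatrix expects.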

As mentioned above, we'll first build our model using default parameters, keeping random forest's accuracy 85.8% in mind. I'll capture the default parameters from above (written against every parameter):

# default parameters
params <- list(
    booster = "gbtree",
    objective = "binary:logistic",
    eta = 0.3,
    gamma = 0,
    max_depth = 6,
    min_child_weight = 1,
    subsample = 1,
    colsample_bytree = 1
)

Using the inbuilt xgb.cv function, let's calculate the best nround for this model. In addition, this function also returns CV error, which is an estimate of test error.

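xgbcv <- xgb.cv(
    params = params,
    data = dtrain,
    nrounds = 100,
    nfold = 5,
    showsd = TRUE,
    stratified = TRUE,
    metrics = "error",            # report classification error explicitly
    print_every_n = 10,           # older releases: print.every.n
    early_stopping_rounds = 20,   # older releases: early.stop.round
    maximize = FALSE
)
# best iteration = 79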

The model returned lowest error at the 79th (nround) iteration. Also, if you noticed the running messages in your console, you would have understood that train and test error are following each other. We'll use this insight in the following code. Now, we'll see our CV error:

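# assumes the CV run reported the "error" metric
# (older releases exposed this as xgbcv$test.error.mean)
min(xgbcv$evaluation_log$test_error_mean)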
# 0.1263

As compared to my previous random forest model, this CV accuracy (100-12.63)=87.37% looks better already. However, I believe cross-validation accuracy is usually more optimistic than true test accuracy. Let's calculate our test set accuracy and determine if this default model makes sense:

# first default - model training
xgb1 <- xgb.train(
    params = params,
    data = dtrain,
    nrounds = 79,
    watchlist = list(val = dtest, train = dtrain),
    print_every_n = 10,           # older releases: print.every.n
    early_stopping_rounds = 10,   # older releases: early.stop.round
    maximize = FALSE,
    eval_metric = "error"
)

# model prediction
xgbpred <- predict(xgb1, dtest)
xgbpred <- ifelse(xgbpred > 0.5, 1, 0)

The objective function binary:logistic returns predicted probabilities rather than class labels. To convert them, we need to manually apply a cutoff value. As seen above, I've used 0.5 as my cutoff value for predictions. We can calculate our model's accuracy using the confusionMatrix() function from the caret package.
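
As a small illustration of the cutoff step, the sketch below turns made-up probabilities into labels, builds a confusion table with base R's table(), and shows how accuracy shifts with the cutoff. The probabilities and labels are invented for this example and are not output from the model above.

```r
# Made-up predicted probabilities and true labels
prob   <- c(0.91, 0.15, 0.62, 0.48, 0.05, 0.77)
actual <- c(1, 0, 0, 1, 0, 1)

# Apply a 0.5 cutoff and cross-tabulate predictions against the truth
cutoff <- 0.5
predicted <- ifelse(prob > cutoff, 1, 0)
cm <- table(predicted = predicted, actual = actual)
cm

# Accuracy is the share of cases on the diagonal of the confusion table
accuracy <- sum(diag(cm)) / sum(cm)   # 4 of 6 correct

# The same probabilities scored at different cutoffs
cutoffs <- c(0.3, 0.5, 0.7)
acc_by_cutoff <- sapply(cutoffs, function(k) mean(ifelse(prob > k, 1, 0) == actual))
names(acc_by_cutoff) <- cutoffs
acc_by_cutoff
```

On these toy values 0.5 is not the best cutoff, which is a reminder that the threshold is itself a choice worth validating.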

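# confusion matrix
library(caret)
# recent caret versions expect factors with identical levels
confusionMatrix(factor(xgbpred, levels = c(0, 1)),
                factor(ts_label, levels = c(0, 1)))
# Accuracy - 86.54%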

# view variable importance plot
mat <- xgb.importance(feature_names = colnames(new_tr), model = xgb1)
xgb.plot.importance(importance_matrix = mat[1:20])  # first 20 variables

[Figure: xgboost variable importance plot]

As you can see, we've achieved better accuracy than a random forest model using default parameters in xgboost. Can we still improve it? Let's proceed to the random / grid search procedure and attempt to find better accuracy. From here on, we'll be using the MLR package for model building. A quick reminder: the MLR package creates its own data structures for the data (task) and the learner, as shown below. Also, keep in mind that task functions in mlr don't accept character variables. Hence, we need to convert them to factors before creating the task:

# convert characters to factors
fact_col <- colnames(train)[sapply(train, is.character)]
for (i in fact_col) set(train, j = i, value = factor(train[[i]]))
for (i in fact_col) set(test, j = i, value = factor(test[[i]]))

# create tasks
traintask <- makeClassifTask(data = train, target = "target")
testtask <- makeClassifTask(data = test, target = "target")

# do one hot encoding
traintask <- createDummyFeatures(obj = traintask, target = "target")
testtask <- createDummyFeatures(obj = testtask, target = "target")

Now, we'll set the learner and fix the number of rounds and eta as discussed above.


# create learner
lrn <- makeLearner("classif.xgboost", predict.type = "response")
lrn$par.vals <- list(
    objective = "binary:logistic",
    eval_metric = "error",
    nrounds = 100L,
    eta = 0.1
)

# set parameter space
params <- makeParamSet(
    makeDiscreteParam("booster", values = c("gbtree", "gblinear")),
    makeIntegerParam("max_depth", lower = 3L, upper = 10L),
    makeNumericParam("min_child_weight", lower = 1L, upper = 10L),
    makeNumericParam("subsample", lower = 0.5, upper = 1),
    makeNumericParam("colsample_bytree", lower = 0.5, upper = 1)
)

# set resampling strategy
rdesc <- makeResampleDesc("CV", stratify = TRUE, iters = 5L)

With stratify = TRUE, we'll ensure that the distribution of the target class is maintained in the resampled data sets. If you've noticed above, in the parameter set, I didn't consider gamma for tuning, simply because during cross validation we saw that train and test error are in sync with each other. Had either one of them been dragging or rushing, we could have brought this parameter into action.
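
To show what stratification does, here is a minimal base R sketch that assigns fold numbers separately within each class, so every fold keeps the overall class ratio. It illustrates the idea only and is not how mlr implements it; the 75/25 class split is an assumption for this example.

```r
# Made-up target with a 75/25 class imbalance
set.seed(7)
target <- factor(c(rep("<=50K", 75), rep(">50K", 25)))
k <- 5

# Assign fold ids within each class so every fold keeps the class ratio
fold <- integer(length(target))
for (cls in levels(target)) {
  idx <- which(target == cls)
  fold[idx] <- sample(rep_len(1:k, length(idx)))
}

# Class proportions inside each fold
fold_props <- prop.table(table(fold, target), margin = 1)
fold_props
```

With plain random folds, a minority class can end up over- or under-represented in some folds, which makes the per-fold error estimates noisier.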

Now, we'll set the search optimization strategy. Although xgboost is fast, we'll use random search instead of grid search to find the best parameters.
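
The idea behind random search can be sketched in a few lines of base R: draw a handful of parameter combinations at random from the ranges defined in the parameter set above, score each one, and keep the best. This is only an illustration of the strategy, not mlr's tuning code. The helper random_search and the scoring function toy_evaluate are hypothetical; toy_evaluate is a stand-in, where a real evaluator would run 5-fold CV and return accuracy.

```r
# Draw random candidates from the same ranges as the parameter set above
set.seed(2016)
n_candidates <- 10
candidates <- data.frame(
  max_depth        = sample(3:10, n_candidates, replace = TRUE),
  min_child_weight = runif(n_candidates, 1, 10),
  subsample        = runif(n_candidates, 0.5, 1),
  colsample_bytree = runif(n_candidates, 0.5, 1)
)

# Score every candidate with the supplied function and return the best row
random_search <- function(candidates, evaluate) {
  scores <- vapply(seq_len(nrow(candidates)),
                   function(i) evaluate(candidates[i, ]), numeric(1))
  candidates[which.max(scores), ]
}

# Hypothetical stand-in scorer, for illustration only: a real one would
# train with these parameters under cross validation and return accuracy
toy_evaluate <- function(p) -abs(p$max_depth - 6) - abs(p$subsample - 0.8)

best <- random_search(candidates, toy_evaluate)
best
```

Unlike a grid, the number of model fits here is fixed by n_candidates rather than growing with every extra parameter, which is why random search stays affordable as the parameter space widens.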

Subscribe to The HackerEarth Blog

Get expert tips, hacks, and how-tos from the world of tech recruiting to stay on top of your hiring!

Author
Manish Saraswat
Calendar Icon
December 20, 2016
Timer Icon
3 min read
Share

Hire top tech talent with our recruitment platform

Access Free Demo
Related reads

Discover more articles

Gain insights to optimize your developer recruitment process.

What AI Is Forcing HR to Rethink About Hiring

What AI is forcing HR to rethink

For recruiters and talent leaders, AI has made one thing clear: resumes can no longer be trusted as the primary signal of candidate capability. What AI is forcing HR to rethink is the entire screening stack — from how reqs are written, to how the ATS filters applicants, to how quality of hire (QoH) is measured against time-to-fill. According to LinkedIn's Future of Recruiting 2024 report, 73% of recruiters say skills-based hiring is a priority, yet most pipelines still screen on degree and employer brand at the ATS layer. That gap is where the rethink begins.

Why traditional resumes no longer predict strong hires

Resumes measure presentation more reliably than capability. Recruiters have long used job titles, company names, degrees, and years of experience as proxies for performance, but generative AI tools — ChatGPT, Teal, Rezi, and Kickresume among them — have collapsed the cost of producing a polished application. The World Economic Forum's Future of Jobs Report 2023 found that 44% of workers' core skills are expected to change by 2027, which means a resume snapshot ages faster than the role it describes.

For recruiters, the operational impact is direct: pipelines fill, screen rates rise, and yet QoH stays flat. As AI becomes more deeply embedded in hiring, HR leaders are being forced to rethink a single question:

What if resumes are no longer the best predictor of performance?

That question is reshaping recruitment faster than many organizations expected — though, as discussed later, the shift away from resumes carries its own trade-offs.

Share of Workers' Core Skills Expected to Change by 2027
Source: World Economic Forum Future of Jobs Report 2023

The resume was built for a different era

Modern work no longer fits the resume's static format. Skills evolve in months rather than years, roles overlap across functions, and professionals build expertise through online communities, freelance projects, bootcamps, and self-directed learning. According to SHRM's 2024 Talent Trends research, nearly half of HR leaders report that candidates from non-traditional backgrounds are increasingly competitive on assessments.

Resumes still reduce people to standardized timelines, and many capable candidates are filtered out by ATS rules simply because they lack the "right" employer logos. At the same time, candidates skilled in resume optimization can outperform genuinely capable professionals at the screen stage — a pattern that pre-dates AI but has been amplified by it.

It has become far easier for candidates to generate polished resumes, cover letters, and interview responses in minutes. For recruiters, the takeaway is practical: formatting and phrasing are no longer reliable proxies for capability.

AI did not break hiring — it exposed existing problems

AI did not create the resume problem; it surfaced one already present in most hiring funnels. Surveys of recruiters, including Gartner's 2024 HR research, have consistently shown three pre-AI pressures: recruiters overwhelmed by application volume, candidates optimizing resumes to pass ATS filters, and hiring managers reporting weak outcomes despite reviewing seemingly strong resumes.

AI accelerated these problems to a point where they can no longer be ignored. Many candidates can now generate a highly optimized application in seconds, and recruiters increasingly struggle to distinguish between candidates skilled at self-presentation and those who can actually do the work.

The operational shift is moving from:

"What does your resume say?"

Toward:

"Can you actually do the job?"

The rise of skills-based hiring

Skills-based hiring outperforms resume screening because it measures demonstrated capability rather than credential proximity. A growing number of organizations — including IBM, Accenture, and Delta, profiled in LinkedIn's Skills Path program — are moving toward skills-first models that prioritize practical assessments, simulations, project work, and role-specific problem-solving over employer brand or degree.

This trend is most visible in technology hiring, where coding assessments and real-world technical evaluations generally provide stronger signals than resumes alone, particularly when compared against resume-only screens for time-to-productivity. HackerEarth has run over 100 million developer assessments across enterprise hiring programs, and the consistent pattern in that dataset is that demonstrated coding performance correlates more closely with on-the-job output than degree or prior employer.

Beyond tech, a growing number of organizations are extending the model: marketing teams using campaign-brief exercises, sales teams using recorded customer-handling scenarios, and operations teams using situational judgment tests. For a deeper view of how this maps to specific roles, see our skills-based hiring guide and developer assessment platform.

Where skills-based hiring breaks down

Skills-based hiring is not without trade-offs, and recruiters evaluating it should plan for known failure modes:

  • Assessment bias. Poorly designed assessments can disadvantage career returners, caregivers, and candidates with limited test-taking time as severely as resume screens disadvantage non-traditional backgrounds.
  • Gaming of take-home tests. Unproctored coding or case exercises are increasingly solvable with generative AI, which means assessment design has to evolve in step with candidate tooling.
  • Candidate experience at scale. Long assessment batteries lower completion rates and damage employer brand, particularly for senior candidates who have multiple offers in play.
  • Legal exposure. In jurisdictions including New York City (Local Law 144) and under the EU AI Act, automated employment decision tools are subject to bias audits and disclosure requirements. Recruiters should confirm vendor compliance before deploying AI-driven scoring.

The honest read: most organizations announcing a "shift" to skills-based hiring still filter by degree at the ATS layer. The shift is real, but it is uneven.

Skills-Based Hiring Priority vs. ATS Screening Reality
Source: LinkedIn Future of Recruiting 2024; ATS screening figure illustrative based on article claims

Why HR leaders are rethinking potential

Potential is becoming more measurable in ways resumes never allowed. Traditional hiring often prioritized pedigree — familiar universities, recognizable employers, conventional career paths — but AI-powered assessment platforms (HackerEarth, HireVue, Pymetrics, Codility, and Workday Skills Cloud among them) score candidates on demonstrated performance against role-specific tasks, calibrated to a benchmark population.

These tools typically combine task-based evaluations, behavioral simulations, and structured scoring rubrics. Their limits matter too: they score what they are trained to score, they can encode bias from the training population, and they do not measure long-arc traits like cultural contribution or leadership trajectory. Recruiters should treat them as one signal in a structured interview loop, not a single decision point.

Research suggests that candidates without elite degrees frequently match or outperform credentialed peers on standardized technical assessments. In many cases, career switchers and self-taught professionals demonstrate strong adaptability and practical skill. Organizations that shift toward capability-based evaluation may gain access to broader and more diverse talent pools — though, as noted above, only if assessment design itself is audited for fairness.

The recruiter's role is changing

AI is not replacing recruiters; it is shifting where recruiters spend their time. Traditional recruitment rewarded screening volume and speed. Modern hiring increasingly rewards judgment, stakeholder alignment, and structured decision-making.

As automation handles sourcing, scheduling, resume parsing, and initial outreach, recruiters are spending more time on work AI cannot do well:

  • Probing candidate motivation through structured behavioral interviews
  • Evaluating adaptability against specific role demands using scorecards
  • Building hiring-manager alignment on the req and intake brief
  • Designing candidate-experience touchpoints that protect offer-accept rates
  • Calibrating assessment results against on-the-job performance data

The recruiter who succeeds in an AI-heavy pipeline is the one who can interpret signal, not the one who can scan resumes faster.

Candidates are changing faster than hiring systems

Modern career paths now move faster than most ATS configurations. Today's workforce values flexibility, creativity, continuous learning, and project-based growth, and many professionals build experience through freelance work, startups, creator platforms, and side projects. Their resumes often look unconventional, but unconventional no longer equates to unqualified.

Organizations that shift toward capability-based evaluation may access talent pools that rigid resume filters would otherwise miss. For practical guidance on adjusting screening criteria, see our guide to evaluating an ATS for skills-based hiring.

The future of hiring will feel more human

There is an irony in the AI shift: as resumes become easier to automate, organizations are being pushed to evaluate creativity, adaptability, collaboration, and real-world problem-solving more directly. The likely structure of mature AI-enabled hiring is AI handling repetitive tasks — sourcing, scheduling, parsing, initial scoring — while recruiters and hiring managers focus on nuance, context, and long-term fit.

FAQ

Is skills-based hiring more effective than resume screening?

Skills-based hiring tends to predict on-the-job performance more reliably than resume screening for roles where the work can be assessed directly, such as engineering, data, sales, and marketing execution. According to LinkedIn's Future of Recruiting report, 73% of recruiters now prioritize skills-based approaches. Effectiveness depends heavily on assessment design and on whether downstream ATS filters still gate candidates by degree.

What HR processes is AI changing first?

AI is changing sourcing, resume parsing, candidate matching, and initial assessment scoring first, because these are high-volume, rules-based tasks. Structured interviewing, offer negotiation, and onboarding remain primarily human-led, though AI-assisted note-taking and scorecard analysis are growing.

Will AI replace recruiters?

AI is unlikely to replace recruiters, but it is changing the skill profile. Recruiters who can interpret assessment data, align hiring managers, and design candidate experience will be more valuable; recruiters whose role is primarily resume scanning are most exposed.

How do I evaluate an AI hiring tool for bias?

Ask the vendor for a bias audit report (required under NYC Local Law 144 for automated employment decision tools), the demographic composition of the training data, the validation methodology against job performance, and the appeal process for candidates. Avoid tools that cannot answer all four.

Is resume-based hiring going away?

Resume-based hiring is under pressure but not disappearing. Most organizations are moving toward hybrid models where resumes provide context and assessments provide the capability signal. A full move away from resumes is unlikely in the next hiring cycle for most enterprises.

What is the biggest risk of switching to skills-based hiring?

The biggest risk is poorly designed assessments that introduce new forms of bias or damage candidate experience. A skills-based process built on a long, unproctored, untested assessment battery will perform worse than a structured resume screen.

Next steps: See it in action

If you are a recruiter or talent leader evaluating how to move from resume-led to skills-led screening, book a demo of HackerEarth Assessments to see how role-specific evaluations, proctoring, and benchmarked scoring fit into an existing ATS pipeline. For background reading, see our developer assessment platform overview and the HackerEarth recruiter blog.

Recruiters who pair structured assessment data with strong human judgment build better pipelines than either resumes or AI alone can produce.

Must-Know Recruitment Questions for HR and Talent Acquisition Teams (2026)

Recruitment questions every HR professional should know in 2025

Estimated read time: 7 minutes

Many "tell me about yourself" answers are now drafted with ChatGPT the night before the interview. That shift, with candidates arriving with rehearsed, AI-polished narratives, has undermined the standard interview script and pushed recruiters to redesign their question sets. This guide outlines the categories of recruitment questions every HR professional should know in 2025, why each matters, and example questions you can adapt to your hiring rubric or scorecard today.

LinkedIn's 2024 Global Talent Trends report notes that skills-based hiring and behavioral assessment have moved from optional to expected in most talent acquisition workflows. Yet many hiring conversations still rely on outdated prompts that produce polished answers and unclear signals. The recruiter persona — the one running req intake, pipeline reviews, and screen calls — needs a tighter toolkit.

Who this is for: This article is written for recruiters and talent acquisition partners running structured interviews. Hiring managers building a scorecard alongside the recruiter will also find the question categories useful.

Figure: Adoption of Structured Hiring Practices Among HR Teams (2020–2025). Source: LinkedIn Global Talent Trends claims cited in article.

Why outdated recruitment questions fail

Industry observers at SHRM have noted that candidates are better prepared, interviews are more structured, and expectations on both sides have risen (SHRM research). With generative AI tools widely available, many candidates now enter screens with refined, rehearsed narratives.

The result is predictable — polished answers, unclear signals, and decisions made on incomplete understanding. The quality of the recruitment questions you bring into the room directly defines the quality of the signal you capture on the scorecard.

A contestable position worth stating plainly: behavioral interview frameworks like STAR are now overused to the point where candidates have memorized the structure, which reduces signal quality unless interviewers probe past the rehearsed answer with follow-ups.

What this article won't claim

Structured behavioral interviewing is not a silver bullet. Over-indexing on adaptability can screen out deep specialists whose value is stability and depth. Ownership-mindset framing, if applied rigidly, can disadvantage neurodivergent candidates or those from cultures where collective credit is the norm. Use the questions below as part of a balanced rubric — not as a single filter.

From "tell me about yourself" to understanding real intent

Traditional opening questions rarely reveal a candidate's intent or direction. A stronger opening probes why a candidate is moving at this specific point and what kind of work keeps them engaged beyond compensation.

Evidence from Gallup's 2023 State of the Global Workplace report suggests today's workforce is increasingly motivated by alignment, learning, and perceived growth — not stability alone. If this layer is missed early in the interview, the rest of the evaluation becomes less reliable.

Example intent and motivation questions

  • "Walk me through the last time you decided to leave a role. What specifically triggered the decision?"
  • "What kind of work has made you lose track of time in the last 12 months?"
  • "If this role didn't exist, what would your second-choice next move be — and why?"
  • "What would need to be true 18 months from now for you to consider this move a success?"

What to listen for

  • Specific triggers and trade-offs, not generic phrases like "growth" or "new challenges."
  • Consistency between the stated motivation and the candidate's actual career pattern.

Red flags

  • Answers that match the job description back to you almost verbatim.
  • Vague language about "culture" or "growth" with no concrete example.

Behavioral and competency-based recruitment questions: getting past scripted answers

One of the biggest challenges recruiters face today is not a lack of talent, but over-prepared talent. Hiring practitioners increasingly find that well-structured, confident answers do not always reflect real capability, especially when responses are shaped by preparation tools or rehearsed narratives.

This is why competency-based questions — which explore decision-making logic, trade-offs, and real-time reasoning — produce higher signal than story-based prompts alone. For technical roles, pairing these with a practical assessment helps confirm what the interview surfaces. HackerEarth's skill assessments use role-specific question libraries and rubric-based scoring so the recruiter can compare candidate outputs against a defined standard, rather than relying on the candidate's own narrative of their capability.

Example behavioral and competency-based questions

  1. "Tell me about a decision you made in the last six months that you would make differently today. What changed your thinking?"
  2. "Describe a time you disagreed with your manager on a priority. How did you handle it?"
  3. "Walk me through a project where the scope changed mid-execution. What did you cut, and why?"
  4. "Give me an example of feedback you initially rejected but later acted on."

How to probe past the rehearsed answer

If a candidate delivers a clean STAR-format response, follow up with: "What's one detail you usually leave out of that story?" or "Who would tell that story differently?" These prompts disrupt the rehearsed structure and surface the actual reasoning.

Situational judgment and adaptability questions

Workplaces are shaped by continuous change — shifting priorities, evolving tools, and hybrid collaboration. Many hiring teams now treat adaptability as a core hiring parameter rather than a soft skill, particularly for roles where ambiguity is the default state.

Situational judgment questions present a realistic scenario and ask the candidate how they would navigate it. They are harder to rehearse than story-based prompts because the scenario is novel.

Example situational judgment questions

  • "You join the team and discover the project you were hired to lead has already slipped two months. What are your first three actions in week one?"
  • "Two stakeholders give you conflicting priorities on the same Friday. Both are senior to you. How do you handle it?"
  • "A teammate is consistently delivering work that is technically correct but late. You are not their manager. What do you do?"
  • "You realize halfway through a quarter that the metric you committed to is no longer the right one. How do you raise it?"
  • "Your top-performing team member tells you in a 1:1 they're considering leaving. They haven't told their manager. What do you do in the next 24 hours?"
  • "A vendor misses a critical deadline that puts your launch at risk. Walk me through how you decide whether to escalate, switch vendors, or absorb the delay."

What to listen for

  • Sequencing — do they ask clarifying questions before acting?
  • Trade-off awareness — do they acknowledge what they would not do?
  • Stakeholder reasoning — who do they involve, and when?

Culture and values-alignment questions

Cultural fit is often misunderstood as shared interests or personality alignment. A more useful frame is behavioral consistency with the team's working norms.

A second contestable position: generic "culture fit" questions should be retired in favor of values-alignment scenarios that name a specific behavior the company expects. "Culture fit" as a phrase invites bias; a scenario tied to a stated company value forces a more concrete answer.

Example values-alignment questions

  • "Our team gives feedback in writing before live discussion. Describe the last time you gave hard feedback. What did you write down first?"
  • "We prioritize shipping over perfection. Tell me about a time you shipped something you weren't fully proud of. What happened next?"
  • "Describe the last time you changed your mind because of data, not opinion."

For a deeper look at how culture signals show up in technical interviews, see our guide on how to design a structured technical interview.

Identifying ownership mindset over task execution

Task completion alone is no longer a strong hiring indicator for most knowledge roles. What recruiters and hiring managers increasingly screen for is the ownership mindset — how a candidate behaves when outcomes are unclear, accountability is shared, or success metrics evolve mid-execution.

A concrete scenario

Consider a Series B SaaS company hiring its first sales operations manager. The pipeline is messy, the CRM is half-implemented, and the founder is the de-facto rev-ops owner. Standard task-execution questions ("walk me through how you'd clean a pipeline") produce textbook answers. Ownership-mindset questions — "What would you stop doing in your first 30 days, and how would you tell the founder?" — surface whether the candidate can hold the seat. A strong answer names a specific thing they'd stop (e.g., "weekly pipeline reviews in their current form"), the trade-off they're willing to accept, and how they'd frame the conversation with the founder. A weak answer lists everything they'd add — new dashboards, new processes, new tooling — without naming a single thing they'd remove or a single conversation they'd own.

Example ownership questions

  • "Tell me about something you fixed that wasn't your job to fix."
  • "Describe a time the goalposts moved on you. What did you do in the first 48 hours?"
  • "What's a process you killed, and what replaced it?"

Red flags

  • Answers that always credit "the team" with no individual decision named.
  • Stories where the candidate is consistently the rescuer or always the victim.

Questions to avoid: legal and compliance boundaries

A structured question set is only as strong as its weakest prompt. In most jurisdictions, certain questions are either illegal or carry significant legal risk because they touch protected characteristics or regulated information.

Common categories to avoid in initial screens:

  • Age, date of birth, or graduation year as a proxy for age.
  • Marital status, family planning, or childcare arrangements ("Do you plan to have kids?" "Who watches your children?").
  • Citizenship or national origin beyond the legally permitted "Are you authorized to work in [country]?"
  • Religion, religious holidays, or observance schedules.
  • Disability or medical history, including questions about prior workers' compensation claims.
  • Salary history — now restricted or banned in many US states and several other jurisdictions. Ask about salary expectations instead.

For a deeper treatment of pre-employment screening practices and compliance, see our overview of pre-employment assessment design. Always confirm specifics with your legal or HR compliance partner — local law varies.

Rethinking what "good answers" actually mean

In traditional interviews, clarity and confidence were often equated with strong performance. Modern hiring increasingly challenges this assumption.

The signal you want is depth, consistency, and reasoning quality — even when responses are less polished. A candidate who says "I don't know, but here's how I'd find out" is often a stronger hire than one who delivers a fluent answer with no underlying logic.

To codify this on the scorecard, score reasoning and presentation as separate rubric lines. A candidate can score 4/5 on reasoning and 2/5 on presentation and still be a strong hire — but you will only see that if the rubric separates them.
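The effect of separating the two rubric lines can be illustrated with a small Python sketch. The `ScorecardEntry` class, the 1–5 scale, and the reasoning threshold of 4 are hypothetical choices made for illustration, not a prescribed scoring standard.

```python
from dataclasses import dataclass


@dataclass
class ScorecardEntry:
    """One interviewer's scores for one candidate (hypothetical 1-5 scale)."""
    candidate: str
    reasoning: int
    presentation: int


def blended_score(entry: ScorecardEntry) -> float:
    """Single combined score: the average of both rubric lines."""
    return (entry.reasoning + entry.presentation) / 2


def is_strong_on_reasoning(entry: ScorecardEntry, threshold: int = 4) -> bool:
    """Check the reasoning line on its own (threshold is an assumption)."""
    return entry.reasoning >= threshold


entries = [
    ScorecardEntry("Candidate A", reasoning=4, presentation=2),
    ScorecardEntry("Candidate B", reasoning=2, presentation=4),
]

for entry in entries:
    print(entry.candidate, blended_score(entry), is_strong_on_reasoning(entry))
```

Both candidates receive the same blended score of 3.0, yet only Candidate A clears the reasoning threshold. A single combined line would hide that difference.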

FAQ: structured hiring questions

Which recruitment question category is most often skipped — and why does it matter?

In practice, ownership-mindset questions are the category recruiters most often skip, because they're the hardest to score consistently and the answers don't fit neatly into STAR. The cost of skipping them is high: ownership signal is what separates strong individual contributors from people who execute well only when the path is clear. If you only have time to add one new category to your interview guide, this is the one with the largest marginal lift.

What is the STAR method, and is it still useful?

STAR stands for Situation, Task, Action, Result. It is a candidate-response framework that helps structure answers to behavioral questions. It remains useful as a default structure, but because most candidates now prepare STAR-formatted stories, interviewers should probe past the rehearsed answer with follow-up questions about trade-offs, omitted details, and alternative perspectives.

How many questions should a structured interview include?

Practitioners commonly recommend 5–8 core questions per 45-minute round, with planned follow-up probes. This is a rule of thumb rather than a sourced standard. Fewer questions with deeper probes typically produce more signal than many surface-level questions.

What is the difference between behavioral and situational judgment questions?

Behavioral questions ask about past actions ("Tell me about a time you…"). Situational judgment questions ask about hypothetical scenarios ("What would you do if…"). Behavioral questions draw on what the candidate reports having done; situational questions test reasoning on novel problems. Strong interview loops use both.

How do you reduce bias in recruitment questions?

Use a structured interview where every candidate is asked the same core questions, score answers on a defined rubric, and have at least two interviewers calibrate independently before discussing. Avoid "culture fit" as a freeform judgment; replace it with values-alignment scenarios tied to documented company behaviors.

Can skill assessments replace interview questions?

No. Assessments and interview questions answer different things. Assessments produce structured skill evaluation against a defined rubric; interview questions surface reasoning, motivation, and judgment. The strongest hiring loops pair both — skill assessments for verified capability, structured behavioral interviews for everything assessments can't measure.

Final thoughts and next steps

The recruitment questions every HR professional should know in 2025 are not a fixed list — they are a working toolkit you adapt to the role, the level, and the rubric. The categories above (intent, behavioral, situational, values-alignment, ownership) give you a structure; the example questions give you a starting point.

Next steps

  • Audit your current interview guide. Map every question to one of the five categories above. If a category is empty, add two questions.
  • Separate reasoning from presentation on your scorecard. Score them as distinct rubric lines.
  • Pair interviews with skill verification. Schedule a demo of HackerEarth Assessments to see how rubric-based skill scores integrate with your interview scorecard, so your hiring decision isn't relying on candidate self-report alone.

Sources referenced: LinkedIn Global Talent Trends, SHRM Research, Gallup State of the Global Workplace.

Why Empathy Could Be Your Biggest Hiring Advantage

Why Human-Centered Hiring Matters More Than Ever

Hiring has never been more optimized than it is today.

From AI-powered recruitment tools to automated screening systems and structured interview workflows, HR and talent acquisition teams now have more ways than ever to improve hiring speed, consistency, and scalability.

But in the middle of this efficiency-driven approach, one critical element is slowly disappearing: empathy.

Empathy in hiring is not about slowing down recruitment or making decisions less objective. It is about ensuring candidates are treated like people navigating important career decisions, not just profiles moving through a hiring pipeline.

As recruitment becomes increasingly system-driven, preserving the human side of hiring is becoming both more difficult and more important.

For HR leaders and talent acquisition professionals, this is no longer just a workplace culture discussion. It directly impacts candidate experience, employer branding, hiring quality, and long-term employee retention.

When Hiring Feels Like a Process Instead of an Experience

Most modern recruitment systems are designed around efficiency.

Applications are filtered automatically, interviews are scheduled faster, and candidates move through hiring stages with minimal manual effort. Operationally, this creates speed and structure.

But from a candidate’s perspective, the experience can often feel distant and impersonal.

Many candidates go through multiple interview rounds without clear communication, feedback, or transparency about timelines and expectations. Even when the hiring process is fair, it may still feel mechanical.

This creates a growing challenge for HR and TA teams:

How do you maintain hiring efficiency without removing the human connection from recruitment?

That is where empathy becomes essential.

The Hidden Cost of Low-Empathy Hiring

The impact of low-empathy hiring is not always immediate, but it compounds over time.

Candidates remember how organizations made them feel during the recruitment process, especially during rejection or delayed communication. Those experiences shape employer perception long before someone becomes an employee.

Over time, this directly affects employer brand and candidate trust.

There is also another hidden cost.

When hiring becomes too rigid or overly process-driven, recruiters may overlook candidates with strong long-term potential simply because they do not perfectly match predefined criteria.

Without empathy, context disappears.

And when context disappears, opportunities are often missed.

For HR leaders, empathy is no longer just a soft skill. It is becoming a competitive hiring advantage.

Why Empathy Is Becoming a Competitive Hiring Skill

Today’s workforce is far more dynamic than it was a decade ago.

Professionals switch industries, build careers through unconventional paths, and learn skills outside traditional education systems. As a result, resumes and structured evaluations only tell part of the story.

Empathy helps recruiters understand what exists beyond the surface.

It allows hiring teams to better understand:

  • Career transitions
  • Employment gaps
  • Nontraditional experience
  • Personal growth journeys

This shift changes the entire hiring mindset.

Instead of asking:

“Does this candidate perfectly match the role?”

Recruiters are increasingly asking:

“What could this candidate become in the right environment?”

That perspective creates stronger and more future-focused hiring decisions.

Where Empathy Fits in Modern Recruitment

Empathy does not replace structured hiring systems.

In fact, it becomes most effective when built into them.

Simple improvements in communication can significantly improve candidate experience. Clear updates, transparent timelines, respectful rejection emails, and honest feedback all contribute to a more human-centered recruitment process.

These small changes often have a lasting impact on how candidates perceive an organization.

For HR teams, the goal is not to remove structure from hiring.

The goal is to ensure structure does not remove humanity.

Better Hiring Decisions Start With Better Human Understanding

Empathy also improves the quality of hiring decisions themselves.

When recruiters take time to understand a candidate’s context, they often uncover strengths that are not immediately visible on resumes or scorecards.

A candidate who appears average on paper may demonstrate exceptional adaptability, resilience, or problem-solving ability in real-world situations.

Without empathy, those signals are easy to miss.

For talent acquisition leaders, this means recognizing that hiring is not just about selecting the strongest profile.

It is about identifying the strongest long-term fit within a real human context.

Final Thoughts

As recruitment continues evolving through automation, AI hiring tools, and structured decision-making, the biggest risk is not losing efficiency.

It is losing humanity.

Empathy ensures hiring remains people-focused, even as processes become more technology-driven.

It does not slow recruitment down. Instead, it helps organizations create better candidate experiences, stronger employer brands, and more thoughtful hiring decisions.

Because candidates may forget interview questions or assessment scores.

But they will always remember how they were treated during the hiring process.

And in today’s competitive talent market, that experience often determines whether top talent chooses to join or walk away.
