Practical Tutorial on Random Forest and Parameter Tuning in R

Author
Manish Saraswat
December 14, 2016

Introduction

Treat "forests" well. Not only for the sake of nature, but for solving problems too!

Random Forest is one of the most versatile machine learning algorithms available today. With its built-in ensembling capacity, the task of building a decent generalized model (on any dataset) gets much easier. However, I've seen people using random forest as a black box model; i.e., they don't understand what's happening beneath the code. They just code.

In fact, the easiest part of machine learning is coding. If you are new to machine learning, the random forest algorithm should be at your fingertips. Its ability to solve both regression and classification problems, along with its robustness to correlated features and its variable importance plot, gives us enough of a head start to solve various problems.

Most often, I've seen people confuse bagging with random forest. Do you know the difference?

In this article, I'll explain the complete concept of random forest and bagging. For ease of understanding, I've kept the explanation simple yet enriching. I've used the mlr and data.table packages to implement bagging and random forest with parameter tuning in R. Also, you'll learn the techniques I've used to improve model accuracy from ~82% to 86%.

Table of Contents

  1. What is the Random Forest algorithm?
  2. How does it work? (Decision Tree, Random Forest)
  3. What is the difference between Bagging and Random Forest?
  4. Advantages and Disadvantages of Random Forest
  5. Solving a Problem
    • Parameter Tuning in Random Forest

What is the Random Forest algorithm?

Random forest is a tree-based algorithm which involves building several trees (decision trees), then combining their output to improve generalization ability of the model. The method of combining trees is known as an ensemble method. Ensembling is nothing but a combination of weak learners (individual trees) to produce a strong learner.

Say, you want to watch a movie. But you are uncertain of its reviews. You ask 10 people who have watched the movie. 8 of them said "the movie is fantastic." Since the majority is in favor, you decide to watch the movie. This is how we use ensemble techniques in our daily life too.
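The movie example can be written down as a tiny majority vote. This is only a sketch in base R; the ten reviews below are made-up values that mirror the 8-out-of-10 story above.

```r
# Ten hypothetical reviewers: TRUE = "the movie is fantastic"
reviews <- c(TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, FALSE, TRUE, TRUE)

# Majority vote: the ensemble decision is the most common answer
votes <- table(reviews)
majority <- names(votes)[which.max(votes)]

majority        # "TRUE", so we watch the movie
mean(reviews)   # share of positive reviews: 0.8
```

A random forest classifier combines its trees in the same way, except that each "reviewer" is a decision tree.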

Random Forest can be used to solve regression and classification problems. In regression problems, the dependent variable is continuous. In classification problems, the dependent variable is categorical.

Trivia: The random forest algorithm was created by Leo Breiman and Adele Cutler in 2001.

How does it work? (Decision Tree, Random Forest)

To understand the working of a random forest, it's crucial that you understand a tree. A tree works in the following way:

[Figure: decision tree with X1 at the root node and child nodes X2 and X3]

1. Given a data frame (n x p), a tree stratifies or partitions the data based on rules (if-else). Yes, a tree creates rules. These rules divide the data set into distinct and non-overlapping regions. These rules are determined by a variable's contribution to the homogeneity or pureness of the resultant child nodes (X2, X3).

2. In the image above, the variable X1 resulted in highest homogeneity in child nodes, hence it became the root node. A variable at root node is also seen as the most important variable in the data set.

3. But how is this homogeneity or pureness determined? In other words, how does the tree decide at which variable to split?

  • In regression trees (where the output is predicted using the mean of observations in the terminal nodes), the splitting decision is based on minimizing the residual sum of squares (RSS). The variable which leads to the greatest possible reduction in RSS is chosen as the root node. The tree splitting takes a top-down greedy approach, also known as recursive binary splitting. We call it "greedy" because the algorithm makes the best split at the current step rather than saving a split for better results on future nodes.
  • In classification trees (where the output is predicted using the mode of observations in the terminal nodes), the splitting decision is based on the following methods:
    • Gini Index - It's a measure of node purity. A smaller Gini index suggests a purer node. For a split to take place, the weighted Gini index of the child nodes should be less than that of the parent node.
    • Entropy - Entropy is a measure of node impurity. For a binary class (a, b), the formula to calculate it is shown below. Entropy is maximum at p = 0.5: when p(X=a) = p(X=b) = 0.5, a new observation has a 50%-50% chance of being classified in either class. The entropy is minimum (zero) when the probability is 0 or 1.

Entropy = - p(a)*log(p(a)) - p(b)*log(p(b))

[Figure: entropy curve for a binary class, peaking at p = 0.5]
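Both impurity measures are easy to compute by hand. Here is a minimal base R sketch for a two-class node; it assumes log base 2 for the entropy (the formula above does not state the base), and the function names are my own.

```r
# Impurity of a binary node as a function of p = P(class a)
entropy <- function(p) {
  q <- c(p, 1 - p)
  q <- q[q > 0]              # treat 0 * log(0) as 0
  -sum(q * log2(q))
}

gini <- function(p) 1 - (p^2 + (1 - p)^2)

entropy(0.5)   # 1, the maximum for two classes
entropy(1)     # 0, a pure node
gini(0.5)      # 0.5, the maximum for two classes
gini(1)        # 0, a pure node
```

Both functions peak at p = 0.5 and drop to zero for a pure node, which is why either can be used to score candidate splits.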

In a nutshell, every tree attempts to create rules in such a way that the resultant terminal nodes are as pure as possible. The higher the purity, the lower the uncertainty in making the decision.
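To make the greedy RSS-based split concrete, here is a small sketch that finds the best single cut point on one numeric variable. It is an illustration only, not how any particular tree package is implemented; best_split and the toy data are hypothetical.

```r
# Find the cut point on x that minimizes the total RSS of the two child nodes
best_split <- function(x, y) {
  xs <- sort(unique(x))
  cuts <- (head(xs, -1) + tail(xs, -1)) / 2   # midpoints between distinct values
  rss <- sapply(cuts, function(cut) {
    left <- y[x < cut]
    right <- y[x >= cut]
    sum((left - mean(left))^2) + sum((right - mean(right))^2)
  })
  list(cut = cuts[which.min(rss)], rss = min(rss))
}

# Toy data with an obvious jump between x = 3 and x = 10
x <- c(1, 2, 3, 10, 11, 12)
y <- c(5, 6, 5, 20, 21, 19)

best_split(x, y)$cut   # 6.5
```

A full tree would repeat this search over every variable at every node, which is the recursive binary splitting described above.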

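But a decision tree suffers from high variance. "High variance" means the fitted tree is very sensitive to the particular training sample: a small change in the data can produce a very different tree, which shows up as high prediction error on unseen data. We could reduce the variance by training on more data. But since the available data is limited, we use resampling-based techniques like bagging and random forest, which train many trees on bootstrap samples of the same data and combine them.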

Building many decision trees results in a forest. A random forest works the following way:

  1. First, it uses the Bagging (Bootstrap Aggregating) algorithm to create random samples. Given a data set D1 (n rows and p columns), it creates a new dataset (D2) by sampling n cases at random with replacement from the original data. About 1/3 of the rows from D1 are left out, known as Out of Bag (OOB) samples.
  2. Then, the model trains on D2. OOB sample is used to determine unbiased estimate of the error.
  3. Out of p columns, P ≪ p columns are selected at random at each node split. Usually, the default choice of P is p/3 for regression trees and √p for classification trees.
  4. Unlike a tree, no pruning takes place in random forest; i.e., each tree is grown fully. In decision trees, pruning is a method to avoid overfitting. Pruning means selecting a subtree that leads to the lowest test error rate. We can use cross-validation to determine the test error rate of a subtree.
  5. Several trees are grown and the final prediction is obtained by averaging (for regression) or majority voting (for classification).
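The "about 1/3 left out" figure in step 1 can be checked with a quick simulation. This is a sketch in base R; the row count and seed are arbitrary choices of mine.

```r
set.seed(42)
n <- 10000
rows <- seq_len(n)

# Bootstrap sample: n draws with replacement from the original rows
in_bag <- sample(rows, size = n, replace = TRUE)

# Rows that were never drawn are the out-of-bag (OOB) rows for this tree
oob <- setdiff(rows, in_bag)

length(oob) / n   # close to exp(-1), i.e. roughly 0.37
```

Each tree gets its own bootstrap sample, so each row is out-of-bag for roughly a third of the trees and can be scored by those trees alone.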

Each tree is grown on a different sample of the original data. Since random forest calculates the OOB error internally, a separate cross-validation is often unnecessary for estimating its error.

What is the difference between Bagging and Random Forest?

It's easy to overlook that bagging is not the same as random forest. To understand the difference, let's see how bagging works:

  1. It creates randomized samples of the dataset (just like random forest) and grows trees on a different sample of the original data. The remaining 1/3 of the sample is used to estimate unbiased OOB error.
  2. It considers all the features at a node (for splitting).
  3. Once the trees are fully grown, it uses averaging or voting to combine the resultant predictions.

Aren't you thinking, "If both the algorithms do the same thing, what is the need for random forest? Couldn't we have accomplished our task with bagging?" NO!

The need for random forest surfaced after discovering that the bagging algorithm results in correlated trees when faced with a dataset having strong predictors. Unfortunately, averaging several highly correlated trees doesn't lead to a large reduction in variance.

But how do correlated trees emerge? Good question! Let's say a dataset has a very strong predictor, along with other moderately strong predictors. In bagging, a tree grown every time would consider the very strong predictor at its root node, thereby resulting in trees similar to each other.

The main difference between random forest and bagging is that random forest considers only a subset of predictors at a split. This results in trees with different predictors at the top split, thereby resulting in decorrelated trees and more reliable average output. That's why we say random forest is robust to correlated predictors.
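Why correlation matters can be seen from a standard result: the average of B identically distributed predictions, each with variance sigma2 and pairwise correlation rho, has variance rho * sigma2 + (1 - rho) * sigma2 / B. The sketch below simply evaluates that formula; the function name and the example values are illustrative.

```r
# Variance of the average of B equally correlated predictions
avg_variance <- function(rho, sigma2 = 1, B = 100) {
  rho * sigma2 + (1 - rho) * sigma2 / B
}

avg_variance(rho = 0.9)   # 0.901: highly correlated trees, almost no reduction
avg_variance(rho = 0.1)   # 0.109: decorrelated trees, large reduction
avg_variance(rho = 0)     # 0.01:  independent trees, variance shrinks as 1/B
```

With highly correlated trees, adding more of them barely helps, because the first term does not shrink with B. Lowering rho is exactly what sampling predictors at each split is meant to achieve.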

Advantages and Disadvantages of Random Forest

Advantages are as follows:

  1. It is robust to correlated predictors.
  2. It is used to solve both regression and classification problems.
  3. It can also be used to solve unsupervised ML problems.
  4. It can handle thousands of input variables without variable selection.
  5. It can be used as a feature selection tool using its variable importance plot.
  6. It can handle missing data through built-in imputation helpers (for example, na.roughfix and rfImpute in the randomForest package).

Disadvantages are as follows:

  1. The Random Forest model is difficult to interpret.
  2. It tends to return erratic predictions for observations out of the range of training data. For example, if the training data contains a variable x ranging from 30 to 70, and the test data has x = 200, random forest would give an unreliable prediction.
  3. It can take longer than expected to compute a large number of trees.

Solving a Problem (Parameter Tuning)

Let's take a dataset to compare the performance of bagging and random forest algorithms. Along the way, I'll also explain important parameters used for parameter tuning. In R, we'll use MLR and data.table packages to do this analysis.

I've taken the Adult dataset from the UCI machine learning repository. You can download the data from here.

This dataset presents a binary classification problem to solve. Given a set of features, we need to predict if a person's salary is <=50K or >50K. Since the given data isn't well structured, we'll need to make some modifications while reading the dataset.

# Set working directory
path <- "~/December 2016/RF_Tutorial"
setwd(path)

# Load libraries
library(data.table)
library(mlr)
library(h2o)

# Set variable names
setcol <- c("age",
            "workclass",
            "fnlwgt",
            "education",
            "education-num",
            "marital-status",
            "occupation",
            "relationship",
            "race",
            "sex",
            "capital-gain",
            "capital-loss",
            "hours-per-week",
            "native-country",
            "target")

# Load data
train <- read.table("adultdata.txt", header = FALSE, sep = ",", 
                    col.names = setcol, na.strings = c(" ?"), stringsAsFactors = FALSE)
test <- read.table("adulttest.txt", header = FALSE, sep = ",", 
                   col.names = setcol, skip = 1, na.strings = c(" ?"), stringsAsFactors = FALSE)

After we've loaded the dataset, first we'll set the data class to data.table. data.table is a powerful R package built for fast data manipulation.


setDT(train)
setDT(test)

Now, we'll quickly look at given variables, data dimensions, etc.


dim(train)
dim(test)
str(train)
str(test)

As seen from the output above, we can derive the following insights:

  1. The train dataset has 32,561 rows and 15 columns.
  2. The test dataset has 16,281 rows and 15 columns.
  3. Variable target is the dependent variable.
  4. The target variable in train and test data is different. We'll need to match them.
  5. All character variables have a leading whitespace which can be removed.

We can check missing values using:

# Check missing values in train and test datasets
table(is.na(train))
# Output:
#  FALSE   TRUE 
#  484153  4262

sapply(train, function(x) sum(is.na(x)) / length(x)) * 100

table(is.na(test))
# Output:
#  FALSE  TRUE 
#  242012 2203

sapply(test, function(x) sum(is.na(x)) / length(x)) * 100

As seen above, both train and test datasets have missing values. The sapply function is quite handy when it comes to performing column computations. Above, it returns the percentage of missing values per column.
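To see what that sapply call returns, here is the same expression on a tiny data frame. The toy values are made up for illustration and are not rows from the Adult data.

```r
# Toy data frame with one missing age and two missing workclass values
toy <- data.frame(age = c(25, NA, 40, 31),
                  workclass = c("Private", NA, NA, "State-gov"),
                  stringsAsFactors = FALSE)

# Percentage of missing values per column
pct_missing <- sapply(toy, function(x) sum(is.na(x)) / length(x)) * 100
pct_missing   # age = 25, workclass = 50

# An equivalent one-liner
colMeans(is.na(toy)) * 100
```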

Now, we'll preprocess the data to prepare it for training. In R, the randomForest package does not impute missing values by default; it offers na.roughfix (median/mode imputation) and rfImpute for this. Practically speaking, imputing inside the model can make it take longer than expected to run.

Therefore, in order to avoid waiting time, let's impute the missing values using median/mode imputation method; i.e., missing values in the integer variables will be imputed with median and in the factor variables with mode (most frequent value).

We'll use the impute function from the mlr package, which is enabled with several unique methods for missing value imputation:

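# Impute missing values.
# The columns were read as character (stringsAsFactors = FALSE), so a
# character rule is included alongside the factor rule.
imp1 <- impute(data = train, target = "target",
               classes = list(integer = imputeMedian(),
                              character = imputeMode(),
                              factor = imputeMode()))

imp2 <- impute(data = test, target = "target",
               classes = list(integer = imputeMedian(),
                              character = imputeMode(),
                              factor = imputeMode()))

# Assign the imputed data back to train and test
train <- imp1$data
test <- imp2$data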
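For intuition, median/mode imputation itself is only a few lines of base R. The helper below is a hypothetical sketch of the idea, not what mlr does internally, and the toy values are made up.

```r
# Replace NAs with the median (numeric columns) or the most frequent value (others)
impute_column <- function(x) {
  if (is.numeric(x)) {
    x[is.na(x)] <- median(x, na.rm = TRUE)
  } else {
    counts <- table(x)                       # table() ignores NA by default
    x[is.na(x)] <- names(counts)[which.max(counts)]
  }
  x
}

toy <- data.frame(hours = c(40L, NA, 20L, 60L),
                  workclass = c("Private", NA, "Private", "State-gov"),
                  stringsAsFactors = FALSE)

toy[] <- lapply(toy, impute_column)
toy$hours       # 40 40 20 60
toy$workclass   # "Private" "Private" "Private" "State-gov"
```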

Being a binary classification problem, you are always advised to check if the data is imbalanced or not. We can do it in the following way:

# Check class distribution in train and test datasets
setDT(train)[, .N / nrow(train), target]
# Output:
#    target     V1
# 1: <=50K   0.7591904
# 2: >50K    0.2408096

setDT(test)[, .N / nrow(test), target]
# Output:
#    target     V1
# 1: <=50K.  0.7637737
# 2: >50K.   0.2362263

If you observe carefully, the values of the target variable differ between test and train: the test labels carry a trailing period. For now, we can treat it as a typo and correct all the test values. Also, we see that roughly 75% of people in the train data have income <=50K. This is only a mild imbalance; problems usually labelled as imbalanced are more skewed, with a binary class distribution of around 90% to 10%. Now, let's proceed and clean the target column in test data.

# Clean trailing character in test target values
test[, target := substr(target, start = 1, stop = nchar(target) - 1)]

We've used the substr function to return the substring from a specified start and end position. Next, we'll remove the leading whitespaces from all character variables. We'll use the str_trim function from the stringr package.

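library(stringr)
char_col <- colnames(train)[sapply(train, is.character)]
# trim both train and test so that their values match
for (i in char_col) {
    set(train, j = i, value = str_trim(train[[i]], side = "left"))
    set(test, j = i, value = str_trim(test[[i]], side = "left"))
}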

Using sapply function, we've extracted the column names which have character class. Then, using a simple for - set loop we traversed all those columns and applied the str_trim function.
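Here is what the two cleaning steps do to a single test label. The sketch uses base R's trimws as a stand-in for str_trim, so it runs without stringr; the raw value is typed in by hand to mimic the test file.

```r
raw <- " >50K."   # a test label with a leading space and a trailing period

# Step 1: drop the last character (the trailing period)
no_dot <- substr(raw, start = 1, stop = nchar(raw) - 1)
no_dot    # " >50K"

# Step 2: drop the leading whitespace
clean <- trimws(no_dot, which = "left")
clean     # ">50K"
```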

Before we start model training, we should convert all character variables to factor. MLR package treats character class as unknown.


fact_col <- colnames(train)[sapply(train, is.character)]
for (i in fact_col)
    set(train, j = i, value = factor(train[[i]]))
for (i in fact_col)
    set(test, j = i, value = factor(test[[i]]))

Let's start with modeling now. MLR package has its own function to convert data into a task, build learners, and optimize learning algorithms. I suggest you stick to the modeling structure described below for using MLR on any data set.

# create a task
traintask <- makeClassifTask(data = train, target = "target")
testtask <- makeClassifTask(data = test, target = "target")

# create learner
bag <- makeLearner("classif.rpart", predict.type = "response")
bag.lrn <- makeBaggingWrapper(learner = bag, bw.iters = 100, bw.replace = TRUE)

I've set up the bagging algorithm which will grow 100 trees on randomized samples of data with replacement. To check the performance, let's set up a validation strategy too:

# set 5 fold cross validation
rdesc <- makeResampleDesc("CV", iters = 5L)
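If you are curious what a 5-fold split looks like under the hood, it can be sketched in base R as below. This is an illustration of the idea only; mlr builds its own resampling instances, and the row count here is hypothetical.

```r
set.seed(1)
n <- 23   # hypothetical number of rows
k <- 5    # number of folds

# Assign every row to one of k folds at random, keeping fold sizes balanced
fold <- sample(rep(seq_len(k), length.out = n))
table(fold)   # fold sizes differ by at most one row

for (i in seq_len(k)) {
  test_idx  <- which(fold == i)   # rows held out in this iteration
  train_idx <- which(fold != i)   # rows used for training
  # a model would be trained on train_idx and scored on test_idx here
  stopifnot(length(intersect(test_idx, train_idx)) == 0)
}
```

Every row is held out exactly once, and the reported score is the mean over the k iterations.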

For faster computation, we'll use parallel computation backend. Make sure your machine / laptop doesn't have many programs running in the background.

# set parallel backend (Windows)
library(parallelMap)
library(parallel)
parallelStartSocket(cpus = detectCores())

For Linux users, the function parallelStartMulticore(cpus = detectCores()) will activate the parallel backend. I've used all the cores here.

r <- resample(learner = bag.lrn,
              task = traintask,
              resampling = rdesc,
              measures = list(tpr, fnr, fpr, acc),
              show.info = TRUE)

#[Resample] Result: 
# tpr.test.mean = 0.95,
# fnr.test.mean = 0.0505,
# fpr.test.mean = 0.487,
# acc.test.mean = 0.845

Being a binary classification problem, I've used the components of confusion matrix to check the model's accuracy. With 100 trees, bagging has returned an accuracy of 84.5%, which is way better than the baseline accuracy of 75%. Let's now check the performance of random forest.

# make randomForest learner
rf.lrn <- makeLearner("classif.randomForest")
rf.lrn$par.vals <- list(ntree = 100L,
                        importance = TRUE)

r <- resample(learner = rf.lrn,
              task = traintask,
              resampling = rdesc,
              measures = list(tpr, fpr, fnr, acc),
              show.info = TRUE)

# Result:
# tpr.test.mean = 0.996,
# fpr.test.mean = 0.72,
# fnr.test.mean = 0.0034,
# acc.test.mean = 0.825

On this data set, random forest performs worse than bagging. Both used 100 trees, and random forest returns an overall accuracy of 82.5%. The apparent reason is that the algorithm struggles with the negative class. As you can see, it classified 99.6% of the positive class correctly, which is way better than the bagging algorithm, but it incorrectly classified 72% of the negative class.
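The four measures reported above all come from the confusion matrix. Here is a hand computation on six made-up predictions, treating <=50K as the positive class as in this tutorial.

```r
lev <- c("<=50K", ">50K")
truth <- factor(c("<=50K", "<=50K", "<=50K", "<=50K", ">50K", ">50K"), levels = lev)
pred  <- factor(c("<=50K", "<=50K", "<=50K", ">50K", "<=50K", ">50K"), levels = lev)
positive <- "<=50K"

tp <- sum(truth == positive & pred == positive)   # 3
fn <- sum(truth == positive & pred != positive)   # 1
fp <- sum(truth != positive & pred == positive)   # 1
tn <- sum(truth != positive & pred != positive)   # 1

tpr <- tp / (tp + fn)        # 0.75: positives classified correctly
fnr <- fn / (tp + fn)        # 0.25: positives missed
fpr <- fp / (fp + tn)        # 0.50: negatives wrongly called positive
acc <- (tp + tn) / length(truth)   # 4 of 6 correct
```

A high tpr combined with a high fpr, as in the random forest result above, means the model labels almost everything as the positive class.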

Internally, random forest uses a cutoff of 0.5 for a two-class problem; i.e., if the share of trees voting <=50K for an unseen observation is higher than 0.5, it will be classified as <=50K. In random forest, we have the option to customize the internal cutoff. As the false positive rate is very high now, we'll increase the cutoff for the positive class (<=50K) and accordingly reduce it for the negative class (>50K). Then, train the model again.

# set cutoff
rf.lrn$par.vals <- list(ntree = 100L,
                        importance = TRUE,
                        cutoff = c(0.75, 0.25))

r <- resample(learner = rf.lrn,
              task = traintask,
              resampling = rdesc,
              measures = list(tpr, fpr, fnr, acc),
              show.info = TRUE)

#Result: 
# tpr.test.mean = 0.934,
# fpr.test.mean = 0.43,
# fnr.test.mean = 0.0662,
# acc.test.mean = 0.846

As you can see, we've improved the accuracy of the random forest model by about 2 percentage points (from 82.5% to 84.6%), which is slightly higher than the 84.5% of the bagging model. Now, let's try and make this model better.
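To see why the cutoff changes predictions, recall how the randomForest package uses it: the winning class is the one with the largest ratio of vote share to cutoff. The sketch below applies that rule to one observation; the vote shares are made up, and pick_class is a helper of my own.

```r
classes <- c("<=50K", ">50K")
votes <- c(0.70, 0.30)   # hypothetical share of trees voting for each class

# Winning class = largest ratio of vote share to cutoff
pick_class <- function(votes, cutoff) classes[which.max(votes / cutoff)]

pick_class(votes, cutoff = c(0.5, 0.5))     # "<=50K" (plain majority vote)
pick_class(votes, cutoff = c(0.75, 0.25))   # ">50K"  (0.30 / 0.25 beats 0.70 / 0.75)
```

With the new cutoff, an observation needs more than 75% of the votes to be labelled <=50K, which is what pulls the false positive rate down.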

Parameter Tuning: Mainly, there are three parameters in the random forest algorithm which you should look at (for tuning):

  • ntree - As the name suggests, the number of trees to grow. The larger the number of trees, the more computationally expensive it is to build the model.
  • mtry - It refers to how many variables we should select at a node split. As mentioned above, the default value is p/3 for regression and sqrt(p) for classification. Smaller values of mtry give less correlated trees but can miss the informative predictors at a split; larger values move the model toward bagging. It is therefore worth tuning rather than fixing in advance.
  • nodesize - It refers to the minimum number of observations we want in the terminal nodes. This parameter is directly related to tree depth. The higher the number, the lower the tree depth. With lower tree depth, the tree might even fail to recognize useful signals from the data.
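For reference, this is how the default mtry works out for our data, following the floor-based defaults of the randomForest package. The value p = 14 assumes the 15 columns listed earlier minus the target.

```r
p <- 14   # predictors in the Adult data: 15 columns minus the target

mtry_classif <- floor(sqrt(p))          # 3 candidate variables per split
mtry_regr    <- max(floor(p / 3), 1)    # 4 candidate variables per split

mtry_classif
mtry_regr
```

So the search range of 2 to 10 used below brackets the classification default of 3.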

Let's get to the playground and try to improve our model's accuracy further. In the MLR package, you can list all tuning parameters a model can support using:

getParamSet(rf.lrn)

# set parameter space
params <- makeParamSet(
    makeIntegerParam("mtry", lower = 2, upper = 10),
    makeIntegerParam("nodesize", lower = 10, upper = 50)
)

# set validation strategy
rdesc <- makeResampleDesc("CV", iters = 5L)

# set optimization technique
ctrl <- makeTuneControlRandom(maxit = 5L)

# start tuning
tune <- tuneParams(learner = rf.lrn,
                   task = traintask,
                   resampling = rdesc,
                   measures = list(acc),
                   par.set = params,
                   control = ctrl,
                   show.info = TRUE)

[Tune] Result: mtry=2; nodesize=23 : acc.test.mean=0.858

After tuning, we have achieved an overall accuracy of 85.8%, which is better than our previous random forest model. This way you can tweak your model and improve its accuracy.
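Random search, the optimization technique used above, is simple enough to sketch in base R: draw a few parameter combinations at random, score each one, and keep the best. The evaluate function below is a made-up stand-in for "cross-validated accuracy" so that the sketch runs on its own; it is not a real model score.

```r
set.seed(7)
n_iter <- 5   # same budget as maxit = 5L above

# Draw random candidates from the same ranges as the parameter space above
candidates <- data.frame(
  mtry     = sample(2:10,  n_iter, replace = TRUE),
  nodesize = sample(10:50, n_iter, replace = TRUE)
)

# Hypothetical scoring function; in practice this would train and
# cross-validate a random forest with the given parameters
evaluate <- function(mtry, nodesize) -abs(mtry - 3) - abs(nodesize - 25) / 10

candidates$score <- mapply(evaluate, candidates$mtry, candidates$nodesize)
best <- candidates[which.max(candidates$score), ]
best
```

With only five draws the search covers a small part of the space, so increasing the number of iterations is an easy next experiment.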

I'll leave you here. The complete code for this analysis can be downloaded from GitHub.

Summary

Don't stop here! There is still a huge scope for improvement in this model. Cross validation accuracy is generally more optimistic than true test accuracy. To make a prediction on the test set, minimal data preprocessing on categorical variables is required. Do it and share your results in the comments below.

My goal with this tutorial is to get you started with the random forest model and some techniques to improve model accuracy. For better understanding, I suggest you read more on the confusion matrix. In this article, I've explained the working of decision trees, random forest, and bagging.

Did I miss out anything? Do share your knowledge and let me know your experience while solving classification problems in comments below.

Subscribe to The HackerEarth Blog

Get expert tips, hacks, and how-tos from the world of tech recruiting to stay on top of your hiring!

Author
Manish Saraswat
Calendar Icon
December 14, 2016
Timer Icon
3 min read
Share

Hire top tech talent with our recruitment platform

Access Free Demo
Related reads

Discover more articles

Gain insights to optimize your developer recruitment process.

What AI Is Forcing HR to Rethink About Hiring

What AI is forcing HR to rethink

For recruiters and talent leaders, AI has made one thing clear: resumes can no longer be trusted as the primary signal of candidate capability. What AI is forcing HR to rethink is the entire screening stack — from how reqs are written, to how the ATS filters applicants, to how quality of hire (QoH) is measured against time-to-fill. According to LinkedIn's Future of Recruiting 2024 report, 73% of recruiters say skills-based hiring is a priority, yet most pipelines still screen on degree and employer brand at the ATS layer. That gap is where the rethink begins.

Why traditional resumes no longer predict strong hires

Resumes measure presentation more reliably than capability. Recruiters have long used job titles, company names, degrees, and years of experience as proxies for performance, but generative AI tools — ChatGPT, Teal, Rezi, and Kickresume among them — have collapsed the cost of producing a polished application. The World Economic Forum's Future of Jobs Report 2023 found that 44% of workers' core skills are expected to change by 2027, which means a resume snapshot ages faster than the role it describes.

For recruiters, the operational impact is direct: pipelines fill, screen rates rise, and yet QoH stays flat. As AI becomes more deeply embedded in hiring, HR leaders are being forced to rethink a single question:

What if resumes are no longer the best predictor of performance?

That question is reshaping recruitment faster than many organizations expected — though, as discussed later, the shift away from resumes carries its own trade-offs.

Share of Workers' Core Skills Expected to Change by 2027
Source: World Economic Forum Future of Jobs Report 2023

The resume was built for a different era

Modern work no longer fits the resume's static format. Skills evolve in months rather than years, roles overlap across functions, and professionals build expertise through online communities, freelance projects, bootcamps, and self-directed learning. According to SHRM's 2024 Talent Trends research, nearly half of HR leaders report that candidates from non-traditional backgrounds are increasingly competitive on assessments.

Resumes still reduce people to standardized timelines, and many capable candidates are filtered out by ATS rules simply because they lack the "right" employer logos. At the same time, candidates skilled in resume optimization can outperform genuinely capable professionals at the screen stage — a pattern that pre-dates AI but has been amplified by it.

It has become far easier for candidates to generate polished resumes, cover letters, and interview responses in minutes. For recruiters, the takeaway is practical: formatting and phrasing are no longer reliable proxies for capability.

AI did not break hiring — it exposed existing problems

AI did not create the resume problem; it surfaced one already present in most hiring funnels. Surveys of recruiters, including Gartner's 2024 HR research, have consistently shown three pre-AI pressures: recruiters overwhelmed by application volume, candidates optimizing resumes to pass ATS filters, and hiring managers reporting weak outcomes despite reviewing seemingly strong resumes.

AI accelerated these problems to a point where they can no longer be ignored. Many candidates can now generate a highly optimized application in seconds, and recruiters increasingly struggle to distinguish between candidates skilled at self-presentation and those who can actually do the work.

The operational shift is moving from:

"What does your resume say?"

Toward:

"Can you actually do the job?"

The rise of skills-based hiring

Skills-based hiring outperforms resume screening because it measures demonstrated capability rather than credential proximity. A growing number of organizations — including IBM, Accenture, and Delta, profiled in LinkedIn's Skills Path program — are moving toward skills-first models that prioritize practical assessments, simulations, project work, and role-specific problem-solving over employer brand or degree.

This trend is most visible in technology hiring, where coding assessments and real-world technical evaluations generally provide stronger signals than resumes alone, particularly when compared against resume-only screens for time-to-productivity. HackerEarth has run over 100 million developer assessments across enterprise hiring programs, and the consistent pattern in that dataset is that demonstrated coding performance correlates more closely with on-the-job output than degree or prior employer.

Beyond tech, a growing number of organizations are extending the model: marketing teams using campaign-brief exercises, sales teams using recorded customer-handling scenarios, and operations teams using situational judgment tests. For a deeper view of how this maps to specific roles, see our skills-based hiring guide and developer assessment platform.

Where skills-based hiring breaks down

Skills-based hiring is not without trade-offs, and recruiters evaluating it should plan for known failure modes:

  • Assessment bias. Poorly designed assessments can disadvantage career returners, caregivers, and candidates with limited test-taking time as severely as resume screens disadvantage non-traditional backgrounds.
  • Gaming of take-home tests. Unproctored coding or case exercises are increasingly solvable with generative AI, which means assessment design has to evolve in step with candidate tooling.
  • Candidate experience at scale. Long assessment batteries lower completion rates and damage employer brand, particularly for senior candidates who have multiple offers in play.
  • Legal exposure. In jurisdictions including New York City (Local Law 144) and under the EU AI Act, automated employment decision tools are subject to bias audits and disclosure requirements. Recruiters should confirm vendor compliance before deploying AI-driven scoring.

The honest read: most organizations announcing a "shift" to skills-based hiring still filter by degree at the ATS layer. The shift is real, but it is uneven.

Skills-Based Hiring Priority vs. ATS Screening Reality
Source: LinkedIn Future of Recruiting 2024; ATS screening figure illustrative based on article claims

Why HR leaders are rethinking potential

Potential is becoming more measurable in ways resumes never allowed. Traditional hiring often prioritized pedigree — familiar universities, recognizable employers, conventional career paths — but AI-powered assessment platforms (HackerEarth, HireVue, Pymetrics, Codility, and Workday Skills Cloud among them) score candidates on demonstrated performance against role-specific tasks, calibrated to a benchmark population.

These tools typically combine task-based evaluations, behavioral simulations, and structured scoring rubrics. Their limits matter too: they score what they are trained to score, they can encode bias from the training population, and they do not measure long-arc traits like cultural contribution or leadership trajectory. Recruiters should treat them as one signal in a structured interview loop, not a single decision point.

Research suggests that candidates without elite degrees frequently match or outperform credentialed peers on standardized technical assessments. In many cases, career switchers and self-taught professionals demonstrate strong adaptability and practical skill. Organizations that shift toward capability-based evaluation may gain access to broader and more diverse talent pools — though, as noted above, only if assessment design itself is audited for fairness.

The recruiter's role is changing

AI is not replacing recruiters; it is shifting where recruiters spend their time. Traditional recruitment rewarded screening volume and speed. Modern hiring increasingly rewards judgment, stakeholder alignment, and structured decision-making.

As automation handles sourcing, scheduling, resume parsing, and initial outreach, recruiters are spending more time on work AI cannot do well:

  • Probing candidate motivation through structured behavioral interviews
  • Evaluating adaptability against specific role demands using scorecards
  • Building hiring-manager alignment on the req and intake brief
  • Designing candidate-experience touchpoints that protect offer-accept rates
  • Calibrating assessment results against on-the-job performance data

The recruiter who succeeds in an AI-heavy pipeline is the one who can interpret signal, not the one who can scan resumes faster.

Candidates are changing faster than hiring systems

Modern career paths now move faster than most ATS configurations. Today's workforce values flexibility, creativity, continuous learning, and project-based growth, and many professionals build experience through freelance work, startups, creator platforms, and side projects. Their resumes often look unconventional, but unconventional no longer equates to unqualified.

Organizations that shift toward capability-based evaluation may access talent pools that rigid resume filters would otherwise miss. For practical guidance on adjusting screening criteria, see our guide to evaluating an ATS for skills-based hiring.

The future of hiring will feel more human

There is an irony in the AI shift: as resumes become easier to automate, organizations are being pushed to evaluate creativity, adaptability, collaboration, and real-world problem-solving more directly. The likely structure of mature AI-enabled hiring is AI handling repetitive tasks — sourcing, scheduling, parsing, initial scoring — while recruiters and hiring managers focus on nuance, context, and long-term fit.

FAQ

Is skills-based hiring more effective than resume screening? Skills-based hiring tends to predict on-the-job performance more reliably than resume screening for roles where the work can be assessed directly, such as engineering, data, sales, and marketing execution. According to LinkedIn's Future of Recruiting report, 73% of recruiters now prioritize skills-based approaches. Effectiveness depends heavily on assessment design and on whether downstream ATS filters still gate candidates by degree.

What HR processes is AI changing first? AI is changing sourcing, resume parsing, candidate matching, and initial assessment scoring first, because these are high-volume, rules-based tasks. Structured interviewing, offer negotiation, and onboarding remain primarily human-led, though AI-assisted note-taking and scorecard analysis are growing.

Will AI replace recruiters? AI is unlikely to replace recruiters, but it is changing the skill profile. Recruiters who can interpret assessment data, align hiring managers, and design candidate experience will be more valuable; recruiters whose role is primarily resume scanning are most exposed.

How do I evaluate an AI hiring tool for bias? Ask the vendor for four things: a bias audit report (required in New York City under Local Law 144 for automated employment decision tools), the demographic composition of the training data, the validation methodology against job performance, and the appeal process for candidates. Avoid vendors that cannot provide all four.

Is resume-based hiring going away? Resume-based hiring is under pressure but not disappearing. Most organizations are moving toward hybrid models where resumes provide context and assessments provide the capability signal. A full move away from resumes is unlikely in the next hiring cycle for most enterprises.

What is the biggest risk of switching to skills-based hiring? The biggest risk is poorly designed assessments that introduce new forms of bias or damage candidate experience. A skills-based process built on a long, unproctored, untested assessment battery will perform worse than a structured resume screen.

Next steps: See it in action

If you are a recruiter or talent leader evaluating how to move from resume-led to skills-led screening, book a demo of HackerEarth Assessments to see how role-specific evaluations, proctoring, and benchmarked scoring fit into an existing ATS pipeline. For background reading, see our developer assessment platform overview and the HackerEarth recruiter blog.

Recruiters who pair structured assessment data with strong human judgment build better pipelines than either resumes or AI alone can produce.

Must-Know Recruitment Questions for HR and Talent Acquisition Teams (2026)

Recruitment questions every HR professional should know in 2025

Estimated read time: 7 minutes

Many "tell me about yourself" answers are now drafted with ChatGPT the night before the interview. That shift, with candidates arriving with rehearsed, AI-polished narratives, has undermined the standard interview script and pushed recruiters to redesign their question sets. This guide outlines the categories of recruitment questions every HR professional should know in 2025, why each matters, and example questions you can adapt to your hiring rubric or scorecard today.

LinkedIn's 2024 Global Talent Trends report notes that skills-based hiring and behavioral assessment have moved from optional to expected in most talent acquisition workflows. Yet many hiring conversations still rely on outdated prompts that produce polished answers and unclear signals. The recruiter persona — the one running req intake, pipeline reviews, and screen calls — needs a tighter toolkit.

Who this is for: This article is written for recruiters and talent acquisition partners running structured interviews. Hiring managers building a scorecard alongside the recruiter will also find the question categories useful.

[Figure: Adoption of Structured Hiring Practices Among HR Teams (2020–2025). Source: LinkedIn Global Talent Trends, as cited in this article.]

Why modern recruitment questions fail when they stay outdated

Industry observers at SHRM have noted that candidates are better prepared, interviews are more structured, and expectations on both sides have risen (SHRM research). With generative AI tools widely available, many candidates now enter screens with refined, rehearsed narratives.

The result is predictable — polished answers, unclear signals, and decisions made on incomplete understanding. The quality of the recruitment questions you bring into the room directly defines the quality of the signal you capture on the scorecard.

A contestable position worth stating plainly: behavioral interview frameworks like STAR are now overused to the point where candidates have memorized the structure, which reduces signal quality unless interviewers probe past the rehearsed answer with follow-ups.

What this article won't claim

Structured behavioral interviewing is not a silver bullet. Over-indexing on adaptability can screen out deep specialists whose value is stability and depth. Ownership-mindset framing, if applied rigidly, can disadvantage neurodivergent candidates or those from cultures where collective credit is the norm. Use the questions below as part of a balanced rubric — not as a single filter.

From "tell me about yourself" to understanding real intent

Traditional opening questions rarely reveal a candidate's intent or direction. A stronger opening probes why a candidate is moving at this specific point and what kind of work keeps them engaged beyond compensation.

Evidence from Gallup's 2023 State of the Global Workplace report suggests today's workforce is increasingly motivated by alignment, learning, and perceived growth — not stability alone. If this layer is missed early in the interview, the rest of the evaluation becomes less reliable.

Example intent and motivation questions

  • "Walk me through the last time you decided to leave a role. What specifically triggered the decision?"
  • "What kind of work has made you lose track of time in the last 12 months?"
  • "If this role didn't exist, what would your second-choice next move be — and why?"
  • "What would need to be true 18 months from now for you to consider this move a success?"

What to listen for

  • Specific triggers and trade-offs, not generic phrases like "growth" or "new challenges."
  • Consistency between the stated motivation and the candidate's actual career pattern.

Red flags

  • Answers that match the job description back to you almost verbatim.
  • Vague language about "culture" or "growth" with no concrete example.

Behavioral and competency-based recruitment questions: getting past scripted answers

One of the biggest challenges recruiters face today is not lack of talent, but over-prepared talent. Hiring practitioners increasingly find that well-structured, confident answers do not always reflect real capability, especially when responses are influenced by preparation tools or rehearsed narratives.

This is why competency-based questions — which explore decision-making logic, trade-offs, and real-time reasoning — produce higher signal than story-based prompts alone. For technical roles, pairing these with a practical assessment helps confirm what the interview surfaces. HackerEarth's skill assessments use role-specific question libraries and rubric-based scoring so the recruiter can compare candidate outputs against a defined standard, rather than relying on the candidate's own narrative of their capability.

Example behavioral and competency-based questions

  1. "Tell me about a decision you made in the last six months that you would make differently today. What changed your thinking?"
  2. "Describe a time you disagreed with your manager on a priority. How did you handle it?"
  3. "Walk me through a project where the scope changed mid-execution. What did you cut, and why?"
  4. "Give me an example of feedback you initially rejected but later acted on."

How to probe past the rehearsed answer

If a candidate delivers a clean STAR-format response, follow up with: "What's one detail you usually leave out of that story?" or "Who would tell that story differently?" These prompts disrupt the rehearsed structure and surface the actual reasoning.

Situational judgment and adaptability questions

Workplaces are shaped by continuous change — shifting priorities, evolving tools, and hybrid collaboration. Many hiring teams now treat adaptability as a core hiring parameter rather than a soft skill, particularly for roles where ambiguity is the default state.

Situational judgment questions present a realistic scenario and ask the candidate how they would navigate it. They are harder to rehearse than story-based prompts because the scenario is novel.

Example situational judgment questions

  • "You join the team and discover the project you were hired to lead has already slipped two months. What are your first three actions in week one?"
  • "Two stakeholders give you conflicting priorities on the same Friday. Both are senior to you. How do you handle it?"
  • "A teammate is consistently delivering work that is technically correct but late. You are not their manager. What do you do?"
  • "You realize halfway through a quarter that the metric you committed to is no longer the right one. How do you raise it?"
  • "Your top-performing team member tells you in a 1:1 they're considering leaving. They haven't told their manager. What do you do in the next 24 hours?"
  • "A vendor misses a critical deadline that puts your launch at risk. Walk me through how you decide whether to escalate, switch vendors, or absorb the delay."

What to listen for

  • Sequencing — do they ask clarifying questions before acting?
  • Trade-off awareness — do they acknowledge what they would not do?
  • Stakeholder reasoning — who do they involve, and when?

Culture and values-alignment questions

Cultural fit is often misunderstood as shared interests or personality alignment. A more useful frame is behavioral consistency with the team's working norms.

A second contestable position: generic "culture fit" questions should be retired in favor of values-alignment scenarios that name a specific behavior the company expects. "Culture fit" as a phrase invites bias; a scenario tied to a stated company value forces a more concrete answer.

Example values-alignment questions

  • "Our team gives feedback in writing before live discussion. Describe the last time you gave hard feedback. What did you write down first?"
  • "We prioritize shipping over perfection. Tell me about a time you shipped something you weren't fully proud of. What happened next?"
  • "Describe the last time you changed your mind because of data, not opinion."

For a deeper look at how culture signals show up in technical interviews, see our guide on how to design a structured technical interview.

Identifying ownership mindset over task execution

Task completion alone is no longer a strong hiring indicator for most knowledge roles. What recruiters and hiring managers increasingly screen for is the ownership mindset — how a candidate behaves when outcomes are unclear, accountability is shared, or success metrics evolve mid-execution.

A concrete scenario

Consider a Series B SaaS company hiring its first sales operations manager. The pipeline is messy, the CRM is half-implemented, and the founder is the de-facto rev-ops owner. Standard task-execution questions ("walk me through how you'd clean a pipeline") produce textbook answers. Ownership-mindset questions — "What would you stop doing in your first 30 days, and how would you tell the founder?" — surface whether the candidate can hold the seat. A strong answer names a specific thing they'd stop (e.g., "weekly pipeline reviews in their current form"), the trade-off they're willing to accept, and how they'd frame the conversation with the founder. A weak answer lists everything they'd add — new dashboards, new processes, new tooling — without naming a single thing they'd remove or a single conversation they'd own.

Example ownership questions

  • "Tell me about something you fixed that wasn't your job to fix."
  • "Describe a time the goalposts moved on you. What did you do in the first 48 hours?"
  • "What's a process you killed, and what replaced it?"

Red flags

  • Answers that always credit "the team" with no individual decision named.
  • Stories where the candidate is consistently the rescuer or always the victim.

Questions to avoid: legal and compliance boundaries

A structured question set is only as strong as its weakest prompt. In most jurisdictions, certain questions are either illegal or carry significant legal risk because they touch protected characteristics or regulated information.

Common categories to avoid in initial screens:

  • Age, date of birth, or graduation year as a proxy for age.
  • Marital status, family planning, or childcare arrangements ("Do you plan to have kids?" "Who watches your children?").
  • Citizenship or national origin beyond the legally permitted "Are you authorized to work in [country]?"
  • Religion, religious holidays, or observance schedules.
  • Disability or medical history, including questions about prior workers' compensation claims.
  • Salary history — now restricted or banned in many US states and several other jurisdictions. Ask about salary expectations instead.

For a deeper treatment of pre-employment screening practices and compliance, see our overview of pre-employment assessment design. Always confirm specifics with your legal or HR compliance partner — local law varies.

Rethinking what "good answers" actually mean

In traditional interviews, clarity and confidence were often equated with strong performance. Modern hiring increasingly challenges this assumption.

The signal you want is depth, consistency, and reasoning quality — even when responses are less polished. A candidate who says "I don't know, but here's how I'd find out" is often a stronger hire than one who delivers a fluent answer with no underlying logic.

To codify this on the scorecard, score reasoning and presentation as separate rubric lines. A candidate can score 4/5 on reasoning and 2/5 on presentation and still be a strong hire — but you will only see that if the rubric separates them.
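The effect of separating the two rubric lines can be illustrated with a small sketch. Everything here is hypothetical: the `Scorecard` fields, the 1-5 scale, and the rule that only the reasoning line gates the decision are assumptions made for illustration, not a prescribed scoring method.

```python
from dataclasses import dataclass


@dataclass
class Scorecard:
    """One interviewer's scores for one candidate, each on a 1-5 scale (assumed)."""
    candidate: str
    reasoning: int
    presentation: int

    def __post_init__(self):
        # Reject scores outside the assumed 1-5 rubric scale.
        for name in ("reasoning", "presentation"):
            value = getattr(self, name)
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {value}")


def blended_score(card: Scorecard) -> float:
    """Single-line scoring: reasoning and presentation averaged together."""
    return (card.reasoning + card.presentation) / 2


def passes_reasoning_bar(card: Scorecard, bar: int = 4) -> bool:
    """Hypothetical rule: only the reasoning line gates the decision."""
    return card.reasoning >= bar


cards = [
    Scorecard("Candidate A", reasoning=4, presentation=2),
    Scorecard("Candidate B", reasoning=2, presentation=5),
]

for card in cards:
    print(card.candidate, blended_score(card), passes_reasoning_bar(card))
# Candidate A 3.0 True
# Candidate B 3.5 False
```

With a single blended line, Candidate B ranks higher (3.5 vs 3.0). With separate lines, Candidate A clears the reasoning bar and Candidate B does not, which is the distinction a combined score hides.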

FAQ: structured hiring questions

Which recruitment question category is most often skipped — and why does it matter?

In practice, ownership-mindset questions are the category recruiters most often skip, because they're the hardest to score consistently and the answers don't fit neatly into STAR. The cost of skipping them is high: ownership signal is what separates strong individual contributors from people who execute well only when the path is clear. If you only have time to add one new category to your interview guide, this is the one with the largest marginal lift.

What is the STAR method, and is it still useful?

STAR stands for Situation, Task, Action, Result. It is a candidate-response framework that helps structure answers to behavioral questions. It remains useful as a default structure, but because most candidates now prepare STAR-formatted stories, interviewers should probe past the rehearsed answer with follow-up questions about trade-offs, omitted details, and alternative perspectives.

How many interview questions should a structured interview include?

Practitioners commonly recommend 5–8 core questions per 45-minute round, with planned follow-up probes. This is a rule of thumb rather than a sourced standard. Fewer questions with deeper probes typically produce more signal than many surface-level questions.

What is the difference between behavioral and situational judgment questions?

Behavioral questions ask about past actions ("Tell me about a time you…"). Situational judgment questions ask about hypothetical scenarios ("What would you do if…"). Behavioral questions draw on what the candidate has actually done; situational questions test reasoning on novel problems. Strong interview loops use both.

How do you reduce bias in recruitment questions?

Use a structured interview where every candidate is asked the same core questions, score answers on a defined rubric, and have at least two interviewers calibrate independently before discussing. Avoid "culture fit" as a freeform judgment; replace it with values-alignment scenarios tied to documented company behaviors.

Can skill assessments replace interview questions?

No. Assessments and interview questions answer different things. Assessments produce structured skill evaluation against a defined rubric; interview questions surface reasoning, motivation, and judgment. The strongest hiring loops pair both — skill assessments for verified capability, structured behavioral interviews for everything assessments can't measure.

Final thoughts and next steps

The recruitment questions every HR professional should know in 2025 are not a fixed list — they are a working toolkit you adapt to the role, the level, and the rubric. The categories above (intent, behavioral, situational, values-alignment, ownership) give you a structure; the example questions give you a starting point.

Next steps

  • Audit your current interview guide. Map every question to one of the five categories above. If a category is empty, add two questions.
  • Separate reasoning from presentation on your scorecard. Score them as distinct rubric lines.
  • Pair interviews with skill verification. Schedule a demo of HackerEarth Assessments to see how rubric-based skill scores integrate with your interview scorecard, so your hiring decision isn't relying on candidate self-report alone.
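The audit step in the list above can be sketched in a few lines, assuming the interview guide is kept as a list of (question, category) pairs. The `audit_guide` helper, the category labels, and the sample guide are illustrative assumptions; the questions are taken from the examples earlier in this article.

```python
from collections import Counter

# The five categories named in this article (labels are an assumed convention).
CATEGORIES = ("intent", "behavioral", "situational", "values-alignment", "ownership")


def audit_guide(guide):
    """Count questions per category and list the categories with none.

    `guide` is a list of (question, category) tuples.
    """
    counts = Counter(category for _question, category in guide)
    unknown = sorted(set(counts) - set(CATEGORIES))
    if unknown:
        raise ValueError(f"Unrecognized categories: {unknown}")
    coverage = {category: counts[category] for category in CATEGORIES}
    gaps = [category for category in CATEGORIES if coverage[category] == 0]
    return coverage, gaps


current_guide = [
    ("What kind of work has made you lose track of time in the last 12 months?", "intent"),
    ("Describe a time you disagreed with your manager on a priority. How did you handle it?", "behavioral"),
    ("Give me an example of feedback you initially rejected but later acted on.", "behavioral"),
    ("Tell me about something you fixed that wasn't your job to fix.", "ownership"),
]

coverage, gaps = audit_guide(current_guide)
print(coverage)
print("Add two questions to each of:", gaps)
```

For this sample guide, the gaps are the situational and values-alignment categories, so those are where the two new questions per category would go.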

Sources referenced: LinkedIn Global Talent Trends, SHRM Research, Gallup State of the Global Workplace.

Why Empathy Could Be Your Biggest Hiring Advantage

Why Human-Centered Hiring Matters More Than Ever

Hiring has never been more optimized than it is today.

From AI-powered recruitment tools to automated screening systems and structured interview workflows, HR and talent acquisition teams now have more ways than ever to improve hiring speed, consistency, and scalability.

But in the middle of this efficiency-driven approach, one critical element is slowly disappearing: empathy for candidates.

Empathy in hiring is not about slowing down recruitment or making decisions less objective. It is about ensuring candidates are treated like people navigating important career decisions, not just profiles moving through a hiring pipeline.

As recruitment becomes increasingly system-driven, preserving the human side of hiring is becoming both more difficult and more important.

For HR leaders and talent acquisition professionals, this is no longer just a workplace culture discussion. It directly impacts candidate experience, employer branding, hiring quality, and long-term employee retention.

When Hiring Feels Like a Process Instead of an Experience

Most modern recruitment systems are designed around efficiency.

Applications are filtered automatically, interviews are scheduled faster, and candidates move through hiring stages with minimal manual effort. Operationally, this creates speed and structure.

But from a candidate’s perspective, the experience can often feel distant and impersonal.

Many candidates go through multiple interview rounds without clear communication, feedback, or transparency about timelines and expectations. Even when the hiring process is fair, it may still feel mechanical.

This creates a growing challenge for HR and TA teams:

How do you maintain hiring efficiency without removing the human connection from recruitment?

That is where empathy becomes essential.

The Hidden Cost of Low-Empathy Hiring

The impact of low-empathy hiring is not always immediate, but it compounds over time.

Candidates remember how organizations made them feel during the recruitment process, especially during rejection or delayed communication. Those experiences shape employer perception long before someone becomes an employee.

Over time, this directly affects employer brand and candidate trust.

There is also another hidden cost.

When hiring becomes too rigid or overly process-driven, recruiters may overlook candidates with strong long-term potential simply because they do not perfectly match predefined criteria.

Without empathy, context disappears.

And when context disappears, opportunities are often missed.

For HR leaders, empathy is no longer just a soft skill. It is becoming a competitive hiring advantage.

Why Empathy Is Becoming a Competitive Hiring Skill

Today’s workforce is far more dynamic than it was a decade ago.

Professionals switch industries, build careers through unconventional paths, and learn skills outside traditional education systems. As a result, resumes and structured evaluations only tell part of the story.

Empathy helps recruiters understand what exists beyond the surface.

It allows hiring teams to better understand:

  • Career transitions
  • Employment gaps
  • Nontraditional experience
  • Personal growth journeys

This shift changes the entire hiring mindset.

Instead of asking:

“Does this candidate perfectly match the role?”

Recruiters are increasingly asking:

“What could this candidate become in the right environment?”

That perspective creates stronger and more future-focused hiring decisions.

Where Empathy Fits in Modern Recruitment

Empathy does not replace structured hiring systems.

In fact, it becomes most effective when built into them.

Simple improvements in communication can significantly improve candidate experience. Clear updates, transparent timelines, respectful rejection emails, and honest feedback all contribute to a more human-centered recruitment process.

These small changes often have a lasting impact on how candidates perceive an organization.

For HR teams, the goal is not to remove structure from hiring.

The goal is to ensure structure does not remove humanity.

Better Hiring Decisions Start With Better Human Understanding

Empathy also improves the quality of hiring decisions themselves.

When recruiters take time to understand a candidate’s context, they often uncover strengths that are not immediately visible on resumes or scorecards.

A candidate who appears average on paper may demonstrate exceptional adaptability, resilience, or problem-solving ability in real-world situations.

Without empathy, those signals are easy to miss.

For talent acquisition leaders, this means recognizing that hiring is not just about selecting the strongest profile.

It is about identifying the strongest long-term fit within a real human context.

Final Thoughts

As recruitment continues evolving through automation, AI hiring tools, and structured decision-making, the biggest risk is not losing efficiency.

It is losing humanity.

Empathy toward candidates keeps hiring people-focused, even as processes become more technology-driven.

It does not slow recruitment down. Instead, it helps organizations create better candidate experiences, stronger employer brands, and more thoughtful hiring decisions.

Because candidates may forget interview questions or assessment scores.

But they will always remember how they were treated during the hiring process.

And in today’s competitive talent market, that experience often determines whether top talent chooses to join or walk away.
