Data Science MCQ Practice Problems

Question 1

A pie chart in Matplotlib displays incorrect proportions. What could be the issue?

Accepted Answer

Incorrect sum of values

Answer

Wrong data labels

Answer

Missing data

Answer

Invalid chart type
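
Note on Question 1: a pie chart draws each value as a share of the total, so values that do not add up to the intended whole produce misleading slices. The sketch below reproduces that arithmetic in plain Python. The helper name pie_fractions is hypothetical and written only for this note; it is not a Matplotlib function, although recent Matplotlib versions rescale the input in the same way by default.

```python
def pie_fractions(values):
    """Return each value's share of the total, as a pie chart would draw it."""
    total = sum(values)
    if total <= 0:
        raise ValueError("values must have a positive sum")
    return [v / total for v in values]


# Percentages from a multi-select survey: they add up to 150, not 100.
survey = [60, 50, 40]
shares = pie_fractions(survey)

# The 60% response is drawn as a 40% slice because the total is 150.
print([round(s, 3) for s in shares])
```

Checking that the inputs sum to the expected total before plotting is a cheap way to catch this kind of problem.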

Question 2

A scatter plot shows overlapping points, making it hard to interpret. What can improve its readability?

Accepted Answer

Add jitter

Answer

Increase marker size

Answer

Use smaller axes

Answer

Change chart type
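
Note on Question 2: jitter means adding a small random offset to each point so that identical coordinates no longer sit exactly on top of each other. A minimal sketch using only the standard library follows; add_jitter, its width parameter, and the sample ratings are assumptions made for illustration.

```python
import random


def add_jitter(values, width=0.2, seed=0):
    """Shift each value by a uniform random offset in [-width, +width]."""
    rng = random.Random(seed)  # fixed seed keeps the result repeatable
    return [v + rng.uniform(-width, width) for v in values]


# Ratings on a 1-5 scale: many points share exactly the same coordinate.
ratings = [3, 3, 3, 4, 4, 5]
jittered = add_jitter(ratings)

# Plotting `jittered` instead of `ratings` spreads identical points apart.
print(jittered)
```

The offset should stay small relative to the spacing between real values, otherwise the jitter itself distorts the picture.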

Question 3

A line chart is difficult to interpret due to too many data points. What is the best approach to simplify it?

Accepted Answer

Aggregate data

Answer

Remove the chart

Answer

Use larger axes

Answer

Switch to bar chart
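
Note on Question 3: aggregating replaces many raw points with fewer summary points, for example one average per time bucket. The sketch below averages consecutive groups of values; bucket_means and the sample readings are hypothetical names and data chosen for this note.

```python
def bucket_means(values, bucket_size):
    """Average consecutive groups of `bucket_size` values."""
    if bucket_size <= 0:
        raise ValueError("bucket_size must be positive")
    means = []
    for start in range(0, len(values), bucket_size):
        bucket = values[start:start + bucket_size]
        means.append(sum(bucket) / len(bucket))
    return means


# Eight raw readings become two plotted points.
readings = [10, 12, 11, 13, 30, 32, 31, 33]
smoothed = bucket_means(readings, bucket_size=4)
print(smoothed)  # [11.5, 31.5]
```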

Question 4

What is the primary objective of machine learning?

Accepted Answer

To make predictions based on data

Answer

To clean data

Answer

To create databases

Answer

To improve hardware

Question 5

Which of the following is a supervised learning algorithm?

Accepted Answer

Decision Trees

Answer

K-Means

Answer

DBSCAN

Answer

Principal Component Analysis

Question 6

What is overfitting in machine learning?

Accepted Answer

Model performs well on training data but poorly on new data

Answer

Model performs poorly on training data

Answer

Model is too simple

Answer

Model has no bias

Question 7

What is the purpose of a loss function in machine learning?

Accepted Answer

To evaluate model predictions

Answer

To split datasets

Answer

To improve visualization

Answer

To standardize data
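
Note on Question 7: a loss function turns the gap between predictions and targets into a single number, so that better predictions score lower. The sketch below hand-computes mean squared error, one common choice for regression; the function name mse and the sample numbers are assumptions for illustration.

```python
def mse(y_true, y_pred):
    """Mean of squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


targets = [3.0, 5.0, 7.0]
close_preds = [2.5, 5.0, 7.5]  # small errors
far_preds = [0.0, 9.0, 3.0]    # large errors

loss_close = mse(targets, close_preds)
loss_far = mse(targets, far_preds)

# The predictions that sit nearer the targets receive the lower loss.
print(loss_close < loss_far)  # True
```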

Question 8

Why is it important to split data into training and testing datasets?

Accepted Answer

To evaluate model performance on unseen data

Answer

To increase dataset size

Answer

To clean data

Answer

To preprocess features

Question 9

Which Python library provides the train_test_split function?

Accepted Answer

scikit-learn

Answer

NumPy

Answer

Pandas

Answer

Matplotlib
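
Note on Questions 8 and 9: train_test_split lives in sklearn.model_selection and shuffles the samples before holding part of them out. The example below is a minimal sketch on made-up data; the 20% test fraction and the random_state value are arbitrary choices.

```python
from sklearn.model_selection import train_test_split

X = [[i] for i in range(10)]    # 10 samples, 1 feature each
y = [i % 2 for i in range(10)]  # alternating 0/1 labels

# Hold out 20% of the samples; random_state makes the shuffle repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(len(X_train), len(X_test))  # 8 2
```

The model is then fitted on the training part only, and the held-out part stands in for unseen data.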

Question 10

How do you train a linear regression model using scikit-learn?

Accepted Answer

model.fit(X, y)

Answer

model.train(X, y)

Answer

model.learn(X, y)

Answer

model.predict(X, y)
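
Note on Question 10: scikit-learn estimators are trained with fit(X, y), while predict takes only the features. A short sketch on made-up data that follows y = 2x + 1:

```python
from sklearn.linear_model import LinearRegression

X = [[0.0], [1.0], [2.0], [3.0]]  # one feature per sample
y = [1.0, 3.0, 5.0, 7.0]          # generated from y = 2x + 1

model = LinearRegression()
model.fit(X, y)                   # training step

# predict() receives features only and returns one value per row.
prediction = model.predict([[4.0]])[0]
print(model.coef_[0], model.intercept_, prediction)
```

On this noise-free data the fitted slope and intercept should come out very close to 2 and 1.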

Question 11

Which scikit-learn function is used to calculate the accuracy of a classification model?

Accepted Answer

accuracy_score

Answer

classification_report

Answer

score

Answer

confusion_matrix
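
Note on Question 11: accuracy_score is imported from sklearn.metrics and returns the fraction of predictions that match the true labels. The labels below are made up for illustration.

```python
from sklearn.metrics import accuracy_score

y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]  # the third prediction is wrong

acc = accuracy_score(y_true, y_pred)
print(acc)  # 0.75
```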

Question 12

A model's predictions have high bias. What could be the likely issue?

Accepted Answer

Underfitting

Answer

Overfitting

Answer

Feature scaling

Answer

Incorrect testing data

Question 13

A classification model achieves 99% accuracy on the training set but only 60% on the test set. What is the issue?

Accepted Answer

Overfitting

Answer

Underfitting

Answer

Data imbalance

Answer

Feature scaling
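
Note on Question 13: the train/test gap can be reproduced by letting a model memorize noisy training data. The sketch below is one way to see it, assuming synthetic data from make_classification with 20% of labels flipped and an unrestricted decision tree; the exact accuracies depend on the data and the scikit-learn version.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with 20% of the labels flipped at random (label noise).
X, y = make_classification(
    n_samples=400, n_features=10, flip_y=0.2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# A tree with no depth limit can memorize the training set, noise included.
deep_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
train_acc = deep_tree.score(X_train, y_train)
test_acc = deep_tree.score(X_test, y_test)

print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

Limiting the tree (for example with max_depth) is one common way to narrow such a gap.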

Question 14

After training a regression model, the residuals show a clear pattern. What does this imply?

Accepted Answer

Model assumptions are violated

Answer

Model is accurate

Answer

Feature scaling is wrong

Answer

Data is balanced
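
Note on Question 14: a pattern in the residuals usually means the model form is missing something. The sketch below fits a straight line to data that is actually quadratic, using NumPy's polyfit; the data is made up for illustration.

```python
import numpy as np

x = np.arange(-3, 4, dtype=float)  # -3, -2, ..., 3
y = x ** 2                         # the true relationship is curved

# Fit a straight line (degree-1 polynomial) by least squares.
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)

# Residuals are positive at both ends and negative in the middle:
# a U-shaped pattern instead of random scatter around zero.
print(np.round(residuals, 2))
```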

Question 15

What is the key difference between supervised and unsupervised learning?

Accepted Answer

Supervised uses labeled data, unsupervised does not

Answer

Both use labeled data

Answer

Both use unlabeled data

Answer

Unsupervised requires labels

Question 16

Which of the following is an example of a supervised learning algorithm?

Accepted Answer

Linear Regression

Answer

K-Means

Answer

Hierarchical Clustering

Answer

PCA

Question 17

Which task is best suited for unsupervised learning?

Accepted Answer

Identifying customer segments

Answer

Predicting house prices

Answer

Spam classification

Answer

Stock price prediction

Question 18

What metric is commonly used to evaluate a regression model in supervised learning?

Accepted Answer

Mean Squared Error (MSE)

Answer

Accuracy

Answer

Precision

Answer

Silhouette score

Question 19

Why is clustering considered an unsupervised learning technique?

Accepted Answer

It finds patterns in unlabeled data

Answer

It requires labeled data

Answer

It uses supervised models

Answer

It predicts outcomes

Question 20

Which Python library provides the KMeans function for clustering?

Accepted Answer

scikit-learn

Answer

NumPy

Answer

Pandas

Answer

Matplotlib

Question 21

How do you fit a decision tree classifier in scikit-learn?

Accepted Answer

model.fit(X, y)

Answer

model.train(X, y)

Answer

model.learn(X, y)

Answer

model.split(X, y)
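
Note on Question 21: classifiers follow the same estimator interface as regressors, so a decision tree is also trained with fit(X, y). A minimal sketch on made-up, cleanly separable data:

```python
from sklearn.tree import DecisionTreeClassifier

X = [[0], [1], [2], [3]]  # one feature per sample
y = [0, 0, 1, 1]          # class changes between 1 and 2

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X, y)

predicted = clf.predict([[0.5], [2.5]])
print(list(predicted))  # [0, 1]
```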

Question 22

Which function in scikit-learn is used to calculate the silhouette score for a clustering model?

Accepted Answer

silhouette_score()

Answer

cluster_score()

Answer

clustering_score()

Answer

silhouette_metric()

Question 23

How do you specify the number of clusters in the KMeans algorithm using scikit-learn?

Accepted Answer

KMeans(n_clusters=n)

Answer

KMeans(clusters=n)

Answer

KMeans(n=n)

Answer

KMeans(n_cluster=n)
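
Note on Questions 22 and 23: the two calls are often used together, with KMeans(n_clusters=...) producing labels and silhouette_score(X, labels) rating how well separated the clusters are. The points below are made up so that two groups are obvious; n_init and random_state are set explicitly only to keep the run repeatable.

```python
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Two tight groups that are far apart from each other.
points = [[0, 0], [0, 1], [1, 0],
          [10, 10], [10, 11], [11, 10]]

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(points)

# Silhouette ranges from -1 to 1; values near 1 mean well-separated clusters.
score = silhouette_score(points, labels)
print(list(labels), round(score, 2))
```

Which group receives label 0 and which receives label 1 is arbitrary, so only the grouping should be compared across runs.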

Question 24

A supervised model performs poorly on unseen data. What is the likely issue?

Accepted Answer

Data leakage

Answer

Underfitting

Answer

Incorrect loss function

Answer

Missing labels

Question 25

A clustering model produces inconsistent results. What could be the likely cause?

Accepted Answer

Wrong feature scaling

Answer

Labeled data

Answer

High accuracy

Answer

Balanced dataset

Question 26

After applying KMeans, one cluster has very few data points. What should you consider next?

Accepted Answer

Visualize clusters

Answer

Increase cluster count

Answer

Decrease cluster count

Answer

Change the algorithm

Question 27

What is the primary goal of feature engineering in machine learning?

Accepted Answer

Enhance model performance

Answer

Improve model interpretability

Answer

Reduce dataset size

Answer

Avoid overfitting

Question 28

Which technique is commonly used to handle categorical data in feature engineering?

Accepted Answer

One-hot encoding

Answer

Normalization

Answer

PCA

Answer

Standardization
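
Note on Question 28: one-hot encoding replaces a categorical column with one 0/1 indicator column per category. The sketch below spells the idea out in plain Python; one_hot is a hypothetical helper for this note, and in practice library routines such as pandas.get_dummies or scikit-learn's OneHotEncoder are the usual choice.

```python
def one_hot(values):
    """Encode categorical values as 0/1 rows, one column per category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


colors = ["red", "green", "red", "blue"]
categories, encoded = one_hot(colors)

print(categories)  # ['blue', 'green', 'red']
print(encoded)     # [[0, 0, 1], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
```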

Question 29

Why is feature scaling important in machine learning?

Accepted Answer

Improves convergence during training

Answer

Reduces model size

Answer

Handles missing values

Answer

Reduces overfitting
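
Note on Question 29: features measured on very different scales can dominate gradient steps or distance calculations. Standardization, sketched below with the standard library, rescales each feature to zero mean and unit standard deviation; the helper name standardize and the sample columns are assumptions for illustration.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize a constant feature")
    return [(v - mu) / sigma for v in values]


incomes = [30_000, 45_000, 60_000, 75_000]  # large numeric scale
ages = [25, 35, 45, 55]                     # small numeric scale

scaled_incomes = standardize(incomes)
scaled_ages = standardize(ages)

# After scaling, both features occupy the same numeric range.
print([round(v, 3) for v in scaled_incomes])
print([round(v, 3) for v in scaled_ages])
```

scikit-learn's StandardScaler applies the same transformation column by column.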

Question 30

What is feature selection?

Accepted Answer

Choosing the best features

Answer

Adding new features

Answer

Removing outliers

Answer

Scaling data
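
Note on Question 30: feature selection keeps the columns that carry signal and drops the rest. The sketch below uses scikit-learn's SelectKBest with the f_classif score on made-up data where the first column separates the classes and the second does not; k=1 is an arbitrary choice for the example.

```python
from sklearn.feature_selection import SelectKBest, f_classif

# Column 0 differs strongly between classes; column 1 has the same
# distribution in both classes and so carries no class information.
X = [[1, 5], [2, 6], [1, 5], [2, 6],
     [8, 5], [9, 6], [8, 5], [9, 6]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

selector = SelectKBest(score_func=f_classif, k=1)
X_selected = selector.fit_transform(X, y)
kept = selector.get_support()  # boolean mask over the original columns

print(list(kept))  # [True, False]
```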
