Data Science MCQ Practice Problems

Question 1

When should feature extraction be used instead of feature selection?

Accepted Answer

When features need transformation

Answer

When raw features are sufficient

Answer

When data is balanced

Answer

When model accuracy is high

Question 2

Which scikit-learn function is used to normalize data?

Accepted Answer

normalize()

Answer

standardize()

Answer

scale()

Answer

transform()
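
As a brief illustration of the accepted answer, the sketch below calls `normalize` from `sklearn.preprocessing` on a small made-up array. Note that this function rescales each sample (row) to unit norm, which is different from standardizing each column to zero mean and unit variance; the input values are an assumption chosen for easy arithmetic.

```python
import numpy as np
from sklearn.preprocessing import normalize

# Hypothetical toy data: two samples, two features.
X = np.array([[3.0, 4.0],
              [1.0, 0.0]])

# Scale each row to unit L2 norm (the default axis is rows/samples).
X_l2 = normalize(X, norm="l2")

print(X_l2)  # first row becomes [0.6, 0.8]
```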

Question 3

How do you perform one-hot encoding in Pandas?

Accepted Answer

pd.get_dummies()

Answer

pd.one_hot()

Answer

pd.categorical()

Answer

pd.encoding()
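
For reference, pandas exposes one-hot encoding through `pd.get_dummies`. The following is a minimal sketch on a made-up `color` column; the DataFrame and column name are illustrative assumptions.

```python
import pandas as pd

# Hypothetical categorical column.
df = pd.DataFrame({"color": ["red", "green", "red"]})

# One indicator column is created per category, named "<column>_<value>".
encoded = pd.get_dummies(df, columns=["color"])

print(encoded.columns.tolist())
```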

Question 4

Which method in scikit-learn is used for dimensionality reduction?

Accepted Answer

PCA()

Answer

StandardScaler()

Answer

KMeans()

Answer

OneHotEncoder()
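
A short sketch of how `PCA` from `sklearn.decomposition` is typically used to reduce dimensionality. The random data and the choice of two components are assumptions made only for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data: 100 samples with 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

# Project the data onto its first two principal components.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)
print(pca.explained_variance_ratio_)
```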

Question 5

A dataset has highly correlated features. How should you handle this issue?

Accepted Answer

Drop one of the correlated features

Answer

Normalize features

Answer

Encode features

Answer

Use PCA

Question 6

A numerical feature has a skewed distribution. What transformation can address this?

Accepted Answer

Log transformation

Answer

Drop the feature

Answer

One-hot encoding

Answer

Normalize values
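
To make the effect of a log transformation concrete, here is a small sketch using only the standard library. It uses `math.log1p` (log of 1 + x) so that zero values are handled; the sample values are invented and the choice of `log1p` over a plain log is an assumption.

```python
import math

# Hypothetical right-skewed values spanning several orders of magnitude.
values = [0, 1, 10, 100, 1000]

# log1p(x) = log(1 + x) compresses large values and is defined at x = 0.
log_values = [math.log1p(v) for v in values]

print(log_values)
```

After the transformation, the gap between the largest and smallest values shrinks from 1000 to under 7, which is what reduces the right skew.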

Question 7

A dataset has missing values for important features. What is the best approach to address this?

Accepted Answer

Impute values

Answer

Remove the rows

Answer

Drop the feature

Answer

Ignore the missing data
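
One simple form of imputation is filling gaps with the column mean. The sketch below shows this with pandas `fillna`; the series is made up, and mean imputation is only one of several possible strategies (median or model-based imputation may suit other data better).

```python
import pandas as pd

# Hypothetical feature with one missing value.
s = pd.Series([1.0, None, 3.0])

# Replace missing entries with the mean of the observed values.
filled = s.fillna(s.mean())

print(filled.tolist())  # [1.0, 2.0, 3.0]
```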

Question 8

What is a key characteristic of time series data?

Accepted Answer

Sequential observations over time

Answer

Random observations

Answer

Data without timestamps

Answer

Categorical data

Question 9

Which of the following is commonly used to detect seasonality in time series data?

Accepted Answer

Autocorrelation

Answer

Histogram

Answer

Scatter plot

Answer

PCA
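
The idea behind using autocorrelation to detect seasonality can be sketched in plain Python: a seasonal series correlates strongly with itself at a lag equal to the seasonal period. The helper `autocorr` and the toy series with period 4 are hypothetical, written only for this illustration.

```python
def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag (illustrative helper)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag))
    return cov / var

# Hypothetical series that repeats every 4 steps.
series = [0, 1, 0, -1] * 6

print(autocorr(series, 4))  # high: the lag matches the seasonal period
print(autocorr(series, 2))  # negative: half a period out of phase
```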

Question 10

Why is stationarity important in time series analysis?

Accepted Answer

It allows for accurate forecasting, since many models assume stable statistical properties over time

Answer

It ensures data completeness

Answer

It stabilizes variance

Answer

It reduces data size

Question 11

What is the purpose of differencing in time series preprocessing?

Accepted Answer

To remove trend and make data stationary

Answer

To detect seasonality

Answer

To visualize data

Answer

To encode features
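
A minimal sketch of first-order differencing with pandas `Series.diff`, assuming a made-up series with a steady upward trend. Differencing replaces each value with its change from the previous step, so a linear trend becomes a constant.

```python
import pandas as pd

# Hypothetical series with a linear upward trend.
s = pd.Series([10, 12, 14, 16, 18])

# First difference: value[t] - value[t-1]; the first entry has no predecessor.
d = s.diff().dropna()

print(d.tolist())  # [2.0, 2.0, 2.0, 2.0]
```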

Question 12

Which metric is commonly used to evaluate the accuracy of a time series model?

Accepted Answer

Mean Absolute Error (MAE)

Answer

Precision

Answer

Silhouette Score

Answer

Log Loss
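
Mean Absolute Error is the average of the absolute differences between actual and forecast values. Here is a small worked sketch in plain Python; the numbers and the helper name `mean_absolute_error` are invented for illustration.

```python
def mean_absolute_error(actual, forecast):
    """Average absolute difference between paired values (illustrative helper)."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

# Hypothetical observations and forecasts.
actual = [100, 110, 120]
forecast = [98, 113, 119]

mae = mean_absolute_error(actual, forecast)
print(mae)  # (2 + 3 + 1) / 3 = 2.0
```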

Question 13

Which Python library provides the seasonal_decompose function for analyzing time series components?

Accepted Answer

statsmodels

Answer

Pandas

Answer

NumPy

Answer

Matplotlib

Question 14

How do you plot a time series in Pandas?

Accepted Answer

time_series.plot()

Answer

plt.plot(time_series)

Answer

pd.plot(time_series)

Answer

plot(time_series)

Question 15

Which method is used in statsmodels to fit an ARIMA model for time series forecasting?

Accepted Answer

ARIMA().fit()

Answer

fit_arima()

Answer

arima_fit()

Answer

forecast_arima()

Question 16

A time series dataset shows an upward trend. What preprocessing step is necessary before modeling?

Accepted Answer

Differencing

Answer

One-hot encoding

Answer

Scaling

Answer

Normalizing

Question 17

A time series forecast consistently underestimates values during high seasons. What could be the issue?

Accepted Answer

Incorrect seasonality handling

Answer

Overfitting

Answer

Underfitting

Answer

Missing timestamps

Question 18

What is the main goal of Natural Language Processing?

Accepted Answer

Understanding and processing human language

Answer

Analyzing numerical data

Answer

Creating images

Answer

Performing clustering

Question 19

Which of the following tasks is NOT part of Natural Language Processing?

Accepted Answer

Image classification

Answer

Sentiment analysis

Answer

Speech recognition

Answer

Text summarization

Question 20

What is tokenization in NLP?

Accepted Answer

Dividing text into words or subwords

Answer

Encoding numerical data

Answer

Creating embeddings

Answer

Reducing noise in data

Question 21

What is the purpose of stopword removal in text preprocessing?

Accepted Answer

To remove common but insignificant words

Answer

To normalize text

Answer

To reduce dimensionality

Answer

To correct spelling

Question 22

What is a bag-of-words representation in NLP?

Accepted Answer

A numerical representation of text

Answer

A method to remove stopwords

Answer

A type of neural network

Answer

A clustering algorithm
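
A bag-of-words representation can be sketched with nothing more than word counts. The example below uses `collections.Counter` on a made-up sentence with whitespace tokenization, which is a simplifying assumption; word order is discarded and only frequencies remain.

```python
from collections import Counter

# Hypothetical document, tokenized naively on whitespace.
tokens = "the cat saw the dog".split()

# Map each distinct word to the number of times it occurs.
bow = Counter(tokens)

print(bow["the"])  # 2
```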

Question 23

Which library provides the word_tokenize function for tokenization in Python?

Accepted Answer

NLTK

Answer

NumPy

Answer

Pandas

Answer

Scikit-learn

Question 24

How do you create a term frequency-inverse document frequency (TF-IDF) matrix in scikit-learn?

Accepted Answer

TfidfVectorizer.fit_transform()

Answer

CountVectorizer.fit_transform()

Answer

TfidfTransformer.fit()

Answer

transform_TF()
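
A minimal sketch of building a TF-IDF matrix with `TfidfVectorizer`, assuming three tiny made-up documents. `fit_transform` learns the vocabulary and returns a sparse matrix with one row per document and one column per term.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical corpus of three short documents.
docs = ["the cat sat", "the dog sat", "the cat ran"]

vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(docs)

print(tfidf.shape)  # (documents, distinct terms)
print(sorted(vectorizer.vocabulary_))
```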

Question 25

Which Python library provides pre-trained word embeddings like Word2Vec?

Accepted Answer

Gensim

Answer

NLTK

Answer

Pandas

Answer

spaCy

Question 26

A text classification model performs poorly due to high-dimensional feature space. What preprocessing step can help?

Accepted Answer

Dimensionality reduction

Answer

Normalization

Answer

Feature extraction

Answer

Stopword removal

Question 27

A sentiment analysis model misclassifies reviews with negations (e.g., "not good"). What could address this?

Accepted Answer

Using n-grams

Answer

Stopword removal

Answer

Bag-of-words

Answer

TF-IDF
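
To see why n-grams help with negation, consider this plain-Python sketch that extracts bigrams from a made-up review. The helper `ngrams` is hypothetical; the point is that "not good" survives as a single feature instead of being split into "not" and "good".

```python
def ngrams(tokens, n):
    """Return all contiguous n-token sequences joined by spaces (illustrative helper)."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Hypothetical review text, tokenized on whitespace.
tokens = "the movie was not good".split()
bigrams = ngrams(tokens, 2)

print(bigrams)  # ['the movie', 'movie was', 'was not', 'not good']
```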

Question 28

Which tool is primarily used for creating interactive and shareable notebooks for data analysis?

Accepted Answer

Jupyter Notebook

Answer

RStudio

Answer

PyCharm

Answer

Tableau

Question 29

Which library in Python is most commonly used for data manipulation and analysis?

Accepted Answer

Pandas

Answer

Matplotlib

Answer

SciPy

Answer

NumPy

Question 30

What is the main use of R in Data Science?

Accepted Answer

Data visualization and statistical analysis

Answer

Deep learning

Answer

Web development

Answer

API creation
