
Q91 A machine translation system generates grammatically incorrect sentences. What is the likely issue?
Lack of linguistic features
Insufficient training
Poor tokenization
Low vocabulary coverage
Q92 What is the primary purpose of text classification in NLP?
To generate embeddings
To classify text into categories
To tokenize sentences
To remove stopwords
Q93 Which method is commonly used for vectorizing text in traditional NLP pipelines?
Bag-of-Words
Transformers
Word2Vec
RNNs
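For illustration, a Bag-of-Words representation can be built with scikit-learn's CountVectorizer; this is a minimal sketch with a made-up two-document corpus.

from sklearn.feature_extraction.text import CountVectorizer

docs = ["the cat sat on the mat", "the dog chased the cat"]   # made-up corpus
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)            # sparse document-term count matrix
print(vectorizer.get_feature_names_out())     # learned vocabulary
print(X.toarray())                            # raw term counts per document
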
Q94 What is the main limitation of TF-IDF vectorization?
Ignores word frequency
Ignores word context
Overfits data
Requires embeddings
Q95 Which algorithm is commonly used for binary text classification tasks?
K-Means
Naive Bayes
Apriori
Decision Trees
Q96 Which vectorization method captures both word order and context in text classification?
TF-IDF
Bag-of-Words
Word Embeddings
Transformer-based embeddings
Q97 Which library in Python provides tools for creating TF-IDF vectors?
nltk
scikit-learn
spaCy
TextBlob
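A minimal sketch of creating TF-IDF vectors with scikit-learn's TfidfVectorizer (the corpus is illustrative):

from sklearn.feature_extraction.text import TfidfVectorizer

corpus = ["natural language processing is fun",
          "text classification relies on good features"]   # made-up corpus
tfidf = TfidfVectorizer()
X = tfidf.fit_transform(corpus)   # sparse TF-IDF matrix, one row per document
print(X.shape)
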
Q98 How can you preprocess text for vectorization using nltk?
Tokenize and lowercase
Skip tokenization
Generate embeddings
Apply TF-IDF directly
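A short nltk preprocessing sketch, tokenizing and lowercasing before vectorization (assumes the 'punkt' tokenizer data is available; newer nltk releases may also require 'punkt_tab'):

import nltk
from nltk.tokenize import word_tokenize

nltk.download("punkt")                                      # one-time tokenizer data download
text = "NLP Preprocessing Makes Vectorization Easier."      # made-up sentence
tokens = [token.lower() for token in word_tokenize(text)]   # tokenize, then lowercase
print(tokens)
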
Q99 How do you implement a classification pipeline using scikit-learn?
Build and train models separately
Use Pipeline to combine steps
Skip preprocessing
Train without vectorization
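A minimal scikit-learn Pipeline sketch that chains the vectorization and classification steps (training texts and labels are made up):

from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = ["great movie", "terrible plot", "loved it", "waste of time"]   # made-up data
labels = [1, 0, 1, 0]

clf = Pipeline([
    ("tfidf", TfidfVectorizer()),   # vectorization step
    ("nb", MultinomialNB()),        # classification step
])
clf.fit(texts, labels)
print(clf.predict(["really great film"]))
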
Q100 A classifier performs poorly due to irrelevant features in vectorization. What should you do?
Increase vocabulary size
Apply stopword removal
Reduce dataset size
Skip preprocessing
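One common fix is to drop stopwords at vectorization time; a sketch using scikit-learn's built-in English stopword list:

from sklearn.feature_extraction.text import TfidfVectorizer

tfidf = TfidfVectorizer(stop_words="english")   # drop common English stopwords
X = tfidf.fit_transform(["this is a very simple example of stopword removal"])
print(tfidf.get_feature_names_out())            # only content-bearing terms remain
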
Q101 A model overfits during text classification. What can you adjust?
Reduce embedding size
Apply regularization
Skip vectorization
Use smaller datasets
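Regularization can be applied in the classifier itself; for instance, in scikit-learn's LogisticRegression a smaller C means stronger L2 regularization (the tiny corpus and values here are illustrative, not a recommendation):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["good product", "bad service", "excellent quality", "awful experience"]   # made-up data
labels = [1, 0, 1, 0]

X = TfidfVectorizer().fit_transform(texts)
model = LogisticRegression(C=0.1, penalty="l2", max_iter=1000)   # smaller C = stronger regularization
model.fit(X, labels)
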
Q102 A classifier struggles to differentiate between similar classes. What approach can improve this?
Use embeddings with context
Reduce feature set
Use simpler models
Increase batch size
Q103 What is the primary purpose of sequence-to-sequence models in NLP?
Classification
Sequence prediction
Tokenization
Entity recognition
Q104 Which component of a sequence-to-sequence model generates the output sequence?
Decoder
Encoder
Embedding
Attention
Q105 How does the attention mechanism improve sequence-to-sequence models?
Reduces training time
Focuses on relevant parts of the input
Ignores long inputs
Speeds up decoding
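As a rough illustration of the idea, scaled dot-product attention weights every input position by its relevance to the query; a NumPy sketch with made-up dimensions:

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # relevance of each input position to the query
    weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)   # softmax over positions
    return weights @ V                                 # weighted sum of the value vectors

Q = np.random.rand(1, 4)   # one query vector
K = np.random.rand(5, 4)   # five input positions
V = np.random.rand(5, 8)
print(scaled_dot_product_attention(Q, K, V).shape)     # (1, 8)
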
Q106 Which type of sequence-to-sequence model architecture is most effective for long sequences?
RNN-based
CNN-based
Transformer-based
Naive Bayes
Q107 What is the role of positional encoding in transformer-based sequence-to-sequence models?
Adds semantic meaning
Represents token relationships
Preserves word order
Tokenizes text
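A NumPy sketch of the sinusoidal positional encoding from the original Transformer paper, which injects word-order information that self-attention alone does not preserve (dimensions are illustrative):

import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    positions = np.arange(seq_len)[:, None]            # (seq_len, 1)
    dims = np.arange(d_model)[None, :]                 # (1, d_model)
    angles = positions / np.power(10000, (2 * (dims // 2)) / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])              # even dimensions use sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])              # odd dimensions use cosine
    return pe

print(sinusoidal_positional_encoding(seq_len=10, d_model=16).shape)   # (10, 16)
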
Q108 Which library provides pre-trained sequence-to-sequence models like BART and T5?
nltk
Hugging Face
TextBlob
spaCy
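For example, a pre-trained T5 checkpoint can be loaded through the Hugging Face transformers library; a minimal sketch using the public 't5-small' model:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

inputs = tokenizer("translate English to German: The house is small.", return_tensors="pt")
outputs = model.generate(**inputs, max_length=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
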
Q109 How do you fine-tune a pre-trained sequence-to-sequence model using transformers?
Load a pre-trained model
Train with a custom tokenizer
Use a labeled sequence dataset
All of the above
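A heavily abbreviated fine-tuning sketch with the transformers Seq2SeqTrainer; 'train_dataset' is assumed to be a tokenized, labeled sequence dataset you supply, and argument names can vary slightly across library versions:

from transformers import (AutoTokenizer, AutoModelForSeq2SeqLM,
                          Seq2SeqTrainingArguments, Seq2SeqTrainer)

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

args = Seq2SeqTrainingArguments(output_dir="t5-finetuned", num_train_epochs=3)
trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,   # assumed: tokenized input/target pairs prepared beforehand
    tokenizer=tokenizer,
)
trainer.train()
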
Q110 Which parameter in transformers controls the length of output sequences during generation?
max_length
min_length
output_size
length_penalty
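A sketch of bounding generation length with max_length (and optionally min_length); the model name and dummy input are illustrative:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

inputs = tokenizer("summarize: " + "A long article body goes here. " * 20, return_tensors="pt")
summary_ids = model.generate(**inputs, max_length=60, min_length=10)   # bound the output length
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
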
Q111 A sequence-to-sequence model generates incomplete outputs. What could improve this?
Increase max_length
Use smaller datasets
Reduce attention heads
Skip fine-tuning
Q112 A sequence-to-sequence model produces irrelevant output for longer inputs. What should you adjust?
Use attention mechanisms
Increase training epochs
Reduce vocabulary size
Ignore longer sequences
Q113 What is the main advantage of transformer models over RNNs?
Parallel processing
Handles fixed-length inputs
Simpler architecture
Lower computational cost
Q114 What is the role of self-attention in transformer models?
Preserves word order
Focuses on relevant words
Simplifies embeddings
Improves tokenization
Q115 Which component of a transformer model ensures information flow across layers?
Feed-forward layers
Normalization layers
Positional encoding
Residual connections
Q116 How does BERT differ from traditional transformer models?
It uses bi-directional context
It processes data sequentially
It ignores masked tokens
It requires no pre-training
Q117 Which library in Python provides pre-trained BERT models?
nltk
Hugging Face
TextBlob
spaCy
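A minimal sketch of loading a pre-trained BERT checkpoint and its tokenizer from Hugging Face:

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("BERT reads context in both directions.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)   # (batch, sequence_length, hidden_size)
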
Q118 How do you fine-tune BERT for a text classification task?
Train from scratch
Use AutoModelForSequenceClassification
Apply Bag-of-Words
Use RNNs
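A heavily abbreviated fine-tuning sketch with AutoModelForSequenceClassification; 'train_dataset' is assumed to be a tokenized, labeled classification dataset you supply:

from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

args = TrainingArguments(output_dir="bert-finetuned", num_train_epochs=3)
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)   # assumed dataset
trainer.train()
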
Q119 A BERT model performs poorly on domain-specific tasks. What should you do?
Use a smaller model
Train with more epochs
Fine-tune on domain-specific data
Reduce vocabulary size
Q120 A transformer model fails to generate coherent long texts. What should you adjust?
Add positional encoding
Reduce context length
Train on short sentences
Use static embeddings