
Q1 What is the primary objective of Natural Language Processing (NLP)?
To design database systems
To enable machines to understand, interpret, and generate human language
To enhance programming speed
To automate computations
Q2 NLP is a subfield of which broader field?
Computer Science
Artificial Intelligence
Data Science
Machine Learning
Q3 Which of these is not an application of NLP?
Sentiment Analysis
Text Summarization
Speech Recognition
Image Classification
Q4 What type of input does NLP primarily handle?
Images
Numbers
Text and Speech
Video
Q5 Which phase of NLP focuses on identifying the grammatical structure of sentences?
Syntax Analysis
Tokenization
Semantic Analysis
POS Tagging
Q6 What is the difference between syntax and semantics in NLP?
Syntax focuses on meaning while semantics focuses on structure
Syntax focuses on structure while semantics focuses on meaning
Both focus on structure
Both focus on meaning
Q7 What is the Bag-of-Words model used for in NLP?
Capturing context between words
Counting word frequencies without context
Detecting sentiment
Generating word embeddings
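A minimal sketch of the Bag-of-Words idea behind Q7, using scikit-learn's CountVectorizer on an invented toy corpus: each document is reduced to raw word counts, and word order and context are discarded. (get_feature_names_out assumes a recent scikit-learn release.)

```python
# Bag-of-Words: each document becomes a vector of raw word counts;
# word order and surrounding context are discarded.
from sklearn.feature_extraction.text import CountVectorizer

corpus = ["the cat sat on the mat", "the dog sat on the log"]  # toy corpus
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(corpus)        # sparse document-term matrix
print(vectorizer.get_feature_names_out())   # learned vocabulary
print(X.toarray())                          # per-document word counts
```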
Q8 Which Python library provides tools for NLP tasks like tokenization and stemming?
matplotlib
scikit-learn
nltk
tensorflow
Q9 A model outputs unrelated words when translating text. What is the likely issue in the NLP pipeline?
Incorrect tokenization
Too much training data
Improper evaluation metrics
High learning rate
Q10 What is tokenization in NLP?
Breaking text into sentences
Breaking text into words
Removing stopwords
Converting text to lowercase
Q11 What is the purpose of removing stopwords in NLP?
To improve context
To reduce data noise
To enhance syntax analysis
To tokenize text
Q12 Which of the following is the lemma of the word "running"?
run
running
runs
ran
Q13 Why is stemming less accurate than lemmatization?
It uses dictionary lookups
It considers context
It uses heuristic rules
It generates tokens
Q14 Which method in Python’s nltk library is used to tokenize a sentence into words?
word_tokenize
sent_tokenize
split
tokenize
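A short sketch for Q14 with nltk's two tokenizers; the sample sentence is invented, and the punkt download is only needed once per environment (newer NLTK releases may also ask for the punkt_tab resource).

```python
# nltk tokenization: sent_tokenize splits text into sentences,
# word_tokenize splits it into word and punctuation tokens.
import nltk
nltk.download("punkt")  # tokenizer models; newer NLTK versions may also need "punkt_tab"
from nltk.tokenize import sent_tokenize, word_tokenize

text = "NLP is fun. Tokenization splits text into units."
print(sent_tokenize(text))  # ['NLP is fun.', 'Tokenization splits text into units.']
print(word_tokenize(text))  # ['NLP', 'is', 'fun', '.', 'Tokenization', ...]
```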
Q15 How does lemmatization differ from stemming?
It’s faster
It’s more accurate
It generates shorter tokens
It removes stopwords
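A comparison sketch for Q13 and Q15, assuming nltk's PorterStemmer and WordNetLemmatizer: stemming applies heuristic suffix-stripping rules, while lemmatization looks words up in a dictionary with a part-of-speech hint, which is slower but more accurate.

```python
# Stemming uses heuristic suffix-stripping; lemmatization uses WordNet plus a POS hint.
import nltk
nltk.download("wordnet")  # WordNet data; some versions also need "omw-1.4"
from nltk.stem import PorterStemmer, WordNetLemmatizer

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

print(stemmer.stem("running"))                   # 'run'
print(stemmer.stem("studies"))                   # 'studi'  (heuristic, not a real word)
print(lemmatizer.lemmatize("running", pos="v"))  # 'run'    (valid dictionary form)
print(lemmatizer.lemmatize("studies", pos="n"))  # 'study'
```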
Q16 What are stopwords?
Rarely used words
Frequent words that carry little meaning
Nouns
Verbs
Q17 Which Python library provides a predefined list of stopwords?
nltk
spacy
pandas
numpy
Q18 Which method in nltk can be used to check if a word is a stopword?
stopwords.is_stop
stopwords.words
stopwords.check
stopwords.tokenize
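A small sketch covering Q17 and Q18: nltk's stopwords.words returns a plain list of English stopwords, and a membership check against it (the token list here is invented) is the usual way to test or filter stopwords.

```python
# Checking and removing stopwords with nltk's predefined English stopword list.
import nltk
nltk.download("stopwords")  # stopword corpus, needed once
from nltk.corpus import stopwords

stop_words = set(stopwords.words("english"))  # stopwords.words returns a list
tokens = ["this", "is", "a", "simple", "example"]

print("is" in stop_words)                          # True -> membership check
print([t for t in tokens if t not in stop_words])  # ['simple', 'example']
```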
Q19 A tokenization function splits words incorrectly due to punctuation. What should be modified?
The language model
The tokenization algorithm
The stopword list
The stemming logic
Q20 What is the main purpose of text preprocessing in NLP?
To reduce data noise
To create embeddings
To train models directly
To generate stopwords
Q21 Which preprocessing step involves converting all characters to lowercase?
Tokenization
Normalization
Stemming
POS tagging
Q22 Why is removing punctuation important in text preprocessing?
It improves tokenization
It simplifies embeddings
It enhances syntax analysis
It reduces data size
Q23 Which step replaces contractions like “don’t” with “do not”?
Tokenization
Expanding contractions
Stemming
Lemmatization
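A minimal dict-based sketch of the contraction expansion step in Q23. The mapping and the helper expand_contractions below are illustrative assumptions, not a complete solution; dedicated packages exist for fuller coverage.

```python
import re

# Illustrative (incomplete) mapping of contractions to expanded forms.
CONTRACTIONS = {
    "don't": "do not",
    "can't": "cannot",
    "it's": "it is",
    "i'm": "i am",
}

def expand_contractions(text: str) -> str:
    # Replace each known contraction, case-insensitively, with its expansion.
    pattern = re.compile("|".join(re.escape(c) for c in CONTRACTIONS), re.IGNORECASE)
    return pattern.sub(lambda m: CONTRACTIONS[m.group(0).lower()], text)

print(expand_contractions("Don't stop, it's fine"))  # "do not stop, it is fine"
```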
Q24 Which method can handle spelling corrections during preprocessing?
Bag-of-Words
Spell checkers
Stemming
Tokenization
Q25 What is the role of stemming in text preprocessing?
To retain context
To remove suffixes from words
To expand contractions
To identify stopwords
Q26 Which Python library provides the TextBlob class for text preprocessing?
nltk
TextBlob
spaCy
pandas
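A brief sketch tying Q24 and Q26 together, assuming the textblob package is installed: its TextBlob class includes a simple spell-correction helper, whose output is probabilistic and may vary.

```python
# TextBlob (from the textblob package) offers simple text-processing helpers;
# .correct() attempts spelling correction using an internal word-frequency model.
from textblob import TextBlob

blob = TextBlob("I havv goood speling")
print(blob.correct())  # expected to print something close to "I have good spelling"
```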
Q27 How do you remove punctuation from text using Python’s string module?
text.split()
text.translate()
text.strip()
text.replace()
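A short sketch for Q27, using an invented sample string: string.punctuation supplies the characters to drop, and str.translate removes them via a translation table built with str.maketrans.

```python
# Remove punctuation using string.punctuation and str.translate.
import string

text = "Hello, world! NLP's preprocessing: step #1."
table = str.maketrans("", "", string.punctuation)  # map every punctuation char to None
print(text.translate(table))  # "Hello world NLPs preprocessing step 1"
```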
Q28 How can you remove numbers from text using re in Python?
re.sub(r'\d+', '', text)
re.findall(r'\d+', text)
re.split(r'\d+', text)
re.match(r'\d+', text)
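A small sketch for Q28 on an invented sample string: re.sub with the pattern \d+ replaces each run of digits with an empty string.

```python
# Remove digits from text with a regular expression.
import re

text = "Order 66 shipped 3 items in 2024"
print(re.sub(r"\d+", "", text))     # "Order  shipped  items in "
print(re.sub(r"\s*\d+", "", text))  # also drops the space left behind
```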
Q29 A preprocessing pipeline is failing because stopwords are not being removed. What should you check?
Stopword list
Stemming logic
POS tags
Normalization steps
Q30 A dataset contains text with special characters that disrupt tokenization. What should you do?
Expand contractions
Remove special characters
Use a new tokenization method
Apply stemming