Part 3: Natural Language Processing — NLP, NLU, and NLG

Part of the AI Fundamentals 101 Series

Why NLP Is the Backbone of Modern AI

Every time you interact with a modern AI system — asking Claude a question, using Google Translate, dictating a message to Siri, or getting autocomplete suggestions in your IDE — you're using Natural Language Processing.

NLP is the bridge between human language and machine computation. Without it, AI would be limited to numbers, images, and structured data. NLP is what makes AI conversational.

I first appreciated NLP when I built a log analysis tool for my home lab. Kubernetes error logs are technically "text," but they're messy: stack traces mixed with timestamps, pod names, cryptic error codes. Teaching a system to extract meaningful information from that unstructured text is an NLP problem. And understanding the distinction between NLP, NLU, and NLG is what helped me architect that system correctly.


What is NLP?

Natural Language Processing (NLP) is the field of AI that deals with the interaction between computers and human language. It covers everything from basic text manipulation to understanding meaning to generating coherent text.

NLP is actually an umbrella term that contains two major subfields:

Natural Language Processing (NLP)
├── NLU — Natural Language Understanding
│   (Input: text → Output: meaning/intent/entities)
└── NLG — Natural Language Generation
    (Input: meaning/data → Output: text)

NLU vs NLG — The Two Halves

  • NLU (Natural Language Understanding): Taking text and extracting meaning. "What does this text say?"

  • NLG (Natural Language Generation): Taking structured data or intent and producing text. "How do I express this?"

Every conversational AI system uses both:

  1. NLU to understand what you said

  2. Processing/reasoning in the middle

  3. NLG to formulate a response
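The three-stage loop can be sketched in a few lines. Everything here is invented for illustration (the intent names, the status lookup, the regexes) — the point is only the shape: NLU produces structure, the middle step reasons over it, NLG turns structure back into a sentence.

```python
import re

def understand(text):
    """NLU: map raw text to an intent plus extracted entities (toy version)."""
    intent = "get_status" if re.search(r"status|doing|health", text, re.I) else "unknown"
    target = re.search(r"\b(cluster|node|pod)\b", text, re.I)
    return {"intent": intent, "target": target.group(1).lower() if target else None}

def reason(meaning):
    """Middle step: look up an answer for the understood intent."""
    status_db = {"cluster": "healthy", "node": "degraded"}
    return status_db.get(meaning["target"], "unknown")

def generate(meaning, status):
    """NLG: turn structured data back into a sentence."""
    return f"The {meaning['target']} is currently {status}."

meaning = understand("How's the cluster doing?")
print(generate(meaning, reason(meaning)))  # The cluster is currently healthy.
```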


The NLP Pipeline: From Raw Text to Understanding

Processing natural language involves multiple stages. Here's the pipeline that classical NLP systems typically follow, starting from raw text:

Step 1: Tokenization — Breaking Text into Pieces

Tokenization splits text into smaller units (tokens) that the model can process.

Classical pipelines split on words and punctuation. LLMs use subword tokenization, which is different: rare or unseen words are broken into smaller pieces drawn from a fixed vocabulary, so any string can be represented without an "unknown word" token.
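A rough sketch of both styles. The greedy longest-match loop below is a heavy simplification of real BPE/WordPiece tokenizers, and the toy vocabulary is invented:

```python
import re

def word_tokenize(text):
    # Classic word-level tokenization: words and punctuation as separate tokens.
    return re.findall(r"\w+|[^\w\s]", text)

def subword_tokenize(word, vocab):
    # Greedy longest-match subword split (the idea behind BPE/WordPiece,
    # heavily simplified): unknown words decompose into known pieces.
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            pieces.append(word[i])  # fall back to a single character
            i += 1
    return pieces

print(word_tokenize("Pods keep crashing!"))              # ['Pods', 'keep', 'crashing', '!']
print(subword_tokenize("crashing", {"crash", "ing"}))    # ['crash', 'ing']
```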

Step 2: Text Preprocessing — Cleaning the Data
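Typical cleaning steps include lowercasing, stripping URLs, and normalizing whitespace. A minimal sketch — the exact steps always depend on the task, and this particular combination is illustrative:

```python
import re

def preprocess(text):
    """Toy cleaning pass: lowercase, strip URLs, drop punctuation,
    collapse whitespace. Real pipelines tune these steps per task."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # remove URLs
    text = re.sub(r"[^a-z0-9\s]", " ", text)    # strip punctuation/symbols
    text = re.sub(r"\s+", " ", text).strip()    # collapse whitespace
    return text

print(preprocess("ERROR!!  Visit https://status.example.com NOW."))
# → "error visit now"
```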

Step 3: Stopword Removal and Stemming

Note: Stemming is a crude approach — "repeatedly" becomes "repeatedli" which isn't a real word. Modern approaches use lemmatization (which produces real words) or skip this step entirely and let the model handle it.
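A toy illustration of both operations. The stopword set and suffix list below are invented and deliberately crude (this is not the real Porter algorithm), which is exactly why stemming can produce odd results:

```python
STOPWORDS = {"the", "is", "on", "a", "an", "and", "of", "to", "in"}

def remove_stopwords(tokens):
    return [t for t in tokens if t.lower() not in STOPWORDS]

def crude_stem(word):
    # A deliberately crude suffix-stripper (not the real Porter stemmer):
    # strip the first matching suffix if enough of the word remains.
    for suffix in ("ingly", "edly", "ing", "ed", "ly", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

tokens = ["The", "pods", "are", "restarting", "repeatedly"]
print(remove_stopwords(tokens))          # ['pods', 'are', 'restarting', 'repeatedly']
print([crude_stem(t) for t in tokens])   # e.g. 'restarting' → 'restart', 'pods' → 'pod'
```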

Step 4: Feature Extraction — Turning Text into Numbers

Models need numbers, not words. There are several classical ways to convert text to numerical features: raw word counts (bag-of-words), TF-IDF weighting, and n-gram features.
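TF-IDF can be computed from scratch in a few lines. This sketch uses the standard tf × log(N/df) weighting on whitespace-split tokens; the sample documents are invented:

```python
import math
from collections import Counter

def tfidf(docs):
    """Minimal TF-IDF: weight each word by how frequent it is in a
    document (tf) and how rare it is across the corpus (idf)."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    df = Counter(word for doc in tokenized for word in set(doc))
    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        scores.append({w: (tf[w] / len(doc)) * math.log(n / df[w]) for w in tf})
    return scores

docs = ["pod crashed with oom", "pod restarted", "network timeout on node"]
scores = tfidf(docs)
# "pod" appears in 2 of 3 docs → low weight; "oom" in only 1 → higher weight
print(scores[0]["oom"] > scores[0]["pod"])  # True
```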

Modern approach: embeddings. Instead of counting words, we use neural networks to create dense vector representations where similar meanings are close together. We'll cover this in detail in Part 4.


Named Entity Recognition (NER)

NER identifies and classifies entities in text — names, organizations, locations, dates, etc.
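A minimal rule-based version can be built with regular expressions. The patterns and entity labels below are invented for illustration:

```python
import re

# Minimal rule-based NER: one regex per entity type (illustrative patterns).
PATTERNS = {
    "REGION":    r"\b(?:us|eu|ap)-(?:east|west|central)-?\d?\b",
    "POD":       r"\b[a-z0-9-]+-[a-f0-9]{5}\b",
    "TIMESTAMP": r"\b\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\b",
}

def extract_entities(text):
    entities = []
    for label, pattern in PATTERNS.items():
        for match in re.finditer(pattern, text, re.IGNORECASE):
            entities.append((label, match.group()))
    return entities

text = "2024-03-01T10:15:00 pod api-server-7f9d2 evicted in us-east-1"
print(extract_entities(text))
```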

Simple rule-based NER like this can be written with regular expressions, but production systems use trained models (spaCy, transformers) that can recognize entities they've never seen before.


Sentiment Analysis

Sentiment analysis determines whether text expresses positive, negative, or neutral sentiment.
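The simplest approach is lexicon-based: count positive and negative words and compare. The word lists here are tiny and illustrative — real lexicons such as VADER contain thousands of scored entries:

```python
# Toy lexicon-based sentiment scorer (word lists are illustrative).
POSITIVE = {"healthy", "stable", "recovered", "fast", "good"}
NEGATIVE = {"crashed", "failed", "error", "slow", "timeout", "down"}

def sentiment(text):
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(sentiment("system crashed with an error"))     # negative
print(sentiment("cluster recovered and is stable"))  # positive
```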


Text Classification

Beyond sentiment, NLP can classify text into any categories you define — bug reports versus feature requests, log lines by subsystem, support tickets by urgency.
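As a sketch of the classical approach, here is a bare-bones multinomial Naive Bayes classifier with add-one smoothing. The training data is invented, and in practice you would reach for scikit-learn rather than hand-rolling this:

```python
import math
from collections import Counter, defaultdict

class TinyNaiveBayes:
    """Bare-bones multinomial Naive Bayes with add-one smoothing —
    a sketch of the classic technique, not a scikit-learn replacement."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        def log_prob(label):
            total = sum(self.word_counts[label].values())
            score = math.log(self.label_counts[label] / sum(self.label_counts.values()))
            for w in text.lower().split():
                # Add-one smoothing so unseen words don't zero out the class.
                score += math.log((self.word_counts[label][w] + 1) / (total + len(self.vocab)))
            return score
        return max(self.label_counts, key=log_prob)

clf = TinyNaiveBayes().fit(
    ["pod oomkilled out of memory", "memory limit exceeded",
     "connection refused timeout", "network unreachable timeout"],
    ["memory", "memory", "network", "network"],
)
print(clf.predict("container oomkilled again"))  # memory
```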


From Rule-Based Chatbots to LLM-Powered Assistants

The evolution of chatbots perfectly illustrates the progression of NLP technology.

Generation 1: Rule-Based (Pattern Matching)

Limitations: You must anticipate every possible phrasing. "How's the cluster doing?" won't match "status.*cluster". This approach doesn't scale.
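A Generation-1 bot is just a list of (pattern, response) pairs. This sketch uses the status.*cluster pattern from above; the replies are invented:

```python
import re

# Generation-1 chatbot: hand-written pattern → canned response.
RULES = [
    (r"status.*cluster", "All nodes are reporting healthy."),
    (r"restart.*pod",    "Which pod would you like to restart?"),
]

def rule_based_bot(message):
    for pattern, reply in RULES:
        if re.search(pattern, message, re.IGNORECASE):
            return reply
    return "Sorry, I didn't understand that."

print(rule_based_bot("What's the status of the cluster?"))  # matches the first rule
print(rule_based_bot("How's the cluster doing?"))           # falls through to the fallback
```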

Generation 2: Intent Classification + Entity Extraction

Better: Handles paraphrasing because the classifier learned the patterns. But still limited to predefined intents.
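A sketch of the idea, with a simple word-overlap score standing in for a trained intent classifier (real systems like Rasa or Dialogflow train on labeled utterances; the example utterances and intent names here are invented):

```python
import re

# Generation-2 sketch: intent classification + entity extraction.
INTENT_EXAMPLES = {
    "get_status":  ["status of the cluster", "how is the cluster doing", "is everything healthy"],
    "restart_pod": ["restart the pod", "bounce the api pod", "kick the pod over"],
}

def classify_intent(message):
    # Word-overlap with example utterances stands in for a trained classifier.
    words = set(message.lower().split())
    scores = {
        intent: max(len(words & set(ex.split())) for ex in examples)
        for intent, examples in INTENT_EXAMPLES.items()
    }
    return max(scores, key=scores.get)

def extract_pod(message):
    # Entity extraction: pull the pod name that follows the word "pod".
    m = re.search(r"\bpod\s+(\S+)", message.lower())
    return m.group(1) if m else None

print(classify_intent("how is the cluster doing"))   # get_status
print(extract_pod("restart pod api-server"))         # api-server
```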

Generation 3: LLM-Powered (Modern)

The leap: LLMs handle arbitrary phrasing, multi-step reasoning, and natural conversation without any intent classification training data.
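The architecture collapses to a single model call. In this sketch, `call_llm` is a stub standing in for a real API client (e.g. the Anthropic or OpenAI SDK), so the structure is visible without a network dependency:

```python
def call_llm(messages):
    """Stand-in for a real LLM API call; returns a canned reply."""
    return "The cluster looks healthy: 12/12 pods running."

def llm_bot(history, user_message):
    system = "You are an SRE assistant with access to cluster metrics."
    messages = [{"role": "system", "content": system}]
    messages += history
    messages.append({"role": "user", "content": user_message})
    reply = call_llm(messages)  # one call replaces the whole NLU/NLG pipeline
    history.append({"role": "user", "content": user_message})
    history.append({"role": "assistant", "content": reply})
    return reply

history = []
print(llm_bot(history, "How's the cluster doing?"))
```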


NLP in Practice: A Log Analysis Pipeline

Here's a practical example combining multiple NLP techniques — similar to what I built for my home lab monitoring:
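Below is a condensed, self-contained sketch of such a pipeline. The masking regexes, similarity threshold, and sample logs are illustrative, and a real system would likely use scikit-learn or an embedding model for the vectorize/cluster steps:

```python
import math
import re
from collections import Counter

def clean(line):
    # Tokenize + clean: lowercase, mask IDs and numbers so similar errors align.
    line = line.lower()
    line = re.sub(r"\b[0-9a-f]{5,}\b", "<id>", line)   # hex-ish identifiers
    line = re.sub(r"\d+", "<num>", line)               # numbers and timestamps
    return re.findall(r"[a-z<>]+", line)

def vectorize(tokens):
    # Bag-of-words vector; placeholder tokens are dropped so they don't dominate.
    return Counter(t for t in tokens if not t.startswith("<"))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def cluster(lines, threshold=0.6):
    # Greedy clustering: attach each line to the first similar-enough cluster.
    clusters = []  # list of (centroid_counter, member_lines)
    for line in lines:
        vec = vectorize(clean(line))
        for centroid, members in clusters:
            if cosine(vec, centroid) >= threshold:
                members.append(line)
                centroid.update(vec)
                break
        else:
            clusters.append((vec, [line]))
    return clusters

def summarize(clusters):
    # Report each cluster as a count plus its three most characteristic words.
    for centroid, members in sorted(clusters, key=lambda c: -len(c[1])):
        top = " ".join(w for w, _ in centroid.most_common(3))
        print(f"{len(members):>3}x  {top}")

logs = [
    "2024-03-01 pod api-7f9d2 OOMKilled exit code 137",
    "2024-03-01 pod worker-1a2b3 OOMKilled exit code 137",
    "2024-03-02 connection timeout to 10.0.0.5:6443",
]
summarize(cluster(logs))
```

The two OOMKilled lines land in one cluster even though their pod names differ, because masking the variable parts lets the fixed parts of the message line up.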

This pipeline tokenizes → cleans → vectorizes → clusters → summarizes. Each step is a core NLP operation.


Key NLP Concepts Summary

Concept
What It Does
Example

Tokenization

Splits text into tokens

"Hello world" → ["Hello", "world"]

Stemming

Reduces words to root form

"running" → "run"

Lemmatization

Reduces to dictionary form

"better" → "good"

Stopword Removal

Removes common words

"The cat is on the mat" → "cat mat"

TF-IDF

Weights words by importance

Rare words get higher scores

NER

Finds named entities

"Deploy to US-East" → {region: "US-East"}

Sentiment Analysis

Detects positive/negative

"System crashed" → Negative

Text Classification

Assigns categories

"OOMKilled" → Category: Memory

Embeddings

Dense vector representations

"king" → [0.2, -0.1, 0.8, ...]


The Shift from Classical NLP to LLMs

Classical NLP (everything we've covered above) required:

  • Feature engineering (TF-IDF, n-grams, hand-crafted rules)

  • Task-specific models (one model for sentiment, another for NER, another for classification)

  • Labeled training data for each task

Modern LLM-based NLP requires:

  • A good prompt

  • That's it.

But classical NLP isn't dead. When you need:

  • Fast inference (microseconds, not seconds)

  • Low cost (no API calls)

  • Deterministic behavior (same input → same output)

  • To process millions of documents

...classical NLP with scikit-learn is still the right tool. I use TF-IDF + Naive Bayes for my initial log classification (fast, cheap, handles volume) and then route only the interesting cases to an LLM for deeper analysis.
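That routing pattern can be sketched as follows. The classifier, confidence values, and `ask_llm` stub are all illustrative stand-ins:

```python
def classify_fast(log_line):
    """Cheap keyword classifier returning (label, confidence) — stand-in
    for a real TF-IDF + Naive Bayes model."""
    if "oomkilled" in log_line.lower():
        return "memory", 0.95
    if "timeout" in log_line.lower():
        return "network", 0.90
    return "unknown", 0.30

def ask_llm(log_line):
    return f"LLM analysis of: {log_line}"  # stand-in for a real API call

def route(log_line, threshold=0.8):
    label, confidence = classify_fast(log_line)
    if confidence >= threshold:
        return ("fast", label)           # classical path: microseconds, free
    return ("llm", ask_llm(log_line))    # slow path: only the hard cases

print(route("pod OOMKilled"))                 # ('fast', 'memory')
print(route("weird unseen failure mode")[0])  # 'llm'
```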


What's Next

Now that you understand how machines process human language, we'll explore the technology that revolutionized NLP: Large Language Models and Generative AI — how transformers work, what makes models "large," and why generative AI is both powerful and limited.


Next: Part 4 — Large Language Models and Generative AI


