**Chapter Two: Inside the Brain — How AI Models Are Trained**
AI may seem like magic from the outside, but beneath the surface, it's all about *data*, *math*, and *computational power*. To understand why AI consumes so much energy — and water — we need to explore **how AI is trained** and what goes on behind the scenes.
---
### **2.1 What Does It Mean to “Train” an AI?**
Training an AI model means teaching it how to perform a specific task by feeding it **huge amounts of data** and letting it learn patterns.
#### Think of it like this:
- You want an AI to recognize cats in photos.
- You show it millions of labeled images: *this is a cat, this is not*.
- It adjusts its internal “weights” (parameters) to reduce its mistakes.
- Over time, it becomes better at recognizing cats — just like a student gets better with practice.
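To make that loop concrete, here is a deliberately tiny sketch in Python (NumPy only). The "photos" are reduced to two invented numeric features and the "model" is a single weight vector, so this is nothing like a production vision system, but the predict, measure, adjust cycle is the same one large models repeat billions of times:

```python
import numpy as np

# Toy "cat vs. not cat" classifier. Each "photo" is just two invented
# features (say, ear pointiness and whisker score); 1 = cat, 0 = not cat.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)  # synthetic labels

w = np.zeros(2)   # the "weights" the model adjusts as it learns
b = 0.0
lr = 0.1          # learning rate: how big each adjustment is

for step in range(1000):
    # 1. Predict with the current weights (sigmoid -> probability of "cat")
    p = 1 / (1 + np.exp(-(X @ w + b)))
    # 2. Measure the mistakes (gradient of the loss)
    grad_w = X.T @ (p - y) / len(y)
    grad_b = np.mean(p - y)
    # 3. Nudge the weights in the direction that reduces those mistakes
    w -= lr * grad_w
    b -= lr * grad_b

print(f"training accuracy: {np.mean((p > 0.5) == y):.1%}")
```

Real models run exactly this kind of loop, only with billions of weights instead of two and billions of examples instead of two hundred, which is where the compute bill comes from.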
---
### **2.2 Neural Networks: The Core of Modern AI**
At the heart of most advanced AI is a system called a **neural network**, inspired by the human brain.
#### Key Concepts:
- **Neurons (Nodes)**: These are like digital brain cells.
- **Layers**: Neural networks have input layers, hidden layers, and output layers.
- **Weights**: These are values that get adjusted during training to minimize errors.
- **Backpropagation**: A method the AI uses to learn from its mistakes and adjust the weights.
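The sketch below wires those four ideas together: a minimal NumPy network with one hidden layer that learns the XOR function, a classic toy problem a single neuron cannot solve. Every size and number here is invented for illustration; real networks have far more layers and billions of weights, but the forward pass and backpropagation steps are structurally the same:

```python
import numpy as np

rng = np.random.default_rng(1)

# XOR: the output is 1 when exactly one of the two inputs is 1.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Weights: input layer (2 values) -> hidden layer (4 neurons) -> output (1 neuron)
W1 = rng.normal(size=(2, 4)); b1 = np.zeros((1, 4))
W2 = rng.normal(size=(4, 1)); b2 = np.zeros((1, 1))

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

lr = 0.5
for step in range(5000):
    # Forward pass: data flows through the layers
    h = sigmoid(X @ W1 + b1)      # hidden-layer activations
    out = sigmoid(h @ W2 + b2)    # network's prediction

    # Backpropagation: push the error backwards, layer by layer
    d_out = out - y                         # error at the output (cross-entropy + sigmoid)
    d_h = (d_out @ W2.T) * h * (1 - h)      # share of the error assigned to each hidden neuron

    # Adjust every weight a little in the direction that reduces the error
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0, keepdims=True)

print(np.round(out, 2))   # should approach [[0], [1], [1], [0]]
```

Backpropagation is simply the chain rule from calculus applied layer by layer: the error measured at the output is translated into a small correction for every weight in the network.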
---
### **2.3 Why It’s So Resource-Intensive**
#### Massive Models = Massive Resources
- **GPT-3** has **175 billion parameters** — all of which need to be tuned during training.
- Training takes **weeks or months**, using hundreds or thousands of **power-hungry GPUs** (Graphics Processing Units).
- The **datasets** can reach **terabytes or even petabytes** in raw form, all of which must be stored, loaded, and processed.
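A rough back-of-envelope calculation shows why. The sketch below uses the common rule of thumb of roughly 6 floating-point operations per parameter per training token; the cluster size, GPU throughput, power draw, and overhead values are assumptions chosen only to show the order of magnitude, not reported figures for any real training run:

```python
# Back-of-envelope training cost. Every number here is an assumption.
params = 175e9        # a GPT-3-class model
tokens = 300e9        # training tokens (GPT-3's paper reports ~300 billion)
flops = 6 * params * tokens          # ~6 FLOPs per parameter per token (rule of thumb)

gpu_flops = 100e12    # assume ~100 TFLOP/s sustained per GPU (optimistic)
gpu_count = 1000      # assumed cluster size
seconds = flops / (gpu_flops * gpu_count)

gpu_power_kw = 0.5    # ~500 W per GPU under load (assumption)
pue = 1.2             # data-center overhead multiplier (assumption)
energy_mwh = gpu_count * gpu_power_kw * (seconds / 3600) * pue / 1000

print(f"total compute:   {flops:.2e} FLOPs")
print(f"wall-clock time: ~{seconds / 86400:.0f} days on {gpu_count:,} GPUs")
print(f"energy:          ~{energy_mwh:.0f} MWh")
```

Even with these optimistic assumptions, the answer lands at weeks of wall-clock time and hundreds of megawatt-hours, roughly the annual electricity use of dozens of homes.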
#### Energy → Heat → Cooling → Water
- These operations generate heat — a lot of it.
- To prevent servers from overheating, **cooling systems** kick in, and in water-cooled data centers, **tons of water** are evaporated or cycled.
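How much water that implies depends heavily on the cooling design, the local climate, and even the season. Operators summarize it with a metric called Water Usage Effectiveness (WUE): liters of water consumed per kilowatt-hour of IT energy. A minimal sketch, with both inputs assumed rather than measured:

```python
# Rough water estimate from energy via WUE. Both inputs are assumptions;
# published WUE values range from well under 1 to roughly 2 L/kWh by site.
energy_mwh = 525          # training energy from the estimate above (assumed)
wue_l_per_kwh = 1.0       # assumed site WUE

liters = energy_mwh * 1000 * wue_l_per_kwh
print(f"~{liters:,.0f} liters (~{liters / 3.785:,.0f} gallons) of water for cooling")
```

The exact figure matters less than the shape of the chain: every kilowatt-hour spent on training turns into heat, and in many facilities a meaningful share of that heat is removed by evaporating water.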
---
### **2.4 Training vs. Inference**
There are two phases in an AI’s life:
| Phase | What Happens | Resource Use |
|--------------|--------------------------------------------------------|----------------------|
| **Training** | Model learns from huge datasets | Extremely high |
| **Inference**| You ask a question; the AI gives you an answer          | Much lower per query, but it adds up |
Inference (what happens every time you use ChatGPT) still consumes energy and water, but **far less per query** than training the model from scratch. With millions of users, though, those queries add up.
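The gap is easiest to see in compute terms. Using common rules of thumb (about 6 × parameters FLOPs per training token, and about 2 × parameters FLOPs per generated token), a rough sketch of the comparison looks like this; the model size and answer length are assumptions:

```python
# Training compute vs. the compute for answering a single question.
# All numbers are assumptions or rules of thumb, not measured figures.
params = 175e9               # a GPT-3-class model
train_tokens = 300e9         # tokens seen during training
answer_tokens = 500          # a few paragraphs of reply

training_flops = 6 * params * train_tokens
one_answer_flops = 2 * params * answer_tokens

print(f"training:   {training_flops:.1e} FLOPs")
print(f"one answer: {one_answer_flops:.1e} FLOPs")
print(f"ratio:      ~{training_flops / one_answer_flops:.1e} answers' worth of compute")
```

Per question, inference is billions of times cheaper than training. The catch is scale: a popular model answers many millions of questions a day, so inference still ends up responsible for a large share of total energy and water use over a model's lifetime.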
---
### **2.5 Example: Training GPT-3**
- **Hardware**: ~10,000 GPUs
- **Time**: Weeks to months
- **Data**: Hundreds of billions of words
- **Water**: an estimated **~85,000 gallons** or more (for cooling alone; published estimates vary)
- **Cost**: Millions of dollars
And that’s just for one model!
---
## **2.6 Final Thought: Intelligence Isn’t Free**
Training an AI model is like building a skyscraper — it takes materials, planning, and massive effort. The final result may seem simple or effortless, but behind every “smart” answer lies a complex system that consumed energy, water, time, and human labor.
---
## **Next Chapter Preview: Chapter Three — The Data Center Deep Dive**
In the next chapter, we’ll open the doors of the **data centers** that make AI possible:
- What do they look like?
- How are they cooled?
- Why are some greener than others?
We’ll also explore **alternatives and innovations**: could AI be powered by clean energy and cooled sustainably?
---