**Chapter Two: Inside the Brain — How AI Models Are Trained**
AI may seem like magic from the outside, but beneath the surface, it's all about *data*, *math*, and *computational power*. To understand why AI consumes so much energy — and water — we need to explore **how AI is trained** and what goes on behind the scenes.
---
### **2.1 What Does It Mean to “Train” an AI?**
Training an AI model means teaching it how to perform a specific task by feeding it **huge amounts of data** and letting it learn patterns.
#### Think of it like this:
- You want an AI to recognize cats in photos.
- You show it millions of labeled images: *this is a cat, this is not*.
- It adjusts its internal “weights” (parameters) to reduce its mistakes.
- Over time, it becomes better at recognizing cats — just like a student gets better with practice.
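To make that loop concrete, here is a deliberately tiny sketch in Python (NumPy only). The "photos" are reduced to two invented numeric features and the "model" is a single weight vector, so this is nothing like a production vision system, but the predict, measure, adjust cycle is the same one large models repeat billions of times:

```python
import numpy as np

# Toy "cat vs. not cat" classifier. Each "photo" is just two invented
# features (say, ear pointiness and whisker score); 1 = cat, 0 = not cat.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)  # synthetic labels

w = np.zeros(2)   # the "weights" the model adjusts as it learns
b = 0.0
lr = 0.1          # learning rate: how big each adjustment is

for step in range(1000):
    # 1. Predict with the current weights (sigmoid -> probability of "cat")
    p = 1 / (1 + np.exp(-(X @ w + b)))
    # 2. Measure the mistakes (gradient of the loss)
    grad_w = X.T @ (p - y) / len(y)
    grad_b = np.mean(p - y)
    # 3. Nudge the weights in the direction that reduces those mistakes
    w -= lr * grad_w
    b -= lr * grad_b

print(f"training accuracy: {np.mean((p > 0.5) == y):.1%}")
```

Real models run exactly this kind of loop, only with billions of weights instead of two and billions of examples instead of two hundred, which is where the compute bill comes from.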
---
### **2.2 Neural Networks: The Core of Modern AI**
At the heart of most advanced AI is a system called a **neural network**, inspired by the human brain.
#### Key Concepts:
- **Neurons (Nodes)**: These are like digital brain cells.
- **Layers**: Neural networks have input layers, hidden layers, and output layers.
- **Weights**: These are values that get adjusted during training to minimize errors.
- **Backpropagation**: A method the AI uses to learn from its mistakes and adjust the weights.
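The sketch below wires those four ideas together: a minimal NumPy network with one hidden layer that learns the XOR function, a classic toy problem a single neuron cannot solve. Every size and number here is invented for illustration; real networks have far more layers and billions of weights, but the forward pass and backpropagation steps are structurally the same:

```python
import numpy as np

rng = np.random.default_rng(1)

# XOR: the output is 1 when exactly one of the two inputs is 1.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Weights: input layer (2 values) -> hidden layer (4 neurons) -> output (1 neuron)
W1 = rng.normal(size=(2, 4)); b1 = np.zeros((1, 4))
W2 = rng.normal(size=(4, 1)); b2 = np.zeros((1, 1))

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

lr = 0.5
for step in range(5000):
    # Forward pass: data flows through the layers
    h = sigmoid(X @ W1 + b1)      # hidden-layer activations
    out = sigmoid(h @ W2 + b2)    # network's prediction

    # Backpropagation: push the error backwards, layer by layer
    d_out = out - y                         # error at the output (cross-entropy + sigmoid)
    d_h = (d_out @ W2.T) * h * (1 - h)      # share of the error assigned to each hidden neuron

    # Adjust every weight a little in the direction that reduces the error
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0, keepdims=True)

print(np.round(out, 2))   # should approach [[0], [1], [1], [0]]
```

Backpropagation is simply the chain rule from calculus applied layer by layer: the error measured at the output is translated into a small correction for every weight in the network.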
---
### **2.3 Why It’s So Resource-Intensive**
#### Massive Models = Massive Resources
- **GPT-3** has **175 billion parameters** — all of which need to be tuned during training.
- Training takes **weeks or months**, using hundreds or thousands of **power-hungry GPUs** (Graphics Processing Units).
- The **datasets** can reach **terabytes or even petabytes** in raw form, all of which must be stored, loaded, and processed.
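A rough back-of-envelope calculation shows why. The sketch below uses the common rule of thumb of roughly 6 floating-point operations per parameter per training token; the cluster size, GPU throughput, power draw, and overhead values are assumptions chosen only to show the order of magnitude, not reported figures for any real training run:

```python
# Back-of-envelope training cost. Every number here is an assumption.
params = 175e9        # a GPT-3-class model
tokens = 300e9        # training tokens (GPT-3's paper reports ~300 billion)
flops = 6 * params * tokens          # ~6 FLOPs per parameter per token (rule of thumb)

gpu_flops = 100e12    # assume ~100 TFLOP/s sustained per GPU (optimistic)
gpu_count = 1000      # assumed cluster size
seconds = flops / (gpu_flops * gpu_count)

gpu_power_kw = 0.5    # ~500 W per GPU under load (assumption)
pue = 1.2             # data-center overhead multiplier (assumption)
energy_mwh = gpu_count * gpu_power_kw * (seconds / 3600) * pue / 1000

print(f"total compute:   {flops:.2e} FLOPs")
print(f"wall-clock time: ~{seconds / 86400:.0f} days on {gpu_count:,} GPUs")
print(f"energy:          ~{energy_mwh:.0f} MWh")
```

Even with these optimistic assumptions, the answer lands at weeks of wall-clock time and hundreds of megawatt-hours, roughly the annual electricity use of dozens of homes.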
#### Energy → Heat → Cooling → Water
- These operations generate heat — a lot of it.
- To prevent servers from overheating, **cooling systems** kick in, and in water-cooled data centers, **tons of water** are evaporated or cycled.
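How much water that implies depends heavily on the cooling design, the local climate, and even the season. Operators summarize it with a metric called Water Usage Effectiveness (WUE): liters of water consumed per kilowatt-hour of IT energy. A minimal sketch, with both inputs assumed rather than measured:

```python
# Rough water estimate from energy via WUE. Both inputs are assumptions;
# published WUE values range from well under 1 to roughly 2 L/kWh by site.
energy_mwh = 525          # training energy from the estimate above (assumed)
wue_l_per_kwh = 1.0       # assumed site WUE

liters = energy_mwh * 1000 * wue_l_per_kwh
print(f"~{liters:,.0f} liters (~{liters / 3.785:,.0f} gallons) of water for cooling")
```

The exact figure matters less than the shape of the chain: every kilowatt-hour spent on training turns into heat, and in many facilities a meaningful share of that heat is removed by evaporating water.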
---
### **2.4 Training vs. Inference**
There are two phases in an AI’s life:
| Phase | What Happens | Resource Use |
|--------------|--------------------------------------------------------|----------------------|
| **Training** | Model learns from huge datasets | Extremely high |
| **Inference**| You ask a question; the AI gives you an answer          | Much lower per query, but it adds up |
Inference (what happens every time you use ChatGPT) still consumes energy and water, but **far less per query** than training the model from scratch. With millions of users, though, those queries add up.
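The gap is easiest to see in compute terms. Using common rules of thumb (about 6 × parameters FLOPs per training token, and about 2 × parameters FLOPs per generated token), a rough sketch of the comparison looks like this; the model size and answer length are assumptions:

```python
# Training compute vs. the compute for answering a single question.
# All numbers are assumptions or rules of thumb, not measured figures.
params = 175e9               # a GPT-3-class model
train_tokens = 300e9         # tokens seen during training
answer_tokens = 500          # a few paragraphs of reply

training_flops = 6 * params * train_tokens
one_answer_flops = 2 * params * answer_tokens

print(f"training:   {training_flops:.1e} FLOPs")
print(f"one answer: {one_answer_flops:.1e} FLOPs")
print(f"ratio:      ~{training_flops / one_answer_flops:.1e} answers' worth of compute")
```

Per question, inference is billions of times cheaper than training. The catch is scale: a popular model answers many millions of questions a day, so inference still ends up responsible for a large share of total energy and water use over a model's lifetime.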
---
### **2.5 Example: Training GPT-3**
- **Hardware**: ~10,000 GPUs
- **Time**: Weeks to months
- **Data**: Hundreds of billions of words
- **Water**: an estimated **~85,000 gallons** or more (for cooling alone; published estimates vary)
- **Cost**: Millions of dollars
And that’s just for one model!
---
## **2.6 Final Thought: Intelligence Isn’t Free**
Training an AI model is like building a skyscraper — it takes materials, planning, and massive effort. The final result may seem simple or effortless, but behind every “smart” answer lies a complex system that consumed energy, water, time, and human labor.
---
## **Next Chapter Preview: Chapter Three — The Data Center Deep Dive**
In the next chapter, we’ll open the doors of the **data centers** that make AI possible:
- What do they look like?
- How are they cooled?
- Why are some greener than others?
We’ll also explore **alternatives and innovations**: could AI be powered by clean energy and cooled sustainably?
---