AI Engineering has emerged as a distinct discipline, separate from traditional Machine Learning Engineering. While ML engineers train models from scratch, AI engineers build applications on top of foundation models. This shift fundamentally changes what you need to learn.
1. Understanding Foundation Models
Foundation models (GPT-5.2, Claude, Gemini, Llama) are the building blocks of modern AI applications. Unlike traditional ML, where you train models from scratch, AI engineering leverages pre-trained models. Understanding their capabilities, limitations, tokenization, context windows, and pricing is fundamental to building cost-effective applications.
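To build intuition for tokens and cost, a quick back-of-the-envelope script helps. Below is a minimal sketch using the tiktoken library; the per-1K-token price is a made-up placeholder, so check your provider's current pricing page rather than trusting the number.

```python
# Rough token count and input-cost estimate for a prompt.
# The price below is an illustrative placeholder, NOT a real rate --
# always check your provider's current pricing page.
import tiktoken

def estimate_cost(text: str, price_per_1k_input_tokens: float = 0.001) -> tuple[int, float]:
    """Count tokens with a common encoding and estimate input cost."""
    encoding = tiktoken.get_encoding("cl100k_base")  # encoding used by many OpenAI models
    num_tokens = len(encoding.encode(text))
    cost = num_tokens / 1000 * price_per_1k_input_tokens
    return num_tokens, cost

prompt = "Summarize the key differences between RAG and fine-tuning."
tokens, cost = estimate_cost(prompt)
print(f"{tokens} tokens, ~${cost:.6f} at the placeholder rate")
```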
Project: Model Comparison Notebook
Create a simple Python notebook that sends the same 10 prompts to different models (use free tiers: Gemini API, Groq for Llama, OpenAI playground) and compares the responses side by side. Document differences in quality, speed, and style. No infrastructure needed, just API keys and a Jupyter notebook.
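A minimal sketch of the comparison loop, assuming providers that expose OpenAI-compatible chat endpoints; the model names, the Groq base URL, and the environment variable names are assumptions to verify against each provider's documentation:

```python
# Send the same prompts to several providers and collect responses side by side.
# Assumes OpenAI-compatible chat endpoints; model names, base URLs, and env var
# names are assumptions -- check each provider's docs before running.
import os
from openai import OpenAI

PROVIDERS = {
    "openai": {"base_url": None, "api_key": os.environ["OPENAI_API_KEY"],
               "model": "gpt-4o-mini"},
    "groq-llama": {"base_url": "https://api.groq.com/openai/v1",
                   "api_key": os.environ["GROQ_API_KEY"],
                   "model": "llama-3.1-8b-instant"},
}

prompts = ["Explain tokenization in one paragraph.",
           "Write a haiku about context windows."]

results = []
for name, cfg in PROVIDERS.items():
    client = OpenAI(api_key=cfg["api_key"], base_url=cfg["base_url"])
    for prompt in prompts:
        reply = client.chat.completions.create(
            model=cfg["model"],
            messages=[{"role": "user", "content": prompt}],
        )
        results.append({"provider": name, "prompt": prompt,
                        "response": reply.choices[0].message.content})

for row in results:
    print(f"[{row['provider']}] {row['prompt']}\n{row['response']}\n")
```

From here it is easy to drop the results into a pandas DataFrame or spreadsheet and annotate quality, latency, and style by hand.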
2. Prompt Engineering
Your prompts are your “code” in AI engineering. The difference between a mediocre and an excellent AI application often comes down to prompt design. Techniques like few-shot learning, chain-of-thought, and structured outputs can dramatically improve results without any model training.
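As a quick illustration, here is the same classification task phrased three ways; the wording is only an example, the point is the structural difference between zero-shot, few-shot, and chain-of-thought prompts:

```python
# Three phrasings of the same task. The wording is illustrative; what matters
# is the structure: no examples, a few worked examples, or an explicit request
# to reason step by step before answering. {review} is filled in at runtime.
task = "Classify the sentiment of this review as positive, negative, or neutral: {review}"

zero_shot = task

few_shot = (
    "Review: 'Great battery life, terrible screen.' -> neutral\n"
    "Review: 'Broke after two days.' -> negative\n"
    "Review: 'Exactly what I needed!' -> positive\n"
    + task
)

chain_of_thought = (
    task
    + "\nFirst list the positive and negative points mentioned, "
    "then give the final label on its own line."
)
```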
Project: Prompt Experiments Spreadsheet
Pick one task (e.g., “summarize this article”). Write 5 different prompts: zero-shot, few-shot, chain-of-thought, persona-based, and structured output. Test each on 10 examples and score results in a spreadsheet. You will see firsthand which techniques work best.
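A minimal harness for the experiment might look like the sketch below; call_model is a hypothetical placeholder for whichever API client you set up in the previous section, and the score column is left blank for manual grading in your spreadsheet:

```python
# Run each prompt variant over a small set of examples and dump raw outputs
# to CSV for manual scoring. `call_model` is a hypothetical placeholder --
# wire it to the chat API you chose in section 1.
import csv

def call_model(prompt: str) -> str:
    # Placeholder: replace with a real chat-completion call.
    return f"(model output for: {prompt[:40]}...)"

variants = {"zero_shot": "...", "few_shot": "...", "chain_of_thought": "...",
            "persona": "...", "structured_output": "..."}  # your 5 prompt templates
examples = ["example input 1", "example input 2"]  # your 10 test cases

with open("prompt_experiments.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["variant", "example", "output", "score"])
    for name, template in variants.items():
        for example in examples:
            output = call_model(f"{template}\n\nInput: {example}")
            writer.writerow([name, example, output, ""])  # fill in scores by hand
```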
3. Retrieval-Augmented Generation (RAG)
LLMs have knowledge cutoffs and hallucinate. RAG grounds them in your data. It’s the most common pattern for building production AI applications, from customer support bots to internal knowledge assistants. Understanding chunking strategies, embedding models, vector databases, and retrieval metrics is essential.
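Chunking is the easiest piece to start experimenting with. Here is a naive fixed-size chunker with overlap; the 500-character size and 50-character overlap are arbitrary starting values, not recommendations, so tune them against your own retrieval quality.

```python
# Naive fixed-size chunking with overlap: the simplest strategy to start with.
# Chunk size and overlap are arbitrary starting points; tune them for your data.
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

print(len(chunk_text("a" * 1200)))  # -> 3 chunks of at most 500 characters
```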
Project: Chat With Your Notes
Build a simple RAG app over 5-10 of your markdown notes or text files. Use any agentic framework with ChromaDB (runs locally, no setup). Split documents into chunks, embed them, and query with natural language. Start simple, and you can add complexity later. One Python file, ~50 lines of code.
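A rough sketch of the retrieval half using plain ChromaDB with its default embedding function; the notes/ folder, the chunk sizes, and the final LLM call are placeholders to adapt to your own setup and framework of choice.

```python
# Minimal retrieval pipeline over local notes using ChromaDB's default embedder.
# The notes/ folder, chunk sizes, and the final "send context to an LLM" step
# are placeholders -- adapt them to your files and provider.
from pathlib import Path
import chromadb

client = chromadb.Client()  # in-memory; use chromadb.PersistentClient(path="db") to keep data
collection = client.get_or_create_collection("notes")

# Load and chunk markdown notes: 500-char chunks with ~50-char overlap.
docs, ids = [], []
for path in Path("notes").glob("*.md"):
    text = path.read_text()
    for i in range(0, len(text), 450):
        docs.append(text[i:i + 500])
        ids.append(f"{path.stem}-{i}")

if docs:
    collection.add(documents=docs, ids=ids)

# Retrieve the chunks most relevant to a natural-language question.
question = "What did I write about embeddings?"
results = collection.query(query_texts=[question], n_results=3)
context = "\n---\n".join(results["documents"][0])
print(context)  # pass this context plus the question to your LLM of choice
```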