Learn Python Series (#37) - Virtual Environments and Dependency Management

Repository
What will I learn?
- You will learn why Python projects need isolation from each other;
- the mental model behind virtual environments and how they actually work;
- why dependency management tools exist and what problem they solve;
- how to think about version constraints and reproducible builds;
- the difference between "what you want" (pyproject.toml) and "what you got" (lock files).
Requirements
- A working modern computer running macOS, Windows or Ubuntu;
- An installed Python 3(.11+) distribution;
- The ambition to learn Python programming.
Difficulty
- Intermediate
Curriculum (of the Learn Python Series):
- Learn Python Series - Intro
- Learn Python Series (#2) - Handling Strings Part 1
- Learn Python Series (#3) - Handling Strings Part 2
- Learn Python Series (#4) - Round-Up #1
- Learn Python Series (#5) - Handling Lists Part 1
- Learn Python Series (#6) - Handling Lists Part 2
- Learn Python Series (#7) - Handling Dictionaries
- Learn Python Series (#8) - Handling Tuples
- Learn Python Series (#9) - Using Import
- Learn Python Series (#10) - Matplotlib Part 1
- Learn Python Series (#11) - NumPy Part 1
- Learn Python Series (#12) - Handling Files
- Learn Python Series (#13) - Mini Project - Developing a Web Crawler Part 1
- Learn Python Series (#14) - Mini Project - Developing a Web Crawler Part 2
- Learn Python Series (#15) - Handling JSON
- Learn Python Series (#16) - Mini Project - Developing a Web Crawler Part 3
- Learn Python Series (#17) - Roundup #2
- Learn Python Series (#18) - PyMongo Part 1
- Learn Python Series (#19) - PyMongo Part 2
- Learn Python Series (#20) - PyMongo Part 3
- Learn Python Series (#21) - Handling Dates and Time Part 1
- Learn Python Series (#22) - Handling Dates and Time Part 2
- Learn Python Series (#23) - Handling Regular Expressions Part 1
- Learn Python Series (#24) - Handling Regular Expressions Part 2
- Learn Python Series (#25) - Handling Regular Expressions Part 3
- Learn Python Series (#26) - pipenv & Visual Studio Code
- Learn Python Series (#27) - Handling Strings Part 3 (F-Strings)
- Learn Python Series (#28) - Using Pickle and Shelve
- Learn Python Series (#29) - Handling CSV
- Learn Python Series (#30) - Data Science Part 1 - Pandas
- Learn Python Series (#31) - Data Science Part 2 - Pandas
- Learn Python Series (#32) - Data Science Part 3 - Pandas
- Learn Python Series (#33) - Data Science Part 4 - Pandas
- Learn Python Series (#34) - Working with APIs in 2026
- Learn Python Series (#35) - Working with APIs Part 2
- Learn Python Series (#36) - Type Hints and Modern Python
GitHub Account
Learn Python Series (#37) - Virtual Environments and Dependency Management
When you install Python, you get a single Python installation on your system. When you pip install requests, that package goes into that one Python's site-packages directory. Every Python script you run uses that shared installation.
This works fine until you have two projects with conflicting needs. Project A needs requests version 2.28 because it relies on specific behavior. Project B needs requests version 2.31 with new features. You can only have one version installed globally.
Nota bene: This episode is about virtual environments and dependency management - the solution to this "one Python, many projects" problem. We'll focus on understanding WHY these tools exist and the mental model behind them, not just the commands.
The core problem: global installation conflicts
Think about how Python finds packages. When you write import requests, Python searches through its sys.path - a list of directories. The first requests it finds wins. If you have only one Python installation, there's only one place packages can live.
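You can see this search path for yourself. A minimal sketch, runnable in any Python 3 session:

import sys

# Python walks these directories in order; the first directory
# containing a "requests" package wins the import.
for path in sys.path:
    print(path)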
This creates several problems beyond version conflicts:
Permission issues: System Python on Linux/macOS often requires sudo to install packages. You shouldn't need admin rights to try a library.
Experimentation risk: Want to test a new package? Installing it globally means it's there forever, cluttering your environment even after your experiment fails.
Reproducibility: When you share your project, how do you tell someone else exactly which package versions to use? "Just install these packages" isn't precise enough - version 2.28 and 2.31 behave differently.
Development vs production: Your development machine might have debugging tools, test frameworks, linters. Production servers shouldn't. But if everything's global, how do you separate them?
The solution is isolation. Each project gets its own Python environment, independent from all others.
Virtual environments: the conceptual model
A virtual environment is surprisingly simple. It's just a directory containing:
- A copy of (or symlink to) your Python interpreter
- Its own site-packages folder for installed packages
- A few activation scripts that modify your shell's PATH
When you "activate" a virtual environment, you're telling your shell: "When I type python, use THIS python, not the system one. When I pip install something, put it in THIS environment's site-packages."
That's it. No virtual machines, no containers, no magic. Just directory isolation plus PATH manipulation.
The beauty is: you can have dozens of these environments, one per project, all using the same base Python but with different package sets. Project A's environment has requests==2.28. Project B's has requests==2.31. Both coexist peacefully because they're in separate directories.
Creating isolation: the tools available
Python has built-in support for this via the venv module. Every Python 3.3+ includes it:
python3 -m venv myproject_env
This creates a fresh environment in the myproject_env directory. Activate it, and you have isolation.
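What does that directory actually contain? The exact layout varies by platform (Windows uses Scripts\ instead of bin/), but on macOS/Linux it looks roughly like this:

myproject_env/
    bin/                  # activation scripts, pip, and the python executable
        activate
        pip
        python            # symlink back to your base interpreter
    lib/python3.11/
        site-packages/    # packages installed into THIS environment only
    pyvenv.cfg            # records which base Python this environment uses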
But venv only creates the environment. You still need to manage what goes into it. You still need to track which packages you installed. You still need a way to recreate the exact same environment on another machine.
This is where dependency management tools come in. Tools like pipenv (which we covered in episode #26) and poetry (the modern standard in 2026) handle both environment creation AND dependency tracking.
Dependency tracking: requirements vs reality
Here's the fundamental problem dependency managers solve: there's a difference between what you WANT and what you GET.
You want: "Give me the requests library, version 2.x, and pandas version 2.x."
What you get: requests 2.31.0, pandas 2.2.1, plus ALL their dependencies: urllib3 2.2.1, certifi 2024.2.2, charset-normalizer 3.3.2, idna 3.6, numpy 1.26.4, python-dateutil 2.9.0, pytz 2024.1, tzdata 2024.1, and six 1.16.0.
You asked for 2 packages. You got 11. This is the dependency tree - the packages you want depend on other packages, which depend on others.
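Running pip freeze in such an environment shows "what you got" - with the versions above, the listing would read:

certifi==2024.2.2
charset-normalizer==3.3.2
idna==3.6
numpy==1.26.4
pandas==2.2.1
python-dateutil==2.9.0
pytz==2024.1
requests==2.31.0
six==1.16.0
tzdata==2024.1
urllib3==2.2.1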
Now the question: when you tell a colleague "install my project's dependencies," which versions should they get? The 2 you explicitly requested? Or the exact same 11 you ended up with?
If you only specify the 2, they might get different versions of the other 9. Maybe numpy 1.26.5 came out yesterday with a subtle bug. Maybe urllib3 2.3.0 changed behavior. Suddenly their environment isn't identical to yours, and code that works for you breaks for them.
This is why modern tools use TWO files:
The intent file (pyproject.toml, Pipfile): What you want. "requests >=2.28, <3.0" means "I need requests 2.x, any version."
The lock file (poetry.lock, Pipfile.lock): What you got. "On February 13, 2026, running dependency resolution gave me EXACTLY these versions of these 11 packages."
Your colleague doesn't run dependency resolution again. They install from the lock file, getting the exact versions you tested with. This is reproducible builds.
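To make the contrast concrete, here is a sketch of how the same dependency might appear in each file (the lock entry is abbreviated, the hash is elided, and exact fields depend on your Poetry version):

# pyproject.toml - intent: any 2.x release, 2.28 or later
[tool.poetry.dependencies]
requests = "^2.28"

# poetry.lock - reality: the exact version resolution picked
[[package]]
name = "requests"
version = "2.31.0"
files = [
    {file = "requests-2.31.0-py3-none-any.whl", hash = "sha256:..."},
]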
Version constraints: expressing intent
When you specify requests >=2.28, <3.0, you're expressing: "I need at least 2.28 (for features I use), but I don't want 3.0 (which might break things)."
Modern tools use semantic versioning notation. The caret ^ is shorthand for "compatible versions":
requests = "^2.28"
This means: >=2.28.0, <3.0.0. Any 2.x version, but not 3.x, because major version bumps can break compatibility.
For packages in the 0.x range (still in development), ^0.5 means >=0.5.0, <0.6.0 - more conservative, because 0.x projects often break compatibility in minor version bumps.
The mental model: semantic versioning promises that patch versions (2.28.0 → 2.28.1) are bug fixes only, minor versions (2.28 → 2.29) add features without breaking existing code, and major versions (2.x → 3.x) can break everything.
Version constraints let you say "give me bug fixes and new features, but warn me before breaking changes."
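A few common constraint spellings in Poetry notation, with the ranges they expand to:

requests = "^2.28"       # >=2.28.0, <3.0.0 (caret: anything below the next major)
urllib3 = "~2.2"         # >=2.2.0, <2.3.0 (tilde: anything below the next minor)
pandas = ">=2.0, <2.3"   # explicit range, spelled out
six = "==1.16.0"         # exact pin - rare in intent files; pinning is the lock file's job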
Poetry: the modern standard
In episode #26, we covered pipenv. Since then, the Python community has largely moved to Poetry. The core concepts are identical - environments plus dependency tracking - but Poetry has better performance and more features.
The workflow is:
- Create a project, which generates pyproject.toml
- Add dependencies, which updates both pyproject.toml (your intent) and poetry.lock (exact versions)
- Poetry automatically creates and manages a virtual environment for you
- When someone else clones your project, they install from poetry.lock to get your exact environment
Poetry adds one killer feature: dependency groups. You can separate production dependencies (needed to run the app) from development dependencies (testing tools, linters) from documentation dependencies (Sphinx and themes).
Why does this matter? When you deploy to production, you only install the production group. No pytest, no mypy, no Sphinx. Smaller installation, fewer security vulnerabilities, faster deployment.
The command to install a package:
poetry add requests
Behind the scenes, Poetry: (1) updates your pyproject.toml with the constraint, (2) resolves all dependencies, (3) updates poetry.lock with exact versions, (4) installs everything into the project's virtual environment.
One command, four operations, full isolation and reproducibility.
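The whole lifecycle, sketched with real Poetry commands (project and package names are just examples):

poetry new myproject      # scaffold a project; generates pyproject.toml
cd myproject
poetry add requests       # constraint into pyproject.toml, exact pin into poetry.lock
poetry install            # on a fresh clone: recreate the environment from poetry.lock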
Dependency groups: organizing by purpose
Think about the dependencies a typical project has:
- Core dependencies: The libraries your application code imports. Needed in production.
- Development dependencies: Linters like black, type checkers like mypy, test frameworks like pytest. Only needed during development.
- Documentation dependencies: Sphinx for building docs, themes, extensions. Only needed when generating documentation.
- Optional dependencies: Features users can opt into. Maybe you support both MySQL and PostgreSQL, but users only need one database driver.
Without groups, everything gets installed everywhere. With groups, you can be precise:
[tool.poetry.dependencies]
requests = "^2.31"
[tool.poetry.group.dev.dependencies]
pytest = "^8.0"
black = "^24.1"
[tool.poetry.group.docs.dependencies]
sphinx = "^7.2"
Now you can install only what you need:
poetry install --only main
This installs just requests and its dependencies. Perfect for production. For development, poetry install (with no flags) installs main plus every non-optional group - here, dev and docs.
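Group selection composes further; a few variations (group names taken from the example above):

poetry add pytest --group dev    # record a new package under the dev group
poetry install --without dev     # everything except the dev group
poetry install --with docs       # default groups plus docs (useful when docs is marked optional)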
The lock file: reproducibility in practice
Lock files are verbose. They list every package, every version, every hash. Here's why that matters:
Security: Hashes ensure the package you download is the exact bytes the lock file expects. No tampering.
Time travel: The lock file is a snapshot. Six months from now, after 50 package updates, you can still recreate today's exact environment.
Platform independence: Lock files can include platform-specific variations. The numpy wheel for macOS ARM64 vs Linux x86_64 vs Windows - all specified.
Conflict resolution: When you add a new package that conflicts with existing ones, the lock file catches it. Resolution happens once, at development time, not every time someone installs.
Think of pyproject.toml as your recipe: "I want a cake with chocolate and sugar." The lock file is the exact measurements: "I got 200g Valrhona chocolate (batch #XYZ), 150g Mauritius cane sugar (harvest 2025)." Anyone following the lock file bakes the identical cake.
Activation: entering the environment
Virtual environments need activation - telling your shell to use the environment's Python instead of the system's.
With raw venv:
source myenv/bin/activate
This modifies your PATH environment variable, prepending the environment's bin directory. Now python resolves to myenv/bin/python, not /usr/bin/python.
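You can verify the switch yourself; a quick check (paths are illustrative):

$ which python3
/usr/bin/python3
$ source myenv/bin/activate
(myenv) $ which python
/home/user/myenv/bin/python
(myenv) $ deactivate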
Poetry manages this for you:
poetry shell
This activates the environment Poetry created for your project (note: in Poetry 2.x, poetry shell moved into the separate poetry-plugin-shell plugin; poetry env activate is the built-in alternative). Or use poetry run python script.py to run a single command in the environment without fully activating.
When you're done, exit (for a Poetry shell) or deactivate (for a raw venv) returns you to your normal shell.
Nota bene: Activation is per-shell-session. If you open a new terminal window, the environment isn't active there. This is by design - isolation means explicit opt-in.
TL;DR — what we covered
In this episode, we covered the conceptual foundations of Python project isolation:
- Why global package installation creates conflicts and reproducibility problems
- How virtual environments provide isolation through directory separation and PATH manipulation
- The difference between intent files (what you want) and lock files (what you got)
- Why dependency tracking needs both: constraints for flexibility, locks for reproducibility
- How version constraints express compatibility ranges using semantic versioning
- Dependency groups for separating production, development, and optional dependencies
- The role of activation in entering an isolated environment
Virtual environments and dependency management aren't about memorizing commands. They're about understanding the problem: multiple projects, one Python, conflicting needs. The solution: isolated directories per project, plus tools that track exactly what went into each.