A comprehensive guide to setting up a complete Python development environment for data engineering. Learn how to install Python across different operating systems, configure VS Code with essential extensions, create and manage virtual environments, and establish a professional workflow with dependency management using pip and requirements.txt.
This guide will walk you through setting up Python, VS Code, and virtual environments on macOS, Windows, and Linux.
To verify the installation on macOS:
Type python3 --version and press Enter. You should see the Python version you just installed.
To verify the installation on Windows:
Type python --version and press Enter. You should see the Python version you just installed.
Most Linux distributions come with Python pre-installed. To check if you have Python and which version:
Type python3 --version and press Enter.
If Python is not installed or you want a newer version:
For Ubuntu or Debian-based distributions:
sudo apt update
sudo apt install python3
For Fedora:
sudo dnf install python3
For Arch Linux:
sudo pacman -S python
To install extensions in VS Code, open the Extensions view with Cmd+Shift+X on macOS or Ctrl+Shift+X on Windows and Linux.
Virtual environments allow you to have isolated environments for different projects. Here's how to set one up:
Open the Terminal in VS Code (Terminal > New Terminal)
Navigate to your project folder using the cd command
Create a new virtual environment:
python3 -m venv venv # On macOS and Linux
python -m venv venv # On Windows
Replace the second "venv" in the command with whatever name you want for your environment. The rest of this guide assumes the name venv.
Activate the virtual environment:
source venv/bin/activate # On macOS and Linux
venv\Scripts\activate # On Windows
You should see (venv) at the beginning of your terminal prompt.
VS Code should detect the new environment. If not, select it manually as we did in the previous step.
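If the prompt prefix is not enough to go on, the interpreter itself can report whether it is running inside a virtual environment. The following is a minimal sketch, assuming the environment was created with the standard venv module. The function names in_virtual_environment and describe_interpreter are illustrative and are not part of any library.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def describe_interpreter() -> str:
    """Summarize which Python is running and whether a venv is active."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    status = "virtual environment" if in_virtual_environment() else "base installation"
    return f"Python {version} ({status}) at {sys.executable}"


print(describe_interpreter())
```

Run this from the VS Code terminal after activating the environment. If it reports the base installation, the environment is probably not active in that terminal.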
With your virtual environment activated:
pip install requests
pip freeze > requirements.txt
To update requirements.txt after installing new packages:
pip install numpy pandas
pip freeze > requirements.txt
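To get a feel for what ends up in requirements.txt, the sketch below lists the installed packages as name==version lines using the standard library's importlib.metadata. It is an approximation for illustration only: the function name frozen_requirements is hypothetical, and the real pip freeze output can differ in details such as editable installs or packages installed from a URL.

```python
from importlib import metadata


def frozen_requirements() -> list[str]:
    """Return 'name==version' lines for the packages visible to this interpreter."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries whose metadata has no usable name
            lines.add(f"{name}=={dist.version}")
    # Sort case-insensitively so the listing is stable and easy to scan.
    return sorted(lines, key=str.lower)


for requirement in frozen_requirements():
    print(requirement)
```

Running it inside an activated virtual environment should show only that environment's packages. This is why freezing from an isolated environment keeps requirements.txt short and project-specific.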
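For each new project, repeat the steps above: create a virtual environment in the project folder, activate it, install the packages you need, and save them with pip freeze > requirements.txt.
When you're done working:
Type deactivate in the terminal.
To work on the project again:
Open the project folder and activate the virtual environment as shown above.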
This process will help you keep your projects organized and isolated from each other.
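If you create environments often, the creation step can also be scripted with the standard library's venv module instead of typing the command each time. The sketch below is one possible approach. The helper name create_project_env and its parameters are assumptions made for this example.

```python
import venv
from pathlib import Path


def create_project_env(project_dir: str, env_name: str = "venv", with_pip: bool = True) -> Path:
    """Create project_dir/env_name as a virtual environment and return its path."""
    env_path = Path(project_dir) / env_name
    # Roughly what `python -m venv <env_name>` does when run inside project_dir.
    venv.create(env_path, with_pip=with_pip)
    return env_path
```

For example, create_project_env("my_project") would create my_project/venv. You still activate the environment from the terminal as described earlier, because activation changes the current shell session rather than the Python process.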
Remember to regularly update your requirements.txt file as you add or remove packages from your project. This ensures that your project remains reproducible across different environments.
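One way to notice that requirements.txt has drifted from what is actually installed is to compare the two. The sketch below is a simplified check that only understands plain name==version lines, as produced by pip freeze for ordinary packages. It does not handle environment markers, version ranges, or URL requirements. The function name find_mismatches is hypothetical.

```python
from importlib import metadata
from pathlib import Path


def find_mismatches(requirements_path: str) -> list[str]:
    """Report pinned requirements that are missing or installed at another version."""
    problems = []
    for raw_line in Path(requirements_path).read_text().splitlines():
        line = raw_line.strip()
        # Only simple 'name==version' pins are handled in this sketch.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, _, pinned = line.partition("==")
        name, pinned = name.strip(), pinned.strip()
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed (expected {pinned})")
            continue
        if installed != pinned:
            problems.append(f"{name}: installed {installed}, expected {pinned}")
    return problems
```

Calling find_mismatches("requirements.txt") from the project folder returns an empty list when every pinned package is installed at the pinned version. Note that it does not detect packages that are installed but missing from the file.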