🧠 NumPyMasterPro

NumPyMasterPro is a comprehensive, modular, and hands-on project designed to help you master NumPy from first principles to real-world applications.

This project isn't just a learning exercise: it's a complete reference toolkit, interview-ready resource, and a portfolio-quality project that showcases your fluency with one of Python's most essential libraries for scientific computing and data analysis.


🚀 Why This Project Matters

Most learners stop at tutorials. This repository takes you further by combining theory, implementation, real-world use cases, and production practices in one place.

✅ Covers NumPy's essential concepts, from array basics through linear algebra, statistics, and file I/O
✅ Demonstrates clean project structure and modular code reuse
✅ Includes interview-ready topics like broadcasting, vectorization, and matrix algebra
✅ Provides Jupyter notebooks, Python utility scripts, and a cheat sheet
✅ Ends with a from-scratch K-Means implementation with the Elbow Method, a strong resume piece


📌 Project Objectives

  • 🔍 Master Core NumPy Syntax through progressively organized notebooks
  • 🔄 Understand Memory Efficiency: broadcasting, vectorization, views vs. copies
  • ⚙️ Practice Clean Coding using reusable utility scripts in /scripts
  • 🧠 Explore Real-World Scenarios: regression, simulations, image ops, clustering
  • 📂 Build a Reference Toolkit for revision, projects, and technical interviews
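
As a small illustration of the memory-efficiency ideas listed above, the sketch below shows broadcasting combining a column and a row without a Python loop, and contrasts a slice (a view) with a boolean-index selection (a copy). It was written for this README rather than taken from the notebooks, and all variable names are illustrative.

```python
import numpy as np

# Broadcasting: a (3, 1) column and a (4,) row combine into a (3, 4) grid
# without any explicit Python loop.
col = np.arange(3).reshape(3, 1)
row = np.arange(4)
grid = col * 10 + row            # shape (3, 4)

# Views vs. copies: basic slicing returns a view that shares memory,
# while boolean indexing returns an independent copy.
base = np.arange(6)
view = base[1:4]                 # view onto base
copy = base[base > 2]            # copy of the selected elements

view[0] = 99                     # also changes base[1]
copy[0] = -1                     # leaves base untouched

shares_view = np.shares_memory(base, view)   # True: same buffer
shares_copy = np.shares_memory(base, copy)   # False: separate buffer
```

Checking `np.shares_memory` before mutating an array is a quick way to find out whether a change will propagate back to the original.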

🧱 Folder Structure

NumPyMasterPro/
├── notebooks/                 # 📓 Themed Jupyter Notebooks (core + advanced topics)
├── scripts/                   # 🛠️ Modular, reusable Python utilities
├── datasets/                  # 📁 Data files used in notebooks
├── docs/                      # 📜 Cheat sheets and markdown-based quick notes
├── requirements.txt           # 📦 Minimal dependencies to run the project
├── requirements_dev.txt       # 📦 Full dev environment
├── .env.example               # 🛡️ Sample env file for Docker-based config (login-free setup)
├── docker-compose.yml         # 🐳 Multi-container orchestration for Jupyter Lab
├── Dockerfile                 # 🐳 Docker image setup using Jupyter minimal notebook base
├── .gitignore                 # ❌ Files to exclude from version control
└── README.md                  # 📘 This file!

🧮 Topics Covered

| Notebook | Description |
| --- | --- |
| 01_array_basics.ipynb | Array creation, types, shapes, memory attributes |
| 02_indexing_slicing.ipynb | Indexing, slicing, masking, .take(), .put() |
| 03_array_manipulation.ipynb | Reshaping, stacking, splitting, tiling, padding |
| 04_math_operations.ipynb | Element-wise ops, aggregation, rounding, broadcasting |
| 05_linear_algebra.ipynb | Dot product, inverse, norms, eig/SVD, solving systems |
| 06_statistics_probability.ipynb | Descriptive stats, histograms, correlations, sampling |
| 07_masking_conditions.ipynb | where, select, logical ops, nonzero, isfinite, etc. |
| 08_file_io_memory.ipynb | save, load, memmap, vectorize, views vs. copies |
| 09_real_world_cases.ipynb | Regression, image ops, time-series scaling, simulations |
| 10_kmeans_from_scratch.ipynb | 🎯 BONUS: K-Means Clustering + Elbow Method using NumPy only |
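
To give a flavor of the regression material listed for 09_real_world_cases.ipynb, here is a minimal least-squares line fit using `np.linalg.lstsq`. The data and variable names are assumptions made for illustration; the notebook itself may take a different approach.

```python
import numpy as np

# Noise-free data drawn from y = 2x + 1, so the recovered fit is easy to verify.
x = np.linspace(0.0, 5.0, 20)
y = 2.0 * x + 1.0

# Design matrix: one column for the slope, one column of ones for the intercept.
A = np.column_stack([x, np.ones_like(x)])

# Least-squares solution of A @ [slope, intercept] ~= y.
coeffs, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
slope, intercept = coeffs
```

With real, noisy data the recovered `slope` and `intercept` would only approximate the underlying values.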

🧰 Utility Scripts

| File | Purpose |
| --- | --- |
| array_utils.py | Inspect shapes, types, identities, and metadata |
| linear_algebra_utils.py | Matrix algebra: dot, inverse, SVD, eigenvalues |
| math_utils.py | Element-wise math: power, root, trig, rounding, logs, exponent |
| aggregation_utils.py | Sum, mean, std, var, min, max (global and axis-wise) |
| stats_utils.py | Z-score, normalization, correlation, histogram bins |
| logical_utils.py | Boolean logic, masking, conditionals (any, all, where) |
| kmeans_utils.py | K-Means from scratch, inertia calculation, and centroid init |
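
The kind of helper described for stats_utils.py can be sketched as follows. The names `zscore_sketch` and `minmax_sketch`, their signatures, and the handling of constant columns are hypothetical choices made for this example, not the repository's actual API.

```python
import numpy as np

def zscore_sketch(a, axis=0):
    """Standardize to zero mean and unit variance along `axis` (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    mean = a.mean(axis=axis, keepdims=True)
    std = a.std(axis=axis, keepdims=True)
    # Constant slices have std 0; divide by 1 there so they map to 0, not NaN.
    safe_std = np.where(std == 0, 1.0, std)
    return (a - mean) / safe_std

def minmax_sketch(a, axis=0):
    """Rescale to the [0, 1] range along `axis` (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    lo = a.min(axis=axis, keepdims=True)
    span = a.max(axis=axis, keepdims=True) - lo
    safe_span = np.where(span == 0, 1.0, span)
    return (a - lo) / safe_span

# Second column is constant on purpose to exercise the zero-spread branch.
data = np.array([[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]])
z = zscore_sketch(data)
m = minmax_sketch(data)
```

Using `keepdims=True` keeps the reduced statistics broadcastable against the input, so the same code works along any axis.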

Example usage:

# Direct module import
from scripts.kmeans_utils import kmeans, compute_inertia

# Or use convenient re-exports from __init__.py
from scripts import kmeans, describe_array, minmax_normalize
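
For readers who want to see the idea behind these imports, below is a minimal NumPy-only sketch of K-Means (Lloyd's algorithm) with an inertia calculation and an elbow-style loop. `kmeans_sketch` and `inertia_sketch` are hypothetical stand-ins written for this README; the real signatures in scripts/kmeans_utils.py may differ, and the random data-point initialization is an assumption.

```python
import numpy as np

def _assign(X, centroids):
    # Pairwise distances of shape (n_samples, k) via broadcasting.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans_sketch(X, k, n_iters=100, seed=0):
    """Lloyd's algorithm with random data-point initialization (illustrative)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        labels = _assign(X, centroids)
        # Move each centroid to the mean of its points; keep it if the cluster is empty.
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids, _assign(X, centroids)

def inertia_sketch(X, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    X = np.asarray(X, dtype=float)
    return float(np.sum((X - centroids[labels]) ** 2))

# Two synthetic blobs; inertia is recorded for k = 1..4 to draw an elbow curve.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 0.5, (50, 2)), rng.normal(5.0, 0.5, (50, 2))])
inertias = []
for k in range(1, 5):
    centroids, labels = kmeans_sketch(X, k)
    inertias.append(inertia_sketch(X, centroids, labels))
```

Plotting `inertias` against `k` gives the elbow curve; the value of `k` where the curve stops dropping steeply is the usual heuristic choice.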

๐ŸŽ›๏ธ Streamlit Frontend (Interactive Demo)

You can try the K-Means algorithm with different datasets or number of clusters using:

streamlit run kmeans_app.py

The app lets you upload .csv files, set the cluster count, and visualize results in real time. It is useful for experiments, teaching, and showcasing clustering interactively.


๐Ÿณ Docker-Based Setup (Optional)

Prefer running in a containerized Jupyter Lab environment?

docker compose up --build

Then open your browser at: 👉 http://localhost:8889

You can also stop the container with:

docker compose down --volumes --remove-orphans

๐Ÿ” Authentication & Security

This project is configured for login-free use of Jupyter Lab โ€” no password or token required.

  • โœ… .env.example is included with recommended settings.
  • ๐Ÿšซ .env is deliberately excluded from the repo (add your own if needed).
  • ๐Ÿ›ก๏ธ You may modify the docker-compose.yml to add a token or hashed password later.

🧠 Recommended Use

  • ✍️ Study each notebook sequentially and refer back as needed
  • 🧪 Use /scripts/ functions in other projects or interview tasks
  • 🧵 Treat docs/numpy_cheatsheet.md as your quick review guide
  • 🧠 Reference 10_kmeans_from_scratch.ipynb on your resume to show NumPy fluency
  • 💡 Add your own notebooks (e.g., PCA from scratch or numerical integration)

🔧 Getting Started (Without Docker)

  1. Clone the repo

    git clone https://github.com/SatvikPraveen/NumPyMasterPro.git
    cd NumPyMasterPro
  2. Create & activate a virtual environment, then install the dependencies

    python -m venv venv
    source venv/bin/activate        # On Windows: venv\Scripts\activate
    pip install -r requirements.txt
  3. Launch the Jupyter Lab interface

    jupyter lab

🧪 Testing

NumPyMasterPro includes a comprehensive test suite with 80+ unit tests covering all utility modules.

Quick Testing

# Install test dependencies
pip install pytest pytest-cov

# Run all tests
pytest

# Run with coverage
pytest --cov=scripts --cov-report=term-missing

Using Makefile Commands

make test              # Run all tests
make test-coverage     # Generate coverage report
make lint              # Check code quality
make format            # Auto-format code
make all               # Run complete checks

Test Coverage

  • ✅ Array utilities (describe, compare, flags)
  • ✅ Logical operations (any, all, where, masking)
  • ✅ K-Means algorithm (clustering, inertia)
  • ✅ Math operations (arithmetic, trig, rounding)
  • ✅ Linear algebra (matrices, eigenvalues, SVD)
  • ✅ Statistics (normalization, correlation)
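
A unit test in this style might look like the sketch below. `replace_negatives_stub` is a hypothetical stand-in defined inline so the example runs on its own; it is not one of the repository's utilities, and the actual tests may be organized differently. pytest collects any function whose name starts with `test_`.

```python
import numpy as np

def replace_negatives_stub(a, value=0.0):
    """Stand-in for a masking utility under test (hypothetical, not repo code)."""
    a = np.asarray(a, dtype=float)
    return np.where(a < 0, value, a)

def test_negatives_are_replaced():
    out = replace_negatives_stub([-1.0, 2.0, -3.0])
    np.testing.assert_array_equal(out, [0.0, 2.0, 0.0])

def test_input_is_not_modified():
    original = np.array([-1.0, 2.0])
    replace_negatives_stub(original)
    # np.where builds a new array, so the caller's data stays intact.
    np.testing.assert_array_equal(original, [-1.0, 2.0])
```

Saved as a file such as `test_example.py` (an illustrative name), this would be picked up by a plain `pytest` run.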

📖 Detailed testing guide: TESTING.md

CI/CD Pipeline

Automated testing runs on:

  • 🔄 Every push to main/develop
  • 🔄 All pull requests
  • ✅ Multi-OS (Ubuntu, macOS, Windows)
  • ✅ Python 3.10, 3.11, 3.12
  • ✅ Code linting & formatting checks
  • ✅ Notebook validation
  • ✅ Docker build verification

📄 License

This project is licensed under the GNU General Public License v3.0. See the LICENSE file for more details.


🌟 Showcase & Star

If this project helped you master NumPy, feel free to ⭐ it and share it with others!
