Client: OptiMax · Date: July 2024 · Author: Matthew Lee

Project Overview:
This comparative study benchmarks three popular Python libraries (Pandas Profiling, SweetViz, and AutoViz) on their ability to automate exploratory data analysis (EDA). Using datasets of varying complexity (Titanic, housing prices, NYC taxi), we assessed run time, report comprehensiveness, memory usage, and user interactivity.

Key Highlights:

  • Generated full EDA reports using a single command per library.
  • Compared report features: alerts, correlation metrics, interactive visuals, and dataset summaries.
  • Benchmarked runtime and profiling speed on large vs. small datasets and presented performance trade‑offs.
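The single-command reports and the runtime comparison described above can be sketched roughly as follows. This is a minimal illustration, not the harness used in the study: the helper names `benchmark` and `make_runners`, the output file names, and the use of `tracemalloc` for memory are all assumptions. Note that `tracemalloc` only sees allocations made through Python's allocator and adds some overhead of its own, so the figures are best read as relative, not absolute.

```python
import time
import tracemalloc
from typing import Callable, Dict


def benchmark(label: str, fn: Callable[[], object]) -> Dict[str, object]:
    """Run fn once; return wall-clock seconds and peak traced memory in MiB."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        fn()
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
    finally:
        # Always stop tracing, even if the report generation fails.
        tracemalloc.stop()
    return {"tool": label, "seconds": elapsed, "peak_mib": peak / 2**20}


def make_runners(df, out_dir: str = ".") -> Dict[str, Callable[[], None]]:
    """Build one report-generating callable per library (hypothetical helper).

    Imports are deferred so that a missing library only fails its own runner.
    """

    def run_ydata() -> None:
        from ydata_profiling import ProfileReport
        ProfileReport(df, title="EDA report").to_file(f"{out_dir}/ydata_report.html")

    def run_sweetviz() -> None:
        import sweetviz as sv
        sv.analyze(df).show_html(f"{out_dir}/sweetviz_report.html", open_browser=False)

    def run_autoviz() -> None:
        from autoviz.AutoViz_Class import AutoViz_Class
        # An empty filename with dfte=df tells AutoViz to use the in-memory DataFrame.
        AutoViz_Class().AutoViz(filename="", dfte=df, verbose=0)

    return {"YData Profiling": run_ydata, "SweetViz": run_sweetviz, "AutoViz": run_autoviz}


# Example usage, assuming pandas and the three libraries are installed
# and "titanic.csv" is a local copy of the dataset:
#
#   import pandas as pd
#   df = pd.read_csv("titanic.csv")
#   for name, runner in make_runners(df).items():
#       print(benchmark(name, runner))
```

Repeating each run several times and reporting the median would give steadier numbers than a single timing, particularly on the smaller datasets.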

Technologies Used:
Python · Pandas Profiling (YData Profiling) · SweetViz · AutoViz · Jupyter Notebooks

Outcome & Learnings:
The study provides a clear decision framework for selecting an EDA tool based on dataset size and analysis depth. SweetViz offered more interactive visuals, Pandas Profiling excelled in alerts and detail, and AutoViz was fastest for large datasets. The results serve as a reference for streamlining data analysis workflows.
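One way to encode these findings as a reusable rule is sketched below. The function name `choose_eda_tool`, the row-count threshold, and the order in which the criteria are checked are illustrative assumptions; the study itself does not state a specific cutoff for what counts as a large dataset.

```python
def choose_eda_tool(
    n_rows: int,
    need_alerts: bool = False,
    need_interactive: bool = False,
    large_rows: int = 1_000_000,  # assumed threshold, not taken from the study
) -> str:
    """Suggest an EDA library from dataset size and reporting needs."""
    if n_rows >= large_rows:
        # Speed dominates on large datasets.
        return "AutoViz"
    if need_alerts:
        # Data-quality alerts and detailed per-column summaries.
        return "Pandas Profiling"
    if need_interactive:
        # Richer interactive visuals.
        return "SweetViz"
    # Assumed default: the most detailed report when nothing else is specified.
    return "Pandas Profiling"


print(choose_eda_tool(891, need_interactive=True))
print(choose_eda_tool(5_000_000, need_alerts=True))
```

Checking size first reflects the assumption that a slow report is the binding constraint on large data; teams that value alerts over speed could reorder the checks or raise `large_rows`.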