Introduction to Django for Beginners
■ Episode 14: NumPy Statistical Functions
(Last Updated: 2023.06.09)
This article takes about 3 minutes to read!
“Let’s use it together with NumPy!”
In the previous episode, we learned about arithmetic operations, broadcasting, the dot product, and matrix multiplication in NumPy. Of these, broadcasting stood out as a powerful feature that makes computation efficient.
In this article, we introduce statistical functions for analyzing the results of such computations, along with the scientific libraries that work with NumPy for tasks such as drawing graphs and handling tables.
[Table of Contents]
1. NumPy Statistical Functions
2. NumPy and Scientific Libraries
3. Summary
1. NumPy Statistical Functions
NumPy includes many functions for performing statistical analysis on arrays. The basic ones are listed below, each with a short explanation. Together they cover most everyday data summarization tasks in NumPy.
np.sum: Calculates the sum of all elements in the array
np.mean: Calculates the mean of all elements in the array
np.median: Calculates the median of all elements in the array
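np.std: Calculates the standard deviation of all elements in the array (the population standard deviation by default; pass ddof=1 for the sample version)
np.var: Calculates the variance of all elements in the array (the population variance by default; pass ddof=1 for the sample version)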
np.min / np.max: Finds the minimum or maximum value in the array
np.argmin / np.argmax: Finds the index of the minimum or maximum value in the array (the first occurrence if the value appears more than once)
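The functions listed above can be tried out with a short sketch like the one below. The `scores` and `table` arrays are hypothetical sample data, not taken from the article. The `axis` and `ddof` arguments are standard NumPy parameters that the list does not cover in detail.

```python
import numpy as np

# Hypothetical sample data: five test scores.
scores = np.array([70, 85, 90, 65, 90])

total = np.sum(scores)          # sum of all elements
mean = np.mean(scores)          # arithmetic mean
median = np.median(scores)      # middle value of the sorted data
variance = np.var(scores)       # population variance (ddof=0 by default)
std = np.std(scores)            # population standard deviation
sample_var = np.var(scores, ddof=1)  # sample variance (divides by n - 1)

lowest, highest = np.min(scores), np.max(scores)
idx_low = np.argmin(scores)     # index of the smallest value
idx_high = np.argmax(scores)    # index of the first occurrence of the largest value

print(total, mean, median)      # 400 80.0 85.0
print(variance, sample_var)     # 110.0 137.5
print(lowest, highest, idx_low, idx_high)  # 65 90 3 2

# With a 2-D array, the axis argument computes statistics per column or per row.
table = np.array([[1, 2, 3],
                  [4, 5, 6]])
col_means = np.mean(table, axis=0)  # mean of each column
row_sums = np.sum(table, axis=1)    # sum of each row
print(col_means)  # [2.5 3.5 4.5]
print(row_sums)   # [ 6 15]
```

Without `axis`, each function treats the whole array as a single set of values, which matches the "all elements" wording in the list.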
Statistical functions in NumPy
2. NumPy and Scientific Libraries
NumPy plays a central role in Python's scientific computing ecosystem. Its efficient multidimensional array object and array manipulation tools are used by many other Python libraries. Below are the major libraries that relate to NumPy and how they relate to it. (The percentages in parentheses indicate usage rates in machine learning projects on GitHub in 2019. NumPy itself was used in 74% of the projects.)
SciPy (47%): A library supporting scientific and technical computing. It is built on top of NumPy and provides functions for linear algebra, probability, integration, optimization, statistics, and more.
Pandas (41%): A powerful library for data analysis and manipulation. Its DataFrame object is based on NumPy arrays and offers labeling and better handling of missing data.
Matplotlib (40%): A library for creating graphs and charts in Python. It uses NumPy arrays to generate plots and visualizations.
Scikit-learn (38%): A library for machine learning in Python. Many of its functions take NumPy arrays as input and return NumPy arrays as output.
TensorFlow (24%) / PyTorch: Deep learning frameworks. They provide NumPy-like array operations and APIs, and can exchange data with NumPy arrays.
As shown above, each of these libraries builds on or exchanges data with NumPy arrays. Because they share this common foundation, they can be combined into a powerful data analysis toolset.
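As one illustration of this interplay, here is a minimal sketch that passes a NumPy array to pandas and back. It assumes pandas is installed. The `data` array and the column names `x` and `y` are hypothetical, chosen only to show labeling and missing-data handling.

```python
import numpy as np
import pandas as pd

# Hypothetical measurements; np.nan marks a missing value.
data = np.array([[1.0, 10.0],
                 [2.0, np.nan],
                 [3.0, 30.0]])

# Wrap the NumPy array in a DataFrame to attach column labels.
df = pd.DataFrame(data, columns=["x", "y"])

# pandas skips missing values by default, while np.mean propagates NaN.
y_mean_pandas = df["y"].mean()            # mean of the non-missing values
y_mean_numpy = np.mean(data[:, 1])        # NaN, because one value is missing
y_mean_nanmean = np.nanmean(data[:, 1])   # NumPy's NaN-aware alternative

print(y_mean_pandas, y_mean_numpy, y_mean_nanmean)  # 20.0 nan 20.0

# Convert back to a plain NumPy array for libraries that expect one.
back = df.to_numpy()
print(type(back), back.shape)
```

The same pattern of converting to and from NumPy arrays applies to the other libraries in the list, although the exact conversion functions differ from library to library.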
3. Summary
In this article, we introduced NumPy’s statistical functions and the scientific libraries often used alongside it. NumPy is widely used in fields like machine learning, scientific computation, and statistics, thanks to its rich features, efficiency, and speed. If you're thinking about diving into these areas, mastering NumPy is a great place to start!
▼References
Summary of Statistical Functions in NumPy - Ebi Works
Overview of Scientific Computing with Python - Toshihiro Kamishima