NumPy is a Python library used for working with arrays. It also has functions for working with linear algebra, Fourier transforms, and matrices.

NumPy was created in 2005 by Travis Oliphant. It is an open source project and you can use it freely. NumPy stands for Numerical Python.

Why Use NumPy?

In Python, lists serve the purpose of arrays, but they are slow to process. NumPy aims to provide an array object that is up to 50x faster than traditional Python lists. The array object in NumPy is called ndarray, and it comes with many supporting functions that make working with it easy. Arrays are used very frequently in data science, where speed and resource usage matter.

Why is NumPy Faster Than Lists?

NumPy arrays are stored in one contiguous block of memory, unlike lists, so processes can access and manipulate them very efficiently. This behavior is called locality of reference in computer science, and it is the main reason why NumPy is faster than lists. NumPy is also optimized to work with the latest CPU architectures.
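
As a rough illustration of the speed difference, the sketch below doubles every element of a list and of an ndarray and times both. The size n and the repetition count are arbitrary choices made for this example, and the measured times depend on the machine, so no particular speedup is implied.

```python
import timeit

import numpy as np

n = 100_000  # illustrative size, chosen arbitrarily
py_list = list(range(n))
np_arr = np.arange(n)

# Double every element: once with a list comprehension, once with one array expression.
doubled_list = [v * 2 for v in py_list]
doubled_arr = np_arr * 2

# Both approaches produce the same values.
assert doubled_list == doubled_arr.tolist()

# Time 20 repetitions of each; absolute numbers vary by machine.
list_time = timeit.timeit(lambda: [v * 2 for v in py_list], number=20)
arr_time = timeit.timeit(lambda: np_arr * 2, number=20)
print(f"list: {list_time:.4f} s, ndarray: {arr_time:.4f} s")

# A freshly created 1-D array reports that its data is laid out contiguously.
print(np_arr.flags["C_CONTIGUOUS"])
```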

Installation of NumPy

If you have Python and pip already installed on a system, then installation of NumPy is very easy.

Install it using this command:

C:\Users\Your Name>pip install numpy

If this command fails, use a Python distribution that already has NumPy installed, such as Anaconda or Spyder.

Importing NumPy

Once NumPy is installed, import it in your applications by adding the import keyword:

import numpy

Example:

import numpy
arr = numpy.array([1, 2, 3, 4, 5])
print(arr)

NumPy as np

NumPy is usually imported under the np alias.

Create an alias with the as keyword while importing:

import numpy as np

Example:

import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr)

Checking NumPy Version

The version string is stored under the __version__ attribute.

import numpy as np
print(np.__version__)

NumPy Arrays

NumPy arrays are great alternatives to Python lists. Some key advantages of NumPy arrays are that they are fast, easy to work with, and let you perform calculations across entire arrays.

In the following example, you will import the numpy package, create two Python lists, and then create NumPy arrays out of the newly created lists.

You can then perform element-wise calculations on height and weight. For example, you can take all 6 of the height and weight observations below and calculate the BMI for each observation with a single equation. These operations are very fast and computationally efficient. They are particularly helpful when you have thousands of observations in your data.

# Import the numpy package as np
import numpy as np
# Create 2 new lists height and weight
height = [1.87, 1.87, 1.82, 1.91, 1.90, 1.85]
weight = [81.65, 97.52, 95.25, 92.98, 86.18, 88.45]
# Create 2 numpy arrays from height and weight
np_height = np.array(height)
np_weight = np.array(weight)
# Print out the type of np_height
print(type(np_height))
# Calculate bmi
bmi = np_weight / np_height ** 2
# Print the result
print(bmi)

Subsetting

Another great feature of NumPy arrays is the ability to subset. For instance, if you wanted to know which observations in the BMI array are above 25, you could quickly subset it to find out.

# Import the numpy package as np
import numpy as np
# Create 2 new lists height and weight
height = [1.87, 1.87, 1.82, 1.91, 1.90, 1.85]
weight = [81.65, 97.52, 95.25, 92.98, 86.18, 88.45]
# Create 2 numpy arrays from height and weight
np_height = np.array(height)
np_weight = np.array(weight)
# Print out the type of np_height
print(type(np_height))
# Calculate bmi
bmi = np_weight / np_height ** 2
# Print only bmi > 25
print(bmi[bmi > 25])
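
To see what the subsetting expression does internally, it can help to look at the comparison on its own. The sketch below reuses the height and weight data from the example; the name mask is introduced here only for illustration.

```python
import numpy as np

np_height = np.array([1.87, 1.87, 1.82, 1.91, 1.90, 1.85])
np_weight = np.array([81.65, 97.52, 95.25, 92.98, 86.18, 88.45])
bmi = np_weight / np_height ** 2

# The comparison yields one boolean per observation.
mask = bmi > 25
print(mask)

# Count the matches and find their positions.
print(mask.sum())            # 4 observations are above 25
print(np.flatnonzero(mask))  # positions 1, 2, 3 and 5

# The same mask can subset any array of matching length.
print(np_height[mask])
```

Indexing with a boolean array keeps only the positions where the mask is True, which is why bmi[bmi > 25] returns just the matching values.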

Exercise

First, convert the list of weights from a list to a NumPy array. Then, convert all of the weights from kilograms to pounds. Use the scalar conversion of 2.2 lbs per kilogram to make your conversion. Lastly, print the resulting array of weights in pounds.

weight_kg = [35, 40, 45, 50, 55, 60, 65]
import numpy as np
# Create a numpy array np_weight_kg from weight_kg
# Create np_weight_lbs from np_weight_kg
# Print out np_weight_lbs

Solution:

weight_kg = [35, 40, 45, 50, 55, 60, 65]
import numpy as np
# Create a numpy array np_weight_kg from weight_kg
np_weight_kg = np.array(weight_kg)
# Create np_weight_lbs from np_weight_kg
np_weight_lbs = np_weight_kg * 2.2
# Print out np_weight_lbs
print(np_weight_lbs)

NumPy ufuncs

What are ufuncs?

ufunc stands for "Universal Function". ufuncs are NumPy functions that operate on the ndarray object.

Why use ufuncs?

ufuncs are used to implement vectorization in NumPy, which is much faster than iterating over elements. They also provide broadcasting and additional methods such as reduce and accumulate that are very helpful for computation.
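
A minimal sketch of those extras, using np.add as the example ufunc. The arrays arr and column are made up for illustration.

```python
import numpy as np

arr = np.array([1, 2, 3, 4])

# reduce applies the ufunc repeatedly until one value is left: 1 + 2 + 3 + 4.
print(np.add.reduce(arr))       # 10

# accumulate keeps every intermediate result: 1, 3, 6, 10.
print(np.add.accumulate(arr))

# Broadcasting: the scalar 10 is stretched to match the shape of arr,
# giving 11, 12, 13, 14.
print(np.add(arr, 10))

# Broadcasting a 2x1 column against a 4-element row produces a 2x4 result.
column = np.array([[0], [10]])
print(np.add(column, arr))
```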

ufuncs also take additional arguments, like:

  • where - boolean array or condition defining where the operations should take place.
  • dtype - defining the return type of elements.
  • out - output array where the return value should be copied.
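
The three arguments can be tried out with np.add. This is a small sketch rather than a full reference; the names as_float, result, and masked are chosen for this example.

```python
import numpy as np

x = np.array([1, 2, 3, 4])
y = np.array([4, 5, 6, 7])

# dtype: compute and return the sum as 64-bit floats instead of integers.
as_float = np.add(x, y, dtype=np.float64)
print(as_float.dtype)   # float64

# out: write the result into an existing array instead of allocating a new one.
result = np.zeros(4, dtype=x.dtype)
np.add(x, y, out=result)
print(result)           # holds 5, 7, 9, 11

# where: add only where the condition is True. Positions where it is False
# keep whatever out already held, so out is pre-filled with zeros here.
masked = np.zeros(4, dtype=x.dtype)
np.add(x, y, out=masked, where=x > 2)
print(masked)           # holds 0, 0, 9, 11
```

When where is used without out, the skipped positions are left uninitialized, so pairing the two is usually the safer choice.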

What is Vectorization?

Converting iterative statements into a vector-based operation is called vectorization.

It is faster because modern CPUs are optimized for such operations.

Add the Elements of Two Lists

    list 1: [1, 2, 3, 4]
    list 2: [4, 5, 6, 7]

One way of doing it is to iterate over both lists and sum each pair of elements.

Example Without ufunc:

x = [1, 2, 3, 4]
y = [4, 5, 6, 7]
z = []
for i, j in zip(x, y):
    z.append(i + j)
print(z)

NumPy has a ufunc for this, add(x, y), which produces the same values and returns them as a NumPy array.

Example With ufunc:

import numpy as np
x = [1, 2, 3, 4]
y = [4, 5, 6, 7]
z = np.add(x, y)
print(z)
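
As a short follow-up sketch, the snippet below checks that add really is a ufunc and contrasts the + operator on arrays with the + operator on plain lists. The arrays x and y hold the same values as the lists above.

```python
import numpy as np

x = np.array([1, 2, 3, 4])
y = np.array([4, 5, 6, 7])

# np.add is an instance of the np.ufunc type.
print(isinstance(np.add, np.ufunc))          # True

# On ndarrays, the + operator gives the same result as np.add.
print(np.array_equal(x + y, np.add(x, y)))   # True

# On plain lists, + concatenates instead of adding element-wise.
print([1, 2, 3, 4] + [4, 5, 6, 7])           # [1, 2, 3, 4, 4, 5, 6, 7]
```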
