
Introduction to the numpy library in python


Jun 01, 2021 Article blog




1. What is NumPy?

NumPy is an open-source Python library for scientific computing. With NumPy, arrays and matrices can be used naturally. NumPy contains a number of useful mathematical functions, including linear algebra, Fourier transforms, and random number generation.

The library's predecessor was a library for array operations that began development in 1995. Over time, NumPy has become the foundational package for most scientific computing in Python, including the deep learning frameworks that provide a Python interface.

2. Why use NumPy?

a) Convenient:

For the same numerical task, using NumPy is much easier than writing plain Python code. NumPy operates directly on arrays and matrices, which removes the need for many loop statements, and its numerous mathematical functions make the code much easier to write.
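
As a small sketch of this point, the following compares an explicit Python loop with the equivalent NumPy expression. The variable names and sample data are assumptions chosen for illustration; they do not come from the original article.

```python
import numpy as np

# Illustrative data (assumed for this sketch)
xs = [1.0, 2.0, 3.0, 4.0]
ys = [10.0, 20.0, 30.0, 40.0]

# Pure Python: an explicit loop over both lists
loop_result = []
for x, y in zip(xs, ys):
    loop_result.append(x * y + 1.0)

# NumPy: the same calculation written as a single array expression
vec_result = np.array(xs) * np.array(ys) + 1.0

print(loop_result)
print(vec_result)
```

Both versions compute the same values; the NumPy version simply leaves the looping to the library.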

b) Performance:

Array storage efficiency and input/output performance in NumPy are far superior to those of Python's built-in basic data structures, such as nested lists. The performance gain grows with the number of elements in the array, so NumPy has a clear advantage for operations on large arrays. For very large, terabyte-scale files, NumPy uses memory-mapped files for optimal read and write performance.
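
A minimal sketch of memory mapping with np.memmap is shown below. The file name, array shape and values are assumptions made for this example, and a real terabyte-scale file would of course be far larger.

```python
import os
import tempfile

import numpy as np

# Hypothetical file inside a temporary directory (assumed for this sketch)
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "data.bin")

# Create a file-backed array; data is paged between disk and memory on access
writer = np.memmap(path, dtype=np.float64, mode="w+", shape=(1000, 100))
writer[0, :] = 1.5          # write to the first row
writer.flush()              # push pending changes to disk
del writer                  # release the mapping

# Reopen the same file read-only and inspect one row
reader = np.memmap(path, dtype=np.float64, mode="r", shape=(1000, 100))
first_row_sum = float(reader[0].sum())
print(first_row_sum)
```

Only the row that is actually accessed needs to be read, which is why this approach suits files that do not fit in memory.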

c) Efficient:

Most of NumPy's code is written in C, which makes NumPy much more efficient than pure Python code.

Of course, NumPy also has its drawbacks. Although NumPy uses memory-mapped files for optimal data read and write performance, the size of available memory still limits its handling of very large, terabyte-scale files. In addition, NumPy's advantages are less obvious in areas other than scientific computing.


3. NumPy installation:

  1. Official website installation: http://www.numpy.org/.
  2. Pip installation: pip install numpy.
  3. LFD installation (for Windows users): http://www.lfd.uci.edu/~gohlke/pythonlibs/.
  4. Anaconda installation (recommended): Anaconda bundles many third-party libraries for scientific computing in Python, mainly for ease of installation. Download address: https://www.anaconda.com/download/.

4. NumPy basics:

NumPy's main object is a homogeneous multidimensional array: a table of elements that all have the same type. Dimensions in NumPy are called axes, and the number of axes is called the rank. NumPy's array class is called ndarray, often referred to simply as an array (a matrix is also represented as an array).

Common ndarray object properties are:

  • ndarray.ndim (the number of axes of the array; the number of axes is called the rank),
  • ndarray.shape (the dimensions of the array. This is a tuple of integers that gives the size of the array along each dimension. For example, a matrix with n rows and m columns has shape (n, m); the length of the shape tuple is therefore the rank, i.e. the ndim property),
  • ndarray.size (the total number of array elements, equal to the product of the elements of the shape tuple),
  • ndarray.dtype (an object describing the type of the elements in the array. A dtype can be created or specified using standard Python types; in addition, NumPy provides its own data types).
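
The properties listed above can be inspected directly. This is a minimal sketch; the array m and its contents are assumptions chosen for illustration.

```python
import numpy as np

# A matrix with 2 rows and 3 columns (illustrative values)
m = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float64)

print(m.ndim)   # number of axes (the rank)
print(m.shape)  # size along each axis
print(m.size)   # total number of elements
print(m.dtype)  # element type
```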

NumPy data types:

import numpy as np
a = np.dtype(np.int_)     # other types work the same way: np.int64, np.float32, ...
print(a) 

NumPy's built-in type codes:

int8, int16, int32 and int64 can be replaced by the strings 'i1', 'i2', 'i4' and 'i8', and so on for the other types.

import numpy as np
a = np.dtype('i8')    # also 'f8', 'i4', 'c16', 'S30' (a 30-character byte string), '>i4', ...
print(a)

You can indicate the byte order of the data type in memory: '>' means big-endian storage, '<' means little-endian storage, and '=' means the data is stored in the hardware's native order. Byte order only affects how the bytes of each value are laid out in underlying memory, which generally does not need to be considered when using Python for scientific calculations.
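
To make the byte-order markers concrete, the sketch below stores the same integer in both orders and prints the raw bytes. The variable names are assumptions for this example.

```python
import numpy as np

big = np.array([1], dtype=">i4")     # big-endian 32-bit integer
little = np.array([1], dtype="<i4")  # little-endian 32-bit integer

# The values are equal; only the byte layout in memory differs
print(big.tobytes())     # b'\x00\x00\x00\x01'
print(little.tobytes())  # b'\x01\x00\x00\x00'
print(big[0] == little[0])
```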


5. Create an array and view its properties:

(1) Create arrays from python lists and tuples with np.array:

import numpy as np
a = np.array([[1,2,3], [4, 5, 6]], dtype=int)
print(a.shape)       #  a.ndim, a.size, a.dtype

import numpy as np
a = np.array([(1,2,3), (4, 5, 6)], dtype=float)
print(a.shape)      #  a.ndim, a.size, a.dtype

(2) Create an array with np.arange().reshape():

import numpy as np
a = np.arange(10).reshape(2, 5) # create a two-dimensional array with 2 rows and 5 columns
# a three-dimensional array can be created the same way:
# a = np.arange(12).reshape(2,3,2)
print(a)

Exercise: determine the shape of each of the following three-dimensional arrays:

a = np.array([[[1,2,3], [4, 5, 6], [7, 8, 9]]])
b = np.array([[[1,2,3]], [[4, 5, 6]], [[7, 8, 9]]])
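
The shapes can be checked by printing the shape property. This short sketch reuses the two arrays defined above; counting brackets from the outside in gives the same answer.

```python
import numpy as np

a = np.array([[[1, 2, 3], [4, 5, 6], [7, 8, 9]]])
b = np.array([[[1, 2, 3]], [[4, 5, 6]], [[7, 8, 9]]])

print(a.shape)  # (1, 3, 3): one block of three rows with three elements each
print(b.shape)  # (3, 1, 3): three blocks, each with a single row of three elements
```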

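6. Basic operations:

import numpy as np
a = np.random.random(6)
b = np.random.rand(6)
c = np.random.randn(6)
print(a-b)                    # print(a+b), print(a*c) ...
# Operations on two-dimensional arrays
d = np.random.random((2,3))
e = np.random.randn(2, 3)
f = np.random.rand(2,3)
print(d-e)                    # print(d+f), print(e*f) ...
print(np.dot(a,b))          # review of matrix multiplication (for 1-D arrays this is the inner product)
print(a.dot(b))
# Comparison of the NumPy random functions np.random.random, np.random.randn and np.random.rand:
# (1) rand and random generate uniformly distributed pseudo-random numbers in the interval [0, 1).
# (2) randn generates pseudo-random numbers from the standard normal distribution (mean 0, variance 1).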

import numpy as np
a = np.ones((2,3)) 
b = np.zeros((2,3))
a*=3
b+=a

7. Common functions:

import numpy as np 
a = np.arange(10)
print(np.where(a > 5))    # np.where needs a condition; a > 5 is an illustrative choice
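
np.where always needs a condition as its first argument. The sketch below shows its two common forms; the condition a > 5 and the replacement value 0 are assumptions chosen for illustration.

```python
import numpy as np

a = np.arange(10)

# One-argument form: a tuple of index arrays where the condition holds
idx = np.where(a > 5)

# Three-argument form: take values from a where the condition holds, else 0
clipped = np.where(a > 5, a, 0)

print(idx[0])
print(clipped)
```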

8. Indexing, slicing and iteration:

import numpy as np
a = np.arange(10)**3
a[2]
a[2:5]
a[:6:2] = -1000       # set elements 0, 2 and 4 to -1000
a[::-1]               # reversed view of a
for i in a:
    print(i**(1/3.))  # the negative entries print nan


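# Indexing multidimensional arrays
b = np.arange(20).reshape(5,4)
b[2,3]
b[0:5, 1]
b[:, 1]
b[1:3, :]
# When fewer indices than axes are provided, the missing indices are treated as complete slices
b[-1]    # equivalent to b[-1, :]
# The expression in the brackets of b[i] is treated as i followed by as many : as are needed
# to represent the remaining axes. NumPy also lets you write this with dots, as b[i, ...].
# The dots (...) stand for as many colons as are needed to produce a complete index tuple.
# If x is an array of rank 5 (i.e. it has 5 axes), then x[1,2,...] is equivalent to x[1,2,:,:,:],
# x[...,3] is equivalent to x[:,:,:,:,3], and x[4,...,5,:] is equivalent to x[4,:,:,5,:].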
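
The equivalences for the ellipsis (...) can be verified on a small rank-5 array. This is a sketch only: the shape (2, 3, 4, 5, 6) and the index values are assumptions chosen so that every index is in range.

```python
import numpy as np

# Hypothetical rank-5 array (shape chosen for this sketch)
x = np.arange(2 * 3 * 4 * 5 * 6).reshape(2, 3, 4, 5, 6)

# Each pair indexes the same elements; ... expands to the missing colons
same_1 = np.array_equal(x[1, 2, ...], x[1, 2, :, :, :])
same_2 = np.array_equal(x[..., 3], x[:, :, :, :, 3])
same_3 = np.array_equal(x[1, ..., 4, :], x[1, :, :, 4, :])

print(same_1, same_2, same_3)
```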


# Indexing three-dimensional arrays
c = np.arange(12).reshape(2,3,2)
c[1]
c[1,1]    # equivalent to c[1][1]
c[1,1,1]  # equivalent to c[1][1][1]


# Indexing with arrays
d = np.arange(10)**2
e = np.array([3, 5, 6])
d[e]      # exercise: what does this evaluate to?


# Exercise: apply the same method to a two-dimensional array.
# Indexing with boolean arrays
f = np.arange(12).reshape(3, 4)
g = f>4
print(g)
f[g]
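
The following sketch works through the two indexing styles above and states what each one selects. It reuses the names d, e, f and g from the examples; picked and selected are names introduced here for illustration.

```python
import numpy as np

# Indexing with an integer array picks the elements at those positions
d = np.arange(10) ** 2
e = np.array([3, 5, 6])
picked = d[e]            # the squares of 3, 5 and 6: values 9, 25, 36
print(picked)

# Indexing with a boolean array keeps the elements where the mask is True
f = np.arange(12).reshape(3, 4)
g = f > 4                # boolean array with the same shape as f
selected = f[g]          # one-dimensional result: values 5 through 11
print(selected)
```

Note that boolean indexing always returns a one-dimensional array, regardless of the shape of f.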

Iterating over a multidimensional array is done with respect to the first axis:

h = np.arange(12).reshape(3,4)
for i in h:
   print(i)

If we want to perform an operation on every element of the array, we can use the flat property, which is an iterator over all the elements of the array:

for i in h.flat:
    print(i)

Addendum: using flatten(). The ndarray.flatten() method returns a copy of the array collapsed into one dimension. However, this method is only available on NumPy objects, i.e. arrays or matrices; an ordinary Python list does not have it.

import numpy as np
a = np.array([[1,2], [3, 4], [5, 6]])
a.flatten()      # array([1, 2, 3, 4, 5, 6])
b = np.asmatrix([[1,2,3], [4, 5, 6]])   # np.mat in older NumPy versions
b.flatten()      # a matrix stays two-dimensional: shape (1, 6)
c = [[1,2,3], [4, 5, 6]]
# c.flatten()    # AttributeError: a plain list has no flatten() method

For a list, the same effect can be achieved with a list comprehension:

[y for x in c for y in x]

9. Shape operation:

ravel() , vstack() , hstack() , column_stack , row_stack , stack , split , hsplit , vsplit

import numpy as np


# Adding a dimension
a = np.arange(5)
a[:, np.newaxis]     # shape (5, 1)
a[np.newaxis, :]     # shape (1, 5)
np.tile([1,2], 2)    # repeat the sequence twice: array([1, 2, 1, 2])


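# Merging
a = np.arange(10).reshape(2,5)
print(a.ravel())
a.resize(5,2)                     # resize works in place and returns None
print(a)
b = np.arange(6).reshape(2,3)
c = np.ones((2,3))
d = np.hstack((b,c))              # hstack: horizontal stack, merges side by side
e = np.vstack((b,c))              # vstack: vertical stack, merges top to bottom
f = np.column_stack((b,c))
g = np.row_stack((b,c))           # alias of vstack (deprecated in NumPy 2.0)
h = np.stack((b, c), axis=1)      # stack along a new axis 1, result shape (2, 2, 3)
i = np.stack((b,c), axis=0)       # stack along a new axis 0, result shape (2, 2, 3)
j = np.concatenate((b, c, c, b), axis=0)   # merge several arrays at once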

 
# Splitting
k = np.hsplit(i, 2)
l = np.vsplit(i, 2)
m = np.split(i, 2, axis=0)
n = np.split(i, 2,axis=1)

o = np.array_split(np.arange(10),3)   # split into parts of unequal size

10. Deep copy:

import numpy as np
a = np.arange(4)
b = a
c = a
d = b
a[0] = 10    # exercise: what are a, b, c and d now?
b = a.copy()
a[0] = 9
# exercise: what is b now?
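
The sketch below answers the questions in the exercise in reduced form: plain assignment only creates another name for the same array, while copy() creates independent data. The variable names follow the example above only loosely and are otherwise assumptions.

```python
import numpy as np

a = np.arange(4)
b = a            # plain assignment: b is another name for the same array
a[0] = 10
print(b[0])      # 10, because a and b are the same object

c = a.copy()     # deep copy: c owns its own data
a[0] = 9
print(c[0])      # still 10, the copy is unaffected

print(b is a, np.shares_memory(a, c))
```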

11. Broadcasting

Broadcasting is a powerful mechanism that allows NumPy to perform arithmetic on arrays of different shapes. We often have a small array and a large array, and want to use the small array several times to perform some operation on the large one.

Add a vector to each row of the matrix:

    import numpy as np
    a = np.array([[1,2,3], [4,5,6], [7,8,9]])
    b = np.array([10,10,10])
    c = np.tile(b, (3,1))    # stack 3 copies of b so that c has the same shape as a
    d = a + c
    # Using broadcasting instead:
    c = a + b

Broadcasting two arrays together follows these rules:

  1. If the two arrays differ in rank, prepend 1s to the shape of the lower-rank array until both shapes have the same length.
  2. Two arrays are compatible in a dimension if they have the same size in that dimension, or if one of them has size 1 in that dimension.
  3. Two arrays can be broadcast together if they are compatible in all dimensions.
  4. After broadcasting, each array behaves as if its shape were the element-wise maximum of the two input shapes.
  5. In any dimension where one array has size 1 and the other has size greater than 1, the first array behaves as if it had been copied along that dimension.
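
The rules above can be sketched with a column of shape (3, 1) and a row of shape (4,). The arrays and their values are assumptions for illustration; np.broadcast_to is used only to spell out the stretching that broadcasting performs implicitly.

```python
import numpy as np

col = np.arange(3).reshape(3, 1)   # shape (3, 1)
row = np.array([10, 20, 30, 40])   # shape (4,), treated as (1, 4) by rule 1

total = col + row                  # compatible in every dimension, result (3, 4)
print(total.shape)

# The explicit equivalent: stretch both inputs to (3, 4) first
explicit = np.broadcast_to(col, (3, 4)) + np.broadcast_to(row, (3, 4))
print(np.array_equal(total, explicit))

# Incompatible shapes, (3,) versus (4,), raise a ValueError
try:
    np.arange(3) + row
    compatible = True
except ValueError:
    compatible = False
print(compatible)
```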

That's all for this introduction to NumPy in Python. I hope it helps you.