May 29, 2021
from numba import jit
import random

@jit(nopython=True)
def monte_carlo_pi(nsamples):
    acc = 0
    for i in range(nsamples):
        x = random.random()
        y = random.random()
        if (x ** 2 + y ** 2) < 1.0:
            acc += 1
    return 4.0 * acc / nsamples

Numba is designed for scientific computing. When used with NumPy, it generates specialized code for different array data types to optimize performance.
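To make the idea of type specialization concrete, the sketch below calls one jitted function with two different argument types and then lists the versions Numba compiled. It is only an illustration and not part of the original benchmark: the function name `add` is made up for the example, scalars are used instead of arrays to keep it short (the same mechanism applies to NumPy array dtypes), and the `try`/`except` fallback is an assumption added so the script still runs where Numba is not installed.

```python
try:
    from numba import njit
except ImportError:  # assumption: fall back to plain Python without Numba
    def njit(func):
        return func


@njit
def add(a, b):
    # Numba compiles a separate machine-code version for each
    # combination of argument types it encounters.
    return a + b


int_result = add(1, 2)        # first call with integers
float_result = add(1.5, 2.5)  # first call with floats

print(int_result, float_result)
# With Numba installed, `signatures` lists one entry per compiled version;
# the plain-Python fallback has no such attribute.
print(getattr(add, "signatures", "Numba not installed"))
```

With Numba available, the second `print` should show two entries, one for the integer call and one for the float call.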
For example, a logistic regression compiled with parallel=True:

import numba
import numpy as np

@numba.jit(nopython=True, parallel=True)
def logistic_regression(Y, X, w, iterations):
    for i in range(iterations):
        w -= np.dot(((1.0 /
              (1.0 + np.exp(-Y * np.dot(X, w)))
              - 1.0) * Y), X)
    return w

Now let's compare the same code before and after applying Numba, and against C++. As an example, we find all the prime numbers below 10 million, using the same algorithm in every version. The plain Python code:
import math
import time

def is_prime(num):
    if num == 2:
        return True
    if num <= 1 or not num % 2:
        return False
    for div in range(3, int(math.sqrt(num) + 1), 2):
        if not num % div:
            return False
    return True

def run_program(N):
    total = 0
    for i in range(N):
        if is_prime(i):
            total += 1
    return total

if __name__ == "__main__":
    N = 10000000
    start = time.time()
    total = run_program(N)
    end = time.time()
    print(f"total prime num is {total}")
    print(f"cost {end - start}s")

Execution time:
total prime num is 664579
cost 47.386465072631836s

The C++ code is as follows:
#include <iostream>
#include <cmath>
#include <time.h>
using namespace std;

bool isPrime(int num) {
    if (num == 2) return true;
    if (num <= 1 || num % 2 == 0) return false;
    double sqrt_num = sqrt(double(num));
    // Only odd divisors up to sqrt(num) need to be checked.
    for (int div = 3; div <= sqrt_num; div += 2) {
        if (num % div == 0) return false;
    }
    return true;
}

int run_program(int N) {
    int total = 0;
    // i must be initialized; reading an uninitialized int is undefined behavior.
    for (int i = 0; i < N; i++) {
        if (isPrime(i)) total++;
    }
    return total;
}

int main()
{
    int N = 10000000;
    clock_t start, end;
    start = clock();
    int total = run_program(N);
    end = clock();
    cout << "total prime num is " << total;
    cout << "\ncost " << (end - start) / ((double) CLOCKS_PER_SEC) << "s\n";
    return 0;
}

$ g++ isPrime.cpp -o isPrime
$ ./isPrime
total prime num is 664579
cost 2.36221s
Now the same Python code with Numba's @njit decorator added:

import math
import time
from numba import njit

# @njit is equivalent to @jit(nopython=True)
@njit
def is_prime(num):
    if num == 2:
        return True
    if num <= 1 or not num % 2:
        return False
    for div in range(3, int(math.sqrt(num) + 1), 2):
        if not num % div:
            return False
    return True

@njit
def run_program(N):
    total = 0
    for i in range(N):
        if is_prime(i):
            total += 1
    return total

if __name__ == "__main__":
    N = 10000000
    start = time.time()
    total = run_program(N)
    end = time.time()
    print(f"total prime num is {total}")
    print(f"cost {end - start}s")

Running it, you can see that the time drops from 47.39 seconds to about 3 seconds:
total prime num is 664579
cost 3.0948808193206787s

That's still a little slower than the 2.36 seconds of C++, and you might say Python still can't keep up. But there is more room for optimization: the outer for loop runs 10 million iterations, and Numba provides prange to run such a loop in parallel. Just change range to prange and pass parallel=True to the decorator; everything else stays the same. The modified code:
import math
import time
from numba import njit, prange

@njit
def is_prime(num):
    if num == 2:
        return True
    if num <= 1 or not num % 2:
        return False
    for div in range(3, int(math.sqrt(num) + 1), 2):
        if not num % div:
            return False
    return True

@njit(parallel=True)
def run_program(N):
    total = 0
    for i in prange(N):
        if is_prime(i):
            total += 1
    return total

if __name__ == "__main__":
    N = 10000000
    start = time.time()
    total = run_program(N)
    end = time.time()
    print(f"total prime num is {total}")
    print(f"cost {end - start}s")

Now run:
$ python isPrime.py
total prime num is 664579
cost 1.4398791790008545s

It's only 1.44 seconds, faster than the single-threaded C++ build above. Numba is really impressive! I ran it twice more to make sure I wasn't mistaken, and the average was about 1.4 seconds.
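One caveat when timing jitted code: Numba compiles a function the first time it is called, so a single time.time() measurement around the first call also includes compilation. The sketch below is one rough way to separate the two by timing a first and a second call. It reuses the article's is_prime and run_program; the helper name `timed`, the smaller `N`, the use of time.perf_counter(), and the plain-Python fallback for machines without Numba are all assumptions made for this example.

```python
import math
import time

try:
    from numba import njit, prange
except ImportError:  # assumption: run unaccelerated if Numba is absent
    prange = range

    def njit(*args, **kwargs):
        # Support both @njit and @njit(parallel=True).
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


@njit
def is_prime(num):
    if num == 2:
        return True
    if num <= 1 or not num % 2:
        return False
    for div in range(3, int(math.sqrt(num) + 1), 2):
        if not num % div:
            return False
    return True


@njit(parallel=True)
def run_program(N):
    total = 0
    for i in prange(N):  # `total += 1` is handled as a parallel reduction
        if is_prime(i):
            total += 1
    return total


def timed(n):
    # Hypothetical helper: returns the result and the elapsed wall time.
    start = time.perf_counter()
    result = run_program(n)
    return result, time.perf_counter() - start


N = 100_000  # smaller than the article's 10 million, to keep the demo short
first_total, first_time = timed(N)    # includes JIT compilation under Numba
second_total, second_time = timed(N)  # reuses the already compiled code
print(f"first call:  {first_total} primes in {first_time:.4f}s")
print(f"second call: {second_total} primes in {second_time:.4f}s")
```

With Numba installed, the second call is usually the better indicator of steady-state speed, since the compiled code is already cached in the process.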
Reprinted from: Rookie Python
That's all on how Python can run faster than C.