# Losing your Loops: Fast Numerical Computing with NumPy


By anonymous 2017-09-20

You can do this with NumPy built-ins using broadcasting. Broadcasting lets you combine arrays of different but compatible shapes elementwise, without making explicit copies or writing Python loops.

We can solve your problem by creating two vectors representing the row and column sums respectively, and 'multiplying' them together, which will broadcast them into a correctly sized and shaped array.

The best introduction to this topic I know of is the talk Losing your Loops: Fast Numerical Computing with NumPy by Jake VanderPlas. It contains visual examples that I find essential for wrapping your head around broadcasting.

Here's a simple example:

IN:

```
import numpy as np
a = np.arange(3)
b = np.reshape(np.arange(3), [3, 1])
print('a = ', a)
print('b = ')
print(b)
print('a+b = ')
print(a+b)
```

OUT:

```
a = [0 1 2]
b =
[[0]
 [1]
 [2]]
a+b =
[[0 1 2]
 [1 2 3]
 [2 3 4]]
```
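To make the "no excessive copies" point concrete, here is a small sketch using `np.broadcast_arrays`, which returns views of the inputs stretched to their common shape. The variable names `a_wide` and `b_wide` are only illustrative.

```python
import numpy as np

a = np.arange(3)                 # shape (3,)
b = np.arange(3).reshape(3, 1)   # shape (3, 1)

# Views of a and b, both stretched to the common shape (3, 3).
a_wide, b_wide = np.broadcast_arrays(a, b)
print(a_wide.shape, b_wide.shape)  # (3, 3) (3, 3)

# A stride of 0 along an axis means the same memory is reused along that axis,
# so the stretched arrays are not materialized as copies.
print(a_wide.strides[0], b_wide.strides[1])  # 0 0

# The stretched view still points at the original data.
print(np.shares_memory(a, a_wide))  # True
```

Writing `a + b` directly gives the same result as `a_wide + b_wide`; NumPy applies the stretching implicitly.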

Applied to your problem: create two vectors holding the row sums and the column sums, reshape the row sums into a column, and 'multiply' them together. Broadcasting expands the product into an array of the correct size and shape.

```
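import numpy as np

def gen_expected(array: np.ndarray) -> np.ndarray:
    # Column sums, shape (n_cols,)
    col_sums = np.sum(array, axis=0)
    # Row sums, shape (n_rows,)
    row_sums = np.sum(array, axis=1)
    # np.reshape returns a new array, so the result must be assigned:
    # row_sums becomes a column vector of shape (n_rows, 1)
    row_sums = np.reshape(row_sums, [len(row_sums), 1])
    # (n_cols,) * (n_rows, 1) broadcasts to (n_rows, n_cols)
    return (col_sums * row_sums) / np.sum(array)
# NOTE: the result has the same shape as `array`;
# entry [i, j] is row_sums[i] * col_sums[j] / total.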
```
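A quick way to check the orientation of the result is to compare it with an explicit outer product. The sketch below assumes the goal is the usual expected-count table (row sum times column sum divided by the grand total). It uses `keepdims=True` as an alternative to the explicit reshape, and the 2x3 `observed` table is made up for illustration.

```python
import numpy as np

def gen_expected(table: np.ndarray) -> np.ndarray:
    """Expected counts: row_sum * col_sum / total, via broadcasting."""
    col_sums = table.sum(axis=0, keepdims=True)  # shape (1, n_cols)
    row_sums = table.sum(axis=1, keepdims=True)  # shape (n_rows, 1)
    return row_sums * col_sums / table.sum()

# Hypothetical non-square table, so a transposed result would be visible.
observed = np.array([[10, 20, 30],
                     [40, 50, 60]])
expected = gen_expected(observed)
print(expected.shape)  # (2, 3)

# Cross-check against an explicit outer product of the marginals.
reference = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()
print(np.allclose(expected, reference))  # True
```

Using a non-square input is deliberate: with a square table a transposed result would have the same shape and could go unnoticed.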

By anonymous 2017-11-27

I have some data that I want to "one-hot encode" and it is represented as a 1-dimensional vector of positions.

**Is there any function in NumPy that can expand my x into my x_ohe?**

I'm trying to avoid using for-loops in Python at all costs for operations like this after watching Jake VanderPlas's talk. This is the loop version I have now:

```
import numpy as np

x = np.asarray([0, 0, 1, 0, 2])
x_ohe = np.zeros((len(x), 3), dtype=int)
# Set one entry per row: row i gets a 1 in column x[i]
for i, pos in enumerate(x):
    x_ohe[i, pos] = 1
x_ohe
# array([[1, 0, 0],
#        [1, 0, 0],
#        [0, 1, 0],
#        [1, 0, 0],
#        [0, 0, 1]])
```
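For reference, a few loop-free ways to build the same array are sketched below. These are not taken from the thread; the name `n_classes` is an assumption and stands for the number of columns, which is taken as known up front just as in the loop version.

```python
import numpy as np

x = np.asarray([0, 0, 1, 0, 2])
n_classes = 3  # assumed known, as in the loop version

# Option 1: integer-array indexing sets one entry per row in a single assignment.
x_ohe = np.zeros((len(x), n_classes), dtype=int)
x_ohe[np.arange(len(x)), x] = 1

# Option 2: select rows of the identity matrix by position.
x_ohe_eye = np.eye(n_classes, dtype=int)[x]

# Option 3: broadcasting, comparing a (5, 1) column against a (3,) row of class ids.
x_ohe_cmp = (x[:, None] == np.arange(n_classes)).astype(int)

print(np.array_equal(x_ohe, x_ohe_eye) and np.array_equal(x_ohe, x_ohe_cmp))  # True
```

All three produce the array shown in the loop version; the third is the closest in spirit to the broadcasting idea from the talk.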

By anonymous 2018-01-07

I already did! It's more of an abstract question really. I saw this video and it blew my mind: https://www.youtube.com/watch?v=EEUXKG97YRw. I was just trying to apply it to my work. No worries though, I will have a look at the pivot table post you linked to. Thanks for your help.


By anonymous 2017-09-20

Instead of substituting your for-loops with `lambdas`, try substituting them with ufuncs. Losing your Loops: Fast Numerical Computing with NumPy is an excellent talk by Jake VanderPlas on the subject. Using universal functions and broadcasting instead of for-loops can dramatically improve the speed of your code.

Here is a basic example:

INPUT:

OUTPUT:

Note that by eliminating the loop we get a 60x speedup!
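As a minimal sketch of the kind of comparison described (the function names and the array size are illustrative assumptions, not the original example), the same elementwise addition can be written with an explicit loop and with a ufunc. Timings vary by machine and array size, so no particular speedup is claimed here.

```python
import numpy as np
from timeit import timeit

def add_five_loop(values: np.ndarray) -> np.ndarray:
    # Explicit Python loop: one interpreter step per element.
    out = np.empty_like(values)
    for i in range(len(values)):
        out[i] = values[i] + 5
    return out

def add_five_ufunc(values: np.ndarray) -> np.ndarray:
    # The + operator dispatches to the np.add ufunc, which loops in compiled code.
    return values + 5

values = np.arange(100_000)

# Both versions compute the same result.
print(np.array_equal(add_five_loop(values), add_five_ufunc(values)))  # True

# Time a few repetitions of each.
loop_time = timeit(lambda: add_five_loop(values), number=3)
ufunc_time = timeit(lambda: add_five_ufunc(values), number=3)
print(f"loop: {loop_time:.4f}s  ufunc: {ufunc_time:.4f}s")
```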
