Usage

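import numpy as np
import matplotlib.pyplot as plt
# Assumption: `glue` is the MyST-NB helper; `gluval` and `flatten` are
# project helpers defined outside this section.
from myst_nb import glue

fname = "data/2020/numpy_survey_results.tsv"
column_names = [
    'using_random', 'bug', 'bug_resolution', 'bug_resolution_other',
    'unsolvable', 'unsolvable_resolution', 'unsolvable_resolution_other',
    'deprecation', 'deprecation_other'
]
# Read every field as a string of up to 1024 characters
featdep_dtype = np.dtype({
    "names": column_names,
    "formats": ['U1024'] * len(column_names),
})

# Skip the first three rows and load columns 87-95 only;
# comments=None keeps any '#' characters that appear in free-text answers
data = np.loadtxt(
    fname, delimiter='\t', skiprows=3, dtype=featdep_dtype,
    usecols=range(87, 96), comments=None
)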
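
The loading step above can be tried without the survey file. The following is a minimal sketch that assumes a made-up three-line preamble and two made-up data rows; the column subset and the sample answers are illustrative only and do not come from the real dataset.

```python
import io
import numpy as np

# Made-up stand-in for the survey file: three preamble rows, then two
# tab-separated data rows. The real file layout is not shown in this section.
raw = (
    "preamble row 1\n"
    "preamble row 2\n"
    "preamble row 3\n"
    "Yes\tNo\tDepends (see #1)\n"
    "No\t\tOne year\n"
)

names = ["using_random", "bug", "deprecation"]  # illustrative subset of columns
dtype = np.dtype({"names": names, "formats": ["U1024"] * len(names)})

sample = np.loadtxt(
    io.StringIO(raw), delimiter="\t", skiprows=3, dtype=dtype,
    usecols=range(3), comments=None,  # comments=None preserves the '#'
)

# Unanswered questions come through as empty strings
answered_bug = sample["bug"][sample["bug"] != ""]
print(sample.shape, answered_bug)
```

Reading every field as a string keeps the loading step simple; empty strings then mark questions a participant skipped, which is how the later cells filter responses.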

This section covers questions intended to give insight into new feature adoption, issue resolution, and the length of deprecation cycles.

New numpy.random Adoption

A new API for random number generation was added to numpy.random in version 1.17. We asked survey participants whether they were using the new random API. Of the 1236 survey participants, 596 (48%) shared whether they were using the new random API.

rand = data['using_random'][data['using_random'] != '']
labels, cnts = np.unique(rand, return_counts=True)

fig, ax = plt.subplots(figsize=(8, 8))
ax.pie(cnts, labels=labels, autopct='%1.1f%%')
fig.tight_layout()

glue(
    'num_random_users',
    gluval(rand.shape[0], data.shape[0]),
    display=False
)
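
The `gluval` helper used above is not defined in this section. As a rough sketch of how a "count (percentage)" figure such as "596 (48%)" can be produced, the hypothetical `response_rate` function below formats a response count against the total number of participants. The function name and the sample answers are assumptions for illustration, not the project's actual helper or data.

```python
import numpy as np


def response_rate(num_responses, num_participants):
    """Format a count with its share of all participants (hypothetical helper)."""
    pct = 100 * num_responses / num_participants
    return f"{num_responses} ({pct:.0f}%)"


# Made-up answers; empty strings stand for skipped questions
answers = np.array(["Yes", "", "No", "Yes", ""])
answered = answers[answers != ""]
labels, cnts = np.unique(answered, return_counts=True)

print(response_rate(answered.shape[0], answers.shape[0]))  # 3 (60%)
print(response_rate(596, 1236))  # 596 (48%)
```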

Handling Issues

We wanted to get a sense of how often users experience issues with NumPy, so we asked the following question:

In the last year, have you experienced problems in code you’ve written stemming from a problem in NumPy?

Of the 1236 survey participants, 885 (72%) responded to this question.

bug = data['bug'][data['bug'] != '']
labels, cnts = np.unique(bug, return_counts=True)

fig, ax = plt.subplots(figsize=(8, 8))
ax.pie(cnts, labels=labels, autopct='%1.1f%%', labeldistance=None)
ax.legend()
fig.tight_layout()

glue(
    'bug_reporters',
    gluval(bug.shape[0], data.shape[0]),
    display=False,
)

We asked those who reported experiencing issues what action(s) they took to resolve the issue.

bug_resolution = data['bug_resolution'][data['bug_resolution'] != '']
labels, cnts = np.unique(flatten(bug_resolution), return_counts=True)
# Sort options from least to most frequently selected
order = np.argsort(cnts)
labels, cnts = labels[order], cnts[order]

fig, ax = plt.subplots(figsize=(12, 8))
ax.barh(
    np.arange(len(labels)),
    100 * cnts / bug_resolution.shape[0], 
    tick_label=labels,
)
ax.set_xlabel('Percentage of Respondents')
fig.tight_layout()
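
Because respondents could select more than one action, each answer has to be split into individual selections before counting. The `flatten` helper used above is not defined in this section; the sketch below shows one way such a helper might work, assuming (hypothetically) that multiple selections are joined with a semicolon. The sample responses are made up.

```python
import numpy as np


def flatten(responses, delimiter=";"):
    """Split multi-select answers into one flat array of selections.

    Hypothetical stand-in for the project's helper; the delimiter is an
    assumption, not taken from the survey data.
    """
    return np.array(
        [choice for answer in responses for choice in answer.split(delimiter)]
    )


# Made-up multi-select responses
responses = np.array([
    "Reported the issue;Found a workaround",
    "Found a workaround",
    "Asked a colleague;Found a workaround",
])

labels, cnts = np.unique(flatten(responses), return_counts=True)
order = np.argsort(cnts)  # least to most frequently selected
labels, cnts = labels[order], cnts[order]

# Share of respondents (not of selections) who chose each option
pct = 100 * cnts / responses.shape[0]
```

Dividing by the number of respondents rather than the number of selections means the percentages can sum to more than 100 when people choose several options.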

Data Analysis with NumPy

Similar to the previous question, we tried to get a sense of how well NumPy meets users’ data analysis needs. We asked the following question:

In the last year, have you encountered a problem involving numerical data that you were unable to solve using NumPy?

Of the 1236 survey participants, 874 (71%) responded to the above question. Of those respondents, 164 (19%) reported a problem that they initially expected to be able to solve using NumPy but were unable to.

unsolvable = data['unsolvable'][data['unsolvable'] != '']
labels, cnts = np.unique(unsolvable, return_counts=True)
num_yes = np.sum(unsolvable == 'Yes')

fig, ax = plt.subplots(figsize=(8, 8))
ax.pie(cnts, labels=labels, autopct='%1.1f%%')
fig.tight_layout()

glue(
    'num_solvers',
    gluval(unsolvable.shape[0], data.shape[0]),
    display=False,
)
glue(
    'num_unsolved',
    gluval(num_yes, unsolvable.shape[0]),
    display=False
)

We asked those who responded “Yes” to the previous question what action(s) they took to resolve the issue.

resolution = data['unsolvable_resolution'][data['unsolvable'] == 'Yes']
resolution = resolution[resolution != '']
labels, cnts = np.unique(flatten(resolution), return_counts=True)
# Sort options from least to most frequently selected
order = np.argsort(cnts)
labels, cnts = labels[order], cnts[order]

fig, ax = plt.subplots(figsize=(12, 8))
ax.barh(
    np.arange(len(labels)),
    100 * cnts / resolution.shape[0], 
    tick_label=labels,
)
ax.set_xlabel('Percentage of Respondents')
fig.tight_layout()
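
The cell above selects values from one column using a condition on another column, then drops blank answers. The following sketch illustrates that two-step filter on a small structured array; the rows are made up and the field widths are arbitrary.

```python
import numpy as np

dtype = np.dtype({
    "names": ["unsolvable", "unsolvable_resolution"],
    "formats": ["U64", "U64"],
})

# Made-up rows: (answer to the yes/no question, follow-up answer)
rows = np.array(
    [
        ("Yes", "Used another library"),
        ("No", ""),
        ("Yes", ""),   # said "Yes" but skipped the follow-up
        ("", ""),      # skipped both questions
    ],
    dtype=dtype,
)

# Step 1: keep follow-up answers only from those who said "Yes"
said_yes = rows["unsolvable"] == "Yes"
resolution = rows["unsolvable_resolution"][said_yes]

# Step 2: drop respondents who left the follow-up blank
resolution = resolution[resolution != ""]
```

With this filter, the denominator for any follow-up percentage is the number of people who actually answered the follow-up question, not everyone who said "Yes".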

Deprecation Timeframe

We asked survey participants to share their opinion on the NumPy deprecation cycle, specifically:

What do you consider as a good deprecation time frame?

Of the 1236 survey participants, 863 (70%) responded to this question.

depcycle = data['deprecation'][data['deprecation'] != '']
labels, cnts = np.unique(depcycle, return_counts=True)

fig, ax = plt.subplots(figsize=(8, 8))
ax.pie(cnts, labels=labels, autopct='%1.1f%%')
fig.tight_layout()

glue(
    'dep_opinions',
    gluval(depcycle.shape[0], data.shape[0]),
    display=False
)