---
jupytext:
  formats: md:myst
  text_representation:
    format_name: myst
kernelspec:
  display_name: Python 3
  name: python3
---

```{code-cell} ipython3
---
tags: [remove-cell]
---
# Notebook setup
import os
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('../../site.mplstyle')
%matplotlib inline

from numpy_survey_results.utils import gluval, gen_mdlist

# Location of generated content
os.makedirs('_generated', exist_ok=True)

# For variable integration
from myst_nb import glue
```

# Priorities

```{code-cell} ipython3
---
tags: [hide-input]
---
fname = "data/2021/numpy_survey_results.tsv"
column_names = [
    'website', 'performance', 'reliability', 'packaging', 'new_features',
    'documentation', 'other'
]
priorities_dtype = np.dtype({
    "names": column_names,
    "formats": ['U1'] * len(column_names),
})

data = np.loadtxt(
    fname, delimiter='\t', skiprows=3, dtype=priorities_dtype,
    usecols=range(58, 65), comments=None, encoding='UTF-16'
)

# Discard empty data
num_respondents = data.shape[0]
unstructured = data.view(np.dtype('(7,)U1'))
data = data[~np.any(unstructured == '', axis=1)]

glue('2021_num_prioritizers', gluval(data.shape[0], num_respondents), display=False)
```

We asked survey respondents to share their priorities for NumPy to get a sense
of the needs/desires of the NumPy community. Users were asked to rank the
following categories in order of priority:

```{code-cell} ipython3
---
tags: [hide-input]
---
for category in sorted(column_names[:-1]):
    print(f" - {category.replace('_', ' ').capitalize()}")
```

A write-in category (`Other`) was also included so that participants could
share priorities beyond those listed above.

## Overview

Of the {glue}`demographics.md::2021_num_respondents` survey participants,
{glue:text}`2021_num_prioritizers` shared their priorities for NumPy moving forward.
To get a sense of the overall relative "importance" of each of the categories,
the following figure summarizes the score for each category as determined by
the [Borda counting procedure for ranked-choice voting][borda-wiki].

[borda-wiki]: https://en.wikipedia.org/wiki/Borda_count

```{code-cell} ipython3
---
tags: [hide-input]
---
# Unstructured, numerical data
raw = data.view(np.dtype('U1')).reshape(-1, len(column_names)).astype(int)
borda = len(column_names) + 1 - raw
relative_score = np.sum(borda, axis=0)
relative_score = 100 * relative_score / relative_score.sum()
# Prettify labels for plotting
labels = np.array([l.replace('_', ' ').capitalize() for l in column_names])
I = np.argsort(relative_score)
labels, relative_score = labels[I], relative_score[I]

fig, ax = plt.subplots(figsize=(12, 8))
ax.barh(np.arange(len(relative_score)), relative_score, tick_label=labels)
ax.set_xlabel('Relative Borda score (%)')
ax.set_title("Overall Importance Score");
fig.tight_layout()
```

In {ref}`sec:2021_priorities` we will take a closer look at how things are
prioritized.

(sec:2021_priorities)=
## Top Priorities

The following figure shows the breakdown of the top priority items.

```{code-cell} ipython3
---
tags: [hide-input]
---
# Prettify labels for plotting
labels = np.array([l.replace('_', ' ').capitalize() for l in column_names])
# Collate top-priority data
cnts = np.sum(raw == 1, axis=0)
I = np.argsort(cnts)
labels, cnts = labels[I], cnts[I]

fig, ax = plt.subplots(figsize=(12, 8))
ax.barh(np.arange(cnts.shape[0]), 100 * cnts / cnts.sum(), tick_label=labels)
ax.set_title('Distribution of Top Priority')
ax.set_xlabel('Percent of Responses')
fig.tight_layout()
```

### Details

We asked respondents who shared their priorities to provide specifics on their
top two priorities. For example, if a user ranked "Performance" as a top
priority, they were asked to share any specific thoughts on how performance
could be improved.
The responses for each of the categories are provided below.

```{code-cell} ipython3
---
tags: [hide-input]
---
categories = {
    "docs", "newfeatures", "other", "packaging", "performance",
    "reliability", "website",
}

# Load the text responses for each category
response_dict = {}
for category in categories:
    responses = np.loadtxt(
        f"data/2021/{category}_comments_master.tsv", delimiter='\t',
        skiprows=1, usecols=0, dtype='U', comments=None
    )
    responses = responses[responses != '']
    response_dict[category] = responses

# Generate nicely-formatted lists
for category, responses in response_dict.items():
    gen_mdlist(responses, f"{category}_comments_list.md")

# Register number of responses in each category
for k, v in response_dict.items():
    glue(f"2021_num_{k}_comments", v.shape[0], display=False)
```

% TODO: This would be much more convenient if the MD tables could be included
% programmatically. For example, if myst-nb adds support for the
% IPython.display Markdown() function

#### Documentation

{glue:text}`2021_num_docs_comments` participants shared their thoughts on how
documentation could be improved.

````{admonition} Click to expand!
:class: toggle

```{include} _generated/docs_comments_list.md
```
````

#### New Features

{glue:text}`2021_num_newfeatures_comments` participants shared their thoughts
on new features to improve NumPy.

````{admonition} Click to expand!
:class: toggle

```{include} _generated/newfeatures_comments_list.md
```
````

#### Other

{glue:text}`2021_num_other_comments` participants selected "Other" as a top
priority:

````{admonition} Click to expand!
:class: toggle

```{include} _generated/other_comments_list.md
```
````

#### Packaging

{glue:text}`2021_num_packaging_comments` participants shared their thoughts on
how the packaging utilities in NumPy could be improved.

````{admonition} Click to expand!
:class: toggle

```{include} _generated/packaging_comments_list.md
```
````

#### Performance

{glue:text}`2021_num_performance_comments` participants shared thoughts on why
performance is a top priority and ideas on how it can be improved.

````{admonition} Click to expand!
:class: toggle

```{include} _generated/performance_comments_list.md
```
````

#### Reliability

{glue:text}`2021_num_reliability_comments` participants shared their thoughts on
reliability and how it can be improved.

````{admonition} Click to expand!
:class: toggle

```{include} _generated/reliability_comments_list.md
```
````

#### Website

Finally, {glue:text}`2021_num_website_comments` participants selected the NumPy
website as a top priority and shared their thoughts on how it could be improved.

````{admonition} Click to expand!
:class: toggle

```{include} _generated/website_comments_list.md
```
````

## Summary

The following figure shows the relative frequency of selection for each of the
listed categories[^other2021] at each priority level.

```{code-cell} ipython3
---
tags: [hide-input]
---
fig, axes = plt.subplots(3, 2, figsize=(12, 8))

for i, ax in enumerate(axes.ravel()):
    priority_level = i + 1
    cnts = np.sum(raw == priority_level, axis=0)[I]
    ax.barh(np.arange(cnts.shape[0]), 100 * cnts / cnts.sum(), tick_label=labels)
    ax.set_title(f"Priority: {priority_level}")
fig.tight_layout()
```

[^other2021]: Excluding `Other`, which was an optional category and therefore
accounts for the majority of the "lowest-priority" selections.
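
For readers who want to check the arithmetic behind the Overview and Summary figures, the following is a minimal sketch that applies the same Borda formula (`n + 1 - rank`) and the same per-level tallies to a small made-up ranking table. The names `toy_categories`, `toy_raw`, and the rankings themselves are hypothetical and are not survey data; the block is a plain fence, so it is not executed as part of the notebook.

```python
import numpy as np

# Hypothetical rankings, NOT survey data: each row is one respondent,
# each column a category, and each entry the rank given (1 = top priority).
toy_categories = ["performance", "documentation", "packaging"]
toy_raw = np.array([
    [1, 2, 3],
    [1, 3, 2],
    [2, 1, 3],
    [1, 2, 3],
])

# Borda points, using the same formula as the Overview cell: n + 1 - rank
n = len(toy_categories)
toy_borda = n + 1 - toy_raw
toy_score = toy_borda.sum(axis=0)
toy_relative = 100 * toy_score / toy_score.sum()

# Per-level tallies, as in the Summary figure: row k holds how often each
# category was placed at rank k + 1
toy_level_counts = np.array(
    [(toy_raw == level).sum(axis=0) for level in range(1, n + 1)]
)

for name, pct in zip(toy_categories, toy_relative):
    print(f"{name}: {pct:.1f}%")
```

Note that the constant offset in the Borda formula shifts every category's score by the same amount, so it does not change the ordering of the categories, only the relative percentages.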