15 changes: 15 additions & 0 deletions _freeze/html/profile_optimise_py/execute-results/html.json
@@ -0,0 +1,15 @@
{
"hash": "38a57a547f37c4dae26ea70c12f3c8a7",
"result": {
"engine": "knitr",
"markdown": "---\ntitle: \"Profiling and Optimising in Python :: CHEAT SHEET\"\ndescription: \" \"\nimage-alt: \"\"\nexecute:\n eval: true\n output: false\n warning: false\n---\n\n\n::: {.cell .column-margin}\n<a href=\"../profile_optimise_py.pdf\">\n<p><i class=\"bi bi-file-pdf\"></i> Download PDF</p>\n<img src=\"../pngs/profile_optimise_py.png\" width=\"200\" alt=\"\"/>\n</a>\n<br><br>\n:::\n\n\n## Basics\n\nAs code grows in complexity, it starts taking longer to execute. Improving code’s performance will makes the work easier and saves energy.\n\nThe Carpentries Incubator lesson can be found on the GitHub repository: https://github.com/carpentries-incubator/pando-python\n\n## Profiling\n\nProfiling is the process of measuring the performance of your code, and identifying which parts of your code are taking the most time to run.\n\nNot every script needs profiling: For example, if you have a one-off script that runs quickly, then profiling will not be necessary. However, if you have a long-running script that is expected to be run multiple times, then profiling can be very useful.\n\n### Types of profiling\n\n- **Manual profiling**: Use the `time` module to measure the time taken by different sections of the code.\n\n- **Function-level profiling**: Measures time taken by individual functions in the code.\n\n- **Line-level profiling**: Measures the time taken by individual lines of code.\n\n- **Timeline profiling**: Gives an idea of the execution of the code over time.\n\n- **Hardware metric profiling**: Measures hardware related metric.\n\n## Function-level profiling\n\nMeasures the time taken by individual functions in your code., to help identify which functions are taking the most time to execute.\n\n- `cProfile`: Commonly used tool for function-level profiling. 
It is a part of the Python standard library.\n\n- **Run**: `python -m cProfile my_script.py`\n\n- **Store output**: `python -m cProfile -o profile_output.prof my_script.py`\n\n- **Visualise output using `snakeviz`**: `snakeviz profile_output.prof`\n\n## Line-level profiling\n\nRecords the time taken to execute individual lines of code.\n\nPotential risks associated with optimising:\n\n- Performing optimisations in a way that make code harder to understand.\n\n- Changing the code with a potential to introduce new bugs.\n\n- Trying to optimise code that only takes up a minuscule fraction of the total runtime, you might end up spending more time on the optimisation than you actually gain from it.\n\n- `line_profiler`: Commonly used tool for line-level profiling. It is not a part of the Python standard library and needs to be installed using: `pip install “line_profiler[all]”`\n\n- **Decorate functions to be profiled**: `line_profiler` must be attached to specific functions and cannot attach to a full Python file or project. If your Python file has significant code in the global scope it will be necessary to move it into a new function which can then instead be called from global scope.\n\n ``` py\n from line_profiler import LineProfiler\n\n @profile\n def my_function(arg):\n ...\n return\n\n print(my_function(arg=arg_value))\n ```\n\n- **Run**: `python -m kernprof -lvr my_script.py`\n\n- **Output**: Outputs to a file, in this case, `my_script.py.lprof`. This file is not human-readable. 
To print it to console: `python -m line_profiler –rm my_script.py.lprof`\n\n## Optimisation\n\nOptimising is the process of improving the performance of your code by making changes to it.\n\nPotential risks associated with optimizing:\n\n- Performing optimisations in a way that make code harder to understand.\n\n- Changing the code with a potential to introduce new bugs.\n\n- Trying to optimise code that only takes up a minuscule fraction of the total runtime, you might end up spending more time on the optimisation than you actually gain from it.\n\n### Simple tips to optimise Python code\n\n- Use built-in functions and the standard library - don’t reinvent the wheel!\n\n- Use list comprehension when creating lists\n\n- Prefer tuples, dictionaries, and sets over lists, when appropriate\n\n- Minimise Python written: Use scientific Python packages (NumPy, Pandas). NumPy arrays support broadcasting - mathematical operation is applied element-wise looping over the array explicitly\n\n- Newer is often faster: When possible, use the latest versions of Python and packages\n\n- One large file is better than many small files\n\n- Avoid destroying and recreating objects",
"supporting": [],
"filters": [
"rmarkdown/pagebreak.lua"
],
"includes": {},
"engineDependencies": {},
"preserve": {},
"postProcess": true
}
}
Binary file added html/images/python.png
118 changes: 118 additions & 0 deletions html/profile_optimise_py.qmd
@@ -0,0 +1,118 @@
---
title: "Profiling and Optimising in Python :: CHEAT SHEET"
description: " "
image-alt: ""
execute:
eval: true
output: false
warning: false
---

```{r}
#| output: asis
#| echo: false
#| column: margin
source("common.R")
# use_cheatsheet_logo(
# "python",
# alt = "Python logo"
# )
sheet_name <- tools::file_path_sans_ext(knitr::current_input())
pdf_preview_link(sheet_name)
translation_list(sheet_name)
```

## Basics

As code grows in complexity, it starts taking longer to execute. Improving your code’s performance makes your work easier and saves energy.

The Carpentries Incubator lesson can be found on the GitHub repository: https://github.com/carpentries-incubator/pando-python

## Profiling

Profiling is the process of measuring the performance of your code, and identifying which parts of your code are taking the most time to run.

Not every script needs profiling: For example, if you have a one-off script that runs quickly, then profiling will not be necessary. However, if you have a long-running script that is expected to be run multiple times, then profiling can be very useful.

### Types of profiling

- **Manual profiling**: Use the `time` module to measure the time taken by different sections of the code.

- **Function-level profiling**: Measures time taken by individual functions in the code.

- **Line-level profiling**: Measures the time taken by individual lines of code.

- **Timeline profiling**: Gives an idea of the execution of the code over time.

- **Hardware metric profiling**: Measures hardware-related metrics.
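
Manual profiling from the list above can be sketched as follows. This is a minimal illustration, not part of the lesson: `build_squares` is a hypothetical workload, and `time.perf_counter()` is assumed as the clock because it is intended for measuring short durations.

```python
import time


def build_squares(n):
    """Hypothetical workload: build a list of the first n squares."""
    return [i * i for i in range(n)]


# Record the clock before and after the section of interest
start = time.perf_counter()
squares = build_squares(100_000)
elapsed = time.perf_counter() - start

print(f"build_squares took {elapsed:.4f} s")
```

Timings from a single run are noisy, so it usually helps to repeat the measurement a few times before drawing conclusions.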

## Function-level profiling

Measures the time taken by individual functions in your code, to help identify which functions take the most time to execute.

- `cProfile`: Commonly used tool for function-level profiling. It is a part of the Python standard library.

- **Run**: `python -m cProfile my_script.py`

- **Store output**: `python -m cProfile -o profile_output.prof my_script.py`

- **Visualise output using `snakeviz`**: `snakeviz profile_output.prof`
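
`cProfile` can also be driven from inside a script, which may be convenient when only part of a program is of interest. The sketch below assumes two hypothetical functions, `slow_sum` and `main`, and uses the standard library `pstats` module to print the most expensive calls.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical hot spot: sum of squares with an explicit loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def main():
    return slow_sum(200_000)


# Collect profiling data only while main() runs
profiler = cProfile.Profile()
profiler.enable()
result = main()
profiler.disable()

# Format the five entries with the largest cumulative time
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

A file written with `-o`, such as `profile_output.prof`, can be loaded the same way by passing its path to `pstats.Stats`.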

## Line-level profiling

Records the time taken to execute individual lines of code.

- `line_profiler`: Commonly used tool for line-level profiling. It is not a part of the Python standard library and needs to be installed using: `pip install "line_profiler[all]"`

- **Decorate functions to be profiled**: `line_profiler` must be attached to specific functions and cannot attach to a full Python file or project. If your Python file has significant code in the global scope, move it into a new function that is then called from the global scope.

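``` py
# Needs line_profiler 4.1 or newer; `kernprof -l` also provides `profile` as a builtin
from line_profiler import profile

@profile
def my_function(arg):
    ...  # code to be profiled
    return

arg_value = ...  # placeholder argument
print(my_function(arg=arg_value))
```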

- **Run**: `python -m kernprof -lvr my_script.py`

- **Output**: Results are saved to a file, in this case `my_script.py.lprof`. This file is not human-readable. To print it to the console: `python -m line_profiler -rm my_script.py.lprof`
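
As a rough sketch of moving global-scope code into a function so that `line_profiler` can attach to it: the functions `count_multiples` and `main` below are hypothetical, and the `try`/`except` fallback is an assumption added here so that the script still runs where `line_profiler` (4.1 or newer) is not installed.

```python
try:
    # Provided by the third-party line_profiler package (4.1 or newer)
    from line_profiler import profile
except ImportError:
    # Fallback: a no-op decorator so the script runs without line_profiler
    def profile(func):
        return func


@profile
def count_multiples(limit, divisor):
    """Hypothetical workload: count multiples of divisor in 1..limit."""
    count = 0
    for value in range(1, limit + 1):
        if value % divisor == 0:
            count += 1
    return count


@profile
def main():
    # Work that previously sat in the global scope now lives in a function
    return count_multiples(10_000, 7)


if __name__ == "__main__":
    print(main())
```

Run through `kernprof` as described above, each decorated function gets its own per-line timing table.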

## Optimisation

Optimising is the process of improving the performance of your code by making changes to it.

Potential risks associated with optimising:

- Performing optimisations in a way that makes code harder to understand.

- Changing the code in ways that could introduce new bugs.

- Optimising code that accounts for only a minuscule fraction of the total runtime: you might end up spending more time on the optimisation than you actually gain from it.

### Simple tips to optimise Python code

- Use built-in functions and the standard library - don’t reinvent the wheel!

- Use list comprehension when creating lists

- Prefer tuples, dictionaries, and sets over lists, when appropriate

- Minimise Python written: Use scientific Python packages (NumPy, Pandas). NumPy arrays support vectorised operations and broadcasting: a mathematical operation is applied to every element of the array in compiled code, which is usually much faster than an explicit Python for-loop

- Newer is often faster: When possible, use the latest versions of Python and packages

- One large file is better than many small files: each file that is opened adds access overhead

- Avoid destroying and recreating objects
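
Two of the tips above (list comprehensions, and sets for membership tests) can be compared with the standard library `timeit` module. This is only a sketch: the data sizes and function names are arbitrary assumptions, and actual timings depend on the machine and Python version.

```python
import timeit

values = list(range(10_000))
lookup_set = set(values)          # same items, stored as a set
targets = range(9_900, 10_000)    # items to search for


def squares_loop():
    """Build a list with an explicit loop and append."""
    result = []
    for v in values:
        result.append(v * v)
    return result


def squares_comprehension():
    """Build the same list with a list comprehension."""
    return [v * v for v in values]


def hits_in_list():
    """Membership tests against a list scan the list each time."""
    return sum(1 for t in targets if t in values)


def hits_in_set():
    """Membership tests against a set use hashing."""
    return sum(1 for t in targets if t in lookup_set)


for func in (squares_loop, squares_comprehension, hits_in_list, hits_in_set):
    seconds = timeit.timeit(func, number=5)
    print(f"{func.__name__:>22}: {seconds:.4f} s")
```

Both pairs of functions return the same results, so any difference in the printed times comes from the choice of construct or data structure alone.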
Binary file added pngs/profile_optimise_py.png
Binary file added powerpoints/profile_optimise_py.pptx
Binary file not shown.
Binary file added profile_optimise_py.pdf
Binary file not shown.