This repository contains Jupyter notebooks used in the “Data and Text Processing for Health and Life Sciences” book.
The notebooks provide a step-by-step guide to data and text processing using practical shell scripting.
What You'll Learn:
- Combine simple, powerful command-line tools, like digital LEGO bricks, for biological data work
- Automate real-world biological data handling and retrieval
- Extract information from web resources efficiently
- Mine scientific literature using command-line techniques
- Work with universal, open standard formats: TSV, CSV, XML, and OWL
Note: The notebooks include a fix for the new ChEBI 2.0 web interface, which currently lacks detailed cross-references on entry pages.
- notebooks/ – Jupyter notebooks.
- data/ – Files with data created and used in the notebooks.
- scripts/ – Scripts created in the notebooks.
- Go to Google Colab.
- Click File > Open notebook, then select the GitHub tab.
- Enter your GitHub repo URL (e.g., https://github.com/lasigeBioTM/data-text-processing-notebooks) and select the notebook (e.g., notebooks/data-text-processing-notebooks-01-unix-shell.ipynb).

Quick link method: Replace github.com with githubtocolab.com in any notebook URL.
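The quick-link substitution can be sketched as a small shell function. This is only an illustration: the helper name `to_colab_url` is hypothetical and not part of this repository, and the URL below is a placeholder (USER, REPO, BRANCH, and the notebook name are assumptions to be replaced with real values).

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): rewrite a GitHub notebook URL
# so that it points to githubtocolab.com instead of github.com.
to_colab_url() {
  # Replace only the host at the start of the URL; the rest is left unchanged.
  printf '%s\n' "$1" | sed 's#^https://github\.com/#https://githubtocolab.com/#'
}

# Placeholder URL: USER, REPO, BRANCH and example.ipynb are illustrative only.
to_colab_url "https://github.com/USER/REPO/blob/BRANCH/notebooks/example.ipynb"
```

URLs that do not start with https://github.com/ pass through unchanged, so the function can be applied safely to a mixed list of links.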
GitHub can render .ipynb files directly in the browser, so readers can view code and outputs without running them.
This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).
