Jupyter notebooks for the "Data and Text Processing for Health and Life Sciences" book, covering Unix shell basics, text manipulation, and data processing workflows. Run them instantly in Google Colab - no local setup required. Licensed under CC BY 4.0.

lasigeBioTM/data-text-processing-notebooks

Data Processing Book Notebooks

This repository contains Jupyter notebooks used in the “Data and Text Processing for Health and Life Sciences” book.
The notebooks provide a step-by-step guide to data and text processing using practical shell scripting.

What You'll Learn:

  • Combine simple, powerful command-line tools, like digital LEGO bricks, for biological data work
  • Automate real-world biological data handling and retrieval
  • Extract information from web resources efficiently
  • Mine scientific literature using command-line techniques
  • Work with universal, open standard formats: TSV, CSV, XML, and OWL
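
As a minimal sketch of the "LEGO bricks" idea (not taken from the notebooks; the TSV records below are made up for illustration), several small tools can be chained with pipes so that each one does a single job:

```shell
# Made-up TSV records: identifier, name, category (illustrative only).
tsv_data=$(printf 'id1\tcaffeine\tdrug\nid2\taspirin\tdrug\nid3\tglucose\tsugar\n')

# cut keeps the third column, sort groups equal values together,
# uniq -c counts each group, and sed trims the padding uniq adds.
category_counts=$(printf '%s\n' "$tsv_data" | cut -f3 | sort | uniq -c | sed 's/^ *//')

echo "$category_counts"
```

Each stage reads plain text and writes plain text, so any stage can be swapped or extended without touching the others.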

Note: The notebooks include a fix for the new ChEBI 2.0 web interface, which currently lacks detailed cross-references on entry pages.

Contents

  • notebooks/ – Jupyter notebooks.
  • data/ – Files with data created and used in the notebooks.
  • scripts/ – Scripts created in the notebooks.

How to use in Google Colab

  1. Go to Google Colab.

  2. Click File > Open notebook, then select the GitHub tab.

  3. Enter the GitHub repository URL (e.g., https://github.com/lasigeBioTM/data-text-processing-notebooks) and select the notebook (e.g., notebooks/data-text-processing-notebooks-01-unix-shell.ipynb).

    Quick link method: Replace github.com with githubtocolab.com in any notebook URL.
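
The host swap can be sketched in the shell as follows. The URL is a hypothetical illustration: the notebook path comes from the step above, but the branch name `main` is an assumption and may differ in the actual repository.

```shell
# Hypothetical notebook URL; the branch name "main" is an assumption.
github_url="https://github.com/lasigeBioTM/data-text-processing-notebooks/blob/main/notebooks/data-text-processing-notebooks-01-unix-shell.ipynb"

# Rewrite only the host at the start of the URL; the rest of the path is kept as-is.
colab_url=$(printf '%s\n' "$github_url" | sed 's|^https://github\.com/|https://githubtocolab.com/|')

echo "$colab_url"
```

Anchoring the pattern at the start of the URL keeps any later occurrence of `github.com` in the path untouched.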

Viewing on GitHub

GitHub can render .ipynb files directly in the browser, so readers can view code and outputs without running them.

License

This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).
