{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Part 1 - Introduction to Python and the Jupyter Notebook" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this course we will make extensive use of Jupyter notebooks. Jupyter notebooks offer an interactive Python session inside a browser, with added functionality such as markdown support, widgets and much more. They are a great tool for data exploration and are frequently used in data science.\n", "\n", "To find out more:\n", "*http://jupyter.org/*\n", "\n", "This site has numerous examples and tutorials:\n", "*https://github.com/jupyter/jupyter/wiki/A-gallery-of-interesting-Jupyter-Notebooks*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Jupyter in the LEARN environment\n", "\n", "We are currently running a Python 3 Jupyter Notebook within the LEARN system. This means there is no need for you to set Jupyter up on your own system; you can do everything that you need for the upcoming lectures inside here. You may choose to install it locally for use during your coursework. If so, you will first need to [install Python](https://www.python.org/downloads/) and then visit the [Jupyter Install page](http://jupyter.org/install) and follow the instructions there." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Notebook Basics\n", "\n", "The notebook is organised in cells, each of which can hold markdown text (as this one) or Python code. 
To execute the code in a cell, press the key combination ``Shift`` + ``Return`` or click the \"Run\" button in the control bar along the top of the Notebook.\n", "\n", "You can try this in the cell below, where we assign a string to the variable ``a`` and then print it:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = 'ATG'\n", "print(a)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once the cell above has been executed, the interpreter knows about ``a``, so we can start working with it in the Python way:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "print(a+a)  # concatenate the string with itself\n", "print(a+a[::-1])  # append the reversed string\n", "print(len(a))  # length of the string\n", "print(len(a)**2)  # length squared" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Further Python Basics\n", "Using the same approach as we used for uploading this Python Notebook, you can download some more notebooks that introduce a range of basic Python concepts from the [Python-Lectures](https://github.com/rajathkmp/Python-Lectures/blob/master/01.ipynb) GitHub site." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Biopython\n", "\n", "*http://biopython.org/*\n", "\n", "Biopython provides tools for analysis of genomic and proteomic data. We will use this throughout the course, so make sure this runs on your computer.\n", "\n", "First we need to install the Biopython package in your environment. **You only need to do this once**; it will remain in your account space throughout the course." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "### Install the biopython package in your Noteable environment\n", "%pip install biopython\n", "\n", "import Bio" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can find an excellent [Biopython cookbook](http://biopython.org/DIST/docs/tutorial/Tutorial.html), written by the Biopython community, in the resource list for this course. Work through it to familiarise yourself with some of Biopython's functionality." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create a DNA sequence using Biopython" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from Bio.Seq import Seq\n", "my_seq = Seq(\"AGTACACTGGT\")\n", "print(my_seq)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Basic operations on DNA sequences" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#sequence length\n", "print(len(my_seq),\"nucleotides long\")\n", "\n", "#sequence %GC content\n", "#gc_fraction (Biopython 1.80+) replaces the removed GC function and returns a fraction between 0 and 1\n", "from Bio.SeqUtils import gc_fraction\n", "gc_percent = 100 * gc_fraction(my_seq)\n", "\n", "#simple print\n", "print(\"%GC content = \",gc_percent,\"%\")\n", "\n", "#printing to two decimal places\n", "print(\"%GC content = \"+'%4.2f' % gc_percent+\"%\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#original sequence\n", "print(\"original sequence\",my_seq)\n", "\n", "#sequence slicing NB this displays nucleotides 2-5\n", "print(\"indexing from 1->5\",my_seq[1:5])\n", "\n", "#the sequence is indexed from 0\n", "print(\"indexing from 0->5\",my_seq[0:5])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#complement of the sequence\n", "print(my_seq.complement())\n", "\n", "#reverse complement of the sequence\n", "print(my_seq.reverse_complement())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Biopython contains 
useful meta-data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from Bio.Data import CodonTable\n", "standard_table = CodonTable.unambiguous_dna_by_id[1]\n", "\n", "print(standard_table)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#and STOP codons\n", "print(standard_table.stop_codons)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Challenge 1 - Creating a Random Sequence\n", "An **optional** challenge to get a bit of basic practice manipulating sequences in Biopython. I will post example answers on the course Discussion Board before next week's class.\n", "\n", "Create a random DNA sequence of length 100 base pairs and print it out. HINT: you will need to use the Python ``random`` module." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Challenge 2 - Creating Mutated Sequences\n", "\n", "Another **optional** challenge; again, I will post a possible answer on the course Discussion Board before next week's class. First I will help out by introducing some functions that you will probably want to use. There are many ways to do this, so there is no one correct answer!\n", "\n", "Using the random sequence you created above, make 20 random mutations in it, replacing the original base with a random lower-case one. 
Then print the original sequence with the mutated sequence below it.\n", "\n", "HINT 1 - You will need to convert your random sequence into a ``MutableSeq`` object; see the Biopython cookbook for an example of it in use.\n", "\n", "HINT 2 - You will want to select a random position in the sequence; the ``random.randrange()`` function is a good fit for that.\n", "\n", "Your final output should look like this:\n", "\n", "```\n", "GTAAGCGCGTTGGGTTTGAAAGCCCACCGCAAAATGAAGCTCTAAGCAAACTGGGATAAATTGGCGACCCCGCACTGTTAGGACCGAAAGGTTTGTGACA\n", "cTgcGCcCGTTGGGTTTGcAcGCCCACtGtAAAATGAAGttCTAAGCAAACTGGGATAAATTGtCGtCtCCGCACTGTTgGGACCGgAAGGTTTGtGcCA\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Matplotlib" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Matplotlib is a library for data plotting, and is particularly useful in combination with ``numpy``. Try this:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.plot(np.random.rand(20),np.random.rand(20),'o')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Python help system\n", "\n", "Python libraries and functions are well documented. To see help for a function from a cell, type ``?function`` and execute the cell. 
This will show extended documentation at the bottom of the screen.\n", "\n", "Execute this cell to try:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "?plt.plot" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Shutting Down the Notebook\n", "\n", "To shut down the Notebook, choose \"File->Close and Halt\" from the menu at the top of the Notebook.\n", "\n", "### Questions?\n", "\n", "If you have any questions, feel free to post them on the Discussion Forum in the \"Tutorial Information & Discussion\" channel." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3.8.3 ('base')", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.3" }, "vscode": { "interpreter": { "hash": "c79478e135452d4f8dcea3898ce85a4457be8d06848dc07bbec8d2854f4ceed7" } } }, "nbformat": 4, "nbformat_minor": 2 }