lecture.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Extended Data Types\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Agenda\n",
    "\n",
    "* Lists (repetition)\n",
    "* Tuples\n",
    "* Sets\n",
    "* Dictionaries"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    " ## Introduction\n",
    "\n",
    "* We already know lists\n",
    "* Lists are only one of 4 built-in data types in Phython that store collections of data:\n",
    "  * Lists\n",
    "  * Tuples\n",
    "  * Sets\n",
    "  * Dictionaries\n",
    "* Collections are used to store multiple items in a single variable.\n",
    "* Working with collections is something that makes Python extremly powerful\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Recap of what we know about Lists\n",
    "\n",
    "1. Lists are created with square brackets ```[ ]```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "list = [1,2,5,3]\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "2. List items are ordered, changeable, and allow duplicate values\n",
    "    * Orders: Items have a defined order, and that order will not change unless we add or remove items\n",
    "    * Changeable: We can change, add, and remove items in a list after it has been created\n",
    "3. We access list items by their index (the position in the list, starting at 0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "3\n"
     ]
    }
   ],
   "source": [
    "list = [1,2,3,3,2]\n",
    "print(list[3])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Accessing List Items\n",
    "\n",
    "Besides accessing list items with ```[index]``` Python offers some additional options:\n",
    "\n",
    "1. Negative Indexing: ```[-1]``` refers to the last item, ```[-2]``` refers to the second last item etc.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "d\n",
      "c\n"
     ]
    }
   ],
   "source": [
    "my_list = ['a','b','c','d']\n",
    "print(my_list[-1])\n",
    "print(my_list[-2])\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "2. Instead of selecting a single item, we can also select a subset of the list with the schema ```start-index:end-index```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['b', 'c']\n"
     ]
    }
   ],
   "source": [
    "my_list = ['a','b','c','d']\n",
    "print(my_list[1:3])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## List Functions\n",
    "\n",
    "We have already seen the most important list methods (functions). Here is a complete table:\n",
    "\n",
    "| Method    | Description                                                                 |\n",
    "| :-------- | :-------------------------------------------------------------------------- |\n",
    "| append()  | Adds an element at the end of the list                                      |\n",
    "| clear()   | Removes all elements from the list                                          |\n",
    "| copy()    | Returns a copy of the list                                                  |\n",
    "| count()   | Returns the number of elements with the specified value                     |\n",
    "| extend()  | Adds an element of a list (or any iterable), to the end of the current list |\n",
    "| index()   | Returns the index of the first element with the specified value             |\n",
    "| insert()  | Adds an element at the specified position                                   |\n",
    "| pop()     | Removes the element at the specified position                               |\n",
    "| remove()  | Removes the item with the specified value                                   |\n",
    "| reverse() | Reverses the order of the list                                              |\n",
    "| sort()    | Sorts the list                                                              |\n",
    "\n",
    "See [this page](https://www.w3schools.com/python/python_lists_methods.asp) for more details."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Tuples\n",
    "\n",
    "* A tuple is a collection that is ordered and **unchangeable**.\n",
    "* Tuples are written with round brackets.\n",
    "* Tuples are ordered and allow duplicate values.\n",
    "* tems in tuples are accessed in the same way as lists, using ```[]```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "mio\n"
     ]
    },
    {
     "ename": "TypeError",
     "evalue": "'tuple' object does not support item assignment",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mTypeError\u001b[0m                                 Traceback (most recent call last)",
      "Cell \u001b[0;32mIn[10], line 4\u001b[0m\n\u001b[1;32m      1\u001b[0m my_tuple \u001b[38;5;241m=\u001b[39m (\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmia\u001b[39m\u001b[38;5;124m\"\u001b[39m, \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmio\u001b[39m\u001b[38;5;124m\"\u001b[39m, \u001b[38;5;241m1\u001b[39m)\n\u001b[1;32m      2\u001b[0m \u001b[38;5;28mprint\u001b[39m(my_tuple[\u001b[38;5;241m1\u001b[39m])\n\u001b[0;32m----> 4\u001b[0m \u001b[43mmy_tuple\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m0\u001b[39;49m\u001b[43m]\u001b[49m \u001b[38;5;241m=\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mciao\u001b[39m\u001b[38;5;124m\"\u001b[39m\n",
      "\u001b[0;31mTypeError\u001b[0m: 'tuple' object does not support item assignment"
     ]
    }
   ],
   "source": [
    "my_tuple = (\"mia\", \"mio\", 1)\n",
    "print(my_tuple[1])\n",
    "\n",
    "my_tuple[0] = \"ciao\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "* Tuples only know the methods ```count()```and ```index()```\n",
    "* We use tuples to make sure, our data will not be changed anywhere in the code\n",
    "* Tuples are faster and more memory-efficient than lists"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## When should we use lists and when should we use tuples?\n",
    "\n",
    "* Use Lists, when you want to change data in your collection\n",
    "  * Lists are more flexible and have more built-in methods, making them ideal for dynamic collections.\n",
    "* Use Tuples, when your data is immutable\n",
    "  * We use tuples to make sure, our data will not be changed anywhere in the code.\n",
    "  * Tuples are faster and more memory efficient than lists (especially during iterations)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Nested Lists and Tuples\n",
    "\n",
    "Since lists, tuples, and sets can contain any datatype we can also nest them:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2\n",
      "2\n",
      "4\n"
     ]
    }
   ],
   "source": [
    "my_nested_list = [[1,2],[3,4]] # nested list (2D Array)\n",
    "my_nested_tuble = ((1,2),(3,4)) # nested tuble (2D Array)\n",
    "my_nested_3D_Matrix = ((1,2,5),\n",
    "                       (3,4,2),\n",
    "                       (3,3,3)) # Matrix\n",
    "\n",
    "\n",
    "print(my_nested_list[0][1])\n",
    "print(my_nested_tuble[0][1])\n",
    "print(my_nested_3D_Matrix[1][1])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Sets\n",
    "* A set is a collection which is unordered, and unindexed and its itmes are unchangeable.\n",
    "  * Once a set is created, you cannot change its items, but you can add new items.\n",
    "  * Duplicated items will be treated as one item\n",
    "* Like lists and tuples, sets can contain multiple data types\n",
    "* You cannot access items in a set by referring to an index or a key (since they have no order)\n",
    "* Sets are written with curly brackets."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1\n",
      "3\n",
      "5\n"
     ]
    },
    {
     "ename": "TypeError",
     "evalue": "'set' object is not subscriptable",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mTypeError\u001b[0m                                 Traceback (most recent call last)",
      "Cell \u001b[0;32mIn[1], line 5\u001b[0m\n\u001b[1;32m      3\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m x \u001b[38;5;129;01min\u001b[39;00m my_set:\n\u001b[1;32m      4\u001b[0m   \u001b[38;5;28mprint\u001b[39m(x)\n\u001b[0;32m----> 5\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[43mmy_set\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m2\u001b[39;49m\u001b[43m]\u001b[49m)\n",
      "\u001b[0;31mTypeError\u001b[0m: 'set' object is not subscriptable"
     ]
    }
   ],
   "source": [
    "my_set = {1, 5, 3, 1} # last value will be ignored\n",
    "my_set.add(1) # will be ignored\n",
    "for x in my_set:\n",
    "  print(x)\n",
    "print(my_set[2])\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Example of using Sets\n",
    "\n",
    "* Sets are highly useful to efficiently remove duplicate values from a collection like a list and to perform common math operations like unions and intersections.\n",
    "* We can use **casting** (remember) between lists and sets to remove dublicats from a list"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{1, 2, 3, 4, 5}\n",
      "[1, 2, 3, 4, 5]\n",
      "[1, 2, 3, 4, 5]\n"
     ]
    }
   ],
   "source": [
    "my_list = [1,2,3,4,5,1,2,3,3]\n",
    "my_set = set(my_list) # remove duplicates by converting to set\n",
    "print(my_set)\n",
    "my_list = list(my_set) # convert back to list to make it mutable\n",
    "print(my_list)\n",
    "\n",
    "## Remove dublicates in one line\n",
    "my_list = [1,2,3,4,5,1,2,3,3]\n",
    "my_list = list(set(my_list))\n",
    "print(my_list)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Dictionaries I\n",
    "\n",
    "* If we want to have a more dynamic and structured way to access information (especially nested information), dictionaries are a very useful concept.\n",
    "* Dictionaries to store data values in key:value pairs.\n",
    "* A dictionary is a collection which is ordered*, changeable and do not allow duplicates.\n",
    "* Dictionaries are written with curly brackets, and have keys and values, sepperated by a ```:```\n",
    "* Keys are allways a String, e.g. ```\"title\"```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "song = {\n",
    "    \"title\" : \"Another brick in the wall\",\n",
    "    \"artist\" : \"Pink Floyd\",\n",
    "    \"album\" : \"The Wall\",\n",
    "    \"year\" : 1979\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Dictionaries II\n",
    "\n",
    "* Since the key must be unike (like in a set), dublicates are not allowed.\n",
    "* Dicitionaries are accessd by the key in ```[]```or with the ```get()```method.\n",
    "* Teh advantage of the get method: If the key is not found, we can define a default value.\n",
    "* With the ```keys()``` method, we can access all keys of a dictionary."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Another brick in the wall\n",
      "Another brick in the wall\n",
      "unknown\n",
      "dict_keys(['title', 'artist', 'album', 'year'])\n"
     ]
    }
   ],
   "source": [
    "song = {\n",
    "    \"title\" : \"Another brick in the wall\",\n",
    "    \"artist\" : \"Pink Floyd\",\n",
    "    \"album\" : \"The Wall\",\n",
    "    \"year\" : 1979\n",
    "}\n",
    "print (song[\"title\"])\n",
    "print (song.get(\"title\"))\n",
    "print (song.get(\"duration\", \"unknown\"))\n",
    "print (song.keys())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Important Methods of Dictionaries\n",
    "\n",
    "* ```update()``` : change or add an item in/to the dictionary\n",
    "* ```pop()``` : remove an item from the dictionary\n",
    "* ```copy()``` : creates a copy (not an object reference) of the dictionary"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'title': 'Another brick in the wall', 'artist': 'Pink Floyd', 'year': 1976, 'duration': '3:30', 'tempo': 105}\n"
     ]
    }
   ],
   "source": [
    "song = {\n",
    "    \"title\" : \"Another brick in the wall\",\n",
    "    \"artist\" : \"Pink Floyd\",\n",
    "    \"album\" : \"The Wall\",\n",
    "    \"year\" : 1979\n",
    "}\n",
    "song.update({\"duration\" : \"3:30\"}) # Add key-value pair\n",
    "song.update({\"year\" : 1976}) # Update value\n",
    "song[\"tempo\"] = 105 # Alternative way to add key-value pair\n",
    "song.pop(\"album\") # Remove key-value pair\n",
    "print (song)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Iterations on Dictionaries\n",
    "\n",
    "* When we do a ```for x in dict``` iteration\n",
    "* x will be the key the key for each key:value pair in the dicitonary\n",
    "* If you prefer the values instead you can use the ```values()``` method of the dictionary\n",
    "* If you want both, keys and values: use the ``ìtems()``` method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "title: Another brick in the wall | artist: Pink Floyd | album: The Wall | year: 1979 | \n",
      "Another brick in the wall, Pink Floyd, The Wall, 1979, title :  Another brick in the wall\n",
      "artist :  Pink Floyd\n",
      "album :  The Wall\n",
      "year :  1979\n"
     ]
    }
   ],
   "source": [
    "song = {\n",
    "    \"title\" : \"Another brick in the wall\",\n",
    "    \"artist\" : \"Pink Floyd\",\n",
    "    \"album\" : \"The Wall\",\n",
    "    \"year\" : 1979\n",
    "}\n",
    "\n",
    "for key in song:\n",
    "    print(key, end = \": \")\n",
    "    print(song[key], end = \" | \")\n",
    "print()\n",
    "\n",
    "for value in song.values():\n",
    "    print(value, end = \", \" )\n",
    "\n",
    "for key, value in song.items():\n",
    "    print(key, \": \", value)\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Nestet Dictionaries\n",
    "\n",
    "* Link lists and tuples, dicitonaries can be nested\n",
    "* But in contrast to lists, we can explain with the key, what the value is about\n",
    "* Thus, with dicitionaries we can build meaningful data strctures that can be processed by human and computers\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Led Zeppelin\n"
     ]
    }
   ],
   "source": [
    "music_library = {\n",
    "    \"rock\": {\n",
    "        \"band\": \"Led Zeppelin\",\n",
    "        \"album\": \"IV\",\n",
    "        \"year\": 1971\n",
    "    },\n",
    "    \"jazz\": {\n",
    "        \"artist\": \"Miles Davis\",\n",
    "        \"album\": \"Kind of Blue\",\n",
    "        \"year\": 1959\n",
    "    },\n",
    "    \"pop\": {\n",
    "        \"artist\": \"Taylor Swift\",\n",
    "        \"album\": \"1989\",\n",
    "        \"year\": 2014\n",
    "    }\n",
    "}\n",
    "\n",
    "print(music_library[\"rock\"][\"band\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## For Geeeks: What does copy() really do?\n",
    "\n",
    "* Copying objets is often useful, when we want to update an object without loosing the original.\n",
    "* But is copy just a shallow copy of the first level or a deep copy?\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'1': {'title': 'Another brick in the wall', 'artist': 'Pink Floyd'}, '2': {'title': 'Shine on you crazy diamond', 'artist': 'Pink Floyd'}}\n",
      "{'1': {'title': 'Wish you were here', 'artist': 'Pink Floyd'}, '2': {'title': 'Shine on you crazy diamond', 'artist': 'Pink Floyd'}}\n"
     ]
    }
   ],
   "source": [
    "song = {\n",
    "    \"title\" : \"Another brick in the wall\",\n",
    "    \"artist\" : \"Pink Floyd\"\n",
    "}\n",
    "best_of_album = { \"1\" : song, \"2\" : { \"title\" : \"Shine on you crazy diamond\", \"artist\" : \"Pink Floyd\" }}\n",
    "other_album = best_of_album.copy()\n",
    "print(other_album)\n",
    "song[\"title\"] = \"Wish you were here\" # change the reference\n",
    "print(other_album) # song title is changed in both dictionaries, thus copy is shallow copy\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## JSON\n",
    "\n",
    "* Dicitionaries in Phython are a representaiton of what we call a *data object*, more generally speaking\n",
    "* In Java Script programming language we use exactly the same notation to create objects.\n",
    "* This notation is call JSON (Java Script Object Notation)\n",
    "* If we use the kind on object notation as a means of communicaiton (e.g. in files or web communcation), we can very easily read these JSON strings an create data objects that can be processed directely in our code.\n",
    "* This is so convinient, that it has **become the standard in all kinds of data exchange**: RESTful Webservices, MQTT, Buisness-IT interfaces, Config files, ...\n",
    "* In Python we can use the JSON module to do that."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import json\n",
    "\n",
    "fileRow = '{\"title\" : \"Another brick in the wall\", \"artist\" : \"Pink Floyd\" }'\n",
    "song = json.loads(fileRow)\n"
   ]
  }
 ],
 "metadata": {
  "celltoolbar": "Slideshow",
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}