"source": [
"# Extended Data Types\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Agenda\n",
"\n",
"* Lists (repetition)\n",
"* Tuples\n",
"* Sets\n",
"* Dictionaries"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
" ## Introduction\n",
"\n",
"* We already know lists\n",
"* Lists are only one of 4 built-in data types in Phython that store collections of data:\n",
" * Lists\n",
" * Tuples\n",
" * Sets\n",
" * Dictionaries\n",
"* Collections are used to store multiple items in a single variable.\n",
"* Working with collections is something that makes Python extremly powerful\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Recap of what we know about Lists\n",
"\n",
"1. Lists are created with square brackets ```[ ]```"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"list = [1,2,5,3]\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"2. List items are ordered, changeable, and allow duplicate values\n",
" * Orders: Items have a defined order, and that order will not change unless we add or remove items\n",
" * Changeable: We can change, add, and remove items in a list after it has been created\n",
"3. We access list items by their index (the position in the list, starting at 0)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"3\n"
]
}
],
"source": [
"list = [1,2,3,3,2]\n",
"print(list[3])"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Accessing List Items\n",
"\n",
"Besides accessing list items with ```[index]``` Python offers some additional options:\n",
"\n",
"1. Negative Indexing: ```[-1]``` refers to the last item, ```[-2]``` refers to the second last item etc.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"d\n",
"c\n"
]
}
],
"source": [
"my_list = ['a','b','c','d']\n",
"print(my_list[-1])\n",
"print(my_list[-2])\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"2. Instead of selecting a single item, we can also select a subset of the list with the schema ```start-index:end-index```"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['b', 'c']\n"
]
}
],
"source": [
"my_list = ['a','b','c','d']\n",
"print(my_list[1:3])"
]
},
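{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"A few more slicing variants, as a minimal sketch that assumes the same ```my_list``` as above: if the start or end index is left out, the slice runs from the beginning or up to the end, and an optional third value sets the step size.\n",
"\n",
"```python\n",
"my_list = ['a', 'b', 'c', 'd']\n",
"first_two = my_list[:2]      # start omitted: from the beginning\n",
"from_second = my_list[1:]    # end omitted: up to the end\n",
"every_other = my_list[::2]   # step 2: every second item\n",
"print(first_two, from_second, every_other)\n",
"```\n",
"\n",
"Slicing returns a new list, so ```my_list``` itself stays unchanged."
]
},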
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## List Functions\n",
"\n",
"We have already seen the most important list methods (functions). Here is a complete table:\n",
"\n",
"| Method | Description |\n",
"| :-------- | :-------------------------------------------------------------------------- |\n",
"| append() | Adds an element at the end of the list |\n",
"| clear() | Removes all elements from the list |\n",
"| copy() | Returns a copy of the list |\n",
"| count() | Returns the number of elements with the specified value |\n",
"| extend() | Adds an element of a list (or any iterable), to the end of the current list |\n",
"| index() | Returns the index of the first element with the specified value |\n",
"| insert() | Adds an element at the specified position |\n",
"| pop() | Removes the element at the specified position |\n",
"| remove() | Removes the item with the specified value |\n",
"| reverse() | Reverses the order of the list |\n",
"| sort() | Sorts the list |\n",
"See [this page](https://www.w3schools.com/python/python_lists_methods.asp) for more details."
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"* A tuple is a collection that is ordered and **unchangeable**.\n",
"* Tuples are ordered and allow duplicate values.\n",
"* tems in tuples are accessed in the same way as lists, using ```[]```"
"execution_count": 10,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"mio\n"
]
},
{
"ename": "TypeError",
"evalue": "'tuple' object does not support item assignment",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)",
"Cell \u001b[0;32mIn[10], line 4\u001b[0m\n\u001b[1;32m 1\u001b[0m my_tuple \u001b[38;5;241m=\u001b[39m (\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmia\u001b[39m\u001b[38;5;124m\"\u001b[39m, \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmio\u001b[39m\u001b[38;5;124m\"\u001b[39m, \u001b[38;5;241m1\u001b[39m)\n\u001b[1;32m 2\u001b[0m \u001b[38;5;28mprint\u001b[39m(my_tuple[\u001b[38;5;241m1\u001b[39m])\n\u001b[0;32m----> 4\u001b[0m \u001b[43mmy_tuple\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m0\u001b[39;49m\u001b[43m]\u001b[49m \u001b[38;5;241m=\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mciao\u001b[39m\u001b[38;5;124m\"\u001b[39m\n",
"\u001b[0;31mTypeError\u001b[0m: 'tuple' object does not support item assignment"
]
}
],
"source": [
"my_tuple = (\"mia\", \"mio\", 1)\n",
"print(my_tuple[1])\n",
"\n",
"my_tuple[0] = \"ciao\""
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* Tuples only know the methods ```count()```and ```index()```\n",
"* We use tuples to make sure, our data will not be changed anywhere in the code\n",
"* Tuples are faster and more memory-efficient than lists"
]
},
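{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"The two tuple methods can be tried out as follows. This is only a small sketch; the tuple contents are made up for illustration.\n",
"\n",
"```python\n",
"my_tuple = ('mia', 'mio', 1, 'mio')\n",
"mio_count = my_tuple.count('mio')   # how often 'mio' occurs in the tuple\n",
"mio_index = my_tuple.index('mio')   # position of the first 'mio'\n",
"print(mio_count, mio_index)\n",
"```\n",
"\n",
"Both methods only read the tuple, which is why they are available even though tuples cannot be changed."
]
},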
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## When should we use lists and when should we use tuples?\n",
"\n",
"* Use Lists, when you want to change data in your collection\n",
" * Lists are more flexible and have more built-in methods, making them ideal for dynamic collections.\n",
"* Use Tuples, when your data is immutable\n",
" * We use tuples to make sure, our data will not be changed anywhere in the code.\n",
" * Tuples are faster and more memory efficient than lists (especially during iterations)."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Nested Lists and Tuples\n",
"\n",
"Since lists, tuples, and sets can contain any datatype we can also nest them:"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2\n",
"2\n",
"4\n"
]
}
],
"my_nested_list = [[1,2],[3,4]] # nested list (2D Array)\n",
"my_nested_tuble = ((1,2),(3,4)) # nested tuble (2D Array)\n",
"my_nested_3D_Matrix = ((1,2,5),\n",
" (3,4,2),\n",
" (3,3,3)) # Matrix\n",
"\n",
"\n",
"print(my_nested_list[0][1])\n",
"print(my_nested_tuble[0][1])\n",
"print(my_nested_3D_Matrix[1][1])"
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Sets\n",
"* A set is a collection which is unordered, and unindexed and its itmes are unchangeable.\n",
" * Once a set is created, you cannot change its items, but you can add new items.\n",
" * Duplicated items will be treated as one item\n",
"* Like lists and tuples, sets can contain multiple data types\n",
"* You cannot access items in a set by referring to an index or a key (since they have no order)\n",
"* Sets are written with curly brackets."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1\n",
"3\n",
"5\n"
]
},
{
"ename": "TypeError",
"evalue": "'set' object is not subscriptable",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)",
"Cell \u001b[0;32mIn[1], line 5\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m x \u001b[38;5;129;01min\u001b[39;00m my_set:\n\u001b[1;32m 4\u001b[0m \u001b[38;5;28mprint\u001b[39m(x)\n\u001b[0;32m----> 5\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[43mmy_set\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m2\u001b[39;49m\u001b[43m]\u001b[49m)\n",
"\u001b[0;31mTypeError\u001b[0m: 'set' object is not subscriptable"
]
}
],
"source": [
"my_set = {1, 5, 3, 1} # last value will be ignored\n",
"my_set.add(1) # will be ignored\n",
"for x in my_set:\n",
" print(x)\n",
"print(my_set[2])\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Example of using Sets\n",
"\n",
"* Sets are highly useful to efficiently remove duplicate values from a collection like a list and to perform common math operations like unions and intersections.\n",
"* We can use **casting** (remember) between lists and sets to remove dublicats from a list"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{1, 2, 3, 4, 5}\n",
"[1, 2, 3, 4, 5]\n",
"[1, 2, 3, 4, 5]\n"
]
}
],
"source": [
"my_list = [1,2,3,4,5,1,2,3,3]\n",
"my_set = set(my_list) # remove duplicates by converting to set\n",
"print(my_set)\n",
"my_list = list(my_set) # convert back to list to make it mutable\n",
"print(my_list)\n",
"\n",
"## Remove dublicates in one line\n",
"my_list = [1,2,3,4,5,1,2,3,3]\n",
"my_list = list(set(my_list))\n",
"print(my_list)"
]
},
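{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"The math operations mentioned above can be sketched like this. The two sets and their names are hypothetical examples, not part of the lecture data.\n",
"\n",
"```python\n",
"rock_fans = {'Ann', 'Ben', 'Cleo'}\n",
"jazz_fans = {'Ben', 'Dan'}\n",
"either = rock_fans | jazz_fans      # union: in at least one of the sets\n",
"both = rock_fans & jazz_fans        # intersection: in both sets\n",
"only_rock = rock_fans - jazz_fans   # difference: in the first set only\n",
"print(sorted(either), sorted(both), sorted(only_rock))\n",
"```\n",
"\n",
"```sorted()``` is used for printing because a set has no defined order."
]
},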
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Dictionaries I\n",
"\n",
"* If we want to have a more dynamic and structured way to access information (especially nested information), dictionaries are a very useful concept.\n",
"* Dictionaries to store data values in key:value pairs.\n",
"* A dictionary is a collection which is ordered*, changeable and do not allow duplicates.\n",
"* Dictionaries are written with curly brackets, and have keys and values, sepperated by a ```:```\n",
"* Keys are allways a String, e.g. ```\"title\"```"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"song = {\n",
" \"title\" : \"Another brick in the wall\",\n",
" \"artist\" : \"Pink Floyd\",\n",
" \"album\" : \"The Wall\",\n",
" \"year\" : 1979\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Dictionaries II\n",
"\n",
"* Since the key must be unike (like in a set), dublicates are not allowed.\n",
"* Dicitionaries are accessd by the key in ```[]```or with the ```get()```method.\n",
"* Teh advantage of the get method: If the key is not found, we can define a default value.\n",
"* With the ```keys()``` method, we can access all keys of a dictionary."
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Another brick in the wall\n",
"Another brick in the wall\n",
"unknown\n",
"dict_keys(['title', 'artist', 'album', 'year'])\n"
]
}
],
"source": [
"song = {\n",
" \"title\" : \"Another brick in the wall\",\n",
" \"artist\" : \"Pink Floyd\",\n",
" \"album\" : \"The Wall\",\n",
" \"year\" : 1979\n",
"}\n",
"print (song[\"title\"])\n",
"print (song.get(\"title\"))\n",
"print (song.get(\"duration\", \"unknown\"))\n",
"print (song.keys())"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Important Methods of Dictionaries\n",
"\n",
"* ```update()``` : change or add an item in/to the dictionary\n",
"* ```pop()``` : remove an item from the dictionary\n",
"* ```copy()``` : creates a copy (not an object reference) of the dictionary"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'title': 'Another brick in the wall', 'artist': 'Pink Floyd', 'year': 1976, 'duration': '3:30', 'tempo': 105}\n"
]
}
],
"source": [
"song = {\n",
" \"title\" : \"Another brick in the wall\",\n",
" \"artist\" : \"Pink Floyd\",\n",
" \"album\" : \"The Wall\",\n",
" \"year\" : 1979\n",
"}\n",
"song.update({\"duration\" : \"3:30\"}) # Add key-value pair\n",
"song.update({\"year\" : 1976}) # Update value\n",
"song[\"tempo\"] = 105 # Alternative way to add key-value pair\n",
"song.pop(\"album\") # Remove key-value pair\n",
"print (song)"
]
},
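{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"To illustrate the difference between ```copy()``` and a plain assignment, here is a minimal sketch; the variable names ```alias``` and ```song_copy``` are chosen for this example only.\n",
"\n",
"```python\n",
"song = {'title': 'Another brick in the wall', 'year': 1979}\n",
"alias = song               # second name for the same dictionary\n",
"song_copy = song.copy()    # new dictionary with the same items\n",
"song['year'] = 1976        # change the original\n",
"print(alias['year'])       # the alias sees the change\n",
"print(song_copy['year'])   # the copy keeps the old value\n",
"removed = song.pop('year') # pop() returns the removed value\n",
"print(removed)\n",
"```\n",
"\n",
"The assignment only creates a second reference, while ```copy()``` creates a separate dictionary on the first level."
]
},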
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Iterations on Dictionaries\n",
"\n",
"* When we do a ```for x in dict``` iteration\n",
"* x will be the key the key for each key:value pair in the dicitonary\n",
"* If you prefer the values instead you can use the ```values()``` method of the dictionary\n",
"* If you want both, keys and values: use the ``ìtems()``` method."
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"title: Another brick in the wall | artist: Pink Floyd | album: The Wall | year: 1979 | \n",
"Another brick in the wall, Pink Floyd, The Wall, 1979, title : Another brick in the wall\n",
"artist : Pink Floyd\n",
"album : The Wall\n",
"year : 1979\n"
]
}
],
"source": [
"song = {\n",
" \"title\" : \"Another brick in the wall\",\n",
" \"artist\" : \"Pink Floyd\",\n",
" \"album\" : \"The Wall\",\n",
" \"year\" : 1979\n",
"}\n",
"\n",
"for key in song:\n",
" print(key, end = \": \")\n",
" print(song[key], end = \" | \")\n",
"print()\n",
"\n",
"for value in song.values():\n",
" print(value, end = \", \" )\n",
"\n",
"for key, value in song.items():\n",
" print(key, \": \", value)\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Nestet Dictionaries\n",
"\n",
"* Link lists and tuples, dicitonaries can be nested\n",
"* But in contrast to lists, we can explain with the key, what the value is about\n",
"* Thus, with dicitionaries we can build meaningful data strctures that can be processed by human and computers\n"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Led Zeppelin\n"
]
}
],
"source": [
"music_library = {\n",
" \"rock\": {\n",
" \"band\": \"Led Zeppelin\",\n",
" \"album\": \"IV\",\n",
" \"year\": 1971\n",
" },\n",
" \"jazz\": {\n",
" \"artist\": \"Miles Davis\",\n",
" \"album\": \"Kind of Blue\",\n",
" \"year\": 1959\n",
" },\n",
" \"pop\": {\n",
" \"artist\": \"Taylor Swift\",\n",
" \"album\": \"1989\",\n",
" \"year\": 2014\n",
" }\n",
"}\n",
"\n",
"print(music_library[\"rock\"][\"band\"])"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## For Geeeks: What does copy() really do?\n",
"\n",
"* Copying objets is often useful, when we want to update an object without loosing the original.\n",
"* But is copy just a shallow copy of the first level or a deep copy?\n"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'1': {'title': 'Another brick in the wall', 'artist': 'Pink Floyd'}, '2': {'title': 'Shine on you crazy diamond', 'artist': 'Pink Floyd'}}\n",
"{'1': {'title': 'Wish you were here', 'artist': 'Pink Floyd'}, '2': {'title': 'Shine on you crazy diamond', 'artist': 'Pink Floyd'}}\n"
]
}
],
"song = {\n",
" \"title\" : \"Another brick in the wall\",\n",
" \"artist\" : \"Pink Floyd\"\n",
"}\n",
"best_of_album = { \"1\" : song, \"2\" : { \"title\" : \"Shine on you crazy diamond\", \"artist\" : \"Pink Floyd\" }}\n",
"other_album = best_of_album.copy()\n",
"print(other_album)\n",
"song[\"title\"] = \"Wish you were here\" # change the reference\n",
"print(other_album) # song title is changed in both dictionaries, thus copy is shallow copy\n"
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"## JSON\n",
"\n",
"* Dicitionaries in Phython are a representaiton of what we call a *data object*, more generally speaking\n",
"* In Java Script programming language we use exactly the same notation to create objects.\n",
"* This notation is call JSON (Java Script Object Notation)\n",
"* If we use the kind on object notation as a means of communicaiton (e.g. in files or web communcation), we can very easily read these JSON strings an create data objects that can be processed directely in our code.\n",
"* This is so convinient, that it has **become the standard in all kinds of data exchange**: RESTful Webservices, MQTT, Buisness-IT interfaces, Config files, ...\n",
"* In Python we can use the JSON module to do that."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"import json\n",
"\n",
"fileRow = '{\"title\" : \"Another brick in the wall\", \"artist\" : \"Pink Floyd\" }'\n",
"song = json.loads(fileRow)\n"
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.5"
}
},
"nbformat": 4,
"nbformat_minor": 2
}