{"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"name":"Eager_execution_with_TF2.X.ipynb","provenance":[],"collapsed_sections":[],"toc_visible":true},"kernelspec":{"name":"python3","display_name":"Python 3"}},"cells":[{"cell_type":"markdown","metadata":{"id":"f16t98Q_Glse"},"source":["# Eager Execution\n","\n","Version 1.01"]},{"cell_type":"markdown","metadata":{"id":"fQn5T_Kt1QV5"},"source":["(C) 2020 - Umberto Michelucci, Michela Sperti\n","\n","This notebook is part of the book _Applied Deep Learning: a case based approach, **2nd edition**_ from APRESS by [U. Michelucci](mailto:umberto.michelucci@toelt.ai) and [M. Sperti](mailto:michela.sperti@toelt.ai)."]},{"cell_type":"markdown","metadata":{"id":"KUX8azCzCzVE"},"source":["## Notebook Learning Goals"]},{"cell_type":"markdown","metadata":{"id":"MuNbkdwcCzXv"},"source":["By the end of this notebook you will know how to approach a computational problem using TensorFlow 2.X. You will learn about the most important feature of TensorFlow 2.X, eager execution, and become familiar with the TensorFlow environment."]},{"cell_type":"markdown","metadata":{"id":"4HqSqKYUOg8n"},"source":["## TensorFlow Setup"]},{"cell_type":"markdown","metadata":{"id":"ORV-SADOuzEG"},"source":["TensorFlow 2.X can be treated as a Python library."]},{"cell_type":"code","metadata":{"id":"K-BX2PGxuXfc"},"source":["# general libraries\n","import numpy as np\n","import matplotlib.pyplot as plt\n","\n","# tensorflow libraries\n","import tensorflow as tf"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"t3iY-0xDQhLy"},"source":["The default version of TensorFlow is 2.X and eager execution is enabled by default."]},{"cell_type":"code","metadata":{"id":"tENrWC86Qsgb","colab":{"base_uri":"https://localhost:8080/","height":35},"executionInfo":{"status":"ok","timestamp":1603699482466,"user_tz":-60,"elapsed":996,"user":{"displayName":"Michela 
Sperti","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gh7mD9r-1Xj0Qve63ZPZx9UHRv0PkVhL5ayiHNv=s64","userId":"13210266879998244642"}},"outputId":"3c8d3f2d-a269-41b1-f7d6-ba7826bb2236"},"source":["# check tensorflow version\n","print(tf.__version__)"],"execution_count":null,"outputs":[{"output_type":"stream","text":["2.3.0\n"],"name":"stdout"}]},{"cell_type":"code","metadata":{"id":"6Mgm1nEUOvss","colab":{"base_uri":"https://localhost:8080/","height":35},"executionInfo":{"status":"ok","timestamp":1603699484552,"user_tz":-60,"elapsed":1003,"user":{"displayName":"Michela Sperti","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gh7mD9r-1Xj0Qve63ZPZx9UHRv0PkVhL5ayiHNv=s64","userId":"13210266879998244642"}},"outputId":"8b0ae86f-89c6-4beb-fe23-47a3a7adad17"},"source":["# check eager execution\n","tf.executing_eagerly()"],"execution_count":null,"outputs":[{"output_type":"execute_result","data":{"text/plain":["True"]},"metadata":{"tags":[]},"execution_count":3}]},{"cell_type":"markdown","metadata":{"id":"q7JwzUekFHuP"},"source":["## What does Eager Execution mean?"]},{"cell_type":"markdown","metadata":{"id":"1gHa0QFau0bb"},"source":["Eager execution is the most evident change between TensorFlow 1.X and TensorFlow 2.X. \n","\n","TensorFlow 1.X was designed with a **static computational graph** approach, meaning that, if you have to perform an operation, you must first describe *what* your model should do and only after that you can execute the program (inside a **session**, separated from the graph's definition). TensorFlow 2.X is instead based on an **imperative programming** approach, meaning that you describe *how* your model has to obtain the result and all operations are executed immediately, returning concrete values as output.\n","\n","This change made TensorFlow more pythonic, easier to understand and to use. 
In particular, as stated in its official documentation, eager execution provides:\n","- an **intuitive interface**: you can structure your code naturally and use Python data structures;\n","- **easier debugging**: you can use standard Python debugging tools for immediate error reporting;\n","- **natural control flow**: you can use Python control flow instead of graph control flow, simplifying the specification of dynamic models."]},{"cell_type":"markdown","metadata":{"id":"DkQwsrJmuPA8"},"source":["Before giving an example of eager execution, let's begin with a quick overview of TensorFlow basics."]},{"cell_type":"markdown","metadata":{"id":"JNkqCGudHzP-"},"source":["## TensorFlow Basic Usage"]},{"cell_type":"markdown","metadata":{"id":"HTrJRpiyucxm"},"source":["### Tensors"]},{"cell_type":"markdown","metadata":{"id":"27iuvPHduc0T"},"source":["If you are familiar with NumPy, you know that its basic units are arrays. The basic units of TensorFlow are tensors. A tensor is a multi-dimensional array, and is therefore a generalization of vectors and matrices.\n","\n","Each tensor in TensorFlow is characterized by a static type (`dtype`) and a dynamic dimension (`shape`). This means that, once a tensor has been defined, you cannot change its type, but you can rearrange its elements into a different shape (for example with `tf.reshape`).\n","\n","The number of dimensions of a tensor is called its **rank**. For example, the simplest tensor (of rank 0) is a scalar. A vector (a one-dimensional array) has rank 1 and a two-dimensional matrix has rank 2. 
Rank can be calculated as the length of a tensor's `shape`."]},{"cell_type":"markdown","metadata":{"id":"ZNFnxmRwujvC"},"source":["Let's create some basic tensors."]},{"cell_type":"code","metadata":{"id":"RaVuF64cvaPR","colab":{"base_uri":"https://localhost:8080/","height":35},"executionInfo":{"status":"ok","timestamp":1603699510593,"user_tz":-60,"elapsed":1275,"user":{"displayName":"Michela Sperti","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gh7mD9r-1Xj0Qve63ZPZx9UHRv0PkVhL5ayiHNv=s64","userId":"13210266879998244642"}},"outputId":"054872e7-b6a3-4c34-99f9-2a4e7ad2b623"},"source":["# SCALAR TENSOR\n","# This will be an int32 tensor by default; see \"dtypes\" below.\n","# A scalar is a tensor without shape (rank is 0 in this case).\n","# When printing the tensor, you see its value, its shape and its type.\n","rank_0_tensor = tf.constant(4)\n","print(rank_0_tensor)"],"execution_count":null,"outputs":[{"output_type":"stream","text":["tf.Tensor(4, shape=(), dtype=int32)\n"],"name":"stdout"}]},{"cell_type":"code","metadata":{"id":"d8JbgJ9avaRm","colab":{"base_uri":"https://localhost:8080/","height":35},"executionInfo":{"status":"ok","timestamp":1603699514688,"user_tz":-60,"elapsed":1664,"user":{"displayName":"Michela Sperti","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gh7mD9r-1Xj0Qve63ZPZx9UHRv0PkVhL5ayiHNv=s64","userId":"13210266879998244642"}},"outputId":"bba3f709-79f4-4232-8ad8-c1ccfb9c64d5"},"source":["# ARRAY TENSOR\n","# Let's make this a float tensor.\n","# An array (a list of elements) is a tensor of rank 1.\n","rank_1_tensor = tf.constant([2.0, 3.0, 4.0])\n","print(rank_1_tensor)"],"execution_count":null,"outputs":[{"output_type":"stream","text":["tf.Tensor([2. 3. 
4.], shape=(3,), dtype=float32)\n"],"name":"stdout"}]},{"cell_type":"code","metadata":{"id":"TSd4M1IBvaUI","colab":{"base_uri":"https://localhost:8080/","height":90},"executionInfo":{"status":"ok","timestamp":1603699517557,"user_tz":-60,"elapsed":1009,"user":{"displayName":"Michela Sperti","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gh7mD9r-1Xj0Qve63ZPZx9UHRv0PkVhL5ayiHNv=s64","userId":"13210266879998244642"}},"outputId":"77c624a7-3b32-4dee-9d5e-8c201867d405"},"source":["# MATRIX TENSOR\n","# If we want to be specific, we can set the dtype (see below) at creation time.\n","# A two-dimensional matrix is a tensor of rank 2.\n","rank_2_tensor = tf.constant([[1, 2],\n","                             [3, 4],\n","                             [5, 6]], dtype=tf.float16)\n","print(rank_2_tensor)"],"execution_count":null,"outputs":[{"output_type":"stream","text":["tf.Tensor(\n","[[1. 2.]\n"," [3. 4.]\n"," [5. 6.]], shape=(3, 2), dtype=float16)\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"n73oS9skvkRR"},"source":["You can go on and create n-dimensional tensors.\n","\n","Moreover, you can perform all basic mathematical operations with tensors."]},{"cell_type":"markdown","metadata":{"id":"q7EQvfn_vpKC"},"source":["### Variables"]},{"cell_type":"markdown","metadata":{"id":"uZz6wZDUvshD"},"source":["The three most important types of tensors are:\n","- `tf.constant`\n","- `tf.Variable`\n","- `tf.placeholder` (a TensorFlow 1.X construct; in TensorFlow 2.X it is only available as `tf.compat.v1.placeholder`)\n","\n","The `tf.constant` and the `tf.placeholder` values are, during a single-session run, immutable. Once they have a value, they will not change. 
\n","\n","A `tf.Variable` contains values that are going to change during\n","running, because, for example, they must be optimized for a specific problem."]},{"cell_type":"code","metadata":{"id":"l8qCJ4vNwBdD","colab":{"base_uri":"https://localhost:8080/","height":90},"executionInfo":{"status":"ok","timestamp":1603699532654,"user_tz":-60,"elapsed":994,"user":{"displayName":"Michela Sperti","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gh7mD9r-1Xj0Qve63ZPZx9UHRv0PkVhL5ayiHNv=s64","userId":"13210266879998244642"}},"outputId":"d2e5ebbd-8673-4631-d273-19c65fa46986"},"source":["# create a variable\n","a = tf.Variable([2.0, 3.0])\n","# create another variable b based on the value of a\n","b = tf.Variable(a)\n","a.assign([5, 6]) # this command changes a values\n","# Two variables will not share the same memory.\n","\n","# a and b are different\n","print(a.numpy())\n","print(b.numpy())\n","\n","# There are other versions of assign\n","print(a.assign_add([2,3]).numpy()) # [7. 9.]\n","print(a.assign_sub([7,9]).numpy()) # [0. 0.]"],"execution_count":null,"outputs":[{"output_type":"stream","text":["[5. 6.]\n","[2. 3.]\n","[7. 9.]\n","[0. 0.]\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"D9fstH4xeoxr"},"source":["With eager execution, TensorFlow operations are immediately evaluated and return their values to you. \n","\n","`tf.Tensor` objects reference concrete values instead of symbolic handles to nodes in a computational graph. 
Therefore, it's easy to inspect results using `print()` or a debugger."]},{"cell_type":"code","metadata":{"id":"KBogQH0bOzv3","colab":{"base_uri":"https://localhost:8080/","height":72},"executionInfo":{"status":"ok","timestamp":1603699550408,"user_tz":-60,"elapsed":542,"user":{"displayName":"Michela Sperti","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gh7mD9r-1Xj0Qve63ZPZx9UHRv0PkVhL5ayiHNv=s64","userId":"13210266879998244642"}},"outputId":"4816ff0f-0053-44ed-9401-f70c9b5cd9a8"},"source":["# let's define a constant tensor and print it\n","a = tf.constant([[1, 2], [3, 4]])\n","print(a)"],"execution_count":null,"outputs":[{"output_type":"stream","text":["tf.Tensor(\n","[[1 2]\n"," [3 4]], shape=(2, 2), dtype=int32)\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"zNKxUG1FfXVR"},"source":["A striking example of the useful integration of TensorFlow 2.X inside Python is represented by its compatibility with **NumPy** library. NumPy operations accept `tf.Tensor` arguments. The TensorFlow `tf.math` operations convert\n","Python objects and NumPy arrays to `tf.Tensor` objects. 
The `tf.Tensor.numpy` method returns the object's value as a NumPy `ndarray`."]},{"cell_type":"code","metadata":{"id":"8GJTbjgcfkw1","colab":{"base_uri":"https://localhost:8080/","height":54},"executionInfo":{"status":"ok","timestamp":1603699557949,"user_tz":-60,"elapsed":540,"user":{"displayName":"Michela Sperti","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gh7mD9r-1Xj0Qve63ZPZx9UHRv0PkVhL5ayiHNv=s64","userId":"13210266879998244642"}},"outputId":"b978f6cc-5555-437e-9091-37f1460d1171"},"source":["# broadcasting is supported\n","b = tf.add(a, 1)\n","# numpy is easily integrated\n","c = np.multiply(a, b)\n","print(c)"],"execution_count":null,"outputs":[{"output_type":"stream","text":["[[ 2  6]\n"," [12 20]]\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"FaIlf429Ppj1"},"source":["If you don't know what broadcasting means, have a look at the [Further Readings](#fr) section."]},{"cell_type":"markdown","metadata":{"id":"1TTrrsTk-laO"},"source":["A major benefit of eager execution is that all the functionality of the host language is available while your model is executing. Let's see an example of dynamic control flow, in which ordinary Python constructs (conditionals, loops, recursion) decide step by step which TensorFlow operations are executed."]},{"cell_type":"markdown","metadata":{"id":"YnYUSBF0ivKs"},"source":["## Dynamic Control Flow in TensorFlow: Solving Sudoku"]},{"cell_type":"markdown","metadata":{"id":"xbLAjb1d24Lp"},"source":["To practice a bit with simple operations between tensors using TensorFlow, we will write a recursive program to solve the famous logic game Sudoku. This is possible thanks to eager execution, which makes it possible to integrate TensorFlow code into ordinary Python code and execute operations immediately. 
Notice that the same problem could be solved using NumPy, for example."]},{"cell_type":"markdown","metadata":{"id":"e_mEuEHrmX6I"},"source":["**Sudoku Problem**\n","\n","Given a partially filled tensor of shape (9, 9), the goal is to assign digits (from 1 to 9) to the empty cells so that every row, every column, and each of the nine sub-tensors of shape (3, 3) contains exactly one instance of each digit from 1 to 9."]},{"cell_type":"code","metadata":{"id":"vVosV-c0nxr2"},"source":["# input tensor (this is a possible example, you can change values for others)\n","input = tf.constant([[3, 0, 6, 5, 0, 8, 4, 0, 0],\n","                     [5, 2, 0, 0, 0, 0, 0, 0, 0],\n","                     [0, 8, 7, 0, 0, 0, 0, 3, 1],\n","                     [0, 0, 3, 0, 1, 0, 0, 8, 0],\n","                     [9, 0, 0, 8, 6, 3, 0, 0, 5],\n","                     [0, 5, 0, 0, 9, 0, 6, 0, 0],\n","                     [1, 3, 0, 0, 0, 0, 2, 5, 0],\n","                     [0, 0, 0, 0, 0, 0, 0, 7, 4],\n","                     [0, 0, 5, 2, 0, 6, 3, 0, 0]])"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"ztsPxKen_ned"},"source":["The simplest approach is to generate all possible sets of numbers between 1 and 9 to fill all empty cells and test them (checking if the final tensor meets the required constraints).\n","\n","Instead, we will solve the problem using another approach: **backtracking**. This technique is used for problems in which several constraints must be met: it tries one possibility at a time, goes back as soon as a choice leads to a dead end, and tries the next one until a solution is found or all possibilities are exhausted."]},{"cell_type":"markdown","metadata":{"id":"kJaKnA6I35tE"},"source":["We define a function to print a tensor (`print_tensor`), a function to check if there are empty cells left inside the tensor (`find_empty_cell`), a function to check if the current assigned number meets the Sudoku's constraints (`check_validity`) and finally the recursive function that takes an input tensor and tries to fill it (`generate_elements`)."]},{"cell_type":"code","metadata":{"id":"cvTrgdGNZArz"},"source":["def print_tensor(tensor):\n","  \"\"\"Prints the tensor given as 
input (i.e. the sudoku grid).\"\"\"\n","  print(tensor)"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"GQIH7WAdZVTa"},"source":["def find_empty_cell(tensor):\n","  \"\"\"Returns True if the tensor contains at least one empty cell (a zero),\n","  otherwise returns False.\"\"\"\n","  pos0 = tf.where(tensor == 0) # find every 0 present inside tensor\n","  if len(pos0) != 0: # an empty cell has been found\n","    return True\n","  else: # no empty cells left\n","    return False"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"LF3X3OaZ-TRY"},"source":["def check_validity(tensor, i, j, d):\n","  \"\"\"Checks whether assigning the digit d to position (i, j) keeps the\n","  tensor consistent with the Sudoku constraints.\"\"\"\n","  # a list of all initial and final indices of the sub-tensors,\n","  # to be quickly identified inside the function\n","  subtensors = [[0,3,0,3],[0,3,3,6],[0,3,6,9],\n","                [3,6,0,3],[3,6,3,6],[3,6,6,9],\n","                [6,9,0,3],[6,9,3,6],[6,9,6,9]]\n","  # check if the current number is already present\n","  pos_row = tf.where(tensor[i,:] == d)\n","  pos_col = tf.where(tensor[:,j] == d)\n","  # check the row and the column of the current cell\n","  if len(pos_row) != 0 or len(pos_col) != 0:\n","    return False\n","  # check the sub-tensor that contains the current cell\n","  for st in subtensors:\n","    if i >= st[0] and i < st[1] and j >= st[2] and j < st[3]:\n","      pos_sub = tf.where(tensor[st[0]:st[1],st[2]:st[3]] == d)\n","      if len(pos_sub) != 0:\n","        return False\n","  return True # all constraints are satisfied!"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"pMOVaDsP-TTq"},"source":["def generate_elements(tensor):\n","  \"\"\"Takes an input tensor and recursively tries to insert an element\n","  and check the tensor's validity.\"\"\"\n","  tensor_tmp = tf.Variable(tensor)\n","  # find an empty cell\n","  if not find_empty_cell(tensor_tmp):\n","    # if no empty cells are left, you have successfully filled the Sudoku!\n","    print_tensor(tensor_tmp)\n","    return True\n","  # take the first empty cell 
and try to fill it\n","  pos0 = tf.where(tensor_tmp == 0)\n","  i, j = pos0[0][0], pos0[0][1]\n","  # try to fill the empty cell with a number from 1 to 9, checking validity\n","  for d in range(1, 10):\n","    # check tensor's validity\n","    if check_validity(tensor_tmp, i, j, d):\n","      # if all constraints are satisfied, assign the current element to\n","      # the current position\n","      tensor_tmp = tensor_tmp[i, j].assign(d)\n","      # recursion: try to fill the remaining empty cells\n","      if generate_elements(tensor_tmp):\n","        return True\n","      # backtracking: the recursion failed, so reset the current\n","      # position to zero and try the next digit\n","      tensor_tmp = tensor_tmp[i, j].assign(0)\n","  return False # no digit fits here: the caller must backtrack"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"u8cQwkwIOMcF","colab":{"base_uri":"https://localhost:8080/","height":217},"executionInfo":{"status":"ok","timestamp":1603699615410,"user_tz":-60,"elapsed":5415,"user":{"displayName":"Michela Sperti","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gh7mD9r-1Xj0Qve63ZPZx9UHRv0PkVhL5ayiHNv=s64","userId":"13210266879998244642"}},"outputId":"15e67701-95a5-4bf9-90a9-dd44f01ceffc"},"source":["generate_elements(input)"],"execution_count":null,"outputs":[{"output_type":"stream","text":["\n"],"name":"stdout"},{"output_type":"execute_result","data":{"text/plain":["True"]},"metadata":{"tags":[]},"execution_count":15}]},{"cell_type":"markdown","metadata":{"id":"HgXL240x-2cE"},"source":["## Eager Gradients Computation"]},{"cell_type":"markdown","metadata":{"id":"zB0X8a_qzMQ5"},"source":["One of the most important steps in training a neural network is weight optimization, done by finding the minimum of a loss function. This relies on the **backpropagation** algorithm (see the [Further Readings](#fr) section for additional material). To minimize a function with gradient-based methods you need to be able to **compute gradients**. 
\n","\n","Now, we will see how to compute gradients with TensorFlow in eager execution, using `tf.GradientTape`. This is an example of **automatic differentiation**: for $loss = w^3$ the gradient is $3w^2$, which equals 27 at $w = 3$."]},{"cell_type":"code","metadata":{"id":"5u_Y9HDaM4Hy","colab":{"base_uri":"https://localhost:8080/","height":35},"executionInfo":{"status":"ok","timestamp":1603699663187,"user_tz":-60,"elapsed":512,"user":{"displayName":"Michela Sperti","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gh7mD9r-1Xj0Qve63ZPZx9UHRv0PkVhL5ayiHNv=s64","userId":"13210266879998244642"}},"outputId":"acec0390-08bc-42a8-a9bb-ca42c1069e91"},"source":["# During eager execution, use tf.GradientTape to trace operations\n","# for computing gradients later\n","w = tf.Variable(3.0) # define a tf variable and assign it the value 3\n","# record the computation of the loss on the tape\n","with tf.GradientTape() as tape:\n","  loss = w * w * w\n","# calculate gradient with respect to a specific variable (w in this case)\n","grad = tape.gradient(loss, w)\n","print(grad) # => tf.Tensor(27., shape=(), dtype=float32)"],"execution_count":null,"outputs":[{"output_type":"stream","text":["tf.Tensor(27.0, shape=(), dtype=float32)\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"tWeyFr_3TbM5"},"source":["Even if TensorFlow 1.X presents some disadvantages (it has a steep learning curve, it is difficult to debug, it has counter-intuitive semantics and it is not well integrated with Python), it is still a powerful and highly expressive tool. Therefore, it is recommended to have a deep understanding of computational graphs and the logic behind TensorFlow 1.X, since this helps to better understand TensorFlow 2.X."]},{"cell_type":"markdown","metadata":{"id":"Oa8Zif9AwQmV"},"source":["## Exercises"]},{"cell_type":"markdown","metadata":{"id":"J1LjYhLTwSqD"},"source":["1. [*Easy Difficulty*] Create a tensor of rank 5.\n","2. 
[*Easy Difficulty*] Calculate the derivative of $f=x^2+y-z^2$ with respect to $z$, evaluated at $z=2$."]},{"cell_type":"markdown","metadata":{"id":"JHVaoSq6NaV5"},"source":["## References"]},{"cell_type":"markdown","metadata":{"id":"Ebp42WZxNdRL"},"source":["1. https://www.tensorflow.org/guide/eager (eager execution official documentation)"]},{"cell_type":"markdown","metadata":{"id":"cPgmrAveicfB"},"source":["## Further Readings "]},{"cell_type":"markdown","metadata":{"id":"_xcLEs0Eiepq"},"source":["**NumPy package**"]},{"cell_type":"markdown","metadata":{"id":"K4QATKrvihG2"},"source":["1. All documentation (with lots of tutorials and examples already implemented): https://numpy.org/doc/stable/\n","2. Broadcasting in NumPy (https://numpy.org/doc/stable/user/basics.broadcasting.html)"]},{"cell_type":"markdown","metadata":{"id":"JO-1jFFLH6Bl"},"source":["**Backpropagation algorithm in neural networks**"]},{"cell_type":"markdown","metadata":{"id":"IoJOOKu3IAjq"},"source":["1. Section 6.5 of \"Deep Learning\" book by Ian Goodfellow, Yoshua Bengio and Aaron Courville (https://www.deeplearningbook.org/contents/mlp.html), freely available online"]},{"cell_type":"markdown","metadata":{"id":"AURdn9r6M15d"},"source":["**Automatic differentiation**"]},{"cell_type":"markdown","metadata":{"id":"2Ir4ecHqM18e"},"source":["1. Baydin, Atılım Günes, et al. \"Automatic differentiation in machine learning: a survey.\" The Journal of Machine Learning Research 18.1 (2017): 5595-5637."]}]}