Deep Learning using R

class: center, middle, inverse, title-slide

# Deep Learning using R
### Curso-R
### 2019-03-24

---

# Goals

.large[

* What are deep neural networks, and how do they work?
* What software can we use to train these models, and how do the tools relate to each other?
* How do we train deep learning models for some prediction problems?

]

---

# Prerequisites

.large[

* Linear regression
* Logistic regression
* R: pipe (`%>%`)

]

---

# References

.large[

* [Deep Learning Book](https://www.deeplearningbook.org)
* [Deep Learning with R](https://www.manning.com/books/deep-learning-with-r)
* [Tensorflow for R Blog](https://blogs.rstudio.com/tensorflow/)
* [Keras examples](https://keras.rstudio.com/articles/examples/index.html)
* [Colah's blog](http://colah.github.io)

]

---

# Why "Deep" Learning?

.large[

* We compose many nonlinear operations, called *layers*, to learn a representation
* The number of layers is the model depth
* Nowadays we have models with more than 100 layers

]
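A minimal base R sketch of "depth" as function composition. The layer sizes, the random weights, and the helper names `relu` and `make_layer` are illustrative assumptions, not part of the original slides.

```r
set.seed(1)

relu <- function(z) pmax(z, 0)

# Build one layer: a random linear map plus bias, followed by a nonlinearity.
make_layer <- function(n_in, n_out) {
  w <- matrix(rnorm(n_in * n_out), nrow = n_in, ncol = n_out)
  b <- rnorm(n_out)
  function(x) relu(sweep(x %*% w, 2, b, `+`))
}

# Three layers, so the model depth is 3.
layers <- list(make_layer(4, 8), make_layer(8, 8), make_layer(8, 2))

# The model applies the layers one after another.
model <- function(x) Reduce(function(h, layer) layer(h), layers, x)

x <- matrix(rnorm(5 * 4), nrow = 5)   # 5 observations, 4 features
representation <- model(x)
dim(representation)                   # 5 rows, 2 learned features
```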

## Alternative names

.large[

- layered representations learning
- hierarchical representations learning

]

---

# Layers

---

# Deep Learning

---

# Relation to Generalized Linear Models

.large[

- Linear regression: single-layer neural network, identity (no) activation
- Logistic regression: single-layer neural network, sigmoid (inverse logit) activation

]
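A small check of this correspondence in base R, using the built-in `mtcars` data. The choice of predictors (`wt`, `hp`) and the helper `sigmoid` are assumptions made for illustration only.

```r
# Binary target: is the transmission manual (am = 1)?
fit <- glm(am ~ wt + hp, data = mtcars, family = binomial())

sigmoid <- function(z) 1 / (1 + exp(-z))

# "Forward pass" of a single-layer network whose weights are the GLM coefficients.
x <- cbind(1, mtcars$wt, mtcars$hp)   # the column of 1s plays the role of the bias
w <- coef(fit)
network_output <- as.vector(sigmoid(x %*% w))

# The network output matches the fitted probabilities of the logistic regression.
all.equal(network_output, unname(fitted(fit)))
```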

---

# Logistic regression

---

# Deviance function

.large[

`$$D(y,\hat\mu(x)) = \sum_{i=1}^n 2\left[y_i\log\frac{y_i}{\hat\mu_i(x_i)} + (1-y_i)\log\left(\frac{1-y_i}{1-\hat\mu_i(x_i)}\right)\right]$$`

`$$= 2 D_{KL}\left(y||\hat\mu(x)\right),$$`

where `$D_{KL}(p||q) = \sum_i p_i\log\frac{p_i}{q_i}$` is the Kullback-Leibler divergence.

]
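The formula can be checked numerically against `deviance()` in base R. This is a sketch under assumptions: the `mtcars` model and the helper `kl_term` are illustrative, and the convention `0 * log(0) = 0` is applied explicitly because `y` only takes the values 0 and 1.

```r
fit <- glm(am ~ wt + hp, data = mtcars, family = binomial())
y  <- mtcars$am
mu <- fitted(fit)

# One term p * log(p / q) of the KL divergence, with 0 * log(0) taken as 0.
kl_term <- function(p, q) ifelse(p == 0, 0, p * log(p / q))

deviance_by_hand <- 2 * sum(kl_term(y, mu) + kl_term(1 - y, 1 - mu))

# Agrees with the residual deviance reported by glm.
all.equal(deviance_by_hand, deviance(fit))
```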

---

# Deep learning

.large[

- Apply a linear transformation to `$x$`, add a bias, then apply a nonlinear activation.

`$$f(x) = \sigma(wx + b)$$`

]
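As a sketch, one such layer in base R. The function name `dense`, the weights, and the input values are made up for illustration; the sigmoid is just one possible choice for the activation.

```r
sigmoid <- function(z) 1 / (1 + exp(-z))

# One layer: f(x) = sigma(w x + b)
dense <- function(x, w, b, activation = sigmoid) {
  as.vector(activation(w %*% x + b))
}

x <- c(1, 2)                        # input with 2 features
w <- matrix(c( 0.5, -0.25,          # 3 x 2 weights: 3 units, 2 inputs
               0.0,  0.0,
              -1.0,  1.0), nrow = 3, byrow = TRUE)
b <- c(0, 0, 0.5)

output <- dense(x, w, b)
output   # one activation per unit, each between 0 and 1
```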

#### Loss function

.large[

`$$D_{KL}(p(x)||q(x))$$`

]

---

# Optimization: Stochastic Gradient Descent

```
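for (epoch in seq_len(num_epochs)) {
  # Shuffle the rows and split them into mini-batches (the "stochastic" part)
  batches <- split(sample(nrow(data)), ceiling(seq_len(nrow(data)) / batch_size))
  for (idx in batches) {
    grads <- compute_gradient(data[idx, , drop = FALSE], params)
    params <- params - learning_rate * grads
  }
}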
```
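A runnable mini-batch variant of the update loop above, fitted to simulated linear regression data in base R. Everything here is an assumption for illustration: the simulated data, the mean squared error loss, and the values of `learning_rate`, `num_epochs` and `batch_size`.

```r
set.seed(42)

# Simulated data: y = 2 + 3 * x + noise
n <- 200
x <- cbind(1, runif(n))               # column of 1s for the intercept
y <- as.vector(x %*% c(2, 3) + rnorm(n, sd = 0.1))

# Gradient of the mean squared error on one mini-batch
compute_gradient <- function(x, y, params) {
  residuals <- as.vector(x %*% params) - y
  2 * as.vector(crossprod(x, residuals)) / nrow(x)
}

params <- c(0, 0)
learning_rate <- 0.1
num_epochs <- 200
batch_size <- 20

for (epoch in seq_len(num_epochs)) {
  # Shuffle the rows, then walk through them one mini-batch at a time
  batches <- split(sample(n), ceiling(seq_len(n) / batch_size))
  for (idx in batches) {
    grads <- compute_gradient(x[idx, , drop = FALSE], y[idx], params)
    params <- params - learning_rate * grads
  }
}

params                  # SGD estimate of intercept and slope
coef(lm(y ~ x[, 2]))    # least-squares reference for comparison
```

Each update only sees `batch_size` rows, so the estimate keeps fluctuating slightly around the least-squares solution instead of landing on it exactly.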

---

# SGD

---

# TensorFlow

.large[

A numerical computation library

- Developed at Google Brain for neural network research
- Open source
- Automatic differentiation
- Runs on GPUs as well as CPUs

]

---

# Tensor

(2d)

```
##       Sepal.Length Sepal.Width Petal.Length Petal.Width Species
##  [1,]          5.1         3.5          1.4         0.2       1
##  [2,]          4.9         3.0          1.4         0.2       1
##  [3,]          4.7         3.2          1.3         0.2       1
##  [4,]          4.6         3.1          1.5         0.2       1
##  [5,]          5.0         3.6          1.4         0.2       1
##  [6,]          5.4         3.9          1.7         0.4       1
##  [7,]          4.6         3.4          1.4         0.3       1
##  [8,]          5.0         3.4          1.5         0.2       1
##  [9,]          4.4         2.9          1.4         0.2       1
## [10,]          4.9         3.1          1.5         0.1       1
```

---

# Tensor

(3d)

![](https://github.com/curso-r/deep-learning-R/blob/master/3d-tensor.png?raw=true)

---

# Tensor

(4d)
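In base R, tensors of any rank can be sketched as arrays, with the rank given by `length(dim(x))`. The image sizes below (28 x 28 pixels, 3 channels, batch of 10) are illustrative assumptions.

```r
# 2D tensor: a matrix (rows = observations, columns = features)
x2 <- data.matrix(iris)          # Species is converted to integer codes
dim(x2)

# 3D tensor, e.g. one 28 x 28 image with 3 colour channels
x3 <- array(0, dim = c(28, 28, 3))

# 4D tensor, e.g. a batch of 10 such images
x4 <- array(0, dim = c(10, 28, 28, 3))

length(dim(x4))                  # rank of the tensor
```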

---

# TensorFlow

.pull-left[
  ![](https://github.com/curso-r/deep-learning-R/blob/master/flow.gif?raw=true)
]

.pull-right[
  - Define the graph
  - Compile and optimize
  - Execute
  - Nodes are calculations
  - Tensors *flow* along the edges between nodes.
]

---

# Keras

.large[

* API used to specify deep learning models in an intuitive way.

]

.large[

* Created by François Chollet (@fchollet).

]

.large[

* Originally implemented in Python.

]

---

# Keras + R

.large[

* R package: [`keras`](https://github.com/rstudio/keras).
* Built on [reticulate](https://github.com/rstudio/reticulate).
* Developed by JJ Allaire (CEO at RStudio).
* R-like syntax using `%>%`.

]
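A minimal sketch of the pipe-based syntax, assuming the `keras` R package and a working TensorFlow backend are installed. The model (logistic regression on 4 inputs written as a one-layer network) is an illustrative choice, not one of the examples from this talk.

```r
library(keras)

# Logistic regression written as a one-layer network (4 input features assumed)
model <- keras_model_sequential() %>%
  layer_dense(units = 1, activation = "sigmoid", input_shape = 4)

# Choose the loss, the optimizer and a metric to monitor
model %>% compile(
  loss = "binary_crossentropy",
  optimizer = "sgd",
  metrics = "accuracy"
)

summary(model)
```

Training would then be a call such as `model %>% fit(x, y, epochs = 10)`, where `x` and `y` stand for a numeric feature matrix and a 0/1 target that are not defined here.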

---

# Keras for R

---

# Example 01

---

# Activation

---

# Activation problems

---

# Example 02

---

# Example 03

---

# Convolutions

---

# Max Pooling

![](https://user-images.githubusercontent.com/4706822/48281479-df94a980-e43d-11e8-9dcf-e67d7ba053e4.png)
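A base R sketch of 2 x 2 max pooling with stride 2 on a single-channel image. The function name `max_pool_2x2` and the input values are made up for illustration and are not taken from the figure.

```r
# Keep the largest value in each non-overlapping 2 x 2 window.
max_pool_2x2 <- function(img) {
  out_rows <- nrow(img) %/% 2
  out_cols <- ncol(img) %/% 2
  out <- matrix(NA_real_, out_rows, out_cols)
  for (i in seq_len(out_rows)) {
    for (j in seq_len(out_cols)) {
      window <- img[(2 * i - 1):(2 * i), (2 * j - 1):(2 * j)]
      out[i, j] <- max(window)
    }
  }
  out
}

img <- matrix(c(1, 3, 2, 4,
                5, 6, 7, 8,
                3, 2, 1, 0,
                1, 2, 3, 4), nrow = 4, byrow = TRUE)

pooled <- max_pool_2x2(img)
pooled   # 2 x 2 matrix: rows (6, 8) and (3, 4)
```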

---

# Convolutions

![](https://user-images.githubusercontent.com/4706822/48281946-6bf39c00-e43f-11e8-845c-2d08570d85a6.png)
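A base R sketch of the operation usually called convolution in deep learning: a kernel slides over the image and, at each position, the element-wise products are summed (no kernel flipping, stride 1, no padding). The function name `conv2d`, the image and the kernel are illustrative assumptions.

```r
conv2d <- function(img, kernel) {
  k_rows <- nrow(kernel)
  k_cols <- ncol(kernel)
  out <- matrix(0, nrow(img) - k_rows + 1, ncol(img) - k_cols + 1)
  for (i in seq_len(nrow(out))) {
    for (j in seq_len(ncol(out))) {
      # Patch of the image currently covered by the kernel
      patch <- img[i:(i + k_rows - 1), j:(j + k_cols - 1), drop = FALSE]
      out[i, j] <- sum(patch * kernel)
    }
  }
  out
}

img <- matrix(1:16, nrow = 4, byrow = TRUE)
kernel <- matrix(c(1,  0,
                   0, -1), nrow = 2, byrow = TRUE)

feature_map <- conv2d(img, kernel)
feature_map   # 3 x 3 matrix; every entry is -5 for this image and kernel
```

The kernel subtracts the lower-right neighbour from each pixel, and in this image that neighbour is always 5 larger, which is why the feature map is constant.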

---

# Binary Cross-Entropy

---

# Example 04

---

# Categorical Cross-Entropy

---

# Example 05

---

# Stalk us

- Bruna Wundervald: [brunaw.com](https://brunaw.com)
- Curso-R: [contato@curso-r.com](mailto:jtrecenti@curso-r.com)
- CONRE-3: [jtrecenti@conre3.org.br](mailto:jtrecenti@conre3.org.br)

## Pages:

- https://brunaw.com
- https://curso-r.com
- https://github.com/brunaw
- https://github.com/curso-r

Presentation: https://jtrecenti.github.io/slides/emr-dl/

Code: https://github.com/jtrecenti/slides