--- title: "smoothPRC" output: rmarkdown::html_vignette bibliography: PRCbibliography.bib link-citations: yes vignette: > %\VignetteIndexEntry{smoothPRC} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup} library(PRC) ``` # The PRC package {.unnumbered} The aim of the `PRC` package is to help ecologists unravel time-dependent treatment effects from multivariate response **Y** in longitudinal designed experiments. Principal response curve analysis (PRC) [@vandenBrink1999PRC] uses partial redundancy analysis with model $Time + Time:Treatment$ with $Time$ and $Treatment$ as factors. Here $Time$ is a covariable in the language of ordination [@legendre2012numerical; @Braak1988theory]. Partial redundancy analysis (RDA) is equivalent to reduced rank regression with concomitant variables, where, in PRC, $Time$ is a concomitant factor variable (that is, a set of concomitant indicator variables) [@Braak2023redundancy; @Braak1994biplots; @vandenBrink1999PRC]. Version 2.0.0 of the `PRC` package adds smoothPRC, which produced smooth PRC curves. See https://github.com/CajoterBraak/PRC # Theory of PRC and smoothPRC PRC is is least-squares reduced rank regression with $Time$ as concomitant factor variable [@vandenBrink1999PRC; @Braak2023redundancy]. In SmoothPRC, a ridge-type penalty is applied to the coefficients forming the ordination axes, also called latent variables. With such a penalty the penalized least-square solutiuon is still an eigen problem. The eigen problem in smoothPRC includes penalty paramaters (smoothness parameters), which thus need to estimated also. In `smoothPRC` this is achieved using the `LMMsolver` package [@Boer2023]. All eigen vectors can be computed simultaneously in classical PRC using a singular value decomposition. In smoothPRC different eigen vectors may need different smoothness parameters, so that it is natural to calculated them in turn. This note focuses on the first (dominant) eigen vector. Subsequent eigen vectors can be obtained by specifyin previous vectors as covariables (in formula `fixed`). ## From multivariate regression to partial redundancy analysis Define the model: **Y** = **QA**$_{0}$ + **Z**$_{0}$**M** + **E** with **Y** the response matrix, **Q** a matrix with concomitant variables, **Z**$_{0}$ a matrix of predictor, **E** an error matrix, and **A**$_{0}$ and **M** matrices of unknown regression coefficients that must be estimated from the data **Y**, **Q** and **Z**${_0}$. In partial RDA, the focus lies on a low-dimensional representation of the effects of **Z**${_0}$ on the response, that is on **M**, with the effects of **Q** partial out. To this aim and with $\Pi_{Q}$ the orthogonal projector on **Q**, define **Z** = **Z**$_{0}$ - $\Pi_{Q}$**Z**$_{0}$, the matrix of residuals of the regression of **Z**$_{0}$ on to **Q**. Now, **Q** and **Z** are orthogonal and span the same space as **Q** and **Z**$_{0}$. Moreover, the regression coefficients associated with **Z** are the same as those with **Z**$_{0}$, while the regression coefficients associated with **Q** change. The original model can thus be rewritten to **Y** = **QA** + **ZM** + **E** The low-dimensional representation is obtained by a rank-restriction on **M**. With the rank-1 restriction **M** = **cb**$^T$, where **c** is a vector of canonical coefficients and **b** is a vector of loadings (species scores), **ZM** = **Zcb**$^T$ = **xb**$^T$, which is a partial PCA model with constraint **x** = **Zc** with **x** the constrained site scores. Partial redundancy analysis is thus both reduced rank regression with concomitant variables and a partial PCA with constraints. Note that **A** in the rewritten formula can be obtained by regression **Y** on **Q** only. ## The transition formulas of classical partial redundancy analysis The transition formulas specify the eigen problem of partial RDA and are useful to solve for the first eigen vector by iteration. The convergence of the iteration is accelerated by blocking and Anderson iteration. The transition formulas of partial redundancy analysis are: **x**$^*$ = **Yb** **c** = (**Z**$^T$**Z**)$^{-1}$**Z**$^T$**x**$^*$ **x** = **Zc** $\lambda$**b**=**Y**$^T$**x** This set of equation forms the eigen problem $\lambda$**b**=**Y**$^T$**Z**(**Z**$^T$**Z**)$^{-1}$**Z**$^T$**Yb** This eigen equation is obtained by substitution of symbols, starting with the last transition formula and ending with substitution of **x**$^*$ to **Yb** of the first equation. In classical PRC, **Q** is an indicator matrix defined by the factor time and **Z**$_{0}$ is an indicator matrix defined by the interaction of the factors time and treatment. In words, the main effects of the factor time are partialled out of the interaction effects and the reduced rank (dimension reduction) is applied to the time-dependent treatment effects. In classical PRC, **x** = **Zc** with **Z** orthogonal to **Q** and **c** obtained from regression of **x**$^*$ on to **Z**. In smoothPRC below, **Z** is not explicitly easily available, whereas **Z**$_{0}$ is. But, the regressions of **x**$^*$ on **Q** and **Z**$_{0}$ and of **x**$^*$ on **Q** and **Z** yield identical fitted values, say **y**$_{1}$. Consequently, **x** = **Zc** = **y**$_{1}$ - **Qa** where **y**$_{1}$ is obtained from a regression on **Q** and **Z**$_{0}$ and **Qa** from a regresion of **x**$^*$ on **Q** alone. In pseudo R-code: **x** = fitted(lm(**x**$^*$ \~ **Q** + **Z**$_{0}$)) - fitted(lm(**x**$^*$ \~ **Q**)). By consequence, **Zc** can be computed as a difference of the fitted values of two models, the full model with **Q** and **Z**$_{0}$ and a null model with **Q** alone; in PRC, the model with time and treatment and the model with time only, respectively. ## The transition formulas of smooth PRC In smoothPRC time is treated as a quantitative variable. The model is allowed to be highly non-linear by transformation to a B-spline basis. The single variable time is thereby replaced by a set of variables (B-splines), like time is replaced by a set of indicator variables in classical PRC. As there can and should be many variables in the B-spline basis, the model is over-parametrized and the partial RDA becomes an unconstrained partial PCA. That is why penalized partial RDA is used in smoothPRC. With a treatment with a quantitative basis, \emph{e.g.} a dose, smoothPRC uses a two-dimensional tensor P-spline [@Boer2023], that is, a Kronecker product of B-splines with two penalty parameters, one for the time and the other for treatment. In smoothPRC, the main effect of time is a one-dimensional smooth in time. Both models are fitted using the R package LMMsolver [@Boer2023]. The main effects of time cannot be partialled out as in classical PRC as the smooths are not projectors. Therefore, the formula in classical PRC, in pseudo R-code **x** = **Zc** = fitted(**x**$^*$ \~ **Q** + **Z**$_{0}$)) - fitted(lm(**x**$^*$ \~ **Q**)) is replaced in smoothPRC by the difference of the fitted values of a model with both time and treatment and the fitted values of a null model with only time. In the default smoothPRC model, it is the difference of the fitted values of a 2d-spline of time and treatment and the fitted values of a 1d-spline of time. In summary, the transition formulas of smoothPRC are the three equations, of which the second is motivated above, **x**$^*$ = **Yb** **x** = fitted(smooth of time and treatment) - fitted(smooth of time) $\lambda$**b**=**Y**$^T$**x** ## After convergence After convergence, **x**$^*$ is replaced by **x**$^*$ - fitted(smooth of time), so a to remove the time effect from the 'unconstrained' site scores. This is the $x\_star$ entry in the output of smoothPRC. Also, the PRC graph takes one level or value of the treatment a reference level or value. After convergence, the values of the PRC curves are obtained from the full model by subtracting from the fitted values the predictions for the data set derived from the original data with all treatment levels or values replaced by the reference level or value (0). This is the $PRC$ entry in the output of smoothPRC. # References