---
title: "MSP Data Analysis"
author: "Stephen Nowicki (snowicki@duke.edu) and Sönke Johnsen (sjohnsen@duke.edu), Duke University, for correspondence, and Eleanor Caves (eleanor.caves@gmail.com) for R scripts"
date: "8/25/2018"
output: html_document
---
R Markdown file for EM Caves, LE Schweikert, PA Green, MN Zipple, C Taboada, S Peters, S Nowicki, and S Johnsen. “Variation in carotenoid-containing retinal oil droplets correlates with variation in perception of carotenoid coloration.” Behavioral Ecology and Sociobiology.
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
require(knitr)
require(scales)
require(dplyr)
require(gdata)
```
```{r calculate-lambda-mid, echo=TRUE}
##If multiple files are to be analyzed at once, wrap the following code in a loop
s<-read.table(file.choose(), sep="\t", header=F, skip=14) ##read in the raw data; skip 14 lines because the first 14 lines of Spectra Suite files contain header information, not wavelength or transmittance data
s$wavelength<-s$V1
s$reflectance<-s$V2 ##V2 holds the transmittance readings; the column is named "reflectance" for consistency with the rest of the script
spec<-aggregate(as.numeric(s$reflectance),list(as.integer(s$wavelength)),FUN=mean) #Average the readings within each whole-nanometer wavelength, to get one reading per nanometer instead of three
names(spec)<-c("wavelength","reflectance") #give names to the columns in the spec dataframe
range<-subset(spec,wavelength>=520&wavelength<=680) ##Constrain the file to the wavelengths of interest, i.e. those in which lambda mid occurs. This reduces noise from outside the region of interest and improves the logistic fit.
# Set your range
x = range$wavelength
y = range$reflectance
# Rescale X and Y to new ranges and build new data frame
x2 = rescale(x,to=c(-1,1))
y2 = rescale(y,to=c(0,1))
data = data.frame(x,y,x2,y2)
# Fit logistic model
fit2 <- nls(y2 ~ SSlogis(x2, Asym, xmid, scal), data = data)
#Plot results
plot(y2 ~ x2, main = "520-680 nm", xlab = "Wavelength", ylab = "Transmission", axes = FALSE)
lines(x2, predict(fit2, newdata = data.frame(x2 = x2)), col = "red", lwd = 3)
box()
#Rescale the axis back to original
axis(1, at = axTicks(1), labels = round(rescale(axTicks(1), to = c(min(x), max(x)))))
axis(2, las = 1, at = axTicks(2), labels = round(rescale(axTicks(2), to = c(min(y), max(y)))))
# Calculate and plot midpoints
x2.mid = coef(fit2)[2]
y2.mid = predict(fit2, newdata = data.frame(x2 = coef(fit2)[2]))[1]
points(x2.mid,y2.mid, pch = "X", col = "blue", cex = 2)
# Rescale back to original values
x.mid = rescale(c(x2.mid, x2), to = c(min(x),max(x)))[1]
y.mid = rescale(c(y2.mid, y2), to = c(min(y),max(y)))[1]
##Calculate goodness of fit of the model using R-squared and Root Mean Squared Error
rescaled_y<-rescale(predict(fit2), to = c(min(y),max(y)))
actual<-range$reflectance
predicted<-rescaled_y ##avoid naming this "predict", which would mask the predict() function
R2 <- 1 - (sum((actual-predicted)^2)/sum((actual-mean(actual))^2)) ##R-squared: the proportion of variance in the data explained by the fitted curve
RMSE<-sqrt(mean((predicted-actual)^2)) ##Root Mean Squared Error is another measure of how well the logistic curve fits our data
```
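As a sanity check on the rescale-and-fit procedure above (in particular, the trick of prepending the fitted midpoint to the rescaled vector before mapping back to nanometers), the pipeline can be run on synthetic data with a known lambda mid. The 600 nm midpoint, slope, and noise level below are invented for illustration only.

```{r synthetic-midpoint-check}
set.seed(42)
wl <- 520:680                                  ## wavelengths of interest, as above
true.mid <- 600                                ## hypothetical "true" lambda mid
trans <- 1/(1 + exp(-(wl - true.mid)/8)) +     ## idealized logistic transmission curve
  rnorm(length(wl), sd = 0.01)                 ## plus a little measurement noise
x2 <- rescale(wl, to = c(-1, 1))               ## rescale exactly as in the main script
y2 <- rescale(trans, to = c(0, 1))
fit <- nls(y2 ~ SSlogis(x2, Asym, xmid, scal), data = data.frame(x2, y2))
## map the fitted midpoint back to nanometers with the same rescale trick
x.mid <- rescale(c(coef(fit)["xmid"], x2), to = c(min(wl), max(wl)))[1]
x.mid ## should recover a value close to 600 nm
```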
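The R-squared and RMSE formulas at the end of the chunk can also be checked by hand on a small made-up example; the four actual/predicted values below are illustrative only.

```{r goodness-of-fit-example}
actual    <- c(1, 2, 3, 4)
predicted <- c(1.1, 1.9, 3.2, 3.8)
## same formulas as in the chunk above
R2   <- 1 - sum((actual - predicted)^2)/sum((actual - mean(actual))^2)
RMSE <- sqrt(mean((predicted - actual)^2))
c(R2 = R2, RMSE = RMSE) ## R2 = 0.98, RMSE = sqrt(0.025), about 0.158
```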
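The comment at the top of the chunk suggests converting the code into a loop when several transmission files must be analyzed at once. One way to do that is sketched below on synthetic Spectra Suite-style files (14 header lines, tab-separated wavelength and transmittance columns) written to a temporary directory; the `lambda_mid()` helper, the file names, and the 580/620 nm midpoints are all assumptions for illustration, not part of the original analysis.

```{r batch-processing-sketch}
## wrap the fitting steps above into a helper that returns lambda mid for one file
lambda_mid <- function(path) {
  s <- read.table(path, sep = "\t", header = FALSE, skip = 14)
  spec <- aggregate(as.numeric(s$V2), list(as.integer(s$V1)), FUN = mean)
  names(spec) <- c("wavelength", "reflectance")
  rng <- subset(spec, wavelength >= 520 & wavelength <= 680)
  x2 <- rescale(rng$wavelength, to = c(-1, 1))
  y2 <- rescale(rng$reflectance, to = c(0, 1))
  fit <- nls(y2 ~ SSlogis(x2, Asym, xmid, scal), data = data.frame(x2, y2))
  rescale(c(coef(fit)["xmid"], x2),
          to = c(min(rng$wavelength), max(rng$wavelength)))[1]
}

## fabricate two example files: 14 header lines, then wavelength/transmittance pairs
set.seed(1)
dir <- tempdir()
for (mid in c(580, 620)) {
  wl <- seq(500, 700, by = 1/3)                    ## three readings per nanometer
  tr <- 1/(1 + exp(-(wl - mid)/8)) +               ## idealized logistic curve
    rnorm(length(wl), sd = 0.002)                  ## plus a little noise
  writeLines(c(rep("header line", 14), paste(wl, tr, sep = "\t")),
             file.path(dir, paste0("example_", mid, ".txt")))
}

## loop over the files instead of calling file.choose() once per file
files <- list.files(dir, pattern = "^example_.*\\.txt$", full.names = TRUE)
results <- data.frame(file = basename(files),
                      lambda.mid = sapply(files, lambda_mid))
results ## lambda.mid should come out near 580 and 620 nm
```

In a real batch run, `dir` would point at the folder of Spectra Suite exports, and `results` could be written out with `write.csv()`.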