---
title: "MSP Data Analysis"
author: "Stephen Nowicki (snowicki@duke.edu) and Sönke Johnsen (sjohnsen@duke.edu), Duke University, for correspondence, and Eleanor Caves (eleanor.caves@gmail.com) for R scripts"
date: "8/25/2018"
output: html_document
---

R Markdown file for EM Caves, LE Schweikert, PA Green, MN Zipple, C Taboada, S Peters, S Nowicki, and S Johnsen. "Variation in carotenoid-containing retinal oil droplets correlates with variation in perception of carotenoid coloration." Behavioral Ecology and Sociobiology.

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
require(knitr)
require(scales)
require(dplyr)
require(gdata)
```

```{r calculate lambda mid for a transmission file, echo=TRUE}
## If multiple files are to be analyzed at once, transform the following code into a loop
s <- read.table(file.choose(), sep = "\t", header = FALSE, skip = 14)  ## read in the raw data; skip 14 lines because the first 14 lines of SpectraSuite files are not wavelength or transmittance data
s$wavelength <- s$V1
s$reflectance <- s$V2
spec <- aggregate(as.numeric(s$reflectance), list(as.integer(s$wavelength)), FUN = mean)  # average the reflectance by integer wavelength, to get one reading per nanometer instead of three
names(spec) <- c("wavelength", "reflectance")  # name the columns of the spec data frame
range <- subset(spec, wavelength >= 520 & wavelength <= 680)  ## constrain the data to the wavelengths of interest, i.e. those in which lambda mid occurs; this reduces noise from outside the region of interest and improves the logistic fit
# Set your range
x <- range$wavelength
y <- range$reflectance

# Rescale x and y to new ranges and build a new data frame
x2 <- rescale(x, to = c(-1, 1))
y2 <- rescale(y, to = c(0, 1))
data <- data.frame(x, y, x2, y2)

# Fit the logistic model
fit2 <- nls(y2 ~ SSlogis(x2, Asym, xmid, scal), data = data)

# Plot the results
plot(y2 ~ x2, main = "520-680 nm", xlab = "Wavelength", ylab = "Transmission", axes = FALSE)
lines(x2, predict(fit2, newdata = data.frame(x2 = x2)), col = "red", lwd = 3)
box()

# Rescale the axes back to the original units
axis(1, at = axTicks(1), labels = round(rescale(axTicks(1), to = c(min(x), max(x)))))
axis(2, las = 1, at = axTicks(2), labels = round(rescale(axTicks(2), to = c(min(y), max(y)))))

# Calculate and plot the midpoint
x2.mid <- coef(fit2)[2]
y2.mid <- predict(fit2, newdata = data.frame(x2 = coef(fit2)[2]))[1]
points(x2.mid, y2.mid, pch = "X", col = "blue", cex = 2)

# Rescale the midpoint back to the original values
x.mid <- rescale(c(x2.mid, x2), to = c(min(x), max(x)))[1]
y.mid <- rescale(c(y2.mid, y2), to = c(min(y), max(y)))[1]

## Calculate the goodness of fit of the model using R-squared and root mean squared error
rescaled_y <- rescale(predict(fit2), to = c(min(y), max(y)))
actual <- range$reflectance
predicted <- rescaled_y  # renamed from "predict" so it does not mask the predict() function
R2 <- 1 - sum((actual - predicted)^2) / sum((actual - mean(actual))^2)  ## R-squared, which you're probably familiar with
RMSE <- sqrt(mean((predicted - actual)^2))  ## root mean squared error, another measure of how well the logistic curve fits our data
```
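The first comment in the chunk above suggests turning the code into a loop when several transmission files are analyzed at once. A minimal sketch of such a loop, assuming the raw SpectraSuite `.txt` files sit in a single folder; the folder name `msp_spectra` and the layout of the `results` data frame are assumptions, not part of the original script:

```r
## Batch version of the lambda-mid calculation (hypothetical folder and output layout)
library(scales)

files <- list.files("msp_spectra", pattern = "\\.txt$", full.names = TRUE)
results <- data.frame(file = character(0), lambda_mid = numeric(0))

for (f in files) {
  s <- read.table(f, sep = "\t", header = FALSE, skip = 14)
  names(s) <- c("wavelength", "reflectance")
  # one averaged reading per nanometer, restricted to 520-680 nm as above
  spec <- aggregate(reflectance ~ as.integer(wavelength), data = s, FUN = mean)
  names(spec) <- c("wavelength", "reflectance")
  rng <- subset(spec, wavelength >= 520 & wavelength <= 680)
  x2 <- rescale(rng$wavelength, to = c(-1, 1))
  y2 <- rescale(rng$reflectance, to = c(0, 1))
  fit <- try(nls(y2 ~ SSlogis(x2, Asym, xmid, scal),
                 data = data.frame(x2 = x2, y2 = y2)), silent = TRUE)
  if (inherits(fit, "try-error")) next  # skip files where the logistic fit fails
  lambda_mid <- rescale(c(coef(fit)["xmid"], x2),
                        to = c(min(rng$wavelength), max(rng$wavelength)))[1]
  results <- rbind(results, data.frame(file = basename(f), lambda_mid = lambda_mid))
}
```

The `try()` guard matters in batch runs: `SSlogis` can fail to find starting values on noisy spectra, and one bad file should not abort the whole loop.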
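In the chunk above, `x.mid` is recovered by prepending the fitted midpoint to the rescaled vector and taking the first element of `rescale()`'s output. Because `x2` spans [-1, 1], the same back-transformation can be written in closed form; `back_transform()` below is an illustrative helper (base R only), not part of the original script:

```r
## Closed-form back-transformation from the [-1, 1] fit scale to original units
back_transform <- function(x2.mid, x.min, x.max) {
  x.min + (x2.mid - (-1)) / (1 - (-1)) * (x.max - x.min)
}

# A midpoint of 0 on the [-1, 1] scale maps to the center of the 520-680 nm window
back_transform(0, 520, 680)  # 600
```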
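As a sanity check on the two goodness-of-fit formulas, both can be evaluated on any pair of actual/predicted vectors: a perfect fit gives R-squared = 1 and RMSE = 0, and worse fits move R-squared toward 0 and RMSE upward. The vectors below are toy numbers for illustration, not MSP data:

```r
## Illustrative check of the R-squared and RMSE formulas used above
actual    <- c(10, 20, 30, 40)
predicted <- c(12, 19, 29, 42)

R2   <- 1 - sum((actual - predicted)^2) / sum((actual - mean(actual))^2)
RMSE <- sqrt(mean((predicted - actual)^2))

round(R2, 3)    # 0.98
round(RMSE, 3)  # 1.581
```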