I’ve spent more than ten years carrying out research in ecology and evolutionary biology. My main tools for this research come from data science, data visualization, and computer simulations, primarily in the programming language R. For most of my work, the best tools for the job are some combination of mixed effects models, generalized additive models, bootstrapping, or permutation testing, and I am well-versed in the theory and application of these. That said, my training is quite broad, and I’m comfortable using many tools; see my resume for a more complete list. I also have a strong grounding in statistical theory and computer science fundamentals, and I am comfortable teaching myself new statistical approaches, software, or programming languages.
Some highlights of my work, from a data science perspective:
- Developing reproducible analysis pipelines for numerous analyses of ecological data. Some examples that have been published along with peer-reviewed research include this project analyzing butterfly populations and this project looking at milkweed plant defenses. An example of ongoing work looking at trends in butterflies can be found here.
- Developing new data science tools, including methods to estimate butterfly population characteristics from transect data using linear and mixed effects models, and an extension of Random Forests to identify complex trait interactions. I’ve been particularly pleased to see the butterfly analysis tool being used in understanding invasive mosquitoes and in salamander conservation.
- Developing numerous data visualization products, including an interactive app showing preliminary results for my analysis of 260 butterfly species across North America, this interactive app to look at available data for several hundred butterfly species across North America, this interactive app to look at trends in abundance and the timing of butterfly activity for 31 at-risk species, and this interactive app to look at how ecological models match up with microbial data. I also contributed to this incredible StoryMap of at-risk butterflies.
- Developing several R packages, most notably this package to simulate and evaluate butterfly transect data as part of ongoing work to identify best practices for estimating butterfly abundance and timing.
- Teaching a semester-long data science course at Tufts University, which was highly reviewed by students and was nominated for a teaching award. Students especially enjoyed the integration of interviews with field ecologists, systematic explanations of data science techniques, and then applications of those techniques to data provided by the field ecologists.
- Mentoring students in data science and quantitative methods. This includes many hours of ad-hoc mentorship, as well as more formal mentorship of Cassandra Doll (who defended her Master’s in 2021), Michael Song (who graduates from Tufts this year and has just been accepted to dental school), and Dr. Jessica Rozek Cañizares (who defended in fall 2022).
- Telling data science stories in presentations at more than a dozen professional conferences and department seminars. Here’s a presentation I gave about the declining western Monarch butterfly at the “Asilomar” conference in 2021. (Happily, the population experienced a massive surge in 2021, and in 2022 the fall population count was more than a quarter of a million individuals. The scientific community is still working to determine exactly what caused this surprise bounce-back.)