Genes affect most biological traits and often interact with other variables, such as those found in the environment. When we build models of these interactions to investigate which genes and environmental parameters affect a particular trait, say human height, we ideally estimate the effects of all the variables simultaneously and also estimate how certain we are that these variables do in fact affect the trait of interest. Such an analysis is not simple, and many pitfalls exist. FW 599 Statistical Genetics will guide students through statistical genetic analyses and allow them to show what they have learned in an independent project carried out throughout the course. It will be a graduate-level course of ~15-20 students that meets in class once a week. The prerequisites will be a genetics class, an introductory statistics class, and programming knowledge.

Class time will be divided into a review of common difficulties students had with the previous week’s homework assignment, followed by a lecture. Each lecture will focus on a particular theory of statistical genetic model building, interspersed with real-world problems that will be solved by applying the theory in code. The weekly homework will assess the students’ mastery of the theory by requiring them to evaluate some aspect of a real-world problem, say the uncertainty estimate of a genetic parameter. The homework assignments, worth 50% of the final grade, will be submitted in the form of a program with explanatory text and will be shared with the rest of the class upon submission. I haven’t found the perfect submission platform, but I’m leaning toward using Gradescope. It is also possible to integrate code and explanatory text so that the code is run automatically and a single document is generated from the explanatory text and the code’s output; however, this might be complicated to execute. The point of sharing the code with the rest of the class is to let students see possibly more efficient ways of coding an analysis. Further, I will be making all my comments in this “public” sphere, which I hope will encourage discussion among students. Students’ grades will not be public.

In addition to the homework assignments, a project worth 50% of the final grade will be assigned. It can be done either alone or in a group chosen by the students. For the project, students will find a data set they like, which could be one found online or one they use for their research, and analyze it using the methodology developed during the course to answer one or more questions of their choosing. The project will be submitted in three phases. The first phase will be a summary of the data set and an evaluation of the possible causal links present or absent in the data. The second phase will be the full analysis of the data set, written up. The third phase will be a presentation of the write-up. As with the homework, the first and second phases of the project will be submitted such that all other students can see and comment on the work.