The CGRB’s biocomputing infrastructure was highlighted in AMD CEO Lisa Su’s keynote speech at the Consumer Electronics Show (CES) in Las Vegas on January 9, 2019. Watch below:
This blog post was originally published on September 10, 2018 and written by Christopher M. Sullivan, Assistant Director for Biocomputing. Read the whole article here.
Oregon State University’s Center for Genome Research and Biocomputing (CGRB) and the Plankton Ecology Lab at OSU Hatfield have been collaborating on an image processing pipeline that automates the classification of in situ images of plankton: microscopic organisms at the base of the food web in the world’s oceans and freshwater ecosystems. The imagery collected on a 10-day cruise typically amounts to approximately 80 TB of video, which in some cases converts into image data yielding several billion segments, each representing an individual plankton organism or particle that needs to be identified; a nearly impossible task for human experts to carry out manually. While we have a fully functional Convolutional Neural Network (CNN) algorithm that does an excellent job of predicting the identity of the plankton organisms or particles, we have been limited by GPU computational capabilities. We started working with PCI bus-based Tesla K40 and K80 GPUs, which were good enough to manage millions of segments. However, scaling to billions of segments became a nearly insurmountable challenge.
Install the DT package from CRAN
First, install and load the DT package. Open RStudio and run the following commands:
# Install the DT package
install.packages("DT")
# Load the DT package
library(DT)
The print function is not the most effective way to display a table in an HTML R Markdown report.
Now let’s look at the datatable function for comparison. The input to the datatable function is a data frame or matrix. Let’s make a table with the preloaded iris data, which is a data.frame. The basic call is DT::datatable(iris), but in this example I’ve added the filter option at the top of the table and limited the number of entries to 5 per page. See the code and table features below:
datatable(iris, filter = "top", options = list(pageLength = 5))
A screen shot of the output looks like:
NUMBER OF ENTRIES TO DISPLAY
You’ll notice a drop-down menu that says “Show 5 entries”. The default is 10, but I set it to 5 with the option pageLength = 5. One may choose how many entries to show using the drop-down menu.
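The choices offered in that drop-down menu can be adjusted too. The following is a minimal sketch, assuming the lengthMenu option of the underlying DataTables library is passed through the options list alongside pageLength; the particular page sizes are arbitrary picks for illustration.

```r
library(DT)

# Start with 5 rows per page and offer 5, 10, 25 or 50 rows in the
# "Show entries" drop-down menu (the sizes are illustrative choices).
tbl <- datatable(
  iris,
  filter = "top",
  options = list(
    pageLength = 5,
    lengthMenu = c(5, 10, 25, 50)
  )
)

# Printing the object (or leaving it as the last line of an R Markdown chunk)
# renders the interactive table.
tbl
```

Keeping pageLength among the lengthMenu values means the menu shows the current page size as one of its entries.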
The widget also includes a search bar in the top-right corner, which can be very useful when interactively exploring data. Note that the bottom of the table shows how many entries (rows) were found and are being displayed.
Notice that to the right of each column name are two arrows. One may sort in ascending or descending order, and the blue arrow indicates the direction in which the column is sorted.
The datatable function also allows users to filter each column depending on its data type: numeric columns are filtered with a slider, and columns of class factor with a drop-down menu. One must add filter = "top" (or "bottom") to the call to enable this feature.
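Because the filter control depends on the column class, it can help to check the classes before building the table. Here is a small sketch in base R; the filter_type labels are only my shorthand for the controls described above, not values used by DT itself.

```r
# Class of each column in the iris data frame
col_classes <- sapply(iris, class)
col_classes

# Expected filter control per column, following the rule described above:
# numeric -> slider, factor -> drop-down menu (anything else is left as "other")
filter_type <- ifelse(
  col_classes == "numeric", "slider",
  ifelse(col_classes == "factor", "drop-down menu", "other")
)
filter_type
```

For iris, the four measurement columns are numeric and Species is a factor, so only Species would get a drop-down menu.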
Another useful aspect of the datatable function is the “Buttons” extension. This enables users to copy the table, save it as a CSV, Excel or PDF file, or print the table. The table “remembers” what you’ve changed so far; if you sort by Sepal Length, filter petal width to > 1 and select species “versicolor”, the copied/saved table will have these same restrictions.
datatable(iris, extensions = 'Buttons', options = list(dom = 'Bfrtip', buttons = c('copy', 'csv', 'excel', 'pdf', 'print')))
The above code adds “buttons” to the top of the table like so:
If one clicks “Copy”, the table is copied to the clipboard; “CSV”, “Excel” or “PDF” saves the table to the given file type; and “Print” puts the table into a print-friendly format and brings up the print dialog box.
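To make the “remembered” state concrete, here is a rough base R equivalent of the sorted and filtered table from the example above (sorted by sepal length, petal width > 1, species versicolor). This is only a sketch of what the exported rows should correspond to; it does not use DT, and the name exported is mine.

```r
# Keep versicolor rows with petal width greater than 1
exported <- subset(iris, Petal.Width > 1 & Species == "versicolor")

# Sort by sepal length in ascending order, as if the column header had been clicked
exported <- exported[order(exported$Sepal.Length), ]

head(exported)
```

Comparing a saved CSV against a data frame built this way is a quick check that the export contains the rows you expect.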
Links and Color
One may also include links in a table. Say you made a data frame with links that you want to work in your HTML report, for example a data frame of variants with links to their positions in a genome browser. This is done by not escaping the content of the table, specifically the column with the links. The links are written in HTML and must not be escaped in order to render. This applies to other HTML as well, including color. For me, it was confusing that I had to not escape the HTML columns; I got it completely backwards the first time I tried it. NOTE: > got replaced with “& gt;” (with no spaces) when it is rendered on the blog… Need to find a fix!
# Make dataframe
df.link <- data.frame(
  school = c("OSU", "UO", "Linfield", "Willamette"),
  mascot = c("beavers", "ducks", "wildcats", "bearcats"),
  website = c('<a href="http://oregonstate.edu/">oregonstate.edu</a>',
              '<a href="https://www.uoregon.edu/">uoregon.edu</a>',
              '<a href="https://www.linfield.edu/">linfield.edu</a>',
              '<a href="https://www.willamette.edu/">willamette.edu</a>'),
  School_colors = c('<span style="color:orange">orange & black</span>',
                    '<span style="color:green">green & yellow</span>',
                    '<span style="color:purple">purple and red</span>',
                    '<span style="color:red">red and yellow</span>')
)
# When the HTML columns (3 and 4 of the data frame) are not escaped, it works!
# The row names count as column 1 here, so c(1,2,3) escapes the row names, school and mascot.
datatable(df.link, escape = c(1,2,3))
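For the variant use case mentioned above, the link column can be generated from the coordinates rather than typed by hand. The sketch below uses made-up variants and a placeholder browser address (genome-browser.example.org is hypothetical, as are the names variants, base_url and region); it then escapes columns by name, which avoids having to count the row-name column.

```r
library(DT)

# Hypothetical variant coordinates, for illustration only
variants <- data.frame(
  chrom = c("chr1", "chr2"),
  start = c(10500L, 204300L),
  end   = c(10620L, 204850L),
  stringsAsFactors = FALSE
)

# Placeholder URL prefix; substitute the address of your own genome browser
base_url <- "https://genome-browser.example.org/view?position="

# Build "chr:start-end" labels and wrap each one in an HTML anchor tag
region <- sprintf("%s:%d-%d", variants$chrom, variants$start, variants$end)
variants$link <- sprintf('<a href="%s%s">%s</a>', base_url, region, region)

# Escape only the plain-text columns so the link column renders as HTML
tbl <- datatable(variants, escape = c("chrom", "start", "end"))
tbl
```

Only leave a column unescaped when you control its contents, since unescaped HTML is inserted into the report as-is.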
One may also hide columns from visibility and add a button to add the column back interactively. For example, say we have a data frame called sv.all.i.in. We can hide columns 3 and 4, which are long sequences and disrupt the readability of the table, with the following code:
datatable(sv.all.i.in,
          extensions = 'Buttons',
          options = list(
            dom = 'Bfrtip',
            columnDefs = list(list(visible = FALSE, targets = c(3, 4))),
            buttons = list(I('colvis'), c('copy', 'csv', 'excel', 'pdf', 'print'))
          ))
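Since sv.all.i.in is not defined in this post, here is a self-contained sketch of the same idea with a small invented data frame (sv.demo and its contents are hypothetical). It assumes row names are shown, in which case target index 0 refers to the row-name column and targets 3 and 4 line up with the third and fourth data columns.

```r
library(DT)

# Invented stand-in for sv.all.i.in: two short columns and two long sequence columns
sv.demo <- data.frame(
  id      = c("sv1", "sv2"),
  type    = c("DEL", "INS"),
  ref_seq = c(strrep("ACGT", 20), strrep("TTGA", 20)),
  alt_seq = c(strrep("A", 40), strrep("GC", 30)),
  stringsAsFactors = FALSE
)

# Hide the two sequence columns and add a column-visibility button to bring them back
tbl <- datatable(
  sv.demo,
  extensions = "Buttons",
  options = list(
    dom = "Bfrtip",
    columnDefs = list(list(visible = FALSE, targets = c(3, 4))),
    buttons = list(I("colvis"), c("copy", "csv"))
  )
)
tbl
```

If the table is built with rownames = FALSE, the targets would need to shift down by one, because the row-name column no longer occupies index 0.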
There are many more useful features that you can add to your datatable! Learn more here: https://rstudio.github.io/DT/
The CGRB 2019 Fall Conference registration is now open! Please join us for our annual event this September for informative talks, posters and a reception. This year the Fall Conference will also include lightning talks.
- When: Friday, September 20, 2019
- Where: CH2M Hill Alumni Center – Oregon State University
- Registration: here
- Registration fee: $25 (includes lunch and social hour)
- NOTE: registration fee is waived for:
- Undergrads presenting their research poster
- Lightning talk presenters
- Lightning Talk Submission deadline: August 15th
- Poster Registration deadline: September 7th
- Conference Registration deadline: September 13th (registration fee increases to $35 after Sept 13)
School of Pharmacy
University of Washington
Research Assistant Professor
Molecular and Medical Genetics
Oregon Health and Science University
Regenerative Medicine Research
Texas Heart Institute
|8:00 – 8:50||Registration & refreshments (Poster & sponsor setup)|
|8:50 – 9:15||Brett Tyler: Introduction, CGRB update|
|Hosted by Jaga Giebultowicz|
|9:15 – 9:40||Andrew Annalora, Environmental and Molecular Toxicology: Exploring Splice Variant Biology in Nuclear Receptor and Cytochrome P450 Genes|
|9:40 – 10:30||Ed Kelly, University of Washington: Organs on a Chip – Chips in Space|
|10:30 – 10:55||Break (Poster and Sponsor displays)|
|10:55 – 11:35||Morning Lightning Talks (8 talks), moderated by Jeff Anderson|
|11:35 – 12:00||Felipe Barreto: Genomics in the Tidepool: Functional and Population Genetics of Adaptation and Speciation in a|
|12:00 – 12:25||Kevin Brown, College of Pharmacy: Adventures in Complex Systems|
|12:25 – 1:25||Lunch (Poster and Sponsor displays)|
|Hosted by Craig Marcus|
|1:25 – 1:50||Afua Nyarko, Biochemistry and Biophysics: Selectivity and Specificity in Cancer Regulatory Proteins|
|1:50 – 2:40||Daniel Liefwalker, Oregon Health and Science University: Therapeutic strategies targeting c-MYC|
|2:40 – 3:20||Afternoon Lightning Talks (8 talks), moderated by Viviana Perez|
|3:20 – 3:45||Break 25 mins (Poster and Sponsor displays)|
|3:45 – 4:10||Morgan Giers, Chemical, Biological, and Environmental Engineering: Regenerating the Intervertebral Disc: Developing Effective Therapies in a Nutrient Limited Environment|
|4:10 – 5:00||Doris Taylor, Texas Heart Institute: Building Solutions for Heart Disease: A 2019 Update|
|5:00 – 7:30||Poster Session / Reception, Sponsor Displays|
Call for posters!
Invitation to present a Poster at the 2019 CGRB FALL CONFERENCE (Sept 20, 2019)
Students, Post Docs, Research Staff and Research Faculty are invited to present their research as a Poster. Presenters are strongly encouraged but not required to consider utilizing a revolutionary new trend in poster format: https://twitter.com/mikemorrison
(Posters in any format displayed at recent meetings are also welcome)
Prizes for Best Posters: $100 (Undergraduate, Graduate and Post Doc Categories).
All fields and research topics welcome. To submit a Poster, please navigate to https://beav.es/ZPd
DEADLINE Sept. 7, 2019.
Call for lightning talks!
Invitation to present a Lightning Talk at the 2019 CGRB FALL CONFERENCE (Sept. 20, 2019)
Students, Post Docs, Research Staff (FRA, Res. Associates, etc.), and Research Faculty are invited to present their research as a 5-minute Lightning Talk at the annual CGRB Fall Conference, Friday Sept. 20, 2019.
First Prize for Best Lightning Talk = $100
Conference Registration Fee is waived for all Lightning Talk Presenters
All fields and research topics welcome. To submit a lightning talk: Please navigate to beav.es/ZPA
Talks are limited to 5 minutes and 5 slides maximum. Please submit no later than August 15, 2019.
Talks will be selected by the Program Committee and Presenters notified by Aug. 31, 2019.
Thank you to our 2019 Fall Conference Committee:
Jaga Giebultowicz, Department of Integrative Biology
Craig Marcus, Environmental and Molecular Toxicology
Jeff Anderson, Department of Botany and Plant Pathology
Viviana Perez, Department of Biochemistry and Biophysics