Spotlight on Research is the research blog I author for Hokkaido University, highlighting different topics being studied at the University each month. These posts are published on the Hokkaido University website.

When it comes to students wishing to study the propagation of diseases, Heidi Tessmer is not your average new intake.

“I did my first degree in computer science and a masters in information technology,” she explains. “And then I went to work in the tech industry.”

Yet it is this background with computing that Heidi wants to meld with her new biological studies to tackle questions that require the help of some serious data crunching.

Part of the inspiration for Heidi’s change in career came from her time working in the UK, where she lived on a sheep farm. It was there she witnessed first-hand the devastating results of an outbreak of the highly infectious ‘foot and mouth’ disease. This particular virus affects cloven-hoofed animals, and outbreaks lead to the mass slaughter of farm stock to stem the disease’s spread.

“The prospect of culling of animals was very hard on the farmers,” she describes. “I wanted to know if something could be done to save animal lives and people’s livelihoods. That was when I began to look at whether computers could be used to solve problems in the fields of medicine and disease.”

This idea drove Heidi back to the USA, where she began taking classes in biological and genetic sciences at the University of Wisconsin-Madison. After improving her background knowledge, she came to Hokkaido last autumn to begin her PhD program at the School of Veterinary Medicine in the Division of Bioinformatics.

“Bioinformatics is about finding patterns,” Heidi explains.

Identifying trends in data seems straightforward enough until you realise that the data sets involved can be humongous. Heidi explains this by citing a recent example she has been studying that involves the spread of the influenza virus. While often no more than a relatively brief sickness in a healthy individual, the ease with which influenza spreads and mutates gives it the ongoing potential to become a global pandemic, bringing with it a mortality figure in the millions. Understanding and controlling influenza is therefore a high priority across the globe.

Influenza appears in two main types, influenza A and B. The ‘A’ type is the more common of the two, and its individual variations are named based on the types of the two proteins that sit on the virus’ surface. For example, H1N1 has a subtype 1 HA (hemagglutinin) protein and subtype 1 NA (neuraminidase) protein on its outer layer while H5N1 differs by having a subtype 5 HA protein.

The inner region of each influenza A virus contains 8 segments of the genetic encoding material, RNA. Similar to DNA, it is this RNA that forms the virus genome and allows it to harm its host. When it multiplies, a virus takes over a normal cell in the body and injects its own genome, forcing the cell to begin making the requisite RNA segments needed for new viruses. However, the process which gathers the RNA segments up into the correct group of 8 has been a mystery to researchers studying the virus’ reproduction.

In the particular case study Heidi was examining (published by Gog et al. in the journal Nucleic Acids Research in 2007), researchers proposed that this assembly process could be performed using a ‘packaging signal’ incorporated into each RNA segment. This packaging signal would let the RNA segments recognise which others belong in the same group. If this signalling could be disrupted, proposed the researchers, then the virus would form incorrectly, potentially rendering it harmless.

This, explains Heidi, is where computers come in. Each RNA segment is made up of organic molecules known as ‘nucleotides’, which bunch together in groups of three called ‘codons’. The packaging signal was expected to be a group of one or more codons that were always found in the same place on the RNA segment, marking a crucial part of the encoding. In order to find this, codon positions had to be compared across thousands of RNA segments. This job is far too laborious to do by hand, but it is a trivial calculation for the right bit of computer code. Analysing massive biological data sets efficiently in this way is the basis of bioinformatics.
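
To give a flavour of what such code might look like, here is a minimal Python sketch of the codon comparison described above. It is an illustration only, not the method used in the published study: it assumes the RNA segments are already aligned, start in the same reading frame, and are supplied as plain strings. The function names (`split_codons`, `codon_conservation`, `conserved_positions`) and the toy sequences are made up for this example.

```python
from collections import Counter


def split_codons(seq):
    """Split an RNA sequence into consecutive three-nucleotide codons.

    Any trailing one or two nucleotides that do not form a full codon
    are ignored.
    """
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def codon_conservation(sequences):
    """For each codon position, report the most common codon and the
    fraction of sequences that carry it at that position.

    Assumes the sequences are pre-aligned and in the same reading frame
    (an assumption of this sketch, not something checked here).
    """
    codon_lists = [split_codons(s.upper()) for s in sequences]
    n_positions = min(len(codons) for codons in codon_lists)
    result = []
    for pos in range(n_positions):
        counts = Counter(codons[pos] for codons in codon_lists)
        codon, count = counts.most_common(1)[0]
        result.append((pos, codon, count / len(codon_lists)))
    return result


def conserved_positions(sequences, threshold=1.0):
    """Return (position, codon) pairs whose most common codon appears in
    at least `threshold` (a fraction from 0 to 1) of the sequences."""
    return [
        (pos, codon)
        for pos, codon, fraction in codon_conservation(sequences)
        if fraction >= threshold
    ]


# Toy data: three short, hypothetical RNA fragments (not real influenza sequences).
segments = [
    "AUGGCUAAAGGG",
    "AUGGCCAAAGGA",
    "AUGGCAAAAGGG",
]

print(conserved_positions(segments))  # [(0, 'AUG'), (2, 'AAA')]
```

With the three toy fragments, only the first and third codon positions are identical in every sequence, so those are the ones reported. On real data the same loop would simply run over thousands of segments, and positions that stay unusually constant would become candidates for closer study.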

In addition to the case above, Heidi cites the spread of epidemics as another area that is greatly benefiting from bioinformatics analysis. By using resources as common as Google, new cases of disease can be compared with historical data and even weather patterns.

“The hardest part about bioinformatics is knowing what questions to ask,” Heidi concludes. “We have all this data which contains a multitude of answers, but you need to know what question you’re asking to write the code.”

It does sound seriously difficult. But Heidi’s unique background and skill set might just turn up some serious answers.