Mar 17

Not quite sure what’s going on with the functional analysis stuff. But I’m sure we’ll get the hang of it in a bit. =)

Mar 15

Today we’re moving on to the new section of lab: Discovery. We’ve completed the positional annotation of Benedict’s genome, so now we are going to begin the functional annotation. In order to determine the functions of all the genes, we’re going to compare them to existing ones using Phamerator and BLAST searches. So far, I’m not really seeing how this is different from what we did earlier, other than the fact that the flowchart we follow is different. Same general concept, it seems. But we’ll see.

Mar 01

Today’s lab was another foray into the unknown. Things were kind of rough because there’s no set-in-stone procedure for what we’re doing. We’re making things up as we go along. This definitely brings to mind the sentiment expressed in a certain Lady Gaga parody. “Why? Why? Why oh why?”

But it’s all good. “It is a mystery. TEH MYSTERY OF SCIENCE.”

Feb 24

It was definitely interesting working as a group to make calls instead of doing it alone, especially hearing different people’s reasoning behind their calls. It really showed that even though the calls are subjective, some are better than others based on the information at hand.

Feb 10

Done annotating my part of Benedict’s genome. Time to make sure everyone’s work matches up. =)

Feb 03

My assigned genome region is from 26670 to 30360 base pairs. I downloaded the sequence from PhageDB, loaded it into our Annotation Workflow, and opened my section in Apollo. Now I’m getting started.

1. I began with the gene called Glimmer34 and Genemark31.

  • LENGTH: This gene extends from 26,814 backwards to 26,671
  • START CODON: I began by checking that the gene starts with the earliest possible start codon. I scrolled to the right, looking at all the start codons in the correct reading frame, but going any further to the right would extend the gene into another gene, so the start codon is fine where it is.
  • SD SEQUENCE: I checked the SD score and it was 200, which is very good.
  • CODING POTENTIAL: I checked the TB Coding Potential and the gene encompassed all of it, which is good.
  • BLAST: I took the cDNA sequence for the phage and did a BLAST search. The best E-value was 5.8, which is very high! What we’re looking for is an E-value around e-10 or smaller, so there are no real matches in the database for my gene. This means I’m on my own!
  • GAPS/OVERLAP: 1bp overlap at the start of the gene
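The length and BLAST checks above boil down to a little arithmetic, sketched here in Python. This is only an illustration: the function names are made up for the example, the coordinates are assumed to be inclusive, and the 1e-10 cutoff is one reading of the “E -10” target mentioned above.

```python
def gene_length(start, stop):
    """Length in bp of a gene given inclusive coordinates (works on either strand)."""
    return abs(start - stop) + 1


def is_significant(e_value, cutoff=1e-10):
    """True when a BLAST E-value is at or below the cutoff (smaller means a better match).

    The default cutoff is an assumption, not an official threshold.
    """
    return e_value <= cutoff


length = gene_length(26814, 26671)
print(length)               # 144 bp
print(length % 3 == 0)      # True: a whole number of codons
print(length // 3)          # 48 codons, counting the stop codon
print(is_significant(5.8))  # False: no meaningful match
```

A length that is not a multiple of three would be a hint that one of the coordinates was copied down wrong.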

2. Next is Glimmer33 and Genemark32.

  • START CODON: Earliest possible start codon.
  • LENGTH: 27,023 backwards to 26,814
  • SD SCORE: 152
  • CODING POTENTIAL: Does NOT encompass all coding potential. There is still some to the right (before the gene starts)
  • BLAST: There were not any great matches. Best available match was to “gp39”
  • GAPS/OVERLAP: 55bp between the start of this gene and the end of the next gene.
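The gap and overlap figures can be double-checked from the coordinates alone. The Python sketch below is hypothetical (the `spacing` helper is invented for this example and assumes inclusive coordinates and genes that are not nested inside one another); it uses the two genes recorded above.

```python
def spacing(gene_a, gene_b):
    """Return the bp between two genes: positive is a gap, negative is an overlap.

    Each gene is a (start, stop) pair of inclusive coordinates.
    Order and strand do not matter. Nested genes are not handled.
    """
    a_lo, a_hi = sorted(gene_a)
    b_lo, b_hi = sorted(gene_b)
    if a_lo > b_lo:
        # Make sure "a" is the gene that sits further left on the genome.
        a_lo, a_hi, b_lo, b_hi = b_lo, b_hi, a_lo, a_hi
    return b_lo - a_hi - 1


gene1 = (26814, 26671)  # Glimmer34 / Genemark31
gene2 = (27023, 26814)  # Glimmer33 / Genemark32
print(spacing(gene1, gene2))  # -1, i.e. a 1bp overlap
```

The two genes share position 26,814, which is where the 1bp overlap noted for the first gene comes from.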

So far, so good.

Jan 27

RULE #1: The longest reading frame is the rightest reading frame.
RULE #2: Don’t let ’em get trapped – your genes need gaps.
RULE #3: Anything and everything beyond these rules is a shot in the dark. YAY GENE CALLING!

Perhaps that’s an oversimplification.
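For the computer scientists in the room, Rule #1 taken literally looks something like the Python sketch below. It is a naive illustration only: the toy sequence is made up, the start codon set (ATG, GTG, TTG) is an assumption, and real gene callers such as Glimmer and GeneMark rely on statistical models rather than this brute-force scan.

```python
STARTS = {"ATG", "GTG", "TTG"}  # assumed start codons
STOPS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def longest_orf(seq):
    """Return (length, strand, frame) of the longest start-to-stop ORF in six frames.

    Within a frame, the first start codon after the previous stop is used,
    so each ORF is as long as it can be. Frame is a 0-based offset.
    """
    best = (0, None, None)
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in STARTS:
                    start = i  # earliest start in this open stretch
                elif start is not None and codon in STOPS:
                    length = i + 3 - start  # includes the stop codon
                    if length > best[0]:
                        best = (length, strand, frame)
                    start = None
    return best


toy = "CCATGAAATTTGGGTAACC"  # made-up sequence: ATG AAA TTT GGG TAA in frame 2
print(longest_orf(toy))  # (15, '+', 2)
```

Everything this sketch ignores (coding potential, SD scores, BLAST hits, gaps between genes) is where the judgment calls come in.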

Last week, we chopped the practice genome into parts and we each called a few genes. Today, we defended our decisions and consolidated everyone’s genes into one file.

The subjectivity of it all has been a little surprising. After all, computers aren’t subjective, right? When I double-click a program on the desktop, my computer doesn’t look at me and say “Maybe.” At least, not often. It has its days. But for the most part, it either gives me a yes or a no. I figured this gene calling business would work much the same way.

But instead we’re learning that this process is about “making the best possible decision based on the information at hand.”

…WHAT?!

This does not compute in the mind of a young computer scientist.

But once I get over the fact that there is no “right” answer (my goodness, what a horrifying concept), I think I’ll be alright. Yeah, understanding how to assimilate all the available information and using it to make a decision is definitely like some sort of twisted puzzle game. Twisted, because no one short of Chuck Norris should be able to extrapolate this much data from a four-letter alphabet.

Jan 25

Jan 24

So far, the semester has definitely been interesting. We’ve been familiarizing ourselves with the tools we’ll be using and practicing our annotation skills. All this practice and prep makes me all the more excited for the day we can get our hands on Benedict’s genome!

Sic 'Em Phage…