...Imbalanced Regions In Cancer Datasets
Silvio Bicciato
Morning Session, 2 September (11th MGED Meeting, 1-4 September, 2008)
Integrative genomics in the sense they're using it is bridging the gap between DNA copy number and RNA transcription through microarray technology. (There are many other definitions and types of "integrative" genomics.) They are integrating genotyping, transcriptional and structural information before the analysis. SODEGIR, the name of the method, is an acronym of the talk's title. Step 1 transforms SNP copy number and expression data into CN and GE scores: it estimates CN and GE scores at the chromosomal locations of Entrez Gene IDs from the probe set data of the microarrays using a kernel regression estimator. Step 2 defines regions with concomitant alterations of gene CN and GE in single samples. Step 3 detects the presence of common SODEGIR signatures across all samples of an entire dataset using the binomial distribution.
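As a rough illustration of steps 1 and 3, here is a minimal Python sketch. The talk only mentioned "a kernel regression estimator" and "the binomial distribution", so the Gaussian-kernel Nadaraya-Watson form, the bandwidth, the per-sample chance of 0.1, and the function names kernel_smooth and binomial_tail are all my assumptions and not the SODEGIR implementation.

```python
import math


def kernel_smooth(positions, values, query, bandwidth):
    """Nadaraya-Watson estimate at `query` with a Gaussian kernel (assumed form)."""
    weights = [math.exp(-0.5 * ((query - p) / bandwidth) ** 2) for p in positions]
    total = sum(weights)
    if total == 0.0:
        # Every probe is too far from the query for the kernel to register.
        return float("nan")
    return sum(w * v for w, v in zip(weights, values)) / total


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical probe positions (bp) and probe-level scores around one gene locus.
probe_pos = [1000, 2000, 3000, 4000]
probe_score = [0.2, 0.4, 0.6, 0.8]
gene_score = kernel_smooth(probe_pos, probe_score, query=2500, bandwidth=1000)

# Hypothetical: a region is altered in 8 of 20 samples, with an assumed
# chance of 0.1 that any one sample shows it by accident.
p_value = binomial_tail(8, 20, 0.1)
```

The smoothing step moves probe-level values onto gene coordinates, and the tail probability asks how surprising it is to see the same region in that many samples.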
These are just my notes and are not guaranteed to be correct. Please feel free to let me know about any errors, which are all my fault and not the fault of the speaker. :)
...complex disease
Alistair Chalk
Morning Session, 2 September (11th MGED Meeting, 1-4 September, 2008)
They use the hOSC model system to study complex neurological diseases. They use olfactory biopsies. There are adult stem cells in the nose, and they are multipotent. They do transcriptomic, phenotypic, genotypic, and epidemiological analyses. They want to characterize and compare these cells, and predict clinical outcome, among other things. They can detect differences between diseases from the exon array. There is concordance between individuals for the CAGE data. They can detect consistent transcription start site (TSS) events. The system is highly reproducible between many individuals, and there is concordance between expression datasets and donors. They can detect disease differences on multiple levels (transcript/exon/TSS).
Mike Bittner
Plenary Talk, Morning Session, 2 September (11th MGED Meeting, 1-4 September, 2008)
How close are we to getting into the clinic? Over the past few years, deaths from cardiovascular disease have dropped dramatically, while deaths from cancer have remained about the same. In fact, since the late 1990s there have been more deaths from cancer than from cardiovascular disease in the States. Molecular biology began long ago with the help of an influx of a large number of nuclear physicists.
What are molecular biologists missing? Some blame our rapid and changing use of technology, which causes a form of amnesia about what went on before.
Confounding assumptions in the evaluation of data for model building:
independence assumption. Optimal number of features depends on sample size, classification rule and feature-label distribution. It's good to use LDA, linear model, and highly-correlated features.
[a large part of the talk is missing here due to problems with my laptop that required a restart. My apologies!]
Canalyzing genes force all effort down one (or a small proportion) of the possible pathways.
Joe Gray
Keynote Talk, Morning Session, 2 September (11th MGED Meeting, 1-4 September, 2008)
Metastatic breast cancer remains essentially incurable, but subtype specific drugs do show promise (survival after three years is very low). Would like to shift the survival curves to the right. Would like to develop molecular markers that will help identify the poor-outcome breast cancers early, so they can be treated early. This integrates omics, mathematical models and functional cancer biology.
Identify and model poor-outcome breast cancers
What are the poor-outcome subgroups? There are a number of ways to calculate these subtypes: there are relatively common PIK3CA pathway mutations (DNA sequence), luminal / amplifier subtype (copy number, including ERBB2), and the basal subtype (expression).
Identify these subtype-specific markers, and then select specific drugs for each subtype from within ~100 FDA-approved drugs and about 400 experimental drugs. They need a tractable model to accomplish marker and drug discovery efforts using 50 breast cancer cell lines selected to model the molecular subtypes of primary tumors. 50 cell lines isn't really enough - they'd like 250 cell lines, but just can't do that today.
Within these 50 cell lines, multiple instances of each poor-outcome subtype must be represented. Molecular abnormalities that influence drug response must be functioning. The cells have been characterized as much as possible. They've done high-res genome copy number and allelotype (BAC, SNP6, MIP array CGH). They are also interested in expression and alternative splicing using Affy arrays. Further, they look at the secreted proteome (cytokine array) and ~140 proteins and phosphoproteins using westerns and reverse-phase protein arrays (RPPA). They also look at the mutational status of candidate cancer genes from large-scale sequencing efforts.
To what extent do these cell lines look like primary tumors? Not always a good representation of the "average" breast cancer, but then no single breast cancer is close to average: there is large-scale heterogeneity. In terms of the genome, the cell lines are a pretty good match. At the transcriptome level, it's OK but not as good. The transcriptome results can get out the luminal/basal split in general, but the details are not so well matched.
Also been interested in looking at alternative splicing for the subtypes. For instance, isoforms that are spliced in a subtype-specific manner. Aspects of cell-surface signalling tend to be strongly-influenced by variable splicing.
Expression profiles of cell lines in 2D and 3D culture cluster together, but environment matters: 96 genes show strong environment-dependent expression (influence on TGF-Beta)
Develop molecular markers for early detection of poor-outcome subtypes
They are trying to identify subtype-specific, alternatively spliced proteins at an early stage. One example of the many they are looking at is CD44, which seems from the exon arrays to be pretty interesting. It seems to be expressed almost totally in the basal subtypes. Further, it seems to divide the basals into the more and less virulent forms, in that splicing is very different in these two types. It's predominantly spliced out in the Basal B group, and retained in Basal A.
There are about 30 candidates they're looking at more closely, and they want to look at what version is in the blood. To do this they're using a new technology which requires only a small amount of material: essentially a small-capillary western, called Firefly (CellBio Sciences). All these differently-spliced isoforms are separated out and can be detected with antibodies. It's quite sensitive. Then you can use splice-isoform-specific antibodies in anatomic imaging using an engineered bacteriophage MS2 capsid.
Select/develop candidate therapeutics against the poor-outcome subtypes
Treat cell lines with therapeutic agents and identify molecular features. They're mostly focusing on pathway-targeting drugs, as well as some of the "classical" chemotherapy agents. Pathway-targeted drugs tend to show strong subtype specificity. He talks about amounts in terms of the concentration of the drug that inhibits growth of the cell line by 50%. For instance, lapatinib is most effective in the luminal cell lines, but still quite variable from cell line to cell line.
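To make the "concentration that inhibits growth by 50%" measure concrete, here is a small Python sketch of one way such a value could be read off a dose-response series. The log-linear interpolation, the function name gi50, and the numbers are my own illustration, not how the speaker's group computed it.

```python
import math


def gi50(concentrations, growth_fractions):
    """Estimate the concentration giving 50% of control growth.

    Interpolates linearly on log10(concentration) between the two measured
    doses that bracket 50% growth. Returns None if 50% is never crossed.
    """
    pairs = sorted(zip(concentrations, growth_fractions))
    for (c_lo, g_lo), (c_hi, g_hi) in zip(pairs, pairs[1:]):
        if g_lo >= 0.5 >= g_hi and g_lo != g_hi:
            frac = (g_lo - 0.5) / (g_lo - g_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None


# Hypothetical doses (micromolar) and growth relative to untreated control.
doses = [0.01, 0.1, 1.0, 10.0]
growth = [1.0, 0.75, 0.25, 0.0]
estimate = gi50(doses, growth)
```

A lower value means a more sensitive cell line, which is what makes the measure usable for comparing subtypes.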
The drugs that tend to have the greatest luminal specificity are the ones that affect the PI3-kinase pathway. For the Basal subtype, the drugs that work best are those that affect the mitotic pathway. Bayesian network analysis reveals AKT-dependent signalling in luminal lines. There is strong connectivity in luminal subtype cells, and weaker connectivity in the basal subtypes.
There aren't all that many regions of recurrent amplification: maybe 5 different regions account for 30% of all breast cancers (8p11, 8q24, 11q13, 17q21, 20q13). Expression levels of 66 genes are deregulated by high-level amplification. siFGF3, siPPFIA1, siNEU3 induce cell apoptosis in 11q13 highly-amplified cell lines. Attacking these types of genes will be optimal for treating patients with this sort of luminal/amplified subtype. A similar story is present at 8q24: PVT1 knockdown, but not MYC knockdown, produces apoptosis.
What do we do about the fact that we can't get drugs to work against some of these gene targets? Try to use siRNA therapeutics, with in vivo delivery of PVT1 siRNA-DOPC. The PVT1 siRNA Rx is effective against PVT1-amplified xenografts.
In terms of the Basal subtype, they've been focussing on using MEK inhibitors, which show some basal-subtype specificity, but it isn't very "durable" specificity. Therefore, they also want to look at transcriptional features associated with response. The reason for the non-durable response is that the response is moderated by a MEK to EGFR feedback loop (found via protein analysis). When one pathway is inhibited, signalling will just go via an alternative route instead. If you block the other route as well with PI3K inhibitors, then you enhance the response in the basal subtype. Mitotic apparatus drugs also show promise for basal subtype tumors, but they'll be pursuing that in the future.
Do these things have anything to do with the clinic?
They've started doing this with lapatinib. ERBB2+ is the strongest in vitro predictor of response. Can we stratify response within the ERBB2+ patients? They've focused on the transcriptome, and discounted markers that vary, and specifically focused on genes that have a high response. They've ended up with a 6-gene assay. Can the patients be stratified? Indeed, those they predicted would be sensitive did have their survival curve shifted to the right. Transcriptional markers developed in vitro show promise in clinical studies. Subtype-specific drugs include: AKT-pathway inhibitors for PI3-kinase mutants, MEK+PI3-kinase drugs for the basal subtype, one more I missed...
Grant Cramer, University of Nevada at Reno
Keynote Talk, Afternoon Session, 1 September (11th MGED Meeting, 1-4 September, 2008)
Why interested in this type of stress? Cold is a major problem for grapes, salt tolerance would be useful (over time salts remain in the soil when the water evaporates after very long-term irrigation), and want to know more about drought stress. We want to stunt growth so that most of the effort goes to the fruit. Therefore, grapes can be quite drought tolerant.
General intro to systems biology
Not just grapes make money: wine sales, tourism, etc brings it to $50 billion in California annually. Also, there are 200+ phenolics, (anti-Alzheimer's - interferes with plaque formation, anti heart disease) and other human health benefits. Also, wine tastes good ;)
They are using transcriptomics (Affy chips), proteomics (2D gels) and metabolomics (primarily GC-MS, but also LC-MS) data and are integrating that information into MetNet. Goals include annotating genes, mapping molecular networks, building models to describe physiology and development, and manipulating and improving fruit quality and stress resistance.
Proteomics: they are using 2D-PAGE with MALDI-TOF/TOF. Transcript data is available publicly. Currently 8461 out of 39423 genes have been mapped, with 120 pathways.
Abiotic stress effects on shoots
They've done a long-term stress experiment. Start with potted 2-year-old, own-rooted, Cabernet Sauvignon clone 8 (don't normally grow from seed as the offspring can be very different from the original). Pruned to one shoot. Grown in a greenhouse. Most salt experiments are osmotic shock experiments, where you stick them in salt water all of a sudden. But in the environment, salinization happens very gradually. Salinity affects plants via osmotic removal of water and also has an aspect of ion stress. To get the osmotic water effect, he just stopped watering. To do the salt stress, you have to measure the water deficit of the leaves and then add salt to the roots of the plants to get the same water deficit response as with water deficit only. They harvest the growing shoot tip. Over time, the control was steady, while the water potential in the leaves of the salinity plants and the water-deficit plants was virtually identical, so he was able to mimic the water deficit well.
The salt and drought-stress plants slow down their growth prior to the drop in water potential: that is, growth is almost more sensitive than their ability to measure the water potential. Shoot elongation was very sensitive to stress, and in the early stages was actually more affected by water deficit than salinity.
Their microarray data came out very nicely (partly due to the use of clones). They did a gene expression time course, and for both types of stress there was an increase in the number of transcripts being upregulated and downregulated over time. First for the water deficit by day 6, and then not until day 13 for salinity. There are large differences between the two types of stress.
They did a comparison on day 16 to see which were differentially expressed between salinity and drought stress. There are significant differences between MIPS functional categories. These include transcription, cell defence, transport mechanisms, metabolism. Also, some key hormone biosynthesis genes are affected by stress. ABA-NCED is affected by drought before salinity. Ethylene comes in much later, around day 18. The metabolism of a growth hormone goes up, reducing its amounts in the plant.
The drought plants were wilting faster. They think this is because the salinity plants were able to use the salt in controlling osmosis. There are also large changes in amino acid composition, specifically proline, isoleucine and leucine. The differences in the expression levels between the two types of stress were mainly to do with photosynthesis and ROS.
Summary for the data set: the experiment indicates that water deficit had a larger impact (and caused larger changes in gene expression) than equivalent salinity.
Proteomics comparison to transcriptomics
So, how does this compare to what they got from the proteomics data? Grapes are problematic, e.g. due to the large amount of phenolics. When run, 84 proteins were significant out of 645 proteins quantified (took a year due to all the manual reviewing of the photos). The abundance of 40 proteins increased, 20 decreased, and 22 increased and then decreased in various ways over time.
Comparing the results wrt functional classifications is interesting. Uncharacterised in transcripts is 30%, but not in the proteome, where almost all can be identified. About 30% are involved in metabolism. Energy and protein synthesis also important.
66 of 84 proteins had a Mowse score of 7 (95% confidence that the protein is what we think it is). 57/66 have a transcript match with 90% identity or higher (90% is an arbitrary number). 17/57 have significant Pearson correlation with the transcript profile, which is relatively low.
Proteins that have bad correlation are, for example, antioxidants. Proteins with good correlation are heat shock proteins, major latex-like proteins, and a methyltransferase. Could be due to limitations of the technology.
Summary: only 30% of protein profiles correlated with transcript profiles. Early responses in energy and growth-related protein profiles are not reflected in transcripts, but late responsive protein profiles do correlate. Plants respond first with changes in proteins related to photosynthesis and growth followed by changes in transcripts in photosynthesis, photorespiration and ROS detoxification.
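For anyone wanting to reproduce the kind of comparison described here, this is a minimal Python sketch of a Pearson correlation between one protein profile and its matching transcript profile. The profile values and the function name pearson are made up for illustration; only the 17-of-57 count comes from the talk, and it is where the "30%" in the summary comes from.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical time-course abundances for one protein and its transcript.
protein = [1.0, 1.5, 2.5, 3.0]
transcript = [0.8, 1.1, 2.0, 2.9]
r = pearson(protein, transcript)

# From the notes: 17 of the 57 matched proteins correlated significantly.
fraction_correlated = 17 / 57
```

Whether a given r counts as "significant" depends on the number of time points, which I did not note down.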
Berry development
1. rapid growth 2. lag phase 3. ripening
Harvested every week. They did a PCA of the data at each stage. Everything was grouped nicely except those in the lag phase, which means that it's incorrect to assume, as has been done in the past, that nothing's happening in that phase. Metabolism is higher in phases 1 and 2, and lower in 3. Transcription is going up, in contrast.
A lot of fruits ripen due to higher levels of CO2, which causes production of ethylene. Grapes, like strawberries, are thought not to respond to ethylene. However, it seems there are some small bursts of ethylene around veraison (step 3). There is a burst at 32, which is at the beginning of the lag phase. This is consistent with ethylene usage, which is a growth inhibitor. It goes up again in grapes just before veraison.
Water-deficit effects on berries of two different cultivars
Berries were smaller under water deficit for both white grapes and red grapes, though this was less pronounced in older plants with deeper roots. There are definite changes in the metabolism that can't be accounted for just by reductions in size. The terpenoid pathway is stimulated in the Chardonnay but not in the red grapes. In both, fatty acid metabolism is stimulated, which creates more volatiles (via yeast or grapes isn't known yet). The phenylpropanoid pathway (what makes red wine "healthier") is stimulated in the Cab Sauv in the drought.
Microarrays provide valuable insights, and SB tools are in development. Molecular network maps will soon be released. There are multiple stress responses, and future work will focus on stress survival and berry quality metrics.
...or, all you ever wanted to know about wine yeast, but were afraid to ask
Duccio Cavalieri
Plenary Talk, Afternoon Session, 1 September (11th MGED Meeting, 1-4 September, 2008)
Volatile organics in: Grapes = 466, Wine = 644, Difference = 178. The most ancient evidence is from 3150 BC, in Egypt. However, the ecology of yeast has been mainly unknown. The probability of S. cerevisiae on a pristine grape is .0005, while per damaged grape it is 25%, which means 10^4 - 10^5 (probably from clonal expansion). Proportion of damaged grapes is 1/1000. When there are many organisms (above 4%), cerevisiae is the only survivor. 88% of the S288c gene pool derives from EM93, isolated from rotten figs in Merced, California in 1938. Most regulatory genetic variation is due to a high rate of cis acting alleles and a small number of trans acting alleles.
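Putting the grape figures together, here is a short Python sketch of the arithmetic. It assumes the two probabilities are per-grape chances of carrying S. cerevisiae and that grapes are independent, which is my reading of the slide rather than something the speaker spelled out.

```python
# Figures quoted in the talk.
p_pristine = 0.0005      # chance a pristine grape carries S. cerevisiae
p_damaged = 0.25         # chance a damaged grape carries it
frac_damaged = 1 / 1000  # proportion of grapes that are damaged

# Chance that a randomly picked grape carries the yeast (law of total probability).
p_any = (1 - frac_damaged) * p_pristine + frac_damaged * p_damaged

# Share of yeast-carrying grapes that are damaged ones.
share_from_damaged = frac_damaged * p_damaged / p_any

# How much more likely a damaged grape is to carry the yeast.
enrichment = p_damaged / p_pristine

# Volatile organics gained during winemaking.
volatile_difference = 644 - 466
```

Under those assumptions, damaged grapes are one in a thousand but account for roughly a third of the grapes that carry the yeast.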
Exploring the genome-environment interaction, and looking at Mendelian segregation of expression profiles. One strain's colony, after 4 days' growth, produces an interesting and cohesive shape ('filigree'). Compared the parent to the first segregants. They found that 2 alleles were controlling the expression of 378 genes, or 6% of the genome. They tried to clone these genes from the recessive mutation, but failed after screening 300 false positives. Then, a few years later, they tried to get the genes responsible for the phenotype through pathway analysis, using a Fisher exact test followed by pathway correction. If you compare with the wt, there isn't a single significant pathway that segregates except the cell cycle stuff. Then he applied BAGEL analysis to extract absolute values and probabilities using Bayesian statistics, and discovered significance for amino acid transporters. They then did the same experiment, but with rich amino-acid medium, and with a minimal aa medium. The filigree strain has a mutation in SSY1, which causes a truncation. This means you lack the sensing part of the protein on the outside of the membrane, which means that the strain is blind to the amino acids in the media. This makes the yeast produce its own amino acids.
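The pathway analysis step can be sketched in Python as a one-sided Fisher exact test for over-representation, which is the same as a hypergeometric tail probability. The function name enrichment_p, the pathway size, the hit count and the genome size of 6300 (back-calculated from 378 genes being about 6%) are hypothetical, and the "pathway correction" step is not shown because I don't know which correction was used.

```python
from math import comb


def enrichment_p(hits_in_pathway, pathway_size, hits_total, genome_size):
    """One-sided Fisher exact p-value for over-representation.

    Probability of seeing at least `hits_in_pathway` pathway genes when
    `hits_total` genes are drawn at random from `genome_size` genes.
    """
    upper = min(pathway_size, hits_total)
    favourable = sum(
        comb(pathway_size, k) * comb(genome_size - pathway_size, hits_total - k)
        for k in range(hits_in_pathway, upper + 1)
    )
    return favourable / comb(genome_size, hits_total)


# Hypothetical: 8 of the 378 segregating genes fall in a 30-gene pathway.
p = enrichment_p(hits_in_pathway=8, pathway_size=30, hits_total=378, genome_size=6300)
```

With these made-up numbers fewer than 2 hits would be expected by chance, so 8 hits gives a small p-value; in a real analysis it would then be corrected for the number of pathways tested.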
Only a small number of genes were differentially expressed based on the filigree phenotype. Ammonia mediates signalling between these cells. MEP1, MEP2, MEP3 and PHD1 were overexpressed in the strain. M28 is homozygous for a mutation causing non-disjunction of cells after cell division affecting AMN1 and GPA1. The daughter cell doesn't divide immediately after division, which creates 3D structures.
This means that the potential for differences in the various wine strains is quite high. BY4743 and EM93 (parent strain) are estimated to be 90% similar. Most of the variation is not evenly distributed. Mutations in the aa, nitrogen sensing, and hap genes are highly pleiotropic. HDAC and SIR-SAS gene expression control provides the cell with a mechanism of epigenetic buffering of variation.
Wolfgang Huber
Afternoon Session, 1 September (11th MGED Meeting, 1-4 September, 2008)
Meiotic recombination is important for proper chromosome segregation and is an important mechanism in evolution in sexual organisms. It is initiated via double-strand breaks, and then there are two different pathways: CO and NCO (crossover and non-crossover). They are mapping all recombination events in 50 x 4 tiling arrays for the child strains.
Identified 179 recombination hot spots - none overlapped the centromere, as expected. 85% overlapped a promoter, but only about 25% of bases in hot spot intervals overlap promoters, while 68% overlap coding sequences. There were a wide variety of recombination events, some much more "messy" than the normal textbook events. They compared their hotspot maps with some older double-strand break (DSB) maps made last year by a different group, to determine the correlation between DSBs and their recombination hotspots. There was a pretty close correlation. They concluded from analysing the data that genome-wide distributions of CO and NCO hotspots are different (p < 0.0005).
They found correlation between hotspot location and the genes that are transcribed from these regions (genes with distinct expression profiles are associated with hotspots and hotspot subtypes). Hotspots tend to be GC-rich. Up to 1% of a meiotic product's genome is subject to conversion per single meiosis. Conversion favors GC. Per meiosis, 2.1% of polymorphic positions are converted to the opposite genotype. However, hotspots are also more diverse. Allelic homogenization appears to be counteracted by other forces, e.g. mutagenicity of recombination events. Distinct distributions of CO and NCO suggest that genomic position affects DSB resolution. There is interference between NCOs and COs. Conversion hotspots unlink genomic regions from the linkage map.
..of Saccharomyces cerevisiae
Gavin Sherlock
Afternoon Session, 1 September (11th MGED Meeting, 1-4 September, 2008)
The population structure in the presence of clonal interference is markedly different from that in a classic model. They needed a system to model population dynamics in a population. They use FACS and different-colored fluorescent cells to do this. They grew the cells in a chemostat, which is seeded with equal numbers of each of the 3 color types, and then measure the proportion of the population over time. The experiment has been done 8 different times. One run, for example, shows expansions and contractions followed by one color becoming the majority. Fixation of a color is not necessarily indicative of fixation of an adaptive event (multiple adaptive clones within a population with the same color).
Using yeast tiling microarrays, they can identify the locations of nucleotide differences between the evolved and parent strains. They then sequence the candidate mutations. One of the mutations they found (in cox18), called Red 266, was discovered via decreased hybridization compared to the parent. Another example was where there was a comparatively higher level of hybridization. Mutation history can help determine which strains come from which earlier parent strains with mutations of their own.
Clonal interference is important in adaptive evolution of yeast. Specifically, glucose transport and signalling through the Ras pathway were both affected. In future, they wish to directly determine which mutations are adaptive, find out how general these adaptations are, and discover the effects of the adaptive mutations, and finally - what fraction of the adaptive landscape have we explored? Will it be the same or different in the other 7 experiments?
Claudio Moser, Department of Genetics and Molecular Biology, E. Mach Foundation - IASMA
Plenary Talk, Afternoon Session, 1 September (11th MGED Meeting, 1-4 September, 2008)
The grapevine berry ripening process
It takes 4 months to go from a flowering cluster to a fully-ripened cluster. The development of the grape is in three phases: 2 growing phases divided by a phase where the berry doesn't grow anymore. Berry formation followed by berry ripening. There are three major tissues: skin, pulp, seeds.
Why is it important?
Economics: It is the most important fleshy fruit from an economic point of view. There are 8 million hectares of vineyards worldwide with an annual turnover of more than 20 billion US $. There is increased interest in grape-derived anti-oxidant compounds. In Trentino, grapes and apples represent the two most relevant crops.
Biological relevance: The ancestor of the cultivated version climbed trees, reached the top of the canopy, and then flowered. Domestication has passed from separated-sex flowers (male flowers and female flowers) to a hermaphrodite flower.
The grape transcriptome
Transcript profiles are studied to understand the molecular dynamics of berry ripening and regulation. Research can be done with the Affymetrix Vitis GeneChip. This chip contains 14,000 unigenes. Applications of the research include improving management practices (via lower inputs, lower costs, and higher quality) and using gene transfer to create new varieties or improve traditional ones. There are two ways of getting new genes in: gene transfer (direct) or marker-assisted selection.
They used pinot noir grapes, and sampled them at three different stages
of the life cycle, with 3 biological replicates for each time point,
then repeated the analysis in 3 different seasons. The 2003 data was
quite different - this can probably be explained by 2003 having a very
hot summer. Further, the analysis separated out the ripe berry time
point from the earlier two time points. They ended up with ~1800 genes
that will form the berry-ripening core set.
What have we learned about the biology?
9 GO classes were statistically significantly different from the Affy chip, with 6 of those being overrepresented. Ripening is finely programmed before veraison and its transcriptional modulation reflects berry biochemical changes. In the first phase, the berry goes through a re-programming phase before moving on to the next growing phase. A consistent fraction of the isolated genes are devoted to the control of the developmental program. (This has also been found in tomatoes.) Major TF families include zinc finger, auxin-related, and others.
He looked more closely at genes related to ROS metabolism. Noticed there was a burst of hydrogen peroxide at the veraison phase. Seasonal influences produce changes that involve light signalling, ripening-related genes, and non-ripening-specific isoforms.
Future directions
Wants to look into ethylene, as it is probably quite important in berry development. Exogenous ethylene application can modify the ripening curve. Endogenous ethylene can be measured just before veraison. Tried to measure these things on their pinot noir grapes. They got some noisy data, but think they saw two peaks. In 2007 there were two different genome sequences of pinot noir grapes published. They found about 30,000 genes. They want to do more analysis on it.
Geoffrey Faulkner
Morning Session 1 September (11th MGED Meeting, 1-4 September, 2008)
Transposable Elements (TEs) include: LINEs, SINEs, LTRs, and DNA transposons. TEs were first characterized in maize, and were thought to be regulators of nearby genes via an unknown mechanism. The discoveries of regulatory ncRNAs and active promoters of TEs have helped. It's difficult to detect genome-wide TE transcription. They use CAGE, mentioned in the first talk of the day. CAGE detects transcription start sites (TSSs), and reliably detects TE promoters. 80% of CAGE tags mapping to a repetitive element are unique on the genome. TE promoters are sharp, in that there is a single dominant transcription start site. TE promoters were more than twice as likely to be tissue-specific than other promoters (40% rather than 17%). TE promoters were enriched for protein-encoding genes. TEs are known to provide alternative promoters to nearby genes. More than 700 of the ones he studied were confirmed as such alternative promoters (worked with FANTOM3 and FANTOM4 mouse libraries for CAGE stuff). Also, ncRNAs can be derived from TEs and could produce "anti-silencing" or "transcriptional interference", but are more likely to provide the former rather than the latter. TEs provide 1000s of functional elements to the genome, even though they're not usually very well conserved. They contain an interesting subclass of promoters, are enriched near protein-coding genes, and provide alternative promoters to nearby genes.