Review articles - Journal of Physiology and Pharmacology

JOURNAL OF PHYSIOLOGY AND PHARMACOLOGY 2009, 60, Suppl 3, 5-16
www.jpp.krakow.pl
J.F. HOCQUETTE1, I. CASSAR-MALEK1, A. SCALBERT2, F. GUILLOU3
CONTRIBUTION OF GENOMICS TO THE UNDERSTANDING
OF PHYSIOLOGICAL FUNCTIONS
1INRA, UR1213, Recherche sur les Herbivores, Theix, 63122 Saint-Genes Champanelle, France, 2INRA, UMR1019, Nutrition humaine,
Universite de Clermont I, Theix, 63122 Saint-Genes Champanelle, France, 3UMR6175, Physiologie de la Reproduction, INRA, CNRS,
Universite de Tours, Haras Nationaux, F-37380 Nouzilly, France
Genomics has brought with it a true biological revolution and can be applied to all areas of life sciences. The advent of
genomics is thus linked to the development of high-throughput techniques which allow the genome of organisms as a
whole to be studied. The first high-throughput techniques to be developed were sequencing methods. These advances will
allow new approaches to a variety of problems in biology. For instance, the emerging fields of genomic medicine in humans
and genomic selection in livestock are promising. After the sequencing of genomes, genomics has shifted to the study of
gene expression and function. This is called the "post-genomic era" by some authors or "functional genomics" by others.
The most recent "omics" to be developed are associated with the study of the metabolism (e.g. metabolomics). Integrative
"omics" approaches (e.g. nutrigenomics) are based on the association of the omics tools at different levels (DNA, RNA,
proteins, metabolites) for a specific objective (here nutrition). In terms of perspectives, it is likely that methods for
collecting data will outstrip our capacity to adequately analyse these data. So scientists must develop bioinformatic tools
and methods to overcome this difficulty. In addition, high-throughput techniques need to be developed in physiology in
order to match the increasing amount of genomic information with true biological data. Finally, there is no doubt that all
these new approaches will allow important new genes and novel biological mechanisms to be discovered. Physiological
models with invalidated or over-expressed genes will be precious tools to check these new biological discoveries.
Key words: genomics, gene expression, metabolomics, data mining, phenotype
INTRODUCTION
Genomics is the study of an organism's entire genome. The
definition of genomics thus refers to that of the genome. The
genome includes both the genes and the non-coding sequences
of DNA. Whereas the term "genome" appeared in the literature
in 1920-1930, the term "genomics" only appeared in the 1980s,
and took off in the 1990s with the initiation and development of
genome projects for several biological species. The advent of
genomics is thus linked to the development of high-throughput
techniques which allow the study of the genome of organisms
as a whole. In other words, genomics provides scientists with
methods to quickly analyse genes and their products. It can also
be defined as the identification of an organism's genes and the
means for understanding gene functions using new
biotechnological approaches.
One major consequence of the advent of genomics is that,
today, scientists have the opportunity to analyse interactions
between genes (and between their products) at genome level and
therefore to understand the interactions between the various
systems of a cell on a large-scale basis, including the
interrelationship of its DNA, RNA and synthesised protein as well
as metabolites, and to learn how these interactions are regulated.
After the sequencing of a great number of genomes,
genomics is thus now shifting to the study of gene expression and
function. Functional genomics allows the detection of genes that
are turned on or off at any given time and in any physiological or
nutritional situation. For instance, transcriptomics is the study of
the transcriptome, i.e. the complete set of RNA transcripts
produced by the genome at any one time. Similarly, proteomics is
the large-scale study of proteins, particularly their structures and
functions. And metabolomics is the comprehensive analysis of
the whole metabolome (the collection of all metabolites in a
biological tissue, biofluid or cell) under a given set of conditions.
Generally speaking, users of the suffix "-ome" frequently take it
as referring to totality of some sort. In addition, unlike basic
genomics, functional genomics focuses on the dynamic aspects
such as gene transcription and translation, and also the molecular
mechanisms regulated by non-coding sequences.
The last discipline to be developed at a high-throughput
level is phenotyping. The overall objective is the understanding
of the physiology of a whole organ, tissue or even organism by
combining genomic data from DNA sequence to metabolome
and phenotypes through gene expression. This will require
modelling approaches using the vast amounts of data from high-throughput techniques. This will also require gene expression
manipulation in well-characterised physiological models in
order to better understand gene function. There is no doubt that
approaches to explore biological processes have dramatically
changed and will continue to change in the immediate future.
THE OUTCOMES OF SEQUENCING
Progress in sequencing
The year 2007 saw the 30th anniversary of DNA sequencing.
The amount of nucleotide sequences in the databases has
increased exponentially during this period due to different
technical innovations. More than 700 complete genomes have
been sequenced so far. The number of completed genomes
sequenced each year increased more than 10-fold from 2000 to 2007
(i.e. from 18 to 207) (http://genomesonline.org/gold_statistics.htm).
The total cost for the working draft of the human genome
(1) was reported in 2000 to be approximately $300 million
worldwide (http://www.nih.gov/news/pr/jun2000/nhgri-26.htm).
It currently costs roughly $60,000 to sequence a human genome,
and many research groups are hoping to achieve a $1,000
genome within the next few years. But an even lower cost may be
achieved by private companies using nanofluidic devices
(http://www.futurepundit.com/archives/005150.html). On a per-nucleotide basis, sequencing fees were around $10 in 1986 and
had dropped to 10-20 cents in 2001 and to 0.5 cent nowadays
(2). The rate-limiting steps are likely nowadays to be connected
with the tools for analysis of sequence data. For instance, all-against-all sequence comparison using informatics tools requires
more time than sequencing by itself (3). Meanwhile, huge
progress is being made in our knowledge of the structure of
genomes. For instance, it was recently shown that, although the
human and the mouse genomes contain similar proportions of
recent duplications (∼5%), the architecture of the genomes of
these two species differs markedly: unlike those in the human
genome, most mouse duplications are organised into discrete
clusters of tandem duplications with specific features (4).
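The cost and throughput figures quoted in this section can be put on a common scale with simple arithmetic. The following Python sketch uses only the numbers given above; the variable names are illustrative, and the 2001 value is taken as the midpoint of the quoted 10-20 cent range, which is an assumption made for the example.

```python
# Per-nucleotide sequencing fees quoted in the text, in US dollars.
COST_1986 = 10.0    # "around $10 in 1986"
COST_2001 = 0.15    # assumed midpoint of "10-20 cents in 2001"
COST_NOW = 0.005    # "0.5 cent nowadays"

# Completed genomes sequenced per year, as quoted in the text.
GENOMES_2000 = 18
GENOMES_2007 = 207


def fold_change(before, after):
    """Return the ratio before/after, i.e. how many times smaller `after` is."""
    return before / after


cost_drop_to_2001 = fold_change(COST_1986, COST_2001)
cost_drop_to_now = fold_change(COST_1986, COST_NOW)
genome_increase = GENOMES_2007 / GENOMES_2000

print(f"1986 -> 2001: about {cost_drop_to_2001:.0f}-fold cheaper per nucleotide")
print(f"1986 -> now:  about {cost_drop_to_now:.0f}-fold cheaper per nucleotide")
print(f"Genomes completed per year, 2000 -> 2007: {genome_increase:.1f}-fold")
```

Expressed this way, the yearly output of completed genomes rose slightly more than 10-fold, while the per-nucleotide fee fell by roughly three orders of magnitude over two decades.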
In addition to sequencing, epigenomics is an emerging
science which promises novel insights into the genome because
of its potential to detect quantitative alterations, multiplex
modifications and regulatory sequences outside of genes (5).
Epigenomics is thus the study of factors surrounding DNA that
affect gene expression. In this context, the epigenome is
important for gamete and embryo development since major
chromatin remodelling occurs in these cells (2).
Genomic medicine
The recent sequencing of huge numbers of various genomes
has led to the discovery of many DNA sequence variants among
individuals. Generally, the genomes differ between two random
individuals by 0.1%. The routine determination of those variants
is called genotyping. So far, a great number of Single Nucleotide
Polymorphisms (SNPs) has been found on the whole human
genome and registered in public databases. Some commercial
chips are available which allow genotyping of a huge number of
SNPs simultaneously (about 1 million) which cover the entire
genome. Genome-wide association studies have detected SNP-based variants with modest to large effects on phenotypic traits.
Although many polymorphisms are functionally neutral (half are
supposed to be in the noncoding regions and one quarter
corresponds to silent mutations) (6), disease susceptibility loci
which are polymorphic have been or will be identified in the
near future. They potentially allow the inheritance to be traced
for factors that predispose for common diseases in human beings
(7). In addition, some QTL may explain at least in part the inter-individual variability in drug responses since drugs work better in some
patients than in others. Therefore, SNP technologies also have
potential application in pharmacogenetics and more generally in
individualised medicine (6).
Several genome-wide studies have also demonstrated that
DNA polymorphisms influence gene expression at mRNA level.
Loci influencing transcript levels have been termed "eQTLs"
for QTLs (Quantitative Trait Loci) of expression. The
combination of linkage genetics and expression profiling
(genomics) is called "genetical genomics". More recent genome-wide
studies in humans have examined whether gene expression
at protein level could be associated with genetic variation in or
close to the gene coding for those proteins (this is called cis
effects) or elsewhere in the genome (this is called trans effects):
analysis of the effect of almost 500,000 polymorphisms on the levels
of 42 blood proteins has allowed the identification of certain specific cis and
trans effects. The underlying mechanisms included altered
transcription, altered rates of cleavage of bound to unbound
soluble receptor, altered secretion rates and variation in gene
copy number. Loci influencing protein levels are termed protein
QTLs (pQTLs). Since many of these plasma proteins are
correlated with human diseases, this new approach will help our
understanding of these diseases (8). More generally, the
combination of proteomics, metabolomics and other high-throughput analyses with multifactorial genetic analysis will be
useful to better understand the functional consequences of
natural genetic variation on a very large scale. But, to date, large-scale analyses of proteins and metabolites are not yet available
in the way genome analyses are (9).
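The distinction between cis and trans effects described above can be illustrated with a short Python sketch. It labels a variant as cis when it lies in or close to the gene whose product is measured, and as trans otherwise. The 1 Mb window, the class and function names and the coordinates are assumptions chosen for illustration; the studies cited here may use different criteria.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A polymorphism located by chromosome and base-pair position."""
    chrom: str
    pos: int


@dataclass(frozen=True)
class Gene:
    """A gene located by chromosome and start/end coordinates."""
    chrom: str
    start: int
    end: int


def classify_effect(variant, gene, window=1_000_000):
    """Return 'cis' if the variant is within `window` bp of the gene, else 'trans'.

    The window size is an illustrative assumption, not a value from the text.
    """
    if variant.chrom != gene.chrom:
        return "trans"
    if gene.start - window <= variant.pos <= gene.end + window:
        return "cis"
    return "trans"


# Hypothetical coordinates, for illustration only.
gene = Gene(chrom="1", start=5_000_000, end=5_050_000)
print(classify_effect(Variant("1", 5_020_000), gene))   # inside the gene
print(classify_effect(Variant("1", 90_000_000), gene))  # same chromosome, far away
print(classify_effect(Variant("7", 5_020_000), gene))   # different chromosome
```

The same labelling applies whether the measured trait is a transcript level (eQTL) or a protein level (pQTL).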
Genomic selection in livestock
By choosing the best animals, farmers have always made
small but significant genetic improvements to farm animals
throughout the history of humanity. The advent of genetics has
led to higher rates of livestock improvement. Traditional
genetics using information on phenotypes and pedigrees to
predict breeding values was successful. Then genetic maps were
developed in the 1990s which helped in the discovery of QTL
and even genes controlling some production traits. Meanwhile,
commercial tools were developed (10). The first ones were
single-marker and single-gene tests, which have been rapidly
incorporated into selection programmes. The gain induced by
"marker-assisted selection" (MAS) has sometimes been low
depending on (i) the accuracy of the existing estimated breeding
value, (ii) the proportion of the genetic variance explained by the
DNA markers, (iii) the accuracy in estimating the effect of the
QTLs and (iv) the ability to reduce the generation interval by
working at an earlier age than previously (11). As most
economic traits are influenced by many genes each having a
small effect, working with only a small number of genes will not
be very efficient.
With the advent of genome sequencing, SNPs were
identified in the genomes of farm animals as in that of humans.
A recent study was published with more than 15,000 SNP
markers covering all regions of all autosomes, and analysed in
more than 1,500 cattle (12). Today, SNP gene chips with over
50,000 SNPs are available for association studies. So it is
potentially possible to select animals with markers which cover
the whole genome. This is called "genomic selection". It was
initially suggested by Meuwissen et al. (13) but not put into
practice because of the lack of suitable genomic tools; it has
now become feasible thanks to the advent of chips for genotyping. In
theory, since the markers cover all the genome, and since the
markers are assumed to be in linkage disequilibrium with the
QTL, the whole genetic variance is potentially explained by the
markers and the whole genetic value of each animal is well
estimated (being the sum of all QTL effects) without any precise
knowledge of each individual QTL effect. The major limitations
are the large number of markers to analyse and the resulting
costs. But these costs are dramatically decreasing. Thus the next
generations of practical tools for the livestock industry will be
SNP chips for large-scale genotyping (11). This will continue to
improve the efficiency of production, reproduction and growth
as well as product quality. More importantly, SNP associations
can be helpful for traits more difficult to work on, such as
behaviour, disease resistance and reduced waste for the
environment. In any case, more initial studies are needed to use
these markers efficiently: in other words, scientists must clearly
demonstrate by association studies the interest of the SNPs with
regard to the traits of interest (14).
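The principle of genomic selection stated above, in which the genetic value of an animal is estimated as the sum of the effects of markers covering the whole genome, can be sketched in Python as follows. This is a minimal additive illustration, not the statistical method of the cited authors: the marker effects, animal names and genotypes are invented for the example, and in practice the effects would first have to be estimated from a reference population.

```python
def genomic_value(genotype, marker_effects):
    """Sum marker effects weighted by allele counts (0, 1 or 2 per marker)."""
    if len(genotype) != len(marker_effects):
        raise ValueError("one allele count is needed per marker")
    return sum(count * effect for count, effect in zip(genotype, marker_effects))


# Hypothetical effects of three SNP markers on a trait (arbitrary units).
MARKER_EFFECTS = [0.5, -0.25, 1.0]

# Hypothetical animals: copies of the counted allele carried at each marker.
ANIMALS = {
    "animal_A": [2, 0, 1],
    "animal_B": [0, 2, 2],
    "animal_C": [1, 1, 0],
}

values = {name: genomic_value(geno, MARKER_EFFECTS) for name, geno in ANIMALS.items()}
ranking = sorted(values, key=values.get, reverse=True)

for name in ranking:
    print(name, values[name])
```

Because only genotypes are needed once the marker effects are known, candidates can in principle be ranked before any phenotype is recorded.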
As a result, many livestock companies or associations are
thinking of implementing genomic selection. There is a huge
challenge in this area for the immediate future. This is likely to
have major effects on the agendas of research and commercial
organisations since genomic selection will probably redesign
animal breeding and management programmes. For instance, it
is anticipated that the need to progeny test dairy bulls for the
milk production of their daughters or beef bulls for meat quality
of their offspring may disappear at least in part. It is even
intended to select at early stages of life, and it may notably be
possible to select embryos directly. However, a balanced
approach must be taken to ensure the new methods will enhance
but not supplant traditional selection (15).
In the long term it can be argued that all the genomics
approaches described below (transcriptomics, proteomics,
metabolomics) have the potential to be integrated into existing
animal breeding. So, we have moved from quantitative genetics
to molecular genetics and we will move from genetical genomics
to systems genetics (16). The main added-values will be
refinement of the identified QTL, understanding of gene-environment and gene-gene interaction, detection of regulator
genes and of pleiotropic QTL. Some authors also predict direct
selection on heritable gene expression profiles (namely
"expression assisted selection") (16). In human medicine, the
same approach could be called "expression-assisted evaluation"
within a perspective of personalised medicine.
MATURE TECHNIQUES FOR GENE EXPRESSION
STUDIES
For many years, attention was directed to specific biological
pathways or rate-limiting enzymes and key genes with a high
impact on physiological traits. Indeed, before genomics,
molecular biology aimed at investigating the expression of
single genes in isolation from the larger context of other genes.
This is referred to as the "candidate gene approach". Different
methods were used to detect and quantify the expression level of
individual genes (e.g. northern-blot, subtractive hybridisation, or
real-time PCR) and their products (e.g. western-blot and
ELISA). In fact, physiological processes are governed by several
genes acting in concert rather than by only one or a few
individual genes. More recently the advent of genomic
technologies (array technology, proteomics and metabolomics)
has enabled the analysis of thousands of genes or proteins or
metabolites in a single experiment (genomic approach) (15, 17).
Since the genomic strategy is to identify, from among thousands,
differentially expressed genes or proteins between extreme
individuals without any a priori knowledge of their functions,
scientists hope to detect potentially interesting new genes and
molecular mechanisms which were not previously suspected to
be important for biological processes. This will therefore
generate new biological hypotheses.
Transcriptomics and its applications to medicine and livestock
production
Gene expression patterns can be described through methods
enabling large scale analysis of the entire set of genes expressed
from the whole genome (transcriptome). Among these methods
are Serial Analysis of Gene Expression (SAGE) and DNA arrays
(DNA chips). The advantages and limits of both approaches
were recently compared (2).
Briefly, SAGE is a high-throughput, high-efficiency method
to evaluate the expression pattern of thousands of genes in a
quantitative manner without prior sequence information (18,
19). It is based on the isolation of unique sequence tags from
individual transcripts that are concatemerized before being
cloned into a vector (20, 21). Sequencing of concatemer clones
reveals individual tags and enables quantification of transcripts
while giving the opportunity of identifying new transcripts.
Many studies have used SAGE to obtain pictures of global gene
expression patterns, especially in medicine for cancer (22) or
obesity (23) research and in veterinary research for the study of
bovine trypanotolerance genetic control (24).
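The counting step of SAGE described above can be sketched in Python. This is a deliberately simplified toy model: it assumes that each tag is a fixed-length sequence immediately following an anchoring site in the sequenced concatemer, and it ignores the ditag structure and sequencing errors handled by real SAGE pipelines. The anchor sequence, tag length, function name and example sequences are assumptions made for illustration.

```python
from collections import Counter

ANCHOR = "CATG"   # assumed anchoring site separating tags (toy assumption)
TAG_LEN = 10      # assumed tag length in bases (toy assumption)


def count_tags(concatemer, anchor=ANCHOR, tag_len=TAG_LEN):
    """Count fixed-length tags that follow each anchor site in a concatemer."""
    counts = Counter()
    pos = concatemer.find(anchor)
    while pos != -1:
        start = pos + len(anchor)
        tag = concatemer[start:start + tag_len]
        if len(tag) == tag_len:          # skip a truncated tag at the end
            counts[tag] += 1
        # Resume the search after the tag so that an anchor-like motif
        # inside a tag is not mistaken for a boundary.
        pos = concatemer.find(anchor, start + tag_len)
    return counts


# Hypothetical tags standing for two different transcripts.
tag_a = "AAAAAAAAAA"
tag_b = "CCCCCCCCCC"
clone = ANCHOR + tag_a + ANCHOR + tag_b + ANCHOR + tag_a

for tag, n in count_tags(clone).most_common():
    print(tag, n)
```

The tag counts are the quantitative read-out: a transcript whose tag is seen twice as often is taken to be about twice as abundant in the sample.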
With the ability to assay the abundance of thousands to millions of RNAs
at the same time, microarray technology has
fundamentally changed how biological questions are addressed,
from examining one or a few genes to a collection of genes or
the whole genome. Compared to the first studies focused on
gene expression, microarray technology has come a long way in
terms of the number of features available on an array and the
range of potential applications. For instance, DNA microarrays
are also used to detect DNA-protein (e.g. transcription factor-binding site and transcription factor) interactions, alternatively
spliced variants, the epigenetic status of the genome (such as
methylation patterns), DNA copy number changes and sequence
polymorphisms, etc (25). Automated arrays also have a
promising future for many other analytical applications (26)
including tissue microarrays which have become a valuable tool
for validating candidate markers in cancer research (27).
However, the best-known use of DNA microarrays is for
profiling messenger RNA levels, which is detailed below.
Recent studies have shown the relevance of microarrays for
revealing novel genes that had not previously been thought to be
involved in a physiological or nutritional response. Considerable
effort has been expended in recent years on examining the
molecular bases of diseases (cancer research, toxicology, etc) and
the effects of pharmaceuticals on cell and animal models (17). A
few selected examples below will illustrate the advantages of
transcriptomics for the prediction of human health.
A large meta-analysis of 3762 DNA microarrays from 40
publications led to the identification of genes differentially
expressed in cancer (28). A metasignature consisting of 67 genes
was found to be a significant predictor of cancer, independently
of the cancer type. Different sets of genes were characterised to
discriminate (i) between differentiated and undifferentiated
cancers, (ii) cancers according to their outcome, (iii) metastatic
and primary cancers or (iv) oestrogen receptor positive and
negative cancers.
A transcriptomic analysis of the aorta of mice fed a high fat
diet and of apoE deficient mice, a widely used model of
atherosclerosis, showed that the expression of over 700 genes
was affected by the disease (29). These genes were differentially
expressed over time as the disease developed. A set of 38 genes
accurately classified five stages of the disease. The genes
affected at the earlier stages of the disease, before the formation
of detectable vessel lesions, may be particularly important as
diagnostic markers.
Gene expression profiling also makes it possible to explore
the mechanisms underlying pathological processes. Genes
identified by these approaches may have been previously linked
to the disease but novel genes are also often identified, throwing
new light on the mechanisms driving the development of
disease. For example, the general cancer metasignature made of
67 genes and described above includes many genes previously
associated with different cancers (28). These genes are likely key
transcriptional factors in neoplastic transformation. Some of
them encode enzymes such as topoisomerase II or
members of the proteasome complex already known to
participate in neoplastic transformation, and established targets
of chemotherapeutic drugs. However, other genes in this
signature were not previously known and might become novel
targets for drugs. Similarly, many of the genes over-expressed in
the aorta of apoE deficient mice were inflammatory genes, some
of them newly associated with atherosclerosis (29). Functional
annotation of these genes through gene ontology confirmed the
contribution of known pathways such as "wound healing",
"apoptosis" or "nitric oxide mediated signal transduction" or
"cell adhesion and migration", but also revealed new biological
processes associated with the development of atherosclerotic
lesions, such as "carbohydrate metabolism", "complement
activation", "calcium ion homeostasis" or "collagen catabolism".
Genomics can also be applied to characterise common
physiological processes. For example, ageing was explored in mice
by comparing the gene expression profile of the muscle of young
and aged mice (30). Out of the 6347 genes surveyed, 113 showed
a more than two-fold increase or decrease in expression with ageing.
Functional analysis of these genes suggested an increase in stress
responses and neuronal growth, and a decrease in energy
metabolism and in the biosynthesis of some lipids and proteins.
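The selection criterion used in this example, a more than two-fold increase or decrease in expression, can be written as a threshold on the log2 ratio between the two conditions. The Python sketch below illustrates the criterion only; the gene names and expression values are invented, and a real analysis would also include normalisation and statistical testing, which are omitted here.

```python
import math


def changed_genes(young, old, min_fold=2.0):
    """Return genes whose old/young ratio exceeds min_fold in either direction.

    The result maps each selected gene to its log2(old/young) ratio.
    """
    threshold = math.log2(min_fold)
    selected = {}
    for gene, young_level in young.items():
        log_ratio = math.log2(old[gene] / young_level)
        if abs(log_ratio) > threshold:   # strictly more than min_fold
            selected[gene] = log_ratio
    return selected


# Hypothetical expression levels (arbitrary units) in young and aged muscle.
YOUNG = {"gene_1": 10.0, "gene_2": 10.0, "gene_3": 20.0, "gene_4": 10.0}
OLD = {"gene_1": 25.0, "gene_2": 15.0, "gene_3": 8.0, "gene_4": 20.0}

for gene, log_ratio in changed_genes(YOUNG, OLD).items():
    direction = "up" if log_ratio > 0 else "down"
    print(f"{gene}: {direction}, log2 ratio {log_ratio:.2f}")
```

Working on the log2 scale treats increases and decreases symmetrically: a 2.5-fold increase and a 2.5-fold decrease give ratios of equal magnitude and opposite sign.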
Transcriptomic studies in livestock animals are still few,
although recent ones have been reviewed for
pigs (31) and cattle (32, 33). However, a multitude of applications
(e.g. increased livestock productivity, meat and milk quality,
prevention of diseases) is driving genomic studies of farm
animals. Recently, gene expression-based research related to beef
quality has focused on identification of molecular predictors
associated with meat quality traits such as toughness and marbling
(34). Others were developed to better understand foetal muscle
development (35, 36), the mechanisms underlying muscle growth
potential (37, 38) and effects of nutritional changes (39) which all
influence the composition of muscle tissue. Intramuscular fat
development was also examined (40, 41) since it influences
marbling and thus juiciness and flavour of beef.
Only a few studies aimed to identify differentially expressed
genes according to beef sensory quality, especially tenderness.
For instance, Bernard et al. (42) searched for differentially
expressed genes associated with variability of beef tenderness in
Charolais males. They found that expression of the DNAJA1
gene was strongly related to tenderness after 14 days of ageing.
This finding has been protected by a patent filed in Europe in
September 2006 by INRA (EP06300943.5). The DNAJA1
protein is a member of the heat shock 40kDa protein family. It is
a co-chaperone of the Hsc70 protein and seems to play a role in
protein import into mitochondria. An emerging hypothesis is that
DNAJA1 could decrease apoptosis and therefore meat ageing
and its tenderisation during the days following slaughtering.
Further studies are needed to characterize DNAJA1 involvement
in beef tenderness and to look at the relationship between
DNAJA1 expression level and tenderness in other beef breeds or
production systems.
It is clear that gene expression profiling has revealed that
unsuspected genes may be potential molecular markers of
phenotypic traits. Meanwhile, progress is being made in the
understanding of gene expression. For instance, small RNAs are
a growing class of recently identified noncoding RNAs. They
can be divided into different classes including microRNAs,
small interfering RNAs, etc (43). So far, more than 600
microRNAs (miRNAs) have been identified in humans, and are
estimated to regulate more than one third of cellular messenger
RNAs (44). MicroRNAs seem to have unique tissue-specific,
developmental stage-specific or disease-specific patterns. Since
they also seem to regulate gene expression through various
mechanisms, they are of increasing interest in biology (43). The
importance of microRNAs in physiology can be illustrated in
Texel sheep: the allele of the myostatin gene in Texel sheep is
characterized by a G to A transition in an untranslated region.
This mutation creates a target site for miRNAs that are highly
expressed in skeletal muscle. This causes translational inhibition
of the myostatin gene and hence contributes to the muscular
hypertrophy of Texel sheep (45). Analysis of SNP databases for
humans and mice demonstrates that mutations creating or
destroying putative miRNA target sites are abundant and might
be important effectors of phenotypic variation. The profiling of
miRNA expression is a new field under development for which
adaptation of the array technology is needed (43).
Proteomics, principles and examples of applications related to
human medicine and animal science
Unlike DNA and RNA, proteins are the molecules which
build the cells. Knowledge of protein abundance and isoform
patterns is thus critical for the understanding of physiological
functions. One major objective of proteomics is to quantify
protein levels and their dynamic changes. To achieve this goal,
proteins can be studied by different techniques including their
physical separation which is commonly used. Once separated
and if they are of interest (due to different levels for instance),
proteins can be identified using mass spectrometry approaches
which were the subject of major improvements during the last
decade (46). Any type of biological sample can be analysed,
including tissues, as for transcriptomics, but also biological
fluids (plasma, lymph, etc). Only a few examples will be given
in this section to illustrate some current methodologies and
potential applications.
Plasma is unique since it lacks a genome and hence it does
not have any transcriptome. However, plasma contains proteins
which can originate from any other tissue or cell within the body.
Great efforts have been made to characterise the plasma
proteome: a great number of proteins has been detected but their
concentrations differ by more than 10 orders of magnitude
between the most abundant and the rarest ones. The major reason
to study the plasma proteome is the hope of detecting protein
markers indicative of any disease, since blood can be easily
obtained through non-invasive procedures. Thus, the human
plasma proteome holds the promise of huge progress in disease
diagnosis and therapeutic monitoring, provided that major
technical challenges in proteomics can be solved (47). In this
context, the objectives of the Plasma Proteome Project are: (i) a
comprehensive analysis of plasma and serum protein
constituents in people, (ii) the identification of biological
sources of variation within individuals over time, with validation
of biomarkers and (iii) the determination of the extent of
variation across populations and within populations
(http://www.hupo.org/research/hppp/). Theoretically, plasma
proteins are easily obtained and some are present in relatively
high concentrations. In fact, 22 proteins make up about 99% of
the plasma protein content. However, the dynamic range of
protein concentrations in plasma (about ten orders of magnitude)
is much greater than the dynamic range of the analytical tools
(about two orders of magnitude for a mass spectrometer). So the
less abundant but more interesting proteins are likely to be
overlooked if the most abundant proteins are not removed (48).
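The consequence of this mismatch in dynamic range can be illustrated with a small Python sketch. It assumes, as a simplification, that the instrument only detects proteins whose concentration lies within a fixed number of orders of magnitude below the most abundant protein in the sample. The protein names and concentrations are invented; only the figure of two orders of magnitude comes from the text.

```python
def detectable(concentrations, instrument_orders=2):
    """Return the proteins within `instrument_orders` of the most abundant one.

    Simplifying assumption: the detection window is anchored on the
    highest concentration present in the sample.
    """
    if not concentrations:
        return set()
    floor = max(concentrations.values()) / 10 ** instrument_orders
    return {name for name, conc in concentrations.items() if conc >= floor}


def deplete(concentrations, removed):
    """Return the sample after removal of the named proteins."""
    return {name: conc for name, conc in concentrations.items() if name not in removed}


# Hypothetical plasma sample (arbitrary concentration units).
SAMPLE = {
    "abundant_A": 1e10,
    "abundant_B": 1e9,
    "rare_C": 1e5,
    "rare_D": 5e3,
}

before = detectable(SAMPLE)
after = detectable(deplete(SAMPLE, {"abundant_A", "abundant_B"}))

print("Detected in whole sample:   ", sorted(before))
print("Detected after depletion:   ", sorted(after))
```

In this toy sample the rare proteins only fall inside the detection window once the two abundant ones have been removed, which is the rationale for depletion given above.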
Proteomics also has many potential applications in
livestock, so far mainly in animal health and disease,
reproduction and muscle biology related to meat quality (46). We
will here illustrate the advances in muscle biology as examples.
The effects of genetic selection towards high muscle
development in order to increase meat production have been
extensively studied by proteomic approaches. Various studies
were performed to study extreme animals with muscle
hypertrophy, namely Belgian Blue bulls with myostatin deletion
or Texel sheep harbouring a Quantitative Trait Locus (QTL) for
muscle development. Seventeen Troponin T isoforms were
detected in the bovine Semitendinosus muscle, eleven of them
belonging to the fast type (fTnT) and originating from the
mutually exclusive alternative splicing of fTnT exon 16 and fTnT exon
17. Comparison of the proteomes between the Semitendinosus
muscles of two groups of Belgian Blue bulls with or without
myostatin deletion demonstrated that Troponin T isoform
patterning was altered by myostatin loss-of-function and could
also be a good marker for the prediction of muscle mass (49).
In addition, many papers have described the proteome
changes of post-mortem processes in pork, beef and fish (50).
Post-mortem markers detected during the first 48h of post-slaughter storage included structural proteins (e.g. actin, myosin
and troponin T) as well as metabolic enzymes (e.g. myokinase,
pyruvate kinase and glycogen phosphorylase). Accumulation of
fragments of these proteins was found to correlate with meat tenderness.
Some papers have focused more on proteome changes related to
proteolysis during post-mortem storage (51) or to meat quality
problems. Lastly, the occurrence of low-molecular weight
peptides in bovine pectoralis profundus muscle during post-mortem storage and cooking was analysed directly by mass
spectrometry (52).
These examples underline that proteomics may have many
applications complementary to those of transcriptomics. Thanks
to the improvement in instrumentation, most current studies in
proteomics use mass spectrometry to detect and identify
proteins. Thus, the advantage of separating proteins (especially
by two dimensional electrophoresis) before mass spectrometry
analysis has increased considerably. There are however some
clear limitations such as difficulties with membrane-associated,
very acidic or very basic, very low or very high molecular
weight and very low abundance proteins (17). In recent years,
significant progress has been made to improve the microarray
technology applied to proteins. This technology is similar to that
of transcriptomics in its principle. Specific proteins or peptides
representing the proteins of interest can be arrayed as well as
antibodies against the studied proteins. Nowadays, sample and
data handling are key issues in developing high-performance
antibody arrays (53). This field is expected to make rapid
progress and to move towards standardised protocols just as
transcriptomics did. In parallel, software for comprehensive
pathway analysis and/or literature mining has been developed,
including Ingenuity (Ingenuity Scientific), Pathway Studio
(Ariadne) or Bibliosphere Pathway Edition (Genomatix GmbH).
By providing opportunities for identifying molecular networks,
they constitute powerful tools to go further in the deciphering of
the molecular bases of biological functions.
METABOLOMICS
Objectives and principles of metabolomics
Just as the objective of genomics is to study all genes,
metabolomics aims at quantifying and characterising all
metabolites present within cells, biofluids or tissues under a given
set of conditions. The difficulty is that metabolites are much more
diverse in their chemical structures and properties than nucleic
acids or proteins, making them more difficult to extract and analyse
using a single protocol. In addition, unlike for DNA and RNA, no
amplification techniques are available for metabolites, making
sensitivity critical. So, there are today no universal techniques able
to quantify all metabolites present in a given sample.
Two different metabolic approaches can be distinguished:
metabolic profiling and metabolic fingerprinting (54). Metabolic
profiling is a targeted approach because the studied metabolites
belong to a specific category and share common physicochemical properties. Improvement of the sensitivity and
resolution of the analytical methods has made the analysis of a
much larger number of metabolites of a given class in a single
analysis possible, in comparison to former analytical methods
focused on a more limited number of metabolites. This has led
to the emergence of disciplines such as lipidomics and
peptidomics (55, 56) which are large-scale analyses of lipids and
peptides respectively. But in fact, metabolic profiling is not a
truly omic approach since it analyses metabolites known a priori.
In metabolic fingerprinting, metabolites are analysed in a
truly global manner, using more universal analytical methods
such as nuclear magnetic resonance (NMR) or mass
spectrometry (MS), with no a priori hypothesis on the nature
of the metabolites of interest. The limit for characterisation of
the metabolome is then the limit of detection of the equipment
used for data capture. Metabolic patterns of samples
originating from different cells, animals or individuals, are
compared and the samples classified using multivariate
statistical tools. Proton NMR has been used for over 20 years for
such applications. Any molecule containing one or more
protons gives a signal with a chemical shift characteristic of its
chemical environment in the molecule. Chemical shifts are
therefore characteristic of a given metabolite and can be used
for identification of a priori unknown markers. NMR analysis
offers several advantages: the intensity of each signal is
proportional to the concentration of the proton-containing
molecule with a wide dynamic range. It is robust and fairly
reproducible (57). Spectrum acquisition is fast and simple
since several hundred samples can be analysed in a day. The
main limit of NMR is its lack of sensitivity. Only metabolites
present in millimolar concentrations are usually detected. This
is the reason why only 20-40 metabolites in tissue samples and
100-200 in urine samples are generally observed in NMR
metabolomic studies (58). This low sensitivity explains why
only limited new biological knowledge has so far been
generated using NMR metabolomics.
MS is far more sensitive with detection limits in the
micromolar range. Most organic metabolites can be ionised and
ions can be separated according to their mass/charge (m/z) value.
For technical reasons and to avoid ion-suppression effects,
metabolites are most often separated by gas or liquid
chromatography before mass analysis (54). Gas chromatography
(GC) can be used for volatile compounds and polar metabolites. In
liquid chromatography (LC), analytes are usually separated on
reverse phase columns with particle size of 3-5 µm. The most
widespread instruments in use for LC-MS metabolomics are high-resolution mass spectrometers such as time-of-flight mass
spectrometers (Tof-MS). Ultraperformance liquid chromatography
(UPLC) on columns with a particle size of 1.4-1.7 µm is also
increasingly used to reduce run times and increase resolution of the
chromatograms (59). Both reverse and direct phases are used to
analyse respectively polar and apolar metabolites. Several
hundreds of variables can be measured within 30 min or less for a
given biological sample, each characterised by its m/z value,
retention time and intensity.
Following multivariate analysis of the data, biomarkers
of interest can then be identified by comparison of the
corresponding mass information with that stored in libraries or
databases. A search in publicly available databases such as
KEGG, MetaCyc or the Human Metabolome Project provides
tentative annotation of the different ions. In practice, it is still
difficult to annotate the markers due to the lack of
comprehensive databases.
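The annotation step described above can be sketched in a few lines of Python. This is a minimal illustration only, assuming singly protonated ions ([M+H]+) and a mass tolerance expressed in parts per million; the three-entry library, the function name and the default tolerance are hypothetical and stand in for a query to a real database.

```python
# Hypothetical mini-library of monoisotopic masses (Da). A real search would
# query a public database instead of this hard-coded dictionary.
LIBRARY = {
    "glucose": 180.06339,
    "tryptophan": 204.08988,
    "nicotinic acid": 123.03203,
}

PROTON_MASS = 1.007276  # mass added to form a singly protonated ion [M+H]+


def annotate(mz, tolerance_ppm=10.0):
    """Return library names whose [M+H]+ m/z lies within the ppm tolerance."""
    hits = []
    for name, mass in LIBRARY.items():
        expected_mz = mass + PROTON_MASS
        error_ppm = abs(mz - expected_mz) / expected_mz * 1e6
        if error_ppm <= tolerance_ppm:
            hits.append(name)
    return hits


if __name__ == "__main__":
    print(annotate(205.0972))  # measured m/z close to protonated tryptophan
    print(annotate(300.0))     # no library entry within tolerance
```

A match on mass alone gives only a tentative annotation, since several metabolites can share the same mass; retention time or fragmentation data would be needed to confirm the identity.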
These difficulties for the identification of the nature of the
markers have encouraged some groups to develop platforms to
analyse the main metabolites of interest in a given field of
research. For example, a capillary electrophoresis-Tof-MS
method was developed to analyse 569 metabolites expected to be
present in mouse tissues (60). However many markers identified
by the fingerprint approach fell out of this list of expected
metabolites, emphasising the limits of these approaches. No
more than 132 metabolites out of 1859 detected features could
be identified in the mouse tissue extracts. Similarly, 191
metabolites were monitored in human plasma by tandem MS but
only 97 were detected in most samples (61). In addition, 308
metabolites previously described in human cerebrospinal fluid
were analysed on 3 different MS platforms but only 70 could be
routinely detected in this biofluid (62). Identification of new and
unexpected markers and related mechanisms of action of drugs,
toxins or nutrients will depend on our capability to identify these
markers in the future. This will require a considerable effort and
big investments to further develop the metabolite databases and
the bioinformatics tools needed to interpret the information
captured in a system-based approach (63, 64).
Examples of applications
An increasing number of publications in metabolomics is
available. They all show that the more variables are measured, the
higher the chance of detecting the subtle differences
characterising each of the phenotypes compared. For example, 113
unknown metabolites detected by HPLC in urine samples were
found to better discriminate patients with liver cancer from those
with hepatitis or hepatocirrhosis, as compared to 15 known
urinary nucleosides which supposedly accumulate in cancer
cells due to a high turnover of tRNA (65).
Metabolites as the endpoint of physiological regulatory
processes may also be good predictors of disease states.
Comparison of metabolic profiles in heart extracts by 1H-NMR
allowed four genetic mouse models of cardiac diseases to be
differentiated (66). However, the genetic backgrounds for the
different strains also affected the metabolic profiles and
particularly some metabolic pathways related to vessel function,
therefore reducing the capacity of the model to recognise the
diseases. Similar NMR metabolomic analyses were used to
analyse sera collected from patients with various degrees of
vascular stenosis. The status of the patients could be better
predicted using metabolomics than by measuring conventional
risk factors. Lipid signals contributed most to the prediction (67).
More recently, other authors showed that this approach only
weakly predicts the disease due to a confounding effect of
treatments with lipid-lowering drugs such as statins (68). An
NMR-based metabonomic study was carried out to identify
urinary markers of osteoarthritis and a PLS regression model was
developed and shown to accurately predict the disease grade (69).
These results clearly show that metabolic profiling in well-controlled animal studies can be useful to identify disease-specific markers in human subjects. These markers could be less
easily discovered in human studies due to various confounding
effects such as drug treatment or not easily controlled diet.
AN EXAMPLE OF AN INTEGRATIVE OMIC:
NUTRIGENOMICS
Objectives and principles of nutrigenomics
Nutrition is an integrative science encompassing many
aspects of food science, biochemistry, physiology and medicine.
Progress in understanding nutrient absorption and energy
metabolism was achieved by different and sequential approaches
(e.g. calorimetry, multicatheterisation techniques, tissue or cell
culture, gene expression). The advent of functional genomics has
made it possible to study thousands of genes or proteins without
any previous knowledge of the metabolic features to be studied.
Through the development of high-throughput DNA sequencing
techniques, array technology and protein analysis, genomics
provides outstanding opportunities to ask key scientific
questions about nutrient-gene interaction and look at the
molecular links between nutrition and physiology. This has led
to "nutrigenomics". This term was coined in 2002 and refers to
the regulation of gene expression by nutrients taking advantage
of the new genomic approaches (70). Nutrigenomics is an
example of integrative omic science since it relies on
transcriptomics, proteomics, metabolomics and other omics
approaches but for nutritional objectives only. Nutrigenomics is
generally devoted to the interaction between nutrition and health
in human beings. It can however be extrapolated to animal
sciences. It is promising in identifying biomarkers of nutritional
status and disease, and individualised nutrient requirements.
Nutrigenomics is of particular interest in the context of
managing livestock animals for production (animal
performance, health, quality of animal products).
Application to human health
Nutrigenomics appears particularly adapted for exploring
the complex links between diet and health in human beings. The
large amount of data generated by such approaches allows a
metabolic state to be characterised with far greater accuracy. For
example, blood cholesterol correlates to the risk of
atherosclerosis and its level is influenced by the diet. A reduction
of blood cholesterol in populations has become a public health
objective to reduce the incidence of cardiovascular diseases.
However, other independent risk factors are also known for
cardiovascular diseases and it appears today unrealistic to rely
on a single or too limited number of biomarkers for evaluating
the risks of such multifactorial diseases (71). Secondly,
nutrigenomics may allow a description of how the diet
influences metabolism to reach a more healthy metabolic state
(72). The biomarkers known so far are clearly insufficient to
evaluate the influence of the diet or nutrients on disease risk.
Thirdly, assessing the role of the diet in disease prevention must
be global in order to determine its effects on any metabolic
pathway that could lead to disease. The huge amount of
information generated by omics approaches already allows
metabolic states to be characterised with far more precision than
the classical approaches. Sets of biomarkers (transcripts,
proteins or metabolites) can be extracted from metabolic
fingerprints (also called metabolic signatures) and used routinely
for metabolic assessment (73, 74).
Applications of genomics tools to nutrition research in
humans have been discussed in some excellent recent reviews
(75, 76). Some examples are given here to illustrate their
potential and limits.
Transcriptomic tools have been used in well-controlled
animal experiments to characterise the effects of deprivation or
supplementation of particular nutrients. Caloric restriction was
shown to reverse the changes in expression of several genes
associated with ageing in the skeletal muscle of rats (77). More
particularly, the activity of genes involved in fatty acid and
protein biosynthesis and in energy metabolism was restored.
Genes involved in the repair of macromolecule damage were
also overexpressed. High-fat diets, when compared to standard
chow diets in mice, induced major changes in the liver
transcriptomic profiles, mainly related to lipid metabolism,
defence response and detoxification (78). The same effects were
observed independently of the mouse strain considered,
apoE3Leiden or C57BL/6J. Many of the genes affected were
under the control of nuclear receptors whose ligands are bile acids,
fatty acids and cholesterol.
Phenotypes associated with vitamin deficiency and their
normalisation by vitamin supplementation have been
characterised by proteomics and metabolomics in animal models
(79, 80) and human subjects (81). These approaches helped to
determine the role of vitamin deficiency in disease syndromes.
Phytochemicals present in foods are often characterised by a
wide array of metabolic effects and genomics appears
particularly suited for characterisation of their effects. Genistein,
a phytooestrogen present in soy foods, is thought to participate
in the prevention of cardiovascular diseases. Endothelial cells
challenged by homocysteine, a risk factor for cardiovascular
diseases, were exposed to genistein (82). Several metabolic
pathways related to atherosclerosis and influenced by genistein
could be identified by protein fingerprinting. Genistein blocked
the alterations induced by homocysteine on 17 out of 700
proteins quantified. These proteins were involved in metabolism,
gene regulation, protein folding, detoxification and apoptosis.
Genistein supplemented to the diet of mice was also found to
fully reverse the expression of 80 out of the 97 genes
differentially expressed (>2 fold) in the liver upon a high-fat diet
(83). These genes encoded enzymes involved in lipid and
carbohydrate metabolism or were related to detoxification,
inflammation, apoptosis and transcription regulation. These
changes in gene expression were linked to a reduction of body
weight and an improvement of various lipid parameters.
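The selection of differentially expressed genes by a fold-change threshold, as in the study above, can be illustrated with a short Python sketch. The gene names and expression values are invented for illustration, and the rule used here to call a gene "reversed" (expression back within 2-fold of the control after supplementation) is an assumption, not the criterion used by the authors of the cited study.

```python
import math

# Hypothetical expression values (arbitrary units) for three diet groups:
# gene: (control, high_fat, high_fat_plus_supplement)
EXPRESSION = {
    "gene_a": (100.0, 250.0, 110.0),  # induced by high-fat diet, then reversed
    "gene_b": (100.0, 40.0, 45.0),    # repressed by high-fat diet, not reversed
    "gene_c": (100.0, 120.0, 115.0),  # change below the 2-fold threshold
}

THRESHOLD = 1.0  # |log2 ratio| > 1 corresponds to a change of more than 2-fold


def log2_ratio(value, reference):
    """Log2 of the expression ratio relative to the reference group."""
    return math.log2(value / reference)


def classify(expression):
    """Return (genes changed >2-fold by the diet, genes reversed by the supplement)."""
    changed, reversed_ = [], []
    for gene, (control, high_fat, supplemented) in expression.items():
        if abs(log2_ratio(high_fat, control)) > THRESHOLD:
            changed.append(gene)
            # Assumed rule: "reversed" means back within 2-fold of the control.
            if abs(log2_ratio(supplemented, control)) <= THRESHOLD:
                reversed_.append(gene)
    return changed, reversed_


if __name__ == "__main__":
    changed, reversed_ = classify(EXPRESSION)
    print(f"{len(reversed_)} of {len(changed)} changed genes were reversed")
```

Working on the log2 scale treats induction and repression symmetrically, so a 2-fold increase and a 2-fold decrease are both flagged by the same threshold.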
Catechin, a phenolic antioxidant present in many fruits,
wine, tea and chocolate, was shown by a metabolomics approach
to reverse certain metabolic dysregulations induced by a high-fat
diet (84). Several of the urinary markers showing a reversion
upon catechin supplementation were related to the metabolism
of tryptophan and nicotinic acid.
Application in farm animals
Nutrigenomics is of interest in the context of managing
livestock animals for production. Underfeeding/refeeding
protocols are generally used to identify genes responsive to
nutritional manipulation. For example, the influence of
prepartum nutrition on hepatic gene expression was examined in
Holstein cows submitted either to moderate energy restriction or
fed ad libitum (85). Energy restriction induced an upregulation
of some of the genes involved in fatty acid oxidation,
gluconeogenesis and cholesterol synthesis. Conversely,
moderate ad libitum feeding favoured the expression of certain
genes associated with fat synthesis, thus predisposing cows to
fatty liver. In addition, ad libitum feeding resulted in
transcriptional changes potentially compromising liver health
through increased susceptibility to oxidative stress and DNA
damage. These data strengthened the importance of shaping the
prepartum nutrition of dairy cows and suggested that the
common practice of increasing the energy density of prepartum
cow diets should be rethought. Another study examined the
impact of fasting on the liver transcriptome of pigs (86). Fasting
induced genes involved in mitochondrial fatty acid oxidation
and ketogenesis, as shown for rodents. These genes were also
induced by feeding pigs a diet supplemented with clofibric acid
(a PPARα agonist), indicating that PPARα, a transcription factor
involved in lipid metabolism, is likely to play an active role in the
metabolic adaptation to fasting in pigs.
A discontinuous growth path is generally observed during
extensive rearing of beef cattle due to huge variations in forage
availability during the year, as is the case for instance in tropical
countries. Previous studies have shown that mild nutritional
restriction followed by ad libitum feeding had only a small effect
on muscle characteristics, with the major effects observed being
changes in metabolic enzyme activities (87). The effect of more
severe undernutrition was examined using microarray
technology to get a broader view of the changes which occur.
Lehnert et al. (88) studied changes in the gene expression profile
of Belmont Red steers' Longissimus during body weight loss and
subsequent realimentation. In the Longissimus muscle, a major
underexpression was observed for genes encoding muscle
structural proteins (ACTA1, TPM2), extracellular matrix
(COL1A1, COL1A2, COL3A1, FN1) and muscle metabolic
enzymes (ATP1A2, CKM) especially those belonging to the
metabolic glycolytic pathway (e.g. ALDOA, ENO, GAPDH,
PGK1, PKM2, TPI1) (88). This orientation of metabolism
towards less glycolytic features probably reflects an adaptation
to cope better with nutritional deprivation. The expression of
most of the genes was restored after realimentation. In addition,
a small group of genes potentially involved in myogenic
differentiation, maintenance of mesenchymal stem cells,
modulation of membrane function, prevention of oxidative
damage and regulation of muscle protein degradation was shown
to be upregulated. More surprisingly, expression of the SCD
gene was increased by undernutrition but the significance of this
observation is not known (88). However, these results might
only be valid for the Longissimus muscle as muscle types
respond differently to changes in feeding level, as has been
shown by biochemical approaches (87).
The influence of two production systems (pasture vs. maize
silage indoors) on muscle gene expression was studied in 30-month-old Charolais steers (89). Transcriptomic analyses using
a multi-tissue bovine cDNA macroarray (90) were performed to
compare gene expression profiles in two muscles between the
two production groups. This strategy was designed to identify
differentially expressed genes that may be potential indicators of
pasture feeding. Interestingly, the study revealed differential
expression of the selenoprotein W (SeWP) gene, which was
found to be downregulated in the muscles of steers grazing on
pasture. Although its metabolic function is not yet known, SeWP
is likely to play a role in oxidant defence (91). The abundance of
SeWP in skeletal muscles and some other tissues is regulated by
dietary selenium (92, 93) especially in humans, for whom SeWP
is highly sensitive to selenium depletion. Thus, the differential
expression of SeWP in grazing steers may be related to the
selenium content or bioavailability in their diet (grass vs. maize
silage), but this remains to be clarified. Lastly, SeWP expression
in grazing steers is most probably linked to selenium availability
in the diet rather than to their mobility (89 and in the same
journal issue). So transcriptomic analysis allowed muscle SeWP
expression to be proposed as a putative indicator of a pasture-based system.
PERSPECTIVES
High-throughput phenotyping
Whereas researchers can analyse gene structure and
expression as well as metabolites relatively easily, they are less
efficient at describing phenotypes due to the huge diversity of
physiological and biological traits to analyse, and the
concomitant high diversity of techniques required for that
objective. One new challenge in biology is therefore to develop
high-throughput techniques of phenotyping to analyse as many
biological traits as possible on a high number of samples.
In the absence of any direct method to assess phenotypes,
one way to solve the problem is to construct a regulatory
network either from in silico connections or from experimental
data. In the first case, connections are established between
different items of omics information (metabolites, protein or
gene expression, sequence information) available on public
databases. As examples, we can cite the analysis of regulatory
elements of genes, confirmation of predicted pathways and
interactions between biological pathways, protein interactions,
etc. This approach is widely developed in simple organisms such
as yeast. Another approach is correlation analysis over a range of
data compiled from a large number of experiments. The rationale
for this is that genes controlling a specific phenotype are often
coregulated in different experiment models (94). In both
approaches which use different omic approaches, a phenotype
can be defined by a set of values describing the expression of
genes, and concentrations of proteins and metabolites. An
individual can be positioned in a metabolic hyperspace made of
as many dimensions as variables. Subtle metabolic differences
between individuals or for the same individual between
environmental conditions can be identified using the appropriate
multivariate statistical tools (71).
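The idea of positioning individuals in a hyperspace with one dimension per variable can be illustrated with a minimal Python sketch. The profiles below are invented, and the choice of standardising each variable before computing a Euclidean distance is only one of many possible multivariate approaches; it is used here as an assumption to keep the example short.

```python
import math
import statistics

# Hypothetical profiles: one value per variable (e.g. a transcript level,
# a protein concentration and a metabolite concentration).
PROFILES = {
    "animal_1": [10.0, 200.0, 0.5],
    "animal_2": [12.0, 210.0, 0.6],
    "animal_3": [30.0, 400.0, 2.0],
}


def standardise(profiles):
    """Scale each variable to zero mean and unit variance across individuals."""
    names = list(profiles)
    columns = list(zip(*(profiles[name] for name in names)))
    means = [statistics.mean(column) for column in columns]
    sds = [statistics.pstdev(column) for column in columns]
    return {
        name: [(v - m) / s for v, m, s in zip(profiles[name], means, sds)]
        for name in names
    }


def distance(profiles, a, b):
    """Euclidean distance between two individuals in the standardised space."""
    scaled = standardise(profiles)
    return math.dist(scaled[a], scaled[b])


if __name__ == "__main__":
    print(distance(PROFILES, "animal_1", "animal_2"))
    print(distance(PROFILES, "animal_1", "animal_3"))
```

Standardisation prevents variables measured on large numerical scales from dominating the distance. The sketch assumes that every variable varies between individuals, since a constant variable would give a standard deviation of zero.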
In fact, the most direct way to assess phenotypes is to
measure them. One way to achieve this goal is to use the tissue
microarray technique which is a high-throughput technique to
analyse hundreds of clinical tissue samples using the 'array'
approach (27). This approach is based on the idea of translating
the convenience of DNA microarrays to tissues. It provides the
ability to analyse simultaneously histologic sections from
hundreds or thousands of tissue samples (for instance, for cancer
studies). The principle is to harvest small disks of tissues from
individual donor paraffin-embedded tissue blocks and to place
them in a recipient block in a grid-like fashion with defined array
coordinates. Then up to 200 consecutive sections can be cut from
each array block, and sections cut from the array can be used
for simultaneous detection of DNA, RNA or proteins by various
techniques (for instance, immunohistochemistry) (27). One can
imagine other measurements such as cell size or shape for
instance. Despite some limits, tissue array technology has the
potential to accelerate molecular studies at tissue or at cell level.
More generally, comprehensive phenotyping platforms
depend on a number of features. First, they require the
development of approaches to measure almost all physiological
traits for a whole assessment of body systems. Second, they
require standardised methods to ensure that phenotyping will be
comparable both within and between laboratories and over time.
If not, investigators will not be able to compare data properly
and to interpret any similarities or differences among
individuals. Therefore, it will be important for laboratories
undertaking large-scale projects to phenotype living organisms
to adopt standardised methods. It will be equally important that
such procedures are accessible to and operable by smaller
laboratories. The current challenge is to generate a set of
standard operating procedures for all organisms of interest (95).
Manipulation of gene expression in experimental models
The genomic techniques described above allow the
identification of many genes which potentially control
phenotypic traits. A big challenge now facing scientists is to
identify the biological functions of these genes. One way to
succeed is the manipulation of genes in mouse embryonic stem
cells. Indeed, the high degree of homology between the mouse
genome and that of humans also makes it a model of choice (96).
So far, one thousand targeted genes have been knocked out
among the 25,000 genes of the mouse genome. It may be
speculated that the systematic mutagenesis of all protein
encoding genes in the mouse will be achieved in the near future
thanks to a highly accurate mouse genome sequence and to the
sophisticated genetic tools and resources available for the
mouse. Different international consortia are working in this area.
In addition to this and in order to share resources, three major
organisations (the National Institutes of Health in the USA, the
European Union and Genome Canada) have financed the International
Mouse Knockout Consortium (96). By capitalizing on
efficiencies of scale and a centralised production effort, the
project intends to develop a common strategy to make a
catalogue of mutants available in mouse strains.
All these efforts rely not only on high-quality annotation of
the mouse genome but also on high-quality description of the
phenotypes of the different mutated mice (95). The phenotypes
of several hundreds of mutated mouse strains will be described
on a large scale by different institutes in the USA and centralised
in a database at the Jackson Laboratory, Bar Harbor, USA
(http://www.jax.org). This resource will be a material of choice
for large-scale genomic studies. But a world-wide strategy for
high-throughput phenotyping still remains to be developed (96).
Another way to suppress gene expression is to use the
technology of RNA interference. The mechanism of RNA
interference (RNAi) is the following: the appearance of double
stranded (ds) RNA within a cell (e.g. as a consequence of viral
infection) triggers naturally a complex response, which includes
among other phenomena a cascade of molecular events known
as RNAi. During RNAi, the cellular enzyme Dicer binds to the
foreign dsRNA and cleaves it into short pieces of ~20 nucleotide
pairs in length known as small interfering RNA (siRNA). These
RNA pairs bind to the cellular complex called the RNA-induced
silencing complex (RISC) that uses one strand of the siRNA to
bind to single stranded RNA molecules (i.e. mRNA) of
complementary sequence. The nuclease activity of RISC then
degrades the mRNA, thus silencing expression of the viral gene.
Similarly, the genetic machinery of cells is believed to utilize
RNAi to control the expression of endogenous mRNA, thus
adding a new layer of post-transcriptional regulation. RNAi can
be exploited in the experimental setting to knock down target
genes of interest with a highly specific and relatively easy
technology. Some authors argue that, at least in farm animals
(which are bigger than mice and less genetically characterised),
this is a much simpler method for conducting gene knock-out
analyses than by knocking out the gene on the genome. RNAi is
at the forefront of genomics research and is likely to generate
useful data in various fields of life sciences (97).
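The base-pairing principle behind target recognition can be illustrated with a short Python sketch. It assumes perfect complementarity between the guide strand and the mRNA, which is a simplification of the biological process, and the sequences used are arbitrary toy examples much shorter than a real siRNA.

```python
# Watson-Crick pairing rules for RNA.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_target_sites(mrna, guide):
    """Return 0-based start positions in the mRNA that pair perfectly with the guide."""
    site = reverse_complement(guide)  # sequence the guide strand can bind to
    positions = []
    start = mrna.find(site)
    while start != -1:
        positions.append(start)
        start = mrna.find(site, start + 1)
    return positions


if __name__ == "__main__":
    mrna = "AAGCUUAAGCUU"               # toy mRNA, 5'->3'
    guide = reverse_complement("GCUU")  # toy guide strand for the site GCUU
    print(guide, find_target_sites(mrna, guide))
```

In an experimental setting, the same logic is applied in reverse: a stretch of the target mRNA is chosen first, and the guide strand is designed as its reverse complement.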
Modelling and integration
In many cases, genomic experiments were disappointing
since they provided catalogues of genes or proteins regulated by
various biological or external factors, but unfortunately
sometimes with no information about their function. Converting
data into knowledge of benefit to physiology is thus the
challenge. Besides gene manipulation in mice, another way to
better understand biology is to "integrate knowledge". In this
case, the aim is to understand the phenotypic data at a higher
level (tissue, whole organism) by understanding the
contributions made by the different genomic experiments. In
other words, the idea of integrative biology is to link data across
the different scales of biological organisation (from DNA, RNA,
proteins to cells, tissues and organs) to better understand
biology. This approach needs suitable databases and powerful
new statistical approaches. This is called "systems biology". The
outcome of it could be a better prediction of physiological
processes on the basis of genomic data (98, 99).
To achieve this goal, the first requirement is to collect data
from suitable databases as previously discussed (17). In addition
to this, the need for biological ontologies has emerged in large
part due to the rapid development of large biological databases.
Ontology defines a common vocabulary for researchers who
need to share information in a domain. In other words,
ontologies are developed to define a controlled vocabulary for
description of animal traits. This is particularly important for
phenotypes that are highly variable. The ultimate
objective is to annotate biological data in a form that allows
users to exchange and to compare their data. Successful
ontologies in biology have been developed in the past few years,
such as Gene Ontology, Rice Ontology, Plant Phenotype and
Trait Ontology (http://www.gramene.org/plant_ontology/) and
more recently Animal Trait Ontology (http://www.
animalgenome.org/bioinfo/projects/ATO/). Precise definition of
physiological trait terms (phenotypes) will help to capture the
biologically relevant distinctions at the desired level of detail in
unambiguous fashion. The Ontology databases provide a
controlled vocabulary to describe each trait of any individual.
They share information for controlled vocabularies (Ontologies)
and their associations to genomic information such as QTL,
phenotype gene, proteins, etc. In fact, ontologies have now
become a de facto standard in genomics as a controlled
vocabulary for annotating the functions, pertinent processes and
cellular locations of gene products.
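How a controlled vocabulary allows data from different laboratories to be compared can be illustrated with a minimal Python sketch. The term identifiers, names and synonyms below are invented for illustration and are not taken from any of the ontologies cited above.

```python
# Hypothetical controlled vocabulary: identifiers, names and synonyms are
# invented and do not come from any published ontology.
TERMS = {
    "TRAIT:0001": {"name": "body weight", "synonyms": {"live weight", "bodyweight"}},
    "TRAIT:0002": {"name": "muscle fat content", "synonyms": {"intramuscular fat"}},
}


def build_index(terms):
    """Map every accepted label (name or synonym, lower-cased) to its term identifier."""
    index = {}
    for term_id, term in terms.items():
        for label in {term["name"], *term["synonyms"]}:
            index[label.lower()] = term_id
    return index


def to_term_id(label, index):
    """Return the term identifier for a free-text label, or None if it is unknown."""
    return index.get(label.strip().lower())


if __name__ == "__main__":
    index = build_index(TERMS)
    # Two laboratories using different wording are mapped to the same term.
    print(to_term_id("Live weight", index), to_term_id("bodyweight", index))
```

Once free-text trait descriptions are mapped to shared identifiers, records from different sources can be merged or compared on the identifier rather than on the wording.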
To summarise, the greatest challenges in establishing this
modelling approach are not biological but computational and
organisational. The computational issues are centred on the
search and analysis of massive amounts of data, on integration of
heterogeneous databases and on large-scale data-presentation
systems interpretable by biologists. Ultimately, the importance of
this approach will be judged not on its mathematical conception
but by how it can be used to describe biological laws (100).
Biotechnological tools
As described before, one major outcome of genomics is the
development of diagnostic tests based on biotechnological
methods which may be useful, for instance, in humans for
personalised medicine or in livestock for detection of animals
with desirable traits.
Well-known applications of DNA-based tests in our modern
society are the identification of a potential criminal, or checking
paternity in humans. Other commercial applications exist in the
area of food safety to detect the presence of a contaminating
organism within foodstuffs (17). Other applications based on
DNA variability also exist in the food industry for traceability
purposes to check, for instance, the animal origin of a piece of
meat (101). In this paragraph, only a few examples of diagnostic
tests based on genomics will be described with the objective of
illustrating the long journey between research and commercial
applications.
The first example concerns the wide variety of molecular
assays which have been developed and implemented in the
clinical management of viral hepatitis thanks to the considerably
improved understanding of the pathogenesis of hepatitis. This has
caused uncertainties in the selection of the most appropriate assays
for clinical requirements. Consequently, a rational choice and
application of these assays requires adequate knowledge of the
performance of each single test. Moreover, the choice of the most
accurate assay needs to take into account specific contexts, such
as diagnosis, management or treatment depending on patients'
needs and doctors' objectives. A major improvement in addressing
the use of molecular assays for viral hepatitis has arisen from
recent standardisation procedures which nowadays allow a
comparison between different tests provided results are given as
International Units. In addition, before being commercialised,
molecular assays have to be approved by European regulation
authorities and validated using internationally recognized
standards. An additional clinical validation must also address the
diagnostic accuracy of the assay (102).
Another well known example of genomic applications is the
discovery of new biomarkers to identify human subjects at risk
for cancer, to detect cancer disease earlier, to predict the
response to particular agents and to monitor response to tumour
treatments. In this context, a biomarker is by definition "a
characteristic that is objectively measured and evaluated as an
indicator of normal biologic processes, pathogenic processes,
or pharmaceutical responses to therapeutic intervention".
Biomarkers may be simple phenotypic traits or mRNA profiles
and more recently, combinations of proteins. A great number of
scientific articles regularly describe new biomarkers, but only a
few of them go to the market because many markers lack clinical
utility. In fact, a biomarker must provide information that is not
available from a simpler, less expensive method that already
exists. So, before a biomarker brings a benefit over other
criteria, it needs to address four concepts: "easier, better, faster
and cheaper". It must also be validated in a clinically relevant
environment. In addition, the researchers must guarantee
reproducibility of the procedures to assess the marker. In
practice the performances of biomarkers are improved through
combinations of a panel of markers. This is again a strong
advantage of the genomic approaches which provide biomarker
profiling (103).
The recent progress in the area of meat quality described
above should lead to the development of commercial diagnostic
tests based on "genomic markers". Ideally, research on this
topic should lead to the integration of "genomic tracers" into
chips to detect molecular signatures not only predicting the
sensory or nutritional quality of livestock products but also
ensuring traceability of production systems. However,
converting genotyping or gene expression profiling tools into
practical biological assays or diagnostic tests is not easy and
takes a long time. It involves many steps, such as
confirmation of the association in large samples (104), as well as
testing before commercial exploitation. Indeed, it now appears
that commercialised genetic markers previously identified
in specific breeds, production systems or a limited number of
countries may not be relevant for other breeds reared differently
in other parts of the world (105).
Besides scientific and technical issues (the importance of the
studied traits, confirmation of the gene effects on a large
population, successful production of genomic tests, etc.), it is
also crucial to determine the economic value of such
diagnostic tests. For instance, it is important to know whether
improvement of a specific biological trait by genomic tools will
or will not provide an economic return for the company or
improve competitiveness of the product compared with
alternative methods. So far, the costs of DNA tests (to genotype
individuals) have dropped by several orders of magnitude
making various companies receptive to their use in a commercial
context. Unfortunately, this is less true for diagnostic tests based
on gene expression methods. However, we anticipate that costs
will drop for array and proteomic tools as well.
CONCLUSION
With the continuous evolution of high-throughput
experimental techniques at different levels (DNA, RNA,
proteins, etc.), the landscape of biological research is
continuously changing. According to Evelyn Fox Keller (106,
107), the concept of the gene has been over-used, and we have
to look at things differently: there is not just one gene but rather
a combination of individual genes governing physiology and
their regulation. The combination of individual expression
levels, rather than the genes themselves, is responsible for
phenotype variability. The next challenge is to integrate the
knowledge gained from these studies with the ultimate objective
of optimising human health and livestock production systems.
The great number of new and interesting methodologies and
technologies that are emerging may contribute to meeting this
new challenge. Once the whole genome of a species has been sequenced,
many new potential applications arise from this
genome sequence: comparison among species, the discovery of
SNPs, the availability of pan-genomic arrays and the enrichment
of proteomic and metabolomic databases. But progress is still
hampered by high costs (which are decreasing) and
technological hurdles. Techniques for proteomics and especially
metabolomics still require substantial developments and
standardisation (108). It is not possible today to measure the
whole proteome and metabolome of a given sample, and
identification of the markers of interest is still difficult. Big
investments are being made to fully annotate the proteome and
metabolome in comprehensive databases (see for example the
Human Metabolome Project Database, www.hmdb.ca) to
facilitate the identification of these markers and interpretation of
the data. Much progress is expected in the coming few years.
Thanks to the development of omics approaches, data
acquisition has never been so rapid. Biologists are faced with the
difficulty of integrating the data to better interpret them. For
integration, the challenges are now in the area of data
compilation in suitable databases and modelling with the help of
bioinformatic tools. But a major challenge facing biologists is to
ascribe functions to the newly discovered genes, proteins and
metabolites. One powerful approach to help describe functions is
the manipulation of genes in intact animals. This is the reason
why major knockout programmes are underway worldwide in
laboratory models. For both modelling and gene manipulation,
the precise and repeatable description of phenotypes on a large
scale will probably be the biggest challenge to achieve.
Large-scale research programmes and efficient coordination
will be needed to make the most of the expensive resources
associated with the different omics dimensions and also in order
to share data at world level. So, genomics is also changing the
organisation of research and the researchers' ways of working.
Conflict of interest statement: None declared.
REFERENCES
1. Venter JC, Adams MD, Myers EW et al. The sequence of the
human genome. Science 2001; 291:1304-1351.
2. Robert C. Challenges of functional genomics applied to farm
animal gametes and pre-hatching embryos. Theriogenology
2008; 70: 1277-1287.
3. Hutchison CA. DNA sequencing: bench to bedside and
beyond. Nucleic Acids Res 2007; 35: 6227-6237.
4. She XW, Cheng Z, Zollner S, Church DM, Eichler EE.
Mouse segmental duplication and copy number variation.
Nat Genet 2008; 40: 909-914.
5. Callinan PA, Feinberg AP. The emerging science of
epigenomics. Hum Mol Genet 2006; 15: 95-101.
6. Shastry BS. SNPs in disease gene mapping, medicinal
drug development and evolution. J Human Genet 2007;
52: 871-880.
7. McCarthy MI, Abecasis GR, Cardon LR et al. Genome-wide
association studies for complex traits: consensus,
uncertainty and challenges. Nat Rev Genet 2008; 9: 356-369.
8. Melzer D, Perry JRB, Hernandez D et al. A genome-wide
association study identifies protein quantitative trait loci
(pQTLs). PLoS Genet 2008; 4(5): e1000072.
9. Keurentjes JJB, Koornneef M, Vreugdenhil D. Quantitative
genetics in the age of omics. Curr Opin Plant Biol 2008; 11:
123-128.
10. Gao Y, Zhang R, Hu X, Li N. Application of genomic
technologies to the improvement of meat quality of farm
animals. Meat Sci 2007; 77: 36-45.
11. Goddard ME, Hayes BJ. Genomic selection. J Anim Breed
Genet 2007; 124: 323-330.
12. Khatkar MS, Nicholas FW, Collins AR et al. Extent of
genome-wide linkage disequilibrium in Australian
Holstein-Friesian cattle based on a high-density SNP panel. BMC
Genomics 2008; 9: 187.
13. Meuwissen THE, Hayes BJ, Goddard ME. Prediction of
total genetic value using genome-wide dense marker maps.
Genetics 2001; 157: 1819-1829.
14. Rothschild MF, Plastow GS. Impact of genomics on animal
agriculture and opportunities for animal health. Trends
Biotechnol 2008; 26: 21-25.
15. Mullen AM, Stapleton PC, Corcoran D, Hamill RM, White
A. Understanding meat quality through the application of
genomic and proteomic approaches. Meat Sci 2006; 74: 3-16.
16. Kadarmideen HN, von Rohr P, Janss LLG. From genetical
genomics to systems genetics: potential applications in
quantitative genomics and animal breeding. Mamm Genome
2006; 17: 548-564.
17. Hocquette JF. Where are we in genomics? J Physiol
Pharmacol 2005; 56(Suppl. 3): 37-70.
18. Polyak K, Riggins GJ. Gene discovery using the serial
analysis of gene expression technique: implications for
cancer research. J Clin Oncol 2001; 19: 2948-2958.
19. Yamamoto M, Wakatsuki T, Hada A, Ryo A. Use of serial
analysis of gene expression (SAGE) technology. J Immunol
Methods 2001; 250: 45-66.
20. Velculescu VE, Zhang L, Vogelstein B, Kinzler KW. Serial
analysis of gene expression. Science 1995; 270: 484-487.
21. Velculescu VE, Vogelstein B, Kinzler KW. Analysing
uncharted transcriptomes with SAGE. Trends Genet 2000;
16: 423-425.
22. Weeraratna AT. Discovering causes and cures for cancer from
gene expression analysis. Ageing Res Rev 2005; 4: 548-563.
23. Bolduc C, Larose M, Lafond N et al. Adipose tissue
transcriptome by Serial Analysis of Gene Expression. Obes
Res 2004; 12: 750-757.
24. Maillard JC, Berthier D, Thevenon S et al. Use of the Serial
Analysis of Gene Expression (SAGE) Method in veterinary
research: a concrete application in the study of the bovine
trypanotolerance genetic control. Ann NY Acad Sci 2004;
1026: 171-182.
25. Shiu SH, Borevitz JO. The next generation of microarray
research: Applications in evolutionary and ecological
genomics. Heredity 2008; 100: 141-149.
26. Seidel M, Niessner R. Automated analytical microarrays: a
critical review. Anal Bioanal Chem 2008; 391: 1521-1544.
27. Radhakrishnan R, Solomon M, Satyamoorthy K, Martin LE,
Lingen MW. Tissue microarray - a high-throughput
molecular analysis in head and neck cancer. J Oral Pathol
Med 2008; 37: 166-176.
28. Rhodes DR, Yu J, Shanker K, Deshpande N et al. Large-scale
meta-analysis of cancer microarray data identifies common
transcriptional profiles of neoplastic transformation and
progression. Proc Natl Acad Sci USA 2004; 101: 9309-9314.
29. Tabibiazar R, Wagner RA, Ashley EA et al. Signature
patterns of gene expression in mouse atherosclerosis and
their correlation to human coronary disease. Physiol
Genomics 2005; 22: 213-226.
30. Lee CK, Klopp RG, Weindruch R, Prolla TA. Gene
expression profile of aging and its retardation by caloric
restriction. Science 1999; 285: 1390-1393.
31. Tuggle CK, Wang Y, Couture O. Advances in swine
transcriptomics. Int J Biol Sci 2007; 3: 132-152.
32. Hocquette JF, Lehnert S, Barendse W, Cassar-Malek I,
Picard B. Recent advances in cattle functional genomics and
their application to beef quality. Animal 2007; 1: 159-173.
33. Cassar-Malek I, Picard B, Bernard C, Hocquette JF.
Application of gene expression studies in livestock
production systems: a European perspective. Aust J Exp Agr
2008; 48: 701-710.
34. Lehnert SA, Wang YH, Tan SH, Reverter A. Gene
expression-based approaches to beef quality research. Aust J
Exp Agr 2006; 46: 165-172.
35. Sudre K, Leroux C, Pietu G et al. Transcriptome analysis of
two bovine muscles during ontogenesis. J Biochem 2003;
133: 745-756.
36. Lehnert SA, Reverter A, Byrne KA et al. Gene expression
studies of developing bovine longissimus muscle from two
different beef cattle breeds. BMC Dev Biol 2007; 7: 95.
37. Sudre K, Cassar-Malek I, Listrat A et al. Biochemical and
transcriptomic analyses of two bovine skeletal muscles in
Charolais bulls divergently selected for muscle growth. Meat
Sci 2005; 70: 267-277.
38. Cassar-Malek I, Passelaigue F, Bernard C, Leger J,
Hocquette JF. Target genes of myostatin loss-of-function in
muscles of late bovine fetuses. BMC Genomics 2007; 8: 63.
39. Byrne KA, Wang YH, Lehnert SA et al. Gene expression
profiling of muscle tissue in Brahman steers during
nutritional restriction. J Anim Sci 2005; 83: 1-12.
40. Wang YH, Reverter A, Mannen H et al. Transcriptional
profiling of muscle tissue in growing Japanese Black cattle
to identify genes involved with the development of
intramuscular fat. Aust J Exp Agr 2005; 45: 809-820.
41. Lee SH, Park EW, Cho YM et al. Identification of
differentially expressed genes related to intramuscular fat
development in the early and late fattening stages of hanwoo
steers. J Biochem Mol Biol 2007; 40: 757-764.
42. Bernard C, Cassar-Malek I, Le Cunff M et al. New
indicators of beef sensory quality revealed by expression of
specific genes. J Agr Food Chem 2007; 55: 5229-5237.
43. Yin JQ, Zhao RC, Morris KV. Profiling microRNA expression
with microarrays. Trends Biotechnol 2008; 26: 70-76.
44. Liu Z, Sall A, Yang DC. MicroRNA: an emerging therapeutic
target and intervention tool. Int J Mol Sci 2008; 9: 978-999.
45. Clop A, Marcq F, Takeda H et al. A mutation creating a
potential illegitimate microRNA target site in the myostatin
gene affects muscularity in sheep. Nat Genet 2006; 38:
813-818.
46. Lippolis JD, Reinhardt TA. Centennial paper: proteomics in
animal science. J Anim Sci 2008; 86: 2430-2441.
47. Anderson NL, Anderson NG. The human plasma proteome:
history, character, and diagnostic prospects. Mol Cell
Proteomics 2002; 1: 845-867.
48. Meng Z, Veenstra TD. Proteomic analysis of serum, plasma,
and lymph for the identification of biomarkers. Proteom Clin
Appl 2007; 1: 747-757.
49. Bouley J, Meunier B, Chambon C et al. Proteomic analysis
of bovine skeletal muscle hypertrophy. Proteomics 2005; 5:
490-500.
50. Bendixen E. The use of proteomics in meat science. Meat Sci
2005; 71: 138-149.
51. Hollung K, Veiseth E, Jia XH, Faergestad EM, Hildrum KI.
Application of proteomics to understand the molecular
mechanisms behind meat quality. Meat Sci 2007; 77: 97-104.
52. Bauchart C, Remond D, Chambon C et al. Small peptides
(<5 kDa) found in ready-to-eat beef meat. Meat Sci 2006;
74: 658-666.
53. Dhamoon AS, Kohn EC, Azad NS. The ongoing evolution of
proteomics in malignancy. Drug Discov Today 2007; 12:
700-708.
54. Dettmer K, Aronov PA, Hammock BD. Mass spectrometry-based
metabolomics. Mass Spectrom Rev 2007; 26: 51-78.
55. Watkins SM, Reifsnyder PR, Pan HJ, German JB, Leiter EH.
Lipid metabolome-wide effects of the PPAR gamma agonist
rosiglitazone. J Lipid Res 2002; 43: 1809-1817.
56. Baggerman G, Verleyen P, Clynen E et al. Peptidomics. J
Chromatogr B 2004; 803: 3-16.
57. Keun HC, Ebbels TM, Antti H et al. Analytical
reproducibility in 1H NMR-based metabonomic urinalysis.
Chem Res Toxicol 2002; 15: 1380-1386.
58. Griffin JL, Shockcor JP. Metabolic profiles of cancer cells.
Nat Rev Cancer 2004; 4: 551-561.
59. Nordström A, O'Maille G, Qin C, Siuzdak G. Nonlinear data
alignment for UPLC-MS and HPLC-MS based
metabolomics: Quantitative analysis of endogenous and
exogenous metabolites in human serum. Anal Chem 2006;
78: 3289-3295.
60. Soga T, Baran R, Suematsu M et al. Differential
metabolomics reveals ophthalmic acid as an oxidative stress
biomarker indicating hepatic glutathione consumption. J
Biol Chem 2006; 281: 16768-16776.
61. Shaham O, Wei R, Wang TJ et al. Metabolic profiling of the
human response to a glucose challenge reveals distinct axes
of insulin sensitivity. Mol Syst Biol 2008; 4: 214.
62. Wishart DS, Lewis MJ, Morrissey JA et al. The human
cerebrospinal fluid metabolome. J Chromatogr B 2008; 871:
164-173.
63. Ma HW, Goryanin I. Human metabolic network
reconstruction and its impact on drug discovery and
development. Drug Discov Today 2008; 13: 9-10.
64. van Ommen B, Fairweather-Tait S, Freidig A et al. A
network biology model of micronutrient related health. Brit
J Nutr 2008; 99: S72-S80.
65. Yang J, Xu G, Zheng Y et al. Diagnosis of liver cancer using
HPLC-based metabonomics avoiding false-positive result
from hepatitis and hepatocirrhosis diseases. J Chromatogr B
Analyt Technol Biomed Life Sci 2004; 813: 59-65.
66. Jones GLAH, Sang E, Goddard C et al. A functional
analysis of mouse models of cardiac disease through
metabolic profiling. J Biol Chem 2005; 280: 7530-7539.
67. Brindle JT, Antti H, Holmes E et al. Rapid and noninvasive
diagnosis of the presence and severity of coronary heart
disease using 1H-NMR-based metabonomics. Nat Med 2002;
8: 1439-1444.
68. Kirschenlohr HL, Griffin JL, Clarke SC et al. Proton NMR
analysis of plasma is a weak predictor of coronary artery
disease. Nat Med 2006; 12: 705-710.
69. Lamers RJAN, van Nesselrooij JHJ, Kraus VB et al.
Identification of an urinary metabolite profile associated
with osteoarthritis. Osteoarthr Cartilage 2005; 13: 762-768.
70. Chadwick R. Nutrigenomics, individualism and public
health. Proc Nutr Soc 2004; 63: 161-166.
71. Scalbert A, Milenkovic D, Llorach R, Manach C, Leroux C.
Nutrigenomics: techniques and applications. 2nd
International Symposium on Energy and Protein Metabolism
and Nutrition, Vichy, France, 9-13 Sept. 2007, In: Energy
and protein metabolism and nutrition, Ortigues-Marty I.
(ed). EAAP Publication 2007; 124: 259-276.
72. German JB, Roberts MA, Watkins SM. Genomics and
metabolomics as markers for the interaction of diet and
health: Lessons from lipids. J Nutr 2003; 133: 2078S-2083S.
73. Fiehn O. Metabolomics-the link between genotypes and
phenotypes. Plant Mol Biol 2002; 48: 155-171.
74. Müller M, Kersten S. Nutrigenomics: goals and strategies.
Nat Rev Genet 2003; 4: 315-322.
75. Afman L, Muller M. Nutrigenomics: From molecular nutrition
to prevention of disease. J Am Diet 2006; 106: 569-576.
76. Rezzi S, Ramadan Z, Fay LB, Kochhar S. Nutritional
Metabonomics: Applications and Perspectives. J Proteome
Res 2007; 6: 513-525.
77. Lee CK, Klopp RG, Weindruch R, Prolla TA. Gene
expression profile of aging and its retardation by caloric
restriction. Science 1999; 285: 1390-1393.
78. Kreeft AJ, Moen CJ, Porter G et al. Genomic analysis of the
response of mouse models to high-fat feeding shows a
major role of nuclear receptors in the simultaneous
regulation of lipid and inflammatory genes. Atherosclerosis
2005; 182: 249-257.
79. Griffin JL, Muller D, Woograsingh R et al. Vitamin E
deficiency and metabolic deficits in neuronal ceroid
lipofuscinosis described by bioinformatics. Physiol
Genomics 2002; 11: 195-203.
80. Chanson A, Sayd T, Rock E et al. Proteomic analysis reveals
changes in the liver protein pattern of rats exposed to dietary
folate deficiency. J Nutr 2005; 135: 2524-2529.
81. Weissinger EM, Nguyen-Khoa T, Fumeron C et al. Effects
of oral vitamin C supplementation in hemodialysis patients:
A proteomic assessment. Proteomics 2006; 6: 993-1000.
82. Fuchs D, Erhard P, Rimbach G, Daniel H, Wenzel U.
Genistein blocks homocysteine-induced alterations in the
proteome of human endothelial cells. Proteomics 2005; 5:
2808-2818.
83. Kim S, Sohn I, Lee YS. Hepatic gene expression profiles are
altered by genistein supplementation in mice with
diet-induced obesity. J Nutr 2005; 135: 33-41.
84. Fardet A, Llorach R, Martin JF et al. A Liquid
Chromatography-Quadrupole Time-of-Flight (LC-QTOF)-based
metabolomic approach reveals new metabolic effects
of catechin in rats fed high-fat diets. J Proteome Res 2008;
7: 2388-2398.
85. Loor JJ, Dann HM, Janovick Guretzky NA et al. Plane of
nutrition prepartum alters hepatic gene expression and
function in dairy cows as assessed by longitudinal
transcript and metabolic profiling. Physiol Genomics
2006; 27: 29-41.
86. Cheon Y, Nara TY, Band MR et al. Induction of overlapping
genes by fasting and a peroxisome proliferator in pigs:
evidence of functional PPARalpha in non proliferating
species. Am J Physiol 2005; 288: R1525-R1535.
87. Cassar-Malek I, Hocquette JF, Jurie C et al. Muscle-specific
metabolic, histochemical and biochemical responses to a
nutritionally induced discontinuous growth path. Anim Sci
2004; 204: 79-59.
88. Lehnert SA, Byrne KA, Reverter A et al. Gene expression
profiling of bovine skeletal muscle in response to and during
recovery from chronic and severe undernutrition. J Anim Sci
2006; 84: 3239-3250.
89. Cassar-Malek I, Bernard C, Jurie C et al. Pasture-based beef
production systems may influence muscle characteristics
and gene expression. In: Indicators of Milk and Beef
Quality, JF Hocquette, S Gigli (eds). EAAP Publication
2005; 112: 385-390.
90. Bernard C, Degrelle S, Ollier S et al. A cDNA macro-array
resource for gene expression profiling in ruminant tissues
involved in reproduction and production (milk and beef)
traits. J Physiol Pharmacol 2005; 56(Suppl. 3): 215-224.
91. Jeong DW, Kim TS, Chung YW, Lee BJ, Kim IY.
Selenoprotein W is a glutathione-dependent antioxidant in
vivo. FEBS Lett 2002; 517: 225-228.
92. Vendeland SC, Beilstein MA, Yeh JY, Ream W, Whanger
PD. Rat skeletal muscle selenoprotein W: cDNA clone and
mRNA modulation by dietary selenium. Proc Natl Acad Sci
USA 1995; 92: 8749-8753.
93. Yeh JY, Beilstein MA, Andrews JS, Whanger PD. Tissue
distribution and influence of selenium status on levels of
selenoprotein-W. FASEB J 1995; 9: 392-396.
94. Keurentjes JJB, Koornneef M, Vreugdenhil D. Quantitative
genetics in the age of omics. Curr Opin Plant Biol 2008; 11:
123-128.
95. Brown SDM, Chambon P, de Angelis MH. EMPReSS:
standardized phenotype screens for functional annotation of
the mouse genome. Nat Genet 2005; 37: 1155-1155.
96. The International Mouse Knockout Consortium. A Mouse
for All Reasons. Cell 2007; 128: 9-13.
97. Sellner EM, Kim JW, McClure MC et al. Board-invited
review: Applications of genomic information in livestock. J
Anim Sci 2007; 85(12): 3148-3158.
98. Nicholson JK, Holmes E, Lindon JC, Wilson ID. The
challenges of modeling mammalian biocomplexity. Nat
Biotechnol 2004; 22: 1268-1274.
99. Naylor S, Culbertson AW, Valentine SJ. Towards a systems
level analysis of health and nutrition. Curr Opin Biotech
2008; 19: 100-109.
100. Liu ET. Systems biology, integrative biology, predictive
biology. Cell 2005; 121: 505-506.
101. Dalvit C, De Marchi M, Cassandro M. Genetic traceability
of livestock products. Meat Sci 2007; 77: 437-449.
102. Mangia A, Antonucci F, Brunetto M et al. The use of
molecular assays in the management of viral hepatitis.
Digest Liver Dis 2008; 40: 395-404.
103. Bensalah K, Montorsi F, Shariat SF. Challenges of cancer
biomarker profiling. Eur Urol 2007; 52: 1601-1609.
104. Barendse W. The transition from quantitative trait loci to
diagnostic test in cattle and other livestock. Aust J Exp Agr
2005; 45: 831-836.
105. Renand G, Payet N, Levéziel H et al. Markers in DGAT1
and TG genes are not associated with intramuscular lipid
content in the French beef breeds. In: Proceedings of the
53rd International Congress of Meat Science and
Technology, G Zhou, W Zhang (eds), China Agricultural
University Press: Beijing, 2007; pp. 75-76.
106. Fox Keller E. The Century of the Gene. Harvard
University Press 2002.
107. Fox Keller E. The century beyond the gene. J Biosci 2005;
30: 3-10.
108. Fiehn O, Robertson D, Griffin J et al. The metabolomics
standards initiative (MSI). Metabolomics 2007; 3: 175-178.
Received: September 6, 2008
Accepted: November 16, 2008
Author's address: Jean Francois Hocquette, PhD, UR1213,
Recherche sur les Herbivores, INRA, Theix, 63122
Saint-Genes Champanelle, France. Phone: +33-473624253; Fax:
+33-473624639; e-mail: [email protected]