Tarazona, S.; Garcia-Alcalde, F.; Dopazo, J.; Ferrer, A.; Conesa, A.
Differential expression in RNA-seq: A matter of depth Journal Article
In: GENOME RESEARCH, vol. 21, no. 12, pp. 2213-2223, 2011, ISSN: 1088-9051.
@article{ISI:000297918600020,
title = {Differential expression in RNA-seq: A matter of depth},
author = { S. Tarazona and F. Garcia-Alcalde and J. Dopazo and A. Ferrer and A. Conesa},
url = {http://dx.doi.org/10.1101/gr.124321.111},
doi = {10.1101/gr.124321.111},
issn = {1088-9051},
year = {2011},
date = {2011-12-01},
journal = {GENOME RESEARCH},
volume = {21},
number = {12},
pages = {2213-2223},
abstract = {Next-generation sequencing (NGS) technologies are revolutionizing genome
research, and in particular, their application to transcriptomics
(RNA-seq) is increasingly being used for gene expression profiling as a
replacement for microarrays. However, the properties of RNA-seq data
have not yet been fully established, and additional research is needed
for understanding how these data respond to differential expression
analysis. In this work, we set out to gain insights into the
characteristics of RNA-seq data analysis by studying an important
parameter of this technology: the sequencing depth. We have analyzed how
sequencing depth affects the detection of transcripts and their
identification as differentially expressed, looking at aspects such as
transcript biotype, length, expression level, and fold-change. We have
evaluated different algorithms available for the analysis of RNA-seq and
proposed a novel approach, NOISeq, that differs from existing methods in
that it is data-adaptive and nonparametric. Our results reveal that most
existing methodologies suffer from a strong dependency on sequencing
depth for their differential expression calls and that this results in a
considerable number of false positives that increases as the number of
reads grows. In contrast, our proposed method models the noise
distribution from the actual data, can therefore better adapt to the
size of the data set, and is more effective in controlling the rate of
false discoveries. This work discusses the true potential of RNA-seq for
studying regulation at low expression ranges, the noise within RNA-seq
data, and the issue of replication.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Khalaf, A. A.; Gmitter, F. G., Jr.; Conesa, A.; Dopazo, J.; Moore, G. A.
Fortunella margarita Transcriptional Reprogramming Triggered by Xanthomonas citri subsp citri Journal Article
In: BMC PLANT BIOLOGY, vol. 11, 2011, ISSN: 1471-2229.
@article{ISI:000297983800001,
title = {Fortunella margarita Transcriptional Reprogramming Triggered by
Xanthomonas citri subsp citri},
author = { A. A. Khalaf and Gmitter, Jr., F. G. and A. Conesa and J. Dopazo and G. A. Moore},
url = {http://dx.doi.org/10.1186/1471-2229-11-159},
doi = {10.1186/1471-2229-11-159},
issn = {1471-2229},
year = {2011},
date = {2011-11-01},
journal = {BMC PLANT BIOLOGY},
volume = {11},
abstract = {Background: Citrus canker disease caused by the bacterial pathogen
Xanthomonas citri subsp. citri (Xcc) has become endemic in areas where
high temperature, rain, humidity, and windy conditions provide a
favourable environment for the dissemination of the bacterium. Xcc is
pathogenic on many commercial citrus varieties but appears to elicit an
incompatible reaction on the citrus relative Fortunella margarita Swing
(kumquat), in the form of a very distinct delayed necrotic response. We
have developed subtractive libraries enriched in sequences expressed in
kumquat leaves during both early and late stages of the disease. The
isolated differentially expressed transcripts were subsequently
sequenced. Our results demonstrate how the use of microarray expression
profiling can help assign roles to previously uncharacterized genes and
elucidate plant pathogenesis-response related mechanisms. This can be
considered to be a case study in a citrus relative where high throughput
technologies were utilized to understand defence mechanisms in
Fortunella and citrus at the molecular level.
Results: cDNAs from sequenced kumquat libraries (ESTs) made from
subtracted RNA populations, healthy vs. infected, were used to make this
microarray. Of 2054 selected genes on a customized array, 317 were
differentially expressed (P < 0.05) in Xcc challenged kumquat plants
compared to mock-inoculated ones. This study identified components of
the incompatible interaction such as reactive oxygen species (ROS) and
programmed cell death (PCD). Common defence mechanisms and a number of
resistance genes were also identified. In addition, there were a
considerable number of differentially regulated genes that had no
homologues in the databases. This could be an indication of either a
specialized set of genes employed by kumquat in response to canker
disease or new defence mechanisms in citrus.
Conclusion: Functional categorization of kumquat Xcc-responsive genes
revealed an enhanced defence-related metabolism as well as a number of
resistant response-specific genes in the kumquat transcriptome in
response to Xcc inoculation. Gene expression profile(s) were analyzed to
assemble a comprehensive and inclusive image of the molecular
interaction in the kumquat/Xcc system. This was done in order to
elucidate molecular mechanisms associated with the development of the
hypersensitive response phenotype in kumquat leaves. These data will be
used to perform comparisons among citrus species to evaluate means to
enhance the host immune responses against bacterial diseases.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Durban, J.; Juarez, P.; Angulo, Y.; Lomonte, B.; Flores-Diaz, M.; Alape-Giron, A.; Sasa, M.; Sanz, L.; Gutierrez, J. M.; Dopazo, J.; Conesa, A.; Calvete, J. J.
Profiling the venom gland transcriptomes of Costa Rican snakes by 454 pyrosequencing Journal Article
In: BMC GENOMICS, vol. 12, 2011, ISSN: 1471-2164.
@article{ISI:000292249800002,
title = {Profiling the venom gland transcriptomes of Costa Rican snakes by 454
pyrosequencing},
author = { J. Durban and P. Juarez and Y. Angulo and B. Lomonte and M. Flores-Diaz and A. Alape-Giron and M. Sasa and L. Sanz and J. M. Gutierrez and J. Dopazo and A. Conesa and J. J. Calvete},
url = {http://dx.doi.org/10.1186/1471-2164-12-259},
doi = {10.1186/1471-2164-12-259},
issn = {1471-2164},
year = {2011},
date = {2011-05-01},
journal = {BMC GENOMICS},
volume = {12},
abstract = {Background: A long term research goal of venomics, of applied importance
for improving current antivenom therapy, but also for drug discovery, is
to understand the pharmacological potential of venoms. Individually or
combined, proteomic and transcriptomic studies have demonstrated their
feasibility to explore in depth the molecular diversity of venoms. In
the absence of genome sequence, transcriptomes also represent valuable
searchable databases for proteomic projects.
Results: The venom gland transcriptomes of 8 Costa Rican taxa from 5
genera (Crotalus, Bothrops, Atropoides, Cerrophidion, and Bothriechis)
of pitvipers were investigated using high-throughput 454 pyrosequencing.
100,394 out of 330,010 masked reads produced significant hits in the
available databases. 5,165,220 nucleotides (8.27%) were masked by
RepeatMasker, the vast majority of which correspond to class I
(retroelements) and class II (DNA transposons) mobile elements. BLAST
hits included 79,991 matches to entries of the taxonomic suborder
Serpentes, of which 62,433 displayed similarity to documented venom
proteins. Strong discrepancies between the transcriptome-computed and
the proteome-gathered toxin compositions were obvious at first sight.
Although the reasons underlying this discrepancy are elusive, since no
clear trend within or between species is apparent, the data indicate
that individual mRNA species may be translationally controlled in a
species-dependent manner. The minimum number of genes from each toxin
family transcribed into the venom gland transcriptome of each species
was calculated from multiple alignments of reads matched to a
full-length reference sequence of each toxin family. Reads encoding ORF
regions of Kazal-type inhibitor-like proteins were uniquely found in
Bothriechis schlegelii and B. lateralis transcriptomes, suggesting a
genus-specific recruitment event during the early-Middle Miocene. A
transcriptome-based cladogram supports the large divergence between A.
mexicanus and A. picadoi, and a closer kinship between A. mexicanus and
C. godmani.
Conclusions: Our comparative next-generation sequencing (NGS) analysis
reveals taxon-specific trends governing the formulation of the venom
arsenal. Knowledge of the venom proteome provides hints on the
translation efficiency of toxin-coding transcripts, contributing thereby
to a more accurate interpretation of the transcriptome. The application
of NGS to the analysis of snake venom transcriptomes may represent the
tool for opening the door to systems venomics.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Goetz, S.; Arnold, R.; Sebastian-Leon, P.; Martin-Rodriguez, S.; Tischler, P.; Jehl, M.; Dopazo, J.; Rattei, T.; Conesa, A.
B2G-FAR, a species-centered GO annotation repository Journal Article
In: BIOINFORMATICS, vol. 27, no. 7, pp. 919-924, 2011, ISSN: 1367-4803.
@article{ISI:000289162000005,
title = {B2G-FAR, a species-centered GO annotation repository},
author = { S. Goetz and R. Arnold and P. Sebastian-Leon and S. Martin-Rodriguez and P. Tischler and M. Jehl and J. Dopazo and T. Rattei and A. Conesa},
url = {http://dx.doi.org/10.1093/bioinformatics/btr059},
doi = {10.1093/bioinformatics/btr059},
issn = {1367-4803},
year = {2011},
date = {2011-04-01},
journal = {BIOINFORMATICS},
volume = {27},
number = {7},
pages = {919-924},
abstract = {Motivation: Functional genomics research has expanded enormously in the
last decade thanks to the cost reduction in high-throughput technologies
and the development of computational tools that generate, standardize
and share information on gene and protein function such as the Gene
Ontology (GO). Nevertheless, many biologists, especially those working with
non-model organisms, still suffer from non-existing or low-coverage
functional annotation, or simply struggle retrieving, summarizing and
querying these data.
Results: The Blast2GO Functional Annotation Repository (B2G-FAR) is a
bioinformatics resource envisaged to provide functional information for
otherwise uncharacterized sequence data and offers data mining tools to
analyze a larger repertoire of species than currently available. This
new annotation resource has been created by applying the Blast2GO
functional annotation engine in a strongly high-throughput manner to the
entire space of publicly available sequences. The resulting repository
contains GO term predictions for over 13.2 million non-redundant protein
sequences based on BLAST search alignments from the SIMAP database. We
generated GO annotation for approximately 150 000 different taxa making
available 2000 species with the highest coverage through B2G-FAR. A
second section within B2G-FAR holds functional annotations for 17
non-model organism Affymetrix GeneChips.
Conclusions: B2G-FAR provides easy access to exhaustive functional
annotation for 2000 species offering a good balance between quality and
quantity, thereby supporting functional genomics research especially in
the case of non-model organisms.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Garrido-Gomez, T.; Dominguez, F.; Lopez, J. Antonio; Camafeita, E.; Quinonero, A.; Martinez-Conejero, J. Antonio; Pellicer, A.; Conesa, A.; Simon, C.
Modeling Human Endometrial Decidualization from the Interaction between Proteome and Secretome Journal Article
In: JOURNAL OF CLINICAL ENDOCRINOLOGY & METABOLISM, vol. 96, no. 3, pp. 706-716, 2011, ISSN: 0021-972X.
@article{ISI:000288020600040,
title = {Modeling Human Endometrial Decidualization from the Interaction between
Proteome and Secretome},
author = { T. Garrido-Gomez and F. Dominguez and J. Antonio Lopez and E. Camafeita and A. Quinonero and J. Antonio Martinez-Conejero and A. Pellicer and A. Conesa and C. Simon},
url = {http://dx.doi.org/10.1210/jc.2010-1825},
doi = {10.1210/jc.2010-1825},
issn = {0021-972X},
year = {2011},
date = {2011-03-01},
journal = {JOURNAL OF CLINICAL ENDOCRINOLOGY & METABOLISM},
volume = {96},
number = {3},
pages = {706-716},
abstract = {Context: Decidualization of the human endometrium, which involves
morphological and biochemical modifications of the endometrial stromal
cells (ESCs), is a prerequisite for adequate trophoblast invasion and
placenta formation.
Objective: This study aims to investigate the proteome and secretome of
in vitro decidualized ESCs. These data were combined with published
genomic information and integrated to model the human decidualization
interactome.
Design: Prospective experimental case-control study.
Setting: A private research foundation.
Patients: Sixteen healthy volunteer ovum donors.
Intervention: Endometrial samples were obtained, and ESCs were isolated
and decidualized in vitro.
Main Outcome Measures: Two-dimensional difference in-gel
electrophoresis, matrix-assisted laser
desorption/ionization-time-of-flight mass spectrometry, Western blot, human protein cytokine array, ELISA, and bioinformatics analysis were
performed.
Results: The proteomic analysis revealed 60 differentially expressed
proteins (36 over- and 24 underexpressed) in decidualized versus control
ESCs, including known decidualization markers (cathepsin B) and new
biomarkers (transglutaminase 2, peroxiredoxin 4, and the ACTB protein).
In the secretomic analysis, a total of 13 secreted proteins (11 up- and 2
down-regulated) were identified, including well-recognized markers (IGF
binding protein-1 and prolactin) and novel ones (myeloid progenitor
inhibitory factor-1 and platelet endothelial cell adhesion molecule-1).
These proteome/secretome profiles have been integrated into a
decidualization interactome model.
Conclusions: Proteomics and secretomics have been used as hypothesis-free
approaches together with complex bioinformatics to model the human
decidual interactome for the first time. We confirm previous knowledge, describe new molecules, and have built a model for human in vitro
decidualization as an invaluable tool for the diagnosis, therapy, and
interpretation of biological phenomena. (J Clin Endocrinol Metab 96:
706-716, 2011)},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Garcia-Alcalde, F.; Garcia-Lopez, F.; Dopazo, J.; Conesa, A.
Paintomics: a web based tool for the joint visualization of transcriptomics and metabolomics data Journal Article
In: BIOINFORMATICS, vol. 27, no. 1, pp. 137-139, 2011, ISSN: 1367-4803.
@article{ISI:000285626300021,
title = {Paintomics: a web based tool for the joint visualization of
transcriptomics and metabolomics data},
author = { F. Garcia-Alcalde and F. Garcia-Lopez and J. Dopazo and A. Conesa},
url = {http://dx.doi.org/10.1093/bioinformatics/btq594},
doi = {10.1093/bioinformatics/btq594},
issn = {1367-4803},
year = {2011},
date = {2011-01-01},
journal = {BIOINFORMATICS},
volume = {27},
number = {1},
pages = {137-139},
abstract = {Motivation: The development of the omics technologies such as
transcriptomics, proteomics and metabolomics has made possible the
realization of systems biology studies where biological systems are
interrogated at different levels of biochemical activity (gene
expression, protein activity and/or metabolite concentration). An
effective approach to the analysis of these complex datasets is the
joint visualization of the disparate biomolecular data on the framework
of known biological pathways.
Results: We have developed the Paintomics web server as an easy-to-use
bioinformatics resource that facilitates the integrated visual analysis
of experiments where transcriptomics and metabolomics data have been
measured on different conditions for the same samples. Basically, Paintomics takes complete transcriptomics and metabolomics datasets, together with lists of significant gene or metabolite changes, and
paints this information on KEGG pathway maps.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Conesa, A.; Prats-Montalban, J. M.; Tarazona, S.; Nueda, M. Jose; Ferrer, A.
A multiway approach to data integration in systems biology based on Tucker3 and N-PLS Journal Article
In: CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, vol. 104, no. 1, SI, pp. 101-111, 2010, ISSN: 0169-7439.
@article{ISI:000284658300011,
title = {A multiway approach to data integration in systems biology based on
Tucker3 and N-PLS},
author = { A. Conesa and J. M. Prats-Montalban and S. Tarazona and M. Jose Nueda and A. Ferrer},
url = {http://dx.doi.org/10.1016/j.chemolab.2010.06.004},
doi = {10.1016/j.chemolab.2010.06.004},
issn = {0169-7439},
year = {2010},
date = {2010-11-01},
journal = {CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS},
volume = {104},
number = {1, SI},
pages = {101-111},
abstract = {This paper discusses the potential of multi-way projection methods for
analysing multifactorial data structures to identify underlying
components of variability that interconnect different blocks of omics
variables. We explore their suitability for explorative and variable
selection analysis of systems biology data where different types of
biological parameters are studied together. These methodologies were
applied to the integrative analysis of a functional genomics dataset
where transcriptomics, metabolomics and physiological data are
available. Our results show that multiway methods are suited to
accommodate multifactorial omics experiments and to analyse
relationships between different biochemical layers. Additionally, strategies are presented for variable selection in the context of omics
data and for interpreting results at the level of cellular pathways. (C)
2010 Elsevier B.V. All rights reserved.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Medina, I.; Carbonell, J.; Pulido, L.; Madeira, S. C.; Goetz, S.; Conesa, A.; Tarraga, J.; Pascual-Montano, A.; Nogales-Cadenas, R.; Santoyo, J.; Garcia, F.; Marba, M.; Montaner, D.; Dopazo, J.
Babelomics: an integrative platform for the analysis of transcriptomics, proteomics and genomic data with advanced functional profiling Journal Article
In: NUCLEIC ACIDS RESEARCH, vol. 38, no. 2, pp. W210-W213, 2010, ISSN: 0305-1048.
@article{ISI:000284148900034,
title = {Babelomics: an integrative platform for the analysis of transcriptomics, proteomics and genomic data with advanced functional profiling},
author = { I. Medina and J. Carbonell and L. Pulido and S. C. Madeira and S. Goetz and A. Conesa and J. Tarraga and A. Pascual-Montano and R. Nogales-Cadenas and J. Santoyo and F. Garcia and M. Marba and D. Montaner and J. Dopazo},
url = {http://dx.doi.org/10.1093/nar/gkq388},
doi = {10.1093/nar/gkq388},
issn = {0305-1048},
year = {2010},
date = {2010-07-01},
journal = {NUCLEIC ACIDS RESEARCH},
volume = {38},
number = {2},
pages = {W210-W213},
abstract = {Babelomics is a response to the growing necessity of integrating and
analyzing different types of genomic data in an environment that allows
an easy functional interpretation of the results. Babelomics includes a
complete suite of methods for the analysis of gene expression data that
include normalization (covering most commercial platforms), pre-processing, differential gene expression (case-controls, multiclass, survival or continuous values), predictors, clustering; large-scale
genotyping assays (case controls and TDTs, and allows population
stratification analysis and correction). All these genomic data analysis
facilities are integrated and connected to multiple options for the
functional interpretation of the experiments. Different methods of
functional enrichment or gene set enrichment can be used to understand
the functional basis of the experiment analyzed. Many sources of
biological information, which include functional (GO, KEGG, Biocarta, Reactome, etc.), regulatory (Transfac, Jaspar, ORegAnno, miRNAs, etc.), text-mining or protein-protein interaction modules can be used for this
purpose. Finally a tool for the de novo functional annotation of
sequences has been included in the system. This provides support for the
functional analysis of non-model species. Mirrors of Babelomics or
command line execution of their individual components are now possible.
Babelomics is available at http://www.babelomics.org.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Nueda, M. Jose; Carbonell, J.; Medina, I.; Dopazo, J.; Conesa, A.
Serial Expression Analysis: a web tool for the analysis of serial gene expression data Journal Article
In: NUCLEIC ACIDS RESEARCH, vol. 38, no. 2, pp. W239-W245, 2010, ISSN: 0305-1048.
@article{ISI:000284148900039,
title = {Serial Expression Analysis: a web tool for the analysis of serial gene
expression data},
author = { M. Jose Nueda and J. Carbonell and I. Medina and J. Dopazo and A. Conesa},
url = {http://dx.doi.org/10.1093/nar/gkq488},
doi = {10.1093/nar/gkq488},
issn = {0305-1048},
year = {2010},
date = {2010-07-01},
journal = {NUCLEIC ACIDS RESEARCH},
volume = {38},
number = {2},
pages = {W239-W245},
abstract = {Serial transcriptomics experiments investigate the dynamics of gene
expression changes associated with a quantitative variable such as time
or dosage. The statistical analysis of these data implies the study of
global and gene-specific expression trends, the identification of
significant serial changes, the comparison of expression profiles and
the assessment of transcriptional changes in terms of cellular
processes. We have created the SEA (Serial Expression Analysis) suite to
provide a complete web-based resource for the analysis of serial
transcriptomics data. SEA offers five different algorithms based on
univariate, multivariate and functional profiling strategies framed
within a user-friendly interface and a project-oriented architecture to
facilitate the analysis of serial gene expression data sets from
different perspectives. SEA is available at sea.bioinfo.cipf.es.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Nemeth, A.; Conesa, A.; Santoyo-Lopez, J.; Medina, I.; Montaner, D.; Peterfia, B.; Solovei, I.; Cremer, T.; Dopazo, J.; Laengst, G.
Initial Genomics of the Human Nucleolus Journal Article
In: PLOS GENETICS, vol. 6, no. 3, 2010, ISSN: 1553-7390.
@article{ISI:000276311400004,
title = {Initial Genomics of the Human Nucleolus},
author = { A. Nemeth and A. Conesa and J. Santoyo-Lopez and I. Medina and D. Montaner and B. Peterfia and I. Solovei and T. Cremer and J. Dopazo and G. Laengst},
url = {http://dx.doi.org/10.1371/journal.pgen.1000889},
doi = {10.1371/journal.pgen.1000889},
issn = {1553-7390},
year = {2010},
date = {2010-03-01},
journal = {PLOS GENETICS},
volume = {6},
number = {3},
abstract = {We report for the first time the genomics of a nuclear compartment of
the eukaryotic cell. 454 sequencing and microarray analysis revealed the
pattern of nucleolus-associated chromatin domains (NADs) in the linear
human genome and identified different gene families and certain
satellite repeats as the major building blocks of NADs, which constitute
about 4% of the genome. Bioinformatic evaluation showed that
NAD-localized genes take part in specific biological processes, like the
response to other organisms, odor perception, and tissue development. 3D
FISH and immunofluorescence experiments illustrated the spatial
distribution of NAD-specific chromatin within interphase nuclei and its
alteration upon transcriptional changes. Altogether, our findings
describe the nature of DNA sequences associated with the human nucleolus
and provide insights into the function of the nucleolus in genome
organization and establishment of nuclear architecture.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Prado-Lopez, S.; Conesa, A.; Arminan, A.; Martinez-Losa, M.; Escobedo-Lucea, C.; Gandia, C.; Tarazona, S.; Melguizo, D.; Blesa, D.; Montaner, D.; Sanz-Gonzalez, S.; Sepulveda, P.; Goetz, S.; O'Connor, J. Enrique; Moreno, R.; Dopazo, J.; Burks, D. J.; Stojkovic, M.
Hypoxia Promotes Efficient Differentiation of Human Embryonic Stem Cells to Functional Endothelium Journal Article
In: STEM CELLS, vol. 28, no. 3, pp. 407-418, 2010, ISSN: 1066-5099.
@article{ISI:000277093700004,
title = {Hypoxia Promotes Efficient Differentiation of Human Embryonic Stem Cells
to Functional Endothelium},
author = { S. Prado-Lopez and A. Conesa and A. Arminan and M. Martinez-Losa and C. Escobedo-Lucea and C. Gandia and S. Tarazona and D. Melguizo and D. Blesa and D. Montaner and S. Sanz-Gonzalez and P. Sepulveda and S. Goetz and J. Enrique O'Connor and R. Moreno and J. Dopazo and D. J. Burks and M. Stojkovic},
url = {http://dx.doi.org/10.1002/stem.295},
doi = {10.1002/stem.295},
issn = {1066-5099},
year = {2010},
date = {2010-03-01},
journal = {STEM CELLS},
volume = {28},
number = {3},
pages = {407-418},
abstract = {Early development of mammalian embryos occurs in an environment of
relative hypoxia. Nevertheless, human embryonic stem cells (hESC), which
are derived from the inner cell mass of blastocyst, are routinely
cultured under the same atmospheric conditions (21% O(2)) as somatic
cells. We hypothesized that O(2) levels modulate gene expression and
differentiation potential of hESC, and thus, we performed gene profiling
of hESC maintained under normoxic or hypoxic (1% or 5% O(2))
conditions. Our analysis revealed that hypoxia downregulates expression
of pluripotency markers in hESC but increases significantly the
expression of genes associated with angio- and vasculogenesis including
vascular endothelial growth factor and angiopoietin-like proteins.
Consequently, we were able to efficiently differentiate hESC to
functional endothelial cells (EC) by varying O(2) levels; after 24 hours
at 5% O(2), more than 50% of cells were CD34+. Transplantation of
resulting endothelial-like cells improved both systolic function and
fractional shortening in a rodent model of myocardial infarction.
Moreover, analysis of the infarcted zone revealed that transplanted EC
reduced the area of fibrous scar tissue by 50%. Thus, use of hypoxic
conditions to specify the endothelial lineage suggests a novel strategy
for cellular therapies aimed at repair of damaged vasculature in
pathologies such as cerebral ischemia and myocardial infarction. STEM
CELLS 2010; 28: 407-418},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Rattei, T.; Tischler, P.; Goetz, S.; Jehl, M.; Hoser, J.; Arnold, R.; Conesa, A.; Mewes, H.
SIMAP-a comprehensive database of pre-calculated protein sequence similarities, domains, annotations and clusters Journal Article
In: NUCLEIC ACIDS RESEARCH, vol. 38, no. 1, pp. D223-D226, 2010, ISSN: 0305-1048.
@article{ISI:000276399100033,
title = {SIMAP-a comprehensive database of pre-calculated protein sequence
similarities, domains, annotations and clusters},
author = { T. Rattei and P. Tischler and S. Goetz and M. Jehl and J. Hoser and R. Arnold and A. Conesa and H. Mewes},
url = {http://dx.doi.org/10.1093/nar/gkp949},
doi = {10.1093/nar/gkp949},
issn = {0305-1048},
year = {2010},
date = {2010-01-01},
journal = {NUCLEIC ACIDS RESEARCH},
volume = {38},
number = {1},
pages = {D223-D226},
abstract = {The prediction of protein function as well as the reconstruction of
evolutionary genesis employing sequence comparison at large is still the
most powerful tool in sequence analysis. Due to the exponential growth
of the number of known protein sequences and the subsequent quadratic
growth of the similarity matrix, the computation of the Similarity
Matrix of Proteins (SIMAP) becomes a computationally intensive task. The
SIMAP database provides a comprehensive and up-to-date precalculation of
the protein sequence similarity matrix, sequence-based features and
sequence clusters. As of September 2009, SIMAP covers 48 million
proteins and more than 23 million non-redundant sequences. Novel
features of SIMAP include the expansion of the sequence space by
including databases such as ENSEMBL as well as the integration of
metagenomes based on their consistent processing and annotation.
Furthermore, protein function predictions by Blast2GO are pre-calculated
for all sequences in SIMAP and the data access and query functions have
been improved. SIMAP assists biologists to query the up-to-date sequence
space systematically and facilitates large-scale downstream projects in
computational biology. Access to SIMAP is freely provided through the
web portal for individuals (http://mips.gsf.de/simap/) and for
programmatic access through DAS (http://webclu.bio.wzw.tum.de/das/) and
Web-Service
(http://mips.gsf.de/webservices/services/SimapService2.0?wsdl).},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Brumos, J.; Colmenero-Flores, J. M.; Conesa, A.; Izquierdo, P.; Sanchez, G.; Iglesias, D. J.; Lopez-Climent, M. F.; Gomez-Cadenas, A.; Talon, M.
Membrane transporters and carbon metabolism implicated in chloride homeostasis differentiate salt stress responses in tolerant and sensitive Citrus rootstocks Journal Article
In: FUNCTIONAL & INTEGRATIVE GENOMICS, vol. 9, no. 3, pp. 293-309, 2009, ISSN: 1438-793X.
@article{ISI:000267339600003,
title = {Membrane transporters and carbon metabolism implicated in chloride
homeostasis differentiate salt stress responses in tolerant and
sensitive Citrus rootstocks},
author = { J. Brumos and J. M. Colmenero-Flores and A. Conesa and P. Izquierdo and G. Sanchez and D. J. Iglesias and M. F. Lopez-Climent and A. Gomez-Cadenas and M. Talon},
url = {http://dx.doi.org/10.1007/s10142-008-0107-6},
doi = {10.1007/s10142-008-0107-6},
issn = {1438-793X},
year = {2009},
date = {2009-08-01},
journal = {FUNCTIONAL & INTEGRATIVE GENOMICS},
volume = {9},
number = {3},
pages = {293-309},
abstract = {Salinity tolerance in Citrus is strongly related to leaf chloride
accumulation. Both chloride homeostasis and specific genetic responses
to Cl(-) toxicity are issues scarcely investigated in plants. To
discriminate the transcriptomic network related to Cl(-) toxicity and
salinity tolerance, we have used two Cl(-) salt treatments (NaCl and
KCl) to perform a comparative microarray approach on two Citrus
genotypes, the salt-sensitive Carrizo citrange, a poor Cl(-) excluder, and the tolerant Cleopatra mandarin, an efficient Cl(-) excluder. The
data indicated that Cl(-) toxicity, rather than Na(+) toxicity and/or
the concomitant osmotic perturbation, is the primary factor involved in
the molecular responses of citrus plant leaves to salinity. A number of
uncharacterized membrane transporter genes, like NRT1-2, were
differentially regulated in the tolerant and the sensitive genotypes, suggesting its potential implication in Cl(-) homeostasis. Analyses of
enriched functional categories showed that the tolerant rootstock
induced wider stress responses in gene expression while repressing
central metabolic processes such as photosynthesis and carbon
utilization. These features were in agreement with phenotypic changes in
the patterns of photosynthesis, transpiration, and stomatal conductance
and support the concept that regulation of transpiration and its
associated metabolic adjustments configure an adaptive response to
salinity that reduces Cl(-) accumulation in the tolerant genotype.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Nueda, M. Jose; Sebastian, P.; Tarazona, S.; Garcia-Garcia, F.; Dopazo, J.; Ferrer, A.; Conesa, A.
Functional assessment of time course microarray data Journal Article
In: BMC BIOINFORMATICS, vol. 10, 2009, ISSN: 1471-2105, (European Molecular Biology Network Conference, Martina, ITALY, SEP 18-20, 2008).
@article{ISI:000267522200009,
title = {Functional assessment of time course microarray data},
author = { M. Jose Nueda and P. Sebastian and S. Tarazona and F. Garcia-Garcia and J. Dopazo and A. Ferrer and A. Conesa},
url = {http://dx.doi.org/10.1186/1471-2105-10-S6-S9},
doi = {10.1186/1471-2105-10-S6-S9},
issn = {1471-2105},
year = {2009},
date = {2009-01-01},
journal = {BMC BIOINFORMATICS},
volume = {10},
abstract = {Motivation: Time-course microarray experiments study the progress of
gene expression along time across one or several experimental
conditions. Most developed analysis methods focus on the clustering or
the differential expression analysis of genes and do not integrate
functional information. The assessment of the functional aspects of
time-course transcriptomics data requires the use of approaches that
exploit the activation dynamics of the functional categories to where
genes are annotated.
Methods: We present three novel methodologies for the functional
assessment of time-course microarray data. i) maSigFun derives from the
maSigPro method, a regression-based strategy to model time-dependent
expression patterns and identify genes with differences across series.
maSigFun fits a regression model for groups of genes labeled by a
functional class and selects those categories which have a significant
model. ii) PCA-maSigFun fits a PCA model of each functional
class-defined expression matrix to extract orthogonal patterns of
expression change, which are then assessed for their fit to a
time-dependent regression model. iii) ASCA-functional uses the ASCA
model to rank genes according to their correlation to principal time
expression patterns and assess functional enrichment on a GSA fashion.
We used simulated and experimental datasets to study these novel
approaches. Results were compared to alternative methodologies.
Results: Synthetic and experimental data showed that the different
methods are able to capture different aspects of the relationship
between genes, functions and co-expression that are biologically
meaningful. The methods should not be considered as competitive but they
provide different insights into the molecular and functional dynamic
events taking place within the biological system under study.},
note = {European Molecular Biology Network Conference, Martina, ITALY, SEP
18-20, 2008},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Conesa, A.; Bro, R.; Garcia-Garcia, F.; Prats, J. M.; Gotz, S.; Kjeldahl, K.; Montaner, D.; Dopazo, J.
Direct functional assessment of the composite phenotype through multivariate projection strategies Journal Article
In: GENOMICS, vol. 92, no. 6, pp. 373-383, 2008, ISSN: 0888-7543.
@article{ISI:000261460100001,
title = {Direct functional assessment of the composite phenotype through
multivariate projection strategies},
author = { A. Conesa and R. Bro and F. Garcia-Garcia and J. M. Prats and S. Gotz and K. Kjeldahl and D. Montaner and J. Dopazo},
url = {http://dx.doi.org/10.1016/j.ygeno.2008.05.015},
doi = {10.1016/j.ygeno.2008.05.015},
issn = {0888-7543},
year = {2008},
date = {2008-12-01},
journal = {GENOMICS},
volume = {92},
number = {6},
pages = {373-383},
abstract = {We present a novel approach for the analysis of transcriptomics data
that integrates functional annotation of gene sets with expression
values in a multivariate fashion, and directly assesses the relation of
functional features to a multivariate space of response phenotypical
variables. Multivariate projection methods are used to obtain new
correlated variables for a set of genes that share a given function.
These new functional variables are then related to the response
variables of interest. The analysis of the principal directions of the
multivariate regression allows for the identification of gene function
features correlated with the phenotype. Two different transcriptomics
studies are used to illustrate the statistical and interpretative
aspects of the methodology. We demonstrate the superiority of the
proposed method over equivalent approaches. (C) 2008 Elsevier Inc. All
rights reserved.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Hoogerwerf, W. A.; Sinha, M.; Conesa, A.; Luxon, B. A.; Shahinian, V. B.; Cornelissen, G.; Halberg, F.; Bostwick, J.; Timm, J.; Cassone, V. M.
Transcriptional Profiling of mRNA Expression in the Mouse Distal Colon Journal Article
In: GASTROENTEROLOGY, vol. 135, no. 6, pp. 2019-2029, 2008, ISSN: 0016-5085.
@article{ISI:000261762200064,
title = {Transcriptional Profiling of mRNA Expression in the Mouse Distal Colon},
author = { W. A. Hoogerwerf and M. Sinha and A. Conesa and B. A. Luxon and V. B. Shahinian and G. Cornelissen and F. Halberg and J. Bostwick and J. Timm and V. M. Cassone},
url = {http://dx.doi.org/10.1053/j.gastro.2008.08.048},
doi = {10.1053/j.gastro.2008.08.048},
issn = {0016-5085},
year = {2008},
date = {2008-12-01},
journal = {GASTROENTEROLOGY},
volume = {135},
number = {6},
pages = {2019-2029},
abstract = {Background & Aims: Intestinal epithelial cells and the myenteric plexus
of the mouse gastrointestinal tract contain a circadian clock-based
intrinsic timekeeping system. Because disruption of the biological clock
has been associated with increased susceptibility to colon cancer and
gastrointestinal symptoms, we aimed to identify rhythmically expressed
genes in the mouse distal colon. Methods: Microarray analysis was used
to identify genes that were rhythmically expressed over a 24-hour
light/dark cycle. The transcripts were then classified according to
expression pattern, function, and association with physiologic and
pathophysiologic processes of the colon. Results: A circadian gene
expression pattern was detected in approximately 3.7% of distal
colonic genes. A large percentage of these genes were involved in cell
signaling, differentiation, and proliferation and cell death. Of all the
rhythmically expressed genes in the mouse colon, approximately 7%
(64/906) have been associated with colorectal cancer formation (eg, B-cell leukemia/lymphoma-2 [Bcl2]) and 1.8% (18/906) with various
colonic functions such as motility and secretion (eg, vasoactive
intestinal polypeptide, cystic fibrosis transmembrane conductance
regulator). Conclusions: A subset of genes in the murine colon follows a
rhythmic expression pattern. These findings may have significant
implications for colonic physiology and pathophysiology.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Stierum, R.; Conesa, A.; Heijne, W.; van Ommen, B.; Junker, K.; Scott, M. P.; Price, R. J.; Meredith, C.; Lake, B. G.; Groten, J.
Transcriptome analysis provides new insights into liver changes induced in the rat upon dietary administration of the food additives butylated hydroxytoluene, curcumin, propyl gallate and thiabendazole Journal Article
In: FOOD AND CHEMICAL TOXICOLOGY, vol. 46, no. 8, pp. 2616-2628, 2008, ISSN: 0278-6915.
@article{ISI:000258440100004,
title = {Transcriptome analysis provides new insights into liver changes induced
in the rat upon dietary administration of the food additives butylated
hydroxytoluene, curcumin, propyl gallate and thiabendazole},
author = { R. Stierum and A. Conesa and W. Heijne and B. van Ommen and K. Junker and M. P. Scott and R. J. Price and C. Meredith and B. G. Lake and J. Groten},
url = {http://dx.doi.org/10.1016/j.fct.2008.04.019},
doi = {10.1016/j.fct.2008.04.019},
issn = {0278-6915},
year = {2008},
date = {2008-08-01},
journal = {FOOD AND CHEMICAL TOXICOLOGY},
volume = {46},
number = {8},
pages = {2616-2628},
abstract = {Transcriptomics was performed to gain insight into mechanisms of food
additives butylated hydroxytoluene (BHT), curcumin (CC), propyl gallate
(PG), and thiabendazole (TB), additives for which interactions in the
liver can not be excluded. Additives were administered in diets for 28
days to Sprague-Dawley rats and cDNA microarray experiments were
performed on hepatic RNA. BHT induced changes in the expression of 10
genes, including phase I (CYP2B1/2; CYP3A9; CYP2C6) and phase II
metabolism (GST mu 2). The CYP2B1/2 and GST expression findings were
confirmed by real time RT-PCR, western blotting, and increased GST
activity towards DCNB. CC altered the expression of 12 genes. Three out
of these were related to peroxisomes (phytanoyl-CoA dioxygenase, enoyl-CoA hydratase; CYP4A3). Increased cyanide insensitive
palmitoyl-CoA oxidation was observed, suggesting that CC is a weak
peroxisome proliferator. TB changed the expression of 12 genes, including CYP1A2. In line, CYP1A2 protein expression was increased. The
expression level of five genes, associated with p53 was found to change
upon TB treatment, including p53 itself, GADD45 alpha, DN-7, protein
kinase C beta and serum albumin. These array experiments led to the
novel finding that TB is capable of inducing p53 at the protein level, at least at the highest dose levels employed above the current NOAEL.
The expression of eight genes changed upon PG administration. This study
shows the value of gene expression profiling in food toxicology in terms
of generating novel hypotheses on the mechanisms of action of food
additives in relation to pathology. (C) 2008 Elsevier Ltd. All rights
reserved.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Tarraga, J.; Medina, I.; Carbonell, J.; Huerta-Cepas, J.; Minguez, P.; Alloza, E.; Al-Shahrour, F.; Vegas-Azcarate, S.; Goetz, S.; Escobar, P.; Garcia-Garcia, F.; Conesa, A.; Montaner, D.; Dopazo, J.
GEPAS, a web-based tool for microarray data analysis and interpretation Journal Article
In: NUCLEIC ACIDS RESEARCH, vol. 36, no. S, pp. W308-W314, 2008, ISSN: 0305-1048.
@article{ISI:000258142300058,
title = {GEPAS, a web-based tool for microarray data analysis and interpretation},
author = { J. Tarraga and I. Medina and J. Carbonell and J. Huerta-Cepas and P. Minguez and E. Alloza and F. Al-Shahrour and S. Vegas-Azcarate and S. Goetz and P. Escobar and F. Garcia-Garcia and A. Conesa and D. Montaner and J. Dopazo},
url = {http://dx.doi.org/10.1093/nar/gkn303},
doi = {10.1093/nar/gkn303},
issn = {0305-1048},
year = {2008},
date = {2008-07-01},
journal = {NUCLEIC ACIDS RESEARCH},
volume = {36},
number = {S},
pages = {W308-W314},
abstract = {Gene Expression Profile Analysis Suite (GEPAS) is one of the most
complete and extensively used web-based packages for microarray data
analysis. During its more than 5 years of activity it has continuously
been updated to keep pace with the state-of-the-art in the changing
microarray data analysis arena. GEPAS offers diverse analysis options
that include well established as well as novel algorithms for
normalization, gene selection, class prediction, clustering and
functional profiling of the experiment. New options for time-course (or
dose-response) experiments, microarray-based class prediction, new
clustering methods and new tests for differential expression have been
included. The new pipeliner module allows automating the execution of
sequential analysis steps by means of a simple but powerful graphic
interface. An extensive re-engineering of GEPAS has been carried out
which includes the use of web services and Web 2.0 technology features, a new user interface with persistent sessions and a new extended
database of gene identifiers. GEPAS is nowadays the most quoted web tool
in its field and it is extensively used by researchers of many countries
and its records indicate an average usage rate of 500 experiments per
day. GEPAS is available at http://www.gepas.org.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Al-Shahrour, F.; Carbonell, J.; Minguez, P.; Goetz, S.; Conesa, A.; Tarraga, J.; Medina, I.; Alloza, E.; Montaner, D.; Dopazo, J.
Babelomics: advanced functional profiling of transcriptomics, proteomics and genomics experiments Journal Article
In: NUCLEIC ACIDS RESEARCH, vol. 36, no. S, pp. W341-W346, 2008, ISSN: 0305-1048.
@article{ISI:000258142300064,
title = {Babelomics: advanced functional profiling of transcriptomics, proteomics
and genomics experiments},
author = { F. Al-Shahrour and J. Carbonell and P. Minguez and S. Goetz and A. Conesa and J. Tarraga and I. Medina and E. Alloza and D. Montaner and J. Dopazo},
url = {http://dx.doi.org/10.1093/nar/gkn318},
doi = {10.1093/nar/gkn318},
issn = {0305-1048},
year = {2008},
date = {2008-07-01},
journal = {NUCLEIC ACIDS RESEARCH},
volume = {36},
number = {S},
pages = {W341-W346},
abstract = {We present a new version of Babelomics, a complete suite of web tools
for the functional profiling of genome scale experiments, with new and
improved methods as well as more types of functional definitions.
Babelomics includes different flavours of conventional functional
enrichment methods as well as more advanced gene set analysis methods
that make it a unique tool among the similar resources available. In
addition to the well-known functional definitions (GO, KEGG), Babelomics
includes new ones such as Biocarta pathways or text mining-derived
functional terms. Regulatory modules implemented include transcriptional
control (Transfac, CisRed) and other levels of regulation such as
miRNA-mediated interference. Moreover, Babelomics allows for
sub-selection of terms in order to test more focused hypotheses. Also
gene annotation correspondence tables can be imported, which allows
testing with user-defined functional modules. Finally, a tool for the de
novo functional annotation of sequences has been included in the system.
This allows using yet unannotated organisms in the program. Babelomics
has been extensively re-engineered and now it includes the use of web
services and Web 2.0 technology features, a new user interface with
persistent sessions and a new extended database of gene identifiers.
Babelomics is available at http://www.babelomics.org.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Botton, A.; Galla, G.; Conesa, A.; Bachem, C.; Ramina, A.; Barcaccia, G.
Large-scale Gene Ontology analysis of plant transcriptome-derived sequences retrieved by AFLP technology Journal Article
In: BMC GENOMICS, vol. 9, 2008, ISSN: 1471-2164.
@article{ISI:000258552300001,
title = {Large-scale Gene Ontology analysis of plant transcriptome-derived
sequences retrieved by AFLP technology},
author = { A. Botton and G. Galla and A. Conesa and C. Bachem and A. Ramina and G. Barcaccia},
url = {http://dx.doi.org/10.1186/1471-2164-9-347},
doi = {10.1186/1471-2164-9-347},
issn = {1471-2164},
year = {2008},
date = {2008-07-01},
journal = {BMC GENOMICS},
volume = {9},
abstract = {Background: After 10 years of use of AFLP (Amplified Fragment Length
Polymorphism) technology for DNA fingerprinting and mRNA profiling, large repertories of genome- and transcriptome-derived sequences are
available in public databases for model, crop and tree species. AFLP
marker systems have been and are being extensively exploited for genome
scanning and gene mapping, as well as cDNA-AFLP for transcriptome
profiling and differentially expressed gene cloning. The evaluation, annotation and classification of genomic markers and expressed
transcripts would be of great utility for both functional genomics and
systems biology research in plants. This may be achieved by means of the
Gene Ontology (GO), consisting of three structured vocabularies (i.e.
ontologies) describing genes, transcripts and proteins of any organism
in terms of their associated cellular component, biological process and
molecular function in a species-independent manner. In this paper, the
functional annotation of about 8,000 AFLP-derived ESTs retrieved in the
NCBI databases was carried out by using GO terminology.
Results: Descriptive statistics on the type, size and nature of gene
sequences obtained by means of AFLP technology were calculated. The gene
products associated with mRNA transcripts were then classified according
to the three main GO vocabularies. A comparison of the functional
content of cDNA-AFLP records was also performed by splitting the
sequence dataset into monocots and dicots and by comparing them to all
annotated ESTs of Arabidopsis and rice, respectively. On the whole, the
statistical parameters adopted for the in silico AFLP-derived
transcriptome-anchored sequence analysis proved to be critical for
obtaining reliable GO results. Such an exhaustive annotation may offer a
suitable platform for functional genomics, particularly useful in
non-model species.
Conclusion: Reliable GO annotations of AFLP-derived sequences can be
gathered through the optimization of the experimental steps and the
statistical parameters adopted. The Blast2GO software was shown to
represent a comprehensive bioinformatics solution for an
annotation-based functional analysis. According to the whole set of GO
annotations, the AFLP technology generates thorough information for
angiosperm gene products and shares common features across angiosperm
species and families. The utility of this technology for structural and
functional genomics in plants can be implemented by serial annotation
analyses of genome-anchored fragments and organ/tissue-specific
repertories of transcriptome-derived fragments.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}