EMERALD Work package descriptions
WP1: Quality Metrics and Ontologies (PI: Alvis Brazma, EBI).
WP2: Standards (PI: Carole Foy, LGC).
WP3: Organisation and dissemination (PI: Arne Sandvik, NTNU).
WP4: Data Quality and Systems Biology (PI: Martin Kuiper, VIB).
WP5: Standards and European legislation (PI: Heinz Schimmel, IRMM).
WP6: New Technologies (PI: Ulf Landegren, UU).
WP7: Project management and dissemination (PI: Martin Kuiper, VIB).
WP1: Quality Metrics and Ontologies (PI: Alvis Brazma, EBI).
The objective of this WP is to develop and disseminate quality metrics and tools for determining data quality and communicating data transformations. In addition, an ontology for describing microarray experiments and analysis (Normalisation and Transformation Ontology, NTO) will be developed and disseminated.
Although MAGE has proven to be a valuable standard for the reporting of DNA microarray data, its use has so far been largely limited to communicating raw data from individual functional genomics experiments. While the MIAME standards specify that data transformations, including normalisation and the criteria for selecting “significant genes”, should be part of any microarray report, we still lack a consistent and unambiguous way of describing these analyses. A controlled vocabulary is required to allow researchers to communicate effectively the quantitation types used in any analysis. Further, there is increasing interest within the microarray community in an unbiased assessment of the quality of each individual hybridisation, and in assessing the overall confidence in any measurement arising from a complete microarray study.
We will address aspects of data quality associated with the measurements of a single feature and those associated with the data from a whole array. Beginning with the individual features (array elements), we will define objective criteria for assessing the quality of the data arising from those features. These criteria will then be used as the basis for developing an assessment of individual hybridisation assays, as the features are the primary (although not the only) elements we are concerned with in each array assay. Finally, these two levels of assessment will be used to assess the quality of complete studies involving a large number of microarray assays. As part of this, we will work to develop protocols for error assessment and propagation of errors across experiments, to ensure that the expression level of each gene in each assay is reported with a degree of confidence supported by the underlying data. Such quality scores will have a profound effect on the value of microarray data.
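As a hedged illustration of what reporting an expression level with a degree of confidence could look like, the sketch below computes the mean and standard error of replicate measurements for one gene, then propagates the errors of two independent conditions into their difference by adding them in quadrature. The function names, the use of log2 ratios, the example values and the independence assumption are all illustrative and are not part of the work package itself.

```python
import math
import statistics


def mean_and_standard_error(values):
    """Return (mean, standard error of the mean) for replicate measurements."""
    if len(values) < 2:
        raise ValueError("need at least two replicates to estimate an error")
    mean = statistics.fmean(values)
    # Sample standard deviation divided by sqrt(n) gives the standard error.
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


def difference_with_error(a, b):
    """Difference of two independent estimates, errors added in quadrature.

    a and b are (mean, standard_error) tuples; independence is an assumption.
    """
    (mean_a, sem_a), (mean_b, sem_b) = a, b
    return mean_a - mean_b, math.sqrt(sem_a ** 2 + sem_b ** 2)


# Hypothetical replicate log2 ratios for one gene in two conditions.
treated = mean_and_standard_error([1.0, 1.2, 0.8, 1.0])
control = mean_and_standard_error([0.1, -0.1, 0.0, 0.0])
change, change_error = difference_with_error(treated, control)
print(f"log2 change = {change:.2f} +/- {change_error:.2f}")
```

Real protocols would also need to account for correlated errors, for example between features on the same array, which this sketch deliberately ignores.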
WP2: Standards (PI: Carole Foy, LGC).
The objective of this work package is to plan and advocate the use of standards by the microarray community. This will involve the identification of suitable reference materials (spikes, reference RNAs), the assessment of analytical “best practice” guidelines and standardised approaches to experimental design and execution.
Because of the complex nature of a microarray experiment there are many sources of variability. Sample quality, labelling protocol, hybridisation conditions, scanning instrument, image acquisition and processing, data normalisation and analysis, quality assessment of data and interpretation of results all contribute to the overall uncertainty of the conclusions drawn. Comparing results from seemingly identical experiments between different laboratories, operators or even days can prove challenging, not least because of the current lack of standards throughout the process. The challenges increase further when data from different platforms need to be compared. Whereas the availability of Quality Metrics will facilitate the implementation of quality control (QC), it will be extremely rewarding to work on quality assurance (QA) in the production process itself. It is here that the value content of the data will be determined. To help realise the full potential of microarrays, broad acceptance and implementation of hybridisation references, analytical “best practice” guidelines and standardised approaches to experimental design and execution are required. These will facilitate the production of consistently higher-quality data, enable more precise QA/QC procedures to be performed, and feed back into the development of quality metrics to objectively assess the performance of a microarray experiment and the quality of the data generated.
WP3: Organisation and dissemination (PI: Arne Sandvik, NTNU).
Whereas WP1 and WP2 provide the ‘push’ for QA/QC, the purpose of WP3 is to organise and structure the community ‘pull’. First, we will identify and bring together the key players in the use and further development of transcriptome microarrays. To build critical mass, we will ask key (MGED) stakeholders to join an Advisory Board for organising the different workshops. We will disseminate the results of WP1 and WP2 to the community, and work toward a general appreciation of the benefits of QA/QC in the data production process. We will also disseminate the results within the field of microarray-based development and novel applications. As mentioned previously, quality metrics and ontologies are fundamental to assessing and describing pre-processed data in data archives. Therefore, in addition to the workshops described under WP3, specific QM and ontology workshops may be organised under WP1, under the auspices of MGED.
WP4: Data Quality and Systems Biology (PI: Martin Kuiper, VIB).
The purpose of WP4 is to assess the impact of QM-based filtering and general QA/QC implementation on the performance of various mining and modelling approaches applied to microarray data compendia.
Selected microarray data compendia from ArrayExpress or other sources (CAGE, AtGenExpress, Rosetta, etc.) will be used to calculate quality metric values for each individual microarray data set. Compendia will be pruned according to increasingly restrictive QM values (or parameters), yielding a number of progressively higher-quality data compendia. These quality-pruned compendia will subsequently be subjected to bootstrapping to assess all-by-all clustering of genes and experiments in order to establish Pearson correlation coefficients (PCCs). The range of these PCCs across all gene pairs, or across 100 arbitrarily chosen gene pairs (whichever is better), over 100 bootstrapped sets will be compared for the different ‘quality-pruned’ datasets, and the biological significance of the results will be checked against Gene Ontology (GO) terms. These ranges of PCCs will be compared against the specific parameter settings and the quantity of data that had to be removed, in order to derive proper feedback to WP1. The results for particular parameters will also be discussed with the representatives of WP1, giving them valuable feedback on the effectiveness and priority of the different metrics that may need further development and refinement. The trade-off between the gain in PCC value and the amount of data remaining in the pruned compendium will be used as a measure to recommend specific QC targets, or specific parameter filters, to the user community.
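The pruning and bootstrapping procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method WP4 will implement: the data layout (one dictionary per array with a single `quality` score and an `expr` mapping), the function names, the choice to resample arrays with replacement, and the synthetic toy compendium are all hypothetical. Only the figure of 100 bootstrapped sets is taken from the text.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    denom = (sum(d * d for d in dx) * sum(d * d for d in dy)) ** 0.5
    if denom == 0:
        return 0.0  # assumption: treat a constant profile as uncorrelated
    return sum(a * b for a, b in zip(dx, dy)) / denom


def prune(arrays, min_quality):
    """Keep only arrays whose quality metric meets the threshold."""
    return [a for a in arrays if a["quality"] >= min_quality]


def bootstrap_pcc_ranges(arrays, gene_pairs, n_boot=100, seed=0):
    """For each gene pair, return the (min, max) PCC over bootstrapped sets.

    Each bootstrapped set resamples the arrays with replacement (assumption).
    """
    rng = random.Random(seed)
    pccs = {pair: [] for pair in gene_pairs}
    for _ in range(n_boot):
        sample = rng.choices(arrays, k=len(arrays))
        for pair in gene_pairs:
            g1, g2 = pair
            x = [a["expr"][g1] for a in sample]
            y = [a["expr"][g2] for a in sample]
            pccs[pair].append(pearson(x, y))
    return {pair: (min(v), max(v)) for pair, v in pccs.items()}


def make_toy_compendium(n_arrays=60, seed=1):
    """Synthetic data: two co-regulated genes, noisier on low-quality arrays."""
    rng = random.Random(seed)
    arrays = []
    for _ in range(n_arrays):
        quality = rng.random()
        signal = rng.gauss(0, 1)
        noise = 1.0 - quality  # lower quality score, more measurement noise
        arrays.append({
            "quality": quality,
            "expr": {"geneA": signal + rng.gauss(0, noise),
                     "geneB": signal + rng.gauss(0, noise)},
        })
    return arrays


compendium = make_toy_compendium()
pairs = [("geneA", "geneB")]
for threshold in (0.0, 0.3, 0.6):
    kept = prune(compendium, threshold)
    low, high = bootstrap_pcc_ranges(kept, pairs)[("geneA", "geneB")]
    print(f"QM >= {threshold:.1f}: kept {len(kept)}/{len(compendium)} arrays, "
          f"PCC range [{low:.2f}, {high:.2f}]")
```

Printing the number of retained arrays next to the PCC range for each threshold mirrors the trade-off described in the text between gain in PCC value and the amount of data remaining in the pruned compendium.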
The importance of QA/QC and quality parameters for systems biology modelling of e.g. gene networks will be borne out once a critical mass of data becomes available. These studies will be carried out by VIB, who will present the major conclusions at the topical workshop VII organised under WP3.
WP5: Standards and European legislation (PI: Heinz Schimmel, IRMM).
The purpose of WP5 is to take the QA/QC criteria analysed, developed and discussed in the previous four work packages and translate these into commutability criteria for microarray-relevant Reference Materials. These criteria will form the basis of derived projects, independent of the current CA, that aim to develop and distribute such European reference materials.
European Reference Materials (ERM) are certified materials that undergo peer evaluation and offer high quality and reliability. They are a major tool for improving confidence in, and mutual recognition of, test results and certificates in a global market. ERMs comply with high metrological requirements, ensuring traceability of measurement results, and are the end-point of the traceability chain, thus being primary standards in chemistry. The present partners of the European Reference Materials concept are three major European reference materials producers:
• IRMM (The European Commission's Directorate General Joint Research Centre), Belgium.
• LGC, United Kingdom.
• The Bundesanstalt für Materialforschung und -prüfung (BAM), Germany.
Conclusions from the workshop on selected standard materials developed in response to this CA will be assessed for their suitability for ERMs, and the consortium will pursue projects and funding to develop such ERMs. Such projects, however, will be carried out independently of this application.
IRMM will be the conduit that carries the results of this CA into the different European standardisation bodies. As a directorate of the European Commission with long-standing experience in the scientific and technical support of national and pan-European legislation, IRMM will carry out the following activities in the field of microarray-based technologies:
• Function as an interface between science and the regulatory and standardisation bodies.
• Represent the consortium in relevant standardisation and regulation bodies.
• Provide consultancy on emerging questions and discussions.
In the exercise of these activities IRMM will maintain close links with LGC and other reference setting organisations.
WP6: New Technologies (PI: Ulf Landegren, UU).
In this work package we will perform a survey of further developments of microarray technologies, and of new applications. Key academic and commercial developers of microarray technology, as well as users (research groups, product developers, service providers), will be identified, contacted, and invited to participate in a workshop arranged by WP3. At the workshop, the impact of quality assurance and quality control (QA/QC) on transcriptome measurements will be discussed, along with the quality metrics and standards developed by WP1 and WP2.
We will furthermore create a publicly available web-based technology database. The database will include method descriptions of microarray-based technologies, and key players in the field will be identified. It will be formatted to allow key players to add detailed protocols, applications, QA/QC procedures, and links to further information. The database will thus be used to disseminate information to the community that develops and uses new microarray technologies. Emerging technologies will be presented to a wider audience in Workshop VII, with a focus on challenges concerning standardisation.
WP7: Project management and dissemination (PI: Martin Kuiper, VIB).
The project management (VIB) is responsible for coordinating the interactions and the data flow between the different partners, and for discussing and setting common targets in agreement with the other partners. Furthermore, the management and interconnected work package groups will have a main role in the dissemination of results and in the overall coordination of the implementation of quality assurance and quality control (QA/QC). The management team communicates with the EU and will be responsible for the overall legal, contractual, ethical, financial and administrative management of the consortium. Moreover, it will coordinate overall scientific and technological activities, and update and manage the consortium. It is also responsible for strategic, technical and financial issues.
The whole consortium plans to meet at six-month intervals. Specific interest group meetings, defined by the work packages of the proposal, will however be held more frequently. In addition, ad hoc meetings can be arranged through video conferencing. The management will integrate activities and plan meetings and training events. The research will provide leading-edge training for a new generation of scientists working in the fields of microarray-based transcript profiling, functional genomics technology development and data-driven gene network building. Specific emphasis will be given within this project to the dissemination of results to the public using traditional and novel media. Results and approaches of this project will be published in media available to a wide public audience. The homepage of the EBI project web portal will inform the interested public about project approaches and results. The high scientific profile of the partners in this project will also enable publication of project results in leading scientific journals. Joint publications by several partners will highlight the project profile, strengthen the role of European research at the international level, and underline the credibility of the research groups.
