Posters

Title and Abstract Product Download
Genedata Selector for Biofuels R&D

Presented at Society for Industrial Microbiology 2011, New Orleans, LA

We report how the Genedata Selector™ system has successfully been used to select the optimal strain of a specific organism for biofuel production. In particular, we show how to analyze an unlimited number of genomes and predict their capabilities to efficiently generate biofuels. Special focus is given to applying data analysis workflows to process next-generation sequencing (NGS) data in the context of producing alkanes and butanol from microbes and fungi. Integrating NGS-based genotype and transcriptomics profiling data with mass spectrometry-based proteomics and metabolomics data improves and streamlines development cycles.

Genedata Selector Request NGS, Strain Development
Genedata Selector for Fungal Strain Optimization

Presented at Society for Industrial Microbiology 2011, New Orleans, LA

Genedata Selector™ facilitates integration of public and proprietary data in one relational database and contains built-in tools for data analysis and visualization. The analysis of large, complex volumes of omics data can now be streamlined and automated, allowing scientists to focus on the interpretation of results. We illustrate how standard omics data together with NGS data of fungi can be used to elucidate the genetic variation, metabolic capacities, and stability of strains at a genome-wide level. We also show how this can be applied to select optimized fungal strains for, e.g., antibiotic production. Finally, we demonstrate a variety of data analysis tools in the context of applications in different industrial biotechnology segments such as biofuels, cosmetics, detergents, agricultural sciences, and food, beverage, and feed processing.

Genedata Selector Request Strain Development, Omics, NGS
Analysis and Enterprise Data Management Platform for Cell-Based, Label-Free Experiments

Presented at ELA 2011, Hamburg, Germany

Label-free detection offers non-invasive measurement of phenotypic cellular responses, such as ultra-sensitive detection of binding events, cell growth, and changes in intracellular physiology, morphology and adhesion. It therefore holds promise for achieving higher biological relevance in screening experiments.

To realize this potential, the following challenges in data management and analysis need to be overcome: (1) about 100-fold higher volume of time-resolved data compared to conventional screens, (2) complex quality control and parameterization of measured responses, and (3) difficult interpretation of measured time traces in terms of underlying biology.

Here we illustrate a software platform which is capable of overcoming these challenges and is already in use at several pharmaceutical companies.

Genedata Screener Request Label-free
A Unified Infrastructure for Multi-instrument, Multi-site High Content Screening Data

Presented at CHI's High Content Analysis 2011, San Francisco, USA

In recent years, High Content Screening (HCS) has become a foundation technology in lead identification, lead optimization, and toxicology. Typically, multiple HCS instruments from the same or different vendors are used. Ideally, data and analysis from these instruments should be managed in a unified way and processed in streamlined and standardized workflows. The ultimate goal is to provide single-point access to HCS images, data, and standardized results across a geographically dispersed research organization.

In this poster, Genedata shows the concept of a generic HCS data processing and management infrastructure. It is based on several productive HCS installations planned and executed for Genedata customers. This standard infrastructure works with different HCS instruments, scales well with the number of HCS experiments conducted locally, and provides global access to HCS results and images.

Genedata Screener Request HCS
Functional Genomics at Work - Correlating long-range DNA methylation with gene expression in lung fibroblasts

Presented at CHI Next-Generation Sequencing Summit 2010, Rhode Island, USA

Whole-genome, single-base-resolution analysis of the transcriptome and the methylome was performed using the Illumina Genome Analyzer II on two cell lines: IMR90, a fetal lung fibroblast cell line, and H1, a human embryonic stem cell line. The data were obtained from the publication Lister et al., Nature. 2009; 462(7271):315-22, and re-analyzed using the Genedata Expressionist® System.

Genedata Expressionist Refiner Genome Request NGS
Peptide Identities Can Inform a Peak Clustering Parameter for 2D MS/MS Data

Presented at ASMS 2010, Salt Lake City, USA

Analysis of thousands of peaks detected across many samples requires optimization of peak detection parameters. Primary scan data permits qualitative visual inspection of peaks and clusters. With MS2 data and peptide identities, a researcher gains an additional qualitative parameter for assessing the correctness of their peak detection, while concordance of multiple peptide identities within the bounds of detected 2D peaks offers a quantitative handle. This work presents the optimization of isotopic peak clustering using identity concordance as a guide.
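
The idea of using identity concordance as a quantitative handle can be illustrated with a minimal Python sketch. It is not the poster's implementation: the input layout (a mapping from cluster id to the peptide sequences identified inside that cluster's 2D bounds), the function names, and the rule of scoring only clusters with two or more identifications are all assumptions made for illustration.

```python
def concordance(clusters):
    """Fraction of multi-identification clusters whose peptide identities all agree.

    `clusters` is a hypothetical mapping: cluster id -> list of peptide
    sequences identified from MS2 spectra inside that cluster's bounds.
    Clusters with fewer than two identifications carry no concordance
    information and are skipped.
    """
    scored = [ids for ids in clusters.values() if len(ids) >= 2]
    if not scored:
        return None  # nothing to score at this parameter setting
    agreeing = sum(1 for ids in scored if len(set(ids)) == 1)
    return agreeing / len(scored)


def best_tolerance(results_by_tolerance):
    """Return the clustering tolerance with the highest concordance score.

    `results_by_tolerance` maps a candidate tolerance value to the clusters
    produced with it (same layout as in `concordance`).
    """
    scores = {tol: concordance(c) for tol, c in results_by_tolerance.items()}
    scores = {tol: s for tol, s in scores.items() if s is not None}
    return max(scores, key=scores.get), scores


# Illustrative data only: a tight tolerance keeps peptides apart,
# a loose one merges two different peptides into one cluster.
tight = {1: ["PEPTIDEK", "PEPTIDEK"], 2: ["AAAK", "AAAK"], 3: ["LLLR"]}
loose = {1: ["PEPTIDEK", "PEPTIDEK", "AAAK", "AAAK"], 2: ["LLLR"]}
chosen, all_scores = best_tolerance({0.01: tight, 0.05: loose})
```

A score like this only penalizes over-merging. A very tight tolerance that splits true clusters would leave fewer clusters to score, so in practice the number of scored clusters would need to be tracked alongside the concordance value.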

Genedata Expressionist Request Mass Spectrometry
Data Analysis Strategies for Large-Scale Systems Toxicology

Presented at SOT 2010, Salt Lake City, USA

Within the EU FP6 InnoMed PredTox consortium, a project with 20 partners from the pharmaceutical industry, SMEs, and universities, data from 14-day rat studies were collected using several different -omics technologies as well as conventional methods. The main goal of this collaborative effort was to achieve more informed decision making earlier in preclinical safety evaluation. To reach this goal, data analysis strategies were designed for each data type, taking platform-specific aspects into consideration. Initial data analysis was provided for each technology based on agreed quality parameters, normalization strategies, and statistical approaches. The effects of sixteen compounds were analyzed either independently or for groups of compounds that led to similar phenotypes. In this way, concordance between different technologies as well as between sample types could be established and used to identify potential biomarkers. In addition, predictive models were constructed from data within and across technology platforms.

Genedata Expressionist Request Statistical Analysis, Toxicogenomics
Tailored Data Management and Analysis for Biologics R&D

Presented at GRC Antibody Biology & Engineering 2010, Ventura, CA

In collaboration with our partners, we have developed tailored software that incorporates biologics domain knowledge, recognizing the importance of protein and nucleotide sequence data, as well as the importance of modeling the full processes involved in producing proteins. Here, we present a selection of novel tools and databases for antibody screening, antibody engineering and biologics registration that address real-life needs in the laboratories that are often overlooked by current commercially available software.

Genedata Biologics Request Biologics
Integrating HCS Data Analysis into HTS Workflows

Presented at Screening Europe 2010, Barcelona, Spain

High content screening (HCS) experiments produce rich information on phenotypic changes of individual cells in a physiologically relevant context. While primary analysis and management of the HCS images are still the main focus of effort and attention, leading pharmaceutical companies have started to incorporate HCS into their high throughput (HT) screening workflows. This poses new challenges, such as biologically meaningful, automated quantification of results and the standardization of HCS data analysis and management across different instruments, in a framework compatible with high throughput screening technologies. We show how scientists facing HT HCS scenarios can routinely analyze screens in a scalable, fast, and thorough fashion, and how the results can easily be integrated within the organization’s existing research informatics infrastructure.

Genedata Screener for High Content Screening Request HCS, HTS
Automated Analysis, Systematic Quality Control and Multiplexed Hit List Generation

Presented at SBS 2010, Phoenix, AZ

High content screens (HCS) produce rich information on phenotypic changes of individual cells when subjected to treatment with compounds, siRNAs or biologics.

However, larger-scale HCS are typically narrowed down to a single result per well in order to fit standard high throughput screening (HTS) procedures and infrastructure, which limits the biological information that can be obtained from such screens.

Combining HCS-specific business logic with HTS-style quality assurance, automation and standardization, we show how full-deck HCS (1000 plates) are analyzed without loss of information or fidelity.

Genedata Screener Request HCS
Analysis of a Time-resolved, siRNA High Content Screen Clusters New NFkB Regulators to Known Pathway

Presented at MipTec conference 2009, Basel, Switzerland

The gastric pathogen Helicobacter pylori is responsible for gastric inflammation and the second highest number of infection-associated cancers. The transcription factor NFkB, a key mediator of host tissue inflammation, induces the expression of pro-inflammatory cytokines, which have been shown to be important for the development of gastric cancer. To better understand NFkB activation in response to H. pylori infection, a genome-wide RNA interference screen was performed, resulting in 300 primary hits. To characterize the effects of these 300 hits, a time-series high content screen (HCS) with three inducers (H. pylori, IL-1b and TNFa) was carried out. Proteins specifically affecting the cellular response to one of the inducers, as well as those common to all inducers, were identified.

Data analysis was performed on the Genedata Screener® and Genedata Analyst™ software platforms. The effects of three different inducers on almost 40 different features extracted from the HCS images were analyzed, each measured over nine time points in triplicate (about 5 million data points in total). Data standardization, feature selection, and replicate condensing were optimized for this complex experimental design. Time-dependent profiles of {siRNA x inducer} combinations were clustered to yield groups of genes showing similar outcomes on siRNA silencing. The results were interpreted in the context of known inflammatory pathways.

In summary, a general protocol for cross-experimental analysis of time-dependent HCS data was devised, focusing on detailed comparison of siRNA silencing profiles in large HCS data sets. Targeted genes of known protein function allowed further genes involved in these pathways to be found, improving understanding of the processes leading to inflammation after H. pylori infection.
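
The standardization, replicate condensing, and profile comparison steps mentioned above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the protocol used on the poster: the robust z-score against control wells, the median across triplicates, and Pearson correlation as the similarity measure for clustering are common choices picked here for the example, and all function names are hypothetical.

```python
from statistics import median


def robust_z(value, controls):
    """Robust z-score of one measurement relative to negative-control wells."""
    center = median(controls)
    # Median absolute deviation, scaled so it approximates one standard
    # deviation for normally distributed data.
    mad = median([abs(c - center) for c in controls]) * 1.4826
    return (value - center) / mad if mad else 0.0


def condense_replicates(replicates):
    """Collapse replicate time traces into one trace (median per time point)."""
    return [median(point) for point in zip(*replicates)]


def pearson(a, b):
    """Pearson correlation between two equal-length time profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat profile has no defined correlation
    return cov / (var_a * var_b) ** 0.5


# Illustrative triplicate with one outlying replicate at the first time point.
condensed = condense_replicates([[1.0, 2.0], [1.0, 2.0], [50.0, 2.0]])
```

Using the median rather than the mean keeps a single outlying replicate from distorting the condensed profile. A matrix of pairwise `pearson` values between condensed {siRNA x inducer} profiles could then be passed to any standard clustering routine.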

Genedata Screener & Genedata Analyst Request Statistical Analysis, Label-free
Integrated Analysis of Kinetics Data: How to Analyze Curve Shapes Systematically and Optimize Identification of Actives

Presented at SBS 2009, Lille, France

Time-dependent responses are measured in a variety of assay technologies, including calcium flux assays and scalable label-free detection methods. Conventionally, such responses are integrated over time directly on the instruments, yielding one or a few summary parameters, which are used later to quantify compound activity. However, this procedure eliminates a large amount of information contained in the kinetic traces which could be useful for separating intended effects from artifacts, distinguishing multiple biological mechanisms, and optimizing quantification of responses.

In our case study using a FLIPR® Ca2+ flux assay, we demonstrate a framework for efficient and interactive review and analysis of kinetics data without prior data reduction. Quantification of responses was statistically optimized on the level of a complete screen. A classification parameter was calculated from the kinetic traces and was used to eliminate typical assay artifacts. Both procedures significantly increased the quality of hits, reducing downstream experimentation.

This new streamlined analysis workflow is amenable to FLIPR® and other time series assays. Within minutes, new quantification and artifact classification parameters can be developed interactively, tested on the original kinetic traces, and applied automatically to large screens. It is capable of distinguishing actives and artifacts in a robust, sensitive and time-efficient manner.
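
To make the distinction between a quantification parameter and an artifact classification parameter concrete, here is a minimal Python sketch. It does not reproduce the parameters used in the case study: the baseline-corrected peak amplitude, the relative peak position, and the 0.9 threshold are hypothetical choices for illustration only.

```python
def quantify(trace, baseline_points=3):
    """Response amplitude: peak signal minus the mean of the first readings."""
    baseline = sum(trace[:baseline_points]) / baseline_points
    return max(trace) - baseline


def peak_position(trace):
    """Hypothetical classification parameter: where the maximum occurs,
    expressed as a fraction of the trace length (0 = start, 1 = end)."""
    return trace.index(max(trace)) / (len(trace) - 1)


def is_artifact(trace, late_peak=0.9):
    """Flag traces whose maximum sits at the very end of the measurement,
    e.g. a steady upward drift rather than a transient response."""
    return peak_position(trace) >= late_peak


# Illustrative traces only.
transient = [1.0, 1.0, 1.0, 5.0, 3.0, 2.0, 1.5]  # rises, then decays
drift = [1.0, 1.0, 1.0, 1.5, 2.0, 2.5, 3.0]      # keeps rising
```

Both example traces yield a positive amplitude from `quantify`, so a single summary parameter would count both as responses. Only the shape-based parameter separates the transient from the drift, which is the kind of information lost when traces are reduced on the instrument.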

Genedata Screener Request Label-free
Advanced Data Analysis for Hit Identification in High Content Screening

Presented at SBS 2009, Lille, France

High content screening (HCS) is becoming standard technology for drug discovery (phenotypic screens and in vitro toxicity) and academic research (pathway mapping and functional genomics). However, the complex multi-parameter data sets generated pose a significant challenge in terms of management, analysis, and in-depth biological interpretation.

Individually reviewing images and results is manageable in small-scale HCS experiments. However, data complexity and volume in, e.g., a compound library high-content screen or a genome-wide siRNA screen exceed the capacity of in-depth scientific review. This leads to an extreme simplification of the analysis, e.g., using a single outcome such as cell number as the result, so that only a fraction of the experimental information is used.

Our work shows a systematic approach for extracting information-rich features from cell images. These features are aggregated into activity profiles that qualify and quantify the observed biological processes, effects are separated from artifacts by in-depth QC, and results are compiled from complete screens. Scientists can comprehensively review the results, optimize the applied processing methods via statistical metrics, and link back to relevant cell images for improved interpretation of observed effects. This approach increases the value derived from HCS experiments and is applicable to all types of HCS experiments and reader technologies.

Genedata Screener Request Hit & Lead Identification, HCS
Beyond Biomolecular Screening: A Multi-Parametric Procedure for Efficient Hit and Lead Identification

Presented at SBS 2008, St. Louis, USA

Screening more compounds in more assays has led to more data, but has not solved the problem of identifying better candidates with which to shorten drug discovery cycles.

We discuss how more value can be gained from the increased volume of data by managing the data in a consistent way, placing it in context with other information, and finally combining and translating it into efficient hit and lead prioritization.

We suggest defining corporate-wide business rules to ensure data consistency, covering data acquisition and processing, application of quality metrics, and aggregation of plate to compound measurements. Tools to apply these standards in an unbiased yet still interactive style have to be provided.

Furthermore, we present a case study in which public databases are used to create hit lists rich in information, such as biological and molecular properties, and known bioactivities. Different ways of ranking compound lists on such multi-parametric data are shown, and a procedure to identify the most promising candidates for lead optimization is demonstrated.

Genedata Screener Request Hit & Lead Identification
An Automated Workflow for Rapid Alignment and Identification of Lipid Biomarkers Obtained from Chip-based Direct Infusion Nanoelectrospray Tandem Mass Spectrometry

Polar lipids (glycerophospholipids, saccharolipids) are routinely analyzed with chromatographic and mass spectrometric techniques. Chip-based direct infusion nanoelectrospray tandem mass spectrometry can provide lipid profiles or lipid fingerprints with short infusion times (one to two minutes). Using high-resolution FT-ICR-MS or low-resolution linear ion trap mass spectra in positive and negative mode together with data-dependent tandem mass spectrometry, polar lipids can be semi-quantified and identified. The major challenge, however, is not the analysis itself but the automated handling and evaluation of the resulting data. We present efforts which include a reusable workflow environment for rapid alignment and identification of lipids from infusion tandem mass spectra. Results from an environmental tobacco smoke study and from plasma and lung tissue are shown.

Genedata Expressionist Request Mass Spectrometry