Genedata Biologics

Genedata Biologics® supports the full protein production process, from definition of the desired protein molecules, through molecular biology, expression, and purification, to analytics. The system stores all proteins, vectors, plasmids, cell lines, material batches, and related analytics and QC data. It provides tools for molecule and sample registration, as well as analytics and QC data reporting. Tools for in silico construct creation support vector development campaigns across diverse cloning strategies and enable parallel development of both transient and stable cell line pools. All expression and purification samples are related to their underlying experimental protocols.

The integrated and consistent management of expression-relevant data, such as promoters, leader peptides, codon usage, and vector design, enables evidence-based optimization and development of expression systems, even for notoriously difficult-to-express proteins. Genedata Biologics can handle diverse antibody formats, including standard IgG, next-gen formats, and other multi-chain proteins, and also takes into account modifications such as PEGylations, glycosylations, or conjugations (e.g., for antibody drug conjugates, ADCs).

Analytics and QC data are stored within the same system and are auto-referenced to the relevant expression and purification batches. The system also provides specialized functionalities for the production of tool proteins and antigens. Integrated reporting allows for automatic generation of comprehensive Certificate of Analysis (CoA) reports for release and handover of final protein samples.

Key Features

Biologics Registration
Central registration engine for large molecules

Genedata Biologics® is built on a central registration engine for all categories of biologics entities, including antibodies such as IgGs and bispecifics, non-antibody proteins, plasmids and DNA, cell lines, and other biological samples. The system provides specific capabilities for registration of modified proteins such as antibody drug conjugates (ADCs) and PEGylated, glycosylated, or otherwise modified molecules (e.g., de-tagged molecule versions). Genedata Biologics performs automated molecule uniqueness checks, generates unique identifiers for molecules and associated batches, annotates domains such as binding-relevant domains as well as chemical liabilities (e.g., undesired PTMs), and calculates physicochemical and other relevant molecule properties. Molecular ancestries are fully documented, enabling tracking from early discovery (e.g., phage display screens, engineered variants) to fully re-formatted IgGs in downstream testing. Analogous to sub-structure searches for small-molecule databases, Genedata Biologics provides built-in query and reporting tools specifically tailored for large molecules. The platform's registration engine is key to managing data and material handovers along the biologics R&D process, and helps to eliminate duplication of work. In addition to therapeutic candidates, the system can register molecular tools and materials (e.g., antigens, tool proteins, related vectors, inserts, developed cell lines, stable pools), which are necessary for full documentation of the biologics discovery and production process.
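
To illustrate the idea behind an automated uniqueness check with identifier generation, the following is a minimal sketch in Python. It is not the Genedata Biologics implementation or API: the `MoleculeRegistry` class, the `MOL-` identifier format, and the choice of hashing a normalized single-chain sequence are all assumptions made for illustration. A real registration engine would also need to account for multi-chain molecules and modifications.

```python
import hashlib


def normalize(sequence: str) -> str:
    """Uppercase the sequence and drop whitespace so formatting differences are ignored."""
    return "".join(sequence.split()).upper()


class MoleculeRegistry:
    """Hypothetical in-memory registry (illustration only, not a vendor API)."""

    def __init__(self, prefix: str = "MOL"):
        self.prefix = prefix
        self._by_digest = {}  # sequence digest -> molecule identifier

    def register(self, sequence: str):
        """Return (molecule_id, is_new); identical sequences map to the same ID."""
        digest = hashlib.sha256(normalize(sequence).encode("ascii")).hexdigest()
        if digest in self._by_digest:
            return self._by_digest[digest], False
        molecule_id = f"{self.prefix}-{len(self._by_digest) + 1:06d}"
        self._by_digest[digest] = molecule_id
        return molecule_id, True


registry = MoleculeRegistry()
first = registry.register("EVQLVESGGG")
duplicate = registry.register("evqlv esggg\n")  # same molecule, different formatting
other = registry.register("QVQLQESGPG")
```

In this sketch the second submission is recognized as a duplicate and receives the existing identifier, while the third sequence is assigned a new one.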

Integrated Vector Management
Management of constructs, plasmids, and other molecular biology data

Genedata Biologics comes with a built-in database for backbone vectors, cloning and expression constructs, and corresponding vector maps. All vectors are centrally stored, together with their DNA sequences and annotations, such as promoters, leader peptides, resistance cassettes, and cloning strategy. Vectors may encode therapeutic antibodies, proteins, or tool proteins, such as antigens or drug targets. A vector registration engine checks for vector uniqueness to avoid entry duplication. All vector sequences, annotations, and references are managed within a single system. Vectors can be organized and queried according to various criteria such as encoded proteins, contact person, or relevant project. The system stores vectors according to encoded protein ancestries, which makes it easy to link vectors that are used in a specific scientific context, e.g., molecular biology efforts organized by related proteins such as full-length wild-type proteins, catalytic domain expressions, truncations, and mutated variants. Typical applications for tool proteins and reagents include assay development for MTS/HTS, structural biology and X-ray studies, and co-crystallization studies with antibodies or small-molecule compounds. Genedata Biologics supports DNA synthesis workflows by performing automated quality checks to ensure correctness of the encoded protein after codon optimization. The system includes explicit data organization models for multi-vector expression systems, which are of critical importance when working with multi-chain proteins such as antibodies (e.g., IgGs, bispecifics).
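
A quality check of the kind described for codon-optimized sequences can be pictured as translating both the original and the optimized coding sequence and comparing the resulting proteins. The sketch below is a minimal Python illustration under stated assumptions: it uses the standard genetic code only, expects an in-frame coding sequence containing only A, C, G, and T, and the function names `translate` and `encodes_same_protein` are hypothetical rather than part of the product.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    protein = []
    for i in range(0, len(dna), 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]  # KeyError on non-ACGT bases
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def encodes_same_protein(original_dna: str, optimized_dna: str) -> bool:
    """True if both coding sequences translate to the same protein."""
    return translate(original_dna) == translate(optimized_dna)


original = "ATGGCTAAATAA"   # Met-Ala-Lys-stop
optimized = "ATGGCCAAGTGA"  # synonymous codons, same protein
mutated = "ATGGACAAGTGA"    # Ala -> Asp substitution
```

Here `encodes_same_protein(original, optimized)` holds because only synonymous codons were exchanged, whereas the `mutated` sequence fails the check.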

Bulk Cloning & Molecule Re-formatting
Alleviating bottlenecks in molecular biology via in silico cloning tools

Genedata Biologics provides specialized tools for automation and scale-up of vector design. This is particularly useful if larger sets of inserts or variable regions, such as those derived from high-throughput antibody screening campaigns, need to be cloned into a backbone vector to produce full-sized IgG molecules. The system provides flexible tools for automating the bulk generation of new antibody and non-antibody vector maps and encoded protein molecules, based on selected backbone vectors and desired cloning strategies (e.g., Gateway Cloning). Specific functionalities enable automated antibody re-formatting (scFv or Fab to IgG) and isotype switching. The system also enables rapid expression vector optimization by simultaneously cloning inserts into different backbone vectors carrying different combinations of modules of interest (e.g., cleavage sites, tags, leader peptides). All resulting vectors are auto-referenced to the proteins they encode. Auto-generated expression vectors are centrally stored in a common vector repository and can be queried according to their specific properties such as used promoters, leader peptides, resistance cassettes, tags, Kozak, isotype, and constant region sequences.
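
The combinatorial nature of bulk cloning, in which every insert is paired with every selected backbone, can be sketched as follows. This is a simplified Python illustration, not the product's cloning logic: the `bulk_clone` function, the `{INSERT}` placeholder marking the cloning site, and the toy sequences are all assumptions, and real cloning strategies involve restriction sites, recombination sites, and reading-frame checks that are omitted here.

```python
from itertools import product


def bulk_clone(inserts: dict, backbones: dict, site: str = "{INSERT}") -> dict:
    """Place every insert into every backbone at the placeholder cloning site.

    Returns a mapping of construct name -> construct sequence.
    """
    constructs = {}
    for (insert_name, insert_seq), (backbone_name, backbone_seq) in product(
        inserts.items(), backbones.items()
    ):
        if backbone_seq.count(site) != 1:
            raise ValueError(f"{backbone_name} must contain exactly one cloning site")
        constructs[f"{backbone_name}_{insert_name}"] = backbone_seq.replace(
            site, insert_seq
        )
    return constructs


# Toy data: two variable-region inserts and two backbones (hypothetical names).
inserts = {"VH1": "GAGGTG", "VH2": "CAGGTC"}
backbones = {"pBB_A": "AAA{INSERT}TTT", "pBB_B": "CCC{INSERT}GGG"}
constructs = bulk_clone(inserts, backbones)
```

Two inserts and two backbones yield four constructs, each named after the backbone and insert it was built from, so every resulting vector remains traceable to its components.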

Protein Expression and Purification Workflow Support
Facilitating complex workflows in a division-of-labor environment

Protein expression and purification workflows require different teams, such as molecular biologists, cell biologists, protein scientists, and analytics personnel, to collaborate. Genedata Biologics supports such division-of-labor processes by enabling central registration, naming, and tracking of all relevant biomaterial samples (e.g., vector, cell line, protein expression, and protein purification batches) and providing immediate central access to every molecule (e.g., sequence, physicochemical properties) as well as to process information at any time throughout the expression and purification process. Registered samples are related to their molecules and experimental protocols, including the expression constructs, host cell lines, media, bioprocess protocols, and purification procedures used. The system can work with diverse expression systems, including mammalian, bacterial, yeast, and insect. Protein expression batches may be derived from high- or low-volume expression campaigns (e.g., tubes, shake flasks, wave reactors). Bulk upload functionalities facilitate registration of larger numbers of expression batches (e.g., derived from parallel production in 24- or 96-well plate tube fermentors). Genedata Biologics stores all relevant data for protein expression experiments such as host cell lines (e.g., HEK293, CHO, Sf9), expression protocols (e.g., specific vectors used, transfection conditions, transient or stable pools, baculovirus infection, co-expression), and process parameters (e.g., temperature, induction conditions, growth media). For the downstream purification process, the system stores all relevant purification results and underlying protocols.
As with expression batches, Genedata Biologics supports high- and low-volume purification processes (e.g., parallel test purification in 96-well plate gravity flow columns), use of different purification equipment (e.g., IMAC, SEC, RFC, HIC, IEX), and product modifications (e.g., de-tagging, buffer optimization). Genedata Biologics further supports users by providing tools for concrete laboratory operations, such as sample pooling, sample splitting, and individual sample processing, as well as protocol management (e.g., protein modifications such as antibody drug conjugation). Genedata Biologics consistently documents all process parameters and uniquely aggregates the data contributed by different groups along the full protein production process, providing critically important input for bioprocess development.

Analytics and QC Data Management
Capturing, reporting, and interpreting protein analytics and quality data

Analytics and QC data for protein expression and purification samples are easily reported in Genedata Biologics. Standard data submission templates enable reporting of results derived from diverse analytics instruments. Typical analytics parameters include yield, purity, stability, activity, and solubility; typical analytics technologies include SDS-PAGE, SEC, SLS/DLS, MS, and SPR. Other quality parameters critical for biologics include endotoxin levels, aggregation levels, and melting temperatures. Genedata Biologics analytics data entry forms can also be configured to reflect corporate-specific analytics and quality control processes and standards. The system provides a flexible mechanism for managing all relevant analytics, QC, and biophysical data, with automatic linking of reported data to the relevant protein sample. The system calculates biophysical properties such as molecular weight, isoelectric point (pI), molar extinction coefficient, and absorbance, as well as potential post-translational modification (PTM) sites, for use in performing the actual analytics experiments. Genedata Biologics provides single-point and automated generation of sample and analytics quality reports (e.g., Certificate of Analysis, CoA) for handover of quality-checked samples to requesting groups and departments. The integrated nature of the data management platform enables easy tracking of analytics and QC parameters. Integrated tools help eliminate protein productivity or quality bottlenecks by identifying the optimal combinations of host cells, vector backbones, tags, leader peptides, and expression and purification protocols.
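
As an illustration of how such biophysical properties can be derived from sequence alone, the sketch below computes molecular weight, molar extinction coefficient at 280 nm, and the absorbance of a 1 g/L solution. It is not the product's calculation: the function names are hypothetical, the molecular weight uses average residue masses for an unmodified single chain, and the extinction coefficient uses the widely used approximation of 5500 per tryptophan, 1490 per tyrosine, and 125 per cystine (in M^-1 cm^-1). The isoelectric point is omitted because it requires an iterative charge calculation with a chosen pKa set.

```python
# Average residue masses in Da (amino acid mass minus one water molecule).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.0153


def molecular_weight(protein: str) -> float:
    """Average molecular weight (Da) of an unmodified, linear peptide chain."""
    return sum(RESIDUE_MASS[aa] for aa in protein.upper()) + WATER_MASS


def extinction_coefficient(protein: str, cystines_formed: bool = True) -> int:
    """Approximate molar extinction coefficient at 280 nm (M^-1 cm^-1)."""
    seq = protein.upper()
    # Assumption: if disulfides are formed, all cysteines pair up as cystines.
    cystines = seq.count("C") // 2 if cystines_formed else 0
    return 5500 * seq.count("W") + 1490 * seq.count("Y") + 125 * cystines


def absorbance_at_1_g_per_l(protein: str, cystines_formed: bool = True) -> float:
    """Absorbance at 280 nm of a 1 g/L solution with a 1 cm path length."""
    return extinction_coefficient(protein, cystines_formed) / molecular_weight(protein)
```

Values of this kind are typically used to convert a measured A280 reading into a protein concentration during yield determination.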