Automated Workflow for Host Cell Protein Monitoring by Mass Spectrometry: From Raw Data to Final Report

March 20, 2018

Presented at the Biotherapeutics Analytical Summit 2018, Baltimore, MD, USA

Mass spectrometry (MS) allows unbiased identification and high-throughput quantification of multiple low-abundance proteins, enabling a thorough assessment of host cell protein (HCP) contamination in biotherapeutics. We present an automated approach that, starting from the raw data, supports the identification, quantification, and routine monitoring of HCPs by MS.

HCP contaminants can span a wide range of concentrations, with low-abundance species present at the ppm level. To address this challenge, we applied optimized algorithms for mass recalibration, noise removal, and retention time alignment to obtain the best possible peak detection even for low-abundance species. Two different strategies for the identification of HCPs were then followed.

In the first approach, low-abundance HCPs were monitored using a two-stage identification procedure. The whole collection of signals belonging to the protein biotherapeutic (e.g. peptides and their modifications) was identified before the data were submitted to conventional peptide-spectrum match (PSM) searches. This allowed the expected peptides to be monitored and quantified, while reducing the chance that low-abundance features from the biotherapeutic are falsely identified as HCP signals (false positives).
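The first stage of this procedure can be illustrated with a minimal Python sketch. It is not the presented software: the `Feature` class, the `split_product_features` function, the 10 ppm tolerance, and all numeric values are assumptions made up for illustration. The sketch matches on m/z only and sets aside features that correspond to expected product masses, so that only the remaining features would go on to a PSM search.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """A detected LC-MS feature (hypothetical structure for this sketch)."""
    mz: float         # observed m/z
    rt_min: float     # retention time in minutes
    intensity: float  # peak intensity in arbitrary units


def ppm_error(observed_mz: float, expected_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - expected_mz) / expected_mz * 1e6


def split_product_features(features, product_mz_values, tol_ppm=10.0):
    """Separate features explained by the biotherapeutic from the rest.

    The 10 ppm default is an assumed tolerance, not a value from the abstract.
    """
    product_related, remaining = [], []
    for feature in features:
        is_product = any(
            abs(ppm_error(feature.mz, mz)) <= tol_ppm for mz in product_mz_values
        )
        (product_related if is_product else remaining).append(feature)
    return product_related, remaining


# Made-up example values, for illustration only.
features = [
    Feature(mz=500.2500, rt_min=12.1, intensity=1.0e8),
    Feature(mz=500.2600, rt_min=30.4, intensity=2.0e4),
    Feature(mz=733.4010, rt_min=18.7, intensity=5.0e3),
]
product_mz = [500.2501, 612.3300]

product_related, remaining = split_product_features(features, product_mz)
```

Here only the first feature lies within tolerance of an expected product mass; the other two would be passed on as HCP candidates. A real implementation would presumably also use charge state, retention time, and known modifications when assigning product-related signals.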

Submitting mass spectra from low-abundance signals to PSM algorithms also carries the risk of failing to identify HCPs (false negatives). In the second approach, we developed a strategy to mitigate this risk. Potential HCPs were identified by PSM search of samples with enriched HCP content, and the information on the respective peptides was then used by the software as identification criteria for any other samples. In particular, the retention time and m/z coordinates of known impurities were stored in a dedicated knowledge base and used to match the respective signals in single-stage MS data. Such libraries can be used in combination to annotate signals in downstream samples and allow proper identification even when no fragmentation data are available.
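The library-matching idea can be sketched in Python as follows. This is an illustrative assumption, not the software's actual method: the `LibraryEntry` and `Ms1Feature` classes, the `annotate` function, the tolerances (10 ppm, 0.5 min), the scoring rule, and all protein names, peptide labels, and coordinates are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class LibraryEntry:
    """A known HCP peptide stored in the knowledge base (hypothetical schema)."""
    protein: str
    peptide: str
    mz: float
    rt_min: float


@dataclass(frozen=True)
class Ms1Feature:
    """A feature detected in single-stage MS data, without fragmentation."""
    mz: float
    rt_min: float
    intensity: float


def annotate(
    feature: Ms1Feature,
    library: Sequence[LibraryEntry],
    tol_ppm: float = 10.0,    # assumed m/z tolerance
    tol_rt_min: float = 0.5,  # assumed retention time tolerance
) -> Optional[LibraryEntry]:
    """Return the closest library entry within both tolerances, or None."""
    best, best_score = None, None
    for entry in library:
        ppm = abs(feature.mz - entry.mz) / entry.mz * 1e6
        delta_rt = abs(feature.rt_min - entry.rt_min)
        if ppm <= tol_ppm and delta_rt <= tol_rt_min:
            # Simple normalized distance; lower is better (an assumed scoring rule).
            score = ppm / tol_ppm + delta_rt / tol_rt_min
            if best_score is None or score < best_score:
                best, best_score = entry, score
    return best


# Made-up library and features, for illustration only.
library = [
    LibraryEntry(protein="HCP_A", peptide="PEPTIDE_A", mz=650.3200, rt_min=22.4),
    LibraryEntry(protein="HCP_B", peptide="PEPTIDE_B", mz=650.3210, rt_min=41.0),
]
features = [
    Ms1Feature(mz=650.3203, rt_min=22.5, intensity=8.0e3),
    Ms1Feature(mz=650.3209, rt_min=35.0, intensity=6.0e3),
]

annotations = [annotate(f, library) for f in features]
```

The first feature matches `HCP_A` on both coordinates. The second is close in m/z to both entries but falls outside the retention time window of each, so it stays unannotated. Requiring both coordinates to agree is what lets the two near-isobaric entries be told apart without fragmentation data.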


