
I. Introduction
Recombinant proteins are genetically engineered biomolecules expressed in heterologous hosts. They are widely applied in biomedicine, vaccine development, diagnostic reagents, industrial enzymes and other fields, and have promising market prospects. Their standard production workflow covers host expression, downstream purification, storage and transportation. Throughout this workflow, however, recombinant proteins commonly suffer from instability in the form of fragmentation and degradation, including peptide bond cleavage, side-chain modification, aggregate dissociation and domain separation. These problems impair biological activity and product yield, introduce potential safety risks and increase production costs.
In biomedical research and industrial manufacturing, fragmentation and degradation of recombinant proteins mainly manifest as cleavage at the N-terminus, middle region or C-terminus, as well as aggregate dissociation, eventually leading to functional inactivation and reduced production efficiency. This paper reviews the molecular mechanisms and typical manifestations of these defects and their impact on protein function and yield, and summarizes current mainstream solutions and optimization strategies, with the aim of providing a theoretical reference for stable, high-efficiency recombinant protein production.
II. Mechanisms and Root Causes of Recombinant Protein Fragmentation and Degradation
2.1 Basic Mechanisms
During expression, recombinant proteins are vulnerable to non-specific hydrolysis by endogenous proteases, especially when folding is incomplete or expression levels are excessive. Aberrant post-translational modifications such as oxidation, incorrect glycosylation and deamidation can disrupt the native conformation and trigger degradation. Physical stresses including shear force, high temperature and abrupt pH fluctuation may directly induce fragmentation, denaturation and aggregation. In addition, flexible regions and repetitive sequences are structurally fragile sites that predispose a protein to multimer dissociation and domain separation.
2.2 Intrinsic Factors
From the perspective of primary structure, specific sequence motifs are degradation hotspots: Asp-Pro bonds are prone to hydrolysis, Asn-Gly sites are prone to deamidation, and Met and Cys residues and their neighboring regions are prone to oxidation. Asparagine (Asn) and glutamine (Gln) tend to undergo deamidation, especially in flexible domains or next to glycine residues, which can in turn induce fragmentation. Unstable structures at the N-terminus and C-terminus are also easily recognized and degraded by proteases.
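As an illustration of how the sequence motifs above might be screened in practice, the following minimal Python sketch flags candidate liability sites in a sequence. The pattern table, its labels and the example sequence are assumptions made for illustration; the sketch flags Met and Cys residues themselves rather than their neighboring regions, and a hit indicates only a potential risk, not confirmed degradation.

```python
import re

# Illustrative liability patterns based on the motifs named in the text.
# The dictionary name and labels are hypothetical, not a standard.
LIABILITY_PATTERNS = {
    "Asp-Pro (acid-labile bond)": "DP",
    "Asn-Gly (deamidation hotspot)": "NG",
    "Met (oxidation-prone)": "M",
    "Cys (oxidation / disulfide scrambling)": "C",
}


def scan_liabilities(sequence):
    """Return (1-based position, label) pairs for every motif hit, sorted by position."""
    seq = sequence.upper()
    hits = []
    for label, pattern in LIABILITY_PATTERNS.items():
        # A zero-width lookahead reports overlapping occurrences as well.
        for match in re.finditer(f"(?={pattern})", seq):
            hits.append((match.start() + 1, label))
    return sorted(hits)


if __name__ == "__main__":
    example = "MKTDPLLNGSCA"  # made-up sequence for illustration only
    for position, label in scan_liabilities(example):
        print(position, label)
```

Such a scan is only a first-pass filter: whether a flagged site actually degrades depends on local flexibility and solvent exposure, as discussed in the text.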
In terms of higher-order structure, proteins with low thermal stability (a low melting temperature, Tm, and a small free energy of unfolding, ΔG) tend to lose their three-dimensional conformation under mild external disturbances. Exposed hydrophobic regions readily cause aggregation, while unpaired cysteine residues may form incorrect disulfide bonds or undergo oxidative cross-linking. Furthermore, intrinsically disordered or flexible regions and weak interactions between the subunits of multi-subunit proteins collectively undermine overall structural stability.
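The relationship between a small ΔG and conformational loss can be made concrete with a simple two-state (folded/unfolded) equilibrium model. This model is an assumption introduced here for illustration and is not taken from any specific dataset; under it, the unfolding equilibrium constant is K = exp(-ΔG/RT) and the unfolded fraction is K/(1 + K). The ΔG values in the loop are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fraction_unfolded(delta_g_unfold_kcal, temp_kelvin=298.15):
    """Two-state model: equilibrium fraction of unfolded molecules.

    delta_g_unfold_kcal is the free energy of unfolding in kcal/mol
    (a larger positive value means a more stable fold).
    """
    k_unfold = math.exp(-delta_g_unfold_kcal / (R_KCAL * temp_kelvin))
    return k_unfold / (1.0 + k_unfold)


if __name__ == "__main__":
    for dg in (2.0, 5.0, 10.0):  # hypothetical values, kcal/mol
        print(f"dG = {dg:4.1f} kcal/mol -> unfolded fraction = {fraction_unfolded(dg):.2e}")
```

Under this simplified model, a marginally stable protein always has a measurable unfolded population at equilibrium, and it is this population that is most exposed to proteases and aggregation.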
2.3 Extrinsic Factors
External environmental and process stresses are classified into physical, chemical and biological categories. Physically, operations such as stirring, pumping, filtration and freeze-thaw cycles give rise to mechanical fragmentation and partial denaturation; contact at gas-liquid and solid-liquid interfaces intensifies interfacial stress. Sharp temperature changes, lyophilization and reconstitution also trigger conformational alterations of target proteins.
Chemically, extreme pH conditions accelerate hydrolysis and deamidation reactions; dissolved oxygen, metal ions and light-induced free radicals cause oxidation of Cys, Met, Trp and Tyr residues. Strongly reducing environments may destroy functionally critical disulfide bonds. Improper dosing of detergents and denaturants, the Maillard reaction and β-elimination are also common triggers of chemical degradation.
Biologically, trace amounts of residual host proteases can continuously degrade the target protein, and microbial contamination poses a further threat to product stability.
2.4 Expression System-related Factors
Different expression systems exert distinct effects on recombinant protein stability. As a typical prokaryotic expression host, Escherichia coli enables rapid protein expression but lacks complete eukaryotic post-translational modification systems, which frequently results in abnormal folding, inclusion body formation and exposure of vulnerable degradation sites.
In contrast, eukaryotic expression systems including CHO cells and yeast generally have lower expression efficiency, yet they are capable of native-like glycosylation, disulfide bond formation and correct protein folding, and thus better maintain structural stability. Moreover, the promoter strength of the expression vector determines the expression level, while the type and insertion position of fusion tags may introduce extra flexible sequences and protease recognition sites. Host cellular stress responses and the endogenous protease background also affect degradation risk. All of these factors require comprehensive evaluation and targeted optimization of the expression strategy.
III. Core Strategies to Alleviate Recombinant Protein Fragmentation and Degradation
To enhance the stability of recombinant proteins during expression, purification and storage, coordinated optimization is needed across several dimensions: molecular design, host engineering, expression and purification processes, and final formulation development.
3.1 Molecular Design and Host Strain Optimization
Rational protein engineering is widely adopted to improve structural stability at the molecular level. Site-directed mutagenesis can eliminate unstable and degradation-prone amino acid residues: replacing asparagine with glutamine reduces deamidation tendency, while substituting methionine with stable hydrophobic amino acids mitigates oxidative damage. Stabilizing mutations can be introduced to form additional hydrogen bonds, salt bridges and hydrophobic interactions, so as to elevate thermal stability and folding efficiency.
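The substitutions described above can be represented in silico before any cloning is attempted. The sketch below applies point substitutions written in the common "original residue, position, new residue" notation (for example N8Q) and checks that the stated original residue is actually present. The function name and the example sequence are hypothetical and serve only to illustrate the bookkeeping.

```python
import re


def apply_substitutions(sequence, substitutions):
    """Apply point substitutions such as 'N8Q' (original, 1-based position, new residue)."""
    residues = list(sequence.upper())
    for code in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code.upper())
        if match is None:
            raise ValueError(f"Cannot parse substitution: {code!r}")
        original, position, replacement = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"Position out of range in {code!r}")
        # Guard against numbering mistakes before changing anything.
        if residues[position - 1] != original:
            raise ValueError(
                f"{code!r} expects {original} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


if __name__ == "__main__":
    parent = "GSTDPLMNGSCA"  # made-up sequence for illustration only
    # Asn -> Gln to reduce deamidation, Met -> Leu to reduce oxidation.
    print(apply_substitutions(parent, ["N8Q", "M7L"]))
```

Whether a given substitution preserves activity and folding still has to be verified experimentally; the sketch only keeps track of the sequence change.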
Fusion tags such as SUMO, Trx, GST, MBP and Fc fragments can improve protein solubility and resistance to degradation. Terminal modifications including N-terminal acetylation and C-terminal amidation can stabilize otherwise unstable termini. In addition, protein cyclization has been reported to significantly enhance resistance to proteolysis and mechanical stress.
Host strain engineering is equally essential. Using protease-deficient E. coli strains and engineered yeast strains with low endogenous protease activity effectively diminishes non-specific proteolytic degradation. Co-expression of molecular chaperones such as GroEL/ES and DnaK/DnaJ facilitates correct protein folding and reduces the aggregation and fragmentation caused by misfolding. Co-expression of foldases such as protein disulfide isomerase promotes formation of the native conformation and maintains structural integrity. Codon usage optimization tunes translation efficiency, alleviates the cellular expression burden and minimizes misfolding-related degradation.
3.2 Optimization of Expression and Purification Processes
Cultivation and induction conditions profoundly influence final protein quality. Low-temperature induction slows down protein synthesis rate and suppresses misfolding and aggregation. Rational regulation of inducer concentration, induction duration, as well as optimal matching of carbon sources, nitrogen sources and trace elements in culture media greatly improve expression quality. Mild cell lysis methods and streamlined operational procedures shorten the exposure time of free proteins to avoid in-process degradation.
Protease inhibitors can be added during purification to suppress proteolysis, provided they are thoroughly removed afterwards so that no residue affects the finished product. Avoiding extreme pH and high-salt environments, selecting chromatographic media that minimize non-specific binding, and adopting mild elution conditions all help to retain the native protein conformation.
3.3 Formulation Development and Storage Condition Control
Well-chosen buffer systems and stabilizers are the core of formulation design. Buffers that hold the pH stable at the target value are preferred, to prevent acid- or base-catalyzed hydrolysis and aggregation induced by charge imbalance. Saccharides and polyols such as sucrose, trehalose, sorbitol and glycerol stabilize protein structure through the preferential hydration effect. Amino acid additives including glycine, arginine, proline and methionine provide anti-aggregation and anti-oxidation effects; for example, arginine suppresses aggregation and methionine acts as a sacrificial antioxidant.
Surfactants such as polysorbates reduce the conformational changes induced by interfacial adsorption. Antioxidants such as ascorbic acid scavenge free radicals, while chelators such as EDTA limit oxidative damage by sequestering the metal ions that catalyze oxidation. Macromolecular additives including albumin and PEG provide steric protection that inhibits protein aggregation.
The choice of dosage form directly determines long-term storage stability. Liquid formulations require carefully designed compositions and low-temperature storage; lyophilized preparations extend shelf life substantially by removing moisture, provided that the lyoprotectants and lyophilization parameters are optimized. Solid formulations such as spray-dried powders also show great potential for large-scale industrial production. During storage and transportation, standardized practices including low-temperature storage, protection from light, inert gas protection, aliquoting and avoidance of vibration should be strictly followed to minimize degradation and fragmentation before use.
IV. Analytical and Detection Technologies
Accurate identification of degradation characteristics and cleavage mechanisms is indispensable for recombinant protein research and quality control. Conventional methods for detecting degradation products include SDS-PAGE, capillary electrophoresis and a range of liquid chromatography techniques, namely size-exclusion, ion-exchange and reversed-phase chromatography. These approaches enable effective separation and preliminary identification of degraded fragments, aggregates and protein isoforms. Peptide mapping coupled with LC-MS/MS allows precise localization and quantitative analysis of protein cleavage sites.
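To illustrate how mass spectrometry data can be related to a candidate cleavage site, the following minimal sketch computes the theoretical monoisotopic masses of the two fragments produced by hydrolysis of a single peptide bond. It assumes an unmodified linear chain (no glycans, disulfide bonds or other modifications); the function names and the example sequence are hypothetical. Hydrolysis adds one water molecule, so the two fragment masses sum to the intact mass plus water.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # monoisotopic mass of H2O (Da)


def peptide_mass(peptide):
    """Monoisotopic neutral mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide.upper()) + WATER


def cleavage_fragments(sequence, bond_after):
    """Fragments from hydrolysis of the bond after residue `bond_after` (1-based).

    Returns ((n_term_seq, mass), (c_term_seq, mass)).
    """
    if not 1 <= bond_after < len(sequence):
        raise ValueError("bond_after must lie between 1 and len(sequence) - 1")
    n_term, c_term = sequence[:bond_after], sequence[bond_after:]
    return (n_term, peptide_mass(n_term)), (c_term, peptide_mass(c_term))


if __name__ == "__main__":
    example = "GSTDPLMNGSCA"  # made-up sequence; cleave the Asp4-Pro5 bond
    for fragment, mass in cleavage_fragments(example, 4):
        print(f"{fragment}: {mass:.4f} Da")
```

Comparing such theoretical masses with observed intact-mass or peptide-mapping data is one way to shortlist candidate cleavage sites, which MS/MS fragmentation can then confirm.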
In terms of stability evaluation, circular dichroism spectroscopy and fluorescence spectroscopy monitor changes in secondary structures and exposed hydrophobic domains; differential scanning calorimetry assesses protein thermal stability; dynamic and static light scattering characterize aggregate formation and particle size variation. Fourier transform infrared spectroscopy is applicable to conformational transition detection, especially for solid and lyophilized protein samples. For systematic stability assessment, forced degradation studies are commonly performed by exposing proteins to high temperature, extreme pH, oxidative environment and repeated freeze-thaw treatments, to simulate potential degradation pathways during production, logistics and application.
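Data from forced degradation or storage studies are often summarized as a degradation rate. As a hedged illustration, the sketch below fits a first-order decay model, ln f = ln f0 - k·t, to the intact fraction f measured over time (for example the monomer fraction from size-exclusion chromatography). The first-order assumption, the function names and the data points are all hypothetical; real degradation may follow more complex kinetics.

```python
import math


def fit_first_order(times, fractions_intact):
    """Least-squares fit of ln(f) = ln(f0) - k*t; returns (k, f0)."""
    if len(times) != len(fractions_intact) or len(times) < 2:
        raise ValueError("need at least two paired observations")
    if any(f <= 0 for f in fractions_intact):
        raise ValueError("fractions must be positive")
    ys = [math.log(f) for f in fractions_intact]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return -slope, math.exp(intercept)


def half_life(rate_constant):
    """Half-life of a first-order process with rate constant k > 0."""
    return math.log(2) / rate_constant


if __name__ == "__main__":
    days = [0, 7, 14, 28]               # hypothetical sampling times
    intact = [0.99, 0.95, 0.91, 0.84]   # hypothetical intact fractions
    k, f0 = fit_first_order(days, intact)
    print(f"k = {k:.4f} per day, half-life = {half_life(k):.0f} days")
```

Rate constants obtained under different stress conditions can then be compared to rank formulations or to identify the dominant degradation pathway.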
V. Conclusion
Fragmentation and degradation are the main stability bottlenecks of recombinant proteins. They originate from intrinsic sequence and structural defects, combined with the adverse effects of expression systems, manufacturing processes and external environmental conditions. Clarifying degradation mechanisms and cleavage pathways, and establishing targeted engineering solutions, are therefore core priorities for improving the quality of recombinant protein products.
Currently, strategies covering protein engineering, host cell engineering, purification process improvement and formulation optimization have proved effective in the development of multiple recombinant protein products. Meanwhile, advances in mass spectrometry, thermal analysis and spectroscopic structural analysis enable dynamic monitoring and accurate quantitative analysis of protein degradation, laying a solid foundation for large-scale, stable and high-quality industrial production of recombinant proteins.