Predicting Antibody Properties & Developability

Introduction

Early developability assessment is now a routine part of antibody discovery in industry, aimed at identifying candidates with the lowest CMC (Chemistry, Manufacturing, and Controls) risks before costly development. In practice, this means evaluating antibody leads in silico for potential liabilities in stability, manufacturability, safety, and efficacy. By triaging problematic sequences early, companies avoid late-stage failures – skipping this step has proven to carry huge financial risks.

This guide is written for industry practitioners and focuses on practical applications of computational tools to predict key developability properties of all types of antibodies (therapeutic, diagnostic, and research reagents). We assume readers are familiar with antibody basics, and we emphasize actionable outcomes over theory. Each section below covers a critical property (e.g. aggregation, immunogenicity), why it matters, how it’s predicted computationally, and how to integrate those predictions into the pipeline. Open-source and free tools are highlighted (with links), alongside a few widely used commercial platforms, and we include real-world examples, metrics, and validation references wherever available.

Key Developability Properties and Their Prediction

Aggregation Propensity

Aggregation is a major developability concern for antibodies – aggregates can reduce efficacy, trigger immunogenic reactions, and complicate manufacturing. Antibodies that aggregate easily may fail formulation stability studies or resist concentration to the levels needed for delivery (e.g. high-dose injections). To mitigate this, developers aim to predict and avoid sequences with high aggregation propensity.

Computationally, aggregation can be predicted from either the amino acid sequence or the 3D structure (if available). Early “first-generation” algorithms focused on sequence motifs prone to amyloid-like aggregation. For example, TANGO (2004) scans for β-sheet aggregation nucleating regions in the sequence, and Aggrescan (2007) calculates per-residue aggregation “hotspots” from sequence hydrophobicity patterns. These tools flag stretches of hydrophobic or beta-prone residues that could drive self-association.

Newer methods incorporate structural context and machine learning for more accurate risk assessment. For instance, Aggrescan3D takes an antibody’s structure (modeled or experimental) and computes an aggregation score per residue, accounting for solvent exposure and contact topology. Machine-learning approaches have also been developed: one study trained an ensemble of ML models on experimental aggregation data (including hydrophobic interaction chromatography retention times) to predict an overall aggregation risk from sequence. These models combine multiple sequence-derived features (hydrophobic patches, charge clusters, etc.) into a single risk score.

In practice, companies often use a pipeline of these tools. A typical approach is to run a sequence-based aggregation predictor (fast and high-throughput) on many candidates to filter out the worst liabilities, then apply a more detailed structure-based analysis on the top leads. It’s important to cross-verify with multiple algorithms because each uses different criteria (one might focus on amyloid β-strands, another on general hydrophobicity). By using orthogonal methods, developers reduce method-specific bias in identifying aggregation-prone candidates. Any antibody flagged for high aggregation propensity should be considered for engineering (e.g. point mutations in CDRs to reduce hydrophobic patches) or possibly deprioritized if better alternatives exist. Published case studies have shown that addressing predicted aggregation hotspots (for example by inserting a glycosylation site near an aggregation-prone region or by substituting problematic residues) can dramatically improve an antibody’s solubility and stability without harming its binding function.
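To make the sequence-level first pass concrete, the sketch below flags hydrophobic windows using a simple Kyte–Doolittle sliding average. It is a toy proxy for illustration only, not a reimplementation of TANGO or Aggrescan; the 7-residue window, the cutoff of 1.5, and the example sequence are arbitrary assumptions.

```python
# Toy first-pass scan: flag windows of high mean Kyte-Doolittle hydropathy.
# Illustrative proxy only; real pipelines use dedicated aggregation predictors.
KD = {  # Kyte-Doolittle hydropathy scale
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}

def hydrophobic_windows(seq: str, window: int = 7, cutoff: float = 1.5):
    """Return (start, window_sequence, mean_hydropathy) for windows above cutoff."""
    hits = []
    for i in range(len(seq) - window + 1):
        sub = seq[i : i + window]
        score = sum(KD[aa] for aa in sub) / window
        if score >= cutoff:
            hits.append((i + 1, sub, round(score, 2)))  # 1-based positions
    return hits

# Example scan of a hypothetical heavy-chain stretch spanning CDR-H3
print(hydrophobic_windows("ARDLLWFGELLYYYGMDVWGQGTTVTVSS"))
```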

Tools: Open-source: Aggrescan, TANGO, Waltz (amyloid motif finder), Aggrescan3D (A3D), PASTA (β-sheet predictor) – all have free web servers. Therapeutic Antibody Profiler (TAP) includes an aggregation-related metric (hydrophobic patch score; see General Developability section), and academic models like the SVM ensemble by Jain et al. (2017) predict HIC retention (a proxy for aggregation) directly from sequence. We summarize a more thorough list of tools for aggregation in the table below.

Solubility

Solubility is closely intertwined with aggregation. An antibody with poor solubility tends to precipitate or form aggregates in solution, especially under stress or high concentration. Low solubility can manifest as difficulties in formulation (protein falling out of solution) or low expression yield (if the protein aggregates inside cells). Therefore, predicting solubility helps prioritize antibodies that are easier to formulate and manufacture.

Computational solubility prediction typically relies on sequence features. For example, the Protein-Sol web tool from the University of Manchester predicts a protein’s soluble expression likelihood from just the sequence. It uses a combination of factors like overall hydrophobicity, charge, and known solubility-enhancing motifs to output a solubility score and a percentile relative to a reference set. Another popular tool is CamSol (the Cambridge solubility tool), which provides both sequence-based and structure-based solubility scores. The CamSol Intrinsic algorithm computes an intrinsic solubility profile along the sequence and identifies regions likely to cause aggregation or poor solubility. Meanwhile, CamSol Structural (if a 3D model is provided) refines this by accounting for the proximity of residues in the folded antibody and their exposure on the surface. CamSol can even suggest mutations to improve solubility (“CamSol Design”) by optimizing the balance of hydrophobic and hydrophilic residues.

For antibodies, solubility issues often come from the variable domains (VH/VL) because these regions can have hydrophobic paratopes or unusual loops. Some empirically derived guidelines recommend avoiding large hydrophobic patches on the surface. In silico, after obtaining a solubility score, a practitioner might compare it against typical values from known developable antibodies (e.g. if the tool provides a percentile). If an antibody scores particularly low (predicted poorly soluble), one might introduce conservative mutations outside the binding interface to raise solubility – for example, substituting a surface-exposed leucine with threonine in a non-critical framework position. Tools like Solubis can automate this by identifying mutations that reduce aggregation propensity without affecting the structure.

It’s worth noting that solubility prediction is not an exact science – it provides a relative risk indicator. Thus, in a pipeline, solubility scores are used in combination with other metrics: e.g. “remove candidates that are below a solubility score threshold or in the bottom X% of the library.” Those that pass in silico filters should still be confirmed with lab tests (small-scale expression and solubility assays). In fact, a best practice is to experimentally measure solubility or aggregation for a panel of leads to validate the computational ranking and refine the models if needed.
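A minimal sketch of the “bottom X%” filter described above is shown below. It assumes you already have a mapping of candidate names to solubility scores from whichever predictor you use; the tool, the example scores, and the 20% cutoff are all placeholders.

```python
def drop_bottom_percentile(scores: dict[str, float], bottom_fraction: float = 0.2):
    """Keep candidates above the bottom X% of predicted solubility scores."""
    ranked = sorted(scores, key=scores.get)            # lowest solubility first
    n_drop = int(len(ranked) * bottom_fraction)
    dropped = set(ranked[:n_drop])
    return {name: s for name, s in scores.items() if name not in dropped}

# Hypothetical predictor output (higher = more soluble)
scores = {"mAb-01": 0.71, "mAb-02": 0.12, "mAb-03": 0.55, "mAb-04": 0.33, "mAb-05": 0.90}
print(drop_bottom_percentile(scores))  # mAb-02 removed at a 20% cutoff
```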

Tools: Open-source: Protein-Sol (sequence-based solubility predictor), CamSol (requires free registration; sequence and structure modes available), SOLpro (from the SCRATCH suite, sequence-based) and SOLart (structure-based solubility predictor). Commercial: Various antibody engineering platforms incorporate solubility predictions; e.g., Thermo Fisher and Sartorius offer solubility optimization as part of antibody humanization packages (often using underlying open algorithms like CamSol under license). In the open domain, DeepSol (a sequence-based deep-learning solubility predictor) and PON-Sol (which predicts solubility changes upon mutation) are recent ML-based tools; these can be used to screen potential CDR mutations for improved solubility.

Viscosity and Self-Interaction

High solution viscosity is another practical developability problem, especially for therapeutic antibodies that must be formulated at high concentrations (100+ mg/mL) for subcutaneous injection. Viscosity is related to how readily antibody molecules interact transiently (self-associate) in solution. An antibody with strong attractive self-interactions can exhibit non-Newtonian, highly viscous behavior, making it hard to fill syringes or administer to patients. Predicting viscosity in silico is challenging, but one known contributor is the presence of large electrostatic patches on the antibody surface that can cause “charge-driven” self-association.

Figure: Spatial Charge Map (SCM) highlighting an electrostatic patch on an antibody Fv surface (the highlighted region indicates a large negatively charged patch). Such surface charge asymmetry has been correlated with high solution viscosity in concentrated mAb formulations; practitioners aim to avoid antibodies with extreme surface charge clusters.

One computational method addressing viscosity is the Spatial Charge Map (SCM) developed by Agrawal et al. (2016). SCM quantifies the size of the largest contiguous patch of like-charged surface on the Fv region. For example, an Fv might have a large negatively charged patch (many acidic residues clustered) – SCM will output a high score for that, which was shown to correlate with higher viscosity in concentrated solutions. Antibodies with more balanced charge distribution (no huge patches) tend to have lower viscosity. To use SCM in practice, one needs either an antibody structure or at least a model of the Fv. In a pipeline, after modeling the structure (e.g., via homology modeling or AlphaFold), one can calculate SCM or similar electrostatic patch metrics. If a candidate shows an extreme value (e.g., an unusually large charged patch outside the range seen in approved mAbs), it’s flagged for potential high-viscosity risk. Engineering strategies to reduce viscosity include mutating a few surface residues to reduce the patch (for instance, substituting charged residues with more neutral ones in solvent-exposed, non-critical positions).

A complementary and increasingly practical approach is DeepSP, a deep-learning surrogate that directly predicts spatial properties such as SCM and SAP (spatial aggregation propensity) from sequence, eliminating the need for MD-based electrostatics on a 3D model. DeepSP outputs ~30 spatial descriptors for variable regions (including CDR-level SCM/SAP) and was shown to reproduce MD-derived patch metrics with good fidelity, enabling early, high-throughput viscosity-risk screening from sequence alone. It is available as an open-source implementation and via a web interface.

Apart from SCM/DeepSP-style patch analysis, other predictive surrogates for viscosity include B22 (second virial coefficient) and kD (diffusion interaction parameter), which can be estimated from sequence or structure using coarse models of protein–protein interaction. Some teams use tools that predict self-interaction propensity – for example, spatial B-factor hot spots or docking two identical Fv surfaces to see if they have a strong binding interface (a method sometimes used to predict colloidal stability). These approaches are less standardized, however. Notably, DeepSP-derived features have also been used to train downstream classifiers for high vs. low viscosity at high concentration (e.g., DeepViscosity), further streamlining screening in discovery.

Computational viscosity prediction usually boils down to identifying extreme outliers in molecular surface properties. Industry practitioners often set empirical cut-offs: e.g., “no large contiguous patch of >X surface charge or hydrophobicity” as part of library design. Indeed, the Therapeutic Antibody Profiler (TAP) includes a “surface charge asymmetry” guideline that helps catch antibodies likely to have formulation issues. Antibodies falling outside the normal range for these parameters (derived from clinical-stage mAbs) would be deprioritized for therapeutic use. Diagnostic or research antibodies, which are used at lower concentrations, might tolerate a bit more variability in this area, but generally it’s best to avoid highly viscous clones even for manufacturing reasons.
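Once per-candidate descriptors are in hand (e.g., from a DeepSP run or an in-house SCM calculation), the outlier flagging described above is a one-liner; the sketch below uses pandas, and the column names, example values, and cutoff are illustrative assumptions rather than any tool’s actual output schema.

```python
import pandas as pd

# Hypothetical per-candidate descriptor table; column names and cutoff are placeholders.
df = pd.DataFrame({
    "antibody": ["mAb-01", "mAb-02", "mAb-03"],
    "scm_fv":   [850.0, 1210.0, 640.0],   # spatial charge map score for the Fv
})

SCM_CUTOFF = 1000.0  # empirical threshold, e.g. derived from an internal benchmark set
df["viscosity_risk_flag"] = df["scm_fv"] > SCM_CUTOFF
print(df)
```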

Tools: Open-source/Web: DeepSP – sequence-only surrogate model that predicts SCM/SAP and related spatial descriptors (~30) for antibody variable regions; available as open source with a hosted web app (also integrated on Tamarind Bio) and useful for rapid, pre-structural viscosity risk triage. SCM – described in the literature (Agrawal et al., mAbs 2016) with academic implementations; it can be reproduced with electrostatics software (e.g., APBS via PyMOL) to map charge and quantify patches. Developability Index – includes charge/colloidal-stability-related components.

Stability and Chemical Liabilities

Stability refers here to the antibody’s resistance to unfolding or degrading under stress (thermal, chemical, or mechanical). A stable antibody is less likely to denature at physiological temperature, less prone to aggregate upon slight perturbations, and generally has a higher melting temperature (T_m). In early developability assessments, teams look for antibodies that maintain their structure and function under various conditions (pH changes, freeze-thaw, etc.). Computational prediction of absolute stability (e.g. exact T_m) is difficult, but relative comparisons and identification of unstable regions are feasible.

One approach is to analyze the antibody sequence for framework robustness. Antibodies with very long or cysteine-rich CDR loops, or unusually hydrophobic frameworks, may have a less stable fold. In fact, a study by Seeliger et al. outlined guidelines for stability: (1) avoid motifs prone to proteolysis (e.g. certain dipeptide sequences), (2) ensure the antibody has high intrinsic thermodynamic stability (often correlated with more intra-domain interactions and proper disulfide pairing), (3) avoid large hydrophobic or charged surface patches (they can destabilize or cause aggregation), and (4) avoid extended β-strand motifs that can trigger aggregation. These rules feed into design – for instance, if a particular VL framework is known to have lower stability, one might choose a different human germline framework during humanization.

On the computational side, if a 3D structure is available or modeled, tools like ThermoMPNN, Rosetta, or FoldX can be used to estimate stability. For example, Rosetta’s ddG or cartesian_ddG protocols allow you to mutate residues in silico and predict changes in stability (ΔΔG). While these are typically used for point mutations, one can also obtain an overall stability energy for the antibody model. A stable antibody is expected to have a well-packed core and favorable energy. If Rosetta flags a particular CDR loop as dramatically destabilizing (e.g. positive energy contributions or many unsatisfied hydrogen bonds), that region might be a target for engineering to improve stability.

Another set of tools focuses on thermal stability prediction for specific formats like single-domain antibodies (VHHs). For instance, TEMPRO and NanoMelt are predictors of nanobody melting temperature that use sequence features to estimate T_m (NanoMelt is available from the Sormanni group). They consider factors like framework mutations and CDR composition known to affect VHH stability. While these are specific to VHHs, similar models could be trained for IgG domains given enough data.

In addition to physical stability, antibodies are subject to chemical liabilities – sequence motifs that can undergo modification or degradation. These include: Asn deamidation, Asp isomerization, Met/Trp oxidation, unpaired cysteines (which enable thiol-mediated disulfide scrambling), and unintended glycosylation sites in the variable region. Such modifications can reduce stability and potency (e.g. deamidation in a CDR can knock out binding) and may create heterogeneity in the product. It’s crucial to identify these liabilities in silico so they can be engineered out early.

Common in silico checks for chemical liabilities are straightforward pattern searches. For example, Asn-Gly motifs (NG) are well-known hot spots for deamidation. If an NG appears in a CDR loop, that antibody is likely to undergo deamidation over time, potentially losing activity. Similarly, Asp-Gly (DG) motifs are a risk for isoAsp formation (Asp can rearrange to isoaspartate). Methionine residues in exposed areas (especially in CDRs or Fab regions) are flagged for oxidation risk. If an antibody has a Met in a binding loop, one might consider mutating it to leucine or another oxidation-resistant residue, unless it’s critical for binding. Unpaired cysteines are a red flag – by default, an IgG should have an even number of cysteines forming disulfides. Any extra cysteine (e.g. engineered for conjugation or a mistake in sequencing) can cause mispairing or aggregation via intermolecular disulfides. Glycosylation motifs (N-X-[S/T]) in the variable domain are another liability – antibodies are glycosylated at the conserved Fc Asn-297, but if a new N-X-S/T appears in VH or VL, a fraction of produced antibodies will carry a sugar there, leading to heterogeneity. This is usually avoided in designs.
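These motif checks are simple enough to script; a minimal regex-based sketch is shown below. The motif list is illustrative, not exhaustive, and the exposure/context judgment discussed next still requires a structure.

```python
import re

# Common sequence-liability motifs (illustrative, not exhaustive).
LIABILITY_PATTERNS = {
    "deamidation (N[G/S/T])": r"N[GST]",
    "isomerization (D[G/S])": r"D[GS]",
    "N-glycosylation (N-X-S/T, X != P)": r"N[^P][ST]",
    "oxidation-prone Met": r"M",
    "oxidation-prone Trp": r"W",
}

def scan_liabilities(seq: str):
    hits = []
    for name, pattern in LIABILITY_PATTERNS.items():
        for m in re.finditer(pattern, seq):
            hits.append((name, m.start() + 1, m.group()))  # 1-based position
    # Crude unpaired-cysteine heuristic: an odd Cys count suggests a free thiol
    if seq.count("C") % 2 == 1:
        hits.append(("possible unpaired cysteine", None, "odd Cys count"))
    return hits

# Example scan of a hypothetical VH fragment
for hit in scan_liabilities("EVQLVESGGGLVQPGGSLRLSCAASGFNISSYYMH"):
    print(hit)
```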

Many computational tools will automatically scan antibody sequences for these liabilities. For instance, the Tamarind Antibody Annotator (an antibody analysis platform) provides built-in screens for PTMs and chemical liabilities. It will mark residues that are in positions prone to modifications and even indicate if they are solvent-exposed or buried (buried Asn might be less of a deamidation concern than exposed Asn). This kind of contextual info is valuable – e.g., an NG in a buried beta-sheet might deamidate much more slowly (if at all) than an NG on a flexible loop.

In practice, antibodies with multiple liabilities are usually filtered out. If a lead has one or two manageable liabilities, an engineering campaign can be launched: for example, mutate Asn to Gln to prevent deamidation (sacrificing minimal binding if outside the paratope), or mutate a Met to Leu to avoid oxidation. Companies often maintain an internal list of “Do Not Use” sequences – certain motifs known to cause trouble. Developability guidelines by Xu et al. note that candidates should be selected to have low levels of modifications like deamidation, oxidation, etc., since these cause potency loss and heterogeneity. As a concrete example, if you have 100 lead antibodies, you might eliminate any that have an NG in CDRs, or a free cysteine, on the grounds that these are high-risk for later problems.

Tools: Open-source: abYsis – integrates sequence liability scanning. General bioinformatics servers like Expasy’s FindMod can predict PTM sites on a given sequence (though more useful with MS data). MusiteDeep is a deep learning webserver that predicts likely modification sites (phosphorylation, oxidation, etc.) from sequence. Commercial: Many companies use in-house scripts or tools from suite packages (like Dassault Systèmes BIOVIA). Also, services like IgG Toolbox from Cytiva and others include liability analysis. These typically don’t require special 3D modeling – simple sequence parsing covers most known liabilities, supplemented by structure if available to gauge exposure.

Immunogenicity

For therapeutic antibodies, immunogenicity (the propensity to provoke an anti-drug immune response) is a critical safety and efficacy concern. Even for diagnostic antibodies used in vivo (e.g. imaging agents), repeated use can raise immune responses. Immunogenicity arises when patient T-cells recognize peptides from the therapeutic antibody as foreign, leading to anti-drug antibody production. Therefore, in silico immunogenicity risk assessment primarily focuses on identifying T-cell epitopes in the antibody sequence. Secondarily, one can also examine B-cell epitopes on the therapeutic (i.e. regions on the antibody that a patient’s antibodies could bind), but B-cell epitope prediction is much less reliable at present.

T-cell epitope prediction: This involves scanning the antibody’s sequence for peptides that can be presented by human MHC class II molecules (HLA-DR in particular). Tools like TLimmuno/TLimmuno2 (MHC class II immunogenicity) take peptide–HLA-II pairs (13–21-mers) and return a continuous immunogenicity score plus a percentile rank against a human-proteome background; they have been shown to enrich for truly immunogenic CD4+ neoantigens and are a good fit for therapeutic antibody risk (an exogenous protein presented on HLA class II). NetMHCIIpan (from DTU) and the IEDB suite predict binding affinities of all possible 15-mer peptides from the antibody to a panel of common HLA class II alleles. High-affinity binders are potential T-cell epitopes. For example, if your antibody’s heavy chain has a segment that is predicted to strongly bind HLA-DRB1*04:01, it might activate CD4 T-cells in carriers of that allele, leading to an anti-drug immune response. In practice, one would run these tools, get a list of top-scoring peptides, and check whether any come from the antibody variable regions. Many therapeutic antibodies are human or humanized, so they are often relatively low in T-cell epitopes, but CDR loops can sometimes contain sequences not common in the human germline that slip through humanization and could be immunogenic. If an antibody has multiple strong predicted epitopes, especially in CDRs, it’s a candidate for deimmunization – targeted mutation of T-cell epitope residues to reduce MHC binding while preserving binding function.
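Since NetMHCIIpan and the IEDB tools score peptide lists, a typical first step is simply enumerating overlapping 15-mers from the VH/VL sequences. The sketch below shows that enumeration; the sequence is a placeholder, and the downstream submission format depends on the predictor you choose.

```python
def tile_peptides(seq: str, length: int = 15, step: int = 1):
    """Enumerate overlapping peptides for submission to an MHC class II predictor."""
    return [seq[i : i + length] for i in range(0, len(seq) - length + 1, step)]

vh = "EVQLVESGGGLVQPGGSLRLSCAASGFTFSSYAMSWVRQAPGKGLEWVS"  # placeholder VH fragment
peptides = tile_peptides(vh)
print(len(peptides), peptides[:2])
# The resulting list can be written to FASTA/plain text and scored with
# NetMHCIIpan or the IEDB class II tools across a broad HLA allele panel.
```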

Several companies have proprietary algorithms to quantify immunogenicity risk with a single score. These combine predictions across many HLA alleles and adjust for factors like the presence of Treg (regulatory T-cell) epitopes. While not open-source, the concepts are similar: identify problematic peptides and rank the antibody’s overall risk relative to known low-immunogenicity antibodies. For instance, an antibody with a high adjusted epitope content score would be flagged as high-risk and might be modified or dropped. Published validations have shown such scores correlate with clinical immunogenicity incidence for some monoclonal antibodies.

B-cell epitope prediction: This aims to predict regions on the antibody that could be recognized by patient antibodies (i.e. the antibody acting as an antigen). It’s a meta-problem – essentially, could the therapeutic antibody become the target of an immune response? Tools like ElliPro and DiscoTope analyze the antibody’s structure to find protruding or flexible regions that are likely to be immunogenic if the patient’s B-cells see them. In the context of therapeutic antibodies, this often overlaps with simply looking for non-human sequences on the surface. Humanization (grafting CDRs onto a human framework) is largely intended to remove B-cell epitopes by replacing rodent surface residues with human ones. Some newer methods even do “surface resurfacing,” mutating surface patches on the variable domains to human residues without altering the CDRs, in order to reduce B-cell epitope content. B-cell epitope prediction is still an evolving area; it’s generally not as quantitative or trusted as T-cell epitope prediction. Thus, industry practice is heavily weighted toward T-cell epitope avoidance for immunogenicity risk management.

Putting it into practice: In a developability pipeline, immunogenicity assessment might set hard cut-offs for certain projects. For example, if any de novo sequence (like from a synthetic library) has more than a specified number of strong HLA-binding peptides, it could be eliminated or prioritized for humanization. Conversely, antibodies derived from human germline sequences (or from human donors) typically have low T-cell epitope content – one can run the tools to verify that. It’s important to consider population coverage: using a broad set of HLA alleles in prediction ensures that you aren’t missing an epitope that affects a minority population. Tools often provide an “immunogenicity score” or percentile; these should be interpreted cautiously but can be useful for ranking. In all cases, in silico immunogenicity predictions are an aid, not a guarantee – they reduce risk but do not replace in vitro T-cell assays or eventual clinical monitoring. The aim is to de-risk: choose candidates that are as “self-like” as possible. For research or diagnostic antibodies used only in vitro, immunogenicity may not matter at all, so those projects might skip this filter. But for any antibody going into humans repeatedly, it is standard to perform this analysis.

Tools: Open-source/web: IEDB T-cell Epitope Prediction (covers multiple algorithms for MHC II binding), NetMHCIIpan, RANKPEP and ProPred (older, but still available for MHC II epitope scans), TLimmuno/TLimmuno2 (ML-based HLA-II immunogenicity scoring to prioritize binders beyond affinity), and DeepImmuno (deep-learning HLA-I immunogenicity; useful for cross-presentation checks). For B-cell: DiscoTope (discontinuous epitope predictor using structure), ElliPro (IEDB’s linear/discontinuous epitope tool), and AbAdapt/EpiPred (antibody-specific epitope prediction, mostly for mapping antibody→antigen but sometimes usable inversely). Commercial: EpiVax’s ISPRI platform is a widely used suite (EpiMatrix for T-cell, JanusMatrix for cross-reactivity, etc.), often contracted by pharma for immunogenicity “stress tests.” Schrodinger and Thermo Fisher offer immunogenicity analysis as part of their antibody engineering packages as well. In the end, whether using free or commercial tools, the methodology is similar – focus on T-cell epitope mapping and minimize epitopes; when possible, down-select predicted binders with immunogenicity models (TLimmuno for class II, DeepImmuno for class I) before making design changes.

Isoelectric Point and Charge Profile

The isoelectric point (pI) of an antibody is the pH at which its net charge is zero. This property influences how the antibody behaves in different pH environments (e.g. blood at pH ~7.4 or various formulation buffers). Most therapeutic mAbs have pI values in a moderate range; an analysis of FDA-approved and clinical-stage mAbs found their pIs mostly fell between about 6.1 and 9.4. Outliers outside this range are less common, because extreme pI can correlate with developability issues. For instance, an antibody with very high pI (>9) is highly cationic at neutral pH, which can lead to nonspecific binding (to negatively charged membranes or proteins) and faster clearance in vivo. It can also cause the antibody to precipitate near its pI (since solubility is lowest at pI). Conversely, an antibody with very low pI (<6) will be highly anionic at physiological pH, which might be somewhat more tolerable but can still pose issues with certain purification steps (e.g. cation exchange chromatography behavior) and potentially increased aggregation near pH extremes.

Predicting pI is trivial from the sequence – it’s essentially calculated by summing up the pKa contributions of all ionizable residues and finding the pH of net zero charge. There are many free tools (Tamarind Protein Properties, IPC 2.0, ExPASy ProtParam, EMBOSS iep, etc.) that do this calculation. The value is only as good as the pKa assumptions (which for antibodies are usually standard values for Lys, Arg, His, Asp, Glu, etc., not accounting for microenvironment). Still, it provides a useful guide. If you compute an antibody’s pI and get, say, 10.5, that’s a red flag. Such an antibody likely has an unusually high content of basic residues (Lys/Arg) relative to acidic ones. In context, you might examine where those charges are – if a high pI is driven by many Lys in CDRs, that could also mean a large positive charge patch, linking back to the nonspecific-binding and self-association risks discussed earlier. Indeed, TAP’s guidelines explicitly include metrics for patches of positive and negative charge on the surface. A highly positive CDR patch would push pI up and can cause issues like non-specific binding and aggregation.
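A minimal sketch of that calculation is shown below: sum Henderson–Hasselbalch charges over ionizable groups and bisect for the pH of zero net charge. The pKa values are generic textbook numbers (dedicated tools such as IPC 2.0 or ProtParam use refined sets, so absolute values will differ slightly), and the example sequence is a placeholder.

```python
# Generic textbook pKa values; real tools use refined, tool-specific sets.
PKA_POS = {"K": 10.5, "R": 12.5, "H": 6.0, "Nterm": 9.0}            # basic groups
PKA_NEG = {"D": 3.65, "E": 4.25, "C": 8.3, "Y": 10.1, "Cterm": 2.0}  # acidic groups

def net_charge(seq: str, ph: float) -> float:
    """Henderson-Hasselbalch net charge of a sequence at a given pH."""
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_POS["Nterm"]))
    charge -= 1.0 / (1.0 + 10 ** (PKA_NEG["Cterm"] - ph))
    for aa in seq:
        if aa in PKA_POS:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POS[aa]))
        elif aa in PKA_NEG:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEG[aa] - ph))
    return charge

def isoelectric_point(seq: str) -> float:
    """Bisect for the pH where net charge crosses zero (charge decreases with pH)."""
    lo, hi = 0.0, 14.0
    while hi - lo > 0.001:
        mid = (lo + hi) / 2
        if net_charge(seq, mid) > 0:
            lo = mid
        else:
            hi = mid
    return round((lo + hi) / 2, 2)

print(isoelectric_point("EVQLVESGGGLVQPGGSLRLSCAASGFTFSSYAMS"))
```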

In developability screening, pI is often used as a secondary filter. Teams may avoid candidates with extreme theoretical pI unless there’s a strong reason to keep them. Sometimes, antibody libraries are even designed to bias towards a certain pI range. For example, avoiding sequences that are too basic. If one ends up with a clinical antibody with a high pI, formulation scientists might adjust the formulation pH away from the pI to improve stability (e.g. formulate a very basic antibody at pH 6). But it’s simpler to choose a different antibody if possible.

Charge profile analysis goes beyond pI by looking at the distribution of charges. Two antibodies could have the same pI, but one might have charges evenly distributed while another has a large clustered patch of positive charge on one face. The latter is more problematic due to local effects (patches can drive self-interaction or interactions with other proteins). Tools like patch analyzers, or the above-mentioned SCM applied to charge, can identify whether net charge asymmetry is high. TAP’s Structural Fv Charge Symmetry Parameter (SFvCSP) specifically quantifies heavy vs light chain charge imbalance. If one chain is much more positively charged than the other, the antibody might have a dipole that affects its behavior. In general, a well-behaved antibody tends to have a balanced charge distribution and a mid-range pI.
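As a quick, structure-free approximation in the spirit of SFvCSP, one can compare the net formal charge of VH and VL at a chosen pH. The sketch below simply counts charged residues (His treated as roughly neutral at pH 7.4); the sequences are placeholders, and the product of chain charges only loosely mirrors the actual TAP parameter.

```python
def formal_charge(seq: str) -> int:
    """Crude net charge near pH 7.4: Lys/Arg +1, Asp/Glu -1, His ~neutral."""
    return seq.count("K") + seq.count("R") - seq.count("D") - seq.count("E")

vh = "EVQLVESGGGLVQPGGSLRLSCAASGFTFSSYAMSWVRQAPGKGLEWVS"   # placeholder sequences
vl = "DIQMTQSPSSLSASVGDRVTITCRASQSISSYLNWYQQKPGKAPKLLIY"

q_vh, q_vl = formal_charge(vh), formal_charge(vl)
asymmetry = q_vh * q_vl   # product of chain charges underlies TAP's SFvCSP metric
print(q_vh, q_vl, asymmetry)  # values far outside the clinical mAb range warrant a closer look
```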

From an in silico perspective, once you have the sequence, you can easily calculate pI and even a rough “charge distribution” (e.g. count of charged residues in CDR vs framework). It’s good practice to do so for all leads and note any outliers. Those outliers might require additional scrutiny or can be eliminated. For example, if one antibody has a pI of 10 and another similar binder has pI 8, you’d likely prefer the pI 8 candidate for development.

Tools: Open-source: Tamarind Protein Properties, IPC 2.0, ProtParam for pI, EMBOSS iep (command-line), Biopython (has a method to compute pI). The TAP tool automatically calculates pI and highlights charge patches on an uploaded sequence. Also, simple programs/scripts can compute the charge at pH 7 for each region to identify any heavily charged CDR. Commercial: Not many dedicated commercial tools just for pI (since it’s simple), but integrated suites will show you the pI of your antibody and sometimes map the electrostatic potential on the structure.

Manufacturability & Expression

Manufacturability covers a broad range of practical considerations: how easily can the antibody be produced at scale in expression systems (usually mammalian cells), purified, and formulated? Expression is one major facet – some antibody sequences express at very low yields or face secretion bottlenecks. Another facet is how the antibody behaves in bioprocessing: does it fold correctly, dimerize properly (heavy-light chain pairing), and remain stable during purification steps (low pH virus inactivation, ultrafiltration, etc.)? While many of these are tested experimentally (small-scale expression and purification is typically done for lead panels), computational prediction can flag sequences with likely manufacturability issues.

One simple predictor of expression is related to the features we’ve discussed: highly aggregation-prone or unstable antibodies often manifest as low expression yield. If a variable region is misfolded or has a tendency to aggregate intracellularly, the cell’s quality control may degrade it, resulting in poor secretion. Indeed, empirical studies have shown poor biophysical properties (low stability, high aggregation propensity) correlate with lower expression levels in mammalian cell culture. Thus, by eliminating antibodies with those issues (using the earlier sections’ tools), you inherently increase the chances that the remaining candidates will express well.

Nonetheless, there are some specific sequence liabilities directly tied to expression:

  • Signal peptide issues: All antibodies have signal peptides (which are usually standard), so not much variability there in discovery (most use the same leader sequences).

  • Rare motifs affecting secretion: Certain sequences can cause ER retention or secretion issues. For example, an antibody with a free cysteine might form improper disulfides and get retained in the ER.

  • Heavy–Light chain pairing: If an antibody heavy and light chain are extremely mismatched in terms of folding kinetics, one might accumulate unpaired heavy chain (which is usually filtered out by cellular quality control). No direct in silico tool predicts this, but checking that both chains use well-behaved human germline frameworks helps.

  • Codon optimization: While not usually a problem for mammalian systems (since they tolerate most codons well), certain sequences might be suboptimal if expressing in E. coli or yeast. There are tools to optimize codons, but that is generally done after a sequence is chosen (you don’t change the protein, just the DNA sequence).

  • Post-translational processing: An antibody with atypical sequences near the N-terminus might be subject to pyroglutamate formation or signal peptide mis-cleavage, but these are edge cases.

In practice, most companies perform transient expression of a panel of say 10–20 lead antibodies in CHO cells to see which express the best (titer mg/L). Before doing that, many use in silico scoring to predict expression levels. One approach is using solubility predictors (like the ones mentioned) as proxies for expression: a higher predicted solubility often means higher yield in soluble, secreted form. Another is leveraging large datasets of past expression: for example, a company may have trained an internal ML model where input is the antibody sequence and output is the transient expression titer (they have data from hundreds of constructs). Such models might pick up subtle sequence patterns that affect expression. There’s literature suggesting that certain heavy chain CDR compositions can impact expression in mammalian cells (e.g., very hydrophobic heavy CDR3 can sometimes reduce expression, possibly due to self-association during folding). While some bespoke/internal models are not publicly available, the trend is toward data-driven prediction of “expression scores” for antibodies.
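A toy illustration of such a data-driven approach is sketched below using a few crude sequence features and scikit-learn. The features, training data, and model choice are purely illustrative assumptions; real internal models are trained on hundreds of measured titers and richer descriptors.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def crude_features(seq: str) -> list[float]:
    """Toy sequence features: length, fraction hydrophobic, net formal charge."""
    hydrophobic = set("AVILMFWY")
    return [
        len(seq),
        sum(aa in hydrophobic for aa in seq) / len(seq),
        seq.count("K") + seq.count("R") - seq.count("D") - seq.count("E"),
    ]

# Hypothetical training data: (VH fragment, transient CHO titer in mg/L)
train_seqs   = ["EVQLVESGGGLVQPGGSLRLSCAAS", "QVQLQESGPGLVKPSETLSLTCTVS", "EVQLLESGGGLVQPGGSLRLSCAVS"]
train_titers = [120.0, 45.0, 210.0]

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(np.array([crude_features(s) for s in train_seqs]), train_titers)

# Predict an "expression score" for a new candidate sequence
print(model.predict(np.array([crude_features("EVQLVESGGGLVQPGRSLRLSCTAS")])))
```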

Manufacturability also includes how well an antibody can be purified. One metric is how it behaves on chromatography columns. We already discussed HIC retention (related to hydrophobicity) – an antibody that sticks strongly to a HIC column might foul up purification or require harsher conditions to elute, which isn’t ideal. Similarly, behavior on ion exchange columns (related to charge/pI) is considered. Some computational tools predict these: for instance, Jain et al. (2017) predicted HIC retention from sequence, and the same idea extends to using ML for other chromatography modes. If an antibody is predicted to have extremely delayed HIC retention (meaning a very hydrophobic surface), it could indicate purification challenges.

Another manufacturability aspect is self-association in concentrated solutions (ties to viscosity) – which we covered. If an antibody tends to self-associate, it may also precipitate during formulation or filtration. There are experimental high-throughput assays like PEG precipitation or self-interaction nanoparticle spectroscopy (AC-SINS) that measure this, but in silico one can use the same features (surface hydrophobic patches, charge patches) as proxies.

Finally, consider novel formats (bispecifics, fragments, etc.): these can introduce new developability challenges such as chain mispairing or stability issues between domains. Computational prediction for those often builds on the same principles (hydrophobicity, stability, etc.), but may require format-specific adjustments. For example, for an scFv (single-chain Fv), one might specifically watch the hydrophobic interface normally buried between the heavy and light variable domains – in an scFv the domains are tethered, but if that interface becomes exposed through conformational flexibility, aggregation can occur. Tools for fusion-protein stability prediction are emerging, but for now, expert rules and the general toolbox are applied similarly to these formats.

Best Practices for Integrating Predictive Tools into Discovery Pipelines

Having covered each property and available tools, it’s important to outline how to practically integrate these predictions into an antibody discovery and engineering pipeline. Below are some best practices followed in industry:

  • Multi-parameter Screening: Use a panel of in silico predictors covering all key properties rather than relying on a single “developability score.” Each tool catches different issues (e.g. one may catch an aggregation hotspot that another misses). By combining scores and flags from multiple tools, you get a more robust assessment. For instance, an antibody should ideally pass thresholds for solubility, stability, and immunogenicity together to advance. Some groups establish an aggregate scoring system where each antibody is classified (e.g. “green/yellow/red” or numerical rank) based on combined developability metrics; a minimal sketch of such a scorer appears after this list.

  • Stage-wise Filtering: Apply the fastest, lowest-cost predictions early when the funnel is wide, and reserve more intensive analysis for later stages. A common workflow:

    1. Sequence-level filters on large pools: As soon as you have sequence candidates (from panning, B-cell cloning, etc.), run liability scans and basic developability rules. Eliminate any clone with egregious liabilities (NG motifs, free cysteine, etc.) and those far outside normal ranges (e.g. extremely high hydrophobicity or charge metrics). This can cut down hundreds/thousands of hits to a more manageable list.

    2. Refined in silico profiling on leads: Take the top 20–50 leads (typically those with good binding and functional activity from screening assays) and do a deeper computational analysis. This is where you’d use structure modeling (homology or AlphaFold) to evaluate aggregation patches (Aggrescan3D, etc.), compute SCM for charge, run TAP or similar guideline checks, and predict immunogenicity. At this stage, you might rank or eliminate a few more candidates – for example, removing those predicted to have especially low solubility or very high immunogenicity risk.

    3. Experiment and iterate: The remaining top candidates go into experimental developability assays (small-scale expression, thermal stability by DSC or DSF, aggregation by SEC, etc.). Compare these results with the predictions. Usually the predictions align qualitatively (e.g. the one with worst predicted aggregation actually shows more aggregation in SEC). If an antibody performs poorly in reality but was predicted well, investigate why – this can identify gaps in the models and improve future predictions.

  • Context-specific Criteria: Tailor the developability criteria to the project needs. For a chronic therapeutic antibody, you might enforce very strict criteria (no significant liabilities, top-tier scores on all properties) because patient safety and product quality demand it. For a one-time use diagnostic antibody, you might tolerate a mild immunogenicity signal or a somewhat high aggregation score if it greatly outperforms others in function. Always document these justifications. The key is understanding which risks are critical for your use case and focusing predictions accordingly.

  • Use Human Data as Benchmark: Leverage databases of known therapeutic antibodies (like the Thera-SAbDab or internal databases of clinical-stage mAbs). Many tools (TAP, developability index) derive their thresholds from such databases. If your candidate’s properties fall well inside the range of known successful antibodies, that’s a good sign. If it’s an outlier (e.g. much higher hydrophobic patch score than any approved mAb), that’s a red flag. This contextualization can be done manually or via tools like TAP which automatically compare to clinical distributions.

  • Design for Developability: Incorporate developability predictions during antibody engineering, not just after selection. For example, during affinity maturation or humanization, use these tools to guide choices. If a certain mutation would increase affinity but is predicted to introduce an aggregation hotspot, perhaps avoid that mutation and explore alternatives. Modern antibody design frameworks often include a step to eliminate mutations that hurt developability (some groups use AI models to generate only “developable” sequences by constraining generated sequences to those with acceptable in silico profiles).

  • Keep Predictions Up-to-date: The field of computational biologics developability is rapidly evolving. New models (often AI-driven) are emerging that offer improved accuracy. It’s wise to periodically evaluate your toolkit – for instance, a 2025 deep learning model might outperform a 2015 heuristic for aggregation prediction. Wherever possible, validate new tools on a set of antibodies with known developability outcomes (perhaps from your company’s past projects) before fully relying on them.

  • Expert Review and Judgment: Finally, use computational predictions to inform decisions, not make them blindly. Human expertise is crucial. Developers often convene a “developability review” meeting where they look at all the data (in silico scores, lab assays, sequence annotations) for each lead and collectively decide which antibodies advance. There may be cases where an antibody has a slight issue but unique benefits (maybe it’s the only one binding a tough epitope) – the team might carry it forward with a plan to engineer the liability out. The goal is to avoid “unknown unknowns.” In silico tools plus expert interpretation significantly reduce the chance of late-stage surprises by making risks known early, when they are manageable.
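To make the aggregate “traffic-light” classification mentioned under multi-parameter screening concrete, here is a minimal sketch. The property names, thresholds, and green/yellow/red rules are illustrative placeholders; each team calibrates its own against internal assay data.

```python
# Toy aggregate "traffic-light" developability scorer (illustrative thresholds).
THRESHOLDS = {
    # property: (red_if_above, yellow_if_above) -- lower values are better here
    "aggregation_score": (0.8, 0.5),
    "immunogenicity_score": (0.7, 0.4),
    "num_cdr_liabilities": (2, 0),
}

def traffic_light(candidate: dict) -> str:
    """Classify a candidate as green/yellow/red from its worst property."""
    worst = "green"
    for prop, (red, yellow) in THRESHOLDS.items():
        value = candidate[prop]
        if value > red:
            return "red"
        if value > yellow:
            worst = "yellow"
    return worst

mab = {"aggregation_score": 0.6, "immunogenicity_score": 0.3, "num_cdr_liabilities": 0}
print(traffic_light(mab))  # -> "yellow"
```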

With these practices, computational developability predictions become a powerful asset in the pipeline, enabling a rational selection of antibody candidates that are not only potent binders but also manufacturable, stable, and safe.

What is Tamarind Bio?

Tamarind Bio is a platform for 150+ tools in computational protein design, antibody developability, and more. Many of the tools listed here are available on Tamarind, including Aggrescan3D, TAP, Protein properties, and DeepSP. Using our protein scoring tool in Tamarind, you can upload a list of structures or sequences, select from properties to score for, and receive an aggregated CSV with the scores for each antibody. You can also use all Tamarind antibody property prediction tools in pipelines, to create stages and define filters between stages to successively score for more computationally intensive properties. 

Key Developability Prediction Tools

| Tool | Predicted Property | Input | Output Format | Webserver |
| --- | --- | --- | --- | --- |
| Aggrescan | Aggregation hotspots | Sequence | Per-residue aggregation score; overall tendency | app.tamarind.bio/tools/aggrescan3d |
| Aggrescan3D | Aggregation hotspots (structure-aware) | 3D structure (PDB) or model | Colored 3D structure with per-residue aggregation propensity; mutation analysis | app.tamarind.bio/tools/aggrescan3d |
| Therapeutic Antibody Profiler (TAP) | 5 guidelines: CDR length, hydrophobic patches, charge symmetry, etc. | Sequence | Pass/fail per antibody on each guideline; 3D visualization of patches | app.tamarind.bio/tools/tap |
| abYsis | Liabilities, humanness, comparison to germline/sequence databases | Sequence | Reports unusual residues, PTM motifs, surface exposure of liabilities, CDR lengths vs norms | app.tamarind.bio/tools/antibody-annotation |
| Developability Index | | | | |
| Solubis | | | | solubis.switchlab.org |
| ThermoMPNN | Stability (ΔΔG upon mutation) | 3D structure (PDB) or model | Energy scores per residue and possible mutations | app.tamarind.bio/tools/thermompnn |
| Rosetta | Stability (folding ΔG, ΔΔG upon mutation) | 3D structure (PDB) or model | Energy scores per residue | |
| TEMPRO | Nanobody melting temperature | Sequence (VHH) | Single predicted Tm value | |
| CamSol | Solubility upon expression | Sequence | Solubility score (scaled %) and percentile vs database | |
| NetSolP | Solubility upon expression | Sequence | Solubility score (scaled %) | app.tamarind.bio/tools/netsolp |

Ongoing additions

  • Humanization

  • Most important papers in antibody developability prediction & key results

  • Key Immunogenicity Prediction Tools