import React from 'react';
import { Grid, Typography, Link, Paper, Box } from '@mui/material';
import LigandMPNNImage from './LigandMPNN1.png';
import Navigation from '../Navigation';

export const LigandMPNNPost = () => {
  return (
    <Paper elevation={3} sx={{ p: 4, margin: 'auto', maxWidth: 900 }}>
      <Navigation />
      <Typography variant="h4" gutterBottom>
        LigandMPNN: Nucleotide, Small Molecule, and Metal Binding Protein Sequence Design
      </Typography>
      <Box
        component="img"
        src={LigandMPNNImage}
        alt="LigandMPNN Metrics Overview/Comparison"
        sx={{ width: '100%', height: 'auto', my: 2 }}
      />
      <Typography variant="body1" paragraph>
        Protein sequence design is valuable across a wide range of protein design and engineering tasks. Recently, <Link href='proteinmpnn' target='_blank'>ProteinMPNN</Link> demonstrated that machine learning-based methods are the state of the art for producing sequences for computationally designed protein structures. However, ProteinMPNN cannot use the non-protein parts of an input structure. LigandMPNN addresses this by designing protein sequences in the context of small molecules, nucleotides, and metals, which is critical for designing enzymes, small-molecule binders, and sensors. In addition to extending ProteinMPNN's sequence design capabilities, LigandMPNN can also generate sidechain conformations.
      </Typography>
      <Typography variant="body1" paragraph>
        LigandMPNN has been shown, both computationally and experimentally, to significantly outperform its predecessor ProteinMPNN as well as Rosetta in each of the non-protein contexts (small molecule, nucleotide, and metal).
      </Typography>
      <Typography variant="body1" paragraph>
        The authors also show that redesigning existing binders with LigandMPNN can rescue binding or significantly increase binding affinity, giving examples involving designed binders for cholic acid and rocuronium. With the rise of machine learning tools such as LigandMPNN, knowledge-based incumbents such as Rosetta have been unseated for protein sequence design tasks.
      </Typography>
      <Typography variant="body1" paragraph>
        While the authors found the PDB data to be largely representative, one important consideration when applying LigandMPNN is to exercise caution when the relevant ligands contain elements that are rarely found in the PDB.
      </Typography>
      <Typography variant="body1" paragraph>
        Want to try LigandMPNN? Use the Tamarind LigandMPNN interface <Link href="/ligandmpnn">here</Link>, no setup required!
      </Typography>
      <Typography variant="h5" gutterBottom>
        Example Use Cases
      </Typography>
      <Typography variant="body1" component="div">
        <ul>
          <li>
            <Link href='https://www.biorxiv.org/content/10.1101/2023.09.20.558720v1' target="_blank">Glasscock et al.</Link> use LigandMPNN as part of their computational method to design sequence-specific DNA-binding proteins.
          </li>
          <li>
            <Link href='https://www.biorxiv.org/content/10.1101/2023.11.01.565201v1' target="_blank">Lee et al.</Link> use LigandMPNN iteratively to arrive at sequence designs for binders of 17α-hydroxyprogesterone, apixaban, and SN-38.
          </li>
          <li>
            <Link href='https://www.biorxiv.org/content/10.1101/2023.10.09.561603v1' target="_blank">Krishna et al.</Link> use LigandMPNN to design small molecule binding proteins. 
          </li>
          <li>
            More than 100 experimentally verified protein-DNA and protein-small molecule binding interfaces have been generated using LigandMPNN.
          </li>
        </ul>
      </Typography>
      <Typography variant="h5" gutterBottom>
        Training / Model Architecture
      </Typography>
      <Typography variant="body1" paragraph>
        As with most protein models, LigandMPNN was trained on entries in the Protein Data Bank, filtered to have fewer than 6,000 residues and a resolution of 3.5 Å or better. It uses three graphs: one dedicated solely to protein residues, a second for intra-ligand structure, and a third connecting residues and ligand atoms. The architecture includes one encoder for the protein and one for the ligand, along with a decoder, which allows sequences to be generated in an autoregressive manner.
      </Typography>
    </Paper>
  );
};
