Searching for functional proteins in random-sequence libraries is a major challenge in protein engineering, in part because many random-sequence proteins are poorly soluble. A library in which most of the polypeptides are soluble and stable would therefore be of great benefit. Although modern proteins consist of 20 amino acids, it has been suggested that early proteins evolved from a reduced alphabet. Here, we constructed a library of random-sequence proteins consisting of only five amino acids, Ala, Gly, Val, Asp and Glu, which are believed to have been the most abundant in the prebiotic environment. Expression and characterization of arbitrarily chosen proteins from the library indicated that five-alphabet random-sequence proteins are more soluble than 20-alphabet random-sequence proteins with a similar level of hydrophobicity. The results support the reduced-alphabet hypothesis of the primordial genetic code and should also be helpful in constructing optimized protein libraries for evolutionary protein engineering.
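As a rough in silico illustration of what a five-alphabet random-sequence library looks like, the sketch below samples sequences from Ala, Gly, Val, Asp and Glu and scores their mean hydropathy. It rests on several assumptions that are not taken from the study: residues are drawn uniformly, the sequence length (100) and library size (1000) are arbitrary, and the Kyte-Doolittle scale stands in for whatever hydrophobicity measure the authors used. The function names are hypothetical.

```python
import random

# Kyte-Doolittle hydropathy values. Using this scale is an assumption;
# the abstract does not say how hydrophobicity was measured.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

FIVE = "AGVDE"                  # Ala, Gly, Val, Asp, Glu
TWENTY = "".join(sorted(KD))    # full 20-letter alphabet, for comparison


def random_protein(alphabet, length, rng):
    """Draw one sequence with residues sampled uniformly (an assumption)."""
    return "".join(rng.choice(alphabet) for _ in range(length))


def mean_hydropathy(seq):
    """Average Kyte-Doolittle value over all residues of seq."""
    return sum(KD[aa] for aa in seq) / len(seq)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
library = [random_protein(FIVE, 100, rng) for _ in range(1000)]
scores = [mean_hydropathy(seq) for seq in library]
```

A matching 20-letter library can be drawn with `random_protein(TWENTY, 100, rng)` and filtered to a similar mean hydropathy before comparing the two sets. Solubility itself is an experimental readout and is not predicted by this sketch.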