In structural analyses of proteins in solution by small-angle X-ray scattering (SAXS), molecular models are restored with ab initio modeling algorithms. The restored models can vary because the scattering profile carries no phase information, because the profile is averaged over the orientations of the protein relative to the incident X-ray beam, and because of conformational fluctuations. In many cases, a representative molecular model is obtained by averaging the models restored in a number of ab initio calculations, some of which may yield nonrealistic models inconsistent with the biological and structural information about the target protein. Here, a protocol is proposed that classifies the restored models by multivariate analysis in order to select probable and realistic ones. In the protocol, each structural model is represented as a point in a high-dimensional space that describes the shape of the model. Principal component analysis followed by a clustering method is applied to visualize the distribution of the points in this space. The resulting classification provides an opportunity to exclude nonrealistic models. The feasibility of the protocol was examined by applying it to the SAXS profiles of four proteins.
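
The classification step can be illustrated with a minimal sketch, under several assumptions that the abstract does not confirm. Each restored model is assumed to be already encoded as a fixed-length shape vector (the actual descriptor is not specified here), the clustering method is assumed to be k-means with two clusters, and the input is synthetic data generated only to make the example runnable. The function names `pca_project` and `kmeans` are hypothetical.

```python
import numpy as np


def pca_project(features, n_components=2):
    """Project row vectors onto their leading principal components."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    explained = singular_values**2 / np.sum(singular_values**2)
    return scores, explained[:n_components]


def kmeans(points, k, n_iter=100):
    """Lloyd's k-means with deterministic farthest-point seeding.

    Assumption: k-means stands in for the unspecified clustering method.
    """
    centers = [points[0]]
    for _ in range(1, k):
        # Distance of every point to its nearest already-chosen center.
        dist = np.min(
            [np.linalg.norm(points - c, axis=1) for c in centers], axis=0
        )
        centers.append(points[np.argmax(dist)])
    centers = np.array(centers)

    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Synthetic stand-in for restored models: 12 vectors near one shape and
# 8 near another, each a 50-dimensional hypothetical shape descriptor.
rng = np.random.default_rng(0)
n_dim = 50
shape_a = rng.normal(size=n_dim)
shape_b = rng.normal(size=n_dim)
models = np.vstack([
    shape_a + 0.1 * rng.normal(size=(12, n_dim)),
    shape_b + 0.1 * rng.normal(size=(8, n_dim)),
])

scores, explained = pca_project(models, n_components=2)
labels, centers = kmeans(scores, k=2)

for i, (pc, label) in enumerate(zip(scores, labels)):
    print(f"model {i:2d}  PC1={pc[0]:7.3f}  PC2={pc[1]:7.3f}  cluster={label}")
```

In such a sketch, the two-dimensional `scores` would be what is plotted to inspect the distribution of models, and the cluster labels would mark groups to examine against prior structural knowledge. Deciding which cluster is nonrealistic remains a judgment based on information outside the code.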