Protein structural classification and family identification by multifractal analysis and wavelet spectrum

Ahn Vo

doi:10.1088/1674-1056/20/1/010505

Chinese Physics B

GENERAL


Zhu Shao-Ming (朱少茗)¹,², Yu Zu-Guo (喻祖国)¹,² and Ahn Vo¹

© 2011 Chinese Physical Society and IOP Publishing Ltd
Chinese Physics B, Volume 20, Number 1. Citation: Zhu Shao-Ming et al 2011 Chinese Phys. B 20 010505. DOI: 10.1088/1674-1056/20/1/010505


Author affiliations

¹ School of Mathematical Sciences, Queensland University of Technology, GPO Box 2434, Brisbane, Q 4001, Australia

² School of Mathematics and Computational Science, Xiangtan University, Hunan 411105, China

Dates

Received 9 April 2010


Abstract

Family identification is helpful for predicting protein functions. The literature indicates that long sequences of base pairs or amino acids are needed to study patterns in biological sequences. Since most protein sequences are relatively short, we randomly concatenate (link) protein sequences from the same family or superfamily to form longer protein sequences. Six schemes are used to convert the linked protein sequences into numerical sequences: the 6-letter model, the 12-letter model, the 20-letter model, the revised Schneider and Wrede hydrophobicity scale, solvent accessibility, and stochastic standard state accessibility. Multifractal and wavelet analyses are then performed on these numerical sequences. The parameters obtained from these analyses can be used to construct parameter spaces in which each linked protein is represented by a point. The four structural classes of proteins, namely the α, β, α + β and α/β classes, are then distinguished in these parameter spaces. The Fisher linear discriminant algorithm is used to assess the discriminant accuracy. Numerical results indicate that the discriminant accuracies are satisfactory in separating these classes. We also find that linked proteins from the same family or superfamily tend to group together and can be separated from other linked proteins. The methods are therefore helpful for identifying the family of an unknown protein.
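The pipeline described in the abstract can be illustrated with a minimal sketch, under several assumptions. The numeric scale `TOY_SCALE` is an arbitrary placeholder and is not one of the models or scales named above. All function names are hypothetical. The box-counting estimate of the generalized dimension D_q is one common form of multifractal analysis and may differ from the procedure the authors used. The wavelet step is omitted.

```python
import random
import numpy as np

# Placeholder scale (assumption): arbitrary positive values, NOT the revised
# Schneider and Wrede hydrophobicity scale or any model from the paper.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
TOY_SCALE = {aa: float(i + 1) for i, aa in enumerate(AMINO_ACIDS)}


def link_sequences(seqs, seed=0):
    """Concatenate sequences from one family in a random order."""
    rng = random.Random(seed)
    shuffled = list(seqs)
    rng.shuffle(shuffled)
    return "".join(shuffled)


def to_numeric(seq, scale=TOY_SCALE):
    """Convert a protein string into a positive numerical sequence."""
    return np.array([scale[aa] for aa in seq], dtype=float)


def generalized_dimension(values, q, box_sizes=(1, 2, 4, 8, 16)):
    """Estimate D_q (q != 1) by box counting on the normalised measure.

    The partition sum Z(eps) = sum_i mu_i(eps)^q scales as eps^tau(q),
    and D_q = tau(q) / (q - 1).
    """
    mu = values / values.sum()
    n = len(mu)
    log_eps, log_z = [], []
    for size in box_sizes:
        n_boxes = n // size
        # Measure of each box: sum of the point masses it contains.
        boxes = mu[: n_boxes * size].reshape(n_boxes, size).sum(axis=1)
        boxes = boxes[boxes > 0]
        log_eps.append(np.log(size / n))
        log_z.append(np.log(np.sum(boxes ** q)))
    tau = np.polyfit(log_eps, log_z, 1)[0]  # slope of log Z against log eps
    return tau / (q - 1)


def fisher_direction(class_a, class_b):
    """Fisher linear discriminant direction w = Sw^{-1} (mean_a - mean_b)."""
    a = np.asarray(class_a, dtype=float)
    b = np.asarray(class_b, dtype=float)
    # Within-class scatter: sum of the two unnormalised covariance matrices.
    sw = (np.cov(a, rowvar=False) * (len(a) - 1)
          + np.cov(b, rowvar=False) * (len(b) - 1))
    return np.linalg.solve(sw, a.mean(axis=0) - b.mean(axis=0))


if __name__ == "__main__":
    # Hypothetical toy family; real inputs would be sequences from one family.
    linked = link_sequences(["ACDEFGHIKL", "MNPQRSTVWY", "ACACACAC", "WYWYWYWY"])
    signal = to_numeric(linked)
    point = [generalized_dimension(signal, q) for q in (-1.0, 0.0, 2.0)]
    print("parameter-space point (D_-1, D_0, D_2):", point)
```

In this sketch each linked protein yields one point, here (D_-1, D_0, D_2), and `fisher_direction` gives the projection axis that separates two sets of such points. The choice of q values and box sizes is illustrative only.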
