2, You can download individual PDB files from ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/pdb/ or http://www.rcsb.org/pdb/files/. For example:
wget ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/pdb/pdb1a2k.ent.gz
wget http://www.rcsb.org/pdb/files/4hhb.pdb.gz
gunzip pdb1a2k.ent.gz
Yes, it is possible to parse the file with Bioperl:
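use strict;
use warnings;
use Bio::Structure::IO;

# Open the uncompressed PDB file (run gunzip on the .gz download first)
my $in = Bio::Structure::IO->new(-file   => "pdb1a2k.ent",
                                 -format => 'pdb');
# Print the ID of each structure found in the file
while ( my $struc = $in->next_structure() ) {
    print "Structure ", $struc->id, "\n";
}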
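If you need more than one entry, the file naming in the wwPDB directory above can be scripted. The following is only a sketch: it assumes every entry follows the same pdb<id>.ent.gz pattern (lower-case ID) as the 1a2k example, and pdb_url is a made-up helper name, not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: build the wwPDB download URL for a PDB ID.
# Assumes the pdb<id>.ent.gz naming seen in the 1a2k example, with the ID in lower case.
pdb_url() {
    id=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
    printf 'ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/pdb/pdb%s.ent.gz\n' "$id"
}

# Print the URL for one ID; pass the result to wget to download it.
pdb_url 1A2K
```

For example, wget "$(pdb_url 1A2K)" would fetch the same file as the first wget command above.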
3, EBI has some curated information on PDB structures; see:
ftp://ftp.ebi.ac.uk/pub/databases/rcsb/pdb-remediated/
4, PDB sequences can be downloaded from the NCBI website, so you need not generate them yourself by parsing the structures, which is error-prone:
wget ftp://ftp.ncbi.nih.gov/blast/db/FASTA/pdbaa.gz
5, Each entry of pdbaa represents one sequence, but it may correspond to multiple chains of different structures. The sequence has an NCBI gi, so it is easy to track. The residue numbering in .pdb files is based on these sequences. Usually, a PDB chain contains only part of the sequence. Yes, sometimes the chain introduces extra residues, but usually this does not matter much.
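To see which chains share one pdbaa sequence, you can split the FASTA header. This is a rough sketch that assumes the merged definitions in a header are separated by Ctrl-A and each starts with gi|<number>|pdb|<ID>|<chain>; check a few headers in your copy of pdbaa first. list_chains is a made-up helper name, and the header in the example is invented for illustration, not a real record.

```shell
#!/bin/sh
# Hypothetical helper: print "PDB_ID chain" for every definition merged into a pdbaa header.
# Assumption: definitions are separated by Ctrl-A (\001) and look like gi|<n>|pdb|<ID>|<chain> ...
list_chains() {
    tr '\001' '\n' |
        sed -n 's/^>\{0,1\}gi|[0-9]*|pdb|\([^|]*\)|\([^ ]*\).*/\1 \2/p'
}

# Invented header, for illustration only (not taken from pdbaa).
printf '>gi|1|pdb|1ABC|A Chain A, Example\001gi|2|pdb|2XYZ|B Chain B, Example\n' | list_chains
```

On the downloaded file, something like gunzip -c pdbaa.gz | grep '^>' | list_chains should run every header through the same function.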