Thank you for designing this program! I'm using it to replace HMMER in my Python script, but I've run into a couple of issues:
import os
import psutil
import pyhmmer

available_memory = psutil.virtual_memory().available
target_size = os.stat(self.input_faa).st_size
hmm_files = HMMFiles(self.hmm_file)
with open("test_hmmer_result.txt", "wb") as f:
    with pyhmmer.easel.SequenceFile(self.input_faa, digital=True) as seqs:
        if target_size < available_memory * 0.1:
            # Pre-fetch the targets into memory when the file is small enough
            targets = seqs.read_block()
        else:
            # Otherwise pass the open sequence file as the targets
            targets = seqs
        for i, hits in enumerate(pyhmmer.hmmsearch(hmm_files, targets, cpus=os.cpu_count(), domE=1e-15)):
            hits.write(f, format="domains", header=False)
I'm using hits.write(f, format="domains", header=False) to get the domain output, but I want to extract only the columns [query name, qlen, target name, tlen, i-Evalue (this domain), hmm_from, ali_from] and write them to a new tab-separated output. I read the documentation but I'm still unsure how to extract this information.
Could you please tell me how a top hit is selected? I noticed that the hits.write output may contain two top hits for a single protein accession.
Really appreciate your help!
Best Regards,
XInpeng
I think I found a way to do it, but it would be really helpful if you could verify it:
for hits in pyhmmer.hmmsearch(hmm_files, targets, cpus=os.cpu_count(), domE=1e-15):
    for hit in hits:
        for domain in hit.domains.included:
            #print(dir(domain.alignment))
            print(domain.alignment.hmm_name, domain.alignment.hmm_length, domain.alignment.target_name, domain.alignment.target_length, domain.i_evalue, domain.alignment.hmm_from, domain.alignment.hmm_to, domain.alignment.target_from, domain.alignment.target_to)
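For the tab-separated output asked about in the first post, here is a minimal sketch of how the same loop could feed a table writer. It relies only on the attribute names used in the loop above, which I have not checked against every pyhmmer version. The helpers `domain_rows` and `write_domain_table` and the column labels are hypothetical names chosen for illustration, and mapping "ali_from" to `alignment.target_from` is an assumption. The demo uses plain stand-in objects instead of a real search so that it runs without pyhmmer installed.

```python
import csv
import io
from types import SimpleNamespace

# Column labels are illustrative, following the list in the question.
COLUMNS = ["query_name", "qlen", "target_name", "tlen",
           "i_evalue", "hmm_from", "ali_from"]


def _text(value):
    """Decode bytes names to str; pass anything else through unchanged."""
    return value.decode() if isinstance(value, bytes) else value


def domain_rows(all_hits):
    """Yield one row per included domain from an iterable of hit collections."""
    for hits in all_hits:
        for hit in hits:
            for domain in hit.domains.included:
                ali = domain.alignment
                yield [
                    _text(ali.hmm_name), ali.hmm_length,
                    _text(ali.target_name), ali.target_length,
                    domain.i_evalue,
                    ali.hmm_from,
                    ali.target_from,  # assumed to correspond to "ali_from"
                ]


def write_domain_table(all_hits, handle):
    """Write a header line and one tab-separated line per included domain."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(domain_rows(all_hits))


# Stand-in objects that mimic only the attributes accessed above.
ali = SimpleNamespace(hmm_name=b"PF00001", hmm_length=120,
                      target_name=b"prot1", target_length=300,
                      hmm_from=5, target_from=42)
domain = SimpleNamespace(alignment=ali, i_evalue=1e-20)
hit = SimpleNamespace(domains=SimpleNamespace(included=[domain]))

buffer = io.StringIO()
write_domain_table([[hit]], buffer)
print(buffer.getvalue(), end="")
```

With a real search, the first argument would be the iterator returned by pyhmmer.hmmsearch(...) and the handle would be a file opened in text mode with newline="".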