Question about extract columns from hmmsearch domain output #82

Xinpeng021001 · 2024-12-19T21:13:49Z

Hi,

Thank you for designing this program! I'm using it to replace HMMER in my Python script, but I've run into an issue:

    available_memory = psutil.virtual_memory().available
    target_size = os.stat(self.input_faa).st_size
    hmm_files=HMMFiles(self.hmm_file)
    with open("test_hmmer_result.txt", 'wb') as f:
        with pyhmmer.easel.SequenceFile(self.input_faa, digital=True) as seqs:
            if target_size < available_memory * 0.1:   #Pre-fetching targets into memory
                targets = seqs.read_block()
            else:
                targets = seqs
            for i, hits in enumerate(pyhmmer.hmmsearch(hmm_files, targets, cpus=os.cpu_count(), domE=1e-15)):
                hits.write(f, format="domains", header=False)

I'm using `hits.write(f, format="domains", header=False)` to get the domain output, but I want to extract only the columns [query name, qlen, target name, tlen, i-Evalue (this domain), hmm_from, ali_from] and write them as a new tab-separated output. I read the documentation, but I'm still unsure how to extract this information.

Could you also tell me how the hits in `TopHits` are selected? I noticed that the `hits.write` output can contain two hits for a single protein accession.

Really appreciate your help!

Best Regards,
Xinpeng

Xinpeng021001 · 2024-12-20T03:20:32Z

I think I've found a way to do it, but it would be really helpful if you could verify it:

            for hits in pyhmmer.hmmsearch(hmm_files, targets, cpus=os.cpu_count(), domE=1e-15):
                for hit in hits:
                    for domain in hit.domains.included:
                        #print(dir(domain.alignment))
                        print(domain.alignment.hmm_name, domain.alignment.hmm_length, domain.alignment.target_name, domain.alignment.target_length, domain.i_evalue, domain.alignment.hmm_from, domain.alignment.hmm_to, domain.alignment.target_from, domain.alignment.target_to)
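The same loop could write the requested columns as tab-separated text instead of printing them. The following is only a sketch under a few assumptions: `domain_rows`, `write_domain_tsv`, `as_text` and the column labels are hypothetical names introduced here, "ali_from" is read as the start coordinate on the target (`target_from`), and names are decoded only if they arrive as `bytes`, since that may depend on the pyhmmer version. The attribute names themselves are the ones used in the snippet above.

```python
import csv

# Column labels are illustrative, not the official HMMER header names.
COLUMNS = ["query_name", "qlen", "target_name", "tlen",
           "i_evalue", "hmm_from", "ali_from"]


def as_text(name):
    """Return a name as str, decoding it first if it is bytes."""
    return name.decode() if isinstance(name, bytes) else str(name)


def domain_rows(all_hits):
    """Yield one row per included domain.

    `all_hits` is any iterable of per-query hit collections, such as
    the iterator returned by pyhmmer.hmmsearch(...).
    """
    for hits in all_hits:
        for hit in hits:
            for domain in hit.domains.included:
                ali = domain.alignment
                yield [
                    as_text(ali.hmm_name),
                    ali.hmm_length,
                    as_text(ali.target_name),
                    ali.target_length,
                    domain.i_evalue,
                    ali.hmm_from,
                    ali.target_from,  # assumption: "ali_from" means target start
                ]


def write_domain_tsv(all_hits, handle):
    """Write a header and all domain rows to a text handle; return the row count."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    count = 0
    for row in domain_rows(all_hits):
        writer.writerow(row)
        count += 1
    return count
```

It could then be called with the search iterator from the snippet above, for example `write_domain_tsv(pyhmmer.hmmsearch(hmm_files, targets, domE=1e-15), f)`, where `f` is opened in text mode (`"w"`, not `"wb"`).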
