
Python - replace a nucleotide at a certain position in a DNA sequence file

Question:

I have a FASTA file and a second file that lists positions. I want to replace the base at a given position in each sequence according to a replace table. For example, my position file looks like

a/c 120, and my replace table maps a/c to W, so I want to get a new FASTA file with position 120 replaced by W.

The program is written in Python.

The first problem is that I can't get to the correct position. For example, if I use

my_seq_id[0:3], I get the sequence name, not the sequence!

The position file looks like

id1 219 A/C

from Bio import SeqIO
import sys
import string

userInput1=raw_input("enter your sequence:")
userInput2=raw_input('enter your position file:')
fasta_file=userInput1
position_file=userInput2
result_file="outfile.txt"

id_list=list()
position_list=list()
nucleotide_list=list()

with open(position_file) as f:
    for line in f:
        line=line.strip()
        headerline = line.split()
        position=headerline[1]
        ID=headerline[0]
        nucleotide=headerline[2]
        nucleotide_list.add(nucleotide)
        position_list.add(position)
        id_list.add(ID)

fasta_sequence=SeqIO.parse(open(fasta_file), 'fasta')
with open(result_file, 'w') as f:
    if seq_record.id in wanted and nucleotide_list="A/C":
        seq_record[position_list]="W\n"
        SeqIO.write([seq_record], f, "fasta")

Answer:

Your code is a little confusing. Shouldn't this:

fasta_sequence=SeqIO.parse(open(fasta_file), 'fasta')
with open(result_file, 'w') as f:
    if seq_record.id in wanted and nucleotide_list="A/C":
        seq_record[position_list]="W\n"
        SeqIO.write([seq_record], f, "fasta")

be:

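fasta_sequence=list(SeqIO.parse(open(fasta_file), 'fasta'))  # a list, so records can be indexed
with open(result_file, 'w') as f:
    if fasta_sequence[j].id in wanted and nucleotide_list[i]=="A/C":
        p=int(position_list[i])-1  # 1-based position (assumed) to 0-based index
        # Seq objects are immutable, so build a new sequence around position p
        fasta_sequence[j].seq=fasta_sequence[j].seq[:p]+"W"+fasta_sequence[j].seq[p+1:]
        SeqIO.write([fasta_sequence[j]], f, "fasta")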

Here i is a counter that goes through nucleotide_list and j is a counter that goes through fasta_sequence.
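One way to avoid keeping the counters i and j in step is to group the position-file entries by sequence ID before touching the FASTA file. The following is only a minimal sketch in Python 3, assuming each line has the form "id position alleles" as in the question. The helper name read_positions and the sample lines are made up for this illustration.

```python
from collections import defaultdict
from io import StringIO


def read_positions(handle):
    """Group (position, alleles) pairs by sequence ID.

    Assumes whitespace-separated lines of the form: id1 219 A/C
    """
    edits = defaultdict(list)
    for line in handle:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or incomplete lines
        seq_id, position, alleles = fields[0], int(fields[1]), fields[2]
        edits[seq_id].append((position, alleles.upper()))
    return dict(edits)


# Stand-in for the position file (the contents are invented for the example)
example = StringIO("id1 219 A/C\nid2 5 a/c\nid1 7 A/C\n")
positions = read_positions(example)
print(positions["id1"])
```

With the entries grouped this way, every edit for a record can be looked up with positions.get(record.id, []) while looping over the FASTA records, so no parallel lists or counters are needed.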

What you can also do is iterate through fasta_sequence this way:

for record in SeqIO.parse(fasta_file, "fasta"):
    print("%s %s" % (record.id, record.seq))

where id is the ID of each record and seq is its sequence.

---- Update ----

I think I understand what you want to do now. To go through each record in the file, check whether its ID matches, and then change the sequence at that position, do:

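from Bio.Seq import Seq

# note: fill id_list and position_list with .append(); lists have no .add()
with open(result_file, "w") as f:
    # go through each record in the file
    for record in SeqIO.parse(fasta_file, "fasta"):
        # check if the id is wanted
        if record.id in id_list:
            # indices of every item in id_list that matches record.id
            positions_of_id_in_id_list = [i for i, j in enumerate(id_list) if j == record.id]
            # Seq objects are immutable, so edit a list of characters instead
            bases = list(str(record.seq))
            for elem_position_lists in positions_of_id_in_id_list:
                # substitute "W" with the replacement you want (maybe looked up
                # from nucleotide_list[elem_position_lists]?); positions are
                # assumed to be 1-based, hence the - 1
                bases[int(position_list[elem_position_lists]) - 1] = "W"
            record.seq = Seq("".join(bases))
        # write every record, changed or not, to the new file
        SeqIO.write([record], f, "fasta")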
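Because a Biopython Seq cannot be modified in place, the substitution itself is easiest to reason about on a plain string. The sketch below shows that step in isolation. It assumes that positions in the position file are 1-based and that the replace table maps an allele pair such as A/C to a single letter. REPLACE_TABLE and substitute are names invented here, and the table contains only the one mapping given in the question.

```python
# Replace table from the question: an A/C position is written as W.
# (Hypothetical name; extend it with the other rows of your own table.)
REPLACE_TABLE = {"A/C": "W"}


def substitute(sequence, edits, table=REPLACE_TABLE):
    """Return a copy of `sequence` with each listed position replaced.

    `edits` is a list of (position, alleles) pairs; positions are assumed
    to be 1-based, so position 1 is the first base.
    """
    bases = list(sequence)
    for position, alleles in edits:
        index = position - 1  # convert to Python's 0-based indexing
        if not 0 <= index < len(bases):
            raise IndexError("position %d is outside the sequence" % position)
        bases[index] = table[alleles.upper()]
    return "".join(bases)


new_seq = substitute("ACGTACGT", [(2, "A/C"), (5, "a/c")])
print(new_seq)
```

With Biopython, the result can be put back into a record with record.seq = Seq(substitute(str(record.seq), edits)) before calling SeqIO.write. Note that in the standard IUPAC nucleotide code W stands for A or T, while A or C is written M. The table above simply follows the mapping stated in the question.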