awk - GUI glitch using paste?

Question:

This is my file:

file.txt

cg13869341 1 15865

cg24669183 1 534242

cg15560884 1 710097

cg01014490 1 714177

cg17505339 1 720865

cg11954957 1 758829

cg23803172 1 763119

cg16736630 1 779995

cg00168193 1 790667

cg05898754 1 805102

awk '{print $2 "\t" $3 "\t" $3 "\t" $1}' file.txt

output

 1 cg13869341

1 cg24669183

1 cg15560884

1 cg01014490

1 cg17505339

1 cg11954957

1 cg23803172

1 cg16736630

1 cg00168193

1 cg05898754

awk '{print $2 "\t" $3 "\t" $3 "\t" $1}' file.txt | head -1 | tr '\t' '\n'

output

 1

15865

15865

cg13869341

OK, so the fields themselves are correct, but the displayed output is strange. So I tried something else.

awk '{print $1}' file.txt > 1.txt

awk '{print $2}' file.txt > 2.txt

awk '{print $3}' file.txt > 3.txt

paste 2.txt 3.txt 3.txt 1.txt | head

 1 cg13869341

1 cg24669183

1 cg15560884

1 cg01014490

1 cg17505339

1 cg11954957

1 cg23803172

1 cg16736630

1 cg00168193

1 cg05898754

Pasting 2.txt and 3.txt gives the expected output (trimmed with head -2):

 1 15865

1 534242

As does pasting 3.txt and 1.txt:

 15865 cg13869341

534242 cg24669183

So why do the middle two columns disappear when I paste 2.txt 3.txt 3.txt 1.txt?

Am I missing something here?

Answer:

I can reproduce the behavior with a file that has Windows line endings (\r\n instead of \n). In this case, the last field will not be "15865" but "15865\r", so every time $3 is printed, the carriage return moves the terminal cursor back to the beginning of the line before the next tab and field are printed. The next field then overwrites the just-written $3, or part of it if it is shorter.
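
To check whether a file really contains carriage returns, one option is to look at the raw bytes. The following is a minimal sketch, assuming a throwaway sample file (demo_crlf.txt is a made-up name, not the asker's file) written with a Windows line ending:

```shell
# Build a one-line sample with a Windows line ending (hypothetical file name).
printf 'cg13869341 1 15865\r\n' > demo_crlf.txt

# od -c prints each byte; a carriage return shows up as \r right before \n.
od -c demo_crlf.txt

# Count lines that end in a carriage return (0 would mean plain UNIX endings).
cr_lines=$(awk '/\r$/ { n++ } END { print n + 0 }' demo_crlf.txt)
echo "lines ending in CR: $cr_lines"
```

Running the same two checks against the real file.txt should show whether it is affected.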

You can convert the file to UNIX line endings with a number of tools such as fromdos, dos2unix or recode. A way to do it on the fly in awk is

awk '{ sub(/\r$/, ""); print $2 "\t" $3 "\t" $3 "\t" $1 }' file
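
If none of those conversion tools is installed, tr can probably do the job as well. This is only a sketch on a made-up sample (demo_crlf.txt and demo_unix.txt are hypothetical names); note that tr -d '\r' deletes every carriage return in the file, not just the ones at line ends, which is an assumption that should hold for plain text like this:

```shell
# Two sample lines with Windows line endings (hypothetical file names).
printf 'cg13869341 1 15865\r\ncg24669183 1 534242\r\n' > demo_crlf.txt

# Delete all carriage returns and write a UNIX-style copy.
tr -d '\r' < demo_crlf.txt > demo_unix.txt

# The original command now prints all four columns for the first line.
result=$(awk '{print $2 "\t" $3 "\t" $3 "\t" $1}' demo_unix.txt | head -1)
printf '%s\n' "$result"
```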

Style note: Instead of hard-coding the separator, consider using the OFS special variable:

awk -v OFS='\t' '{ sub(/\r$/, ""); print $2, $3, $3, $1 }' file

That makes the command easier to adapt in case you want to generate differently separated values later.
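
As an illustration of that point, switching to comma-separated output only needs a different OFS value. A small sketch, again on a made-up sample file (demo_crlf.txt is a hypothetical name):

```shell
# One sample line with a Windows line ending (hypothetical file name).
printf 'cg13869341 1 15865\r\n' > demo_crlf.txt

# Same program as above; only the OFS value changes.
csv=$(awk -v OFS=',' '{ sub(/\r$/, ""); print $2, $3, $3, $1 }' demo_crlf.txt)
printf '%s\n' "$csv"   # prints 1,15865,15865,cg13869341
```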
