How to split a string from a column using awk

I am a noob in Linux. I have a file like this:

 col1 col2 col3
 ID1234567-DNA_A01 chr1_10203040_T/C gene 0
 ID1234568-DNA_A02 chr1_10203050_T/A gene 0
 ID1234569-DNA_A03 chr1_10203060_A/G gene 0
 ID1234570-DNA_A04 chr1_10203070_C/T gene 0

I want to use only the first column and divide each line into 4 columns:

 #CHROM POS REF ALT
 1 10203040 T C
 1 10203050 T A
 1 10203060 A G
 1 10203070 C T

I tried this:

 awk 'BEGIN{OFS="\t";FS="\t"; print"#CHROM","POS","REF","ALT"} | cut -d' ' -f2- {print substr($1,4,1),substr($1,6}' old_file > new_file

I know I did something wrong, but any suggestion would be helpful! Thanks.

3 Answers

Maybe you can try it like this:

cut -d " " -f 2 test.txt | awk -F '[_/]' 'BEGIN{printf "#CHROM\tPOS\tREF\tALT\n"} NR>1{printf "%s\t%s\t%s\t%s\n", $1, $2, $3, $4}'

Here test.txt is the name of your file. NR>1 skips the header line, so "col2" is not printed as a data row. To redirect the output to a file, add > new_file.txt at the end of the command.

I'd go with:

awk 'NR>1 {print $2}' file \
| awk -F'[_/]' 'BEGIN{OFS="\t"; print "#CHROM","POS","REF","ALT"}{$1=$1}1'

The first awk skips the header line and outputs only the second field.
The second awk uses [_/] as the field separator, prints the new header, and then prints the fields. $1=$1 forces awk to rebuild the record, which is necessary for the new output field separator (\t) to take effect.
You can append | column -t to align the columns.

This could be done in a single awk call, but that needs split(), which I think is more complicated.

Output:

#CHROM POS REF ALT
chr1 10203040 T C
chr1 10203050 T A
chr1 10203060 A G
chr1 10203070 C T
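
For comparison, the single-pass variant this answer mentions might look roughly like the sketch below. It assumes the file is whitespace-separated with the chr1_..._T/C string in the second field, as in the other answers. The file names old_file and new_file come from the question, the sample rows are copied from it, and the array name part is only illustrative. Unlike the output above, this version also strips the "chr" prefix, which is what the question's desired output shows.

```shell
# Sample input mirroring the rows in the question (assumed space-separated).
cat > old_file <<'EOF'
col1 col2 col3
ID1234567-DNA_A01 chr1_10203040_T/C gene 0
ID1234568-DNA_A02 chr1_10203050_T/A gene 0
EOF

# One pass: print the header, skip the input header line,
# split field 2 on "_" or "/", and drop the "chr" prefix.
awk 'BEGIN { OFS = "\t"; print "#CHROM", "POS", "REF", "ALT" }
     NR > 1 {
         split($2, part, "[_/]")    # part[1]="chr1", part[2]=POS, part[3]=REF, part[4]=ALT
         sub(/^chr/, "", part[1])   # "chr1" -> "1"
         print part[1], part[2], part[3], part[4]
     }' old_file > new_file
```

Passing the separator to split() as the string "[_/]" avoids having to escape the slash inside a regular expression literal.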

If you have GNU awk (gawk), then - notwithstanding the advice here - you could consider capturing the parts you want using a regular expression rather than a string split:

$ gawk '
  BEGIN{OFS="\t"; print "#CHROM","POS","REF","ALT"}
  match($2,/chr([0-9])_([0-9]+)_([ACGT])[/]([ACGT])/,a) {print a[1],a[2],a[3],a[4]}
' old_file
#CHROM POS REF ALT
1 10203040 T C
1 10203050 T A
1 10203060 A G
1 10203070 C T

(Other awk implementations have the match function, but the GNU version extends that with a capture group array.)
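
Without gawk, a similar effect can probably be had from the standard two-argument match(), which sets RSTART and RLENGTH instead of filling an array. The following is only a sketch under the same assumptions as above (whitespace-separated input, data in the second field). The output name new_file_posix and the variable names hit and a are made up for illustration. The pattern here also allows multi-digit chromosome numbers, which is a small departure from the gawk regex.

```shell
# Sample input mirroring the rows in the question (assumed space-separated).
printf '%s\n' \
    'col1 col2 col3' \
    'ID1234567-DNA_A01 chr1_10203040_T/C gene 0' \
    'ID1234570-DNA_A04 chr1_10203070_C/T gene 0' > old_file

awk 'BEGIN { OFS = "\t"; print "#CHROM", "POS", "REF", "ALT" }
     # Only lines whose second field has the expected shape are processed,
     # so the "col2" header line is skipped automatically.
     match($2, /^chr[0-9]+_[0-9]+_[ACGT]\/[ACGT]$/) {
         # match() sets RSTART and RLENGTH; skip the 3-character "chr" prefix.
         hit = substr($2, RSTART + 3, RLENGTH - 3)
         split(hit, a, "[_/]")      # a[1]=CHROM, a[2]=POS, a[3]=REF, a[4]=ALT
         print a[1], a[2], a[3], a[4]
     }' old_file > new_file_posix
```

The regular expression acts as a validity check here, and split() does the actual extraction, since plain awk has no capture groups.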
