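I have a file (standard FASTA format) that looks like the one below. I want to replace all the "-" characters, but only on lines that do not start with ">". For example: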
>Title1-mytitle
ACACAGCATGCTA-CTAGCAGT
>Title2
---CAGCTAGCAG-TGACGCGA
GTCAGCTGACGTAGCTAGTATA
CGATCGATCGTAGCT-GAGATT
>Title3-again-q
ATGCTAGCTGACTGATCGTCGG
ATGATGATTTG
The output should look like this:
>Title1-mytitle
ACACAGCATGCTACTAGCAGT
>Title2
CAGCTAGCAGTGACGCGA
GTCAGCTGACGTAGCTAGTATA
CGATCGATCGTAGCTGAGATT
>Title3-again-q
ATGCTAGCTGACTGATCGTCGG
ATGATGATTTG
I need a bash solution for this, since it will be one step in a longer pipeline. I was thinking along the lines of sed
with a particularly smart regular expression. Do you have any ideas?
Thanks!
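! can be used to negate an address, so that a command applies only to the lines that do not match the search pattern. In your example, the address is /^>/! and the command is s/-/1/g. Explanation:
/^>/! : exclude all lines starting with > from the substitution
s/-/1/g : substitute all instances of - with 1
To delete the dashes instead, as in the expected output above, leave the replacement empty: s/-//g.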
For more information, see the "Addresses Overview" section of the GNU sed manual.
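Putting the address and the command together, a minimal sketch could look like the following. It assumes the goal is to delete the dashes, as in the question's expected output, so the replacement is empty rather than 1. The function name strip_gaps is only an illustrative helper, and there is no space after ! so that the command should also work on non-GNU sed.

```shell
# strip_gaps: hypothetical helper name; reads FASTA on stdin, writes to stdout.
#   /^>/!   apply the command only to lines that do NOT start with ">"
#   s/-//g  delete every "-" on those lines (empty replacement)
strip_gaps() {
    sed '/^>/!s/-//g'
}

# Example: the header keeps its dash, the sequence line loses it.
printf '>Title1-mytitle\nACACAGCATGCTA-CTAGCAGT\n' | strip_gaps
# >Title1-mytitle
# ACACAGCATGCTACTAGCAGT
```

Because it reads stdin and writes stdout, it can sit in the middle of a longer pipeline.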
Another option, using regular expression matching.
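As a rough sketch of what a regex-matching alternative might look like, here is one possibility using awk. This is an illustration under the assumption that awk is acceptable in the pipeline, not necessarily the command this answer had in mind; strip_gaps_awk is a hypothetical name.

```shell
# strip_gaps_awk: hypothetical helper name; reads FASTA on stdin, writes to stdout.
strip_gaps_awk() {
    awk '
        /^>/ { print; next }        # header line: print unchanged
             { gsub(/-/, ""); print }  # sequence line: delete every "-"
    '
}

# Example with a multi-line record:
printf '>Title2\n---CAGCTAGCAG-TGACGCGA\nGTCAGCTGACGTAGCTAGTATA\n' | strip_gaps_awk
# >Title2
# CAGCTAGCAGTGACGCGA
# GTCAGCTGACGTAGCTAGTATA
```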
If your sed has the -i flag, you can use it to edit the file in place.
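A minimal sketch of the in-place variant, assuming GNU sed and assuming the dashes should be deleted as in the question's expected output. The function name strip_gaps_inplace is hypothetical.

```shell
# strip_gaps_inplace: hypothetical helper; rewrites the given FASTA file in place.
# Assumes GNU sed, where -i needs no argument.
# BSD/macOS sed expects a backup suffix (possibly empty): sed -i '' '...' file
strip_gaps_inplace() {
    sed -i '/^>/!s/-//g' "$1"
}
```

Called as strip_gaps_inplace myfile.fasta (the file name is a placeholder), it modifies the file directly and prints nothing. Inside a pipeline, the stdin/stdout form without -i is usually the better fit.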