Thursday 11 April 2013

Unix: Count the number of delimiters for each line in a huge file

Found this here:

http://ubuntuforums.org/showthread.php?t=301479

*THIS* is the one to use:

awk 'BEGIN { FS = ";" } ; { print NF-1 }' million_lines.csv

Note that this prints to the console, so redirect the output to a file for easier use
(add this at the end: > output_file).

Example:

The input file is "addresses.txt" and I want to count the tabs on each line:

awk 'BEGIN { FS = "\t" } ; { print NF-1 }' addresses.txt > tab_count.txt
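To see what the command above produces, here is a small sketch on made-up data. The file names sample_addresses.txt and sample_tab_count.txt and the three sample lines are invented for illustration; they are not from the original thread.

```shell
# Build a tiny sample file (hypothetical data) with tab-separated fields.
printf 'Alice\t12 High St\tLeeds\n' >  sample_addresses.txt
printf 'Bob\n'                      >> sample_addresses.txt
printf 'Carol\tYork\n'              >> sample_addresses.txt

# NF is the number of fields awk found on the line,
# so NF-1 is the number of delimiters between them.
awk 'BEGIN { FS = "\t" } ; { print NF-1 }' sample_addresses.txt > sample_tab_count.txt

# One count per input line: 2, then 0, then 1.
cat sample_tab_count.txt
```

One thing to watch for: a completely empty line has NF equal to 0, so this prints -1 for it rather than 0.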


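The other approach, using echo | tr | wc, is really, really slow, presumably because it starts new processes for every line, while awk reads the whole file in a single process.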
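For comparison, a per-line pipeline probably looks something like the loop below. This is my own reconstruction, not the exact code from the forum thread: the file names are made up, and printf stands in for echo so that backslashes in the data are not mangled.

```shell
# Hypothetical sample data: three lines with 2, 1 and 0 semicolons.
printf 'a;b;c\n;\nplain\n' > sample.csv

# Slow variant: every line starts a fresh tr and a fresh wc.
# tr -cd ';' deletes everything except semicolons, wc -c counts what is left.
while IFS= read -r line; do
    printf '%s\n' "$line" | tr -cd ';' | wc -c
done < sample.csv > slow_count.txt

# Fast variant: one awk process handles the whole file.
awk 'BEGIN { FS = ";" } ; { print NF-1 }' sample.csv > fast_count.txt
```

Both files should hold the same counts, although some wc implementations pad the number with leading spaces. On a file with a million lines the loop launches millions of short-lived processes, which is where the time goes.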


Note to self: learn awk and sed!
