Doug Finke recently wrote a blog post about finding the most common words in a file.

Doug put together a short 19-line PowerShell script to solve the problem, but it struck me that the script wasn’t really playing to the strengths of the cmdlets that ship with PowerShell.

So, here’s my interpretation as a one-liner:

get-content big.txt | foreach-object {[regex]::split($_.ToLower(), '\W+')} | where-object {$_.length -gt 0} | group-object | sort-object -property count -descending | select-object -property name -first 6
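
For anyone who wants to see what each stage is doing, here is a sketch of the same pipeline split across lines and run against a tiny in-memory sample rather than big.txt. The sample sentences and the `$lines` and `$top` variable names are my own placeholders, and I've added the count to the output so the ranking is visible.

```powershell
# Hypothetical sample standing in for the lines of big.txt
$lines = @(
    'The quick brown fox jumps over the lazy dog.',
    'The dog barks; the fox runs away.'
)

$top = $lines |
    ForEach-Object { [regex]::Split($_.ToLower(), '\W+') } |  # lowercase each line and split on runs of non-word characters
    Where-Object { $_.Length -gt 0 } |                        # drop the empty strings left by trailing punctuation
    Group-Object |                                            # bucket identical words together
    Sort-Object -Property Count -Descending |                 # most frequent word first
    Select-Object -Property Name, Count -First 3              # keep the top three

$top
```

With this sample, "the" comes out on top with a count of 4, followed by "fox" and "dog" with 2 each (the order of those two isn't guaranteed, since they tie).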

EDIT: One thing I’ve noticed is that Doug’s script runs much faster.
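
If you want to put a number on the difference yourself, Measure-Command will time a script block. This is only a rough sketch: the repeated in-memory input and the `$lines` and `$elapsed` names are assumptions of mine, and you'd swap in the real file to get a meaningful figure.

```powershell
# Hypothetical in-memory input; replace with (Get-Content big.txt) to time the real file
$lines = @('the quick brown fox', 'the lazy dog') * 1000

# Measure-Command runs the script block and returns a TimeSpan; pipeline output is discarded
$elapsed = Measure-Command {
    $lines |
        ForEach-Object { [regex]::Split($_.ToLower(), '\W+') } |
        Where-Object { $_.Length -gt 0 } |
        Group-Object |
        Sort-Object -Property Count -Descending |
        Select-Object -Property Name -First 6
}

# Report the elapsed time in whole milliseconds
'{0:N0} ms' -f $elapsed.TotalMilliseconds
```

Wrapping another script in the same Measure-Command call gives a like-for-like comparison on the same input.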