With near-universal Internet access and reliable online productivity tools such as Google Docs, individuals and companies alike are switching over to cloud computing. In this blog I post items of interest to my colleagues, patrons, and clients.
Saturday, April 28, 2018
yogi_From Sentences In Column A List Each Unique Word (case sensitive) And Its Frequency
I have a very large column (A) of sentences, with varying punctuation, capitalization, etc. It is around 20,000 rows and 150,000 words long, so any manual solution is impractical. I'm basically trying to produce a list of these words. I'm not sure if this is possible in Sheets at all, but if so I would like either a cell formula or a script.
I would like to create a list of every word used in this column and put it in column B. So B:B would be a list of words such as "the", "and", "hello", "world", etc. Each word should appear only once in column B. In column C, I would like a tally of how many times each of these words is used (I already have a formula for counting word usage; I just need something to produce the list of words).
For my purposes, a word is any string of characters separated by spaces, punctuation, or the beginning/end of a sentence. So " apple ", " wqehotasjclfkadsnqjlgea ", and " hello;" are all words. A case-insensitive formula would be preferable, but is not required.
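To make the requirement concrete, here is a minimal sketch of the logic in Python, run outside of Sheets on the exported contents of column A. It is only an illustration of the word definition given above, not the formula or script the question asks for. The function name `word_frequencies`, the `case_sensitive` flag, and the sample sentences are my own assumptions, as is the choice to treat every character that is not a letter or digit (including apostrophes and underscores) as a separator.

```python
import re
from collections import Counter

# Assumption: a "word" is a maximal run of letters and/or digits.
# Everything else (spaces, punctuation, underscores) separates words.
WORD_RE = re.compile(r"[^\W_]+")


def word_frequencies(sentences, case_sensitive=True):
    """Return a Counter mapping each unique word to its number of occurrences.

    `sentences` is any iterable of strings, e.g. the values of column A.
    Words are kept in first-seen order (Counter is a dict subclass).
    """
    counts = Counter()
    for sentence in sentences:
        if not case_sensitive:
            sentence = sentence.lower()
        counts.update(WORD_RE.findall(sentence))
    return counts


# Hypothetical sample data standing in for column A.
column_a = ["Hello world, hello;", " apple  the world."]

# Each (word, count) pair corresponds to one row of columns B and C.
rows_case_sensitive = list(word_frequencies(column_a).items())
rows_case_insensitive = list(word_frequencies(column_a, case_sensitive=False).items())
```

With the sample data, the case-sensitive call keeps "Hello" and "hello" as separate words, while the case-insensitive call merges them into a single "hello" row with a count of 2. A single pass over the text like this handles 150,000 words without difficulty.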