Thursday, March 1, 2012

yogi_Determine Unique Count Of Entities Based On Characters A to Z Only Regardless Of Case

Yogi Anand, D.Eng, P.E.                Google Spreadsheet          Post  #452          www.energyefficientbuild.com
user timerecords said:
Help writing a function
So I've written a doc that has all of the top movie lists as different sheets(1001 movies to watch before you die, afi top 100, imdb top 250, etc). It is set to where a person can go in and click on each sheet and put a 1 in a cell next to the movie if you've seen it and I have data accrued from there(amount seen, amount unique seen for all users, etc).
Currently I've got each list in a different sheet and the user goes through and clicks if they've seen it. I want to have one central list with all the movie titles and all people have to do is look through it once, and click if they've seen it and it will fill out the rest of these lists.
My issue is that when i copied the lists onto the sheet the formats weren't all the same, i.e. one would say Godfather, one would day GODFATHER, one would say Godfather(1972).
My question, how do I pull out all of the unique films(each film only listed once) from all these lists and not pull out three different titles for the same film? The rest I can do.
Please help,
Tim
------------------------------------------------------
following is a solution to the problem

5 comments:

  1. I see that it works, which is awesome, but I am having difficulties understanding it. Which I would like to.

    ReplyDelete
  2. Also it is cutting some of the names off after a punctuation it seems. dr. strangelove is cut to dr.
    it's a wonderful life is cut to it.
    Any thoughts?
    You're awesome

    ReplyDelete
  3. Hi Tim:

    I have set it up to extract only alphabetical characters and space characters only from the name and disregard everything else.

    If you can tighten up your specification so that it is unambiguous and covers all the bases, I can update my formula accordingly.

    In regard to your question, how my formula works -- I have used regular expression function REGEXEXTRACT to pull only alphatecial characters regardless of case and space characters from listed Film names, then I extract from this modified list UNIQUE names, and finally I use the COUNTUNIQUE function to get the count of unique names only.

    Cheers!
    Yogi

    ReplyDelete
  4. When I apply it to all the lists it is not working.
    Here is the link for what I've got going on. The "unique" values on the right side are not unique.
    https://docs.google.com/spreadsheet/ccc?key=0Auk8e_QiDnJOdFZ1eER4QjI5dG9salBfMWlET0pfVmc

    Thanks again

    ReplyDelete
  5. Hi Tim:

    You have not tightened up your specification in regard to naming of the films. I have however, updated my formulas to cover all the cases you have specifically mentioned so far.

    Cheers!
    Yogi

    ReplyDelete