By replacing each of the letters in the word CARE with 1, 2, 9, and 6 respectively, we form a square number: 1296 = 36^2. What is remarkable is that, by using the same digital substitutions, the anagram, RACE, also forms a square number: 9216 = 96^2. We shall call CARE (and RACE) a square anagram word pair and specify further that leading zeroes are not permitted, neither may a different letter have the same digital value as another letter.
Using words.txt, a 16K text file containing nearly two-thousand common English words, find all the square anagram word pairs (a palindromic word is NOT considered to be an anagram of itself).
What is the largest square number formed by any member of such a pair?
NOTE: All anagrams formed must be contained in the given text file.
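To make the CARE / RACE example concrete, here is a minimal sketch that applies the digit substitution from the problem statement and checks that both results are squares. The helper names `word.to.number` and `is.square` are hypothetical and are not used elsewhere in this post.

```r
# Hypothetical helper: turn a word into a number using a letter -> digit map.
word.to.number <- function(word, mapping) {
  letters.in.word <- strsplit(word, "")[[1]]
  as.numeric(paste(mapping[letters.in.word], collapse = ""))
}

# Hypothetical helper: TRUE when n is a perfect square.
is.square <- function(n) {
  r <- round(sqrt(n))
  r * r == n
}

# Digit substitution from the problem statement: C=1, A=2, R=9, E=6.
mapping <- c(C = "1", A = "2", R = "9", E = "6")

care <- word.to.number("CARE", mapping)   # 1296
race <- word.to.number("RACE", mapping)   # 9216
both.square <- is.square(care) && is.square(race)
```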
rm(list=ls(all=TRUE))
#I want all large integers manipulated without scientific notation
options( scipen = 20 ) ##don't use scientific notation
options(digits=22)
words <- scan( file= paste(getwd(),"p098_words.txt", sep="/"), what="list", sep=",",skip=0, quote="\"")
> words <- scan( file= paste(getwd(),"p098_words.txt", sep="/"), what="list", sep=",",skip=0, quote="\"")
Read 1786 items
>
> counts <- nchar(words)
> max.counts <- max(counts)
> max.counts
[1] 14
>
> min.counts <- min(counts)
> min.counts
[1] 1
>
> words.len <- length(words)
>
> d <- data.frame(counts, words)
> head(d)
  counts   words
1      1       A
2      7 ABILITY
3      4    ABLE
4      5   ABOUT
5      5   ABOVE
6      7 ABSENCE
> tail(d)
     counts    words
1781      3      YET
1782      3      YOU
1783      5    YOUNG
1784      4     YOUR
1785      8 YOURSELF
1786      5    YOUTH
>
There are 1786 words; the longest has 14 characters and the shortest has 1. Since we have already been given the square anagram word pair CARE / RACE, I will assume the answer comes from words longer than 4 characters and ignore all words of 1 to 4 characters.
Write a function compare.word that will take 2 words, sort the characters and determine if the two words have the same characters.
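> compare.word <- function( a, b){
+   aa <- sort(strsplit(a, "")[[1]])
+   bb <- sort(strsplit(b, "")[[1]])
+   ##same length and same sorted letters; the length check avoids
+   ##vector recycling when the words differ in length
+   length(aa) == length(bb) && all(aa == bb)
+ }
>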
##process through all words looking for words that have the same characters
##get words from d by counts. Only consider words with 5 or more characters
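The search described in the comments above can be sketched as follows. This is one possible approach, not necessarily the loop used in this post: instead of comparing every pair of words with compare.word, it groups words by their sorted letters. The vector `sample.words` is a small stand-in for the real `words` vector, and `word.signature` is a hypothetical helper.

```r
# Stand-in data; replace with the `words` vector read from p098_words.txt.
sample.words <- c("HEART", "EARTH", "CARE", "RACE", "ABOUT", "YOUNG", "NIGHT", "THING")

# Hypothetical helper: the sorted letters act as an anagram signature.
word.signature <- function(w) paste(sort(strsplit(w, "")[[1]]), collapse = "")

# Only consider words with 5 or more characters.
long.words <- sample.words[nchar(sample.words) >= 5]
sigs <- vapply(long.words, word.signature, character(1), USE.NAMES = FALSE)

# Words sharing a signature are anagrams of each other.
groups <- split(long.words, sigs)
anagram.groups <- groups[lengths(groups) > 1]

# One row per anagram pair (a group of k words yields choose(k, 2) pairs).
word.pairs <- do.call(rbind, lapply(anagram.groups, function(g) t(combn(g, 2))))
```

With the stand-in list, CARE and RACE are dropped by the length filter and two pairs remain. Grouping by signature needs one pass over the list, whereas pairwise comparison grows with the square of the number of words.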
##d.nums is the data.frame that holds the squares
##and their lengths
temp.nums <- as.numeric(d.nums[d.nums$num.lengths==i,"nums"])
temp.len <- length(temp.nums)
> head(mynums)
  len row col     a     b
2   5  11   2 12100 10201
3   5  21   3 14400 10404
4   5 102   3 40401 10404
5   5 111   3 44100 10404
6   5  31   4 16900 10609
7   5  41   4 19600 10609
> tail(mynums)
      len   row   col         a         b
40991   9 21612 21516 999255321 993195225
40992   9 21526 21517 993825625 993258256
40993   9 21589 21530 997801744 994077841
40994   9 21574 21538 996854329 994582369
40995   9 21594 21573 998117649 996791184
40996   9 21623 21623 999950884 999950884
> nrow(mynums)
[1] 40994
>
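A table of square pairs like mynums could be built along the following lines. This is only a sketch under my own assumptions: the function `square.pairs` is hypothetical, it keeps just the a and b columns, and it lists each pair once with a < b, so its layout differs from the output shown above.

```r
# Hypothetical helper: all pairs of n-digit squares whose digits are
# rearrangements of each other.
square.pairs <- function(n.digits) {
  lo <- ceiling(sqrt(10^(n.digits - 1)))
  hi <- floor(sqrt(10^n.digits - 1))
  squares <- (lo:hi)^2
  # The sorted digits act as the signature of each square.
  sigs <- vapply(sprintf("%.0f", squares),
                 function(s) paste(sort(strsplit(s, "")[[1]]), collapse = ""),
                 character(1), USE.NAMES = FALSE)
  groups <- split(squares, sigs)
  groups <- groups[lengths(groups) > 1]
  if (length(groups) == 0) return(data.frame(a = numeric(0), b = numeric(0)))
  out <- as.data.frame(do.call(rbind, lapply(groups, function(g) t(combn(g, 2)))))
  names(out) <- c("a", "b")
  out
}

p4 <- square.pairs(4)
# The CARE / RACE squares from the problem statement appear as one pair.
has.care.race <- any(p4$a == 1296 & p4$b == 9216)
```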
There are about 41,000 square pairs. For each pair, assign the digits of one square to the letters of a word, rearrange the letters to match the digits of the other square, and check whether the resulting word is in the anagram list.
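The matching step can be sketched as a single check. The function `fits.pair` below is a hypothetical helper, not code from this post: it assigns the digits of square a to the letters of the first word, rejects the assignment if two letters share a digit or one letter gets two digits, and then tests whether the second word spells out square b.

```r
# Hypothetical helper: does the letter -> digit map that turns word1 into
# square a also turn word2 into square b?
fits.pair <- function(word1, word2, a, b) {
  l1 <- strsplit(word1, "")[[1]]
  l2 <- strsplit(word2, "")[[1]]
  d1 <- strsplit(sprintf("%.0f", a), "")[[1]]
  if (length(l1) != length(d1)) return(FALSE)
  # Each letter must map to exactly one digit ...
  if (length(unique(paste(l1, d1))) != length(unique(l1))) return(FALSE)
  # ... and different letters must not share a digit.
  if (length(unique(d1)) != length(unique(l1))) return(FALSE)
  mapping <- setNames(d1, l1)
  mapped <- paste(mapping[l2], collapse = "")
  # b has no leading zero, so a match here also rules out leading zeroes.
  mapped == sprintf("%.0f", b)
}

ok  <- fits.pair("CARE", "RACE", 1296, 9216)   # the example from the problem
bad <- fits.pair("CARE", "RACE", 1296, 2916)   # same digits, wrong arrangement
```

Because a pair may be stored in only one order, the check would need to be tried both as (a, b) and as (b, a) for each word pair.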