Recently I have seen quite a few blog posts about re-evaluating the point values assigned to the different letter tiles in the Scrabble™ brand Crossword Game. The premise behind these posts is that the creator and designer of the game assigned point values to the different tiles according to their relative frequencies of occurrence in English text, supplemented by information gathered while playtesting the game. The points assigned to different letters reflected how difficult it was to play those letters: common letters like E, A, and R were assigned 1 point, while rarer letters like J and Q were assigned 8 and 10 points, respectively. These point values were based on the English lexicon of the late 1930s. Now, some 70 years later, that lexicon has changed considerably, having gained many new words (e.g., EMAIL) and lost a few old ones. So, if one were to repeat the game designer's analysis today, would one come to different conclusions about how points should be assigned to the various letters?
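To make the question concrete, here is a minimal sketch of the frequency half of that analysis in Python. The file name modern_lexicon.txt is a hypothetical stand-in for whatever word list one chooses, and turning the resulting frequencies into point values (say, by binning the rarest letters into higher-point tiers) is left as the judgment call it always was.

```python
from collections import Counter

def letter_frequencies(words):
    """Relative frequency of each letter across an iterable of words."""
    counts = Counter()
    for word in words:
        counts.update(word.upper())
    total = sum(counts.values())
    return {letter: counts[letter] / total for letter in sorted(counts)}

# Hypothetical usage with a modern word list, one word per line;
# "modern_lexicon.txt" is an assumed placeholder, not a real file.
with open("modern_lexicon.txt") as f:
    words = [line.strip() for line in f if line.strip().isalpha()]

for letter, freq in sorted(letter_frequencies(words).items(),
                           key=lambda kv: -kv[1]):
    print(f"{letter}: {freq:.4f}")
```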
Tuesday, January 1, 2013
The Skeleton Supporting Search Engine Ranking Systems
Posted by DTC at 7:09 PM
A lot of the research I'm interested in relates to networks: measuring the properties of networks and figuring out what those properties mean. While doing some background reading, I stumbled upon some discussion of the algorithms that search engines use to rank search results. The automatic ranking of the results that come up when you search for something online is a great example of how understanding networks (in this case, the World Wide Web) can be used to turn a very complicated problem into something simple.
Ranking search results relies on the assumption that there is some underlying pattern to how information is organized on the WWW: a few core websites contain the bulk of the sought-after information, surrounded by a group of peripheral websites that reference the core. Recognizing that the WWW is a network representation of how information is organized, and using the properties of that network to detect where the information is centered, are the key steps in figuring out which websites belong at the top of the search page.
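As a rough sketch of this idea, the snippet below runs the power-iteration form of PageRank, one well-known centrality measure of this kind (the post does not name a specific algorithm), on a toy web graph with exactly this core-and-periphery shape. Each page's score is the long-run probability that a random surfer who mostly follows links, and occasionally jumps to a random page, ends up there.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank on a dict {page: [outbound links]}."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Everyone gets the "random jump" share up front.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outbound in links.items():
            if outbound:
                # A page splits its current rank evenly among its links.
                share = damping * rank[page] / len(outbound)
                for target in outbound:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Toy graph: one "core" page cited by several peripheral pages.
web = {
    "core": ["peripheral1"],
    "peripheral1": ["core"],
    "peripheral2": ["core"],
    "peripheral3": ["core", "peripheral1"],
}
for page, score in sorted(pagerank(web).items(), key=lambda kv: -kv[1]):
    print(f"{page}: {score:.3f}")
```

Running this, the core page comes out with the highest score, because most of the peripheral pages' rank flows into it, which mirrors the core-and-periphery structure described above.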
Labels: centrality measures, networks, octopodes, search engine ranking, search engines