Slide 1:
Serendipity is another way to say good luck. It is the concept, and the belief, that something fortuitous arises from a confluence of factors.
Slide 2:
Waldo Tobler's First Law of Geography states that things near you have more influence than things farther away. This idea has been applied in a number of situations.
Hecht, Brent and Emily Moxley. "Terabytes of Tobler: Evaluating the First Law in a Massive, Domain-Neutral Representation of World Knowledge." COSIT'09 Proceedings of the 9th International Conference on Spatial Information Theory, 2009, pp. 88-105.
Sui, D. "Tobler’s First Law of Geography: A Big Idea for a Small World?" Annals of the Association of American Geographers, 94(2), 2004, pp. 269–277.
Tobler, W. "On the First Law of Geography: A Reply." Annals of the Association of American Geographers, 94, 2004, pp. 304-310.
Slide 3:
Will Wright recently gave a talk at O'Reilly's Whereconf titled "Gaming Reality." One of his points was related to Tobler's First Law of Geography: the things that are closest are most likely to be of interest.
Slide 4:
Proximity can be measured along many different dimensions: spatial, temporal, social network, and conceptual.
Slide 5:
Slide 6:
Social media, such as Twitter, have been analyzed to determine states of emotion, which Dan Zarrella has mapped. This is just one example of how data can be used to find a person's proximity to an emotion based on location and time.
Zarrella, D. Using Twitter Data to Map Emotions Geographically. The Social Media Scientist, May 7th 2012
Slide 7:
Connections between people in the form of social networks or social graphs provide a rich source of data for measuring conceptual phenomena. For example, Klout declares that it is a measure of influence, LinkedIn can be a measure of a person's professional sphere, and Pinterest can reflect the material culture of a person or a group.
Slide 8:
Will Wright postulated that there are at least 50 different dimensions where proximity creates a value gradient. The closer something is to a person, the greater its value along the gradient. These gradients can be emotions, communities of interest, school affiliations, or any number of factors that can influence a person's behavior and choices. By bringing all these dimensions to bear on a person, it could be possible to build game dynamics that take advantage of physical world behaviors.
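One way to picture a value gradient across several dimensions is sketched below. This is only an illustration: the exponential decay, the dimension names, the equal default weights, and the function names gradient_value and combined_value are all assumptions, not anything Will Wright specified.

```python
import math

def gradient_value(distance, scale=1.0):
    """Value along one dimension: 1.0 at distance 0, falling off with distance.

    The exponential form is an assumption chosen for simplicity.
    """
    return math.exp(-distance / scale)

def combined_value(distances, weights=None):
    """Weighted average of per-dimension values.

    distances maps a dimension name to a non-negative distance.
    weights (hypothetical) lets some dimensions count more than others.
    """
    if weights is None:
        weights = {name: 1.0 for name in distances}
    total_weight = sum(weights[name] for name in distances)
    weighted = sum(weights[name] * gradient_value(d) for name, d in distances.items())
    return weighted / total_weight

# Two made-up items: one close to the person on every dimension, one far away.
near = {"spatial": 0.2, "temporal": 0.1, "social": 0.5, "conceptual": 0.3}
far = {"spatial": 3.0, "temporal": 2.0, "social": 4.0, "conceptual": 2.5}

print(combined_value(near), combined_value(far))
```

Under these assumptions the nearby item always scores higher than the distant one. How to weight 50 or more dimensions against each other is left open.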
Slide 9:
Measurement of the value gradient is the first step in engineering serendipity. There are a number of ways of quantifying the value gradient, but proximity is often modeled on a network structure. Nodes in the network represent people and possible dimensions of interest, and the connections (or links) between nodes can measure the gradient.
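The network idea above can be sketched as a small weighted graph, with shortest-path length standing in for proximity. The node names and link weights are invented, and using shortest-path distance as the measure is an assumption. It is one of several reasonable choices.

```python
import heapq

# Hypothetical graph: nodes are people ("alice", "bob", "carol") and
# dimensions of interest ("jazz", "cycling"). Link weights are distances,
# so a smaller weight means a closer connection.
graph = {
    "alice": {"bob": 1.0, "jazz": 0.5},
    "bob": {"alice": 1.0, "cycling": 0.4, "carol": 2.0},
    "carol": {"bob": 2.0, "jazz": 1.5},
    "jazz": {"alice": 0.5, "carol": 1.5},
    "cycling": {"bob": 0.4},
}

def network_distance(graph, source):
    """Dijkstra's algorithm: shortest weighted path from source to each reachable node."""
    dist = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry; a shorter path was already found
        for neighbor, weight in graph[node].items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return dist

distances = network_distance(graph, "alice")
for node, d in sorted(distances.items(), key=lambda item: item[1]):
    print(node, round(d, 2))
```

Here alice is closer to "jazz" than to "cycling", which she reaches only through bob.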
Slide 10:
Will Wright suggested Central Place Theory as one model for understanding the effects of proximity. It is a classic geographic model proposed by Walter Christaller for explaining the hierarchy of places. When applied to influencing serendipity, the concepts of threshold and range are key to using the model to measure the influence of proximity. Threshold is the minimum interaction of a dimension needed to influence a person, whereas range is the maximum distance a person will 'travel' to acquire something.
Dempsey, C. Distance Decay and Its Use in GIS. GIS Lounge, 3/15/12.
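Threshold and range, as described for this slide, can be read as two cutoffs applied to a distance-decay curve. The sketch below is one possible reading, not Christaller's formulation: the function name influence, its parameters, and the exponential decay are all assumptions.

```python
import math

def influence(interaction, distance, threshold, max_range, decay=1.0):
    """Influence of one dimension on a person.

    Returns 0.0 when the interaction is below the threshold or the distance
    exceeds the range; otherwise the interaction decays with distance.
    """
    if interaction < threshold or distance > max_range:
        return 0.0
    return interaction * math.exp(-decay * distance)

# Made-up numbers: threshold of 3 units of interaction, range of 2 units of distance.
print(influence(5, 0.5, threshold=3, max_range=2.0))  # within both limits
print(influence(2, 0.5, threshold=3, max_range=2.0))  # below threshold
print(influence(5, 3.0, threshold=3, max_range=2.0))  # beyond range
```

The two cutoffs are separate. Strong interaction cannot compensate for something that lies out of range, and being close does not help if the interaction is below the threshold.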
Slide 11:
There are a number of ways to measure the effects and/or the importance of links. Google's PageRank algorithm is perhaps the most famous. PageRank indicates the importance of a page based on the number and importance of its incoming links. Another form of link analysis, used by the intelligence community, focuses on the transactions between people, organizations, places, and time, as exemplified by Palantir software.
Holden, C. Osama Bin Laden Letters Analyzed. Analysis Intelligence. May 4, 2012.
Holden, C. From the Bin Laden Letters: Mapping OBL’s Reach into Yemen. Analysis Intelligence. May 11, 2012.
Page, Lawrence and Brin, Sergey and Motwani, Rajeev and Winograd, Terry (1999) The PageRank Citation Ranking: Bringing Order to the Web. Technical Report. Stanford InfoLab.
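For readers who want to see link importance computed, here is a minimal power-iteration sketch of PageRank. The four-page graph is invented. The damping factor of 0.85 and the even redistribution of rank from pages with no outgoing links are common conventions, assumed here for the example. It is not Google's production algorithm.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    links maps each page to the list of pages it links to; every linked
    page must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # Page with no outgoing links: spread its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical graph: a, b, and d all link to c; c links back to a.
links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy graph c ranks highest because it has the most incoming links. Page a outranks b and d even though it has only one incoming link, because that link comes from the highly ranked c.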
Slide 12:
Ultimately, all these measures of proximity are attempting to answer this question: "If your friend Joey jumped off a bridge, would you jump?" That is, would you jump off a bridge because everyone is doing it (social influence/contagion), or would you jump because you are similar to Joey (homophily)? A recent paper, Homophily and Contagion are Generally Confounded in Observational Network Studies, posits both the subject and the answer in its title.
Shalizi, C. and A. Thomas. Homophily and Contagion are Generally Confounded in Observational Network Studies. Sociological Methods and Research, vol. 40 (2011), pp. 211-239.
Slide 13:
The comic XKCD manages to summarize the result in a single panel. We don't know.
Slide 14:
The maxim "All models are wrong, but some are useful" has long been a truism in research. The idea that models are not only wrong, but that research can be successful without them, is starting to gain currency in the era of Big Data. Access to very large datasets and the capability to manipulate them inexpensively is changing how research is performed.
Slide 15:
With large numbers on our side, petabytes or even yottabytes of data can reveal patterns that cannot be detected in sampled data.
Slide 16:
Flip Kromer of Infochimps illustrates how a preponderance of data lets us determine the boundaries of the places called Paris and which of those locations is meant in a particular context.
Slide 17:
Third-party agents are continuously collecting information about people from social media, social networks, and ecommerce. This provides a wealth of data about people from a third-party perspective. In addition, the quantified self is a concept where individuals document every aspect of their lives in order to optimize their day-to-day interactions.
However, Goodhart's law states that any indicator used to influence a particular behavior will become less useful as an indicator. In other words, users will game the system and degrade the quality of the information in order to achieve the objective.
Slide 18:
There is an emerging corollary to the concept of the quantified self. Rather than a continuous collection of data, there is an alternate source of data that reflects information selected and shared, but not for the purpose of participating in social networks, i.e., a view into a person's internal life. For example, Amazon collects highlighted phrases from Kindle users as well as wish lists, which represent material culture.
Carrigan, M. Mass observation, quantified self, and human nature. markcarrigan.net. April 19, 2012.
Currion, P. The Qualified Self. The Unforgiving Minute. November 30, 2011.
Slide 19:
To bring it back to serendipity, perhaps it's time to re-evaluate how we understand the way multiple factors affect an individual's choices. Models based on physical properties such as proximity may lack the nuance necessary to explain a behavior. Simply creating a confluence of events within many possible proximal dimensions may not be enough to explain or influence behavior. However, a new alternative is possible through the use of big data and the tools of machine learning and algorithms to describe behavior. We should harness these tools to better understand the factors that affect serendipity and let go of Newtonian models that reduce the rich interplay of social factors.