ABSTRACT: In today’s data-rich networked world, people express many aspects of their lives online. It is common to keep different aspects in different places: you might write opinionated rants about movies on your blog under a pseudonym while participating, under your real name, in a forum or web site for scholarly discussion of medical ethics. However, it may be possible to link these separate identities, because the movies, journal articles, or authors you mention come from a sparse relation space whose properties (e.g., many items are each connected to only a few users) allow re-identification. This talk examines this general problem in a specific setting: re-identifying users of a public web movie forum in a private movie ratings dataset.
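To make the sparsity intuition concrete, here is a minimal sketch of how rare items can link two profiles. It is not the method presented in the talk: the rarity weighting (`1 / log(1 + popularity)`), the function names `link_score` and `best_match`, and the toy data are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def link_score(mentioned, rated, popularity):
    """Sum rarity weights over the items the two profiles share.

    Items rated by few users get a large weight, so a single obscure
    overlap counts for more than several popular ones. (Assumed weighting.)
    """
    return sum(1.0 / math.log(1.0 + popularity[item]) for item in mentioned & rated)


def best_match(mentioned, private_ratings):
    """Return the private-dataset ID whose rated items best match the mentions."""
    # Popularity = number of private users who rated each item.
    popularity = Counter(item for items in private_ratings.values() for item in items)
    scores = {
        user_id: link_score(mentioned, items, popularity)
        for user_id, items in private_ratings.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy data: an anonymized ratings table and one forum user's mentions.
private_ratings = {
    "anon_1": {"Popular Hit", "Obscure Film A", "Obscure Film B"},
    "anon_2": {"Popular Hit", "Another Hit"},
    "anon_3": {"Popular Hit", "Another Hit", "Obscure Film C"},
}
forum_mentions = {"Popular Hit", "Obscure Film A", "Obscure Film B"}

match, scores = best_match(forum_mentions, private_ratings)
print(match)
```

In this toy case every private user rated "Popular Hit", so that overlap barely separates the candidates. The two obscure films, each rated by a single user, are what tie the forum profile to one record.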