User identification across multiple social sites

We collected the profile information of a set of Google+ users who make the links to their Facebook or Twitter profiles publicly available. For more details about how we retrieved the data, see the draft paper “User Identification across Online Social Networks in Practice: Pitfalls and Solutions” on the Publications page.


The zip file contains the following JSON files:

  • facebookProfiles.json: Public information about Facebook profiles linked to Google+ profiles.
  • google_plusProfile_withFacebookLinks.json: Public information about Google+ profiles containing a link to the Facebook profile.
  • google_plus_and_facebook_ids.json: Explicit connection between Google+ and Facebook profiles.
  • google_plus_and_twitter_profiles.json: Google+ and Twitter information about the same user in the two social networks. Fields starting with ‘T’ refer to Twitter, while those starting with ‘G’ refer to Google+.
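
As a rough illustration of the field-prefix convention in google_plus_and_twitter_profiles.json, the Python sketch below splits one record into its Twitter and Google+ parts. The helper split_by_network and the field names in the sample record (T_name, G_name, G_url) are hypothetical; the actual keys and the top-level layout of the file should be checked against the data itself.

```python
import json


def split_by_network(record):
    """Split one combined record into (twitter_fields, google_plus_fields).

    Relies only on the documented convention: keys starting with 'T'
    refer to Twitter, keys starting with 'G' refer to Google+.
    Keys with any other prefix are ignored.
    """
    twitter = {k: v for k, v in record.items() if k.startswith("T")}
    google_plus = {k: v for k, v in record.items() if k.startswith("G")}
    return twitter, google_plus


# Hypothetical record for illustration only; real field names may differ.
sample = json.loads('{"T_name": "alice", "G_name": "Alice A.", "G_url": "x"}')

twitter_fields, google_plus_fields = split_by_network(sample)
```

For the real file, each record would first be obtained with json.load (or line by line with json.loads, if the file turns out to hold one record per line) before being passed to the helper.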

The following files contain the feature matrix and the test sets built by the methods presented in the draft:

  • Google+ and Facebook train/test sets
  • Google+ and Twitter train/test sets