Tracking performed by Social Networks
In this blog post I analyze methods of user tracking which are performed by popular social network websites such as Facebook, Twitter, Xing, and recently Google+.
Each of these social networks has buttons (called Like, Tweet, Visitors, and +1 buttons) which are installed on numerous websites. I try to shed some light on the actions performed by those buttons and how they track users around the web, even when the users don't click them.
All these buttons have one thing in common: they are embedded in websites all around the web and load resources (scripts, images, etc.) which are fetched from the social networking website or its content delivery partners. The website operator embedding these buttons does not have complete control over what content is loaded in the context of the user's browser viewing the website.
In the next paragraphs I show some details about the code of these buttons and what happens when users view the
webpage located at
http://www.example.com/shop.jsp?product=4711. Let's assume that this is a popular shopping
site and the URL points to the product page of a certain product (identified by the parameter in the URL).
I differentiate between the following three cases for each social network while analyzing their abilities to track users surfing the web:
- The user is logged in at the social network site.
- The user is not logged in at the social network site.
- The user is not participating in the social network and therefore has no account.
Facebook's Like button
The IFrame version of the embed code for a Like button on the above-mentioned shopping site's product page looks like this:
<iframe src="http://www.facebook.com/plugins/like.php\
  ?href=http%3A%2F%2Fwww.example.com%2Fshop.jsp%3Fproduct%3D4711&\
  send=false&layout=standard&width=450&show_faces=false&action=like&\
  colorscheme=light&font&height=35"
  scrolling="no" frameborder="0"
  style="border:none; overflow:hidden; width:450px; height:35px;"
  allowTransparency="true"></iframe>
Similarly, the XFBML version of the same Like button looks like the following code. The website operator can either let the script determine the page to like dynamically or set it directly in the button's code.
<div id="fb-root"></div>
<script src="http://connect.facebook.net/en_US/all.js#xfbml=1"> </script>
<fb:like href="http://www.example.com/shop.jsp?product=4711" send="false" width="450" show_faces="false" action="like" font=""></fb:like>
From the above code samples you can see that the complete URL of the page the website operator equipped with a Like button is given as a parameter to Facebook. As both variants (IFrame and XFBML) are executed directly when viewing our example.com product page, Facebook sees that someone is viewing that page - even without the user having to click on the button. But is Facebook also able to see who is viewing the page? To answer this question we have to dig a bit deeper into the dynamic HTTP traffic exchanged while viewing the targeted page.
GET /plugins/like.php?\
  href=http%3A%2F%2Fwww.example.com%2Fshop.jsp%3Fproduct%3D4711\
  &send=false&layout=standard&width=450&show_faces=true\
  &action=like&colorscheme=light&font&height=80 HTTP/1.1
Host: www.facebook.com
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.7;\
  de; rv:188.8.131.52) Gecko/20100914
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: de-de,de;q=0.8,en-us;q=0.5,en;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Referer: http://www.example.com/shop.jsp?product=4711
Cookie: datr=3efxQeXXXXXXXXXXXho7EX;\
  act=1353XXXXX29%2F3;\
  c_user=2020XXXXX7006;\
  locale=de_DE;\
  lu=RgKp7uXXXXXiep_oISg;\
  sct=19XXX36;\
  xs=60%3A40bc92XXXXXXXXXX8dc88c8;\
  x-referer=http%3A%2F%2Fwww.facebook.com%2F%23%2F;\
  wd=1280x647
To answer our question regarding the visibility of the user to Facebook while surfing other sites: Take a look at the cookies submitted
to Facebook as part of the above shown HTTP GET request. The snippet contains the cookies which are sent to facebook.com on viewing the example.com
product webpage while the user is logged in at Facebook in the background. Most users of social networks stay logged in for a long time and use the
"keep me logged in" checkbox to get some sort of persistent login cookie.
The cookies sent include some (such as c_user) that can be treated as profile and/or session identifiers.
So the answer is: Yes, Facebook seems to be able to identify the user who is viewing the product page in our sample. And all this happens simply by surfing to a page which has the Like button embedded; it does not require the user to actually click the Like button.
You might ask what happens when the user explicitly logs out of Facebook and then visits our example.com page.
In such a case the user's browser sends a few fewer cookies to Facebook. For example, the c_user cookie seen above is missing.
Nevertheless, the datr cookie (which expires after two years) is still present even when logged out of Facebook. So Facebook seems to be able to track users and the pages they visit while surfing around the web even after they've logged out of Facebook.
Finally, the last open question is what happens when the user has never accessed the Facebook website (and is therefore not a Facebook user).
In that case of course no cookie is sent to Facebook. But as soon as the user visits, for example, www.facebook.com, a datr cookie valid for two years is set (valid for the whole facebook.com domain without any further path or subdomain restriction). This suggests that Facebook has the ability to track users who have visited a Facebook website at least once in the past. So Facebook can keep track of users' surfing habits on Like button enabled websites even before they register an account at Facebook. This past information could then theoretically be linked to the individual upon registration of an account using the same browser.
For those of you who want to explore this in more detail I recommend the article by Arnold Roosendaal: Tilburg Law School Research Paper No. 03/2011, Facebook tracks and traces everyone: Like this!. This article comes to a similar conclusion and is certainly worth reading.
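The three cases discussed in this section can be sketched in a few lines of Python. This is only an illustration under assumptions drawn from the request shown above: c_user is treated as a marker for a logged-in session and datr as the long-lived browser cookie. The function names and the classification labels are made up for this example and say nothing about how Facebook actually processes requests.

```python
def parse_cookie_header(header: str) -> dict:
    """Split a raw Cookie header value into a name -> value mapping."""
    cookies = {}
    for pair in header.split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep:  # skip empty or malformed fragments
            cookies[name] = value
    return cookies


def classify_visitor(cookie_header: str) -> str:
    """Guess the visitor's state from the cookies observed in this post.

    Assumption: c_user is only present while logged in, and datr is the
    long-lived cookie that survives a logout.
    """
    cookies = parse_cookie_header(cookie_header)
    if "c_user" in cookies:
        return "logged in (account identifiable)"
    if "datr" in cookies:
        return "logged out or no account, but browser recognizable"
    return "unknown browser (no cookies yet)"


# Shortened versions of the Cookie header from the request above.
logged_in = "datr=3efxQeXXXXXXXXXXXho7EX; c_user=2020XXXXX7006; locale=de_DE"
logged_out = "datr=3efxQeXXXXXXXXXXXho7EX; locale=de_DE"

print(classify_visitor(logged_in))
print(classify_visitor(logged_out))
print(classify_visitor(""))
```

The point of the sketch is that the second case still yields a stable identifier: the absence of c_user hides the account, but not the browser.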
Google's +1 button
Let's assume that the user visits our example.com product page, which has such a +1 button embedded. When the user is logged in at Google+ in the background while surfing to the product page, the following HTTP GET request is sent to plusone.google.com:
GET /u/0/_/+1/fastbutton?url=http%3A%2F%2Fwww.example.com%2F\
  shop.jsp%3Fproduct%3D4711&size=standard&count=false&\
  annotation=&hl=en-US&jsh=r%3Bgc%2F23217085-590ae8cc HTTP/1.1
Host: plusone.google.com
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.7;\
  de; rv:184.108.40.206) Gecko/20100914
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: de-de,de;q=0.8,en-us;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Connection: keep-alive
Referer: http://www.example.com/shop.jsp?product=4711
Cookie: PREF=ID=5db2XXXXXX921d:TM=13XXXX775:LM=13XXXX325:S=rMwJXXXXX4TR;\
  SID=DQAAAXXXXXXXXXX3CT_d2uhBD12d2mXXXXXXXXXX3Ew2fhjw1erhXXXXXXXXXX3\
  Sl18KWUXXXXXXXXXXXXX3C4Ewfwhj-bFoXXXXXXXXXXX8U86FXXXXXXXXXm5e\
  cTsdXXXXXXXXXwfkO-5wdlkwnd2uSBXXXXXXXXXXXXXXX5vIaT_XXXXXd1t; HSID=A2rkXXXXXX_1nA; SSID=AZu12fEXXXXDAB
Similar to the request sent to Facebook for pages that have Like buttons embedded, Google+ encodes the URL of the page the user visits into the request sent to plusone.google.com.
Several cookies are also sent to Google upon visiting pages that have the +1 button embedded. One of them in particular looks like a session ID and is always sent while the user is logged in at Google+ in the background.
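To illustrate how directly the visited page can be read from such a button request, here is a small Python sketch using only the standard library. The request target is copied from the +1 request above with the line continuations removed; the helper name visited_page is an assumption made for this example.

```python
from urllib.parse import urlsplit, parse_qs

# Request target copied from the +1 button request shown above
# (line continuations removed).
request_target = (
    "/u/0/_/+1/fastbutton?url=http%3A%2F%2Fwww.example.com%2F"
    "shop.jsp%3Fproduct%3D4711&size=standard&count=false"
    "&annotation=&hl=en-US&jsh=r%3Bgc%2F23217085-590ae8cc"
)


def visited_page(target: str, param: str = "url") -> str:
    """Return the decoded page URL carried in the given query parameter."""
    query = parse_qs(urlsplit(target).query)
    return query[param][0]


page = visited_page(request_target)
print(page)  # http://www.example.com/shop.jsp?product=4711

# The product identifier is one more parsing step away.
product = parse_qs(urlsplit(page).query)["product"][0]
print(product)  # 4711
```

The same helper works for the Like button request if called with param="href", since Facebook carries the page URL in a parameter of that name.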
When the user has logged out of Google+ before visiting the example.com shop's product page, a few fewer cookies are sent to Google upon the visit. But the PREF cookie, which is valid for two years, is still visible to Google, including an unchanging value for its ID field. This enables Google to track users visiting webpages that carry the +1 button even after the users have logged out of Google+. To achieve this, Google has to map the PREF cookie to the user's profile, which is theoretically possible since both the cookie carrying the session ID and the PREF cookie were visible to Google together while the user was logged in.
Now imagine the user has never registered an account at Google+. In such a situation no cookie will be sent to Google, of course.
But as soon as the user has visited the google.com site, a cookie named PREF, valid for two years, will be set. This cookie is then sent on any request to view a page which has a Google +1 button on it. So it looks like Google is able to track users' surfing habits on +1 button enabled websites even before they register with the Google+ service. Upon creation of a profile at Google+ using the same browser, this past data could then theoretically be linked to the individual. I find it quite interesting that even google.com searches and Google Maps requests (and maybe more of the google.com domain) can theoretically be tracked using the PREF cookie, since it is valid for the whole google.com domain and has no further path or subdomain restriction.
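The theoretical linking step can be modelled with a toy log keyed by the persistent cookie value. Everything in this sketch is hypothetical: the class TrackerLog, the cookie ID "pref-5db2", and the account name "alice" are invented for illustration, and nothing here claims that Google (or any other network) stores or joins data this way.

```python
from collections import defaultdict
from typing import Optional


class TrackerLog:
    """Toy model of a server-side log keyed by a persistent cookie ID."""

    def __init__(self) -> None:
        self.visits = defaultdict(list)  # cookie ID -> pages seen
        self.profiles = {}               # cookie ID -> account name

    def record(self, cookie_id: str, page: str,
               account: Optional[str] = None) -> None:
        """Store one button request; remember the account if a session was present."""
        self.visits[cookie_id].append(page)
        if account is not None:
            self.profiles[cookie_id] = account

    def history_for(self, account: str) -> list:
        """All pages seen under any cookie ID ever associated with the account."""
        pages = []
        for cookie_id, owner in self.profiles.items():
            if owner == account:
                pages.extend(self.visits[cookie_id])
        return pages


log = TrackerLog()
# Before registering: only the persistent cookie is sent.
log.record("pref-5db2", "http://www.example.com/shop.jsp?product=4711")
# Logged in: persistent cookie and session arrive together.
log.record("pref-5db2", "http://www.example.com/shop.jsp?product=4712",
           account="alice")
# After logging out: again only the persistent cookie.
log.record("pref-5db2", "http://www.example.com/shop.jsp?product=4713")

print(log.history_for("alice"))
```

A single request in which both identifiers appear together is enough to attribute the visits before registration and after logout to the same profile.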
Twitter's Tweet button
When a user surfs to the example.com page the following HTTP GET request will be sent to platform.twitter.com:
GET /widgets/images/t.gif?_=1314043621231&count=none&\
  id=twitter_tweet_button_0&lang=en&original_referer=\
  http%3A%2F%2Fwww.example.com%2Fshop.jsp%3Fproduct%3D4711\
  &text=&url=http%3A%2F%2Fwww.example.com%2Fshop.jsp%3Fproduct%3D4711\
  &via=johnXXXXXXXdoe&twttr_referrer=http%3A%2F%2Fwww.example.com%2F\
  shop.jsp%3Fproduct%3D4711&twttr_li=1&twttr_widget=1 HTTP/1.1
Host: platform.twitter.com
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.7;\
  de; rv:220.127.116.11) Gecko/20100914
Accept: image/png,image/*;q=0.8,*/*;q=0.5
Accept-Language: de-de,de;q=0.8,en-us;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Connection: keep-alive
Referer: http://platform.twitter.com/widgets/tweet_button.html
Cookie: k=79.193.XXX.XXX.131404XXXXX12158;\
  guest_id=v1%3A131XXXXXXXXX0080;\
  __utma=438XXX68.16XXXX4.131XXXXX42.1314XXXX42.1314XXXXX2.1;\
  __utmb=438XXX18.104.22.168XXXXX342;\
  __utmc=438XXX68;\
  __utmz=438XXX68.131XXXXX22.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none);\
  secure_session=true;\
  twid=u%3DXXXXXX43%7CLqsAlXXXXXXXXXXXmoSDk%3D;\
  twll=l%3D131XXX127
Aside from the multiple inclusions of the URL of the visited page in the request sent to Twitter, a handful of cookies are sent when the user surfing to example.com is logged in at Twitter in the background.
The cookies beginning with __utm belong to the Urchin Tracking Module used by Google Analytics.
It's quite interesting to see that Twitter seems to use Google Analytics.
The other cookies (especially guest_id) look a lot like identifiers of the Twitter account (i.e. the user surfing).
When the user surfing to the Tweet button enabled example.com site has logged out of Twitter, a few fewer cookies are sent.
Nevertheless, at least the guest_id cookie, valid for two years, is still sent to Twitter. This looks like the same behaviour that Facebook and Google exhibit for user tracking.
Finally, I tested what happens when the user has no account at Twitter and therefore no cookies to send.
In this test Twitter showed the most interesting results: Even upon the first visit to a webpage that includes the Tweet button (with no previous visit to twitter.com), a fresh guest_id cookie valid for two years is set while the user visits the example.com webpage, and it is then sent back to Twitter on subsequent requests.
This means that Twitter is even able to track the surfing habits (on Tweet button enabled websites) of users who have no Twitter account and have never visited a Twitter website before. When the same browser is later used to create an account at Twitter, this previously collected data can theoretically be linked to the freshly created profile. As with Facebook and Google, Twitter's guest_id cookie is valid for the whole twitter.com domain and has no further path or subdomain restriction.
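Why a cookie without path or subdomain restriction reaches the widget host can be shown with a simplified version of the matching rules browsers apply. The sketch below follows the spirit of the domain-match and path-match rules in RFC 6265 but leaves out details such as IP address hosts and public suffix checks; the function names are assumptions for this example.

```python
def domain_match(request_host: str, cookie_domain: str) -> bool:
    """Simplified domain matching: exact host or any subdomain of it."""
    host = request_host.lower().rstrip(".")
    domain = cookie_domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)


def path_match(request_path: str, cookie_path: str) -> bool:
    """Simplified path matching: the cookie path must be a path prefix."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


def cookie_is_sent(host: str, path: str,
                   cookie_domain: str, cookie_path: str = "/") -> bool:
    """Would a browser attach the cookie to a request for host + path?"""
    return domain_match(host, cookie_domain) and path_match(path, cookie_path)


# guest_id scoped to the whole twitter.com domain with path "/":
print(cookie_is_sent("platform.twitter.com", "/widgets/images/t.gif", "twitter.com"))
# The embedding shop itself never receives the cookie:
print(cookie_is_sent("www.example.com", "/shop.jsp", "twitter.com"))
# A cookie restricted to a different subdomain would not reach the widget host:
print(cookie_is_sent("platform.twitter.com", "/widgets/images/t.gif", "www.twitter.com"))
```

The third call shows the effect of a subdomain restriction: a cookie bound to one subdomain is not attached to requests for a sibling subdomain that serves the button.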
Xing's Visitors widget
This widget shows, when installed on our example.com page, how many visitors of Xing's userbase visited that page.
Compared to Facebook, Google+, and Twitter, this widget's embed code is rather thin: It consists only of an image tag (to render the widget, which includes the visitors counter) surrounded by a link. That's all:
<a href="http://www.xing.com/de/directories/people/">
<img src="https://www.xing.com/widgets/visitor_counters/102XXX77_e1XX17?\
label=People%20Directory" alt="People Directory" /></a>
Despite its simplicity, Xing's Visitors widget needs a way to track visitors of the page/site in order to increment its counter. Let's examine how this is done: When a user surfs to the example.com page, the following HTTP GET request will be sent to www.xing.com:
GET /widgets/visitor_counters/102XXX77_e1XX17?label=People%20Directory HTTP/1.1
Host: www.xing.com
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:6.0)\
  Gecko/20100101 Firefox/6.0
Accept: image/png,image/*;q=0.8,*/*;q=0.5
Accept-Language: de-de,de;q=0.8,en-us;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
DNT: 1
Connection: keep-alive
Referer: http://www.example.com/shop.jsp?product=4711
Cookie: language=de;\
  _session_id=f540XXXXXXXXXXXXXXXXXXXXXXXXf159;\
  xing=|U2XXXXXkX1-uXXXXXXXXXXXXXXXXRUXm-13XXXXXXXXXXXXXXXXXXXXXXXXcUpc\
  XnXXXXg5-QXXXXXXXXXp_o9lXXXXXXXXXXXXXXXXa5ayf_jCXXXX7m_lXXS-qXX|;\
  s_cc=true;\
  s_nr=13144XXXXXX12;\
  s_lastvisit=1314469726135;\
  s_sq=%5B%5AB%5B%5C;\
  s_vi=[CS]v1|27XXXXXXXXXA314B-4001XXXXXXXX60A2[CE];\
  xing_ssl=1
If-None-Match: "7bf8ac528ddb29be45c0e2d08033c7ee"
As you can see, only the Referer header allows Xing to see where (on which URL carrying the widget) the user is surfing. But it's interesting to inspect
the cookies sent along to xing.com when the user surfing is logged in at the social network: The non-persistent cookies
When the user has logged out of Xing, a few fewer cookies are sent on visiting the example.com page. But like the other social networks, Xing still receives a persistent cookie: The s_vi cookie expires after five years, is valid for the whole xing.com domain, and has no further path or subdomain restriction. The content of this cookie seems to be a unique identifier and can therefore be used (in theory) to track logged-out users visiting pages carrying the Visitors widget. This is possible since the s_vi cookie was also sent to Xing while the user was logged in, before logging out and visiting example.com.
Finally, I tested what happens when the user has no account at Xing and therefore no cookies to send: In this scenario no relevant cookies are sent to Xing, of course. But as soon as the user visits xing.com, a persistent s_vi cookie valid for five years is assigned to the user's browser. So all further visits to pages carrying Xing's Visitors widget can in theory be tracked using that cookie. When the user registers an account at Xing much later, this past information (in case it was saved at Xing) could be linked to the user's profile.
Further inspection of the s_vi cookie soon leads to a product called SiteCatalyst, which is capable of visitor tracking.
Personally, I believe that Xing is using that product to track visitors only within its own site and that Xing is not using the tracking potential which lies in linking that product's cookie to its own user profiles. But this would be much clearer if Xing served the Visitors widget image from a distinct subdomain and restricted the s_vi cookie to another subdomain, since only then would the mentioned potential for tracking logged-out users on other websites no longer exist.
Xing's Share button
Independently of the Visitors widget, Xing also offers a Share button to embed in webpages. This button's code is designed very fairly, since it tracks nothing upon viewing a page with such a button embedded. The very first request to the image file sends the referrer and the cookies to Xing (as the Visitors widget does on every request), but all subsequent requests are served from the cache and no data (referrer and/or cookies) is sent to Xing. That's the case because the image response of this button carries long-lasting cache headers.
I think this is a very fair situation compared to the other social networks. But since the Visitors widget is still capable of tracking users (that's what it was designed for: counting visitors from the Xing userbase) Xing's tracking capabilities can be compared to those of Google+ and Facebook, as long as the Visitors widget is used on other webpages and served from the www.xing.com domain. From my personal perspective the Visitors widget is deployed much less than the fair and user-friendly Share button of Xing.
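The caching behaviour described for the Share button can be sketched as a simple freshness check. This is a rough approximation of HTTP cache freshness that only looks at Cache-Control: max-age, no-cache, and no-store and ignores Expires, validators, and heuristic freshness. The header values used below are hypothetical; the actual headers Xing sends are not quoted in this post.

```python
import re


def max_age_seconds(cache_control: str):
    """Extract max-age from a Cache-Control header value, or None if absent."""
    match = re.search(r"(?:^|,)\s*max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def needs_network_request(cache_control: str, age_seconds: int) -> bool:
    """True if a cached response of the given age may not be reused as-is."""
    if re.search(r"\bno-(?:store|cache)\b", cache_control, re.IGNORECASE):
        return True
    max_age = max_age_seconds(cache_control)
    return max_age is None or age_seconds >= max_age


one_day = 24 * 60 * 60

# Hypothetical long-lived image: reused from cache, so no referrer or cookies leave the browser.
print(needs_network_request("public, max-age=31536000", one_day))
# Hypothetical widget that must be refetched on every page view.
print(needs_network_request("no-cache", one_day))
```

Under these assumptions only the first page view triggers a request; as long as the cached image stays fresh, later page views leave no trace at the serving host.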