WHY THIS MATTERS IN BRIEF
When it comes to our online and offline activities, privacy is at the top of most people’s minds, and now Google is embedding Federated AI to try to redress the balance.
Tim Berners-Lee, the inventor of the World Wide Web, thinks the web is broken and needs reinventing. While he works on that, Google has announced that its Chrome browser is ditching third-party cookies for good in favour of a form of Artificial Intelligence (AI) called Federated Learning, which captures, aggregates and analyses user data without exposing an individual’s data or unique identifiers. Many critics think the technology will help increase people’s privacy on the web. If all goes according to plan, future updates to the world’s most popular web browser will rewrite the rules of online advertising and make it far harder to track the web activity of billions of people. But it’s not that simple. What seems like a big win for privacy may, ultimately, only serve to tighten Google’s grip on the advertising industry and the web as a whole.
Critics and regulators argue the move risks putting smaller advertising firms out of business and could harm websites that rely on adverts to make money. For most people, the change will be invisible but, behind the scenes, Google is planning to put Chrome in control of some of the advertising process. To do this it plans to use browser-based machine learning to log your browsing history and lump people into groups alongside others with similar interests.
“They’re going to get rid of the infrastructure that allows individualised tracking and profiling on the web,” says Bennett Cyphers, a technologist at civil liberties group the Electronic Frontier Foundation. “They’re going to replace it with something that still allows targeted advertising – just doing it a different way.”
Google’s plan to replace third-party cookies comes from its Privacy Sandbox, a set of proposals for improving online adverts without obliterating the ad industry. Aside from getting rid of third-party cookies the Privacy Sandbox also deals with issues such as advertising fraud, reducing the number of CAPTCHAs people see and introducing new ways for companies to measure the performance of their ads. Many Google critics say parts of the proposals are an improvement on the existing setup and good for the web.
Change is necessary. The online advertising industry is, to put it mildly, unwieldy. It comprises billions of data points about all of our lives that are automatically traded every second of every day. Such a substantial change to this system will impact a raft of businesses, from brands advertising products and services online to the ad tech networks and news organisations that propel those ads to every corner of the web.
The Privacy Sandbox proposals are complicated and technical. Google is already testing some while others remain firmly at the development stage. Privacy Sandbox is documented online and Google has altered its plans based on feedback and counter-proposals from rivals. But, ultimately, when it comes to Chrome, everything is controlled by Google.
The removal of third-party cookies from Chrome, first announced in January 2020, has been a long time coming.
“Third-party cookies were awful,” Cyphers says. “They were the most privacy-invasive technology in the world for a while.” When Google does remove them in 2022, it won’t be first – but its huge market share does mean it will have the biggest impact. Apple’s Safari, the second biggest browser behind Chrome, limited cookie tracking in 2017. Mozilla’s Firefox blocked third-party cookies in 2019 – the problem is so vast that the browser is currently blocking ten billion trackers per day.
If you’re using Chrome at the moment then the websites you visit, with a few exceptions, will add a third-party cookie to your device. These cookies – small pieces of data stored by your browser – can be used to track your browsing history and display ads based on it. Third-party cookies send the data they collect back to a domain other than the one you’re currently on. First-party cookies, by comparison, beam data back to the owners of the domain you’re visiting at the time.
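To make the first-party versus third-party distinction concrete, here is a minimal Python sketch. It is an illustration only: the function names are hypothetical, and it assumes a site's "owner" can be guessed from the last two labels of a hostname, whereas real browsers rely on the Public Suffix List to make that call.

```python
def registrable_domain(host: str) -> str:
    """Guess the site a hostname belongs to.

    Assumption for illustration: the last two labels identify the site.
    This is wrong for suffixes such as "co.uk"; browsers use the
    Public Suffix List instead.
    """
    labels = host.lower().strip(".").split(".")
    return ".".join(labels[-2:])


def is_third_party(page_host: str, cookie_domain: str) -> bool:
    """Return True when the cookie belongs to a different site than the page."""
    return registrable_domain(page_host) != registrable_domain(cookie_domain)


# A cookie set for the site you are reading is first-party...
print(is_third_party("www.news.example", ".news.example"))
# ...while one set for an ad network's domain is third-party.
print(is_third_party("www.news.example", "ads.tracker.example"))
```

The point of the sketch is only that "third-party" describes a mismatch between the page you are on and the domain the cookie reports back to, not anything special about the cookie itself.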
Third-party cookies are the main reason why the shoes you looked at two weeks ago are still stalking you around the web. All the data gathered by third-party cookies is used to build user profiles, which can include your interests, the things you buy and behaviours online – this can be fed back to murky data brokers.
“The intention really was to initiate a certain set of proposals about how older technologies like third-party cookies, as well as others, can be replaced by privacy-preserving API alternatives,” says Chetna Bindra, a product lead on Google’s ads business.
So what are the alternatives? Google’s plan is to target ads against people’s general interests using an AI system called Federated Learning of Cohorts (FLoC), which as the name suggests is based on the concept of Federated Learning, a relatively new technique that lets AIs gather and analyse data on all manner of people and topics without compromising their privacy. The machine learning system takes your web history, among other things, and puts you into a certain group based on your interests. Google hasn’t defined what these groups will be yet but they will include thousands of people that have similar interests. Advertisers will then be able to put ads in front of people based on the group they’re in. If Google’s AI works out you really like sneakers, for example, then you’ll be chucked in a group with other similarly minded sneaker fans.
It all works in a similar way to how Netflix’s algorithm works out what you might like to watch. In essence, your viewing history is similar, but not identical to, plenty of others. If Person A and Person B both like the same four horror films, for example, then chances are Person A will like a fifth horror film that Person B has just watched. Now just expand that out to cover billions of people.
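The Person A and Person B reasoning can be sketched in a few lines of Python. This is a toy illustration, not Netflix’s or Google’s actual method: it assumes each person’s history is a simple set of titles and measures overlap with Jaccard similarity, and every name in it is made up for the example.

```python
def jaccard(a: set, b: set) -> float:
    """Share of items two histories have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target: set, others: dict) -> list:
    """Suggest items seen by the most similar other person but not by the target."""
    if not others:
        return []
    # Pick whoever's history overlaps most with the target's.
    closest = max(others, key=lambda name: jaccard(target, others[name]))
    return sorted(others[closest] - target)


person_a = {"horror 1", "horror 2", "horror 3", "horror 4"}
others = {
    "person_b": {"horror 1", "horror 2", "horror 3", "horror 4", "horror 5"},
    "person_c": {"romcom 1"},
}
print(recommend(person_a, others))  # ['horror 5']
```

Real systems work with far richer signals and billions of histories, but the underlying idea is the same: people with overlapping behaviour are assumed to share interests.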
Unlike with third-party cookies, all the data used to determine what group you go in will be processed in Chrome. Third-party cookies, by contrast, are sprayed around like confetti. Bindra claims that, despite this fundamental change in how data is stored and processed, the system is 95 per cent as effective at targeting ads as third-party cookies. Others have questioned this claim.
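How might a browser sort people into groups without the raw history ever leaving the device? The article does not describe Google’s algorithm, so the Python sketch below is an assumption for illustration only: it uses a SimHash-style locality-sensitive hash, in which similar sets of visited domains tend to produce similar short IDs, and only that ID would be shared. The function name and the 8-bit cohort size are hypothetical.

```python
import hashlib


def simhash_cohort(domains, bits: int = 8) -> int:
    """Collapse a browsing history into a short cohort ID, computed locally.

    Assumption for illustration: a SimHash over visited domains. Each domain
    votes on every bit of the ID, so histories that share many domains tend
    to agree on most bits.
    """
    votes = [0] * bits
    for domain in set(domains):
        digest = hashlib.sha256(domain.encode("utf-8")).digest()
        value = int.from_bytes(digest[:8], "big")
        for i in range(bits):
            votes[i] += 1 if (value >> i) & 1 else -1
    cohort = 0
    for i, vote in enumerate(votes):
        if vote > 0:
            cohort |= 1 << i
    return cohort


history = ["sneakers.example", "running.example", "news.example"]
# Only this small number, not the list of sites, would need to leave the browser.
print(simhash_cohort(history))
```

With 8 bits there are at most 256 possible groups, so each ID is shared by many people; the trade-off between group size and targeting precision is exactly what critics of the scheme are debating.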
One potential issue with the machine learning system is what it can infer about people.
“Since FLoC uses your browsing history to assign you to interest-based cohorts, the end result is akin to a super-tracker that is present on even more websites than Google Analytics,” says Kamyl Bazbaz, vice president of communications at the search engine DuckDuckGo.
While FLoC means less personal data is sent to third parties than with the current cookie setup, there are concerns about how people will be grouped together and whether the automated process that does this will discriminate against certain groups.
“The FLoC clustering algorithm that Google is proposing would be handled by Google themselves, and common for all web users,” says Basile Leparmentier, a senior machine learning engineer at advertising tech firm Criteo, which has proposed its own Privacy Sandbox alternatives. “Google would therefore have the power to modify this algorithm whenever it wanted to.”
If other browsers choose to adopt the machine learning setup – Yahoo! Japan is said to be interested – they may be able to change the grouping for their own use.
Leparmentier and others commenting publicly on Google’s FLoC proposals have questioned whether the system will group people by sensitive attributes such as race, sexual orientation or disability. The system may be able to infer this sensitive information through people’s general behaviours and interests. For example, Facebook’s advertising algorithms have been found to show teaching and secretarial jobs to women more than men. In 2019 Facebook was charged by the US Department of Housing and Urban Development for ads that discriminated against people based on their race. The same risks exist with FLoC and Google engineers have acknowledged the potential for algorithmic bias.
“If an online attacker was looking to target a specific group based on their ethnicity or their religion, this attacker is then able to target the relevant FLoC ID group however they see fit,” Leparmentier says.
Google will start trialling FLoC later this month – but will only use websites that have tracking enabled or are already serving display advertising. The company also says it is against its ad policies to serve personalised ads based on sensitive categories. FLoC groups that reveal people’s race, sexual orientation and other categories will be blocked or, if that’s not possible, Google says it will change its algorithm to “reduce the correlation”.
Such is its scope and potential impact, Privacy Sandbox is attracting plenty of regulatory scrutiny. On January 8, the UK’s Competition and Markets Authority (CMA) revealed it was investigating Privacy Sandbox alongside the data protection regulator, the Information Commissioner’s Office (ICO). The CMA says that its investigation is “moving at speed” but it has yet to reach any conclusions about Privacy Sandbox’s impact on competition.
The CMA did, however, outline some of its worries in a wide-ranging July 2020 review of digital advertising. Blocking third-party cookies in Chrome may give Google more power over the whole ecosystem, the CMA’s report says.
“Those proposals will also turn Chrome (or Chromium browsers) into the key bottleneck for ad tech,” the report says. “It is likely, therefore, that Google’s position at the centre of the ad tech ecosystem will remain.”
The CMA report also states that online publishers, such as news websites that rely on advertising, could see short-term revenue from ads decrease by 70 per cent – although at least one publisher has had success with ditching cookies. Bindra says Google is working with the CMA and ICO on their investigation and that removing third-party cookies will have an impact on Google too.
“We use third-party cookies for ads we serve on sites, and our Google advertising products will be impacted just as other ad technologies will be impacted,” she says. “While this does take a toll on the funds that content creators and web developers depend on, we really do believe that a lot of these technologies are going to be able to support publishers and advertisers.”
But third-party cookies aren’t the only way ads are served on the web – and that’s where the rest of Privacy Sandbox comes in. When third-party cookies are removed companies that collect first-party data might be able to better target adverts. For instance, if you’re logged into a news website that site will be able to collect data about what you read and understand your interests. That means it can show adverts that may be more relevant to you as an individual – the more relevant an ad, the more money it can make.
Sounds good, right? Well, it is for the two companies that dominate first-party data collection across the web: Facebook and Google. Both companies have powerful tools to collect user information, both through their own services and the software they provide to others. More than nine Google products – from Gmail to Google Maps – are used by more than a billion people each month. Facebook’s tracking tech is on more than eight million websites.
“Google would still be able to use the insights it obtains from users’ activities on Google Search and YouTube to select personalised ads on Google’s properties,” the CMA’s review says. Those responding to the CMA’s review told it that putting an end to third-party cookies would “further entrench” Google and Facebook’s ad technology.
Ultimately, if the web moves to a system where first-party data becomes the main way of serving adverts then the biggest tech platforms could benefit the most.
“It could be that Google’s ad tech division is at equality with other ad tech companies,” says Paul Bannister, co-founder of ad management firm CafeMedia. “The problem is that [blocking third-party cookies] widens the gap between walled gardens and what they can do versus the open web.”
It’s likely that eliminating third-party cookies will push advertisers to rely on logins and user accounts to collect their own first-party data, or to rely on Google and Facebook to collect that data for them.
Bannister argues that such changes will likely mean that more advertising money is spent on platforms such as Facebook, TikTok and YouTube, where targeting within a closed ecosystem will be easier.
“It has centralised control of the data with a smaller and smaller group of very large companies,” says Bannister. “And they are far more likely to misuse the data and harm people in the process.”