WhatsApp security and role of metadata in

WhatsAppsecurityandroleofmetadatainpreservingprivacy
NidhiRastogi,JamesHendler
RensselaerPolytechnicInstitute,Troy,NY,USA
[email protected]
[email protected]
Abstract:WhatsAppmessengerisarguablythemostpopularmobileappavailableonallsmart-phones.Overone
billionpeopleworldwideforfreemessaging,calling,andmediasharinguseit.InApril2016,WhatsAppswitchedto
adefaultend-to-endencryptedservice.Thismeansthatallmessages(SMS),phonecalls,videos,audios,andany
otherformofinformationexchangedcannotbereadbyanyunauthorizedentitysinceWhatsAppversion2.16.2
(released April 2016). In this paper we analyze the WhatsApp messaging platform and critique its security
architecturealongwithafocusonitsprivacypreservationmechanisms.WereportthattheSignalProtocol,which
formsthebasisofWhatsAppend-to-endencryption,doesofferprotectionagainstforwardsecrecy,andMITMtoa
large extent. Finally, we argue that simply encrypting the end-to-end channel cannot preserve privacy. The
metadatacanrevealjustenoughinformationtoshowconnectionsbetweenpeople,theirpatterns,andpersonal
information.
ThispaperelaboratesonthesecurityarchitectureofWhatsAppandperformsananalysisonthevariousprotocols
used. This enlightens us on the status quo of the app security and what further measures can be used to fill
existinggapswithoutcompromisingtheusability.Westartbydescribingthefollowing(i)importantconceptsthat
needtobeunderstoodtoproperlyunderstandsecurity,(ii)thesecurityarchitecture,(iii)securityevaluation,(iv)
followedbyasummaryofourwork.Someoftheimportantconceptsthatwecoverinthispaperbeforeevaluating
the architecture are - end-to-end encryption (E2EE), signal protocol, and curve25519. The description of the
security architecture covers key management, end-to-end encryption in WhatsApp, Authentication Mechanism,
MessageExchange,andfinallythesecurityevaluation.Wethencoverimportanceofmetadataandroleitplaysin
conservingprivacywithrespecttowhatsapp.
Keywords:WhatsApp,privacy,security,Facebook,signalprotocol,curve25519
1. Introduction
WhatsAppmessengerwasstartedbytwoex-Yahooemployees(BusinessInsider2015)andwassoldtoFacebookin
2014(WhatsApp Blog – Facebook 2016) but remained operationally independent. Since then, the user base has
increasedtremendouslyandoverabillionusersperdaynowusetheapp.AsofJanuary2016,theaveragenumber
of daily messages exchanged over WhatsApp is reported to be an astounding 34 billion (The Verge 2014).
WhatsApp has been able to attract this unprecedented success because of its availability on all popular mobile
operating systems, and is free of cost (or costs a nominal $0.99 per year). Free calls, unlimited messages, and
mediaexchange,alongwithaneasytooperateinterfacemakeitfavorablefornoviceusersaswell.
However,asfarassecurityisconcerned,WhatsApphascomeunderfireseveraltimesinthepast.Thenegligence
shown towards making the application secure made it an easy target for attackers. For example, in 2011, a
problem was found in the app verification process proving that the authentication mechanism was unsecure
(Schrittwieser et. al 2012). Researchers were able to exploit valid usage session by successfully hijacking several
user accounts (called session hijacking). This allowed unauthorized access where an attacker could spoof the
sender identification, thus receiving messages targeted to the victim. A packet sniffer could then intercept the
trafficandlogallcommunicationdetails.Alllaterattemptswereeitherahalf-bakedattempttoencryptmessages
or were broken at launch. This lax approach continued and by the time it was may 2012, WhatsApp was still
sendingmessagesinplaintext,whichmeanstherewasnoencryptionforanykindofcommunication.
In the wake of increasing privacy concerns and the war between Apple and FBI over encryption of phone data,
WhatsApp has switched to end-to-end encryption. This has enabled the messenger app user to send all
communicationencrypted.Itisnomoreeasyforanunauthorizedpersontoreadtextmessages,videos,audios,or
filesbysurreptitiouslylisteningtothecommunicationasdataisnomoresendinplaintext.
This paper elaborates on the security architecture of WhatsApp and analyzes the various protocols used. We
perform an extensive literature study from several online resources on Whatsapp and related concepts and use
thattounderstandtheworkingoftheapplicationanditssecurityprotocols.Also,whilewhatsappisapopularapp
forthemobileplatform,itscomputerversioncanbeaccessedviaawebbrowserorbyinstallinganappforthe
windows or mac OS platform. Since a phone number is required as the primary identification of a user, the QR
codeneedstobescannedtoauthorizethecomputer(WhatsAppFAQ–WhatsAppWeb).
We also take a closer look at the app security and what further measures can make it stronger without
compromising usability. In the following sections, we cover some important security concepts applicable to
WhatsApp, understand and evaluate the security architecture, measures taken to ensure user privacy, make
recommendationsonimprovements,andfinallyendwithasummaryofourwork.
2.SecurityFundamentals
2.1End-to-EndEncryption(E2EE)
Thisisasystemofcommunication,whichallowsonlythecommunicatingpartiestoaccessthemessagesbecause
the medium is encrypted. In theory, no eavesdropper can access the cryptographic keys needed to decrypt the
conversation. This includes service providers like cellular companies, ISPs, and app developers. Theoretically, an
adversarycannotaccessthetransmitteddataevenafterthetraffichasbeenintercepted.Thisispossiblebecause
of the various properties of the encryption protocols used for making the end-to-end communication encrypted
and inaccessible for an unauthorized user. In the figure below, the communication channel between the two
phonesorcomputersisencrypted.
Figure1-E2Eencryptionbetweentwosmartphones.
2.2SignalProtocol
Signal Protocol (previously Axolotl) enables end-to-end encryption in WhatsApp. It is used to encrypt both text
messagesandvoicecallsbyusinganasynchronousmethodunderasharedkey.Theprotocolwaschosenasitcan
provideplausibledeniabilityandforward-secretasynchronouscommunications,amongotherfeatures,onmobile
devices.(Praetorian2015)
2.3Plausibledeniability
Bydeniabilityorrepudiation,itmeansthatamessagereceivercanbesurewherethemessageoriginatedfrombut
cannotprovetheidentityofthesender.Inessence,thesendercandenybeingthepersonwhooriginallysentthe
message (Open Whisper Systems 2013). Signal protocol uses a compact derivative of the Off-the-Record (OTR)
protocol to enable this feature. Before we get into any further details, let’s first understand the working of the
signalprotocol.
Each member participant in a WhatsApp conversation has a long-term identity key that they use to sign an
ephemeral key. This ephemeral key is exchanged among members to calculate a shared secret, typically using
Diffie–Hellman (D-H) key exchange method. D-H allows the participants to jointly establish a shared secret key,
whichcanthenbeusedtoencryptsubsequentcommunications.
The shared secret from this key exchange is used to derive three keys for each participant - a sending and a
receivingcipherkey,andasetofMACkeys.TheseMACkeysconfirmmessageauthenticityandintegrity,andare
includedineverytransmittedmessage.NoticeherethattheMACkeysaresubsequentlyderivedfromtheoriginal
sharedkeyensuringthatthemessagewasindeedsentbytheclaimedsender.Atthesametime,boththeparties
areinvolvedingeneratingthesharedkeyaswellasthesubsequentMACkeys(alsocalledephemeralkeys).While
thiskeepsthemessageintegrityintact,theauthenticityofsendingthemessagethattheyoriginatedcanbedenied
later.Thisisbecauseofthesharedkey,whichmakesthereceivercapableofproducingasender’sMACkey.
2.4Forwardsecrecy
If the encryption keys from a user’s smartphoneor computer somehow get compromised, a fresh key for every
newmessageisissued.Thispreventsanadversaryfromnotonlyderivingtheephemeralkeysbutalsofromusing
ittodecryptanymessagetransmittedinthepast.
SignalProtocolusesthefollowingtypesofkeys:
1. Identity key pair, a long-term Curve25519 key pair generated at install time for all asymmetric
cryptographicoperations.
2. SignedpreKey,amediumtermCurve25519keypair.
3. PreKeys,alsoCurve25519keysbutforone-timeuse.Theseareusedtoactuallyencryptthemessage.
SignalProtocolusesacompactderivativeofOTRwhereitusesD-Hexchangeineachkeygenerationstepabove,
whichcontinuallyratchetsthekeymaterialforward.Thisistheunderlyingprinciplebehindforwardsecrecyasthe
keys that finally encrypt the message are ephemeral. Recording the encrypted traffic cannot divulge the key
materialordecryptpreviousmessages.Evenifadeviceisphysicallycompromised,nokeysatanygiventimeare
storedonthedevicethatcanhelpanadversarydecryptpreviouslyexchangedciphertext.Notethatthispropertyis
verydifferentfromthetraditionalwaysofencryptingdatainmotionoratrest.Inthesecases,thesamekeyora
periodically changed key (which is usually a slow process) is used to encrypt data. This makes it extremely
important to store the key at a secure location, lest all the recorded messages ever exchanged, and sometimes
with all different parties, may get into the hands of the adversary. By contrast, the key exchange mechanism in
signalprotocolisephemeral.Hence,ifakeyisevercompromisedinthefuture,allrecordedciphertextwillremain
private.
Thereareotheradvantagesforchoosingsignalprotocol.Itisamobile-friendlyend-to-end(e2e)protocol,which
candecreasethesizeofpacketsbyusingprotobufs.Protobuf,orprotocolbuffer,isasmalllogicalrecordof
information,containingaseriesofname-valuepairthatofferanautomatedmechanismforserializingstructured
data.ItworkssimilartoXMLbutdiffersbybeingfaster,smaller,andsimpler.
2.5Curve25519
Elliptic-curvebasedcryptographic(ECC)systemsarepublic-keycryptosystemsthatrelyontheinabilityto
determinenfromY=nX,whereXandYarepubliclyknownbasepoints.
Curve25519helpscomputethepublicpartofthisequation,whichis128-bitsinlength.
Curve25519isalsoanECCcurve,whichisavariantoftheDiffie-Hellmanprotocol.Forthisreason,Curve25519can
besuccessfullyimplementedwiththeellipticcurveDiffie–Hellman(ECDH)keyagreementscheme.Thisproperty
enablesCurve25519tocomputesharedkeysthatcanbeexchangedoverunencryptedchannelsaswell.As
mentionedinearliersections,eachmemberonWhatsApphasalong-termidentitykeythatisusedtocalculate
thissharedsecret.
Curve25519introducedbyDanielJ.Bernstein(Bernstein2006),computesveryfastintermsofkeycompression,
keyvalidation,andtiming-attackprotectionamongothers.Thismakesthecurveapracticalchoiceforlarge-scale
implementation,asisthecasewithWhatsApp.
3.SecurityArchitecture
Nowthatweunderstandsomekeyconcepts,wecanturnourattentiontotheimplementationandkeyexchanges
thattakeplaceinWhatsApp.Seefigure2formoredetails.
3.1KeyManagement
3.1.1.End-to-EndEncryptionworkinginWhatsApp
EachWhatsAppuserpossessesalong-termkeythatisstoredonthedevicememory,notreadilyaccessibletothe
user.ThiskeyisusedtocreateanothersharedkeyusingwhichaWhatsAppusercansecurelycommunicatewith
anotheruse.Asecurecommunicationchannelisestablishedbetweenthetwo,anditremainsintactuntilevents
suchasappreinstall,devicechange,etc.Thefollowingstepsdescribekeymanagementintheflowdiagramshown
infigure2.Theinitiatingclientiscalledinitiator,andtherequestingclientiscalledtherecipient.
Figure2:Flowdiagramofwhatsappend-to-endencryption
1. Theinitiatorrequeststhepublicidentitykey,publicsignedprekey,andasinglepublicone-timeprekey
for the recipient. The identity key, called Irecipient is a long-term curve22519 key pair. The signed pre
key,calledSrecipientisamedium-termcurve22519keypair,andsignedbyIrecipient.Theone-timeprekey, called Orecipient is a list of curve22519 key pairs mainly for one time use. All these keys are
generatedduringinstallation,reinstallation,orchangeofdevice.
2. TheWhatsAppserverreturnstherequestedpublickeyvaluestotheinitiatingclient.Theone-timeprekey
isephemeral,andremainsontheserveronlyuntilrequested.
3.
4.
5.
6.
7.
Theinitiatorsavesthekeysrequestedinstep1andgeneratesanephemeralcurve25519keypair,called
Einitiator,andloadsitsownidentitykey,calledIinitiator.
Using these keys generated and requested in the above step the initiator can now calculate a shared
secretwiththerecipient-
MasterKey-ECDH(Iinitiator,Srecipient)||ECDH(Einitiator,Irecipient)||ECDH(Einitiator,Srecipient)||
ECDH(Einitiator,Orecipient)
Thismasterkeyisusedtocreatesubsequentsessionkeysbetweenthetwoparties.AHashedMessage
AuthenticationCode(HMAC)-basedkeyderivationfunction(HKDF)derivestherootkey,andchainkeys
fromthemasterkey.Ittakesmasterkeyastheinputkeyingmaterialandextractsfromitafixed-length
pseudorandomkey.Thiskeyexpandsintoseveraladditionalpseudorandomkeys,resultingintheroot
andchainkeys,bothwith32-bytevalue.
The server contacts the recipient using the member id lookup and sends session information with the
initiator-EinitiatorandIinitiator.
Usingthissessioninformation,therecipientcalculatesatitsendsharedsecret,whichisthemasterkey
and confirms integrity of the message and has been sent unaltered by an authorised person. Recipient
alsodeletestheephemeral,one-timepre-key,Orecipient.
Using the chain keys generated in previous steps, parties involved in the conversation generate a
message key of 80-byte value. This encrypts each message and ratchets forward the chain key used to
derive the message key, every time a message is sent in a given session. This works by increasing a
counterthatispartofafunctionderivingthechainkey.Thisisakeystepinprovidingforwardsecrecy,as
thechainkeyisnomoreofuseformessagessentearlierandhencecannotbeusedtodecryptthem.With
the chain key changing with every message, the message key also changes having a similar effect on
forwardmessageencryption.
Messagekey=HMAC-SHA256(chainkey,0x01)
Chainkey=HMAC-SHA256(chainkey,0x02)
WhatsAppalsousesQRcodeverificationmethodforout-of-banduserverification.TheQRcodecontains,among
otherthings,a32-byteIrecipientandIinitiator-whicharethepublicidentitykeysforbothusers.Anotherwayto
getasimilarexperienceisbycomparinga60-digitnumber.
4.SecurityandPrivacyEvaluation
Signal Protocol drastically reduces the possibility of having a man-in-the-middle attack. This is primarily because
OTR is based on a mechanism where it uses D-H exchange in each key generation step mentioned earlier. This
continuallyratchetsthekeymaterialforward.Foranactiveadversarywhohasmanagedtodecryptthechannel,
theintegrityoftheencryptionkeyscantobetracedallthewaybacktotheoriginalsharedkey,whichrequiresa
fair amount of time and key tracking. One can be assured that no MITM attack is possible on any of the
subsequentlygeneratedkeys.
However, a major security concern is worth mentioning here. While WhatsApp messages are secure in transit,
mostoftheendpointdevices–suchassmartphones,tablets,andcomputers–donotencryptthedataresidingon
theminthesamewaythatAppledoeswithitsmostrecentiPhone.WhatsAppofferstobackupmessageslikelyon
a cloud server. Some of the options given are Google drive, Apple iCloud, etc. We do not have any information
aboutmessageencryptiononthecloudplatformyet,unlessWhatsAppdecidestosharethesedetailssoon.
Also, WhatsApp does not offer encryption of past communication at app level, which can expose the user
messagesincaseofdevicetheft.
4.1Privacyimplicationofplausibledeniability
Theprevalenceofglobalsurveillancehascausedmuchconcerntomanyusers.Someoftheconcernshavebeen
relatedtoathirdpartylisteningtouserconversations,withoutpermission.Anotheroneisbeingheldagainsta
messagetheysentinthepastinthecourtoflaw.
Signalprotocolwasdesignedkeepingsuchprivacyconcernsinmind,amongothersecurityissues.Forthispurpose,
itensuresthatthemessagesenderorreceivercannotbeirrefutablytiedtoaparticularmessagesentinthepast
byusingthevariousratchetforwardencryptiontechniques.
Intheprivacydomain,therehavebeenconcernsrelatedtousermetadataaswell.WhatsAppencryptsthe
communicationchannelbetweenusersusingend-to-endencryption.Themetadataoftheuserisencryptedaswell
whendataisinmotiononthecommunicationchannelbetweenvariousparties.Itisessentialtounderstandthat
informationstoredinmetadataisjustasimportantinpreservingprivacyoftheusers,asisthedataitself.The
company’slegaltermsallowthemtostoreinformationassociatedwithsuccessfullydeliveredmessagessuchas
timeofdelivery,mobilephonenumbersinvolvedinthemessages,sizeofanydigitalcontentswappedbetween
thetwoparties(Bernstein2006).Also,theapppersiststheusertoshareone'sentirecontactlistwiththeapp.This
isawaytofurthergatherinformationaboutwhoisinaparticularsocialnetworkofauser.Itisliketradingthe
convenienceofhavingtheapptofigureoutwhousesitamongstone’scontactsforgivinguptheentirelistof
whichonecontactsregularly,includingthosewhodon'tusetheapp.Thereisstillnooptionofselectivelyadding
contactstotheWhatsApplist.Anyadditionofthisfeatureinthefuturewillnothelpexistingusersastheyhave
alreadysharedthisdetailwiththeapp.
Asmartphonemetadatareflectsawealthofdetailsbothatthelevelofindividualcallsandwhenanalyzedin
aggregate.Computerscientistsandresearchershaveprovedthisanumberoftimesinthepast.Itisherewhere
WhatsAppfalters.Whilethemetadataisencryptedduringtransit,phonenumbers,timestamps,connection
duration,connectionfrequency,aswellasuserlocationarebeingstoredonthecompany’sservers.Thismetadata
issufficienttocreateaprofileanddrawsomestronginferencesbetweenthecommunicatingparties.Andaswe’ve
seenveryoften,bothgovernmentsandhackerscangettheirhandsonthemetadataiftheyreallygoafterit.
WhatadvantagewouldFacebook,theparentcompanyhasinadditiontothemetadatarelatedinformationcoming
viaWhatsApp?WhatsApphadvowedthatitwouldnotbesellingadvertisements.However,thereisnocondition
thatcanstopitsparentcompanyfromdoingsobyusinginformationgatheredthroughthewhatsapp.In
combinationtoone'sactivitiesonFacebook,itcanpotentiallyhelpcreateamoreaccurateunderstandingofthe
userbehavior,andsocialinteractionstherebyservingasastrongmeasureofprofilingforsometargetedads.This
isnottrulyamajorconcernaslongastheuserseesadsthatmakesensetothem.Anychangeinthecontent
deliveryalgorithmcanleadtoaverydifferentuserexperience,whereinsomecasestheusermayoutrightstop
usingtheapp.
Forgroupchat,thecommunicationinitiatorsendsmessagetothewhatsappserver,whichinturndistributesitto
allthegroupmembers.ThisisaveryeasywayofforFacebooktolearnallaboutonessocialinteractionsand
communities.Alotcanbededucedbyperformingsomekindoftrafficanalysisjustbyusingthemetadatalikefrom
themessagevolumeexchanged.
InAugust2016,WhatsAppchangeditstermsofprivacywhereitstatedthatitplanstotransferuserdatatoits
parentcompany,Facebook.Ithadearlierpromisedthatthisdatawouldnotbedisclosedorusedformarketing
purposes.ButnowitwillshareuseraccountinformationwithFacebookandtheFacebookfamilyofcompanies,
likethephonenumbertheuserusedasaprimaryidentifier.ThecompaniesintendtouseWhatsAppaccount
informationtoshowusers"morerelevantadsonFacebook"andtosendusersmarketingmessagesviaWhatsApp.
Aphonenumberislikeadigitalsocialsecuritynumber(EPIC-WhatsApp).Itcanuniquelyidentifyapersonasthis
informationisprovidedeverytimewhenfillingupformsforvariouspurposes.Itcanalsoconnectvarioussources
ofdata,likehealthrecords,financialdata,andeducation,onlinepresence,etc.andcreateafullprofileofaperson.
Metadatacanalsoprovideenoughinformationabouttheuserwhoreliesontheplatformprovidertodeliver
content.Thiscontentcansometimesleadtoinfluencingtheiropinion,forexamplepoliticalopinions.Duringthe
USpresidentialcampaignstakingplacein2016,advertisements,videos,orpostsreachedouttoafairlywide
audience.ThecoverageprovidedbyFacebookisunparalleledincomparisontothecoverageprovidedbyanyother
platform.OnesthatfocustoomuchonacertainnegativeorpositiveaspectofrepublicancandidateDonaldTrump
ordemocratcandidate,HillaryClintoncanleadausertocreateabiasviewofthecandidateoveraperiodoftime.
5.Conclusion
WhatsApphasattractedalotofattentionbecauseofitslarge-scaleusageofend-to-endencryption,afirstofits
kind. The purpose it to re-instate users trust in using chat apps without worrying about privacy and security
concerns. In this paper, we went over the various fundamental of advanced cryptography protocols that enable
the various security and privacy properties of whatsapp. We discussed these features and how successful
whatsapphasbeenbydeployingthemintheirsecurityarchitecture.Wealsowentovertheprivacyconcernsthat
stillremainduetometadataremainingunencryptedandwithintheterritoryoftheappprovider.
References
Bernstein,D.J.,2006,April.Curve25519:newDiffie-Hellmanspeedrecords.InInternationalWorkshoponPublic
KeyCryptography(pp.207-228).SpringerBerlinHeidelberg.
Bernstein,D.J.andLange,T.,2007,December.Fasteradditionanddoublingonellipticcurves.InInternational
ConferenceontheTheoryandApplicationofCryptologyandInformationSecurity(pp.29-50).SpringerBerlin
Heidelberg.
BusinessInsider2015,LeavingYahoowasthebestdecisionthese11peopleevermade
,3August2015.Availablefrom:http://www.businessinsider.com/ex-yahoo-employee-startups-2015-7/
.[26May2016]
ElectronicPrivacyInformationCenter(EPIC)–WhatsApp.Availablefrom:
https://epic.org/privacy/internet/ftc/whatsapp/[29August2016]
GoogleDevelopers,ProtocolBuffers–DeveloperGuide,23August2016.Availablefrom:
https://developers.google.com/protocol-buffers/docs/overview#how-do-they-work.[23August2016]
OpenWhisperSystems,SimplifyingOTRdeniability,27July2013.Availablefrom:
https://whispersystems.org/blog/simplifying-otr-deniability/.[31May2016]
Praetorian2015,End-to-EndWhatsApp:AnOpinionatedSeriesonWhySignalProtocolisWell-Designed
,7April2015.Availablefrom:https://www.praetorian.com/blog/whatsapp-end-to-end-encryption-why-signalprotocol-is-well-designed.[13July2016]
Schrittwieser,S.,Frühwirt,P.,Kieseberg,P.,Leithner,M.,Mulazzani,M.,Huber,M.andWeippl,E.R.,2012,
February.GuessWho'sTextingYou?EvaluatingtheSecurityofSmartphoneMessagingApplications.InNDSS.
TheVerge2014,ConnectorDie,19February2014.Availablefrom:
http://www.theverge.com/2014/2/19/5428022/connect-or-die-why-facebook-needed-whatsapp
.[16August2016]
WhatsApp2016,WhatsAppBlog-Facebook,19February2016.Availablefrom:
https://blog.whatsapp.com/616/One-billion.[13July2016]
WhatsApp2016,WhatsAppLegal.Availablefrom:https://www.whatsapp.com/legal/.[13July2016]
WhatsApp2016,WhatsAppBlog–Onebillion,1February2016.Availablefrom:
https://blog.whatsapp.com/616/One-billion.[13July2016]
WhatsApp2016,FAQ–WhatsAppWeb.Availablefrom:https://www.whatsapp.com/faq/en/web/26000010.[15
November2016]