WeChat Social to Intelligent Connection
Qiang Yang (杨强)

What is WeChat? A New Lifestyle
• Mobile App for 500M+
• Multimedia
• A Way to Connect: A Platform

Rapid Growth
• 1 Billion Accounts, 550M Active Users
• 8M Service Accounts
• 20 Languages, 200 Countries
[Chart: users, unit 100M. Overseas users by Aug 2013: 1.0; active users per month by Sep 2014: 4.68; a further bar is marked >10.]

WeChat Growth Path (Unit: 100M)
[Chart: user count against feature launches. Feature labels: Birth, Speech, Nearby, Shake It!, Scan, Service Platform, Video Chat, Payment, Online Shopping, Red Packet, Video Clips, Moment Ads. User counts: 1 on 2012-3-29, 2 on 2012-9-17, 3 on 2013-1-15, ~6 on 2014-10-1.]

3 Milestones
• 433 days: from 0 to 100M (2012-3-29)
• 6 months later: doubled to 200M (2012-9-17)
• 3 months later: 300M (2013-1-15)
Feature timeline, newest first:
• 2015: Ads
• 2014: Short Videos, Cards and Coupons, Red Packets, Payment
• 2012 to 2013: Games, Sticker Store, Video Chat, Moments, Official Accounts Platform
• 2011: Scan, Shake, Drift Bottle, People Nearby
• 2010: Voice Messages, WeChat is born

Collective User Experience: Shaking for Red Packets in the 2015 Spring Festival
• Interaction / Comments / Voting / Lottery / Share with Friends
• Red packet data: over 1.01 billion times, peak at 550,000 per minute
• Shaking: 11 billion times, peak at 810 million per minute

BigData @ WeChat
WeChat Big Data
• Relational data: contacts, subscribed public accounts, group
• Non-relational data: behavior, text, explicit information, posts, images, user feedback, comments, videos

Relationship Types (%)
Friends in Real Life     90
Classmates               81.4
Family or Relatives      75.7
Colleagues               70.8
Teachers or Superiors    50.4
Online Friends           32.1
Strangers                24.7

Artificial Intelligence @ WeChat
Topics: Image Understanding, Lifelong Learning Agent, Spammer and Rumor Detection, Location-based Social Networks, NLP and Text, Recommendation, Feature Engineering, Provenance of Information, Speech Understanding, Social Search, Trust Assessment, User Modeling, Crowd Intelligence, Sentiment Analysis, Event Detection, Network Analysis

User Modeling & Transfer Learning

Case Study: Ads in WeChat Moments
• Presented in Moments / Targeting Based on Big Data / Interactive Communication
• Spreading Between Friends

User Modeling for Ads
• Data Sources
  - Demographic information
  - Articles read
  - Public accounts subscribed
• Techniques:
  - DNN
  - Multi-task Learning
[Diagram: Source 1, Source 2 and Source 3 each feed topic modelling, which yields tag 1 to tag n. Other labels: Seed users, Advertise, Co-training.]

Image Understanding for Ads

Cross Domain Transfer Learning
• Predicting User Feedback from Social Data
• Source domain: BMW advertisement with user feedback
• Target domain: SOHO advertising with no previous data

Crowd Intelligence @ WeChat
[The 2010 to 2015 feature timeline is shown again, with a Crowd Intelligence label next to Red Packets.]

Grow with Users
• Games
• Red Packet
• Moments Ads
• Shaking TV Program

Charity by the Millions: Voice Donor

Crowd Intelligence for the Visually Impaired
• Collect Voices
• Filter by Standard Models

Speech Recognition
• Large Mandarin Corpus: DNN (deep neural network)
• Language model:
  - N-gram, DNN
  - Low-rank matrix
  - GPU training
• Decoder:
  - WFST framework
  - Large, parallel search space

Audio Fingerprinting
• Challenge
  - noisy environments,
  - compactness of fingerprint, and
  - service scalability when song database is huge (10M)
• Application: WeChat “Shake” Music, launched in Jan 2013
• Big Music Database (10M songs), Fast Recognition (3-5 seconds)
• Daily Page View > 8M, User View > 3M

Audio Fingerprinting: WeChat Live TV Recognition
Recognize live TV programs from audio fingerprinting
• Challenge: High concurrent throughput
• SHAKE-TV:
  - Can recognize > 500 TV channels across China
  - User View: 1M simultaneously
• Rich Cross-TV Screen User Experience
• Fully integrated with social networks

Image Understanding @ WeChat

Mariana CNN on GPUs
Mariana CNN is Tencent’s deep convolutional neural network, based on single-machine, multi-GPU computation.
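A minimal sketch of data parallelism on one machine, for illustration only and not Mariana's code. It assumes a toy one-parameter least-squares model, and plain Python lists stand in for GPUs; the names gradient, split_batch and data_parallel_step are hypothetical. Each simulated GPU computes the gradient on its own equal-sized shard of the batch, and the average of those gradients is used for a single update:

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y) pair for the toy model y ~ w * x


def gradient(w: float, batch: List[Sample]) -> float:
    """Mean gradient of 0.5 * (w * x - y) ** 2 with respect to w."""
    return sum((w * x - y) * x for x, y in batch) / len(batch)


def split_batch(batch: List[Sample], n_workers: int) -> List[List[Sample]]:
    """Split a batch into equal shards, one per simulated GPU."""
    if len(batch) % n_workers != 0:
        raise ValueError("batch size must be divisible by n_workers")
    size = len(batch) // n_workers
    return [batch[i * size:(i + 1) * size] for i in range(n_workers)]


def data_parallel_step(w: float, batch: List[Sample],
                       n_workers: int, lr: float) -> float:
    """One SGD step: per-shard gradients are averaged, then applied once."""
    shards = split_batch(batch, n_workers)
    # On real hardware each shard's gradient would be computed concurrently.
    local_grads = [gradient(w, shard) for shard in shards]
    avg_grad = sum(local_grads) / n_workers
    return w - lr * avg_grad


batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w_two_gpus = data_parallel_step(0.0, batch, n_workers=2, lr=0.1)
w_four_gpus = data_parallel_step(0.0, batch, n_workers=4, lr=0.1)
print(w_two_gpus, w_four_gpus)
```

With equal shards the averaged gradient equals the full-batch gradient, so the update does not depend on the number of workers. Any gain comes from computing shards concurrently, and real multi-GPU training also pays for exchanging gradients, which is one reason measured speed-ups stay below the GPU count.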
• Data parallelism and model parallelism
• Partition models for parallel execution
• Major improvement in model scalability and performance
[Figure: GPU0, GPU1, GPU2, GPU3]

Configuration                       Speed-up
2 GPUs, Model Parallelism           1.71
2 GPUs, Data Parallelism            1.85
4 GPUs, Model + Data Parallelism    2.52
4 GPUs, Data Parallelism            2.67

Mobile Visual Search
• Scan for information or services
• Local and Global Image Feature Descriptor
• Highly Efficient Feature Indexing and Matching
• Mobile video and image quality assessment
• Challenges:
  - Variable lighting, non-planar recognition
• WeChat “Scan” on covers, launched in 2013
• Large image databases (~10M)
• Open interfaces for developers

Mobile Image Tech.
• OCR on a mobile device
• Camera OCR based language translation
• Certificate and ID OCR on mobile or cloud
• Face Technology:
  - Detection, Alignment, Tracking, Recognition/Verification
• User modeling based on images
• Targeted Adverts

Augmented Reality
3D animation with embedded video on designs
• Rich interaction with users
Challenges:
• Real-time and precise target detection/tracking, model rendering
Applications:
• Launched in WeChat movie ticket app “微票” (Jan 2015)

Natural Language Understanding @ WeChat

WeChat NLP
WeChat: Closed-loop NLP
• Closed-loop Feedback in WeChat Services
• Always online: real-time message platform
• Massive user base: 549 million monthly active users
• Payment
[Diagram: User Intention, payment, …]

WeChat NLP: Word Multi-Embedding
• Learn Embedded Word Representation

WeChat NLP: Semantic Matching of Questions and Answers
• Dependency-Tree RNN model
• Semantic Match:
  - Dependency-Tree RNN
  - Word multi-embedding match
  - BM25
  - Other Features: sentence type recognition, synonym, antonym, parsing
• WARP loss
• Semantic match
• Semantic answer ranking
[Figure: Dependency-Tree RNN model. Inputs x and hidden states h for the query, doc1 and doc2 produce outputs that are scored as R(query, doc1) and R(query, doc2) under the WARP loss.]

NLP & Question Answering
[Pipeline diagram, labels as extracted: Text/Voice; Speech Recognition; NLU; Query Analysis; intent identification; query rewrite; Query Inference]
[NLP & Question Answering pipeline diagram, further labels as extracted: Sentiment Analysis; NER/NED; Semantic Parsing; Pattern Self-learning system; Graph Search (RDF based; inverted index based); Semantic Search; Data management; Semantic Match; WeChat knowledge graph; User Profile management; Dialog Context management; Log/Session management; User behavior; Text Search (inverted index); TTS; Answer; NLG (pattern based; parsing and LM based); Recommendation; ReRanking; NLP Dialog]

WeChat Future
Connecting People, Services and Things

Connect Everything
[Diagram: People, Things, Service]
• A New Connection Model: people, things and services
• Becomes a New Lifestyle: will extend the connection to daily life, commerce & entertainment
• Provide Mobile Internet Service to Industries: an ecosystem for connection, a new solution provider

Internet + Intelligent Business Solutions
• Integrate Internet and other businesses
• Smart Cities
• Improved User Experience
• Connecting everything

Eight Principles
1. Bring New Value
2. Remove geographical restrictions
3. Remove middle men
4. Distributed
5. Ecosystem
6. Evolutionary Service Platform
7. Social Centric
8. Users’ Interests Always #1

THANK YOU