-A Survey
Xiangqian Lee
Websoft

Natural language to search the huge social graph.
Users, pages, photos, places, posts, games, ...: all are nodes.
Relationships can be: like, friend of, follow, check in, ...

Querying by natural language is simple and natural.
Keyword-based search and form filling are not as good.
Natural language queries can work well in a domain-specific area.

Two main challenges (by Lars Rasmussen):
◦ Parse natural language into a structured query.
◦ Highly scalable indexes of nodes and relationships that support frequent, large-scale updates and searches.

Natural Language Query:
◦ friend in nanjing
Semantic Language: Intersect(friend(me), residents(115073811842312))
S-expression Language: (and friend:232343, residents:115073811842312)
Unicorn

Grammar: Weighted Context Free Grammar (WCFG)
N-gram

Detect all possible query segments that refer to an entity or a relation. For each:
◦ Find possible categories with a probability.
Use Facebook Typeahead to resolve the entities behind the query segments with high confidence.

One intent can be expressed in various ways.
◦ “photos of my friends”
◦ “friend photos”
◦ “photos with my friends”
◦ “pictures of my friends”
◦ “photos of facebook friends”
Queries may not be grammatically correct.
Synonyms.
Unimportant terms in the trees.

Find all terminal rules that match the query.
Search:
◦ Generate candidate semantic trees -> semantic language.
◦ Each tree is generated from a subset of terminal rules whose matching tokens form a sequence of consecutive, non-overlapping segments covering the whole range of the query.
◦ Output a top-k list of semantic trees -> natural language query suggestions.
◦ Adopt semantic scoring to prevent similar or semantically incorrect suggestions.
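The search step above (match terminal rules, cover the whole query with consecutive, non-overlapping matches, keep the top k) can be sketched roughly as follows. This is a minimal illustration, not Facebook's implementation: the rule table `RULES`, its symbols and weights, and the function names are all hypothetical, and a flat product of rule weights stands in for the full WCFG tree scoring.

```python
from heapq import nlargest

# Hypothetical terminal rules: (token phrase, semantic symbol, weight).
# All entries and weights are invented for illustration.
RULES = [
    (("friend",), "friend(me)", 0.9),
    (("in",), "residents-of", 0.7),
    (("in",), "visitors-of", 0.2),
    (("nanjing",), "city:nanjing", 0.8),
    (("friend", "in"), "friends-in", 0.1),
]

def match_terminals(tokens):
    """Return (start, end, symbol, weight) for every rule matching tokens[start:end]."""
    matches = []
    for phrase, symbol, weight in RULES:
        n = len(phrase)
        for i in range(len(tokens) - n + 1):
            if tuple(tokens[i:i + n]) == phrase:
                matches.append((i, i + n, symbol, weight))
    return matches

def covers(tokens, matches):
    """Yield (score, symbols) for each sequence of consecutive,
    non-overlapping matches that covers the whole query."""
    by_start = {}
    for m in matches:
        by_start.setdefault(m[0], []).append(m)

    def extend(pos, score, symbols):
        if pos == len(tokens):
            yield score, symbols
            return
        for _, end, symbol, weight in by_start.get(pos, []):
            yield from extend(end, score * weight, symbols + [symbol])

    yield from extend(0, 1.0, [])

def top_k(query, k=3):
    """Return the k highest-scoring covers of the query."""
    tokens = query.lower().split()
    return nlargest(k, covers(tokens, match_terminals(tokens)), key=lambda c: c[0])

if __name__ == "__main__":
    for score, symbols in top_k("friend in nanjing"):
        print(round(score, 3), symbols)
```

A query containing a token that no rule matches yields no cover at all, which is one reason the slides mention handling synonyms and unimportant terms.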
NL query: friend in nanjing
URL: https://www.facebook.com/search/me/friends/115073811842312/residents/present/intersect
Intersect(friend(me), present(residents(115073811842312)))

Unicorn
◦ Inverted index framework: nodes and relations
◦ In-memory
◦ Input: an S-expression query language
◦ Uses Hive, Hadoop & HBase to update indexes.
◦ Update scale: 1 billion people, 240 billion photos, 1 trillion connections, thousands of types of connections, per month.
◦ A series of query optimizations is adopted.

My friends who live in Beijing, China and like Friends (TV show)
Intersect(friend(me), residents(12345), like(67890))
(and friend:13579, live-in:12345, like:67890)
[Diagram: nodes labeled Friends, me, Beijing]

(apply R: A): apply the binary relation R to the set A.
[Diagram: nodes labeled Columbia University, Google, Me, Goldman, New York]

For hundreds or thousands of results:
◦ Because the query is assumed to be very close to your intention, query relevance is less important.
◦ Rank by relevance to your social network.
◦ Example: finding restaurants. Restaurants liked by more people who are closely related to you are ranked higher.
◦ More rules are adopted.

Facebook Engineering blog:
◦ Under the Hood: Building out the infrastructure for Graph Search (https://www.facebook.com/notes/facebook-engineering/under-the-hood-building-out-the-infrastructure-for-graph-search/10151347573598920)
◦ Under the Hood: Indexing and ranking in Graph Search (https://www.facebook.com/notes/facebook-engineering/under-the-hood-indexing-and-ranking-in-graph-search/10151361720763920)
◦ Under the Hood: The natural language interface of Graph Search (https://www.facebook.com/notes/facebook-engineering/under-the-hood-the-natural-language-interface-of-graph-search/10151432733048920)
Reddit post:
◦ Ask Me Anything (AMA) post on Reddit by Lars Rasmussen (http://www.reddit.com/r/IAmA/comments/18jb6d/i_am_the_pointyhaired_engineering_director_for/)
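To make the Unicorn operators from the slides concrete, here is a small sketch of how `(and ...)` and `(apply R: A)` might be evaluated against an in-memory inverted index. It is only an illustration under stated assumptions: the `INDEX` contents, the node ids in the posting lists, and the function names `term`, `and_query` and `apply_query` are hypothetical, and real posting lists are sorted and sharded rather than held as Python sets.

```python
# Toy inverted index: "edge-type:id" -> set of node ids reachable over that edge.
# Keys follow the slide's example; the posting-list contents are invented.
INDEX = {
    "friend:13579": {1, 2, 3, 4},
    "live-in:12345": {2, 3, 5},
    "like:67890": {3, 4, 5},
    "friend:2": {7, 8},
    "friend:3": {8, 9},
}

def term(key):
    """Look up one posting list; unknown terms give an empty result."""
    return INDEX.get(key, set())

def and_query(*postings):
    """(and t1, t2, ...): nodes present in every posting list."""
    if not postings:
        return set()
    return set.intersection(*map(set, postings))

def apply_query(prefix, ids):
    """(apply R: A): apply the relation named by `prefix` to every id in A
    and merge the resulting posting lists."""
    result = set()
    for node_id in ids:
        result |= term(f"{prefix}{node_id}")
    return result

if __name__ == "__main__":
    # (and friend:13579, live-in:12345, like:67890)
    print(and_query(term("friend:13579"), term("live-in:12345"), term("like:67890")))
    # (apply friend: A) with A = {2, 3}, i.e. friends of nodes 2 and 3
    print(apply_query("friend:", {2, 3}))
```

With the toy data above, the `and` query keeps only node 3, and the `apply` query merges the friend lists of nodes 2 and 3.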