- Videocites Confidential Information -

Video-by-Video Search Engine Based on RediSearch
Presentation by Eitan Shapiro

Videocites
• Cloud-based SaaS platform for video-by-video search across all online videos
• Enables content owners to keep track of and better monetize their videos
• Novel, highly efficient, patent-protected fingerprinting technology
[Figure: a fingerprinting process reduces 1,000,000 TB of video to a 4 TB ultra-lightweight video fingerprints DB. Representation ratio = 1/250,000. 1 HD frame = 32 bits. Label: partial or completely identical videos.]

The Problem
Example: Katy Perry – Hot N Cold
• Official views in YT: 107M
• 39,997 replications found in YT
• Non-official views: 697M (!)
• LOST 697,000 Premium CPMs
Last seen in YouTube upload area
➢ No video fingerprinting technology implemented for digital rights management
➢ 72% of top rated videos on Facebook are freebooted from YouTube
➢ Massive boundless video proliferation

Our Solution
• We help content owners to manage all Video Duplications/Citations
• Regain control over video content
• We look everywhere (3rd party privilege) …
• We find more thanks to modality-indifference and outstanding accuracy
[Figure: examples of modified copies. Labels: Ratio - 4:3, Subtitles, Flipped, Res - 240P.]
1 Increased monetization and eyeballs
2 Video-based multi-platform true analytics

Basic Architecture
[Figure: architecture diagram. Labels: "Sampling video databases x100s faster than playback and extracting metadata"; "Creating an ultra-lightweight video fingerprint"; "Index and metadata Databases"; "Video sample"; "QUERY"; "RESULTS"; "User Dashboard" with the actions Monetize, Ignore, Block, Take down; "Technology is protected by a".]

Interactive Video Search System

From Video Search to a Textual-Like Search
• Each video frame is represented by an ultra-lightweight fingerprint of 32 bits
• Once a video is fingerprinted, it can be treated as a text document
• Each frame is represented by a hashcode, which is a Term in a document
• We build an inverted index from hashcodes to video identifiers
[Figure: inverted index table. Columns: Video Frame; Hashcode (for example 7C34BD18); Video IDs that contain a specific hashcode (for example x57722m, x57nelr, x57ap7l).]

Example of RediSearch Index
• Index holds around 2.6M fingerprinted videos
• Index size in memory is around 3.9GB
• Index uses only two fields
  ▪ h – for hashcodes
  ▪ publish_date – for date filter
• The high efficiency of the RediSearch index allows handling millions of videos on a single machine

RediSearch Query
• Query terms (hashcodes) are expanded to logical Synonyms
• As a result, a Boolean query can grow to hundreds of search terms

Why RediSearch is a Great Fit (1)
• RediSearch index sharding
  ▪ When scaling up, database sharding is required
  ▪ Sharding of video documents is based on the document identifier (Video ID)
  ▪ Redis sharding policy is based on keys, not values; in an inverted index, however, the keys are the Terms (hashcodes) and the values are the Video IDs
  ▪ Luckily, RediSearch naturally shards by Video ID
• RediSearch fast numeric range filter
  ▪ Allows us to filter on numeric metadata of the query video
  ▪ Significantly speeds up search

Why RediSearch is a Great Fit (2)
• A key feature of search engines is the ability to sort by relevance
• Relevance in video search is measured by hashcode differences
• RediSearch supports:
  ▪ Server-side custom query expansion to logical Synonyms
  ▪ Scoring based on relevance calculated by a custom function
• Sorting by score expedites the search process

Why Redis Over SSD Saves the Day
• In addition to the index, we hold fingerprinted videos in Redis (+50GB every month)
• Redis over SSD lets us balance the requirements for speed and memory usage, and ultimately drives costs down
• Fingerprinted video objects are processed concurrently
• Lazy loading from SSD works nicely with a prefetching mechanism that eliminates the delay
[Figure: Multiple Search Processes reading Fingerprinted Video Objects from Redis Over SSD, backed by an SSD Disk.]

Collaboration with RedisLabs Team
• Support in developing new features, such as custom scoring for RediSearch
• Support for unit testing through rmtest, a library for disposable local Redis server(s) based on port number
• Support for rollout to production, with quick turnaround for bug fixes

Future Challenges
• RediSearch cluster
• Support the ability to search over multiple pages in a RediSearch cluster
• Handle High-Availability and Disaster-Recovery
• Faster loading time from a Redis backup

Thank You!
Videocites is hiring…
Contact us: [email protected]
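
The "From Video Search to a Textual-Like Search" and "RediSearch Query" slides describe an inverted index from hashcodes to video IDs, queried with hashcodes that are expanded to synonyms and ranked by hashcode differences. The in-memory Python sketch below illustrates that idea only. The slides do not say how synonyms or scores are actually computed, so two things here are assumptions: a synonym is taken to be any hashcode within one bit of the query hashcode, and the score weight is 1 / (1 + Hamming distance). The video IDs and hashcode values are illustrative.

```python
from collections import defaultdict


def build_inverted_index(videos):
    """Map each 32-bit frame hashcode (the Term) to the video IDs containing it."""
    index = defaultdict(set)
    for video_id, hashcodes in videos.items():
        for h in hashcodes:
            index[h].add(video_id)
    return index


def expand_hashcode(h, bits=32):
    """ASSUMPTION: synonyms are the hashcode itself plus all one-bit variations."""
    return [h] + [h ^ (1 << i) for i in range(bits)]


def hamming(a, b):
    """Number of differing bits between two hashcodes."""
    return bin(a ^ b).count("1")


def search(index, query_hashcodes):
    """Return (video_id, score) pairs sorted by descending score."""
    scores = defaultdict(float)
    for q in query_hashcodes:
        for term in expand_hashcode(q):
            # ASSUMPTION: exact match weighs 1.0, a one-bit difference 0.5.
            weight = 1.0 / (1 + hamming(q, term))
            for video_id in index.get(term, ()):
                scores[video_id] += weight
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Illustrative data: three videos with a few frame hashcodes each.
videos = {
    "x57722m": [0x7C34BD18, 0x0000FFFF],
    "x57nelr": [0x7C34BD19],  # differs from 0x7C34BD18 by one bit
    "x57ap7l": [0x12345678],
}
index = build_inverted_index(videos)
results = search(index, [0x7C34BD18])
print(results)
```

With one-bit expansion, a single query hashcode already becomes 33 search terms, which is consistent with the slide's remark that a Boolean query can grow to hundreds of terms.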
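
The "Example of RediSearch Index" slide names two fields, h and publish_date, and the "Great Fit" slides mention a numeric range filter, custom expansion and custom scoring. As a rough sketch of how such an index and query might be expressed, the Python helpers below build argument lists for the RediSearch FT.CREATE and FT.SEARCH commands. The index name "videos", the "h"-prefixed lowercase-hex term encoding, the field types, and the expander and scorer names are all assumptions for illustration. The slides do not show the real schema or query.

```python
def encode_term(hashcode):
    """ASSUMPTION: store a 32-bit hashcode as the text term 'h' + 8 hex digits."""
    return f"h{hashcode:08x}"


def build_create_args(index_name):
    """FT.CREATE arguments: h as a TEXT field, publish_date as NUMERIC."""
    return ["FT.CREATE", index_name, "SCHEMA",
            "h", "TEXT", "publish_date", "NUMERIC"]


def build_search_args(index_name, hashcodes, date_min, date_max,
                      expander=None, scorer=None, limit=10):
    """FT.SEARCH arguments: a union of hashcode terms plus a date range filter."""
    terms = "|".join(encode_term(h) for h in hashcodes)
    query = f"@h:({terms}) @publish_date:[{date_min} {date_max}]"
    args = ["FT.SEARCH", index_name, query]
    if expander is not None:   # name of a server-side custom query expander
        args += ["EXPANDER", expander]
    if scorer is not None:     # name of a server-side custom scoring function
        args += ["SCORER", scorer]
    args += ["LIMIT", "0", str(limit)]
    return args


create_args = build_create_args("videos")
search_args = build_search_args(
    "videos", [0x7C34BD18, 0x7C34BD19], 1483228800, 1514764799
)
print(" ".join(create_args))
print(" ".join(search_args))
```

Given a running Redis server with the RediSearch module loaded, these lists could be passed to a client call such as redis-py's `execute_command(*args)`. The helpers only build strings, so they can be inspected without a server.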