
Kuaishou Is Actually an Artificial Intelligence Company

Distributing each user's attention effectively across a huge volume of short videos, rather than concentrating it on a handful of viral hits, cannot be done through manual curation. It has to be achieved with artificial intelligence.

How does AI empower short video platforms?

This was the topic of a talk by Zheng Wen, Vice President of AI Technology at Kuaishou Technology, at Geek Park's "Fire of Innovation" event. From a tool app used purely for making and sharing GIFs to a short-video community with more than 100 million daily active users who spend more than an hour a day on it: this is the road Kuaishou has traveled in its seven years as a startup.

Over those seven years, Kuaishou users have published more than 7 billion short videos, covering everything from comedy, games, and entertainment to local customs and folk traditions. Many people think of Kuaishou as just a short video company and assume that a short video platform has little to do with AI technology.

However, distributing each user's attention efficiently across a huge volume of short videos, rather than concentrating it on a few viral hits, is not feasible through manual curation. It has to be done with artificial intelligence.

AI technology makes recording life more interesting

AI is the core capability of a short video platform. Kuaishou is actually an artificial intelligence company.

With tens of millions of new videos arriving every day, how do you match them precisely to users? Kuaishou CEO Su Hua once described this as an unprecedented problem for Kuaishou's employees. To address it, Kuaishou has built a complete set of AI-based solutions that run through video production, content understanding, user understanding, and system distribution. At the Geek Park event, Zheng Wen, Vice President of AI Technology at Kuaishou Technology, said that AI is the core capability connecting the two ends of the platform: content production and content consumption.

Kuaishou has launched a number of hit effects. One is the "aging" effect called the Kuaishou Time Machine, which shows how the faces of people in a video might look 60 years later. Others include dance games of a dozen or so seconds that use real-time limb recognition, AR face swapping, and more. Behind these features is Kuaishou's development of cutting-edge AI technology, involving multiple technical modules such as human pose estimation, gesture recognition, and background segmentation. This is Kuaishou's new attempt, on the content production side, to make the act of recording more interesting.

After a user shoots and uploads a short video through the Kuaishou app, back-end systems extract basic information from it, such as the gender, expression, and attractiveness score of the faces in the video, and try to understand the video's content. The system also classifies footage along dimensions such as scene recognition, object tracking, and image quality assessment. Speech recognition is another important part of machine video understanding: the system converts speech into text and uses the words to understand what the video is expressing. The Multi-Media Understanding department uses AI to interpret a video in two stages, perception and reasoning. First it perceives the objective content of the captured video; then it infers the video's high-level semantic information.
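The two-stage idea can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the `Perception` record, the hand-written `RULES` table, and the tags are hypothetical, and the rules merely stand in for whatever learned reasoning model Kuaishou actually uses, which the talk does not describe.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Perception:
    """Stage 1 output: objective cues detected in a video (hypothetical fields)."""
    scene: str
    objects: List[str] = field(default_factory=list)
    transcript: str = ""  # text produced by speech recognition


# Hypothetical rules standing in for a learned reasoning model.
# Each rule maps a set of required cues to one high-level semantic tag.
RULES = [
    ({"cake", "candles"}, "birthday party"),
    ({"ball", "field"}, "outdoor sports"),
]


def infer_semantics(p: Perception) -> List[str]:
    """Stage 2: infer high-level tags from the perceived cues."""
    cues = set(p.objects) | {p.scene} | set(p.transcript.lower().split())
    # A rule fires only when all of its required cues were perceived.
    return [tag for required, tag in RULES if required <= cues]


video = Perception(scene="kitchen", objects=["cake", "candles"],
                   transcript="Happy birthday")
print(infer_semantics(video))
```

The point of the split is that stage 1 only reports what is in the frame and audio, while stage 2 decides what the video is about, so either stage could be improved without touching the other.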

Just as people store what they have learned in their brains, we organize the content on Kuaishou and store it in a Kuaishou knowledge graph. Fusing the content with the knowledge graph makes it possible to recognize a video's high-level semantics and emotions.

Understanding the users themselves is equally indispensable. Basic information such as a user's age, gender, and whether they are on WiFi, together with the large volume of behavior data generated as they use Kuaishou, is fed into a deep learning model for training. This yields a comprehensive picture of each user, which is used to predict the user's preferences and the associations between individual users.
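As a rough illustration of turning profile fields and behavior data into a preference prediction, here is a minimal sketch. It uses a single logistic unit as a stand-in for the deep learning model mentioned in the talk; the feature names, `WEIGHTS`, and `BIAS` are all hypothetical and hand-picked, whereas a real system would learn them from behavior logs.

```python
import math
from typing import Dict

# Hypothetical, hand-picked parameters; a real model would learn these.
WEIGHTS = {
    "age_under_25": 0.4,
    "on_wifi": 0.3,
    "watched_dance": 1.2,
    "skipped_dance": -0.9,
}
BIAS = -0.5


def encode_user(age: int, on_wifi: bool, behavior: Dict[str, int]) -> Dict[str, float]:
    """Turn profile fields and behavior counts into numeric features."""
    features = {
        "age_under_25": 1.0 if age < 25 else 0.0,
        "on_wifi": 1.0 if on_wifi else 0.0,
    }
    for event, count in behavior.items():
        # log1p dampens very large counts so heavy users do not dominate.
        features[event] = math.log1p(count)
    return features


def predict_interest(features: Dict[str, float]) -> float:
    """Logistic score in (0, 1): estimated interest in dance videos."""
    z = BIAS + sum(WEIGHTS.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


fan = predict_interest(encode_user(22, True, {"watched_dance": 10}))
skipper = predict_interest(encode_user(40, False, {"skipped_dance": 10}))
print(f"fan={fan:.2f} skipper={skipper:.2f}")
```

A user who repeatedly watches dance videos scores well above 0.5 and one who repeatedly skips them scores well below it, which is the kind of signal a distribution system could use when ranking candidate videos.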