smArDS is an automatic video content analysis system that determines the most suitable points for inserting ads. It is intended primarily for online video and aims to avoid the random insertion of advertising, which is common practice on most non-premium content platforms. As viewers, we are often bothered when ads are inserted directly in the middle of an action scene or a conversation while we are watching a series or film. These practices are aggressive toward the viewer and the content creator, and they even hurt the advertiser, who bears the brunt of the user's anger. The basic purpose of smArDS is to analyze videos and determine the points where inserting advertising will be least intrusive, with the aim of maximizing the number of ad completions.
From a technological point of view, smArDS digitally processes the audio and video tracks of an audiovisual file and extracts a set of parameters for each video shot. These parameters include the most characteristic frames, the type and distribution of colors in each shot, the amount of movement, whether people appear and how many, whether a conversation is taking place and how many people are talking, whether we are in a specific part of the soundtrack, etc. With all these parameters, which we could call audiovisual big data (BigData AV), an intelligent system is trained to perform a hierarchical segmentation of the video. This hierarchical segmentation allows us to analyze the content in a natural way and to understand how the video is structured into scenes, shots, phrases and words. With these annotations we can obtain very useful interpretations that apply to different commercial applications. When looking for the best cuts for inserting advertising, we look for scene changes that respect both the content and the viewer and that fall at critical moments in the action, so that the audience stays engaged and comes back after the advertisement.
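As a rough illustration of the hierarchical annotation described above, the following Python sketch models scenes made of shots and proposes scene boundaries as candidate ad points. The class names, the fields, and the minimum-gap rule are assumptions made for this example; they are not the actual smArDS data model or selection logic.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Shot:
    start: float      # seconds from the beginning of the video
    end: float
    motion: float     # hypothetical motion score in [0, 1]
    speakers: int     # hypothetical count of people talking


@dataclass
class Scene:
    shots: List[Shot] = field(default_factory=list)

    @property
    def end(self) -> float:
        # A scene ends where its last shot ends.
        return self.shots[-1].end


def candidate_ad_points(scenes: List[Scene], min_gap: float = 60.0) -> List[float]:
    """Return scene-end timestamps that are at least min_gap seconds apart.

    The end of the final scene is skipped, since nothing follows it.
    """
    points: List[float] = []
    last = 0.0
    for scene in scenes[:-1]:
        if scene.end - last >= min_gap:
            points.append(scene.end)
            last = scene.end
    return points
```

A real system would also weigh the per-shot parameters (motion, dialogue, soundtrack) when ranking the boundaries; the sketch only shows how cuts can be restricted to scene changes.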
smArDS not only provides the best points for ad insertion; it also allows for a significant increase in ad inventory points. We firmly believe that video advertising models for the online world should not copy TV models, as is currently happening. We are convinced that if commercials are made shorter, with very brief interruptions of the video content, viewers will remain engaged and more ads will be displayed. Currently, interruptions are usually random and include three or four ads, a model quite close to that of conventional television.
Moreover, there is currently no legislation governing the insertion of video advertising on the internet, so no penalties are applied for interrupting content at inappropriate moments. However, it is assumed that in the future content distributors will be required to take these aspects into account.
The smArDS system works correctly with any type of content, encoding, video quality, etc. It always automatically detects the best places to insert advertising by looking inside the content. In addition, when the content has already been prepared for ad breaks at specific places, smArDS can also find those points in time. Specifically, the system detects fades to black and inserted TV series title cards, which are often the points the content creator has provided for inserting advertising.
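To make the idea of detecting fades to black more concrete, here is a minimal Python sketch. It assumes that the mean luminance of each frame (on a 0-255 scale) has already been computed by some decoder; the function name, the threshold, and the minimum duration are illustrative assumptions, not the detector smArDS actually uses.

```python
from typing import List, Sequence, Tuple


def find_black_segments(
    mean_luma: Sequence[float],
    fps: float,
    luma_threshold: float = 16.0,   # assumed "dark enough" level on a 0-255 scale
    min_duration: float = 0.2,      # assumed minimum length in seconds
) -> List[Tuple[float, float]]:
    """Return (start_s, end_s) for each run of consecutive dark frames."""
    segments: List[Tuple[float, float]] = []
    run_start = None
    for i, luma in enumerate(mean_luma):
        is_dark = luma < luma_threshold
        if is_dark and run_start is None:
            run_start = i                      # a dark run begins
        elif not is_dark and run_start is not None:
            if (i - run_start) / fps >= min_duration:
                segments.append((run_start / fps, i / fps))
            run_start = None                   # the dark run ended
    # Close a run that lasts until the final frame.
    if run_start is not None and (len(mean_luma) - run_start) / fps >= min_duration:
        segments.append((run_start / fps, len(mean_luma) / fps))
    return segments
```

Each returned segment is a candidate for a break the content creator has already prepared; a single dark frame is ignored because it is shorter than the minimum duration.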
The current system only determines the best moments in time to cut a video. However, we are currently developing a module that inserts advertising within the screen area without interrupting playback. This advertising would take the form of banners, logos, animations, etc. smArDS will automatically determine the exact location where logos or animations should be inserted so that they interfere as little as possible with the content. We want the logo never to overlap an actor's face or a fundamental part of a scene.
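One simple way to picture this placement problem is sketched below in Python: given face bounding boxes from some detector, choose the screen corner where a logo overlaps the faces least. The corner candidates, the margin, and the function names are assumptions made for illustration; the module under development may work quite differently.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # x, y, width, height in pixels


def overlap_area(a: Box, b: Box) -> int:
    """Area of the intersection of two axis-aligned boxes (0 if disjoint)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    w = min(ax + aw, bx + bw) - max(ax, bx)
    h = min(ay + ah, by + bh) - max(ay, by)
    return max(w, 0) * max(h, 0)


def pick_logo_position(
    frame_w: int, frame_h: int, logo_w: int, logo_h: int,
    faces: List[Box], margin: int = 10,
) -> Tuple[str, Box]:
    """Return the corner name and box with the least total overlap with faces."""
    corners: Dict[str, Box] = {
        "top_left": (margin, margin, logo_w, logo_h),
        "top_right": (frame_w - logo_w - margin, margin, logo_w, logo_h),
        "bottom_left": (margin, frame_h - logo_h - margin, logo_w, logo_h),
        "bottom_right": (frame_w - logo_w - margin, frame_h - logo_h - margin,
                         logo_w, logo_h),
    }
    # Ties are resolved in the order the corners are listed above.
    return min(
        corners.items(),
        key=lambda item: sum(overlap_area(item[1], face) for face in faces),
    )
```

The same scoring could be extended with other "fundamental parts of a scene" by adding their bounding boxes to the list alongside the faces.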
For instance, we would never interrupt the scene in which Darth Vader tells Luke Skywalker that he is his father; we would wait and cut at the end of the scene. In this way, the ad break would be powerful and the quality of the experience would be preserved.
Current predictions indicate that advertising in online video will take the leading role in the next 5 to 10 years. In the United States, the market for conventional TV advertising is nearly stagnant and grows only about 2% annually. By contrast, the market for digital video advertising saw increases of 40% in 2014, 45% in 2015 and 28% in 2016, and sustained increases of about 12%-15% annually are expected until 2020 (source: http://www.emarketer.com/Article/Digital-Video-Advertising-Grow-Annual-Double-Digit-Rates/1014105 ). Europe follows the United States, and the increases seen there in 2014-2015 are expected to occur here between 2017 and 2019. A steady increase of about 40% is expected in the coming years, with total turnover reaching about 2 billion in 2020 (http://www.emarketer.com/Article/Europes-Programmatic-Video-Ad-Revenues-Will-Near-2-Billion-2020/1013055 ).
Besides the total volume of business, the current business model is also expected to evolve towards an advanced programmatic model. Programmatic models, in which advertisers and publishers interact through software tools that manage the buying and selling of ads automatically, are widely used in internet advertising but still have little impact on online video in Europe. In this regard, we hope that the market will value the quality of the advertising points provided by the smArDS system and the possibility of increasing the inventory of such points.
The system can easily be adapted to any type of content. The smArDS software currently on the market is focused on series, movies, news programs and documentaries, because the intelligent system has been trained on features of the production techniques typical of this type of content. However, the system can be adapted, through training, to other types of content such as sports events, concerts, adult content, etc.
We are working on using all the low-level audiovisual metadata extracted from the video to introduce a new type of technology for personalizing and customizing ad content. Current systems take into account only textual metadata, such as the kind of movie a given visitor watches, the names of the actors, etc. Our goal is to use the Big Audiovisual Data we are extracting to introduce other kinds of parameters: What types of colors are displayed in the movies? Is there a lot of soundtrack? Are there many special effects? Is there a high predominance of dialogue? Do the scenes have a lot of movement? Using this type of information would provide the user with much more original recommendations. In addition, the technology can be used to insert ads in an active way: if you are in an action scene with lots of movement and special effects, insert an ad for a car, whereas if you are viewing a scene with a romantic soundtrack, it seems less aggressive to insert an ad for perfume. The idea is to match the low-level metadata of an ad with the metadata of the content at that moment in a natural manner.
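The matching idea can be sketched in a few lines of Python. The example below assumes that both the scene and each ad are described by the same hypothetical low-level features (here `motion`, `effects` and `music`, each scored from 0 to 1) and uses cosine similarity as the matching measure. Both the feature names and the choice of cosine similarity are assumptions made for illustration, not a description of the smArDS implementation.

```python
import math
from typing import Dict

Features = Dict[str, float]


def cosine(a: Features, b: Features) -> float:
    """Cosine similarity between two sparse feature dictionaries."""
    shared = a.keys() & b.keys()
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_ad(scene: Features, ads: Dict[str, Features]) -> str:
    """Return the name of the ad whose features best match the scene."""
    return max(ads, key=lambda name: cosine(scene, ads[name]))


# Hypothetical feature values, invented for this example.
ads = {
    "car": {"motion": 0.9, "effects": 0.7, "music": 0.2},
    "perfume": {"motion": 0.1, "effects": 0.1, "music": 0.9},
}
action_scene = {"motion": 0.9, "effects": 0.8, "music": 0.1}
romantic_scene = {"motion": 0.1, "effects": 0.0, "music": 0.9}
```

With these invented values, `best_ad(action_scene, ads)` selects the car ad and `best_ad(romantic_scene, ads)` selects the perfume ad, mirroring the example in the text.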
At present we do not know of any company that detects ad insertion points automatically. Companies in the sector usually make these selections manually or rely on annotations supplied by the producer of the content (if they exist or are available).
Yes, we have several applications in mind. One that is already available is the use of this information for intelligent navigation in a video player, which significantly improves the user experience. Current video players on the internet usually have a scroll bar that allows the user to move forward or backward through the content. However, this navigation system is inefficient because you almost never land on the desired moment in the video. With our system, a gesture on the screen lets the user jump back one phrase, and another gesture skips forward or back by a scene. This form of navigation is highly intuitive: if you are watching a movie in its original language and do not understand a sentence, you can repeat it with a simple gesture, or repeat a comical scene, or skim a video quickly by jumping to the beginning of each scene. Users who have tried this form of navigation are highly satisfied and interact more with the content. In addition, we can use this interaction with the content to detect user preferences and then assess how interesting each scene is, in order to insert commercials with higher economic value.
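The navigation described above amounts to snapping the playback position to annotated boundaries. The Python sketch below assumes a non-empty, sorted list of start times (in seconds) for phrases or scenes; the function names and the mapping from gestures to functions are hypothetical.

```python
from bisect import bisect_right
from typing import Optional, Sequence


def current_start(starts: Sequence[float], t: float) -> float:
    """Start of the unit playing at time t (used to repeat a phrase)."""
    i = bisect_right(starts, t) - 1
    return starts[max(i, 0)]


def previous_start(starts: Sequence[float], t: float) -> float:
    """Start of the unit before the one playing at time t."""
    i = bisect_right(starts, t) - 2
    return starts[max(i, 0)]


def next_start(starts: Sequence[float], t: float) -> Optional[float]:
    """Start of the following unit, or None when t is in the last unit."""
    i = bisect_right(starts, t)
    return starts[i] if i < len(starts) else None
```

The same three functions serve both levels of the hierarchy: called with phrase start times they repeat or skip sentences, and called with scene start times they move between scenes.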