Last week (January 28-30) we were invited by VRT Sandbox to attend the Production Technology Seminar (PTS), hosted by the European Broadcasting Union (EBU) at its headquarters in Geneva, Switzerland. We had the chance to dig into the challenges that EBU members are facing today and in the coming years, and to listen to possible solutions. Aside from absorbing the interesting talks that were scheduled, we also had a nice demo room where we showcased our solutions for broadcaster needs in the coming years. Let’s dive into the 7 things we learned during the event.
Metadata is the digital gold for media companies.
Let’s start with the elephant in the room: metadata. We heard the word “metadata” mentioned about a thousand times over the three days of the conference. We see a lot of broadcasters with similar problems: they deal with enormous archives (audio, video, imagery), and new titles get added to their back catalog every month on top of news and sports reporting. Maintaining these assets and performing search and retrieval (let alone building automated features on top of them) is very difficult, if not straight-up impossible, with limited metadata for these assets. We can think of this metadata as an indexed representation of the content.
This is where Artificial Intelligence (AI) comes in. The solutions discussed during the event all revolved, in one way or another, around AI. Countless approaches have been proposed to extract information from these assets, with varying degrees of technical feasibility at the moment. Below are some proposed examples:
- Object Detection
- Face Recognition
- Speaker Identification
- Gender Identification
- Speech-to-text transcription
The generation of metadata brings us to the next topic:
Making archives searchable is a huge potential time saver
The tragic helicopter accident that killed Kobe Bryant and his daughter, Gianna, on January 26th shook not only basketball lovers but the whole world. For news outlets, however, the reality is that they have to dive head first into their archives to find the right content for their news reports on the topic. Imagine being a news editor and being able to just give “Kobe Bryant” and “Basketball Court” as parameters to the archive and being presented with a showreel of a certain length of Kobe playing basketball. This could be the ideal content to report on his basketball achievements. When the report then moves on to the tragic loss of not only Kobe but also his daughter, Gianna, the reporter could use “Kobe Bryant” and “Gianna Bryant” to generate a second reel where they are both present in the picture. This could be a big time saver in the reporting process.
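To make the idea concrete, here is a minimal sketch of how such a query could work once AI-generated labels exist. Everything in it is an assumption for illustration: the `Segment` schema, the asset names, the timings, and the greedy reel builder are made up and do not describe any broadcaster's actual system. The sketch builds an inverted index from label to segments, intersects the sets for the requested labels, and then fills a reel up to a maximum duration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A slice of an archived asset with AI-generated labels (hypothetical schema)."""
    asset_id: str
    start_s: float
    end_s: float
    labels: frozenset


def build_index(segments):
    """Map each lower-cased label to the set of segments that carry it."""
    index = defaultdict(set)
    for seg in segments:
        for label in seg.labels:
            index[label.lower()].add(seg)
    return index


def search(index, *labels):
    """Return segments tagged with ALL given labels, ordered by asset and start time."""
    if not labels:
        return []
    sets = [index.get(label.lower(), set()) for label in labels]
    hits = set.intersection(*sets)
    return sorted(hits, key=lambda s: (s.asset_id, s.start_s))


def build_reel(hits, max_duration_s):
    """Greedily add matching segments, skipping any that would exceed max_duration_s."""
    reel, total = [], 0.0
    for seg in hits:
        length = seg.end_s - seg.start_s
        if total + length <= max_duration_s:
            reel.append(seg)
            total += length
    return reel


# Toy archive: asset names, timings, and labels are invented for this example.
ARCHIVE = [
    Segment("game_2006", 0.0, 20.0, frozenset({"Kobe Bryant", "Basketball Court"})),
    Segment("game_2006", 20.0, 35.0, frozenset({"Basketball Court"})),
    Segment("interview_2016", 5.0, 15.0, frozenset({"Kobe Bryant"})),
    Segment("courtside_2019", 0.0, 30.0,
            frozenset({"Kobe Bryant", "Gianna Bryant", "Basketball Court"})),
]
INDEX = build_index(ARCHIVE)

if __name__ == "__main__":
    court_hits = search(INDEX, "Kobe Bryant", "Basketball Court")
    for seg in build_reel(court_hits, max_duration_s=60):
        print(seg.asset_id, seg.start_s, seg.end_s)
```

In practice the labels would come from models such as the face recognition and object detection examples listed earlier, and a real archive would use a search engine rather than in-memory sets, but the core operation stays the same: intersect the segments that carry each requested label.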
Another benefit of making archives searchable beyond basic metadata is that this functionality can be opened up to the public, although this might only be important for public broadcasters.
Sports reporting loves Augmented Reality (AR)
Companies like Sky UK are leading the way when it comes to sports reporting with their Sky VR. For instance, at last year’s Open Golf, one of the most prestigious golf events of the year, Sky VR offered a 360 VR capture of the golfers competing in the event. This resulted in a 3D avatar of the golfers projected on the golf course while the reporter could walk around and comment on the golfer’s swing.
Sky VR as well as BBC Sport are using generated elements to augment the televised studio experience, giving the illusion of being in a skybox of the sports arena, or displaying extensive player and game statistics that are reminiscent of sports video games. For that they use game engines like Unreal. These techniques can be used for topics other than sports, for example to display exit polls in a game-like manner inside an existing building.
The future of mobile journalism is in your pocket
With smartphone cameras getting better each year and rivaling more expensive cameras when it comes to picture quality, production value rises as well. All you really need to make a decent video report is your smartphone and maybe an external mic, and you’re good to go. Setups like this are already used in the field by professionals. With smartphones being omnipresent and 360 cameras becoming popular, we’ll probably see a shift in value creation from ‘capturing’ to ‘curating’ footage, driven by a trend towards “over-capturing”. This is where we again see the urgency of generating metadata, because we now have to sift through all the content that’s being generated to find the right bits we want to report on.
People want to see the right content with minimal effort
In times when content offerings are getting almost too big to browse with traditional indexing methods, recommendation engines remain of the utmost importance for content platforms. We see this reflected in the roadmaps that broadcasters presented during this conference. YLE, the Finnish public broadcaster, gave us some interesting insights into the performance of the recommendation engine they use on their online platform, Areena. They managed to grow their weekly active users by 23% in 2019. With a personalized front page instead of the same content for everyone, they managed to generate 30% more stream starts.
NHK is making the 2020 Olympics accessible to everyone
Japan's public broadcaster, Nippon Hoso Kyokai (NHK), showed us a lot of impressive technological achievements that they either have in production or plan to roll out soon. One in particular really impressed us: NHK plans to translate all broadcast games of the 2020 Olympics in real time with a 3D avatar that is able to sign the reporters’ spoken words. Unfortunately we don’t have footage of their demo, and we weren’t able to find footage of it online either. You can take our word for it: the footage we saw was super impressive. A little research taught us that they started this project as early as 2011, which gives an indication of how much work went into achieving this.
Deep fakes are here
VRT reporter Tom Van de Weghe gave an interesting talk about the power of deep fakes: neural networks used to produce realistic images, video and audio of (famous) people acting out scripts that never happened. With the 2020 US election in sight, for one, you can understand their potential harm. Tom was part of a dedicated research group at Stanford University concerned with deep fakes. These deep fakes come in varying degrees of credibility, but with 18 000 deep fakes in circulation as of January 2020, the risk of them creating havoc is getting more plausible every day. AI is being used to detect deep fakes: AI to fight AI, if you will.
So, to wrap this up, a lot of interesting technologies and views on media and content generation were presented during this seminar. We also felt a sense of urgency when it comes to innovation and media. The time for broadcasters to invest in the future is now, especially with big players like Netflix, Disney+ and HBO GO strengthening their user base in the States as well as in Europe. We definitely feel energized and are bursting with ideas to tackle the hurdles facing the future of content creation, so if you feel like having a chat about this topic, come and say hi!