Imagine creating a video from just a static photograph and text. This is the basic premise of the Creative Reality Studio platform, created by the Israeli company D-ID.
The software uses artificial intelligence to “fit” the sound of someone speaking to the mouth of the person in the photo.
According to TechCrunch, the company wants the technology to meet demand in areas such as corporate training, distance education, internal and external business communication, and marketing and sales.
That’s because, instead of preparing a set and equipping it with video and audio recording gear, users just select an image and the artificial intelligence does the rest.
How the system works
Users upload a photo showing the face of the person they want as the presenter of the video. Creative Reality Studio also offers its own pre-selected presenters.
Subscribers to the platform’s most expensive plan gain the ability to select “more expressive” presenters, with more options for facial expressions and hand gestures.
The audio the AI uses to simulate the person in the photo speaking is generated from text typed by the user or from a recording uploaded to the platform. The company says it supports 119 languages, including English, Mandarin, Spanish, Arabic and Afrikaans, one of the languages of South Africa. Portuguese is not among them.
Below is an example of the technology at work:
Users can also choose the mood of the video from options such as “happy”, “sad”, “excited” and “friendly”.
“Reading documents and going through presentations can be dry and boring. Plus, it takes thousands of dollars to hire actors and create educational videos. So we use our AI to create presenters and tutors and make the content more engaging and effective,” Gil Perry, chief executive of D-ID, told TechCrunch.
Does it have fake news potential?
An obvious concern with Creative Reality Studio’s business model is the generation of fake news. The platform’s technique resembles that of deepfake videos, in which artificial intelligence is used to generate content with the image, and even the voice, of a person who never recorded what is being said.
This year’s election campaign in Brazil, incidentally, has already been the target of several deepfakes.
To reduce the risks, D-ID says it has taken some steps. First, it added a filter that blocks profanity and racist slurs. In addition, the AI uses image recognition to prevent the faces of famous people from being used in the recordings.
The company also prohibits the creation of political content. It warns that, if it detects a breach of its rules, it can suspend the account responsible and remove the generated video from its library.
These are necessary measures, but human creativity will remain a challenge. It seems quite likely that videos showing unknown faces passing off false information as true will continue to circulate. The problem can get worse if those faces are presented with job titles and specialties that lend their words an impression of authority. Psychology helps explain why so many people believe fake news.
Training the AI
According to TechCrunch, the platform offers a free 14-day trial in which up to 5 minutes of video can be generated. The subscription costs US$ 49 (R$ 258.60 in direct conversion) per month and entitles the user to generate 15 minutes of video at the best quality the site offers.
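To put the subscription figures in perspective, the short Python sketch below works out the cost per minute of generated video and the exchange rate implied by the conversion quoted above. It uses only the numbers reported in this article; the variable and function names are illustrative and have nothing to do with D-ID's own tools.

```python
# Figures quoted in the article (attributed to TechCrunch).
PRICE_USD = 49.00        # monthly subscription in US dollars
PRICE_BRL = 258.60       # the article's direct conversion to reais
MINUTES_INCLUDED = 15    # minutes of video included per month


def cost_per_minute(price: float, minutes: int) -> float:
    """Return the price of one minute of generated video."""
    return price / minutes


usd_per_minute = cost_per_minute(PRICE_USD, MINUTES_INCLUDED)
brl_per_minute = cost_per_minute(PRICE_BRL, MINUTES_INCLUDED)

# Exchange rate implied by the two prices quoted in the article.
implied_rate = PRICE_BRL / PRICE_USD

print(f"US$ {usd_per_minute:.2f} per minute")         # US$ 3.27 per minute
print(f"R$ {brl_per_minute:.2f} per minute")          # R$ 17.24 per minute
print(f"Implied rate: R$ {implied_rate:.2f} per US$ 1")  # Implied rate: R$ 5.28 per US$ 1
```

In other words, each minute of video on the paid plan works out to a little over three dollars, assuming the subscriber uses the full monthly allowance.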
The idea is to attract subscribers, especially those willing to help further improve the platform’s AI. Users can upload their own voice to make the audio cloning more accurate.
According to the company, the platform will soon also accept video uploads, so that the AI learns to better imitate the gestures and intonation of each presenter.
These features, however, are limited to corporate contracts, to prevent the generation of fake news.