Over the past two years, Facebook AI Research (FAIR) has worked with 13 universities around the world to assemble the largest data set of first-person video ever collected, specifically to train deep-learning image-recognition models. AIs trained on the data set could be better at controlling robots that interact with humans, or at interpreting images from smart glasses. “Machines in our daily lives will only be able to help us if they really understand the world through our eyes,” said Kristen Grauman of FAIR, who led the project.
This kind of technology could help people who need assistance around the house, or guide people through tasks they are learning to complete. “The video in this data set is much closer to how humans observe the world,” said Michael Ryoo, a computer-vision researcher at Google Brain and Stony Brook University in New York, who was not involved in Ego4D.
But the potential for abuse is obvious and worrying. The research is funded by Facebook, a social media giant that has recently been accused in the US Senate of putting profits ahead of people’s well-being, a charge corroborated by MIT Technology Review’s own investigations.
The business model of Facebook and other Big Tech companies is to extract as much data as possible from people’s online behavior and sell it to advertisers. The AI described in the project could extend that reach to people’s everyday offline behavior, revealing an unprecedented degree of personal information: what objects are in your home, what activities you enjoy, whom you spend time with, and even where your gaze lingers.
“There is privacy work that needs to be done as you take this out of the world of exploratory research and into something that is a product,” Grauman said. “That work could even be inspired by this project.”
The previous largest data set of first-person video consisted of 100 hours of footage of people in kitchens. The Ego4D data set comprises video recorded by participants in nine different countries (the United States, the United Kingdom, India, Japan, Italy, Singapore, Saudi Arabia, Colombia, and Rwanda).
Participants had a range of ages and backgrounds; some were recruited for their occupations, such as baker, mechanic, carpenter, and landscaper.
Previous data sets typically consist of semi-scripted video clips only a few seconds long. For Ego4D, participants wore head-mounted cameras for up to 10 hours at a time, recording first-person video of everyday activities such as walking, reading, doing laundry, shopping, playing with pets, playing board games, and interacting with other people. Some of the footage also includes audio, data about where the participants’ gaze was focused, and multiple perspectives on the same scene. It is the first data set of its kind, Ryoo said.
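The multimodal makeup described above (long unscripted video, optional audio, gaze data, multiple camera views) can be sketched as a simple record type. This is an illustrative structure only, assuming hypothetical field names; it is not the actual Ego4D schema or API:

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical record illustrating the modalities the article describes
# for an egocentric-video data set. Field names are assumptions for
# illustration, not the real Ego4D annotation format.
@dataclass
class ClipRecord:
    participant_id: str
    country: str
    activity: str                                   # e.g. "laundry", "shopping"
    duration_s: float                               # clip length in seconds
    has_audio: bool = False                         # some clips include audio
    gaze_points: Optional[List[Tuple[float, float, float]]] = None  # (t, x, y) gaze samples
    camera_views: int = 1                           # >1 when a scene has multiple perspectives

def total_hours(clips: List[ClipRecord]) -> float:
    """Sum clip durations and report them in hours."""
    return sum(c.duration_s for c in clips) / 3600.0

clips = [
    ClipRecord("p01", "India", "cooking", 5400.0, has_audio=True),
    ClipRecord("p02", "Rwanda", "shopping", 1800.0, camera_views=2),
]
print(total_hours(clips))  # 2.0
```

A record type like this makes the scale difference concrete: aggregating thousands of such clips is how a data set's total footage (e.g. the 100 hours of the previous kitchen-only collection) would be tallied.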