And audio descriptions in general are why I'll never publish videos in the Fediverse.
I'd have to go into similar detail as for my pictures, only for moving pictures plus sound plus voice-over now. My descriptions would have to be so detailed that the video would have to pause to let the audio description catch up with the visuals. In fact, the video would spend more time paused while the audio description is rambling than actually moving, and it would never spend more than a few seconds moving at a time.
For one, I would have to describe and explain what the video shows at the very same level of detail at which I describe my images. And at least once I've described one single image at such a level of detail that it'd probably take a screen reader one full hour to read the image description aloud.
Besides, I would have to take into account that it's a video. Everything would need timestamps. And instead of only describing the camera position and the camera angle, I would have to describe the camera movements like so:
Seven minutes, eighteen point one three seconds: The camera quickly rotates to the left around a vertical axis through a point roughly two point four metres straight ahead of the avatar. It starts rotating from the direction in which the avatar is facing, roughly twelve degrees to the east of north. The barn which first appeared at five minutes, fifty-two point two eight seconds comes into view again, including all decoration around it. The camera only rotates around this vertical axis and not around any horizontal axis. The avatar does not rotate with the camera.
Seven minutes, eighteen point six four seconds: The video pauses to let this description catch up.
Seven minutes, eighteen point seven one seconds: The video no longer pauses. The camera reaches a rotation angle of roughly twenty degrees to the south of west. The rotation speed of the camera slows down. It continues to rotate to the left.
Seven minutes, eighteen point nine three seconds: The video pauses to let this description catch up.
Seven minutes, nineteen point zero four seconds: The video no longer pauses. The camera stops rotating at an angle of roughly twenty-five degrees to the west of south.
That is, in order to cater to deaf-blind users, I would have to have two time codes. One, the time code of the original video, not taking the pauses into account. Two, the time code of the described video with catch-up pauses.
And the video with catch-up pauses would be dramatically longer than the original video. Ten minutes of video would take me weeks to describe, probably over a month. And it would end up many hours long, depending on how much there is to describe and explain.
So a time code in the Braille description for deaf-blind users might actually read, "Six minutes, thirty-seven point five five seconds in the original video, fourteen hours, three minutes, forty-nine point two one seconds in this described version of the video."
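To illustrate how the two time codes relate, here is a minimal Python sketch. It assumes the described version is simply the original video plus every catch-up pause inserted so far. The function names and the pause lengths are made up for illustration; only the two pause positions are taken from the example above. Times are counted in hundredths of a second to avoid rounding trouble.

```python
def described_timecode(original_cs, pauses):
    """Map an original-video time code to the described-video time code.

    original_cs: position in the original video, in hundredths of a second.
    pauses: list of (original_cs_at_pause, pause_length_cs) tuples.
    """
    # Every pause that begins at or before this point delays the described version.
    paused = sum(length for start, length in pauses if start <= original_cs)
    return original_cs + paused


def format_timecode(cs):
    """Format hundredths of a second as H:MM:SS.hh."""
    hours, rest = divmod(cs, 360_000)
    minutes, rest = divmod(rest, 6_000)
    seconds, hundredths = divmod(rest, 100)
    return f"{hours}:{minutes:02d}:{seconds:02d}.{hundredths:02d}"


# Hypothetical pauses at 7:18.64 and 7:18.93 in the original video,
# with made-up lengths of 90 and 120 seconds.
pauses = [(43_864, 9_000), (43_893, 12_000)]

original = 43_904  # 7:19.04 in the original video
print(format_timecode(original), "->", format_timecode(described_timecode(original, pauses)))
```

With real pause lengths, the gap between the two time codes would grow with every pause, which is how a few minutes of original video could end up hours into the described version.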
By the way, no, an AI can't do that.
#Long #LongPost #CWLong #CWLongPost #MediaDescription #MediaDescriptions #AudioDescription #AudioDescriptions