"Nothing About Us Without Us", only it still is without them most of the time
Last edited: Fri, 27 Sep 2024 11:36:38 +0200
jupiter_rowland@hub.netzgemeinde.eu
When disabled Fediverse users demand participation in accessibility discussions, but there are no discussions in the first place, and they themselves don't even seem to be available to give accessibility feedback
"Nothing about us without us" is the catchphrase used by disabled accessibility activists who are trying to get everyone to get accessibility right. It means that non-disabled people should stop assuming what disabled people need. Instead, they should listen to what disabled people say they need and then give them what they need.
Just like accessibility in the digital realm in general, this is not targeted only at professional Web or UI developers. It is targeted at any and all social media users just as well.
However, this would be a great deal easier if it wasn't still "without them" all the time.
Lack of necessary feedback
Alt-text and image descriptions are one example and one major issue. How are we, the sighted Fediverse users, supposed to know what blind or visually-impaired users really need and where they need it if we never get any feedback? And we never get any feedback, especially not from blind or visually-impaired users.
Granted, only sighted users can call us out for an AI-generated alt-text that's complete rubbish because non-sighted users can't compare the alt-text with the image.
But non-sighted users could tell us whether they're sufficiently informed or not. They could tell us whether they're satisfied with an image description mentioning that something is there, or whether they need to be told what this something looks like. They could tell us which information in an image description is useful to them, which isn't, and what they'd suggest to improve its usefulness.
They could tell us whether certain information that's in the alt-text right now should better go elsewhere, like into the post. They could tell us whether extra information needed to understand a post or an image should be given right in the post that contains the image or through an external link. They could tell us whether they need more explanation on a certain topic displayed in an image, or whether there is too much explanation that they don't need. (Of course, they should take into consideration that some of us do not have a 500-character limit.)
Instead, we, the sighted users who are expected to describe our images, receive no feedback for our image descriptions at all. We're expected to know exactly what blind or visually-impaired users need, and we're expected to know it right off the bat without being told so by blind or visually-impaired users. It should be crystal-clear how this is impossible.
What are we supposed to do instead? Send all our image posts directly to one or two dozen people who we know are blind and ask for feedback? I'm pretty sure I'm not the only one who considers this very bad style, especially in the long run, and even then there's no guarantee of getting feedback.
So with no feedback, all we can do is guess what blind or visually-impaired users need.
Common alt-text guides are not helpful
Now you might wonder why all this is supposed to be such a big problem. After all, there are so many alt-text guides out there on the Web that tell us how to do it.
Yes, but here in the Fediverse, they're all half-useless.
The vast majority of them are written for static Web sites, whether scientific, technological or commercial. Some include blogs, again of the scientific, technological or commercial kind. The moment they start relying on captions and HTML code, you know you can toss them, because they translate to almost nothing in the Fediverse.
What few alt-text guides are written for social media are written for the huge corporate American silos. 𝕏, Facebook, Instagram, LinkedIn. They do not translate to the Fediverse which has its own rules and cultures, not to mention much higher character limits, if any.
Yes, there are one or two guides on how to write alt-text in the Fediverse. But they're always about Mastodon, only Mastodon and nothing but Mastodon. They're written for Mastodon's limitations, especially only 500 characters being available in the post itself versus a whopping 1,500 characters being available in the alt-text. And they're written with Mastodon's culture in mind which, in turn, is influenced by Mastodon's limitations.
Elsewhere in the Fediverse, outside Mastodon, you have many more possibilities. You have thousands of characters to use in your post, or no character limit to worry about at all. Granted, you don't have all the means at hand that you have on a static HTML Web site; even the few dozen (streams) users who can use HTML in social media posts don't have the same influence on the layout of their posts as Web designers have on Web sites. Still, you aren't bound to Mastodon's self-imposed limitations.
And yet, those Mastodon alt-text guides tell you you have to squeeze all information into the alt-text as if you don't have any room in the post. Which, unlike most Mastodon users, you do have.
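To make the constraint those guides silently assume concrete, here is a minimal sketch of how the character limits interact. The 500- and 1,500-character figures are Mastodon's defaults, not a Fediverse-wide rule, and the function below is purely illustrative, counting characters and nothing else:

```python
# Naive sketch: where could a description go under Mastodon's default limits?
# 500/1,500 are Mastodon's defaults; many other Fediverse platforms raise
# these limits or drop them entirely.
MASTODON_POST_LIMIT = 500   # characters per post
MASTODON_ALT_LIMIT = 1500   # characters per alt-text

def placement_options(description: str, post_text: str) -> list[str]:
    """Return the places a description would fit, by character count alone."""
    options = []
    if len(description) <= MASTODON_ALT_LIMIT:
        options.append("alt-text (fits Mastodon's limit)")
    if len(post_text) + len(description) <= MASTODON_POST_LIMIT:
        options.append("in the post, even on Mastodon")
    if not options:
        options.append("in the post, but only outside Mastodon's limits")
    return options
```

Of course, this only counts characters. It says nothing about which placement blind or visually-impaired readers actually prefer, which is precisely the feedback that's missing.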
It certainly doesn't help that the Fediverse's entire accessibility culture comes from Mastodon, concentrates on Mastodon and only takes Mastodon into consideration with all its limitations. Apparently, if you describe an image for the blind and the visually-impaired, you must describe everything in the alt-text. After all, according to the keepers of accessibility in the Fediverse, how could you possibly describe anything in a post with a 500-character limit?
In addition, every guide covers only its own standard cases. For example, an image description guide for static scientific Web sites only covers images that are typical for such sites: graphs, flowcharts, maybe a portrait picture. Everything else is an edge case that the guide doesn't cover.
There are even pictures that are edge cases for all guides, covered insufficiently or not at all by any of them. When I post an image, it's practically always such an edge case, and I can only guess what the right way to describe it might be.
Discussing Fediverse accessibility is necessary...
Even individual feedback on image descriptions, media descriptions, transcripts etc. is of limited use. If one user gives you feedback, you know what this one user needs. But you don't know what the general public with disabilities needs, and that is what actually matters. Another user might give you wholly different feedback; two different blind users are likely to give you two different opinions on the same image description.
What is needed so direly is open discussion about accessibility in the Fediverse. People gathering together, talking about accessibility, exchanging experiences, exchanging ideas, exchanging knowledge that others don't have. People with various disabilities and special requirements in the Fediverse need to join this discussion because "nothing about them without them", right? After all, it is about them.
And people from outside of Mastodon need to join, too. They are needed to give insights on what can be done on Pleroma and Akkoma, on Misskey, Firefish, Iceshrimp, Sharkey and Catodon, on Friendica, Hubzilla and (streams), on Lemmy, Mbin, PieFed and Sublinks and everywhere else. They are needed to combat the rampant Mastodon-centrism and keep reminding the Mastodon users that the Fediverse is more than Mastodon. They are needed to explain that the Fediverse outside of Mastodon offers many more possibilities than Mastodon that can be used for accessibility. They are needed so that solutions can be found that are not bound to Mastodon's restrictions. And they need to learn that accessibility in the Fediverse exists in the first place, because it's currently pretty much a topic that only exists on Mastodon.
There are so many things I'd personally like to be discussed and ideally brought to a consensus of sorts. For example:
- Explaining things in the alt-text versus explaining things in the post versus linking to external sites for explanations.
  - The first is the established Mastodon standard, but any information exclusively available in the alt-text is inaccessible to people who can't access alt-text, including due to physical disabilities.
  - The second is the most accessible, but it inflates the post, and it breaks with several Mastodon principles (probably over 500 characters, explanation not in the alt-text).
  - The third is the easiest way, but it's inconvenient because image and explanation are in different places.
- What if an image needs a very long and very detailed visual description, considering the nature of the image and the expected audience?
  - Describe the image only in the post (inflates the post, no image description in the alt-text, breaks with Mastodon principles, impossible on vanilla Mastodon)?
  - Describe it externally and link to the description (no image description anywhere near the image, description separated from the image, breaks with Mastodon principles, requires an external space to upload the description)?
  - Only give a description that's short enough for the alt-text regardless (insufficient description)?
  - Refrain from posting the image altogether?
- Seeing as all text in an image must always be transcribed verbatim, what if the text is unreadable for some reason, but whoever posts the image can source the text and transcribe it regardless?
  - Must it be transcribed because that's what the rule says?
  - Must it be transcribed so that even sighted people know what's written there?
  - Must it not be transcribed?
...but it's nigh-impossible
Alas, this won't happen. Ever. It won't happen because there is no place in the Fediverse where it could sensibly happen.
Now you might wonder what gives me that idea. Can't this just be done on Mastodon?
No, it can't. Yes, most participants would be on Mastodon. And Mastodon users who don't know anything else keep saying that Mastodon is sooo good for discussions.
But seriously, if you've experienced anything in the Fediverse that isn't purist microblogging like Mastodon, you've long since come to the realisation that when it comes to discussions with a certain number of participants, Mastodon is utter rubbish. It has no concept of conversations whatsoever. It's great as a soapbox. But it's outright horrible at holding a discussion together. How are you supposed to have a meaningful discussion with 30 people if you burn through most of your 500-character limit mentioning the other 29?
Also, Mastodon has another disadvantage: Almost all participants will be on Mastodon themselves. Most of them will not know anything about the Fediverse outside Mastodon. At least some will not even know that the Fediverse is more than just Mastodon. And that one poor sap from Friendica will constantly try to remind people that the Fediverse is not only Mastodon, but he'll be ignored because he doesn't always mention all participants in the thread. Mentioning everyone is not necessary on Friendica itself, so he isn't used to it; on Mastodon, however, it's pretty much essential.
Speaking of Friendica, it'd actually be the ideal place in the Fediverse for such discussions because users from almost all over the place could participate. Interaction between Mastodon users and Friendica forums is proven to work very well. A Friendica forum can be moderated, unlike a Guppe group. And posts and comments reach all members of a Friendica forum without mass-mentioning.
The difficulty here would be to get it going in the first place. Ideally, the forum would be set up and run by an experienced Friendica user. But accessibility is not nearly as much an issue on Friendica as it is on Mastodon, so the difficult part would be to find someone who sees the point in running a forum about it in the first place. A Mastodon user who does see the point, on the other hand, would have to get used to something that is a whole lot different from Mastodon while being a forum admin/mod.
Lastly, there is the Threadiverse, Lemmy first and foremost. But Lemmy has its own issues. For starters, its federation with the Fediverse outside the Threadiverse is patchy and not quite reliable, and the devs don't seem interested in non-Threadiverse federation. So everyone interested in the topic would need a Lemmy account, and many refuse to create a second Fediverse account for any purpose.
If it's on Lemmy, it will naturally attract Lemmy natives. But the vast majority of these have come from Reddit straight to Lemmy. Just like most Mastodon users know next to nothing about the Fediverse outside Mastodon, most Lemmy users know next to nothing about the Fediverse outside Lemmy. I am on Lemmy, and I've actually run into that wall. After all, they barely interact with the Fediverse outside Lemmy. As accessibility isn't an issue on Lemmy either, they know nothing about accessibility on top of knowing nothing about most of the Fediverse.
So instead of having meaningful discussions, you'll spend most of the time educating Lemmy users about the Fediverse outside Lemmy, about Mastodon culture, about accessibility and about why all this should even matter to people who aren't professional Web devs. And yes, you'll have to do it again and again for each newcomer who couldn't be bothered to read up on any of this in older threads.
In fact, I'm not even sure if any of the Threadiverse projects are accessible to blind or visually-impaired users in the first place.
Lastly, I've got some doubts that discussing accessibility in the Fediverse would even be possible if there were a perfectly appropriate place for it. This Fediverse neither gives advice on accessibility within itself beyond always linking to the same useless guides, nor does it give feedback on accessibility measures such as image descriptions.
People, disabled or not, seem to want perfect accessibility. But nobody wants to help others improve their contributions to accessibility in any way. It's easier and more convenient to expect things to happen by themselves.
AI superiority at describing images, not so alleged?
Last edited: Fri, 27 Sep 2024 11:36:16 +0200
jupiter_rowland@hub.netzgemeinde.eu
Could it be that AI can run circles around even me at describing images? And that the only ones whom my image descriptions satisfy are Mastodon's alt-text police?
I think I've reached a point at which I only describe my images for the alt-text police any longer. At which I keep ramping up my efforts, increasing my description quality and declaring all my previous image descriptions obsolete and hopelessly outdated, only to have an edge over those who try hard to enforce quality image descriptions all over the Fediverse and who might stumble upon one of my image posts in their federated timelines by chance.
For blind or visually-impaired people, my image descriptions ought to fall under "better than nothing" at best and even that only if they have the patience to have them read out in their entirety. But even my short descriptions in the alt-text are too long already, often surpassing the 1,000-character mark. And they're often devoid of text transcripts due to lack of space.
My full descriptions that go into the post are probably mostly ignored, also because nobody on Mastodon actually expects an image description anywhere that isn't alt-text. But on top of that, they're even longer. Five-digit character counts, image descriptions longer than dozens of Mastodon toots, are my standard. Necessarily so because I can't see it being possible to sufficiently describe the kind of images I post in significantly fewer characters, so I can't help it.
But it isn't only about the length. It also seems to be about quality. As @Robert Kingett, blind points out in this Mastodon post and this blog post linked in the same Mastodon post, blind or visually-impaired people generally prefer AI-written image descriptions over human-written image descriptions. Human-written image descriptions lack effort, they lack details, they lack just about everything. AI descriptions, in comparison, are highly detailed and informative. And I guess when they talk about human-written image descriptions, they mean all of them.
I can upgrade my description style as often as I want. I can try to make it more and more inclusive by changing the way I describe colours or dimensions as much as I want. I can spend days describing one image, explaining it, researching necessary details for the description and explanation. But from a blind or visually-impaired user's point of view, AI can apparently write circles around that in every way.
AI can apparently describe and even explain my own images about an absolutely extreme niche topic more accurately and in greater detail than I can. In all details that I describe and explain, with no exception, plus even more on top of that.
If I take two days to describe an image in over 60,000 characters, it's still sub-standard in terms of quality, informativeness and level of detail. AI takes only a few seconds to generate a few hundred characters which apparently describe and explain the self-same image at a higher quality, more informatively and at a higher level of detail. It may even be able not only to identify where exactly an image was created, even if that place is only a few days old, but also to explain that location to someone who doesn't know anything about virtual worlds in no more than 100 characters or so.
Whenever I have to describe an image, I always have to throw someone under the bus. I can't perfectly satisfy everyone at the same time. My detailed image descriptions are too long for many people, be it people with a short attention span, be it people with little time. But if I shortened them dramatically, I'd have to cut information to the disadvantage of not only neurodiverse people who need things explained in great detail, but also blind or visually-impaired users who want to explore a new and previously unknown world through only that one image, just like sighted people can let their eyes wander around the image.
Apparently, AI is fully capable of actually perfectly satisfying everyone all the same at the same time because it can convey more information with only a few hundred characters.
Sure, AI makes mistakes. But apparently, AI still makes fewer mistakes than I do.
#AltText #AltTextMeta #CWAltTextMeta #ImageDescription #ImageDescriptions #ImageDescriptionMeta #CWImageDescriptionMeta #AI #AIVsHuman #HumanVsAI
For blind or visually-impaired people, my image descriptions ought to fall under "better than nothing" at best and even that only if they have the patience to have them read out in their entirety. But even my short descriptions in the alt-text are too long already, often surpassing the 1,000-character mark. And they're often devoid of text transcripts due to lack of space.
My full descriptions that go into the post are probably mostly ignored, also because nobody on Mastodon actually expects an image description anywhere that isn't alt-text. But on top of that, they're even longer. Five-digit character counts, image descriptions longer than dozens of Mastodon toots, are my standard. Necessarily so because I can't see it being possible to sufficiently describe the kind of images I post in significantly fewer characters, so I can't help it.
But it isn't only about the length. It also seems to be about quality. As @Robert Kingett, blind points out in this Mastodon post and this blog post linked in the same Mastodon post, blind or visually-impaired people generally prefer AI-written image descriptions over human-written image descriptions. Human-written image descriptions lack effort, they lack details, they lack just about everything. AI descriptions, in comparison, are highly detailed and informative. And I guess when they talk about human-written image descriptions, they mean all of them.
I can upgrade my description style as often as I want. I can try to make it more and more inclusive by changing the way I describe colours or dimensions as much as I want. I can spend days describing one image, explaining it, researching necessary details for the description and explanation. But from a blind or visually-impaired user's point of view, AI can apparently write circles around that in every way.
AI can apparently describe and even explain my own images about an absolutely extreme niche topic more accurately and in greater detail than I can. In all details that I describe and explain, with no exception, plus even more on top of that.
If I take two days to describe an image in over 60,000 characters, it's still sub-standard in terms of quality, informativeness and level of detail. AI only takes a few seconds to generate a few hundred characters which apparently describe and explain the self-same image at a higher quality, more informatively and at a higher level of detail. It may even be able to not only identify where exactly an image was created, even if that place is only a few days old, but also explain that location to someone who doesn't know anything about virtual worlds within no more than 100 characters or so.
Whenever I have to describe an image, I always have to throw someone under the bus. I can't perfectly satisfy everyone all the same at the same time. My detailed image descriptions are too long for many people, be it people with a short attention span, be it people with little time. But if I shortened them dramatically, I'd have to cut information to the disadvantage of not only neurodiverse people who need things explained in great detail, but also blind or visually-impaired users who want to explore a new and previously unknown world through only that one image, just like sighted people can let their eyes wander around the image.
Apparently, AI is fully capable of actually perfectly satisfying everyone all the same at the same time because it can convey more information with only a few hundred characters.
Sure, AI makes mistakes. But apparently, AI still makes fewer mistakes than I do.
#AltText #AltTextMeta #CWAltTextMeta #ImageDescription #ImageDescriptions #ImageDescriptionMeta #CWImageDescriptionMeta #AI #AIVsHuman #HumanVsAI
Why descriptions for images from virtual worlds have to be so long and extensive
Last edited: Fri, 27 Sep 2024 11:35:50 +0200
jupiter_rowland@hub.netzgemeinde.eu
Whenever I describe a picture from a virtual world, the description grows far beyond everyone's wildest imaginations in size; here's why
I rarely post pictures from virtual worlds anymore. I'd really like to show them to Fediverse users, including those who know nothing about them. But I rarely do that anymore. Not in posts, not even in Hubzilla articles.
That's because pictures posted in the Fediverse need image descriptions. Useful and sufficiently informative image descriptions. And to my understanding, even Hubzilla articles are part of the Fediverse because they're part of Hubzilla. So the exact same rules apply to them that apply to posts. Including image descriptions being an absolute requirement.
And a useful and sufficiently informative image description for a picture from a virtual world has to be absolutely massive. In fact, it can't be done within Mastodon's limits. Not even the 1,500 characters offered for alt-text are enough. Not nearly.
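Just to illustrate the scale problem: a description that blows past these limits has to be chopped into a thread of post-sized pieces by hand. Here's a minimal Python sketch of that chopping. The limits are Mastodon's defaults; the sentence-based splitting heuristic is purely my own assumption for illustration, not anything any Fediverse software actually does.

```python
import re

MASTODON_ALT_TEXT_LIMIT = 1500  # Mastodon's alt-text limit
MASTODON_POST_LIMIT = 500       # Mastodon's default post limit

def split_description(text, limit=MASTODON_POST_LIMIT):
    """Split a long image description into chunks that each fit the
    given character limit, breaking at sentence ends where possible."""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        candidate = (current + " " + sentence).strip()
        if len(candidate) <= limit:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # A single sentence longer than the limit gets hard-wrapped.
        while len(sentence) > limit:
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        current = sentence
    if current:
        chunks.append(current)
    return chunks
```

A 40,000-character description would come out of this as a thread of eighty-odd posts, which gives you an idea of why Hubzilla articles without such limits are the more natural home for them.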
Over the last 12 or 13 months, I've developed my image-describing style, and it's still evolving. However, this also means my image descriptions get more and more detailed with more and more explanations, and so they tend to grow longer and longer.
My first attempt at writing a detailed, informative description for a picture from a virtual world was in November, 2022. It started at over 11,000 characters already and grew beyond 13,000 characters a bit later when I re-worked it and added a missing text transcript. Most recently, I've broken the 40,000-character barrier, also because I've raised my standards to describing pictures within pictures within a picture. I've taken over 13 hours to describe one single picture twice already.
I rarely get any feedback for my image descriptions. But I sometimes have to justify their length, especially to sighted Fediverse users who don't care for virtual worlds.
Sure, most people who come across my pictures don't care for virtual worlds at all. But most people who come across my pictures are fully sighted and don't require any image descriptions. It's still good manners to provide them.
And there may pretty well be people who are very excited about and interested in virtual worlds, especially if it's clear that these are actually existing, living, breathing virtual worlds and not some cryptobro's imagination. And they may want to know everything about these worlds. But they know nothing. They look at the pictures, but they can't figure out from looking at the pictures what these pictures show. Nothing that's in these pictures is really familiar to them.
So when describing a picture from a virtual world, one must never assume that anything in the picture is familiar to the on-looker. In most cases, it is not.
Also, one might say that only sighted people are interested in virtual worlds because virtual worlds are a very visual medium and next to impossible to navigate without eyesight. Still, blind or visually-impaired people may be just as fascinated by virtual worlds as sighted people. And they may be at least just as curious which means they may require even more description and explanation. They want to know what everything looks like, but since they can't see it for themselves, they have to be told.
All this is why pictures from virtual worlds require substantially more detailed and thus much, much longer descriptions than real-life photographs.
The medium

The wordiness of descriptions for images from virtual worlds starts with the medium. It's generally said that image descriptions must not start with "Picture of" or "Image of". Some even say that mentioning the medium, i.e. "Photograph of", is too much.
Unless it is not a digital photograph. And no, it isn't always a digital photograph.
It can just as well be a digitised analogue photograph, film grain and all. It can be a painting. It can be a sketch. It can be a graph. It can be a screenshot of a social media post. It can be a scanned newspaper page.
Or it can be a digital rendering.
Technically speaking, virtual world images are digital renderings. But just writing "digital rendering" isn't enough.
If I only wrote "digital rendering", people would think of spectacular, state-of-the-art, high-resolution digital art with ray-tracing and everything. Like stills from Cyberpunk 2077 for which the graphics settings were temporarily cranked up to levels at which the game becomes unplayable, just to show off. Or like promotional pictures from a Pixar film. Or like the stuff we did in POV-Ray back in the day, when the single-core CPU ran at full blast for half an hour, but the outcome was a gorgeous screen-sized 3-D picture.
But images from the virtual worlds I frequent are nothing like this. Ray-tracing isn't even an option. It's unavailable. It's technologically impossible. So there is no fancy ray-tracing with fully reflective surfaces and whatnot. But there are shaders with stuff like ambient occlusion.
So where other people may or may not write "photograph", I have to write something like "digital 3-D rendering created using shaders, but without ray-tracing".
The location

If you think that was wordy, think again. Mentioning the location is much worse. And mentioning the location is mandatory in this case.
I mean, it's considered good style to always write where a picture was taken unless, maybe, it was at someone's home, or the location of something is classified.
In real life, that's easy. And except for digital art, digitally generated graphs and pictures of text, almost all pictures in the Fediverse were taken in real life.
In real life, you can often get away with name-dropping. Most people know at least roughly what "Times Square" refers to. Or "Piccadilly Circus". Or "Monument Valley". Or "Stonehenge". There is no need to break down where these places are. It can be considered common knowledge.
In fact, you get away even more easily with name-dropping landmarks without telling where they are. White House. Empire State Building. Tower Bridge. Golden Gate Bridge. Mount Fuji. Eiffel Tower. Taj Mahal. Sydney Opera House which, admittedly, name-drops its rough location, just like the Hollywood Sign. All these are names that should ring a bell.
But you can't do that in virtual worlds. In no virtual world can you do that. Not even in Roblox which has twice as many users as Germany has citizens. Much less in worlds running on OpenSim, all of which combined are estimated to have fewer than 50,000 unique monthly users. Whatever "unique" means, considering that many users have more than one avatar in more than one of these worlds.
Such tiny user numbers mean that there are even more people who don't use these worlds, who therefore are completely unfamiliar with these worlds. Who, in fact, don't even know these worlds exist. I'm pretty sure there isn't a single paid Metaverse expert of any kind who has ever even heard of OpenSimulator. They know Horizons, they know The Sandbox, they know Decentraland, they know Rec Room, they know VRchat, they know Roblox and so forth, they may even be aware that Second Life is still around, but they've never in their lives heard of OpenSim. It's that obscure.
So imagine I just name-dropped...
What'd that tell you?
It'd tell you nothing. You wouldn't know what that is. I couldn't blame you. Right off the bat, I know only two other Fediverse users who definitely know that building because I was there with them. Maybe a few more have been there before. Definitely much fewer than 50. Likely fewer than 20. Out of millions.
Okay, let's add where it is.
Does that help?
No, it doesn't. If you don't know the Sendalonde Community Library, you don't know what and where Sendalonde is either. That place is only known for its spectacular library building.
And you've probably never heard of a real-life place with that name. Of course you haven't. That place isn't in real life.
So I'd have to add some more information.
What's the Discovery Grid? And what's a grid in this context, and why is it called a grid?
Well, then I have to get even wordier.
Nobody, absolutely nobody writes that much about a real-life location. Ever.
And still, while you know that I'm talking about a place in a virtual world and what that virtual world is based on, while this question is answered, it raises a new question: What is OpenSimulator?
I wouldn't blame you for asking that. Again, even Metaverse experts don't know OpenSimulator. I'm pretty sure that nobody in the Open Metaverse Interoperability Group, in the Open Metaverse Alliance and at the Open Metaverse Foundation has ever heard of OpenSim. The owners and operators of most existing virtual worlds have never heard of OpenSim except those of Second Life, Overte and maybe a few others. Most Second Life users, present and past, have never heard of OpenSim. Most users of most other virtual worlds, present and past, have never heard of OpenSim.
And billions of people out there believe that Zuckerberg has invented "The Metaverse", and that his virtual worlds are actually branded "Metaverse® ("Metaverse" is a registered trademark of Meta Platforms, Inc. All rights reserved.)" Hardly anyone knows that the term "metaverse" was coined by Neal Stephenson in his cyberpunk novel Snow Crash which, by the way, has inspired Philip Rosedale to create Second Life. And nobody knows that the term "metaverse" has been part of the regular OpenSim users' vocabulary since before 2010. Because nobody knows OpenSim.
And that's why I can't just name-drop "OpenSimulator" either. I have to explain even that.
That alone would be more than your typical cat picture alt-text.
But it'd create misconceptions, namely of OpenSim being another walled-garden, headset-only VR platform that has jumped upon the "Metaverse" bandwagon. Because that's what people know about virtual worlds, if anything. So that's what they automatically assume. And that's wrong.
I'd have to keep that from happening by telling people that OpenSim is as decentralised and federated as the Fediverse, only that it even predates Laconi.ca, not to mention Mastodon. Okay, and it only federates with itself and some of its own forks because OpenSim doesn't run on a standardised protocol, and nobody else has ever created anything compatible.
This is more than most alt-texts on Mastodon. Only this.
But it still leaves one question unanswered: "Discovery Grid? What's that? Why is it called a grid? What's a grid in this context?"
So I'd have to add yet another paragraph.
I'm well past 1,000 characters now. Other people paint entire pictures with words with that many characters. I need them only to explain where a picture was taken. But this should answer all immediate questions and make clear what kind of place the picture shows.
The main downside, apart from the length which for some Mastodon users is too long for a full image description already, is that this will be outdated, should the decision be made to move Sendalonde to another grid again.
And I haven't even started actually describing the image. Blind or visually-impaired users still don't know what it actually shows.
If this was a place in real life, I might get away with name-dropping the Sendalonde Community Library and briefly mention that there are some trees around it, and there's a body of water in the background. It'd be absolutely sufficient.
But such a virtual place is something that next to nobody is familiar with. Non-sighted people even less because they're even more unlikely to visit virtual worlds. That's a highly visual medium and usually not really inclusive for non-sighted users.
So if I only name-dropped the Sendalonde Community Library, mentioned where it is located and explained what OpenSim is, I wouldn't be done. There would be blind or visually-impaired people inquiring, "Okay, but what does it look like?" Ditto people with poor internet for whom the image doesn't load.
Sure they would. Because they honestly wouldn't know what it looks like. Because even the sighted users with poor internet have never seen it before. But they would want to know.
So I'd have to tell them. Not doing so would be openly ableist.
And no, one sentence isn't enough. This is a very large, highly complex, highly detailed building and not just a box with a doorway and a sign on it. Besides, remember that we're talking about a virtual world. Architecture in virtual worlds is not bound to the same limits and laws and standards and codes as in real life. Just about everything is possible. So absolutely nothing can ever be considered "a given" and therefore unnecessary to be mentioned.
Now, don't believe that blind or visually-impaired people will limit their "What does it look like?" to the centre-piece of the picture. If you mention something being there, they want to know what it looks like. Always. Regardless of whether or not they used to be sighted, they still don't know what whatever you've mentioned looks like specifically in a virtual world. And, again, it's likely that they don't know what it looks like at all.
Thus, if I mention it, I have to describe it. Always. All of it.
There are exactly two exceptions. One, if something is fully outside the borders of the image. Two, if something is fully covered up by something else. And I'm not even entirely sure about the latter case.
Sometimes, a visual description isn't even enough. Sometimes, I can mention that something is somewhere in the picture. I can describe what that something looks like in all details. But people still don't know what it is.
I can mention that there's an OpenSimWorld beacon standing somewhere. I can describe its looks with over 1,000 words and so much accuracy that an artist could make a fairly accurate drawing of it just from my description.
But people, the artist included, still would not know what an OpenSimWorld beacon is in the first place, nor what it's there for.
So I have to explain what an OpenSimWorld beacon is and what it does.
Before I can do that, I first have to explain what OpenSimWorld is. And that won't be possible with a short one-liner. OpenSimWorld is a very multi-purpose website. Explaining it will require a four-digit number of characters.
Only after I'm done explaining OpenSimWorld, I can start explaining the beacon. And the beacon is quite multi-functional itself. On top of that, I'll have to explain the concept of teleporting around in OpenSim, especially from grid to grid through the Hypergrid.
This is why I generally avoid having OSW beacons in my pictures.
Teleporters themselves aren't quite as bad, but they, too, require lots and lots of words. They have to be described. If there's a picture on them, maybe one that shows a preview of the chosen destination, that picture has to be described. All of a sudden, I have an entire second image to write a description for. And then I have to explain what that teleporter is, what it does, how it works, how it's operated. They don't know teleporters because there are no teleporters in real life.
At least I might not have to explain to them which destinations the teleporter can send an avatar to. The people who need all these descriptions and explanations won't have any use for this particular information because they don't even know the destinations in the first place. And describing and explaining each of these destinations, especially if they're over a hundred, might actually be beyond the scope of an image description, especially since these destinations usually aren't shown in the image itself.
Avatars, just like in-world objects and everything more or less similar, require detailed, extensive descriptions and explanations. People need to understand how avatars work in this kind of world, and of course, blind or visually-impaired people want to know what these avatars look like. Each and every last one of them. Again, how are they supposed to know otherwise?
I'm not quite sure whether or not it's smart to always give the names of all avatars in the image. Finding them out is easy, but when writing a description, especially for a party picture with dozens of avatars in it, associating the depictions of avatars in the image with identities has to be done right away, before even one of these avatars leaves the location.
One thing that needs to be explained right afterwards is how avatars are built. In the cases of Second Life and OpenSim, this means explaining that they usually aren't "monobloc" avatars that can't be modified in-world. Instead, they are modular, put together from lots of elements, usually starting with a mesh body that "replaces" the default system body normally rendered by the viewer, continuing with a skin texture, an eye texture and a shape with over 80 different parameters and ending with clothes and accessories. Of course, this requires an explanation on what "mesh" is, why it's special and when and why it was introduced.
OpenSim also supports script-controlled NPCs which require their own explanation, including that NPCs don't exist in Second Life, and how they work in OpenSim. Animesh exists both in Second Life and OpenSim and requires its own explanation again.
After these explanations, the actual visual description can begin. And it can and has to be every bit as extensive and detailed as for everything else in the picture.
The sex of an avatar does not have to be avoided in the description, at least not in Second Life and OpenSim. There, you basically only have two choices: masculine men and feminine women. Deviating from that is extremely difficult, so next to nobody does that. The few people who actually declare their avatars trans describe them as such in the profile. The only other exception are "women with a little extra". All other avatars can safely be assumed to be cis, and their visual sex can be used to describe them.
In virtual worlds, especially Second Life and OpenSim, there is no reason not to mention the skin tone either. A skin is just that: a skin. It can be replaced with just about any other skin on any avatar without changing anything else. It doesn't even have to be natural. It can be snow white, or it can be green, or it can be the grey of bare metal. In fact, in order to satisfy those who are really curious about virtual worlds, it's even necessary to mention if a skin is photo-realistic and has highlights and shades baked on.
Following that comes a description of what the avatar wears, including the hairstyle. This, too, should go into detail and mention things that are so common in real life that nobody would waste a thought about them, such as whether there are creases or crinkles on a piece of clothing at all, and if so, if they're actually part of the 3-D model or only painted on.
Needless to say that non-standard avatars, e.g. dragons, require the same amount of detail when describing them.
Now, only describing what an avatar looks like isn't enough. It's also necessary to describe what the avatar does, which means a detailed description of its posture and facial expression. Just about all human avatars in Second Life and OpenSim have support for facial expressions, even though they usually wear a neutral, non-descript one. But even that needs to be mentioned.
They say that if there's text somewhere in a picture, it has to be transcribed verbatim in the image description. However, there is no definite rule for text that is too small to be readable, partially obscured by something in front of it or only partially within the borders of the image.
Text not only appears in screenshots of social media posts, photographs of news articles and the like. It may appear in all kinds of photographs, and it may just as well appear in digital renderings from 3-D virtual worlds. It can be on posters, it can be on billboards, it can be on big and small signs, it can be on store marquees, it can be printed on people's clothes, it can be anywhere.
Again, the basic rule is: If there's text, it has to be transcribed.
Now you might say that transcribing illegible text is completely out of the question. It can't be read anyway, so it can't be transcribed either. Case closed.
Not so fast. It's true that this text can't be read in the picture. But that one picture is not necessarily the only source for the text in question. If the picture is a real-life photograph, the last resort would be to go back to where the picture was taken, look around more closely and transcribe the bits of text from there.
Granted, that's difficult if whatever a text was on is no longer there, e.g. if it was printed on a T-shirt. And yes, it's extra effort, too much effort if you're at home posting pictures which you've taken during your overseas vacation. Flying back there just to transcribe text is completely out of the question.
This is a non-issue for pictures from virtual worlds. In most cases, you can always go back to where you've taken a picture, take closer looks at signs and posters and so on, look behind trees or columns or whatever is standing in front of a sign and partly covering it and easily transcribe everything. Or you take the picture and write the description without even leaving first. You can stay there until you're done describing and transcribing everything.
At least Second Life and OpenSim also allow you to move your camera and therefore your vision independently from your avatar. That really makes it possible to take very close looks at just about everything, regardless of whether or not you can get close enough with your avatar.
There are only four cases in which in-world text does not have to be fully transcribed. One, it's incomplete in-world; in this case, transcribe what is there. Two, it's illegible in-world, for example due to a too low texture resolution or texture quality; that's bad luck. Three, it is fully obscured, either because it is fully covered by something else, or because it's on a surface completely facing away from the camera. And four, it isn't even within the borders of the image.
In all other cases, there is no reason not to transcribe text. The text being illegible in the picture isn't. In fact, that's rather a reason to transcribe it: Even sighted people need help figuring out what's written there. And people who are super-curious about virtual worlds and want to know everything about them will not stop at text.
Yeah, that's all tough, I know. And I can understand if you as the audience are trying to weasel out of having to read such a massive image description. You're trying to get me to not write that much. You're trying to find a situation in which writing so much is not justified, not necessary. Or better yet, enough situations that they become the majority, so that a full description ends up necessary only in extremely niche edge cases which you hope to never come across. You want to see that picture, but you want to see it without thousands or tens of thousands of words of description.
Let me tell you something: There is no such situation. There is no context in which such a huge image description wouldn't be necessary.
The picture could be part of a post of someone who has visited that place and wants to tell everyone about it. Even if the post itself has only got 200 characters.
The picture could be part of an announcement of an event that's planned to take place there.
The picture could be part of a post from that very event. Or about the event after it has happened.
The picture could be part of an interview with the owners.
The picture could be part of a post about famous locations in OpenSim.
The picture could be part of a post about the Discovery Grid.
The picture could be part of a post about OpenSim in general.
The picture could be part of a post or thread about 6 obscure virtual worlds that you've probably never heard of, and number 4 is really awesome.
The picture could be part of a post about virtual architecture.
The picture could be part of a post about the concept of virtual libraries or bookstores.
The picture could be part of a recommendation of cool OpenSim places to visit.
It doesn't matter. All these cases require the full image description with all its details. And so do all those which I haven't mentioned. There will always be someone coming across the post with the picture who needs the description.
See, I've learned something about the Fediverse. You can try to limit your target audience. But you can't limit your actual audience.
It'd be much easier for me if I could only post to people who know OpenSim and actually lock everyone else out. But I can't.
On the World-Wide Web, it's easy. If you write something niche, pretty much only people interested in that niche will see your content because only they will even look for content like yours. Content has to be actively dug out, but in doing so, you can pick what kind of content to dig out.
In the Fediverse, anyone will come across stuff that they know nothing about, whether they're interested in it or not. Even elaborate filtering of the personal timeline isn't fail-safe. And then there are local and federated timelines on which all kinds of stuff appear.
No matter how hard you try to only post to a specific audience, it is very likely that someone who knows nothing about your topic will see your post on the federated timeline on mastodon.social. It's rude to keep clueless casuals from following you, even though all they do is follow absolutely everyone because they need that background noise of uninteresting stuff on their personal timeline that they have on X due to The Algorithm. And it's impossible to keep people from boosting your posts to clueless casuals, whether these people are your own connections and familiar with your topic, or they've discovered your most recent post on their federated timeline.
You can't keep clueless casuals who need an extensive image description to understand your picture from coming across it. Neither can you keep blind or visually-impaired users who need an image description to even experience the picture in the first place from coming across it.
Neither, by the way, can you keep those who demand everyone always give a sufficient description for any image from coming across yours. And I'm pretty sure that some of them not only demand that from those whom they follow, but from those whose picture posts they come across on the local or federated timelines as well.
Sure, you can ignore them. You can block them. You can flip them the imaginary or actual bird. And then you can refuse to give a description altogether. Or you can put a short description into the alt-text which actually doesn't help at all. Sure, you can do that. But then you have to cope with having a Fediverse-wide reputation as an ableist swine.
The only alternative is to do it right and give those who need a sufficiently informative image description what they need. In the case of virtual worlds, as I've described, "sufficiently informative" starts at several thousand words.
And this is why pictures from virtual worlds always need extremely long image descriptions.
Set of hashtags to see if they're federated across the Fediverse:
#ImageDescription #ImageDescriptions #AltText #Accessibility #Inclusion #Inclusivity #OpenSim #OpenSimulator #SecondLife #Metaverse #VirtualWorlds
That's because pictures posted in the Fediverse need image descriptions. Useful and sufficiently informative image descriptions. And to my understanding, even Hubzilla articles are part of the Fediverse because they're part of Hubzilla. So the exact same rules apply to them that apply to posts. Including image descriptions being an absolute requirement.
And a useful and sufficiently informative image description for a picture from a virtual world has to be absolutely massive. In fact, it can't be done within Mastodon's limits. Not even the 1,500 characters offered for alt-text are enough. Not nearly.
Over the last 12 or 13 months, I've developed my image-describing style, and it's still evolving. However, this also means my image descriptions get more and more detailed with more and more explanations, and so they tend to grow longer and longer.
My first attempt at writing a detailed, informative description for a picture from a virtual world was in November, 2022. It started at over 11,000 characters already and grew beyond 13,000 characters a bit later when I re-worked it and added a missing text transcript. Most recently, I've broken the 40,000-character barrier, also because I've raised my standards to describing pictures within pictures within a picture. I've taken over 13 hours to describe one single picture twice already.
I rarely get any feedback for my image descriptions. But I sometimes have to justify their length, especially to sighted Fediverse users who don't care for virtual worlds.
Sure, most people who come across my pictures don't care for virtual worlds at all. But most people who come across my pictures are fully sighted and don't require any image descriptions. It's still good manners to provide them.
And there may pretty well be people who are very excited about and interested in virtual worlds, especially if it's clear that these are actually existing, living, breathing virtual worlds and not some cryptobro's imagination. And they may want to know everything about these worlds. But they know nothing. They look at the pictures, but they can't figure out from looking at the pictures what these pictures show. Nothing that's in these pictures is really familiar to them.
So when describing a picture from a virtual world, one must never assume that anything in the picture is familiar to the on-looker. In most cases, it is not.
Also, one might say that only sighted people are interested in virtual worlds because virtual worlds are a very visual medium and next to impossible to navigate without eyesight. Still, blind or visually-impaired people may be just as fascinated by virtual worlds as sighted people. And they may be at least just as curious which means they may require even more description and explanation. They want to know what everything looks like, but since they can't see it for themselves, they have to be told.
All this is why pictures from virtual worlds require substantially more detailed and thus much, much longer descriptions than real-life photographs.
The medium
The wordiness of descriptions for images from virtual worlds starts with the medium. It's generally said that image descriptions must not start with "Picture of" or "Image of". Some even say that mentioning the medium, i.e. "Photograph of", is too much.
Unless it isn't a digital photograph. And no, it isn't always a digital photograph.
It can just as well be a digitised analogue photograph, film grain and all. It can be a painting. It can be a sketch. It can be a graph. It can be a screenshot of a social media post. It can be a scanned newspaper page.
Or it can be a digital rendering.
Technically speaking, virtual world images are digital renderings. But just writing "digital rendering" isn't enough.
If I only wrote "digital rendering", people would think of spectacular, state-of-the-art, high-resolution digital art with ray-tracing and everything. Like stills from Cyberpunk 2077 for which the graphics settings were temporarily cranked up to levels at which the game becomes unplayable, just to show off. Or like promotional pictures from a Pixar film. Or like the stuff we did in POV-Ray back in the day, when the single-core CPU ran at full blast for half an hour, but the outcome was a gorgeous screen-sized 3-D picture.
But images from the virtual worlds I frequent are nothing like this. Ray-tracing isn't even an option. It's unavailable. It's technologically impossible. So there is no fancy ray-tracing with fully reflective surfaces and whatnot. But there are shaders with stuff like ambient occlusion.
So where other people may or may not write "photograph", I have to write something like "digital 3-D rendering created using shaders, but without ray-tracing".
The location
If you think that was wordy, think again. Mentioning the location is much worse. And mentioning the location is mandatory in this case.
I mean, it's considered good style to always write where a picture was taken unless, maybe, it was at someone's home, or the location of something is classified.
In real life, that's easy. And except for digital art, digitally generated graphs and pictures of text, almost all pictures in the Fediverse were taken in real life.
In real life, you can often get away with name-dropping. Most people know at least roughly what "Times Square" refers to. Or "Piccadilly Circus". Or "Monument Valley". Or "Stonehenge". There is no need to break down where these places are. It can be considered common knowledge.
In fact, you get away even more easily with name-dropping landmarks without telling where they are. White House. Empire State Building. Tower Bridge. Golden Gate Bridge. Mount Fuji. Eiffel Tower. Taj Mahal. Sydney Opera House which, admittedly, name-drops its rough location, just like the Hollywood Sign. All these are names that should ring a bell.
But you can't do that in virtual worlds. In no virtual world can you do that. Not even in Roblox which has twice as many users as Germany has citizens. Much less in worlds running on OpenSim, all of which combined are estimated to have fewer than 50,000 unique monthly users. Whatever "unique" means, considering that many users have more than one avatar in more than one of these worlds.
Such tiny user numbers mean that there are even more people who don't use these worlds, who therefore are completely unfamiliar with these worlds. Who, in fact, don't even know these worlds exist. I'm pretty sure there isn't a single paid Metaverse expert of any kind who has ever even heard of OpenSimulator. They know Horizons, they know The Sandbox, they know Decentraland, they know Rec Room, they know VRChat, they know Roblox and so forth, they may even be aware that Second Life is still around, but they've never in their lives heard of OpenSim. It's that obscure.
So imagine I just name-dropped...
[...] the Sendalonde Community Library.
What'd that tell you?
It'd tell you nothing. You wouldn't know what that is. I couldn't blame you. Right off the bat, I know only two other Fediverse users who definitely know that building because I was there with them. Maybe a few more have been there before. Definitely much fewer than 50. Likely fewer than 20. Out of millions.
Okay, let's add where it is.
[...] the Sendalonde Community Library in Sendalonde.
Does that help?
No, it doesn't. If you don't know the Sendalonde Community Library, you don't know what and where Sendalonde is either. That place is only known for its spectacular library building.
And you've probably never heard of a real-life place with that name. Of course you haven't. That place isn't in real life.
So I'd have to add some more information.
[...] the Sendalonde Community Library in Sendalonde in the Discovery Grid.
What's the Discovery Grid? And what's a grid in this context, and why is it called a grid?
Well, then I have to get even wordier.
[...] the Sendalonde Community Library in Sendalonde in the Discovery Grid which is a 3-D virtual world based on OpenSimulator.
Nobody, absolutely nobody writes that much about a real-life location. Ever.
And still, while you know that I'm talking about a place in a virtual world and what that virtual world is based on, while this question is answered, it raises a new question: What is OpenSimulator?
I wouldn't blame you for asking that. Again, even Metaverse experts don't know OpenSimulator. I'm pretty sure that nobody in the Open Metaverse Interoperability Group, in the Open Metaverse Alliance and at the Open Metaverse Foundation has ever heard of OpenSim. The owners and operators of most existing virtual worlds have never heard of OpenSim except those of Second Life, Overte and maybe a few others. Most Second Life users, present and past, have never heard of OpenSim. Most users of most other virtual worlds, present and past, have never heard of OpenSim.
And billions of people out there believe that Zuckerberg invented "The Metaverse", and that his virtual worlds are actually branded "Metaverse® ("Metaverse" is a registered trademark of Meta Platforms, Inc. All rights reserved.)" Hardly anyone knows that the term "metaverse" was coined by Neal Stephenson in his cyberpunk novel Snow Crash which, by the way, inspired Philip Rosedale to create Second Life. And nobody knows that the term "metaverse" has been part of regular OpenSim users' vocabulary since before 2010. Because nobody knows OpenSim.
And that's why I can't just name-drop "OpenSimulator" either. I have to explain even that.
[...] the Sendalonde Community Library in Sendalonde in the Discovery Grid which is a 3-D virtual world based on OpenSimulator.
OpenSimulator (official website and wiki), OpenSim in short, is a free and open-source platform for 3-D virtual worlds that uses largely the same technology as the commercial virtual world Second Life.
That alone would be longer than your typical cat-picture alt-text.
But it'd create misconceptions, namely of OpenSim being another walled-garden, headset-only VR platform that has jumped upon the "Metaverse" bandwagon. Because that's what people know about virtual worlds, if anything. So that's what they automatically assume. And that's wrong.
I'd have to keep that from happening by telling people that OpenSim is as decentralised and federated as the Fediverse, only that it even predates Laconi.ca, not to mention Mastodon. Okay, and it only federates with itself and some of its own forks because OpenSim doesn't run on a standardised protocol, and nobody else has ever created anything compatible.
[...] the Sendalonde Community Library in Sendalonde in the Discovery Grid which is a 3-D virtual world based on OpenSimulator.
OpenSimulator (official website and wiki), OpenSim in short, is a free and open-source platform for 3-D virtual worlds that uses largely the same technology as the commercial virtual world Second Life. It was launched as early as 2007, and most of it became a network of federated, interconnected worlds when the Hypergrid was introduced in 2008. It is accessed through client software running on desktop or laptop computers, so-called "viewers". It doesn't require a virtual reality headset, and it actually doesn't support virtual reality headsets.
This is longer than most alt-texts on Mastodon. Just this part alone.
But it still leaves one question unanswered: "Discovery Grid? What's that? Why is it called a grid? What's a grid in this context?"
So I'd have to add yet another paragraph.
[...] the Sendalonde Community Library in Sendalonde in the Discovery Grid which is a 3-D virtual world based on OpenSimulator.
OpenSimulator (official website and wiki), OpenSim in short, is a free and open-source platform for 3-D virtual worlds that uses largely the same technology as the commercial virtual world Second Life. It was launched as early as 2007, and most of it became a network of federated, interconnected worlds when the Hypergrid was introduced in 2008. It is accessed through client software running on desktop or laptop computers, so-called "viewers". It doesn't require a virtual reality headset, and it actually doesn't support virtual reality headsets.
Just like Second Life's virtual world, worlds based on OpenSim are referred to as "grids" because they are divided into square fields of 256 by 256 metres, so-called "regions". These regions can be empty and inaccessible, or there can be a "simulator" or "sim" running in them. Only these sims count as the actual land area of a grid. It is possible to both look into neighbouring sims and move your avatar across sim borders unless access limitations prevent this.
I'm well past 1,000 characters now. Other people paint entire pictures with words with that many characters. I need them only to explain where a picture was taken. But this should answer all immediate questions and make clear what kind of place the picture shows.
The main downside, apart from the length, which some Mastodon users already consider too long for a full image description, is that this will be outdated, should the decision be made to move Sendalonde to another grid again.
And I haven't even started actually describing the image. Blind or visually-impaired users still don't know what it actually shows.
The actual content of the image
If this were a place in real life, I might get away with name-dropping the Sendalonde Community Library and briefly mentioning that there are some trees around it, and there's a body of water in the background. It'd be absolutely sufficient.
But such a virtual place is something that next to nobody is familiar with. Non-sighted people even less because they're even more unlikely to visit virtual worlds. That's a highly visual medium and usually not really inclusive for non-sighted users.
So if I only name-dropped the Sendalonde Community Library, mentioned where it is located and explained what OpenSim is, I wouldn't be done. There would be blind or visually-impaired people inquiring, "Okay, but what does it look like?" Ditto people with poor internet for whom the image doesn't load.
Sure they would. Because they honestly wouldn't know what it looks like. Because even the sighted users with poor internet have never seen it before. But they would want to know.
So I'd have to tell them. Not doing so would be openly ableist.
And no, one sentence isn't enough. This is a very large, highly complex, highly detailed building and not just a box with a doorway and a sign on it. Besides, remember that we're talking about a virtual world. Architecture in virtual worlds is not bound to the same limits and laws and standards and codes as in real life. Just about everything is possible. So absolutely nothing can ever be considered "a given" and therefore unnecessary to be mentioned.
Now, don't believe that blind or visually-impaired people will limit their "What does it look like?" to the centre-piece of the picture. If you mention something being there, they want to know what it looks like. Always. Regardless of whether or not they used to be sighted, they still don't know what whatever you've mentioned looks like specifically in a virtual world. And, again, it's likely that they don't know what it looks like at all.
Thus, if I mention it, I have to describe it. Always. All of it.
There are exactly two exceptions. One, if something is fully outside the borders of the image. Two, if something is fully covered up by something else. And I'm not even entirely sure about the latter case.
Sometimes, a visual description isn't even enough. Sometimes, I can mention that something is somewhere in the picture. I can describe what that something looks like in all details. But people still don't know what it is.
I can mention that there's an OpenSimWorld beacon standing somewhere. I can describe its looks with over 1,000 words and so much accuracy that an artist could make a fairly accurate drawing of it just from my description.
But people, the artist included, still would not know what an OpenSimWorld beacon is in the first place, nor what it's there for.
So I have to explain what an OpenSimWorld beacon is and what it does.
Before I can do that, I first have to explain what OpenSimWorld is. And that won't be possible with a short one-liner. OpenSimWorld is a very multi-purpose website. Explaining it will require a four-digit number of characters.
Only after I'm done explaining OpenSimWorld can I start explaining the beacon. And the beacon is quite multi-functional itself. On top of that, I'll have to explain the concept of teleporting around in OpenSim, especially from grid to grid through the Hypergrid.
This is why I generally avoid having OSW beacons in my pictures.
Teleporters themselves aren't quite as bad, but they, too, require lots and lots of words. They have to be described. If there's a picture on them, maybe one that shows a preview of the chosen destination, that picture has to be described. All of a sudden, I have an entire second image to write a description for. And then I have to explain what that teleporter is, what it does, how it works, how it's operated. They don't know teleporters because there are no teleporters in real life.
At least I might not have to explain to them which destinations the teleporter can send an avatar to. The people who need all these descriptions and explanations won't have any use for this particular information because they don't even know the destinations in the first place. And describing and explaining each of these destinations, especially if they're over a hundred, might actually be beyond the scope of an image description, especially since these destinations usually aren't shown in the image itself.
Avatars
Just like in-world objects, avatars and everything more or less similar require detailed, extensive descriptions and explanations. People need to understand how avatars work in this kind of world, and of course, blind or visually-impaired people want to know what these avatars look like. Each and every last one of them. Again, how are they supposed to know otherwise?
I'm not quite sure whether or not it's smart to always give the names of all avatars in the image. It's easy to find them out, but when writing a description, especially for a party picture with dozens of avatars in it, associating the avatars in the image with their identities has to be done right away, before even one of these avatars leaves the location.
One thing that needs to be explained right afterwards is how avatars are built. In the cases of Second Life and OpenSim, this means explaining that they usually aren't "monobloc" avatars that can't be modified in-world. Instead, they are modular, put together from lots of elements, usually starting with a mesh body that "replaces" the default system body normally rendered by the viewer, continuing with a skin texture, an eye texture and a shape with over 80 different parameters and ending with clothes and accessories. Of course, this requires an explanation of what "mesh" is, why it's special and when and why it was introduced.
OpenSim also supports script-controlled NPCs which require their own explanation, including that NPCs don't exist in Second Life, and how they work in OpenSim. Animesh exists both in Second Life and OpenSim and requires its own explanation again.
After these explanations, the actual visual description can begin. And it can and has to be every bit as extensive and detailed as for everything else in the picture.
The sex of an avatar does not have to be avoided in the description, at least not in Second Life and OpenSim. There, you basically only have two choices: masculine men and feminine women. Deviating from that is extremely difficult, so next to nobody does it. The few people who actually declare their avatars trans describe them as such in their profiles. The only other exception is "women with a little extra". All other avatars can safely be assumed to be cis, and their visual sex can be used to describe them.
In virtual worlds, especially Second Life and OpenSim, there is no reason not to mention the skin tone either. A skin is just that: a skin. It can be replaced with just about any other skin on any avatar without changing anything else. It doesn't even have to be natural. It can be snow white, or it can be green, or it can be the grey of bare metal. In fact, in order to satisfy those who are really curious about virtual worlds, it's even necessary to mention if a skin is photo-realistic and has highlights and shades baked on.
Following that comes a description of what the avatar wears, including the hairstyle. This, too, should go into detail and mention things that are so common in real life that nobody would waste a thought about them, such as whether there are creases or crinkles on a piece of clothing at all, and if so, if they're actually part of the 3-D model or only painted on.
Needless to say that non-standard avatars, e.g. dragons, require the same amount of detail when describing them.
Now, only describing what an avatar looks like isn't enough. It's also necessary to describe what the avatar does, which means a detailed description of its posture and facial expression. Just about all human avatars in Second Life and OpenSim support facial expressions, even though they usually wear a neutral, non-descript one. But even that needs to be mentioned.
Text transcripts
They say that if there's text somewhere in a picture, it has to be transcribed verbatim in the image description. However, there is no definite rule for text that is too small to be readable, partially obscured by something in front of it or only partially within the borders of the image.
Text not only appears in screenshots of social media posts, photographs of news articles and the like. It may appear in all kinds of photographs, and it may just as well appear in digital renderings from 3-D virtual worlds. It can be on posters, it can be on billboards, it can be on big and small signs, it can be on store marquees, it can be printed on people's clothes, it can be anywhere.
Again, the basic rule is: If there's text, it has to be transcribed.
Now you might say that transcribing illegible text is completely out of the question. It can't be read anyway, so it can't be transcribed either. Case closed.
Not so fast. It's true that this text can't be read in the picture. But that one picture is not necessarily the only source for the text in question. If the picture is a real-life photograph, the last resort would be to go back to where the picture was taken, look around more closely and transcribe the bits of text from there.
Granted, that's difficult if whatever a text was on is no longer there, e.g. if it was printed on a T-shirt. And yes, that's extra effort, too much of an effort if you're at home posting pictures which you've taken during your overseas vacation. Flying back there just to transcribe text is completely out of the question.
This is a non-issue for pictures from virtual worlds. In most cases, you can simply go back to where you've taken a picture, take closer looks at signs and posters and so on, look behind trees or columns or whatever is standing in front of a sign and partly covering it, and easily transcribe everything. Or you take the picture and write the description without even leaving first. You can stay there until you're done describing and transcribing everything.
At least Second Life and OpenSim also allow you to move your camera and therefore your vision independently from your avatar. That really makes it possible to take very close looks at just about everything, regardless of whether or not you can get close enough with your avatar.
There are only four cases in which in-world text does not have to be fully transcribed. One, it's incomplete in-world; in this case, transcribe what is there. Two, it's illegible in-world, for example because the texture resolution or quality is too low; that's bad luck. Three, it is fully obscured, either because it is fully covered by something else, or because it's on a surface completely facing away from the camera. And four, it isn't even within the borders of the image.
In all other cases, there is no reason not to transcribe text. The text being illegible in the picture isn't one. In fact, that's rather a reason to transcribe it: Even sighted people need help figuring out what's written there. And people who are super-curious about virtual worlds and want to know everything about them will not stop at text.
But why?
Yeah, that's all tough, I know. And I can understand if you as the audience are trying to weasel out of having to read such a massive image description. You're trying to get me to not write that much. You're trying to find a situation in which writing so much is not justified, not necessary. Or better yet, enough situations that they become the majority, so that a full description ends up only necessary in extremely niche edge cases that you hope to never come across. You want to see that picture, but you want to see it without thousands or tens of thousands of words of description.
Let me tell you something: There is no such situation. There is no context in which such a huge image description wouldn't be necessary.
The picture could be part of a post of someone who has visited that place and wants to tell everyone about it. Even if the post itself has only got 200 characters.
The picture could be part of an announcement of an event that's planned to take place there.
The picture could be part of a post from that very event. Or about the event after it has happened.
The picture could be part of an interview with the owners.
The picture could be part of a post about famous locations in OpenSim.
The picture could be part of a post about the Discovery Grid.
The picture could be part of a post about OpenSim in general.
The picture could be part of a post or thread about 6 obscure virtual worlds that you've probably never heard of, and number 4 is really awesome.
The picture could be part of a post about virtual architecture.
The picture could be part of a post about the concept of virtual libraries or bookstores.
The picture could be part of a recommendation of cool OpenSim places to visit.
It doesn't matter. All these cases require the full image description with all its details. And so do all those which I haven't mentioned. There will always be someone coming across the post with the picture who needs the description.
See, I've learned something about the Fediverse. You can try to limit your target audience. But you can't limit your actual audience.
It'd be much easier for me if I could only post to people who know OpenSim and actually lock everyone else out. But I can't.
On the World-Wide Web, it's easy. If you write something niche, pretty much only people interested in that niche will see your content because only they will even look for content like yours. Content has to be actively dug out, but in doing so, you can pick what kind of content to dig out.
In the Fediverse, anyone will come across stuff that they know nothing about, whether they're interested in it or not. Even elaborate filtering of the personal timeline isn't fail-safe. And then there are local and federated timelines on which all kinds of stuff appear.
No matter how hard you try to only post to a specific audience, it is very likely that someone who knows nothing about your topic will see your post on the federated timeline on mastodon.social. It's rude to keep clueless casuals from following you, even though all they do is follow absolutely everyone because they need that background noise of uninteresting stuff on their personal timeline that they have on X due to The Algorithm. And it's impossible to keep people from boosting your posts to clueless casuals, whether these people are your own connections and familiar with your topic, or they've discovered your most recent post on their federated timeline.
You can't keep clueless casuals who need an extensive image description to understand your picture from coming across it. Neither can you keep blind or visually-impaired users who need an image description to even experience the picture in the first place from coming across it.
Neither, by the way, can you keep those who demand everyone always give a sufficient description for any image from coming across yours. And I'm pretty sure that some of them not only demand that from those whom they follow, but from those whose picture posts they come across on the local or federated timelines as well.
Sure, you can ignore them. You can block them. You can flip them the imaginary or actual bird. And then you can refuse to give a description altogether. Or you can put a short description into the alt-text which actually doesn't help at all. Sure, you can do that. But then you have to cope with having a Fediverse-wide reputation as an ableist swine.
The only alternative is to do it right and give those who need a sufficiently informative image description what they need. In the case of virtual worlds, as I've described, "sufficiently informative" starts at several thousand words.
And this is why pictures from virtual worlds always need extremely long image descriptions.
Set of hashtags to see if they're federated across the Fediverse:
#ImageDescription #ImageDescriptions #AltText #Accessibility #Inclusion #Inclusivity #OpenSim #OpenSimulator #SecondLife #Metaverse #VirtualWorlds