I wrote an article explaining the trend with browsers' add-on support and why I think that #Mozilla limiting users' choice on Android massively is part of that trend. The add-on ecosystem is degrading steadily, and I don't expect it to reverse course. https://palant.info/2020/08/31/a-grim-outlook-on-the-future-of-browser-add-ons/
@WPalant Scratch "add-ons" – the browser ecosystem is degrading o0
I just finished evaluating a shit-ton of tiny browsers for an embedded ARM device, but they are all dead.
"surf" was the first that looked remotely alive.
@bekopharm Yes, I mentioned that in the article. Add-ons are merely where this shows already, but it will affect everything sooner or later.
A modern web browser is an extremely fast moving target. It requires lots of resources to build one, which is why there are now only Gecko and Blink browser engines left. And even embedding an existing browser engine is lots of work merely to keep up with the development.
Now with Mozilla downsizing, Google is going to dictate the rules…
@WPalant Indeed a grim picture.
Btw wish you'd support Webmentions or Backfeeding.
@bekopharm The fun thing is: as your site rank rises, you get more inventive spam. The spammers will actually invest real thought into fooling you so that you leave the spam standing. 😀
My personal blog isn't too bad. But weeding out the spam on adblockplus.org (blog + forum) was the horror.
@WPalant I don't think it's about rank. My personal blog was drowning in spam years ago, so I gave up and disabled comments and pingbacks. Also, comment links are usually marked as user-generated content, so there is nothing in it for SEO. Anyway, I enabled this again with backfeed and mentions – without offering a comment form itself – and wow… there are real people! Meanwhile the anti-spam plugins are bored.
Also, imho(!) it's great being allowed to comment via a preferred platform without filling out more forms.
@bekopharm It absolutely is about rank. I mean, if your commenting system doesn't weed out bots – sure, you will drown in automated comments regardless of rank. But if you make filling out your forms impossible for bots, the high rank websites see the human spammers – those people who earn a few bucks by filling out comment forms all day. And I could see their referrer sites: they were primarily coming from Google searches, looking for popular websites with comment forms.
@bekopharm They also were very inventive. If you blocked their IP address, they simply reconnected to get another one. If that didn't work, they would use free proxies or one of a dozen VPN services. They would even switch to Tor. They would do whatever helped them place their content on your site.
@WPalant Probably true. My stuff is niche enough, so I only had to deal with bots and only very few desperate manual attempts on a forum of mine.
The few bots I got over social media were easily blocked so far. Here I can make use of the comfort offered by silos who usually remove spammers before I even have to lift a finger.
Say – if we'd had this conversation via your comment section, would this have changed things for you? I mean, this is also coming from some random domain on the 'verse.
@bekopharm Not quite sure what you mean – we shouldn't have conversations via the comments section, it isn't the right tool for that. Some blogs have very elaborate comment systems, but once people start chatting there it still gets very hard to keep an overview. So I didn't really try to make this scenario possible.
@WPalant Yeah – this happened accidentally 😜 Initially I was just commenting on the article, but via ActivityPub ;-)
Thing is that the initial comment on the article is not _at_ the article, because I refrained from entering data into a form. It was way more comfortable for me. And you read it and even saw a reason to reply. That's cool :)
What did you get?
A name. A message. An avatar. A source. All public metadata. It's now on your instance. And mine. You could have ignored or blocked it.
@WPalant So my point is: would this have changed _if_ this interaction had taken place between our blogs, without a 3rd party in the middle?
Because a Webmention is just that.
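For context, the protocol really is that small: a Webmention is one form-encoded POST saying "source links to target", sent to the endpoint the target page advertises. A minimal sketch of the sender side in Python – the endpoint and both URLs below are hypothetical, and endpoint discovery (via the `Link` header or `<link rel="webmention">`) is omitted:

```python
# Sketch of the sender side of the Webmention protocol: a single
# form-encoded POST with a source and a target URL.
# The endpoint and both URLs are hypothetical examples.
from urllib.parse import urlencode

def build_webmention_request(endpoint, source, target):
    """Assemble the POST a Webmention sender would issue."""
    body = urlencode({"source": source, "target": target})
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return endpoint, body, headers

url, body, headers = build_webmention_request(
    "https://example.com/webmention",              # discovered endpoint
    source="https://blog.example/my-reply",        # page containing the link
    target="https://example.com/mentioned-article" # page being mentioned
)
```

Actually sending it is then a plain HTTP POST (`urllib.request`, or `curl -d source=… -d target=…`); everything else – discovery, verification, moderation – happens on the receiving side.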
@bekopharm Right now this conversation isn't being displayed on my website. If it were the same with Webmention – sure, nothing would have changed. If however I decided to display Webmentions in my comments section, I'd make this a lucrative target for spammers. That's the difference here.
@WPalant I fail to see the difference from offering the same target via a comment form.
It gets moderated – as usual. Hopefully it's handled by nifty plugins already, and in the end there is human moderation.
Just like here.
The effort is higher, though. How much does it take to set up a new account on e.g. Twitter, or to post to a form, compared to using a domain as source that will be completely burned once it's found out as spam?
Isn't this exactly how ad blocking works? You kill the entire source?
@bekopharm Yes, I can try implementing this with the same premoderation mechanism as with the comments. The concern is really that this mechanism will be used to flood receivers with spam before it becomes widespread enough to be useful.
@WPalant Legit point. It's an underdog and mostly used by [real] people, usually only found via blogrolls and the like. Implementing this in Hugo is hard (but I know people who did it). Using WordPress myself, I get this out of the box, so it's cheap talk for me :-)
It's not really suited when the game is ranking and SEO. Too much effort. The only good thing is that it's the same for spam targets: there are more hops involved _before_ stuff even ends up in some moderation queue.
@bekopharm I looked through this now. It does seem simple at first glance. That is, until you start looking at parsing and processing the h-entry microformat. That's the point where I thought: “Maybe some other time…” The existing Python libraries don't quite take this problem off my hands.
@WPalant That bad? I've only fiddled with the PHP and Node parsers. Results are mixed, of course, depending on the source, but the absolute minimum – a name and text content – was usually not a problem.
Anyway, thanks for checking at least 👍
@bekopharm I mean, my first issue was “then SHOULD queue and process the request asynchronously, to prevent DoS attacks.” If I understand the point correctly, the concern is that Webmention could be used to facilitate DDoS attacks: instead of requesting a URL themselves, attackers would make Webmention-enabled blogs do it for them. This both hides the source and amplifies the attack. This is a very valid concern, I'm just not seeing how asynchronous processing is helping here.
@bekopharm And then there is this question of downloading and parsing remote content securely. Most trivial potential issue: remote side gives you a massive document back, exhausting all available memory. In general, I'm not sure whether any Python parsers are good for untrusted data.
Plus of course: parsing some dynamic HTML code, trying to make sense of whatever markup its author came up with – there are bound to be problematic edge conditions.
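The memory-exhaustion case at least is cheap to defend against: cap the download before handing anything to a parser. A sketch using only the stdlib – the 1 MiB cap and the timeout are arbitrary illustrative choices, not anything the Webmention spec mandates:

```python
# Defensive fetch of a remote source document before parsing:
# enforce a timeout and refuse oversized responses so a hostile
# page can't exhaust memory. The limits are illustrative choices.
import urllib.request

MAX_BYTES = 1 * 1024 * 1024  # 1 MiB cap, arbitrary

def fetch_limited(url, max_bytes=MAX_BYTES, timeout=10):
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        # read one byte past the cap so we can detect overflow
        data = resp.read(max_bytes + 1)
    if len(data) > max_bytes:
        raise ValueError("remote document too large, rejecting")
    return data
```

A real implementation would also want to check the `Content-Type`, follow only a bounded number of redirects, and refuse non-HTTP schemes and internal addresses (to avoid being used as a request proxy).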
@WPalant My own take on this is a drain bucket, but frankly, at the current usage this is more an esoteric question. In theory it is possible, as with any other exposed service, and yes, it may result in someone triggering a needless HTTP request … by doing a needless HTTP request. You could HTTP-bomb with any botnet way faster.
Is DOM parsing really such an issue for Python? Nobody is supposed to run foreign code. Usually it's a matter of walking some sort of XPath and rejecting the mention if no text is found.
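That "walk some XPath, reject if no text" approach can be sketched with the stdlib: `xml.etree.ElementTree` supports a small XPath subset, though it only accepts well-formed markup, so real-world HTML would first need an HTML parser in front of it. The `e-content` class is the microformats property for a post's body; note that the exact-match `@class` predicate used here would miss elements carrying several classes:

```python
# Sketch: pull the text out of an e-content element, reject the
# mention if nothing usable is found. ElementTree requires
# well-formed markup and supports only a limited XPath subset;
# the exact-match @class predicate misses multi-class elements.
import xml.etree.ElementTree as ET

def extract_comment_text(document):
    root = ET.fromstring(document)
    node = root.find(".//div[@class='e-content']")
    text = "".join(node.itertext()).strip() if node is not None else ""
    if not text:
        raise ValueError("no usable text content, rejecting mention")
    return text
```

This matches the moderation idea from the thread: a mention whose source yields no extractable text never even reaches the queue.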
@bekopharm The DDoS potential is documented under https://indieweb.org/DDOS. It also explains why they suggest queuing – randomized delays make this less feasible to use for a DDoS attack by distributing the load.
At some point this used to be a theoretical scenario for pingbacks as well. And then it suddenly stopped being theoretical.
HTML parsing is an issue with all programming languages. It merely isn't always documented as such.
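The queuing advice amounts to: accept the mention immediately, verify it later with random jitter, so a burst of spoofed mentions can't be turned into a synchronized request flood against the "source" victim. A sketch – the delay range is illustrative, and in real use `worker` would run in a background thread:

```python
# Sketch of queued, randomly delayed Webmention verification.
# Incoming mentions are only enqueued; a worker fetches the source
# later, with jitter, which de-synchronizes and spreads out the load.
# The 5-120 s delay range is an illustrative choice.
import queue
import random
import time

mention_queue = queue.Queue()

def worker(fetch, jitter=(5.0, 120.0)):
    """Drain the queue; `fetch(source, target)` does the verification."""
    while True:
        item = mention_queue.get()
        if item is None:                    # sentinel: shut the worker down
            mention_queue.task_done()
            return
        source, target = item
        time.sleep(random.uniform(*jitter)) # randomized delay against DDoS use
        fetch(source, target)               # check that source really links target
        mention_queue.task_done()
```

The receiver's HTTP handler then only does `mention_queue.put((source, target))` and returns 202 Accepted, which is also why the spec's "queue and process asynchronously" wording is framed as a DoS mitigation.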
@bekopharm Actually, https://docs.python.org/3/library/html.parser.html doesn't appear to have any potential issues. Reason is of course that it merely tokenizes HTML code and leaves interpreting structure to the user (doesn't keep anything in memory). Makes it harder to use of course.
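To illustrate that trade-off: `html.parser` hands you a token stream and nothing else, so even extracting the text of an `e-content` element means tracking nesting yourself. A rough sketch – the void-element handling is deliberately minimal, and real-world markup has many more edge cases:

```python
# Sketch: extract text from e-content elements with the stdlib
# tokenizer. HTMLParser emits events and keeps no tree in memory,
# so nesting depth has to be tracked by hand. Void elements (br,
# img, ...) have no closing tag and must not affect the depth.
from html.parser import HTMLParser

VOID = {"br", "hr", "img", "input", "link", "meta"}

class ContentExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.depth = 0      # >0 while inside an e-content element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth or "e-content" in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)

parser = ContentExtractor()
parser.feed('<article class="h-entry">'
            '<div class="e-content"><p>Hi <br>there!</p></div></article>')
text = "".join(parser.chunks)
```

Unlike a tree-building parser, nothing here grows with document size except the collected text itself, which combines nicely with a download cap.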
@WPalant All true and known. If someone is not willing to take this risk, you can outsource it – unlike a pingback – quite easily, though. You're in control of where the endpoint is supposed to reside, and it does not even have to be on the same domain or server. You stay in control of that.
This raises other concerns, of course. Privacy concerns, for example. I'd argue that we're only talking about already public metadata again.
Sorry, I'm no help with Python. I can read it, but that's it.