
For an explanation of how to use this page and request a bot, see the bot request instructions. Make sure your bot is not a duplicate or a commonly rejected idea.

Note: The Voting column is used after a vote on whether the bot should be used, and shows the number of people who approved or declined the idea. A formal vote is not always needed to accept or decline a bot, so it is not compulsory to fill this column in. The comments column briefly describes any important information about the bot at the moment, especially changes that need to be made.
Note: Remember to add a row to the table with your name and bot information when requesting a bot.

For archived discussions, see the discussion archive.

Requested Bots
Owner         Bot Name         Bot Use                           Current Status     Voting    Comments and Recommendations
12944qwerty   TBD              link fixes when archiving         Being discussed
Purin2022     AutoDeadlinker   Automatically add {{dead link}}   Being discussed

TBD Bot name

 Unresolved (see all...)

So, I've noticed that a bunch of links get broken when a talk page is archived, especially links to the CP, since so many people link to it. If a user follows one of these broken links, they have to guess which archive the section is now in, which is extremely hard.

  • Is your bot's task necessary? Yes
  • Is your bot's task difficult for a human to do? Yes
    • If not, is there any reason that a bot would be better than a human (e.g. too repetitive, humans are too unreliable)? It gets really repetitive, and humans can make mistakes. There are also too many links that could be broken.
  • Is your bot's task more than one time or quick use? Yes.
  • Is your bot's task pronouncedly different from other bots’ tasks and would adding its task to another bot be impractical? I do not see how another bot could add this functionality as this job is quite different.
  • Would your bot have a moderate request frequency? No, it would be once per month if we only do CP archiving fixes.
  • Would your bot help the Wiki as a whole, and not just a few specific users or articles? Yes; whenever I go through talk pages, I find links to archived sections that are broken, and they are extremely hard to track down, especially for the CP.
  • Would your bot be almost foolproof against causing harm if something goes wrong? I doubt it. Worst thing possible would probably be linking to a wrong section lol.
  • If your bot is designed to fix a problem, is it a significant problem that happens repeatedly? Yes.
  • Would your bot follow the wiki guidelines? Yes.

I also think that people could have a signup sheet for their archives so that their links don't get broken. This could increase frequency of use.
12944qwerty Logo.png 12944qwerty  Talk  Contribs  Scratch  14:11, 22 September 2020 (UTC)

Hmm, OK, but is it really that necessary? First, how can you get the exact header? With the search system on Special:Search? Also, how can you find and parse all the broken links that pointed to the CP?
Ahmetlii logo.gif ahmetlii  Talk  Contributions  Directory 
14:18, 22 September 2020 (UTC)
How often are archives done (user talk and CP)? Very rarely. There are also not many broken links, so I see no use. Also, the answer to all questions should be "yes", and you answered no to one, so again, I don't see a purpose. I could fix all broken links in under an hour if I knew where they all were, so I don't see the difficulty with a bot doing it. edit conflict with above lol

garnetluvcookie (talk | contribs) 14:23, 22 September 2020 (UTC)
Wait what, glc, you put part of my text in your message
@ahmetlii What do you mean by header? The header doesn't change, and the parsing doesn't change either so it would be the same. I would also just parse through the CP, and look for each section header and search through the pages (yes, with Special:Search).
@gcl I've seen multiple broken links, and I doubt you'd be able to fix so many in under an hour. The key thing is that you don't know where they all are. Finding them would take a really long time for a human; a bot could speed the process up. I also don't need to answer yes to all of them (although it's recommended), I just need to provide a strong argument for why my bot is useful.
12944qwerty Logo.png 12944qwerty  Talk  Contribs  Scratch  14:52, 22 September 2020 (UTC)
bruh I'm not gcl
I could. Doesn't mean that I would, but it is possible.

garnetluvcookie (talk | contribs) 17:43, 22 September 2020 (UTC)
Oops, sorry, fixed
Exactly, you wouldn't. It's too much work for a normal human to be able to do and that's why I proposed this bot.
12944qwerty Logo.png 12944qwerty  Talk  Contribs  Scratch  17:52, 22 September 2020 (UTC)
I believe this bot would be useful. For the bot name, I suggest ArchiveBot. Also, this is a reply to both 12944qwerty and garnetluvcookie: the answer there should've been yes, it is a moderate request frequency. Apparently 12944qwerty doesn't understand what moderate means in that context :P
VFDan.png Luvexina  Talk  Contribs  On Scratch  00:52, 26 September 2020 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── I understood moderate as every couple days....
12944qwerty Logo.png 12944qwerty  Talk  Contribs  Scratch  21:47, 26 September 2020 (UTC)

Yes, every couple days is moderate, but so is once a week. A bot that would fail that bullet would be one that requests every second.
-unsigned comment by VFDan (talk | contribs)
Oh ok....
ArchiveBot doesn't seem to express the job that this bot would do. ArchiveBot seems to say that it 'archives' pages... but that is not what it is meant to do. It's meant to fix links, once the pages are archived.
12944qwerty Logo.png 12944qwerty  Talk  Contribs  Scratch  18:14, 30 September 2020 (UTC)
bump
I figured that this doesn't necessarily have to be only for archiving, though... it can also cover pages that are moved without leaving a redirect.
I was thinking that there could be a sign-up sheet of pages and subpages for the bot to check, to see if any links leading to those pages should be fixed or not.
12944qwerty Logo.png 12944qwerty  Talk  Contribs  Scratch  17:27, 6 October 2020 (UTC)
You can avoid links breaking with Special:PermanentLink. There's no need for a bot, especially as retargeting links is not something that should be done automatically; it requires human discretion.
Naleksuh (talk | contribs) 03:23, 23 October 2020 (UTC)
Very Late reply
Yes, but I wouldn't think that forcing every link to be a permanent link would be a great choice. At all.
And don't all the bots require human discretion? Just to make sure that the bot doesn't break anything? We have a period of time for testing the bot for this purpose as well.
12944qwerty Logo.png 12944qwerty  Talk  Contribs  Scratch  14:09, 19 April 2021 (UTC)
One concern I have is that a significant portion of links that break are in userspace or user-talkspace. Bots must not edit userspace (unless excepted) or other users' messages. Yet if this bot does not, that leaves a rather significant portion of broken links unfixed, defeating the purpose of the bot. Unless you have some workaround for that, I'm afraid this idea is good but ineffective.
Kenny2scratch logo.jpg kenny2scratch  Talk  Contribs  Directory 
01:48, 17 June 2021 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── We could potentially make an exception to userspace where if it changes the target of the link itself but not the underlying link (and this is done by a bot) that's acceptable. For example, if I link to [[User talk:Jvvg#Section Title]] we could allow it to be changed to [[User talk:Jvvg/Archive 1#Section Title|User talk:Jvvg#Section Title]] (and if there is already custom link text different from the page name, then of course just keep the link text the same).
jvvg (talk | contribs) 02:06, 17 June 2021 (UTC)
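A rough illustration of what that retargeting could look like in code, assuming Python and the mwparserfromhell library (neither of which is decided anywhere in this discussion, and retarget_link is a made-up helper name): the link's target changes while the text readers see stays the same.

import mwparserfromhell

def retarget_link(wikitext, old_target, new_target):
    """Point links at old_target to new_target without changing what readers see."""
    code = mwparserfromhell.parse(wikitext)
    for link in code.filter_wikilinks():
        if str(link.title).strip() == old_target:
            if link.text is None:
                # No custom link text: show the original target so the message looks unchanged.
                link.text = old_target
            link.title = new_target
    return str(code)

# [[User talk:Jvvg#Section Title]] becomes
# [[User talk:Jvvg/Archive 1#Section Title|User talk:Jvvg#Section Title]]
print(retarget_link("See [[User talk:Jvvg#Section Title]].",
                    "User talk:Jvvg#Section Title",
                    "User talk:Jvvg/Archive 1#Section Title"))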

One thing I think we may have overlooked is that not everyone necessarily moves pages to archives. Some may copy and paste everything instead. Although it probably doesn't happen often, it's definitely possible.
I agree with both of you, though. We should definitely have a system that lets users say yes or no to editing their userspace, but link fixing is something that could reasonably be exempted from the rules, since all it does is fix links.
12944qwerty Logo.png 12944qwerty  Talk  Contribs  Scratch  13:55, 16 September 2021 (UTC)
I think this bot could watch its talk page, so you could send it a message saying "fixlinks [url or wiki page path]" and it would do it automatically.
Ideapad-320 (talk | contribs) 12:54, 4 November 2021 (UTC)
It's been quite some time since I made this request and there's still no consensus :(
I have come up with a name: LinkFixer or chad. Not the best name, but I suck at coming up with names in general. What do y'all think?
I'm just trying to bump this request so that we can hopefully come to some consensus :D
12944qwerty Logo.png 12944qwerty  Talk  Contribs  Scratch  21:29, 3 April 2022 (UTC)
Good name! You could change the name up above to say possibly LinkFixer. You should go ahead and register that Scratch account so you can have the name. But there is an issue: differently formatted archive titles. There could be a template added, called archive format. It would be an invisible template whose first parameter is the archive format (with the number replaced by a #), and the bot could only resolve issues on talk pages with that template. Here is pseudocode for how it could work (see the sketch after this post):
For each page, use "what links here" to find all links to talk pages that point to a missing section and are not in a database of failed lookups (so it does not keep retrying them).
Check if the target page has the archive format template; if it does not, skip it. Possibly keep it in the database of failures for some time, like a month, before retrying, or have some way to let the bot know that you have added the template.
Check every archive for the section. Choose the first match with the highest heading level; prefer an older match at a higher heading level over a newer match at a lower one.
If it can't find one, permanently add it to the database of failures.
If there is no auto-scanning, only finding a certain link on request, then a lot of that logic could be omitted.
If it were to help with moves, it could use "what links here" and retarget them all to the new location.
You could recreate this bot with a template that takes the prefix before the archive number, all of the archives to search separated by commas, and the section to search for, but that would be rather clunky, impractical, and slow for the parser.
Another small issue is that to stop this bot from editing pages, NOBOTS must be added, but that stops WikiMonitor's auto-sign. We could solve this by having a way to configure NOBOTS.
I am also pretty sure that you can scan every week or every few days without API spam. But you really just need to scan everything at the beginning, and then do the move work when requested. The initial "fix everything that was not fixed before" pass would be most of the overhead.
29590234_18x18.png Ideapad-320 | Talk | Contribs | Scratch 18:16, 27 April 2022 (UTC)
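To make the pseudocode above a bit more concrete, here is a rough Python sketch of the archive-search step; the API endpoint, the /Archive N naming, and every function name here are assumptions rather than a finished design.

import requests

API = "https://en.scratch-wiki.info/w/api.php"  # assumed endpoint

def find_section_in_archives(base_page, section_title, max_archives=20):
    """Look through /Archive N subpages for a section, preferring higher-level
    headings and, among those, older archives."""
    best = None
    for n in range(1, max_archives + 1):
        page = f"{base_page}/Archive {n}"  # assumes the most common archive format
        data = requests.get(API, params={"action": "parse", "page": page,
                                         "prop": "sections", "format": "json"}).json()
        if "error" in data:
            break  # no more archives with this name pattern
        for sec in data["parse"]["sections"]:
            if sec["line"] == section_title:
                # A smaller "level" number means a higher-level heading (== vs ===).
                candidate = (int(sec["level"]), n, page, sec["anchor"])
                if best is None or candidate < best:
                    best = candidate
    if best is None:
        return None  # would go into the database of permanent failures
    return best[2], best[3]  # (archive page, section anchor)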
We'd scan the recent changes like several of the existing bots.
Also, I'm confused as to what you mean by the first issue you mentioned: "differently formatted archive titles". What do you mean?
12944qwerty Logo.png 12944qwerty  Talk  Contribs  Scratch  20:56, 16 October 2022 (UTC)
What I mean is different "formats" of archive titles. For example, archive 1 might be at /Archive 1 or /A1 or /Archive-1 or /Archive1 or something else.
-unsigned comment by Ideapad-320 (talk | contribs) at 14:34, 17 October 2022 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── Yeah. We can set up some kind of settings database/page for the bot to figure out the archive format. Though I think we could definitely default to the most common one, like /Archive 1.
12944qwerty Logo.png 12944qwerty  Talk  Contribs  Scratch  02:14, 21 October 2022 (UTC)

AutoDeadlinker

 Unresolved (see all...)

The bot will add {{deadlink}} next to a link, as simple as that. I'm pretty sure that there aren't any active bots that currently do that. The bot will follow this procedure (see the sketch after the list):

  1. Upon a manual command, the bot visits a number of articles (defined by the command).
  2. The bot checks all external links (including Scratch links) in those articles to see if they return 404, 410, or other "gone" codes.
  3. If the target page does return a gone code, the bot adds {{deadlink}} next to the link.
  4. It submits the edit if there are any changes.
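
A minimal sketch of steps 2 and 3 in Python, which Purin2022 later mentions as the bot's language; the function name and the exact set of "gone" codes here are assumptions:

import requests

GONE_CODES = {404, 410}  # "gone" statuses the bot would treat as dead

def check_links(urls):
    """Return the URLs that look dead, so {{deadlink}} can be added next to them."""
    dead = []
    for url in urls:
        try:
            # HEAD keeps the request cheap; some servers may need a GET fallback.
            resp = requests.head(url, allow_redirects=True, timeout=10)
            if resp.status_code in GONE_CODES:
                dead.append(url)
        except requests.RequestException:
            # Connection errors are skipped here; those are better left to a human.
            pass
    return dead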

Here are the answers for the things to consider in the bot page:

  • Is your bot's task necessary? Yes, no one wants to click on a link only to find a deadlink.
  • Is your bot's task difficult for a human to do? Yes, no editor will check every single link in an article.
    • If not, is there any reason that a bot would be better than a human (e.g. too repetitive, humans are too unreliable)? Yes, too repetitive.
  • Is your bot's task more than one time or quick use? Yes, links constantly get deleted.
  • Is your bot's task pronouncedly different from other bots’ tasks and would adding its task to another bot be impractical? It might be possible for the function to be added to TemplatesFTW, since it's a multi-purpose bot and it does add {{inaccurate}} and {{external programs}}, and the owner has suggested that here. However, since that bot doesn't actually check the websites themselves, I'd say this task is different enough.
  • Would your bot have a moderate request frequency? Yes for reading, but no for writing. Dead links aren't that frequent, but links need constant checking to make sure they aren't dead. It would also be about one check per day for the bot.
  • Would your bot help the Wiki as a whole, and not just a few specific users or articles? Yes; any article that contains links can end up with dead ones.
  • Would your bot be almost foolproof against causing harm if something goes wrong? The worst thing that could happen is that {{deadlink}} gets added next to every link, which would not be too awful; the article itself would still be readable.
  • If your bot is designed to fix a problem, is it a significant problem that happens repeatedly? No, I wouldn't consider that too significant, but it's a problem that happens repeatedly.
  • Would your bot follow the wiki guidelines? Yes, I don't think I need to explain that.
  • Does your bot serve a purpose that is not commonly rejected? Yes, and again, I don't think I need to explain that.

Also, I will try to come up with a better name for the bot; the current name can definitely be improved.

Thank you!
Purin2022 (talk | contribs) 19:58, 25 November 2023 (UTC)

It'd be nice to attempt to restore a link rather than just marking it as dead. But this could be useful either way.
Mrsrec (talk | contribs) 03:32, 20 December 2023 (UTC)
A possible issue with this bot is pages that intentionally link to error pages. For example, the page on 404 Errors has a link to a 404 error page (it DOES return a status code of 404). There are some potential solutions that don't go full nuclear and mark those pages as NoBots, which would stop other bots from working on them. That convention would also stop the bot from reading NoBots pages to log findings elsewhere, which I'm pretty sure is allowed. This could just be me being a perfectionist.
One solution is a list of wiki pages it ignores, and/or a list of external links it treats as OK even if they return a 404 or 403, perhaps like WikiMonitor's config pages. There is a risk of vandalism if someone decides to delete the pages with the lists of links; perhaps they could be protected so only Wikians can edit them? Another solution is to use an HTML comment, a template, or a parameter in the deadlink template to mark a specific link as "allowed to be dead/intentionally dead" on the specific page it is used on, right next to the link itself.
You might want to automate it and make it run periodically. Just running it when wiki pages change is not enough, because links can go away without the wiki page changing. I would also cache the status codes in case multiple articles have the same external link, to avoid accidentally DoSing Scratch, but overall it seems pretty good.
29590234_18x18.png Ideapad-320 | Talk | Contribs | Scratch 17:00, 12 April 2024 (UTC)
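A tiny sketch of the ignore-list idea, assuming the list lives on a protected wiki page with one URL per line; the page name, endpoint, and function name are made up for illustration:

import requests

API = "https://en.scratch-wiki.info/w/api.php"  # assumed endpoint

def load_ignored_urls(page="User:AutoDeadlinker/Ignored links"):
    """Fetch the config page and return the set of URLs the bot must never tag."""
    data = requests.get(API, params={"action": "parse", "page": page,
                                     "prop": "wikitext", "format": "json"}).json()
    wikitext = data["parse"]["wikitext"]["*"]
    return {line.strip() for line in wikitext.splitlines()
            if line.strip().startswith("http")}
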
@Mrsrec — Apologies for the very late reply. This bot focuses a bit more on links outside of the wiki; restoring a link would require finding new links, which would require some sort of high-tech AI (from a library) if we're going to automate it. Adding an AI would make the bot many times as dangerous (as it'd probably fail to do its job), even if it were implemented bug-free. So it's probably not the best idea ever.
@Ideapad-320 — I didn't actually think that intentional 404 (or other error code) links would exist when I originally thought about this. One solution (along the lines of the solutions you suggested) is to make some sort of template (like {{No Dead Link|<Link Target>|<Shown Name>}}) that acts like a normal link except that the bot ignores it. That comes with its own issues (newcomers probably being confused by it, it doesn't work well with {{plain link}}, etc.), but there's probably a viable solution for most of those problems (documenting the template in the relevant help pages, adding parameters to the templates, etc.).
Caching is something that I'd have to implement here to avoid hundreds of API requests. I'm going to tweak how it works to make it more caching-friendly: make a big index that maps each external link to the pages that use it, updated daily (from Recent Changes). Each time the bot runs, it would select the links that haven't been checked for the longest time and check whether they're gone. If so, it goes to every page that links to them and adds {{Dead link}} next to the link. Considering that comments are the most likely to become inaccessible, it's probably worth checking them at a higher frequency. If it works well then it could be automatic-ish, but otherwise it'd be manually triggered.
The bot could potentially go the other way around as well. I'm not really sure how likely that is, but having it just in case would be nice.
Purin2022 Mini User Icon.png Purin2022 | 💬Talk | 📝Contribs | 🐱Scratch 18:30, 12 April 2024 (UTC)
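A rough sketch of the caching scheme described above, assuming the "big list" is kept as a simple JSON file; the file layout and all names are assumptions:

import json
import time

def pick_links_to_check(index_path, batch_size=50):
    """Return the external links that have gone unchecked the longest.
    The index maps each URL to {"pages": [...], "last_checked": <unix time>}."""
    with open(index_path) as f:
        index = json.load(f)
    stalest_first = sorted(index, key=lambda url: index[url].get("last_checked", 0))
    return stalest_first[:batch_size]

def mark_checked(index_path, url):
    """Record that a URL was just checked, so it moves to the back of the queue."""
    with open(index_path) as f:
        index = json.load(f)
    index[url]["last_checked"] = time.time()
    with open(index_path, "w") as f:
        json.dump(index, f)
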
That seems good. For the auto-fixes, it might be helpful to auto-archive comments on the Wayback Machine before they go away, and then auto-change the links, but I don't know if the Wayback Machine has an API you can use. You could search the Wayback Machine for the newest working page, but I don't know if you can automatically check whether an archive request succeeded. One issue with those: any of the automatic Wayback Machine solutions could act as an accidental auto-uncensor (not good). For caching, I don't just mean wiki pages; I also mean pages that more than one wiki page links to (for example, if page A and page B link to scratch.mit.edu, you don't need to check whether scratch.mit.edu is up twice). Sorry if I was unclear.
Also, it could replace links based on response codes like 101 Switching Protocols and 3XX redirects (not 307 Temporary Redirect), and do other things based on the response code. It should never try to find replacements for 401 and 403 (Unauthorized and Forbidden); see the list of HTTP status codes on Wikipedia. 404 means the page might come back. It also could be good to auto-remove old dead link notices if the link comes back. For example, the List of Misconceptions about Scratch contains a link to the project Weekend, which sometimes goes away temporarily. I think you should get other people's opinions, too.
29590234_18x18.png Ideapad-320 | Talk | Contribs | Scratch 19:37, 12 April 2024 (UTC)
@Ideapad-320 — The Wayback Machine does apparently have an API! (At least according to a Google search.) However, there are some problems: the first one is that the WBM is somehow blocked on my network for adult content, so the API (probably) doesn't work... I suppose that's fixable. The second problem is that it's a potential overscope, but I guess the bot could be a bit more useful this way...? That's a whole separate feature though, so it would require some separate implementation. (And of course, a new name would have to be decided if this feature is going to be implemented.)
The caching in my previous post does cover external links as well (including those on the main site). Sorry if that wasn't clear either. Also, it's an excellent idea to make the bot remove {{dead link}} if necessary, since that can probably integrate with the original purpose.
It's interesting to think about which codes should be marked with {{dead link}}, which should get a replacement, which should get no action, and which should get other actions. 404 is the most interesting — 90% of the dead links the bot encounters will be 404s. Unfortunately, 404 doesn't imply that the content will never be accessible again. Given these circumstances, I'm not really sure how it could handle edge cases like Weekend efficiently — maybe that would be a Wikian-protected configuration list of all the links that the bot will not act on, on any page. For most other codes, what to do is fairly obvious.
More opinions are almost always better, so I will ping a few editors: @Kenny2scratch, Han614698, jvvg. Thank you for your suggestions!
Purin2022 Mini User Icon.png Purin2022 | 💬Talk | 📝Contribs | 🐱Scratch 20:33, 12 April 2024 (UTC)
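One way to write down the per-code behaviour being discussed here, as a sketch; the action names and groupings are only a suggestion, not agreed behaviour:

def action_for_status(code):
    """Decide what the bot should do for a given HTTP status code."""
    if code in (404, 410):
        return "tag"        # add {{dead link}}; 404s may come back, so keep rechecking
    if 300 <= code < 400 and code != 307:
        return "retarget"   # permanent-ish redirects could be rewritten to the new URL
    if code in (401, 403):
        return "skip"       # Unauthorized/Forbidden: never try to find replacements
    if 200 <= code < 300:
        return "ok"         # link works; remove an existing {{dead link}} if present
    return "manual"         # anything else is left for a human to judge
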
Please don't ping people, use the Discussion Invitation System. I will go ahead and invite some people.
For the Wayback Machine API being blocked, try changing your DNS (1.1.1.1 should be fine) or use a network at home. It also might be a browser extension; in that case the API would work fine. For the 404 edge cases, I don't know if there are enough of them to justify automatically searching for edge cases, or if just having a human mark the important ones is fine. Perhaps it could be added after the first version of the bot if it is needed. Also, I don't think you would have to rename the bot if it uses the Wayback Machine.
29590234_18x18.png Ideapad-320 | Talk | Contribs | Scratch 21:25, 12 April 2024 (UTC)

────────────────────────────────────────────────────────────────────────────────────────────────────@Ideapad-320 — Apologies for pinging people rather than using DI.

Unfortunately, neither changing my DNS nor disabling all browser extensions worked. I still thought that the API itself might work. Reading a StackOverflow page, I found where the API is located, so I wrote a simple script in Python (which is what the bot will be coded in):

import requests

# Ask the Wayback Machine availability API for a snapshot of scratch.mit.edu near the given date
response = requests.get('http://archive.org/wayback/available?url=scratch.mit.edu&timestamp=20231214')
print(response.status_code)

The console output 200. So, changing response.status_code to response.text (yes, I probably should have written it to a file at this point), it returned some 2,000 lines worth of HTML. But when I copied it into an HTML file, well, it was basically the page that says I'm trying to access adult content, specifically https://www.three.co.uk/support/internet-and-apps/accessing-and-blocking-adult-content. Accessing the Scratch API does not lead to this page. This feature might get postponed to another version of the bot.

Back to the idea: I might just hard-code that specific link to not be marked as dead. This is bad practice, but it avoids overscoping. Also, I thought that the name would have to be changed if the bot used the Web Archive, as it would then also be archiving comments instead of just marking links as dead.
Purin2022 Mini User Icon.png Purin2022 | 💬Talk | 📝Contribs | 🐱Scratch 10:28, 15 April 2024 (UTC)
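For reference, when the availability API is reachable it returns JSON rather than HTML. A small sketch of reading that response, based on the API's documented archived_snapshots format (error handling omitted):

import requests

resp = requests.get("http://archive.org/wayback/available",
                    params={"url": "scratch.mit.edu", "timestamp": "20231214"})
snapshot = resp.json().get("archived_snapshots", {}).get("closest")
if snapshot and snapshot.get("available"):
    print(snapshot["url"], snapshot["timestamp"])  # closest archived copy
else:
    print("No archived snapshot found")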

I mean, I really don't know. If we try to fix dead links with Internet Archive, it's probably possible that someone mistyped something (like scratch.mit.euu) and you just might end up on an inappropriate webpage.
Lovecodeabc Links: talk (new topic) | contribs (815) | directory 13:05, 15 April 2024 (UTC)
@Purin2022 — To be 100% honest, I don't think this is a huge issue in articles, more in refs. I also personally enjoy going through to find dead links, and to be perfectly honest, I think it's better left to users to judge what's a dead link (especially since we can then fix it!)
Han614698 H Logo.png han614698 talkcontribs (2,392)profile 00:43, 16 April 2024 (UTC)
@Lovecodeabc Mistyping isn't even an issue, because these links would be in refs. Even if someone manually typed one in, the only TLD close enough to EDU (reachable by a single-letter insertion, deletion, or replacement) to be an issue is EU. Also, only colleges can run EDU websites, and MIT would not allow a fake Scratch subdomain. So the only way to get a fake Scratch is if someone got scratch.mit.eu. Also, Purin's ISP blocks the Web Archive, so that can't be in the functionality, at least not at first (I assume Purin does not have the ability/access to disable it) until that issue is resolved.
That said, I agree that this bot is still a good idea, even if there is no auto-archiving functionality (like the original request). However, @Purin2022, this is your request for a bot that you would make, so you can decide whether you want to keep this request or not (in the latter case you should withdraw it, not just abandon it). I disagree that this isn't a big problem, because going through thousands of pages is tedious; even just finding the potential dead links would be a huge help. Also, for links that are dead from time to time, would it be that big of a deal if they got incorrectly marked sometimes? Especially because when it is a 404, it is dead: the content it links to isn't there, even if it is only temporarily dead. So it might make more sense to mark them as dead even if they are only temporary 404s.
Good job having a plan of how to program your bot. I hope this reply wasn't too long.
29590234_18x18.png Ideapad-320 | Talk | Contribs | Scratch 02:32, 16 April 2024 (UTC)
@Lovecodeabc — Well, then the original link itself is the problem, and if that has already happened, another one next to it wouldn't really make a huge difference. Some other editor will fix it eventually, and the only risk here is that the editor forgets to remove/fix the archived link, which isn't that likely.
@han614698 — The nature of refs is that they are just elements (or templates) in the article, so refs are part of articles and therefore included in the bot's search. Also, personal enjoyment of something probably isn't a reason to oppose automating it, as humans don't respond as quickly as they could and are less accurate. I do agree with you that humans can mostly judge dead links better than bots; however, the bot can also serve as a tagger that helps humans find dead links (and better alternative links) they would not have found before.
@Ideapad-320 — I can almost start to actually program the bot. Almost, because of two very major problems:
  1. All of the discussion so far may not count as consensus, as the bot potentially affects a lot of people! This is probably easy to solve, however, by waiting/inviting people and by me answering the concerns.
  2. A much bigger problem is related to Code Evaluation. The CE requires the code to be accessible, which practically means putting it on GitHub or some other platform. Either way, signing up there would be a legal issue (for reasons that you can probably imagine, which is also the reason this bot shouldn't be created by me in the first place)! I didn't think it through when I pressed Save changes in November. Anyway, this problem is most likely the reason this request will be withdrawn, if not rejected! It sounds rather silly to say it at this point, but better late than never! I'm simply not risking a flame war here.
There are probably workarounds for the latter problem (sort of, anyway). One of them (the harder way) is to just somehow get the code onto the wiki (possibly in subpages of the bot's userspace). However, the wiki isn't a code hosting platform, so this would lead to some inconvenience (and by some I mean a lot). On the other hand, this is the only option where the bot isn't actually withdrawn. However, it sounds like I'm trying to create this bot for the sake of it now.
Another workaround (the simpler way), which may not work as expected, is to leave it to someone else to carry it over. This is probably a bad idea with many, many problems (and I'm not sure if they'd let us do it anyway), but I'm mentioning it here simply because it might turn out to be a good idea.
The final workaround (the simplest way) is to just make a bot that detects dead links and such, and then let me check that the bot's work is fine and submit the edits manually! However, as that doesn't require a new account, this is still a withdrawal.
Anyway, this post is probably longer than your post. I hope this post isn't too long, messy, or nonsensical! In any case, have a good day, afternoon, or night, whenever you are reading this. :)
Purin2022 Mini User Icon.png Purin2022 | 💬Talk | 📝Contribs | 🐱Scratch 18:52, 16 April 2024 (UTC)
I am sorry if I am misunderstanding something, as I was curious about AutoDeadlinker and saw this, but is there a reason you cannot use GitHub?
Co0lcr34t10ns (talk | contribs) 00:11, 17 April 2024 (UTC)
@Co0lcr34t10ns Something to do with GitHub's ToS section B, part 3, bullet point 4.
Purin2022 Mini User Icon.png Purin2022 | 💬Talk | 📝Contribs | 🐱Scratch 18:15, 17 April 2024 (UTC)