
/meta/ - Meta Questions



25 Dec 2021: Mathchan is launched into public


File: YEP7CYAZKETQSIYGVO7DC6FJPOTC7TBK.jpeg ( 26.9 KB , 300x222 , 1641921703532.jpeg )

Hey, I noticed links posted here get "cloaked" through the exit?u=ID endpoint. This is bad.

Link-shorteners make link-rot so much worse. When a shortening service dies, all links it resolved are lost forever too. By acting as a middleman, shorteners add another point of possible failure in the chain: https://www.economist.com/international/2012/10/13/cut-short

If Mathchan dies, links in threads people may have backed up or archived die too. Perhaps this is OK or intended in sites designated for ephemeral discussion, but I get the impression your site is for discussion of pretty serious topics. Think about people, X years from now, who want to reference dead threads from here and do so through a 3rd party like the Wayback Machine. Or people today who simply want to read your site offline and want to know where links lead, at least.
Don't mess with links.
>>
>>534
Links are cloaked and put through exit?u=ENCODED_URL in order to display Mathchan's exit point disclaimer. This is done for the following two reasons:
  1. To protect users from accidentally clicking on harmful links that are posted on Mathchan (such as phishing sites or sites with criminal content) before Mathchan's staff is alerted and has a chance to take them down.
  2. To disassociate Mathchan from any websites users link to, as well as to hint to search engines that Mathchan does not want to associate with these sites (all user-generated links have rel="ugc nofollow").

Trusted sites (like links to Mathchan itself) are not put through the disclaimer and resolve immediately. Links to some well-known sites (like Google, YouTube, Reddit, Twitter etc.) are distrusted by default because privacy-conscious users may not agree with the privacy policies of these sites and do not want to let them know Mathchan was the site that referred them. Therefore, these sites have to go through the exit point too. When user settings become available there will be an option to whitelist certain sites in order to circumvent the disclaimer for sites you trust. URLs to websites that Mathchan trusts are going to have direct links, but for now the \url and \href commands put all URLs through the exit point.
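
As a rough illustration of the routing described above, the sketch below renders a user-supplied link either directly or through the exit point. It is not Mathchan's actual code: render_link, exit_url, TRUSTED_HOSTS and the exact-hostname match are assumptions made for the example, and the encoding follows the base64 scheme described further down in this thread.

```python
import base64
from html import escape
from urllib.parse import urlsplit

# Prefix taken from the example link in this thread.
EXIT_PREFIX = "https://mathchan.org/exit?u="
# Hypothetical allowlist; the thread only names Mathchan itself as trusted.
TRUSTED_HOSTS = {"mathchan.org"}


def exit_url(url: str) -> str:
    """Wrap a URL in the exit point using base64 with +/= swapped for ._-"""
    token = base64.b64encode(url.encode("utf-8")).decode("ascii")
    return EXIT_PREFIX + token.translate(str.maketrans("+/=", "._-"))


def render_link(url: str, text: str, trusted=TRUSTED_HOSTS) -> str:
    """Return an <a> tag; untrusted hosts are routed through the exit point."""
    host = (urlsplit(url).hostname or "").lower()
    href = url if host in trusted else exit_url(url)
    # Every user-generated link carries rel="ugc nofollow", trusted or not.
    return '<a href="{}" rel="ugc nofollow">{}</a>'.format(
        escape(href, quote=True), escape(text)
    )
```

A per-user whitelist, once user settings exist, could be passed in as the trusted argument instead of the site-wide default.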


>If Mathchan dies, links in threads people may have backed up or archived die too
If Mathchan is recursively mirrored then exit points ought to be mirrored with it too. Exit points contain a real URL so there would be no issue following the links on Mathchan's future archive.

Also, URLs are not shortened, nor encrypted. They are just encoded and can be easily reconstructed by replacing the characters ._- with +/= respectively and running the result through a base64 decoder (base64 is the current way Mathchan encodes URLs). Creating a URL shortener or encrypting URLs was in consideration, but that would require us to maintain a database of shortened URLs or an encryption key, with no real benefit from doing so. If either of these were lost, all links on Mathchan would be dead. Unfortunately, encoding URLs has a significant drawback that URL shortening does not: long URLs become even longer, possibly exceeding the maximum size of a GET request. This is a small price to pay for the benefits, so ridiculously long URLs (longer than 2048 characters) are simply unsupported by Mathchan.
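
The decoding steps above can be sketched in a few lines of Python. The function names are made up for this example; only the character swap and the use of base64 come from the description in this post.

```python
import base64

# Character swap described above: base64's "+/=" become "._-" in the URL.
_TO_PARAM = str.maketrans("+/=", "._-")
_FROM_PARAM = str.maketrans("._-", "+/=")


def encode_exit_param(url: str) -> str:
    """Encode a URL into the value carried by exit?u=..."""
    return base64.b64encode(url.encode("utf-8")).decode("ascii").translate(_TO_PARAM)


def decode_exit_param(param: str) -> str:
    """Recover the original URL from an exit?u=... value."""
    return base64.b64decode(param.translate(_FROM_PARAM)).decode("utf-8")


print(encode_exit_param("https://terrorism.org/"))
# aHR0cHM6Ly90ZXJyb3Jpc20ub3JnLw--
print(decode_exit_param("aHR0cHM6Ly90ZXJyb3Jpc20ub3JnLw--"))
# https://terrorism.org/
```

Because base64 turns every 3 bytes into 4 characters, a URL of n bytes encodes to 4*ceil(n/3) characters, so a 1536-byte URL already produces a 2048-character parameter.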

Why are URLs encoded when it would be much easier to just do exit?u=<URL>? Because a link such as https://mathchan.org/exit?u=https://terrorism.org/ would make it appear to casual users who have not viewed our disclaimer, as well as to search engines, that Mathchan is linking to terrorism. On its own, a link such as https://mathchan.org/exit?u=aHR0cHM6Ly90ZXJyb3Jpc20ub3JnLw-- looks like it could point to any resource until you open it to see what it links to; it is coupled with the disclaimer for users and rel="nofollow" for search engines. Search engines are not going to go the extra step to try and interpret aHR0cHM6Ly90ZXJyb3Jpc20ub3JnLw-- but they can easily consider https://terrorism.org/ a URL slug and tank Mathchan's rankings. Not that we would ever allow linking to criminal activity like terrorism, but search engines usually index pages faster than moderators have time to act. On the other hand, search engines like it when a site links to other sites, but those are only going to be the sites we trust, and currently we trust none.

TL;DR:
Without the considerations given above, giving users the \url{...} and \href{...}{...} commands would be highly dangerous for both Mathchan users and the site itself.

>Perhaps this is OK or intended in sites designated for ephemeral discussion
Image boards are ephemeral by design. Old threads archive on their own and are eventually pruned.
Mathchan Wiki will be permanent but external links will also go through the exit point.
>>
>>535
You know, I was interested to ask this because this is the only image board I can name that messes with URLs. It somewhat reminded me of lame forums where you needed to register to follow links.
>Search engines can easily consider https://terrorism.org/ a URL slug and tank Mathchan's rankings

Now that I think about it, I don't remember ever searching on Google and having an imageboard be one of the results, even when there's a thread that fits the search query better than any of the results that actually come up. Imageboards do come up if I search by name, so they're not all blacklisted. It seems (mainstream) search engines don't regard imageboards highly.
Is this actually because chans have an unusual frequency of naughty links? If that's true then cloaking is a good move for Mathchan. Or is it that chans simply don't practice aggressive SEO to earn a place on the first page of results?

I appreciate that resolving exit?u= doesn't rely on a database, and the encoded URL is just that, it can be decoded. But is a nebulous (AFAIK) search engine optimization principle worth the inconvenience? I think you downplay how inconvenient it is:
>If Mathchan is recursively mirrored then exit points ought to be mirrored with it too. Exit points contain a real URL so there would be no issue following the links on Mathchan's future archive.
I highly doubt there will be an anywhere near competent Mathchan archive. Only 4chan has dedicated archives, other chans at most have individual pages saved on the Wayback Machine. Links in Mathchan are broken in the Wayback Machine, and they're also broken when casually downloading one thread with Ctrl+S.

You underestimate your users' incompetence, and underestimate their competence at the same time. You expect users to be able to do recursive mirroring or base64 decode URLs. But you don't expect them to think before choosing to click a link, so you greet them with an obnoxious "Are you sure you want to exit?" page. Nor do you expect privacy-conscious users to configure their own browser to not send referer headers, if that's their concern.
>1. To protect users from accidentally clicking on harmful links that are posted on Mathchan
>Links to some well-known sites (like Google, YouTube, Reddit, Twitter etc.) are distrusted by default because privacy-conscious users may not agree with the privacy policies of these sites and do not want to let them know Mathchan was the site that referred them.
>>
>>536
>this is the only image board I can name that messes with URLs.
This is also one of the few imageboards with significant innovation, which comes with some responsibility. If we are going to provide users with rich markup that can embed URLs, LaTeX figures etc. then we have to think about how we are going to do that securely. We could've allowed embedding URLs like javascript:while(1); that would've crashed your browser, but we don't want browsing Mathchan to feel like a minefield. \url{...} and \href{...}{...} are there to allow posters to link to other websites, but users should not be afraid of clicking them, voluntarily or involuntarily.
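
One common way to keep URLs like javascript:while(1); out of rendered links is a scheme allowlist. The sketch below shows that general technique only; the thread does not say how Mathchan actually validates URLs, and is_linkable and ALLOWED_SCHEMES are names invented for the example.

```python
from urllib.parse import urlsplit

# Assumption: only ordinary web links are accepted.
ALLOWED_SCHEMES = {"http", "https"}


def is_linkable(url: str) -> bool:
    """Accept a URL only if it has an allowed scheme and a host."""
    try:
        parts = urlsplit(url.strip())
    except ValueError:
        # Malformed input (e.g. a broken IPv6 literal) is rejected outright.
        return False
    return parts.scheme.lower() in ALLOWED_SCHEMES and bool(parts.hostname)


print(is_linkable("javascript:while(1);"))  # False
print(is_linkable("https://google.com/"))   # True
```

An allowlist is used rather than a blocklist so that schemes nobody thought of (data:, vbscript: and so on) are rejected by default.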

>Is this actually because chans have an unusual frequency of naughty links?
Yes, and anonymous image boards in particular have a troubled history with this stuff. There is a CSAM linking spambot that plagues all imageboards including Mathchan and we already have two of its IP addresses permabanned, though this one will be thwarted by Mathchan's captcha.

>But is a nebulous (AFAIK) search engine optimization principle worth the inconvenience?
This is not just for search engines but for your security as well. If you're browsing Mathchan at work or school, you may have to install their root certificate, in which case your organization will act as the man in the middle between you and Mathchan. If they scan the page you're looking at and find <a> tags linking to blacklisted/criminal websites, you will be in trouble. Even if your connection is secure, it's better if you can review the link before following it, especially if it's given using \href{...}{...} (e.g. Click me). Malicious users may also try to conceal a link using \color{...} or \colorbox{...}{...} in order to trick you into clicking it. They may also use \href{...}{...} to make it look like \url{...} and trick you into visiting a website you didn't expect (e.g. https://google.com/). There are simply innumerable reasons why reviewing the URL you're going to follow from Mathchan is a good idea. All these shenanigans are considered a violation of Rule 1d, but before a moderator can remove the post with malicious links you'll have at least one layer of protection.
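
To make the \href-disguised-as-\url trick concrete, here is a hypothetical check that flags a link whose visible text looks like a URL for one host while the target points at another. Nothing in this thread says Mathchan runs such a check (it relies on the exit page and Rule 1d); looks_deceptive is an invented name and the comparison is deliberately simplistic.

```python
from urllib.parse import urlsplit


def looks_deceptive(link_text: str, target_url: str) -> bool:
    """Flag links whose visible text names a different host than the target."""
    text = link_text.strip()
    if "://" not in text:
        # Plain text such as "Click me" makes no claim about the destination.
        return False
    try:
        shown = (urlsplit(text).hostname or "").lower()
        actual = (urlsplit(target_url).hostname or "").lower()
    except ValueError:
        # Unparseable input is treated as suspicious.
        return True
    return shown != actual


print(looks_deceptive("https://google.com/", "https://evil.example/"))  # True
print(looks_deceptive("https://google.com/", "https://google.com/x"))   # False
print(looks_deceptive("Click me", "https://evil.example/"))             # False
```

A check like this cannot catch a "Click me" link or one hidden with \color{...}, which is why showing the real URL before leaving the site still matters.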

>Links in Mathchan are broken in the Wayback Machine, and they're also broken when casually downloading one thread with Ctrl+S

Our security outweighs the Wayback Machine working, and you'll also be able to whitelist all websites through user settings, in which case Ctrl+S will work properly.

>You expect users to be able to do recursive mirroring or base64 decode URLs. But you don't expect them to think before choosing to click a link, so you greet them with an obnoxious "Are you sure you want to exit?" page.
You don't have to decode base64-encoded URLs because Mathchan does that for you. If you decide to recursively archive Mathchan using wget -m, it should also download the exit points so you can browse the local archive just fine.
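
For a single saved page rather than a full mirror, the encoded links could in principle be rewritten back to their targets offline. The sketch below assumes saved pages contain links of the form exit?u=TOKEN, optionally prefixed with https://mathchan.org/; that pattern, and the restore_links name, are guesses for illustration, not a description of Mathchan's markup.

```python
import base64
import re
from html import escape

# Assumed link shape in saved HTML: [https://mathchan.org]/exit?u=TOKEN
_EXIT_LINK = re.compile(r"(?:https?://mathchan\.org)?/?exit\?u=([A-Za-z0-9._-]+)")
_FROM_PARAM = str.maketrans("._-", "+/=")


def restore_links(html: str) -> str:
    """Replace exit-point links in saved HTML with the URLs they encode."""

    def _decode(match):
        try:
            url = base64.b64decode(match.group(1).translate(_FROM_PARAM)).decode("utf-8")
        except ValueError:
            # Leave anything that does not decode cleanly untouched.
            return match.group(0)
        # Escape so the URL is safe inside an HTML attribute.
        return escape(url, quote=True)

    return _EXIT_LINK.sub(_decode, html)
```

Run over a thread saved with Ctrl+S, this would leave ordinary direct links in place of the exit-point ones, at the cost of dropping the disclaimer step.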
>>

File: 1640647504919.png ( 920.89 KB , 1060x772 , 1641949809332.png )

>>538
You've made good points about security. And while I still believe link cloaking is an inconvenience to some aspects of user interaction, you've convinced me its possible benefits outweigh those bumps.
>>

File: holdup.png ( 9.62 KB , 441x224 , 1641950727548.png )

>>539
We should have a pop-up like pic related, but that'll take a while because lukyon is the only developer.