Viewing source code

The following is the source code for post >>>/math/129

>>106
Let's generalize this to an ID of length \[n\] (for 4chan: \[n=8\]) over a set of \[q\] different possible characters (for base64: \[q=64\]) and a substring of interest of length \[\ell\ (\ell\leq n)\] (for k0t: \[\ell=3\]) with \[p\] different variations of the substring (for k0t: \[p = 12\])

The number of all possible IDs is:

,,\qquad K_0 = q^n

If \`k0t\` appears in at least one place, we must consume \[\ell = 3\] characters to assemble them into \`k0t\`, then we return the \`k0t\` into the bunch (so we now have \(n -\ell + 1\) items in the bunch) and then we arrange the \`k0t\` in \(n - \ell + 1\choose 1\) ways. We have \[p\] variations of k0t, plus \[n-\ell\] other characters which can each be \[q\] different ways (so \[q^{n-\ell}\]) (even tho I said "plus" we use \(\times\) bcoz of a thing called the \textit{multiplication principle} but w/e)

,,\qquad K_1 = {n - \ell + 1 \choose 1}\cdot p \cdot q^{n - \ell}

If there are two \`k0t\`s somewhere, we must consume \[2\ell = 6\] characters to assemble 2 \`k0t\`s, then we return the k0ts into the bunch (we now have \(n -2\ell + 2\) items) and we arrange the two kots in \(n-2\ell + 2\choose 2\) ways. We have two kots which can each be \(p\) different ways (so \(p^2\), with \[p=12\]) and \[n - 2\ell\] unconsoomed characters which can each be \(q\) different ways (\(q^{n - 2\ell}\))

,,\qquad K_2 = {n - 2\ell + 2\choose 2}\cdot p^2\cdot q^{n - 2\ell}

Therefore if we have \[k\] \`k0t\`s, the formula is the following: (and I have a truly marvelous way to prove this formula by induction but \spoiler{I am also truly a lazy fuck}.)
 
,,\qquad K_k = {n - k(\ell - 1)\choose k}\cdot p^k\cdot q^{n-k\ell} 
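Here is a minimal Python sketch of the \[K_k\] formula above, in case anyone wants to plug in numbers. The function name `placements` is made up for illustration, and returning 0 when \[k\ell > n\] is my own convention (the binomial coefficient vanishes there anyway).

```python
from math import comb

def placements(n, ell, p, q, k):
    """K_k: choose where k non-overlapping substrings go, pick one of p
    variations for each, and fill the remaining n - k*ell slots freely."""
    if k * ell > n:
        # Not enough characters for k substrings (assumed convention: 0).
        return 0
    return comb(n - k * (ell - 1), k) * p**k * q ** (n - k * ell)

# Parameters from the post: n = 8, ell = 3, p = 12, q = 64.
for k in range(0, 3):
    print(k, placements(8, 3, 12, 64, k))
```

For k = 0 this reduces to \[q^n\], which agrees with \[K_0\].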

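Now if we want to count all the IDs containing \`k0t\` without counting any ID twice, we must use the \textit{inclusion-exclusion principle} (the sum stops at \[k = \lfloor n/\ell\rfloor\] because more \`k0t\`s than that don't fit):
 
,,\qquad K_{\Sigma} = \sum_{k = 1}^{\lfloor n/\ell\rfloor}(-1)^{k - 1}\cdot K_k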

Then the probability of \`k0t\` appearing is

,,\qquad P = \frac{K_\Sigma}{K_0}
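The alternating sum and the final division can be sketched in Python with exact rational arithmetic. The name `match_probability` is hypothetical, and the loop bound `n // ell` is my assumption for where the sum ends (terms past it are zero). It also assumes, as the counting argument does, that two occurrences of the substring can never overlap.

```python
from fractions import Fraction
from math import comb

def match_probability(n, ell, p, q):
    """P = K_sigma / K_0 using inclusion-exclusion over k occurrences."""
    k_sigma = 0
    for k in range(1, n // ell + 1):
        k_k = comb(n - k * (ell - 1), k) * p**k * q ** (n - k * ell)
        k_sigma += (-1) ** (k - 1) * k_k
    return Fraction(k_sigma, q**n)

# Parameters from the post: n = 8, ell = 3, p = 12, q = 64.
prob = match_probability(8, 3, 12, 64)
print(prob, float(prob))
```

Using `Fraction` keeps the result exact, so rounding only happens at the final `float` conversion.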


So if I plug in the numbers for \`k0t\` on /pol/ slash /biz/:

,align \qquad n &= 8\\q &= 64\\\ell&=3\\p&=12

We get the following result:

,,\qquad P = \frac{K_\Sigma}{K_0} = \frac{1}{q^n}\sum_{k = 1}^{\lfloor n/\ell\rfloor}(-1)^{k - 1}\cdot K_k = \frac{1}{64^8}\cdot\left({8 - 3 + 1\choose 1}\cdot 12\cdot 64^5 - {8 - 6 + 2\choose 2}\cdot 12^2\cdot 64^2\right) = \frac{12\cdot 64^2}{64^8}\cdot(6\cdot 64^3 - 6\cdot 12) = \frac{77305872384}{281474976710656} \approx 0.02746\%

Which matches OP's numbers!
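As an independent sanity check, the formula can be compared against brute-force enumeration. Enumerating all \[64^8\] IDs is not practical, so this sketch uses a toy setup that is purely my assumption: a 5-character alphabet, two variations (`k0t` and `K0t`), and IDs of length 7. The function names are hypothetical too.

```python
from itertools import product
from math import comb

def formula_count(n, ell, p, q):
    """K_sigma from the inclusion-exclusion formula."""
    return sum(
        (-1) ** (k - 1) * comb(n - k * (ell - 1), k) * p**k * q ** (n - k * ell)
        for k in range(1, n // ell + 1)
    )

def brute_force_count(n, alphabet, variants):
    """Count length-n strings that contain at least one variant."""
    count = 0
    for chars in product(alphabet, repeat=n):
        candidate = "".join(chars)
        if any(v in candidate for v in variants):
            count += 1
    return count

# Toy parameters (assumed for illustration, not from the post).
alphabet = "kK0tx"
variants = ("k0t", "K0t")
n, ell = 7, 3

brute = brute_force_count(n, alphabet, variants)
closed = formula_count(n, ell, len(variants), len(alphabet))
print(brute, closed)
```

The two counts should agree here because no variation can overlap with another one. For a substring that can overlap itself (say `aaa`), the placement count would no longer be exact.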