  • Locked thread
AWWNAW
Dec 30, 2008

do we got some homophonic substitution going here? could that explain the freq distribution?

(i just got into cryptanalysis last night)


AWWNAW
Dec 30, 2008

but i think that could possibly explain snap titty's very nice and helpful distribution chart, if the lower case letters were being used as homophones

this was pretty cool:

jumbo wales posted:

The Beale ciphers are another example of a homophonic cipher. This is a story of buried treasure that was described in 1819–21 by use of a ciphered text that was keyed to the Declaration of Independence. Here each ciphertext character was represented by a number. The number was determined by taking the plaintext character and finding a word in the Declaration of Independence that started with that character and using the numerical position of that word in the Declaration of Independence as the encrypted form of that letter. Since many words in the Declaration of Independence start with the same letter, the encryption of that character could be any of the numbers associated with the words in the Declaration of Independence that start with that letter. Deciphering the encrypted text character X (which is a number) is as simple as looking up the Xth word of the Declaration of Independence and using the first letter of that word as the decrypted character.
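For anyone who wants to poke at the idea, here is a minimal sketch of that kind of book-keyed homophonic cipher. The key text is a made-up stand-in (not the Declaration of Independence), and the function names are hypothetical; it only illustrates how one letter can map to several numbers.

```python
import random
from collections import defaultdict

# Hypothetical key text standing in for the Declaration of Independence.
KEY_TEXT = "the quick brown fox jumps over a lazy dog then eats honey"


def build_homophones(key_text):
    """Map each initial letter to the 1-based positions of words starting with it."""
    table = defaultdict(list)
    for position, word in enumerate(key_text.lower().split(), start=1):
        table[word[0]].append(position)
    return table


def encrypt(plaintext, key_text, rng=random):
    """Replace each letter with a randomly chosen word position for that letter.

    Raises IndexError if a letter has no word in the key text.
    """
    table = build_homophones(key_text)
    return [rng.choice(table[ch]) for ch in plaintext.lower() if ch.isalpha()]


def decrypt(numbers, key_text):
    """Look up the Nth word of the key text and keep its first letter."""
    words = key_text.lower().split()
    return "".join(words[n - 1][0] for n in numbers)
```

With this key text, 't' can come out as either 1 or 10, which is exactly the property that flattens the ciphertext frequency distribution.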

AWWNAW
Dec 30, 2008

'2' and '3' each show up in the top four highest-frequency bigrams. I'm thinking maybe 't' is a homophone for '9' or something, I have no idea what I'm doing
code:
("29", 11)
("2t", 10)
("3A", 9)
("v3", 9)
hot on the trail

here's the alphabet or whatever
code:
$()-.0123456789?@ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghiklmnopqrstuvwxyz
code:
open System
open System.IO

let ciphertext =
    File.ReadAllLines("cipher.txt") |> String.concat ""
    |> Convert.FromBase64String
    |> Text.Encoding.UTF8.GetString

let alphabet = ciphertext |> Set.ofSeq
alphabet |> Set.iter (string >> printf "%s")

let getFrequencies terms = terms |> Seq.countBy id |> Seq.sortByDescending snd
let frequencies = ciphertext |> getFrequencies
frequencies |> Seq.iter (printfn "%A")
let join (chars: char array) = String.Join("", chars)
let bigramFreqs = ciphertext |> Seq.windowed 2 |> Seq.map join |> getFrequencies
bigramFreqs |> Seq.iter (printfn "%A")

open System.Text.RegularExpressions

// Find the inclusive (start, end) index range of every literal occurrence of `term`.
let getMatchRanges (term: string) (input: string) =
    Regex.Matches(input, Regex.Escape term)
    |> Seq.cast<Match>
    |> Seq.map (fun m -> (m.Index, m.Index + term.Length - 1))
    |> Seq.toArray

// Blank the line out to '?' and write `replacement` over each occurrence of `term`.
// Assumes `replacement` is at least as long as `term`.
let mask (term: string) (replacement: string) (line: string) =
    let substrate = Array.create line.Length '?'
    for lo, hi in getMatchRanges term line do
        for i in lo .. hi do
            substrate.[i] <- replacement.[i - lo]
    String(substrate)
    
let cipherLines = ciphertext.Split([|'\n'|])
let maskedLines = cipherLines |> Array.map (mask "29" "th")
for line in maskedLines do printfn "%s" line

burning swine
May 26, 2004



I got as far as trying to map bigrams and trigrams earlier this week before the poo poo hit the fan at work and I had to stop

I don't really think that approach is going to (directly) bear fruit, the letter distributions show that it isn't a straight up substitution


deffo gonna throw some effort at it this weekend

burning swine fucked around with this message at 21:09 on Aug 19, 2016

Bloody
Mar 3, 2013

i am too dumb to do these but i am interested to see the outcomes

Squeezy Farm
Jun 16, 2009
Hello Yospos this is your computer.

spankmeister
Jun 15, 2008






Squeezy Farm posted:

Hello Yospos this is your computer.

yospos is good

CRIP EATIN BREAD
Jun 24, 2002

Hey stop worrying bout my acting bitch, and worry about your WACK ass music. In the mean time... Eat a hot bowl of Dicks! Ice T



Soiled Meat
yos

Carbon dioxide
Oct 9, 2012

spankmeister posted:

yospos is good

My operating system is a piece of poo poo but I still like it.

Thanks Ants
May 21, 2004

#essereFerrari


Displeased Moo Cow posted:

Some smart mother fuckers on the Internet

echinopsis
Apr 13, 2004

by Fluffdaddy
I dot even have the faintest clue how to do anything like this at all

Lain Iwakura
Aug 5, 2004

The body exists only to verify one's own existence.

Taco Defender

echinopsis posted:

I dot even have the faintest clue how to do anything like this at all

I doubt you can decipher your own posting

vodkat
Jun 30, 2012



cannot legally be sold as vodka

Bloody posted:

i am too dumb to do these but i am interested to see the outcomes

same. last time someone posted a jupyter notebook of them solving the puzzle and that was cool to read.

graph
Nov 22, 2006

aaag peanuts

anthonypants posted:

YAMS YAMS YAMS YAMS YAMS YAMS YAMS YAMS YAMS YAMS YAMS YAMS YAMS YAMS YAMS

someone post barry_bonds_yams.gif and i'll paypal you $20

burning swine
May 26, 2004



sooo since newlines are preserved and the overall frequency distribution doesn't make any kind of sense, i'm tinkering with the idea that the key rotates/changes every line. Problem is the first line is too short to do very much with, so I'm playing with

1. the last line
2. the longest line

iunno still a shot in the dark though

Lain Iwakura
Aug 5, 2004

The body exists only to verify one's own existence.

Taco Defender

COACHS SPORT BAR posted:

sooo since newlines are preserved and the overall frequency distribution doesn't make any kind of sense, i'm tinkering with the idea that the key rotates/changes every line. Problem is the first line is too short to do very much with, so I'm playing with

1. the last line
2. the longest line

iunno still a shot in the dark though

newlines are preserved but they too are enciphered

burning swine
May 26, 2004



oh


lol well nevermind that idea then

wyoak
Feb 14, 2005

a glass case of emotion

Fallen Rib
OSI said there's padding used so....
Entire document is 3329 bytes which is a prime number so that might not be an accident.
If we ignore the trailing newline because why not, 3328 has factors of 256, 128, 64 etc etc, so that's a thing, but I haven't been able to see any patterns when splitting it into blocks of those sizes. I'm going to mess around doing other stuff with the blocks but I'm probably barking up the wrong tree here
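A rough sketch of the block-splitting check, in case anyone wants to repeat it. The helper names are made up, and looking for repeated blocks is just one cheap pattern to test for, not a claim about how the cipher works.

```python
from collections import Counter


def divisors(n):
    """All positive divisors of n, by brute force (fine for n in the thousands)."""
    return [d for d in range(1, n + 1) if n % d == 0]


def split_blocks(data, size):
    """Cut data into equal blocks; refuse sizes that do not divide the length."""
    if len(data) % size != 0:
        raise ValueError("block size does not divide the data length")
    return [data[i:i + size] for i in range(0, len(data), size)]


def repeated_blocks(blocks):
    """Return only the blocks that occur more than once, with their counts."""
    return {block: n for block, n in Counter(blocks).items() if n > 1}


# 3329 has no divisors besides 1 and itself; 3328 = 2**8 * 13.
print(divisors(3329))
print(divisors(3328))
```

To try it on the real thing, read the file as bytes, drop the trailing newline, and loop `split_blocks` over the sizes from `divisors(3328)`.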

Lain Iwakura
Aug 5, 2004

The body exists only to verify one's own existence.

Taco Defender
It is safe to ignore the trailing newline

burning swine
May 26, 2004



hmmmm
Hadn't thought about chunking the ciphertext, makes sense though. Possibly some CBC going on? I've been trying to find some kind of meaningful character/bigram frequency within subsets of the data, that could be a way to go

I've been playing around with rotating keys some this morning (no longer on a "per line" basis), haven't gotten anywhere that way

bump_fn
Apr 12, 2004

two of them

graph posted:

barry_bonds_yams.gif

NFX
Jun 2, 2008

Fun Shoe
the weird frequency distribution of the ciphertext characters (as in the graphs posted on the first page) also holds if you only take every 2nd/3rd character, or the first or last half of the text. idk there could have been some differential thing going on.
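roughly what that subsampling check looks like as a sketch (function names are made up; the four slices are the ones mentioned above):

```python
from collections import Counter


def relative_freqs(text):
    """Fraction of the text taken up by each symbol."""
    counts = Counter(text)
    total = len(text)
    return {sym: n / total for sym, n in counts.items()}


def subsamples(text):
    """The slices being compared: every 2nd, every 3rd, first half, last half."""
    half = len(text) // 2
    return {
        "every 2nd": text[::2],
        "every 3rd": text[::3],
        "first half": text[:half],
        "last half": text[half:],
    }


def compare(text):
    """Print the five most common symbols of the full text and of each slice."""
    for name, sample in {"full": text, **subsamples(text)}.items():
        top = Counter(sample).most_common(5)
        print(name, [(sym, round(n / len(sample), 3)) for sym, n in top])
```

if the top symbols and their shares look about the same in every slice, the distribution isn't position dependent at that granularity.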


i also tried calculating the index of coincidence of the ciphertext and it's 1.7315, which is pretty much identical to english. i don't really know what this means, but combine it with the fact that cipher symbol frequency is very strongly dependent on symbol value and it seems to me like there's mostly a 1-to-1 substitution between plaintext and cipher symbols. maybe the reason there's no proper digraphs or trigraphs is that it's just hella shuffled around?
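for reference, a minimal sketch of the normalized index of coincidence. this is the textbook formula, not necessarily the exact script behind the 1.7315 figure, and the `alphabet_size` parameter is my own addition, since the normalization depends on what you count as the alphabet:

```python
from collections import Counter


def index_of_coincidence(text, alphabet_size=None):
    """Normalized IC: alphabet_size * sum(n_i * (n_i - 1)) / (N * (N - 1)).

    Uniformly random text scores about 1.0. If alphabet_size is not given,
    the number of distinct symbols actually seen in the text is used.
    """
    counts = Counter(text)
    n = len(text)
    if n < 2:
        raise ValueError("need at least two symbols")
    if alphabet_size is None:
        alphabet_size = len(counts)
    coincidences = sum(c * (c - 1) for c in counts.values())
    return alphabet_size * coincidences / (n * (n - 1))
```

the score scales linearly with `alphabet_size`, so the same text gets a different number depending on the alphabet size you plug in.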

spankmeister
Jun 15, 2008






OSI be honest are you the crystalline guy? Is this crystalline?

anthonypants
May 6, 2007

by Nyc_Tattoo
Dinosaur Gum

spankmeister posted:

OSI be honest are you the crystalline guy? Is this crystalline?
:eyepop:

burning swine
May 26, 2004



status: i've finished writing the ascii pipe spinner that will display while my computer solves this problem for me


any second now

spankmeister
Jun 15, 2008






COACHS SPORT BAR posted:

status: i've finished writing the ascii pipe spinner that will display while my computer solves this problem for me


any second now

doing the lords work :patriot:

Lain Iwakura
Aug 5, 2004

The body exists only to verify one's own existence.

Taco Defender

spankmeister posted:

OSI be honest are you the crystalline guy? Is this crystalline?

i am not crazy enough to be him

for those wondering wtf it is:
https://crystalline.codeplex.com/

fins
May 31, 2011

Floss Finder
has anyone solved yet? I'm late to the party but I've got a couple days months off.

Sham bam bamina!
Nov 6, 2012

ƨtupid cat
awia did on page 1, now the puzzle is figuring out how he did it

wyoak
Feb 14, 2005

a glass case of emotion

Fallen Rib

NFX posted:

i also tried calculating the index of coincidence of the ciphertext and it's 1.7315, which is pretty much identical to english. i don't really know what this means, but combine it with the fact that cipher symbol frequency is very strongly dependent on symbol value and it seems to me like there's mostly a 1-to-1 substitution between plaintext and cipher symbols. maybe the reason there's no proper digraphs or trigraphs is that it's just hella shuffled around?
How large of an alphabet size did you use to get 1.73?

edit: I don't think Awia actually solved it

wyoak fucked around with this message at 16:08 on Aug 24, 2016

NFX
Jun 2, 2008

Fun Shoe

wyoak posted:

How large of an alphabet size did you use to get 1.73?

edit: I don't think Awia actually solved it

letters (including j) (2 * 26), digits (10), all the symbols that appear, incl. newlines (9) = 71

i suppose if you include those characters then the value for English probably isn't 1.73 any more, but higher

burning swine
May 26, 2004



I got something like 1.7x too, using a rough estimate based on the alphabet size. Didn't make sense so I went in another direction.

burning swine
May 26, 2004



also my first take on alphabet size was 69 which seemed v. yospos

I was disappointed when i recounted and got 70

wyoak
Feb 14, 2005

a glass case of emotion

Fallen Rib
Anyone have links to the other challenges? One of them was a touchtone phone cipher but not sure what the other was. Ima get up in OSI's brain space

NFX
Jun 2, 2008

Fun Shoe
don't have a link, but boffin v2 was rot-N where N is the length of the word iirc
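going from memory, so treat this as a sketch of the general idea rather than the actual boffin v2 scheme: each word gets rotated by its own length. splitting on spaces, counting every character toward N, and rotating forward to encrypt are all assumptions on my part.

```python
def rot_char(ch, shift):
    """Rotate an ASCII letter by shift places, leaving everything else alone."""
    if ch.isascii() and ch.isalpha():
        base = ord("a") if ch.islower() else ord("A")
        return chr((ord(ch) - base + shift) % 26 + base)
    return ch


def rot_by_word_length(text, decrypt=False):
    """Apply rot-N to each space-separated word, with N = len(word)."""
    out = []
    for word in text.split(" "):
        shift = len(word) % 26
        if decrypt:
            shift = -shift
        out.append("".join(rot_char(ch, shift) for ch in word))
    return " ".join(out)
```

word lengths survive encryption, so decrypting is just the same walk with the shift negated.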

Lain Iwakura
Aug 5, 2004

The body exists only to verify one's own existence.

Taco Defender
this was the second one:
https://forums.somethingawful.com/showthread.php?threadid=3766370

Lain Iwakura
Aug 5, 2004

The body exists only to verify one's own existence.

Taco Defender
this was the first under the bletchley boffins name:
https://forums.somethingawful.com/showthread.php?threadid=3708438

and this was the original challenge:
https://forums.somethingawful.com/showthread.php?threadid=3704805

burning swine
May 26, 2004



problem I'm dealing with right now is figuring out what a meaningful frequency distribution looks like for text with uppercase, lowercase, numbers, and symbols. I've separated out ciphertext that I have reason to believe maps to cleartext, but the abnormal nature of this alphabet is making frequency analysis difficult

edit: this is a distribution I found while digging (based on text from the new york times), and I've come close but not quite matched it so far. Numbers in parens are % occurrence.

code:
~: 0 (0)
|: 0 (0)
{: 0 (0)
`: 0 (0)
_: 0 (0)
^: 0 (0)
\: 0 (0)
}: 0 (0)
@: 1 (0)
#: 10 (0)
=: 22 (0)
<: 82 (0)
>: 83 (0)
+: 309 (0)
%: 1993 (0)
!: 2178 (0)
]: 2253 (0)
[: 2261 (0)
Z: 5610 (0.01)
&: 6523 (0.01)
X: 7578 (0.01)
/: 8161 (0.01)
Q: 11659 (0.02)
?: 12357 (0.02)
*: 20716 (0.03)
V: 31053 (0.04)
;: 36727 (0.05)
K: 46580 (0.07)
$: 51572 (0.07)
(: 53398 (0.08)
): 53735 (0.08)
:: 54036 (0.08)
q: 54221 (0.08)
U: 57488 (0.08)
j: 65856 (0.09)
z: 66423 (0.09)
J: 78706 (0.11)
G: 93212 (0.13)
Y: 94297 (0.13)
F: 100751 (0.14)
O: 105700 (0.15)
L: 106984 (0.15)
W: 107195 (0.15)
7: 120094 (0.17)
x: 123577 (0.17)
H: 123632 (0.17)
D: 129632 (0.18)
E: 138443 (0.19)
P: 144239 (0.2)
R: 146448 (0.21)
6: 153865 (0.22)
B: 169474 (0.24)
8: 182627 (0.26)
3: 187606 (0.26)
4: 192528 (0.27)
': 204497 (0.29)
N: 205409 (0.29)
I: 223312 (0.31)
C: 229363 (0.32)
-: 252302 (0.36)
M: 259474 (0.37)
A: 280937 (0.4)
9: 282364 (0.4)
": 284671 (0.4)
S: 304971 (0.43)
T: 325462 (0.46)
2: 333499 (0.47)
5: 374413 (0.53)
k: 460788 (0.65)
1: 460946 (0.65)
0: 546233 (0.77)
v: 653370 (0.92)
b: 866156 (1.22)
.: 946136 (1.33)
,: 984969 (1.39)
w: 1015656 (1.43)
y: 1062040 (1.5)
g: 1206747 (1.7)
p: 1255579 (1.77)
f: 1296925 (1.83)
m: 1467376 (2.07)
u: 1613323 (2.27)
c: 1960412 (2.76)
d: 2369820 (3.34)
l: 2553152 (3.6)
h: 2955858 (4.16)
r: 4137949 (5.83)
s: 4186210 (5.89)
i: 4527332 (6.37)
n: 4535545 (6.39)
o: 4729266 (6.66)
a: 5263779 (7.41)
t: 5507692 (7.76)
e: 7741842 (10.9)
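One way to put a number on "close but not quite matched" is a chi-squared score against the reference percentages. This is a minimal sketch, not what I actually ran: only the top five rows of the table are copied in, and the helper name is made up.

```python
from collections import Counter

# Top five rows of the table above, as percentages. Extend with more rows as needed.
REFERENCE_PCT = {"e": 10.9, "t": 7.76, "a": 7.41, "o": 6.66, "n": 6.39}


def chi_squared(text, reference_pct):
    """Chi-squared distance between observed counts and a reference distribution.

    Only symbols present in reference_pct are considered, and the reference is
    renormalized over those symbols. Lower means a closer match; 0 is exact.
    """
    counts = Counter(ch for ch in text if ch in reference_pct)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("text contains none of the reference symbols")
    ref_total = sum(reference_pct.values())
    score = 0.0
    for sym, pct in reference_pct.items():
        expected = total * pct / ref_total
        score += (counts[sym] - expected) ** 2 / expected
    return score
```

Score each candidate decryption with `chi_squared(candidate, REFERENCE_PCT)` and keep the lowest.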

burning swine fucked around with this message at 17:58 on Aug 24, 2016

Lain Iwakura
Aug 5, 2004

The body exists only to verify one's own existence.

Taco Defender
if someone solves it and wants to post a sha256 sum of the output, feel free and i'll confirm that way too
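a quick sketch of producing such a sum. the UTF-8 encoding and the question of whether a trailing newline belongs in the plaintext are assumptions, so the newline handling is left as a switch:

```python
import hashlib


def sha256_hex(plaintext, strip_trailing_newline=False):
    """Hex SHA-256 of the candidate plaintext, encoded as UTF-8 (an assumption)."""
    data = plaintext.encode("utf-8")
    if strip_trailing_newline and data.endswith(b"\n"):
        data = data[:-1]
    return hashlib.sha256(data).hexdigest()


print(sha256_hex("abc"))
```

if a posted sum doesn't match, it's worth trying both newline settings before assuming the decryption is wrong.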


AWWNAW
Dec 30, 2008

i still want to believe that the lower case letters are homophones/synonyms for substitutions, that's why the distribution is so flat. i think if you take out the lower case letters, the distribution looks better, then the most common lower case letters are being homophoned for the least common upper case letters, which would flatten the distribution?
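here's a sketch of how to eyeball that: split the counts by character class and look at the ciphertext with the lower case letters removed. the helper names are made up, and this only shows the distributions, it doesn't prove the homophone idea either way.

```python
import string
from collections import Counter


def classify(ch):
    """Bucket a character as upper, lower, digit, or symbol (ASCII only)."""
    if ch in string.ascii_uppercase:
        return "upper"
    if ch in string.ascii_lowercase:
        return "lower"
    if ch in string.digits:
        return "digit"
    return "symbol"


def counts_by_class(text):
    """Separate frequency counts for each character class."""
    groups = {name: Counter() for name in ("upper", "lower", "digit", "symbol")}
    for ch in text:
        groups[classify(ch)][ch] += 1
    return groups


def without_lowercase(text):
    """The text with every lower case letter dropped."""
    return "".join(ch for ch in text if not ch.islower())
```

then run `Counter(without_lowercase(ciphertext)).most_common()` and compare its shape against the full distribution.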
