“Ð/ṃƇ¬þṄẊƙ€,⁽ṙƬ®OṪJ"ɦ3×kf3Ṙç%ġu’b26ịØaṣ”z
e€¢¬T;2Ḷ¤
ḲŒtǦK
TryItOnline! or run all tests
How?
A compressed string with spaces separating the words would be 47 bytes; splitting it costs 1 byte, for 48 bytes.
Two unseparated compressed strings, one of the words of length 2 and one of the words of length 3 (with an 'a' on the end of one), would together be 40 bytes, plus 2 to split each (4 in total) and 1 to join them, for 45 bytes.
One base 250 number as described below is 32 bytes, then 3 to convert to base 26, 3 to index into the lowercase alphabet and 3 to split it on the unused character, 'z', for 41 bytes.
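The three tallies above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the dictionary keys are made-up labels, and the second total reads "2 to split each" as 2 bytes for each of the two strings.

```python
# Byte counts for the three candidate encodings of the word list.
# The labels are illustrative; the numbers come from the text above.
options = {
    "one compressed string, split on spaces": 47 + 1,
    "two compressed strings, split each, then join": 40 + 2 * 2 + 1,
    "base 250 number -> base 26 -> alphabet -> split on 'z'": 32 + 3 + 3 + 3,
}

for label, total in options.items():
    print(f"{total:2d} bytes: {label}")

# The cheapest option is the one the answer actually uses.
best = min(options, key=options.get)
```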
So, the lookup for the words not to capitalise:
“Ð/ṃƇ¬þṄẊƙ€,⁽ṙƬ®OṪJ"ɦ3×kf3Ṙç%ġu’
was formed like so:
Take those words and join them with a separator:
s="a an the at by for in of on to up and as but or nor"
Next label 'a' as 1, 'b' as 2, and so on, with the separator as 0:
alpha = ' abcdefghijklmnopqrstuvwxyz'
x = [alpha.index(v) for v in s]
x
[1,0,1,14,0,20,8,5,0,1,20,0,2,25,0,6,15,18,0,9,14,0,15,6,0,15,14,0,20,15,0,21,16,0,1,14,4,0,1,19,0,2,21,20,0,15,18,0,14,15,18]
Convert this into a base 26 number (the last letter used is 'y', i.e. 25, plus a digit for the separator, so 26 digit values suffice). Python code for this is:
n=sum(v*26**i for i,v in enumerate(x[::-1]))
Convert that into a base 250 number (using a list for the digits):
b=[]
while n:
    n,d = divmod(n,250)
    b=[d]+b
b
[16,48,220,145,8,32,202,209,162,13,45,142,244,153,9,80,207,75,35,161,52,18,108,103,52,205,24,38,237,118]
Look up the characters at those indexes (1-based, hence the i-1) in Jelly's code page:
codepage = '''¡¢£¤¥¦©¬®µ½¿€ÆÇÐÑ×ØŒÞßæçðıȷñ÷øœþ !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~¶°¹²³⁴⁵⁶⁷⁸⁹⁺⁻⁼⁽⁾ƁƇƊƑƓƘⱮƝƤƬƲȤɓƈɗƒɠɦƙɱɲƥʠɼʂƭʋȥẠḄḌẸḤỊḲḶṂṆỌṚṢṬỤṾẈỴẒȦḂĊḊĖḞĠḢİĿṀṄȮṖṘṠṪẆẊẎŻạḅḍẹḥịḳḷṃṇọṛṣṭụṿẉỵẓȧḃċḋėḟġḣŀṁṅȯṗṙṡṫẇẋẏż«»‘’“”'''
r=''.join(codepage[i-1] for i in b)
r
'Ð/ṃƇ¬þṄẊƙ€,⁽ṙƬ®OṪJ"ɦ3×kf3Ṙç%ġu'
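(Note: the actual implementation is bijective, with digits running from 1 to 250, so if b had contained any 0 digits one would first need to carry down: replace each 0 with 250 and subtract 1 from the next more significant digit.)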
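The carry-down can be folded into the digit loop itself. The following is a minimal sketch, not Jelly's own code; the function names to_bijective_digits and from_digits are hypothetical helpers introduced here for illustration.

```python
def to_bijective_digits(n, base=250):
    """Digits of n in bijective base `base`: each digit lies in 1..base."""
    digits = []
    while n > 0:
        n, d = divmod(n, base)
        if d == 0:
            # Carry down: a 0 digit becomes `base`, borrowing 1 from the rest.
            d = base
            n -= 1
        digits.append(d)
    return digits[::-1]  # most significant digit first


def from_digits(digits, base=250):
    """Inverse: evaluate the digit list as a positional number."""
    n = 0
    for d in digits:
        n = n * base + d
    return n


print(to_bijective_digits(250))  # a plain conversion would give [1, 0]
print(to_bijective_digits(251))
```

When no digit of the ordinary base 250 expansion is 0, as with the list b above, the two conversions produce the same digits, which is why the simple divmod loop was enough here.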
The rest:
ḲŒtǦK - Main link: title string
Ḳ      - split on spaces
    ¦  - apply to indexes
   Ç   -   given by calling the last link (1) as a monad (with the split title string)
 Œt    -   title case (first letter of each (only) word to upper case)
     K - join on spaces
e€¢¬T;2Ḷ¤ - Link 1, find indexes to capitalise: split title string
e€        - is an element of, for €ach
  ¢       - the result of calling the last link (2) as a nilad
   ¬      - logical not
    T     - get the truthy indexes (indexes of words that are not in the list)
     ;    - concatenate with
        ¤ - nilad followed by link(s) as a nilad
      2Ḷ  -   range(2) -> [0,1]
          -   (we always want to capitalise the first index, 1, and the last index, 0)
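For readers who do not know Jelly, the main link and Link 1 correspond roughly to the Python below. It is a sketch under two assumptions: the input is all lower case, and upper-casing the first letter is an adequate stand-in for Œt. The names SMALL_WORDS and title_case are hypothetical.

```python
# Words that stay lower case unless they come first or last in the title.
SMALL_WORDS = "a an the at by for in of on to up and as but or nor".split(" ")


def title_case(title):
    words = title.split(" ")  # like Ḳ: split on spaces
    # Like e€¢¬T: 1-based indexes of the words that are NOT in the list...
    indexes = [i for i, w in enumerate(words, 1) if w not in SMALL_WORDS]
    # ...concatenated with [0, 1]; with Jelly's 1-based modular indexing,
    # 1 is the first word and 0 is the last word.
    indexes += [0, 1]
    for i in set(indexes):
        j = (i - 1) % len(words)  # convert to a 0-based Python index
        words[j] = words[j][:1].upper() + words[j][1:]
    return " ".join(words)  # like K: join on spaces


print(title_case("the quick brown fox jumps over the lazy dog"))
```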
“Ð/ṃƇ¬þṄẊƙ€,⁽ṙƬ®OṪJ"ɦ3×kf3Ṙç%ġu’b26ịØaṣ”z - Link 2, make the word list: no arguments
“Ð/ṃƇ¬þṄẊƙ€,⁽ṙƬ®OṪJ"ɦ3×kf3Ṙç%ġu’          - the base 250 number
                                b26       - convert to base 26
                                   ị      - index into
                                    Øa    - lowercase alphabet
                                      ṣ   - split on
                                       ”z - literal 'z' (the separator 0 indexes into `z`)
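The decoding done by Link 2 can be mirrored in Python to confirm that it inverts the encoding. This is a sketch, not the interpreter's code: link2 is a hypothetical name, and the number n is rebuilt from the word string here rather than parsed from the Jelly literal.

```python
import string

s = "a an the at by for in of on to up and as but or nor"
alpha = " " + string.ascii_lowercase  # separator -> 0, 'a' -> 1, ..., 'z' -> 26

# Encode: read the letter values as the digits of one base 26 number.
n = 0
for ch in s:
    n = n * 26 + alpha.index(ch)


def link2(n):
    # b26: convert to a list of base 26 digits, most significant first
    digits = []
    while n:
        n, d = divmod(n, 26)
        digits.append(d)
    digits.reverse()
    # ịØa: 1-based, modular index into the lowercase alphabet, so 0 -> 'z'
    letters = "".join(string.ascii_lowercase[(d - 1) % 26] for d in digits)
    # ṣ”z: split on the otherwise unused 'z'
    return letters.split("z")


print(link2(n))
```

The round trip only works because none of the words contains a 'z' and the first digit is non-zero, so no leading digit is lost when n is formed.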