Longest common prefix of two strings


30

Write a program that takes two strings as input and returns the longest common prefix. This is code-golf, so the answer with the fewest bytes wins.

Test Case 1:

"global" , "glossary"
"glo"


Test Case 2:

"department" , "depart"
"depart"

Test Case 3:

"glove", "dove"
""
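
For reference, an ungolfed version of the task might look like the sketch below. It is not one of the submitted answers, and the function name `longest_common_prefix` is made up for illustration.

```python
def longest_common_prefix(a: str, b: str) -> str:
    """Return the longest string that both a and b start with."""
    n = 0
    # Advance while both strings still have a character at index n
    # and those characters agree.
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return a[:n]


# The three test cases from the challenge.
print(repr(longest_common_prefix("global", "glossary")))    # 'glo'
print(repr(longest_common_prefix("department", "depart")))  # 'depart'
print(repr(longest_common_prefix("glove", "dove")))         # ''
```

The explicit length checks matter: without them, two equal strings would run the index past the end.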

1
Another good test case is "aca", "aba".
Morgan Thrapp

2
Do you want complete programs that read from STDIN and write to STDOUT, or are functions OK?
xnor

2
Can we assume the input won't contain newlines? What characters can the input contain?
Downgoat

5
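General note: people using a regex-based solution should not copy other people's regex answers without testing them themselves. This does not work in all regex engines. In particular, it gives different (and both incorrect) answers in nvi and vim.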
Random832

1
The examples given are all lowercase, but do we need to worry about case sensitivity? For example, should global and GLOSSARY return glo or ''?
AdmBorkBork

Answers:


22

Python 3, 54 bytes

Thank goodness Python has a built-in function for this task! :D

import os;print(os.path.commonprefix(input().split()))

Takes input as two words separated by a space, such as glossary global.
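
An ungolfed sketch of the same idea, wrapped in a function so it can be called without STDIN. The name `common_prefix_builtin` is illustrative only. Note that `os.path.commonprefix` compares character by character rather than path component by path component, which is exactly what this challenge needs.

```python
import os.path


def common_prefix_builtin(line: str) -> str:
    """Split a line on whitespace and return the common prefix of the words."""
    return os.path.commonprefix(line.split())


print(common_prefix_builtin("glossary global"))  # glo

# The comparison is purely textual, so the result need not be a valid path.
print(os.path.commonprefix(["/usr/lib", "/usr/local"]))  # /usr/l
```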


21

Haskell, 29 bytes

(c:x)%(d:y)|c==d=c:x%y;_%_=""

Usage:

>> "global"%"glossary"
"glo"

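Recursively defines the binary function % by pattern matching. On two strings whose first characters are equal, it takes that first character and prepends it to the result of the function applied to the remainders of the strings. On anything else, it gives the empty string.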
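
The same recursion, sketched in Python for readers who do not read Haskell. The name `prefix_rec` is an assumption for illustration; unlike the Haskell version, this one is bounded by Python's recursion limit, so it only suits short inputs.

```python
def prefix_rec(s: str, t: str) -> str:
    """Recursive common prefix, mirroring the two Haskell equations."""
    # First equation: both strings are non-empty and their heads match,
    # so keep the head and recurse on the tails.
    if s and t and s[0] == t[0]:
        return s[0] + prefix_rec(s[1:], t[1:])
    # Fallback equation: anything else gives the empty string.
    return ""


print(prefix_rec("global", "glossary"))  # glo
```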


11

Pyth, 8 7 bytes

e@F._MQ

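Thanks to @isaacg for 1 byte off.

Takes input quoted and comma-separated, like "abc", "acc". When the result is the empty string, this exits with an error (but leaves stdout empty). If that is unacceptable, add 2 bytes: #e@F._MQq

Test Suite

Explanation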

e@F._MQ        : implicit Q = eval(input)
   ._MQ        : Map the prefix operator onto both inputs
 @F            : Fold the setwise intersection operator over those lists
e              : Take the last such element, the prefixes are always made from shortest
               : to longest, so this always gives the longest matching prefix
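
A rough Python rendering of the explanation above, assuming (as the error on an empty result suggests) that the prefix operator yields only non-empty prefixes. The helper names are hypothetical, and the empty-result case is handled explicitly instead of raising an error.

```python
def prefixes(s: str) -> list:
    """All non-empty prefixes of s, shortest first."""
    return [s[:i] for i in range(1, len(s) + 1)]


def common_prefix_by_intersection(a: str, b: str) -> str:
    """Intersect the prefix lists and take the last (longest) shared one."""
    b_prefixes = set(prefixes(b))
    # Keep the order of a's prefixes so the longest shared one comes last.
    shared = [p for p in prefixes(a) if p in b_prefixes]
    return shared[-1] if shared else ""


print(common_prefix_by_intersection("abc", "acc"))  # a
```

This builds every prefix of both strings, so it is quadratic in memory; it is shown only to illustrate the approach.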

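To get an empty string as the result without an error: e|@F._M.z]k
kirbyfan64sos

@kirbyfan64sos I believe the thing I put in about surrounding it in #...q is one byte less than that. I'll edit in the full code, since I guess it's confusing.
FryAmTheEggman

1
You can take the input in the form "abc", "def" and use Q instead of .z
isaacg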

10

C++, 101 100 99 bytes

#include<iostream>
int i;main(){std::string s,t;std::cin>>s>>t;for(;s[i]==t[i];std::cout<<s[i++]);}

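Reads two strings from stdin and prints the character at the current position of one string for as long as it equals the character at the same position of the other string.

Thanks to Zereges for saving one byte.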


4
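That is a beautiful and terrifying use of the for statement...
Joshpbarron

The loop will fail to terminate if the strings are equal.
Jon Trauntvein

2
Won't work for strings containing whitespace. You can save one byte by declaring int i in global scope (so that it is initialized to 0).
Zereges

@JonTrauntvein I think that case is UB (?). It works for me™. (gcc-5.1)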
sweerpotato

9

Haskell, 38 bytes

((map fst.fst.span(uncurry(==))).).zip

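Usage example: ( ((map fst.fst.span(uncurry(==))).).zip ) "global" "glossary" -> "glo"

Zips both input strings into a list of pairs of characters. Splits that into two lists: the first has all pairs from the beginning for as long as both characters are equal, the second has all the rest. Drops the second list and extracts the characters from the first list.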
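
The zip-then-span idea translates fairly directly to Python with `itertools.takewhile`. This is a sketch with an illustrative function name, not a golfed submission.

```python
from itertools import takewhile


def prefix_takewhile(a: str, b: str) -> str:
    """Pair up characters, keep the leading run of equal pairs."""
    # zip stops at the shorter string, like Haskell's zip.
    matching = takewhile(lambda pair: pair[0] == pair[1], zip(a, b))
    # Extract the first component of each surviving pair.
    return "".join(x for x, _ in matching)


print(prefix_takewhile("global", "glossary"))  # glo
```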


9

CJam, 12 11 9 bytes

l_q.-{}#<

This reads the strings on two separate lines with Unix-style line endings, i.e., <string>\n<string>\n

Thanks to @MartinBüttner for -1 byte, and to @jimmy23013 for -2 bytes!

Try it online in the CJam interpreter.

How it works

l_         e# Read a line (w/o trailing LF) from STDIN and push a copy.
  q        e# Read another line from STDIN (with trailing LF).
           e# The trailing linefeed makes sure that the lines are not equal.
   .-      e# Perform vectorized character subtraction. This yields 0 for equal
           e# characters, a non-zero value for two different characters, and the
           e# characters themselves (truthy) for the tail of the longer string.
     {}#   e# Find the index of the first truthy element.
        <  e# Keep that many characters from the first string.
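
The steps above can be imitated in Python as follows. This is only a sketch: it assumes the first string contains no linefeed, represents the "leftover tail" of the longer string with an arbitrary non-zero value, and uses a made-up function name.

```python
from itertools import zip_longest


def prefix_by_difference(first: str, second: str) -> str:
    """Find the first position where the character difference is non-zero."""
    # Mimic reading the second line together with its trailing linefeed,
    # which guarantees the two sequences are never identical.
    second_with_lf = second + "\n"
    diffs = []
    for x, y in zip_longest(first, second_with_lf):
        if x is None or y is None:
            diffs.append(1)  # tail of the longer string: always truthy
        else:
            diffs.append(ord(x) - ord(y))  # 0 exactly when characters match
    # Index of the first truthy (non-zero) element.
    index = next(i for i, d in enumerate(diffs) if d != 0)
    return first[:index]


print(prefix_by_difference("global", "glossary"))  # glo
```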

Damn, I can't believe my first answer was so close!
geokavel

1
You can cheat a bit by assuming a trailing newline and using l_q.-
jimmy23013

@jimmy23013 That's standard for input on Unix-like OS's, so why not? Thanks!
Dennis

8

APL, 13

{⊃↓K/⍨=⌿K←↑⍵}

This is a function that takes an array of two strings, and returns the prefix:

      {⊃↓K/⍨=⌿K←↑⍵}'glossary' 'global'
glo
      {⊃↓K/⍨=⌿K←↑⍵}'department' 'depart'
depart

Is it really fair to say that the APL alphabet is an alphabet of byte-size characters? Or is that standard practice around here?
Filipq

9
@Filipq Answers here use the encoding most natural to the language. APL has its own code page on which each character is a single byte.
Alex A.

7

AppleScript, 215 Bytes

And I tried so hard... ;(

set x to(display dialog""default answer"")'s text returned
set a to(display dialog""default answer"")'s text returned
set n to 1
set o to""
repeat while x's item n=a's item n
set o to o&x's item n
set n to n+1
end
o

I wanted to see how well AppleScript could pull this off, and man is it not built for string comparisons.


12
AppleScript wasn't built for anything.
kirbyfan64sos

The only thing I use it for besides terrible golfs is tell app "System Events" to <something>. It is interesting to see how it deals with this kind of stuff, though. @kirbyfan64sos
Addison Crump



6

CJam, 12 8 26

r:AAr:B.=0#_W={;;ABe<}{<}?

Try it Online.

(Got idea to use .= instead of .- after looking at Dennis's answer.)

With all the edge cases, it became too hard for a CJam beginner like me to keep it short. Hopefully, this at least works for all cases.


6

C#, 201 147 bytes

using System.Linq;class a{static void Main(string[]a){a[0].Take(a[1].Length).TakeWhile((t,i)=>a[1][i]==t).ToList().ForEach(System.Console.Write);}}

I know it isn't terribly competitive. I just wanted to see what it would look like.

EDIT: Thanks Ash Burlaczenko, Berend, and Dennis_E


2
Just getting a C# answer under 250 bytes is competitive. Also, can't you just using System.*?
clap

1
.ForEach(x=>Console.Write(x)) could be shortened to .ForEach(Console.Write)
Ash Burlaczenko

1
using System.Collections.Generic; is unnecessary. Shave off one more byte by removing the space from string[] a.
Berend

2
1-The Contains is unnecessary. 2-You can save a few bytes by removing using System; and saying System.Console.Write; 3-This code returns the wrong result ("a") for input "aab","aaab", because of IndexOf. The shortest fix I could think of is using a[0].Take(a[1].Length) This is 147 bytes long: "using System.Linq;class a{static void Main(string[]a){a[0].Take(a[1].Length).TakeWhile((c,i)=>a[1][i]==c).ToList().ForEach(System.Console.Write);}}"
Dennis_E

Thanks for the comments. When I get a break I'll take a good look at all of them, especially Dennis_E's comment.
Jakotheshadows

5

Common Lisp, 39

(lambda(a b)(subseq a 0(mismatch a b)))

Takes two string arguments, determines the index i where they differ, and returns a substring from 0 to i.
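
Python has no direct counterpart to `mismatch`, so the sketch below hand-rolls one to show what the Lisp answer relies on. It assumes the usual Common Lisp behaviour: `mismatch` gives the first differing index, or NIL when the sequences are equal, and a NIL end in `subseq` means "to the end". Slicing with `None` plays the same role here. Both function names are illustrative.

```python
from typing import Optional


def mismatch(a: str, b: str) -> Optional[int]:
    """Index where a and b first differ, or None if they are equal."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # No differing character within the shared length: the strings differ
    # only if one is longer, and then they differ where the shorter ends.
    if len(a) != len(b):
        return min(len(a), len(b))
    return None


def prefix_via_mismatch(a: str, b: str) -> str:
    # a[:None] is the whole string, covering the equal-strings case.
    return a[:mismatch(a, b)]


print(prefix_via_mismatch("global", "glossary"))  # glo
```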


5

Perl 5, 20 19 18 bytes

19 bytes, plus 1 for the -E flag instead of -e:

say<>=~/^(.*).* \1/

This is copied shamelessly from Digital Trauma's sed answer. It assumes the input is a couple of words without spaces in them (or before the first) and with one space between them.


Update:

ThisSuitIsBlackNot suggested using -pe as follows, to save a byte (thanks!):

($_)=/^(.*).* \1/

And then Luk Storms suggested using -nE as follows to save another byte (thanks!):

say/^(.*).* \1/

(I'm counting -E as one byte instead of the standard -e, but -n or -p as two. My impression is that that's SOP around here.)
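
For anyone who wants to check the regex outside Perl, here is a sketch using Python's `re` module, which is a backtracking engine. `re.match` anchors at the start of the string, so the `^` is dropped; the function name is made up. As in the answer, it assumes two space-free words separated by exactly one space.

```python
import re


def prefix_regex(line: str) -> str:
    """Capture the longest prefix of word one that also starts word two."""
    # (.*) is greedy, so the engine tries the longest candidate first and
    # backtracks until the text after the single space starts with \1.
    m = re.match(r"(.*).* \1", line)
    return m.group(1) if m else ""


print(prefix_regex("global glossary"))  # glo
```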


1
"-M5.010, when needed, is free". Per the same meta post, -pe or -ne would be 1 additional byte, not 2. So perl -nE 'say/^(.*).* \1/' would score 16 bytes.
ThisSuitIsBlackNot

4

Python 3, 72

31 bytes saved thanks to FryAmTheEggman. 8 saved thanks to DSM.

r=''
for x,y in zip(input(),input()):
 if x==y:r+=x
 else:break
print(r)

What would Python programmers do without zip? :D
Beta Decay

7
@BetaDecay Our fly would be open all the time.
Morgan Thrapp

You could put the input()s in the zip and save the a and b binding.
DSM

@DSM Ooo, good point. Thanks!
Morgan Thrapp

4

Python 3, 47

def f(w):[print(end=c[c!=d])for c,d in zip(*w)]

A function that takes a list w of two words, and prints the common prefix before terminating with an error.

Python 3's print function lets you print strings flush against each other with print(end=c) (thanks to Sp3000 for saving 3 bytes with this shorter syntax). This repeatedly takes two letters from the words, and prints the first of the letters. The indexing c[c!=d] gives an out-of-bounds error where c!=d, terminating the execution when two unequal letters are encountered.

An explicit for loop is one char longer than the list comprehension:

def f(w):
 for c,d in zip(*w):print(end=c[c!=d])

Wow! I hadn't even thought about using a function! Nice one. +1
Zach Gates

Only saw this now, but how about print(end=c[c!=d])?
Sp3000

1
@Sp3000 Wow, I never connected that the main argument to print being optional meant it could be called with only the end argument, and that could contain the string. That's a really useful trick in general. You should make a tip.
xnor

3

Javascript ES6, 52 bytes

f=(a,b)=>[...a].filter((e,i)=>e==b[i]?1:b='').join``

Usage:

>> f("global","glossary")
"glo"

Does not work with ada,aca...
flawr

Whoops, fixed. Forgot to kill filtering after the strings no longer match.
Dendrobium

1
You don't need to name the function, so you can leave out the f=
Ypnypn

1
you can do it smaller with map (a,b)=>[...a].map((e,i)=>e==b[i]?e:b='').join``
Shaun H

2

Retina, 14 bytes

Uses the same idea as kirbyfan64sos. Unfortunately, despite Martin's claim that eventually Match mode will feature a way to print capturing groups, it hasn't been implemented yet. Otherwise, (.*).* \1 could be used along with 2 bytes or so for some not-yet-existing configuration string option.

(.*).* \1.*
$1

Each line would go in its own file, with 1 byte added per additional file. Alternatively, run in a single file with the -s flag.


The equivalent regex fails to match in vim due to greediness (and a non-greedy regex will match the shortest substring, i.e. blank), are you sure it works?
Random832

@Random832 Try using this regex replace tester, with the .NET option checked. Set the operation to "replace", and put the patterns in the correct boxes. It doesn't fail to match if there should be one. How could it possibly fail due to greediness? That's the only reason it works. \1 ensures that both words start with the same prefix. So no matter how greedy (.*) is, \1 is the same.
mbomb007

In vim it refuses to match at all - I think it is finding a longer string for the first (.*), then failing to match it against \1, then not properly backtracking to shorter strings.
Random832

@Random832 Then you need to find something else to test your regexes on.
mbomb007

2

K, 24 bytes

{(+/&\=/(&/#:'x)#'x)#*x}

Find the minimum of the length of each string. ((&/#:'x)). Trim each string to that length (#'x). Then compare, smear and sum the resulting sequence:

  =/("globaa";"glossa")
1 1 1 0 0 1
  &\=/("globaa";"glossa")
1 1 1 0 0 0
  +/&\=/("globaa";"glossa")
3

Finally, take that many characters from the first of the strings provided (#*x).

In action:

 f: {(+/&\=/(&/#:'x)#'x)#*x};
 f'(("global";"glossary")
    ("department";"depart")
    ("glove";"dove")
    ("aaa";"aaaaa")
    ("identical";"identical")
    ("aca";"aba"))
("glo"
 "depart"
 ()
 "aaa"
 "identical"
 ,"a")
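
The compare, smear, and sum pipeline maps onto `itertools.accumulate` in Python. The sketch below follows the same four steps; the function name is an assumption for illustration.

```python
from itertools import accumulate
from operator import and_


def prefix_k_style(a: str, b: str) -> str:
    """Trim, compare, running-AND, sum, then take that many characters."""
    n = min(len(a), len(b))                    # minimum of the two lengths
    equal = [int(x == y) for x, y in zip(a[:n], b[:n])]  # elementwise compare
    smeared = list(accumulate(equal, and_))    # running AND zeroes everything
                                               # after the first mismatch
    return a[:sum(smeared)]                    # count of leading matches


print(prefix_k_style("globaa", "glossa"))  # glo
```

For "globaa" and "glossa" the intermediate lists are `[1, 1, 1, 0, 0, 1]` and `[1, 1, 1, 0, 0, 0]`, matching the K session above.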

2

Powershell, 65 bytes

Compare the strings, shrinking the first until it either matches (print and exit) or the string is null and the loop terminates.

param($a,$b)while($a){if($b-like"$a*"){$a;exit}$a=$a-replace".$"}
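
The shrinking loop, sketched in Python with an illustrative function name. One difference worth noting: this version does an exact, case-sensitive `startswith` test, whereas PowerShell's `-like` is a wildcard match and is case-insensitive by default, so the two may disagree on inputs containing wildcard characters or mixed case.

```python
def prefix_by_shrinking(a: str, b: str) -> str:
    """Drop the last character of a until b starts with what is left."""
    while a:
        if b.startswith(a):
            return a
        a = a[:-1]  # same effect as removing the final character
    return ""


print(prefix_by_shrinking("global", "glossary"))  # glo
```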

2

Julia, 62 bytes

f(a,b)=(c="";for(i,j)=zip(a,b) i!=j?break:(c*=string(i))end;c)

Ungolfed:

function f(a::AbstractString, b::AbstractString)
    # Initialize an output string
    c = ""

    # Iterate over the pairs of characters in a and b,
    # truncated to the shorter of the two lengths
    for (i, j) in zip(a, b)
        if i == j
            # If they match, append to the output string
            c *= string(i)
        else
            # Otherwise stop everything!
            break
        end
    end

    return c
end

Fixed an issue (at the hefty cost of 14 bytes) thanks to xnor!


2

C99, 73 bytes

main(int c,char *a[]){for(char *x=a[1],*y=a[2];*x==*y++;putchar(*x++));}

Similar to this answer, but shorter and meets spec (takes input from stdin).


Spec doesn't say input has to come from stdin. This is actually longer than the other answer if you add #include<stdio.h>, which is necessary for the program to compile.
musarithmia

@AndrewCashner - It doesn't need to be on stdin, but it does need to take input. The other answer is hard-coded. Also, gcc whines about the implicit usage, but it compiles fine without the include.
Comintern

Much shorter without the temporaries: main(int c,char**a){for(;*a[1]==*a[2]++;putchar(*a[1]++));} (59 bytes).
Toby Speight

2

MATLAB, 50 40 bytes

Defines a function that accepts 2 strings as input, outputs to command window

function t(a,b);a(1:find([diff(char(a,b)) 1],1)-1)

This solution will work for any string, outputs

ans =

   Empty string: 1-by-0

if no match is given.

This can be golfed further by using a script instead of a function (with a and b as local variables), which saves 16 bytes for a total of 34 bytes:

a(1:find([diff(char(a,b)) 1],1)-1)

The function style (which seems to be the accepted style), yields

@(a,b)a(1:find([diff(char(a,b)) 1],1)-1)

(Thanks @Stewie Griffin)


40 bytes: @(a,b)a(1:find([diff(char(a,b)) 1],1)-1). =)
Stewie Griffin

2

Perl 6, 28 bytes

I came up with two that take their values from STDIN which are based on the Perl 5 answer.

lines~~/(.*).*' '$0/;say ~$0
lines~~/:s(.*).* $0/;say ~$0

The first requires exactly one space between the inputs, while the other requires at least one whitespace character between the inputs.


That is quite a bit shorter than the first thing I tried which takes the values from the command line.

say [~] map ->($a,$b){$a eq$b&&$a||last},[Z] @*ARGS».comb # 58 bytes

or even the lambda version of it:

{[~] map ->($a,$b){$a eq$b&&$a||last},[Z] @_».comb} # 52 bytes

Though this is much easier to adjust so that it accepts any number of input strings, at the cost of only one stroke.

{[~] map ->@b {([eq] @b)&&@b[0]||last},[Z] @_».comb} # 53 bytes
#          ┗━┛ ┗━━━━━━━┛  ┗━━━┛
my &common-prefix = {[~] map ->@b {([eq] @b)&&@b[0]||last},[Z] @_».comb}

say common-prefix <department depart>; # "depart"
say common-prefix; # ""
say common-prefix <department depart depot deprecated dependant>; # "dep"

# This code does not work directly with a single argument, so you have
# to give it an itemized List or Array, containing a single element.

say common-prefix $('department',); # "department"

# another option would be to replace `@_` with `(@_,)`
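
The "any number of strings" variant has a compact Python analogue, since `zip(*strings)` walks the inputs column by column. This is a sketch with a made-up function name; unlike the Perl 6 lambda, it needs no special handling for a single argument.

```python
def common_prefix_many(*strings: str) -> str:
    """Common prefix of any number of strings (empty for no arguments)."""
    out = []
    # Each column holds the i-th character of every string; zip stops at
    # the shortest string.
    for column in zip(*strings):
        if len(set(column)) != 1:  # characters in this column disagree
            break
        out.append(column[0])
    return "".join(out)


print(common_prefix_many("department", "depart"))  # depart
print(common_prefix_many("department", "depart", "depot",
                         "deprecated", "dependant"))  # dep
```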

2

Japt, 27 bytes

Japt is a shortened version of JavaScript. Interpreter

Um$(X,Y)=>$A&&X==VgY ?X:A=P

(The strings go into the Input box like so: "global" "glossary")

This code is exactly equivalent to the following JS:

A=10;(U,V)=>U.split``.map((X,Y)=>A&&X==V[Y]?X:A="").join``

I have not yet implemented anonymous functions, which is what the $...$ is for: anything between the dollar signs is left untouched in the switch to JS. After I add functions, this 21-byte code will suffice:

UmXY{A&&X==VgY ?X:A=P

And after I implement a few more features, it will ideally be 18 bytes:

UmXY{AxX=VgY ?X:AP

Suggestions welcome!


So it turns out that this program is only 15 bytes in modern Japt:

¡A©X¥VgY ?X:A=P

Try it online!


2

MATL, 11 9 bytes

y!=XdYpf)

Try it online!

(-2 bytes thanks to Giuseppe)

 y  % implicitly input the two strings, then duplicate the
    %  first one into the stack again
    %  stack: ['department' 'deported' 'department']
 !  % transpose the last string into a column vector
 =  % broadcast equality check - gives back a matrix comparing
    %  every letter in first input with the letters in the second
 Xd % diagonal of the matrix - comparison result of each letter with
    %  only corresponding letter in the other string
    %  stack: ['department' [1; 1; 1; 0; 1; 1; 0; 0;]]
 Yp % cumulative product (so only initial sequence of 1s remains
    %  1s, others become 0)
    %  stack: ['department' [1; 1; 1; 0; 0; 0; 0; 0;]]
 f  %  find the indices of the 1s
 )  % index at those elements so we get those letters out
    % (implicit) convert to string and display

Thanks! The y idea is pretty good, I'd tried things like an initial iti instead of the 1Gw, but didn't think of using the y for that.
sundar - Reinstate Monica

1

Clojure/ClojureScript, 51

(defn f[[a & b][c & d]](if(= a c)(str a(f b d))""))

Pretty straightforward. Unfortunately the spaces around the parameter destructuring are necessary (that's the [a & b] stuff). Not the shortest but I beat some other answers in languages that like to brag about their terseness so I'll post it.


1

Python 2, 50 bytes

for a,b in zip(*input()):print(1/0if a!=b else a),

Input

The input is taken as two strings:

"global", "glossary"

Output

The output is each character followed by a space; which, hopefully, isn't a problem. However, if it is, I'll edit my answer.

g l o 

I'm pretty sure this is invalid; the spec clearly gave the output format as a string without spaces.
lirtosiast

Well yes, but the input was also given in the format "global" , "glossary" (two separate strings).. How many other answers follow that to the letter? @ThomasKwa
Zach Gates

"takes two strings" is the language used by OP; usually when something like that is mentioned without any qualifiers, it refers to one of our default I/O, which means we can take one string from the command line and one from STDIN, or an array of two strings, or whatever else follows those rules.
lirtosiast

I think you're taking my answer a bit too seriously. This is just a fun submission and my best attempt at beating a built-in. If OP doesn't like the output format, so be it; I'll remove my answer. @ThomasKwa
Zach Gates

How about print(exit()if a!=b else a,end='')? I don't know if that'll work or not, but it might
Beta Decay

1

TeaScript, 16 bytes 20

xf»l¦y[i]?1:b=0)

Takes each input separated by a space.


1

PHP, 52 bytes

Not spectacular but does the job:

$a=$argv;while($a[1][$i]==$a[2][$i])echo$a[1][$i++];

Takes two command line arguments:

php prefix.php department depart

PHP7 lets you save another byte while(($a=$argv)[1][$i]==$a[2][$i])echo$a[1][$i++]; - Another PHP7 only solution (and best I could come up with @ 50 bytes) <?=substr(($a=$argv)[1],0,strspn($a[1]^$a[2],~ÿ)); - Make sure your editor is in ascii mode, it's important the ~ÿ does not get converted to unicode.
Leigh
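
The XOR-and-strspn trick in the last snippet can be illustrated in Python on byte strings. XORing two equal bytes gives 0, so the number of leading NUL bytes in the XOR result is the length of the common prefix (`~ÿ` in the PHP is the NUL byte). The function name below is hypothetical.

```python
def prefix_xor(a: bytes, b: bytes) -> bytes:
    """Common prefix via byte-wise XOR and a count of leading zero bytes."""
    # zip truncates to the shorter input, as PHP's string XOR does.
    xored = bytes(x ^ y for x, y in zip(a, b))
    # Number of leading NUL bytes, i.e. what strspn(..., "\0") would return.
    matching = len(xored) - len(xored.lstrip(b"\x00"))
    return a[:matching]


print(prefix_xor(b"department", b"depart"))  # b'depart'
```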
Licensed under cc by-sa 3.0 with attribution required.