Alphabet Completion Rate


32

Introduction

How much of the English alphabet does a given string use? The previous sentence uses 77%. It has 20 unique letters (howmucftenglisapbdvr), and 20/26 ≃ 0.77.

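Challenge

For an input string, return the percentage of letters of the English alphabet present in the string.

  • The answer can be in percentage or in decimal form.

  • The input string can contain upper and lower case letters, as well as punctuation. However, you can assume it has no diacritics or accented characters.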

Test cases

Input:

"Did you put your name in the Goblet of Fire, Harry?" he asked calmly.

Some valid outputs:

77%, 76.9, 0.7692

Input:

The quick brown fox jumps over the lazy dog

All valid outputs:

100%, 100, 1

The expected output for "@#$%^&*?!" and "" is 0.
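The test cases above can be checked against a plain, ungolfed reference. This is only a sketch and is not taken from any answer; the name `alphabet_completion` and the two output formats printed are illustrative choices.

```python
import string

def alphabet_completion(text: str) -> float:
    """Return the fraction of the 26 English letters that occur in text."""
    # Lowercase first so that 'A' and 'a' count as the same letter.
    seen = set(text.lower())
    # Intersect with a-z to drop digits, spaces and punctuation.
    return len(seen & set(string.ascii_lowercase)) / 26

cases = {
    '"Did you put your name in the Goblet of Fire, Harry?" he asked calmly.': 20 / 26,
    "The quick brown fox jumps over the lazy dog": 1.0,
    "@#$%^&*?!": 0.0,
    "": 0.0,
}

for text, expected in cases.items():
    ratio = alphabet_completion(text)
    assert abs(ratio - expected) < 1e-9
    # Show the result in decimal form and in percentage form.
    print(f"{ratio:.4f}  {ratio:.0%}  {text!r}")
```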


3
Suggested test cases: "@#$%^&*?!" and ""
Adám

4
If 77% is accepted, would 76.9 or 77 be accepted as well?
Grzegorz Oledzki

Percentages can have a decimal part too...
Jo King

2
@Shaggy The last edit to the OP was 16 hours ago, your answer was 15 hours ago and your comment was 14. I mean, you are right, but ???
Veskah

6
If 20/26 can be rounded to 0.7692, 0.769 or 0.77, can I also round it to 0.8, 1 or 0? ;-)
Noiralef

Answers:


18

Python 3, 42 bytes

lambda s:len({*s.upper()}-{*s.lower()})/26

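Try it online!

Filters all non-alphabetic characters from the string by taking the (set) difference of its uppercase and lowercase representations. It then takes the length of the result and divides by 26.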
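An ungolfed sketch of why the set difference works, assuming (as the challenge does) that the input has no accented characters. The function name and the sample string are illustrative only.

```python
def letters_via_case_difference(s: str) -> set:
    upper = set(s.upper())  # letters become 'A'..'Z'; everything else is unchanged
    lower = set(s.lower())  # letters become 'a'..'z'; everything else is unchanged
    # Digits, spaces and punctuation appear in both sets and cancel out,
    # so only the uppercase letters survive the difference.
    return upper - lower

result = letters_via_case_difference("Hello, World! 123")
print(sorted(result))
print(len(result) / 26)
```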

Python 3, 46 bytes

lambda s:sum(map(str.isalpha,{*s.lower()}))/26

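Try it online!

Counts the unique alphabetic (lowercase) characters and divides by 26. In Python 2 it would require 3 more characters: two for changing {*...} into set(...), and one for making 26 a float (26.) to avoid floor division.

Python 3, 46 bytes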

lambda s:sum('`'<c<'{'for c in{*s.lower()})/26

Try it online!

Same length and essentially the same as the previous one, but without a "built-in" string method.
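The comparison in this version relies on the two ASCII neighbours of the lowercase range. A short sketch (the names `completion_by_range` and `bounds` are illustrative):

```python
def completion_by_range(s: str) -> float:
    # '`' is the character just before 'a' and '{' is the one just after 'z',
    # so the chained comparison is true exactly for 'a'..'z'.
    return sum('`' < c < '{' for c in set(s.lower())) / 26

# Code points of the two bounds and of the letters they enclose.
bounds = (ord('`'), ord('a'), ord('z'), ord('{'))
print(bounds)
print(completion_by_range("Hello, World!"))
```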


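Why does the second one return 1.0 and not 1? (I didn't want to specifically disallow it, so as not to disadvantage particular languages, but I'm curious.)
Teleporting Goat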

10
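@TeleportingGoat In Python 3, division with a single slash always gives a float, even if the operands are integers. For integer division you would use //, but then it would always be integer division, which is obviously not what we want here. It makes sense that the data type of the output does not depend on the specific values of the operands, which means the result is always a float, even when it is a whole number.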
ArBo
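A quick illustration of the two operators described in the comment above (variable names are illustrative):

```python
whole = 26 / 26      # true division always yields a float
floored = 26 // 26   # floor division of two ints yields an int
partial = 20 // 26   # floor division discards the fraction the challenge asks for

print(whole, floored, partial)  # 1.0 1 0
```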

11

MATL, 8 bytes

2Y2jkmYm

Try it at MATL Online

Explanation

2Y2    % Predefined literal for 'abcdefghijklmnopqrstuvwxyz'
j      % Explicitly grab input as a string
k      % Convert to lower-case
m      % Check for membership of the alphabet characters in the string. 
       % Results in a 26-element array with a 1 where a given character in 
       % the alphabet string was present in the input and a 0 otherwise
Ym     % Compute the mean of this array to yield the percentage as a decimal
       % Implicitly display the result
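The same idea (a 26-element membership vector followed by a mean) can be sketched in Python. This is an analogue for illustration, not a translation of the MATL program; the function name is hypothetical.

```python
import string

def completion_by_membership(s: str) -> float:
    lowered = s.lower()
    # One flag per letter of the alphabet: True if that letter occurs in the input.
    flags = [letter in lowered for letter in string.ascii_lowercase]
    # The mean of 26 zero/one flags is the fraction of letters present.
    return sum(flags) / len(flags)

print(completion_by_membership("abc ABC"))
```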

8

Octave / MATLAB, 33 bytes

@(s)mean(any(65:90==upper(s)',1))

Try it online!

Explanation

@(s)                               % Anonymous function with input s: row vector of chars
             65:90                 % Row vector with ASCII codes of uppercase letters
                    upper(s)       % Input converted to uppercase
                            '      % Transform into column vector
                  ==               % Equality test, element-wise with broadcast. Gives a
                                   % matrix containing true and false
         any(                ,1)   % Row vector containing true for columns that have at
                                   % least one entry with value true
    mean(                       )  % Mean
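The broadcast comparison above builds a matrix with one row per input character and one column per letter. A plain-Python sketch of the same steps, written with nested lists instead of real broadcasting (the function name is illustrative):

```python
def completion_by_broadcast(s: str) -> float:
    codes = range(65, 91)  # ASCII codes of 'A'..'Z'
    upper = s.upper()
    # Rows correspond to input characters, columns to letters of the alphabet.
    matrix = [[ord(ch) == code for code in codes] for ch in upper]
    # Reduce down each column: True if any input character equals that letter.
    columns = [any(row[j] for row in matrix) for j in range(len(codes))]
    # Mean of the resulting 26 flags.
    return sum(columns) / len(columns)

print(completion_by_broadcast("Hello"))
```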

7

05AB1E, 8 7 6 bytes

lASåÅA

-1 byte thanks to @LuisMendo.

Try it online or verify some more test cases.

6-byte alternative provided by @Grimy:

láÙg₂/

Try it online or verify some more test cases.

Both programs output as a decimal.

Explanation:

l       # Convert the (implicit) input-string to lowercase
 AS     # Push the lowercase alphabet as character-list
   å    # Check for each if it's in the lowercase input-string
        # (1 if truthy; 0 if falsey)
    ÅA  # Get the arithmetic mean of this list
        # (and output the result implicitly)

l       # Convert the (implicit) input-string to lowercase
 á      # Only leave the letters in this lowercase string
  Ù     # Uniquify it
   g    # Get the amount of the unique lowercase letters by taking the length
    ₂/  # Divide this by 26
        # (and output the result implicitly)

@LuisMendo Alternatively, láêg₂/ is also a 6-byter.
Grimy

1
@LuisMendo Thanks (and you as well, Grimy)! :)
Kevin Cruijssen

7

C# (Visual C# Interactive Compiler), 56 49 bytes

a=>a.ToUpper().Distinct().Count(x=>x>64&x<91)/26f

Try it online!

-6 bytes thanks to Innat3


1
you can save 6 bytes by comparing the decimal values of the characters 50 bytes (Character codes)
Innat3

@Innat3 49 bytes by changing the && to &.
Kevin Cruijssen

@KevinCruijssen ~2 mins off getting the -1 byte credit, already did that and was editing
Expired Data

@ExpiredData Np, it was an obvious golf. Was mainly directing it to Innat :)
Kevin Cruijssen

6

APL (Dyalog Extended), 10 bytes (SBCS)

Anonymous tacit prefix function. Returns decimal fraction.

26÷⍨∘≢⎕A∩⌈

Try it online!

⌈ uppercase

⎕A∩ intersection with the uppercase Alphabet

≢ tally length

∘ then

26÷⍨ divide by twenty-six


⌹∘≤⍨⎕A∊⌈
ngn

@ngn That's very clever, but completely different. Go ahead and post that yourself. I'll be happy to insert the explanation if you want me to.
Adám


6

Perl 6, 27 24 bytes

-3 bytes thanks to nwellnhof

*.uc.comb(/<:L>/).Set/26

Try it online!


1
+1 Also, while this works just fine (and .lc would work too), from a "correctness" standpoint, .fc might be better (particularly if the challenge had non-English letters)
user0721090601

6

Bash and Gnu utils (81 78 68 60 42 bytes)

bc -l<<<`grep -io [a-z]|sort -fu|wc -l`/26

-8 bytes thanks to @wastl

-18 bytes thanks to Nahuel using some tricks I didn't know:

  • sort -f and grep -i ignore case
  • sort -u is a replacement for | uniq

1
60 bytes: echo $(tr A-Z a-z|tr -cd a-z|fold -1|sort -u|wc -l)/26|bc -l
wastl

Right. The variable is a reminder after another attempt. Thanks!
Grzegorz Oledzki


Can't "grep -io [a-z]" be shortened to "grep -o [A-z]" ?
Gnudiff

@Gnudiff Assuming ASCII, that would also match all of [\]^_`.
jnfnt
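The point of the comment above can be checked with ASCII codes. The sketch below uses Python's `re` module rather than grep (whose range handling can depend on the locale), so it only illustrates the ASCII argument; all names are illustrative.

```python
import re

# Characters whose ASCII codes lie strictly between 'Z' (90) and 'a' (97).
between = [chr(code) for code in range(ord("Z") + 1, ord("a"))]

# Each of them is matched by the range [A-z] but not by [A-Za-z].
matched_by_wide_range = [ch for ch in between if re.fullmatch(r"[A-z]", ch)]
matched_by_letters = [ch for ch in between if re.fullmatch(r"[A-Za-z]", ch)]

print(between)
```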

6

K (oK), 19 15 bytes

Solution:

1%26%+/26>?97!_

Try it online!

Explanation:

Convert input to lowercase, modulo 97 ("a-z" is 97-122 in ASCII, modulo 97 gives 0-25), take unique, sum up results that are lower than 26, and convert to the percentage of 26.

1%26%+/26>?97!_ / the solution
              _ / lowercase
           97!  / modulo (!) 97
          ?     / distinct
       26>      / is 26 greater than this?
     +/         / sum (+) over (/)
  26%           / 26 divided by ...
1%              / 1 divided by ...
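A Python sketch of the modulo-97 idea, for readers who do not know K. It assumes printable ASCII input: every printable non-letter leaves a residue of 26 or more, while control characters such as a newline (code 10) would be miscounted. The function name is illustrative.

```python
def completion_mod_97(s: str) -> float:
    # 'a'..'z' are 97..122, so modulo 97 maps them to 0..25.
    # Taking a set of the residues plays the role of "distinct".
    residues = {ord(c) % 97 for c in s.lower()}
    # Count the residues below 26 and divide by 26.
    return sum(r < 26 for r in residues) / 26

print(completion_mod_97("Hello, World!"))
```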

Notes:

  • -1 bytes thanks to ngn, 1-%[;26] => 1-1%26%
  • -3 bytes inspired by ngn #(!26)^ => +/26>?

1
I'm looking forward to the explanation! I have no idea what that 97 is doing here
Teleporting Goat


1
%[;26] -> 1%26%
ngn



6

PowerShell, 55 52 bytes

($args|% *per|% t*y|sort|gu|?{$_-in65..90}).count/26

Try it online!

First attempt, still trying random ideas

EDIT: @Veskah pointed out ToUpper saves a byte due to the number range, also removed extra () and a space

Expansion:
($args|% ToUpper|% ToCharArray|sort|get-unique|where{$_-in 65..90}).count/26

Changes the string to all uppercase, expands it to an array, sorts the elements and selects the unique letters (gu needs sorted input), keeps only characters with ASCII values 65 to 90 (A to Z), counts them and divides by 26 for the decimal output.
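The remark that gu needs sorted input is the key step: it only drops adjacent duplicates. A Python sketch of the same pipeline, using `itertools.groupby` as a stand-in for that behaviour (an analogy, not PowerShell's implementation):

```python
from itertools import groupby

def completion_sorted_unique(s: str) -> float:
    chars = sorted(s.upper())  # sorting puts equal characters next to each other
    # groupby collapses runs of adjacent equal items, like a unique filter on sorted input.
    unique = [key for key, _ in groupby(chars)]
    # Keep only code points 65..90 ('A'..'Z'), count them and divide by 26.
    return sum(65 <= ord(c) <= 90 for c in unique) / 26

print(completion_sorted_unique("Hello, World!"))
```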



1
oh, just noticed you have an extra space after -in.
Veskah

6

R, 47 bytes

function(x)mean(65:90%in%utf8ToInt(toupper(x)))

Try it online!

Converts to upper case then to ASCII code-points, and checks for values 65:90 corresponding to A:Z.


1
This fails when there are quotes in the input.
C. Braun

1
@C.Braun Not in my tests... For instance, the first test case on TIO includes quotes and gives the correct result. Could you give an example?
Robin Ryder

1
I do not quite understand what you have done in the header part on TIO, but running just the code above in an R interpreter does not work. You seem to be redefining scan to not split on quotation marks, like the default does?
C. Braun

1
@C.Braun Got it, thanks! I've explicitly made it into a function (at a cost of 3 bytes) and I think it's OK now.
Robin Ryder

4

Retina 0.8.2, 45 bytes

T`Llp`ll_
+`(.)(.*\1)
$2
.
100$*
^
13$*
.{26}

Try it online! Link includes test cases. Explanation:

T`Llp`ll_

Lowercase letters and delete punctuation.

+`(.)(.*\1)
$2

Deduplicate.

.
100$*

Multiply by 100.

^
13$*

Add 13.

.{26}

Integer divide by 26 and convert to decimal.
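Adding 13 (half of 26) before the integer division turns truncation into rounding to the nearest whole percent. A small Python check of that arithmetic, with an illustrative function name:

```python
import math

def rounded_percent(unique_letters: int) -> int:
    # (100 * n + 13) // 26 equals floor(100 * n / 26 + 0.5).
    return (100 * unique_letters + 13) // 26

# Compare against ordinary rounding for every possible letter count.
for n in range(27):
    assert rounded_percent(n) == math.floor(100 * n / 26 + 0.5)

print(rounded_percent(20))  # 77
```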


I think retina is the only language here using percentages for the output!
Teleporting Goat

Oh, nice trick with adding unary 13 before dividing! Why didn't I think of that.. >.> It would make my answer 44 bytes. I'll still leave my previous version, though.
Kevin Cruijssen

@TeleportingGoat Probably because Retina is also the only language from the ones posted thus far which doesn't have decimal division available. Only (unary) integer-division is possible.
Kevin Cruijssen

4

APL (Dyalog Extended), 8 bytes

⌹∘≤⍨⎕A∊⌈

Try it online!

loosely based on Adám's answer

⌈ uppercase

⎕A∊ boolean (0 or 1) vector of length 26 indicating which letters of the English Alphabet are in the string

⌹∘≤⍨ arithmetic mean, i.e. matrix division of the argument and an all-1 vector of the same length
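Why dividing by an all-1 vector gives the mean: the least-squares solution x of ones·x ≈ v satisfies the normal equation (ones·ones)·x = ones·v, so x = sum(v)/len(v). A pure-Python sketch of that identity (it does not model APL's ⌹ itself; the function name is illustrative):

```python
def mean_via_least_squares(v):
    ones = [1] * len(v)
    # Normal equation for a single unknown: (ones . ones) * x = (ones . v).
    numerator = sum(o * b for o, b in zip(ones, v))
    denominator = sum(o * o for o in ones)
    return numerator / denominator

flags = [1, 0, 1, 1]
print(mean_via_least_squares(flags))  # 0.75
```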


3

Charcoal, 11 bytes

I∕LΦβ№↧θι²⁶

Try it online! Link is to verbose version of code. Output is as a decimal (or 1 for pangrams). Explanation:

  L         Length of
    β       Lowercase alphabet
   Φ        Filtered on
     №      Count of
        ι   Current letter in
      ↧     Lowercased
       θ    Input
 ∕          Divided by
         ²⁶ Literal 26
I           Cast to string
            Implicitly printed

3

Batch, 197 bytes

@set/ps=
@set s=%s:"=%
@set n=13
@for %%c in (A B C D E F G H I J K L M N O P Q R S T U V W X Y Z)do @call set t="%%s:%%c=%%"&call:c
@cmd/cset/an/26
@exit/b
:c
@if not "%s%"==%t% set/an+=100

Takes input on STDIN and outputs a rounded percentage. Explanation:

@set/ps=

Input the string.

@set s=%s:"=%

Strip quotes, because they're a headache to deal with in Batch.

@set n=13

Start with half a letter for rounding purposes.

@for %%c in (A B C D E F G H I J K L M N O P Q R S T U V W X Y Z)do @call set t="%%s:%%c=%%"&call:c

Delete each letter in turn from the string. Invoke the subroutine to check whether anything changed, because of the way Batch parses variables.

@cmd/cset/an/26

Calculate the result as a percentage.

@exit/b
:c

Start of subroutine.

@if not "%s%"==%t% set/an+=100

If deleting a letter changed the string then increment the letter count.
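The whole procedure can be sketched in Python. Batch's %s:X=% substitution matches case-insensitively, which the sketch emulates by deleting both cases of each letter; the function name is illustrative.

```python
import string

def rounded_percent_by_deletion(s: str) -> int:
    s = s.replace('"', "")  # strip quotes, as the Batch script does
    n = 13                  # start with half a letter for rounding
    for letter in string.ascii_uppercase:
        # Delete the letter in both cases and see whether anything changed.
        t = s.replace(letter, "").replace(letter.lower(), "")
        if t != s:
            n += 100
    return n // 26          # integer division gives the rounded percentage

print(rounded_percent_by_deletion('"Did you put your name in the Goblet of Fire, Harry?" he asked calmly.'))
```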


3

Pepe, 155 138 bytes

rEeEeeeeeEREeEeEEeEeREERrEEEEErEEEeReeReRrEeeEeeeeerEEEEREEeRERrErEErerREEEEEeREEeeRrEreerererEEEEeeerERrEeeeREEEERREeeeEEeEerRrEEEEeereEE

Try it online! Output is in decimal form.

Explanation:

rEeEeeeeeE REeEeEEeEe # Push 65 -> (r), 90 -> (R)
REE # Create loop labeled 90 // creates [65,66,...,89,90]
  RrEEEEE # Increment (R flag: preserve the number) in (r)
  rEEEe # ...then move the pointer to the last
Ree # Do this while (r) != 90

Re # Pop 90 -> (R)
RrEeeEeeeee rEEEE # Push 32 and go to first item -> (r)
REEe # Push input -> (R)
RE RrE # Push 0 on both stacks, (r) prepend 0
rEE # Create loop labeled 0 // makes input minus 32, so the
    # lowercase can be accepted, since of rEEEEeee (below)
  re # Pop 0 -> (r)
  rREEEEEe REEee # Push item of (R) minus 32, then go to next item 
  RrE # Push 0 -> (R)
ree # Do while (R) != 0

rere # Pop 0 & 32 -> (r)
rEEEEeee # Remove items from (r) that don't occur in (R)
         # Remove everything from (r) except the unique letters
rE # Push 0 -> (r)
RrEeee # Push reverse pointer pos -> (r)
REEEE # Move pointer to first position -> (R)
RREeeeEEeEe # Push 26 -> (R)
rRrEEEEee reEE # Divide it and output it

Since Pepe is only a 4 command language really it's like 34.5 bytes if you encoded it as 2 bits per r e R E?
Expired Data


3

Retina, 57 46 35 bytes

.
$L
[^a-z]

D`.
.
100*
^
13*
_{26}

-11 bytes taking inspiration from @Neil's trick of adding unary 13 before dividing.
Another -11 bytes thanks to @Neil directly.
Rounds (correctly) to a whole integer.

Try it online.

57 46 40 bytes version which works with decimal output:

.
$L
[^a-z]

D`.
.
1000*
C`_{26}
-1`\B
.

Same -11 bytes as well as an additional -6 bytes thanks to @Neil.

Outputs with one truncated decimal after the comma (i.e. 0.1538 (4/26) is output as 15.3 instead of 15.4). This is done by calculating 1000 × unique_letters / 26 with integer division and then inserting the decimal dot manually.

Try it online.

Explanation:

Convert all letters to lowercase:

.
$L

Remove all non-letters:

[^a-z]

Uniquify all letters:

D`.

Replace every unique letter with 1000 underscores:

.
1000*

Count the amount of times 26 adjacent underscores fit into it:

C`_{26}

Insert a dot at the correct place:

-1`\B
.
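The arithmetic behind the decimal version can be sketched in Python. The single-digit case (no dot inserted when the quotient is 0) is my reading of how \B behaves on a one-character string, so treat it as an assumption; the function name is illustrative.

```python
def truncated_percent(unique_letters: int) -> str:
    # Count how many times 26 fits into 1000 * n: the percentage in tenths, truncated.
    tenths = 1000 * unique_letters // 26
    digits = str(tenths)
    if len(digits) < 2:
        return digits  # assumption: nothing is inserted into a single digit
    # Insert the decimal dot before the last digit.
    return digits[:-1] + "." + digits[-1]

print(truncated_percent(4))  # 15.3
```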

1
The .* could just be . for a 1 byte saving, but you can save another 10 bytes by using Deduplicate instead of doing it manually!
Neil

@Neil Ah, didn't know about the D-builtin, thanks! And not sure why I used .* instead of . Thanks for -11 bytes in both versions! :)
Kevin Cruijssen

1
FYI I had a slightly different approach for the same byte count: Try it online!
Neil

1
For the decimal version I found that -1`\B matches the desired insertion position directly.
Neil

@Neil Thanks again.
Kevin Cruijssen

3

Java 8, 62 59 bytes

s->s.map(c->c&95).distinct().filter(c->c%91>64).count()/26.

-3 bytes thanks to @OlivierGrégoire.

Try it online.

Explanation:

s->                     // Method with IntStream as parameter and double return-type
  s.map(c->c&95)        //  Convert all letters to uppercase
   .distinct()          //  Uniquify it
   .filter(c->c%91>64)  //  Only leave letters (unicode value range [65,90])
   .count()             //  Count the amount of unique letters left
    /26.                //  Divide it by 26.0
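Why c&95 followed by c%91>64 isolates letters: the mask clears the bit worth 32 (and the bit worth 128), which maps 'a'..'z' onto 'A'..'Z' and leaves every masked value in 0..95; the modulo then sends 91..95 to 0..4, so only 65..90 remain above 64. A Python check over the ASCII range (the function name is illustrative):

```python
def is_letter_after_mask(code: int) -> bool:
    upper = code & 95       # 'a'..'z' (97..122) become 'A'..'Z' (65..90)
    return upper % 91 > 64  # true only for 65..90

# Compare with a direct letter test for every ASCII code point.
mismatches = [c for c in range(128)
              if is_letter_after_mask(c) != chr(c).isalpha()]
print(mismatches)  # []
```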


@OlivierGrégoire Thanks! I always forget about c&95 in combination with c%91>64 for some reason. I think you've already suggested that golf a few times before to me.
Kevin Cruijssen

Yes, I already suggested those, but that's OK, no worries ;-)
Olivier Grégoire

Way longer, but more fun: s->{int r=0,b=0;for(var c:s)if((c&95)%91>64&&b<(b|=1<<c))r++;return r/26.;} (75 bytes)
Olivier Grégoire
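The bitmask variant in the comment above can be sketched in Python. Java's int shift uses only the low five bits of the shift distance, so 1<<c for a letter sets the same bit for both cases; the sketch makes that explicit with `code % 32`. The function name is illustrative.

```python
def completion_bitmask(s: str) -> float:
    seen = 0   # one bit per letter already encountered
    count = 0  # number of distinct letters
    for ch in s:
        code = ord(ch)
        if (code & 95) % 91 > 64:        # same letter test as the Java answer
            bit = 1 << (code % 32)       # 'A' and 'a' both map to bit 1, 'Z' and 'z' to bit 26
            if not seen & bit:           # first time this letter appears
                seen |= bit
                count += 1
    return count / 26

print(completion_bitmask("aA bB"))
```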

3

Julia 1.0, 34 bytes

s->sum('a':'z'.∈lowercase(s))/26

Uses the vectorized version of the ∈ operator, checking containment in the string for all characters in the range from a to z. Then sums over the resulting BitArray and divides by total number of possible letters.

Try it online!


Welcome and great first answer!
mbomb007



2

Stax, 9 bytes

░║üy$}╙+C

Run and debug it


1
You can take a byte off the unpacked version by dropping u and using |b, but the savings disappear under packing. I might have an 8-byter, but the online interpreter is being weird and buggy.
Khuldraeseth na'Barya

@Khuldraesethna'Barya: Nice find. I think the bug is probably an array mutation. I'm seeing some of that behavior now. Working on a minimal repro...
recursive

Here's a repro of the problem I guess you're having with |b. It incorrectly mutates its operand rather than making a copy. I've created a github issue for the bug. github.com/tomtheisen/stax/issues/29 As a workaround, |b will work correctly the first time. After that, you may have to reload the page. If you found a different bug, if you can provide a reproduction, I'll probably be able to fix it.
recursive

Stax 1.1.4, 8 bytes. Instructions: unpack, insert v at the start, insert |b after Va, run, remove the first v, remove |b, repack. Yep, that's the bug I found.
Khuldraeseth na'Barya

@Khuldraesethna'Barya: I've released 1.1.5, and I believe this bug is fixed now. You can let me know if you still have trouble. Thanks.
recursive

2

Jelly, 8 bytes

ŒuØAe€Æm

Try it online!

Explanation

Œu       | Convert to upper case
  ØAe€   | Check whether each capital letter is present, returning a list of 26 0s and 1s
      Æm | Mean



1

Japt, 9 bytes

;CoU Ê/26

Try it

;CoU Ê/26     :Implicit input of string U
;C            :Lowercase alphabet
  oU          :Remove the characters not included in U, case insensitive
     Ê        :Length
      /26     :Divide by 26



1

C, 95 bytes

f(char*s){int a[256]={},z;while(*s)a[*s++|32]=1;for(z=97;z<='z';*a+=a[z++]);return(*a*100)/26;}

(note: rounds down)

Alternate decimal-returning version (95 bytes):

float f(char*s){int a[256]={},z;while(*s)a[*s++|32]=1;for(z=97;z<'{';*a+=a[z++]);return*a/26.;}

This borrows some from @Steadybox' answer.


1
Welcome! Good first answer. It might be helpful for people reading your answer if you provide a short explanation of your code or an ungolfed version. It may also be helpful to provide a link to an online interpreter with your runnable code (see some other answers for examples). Many use TIO, and here's the gcc interpreter
mbomb007
Licensed under cc by-sa 3.0 with attribution required.