Occurrences of substring in a string

122

Why is the following algorithm not halting for me? (str is the string I am searching in, findStr is the string I am trying to find.)

String str = "helloslkhellodjladfjhello";
String findStr = "hello";
int lastIndex = 0;
int count = 0;

while (lastIndex != -1) {
    lastIndex = str.indexOf(findStr,lastIndex);

    if( lastIndex != -1)
        count++;

    lastIndex += findStr.length();
}

System.out.println(count);

java string

— Robert Harvey

8

We did a really nice one in Udacity: we used newSTR = str.replace(findStr, ""); and returned count = ((str.length() - newSTR.length()) / findStr.length());

— SolarLunix

Similar question for characters: stackoverflow.com/q/275944/873282

— koppor 2017

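Don't you also want to account for the case where the prefix of the search string is its suffix? In that case I don't think any of the suggested answers would work. Here is an example. In that case you would need a more elaborate algorithm, like Knuth-Morris-Pratt (KMP), which is coded up in the CLRS book.

— Sid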
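A rough sketch of what a KMP-style counter could look like, for anyone curious. This is not the CLRS code; the class and method names (KmpCount, failureTable, countOverlapping) are made up for illustration, and returning 0 for an empty pattern is an assumption. It counts overlapping matches too, so a pattern whose prefix is also its suffix is handled.

```java
class KmpCount {
    // fail[i] = length of the longest proper prefix of pattern[0..i]
    // that is also a suffix of pattern[0..i].
    static int[] failureTable(String pattern) {
        int[] fail = new int[pattern.length()];
        int k = 0;
        for (int i = 1; i < pattern.length(); i++) {
            while (k > 0 && pattern.charAt(i) != pattern.charAt(k)) {
                k = fail[k - 1];
            }
            if (pattern.charAt(i) == pattern.charAt(k)) {
                k++;
            }
            fail[i] = k;
        }
        return fail;
    }

    // Counts every occurrence of pattern in text, overlapping ones included.
    static int countOverlapping(String text, String pattern) {
        if (pattern.isEmpty()) {
            return 0; // assumption: an empty pattern counts as zero matches
        }
        int[] fail = failureTable(pattern);
        int count = 0;
        int k = 0; // number of pattern characters matched so far
        for (int i = 0; i < text.length(); i++) {
            while (k > 0 && text.charAt(i) != pattern.charAt(k)) {
                k = fail[k - 1];
            }
            if (text.charAt(i) == pattern.charAt(k)) {
                k++;
            }
            if (k == pattern.length()) {
                count++;
                k = fail[k - 1]; // fall back so an overlapping match can follow
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countOverlapping("23232", "232")); // 2
        System.out.println(countOverlapping("helloslkhellodjladfjhello", "hello")); // 3
    }
}
```

Note that a plain indexOf loop that advances by one character also finds overlapping matches; the point of KMP is that it never rescans text characters.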

It is not halting for you because, after reaching your "halt" condition (lastIndex == -1), you reset it by incrementing the value of lastIndex (lastIndex += findStr.length();).

— Legna
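To see the reset happen, here is a small trace of the question's loop. The iteration cap (maxIterations) and the class name LoopTrace are assumptions added only so the demo terminates; the loop body is otherwise the one from the question.

```java
import java.util.ArrayList;
import java.util.List;

class LoopTrace {
    // Runs the question's loop for at most maxIterations and records the
    // value of lastIndex at the end of each iteration.
    static List<Integer> trace(String str, String findStr, int maxIterations) {
        List<Integer> seen = new ArrayList<>();
        int lastIndex = 0;
        for (int i = 0; i < maxIterations && lastIndex != -1; i++) {
            lastIndex = str.indexOf(findStr, lastIndex);
            lastIndex += findStr.length(); // runs even when indexOf returned -1
            seen.add(lastIndex);
        }
        return seen;
    }

    public static void main(String[] args) {
        // prints [5, 13, 25, 4, 13, 25, 4]
        System.out.println(trace("helloslkhellodjladfjhello", "hello", 7));
    }
}
```

The 4 in the trace is -1 + 5: the "not found" result is overwritten before the while condition ever sees it, and the search starts over.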

83

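The last line was creating the problem. lastIndex would never be -1 at the loop test, so there would be an infinite loop. This can be fixed by moving the last line of code into the if block.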

String str = "helloslkhellodjladfjhello";
String findStr = "hello";
int lastIndex = 0;
int count = 0;

while(lastIndex != -1){

    lastIndex = str.indexOf(findStr,lastIndex);

    if(lastIndex != -1){
        count ++;
        lastIndex += findStr.length();
    }
}
System.out.println(count);

— codebreach

121

This reply is an exact copy of the post I made an hour before ;)

— Olivier

8

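Note that this might or might not return the result you expect. With substring "aa" and string to search "aaa", the number of occurrences expected may be one (returned by this code), but may be two as well (in this case you'll need "lastIndex++" instead of "lastIndex += findStr.length()"), depending on what you are looking for.

— Stanislav Kniazev 2009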
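A minimal sketch of both behaviours behind one switch. The allowOverlap parameter and the class name OverlapCount are hypothetical, and treating an empty findStr as zero matches is an assumption.

```java
class OverlapCount {
    // Counts occurrences of findStr in str. With allowOverlap the search
    // resumes one character after each match start; otherwise it resumes
    // right after the end of the match.
    static int count(String str, String findStr, boolean allowOverlap) {
        if (findStr.isEmpty()) {
            return 0; // assumption: nothing to count for an empty pattern
        }
        int step = allowOverlap ? 1 : findStr.length();
        int count = 0;
        int lastIndex = str.indexOf(findStr);
        while (lastIndex != -1) {
            count++;
            lastIndex = str.indexOf(findStr, lastIndex + step);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(count("aaa", "aa", false)); // 1
        System.out.println(count("aaa", "aa", true));  // 2
    }
}
```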

@olivier didn't see that... :( @stan that's absolutely correct... I was just fixing the code in the problem... guess it depends on what bobcom means by the number of occurrences in the string...

— codebreach

1

When are people going to learn to wrap stuff like this in a copy-and-paste static method? See my answer below; it's also more optimized.

— mmm

1

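The moral here is that if you intend to write an answer, check first whether someone else has already written the exact same answer. There's really no benefit in having the same answer appear twice, regardless of whether your answer was copied or written independently.

— Dawood ibn Kareem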

191

How about using StringUtils.countMatches from Apache Commons Lang?

String str = "helloslkhellodjladfjhello";
String findStr = "hello";

System.out.println(StringUtils.countMatches(str, findStr));

That outputs:

3

— A_M

9

No matter how relevant this suggestion is, it can't be accepted as the solution because it doesn't answer the OP's question.

— kommradHomer

3

Is this deprecated or something? My IDE is not recognising it.

— Vamsi Pavan Mahesh '18 / 07/18

@VamsiPavanMahesh StringUtils is a library from Apache Commons. Check here: commons.apache.org/proper/commons-lang/javadocs/api-2.6/org/...

— Anup

This answer is a copy of Peter Lawrey's answer from a day before (see below).

— Zon

StringUtils has no countMatches method.

— plaidshirt

117

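Your lastIndex += findStr.length(); was placed outside the brackets, causing an infinite loop (when no occurrence was found, lastIndex was set back to -1 + findStr.length(), so it was never -1 when the loop condition was checked).

Here is the fixed version: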

String str = "helloslkhellodjladfjhello";
String findStr = "hello";
int lastIndex = 0;
int count = 0;

while (lastIndex != -1) {

    lastIndex = str.indexOf(findStr, lastIndex);

    if (lastIndex != -1) {
        count++;
        lastIndex += findStr.length();
    }
}
System.out.println(count);

— Olivier

92

A shorter version. ;)

String str = "helloslkhellodjladfjhello";
String findStr = "hello";
System.out.println(str.split(findStr, -1).length-1);

— Peter Lawrey

8

return haystack.split(Pattern.quote(needle), -1).length - 1; if, for instance, needle=":)"

— Mr_and_Mrs_D

2

@lOranger Without the ,-1 it will drop trailing matches.

— Peter Lawrey
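A small sketch of the difference the limit argument makes. The class and method names are made up for the demo, and Pattern.quote is added here as an assumption so that findStr is treated literally.

```java
import java.util.regex.Pattern;

class SplitLimitDemo {
    // split with limit -1 keeps trailing empty strings
    static int countWithLimit(String str, String findStr) {
        return str.split(Pattern.quote(findStr), -1).length - 1;
    }

    // split without a limit discards trailing empty strings
    static int countWithoutLimit(String str, String findStr) {
        return str.split(Pattern.quote(findStr)).length - 1;
    }

    public static void main(String[] args) {
        String str = "helloslkhellodjladfjhello";
        System.out.println(countWithLimit(str, "hello"));    // 3
        System.out.println(countWithoutLimit(str, "hello")); // 2
    }
}
```

With -1 the pieces are "", "slk", "djladfj", ""; without it the last empty piece is dropped, so the match at the end of the string goes uncounted.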

3

Ouch, thanks, good to know! This will teach me to read the small print in the javadoc...

— LaurentGrégoire

4

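Nice! But it includes only non-overlapping matches, no? E.g. matching "aa" in "aaa" will return 1, not 2? Of course, including overlapping or non-overlapping matches are both valid and depend on user requirements (perhaps a flag to indicate whether to count overlaps, yes/no)?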

— Cornel Masson 2013

2

-1 .. try running this on "aaaa" and "aa". The correct answer is 3, not 2.

— Kalyanaraman Santhanam

79

Do you really have to handle the matching yourself? Especially if all you need is the number of occurrences, regular expressions are tidier:

String str = "helloslkhellodjladfjhello";
Pattern p = Pattern.compile("hello");
Matcher m = p.matcher(str);
int count = 0;
while (m.find()){
    count +=1;
}
System.out.println(count);

— Jean

1

This does NOT find special characters; it will find a count of 0 for the strings below: String str = "hel+loslkhel+lodjladfjhel+lo"; Pattern p = Pattern.compile("hel+lo");

— Ben

13

Yes it will, if you express your regex correctly. Try with Pattern.compile("hel\\+lo"); the + sign has a special meaning in a regex and needs to be escaped.

— Jean

4

If what you are looking for is to take an arbitrary String and use it as an exact match with all special regular expression characters ignored, Pattern.quote(str) is your friend!

— Mike Furtak 2015
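A minimal sketch of the regex counter with Pattern.quote applied, using the "hel+lo" input from the comments above. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedRegexCount {
    // Counts non-overlapping occurrences of literal in str; Pattern.quote
    // makes every character of literal match itself.
    static int count(String str, String literal) {
        Matcher m = Pattern.compile(Pattern.quote(literal)).matcher(str);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(count("hel+loslkhel+lodjladfjhel+lo", "hel+lo")); // 3
    }
}
```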

2

This does not work for "aaa" when str = "aaaaaa". There are 4 matches, but your answer gives 2.

— Pujan Srivastava
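If overlapping matches are what you want from a regex, one option is a zero-width lookahead. This is only a sketch under the assumption that the search term should be treated literally; the names are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadCount {
    // (?=...) matches at every position where literal starts without
    // consuming it, so overlapping occurrences are each found once.
    static int countOverlapping(String str, String literal) {
        if (literal.isEmpty()) {
            return 0; // assumption: an empty search term counts as zero
        }
        Matcher m = Pattern.compile("(?=" + Pattern.quote(literal) + ")").matcher(str);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countOverlapping("aaaaaa", "aaa")); // 4
    }
}
```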

This solution doesn't work for this case: str = "This is a test \\n\\r string", subStr = "\\r"; it shows 0 occurrences.

— Maksym Ovsianikov

19

I'm very surprised no one has mentioned this one-liner. It's simple, concise, and performs slightly better than str.split(target, -1).length-1

public static int count(String str, String target) {
    return (str.length() - str.replace(target, "").length()) / target.length();
}

— kmecpp

This should be the top answer. Thank you!

— lakam99

12

Here it is, wrapped up in a nice and reusable method:

public static int count(String text, String find) {
        int index = 0, count = 0, length = find.length();
        while( (index = text.indexOf(find, index)) != -1 ) {                
                index += length; count++;
        }
        return count;
}

— mmm

8

String str = "helloslkhellodjladfjhello";
String findStr = "hello";
int lastIndex = 0;
int count = 0;

while((lastIndex = str.indexOf(findStr, lastIndex)) != -1) {
     count++;
     lastIndex += findStr.length() - 1;
}
System.out.println(count);

At the end of the loop, count is 3. Hope it helps.

— DFA

5

The code contains an error. If we search for a single character, findStr.length() - 1 returns 0 and we are in an endless cycle.

— Jan Bodnar 2014

6

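A lot of the given answers fail on one or more of:

Patterns of arbitrary length
Overlapping matches (such as counting "232" in "23232" or "aa" in "aaa")
Regular expression meta-characters

Here's what I wrote: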

static int countMatches(Pattern pattern, String string)
{
    Matcher matcher = pattern.matcher(string);

    int count = 0;
    int pos = 0;
    while (matcher.find(pos))
    {
        count++;
        pos = matcher.start() + 1;
    }

    return count;
}

Example call:

Pattern pattern = Pattern.compile("232");
int count = countMatches(pattern, "23232"); // Returns 2

If you want a non-regular-expression search, just compile your pattern appropriately with the LITERAL flag:

Pattern pattern = Pattern.compile("1+1", Pattern.LITERAL);
int count = countMatches(pattern, "1+1+1"); // Returns 2

— Benk

Yes... surprised there isn't something like this in Apache StringUtils.

— mike rodent

6

public int countOfOccurrences(String str, String subStr) {
  return (str.length() - str.replaceAll(Pattern.quote(subStr), "").length()) / subStr.length();
}

— Maksym Ovsianikov

Nice answer. Would you mind adding some notes on how it works?

— santhosh kumar 2017

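Sure. str is our source string and subStr is a substring. The goal is to calculate the number of occurrences of subStr in str. To do this, we use the formula (a - b) / c, where a is the length of str, b is the length of str without all occurrences of subStr (we remove all occurrences of subStr from str for this), and c is the length of subStr. So, basically, we subtract the length of str without all subStr from the length of str, and then divide the result by the length of subStr. Please let me know if you have any other questions.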

— Maksym Ovsianikov
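The formula worked through on the question's example, as a sketch. The variable names a, b, c follow the explanation above; the class name and the guard for an empty subStr (which would otherwise divide by zero) are assumptions.

```java
import java.util.regex.Pattern;

class LengthDifferenceCount {
    static int count(String str, String subStr) {
        if (subStr.isEmpty()) {
            return 0; // assumption: avoids dividing by a length of zero
        }
        int a = str.length();                                         // 25 in the example
        int b = str.replaceAll(Pattern.quote(subStr), "").length();   // "slkdjladfj" -> 10
        int c = subStr.length();                                      // 5
        return (a - b) / c;                                           // (25 - 10) / 5 = 3
    }

    public static void main(String[] args) {
        System.out.println(count("helloslkhellodjladfjhello", "hello")); // 3
    }
}
```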

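Santhosh, you are welcome! The important part is to use Pattern.quote for subStr; otherwise it may fail in some cases, like this one: str = "This is a test \\n\\r string", subStr = "\\r". Some similar answers provided here don't use Pattern, so they will fail in such cases.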

— Maksym Ovsianikov

There is no reason to use a regex here: use replace, not replaceAll.

— NateS
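A quick sketch comparing the two. String.replace(CharSequence, CharSequence) treats its first argument literally, while replaceAll compiles it as a regex. The method names are hypothetical, and the empty-string guard is an assumption.

```java
class ReplaceVsReplaceAll {
    // replace() is literal, so no quoting is needed
    static int countLiteral(String str, String subStr) {
        if (subStr.isEmpty()) {
            return 0; // assumption: avoids dividing by zero
        }
        return (str.length() - str.replace(subStr, "").length()) / subStr.length();
    }

    // replaceAll() without Pattern.quote interprets subStr as a regex;
    // shown only to illustrate the pitfall
    static int countUnquotedRegex(String str, String subStr) {
        return (str.length() - str.replaceAll(subStr, "").length()) / subStr.length();
    }

    public static void main(String[] args) {
        String str = "hel+loslkhel+lodjladfjhel+lo";
        System.out.println(countLiteral(str, "hel+lo"));       // 3
        System.out.println(countUnquotedRegex(str, "hel+lo")); // 0
    }
}
```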

3

Increment lastIndex whenever you look for the next occurrence.

Otherwise it always finds the first substring (at position 0).

— Stanislav Kniazev

3

public int indexOf(int ch,
                   int fromIndex)

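Returns the index within this string of the first occurrence of the specified character, starting the search at the specified index.

So your lastindex value is always 0, and it always finds hello in the string.

— Bhushan Bhangale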
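For reference, a small sketch of how the String overload, indexOf(String str, int fromIndex), responds to different start positions on the question's input. The probe helper and class name are invented for the demo.

```java
import java.util.Arrays;

class IndexOfFromIndexDemo {
    // Returns str.indexOf(findStr, from) for each requested start position.
    static int[] probe(String str, String findStr, int... fromIndexes) {
        int[] results = new int[fromIndexes.length];
        for (int i = 0; i < fromIndexes.length; i++) {
            results[i] = str.indexOf(findStr, fromIndexes[i]);
        }
        return results;
    }

    public static void main(String[] args) {
        String str = "helloslkhellodjladfjhello";
        // prints [0, 8, 20, -1]
        System.out.println(Arrays.toString(probe(str, "hello", 0, 1, 9, 21)));
    }
}
```

The search only moves forward if the start position passed in moves forward; passing the same fromIndex again returns the same match.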

2

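The answer given as correct is no good for counting things like line returns and is far too verbose. Later answers are better, but all of it can be achieved simply with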

str.split(findStr).length

It does not drop trailing matches when using the example in the question.

— Mark
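A sketch that evaluates this expression on a few inputs, so the result can be compared with the actual number of occurrences. The inputs other than the question's string are made up for illustration, as are the class and method names.

```java
class PlainSplitDemo {
    // The expression from the answer, unchanged: findStr is used as a regex
    // and no limit argument is passed.
    static int plainSplitLength(String str, String findStr) {
        return str.split(findStr).length;
    }

    public static void main(String[] args) {
        System.out.println(plainSplitLength("helloslkhellodjladfjhello", "hello")); // 3 (3 occurrences)
        System.out.println(plainSplitLength("hellohello", "hello"));                // 0 (2 occurrences)
        System.out.println(plainSplitLength("slkhellodj", "hello"));                // 2 (1 occurrence)
        System.out.println(plainSplitLength("abc", "hello"));                       // 1 (0 occurrences)
    }
}
```

The value agrees with the occurrence count for the question's string, but not for the other inputs, which is why the other answers pass -1 as the limit and subtract 1.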

1

This has already been covered in another answer, and that answer did it better, too.

— michaelb958--GoFundMonica 2013

1

This should be a comment on the answer in question, not another answer.

— james.garriss 2014

2

You can get the number of occurrences using a built-in library function:

import org.springframework.util.StringUtils;
StringUtils.countOccurrencesOf(result, "R-")

— Victor

1

This doesn't work; you should specify the dependency you used.

— Saikat 2016

1

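Try adding lastIndex+=findStr.length() to the end of your loop; otherwise you will end up in an endless loop because, once you have found the substring, you try to find it again and again from the same last position.

— Thorsten Schleinzer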

1

Try this one. It replaces all the matches with a -.

String str = "helloslkhellodjladfjhello";
String findStr = "hello";
int numberOfMatches = 0;
while (str.contains(findStr)){
    str = str.replaceFirst(findStr, "-");
    numberOfMatches++;
}

And if you don't want to destroy your str, you can create a new string with the same content:

String str = "helloslkhellodjladfjhello";
String strDestroy = str;
String findStr = "hello";
int numberOfMatches = 0;
while (strDestroy.contains(findStr)){
    strDestroy = strDestroy.replaceFirst(findStr, "-");
    numberOfMatches++;
}

After executing this block, these will be your values:

str = "helloslkhellodjladfjhello"
strDestroy = "-slk-djladfj-"
findStr = "hello"
numberOfMatches = 3

— Xander

1

As @Mr_and_Mrs_D suggested:

String haystack = "hellolovelyworld";
String needle = "lo";
return haystack.split(Pattern.quote(needle), -1).length - 1;

— Ron Tesler

1

Based on the existing answer(s), I'd like to add a "shorter" version without the if:

String str = "helloslkhellodjladfjhello";
String findStr = "hello";

int count = 0, lastIndex = 0;
while((lastIndex = str.indexOf(findStr, lastIndex)) != -1) {
    lastIndex += findStr.length() - 1;
    count++;
}

System.out.println(count); // output: 3

— sjkm

This one takes into account the case where the string repeats, for example if you are looking for the string "xx" in the string "xxx".

— tCoe 2016

1

Here is an advanced version that counts how many times the token occurs in a string entered by the user:

import java.util.Scanner;

public class StringIndexOf {

    public static void main(String[] args) {

        Scanner scanner = new Scanner(System.in);

        System.out.println("Enter a sentence please: \n");
        String string = scanner.nextLine();

        int atIndex = 0;
        int count = 0;

        while (atIndex != -1)
        {
            atIndex = string.indexOf("hello", atIndex);

            if(atIndex != -1)
            {
                count++;
                atIndex += 5;
            }
        }

        System.out.println(count);
    }

}

— Venzentx

1

The method below shows how many times a substring repeats in the whole string. Hope it is useful to you:

    String searchPattern="aaa"; // search string
    String str="aaaaaababaaaaaa"; // whole string
    int searchLength = searchPattern.length(); 
    int totalLength = str.length(); 
    int k = 0;
    for (int i = 0; i < totalLength - searchLength + 1; i++) {
        String subStr = str.substring(i, searchLength + i);
        if (subStr.equals(searchPattern)) {
           k++;
        }

    }

— Doug

0

Here is another solution that uses neither regexp/patterns/matchers nor StringUtils.

String str = "helloslkhellodjladfjhelloarunkumarhelloasdhelloaruhelloasrhello";
        String findStr = "hello";
        int count =0;
        int findStrLength = findStr.length();
        for(int i=0;i<str.length();i++){
            if(findStr.startsWith(Character.toString(str.charAt(i)))){
                if(str.substring(i).length() >= findStrLength){
                    if(str.substring(i, i+findStrLength).equals(findStr)){
                        count++;
                    }
                }
            }
        }
        System.out.println(count);

— Arun Kumar Mudraboyina

0

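If you need the index of each substring within the original string, you can do something with indexOf like this:

// requires: import java.util.ArrayList; import java.util.List;
private static List<Integer> getAllIndexesOfSubstringInString(String fullString, String substring) {
    int pointIndex = 0;
    List<Integer> allOccurences = new ArrayList<Integer>();
    // search fullString (the method parameter) from pointIndex onwards
    while (fullString.indexOf(substring, pointIndex) >= 0) {
        allOccurences.add(fullString.indexOf(substring, pointIndex));
        // continue after the end of the match just recorded
        pointIndex = fullString.indexOf(substring, pointIndex) + substring.length();
    }
    return allOccurences;
}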

— Rhino

0

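public static int getCountSubString(String str, String sub) {
    if (sub.isEmpty()) {
        return 0; // an empty pattern would never advance the scan
    }
    int n = 0, counterSub = 0;
    // stop once fewer than sub.length() characters remain
    while (n + sub.length() <= str.length()) {
        int m = 0;
        // compare sub against str starting at position n
        while (m < sub.length() && str.charAt(n + m) == sub.charAt(m)) {
            m++;
        }
        if (m == sub.length()) {
            counterSub++;
            n += sub.length(); // full match: skip past it (non-overlapping count)
        } else {
            n++; // mismatch: retry from the next start position
        }
    }
    return counterSub;
}

— Nikolai Nechai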

This question is 8 years old, and without any indication of why this is a better solution than the 22 other solutions posted, it should probably be removed.

— Jason Wheeler

0

This solution prints the total number of occurrences of a given substring throughout the string; it also covers the cases where overlapping matches exist.

class SubstringMatch{
    public static void main(String []args){
        //String str = "aaaaabaabdcaa";
        //String sub = "aa";
        //String str = "caaab";
        //String sub = "aa";
        String str="abababababaabb";
        String sub = "bab";

        int n = str.length();
        int m = sub.length();

        // index=-1 in case of no match, otherwise >=0(first match position)
        int index=str.indexOf(sub), i=index+1, count=(index>=0)?1:0;
        System.out.println(i+" "+index+" "+count);

        // i only needs to traverse up to position (n-m)
        while(index!=-1 && i<=(n-m)){   
            index=str.substring(i, n).indexOf(sub);
            count=(index>=0)?count+1:count;
            i=i+index+1;  
            System.out.println(i+" "+index);
        }
        System.out.println("count: "+count);
    }
}

— Anubhav Singh