Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

【Day 75 】2021-11-23 - 面试题 17.17 多次搜索 #94

Open
azl397985856 opened this issue Nov 22, 2021 · 98 comments
Open

【Day 75 】2021-11-23 - 面试题 17.17 多次搜索 #94

azl397985856 opened this issue Nov 22, 2021 · 98 comments

Comments

@azl397985856
Copy link
Contributor

面试题 17.17 多次搜索

入选理由

暂无

题目地址

https://leetcode-cn.com/problems/multi-search-lcci

前置知识

  • 字符串匹配
  • Trie

题目描述

给定一个较长字符串 big 和一个包含较短字符串的数组 smalls,设计一个方法,根据 smalls 中的每一个较短字符串,对 big 进行搜索。输出 smalls 中的字符串在 big 里出现的所有位置 positions,其中 positions[i]为 smalls[i]出现的所有位置。

示例:

输入:
big = "mississippi"
smalls = ["is","ppi","hi","sis","i","ssippi"]
输出: [[1,4],[8],[],[3],[1,4,7,10],[5]]
提示:

0 <= len(big) <= 1000
0 <= len(smalls[i]) <= 1000
smalls 的总字符数不会超过 100000。
你可以认为 smalls 中没有重复字符串。
所有出现的字符均为英文小写字母。
@yanglr
Copy link

yanglr commented Nov 22, 2021

思路

可以用 Trie树来做。

具体步骤有参考suukii的仓库, 代码完全是自己写的~

方法: Trie树

把短字符串存到 Trie 中,保证 Trie 的高度尽量的小。

  • dictionary 数组构建 Trie,并在结束节点记录每个短串在 dictionary 数组中的下标;

  • 遍历 sentence 字符串,截取所有以 longest 为长度的子串(longestdictionary 中最长的单词长度),拿到 Trie 中去寻找所有匹配的短串,返回所有匹配到的下标,根据下标把当前子串的位置更新到对应的结果数组中。

Trie 的操作:

  • insert: 把单词的 下标 存到最后的节点中
  • search: 需要返回寻找路径中匹配到的所有单词的下标

代码

实现语言: C++

class Solution {
private:
    struct Node {
        int idx;
        vector<Node*> children;
        Node() : idx(-1), children(26, nullptr) {}
    };
    struct Trie {
        Node* root;
        Trie() : root(new Node()) {}
        void insert(string& word, int idx)
        {
            Node* p = root;
            for (char& c : word) {
                c -= 'a';
                if (p->children[c] == nullptr)
                {
                    p->children[c] = new Node();
                }
                p = p->children[c];
            }
            p->idx = idx;
        }
    };    
public:
    vector<vector<int>> multiSearch(string big, vector<string>& smalls) {
        unordered_map<string, vector<int>> cache;
        const int n = big.length();
        const int m = smalls.size();
        vector<vector<int>> res(m);

        Trie trie = Trie();
        // 构造前缀树
        for (int i = 0; i < m; i++)
        {
            trie.insert(smalls[i], i);
        }
        for (int i = 0; i < n; i++)
        {
            int j = i;
            Node* node = trie.root;
            while (j < n && node->children[big[j] - 'a'])
            {
                node = node->children[big[j] - 'a'];
                if (node->idx != -1)
                {
                    res[node->idx].push_back(i);
                }
                j++;
            }
        }
        return res;
    }
};

复杂度分析

时间 空间
insert O(len(dictionary)*avg), avg 是短串的平均长度 O(n^{avg}), n 是字符集大小,avg 是短串的平均长度
search O(maxLen(dictionary)*len(dictionary)) O(n^{m})

@chakochako
Copy link

class Trie:
    def __init__(self, words):
        self.d = {}
        for word in words:
            t = self.d
            for w in word:
                if w not in t:
                    t[w] = {}
                t = t[w]
            t['end'] = word
    
    def search(self, s):
        t = self.d
        res = []
        for w in s:
            if w not in t:
                break
            t = t[w]
            if 'end' in t:
                res.append(t['end'])
        return res

class Solution:
    def multiSearch(self, big: str, smalls: List[str]) -> List[List[int]]:
        trie = Trie(smalls)
        hit = collections.defaultdict(list)

        for i in range(len(big)):
            matchs = trie.search(big[i:])
            for word in matchs:
                hit[word].append(i)
        
        res = []
        for word in smalls:
            res.append(hit[word])
        return res

@for123s
Copy link

for123s commented Nov 22, 2021

代码

  • 语言支持:C++

C++ Code:

struct triNode{
    vector<triNode*> children;
    bool end;
    int idx;
    triNode()
    {
        end = false;
        idx = 0;
        children.assign(26,NULL);
    }
};

class Solution {
public:
    vector<vector<int>> multiSearch(string big, vector<string>& smalls) {
        triNode* root = new triNode();
        for(int i=0;i<smalls.size();i++)
        {
            triNode* temp = root;
            for(int j=0;j<smalls[i].size();j++)
            {
                int idx = smalls[i][j] - 'a';
                if(!temp->children[idx])
                    temp->children[idx] = new triNode();
                temp = temp->children[idx];
            }
            temp->end = true;
            temp->idx = i;
        }
        vector<vector<int>> res(smalls.size(),vector<int>(0,{}));
        for(int i=0;i<big.size();i++)
        {
            triNode* temp = root;
            int j = i;
            while(temp&&j<big.size())
            {
                int idx = big[j] - 'a';
                if(temp->children[idx])
                {
                    temp = temp->children[idx];
                    if(temp->end)
                    {
                        res[temp->idx].push_back(i);
                    }
                    j++;
                }
                else
                    break;
            }
        }
        return res;
    }
};

@biancaone
Copy link

class Trie:
    def __init__(self, words):
        self.d = {}
        for word in words:
            t = self.d
            for w in word:
                if w not in t:
                    t[w] = {}
                t = t[w]
            t['end'] = word
    
    def search(self, s):
        t = self.d
        res = []
        for w in s:
            if w not in t:
                break
            t = t[w]
            if 'end' in t:
                res.append(t['end'])
        return res

class Solution:
    def multiSearch(self, big: str, smalls: List[str]) -> List[List[int]]:
        trie = Trie(smalls)
        hit = collections.defaultdict(list)

        for i in range(len(big)):
            matchs = trie.search(big[i:])
            for word in matchs:
                hit[word].append(i)
        
        res = []
        for word in smalls:
            res.append(hit[word])
        return res

@pophy
Copy link

pophy commented Nov 22, 2021

思路

把所有短字符 smalls 放入Trie

  • Trie.search 实现方法:传入str 找到所有满足以str为头的字符串

  • 从前往后遍历big,将找到的match放入matchlist

  • 整理matchlist并归并到indexmap

  • 根据smalls的顺序输出结果

Java Code

public class MultiSearch {
    public int[][] multiSearch(String big, String[] smalls) {
        Trie trie = new Trie();
        for (String word : smalls) {
            trie.insert(word);
        }
        Map<String, List<Integer>> indexMap = new HashMap<>();
        for (int i = 0; i < big.length(); i++) {
            String word = big.substring(i);
            List<String> matches = trie.search(word);
            for (String match : matches) {
                indexMap.computeIfAbsent(match, x -> new ArrayList()).add(i);
            }
        }
        int[][] res = new int[smalls.length][];
        for (int i = 0; i < smalls.length; i++) {
            List<Integer> indexList = indexMap.get(smalls[i]);
            int[] indexArr = new int[indexList == null ? 0 : indexList.size()];
            for (int j = 0; j < indexArr.length; j++) {
                indexArr[j] = indexList.get(j);
            }
            res[i] = indexArr;
        }
        return res;
    }
    private static class Trie {
        TrieNode root = new TrieNode();
        public List<String> search(String str) {
            List<String> result = new ArrayList<>();
            TrieNode p = root;
            for (char c : str.toCharArray()) {
                if (p.children.get(c) == null) {
                    break;
                }
                p = p.children.get(c);
                if (p.word != null) {
                    result.add(p.word);
                }
            }
            return result;
        }
        public void insert(String str) {
            TrieNode p = root;
            for (char c : str.toCharArray()) {
                if (p.children.get(c) == null) {
                    p.children.put(c, new TrieNode());
                }
                p = p.children.get(c);
            }
            p.word = str;
        }
        private static class TrieNode {
            String word = null;
            Map<Character, TrieNode> children = new HashMap<>();
        }
    }
}

@qixuan-code
Copy link

class Trie:
    def __init__(self, words):
        self.d = {}
        for word in words:
            t = self.d
            for w in word:
                if w not in t:
                    t[w] = {}
                t = t[w]
            t['end'] = word
    
    def search(self, s):
        t = self.d
        res = []
        for w in s:
            if w not in t:
                break
            t = t[w]
            if 'end' in t:
                res.append(t['end'])
        return res

class Solution:
    def multiSearch(self, big: str, smalls: List[str]) -> List[List[int]]:
        trie = Trie(smalls)
        hit = collections.defaultdict(list)

        for i in range(len(big)):
            matchs = trie.search(big[i:])
            for word in matchs:
                hit[word].append(i)
        
        res = []
        for word in smalls:
            res.append(hit[word])
        return res

@JiangyanLiNEU
Copy link

class Trie:
    def __init__(self, words):
        self.d = {}
        for word in words:
            t = self.d
            for w in word:
                if w not in t:
                    t[w] = {}
                t = t[w]
            t['end'] = word
    
    def search(self, s):
        t = self.d
        res = []
        for w in s:
            if w not in t:
                break
            t = t[w]
            if 'end' in t:
                res.append(t['end'])
        return res

class Solution:
    def search(big, small):
        trie = Trie(small)
       check = {}

        for i in range(len(big)):
            match = trie.search(big[i:])
            for word in match:
                check[word].append(i)
        
        res = []
        for word in small:
            res.append(check[word])
        return res

@leo173701
Copy link

class Trie:
    def __init__(self, words):
        self.d = {}
        for word in words:
            t = self.d
            for w in word:
                if w not in t:
                    t[w] = {}
                t = t[w]
            t['end'] = word
    
    def search(self, s):
        t = self.d
        res = []
        for w in s:
            if w not in t:
                break
            t = t[w]
            if 'end' in t:
                res.append(t['end'])
        return res

class Solution:
    def search(big, small):
        trie = Trie(small)
       check = {}

        for i in range(len(big)):
            match = trie.search(big[i:])
            for word in match:
                check[word].append(i)
        
        res = []
        for word in small:
            res.append(check[word])
        return res

@yachtcoder
Copy link

yachtcoder commented Nov 22, 2021

Rolling hash and Trie.

class TrieNode:
    def __init__(self):
        self.children = defaultdict(TrieNode)
        self.eow = False
        self.word = None

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        node = self.root
        for c in word:
            node = node.children[c]
        node.eow = True
        node.word = word

    def findWords(self, prefix):
        ret = []
        node = self.root
        for c in prefix:
            if c not in node.children:
                break
            node = node.children[c]
            if node.eow:
                ret.append(node.word)
        return ret

class Solution:
    def multiSearch(self, big: str, smalls: List[str]) -> List[List[int]]:
        trie = Trie()
        smallidx = {}

        for i, s in enumerate(smalls):
            trie.insert(s)
            smallidx[s] = i

        ret = [[] for _ in range(len(smalls))]

        for i in range(len(big)):
            words = trie.findWords(big[i:])
            for s in words:
                if s not in smallidx: continue
                ret[smallidx[s]].append(i)
        return ret

    def multiSearchRollingHash(self, big: str, smalls: List[str]) -> List[List[int]]:
        if not big: return [[] for _ in range(len(smalls))]
        def c2n(c):
            return ord(c) - ord('a')
        def get_hash(s, size):
            base = 331
            mod = 99999199999
            hash = 0
            hashset = defaultdict(list)
            for i in range(size):
                n = c2n(s[i])
                hash = hash*base + n
            hash = hash%mod
            mag = pow(base, size-1, mod)
            hashset[hash].append(0)
            for i in range(size, len(s)):
                left = i - size
                hash = ((hash - mag*c2n(s[left]))*base + c2n(s[i]))%mod
                hashset[hash].append(i-size+1)
            return hashset
        
        ls = set([len(s) for s in smalls])
        hash_by_length = defaultdict(list)
        for l in ls:
            if l <= len(big):
                hash_by_length[l]=get_hash(big, l)
        
        ret = []
        for s in smalls:
            if len(s) == 0 or len(s) > len(big): 
                ret.append([])
                continue
            hash = list(get_hash(s, len(s)).keys())[0]
            bighash = hash_by_length[len(s)]
            locs = bighash[hash]
            ret.append(locs)
        return ret

@zhangzz2015
Copy link

zhangzz2015 commented Nov 22, 2021

思路

  • 建立small的trie树,记录下small 字符串最后结束后对应的index放入count,在big 进行递归search,当我们找到一个count>=0,表明我们匹配到一个子串的结束的字符,通过small字符串的大小,我们可反推出在big的字符串的起始位置,放入对应的结果的vector里。时间的复杂度,建立trie的时间复杂度加上遍历big字符的 O( n *m + k^2) n 为small的字符串个数,m为平均字符串的大小,k 为 big 的字符串的大小。

关键点

代码

  • 语言支持:C++

C++ Code:

struct triNode
{
    vector<triNode*> child; 
    int count;

    triNode()
    {
        child.assign(26, NULL); 
        count=-1; 
    }

}; 
class Solution {
public:
    triNode* pRoot; 
    vector<vector<int>> multiSearch(string big, vector<string>& smalls) {

        pRoot = new triNode();         
        for(int i=0; i< smalls.size(); i++)
        {
            triNode* pCurrent = pRoot; 
            for(int j=0; j< smalls[i].size(); j++)
            {
                int index = smalls[i][j] - 'a'; 
                if(pCurrent->child[index]==NULL)
                {
                    pCurrent->child[index] = new triNode(); 
                }
                pCurrent = pCurrent->child[index]; 
            }
            pCurrent->count = i; 
        }
        vector<vector<int>>  ret(smalls.size()); 
        for(int i=0; i< big.size(); i++)
        {
             dfs(big, i, ret, pRoot, smalls);
        }        
        return ret; 
    }
    void dfs(string& big, int start, vector<vector<int>>& ret, triNode* root, vector<string>& smalls)
    {
        if(start>=big.size())
          return ;
           int index = big[start]- 'a'; 
           if(root->child[index])
           {
               if(root->child[index]->count>=0)
               {
                   //cout<< "index:" << index << "start:" << start << "count" << root->child[index]->count << "\n"; 
                   ret[root->child[index]->count].push_back(start - smalls[root->child[index]->count].size()+1);
               }
               dfs(big, start+1, ret, root->child[index], smalls); 
           }
    }
};

@Daniel-Zheng
Copy link

思路

前缀树。

代码(C++)

struct TrieNode{
    int sid;
    TrieNode *child[26];
    TrieNode(){
        sid=-1;
        for(int i=0;i<26;++i) child[i]=NULL;
    }
};
class Solution {
private:
    TrieNode *root=new TrieNode();
public:
    void insert(string word,int s){
        int n=word.size();
        TrieNode *cur=root;
        for(int i=0;i<n;++i){
            int cid=word.at(i)-'a';
            if(cur->child[cid]==NULL) cur->child[cid]=new TrieNode();
            cur=cur->child[cid];
        }
        cur->sid=s;
    }
    void search(string word,vector<vector<int>>& ans,int bid){
        int n=word.size();
        TrieNode *cur=root;
        for(int i=0;i<n;++i){
            int cid=word.at(i)-'a';
            if(cur->sid!=-1) ans[cur->sid].push_back(bid);
            if(cur->child[cid]==NULL) return ;
            cur=cur->child[cid];
        }
        if(cur->sid!=-1) ans[cur->sid].push_back(bid);
    }
    vector<vector<int>> multiSearch(string big, vector<string>& smalls) {
        int n=smalls.size(),m=big.size();
        vector<vector<int>> ans(n,vector<int>{});
        for(int i=0;i<n;++i){
            if(smalls[i].size()==0) continue;
            insert(smalls[i],i);
        }
        for(int i=0;i<m;++i){
            string word=big.substr(i,m-i);
            search(word,ans,i);
        }
        return ans;
    }
};

复杂度分析

  • 时间复杂度:O(N^2)
  • 空间复杂度:O(N^2)

@hwpanda
Copy link

hwpanda commented Nov 22, 2021

class Solution {
  public static int[][] multiSearch(String big, String[] smalls) {
      int[][]multiSearch = new int[smalls.length][];
      for(int i = 0;i < smalls.length; i++){
          String small = smalls[i];
          if("".equals(small)){
              multiSearch[0] = new int[0];
              return multiSearch;
          }
          int g =0;
          int[] nu = new int[10000];

          for(int j =0;j <= big.length()-small.length(); j++){
              String mid = big.substring(j,j+small.length());
              if(mid.contains(small)){
                  int index = j;
                  //multiSearch[i][g] = index;
                  nu[g] = index;
                  g++;

              }
          }
          multiSearch[i] = new int[g];
          for (int k = 0; k <multiSearch[i].length ; k++) {
              multiSearch[i][k] = nu[k];
          }

      }
      return multiSearch;
  }
}

@laofuWF
Copy link

laofuWF commented Nov 22, 2021

class Trie:
    def __init__(self, words):
        self.root = {}
        for word in words:
            current = self.root
            for char in word:
                if char not in current:
                    current[char] = {}
                current = current[char]
            # mark the end
            current['end'] = word
    
    def search(self, word):
        current = self.root
        res = []
        for char in word:
            if char not in current:
                break
            current = current[char]
            if 'end' in current:
                res.append(current['end'])
        return res

class Solution:
    def multiSearch(self, big: str, smalls: List[str]) -> List[List[int]]:
        trie = Trie(smalls)
        matches = collections.defaultdict(list)

        for i in range(len(big)):
            current_matches = trie.search(big[i:])
            for word in current_matches:
                matches[word].append(i)

        res = []
        for word in smalls:
            res.append(matches[word])
        
        return res

@yan0327
Copy link

yan0327 commented Nov 23, 2021

思路:
先用一遍暴力做,过了。时间好一些,
然后学着用trie再实现一遍。
关键点有两个,大字符串做小字符串匹配,把小字符串放进trie里,然后大字符串进行搜索
核心是,对大字符串整体遍历i从0到最后,然后每次j=i,去字典树里面查询,若为end则,把此时的i添加的输出的二维数组里面。
关键:
trie的结构体由,26位children,isEnd ,id 来控制,输入为word和ID

type TrieNode struct{
    children [26]*TrieNode
    isEnd bool
    id int
}
func (this *TrieNode) insert(word string, id int){
    node := this
    for _ , ch := range word{
        ch -= 'a'
        if node.children[ch] == nil{
            node.children[ch] = &TrieNode{}
        }
        node = node.children[ch]
    }
    node.isEnd = true
    node.id = id
}
func (this *TrieNode) search (word string) bool{
    node := this
    for _,ch := range word{
        ch -= 'a'
        if node.children[ch] == nil{
            return false
        }
        node = node.children[ch]
    }
    return true
}

func multiSearch(big string, smalls []string) [][]int {
    n := len(smalls)
    trie := &TrieNode{}
    out := make([][]int,n)
    for i:=0;i<n;i++{
        out[i] = make([]int,0)
    }
    for i,x := range smalls{
        trie.insert(x,i)
    }
    for i:=0;i<len(big);i++{
        node := trie
        for j:=i;j<len(big);j++{
            if node.children[big[j]-'a'] == nil{
                break
            }
            node = node.children[big[j]-'a']
            if node.isEnd{
                out[node.id] = append(out[node.id],i)
            }
        }
    }
    return out
}

时间复杂度:O(N²)
空间复杂度:O(S)

@wangzehan123
Copy link

关键点

代码

  • 语言支持:Java

Java Code:

class Solution {
    public int[][] multiSearch(String big, String[] smalls) {
        List<List<Integer>> list=new ArrayList<>();
        for(int i=0;i<smalls.length;i++){
            List<Integer> loc=new ArrayList<>();
            //注意边界j>0&&j<=big.length()
            for(int j=smalls[i].length();j>0&&j<=big.length();j++){if(smalls[i].equals(big.substring(j-smalls[i].length(),j))){loc.add(j-smalls[i].length());}}
            list.add(new ArrayList<>(loc));
        }
        int[][] ans=new int[smalls.length][];
        for(int k=0;k<smalls.length;k++){
            ans[k]=new int[list.get(k).size()];
            for(int p=0;p<list.get(k).size();p++){ans[k][p]=list.get(k).get(p);}
        }
        return ans;
    }
}

@shawncvv
Copy link

思路

暴力解法

代码

Java Code

public int[][] multiSearch(String big, String[] smalls) {

    int[][] res = new int[smalls.length][];

    List<Integer> cur = new ArrayList<>();

    for (int i = 0; i < smalls.length; i++) {

        String small = smalls[i];

        if (small.length() == 0) {

            res[i] = new int[]{};
            continue;
        }

        int startIdx = 0;
        while (true) {

            int idx = big.indexOf(small, startIdx);
            if (idx == -1)
                break;

            cur.add(idx);
            startIdx = idx + 1;
        }

        res[i] = new int[cur.size()];
        for (int j = 0; j < res[i].length; j++)
            res[i][j] = cur.get(j);

        cur.clear();
    }

    return res;
}

复杂度

  • 时间复杂度:O(N∗K),其中 K 是敏感词中最长单词长度,N 是长句的长度。
  • 空间复杂度:O(S),S 为所有匹配成功的位置的个数

@GZ712D
Copy link

GZ712D commented Nov 23, 2021

def search(self, s):
    t = self.d
    res = []
    for w in s:
        if w not in t:
            break
        t = t[w]
        if 'end' in t:
            res.append(t['end'])
    return res



    for i in range(len(big)):
        match = trie.search(big[i:])
        for word in match:
            check[word].append(i)
    
    res = []
    for word in small:
        res.append(check[word])
    return res

@CoreJa
Copy link

CoreJa commented Nov 23, 2021

思路

前缀树一开始只想到了把所有big字符串的后缀子串插入到前缀树里,但是感觉复杂度特别高,也想了把smalls生成前缀树但是没细想

前缀树的思路主要就是,

  1. 用smalls分别插入生成前缀树,把smalls对应的下标放在前缀树里作标记
  2. 按顺序读取big的字符,在前缀树里找这个字符
    1. 没找到那就说明这个字符不是smalls要找的,跳过
    2. 找到了就看前缀树的节点里是否有标记
      1. 有标记就说明这个字符是smalls要找的,在ans里记录
      2. 没有标记就顺着该字符往后读,并看后面的字符是否在前缀树后续的节点里,直到前缀树读完或者big读完

代码

    def multiSearch(self, big: str, smalls: List[str]) -> List[List[int]]:
        trie = {}
        ans = [[] for _ in range(len(smalls))]
        for i, small in enumerate(smalls):
            cur = trie
            for ch in small:
                if ch not in cur:
                    cur[ch] = {}
                cur = cur[ch]
            cur['*'] = i

        for i, ch in enumerate(big):
            cur = trie
            index = i
            while ch in cur:
                cur = cur[ch]
                if '*' in cur:
                    ans[cur['*']] += [index]
                i += 1
                if i >= len(big):
                    break
                ch = big[i]

        return ans

复杂度

n是big字符串长度

m是smalls的个数,(考虑smalls里最长串长度为t)

  • TC: O(m*t+n)
  • SC: O(m*t)

@tongxw
Copy link

tongxw commented Nov 23, 2021

思路

过滤敏感词(短句):把短句数组放到前缀树里,长句从i=0开始遍历,在Trie里查找str.substring(i)。如果可以查到,那么说明短句出现在长句的i位置。

代码

class Solution {
  public int[][] multiSearch(String big, String[] smalls) {
    Trie trie = new Trie();
    for (String word : smalls) {
      trie.insert(word);
    }
    
    HashMap<String, List<Integer>> map = new HashMap<>(); // { 短句 -> [在长句中出现的位置] }
    for (int i=0; i<big.length(); i++) {
      String subStr = big.substring(i);
      List<String> res = trie.search(subStr);
      for (String word : res) {
        if (!map.containsKey(word)) {
          map.put(word, new ArrayList<>());
        }

        map.get(word).add(i);
      }
	  }

    int[][] ans = new int[smalls.length][];
    for (int i=0; i<smalls.length; i++) {
      String word = smalls[i];
      List<Integer> res = map.get(word);
      if (res == null) {
        ans[i] = new int[0];
      } else {
        ans[i] = new int[res.size()];
        for (int j=0; j<res.size(); j++) {
          ans[i][j] = res.get(j);
        }
      }
	  }

    return ans;
  }

  class Trie {
    Trie[] children;
    String word;

    Trie() {
      children = new Trie[26];
      word = null;
	  }

    public void insert(String word) {
      Trie node = this;
      for (int i=0; i<word.length(); i++) {
        int idx = word.charAt(i) - 'a';
        if (node.children[idx] == null) {
          node.children[idx] = new Trie();
        }
        node = node.children[idx];
	    }
      node.word = word;
	  }

    public List<String> search(String word) {
      Trie node = this;
      List<String> res = new ArrayList<>();
      for (int i=0; i<word.length(); i++) {
        int idx = word.charAt(i) - 'a';
        if (node.children[idx] == null) {
          break;
        }
        node = node.children[idx];
        if (node.word != null) {
          res.add(node.word);
        }
	    }

      return res;
	  }
  }
}

TC: O(len(big) * len(smalls[i]))
SC: O(len(smalls)

@nonevsnull
Copy link

思路

  • 拿到题的直觉解法
    • 暴力可解
    • trie解法,学习标准答案

AC

代码

//trie
class Solution {

    private Node root = new Node();

    public int[][] multiSearch(String big, String[] smalls) {

        int n = smalls.length;
        // 初始化结果集
        List<Integer>[] res = new List[n];
        for(int i = 0 ; i < n ; i++)
            res[i] = new ArrayList<>();
        // 建树
        for(int i = 0 ; i < smalls.length; i++)
            insert(smalls[i], i);

        for(int i = 0 ; i < big.length(); i++){

            Node tmp = root;

            for(int j = i ; j < big.length(); j++){
                //不存在以该串为prefix的敏感词
                if(tmp.children[big.charAt(j) - 'a'] == null)
                    break;

                tmp = tmp.children[big.charAt(j) - 'a'];

                if(tmp.isWord)
                    res[tmp.id].add(i);
            }
        }
        // 返回二维数组
        int[][] ret = new int[n][];

        for(int i = 0 ; i < n ; i++){

            ret[i] = new int[res[i].size()];

            for(int j = 0 ; j < ret[i].length; j++)
                ret[i][j] = res[i].get(j);
        }

        return ret;
    }

    private void insert(String word, int id){

        Node tmp = root;

        for(int i = 0; i < word.length(); i++){

            if(tmp.children[word.charAt(i) - 'a'] == null)
                tmp.children[word.charAt(i) - 'a'] = new Node();

            tmp = tmp.children[word.charAt(i) - 'a'];
        }

        tmp.isWord = true;
        tmp.id = id;
    }

    class Node {

        Node[] children;
        boolean isWord;
        int id;

        public Node() {

            children = new Node[26];
            isWord = false;
            id = 0;
        }
    }
}

复杂度

time: O(N*K),k=敏感词最长单词长度,N长句长度
space: O(S),s=成功匹配的位置个数

@QiZhongdd
Copy link


class Solution {

    private Node root = new Node();

    public int[][] multiSearch(String big, String[] smalls) {

        int n = smalls.length;
        // 初始化结果集
        List<Integer>[] res = new List[n];
        for(int i = 0 ; i < n ; i++)
            res[i] = new ArrayList<>();
        // 建树
        for(int i = 0 ; i < smalls.length; i++)
            insert(smalls[i], i);

        for(int i = 0 ; i < big.length(); i++){

            Node tmp = root;

            for(int j = i ; j < big.length(); j++){
                //不存在以该串为prefix的敏感词
                if(tmp.children[big.charAt(j) - 'a'] == null)
                    break;

                tmp = tmp.children[big.charAt(j) - 'a'];

                if(tmp.isWord)
                    res[tmp.id].add(i);
            }
        }
        // 返回二维数组
        int[][] ret = new int[n][];

        for(int i = 0 ; i < n ; i++){

            ret[i] = new int[res[i].size()];

            for(int j = 0 ; j < ret[i].length; j++)
                ret[i][j] = res[i].get(j);
        }

        return ret;
    }

    private void insert(String word, int id){

        Node tmp = root;

        for(int i = 0; i < word.length(); i++){

            if(tmp.children[word.charAt(i) - 'a'] == null)
                tmp.children[word.charAt(i) - 'a'] = new Node();

            tmp = tmp.children[word.charAt(i) - 'a'];
        }

        tmp.isWord = true;
        tmp.id = id;
    }

    class Node {

        Node[] children;
        boolean isWord;
        int id;

        public Node() {

            children = new Node[26];
            isWord = false;
            id = 0;
        }
    }
}

@wenlong201807
Copy link

代码块

var multiSearch = function (big, smalls) {
  let result = Array(smalls.length)
    .fill(0)
    .map(() => []); //预生成数组
  let maxLength = Math.max(...smalls.map((small) => small.length)); //计算smalls中字符串的最大长度
  let trie = new Trie(); //预建trie树
  //将smalls 字符串插入树中
  for (let i = 0; i < smalls.length; i++) {
    trie.insert(smalls[i]);
  }

  for (let i = 0; i < big.length; i++) {
    if (!trie.startsWith(big[i])) continue; //查找是否以 big[i] 开头的字符串,没有则跳过本次执行
    let str = big.slice(i, i + maxLength);
    let node = trie.root;
    for (let j = 0; j < str.length; j++) {
      let jchar = str[j];
      if (!node.children[jchar]) {
        break;
      }
      node = node.children[jchar];
      if (node.isword) {
        result[node.index].push(i);
      }
    }
  }
  return result;
};

时间复杂度和空间复杂度

  • 时间复杂度 O(n)
  • 空间复杂度 O(1)

@JinhMa
Copy link

JinhMa commented Nov 23, 2021

class Solution {
    public int[][] multiSearch(String big, String[] smalls) {
        HashMap<Character, ArrayList<Integer>> record = new HashMap<>();
        for (int i=0;i<big.length();++i){
            char tmp = big.charAt(i);
            ArrayList<Integer> pos;
            if (record.containsKey(tmp)){
                pos = new ArrayList<>(record.get(tmp));
            }
            else {
                pos = new ArrayList<>();
            }
            pos.add(i);
            record.put(tmp, pos);
        }
        ArrayList<ArrayList<Integer>> ans = new ArrayList<>();
        for (String tmpStr:smalls){
            ArrayList<Integer> ansf = new ArrayList<>();
            int l = tmpStr.length();
            if (l!=0 && record.containsKey(tmpStr.charAt(0))){
                ArrayList<Integer> pos = record.get(tmpStr.charAt(0));
                for (int i:pos){
                    if (i+l>big.length()){
                        continue;
                    }
                    if (big.substring(i,i+l).equals(tmpStr)){
                        ansf.add(i);
                    }
                }
            }
            ans.add(ansf);
        }
        int[][] ans2 = new int[smalls.length][];
        for (int i=0;i<smalls.length;++i){
            ans2[i] = new int[ans.get(i).size()];
            for (int j=0;j< ans2[i].length;++j){
                ans2[i][j] = ans.get(i).get(j);
            }
        }
        return ans2;
    }
}

@pan-qin
Copy link

pan-qin commented Nov 23, 2021

Idea:

create a trie for smalls. Iterate through all substring of big. If the substring is found in the trie, save the ID of the trieNode. ID is the corresponding position of each small in the smalls array.

Complexity:

Time: O(N^2) N is the length of big
Space: O(26* n) n is the max length of small words

class Solution {
    TrieNode root = new TrieNode();
    public int[][] multiSearch(String big, String[] smalls) {          
        int m=smalls.length;
        for(int i=0;i<m;i++) {
            insert(smalls[i],i);
        }
        //use an array of arraylist to store results       
        List<Integer>[] res = new List[m];
        for(int i=0;i<m;i++) {
            res[i] = new ArrayList<Integer>();
        }
        //iterate over all substring of big, if the substring is in the trie, save the index
        int n = big.length();    
        for(int i=0;i<n;i++) {
            TrieNode curr=root;
            for(int j=i;j<n;j++) {
                char c = big.charAt(j);
                if(curr.children[c-'a']==null)
                    break;
                curr=curr.children[c-'a'];
                if(curr.isWord==true) {
                    res[curr.ID].add(i);
                }
            }

        }
        //convert res to 2d array
        int[][] ret = new int[m][];
        for(int i=0;i<m;i++) {
            ret[i]=new int[res[i].size()];
            for(int j=0;j<res[i].size();j++) {
                ret[i][j]=res[i].get(j);
            }
        }
        return ret;

    }     
        //insert new word into trie
        private void insert(String word, int index) {
            TrieNode curr = root;
            for(int i=0;i<word.length();i++) {
                char c = word.charAt(i);
                if(curr.children[c-'a']==null) {
                    curr.children[c-'a'] = new TrieNode();
                }
                curr=curr.children[c-'a'];
            }
            curr.ID=index;
            curr.isWord=true;
        }

        private class TrieNode {
            TrieNode[] children;
            int ID; //store the index of smalls
            boolean isWord;

            public TrieNode() {
                children = new TrieNode[26];
                isWord = false;
                ID=-1;
            }
        }
    
}

@ZacheryCao
Copy link

Idea:

hash
##Code:

class Solution:
    def multiSearch(self, big: str, smalls: List[str]) -> List[List[int]]:
        ans = []
        seen = {}
        for i in range(len(big)):
            for j in range(i+1):
                if big[j:i+1] in seen:
                    seen[big[j:i+1] ].append(j)
                else:
                    seen[big[j:i+1]] = [j]
        
        for i in smalls:
            if i in seen:
                ans.append(seen[i])
            else:
                ans.append([])
        
        return ans

Complexity:

Time: O(N^2)
Space: O(N^2)

@liuyangqiQAQ
Copy link

class Solution {
    Node root = new Node();

    static class Node{

        Node[] children;

        boolean isEnd;

        int id;

        public Node(){
            children = new Node[26];
        }
    }

    public int[][] multiSearch(String big, String[] smalls) {
        int[][] res = new int[smalls.length][];
        for (int i = 0; i < smalls.length; i++) {
            String small = smalls[i];
            insert(small, i);
        }
        ArrayList<Integer>[] list = new ArrayList[res.length];
        for (int i = 0; i < list.length; i++) {
            list[i] = new ArrayList<>();
        }

        for (int i = 0; i < big.length(); i++) {
            Node temp = root;
            for (int j = i; j < big.length(); j++) {
                int ch = big.charAt(j) - 'a';
                if(temp.children[ch] == null) break;

                temp = temp.children[ch];
                if(temp.isEnd) {
                    list[temp.id].add(i);
                }
            }
        }
        for (int i = 0; i < res.length; i++) {
            res[i] = list[i].stream().mapToInt(Integer::intValue).toArray();
        }
        return res;
    }

    public void insert(String small, int id) {
        Node temp = root;

        for (int i = 0; i < small.length(); i++) {
            int ch = small.charAt(i) - 'a';
            if(temp.children[ch] == null) {
                temp.children[ch] = new Node();
            }
            temp = temp.children[ch];
        }
        temp.isEnd = true;
        temp.id = id;
    }
}

@chang-you
Copy link

public class Solution {
    public int[][] multiSearch(String article, String[] sensitive) {
        
        Map<String, List<Integer>> statistics = new LinkedHashMap<>();
        Trie trie = new Trie();
        for (int i = 0; i < sensitive.length; i++) {
            trie.insert(sensitive[i]);
            statistics.put(sensitive[i], new LinkedList<>());
        }

       
        int length = article.length();
        int start = 0;
        int end = 0;
        while (start < length) {
            //subString为[x,y)
            String word = article.substring(start, end + 1);
            
            if (!trie.searchWith(word)) {
                end = ++start;
            }
          
            else {
                if (statistics.containsKey(word))
                    statistics.get(word).add(start);
               
                end++;
            }

          
            if (end > length - 1) {
                end = ++start;
            }
        }

      
        int[][] res = new int[sensitive.length][];
        int rIndex = 0;
        for (Map.Entry<String, List<Integer>> entry : statistics.entrySet()) {
            int size = entry.getValue().size();
            int[] tmp = new int[size];
            for (int i = 0; i < size; i++) {
                tmp[i] = entry.getValue().get(i);
            }
            res[rIndex++] = tmp;
        }
        return res;
    }

    class Trie {
        TrieNode root;

        Trie() {
            root = new TrieNode();
        }

        public void insert(String word) {
            TrieNode cur = root;
            for (int i = 0; i < word.length(); i++) {
                int cIndex = word.charAt(i) - 'a';
                if (cur.children[cIndex] == null)
                    cur.children[cIndex] = new TrieNode();
                cur = cur.children[cIndex];
            }
            cur.isEnd = true;
        }

        public boolean searchWith(String word) {
            TrieNode cur = root;
            for (int i = 0; i < word.length(); i++) {
                int cIndex = word.charAt(i) - 'a';
                if (cur.children[cIndex] == null)
                    return false;
                cur = cur.children[cIndex];
            }
            return true;
        }
    }

    class TrieNode {
        boolean isEnd = false;
        TrieNode[] children = new TrieNode[26];
    }
}

@mokrs
Copy link

mokrs commented Nov 23, 2021

class Solution1 {
private:
	TrieNode *root = new TrieNode();

	void insert(string word, int s){
		TrieNode *p = root;
		for (int i = 0; i < word.size(); ++i){
			int t = word[i] - 'a';
			if (p->next[t] == nullptr) {
				p->next[t] = new TrieNode();
			}
			p = p->next[t];
		}
		p->val = s;
	}

	void search(string word, vector<vector<int>>& res, int bid){
		TrieNode *cur = root;
		for (int i = 0; i < word.size(); ++i){
			int t = word[i] - 'a';
			if (cur->val != -1) {
				res[cur->val].push_back(bid);
			}
			if (cur->next[t] == nullptr) {
				return;
			}
			cur = cur->next[t];
		}

		if (cur->val != -1) {
			res[cur->val].push_back(bid);
		}
	}

public:
	vector<vector<int>> multiSearch(string big, vector<string>& smalls) {
		vector<vector<int>> res(smalls.size(), vector<int>{});

		for (int i = 0; i < smalls.size(); ++i){
			if (smalls[i].size() == 0) {
				continue;
			}

			insert(smalls[i], i);
		}

		for (int i = 0; i < big.size(); ++i){
			string word = big.substr(i, big.size() - i);
			search(word, res, i);
		}

		return res;
	}
	
};

@ZhuMengCheng
Copy link

// 利用js indexof暴力匹配

var multiSearch = function (big, smalls) {
    var positions = [];
    for (let i = 0; i < smalls.length; i++) {
        positions[i] = [];
        if (smalls[i] == "") continue;
        var k = 0;
        while (big.indexOf(smalls[i], k) >= 0) {
            var pos = big.indexOf(smalls[i], k);
            positions[i].push(pos);
            if (k < big.length - 1)
                k = pos + 1;
            else
                break;
        }
    }
    return positions;
};

时间复杂度:O(N*K)
空间复杂度:O(N)

@july-aha
Copy link

function kmp(s,t){
   let next=next(t),i=0,j=0;//i-主串 j-模式串
  while(i<s.length){
    if(s[i]===t[j]) {
       i++;
       j++
    }else if(j!==0){
       j=next[j-1]
    }else{
      i++;
    }
    if(j===t.length){
       return i-j+1
       //j=next[j-1]
    }
  }
  
}


function next(s){
  let j=0,next=[0]
  for(let i=1;i<s.length;i++){
       while(j>0&&s[i]!==s[j]){
           j=next[j-1]
       }
       if(s[i]==s[j]){
          j++
        }
        next[i]=j
   }
}

@kbfx1234
Copy link

面试题 17.17. 多次搜索

// 11-23 trie 模仿还得再研究研究
class Solution {
struct Trie {
    Trie* next[26];
    int idx = -1;
};

public:
    vector<vector<int>> multiSearch(string big, vector<string>& smalls) {
        root = new Trie();
        vector<vector<int>> res(smalls.size(), vector<int> {});

        for (int i = 0; i < smalls.size(); ++i) {
            insert(smalls[i], i);
        }
        search(big, res);
        return res;

    }
private:
    Trie* root;
    void insert(string s, int idx) {
        Trie* p = root;

        for (auto &a : s) {
            int i = a - 'a';
            if (p->next[i] == NULL) {
                p->next[i] = new Trie();
            }
            p = p->next[i];
        }
        p->idx = idx;
    }

    void search(string s, vector<vector<int>>& res) {
        for (int i = 0; i < s.size(); ++i) {
            Trie* p = root;
            int j = i;
            while (j < s.size()) {
                int k = s[j] - 'a';
                if (p->next[k] == NULL) {
                    break;
                }
                p = p->next[k];
                ++j;
                if (p->idx != -1) {
                    res[p->idx].push_back(i);
                }
            }
        }
    }
};

@HWFrankFung
Copy link

Codes

 var multiSearch = function(big, smalls) {

    let result=[]
    function match(big,item){
      let res=[];
      if(item==''||item.length>big.length){
        return res
      }
      let index=big.indexOf(item)
      while(index>-1){
        res.push(index)
        let ary=big.split('')
        ary.splice(index,1,'#')
        big=ary.join('')
        index=big.indexOf(item)
      }
      return res
    }
    for(let i of smalls){
      result.push(match(big,i))
    }
    return result
  };

@HouHao1998
Copy link

class Solution {
    public int[][] multiSearch(String big, String[] smalls) {
        List<List<Integer>> list=new ArrayList<>();
        for(int i=0;i<smalls.length;i++){
            List<Integer> loc=new ArrayList<>();
            for(int j=smalls[i].length();j>0&&j<=big.length();j++){if(smalls[i].equals(big.substring(j-smalls[i].length(),j))){loc.add(j-smalls[i].length());}}
            list.add(new ArrayList<>(loc));
        }
        int[][] ans=new int[smalls.length][];
        for(int k=0;k<smalls.length;k++){
            ans[k]=new int[list.get(k).size()];
            for(int p=0;p<list.get(k).size();p++){ans[k][p]=list.get(k).get(p);}
        }
        return ans;
    }
}

@guangsizhongbin
Copy link

guangsizhongbin commented Nov 23, 2021

func multiSearch(big string, smalls []string) [][]int {
    sa := suffixarray.New([]byte(big))
    pos := make([][]int, 0, len(smalls))
    for _, s := range smalls {
        // 从后往前找s
        ps := sa.Lookup([]byte(s), -1)
        //  从小到大排序
        sort.Ints(ps)
        pos = append(pos, ps)
    }
    return pos
}

@xulli1996
Copy link

    def multiSearch(self, big: str, smalls: List[str]) -> List[List[int]]:
        trie = {}
        ans = [[] for _ in range(len(smalls))]
        for i, small in enumerate(smalls):
            cur = trie
            for ch in small:
                if ch not in cur:
                    cur[ch] = {}
                cur = cur[ch]
            cur['*'] = i

        for i, ch in enumerate(big):
            cur = trie
            index = i
            while ch in cur:
                cur = cur[ch]
                if '*' in cur:
                    ans[cur['*']] += [index]
                i += 1
                if i >= len(big):
                    break
                ch = big[i]

        return ans

@brainlds
Copy link

class Solution {
class Node{
Node[] children = new Node[26];
int id;
}
Node root;
public void insert(String str,int id){
Node node = this.root;
for(int i = 0 ; i < str.length(); i++ ){
int u = str.charAt(i) - 'a';
if(node.children[u]==null){
node.children[u] = new Node();
}
node = node.children[u];
}
node.id = id;
}
public int[][] multiSearch(String big, String[] smalls) {
root = new Node();
for(int i = 0;i < smalls.length ; i++){
insert(smalls[i],i+1);
}
int[][] ans = new int[smalls.length][];
ArrayList<ArrayList> ls = new ArrayList<>();
for(int i =0 ;i < smalls.length ;i ++){
ls.add(new ArrayList<>());
}
for(int i = 0 ;i < big.length();i++){
Node node = this.root;
int j = i;
while(j<big.length() &&node.children[big.charAt(j)-'a']!=null){

            node = node.children[big.charAt(j)-'a'];
            j++;
            if(node.id!=0) ls.get(node.id-1).add(i);
        }
    }
    for(int i =0 ;i < smalls.length ;i ++){
        int[] res = new int[ls.get(i).size()];
        for(int j =0;j<ls.get(i).size();j++){
            res[j] = ls.get(i).get(j);
        }
        ans[i]=res;
    }
    return ans;
}

}

@ysy0707
Copy link

ysy0707 commented Nov 23, 2021

思路:前缀树

class Solution {

    class Trie{
        TrieNode root;
        public Trie(String[] words){
            root = new TrieNode();
            for(String word : words){
                TrieNode node = root;
                for(char w : word.toCharArray()){
                    int i = w - 'a';
                    if(node.next[i] == null){
                        node.next[i] = new TrieNode();
                    }
                    node = node.next[i];
                }
                node.end = word;
            }
        }

        public List<String> search(String str){
            TrieNode node = root;
            List<String> res = new ArrayList<>();
            for(char c : str.toCharArray()){
                int i = c - 'a';
                if(node.next[i] == null){
                    break;
                }
                node = node.next[i];
                if(node.end != null){
                    res.add(node.end);
                }
            }
            return res;
        }  
    }

    class TrieNode{
        String end;
        TrieNode[] next = new TrieNode[26];
    }


    public int[][] multiSearch(String big, String[] smalls) {
        Trie trie = new Trie(smalls);
        Map<String, List<Integer>> hit = new HashMap<>();
        for(int i = 0; i < big.length(); i++){
            List<String> matchs = trie.search(big.substring(i));
            for(String word: matchs){
                if(!hit.containsKey(word)){
                    hit.put(word, new ArrayList<>());
                }
                hit.get(word).add(i);
            }
        }
        
        int[][] res = new int[smalls.length][];
        for(int i = 0; i < smalls.length; i++){
            List<Integer> list = hit.get(smalls[i]);
            if(list == null){
                res[i] = new int[0];
                continue;
            }
            int size = list.size();
            res[i] = new int[size];
            for(int j = 0; j < size; j++){
                res[i][j] = list.get(j);
            }
        }
        return res;
    }
}

时间复杂度:O(N*K) ,K 是敏感词中最长单词长度,N 是长句的长度。
空间复杂度:O(S),S 为所有匹配成功的位置的个数

@chenming-cao
Copy link

解题思路

前缀树。参考官方题解。将所有在smalls中的word插入到前缀树中,然后我们每次改变起始字母搜索big中的子字符串,判断是否在前缀树中,如果存在,我们就放在结果数组中。

代码

class Solution {
    class TrieNode {
        TrieNode[] children;
        boolean isWord;
        int id;

        public TrieNode() {
            children = new TrieNode[26];
            isWord = false;
            id = 0;
        }
    }

    private TrieNode root = new TrieNode();

    public int[][] multiSearch(String big, String[] smalls) {
        int n = smalls.length;
        // initialize res array
        List<Integer>[] res = new List[n];
        for (int i = 0; i < n; i++) {
            res[i] = new ArrayList<>();
        }
        // build a trie, insert each word in smalls and use the index in smalls as id
        for (int i = 0; i < n; i++) {
            insert(smalls[i], i);
        }
        // search in big, change the starting letter each time and search whether substring exists in smalls
        for (int i = 0; i < big.length(); i++) {
            TrieNode node = root;
            for (int j = i; j < big.length(); j++) {
                char letter = big.charAt(j);
                if (node.children[letter - 'a'] == null) {
                    break;
                }
                node = node.children[letter - 'a'];
                if (node.isWord) {
                    res[node.id].add(i);
                }
            }
        }
        // convert to 2D array for output
        int[][] ret = new int[n][];
        for (int i = 0; i < n; i++) {
            ret[i] = new int[res[i].size()];
            for (int j = 0; j < ret[i].length; j++) {
                ret[i][j] = res[i].get(j);
            }
        }
        return ret;
    }

    private void insert(String word, int id) {
        TrieNode node = root;
        for (int i = 0; i < word.length(); i++) {
            char letter = word.charAt(i);
            if (node.children[letter - 'a'] == null) {
                node.children[letter - 'a'] = new TrieNode();
            }
            node = node.children[letter - 'a'];
        }
        node.isWord = true;
        node.id = id;
    }
}

复杂度分析

  • 时间复杂度:O(n * k),k为smalls中最长单词长度,n为big的长度
  • 空间复杂度:O(s), s为所有匹配成功的位置的个数

@heyqz
Copy link

heyqz commented Nov 23, 2021

class Trie:
   def __init__(self, words):
       self.d = {}
       for i in range(len(words)):
           tree = self.d
           for char in words[i]:
               if char not in tree:
                   tree[char] = {}
               tree = tree[char]
           tree['end'] = i
       
   def search(self, s):
       tree = self.d
       res = []
       for char in s:
           if char not in tree:
               return res
           tree = tree[char]
           if 'end' in tree:
               res.append(tree['end'])
       return res

class Solution:
   def multiSearch(self, big: str, smalls: List[str]) -> List[List[int]]:
       trie = Trie(smalls)
       res = [[] for _ in range(len(smalls))]
       for i in range(len(big)):
           tmp = trie.search(big[i:])
           for idx in tmp:
               res[idx].append(i)
       return res

@dahuang257
Copy link

class Solution {
struct Trie {
Trie* next[26];
int idx = -1;
};

public:
vector<vector> multiSearch(string big, vector& smalls) {
root = new Trie();
vector<vector> res(smalls.size(), vector {});

    for (int i = 0; i < smalls.size(); ++i) {
        insert(smalls[i], i);
    }
    search(big, res);
    return res;

}

private:
Trie* root;
void insert(string s, int idx) {
Trie* p = root;

    for (auto &a : s) {
        int i = a - 'a';
        if (p->next[i] == NULL) {
            p->next[i] = new Trie();
        }
        p = p->next[i];
    }
    p->idx = idx;
}

void search(string s, vector<vector<int>>& res) {
    for (int i = 0; i < s.size(); ++i) {
        Trie* p = root;
        int j = i;
        while (j < s.size()) {
            int k = s[j] - 'a';
            if (p->next[k] == NULL) {
                break;
            }
            p = p->next[k];
            ++j;
            if (p->idx != -1) {
                res[p->idx].push_back(i);
            }
        }
    }
}

};

@ymkymk
Copy link

ymkymk commented Nov 23, 2021

思路

参考官网:
trie中记录smalls中的字符串,末尾记录字符串,方便后面遍历。
trie中的search用于搜索字符串,将搜索到的字符串存入返回值中。
遍历big长字符串,将其与trie匹配。
按smalls顺序输出最终结果。

代码

``


class Solution {

    class Trie{
        TrieNode root;
        public Trie(String[] words){
            root = new TrieNode();
            for(String word : words){
                TrieNode node = root;
                for(char w : word.toCharArray()){
                    int i = w - 'a';
                    if(node.next[i] == null){
                        node.next[i] = new TrieNode();
                    }
                    node = node.next[i];
                }
                node.end = word;
            }
        }

        public List<String> search(String str){
            TrieNode node = root;
            List<String> res = new ArrayList<>();
            for(char c : str.toCharArray()){
                int i = c - 'a';
                if(node.next[i] == null){
                    break;
                }
                node = node.next[i];
                if(node.end != null){
                    res.add(node.end);
                }
            }
            return res;
        }  
    }

    class TrieNode{
        String end;
        TrieNode[] next = new TrieNode[26];
    }


    public int[][] multiSearch(String big, String[] smalls) {
        Trie trie = new Trie(smalls);
        Map<String, List<Integer>> hit = new HashMap<>();
        for(int i = 0; i < big.length(); i++){
            List<String> matchs = trie.search(big.substring(i));
            for(String word: matchs){
                if(!hit.containsKey(word)){
                    hit.put(word, new ArrayList<>());
                }
                hit.get(word).add(i);
            }
        }
        
        int[][] res = new int[smalls.length][];
        for(int i = 0; i < smalls.length; i++){
            List<Integer> list = hit.get(smalls[i]);
            if(list == null){
                res[i] = new int[0];
                continue;
            }
            int size = list.size();
            res[i] = new int[size];
            for(int j = 0; j < size; j++){
                res[i][j] = list.get(j);
            }
        }
        return res;
    }
}


复杂度分析

时间复杂度:O(n∗m),n=len(big),m=len(smalls)

空间复杂度:O(m∗k),k=max(len(smalls[i])),即k是smalls中最长字符串的长度

@wangyifan2018
Copy link

class Trie:
    def __init__(self, words):
        self.d = {}
        for word in words:
            t = self.d
            for w in word:
                if w not in t:
                    t[w] = {}
                t = t[w]
            t['end'] = word
    
    def search(self, s):
        t = self.d
        res = []
        for w in s:
            if w not in t:
                break
            t = t[w]
            if 'end' in t:
                res.append(t['end'])
        return res

class Solution:
    def multiSearch(self, big: str, smalls: List[str]) -> List[List[int]]:
        trie = Trie(smalls)
        hit = collections.defaultdict(list)

        for i in range(len(big)):
            matchs = trie.search(big[i:])
            for word in matchs:
                hit[word].append(i)
        
        res = []
        for word in smalls:
            res.append(hit[word])
        return res

@Lydia61
Copy link

Lydia61 commented Nov 23, 2021

多次搜索

思路

字典树(评论区)

代码

class Trie:
   def __init__(self, words):
       self.d = {}
       for i in range(len(words)):
           tree = self.d
           for char in words[i]:
               if char not in tree:
                   tree[char] = {}
               tree = tree[char]
           tree['end'] = i
       
   def search(self, s):
       tree = self.d
       res = []
       for char in s:
           if char not in tree:
               return res
           tree = tree[char]
           if 'end' in tree:
               res.append(tree['end'])
       return res

class Solution:
   def multiSearch(self, big: str, smalls: List[str]) -> List[List[int]]:
       trie = Trie(smalls)
       res = [[] for _ in range(len(smalls))]
       for i in range(len(big)):
           tmp = trie.search(big[i:])
           for idx in tmp:
               res[idx].append(i)
       return res

@V-Enzo
Copy link

V-Enzo commented Nov 23, 2021

思路

  1. 对于短的构建trie树。
  2. 从长的字符串中找所有可能以当中的字符开始的串。
    3.评论区的总结,很到位。
    好像用 trie 的题目都是一模一样的场景:给你一个长句子,再给你一堆“敏感词”,然后让你找敏感词在句子里的位置(因为要把敏感词换成 ***)。

把敏感词 smalls 的数量记为 t,把敏感词里最长的字符串长度记为 k,把长句子 big 的长度记为 b。

具体步骤:

1)把这堆敏感词建成一颗 Trie 树,时间复杂度是 O(tk)。

2)遍历长句子的每一个字母,检查“以该字母作为起点”的话,是否可以在 trie 中找到结果。时间复杂度是 O(bk)

综上,总的时间复杂度是 O(tk + bk)。在这种题目场景下这种 trie 的思路应该就是时间复杂度最好的答案了。

class Solution {
private:
    struct node
    {
        int idx;
        vector<node*> children;
        node() : idx(-1), children(26, nullptr){} 

    };
    struct Trie
    {
        node* root;
        Trie():root(new node()){}
        void insert(string& word, int idx)
        {
            node* p = root;
            for(auto& w:word)
            {
                w-='a';
                if(p->children[w]==nullptr)
                {
                    p->children[w] = new node();
                }
                p = p->children[w];
            }
            p->idx = idx;
        }
    };
public:
    vector<vector<int>> multiSearch(string big, vector<string>& smalls) 
    {
        int big_N = big.length();
        int small_N = smalls.size();
        vector<vector<int>> res(small_N);
        Trie trie = Trie();
        for(int i=0; i<small_N; i++)
        {
            trie.insert(smalls[i], i);
        }
        for(int i=0; i<big_N; i++)
        {
            int j = i;
            node* p = trie.root;
            while(j<big_N && p->children[big[j]-'a'])
            {
                p = p ->children[big[j]-'a'];
                if(p->idx != -1)
                    res[p->idx].push_back(i);
                j++;
            }
        }
        return res;
    }
};

Complexity:

Time:O(n^2)
Space:O(n^2)

@BreezePython
Copy link

思路

前缀树

代码

class Trie:
    def __init__(self, words):
        self.d = {}
        for word in words:
            t = self.d
            for w in word:
                if w not in t:
                    t[w] = {}
                t = t[w]
            t['end'] = word
    
    def search(self, s):
        t = self.d
        res = []
        for w in s:
            if w not in t:
                break
            t = t[w]
            if 'end' in t:
                res.append(t['end'])
        return res

class Solution:
    def multiSearch(self, big: str, smalls: List[str]) -> List[List[int]]:
        trie = Trie(smalls)
        hit = collections.defaultdict(list)

        for i in range(len(big)):
            matchs = trie.search(big[i:])
            for word in matchs:
                hit[word].append(i)
        
        res = []
        for word in smalls:
            res.append(hit[word])
        return res

复杂度

  • 时间复杂度:O(n∗m)
  • 空间复杂度:O(n∗m)

@falsity
Copy link

falsity commented Nov 23, 2021

public int[][] multiSearch(String big, String[] smalls) {

    int[][] res = new int[smalls.length][];

    List<Integer> cur = new ArrayList<>();

    for (int i = 0; i < smalls.length; i++) {

        String small = smalls[i];

        if (small.length() == 0) {

            res[i] = new int[]{};
            continue;
        }

        int startIdx = 0;
        while (true) {

            int idx = big.indexOf(small, startIdx);
            if (idx == -1)
                break;

            cur.add(idx);
            startIdx = idx + 1;
        }

        res[i] = new int[cur.size()];
        for (int j = 0; j < res[i].length; j++)
            res[i][j] = cur.get(j);

        cur.clear();
    }

    return res;
}

@Laurence-try
Copy link

class Solution {

private Node root = new Node();

public int[][] multiSearch(String big, String[] smalls) {

    int n = smalls.length;
    // 初始化结果集
    List<Integer>[] res = new List[n];
    for(int i = 0 ; i < n ; i++)
        res[i] = new ArrayList<>();
    // 建树
    for(int i = 0 ; i < smalls.length; i++)
        insert(smalls[i], i);

    for(int i = 0 ; i < big.length(); i++){

        Node tmp = root;

        for(int j = i ; j < big.length(); j++){
            //不存在以该串为prefix的敏感词
            if(tmp.children[big.charAt(j) - 'a'] == null)
                break;

            tmp = tmp.children[big.charAt(j) - 'a'];

            if(tmp.isWord)
                res[tmp.id].add(i);
        }
    }
    // 返回二维数组
    int[][] ret = new int[n][];

    for(int i = 0 ; i < n ; i++){

        ret[i] = new int[res[i].size()];

        for(int j = 0 ; j < ret[i].length; j++)
            ret[i][j] = res[i].get(j);
    }

    return ret;
}

private void insert(String word, int id){

    Node tmp = root;

    for(int i = 0; i < word.length(); i++){

        if(tmp.children[word.charAt(i) - 'a'] == null)
            tmp.children[word.charAt(i) - 'a'] = new Node();

        tmp = tmp.children[word.charAt(i) - 'a'];
    }

    tmp.isWord = true;
    tmp.id = id;
}

class Node {

    Node[] children;
    boolean isWord;
    int id;

    public Node() {

        children = new Node[26];
        isWord = false;
        id = 0;
    }
}

}

@Richard-LYF
Copy link

class Solution {
public static int[][] multiSearch(String big, String[] smalls) {
int[][]multiSearch = new int[smalls.length][];
for(int i = 0;i < smalls.length; i++){
String small = smalls[i];
if("".equals(small)){
multiSearch[0] = new int[0];
return multiSearch;
}
int g =0;
int[] nu = new int[10000];
//multiSearch[i] = new int[5];
for(int j =0;j <= big.length()-small.length(); j++){
String mid = big.substring(j,j+small.length());
if(mid.contains(small)){
int index = j;
//multiSearch[i][g] = index;
nu[g] = index;
g++;
// System.out.println(index);
}
}
multiSearch[i] = new int[g];
for (int k = 0; k <multiSearch[i].length ; k++) {
multiSearch[i][k] = nu[k];
}

    }
    return multiSearch;
    //return null;
}

}

@huzizailushang
Copy link

//以smalls数组构建字典树,每个长串的后缀字符串去查找结果
class Node{
public:
Node* trieNode[26] = { nullptr };
int isEnd;
Node() {
isEnd = -1;
};
~Node() {
for (int i = 0; i < 26; i++) {
if (trieNode[i]) {
delete trieNode[i];
}
}
};
};
class Trie {
public:
Trie() {
root = new Node();
};
~Trie() {
delete root;
};
//插入单词,同时以单词在数组中的位置作为结束标记
//这样方便在查找到结果时直接插入在对的位置
void insert(string& word, int index) {
Node *p = root;
int n = word.size();
for (int i = 0; i < n; i++) {
int c = word[i] - 'a';
if (!p->trieNode[c]) {
p->trieNode[c] = new Node();
}
p = p->trieNode[c];
}
p->isEnd = index;
}
void getRes(string& str, vector<vector>& res, int start) {
Node *p = root;
int n = str.size();
for (int i = 0; i < n; i++) {
int c = str[i] - 'a';
if (p->trieNode[c]) {
p = p->trieNode[c];
if (p->isEnd != -1) {
res[p->isEnd].push_back(start);
}
} else
return;
}
}
private:
Node * root;
};

class Solution {
public:
vector<vector> multiSearch(string big, vector& smalls) {
int n = big.size();
int m = smalls.size();
Trie* trie = new Trie();
for (int i = 0; i < m; i++) {
trie->insert(smalls[i], i);
}
vector<vector> res(m);
//对每一个后缀字符串进行查找
for (int i = 0; i < n; i++) {
string tmpStr = big.substr(i);
trie->getRes(tmpStr, res, i);
}
return res;
}
};

@jaysonss
Copy link

思路

根据敏感词数组构建一颗trie树,然后遍历big的所有子串,如果子串是敏感词就记录下开始位置,最后返回结果

class Solution {
    Node root = new Node();

    public int[][] multiSearch(String big, String[] smalls) {
        int[][] ans = new int[smalls.length][];
        Map<Integer,List<Integer>> ansMap = new HashMap<>();

        for(int idx = 0;idx<smalls.length;idx++){
            insert(smalls[idx],idx);
        }

        for(int i=0;i<big.length();i++){
            Node cur = root;
            for(int j=i;j<big.length();j++){
                int jIdx = big.charAt(j) - 'a';
                if(cur.children[jIdx] == null)break;
                cur = cur.children[jIdx];
                if(cur.isWord){
                    ansMap.putIfAbsent(cur.id, new ArrayList<>());
                    ansMap.get(cur.id).add(i);
                }
            }
        }

        for(int idx = 0;idx<smalls.length;idx++){
            List<Integer> list = ansMap.getOrDefault(idx,new ArrayList<>());
            int[] arr = new int[list.size()];
            for(int k=0;k<arr.length;k++){
                arr[k] = list.get(k);
            }
            ans[idx] = arr;
        }
        return ans;
    }

    void insert(String word,int id){
        Node cur = root;
        for(int i=0;i<word.length();i++){
            int idx = word.charAt(i) - 'a';
            if(cur.children[idx] == null){
                cur.children[idx] = new Node();
            }
            cur = cur.children[idx];
        }
        cur.isWord = true;
        cur.id = id;
    }

    private static class Node{
        int id;
        boolean isWord = false;
        Node[] children = new Node[26];
    }

}

big长度为n, smalls最长字符串为k

  • 时间复杂度:O(n * k) , 内层循环最多执行k次就break了,外层循环为n次
  • 空间复杂度:smalls所有字符串出现次数的总和

@yibenxiao
Copy link

【Day 75】面试题 17.17 多次搜索

代码

class Solution {
private:
    struct node
    {
        int idx;
        vector<node*> children;
        node() : idx(-1), children(26, nullptr){} 

    };
    struct Trie
    {
        node* root;
        Trie():root(new node()){}
        void insert(string& word, int idx)
        {
            node* p = root;
            for(auto& w:word)
            {
                w-='a';
                if(p->children[w]==nullptr)
                {
                    p->children[w] = new node();
                }
                p = p->children[w];
            }
            p->idx = idx;
        }
    };
public:
    vector<vector<int>> multiSearch(string big, vector<string>& smalls) 
    {
        int big_N = big.length();
        int small_N = smalls.size();
        vector<vector<int>> res(small_N);
        Trie trie = Trie();
        for(int i=0; i<small_N; i++)
        {
            trie.insert(smalls[i], i);
        }
        for(int i=0; i<big_N; i++)
        {
            int j = i;
            node* p = trie.root;
            while(j<big_N && p->children[big[j]-'a'])
            {
                p = p ->children[big[j]-'a'];
                if(p->idx != -1)
                    res[p->idx].push_back(i);
                j++;
            }
        }
        return res;
    }
};

复杂度

时间复杂度:O(N^2)

空间复杂度:O(N^2)

@ChenJingjing85
Copy link

思路

对smalls 构造trie树,对big的每个子串查询

代码

class TrieNode:
    def __init__(self):
        self.children = {}
        self.isEnd = False
class Trie:
    def __init__(self):
        self.root = TrieNode()    

    def insert(self,word:str):
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.isEnd = True
    
    def search(self,word:str):
        node = self.root
        for ch in word:
            if ch not in node.children:
                return False
            node = node.children[ch]
        return node.isEnd

class Solution:
    def multiSearch(self, big: str, smalls: List[str]) -> List[List[int]]:
        trie = Trie()
        res = {}
        for word in smalls:
            trie.insert(word)
            res[word] = []
        for i in range(len(big)):
            for j in range(i,len(big)):
                if trie.search(big[i:j+1]):
                    res[big[i:j+1]].append(i)
        return list(res.values())

复杂度

  • 时间 O(N*K)

@ZETAVI
Copy link

ZETAVI commented Nov 23, 2021

class Solution {
    public int[][] multiSearch(String big, String[] smalls) {
        List<List<Integer>> list=new ArrayList<>();
        for(int i=0;i<smalls.length;i++){
            List<Integer> loc=new ArrayList<>();
            //注意边界j>0&&j<=big.length()
            for(int j=smalls[i].length();j>0&&j<=big.length();j++){if(smalls[i].equals(big.substring(j-smalls[i].length(),j))){loc.add(j-smalls[i].length());}}
            list.add(new ArrayList<>(loc));
        }
        int[][] ans=new int[smalls.length][];
        for(int k=0;k<smalls.length;k++){
            ans[k]=new int[list.get(k).size()];
            for(int p=0;p<list.get(k).size();p++){ans[k][p]=list.get(k).get(p);}
        }
        return ans;
    }
}

@skinnyh
Copy link

skinnyh commented Nov 23, 2021

Note

  • Brute force is O(N^3) to check all the smalls in big
  • Build a Trie tree with all the small words. Then for each index i in big, check all the substrings starting at i to see if they are in the Trie tree, is so then we find a match. The TrieNode also need to store the word's index in smalls to store the match position in result.

Solution

class TrieNode:
    def __init__(self):
        self.id = 0 # idx in smalls
        self.children = {}
        self.is_word = False

class Solution:
    def multiSearch(self, big: str, smalls: List[str]) -> List[List[int]]:
        root = TrieNode()
        # Build trie with smalls
        for i in range(len(smalls)):
            node = root
            for c in smalls[i]:
                if c not in node.children:
                    node.children[c] = TrieNode()
                node = node.children[c]
            node.is_word = True
            node.id = i
        # start search with big
        res = [[] for _ in range(len(smalls))]
        for i in range(len(big)):
            node = root
            for j in range(i, len(big)):
                if big[j] not in node.children:
                    break
                node = node.children[big[j]]
                if node.is_word:
                    res[node.id].append(i)
        return res

Time complexity: O((N+L) * K) where N=len(smalls), L=len(big), K=avg small word length

Space complexity: O(NK) Trie tree's worst case space

@shixinlovey1314
Copy link

Title:面试题 17.17. 多次搜索

Question Reference LeetCode

Solution

Build a trie using small, the trie will also store the index of the string in small.
Then for each letter in the big string, check if it gets a match in the trie, if so, add the index.

Code

class Solution {
public:
    struct trie {
        int index;
        vector<trie*> children;
        trie() {
            index = -1;
            children = vector<trie*>(26, NULL);
        }
    };

    vector<vector<int>> multiSearch(string big, vector<string>& smalls) {
        trie* tree = new trie();

        // Build trie
        for(int i = 0; i < smalls.size(); i++) {
            trie* node = tree;
            for (char ch : smalls[i]) {
                if (node->children[ch - 'a'] == NULL)
                    node->children[ch - 'a'] = new trie();
                node = node->children[ch - 'a'];
            }

            node->index = i;
        }

        vector<vector<int>> ans = vector<vector<int>>(smalls.size(), vector<int>());

        // Search
        for (int i = 0; i < big.length(); i++) {
            trie* node = tree;
            for (int j = i; j < big.length(); j++) {
                if (node->children[big[j] - 'a'] == NULL)
                    break;
                node = node->children[big[j] - 'a'];
                if (node->index != -1)
                    ans[node->index].push_back(i);
            }
        }

        return ans;
    }
};

Complexity

Time Complexity and Explanation

O(n*m): n is the length of big, since for each letter, it is potentially to compare with the whole trie tree.

Space Complexity and Explanation

O(m): there will only be 26 letters, m is the max length of the string in small

@ginnydyy
Copy link

Problem

https://leetcode-cn.com/problems/multi-search-lcci/

Notes

  • Trie
    • Insert all the strings from the smalls to a trie
    • Search each substring of the string big in the trie
      • any char of substring is not shown in the trie, break, no need to proceed
      • any word is matched in the substring, keep searching, until the end of substring, may have multiple words matched
    • TrieNode
      • children: array for the next characters
      • isWord: true when this trie node is the end of a word
      • index: the index of the word in the smalls array, when this node is the end of a word (isWord == true)

Solution

class Solution {
    public int[][] multiSearch(String big, String[] smalls) {
        // create a trie
        TrieNode root = new TrieNode();

        // insert all the words in the smalls array to the trie
        for(int i = 0; i < smalls.length; i++){
            String word = smalls[i];
            TrieNode node = root;
            for(char c: word.toCharArray()){
                if(node.children[c - 'a'] == null){
                    node.children[c - 'a'] = new TrieNode();
                }
                node = node.children[c - 'a'];
            }
            node.isWord = true;
            node.index = i;
        }

        // search each substring of the big in the trie
        List<Integer>[] search = new List[smalls.length];
        for(int i = 0; i < big.length(); i++){
            String key = big.substring(i);
            TrieNode node = root;
            for(char c: key.toCharArray()){
                if(node.children[c - 'a'] == null){
                    break;
                }
                node = node.children[c - 'a'];
                if(node.isWord){
                    if(search[node.index] == null){
                        search[node.index] = new ArrayList<>();
                    }
                    search[node.index].add(i);
                }
            }
        }

        // convert the search result to final result
        int[][] res = new int[search.length][];
        for(int i = 0; i < search.length; i++){
            if(search[i] == null){
                res[i] = new int[0];
            }else{
                res[i] = new int[search[i].size()];
                for(int j = 0; j < search[i].size(); j++){
                    res[i][j] = search[i].get(j);
                }
            }
        }

        return res;



    }
}

class TrieNode{
    int index;
    boolean isWord;
    TrieNode[] children;
    public TrieNode(){
        this.index = 0;
        this.isWord = false;
        this.children = new TrieNode[26];
    }
}

Complexity

  • Time: O(m * k) for searching, m is the length of string big, k is the max length of any string in the smalls array. O(N) for inserting all the strings of the smalls array to the trie, N is the number of characters in smalls array.
  • Space: O(N + P). N is the number of characters in the smalls array, for building the trie. P is the matched indexes for the temporay result array.

@q815101630
Copy link

Trie 树

这道题我们可以把small 放进trie里,每个词最后做个标记。然后遍历big,big需要遍历每个可能的substring (big[0:], big[1:], big[2:] ...),如果发现当前substring 是个词,那么加到ans里,最后处理ans即可。
打败时间100%

class Trie:
    def __init__(self):
        self.root = {}


class Solution:
    def multiSearch(self, big: str, smalls: List[str]) -> List[List[int]]:
        trie = Trie()
        for word in smalls:
            cur = trie.root
            for char in word:
                if char in cur:
                    cur = cur[char]
                else:
                    cur[char] = {}
                    cur = cur[char]
            cur["*"] = word


        ans = collections.defaultdict(list)
        
        for i in range(len(big)):
            cur = trie.root
            for innerChar in big[i:]:
                if innerChar in cur:
                    cur = cur[innerChar]
                    if "*" in cur:
                        ans[cur["*"]].append(i)
                else:
                    break
        ret = []
        for i in smalls:
            ret.append(ans[i])
        return ret

时间:
W: len(small)
S: maximum length of a small word
N: len(big)
O(WS+ N^2)
空间:
O(W
S)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests