Question

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: “ACGAATTCCG”. When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

Example:

Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"

Output: ["AAAAACCCCC", "CCCCCAAAAA"]

Thinking:

  • Method: bruto force, TLE, O(N ^ 2)
class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        List<String> result = new ArrayList<>();
        if(s == null || s.length() <= 10) return result;
        int len = s.length();
        LABEL:
        for(int i = 0; i <= len - 11; i++){
            String cmp = s.substring(i, i + 10);
            if(result.contains(cmp)) continue;
            for(int j = i + 1; j <= len - 10; j++){
                String cmp1 = s.substring(j, j + 10);
                if(cmp.equals(cmp1)){
                    result.add(cmp);
                    continue LABEL;
                }
            }
        }
        return result;
    }
}
  • Method 2: 滑动窗口, HashSet, O(N)
    • 将所有的可能都放在一个哈希集合(seen)中。
    • 如果后面出现的子串已经出现在seen中,则加入repeat中。
class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        Set<String> seen = new HashSet<>(), repeat = new HashSet<>();
        int len = s.length();
        for(int i = 0; i <= len - 10; i++){
            String sub = s.substring(i, i + 10);
            if(!seen.add(sub)) repeat.add(sub);
        }
        return new ArrayList<String>(repeat);
    }
}

二刷

  1. 最开始没有想到用两个Set,于是用了Map来记录出现的序列。
    class Solution {
     public List<String> findRepeatedDnaSequences(String s) {
         List<String> result = new LinkedList<>();
         Map<String, Integer> map = new HashMap<>();
         if(s == null || s.length() < 10) return result;
         int len = s.length();
         for(int i = 0; i <= len - 10; i++){
             String sub = s.substring(i, i + 10);
             if(!map.containsKey(sub)){
                 map.put(sub, 1);
             }else if(map.get(sub) == 1){
                 result.add(sub);
                 map.put(sub, map.get(sub) + 1);
             }else continue;
         }
         return result;
     }
    }