Longest Substring Which Contains 2 Unique Characters

This is a problem asked by Google.

Given a string, find the longest substring that contains only two unique characters. For example, given "abcbbbbcccbdddadacb", the longest substring that contains 2 unique characters is "bcbbbbcccb".

1. Longest Substring Which Contains 2 Unique Characters

In this solution, a HashMap tracks how many times each character occurs in the current window. When a third distinct character is added to the map, the left pointer moves right until only two distinct characters remain.

You can use "abac" to walk through this solution.

public int lengthOfLongestSubstringTwoDistinct(String s) {
    int max=0;
    HashMap<Character,Integer> map = new HashMap<Character, Integer>();
    int start=0;
 
    for(int i=0; i<s.length(); i++){
        char c = s.charAt(i);
        if(map.containsKey(c)){
            map.put(c, map.get(c)+1);
        }else{
            map.put(c,1);
        }
 
        if(map.size()>2){
            max = Math.max(max, i-start);
 
            while(map.size()>2){
                char t = s.charAt(start);
                int count = map.get(t);
                if(count>1){
                    map.put(t, count-1);
                }else{
                    map.remove(t);
                }
                start++;
            }
        }
    }
 
    max = Math.max(max, s.length()-start);
 
    return max;
}
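
The method above returns only the length, while the problem statement asks for the substring itself. Here is a minimal sketch of one way to return it, assuming ties go to the leftmost window. The class and method names are illustrative and not part of the original solution.

```java
import java.util.HashMap;
import java.util.Map;

final class LongestTwoDistinct {
    // Hypothetical helper: returns the longest substring of s that holds
    // at most two distinct characters (leftmost one on ties).
    static String longestSubstringTwoDistinct(String s) {
        Map<Character, Integer> counts = new HashMap<>();
        int start = 0;      // left edge of the current window
        int bestStart = 0;  // left edge of the best window seen so far
        int bestLen = 0;

        for (int i = 0; i < s.length(); i++) {
            counts.merge(s.charAt(i), 1, Integer::sum);

            // Shrink from the left until at most two distinct characters remain.
            while (counts.size() > 2) {
                char t = s.charAt(start++);
                if (counts.merge(t, -1, Integer::sum) == 0) {
                    counts.remove(t);
                }
            }

            int len = i - start + 1;
            if (len > bestLen) {
                bestLen = len;
                bestStart = start;
            }
        }
        return s.substring(bestStart, bestStart + bestLen);
    }
}
```

For the example string "abcbbbbcccbdddadacb" this sketch returns "bcbbbbcccb", and for "abac" it returns "aba".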

Now if this question is extended to be "the longest substring that contains k unique characters", what should we do?

2. Solution for K Unique Characters

The following solution is the corrected version. Given "abcadcacacaca" and k = 3, it returns 11, the length of the longest valid substring "cadcacacaca".

public int lengthOfLongestSubstringKDistinct(String s, int k) {
    int max=0;
    HashMap<Character,Integer> map = new HashMap<Character, Integer>();
    int start=0;
 
    for(int i=0; i<s.length(); i++){
        char c = s.charAt(i);
        if(map.containsKey(c)){
            map.put(c, map.get(c)+1);
        }else{
            map.put(c,1);
        }
 
        if(map.size()>k){
            max = Math.max(max, i-start);
 
            while(map.size()>k){
                char t = s.charAt(start);
                int count = map.get(t);
                if(count>1){
                    map.put(t, count-1);
                }else{
                    map.remove(t);
                }
                start++;
            }
        }
    }
 
    max = Math.max(max, s.length()-start);
 
    return max;
}

Time complexity is O(n): each character is added to the window once and removed from it at most once.

  • Mike

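    public static int longestSubstring(String s) {
        int i = 0, start = 0, count = 0;
        Set<Character> set = new HashSet<>();
        while (i < s.length()) {
            set.add(s.charAt(i));
            if (set.size() > 2) {
                // s[start, i) holds two distinct characters; record it, then restart one position later
                if (i - start > count) count = i - start;
                set.clear();
                i = ++start;
            } else {
                i++;
            }
        }
        // the final window never hit a third character, so count it here
        if (i - start > count) count = i - start;
        return count;
    }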

  • Mike

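    I think mine is the simplest one.

    public int lengthOfLongestSubstringKDistinct(String s) {
        int i = 0, start = 0, count = 0;
        Set<Character> characters = new HashSet<>();
        while (i < s.length()) {
            characters.add(s.charAt(i));
            if (characters.size() > 2) {
                characters.clear();
                if (i - start > count) {
                    count = i - start; // window length, not i - 1
                }
                i = ++start;
            } else {
                i++;
            }
        }
        // account for a final window that reaches the end of the string
        return Math.max(count, i - start);
    }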

  • Cherry Zhao

    This is an interesting question. I found this post has a detailed analysis

    http://blog.gainlo.co/index.php/2016/04/12/find-the-longest-substring-with-k-unique-characters/

  • Juan Melo

    As jason zhang pointed out, solution 3 is wrong. The result of
    maxSubStringKUniqueChars("abcadcacacaca", 3)
    is "abca", but it should be "cadcacacaca".

    Here's a working O(n^2) piece I did:

    public static String findLongest(String myString, int max) {
    if (myString == null) {
    return null;
    }
    int start = 0, end = 0;
    for (int i = 0; i < myString.length(); i++) {
    Map map = new HashMap();
    for (int j = i; j max || j == myString.length() - 1) {
    if (end - 1 - start max) {
    map.clear();
    }
    break;
    }
    }
    }
    return myString.substring(start, end);
    }
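
    A minimal sketch of a quadratic approach along these lines, useful for cross-checking faster solutions. This is not a restoration of the code above: the class and method names are illustrative assumptions, and it returns an empty string rather than null for null input.

```java
import java.util.HashSet;
import java.util.Set;

final class BruteForceKDistinct {
    // Quadratic reference: for every start index, extend right while the
    // window holds at most k distinct characters. Names are illustrative.
    static String longest(String s, int k) {
        if (s == null || k <= 0) {
            return "";
        }
        int bestStart = 0, bestEnd = 0; // half-open range [bestStart, bestEnd)
        for (int i = 0; i < s.length(); i++) {
            Set<Character> seen = new HashSet<>();
            int j = i;
            while (j < s.length()) {
                seen.add(s.charAt(j));
                if (seen.size() > k) {
                    break; // s.charAt(j) would be the (k+1)-th distinct character
                }
                j++;
            }
            if (j - i > bestEnd - bestStart) {
                bestStart = i;
                bestEnd = j;
            }
        }
        return s.substring(bestStart, bestEnd);
    }
}
```

    With the test case from this thread, longest("abcadcacacaca", 3) returns "cadcacacaca".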

  • Larry Okeke

    Just create a data structure that acts as a string that doesn't allow more than k distinct elements to be added to it.

    import java.util.*;

    public class longest_substring_two {

    public static void main(String[] args){

    System.out.println("Length " + solution(args[0]));
    }

    public static String solution(String s){

    data_structure ds = new data_structure();
    int sequence = 0;
    int longest = 0;
    String solution = "";
    for(int i = 0; i < s.length(); i++){

    for(int j = i; j longest){

    longest = sequence;
    solution = ds.get();
    }
    ds.clear();
    break;
    }

    }
    }

    return solution;

    }

    private static class data_structure{

    /*
    this data structure adds string, allowing only a specified number of duplicates

    */
    ArrayList word = new ArrayList();

    private ArrayList allowed = new ArrayList();
    StringBuilder b = new StringBuilder();
    public boolean insert(char c){

    //this character is allowed just put it in
    if(allowed.contains(c)){
    word.add(c);
    b.append(c);
    return true;
    }
    //this character is not allowed, if it can be allowed, add it
    else if(allowed.size() < 2){
    allowed.add(c);
    word.add(c);
    b.append(c);
    return true;
    }
    //this character is not allowed and cannot be added.
    else{

    return false;

    }
    }

    public int length(){
    return word.size();
    }

    public void clear(){
    allowed.clear();
    word.clear();
    b.setLength(0);
    }

    public String get(){

    return b.toString();
    }

    }
    }

  • Shobhit Jaiswal

    here is my solution

    static String uniquecharSubs(String str, int k) {
        Set<Character> set = new HashSet<>();
        int max = 0;
        StringBuilder sb = new StringBuilder();
        char ch[] = str.toCharArray();
        String newstr = "";
        for (int i = 0; i < ch.length; i++) {
            set = new HashSet<>();
            sb = new StringBuilder();
            for (int j = i; set.size() <= k; j++) {
                if (j >= ch.length) {
                    // the window reached the end of the string; record it before returning
                    if (sb.length() > max) {
                        newstr = sb.toString();
                    }
                    return newstr;
                } else {
                    set.add(ch[j]);
                    if (set.size() > k) {
                        if (sb.length() > max) {
                            max = sb.length();
                            newstr = sb.toString();
                        }
                    }
                    sb.append(ch[j]);
                }
            }
        }
        return newstr;
    }

  • Pritika Mehta


    import java.util.HashSet;
    import java.util.TreeMap;

    public class LCSWithTwoChar{
    public static void main(String args[]){
    String str = "pritika";
    TreeMap map = new TreeMap();
    int start =0;
    int len =0;
    int i=0;
    int j =0,k;
    char c1 =' ';
    char c2 =' ';
    for(k =0;k<str.length();k++){
    char c= str.charAt(k);
    //System.out.println("char is "+c+" c1= "+c1+" c2="+c2);
    if(c1 == ' '){
    c1 = c;
    i = k;
    len++;
    start =k;
    }else if(c2 == ' '){
    c2 = c;
    j = k;
    len++;
    }else if(c1 == c || c2 ==c){
    if(c1 == c){
    i = k;
    len++;
    while(k+1<str.length() && c1 == str.charAt(k+1)){
    k++;
    len++;
    }

    }else if(c2 == c){
    j = k;
    len++;
    while(k+1<str.length() && c2 == str.charAt(k+1)){
    k++;
    len++;
    }

    }
    }else if(c1 != c && c2 !=c){
    map.put(k-start, new Index(start,k-1));
    if(i<j){
    c1 = c2;
    c2 = c;
    i =j;
    j =k;
    start = i;
    len = k-i;
    }else{
    c2 = c;
    j = k;
    start = i;
    len = k-i;
    }
    }
    if(k == str.length()-1)
    map.put(k-start+(start==0?0:1), new Index(start,k-1));
    //System.out.println("char is "+c+" c1= "+c1+" c2="+c2);
    //System.out.println(map.toString()+" start= "+start+" len is "+len+" i= "+i+" j= "+j+"n--------------------");
    }

    System.out.println(map.lastKey());
    }

    }

    class Index{
    int i;
    int j;
    Index(int i,int j){
    this.i = i;
    this.j = j;
    }
    public String toString(){
    return i+" "+j;
    }
    }

  • up23

    Here's one in PHP, in O(n) time assuming in_array() is constant time, or O(n*k) if not.

    function findSubstring($text, $k = 2) {

    if (strlen($text) < $k) {
    return $text;
    }

    $bestAns = "";
    $lastTwoUniqueChars = array();

    $ans = "";
    for ($i = 0; $i < strlen($text); $i ++ ) {
    $letter = $text[$i];

    if (count($lastTwoUniqueChars) < $k) {
    if (!in_array($letter, $lastTwoUniqueChars)) {
    $lastTwoUniqueChars[] = $letter;
    }
    continue;
    }

    if (in_array($letter, $lastTwoUniqueChars)) {
    $ans .= $letter;
    }
    else {
    // new letter from last k uniques
    $lastTwoUniqueChars = array();
    $ans = "";
    for ($j = 0; $j strlen($bestAns)) {
    $bestAns = $ans;
    }
    }

    if ($bestAns == "") {
    return $text;

    }
    else {
    return $bestAns;
    }

    }

    echo findSubstring("a") . "\n\n";
    echo findSubstring("ab") . "\n\n";
    echo findSubstring("abcde") . "\n\n";
    echo findSubstring("abcbbbbcccbdddadacb") . "\n\n";

  • ryanlr

    You are right. I have changed the 3rd solution. Thanks!

  • Dylan Wang

    Does the 3rd code work?

    Looking through the code, it seems that this line should be put inside the while loop: “char first = s.charAt(start);”

    I think the code should be changed to

    //move left cursor toward right, so that substring contains only k chars
    while(map.size()>k)
    {
    char first = s.charAt(start);
    int count = map.get(first);
    if(count>1){
    map.put(first,count-1);
    }else{
    map.remove(first);
    }

    start++;
    }

  • Satish

    public static String unique2CharSubstring(String str) {
    String result = "";
    int len = str.length();
    HashMap map = new HashMap();
    char[] c = str.toCharArray();
    int right = 0, max = 0;
    LinkedList queue = new LinkedList();
    for (int left = 0; left < len; left++) {
    while (right 2) {
    left = Math.max(left, map.get(queue.peek()) + 1);
    map.remove(queue.pop());
    }
    if (right - left > max) {
    max = right - left;
    result = str.substring(left, right + 1);
    }
    right++;
    }
    }
    return result;
    }

  • Vladimir Kravtsov

    I also tried to execute the first method after I failed to understand the idea behind it, and it usually produces an answer with the same beginning as the input string.

  • jason zhang

    Solution 3 does not work. For example, this test case can't pass:

    assertEquals("cadcacacaca", maxSubStringKUniqueChars("abcadcacacaca", 3));

    The logic is this: we should not remove the first character (char first = s.charAt(start);), which is a, but the first character which does not exist anymore. In the example, the character that should be removed is b.

  • Anony

    Simple Java solution for 2 unique chars

    import java.util.Arrays;

    public class LgstUnq {

    public static char [] longestString(char [] tab) {

    if(tab == null || tab.length <= 2) {

    return tab;

    }

    int max = 2;

    int newMax = 2;

    char [] cMax = new char[tab.length];

    char [] lMax = cMax;

    char x = tab[0];

    char y = tab[1];

    cMax[0] = x;

    cMax[1] = y;

    for(int i = 2, j = 2; i max) {

    max = newMax;

    lMax = Arrays.copyOf(cMax, i+1);

    newMax = 0;

    }

    x = c;

    cMax = new char[tab.length - i + 1]; // including previous char

    j = 0;

    c = tab[--i];

    }

    cMax[j++] = c;

    newMax++;

    }

    return lMax;

    }

    public static void main(String [] args) {

    String v = args[0];

    char [] arr = v.toCharArray();

    System.out.println(longestString(arr));

    }

    }

  • neer1304

    For string “aabaaddaa” it gives output as “aabaa” whereas the correct output should be “aaddaa”.

  • guest

    An O(n) idea: use a hashmap (to check whether a character is present, with a pointer to its linked-list node) together with a linked list that keeps the k characters. The head of the linked list is the character with the smallest last index among the k. When an existing character is seen again, its index can be updated in O(1) by moving its node to the tail and updating the hashmap (like the LRU cache concept). Each operation is O(1), so the total is O(n).
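
    A minimal sketch of this idea, assuming Java's LinkedHashMap in access order stands in for the hashmap plus linked list. The class and method names are illustrative. The map stores each character's last index, and its first entry is always the character that was touched least recently.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

final class LruWindow {
    // Returns the length of the longest substring with at most k distinct characters.
    static int longestKDistinct(String s, int k) {
        if (k <= 0) {
            return 0;
        }
        // accessOrder = true: iteration starts at the least recently touched key.
        LinkedHashMap<Character, Integer> lastIndex = new LinkedHashMap<>(16, 0.75f, true);
        int start = 0, max = 0;
        for (int i = 0; i < s.length(); i++) {
            // Inserts the character, or updates its index and moves it to the tail.
            lastIndex.put(s.charAt(i), i);
            if (lastIndex.size() > k) {
                // Evict the character whose last occurrence is furthest left.
                Iterator<Map.Entry<Character, Integer>> it = lastIndex.entrySet().iterator();
                Map.Entry<Character, Integer> eldest = it.next();
                start = eldest.getValue() + 1;
                it.remove();
            }
            max = Math.max(max, i - start + 1);
        }
        return max;
    }
}
```

    Because at most one character is evicted per step and the eviction is O(1), the left edge jumps directly past the evicted character's last occurrence instead of stepping one position at a time.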

  • ryanlr

    Right, the bad solution is deleted now.

  • Tiago Pinho

    The next one I think works:

    static public String nonRepeated(String input){

    char[] charAInput = input.toCharArray();

    String aux = "";

    String resultSubStr = "";

    char otherChar;

    if(input.length() <= 2)

    return input;

    for(int i =0; i < charAInput.length; i++ ){

    String removedOcurrences = aux.replace("" + charAInput[i], "");

    String removedOtherChar = "";

    if (!removedOcurrences.equals("")){

    otherChar = removedOcurrences.charAt(0);

    removedOtherChar = removedOcurrences.replace("" + otherChar, "");
    }

    if(removedOcurrences.equals("") || removedOtherChar.equals("")){

    aux = aux.concat("" + charAInput[i]);

    }
    else
    aux = "" + removedOtherChar + charAInput[i];

    if(resultSubStr.length() < aux.length())
    resultSubStr = aux;
    }

    return resultSubStr;
    }

  • no-nested-loops

    O(N) solution:

    public static String subString(String s) {
    int cstart = 0, lstart = 0, lastswap = 0, pos = 0, beststart = 0, bestlength = -1;
    Character c = null, l = null;
    for (Character curchar : s.toCharArray()) {
    if (curchar != c) {
    cstart += lstart - (lstart = cstart);
    Character swapchar = l;
    l = c;
    if (curchar != (c = swapchar)) {
    cstart = pos;
    c = curchar;
    lstart = lastswap;
    }
    lastswap = pos;
    }
    if (++pos - Math.min(lstart, cstart) > bestlength) {
    bestlength = pos - (beststart = Math.min(lstart, cstart));
    }
    // System.out.printf("p =%3d, s =%2s, c =%2s, l =%4s, cs =%3d, bs =%3d, ls =%3d, bs =%3d, bl =%3d%n", pos, curchar, c, l, cstart, lastswap, lstart, beststart, bestlength);
    }
    return s.substring(beststart, beststart + bestlength);
    }

  • Jeffery yuan

    This does not work.
    For example, for abccccccccaaddddeeee it returns abccccccccaa as the max; the correct answer should be ccccccccaadddd.

  • Yi Wang

    I think your scalable solution requires O(N*k) time, which may not be very efficient for a large k. By using a hash map for character counting, there is a solution with O(N) time.
    http://blog.csdn.net/whuwangyi/article/details/42451289

  • skra

    Here is my solution in C#:

    public sealed class Substring

    {

    private readonly String _input;

    private readonly Char[] _chars;

    private readonly Int32 _count;

    public Substring(String input, Int32 count)

    {

    if (null == input)

    {

    throw new ArgumentNullException("input");

    }

    if (count > input.Length)

    {

    throw new ArgumentOutOfRangeException("count");

    }

    if (count < 0)

    {

    throw new ArgumentOutOfRangeException("count");

    }

    _count = count;

    _input = input;

    _chars = _input.ToCharArray();

    }

    private Nullable<Int32> _startIndex;

    private Nullable<Int32> _length;

    private class CharsContainer

    {

    private readonly Dictionary<Char, Int32> _items = new Dictionary<Char, Int32>();

    internal void AddItem(Char ch)

    {

    if (_items.ContainsKey(ch))

    {

    _items[ch]++;

    }

    else

    {

    _items[ch] = 1;

    }

    }

    internal void RemoveItem(Char ch)

    {

    if (_items.ContainsKey(ch))

    {

    _items[ch]--;

    if (_items[ch] == 0)

    {

    _items.Remove(ch);

    }

    }

    }

    internal Int32 Length

    {

    get

    {

    return _items.Count;

    }

    }

    }

    private void TryStoreCurrentResult(Int32 ix, Int32 length)

    {

    if (_length.HasValue)

    {

    if (_length.Value < length)

    {

    _length = length;

    _startIndex = ix;

    }

    }

    else

    {

    _length = length;

    _startIndex = ix;

    }

    }

    public String Find()

    {

    Int32 endPos = -1;

    Int32 startPos = 0;

    CharsContainer charsContainer = new CharsContainer();

    while (endPos < _chars.Length)

    {

    if (charsContainer.Length <= _count)

    {

    endPos++;

    if (endPos >= _chars.Length)

    {

    break;

    }

    charsContainer.AddItem(_chars[endPos]);

    }

    else

    {

    charsContainer.RemoveItem(_chars[startPos]);

    startPos++;

    }

    Int32 currentLength = endPos - startPos + 1;

    if (charsContainer.Length <= _count && currentLength > 0)

    {

    TryStoreCurrentResult(startPos, currentLength);

    }

    }

    if (_startIndex.HasValue && _length.HasValue)

    {

    return _input.Substring(_startIndex.Value, _length.Value);

    }

    return String.Empty;

    }

    }

  • Krzysztof Rajda

    @jason it works for me. Make sure that you have following code at the end

    if (charArray.length - currentStart > maxEnd - maxStart) {
    maxStart = currentStart;
    maxEnd = charArray.length;
    }

    return text.substring(maxStart, maxEnd);

  • Krzysztof Rajda

    I am attaching a solution that works with any k. Basically the trick is to not clear the whole map, but just remove one element.

    private static String findLongestSubstring(String text) {
    int maxStart = 0;
    int maxEnd = 1;
    int currentStart = 0;
    Map lastOccurrenceMap = new HashMap();

    char[] charArray = text.toCharArray();
    for (int i = 0; i < charArray.length; i++) {
    char c = charArray[i];
    Integer lastOccurrenceOfC = lastOccurrenceMap.get(c);
    if (lastOccurrenceOfC == null) {
    // you can just change 2 to any number to solve k problem
    if (lastOccurrenceMap.size() maxEnd – maxStart) {
    maxStart = currentStart;
    maxEnd = i;
    }
    int lastOccurrenceOfCharToRemove = charArray.length;
    char charToRemove = ' ';

    for (Map.Entry characterIntegerEntry : lastOccurrenceMap.entrySet()) {
    if (characterIntegerEntry.getValue() maxEnd – maxStart) {
    maxStart = currentStart;
    maxEnd = charArray.length;
    }

    return text.substring(maxStart, maxEnd);
    }
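
    A minimal sketch of this last-occurrence idea, assuming the map stores each character's most recent index and k is passed as a parameter. The names below are illustrative and the code is not a restoration of the listing above. Finding the entry to evict scans the map, so the cost is O(n*k).

```java
import java.util.HashMap;
import java.util.Map;

final class LastOccurrenceWindow {
    // Returns the longest substring of text with at most k distinct characters.
    static String longest(String text, int k) {
        if (text == null || k <= 0) {
            return "";
        }
        Map<Character, Integer> lastOccurrence = new HashMap<>();
        int start = 0, bestStart = 0, bestEnd = 0; // best window is [bestStart, bestEnd)
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (!lastOccurrence.containsKey(c) && lastOccurrence.size() == k) {
                // Remove only one entry: the character whose last occurrence is furthest left.
                char toRemove = c;
                int minIndex = text.length();
                for (Map.Entry<Character, Integer> e : lastOccurrence.entrySet()) {
                    if (e.getValue() < minIndex) {
                        minIndex = e.getValue();
                        toRemove = e.getKey();
                    }
                }
                lastOccurrence.remove(toRemove);
                start = minIndex + 1; // the window now begins just after that occurrence
            }
            lastOccurrence.put(c, i);
            if (i + 1 - start > bestEnd - bestStart) {
                bestStart = start;
                bestEnd = i + 1;
            }
        }
        return text.substring(bestStart, bestEnd);
    }
}
```

    For example, longest("abcadcacacaca", 3) returns "cadcacacaca", because only b is evicted when d arrives.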

  • ashish yadav

    If you already have the substring, you can get the position of the repeated character directly with lastIndexOf, i.e. there is no need for an auxiliary function to help the main one. Then we can get the length of the string containing only a single character.

  • xiayu5945

    private static String resolution(String str){
        if(StringUtils.isEmpty(str))return str;
        int first=0;int second=0;char firstChar= str.charAt(0) ; char secondChar = str.charAt(0);
        int max=0;int endIndex=0;
        for(int i=0;i<str.length();i++){
            char temp = str.charAt(i);
            if(secondChar == temp){
                second++;
            }else if(firstChar == temp){
                first++;
            }else{
                firstChar = secondChar;
                secondChar = temp;
                if(first !=0 && max<first+second){
                    max=first+second;
                    endIndex = i;
                }
                first=second;
                second=1;
            }
        }
        if(first !=0 && max<first+second){
            max=first+second;
            endIndex = str.length();
        }
        return str.substring(endIndex-max,endIndex);
    }

  • Rang-ji Hu

    Here is a rough version written in C#:

    static int FindMaxConsecutiveSubstring(string target, int tolerant)

    {

    int max = 0;

    int threshold = 0;

    if (target.Length > 0)

    {

    List map = new List();

    char c = target[0];

    map.Add(c);

    threshold = 1;

    int i = 1;

    max = 1;

    int tempCount = 1;

    while (i < target.Length)

    {

    char temp = target[i];

    if (!map.Contains(temp))

    {

    if (threshold max)

    {

    max = tempCount;

    }

    tempCount = 0;

    i--;

    char keep = target[i];

    map = map.Where(a => a == keep).ToList();

    map.Add(temp);

    while (i >= 1)

    {

    if (target[i - 1] == keep)

    {

    i--;

    }

    else

    {

    break;

    }

    }

    }

    }

    else

    {

    tempCount++;

    i++;

    }

    }

    if (tempCount > max)

    {

    max = tempCount;

    }

    }

    return max;

    }

    the second parameter is the number of unique characters.

  • Pulkit

    Why not just find out the top two maximum occurring characters. They would be in the answer.
    This solution would scale for K unique characters.
    i.e. sort the characters acc to the frequency and pick top K characters.

    Now iterate over the array and check if we need to save that character or not.

  • jason

    The logic is not correct.
    For example, given the input “ab”, substring returns “a”.