Leetcode – Word Break (Java)

Given a string s and a dictionary of words dict, determine if s can be segmented into a space-separated sequence of one or more dictionary words.

For example, given
s = "leetcode",
dict = ["leet", "code"].

Return true because "leetcode" can be segmented as "leet code".

1. Naive Approach

This problem can be solved with a naive recursive approach, which is straightforward. It is a useful starting point for the discussion.

public class Solution {
    public boolean wordBreak(String s, Set<String> dict) {
             return wordBreakHelper(s, dict, 0);
    }
 
    public boolean wordBreakHelper(String s, Set<String> dict, int start){
        if(start == s.length()) 
            return true;
 
        for(String a: dict){
            int len = a.length();
            int end = start+len;
 
            //end index should be <= string length
            if(end > s.length()) 
                continue;
 
            if(s.substring(start, start+len).equals(a))
                if(wordBreakHelper(s, dict, start+len))
                    return true;
        }
 
        return false;
    }
}

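In the worst case the running time is exponential in the length of s, because the same start positions are explored again and again (for example, s = "aaaa...ab" with dict = ["a", "aa", "aaa"]). This solution exceeds the time limit.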
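One way to keep the recursive structure while avoiding repeated work is to cache the answer for each start index. The following is a minimal sketch of that idea, not part of the original solution; the class name WordBreakMemo and the helper canBreak are hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical class name, used only for this sketch.
class WordBreakMemo {
    static boolean wordBreak(String s, Set<String> dict) {
        return canBreak(s, dict, 0, new HashMap<>());
    }

    // memo maps a start index to whether s.substring(start) can be segmented.
    private static boolean canBreak(String s, Set<String> dict, int start,
                                    Map<Integer, Boolean> memo) {
        if (start == s.length()) {
            return true;
        }
        Boolean cached = memo.get(start);
        if (cached != null) {
            return cached;
        }
        boolean result = false;
        for (String word : dict) {
            // Skip empty words (they would recurse forever) and compare in place,
            // without allocating a substring.
            if (!word.isEmpty() && s.startsWith(word, start)
                    && canBreak(s, dict, start + word.length(), memo)) {
                result = true;
                break;
            }
        }
        memo.put(start, result);
        return result;
    }
}
```

Each start index is now resolved at most once, so the number of recursive evaluations is bounded by the string length times the dictionary size.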

2. Dynamic Programming

The key to solving this problem with dynamic programming:

  • Define an array t[] such that t[i] == true means the prefix s[0..i-1] can be segmented using the dictionary
  • Initial state: t[0] == true (the empty prefix is trivially segmentable)

public class Solution {
    public boolean wordBreak(String s, Set<String> dict) {
        boolean[] t = new boolean[s.length()+1];
        t[0] = true; //set first to be true, why?
        //Because we need initial state
 
        for(int i=0; i<s.length(); i++){
            //should continue from match position
            if(!t[i]) 
                continue;
 
            for(String a: dict){
                int len = a.length();
                int end = i + len;
                if(end > s.length())
                    continue;
 
                if(t[end]) continue;
 
                if(s.substring(i, end).equals(a)){
                    t[end] = true;
                }
            }
        }
 
        return t[s.length()];
    }
}

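Time: O(string length * dict size) word comparisons. Each comparison (substring plus equals) can cost up to the length of the word, so the total is closer to O(string length * dict size * length of the longest word).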
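To make the meaning of t[] concrete, here is a small sketch that runs the same recurrence and returns the whole table instead of only the last entry. The class name WordBreakTable and the method table are hypothetical; the sketch uses startsWith in place of substring plus equals, which is a substitution made here and not the code above.

```java
import java.util.Set;

// Hypothetical helper for inspecting the DP table; not part of the original solution.
class WordBreakTable {
    // Returns t, where t[i] is true when the prefix s[0..i-1] can be segmented.
    static boolean[] table(String s, Set<String> dict) {
        boolean[] t = new boolean[s.length() + 1];
        t[0] = true; // the empty prefix needs no words
        for (int i = 0; i < s.length(); i++) {
            if (!t[i]) {
                continue; // only extend from positions that are already reachable
            }
            for (String word : dict) {
                int end = i + word.length();
                if (end <= s.length() && s.startsWith(word, i)) {
                    t[end] = true;
                }
            }
        }
        return t;
    }
}
```

For s = "leetcode" and dict = ["leet", "code"], the entries that end up true are t[0], t[4] and t[8].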

3. Java Solution 3 - Simple and Efficient

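In Solution 2, if the dictionary is very large, iterating over every word at each position is slow. Instead, we can test every substring s[i..j) against the set, which takes O(n^2) set lookups (n is the length of the string); note that each lookup also has to build and hash a substring of up to n characters.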

public boolean wordBreak(String s, Set<String> wordDict) {
    int[] pos = new int[s.length()+1];
 
    Arrays.fill(pos, -1);
 
    pos[0]=0;
 
    for(int i=0; i<s.length(); i++){
        if(pos[i]!=-1){
            for(int j=i+1; j<=s.length(); j++){
                String sub = s.substring(i, j);
                if(wordDict.contains(sub)){
                    pos[j]=i;
                }
            } 
        }
    }
 
    return pos[s.length()]!=-1;
}
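A possible refinement, sketched here under the assumption that dictionary words are much shorter than s: no substring longer than the longest dictionary word can match, so the inner loop can stop early. The class name WordBreakBounded is hypothetical, and a boolean array is used in place of pos[] because only reachability is needed.

```java
import java.util.Set;

// Hypothetical variant of Solution 3; names are illustrative only.
class WordBreakBounded {
    static boolean wordBreak(String s, Set<String> wordDict) {
        // Longest dictionary word; longer substrings can never be in the set.
        int maxLen = 0;
        for (String w : wordDict) {
            maxLen = Math.max(maxLen, w.length());
        }

        boolean[] reachable = new boolean[s.length() + 1];
        reachable[0] = true;

        for (int i = 0; i < s.length(); i++) {
            if (!reachable[i]) {
                continue;
            }
            int limit = Math.min(s.length(), i + maxLen);
            for (int j = i + 1; j <= limit; j++) {
                if (!reachable[j] && wordDict.contains(s.substring(i, j))) {
                    reachable[j] = true;
                }
            }
        }
        return reachable[s.length()];
    }
}
```

With this bound the number of set lookups drops from O(n^2) to O(n * L), where L is the length of the longest dictionary word.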

4. The More Interesting Problem

The dynamic programming solution tells us whether the string can be broken into words, but not which words it is broken into. So how do we get those words?

Check out Word Break II.
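For a single segmentation (as opposed to all of them), the pos[] array from Solution 3 already holds enough information: pos[j] is the start index of a word that ends at j. The sketch below walks those links backwards from the end of the string. The class name WordBreakPath and the method oneSegmentation are hypothetical, and the choice to return an empty list when no segmentation exists is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical helper built on the pos[] idea from Solution 3.
class WordBreakPath {
    // Returns one segmentation of s, or an empty list if none exists.
    // (An empty s also yields an empty list.)
    static List<String> oneSegmentation(String s, Set<String> wordDict) {
        int[] pos = new int[s.length() + 1];
        Arrays.fill(pos, -1);
        pos[0] = 0;

        for (int i = 0; i < s.length(); i++) {
            if (pos[i] == -1) {
                continue;
            }
            for (int j = i + 1; j <= s.length(); j++) {
                if (wordDict.contains(s.substring(i, j))) {
                    pos[j] = i; // a word ending at j starts at i
                }
            }
        }

        if (pos[s.length()] == -1) {
            return Collections.emptyList();
        }

        // Follow the stored start indices from the end back to 0.
        List<String> words = new ArrayList<>();
        for (int end = s.length(); end > 0; end = pos[end]) {
            words.add(s.substring(pos[end], end));
        }
        Collections.reverse(words);
        return words;
    }
}
```

Because pos[j] is only ever set from an index i that is itself reachable and smaller than j, the backward walk always terminates at 0.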

  • KARTHIKEYAN DEVENDRAN


    public boolean wordBreak(String s, Set<String> dict) {
        HashMap<String, Boolean> map = new HashMap<String, Boolean>();
        return wordBreak(s, dict, map);
    }

    public boolean wordBreak(String s, Set<String> dict, HashMap<String, Boolean> map) {
        // reuse the cached answer for this suffix, if there is one
        if (map.containsKey(s)) {
            return map.get(s);
        }

        for (int i = 0; i <= s.length(); i++) {
            String prefix = s.substring(0, i);
            if (dict.contains(prefix)) {
                if (i == s.length()) {
                    return true;
                }
                if (wordBreak(s.substring(i), dict, map)) {
                    map.put(s.substring(i), true);
                    return true;
                }
            }
        }
        map.put(s, false);
        return false;
    }

  • Jangku

    This implementation looks neat. But the complexity is exponential so I would choose polynomial implementation for my case.

  • Rahul Shukla

    http://www.ideserve.co.in/learn/word-break-problem
    Here is a detailed explanation of the algorithm.

  • Stephen Boesch

    Yes, I had commented that a Trie would be a better solution, before seeing that you had already posted it.

  • Stephen Boesch

    A Trie is a better solution than DP for this problem.
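A minimal sketch of how a trie could be combined with the reachability array, assuming the goal is only the yes/no answer. The class name TrieWordBreak and the nested Node type are hypothetical and are not taken from any comment in this thread. Unlike a greedy walk that restarts at the first complete word, this version tries every reachable start position.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical trie-based variant; names are illustrative only.
class TrieWordBreak {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean isWord;
    }

    static boolean wordBreak(String s, Set<String> dict) {
        // Build the trie from the dictionary.
        Node root = new Node();
        for (String word : dict) {
            Node cur = root;
            for (char c : word.toCharArray()) {
                cur = cur.children.computeIfAbsent(c, k -> new Node());
            }
            cur.isWord = true;
        }

        boolean[] reachable = new boolean[s.length() + 1];
        reachable[0] = true;

        for (int i = 0; i < s.length(); i++) {
            if (!reachable[i]) {
                continue;
            }
            // Walk the trie along s starting at i; every word node marks a new end.
            Node cur = root;
            for (int j = i; j < s.length(); j++) {
                cur = cur.children.get(s.charAt(j));
                if (cur == null) {
                    break; // no dictionary word continues with this character
                }
                if (cur.isWord) {
                    reachable[j + 1] = true;
                }
            }
        }
        return reachable[s.length()];
    }
}
```

The trie walk stops as soon as no dictionary word shares the current prefix, and it never allocates substrings.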

  • Abhay

    The dynamic solution fails for the case
    s = "aaaab"
    dict = ["a", "aa", "ac"]

    It returns true but should return false.

  • Puneet

    how about this?

    import java.util.*;

    public class WordBreak {
        static Set dictionary = new HashSet();

        public static void main (String[] args) {
            initializeDictionary();
            System.out.println(wordBreak("leetcodesamsung"));
        }

        private static Boolean wordBreak(String s) {
            Boolean[] memo = new Boolean[s.length()];
            Arrays.fill(memo, Boolean.FALSE);
            int startIndex = 0;
            for (int i = 0; i < s.length(); i++) {
                int endIndex = i+1;
                if (!memo[i] && dictionary.contains(s.substring(startIndex,endIndex)))
                {
                    memo[i] = true;
                    startIndex = endIndex;
                }
            }

            return memo[s.length() - 1];
        }

        private static void initializeDictionary() {
            dictionary.add("le");
            dictionary.add("et");
            dictionary.add("code");
            dictionary.add("samsun");
            dictionary.add("g");
        }
    }

  • Pan

    Yes, the wordBreak(s.substring(i), dict) call should be put in the if condition. However, my solution cannot pass the latched online judge.

  • burdz

    I think instead of returning wordBreak(s.substring(i), dict) you need to have that in the if statement with dict.contains(sstr). Otherwise your function returns too early in some cases.

    if (dict.contains(sstr) && wordBreak(s.substring(i), dict)) return true;

  • ryanlr

    Your solution does not pass leetcode online judge.

    Input: "goalspecial", ["go","goal","goals","special"]
    Output: false
    Expected: true

  • Truong Khanh Nguyen

    Thanks for your nice and complete post. Verifying the validity of a string is easy. It is more complex to split a valid string into words. My discussion and Java program can be found here: http://www.capacode.com/?p=335

  • Andrey Sh

    Very short Python solution, also using trie:

    class WordSplitTreeNode:
        def __init__(self):
            self.children = [None for i in range(ord("z") - ord("a") + 1)]
            self.final = False

    class WordSplitTree:
        def __init__(self):
            self.root = WordSplitTreeNode()

        def addWord(self, word):
            node = self.root
            for c in word:
                i = ord(c) - ord("a")
                if node.children[i] is None:
                    node.children[i] = WordSplitTreeNode()
                node = node.children[i]
            node.final = True

        def iterate(self, word, pos, node):
            # stop at the end of the word; indexing word[pos] there would fail
            if pos == len(word):
                if node.final:
                    yield ""
                return
            nextNodes = []
            i = ord(word[pos]) - ord("a")
            if node.final:
                nextNodes.append(self.root.children[i])
            nextNodes.append(node.children[i])
            for nextNode in nextNodes:
                if not nextNode is None:
                    for s in self.iterate(word, pos + 1, nextNode):
                        yield (" " if node.final and nextNode is self.root.children[i] else "") + word[pos] + s

    def checkWordCanBeSplit(word, dictWords):
        tree = WordSplitTree()
        for w in dictWords:
            tree.addWord(w)
        return [splitWord for splitWord in tree.iterate(word, 0, tree.root)]

  • hdante

    The problem is supposed to be equivalent to matching the regexp (leet|code)*, which means that it can be solved by building a DFA in O(2^m) and executing it in O(n)
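The regular-expression view can be tried directly in Java, with one important caveat: java.util.regex is a backtracking engine, not a DFA, so this sketch does not give the O(n) matching time mentioned above and can be very slow on adversarial inputs. The class name RegexWordBreak is hypothetical.

```java
import java.util.Set;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical illustration of the (word1|word2|...)* formulation.
class RegexWordBreak {
    static boolean wordBreak(String s, Set<String> dict) {
        // Quote each word so that regex metacharacters are matched literally,
        // and drop empty words, which would add nothing to the language.
        String alternation = dict.stream()
                .filter(w -> !w.isEmpty())
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        if (alternation.isEmpty()) {
            return s.isEmpty(); // with no words, only the empty string matches
        }
        return Pattern.matches("(?:" + alternation + ")*", s);
    }
}
```

For the running example this builds a pattern equivalent to (leet|code)* and checks that it matches the entire string.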

  • Dhaval Dave

    DP and Recursive Solution with working code at http://www.gohired.in/2014/12/word-break-problem.html

  • Aaron Zhang

    not true if dict = {"a", "ab"}

  • JY

    Can anyone clarify the complexity of naïve?

  • apricot

    Because you skipped the last IF condition if t[end] is already true, now all possibilities are not given.

    For INPUT: "leetcode", ["leetcode","leet","code"].

    matches are : leetcode leet

    code is not given as t[end] is made true by match leetcode.

  • nikhil rao

    /* package whatever; // don't place package name! */

    import java.util.*;
    import java.lang.*;
    import java.io.*;

    class Wordbreak
    {
        public static boolean wordBreak(String s, String[] dict){
            int counter = 0;
            System.out.println("Given string to break :"+ s);
            for(String ds : dict){
                //System.out.println("dict string :"+ ds);
                if(s.contains(ds)){
                    counter = counter +1;
                    //System.out.println("Index of :"+ ds +" " + s.indexOf(ds));
                    int strtindex = s.indexOf(ds);
                    int len = ds.length();
                    String sb = s.substring( strtindex, strtindex+len);
                    //System.out.println("Am der "+ sb);
                    //System.out.println("dict string :"+ ds);
                }
            }
            if (counter>0){
                return true;
            }
            return false;
        }

        public static void main (String[] args) throws java.lang.Exception
        {
            String s = "programcreek";
            String[] dict = new String[]{"programcree","program","creek"};
            boolean b = wordBreak(s, dict);
            if(b){
                System.out.println("Can be done");
            }
            else{
                System.out.println("Not possible!!");
            }
        }
    }

  • Guest

    {
        if (s == null || s.length() == 0 || dict.isEmpty()) {
            return false;
        }
        for (String w : dict) {
            if (s.equals(w)) {
                return true;
            } else if (s.startsWith(w)) {
                String newS = new String(s);
                do {
                    newS = newS.replace(w, "");
                } while (newS.startsWith(w));
                if (newS.equals("")) {
                    return true;
                }
                boolean result = this.wordBreak(newS, dict);
                if (result) {
                    return true;
                }
            }
        }
        return false;
    }

  • german

    Another solution, O(n^3), where n is the length of s. I assume that the set is a HashSet.

    public boolean wordBreak(String s, Set dict) {
        if(s == null || s.length() == 0){
            return true;
        }
        boolean[] arr = new boolean[s.length()];
        if(dict.contains(s.charAt(0)+"")){
            arr[0] = true;
        }
        for(int i = 1;i<arr.length;i++){
            if (dict.contains(s.substring(0, i + 1))) {
                arr[i] = true;
            }
            for (int j = 0; j < i; j++) {
                int a = i - j + 1;
                String subWord = s.substring(j + 1, j + a);
                if (dict.contains(subWord) && arr[j]) {
                    arr[i] = true;
                    j = i;
                    break;
                }
            }
        }
        return arr[arr.length-1];
    }

  • german

    One question is: can we use the same dictionary word more than once? For example, if we have the word "leetcodeleetcode" and the dictionary has the words {"leet", "code"}, will the result be true?

  • AlgorithmFreak

    Not true after Java 7. Since Java 7, substring() is an O(n) operation!

  • ravio

    Apparently, this algorithm is not correct.

    If you use this dictionary

    dict.add("leet");
    dict.add("code");
    dict.add("lee");
    dict.add("programcree");
    dict.add("program");
    dict.add("creek");

    then it will not recognize "leetcode".

  • Vivek Venkatesh

    I guess this can be solved by using Tries also.

    Time Complexity : O(n) + O(m)
    Space Complexity : O(m)

    Let me know if the following code will work for all cases.

    import java.util.*;
    import java.lang.*;
    import java.io.*;

    class TrieNode {
    char val;
    boolean isRoot;
    boolean isLeaf;
    TrieNode children[]; // There can be atmost 26 children (english alphabets)
    TrieNode() {
    val = '^';
    isRoot = true;
    children = new TrieNode[26];
    initializeChildren();
    }
    TrieNode(char val) {
    this.val = val;
    isRoot = false;
    isLeaf = false;
    children = new TrieNode[26];
    initializeChildren();
    }

    void initializeChildren() {
    for(int i=0;i<children.length;i++) {
    children[i] = null;
    }
    }
    }

    class Trie {
    TrieNode root;
    Trie() {
    root = new TrieNode();
    }

    void insert(String input) {
    int start = 'A';
    TrieNode current = root;

    for(int i=0;i<input.length();i++) {
    int val = Character.toUpperCase(input.charAt(i));
    int index = val - start;

    if(current.children[index] == null)
    current.children[index] = new TrieNode(input.charAt(i));

    current = current.children[index];
    }
    current.isLeaf = true;
    }
    }

    class Example {

    boolean wordBreak(String s, Set<String> dict) {
    // First Construct Trie from the dictionary

    Trie a = new Trie();
    for(String i : dict) {
    a.insert(i);
    }

    TrieNode current = a.root;
    int start = 'A';

    boolean result = false;

    for(int i=0; i<s.length();i++) {
    int val = Character.toUpperCase(s.charAt(i));
    int index = val - start;

    if(current.children[index] == null) {
    // Word is not in the dictionary
    current = a.root;
    result = false;
    break;
    }
    current = current.children[index];
    if(current.isLeaf == true) {
    // Start from the beginning for the next character
    current = a.root;
    result = true;
    }
    }

    return result;

    }

    public static void main(String[] args) {
    Example temp = new Example() ;
    Set<String> dict = new HashSet<String>();
    dict.add("leet");
    dict.add("code");
    dict.add("programcree");
    dict.add("program");
    dict.add("creek");
    System.out.println("Wordbreak (programcreek) = " + temp.wordBreak("programcreek", dict));
    System.out.println("Wordbreak (leetcode) = " + temp.wordBreak("leetcode", dict));
    }

    }

  • Vivek Venkatesh

    I can think of trie based solution:

    Time Complexity = O(n) + O(m)

    Space Complexity = O(m)

    Let me know if this will work:

    import java.util.*;
    import java.lang.*;
    import java.io.*;

    class TrieNode {
    char val;
    boolean isRoot;
    boolean isLeaf;
    TrieNode children[]; // There can be atmost 26 children (english alphabets)
    TrieNode() {
    val = '^';
    isRoot = true;
    children = new TrieNode[26];
    initializeChildren();
    }

    TrieNode(char val) {
    this.val = val;
    isRoot = false;
    isLeaf = false;
    children = new TrieNode[26];
    initializeChildren();
    }
    void initializeChildren() {
    for(int i=0;i<children.length;i++) {
    children[i] = null;
    }
    }
    }

    class Trie {
    TrieNode root;
    Trie() {
    root = new TrieNode();
    }

    void insert(String input) {
    int start = 'A';
    TrieNode current = root;

    for(int i=0;i<input.length();i++) {
    int val = Character.toUpperCase(input.charAt(i));
    int index = val - start;

    if(current.children[index] == null)
    current.children[index] = new TrieNode(input.charAt(i));
    current = current.children[index];
    }
    current.isLeaf = true;
    }

    }

    public class Example {
    boolean wordBreak(String s, Set dict) {

    // First Construct Trie from the dictionary

    Trie a = new Trie();
    for(String i : dict) {
    a.insert(i);

    }

    TrieNode current = a.root;
    int start = 'A';
    boolean result = false;

    for(int i=0; i<s.length();i++) {

    int val = Character.toUpperCase(s.charAt(i));
    int index = val - start;

    if(current.children[index] == null) {
    // Word is not in the dictionary
    current = a.root;
    result = false;
    break;
    }

    current = current.children[index];
    if(current.isLeaf == true) {
    // Start from the beginning for the next character
    current = a.root;
    result = true;
    }
    }
    return result;

    }

    public static void main(String[] args) {
    Example temp = new Example() ;
    Set dict = new HashSet();
    dict.add("leet");
    dict.add("code");
    dict.add("programcree");
    dict.add("program");
    dict.add("creek");

    System.out.println("Wordbreak (programcreek) = " + temp.wordBreak("programcreek", dict));
    System.out.println("Wordbreak (leetcode) = " + temp.wordBreak("leetcode", dict));

    System.out.println("Wordbreak (lesscode) = " + temp.wordBreak("lesscode", dict));

    }

    }

  • Andrei

    Small correction of complexity in the 2nd case.

    It’s mentioned that “Time: O(string length * dict size)” but you also run equals (and substring is not constant for Java > 1.6) for every word in dictionary so it’s more like O(string length * dict size * length of the longest word in dict).

  • codingfacts

    complexity of naïve is O(n^2) not O(2^n)

  • SK

    Using a HashMap can reduce repeated calculation.

  • SK

    Recursion with hashMap:

    class Solution:
        def __init__(self):
            self.table = {}

        def wordBreak(self, s, dict):
            if len(s) == 0:
                return True
            if len(s) == 1:
                return s in dict
            if s in self.table:
                return self.table[s]
            isBreakable = False
            for i in range(len(s)):
                word = s[:i+1]
                if word in dict:
                    subFlag = self.wordBreak(s[i+1:], dict)
                    if s[i+1:] not in self.table:
                        self.table[s[i+1:]] = subFlag
                    isBreakable |= subFlag
            return isBreakable

  • Pan

    recursive solution

    public static boolean wordBreak(String s, Set dict){
        //input validation
        //Base case
        if(dict.contains(s))
            return true;
        else {
            for(int i = 0; i < s.length(); i++){
                String sstr = s.substring(0, i);
                if(dict.contains(sstr))
                    return wordBreak(s.substring(i), dict);
            }
        }
        return false;
    }

  • shreyas KN

    We can do it in O(n), right? Assuming the Set dic is actually a HashSet, retrieval on a HashSet is always O(1).

    List arr;
    StringBuilder sb = new StringBuilder();
    int i=0;
    int wordIndex=0;
    while(i>s.length){

    if(dic.get(sb.substring(wordIndex,i) != null){
    wordIndex=i+1;
    }else{
    arr.add(sb.substring(wordIndex, i));
    }
    i++;
    }

  • jason

    Another approach
    package test;

    import java.util.HashSet;
    import java.util.Set;

    public class WordBreak2 {

    public static boolean wordBreak(String s,Set dict) {
    if (s.length()==0) {
    return true;
    }
    for(int i=1; i<=s.length(); i++) {
    String firstWord=s.substring(0, i);
    String remaing=s.substring(i);
    if (dict.contains(firstWord) && wordBreak(remaing, dict) ) {
    System.out.print(" ");
    System.out.print(firstWord);

    return true;
    }
    }
    return false;
    }
    public static void main(String[] args) {
    Set dict=new HashSet(5);
    dict.add("program");
    if (wordBreak("pprogram", dict)) {
    System.out.println(" YES");
    } else {
    System.out.println(" NO");
    }

    dict=new HashSet(5);
    dict.add("ab");
    dict.add("abc");
    dict.add("de");
    if (wordBreak("abcde", dict)) {
    System.out.println(" YES");
    } else {
    System.out.println(" NO");
    }

    }

    }

    This is more efficient if dict is big, which is usually the case.

  • ryanlr

    Seems good to me, I will try later. Thanks!

  • jk451

    Thanks for these solutions. Just starting to go through the problems but looks like very useful website.

    As for how to get the words that the string breaks up to:
    Change the “t” array to integer instead of boolean.

    Replace setting t[end] to true (i.e. saying you have found a break-up of the 0..end substring) with setting t[end] to i, thus saying you have found a break-up of the 0..end substring and that the last word in that break-up is the substring i..end of the main string.

    Then at the end, if the string can be broken up, check t[s.length()]. The last word in the break-up will be the substring starting at t[s.length()] and ending at s.length()-1. Repeat this procedure to get the other words.

  • ryanlr

    I don’t get what you mean, can you explain in more detail? Thanks.

  • Dynkin

    You should skip the string comparison in the last IF condition if t[end] is already true.

  • Kizzle

    Do you know if a better one exists? Can you provide a better solution? Appreciate it!

  • Callus

    I don’t think looping through the dic is a good idea.