Skip to content

Strings

Definition

A string is an array of bytes.

Strings are a special kind of array, made out of bytes encoded in various formats as characters. Advanced string processing algorithms typically use hash tables. Since strings are fundamentally arrays, problems with strings typically have simple brute force solutions. The best solutions use the string itself to lower the space complexity to $O(1)$. In Python, strings are immutable; but it can be used at mutable by representing it as an list of characters.

Working with strings
  • Strings are immutable (in Python) so s += '123' or s = s[1:] will allocate space for a new string and assign it to s ($O(n^2)$ time complexity)
  • Array-style slicing works on strings.
  • Concatenate an array of strings with ''.join(A)
  • Use s.startswith(prefix) or s.endswith(suffix) to obtain booleans
  • Use s.strip(), s.removesuffix(suffix) and s.removeprefix(pre) to remove suffixes and prefixes
  • Use s.split('delimiter') to split a string at the delimiter, e.g., , or .
  • Use f'name={name}' to format strings
  • Use a look-up string, e.g. lus = '0123456789ABCDEF', and lus.index('F') and lus[15] to convert between int and str types
  • To test an empty string, use either if not s or s == ''

Encodings

ASCII

Uses 1 byte and 7 free bits to encode 128 characters that include the English language, punctuation, and non-printable characters for formatting.

UTF-8

Uses 1--4 bytes to encode millions of characters. The leading byte reserves a number of bits to specify how many bytes are used for the encoding. The following bytes each lose a small number of bits to identify their relation to the leading byte. Free bits for character representations vary between 7 (ASCII equivalent) and 21. See UTF-8 Encoding.

C++

String slicing

c++
std::string original = "qwerty_info:IMPORTANT";
auto i = original.find("info:");
std::string target = original.substr(i, original.length()-i);
assert(target == "info:IMPORTANT");
std::string original = "qwerty_info:IMPORTANT";
auto i = original.find("info:");
std::string target = original.substr(i, original.length()-i);
assert(target == "info:IMPORTANT");

Problems

6.1 Interconvert string and int

The authors did some really fancy Python calls, but to be honest, I would never come up with these in an actual interview setting. I did solve the problem, but it took me quite a long time to solve it.

  • Learned to use a Lookup string to convert between string and integer types easily in Python

6.2 Base conversion

6.4 Replace and remove

I performed well on this question, which I correctly assumed to be a 2-pass solution. I solved all three solutions provided in the book. Brute force, insert in array, big space, where we allocate to an array, and then after looking at the solutions the optimal solution. This entailed computing the number of a's on the first pass when removing b's. Then back-filling the array, keeping index trackers of where to write and current locations

6.5 Test Palindrome

My solution with a left and right index tracker is correct. The only improvement to be made is that I filter the string, creating an array, which I then loop over. This adds space complexity O(n). The space complexity can be decreased by ignoring punctuation marks on the single pass. This solution can be achieved by checking on each iteration if the current value is non-alphanumeric, in this case increase the index in a while loop until it isn't alphanumeric.

6.6 Reverse words in a sentence

''.join(list_of_words) has time complexity O(n) where n is the number of characters in the output. Got this, but with more space complexity than is ideal I suppose. Still space complexity was lower than O(n) for n characters in the original string.

LC 409 Longest Palindrome

Solution: Use a hash table (or alternatively an array) to encode the frequency of each letter. A palindrome can be created with any characters that occur an even number of times, and any characters that occur an odd number of time, minus one. So, after creating the hash table, loop over it an add to a running sum given these rules. Keep track if there are any odd characters in the input string. If so add 1 to the return value.

LC 13 Roman to Integer

Solution: Iterate over the string of roman numerals from right to left, if the current value is less than the one previously encountered, use subtraction instead of addition for your running sum. This will effectively turn, e.g., IX into 9 by first adding 10 then subtracting 1.

  • Time complexity, O(n) for single pass through each character in string
  • Space complexity, O(1) because the hash table is of fixed size and other variables are manipulated in place.

LC 3 Longest Substring Without Repeating Characters

This tests you ability to use a hashmap and manage a complex two-pointer algorithm (turtle and hare)

python
class Solution:
    def lengthOfLongestSubstring(self, s: str) -> int:
        if len(s) == 0 or len(s) == 1:
            return len(s)

        table = set()
        max_length = 0
        left, right = 0, 0

        while right < len(s):
            char = s[right]
            if char in table:
                while char in table:
                    table.remove(s[left])
                    left += 1
            max_length = max(max_length, right - left)
            right += 1
            table.add(char)

        return max_length + 1
class Solution:
    def lengthOfLongestSubstring(self, s: str) -> int:
        if len(s) == 0 or len(s) == 1:
            return len(s)

        table = set()
        max_length = 0
        left, right = 0, 0

        while right < len(s):
            char = s[right]
            if char in table:
                while char in table:
                    table.remove(s[left])
                    left += 1
            max_length = max(max_length, right - left)
            right += 1
            table.add(char)

        return max_length + 1

LC 71 Simplify Path

This tests your ability to recognize when a stack should be used, how to work with stacks, and manage simple flow control based on strings.

python
class Solution:
    def simplifyPath(self, path: str) -> str:
        if path == '':
            return path
        dirs = path.split('/')
        stack = list()
        for d in dirs:
            if d == '..':
                if stack:
                    stack.pop()
                else:
                    continue
            elif d == '.' or d == '':
                continue
            else:
                stack.append(d)
        return '/' + '/'.join(stack)
class Solution:
    def simplifyPath(self, path: str) -> str:
        if path == '':
            return path
        dirs = path.split('/')
        stack = list()
        for d in dirs:
            if d == '..':
                if stack:
                    stack.pop()
                else:
                    continue
            elif d == '.' or d == '':
                continue
            else:
                stack.append(d)
        return '/' + '/'.join(stack)
rust
impl Solution {
    pub fn simplify_path(path: String) -> String {
        let mut stack = Vec::<&str>::new();

        for dir in path.split('/') {
            match dir {
                ".." => if stack.is_empty() { continue } else { stack.pop(); },
                "." | "" => continue,
                _ => stack.push(&dir),
            }
        }

        "/".to_string() + &stack.join("/")
    }
}
impl Solution {
    pub fn simplify_path(path: String) -> String {
        let mut stack = Vec::<&str>::new();

        for dir in path.split('/') {
            match dir {
                ".." => if stack.is_empty() { continue } else { stack.pop(); },
                "." | "" => continue,
                _ => stack.push(&dir),
            }
        }

        "/".to_string() + &stack.join("/")
    }
}