Strings — Python (Part 1)
Table of contents
- Characteristics of Strings
- Creating Strings
- Creating Raw Strings
- Use Cases for Raw Strings
- Example
- Traversing
- Using a Loop
- Using List Comprehension
- Using the enumerate Function
- Using map Function
- Basic Indexing
- Negative Indexing
- Accessing Characters in a Loop
- Slicing
- Basic Slicing Examples
- Negative Indices
- Combining start, stop, and step
- Practical Use Cases
- Type conversion
- Explicit Type Conversion (Type Casting)
- Implicit Type Conversion (Type Coercion)
- Converting Between Strings and Lists
- Concatenation (+)
- Repetition (*)
- Membership (in and not in)
- Comparison (==, !=, <, <=, >, >=)
- String Formatting Operators
In Python, a string is a sequence of characters enclosed within single quotes ('), double quotes ("), or triple quotes (''' or """). Strings are used to represent text and are a fundamental data type in Python.
Characteristics of Strings
Immutable: Once a string is created, its content cannot be changed. Any operation that modifies a string will create a new string.
Indexed: Characters in a string can be accessed using indices, starting from 0 for the first character.
Iterable: You can iterate over each character in a string using a loop.
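A minimal sketch of the immutability point above (the variable names are illustrative, not from the original): assigning to an index raises a TypeError, so a "modified" string has to be built as a new object.

```python
greeting = "hello"

try:
    greeting[0] = "H"  # strings do not support item assignment
except TypeError as exc:
    error_message = str(exc)

# Build a new string instead; the original is left untouched.
capitalized = "H" + greeting[1:]
print(capitalized)  # Hello
print(greeting)     # hello
```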
Creating Strings
Strings can be created by enclosing characters within quotes:
single_quote_string = 'Hello, World!'
double_quote_string = "Hello, World!"
triple_quote_string = """This is a
multiline string."""
In Python, raw strings are a type of string where backslashes (\) are treated as literal characters and not as escape characters. This is particularly useful when dealing with strings that contain many backslashes, such as regular expressions or file paths on Windows. You create a raw string by prefixing the string literal with the letter r or R.
Creating Raw Strings
Here are some examples of creating raw strings:
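# Normal string: backslashes start escape sequences. Written with single
# backslashes, this path is a SyntaxError in Python 3 (\U begins a Unicode escape),
# so each backslash must be doubled.
normal_string = "C:\\Users\\Alice\\Documents\\new_file.txt"
print(normal_string) # Output: C:\Users\Alice\Documents\new_file.txt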
# Raw string where backslashes are treated literally
raw_string = r"C:\Users\Alice\Documents\new_file.txt"
print(raw_string) # Output: C:\Users\Alice\Documents\new_file.txt
Use Cases for Raw Strings
File Paths on Windows: Windows file paths often contain backslashes, which can be cumbersome to escape. Using raw strings makes the paths easier to work with.
file_path = r"C:\Users\Alice\Documents\new_file.txt"
print(file_path) # Output: C:\Users\Alice\Documents\new_file.txt
Regular Expressions: Regular expressions frequently use backslashes to denote special characters, and raw strings prevent the need for double escaping.
import re
# Without raw string, need to escape backslashes
pattern = "\\bword\\b"
regex = re.compile(pattern)
# With raw string, no need to escape backslashes
raw_pattern = r"\bword\b"
raw_regex = re.compile(raw_pattern)
Example
str1 = "abcdef"
str2 = r'abcdef'
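print(str1, str2) # identical output: there are no backslashes to interpret
str1 = "abc\ndef"
str2 = r'abc\ndef'
print(str1, str2) # different output: \n is a newline in str1 but two literal characters in str2
str1 = "C:\newfolder\temp\abc"
str2 = r'C:\newfolder\temp\abc'
str3 = "C:\\newfolder\\temp\\abc"
print(str1) # garbled: \n, \t, and \a are read as newline, tab, and bell
print(str2) # C:\newfolder\temp\abc
print(str3) # C:\newfolder\temp\abc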
str4 = "'this is a quote'"
print(str4)
str4 = '"this is a quote"'
print(str4)
str4 = "\"this is a quote\""
print(str4)
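One way to see what a raw literal actually stores is to compare lengths and repr() output. This is an illustrative sketch with made-up variable names:

```python
normal = "a\nb"  # \n is a single newline character
raw = r"a\nb"    # the backslash and the n are two separate characters

print(len(normal))     # 3
print(len(raw))        # 4
print(repr(raw))       # 'a\\nb'
print(raw == "a\\nb")  # True: a raw literal equals the doubled-backslash form
```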
Traversing
Traversing a string in Python means iterating over each character in the string. This can be done using various methods such as loops and comprehensions. Here are some common ways to traverse a string:
Using a Loop
name = "abcde"
for i in name:
    print(i)
for i in range(len(name)): # indices 0 through len(name) - 1
    print(name[i])
i = 0
while i < len(name):
    print(name[i])
    i = i + 1
Using List Comprehension
List comprehensions are a compact way to iterate over each character in a string and can be used to create a list of characters.
Example:
my_string = "Hello, World!"
char_list = [char for char in my_string]
print(char_list)
Output:
['H', 'e', 'l', 'l', 'o', ',', ' ', 'W', 'o', 'r', 'l', 'd', '!']
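A comprehension can also filter or transform characters while traversing. The sketch below is illustrative and uses only the built-in str methods isalpha() and upper():

```python
my_string = "Hello, World!"

# Keep only alphabetic characters.
letters = [char for char in my_string if char.isalpha()]

# Transform each character, then join the list back into a string.
upper_chars = [char.upper() for char in my_string]
shouted = "".join(upper_chars)

print(letters)  # ['H', 'e', 'l', 'l', 'o', 'W', 'o', 'r', 'l', 'd']
print(shouted)  # HELLO, WORLD!
```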
Using the enumerate Function
The enumerate function allows you to iterate over the string and also get the index of each character.
Example:
my_string = "Hello, World!"
for index, char in enumerate(my_string):
    print(f"Index: {index}, Character: {char}")
Output:
Index: 0, Character: H
Index: 1, Character: e
Index: 2, Character: l
Index: 3, Character: l
Index: 4, Character: o
Index: 5, Character: ,
Index: 6, Character:
Index: 7, Character: W
Index: 8, Character: o
Index: 9, Character: r
Index: 10, Character: l
Index: 11, Character: d
Index: 12, Character: !
Using map Function
You can use the map function to apply a function to each character in the string.
Example:
my_string = "Hello, World!"
def print_char(char):
    print(char)
list(map(print_char, my_string)) # list() forces the lazy map object to run
Output:
H
e
l
l
o
,
W
o
r
l
d
!
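map is more commonly used to transform each character than to print it. A small sketch, assuming the built-ins ord and str.upper as the mapped functions:

```python
my_string = "abc"

# Map each character to its Unicode code point.
code_points = list(map(ord, my_string))

# Map each character to upper case and join the results.
shouted = "".join(map(str.upper, my_string))

print(code_points)  # [97, 98, 99]
print(shouted)      # ABC
```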
Indexing in a string refers to accessing individual characters in the string using their position, or index. In Python, string indices are zero-based, which means the first character has an index of 0, the second character has an index of 1, and so on. Negative indexing is also supported, allowing you to access characters starting from the end of the string, with -1 being the last character, -2 being the second to last, and so on.
Basic Indexing
Example:
my_string = "Hello, World!"
print(my_string[0]) # 'H'
print(my_string[4]) # 'o'
print(my_string[7]) # 'W'
Negative Indexing
Example:
my_string = "Hello, World!"
print(my_string[-1]) # '!'
print(my_string[-2]) # 'd'
print(my_string[-6]) # 'W'
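A negative index -k refers to the same character as len(s) - k, and an index outside the valid range raises an IndexError. A minimal sketch with illustrative names:

```python
my_string = "Hello, World!"

# -1 and len(my_string) - 1 point at the same character.
same_char = my_string[-1] == my_string[len(my_string) - 1]

out_of_range = False
try:
    my_string[len(my_string)]  # one past the last valid index
except IndexError:
    out_of_range = True

print(same_char)     # True
print(out_of_range)  # True
```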
Accessing Characters in a Loop
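A minimal sketch, assuming this heading refers to combining index access with a loop; the names are illustrative:

```python
my_string = "Hello"

# Walk forward with non-negative indices 0 .. len(my_string) - 1.
forward = []
for i in range(len(my_string)):
    forward.append(my_string[i])

# Walk backward with negative indices -1, -2, ..., -len(my_string).
backward = []
for i in range(1, len(my_string) + 1):
    backward.append(my_string[-i])

print(forward)   # ['H', 'e', 'l', 'l', 'o']
print(backward)  # ['o', 'l', 'l', 'e', 'H']
```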
Slicing
Slicing in Python allows you to access a subset of characters from a string. Slicing uses colons (:) to separate the start, stop, and step parameters. The general syntax for slicing is:
string[start:stop:step]
- start (optional): The starting index of the slice. Defaults to 0 if not provided.
- stop (optional): The ending index of the slice (not included in the result). Defaults to the length of the string if not provided.
- step (optional): The step size (how many characters to move forward). Defaults to 1 if not provided.
These defaults apply when step is positive; with a negative step, the slice runs from the end of the string toward the beginning by default.
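Unlike indexing, slicing tolerates out-of-range bounds by clipping them. The sketch below illustrates this along with the built-in slice() object; the variable names are illustrative:

```python
my_string = "Hello"

clipped = my_string[2:100]  # a stop past the end is clipped to len(my_string)
empty = my_string[10:]      # a start past the end gives an empty string
same = my_string[1:4] == my_string[slice(1, 4)]  # slice() expresses the same slice

try:
    my_string[::0]  # a step of zero is rejected
    zero_step_allowed = True
except ValueError:
    zero_step_allowed = False

print(clipped)            # llo
print(repr(empty))        # ''
print(same)               # True
print(zero_step_allowed)  # False
```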
Basic Slicing Examples
Example 1: Slicing with start and stop
my_string = "Hello, World!"
slice1 = my_string[0:5] # 'Hello'
slice2 = my_string[7:12] # 'World'
print(slice1)
print(slice2)
Example 2: Omitting start or stop
my_string = "Hello, World!"
slice1 = my_string[:5] # 'Hello'
slice2 = my_string[7:] # 'World!'
slice3 = my_string[:] # 'Hello, World!' (the whole string)
print(slice1)
print(slice2)
print(slice3)
Example 3: Using step
my_string = "Hello, World!"
slice1 = my_string[::2] # 'Hlo ol!'
slice2 = my_string[1::2] # 'el,Wrd'
print(slice1)
print(slice2)
Negative Indices
Negative indices count from the end of the string. This can be useful for slicing from the end or reversing the string.
Example 4: Slicing with negative indices
my_string = "Hello, World!"
slice1 = my_string[-6:-1] # 'World'
slice2 = my_string[-6:] # 'World!'
print(slice1)
print(slice2)
Example 5: Reversing a string
my_string = "Hello, World!"
reversed_string = my_string[::-1] # '!dlroW ,olleH'
print(reversed_string)
Combining start, stop, and step
You can combine all three parameters to create complex slices.
Example 6: Complex slicing
my_string = "Hello, World!"
slice1 = my_string[1:10:2] # 'el,Wr'
slice2 = my_string[::3] # 'Hl r!'
print(slice1)
print(slice2)
Practical Use Cases
Example 7: Extracting a substring
Extracting a substring from a specific range.
url = "https://www.example.com"
protocol = url[:5] # 'https'
domain = url[8:] # 'www.example.com'
print(protocol)
print(domain)
Example 8: Removing a prefix or suffix
filename = "example.txt"
name_without_extension = filename[:-4] # 'example'
print(name_without_extension)
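Slicing with a fixed count removes the last characters whether or not they are the suffix you expect. A slightly safer sketch checks first with str.endswith; strip_suffix is a hypothetical helper, not something from the original:

```python
def strip_suffix(text, suffix):
    """Hypothetical helper: drop suffix only if text really ends with it."""
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text

print(strip_suffix("example.txt", ".txt"))  # example
print(strip_suffix("notes.md", ".txt"))     # notes.md (unchanged)
print("notes.md"[:-4])                      # note (blind slicing cuts too much)
```

On Python 3.9 and later, the built-in str.removesuffix method covers the same case.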
Example 9: Reversing words in a sentence
sentence = "This is an example sentence"
words = sentence.split() # Split into words: ['This', 'is', 'an', 'example', 'sentence']
reversed_words = words[::-1] # Reverse the list of words
reversed_sentence = " ".join(reversed_words) # Join them back into a string
print(reversed_sentence) # Output: 'sentence example an is This'
Type conversion
Type conversion in Python refers to the process of converting one data type to another. This can be done explicitly by using built-in functions or implicitly by Python during certain operations.
Explicit Type Conversion (Type Casting)
Converting to Integer:
- int(): Converts a number or string to an integer. If the string is not a valid integer, it raises a ValueError.
num_str = "123"
num_int = int(num_str) # 123
float_num = 12.34
int_num = int(float_num) # 12
Converting to Float:
- float(): Converts a number or string to a float. If the string is not a valid float, it raises a ValueError.
num_str = "123.45"
num_float = float(num_str) # 123.45
int_num = 123
float_num = float(int_num) # 123.0
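Because int() and float() raise ValueError on invalid input, conversions of untrusted text are usually wrapped in try/except. A minimal sketch; to_int is a hypothetical helper name:

```python
def to_int(text, default=None):
    """Hypothetical helper: return int(text), or default if text is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return default

print(to_int("123"))        # 123
print(to_int(" 42 "))       # 42 (surrounding whitespace is allowed)
print(to_int("12.34"))      # None (a float-formatted string is not a valid integer)
print(int(float("12.34")))  # 12 (convert to float first, then truncate)
```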
Converting to String:
- str(): Converts any data type to a string.
num_int = 123
num_str = str(num_int) # "123"
float_num = 12.34
float_str = str(float_num) # "12.34"
Converting to List:
- list(): Converts a string, tuple, or set to a list.
my_str = "hello"
my_list = list(my_str) # ['h', 'e', 'l', 'l', 'o']
my_tuple = (1, 2, 3)
tuple_to_list = list(my_tuple) # [1, 2, 3]
Converting to Tuple:
- tuple(): Converts a string, list, or set to a tuple.
my_str = "hello"
my_tuple = tuple(my_str) # ('h', 'e', 'l', 'l', 'o')
my_list = [1, 2, 3]
list_to_tuple = tuple(my_list) # (1, 2, 3)
Converting to Set:
- set(): Converts a string, list, or tuple to a set (which removes duplicates).
my_str = "hello"
my_set = set(my_str) # {'h', 'e', 'l', 'o'} (element order may vary)
my_list = [1, 2, 2, 3]
list_to_set = set(my_list) # {1, 2, 3}
Converting to Dictionary:
- dict(): Converts a list of tuples or another dictionary to a dictionary.
my_list = [('a', 1), ('b', 2)]
list_to_dict = dict(my_list) # {'a': 1, 'b': 2}
my_dict = {"a": 1, "b": 2}
new_dict = dict(my_dict) # {'a': 1, 'b': 2}
Implicit Type Conversion (Type Coercion)
Python automatically converts one data type to another during certain operations. This is known as implicit type conversion or type coercion.
Example:
num_int = 123
num_float = 1.23
# Adding an integer and a float results in a float
result = num_int + num_float # 124.23
print(type(result)) # <class 'float'>
# Mixing integer and boolean types
bool_val = True
result = num_int + bool_val # 124 (True is converted to 1)
print(type(result)) # <class 'int'>
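Implicit conversion applies between numeric types only. Python does not coerce between strings and numbers, so mixing them with + raises a TypeError and the conversion has to be explicit. An illustrative sketch:

```python
count = 3

try:
    message = "Items: " + count  # str + int is not converted implicitly
except TypeError:
    message = "Items: " + str(count)  # convert explicitly instead

print(message)  # Items: 3
```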
Converting Between Strings and Lists
- String to List:
my_string = "hello"
my_list = list(my_string) # ['h', 'e', 'l', 'l', 'o']
- List to String:
my_list = ['h', 'e', 'l', 'l', 'o']
my_string = ''.join(my_list) # 'hello'
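str.join requires every element to be a string, and str.split is the usual way to turn delimited text into a list. A short sketch with illustrative data:

```python
numbers = [1, 2, 3]

try:
    "".join(numbers)  # join rejects non-string elements
    join_failed = False
except TypeError:
    join_failed = True

# Convert each element to str before joining.
joined = ", ".join(str(n) for n in numbers)

# Split on a delimiter to go from a string to a list of substrings.
parts = "a,b,c".split(",")

print(join_failed)  # True
print(joined)       # 1, 2, 3
print(parts)        # ['a', 'b', 'c']
```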
Operators
Strings support several operators that allow you to manipulate and interact with string data. Here are the key operators and their usage with strings:
Concatenation (+)
The + operator is used to concatenate two or more strings.
Example:
str1 = "Hello"
str2 = "World"
result = str1 + " " + str2
print(result) # Output: Hello World
Repetition (*)
The * operator is used to repeat a string a specified number of times.
Example:
str1 = "Hello"
result = str1 * 3
print(result) # Output: HelloHelloHello
Membership (in and not in)
The in and not in operators are used to check if a substring exists within another string.
Example:
str1 = "Hello, World!"
print("Hello" in str1) # Output: True
print("hello" in str1) # Output: False
print("Hello" not in str1) # Output: False
Comparison (==, !=, <, <=, >, >=)
Strings can be compared using the comparison operators. Comparisons are case-sensitive and proceed character by character, based on the Unicode code points of the characters.
Example:
str1 = "apple"
str2 = "banana"
print(str1 == str2) # Output: False
print(str1 != str2) # Output: True
print(str1 < str2) # Output: True (because 'a' < 'b')
print(str1 > str2) # Output: False
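Because comparison uses code points, every uppercase ASCII letter sorts before every lowercase one. The sketch below illustrates this, plus one common way to compare case-insensitively using str.casefold:

```python
print(ord("Z"), ord("a"))  # 90 97

upper_first = "Zebra" < "apple"  # True: 'Z' (90) comes before 'a' (97)
prefix_first = "apple" < "apples"  # True: a prefix sorts before the longer string
caseless = "Zebra".casefold() < "apple".casefold()  # False: 'z' comes after 'a'

print(upper_first)   # True
print(prefix_first)  # True
print(caseless)      # False
```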
String Formatting Operators
% Operator
The % operator is used for old-style string formatting.
Example:
name = "Alice"
age = 30
formatted_string = "Name: %s, Age: %d" % (name, age)
print(formatted_string) # Output: Name: Alice, Age: 30
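For comparison, the same output can be produced with str.format and with an f-string, which are methods and literals rather than operators. A brief sketch reusing the values from the example above:

```python
name = "Alice"
age = 30

percent_style = "Name: %s, Age: %d" % (name, age)
format_style = "Name: {}, Age: {}".format(name, age)
fstring_style = f"Name: {name}, Age: {age}"

print(fstring_style)  # Name: Alice, Age: 30
print(percent_style == format_style == fstring_style)  # True
```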