Strings


Replace substring

# replace all characters "a" with "x"

"a1 a2 a3 a4".replace("a","x")

x1 x2 x3 x4

replace file ending ".txt" with ".csv"

filename = "mydata.txt"

newFilename = filename.replace(".txt",".csv")

mydata.csv

remove file ending ".txt"

filename = "mydata.txt"

rawname = filename.replace(".txt","")

mydata
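note: replace() substitutes every occurrence, not only the file ending. A minimal sketch of an alternative, assuming the standard library function os.path.splitext is acceptable here; the file name is made up for illustration.

```python
import os.path

filename = "my.txt.notes.txt"  # hypothetical file name

# replace() removes every ".txt", not only the ending
replaced = filename.replace(".txt", "")

# splitext() splits off only the last extension
rawname, ending = os.path.splitext(filename)

print(replaced)  # my.notes
print(rawname)   # my.txt.notes
print(ending)    # .txt
```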

Split and merge text strings

split

>>> 'a1,a2,a3'.split(',')

['a1', 'a2', 'a3']

get filename (get first list element)

>>> 'filename.txt.bz2'.split('.')[0]

'filename'
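The example above splits at every dot. As a small sketch, the optional maxsplit argument of split() and rsplit() can limit this to the first or the last dot (variable names are chosen for illustration only):

```python
name = 'filename.txt.bz2'

first, rest = name.split('.', 1)   # split at the first dot only
stem, last = name.rsplit('.', 1)   # split at the last dot only

print(first, rest)  # filename txt.bz2
print(stem, last)   # filename.txt bz2
```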

split() without an argument splits on any run of whitespace characters (space, tab \t, newline \n, carriage return \r, ...)
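A short sketch of the difference between split() without an argument and split(' '); the sample text is an assumption made for this example.

```python
text = "a1  a2\ta3\n"  # two spaces, a tab and a trailing newline

# no argument: split on any run of whitespace, no empty strings
default_split = text.split()

# explicit ' ': split at every single space only
space_split = text.split(' ')

print(default_split)  # ['a1', 'a2', 'a3']
print(space_split)    # ['a1', '', 'a2\ta3\n']
```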

join - to concatenate a list to a string

>>> ','.join(['a1', 'a2', 'a3']) # comma separated

'a1,a2,a3'

>>> ' '.join(['a1', 'a2', 'a3']) # space separated

'a1 a2 a3'

concatenate strings

>>> 'hello' + ' ' + 'world'

'hello world'

for many strings, join() is the better choice (it avoids building intermediate strings)

>>> ''.join(['hello','world'])

'helloworld'

>>> ' '.join(['hello','world'])

'hello world'
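join() only accepts strings. A minimal sketch for joining other items such as numbers, converting each one with str() first (the list is example data):

```python
values = [4, 1, 3, 2]  # example data

# join() raises TypeError for non-strings, so convert each item first
line = ','.join(str(v) for v in values)

print(line)  # 4,1,3,2
```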

# add prefix, only if not already present

s = 'hello world'

prefix = 'hello'

if not s.startswith(prefix):

    s = prefix + ' ' + s

'hello world'
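The same check can be wrapped in a small helper. This is only a sketch; add_prefix and its sep parameter are hypothetical names, not part of the standard library.

```python
def add_prefix(s, prefix, sep=' '):
    """Return s with prefix in front, unless s already starts with it."""
    if s.startswith(prefix):
        return s
    return prefix + sep + s

print(add_prefix('world', 'hello'))        # hello world
print(add_prefix('hello world', 'hello'))  # hello world
```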

length of string

>>> len('hello')

5

sort list (case-insensitive)

>>> sorted(['C', 'b', 'd','A'], key=str.lower)

['A', 'b', 'C', 'd']
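For comparison, a short sketch of what happens without key=str.lower: strings are compared by code point, so all uppercase letters sort before lowercase ones.

```python
names = ['C', 'b', 'd', 'A']

# default: compares code points, so uppercase letters come first
plain = sorted(names)

# key=str.lower: compare the lowercase form, keep the original items
ignore_case = sorted(names, key=str.lower)

print(plain)        # ['A', 'C', 'b', 'd']
print(ignore_case)  # ['A', 'b', 'C', 'd']
```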

sort list in descending order

>>> sorted([4, 1, 3, 2], reverse=True)

[4, 3, 2, 1]

convert strings to numbers (and back as string)

>>> int('23')

23

>>> float('23.1')

23.1

>>> str(23)

'23'

see also: format()

format('hello','>20')

'               hello'
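A few more format specifications as a sketch; the width of 10 and the sample values are arbitrary choices for this example.

```python
word = 'hello'

right = format(word, '>10')      # right-aligned in 10 characters
left = format(word, '<10')       # left-aligned in 10 characters
center = format(word, '^10')     # centered in 10 characters
number = format(3.14159, '.2f')  # float with two decimals

print(repr(right))   # '     hello'
print(repr(left))    # 'hello     '
print(repr(center))  # '  hello   '
print(number)        # 3.14
```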

convert all strings in a list into int numbers

s = ['4', '1', '3', '2']

n = [int(x) for x in s]

[4, 1, 3, 2]
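int() raises ValueError for text that is not a valid integer. A minimal sketch that keeps going and stores None for such entries; using None as the placeholder is an assumption of this example.

```python
raw = ['4', '1', 'n/a', '2']  # example input with one invalid entry

numbers = []
for item in raw:
    try:
        numbers.append(int(item))
    except ValueError:
        # not a valid integer: store a placeholder instead
        numbers.append(None)

print(numbers)  # [4, 1, None, 2]
```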

see also: List Comprehension

check / find substring

check for substring

>>> 'day' in 'Friday'

True

get index location of substring

'Friday'.index('day')

3

get index locations of all occurrences of a character

s = 'hello'

[idx for idx, letter in enumerate(s) if letter == 'l']

[2, 3]
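index() raises ValueError if the substring is missing, while find() returns -1 instead. A short sketch of both behaviours:

```python
word = 'Friday'

found = word.find('day')     # position of the first match
missing = word.find('week')  # -1, substring not present

print(found)    # 3
print(missing)  # -1

try:
    word.index('week')
except ValueError:
    print('index() raised ValueError')
```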

Lists

find a substring in list items

>>> dates = ['May 2015', 'January 2012', 'December 2015', 'June 2014']

>>> dates2015 = [s for s in dates if '2015' in s]

>>> dates2015

['May 2015', 'December 2015']

check if any substring is present in list items

dates = ['May 2015', 'January 2012', 'December 2015', 'June 2014']

if any('2015' in s for s in dates):

    print('yes, 2015 is present')
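Related built-ins, as a sketch: all() checks every item, and next() with a default returns the first matching item. The variable names are chosen for illustration only.

```python
dates = ['May 2015', 'January 2012', 'December 2015', 'June 2014']

# any(): at least one item matches; all(): every item matches
has_2015 = any('2015' in s for s in dates)
all_2015 = all('2015' in s for s in dates)

# first matching item, or None if nothing matches
first_2015 = next((s for s in dates if '2015' in s), None)

print(has_2015, all_2015, first_2015)  # True False May 2015
```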

read more:

→ Regular expression operations

→ Sorting (wiki.python.org)