Implement a tokeniser in Python
I am trying to implement a tokeniser in Python (without using the NLTK library) that splits a string into words using blank spaces. Example usage:
>>> tokens = tokenise1("a (small, simple) example")
>>> tokens
['a', '(small,', 'simple)', 'example']
I can think of a way using regular expressions, but the return value includes the white spaces, which I don't want. How do I correct the return value to match the example usage?
What I have so far is:
import re

def tokenise1(string):
    return re.split(r'(\s+)', string)
and it returns:
['', 'a', ' ', '(small,', ' ', 'simple)', ' ', 'example', '']
so I need to get rid of the white space in the return value.
The output contains the spaces because you capture them with the () group. Instead, you can split without the capture group:
>>> re.split(r'\s+', string)
['a', '(small,', 'simple)', 'example']
\s+ matches one or more whitespace characters (spaces, tabs, newlines).
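Putting it together, here is a minimal sketch of the corrected tokenise1. The filtering of empty strings is an extra safeguard I am adding: re.split produces '' entries when the input has leading or trailing whitespace (as in your captured output above), which your example input happens not to have.

import re

def tokenise1(string):
    # Split on runs of whitespace; without a capture group,
    # the separators themselves are not returned.
    # Drop empty strings that re.split produces when the input
    # starts or ends with whitespace.
    return [token for token in re.split(r'\s+', string) if token]

>>> tokenise1("a (small, simple) example")
['a', '(small,', 'simple)', 'example']
>>> tokenise1("  leading and trailing  ")
['leading', 'and', 'trailing']

As a design note, plain string.split() with no arguments also splits on runs of whitespace and already discards empty strings, so it would work here too if you don't need a regex.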