python - Joining strings. Generator or list comprehension? -


Consider the problem of extracting the alphabetic characters from a huge string.

One way is:

    ''.join([c for c in hugestring if c.isalpha()])

The mechanism is clear: the list comprehension generates a list of characters, and the join method knows how many characters it needs to join by accessing the length of the list.

The other way is:

    ''.join(c for c in hugestring if c.isalpha())

Here the generator expression results in a generator. The join method does not know how many characters it is going to join, because a generator has no length (it does not support len()). So this way of joining should be slower than the list comprehension method.

But testing in Python shows that it is not slower. Why is this so? Can anyone explain how join works on a generator?
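For anyone who wants to reproduce the comparison, here is a minimal timing sketch. The contents of hugestring and the repeat count are made-up assumptions, and the numbers will vary with the machine and the Python version, so treat the output as indicative only.

```python
import timeit

# Hypothetical test data: a mix of letters and non-letters.
hugestring = "abc123 def456! " * 10_000

def with_list():
    # Build the full list first, then hand it to join.
    return ''.join([c for c in hugestring if c.isalpha()])

def with_generator():
    # Hand join a generator expression instead of a list.
    return ''.join(c for c in hugestring if c.isalpha())

# Both variants must produce the same string.
assert with_list() == with_generator()

list_time = timeit.timeit(with_list, number=5)
gen_time = timeit.timeit(with_generator, number=5)
print(f"list comprehension:   {list_time:.3f}s")
print(f"generator expression: {gen_time:.3f}s")
```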

To be clear:

    sum(j for j in range(100))

doesn't need to have any knowledge of 100, because it can keep track of the cumulative sum. It can access the next element using the next method on the generator and then add it to the cumulative sum. However, since strings are immutable, joining strings cumulatively would create a new string in each iteration, so this would take a lot of time.
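The contrast described above can be sketched as follows. The helper names running_sum and naive_concat are hypothetical, written only for illustration; they are not how sum or join are implemented.

```python
def running_sum(iterable):
    # Pull items one at a time with next() and keep only the running total.
    it = iter(iterable)
    total = 0
    while True:
        try:
            total += next(it)
        except StopIteration:
            return total

def naive_concat(iterable):
    # Cumulative joining: because str is immutable, each step conceptually
    # produces a brand-new string that copies everything accumulated so far.
    result = ''
    for piece in iterable:
        result = result + piece
    return result

print(running_sum(j for j in range(100)))                  # 4950
print(naive_concat(c for c in "a1b2c3" if c.isalpha()))    # abc
```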

When you call str.join(gen) where gen is a generator, Python does the equivalent of list(gen) before going on to examine the length of the resulting sequence.

Specifically, if you look at the code implementing str.join in CPython, you'll see this call:

    fseq = PySequence_Fast(seq, "can only join an iterable");

The call to PySequence_Fast converts the seq argument into a list if it wasn't a list or tuple already.

So, the two versions of your call are handled almost identically. In the list comprehension version, you're building the list yourself and passing it into join. In the generator expression version, the generator object you pass in gets turned into a list right at the start of join, and the rest of the code operates the same way for both versions.
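As a rough pure-Python illustration of that flow (not the actual CPython code, which is written in C), the steps might look like this. The function join_sketch and the character buffer it uses are assumptions made for the example.

```python
def join_sketch(sep, iterable):
    # Step 1: materialize the input, roughly what PySequence_Fast does.
    # Lists and tuples are used as they are; anything else becomes a list.
    seq = iterable if isinstance(iterable, (list, tuple)) else list(iterable)
    if not seq:
        return ''

    # Step 2: first pass over the now-sized sequence to check the item
    # types and compute the length of the final string.
    total = len(sep) * (len(seq) - 1)
    for i, item in enumerate(seq):
        if not isinstance(item, str):
            raise TypeError(
                f"sequence item {i}: expected str instance, "
                f"{type(item).__name__} found"
            )
        total += len(item)

    # Step 3: second pass copies every piece into one preallocated buffer.
    # A list of characters stands in for the single string allocation.
    buf = [''] * total
    pos = 0
    for i, item in enumerate(seq):
        if i:
            buf[pos:pos + len(sep)] = sep
            pos += len(sep)
        buf[pos:pos + len(item)] = item
        pos += len(item)

    # Python strings are immutable, so the buffer is turned back into a str.
    return ''.join(buf)

hugestring = "a1b2c3"  # small stand-in for the real input
print(join_sketch('', (c for c in hugestring if c.isalpha())))  # abc
print(join_sketch('', [c for c in hugestring if c.isalpha()]))  # abc
```

Either way, the sizing and copying passes only ever see a list or tuple; the generator has already been consumed by the time they run.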

