How to use tqdm with pandas in a jupyter notebook?


December 2018


I'm doing some analysis with pandas in a Jupyter notebook, and since my apply function takes a long time, I would like to see a progress bar. Through another post I found the tqdm library, which provides a simple progress bar for pandas operations. There is also a Jupyter integration that renders a really nice progress bar that updates in place as the work advances.

However, I would like to combine the two and don't quite get how to do that. Let's take the same example as in the documentation:

import pandas as pd
import numpy as np
from tqdm import tqdm

df = pd.DataFrame(np.random.randint(0, 100, (100000, 6)))

# Register `pandas.progress_apply` and `pandas.Series.map_apply` with `tqdm`
# (can use `tqdm_gui`, `tqdm_notebook`, optional kwargs, etc.)
tqdm.pandas(desc="my bar!")

# Now you can use `progress_apply` instead of `apply`
# and `progress_map` instead of `map`
df.progress_apply(lambda x: x**2)
# can also groupby:
# df.groupby(0).progress_apply(lambda x: x**2)

The comment even says "can use `tqdm_notebook`", but I can't figure out how. I've tried a few things like

tqdm_notebook(tqdm.pandas(desc="my bar!"))

or

tqdm_notebook.pandas

but they don't work. From the definition, it looks to me like

tqdm.pandas(tqdm_notebook(desc="my bar!"))

should work, but the bar doesn't properly show the progress and there is still additional output.

Any other ideas?

1 answer


You can use:

tqdm_notebook().pandas(*args, **kwargs)

This is because tqdm_notebook has a delayer adapter, so it needs to be instantiated before you access any of its methods (including class methods).

In the future (> v5.1), you should be able to use a more uniform API:

tqdm_pandas(tqdm_notebook, *args, **kwargs)
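
For anyone trying this on a more recent tqdm release, here is a minimal sketch of the question's example using the `tqdm.auto` import, which picks the notebook widget bar inside Jupyter and falls back to the plain text bar elsewhere. It assumes a tqdm version that ships `tqdm.auto` (the versions discussed above may not), and the smaller DataFrame and the `squared` variable are just for illustration:

```python
import numpy as np
import pandas as pd

# Assumption: a tqdm release that provides `tqdm.auto`.
# Inside Jupyter this resolves to the widget-based bar, elsewhere to the text bar.
from tqdm.auto import tqdm

# Smaller frame than in the question so the sketch runs quickly.
df = pd.DataFrame(np.random.randint(0, 100, (1000, 6)))

# Register `progress_apply` / `progress_map` on pandas objects.
tqdm.pandas(desc="my bar!")

# Same call as in the question; the bar advances once per column here.
squared = df.progress_apply(lambda col: col**2)
```

In a notebook, the widget bar may additionally require ipywidgets to be installed; without it, the import or the bar itself can fail.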