Is there a way to parse %-style strings like string.Formatter.parse?


November 2018




I need to get a list of all placeholders in a %-style format string, together with their conversion types.

For example, "There're %(num_items)d items in the %(container)s" should yield (('num_items', 'd'), ('container', 's')).

What I tried:

1) I tried looking into the source code and found that the

PyObject *
PyString_Format(PyObject *format, PyObject *args)

function does the % interpolation at the C level.

2) I also searched PyPI and found the parse library, which does the same thing as string.Formatter.parse: it parses {}-style strings, which is not what I need.

Warning: a quick regexp is unlikely to cover the full syntax of % substitution, and full coverage is what I need.

Similar question: How can I find all placeholders for str.format in a python string using a regex?


It seems to be solvable pretty well with a reasonably complex regexp, so it will make a nice homework task.

I'll accept this as an answer in two days and I don't anticipate any new answers to the question.


Is the question so localized that it will never be useful to anyone else (except maybe those taking the same class)? If so, vote to close.

(from Please clarify the policy on homework questions)

2 answers


I ended up with this regular expression:

re.findall(r'%\(([^)]+)\)[0-9]*(?:\.[0-9]*)?([diouxXeEfFgGcrs%])', a)

as a reasonable approximation for the task (it matches 5 of the 7 tokens).
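For what it's worth, here is a minimal sketch of a fuller pattern that also accepts flags, * widths, and a length modifier, following the layout given in the Python docs for printf-style formatting. The names PLACEHOLDER and find_placeholders are made up for this example, and it assumes mapping keys contain no nested parentheses (the real % operator does allow them), so treat it as an approximation rather than a complete parser.

```python
import re

# Hypothetical pattern, not taken from any library. It follows the order
# documented for printf-style formatting:
#   % (key) flags width .precision length type
PLACEHOLDER = re.compile(r"""
    %
    (?:\((?P<key>[^)]*)\))?        # optional mapping key (no nested parens)
    (?P<flags>[-#0 +]*)            # conversion flags
    (?P<width>\*|\d+)?             # minimum field width
    (?:\.(?P<precision>\*|\d*))?   # precision
    [hlL]?                         # length modifier, ignored by Python
    (?P<type>[diouxXeEfFgGcrsa%])  # conversion type
""", re.VERBOSE)


def find_placeholders(fmt):
    """Return a list of (key, type) pairs for the named placeholders in fmt."""
    pairs = []
    for match in PLACEHOLDER.finditer(fmt):
        # "%%" and positional conversions such as "%d" have no key; skip them.
        if match.group("key") is not None:
            pairs.append((match.group("key"), match.group("type")))
    return pairs


print(find_placeholders("There're %(num_items)d items in the %(container)s"))
```

Because the scan is left to right and "%%" is consumed as a match of its own, an escaped string such as "%%(fake)s" should not be reported as a placeholder.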

import re

s = "There're %(num_items)d items in the %(container)s"
# Returns only the placeholder names, not their conversion types
print(re.findall(r'%\((.*?)\)', s))
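Note that the pattern above would also match inside an escaped "%%(name)s". If the names alone are enough, a different approach (not a regex, and not something either answer proposes) is to let the % operator do the parsing and record which keys it asks for. This is only a sketch: it assumes Python 3 and a format string that uses named placeholders exclusively, and KeyRecorder and placeholder_names are hypothetical names invented for the example.

```python
class KeyRecorder(dict):
    """Mapping that remembers every key the % operator looks up."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def __missing__(self, key):
        # Called by dict lookup for keys that are not stored; nothing is
        # ever stored here, so every lookup is recorded.
        self.seen.append(key)
        return 0  # an int is accepted by d, s, r, f, x, c and the other types


def placeholder_names(fmt):
    """Return the mapping keys used in fmt, in order of appearance."""
    recorder = KeyRecorder()
    fmt % recorder  # the formatted result is discarded
    return recorder.seen


print(placeholder_names("There're %(num_items)d items in the %(container)s"))
```

The trade-off is that the conversion types are lost, and a positional placeholder such as a bare "%d" raises TypeError instead of being skipped.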