I am having trouble coming up with a Python regular expression to extract specific values.

The page I am trying to parse contains a number of productIds, which appear in the following format:

\"productId\":\"111111\"

I need to extract all the values, 111111 in this case.

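Solution

import re

t = "\"productId\":\"111111\""

# re.match only matches at the start of the string
m = re.match(r"\W*productId[^:]*:\D*(\d+)", t)

if m:
    # group(1) is the captured run of digits
    print(m.group(1))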

This means: match any leading non-word characters (\W*), then productId followed by any non-colon characters ([^:]*) and a :. Then skip non-digits (\D*) and capture the digits that follow ((\d+)).

Output

111111
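
Since the question asks for all the values and re.match only tests the beginning of the string, a variant using re.findall may fit a whole page better. The following is a minimal sketch, not a definitive solution: the page string is a made-up fragment that only mimics the format in the question, and the leading \W* is dropped because findall scans the entire string anyway.

```python
import re

# Hypothetical page fragment (an assumption for illustration); the
# backslash-escaped quotes mirror the format shown in the question.
page = r'{\"productId\":\"111111\",\"name\":\"a\"},{\"productId\":\"222222\"}'

# Same pattern as above without the leading \W*.
pattern = re.compile(r"productId[^:]*:\D*(\d+)")

# With exactly one capture group, findall returns a list of the captured strings.
product_ids = pattern.findall(page)
print(product_ids)  # ['111111', '222222']
```

Note that \D* is permissive: if a productId had an empty or non-numeric value, it would skip ahead to the next digits in the text, so a stricter pattern may be needed for messy input.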