

I have the following strings:

1 "R J BRUCE & OTHERS V B J & W L A EDWARDS And Ors CA CA19/02 27 February 2003",

2 "H v DIRECTOR OF PROCEEDINGS [2014] NZHC 1031 [16 May 2014]",

3 '''GREGORY LANCASTER AND JOHN HENRY HUNTER V CULLEN INVESTMENTS LIMITED AND

ERIC JOHN WATSON CA CA51/03 26 May 2003'''

I am trying to find a regular expression which matches all of them. I don't know how to match the optional square brackets around the date at the end of the string, e.g. [16 May 2014].

casename = re.compile(r'(^[A-Z][A-Za-z\'\(\) ]+\b[v|V]\b[A-Za-z\'\(\) ]+(.*?)[ \[ ]\d+ \w+ \d\d\d\d[\] ])', re.S)

The date part at the end of the regex only matches dates in square brackets, not the ones without.

Thanks to everybody who answered. @Matt Clarkson, what I am trying to match is a judicial decision 'handle' in a much larger text. There is a large variation within those handles, but they all start at the beginning of a line, have 'v' for versus between the party names, and end with a date. The names of the parties are mostly, but not exclusively, in capitals. I am trying to have only one match per document and no false positives.

Solution

I got all of them to match using this (you'll need to add the case-insensitive flag, and keep re.S as in your original if a handle wraps onto a second line, as the third string does):

(^[a-z][a-z\'&\(\) ]+\bv\b[a-z&\'\(\) ]+(?:.*?) \[?\d+ \w+ \d{4}\]?)

Explanation:

( Begin capture group

^[a-z] Match the start of the string, then a single letter

[a-z\'&\(\) ]+ Match one or more of the characters in this group

\b Match a word boundary

v Match the character 'v' literally

\b Match a word boundary

[a-z&\'\(\) ]+ Match one or more of the characters in this group

(?: Begin non-capturing group

.*? Match any characters, as few as possible (lazy)

) End non-capturing group

\[?\d+ \w+ \d{4}\]? Match a date, optionally surrounded by brackets

) End capture group
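To check the pattern against the three strings from the question, a minimal Python sketch could look like the following. The flag combination is an assumption added on top of the answer: re.I is the case-insensitive flag the answer asks for, re.S lets ".*?" cross the line break inside the third string, and re.M lets "^" match at the start of any line in a longer document. The names CASENAME, handles and document, as well as the surrounding document text, are made up for illustration.

```python
import re

# The answer's pattern, unchanged. The flags are an assumption:
#   re.I - case-insensitive, as the answer requires
#   re.S - "." also matches a newline (needed for the wrapped third string)
#   re.M - "^" matches at the start of every line, not only the string
CASENAME = re.compile(
    r"(^[a-z][a-z\'&\(\) ]+\bv\b[a-z&\'\(\) ]+(?:.*?) \[?\d+ \w+ \d{4}\]?)",
    re.I | re.S | re.M,
)

# The three handles from the question.
handles = [
    "R J BRUCE & OTHERS V B J & W L A EDWARDS And Ors CA CA19/02 27 February 2003",
    "H v DIRECTOR OF PROCEEDINGS [2014] NZHC 1031 [16 May 2014]",
    """GREGORY LANCASTER AND JOHN HENRY HUNTER V CULLEN INVESTMENTS LIMITED AND
ERIC JOHN WATSON CA CA51/03 26 May 2003""",
]

# Each handle on its own: the whole string should be captured by group 1.
matched = [CASENAME.search(h) for h in handles]
for h, m in zip(handles, matched):
    print(bool(m), repr(m.group(1)) if m else None)

# Hypothetical larger document (the text around the handle is invented):
# search() returns only the first match, which gives one hit per document.
document = (
    "IN THE HIGH COURT OF NEW ZEALAND\n"
    + handles[1]
    + "\nJUDGMENT OF THE COURT\n"
)
first = CASENAME.search(document)
print(first.group(1) if first else None)
```

Because re.S allows the lazy ".*?" to run across line breaks, a line that contains a standalone "v" but no date could in principle pull in later lines until a date appears, so it is worth checking the result against real documents before relying on it.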