I have the following strings:
1 "R J BRUCE & OTHERS V B J & W L A EDWARDS And Ors CA CA19/02 27 February 2003",
2 "H v DIRECTOR OF PROCEEDINGS [2014] NZHC 1031 [16 May 2014]",
3 '''GREGORY LANCASTER AND JOHN HENRY HUNTER V CULLEN INVESTMENTS LIMITED AND
ERIC JOHN WATSON CA CA51/03 26 May 2003'''
I am trying to find a regular expression which matches all of them. I don't know how to match the optional square brackets around the date at the end of the string, e.g. [16 May 2014].
casename = re.compile(r'(^[A-Z][A-Za-z\'\(\) ]+\b[v|V]\b[A-Za-z\'\(\) ]+(.*?)[ \[ ]\d+ \w+ \d\d\d\d[\] ])', re.S)
The date regex at the end only matches cases with dates in square brackets, but not the ones without.
Thanks to everybody who answered. @Matt Clarkson, what I am trying to match is a judicial decision 'handle' in a much larger text. There is a large variation within those handles, but they all start at the beginning of a line, have 'v' for versus between the party names, and have a date at the end. Mostly the names of the parties are in capitals, but not exclusively. I am trying to have only one match per document and no false positives.
Solution
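I got all of them to match using this (you'll need to add the case-insensitive flag, and keep re.S from your original so that a handle spanning two lines, like the third string, still matches):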
(^[a-z][a-z\'&\(\) ]+\bv\b[a-z&\'\(\) ]+(?:.*?) \[?\d+ \w+ \d{4}\]?)
Explanation:
( Begin capture group
^ Match the start of the string (or of each line, if re.M is set)
[a-z] Match a single letter
[a-z\'&\(\) ]+ Match one or more of the characters in this group
\b Match a word boundary
v Match the character 'v' literally
\b Match a word boundary
[a-z&\'\(\) ]+ Match one or more of the characters in this group
(?: Begin non-capturing group
.*? Match any characters, as few as possible (lazy)
) End non-capturing group
\[?\d+ \w+ \d{4}\]? Match a date, optionally surrounded by brackets
) End capture group
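
As a quick check, here is a minimal Python sketch that runs the pattern against the three strings from the question. The names CASENAME, handles, strict and optional are only illustrative, and combining re.I with re.S is an assumption on my part (re.S lets ".*?" cross the line break in the third string). The last part shows what is probably the reason the original date pattern failed: "[\] ]" requires one character after the year, while "\]?" accepts zero or one closing bracket.

```python
import re

# Pattern from the answer. re.I covers mixed-case party names;
# re.S is an assumption so that ".*?" can cross the newline in the third handle.
CASENAME = re.compile(
    r"(^[a-z][a-z\'&\(\) ]+\bv\b[a-z&\'\(\) ]+(?:.*?) \[?\d+ \w+ \d{4}\]?)",
    re.I | re.S,
)

handles = [
    "R J BRUCE & OTHERS V B J & W L A EDWARDS And Ors CA CA19/02 27 February 2003",
    "H v DIRECTOR OF PROCEEDINGS [2014] NZHC 1031 [16 May 2014]",
    "GREGORY LANCASTER AND JOHN HENRY HUNTER V CULLEN INVESTMENTS LIMITED AND\n"
    "ERIC JOHN WATSON CA CA51/03 26 May 2003",
]

matches = [CASENAME.search(h) for h in handles]
for m in matches:
    # Print the captured handle, or None if the pattern did not match.
    print(repr(m.group(1)) if m else None)

# A character class always consumes exactly one character, so "[\] ]"
# cannot match when the date is the last thing in the string.
strict = re.compile(r"\d+ \w+ \d{4}[\] ]")
# "\[?" and "\]?" make each bracket optional instead.
optional = re.compile(r"\[?\d+ \w+ \d{4}\]?")

print(strict.search("27 February 2003"))            # no match object
print(optional.search("27 February 2003").group())
print(optional.search("[16 May 2014]").group())
```

To search a larger document where the handle starts on some later line, re.M would also be needed so that "^" can match at the start of each line rather than only at the start of the text.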