C# - Match Multiline & IgnoreSome


I'm trying to extract information from JCL source using a regex in C#. Basically, the string can look like this:

//jobname0 job (blablabla),'some text',msgclass=yes,ilike=potatoes, grmbl
//             ialsolike=tomatoes,      garbage
//             finally=bye
//other stuff

So I need to extract the job name jobname0, the info (blablabla), the description 'some text', and the other parameters msgclass=yes, ilike=potatoes, ialsolike=tomatoes and finally=bye.

I must ignore everything after a space on a line, such as grmbl or any other garbage.

I must continue on the next line if the last valid character is a comma, and stop if there is none.

So far, I have managed to get the job name, info and description, which was pretty easy. For the other parameters, I'm able to get them and split them, but I don't know how to get rid of the garbage.

Here is my code:

// Groups: 1 = job name, 2 = info, 3 = description, 4 = repeated parameter captures
var regex = "//([^\\s]*) job (\\([^)]*\\))?,?(\\'[^']*\\')?,?([^,]*[,|\\s|$])*";
Match match2 = Regex.Match(test5, regex, RegexOptions.Singleline);

string cartejob2 = match2.Groups[0].Value;
string jobname2 = match2.Groups[1].Value;
string jobinfo2 = match2.Groups[2].Value;
string jobdesc2 = match2.Groups[3].Value;
// Group 4 repeats, so read every capture rather than only the last one
IEnumerable<string> parms = match2.Groups[4].Captures.OfType<Capture>().Select(x => x.Value);
string jobparms2 = string.Join("|", parms);

Console.WriteLine(cartejob2 + "|");
Console.WriteLine(jobname2 + "|");
Console.WriteLine(jobinfo2 + "|");
Console.WriteLine(jobdesc2 + "|");
Console.WriteLine(jobparms2 + "|");

The actual output is:

//jobname0 job (blablabla),'some text',msgclass=yes,ilike=potatoes, grmbl
//             ialsolike=tomatoes,      garbage
//             finally=bye
//other |
jobname0|
(blablabla)|
'some text'|
msgclass=yes,|ilike=potatoes,| grmbl
//             ialsolike=tomatoes,|      garbage
//             finally=bye
//other |

The output I would like to see is:

//jobname0 job (blablabla),'some text',msgclass=yes,ilike=potatoes, grmbl
//             ialsolike=tomatoes,      garbage
//             finally=bye|
jobname0|
(blablabla)|
'some text'|
msgclass=yes|ilike=potatoes|ialsolike=tomatoes|finally=bye|

Is there a way to do what I want?

I think I'd try this with two regular expressions.

The first one gets the information at the beginning of the string: job name, info and description.

The second one is for the parameters, which seem to follow a simple pattern of <param name>=<param value>.

The first regex might look like this:

^//(?<job>[\d\w]+)[ ]+job[ ]+\((?<info>[\d\w]+)\),'(?<description>[\d\w ]+)' 

I don't know if the rules permit whitespace in the job name, info or description, so adjust as needed. Also, I'm assuming the job card is at the start of the file by using the ^ character. Finally, this regex has named groups defined, so getting the values should be easier in C#.
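To illustrate how the named groups can be read, here is a minimal sketch that applies the first pattern to the first line of the sample from the question. The class name JobHeaderDemo and the method ParseHeader are made up for this example. Note that the pattern places the parentheses and quotes outside the groups, so the captured values come back without them.

```csharp
using System;
using System.Text.RegularExpressions;

public static class JobHeaderDemo
{
    // The first pattern from the answer, written as a verbatim string.
    private static readonly Regex Header = new Regex(
        @"^//(?<job>[\d\w]+)[ ]+job[ ]+\((?<info>[\d\w]+)\),'(?<description>[\d\w ]+)'");

    // Hypothetical helper: returns { job, info, description },
    // or null when the text does not start with a job card.
    public static string[] ParseHeader(string text)
    {
        Match m = Header.Match(text);
        if (!m.Success)
        {
            return null;
        }
        return new[]
        {
            m.Groups["job"].Value,
            m.Groups["info"].Value,
            m.Groups["description"].Value
        };
    }

    public static void Main()
    {
        string sample = "//jobname0 job (blablabla),'some text',msgclass=yes,ilike=potatoes, grmbl";
        string[] parts = ParseHeader(sample);
        Console.WriteLine(string.Join("|", parts)); // prints "jobname0|blablabla|some text"
    }
}
```

The match is case-sensitive, so it expects a lowercase "job" as in the sample. If the real source uses uppercase JOB, passing RegexOptions.IgnoreCase to the Regex constructor is one way to cover both.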

The second regex might look like this:

(?<param>[\w\d]+)=(?<value>[\w\d]+) 

Again, grouping is added to capture the parameter names and values.
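One possible way to combine the second pattern with the rules from the question (ignore everything after the first space, continue only while the field ends with a comma) is to walk the text line by line. The line-by-line handling is an assumption of this sketch rather than part of the patterns above, and the names JobParamsDemo and ExtractParams are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class JobParamsDemo
{
    // The first pattern without named groups; it is only used to skip the header.
    private static readonly Regex Header = new Regex(
        @"^//[\d\w]+[ ]+job[ ]+\([\d\w]+\),'[\d\w ]+'");

    // The second pattern from the answer.
    private static readonly Regex Pair = new Regex(@"(?<param>[\w\d]+)=(?<value>[\w\d]+)");

    // Hypothetical helper: collects "name=value" pairs from a job card.
    public static List<string> ExtractParams(string text)
    {
        var result = new List<string>();
        string[] lines = text.Split('\n');
        for (int i = 0; i < lines.Length; i++)
        {
            string line = lines[i].TrimEnd('\r');
            string rest;
            if (i == 0)
            {
                // First line: start right after the job name, info and description.
                Match h = Header.Match(line);
                if (!h.Success) break;
                rest = line.Substring(h.Length);
            }
            else
            {
                // Continuation line: drop the leading "//" and the padding spaces.
                if (!line.StartsWith("//", StringComparison.Ordinal)) break;
                rest = line.Substring(2).TrimStart(' ');
            }

            // Everything after the first space is treated as garbage.
            int space = rest.IndexOf(' ');
            string field = space < 0 ? rest : rest.Substring(0, space);

            foreach (Match m in Pair.Matches(field))
            {
                result.Add(m.Groups["param"].Value + "=" + m.Groups["value"].Value);
            }

            // Continue on the next line only if this field ends with a comma.
            if (!field.EndsWith(",", StringComparison.Ordinal)) break;
        }
        return result;
    }

    public static void Main()
    {
        string sample =
            "//jobname0 job (blablabla),'some text',msgclass=yes,ilike=potatoes, grmbl\n" +
            "//             ialsolike=tomatoes,      garbage\n" +
            "//             finally=bye\n" +
            "//other stuff";
        // prints "msgclass=yes|ilike=potatoes|ialsolike=tomatoes|finally=bye"
        Console.WriteLine(string.Join("|", ExtractParams(sample)));
    }
}
```

Because the value part is [\w\d]+, a value containing other characters (quotes, parentheses, dots) would be cut short, so the character class may need widening for real input.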

Hope this helps.

Edit:

A small tip: you can put the @ sign before a string in C# to make such regex patterns easier to write, because backslashes no longer need to be escaped. For example:

Regex reg = new Regex(@"(?<param>[\w\d]+)=(?<value>[\w\d]+)");
