php - Yet another regex: getting images from markdown, bugged if there is markdown inside
I'm trying to get image info from a wiki. I have a working regex, but it fails when the description also contains markdown.
The format of the images in the markdown:
//[[image:williamgodwin.jpg|thumb|right|150px|william godwin]]
//[[image:johannmost.jpg|left|150px|thumb|[[johann most]] outspoken advocate of violence]]
//[[image:cnt-armoured-car-factory.jpg|right|thumb|270px|[[spain]], [[1936]]. members of [[cnt]] construct [[armoured car]]s fight against [[fascist]]s in 1 of [[collectivisation|collectivised]] factories.]]
[[image:cnt_tu_votar_y_ellos_deciden.jpg|thumb|175px|cnt propaganda april 2004. reads: don't let politicians rule our lives/ vote , decide/ don't allow it/ unity, action, self-management.]]
[[image:flag of anarcho syndicalism.svg|thumb|175px|the red-and-black flag, coming experience of anarchists in labour movement, particularly associated anarcho-syndicalism.]]
[[image:leotolstoy.jpg|thumb|150px|[[leo tolstoy|leo tolstoy]] 1828-1910]]
{{main articles|[[christian anarchism]] , [[anarchism , religion]]}}
Here are my attempts: https://regex101.com/r/pd6nf8/1
I'm trying something like:
// \[\[image:(.*?)\|(.*?)\|(.*?)\|(.*?)\|\[*(.*?)\|*(.*?)\]*
$re = "/\\[\\[image:(.*?)\\|(.*?)\\|(.*?)\\|(.*?)\\|\\[*(.*?)\\|*(.*?)\\]*/i";
It should find 14 matches in the test, but I'm getting 11 so far, or, if I get 14, they come with noise such as ]] or only parts of the description...
How can I include the optional case of having [[(.*?)]] inside the last part?
You can define the nested parts beforehand, using this kind of syntax:
$pattern = '~
# definitions
(?(DEFINE)
    (?<nested> \[\[ [^][]*+ (?:\[\[ \g<nested> ]] [^][]*)*+ ]] )
    (?<part> [^][|]*+ (?: \g<nested> [^][|]* )*+ )
)
# main pattern
\[\[ image: (\g<part>) \| (\g<part>) \| (\g<part>) \| (\g<part>) \| (\g<part>) ]]
~ix';
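As a rough illustration of how this pattern could be applied, here is a minimal sketch using `preg_match_all`. The sample text is two of the tags from the question; the variable names are only illustrative. Two things are worth noting: PCRE expects the `DEFINE` keyword in uppercase, and the named definitions `nested` and `part` also take group numbers 1 and 2, so the five captured parts end up in groups 3 to 7.

```php
<?php
// Two sample tags copied from the question; both have five pipe-separated parts.
$text = '[[image:williamgodwin.jpg|thumb|right|150px|william godwin]] '
      . '[[image:johannmost.jpg|left|150px|thumb|[[johann most]] outspoken advocate of violence]]';

$pattern = '~
# definitions (the DEFINE keyword has to be uppercase)
(?(DEFINE)
    (?<nested> \[\[ [^][]*+ (?: \[\[ \g<nested> ]] [^][]* )*+ ]] )
    (?<part>   [^][|]*+ (?: \g<nested> [^][|]* )*+ )
)
# main pattern
\[\[ image: (\g<part>) \| (\g<part>) \| (\g<part>) \| (\g<part>) \| (\g<part>) ]]
~ix';

// PREG_SET_ORDER gives one array per matched tag.
$count = preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    // Groups 1 and 2 belong to the definitions; the tag parts are groups 3 to 7.
    printf("file: %s | description: %s\n", $m[3], $m[7]);
}
```

Tags with fewer than five parts (for example one without an alignment) are not matched by this fixed five-part form.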
Obviously, you can be more precise. If you know the 4th part is the size, you can replace it:
\[\[ image: (\g<part>) \| (\g<part>) \| (\g<part>) \| (\d+ px) \| (\g<part>) ]]
You are free to make a part optional if needed (for example, when the alignment parameter can be omitted):
\[\[ image: (\g<part>) \| (\g<part>) (?:\| (\g<part>) )? \| (\d+ px) \| (\g<part>) ]]
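To see the optional alignment in action, here is a small sketch under the same assumptions (sample tags from the question, illustrative variable names, uppercase `DEFINE`). When the optional group does not take part in the match, PHP reports it as an empty string, because later groups did match.

```php
<?php
// One tag with an alignment and one without, both copied from the question.
$text = '[[image:williamgodwin.jpg|thumb|right|150px|william godwin]] '
      . '[[image:leotolstoy.jpg|thumb|150px|[[leo tolstoy|leo tolstoy]] 1828-1910]]';

$pattern = '~
(?(DEFINE)
    (?<nested> \[\[ [^][]*+ (?: \[\[ \g<nested> ]] [^][]* )*+ ]] )
    (?<part>   [^][|]*+ (?: \g<nested> [^][|]* )*+ )
)
\[\[ image: (\g<part>) \| (\g<part>) (?: \| (\g<part>) )? \| (\d+ px) \| (\g<part>) ]]
~ix';

$count = preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    // Group 5 is the optional alignment, group 6 the size, group 7 the description.
    $align = $m[5] !== '' ? $m[5] : '(none)';
    printf("%s: align=%s size=%s\n", $m[3], $align, $m[6]);
}
```

This form still assumes a fixed order (type, optional alignment, size, description), so a tag such as the `johannmost.jpg` one, where the size comes before `thumb`, would not match.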
Or all the parameters can be optional and occur only once, in which case you need to be more precise:
~
(?(DEFINE)
    (?<nested> \[\[ [^][]*+ (?: \[\[ \g<nested> ]] [^][]* )*+ ]] )
    (?<part> [^][|]*+ (?: \g<nested> [^][|]* )*+ )
)
\[\[image:
(?<name> [^]|]* )
(?: \|
    (?:
        (?<align> left|right|center )
      | (?<type> thumb )
      | (?<size> \d+[a-z]{0,3} )
      | (?<description> \g<part> )
    )
)*
]]
~ix
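With named groups, the results can be collected into one array per image. The following is only a sketch: the sample tags come from the question, the `$images` array and its keys are illustrative, and the `?? ''` fallbacks are there because PHP may omit trailing groups that did not take part in a match.

```php
<?php
// Three tags from the question with the parameters in different orders.
$text = '[[image:williamgodwin.jpg|thumb|right|150px|william godwin]] '
      . '[[image:johannmost.jpg|left|150px|thumb|[[johann most]] outspoken advocate of violence]] '
      . '[[image:leotolstoy.jpg|thumb|150px|[[leo tolstoy|leo tolstoy]] 1828-1910]]';

$pattern = '~
(?(DEFINE)
    (?<nested> \[\[ [^][]*+ (?: \[\[ \g<nested> ]] [^][]* )*+ ]] )
    (?<part>   [^][|]*+ (?: \g<nested> [^][|]* )*+ )
)
\[\[image:
(?<name> [^]|]* )
(?: \|
    (?:
        (?<align> left|right|center )
      | (?<type> thumb )
      | (?<size> \d+[a-z]{0,3} )
      | (?<description> \g<part> )
    )
)*
]]
~ix';

preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

$images = [];
foreach ($matches as $m) {
    // Missing parameters fall back to an empty string.
    $images[] = [
        'name'        => $m['name'],
        'align'       => $m['align'] ?? '',
        'type'        => $m['type'] ?? '',
        'size'        => $m['size'] ?? '',
        'description' => $m['description'] ?? '',
    ];
}

print_r($images);
```

The pattern itself does not stop a parameter from appearing twice; if that happens, the named group holds the last occurrence.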