#StackBounty: #php #html How to wrap multiple groups of LIs within a string with ULs in php

Bounty: 50

I receive data (strings containing some HTML) from various sources that I can’t influence. The strings contain, among other things, LI elements that are visually grouped but lack parent UL elements. I need to wrap each group of LI tags in a UL tag.

This works fine if there is only one group of LI elements within a string: I can use DOMDocument, find the LI tags and wrap them in a newly created UL tag. Unfortunately there can be multiple groups, and the separator between groups isn’t defined, although it is always some kind of text or an HTML tag. It’s easy to recognize the groups as a human 🙂

So logically speaking, I need to find an opening <li> as the start of a group and, as its end, a closing </li> that isn’t followed by another opening <li>, ignoring all whitespace.

An example source string could be:

Some text
<strong>Some other text</strong>
<li>Element A1</li><li>Element A2</li>
<li>Element A3</li>
Text that separates group A from group B
<li>Element B1</li>

<li>Element B2</li> <li>Element B3</li>
<li>Element B4</li>
<strong>Element that separates group B from group C</strong>
<li>Element C1</li>
<li>Element C2</li>
Text can follow. 

The desired result would be

Some text
<strong>Some other text</strong>
<ul>
  <li>Element A1</li><li>Element A2</li>
  <li>Element A3</li>
</ul>
Text that separates group A from group B
<ul>
  <li>Element B1</li>

  <li>Element B2</li> <li>Element B3</li>
  <li>Element B4</li>
</ul>
<strong>Element that separates group B from group C</strong>
<ul>
  <li>Element C1</li>
  <li>Element C2</li>
</ul>
Text can follow. 

I was thinking about using regex (I know, usually not the best idea for HTML). But here I don’t know how to recognize the closing </li> (or a spacing variant of it) that is followed by anything other than whitespace or another opening <li> (or < li > etc.)
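To make the "closing </li> not followed by another <li>" condition concrete, here is a minimal sketch using a negative lookahead. The function name findGroupEnds is hypothetical, and the pattern assumes the LI elements are not nested and that tags may contain stray whitespace; it only locates the group ends and does not wrap anything yet.

```php
<?php
// Hypothetical helper: returns the byte offset of every closing </li>
// that ends a group, i.e. one that is NOT followed by optional
// whitespace and another opening <li>.
function findGroupEnds(string $html): array
{
    // <\s*/\s*li\s*>      a closing tag, tolerant of "</ li >" variants
    // (?!\s*<\s*li\b)     negative lookahead: no "<li" (after whitespace) follows
    // \b keeps tags such as <link> from being mistaken for <li>
    $pattern = '~<\s*/\s*li\s*>(?!\s*<\s*li\b)~i';

    preg_match_all($pattern, $html, $matches, PREG_OFFSET_CAPTURE);

    // With PREG_OFFSET_CAPTURE each match is [text, offset]; keep the offsets.
    return array_column($matches[0], 1);
}
```

The number of offsets returned equals the number of groups, which may be a useful sanity check before any wrapping is attempted.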

I could also remove all whitespace between a > and a <; maybe that would make things a little easier. But even then I don’t know how to “include” an opening LI as a valid following element within a group and exclude everything else.
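One possible way to express "an LI may be followed by whitespace and another LI, but by nothing else" without stripping whitespace first is to match a whole group in one pattern and wrap it in a callback. This is only a sketch under assumptions: wrapLiGroups is a hypothetical name, LI elements are assumed not to be nested, and the arrow function requires PHP 7.4 or later.

```php
<?php
// Hypothetical helper: wraps each run of adjacent <li>...</li> elements
// (separated only by whitespace) in its own <ul>...</ul>.
function wrapLiGroups(string $html): string
{
    // A single list item; the lazy .*? stops at the first closing </li>,
    // so nested lists are NOT supported by this sketch.
    $li = '<\s*li\b[^>]*>.*?<\s*/\s*li\s*>';

    // A group: one item, then any number of "optional whitespace + item".
    // Anything else (text or another tag) ends the group.
    $pattern = '~' . $li . '(?:\s*' . $li . ')*~is';

    $result = preg_replace_callback(
        $pattern,
        fn(array $m): string => "<ul>\n" . $m[0] . "\n</ul>",
        $html
    );

    // preg_replace_callback() returns null on a PCRE error; fall back to the input.
    return $result ?? $html;
}
```

The whitespace inside a group is left untouched, so blank lines between items of the same group stay where they were.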

