[ale] Need help with a regular expression please....

John Marasco jemarasco at bellsouth.net
Mon Jul 28 22:11:15 EDT 2003


IIRC, grouping and character classes don't mix.  Each of those bracket
expressions matches a single character from the listed set (or, with ^, any
single character not in it), never the grouped string written inside it.
Depending on the format of your document, this simple alternative might work:

[^>]$keyword[^<]

or, for a slightly fancier solution:

[^>](<(b)>)?$keyword(</(b)>)?[^<]

That way you can skip over whatever formatting tags you want and still look
for a keyword that lacks the bounding outer tag.  Regular expressions aren't
good at searching for the absence of a pattern, though.
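
To make the character-class point concrete, here is a minimal sketch in Python (not the PHP of the original question, purely for illustration).  It reuses the $keyword2/$location2 values from the quoted message; link_simple is a hypothetical helper name, not something from the thread.  Because [^>] and [^<] each consume one real character, the sketch captures them and puts them back.

```python
import re

# Values taken from the quoted message ($keyword2 / $location2).
keyword = "Widgets"
location = "http://www.widgets.com"

# A bracket expression is a set of single characters. This one is simply
# "any one character that is not ( ^ < a space h r e f = or )".
broken_class = re.compile(r"[^(^\<a href=)]")

# The simple alternative: one character that is not '>' before the keyword
# and one character that is not '<' after it. Both are captured so the
# replacement can restore them.
pattern = re.compile(r"([^>])" + re.escape(keyword) + r"([^<])")


def link_simple(content):
    """Hypothetical helper: wrap the keyword in a link using the pattern above."""
    def repl(match):
        return f'{match.group(1)}<a href="{location}">{keyword}</a>{match.group(2)}'
    return pattern.sub(repl, content)


if __name__ == "__main__":
    print(link_simple("We have the best Widgets at Widgets Technology Co."))
```

Two limits of this approach show up quickly: a keyword at the very start or end of the content is never matched (there is no neighbouring character to consume), and a keyword in the middle of existing link text, such as "Best Widgets here" inside an anchor, is still relinked.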

-----Original Message-----
From: ale-admin at ale.org [mailto:ale-admin at ale.org] On Behalf Of David
Corbin
Sent: Monday, July 28, 2003 4:16 PM
To: ale at ale.org
Subject: Re: [ale] Need help with a regular expression please....


Keith Morris wrote:

> Hi all!  I'm creating a specialized mini CMS in PHP that will store
> content in a MySQL database.  What I am trying to do is parse the
> content and replace certain keywords with a link.  The keywords and
> associated links are kept in a MySQL table.
>
> Here is an example.
>
> $keyword = "Widgets Technology Co.";
> $location = "http://www.widgets.com/about";
>
> $keyword2 = "Widgets";
> $location2 = "http://www.widgets.com";
>
>
>
> $content = "We have the best Widgets at Widgets Technology Co.";
>
> I want to parse through $content looking for $keyword and replacing it
> with:
> <a href="$location">$keyword</a>  (this I can do with no problem)
>
> but I am going to be looping through a series of keywords (phrases)
> sorted by length (longest to shortest) that may or may not contain
> other defined keywords such as the values above for $keyword2 which
> would cause nested links and other nonsense.
>
> so what I'm needing is a regular expression that will find the
> $keyword (phrase) that is not already between "<a href =" and "</a>"
> so that it will not try to relink it.
>
> so far, this is the regular expression that I have, but does not work
> properly:
>
> [^(^\<a href=)][^(\>)]($keyword)[^(a\>)$]

This expression is nowhere near what you want.  You should look into
zero-length lookahead/behind patterns.  But regardless, I don't think
you're going to find a reasonable RegEx for handling this.

David
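
As a rough illustration of the zero-length lookahead idea, here is a minimal Python sketch (Python rather than PHP, for illustration only).  The function name link_keywords and the dict of keyword-to-URL pairs are assumptions, not part of the thread.  It also assumes existing anchors contain plain text only (no nested tags) and that matching is case-sensitive; with markup more complex than that, a regex is unlikely to be reliable.

```python
import re


def link_keywords(content, links):
    """Hypothetical helper: link each keyword unless it already sits in a link.

    links maps keyword phrases to URLs. Keywords are processed longest
    first, as described in the original question.
    """
    for keyword in sorted(links, key=len, reverse=True):
        url = links[keyword]
        # Negative lookahead, consuming nothing:
        #   [^<>]*>    -> a '>' arrives before any '<', so we are inside a tag
        #   [^<]*</a>  -> the next tag is '</a>', so we are inside link text
        pattern = re.compile(re.escape(keyword) + r"(?![^<>]*>|[^<]*</a>)")
        # A function replacement avoids backslash handling in the URL/keyword.
        content = pattern.sub(
            lambda m, url=url: f'<a href="{url}">{m.group(0)}</a>', content
        )
    return content


if __name__ == "__main__":
    links = {
        "Widgets Technology Co.": "http://www.widgets.com/about",
        "Widgets": "http://www.widgets.com",
    }
    print(link_keywords("We have the best Widgets at Widgets Technology Co.", links))
```

Because the lookahead only inspects what follows the keyword, running the function a second time on its own output leaves the text unchanged under the stated assumptions.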

>


_______________________________________________
Ale mailing list
Ale at ale.org
http://www.ale.org/mailman/listinfo/ale
