PHP: eregi is killing me (i.e. I don't get regular expressions)

Rower_CPU · Oct 7, 2004

OK, I'm trying to do something that I thought would be simple, but apparently isn't.

I'm checking user input via a form to make sure that there are only English characters (no numbers, punctuation, foreign characters or accented Latin). So far everything I've tried just doesn't work.

I think it basically comes down to me not being familiar enough with regex, but I could be tripping up on some obscure thing that includes accented Latin in the typical "a-z" or ":alpha:" ranges.

Any help is much appreciated.

hotwire132002 · Oct 7, 2004

Rower_CPU said:
OK, I'm trying to do something that I thought would be simple, but apparently isn't.

I'm checking user input via a form to make sure that there are only English characters (no numbers, punctuation, foreign characters or accented Latin). So far everything I've tried just doesn't work.

I think it basically comes down to me not being familiar enough with regex, but I could be tripping up on some obscure thing that includes accented Latin in the typical "a-z" or ":alpha:" ranges.

Any help is much appreciated.

Just out of curiosity, what is "eregi"? I assume it's a PHP function or something. What does it do?

zimv20 · Oct 7, 2004

as a test, have you tried replacing the range(s) w/ the 52 allowable characters?

Rower_CPU · Oct 7, 2004

hotwire-
eregi is a PHP function for case-insensitive pattern matching (see the eregi manual).

zimv20-
I haven't gone that route yet. I try to avoid explicitly listing values like that. If no one else has a suggestion, that'll be my next attempt.

zimv20 · Oct 7, 2004

let's see some code!

EminenceGrise · Oct 7, 2004

Looking at the eregi manual you supplied and tracking down a man page for POSIX regex (it's a bit dense...), I think that I have a regex that will match all non-English characters, leaving only the 52 upper and lower case "regular" characters.

Try using "[^a-z]" (the double quotes are not part of the expression). The caret (^) inside the brackets tells eregi to match any character that is NOT in the list that follows (a-z). You don't need A-Z since you are using the case-insensitive version of ereg; in other words, the expression above behaves like "[^a-zA-Z]" in strict regex terms. You can then use the result of eregi for your filtering: if it gets a match, an "illegal" character has been entered. A bracket expression "[]" stands for a single character, so the pattern triggers a match if such a character exists anywhere in the input string. Note that "^[a-z]" is not equivalent: a caret outside the brackets anchors the match to the start of the string, so that pattern only checks that the first character is a letter.

Note: I don't do PHP. I can cobble together a regex if needed, but I'm no expert - so what I gave you has a chance of not working, but the only thing to do is try it.

I hope it works, but if not I can give it another go, or maybe what's above will be enough to get you started.

Edit: As far as I know, [a-z] should not match any accented characters (this should be easy to double-check by testing your code), whereas [:alpha:] may, depending on the system locale.
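
For anyone reading this later, here is a minimal sketch of how the placement of the caret changes the meaning of the pattern. It assumes preg_match with the /i flag as a stand-in for eregi (eregi was deprecated in PHP 5.3 and removed in PHP 7) and PHP 7 or later for the type declarations. The function names has_illegal_char and starts_with_letter are made up for this example.

```php
<?php
// Hypothetical helper: true if $input contains any character
// other than a-z or A-Z. preg_match with /i stands in for eregi.
function has_illegal_char(string $input): bool
{
    // [^a-z] is a negated bracket expression: it matches ONE character
    // that is not a letter, anywhere in the string.
    return preg_match('/[^a-z]/i', $input) === 1;
}

// Hypothetical helper: true if the FIRST character is a letter.
// A caret outside the brackets anchors the match to the start of the string.
function starts_with_letter(string $input): bool
{
    return preg_match('/^[a-z]/i', $input) === 1;
}

var_dump(has_illegal_char('Hello'));     // bool(false)
var_dump(has_illegal_char('Hello2'));    // bool(true)
var_dump(starts_with_letter('Hello2'));  // bool(true), the digit goes unnoticed
```

The last line is why the anchored form is not a safe substitute for the negated one: it only inspects the first character.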

Rower_CPU · Oct 7, 2004

Ah, thanks Eminence. I've been using !eregi and then trying to match the "bad stuff", but this is much simpler. Too many double negatives and I was getting all turned around.

I added a space ("[^a-z ]") to allow spaces between words in the string and I'm set.
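
A rough sketch of what that check could look like, assuming preg_match with the /i flag in place of eregi (which no longer exists in PHP 7 and later). Both function names are hypothetical, and the second, anchored variant is an alternative I'm suggesting here, not something from the posts above.

```php
<?php
// Hypothetical validator: accepts letters and spaces only.
function is_letters_and_spaces(string $input): bool
{
    // No match for "anything other than a letter or a space"
    // means the input is clean. Note: an empty string also passes.
    return preg_match('/[^a-z ]/i', $input) === 0;
}

// Stricter variant: an anchored whitelist that also rejects the empty string.
// \A and \z anchor to the very start and very end of the input.
function is_nonempty_letters_and_spaces(string $input): bool
{
    return preg_match('/\A[a-z ]+\z/i', $input) === 1;
}

var_dump(is_letters_and_spaces('Rower CPU'));   // bool(true)
var_dump(is_letters_and_spaces("O'Brien"));     // bool(false)
var_dump(is_letters_and_spaces(''));            // bool(true)
var_dump(is_nonempty_letters_and_spaces(''));   // bool(false)
```

Which variant fits depends on whether an empty field should count as valid for the form in question.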

One fun thing this mess produced was a form for testing it out so that I wasn't bombarding my database with inserts as I tested my regex. Here's the page:
http://larcx.sdsu.edu/dma/contributor/test/regexp.php

jeremy.king · Oct 8, 2004

Nevermind...can't seem to read today.

Rower_CPU · Oct 8, 2004

kingjr3 said:
Nevermind...can't seem to read today.

No prob - that little "i" is easy to miss.

EminenceGrise · Oct 8, 2004

Rower_CPU said:
Ah, thanks Eminence. I've been using !eregi and then trying to match the "bad stuff", but this is much simpler. Too many double negatives and I was getting all turned around.

I added a space ("[^a-z ]") to allow spaces between words in the string and I'm set.

Excellent! Glad I could help. Good catch with the space, I'd forgotten about that one.
