PHP programming question: Removing duplicate strings from an array




scem0
Apr 26, 2005, 09:07 PM
Does anyone know how I would go about searching through an array for repeated values and return the one with the smaller index? For example:

$array[0]= "timmy";
$array[1] = "joe";
$array[2] = "elmo";
$array[3] = "clarissa";
$array[4] = "joe";

I need a function that will return "1" if I do findrepeatlowerindex("joe");

In addition, I would like it to not be case sensitive ( :o ), if at all possible.

So

$array[0]= "timmy";
$array[1] = "joe";
$array[2] = "elmo";
$array[3] = "clarissa";
$array[4] = "JoE";

findrepeatlowerindex("joE"); would return "1", too.

Any suggestions on how I could go about doing this? Or if you are really cool you could write the function for me :D.

Thanks,
scem0



Grover
Apr 26, 2005, 10:13 PM
Try this - it's just off the top of my head so maybe you can optimize it. It will only return an index value if there's more than one instance of the search value in the array. I wasn't sure if that's what you were looking for but, if not, you can strip out the check and have it return the index value of the first instance.


$array[0] = "timmy";
$array[1] = "joe";
$array[2] = "elmo";
$array[3] = "clarissa";
$array[4] = "Joe";

function findRepeatLowerindex($sourceArray, $searchValue) {

    $firstInstanceIndex = -1;
    $foundSecondInstance = false;

    //lowercase the search value once so the comparison is not case sensitive
    $searchValue = strtolower($searchValue);

    for ($i = 0; $i < count($sourceArray); $i++) {

        if (strtolower($sourceArray[$i]) === $searchValue) {

            //if the value has already been found once, then this is a repeat
            if ($firstInstanceIndex != -1) {

                $foundSecondInstance = true;

                //and we can break out of the loop and return the first found index
                break;

            } else { //otherwise, store this as the first index

                $firstInstanceIndex = $i;

            }

        }

    }

    //only report an index when the value occurred more than once
    if ($foundSecondInstance == false) {

        $firstInstanceIndex = -1;

    }

    return $firstInstanceIndex;
}

$firstIndex = findRepeatLowerindex($array, "joe");

echo $firstIndex;
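
For the other variant mentioned above, where the repeat check is stripped out and the index of the first match comes back no matter how often the value occurs, a minimal sketch might look like this. The name findFirstIndexIgnoreCase is made up for illustration, and returning -1 for "not found" simply follows the convention of the function above.

```php
<?php
// Hypothetical helper (not part of the original post):
// index of the first case-insensitive match, or -1 if there is none.
function findFirstIndexIgnoreCase(array $sourceArray, $searchValue) {
    $needle = strtolower($searchValue);
    foreach ($sourceArray as $index => $value) {
        if (strtolower($value) === $needle) {
            return $index; // stop at the first hit
        }
    }
    return -1;
}

$names = array("timmy", "joe", "elmo", "clarissa", "JoE");
echo findFirstIndexIgnoreCase($names, "joE"); // prints 1
```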

plinden
Apr 26, 2005, 10:23 PM
Your spec is incomplete.

What's it supposed to return for:
a) values that occur only once - e.g. if you call findrepeatlowerindex("timmy"); ?
b) nonexistent values - e.g. if you call findrepeatlowerindex("Steve Jobs"); ?

scem0
Apr 26, 2005, 11:40 PM
Sorry about not being clear. It looks like Grover's solution will work perfectly.

I wanted the first value only if there are 2 instances of that value in the array.

Thanks sooooooo much.

An appreciative amateur programmer,

scem0

edit - it did work perfectly! Thanks!

jeremy.king
Apr 29, 2005, 03:17 PM
I know you got your solution, but I wrote an alternative routine trying to leverage the array functions. The routine takes an array as an argument and will find all your duplicates and their first index - right now it just prints them out, but you could store them or do what you like.

My approach was to
1. lowercase the array
2. build another array containing just the unique values
3. diff the unique with the original to get duplicate values (preserving index)
4. for the dups - search the original array for the 1st index



$origArray = array("a","b","c","B","d","a","e","f","F"); //original array

goFindDuplicates($origArray);

//PS...if your goal was just to remove the duplicate values, it gets even simpler

/************************************
*Functions
************************************/
function goFindDuplicates($lcArray)
{
    array_walk($lcArray, 'lcArrayVal'); //lowercase all the values first
    $uniqueArray = array_unique($lcArray); //only unique values
    $dupArray = array_diff_assoc($lcArray, $uniqueArray); //duplicated values

    foreach ($dupArray as $value)
    {
        //do whatever you want here like build an assoc array.
        echo "duplicate value='$value', first occurrence at index=" . array_search($value, $lcArray) . "\n";
    }
}

function lcArrayVal(&$item, $key)
{
    $item = strtolower($item);
}
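
The PS in the code above says that simply removing the duplicate values is even simpler. One way that could look, as a sketch only: removeDuplicatesIgnoreCase is a made-up name, and keeping the first occurrence in its original casing is an assumption about what is wanted.

```php
<?php
// Hypothetical helper (not from the original post): drop case-insensitive
// duplicates, keeping the first occurrence of each value as originally spelled.
function removeDuplicatesIgnoreCase(array $sourceArray) {
    // array_unique() keeps the key of the first occurrence of each value
    $uniqueLower = array_unique(array_map('strtolower', $sourceArray));
    // pick the original (un-lowercased) entries at those surviving keys
    $kept = array_intersect_key($sourceArray, $uniqueLower);
    // reindex from 0; skip this step if the original indexes matter
    return array_values($kept);
}

print_r(removeDuplicatesIgnoreCase(array("a", "b", "c", "B", "d", "a", "e", "f", "F")));
```

With the sample array this should leave a, b, c, d, e, f.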

bubbagump
May 5, 2005, 11:45 PM
What exactly is your intent for selecting the lowest indexed element that meets your search criteria? It just seems like a bad idea to me to do a linear search for this. Maybe your dataset is small enough that it is not a performance issue. I would look into doing some sort of hashing if this needs to be fast. I am not familiar with the features of PHP, but you may want to look into "associative arrays."

From http://www.tizag.com/phpT/arrays.php

PHP - Associative Arrays

In an associative array a key is associated with a value. If you wanted to store the salaries of your employees in an array, a numerically indexed array would not be the best choice. Instead, we could use the employees names as the keys in our associative array, and the value would be their respective salary.


PHP Code:

$salaries["Bob"] = 2000;
$salaries["Sally"] = 4000;
$salaries["Charlie"] = 600;
$salaries["Clare"] = 0;

echo "Bob is being paid - $" . $salaries["Bob"] . "<br />";
echo "Sally is being paid - $" . $salaries["Sally"] . "<br />";
echo "Charlie is being paid - $" . $salaries["Charlie"] . "<br />";
echo "Clare is being paid - $" . $salaries["Clare"];


Display:

Bob is being paid - $2000
Sally is being paid - $4000
Charlie is being paid - $600
Clare is being paid - $0


Once again, the usefulness of arrays will become more apparent once you have knowledge of for and while loops.
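
Applied to the original question, the hashing idea could be sketched roughly as follows: walk the array once, key an associative array by the lowercased value, and answer each later lookup from that map instead of rescanning. The function names and the 'first'/'count' layout are invented here for illustration, and -1 for "no repeat" just mirrors the convention used earlier in the thread.

```php
<?php
// Hypothetical helper: one pass over the array, keyed by lowercased value.
// Each entry records the first index seen and how often the value occurred.
function buildOccurrenceMap(array $sourceArray) {
    $map = array();
    foreach ($sourceArray as $index => $value) {
        $key = strtolower($value);
        if (!isset($map[$key])) {
            $map[$key] = array('first' => $index, 'count' => 0);
        }
        $map[$key]['count']++;
    }
    return $map;
}

// Hypothetical lookup: first index if the value is repeated, otherwise -1.
function findRepeatFromMap(array $map, $searchValue) {
    $key = strtolower($searchValue);
    if (isset($map[$key]) && $map[$key]['count'] > 1) {
        return $map[$key]['first'];
    }
    return -1;
}

$map = buildOccurrenceMap(array("timmy", "joe", "elmo", "clarissa", "JoE"));
echo findRepeatFromMap($map, "joE"); // prints 1
```

Building the map costs one linear pass, so this only pays off when the same array is searched many times.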



Or, maybe I should have kept my mouth shut because you really do need the data structure you proposed, or any modern processor can waste a ton of performance and still handle your application.

Bubba