 | | | | |  | | | | | Guest | Re: find (matching) person in other table -
06-02-2007, 08:56 PM
> I was thinking to ***ign points (percentage) to matching fields (last
> name, first name, email, phone, city, zip, phone) and then list people
> with more than 50%. e.g., if first and last name match - 75%, if only
> email match - 85%, if first name, last name and email match - 100%, if
> last name and phone match - 50%... etc.
>
> does anybody have any experience with such a problem? or something
similar?
Although you should be able to do this with you SELECT (I guess, never
have), since you posted this to a PHP mailing, you get a PHP answer!
Look up Levinshtein in the php manual and start from there: http://us2.php.net/manual/en/function.levenshtein.php
If you can do this on SELECT (using the db engine), I would suggest that, as
that way you don't have to return a giant list to poke through.
You can also use wildcards, and only select matches that have the first
three characters:
$lastname = strpos('Rogers',0,2);
$firstname = strpos('Timothy',0,2);
$select = "SELECT `uid`,`LastName`,`FirstName`
FROM `users`
WHERE LastName='$lastname%'
AND FirstName='$firstname%'";
I haven't tested that, but I think it would work. You would need to work on
a way to LIMIT the matches effectively. If that doesn't work, hey, this is a
PHP list...
--
Jared Farrish
Intermediate Web Developer
Denton, Tx
Abraham Maslow: "If the only tool you have is a hammer, you tend to see
every problem as a nail." $$ | | | | | | | | Guest | Re: find (matching) person in other table -
06-02-2007, 08:56 PM
On 5/30/07, Jared Farrish <EMAIL REMOVED> wrote:
>
> $lastname = strpos('Rogers',0,2);
> $firstname = strpos('Timothy',0,2);
> $select = "SELECT `uid`,`LastName`,`FirstName`
> FROM `users`
> WHERE LastName='$lastname%'
> AND FirstName='$firstname%'";
>
Strike the above and make it:
$lastname = substr('Rogers',0,3);
$firstname = substr('Timothy',0,3);
$select = "SELECT `uid`,`LastName`,`FirstName`
FROM `users`
WHERE LastName='$lastname%'
AND FirstName='$firstname%'";
Foolisness!
--
Jared Farrish
Intermediate Web Developer
Denton, Tx
Abraham Maslow: "If the only tool you have is a hammer, you tend to see
every problem as a nail." $$ | | | | | | | | Guest | Re: [PHP] Re: find (matching) person in other table -
06-02-2007, 08:56 PM
Jared Farrish wrote:
>> I was thinking to ***ign points (percentage) to matching fields (last
>> name, first name, email, phone, city, zip, phone) and then list people
>> with more than 50%. e.g., if first and last name match - 75%, if only
>> email match - 85%, if first name, last name and email match - 100%, if
>> last name and phone match - 50%... etc.
>>
>> does anybody have any experience with such a problem? or something
> similar?
>
> Although you should be able to do this with you SELECT (I guess, never
> have), since you posted this to a PHP mailing, you get a PHP answer!
>
> Look up Levinshtein in the php manual and start from there:
>
> http://us2.php.net/manual/en/function.levenshtein.php
>
> If you can do this on SELECT (using the db engine), I would suggest
> that, as
> that way you don't have to return a giant list to poke through.
>
> You can also use wildcards, and only select matches that have the first
> three characters:
>
> $lastname = strpos('Rogers',0,2);
> $firstname = strpos('Timothy',0,2);
> $select = "SELECT `uid`,`LastName`,`FirstName`
> FROM `users`
> WHERE LastName='$lastname%'
> AND FirstName='$firstname%'";
>
> I haven't tested that, but I think it would work. You would need to
> work on
> a way to LIMIT the matches effectively. If that doesn't work, hey,
> this is a
> PHP list...
yes. in one hand it's more for mysql list. though, I was thinking more
if somebody had already something similar as a "project". more as path I
have to follow.
e.g., in your example, in where clause AND doesn't work because bob
could be robert too, right? and last name has to match 100%, right? (or
I'm wrong?)
how "smart" solution will be something like this:
$query = my_query("select id from members where last_name='$last_name'");
while($result = mysql_fetch_array($query))
{
$MEMBERS[$result['id']] += 50;
}
$query = my_query("select id from members where first_name='$first_name'");
while($result = mysql_fetch_array($query))
{
$MEMBERS[$result['id']] += 10;
}
$query = my_query("select id from members where email='$email'");
while($result = mysql_fetch_array($query))
{
$MEMBERS[$result['id']] += 85;
}
etc.
after last query I will have an array of people. and I'll list all
person with "score" more than 50.
or, since last name MUST match, I think it's better this way (just got
in my head):
$query = my_query("select id from members where last_name='$last_name'");
while($result = mysql_fetch_array($query))
{
$query = my_query("select id from members where
first_name='$first_name'");
while($result = mysql_fetch_array($query))
{
$MEMBERS[$result['id']] += 10;
}
$query = my_query("select id from members where email='$email'");
while($result = mysql_fetch_array($query))
{
$MEMBERS[$result['id']] += 85;
}
etc.
} | | | | | | | | Guest | Re: [PHP] Re: find (matching) person in other table -
06-02-2007, 08:56 PM
On 5/30/07, Afan Pasalic <EMAIL REMOVED> wrote:
>
> yes. in one hand it's more for mysql list. though, I was thinking more
> if somebody had already something similar as a "project". more as path I
> have to follow.
> e.g., in your example, in where clause AND doesn't work because bob
> could be robert too, right? and last name has to match 100%, right? (or
> I'm wrong?)
You're right. Remember, that was an example of what you MIGHT do, not
necessarily what you SHOULD do.
You could also situationally check the returned fields and if it's greater
than, say, 25 or 50, re-run the query and change the letters matched to 4,
for instance, and then add a link to get the greater total.
You could also look at the "search box suggestion" code that's out there for
a way to implement this on the server side. Don't know if that code will be
optimized or not, but that's essentially what you're doing here.
how "smart" solution will be something like this:
>
> $query = my_query("select id from members where last_name='$last_name'");
> while($result = mysql_fetch_array($query))
> {
> $MEMBERS[$result['id']] += 50;
> }
Well, see, if the match isn't exact, it won't return anything. Unless you
know the exact name.
You also may have to deal with someone misstyping their name(s).
$query = my_query("select id from members where first_name='$first_name'");
> while($result = mysql_fetch_array($query))
> {
> $MEMBERS[$result['id']] += 10;
> }
>
> $query = my_query("select id from members where email='$email'");
> while($result = mysql_fetch_array($query))
> {
> $MEMBERS[$result['id']] += 85;
> }
Why would you do that many SELECTs? (Also, if you cap the SQL commands, it's
easier to read.)
etc.
>
> after last query I will have an array of people. and I'll list all
> person with "score" more than 50.
This is a really roundabout way to do this. Look at the Levinshtein PHP
manual page for some suggestions on how to calculate similarities. I *think*
that should be better to do this:
for ($i = 0; $i < count($mysqlresultset); $i++) {
$lev = levenshtein($mysqlresultset[$i][$firstname], $postedname);
if ($lev > 49) {
$matches[] = $mysqlresultset[$i];
}
}
or, since last name MUST match, I think it's better this way (just got
> in my head):
> $query = my_query("select id from members where last_name='$last_name'");
> while($result = mysql_fetch_array($query))
> {
> $query = my_query("select id from members where
> first_name='$first_name'");
> while($result = mysql_fetch_array($query))
> {
> $MEMBERS[$result['id']] += 10;
> }
>
> $query = my_query("select id from members where email='$email'");
> while($result = mysql_fetch_array($query))
> {
> $MEMBERS[$result['id']] += 85;
> }
>
> etc.
> }
There's a lot of unnecessary work you're making PHP and your database do.
This is quite inefficient code.
If you're trying to match the emails and whatnot, then combine all those
queries together. SELECT them all together. It looks like what you're doing
is weighting it by email address, which you can add to the SELECT I posted
(although you need to think about how you use your wildcards for email
addresses, such as maybe matching the beginning OR the end, for instance).
It's even better if the person has to activate the account with an email
link to activate, since then you'd know the email address existed (although
it doesn't mean it isn't someone in the database that isn't already in
there).
--
Jared Farrish
Intermediate Web Developer
Denton, Tx
Abraham Maslow: "If the only tool you have is a hammer, you tend to see
every problem as a nail." $$ | | | | | | | | Guest | Re: [PHP] Re: find (matching) person in other table -
06-02-2007, 08:56 PM
Jared Farrish wrote:
> On 5/30/07, Afan Pasalic <EMAIL REMOVED> wrote:
>>
>> yes. in one hand it's more for mysql list. though, I was thinking more
>> if somebody had already something similar as a "project". more as path I
>> have to follow.
>> e.g., in your example, in where clause AND doesn't work because bob
>> could be robert too, right? and last name has to match 100%, right? (or
>> I'm wrong?)
>
>
> You're right. Remember, that was an example of what you MIGHT do, not
> necessarily what you SHOULD do.
sure. I just want to be sure you understand what I was thinking (because
of my english :-) )
> You could also situationally check the returned fields and if it's
> greater
> than, say, 25 or 50, re-run the query and change the letters matched
> to 4,
> for instance, and then add a link to get the greater total.
>
> You could also look at the "search box suggestion" code that's out
> there for
> a way to implement this on the server side. Don't know if that code
> will be
> optimized or not, but that's essentially what you're doing here.
>
> how "smart" solution will be something like this:
>>
>> $query = my_query("select id from members where
>> last_name='$last_name'");
>> while($result = mysql_fetch_array($query))
>> {
>> $MEMBERS[$result['id']] += 50;
>> }
>
>
> Well, see, if the match isn't exact, it won't return anything. Unless you
> know the exact name.
>
> You also may have to deal with someone misstyping their name(s).
>
> $query = my_query("select id from members where
> first_name='$first_name'");
>> while($result = mysql_fetch_array($query))
>> {
>> $MEMBERS[$result['id']] += 10;
>> }
>>
>> $query = my_query("select id from members where email='$email'");
>> while($result = mysql_fetch_array($query))
>> {
>> $MEMBERS[$result['id']] += 85;
>> }
>
>
> Why would you do that many SELECTs? (Also, if you cap the SQL
> commands, it's
> easier to read.)
"most likely" because I was thinking that it shouldn't be big deal. but
after your and richard's email - definitely have to try to make it as
one query.
> etc.
>>
>> after last query I will have an array of people. and I'll list all
>> person with "score" more than 50.
>
>
> This is a really roundabout way to do this. Look at the Levinshtein PHP
> manual page for some suggestions on how to calculate similarities. I
> *think*
> that should be better to do this:
>
> for ($i = 0; $i < count($mysqlresultset); $i++) {
> $lev = levenshtein($mysqlresultset[$i][$firstname], $postedname);
> if ($lev > 49) {
> $matches[] = $mysqlresultset[$i];
> }
> }
>
I'm just studying it. :-)
> or, since last name MUST match, I think it's better this way (just got
>> in my head):
>> $query = my_query("select id from members where
>> last_name='$last_name'");
>> while($result = mysql_fetch_array($query))
>> {
>> $query = my_query("select id from members where
>> first_name='$first_name'");
>> while($result = mysql_fetch_array($query))
>> {
>> $MEMBERS[$result['id']] += 10;
>> }
>>
>> $query = my_query("select id from members where email='$email'");
>> while($result = mysql_fetch_array($query))
>> {
>> $MEMBERS[$result['id']] += 85;
>> }
>>
>> etc.
>> }
>
>
> There's a lot of unnecessary work you're making PHP and your database do.
> This is quite inefficient code.
that's why I ask here - to learn. and I appreciate for any help.
> If you're trying to match the emails and whatnot, then combine all those
> queries together. SELECT them all together. It looks like what you're
> doing
> is weighting it by email address, which you can add to the SELECT I
> posted
> (although you need to think about how you use your wildcards for email
> addresses, such as maybe matching the beginning OR the end, for
> instance).
> It's even better if the person has to activate the account with an email
> link to activate, since then you'd know the email address existed
> (although
> it doesn't mean it isn't someone in the database that isn't already in
> there).
email has to match "in total". EMAIL REMOVED and EMAIL REMOVED
are NOT the same in my case.
thanks jared,
-afan | | | | | | | | Guest | Re: [PHP] Re: find (matching) person in other table -
06-02-2007, 08:56 PM
On 5/30/07, Afan Pasalic <EMAIL REMOVED> wrote:
email has to match "in total". EMAIL REMOVED and EMAIL REMOVED
> are NOT the same in my case.
>
> thanks jared,
If you can match a person by their email, why not just SELECT by email only
(and return the persons information)?
Consider, as well, that each time you're calling a database, you're slowing
down the response of the page. So, while making a bunch of small calls might
not seem like that much, consider:
||||||| x |||||||
||||||| a |||||||
||||||| b |||||||
Versus
||||||| x, a, b |||||||
The letters represent the request/response data (what you're giving to get,
then get back), and the pipes (|) are the overhead to process, send, receive
(on DB), process (on DB), send (on DB), receive, process, return to code.
The overhead and latency used to complete one request makes it a quicker,
less "heavy" operation. If you did the first a couple hundred or thousand
times, I would bet your page would drag to a halt while it loads...
--
Jared Farrish
Intermediate Web Developer
Denton, Tx
Abraham Maslow: "If the only tool you have is a hammer, you tend to see
every problem as a nail." $$ | | | | | | | | Guest | Re: [PHP] Re: find (matching) person in other table -
06-02-2007, 08:56 PM
Jared Farrish wrote:
> On 5/30/07, Afan Pasalic <EMAIL REMOVED> wrote:
> email has to match "in total". EMAIL REMOVED and EMAIL REMOVED
>> are NOT the same in my case.
>>
>> thanks jared,
>
> If you can match a person by their email, why not just SELECT by email
> only
> (and return the persons information)?
'cause some members can be added to database by administrator and maybe
they don't have email address at all. or several memebers can use the
same email address (EMAIL REMOVED) and then macthing last name is
kind of "required". that's how it works now and can't change it.
> Consider, as well, that each time you're calling a database, you're
> slowing
> down the response of the page. So, while making a bunch of small calls
> might
> not seem like that much, consider:
>
> ||||||| x |||||||
> ||||||| a |||||||
> ||||||| b |||||||
>
> Versus
>
> ||||||| x, a, b |||||||
>
> The letters represent the request/response data (what you're giving to
> get,
> then get back), and the pipes (|) are the overhead to process, send,
> receive
> (on DB), process (on DB), send (on DB), receive, process, return to code.
>
> The overhead and latency used to complete one request makes it a quicker,
> less "heavy" operation. If you did the first a couple hundred or thousand
> times, I would bet your page would drag to a halt while it loads...
agree. now, I have to figure it out HOW? :-)
I was looking at levenshtein, though, I think the richard's solution is
just enough:
select member_id, first_name, last_name, email, ...,
(5*(first_name='$first_name) + 2*(first_name='$first_name')) as score
from members
where score > 0
though, I'm getting error: "Unknown column 'score' in where clause"?!?
thanks jared. | | | | | | | | Guest | Re: [PHP] Re: find (matching) person in other table -
06-02-2007, 08:56 PM
On 5/31/07, Afan Pasalic <EMAIL REMOVED> wrote:
>
>
>
> Jared Farrish wrote:
> > On 5/30/07, Afan Pasalic <EMAIL REMOVED> wrote:
> > email has to match "in total". EMAIL REMOVED and EMAIL REMOVED
> >> are NOT the same in my case.
> >>
> >> thanks jared,
> >
> > If you can match a person by their email, why not just SELECT by email
> > only
> > (and return the persons information)?
> 'cause some members can be added to database by administrator and maybe
> they don't have email address at all. or several memebers can use the
> same email address (EMAIL REMOVED) and then macthing last name is
> kind of "required". that's how it works now and can't change it.
>
> > Consider, as well, that each time you're calling a database, you're
> > slowing
> > down the response of the page. So, while making a bunch of small calls
> > might
> > not seem like that much, consider:
> >
> > ||||||| x |||||||
> > ||||||| a |||||||
> > ||||||| b |||||||
> >
> > Versus
> >
> > ||||||| x, a, b |||||||
> >
> > The letters represent the request/response data (what you're giving to
> > get,
> > then get back), and the pipes (|) are the overhead to process, send,
> > receive
> > (on DB), process (on DB), send (on DB), receive, process, return to
> code.
> >
> > The overhead and latency used to complete one request makes it a
> quicker,
> > less "heavy" operation. If you did the first a couple hundred or
> thousand
> > times, I would bet your page would drag to a halt while it loads...
> agree. now, I have to figure it out HOW? :-)
>
> I was looking at levenshtein, though, I think the richard's solution is
> just enough:
>
> select member_id, first_name, last_name, email, ...,
> (5*(first_name='$first_name) + 2*(first_name='$first_name')) as score
> from members
> where score > 0
>
> though, I'm getting error: "Unknown column 'score' in where clause"?!?
>
> thanks jared.
Try using the keyword 'having' rather than 'where'. You can't use an alias
in a where clause.
David | | | | | | | | Guest | Re: [PHP] Re: find (matching) person in other table -
06-02-2007, 08:56 PM
David Giragosian wrote:
> On 5/31/07, Afan Pasalic <EMAIL REMOVED> wrote:
>>
>>
>>
>> Jared Farrish wrote:
>> > On 5/30/07, Afan Pasalic <EMAIL REMOVED> wrote:
>> > email has to match "in total". EMAIL REMOVED and
>> EMAIL REMOVED
>> >> are NOT the same in my case.
>> >>
>> >> thanks jared,
>> >
>> > If you can match a person by their email, why not just SELECT by email
>> > only
>> > (and return the persons information)?
>> 'cause some members can be added to database by administrator and maybe
>> they don't have email address at all. or several memebers can use the
>> same email address (EMAIL REMOVED) and then macthing last name is
>> kind of "required". that's how it works now and can't change it.
>>
>> > Consider, as well, that each time you're calling a database, you're
>> > slowing
>> > down the response of the page. So, while making a bunch of small calls
>> > might
>> > not seem like that much, consider:
>> >
>> > ||||||| x |||||||
>> > ||||||| a |||||||
>> > ||||||| b |||||||
>> >
>> > Versus
>> >
>> > ||||||| x, a, b |||||||
>> >
>> > The letters represent the request/response data (what you're giving to
>> > get,
>> > then get back), and the pipes (|) are the overhead to process, send,
>> > receive
>> > (on DB), process (on DB), send (on DB), receive, process, return to
>> code.
>> >
>> > The overhead and latency used to complete one request makes it a
>> quicker,
>> > less "heavy" operation. If you did the first a couple hundred or
>> thousand
>> > times, I would bet your page would drag to a halt while it loads...
>> agree. now, I have to figure it out HOW? :-)
>>
>> I was looking at levenshtein, though, I think the richard's solution is
>> just enough:
>>
>> select member_id, first_name, last_name, email, ...,
>> (5*(first_name='$first_name) + 2*(first_name='$first_name')) as score
>> from members
>> where score > 0
>>
>> though, I'm getting error: "Unknown column 'score' in where clause"?!?
>>
>> thanks jared.
>
> Try using the keyword 'having' rather than 'where'. You can't use an
> alias
> in a where clause.
>
> David
Yup. that works! :-)
Thanks David | | | | | | | | Guest | Re: [PHP] Re: find (matching) person in other table -
06-02-2007, 08:56 PM
MySQL doesn't let you use the calculated values (score) in the where
clause.
PostgreSQL does, as I recall.
Sorry.
You may be able to get around that with:
Do a GROUP BY on something unique, so the GROUP BY is pointless, but
then you can use HAVING score > 0
Use a sub-query in MySQL 4.mumble or higher, and the outer query can
use 'score' to get rid of the 0-point non-matches
Build up the expression for $score and $where at the same time like:
$where = ' 0 ';
$score = ' 0 ';
$where .= " OR lastname = '$lastname';
$score .= " + 5 * (lastname = '$lastame' ) ";
$where .= " OR firstname = '$firstname' ";
$score .= " + 2 * (firstname = '$firstname' ) ";
This gets tiresome to type, but is least confusing to non-programmers...
On Thu, May 31, 2007 9:36 am, Afan Pasalic wrote:
>
>
> Jared Farrish wrote:
>> On 5/30/07, Afan Pasalic <EMAIL REMOVED> wrote:
>> email has to match "in total". EMAIL REMOVED and
>> EMAIL REMOVED
>>> are NOT the same in my case.
>>>
>>> thanks jared,
>>
>> If you can match a person by their email, why not just SELECT by
>> email
>> only
>> (and return the persons information)?
> 'cause some members can be added to database by administrator and
> maybe
> they don't have email address at all. or several memebers can use the
> same email address (EMAIL REMOVED) and then macthing last name
> is
> kind of "required". that's how it works now and can't change it.
>
>> Consider, as well, that each time you're calling a database, you're
>> slowing
>> down the response of the page. So, while making a bunch of small
>> calls
>> might
>> not seem like that much, consider:
>>
>> ||||||| x |||||||
>> ||||||| a |||||||
>> ||||||| b |||||||
>>
>> Versus
>>
>> ||||||| x, a, b |||||||
>>
>> The letters represent the request/response data (what you're giving
>> to
>> get,
>> then get back), and the pipes (|) are the overhead to process, send,
>> receive
>> (on DB), process (on DB), send (on DB), receive, process, return to
>> code.
>>
>> The overhead and latency used to complete one request makes it a
>> quicker,
>> less "heavy" operation. If you did the first a couple hundred or
>> thousand
>> times, I would bet your page would drag to a halt while it loads...
> agree. now, I have to figure it out HOW? :-)
>
> I was looking at levenshtein, though, I think the richard's solution
> is
> just enough:
>
> select member_id, first_name, last_name, email, ...,
> (5*(first_name='$first_name) + 2*(first_name='$first_name')) as score
> from members
> where score > 0
>
> though, I'm getting error: "Unknown column 'score' in where clause"?!?
>
> thanks jared.
>
> --
> PHP General Mailing List (http://www.php.net/)
> To unsubscribe, visit: http://www.php.net/unsub.php
>
>
--
Some people have a "gift" link here.
Know what I want?
I want you to buy a CD from some indie artist. http://cdbaby.com/browse/from/lynch
Yeah, I get a buck. So? | | | | | Thread Tools | | | | Display Modes | Linear Mode |
Posting Rules
| You may not post new threads You may not post replies You may not post attachments You may not edit your posts HTML code is Off | | | |  |